HTTPS and S3 links to genomic index files freely available in the AWS cloud
Kraken 2 is a fast and memory-efficient tool for taxonomic assignment of metagenomics sequencing reads. Bracken is a related tool that additionally estimates relative abundances of species or genera. See the Kraken 2 manual for more information about the individual libraries and their relationship to public repositories like RefSeq.
Starting in fall 2020, we began creating indexes for more combinations of RefSeq databases.
All packages contain a Kraken 2 database along with Bracken databases built for 50, 75, 100, 150, 200, 250 and 300-mers.
In some cases we used the `--max-db-size` option to cap the size of the database produced.
This makes the index smaller at the expense of some sensitivity and accuracy.
In all cases we use the defaults for k-mer length, minimizer length, and minimizer spacing.
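
Because the Bracken databases are built for a fixed set of read lengths, a sample's read length has to be matched to one of them. The following is a minimal sketch of one way to do that; picking the nearest available length is an assumed heuristic, not a rule stated here. The helper names `closest_bracken_length` and `bracken_command` are hypothetical, and the command list uses Bracken's `-d`, `-i`, `-o`, `-r` and `-l` options without running anything.

```python
# Read lengths for which the packages described above ship Bracken databases.
BRACKEN_READ_LENGTHS = (50, 75, 100, 150, 200, 250, 300)


def closest_bracken_length(read_length, available=BRACKEN_READ_LENGTHS):
    """Return the available Bracken read length nearest to read_length.

    Ties are broken in favour of the shorter length (an arbitrary choice).
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return min(available, key=lambda n: (abs(n - read_length), n))


def bracken_command(db_dir, report_path, output_path, read_length, level="S"):
    """Build (but do not run) a Bracken command line as a list of arguments."""
    r = closest_bracken_length(read_length)
    return [
        "bracken",
        "-d", db_dir,        # directory holding the unpacked index
        "-i", report_path,   # Kraken 2 report produced with --report
        "-o", output_path,   # where Bracken writes abundance estimates
        "-r", str(r),        # read length the Bracken database was built for
        "-l", level,         # taxonomic level, e.g. "S" for species
    ]
```

For example, `bracken_command("k2_db", "sample.kreport", "sample.bracken", 151)` selects the 150-mer database; the directory and file names there are placeholders.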
Links in the “Inspect” column are to files containing the output of running `kraken2-inspect` on the index, giving a quick way of checking what genomes & taxa are represented.
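
To check programmatically which taxa an index contains, an inspect file can be parsed with a few lines of Python. This is a sketch that assumes the file uses the default six-column, tab-separated report layout (percentage, clade count, taxon count, rank code, taxonomy ID, indented name) with optional `#` header lines. The names `InspectRow`, `parse_inspect` and `taxa_at_rank` are hypothetical, and treating two spaces of indentation as one tree level is also an assumption.

```python
from typing import Iterable, List, NamedTuple


class InspectRow(NamedTuple):
    percent: float
    clade_count: int
    taxon_count: int
    rank: str
    taxid: int
    name: str
    depth: int  # nesting depth inferred from the name's indentation


def parse_inspect(lines: Iterable[str]) -> List[InspectRow]:
    """Parse report-style lines, skipping comments and malformed rows."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 6:
            continue  # not the assumed six-column layout
        pct, clade, taxon, rank, taxid, raw_name = fields
        name = raw_name.lstrip(" ")
        depth = (len(raw_name) - len(name)) // 2  # assumed 2 spaces per level
        rows.append(
            InspectRow(float(pct), int(clade), int(taxon), rank, int(taxid), name, depth)
        )
    return rows


def taxa_at_rank(rows: Iterable[InspectRow], rank: str = "S") -> List[str]:
    """Return the names of all taxa with the given rank code."""
    return [row.name for row in rows if row.rank == rank]
```

With a downloaded inspect file, `taxa_at_rank(parse_inspect(open(path)))` would list the species-level entries, where `path` is wherever the `.txt` file was saved.
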
Collection | Contains | Date | Archive size (GB) | Index size (GB) | HTTPS URL | S3 URL | Inspect |
---|---|---|---|---|---|---|---|
Viral | viral | 12/2/2020 | 0.4 | 0.4 | .tar.gz | .tar.gz | .txt |
MinusB | archaea, viral, plasmid, human1, UniVec_Core | 12/2/2020 | 5.1 | 7.4 | .tar.gz | .tar.gz | .txt |
Standard | archaea, bacteria, viral, plasmid, human1, UniVec_Core | 12/2/2020 | 36.0 | 46.8 | .tar.gz | .tar.gz | .txt |
Standard-8 | Standard with DB capped at 8 GB | 12/2/2020 | 5.5 | 7.5 | .tar.gz | .tar.gz | .txt |
Standard-16 | Standard with DB capped at 16 GB | 12/2/2020 | 11.2 | 14.9 | .tar.gz | .tar.gz | .txt |
PlusPF | Standard plus protozoa & fungi | 12/2/2020 | 36.9 | 47.8 | .tar.gz | .tar.gz | .txt |
PlusPF-8 | PlusPF with DB capped at 8 GB | 12/2/2020 | 5.5 | 7.5 | .tar.gz | .tar.gz | .txt |
PlusPF-16 | PlusPF with DB capped at 16 GB | 12/2/2020 | 11.2 | 14.9 | .tar.gz | .tar.gz | .txt |
PlusPFP | Standard plus protozoa, fungi & plant | 12/2/2020 | 70.2 | 94.3 | .tar.gz | .tar.gz | .txt |
PlusPFP-8 | PlusPFP with DB capped at 8 GB | 12/2/2020 | 5.2 | 7.5 | .tar.gz | .tar.gz | .txt |
PlusPFP-16 | PlusPFP with DB capped at 16 GB | 12/2/2020 | 10.7 | 14.9 | .tar.gz | .tar.gz | .txt |
EuPathDB482 | Eukaryotic pathogen genomes with contaminants removed | 11/13/2020 | 26.4 | 34.1 | .tar.gz | .tar.gz | .txt |
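
Since Kraken 2 by default loads the whole index into memory, the “Index size” column is a practical guide to which collection a given machine can handle. The sketch below copies the index sizes from the 12/2/2020 table and lists the collections that fit a memory budget. The function name `collections_that_fit` and the 1 GB default headroom are assumptions made for illustration.

```python
# Index sizes in GB, copied from the 12/2/2020 table above.
INDEX_SIZES_GB = {
    "Viral": 0.4,
    "MinusB": 7.4,
    "Standard": 46.8,
    "Standard-8": 7.5,
    "Standard-16": 14.9,
    "PlusPF": 47.8,
    "PlusPF-8": 7.5,
    "PlusPF-16": 14.9,
    "PlusPFP": 94.3,
    "PlusPFP-8": 7.5,
    "PlusPFP-16": 14.9,
    "EuPathDB482": 34.1,
}


def collections_that_fit(available_gb, headroom_gb=1.0):
    """Return (name, size) pairs whose index fits in memory, largest first.

    headroom_gb is an assumed allowance for the operating system and the
    classifier's own working memory; adjust it for the machine at hand.
    """
    budget = available_gb - headroom_gb
    fits = [(name, size) for name, size in INDEX_SIZES_GB.items() if size <= budget]
    return sorted(fits, key=lambda item: item[1], reverse=True)
```

On a machine with 16 GB of RAM, for instance, the capped `-16` and `-8` variants, MinusB and Viral fit, while the uncapped Standard, PlusPF and PlusPFP indexes do not.
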
`--no-mask` argument

All packages contain a Kraken 2 database along with Bracken databases built for 100-mers, 150-mers, and 200-mers.
Collection | Size (MB) | HTTPS URL | S3 URL |
---|---|---|---|
Greengenes 13.5 | 73.2 | .tar.gz | .tar.gz |
RDP 11.5 | 168 | .tar.gz | .tar.gz |
Silva 132 | 117 | .tar.gz | .tar.gz |
Silva 138 | 112 | .tar.gz | .tar.gz |
Collection | Contains | Date | Archive size (GB) | Index size (GB) | HTTPS URL | S3 URL | Inspect |
---|---|---|---|---|---|---|---|
MinusB | archaea, viral, plasmid, human1, UniVec_Core | 9/19/2020 | 5.0 | 7.3 | .tar.gz | .tar.gz | .txt |
Standard | archaea, bacteria, viral, plasmid, human1, UniVec_Core | 9/19/2020 | 36.0 | 47.0 | .tar.gz | .tar.gz | .txt |
Standard-8 | Standard with DB capped at 8 GB | 9/19/2020 | 5.5 | 7.4 | .tar.gz | .tar.gz | .txt |
Standard-16 | Standard with DB capped at 16 GB | 9/19/2020 | 11.2 | 14.9 | .tar.gz | .tar.gz | .txt |
PlusPF | Standard plus protozoa & fungi | 9/19/2020 | 37.0 | 48.0 | .tar.gz | .tar.gz | .txt |
PlusPF-8 | PlusPF with DB capped at 8 GB | 9/19/2020 | 5.5 | 7.4 | .tar.gz | .tar.gz | .txt |
PlusPF-16 | PlusPF with DB capped at 16 GB | 9/19/2020 | 11.2 | 14.9 | .tar.gz | .tar.gz | .txt |
PlusPFP | Standard plus protozoa, fungi & plant | 9/19/2020 | 66.5 | 90.0 | .tar.gz | .tar.gz | .txt |
PlusPFP-8 | PlusPFP with DB capped at 8 GB | 9/19/2020 | 5.3 | 7.4 | .tar.gz | .tar.gz | .txt |
PlusPFP-16 | PlusPFP with DB capped at 16 GB | 9/19/2020 | 10.7 | 14.9 | .tar.gz | .tar.gz | .txt |
The following table points to the “Minikraken” indexes we created initially. All packages contain a Kraken 2 database along with Bracken databases built for 100, 150, and 200-mers. Some also contain Bracken databases for 50, 75 and 250-mers.
Collection | Contains | Date | Archive size (GB) | Index size (GB) | HTTPS URL | S3 URL |
---|---|---|---|---|---|---|
Minikraken v1 | RefSeq: bacteria, archaea, viral | 3/2020 | 5.6 | 8 | .tar.gz | .tar.gz |
Minikraken v2 | RefSeq: bacteria, archaea, viral, human* | 3/2020 | 5.5 | 8 | .tar.gz | .tar.gz |
Kraken, Kraken 2, Bracken and KrakenUniq are the work of Derrick Wood, Steven Salzberg, Jennifer Lu, Florian Breitwieser, Daniel Baker, Martin Steinegger and Ben Langmead, among others. Please see the Kraken, Kraken 2, KrakenUniq and Bracken websites for more information on the software, authors, and how to cite the work.