Centrifuge indexes

Centrifuge is a very rapid and memory-efficient system for the classification of DNA sequences from microbial samples.

Collection	Date	Size	HTTPS URL	S3 URL
RefSeq: bacteria, archaea, viral, human (compressed)	December 2016	5.4 GB	.tar.gz	.tar.gz
RefSeq: bacteria, archaea, viral, human	December 2016	7.9 GB	.tar.gz	.tar.gz
RefSeq: bacteria, archaea (compressed)	April 2018	6.2 GB	.tar.gz	.tar.gz
NCBI: nucleotide non-redundant sequences	March 2018	64 GB	.tar.gz	.tar.gz

Centrifuge is the work of Daehwan Kim, Li Song, Florian Breitwieser, Chanhee Park, and Steven Salzberg, among others. Please see the Centrifuge website for more information on the software, its authors, and how to cite it.

nt Database from Lawrence Livermore National Laboratory

A team from Lawrence Livermore National Laboratory (LLNL) has constructed a Centrifuge database spanning all of the BLAST nt sequences. It is described in a recent manuscript. The database can be downloaded as a collection of 71 7-Zip archive volumes, which together occupy 284 GB compressed. You will need the 7-Zip software (i.e., the 7z command) installed. This command will download the archives:

curl "https://genome-idx.s3.amazonaws.com/centrifuge/llnl/nt_wntr23/nt_wntr23_filt.cf.7z.[001-071]" -O

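Then decompress them with the command below. Given the first volume, 7z reads the remaining volumes automatically, so all of them must be in the same directory: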

7z x nt_wntr23_filt.cf.7z.001
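Because extraction of a split archive fails if any volume is absent, it can help to confirm that every volume is present before running 7z. The following is a minimal sketch, not part of the official instructions: the helper name missing_parts is hypothetical, and it assumes the volumes are numbered 001 through 071 as in the download command. It only checks that each file exists and is non-empty; it does not verify checksums.

```shell
# Hypothetical helper (not part of Centrifuge or 7-Zip):
# print the name of every volume of a split archive that is missing or empty.
# Arguments: <archive prefix> <number of last volume> [directory, default "."]
missing_parts() {
    prefix=$1
    last=$2
    dir=${3:-.}
    i=1
    while [ "$i" -le "$last" ]; do
        # Volumes use a zero-padded three-digit suffix: .001, .002, ...
        part=$(printf '%s.%03d' "$prefix" "$i")
        # -s is true only if the file exists and has a size greater than zero
        [ -s "$dir/$part" ] || printf '%s\n' "$part"
        i=$((i + 1))
    done
}

# Example usage (assumes the 71 volumes were downloaded to the current directory):
#   missing=$(missing_parts nt_wntr23_filt.cf.7z 71)
#   if [ -z "$missing" ]; then
#       7z x nt_wntr23_filt.cf.7z.001
#   else
#       printf 'Missing or empty volumes:\n%s\n' "$missing"
#   fi
```

Any volume names it prints can be downloaded again individually with curl before retrying the extraction.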

This index was constructed by Jose Manuel Martí, Car Reen Kok, James B. Thissen, Nisha J. Mulakken, Aram Avila-Herrera, Crystal J. Jaing, Jonathan E. Allen, and Nicholas A. Be at LLNL.