BLAST is a local alignment tool that finds regions of similarity between sequences. It compares nucleotide or protein sequences against sequence databases and calculates the statistical significance of the matches. When the input sequences are large, the command-line version of BLAST is required.
To test the performance of BLAST (blast/2.2) on Tusker, we aligned three nucleotide query datasets, small.fasta, medium.fasta and large.fasta, against the non-redundant nucleotide nt.fasta database from NCBI. Statistics for the query datasets, along with the time and memory used for each alignment, are shown in the table below:
|query dataset||total # of sequences||total # of bases||total size (MB)||time used||memory used||# of CPUs used|
|small.fasta||41,715||35,581,740||37.627||~ 2 hours||~ 23 GB||8|
|medium.fasta||110,478||147,543,113||149||~ 4 hours||~ 24 GB||8|
|large.fasta||592,593||827,629,204||836||~ 15 hours||~ 47 GB||8|
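
The sequence and base counts in the table can be checked for any query file before submitting a job. The following is a minimal sketch, not the script used to produce the table: the function name `fasta_stats` is hypothetical, and it assumes a plain-text FASTA file in which every record starts with a `>` header line and sequence lines may be wrapped.

```python
def fasta_stats(path):
    """Return (number_of_sequences, total_bases) for a plain-text FASTA file.

    Assumption: each record begins with a '>' header line, and every other
    non-blank line contains only sequence characters.
    """
    n_seqs = 0
    n_bases = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                n_seqs += 1  # a header line marks the start of a new record
            else:
                n_bases += len(line)  # sequence lines may be wrapped, so sum them
    return n_seqs, n_bases
```

Calling `fasta_stats("large.fasta")` on a file like the one described above would give the values for the first two numeric columns. The file is read line by line, so memory use stays small even for query files of several hundred MB.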