Velvet

Velvet is a general sequence assembler designed to produce assemblies from short as well as long reads. Running Velvet consists of two commands run in sequence: velveth and velvetg. velveth produces a hash table of k-mers, while velvetg constructs the genome assembly. The k-mer length, also known as the hash length, is the length, in base pairs, of the words of the reads being hashed.
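To make the idea of hashing k-mers concrete, here is a minimal Python sketch that splits reads into k-mers and counts them. It only illustrates the concept and is not Velvet's actual implementation; the function names `kmers` and `count_kmers` and the example reads are hypothetical.

```python
from collections import Counter


def kmers(read, k):
    """Return every length-k word of a read, in order of position."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def count_kmers(reads, k):
    """Build a table mapping each k-mer to its number of occurrences."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


# Two made-up reads that overlap by five bases (CGTAC).
reads = ["ACGTAC", "CGTACG"]
table = count_kmers(reads, 3)

for word, count in sorted(table.items()):
    print(word, count)
```

A read of length L yields L - k + 1 k-mers, so each 6-base read above contributes four 3-mers.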

Velvet has many parameters, all of which are described in its manual. Among them, the k-mer value is crucial for obtaining optimal assemblies. Higher k-mer values increase specificity (fewer spurious overlaps between reads), while lower k-mer values increase sensitivity (more true overlaps are detected).
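The trade-off can be seen in a toy example. The sketch below is an illustration in Python, not part of Velvet; the reads and the helper `shared_kmers` are made up. It checks which k-mers two reads have in common for several values of k.

```python
def shared_kmers(read_a, read_b, k):
    """Return the set of k-mers that occur in both reads."""
    a = {read_a[i:i + k] for i in range(len(read_a) - k + 1)}
    b = {read_b[i:i + k] for i in range(len(read_b) - k + 1)}
    return a & b


read_a = "ACGTTGCA"
read_b = "TTGCAGGC"  # truly overlaps read_a by 5 bases (TTGCA)
read_c = "GGATGCTT"  # unrelated read that happens to contain TGC

for k in (3, 5, 7):
    true_overlap = shared_kmers(read_a, read_b, k)
    chance_match = shared_kmers(read_a, read_c, k)
    print(k, sorted(true_overlap), sorted(chance_match))
```

With k = 3 the unrelated read is linked to `read_a` by a chance match (low specificity). With k = 5 only the true overlap remains. With k = 7 the 5-base overlap is shorter than k, so even the true overlap is missed (low sensitivity).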

Velvet supports multiple file formats: fasta, fastq, fasta.gz, fastq.gz, sam, bam, eland, gerald. Velvet also supports different read categories for different sequencing technologies and libraries, e.g. short, shortPaired, short2, shortPaired2, long, longPaired.
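As a rough sketch of how the file format and read category end up on the command line, the Python snippet below assembles a velveth and a velvetg invocation as argument lists. The helper names, the output directory, the input file names, and the k-mer value of 31 are placeholders chosen for illustration, and the flags shown (`-separate`, `-exp_cov auto`) should be checked against the manual of the installed Velvet version.

```python
import shlex


def velveth_cmd(out_dir, k, file_format, read_type, files, separate=False):
    """Build a velveth argument list: directory, hash length, format, read type, files."""
    cmd = ["velveth", out_dir, str(k), f"-{file_format}", f"-{read_type}"]
    if separate:
        # Paired reads stored in two files rather than one interleaved file.
        cmd.append("-separate")
    cmd.extend(files)
    return cmd


def velvetg_cmd(out_dir, options=()):
    """Build a velvetg argument list for the directory written by velveth."""
    return ["velvetg", out_dir, *options]


# Hypothetical paired-end input in two fastq files.
hash_step = velveth_cmd(
    "velvet_out", 31, "fastq", "shortPaired",
    ["reads_1.fastq", "reads_2.fastq"], separate=True,
)
graph_step = velvetg_cmd("velvet_out", ["-exp_cov", "auto"])

print(shlex.join(hash_step))
print(shlex.join(graph_step))
```

The printed lines can be pasted into a submit script, or the lists can be passed to `subprocess.run(..., check=True)` on a system where Velvet is available.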

Each step of Velvet (velveth and velvetg) may be run as its own job. The following pages describe how to run Velvet in this manner on HCC and provide example submit scripts:

Running Velvet with Paired-End Data

Description: How to run Velvet with paired-end data on HCC resources

Running Velvet with Single-End and Paired-End Data

Description: How to run Velvet with single-end and paired-end data on HCC resources

Running Velvet with Single-End Data

Description: How to run Velvet with single-end data on HCC resources

Useful Information

To test the performance of Velvet (velvet/1.2), we used three pairs of paired-end fastq input files: small_1.fastq and small_2.fastq, medium_1.fastq and medium_2.fastq, and large_1.fastq and large_2.fastq. Statistics about the input files, along with the time and memory used by Velvet, are shown in the table below:

input file	total # of sequences	total # of bases	total size in MB	velveth used time	velveth used memory	velvetg used time	velvetg used memory	# of used CPUs
small_1.fastq	50,121	2,506,050	8.010	~ 0.02 minutes	~ 0.3 GB	~ 0.08 minutes	~ 0.2 GB	8
small_2.fastq	50,121	2,506,050	8.010	~ 0.02 minutes	~ 0.3 GB	~ 0.08 minutes	~ 0.2 GB	8
medium_1.fastq	786,742	59,792,392	152	~ 0.4 minutes	~ 1.5 GB	~ 0.8 minutes	~ 0.9 GB	8
medium_2.fastq	786,742	59,792,392	152	~ 0.4 minutes	~ 1.5 GB	~ 0.8 minutes	~ 0.9 GB	8
large_1.fastq	10,174,715	1,027,646,215	3,376	~ 7 minutes	~ 23 GB	~ 45 minutes	~ 51 GB	8
large_2.fastq	10,174,715	1,027,646,215	3,376	~ 7 minutes	~ 23 GB	~ 45 minutes	~ 51 GB	8
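The sequence and base counts in the table also give the read length of each dataset, which is useful when choosing a k-mer value. The short Python sketch below derives it from the numbers in the table; the helper name `mean_read_length` is just for illustration.

```python
# (file, total # of sequences, total # of bases), copied from the table above.
inputs = [
    ("small_1.fastq", 50_121, 2_506_050),
    ("medium_1.fastq", 786_742, 59_792_392),
    ("large_1.fastq", 10_174_715, 1_027_646_215),
]


def mean_read_length(n_sequences, n_bases):
    """Average number of bases per read."""
    return n_bases / n_sequences


for name, n_seq, n_bases in inputs:
    print(f"{name}: {mean_read_length(n_seq, n_bases):.0f} bases per read")
```

This gives 50, 76, and 101 bases per read for the small, medium, and large files respectively. Since a k-mer cannot be longer than the read it is taken from, these lengths bound the k-mer values that make sense for each dataset.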
