Oases
Velvet on its own assembles contigs from DNA data. Oases is an extension of Velvet for de novo assembly of RNA-Seq data: it takes the preliminary assembly produced by Velvet as input and constructs transcripts from it, producing a transcriptome assembly.
To run Oases, velvetg must be run after velveth with the -read_trkg yes option:
$ velvetg output_directory/ -min_contig_lgth 200 -read_trkg yes
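The two Velvet steps that precede Oases can be wrapped in a small shell function. This is only a sketch: the function name run_velvet_for_oases, the k-mer length, and the read file names are placeholders rather than values from this guide, and the velveth options assume paired-end FASTQ reads stored in two separate files.

```shell
#!/bin/bash
# Hypothetical helper: run velveth, then velvetg with read tracking on,
# so that the resulting directory can be passed to Oases.
# Arguments: output directory, k-mer (hash) length, read file 1, read file 2.
run_velvet_for_oases() {
    local out_dir="$1" kmer="$2" reads_1="$3" reads_2="$4"
    # Assumption: paired-end FASTQ reads in two separate files.
    velveth "$out_dir" "$kmer" -fastq -shortPaired -separate "$reads_1" "$reads_2" &&
    # Same velvetg options as in the command shown above.
    velvetg "$out_dir" -min_contig_lgth 200 -read_trkg yes
}
```

For example, `run_velvet_for_oases output_directory/ 31 reads_1.fastq reads_2.fastq`, where 31 and the file names are illustrative only.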
After running velvetg with the -read_trkg option turned on, output_directory/ contains the following files:
$ ls output_directory/
contigs.fa Graph2 LastGraph Log PreGraph Roadmaps Sequences stats.txt
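Before submitting an Oases job, it can help to confirm that the Velvet directory is complete. The sketch below simply checks for the files in the listing above. The function name check_velvet_output is hypothetical, and the list of files is taken from that listing, not from the Oases manual.

```shell
#!/bin/bash
# Hypothetical helper: print every file from the listing above that is
# missing in the given directory. Returns non-zero if any file is missing.
check_velvet_output() {
    local dir="$1" missing=0 f
    for f in contigs.fa Graph2 LastGraph Log PreGraph Roadmaps Sequences stats.txt; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

A possible use is `check_velvet_output output_directory/ && echo "ready for Oases"`.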
Oases has many parameters, which are described in its manual. Note that while Velvet is multi-threaded, Oases is not, so an Oases job needs only a single task.
A simple SLURM script to run Oases on the Velvet output stored in output_directory/ with a minimum transcript length of 200 is shown below:
#!/bin/bash
#SBATCH --job-name=Velvet_Oases
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=168:00:00
#SBATCH --mem=10gb
#SBATCH --output=Oases.%J.out
#SBATCH --error=Oases.%J.err
module load oases/0.2
oases output_directory/ -min_trans_lgth 200
Oases Output
After running Oases, output_directory/ contains the following files:
$ ls output_directory/
contig-ordering.txt contigs.fa Graph2 LastGraph Log PreGraph Roadmaps Sequences stats.txt transcripts.fa
Oases produces two additional output files: transcripts.fa and contig-ordering.txt. The predicted transcript sequences are found in the fasta file transcripts.fa.
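A quick way to summarize transcripts.fa is to count its FASTA headers. The sketch below uses hypothetical function names, and count_loci additionally assumes that the headers follow the pattern `>Locus_<n>_Transcript_<i>/<total>_Confidence_<c>_Length_<len>`; check a few headers in your own file before relying on it.

```shell
#!/bin/bash
# Count the sequences in a FASTA file (one '>' header line per sequence).
count_transcripts() {
    grep -c '^>' "$1"
}

# Count distinct loci. ASSUMPTION: headers look like
# >Locus_1_Transcript_1/2_Confidence_0.500_Length_300
# so the second '_'-separated field is the locus number.
count_loci() {
    grep '^>' "$1" | cut -d_ -f2 | sort -u | wc -l | tr -d ' '
}
```

For example, `count_transcripts output_directory/transcripts.fa` prints the number of predicted transcripts.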
Useful Information
To test the performance of Oases (oases/0.2.8), we used three paired-end datasets in FASTQ format: small_1.fastq and small_2.fastq, medium_1.fastq and medium_2.fastq, and large_1.fastq and large_2.fastq. The table below shows some statistics about the input files, along with the time and memory used by Oases. The time, memory, and CPU values are given once per pair of files:
| input file | total # of sequences | total # of bases | total size in MB | used time | used memory | # of used CPUs |
|---|---|---|---|---|---|---|
| small_1.fastq | 50,121 | 2,506,050 | 8.010 | ~ 0.05 minutes | ~ 0.02 GB | 1 |
| small_2.fastq | 50,121 | 2,506,050 | 8.010 | | | |
| medium_1.fastq | 786,742 | 59,792,392 | 152 | ~ 0.25 minutes | ~ 0.315 GB | 1 |
| medium_2.fastq | 786,742 | 59,792,392 | 152 | | | |
| large_1.fastq | 10,174,715 | 1,027,646,215 | 3,376 | ~ 15 minutes | ~ 30 GB | 1 |
| large_2.fastq | 10,174,715 | 1,027,646,215 | 3,376 | | | |