SOAPaligner (which includes SOAP, SOAP2, SOAP3 and SOAP3-dp) is a short-read alignment tool and a member of SOAP (Short Oligonucleotide Analysis Package). The first-generation SOAPaligner was called SOAP, for Short Oligonucleotide Alignment Program. SOAP has since evolved from a single alignment tool into a tool package that provides a full solution for next-generation sequencing data analysis. Currently, the package consists of an alignment tool (SOAPaligner), a resequencing consensus sequence builder (SOAPsnp), an indel finder (SOAPindel), a structural variation scanner (SOAPsv), and a de novo short-read assembler (SOAPdenovo).

SOAPaligner replaces the seed strategy of the original SOAP with a Burrows-Wheeler Transform (BWT) compression index (Burrows and Wheeler, 1994) for indexing the reference sequence in main memory. Its GPU-based versions additionally use GPU acceleration to increase alignment speed.
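To make the idea of a BWT index concrete, the following is a minimal Python sketch of a Burrows-Wheeler Transform with exact-match backward search. It is a toy illustration only, not SOAPaligner's implementation: the function names (bwt_index, backward_search, find_all) are hypothetical, the suffix array is built by naive sorting, and occurrence counts are recomputed on each step rather than stored in a compressed table as a real aligner would do.

```python
from collections import Counter


def bwt_index(text):
    """Return (suffix_array, bwt) for text plus a '$' sentinel.

    Naive construction by sorting all suffixes; suitable only for toy inputs.
    """
    text += "$"  # the sentinel sorts before A, C, G and T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT character of each sorted suffix is the character preceding it
    # (text[-1], the sentinel, for the suffix that starts at position 0).
    bwt = "".join(text[i - 1] for i in suffix_array)
    return suffix_array, bwt


def backward_search(bwt, pattern):
    """Return the half-open range of suffix-array rows that start with pattern."""
    # first_row[c] = number of characters in the text strictly smaller than c
    counts = Counter(bwt)
    first_row, total = {}, 0
    for c in sorted(counts):
        first_row[c] = total
        total += counts[c]

    lo, hi = 0, len(bwt)
    # Extend the match one character at a time, from the pattern's last
    # character to its first.
    for c in reversed(pattern):
        if c not in first_row:
            return 0, 0
        lo = first_row[c] + bwt[:lo].count(c)
        hi = first_row[c] + bwt[:hi].count(c)
        if lo >= hi:
            return 0, 0
    return lo, hi


def find_all(text, pattern):
    """Return the sorted 0-based positions of exact matches of pattern in text."""
    suffix_array, bwt = bwt_index(text)
    lo, hi = backward_search(bwt, pattern)
    return sorted(suffix_array[lo:hi])
```

For example, find_all("ACGTACGA", "ACG") looks up the read "ACG" without scanning the reference, which is the property that lets a BWT index stand in for a seed table. Handling mismatches and gaps, as a real aligner must, is outside the scope of this sketch.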

Because SOAP2 surpasses the original SOAP in both speed and accuracy, the Basic Protocol describes how to use SOAP2 to align a collection of short paired-end reads to a reference genome. It includes an indexing step that describes how to build an index for the reference genome with the 2bwt-build tool. The Consensus and SNP Calling Protocol (Alternate Protocol 1) describes how to use SOAPsnp (Li et al., 2009) to call SNPs from SOAP2 output. The GPU-based SOAPaligner Protocol (Alternate Protocol 2) describes how to use SOAP3-dp to align short reads to a reference. Finally, the Support Protocols describe how to obtain and install the SOAPaligner software.

System requirements

Use a computer with as much memory and computing power as possible. At least 8 GB of memory is required, and 16 to 32 GB is recommended. More than 8 GB of free disk space is also required.

GPU Hardware

A multi-core CPU (default quad-core) with 20 GB of main memory, and a CUDA-enabled GPU with compute capability 2.0 and at least 3 GB of memory (default 6 GB).

Note: SOAP3-dp has been tested with the following GPUs: NVIDIA Tesla C2070 (6 GB memory), Tesla M2050 (3 GB memory), and GTX 580 (3 GB memory). It should also work with the Tesla C2050, Tesla C2075, and Quadro 6000.


Only 64-bit Linux is supported as the operating system (Linux/x86_64; 64-bit AMD/Intel and compatible), with kernel 2.6 or more recent.
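The CPU-side requirements above (64-bit Linux, at least 8 GB of memory, more than 8 GB of free disk space) can be checked before installation. The following Python sketch is one way to do so. It is not part of the SOAP package: the function names check_requirements and probe_host are hypothetical, and the thresholds are simply the figures quoted above, expressed in decimal gigabytes. It does not check the kernel version or the GPU.

```python
import os
import platform
import shutil

GB = 10 ** 9  # decimal gigabyte; a machine sold as "8 GB" may report slightly less


def check_requirements(system, machine, mem_bytes, free_disk_bytes,
                       min_mem_gb=8, min_disk_gb=8):
    """Return a list of unmet requirements; an empty list means all checks pass."""
    problems = []
    if system != "Linux" or machine not in ("x86_64", "amd64"):
        problems.append(f"64-bit Linux/x86_64 required, found {system}/{machine}")
    if mem_bytes < min_mem_gb * GB:
        problems.append(
            f"at least {min_mem_gb} GB of memory required, found {mem_bytes / GB:.1f} GB")
    if free_disk_bytes < min_disk_gb * GB:
        problems.append(
            f"at least {min_disk_gb} GB of free disk required, "
            f"found {free_disk_bytes / GB:.1f} GB")
    return problems


def probe_host(path="."):
    """Read OS, architecture, physical memory and free disk space of this host.

    os.sysconf with these names is available on Linux, the only supported OS.
    """
    mem_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    free_disk_bytes = shutil.disk_usage(path).free
    return platform.system(), platform.machine(), mem_bytes, free_disk_bytes
```

Calling check_requirements(*probe_host()) on the target machine returns the list of problems found. For the GPU-based protocol, passing min_mem_gb=20 applies the larger main-memory figure given under GPU Hardware.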


E. coli reference file: e_coli.fa.gz

E. coli reads test files: e_coli.1.fq.gz and e_coli.2.fq.gz

Simulated SNP list: e_coli_snp.lst

Alignment result: e_coli.result.soap.gz

Alignment result: e_coli.result.unpaired

Alignment result: e_coli.result.unmap

Sorted alignment result: e_coli.result.sorted.gz

SNP call result: e_coli.snp.consensus



