Introduction

ICLU is an open-source tool for identifying genome-wide large variants, such as copy number variations (CNVs) and loss of heterozygosity (LOH), from targeted re-sequencing data without a matched sample. It is implemented in Perl and supports only human re-sequencing data.

The algorithms and software in this package were developed by BGI (wangyu@genomics.cn).

System requirements

  1. Hardware:
    1. 64-bit x86-64 Intel CPU with SSE instructions
    2. About 6 GB of main memory to run on fastq (fq) data from Homo sapiens
    3. About 2 GB of main memory to run on bam data from Homo sapiens
  2. Software:
    1. 64-bit Linux System
    2. gcc 4.2.4 or later
    3. Perl 5.8.5 or later
    4. R 3.0.1 or later
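
The software requirements above can be checked with a short script. This is only a sketch: the helper names `version_ge` and `check_tool` are made up for illustration, the version extraction assumes the usual output format of each tool, and `sort -V` assumes GNU coreutils (normally present on Linux).

```shell
# version_ge A B: succeeds when version A >= version B (uses GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME MINIMUM FOUND: print one status line for a tool.
check_tool() {
    name=$1; min=$2; found=$3
    if [ -z "$found" ]; then
        echo "$name: not found (need >= $min)"
    elif version_ge "$found" "$min"; then
        echo "$name: $found OK (need >= $min)"
    else
        echo "$name: $found too old (need >= $min)"
    fi
}

# Assumption: each tool reports its version in the usual format.
gcc_v=$(gcc -dumpversion 2>/dev/null || true)
perl_v=$(perl -e 'printf "%vd", $^V' 2>/dev/null || true)
r_v=$(R --version 2>/dev/null | sed -n '1s/^R version \([0-9.]*\).*/\1/p')

# Minimum versions are taken from the list above.
check_tool gcc  4.2.4 "$gcc_v"
check_tool perl 5.8.5 "$perl_v"
check_tool R    3.0.1 "$r_v"
```

Each tool gets one line of output, so a missing or outdated dependency is visible before a run is started.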

Download

    Download ICLU from the link below.

    Release 1.01, 19-02-2014

    download(MD5: c7613a59e40f83e51fe83cf431464e8c)

    Download 8 test samples from the link below.

    download(sample1.sort.bam, MD5: e57d19cb1bc818468d021613328a3c0d)

    download(sample2.sort.bam, MD5: 0b97bce980eb8a4b1f281fd0bc2312ff)

    download(sample3.sort.bam, MD5: 27a46eff2c0b5fb825f10381ef4c09a0)

    download(sample4.sort.bam, MD5: 15b5af825c999c9fe6a9cb3e78327c44)

    download(sample5.sort.bam, MD5: a912f0f2f33c24b318eb7f97f8f17bc9)

    download(sample6.sort.bam, MD5: fa83c2dcdd9edb534f55f1829ae6a9ef)

    download(sample7.sort.bam, MD5: 537b1424bcc2eb998224794a30c379d4)

    download(sample8.sort.bam, MD5: c216d9b9010a44b78bc0f175ec7dc916)
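
The MD5 values listed above can be used to confirm that each download is intact. The following is a minimal sketch, assuming `md5sum` from GNU coreutils is available, that the files sit in the current directory, and that the release checksum belongs to `ICLUv1.01_package.tar.gz`. The helper name `verify_md5` is hypothetical.

```shell
# verify_md5 FILE EXPECTED: succeeds only if FILE's MD5 equals EXPECTED.
verify_md5() {
    file=$1; expected=$2
    [ -f "$file" ] || { echo "$file: missing"; return 2; }
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "$file: OK"
    else
        echo "$file: MISMATCH (got $actual)"
        return 1
    fi
}

# Checksums copied from the download list; file names and location are assumptions.
while read -r name sum; do
    verify_md5 "$name" "$sum" || true   # keep going so every file is reported
done <<'EOF'
ICLUv1.01_package.tar.gz c7613a59e40f83e51fe83cf431464e8c
sample1.sort.bam e57d19cb1bc818468d021613328a3c0d
sample2.sort.bam 0b97bce980eb8a4b1f281fd0bc2312ff
sample3.sort.bam 27a46eff2c0b5fb825f10381ef4c09a0
sample4.sort.bam 15b5af825c999c9fe6a9cb3e78327c44
sample5.sort.bam a912f0f2f33c24b318eb7f97f8f17bc9
sample6.sort.bam fa83c2dcdd9edb534f55f1829ae6a9ef
sample7.sort.bam 537b1424bcc2eb998224794a30c379d4
sample8.sort.bam c216d9b9010a44b78bc0f175ec7dc916
EOF
```

A file reported as MISMATCH or missing should be downloaded again before it is used.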

Usage

  • Download ICLUv1.01_package.tar.gz to your local directory.
  • Unpack the package and list its contents:
    $ gzip -d ICLUv1.01_package.tar.gz
    $ tar -xf ICLUv1.01_package.tar
    $ cd ICLUv1.01_package
    $ ls
    bin 	example 	material 	ICLU-v1.01.pdf
    
  • How to run: please see ICLU-v1.01.pdf.

  • Command-line options:
    1. $ perl ICLU.V1.01.pl

      		Control Create median Ri and Rmi from multiple samples
      		Case    Detect CNV, LOH and UPD for a case sample

      Constructing a datum line from multiple samples

      $ perl ICLU.V1.01.pl Control
          -Cfg <File>      The total information file.
          -Format <Str>    The format of the input file: [fq] for fastq, [bam] for bam.
          -Gender <Str>    The gender of the multiple samples: [M] for male, [F] for female.
          -O <Dir>         All processing files and results of the samples will be created in this directory.
          -h               Print this help information.

      Analysis of test samples

      $ perl ICLU.V1.01.pl Case
          -Cfg <File>      The total information file.
          -Format <Str>    The format of the input file: [fq] for fastq, [bam] for bam.
          -Mcnv <Num>      The minimum number of probes required to identify a CNV. [45]
          -P <Float>       The p-value cutoff for a single probe. [0.05]
          -Mloh <Num>      The minimum number of SNPs required to identify an LOH. [20]
          -O <Dir>         All processing files and results of the sample will be created in this directory.
          -h               Print this help information.
  • The output directory of a test sample (here sampleA) is structured as follows:
  • If the input file is in fq format (sampleA_1.fq(gz) and/or sampleA_2.fq(gz)):
    
    --sampleA
    	|--Deal_raw_data
    	|--Alignment 
    	|--Variants
    	|--Circos
    	|--sh
    
    If the input file is in bam format (sampleA.bam):
    
    --sampleA
    	|--Alignment 
    	|--Variants
    	|--Circos
    	|--sh
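
The two modes and the resulting directory layout can be tied together in a small wrapper. This is a sketch under stated assumptions, not the documented workflow: the script path `bin/ICLU.V1.01.pl`, the config file names `control.cfg` and `case.cfg`, the output directories, and the helpers `run` and `check_layout` are all hypothetical. It prints the commands by default (dry run) instead of executing them. See ICLU-v1.01.pdf for the actual configuration file format.

```shell
# Hypothetical names throughout: adjust the script path, config files and
# output directories to your own setup.
ICLU=${ICLU:-bin/ICLU.V1.01.pl}   # assumed location inside the package

# run CMD...: print the command when DRY_RUN=1 (the default), else execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# check_layout DIR FORMAT: report expected subdirectories that are missing.
# The subdirectory names are taken from the two trees shown above.
check_layout() {
    dir=$1; format=$2
    expected="Alignment Variants Circos sh"
    if [ "$format" = fq ]; then
        expected="Deal_raw_data $expected"   # only fq input has this step
    fi
    status=0
    for sub in $expected; do
        [ -d "$dir/$sub" ] || { echo "missing: $dir/$sub"; status=1; }
    done
    return $status
}

# Step 1: construct the datum line from multiple control samples.
run perl "$ICLU" Control -Cfg control.cfg -Format bam -Gender F -O control_out
# Step 2: analyse a case sample with the documented default thresholds.
run perl "$ICLU" Case -Cfg case.cfg -Format bam -Mcnv 45 -P 0.05 -Mloh 20 -O case_out
# Step 3: after a real run, confirm the per-sample layout, for example
# (assuming the sample directory is created under the -O directory):
#   check_layout case_out/sampleA bam
```

Setting `DRY_RUN=0` makes `run` execute the commands instead of printing them.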
    

Reference


  • Krzywinski, M., et al. (2009) Circos: an information aesthetic for comparative genomics, Genome Res, 19, 1639-1645.
