CNVhac manual

A. SNP 5.0 array with control samples
     
  1. Download the code "snp5.0_with_control.tar.gz" here.
  2. Unpack the tarball as follows:
    $tar zxvf snp5.0_with_control.tar.gz
    $cd snp5.0_with_control
  3. Put the raw data (cel files) of case samples in directory ./case_cel/ and control samples in ./control_cel/.
  4. Run control_site_effect.R. This step computes the adjustment factors (see our paper for details), which will be stored in directory ./materials/site_effect_control/.
    $Rscript control_site_effect.R &
  5. Run preprocessing_with_control.R and the results will be stored in directory ./preprocessing/.
    $Rscript preprocessing_with_control.R &
  6. Run segmentation_with_control.R. The hidden state for each locus will be stored in directory ./HMM_state/ and the segmentation results in ./segmentation/.
    $Rscript segmentation_with_control.R &
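
  Steps 4 to 6 can also be chained in a small wrapper so that a later step only starts once the earlier one has finished. The sketch below is not part of the CNVhac distribution: the function name run_steps and the RSCRIPT override variable are illustrative, and it assumes that each script reads the output of the previous one and returns a non-zero exit status on failure.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with CNVhac.
# RSCRIPT is an illustrative override hook; it defaults to the real Rscript.
RSCRIPT="${RSCRIPT:-Rscript}"

# Run each script given as an argument, in order; stop at the first failure.
run_steps() {
    for script in "$@"; do
        echo "running $script"
        "$RSCRIPT" "$script" || { echo "failed: $script" >&2; return 1; }
    done
}

# Usage, from inside snp5.0_with_control/:
# run_steps control_site_effect.R preprocessing_with_control.R segmentation_with_control.R
```

  Unlike the individual commands above, this runs in the foreground; append "&" to the run_steps call to send the whole chain to the background.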
     
B. SNP 5.0 array without control samples
     
  1. Download the code "snp5.0_without_control.tar.gz" here.
  2. Unpack the tarball as follows:
    $tar zxvf snp5.0_without_control.tar.gz
    $cd snp5.0_without_control
  3. Put the raw data (cel files) of case samples in directory ./case_cel/.
  4. Run preprocessing_without_control.R and the results will be stored in directory ./preprocessing/.
    $Rscript preprocessing_without_control.R &
  5. Run segmentation_without_control.R. The hidden state for each locus will be stored in directory ./HMM_state/ and the segmentation results in ./segmentation/.
    $Rscript segmentation_without_control.R &
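
  Before starting step 4, it can be worth confirming that ./case_cel/ really contains cel files. The following is a minimal sketch; the function names count_cel and require_cel are illustrative, and it assumes the raw files end in ".CEL" or ".cel".

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with CNVhac.

# Print the number of files ending in .CEL or .cel below the given directory.
count_cel() {
    find "$1" -type f \( -name '*.CEL' -o -name '*.cel' \) 2>/dev/null | wc -l | tr -d ' '
}

# Succeed only if the directory holds at least one cel file.
require_cel() {
    n=$(count_cel "$1")
    [ "$n" -gt 0 ] || { echo "no CEL files in $1" >&2; return 1; }
    echo "$1: $n CEL file(s)"
}

# Usage, from inside snp5.0_without_control/:
# require_cel ./case_cel && Rscript preprocessing_without_control.R
```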
     
C. SNP 6.0 array with control samples
     
  1. Download the code "snp6.0_with_control.tar.gz" here.
  2. Unpack the tarball as follows:
    $tar zxvf snp6.0_with_control.tar.gz
    $cd snp6.0_with_control
  3. Put the raw data (cel files) of case samples in directory ./case_cel/ and control samples in ./control_cel/.
  4. Run control_site_effect.R. This step computes the adjustment factors (see our paper for details), which will be stored in directory ./materials/site_effect_control/.
    $Rscript control_site_effect.R &
  5. Run preprocessing_with_control.R and the results will be stored in directory ./preprocessing/.
    $Rscript preprocessing_with_control.R &
  6. Run segmentation_with_control.R. The hidden state for each locus will be stored in directory ./HMM_state/ and the segmentation results in ./segmentation/.
    $Rscript segmentation_with_control.R &
     
D. SNP 6.0 array without control samples
     
  1. Download the code "snp6.0_without_control.tar.gz" here.
  2. Unpack the tarball as follows:
    $tar zxvf snp6.0_without_control.tar.gz
    $cd snp6.0_without_control
  3. Put the raw data (cel files) of case samples in directory ./case_cel/.
  4. Run preprocessing_without_control.R and the results will be stored in directory ./preprocessing/.
    $Rscript preprocessing_without_control.R &
  5. Run segmentation_without_control.R. The hidden state for each locus will be stored in directory ./HMM_state/ and the segmentation results in ./segmentation/.
    $Rscript segmentation_without_control.R &
     
Output files description
   
  The "*_preprocessing" files in directory ./preprocessing/ list the allelic concentrations (ACs) for each locus. The "*_HMM_state" files in ./HMM_state/ give the copy number (CN) hidden state for each locus. The estimated CNV regions of each individual are given in directory ./segmentation/.
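
  After a run, a quick way to see whether all three output directories were populated is to count the files in each. This is only a sketch: the function name summarize_outputs is illustrative, and it makes no assumption about the file names beyond the directory names given above.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with CNVhac.

# For each expected output directory under the given base directory
# (default: current directory), print how many files it contains.
summarize_outputs() {
    base="${1:-.}"
    for d in preprocessing HMM_state segmentation; do
        if [ -d "$base/$d" ]; then
            n=$(find "$base/$d" -type f | wc -l | tr -d ' ')
            echo "$d: $n file(s)"
        else
            echo "$d: missing"
        fi
    done
}

# Usage, from inside the unpacked CNVhac directory:
# summarize_outputs .
```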
     
Test data
   
  Take the SNP 6.0 array as an example. Download the test dataset "snp6.0_test_data.tar.gz" here. This dataset contains two raw cel files: NA06985_GW6_C.CEL and NA06991_GW6_C.CEL. For the case with control samples, put the two cel files in directories ./snp6.0_with_control/case_cel/ and ./snp6.0_with_control/control_cel/ respectively, then run the code following the instructions in section C; the segmentation result should be the same as "segmentation_result_for_test_data_with_control.tar.gz" here. For the case without control samples, put both cel files in directory ./snp6.0_without_control/case_cel/, then run the code following the instructions in section D; the segmentation result should be the same as "segmentation_result_for_test_data_without_control.tar.gz" here.
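
  One way to check the result against the reference is a recursive diff of the two directories. The sketch below assumes that the reference tarball has already been unpacked into a directory whose file names match those in ./segmentation/, and that "the same" means byte-identical files; neither assumption is confirmed here. The function name compare_results and the path ./reference_segmentation are illustrative.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with CNVhac.

# Compare two directories recursively; print "identical" or "different".
# Returns non-zero when any file differs or exists on one side only.
compare_results() {
    if diff -r "$1" "$2" >/dev/null; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}

# Usage (the second path is a placeholder for the unpacked reference results):
# compare_results ./segmentation ./reference_segmentation
```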
     
System requirement
     
  CNVhac is a Linux-based program. We recommend at least 10 GB of memory for analyzing Affymetrix SNP 5.0 and 6.0 array data.
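
  On Linux, the installed memory can be read from /proc/meminfo before starting a run. The following is a rough sketch: the function names mem_total_gb and enough_memory are illustrative, and the result is truncated to whole binary gigabytes, so a machine sold as "10 GB" may report slightly less.

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with CNVhac.

# Print total memory in whole GiB, read from a meminfo-style file
# (default: /proc/meminfo, where MemTotal is given in kB).
mem_total_gb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Succeed if total memory is at least the given threshold (default: 10).
enough_memory() {
    [ "$(mem_total_gb "$1")" -ge "${2:-10}" ]
}

# Usage:
# enough_memory || echo "warning: less than the recommended memory" >&2
```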