CNVhac manual
A. SNP 5.0 array with control samples
1. Download the code "snp5.0_with_control.tar.gz" here.
2. Unpack the tarball as follows:
$ tar zxvf snp5.0_with_control.tar.gz
$ cd snp5.0_with_control
3. Put the raw data (CEL files) of case samples in directory ./case_cel/ and control samples in ./control_cel/.
4. Run control_site_effect.R. This step computes the adjustment factors γ (see our paper for details), which will be stored in directory ./materials/site_effect_control/.
$ Rscript control_site_effect.R &
5. Run preprocessing_with_control.R and the results will be stored in directory ./preprocessing/.
$ Rscript preprocessing_with_control.R &
6. Run segmentation_with_control.R. The hidden state for each locus will be stored in directory ./HMM_state/ and the segmentation results in ./segmentation/.
$ Rscript segmentation_with_control.R &
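Each command above ends in `&`, which returns the prompt immediately, so the next step should only be started once the previous one has finished. For unattended runs, a minimal sketch that runs the steps strictly in sequence is given below. It assumes that each step depends on the output of the previous one; the function name `run_steps` and the `logs/` directory are illustrative and not part of CNVhac.

```shell
#!/bin/sh
# Run the given R scripts one after another, stopping at the first failure.
# Assumption: each step needs the output of the previous step.
# run_steps and the logs/ directory are illustrative names, not part of CNVhac.
run_steps() {
    mkdir -p logs || return 1
    for script in "$@"; do
        echo "Running $script"
        # Keep stdout and stderr of each step in its own log file.
        if ! Rscript "$script" > "logs/${script%.R}.log" 2>&1; then
            echo "Step $script failed; see logs/${script%.R}.log" >&2
            return 1
        fi
    done
}
```

From inside the unpacked directory, this could be called as `run_steps control_site_effect.R preprocessing_with_control.R segmentation_with_control.R`.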
B. SNP 5.0 array without control samples
1. Download the code "snp5.0_without_control.tar.gz" here.
2. Unpack the tarball as follows:
$ tar zxvf snp5.0_without_control.tar.gz
$ cd snp5.0_without_control
3. Put the raw data (CEL files) of case samples in directory ./case_cel/.
4. Run preprocessing_without_control.R and the results will be stored in directory ./preprocessing/.
$ Rscript preprocessing_without_control.R &
5. Run segmentation_without_control.R. The hidden state for each locus will be stored in directory ./HMM_state/ and the segmentation results in ./segmentation/.
$ Rscript segmentation_without_control.R &
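Before starting the preprocessing step, it can be worth confirming that ./case_cel/ actually contains CEL files. The sketch below assumes the files end in `.CEL` or `.cel`; `count_cel` is an illustrative helper, not part of CNVhac, and it relies on the `-maxdepth` and `-iname` options of GNU find.

```shell
# Count the CEL files directly inside a directory (case-insensitive extension).
# count_cel is an illustrative helper, not part of CNVhac.
# -maxdepth and -iname are GNU/BSD find options, available on typical Linux systems.
count_cel() {
    find "$1" -maxdepth 1 -type f -iname '*.cel' | wc -l | tr -d ' '
}
```

A possible use is `[ "$(count_cel ./case_cel)" -gt 0 ] || echo "no CEL files found in ./case_cel" >&2`.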
C. SNP 6.0 array with control samples
1. Download the code "snp6.0_with_control.tar.gz" here.
2. Unpack the tarball as follows:
$ tar zxvf snp6.0_with_control.tar.gz
$ cd snp6.0_with_control
3. Put the raw data (CEL files) of case samples in directory ./case_cel/ and control samples in ./control_cel/.
4. Run control_site_effect.R. This step computes the adjustment factors γ (see our paper for details), which will be stored in directory ./materials/site_effect_control/.
$ Rscript control_site_effect.R &
5. Run preprocessing_with_control.R and the results will be stored in directory ./preprocessing/.
$ Rscript preprocessing_with_control.R &
6. Run segmentation_with_control.R. The hidden state for each locus will be stored in directory ./HMM_state/ and the segmentation results in ./segmentation/.
$ Rscript segmentation_with_control.R &
D. SNP 6.0 array without control samples
1. Download the code "snp6.0_without_control.tar.gz" here.
2. Unpack the tarball as follows:
$ tar zxvf snp6.0_without_control.tar.gz
$ cd snp6.0_without_control
3. Put the raw data (CEL files) of case samples in directory ./case_cel/.
4. Run preprocessing_without_control.R and the results will be stored in directory ./preprocessing/.
$ Rscript preprocessing_without_control.R &
5. Run segmentation_without_control.R. The hidden state for each locus will be stored in directory ./HMM_state/ and the segmentation results in ./segmentation/.
$ Rscript segmentation_without_control.R &
Output file description
The "*_preprocessing" files in directory ./preprocessing/ list the allelic concentrations (ACs) for each locus. The "*_HMM_state" files in ./HMM_state/ give the copy number (CN) hidden state for each locus. The estimated CNV regions of each individual are written to directory ./segmentation/.
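After a run, a quick way to see whether all three stages produced output is to count the files in each output directory. This is only a sketch: the directory names are those listed above, `report_outputs` is an illustrative name, and the check says nothing about whether the file contents are correct.

```shell
# Report how many files each output directory holds under a base directory
# (default: the current directory). Returns non-zero if any directory is
# missing or empty. report_outputs is an illustrative name, not part of CNVhac.
report_outputs() {
    base=${1:-.}
    rc=0
    for dir in preprocessing HMM_state segmentation; do
        if [ -d "$base/$dir" ]; then
            n=$(find "$base/$dir" -type f | wc -l | tr -d ' ')
        else
            n=0
        fi
        echo "$dir: $n file(s)"
        [ "$n" -gt 0 ] || rc=1
    done
    return "$rc"
}
```

Running `report_outputs` from inside the unpacked directory prints one line per output directory.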
Test data
Take the SNP 6.0 array as an example. Download the test dataset "snp6.0_test_data.tar.gz" here. This dataset contains two raw CEL files: NA06985_GW6_C.CEL and NA06991_GW6_C.CEL.
For the situation with control samples, put NA06985_GW6_C.CEL in directory ./snp6.0_with_control/case_cel/ and NA06991_GW6_C.CEL in ./snp6.0_with_control/control_cel/, then run the code following the instructions in section C. The segmentation result should be the same as "segmentation_result_for_test_data_with_control.tar.gz" here.
For the situation without control samples, put both CEL files in directory ./snp6.0_without_control/case_cel/, then run the code following the instructions in section D. The segmentation result should be the same as "segmentation_result_for_test_data_without_control.tar.gz" here.
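One way to compare a test run against the reference archive is a recursive diff of the two directories. The sketch below assumes the reference tarball has already been unpacked; its internal layout is not described in this manual, so the reference path is a placeholder, and `same_results` is an illustrative name. A byte-for-byte comparison may also be stricter than intended if the output files contain run-specific details.

```shell
# Succeed only if two directories hold identical files (recursive comparison).
# Differing or missing files are listed by diff.
# same_results is an illustrative name, not part of CNVhac.
same_results() {
    if diff -rq "$1" "$2"; then
        echo "results match"
    else
        echo "results differ" >&2
        return 1
    fi
}
```

For example, `same_results ./segmentation /path/to/unpacked/reference`, where the second argument stands for wherever the reference results were unpacked.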
System requirements
CNVhac is a Linux-based program. We recommend at least 10 GB of memory for Affymetrix SNP 5.0 and 6.0 array data analysis.
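On Linux, the total memory can be read from /proc/meminfo before starting an analysis. The following is a minimal sketch with illustrative helper names (`mem_total_kb`, `has_enough_memory`). It treats the 10 GB recommendation as 10 GiB, which is an assumption, and the kernel reports slightly less than the installed RAM, so a machine with exactly 10 GB installed may fail the check.

```shell
# Print total memory in kB from a meminfo-style file (default: /proc/meminfo).
# Both helper names are illustrative, not part of CNVhac.
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed if total memory is at least the given number of GiB (default: 10).
# Assumption: the recommended 10 GB is interpreted as 10 GiB.
has_enough_memory() {
    need_kb=$(( ${1:-10} * 1024 * 1024 ))
    [ "$(mem_total_kb "${2:-/proc/meminfo}")" -ge "$need_kb" ]
}
```

A possible use is `has_enough_memory 10 || echo "less than 10 GiB of memory available in total" >&2`.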