Volume 37 Issue 6
Nov.  2016

Newton O. OTECKO, Min-Sheng PENG, He-Chuan YANG, Ya-Ping ZHANG, Guo-Dong WANG. Re-evaluating data quality of dog mitochondrial, Y chromosomal, and autosomal SNPs genotyped by SNP array. Zoological Research, 2016, 37(6): 356-360. doi: 10.13918/j.issn.2095-8137.2016.6.356

Re-evaluating data quality of dog mitochondrial, Y chromosomal, and autosomal SNPs genotyped by SNP array

doi: 10.13918/j.issn.2095-8137.2016.6.356
Funds:  This work was supported by grants from the NSFC (91531303) and the 973 programs (2013CB835200; 2013CB835202)
  • Corresponding author: Guo-Dong WANG
  • Received Date: 2016-10-18
  • Rev Recd Date: 2016-11-04
  • Publish Date: 2016-11-18
  • Abstract: Quality deficiencies in single nucleotide polymorphism (SNP) analyses have important implications. We used missingness rates to investigate the quality of a recently published dataset containing 424 mitochondrial, 211 Y chromosomal, and 160 432 autosomal SNPs generated by a semicustom Illumina SNP array from 5 392 dogs and 14 grey wolves. Overall, the individual missingness rate for mitochondrial SNPs was ~43.8%, with 980 (18.1%) individuals completely missing mitochondrial SNP genotyping (missingness rate=1). In males, the genotype missingness rate for Y chromosomal SNPs was ~28.8%, with 374 males recording rates above 0.96. These 374 males also showed completely failed mitochondrial SNP genotyping, which is indicative of a batch effect. Individual missingness rates for autosomal markers were greater than zero but less than 0.5. Neither mitochondrial nor Y chromosomal SNPs achieved complete genotyping (locus missingness rate=0), whereas 5.9% of autosomal SNPs had a locus missingness rate=1. The high missingness rates and possible batch effect show that caution and rigorous measures are vital when genotyping and analyzing SNP array data for domestic animals. Further improvements of these arrays will be helpful to future studies.
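
The two quantities used in the abstract, individual missingness rate and locus missingness rate, can be illustrated with a minimal sketch. It assumes the usual definition (the fraction of genotype calls that are missing, taken per individual or per locus); the function names and the toy genotype matrix are hypothetical and are not taken from the article or its dataset.

```python
from typing import List, Optional

# Assumption: a missing (no-call) genotype is represented by None.
Genotype = Optional[str]


def individual_missingness(calls: List[List[Genotype]]) -> List[float]:
    """Fraction of loci with no call, computed per individual (row)."""
    return [sum(g is None for g in row) / len(row) for row in calls]


def locus_missingness(calls: List[List[Genotype]]) -> List[float]:
    """Fraction of individuals with no call, computed per locus (column)."""
    n_individuals = len(calls)
    return [
        sum(g is None for g in column) / n_individuals
        for column in zip(*calls)
    ]


# Toy matrix: 3 individuals (rows) x 4 loci (columns); values are made up.
toy_calls = [
    ["AA", "AG", None, "GG"],
    [None, None, None, None],  # completely failed sample
    ["AA", None, None, "GG"],
]

per_individual = individual_missingness(toy_calls)
per_locus = locus_missingness(toy_calls)
```

In this toy matrix the second individual has a missingness rate of 1 (every call failed), and the third locus has a locus missingness rate of 1 (no individual was genotyped), mirroring the two extreme cases the abstract reports.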