Effect of Marker Genotyping Error on the Prediction Accuracy of Genomic Breeding Value in Threshold Traits

Document Type : Research Paper


1 Independent researcher (PhD in Genetics and Animal Breeding), Department of Animal Science, College of Agriculture, University of Kurdistan, Kurdistan, Iran

2 Associate Professor, Department of Animal Science, Astara Branch, Islamic Azad University, Astara, Iran


This simulation study investigated how the marker genotyping error rate and the type of mating and selection design (based on breeding value or on phenotype) affect the accuracy of genomic prediction for a threshold trait under different levels of heritability (0.05, 0.1 and 0.3) and marker density (500, 1000 and 1500 markers). The genome consisted of two chromosomes, each 100 cM long, with 125 QTLs randomly distributed on each chromosome. To simulate a threshold trait, the top 20 percent of phenotypes were coded as 2 and the rest as 1. Genomic breeding values were predicted using marker effects estimated by the Bayes B method. Comparison of the genomic evaluations showed that the selection and mating design based on breeding value was more accurate than the design based on phenotype. The accuracy of genomic prediction decreased with increasing marker genotyping error under both designs. The results also showed that, as the marker genotyping error rate increased, a larger number of markers led to higher accuracy of genomic breeding value prediction.
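The threshold coding described above (top 20 percent of phenotypes coded as 2, the rest as 1) can be illustrated with a minimal Python sketch. The function name `to_threshold`, the sample size, and the way the underlying continuous phenotype is generated are assumptions made for illustration only; they are not taken from the study's simulation software.

```python
import random


def to_threshold(phenotypes, top_fraction=0.20):
    """Code the top `top_fraction` of continuous phenotypes as 2 and the rest as 1."""
    n = len(phenotypes)
    n_top = round(n * top_fraction)
    # Indices ordered from the highest to the lowest phenotype.
    order = sorted(range(n), key=lambda i: phenotypes[i], reverse=True)
    codes = [1] * n
    for i in order[:n_top]:
        codes[i] = 2
    return codes


# Hypothetical continuous phenotypes: a genetic part plus a residual part,
# with variances chosen so that heritability is 0.3 (an illustrative value).
rng = random.Random(1)
h2 = 0.3
liability = [
    rng.gauss(0, h2 ** 0.5) + rng.gauss(0, (1 - h2) ** 0.5)
    for _ in range(1000)
]
codes = to_threshold(liability)
print(codes.count(2), "of", len(codes), "animals are coded as 2")
```

With 1000 animals and a top fraction of 0.20, exactly 200 animals receive code 2, whatever the heritability used to generate the continuous phenotype.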



Extended Abstract


Many traits recorded in domestic species, including the litter size of large mammals, the degree of calving difficulty, and resistance to diseases, show a discrete distribution of phenotypes and are commonly referred to as threshold traits. Genetic progress in such traits depends on the genetic diversity in the population, the selection intensity, the accuracy of prediction and the generation interval. Genomic selection is a novel method to improve quantitative traits in plant and animal breeding. The factors affecting the accuracy of genomic prediction are the heritability of the trait of interest, the number of individuals in the reference population, the extent of relationships between selection candidates and the reference population, the accuracy of estimated marker effects, linkage disequilibrium between markers and quantitative trait loci (QTL), the distribution of QTL effects, and marker genotyping error. Reported marker genotyping error rates range from 0.1% to 15%, which may significantly influence genomic prediction accuracy. Hence, the objective of this study was to explore the impact of different heritability levels, marker densities, rates of marker genotyping error, and types of selection and mating design on the accuracy of genomic prediction for a threshold trait.


Materials and methods

To compare the different scenarios, the datasets were simulated with the QMSim software. The simulation started with a base population of 1000 animals (500 males and 500 females), which were randomly mated for 1000 generations. Subsequently, 20 females and 200 males were randomly selected from the last historical generation to expand the population for an additional 10 generations. The training set comprised individuals from the 8th and 9th generations, while the validation set consisted of all individuals from the 10th generation. The simulated genome consisted of two chromosomes, each 1 Morgan long. Scenarios were established to examine the impact of several factors on the accuracy of genomic prediction: the rate of marker genotyping error (0%, 4%, 8%, and 12%), the type of mating and selection design (based on breeding value or on phenotype), the level of heritability (0.05, 0.1, and 0.3), and the marker density (500, 1000, and 1500 markers). A threshold model was employed, and the marker effects were estimated using the Bayes B method. Prediction accuracy was assessed as the correlation between the true and the estimated genomic breeding values.
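The two quantities at the centre of these scenarios, the genotyping error rate and the accuracy measured as a correlation, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the error model (each genotype code is replaced by one of the two other codes with probability equal to the error rate), the random genotypes and effects, and the function names are hypothetical, and the true marker effects stand in for Bayes B estimates, which the sketch does not implement. It is not the QMSim procedure used in the study.

```python
import random


def add_genotyping_error(genotypes, error_rate, rng):
    """Return a copy in which each genotype code (0, 1 or 2) is replaced,
    with probability `error_rate`, by one of the two other codes."""
    noisy = []
    for row in genotypes:
        new_row = []
        for g in row:
            if rng.random() < error_rate:
                g = rng.choice([c for c in (0, 1, 2) if c != g])
            new_row.append(g)
        noisy.append(new_row)
    return noisy


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def breeding_values(genotypes, effects):
    """Sum of marker effects weighted by genotype code, per animal."""
    return [sum(g * a for g, a in zip(row, effects)) for row in genotypes]


# Hypothetical data: random genotypes and normally distributed marker effects.
rng = random.Random(42)
n_animals, n_markers = 200, 500
genotypes = [[rng.choice((0, 1, 2)) for _ in range(n_markers)]
             for _ in range(n_animals)]
effects = [rng.gauss(0, 1) for _ in range(n_markers)]
true_bv = breeding_values(genotypes, effects)

accuracy = {}
for rate in (0.0, 0.04, 0.08, 0.12):
    noisy = add_genotyping_error(genotypes, rate, rng)
    # Stand-in for a genomic prediction: true effects applied to the
    # erroneous genotypes (no Bayes B estimation is performed here).
    predicted = breeding_values(noisy, effects)
    accuracy[rate] = pearson(true_bv, predicted)
    print(f"error rate {rate:.2f}: accuracy {accuracy[rate]:.3f}")
```

Because the marker effects are taken as known, the sketch isolates only the direct effect of erroneous genotypes on the correlation; in the study the errors also affect the estimation of marker effects in the training set.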


Results and discussion

With breeding value as the mating and selection design, the prediction accuracy ranged from 0.54 to 0.56 for h² = 0.05, from 0.61 to 0.66 for h² = 0.10, and from 0.79 to 0.82 for h² = 0.30. With phenotype as the mating and selection design, the prediction accuracy ranged from 0.63 to 0.69 for h² = 0.05, from 0.69 to 0.72 for h² = 0.10, and from 0.79 to 0.84 for h² = 0.30. The comparison of these results revealed that the selection and mating design based on breeding value exhibited higher accuracy than the design based on phenotype. Increasing the number of SNPs from 500 to 1500 improved prediction accuracy under both the breeding value-based and the phenotype-based designs. However, the accuracy of genomic prediction declined in all scenarios as the marker genotyping error increased from zero to 12 percent.



In summary, the findings indicate that increasing the number of markers from 500 to 1500 mitigates the impact of marker genotyping error at rates from zero to 12%. Consequently, increasing the number of markers enhances the accuracy of genomic evaluation.
