Document Type : Research Paper
Authors
1
M. Sc. Graduate of Mathematical Statistics, Faculty of Sciences, Yasouj University, Yasouj, Iran
2
Assistant Professor of Mathematical Statistics, Faculty of Sciences, Yasouj University, Yasouj, Iran
3
Associate Professor of Bioinformatics and Genetics, Faculty of Agriculture, Yasouj University, Yasouj, Iran
4
Assistant Professor, Department of Agricultural Biotechnology, Faculty of Agricultural Sciences, University of Guilan, Rasht, Iran
5
Ph.D. in Animal Breeding and Genetics, University of Guilan, Rasht, Iran
Abstract
Genomic selection is one of the greatest advances in the field of animal and plant breeding in the early twentieth century. This genomic evaluation procedure, which was based on marker-assisted selection, relies on the assumption that there is linkage disequilibrium between dense single nucleotide markers (SNPs) at the genome level and quantitative trait control (QTL) sites. In terms of genetic evaluation, genomic selection influenced many common models and led to the development of new statistical genetic models, each of which explored different hypotheses. Although these models can be grouped according to different criteria, but by considering the distribution of the studied traits, they can be divided into: parametric and non-parametric groupes. In this study, the accuracy of genomic breeding values was investigated using various parametric and non-parametric statistical methods. Parametric methods were ridge regression, Lasso regression, Elastic net method, mixed models, Bayesian methods including Bayesian regression, Lasso Bayes, Bayes A, Bays B, Bays C and Bayes D. Non-parametric methods were kernel regression, reproducing kernel Hilbert spaces regression and regression support vector machine. All of these methods were performed on a real data set including genomic and phenotypic information of 2300 animals using R codes. To select the appropriate model, the criteria of accuracy (correlation of actual and estimated breeding values) and mean squared error (MSE) were used. The results showed that the predictive efficiency of parametric methods was higher than non-parametric-methods. Among the genomic evaluation models, it was shown that Bayes B was relatively more accurate and efficient than other models, however, this results did not agree with the results of other researchers, which may have been due to the data structure used in this study. Since one of the objectives of this study was to provide statistical models of genomic evaluation along with their executive codes in R environment, so the codes mentioned in this article could help the users to learn the genetic evaluation models discussed in this study.
Keywords