Identification of genomic variants and investigation of their role at the genome level in Mazandarani buffaloes

Article type: Research article

Authors

Department of Animal Science, Faculty of Agriculture and Natural Resources, University of Tehran, Karaj, Iran

Abstract

Today, with advances in molecular genetic technologies and the development of bioinformatics databases, alternative methods have been developed to increase the speed and efficiency of DNA sequencing and of studying the role of sequence variants at the genome level. In whole-genome sequencing, the genome of an organism (the nuclear genome together with mitochondrial DNA) is sequenced in full. One of the important topics in genomic evaluation is the study of genome differences and variation, including single-nucleotide polymorphisms (SNPs) and insertions/deletions, in order to understand the relationship between genotype and phenotype. Polymorphisms in the genome are powerful tools for the molecular analysis of economic traits and have potential applications in breeding programs. To these ends, the genomic variants of Mazandarani buffaloes were identified and classified. In this study, the genomes of 4 Mazandarani buffaloes were sequenced with Illumina technology. Data quality was assessed with FastQC. BWA-MEM was used for alignment to the reference genome. Finally, variants were called with freebayes, and SnpEff was used to compute variant effects, reporting their type, location, and number. Results: Aligning the reads to the reference genome led to the identification of 56537534 SNPs and 6128529 insertions/deletions (indels), with average coverage of 4x to 13x. The most variants were observed on chromosome 1 and the sex chromosome (X), and the fewest on chromosome 23 and the mitochondrial genome. There were 236549743 transitions and 108015966 transversions, giving a transition/transversion ratio of 2.19. Variant counts were estimated at 52746727 in intergenic regions, 23560994 in introns, 3713594 downstream of genes, 3571409 upstream of genes, and 574093 in exons. Conclusion: Since the present study is the only research conducted on identifying genomic variants in the Mazandarani buffalo breed, the variants identified here can be used to develop high-density marker arrays for Iranian breeds.


Article title [English]

Identification of genomic variants and considering their effects in Mazandarani buffaloes

Authors [English]

  • Milad Hosseini
  • Hossein Moradi Shahrbabak
  • Mohammad Moradi Shahrbabak
Department of Animal Science, Faculty of Agriculture and Natural Resources, University of Tehran, Karaj, Iran
Abstract [English]

Advances in molecular genetic techniques and bioinformatics have produced alternative methods that increase the speed and efficiency of DNA sequencing and of studying the role of sequence variants at the genome level. In whole-genome sequencing, the complete genome of an organism (the nuclear genome together with mitochondrial DNA) is sequenced. An important topic in genomics is genome variation, including single-nucleotide polymorphisms (SNPs) and insertions/deletions (indels), which is studied to understand the relationship between genotype and phenotype. Polymorphisms are powerful tools for molecular analysis of economic traits and are important in breeding programs. For this purpose, whole-genome variants of Mazandarani buffaloes were identified and classified. In this study, the whole genomes of 4 Mazandarani buffaloes were sequenced on the Illumina platform. Data quality was assessed with FastQC. BWA-MEM was used for alignment to the reference genome. Finally, variants were called with freebayes, and SnpEff was used to predict their effects. Alignment of the reads led to the identification of 56537534 SNPs and 6128529 indels, with average coverage of 4x to 13x. The largest numbers of variants were observed on chromosomes 1 and X, and the smallest on chromosome 23 and the mitochondrial genome. There were 236549743 transitions and 108015966 transversions, giving a transition/transversion ratio of 2.19. Variant counts were estimated at 52746727 in intergenic regions, 23560994 in introns, 3713594 in downstream regions, 3571409 in upstream regions, and 574093 in exons. Because this is the only study carried out to identify genomic variants in Mazandarani buffalo, the variants identified here can be used to develop SNP arrays for Iranian breeds.

Keywords [English]

  • Whole-Genome
  • Variations
  • Next-Generation-Sequencing
  • Mazandarani Buffalo

Extended Abstract

Introduction

     Livestock breeding is one of the main sectors of animal production and plays an important role in the economy, self-sufficiency, and food security. The genetic diversity and resources of native populations that are well adapted to their environmental conditions are very valuable for breeding and production plans. Buffalo is one of the important native livestock species of the country and contributes substantially to milk and meat production.

 

Objective

     Research on DNA variants that directly affect the phenotype is one of the key fields of genetic research in domestic animals. In the past few years, single-nucleotide polymorphisms have played an important role in genetic studies of domestic animals. One of the important topics in genomic studies is genome variation, including single-nucleotide polymorphisms and indels, which is used to examine the relationship between genotype and phenotype. Polymorphisms are powerful tools for molecular analysis of economic traits and are important in breeding programs. The aim of this research was therefore to identify single-nucleotide polymorphisms and indels across the whole genome of Mazandarani buffaloes.

 

Research method

      In this study, the whole genomes of 4 Mazandarani buffaloes were sequenced on the Illumina platform. Data quality was assessed with FastQC, which applies 11 different tests. The reads were then edited with Trimmomatic, a flexible pre-processing tool that supports paired-end data and is optimized for Illumina next-generation sequencing data; it removes adapters and removes or trims low-quality reads. After the reference genome file and its annotation file were obtained, the reference genome was first indexed. The BWA-MEM software package was used to index the reference and to align the reads to the cattle reference genome (UMD3.1); it was chosen for its higher processing speed compared with other algorithms. The alignment output in SAM format was converted to BAM with the samtools package, and the samtools flagstat and depth commands were used to obtain the alignment rate and coverage. Genomic variants were called with freebayes, and SnpEff was used to predict the effects of the variants.
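
The order of the steps described above can be sketched as a small Python helper that only builds the command lines, without running anything. This is an illustrative sketch, not the authors' actual scripts: the file names, the `trimmomatic` and `snpEff` wrapper commands (as provided by some package managers), the Trimmomatic trimming settings, and the SnpEff database name are all assumptions that would need to be replaced with real values.

```python
from typing import List, Optional, Tuple

# One step = (argument vector, file that receives stdout, or None).
Step = Tuple[List[str], Optional[str]]


def build_steps(sample: str, r1: str, r2: str,
                ref: str = "UMD3.1.fa",
                snpeff_db: str = "HYPOTHETICAL_UMD3.1_DB") -> List[Step]:
    """Return the per-sample commands in execution order.

    All file names and the SnpEff database name are placeholders.
    """
    p1, u1 = f"{sample}_1P.fq.gz", f"{sample}_1U.fq.gz"
    p2, u2 = f"{sample}_2P.fq.gz", f"{sample}_2U.fq.gz"
    sam, bam = f"{sample}.sam", f"{sample}.bam"
    sorted_bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf"
    return [
        # 1. Quality control of the raw reads.
        (["fastqc", r1, r2], None),
        # 2. Adapter removal and quality trimming (settings are illustrative).
        (["trimmomatic", "PE", r1, r2, p1, u1, p2, u2,
          "ILLUMINACLIP:adapters.fa:2:30:10",
          "SLIDINGWINDOW:4:15", "MINLEN:36"], None),
        # 3. Index the reference, then align the trimmed read pairs.
        (["bwa", "index", ref], None),
        (["bwa", "mem", ref, p1, p2], sam),
        # 4. SAM -> BAM, then sort and index for downstream tools.
        (["samtools", "view", "-b", "-o", bam, sam], None),
        (["samtools", "sort", "-o", sorted_bam, bam], None),
        (["samtools", "index", sorted_bam], None),
        # 5. Alignment statistics and per-position depth.
        (["samtools", "flagstat", sorted_bam], f"{sample}.flagstat.txt"),
        (["samtools", "depth", sorted_bam], f"{sample}.depth.txt"),
        # 6. Variant calling and effect annotation.
        (["freebayes", "-f", ref, sorted_bam], vcf),
        (["snpEff", snpeff_db, vcf], f"{sample}.ann.vcf"),
    ]


def as_shell(step: Step) -> str:
    """Render one step as a shell line, with '>' for captured stdout."""
    argv, out = step
    line = " ".join(argv)
    return f"{line} > {out}" if out else line


if __name__ == "__main__":
    for step in build_steps("buffalo1", "b1_R1.fq.gz", "b1_R2.fq.gz"):
        print(as_shell(step))
```

Keeping the steps as data rather than executing them directly makes the order easy to inspect and to adapt per sample before anything is run.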

 

Results

     Alignment of the reads led to the identification of 56537534 SNPs and 6128529 indels, with average coverage of 4x to 13x. The largest numbers of variants were observed on chromosomes 1 and X, and the smallest on chromosome 23 and the mitochondrial genome. There were 236549743 transitions and 108015966 transversions, giving a transition/transversion ratio of 2.19. Variant counts were estimated at 52746727 in intergenic regions, 23560994 in introns, 3713594 in downstream regions, 3571409 in upstream regions, and 574093 in exons.
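
As a quick arithmetic check, the transition/transversion ratio and the relative share of each annotated region can be recomputed from the reported counts. The sketch below uses only the numbers given above; the function and variable names are illustrative. Note that the five region counts sum to more than the number of SNPs plus indels, presumably because one variant can be annotated in more than one category, so the shares describe annotations rather than distinct variants.

```python
# Counts as reported in the Results section.
TRANSITIONS = 236549743
TRANSVERSIONS = 108015966

REGION_COUNTS = {
    "intergenic": 52746727,
    "intron": 23560994,
    "downstream": 3713594,
    "upstream": 3571409,
    "exon": 574093,
}


def ts_tv_ratio(transitions: int, transversions: int) -> float:
    """Transition/transversion ratio (Ts/Tv)."""
    return transitions / transversions


def region_shares(counts: dict) -> dict:
    """Fraction of the listed annotations that falls in each region."""
    total = sum(counts.values())
    return {region: n / total for region, n in counts.items()}


if __name__ == "__main__":
    print(f"Ts/Tv = {ts_tv_ratio(TRANSITIONS, TRANSVERSIONS):.2f}")
    for region, share in region_shares(REGION_COUNTS).items():
        print(f"{region:<11s} {share:6.2%}")
```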

 

Conclusion

     Buffalo provide part of the income and necessities of the rural population, so these animals deserve special attention in order to raise rural welfare and to increase the production efficiency of buffaloes in the country. Knowledge of their genetic variation is therefore very important for improving their production level. The present study is the only research carried out to identify the genomic variants of Mazandarani buffalo, so the variants identified here can be used to develop high-density SNP arrays for genetic and breeding applications in Iranian breeds.
