1
Taguchi YH, Komaki S, Sutoh Y, Ohmomo H, Otsuka-Yamasaki Y, Shimizu A. Integrated analysis of human DNA methylation, gene expression, and genomic variation in iMETHYL database using kernel tensor decomposition-based unsupervised feature extraction. PLoS One 2023; 18:e0289029. [PMID: 37556429 PMCID: PMC10411815 DOI: 10.1371/journal.pone.0289029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Accepted: 07/07/2023] [Indexed: 08/11/2023]
Abstract
Integrating gene expression, DNA methylation, and genomic variants simultaneously without location coincidence (i.e., irrespective of distance from each other) or pairwise coincidence (i.e., direct identification of triplets of gene expression, DNA methylation, and genomic variants, and not integration of pairwise coincidences) is difficult. In this study, we integrated gene expression, DNA methylation, and genomic variants from the iMETHYL database using the recently proposed kernel tensor decomposition-based unsupervised feature extraction method with limited computational resources (i.e., short CPU time and small memory requirements). Our method does not require prior knowledge of the subjects because it relies on fully unsupervised tensor decomposition. The selected genes and genomic variants were significantly targeted by transcription factors that were biologically enriched in KEGG pathway terms as well as in the intra-related regulatory network. The proposed method is promising for integrated analyses of gene expression, methylation, and genomic variants with limited computational resources.
Affiliation(s)
- Y-h. Taguchi
  - Department of Physics, Chuo University, Tokyo, Japan
- Shohei Komaki
  - Division of Biomedical Information Analysis, Iwate Tohoku Medical Megabank Organization, Disaster Reconstruction Center, Iwate Medical University, Iwate, Japan
- Yoichi Sutoh
  - Division of Biomedical Information Analysis, Iwate Tohoku Medical Megabank Organization, Disaster Reconstruction Center, Iwate Medical University, Iwate, Japan
- Hideki Ohmomo
  - Division of Biomedical Information Analysis, Iwate Tohoku Medical Megabank Organization, Disaster Reconstruction Center, Iwate Medical University, Iwate, Japan
- Yayoi Otsuka-Yamasaki
  - Division of Biomedical Information Analysis, Iwate Tohoku Medical Megabank Organization, Disaster Reconstruction Center, Iwate Medical University, Iwate, Japan
- Atsushi Shimizu
  - Division of Biomedical Information Analysis, Iwate Tohoku Medical Megabank Organization, Disaster Reconstruction Center, Iwate Medical University, Iwate, Japan
2
Hansen PB, Ruud AK, de los Campos G, Malinowska M, Nagy I, Svane SF, Thorup-Kristensen K, Jensen JD, Krusell L, Asp T. Integration of DNA Methylation and Transcriptome Data Improves Complex Trait Prediction in Hordeum vulgare. PLANTS 2022; 11:plants11172190. [PMID: 36079572 PMCID: PMC9459846 DOI: 10.3390/plants11172190] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/23/2022] [Revised: 08/19/2022] [Accepted: 08/21/2022] [Indexed: 11/30/2022]
Abstract
Whole-genome multi-omics profiles contain valuable information for the characterization and prediction of complex traits in plants. In this study, we evaluate multi-omics models to predict four complex traits in barley (Hordeum vulgare): grain yield, thousand kernel weight, protein content, and nitrogen uptake. Genomic, transcriptomic, and DNA methylation data were obtained from 75 spring barley lines tested in the RadiMax semi-field phenomics facility under control and water-scarce treatments. By integrating multi-omics data at the genomic, transcriptomic, and DNA methylation regulatory levels, a higher proportion of phenotypic variance was explained (0.72–0.91) than with genomic models alone (0.55–0.86). The correlation between predictions and phenotypes varied from 0.17–0.28 for control plants and 0.23–0.37 for water-scarce plants, and the increase in accuracy was significant for nitrogen uptake and protein content compared to models using genomic information alone. Adding transcriptomic and DNA methylation information to the prediction models explained more of the phenotypic variance attributed to the environment in grain yield and nitrogen uptake. It furthermore explained more of the non-additive genetic effects for thousand kernel weight and protein content. Our results show the feasibility of multi-omics prediction for complex traits in barley.
Affiliation(s)
- Pernille Bjarup Hansen
  - Center for Quantitative Genetics and Genomics, Aarhus University, 4200 Slagelse, Denmark
  - Correspondence: (P.B.H.); (T.A.); Tel.: +45-87158243 (T.A.)
- Anja Karine Ruud
  - Center for Quantitative Genetics and Genomics, Aarhus University, 4200 Slagelse, Denmark
- Gustavo de los Campos
  - Departments of Epidemiology & Biostatistics and Statistics & Probability, Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Marta Malinowska
  - Center for Quantitative Genetics and Genomics, Aarhus University, 4200 Slagelse, Denmark
- Istvan Nagy
  - Center for Quantitative Genetics and Genomics, Aarhus University, 4200 Slagelse, Denmark
- Simon Fiil Svane
  - Section for Crop Sciences, Department of Plant and Environmental Sciences, Copenhagen University, 2630 Taastrup, Denmark
- Kristian Thorup-Kristensen
  - Section for Crop Sciences, Department of Plant and Environmental Sciences, Copenhagen University, 2630 Taastrup, Denmark
- Lene Krusell
  - Sejet Plant Breeding, Nørremarksvej 67, 8700 Horsens, Denmark
- Torben Asp
  - Center for Quantitative Genetics and Genomics, Aarhus University, 4200 Slagelse, Denmark
  - Correspondence: (P.B.H.); (T.A.); Tel.: +45-87158243 (T.A.)
3
Wang T, Xia P, Su P. High-Dimensional DNA Methylation Mediates the Effect of Smoking on Crohn's Disease. Front Genet 2022; 13:831885. [PMID: 35450213 PMCID: PMC9016182 DOI: 10.3389/fgene.2022.831885] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/09/2021] [Accepted: 02/01/2022] [Indexed: 11/13/2022]
Abstract
Epigenome-wide mediation analysis aims to identify, among high-dimensional DNA methylation data, the cytosine–phosphate–guanine (CpG) sites that mediate the causal effect of smoking on Crohn’s disease (CD) outcome. Studies have shown that smoking has significant detrimental effects on the course of CD. We therefore assessed whether DNA methylation mediates the association between smoking and CD. Among 103 CD cases and 174 controls, we estimated whether the effects of smoking on CD are mediated through DNA methylation at CpG sites, which we refer to as the causal mediation effect. Based on the causal diagram, we first implemented sure independence screening (SIS) to reduce the pool of potential mediator CpGs from a very large to a moderate number; then, we implemented variable selection with de-sparsified LASSO regression. Finally, we carried out a comprehensive mediation analysis and conducted a sensitivity analysis, adjusted for the potential confounders of age, sex, and blood cell type proportions, to estimate the mediation effects. Smoking was significantly associated with CD, with an odds ratio (OR) of 2.319 (95% CI: 1.603, 3.485; p < 0.001) after adjustment for confounders. Ninety-nine mediator CpGs were selected by SIS, and seven candidate CpGs were then obtained by de-sparsified LASSO regression. Four of these CpGs showed statistical significance, and the average causal mediation effects (ACME) ranged from 0.066 to 0.126. Notably, three significant mediator CpGs had absolute sensitivity parameters of 0.40, indicating that these mediation effects were robust even when the assumptions were slightly violated. Genes (BCL3 and FKBP5) harboring these four CpGs were related to CD. These findings suggest that changes in methylation are involved in the mechanism by which smoking increases the risk of CD.
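The screening step mentioned in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that sure independence screening ranks each candidate CpG by the absolute marginal correlation between its methylation values and the exposure, and the function names (`pearson`, `sis_screen`) and the toy values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sis_screen(features, exposure, keep):
    """Assumed SIS rule: keep the `keep` features with the largest
    absolute marginal correlation with the exposure."""
    scores = [abs(pearson(col, exposure)) for col in features]
    ranked = sorted(range(len(features)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


# Toy data (hypothetical): 8 subjects, smoking coded 0/1, four candidate CpGs.
exposure = [0, 1, 0, 1, 0, 1, 0, 1]
features = [
    [0.1, 0.9, 0.2, 0.8, 0.1, 0.9, 0.2, 0.8],  # tracks the exposure
    [0.5, 0.5, 0.4, 0.6, 0.5, 0.5, 0.6, 0.4],  # essentially unrelated
    [0.9, 0.1, 0.8, 0.2, 0.9, 0.2, 0.8, 0.1],  # tracks it inversely
    [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],  # constant, carries no signal
]
selected = sis_screen(features, exposure, keep=2)
print(selected)
```

In a real analysis the retained CpGs would then go through a penalized regression step and the mediation tests; this sketch covers only the dimension-reduction idea.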
Affiliation(s)
- Tingting Wang
  - Institute of Medical Sciences, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China
- Pingtian Xia
  - Department of General Surgery, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China
- Ping Su
  - Institute of Medical Sciences, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China
4
Yi X, He S, Wang S, Zhao H, Wu M, Liu S, Pan Y, Zhang Y, Sun X. Expression of different genotypes of bovine TRDMT1 gene and its polymorphisms association with body measures in Qinchuan cattle (Bos Taurus). Anim Biotechnol 2021:1-11. [PMID: 34629027 DOI: 10.1080/10495398.2021.1984248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Abstract
DNA methyltransferase 2 (DNMT2) was renamed tRNA aspartic acid methyltransferase 1 (TRDMT1) because it catalyzes the methylation of C38 in the tRNAAsp anticodon loop. Advances in nucleic acid sequencing and protein detection techniques have helped demonstrate that TRDMT1-mediated tRNA modification affects protein synthesis efficiency. This process affects the growth and development of animals. In this experiment, DNA was collected from 224 Qinchuan cattle aged 2-4 years. Genetic variations in the exons and some intron regions of TRDMT1 were detected by mixed pool sequencing. qRT-PCR and Western blot were used to detect the mRNA and protein expression levels associated with combinations of different genetic variant loci. Three haplotypes were detected, with different distribution ratios. Muscle tissue mRNA and protein testing showed that mRNA expression levels differed among genotypes (P < 0.05) and that protein expression levels showed the same trend as mRNA. This study provides potential molecular materials for the improvement of Qinchuan cattle reproductivity and provides theoretical support for studying the effects of livestock TRDMT1 on animal growth and development.
Affiliation(s)
- Xiaohua Yi
  - College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Shuai He
  - College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Shuhui Wang
  - College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Haidong Zhao
  - College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Mingli Wu
  - College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Shirong Liu
  - College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Yun Pan
  - College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Yu Zhang
  - College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Xiuzhu Sun
  - College of Grassland Agriculture, Northwest A&F University, Yangling, Shaanxi, China
5
Wang K, Wu P, Wang S, Ji X, Chen D, Jiang A, Xiao W, Gu Y, Jiang Y, Zeng Y, Xu X, Li X, Tang G. Genome-wide DNA methylation analysis in Chinese Chenghua and Yorkshire pigs. BMC Genom Data 2021; 22:21. [PMID: 34134626 PMCID: PMC8207654 DOI: 10.1186/s12863-021-00977-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/03/2021] [Accepted: 06/07/2021] [Indexed: 11/10/2022]
Abstract
Background: The Chinese Chenghua pig (CHP) is a typical Chinese domestic fatty pig breed with superior meat quality characteristics, while the Yorkshire pig (YP) is characterized by fast growth and a high lean meat rate. Long-term natural and artificial selection have resulted in great phenotypic differences between the two breeds, including growth, development, production performance, meat quality, and coat color. However, genome-wide DNA methylation differences between CHP and YP remain unclear. Results: DNA methylation data were generated for muscle tissues of CHP and YP using reduced representation bisulfite sequencing (RRBS). In this study, a total of 2,416,211 CpG sites were identified. In addition, the genome-wide DNA methylation analysis revealed 722 differentially methylated regions (DMRs) and 466 differentially methylated genes (DMGs) in the pairwise CHP vs. YP comparison. Six key genomic regions (Sus scrofa chromosome (SSC)1:253.47–274.23 Mb, SSC6:148.71–169.49 Mb, SSC7:0.25–9.86 Mb, SSC12:43.06–61.49 Mb, SSC14:126.43–140.95 Mb, and SSC18:49.17–54.54 Mb) containing multiple DMRs were identified, and differences in methylation patterns in these regions may be related to phenotypic differences between CHP and YP. Based on the functional analysis of DMGs, 8 DMGs (ADCY1, AGBL4, EXOC2, FUBP3, PAPPA2, PIK3R1, MGMT and MYH8) were considered important candidate genes associated with muscle development and meat quality traits in pigs. Conclusions: This study explored the difference in meat quality between CHP and YP from the epigenetic point of view, which has important reference significance for the local pork industry and pork food processing. Supplementary Information: The online version contains supplementary material available at 10.1186/s12863-021-00977-0.
Affiliation(s)
- Kai Wang
  - Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
- Pingxian Wu
  - Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
- Shujie Wang
  - Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
- Xiang Ji
  - Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
- Dong Chen
  - Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
- Anan Jiang
  - Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
- Weihang Xiao
  - Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
- Yiren Gu
  - Sichuan Animal Science Academy, Chengdu, 610066, China
- Yanzhi Jiang
  - College of Life Science, Sichuan Agricultural University, Yaan, China
- Xu Xu
  - Sichuan Animal Husbandry Station, Chengdu, 610041, China
- Xuewei Li
  - Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
- Guoqing Tang
  - Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
6
Roudbar MA, Mousavi SF, Ardestani SS, Lopes FB, Momen M, Gianola D, Khatib H. Prediction of biological age and evaluation of genome-wide dynamic methylomic changes throughout human aging. G3-GENES GENOMES GENETICS 2021; 11:6214518. [PMID: 33826720 PMCID: PMC8495934 DOI: 10.1093/g3journal/jkab112] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/22/2021] [Accepted: 03/29/2021] [Indexed: 11/14/2022]
Abstract
The use of DNA methylation signatures to predict chronological age and aging rate is of interest in many fields, including disease prevention and treatment, forensics, and anti-aging medicine. Although a large number of methylation markers are significantly associated with age, most age-prediction methods use a few markers selected based on either previously published studies or datasets containing methylation information. Here, we implemented reproducing kernel Hilbert spaces (RKHS) regression and a ridge regression model in a Bayesian framework that utilized phenotypic and methylation profiles simultaneously to predict chronological age. We used over 450,000 CpG sites from the whole blood of a large cohort of 4,409 human individuals aged 10-101 years. Models were fitted using methylation measurements both adjusted and unadjusted for cell heterogeneity. Unadjusted methylation scores delivered a significantly higher prediction accuracy than adjusted methylation data, with a correlation between age and predicted age of 0.98 and a root-mean-square error (RMSE) of 3.54 years in unadjusted data, and 0.90 (correlation) and 7.16 years (RMSE) in adjusted data. Reducing the number of predictors (CpG sites) through subset selection improved predictive power, with a correlation of 0.98 and an RMSE of 2.98 years in the RKHS model. We found distinct global methylation patterns, with a significant increase in the proportion of methylated cytosines in CpG islands and a decreased proportion in other CpG types, including CpG shore, shelf, and open sea (p < 5e-06). Epigenetic drift seemed to be a widespread phenomenon, as more than 97% of the age-associated methylation sites showed heteroscedasticity. Apparent methylomic aging rate (AMAR) had a sex-specific pattern, with an increase in AMAR with age in females relative to males.
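The two accuracy measures quoted in this abstract, the correlation between age and predicted age and the RMSE, can be computed as in the following sketch. The ages and predictions are made-up values for illustration only, not data from the study, and the function names are hypothetical.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (years here)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical chronological ages and model predictions, in years.
ages = [10, 35, 52, 78, 101]
predicted_ages = [13, 33, 55, 75, 99]

r = pearson_r(ages, predicted_ages)
err = rmse(ages, predicted_ages)
print(f"correlation = {r:.3f}, RMSE = {err:.2f} years")
```

Reporting both measures is useful because a model can correlate strongly with age while still being off by a constant offset, which only the RMSE would reveal.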
Affiliation(s)
- Mahmoud Amiri Roudbar
  - Department of Animal Science, Safiabad-Dezful Agricultural and Natural Resources Research and Education Center, Agricultural Research, Education & Extension Organization (AREEO), Dezful, Iran
- Seyedeh Fatemeh Mousavi
  - Department of Animal Science, Faculty of Agriculture Engineering, University of Kurdistan, Sanandaj, Iran
- Siavash Salek Ardestani
  - Department of Animal Science and Aquaculture, Dalhousie University, Truro, NS B2N 5E3, Canada
- Fernando Brito Lopes
  - Department of Animal Sciences, Sao Paulo State University, Julio de Mesquita Filho (UNESP), Prof. Paulo Donato Castelane, Jaboticabal, SP, 14884-900, Brazil
- Mehdi Momen
  - Department of Surgical Sciences, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Daniel Gianola
  - Department of Animal and Dairy Sciences, University of Wisconsin-Madison, 53706, Madison, WI, USA
- Hasan Khatib
  - Department of Animal and Dairy Sciences, University of Wisconsin-Madison, 53706, Madison, WI, USA
7
Baba T, Pegolo S, Mota LFM, Peñagaricano F, Bittante G, Cecchinato A, Morota G. Integrating genomic and infrared spectral data improves the prediction of milk protein composition in dairy cattle. Genet Sel Evol 2021; 53:29. [PMID: 33726672 PMCID: PMC7968271 DOI: 10.1186/s12711-021-00620-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/15/2020] [Accepted: 03/01/2021] [Indexed: 11/20/2022]
Abstract
Background: Over the past decade, Fourier transform infrared (FTIR) spectroscopy has been used to predict novel milk protein phenotypes. Genomic data might help predict these phenotypes when integrated with milk FTIR spectra. The objective of this study was to investigate prediction accuracy for milk protein phenotypes when heterogeneous on-farm, genomic, and pedigree data were integrated with the spectra. To this end, we used the records of 966 Italian Brown Swiss cows with milk FTIR spectra, on-farm information, medium-density genetic markers, and pedigree data. True and total whey protein, five casein, and two whey protein traits were analyzed. Multiple kernel learning constructed from spectral and genomic (pedigree) relationship matrices and multilayer BayesB assigning separate priors for FTIR and markers were benchmarked against a baseline partial least squares (PLS) regression. Seven combinations of covariates were considered, and their predictive abilities were evaluated by repeated random sub-sampling and herd cross-validations (CV). Results: Adding on-farm effects such as herd, days in milk, and parity to the spectral data improved predictions compared to those obtained using the spectra alone. Integrating genomics and/or the top three markers with a large effect further enhanced the predictions. Pedigree data also improved prediction, but to a lesser extent than genomic data. Multiple kernel learning and multilayer BayesB increased predictive performance, whereas PLS did not. Overall, multilayer BayesB provided better predictions than multiple kernel learning, and lower prediction performance was observed in herd CV compared to repeated random sub-sampling CV. Conclusions: Integration of genomic information with milk FTIR spectra can enhance milk protein trait predictions by 25% and 7% on average for repeated random sub-sampling and herd CV, respectively. Multiple kernel learning and multilayer BayesB outperformed PLS when used to integrate heterogeneous data for phenotypic predictions.
Affiliation(s)
- Toshimi Baba
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
- Sara Pegolo
  - Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, Viale dell'Università 16, 35020, Legnaro, Italy
- Lucio F M Mota
  - Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, Viale dell'Università 16, 35020, Legnaro, Italy
- Francisco Peñagaricano
  - Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Giovanni Bittante
  - Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, Viale dell'Università 16, 35020, Legnaro, Italy
- Alessio Cecchinato
  - Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, Viale dell'Università 16, 35020, Legnaro, Italy
- Gota Morota
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
  - Center for Advanced Innovation in Agriculture, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
8
Akbarzadeh M, Dehkordi SR, Roudbar MA, Sargolzaei M, Guity K, Sedaghati-Khayat B, Riahi P, Azizi F, Daneshpour MS. GWAS findings improved genomic prediction accuracy of lipid profile traits: Tehran Cardiometabolic Genetic Study. Sci Rep 2021; 11:5780. [PMID: 33707626 PMCID: PMC7952573 DOI: 10.1038/s41598-021-85203-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/05/2020] [Accepted: 02/26/2021] [Indexed: 12/15/2022]
Abstract
In recent decades, ongoing GWAS findings have led to novel therapeutic developments, whole-genome risk prediction in particular. Here, we propose a method that integrates the traditional genomic best linear unbiased prediction (gBLUP) approach with GWAS information to boost genetic prediction accuracy and gene-based heritability estimation. This study was conducted in the framework of the Tehran Cardio-metabolic Genetic Study (TCGS), containing 14,827 individuals and 649,932 SNP markers. Five SNP subsets were selected based on GWAS results: the top 1%, 5%, 10%, and 50% most significant SNPs, and SNPs reported as associated in previous studies. Furthermore, for each of the five subsets we randomly selected a subset of the same size. Prediction accuracy for lipid profile traits was assessed with the gBLUP method using tenfold cross-validation repeated 10 times. Our results revealed that genetic prediction based on SNP subsets selected from the dataset outperformed prediction based on previously reported SNPs. The selected SNP subsets gave more precise predictions than the full SNP set and much more precise predictions than randomly selected SNPs. Also, the common SNPs that captured the most prediction accuracy in the selected sets captured the highest gene-based heritability. Notably, however, a small number of SNPs obtained from GWAS results can capture a substantial proportion of variance and prediction accuracy.
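The subset construction described in this abstract (top fractions of SNPs by GWAS significance, each paired with a random subset of the same size) can be sketched as follows. This is an illustration under assumptions, not the study's pipeline: the SNP identifiers and p-values are invented, and the helper names `top_fraction` and `matched_random_subset` are hypothetical.

```python
import random


def top_fraction(pvalues, fraction):
    """Return the SNP ids with the smallest p-values, keeping `fraction`
    of all SNPs (always at least one)."""
    k = max(1, round(len(pvalues) * fraction))
    ranked = sorted(pvalues, key=pvalues.get)  # most significant first
    return ranked[:k]


def matched_random_subset(pvalues, size, seed=0):
    """Draw a random SNP subset of the given size as a size-matched control."""
    rng = random.Random(seed)
    return rng.sample(sorted(pvalues), size)


# Invented GWAS p-values for ten placeholder SNPs.
pvals = {
    "snp01": 0.20, "snp02": 3e-4, "snp03": 5e-9, "snp04": 0.71, "snp05": 0.04,
    "snp06": 1e-6, "snp07": 0.33, "snp08": 0.012, "snp09": 0.90, "snp10": 2e-5,
}

for fraction in (0.10, 0.50):
    selected = top_fraction(pvals, fraction)
    control = matched_random_subset(pvals, len(selected))
    print(fraction, selected, control)
```

Each selected subset and its size-matched control would then be passed to the same prediction model, so that any gain can be attributed to the selection rather than to the number of markers.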
Affiliation(s)
- Mahdi Akbarzadeh
  - Cellular and Molecular Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, POBox: 19195-4763, Tehran, Iran
- Saeid Rasekhi Dehkordi
  - Cellular and Molecular Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, POBox: 19195-4763, Tehran, Iran
- Mahmoud Amiri Roudbar
  - Department of Animal Science, Safiabad-Dezful Agricultural and Natural Resources Research and Education Center, Agricultural Research, Education & Extension Organization (AREEO), Dezful, Iran
- Mehdi Sargolzaei
  - Department of Pathobiology, Ontario Veterinary College, University of Guelph, Guelph, Canada
  - Select Sires Inc., Plain City, USA
- Kamran Guity
  - Cellular and Molecular Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, POBox: 19195-4763, Tehran, Iran
- Bahareh Sedaghati-Khayat
  - Cellular and Molecular Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, POBox: 19195-4763, Tehran, Iran
- Parisa Riahi
  - Cellular and Molecular Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, POBox: 19195-4763, Tehran, Iran
- Fereidoun Azizi
  - Endocrine Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Maryam S Daneshpour
  - Cellular and Molecular Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, POBox: 19195-4763, Tehran, Iran
9
Cai J, Xu Y, Zhang W, Ding S, Sun Y, Lyu J, Duan M, Liu S, Huang L, Zhou F. A comprehensive comparison of residue-level methylation levels with the regression-based gene-level methylation estimations by ReGear. Brief Bioinform 2020; 22:5921981. [PMID: 33048108 DOI: 10.1093/bib/bbaa253] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/11/2020] [Revised: 08/10/2020] [Accepted: 09/08/2020] [Indexed: 02/07/2023]
Abstract
MOTIVATION: DNA methylation is a biological process that affects gene function without changing the underlying DNA sequence. The DNA methylation machinery usually attaches methyl groups to specific cytosine residues, which modifies the chromatin architecture. Such modifications in promoter regions can inactivate some tumor-suppressor genes. DNA methylation within the coding region may significantly reduce transcription elongation efficiency. Gene function may thus be tuned through the methylation of some cytosines. METHODS: This study hypothesizes that the overall methylation level across a gene may be better associated with sample labels, such as diseases, than the methylation of individual cytosines. The gene methylation level is formulated as a regression model using the methylation levels of all the cytosines within the gene. A comprehensive evaluation of various feature selection and classification algorithms is carried out on the gene-level and residue-level methylation levels. RESULTS: A comprehensive evaluation was conducted to compare the gene and cytosine methylation levels in terms of their associations with the sample labels and their classification performance. Unsupervised clustering was also improved using the gene methylation levels. Some genes demonstrated statistically significant associations with the class label even when no residue-level methylation features had statistically significant associations with the class label. In summary, the trained gene methylation levels improved various methylome-based machine learning models. Both the methodological development of regression algorithms and the experimental validation of gene-level methylation biomarkers are worthy of further investigation in future studies. The source code, example data files and manual are available at http://www.healthinformaticslab.org/supp/.