1. García-Nieto PE, Wang B, Fraser HB. Transcriptome diversity is a systematic source of variation in RNA-sequencing data. PLoS Comput Biol 2022; 18:e1009939. [PMID: 35324895 PMCID: PMC8982896 DOI: 10.1371/journal.pcbi.1009939] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/21/2021] [Revised: 04/05/2022] [Accepted: 02/18/2022] [Indexed: 01/02/2023] Open
Abstract
RNA sequencing has been widely used as an essential tool to probe gene expression. While standard practices have been established to analyze RNA-seq data, it is still challenging to interpret and remove artifactual signals. Several biological and technical factors such as sex, age, batches, and sequencing technology have been found to bias these estimates. Probabilistic estimation of expression residuals (PEER), which infers broad variance components in gene expression measurements, has been used to account for some systematic effects, but it has remained challenging to interpret these PEER factors. Here we show that transcriptome diversity, a simple metric based on Shannon entropy, explains a large portion of variability in gene expression and is the strongest known factor encoded in PEER factors. We then show that transcriptome diversity has significant associations with multiple technical and biological variables across diverse organisms and datasets. In sum, transcriptome diversity provides a simple explanation for a major source of variation in both gene expression estimates and PEER covariates.
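The abstract describes transcriptome diversity only as a metric based on Shannon entropy. As a rough illustration, the sketch below computes the Shannon entropy of one sample's per-gene expression proportions. The function names, the use of natural logarithms, and the optional normalization by the log of the number of genes are assumptions made for this example and may differ from the exact definition used in the paper.

```python
import math


def shannon_diversity(expression):
    """Shannon entropy (in nats) of per-gene expression proportions.

    `expression` is a sequence of non-negative expression values
    (for example counts or TPM) for a single sample. Hypothetical
    helper for illustration, not the authors' implementation.
    """
    total = sum(expression)
    if total <= 0:
        raise ValueError("expression values must sum to a positive number")
    entropy = 0.0
    for value in expression:
        if value > 0:  # genes with zero expression contribute nothing
            p = value / total
            entropy -= p * math.log(p)
    return entropy


def normalized_diversity(expression):
    """Entropy divided by its maximum, log(number of genes).

    This scaling to the range [0, 1] is an assumption of this sketch.
    """
    n_genes = len(expression)
    if n_genes < 2:
        return 0.0
    return shannon_diversity(expression) / math.log(n_genes)
```

Under this sketch, a sample whose reads are spread evenly across genes scores the maximum, while a sample dominated by a single gene scores close to zero.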
Affiliation(s)
- Pablo E. García-Nieto
  - Department of Biology, Stanford University, Stanford, California, United States of America
- Ban Wang
  - Department of Biology, Stanford University, Stanford, California, United States of America
- Hunter B. Fraser
  - Department of Biology, Stanford University, Stanford, California, United States of America
2. Verifying explainability of a deep learning tissue classifier trained on RNA-seq data. Sci Rep 2021; 11:2641. [PMID: 33514769 PMCID: PMC7846764 DOI: 10.1038/s41598-021-81773-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/19/2020] [Accepted: 01/11/2021] [Indexed: 12/16/2022] Open
Abstract
For complex machine learning (ML) algorithms to gain widespread acceptance in decision making, we must be able to identify the features driving the predictions. Explainability models allow transparency of ML algorithms; however, their reliability within high-dimensional data is unclear. To test the reliability of the explainability model SHapley Additive exPlanations (SHAP), we developed a convolutional neural network to predict tissue classification from Genotype-Tissue Expression (GTEx) RNA-seq data representing 16,651 samples from 47 tissues. Our classifier achieved an average F1 score of 96.1% on held-out GTEx samples. Using SHAP values, we identified the 2423 most discriminatory genes, of which 98.6% were also identified by differential expression analysis across all tissues. The SHAP genes reflected expected biological processes involved in tissue differentiation and function. Moreover, SHAP genes clustered tissue types with superior performance when compared to all genes, genes detected by differential expression analysis, or random genes. We demonstrate the utility and reliability of SHAP to explain a deep learning model and highlight the strengths of applying ML to transcriptome data.
3. Ehrlich KC, Baribault C, Ehrlich M. Epigenetics of Muscle- and Brain-Specific Expression of KLHL Family Genes. Int J Mol Sci 2020; 21:E8394. [PMID: 33182325 PMCID: PMC7672584 DOI: 10.3390/ijms21218394] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/16/2020] [Revised: 11/02/2020] [Accepted: 11/06/2020] [Indexed: 02/07/2023] Open
Abstract
KLHL and the related KBTBD genes encode components of the Cullin-E3 ubiquitin ligase complex and typically target tissue-specific proteins for degradation, thereby affecting differentiation, homeostasis, metabolism, cell signaling, and the oxidative stress response. Despite their importance in cell function and disease (especially KLHL40, KLHL41, KBTBD13, KEAP1, and ENC1), previous studies of epigenetic factors that affect transcription were predominantly limited to promoter DNA methylation. Using diverse tissue and cell culture whole-genome profiles, we examined 17 KLHL or KBTBD genes preferentially expressed in skeletal muscle or brain to identify tissue-specific enhancer and promoter chromatin, open chromatin (DNaseI hypersensitivity), and DNA hypomethylation. Sixteen of the 17 genes displayed muscle- or brain-specific enhancer chromatin in their gene bodies, and most exhibited specific intergenic enhancer chromatin as well. Seven genes were embedded in super-enhancers (particularly strong, tissue-specific clusters of enhancers). The enhancer chromatin regions typically displayed foci of DNA hypomethylation at peaks of open chromatin. In addition, we found evidence for an intragenic enhancer in one gene upregulating expression of its neighboring gene, specifically for the KLHL40/HHATL and KLHL38/FBXO32 gene pairs. Many KLHL/KBTBD genes had tissue-specific promoter chromatin at their 5' ends, but surprisingly, two (KBTBD11 and KLHL31) had constitutively unmethylated promoter chromatin in their 3' exons that overlaps a retrotransposed KLHL gene. Our findings demonstrate the importance of expanding epigenetic analyses beyond the 5' ends of genes in studies of normal and abnormal gene regulation.
Affiliation(s)
- Kenneth C. Ehrlich
  - Center for Biomedical Informatics and Genomics, Tulane University Health Sciences Center, New Orleans, LA 70112, USA
- Carl Baribault
  - Center for Research and Scientific Computing (CRSC), Tulane University Information Technology, Tulane University, New Orleans, LA 70112, USA
- Melanie Ehrlich
  - Center for Biomedical Informatics and Genomics, Tulane Cancer Center, Hayward Genetics Program, Tulane University Health Sciences Center, New Orleans, LA 70112, USA
4. TGFB1-Mediated Gliosis in Multiple Sclerosis Spinal Cords Is Favored by the Regionalized Expression of HOXA5 and the Age-Dependent Decline in Androgen Receptor Ligands. Int J Mol Sci 2019; 20:ijms20235934. [PMID: 31779094 PMCID: PMC6928867 DOI: 10.3390/ijms20235934] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/11/2019] [Revised: 11/18/2019] [Accepted: 11/22/2019] [Indexed: 02/07/2023] Open
Abstract
In multiple sclerosis (MS) patients with a progressive form of the disease, spinal cord (SC) functions slowly deteriorate beyond age 40. We previously showed that in the SC of these patients, large areas of incomplete demyelination extend some distance away from plaque borders and are characterized by a unique progliotic TGFB1 (Transforming Growth Factor Beta 1) genomic signature. Here, we attempted to determine whether region- and age-specific physiological parameters could promote the progression of SC periplaques in MS patients beyond age 40. An analysis of transcriptomics databases showed that, under physiological conditions, a set of 10 homeobox (HOX) genes is highly significantly overexpressed in the human SC as compared to distinct brain regions. Among these HOX genes, a survey of the human proteome showed that only HOXA5 encodes a protein which interacts with a member of the TGF-beta signaling pathway, namely SMAD1 (SMAD family member 1). Moreover, HOXA5 was previously found to promote the TGF-beta pathway. Interestingly, SMAD1 is also a protein partner of the androgen receptor (AR), and an unsupervised analysis of gene ontology terms indicates that the AR pathway antagonizes the TGF-beta/SMAD pathway. Retrieval of promoter analysis data further confirmed that AR negatively regulates the transcription of several members of the TGF-beta/SMAD pathway. On this basis, we propose that in progressive MS patients, the physiological SC overexpression of HOXA5 combined with the age-dependent decline in AR ligands may favor the slow progression of TGFB1-mediated gliosis. Potential therapeutic implications are discussed.
5. Ramachandran P, Dobie R, Wilson-Kanamori JR, Dora EF, Henderson BEP, Luu NT, Portman JR, Matchett KP, Brice M, Marwick JA, Taylor RS, Efremova M, Vento-Tormo R, Carragher NO, Kendall TJ, Fallowfield JA, Harrison EM, Mole DJ, Wigmore SJ, Newsome PN, Weston CJ, Iredale JP, Tacke F, Pollard JW, Ponting CP, Marioni JC, Teichmann SA, Henderson NC. Resolving the fibrotic niche of human liver cirrhosis at single-cell level. Nature 2019; 575:512-518. [PMID: 31597160 PMCID: PMC6876711 DOI: 10.1038/s41586-019-1631-3] [Citation(s) in RCA: 894] [Impact Index Per Article: 178.8] [Received: 09/04/2018] [Accepted: 09/04/2019] [Indexed: 12/13/2022]
Abstract
Liver cirrhosis is a major cause of death worldwide and is characterized by extensive fibrosis. There are currently no effective antifibrotic therapies available. To obtain a better understanding of the cellular and molecular mechanisms involved in disease pathogenesis and enable the discovery of therapeutic targets, here we profile the transcriptomes of more than 100,000 single human cells, yielding molecular definitions for non-parenchymal cell types that are found in healthy and cirrhotic human liver. We identify a scar-associated TREM2+CD9+ subpopulation of macrophages, which expands in liver fibrosis, differentiates from circulating monocytes and is pro-fibrogenic. We also define ACKR1+ and PLVAP+ endothelial cells that expand in cirrhosis, are topographically restricted to the fibrotic niche and enhance the transmigration of leucocytes. Multi-lineage modelling of ligand and receptor interactions between the scar-associated macrophages, endothelial cells and PDGFRα+ collagen-producing mesenchymal cells reveals intra-scar activity of several pro-fibrogenic pathways including TNFRSF12A, PDGFR and NOTCH signalling. Our work dissects unanticipated aspects of the cellular and molecular basis of human organ fibrosis at a single-cell level, and provides a conceptual framework for the discovery of rational therapeutic targets in liver cirrhosis.
Affiliation(s)
- P Ramachandran
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
- R Dobie
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
- J R Wilson-Kanamori
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
- E F Dora
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
- B E P Henderson
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
- N T Luu
  - NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust and University of Birmingham, Birmingham, UK
  - Institute of Immunology and Immunotherapy, University of Birmingham, Birmingham, UK
- J R Portman
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
- K P Matchett
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
- M Brice
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
- J A Marwick
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
  - Cancer Research UK Edinburgh Centre, MRC Institute of Genetics and Molecular Medicine at the University of Edinburgh, Edinburgh, UK
- R S Taylor
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
- M Efremova
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- R Vento-Tormo
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- N O Carragher
  - Cancer Research UK Edinburgh Centre, MRC Institute of Genetics and Molecular Medicine at the University of Edinburgh, Edinburgh, UK
- T J Kendall
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
  - Division of Pathology, University of Edinburgh, Edinburgh, UK
- J A Fallowfield
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
- E M Harrison
  - Clinical Surgery, University of Edinburgh, Royal Infirmary of Edinburgh, Edinburgh, UK
- D J Mole
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
  - Clinical Surgery, University of Edinburgh, Royal Infirmary of Edinburgh, Edinburgh, UK
- S J Wigmore
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
  - Clinical Surgery, University of Edinburgh, Royal Infirmary of Edinburgh, Edinburgh, UK
- P N Newsome
  - NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust and University of Birmingham, Birmingham, UK
  - Institute of Immunology and Immunotherapy, University of Birmingham, Birmingham, UK
- C J Weston
  - NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust and University of Birmingham, Birmingham, UK
  - Institute of Immunology and Immunotherapy, University of Birmingham, Birmingham, UK
- J P Iredale
  - Office of the Vice Chancellor, Beacon House and National Institute for Health Research, Biomedical Research Centre, Bristol, UK
- F Tacke
  - Department of Hepatology and Gastroenterology, Charité University Medical Center, Berlin, Germany
- J W Pollard
  - MRC Centre for Reproductive Health, The Queen's Medical Research Institute, University of Edinburgh, Edinburgh, UK
  - Department of Developmental and Molecular Biology, Albert Einstein College of Medicine, New York, NY, USA
- C P Ponting
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine at the University of Edinburgh, Edinburgh, UK
- J C Marioni
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
- S A Teichmann
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
  - Theory of Condensed Matter Group, The Cavendish Laboratory, University of Cambridge, Cambridge, UK
- N C Henderson
  - University of Edinburgh Centre for Inflammation Research, The Queen's Medical Research Institute, Edinburgh BioQuarter, Edinburgh, UK
6. Schrag TA, Westhues M, Schipprack W, Seifert F, Thiemann A, Scholten S, Melchinger AE. Beyond Genomic Prediction: Combining Different Types of omics Data Can Improve Prediction of Hybrid Performance in Maize. Genetics 2018; 208:1373-1385. [PMID: 29363551 PMCID: PMC5887136 DOI: 10.1534/genetics.117.300374] [Citation(s) in RCA: 97] [Impact Index Per Article: 16.2] [Received: 10/09/2017] [Accepted: 01/20/2018] [Indexed: 01/28/2023] Open
Abstract
The ability to predict the agronomic performance of single-crosses with high precision is essential for selecting superior candidates for hybrid breeding. With recent technological advances, thousands of new parent lines and, consequently, millions of new hybrid combinations are possible in each breeding cycle, yet only a few hundred can be produced and phenotyped in multi-environment yield trials. Well-established prediction approaches such as best linear unbiased prediction (BLUP) using pedigree data and whole-genome prediction using genomic data are limited in capturing epistasis and interactions occurring within and among downstream biological strata such as transcriptome and metabolome. Because mRNA and small RNA (sRNA) sequences are involved in transcriptional, translational and post-translational processes, we expect them to provide information influencing several biological strata. However, using sRNA data of parent lines to predict hybrid performance has not yet been addressed. Here, we gathered genomic, transcriptomic (mRNA and sRNA) and metabolomic data of parent lines to evaluate the ability of the data to predict the performance of untested hybrids for important agronomic traits in grain maize. We found a considerable interaction for predictive ability between predictor and trait, with mRNA data being a superior predictor for grain yield and genomic data for grain dry matter content, while sRNA performed relatively poorly for both traits. Combining mRNA and genomic data as predictors resulted in high predictive abilities across both traits, and combining other predictors improved prediction over that of the individual predictors alone. We conclude that downstream "omics" can complement genomics for hybrid prediction and thereby contribute to more efficient selection of hybrid candidates.
Affiliation(s)
- Tobias A Schrag
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599 Stuttgart, Germany
- Matthias Westhues
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599 Stuttgart, Germany
- Wolfgang Schipprack
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599 Stuttgart, Germany
- Felix Seifert
  - Biocenter Klein Flottbek, Developmental Biology and Biotechnology, University of Hamburg, 22609 Hamburg, Germany
- Alexander Thiemann
  - Biocenter Klein Flottbek, Developmental Biology and Biotechnology, University of Hamburg, 22609 Hamburg, Germany
- Stefan Scholten
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599 Stuttgart, Germany
- Albrecht E Melchinger
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599 Stuttgart, Germany
7. Ferreira PG, Muñoz-Aguirre M, Reverter F, Sá Godinho CP, Sousa A, Amadoz A, Sodaei R, Hidalgo MR, Pervouchine D, Carbonell-Caballero J, Nurtdinov R, Breschi A, Amador R, Oliveira P, Çubuk C, Curado J, Aguet F, Oliveira C, Dopazo J, Sammeth M, Ardlie KG, Guigó R. The effects of death and post-mortem cold ischemia on human tissue transcriptomes. Nat Commun 2018; 9:490. [PMID: 29440659 PMCID: PMC5811508 DOI: 10.1038/s41467-017-02772-x] [Citation(s) in RCA: 156] [Impact Index Per Article: 26.0] [Received: 07/09/2017] [Accepted: 12/22/2017] [Indexed: 12/05/2022] Open
Abstract
Post-mortem tissue samples are a key resource for investigating patterns of gene expression. However, the processes triggered by death and the post-mortem interval (PMI) can significantly alter physiologically normal RNA levels. We investigate the impact of PMI on gene expression using data from multiple tissues of post-mortem donors obtained from the GTEx project. We find that many genes change expression over relatively short PMIs in a tissue-specific manner, but this potentially confounding effect in a biological analysis can be minimized by taking into account appropriate covariates. By comparing ante- and post-mortem blood samples, we identify the cascade of transcriptional events triggered by death of the organism. These events do not appear to simply reflect stochastic variation resulting from mRNA degradation, but active and ongoing regulation of transcription. Finally, we develop a model to predict the time since death from the analysis of the transcriptome of a few readily accessible tissues. RNA levels in post-mortem tissue can differ greatly from those before death. Studying the effect of post-mortem interval on the transcriptome in 36 human tissues, Ferreira et al. find that the response to death is largely tissue-specific and develop a model to predict time since death based on RNA data.
Affiliation(s)
- Pedro G Ferreira
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Rua Alfredo Allen, 208, Porto, 4200-135, Portugal
  - Institute of Molecular Pathology and Immunology, University of Porto, Rua Dr. Roberto Frias s/n, Porto, 4200-625, Portugal
- Manuel Muñoz-Aguirre
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, Barcelona, E-08003, Catalonia, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, E-08003, Catalonia, Spain
  - Institut Hospital del Mar d'Investigacions Mediques (IMIM), Barcelona, E-08003, Catalonia, Spain
  - Departament d'Estadística i Investigació Operativa, Universitat Politècnica de Catalunya, Barcelona, E-08034, Catalonia, Spain
- Ferran Reverter
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, Barcelona, E-08003, Catalonia, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, E-08003, Catalonia, Spain
  - Institut Hospital del Mar d'Investigacions Mediques (IMIM), Barcelona, E-08003, Catalonia, Spain
  - Universitat de Barcelona, Barcelona, E-08028, Catalonia, Spain
- Caio P Sá Godinho
  - Institute of Biophysics Carlos Chagas Filho (IBCCF), Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro, 21941-902, Brazil
- Abel Sousa
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Rua Alfredo Allen, 208, Porto, 4200-135, Portugal
  - Institute of Molecular Pathology and Immunology, University of Porto, Rua Dr. Roberto Frias s/n, Porto, 4200-625, Portugal
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, CB10 1SD, UK
- Alicia Amadoz
  - Department of Bioinformatics, Igenomix S.A, Valencia, 46980, Spain
- Reza Sodaei
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, Barcelona, E-08003, Catalonia, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, E-08003, Catalonia, Spain
  - Institut Hospital del Mar d'Investigacions Mediques (IMIM), Barcelona, E-08003, Catalonia, Spain
- Marta R Hidalgo
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocio, Sevilla, 41013, Spain
- Dmitri Pervouchine
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, Barcelona, E-08003, Catalonia, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, E-08003, Catalonia, Spain
  - Institut Hospital del Mar d'Investigacions Mediques (IMIM), Barcelona, E-08003, Catalonia, Spain
  - Skolkovo Institute of Science and Technology, 100 Novaya Street, Skolkovo, Moscow Region, 143025, Russia
- Jose Carbonell-Caballero
  - Chromatin and Gene expression Lab, Gene Regulation, Stem Cells and Cancer Program, Centre de Regulació Genòmica (CRG), The Barcelona Institute of Science and Technology, PRBB, Barcelona, 08003, Spain
- Ramil Nurtdinov
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, Barcelona, E-08003, Catalonia, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, E-08003, Catalonia, Spain
  - Institut Hospital del Mar d'Investigacions Mediques (IMIM), Barcelona, E-08003, Catalonia, Spain
- Alessandra Breschi
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, Barcelona, E-08003, Catalonia, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, E-08003, Catalonia, Spain
  - Institut Hospital del Mar d'Investigacions Mediques (IMIM), Barcelona, E-08003, Catalonia, Spain
- Raziel Amador
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, Barcelona, E-08003, Catalonia, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, E-08003, Catalonia, Spain
  - Institut Hospital del Mar d'Investigacions Mediques (IMIM), Barcelona, E-08003, Catalonia, Spain
- Patrícia Oliveira
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Rua Alfredo Allen, 208, Porto, 4200-135, Portugal
  - Institute of Molecular Pathology and Immunology, University of Porto, Rua Dr. Roberto Frias s/n, Porto, 4200-625, Portugal
- Cankut Çubuk
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocio, Sevilla, 41013, Spain
- João Curado
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, Barcelona, E-08003, Catalonia, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, E-08003, Catalonia, Spain
  - Institut Hospital del Mar d'Investigacions Mediques (IMIM), Barcelona, E-08003, Catalonia, Spain
- François Aguet
  - The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Carla Oliveira
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Rua Alfredo Allen, 208, Porto, 4200-135, Portugal
  - Institute of Molecular Pathology and Immunology, University of Porto, Rua Dr. Roberto Frias s/n, Porto, 4200-625, Portugal
- Joaquin Dopazo
  - Department of Bioinformatics, Igenomix S.A, Valencia, 46980, Spain
  - Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocio, Sevilla, 41013, Spain
  - Functional Genomics Node (INB), FPS, Hospital Virgen del Rocio, Sevilla, 41013, Spain
  - Bioinformatics in Rare Diseases (BiER), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), FPS, Hospital Virgen del Rocio, Sevilla, 41013, Spain
- Michael Sammeth
  - Institute of Biophysics Carlos Chagas Filho (IBCCF), Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro, 21941-902, Brazil
- Kristin G Ardlie
  - The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Roderic Guigó
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, Barcelona, E-08003, Catalonia, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, E-08003, Catalonia, Spain
  - Institut Hospital del Mar d'Investigacions Mediques (IMIM), Barcelona, E-08003, Catalonia, Spain
8. Westhues M, Schrag TA, Heuer C, Thaller G, Utz HF, Schipprack W, Thiemann A, Seifert F, Ehret A, Schlereth A, Stitt M, Nikoloski Z, Willmitzer L, Schön CC, Scholten S, Melchinger AE. Omics-based hybrid prediction in maize. Theor Appl Genet 2017. [PMID: 28647896 DOI: 10.1007/s00122-017-2934-0] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Indexed: 05/02/2023]
Abstract
Complementing genomic data with other "omics" predictors can increase the probability of success for predicting the best hybrid combinations using complex agronomic traits. Accurate prediction of traits with complex genetic architecture is crucial for selecting superior candidates in animal and plant breeding and for guiding decisions in personalized medicine. Whole-genome prediction has revolutionized these areas but has inherent limitations in incorporating intricate epistatic interactions. Downstream "omics" data are expected to integrate interactions within and between different biological strata and provide the opportunity to improve trait prediction. Yet, predicting traits from parents to progeny has not been addressed by a combination of "omics" data. Here, we evaluate several "omics" predictors (genomic, transcriptomic and metabolic data) measured on parent lines at early developmental stages and demonstrate that the integration of transcriptomic with genomic data leads to higher success rates in the correct prediction of untested hybrid combinations in maize. Despite the high predictive ability of genomic data, transcriptomic data alone outperformed them and other predictors for the most complex heterotic trait, dry matter yield. An eQTL analysis revealed that transcriptomic data integrate genomic information from both adjacent and distant sites relative to the expressed genes. Together, these findings suggest that downstream predictors capture physiological epistasis that is transmitted from parents to their hybrid offspring. We conclude that the use of downstream "omics" data in prediction can exploit important information beyond structural genomics for leveraging the efficiency of hybrid breeding.
Affiliation(s)
- Matthias Westhues
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599, Stuttgart, Germany
- Tobias A Schrag
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599, Stuttgart, Germany
- Claas Heuer
  - Institute of Animal Breeding and Husbandry, Christian-Albrechts-University Kiel, 24098, Kiel, Germany
  - Inguran LLC dba STGenetics, 22575 SH6 South, Navasota, TX, 77868, USA
- Georg Thaller
  - Institute of Animal Breeding and Husbandry, Christian-Albrechts-University Kiel, 24098, Kiel, Germany
- H Friedrich Utz
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599, Stuttgart, Germany
- Wolfgang Schipprack
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599, Stuttgart, Germany
- Alexander Thiemann
  - Biocenter Klein Flottbek, Developmental Biology and Biotechnology, University of Hamburg, 22609, Hamburg, Germany
- Felix Seifert
  - Biocenter Klein Flottbek, Developmental Biology and Biotechnology, University of Hamburg, 22609, Hamburg, Germany
- Anita Ehret
  - Institute of Animal Breeding and Husbandry, Christian-Albrechts-University Kiel, 24098, Kiel, Germany
- Armin Schlereth
  - Max-Planck Institute of Molecular Plant Physiology, 14476, Potsdam, Germany
- Mark Stitt
  - Max-Planck Institute of Molecular Plant Physiology, 14476, Potsdam, Germany
- Zoran Nikoloski
  - Max-Planck Institute of Molecular Plant Physiology, 14476, Potsdam, Germany
- Lothar Willmitzer
  - Max-Planck Institute of Molecular Plant Physiology, 14476, Potsdam, Germany
- Chris C Schön
  - Plant Breeding, Technische Universität München, 85354, Freising, Germany
- Stefan Scholten
  - Biocenter Klein Flottbek, Developmental Biology and Biotechnology, University of Hamburg, 22609, Hamburg, Germany
- Albrecht E Melchinger
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599, Stuttgart, Germany