51
Exploiting Single-Cell Quantitative Data to Map Genetic Variants Having Probabilistic Effects. PLoS Genet 2016; 12:e1006213. [PMID: 27479122 PMCID: PMC4968810 DOI: 10.1371/journal.pgen.1006213] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 02/12/2016] [Accepted: 07/02/2016] [Indexed: 01/11/2023] Open
Abstract
Despite the recent progress in sequencing technologies, genome-wide association studies (GWAS) remain limited by a statistical-power issue: many polymorphisms contribute little to common trait variation and therefore escape detection. The small contribution sometimes corresponds to incomplete penetrance, which may result from probabilistic effects on molecular regulations. In such cases, genetic mapping may benefit from the wealth of data produced by single-cell technologies. We present here the development of a novel genetic mapping method that scans genomes for single-cell Probabilistic Trait Loci that modify the statistical properties of cellular-level quantitative traits. Phenotypic values are acquired on thousands of individual cells, and genetic association is obtained from a multivariate analysis of a matrix of Kantorovich distances. No prior assumption is required on the mode of action of the genetic loci involved and, by exploiting all single-cell values, the method can reveal non-deterministic effects. Using both simulations and yeast experimental datasets, we show that it can detect linkages that are missed by classical genetic mapping. A probabilistic effect of a single SNP on cell shape was detected and validated. The method also detected a novel locus associated with elevated gene expression noise of the yeast galactose regulon. Our results illustrate how single-cell technologies can be exploited to improve the genetic dissection of certain common traits. The method is available as an open source R package called ptlmapper.
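To make the "matrix of Kantorovich distances" concrete, here is a minimal Python sketch of how such a matrix could be built from per-strain single-cell measurements. It is not the ptlmapper API (that package is written in R), and the function names `kantorovich_distance` and `distance_matrix` and the toy values are assumptions made for illustration. The sketch covers only the one-dimensional distance (also known as the Wasserstein-1 distance) between empirical distributions; the downstream multivariate analysis and genetic association test are not shown.

```python
from bisect import bisect_right


def kantorovich_distance(xs, ys):
    """1-D Kantorovich (Wasserstein-1) distance between two empirical samples.

    Computed as the integral over t of |F_x(t) - F_y(t)|, where F_x and F_y
    are the empirical cumulative distribution functions of the two samples.
    """
    xs, ys = sorted(xs), sorted(ys)
    grid = sorted(set(xs) | set(ys))  # points where either CDF can jump
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        fx = bisect_right(xs, left) / len(xs)  # F_x on [left, right)
        fy = bisect_right(ys, left) / len(ys)  # F_y on [left, right)
        total += abs(fx - fy) * (right - left)
    return total


def distance_matrix(samples):
    """Symmetric matrix of pairwise Kantorovich distances between samples."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = kantorovich_distance(samples[i], samples[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical single-cell trait values for three strains (toy numbers).
narrow = [4.0, 5.0, 6.0]   # mean 5, low cell-to-cell variability
wide = [3.0, 5.0, 7.0]     # same mean 5, higher variability
shifted = [5.0, 6.0, 7.0]  # same spread as `narrow`, mean shifted by 1

matrix = distance_matrix([narrow, wide, shifted])
```

In this toy example `narrow` and `wide` have identical means, so a comparison of strain averages sees no difference between them, while their Kantorovich distance is non-zero. That is the kind of distribution-level difference the abstract says classical mapping can miss.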
52
Märtens K, Hallin J, Warringer J, Liti G, Parts L. Predicting quantitative traits from genome and phenome with near perfect accuracy. Nat Commun 2016; 7:11512. [PMID: 27160605 PMCID: PMC4866306 DOI: 10.1038/ncomms11512] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 10/30/2015] [Accepted: 04/01/2016] [Indexed: 12/20/2022] Open
Abstract
In spite of decades of linkage and association studies and its potential impact on human health, reliable prediction of an individual's risk for heritable disease remains difficult. Large numbers of mapped loci do not explain substantial fractions of heritable variation, leaving an open question of whether accurate complex trait predictions can be achieved in practice. Here, we use a genome sequenced population of ∼7,000 yeast strains of high but varying relatedness, and predict growth traits from family information, effects of segregating genetic variants and growth in other environments with an average coefficient of determination R² of 0.91. This accuracy exceeds narrow-sense heritability, approaches limits imposed by measurement repeatability and is higher than achieved with a single assay in the laboratory. Our results prove that very accurate prediction of complex traits is possible, and suggest that additional data from families rather than reference cohorts may be more useful for this purpose.
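For readers unfamiliar with the accuracy measure quoted above, the following Python sketch computes a coefficient of determination using the common definition of one minus the ratio of residual to total sum of squares. The abstract does not say which variant the authors used (for instance, a cross-validated estimate or a squared correlation), so this definition, the function name `r_squared`, and the toy numbers are assumptions for illustration only.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error left unexplained by predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviation of observations from their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical growth measurements and predictions (toy numbers).
observed_growth = [1.0, 2.0, 3.0]
perfect = r_squared(observed_growth, [1.0, 2.0, 3.0])    # every value matched
mean_only = r_squared(observed_growth, [2.0, 2.0, 2.0])  # always predict the mean
```

Under this definition a model that always predicts the observed mean scores 0 and a perfect predictor scores 1, so the reported average of 0.91 means roughly nine tenths of the trait variance is accounted for by the predictions.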
Affiliation(s)
- Kaspar Märtens
  - Institute of Computer Science, University of Tartu, Tartu 50409, Estonia
- Johan Hallin
  - Institute for Research on Cancer and Aging, University of Sophia Antipolis, Nice 02 06107, France
- Jonas Warringer
  - Department of Chemistry and Molecular Biology, Gothenburg University, Gothenburg 40530, Sweden
  - Centre for Integrative Genetics (CIGENE), Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Ås N-1432, Norway
- Gianni Liti
  - Institute for Research on Cancer and Aging, University of Sophia Antipolis, Nice 02 06107, France
- Leopold Parts
  - Institute of Computer Science, University of Tartu, Tartu 50409, Estonia
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB101SA, UK
53
Wang J, Gamazon ER, Pierce BL, Stranger BE, Im HK, Gibbons RD, Cox NJ, Nicolae DL, Chen LS. Imputing Gene Expression in Uncollected Tissues Within and Beyond GTEx. Am J Hum Genet 2016; 98:697-708. [PMID: 27040689 PMCID: PMC4833292 DOI: 10.1016/j.ajhg.2016.02.020] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 09/03/2015] [Accepted: 02/22/2016] [Indexed: 01/14/2023] Open
Abstract
Gene expression and its regulation can vary substantially across tissue types. In order to generate knowledge about gene expression in human tissues, the Genotype-Tissue Expression (GTEx) program has collected transcriptome data in a wide variety of tissue types from post-mortem donors. However, many tissue types are difficult to access and are not collected in every GTEx individual. Furthermore, in non-GTEx studies, the accessibility of certain tissue types greatly limits the feasibility and scale of studies of multi-tissue expression. In this work, we developed multi-tissue imputation methods to impute gene expression in uncollected or inaccessible tissues. Via simulation studies, we showed that the proposed methods outperform existing imputation methods in multi-tissue expression imputation and that incorporating imputed expression data can improve power to detect phenotype-expression correlations. By analyzing data from nine selected tissue types in the GTEx pilot project, we demonstrated that harnessing expression quantitative trait loci (eQTLs) and tissue-tissue expression-level correlations can aid imputation of transcriptome data from uncollected GTEx tissues. More importantly, we showed that by using GTEx data as a reference, one can impute expression levels in inaccessible tissues in non-GTEx expression studies.
Affiliation(s)
- Jiebiao Wang
  - Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
- Eric R Gamazon
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University and Vanderbilt Genetics Institute, Nashville, TN 37232, USA
  - Academic Medical Center, University of Amsterdam, Amsterdam 1105 AZ, the Netherlands
- Brandon L Pierce
  - Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
- Barbara E Stranger
  - Section of Genetic Medicine, University of Chicago, Chicago, IL 60637, USA
  - Institute for Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA
- Hae Kyung Im
  - Section of Genetic Medicine, University of Chicago, Chicago, IL 60637, USA
- Robert D Gibbons
  - Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
- Nancy J Cox
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University and Vanderbilt Genetics Institute, Nashville, TN 37232, USA
- Dan L Nicolae
  - Section of Genetic Medicine, University of Chicago, Chicago, IL 60637, USA
  - Department of Statistics, University of Chicago, Chicago, IL 60637, USA
- Lin S Chen
  - Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
54
Wellenreuther M, Hansson B. Detecting Polygenic Evolution: Problems, Pitfalls, and Promises. Trends Genet 2016; 32:155-164. [DOI: 10.1016/j.tig.2015.12.004] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Received: 04/26/2015] [Revised: 12/21/2015] [Accepted: 12/22/2015] [Indexed: 10/22/2022]
55
Rakitsch B, Stegle O. Modelling local gene networks increases power to detect trans-acting genetic effects on gene expression. Genome Biol 2016; 17:33. [PMID: 26911988 PMCID: PMC4765046 DOI: 10.1186/s13059-016-0895-2] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 09/19/2015] [Accepted: 02/09/2016] [Indexed: 01/05/2023] Open
Abstract
Expression quantitative trait loci (eQTL) mapping is a widely used tool to study the genetics of gene expression. Confounding factors and the burden of multiple testing limit the ability to map distal trans eQTLs, which is important to understand downstream genetic effects on genes and pathways. We propose a two-stage linear mixed model that first learns local directed gene-regulatory networks to then condition on the expression levels of selected genes. We show that this covariate selection approach controls for confounding factors and regulatory context, thereby increasing eQTL detection power and improving the consistency between studies. GNet-LMM is available at: https://github.com/PMBio/GNetLMM.
Affiliation(s)
- Barbara Rakitsch
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Oliver Stegle
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
56
Young DL, Fields S. The role of functional data in interpreting the effects of genetic variation. Mol Biol Cell 2015; 26:3904-8. [PMID: 26543197 PMCID: PMC4710221 DOI: 10.1091/mbc.e15-03-0153] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 07/23/2015] [Revised: 08/27/2015] [Accepted: 08/28/2015] [Indexed: 12/30/2022] Open
Abstract
Progress in DNA-sequencing technologies has provided a catalogue of millions of DNA variants in the human population, but characterization of the functional effects of these variants has lagged far behind. For example, sequencing of tumor samples is driving an urgent need to classify whether or not mutations seen in cancers affect disease progression or treatment effectiveness or instead are benign. Furthermore, mutations can interact with genetic background and with environmental effects. A new approach, termed deep mutational scanning, has enabled the quantitative assessment of the effects of thousands of mutations in a protein. However, this type of experiment is carried out in model organisms, tissue culture, or in vitro; typically addresses only a single biochemical function of a protein; and is generally performed under a single condition. The current challenge lies in using these functional data to generate useful models for the phenotypic consequences of genetic variation in humans.
Affiliation(s)
- David L Young
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195
- Stanley Fields
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195
  - Department of Medicine, University of Washington, Seattle, WA 98195
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195