101
Detecting Rare Mutations with Heterogeneous Effects Using a Family-Based Genetic Random Field Method. Genetics 2018; 210:463-476. [PMID: 30104420 DOI: 10.1534/genetics.118.301266] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/04/2018] [Accepted: 07/29/2018] [Indexed: 01/19/2023] Open
Abstract
The genetic etiology of many complex diseases is highly heterogeneous. A complex disease can be caused by multiple mutations within the same gene or mutations in multiple genes at various genomic loci. Although these disease-susceptibility mutations can be collectively common in the population, they are often individually rare or even private to certain families. Family-based studies are powerful for detecting rare variants enriched in families, which is an important feature for sequencing studies due to the heterogeneous nature of rare variants. In addition, family designs can provide robust protection against population stratification. Nevertheless, statistical methods for analyzing family-based sequencing data are underdeveloped, especially those accounting for heterogeneous etiology of complex diseases. In this article, we introduce a random field framework for detecting gene-phenotype associations in family-based sequencing studies, referred to as family-based genetic random field (FGRF). Similar to existing family-based association tests, FGRF could utilize within-family and between-family information separately or jointly to test an association. We demonstrate that FGRF has comparable statistical power with existing methods when there is no genetic heterogeneity, but can improve statistical power when there is genetic heterogeneity across families. The proposed method also shares the advantages of conventional family-based association tests (e.g., being robust to population stratification). Finally, we applied the proposed method to sequencing data from the Minnesota Twin Family Study, and revealed several genes, including SAMD14, potentially associated with alcohol dependence.
102
He J, Guo Y, Xu J, Li H, Fuller A, Tait RG, Wu XL, Bauck S. Comparing SNP panels and statistical methods for estimating genomic breed composition of individual animals in ten cattle breeds. BMC Genet 2018; 19:56. [PMID: 30092776 PMCID: PMC6085684 DOI: 10.1186/s12863-018-0654-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/03/2017] [Accepted: 07/11/2018] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND SNPs are informative to estimate genomic breed composition (GBC) of individual animals, but selected SNPs for this purpose were not made available in the commercial bovine SNP chips prior to the present study. The primary objective of the present study was to select five common SNP panels for estimating GBC of individual animals initially involving 10 cattle breeds (two dairy breeds and eight beef breeds). The performance of the five common SNP panels was evaluated based on admixture model and linear regression model, respectively. Finally, the downstream implication of GBC on genomic prediction accuracies was investigated and discussed in a Santa Gertrudis cattle population. RESULTS There were 15,708 common SNPs across five currently-available commercial bovine SNP chips. From this set, four subsets (1,000, 3,000, 5,000, and 10,000 SNPs) were selected by maximizing average Euclidean distance (AED) of SNP allelic frequencies among the ten cattle breeds. For 198 animals presented as Akaushi, estimated GBC of the Akaushi breed (GBCA) based on the admixture model agreed very well among the five SNP panels, identifying 166 animals with GBCA = 1. Using the same SNP panels, the linear regression approach reported fewer animals with GBCA = 1. Nevertheless, estimated GBCA using both models were highly correlated (r = 0.953 to 0.992). In the genomic prediction of a Santa Gertrudis population (and crosses), the results showed that the predictability of molecular breeding values using SNP effects obtained from 1,225 animals with no less than 0.90 GBC of Santa Gertrudis (GBCSG) decreased on crossbred animals with lower GBCSG. CONCLUSIONS Of the two statistical models used to compute GBC, the admixture model gave more consistent results among the five selected SNP panels than the linear regression model. The availability of these common SNP panels facilitates identification and estimation of breed compositions using currently-available bovine SNP chips. In view of utility, the 1 K panel is the most cost effective and it is convenient to be included as add-on content in future development of bovine SNP chips, whereas the 10 K and 16 K SNP panels can be more resourceful if used independently for imputation to intermediate or high-density genotypes.
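The panel-selection idea in this abstract (ranking SNPs by the average Euclidean distance of their allele frequencies among breeds) can be illustrated with a minimal sketch. The per-SNP score used here, the mean absolute frequency difference over all breed pairs, is an assumption about how AED might be computed; the function names and toy frequencies are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def aed_score(freqs):
    """Mean absolute allele-frequency difference over all breed pairs for one SNP.

    For a single SNP the Euclidean distance between two breeds reduces to
    the absolute difference of their allele frequencies.
    """
    pairs = list(combinations(freqs, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def select_panel(freq_table, panel_size):
    """Return the IDs of the `panel_size` SNPs with the largest AED score."""
    ranked = sorted(freq_table, key=lambda snp: aed_score(freq_table[snp]), reverse=True)
    return ranked[:panel_size]


# Toy allele frequencies for three hypothetical breeds (not real data).
toy_freqs = {
    "snp_a": [0.10, 0.90, 0.50],
    "snp_b": [0.48, 0.50, 0.52],
    "snp_c": [0.05, 0.95, 0.95],
}
panel = select_panel(toy_freqs, 2)
```

In this toy table `snp_b` is nearly uninformative (all breeds have similar frequencies) and is the one left out of a two-SNP panel.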
Affiliation(s)
- Jun He
- Biostatistics and Bioinformatics, Neogen GeneSeek Operations, Lincoln, NE USA
- College of Animal Science and Technology, Hunan Agricultural University, Changsha, China
- Yage Guo
- Biostatistics and Bioinformatics, Neogen GeneSeek Operations, Lincoln, NE USA
- College of Education and Human Sciences, University of Nebraska, Lincoln, NE USA
- Jiaqi Xu
- Biostatistics and Bioinformatics, Neogen GeneSeek Operations, Lincoln, NE USA
- Department of Statistics, University of Nebraska, Lincoln, NE USA
- Hao Li
- Biostatistics and Bioinformatics, Neogen GeneSeek Operations, Lincoln, NE USA
- Department of Animal Sciences, University of Wisconsin, Madison, WI USA
- Anna Fuller
- Biostatistics and Bioinformatics, Neogen GeneSeek Operations, Lincoln, NE USA
- Richard G. Tait
- Biostatistics and Bioinformatics, Neogen GeneSeek Operations, Lincoln, NE USA
- Xiao-Lin Wu
- Biostatistics and Bioinformatics, Neogen GeneSeek Operations, Lincoln, NE USA
- Department of Animal Sciences, University of Wisconsin, Madison, WI USA
- Stewart Bauck
- Biostatistics and Bioinformatics, Neogen GeneSeek Operations, Lincoln, NE USA
103
Pemberton TJ, Verdu P, Becker NS, Willer CJ, Hewlett BS, Le Bomin S, Froment A, Rosenberg NA, Heyer E. A genome scan for genes underlying adult body size differences between Central African hunter-gatherers and farmers. Hum Genet 2018; 137:487-509. [PMID: 30008065 DOI: 10.1007/s00439-018-1902-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 11/05/2017] [Accepted: 07/03/2018] [Indexed: 12/16/2022]
Abstract
The evolutionary and biological bases of the Central African "pygmy" phenotype, a characteristic of rainforest hunter-gatherers defined by reduced body size compared with neighboring farmers, remain largely unknown. Here, we perform a joint investigation in Central African hunter-gatherers and farmers of adult standing height, sitting height, leg length, and body mass index (BMI), considering 358 hunter-gatherers and 169 farmers with genotypes for 153,798 SNPs. In addition to reduced standing heights, hunter-gatherers have shorter sitting heights and leg lengths and higher sitting/standing height ratios than farmers and lower BMI for males. Standing height, sitting height, and leg length are strongly correlated with inferred levels of farmer genetic ancestry, whereas BMI is only weakly correlated, perhaps reflecting greater contributions of non-genetic factors to body weight than to height. Single- and multi-marker association tests identify one region and eight genes associated with hunter-gatherer/farmer status, and 24 genes associated with the height-related traits. Many of these genes have putative functions consistent with roles in determining their associated traits and the pygmy phenotype, and they include three associated with standing height in non-Africans (PRKG1, DSCAM, MAGI2). We find evidence that European height-associated SNPs or variants in linkage disequilibrium with them contribute to standing- and sitting-height determination in Central Africans, but not to the differential status of hunter-gatherers and farmers. These findings provide new insights into the biological basis of the pygmy phenotype, and they highlight the potential of cross-population studies for exploring the genetic basis of phenotypes that vary naturally across populations.
Affiliation(s)
- Trevor J Pemberton
- Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, Canada.
- Paul Verdu
- CNRS-MNHN-Université Paris Diderot, UMR 7206 Eco-Anthropologie et Ethnobiologie, Paris, France.
- Noémie S Becker
- Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
- Cristen J Willer
- Division of Cardiovascular Medicine, University of Michigan, Ann Arbor, MI, USA
- Barry S Hewlett
- Department of Anthropology, Washington State University, Vancouver, WA, USA
- Sylvie Le Bomin
- CNRS-MNHN-Université Paris Diderot, UMR 7206 Eco-Anthropologie et Ethnobiologie, Paris, France
- Evelyne Heyer
- CNRS-MNHN-Université Paris Diderot, UMR 7206 Eco-Anthropologie et Ethnobiologie, Paris, France.
104
Hübel C, Leppä V, Breen G, Bulik CM. Rigor and reproducibility in genetic research on eating disorders. Int J Eat Disord 2018; 51:593-607. [PMID: 30194862 DOI: 10.1002/eat.22896] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 01/25/2018] [Revised: 05/10/2018] [Accepted: 05/11/2018] [Indexed: 12/29/2022]
Abstract
OBJECTIVE We explored both within-method and between-method rigor and reproducibility in the field of eating disorders genetics. METHOD We present critical evaluation and commentary on component methods of genetic research (family studies, twin studies, molecular genetic studies) and discuss both successful and unsuccessful efforts in the field. RESULTS Eating disorders genetics has had a number of robust results that converge across component methodologies. Familial aggregation of eating disorders, twin-based heritability estimates of eating disorders, and genome-wide association studies (GWAS) all point toward a substantial role for genetics in eating disorders etiology and support the premise that genes do not act alone. Candidate gene and linkage studies have been less informative historically. DISCUSSION The eating disorders field has entered the GWAS era with studies of anorexia nervosa. Continued growth of sample sizes is essential for rigorous discovery of actionable variation. Molecular genetic studies of bulimia nervosa, binge-eating disorder, and other eating disorders are virtually nonexistent and lag seriously behind other major psychiatric disorders. Expanded efforts are necessary to reveal the fundamental biology of eating disorders, inform clinical practice, and deliver new therapeutic targets.
Affiliation(s)
- Christopher Hübel
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
- UK National Institute for Health Research (NIHR) Biomedical Research Centre for Mental Health, South London and Maudsley Hospital, London, United Kingdom
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Virpi Leppä
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Gerome Breen
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
- UK National Institute for Health Research (NIHR) Biomedical Research Centre for Mental Health, South London and Maudsley Hospital, London, United Kingdom
- Cynthia M Bulik
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Department of Nutrition, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
105
D'Aquila P, Crocco P, De Rango F, Indiveri C, Bellizzi D, Rose G, Passarino G. A Genetic Variant of ASCT2 Hampers In Vitro RNA Splicing and Correlates with Human Longevity. Rejuvenation Res 2018; 21:193-199. [DOI: 10.1089/rej.2017.1948] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/05/2023] Open
Affiliation(s)
- Patrizia D'Aquila
- Department of Biology, Ecology and Earth Science, University of Calabria, Rende, Italy
- Paolina Crocco
- Department of Biology, Ecology and Earth Science, University of Calabria, Rende, Italy
- Francesco De Rango
- Department of Biology, Ecology and Earth Science, University of Calabria, Rende, Italy
- Cesare Indiveri
- Department of Biology, Ecology and Earth Science, University of Calabria, Rende, Italy
- Dina Bellizzi
- Department of Biology, Ecology and Earth Science, University of Calabria, Rende, Italy
- Giuseppina Rose
- Department of Biology, Ecology and Earth Science, University of Calabria, Rende, Italy
- Giuseppe Passarino
- Department of Biology, Ecology and Earth Science, University of Calabria, Rende, Italy
106
Boerner V, Wittenburg D. On Estimation of Genome Composition in Genetically Admixed Individuals Using Constrained Genomic Regression. Front Genet 2018; 9:185. [PMID: 29896217 PMCID: PMC5986875 DOI: 10.3389/fgene.2018.00185] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 02/01/2018] [Accepted: 05/07/2018] [Indexed: 11/18/2022] Open
Abstract
Quantifying the population stratification in genotype samples has become a standard procedure for data manipulation before conducting genome wide association studies, as well as for tracing patterns of migration in humans and animals, and for inference about extinct founder populations. The most widely used approach capable of providing biologically interpretable results is a likelihood formulation which allows for estimation of founder genome proportions and founder allele frequency conditional on the observed genotypes. However, if founder allele frequencies are known and samples are dominated by admixed genotypes this approach may lead to biased inference. In addition, processing time increases drastically with the number of genetic markers. This article describes a simplified approach for obtaining biologically meaningful measures of population stratification at the genotype level conditional on known founder allele frequencies. It was tested on cattle and human data sets with 4,022 and 150,000 genetic markers, respectively, and proved to be very accurate in situations where founder populations were correctly specified, or under-, over-, or misspecified. Moreover, processing time was only marginally affected by an increase in the number of markers.
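To make the idea of a constrained regression on known founder allele frequencies concrete, here is a minimal sketch. It is a generic constrained least-squares formulation and an assumption on our part, not the estimator described in the article: an individual's allele dosages (divided by 2) are regressed on the founder allele frequencies, with the founder proportions forced to be non-negative and to sum to one via projected gradient descent. All names and the toy numbers are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {q : q_k >= 0, sum(q) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate  # keep the threshold from the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(dosage, founder_freqs, steps=500):
    """Least-squares founder proportions under simplex constraints.

    dosage: allele dosages per marker (0..2, may be fractional).
    founder_freqs: one list of per-marker allele frequencies per founder.
    """
    k, m = len(founder_freqs), len(dosage)
    x = [g / 2.0 for g in dosage]
    q = [1.0 / k] * k
    step = 1.0 / (2.0 * k)  # safe step: the Hessian's largest eigenvalue is at most 2k
    for _ in range(steps):
        pred = [sum(q[i] * founder_freqs[i][j] for i in range(k)) for j in range(m)]
        resid = [x[j] - pred[j] for j in range(m)]
        grad = [-2.0 / m * sum(resid[j] * founder_freqs[i][j] for j in range(m))
                for i in range(k)]
        q = project_to_simplex([q[i] - step * grad[i] for i in range(k)])
    return q


# Toy example: two hypothetical founders, six markers, true mixture 25% / 75%.
founder_a = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3]
founder_b = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]
expected_dosage = [2.0 * (0.25 * a + 0.75 * b) for a, b in zip(founder_a, founder_b)]
proportions = estimate_proportions(expected_dosage, [founder_a, founder_b])
```

The toy input uses noise-free expected dosages so that the true mixture is recoverable; with real integer genotypes the estimate would scatter around the true proportions.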
Affiliation(s)
- Vinzent Boerner
- Animal Genetics and Breeding Unit, University of New England, Armidale, NSW, Australia
- Dörte Wittenburg
- Institute of Genetics and Biometry, Leibniz Institute for Farm Animal Biology, Dummerstorf, Germany
107
Alhusain L, Hafez AM. Nonparametric approaches for population structure analysis. Hum Genomics 2018; 12:25. [PMID: 29743099 PMCID: PMC5944014 DOI: 10.1186/s40246-018-0156-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 03/06/2018] [Accepted: 04/24/2018] [Indexed: 12/28/2022] Open
Abstract
The analysis of population structure has many applications in medical and population genetic research. Such analysis is used to provide clear insight into the underlying genetic population substructure and is a crucial prerequisite for any analysis of genetic data. The analysis involves grouping individuals into subpopulations based on shared genetic variations. The most widely used markers to study the variation of DNA sequences between populations are single nucleotide polymorphisms. Data preprocessing is a necessary step to assess the quality of the data and to determine which markers or individuals can reasonably be included in the analysis. After preprocessing, several methods can be utilized to uncover population substructure, which can be categorized into two broad approaches: parametric and nonparametric. Parametric approaches use statistical models to infer population structure and assign individuals into subpopulations. However, these approaches suffer from many drawbacks that make them impractical for large datasets. In contrast, nonparametric approaches do not suffer from these drawbacks, making them more viable than parametric approaches for analyzing large datasets. Consequently, nonparametric approaches are increasingly used to reveal population substructure. Thus, this paper reviews and discusses the nonparametric approaches that are available for population structure analysis along with some implications to resolve challenges.
Affiliation(s)
- Luluah Alhusain
- College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia.
- Alaaeldin M Hafez
- College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
108
Muzzio M, Motti JMB, Paz Sepulveda PB, Yee MC, Cooke T, Santos MR, Ramallo V, Alfaro EL, Dipierri JE, Bailliet G, Bravi CM, Bustamante CD, Kenny EE. Population structure in Argentina. PLoS One 2018; 13:e0196325. [PMID: 29715266 PMCID: PMC5929549 DOI: 10.1371/journal.pone.0196325] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 12/07/2017] [Accepted: 04/11/2018] [Indexed: 11/19/2022] Open
Abstract
We analyzed 391 samples from 12 Argentinian populations from the Center-West, East and North-West regions with the Illumina Human Exome Beadchip v1.0 (HumanExome-12v1-A). We performed principal component analysis to infer patterns of population divergence and migrations. We identified proportions and patterns of European, African and Native American ancestry and found a correlation between distance to Buenos Aires and proportion of Native American ancestry, where the highest proportion corresponds to the Northernmost populations, which are also the furthest from the Argentinian capital. Most of the European sources are from a South European origin, matching historical records, and we see two different Native American components, one that spreads all over Argentina and another specifically Andean. The highest percentages of African ancestry were in the Center West of Argentina, where the old trade routes took the slaves from Buenos Aires to Chile and Peru. At the subcontinental level, sources of this African component are represented by both West Africa and groups influenced by the Bantu expansion, the second slightly higher than the first, unlike North America and the Caribbean, where the main source is West Africa. This is reasonable, considering that a large proportion of the ships arriving at the Southern Hemisphere came from Mozambique, Loango and Angola.
Affiliation(s)
- Marina Muzzio
- Instituto Multidisciplinario de Biología Celular (IMBICE) CCT-La Plata CONICET-CICPBA, La Plata, Buenos Aires, Argentina
- Facultad de Ciencias Naturales y Museo, Universidad Nacional de La Plata, La Plata, Buenos Aires, Argentina
- Josefina M. B. Motti
- Universidad Nacional del Centro de la Provincia de Buenos Aires, FACSO, NEIPHPA, Quequén, Buenos Aires, Argentina
- Paula B. Paz Sepulveda
- Instituto Multidisciplinario de Biología Celular (IMBICE) CCT-La Plata CONICET-CICPBA, La Plata, Buenos Aires, Argentina
- Muh-ching Yee
- Stanford University, Stanford, California, United States of America
- Thomas Cooke
- Stanford University, Stanford, California, United States of America
- María R. Santos
- Instituto Multidisciplinario de Biología Celular (IMBICE) CCT-La Plata CONICET-CICPBA, La Plata, Buenos Aires, Argentina
- Facultad de Ciencias Naturales y Museo, Universidad Nacional de La Plata, La Plata, Buenos Aires, Argentina
- Emma L. Alfaro
- INECOA (Instituto de Ecorregiones Andinas) UNJu-CONICET, Instituto de Biología de la Altura, Universidad Nacional de Jujuy, San Salvador de Jujuy, Jujuy, Argentina
- Jose E. Dipierri
- INECOA (Instituto de Ecorregiones Andinas) UNJu-CONICET, Instituto de Biología de la Altura, Universidad Nacional de Jujuy, San Salvador de Jujuy, Jujuy, Argentina
- Graciela Bailliet
- Instituto Multidisciplinario de Biología Celular (IMBICE) CCT-La Plata CONICET-CICPBA, La Plata, Buenos Aires, Argentina
- Claudio M. Bravi
- Instituto Multidisciplinario de Biología Celular (IMBICE) CCT-La Plata CONICET-CICPBA, La Plata, Buenos Aires, Argentina
- Facultad de Ciencias Naturales y Museo, Universidad Nacional de La Plata, La Plata, Buenos Aires, Argentina
- Eimear E. Kenny
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, United States
109
Llinares-López F, Papaxanthos L, Bodenham D, Roqueiro D, Borgwardt K. Genome-wide genetic heterogeneity discovery with categorical covariates. Bioinformatics 2018; 33:1820-1828. [PMID: 28200033 PMCID: PMC5870548 DOI: 10.1093/bioinformatics/btx071] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 04/01/2016] [Accepted: 02/08/2017] [Indexed: 12/30/2022] Open
Abstract
Motivation Genetic heterogeneity is the phenomenon that distinct genetic variants may give rise to the same phenotype. The recently introduced algorithm Fast Automatic Interval Search (FAIS) enables the genome-wide search of candidate regions for genetic heterogeneity in the form of any contiguous sequence of variants, and achieves high computational efficiency and statistical power. Although FAIS can test all possible genomic regions for association with a phenotype, a key limitation is its inability to correct for confounders such as gender or population structure, which may lead to numerous false-positive associations. Results We propose FastCMH, a method that overcomes this problem by properly accounting for categorical confounders, while still retaining statistical power and computational efficiency. Experiments comparing FastCMH with FAIS and multiple kinds of burden tests on simulated data, as well as on human and Arabidopsis samples, demonstrate that FastCMH can drastically reduce genomic inflation and discover associations that are missed by standard burden tests. Availability and Implementation An R package fastcmh is available on CRAN and the source code can be found at: https://www.bsse.ethz.ch/mlcb/research/bioinformatics-and-computational-biology/fastcmh.html Supplementary information Supplementary data are available at Bioinformatics online.
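The name FastCMH suggests the Cochran-Mantel-Haenszel (CMH) test, the standard way to test a 2x2 association while stratifying on a categorical covariate; the abstract does not spell this out, so treat that link as an assumption. The sketch below implements only the textbook CMH statistic (without continuity correction), not the FastCMH region search, and the toy counts are made up to show how stratification removes a confounded signal.

```python
import math


def cmh_statistic(tables):
    """Cochran-Mantel-Haenszel chi-square statistic (1 df, no continuity correction).

    tables: one (a, b, c, d) tuple of counts per stratum, where
    a = cases with variant, b = cases without,
    c = controls with variant, d = controls without.
    """
    observed = expected = variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        observed += a
        expected += row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if variance == 0:
        raise ValueError("all strata are degenerate; the statistic is undefined")
    return (observed - expected) ** 2 / variance


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Toy counts: within each stratum the variant is unrelated to case status,
# but variant frequency and case fraction both differ between strata.
strata = [(40, 10, 8, 2), (2, 8, 10, 40)]
pooled = [(42, 18, 18, 42)]  # same data with the stratum label ignored

stratified_stat = cmh_statistic(strata)
pooled_stat = cmh_statistic(pooled)
pooled_p = chi2_1df_pvalue(pooled_stat)
```

Here the pooled table shows a strong apparent association, while the stratified statistic is zero because the signal comes entirely from the confounder.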
Affiliation(s)
- Felipe Llinares-López
- Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Laetitia Papaxanthos
- Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Dean Bodenham
- Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Damian Roqueiro
- Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Karsten Borgwardt
- Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
110
Lloyd-Jones LR, Robinson MR, Yang J, Visscher PM. Transformation of Summary Statistics from Linear Mixed Model Association on All-or-None Traits to Odds Ratio. Genetics 2018; 208:1397-1408. [PMID: 29429966 PMCID: PMC5887138 DOI: 10.1534/genetics.117.300360] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Received: 10/05/2017] [Accepted: 01/25/2018] [Indexed: 12/15/2022] Open
Abstract
Genome-wide association studies (GWAS) have identified thousands of loci that are robustly associated with complex diseases. The use of linear mixed model (LMM) methodology for GWAS is becoming more prevalent due to its ability to control for population structure and cryptic relatedness and to increase power. The odds ratio (OR) is a common measure of the association of a disease with an exposure (e.g., a genetic variant) and is readily available from logistic regression. However, when the LMM is applied to all-or-none traits it provides estimates of genetic effects on the observed 0-1 scale, a different scale to that in logistic regression. This limits the comparability of results across studies, for example in a meta-analysis, and makes the interpretation of the magnitude of an effect from an LMM GWAS difficult. In this study, we derived transformations from the genetic effects estimated under the LMM to the OR that only rely on summary statistics. To test the proposed transformations, we used real genotypes from two large, publicly available data sets to simulate all-or-none phenotypes for a set of scenarios that differ in underlying model, disease prevalence, and heritability. Furthermore, we applied these transformations to GWAS summary statistics for type 2 diabetes generated from 108,042 individuals in the UK Biobank. In both simulation and real-data application, we observed very high concordance between the transformed OR from the LMM and either the simulated truth or estimates from logistic regression. The transformations derived and validated in this study improve the comparability of results from prospective and already performed LMM GWAS on complex diseases by providing a reliable transformation to a common comparative scale for the genetic effects.
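To show why an effect on the observed 0-1 scale can be mapped to an odds ratio from summary statistics alone, here is a first-principles sketch. It assumes a simple allele-level linear risk model whose mean risk equals the case fraction in the sample; this is an illustrative approximation under that stated assumption and is not claimed to be the transformation derived in the article. The function name and the example numbers are hypothetical.

```python
def observed_scale_to_odds_ratio(beta, allele_freq, case_fraction):
    """Approximate per-allele odds ratio from a 0-1 scale effect estimate.

    Assumes risk = case_fraction + beta * (allele - allele_freq), where
    allele is 1 for the effect allele and 0 otherwise, so the average
    risk over alleles equals case_fraction.
    """
    risk_effect = case_fraction + beta * (1.0 - allele_freq)  # carrying the effect allele
    risk_other = case_fraction - beta * allele_freq           # carrying the other allele
    if not (0.0 < risk_effect < 1.0 and 0.0 < risk_other < 1.0):
        raise ValueError("implied risks fall outside (0, 1); inputs are inconsistent")
    odds_effect = risk_effect / (1.0 - risk_effect)
    odds_other = risk_other / (1.0 - risk_other)
    return odds_effect / odds_other


# Hypothetical summary statistics: effect 0.05 on the 0-1 scale,
# effect-allele frequency 0.3, 10% of the sample are cases.
example_or = observed_scale_to_odds_ratio(0.05, 0.3, 0.1)
```

The only inputs are the effect estimate, the allele frequency, and the case fraction, which is what makes this kind of conversion usable when individual-level data are unavailable.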
Affiliation(s)
- Luke R Lloyd-Jones
- Institute for Molecular Bioscience, University of Queensland, Brisbane 4072, Australia
- Matthew R Robinson
- Institute for Molecular Bioscience, University of Queensland, Brisbane 4072, Australia
- Department of Computational Biology, University of Lausanne, CH-1015, Switzerland
- Jian Yang
- Institute for Molecular Bioscience, University of Queensland, Brisbane 4072, Australia
- Queensland Brain Institute, University of Queensland, Brisbane 4072, Australia
- Peter M Visscher
- Institute for Molecular Bioscience, University of Queensland, Brisbane 4072, Australia
- Queensland Brain Institute, University of Queensland, Brisbane 4072, Australia
111
Brieuc MSO, Waters CD, Drinan DP, Naish KA. A practical introduction to Random Forest for genetic association studies in ecology and evolution. Mol Ecol Resour 2018; 18:755-766. [PMID: 29504715 DOI: 10.1111/1755-0998.12773] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 10/30/2017] [Revised: 02/08/2018] [Accepted: 02/17/2018] [Indexed: 12/25/2022]
Abstract
Large genomic studies are becoming increasingly common with advances in sequencing technology, and our ability to understand how genomic variation influences phenotypic variation between individuals has never been greater. The exploration of such relationships first requires the identification of associations between molecular markers and phenotypes. Here, we explore the use of Random Forest (RF), a powerful machine-learning algorithm, in genomic studies to discern loci underlying both discrete and quantitative traits, particularly when studying wild or nonmodel organisms. RF is becoming increasingly used in ecological and population genetics because, unlike traditional methods, it can efficiently analyse thousands of loci simultaneously and account for nonadditive interactions. However, understanding both the power and limitations of Random Forest is important for its proper implementation and the interpretation of results. We therefore provide a practical introduction to the algorithm and its use for identifying associations between molecular markers and phenotypes, discussing such topics as data limitations, algorithm initiation and optimization, as well as interpretation. We also provide short R tutorials as examples, with the aim of providing a guide to the implementation of the algorithm. Topics discussed here are intended to serve as an entry point for molecular ecologists interested in employing Random Forest to identify trait associations in genomic data sets.
Affiliation(s)
- Marine S O Brieuc
- School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA, USA
- Center for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, Oslo, Norway
- Charles D Waters
- School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA, USA
- Daniel P Drinan
- School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA, USA
- Kerry A Naish
- School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA, USA
112
Steinsaltz D, Dahl A, Wachter KW. Statistical properties of simple random-effects models for genetic heritability. Electron J Stat 2018; 12:321-356. [PMID: 30057658 PMCID: PMC6063091 DOI: 10.1214/17-ejs1386] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/23/2022]
Abstract
Random-effects models are a popular tool for analysing total narrow-sense heritability for quantitative phenotypes, on the basis of large-scale SNP data. Recently, there have been disputes over the validity of conclusions that may be drawn from such analysis. We derive some of the fundamental statistical properties of heritability estimates arising from these models, showing that the bias will generally be small. We show that the score function may be manipulated into a form that facilitates intelligible interpretations of the results. We go on to use this score function to explore the behavior of the model when certain key assumptions of the model are not satisfied - shared environment, measurement error, and genetic effects that are confined to a small subset of sites. The variance and bias depend crucially on the variance of certain functionals of the singular values of the genotype matrix. A useful baseline is the singular value distribution associated with genotypes that are completely independent - that is, with no linkage and no relatedness - for a given number of individuals and sites. We calculate the corresponding variance and bias for this setting.
Affiliation(s)
| | - Andrew Dahl
- Wellcome Trust Centre for Human Genetics and Department of Statistics, University of Oxford
| | | |
|
113
|
Masaki Y, Cayer D, McBride R, Ghadiri MR. A kinetically controlled, isothermal method for the detection of single nucleotide mismatches. Bioorg Med Chem Lett 2018; 28:2754-2758. [PMID: 29500066 DOI: 10.1016/j.bmcl.2018.02.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2018] [Accepted: 02/13/2018] [Indexed: 11/28/2022]
Abstract
We describe an isothermal, enzyme-free method to detect single nucleotide differences between oligonucleotides of close homology. The approach exploits kinetic differences in toe-hold-mediated, nucleic acid strand-displacement reactions to detect single nucleotide polymorphisms (SNPs) with essentially "digital" precision. The theoretical underpinning, experimental analyses, predictability, and accuracy of this new method are reported. We demonstrate detection of biologically relevant SNPs and single nucleotide differences in the let-7 family of microRNAs. The method is adaptable to microarray formats, as demonstrated with on-chip detection of SNP variants involved in susceptibility to the therapeutic agents abacavir, Herceptin, and simvastatin.
Affiliation(s)
- Yoshiaki Masaki
- Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, United States
| | - Devon Cayer
- Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, United States
| | - Ryan McBride
- Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, United States
| | - M Reza Ghadiri
- Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, United States.
| |
|
114
|
Abstract
This chapter provides a practical overview of statistical analysis using R [1] and genotyping-by-sequencing (GBS) markers for genome-wide association studies (GWAS) in oats. Statistical analysis is performed with the R package rrBLUP [2], and issues associated with the analysis are addressed along with the R code. The ultimate aim of this chapter is to provide a practical guideline for GWAS analysis using R, rather than to describe the theory in depth. For more details about the subject, readers are referred to the excellent resource book on GWAS [3]. Basic programming experience in R is assumed.
Affiliation(s)
- Julio Isidro-Sánchez
- Agriculture and Food Science, University College Dublin, Room 1.38, Belfield, Dublin 4, Ireland.
| | - Deniz Akdemir
- Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, USA
| | - Gracia Montilla-Bascón
- Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, USA
| |
|
115
|
Smit RAJ, Noordam R, le Cessie S, Trompet S, Jukema JW. A critical appraisal of pharmacogenetic inference. Clin Genet 2018; 93:498-507. [PMID: 29136278 DOI: 10.1111/cge.13178] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 08/16/2017] [Revised: 10/25/2017] [Accepted: 11/09/2017] [Indexed: 01/06/2023]
Abstract
In essence, pharmacogenetic research is aimed at discovering variants of importance to gene-treatment interaction. However, epidemiological studies are rarely set up with this goal in mind. It is therefore of great importance that researchers clearly communicate which assumptions they have had to make, and which inherent limitations apply to the interpretation of their results. This review discusses considerations of, and the underlying assumptions for, utilizing different response phenotypes and study designs popular in pharmacogenetic research to infer gene-treatment interaction effects, with a special focus on those dealing with clinical effects of drug treatment.
Affiliation(s)
- R A J Smit
- Department of Cardiology, Leiden University Medical Center, Leiden, the Netherlands.,Section of Gerontology and Geriatrics, Department of Internal Medicine, Leiden University Medical Center, Leiden, the Netherlands
| | - R Noordam
- Section of Gerontology and Geriatrics, Department of Internal Medicine, Leiden University Medical Center, Leiden, the Netherlands
| | - S le Cessie
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, the Netherlands.,Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, the Netherlands
| | - S Trompet
- Department of Cardiology, Leiden University Medical Center, Leiden, the Netherlands.,Section of Gerontology and Geriatrics, Department of Internal Medicine, Leiden University Medical Center, Leiden, the Netherlands
| | - J W Jukema
- Department of Cardiology, Leiden University Medical Center, Leiden, the Netherlands.,Einthoven Laboratory for Experimental Vascular Medicine, Leiden University Medical Center, Leiden, the Netherlands
| |
|
116
|
A phylogenetic method to perform genome-wide association studies in microbes that accounts for population structure and recombination. PLoS Comput Biol 2018; 14:e1005958. [PMID: 29401456 PMCID: PMC5814097 DOI: 10.1371/journal.pcbi.1005958] [Citation(s) in RCA: 122] [Impact Index Per Article: 20.3] [Received: 06/06/2017] [Revised: 02/15/2018] [Accepted: 12/30/2017] [Indexed: 11/28/2022] Open
Abstract
Genome-Wide Association Studies (GWAS) in microbial organisms have the potential to vastly improve the way we understand, manage, and treat infectious diseases. Yet, microbial GWAS methods established thus far remain insufficiently able to capitalise on the growing wealth of bacterial and viral genetic sequence data. Facing clonal population structure and homologous recombination, existing GWAS methods struggle to achieve both the precision necessary to reject spurious findings and the power required to detect associations in microbes. In this paper, we introduce a novel phylogenetic approach that has been tailor-made for microbial GWAS, which is applicable to organisms ranging from purely clonal to frequently recombining, and to both binary and continuous phenotypes. Our approach is robust to the confounding effects of both population structure and recombination, while maintaining high statistical power to detect associations. Thorough testing via application to simulated data provides strong support for the power and specificity of our approach and demonstrates the advantages offered over alternative cluster-based and dimension-reduction methods. Two applications to Neisseria meningitidis illustrate the versatility and potential of our method, confirming previously-identified penicillin resistance loci and resulting in the identification of both well-characterised and novel drivers of invasive disease. Our method is implemented as an open-source R package called treeWAS which is freely available at https://github.com/caitiecollins/treeWAS. Measurable differences often exist within a microbial population, with important ecological or epidemiological consequences. Examples include differences in growth rates, host range, transmissibility, antimicrobial resistance, virulence, etc. Understanding the genetic factors involved in these phenotypic properties is a crucial aim in microbial genomics. 
A fundamental approach for doing so is to perform a Genome-Wide Association Study (GWAS), where genomes are compared to search for genetic markers systematically correlated with the property of interest. If this strategy were implemented naively in microbes, it could lead to spurious results due to the confounding effects of population structure and recombination. Here we present treeWAS, a new phylogenetic method to perform microbial GWAS that avoids these pitfalls. We show, using simulated datasets, that treeWAS is able to distinguish between genetic markers that are truly associated with the property of interest and those that are not. Furthermore, we demonstrate that treeWAS offers advantages in both sensitivity and specificity over alternative cluster-based and dimension-reduction techniques. We also showcase treeWAS in two applications to real datasets from N. meningitidis. We have developed an easy-to-use implementation of treeWAS in the R environment, which should be useful to a wide range of researchers in microbial genomics.
|
117
|
Armstrong C, Richardson DS, Hipperson H, Horsburgh GJ, Küpper C, Percival‐Alwyn L, Clark M, Burke T, Spurgin LG. Genomic associations with bill length and disease reveal drift and selection across island bird populations. Evol Lett 2018; 2:22-36. [PMID: 30283662 PMCID: PMC6121843 DOI: 10.1002/evl3.38] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 07/18/2017] [Revised: 12/19/2017] [Accepted: 12/21/2017] [Indexed: 12/15/2022] Open
Abstract
Island species provide excellent models for investigating how selection and drift operate in wild populations, and for determining how these processes act to influence local adaptation and speciation. Here, we examine the role of selection and drift in shaping genomic and phenotypic variation across recently separated populations of Berthelot's pipit (Anthus berthelotii), a passerine bird endemic to three archipelagos in the Atlantic. We first characterized genetic diversity and population structuring that supported previous inferences of a history of recent colonizations and bottlenecks. We then tested for regions of the genome associated with the ecologically important traits of bill length and malaria infection, both of which vary substantially across populations in this species. We identified a SNP associated with variation in bill length among individuals, islands, and archipelagos; patterns of variation at this SNP suggest that both phenotypic and genotypic variation in bill length is largely shaped by founder effects. Malaria was associated with SNPs near/within genes involved in the immune response, but this relationship was not consistent among archipelagos, supporting the view that disease resistance is complex and rapidly evolving. Although we found little evidence for divergent selection at candidate loci for bill length and malaria resistance, genome scan analyses pointed to several genes related to immunity and metabolism as having important roles in divergence and adaptation. Our findings highlight the utility and challenges involved with combining association mapping and population genetic analysis in nonequilibrium populations, to disentangle the effects of drift and selection on shaping genotypes and phenotypes.
Affiliation(s)
- Claire Armstrong
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, United Kingdom
| | - David S. Richardson
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, United Kingdom
| | - Helen Hipperson
- NERC Biomolecular Analysis Facility, Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, United Kingdom
| | - Gavin J. Horsburgh
- NERC Biomolecular Analysis Facility, Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, United Kingdom
| | - Clemens Küpper
- Max Planck Institute for Ornithology, 82319 Seewiesen, Germany
| | | | - Matt Clark
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom
| | - Terry Burke
- NERC Biomolecular Analysis Facility, Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, United Kingdom
| | - Lewis G. Spurgin
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, United Kingdom
| |
|
118
|
Erzurumluoglu AM, Baird D, Richardson TG, Timpson NJ, Rodriguez S. Using Y-Chromosomal Haplogroups in Genetic Association Studies and Suggested Implications. Genes (Basel) 2018; 9:E45. [PMID: 29361760 PMCID: PMC5793196 DOI: 10.3390/genes9010045] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/23/2017] [Revised: 01/16/2018] [Accepted: 01/16/2018] [Indexed: 11/16/2022] Open
Abstract
Y-chromosomal (Y-DNA) haplogroups are more widely used in population genetics than in genetic epidemiology, although associations between Y-DNA haplogroups and several traits, including cardiometabolic traits, have been reported. In apparently homogeneous populations defined by principal component analyses, there is still Y-DNA haplogroup variation, which results from population history. Therefore, hidden stratification and/or differential phenotypic effects by Y-DNA haplogroups could exist. To test this, we hypothesised that stratifying individuals according to their Y-DNA haplogroups before testing for associations between autosomal single nucleotide polymorphisms (SNPs) and phenotypes would yield differences in association. For proof of concept, we derived Y-DNA haplogroups from 6537 males from two epidemiological cohorts, Avon Longitudinal Study of Parents and Children (ALSPAC) (n = 5080; 816 Y-DNA SNPs) and the 1958 Birth Cohort (n = 1457; 1849 Y-DNA SNPs), and studied the robust associations between 32 SNPs and body mass index (BMI), including SNPs in or near Fat Mass and Obesity-associated protein (FTO) which yield the strongest effects. Overall, no association was replicated in both cohorts when Y-DNA haplogroups were considered and this suggests that, for BMI at least, there is little evidence of differences in phenotype or SNP association by Y-DNA structure. Further studies using other traits, phenome-wide association studies (PheWAS), other haplogroups and/or autosomal SNPs are required to test the generalisability and utility of this approach.
Affiliation(s)
- A Mesut Erzurumluoglu
- Genetic Epidemiology Group, Department of Health Sciences, University of Leicester, Leicester LE1 7RH, UK.
| | - Denis Baird
- MRC Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK.
| | - Tom G Richardson
- MRC Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK.
| | - Nicholas J Timpson
- MRC Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK.
| | - Santiago Rodriguez
- MRC Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK.
| |
|
119
|
Skotte L, Koch A, Yakimov V, Zhou S, Søborg B, Andersson M, Michelsen SW, Navne JE, Mistry JM, Dion PA, Pedersen ML, Børresen ML, Rouleau GA, Geller F, Melbye M, Feenstra B. CPT1A Missense Mutation Associated With Fatty Acid Metabolism and Reduced Height in Greenlanders. Circ Cardiovasc Genet 2018; 10:CIRCGENETICS.116.001618. [PMID: 28611031 DOI: 10.1161/circgenetics.116.001618] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 09/09/2016] [Accepted: 04/06/2017] [Indexed: 11/16/2022]
Abstract
BACKGROUND Inuit have lived for thousands of years in an extremely cold environment on a diet dominated by marine-derived fat. To investigate how this selective pressure has affected the genetic regulation of fatty acid metabolism, we assessed 233 serum metabolic phenotypes in a population-based sample of 1570 Greenlanders. METHODS AND RESULTS Using array-based and targeted genotyping, we found that rs80356779, a p.Pro479Leu variant in CPT1A, was strongly associated with markers of n-3 fatty acid metabolism, including degree of unsaturation (P=1.16×10^-34), levels of polyunsaturated fatty acids, n-3 fatty acids, and docosahexaenoic acid relative to total fatty acid levels (P=2.35×10^-15, P=4.02×10^-19, and P=7.92×10^-27). The derived allele (L479) occurred at a frequency of 76.2% in our sample while being absent in most other populations, and we found strong signatures of positive selection at the locus. Furthermore, we found that each copy of L479 reduced height by an average of 2.1 cm (P=1.04×10^-9). In exome sequencing data from a sister population, the Nunavik Inuit, we found no other likely causal candidate variant than rs80356779. CONCLUSION Our study shows that a common CPT1A missense mutation is strongly associated with a range of metabolic phenotypes and reduced height in Greenlanders. These findings are important from a public health perspective and highlight the usefulness of complex trait genetic studies in isolated populations.
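The statement that each copy of an allele changes a trait by a fixed amount corresponds to an additive model, in which the trait is regressed on allele dosage (0, 1, or 2 copies) and the slope is read as the per-allele effect. The sketch below illustrates only that idea. The function name, the toy dosages, and the baseline height are hypothetical, the data are synthetic, and the study's actual analysis (covariates, relatedness, population structure) is not reproduced here.

```python
def per_allele_effect(dosages, phenotypes):
    """Least-squares slope of phenotype on allele dosage (0, 1, or 2 copies)."""
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, phenotypes))
    var = sum((g - mean_g) ** 2 for g in dosages)
    return cov / var

# Synthetic toy data, NOT from the study: heights are constructed so that
# each copy of the allele lowers height by exactly 2.1 cm from a made-up
# baseline of 165 cm.
toy_dosages = [0, 1, 1, 2, 2, 2]
toy_heights = [165.0 - 2.1 * g for g in toy_dosages]
slope = per_allele_effect(toy_dosages, toy_heights)
```

Because the toy heights are exactly linear in dosage, the fitted slope recovers the built-in per-allele change; with real data the slope would be an estimate with sampling error.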
Affiliation(s)
- Line Skotte
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.).
| | - Anders Koch
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Victor Yakimov
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Sirui Zhou
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Bolette Søborg
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Mikael Andersson
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Sascha W Michelsen
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Johan E Navne
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Jacqueline M Mistry
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Patrick A Dion
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Michael L Pedersen
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Malene L Børresen
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Guy A Rouleau
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Frank Geller
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Mads Melbye
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.)
| | - Bjarke Feenstra
- From the Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark (L.S., A.K., V.Y., B.S., M.A., S.W.M., J.E.N., J.M.M., M.L.B., F.G., M.M., B.F.); Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada (S.Z., P.A.D., G.A.R.); Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada (P.A.D., G.A.R.); Département de Médecine, Faculté de Médecine, Université de Montréal, Quebec, Canada (S.Z.); Greenland Center for Health Research, Institute of Nursing and Health Science, University of Greenland, Nuuk, Greenland (M.L.P.); Department of Clinical Medicine, University of Copenhagen, Denmark (M.M.); and Department of Medicine, Stanford University School of Medicine, California (M.M.).
| |
|
120
|
Lipner EM, Greenberg DA. The Rise and Fall and Rise of Linkage Analysis as a Technique for Finding and Characterizing Inherited Influences on Disease Expression. Methods Mol Biol 2018; 1706:381-397. [PMID: 29423810 DOI: 10.1007/978-1-4939-7471-9_21] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 06/08/2023]
Abstract
For many years, family-based studies using linkage analysis represented the primary approach for identifying disease genes. This strategy is responsible for the identification of the greatest number of genes proven to cause human disease. However, technical advancements in next generation sequencing and high throughput genotyping, coupled with the apparent simplicity of association testing, led to the rejection of family-based studies and of linkage analysis. At present, genetic association methods, using case-control comparisons, have become the exclusive approach for detecting disease-related genes, particularly those underlying common, complex diseases. In this chapter, we present a historical overview of linkage analysis, including a description of how the approach works, as well as its strengths and weaknesses. We discuss how the transition from family-based studies to population comparison association studies led to a critical loss of information with respect to genetic etiology and inheritance, and we present historical and contemporary examples of linkage analysis "success stories" in identifying genes contributing to the development of human disease. Currently, linkage analysis is re-emerging as a useful approach for identifying disease genes, determining genetic parameters, and resolving genetic heterogeneity. We posit that the combination of linkage analysis, association testing, and high throughput sequencing provides a powerful approach for identifying disease-causing genes.
Affiliation(s)
- Ettie M Lipner
- Center for Genes, Environment, and Health, National Jewish Health, 1400 Jackson Street, Denver, CO, 80602, USA.
- Department of Pharmacology, University of Colorado Denver, School of Medicine, Aurora, CO, USA.
| | - David A Greenberg
- Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Department of Pediatrics, Wexner Medical Center, Ohio State University, Columbus, OH, USA
| |
|
121
|
Alhusain L, Hafez AM. Cluster ensemble based on Random Forests for genetic data. BioData Min 2017; 10:37. [PMID: 29270227 PMCID: PMC5732374 DOI: 10.1186/s13040-017-0156-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 08/27/2017] [Accepted: 11/21/2017] [Indexed: 11/25/2022] Open
Abstract
Background Clustering plays a crucial role in several application domains, such as bioinformatics. In bioinformatics, clustering has been extensively used as an approach for detecting interesting patterns in genetic data. One application is population structure analysis, which aims to group individuals into subpopulations based on shared genetic variations, such as single nucleotide polymorphisms. Advances in DNA sequencing technology have made it possible to obtain genetic datasets of exceptional size. Genetic data usually contain hundreds of thousands of genetic markers genotyped for thousands of individuals, making an efficient means for handling such data desirable. Results Random Forests (RFs) has emerged as an efficient algorithm capable of handling high-dimensional data. RFs provides a proximity measure that can capture different levels of co-occurring relationships between variables. RFs has been widely considered a supervised learning method, although it can be converted into an unsupervised learning method. Therefore, an RF-derived proximity measure combined with a clustering technique may be well suited for determining the underlying structure of unlabeled data. This paper proposes RFcluE, a cluster ensemble approach for determining the underlying structure of genetic data based on RFs. The approach comprises a cluster ensemble framework to combine multiple runs of RF clustering. Experiments were conducted on a high-dimensional, real genetic dataset to evaluate the proposed approach. The experiments included an examination of the impact of parameter changes, a comparison of RFcluE performance against other clustering methods, and an assessment of the relationship between the diversity and quality of the ensemble and its effect on RFcluE performance. Conclusions This paper proposes RFcluE, a cluster ensemble approach based on RF clustering, to address the problem of population structure analysis, and demonstrates the effectiveness of the approach.
The paper also illustrates that applying a cluster ensemble approach, combining multiple RF clusterings, produces more robust and higher-quality results as a consequence of feeding the ensemble with diverse views of high-dimensional genetic data obtained through bagging and random subspace, the two key features of the RF algorithm.
Affiliation(s)
- Luluah Alhusain
- College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
| | - Alaaeldin M Hafez
- College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
| |
|
122
|
Krause ET, Krüger O, Hoffman JI. The influence of inherited plumage colour morph on morphometric traits and breeding investment in zebra finches (Taeniopygia guttata). PLoS One 2017; 12:e0188582. [PMID: 29190647 PMCID: PMC5708660 DOI: 10.1371/journal.pone.0188582] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/19/2017] [Accepted: 11/09/2017] [Indexed: 12/18/2022] Open
Abstract
Melanin-based plumage polymorphism occurs in many wild bird populations and has been linked to fitness variation in several species. These fitness differences often arise as a consequence of variation in traits such as behaviour, immune responsiveness, body size and reproductive investment. However, few studies have controlled for genetic differences between colour morphs that could potentially generate artefactual associations between plumage colouration and trait variation. Here, we used zebra finches (Taeniopygia guttata) as a model system in order to evaluate whether life-history traits such as adult body condition and reproductive investment could be influenced by plumage morph. To maximise any potential differences, we selected wild-type and white plumage morphs, which differ maximally in their extent of melanisation, while using a controlled three-generation breeding design to homogenise the genetic background. We found that F2 adults with white plumage colouration were on average lighter and had poorer body condition than wild-type F2 birds. However, they appeared to compensate for this by reproducing earlier and producing heavier eggs relative to their own body mass. Our study thus reveals differences in morphological and life history traits that could be relevant to fitness variation, although further studies will be required to evaluate fitness effects under natural conditions as well as to characterise any potential fitness costs of compensatory strategies in white zebra finches.
Affiliation(s)
- E. Tobias Krause
- Department of Animal Behaviour, Bielefeld University, Bielefeld, Germany
- Institute of Animal Welfare and Animal Husbandry, Friedrich-Loeffler-Institut, Celle, Germany
| | - Oliver Krüger
- Department of Animal Behaviour, Bielefeld University, Bielefeld, Germany
| | - Joseph I. Hoffman
- Department of Animal Behaviour, Bielefeld University, Bielefeld, Germany
| |
|
123
|
Hellwege J, Keaton J, Giri A, Gao X, Velez Edwards DR, Edwards TL. Population Stratification in Genetic Association Studies. CURRENT PROTOCOLS IN HUMAN GENETICS 2017; 95:1.22.1-1.22.23. [PMID: 29044472 PMCID: PMC6007879 DOI: 10.1002/cphg.48] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Indexed: 12/20/2022]
Abstract
Population stratification (PS) is a primary consideration in studies of genetic determinants of human traits. Failure to control for PS may lead to confounding, causing a study to fail for lack of significant results, or resources to be wasted following false-positive signals. Here, historical and current approaches for addressing PS when performing genetic association studies in human populations are reviewed. Methods for detecting the presence of PS, including global and local ancestry methods, are described. Also described are approaches for accounting for PS when calculating association statistics, such that measures of association are not confounded. Many traits are being examined for the first time in minority populations, which may inherently feature PS. © 2017 by John Wiley & Sons, Inc.
Affiliation(s)
- Jacklyn Hellwege
- Vanderbilt Genetics Institute, Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37203, USA
| | - Jacob Keaton
- Vanderbilt Genetics Institute, Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37203, USA
| | - Ayush Giri
- Vanderbilt Genetics Institute, Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37203, USA
| | - Xiaoyi Gao
- Department of Ophthalmology and Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | - Digna R. Velez Edwards
- Vanderbilt Genetics Institute, Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN 37203, USA
| | - Todd L. Edwards
- Vanderbilt Genetics Institute, Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37203, USA
| |
|
124
|
Sánchez-Leyva M, Sánchez-Zazueta JG, Osuna-Ramos JF, Rendón-Aguilar H, Félix-Espinoza R, Becerra-Loaiza DS, Sánchez-García DC, Romero-Quintana JG, Castillo Ureta H, Velarde-Rodríguez I, Velarde-Félix JS. Genetic Polymorphisms of Tumor Necrosis Factor Alpha and Susceptibility to Dengue Virus Infection in a Mexican Population. Viral Immunol 2017. [DOI: 10.1089/vim.2017.0029] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022] Open
Affiliation(s)
- Marina Sánchez-Leyva
- Departamento de Ciencias Biomédicas, Facultad de Ciencias Químico Biológicas, Universidad Autónoma de Sinaloa, Culiacán, México
| | - Jorge Guillermo Sánchez-Zazueta
- Cuerpo Académico Inmunogenética y Evolución UAS-CA-265, Unidad Académica Escuela de Biología, Universidad Autónoma de Sinaloa, Culiacán, México
| | - Juan Fidel Osuna-Ramos
- Departamento de Infectómica y Patogénesis Molecular, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV-IPN), Mexico City, México
| | | | | | | | | | - José Geovanni Romero-Quintana
- Departamento de Ciencias Biomédicas, Facultad de Ciencias Químico Biológicas, Universidad Autónoma de Sinaloa, Culiacán, México
| | | | | | - Jesús Salvador Velarde-Félix
- Departamento de Ciencias Biomédicas, Facultad de Ciencias Químico Biológicas, Universidad Autónoma de Sinaloa, Culiacán, México
- Cuerpo Académico Inmunogenética y Evolución UAS-CA-265, Unidad Académica Escuela de Biología, Universidad Autónoma de Sinaloa, Culiacán, México
- Hospital General de Culiacán, “Bernardo J Gastélum,” Culiacán, México
| |
|
125
|
Ramstetter MD, Dyer TD, Lehman DM, Curran JE, Duggirala R, Blangero J, Mezey JG, Williams AL. Benchmarking Relatedness Inference Methods with Genome-Wide Data from Thousands of Relatives. Genetics 2017; 207:75-82. [PMID: 28739658 PMCID: PMC5586387 DOI: 10.1534/genetics.117.1122] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Received: 02/04/2017] [Accepted: 07/08/2017] [Indexed: 01/03/2023] Open
Abstract
Inferring relatedness from genomic data is an essential component of genetic association studies, population genetics, forensics, and genealogy. While numerous methods exist for inferring relatedness, thorough evaluation of these approaches in real data has been lacking. Here, we report an assessment of 12 state-of-the-art pairwise relatedness inference methods using a data set with 2485 individuals contained in several large pedigrees that span up to six generations. We find that all methods have high accuracy (92-99%) when detecting first- and second-degree relationships, but their accuracy dwindles to <43% for seventh-degree relationships. However, most identical by descent (IBD) segment-based methods inferred seventh-degree relatives correct to within one relatedness degree for >76% of relative pairs. Overall, the most accurate methods are Estimation of Recent Shared Ancestry (ERSA) and approaches that compute total IBD sharing using the output from GERMLINE and Refined IBD to infer relatedness. Combining information from the most accurate methods provides little accuracy improvement, indicating that novel approaches, such as new methods that leverage relatedness signals from multiple samples, are needed to achieve a sizeable jump in performance.
Affiliation(s)
- Monica D Ramstetter
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853
| | - Thomas D Dyer
- South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, Texas 78520
| | - Donna M Lehman
- Department of Medicine, University of Texas Health San Antonio, San Antonio, Texas 78229
| | - Joanne E Curran
- South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, Texas 78520
| | - Ravindranath Duggirala
- South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, Texas 78520
| | - John Blangero
- South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, Texas 78520
| | - Jason G Mezey
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853
- Department of Genetic Medicine, Weill Cornell Medicine, New York, New York 10065
| | - Amy L Williams
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853
| |
|
126
|
Peterson RE, Edwards AC, Bacanu SA, Dick DM, Kendler KS, Webb BT. The utility of empirically assigning ancestry groups in cross-population genetic studies of addiction. Am J Addict 2017; 26:494-501. [PMID: 28714599 DOI: 10.1111/ajad.12586] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 01/05/2017] [Revised: 06/08/2017] [Accepted: 06/25/2017] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND AND OBJECTIVES Given moderate heritability and significant heterogeneity among addiction phenotypes, successful genome-wide association studies (GWAS) are expected to need very large samples. As sample sizes grow, so can genetic diversity, leading to challenges in analyzing these data. Methods for empirically assigning individuals to genetically informed ancestry groups are needed. METHODS We describe a strategy for empirically assigning ancestry groups in ethnically diverse GWAS data including extensions of principal component analysis (PCA) and population matching through minimum Mahalanobis distance. We apply these methods to data from Spit for Science (S4S): the University Student Survey, a study following college students longitudinally that includes genetic and environmental data on substance use and mental health (n = 7,603). RESULTS The genetic-based population assignments for S4S were 48.7% European, 22.5% African, 10.4% Americas, 9.2% East Asian, and 9.2% South Asian descent. Self-reported census categories "More than one race" and "Unknown" as well as "Hawaiian/Pacific Islander" and "American-Indian/Native Alaskan" were empirically assigned, representing a +9% sample retention over conventional methods. Although there was high concordance between self-reported race and empirical population-match (+.924), there was reduction in variance for most ancestry PCs for genetic-based population assignments. CONCLUSIONS We were able to create more genetically homogenous groups and reduce sample and marker loss through cross-ancestry meta-analysis, potentially increasing power to detect etiologically relevant variation. Our approach provides a framework for empirically assigning genetic ancestry groups which can be applied to other ethnically diverse genetic studies.
SCIENTIFIC SIGNIFICANCE Given the important public health impact and demonstrable gains in statistical power from studying diverse populations, empirically sound practices for genetic studies are needed. (Am J Addict 2017;26:494-501).
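The population-matching step named in the METHODS above (minimum Mahalanobis distance in principal-component space) can be illustrated with a minimal sketch. It assumes only two ancestry PCs per individual and a small labelled reference panel; the function names (`group_stats`, `mahalanobis_sq`, `assign_group`) and the toy coordinates are hypothetical and are not taken from the cited study, which does not state its implementation in the abstract.

```python
from statistics import mean


def group_stats(points):
    """Return the centroid and 2x2 sample covariance of (PC1, PC2) pairs."""
    n = len(points)
    cx = mean(p[0] for p in points)
    cy = mean(p[1] for p in points)
    sxx = sum((p[0] - cx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - cy) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - cx) * (p[1] - cy) for p in points) / (n - 1)
    return (cx, cy), (sxx, sxy, syy)


def mahalanobis_sq(sample, centroid, cov):
    """Squared Mahalanobis distance from a sample to a group centroid."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx = sample[0] - centroid[0]
    dy = sample[1] - centroid[1]
    # Quadratic form with the inverse of [[sxx, sxy], [sxy, syy]].
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def assign_group(sample, reference):
    """Pick the reference group with the minimum Mahalanobis distance."""
    stats = {name: group_stats(pts) for name, pts in reference.items()}
    return min(stats, key=lambda name: mahalanobis_sq(sample, *stats[name]))


# Hypothetical reference panel: group label -> (PC1, PC2) coordinates.
reference = {
    "A": [(0, 0), (1, 0), (0, 1), (1, 1)],
    "B": [(10, 10), (11, 10), (10, 11), (11, 11)],
}
label = assign_group((0.4, 0.6), reference)
```

Unlike plain Euclidean distance, the Mahalanobis distance scales each direction by the spread of the reference group, so a tight cluster and a diffuse cluster are compared on equal terms.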
Affiliation(s)
- Roseann E Peterson
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, Virginia
| | - Alexis C Edwards
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, Virginia
| | - Silviu-Alin Bacanu
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, Virginia
| | - Danielle M Dick
- Departments of Psychology, African American Studies, and Human and Molecular Genetics, Virginia Commonwealth University, Richmond, Virginia.,College Behavioral and Emotional Health Institute, Virginia Commonwealth University, Richmond, Virginia
| | - Kenneth S Kendler
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, Virginia
| | - Bradley T Webb
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, Virginia
| |
|
127
|
Biscarini F, Nazzicari N, Bink M, Arús P, Aranzana MJ, Verde I, Micali S, Pascal T, Quilot-Turion B, Lambert P, da Silva Linge C, Pacheco I, Bassi D, Stella A, Rossini L. Genome-enabled predictions for fruit weight and quality from repeated records in European peach progenies. BMC Genomics 2017; 18:432. [PMID: 28583089 PMCID: PMC5460546 DOI: 10.1186/s12864-017-3781-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 01/27/2017] [Accepted: 05/10/2017] [Indexed: 11/16/2022] Open
Abstract
Background Highly polygenic traits such as fruit weight, sugar content and acidity strongly influence the agroeconomic value of peach varieties. Genomic Selection (GS) can accelerate peach yield and quality gain if predictions show higher levels of accuracy compared to phenotypic selection. The available IPSC 9K SNP array V1 allows standardized and highly reliable genotyping, preparing the ground for GS in peach. Results A repeatability model (multiple records per individual plant) for genome-enabled predictions in eleven European peach populations is presented. The analysis included 1147 individuals derived from both commercial and non-commercial peach or peach-related accessions. Considered traits were average fruit weight (FW), sugar content (SC) and titratable acidity (TA). Plants were genotyped with the 9K IPSC array, grown in three countries (France, Italy, Spain) and phenotyped for 3–5 years. An analysis of imputation accuracy of missing genotypic data was conducted using the software Beagle, showing that two of the eleven populations were highly sensitive to increasing levels of missing data. The regression model produced, for each trait and each population, estimates of heritability (FW:0.35, SC:0.48, TA:0.53, on average) and repeatability (FW:0.56, SC:0.63, TA:0.62, on average). Predictive ability was estimated in a five-fold cross validation scheme within population as the correlation of true and predicted phenotypes. Results differed by populations and traits, but predictive abilities were in general high (FW:0.60, SC:0.72, TA:0.65, on average). Conclusions This study assessed the feasibility of Genomic Selection in peach for highly polygenic traits linked to yield and fruit quality. The accuracy of imputing missing genotypes was as high as 96%, and the genomic predictive ability was on average 0.65, but could be as high as 0.84 for fruit weight or 0.83 for titratable acidity. 
The estimated repeatability may prove very useful in the management of the typical long cycles involved in peach productions. All together, these results are very promising for the application of genomic selection to peach breeding programmes. Electronic supplementary material The online version of this article (doi:10.1186/s12864-017-3781-8) contains supplementary material, which is available to authorized users.
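For orientation, the heritability and repeatability figures quoted above are usually defined from variance components as follows. This is the textbook form of a repeatability model, given here as an assumption; the abstract does not spell out the exact terms the authors fitted. The symbols $\sigma^2_a$ (additive genetic variance), $\sigma^2_{pe}$ (permanent environmental variance) and $\sigma^2_e$ (residual variance) are notation introduced for this note.

```latex
h^2 = \frac{\sigma^2_a}{\sigma^2_a + \sigma^2_{pe} + \sigma^2_e},
\qquad
r = \frac{\sigma^2_a + \sigma^2_{pe}}{\sigma^2_a + \sigma^2_{pe} + \sigma^2_e},
\qquad
r \ge h^2 .
```

Under these definitions repeatability can never fall below heritability, which is consistent with the reported averages (for example FW: 0.35 versus 0.56).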
Affiliation(s)
- Filippo Biscarini
- PTP Science Park, Via Einstein - Loc. Cascina Codazza, Lodi, Italy.,IBBA-CNR, Via Edoardo Bassini, 15, Milan, 20133, Italy
| | - Nelson Nazzicari
- PTP Science Park, Via Einstein - Loc. Cascina Codazza, Lodi, Italy.,Council for Agricultural Research and Economics (CREA) Research Centre for Fodder Crops and Dairy Productions, Lodi, Italy
| | - Marco Bink
- Wageningen UR Biometris, Wageningen, The Netherlands.,Present Address: Hendrix Genetics Research, Technology & Services B.V., P.O. Box 114, Boxmeer NL, 5830AC, The Netherlands
| | - Pere Arús
- IRTA, Centre de Recerca en Agrigenòmica CSIC-IRTA-UAB-UB, Campus UAB, Bellaterra (Cerdanyola del Vallés), Barcelona, Spain
| | - Maria José Aranzana
- IRTA, Centre de Recerca en Agrigenòmica CSIC-IRTA-UAB-UB, Campus UAB, Bellaterra (Cerdanyola del Vallés), Barcelona, Spain
| | - Ignazio Verde
- Consiglio per la ricerca in agricoltura e l'analisi dell'economia agraria (CREA) - Centro di Ricerca per la Frutticoltura (CREA-FRU), Via di Fioranello 52, Roma, Italy
| | - Sabrina Micali
- Consiglio per la ricerca in agricoltura e l'analisi dell'economia agraria (CREA) - Centro di Ricerca per la Frutticoltura (CREA-FRU), Via di Fioranello 52, Roma, Italy
| | | | | | - Patrick Lambert
- Università degli Studi di Milano - DiSAA, Via Celoria 2, Milano, Italy
| | | | - Igor Pacheco
- Università degli Studi di Milano - DiSAA, Via Celoria 2, Milano, Italy.,Institute of Nutrition and Food Technology - INTA, Universidad de Chile, Av El Líbano 5524, Santiago, Chile
| | - Daniele Bassi
- Università degli Studi di Milano - DiSAA, Via Celoria 2, Milano, Italy
| | - Alessandra Stella
- PTP Science Park, Via Einstein - Loc. Cascina Codazza, Lodi, Italy.,IBBA-CNR, Via Edoardo Bassini, 15, Milan, 20133, Italy
| | - Laura Rossini
- PTP Science Park, Via Einstein - Loc. Cascina Codazza, Lodi, Italy. .,Università degli Studi di Milano - DiSAA, Via Celoria 2, Milano, Italy.
| |
|
128
|
Rustagi N, Zhou A, Watkins WS, Gedvilaite E, Wang S, Ramesh N, Muzny D, Gibbs RA, Jorde LB, Yu F, Xing J. Extremely low-coverage whole genome sequencing in South Asians captures population genomics information. BMC Genomics 2017; 18:396. [PMID: 28532386 PMCID: PMC5440948 DOI: 10.1186/s12864-017-3767-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 09/14/2016] [Accepted: 05/07/2017] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND The cost of Whole Genome Sequencing (WGS) has decreased tremendously in recent years due to advances in next-generation sequencing technologies. Nevertheless, the cost of carrying out large-scale cohort studies using WGS is still daunting. Past simulation studies with coverage at ~2x have shown promise for using low coverage WGS in studies focused on variant discovery, association study replications, and population genomics characterization. However, the performance of low coverage WGS in populations with a complex history and no reference panel remains to be determined. RESULTS South Indian populations are known to have a complex population structure and are an example of a major population group that lacks adequate reference panels. To test the performance of extremely low-coverage WGS (EXL-WGS) in populations with a complex history and to provide a reference resource for South Indian populations, we performed EXL-WGS on 185 South Indian individuals from eight populations to ~1.6x coverage. Using two variant discovery pipelines, SNPTools and GATK, we generated a consensus call set that has ~90% sensitivity for identifying common variants (minor allele frequency ≥ 10%). Imputation further improves the sensitivity of our call set. In addition, we obtained high-coverage for the whole mitochondrial genome to infer the maternal lineage evolutionary history of the Indian samples. CONCLUSIONS Overall, we demonstrate that EXL-WGS with imputation can be a valuable study design for variant discovery with a dramatically lower cost than standard WGS, even in populations with a complex history and without available reference data. In addition, the South Indian EXL-WGS data generated in this study will provide a valuable resource for future Indian genomic studies.
Affiliation(s)
- Navin Rustagi
- Department of Molecular and Human Genetics, Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030 USA
| | - Anbo Zhou
- Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
| | - W. Scott Watkins
- Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, UT 84112 USA
| | - Erika Gedvilaite
- Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
| | - Shuoguo Wang
- Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
| | - Naveen Ramesh
- Department of Molecular and Human Genetics, Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030 USA
| | - Donna Muzny
- Department of Molecular and Human Genetics, Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030 USA
| | - Richard A. Gibbs
- Department of Molecular and Human Genetics, Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030 USA
| | - Lynn B. Jorde
- Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, UT 84112 USA
| | - Fuli Yu
- Department of Molecular and Human Genetics, Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030 USA
| | - Jinchuan Xing
- Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
| |
|
129
|
Li X, Jian Y, Xie C, Wu J, Xu Y, Zou C. Fast diffusion of domesticated maize to temperate zones. Sci Rep 2017; 7:2077. [PMID: 28522839 PMCID: PMC5437101 DOI: 10.1038/s41598-017-02125-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 02/03/2016] [Accepted: 04/06/2017] [Indexed: 11/09/2022] Open
Abstract
Adaptation to a temperate climate was a prerequisite for the spread of maize across a broad geographical range. To explicitly explore the demographic process underlying maize adaptation, we used a diffusion-based method to model the differentiation between temperate and tropical populations using the Non-Stiff Stalk group as a proxy for temperate maize. Based on multiple sequential Markovian coalescent approaches, we estimate that tropical and temperate maize diverged approximately 3,000 to 5,000 years ago and the population size shrank after the split. Using composite likelihood approaches, we identified a distinct tropical-temperate divergence event initiated 4,958 years ago (95% confidence interval (CI): 4,877-5,039) from an ancestral population whose effective size was 24,162 (95% CI: 23,914-24,409). We found that continuous gene flow between tropical and temperate maize accompanied the differentiation of temperate maize. Long identical-by-descent tracts shared by tropical and temperate inbred lines have been identified, which might be the result of gene flow between tropical and temperate maize or artificial selection during domestication and crop improvement. Understanding the demographic history of maize diffusion not only provides evidence for population dynamics of maize, but will also assist the identification of regions under selection and the genetic basis of complex traits of agronomic importance.
Affiliation(s)
- Xiaolong Li
- National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Centre of Pear Engineering Technology Research, State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, China
| | - Yinqiao Jian
- National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Chuanxiao Xie
- National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Jun Wu
- Centre of Pear Engineering Technology Research, State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, China
| | - Yunbi Xu
- National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, 100081, China.
- International Maize and Wheat Improvement Center (CIMMYT), El Batán, 56130, Texcoco, Mexico.
| | - Cheng Zou
- National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, 100081, China.
| |
|
130
|
Rethinking the Epigenetic Framework to Unravel the Molecular Pathology of Schizophrenia. Int J Mol Sci 2017; 18:ijms18040790. [PMID: 28387726 PMCID: PMC5412374 DOI: 10.3390/ijms18040790] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 02/25/2017] [Revised: 03/23/2017] [Accepted: 04/04/2017] [Indexed: 12/26/2022] Open
Abstract
Schizophrenia is a complex mental disorder whose causes are still far from being known. Although researchers have focused on genetic or environmental contributions to the disease, we still lack a scientific framework that joins molecular and clinical findings. Epigenetics can explain how environmental variables may affect gene expression without modifying the DNA sequence. In fact, neuroepigenomics represents an effort to unify the research available on the molecular pathology of mental diseases, which has been carried out through several approaches ranging from interrogating single DNA methylation events and hydroxymethylation patterns, to epigenome-wide association studies, as well as studying post-translational modifications of histones, or nucleosomal positioning. The strong tissue dependence of epigenetic marks compels scientists to refine their sampling procedures, and in this review, we will focus on findings obtained from brain tissue. Despite our efforts, we still need to refine our hypothesis generation process to obtain real knowledge from a neuroepigenomic framework and to avoid adding more noise to this innovative point of view; this may help us to definitively unravel the molecular pathology of severe mental illnesses, such as schizophrenia.
|
131
|
McGirr JA, Martin CH. Novel Candidate Genes Underlying Extreme Trophic Specialization in Caribbean Pupfishes. Mol Biol Evol 2017; 34:873-888. [PMID: 28028132 PMCID: PMC5850223 DOI: 10.1093/molbev/msw286] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 12/11/2022] Open
Abstract
The genetic changes responsible for evolutionary transitions from generalist to specialist phenotypes are poorly understood. Here we examine the genetic basis of craniofacial traits enabling novel trophic specialization in a sympatric radiation of Cyprinodon pupfishes endemic to San Salvador Island, Bahamas. This recent radiation consists of a generalist species and two novel specialists: a small-jawed "snail-eater" and a large-jawed "scale-eater." We genotyped 12 million single nucleotide polymorphisms (SNPs) by whole-genome resequencing of 37 individuals of all three species from nine populations and integrated genome-wide divergence scans with association mapping to identify divergent regions containing putatively causal SNPs affecting jaw size, the most rapidly diversifying trait in this radiation. A mere 22 fixed variants accompanied extreme ecological divergence between generalist and scale-eater species. We identified 31 regions (20 kb) containing variants fixed between specialists that were significantly associated with variation in jaw size which contained 11 genes annotated for skeletal system effects and 18 novel candidate genes never previously associated with craniofacial phenotypes. Six of these 31 regions showed robust signs of hard selective sweeps after accounting for demographic history. Our data are consistent with predictions based on quantitative genetic models of adaptation, suggesting that the effect sizes of regions influencing jaw phenotypes are positively correlated with distance between fitness peaks on a complex adaptive landscape.
Affiliation(s)
- Joseph A. McGirr
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC
|
132
|
Yang RC. Genome-wide estimation of heritability and its functional components for flowering, defense, ionomics, and developmental traits in a geographically diverse population of Arabidopsis thaliana. Genome 2017; 60:572-580. [PMID: 28314113 DOI: 10.1139/gen-2016-0213] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/22/2022]
Abstract
Narrow-sense heritability (portion of the total phenotypic variation attributable to additive genetic effect, h²) is a critical parameter in plant breeding and genetics, but its estimation is difficult for populations with unknown pedigree information. This study applied a marker-based linear mixed model (LMM) analysis to estimate narrow-sense heritability and its seven functional components corresponding to SNPs in coding and noncoding regions for each of 107 flowering, defense, ionomics, and developmental traits in an Arabidopsis (Arabidopsis thaliana) population of 199 inbred lines with unknown genetic relatedness. Genetic relationship matrix (GRM) based on 214 051 SNPs and component GRMs based on seven subsets of SNPs were computed for LMM estimation of h² and functional components contributing to h², respectively. The h² estimates for flowering traits were higher than those for defense, ionomics, and developmental traits, supporting a general view that the fitness-related traits have lower heritabilities than other traits. The functional component owing to SNPs in coding (exon) regions contributed the least to h². Our LMM analysis provides an opportunity to gain a comprehensive view on heritability and its functional components for populations with unknown structure but with genome-wide DNA markers.
Affiliation(s)
- Rong-Cai Yang
- Alberta Agriculture and Forestry, #307, 7000-113 Street, Edmonton, AB T6H 5T6, Canada; Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB T6G 2P5, Canada
|
133
|
Kushima I, Aleksic B, Nakatochi M, Shimamura T, Shiino T, Yoshimi A, Kimura H, Takasaki Y, Wang C, Xing J, Ishizuka K, Oya-Ito T, Nakamura Y, Arioka Y, Maeda T, Yamamoto M, Yoshida M, Noma H, Hamada S, Morikawa M, Uno Y, Okada T, Iidaka T, Iritani S, Yamamoto T, Miyashita M, Kobori A, Arai M, Itokawa M, Cheng MC, Chuang YA, Chen CH, Suzuki M, Takahashi T, Hashimoto R, Yamamori H, Yasuda Y, Watanabe Y, Nunokawa A, Someya T, Ikeda M, Toyota T, Yoshikawa T, Numata S, Ohmori T, Kunimoto S, Mori D, Iwata N, Ozaki N. High-resolution copy number variation analysis of schizophrenia in Japan. Mol Psychiatry 2017; 22:430-440. [PMID: 27240532 DOI: 10.1038/mp.2016.88] [Citation(s) in RCA: 100] [Impact Index Per Article: 14.3] [Received: 09/20/2015] [Revised: 04/18/2016] [Accepted: 04/20/2016] [Indexed: 12/30/2022]
Abstract
Recent schizophrenia (SCZ) studies have reported an increased burden of de novo copy number variants (CNVs) and identified specific high-risk CNVs, although with variable phenotype expressivity. However, the pathogenesis of SCZ has not been fully elucidated. Using array comparative genomic hybridization, we performed a high-resolution genome-wide CNV analysis on a mainly (92%) Japanese population (1699 SCZ cases and 824 controls) and identified 7066 rare CNVs, 70.0% of which were small (<100 kb). Clinically significant CNVs were significantly more frequent in cases than in controls (odds ratio=3.04, P=9.3 × 10⁻⁹, 9.0% of cases). We confirmed a significant association of X-chromosome aneuploidies with SCZ and identified 11 de novo CNVs (e.g., MBD5 deletion) in cases. In patients with clinically significant CNVs, 41.7% had a history of congenital/developmental phenotypes, and the rate of treatment resistance was significantly higher (odds ratio=2.79, P=0.0036). We found more severe clinical manifestations in patients with two clinically significant CNVs. Gene set analysis replicated previous findings (e.g., synapse, calcium signaling) and identified novel biological pathways including oxidative stress response, genomic integrity, kinase and small GTPase signaling. Furthermore, involvement of multiple SCZ candidate genes and biological pathways in the pathogenesis of SCZ was suggested in established SCZ-associated CNV loci. Our study shows the high genetic heterogeneity of SCZ and its clinical features and raises the possibility that genomic instability is involved in its pathogenesis, which may be related to the increased burden of de novo CNVs and variable expressivity of CNVs.
Affiliation(s)
- I Kushima
- Institute for Advanced Research, Nagoya University, Nagoya, Japan; Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- B Aleksic
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- M Nakatochi
- Bioinformatics Section, Center for Advanced Medicine and Clinical Research, Nagoya University Hospital, Nagoya, Japan
- T Shimamura
- Division of Systems Biology, Nagoya University Graduate School of Medicine, Nagoya, Japan
- T Shiino
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- A Yoshimi
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- H Kimura
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Y Takasaki
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- C Wang
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- J Xing
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- K Ishizuka
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- T Oya-Ito
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Y Nakamura
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Y Arioka
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan; Center for Advanced Medicine and Clinical Research, Nagoya University Hospital, Nagoya, Japan
- T Maeda
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- M Yamamoto
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- M Yoshida
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- H Noma
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- S Hamada
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- M Morikawa
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Y Uno
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- T Okada
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- T Iidaka
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- S Iritani
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- T Yamamoto
- Department of Legal Medicine and Bioethics, Nagoya University Graduate School of Medicine, Nagoya, Japan
- M Miyashita
- Department of Psychiatry and Behavioral Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- A Kobori
- Department of Psychiatry and Behavioral Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- M Arai
- Department of Psychiatry and Behavioral Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- M Itokawa
- Center for Medical Cooperation, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- M-C Cheng
- Department of Psychiatry, Yuli Mental Health Research Center, Yuli Branch, Taipei Veterans General Hospital, Hualien, Taiwan
- Y-A Chuang
- Department of Psychiatry, Yuli Mental Health Research Center, Yuli Branch, Taipei Veterans General Hospital, Hualien, Taiwan
- C-H Chen
- Department of Psychiatry, Chang Gung Memorial Hospital-Linkou, Taoyuan, Taiwan; Department and Graduate Institute of Biomedical Sciences, Chang Gung University, Taoyuan, Taiwan
- M Suzuki
- Department of Neuropsychiatry, University of Toyama Graduate School of Medicine and Pharmaceutical Sciences, Toyama, Japan
- T Takahashi
- Department of Neuropsychiatry, University of Toyama Graduate School of Medicine and Pharmaceutical Sciences, Toyama, Japan
- R Hashimoto
- Molecular Research Center for Children's Mental Development, United Graduate School of Child Development, Osaka University, Suita, Japan; Department of Psychiatry, Osaka University Graduate School of Medicine, Suita, Japan
- H Yamamori
- Department of Psychiatry, Osaka University Graduate School of Medicine, Suita, Japan
- Y Yasuda
- Department of Psychiatry, Osaka University Graduate School of Medicine, Suita, Japan
- Y Watanabe
- Department of Psychiatry, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- A Nunokawa
- Department of Psychiatry, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- T Someya
- Department of Psychiatry, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- M Ikeda
- Department of Psychiatry, Fujita Health University School of Medicine, Toyoake, Japan
- T Toyota
- Laboratory for Molecular Psychiatry, RIKEN Brain Science Institute, Wako, Japan
- T Yoshikawa
- Laboratory for Molecular Psychiatry, RIKEN Brain Science Institute, Wako, Japan
- S Numata
- Department of Psychiatry, Institute of Biomedical Sciences, Tokushima University Graduate School, Tokushima, Japan
- T Ohmori
- Department of Psychiatry, Institute of Biomedical Sciences, Tokushima University Graduate School, Tokushima, Japan
- S Kunimoto
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
- D Mori
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan; Brain and Mind Research Center, Nagoya University, Nagoya, Japan
- N Iwata
- Department of Psychiatry, Fujita Health University School of Medicine, Toyoake, Japan
- N Ozaki
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
|
134
|
Carter H, Marty R, Hofree M, Gross AM, Jensen J, Fisch KM, Wu X, DeBoever C, Van Nostrand EL, Song Y, Wheeler E, Kreisberg JF, Lippman SM, Yeo GW, Gutkind JS, Ideker T. Interaction Landscape of Inherited Polymorphisms with Somatic Events in Cancer. Cancer Discov 2017; 7:410-423. [PMID: 28188128 DOI: 10.1158/2159-8290.cd-16-1045] [Citation(s) in RCA: 101] [Impact Index Per Article: 14.4] [Received: 09/16/2016] [Revised: 02/06/2017] [Accepted: 02/08/2017] [Indexed: 02/06/2023]
Abstract
Recent studies have characterized the extensive somatic alterations that arise during cancer. However, the somatic evolution of a tumor may be significantly affected by inherited polymorphisms carried in the germline. Here, we analyze genomic data for 5,954 tumors to reveal and systematically validate 412 genetic interactions between germline polymorphisms and major somatic events, including tumor formation in specific tissues and alteration of specific cancer genes. Among germline-somatic interactions, we found germline variants in RBFOX1 that increased incidence of SF3B1 somatic mutation by 8-fold via functional alterations in RNA splicing. Similarly, 19p13.3 variants were associated with a 4-fold increased likelihood of somatic mutations in PTEN. In support of this association, we found that PTEN knockdown sensitizes the MTOR pathway to high expression of the 19p13.3 gene GNA11. Finally, we observed that stratifying patients by germline polymorphisms exposed distinct somatic mutation landscapes, implicating new cancer genes. This study creates a validated resource of inherited variants that govern where and how cancer develops, opening avenues for prevention research. Significance: This study systematically identifies germline variants that directly affect tumor evolution, either by dramatically increasing alteration frequency of specific cancer genes or by influencing the site where a tumor develops. Cancer Discovery; 7(4); 410-23. ©2017 AACR. See related commentary by Geeleher and Huang, p. 354. This article is highlighted in the In This Issue feature, p. 339.
Affiliation(s)
- Hannah Carter
- Department of Medicine, Division of Medical Genetics, University of California, San Diego, La Jolla, California; Moores Cancer Center, University of California, San Diego, La Jolla, California; Cancer Cell Map Initiative (CCMI), La Jolla and San Francisco, California; Institute for Genomic Medicine, University of California, San Diego, La Jolla, California
- Rachel Marty
- Bioinformatics Program, University of California, San Diego, La Jolla, California
- Matan Hofree
- Department of Computer Science, University of California, San Diego, La Jolla, California
- Andrew M Gross
- Bioinformatics Program, University of California, San Diego, La Jolla, California
- James Jensen
- Bioinformatics Program, University of California, San Diego, La Jolla, California
- Kathleen M Fisch
- Department of Medicine, Division of Medical Genetics, University of California, San Diego, La Jolla, California; Moores Cancer Center, University of California, San Diego, La Jolla, California; Cancer Cell Map Initiative (CCMI), La Jolla and San Francisco, California; Department of Medicine, Center for Computational Biology and Bioinformatics, University of California, San Diego, La Jolla, California
- Xingyu Wu
- Moores Cancer Center, University of California, San Diego, La Jolla, California
- Christopher DeBoever
- Bioinformatics Program, University of California, San Diego, La Jolla, California
- Eric L Van Nostrand
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, California; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California
- Yan Song
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, California; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California
- Emily Wheeler
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, California; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California
- Jason F Kreisberg
- Department of Medicine, Division of Medical Genetics, University of California, San Diego, La Jolla, California; Cancer Cell Map Initiative (CCMI), La Jolla and San Francisco, California
- Scott M Lippman
- Moores Cancer Center, University of California, San Diego, La Jolla, California
- Gene W Yeo
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, California; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California
- J Silvio Gutkind
- Moores Cancer Center, University of California, San Diego, La Jolla, California; Cancer Cell Map Initiative (CCMI), La Jolla and San Francisco, California
- Trey Ideker
- Department of Medicine, Division of Medical Genetics, University of California, San Diego, La Jolla, California; Moores Cancer Center, University of California, San Diego, La Jolla, California; Cancer Cell Map Initiative (CCMI), La Jolla and San Francisco, California; Institute for Genomic Medicine, University of California, San Diego, La Jolla, California; Bioinformatics Program, University of California, San Diego, La Jolla, California; Department of Computer Science, University of California, San Diego, La Jolla, California
|
135
|
N’Diaye A, Haile JK, Cory AT, Clarke FR, Clarke JM, Knox RE, Pozniak CJ. Single Marker and Haplotype-Based Association Analysis of Semolina and Pasta Colour in Elite Durum Wheat Breeding Lines Using a High-Density Consensus Map. PLoS One 2017; 12:e0170941. [PMID: 28135299 PMCID: PMC5279799 DOI: 10.1371/journal.pone.0170941] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 11/03/2016] [Accepted: 01/12/2017] [Indexed: 12/30/2022] Open
Abstract
Association mapping is usually performed by testing the correlation between a single marker and phenotypes. However, because patterns of variation within genomes are inherited as blocks, clustering markers into haplotypes for genome-wide scans could be a worthwhile approach to improve statistical power to detect associations. The availability of high-density molecular data allows the possibility to assess the potential of both approaches to identify marker-trait associations in durum wheat. In the present study, we used single marker- and haplotype-based approaches to identify loci associated with semolina and pasta colour in durum wheat, the main objective being to evaluate the potential benefits of haplotype-based analysis for identifying quantitative trait loci. One hundred sixty-nine durum lines were genotyped using the Illumina 90K Infinium iSelect assay, and 12,234 polymorphic single nucleotide polymorphism (SNP) markers were generated and used to assess the population structure and the linkage disequilibrium (LD) patterns. A total of 8,581 SNPs previously localized to a high-density consensus map were clustered into 406 haplotype blocks based on the average LD distance of 5.3 cM. Combining multiple SNPs into haplotype blocks increased the average polymorphism information content (PIC) from 0.27 per SNP to 0.50 per haplotype. The haplotype-based analysis identified 12 loci associated with grain pigment colour traits, including the five loci identified by the single marker-based analysis. Furthermore, the haplotype-based analysis resulted in an increase of the phenotypic variance explained (50.4% on average) and the allelic effect (33.7% on average) when compared to single marker analysis. The presence of multiple allelic combinations within each haplotype locus offers potential for screening the most favorable haplotype series and may facilitate marker-assisted selection of grain pigment colour in durum wheat. 
These results suggest a benefit of haplotype-based analysis over single marker analysis to detect loci associated with colour traits in durum wheat.
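As a rough illustration of why grouping SNPs into haplotype blocks can raise PIC, the sketch below computes PIC with the widely used formula of Botstein et al. (1980), PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j². This is an assumption: the abstract does not say which PIC definition the authors used, and the function name `pic` and the example frequencies are hypothetical, not taken from the study.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content (Botstein et al. 1980 formula).

    `freqs` holds the allele (or haplotype) frequencies at one locus;
    they are expected to sum to 1. Hypothetical helper for illustration.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    # Expected homozygosity: sum of squared frequencies.
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# A biallelic SNP with equal allele frequencies reaches the biallelic maximum.
print(pic([0.5, 0.5]))   # 0.375
# A block with four equally frequent haplotypes (made-up frequencies).
print(pic([0.25] * 4))   # 0.703125
```

Under this formula a biallelic marker cannot exceed 0.375, so a per-haplotype average of 0.50 is only reachable with multi-allelic loci such as haplotype blocks.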
Affiliation(s)
- Amidou N’Diaye
- Department of Plant Sciences and Crop Development Centre, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Jemanesh K. Haile
- Department of Plant Sciences and Crop Development Centre, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Aron T. Cory
- Department of Plant Sciences and Crop Development Centre, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Fran R. Clarke
- Semiarid Prairie Agricultural Research Centre, Agriculture and Agri-Food Canada, Swift Current, Saskatchewan, Canada
- John M. Clarke
- Department of Plant Sciences and Crop Development Centre, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Ron E. Knox
- Semiarid Prairie Agricultural Research Centre, Agriculture and Agri-Food Canada, Swift Current, Saskatchewan, Canada
- Curtis J. Pozniak
- Department of Plant Sciences and Crop Development Centre, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
|
136
|
Post-mortem whole-exome analysis in a large sudden infant death syndrome cohort with a focus on cardiovascular and metabolic genetic diseases. Eur J Hum Genet 2017; 25:404-409. [PMID: 28074886 DOI: 10.1038/ejhg.2016.199] [Citation(s) in RCA: 80] [Impact Index Per Article: 11.4] [Received: 07/06/2016] [Revised: 11/18/2016] [Accepted: 12/14/2016] [Indexed: 12/23/2022] Open
Abstract
Sudden infant death syndrome (SIDS) is described as the sudden and unexplained death of an apparently healthy infant younger than one year of age. Genetic studies indicate that up to 35% of SIDS cases might be explained by familial or genetic diseases such as cardiomyopathies, ion channelopathies or metabolic disorders that remained undetected during conventional forensic autopsy procedures. Post-mortem genetic testing by using massive parallel sequencing (MPS) approaches represents an efficient and rapid tool to further investigate unexplained death cases and might help to elucidate pathogenic genetic variants and mechanisms in cases without a conclusive cause of death. In this study, we performed whole-exome sequencing (WES) in 161 European SIDS infants with focus on 192 genes associated with cardiovascular and metabolic diseases. Potentially causative variants were detected in 20% of the SIDS cases. The majority of infants had variants with likely functional effects in genes associated with channelopathies (9%), followed by cardiomyopathies (7%) and metabolic diseases (1%). Although lethal arrhythmia represents the most plausible and likely cause of death, the majority of SIDS cases still remains elusive and might be explained by a multifactorial etiology, triggered by a combination of different genetic and environmental risk factors. As WES is not substantially more expensive than a targeted sequencing approach, it represents an unbiased screening of the exome, which could help to investigate different pathogenic mechanisms within the genetically heterogeneous SIDS cohort. Additionally, re-analysis of the datasets provides the basis to identify new candidate genes in sudden infant death.
|
137
|
Faruque MU, Chen G, Doumatey AP, Zhou J, Huang H, Shriner D, Adeyemo AA, Rotimi CN, Dunston GM. Transferability of genome-wide associated loci for asthma in African Americans. J Asthma 2017; 54:1-8. [PMID: 27177148 PMCID: PMC5300042 DOI: 10.1080/02770903.2016.1188941] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/13/2015] [Revised: 05/05/2016] [Accepted: 05/08/2016] [Indexed: 01/11/2023]
Abstract
OBJECTIVE: Transferability of significantly associated loci or GWAS "hits" adds credibility to genotype-disease associations and provides evidence for generalizability across different ancestral populations. We sought evidence of association of known asthma-associated single nucleotide polymorphisms (SNPs) in an African American population. METHODS: Subjects comprised 661 participants (261 asthma cases and 400 controls) from the Howard University Family Study. Forty-eight SNPs previously reported to be associated with asthma by GWAS were selected for testing. We used a combined strategy, first applying an "exact" approach in which we looked up only the reported index SNP. For those index SNPs missing from our dataset, we used a "local" approach that examined all the regional SNPs in LD with the index SNP. RESULTS: Out of the 48 SNPs, our cohort had genotype data available for 27, which were examined for exact replication. Of these, two SNPs were found positively associated with asthma. These included: rs10508372 (OR = 1.567 [95%CI, 1.133-2.167], P = 0.0066) and rs2378383 (OR = 2.147 [95%CI, 1.149-4.013], P = 0.0166), located on chromosomal bands 10p14 and 9q21.31, respectively. Local replication of the remaining 21 loci showed association at two chromosomal loci (9p24.1-rs2381413 and 6p21.32-rs3132947; Bonferroni-corrected P values: 0.0033 and 0.0197, respectively). Of note, multiple SNPs in LD with rs2381413 located upstream of IL33 were significantly associated with asthma. CONCLUSIONS: This study has successfully transferred four reported asthma-associated loci in an independent African American population. Identification of several asthma-associated SNPs upstream of IL33, a gene previously implicated in allergic inflammation of the asthmatic airway, supports the generalizability of this finding.
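For readers unfamiliar with the statistics quoted in this abstract, the following minimal sketch shows how an odds ratio with a Wald 95% confidence interval and a Bonferroni-corrected P value can be computed from a 2×2 table. It is illustrative only: the abstract does not describe the authors' model (which may well have been a covariate-adjusted regression), and the function names and counts below are hypothetical rather than data from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.959964):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a, b: exposed cases / exposed controls
    c, d: unexposed cases / unexposed controls
    z: normal quantile (1.959964 gives a 95% interval)
    All four counts must be positive. Hypothetical helper.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Made-up counts, not taken from the study.
or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
print(bonferroni(0.001, 21))
```

An interval that excludes 1, as in both exact replications reported above, corresponds to nominal significance at the 5% level under this approximation.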
Affiliation(s)
- Mezbah U. Faruque
- National Human Genome Center, Howard University College of Medicine, Washington, DC, USA
- Guanjie Chen
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Ayo P. Doumatey
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Jie Zhou
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Hanxia Huang
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Daniel Shriner
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Adebowale A. Adeyemo
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Charles N. Rotimi
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Georgia M. Dunston
- National Human Genome Center, Howard University College of Medicine, Washington, DC, USA
|
138
|
Gottscho AD, Wood DA, Vandergast AG, Lemos-Espinal J, Gatesy J, Reeder TW. Lineage diversification of fringe-toed lizards (Phrynosomatidae: Uma notata complex) in the Colorado Desert: Delimiting species in the presence of gene flow. Mol Phylogenet Evol 2017; 106:103-117. [DOI: 10.1016/j.ympev.2016.09.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 03/11/2016] [Revised: 07/26/2016] [Accepted: 09/12/2016] [Indexed: 01/08/2023]
|
139
|
Abstract
This chapter describes the main issues that genetic epidemiologists usually consider in the design of linkage and association studies. For linkage, we briefly consider the situation of rare highly penetrant alleles showing a disease pattern consistent with Mendelian inheritance investigated through parametric methods in large pedigrees, or with autozygosity mapping in inbred families, and we then turn our focus to the most common design, the affected sibling pair design that is of more relevance for common, complex diseases. Power and sample size calculations are provided as a function of the strength of the genetic effect being investigated. We also discuss the impact of other determinants of statistical power such as disease heterogeneity, pedigree and genotyping errors and the effect of the type and density of genetic markers. For association studies, we consider the popular case-control design for dichotomous phenotypes and we provide power and sample size calculations for one-stage and multistage designs. For candidate genes, guidelines are given on the prioritization of genetic variants, and for genome-wide association studies (GWAS) the issue of choosing an appropriate SNP array is discussed. A warning is issued regarding the danger of designing an underpowered replication study following an initial GWAS. The risk of finding spurious association due to population stratification, cryptic relatedness, and differential bias is underlined.
Affiliation(s)
- Jérémie Nsengimana
- Section of Epidemiology and Biostatistics, Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK
- D Timothy Bishop
- Section of Epidemiology and Biostatistics, Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK
|
140
|
Abstract
In genetic association studies, it is necessary to correct for population structure to avoid inference bias. During the past decade, prevailing corrections often only involved adjustments of global ancestry differences between sampled individuals. Nevertheless, population structure may vary across local genomic regions due to the variability of local ancestries associated with natural selection, migration, or random genetic drift. Adjusting for global ancestry alone may be inadequate when local population structure is an important confounding factor. In contrast, adjusting for local ancestry can more effectively prevent false positives due to local population structure. To more accurately locate disease genes, we recommend adjusting for local ancestries by interrogating local structure. In practice, locus-specific ancestries are usually unknown and must be inferred. For recently admixed populations with known reference ancestral populations, locus-specific ancestries can be inferred accurately using some hidden Markov model-based methods. However, SNP-wise ancestries cannot be accurately inferred when ancestral population information is not available. For such scenarios, we propose employing local principal components (PCs) to present local ancestries and adjusting for local PCs when testing for gene-phenotype association.
Affiliation(s)
- Huaizhen Qin
- Department of Global Biostatistics and Data Science, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, 70112, USA; Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
- Xiaofeng Zhu
- Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
|
141
|
Reilly JP, Meyer NJ, Christie JD. Genetics in the Prevention and Treatment of Sepsis. SEPSIS 2017. [DOI: 10.1007/978-3-319-48470-9_15] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/12/2022]
|
142
|
Sun K, Ye Y, Luo T, Hou Y. Multi-InDel Analysis for Ancestry Inference of Sub-Populations in China. Sci Rep 2016; 6:39797. [PMID: 28004788 PMCID: PMC5177877 DOI: 10.1038/srep39797] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 04/22/2016] [Accepted: 11/29/2016] [Indexed: 01/03/2023] Open
Abstract
Ancestry inference is of great interest in diverse areas of scientific research, including forensic biology, medical genetics and anthropology. Various methods have been published for distinguishing populations. However, few reports address sub-populations (such as ethnic groups) within Asian populations because of the limited number of markers. We treated several InDel loci in tight physical proximity as a single marker, termed a multi-InDel. The multi-InDel shows potential as an Ancestry Inference Marker (AIM). In this study, we performed a genome-wide scan for multi-InDels as AIMs. After examining the FST distributions in the 1000 Genomes Database, 12 candidates were selected and validated for eastern Asian populations. A multiplexed assay was developed as a panel to genotype 12 multi-InDel markers simultaneously. Ancestry component analysis with STRUCTURE and principal component analysis (PCA) were employed to estimate its capability for ancestry inference. Furthermore, ancestry assignments of trial individuals were conducted. The panel proved very effective when 210 samples from Han and Tibetan individuals in China were tested. The panel of multi-InDel markers showed considerable power for ancestry inference and is suggested for use in forensic practice and population genetic studies.
Affiliation(s)
- Kuan Sun
- Institute of Forensic Medicine, West China School of Basic Science and Forensic Medicine, Sichuan University, Chengdu, P.R. China
| | - Yi Ye
- Institute of Forensic Medicine, West China School of Basic Science and Forensic Medicine, Sichuan University, Chengdu, P.R. China
| | - Tao Luo
- Laboratory of Infection and Immunity, School of Basic Medical Sciences, West China Center of Medical Science, Sichuan University, Chengdu P.R. China
| | - Yiping Hou
- Institute of Forensic Medicine, West China School of Basic Science and Forensic Medicine, Sichuan University, Chengdu, P.R. China
| |
|
143
|
Wang KS, Liu X, Xie C, Liu Y, Xu C. Non-parametric Survival Analysis of EPG5 Gene with Age at Onset of Alzheimer's Disease. J Mol Neurosci 2016; 60:436-444. [PMID: 27586004 DOI: 10.1007/s12031-016-0821-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/03/2016] [Accepted: 08/17/2016] [Indexed: 01/17/2023]
Abstract
Non-parametric methods such as the Wilcoxon test have the advantage of making no assumptions about the underlying survival distributions. Alzheimer's disease (AD) is a chronic neurodegenerative disease, while the ectopic P-granules autophagy protein 5 homolog (EPG5 gene) is highly expressed in the human brain and may be implicated in the pathogenesis of neurodegenerative disorders. The present study explored the associations of 26 single-nucleotide polymorphisms (SNPs) in the EPG5 gene with the age at onset (AAO) of AD using a family-based association test (FBAT)-Wilcoxon statistic in a family-based study. A replication study using a case-control sample was then conducted, applying the Wilcoxon test in a Kaplan-Meier survival analysis of AAO. The results from FBAT-generalized estimating equations (FBAT-GEE) statistics and the FBAT-Wilcoxon test showed that seven SNPs (top SNP rs495078 with p = 1.29 × 10⁻³) were significantly associated with the risk of AD, and eight SNPs (top SNP rs11082498 with p = 3.55 × 10⁻⁴) were associated with the AAO of AD in the family-based study (p < 0.05). In the replication data, three SNPs were associated with AAO by the Wilcoxon test, where the mean AAO was approximately 2.2 years earlier in individuals who had at least one minor allele of the top AAO-associated SNP rs9963463 (p = 0.0018) compared with those who were homozygous for the major allele. These findings from non-parametric survival analyses provide evidence for several genetic variants in EPG5 influencing the AAO of AD and will serve as a resource for replication in other populations.
Affiliation(s)
- Ke-Sheng Wang
- Department of Biostatistics and Epidemiology, College of Public Health, East Tennessee State University, PO Box 70259, Lamb Hall, Johnson City, TN, 37614-1700, USA.
| | - Xuefeng Liu
- Department of Systems Leadership and Effectiveness Science, School of Nursing, University of Michigan, Ann Arbor, MI, 48109-5482, USA
| | - Changchun Xie
- Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, Cincinnati, OH, 45267, USA
| | - Ying Liu
- Department of Biostatistics and Epidemiology, College of Public Health, East Tennessee State University, PO Box 70259, Lamb Hall, Johnson City, TN, 37614-1700, USA
| | - Chun Xu
- Department of Pediatrics, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, TX, 79912, USA
| |
|
144
|
Kemppainen P, Rønning B, Kvalnes T, Hagen IJ, Ringsby TH, Billing AM, Pärn H, Lien S, Husby A, Saether BE, Jensen H. Controlling for P-value inflation in allele frequency change in experimental evolution and artificial selection experiments. Mol Ecol Resour 2016; 17:770-782. [DOI: 10.1111/1755-0998.12631] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/15/2016] [Revised: 10/23/2016] [Accepted: 10/28/2016] [Indexed: 01/25/2023]
Affiliation(s)
- Petri Kemppainen
- Centre for Biodiversity Dynamics; Department of Biology; Norwegian University of Science and Technology; Høgskoleringen 5, Realfagbygget E1-126 NO-7491 Trondheim Norway
| | - Bernt Rønning
- Centre for Biodiversity Dynamics; Department of Biology; Norwegian University of Science and Technology; Høgskoleringen 5, Realfagbygget E1-126 NO-7491 Trondheim Norway
| | - Thomas Kvalnes
- Centre for Biodiversity Dynamics; Department of Biology; Norwegian University of Science and Technology; Høgskoleringen 5, Realfagbygget E1-126 NO-7491 Trondheim Norway
| | - Ingerid J. Hagen
- Centre for Biodiversity Dynamics; Department of Biology; Norwegian University of Science and Technology; Høgskoleringen 5, Realfagbygget E1-126 NO-7491 Trondheim Norway
| | - Thor Harald Ringsby
- Centre for Biodiversity Dynamics; Department of Biology; Norwegian University of Science and Technology; Høgskoleringen 5, Realfagbygget E1-126 NO-7491 Trondheim Norway
| | - Anna M. Billing
- Centre for Biodiversity Dynamics; Department of Biology; Norwegian University of Science and Technology; Høgskoleringen 5, Realfagbygget E1-126 NO-7491 Trondheim Norway
| | - Henrik Pärn
- Centre for Biodiversity Dynamics; Department of Biology; Norwegian University of Science and Technology; Høgskoleringen 5, Realfagbygget E1-126 NO-7491 Trondheim Norway
| | - Sigbjørn Lien
- CIGENE; Norwegian University of Life Sciences; P.O. Box 5003 NO-1432 Ås Norway
| | - Arild Husby
- Centre for Biodiversity Dynamics; Department of Biology; Norwegian University of Science and Technology; Høgskoleringen 5, Realfagbygget E1-126 NO-7491 Trondheim Norway
- Department of Biosciences; University of Helsinki; P.O. Box 65 (Viikinkaari 1) 00014 Helsinki Finland
| | - Bernt-Erik Saether
- Centre for Biodiversity Dynamics; Department of Biology; Norwegian University of Science and Technology; Høgskoleringen 5, Realfagbygget E1-126 NO-7491 Trondheim Norway
| | - Henrik Jensen
- Centre for Biodiversity Dynamics; Department of Biology; Norwegian University of Science and Technology; Høgskoleringen 5, Realfagbygget E1-126 NO-7491 Trondheim Norway
| |
|
145
|
Efficient and Accurate Multiple-Phenotype Regression Method for High Dimensional Data Considering Population Structure. Genetics 2016; 204:1379-1390. [PMID: 27770036 DOI: 10.1534/genetics.116.189712] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 03/29/2016] [Accepted: 09/28/2016] [Indexed: 02/07/2023] Open
Abstract
A typical genome-wide association study tests correlation between a single phenotype and each genotype one at a time. However, single-phenotype analysis might miss unmeasured aspects of complex biological networks. Analyzing many phenotypes simultaneously may increase the power to capture these unmeasured aspects and detect more variants. Several multivariate approaches aim to detect variants related to more than one phenotype, but these current approaches do not consider the effects of population structure. As a result, these approaches may result in a significant amount of false positive identifications. Here, we introduce a new methodology, referred to as GAMMA for generalized analysis of molecular variance for mixed-model analysis, which is capable of simultaneously analyzing many phenotypes and correcting for population structure. In a simulated study using data implanted with true genetic effects, GAMMA accurately identifies these true effects without producing false positives induced by population structure. In simulations with this data, GAMMA is an improvement over other methods which either fail to detect true effects or produce many false positive identifications. We further apply our method to genetic studies of yeast and gut microbiome from mice and show that GAMMA identifies several variants that are likely to have true biological mechanisms.
|
146
|
Kandt J, Cheshire JA, Longley PA. Regional surnames and genetic structure in Great Britain. TRANSACTIONS (INSTITUTE OF BRITISH GEOGRAPHERS : 1965) 2016; 41:554-569. [PMID: 27708455 PMCID: PMC5032893 DOI: 10.1111/tran.12131] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/23/2016] [Indexed: 06/06/2023]
Abstract
Following the increasing availability of DNA-sequenced data, the genetic structure of populations can now be inferred and studied in unprecedented detail. Across social science, this innovation is shaping new bio-social research agendas, attracting substantial investment in the collection of genetic, biological and social data for large population samples. Yet genetic samples are special because the precise populations that they represent are uncertain and ill-defined. Unlike most social surveys, a genetic sample's representativeness of the population cannot be established by conventional procedures of statistical inference, and the implications for population-wide generalisations about bio-social phenomena are little understood. In this paper, we seek to address these problems by linking surname data to a censored and geographically uneven sample of DNA scans, collected for the People of the British Isles study. Based on a combination of global and local spatial correspondence measures, we identify eight regions in Great Britain that are most likely to represent the geography of genetic structure of Great Britain's long-settled population. We discuss the implications of this regionalisation for bio-social investigations. We conclude that, as the often highly selective collection of DNA and biomarkers becomes a more common practice, geography is crucial to understanding variation in genetic information within diverse populations.
Affiliation(s)
- Jens Kandt
- Department of Geography, University College London, London WC1E 6BT
| | | | - Paul A Longley
- Department of Geography, University College London, London WC1E 6BT
| |
|
147
|
Sánchez-Pozos K, Menjívar M. Genetic Component of Type 2 Diabetes in a Mexican Population. Arch Med Res 2016; 47:496-505. [DOI: 10.1016/j.arcmed.2016.12.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 07/15/2016] [Accepted: 12/05/2016] [Indexed: 01/15/2023]
|
148
|
Zhang L, Mukherjee B, Ghosh M, Wu R. Bayesian modeling for genetic association in case-control studies: accounting for unknown population substructure. STAT MODEL 2016. [DOI: 10.1177/1471082006071841] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
A two-stage parametric Bayesian method is proposed to examine the association between a candidate gene and the occurrence of a disease after accounting for population substructure. This procedure, implemented via a Markov chain Monte Carlo numerical integration technique, first estimates the posterior probability of different unknown population substructures and then integrates this information into a disease-gene association model through the technique of Bayesian model averaging. The model relaxes certain assumptions of previous analyses and provides a unified computational framework to obtain an estimate of the log odds ratio parameter corresponding to the genetic factor after allowing for the allele frequencies to vary across subpopulations. The uncertainty in estimating the population substructure is taken into account while providing credible intervals for parameters in the disease-gene association model. Simulations on unmatched case-control studies that mimic an admixed Argentinean population are performed to demonstrate the statistical properties of our model. The method is also applied to a real data set coming from a genetic association study on obesity.
Affiliation(s)
- Li Zhang
- Department of Statistics, University of Florida, Gainesville, USA
| | | | - Malay Ghosh
- Department of Statistics, University of Florida, Gainesville, USA
| | - Rongling Wu
- Department of Statistics, University of Florida, Gainesville, USA
| |
|
149
|
Meng S, He J, Zhao T, Xing G, Li Y, Yang S, Lu J, Wang Y, Gai J. Detecting the QTL-allele system of seed isoflavone content in Chinese soybean landrace population for optimal cross design and gene system exploration. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2016; 129:1557-76. [PMID: 27189002 DOI: 10.1007/s00122-016-2724-0] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 07/20/2015] [Accepted: 04/28/2016] [Indexed: 05/20/2023]
Abstract
KEY MESSAGE: Utilizing an innovative GWAS in CSLRP, 44 QTL with 199 alleles, contributing 72.2% of SIFC variation, were detected and organized into a QTL-allele matrix for cross design and gene annotation. The seed isoflavone content (SIFC) of soybeans is of great importance to health care. The Chinese soybean landrace population (CSLRP), as a genetic reservoir, was studied for its whole-genome quantitative trait loci (QTL) system of the SIFC using an innovative restricted two-stage multi-locus genome-wide association study procedure (RTM-GWAS). A sample of 366 landraces was tested under four environments and sequenced using the RAD-seq (restriction-site-associated DNA sequencing) technique to obtain 116,769 single nucleotide polymorphisms (SNPs), which were then organized into 29,119 SNP linkage disequilibrium blocks (SNPLDBs) for GWAS. The detected 44 QTL with 199 alleles on 16 chromosomes (explaining 72.2% of the total phenotypic variation), together with the allele effects (92 positive and 107 negative) of the CSLRP, were organized into a QTL-allele matrix showing the SIFC population genetic structure. Differentiation among eco-regions due to the SIFC, in addition to that of genome-wide markers, was found. All accessions comprised both positive and negative alleles, implying a great potential for recombination within the population. The optimal crosses were predicted from the matrices, showing transgressive potentials in the CSLRP. From the detected QTL system, 55 candidate genes related to 11 biological processes were χ²-tested as an SIFC candidate gene system. The present study explored the genome-wide SIFC QTL/gene system with the innovative RTM-GWAS and found the potentials of the QTL-allele matrix in optimal cross design and population genetic and genomic studies, which may have provided a solution to match the breeding by design strategy at both QTL and gene levels in breeding programs.
Affiliation(s)
- Shan Meng
- Soybean Research Institute, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- National Center for Soybean Improvement, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China
| | - Jianbo He
- Soybean Research Institute, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- National Center for Soybean Improvement, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China
| | - Tuanjie Zhao
- Soybean Research Institute, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- National Center for Soybean Improvement, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China
- Key Laboratory of Biology and Genetic Improvement of Soybean, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China
- National Key Laboratory for Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Guangnan Xing
- Soybean Research Institute, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- National Center for Soybean Improvement, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China
- Key Laboratory of Biology and Genetic Improvement of Soybean, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China
- National Key Laboratory for Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Yan Li
- Soybean Research Institute, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- National Center for Soybean Improvement, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China
- Key Laboratory of Biology and Genetic Improvement of Soybean, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China
- National Key Laboratory for Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Shouping Yang
- Soybean Research Institute, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- National Center for Soybean Improvement, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China
- Key Laboratory of Biology and Genetic Improvement of Soybean, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China
- National Key Laboratory for Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Jiangjie Lu
- Soybean Research Institute, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- National Center for Soybean Improvement, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China
| | - Yufeng Wang
- Soybean Research Institute, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- National Center for Soybean Improvement, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China
| | - Junyi Gai
- Soybean Research Institute, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China.
- National Center for Soybean Improvement, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China.
- Key Laboratory of Biology and Genetic Improvement of Soybean, Ministry of Agriculture, Nanjing, 210095, Jiangsu, China.
- National Key Laboratory for Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China.
- Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China.
| |
|
150
|
Bayesian Inference of the Evolution of a Phenotype Distribution on a Phylogenetic Tree. Genetics 2016; 204:89-98. [PMID: 27412711 PMCID: PMC5012407 DOI: 10.1534/genetics.116.190496] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 04/18/2016] [Accepted: 07/07/2016] [Indexed: 12/21/2022] Open
Abstract
The distribution of a phenotype on a phylogenetic tree is often a quantity of interest. Many phenotypes have imperfect heritability, so that a measurement of the phenotype for an individual can be thought of as a single realization from the phenotype distribution of that individual. If all individuals in a phylogeny had the same phenotype distribution, measured phenotypes would be randomly distributed on the tree leaves. This is, however, often not the case, implying that the phenotype distribution evolves over time. Here we propose a new model based on this principle of evolving phenotype distribution on the branches of a phylogeny, which is different from ancestral state reconstruction where the phenotype itself is assumed to evolve. We develop an efficient Bayesian inference method to estimate the parameters of our model and to test the evidence for changes in the phenotype distribution. We use multiple simulated data sets to show that our algorithm has good sensitivity and specificity properties. Since our method identifies branches on the tree on which the phenotype distribution has changed, it is able to break down a tree into components for which this distribution is unique and constant. We present two applications of our method, one investigating the association between HIV genetic variation and human leukocyte antigen and the other studying host range distribution in a lineage of Salmonella enterica, and we discuss many other potential applications.
|