101
|
Hernandez Cordero AI, Carbonetto P, Riboni Verri G, Gregory JS, Vandenbergh DJ, Gyekis JP, Blizard DA, Lionikas A. Replication and discovery of musculoskeletal QTLs in LG/J and SM/J advanced intercross lines. Physiol Rep 2019; 6. [PMID: 29479840 PMCID: PMC6430048 DOI: 10.14814/phy2.13561] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/03/2017] [Revised: 12/02/2017] [Accepted: 12/05/2017] [Indexed: 12/14/2022] Open
Abstract
The genetics underlying variation in health‐related musculoskeletal phenotypes can be investigated in a mouse model. Quantitative trait loci (QTLs) affecting musculoskeletal traits in the LG/J and SM/J strain lineage remain to be refined and corroborated. The aim of this study was to map muscle and bone traits in males (n = 506) of the 50th filial generation of advanced intercross lines (LG/SM AIL) derived from the two strains. Genetic contribution to variation in all musculoskeletal traits was confirmed; the SNP heritability of muscle mass ranged between 0.46 and 0.56, and the SNP heritability of tibia length was 0.40. We used two analytical software packages, GEMMA and QTLRel, to map the underlying QTLs. GEMMA required substantially less computation and recovered all the QTLs identified by QTLRel. Seven significant QTLs were identified for muscle weight (Chr 1, 7, 11, 12, 13, 15, and 16), and two for tibia length (Chr 1 and 13). Each QTL explained 4–5% of phenotypic variation. One muscle and both bone loci replicated previous findings; the remaining six were novel. Positional candidates for the replicated QTLs were prioritized based on in silico analyses and gene expression in muscle tissue. In summary, we replicated existing QTLs and identified novel QTLs affecting muscle weight, and replicated bone length QTLs in LG/SM AIL males. Heritability estimates substantially exceed the cumulative effect of the QTLs; hence, a richer genetic architecture contributing to muscle and bone variability could be uncovered with a larger sample size.
Affiliation(s)
- Ana I Hernandez Cordero
- School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, Aberdeen, United Kingdom
| | - Peter Carbonetto
- Research Computing Center and Department of Human Genetics, University of Chicago, Chicago, Illinois
| | - Gioia Riboni Verri
- School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, Aberdeen, United Kingdom
| | - Jennifer S Gregory
- School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, Aberdeen, United Kingdom
| | - David J Vandenbergh
- Department of Biobehavioral Health, The Penn State Institute for the Neurosciences, Molecular and Cellular Integrative Biosciences Program, The Pennsylvania State University, University Park, Pennsylvania
| | - Joseph P Gyekis
- Department of Biobehavioral Health, The Pennsylvania State University, University Park, Pennsylvania
| | - David A Blizard
- Department of Biobehavioral Health, The Penn State Institute for the Neurosciences, Molecular and Cellular Integrative Biosciences Program, The Pennsylvania State University, University Park, Pennsylvania
| | - Arimantas Lionikas
- School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, Aberdeen, United Kingdom
| |
|
102
|
Hansen MEB, Rubel MA, Bailey AG, Ranciaro A, Thompson SR, Campbell MC, Beggs W, Dave JR, Mokone GG, Mpoloka SW, Nyambo T, Abnet C, Chanock SJ, Bushman FD, Tishkoff SA. Population structure of human gut bacteria in a diverse cohort from rural Tanzania and Botswana. Genome Biol 2019; 20:16. [PMID: 30665461 PMCID: PMC6341659 DOI: 10.1186/s13059-018-1616-9] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Received: 06/05/2018] [Accepted: 12/20/2018] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND Gut microbiota from individuals in rural, non-industrialized societies differ from those in individuals from industrialized societies. Here, we use 16S rRNA sequencing to survey the gut bacteria of seven non-industrialized populations from Tanzania and Botswana. These include populations practicing traditional hunter-gatherer, pastoralist, and agropastoralist subsistence lifestyles and a comparative urban cohort from the greater Philadelphia region. RESULTS We find that bacterial diversity per individual and within-population phylogenetic dissimilarity differ between Botswanan and Tanzanian populations, with Tanzania generally having higher diversity per individual and lower dissimilarity between individuals. Among subsistence groups, the gut bacteria of hunter-gatherers are phylogenetically distinct from those of both agropastoralists and pastoralists, but those of agropastoralists and pastoralists were not significantly different from each other. Nearly half of the Bantu-speaking agropastoralists from Botswana have gut bacteria that are very similar to those of the Philadelphian cohort. Based on imputed metagenomic content, US samples have a relative enrichment of genes found in pathways for degradation of several common industrial pollutants. Within two African populations, we find evidence that bacterial composition correlates with the genetic relatedness between individuals. CONCLUSIONS Across the cohort, similarity in bacterial presence/absence compositions between people increases with both geographic proximity and genetic relatedness, while abundance-weighted bacterial composition varies more significantly with geographic proximity than with genetic relatedness.
Affiliation(s)
- Matthew E B Hansen
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Meagan A Rubel
- Department of Anthropology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Aubrey G Bailey
- Department of Microbiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Present address: Kuopio Center for Gene and Cell Therapy, Microkatu 1, 70210, Kuopio, Finland
| | - Alessia Ranciaro
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Simon R Thompson
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Present address: Genomics England, Queen Mary University of London, London, EC1M 6BQ, UK
| | - Michael C Campbell
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Present address: Department of Biology, Howard University, 415 College St. NW, Washington, DC, USA
| | - William Beggs
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Jaanki R Dave
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- The Geisinger Commonwealth Medical College, Scranton, PA, 18509, USA
| | - Gaonyadiwe G Mokone
- Department of Biomedical Sciences, University of Botswana School of Medicine, Gaborone, Botswana
| | | | - Thomas Nyambo
- Department of Biochemistry, Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
| | - Christian Abnet
- Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, 20892, USA
| | - Stephen J Chanock
- Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, 20892, USA
| | - Frederic D Bushman
- Department of Microbiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Sarah A Tishkoff
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA.
| |
|
103
|
Xiong Z, Zhang Q, Platt A, Liao W, Shi X, de Los Campos G, Long Q. OCMA: Fast, Memory-Efficient Factorization of Prohibitively Large Relationship Matrices. G3 (Bethesda) 2019; 9:13-19. [PMID: 30482799 PMCID: PMC6325911 DOI: 10.1534/g3.118.200908] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/22/2018] [Accepted: 11/26/2018] [Indexed: 11/28/2022]
Abstract
Matrices representing genetic relatedness among individuals (i.e., Genomic Relationship Matrices, GRMs) play a central role in genetic analysis. The eigen-decomposition of GRMs (or its alternative that generates fewer top singular values using genotype matrices) is a necessary step for many analyses including estimation of SNP-heritability, Principal Component Analysis (PCA), and genomic prediction. However, the GRMs and genotype matrices provided by modern biobanks are too large to be stored in active memory. To accommodate the current and future "bigger-data", we develop a disk-based tool, Out-of-Core Matrices Analyzer (OCMA), using state-of-the-art computational techniques that can nimbly perform eigen and Singular Value Decomposition (SVD) analyses. By integrating memory mapping (mmap) and the latest matrix factorization libraries, our tool is fast and memory-efficient. To demonstrate the performance of OCMA, we test it on a personal computer. For full eigen-decomposition, it solves an ordinary GRM (N = 10,000) in 55 sec. For SVD, a commonly used faster alternative to full eigen-decomposition in genomic analyses, OCMA solves the top 200 singular values (SVs) in half an hour, the top 2,000 SVs in 0.95 hr, and all 5,000 SVs in 1.77 hr based on a very large genotype matrix (N = 1,000,000, M = 5,000) on the same personal computer. OCMA also supports multi-threading when running on a desktop or HPC cluster. Our OCMA tool can thus alleviate the computing bottleneck of classical analyses on large genomic matrices, and make it possible to scale up current and emerging analytical methods to big genomics data using lightweight computing resources.
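The idea of working from a memory-mapped genotype matrix can be illustrated with a small sketch. This is not OCMA's implementation; it swaps in a simple, well-known technique: for a tall N x M matrix Z kept on disk, accumulate the small M x M Gram matrix Z^T Z one block of rows at a time, then take the square roots of its eigenvalues to obtain the singular values of Z. The function name `singular_values_out_of_core`, the file name, the chunk size, and the toy dimensions are all assumptions made for illustration.

```python
import os
import tempfile

import numpy as np


def singular_values_out_of_core(path, n_rows, n_cols, chunk_rows=1000):
    """Singular values of an (n_rows x n_cols) float64 matrix stored on disk.

    Illustrative sketch only (not OCMA's algorithm): rows are read in blocks
    through a memory map, so only one block and the small Gram matrix are
    held in memory at a time.
    """
    Z = np.memmap(path, dtype=np.float64, mode="r", shape=(n_rows, n_cols))
    gram = np.zeros((n_cols, n_cols))
    for start in range(0, n_rows, chunk_rows):
        block = np.asarray(Z[start:start + chunk_rows])  # load one block of rows
        gram += block.T @ block                          # accumulate Z^T Z
    eigvals = np.linalg.eigvalsh(gram)                   # ascending eigenvalues
    # Clip tiny negative values caused by rounding, then sort descending.
    return np.sqrt(np.clip(eigvals, 0.0, None))[::-1]


# Toy demonstration with a small random matrix written to a temporary file.
rng = np.random.default_rng(0)
n_rows, n_cols = 500, 8
path = os.path.join(tempfile.mkdtemp(), "genotypes.dat")  # hypothetical file
Z_disk = np.memmap(path, dtype=np.float64, mode="w+", shape=(n_rows, n_cols))
Z_disk[:] = rng.standard_normal((n_rows, n_cols))
Z_disk.flush()

sv = singular_values_out_of_core(path, n_rows, n_cols, chunk_rows=64)
```

Forming the Gram matrix squares the condition number, so this shortcut is only reasonable when M is small and the matrix is well conditioned; a production tool would likely use a more careful factorization.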
Affiliation(s)
- Zhi Xiong
- Department of Computer Science, Shantou University, China
| | - Qingrun Zhang
- Department of Biochemistry and Molecular Biology, University of Calgary, Canada
- Annie Charbonneau Cancer Institute, University of Calgary, Canada
| | - Alexander Platt
- Center for Computational Genetics and Genomics, Temple University, USA
| | - Wenyuan Liao
- Department of Mathematics and Statistics, University of Calgary, Canada
| | - Xinghua Shi
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, USA
| | - Gustavo de Los Campos
- Department of Epidemiology & Biostatistics, Statistics & Probability and Institute for Quantitative Health Science and Engineering, Michigan State University, USA
| | - Quan Long
- Department of Biochemistry and Molecular Biology, University of Calgary, Canada
- Department of Medical Genetics, University of Calgary, Canada
- Department of Mathematics and Statistics, University of Calgary, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, Canada
| |
|
104
|
Hardner CM, Hayes BJ, Kumar S, Vanderzande S, Cai L, Piaskowski J, Quero-Garcia J, Campoy JA, Barreneche T, Giovannini D, Liverani A, Charlot G, Villamil-Castro M, Oraguzie N, Peace CP. Prediction of genetic value for sweet cherry fruit maturity among environments using a 6K SNP array. Hortic Res 2019; 6:6. [PMID: 30603092 PMCID: PMC6312542 DOI: 10.1038/s41438-018-0081-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/12/2017] [Revised: 06/08/2018] [Accepted: 07/15/2018] [Indexed: 05/21/2023]
Abstract
The timing of fruit maturity is an important trait in sweet cherry production and breeding. Phenotypic variation for phenology of fruit maturity in sweet cherry appears to be under strong genetic control, but that control might be complicated by phenotypic instability across environments. Although such genotype-by-environment interaction (G × E) is a common phenomenon in crop plants, knowledge about it is lacking for fruit maturity timing and other sweet cherry traits. In this study, 1673 genome-wide SNP markers were used to estimate genomic relationships among 597 weakly pedigree-connected individuals evaluated over two seasons at three locations in Europe and one location in the USA, thus sampling eight 'environments'. The combined dataset enabled a single meta-analysis to investigate the environmental stability of genomic predictions. Linkage disequilibrium among marker loci declined rapidly with physical distance, and ordination of the relationship matrix suggested no strong structure among germplasm. The most parsimonious G × E model allowed heterogeneous genetic variance and pairwise covariances among environments. Narrow-sense genomic heritability was very high (0.60-0.83), as was accuracy of predicted breeding values (>0.62). Average correlation of additive effects among environments was high (0.96) and breeding values were highly correlated across locations. Results indicated that genomic models can be used in cherry to accurately predict date of fruit maturity for untested individuals in new environments. Limited G × E for this trait indicated that phenotypes of individuals will be stable across similar environments. Equivalent analyses for other sweet cherry traits, for which multiple years of data are commonly available among breeders and cultivar testers, would be informative for predicting performance of elite selections and cultivars in new environments.
Affiliation(s)
- Craig M. Hardner
- University of Queensland, Queensland Alliance for Agriculture and Food Innovation, Brisbane, QLD 4072 Australia
| | - Ben J. Hayes
- University of Queensland, Queensland Alliance for Agriculture and Food Innovation, Brisbane, QLD 4072 Australia
| | - Satish Kumar
- The New Zealand Institute for Plant and Food Research Limited, Hawke’s Bay Research Centre, Hastings, 4130 New Zealand
| | - Stijn Vanderzande
- Department of Horticulture, Washington State University, Pullman, WA 99164 USA
| | - Lichun Cai
- Department of Horticulture, Michigan State University, East Lansing, MI 48824 USA
| | - Julia Piaskowski
- Department of Horticulture, Washington State University, Pullman, WA 99164 USA
| | - José Quero-Garcia
- UMR 1332 BFP, INRA, University of Bordeaux, 33140 Nouvelle-Aquitaine, France
| | - José Antonio Campoy
- UMR 1332 BFP, INRA, University of Bordeaux, 33140 Nouvelle-Aquitaine, France
| | - Teresa Barreneche
- UMR 1332 BFP, INRA, University of Bordeaux, 33140 Nouvelle-Aquitaine, France
| | - Daniela Giovannini
- Council for Agricultural Research and Economics (CREA), Fruit Unit of Forlì, Via la Canapona, 1 bis, 47121 Emilia-Romagna, Italy
| | - Alessandro Liverani
- Council for Agricultural Research and Economics (CREA), Fruit Unit of Forlì, Via la Canapona, 1 bis, 47121 Emilia-Romagna, Italy
| | - Gérard Charlot
- Centre Technique Interprofessionnel des Fruits et Légumes (CTIFL), 751 Chemin de Balandran, 30127 Bellegarde, France
| | - Miguel Villamil-Castro
- University of Queensland, Queensland Alliance for Agriculture and Food Innovation, Brisbane, QLD 4072 Australia
| | - Nnadozie Oraguzie
- Department of Horticulture, Washington State University, Irrigated Agriculture Research and Extension Center, 24106N Bunn Road, Prosser, WA 99350 USA
| | - Cameron P. Peace
- Department of Horticulture, Washington State University, Pullman, WA 99164 USA
| |
|
105
|
Herzig AF, Nutile T, Ruggiero D, Ciullo M, Perdry H, Leutenegger AL. Detecting the dominance component of heritability in isolated and outbred human populations. Sci Rep 2018; 8:18048. [PMID: 30575761 PMCID: PMC6303332 DOI: 10.1038/s41598-018-36050-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/29/2018] [Accepted: 11/10/2018] [Indexed: 11/21/2022] Open
Abstract
Inconsistencies in published estimates of dominance heritability between studies of human genetic isolates and studies of human outbred populations prompt investigation into whether such differences result from particular trait architectures or specific population structures. We analyse simulated datasets, characteristic of genetic isolates and of unrelated individuals, before analysing the isolate of Cilento for various commonly studied traits. We show the strengths of using genetic relationship matrices for variance decomposition over identity-by-descent based methods in a population isolate, and that heritability estimates in isolates will avoid the downward biases that may occur in studies of samples of unrelated individuals, irrespective of the simulated distribution of causal variants. Yet, we also show that precise estimates of dominance in isolates are demonstrably problematic in the presence of shared environmental effects and that such effects should be accounted for. Nevertheless, we demonstrate how studying isolates can help determine the existence or non-existence of dominance for complex traits, and we find strong indications of non-zero dominance for low-density lipoprotein level in Cilento. Finally, we recommend future study designs to analyse trait variance decomposition from ensemble data across multiple population isolates.
Affiliation(s)
- Anthony F Herzig
- Inserm, U946, Genetic variation and Human diseases, Paris, France; Université Paris-Diderot, Sorbonne Paris Cité, U946, Paris, France
| | - Teresa Nutile
- Institute of Genetics and Biophysics A. Buzzati-Traverso - CNR, Naples, Italy
| | - Daniela Ruggiero
- Institute of Genetics and Biophysics A. Buzzati-Traverso - CNR, Naples, Italy; IRCCS Neuromed, Pozzilli, Isernia, Italy
| | - Marina Ciullo
- Institute of Genetics and Biophysics A. Buzzati-Traverso - CNR, Naples, Italy; IRCCS Neuromed, Pozzilli, Isernia, Italy
| | - Hervé Perdry
- Université Paris-Saclay, Univ. Paris-Sud, Inserm, CESP, Villejuif, France
| | - Anne-Louise Leutenegger
- Inserm, U946, Genetic variation and Human diseases, Paris, France; Université Paris-Diderot, Sorbonne Paris Cité, U946, Paris, France
| |
|
106
|
Hogg CJ, Wright B, Morris KM, Lee AV, Ivy JA, Grueber CE, Belov K. Founder relationships and conservation management: empirical kinships reveal the effect on breeding programmes when founders are assumed to be unrelated. Anim Conserv 2018. [DOI: 10.1111/acv.12463] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Indexed: 02/01/2023]
Affiliation(s)
- C. J. Hogg
- School of Life and Environmental Sciences The University of Sydney Sydney NSW Australia
- Zoo and Aquarium Association Australasia Mosman NSW Australia
| | - B. Wright
- School of Life and Environmental Sciences The University of Sydney Sydney NSW Australia
| | - K. M. Morris
- School of Life and Environmental Sciences The University of Sydney Sydney NSW Australia
| | - A. V. Lee
- Save the Tasmanian Devil Program DPIPWE Hobart TAS Australia
| | - J. A. Ivy
- San Diego Zoo Global San Diego CA USA
| | - C. E. Grueber
- School of Life and Environmental Sciences The University of Sydney Sydney NSW Australia
- San Diego Zoo Global San Diego CA USA
| | - K. Belov
- School of Life and Environmental Sciences The University of Sydney Sydney NSW Australia
| |
|
107
|
Klápště J, Suontama M, Dungey HS, Telfer EJ, Graham NJ, Low CB, Stovold GT. Effect of Hidden Relatedness on Single-Step Genetic Evaluation in an Advanced Open-Pollinated Breeding Program. J Hered 2018; 109:802-810. [PMID: 30285150 PMCID: PMC6208454 DOI: 10.1093/jhered/esy051] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 06/11/2018] [Accepted: 09/27/2018] [Indexed: 01/17/2023] Open
Abstract
Open-pollinated (OP) mating is frequently used in forest tree breeding due to the relative temporal and financial efficiency of the approach. The trade-off is the lower precision of the estimated genetic parameters. Pedigree/sib-ship reconstruction has been proven as a tool to correct and complete pedigree information and to improve the precision of genetic parameter estimates. Our study analyzed an advanced generation Eucalyptus population from an OP breeding program using single-step genetic evaluation. The relationship matrix inferred from sib-ship reconstruction was used to rescale the marker-based relationship matrix (G matrix). This was compared with a second scenario that used rescaling based on the documented pedigree. The proposed single-step model performed better with respect to both model fit and the theoretical accuracy of breeding values. We found that the prediction accuracy was superior when using the pedigree information only when compared with using a combination of the pedigree and genomic information. This pattern appeared to be mainly a result of accumulated unrecognized relatedness over several breeding cycles, resulting in breeding values being shrunk toward the population mean. Using biased, pedigree-based breeding values as the base with which to correlate predicted GEBVs resulted in the underestimation of prediction accuracies. Using breeding values estimated on the basis of sib-ship reconstruction resulted in increased prediction accuracies of the genotyped individuals. Therefore, selection of the correct base for estimation of prediction accuracy is critical. The beneficial impact of sib-ship reconstruction using G matrix rescaling was profound, especially in traits with inbreeding depression, such as stem diameter.
Affiliation(s)
- Jaroslav Klápště
- Scion (New Zealand Forest Research Institute Ltd.), Whakarewarewa, Rotorua, New Zealand
| | - Mari Suontama
- Scion (New Zealand Forest Research Institute Ltd.), Whakarewarewa, Rotorua, New Zealand. Mari Suontama is now at Skogforsk, Umeå, Sävar SE, Sweden
| | - Heidi S Dungey
- Scion (New Zealand Forest Research Institute Ltd.), Whakarewarewa, Rotorua, New Zealand
| | - Emily J Telfer
- Scion (New Zealand Forest Research Institute Ltd.), Whakarewarewa, Rotorua, New Zealand
| | - Natalie J Graham
- Scion (New Zealand Forest Research Institute Ltd.), Whakarewarewa, Rotorua, New Zealand
| | - Charlie B Low
- Scion (New Zealand Forest Research Institute Ltd.), Whakarewarewa, Rotorua, New Zealand
| | - Grahame T Stovold
- Scion (New Zealand Forest Research Institute Ltd.), Whakarewarewa, Rotorua, New Zealand
| |
|
108
|
Pedigree data indicate rapid inbreeding and loss of genetic diversity within populations of native, traditional dog breeds of conservation concern. PLoS One 2018; 13:e0202849. [PMID: 30208042 PMCID: PMC6135370 DOI: 10.1371/journal.pone.0202849] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 05/06/2018] [Accepted: 08/09/2018] [Indexed: 11/19/2022] Open
Abstract
Increasing concern is directed towards genetic diversity of domestic animal populations because strong selective breeding can rapidly deplete genetic diversity of socio-economically valuable animals. International conservation policy identifies minimizing genetic erosion of domesticated animals as a key biodiversity target. We used breeding records to assess potential indications of inbreeding and loss of founder allelic diversity in 12 native Swedish dog breeds, traditional to the country, ten of which have been identified by authorities as of conservation concern. The pedigrees dated back to the mid-1900s, comprising 5-11 generations and 350-66,500 individuals per pedigree. We assessed rates of inbreeding and potential indications of loss of genetic variation by measuring inbreeding coefficients and remaining number of founder alleles at five points in time during 1980-2012. We found average inbreeding coefficients among breeds to double, from an average of 0.03 in 1980 to 0.07 in 2012, in spite of the majority of breeds being numerically large with pedigrees comprising thousands of individuals, indicating that such a rapid increase of inbreeding should have been possible to avoid. We also found indications of extensive loss of intra-breed variation; on average, 70 percent of founder alleles were lost during 1980-2012. Explicit conservation goals for these breeds were not reflected in pedigree-based conservation genetic measures; breeding needs to focus more on retaining genetic variation, and supplementary genomic analyses of these breeds are highly warranted in order to find out the extent to which the trends indicated here are reflected over the genomes of these breeds.
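As a rough illustration of what measuring inbreeding coefficients from breeding records involves, the sketch below uses the standard tabular method to build the additive relationship matrix from a pedigree and reads each individual's inbreeding coefficient off the diagonal (F = A_ii - 1). The abstract does not say which algorithm or software the authors used, so this is an assumed approach; the function name `inbreeding_coefficients`, the pedigree encoding, and the toy pedigree are hypothetical.

```python
import numpy as np


def inbreeding_coefficients(pedigree):
    """Inbreeding coefficients via the tabular method (illustrative sketch).

    `pedigree` is a list of (sire, dam) index pairs, one per individual,
    ordered so that parents always appear before their offspring.
    Unknown parents are given as None and treated as unrelated founders.
    """
    n = len(pedigree)
    A = np.zeros((n, n))  # additive (numerator) relationship matrix
    for i, (sire, dam) in enumerate(pedigree):
        for j in range(i):
            # Relationship with an earlier individual is the average of
            # that individual's relationships with i's two parents.
            via_sire = A[j, sire] if sire is not None else 0.0
            via_dam = A[j, dam] if dam is not None else 0.0
            A[i, j] = A[j, i] = 0.5 * (via_sire + via_dam)
        both_known = sire is not None and dam is not None
        # Diagonal is 1 + F_i, where F_i is half the parents' relationship.
        A[i, i] = 1.0 + (0.5 * A[sire, dam] if both_known else 0.0)
    return np.diag(A) - 1.0


# Toy pedigree: two founders (0, 1), two full sibs (2, 3), and the
# offspring (4) of a mating between those full sibs.
toy_pedigree = [(None, None), (None, None), (0, 1), (0, 1), (2, 3)]
F = inbreeding_coefficients(toy_pedigree)
```

In this toy pedigree the offspring of the full-sib mating has F = 0.25, the textbook value, while the founders and the full sibs themselves have F = 0.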
|
109
|
Goudet J, Kay T, Weir BS. How to estimate kinship. Mol Ecol 2018; 27:4121-4135. [PMID: 30107060 PMCID: PMC6220858 DOI: 10.1111/mec.14833] [Citation(s) in RCA: 67] [Impact Index Per Article: 9.6] [Received: 04/30/2018] [Revised: 07/10/2018] [Accepted: 07/16/2018] [Indexed: 01/06/2023]
Abstract
The concept of kinship permeates many domains of fundamental and applied biology ranging from social evolution to conservation science to quantitative and human genetics. Until recently, pedigrees were the gold standard to infer kinship, but the advent of next‐generation sequencing and the availability of dense genetic markers in many species make it a good time to (re)evaluate the usefulness of genetic markers in this context. Using three published data sets where both pedigrees and markers are available, we evaluate two common and a new genetic estimator of kinship. We show discrepancies between pedigree values and marker estimates of kinship and explore via simulations the possible reasons for these. We find these discrepancies are attributable to two main sources: pedigree errors and heterogeneity in the origin of founders. We also show that our new marker‐based kinship estimator has very good statistical properties and behaviour and is particularly well suited for situations where the source population is of small size, as will often be the case in conservation biology, and where high levels of kinship are expected, as is typical in social evolution studies.
Affiliation(s)
- Jérôme Goudet
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
| | - Tomas Kay
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Bruce S Weir
- Department of Biostatistics, University of Washington, Seattle, Washington
| |
|
110
|
Raidan FSS, Porto-Neto LR, Li Y, Lehnert SA, Reverter A. Weighting genomic and genealogical information for genetic parameter estimation and breeding value prediction in tropical beef cattle. J Anim Sci 2018; 96:612-617. [PMID: 29385460 DOI: 10.1093/jas/skx027] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 10/19/2017] [Accepted: 01/16/2018] [Indexed: 11/14/2022] Open
Abstract
A combined matrix that exploits genealogy together with marker-based information could improve the selection of elite individuals in breeding programs. We present genetic parameters for adaptive and growth traits in beef cattle by exploring linear combinations of pedigree-based (A) and marker-based (G) relationship matrices. We use a data set with 2,111 Brahman (BB) and 2,550 Tropical Composite (TC) cattle with genotypes for 729,068 SNP, and phenotypes for five traits. A weighted relationship matrix (WRM) combining G and A was constructed as WRM = λG + (1 - λ)A. The weight (λ) was explored at values from 0.0 to 1.0, at 0.1 intervals. Additionally, four alternative G matrices, in the WRM, were evaluated according to the selection of SNP used to generate them: 1) Gw: all autosomal SNP with minor allele frequency (MAF) > 1%; 2) Gg: autosomal SNP with MAF > 1% and mapped inside gene coding regions; 3) Gp: autosomal SNP with MAF > 1% and previously reported to have significant pleiotropic effect in these two populations; and 4) Gc: autosomal SNP with MAF > 1% and with significant correlated effects previously reported in both BB and TC populations. In addition, two A matrices were evaluated: 1) A: all relationships between animals were considered after tracing back known ancestors; and 2) Ad: a distorted A matrix where a random 1% of the off-diagonal nonzero values were set to zero to simulate relationship errors. Five independent Ad matrices were explored, each with a different random 1% of relationships masked. Criteria for comparing the resulting WRM included estimates of heritability (h2) and cross-validation accuracy (ACC) of genomic estimated breeding values. The choice of WRM had a greater impact on h2 than on ACC estimates. The 1% errors introduced in pedigree relationships generated large distortion in genetic parameters and ACC estimates. However, employing a λ > 0.7 was an efficient mechanism to compensate for the errors in A. Additionally, although significant (P-value < 0.0001), we found no consistent relationship between the type of SNP used to compute G and h2 or ACC estimates. We identified the optimal value of λ for maximum h2 and ACC at λ = 0.7, suggesting a 70% and 30% weighting to genomic and genealogical information, respectively, as an optimal strategy to compensate for pedigree errors, to improve genetic parameter estimates, and to lead to more accurate selection decisions.
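The weighting formula WRM = λG + (1 - λ)A and the grid of λ values from 0.0 to 1.0 can be sketched in a few lines. This is only a minimal illustration of the blending step; the function name `weighted_relationship_matrix` and the 2 x 2 toy matrices are assumptions, and the estimation of h2 and ACC for each WRM described in the abstract is not shown.

```python
import numpy as np


def weighted_relationship_matrix(G, A, lam):
    """Blend marker-based (G) and pedigree-based (A) relationship matrices.

    Implements WRM = lam * G + (1 - lam) * A for a weight lam in [0, 1].
    """
    G = np.asarray(G, dtype=float)
    A = np.asarray(A, dtype=float)
    if G.shape != A.shape:
        raise ValueError("G and A must have the same shape")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * G + (1.0 - lam) * A


# Toy matrices for two animals (made-up values, for illustration only).
G_toy = np.array([[1.0, 0.2], [0.2, 1.0]])
A_toy = np.array([[1.0, 0.5], [0.5, 1.0]])

# Grid of weights from 0.0 to 1.0 in steps of 0.1, as in the abstract.
lambdas = np.linspace(0.0, 1.0, 11)
wrms = {round(float(lam), 1): weighted_relationship_matrix(G_toy, A_toy, lam)
        for lam in lambdas}
```

With λ = 0.0 the blend reduces to A, with λ = 1.0 it reduces to G, and at λ = 0.7 the toy off-diagonal element is 0.7 × 0.2 + 0.3 × 0.5 = 0.29.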
Affiliation(s)
- Fernanda S S Raidan
- CSIRO Agriculture & Food, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, Australia
| | - Laercio R Porto-Neto
- CSIRO Agriculture & Food, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, Australia
| | - Yutao Li
- CSIRO Agriculture & Food, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, Australia
| | - Sigrid A Lehnert
- CSIRO Agriculture & Food, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, Australia
| | - Antonio Reverter
- CSIRO Agriculture & Food, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, Australia
| |
|
111
|
Hosoya S, Kikuchi K, Nagashima H, Onodera J, Sugimoto K, Satoh K, Matsuzaki K, Yasugi M, Nagano AJ, Kumagayi A, Ueda K, Kurokawa T. Assessment of genetic diversity in Coho salmon (Oncorhynchus kisutch) populations with no family records using ddRAD-seq. BMC Res Notes 2018; 11:548. [PMID: 30071886 PMCID: PMC6071332 DOI: 10.1186/s13104-018-3663-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 06/05/2018] [Accepted: 07/30/2018] [Indexed: 11/10/2022] Open
Abstract
OBJECTIVE: Selective breeding for desirable traits is becoming popular in aquaculture. In Miyagi prefecture, Japan, a selectively bred population of Coho salmon (Oncorhynchus kisutch) has been established, with the original, randomly breeding population maintained separately. Since they have been bred without family records, the genetic diversity within these populations remains unknown. In this study, we estimated the genetic diversity and key quantitative genetic parameters such as heritability and genomic breeding value for body size traits by means of genomic best linear unbiased prediction to assess the genetic health of these populations. RESULTS: Ninety-nine and 83 females from the selective and random groups, respectively, were genotyped at 2350 putative SNPs by means of double digest restriction-site associated DNA sequencing. The genetic diversity in the selectively bred group was low, as were the estimated heritability and prediction accuracy for length and weight (h2 = 0.26-0.28; accuracy = 0.34), compared to the randomly bred group (h2 = 0.50-0.60; accuracy = 0.51-0.54). Although the tested sample size was small, these results suggest that further selection is difficult for the selectively bred population, while there is some potential for the randomly bred group, especially with the aid of genomic information.
Affiliation(s)
- Sho Hosoya
- Fisheries Laboratory, Graduate School of Agricultural and Life Sciences, University of Tokyo, 2971-4 Bentenjima, Maisaka, Hamamatsu, Shizuoka, 431-0214, Japan.
| | - Kiyoshi Kikuchi
- Fisheries Laboratory, Graduate School of Agricultural and Life Sciences, University of Tokyo, 2971-4 Bentenjima, Maisaka, Hamamatsu, Shizuoka, 431-0214, Japan
| | - Hiroshi Nagashima
- Miyagi Prefecture Fisheries Technology Institute, Freshwater Fisheries Experimental Station, Taiwa, Miyagi, 981-3625, Japan
| | - Junichi Onodera
- Miyagi Prefecture Fisheries Technology Institute, Freshwater Fisheries Experimental Station, Taiwa, Miyagi, 981-3625, Japan
| | - Kouichi Sugimoto
- Miyagi Prefecture Fisheries Technology Institute, Freshwater Fisheries Experimental Station, Taiwa, Miyagi, 981-3625, Japan
| | - Kou Satoh
- Miyagi Prefecture Fisheries Technology Institute, Freshwater Fisheries Experimental Station, Taiwa, Miyagi, 981-3625, Japan
| | - Keisuke Matsuzaki
- Miyagi Prefecture Fisheries Technology Institute, Freshwater Fisheries Experimental Station, Taiwa, Miyagi, 981-3625, Japan
| | - Masaki Yasugi
- Center for Ecological Research, Kyoto University, Hirano 509-3-2, Otsu, Shiga, 520-2113, Japan
| | - Atsushi J Nagano
- Center for Ecological Research, Kyoto University, Hirano 509-3-2, Otsu, Shiga, 520-2113, Japan; JST CREST, Honcho 4-1-8, Kawaguchi, Saitama, 332-0012, Japan; Faculty of Agriculture, Ryukoku University, Yokotani 1-5, Seta Ohe-cho, Otsu-shi, Shiga, 520-2194, Japan
| | - Akira Kumagayi
- Miyagi Prefecture Fisheries Technology Institute, Freshwater Fisheries Experimental Station, Taiwa, Miyagi, 981-3625, Japan
| | - Kenichi Ueda
- Miyagi Prefecture Fisheries Technology Institute, Freshwater Fisheries Experimental Station, Taiwa, Miyagi, 981-3625, Japan
| | - Tadahide Kurokawa
- Kushiro Laboratory, Hokkaido National Fisheries Research Institute, Japan Fisheries Research and Education Agency, 116 Katsurakoi, Kushiro-shi, Hokkaido, 085-0802, Japan.
|
112
|
Chang ES, Orive ME, Cartwright P. Nonclonal coloniality: Genetically chimeric colonies through fusion of sexually produced polyps in the hydrozoan Ectopleura larynx. Evol Lett 2018; 2:442-455. [PMID: 30283694 PMCID: PMC6121865 DOI: 10.1002/evl3.68] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 12/12/2017] [Accepted: 06/18/2018] [Indexed: 12/20/2022] Open
Abstract
Hydrozoans typically develop colonies through asexual budding of polyps. Although colonies of Ectopleura are similar to other hydrozoans in that they consist of multiple polyps physically connected through continuous epithelia and a shared gastrovascular cavity, Ectopleura larynx does not asexually bud polyps indeterminately. Instead, after an initial phase of limited budding in a young colony, E. larynx achieves its large colony size through the aggregation and fusion of sexually (nonclonally) produced polyps. The apparent chimerism within a physiologically integrated colony presents a potential source of conflict between distinct genetic lineages, which may vary in their ability to access the germline. To determine the extent to which the potential for genetic conflict exists, we characterized the types of genetic relationships between polyps within colonies, using a RAD‐Seq approach. Our results indicate that E. larynx colonies are indeed composed of polyps that are clones and sexually reproduced siblings and offspring, consistent with their life history. In addition, we found that colonies also contain polyps that are genetically unrelated, and that estimates of genome‐wide relatedness suggest a potential for conflict within a colony. Taken together, our data suggest that there are distinct categories of relationships in colonies of E. larynx, likely achieved through a range of processes including budding, regeneration, and fusion of progeny and unrelated polyps, with the possibility for a genetic conflict resolution mechanism. Together these processes contribute to the reevolution of the ecologically important trait of coloniality in E. larynx.
Affiliation(s)
- E Sally Chang
- Department of Ecology and Evolutionary Biology University of Kansas Lawrence Kansas 66045
| | - Maria E Orive
- Department of Ecology and Evolutionary Biology University of Kansas Lawrence Kansas 66045
| | - Paulyn Cartwright
- Department of Ecology and Evolutionary Biology University of Kansas Lawrence Kansas 66045
|
113
|
Genomic Model with Correlation Between Additive and Dominance Effects. Genetics 2018; 209:711-723. [PMID: 29743175 DOI: 10.1534/genetics.118.301015] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 04/07/2018] [Accepted: 05/08/2018] [Indexed: 11/18/2022] Open
Abstract
Dominance genetic effects are rarely included in pedigree-based genetic evaluation. With the availability of single nucleotide polymorphism markers and the development of genomic evaluation, estimates of dominance genetic effects have become feasible using genomic best linear unbiased prediction (GBLUP). Usually, studies involving additive and dominance genetic effects ignore possible relationships between them. It has often been suggested that the magnitudes of functional additive and dominance effects at quantitative trait loci are related, but there is no existing GBLUP-like approach accounting for such correlation. Wellmann and Bennewitz (2012) showed two ways of considering directional relationships between additive and dominance effects, which they estimated in a Bayesian framework. However, these relationships cannot be fitted at the level of individuals instead of loci in a mixed model, and are not compatible with standard animal or plant breeding software. This comes from a fundamental ambiguity in assigning the reference allele at a given locus. We show that, if there has been selection, assigning the most frequent allele as the reference allele orients the correlation between functional additive and dominance effects. As a consequence, the most frequent reference allele is expected to have a positive value. We also demonstrate that selection creates negative covariance between genotypic additive and dominance genetic values. For parameter estimation, it is possible to use a combined additive and dominance relationship matrix computed from marker genotypes, and to use standard restricted maximum likelihood algorithms based on an equivalent model. Through a simulation study, we show that such correlations can easily be estimated by mixed model software and that the accuracy of prediction for genetic values is slightly improved if such correlations are used in GBLUP. However, a model assuming uncorrelated effects and fitting orthogonal breeding values and dominance deviations performed similarly for prediction.
|
114
|
Malenfant RM, Davis CS, Richardson ES, Lunn NJ, Coltman DW. Heritability of body size in the polar bears of Western Hudson Bay. Mol Ecol Resour 2018; 18:854-866. [DOI: 10.1111/1755-0998.12889] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 10/01/2017] [Revised: 03/23/2018] [Accepted: 03/29/2018] [Indexed: 12/16/2022]
Affiliation(s)
- René M. Malenfant
- Department of Biology University of New Brunswick Fredericton NB Canada
- Department of Biological Sciences University of Alberta Edmonton Alberta Canada
| | - Corey S. Davis
- Department of Biological Sciences University of Alberta Edmonton Alberta Canada
| | - Evan S. Richardson
- Wildlife Research Division Science and Technology Branch Environment and Climate Change Canada Winnipeg MB Canada
| | - Nicholas J. Lunn
- Wildlife Research Division Science and Technology Branch Environment and Climate Change Canada University of Alberta Edmonton AB Canada
| | - David W. Coltman
- Department of Biological Sciences University of Alberta Edmonton Alberta Canada
|
115
|
Kaplanis J, Gordon A, Shor T, Weissbrod O, Geiger D, Wahl M, Gershovits M, Markus B, Sheikh M, Gymrek M, Bhatia G, MacArthur DG, Price AL, Erlich Y. Quantitative analysis of population-scale family trees with millions of relatives. Science 2018; 360:171-175. [PMID: 29496957 PMCID: PMC6593158 DOI: 10.1126/science.aam9309] [Citation(s) in RCA: 107] [Impact Index Per Article: 15.3] [Received: 02/07/2017] [Revised: 11/02/2017] [Accepted: 02/07/2018] [Indexed: 12/12/2022]
Abstract
Family trees have vast applications in fields as diverse as genetics, anthropology, and economics. However, the collection of extended family trees is tedious and usually relies on resources with limited geographical scope and complex data usage restrictions. We collected 86 million profiles from publicly available online data shared by genealogy enthusiasts. After extensive cleaning and validation, we obtained population-scale family trees, including a single pedigree of 13 million individuals. We leveraged the data to partition the genetic architecture of human longevity and to provide insights into the geographical dispersion of families. We also report a simple digital procedure to overlay other data sets with our resource.
Affiliation(s)
- Joanna Kaplanis
- New York Genome Center, New York, NY 10013, USA
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
| | - Assaf Gordon
- New York Genome Center, New York, NY 10013, USA
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
| | - Tal Shor
- MyHeritage, Or Yehuda 6037606, Israel
- Computer Science Department, Technion-Israel Institute of Technology, Haifa 3200003, Israel
| | - Omer Weissbrod
- Computer Science Department, Weizmann Institute of Science, Rehovot 7610001, Israel
| | - Dan Geiger
- Computer Science Department, Technion-Israel Institute of Technology, Haifa 3200003, Israel
| | - Mary Wahl
- New York Genome Center, New York, NY 10013, USA
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
| | - Barak Markus
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
| | - Mona Sheikh
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
| | - Melissa Gymrek
- New York Genome Center, New York, NY 10013, USA
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Harvard Medical School, Boston, MA 02115, USA
- Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Gaurav Bhatia
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
| | - Daniel G MacArthur
- Harvard Medical School, Boston, MA 02115, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Alkes L Price
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
- Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115, USA
| | - Yaniv Erlich
- New York Genome Center, New York, NY 10013, USA.
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- MyHeritage, Or Yehuda 6037606, Israel
- Department of Computer Science, Fu Foundation School of Engineering, Columbia University, New York, NY, USA
- Center for Computational Biology and Bioinformatics, Department of Systems Biology, Columbia University, New York, NY, USA
|
116
|
Schrag TA, Westhues M, Schipprack W, Seifert F, Thiemann A, Scholten S, Melchinger AE. Beyond Genomic Prediction: Combining Different Types of omics Data Can Improve Prediction of Hybrid Performance in Maize. Genetics 2018; 208:1373-1385. [PMID: 29363551 PMCID: PMC5887136 DOI: 10.1534/genetics.117.300374] [Citation(s) in RCA: 99] [Impact Index Per Article: 14.1] [Received: 10/09/2017] [Accepted: 01/20/2018] [Indexed: 01/28/2023] Open
Abstract
The ability to predict the agronomic performance of single-crosses with high precision is essential for selecting superior candidates for hybrid breeding. With recent technological advances, thousands of new parent lines, and, consequently, millions of new hybrid combinations are possible in each breeding cycle, yet only a few hundred can be produced and phenotyped in multi-environment yield trials. Well established prediction approaches such as best linear unbiased prediction (BLUP) using pedigree data and whole-genome prediction using genomic data are limited in capturing epistasis and interactions occurring within and among downstream biological strata such as transcriptome and metabolome. Because mRNA and small RNA (sRNA) sequences are involved in transcriptional, translational and post-translational processes, we expect them to provide information influencing several biological strata. However, using sRNA data of parent lines to predict hybrid performance has not yet been addressed. Here, we gathered genomic, transcriptomic (mRNA and sRNA) and metabolomic data of parent lines to evaluate the ability of the data to predict the performance of untested hybrids for important agronomic traits in grain maize. We found a considerable interaction for predictive ability between predictor and trait, with mRNA data being a superior predictor for grain yield and genomic data for grain dry matter content, while sRNA performed relatively poorly for both traits. Combining mRNA and genomic data as predictors resulted in high predictive abilities across both traits and combining other predictors improved prediction over that of the individual predictors alone. We conclude that downstream "omics" can complement genomics for hybrid prediction, and, thereby, contribute to more efficient selection of hybrid candidates.
Affiliation(s)
- Tobias A Schrag
- Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599 Stuttgart, Germany
| | - Matthias Westhues
- Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599 Stuttgart, Germany
| | - Wolfgang Schipprack
- Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599 Stuttgart, Germany
| | - Felix Seifert
- Biocenter Klein Flottbek, Developmental Biology and Biotechnology, University of Hamburg, 22609 Hamburg, Germany
| | - Alexander Thiemann
- Biocenter Klein Flottbek, Developmental Biology and Biotechnology, University of Hamburg, 22609 Hamburg, Germany
| | - Stefan Scholten
- Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599 Stuttgart, Germany
| | - Albrecht E Melchinger
- Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599 Stuttgart, Germany
|
117
|
López-Cortegano E, Bersabé D, Wang J, García-Dorado A. Detection of genetic purging and predictive value of purging parameters estimated in pedigreed populations. Heredity (Edinb) 2018; 121:38-51. [PMID: 29434337 DOI: 10.1038/s41437-017-0045-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 07/24/2017] [Revised: 12/07/2017] [Accepted: 12/09/2017] [Indexed: 11/09/2022] Open
Abstract
The consequences of inbreeding for fitness are important in evolutionary and conservation biology, but can critically depend on genetic purging. However, estimating purging has proven elusive. Using PURGd software, we assess the performance of the Inbreeding-Purging (IP) model and of ancestral inbreeding (Fa) models to detect purging in simulated pedigreed populations, and to estimate parameters that allow reliably predicting the evolution of fitness under inbreeding. The power to detect purging in a single small population of size N is low for both models during the first few generations of inbreeding (t ≈ N/2), but increases for longer periods of slower inbreeding and is, on average, larger for the IP model. The ancestral inbreeding approach overestimates the rate of inbreeding depression during long inbreeding periods, and produces joint estimates of the effects of inbreeding and purging that lead to unreliable predictions for the evolution of fitness. The IP estimates of the rate of inbreeding depression become downwardly biased when obtained from long inbreeding processes. However, the effect of this bias is canceled out by a coupled downward bias in the estimate of the purging coefficient so that, unless the population is very small, the joint estimate of these two IP parameters yields good predictions of the evolution of mean fitness in populations of different sizes during periods of different lengths. Therefore, our results support the use of the IP model to detect inbreeding depression and purging, and to estimate reliable parameters for predictive purposes.
Affiliation(s)
- Eugenio López-Cortegano
- Departamento de Genética. Facultad de Biología, Universidad Complutense, 28040, Madrid, Spain
| | - Diego Bersabé
- Departamento de Genética. Facultad de Biología, Universidad Complutense, 28040, Madrid, Spain
| | - Jinliang Wang
- Institute of Zoology, Zoological Society of London, Regent's Park, London, NW1 4RY, United Kingdom
| | - Aurora García-Dorado
- Departamento de Genética. Facultad de Biología, Universidad Complutense, 28040, Madrid, Spain.
|
118
|
Attard CRM, Beheregaray LB, Möller LM. Genotyping‐by‐sequencing for estimating relatedness in nonmodel organisms: Avoiding the trap of precise bias. Mol Ecol Resour 2018; 18:381-390. [DOI: 10.1111/1755-0998.12739] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 06/29/2017] [Revised: 11/02/2017] [Accepted: 11/02/2017] [Indexed: 12/29/2022]
Affiliation(s)
- Catherine R. M. Attard
- Molecular Ecology Lab College of Science and Engineering Flinders University Adelaide SA Australia
| | - Luciano B. Beheregaray
- Molecular Ecology Lab College of Science and Engineering Flinders University Adelaide SA Australia
| | - Luciana M. Möller
- Molecular Ecology Lab College of Science and Engineering Flinders University Adelaide SA Australia
|
119
|
Nietlisbach P, Keller LF, Camenisch G, Guillaume F, Arcese P, Reid JM, Postma E. Pedigree-based inbreeding coefficient explains more variation in fitness than heterozygosity at 160 microsatellites in a wild bird population. Proc Biol Sci 2018; 284:rspb.2016.2763. [PMID: 28250184 DOI: 10.1098/rspb.2016.2763] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 12/13/2016] [Accepted: 02/06/2017] [Indexed: 01/14/2023] Open
Abstract
Although the pedigree-based inbreeding coefficient F predicts the expected proportion of an individual's genome that is identical-by-descent (IBD), heterozygosity at genetic markers captures Mendelian sampling variation and thereby provides an estimate of realized IBD. Realized IBD should hence explain more variation in fitness than its pedigree-based expectation, but how many markers are required to achieve this in practice remains poorly understood. We use extensive pedigree and life-history data from an island population of song sparrows (Melospiza melodia) to show that the number of genetic markers and pedigree depth affected the explanatory power of heterozygosity and F, respectively, but that heterozygosity measured at 160 microsatellites did not explain more variation in fitness than F. This is in contrast with other studies that found heterozygosity based on far fewer markers to explain more variation in fitness than F. Thus, the relative performance of marker- and pedigree-based estimates of IBD depends on the quality of the pedigree, the number, variability and location of the markers employed, and the species-specific recombination landscape, and expectations based on detailed and deep pedigrees remain valuable until we can routinely afford genotyping hundreds of phenotyped wild individuals of genetic non-model species for thousands of genetic markers.
Affiliation(s)
- Pirmin Nietlisbach
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
| | - Lukas F Keller
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
| | - Glauco Camenisch
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
| | - Frédéric Guillaume
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
| | - Peter Arcese
- Department of Forest and Conservation Sciences, University of British Columbia, 2424 Main Mall, Vancouver, British Columbia, V6T 1Z4, Canada
| | - Jane M Reid
- Institute of Biological and Environmental Sciences, School of Biological Sciences, University of Aberdeen, Zoology Building, Tillydrone Avenue, Aberdeen AB24 2TZ, UK
| | - Erik Postma
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland; Centre for Ecology and Conservation, College of Life and Environmental Sciences, University of Exeter, Cornwall Campus, Penryn TR10 9EZ, UK
|
120
|
Luikart G, Kardos M, Hand BK, Rajora OP, Aitken SN, Hohenlohe PA. Population Genomics: Advancing Understanding of Nature. POPULATION GENOMICS 2018. [DOI: 10.1007/13836_2018_60] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Indexed: 02/07/2023]
|
121
|
Fonseca PAS, Leal TP, Santos FC, Gouveia MH, Id-Lahoucine S, Rosse IC, Ventura RV, Bruneli FAT, Machado MA, Peixoto MGCD, Tarazona-Santos E, Carvalho MRS. Reducing cryptic relatedness in genomic data sets via a central node exclusion algorithm. Mol Ecol Resour 2017; 18:435-447. [PMID: 29271609 DOI: 10.1111/1755-0998.12746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2017] [Revised: 12/04/2017] [Accepted: 12/14/2017] [Indexed: 11/30/2022]
Abstract
Cryptic relatedness is a confounding factor in genetic diversity and genetic association studies. Development of strategies to reduce cryptic relatedness in a sample is a crucial step for downstream genetic analyses. This study uses a node selection algorithm, based on network degrees of centrality, to evaluate its applicability and impact on evaluation of genetic diversity and population stratification. A total of 1,036 Guzerá (Bos indicus) females were genotyped using the Illumina Bovine SNP50 v2 BeadChip. Four strategies were compared. The first and second strategies consist of an iterative exclusion of the most related individuals based on the PLINK kinship coefficient (φij) and VanRaden's φij, respectively. The third and fourth strategies were based on a node selection algorithm. The fourth strategy, Network G matrix, preserved the largest number of individuals, with better diversity and representation of the initial sample. Determining the most probable number of populations was directly affected by the kinship metric. Network G matrix was the best strategy for reducing relatedness because it produced a larger sample with more distant individuals, a distribution in the MDS plots more similar to that of the full data set, and a better representation of the population structure. Resampling strategies using VanRaden's φij as a relationship metric were better at inferring the relationships among individuals. Moreover, the resampling strategies directly impact the genomic inflation values in genomewide association studies. The use of the node selection algorithm also implies better selection of the most central individuals to be removed, providing a more representative sample.
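The general idea of removing highly connected individuals from a relatedness network can be sketched as a greedy loop. This is a generic degree-based illustration, not the algorithm published in the paper: the function name `prune_related`, the kinship threshold and the toy kinship values are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple


def prune_related(
    kinship: Dict[Tuple[str, str], float], threshold: float
) -> List[str]:
    """Greedily drop the most connected individual until no retained
    pair has a kinship value above the threshold."""
    # Build an undirected graph: an edge joins two individuals whose
    # kinship exceeds the threshold.
    neighbours: Dict[str, Set[str]] = {}
    for (a, b), phi in kinship.items():
        neighbours.setdefault(a, set())
        neighbours.setdefault(b, set())
        if phi > threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    while True:
        # Highest-degree node; sorting first makes ties deterministic.
        worst = max(
            sorted(neighbours), key=lambda n: len(neighbours[n]), default=None
        )
        if worst is None or not neighbours[worst]:
            break  # no edges left, so no related pairs remain
        for other in neighbours.pop(worst):
            neighbours[other].discard(worst)
    return sorted(neighbours)


# Toy kinship values (made up): "a" is closely related to everyone else.
toy_kinship = {
    ("a", "b"): 0.25, ("a", "c"): 0.25, ("a", "d"): 0.30,
    ("b", "c"): 0.01, ("b", "d"): 0.02, ("c", "d"): 0.00,
}
kept = prune_related(toy_kinship, threshold=0.1)
```

In this toy case removing the single hub individual "a" is enough, so three of four individuals are retained, whereas a pairwise rule that drops one member of each related pair in turn could discard more.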
Affiliation(s)
- Pablo A S Fonseca
- Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Thiago P Leal
- Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Fernanda C Santos
- Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Mateus H Gouveia
- Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Samir Id-Lahoucine
- Center for Genetic Improvement of Livestock, University of Guelph, Guelph, ON, Canada
| | - Izinara C Rosse
- Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Ricardo V Ventura
- Center for Genetic Improvement of Livestock, University of Guelph, Guelph, ON, Canada; Beef Improvement Opportunities, Guelph, ON, Canada
| | - Eduardo Tarazona-Santos
- Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Maria Raquel S Carvalho
- Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
|
122
|
A novel linkage-disequilibrium corrected genomic relationship matrix for SNP-heritability estimation and genomic prediction. Heredity (Edinb) 2017; 120:356-368. [PMID: 29238077 PMCID: PMC5842222 DOI: 10.1038/s41437-017-0023-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 07/18/2017] [Revised: 10/13/2017] [Accepted: 10/23/2017] [Indexed: 12/15/2022] Open
Abstract
Single nucleotide polymorphism (SNP)-heritability estimation is an important topic in several research fields, including animal, plant and human genetics, as well as in ecology. Linear mixed model estimation of SNP-heritability uses the structure of genomic relationships between individuals, which is constructed from genome-wide sets of SNP markers that are generally weighted equally in their contributions. Proposed methods to handle dependence between SNPs include “thinning” the marker set by linkage disequilibrium (LD)-pruning, the use of haplotype-tagging of SNPs, and LD-weighting of the SNP contributions. For improved estimation, we propose a new conceptual framework for the genomic relationship matrix, in which Mahalanobis distance-based LD-correction is used in a linear mixed model estimation of SNP-heritability. The superiority of the presented method is illustrated and compared to mixed-model analyses using a VanRaden genomic relationship matrix, a matrix used by GCTA and a matrix employing LD-weighting (as implemented in the LDAK software) in simulated (using real human, rice and cattle genotypes) and real (maize, rice and mice) datasets. Despite the computational difficulties, our results suggest that by using the proposed method one can improve the accuracy of SNP-heritability estimates in datasets with high LD.
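For orientation, the equally weighted VanRaden-type matrix that the abstract uses as a comparison baseline is commonly written as G = ZZ' / (2 Σ p(1 − p)), where Z holds genotypes centred by twice the allele frequency. The sketch below implements only that baseline, not the LD-corrected matrix the paper proposes. It assumes genotypes coded as 0/1/2 allele counts and allele frequencies estimated from the sample itself; the function name `vanraden_grm` and the toy genotypes are hypothetical.

```python
from typing import List


def vanraden_grm(genotypes: List[List[int]]) -> List[List[float]]:
    """Genomic relationship matrix G = Z Z' / (2 * sum_j p_j (1 - p_j)).

    genotypes[i][j] is the 0/1/2 allele count of individual i at SNP j.
    Allele frequencies p_j are estimated from the sample (an assumption).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all markers are monomorphic")
    # Centre each genotype by its expected value 2 * p_j.
    Z = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(zi * zk for zi, zk in zip(Z[i], Z[k])) / scale for k in range(n)]
        for i in range(n)
    ]


# Toy genotypes: 3 individuals x 4 SNPs (made-up values).
M = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
]
G = vanraden_grm(M)
```

Every SNP contributes to G with the same weight here, which is the property that the LD-pruning, LD-weighting and LD-correction approaches mentioned in the abstract set out to change.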
|
123
|
Zhou X. A unified framework for variance component estimation with summary statistics in genome-wide association studies. Ann Appl Stat 2017; 11:2027-2051. [PMID: 29515717 PMCID: PMC5836736 DOI: 10.1214/17-aoas1052] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Indexed: 01/08/2023]
Abstract
Linear mixed models (LMMs) are among the most commonly used tools for genetic association studies. However, the standard method for estimating variance components in LMMs, the restricted maximum likelihood estimation method (REML), suffers from several important drawbacks: REML requires individual-level genotypes and phenotypes from all samples in the study, is computationally slow, and produces downward-biased estimates in case-control studies. To remedy these drawbacks, we present an alternative framework for variance component estimation, which we refer to as MQS. MQS is based on the method of moments (MoM) and the minimal norm quadratic unbiased estimation (MINQUE) criterion, and brings two seemingly unrelated methods, the renowned Haseman-Elston (HE) regression and the recent LD score regression (LDSC), into the same unified statistical framework. With this new framework, we provide an alternative but mathematically equivalent form of HE that allows for the use of summary statistics. We provide an exact estimation form of LDSC to yield unbiased and statistically more efficient estimates. A key feature of our method is its ability to pair marginal z-scores computed using all samples with SNP correlation information computed using a small random subset of individuals (or individuals from a proper reference panel), while remaining capable of producing estimates that can be almost as accurate as if both quantities were computed using the full data. As a result, our method produces unbiased and statistically efficient estimates, and makes use of summary statistics, while it is computationally efficient for large data sets. Using simulations and applications to 37 phenotypes from 8 real data sets, we illustrate the benefits of our method for estimating and partitioning SNP heritability in population studies as well as for heritability estimation in family studies. Our method is implemented in the GEMMA software package, freely available at www.xzlab.org/software.html.
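Haseman-Elston regression, one of the two methods the abstract brings under a common framework, is a method-of-moments estimator: products of standardised phenotypes for pairs of individuals are regressed on their genomic relatedness, and the slope estimates heritability. The sketch below is a textbook-style, through-the-origin variant on individual-level data. It is not the MQS estimator and not the GEMMA implementation; the function name `he_regression` and the toy inputs are assumptions.

```python
from typing import List


def he_regression(y: List[float], K: List[List[float]]) -> float:
    """Slope of the through-origin regression of z_i * z_j on K[i][j]
    over all pairs i < j, where z is the standardised phenotype."""
    n = len(y)
    mean = sum(y) / n
    sd = (sum((v - mean) ** 2 for v in y) / n) ** 0.5
    if sd == 0.0:
        raise ValueError("phenotype has no variance")
    z = [(v - mean) / sd for v in y]

    num = 0.0  # sum of K_ij * z_i * z_j over pairs
    den = 0.0  # sum of K_ij ** 2 over pairs
    for i in range(n):
        for j in range(i + 1, n):
            num += K[i][j] * z[i] * z[j]
            den += K[i][j] ** 2
    if den == 0.0:
        raise ValueError("no off-diagonal relatedness to regress on")
    return num / den


# Toy inputs (made up, and far too small for a meaningful estimate).
y = [1.2, -0.4, 0.3, -1.1]
K = [
    [1.0, 0.0, 0.5, -0.1],
    [0.0, 1.0, 0.0, 0.5],
    [0.5, 0.0, 1.0, 0.0],
    [-0.1, 0.5, 0.0, 1.0],
]
h2_hat = he_regression(y, K)
```

Because it is a moment estimator, the result is not constrained to lie between 0 and 1, and with a handful of individuals, as in this toy input, it can fall well outside that range. Since only sums over pairs are needed, no likelihood has to be maximised, which is part of why moment-based approaches scale to large data sets.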
|
124
|
Noble LM, Chelo I, Guzella T, Afonso B, Riccardi DD, Ammerman P, Dayarian A, Carvalho S, Crist A, Pino-Querido A, Shraiman B, Rockman MV, Teotónio H. Polygenicity and Epistasis Underlie Fitness-Proximal Traits in the Caenorhabditis elegans Multiparental Experimental Evolution (CeMEE) Panel. Genetics 2017; 207:1663-1685. [PMID: 29066469 PMCID: PMC5714472 DOI: 10.1534/genetics.117.300406] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 04/04/2017] [Accepted: 10/10/2017] [Indexed: 01/27/2023] Open
Abstract
Understanding the genetic basis of complex traits remains a major challenge in biology. Polygenicity, phenotypic plasticity, and epistasis contribute to phenotypic variance in ways that are rarely clear. This uncertainty can be problematic for estimating heritability, for predicting individual phenotypes from genomic data, and for parameterizing models of phenotypic evolution. Here, we report an advanced recombinant inbred line (RIL) quantitative trait locus mapping panel for the hermaphroditic nematode Caenorhabditis elegans, the C. elegans multiparental experimental evolution (CeMEE) panel. The CeMEE panel, comprising 507 RILs at present, was created by hybridization of 16 wild isolates, experimental evolution for 140-190 generations, and inbreeding by selfing for 13-16 generations. The panel contains 22% of single-nucleotide polymorphisms known to segregate in natural populations, and complements existing C. elegans mapping resources by providing fine resolution and high nucleotide diversity across > 95% of the genome. We apply it to study the genetic basis of two fitness components, fertility and hermaphrodite body size at time of reproduction, with high broad-sense heritability in the CeMEE. While simulations show that we should detect common alleles with additive effects as small as 5%, at gene-level resolution, the genetic architectures of these traits do not feature such alleles. We instead find that a significant fraction of trait variance, approaching 40% for fertility, can be explained by sign epistasis with main effects below the detection limit. In congruence, phenotype prediction from genomic similarity, while generally poor ([Formula: see text]), requires modeling epistasis for optimal accuracy, with most variance attributed to the rapidly evolving chromosome arms.
Affiliation(s)
- Luke M Noble
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York 10003
- Ivo Chelo
  - Instituto Gulbenkian de Ciência, P-2781-901 Oeiras, Portugal
- Thiago Guzella
  - Institut de Biologie, École Normale Supérieure, Centre National de la Recherche Scientifique (CNRS) UMR 8197, Institut National de la Santé et de la Recherche Médicale (INSERM) U1024, F-75005 Paris, France
- Bruno Afonso
  - Instituto Gulbenkian de Ciência, P-2781-901 Oeiras, Portugal
  - Institut de Biologie, École Normale Supérieure, Centre National de la Recherche Scientifique (CNRS) UMR 8197, Institut National de la Santé et de la Recherche Médicale (INSERM) U1024, F-75005 Paris, France
- David D Riccardi
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York 10003
- Patrick Ammerman
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York 10003
- Adel Dayarian
  - Kavli Institute for Theoretical Physics, University of California, Santa Barbara, California 93106
- Sara Carvalho
  - Instituto Gulbenkian de Ciência, P-2781-901 Oeiras, Portugal
- Anna Crist
  - Institut de Biologie, École Normale Supérieure, Centre National de la Recherche Scientifique (CNRS) UMR 8197, Institut National de la Santé et de la Recherche Médicale (INSERM) U1024, F-75005 Paris, France
- Boris Shraiman
  - Kavli Institute for Theoretical Physics, University of California, Santa Barbara, California 93106
  - Department of Physics, University of California, Santa Barbara, California 93106
- Matthew V Rockman
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York 10003
- Henrique Teotónio
  - Institut de Biologie, École Normale Supérieure, Centre National de la Recherche Scientifique (CNRS) UMR 8197, Institut National de la Santé et de la Recherche Médicale (INSERM) U1024, F-75005 Paris, France
|
125
|
Discrimination of relationships with the same degree of kinship using chromosomal sharing patterns estimated from high-density SNPs. Forensic Sci Int Genet 2017; 33:10-16. [PMID: 29172066 DOI: 10.1016/j.fsigen.2017.11.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 09/11/2017] [Revised: 11/07/2017] [Accepted: 11/14/2017] [Indexed: 11/21/2022]
Abstract
Distinguishing relationships with the same degree of kinship (e.g., uncle-nephew and grandfather-grandson) is generally difficult in forensic genetics by using the commonly employed short tandem repeat loci. In this study, we developed a new method for discerning such relationships between two individuals by examining the number of chromosomal shared segments estimated from high-density single nucleotide polymorphisms (SNPs). We computationally generated second-degree kinships (i.e., uncle-nephew and grandfather-grandson) and third-degree kinships (i.e., first cousins and great-grandfather-great-grandson) for 174,254 autosomal SNPs considering the effect of linkage disequilibrium and recombination for each SNP. We investigated shared chromosomal segments between two individuals that were estimated based on identity by state regions. We then counted the number of segments in each pair. Based on our results, the number of shared chromosomal segments in collateral relationships was larger than that in lineal relationships with both the second-degree and third-degree kinships. This was probably caused by differences involving chromosomal transitions and recombination between relationships. As we probabilistically evaluated the relationships between simulated pairs based on the number of shared segments using logistic regression, we could determine accurate relationships in >90% of second-degree relatives and >70% of third-degree relatives, using a probability criterion for the relationship ≥0.9. Furthermore, we could judge the true relationships of actual sample pairs from volunteers, as well as simulated data. Therefore, this method can be useful for discerning relationships between two individuals with the same degree of kinship.
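The segment-counting step described in the abstract above can be sketched roughly as follows. This is only an illustration: the abstract does not state how segments were delimited, so the rule used here (a segment is a run of consecutive SNPs at which the two individuals share at least one allele identical by state) and the `min_snps` cutoff are assumptions, and the function name is hypothetical.

```python
def count_shared_segments(geno_a, geno_b, min_snps):
    """Count runs of consecutive SNPs where two individuals are IBS >= 1.

    geno_a, geno_b: genotype dosages (0, 1 or 2) along one chromosome,
    in map order. Two opposite homozygotes (0 vs 2) share no allele
    identical by state and therefore break a segment.
    min_snps: hypothetical minimum run length for a segment to be counted.
    """
    segments = 0
    run = 0
    for a, b in zip(geno_a, geno_b):
        if abs(a - b) < 2:      # at least one allele shared by state
            run += 1
        else:                   # opposite homozygotes end the current run
            if run >= min_snps:
                segments += 1
            run = 0
    if run >= min_snps:         # close a run that reaches the chromosome end
        segments += 1
    return segments


# Toy example (hypothetical genotypes).
toy_a = [0, 0, 1, 2, 2, 2, 0]
toy_b = [0, 1, 1, 0, 2, 1, 0]
toy_segments = count_shared_segments(toy_a, toy_b, min_snps=3)
```

Per-pair counts of this kind could then serve as the predictor in a logistic regression that separates collateral from lineal relationships, which is the classification step the abstract reports.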
|
126
|
Wientjes YCJ, Bijma P, Vandenplas J, Calus MPL. Multi-population Genomic Relationships for Estimating Current Genetic Variances Within and Genetic Correlations Between Populations. Genetics 2017; 207:503-515. [PMID: 28821589 PMCID: PMC5629319 DOI: 10.1534/genetics.117.300152] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 03/17/2017] [Accepted: 08/15/2017] [Indexed: 01/19/2023] Open
Abstract
Different methods are available to calculate multi-population genomic relationship matrices. Since those matrices differ in base population, it is anticipated that the method used to calculate genomic relationships affects the estimate of genetic variances, covariances, and correlations. The aim of this article is to define the multi-population genomic relationship matrix to estimate current genetic variances within and genetic correlations between populations. The genomic relationship matrix containing two populations consists of four blocks, one block for population 1, one block for population 2, and two blocks for relationships between the populations. It is known, based on literature, that by using current allele frequencies to calculate genomic relationships within a population, current genetic variances are estimated. In this article, we theoretically derived the properties of the genomic relationship matrix to estimate genetic correlations between populations and validated it using simulations. When the scaling factor of across-population genomic relationships is equal to the product of the square roots of the scaling factors for within-population genomic relationships, the genetic correlation is estimated unbiasedly even though estimated genetic variances do not necessarily refer to the current population. When this property is not met, the correlation based on estimated variances should be multiplied by a correction factor based on the scaling factors. In this study, we present a genomic relationship matrix which directly estimates current genetic variances as well as genetic correlations between populations.
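The scaling property stated in the abstract above (across-population scaling factor equal to the product of the square roots of the two within-population scaling factors) can be made concrete with a short sketch. It assumes a common construction that the abstract does not spell out: genotypes are centered by population-specific allele frequencies and each within-population block is divided by the sum of 2p(1-p) over SNPs. All function names are hypothetical.

```python
from math import sqrt


def scaling_factor(freqs):
    """Sum of 2p(1-p) over SNPs (assumed within-population scaling)."""
    return sum(2.0 * p * (1.0 - p) for p in freqs)


def centered(genotypes, freqs):
    """Center 0/1/2 dosages by twice the population allele frequency."""
    return [[g - 2.0 * p for g, p in zip(row, freqs)] for row in genotypes]


def relationship_block(Za, Zb, scale):
    """Za Zb' / scale for two centered genotype matrices."""
    return [[sum(a * b for a, b in zip(ra, rb)) / scale for rb in Zb]
            for ra in Za]


def multi_population_grm(geno1, freqs1, geno2, freqs2):
    """Return the within- and across-population blocks of the matrix."""
    s1, s2 = scaling_factor(freqs1), scaling_factor(freqs2)
    Z1, Z2 = centered(geno1, freqs1), centered(geno2, freqs2)
    return {
        "G11": relationship_block(Z1, Z1, s1),
        "G22": relationship_block(Z2, Z2, s2),
        # Across-population block scaled by sqrt(s1) * sqrt(s2).
        "G12": relationship_block(Z1, Z2, sqrt(s1 * s2)),
    }


# Toy example (hypothetical genotypes and allele frequencies).
toy_blocks = multi_population_grm(
    [[0, 2], [2, 0]], [0.5, 0.5],   # population 1: two individuals
    [[2, 2]], [0.5, 0.5],           # population 2: one individual
)
```

The fourth block of the full matrix is the transpose of `G12`, so it is not stored separately here.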
Affiliation(s)
- Yvonne C J Wientjes
  - Wageningen University and Research, Animal Breeding and Genomics, 6700 AH Wageningen, The Netherlands
- Piter Bijma
  - Wageningen University and Research, Animal Breeding and Genomics, 6700 AH Wageningen, The Netherlands
- Jérémie Vandenplas
  - Wageningen University and Research, Animal Breeding and Genomics, 6700 AH Wageningen, The Netherlands
- Mario P L Calus
  - Wageningen University and Research, Animal Breeding and Genomics, 6700 AH Wageningen, The Netherlands
|
127
|
Dou J, Sun B, Sim X, Hughes JD, Reilly DF, Tai ES, Liu J, Wang C. Estimation of kinship coefficient in structured and admixed populations using sparse sequencing data. PLoS Genet 2017; 13:e1007021. [PMID: 28961250 PMCID: PMC5636172 DOI: 10.1371/journal.pgen.1007021] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 06/15/2017] [Revised: 10/11/2017] [Accepted: 09/14/2017] [Indexed: 12/15/2022] Open
Abstract
Knowledge of biological relatedness between samples is important for many genetic studies. In large-scale human genetic association studies, the estimated kinship is used to remove cryptic relatedness, control for family structure, and estimate trait heritability. However, estimation of kinship is challenging for sparse sequencing data, such as those from off-target regions in target sequencing studies, where genotypes are largely uncertain or missing. Existing methods often assume accurate genotypes at a large number of markers across the genome. We show that these methods, without accounting for the genotype uncertainty in sparse sequencing data, can yield a strong downward bias in kinship estimation. We develop a computationally efficient method called SEEKIN to estimate kinship for both homogeneous samples and heterogeneous samples with population structure and admixture. Our method models genotype uncertainty and leverages linkage disequilibrium through imputation. We test SEEKIN on a whole exome sequencing dataset (WES) of Singapore Chinese and Malays, which involves substantial population structure and admixture. We show that SEEKIN can accurately estimate kinship coefficient and classify genetic relatedness using off-target sequencing data down sampled to ~0.15X depth. In application to the full WES dataset without down sampling, SEEKIN also outperforms existing methods by properly analyzing shallow off-target data (~0.75X). Using both simulated and real phenotypes, we further illustrate how our method improves estimation of trait heritability for WES studies.
Inference of genetic relatedness from molecular markers has broad applications in many areas, including quantitative genetics, forensics, evolution and ecology. Classic estimators, however, are not suitable for low-coverage sequencing data, which have high levels of genotype uncertainty and missing data. We evaluate existing methods and describe a new method for kinship estimation using sparse sequencing data. Our method leverages correlations between neighboring markers and models genotype uncertainty in kinship estimators for both homogeneous populations and admixed populations. We show that our method can accurately estimate kinship coefficient even when the sequencing depth is as low as ~0.15X, while existing methods have strong downward bias. Our method can be applied to estimate kinship using sparse off-target data and thus enables control of family structure and estimation of heritability in target sequencing studies, in which the deeply sequenced target regions are often too small to infer genetic relatedness. Even for whole exome sequencing, we show that our method can improve kinship and heritability estimation by including off-target data, compared to conventional analyses solely based on the target regions.
Affiliation(s)
- Jinzhuang Dou
  - Computational and Systems Biology, Genome Institute of Singapore, Singapore, Singapore
- Baoluo Sun
  - Computational and Systems Biology, Genome Institute of Singapore, Singapore, Singapore
- Xueling Sim
  - Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
- Jason D. Hughes
  - Genetics, Merck Sharp & Dohme Corp., Kenilworth, New Jersey, United States of America
- Dermot F. Reilly
  - Genetics, Merck Sharp & Dohme Corp., Kenilworth, New Jersey, United States of America
- E. Shyong Tai
  - Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
  - Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Jianjun Liu
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
  - Human Genetics, Genome Institute of Singapore, Singapore, Singapore
- Chaolong Wang
  - Computational and Systems Biology, Genome Institute of Singapore, Singapore, Singapore
  - Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
|
128
|
Kang JTL, Goldberg A, Edge MD, Behar DM, Rosenberg NA. Consanguinity Rates Predict Long Runs of Homozygosity in Jewish Populations. Hum Hered 2017; 82:87-102. [PMID: 28910803 DOI: 10.1159/000478897] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 01/04/2017] [Accepted: 06/14/2017] [Indexed: 11/19/2022] Open
Abstract
OBJECTIVES: Recent studies have highlighted the potential of analyses of genomic sharing to produce insight into the demographic processes affecting human populations. We study runs of homozygosity (ROH) in 18 Jewish populations, examining these groups in relation to 123 non-Jewish populations sampled worldwide.
METHODS: By sorting ROH into 3 length classes (short, intermediate, and long), we evaluate the impact of demographic processes on genomic patterns in Jewish populations.
RESULTS: We find that the portion of the genome appearing in long ROH, the length class most directly related to recent consanguinity, closely accords with data gathered from interviews during the 1950s on frequencies of consanguineous unions in various Jewish groups.
CONCLUSION: The high correlation between 1950s consanguinity levels and coverage by long ROH explains differences across populations in ROH patterns. The dissection of ROH into length classes and the comparison to consanguinity data assist in understanding a number of additional phenomena, including similarities of Jewish populations to Middle Eastern, European, and Central and South Asian non-Jewish populations in short ROH patterns, relative lengths of identity-by-descent tracts in different Jewish groups, and the "population isolate" status of the Ashkenazi Jews.
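The idea of finding ROH and sorting them into three length classes can be sketched as below. This is a deliberately simplified illustration: real ROH callers typically tolerate occasional heterozygous or missing calls, and the abstract does not give the class boundaries, so `min_snps`, `short_max` and `long_min` are hypothetical parameters.

```python
def runs_of_homozygosity(positions, genotypes, min_snps=3):
    """Return (start, end) positions of uninterrupted homozygous runs.

    positions: SNP coordinates in base pairs, sorted along one chromosome.
    genotypes: dosages 0/1/2, where 1 denotes a heterozygous call.
    min_snps: hypothetical minimum number of SNPs for a run to be kept.
    """
    runs = []
    start = None
    last = None
    count = 0
    for pos, g in zip(positions, genotypes):
        if g != 1:                      # homozygous call extends the run
            if start is None:
                start = pos
                count = 0
            count += 1
            last = pos
        else:                           # heterozygous call ends the run
            if start is not None and count >= min_snps:
                runs.append((start, last))
            start = None
    if start is not None and count >= min_snps:
        runs.append((start, last))      # run reaching the chromosome end
    return runs


def length_class(run, short_max, long_min):
    """Assign a run to 'short', 'intermediate' or 'long' by its length in bp."""
    length = run[1] - run[0]
    if length < short_max:
        return "short"
    if length >= long_min:
        return "long"
    return "intermediate"


# Toy example (hypothetical coordinates and thresholds).
toy_runs = runs_of_homozygosity(
    [100, 200, 300, 400, 500, 600, 700, 800],
    [0, 0, 2, 1, 2, 2, 2, 2],
)
toy_classes = [length_class(r, short_max=250, long_min=300) for r in toy_runs]
```

Summing the lengths of runs in the "long" class per individual gives the kind of genome-coverage quantity that the abstract relates to consanguinity rates.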
|
129
|
The Human Salivary Microbiome Is Shaped by Shared Environment Rather than Genetics: Evidence from a Large Family of Closely Related Individuals. mBio 2017; 8:mBio.01237-17. [PMID: 28900019 PMCID: PMC5596345 DOI: 10.1128/mbio.01237-17] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Indexed: 01/19/2023] Open
Abstract
The human microbiome is affected by multiple factors, including the environment and host genetics. In this study, we analyzed the salivary microbiomes of an extended family of Ashkenazi Jewish individuals living in several cities and investigated associations with both shared household and host genetic similarities. We found that environmental effects dominated over genetic effects. While there was weak evidence of geographical structuring at the level of cities, we observed a large and significant effect of shared household on microbiome composition, supporting the role of the immediate shared environment in dictating the presence or absence of taxa. This effect was also seen when including adults who had grown up in the same household but moved out prior to the time of sampling, suggesting that the establishment of the salivary microbiome earlier in life may affect its long-term composition. We found weak associations between host genetic relatedness and microbiome dissimilarity when using family pedigrees as proxies for genetic similarity. However, this association disappeared when using more-accurate measures of kinship based on genome-wide genetic markers, indicating that the environment rather than host genetics is the dominant factor affecting the composition of the salivary microbiome in closely related individuals. Our results support the concept that there is a consistent core microbiome conserved across global scales but that small-scale effects due to a shared living environment significantly affect microbial community composition.
Previous research shows that the salivary microbiomes of relatives are more similar than those of nonrelatives, but it remains difficult to distinguish the effects of relatedness and shared household environment. Furthermore, pedigree measures may not accurately measure host genetic similarity. In this study, we include genetic relatedness based on genome-wide single nucleotide polymorphisms (SNPs) (rather than pedigree measures) and shared environment in the same analysis. We quantify the relative importance of these factors by studying the salivary microbiomes in members of a large extended Ashkenazi Jewish family living in different locations. We find that host genetics plays no significant role and that the dominant factor is the shared environment at the household level. We also find that this effect appears to persist in individuals who have moved out of the parental household, suggesting that aspects of salivary microbiome composition established during upbringing can persist over a time scale of years.
|
130
|
Ramstetter MD, Dyer TD, Lehman DM, Curran JE, Duggirala R, Blangero J, Mezey JG, Williams AL. Benchmarking Relatedness Inference Methods with Genome-Wide Data from Thousands of Relatives. Genetics 2017; 207:75-82. [PMID: 28739658 PMCID: PMC5586387 DOI: 10.1534/genetics.117.1122] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 02/04/2017] [Accepted: 07/08/2017] [Indexed: 01/03/2023] Open
Abstract
Inferring relatedness from genomic data is an essential component of genetic association studies, population genetics, forensics, and genealogy. While numerous methods exist for inferring relatedness, thorough evaluation of these approaches in real data has been lacking. Here, we report an assessment of 12 state-of-the-art pairwise relatedness inference methods using a data set with 2485 individuals contained in several large pedigrees that span up to six generations. We find that all methods have high accuracy (92-99%) when detecting first- and second-degree relationships, but their accuracy dwindles to <43% for seventh-degree relationships. However, most identical by descent (IBD) segment-based methods inferred seventh-degree relatives correct to within one relatedness degree for >76% of relative pairs. Overall, the most accurate methods are Estimation of Recent Shared Ancestry (ERSA) and approaches that compute total IBD sharing using the output from GERMLINE and Refined IBD to infer relatedness. Combining information from the most accurate methods provides little accuracy improvement, indicating that novel approaches, such as new methods that leverage relatedness signals from multiple samples, are needed to achieve a sizeable jump in performance.
Affiliation(s)
- Monica D Ramstetter
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853
- Thomas D Dyer
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, Texas 78520
- Donna M Lehman
  - Department of Medicine, University of Texas Health San Antonio, San Antonio, Texas 78229
- Joanne E Curran
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, Texas 78520
- Ravindranath Duggirala
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, Texas 78520
- John Blangero
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, Texas 78520
- Jason G Mezey
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853
  - Department of Genetic Medicine, Weill Cornell Medicine, New York, New York 10065
- Amy L Williams
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853
|
131
|
Genetic variations of HvP5CS1 and their association with drought tolerance related traits in barley (Hordeum vulgare L.). Sci Rep 2017; 7:7870. [PMID: 28801593 PMCID: PMC5554244 DOI: 10.1038/s41598-017-08393-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 01/27/2017] [Accepted: 07/10/2017] [Indexed: 11/21/2022] Open
Abstract
Delta-1-pyrroline-5-carboxylate synthase gene 1 (P5CS1) is the key gene involved in the biosynthesis of proline and is significantly induced by drought stress. The exploration of genetic variation in HvP5CS1 may facilitate a better understanding of the mechanism of drought adaptation in barley. In the current study, 41 polymorphisms including 16 single nucleotide polymorphisms (SNPs) and 25 insertions/deletions (indels) were detected in HvP5CS1 among 287 barley (Hordeum vulgare L.) accessions collected worldwide, with 13 distinct haplotypes identified in the barley collection. Five polymorphisms in HvP5CS1 were significantly (P < 0.001) associated with drought tolerance related traits in barley. The phenotypic variation of a given trait explained by each associated polymorphism ranged from 4.43% to 9.81%. Two sequence variations that were significantly (P < 0.0001) associated with grain yield had marginally significant positive Tajima’s D values in the sliding window, so they might have been selected for environmental adaptation. Meanwhile, two haplotypes, HvP5CS1_H1 and HvP5CS1_H4, which contained desired alleles of the two variations mentioned above, were significantly (P < 0.001) associated with drought tolerance related traits, and explained 5.00% to 11.89% of the phenotypic variations. These variations associated with drought tolerance related traits can be used as potential markers for improving drought tolerance in barley.
|
132
|
Weir BS, Goudet J. A Unified Characterization of Population Structure and Relatedness. Genetics 2017; 206:2085-2103. [PMID: 28550018 PMCID: PMC5560808 DOI: 10.1534/genetics.116.198424] [Citation(s) in RCA: 90] [Impact Index Per Article: 11.3] [Received: 11/17/2016] [Accepted: 05/17/2017] [Indexed: 11/18/2022] Open
Abstract
Many population genetic activities, ranging from evolutionary studies to association mapping, to forensic identification, rely on appropriate estimates of population structure or relatedness. All applications require recognition that quantities with an underlying meaning of allelic dependence are not defined in an absolute sense, but instead are made "relative to" some set of alleles other than the target set. The 1984 Weir and Cockerham [Formula: see text] estimate made explicit that the reference set of alleles was across populations, whereas standard kinship estimates do not make the reference explicit. Weir and Cockerham stated that their [Formula: see text] estimates were for independent populations, and standard kinship estimates have an implicit assumption that pairs of individuals in a study sample, other than the target pair, are unrelated or are not inbred. However, populations lose independence when there is migration between them, and dependencies between pairs of individuals in a population exist for more than one target pair. We have therefore recast our treatments of population structure, relatedness, and inbreeding to make explicit that the parameters of interest involve the differences in degrees of allelic dependence between the target and the reference sets of alleles, and so can be negative. We take the reference set to be the population from which study individuals have been sampled. We provide simple moment estimates of these parameters, phrased in terms of allelic matching within and between individuals for relatedness and inbreeding, or within and between populations for population structure. A multi-level hierarchy of alleles within individuals, alleles between individuals within populations, and alleles between populations, allows a unified treatment of relatedness and population structure. 
We expect our new measures to have a wide range of applications, but we note that their estimates are sensitive to rare or private variants: some population-characterization applications suggest exploiting those sensitivities, whereas estimation of relatedness may best use all genetic markers without filtering on minor allele frequency.
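The moment estimates "phrased in terms of allelic matching" can be illustrated with a short sketch for pairwise relatedness. It assumes the allele-sharing form in which the matching proportion between two individuals at a SNP with dosages x_i and x_j is [1 + (x_i - 1)(x_j - 1)] / 2, and relatedness is expressed relative to the average matching over all pairs in the sample. Function names and the toy genotypes are hypothetical, and this is an illustration rather than the authors' software.

```python
from itertools import combinations


def allele_sharing(x_i, x_j):
    """Mean allelic matching between two individuals across SNPs.

    x_i, x_j: dosages (0, 1 or 2) at the same SNPs. Per SNP, matching is
    1 for identical homozygotes, 0 for opposite homozygotes, 1/2 otherwise.
    """
    per_snp = [(1.0 + (a - 1) * (b - 1)) / 2.0 for a, b in zip(x_i, x_j)]
    return sum(per_snp) / len(per_snp)


def relative_kinship(genotypes):
    """Kinship of each pair relative to the sample-wide average matching.

    Returns {(i, j): (M_ij - M_S) / (1 - M_S)} for all i < j, where M_S is
    the mean of M_ij over pairs. Values can be negative by construction.
    """
    pairs = list(combinations(range(len(genotypes)), 2))
    match = {(i, j): allele_sharing(genotypes[i], genotypes[j])
             for i, j in pairs}
    m_s = sum(match.values()) / len(pairs)
    if m_s == 1.0:
        raise ValueError("no variation in allelic matching across pairs")
    return {pair: (m - m_s) / (1.0 - m_s) for pair, m in match.items()}


# Toy example (hypothetical genotypes): two identical individuals, one opposite.
toy_kinship = relative_kinship([[0, 0], [0, 0], [2, 2]])
```

In the toy sample the pairwise values sum to zero, which reflects the point made in the abstract: the reference set is the sample itself, so pairs less alike than average receive negative estimates.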
Affiliation(s)
- Bruce S Weir
  - Department of Biostatistics, University of Washington, Seattle, Washington 98195
- Jérôme Goudet
  - Department of Ecology and Evolution
  - Swiss Institute of Bioinformatics, University of Lausanne, 1015 Switzerland
|
133
|
Garin V, Wimmer V, Mezmouk S, Malosetti M, van Eeuwijk F. How do the type of QTL effect and the form of the residual term influence QTL detection in multi-parent populations? A case study in the maize EU-NAM population. Theor Appl Genet 2017; 130:1753-1764. [PMID: 28547012 PMCID: PMC5511610 DOI: 10.1007/s00122-017-2923-3] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 11/17/2016] [Accepted: 05/11/2017] [Indexed: 05/25/2023]
Abstract
In the QTL analysis of multi-parent populations, the inclusion of QTLs with various types of effects can lead to a better description of the phenotypic variation and increased power. For the type of QTL effect in QTL models for multi-parent populations (MPPs), various options exist to define them with respect to their origin. They can be modelled as referring to close parental lines or to further away ancestral founder lines. QTL models for MPPs can also be characterized by the homo- or heterogeneity of variance for polygenic effects. The most suitable model for the origin of the QTL effect and the homo- or heterogeneity of polygenic effects may be a function of the genetic distance distribution between the parents of MPPs. We investigated the statistical properties of various QTL detection models for MPPs taking into account the genetic distances between the parents of the MPP. We evaluated models with different assumptions about the QTL effect and the form of the residual term using cross validation. For the EU-NAM data, we showed that it can be useful to mix in the same model QTLs with different types of effects (parental, ancestral, or bi-allelic). The benefit of using cross-specific residual terms to handle the heterogeneity of variance was less obvious for this particular data set.
Affiliation(s)
- Vincent Garin
  - Biometris, Wageningen University and Research Center, P.O Box 100, 6700 AC, Wageningen, The Netherlands
- Marcos Malosetti
  - Biometris, Wageningen University and Research Center, P.O Box 100, 6700 AC, Wageningen, The Netherlands
- Fred van Eeuwijk
  - Biometris, Wageningen University and Research Center, P.O Box 100, 6700 AC, Wageningen, The Netherlands
|
134
|
Martin MD, Jay F, Castellano S, Slatkin M. Determination of genetic relatedness from low-coverage human genome sequences using pedigree simulations. Mol Ecol 2017; 26:4145-4157. [PMID: 28543951 DOI: 10.1111/mec.14188] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 11/10/2015] [Accepted: 05/05/2017] [Indexed: 02/01/2023]
Abstract
We develop and evaluate methods for inferring relatedness among individuals from low-coverage DNA sequences of their genomes, with particular emphasis on sequences obtained from fossil remains. We suggest the major factors complicating the determination of relatedness among ancient individuals are sequencing depth, the number of overlapping sites, the sequencing error rate and the presence of contamination from present-day genetic sources. We develop a theoretical model that facilitates the exploration of these factors and their relative effects, via measurement of pairwise genetic distances, without calling genotypes, and determine the power to infer relatedness under various scenarios of varying sequencing depth, present-day contamination and sequencing error. The model is validated by a simulation study as well as the analysis of aligned sequences from present-day human genomes. We then apply the method to the recently published genome sequences of ancient Europeans, developing a statistical treatment to determine confidence in assigned relatedness that is, in some cases, more precise than previously reported. As the majority of ancient specimens are from animals, this method would be applicable to investigate kinship in nonhuman remains. The developed software grups (Genetic Relatedness Using Pedigree Simulations) is implemented in Python and freely available.
Affiliation(s)
- Michael D Martin
  - Department of Natural History, NTNU University Museum, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
  - Center for Theoretical Evolutionary Genomics, Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA
- Flora Jay
  - Center for Theoretical Evolutionary Genomics, Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA
  - Laboratoire de Recherche en Informatique, CNRS UMR 8623, Université Paris-Sud, Paris-Saclay, France
- Sergi Castellano
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Montgomery Slatkin
  - Center for Theoretical Evolutionary Genomics, Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA
|
135
|
Speed D, Cai N, Johnson MR, Nejentsev S, Balding DJ. Reevaluation of SNP heritability in complex human traits. Nat Genet 2017; 49:986-992. [PMID: 28530675 PMCID: PMC5493198 DOI: 10.1038/ng.3865] [Citation(s) in RCA: 256] [Impact Index Per Article: 32.0] [Received: 09/01/2016] [Accepted: 04/18/2017] [Indexed: 12/15/2022]
Abstract
SNP heritability, the proportion of phenotypic variance explained by SNPs, has been reported for many hundreds of traits. Its estimation requires strong prior assumptions about the distribution of heritability across the genome, but the assumptions in current use have not been thoroughly tested. By analyzing imputed data for a large number of human traits, we empirically derive a model that more accurately describes how heritability varies with minor allele frequency, linkage disequilibrium and genotype certainty. Across 19 traits, our improved model leads to estimates of common SNP heritability on average 43% (standard deviation 3) higher than those obtained from the widely-used software GCTA, and 25% (standard deviation 2) higher than those from the recently-proposed extension GCTA-LDMS. Previously, DNaseI hypersensitivity sites were reported to explain 79% of SNP heritability; using our improved heritability model their estimated contribution is only 24%.
Affiliation(s)
- Doug Speed
- UCL Genetics Institute, University College London, London, UK
- Na Cai
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
- David J Balding
- UCL Genetics Institute, University College London, London, UK; Centre for Systems Genomics, School of BioSciences, and School of Mathematics and Statistics, University of Melbourne, Melbourne, Victoria, Australia
|
136
|
Fernando RL, Cheng H, Sun X, Garrick DJ. A comparison of identity-by-descent and identity-by-state matrices that are used for genetic evaluation and estimation of variance components. J Anim Breed Genet 2017; 134:213-223. [DOI: 10.1111/jbg.12275] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 12/19/2016] [Accepted: 03/26/2017] [Indexed: 02/02/2023]
Affiliation(s)
- R. L. Fernando
- Department of Animal Science; Iowa State University; Ames IA USA
- H. Cheng
- Department of Animal Science; Iowa State University; Ames IA USA
- X. Sun
- Department of Animal Science; Iowa State University; Ames IA USA
- D. J. Garrick
- Department of Animal Science; Iowa State University; Ames IA USA
- Institute of Veterinary, Animal and Biomedical Sciences; Massey University; Palmerston North New Zealand
|
137
|
Al Abri MA, König von Borstel U, Strecker V, Brooks SA. Application of Genomic Estimation Methods of Inbreeding and Population Structure in an Arabian Horse Herd. J Hered 2017; 108:361-368. [DOI: 10.1093/jhered/esx025] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 06/20/2016] [Accepted: 03/14/2017] [Indexed: 11/14/2022] Open
|
138
|
Willoughby JR, Ivy JA, Lacy RC, Doyle JM, DeWoody JA. Inbreeding and selection shape genomic diversity in captive populations: Implications for the conservation of endangered species. PLoS One 2017; 12:e0175996. [PMID: 28423000 PMCID: PMC5396937 DOI: 10.1371/journal.pone.0175996] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 11/22/2016] [Accepted: 04/04/2017] [Indexed: 12/01/2022] Open
Abstract
Captive breeding programs are often initiated to prevent species extinction until reintroduction into the wild can occur. However, the evolution of captive populations via inbreeding, drift, and selection can impair fitness, compromising reintroduction programs. To better understand the evolutionary response of species bred in captivity, we used nearly 5500 single nucleotide polymorphisms (SNPs) in populations of white-footed mice (Peromyscus leucopus) to measure the impact of breeding regimes on genomic diversity. We bred mice in captivity for 20 generations using two replicates of three protocols: random mating (RAN), selection for docile behaviors (DOC), and minimizing mean kinship (MK). The MK protocol most effectively retained genomic diversity and reduced the effects of selection. Additionally, genomic diversity was significantly related to fitness, as assessed with pedigrees and SNPs supported with genomic sequence data. Because captive-born individuals are often less fit in wild settings compared to wild-born individuals, captive-estimated fitness correlations likely underestimate the effects in wild populations. Therefore, minimizing inbreeding and selection in captive populations is critical to increasing the probability of releasing fit individuals into the wild.
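The idea behind a minimizing-mean-kinship protocol can be sketched as follows: compute each individual's mean kinship as the average of its row in a kinship matrix (including kinship with itself) and give breeding priority to the individuals with the lowest values. The helper names `mean_kinship` and `rank_breeders` and the three-animal matrix are hypothetical; this is not the breeding procedure used in the study, only a minimal illustration of the quantity being minimized.

```python
import numpy as np


def mean_kinship(kinship_matrix):
    """Average kinship of each individual to everyone in the population."""
    k = np.asarray(kinship_matrix, dtype=float)
    return k.mean(axis=1)


def rank_breeders(ids, kinship_matrix):
    """Return (id, mean kinship) pairs, lowest mean kinship first."""
    mk = mean_kinship(kinship_matrix)
    order = np.argsort(mk, kind="stable")
    return [(ids[i], float(mk[i])) for i in order]


if __name__ == "__main__":
    # Toy population: two non-inbred full sibs (S1, S2) and one unrelated animal (U).
    ids = ["S1", "S2", "U"]
    kinship = [
        [0.50, 0.25, 0.00],
        [0.25, 0.50, 0.00],
        [0.00, 0.00, 0.50],
    ]
    for name, mk in rank_breeders(ids, kinship):
        print(name, round(mk, 4))
```

Here the unrelated animal ranks first because it shares the least ancestry with the rest of the population, so breeding it retains more of the founder diversity.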
Affiliation(s)
- Janna R. Willoughby
- Department of Forestry and Natural Resources, Purdue University, West Lafayette, Indiana, United States of America
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
- Jamie A. Ivy
- San Diego Zoo Global Collections Department, San Diego, California, United States of America
- Robert C. Lacy
- Chicago Zoological Society, Brookfield, Illinois, United States of America
- Jacqueline M. Doyle
- Department of Biological Sciences, Towson University, Towson, Maryland, United States of America
- J. Andrew DeWoody
- Department of Forestry and Natural Resources, Purdue University, West Lafayette, Indiana, United States of America
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
|
139
|
Joint Estimation of Relatedness Coefficients and Allele Frequencies from Ancient Samples. Genetics 2017; 206:1025-1035. [PMID: 28396504 DOI: 10.1534/genetics.117.200600] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/27/2017] [Accepted: 04/03/2017] [Indexed: 11/18/2022] Open
Abstract
Here, we develop and test a method to address whether DNA samples sequenced from a group of fossil hominin bone or tooth fragments originate from the same individual or from closely related individuals. Our method assumes low amounts of retrievable DNA, significant levels of sequencing error, and contamination from one or more present-day humans. We develop and implement a maximum likelihood method that estimates levels of contamination, sequencing error rates, and pairwise relatedness coefficients in a set of individuals. We assume that there is no reference panel for the ancient population to provide allele and haplotype frequencies. Our approach makes use of single nucleotide polymorphisms (SNPs) and does not make assumptions about the underlying demographic model. By artificially mating genomes from the 1000 Genomes Project, we determine the numbers of individuals at a given genomic coverage that are required to detect different levels of genetic relatedness with confidence.
|
140
|
Ge T, Chen CY, Neale BM, Sabuncu MR, Smoller JW. Phenome-wide heritability analysis of the UK Biobank. PLoS Genet 2017; 13:e1006711. [PMID: 28388634 PMCID: PMC5400281 DOI: 10.1371/journal.pgen.1006711] [Citation(s) in RCA: 149] [Impact Index Per Article: 18.6] [Received: 11/22/2016] [Revised: 04/21/2017] [Accepted: 03/22/2017] [Indexed: 11/18/2022] Open
Abstract
Heritability estimation provides important information about the relative contribution of genetic and environmental factors to phenotypic variation, and provides an upper bound for the utility of genetic risk prediction models. Recent technological and statistical advances have enabled the estimation of additive heritability attributable to common genetic variants (SNP heritability) across a broad phenotypic spectrum. Here, we present a computationally and memory efficient heritability estimation method that can handle large sample sizes, and report the SNP heritability for 551 complex traits derived from the interim data release (152,736 subjects) of the large-scale, population-based UK Biobank, comprising both quantitative phenotypes and disease codes. We demonstrate that common genetic variation contributes to a broad array of quantitative traits and human diseases in the UK population, and identify phenotypes whose heritability is moderated by age (e.g., a majority of physical measures including height and body mass index), sex (e.g., blood pressure related traits) and socioeconomic status (education). Our study represents the first comprehensive phenome-wide heritability analysis in the UK Biobank, and underscores the importance of considering population characteristics in interpreting heritability. Heritability of a trait refers to the proportion of phenotypic variation that is due to genetic variation among individuals. It provides important information about the genetic basis of complex traits and indicates whether a phenotype is an appropriate target for more specific statistical and molecular genetic analyses. Recent studies have leveraged the increasingly ubiquitous genome-wide data and documented the heritability attributable to common genetic variation captured by genotyping microarrays for a wide range of human traits. 
However, heritability is not a fixed property of a phenotype and can vary with population-specific differences in the genetic background and environmental variation. Here, using a computationally and memory efficient heritability estimation method, we report the heritability for a large number of traits derived from the large-scale, population-based UK Biobank, and, for the first time, demonstrate the moderating effect of three major demographic variables (age, sex and socioeconomic status) on heritability estimates derived from genome-wide common genetic variation. Our study represents the first comprehensive heritability analysis across the phenotypic spectrum in the UK Biobank.
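To make the notion of SNP heritability concrete, the sketch below simulates a phenotype with a known heritability and recovers it with Haseman-Elston regression, a simple moment-based estimator that regresses phenotype cross-products on genomic relatedness. This is a swapped-in textbook technique shown under toy assumptions (independent SNPs, unrelated individuals, equal expected effect size per standardized SNP); it is not presented as the estimator used in the study above, and the function names and simulation parameters are hypothetical.

```python
import numpy as np


def simulate(n=1000, m=500, h2=0.5, seed=0):
    """Simulate standardized genotypes Z (n x m) and a phenotype y with heritability h2."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.9, size=m)              # allele frequencies
    genotypes = rng.binomial(2, freqs, size=(n, m)).astype(float)
    z = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    beta = rng.normal(0.0, np.sqrt(h2 / m), size=m)    # per-SNP effects
    noise = rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    return z, z @ beta + noise


def he_regression(z, y):
    """Haseman-Elston estimate: slope of y_i * y_j on relatedness A_ij over pairs i < j."""
    n, m = z.shape
    y = (y - y.mean()) / y.std()
    grm = z @ z.T / m                                  # genomic relationship matrix
    upper = np.triu_indices(n, k=1)                    # distinct pairs only
    a = grm[upper]
    products = np.outer(y, y)[upper]
    return float(a @ products / (a @ a))               # regression through the origin


if __name__ == "__main__":
    z, y = simulate()
    print("estimated SNP heritability:", round(he_regression(z, y), 3))
```

The estimate fluctuates around the simulated value of 0.5 from run to run; with real data, population structure, linkage disequilibrium and covariates would all need to be handled before such an estimate is interpretable.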
Affiliation(s)
- Tian Ge
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital / Harvard Medical School, Charlestown, MA, United States of America
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, United States of America
- Chia-Yen Chen
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, United States of America
- Analytic and Translational Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America
- Benjamin M. Neale
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, United States of America
- Analytic and Translational Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America
- Mert R. Sabuncu
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital / Harvard Medical School, Charlestown, MA, United States of America
- School of Electrical and Computer Engineering and Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, United States of America
- Jordan W. Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, United States of America
|
141
|
Estimating Seven Coefficients of Pairwise Relatedness Using Population-Genomic Data. Genetics 2017; 206:105-118. [PMID: 28341647 DOI: 10.1534/genetics.116.190660] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 04/19/2016] [Accepted: 02/22/2017] [Indexed: 02/01/2023] Open
Abstract
Population structure can be described by genotypic-correlation coefficients between groups of individuals, the most basic of which are the pairwise relatedness coefficients between any two individuals. There are nine pairwise relatedness coefficients in the most general model, and we show that these can be reduced to seven coefficients for biallelic loci. Although all nine coefficients can be estimated from pedigrees, six coefficients have been beyond empirical reach. We provide a numerical optimization procedure that estimates all seven reduced coefficients from population-genomic data. Simulations show that the procedure is nearly unbiased, even at 3× coverage, and errors in five of the seven coefficients are statistically uncorrelated. The remaining two coefficients have a negative correlation of errors, but their sum provides an unbiased assessment of the overall correlation of heterozygosity between two individuals. Application of these new methods to four populations of the freshwater crustacean Daphnia pulex reveals the occurrence of half siblings in our samples, as well as a number of identical individuals that are likely obligately asexual clone mates. Statistically significant negative estimates of these pairwise relatedness coefficients, including inbreeding coefficients that were typically negative, underscore the difficulties that arise when interpreting genotypic correlations as estimations of the probability that alleles are identical by descent.
|
142
|
Burghardt LT, Young ND, Tiffin P. A Guide to Genome-Wide Association Mapping in Plants. Curr Protoc Plant Biol 2017; 2:22-38. [PMID: 31725973 DOI: 10.1002/cppb.20041] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Indexed: 12/15/2022]
Abstract
Genome-wide association studies (GWAS) have developed into a valuable approach for identifying the genetic basis of phenotypic variation. In this article, we provide an overview of the design, analysis, and interpretation of GWAS. First, we present results from simulations that explore key elements of experimental design as well as considerations for collecting the relevant genomic and phenotypic data. Next, we outline current statistical methods and tools used for GWA analyses and discuss the inclusion of covariates to account for population structure and the interpretation of results. Given that many false positive associations will occur in any GWA analysis, we highlight strategies for prioritizing GWA candidates for further statistical and empirical validation. While focused on plants, the material we cover is also applicable to other systems. © 2017 by John Wiley & Sons, Inc.
Affiliation(s)
- Liana T Burghardt
- Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota
- Nevin D Young
- Department of Plant Pathology, University of Minnesota, St. Paul, Minnesota
- Peter Tiffin
- Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota
|
143
|
Visconti A, Al-Shafai M, Al Muftah WA, Zaghlool SB, Mangino M, Suhre K, Falchi M. PopPAnTe: population and pedigree association testing for quantitative data. BMC Genomics 2017; 18:150. [PMID: 28187711 PMCID: PMC5303218 DOI: 10.1186/s12864-017-3527-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 10/30/2015] [Accepted: 01/31/2017] [Indexed: 11/13/2022] Open
Abstract
Background: Family-based designs, from twin studies to isolated populations with their complex genealogical data, are a valuable resource for genetic studies of heritable molecular biomarkers. Existing software for family-based studies has mainly focused on facilitating association between response phenotypes and genetic markers, and no user-friendly tools are at present available to straightforwardly extend association studies in related samples to large datasets of generic quantitative data, such as those generated by current -omics technologies. Results: We developed PopPAnTe, a user-friendly Java program, which evaluates the association of quantitative data in related samples. Additionally, PopPAnTe implements data pre- and post-processing, region-based testing, and empirical assessment of associations. Conclusions: PopPAnTe is an integrated and flexible framework for pairwise association testing in related samples with a large number of predictors and response variables. It works either with family data of any size and complexity, or, when the genealogical information is unknown, it uses genetic similarity information between individuals, such as that inferred from genome-wide genetic data. It can therefore be particularly useful in facilitating usage of biobank data collections from population isolates when extensive genealogical information is missing.
Affiliation(s)
- Alessia Visconti
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Mashael Al-Shafai
- Department of Genomics of Common Diseases, Imperial College London, London, UK; Department of Physiology and Biophysics, Weill Cornell Medical College in Qatar, Doha, Qatar; Research Division, Qatar Science Leadership Program, Qatar Foundation, Doha, Qatar; Department of Biomedical Sciences, College of Health Sciences at Qatar University, Doha, Qatar
- Wadha A Al Muftah
- Department of Genomics of Common Diseases, Imperial College London, London, UK; Department of Physiology and Biophysics, Weill Cornell Medical College in Qatar, Doha, Qatar; Research Division, Qatar Science Leadership Program, Qatar Foundation, Doha, Qatar
- Shaza B Zaghlool
- Department of Physiology and Biophysics, Weill Cornell Medical College in Qatar, Doha, Qatar
- Massimo Mangino
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK; NIHR Biomedical Research Centre at Guy's and St Thomas' Foundation Trust, London, UK
- Karsten Suhre
- Department of Physiology and Biophysics, Weill Cornell Medical College in Qatar, Doha, Qatar
- Mario Falchi
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
|
144
|
Knief U, Schielzeth H, Backström N, Hemmrich‐Stanisak G, Wittig M, Franke A, Griffith SC, Ellegren H, Kempenaers B, Forstmeier W. Association mapping of morphological traits in wild and captive zebra finches: reliable within, but not between populations. Mol Ecol 2017; 26:1285-1305. [DOI: 10.1111/mec.14009] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 04/20/2016] [Revised: 12/05/2016] [Accepted: 12/21/2016] [Indexed: 01/17/2023]
Affiliation(s)
- Ulrich Knief
- Department of Behavioural Ecology and Evolutionary Genetics Max Planck Institute for Ornithology 82319 Seewiesen Germany
- Holger Schielzeth
- Department of Population Ecology Friedrich Schiller University Jena 07743 Jena Germany
- Niclas Backström
- Department of Ecology and Genetics Uppsala University 752 36 Uppsala Sweden
- Michael Wittig
- Institute of Clinical Molecular Biology Christian‐Albrechts‐University 24105 Kiel Germany
- Andre Franke
- Institute of Clinical Molecular Biology Christian‐Albrechts‐University 24105 Kiel Germany
- Simon C. Griffith
- Department of Biological Sciences Macquarie University Sydney NSW 2109 Australia
- School of Biological, Earth & Environmental Sciences University of New South Wales Sydney NSW 2057 Australia
- Hans Ellegren
- Department of Ecology and Genetics Uppsala University 752 36 Uppsala Sweden
- Bart Kempenaers
- Department of Behavioural Ecology and Evolutionary Genetics Max Planck Institute for Ornithology 82319 Seewiesen Germany
- Wolfgang Forstmeier
- Department of Behavioural Ecology and Evolutionary Genetics Max Planck Institute for Ornithology 82319 Seewiesen Germany
|
145
|
The Identification of a 1916 Irish Rebel: New Approach for Estimating Relatedness From Low Coverage Homozygous Genomes. Sci Rep 2017; 7:41529. [PMID: 28134350 PMCID: PMC5278401 DOI: 10.1038/srep41529] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 10/14/2016] [Accepted: 12/16/2016] [Indexed: 11/08/2022] Open
Abstract
Thomas Kent was an Irish rebel who was executed by British forces in the aftermath of the Easter Rising armed insurrection of 1916 and buried in a shallow grave on Cork prison’s grounds. In 2015, ninety-nine years after his death, a state funeral was offered to his living family to honor his role in the struggle for Irish independence. However, inaccuracies in record keeping did not allow the bodily remains that supposedly belonged to Kent to be identified with absolute certainty. Using a novel approach based on homozygous single nucleotide polymorphisms, we identified these remains to be those of Kent by comparing his genetic data to that of two known living relatives. As the DNA degradation found on Kent’s DNA, characteristic of ancient DNA, rendered traditional methods of relatedness estimation unusable, we forced all loci homozygous, in a process we refer to as “forced homozygote approach”. The results were confirmed using simulated data for different relatedness classes. We argue that this method provides a necessary alternative for relatedness estimations, not only in forensic analysis, but also in ancient DNA studies, where reduced amounts of genetic information can limit the application of traditional methods.
|
146
|
Wolak ME, Reid JM. Accounting for genetic differences among unknown parents in microevolutionary studies: how to include genetic groups in quantitative genetic animal models. J Anim Ecol 2017; 86:7-20. [PMID: 27731502 PMCID: PMC5217070 DOI: 10.1111/1365-2656.12597] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 11/23/2015] [Accepted: 09/23/2016] [Indexed: 11/30/2022]
Abstract
Quantifying and predicting microevolutionary responses to environmental change requires unbiased estimation of quantitative genetic parameters in wild populations. 'Animal models', which utilize pedigree data to separate genetic and environmental effects on phenotypes, provide powerful means to estimate key parameters and have revolutionized quantitative genetic analyses of wild populations. However, pedigrees collected in wild populations commonly contain many individuals with unknown parents. When unknown parents are non-randomly associated with genetic values for focal traits, animal model parameter estimates can be severely biased. Yet, such bias has not previously been highlighted and statistical methods designed to minimize such biases have not been implemented in evolutionary ecology. We first illustrate how the occurrence of non-random unknown parents in population pedigrees can substantially bias animal model predictions of breeding values and estimates of additive genetic variance, and create spurious temporal trends in predicted breeding values in the absence of local selection. We then introduce 'genetic group' methods, which were developed in agricultural science, and explain how these methods can minimize bias in quantitative genetic parameter estimates stemming from genetic heterogeneity among individuals with unknown parents. We summarize the conceptual foundations of genetic group animal models and provide extensive, step-by-step tutorials that demonstrate how to fit such models in a variety of software programs. Furthermore, we provide new functions in R that extend current software capabilities and provide a standardized approach across software programs to implement genetic group methods. Beyond simply alleviating bias, genetic group animal models can directly estimate new parameters pertaining to key biological processes. We discuss one such example, where genetic group methods potentially allow the microevolutionary consequences of local selection to be distinguished from effects of immigration and resulting gene flow. We highlight some remaining limitations of genetic group models and discuss opportunities for further development and application in evolutionary ecology. We suggest that genetic group methods should no longer be overlooked by evolutionary ecologists, but should become standard components of the toolkit for animal model analyses of wild population data sets.
Affiliation(s)
- Matthew E. Wolak
- Institute of Biological and Environmental Sciences, School of Biological Sciences, University of Aberdeen, Zoology Building, Tillydrone Avenue, Aberdeen AB24 2TZ, UK
- Jane M. Reid
- Institute of Biological and Environmental Sciences, School of Biological Sciences, University of Aberdeen, Zoology Building, Tillydrone Avenue, Aberdeen AB24 2TZ, UK
|
147
|
Kardos M, Taylor HR, Ellegren H, Luikart G, Allendorf FW. Genomics advances the study of inbreeding depression in the wild. Evol Appl 2016; 9:1205-1218. [PMID: 27877200 PMCID: PMC5108213 DOI: 10.1111/eva.12414] [Citation(s) in RCA: 167] [Impact Index Per Article: 18.6] [Received: 04/18/2016] [Accepted: 08/05/2016] [Indexed: 12/12/2022] Open
Abstract
Inbreeding depression (reduced fitness of individuals with related parents) has long been a major focus of ecology, evolution, and conservation biology. Despite decades of research, we still have a limited understanding of the strength, underlying genetic mechanisms, and demographic consequences of inbreeding depression in the wild. Studying inbreeding depression in natural populations has been hampered by the inability to precisely measure individual inbreeding. Fortunately, the rapidly increasing availability of high-throughput sequencing data means it is now feasible to measure the inbreeding of any individual with high precision. Here, we review how genomic data are advancing our understanding of inbreeding depression in the wild. Recent results show that individual inbreeding and inbreeding depression can be measured more precisely with genomic data than via traditional pedigree analysis. Additionally, the availability of genomic data has made it possible to pinpoint loci with large effects contributing to inbreeding depression in wild populations, although this will continue to be a challenging task in many study systems due to low statistical power. Now that reliably measuring individual inbreeding is no longer a limitation, a major focus of future studies should be to more accurately quantify effects of inbreeding depression on population growth and viability.
Affiliation(s)
- Marty Kardos
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Hans Ellegren
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Gordon Luikart
- Division of Biological Sciences, University of Montana, Missoula, MT, USA
- Flathead Lake Biological Station, Division of Biological Sciences, University of Montana, Polson, MT, USA
|
148
|
The Y chromosome as the most popular marker in genetic genealogy benefits interdisciplinary research. Hum Genet 2016; 136:559-573. [DOI: 10.1007/s00439-016-1740-0] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 07/26/2016] [Accepted: 10/16/2016] [Indexed: 01/01/2023]
|
149
|
Meiotic recombination shapes precision of pedigree- and marker-based estimates of inbreeding. Heredity (Edinb) 2016; 118:239-248. [PMID: 27804967 DOI: 10.1038/hdy.2016.95] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 08/16/2016] [Accepted: 08/29/2016] [Indexed: 01/17/2023] Open
Abstract
The proportion of an individual's genome that is identical by descent (GWIBD) can be estimated from pedigrees (inbreeding coefficient 'Pedigree F') or molecular markers ('Marker F'), but both estimators come with error. Assuming unrelated pedigree founders, Pedigree F is the expected proportion of GWIBD given a specific inbreeding constellation. Meiotic recombination introduces variation around that expectation (Mendelian noise) and related pedigree founders systematically bias Pedigree F downward. Marker F is an estimate of the actual proportion of GWIBD but it suffers from the sampling error of markers plus the error that occurs when a marker is homozygous without reflecting common ancestry (identical by state). We here show via simulation of a zebra finch and a human linkage map that three aspects of meiotic recombination (independent assortment of chromosomes, number of crossovers and their distribution along chromosomes) contribute to variation in GWIBD and thus the precision of Pedigree and Marker F. In zebra finches, where the genome contains large blocks that are rarely broken up by recombination, the Mendelian noise was large (nearly twofold larger s.d. values compared with humans) and Pedigree F thus less precise than in humans, where crossovers are distributed more uniformly along chromosomes. Effects of meiotic recombination on Marker F were reversed, such that the same number of molecular markers yielded more precise estimates of GWIBD in zebra finches than in humans. As a consequence, in species inheriting large blocks that rarely recombine, even small numbers of microsatellite markers will often be more informative about inbreeding and fitness than large pedigrees.
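Pedigree F, the expected proportion of the genome that is identical by descent, can be computed with the standard recursive kinship formula, assuming unrelated and non-inbred founders. The sketch below applies it to a small hypothetical pedigree in which individual E is the offspring of full sibs; the pedigree, the identifiers and the function names are illustrative assumptions, and the code says nothing about the Mendelian noise around this expectation that the abstract discusses.

```python
from functools import lru_cache

# Hypothetical pedigree: individual -> (sire, dam); founders have no known parents.
# Parents must be listed before their offspring.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "D"),  # offspring of the full sibs C and D
}
BIRTH_ORDER = {name: rank for rank, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(x, y):
    """Kinship coefficient between x and y, assuming unrelated, non-inbred founders."""
    if x is None or y is None:
        return 0.0  # unknown parents are treated as unrelated to everyone
    if x == y:
        sire, dam = PEDIGREE[x]
        return 0.5 * (1.0 + kinship(sire, dam))
    # Always recurse through the parents of the younger individual.
    if BIRTH_ORDER[x] < BIRTH_ORDER[y]:
        x, y = y, x
    sire, dam = PEDIGREE[x]
    return 0.5 * (kinship(sire, y) + kinship(dam, y))


def pedigree_f(individual):
    """Inbreeding coefficient: the kinship between the individual's parents."""
    sire, dam = PEDIGREE[individual]
    return kinship(sire, dam)


if __name__ == "__main__":
    for name in PEDIGREE:
        print(name, pedigree_f(name))
```

A marker-based estimate for the same individual would scatter around this expectation, since the realized proportion of the genome that is identical by descent depends on which segments were actually inherited.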
|
150
|
Cussens J, Sheehan NA. Special issue on New Developments in Relatedness and Relationship Estimation. Theor Popul Biol 2016; 107:1-3. [PMID: 26772525 DOI: 10.1016/j.tpb.2015.12.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/14/2015] [Accepted: 12/14/2015] [Indexed: 11/17/2022]
Affiliation(s)
- J Cussens
- Department of Computer Science, University of York, UK.
- N A Sheehan
- Department of Health Sciences, University of Leicester, UK.
|