1
Lyulina AS, Liu Z, Good BH. Linkage equilibrium between rare mutations. bioRxiv 2024:2024.03.28.587282. [PMID: 38617331 PMCID: PMC11014483 DOI: 10.1101/2024.03.28.587282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Recombination breaks down genetic linkage by reshuffling existing variants onto new genetic backgrounds. These dynamics are traditionally quantified by examining the correlations between alleles, and how they decay as a function of the recombination rate. However, the magnitudes of these correlations are strongly influenced by other evolutionary forces like natural selection and genetic drift, making it difficult to tease out the effects of recombination. Here we introduce a theoretical framework for analyzing an alternative family of statistics that measure the homoplasy produced by recombination. We derive analytical expressions that predict how these statistics depend on the rates of recombination and recurrent mutation, the strength of negative selection and genetic drift, and the present-day frequencies of the mutant alleles. We find that the degree of homoplasy can strongly depend on this frequency scale, which reflects the underlying timescales over which these mutations occurred. We show how these scaling properties can be used to isolate the effects of recombination, and discuss their implications for the rates of horizontal gene transfer in bacteria.
Affiliation(s)
- Anastasia S Lyulina
- Department of Biology, Stanford University, Stanford, CA 94305, USA
- Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
- Zhiru Liu
- Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
- Benjamin H Good
- Department of Biology, Stanford University, Stanford, CA 94305, USA
- Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
- Chan Zuckerberg Biohub - San Francisco, San Francisco, CA 94158, USA
2
Shilbayeh SAR, Adeen IS, Ghanem EH, Aljurayb H, Aldilaijan KE, AlDosari F, Fadda A. Exploratory focused pharmacogenetic testing reveals novel markers associated with risperidone pharmacokinetics in Saudi children with autism. Front Pharmacol 2024; 15:1356763. [PMID: 38375040 PMCID: PMC10875102 DOI: 10.3389/fphar.2024.1356763] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Accepted: 01/24/2024] [Indexed: 02/21/2024]
Abstract
Background: Autism spectrum disorders (ASDs) encompass a broad range of phenotypes characterized by diverse neurological alterations. Genomic studies have revealed considerable overlap between the molecular mechanisms implicated in the etiology of ASD and genes involved in the pharmacokinetic (PK) and pharmacodynamic (PD) pathways of antipsychotic drugs employed in ASD management. Given the conflicting data originating from candidate PK or PD gene association studies in diverse ethnogeographic ASD populations, dosage individualization based on "actionable" pharmacogenetic (PGx) markers has limited application in clinical practice. Additionally, off-label use of different antipsychotics is an ongoing practice, which is justified given the shortage of approved cures, despite the lack of satisfactory evidence for its safety according to precision medicine. This exploratory study aimed to identify PGx markers predictive of risperidone (RIS) exposure in autistic Saudi children. Methods: This prospective cohort study enrolled 89 Saudi children with ASD treated with RIS-based antipsychotic therapy. Plasma levels of RIS and 9-OH-RIS were measured using a liquid chromatography-tandem mass spectrometry system. To enable focused exploratory testing, genotyping was performed with the Axiom PharmacoFocus Array, which included a collection of probe sets targeting PK/PD genes. A total of 720 PGx markers were included in the association analysis. Results: A total of 27 PGx variants were found to have a prominent impact on various RIS PK parameters; most were not located within the genes involved in the classical RIS PK pathway. Specifically, 8 markers in 7 genes were identified as the PGx markers with the strongest impact on RIS levels (p < 0.01). Four PGx variants in 3 genes were strongly associated with 9-OH-RIS levels, while 5 markers in 5 different genes explained the interindividual variability in the total active moiety. 
Notably, 6 CYP2D6 variants exhibited strong linkage disequilibrium; however, they significantly influenced only the metabolic ratio and had no considerable effects on the individual estimates of RIS, 9-OH-RIS, or the total active moiety. After correction for multiple testing, rs78998153 in UGT2B17 (which is highly expressed in the brain) remained the most significant PGx marker positively adjusting the metabolic ratio. For the first time, certain human leukocyte antigen (HLA) markers were found to enhance various RIS exposure parameters, which reinforces the gut-brain axis theory of ASD etiology and its suggested inflammatory impacts on drug bioavailability through modulation of the brain, gastrointestinal tract and/or hepatic expression of metabolizing enzymes and transporters. Conclusion: Our hypothesis-generating approach identified a broad spectrum of PGx markers that interactively influence RIS exposure in ASD children, which indicated the need for further validation in population PK modeling studies to define polygenic scores for antipsychotic efficacy and safety, which could facilitate personalized therapeutic decision-making in this complex neurodevelopmental condition.
Affiliation(s)
- Sireen Abdul Rahim Shilbayeh
- Department of Pharmacy Practice, College of Pharmacy, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Iman Sharaf Adeen
- Department of Pediatric Behavior and Development and Adolescent Medicine, King Fahad Medical City, Riyadh, Saudi Arabia
- Ezzeldeen Hasan Ghanem
- Pharmaceutical Analysis Section, King Abdullah International Medical Research Center (KAIMRC), King Abdulaziz Medical City, Ministry of National Guard - Health Affairs, Riyadh, Saudi Arabia
- Haya Aljurayb
- Molecular Pathology Laboratory, Pathology and Clinical Laboratory Medicine Administration, King Fahad Medical City, Riyadh, Saudi Arabia
- Khawlah Essa Aldilaijan
- Health Sciences Research Center, King Abdullah Bin Abdulaziz University Hospital, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Fatimah AlDosari
- Pharmaceutical Care Department, Ministry of National Guard-Health Affairs, Jeddah, Saudi Arabia
3
Papageorgiou L, Papakonstantinou E, Diakou I, Pierouli K, Dragoumani K, Bacopoulou F, Chrousos GP, Eliopoulos E, Vlachakis D. Semantic and Population Analysis of the Genetic Targets Related to COVID-19 and Its Association with Genes and Diseases. Adv Exp Med Biol 2023; 1423:59-78. [PMID: 37525033 DOI: 10.1007/978-3-031-31978-5_6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/02/2023]
Abstract
SARS-CoV-2 is a coronavirus responsible for one of the most serious modern worldwide pandemics, with lasting and multifaceted effects. By late 2021, SARS-CoV-2 had infected more than 180 million people and had killed more than 3 million. The virus gains entrance to human cells through binding to ACE2 via its surface spike protein and causes a complex disease of the respiratory system, termed COVID-19. Vaccination efforts are being made to hinder the viral spread, and therapeutics are currently under development. Toward this goal, scientific attention is shifting toward variants and SNPs that affect factors of the disease such as susceptibility and severity. This genomic grammar, tightly related to the dark part of our genome, can be explored through the use of modern methods such as natural language processing. We present a semantic analysis of SARS-CoV-2-related publications, which yielded a repertoire of SNPs, genes, and disease ontologies. Population data from the 1000 Genomes Project were subsequently integrated into the pipeline. Data mining approaches of this scale have the potential to elucidate the complex interaction between COVID-19 pathogenesis and host genetic variation; the resulting knowledge can facilitate the management of high-risk groups and aid the efforts toward precision medicine.
Affiliation(s)
- Louis Papageorgiou
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece
- Eleni Papakonstantinou
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece
- Io Diakou
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece
- Katerina Pierouli
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece
- Konstantina Dragoumani
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece
- Flora Bacopoulou
- University Research Institute of Maternal and Child Health & Precision Medicine, National and Kapodistrian University of Athens, "Aghia Sophia" Children's Hospital, Athens, Greece
- George P Chrousos
- University Research Institute of Maternal and Child Health & Precision Medicine, National and Kapodistrian University of Athens, "Aghia Sophia" Children's Hospital, Athens, Greece
- Elias Eliopoulos
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece
- Dimitrios Vlachakis
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece
- University Research Institute of Maternal and Child Health & Precision Medicine, National and Kapodistrian University of Athens, "Aghia Sophia" Children's Hospital, Athens, Greece
- Division of Endocrinology and Metabolism, Center of Clinical, Experimental Surgery and Translational Research, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
4
Good BH. Linkage disequilibrium between rare mutations. Genetics 2022; 220:6503502. [PMID: 35100407 PMCID: PMC8982034 DOI: 10.1093/genetics/iyac004] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/07/2021] [Accepted: 12/21/2021] [Indexed: 01/13/2023]
Abstract
The statistical associations between mutations, collectively known as linkage disequilibrium, encode important information about the evolutionary forces acting within a population. Yet in contrast to single-site analogues like the site frequency spectrum, our theoretical understanding of linkage disequilibrium remains limited. In particular, little is currently known about how mutations with different ages and fitness costs contribute to expected patterns of linkage disequilibrium, even in simple settings where recombination and genetic drift are the major evolutionary forces. Here, I introduce a forward-time framework for predicting linkage disequilibrium between pairs of neutral and deleterious mutations as a function of their present-day frequencies. I show that the dynamics of linkage disequilibrium become much simpler in the limit that mutations are rare, where they admit a simple heuristic picture based on the trajectories of the underlying lineages. I use this approach to derive analytical expressions for a family of frequency-weighted linkage disequilibrium statistics as a function of the recombination rate, the frequency scale, and the additive and epistatic fitness costs of the mutations. I find that the frequency scale can have a dramatic impact on the shapes of the resulting linkage disequilibrium curves, reflecting the broad range of time scales over which these correlations arise. I also show that the differences between neutral and deleterious linkage disequilibrium are not purely driven by differences in their mutation frequencies and can instead display qualitative features that are reminiscent of epistasis. I conclude by discussing the implications of these results for recent linkage disequilibrium measurements in bacteria. This forward-time approach may provide a useful framework for predicting linkage disequilibrium across a range of evolutionary scenarios.
Affiliation(s)
- Benjamin H Good
- Department of Applied Physics, Stanford University, Stanford, CA 94305, USA. Corresponding author: Department of Applied Physics, Stanford University, Clark Center, 318 Campus Drive, Stanford, CA 94305, USA.
5
Atashi H, Wilmot H, Gengler N. The pattern of linkage disequilibrium in Dual-Purpose Belgian Blue cattle. J Anim Breed Genet 2021; 139:320-329. [PMID: 34859921 DOI: 10.1111/jbg.12662] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/14/2021] [Revised: 11/14/2021] [Accepted: 11/22/2021] [Indexed: 11/27/2022]
Abstract
Quantifying the level of linkage disequilibrium (LD), the non-random association of alleles at two or more loci, is important to determine the number of markers needed for genomic selection. The aims of this study were to evaluate the extent of LD in Dual-Purpose Belgian Blue (DPBB) and to compare the level of LD in DPBB with that of Walloon Holstein. Data of 28,427 single nucleotide polymorphisms (SNP), located on 29 Bos taurus autosomes (BTA), of 639 DPBB and 398 Holstein bulls were used. The level of LD between pairwise SNPs separated by up to 10 Mb was evaluated, separately for each breed, using the squared correlation of the alleles at two loci. The analysis of molecular variance showed that the percentage of variation within populations (85.48%) was higher than between populations (14.52%). However, permutation tests showed a significant genetic differentiation between the two studied populations (p < .01). The average LD found between adjacent SNP pairs in DPBB (0.16 (SD = 0.22)) was generally lower than in Holstein (0.23 (SD = 0.27)). The proportion of SNPs in useful LD (r² > 0.30) within a genomic distance of ≤0.10 Mb between SNPs was 18.58% and 28.23% in DPBB and Holstein bulls, respectively. In both breeds, the effective population size decreased over generations; however, the decline was greater in DPBB than that in Holstein. Based on these results, it can be concluded that at least 68,000 SNPs are needed for implementing genomic selection in DPBB cattle with enough accuracy.
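The squared allelic correlation used as the LD measure in this abstract can be sketched as follows. This is a minimal illustration, not the study's pipeline: it assumes phased haplotypes coded as 0/1 at two biallelic loci, and the function name `r_squared` is hypothetical.

```python
def r_squared(hap_a, hap_b):
    """Squared correlation between alleles at two biallelic loci.

    hap_a, hap_b: equal-length sequences of 0/1 alleles, one entry per
    phased haplotype (an assumption of this sketch).
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("need two non-empty sequences of equal length")
    p_a = sum(hap_a) / n                                  # frequency of allele 1 at locus A
    p_b = sum(hap_b) / n                                  # frequency of allele 1 at locus B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n   # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                  # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Perfectly associated loci give r^2 = 1; independent loci give r^2 = 0.
print(r_squared([0, 0, 1, 1], [0, 0, 1, 1]))
print(r_squared([0, 0, 1, 1], [0, 1, 0, 1]))
```

Under the threshold quoted in the abstract, a SNP pair would count as being in "useful LD" when this value exceeds 0.30.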
Affiliation(s)
- Hadi Atashi
- TERRA Research and Training Center, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium; Department of Animal Science, Shiraz University, Shiraz, Iran
- Hélène Wilmot
- TERRA Research and Training Center, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium; National Fund for Scientific Research (F.R.S.-FNRS), Brussels, Belgium
- Nicolas Gengler
- TERRA Research and Training Center, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
6
Garcia JA, Lohmueller KE. Negative linkage disequilibrium between amino acid changing variants reveals interference among deleterious mutations in the human genome. PLoS Genet 2021; 17:e1009676. [PMID: 34319975 PMCID: PMC8351996 DOI: 10.1371/journal.pgen.1009676] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/02/2020] [Revised: 08/09/2021] [Accepted: 06/22/2021] [Indexed: 11/18/2022]
Abstract
Evolutionary forces like Hill-Robertson interference and negative epistasis can lead to deleterious mutations being found on distinct haplotypes. However, the extent to which these forces depend on the selection and dominance coefficients of deleterious mutations and shape genome-wide patterns of linkage disequilibrium (LD) in natural populations with complex demographic histories has not been tested. In this study, we first used forward-in-time simulations to predict how negative selection impacts LD. Under models where deleterious mutations have additive effects on fitness, deleterious variants less than 10 kb apart tend to be carried on different haplotypes relative to pairs of synonymous SNPs. In contrast, for recessive mutations, there is no consistent ordering of how selection coefficients affect LD decay, due to the complex interplay of different evolutionary effects. We then examined empirical data of modern humans from the 1000 Genomes Project. LD between derived alleles at nonsynonymous SNPs is lower compared to pairs of derived synonymous variants, suggesting that nonsynonymous derived alleles tend to occur on different haplotypes more than synonymous variants. This result holds when controlling for potential confounding factors by matching SNPs for frequency in the sample (allele count), physical distance, magnitude of background selection, and genetic distance between pairs of variants. Lastly, we introduce a new statistic HR(j) which allows us to detect interference using unphased genotypes. Application of this approach to high-coverage human genome sequences confirms our finding that nonsynonymous derived alleles tend to be located on different haplotypes more often than are synonymous derived alleles. Our findings suggest that interference may play a pervasive role in shaping patterns of LD between deleterious variants in the human genome, and consequently influences genome-wide patterns of LD.
Affiliation(s)
- Jesse A. Garcia
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, United States of America
- Kirk E. Lohmueller
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, United States of America
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, United States of America
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California, United States of America
7
Lucek K, Willi Y. Drivers of linkage disequilibrium across a species' geographic range. PLoS Genet 2021; 17:e1009477. [PMID: 33770075 PMCID: PMC8026057 DOI: 10.1371/journal.pgen.1009477] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/12/2020] [Revised: 04/07/2021] [Accepted: 03/09/2021] [Indexed: 11/25/2022]
Abstract
While linkage disequilibrium (LD) is an important parameter in genetics and evolutionary biology, the drivers of LD remain elusive. Using whole-genome sequences from across a species’ range, we assessed the impact of demographic history and mating system on LD. Both range expansion and a shift from outcrossing to selfing in North American Arabidopsis lyrata were associated with increased average genome-wide LD. Our results indicate that range expansion increases short-distance LD at the farthest range edges by about the same amount as a shift to selfing. However, the extent over which LD in genic regions unfolds was shorter for range expansion compared to selfing. Linkage among putatively neutral variants and between neutral and deleterious variants increased to a similar degree with range expansion, providing support that genome-wide LD was positively associated with mutational load. As a consequence, LD combined with mutational load may decelerate range expansions and set range limits. Finally, a small number of genes were identified as LD outliers, suggesting that they experience selection by either of the two demographic processes. These included genes involved in flowering and photoperiod for range expansion, and the self-incompatibility locus for mating system. Nearby genomic variants are often co-inherited because of limited recombination. The extent of non-random association of alleles at different loci is called linkage disequilibrium (LD) and is commonly used in genomic analyses, for example to detect regions under selection or to determine effective population size. Here we reversed testing and addressed how demographic history may affect LD within a species. Using genomic data from more than a thousand individuals of North American Arabidopsis lyrata from across the entire species’ range, we quantified the effect of postglacial range expansion and a shift in mating system from outcrossing to selfing on LD. 
We show that both factors lead to increased LD, and that the maximal effect of range expansion is comparable with a shift in mating system to selfing. Heightened LD involves deleterious mutations, and therefore, LD can also serve as an indicator of mutation accumulation. Furthermore, we provide evidence that some genes experienced stronger increases in LD possibly due to selection associated with the two demographic changes. Our results provide a novel and broad view on the evolutionary factors shaping LD that may also apply to the very many species that underwent postglacial range expansion.
Affiliation(s)
- Kay Lucek
- Department of Environmental Sciences, University of Basel, Basel, Switzerland
- Yvonne Willi
- Department of Environmental Sciences, University of Basel, Basel, Switzerland
8
Peng J, Rajeevan H, Kubatko L, RoyChoudhury A. A fast likelihood approach for estimation of large phylogenies from continuous trait data. Mol Phylogenet Evol 2021; 161:107142. [PMID: 33713799 DOI: 10.1016/j.ympev.2021.107142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2020] [Revised: 10/15/2020] [Accepted: 03/03/2021] [Indexed: 11/28/2022]
Abstract
Despite the recent availability of large-scale genomic data for many individuals, few methods for phylogenetic inference are both computationally efficient and highly accurate for trees with hundreds of taxa. Model-based methods such as those developed in the maximum likelihood and Bayesian frameworks are especially time-consuming, as they involve both computationally intensive calculations on fixed phylogenies and searches through the space of possible phylogenies, and they are known to scale poorly with the addition of taxa. Here, we propose a fast approximation to the maximum likelihood estimator that directly uses continuous trait data, such as allele frequency data. The approximation works by first computing the maximum likelihood estimates of some internal branch lengths, and then inferring the tree-topology using these estimates. Our approach is more computationally efficient than existing methods for such data while still achieving comparable accuracy. This method is innovative in its use of the mathematical properties of tree-topologies for inference, and thus serves as a useful addition to the collection of methods available for estimating phylogenies from continuous trait data.
Affiliation(s)
- Jing Peng
- Division of Biostatistics, College of Public Health, The Ohio State University, United States; Department of Statistics, The Ohio State University, United States
- Laura Kubatko
- Department of Statistics, The Ohio State University, United States; Department of Evolution, Ecology and Organismal Biology, The Ohio State University, United States
- Arindam RoyChoudhury
- Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, United States
9
Salnikova LE, Khadzhieva MB, Kolobkov DS, Gracheva AS, Kuzovlev AN, Abilev SK. Cytokines mapping for tissue-specific expression, eQTLs and GWAS traits. Sci Rep 2020; 10:14740. [PMID: 32895400 PMCID: PMC7477549 DOI: 10.1038/s41598-020-71018-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2020] [Accepted: 07/28/2020] [Indexed: 12/02/2022]
Abstract
Dysregulation in cytokine production has been linked to the pathogenesis of various immune-mediated traits, in which genetic variability contributes to the etiopathogenesis. GWA studies have identified many genetic variants in or near cytokine genes; nonetheless, the translation of these findings into knowledge of functional determinants of complex traits remains a fundamental challenge. In this study, we aimed to collect, analyze and interpret data on cytokines, focusing on their tissue-specific expression, eQTLs and GWAS traits. Using GO annotations, we generated a list of 314 cytokines and analyzed them with the GTEx resource. Cytokines were highly tissue-specific: 82.3% of cytokines had Tau expression metrics ≥ 0.8. In total, 3077 associations for 1760 unique SNPs in or near 244 cytokines were mapped in the NHGRI-EBI GWAS Catalog. According to the Experimental Factor Ontology resource, the largest numbers of disease associations were related to 'Inflammatory disease', 'Immune system disease' and 'Asthma'. The GTEx-based analysis revealed that among GWAS SNPs, 1142 SNPs had eQTL effects and influenced expression levels of 999 eGenes, among them 178 cytokines. Several types of enrichment analysis showed that it was cytokine expression variability that fundamentally contributed to the molecular origins of the considered immune-mediated conditions.
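The Tau metric mentioned in this abstract is commonly defined as the average, over tissues, of one minus the expression relative to the maximally expressing tissue. The sketch below follows that common definition; the abstract does not state how expression values were normalized or transformed, so the inputs and the function name `tau` are assumptions made for illustration.

```python
def tau(expression):
    """Tissue-specificity index Tau for one gene.

    expression: non-negative expression values, one per tissue.
    Returns 0 for uniform expression and 1 for expression in a single tissue.
    """
    n = len(expression)
    x_max = max(expression) if n else 0
    if n < 2 or x_max <= 0:
        raise ValueError("need at least two tissues and a positive maximum")
    # Sum of (1 - x_i / x_max) over tissues, scaled by the number of tissues minus one.
    return sum(1 - x / x_max for x in expression) / (n - 1)


print(tau([10, 0, 0, 0]))   # expressed in a single tissue
print(tau([5, 5, 5, 5]))    # uniformly expressed
```

With this definition, a gene would pass the abstract's tissue-specificity cutoff when `tau(...) >= 0.8`.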
Affiliation(s)
- Lyubov E Salnikova
- Laboratory of Ecological Genetics, N.I. Vavilov Institute of General Genetics, Russian Academy of Sciences, 3 Gubkin Street, Moscow, Russia, 117971
- Laboratory of Clinical Pathophysiology of Critical Conditions, Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Petrovka str, 25, b.2, Moscow, Russia, 107031
- Maryam B Khadzhieva
- Laboratory of Ecological Genetics, N.I. Vavilov Institute of General Genetics, Russian Academy of Sciences, 3 Gubkin Street, Moscow, Russia, 117971
- Laboratory of Clinical Pathophysiology of Critical Conditions, Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Petrovka str, 25, b.2, Moscow, Russia, 107031
- Dmitry S Kolobkov
- Laboratory of Ecological Genetics, N.I. Vavilov Institute of General Genetics, Russian Academy of Sciences, 3 Gubkin Street, Moscow, Russia, 117971
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, 234 Herzl St., PO Box 26, 7610001, Rehovot, Israel
- Department of Molecular Cell Biology, Weizmann Institute of Science, 234 Herzl St., PO Box 26, 7610001, Rehovot, Israel
- Alesya S Gracheva
- Laboratory of Ecological Genetics, N.I. Vavilov Institute of General Genetics, Russian Academy of Sciences, 3 Gubkin Street, Moscow, Russia, 117971
- Laboratory of Clinical Pathophysiology of Critical Conditions, Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Petrovka str, 25, b.2, Moscow, Russia, 107031
- Artem N Kuzovlev
- Laboratory of Clinical Pathophysiology of Critical Conditions, Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Petrovka str, 25, b.2, Moscow, Russia, 107031
- Serikbay K Abilev
- Laboratory of Ecological Genetics, N.I. Vavilov Institute of General Genetics, Russian Academy of Sciences, 3 Gubkin Street, Moscow, Russia, 117971
10
The nonlinear structure of linkage disequilibrium. Theor Popul Biol 2020; 134:160-170. [PMID: 32222435 DOI: 10.1016/j.tpb.2020.02.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/27/2018] [Revised: 02/15/2020] [Accepted: 02/27/2020] [Indexed: 11/23/2022]
Abstract
The allele frequency dependence of the ranges of all measures of linkage disequilibrium is well-known. The maximum values of commonly used parameters such as r² and D vary depending on the allele frequencies at each locus. However, though this phenomenon is recognized and accounted for in many studies, the comprehensive mathematical framework underlying the limits of linkage disequilibrium measures at various frequency combinations is often heuristic or empirical. Here, it is demonstrated that underlying this behavior is the fundamental shift between linear and nonlinear dependence in the linkage disequilibrium structure between loci. The proportion of linear and nonlinear dependence can be estimated, and it demonstrates how even the same values of r² can have different implications for the nature of the overall dependence. One result of this is that the value of D', when defined as only a positive number, has a minimum value of |r|. Understanding this dependence is crucial to making correct inferences about the relationships between two loci in linkage disequilibrium.
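The relationship between D' and |r| stated in this abstract can be checked numerically with the standard textbook definitions of D, D' and r. The sketch below is an illustration under those definitions, not the paper's framework; the function name `ld_measures` and the example frequencies are assumptions.

```python
import math


def ld_measures(p_a, p_b, p_ab):
    """Return (D, D', r) for two biallelic loci.

    p_a, p_b: frequencies of the focal alleles (strictly between 0 and 1).
    p_ab: frequency of the haplotype carrying both focal alleles.
    """
    d = p_ab - p_a * p_b
    # D' divides D by the largest magnitude D could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r = d / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r


# A rare allele that always occurs with a common one: D' is at its maximum,
# while r^2 stays well below 1 because the allele frequencies differ.
d, d_prime, r = ld_measures(p_a=0.5, p_b=0.1, p_ab=0.1)
print(d_prime, r * r)
```

In this example the two loci are as associated as their frequencies allow (D' close to 1), yet r² is only about 0.11, which shows how the same data can look very different under the two normalizations.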
11
Kang JTL, Rosenberg NA. Mathematical Properties of Linkage Disequilibrium Statistics Defined by Normalization of the Coefficient D = p_AB - p_A p_B. Hum Hered 2020; 84:127-143. [PMID: 32045910 DOI: 10.1159/000504171] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/16/2019] [Accepted: 10/10/2019] [Indexed: 11/19/2022]
Abstract
Background: Many statistics for measuring linkage disequilibrium (LD) take the form of a normalization of the LD coefficient D. Different normalizations produce statistics with different ranges, interpretations, and arguments favoring their use. Methods: Here, to compare the mathematical properties of these normalizations, we consider 5 of these normalized statistics, describing their upper bounds, the mean values of their maxima over the set of possible allele frequency pairs, and the size of the allele frequency regions accessible given specified values of the statistics. Results: We produce detailed characterizations of these properties for the statistics d and ρ, analogous to computations previously performed for r². We examine the relationships among the statistics, uncovering conditions under which some of them have close connections. Conclusion: The results contribute insight into LD measurement, particularly the understanding of differences in the features of different LD measures when computed on the same data.
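As a worked illustration of a frequency-dependent upper bound of the kind this abstract describes, the bound on r² for D ≥ 0 follows from requiring every haplotype frequency to be non-negative. This is the standard derivation rather than a result quoted from the paper, and the notation (q_A = 1 - p_A, q_B = 1 - p_B) is assumed here.

```latex
\begin{align}
D &= p_{AB} - p_A p_B, \qquad
r^2 = \frac{D^2}{p_A q_A \, p_B q_B}, \qquad
q_A = 1 - p_A,\; q_B = 1 - p_B \\
% p_{Ab} = p_A - p_{AB} \ge 0 and p_{aB} = p_B - p_{AB} \ge 0 bound D from above
D &\le D_{\max} = \min\left(p_A q_B,\; q_A p_B\right) \quad \text{for } D \ge 0 \\
r^2 &\le \frac{D_{\max}^2}{p_A q_A \, p_B q_B}
     = \min\!\left(\frac{p_A q_B}{q_A p_B},\; \frac{q_A p_B}{p_A q_B}\right)
\end{align}
```

For example, with p_A = 0.5 and p_B = 0.1 the bound is (0.5 × 0.1)/(0.5 × 0.9) = 1/9, so r² cannot exceed about 0.11 however strongly the two alleles are associated.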
Affiliation(s)
- Jonathan T L Kang
- Department of Biology, Stanford University, Stanford, California, USA
- Noah A Rosenberg
- Department of Biology, Stanford University, Stanford, California, USA
12
Borja T, Karim N, Goecker Z, Salemi M, Phinney B, Naeem M, Rice R, Parker G. Proteomic genotyping of fingermark donors with genetically variant peptides. Forensic Sci Int Genet 2019; 42:21-30. [DOI: 10.1016/j.fsigen.2019.05.005] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 03/11/2019] [Revised: 05/09/2019] [Accepted: 05/26/2019] [Indexed: 01/31/2023]
|
13
|
Vazquez-Gonzalez WG, Martinez-Alvarez JC, Arrazola-Garcia A, Perez-Rodriguez M. Haplotype block 1 variant (HB-1v) of the NKG2 family of receptors. Hum Immunol 2019; 80:842-847. [PMID: 31320124 DOI: 10.1016/j.humimm.2019.07.276] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/24/2018] [Revised: 07/05/2019] [Accepted: 07/08/2019] [Indexed: 01/04/2023]
Abstract
The natural killer group 2 (NKG2) family of receptors, encoded within the NK complex gene region (NKC), modulate the cytotoxic activity of NK cells. Two haplotype blocks throughout the NKC, hb-1 and hb-2 have been associated with different levels of overall natural cytotoxicity. Here, we evaluated allelic and genotype frequencies at rs1049174, rs2617160, rs2617170, rs2617171, rs1983526 (hb-1 haplotype), and rs2255336 and rs2246809 (hb-2 haplotype) in 928 subjects examined from Mexico City. The most frequent alleles and genotypes were as follows: C, CG to rs1049174; G, GG to rs2255336; T, AT to rs2617160; G, GG to rs2246809; C, CT to rs2617170; G, CG to rs2617171; and G, CG to rs1983526. Linkage disequilibrium analysis revealed that rs1049174, rs2617160, rs2617170, and rs2617171 constituted the haplotype block-1 variant (hb-1v) (r² ≥ 0.89). Two predominant haplotypes of hb-1v were identified based on the allele content and included CTCG and GATC. This study is the first to evaluate the allelic and genotype frequency distribution of rs1049174, rs2255336, rs2617160, rs2246809, rs2617170, rs2617171, and rs1983526 in the population of Mexico City.
Affiliation(s)
- Wendy Guadalupe Vazquez-Gonzalez
  - Unidad de Investigación Médica en Inmunología, Hospital de Pediatría Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Cuauhtémoc 330, Col. Doctores, CP 06720 Ciudad de México, Mexico
- Julio Cesar Martinez-Alvarez
  - Banco Central de Sangre, Hospital de Especialidades Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Cuauhtémoc 330, Col. Doctores, CP 06720 Ciudad de México, Mexico
- Araceli Arrazola-Garcia
  - Banco Central de Sangre, Hospital de Especialidades Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Cuauhtémoc 330, Col. Doctores, CP 06720 Ciudad de México, Mexico
- Martha Perez-Rodriguez
  - Unidad de Investigación Médica en Inmunología, Hospital de Pediatría Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Cuauhtémoc 330, Col. Doctores, CP 06720 Ciudad de México, Mexico
|
14
|
Vergara-Lope A, Ennis S, Vorechovsky I, Pengelly RJ, Collins A. Heterogeneity in the extent of linkage disequilibrium among exonic, intronic, non-coding RNA and intergenic chromosome regions. Eur J Hum Genet 2019; 27:1436-1444. [PMID: 31053778 DOI: 10.1038/s41431-019-0419-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/13/2018] [Revised: 03/04/2019] [Accepted: 04/16/2019] [Indexed: 11/09/2022]
Abstract
Whole-genome sequence data enable construction of high-resolution linkage disequilibrium (LD) maps revealing the LD structure of functional elements within genic and subgenic sequences. The Malecot-Morton model defines LD map distances in linkage disequilibrium units (LDUs), analogous to the centimorgan scale of linkage maps. For whole-genome sequence-derived LD maps, we introduce the ratio of corresponding map lengths kilobases/LDU to describe the extent of LD within genome components. The extent of LD is highly variable across the genome ranging from ~38 kb for intergenic sequences to ~858 kb for centromeric regions. LD is ~16% more extensive in genic, compared with intergenic sequences, reflecting relatively increased selection and/or reduced recombination in genes. The LD profile across 18,268 autosomal genes reveals reduced extent of LD, consistent with elevated recombination, in exonic regions near the 5' end of genes but more extensive LD, compared with intronic sequences, across more centrally located exons. Genes classified as essential and genes linked to Mendelian phenotypes show more extensive LD compared with genes associated with complex traits, perhaps reflecting differences in selective pressure. Significant differences between exonic, intronic and intergenic components demonstrate that fine-scale LD structure provides important insights into genome function, which cannot be revealed by LD analysis of much lower resolution array-based genotyping and conventional linkage maps.
Affiliation(s)
- Alejandra Vergara-Lope
  - Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
- Sarah Ennis
  - Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
- Igor Vorechovsky
  - Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
- Reuben J Pengelly
  - Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
- Andrew Collins
  - Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
|
15
|
Guyatt AL, Brennan RR, Burrows K, Guthrie PAI, Ascione R, Ring SM, Gaunt TR, Pyle A, Cordell HJ, Lawlor DA, Chinnery PF, Hudson G, Rodriguez S. A genome-wide association study of mitochondrial DNA copy number in two population-based cohorts. Hum Genomics 2019; 13:6. [PMID: 30704525 PMCID: PMC6357493 DOI: 10.1186/s40246-018-0190-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 10/05/2018] [Accepted: 12/27/2018] [Indexed: 02/02/2023]
Abstract
BACKGROUND: Mitochondrial DNA copy number (mtDNA CN) exhibits interindividual and intercellular variation, but few genome-wide association studies (GWAS) of directly assayed mtDNA CN exist. We undertook a GWAS of qPCR-assayed mtDNA CN in the Avon Longitudinal Study of Parents and Children (ALSPAC) and the UK Blood Service (UKBS) cohort. After validating and harmonising data, 5461 ALSPAC mothers (16-43 years at mtDNA CN assay) and 1338 UKBS females (17-69 years) were included in a meta-analysis. Sensitivity analyses restricted to females with white cell-extracted DNA and adjusted for estimated or assayed cell proportions. Associations were also explored in ALSPAC children and UKBS males. RESULTS: A neutrophil-associated locus approached genome-wide significance (rs709591 [MED24], β (change in SD units of mtDNA CN per allele) [SE] -0.084 [0.016], p = 1.54e-07) in the main meta-analysis of adult females. This association was concordant in magnitude and direction in UKBS males and ALSPAC neonates. SNPs in and around ABHD8 were associated with mtDNA CN in ALSPAC neonates (rs10424198, β [SE] 0.262 [0.034], p = 1.40e-14), but not other study groups. In a meta-analysis of unrelated individuals (N = 11,253), we replicated a published association in TFAM (β [SE] 0.046 [0.017], p = 0.006), with an effect size much smaller than that observed in the replication analysis of a previous in silico GWAS. CONCLUSIONS: In a hypothesis-generating GWAS, we confirm an association between TFAM and mtDNA CN and present putative loci requiring replication in much larger samples. We discuss the limitations of our work, in terms of measurement error and cellular heterogeneity, and highlight the need for larger studies to better understand nuclear genomic control of mtDNA copy number.
Affiliation(s)
- Anna L. Guyatt
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Rebecca R. Brennan
  - Wellcome Centre for Mitochondrial Research, Newcastle University, Newcastle, UK
  - Institute of Genetic Medicine, Newcastle University, Newcastle, UK
- Kimberley Burrows
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Philip A. I. Guthrie
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Raimondo Ascione
  - Bristol Heart Institute, Translational Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Susan M. Ring
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Tom R. Gaunt
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Angela Pyle
  - Wellcome Centre for Mitochondrial Research, Newcastle University, Newcastle, UK
- Debbie A. Lawlor
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Patrick F. Chinnery
  - Department of Clinical Neurosciences and MRC Mitochondrial Biology Unit, University of Cambridge, Cambridge, UK
- Gavin Hudson
  - Wellcome Centre for Mitochondrial Research, Newcastle University, Newcastle, UK
  - Institute of Genetic Medicine, Newcastle University, Newcastle, UK
- Santiago Rodriguez
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
|
16
|
Madlon-Kay S, Montague MJ, Brent LJN, Ellis S, Zhong B, Snyder-Mackler N, Horvath JE, Skene JHP, Platt ML. Weak effects of common genetic variation in oxytocin and vasopressin receptor genes on rhesus macaque social behavior. Am J Primatol 2018; 80:e22873. [PMID: 29931777 DOI: 10.1002/ajp.22873] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 01/10/2018] [Revised: 04/01/2018] [Accepted: 04/02/2018] [Indexed: 02/02/2023]
Abstract
The neuropeptides oxytocin (OT) and arginine vasopressin (AVP) influence pair bonding, attachment, and sociality, as well as anxiety and stress responses in humans and other mammals. The effects of these peptides are mediated by genetic variability in their associated receptors, OXTR and the AVPR gene family. However, the role of these genes in regulating social behaviors in non-human primates is not well understood. To address this question, we examined whether genetic variation in the OT receptor gene OXTR and the AVP receptor genes AVPR1A and AVPR1B influence naturally-occurring social behavior in free-ranging rhesus macaques, gregarious primates that share many features of their biology and social behavior with humans. We assessed rates of social behavior across 3,250 hr of observational behavioral data from 201 free-ranging rhesus macaques on Cayo Santiago island in Puerto Rico, and used genetic sequence data to identify 25 OXTR, AVPR1A, and AVPR1B single-nucleotide variants (SNVs) in the population. We used an animal model to estimate the effects of 12 SNVs (n = 3 OXTR; n = 5 AVPR1A; n = 4 AVPR1B) on rates of grooming, approaches, passive contact, contact aggression, and non-contact aggression, given and received. Though we found evidence for modest heritability of these behaviors, estimates of effect sizes of the selected SNVs were close to zero, indicating that common OXTR and AVPR variation contributed little to social behavior in these animals. Our results are consistent with recent findings in human genetics that the effects of individual common genetic variants on complex phenotypes are generally small.
Affiliation(s)
- Seth Madlon-Kay
  - Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- Michael J Montague
  - Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- Lauren J N Brent
  - Centre for Research in Animal Behaviour, University of Exeter, Exeter, Devon
- Samuel Ellis
  - Centre for Research in Animal Behaviour, University of Exeter, Exeter, Devon
- Brian Zhong
  - Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- Noah Snyder-Mackler
  - Department of Psychology, University of Washington, Seattle, Washington
  - Center for Studies in Demography and Ecology, University of Washington, Seattle, Washington
  - Washington National Primate Research Center, University of Washington, Seattle, Washington
- Julie E Horvath
  - Department of Biological and Biomedical Sciences, North Carolina Central University, Durham, North Carolina
  - North Carolina Museum of Natural Sciences, Raleigh, North Carolina
  - Department of Evolutionary Anthropology, Duke University, Durham, North Carolina
- Michael L Platt
  - Department of Psychology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania
  - Department of Marketing, The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania
|
17
|
Abstract
Human genetic diversity is the result of population genetic forces. This genetic variation influences disease risk and contributes to health disparities. Natural selection is an important influence on human genetic variation. Because immune and inflammatory function genes are enriched for signals of positive selection, the prevalence of rheumatic disease-risk alleles seen in different populations is partially the result of differing selective pressures (eg, due to pathogens). This review summarizes the genetic regions associated with susceptibility to different rheumatic diseases and concomitant evidence for natural selection, including known agents of selection exerting selective pressure in these regions.
Affiliation(s)
- Paula S Ramos
  - Division of Rheumatology and Immunology, Department of Medicine, Medical University of South Carolina, 96 Jonathan Lucas Street, Suite 816, Charleston, SC 29425, USA; Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC, USA
|
18
|
Selection for long and short sleep duration in Drosophila melanogaster reveals the complex genetic network underlying natural variation in sleep. PLoS Genet 2017; 13:e1007098. [PMID: 29240764 PMCID: PMC5730107 DOI: 10.1371/journal.pgen.1007098] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 05/04/2017] [Accepted: 11/01/2017] [Indexed: 12/16/2022]
Abstract
Why do some individuals need more sleep than others? Forward mutagenesis screens in flies using engineered mutations have established a clear genetic component to sleep duration, revealing mutants that convey very long or short sleep. Whether such extreme long or short sleep could exist in natural populations was unknown. We applied artificial selection for high and low night sleep duration to an outbred population of Drosophila melanogaster for 13 generations. At the end of the selection procedure, night sleep duration diverged by 9.97 hours in the long and short sleeper populations, and 24-hour sleep was reduced to 3.3 hours in the short sleepers. Neither long nor short sleeper lifespan differed appreciably from controls, suggesting little physiological consequences to being an extreme long or short sleeper. Whole genome sequence data from seven generations of selection revealed several hundred thousand changes in allele frequencies at polymorphic loci across the genome. Combining the data from long and short sleeper populations across generations in a logistic regression implicated 126 polymorphisms in 80 candidate genes, and we confirmed three of these genes and a larger genomic region with mutant and chromosomal deficiency tests, respectively. Many of these genes could be connected in a single network based on previously known physical and genetic interactions. Candidate genes have known roles in several classic, highly conserved developmental and signaling pathways: EGFR, Wnt, Hippo, and MAPK. The involvement of highly pleiotropic pathway genes suggests that sleep duration in natural populations can be influenced by a wide variety of biological processes, which may be why the purpose of sleep has been so elusive.
One of the biggest mysteries in biology is the need to sleep. Sleep duration has an underlying genetic basis, suggesting that very long and short sleep times could be bred for experimentally. How far can sleep duration be driven up or down? Here we achieved extremely long and short night sleep duration by subjecting a wild-derived population of Drosophila melanogaster to an experimental breeding program. At the end of the breeding program, long sleepers averaged 9.97 hours more nightly sleep than short sleepers. We analyzed whole-genome sequences from seven generations of the experimental breeding to identify allele frequencies that diverged between long and short sleepers, and verified genes and genomic regions with mutation and deficiency testing. These alleles map to classic developmental and signaling pathways, implicating many diverse processes that potentially affect sleep duration.
|
19
|
Parker CC, Gopalakrishnan S, Carbonetto P, Gonzales NM, Leung E, Park YJ, Aryee E, Davis J, Blizard DA, Ackert-Bicknell CL, Lionikas A, Pritchard JK, Palmer AA. Genome-wide association study of behavioral, physiological and gene expression traits in outbred CFW mice. Nat Genet 2016; 48:919-26. [PMID: 27376237 PMCID: PMC4963286 DOI: 10.1038/ng.3609] [Citation(s) in RCA: 79] [Impact Index Per Article: 9.9] [Received: 02/01/2016] [Accepted: 06/08/2016] [Indexed: 12/15/2022]
Abstract
Although mice are the most widely used mammalian model organism, genetic studies have suffered from limited mapping resolution due to extensive linkage disequilibrium (LD) that is characteristic of crosses among inbred strains. Carworth Farms White (CFW) mice are a commercially available outbred mouse population that exhibit rapid LD decay in comparison to other available mouse populations. We performed a genome-wide association study (GWAS) of behavioral, physiological and gene expression phenotypes using 1,200 male CFW mice. We used genotyping by sequencing (GBS) to obtain genotypes at 92,734 SNPs. We also measured gene expression using RNA sequencing in three brain regions. Our study identified numerous behavioral, physiological and expression quantitative trait loci (QTLs). We integrated the behavioral QTL and eQTL results to implicate specific genes, including Azi2 in sensitivity to methamphetamine and Zmynd11 in anxiety-like behavior. The combination of CFW mice, GBS and RNA sequencing constitutes a powerful approach to GWAS in mice.
Affiliation(s)
- Clarissa C. Parker
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
  - Department of Psychology, Middlebury College, Middlebury, VT 05753, USA
  - Program in Neuroscience, Middlebury College, Middlebury, VT 05753, USA
- Shyam Gopalakrishnan
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
  - Museum of Natural History, Copenhagen University, Copenhagen, Denmark
- Peter Carbonetto
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
  - AncestryDNA, San Francisco, CA 94105, USA
- Emily Leung
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
- Yeonhee J Park
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
- Emmanuel Aryee
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
- Joe Davis
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
- David A. Blizard
  - Department of Biobehavioral Health, Pennsylvania State University, University Park, PA 16802, USA
- Cheryl L. Ackert-Bicknell
  - Center for Musculoskeletal Research, University of Rochester, Rochester, NY 14624, USA
  - Department of Orthopaedics and Rehabilitation, University of Rochester, Rochester, NY 14624, USA
- Arimantas Lionikas
  - School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, Foresterhill Aberdeen, Scotland UK
- Jonathan K. Pritchard
  - Department of Genetics, Stanford University, Palo Alto, CA 94305, USA
  - Department of Biology, Stanford University, Palo Alto, CA 94305, USA
  - Howard Hughes Medical Institute, Stanford University, Palo Alto, CA 94305, USA
- Abraham A. Palmer
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
  - Department of Psychiatry and Behavioral Neuroscience, University of Chicago, Chicago, IL 60637, USA
  - Department of Psychiatry, University of California San Diego, La Jolla, CA 92103, USA
  - Institute for Genomic Medicine, University of California San Diego, La Jolla, CA 92103, USA
|
20
|
Lee JY, Ha JJ, Park YS, Yi JK, Lee S, Mun S, Han K, Kim JJ, Kim HJ, Oh DY. Relationship between Single Nucleotide Polymorphisms in the Peroxisome Proliferator-Activated Receptor Gamma Gene and Fatty Acid Composition in Korean Native Cattle. ASIAN-AUSTRALASIAN JOURNAL OF ANIMAL SCIENCES 2016; 29:184-94. [PMID: 26732443 PMCID: PMC4698698 DOI: 10.5713/ajas.15.0502] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/10/2015] [Revised: 08/03/2015] [Accepted: 08/24/2015] [Indexed: 12/18/2022]
Abstract
The peroxisome proliferator-activated receptor gamma (PPARγ) gene plays an important role in the biosynthesis process controlled by a number of fatty acid transcription factors. This study investigates the relationships between 130 single-nucleotide polymorphisms (SNPs) in the PPARγ gene and the fatty acid composition of muscle fat in the commercial population of Korean native cattle. We identified 38 SNPs and verified relationships between 3 SNPs (g.1159-71208 A>G, g.42555-29812 G>A, and g.72362 G>T) and the fatty acid composition of commercial Korean native cattle (n = 513). Cattle with the AA genotype of g.1159-71208 A>G and the GG genotype of g.42555-29812 G>A and g.72362 G>T had higher levels of monounsaturated fatty acids and carcass traits (p<0.05). The results revealed that the 3 identified SNPs in the PPARγ gene affected fatty acid composition and carcass traits, suggesting that these 3 SNPs may improve the flavor and quality of beef in commercial Korean native cattle.
Affiliation(s)
- Jea-Young Lee
  - Livestock Research Institute, Yeongju 750-871, Korea
- Jae-Jung Ha
  - Livestock Research Institute, Yeongju 750-871, Korea
- Yong-Soo Park
  - Department of Equine Industry, Korea National College of Agriculture and Fisheries, Hwaseong 445-760, Korea
- Jun-Koo Yi
  - Livestock Research Institute, Yeongju 750-871, Korea
- Seunguk Lee
  - Biotechnology Research Center, The University of Tokyo, Bunkyo 113-8657, Tokyo
- Seyoung Mun
  - Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 330-714, Korea; DKU-Theragen institute for NGS analysis (DTiNa), Cheonan 330-714, Korea
- Kyudong Han
  - Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 330-714, Korea; DKU-Theragen institute for NGS analysis (DTiNa), Cheonan 330-714, Korea
- J-J Kim
  - School of Biotechnology, Yeungnam University, Gyeongsan 712-749, Korea
- Hyun-Ji Kim
  - Livestock Research Institute, Yeongju 750-871, Korea
- Dong-Yep Oh
  - Livestock Research Institute, Yeongju 750-871, Korea
|
21
|
Snelling WM, Bennett GL, Keele JW, Kuehn LA, McDaneld TG, Smith TP, Thallman RM, Kalbfleisch TS, Pollak EJ. A survey of polymorphisms detected from sequences of popular beef breeds. J Anim Sci 2015; 93:5128-43. [DOI: 10.2527/jas.2015-9356] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022]
|
22
|
Ramos PS, Shedlock AM, Langefeld CD. Genetics of autoimmune diseases: insights from population genetics. J Hum Genet 2015; 60:657-64. [PMID: 26223182 PMCID: PMC4660050 DOI: 10.1038/jhg.2015.94] [Citation(s) in RCA: 110] [Impact Index Per Article: 12.2] [Received: 04/28/2015] [Revised: 06/12/2015] [Accepted: 06/19/2015] [Indexed: 12/14/2022]
Abstract
Human genetic diversity is the result of population genetic forces. This genetic variation influences disease risk and contributes to health disparities. Autoimmune diseases (ADs) are a family of complex heterogeneous disorders with similar underlying mechanisms characterized by immune responses against self. Collectively, ADs are common, exhibit gender and ethnic disparities, and increasing incidence. As natural selection is an important influence on human genetic variation, and immune function genes are enriched for signals of positive selection, it is thought that the prevalence of AD risk alleles seen in different populations is partially the result of differing selective pressures (for example, due to pathogens). With the advent of high-throughput technologies, new analytical methodologies and large-scale projects, evidence for the role of natural selection in contributing to the heritable component of ADs keeps growing. This review summarizes the genetic regions associated with susceptibility to different ADs and concomitant evidence for selection, including known agents of selection exerting selective pressure in these regions. Examples of specific adaptive variants with phenotypic effects are included as evidence of natural selection increasing AD susceptibility. Many of the complexities of gene effects in different ADs can be explained by population genetics phenomena. Integrating AD susceptibility studies with population genetics to investigate how natural selection has contributed to genetic variation that influences disease risk will help to identify functional variants and elucidate biological mechanisms. As such, the study of population genetics in human populations holds untapped potential for elucidating the genetic causes of human disease and more rapidly focusing to personalized medicine.
Affiliation(s)
- Paula S Ramos
  - Division of Rheumatology and Immunology, Department of Medicine, and Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC, USA
- Andrew M Shedlock
  - Department of Biology, College of Charleston, Charleston, SC, USA
  - Hollings Marine Laboratory Center for Marine Biomedicine and College of Graduate Studies, Medical University of South Carolina, Charleston, SC, USA
- Carl D Langefeld
  - Division of Public Health Sciences, Department of Biostatistical Sciences; and Center for Public Health Genomics, Wake Forest School of Medicine, Winston-Salem, NC, USA
|
23
|
Berger S, Schlather M, de los Campos G, Weigend S, Preisinger R, Erbe M, Simianer H. A Scale-Corrected Comparison of Linkage Disequilibrium Levels between Genic and Non-Genic Regions. PLoS One 2015; 10:e0141216. [PMID: 26517830 PMCID: PMC4627745 DOI: 10.1371/journal.pone.0141216] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 03/16/2015] [Accepted: 10/06/2015] [Indexed: 12/27/2022]
Abstract
The understanding of non-random association between loci, termed linkage disequilibrium (LD), plays a central role in genomic research. Since causal mutations are generally not included in genomic marker data, LD between those and available markers is essential for capturing the effects of causal loci on localizing genes responsible for traits. Thus, the interpretation of association studies requires a detailed knowledge of LD patterns. It is well known that most LD measures depend on minor allele frequencies (MAF) of the considered loci and the magnitude of LD is influenced by the physical distances between loci. In the present study, a procedure to compare the LD structure between genomic regions comprising several markers each is suggested. The approach accounts for different scaling factors, namely the distribution of MAF, the distribution of pair-wise differences in MAF, and the physical extent of compared regions, reflected by the distribution of pair-wise physical distances. In the first step, genomic regions are matched based on similarity in these scaling factors. In the second step, chromosome- and genome-wide significance tests for differences in medians of LD measures in each pair are performed. The proposed framework was applied to test the hypothesis that the average LD is different in genic and non-genic regions. This was tested with a genome-wide approach with data sets for humans (Homo sapiens), a highly selected chicken line (Gallus gallus domesticus) and the model plant Arabidopsis thaliana. In all three data sets we found a significantly higher level of LD in genic regions compared to non-genic regions. About 31% more LD was detected genome-wide in genic compared to non-genic regions in Arabidopsis thaliana, followed by 13.6% in human and 6% chicken. Chromosome-wide comparison discovered significant differences on all 5 chromosomes in Arabidopsis thaliana and on one third of the human and of the chicken chromosomes.
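The second step of the procedure described above compares median LD within matched pairs of regions. The abstract does not name the significance test, so the following is only a minimal sketch that assumes a two-sided sign test as a stand-in; the function name `sign_test_p` and the input layout (one `(genic_median, non_genic_median)` tuple per matched pair) are hypothetical.

```python
from math import comb


def sign_test_p(pairs):
    """Two-sided sign test on matched (genic, non-genic) median-LD pairs.

    This is an illustrative stand-in, not the test used in the paper.
    Ties are dropped, as is conventional for the sign test.
    """
    diffs = [genic - non_genic for genic, non_genic in pairs if genic != non_genic]
    n = len(diffs)
    if n == 0:
        return 1.0

    # Count pairs in which the genic region shows the higher median LD.
    k = sum(d > 0 for d in diffs)

    # Under the null, k ~ Binomial(n, 1/2); double the smaller tail.
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2**n
    return min(1.0, 2 * tail)


# Ten matched pairs, genic LD higher in every one of them.
print(sign_test_p([(0.3, 0.2)] * 10))  # 0.001953125
```

A sign test uses only the direction of each within-pair difference, so it makes no assumption about the distribution of LD values; the matching on MAF and physical distance described in the first step is what makes the pairs comparable.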
Affiliation(s)
- Swetlana Berger
  - Animal Breeding and Genetics Group, Department of Animal Sciences, Georg-August-University, Goettingen, Germany
- Martin Schlather
  - School of Business Informatics and Mathematics, University of Mannheim, Mannheim, Germany
- Gustavo de los Campos
  - Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan, United States of America
- Steffen Weigend
  - Institute of Farm Animal Genetics, Friedrich-Loeffler Institut, Neustadt-Mariensee, Germany
- Malena Erbe
  - Animal Breeding and Genetics Group, Department of Animal Sciences, Georg-August-University, Goettingen, Germany
- Henner Simianer
  - Animal Breeding and Genetics Group, Department of Animal Sciences, Georg-August-University, Goettingen, Germany
|
24
|
Tung J, Zhou X, Alberts SC, Stephens M, Gilad Y. The genetic architecture of gene expression levels in wild baboons. eLife 2015; 4. [PMID: 25714927 PMCID: PMC4383332 DOI: 10.7554/elife.04729] [Citation(s) in RCA: 86] [Impact Index Per Article: 9.6] [Received: 09/12/2014] [Accepted: 02/03/2015] [Indexed: 12/19/2022]
Abstract
Primate evolution has been argued to result, in part, from changes in how genes are regulated. However, we still know little about gene regulation in natural primate populations. We conducted an RNA sequencing (RNA-seq)-based study of baboons from an intensively studied wild population. We performed complementary expression quantitative trait locus (eQTL) mapping and allele-specific expression analyses, discovering substantial evidence for, and surprising power to detect, genetic effects on gene expression levels in the baboons. eQTL were most likely to be identified for lineage-specific, rapidly evolving genes; interestingly, genes with eQTL significantly overlapped between baboons and a comparable human eQTL data set. Our results suggest that genes vary in their tolerance of genetic perturbation, and that this property may be conserved across species. Further, they establish the feasibility of eQTL mapping using RNA-seq data alone, and represent an important step towards understanding the genetic architecture of gene expression in primates.
Affiliation(s)
- Jenny Tung
- Department of Human Genetics, University of Chicago, Chicago, United States
| | - Xiang Zhou
- Department of Human Genetics, University of Chicago, Chicago, United States
| | - Susan C Alberts
- Institute of Primate Research, National Museums of Kenya, Nairobi, Kenya
| | - Matthew Stephens
- Department of Human Genetics, University of Chicago, Chicago, United States
| | - Yoav Gilad
- Department of Human Genetics, University of Chicago, Chicago, United States
| |
|
25
|
Ng J, Trask JS, Houghton P, Smith DG, Kanthaswamy S. Use of genome-wide heterospecific single-nucleotide polymorphisms to estimate linkage disequilibrium in rhesus and cynomolgus macaques. Comp Med 2015; 65:62-9. [PMID: 25730759 PMCID: PMC4396931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2014] [Revised: 08/25/2014] [Accepted: 10/27/2014] [Indexed: 06/04/2023]
Abstract
Rhesus and cynomolgus macaques are frequently used in biomedical research, and the availability of their reference genomes now provides for their use in genome-wide association studies. However, little is known about linkage disequilibrium (LD) in their genomes, which can affect the design and success of such studies. Here we studied LD by using 1781 conserved single-nucleotide polymorphisms (SNPs) in 183 rhesus macaques (Macaca mulatta), including 97 purebred Chinese and 86 purebred Indian animals, and 96 cynomolgus macaques (M. fascicularis fascicularis). Correlation between loci pairs decayed to 0.02 at 1146.83, 2197.92, and 3955.83 kb for Chinese rhesus, Indian rhesus, and cynomolgus macaques, respectively. Differences between the observed heterozygosity and minor allele frequency (MAF) of pairs of these 3 taxa were highly statistically significant. These 3 nonhuman primate taxa have significantly different genetic diversities (heterozygosity and MAF) and rates of LD decay. Our study confirms a much lower rate of LD decay in Indian than in Chinese rhesus macaques relative to that previously reported. In contrast, the especially low rate of LD decay in cynomolgus macaques suggests the particular usefulness of this species in genome-wide association studies. Although conserved markers, such as those used here, are required for valid LD comparisons among taxa, LD can be assessed with less bias by using species-specific markers, because conserved SNPs may be ancestral and therefore not informative for LD.
Affiliation(s)
- Jillian Ng
- Molecular Anthropology Laboratory, Department of Anthropology, University of California, Davis, California, USA
| | - Jessica Satkoski Trask
- Molecular Anthropology Laboratory, California National Primate Research Center, University of California, Davis, California, USA
| | | | - David G Smith
- Molecular Anthropology Laboratory, Department of Anthropology, California National Primate Research Center, University of California, Davis, California, USA
| | - Sree Kanthaswamy
- Molecular Anthropology Laboratory, Department of Anthropology, California National Primate Research Center, Department of Environmental Toxicology, University of California, Davis, California, USA; School of Mathematics and Natural Sciences, Arizona State University (ASU) at the West Campus, Glendale, Arizona, USA
| |
|
26
|
Oh DY, Lee YS, La BM, Lee JY, Park YS, Lee JH, Ha JJ, Yi JK, Kim BK, Yeo JS. Identification of exonic nucleotide variants of the thyroid hormone responsive protein gene associated with carcass traits and fatty acid composition in Korean cattle. ASIAN-AUSTRALASIAN JOURNAL OF ANIMAL SCIENCES 2014; 27:1373-80. [PMID: 25178286 PMCID: PMC4150167 DOI: 10.5713/ajas.2014.14101] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 02/12/2014] [Revised: 03/26/2014] [Accepted: 05/06/2014] [Indexed: 11/27/2022]
Abstract
The thyroid hormone responsive protein (THRSP) gene is a functional gene that can be used as an indicator of fatty acid composition. This study investigates the relationships between exonic single nucleotide polymorphisms (SNPs) in the THRSP gene and the fatty acid composition of muscle fat and marbling score in 612 Korean cattle. The relationships between fatty acid composition and eight SNPs in the THRSP gene (g.78 G>A, g.173 C>T, g.184 C>T, g.190 C>A, g.194 C>T, g.277 C>G, g.283 T>G and g.290 T>G) were investigated, and two SNPs (g.78 G>A and g.184 C>T) in exon 1 were associated with fatty acid composition. The GG and CC genotypes of g.78 G>A and g.184 C>T had higher unsaturated fatty acid (UFA) and monounsaturated fatty acid (MUFA) content (p<0.05). In addition, the ht1*ht1 group (Val/Ala haplotype) in linkage disequilibrium increased MUFAs and marbling scores for carcass traits (p<0.05). As a result, g.78 G>A and g.184 C>T had significant relationships with UFAs and MUFAs. Two SNPs in the THRSP gene affected fatty acid composition, suggesting that the GG and CC genotypes and the ht1*ht1 group (Val/Ala haplotype) can be markers to genetically improve the quality and flavor of beef.
Affiliation(s)
- Dong-Yep Oh
- Institute of Green Bio Science and Technology, Seoul National University, Pyeongchang 232-916, Korea
| | - Yoon-Seok Lee
- Institute of Green Bio Science and Technology, Seoul National University, Pyeongchang 232-916, Korea
| | - Boo-Mi La
- School of Biotechnology, Yeungnam University, Gyeongsan 712-749, Korea
| | - Jea-Young Lee
- Department of Statistics, Yeungnam University, Gyeongsan 712-749, Korea
| | - Yong-Soo Park
- Department of Equine Industry, Korea National College of Agriculture and Fisheries, Hwaseong 445-760, Korea
| | - Ji-Hong Lee
- Gyeongbuk Provincial College, Yecheon 750-767, Korea
| | - Jae-Jung Ha
- Institute of Green Bio Science and Technology, Seoul National University, Pyeongchang 232-916, Korea
| | - Jun-Koo Yi
- Institute of Green Bio Science and Technology, Seoul National University, Pyeongchang 232-916, Korea
| | - Byung-Ki Kim
- Institute of Green Bio Science and Technology, Seoul National University, Pyeongchang 232-916, Korea
| | - Jung-Sou Yeo
- School of Biotechnology, Yeungnam University, Gyeongsan 712-749, Korea
| |
|
27
|
Genome-wide estimation of linkage disequilibrium from population-level high-throughput sequencing data. Genetics 2014; 197:1303-13. [PMID: 24875187 DOI: 10.1534/genetics.114.165514] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022]
Abstract
Rapidly improving sequencing technologies provide unprecedented opportunities for analyzing genome-wide patterns of polymorphisms. In particular, they have great potential for linkage-disequilibrium analyses on both global and local genetic scales, which will substantially improve our ability to derive evolutionary inferences. However, there are some difficulties with analyzing high-throughput sequencing data, including high error rates associated with base reads and complications from the random sampling of sequenced chromosomes in diploid organisms. To overcome these difficulties, we developed a maximum-likelihood estimator of linkage disequilibrium for use with error-prone sampling data. Computer simulations indicate that the estimator is nearly unbiased with a sampling variance at high coverage asymptotically approaching the value expected when all relevant information is accurately estimated. The estimator does not require phasing of haplotypes and enables the estimation of linkage disequilibrium even when all individual reads cover just single polymorphic sites.
|
28
|
Ramos PS, Shaftman SR, Ward RC, Langefeld CD. Genes associated with SLE are targets of recent positive selection. Autoimmune Dis 2014; 2014:203435. [PMID: 24587899 PMCID: PMC3920976 DOI: 10.1155/2014/203435] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 09/23/2013] [Accepted: 11/12/2013] [Indexed: 01/03/2023]
Abstract
The reasons for the ethnic disparities in the prevalence of systemic lupus erythematosus (SLE) and the relatively high frequency of SLE risk alleles in the population are not fully understood. Population genetic factors such as natural selection alter allele frequencies over generations and may help explain the persistence of such common risk variants in the population and the differential risk of SLE. In order to better understand the genetic basis of SLE that might be due to natural selection, a total of 74 genomic regions with compelling evidence for association with SLE were tested for evidence of recent positive selection in the HapMap and HGDP populations, using population differentiation, allele frequency, and haplotype-based tests. Consistent signs of positive selection across different studies and statistical methods were observed at several SLE-associated loci, including PTPN22, TNFSF4, TET3-DGUOK, TNIP1, UHRF1BP1, BLK, and ITGAM genes. This study is the first to evaluate and report that several SLE-associated regions show signs of positive natural selection. These results provide corroborating evidence in support of recent positive selection as one mechanism underlying the elevated population frequency of SLE risk loci and support future research that integrates signals of natural selection to help identify functional SLE risk alleles.
Affiliation(s)
- Paula S. Ramos
- Department of Medicine, Medical University of South Carolina, Charleston, SC 29425, USA
| | - Stephanie R. Shaftman
- Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC 29425, USA
| | - Ralph C. Ward
- Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC 29425, USA
| | - Carl D. Langefeld
- Department of Public Health Sciences, Wake Forest School of Medicine and Center for Public Health Genomics, Winston-Salem, NC 27157, USA
| |
|
29
|
Edge MD, Gorroochurn P, Rosenberg NA. Windfalls and pitfalls: Applications of population genetics to the search for disease genes. EVOLUTION MEDICINE AND PUBLIC HEALTH 2013; 2013:254-72. [PMID: 24481204 PMCID: PMC3868415 DOI: 10.1093/emph/eot021] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/26/2022]
Abstract
Association mapping can be viewed as an application of population genetics and evolutionary biology to the problem of identifying genes causally connected to phenotypes. However, some population-genetic principles important to the design and analysis of association studies have not been widely understood or have even been generally misunderstood. Some of these principles underlie techniques that can aid in the discovery of genetic variants that influence phenotypes (‘windfalls’), whereas others can interfere with study design or interpretation of results (‘pitfalls’). Here, considering examples involving genetic variant discovery, linkage disequilibrium, power to detect associations, population stratification and genotype imputation, we address misunderstandings in the application of population genetics to association studies, and we illuminate how some surprising results in association contexts can be easily explained when considered from evolutionary and population-genetic perspectives. Through our examples, we argue that population-genetic thinking—which takes a theoretical view of the evolutionary forces that guide the emergence and propagation of genetic variants—substantially informs the design and interpretation of genetic association studies. In particular, population-genetic thinking sheds light on genetic confounding, on the relationships between association signals of typed markers and causal variants, and on the advantages and disadvantages of particular strategies for measuring genetic variation in association studies.
Affiliation(s)
- Michael D Edge
- Department of Biology, Stanford University, Stanford, CA 94305-5020, USA and Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032, USA
| | | | | |
|
30
|
Stern JA, White SN, Meurs KM. Extent of linkage disequilibrium in large-breed dogs: chromosomal and breed variation. Mamm Genome 2013; 24:409-15. [PMID: 24062056 DOI: 10.1007/s00335-013-9474-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 05/15/2013] [Accepted: 08/01/2013] [Indexed: 11/24/2022]
Abstract
The aim of this study was to better define the extent of linkage disequilibrium (LD) in populations of large-breed dogs and its variation by breed and chromosomal region. Understanding the extent of LD is a crucial component for successful utilization of genome-wide association studies and allows researchers to better define regions of interest and target candidate genes. Twenty-four Golden Retriever dogs, 28 Rottweiler dogs, and 24 Newfoundland dogs were genotyped for single-nucleotide polymorphism (SNP) data using a high-density SNP array. LD was calculated for all autosomes using Haploview. Decay of the squared correlation coefficient (r²) was plotted on a per-breed and per-chromosome basis as well as in a genome-wide fashion. The point of 50% decay of r² was used to estimate the difference in extent of LD between breeds. Extent of LD was significantly shorter for Newfoundland dogs based upon 50% decay of r² data at a mean of 344 kb compared to Golden Retriever and Rottweiler dogs at 715 and 834 kb, respectively (P < 0.0001). Notable differences in LD by chromosome were present within each breed and not strictly related to the length of the corresponding chromosome. Extent of LD is breed and chromosome dependent. To our knowledge, this is the first report of SNP-based LD for Newfoundland dogs, the first report based on genome-wide SNPs for Rottweilers, and an almost tenfold improvement in marker density over previous genome-wide studies of LD in Golden Retrievers.
Affiliation(s)
- Joshua A Stern
- Department of Clinical Sciences, College of Veterinary Medicine, North Carolina State University, Raleigh, NC, 27607, USA
| | | | | |
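Editorial illustration (not taken from the cited study): the "50% decay of r²" summary described in the abstract above can be sketched as a linear interpolation over a binned decay curve. The function name `half_decay_distance`, the choice of the first bin as the reference value, and the numbers in the usage note are assumptions made for illustration; the authors' exact procedure is not given in the abstract.

```python
def half_decay_distance(distances_kb, r2_values):
    """Distance (kb) at which mean r^2 falls to half of its value in the first bin.

    distances_kb: increasing inter-marker distances, one per bin.
    r2_values:    mean r^2 in each bin (assumed to decrease with distance).
    Returns None if the curve never crosses the half-decay threshold.
    """
    target = 0.5 * r2_values[0]
    points = list(zip(distances_kb, r2_values))
    # Walk adjacent bins and interpolate linearly inside the bin pair
    # that brackets the target value.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 >= target >= y1 and y0 != y1:
            return x0 + (y0 - target) * (x1 - x0) / (y0 - y1)
    return None
```

For a made-up curve with distances `[0, 100, 200, 400]` kb and mean r² `[0.8, 0.6, 0.3, 0.1]`, the threshold is 0.4 and the function interpolates between the 100 kb and 200 kb bins.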
|
31
|
Azia A, Uversky VN, Horovitz A, Unger R. The Effects of Mutations on Protein Function: A Comparative Study of Three Databases of Mutations in Humans. Isr J Chem 2013. [DOI: 10.1002/ijch.201300011] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
|
32
|
Abstract
The molecular basis of adaptation--and, in particular, the relative roles of protein-coding versus gene expression changes--has long been the subject of speculation and debate. Recently, the genotyping of diverse human populations has led to the identification of many putative "local adaptations" that differ between populations. Here I show that these local adaptations are over 10-fold more likely to affect gene expression than amino acid sequence. In addition, a novel framework for identifying polygenic local adaptations detects recent positive selection on the expression levels of genes involved in UV radiation response, immune cell proliferation, and diabetes-related pathways. These results provide the first examples of polygenic gene expression adaptation in humans, as well as the first genome-scale support for the hypothesis that changes in gene expression have driven human adaptation.
Affiliation(s)
- Hunter B Fraser
- Department of Biology, Stanford University, Stanford, California 94305, USA.
| |
|
33
|
Oh D, La B, Lee Y, Byun Y, Lee J, Yeo G, Yeo J. Identification of novel single nucleotide polymorphisms (SNPs) of the lipoprotein lipase (LPL) gene associated with fatty acid composition in Korean cattle. Mol Biol Rep 2012; 40:3155-63. [PMID: 23271120 DOI: 10.1007/s11033-012-2389-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 09/25/2012] [Accepted: 12/17/2012] [Indexed: 11/28/2022]
Abstract
The lipoprotein lipase (LPL) gene can be considered a functional candidate gene that regulates fatty acid composition. In this study, genetic associations between fatty acid composition and exonic single nucleotide polymorphisms (SNPs) in the LPL gene were examined using 612 Korean cattle. We investigated the relationship between unsaturated fatty acids and five novel SNPs (c.322G>A, c.329A>T, c.527T>G, c.988C>T and c.1591G>A), and confirmed that three polymorphic SNPs (c.322G>A, c.329A>T and c.1591G>A) were associated with fatty acid composition. Korean cattle with an AA genotype of c.322G>A, c.329A>T, and GA genotype of c.1591G>A had higher levels of monounsaturated fatty acids and carcass traits (P < 0.05). Our findings confirmed that three novel SNPs we identified in the LPL gene can affect fatty acid composition and carcass traits. Therefore, selection for AA and GA genotypes should be recommended to genetically improve beef quality and flavor.
Affiliation(s)
- Dongyep Oh
- School of Biotechnology, Yeungnam University, Gyeongsan, Gyeongbuk 712-749, South Korea.
| | | | | | | | | | | | | |
|
34
|
Karjalainen MK, Huusko JM, Ulvila J, Sotkasiira J, Luukkonen A, Teramo K, Plunkett J, Anttila V, Palotie A, Haataja R, Muglia LJ, Hallman M. A potential novel spontaneous preterm birth gene, AR, identified by linkage and association analysis of X chromosomal markers. PLoS One 2012; 7:e51378. [PMID: 23227263 PMCID: PMC3515491 DOI: 10.1371/journal.pone.0051378] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 08/08/2012] [Accepted: 11/07/2012] [Indexed: 11/20/2022]
Abstract
Preterm birth is the major cause of neonatal mortality and morbidity. In many cases, it has severe life-long consequences for the health and neurological development of the newborn child. More than 50% of all preterm births are spontaneous, and currently there is no effective prevention. Several studies suggest that genetic factors play a role in spontaneous preterm birth (SPTB). However, its genetic background is insufficiently characterized. The aim of the present study was to perform a linkage analysis of X chromosomal markers in SPTB in large northern Finnish families with recurrent SPTBs. We found a significant linkage signal (HLOD = 3.72) on chromosome locus Xq13.1 when the studied phenotype was being born preterm. There were no significant linkage signals when the studied phenotype was giving preterm deliveries. Two functional candidate genes, those encoding the androgen receptor (AR) and the interleukin-2 receptor gamma subunit (IL2RG), located near this locus were analyzed as candidates for SPTB in subsequent case-control association analyses. Nine single-nucleotide polymorphisms (SNPs) within these genes and an AR exon-1 CAG repeat, which was previously demonstrated to be functionally significant, were analyzed in mothers with preterm delivery (n = 272) and their offspring (n = 269), and in mothers with exclusively term deliveries (n = 201) and their offspring (n = 199), all originating from northern Finland. A replication study population consisting of individuals born preterm (n = 111) and term (n = 197) from southern Finland was also analyzed. Long AR CAG repeats (≥26) were overrepresented and short repeats (≤19) underrepresented in individuals born preterm compared to those born at term. Thus, our linkage and association results emphasize the role of the fetal genome in genetic predisposition to SPTB and implicate AR as a potential novel fetal susceptibility gene for SPTB.
Affiliation(s)
- Minna K Karjalainen
- Department of Pediatrics, Institute of Clinical Medicine, University of Oulu, Oulu, Finland
| | | | | | | | | | | | | | | | | | | | | | | |
|
35
|
Elhaik E. Empirical distributions of F(ST) from large-scale human polymorphism data. PLoS One 2012; 7:e49837. [PMID: 23185452 PMCID: PMC3504095 DOI: 10.1371/journal.pone.0049837] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 02/27/2012] [Accepted: 10/12/2012] [Indexed: 12/19/2022]
Abstract
Studies of the apportionment of human genetic variation have long established that most human variation is within population groups and that the additional variation between population groups is small but greatest when comparing different continental populations. These studies often used Wright's F(ST) that apportions the standardized variance in allele frequencies within and between population groups. Because local adaptations increase population differentiation, high-F(ST) may be found at closely linked loci under selection and used to identify genes undergoing directional or heterotic selection. We re-examined these processes using HapMap data. We analyzed 3 million SNPs on 602 samples from eight worldwide populations and a consensus subset of 1 million SNPs found in all populations. We identified four major features of the data: First, a hierarchical F(ST) analysis showed that only a small fraction (12%) of the total genetic variation is distributed between continental populations and an even smaller fraction (1%) is found between intra-continental populations. Second, the global F(ST) distribution closely follows an exponential distribution. Third, although the overall F(ST) distribution is similarly shaped (inverse J), F(ST) distributions vary markedly by allele frequency when divided into non-overlapping groups by allele frequency range. Because the mean allele frequency is a crude indicator of allele age, these distributions mark the time-dependent change in genetic differentiation. Finally, the change in mean-F(ST) of these groups is linear in allele frequency. These results suggest that investigating the extremes of the F(ST) distribution for each allele frequency group is more efficient for detecting selection. Consequently, we demonstrate that such extreme SNPs are more clustered along the chromosomes than expected from linkage disequilibrium for each allele frequency group. These genomic regions are therefore likely candidates for natural selection.
Affiliation(s)
- Eran Elhaik
- Department of Mental Health, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA.
| |
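Editorial illustration (not the estimator used in the cited study): the abstract above describes Wright's F(ST) as a standardized variance in allele frequencies. A minimal sketch of that textbook definition, F(ST) = Var(p) / (p̄(1 − p̄)), follows. The function name `wright_fst`, the equal weighting of subpopulations, and the absence of any sample-size correction are assumptions for illustration.

```python
def wright_fst(subpop_freqs):
    """Wright's F_ST at one diallelic locus as a standardized variance.

    subpop_freqs: allele frequency of the same allele in each subpopulation.
    Subpopulations are weighted equally; no sampling correction is applied.
    """
    n = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / n
    # A locus that is fixed everywhere carries no differentiation signal.
    if p_bar <= 0.0 or p_bar >= 1.0:
        return 0.0
    var_p = sum((p - p_bar) ** 2 for p in subpop_freqs) / n
    return var_p / (p_bar * (1.0 - p_bar))
```

Identical frequencies in every subpopulation give 0, and alternative fixation (frequencies 0 and 1) gives 1, which matches the usual interpretation of the statistic's range.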
|
36
|
Hendre PS, Kamalakannan R, Varghese M. High-throughput and parallel SNP discovery in selected candidate genes in Eucalyptus camaldulensis using Illumina NGS platform. PLANT BIOTECHNOLOGY JOURNAL 2012; 10:646-56. [PMID: 22607345 DOI: 10.1111/j.1467-7652.2012.00699.x] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 05/25/2023]
Abstract
Next generation sequencing (NGS) technologies have revolutionized the pace and scale of genomics- and transcriptomics-based SNP discovery across different plant and animal species. Herein, 72-base paired-end Illumina sequencing was employed for high-throughput, parallel and large-scale SNP discovery in 41 growth-related candidate genes in Eucalyptus camaldulensis. Approximately 100 kb of genome from 96 individuals was amplified and sequenced using a hierarchical DNA/PCR pooling strategy and assembled over the corresponding E. grandis reference. A total of 1191 SNPs (minimum 5% other allele frequency) were identified with an average frequency of 1 SNP/83.9 bp, whereas in exons and introns, it was 1 SNP/108.4 bp and 1 SNP/65.6 bp, respectively. A total of 75 insertions and 89 deletions were detected, of which approximately 15% were exonic. Transitions (Tr) were in excess of transversions (Tv) (Tr/Tv: 1.89), and the excess was greater in exons (Tr/Tv: 2.73). In exons, synonymous SNPs (Ks) prevailed over the non-synonymous SNPs (Ka; average Ka/Ks ratio: 0.72, range: 0-3.00 across genes). Many of the exonic SNPs/indels had the potential to change the amino acid sequence of the respective genes. Transcription factors appeared more conserved, whereas enzyme coding genes appeared under relaxed control. Further, 541 SNPs were classified into 196 'equal frequency' (EF) blocks with almost similar minor allele frequencies to facilitate selection of one tag-SNP/EF-block. There were 241 (approximately 20%) 'zero-SNP' blocks with absence of SNPs in surrounding ±60 bp windows. The data thus indicated enormous extant and unexplored diversity in E. camaldulensis in the studied genes with potential applications for marker-trait associations.
Affiliation(s)
- Prasad S Hendre
- ITC R&D Centre, Peenya Industrial Area, Bangalore, Karnataka, India.
| | | | | |
|
37
|
Trask JS, Garnica WT, Kanthaswamy S, Malhi RS, Smith DG. 4040 SNPs for genomic analysis in the rhesus macaque (Macaca mulatta). Genomics 2011; 98:352-8. [PMID: 21907785 PMCID: PMC3207016 DOI: 10.1016/j.ygeno.2011.08.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 05/26/2011] [Revised: 08/11/2011] [Accepted: 08/17/2011] [Indexed: 11/17/2022]
Abstract
Although the rhesus macaque (Macaca mulatta) is commonly used for biomedical research and becoming a preferred model for translational medicine, quantification of genome-wide variation has been slow to follow the publication of the genome in 2007. Here we report the properties of 4040 single nucleotide polymorphisms discovered and validated in Chinese and Indian rhesus macaques from captive breeding colonies in the United States. Frequency-matched measures of linkage disequilibrium were much greater in the Indian sample. Although the majority of polymorphisms were shared between the two populations, rare alleles were over twice as common in the Chinese sample. Indian rhesus had higher rates of heterozygosity, as well as previously undetected substructure, potentially due to admixture from Burma in wild populations and demographic events post-captivity.
Affiliation(s)
- J Satkoski Trask
- Department of Anthropology, University of California, Davis, USA.
| | | | | | | | | |
|
38
|
Whiteley AR, Bhat A, Martins EP, Mayden RL, Arunachalam M, Uusi-Heikkilä S, Ahmed ATA, Shrestha J, Clark M, Stemple D, Bernatchez L. Population genomics of wild and laboratory zebrafish (Danio rerio). Mol Ecol 2011; 20:4259-76. [PMID: 21923777 PMCID: PMC3627301 DOI: 10.1111/j.1365-294x.2011.05272.x] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.4] [Indexed: 12/15/2022]
Abstract
Understanding a wider range of genotype–phenotype associations can be achieved through ecological and evolutionary studies of traditional laboratory models. Here, we conducted the first large-scale geographic analysis of genetic variation within and among wild zebrafish (Danio rerio) populations occurring in Nepal, India, and Bangladesh, and we genetically compared wild populations to several commonly used lab strains. We examined genetic variation at 1832 polymorphic EST-based single nucleotide polymorphisms (SNPs) and the cytb mitochondrial gene in 13 wild populations and three lab strains. Natural populations were subdivided into three major mitochondrial DNA clades with an average among-clade sequence divergence of 5.8%. SNPs revealed five major evolutionarily and genetically distinct groups with an overall FST of 0.170 (95% CI 0.105–0.254). These genetic groups corresponded to discrete geographic regions and appear to reflect isolation in refugia during past climate cycles. We detected 71 significantly divergent outlier loci (3.4%) and nine loci (0.5%) with significantly low FST values. Valleys of reduced heterozygosity, consistent with selective sweeps, surrounded six of the 71 outliers (8.5%). The lab strains formed two additional groups that were genetically distinct from all wild populations. An additional subset of outlier loci was consistent with domestication selection within lab strains. Substantial genetic variation that exists in zebrafish as a whole is missing from lab strains that we analysed. A combination of laboratory and field studies that incorporates genetic variation from divergent wild populations along with the wealth of molecular information available for this model organism provides an opportunity to advance our understanding of genetic influences on phenotypic variation for a vertebrate species.
Affiliation(s)
- Andrew R Whiteley
- Department of Environmental Conservation, University of Massachusetts, Amherst, MA 01003, USA.
| | | | | | | | | | | | | | | | | | | | | |
|
39
|
Zapata C. On the uses and applications of the most commonly used measures of linkage disequilibrium from the comparative analysis of their statistical properties. Hum Hered 2011; 71:186-95. [PMID: 21778738 DOI: 10.1159/000327732] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 11/18/2010] [Accepted: 03/22/2011] [Indexed: 11/19/2022]
Abstract
BACKGROUND/OBJECTIVE: The analysis of linkage disequilibrium is relevant for the exploration of the structure and evolution of genomes and for the gene mapping of quantitative characters and human diseases. The strength of linkage disequilibrium between diallelic loci is commonly measured by the coefficients D' and r. Recent studies suggest that r is more useful than D' as a general measure of the strength of disequilibrium because it provides much more precise (lower sampling variance) and accurate (lower bias) estimates of disequilibrium. We compared for the first time the statistical properties of D' and r taking into account their differences in range. METHODS: The sampling properties of D' and r were evaluated by simulation under a variety of realistic population conditions and varying sample sizes using standardised statistics that allow for comparisons of the precision, accuracy and efficiency of estimates with different ranges. RESULTS: Simulations revealed that estimates of r do not tend to be significantly more precise, accurate or efficient than those of D' when compared by means of standardised statistics. CONCLUSION: The supposed advantage of r over D' based on direct comparisons of their sampling distributions is more apparent than real. The obtained results are useful to assess the uses and applications of these widely used disequilibrium measures.
Affiliation(s)
- Carlos Zapata
- Departamento de Genética, Universidad de Santiago, Santiago de Compostela, Spain.
| |
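Editorial illustration (not code from the cited paper): the abstract above compares the coefficients D' and r and notes that they differ in range. A minimal sketch of the standard textbook definitions for two diallelic loci may make that difference concrete. The function name `ld_measures` and its argument names are assumptions for illustration; the inputs are population haplotype and allele frequencies rather than sample estimates.

```python
import math

def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r) for two diallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    # D' rescales D by the largest magnitude it can reach
    # given the allele frequencies and the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # r is the correlation between the allelic indicator variables.
    denom = math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    r = d / denom if denom > 0 else 0.0
    return d, d_prime, r
```

With `p_ab = 0.1`, `p_a = 0.1` and `p_b = 0.5`, D' reaches its maximum of 1 while r is only about 1/3. This is the kind of range difference that makes direct comparison of the two coefficients' sampling distributions hard to interpret.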
|
40
|
Hu XS, Yeh FC, Wang Z. Structural genomics: correlation blocks, population structure, and genome architecture. Curr Genomics 2011; 12:55-70. [PMID: 21886455 PMCID: PMC3129043 DOI: 10.2174/138920211794520141] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/25/2010] [Revised: 01/06/2011] [Accepted: 01/06/2011] [Indexed: 11/27/2022]
Abstract
Integrating the pattern of genome-wide inter-site associations with evolutionary forces is important for gaining insights into genomic evolution in natural or artificial populations. Here, we assess inter-site correlation blocks and their distributions along chromosomes. A correlation block is broadly defined as a DNA segment within which strong correlations exist between genetic diversities at any two sites. We bring together the population genetic structure and the genomic diversity structure that have been independently built on different scales, and synthesize the existing theories and methods for characterizing genomic structure at the population level. We discuss how population structure could shape correlation blocks and their patterns within and between populations, and how evolutionary forces (selection, migration, genetic drift, and mutation) affect the pattern of genome-wide correlation blocks. We also briefly discuss the associations between the pattern of correlation blocks and genome assembly features in eukaryotic organisms, including the impacts of multigene families, the perturbation of transposable elements, and repetitive nongenic sequences and GC-rich isochores. Our review suggests that the observable pattern of correlation blocks can refine our understanding of the ecological and evolutionary processes underlying genomic evolution at the population level.
Affiliation(s)
- Xin-Sheng Hu
- 1400 College Plaza, Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB T6J 2C8, Canada
- Department of Renewable Resources, 751 General Service Building, University of Alberta, Edmonton, Alberta, T6G 2H1, Canada
- Francis C. Yeh
- Department of Renewable Resources, 751 General Service Building, University of Alberta, Edmonton, Alberta, T6G 2H1, Canada
- Zhiquan Wang
- 1400 College Plaza, Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB T6J 2C8, Canada
|
41
|
Miller JM, Poissant J, Kijas JW, Coltman DW. A genome-wide set of SNPs detects population substructure and long range linkage disequilibrium in wild sheep. Mol Ecol Resour 2010; 11:314-22. [PMID: 21429138 DOI: 10.1111/j.1755-0998.2010.02918.x] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.9] [Indexed: 01/21/2023]
Abstract
The development of genomic resources for wild species is still in its infancy. However, cross-species utilization of technologies developed for their domestic counterparts has the potential to unlock the genomes of organisms that currently lack genomic resources. Here, we apply the OvineSNP50 BeadChip, developed for domestic sheep, to two related wild ungulate species: the bighorn sheep (Ovis canadensis) and the thinhorn sheep (Ovis dalli). Over 95% of the domestic sheep markers were successfully genotyped in a sample of fifty-two bighorn sheep, while over 90% were genotyped in two thinhorn sheep. Pooling the results from both species identified 868 single-nucleotide polymorphisms (SNPs), of which 570 were detected in bighorn sheep and 330 in thinhorn sheep. The total panel of SNPs was able to discriminate between the two species, assign population of origin for bighorn sheep and detect known relationship classes within one population of bighorn sheep. Using an informative subset of these SNPs (n=308), we examined the extent of genome-wide linkage disequilibrium (LD) within one population of bighorn sheep and found that high levels of LD persist over 4 Mb.
Affiliation(s)
- J M Miller
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
|
42
|
Yang HC, Lin HC, Huang MC, Li LH, Pan WH, Wu JY, Chen YT. A new analysis tool for individual-level allele frequency for genomic studies. BMC Genomics 2010; 11:415. [PMID: 20602748 PMCID: PMC2996943 DOI: 10.1186/1471-2164-11-415] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/01/2010] [Accepted: 07/05/2010] [Indexed: 01/23/2023]
Abstract
Background Allele frequency is one of the most important population indices and has been broadly applied to genetic/genomic studies. Estimation of allele frequency using genotypes is convenient but may lose data information and be sensitive to genotyping errors. Results This study utilizes a unified intensity-measuring approach to estimating individual-level allele frequencies for 1,104 and 1,270 samples genotyped with the single-nucleotide-polymorphism arrays of the Affymetrix Human Mapping 100K and 500K Sets, respectively. Allele frequencies of all samples are estimated and adjusted by coefficients of preferential amplification/hybridization (CPA), and large ethnicity-specific and cross-ethnicity databases of CPA and allele frequency are established. The results show that using the CPA significantly improves the accuracy of allele frequency estimates; moreover, this paramount factor is insensitive to the time of data acquisition, effect of laboratory site, type of gene chip, and phenotypic status. Based on accurate allele frequency estimates, analytic methods based on individual-level allele frequencies are developed and successfully applied to discover genomic patterns of allele frequencies, detect chromosomal abnormalities, classify sample groups, identify outlier samples, and estimate the purity of tumor samples. The methods are packaged into a new analysis tool, ALOHA (Allele-frequency/Loss-of-heterozygosity/Allele-imbalance). Conclusions This is the first time that these important genetic/genomic applications have been simultaneously conducted by the analyses of individual-level allele frequencies estimated by a unified intensity-measuring approach. We expect that additional practical applications for allele frequency analysis will be found. The developed databases and tools provide useful resources for human genome analysis via high-throughput single-nucleotide-polymorphism arrays. 
The ALOHA software was written in R and R GUI and can be downloaded at http://www.stat.sinica.edu.tw/hsinchou/genetics/aloha/ALOHA.htm.
Affiliation(s)
- Hsin-Chou Yang
- Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan.
|
43
|
Rosenberg NA, Huang L, Jewett EM, Szpiech ZA, Jankovic I, Boehnke M. Genome-wide association studies in diverse populations. Nat Rev Genet 2010; 11:356-66. [PMID: 20395969 PMCID: PMC3079573 DOI: 10.1038/nrg2760] [Citation(s) in RCA: 414] [Impact Index Per Article: 29.6] [Indexed: 12/14/2022]
Abstract
Genome-wide association (GWA) studies have identified a large number of SNPs associated with disease phenotypes. As most GWA studies have been performed in populations of European descent, this Review examines the issues involved in extending the consideration of GWA studies to diverse worldwide populations. Although challenges exist with issues such as imputation, admixture and replication, investigation of a greater diversity of populations could make substantial contributions to the goal of mapping the genetic determinants of complex diseases for the human population as a whole.
Affiliation(s)
- Noah A Rosenberg
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA.
|
44
|
Bush WS, Chen G, Torstenson ES, Ritchie MD. LD-spline: mapping SNPs on genotyping platforms to genomic regions using patterns of linkage disequilibrium. BioData Min 2009; 2:7. [PMID: 19954552 PMCID: PMC2795743 DOI: 10.1186/1756-0381-2-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 08/06/2009] [Accepted: 12/03/2009] [Indexed: 01/29/2023]
Abstract
Background Gene-centric analysis tools for genome-wide association study data are being developed both to annotate single locus statistics and to prioritize or group single nucleotide polymorphisms (SNPs) prior to analysis. These approaches require knowledge about the relationships between SNPs on a genotyping platform and genes in the human genome. SNPs in the genome can represent broader genomic regions via linkage disequilibrium (LD), and population-specific patterns of LD can be exploited to generate a data-driven map of SNPs to genes. Methods In this study, we implemented LD-Spline, a database routine that defines the genomic boundaries a particular SNP represents using linkage disequilibrium statistics from the International HapMap Project. We compared the LD-Spline haplotype block partitioning approach to that of the four gamete rule and the Gabriel et al. approach using simulated data; in addition, we processed two commonly used genome-wide association study platforms. Results We illustrate that LD-Spline performs comparably to the four-gamete rule and the Gabriel et al. approach; however as a SNP-centric approach LD-Spline has the added benefit of systematically identifying a genomic boundary for each SNP, where the global block partitioning approaches may falter due to sampling variation in LD statistics. Conclusion LD-Spline is an integrated database routine that quickly and effectively defines the genomic region marked by a SNP using linkage disequilibrium, with a SNP-centric block definition algorithm.
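The four-gamete rule used as a comparison method here rests on a simple pairwise check: for two biallelic sites, observing all four allele combinations among the sampled haplotypes suggests that recombination (or recurrent mutation) has occurred between them. The sketch below illustrates only that pairwise check; it is not LD-Spline and not the block-partitioning procedure evaluated in the paper. The function names and the encoding of haplotypes as strings of `0`/`1` are assumptions made for illustration.

```python
from itertools import combinations

def four_gamete_violation(haplotypes, i, j):
    """True if all four gametes (00, 01, 10, 11) are seen at sites i and j.

    haplotypes : sequence of equal-length strings over {'0', '1'}
                 (hypothetical encoding; sites are assumed biallelic)
    """
    observed = {(h[i], h[j]) for h in haplotypes}
    return len(observed) == 4

def violating_pairs(haplotypes):
    """List every pair of sites (i, j), i < j, that shows all four gametes."""
    n_sites = len(haplotypes[0])
    return [
        (i, j)
        for i, j in combinations(range(n_sites), 2)
        if four_gamete_violation(haplotypes, i, j)
    ]

# Sites 0 and 1 show all four gametes; site 2 is monomorphic in this sample.
sample = ["000", "010", "100", "110"]
print(violating_pairs(sample))
```

A block-partitioning method built on this rule would typically extend a block until the first violating pair appears; that step is left out here because the abstract does not describe it.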
Affiliation(s)
- William S Bush
- Center for Human Genetics Research, Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN, USA.
|
45
|
Clark NL, Gasper J, Sekino M, Springer SA, Aquadro CF, Swanson WJ. Coevolution of interacting fertilization proteins. PLoS Genet 2009; 5:e1000570. [PMID: 19629160 PMCID: PMC2704960 DOI: 10.1371/journal.pgen.1000570] [Citation(s) in RCA: 105] [Impact Index Per Article: 7.0] [Received: 09/08/2008] [Accepted: 06/23/2009] [Indexed: 01/01/2023]
Abstract
Reproductive proteins are among the fastest evolving in the proteome, often due to the consequences of positive selection, and their rapid evolution is frequently attributed to a coevolutionary process between interacting female and male proteins. Such a process could leave characteristic signatures at coevolving genes. One signature of coevolution, predicted by sexual selection theory, is an association of alleles between the two genes. Another predicted signature is a correlation of evolutionary rates during divergence due to compensatory evolution. We studied female-male coevolution in the abalone by resequencing sperm lysin and its interacting egg coat protein, VERL, in populations of two species. As predicted, we found intergenic linkage disequilibrium between lysin and VERL, despite our demonstration that they are not physically linked. This finding supports a central prediction of sexual selection using actual genotypes, that of an association between a male trait and its female preference locus. We also created a novel likelihood method to show that lysin and VERL have experienced correlated rates of evolution. These two signatures of coevolution can provide statistical rigor to hypotheses of coevolution and could be exploited for identifying coevolving proteins a priori. We also present polymorphism-based evidence for positive selection and implicate recent selective events at the specific structural regions of lysin and VERL responsible for their species-specific interaction. Finally, we observed deep subdivision between VERL alleles in one species, which matches a theoretical prediction of sexual conflict. Thus, abalone fertilization proteins illustrate how coevolution can lead to reproductive barriers and potentially drive speciation.
Affiliation(s)
- Nathaniel L. Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
- Joe Gasper
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Masashi Sekino
- Tohoku National Fisheries Research Institute, Fisheries Research Agency, Shiogama, Miyagi, Japan
- Stevan A. Springer
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Charles F. Aquadro
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
- Willie J. Swanson
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
|
46
|
Abstract
Assessing the extent of linkage disequilibrium (LD) in natural populations of a nonmodel species has been difficult due to the lack of available genomic markers. However, with advances in genotyping and genome sequencing, genomic characterization of natural populations has become feasible. Using sequence data and SNP genotypes, we measured LD and modeled the demographic history of wild canid populations and domestic dog breeds. In 11 gray wolf populations and one coyote population, we find that the extent of LD, measured as the distance at which $r^2 = 0.2$, ranges from <10 kb in outbred populations to >1.7 Mb in populations that have experienced significant founder events and bottlenecks. This large range in the extent of LD parallels that observed in 18 dog breeds, where the corresponding distance varies from approximately 20 kb to >5 Mb. Furthermore, in modeling demographic history under a composite-likelihood framework, we find that two of five wild canid populations exhibit evidence of a historical population contraction. Five domestic dog breeds display evidence for a minor population contraction during domestication and a more severe contraction during breed formation. Only a 5% reduction in nucleotide diversity was observed as a result of domestication, whereas the loss of nucleotide diversity with breed formation averaged 35%.
|
47
|
Hofer T, Ray N, Wegmann D, Excoffier L. Large Allele Frequency Differences between Human Continental Groups are more Likely to have Occurred by Drift During range Expansions than by Selection. Ann Hum Genet 2009; 73:95-108. [PMID: 19040659 DOI: 10.1111/j.1469-1809.2008.00489.x] [Citation(s) in RCA: 135] [Impact Index Per Article: 9.0] [Indexed: 11/29/2022]
Affiliation(s)
- T Hofer
- Computational and Molecular Population Genetics Lab, Institute of Ecology and Evolution, University of Bern, 3012 Bern, Switzerland
|
48
|
Kim D, Nylander-French LA. Physiologically based toxicokinetic models and their application in human exposure and internal dose assessment. EXS 2009; 99:37-55. [PMID: 19157057 DOI: 10.1007/978-3-7643-8336-7_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/27/2023]
Abstract
Human populations may exhibit large interindividual variation in toxicokinetic response to chemical exposures. Rapid developments in dosimetry research have brought medicine and public health closer to understanding the biological basis of this heterogeneity. The toxicokinetic behavior of chemicals is, in part, controlled by the properties of the epithelium surrounding organs, some of which are effective barriers to penetration into the systemic circulation. Physiologically based toxicokinetic (PBTK) models have been developed and used to simulate the mechanism of uptake into the systemic circulation, to extrapolate between doses and exposure routes, and to estimate internal dosimetry and sources of heterogeneity in animals and humans. Recent improvements to PBTK models include descriptions of active transport across biological membranes, carrier-mediated clearance, and fractal kinetics. The expanding area of toxicogenetics has provided valuable insight for delineating toxicokinetic differences between individuals; genetic differences include inherited single nucleotide polymorphisms, copy number variants, and dynamic changes in the methylation pattern of imprinted genes. This chapter discusses the structure of PBTK models and how toxicogenetic information and newer biological descriptions have improved our understanding of variability in response to toxicant exposures.
Affiliation(s)
- David Kim
- Department of Environmental Health, School of Public Health, Harvard University, Boston, MA 02215, USA.
|
49
|
Confounding between recombination and selection, and the Ped/Pop method for detecting selection. Genome Res 2008; 18:1304-13. [PMID: 18617692 DOI: 10.1101/gr.067181.107] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.7] [Indexed: 11/24/2022]
Abstract
In recent years, there have been major developments of population genetics methods to estimate both rates of recombination and levels of natural selection. However, genomic variants subject to positive selection are likely to have arisen recently and, consequently, had less opportunity to be affected by recombination. Thus, the two processes have an intimately related impact on genetic variation, and inference of either may be vulnerable to confounding by the other. We illustrate here that even modest levels of positive selection can substantially reduce population-based recombination rate estimates. We also show that genome-wide scans to detect loci under recent selection in humans have tended to highlight loci in regions of low recombination, suggesting that confounding by recombination rate may have reduced the power of these studies. Motivated by these findings, we introduce a new genome-wide approach for detecting selection, based on the ratio of pedigree-based to population-based estimates of recombination rate. Simulations suggest that our "Ped/Pop" method, which is designed to capture completed sweeps, has good power to discriminate between neutral and adaptive evolution. Unusually for a multimarker method, our approach performs well in regions of high recombination and also has good power for many generations after the fixation of an advantageous variant. We apply the method to human HapMap and Perlegen data sets, finding confirmation of reported candidates as well as identifying new loci that may have undergone recent intense selection.
|
50
|
VanLiere JM, Rosenberg NA. Mathematical properties of the $r^2$ measure of linkage disequilibrium. Theor Popul Biol 2008; 74:130-7. [PMID: 18572214 DOI: 10.1016/j.tpb.2008.05.006] [Citation(s) in RCA: 113] [Impact Index Per Article: 7.1] [Received: 02/22/2008] [Revised: 05/14/2008] [Accepted: 05/14/2008] [Indexed: 11/28/2022]
Abstract
Statistics for linkage disequilibrium (LD), the non-random association of alleles at two loci, depend on the frequencies of the alleles at the loci under consideration. Here, we examine the $r^2$ measure of LD and its mathematical relationship to allele frequencies, quantifying the constraints on its maximum value. Assuming independent uniform distributions for the allele frequencies of two biallelic loci, we find that the mean maximum value of $r^2$ is approximately 0.43051, and that $r^2$ can exceed a threshold of 4/5 in only approximately 14.232% of the allele frequency space. If one locus is assumed to have known allele frequencies (the situation in an association study in which LD between a known marker locus and an unknown trait locus is of interest), we find that the mean maximum value of $r^2$ is greatest when the known locus has a minor allele frequency of approximately 0.30131. We find that in 1/4 of the space of allowed values of minor allele frequencies and haplotype frequencies at a pair of loci, the unconstrained maximum $r^2$ allowing for the possibility of recombination between the loci exceeds the constrained maximum assuming that no recombination has occurred. Finally, we use $r^2_{\max}$ to examine the connection between $r^2$ and the $D'$ measure of linkage disequilibrium, finding that $r^2/r^2_{\max} = D'^2$ for approximately 72.683% of the space of allowed values of $(p_a, p_b, p_{ab})$. Our results concerning the properties of $r^2$ have the potential to inform the interpretation of unusual LD behavior and to assist in the design of LD-based association-mapping studies.
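The frequency constraint on the maximum can be checked numerically. The sketch below assumes the usual definition, in which the maximum is taken over all haplotype frequencies compatible with fixed allele frequencies at the two loci, and averages it over a midpoint grid on the unit square. The function names `r2_max` and `mean_r2_max` and the grid size are illustrative choices, not taken from the paper.

```python
def r2_max(p_a, p_b):
    """Largest r^2 attainable for allele frequencies p_a and p_b (both in (0, 1)).

    D can range from -min(p_a*p_b, (1-p_a)*(1-p_b)) up to
    min(p_a*(1-p_b), (1-p_a)*p_b); r^2 is maximized at whichever
    end of that range has the larger magnitude.
    """
    d_hi = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    d_lo = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return max(d_hi, d_lo) ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))

def mean_r2_max(n=200):
    """Midpoint-rule average of r2_max over independent uniform p_a and p_b."""
    step = 1.0 / n
    points = [(k + 0.5) * step for k in range(n)]
    total = sum(r2_max(a, b) for a in points for b in points)
    return total / (n * n)

# r^2 can reach 1 only when the allele frequencies match (or are complementary).
print(r2_max(0.5, 0.5), round(r2_max(0.1, 0.5), 4))
print(round(mean_r2_max(), 3))
```

Under these assumptions the grid average lands close to the value of approximately 0.43051 quoted in the abstract, which is a useful sanity check on the definition of the maximum used here.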
Affiliation(s)
- Jenna M VanLiere
- Center for Computational Medicine and Biology, University of Michigan, Ann Arbor, MI 48109, USA.
|