1
Pablo C, Matías A, Lavalle Cobo A, Sergio G, Federico RN. Exploring the Interplay between Diabetes and Lp(a): Implications for Cardiovascular Risk. Curr Diab Rep 2024; 24:167-172. [PMID: 38805111 DOI: 10.1007/s11892-024-01543-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/15/2024] [Indexed: 05/29/2024]
Abstract
PURPOSE OF REVIEW The objective of this manuscript is to review and describe the relationship between Lp(a) and diabetes, exploring both their association and synergy as cardiovascular risk factors, while also describing the current evidence regarding the potential connection between low levels of Lp(a) and the presence of diabetes. RECENT FINDINGS Epidemiological studies suggest a potential relationship between low to very low levels of Lp(a) and diabetes. Lipoprotein(a), or Lp(a), is an intriguing lipoprotein of genetic origin, yet its biological function remains unknown. Elevated levels of Lp(a) are associated with an increased risk of cardiovascular atherosclerosis, and coexisting diabetes status confers an even higher risk. On the other hand, epidemiological and genetic studies have paradoxically suggested a potential relationship between low to very low levels of Lp(a) and diabetes. While new pharmacological strategies are being developed to reduce Lp(a) levels, the dual aspects of this lipoprotein's behavior need to be elucidated in the near future.
Affiliation(s)
- Corral Pablo
- Pharmacology and Research Department, FASTA University, Mar del Plata, Argentina.
- Arrupe Matías
- Cardiometabolic Unit Coordinator - Hospital Español, Mendoza, Argentina
- Renna Nicolás Federico
- Chief of Coronary Care Unit - Hospital Español de Mendoza- School of Medicine-UNCuyo, Mendoza, Argentina
2
Lamkin M, Gymrek M. The emerging role of tandem repeats in complex traits. Nat Rev Genet 2024; 25:452-453. [PMID: 38714860 DOI: 10.1038/s41576-024-00736-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Affiliation(s)
- Michael Lamkin
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Melissa Gymrek
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA.
- Department of Medicine, University of California San Diego, La Jolla, CA, USA.
3
Tanudisastro HA, Deveson IW, Dashnow H, MacArthur DG. Sequencing and characterizing short tandem repeats in the human genome. Nat Rev Genet 2024; 25:460-475. [PMID: 38366034 DOI: 10.1038/s41576-024-00692-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/06/2023] [Indexed: 02/18/2024]
Abstract
Short tandem repeats (STRs) are highly polymorphic sequences throughout the human genome that are composed of repeated copies of a 1-6-bp motif. Over 1 million variable STR loci are known, some of which regulate gene expression and influence complex traits, such as height. Moreover, variants in at least 60 STR loci cause genetic disorders, including Huntington disease and fragile X syndrome. Accurately identifying and genotyping STR variants is challenging, in particular mapping short reads to repetitive regions and inferring expanded repeat lengths. Recent advances in sequencing technology and computational tools for STR genotyping from sequencing data promise to help overcome this challenge and solve genetically unresolved cases and the 'missing heritability' of polygenic traits. Here, we compare STR genotyping methods, analytical tools and their applications to understand the effect of STR variation on health and disease. We identify emergent opportunities to refine genotyping and quality-control approaches as well as to integrate STRs into variant-calling workflows and large cohort analyses.
Affiliation(s)
- Hope A Tanudisastro
- Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
- Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia
- Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Ira W Deveson
- Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Harriet Dashnow
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA.
- Daniel G MacArthur
- Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia.
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia.
- Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia.
4
Di Maio S, Zöscher P, Weissensteiner H, Forer L, Schachtl-Riess JF, Amstler S, Streiter G, Pfurtscheller C, Paulweber B, Kronenberg F, Coassin S, Schönherr S. Resolving intra-repeat variation in medically relevant VNTRs from short-read sequencing data using the cardiovascular risk gene LPA as a model. Genome Biol 2024; 25:167. [PMID: 38926899 PMCID: PMC11201333 DOI: 10.1186/s13059-024-03316-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 06/18/2024] [Indexed: 06/28/2024] Open
Abstract
BACKGROUND Variable number tandem repeats (VNTRs) are highly polymorphic DNA regions harboring many potentially disease-causing variants. However, VNTRs often appear unresolved ("dark") in variation databases due to their repetitive nature. One particularly complex and medically relevant VNTR is the KIV-2 VNTR located in the cardiovascular disease gene LPA which encompasses up to 70% of the coding sequence. RESULTS Using the highly complex LPA gene as a model, we develop a computational approach to resolve intra-repeat variation in VNTRs from largely available short-read sequencing data. We apply the approach to six protein-coding VNTRs in 2504 samples from the 1000 Genomes Project and developed an optimized method for the LPA KIV-2 VNTR that discriminates the confounding KIV-2 subtypes upfront. This results in an F1-score improvement of up to 2.1-fold compared to previously published strategies. Finally, we analyze the LPA VNTR in > 199,000 UK Biobank samples, detecting > 700 KIV-2 mutations. This approach successfully reveals new strong Lp(a)-lowering effects for KIV-2 variants, with protective effect against coronary artery disease, and also validated previous findings based on tagging SNPs. CONCLUSIONS Our approach paves the way for reliable variant detection in VNTRs at scale and we show that it is transferable to other dark regions, which will help unlock medical information hidden in VNTRs.
Affiliation(s)
- Silvia Di Maio
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
- Peter Zöscher
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
- Hansi Weissensteiner
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
- Lukas Forer
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
- Stephan Amstler
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
- Gertraud Streiter
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
- Cathrin Pfurtscheller
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
- Bernhard Paulweber
- Department of Internal Medicine I, Paracelsus Medical University/Salzburger Landeskliniken, Salzburg, Austria
- Florian Kronenberg
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
- Stefan Coassin
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
- Sebastian Schönherr
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria.
5
Rodriguez-Algarra F, Evans DM, Rakyan VK. Ribosomal DNA copy number variation associates with hematological profiles and renal function in the UK Biobank. Cell Genomics 2024; 4:100562. [PMID: 38749448 DOI: 10.1016/j.xgen.2024.100562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 11/19/2023] [Accepted: 04/21/2024] [Indexed: 06/15/2024]
Abstract
The phenotypic impact of genetic variation of repetitive features in the human genome is currently understudied. One such feature is the multi-copy 47S ribosomal DNA (rDNA) that codes for rRNA components of the ribosome. Here, we present an analysis of rDNA copy number (CN) variation in the UK Biobank (UKB). From the first release of UKB whole-genome sequencing (WGS) data, a discovery analysis in White British individuals reveals that rDNA CN associates with altered counts of specific blood cell subtypes, such as neutrophils, and with the estimated glomerular filtration rate, a marker of kidney function. Similar trends are observed in other ancestries. A range of analyses argue against reverse causality or common confounder effects, and all core results replicate in the second UKB WGS release. Our work demonstrates that rDNA CN is a genetic influence on trait variance in humans.
Affiliation(s)
- David M Evans
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia; Frazer Institute, The University of Queensland, Brisbane, QLD 4102, Australia; MRC Integrative Epidemiology Unit, University of Bristol, Bristol BS8 2BN, UK
- Vardhman K Rakyan
- The Blizard Institute, School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK.
6
King DG. Mutation protocols share with sexual reproduction the physiological role of producing genetic variation within 'constraints that deconstrain'. J Physiol 2024; 602:2615-2626. [PMID: 38178567 DOI: 10.1113/jp285478] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/18/2023] [Accepted: 12/14/2023] [Indexed: 01/06/2024] Open
Abstract
Because the universe of possible DNA sequences is inconceivably vast, organisms have evolved mechanisms for exploring DNA sequence space while substantially reducing the hazard that would otherwise accrue to any process of random, accidental mutation. One such mechanism is meiotic recombination. Although sexual reproduction imposes a seemingly paradoxical 50% cost to fitness, sex evidently prevails because this cost is outweighed by the advantage of equipping offspring with genetic variation to accommodate environmental vicissitudes. The potential adaptive utility of additional mechanisms for producing genetic variation has long been obscured by a presumption that the vast majority of mutations are deleterious. Perhaps surprisingly, the probability for adaptive variation can be increased by several mechanisms that generate mutations abundantly. Such mechanisms, here called 'mutation protocols', implement implicit 'constraints that deconstrain'. Like meiotic recombination, they produce genetic variation in forms that minimize potential for harm while providing a reasonably high probability for benefit. One example is replication slippage of simple sequence repeats (SSRs); this process yields abundant, reversible mutations, typically with small quantitative effect on phenotype. This enables SSRs to function as adjustable 'tuning knobs'. There exists a clear pathway for SSRs to be shaped through indirect selection favouring their implicit tuning-knob protocol. Several other molecular mechanisms comprise probable components of additional mutation protocols. Biologists might plausibly regard such mechanisms of mutation not primarily as sources of deleterious genetic mistakes but also as potentially adaptive processes for 'exploring' DNA sequence space.
Affiliation(s)
- David G King
- Department of Anatomy, School of Medicine, Southern Illinois University Carbondale, Carbondale, Illinois, USA
- Department of Zoology, College of Agricultural, Life, and Physical Sciences, Southern Illinois University Carbondale, Carbondale, Illinois, USA
7
Caporale LH. Evolutionary feedback from the environment shapes mechanisms that generate genome variation. J Physiol 2024; 602:2601-2614. [PMID: 38194279 DOI: 10.1113/jp284411] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/17/2023] [Accepted: 12/14/2023] [Indexed: 01/10/2024] Open
Abstract
Darwin recognized that 'a grand and almost untrodden field of inquiry will be opened, on the causes and laws of variation.' However, because the Modern Synthesis assumes that the intrinsic probability of any individual mutation is unrelated to that mutation's potential adaptive value, attention has been focused on selection rather than on the intrinsic generation of variation. Yet many examples illustrate that the term 'random' mutation, as widely understood, is inaccurate. The probabilities of distinct classes of variation are neither evenly distributed across a genome nor invariant over time, nor unrelated to their potential adaptive value. Because selection acts upon variation, multiple biochemical mechanisms can and have evolved that increase the relative probability of adaptive mutations. In effect, the generation of heritable variation is in a feedback loop with selection, such that those mechanisms that tend to generate variants that survive recurring challenges in the environment would be captured by this survival and thus inherited and accumulated within lineages of genomes. Moreover, because genome variation is affected by a wide range of biochemical processes, genome variation can be regulated. Biochemical mechanisms that sense stress, from lack of nutrients to DNA damage, can increase the probability of specific classes of variation. A deeper understanding of evolution involves attention to the evolution of, and environmental influences upon, the intrinsic variation generated in gametes, in other words upon the biochemical mechanisms that generate variation across generations. These concepts have profound implications for the types of questions that can and should be asked, as omics databases become more comprehensive, detection methods more sensitive, and computation and experimental analyses even more high throughput and thus capable of revealing the intrinsic generation of variation in individual gametes. 
These concepts also have profound implications for evolutionary theory, which, upon reflection it will be argued, predicts that selection would increase the probability of generating adaptive mutations, in other words, predicts that the ability to evolve itself evolves.
8
Moya R, Wang X, Tsien RW, Maurano MT. Structural characterization of a polymorphic repeat at the CACNA1C schizophrenia locus. medRxiv 2024:2024.03.05.24303780. [PMID: 38798557 PMCID: PMC11118589 DOI: 10.1101/2024.03.05.24303780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Genetic variation within intron 3 of the CACNA1C calcium channel gene is associated with schizophrenia and bipolar disorder, but analysis of the causal variants and their effect is complicated by a nearby variable-number tandem repeat (VNTR). Here, we used 155 long-read genome assemblies from 78 diverse individuals to delineate the structure and population variability of the CACNA1C intron 3 VNTR. We categorized VNTR sequences into 7 Types of structural alleles using sequence differences among repeat units. Only 12 repeat units at the 5' end of the VNTR were shared across most Types, but several Types were related through a series of large and small duplications. The most diverged Types were rare and present only in individuals with African ancestry, but the multiallelic structural polymorphism Variable Region 2 was present across populations at different frequencies, consistent with expansion of the VNTR preceding the emergence of early hominins. VR2 was in complete linkage disequilibrium with fine-mapped schizophrenia variants (SNPs) from genome-wide association studies (GWAS). This risk haplotype was associated with decreased CACNA1C gene expression in brain tissues profiled by the GTEx project. Our work suggests that sequence variation within a human-specific VNTR affects gene expression, and provides a detailed characterization of new alleles at a flagship neuropsychiatric locus.
Affiliation(s)
- Raquel Moya
- Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Xiaohan Wang
- Neuroscience Institute, NYU School of Medicine, New York, NY 10016, USA
- Department of Neuroscience and Physiology, New York University, New York, NY 10016
- Richard W. Tsien
- Neuroscience Institute, NYU School of Medicine, New York, NY 10016, USA
- Department of Neuroscience and Physiology, New York University, New York, NY 10016
- Matthew T. Maurano
- Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Department of Pathology, NYU School of Medicine, New York, NY 10016, USA
9
Rossen J, Shi H, Strober BJ, Zhang MJ, Kanai M, McCaw ZR, Liang L, Weissbrod O, Price AL. MultiSuSiE improves multi-ancestry fine-mapping in All of Us whole-genome sequencing data. medRxiv 2024:2024.05.13.24307291. [PMID: 38798542 PMCID: PMC11118590 DOI: 10.1101/2024.05.13.24307291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Leveraging data from multiple ancestries can greatly improve fine-mapping power due to differences in linkage disequilibrium and allele frequencies. We propose MultiSuSiE, an extension of the sum of single effects model (SuSiE) to multiple ancestries that allows causal effect sizes to vary across ancestries based on a multivariate normal prior informed by empirical data. We evaluated MultiSuSiE via simulations and analyses of 14 quantitative traits leveraging whole-genome sequencing data in 47k African-ancestry and 94k European-ancestry individuals from All of Us. In simulations, MultiSuSiE applied to Afr47k+Eur47k was well-calibrated and attained higher power than SuSiE applied to Eur94k; interestingly, higher causal variant PIPs in Afr47k compared to Eur47k were entirely explained by differences in the extent of LD quantified by LD 4th moments. Compared to very recently proposed multi-ancestry fine-mapping methods, MultiSuSiE attained higher power and/or much lower computational costs, making the analysis of large-scale All of Us data feasible. In real trait analyses, MultiSuSiE applied to Afr47k+Eur94k identified 579 fine-mapped variants with PIP > 0.5, and MultiSuSiE applied to Afr47k+Eur47k identified 44% more fine-mapped variants with PIP > 0.5 than SuSiE applied to Eur94k. We validated MultiSuSiE results for real traits via functional enrichment of fine-mapped variants. We highlight several examples where MultiSuSiE implicates well-studied or biologically plausible fine-mapped variants that were not implicated by other methods.
10
Maciocha F, Suchanecka A, Chmielowiec K, Chmielowiec J, Ciechanowicz A, Boroń A. Correlations of the CNR1 Gene with Personality Traits in Women with Alcohol Use Disorder. Int J Mol Sci 2024; 25:5174. [PMID: 38791212 PMCID: PMC11121729 DOI: 10.3390/ijms25105174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 05/02/2024] [Accepted: 05/07/2024] [Indexed: 05/26/2024] Open
Abstract
Alcohol use disorder (AUD) is a significant issue affecting women, with severe consequences for society, the economy, and most importantly, health. Both personality and alcohol use disorders are phenotypically very complex, and elucidating their shared heritability is a challenge for medical genetics. Therefore, our study investigated the correlations between the microsatellite polymorphism (AAT)n of the Cannabinoid Receptor 1 (CNR1) gene and personality traits in women with AUD. The study group included 187 female subjects. Of these, 93 were diagnosed with alcohol use disorder, and 94 were controls. Repeat length polymorphism of microsatellite regions (AAT)n in the CNR1 gene was identified with PCR. All participants were assessed with the Mini-International Neuropsychiatric Interview and completed the NEO Five-Factor and State-Trait Anxiety Inventories. In the group of AUD subjects, significantly fewer (AAT)n repeats were present when compared with controls (p = 0.0380). While comparing the alcohol use disorder subjects (AUD) and the controls, we observed significantly higher scores on the STAI trait (p < 0.00001) and state scales (p = 0.0001) and on the NEO Five-Factor Inventory Neuroticism (p < 0.00001) and Openness (p = 0.0237; insignificant after Bonferroni correction) scales. Significantly lower results were obtained on the NEO-FFI Extraversion (p = 0.00003), Agreeability (p < 0.00001) and Conscientiousness (p < 0.00001) scales by the AUD subjects when compared to controls. There was no statistically significant Pearson's linear correlation between the number of (AAT)n repeats in the CNR1 gene and the STAI and NEO Five-Factor Inventory scores in the group of AUD subjects. 
In contrast, Pearson's linear correlation analysis in controls showed a positive correlation between the number of the (AAT)n repeats and the STAI state scale (r = 0.184; p = 0.011; insignificant after Bonferroni correction) and a negative correlation with the NEO-FFI Openness scale (r = -0.241; p = 0.001). Interestingly, our study provided data on two separate complex issues, i.e., (1) the association of (AAT)n CNR1 repeats with the AUD in females; (2) the correlation of (AAT)n CNR1 repeats with anxiety as a state and Openness in non-alcohol dependent subjects. In conclusion, our study provided a plethora of valuable data for improving our understanding of alcohol use disorder and anxiety.
Affiliation(s)
- Filip Maciocha
- Department of Clinical and Molecular Biochemistry, Pomeranian Medical University in Szczecin, Powstańców Wielkopolskich 72 St., 70-111 Szczecin, Poland
- Aleksandra Suchanecka
- Independent Laboratory of Behavioral Genetics and Epigenetics, Pomeranian Medical University in Szczecin, Powstańców Wielkopolskich 72 St., 70-111 Szczecin, Poland
- Krzysztof Chmielowiec
- Department of Hygiene and Epidemiology, Collegium Medicum, University of Zielona Góra, 28 Zyty St., 65-046 Zielona Góra, Poland
- Jolanta Chmielowiec
- Department of Hygiene and Epidemiology, Collegium Medicum, University of Zielona Góra, 28 Zyty St., 65-046 Zielona Góra, Poland
- Andrzej Ciechanowicz
- Department of Clinical and Molecular Biochemistry, Pomeranian Medical University in Szczecin, Powstańców Wielkopolskich 72 St., 70-111 Szczecin, Poland
- Agnieszka Boroń
- Department of Clinical and Molecular Biochemistry, Pomeranian Medical University in Szczecin, Powstańców Wielkopolskich 72 St., 70-111 Szczecin, Poland
11
Hamilton F, Mitchell R, Ghazal P, Timpson N. Phenotypic Associations With the HMOX1 GT(n) Repeat in European Populations. Am J Epidemiol 2024; 193:718-726. [PMID: 37414746 PMCID: PMC11074708 DOI: 10.1093/aje/kwad154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Revised: 12/21/2023] [Accepted: 07/03/2023] [Indexed: 07/08/2023] Open
Abstract
Heme oxygenase 1 is a key enzyme in the management of heme in humans. A GT(n) repeat length in the heme oxygenase 1 gene (HMOX1) has been widely associated with a variety of phenotypes, including susceptibility to and outcomes in diabetes, cancer, infections, and neonatal jaundice. However, studies have generally been small and results inconsistent. In this study, we imputed the GT(n) repeat length in participants from 2 UK cohort studies (the UK Biobank study (n = 463,005; recruited in 2006-2010) and the Avon Longitudinal Study of Parents and Children (ALSPAC; n = 937; recruited in 1990-1991)), with the reliability of imputation tested in other cohorts (1000 Genomes Project, Human Genome Diversity Project, and Personal Genome Project UK). Subsequently, we measured the relationship between repeat length and previously identified associations (diabetes, chronic obstructive pulmonary disease, pneumonia, and infection-related mortality in the UK Biobank; neonatal jaundice in ALSPAC) and performed a phenomewide association study in the UK Biobank. Despite high-quality imputation (correlation between true repeat length and imputed repeat length > 0.9 in test cohorts), clinical associations were not identified in either the phenomewide association study or specific association studies. These findings were robust to definitions of repeat length and sensitivity analyses. Despite multiple smaller studies identifying associations across a variety of clinical settings, we could not replicate or identify any relevant phenotypic associations with the HMOX1 GT(n) repeat.
Affiliation(s)
- Fergus Hamilton
- Correspondence to Dr. Fergus Hamilton, MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, United Kingdom
12
Koschinsky ML, Bajaj A, Boffa MB, Dixon DL, Ferdinand KC, Gidding SS, Gill EA, Jacobson TA, Michos ED, Safarova MS, Soffer DE, Taub PR, Wilkinson MJ, Wilson DP, Ballantyne CM. A focused update to the 2019 NLA scientific statement on use of lipoprotein(a) in clinical practice. J Clin Lipidol 2024; 18:e308-e319. [PMID: 38565461 DOI: 10.1016/j.jacl.2024.03.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Accepted: 03/06/2024] [Indexed: 04/04/2024]
Abstract
Since the 2019 National Lipid Association (NLA) Scientific Statement on Use of Lipoprotein(a) in Clinical Practice was issued, accumulating epidemiological data have clarified the relationship between lipoprotein(a) [Lp(a)] level and cardiovascular disease risk and risk reduction. Therefore, the NLA developed this focused update to guide clinicians in applying this emerging evidence in clinical practice. We now have sufficient evidence to support the recommendation to measure Lp(a) levels at least once in every adult for risk stratification. Individuals with Lp(a) levels <75 nmol/L (30 mg/dL) are considered low risk, individuals with Lp(a) levels ≥125 nmol/L (50 mg/dL) are considered high risk, and individuals with Lp(a) levels between 75 and 125 nmol/L (30-50 mg/dL) are at intermediate risk. Cascade screening of first-degree relatives of patients with elevated Lp(a) can identify additional individuals at risk who require intervention. Patients with elevated Lp(a) should receive early, more-intensive risk factor management, including lifestyle modification and lipid-lowering drug therapy in high-risk individuals, primarily to reduce low-density lipoprotein cholesterol (LDL-C) levels. The U.S. Food and Drug Administration approved an indication for lipoprotein apheresis (which reduces both Lp(a) and LDL-C) in high-risk patients with familial hypercholesterolemia and documented coronary or peripheral artery disease whose Lp(a) level remains ≥60 mg/dL (∼150 nmol/L) and LDL-C ≥ 100 mg/dL on maximally tolerated lipid-lowering therapy. Although Lp(a) is an established independent causal risk factor for cardiovascular disease, and despite the high prevalence of Lp(a) elevation (∼1 of 5 individuals), measurement rates are low, warranting improved screening strategies for cardiovascular disease prevention.
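The three risk categories stated in this abstract can be written as a small lookup. The following is a minimal sketch, not part of the NLA statement: the function name `lpa_risk_category` is hypothetical, the input is assumed to be in nmol/L, and the boundary value of exactly 75 nmol/L is assumed to fall in the intermediate band because the abstract defines low risk as strictly below 75.

```python
def lpa_risk_category(lpa_nmol_per_l: float) -> str:
    """Map an Lp(a) level in nmol/L to the risk band named in the abstract.

    Hypothetical helper for illustration only; not a clinical tool.
    """
    if lpa_nmol_per_l < 0:
        raise ValueError("Lp(a) level cannot be negative")
    if lpa_nmol_per_l < 75:       # abstract: <75 nmol/L is low risk
        return "low"
    if lpa_nmol_per_l >= 125:     # abstract: >=125 nmol/L is high risk
        return "high"
    return "intermediate"         # abstract: between 75 and 125 nmol/L


if __name__ == "__main__":
    for level in (40, 75, 110, 125, 200):
        print(level, lpa_risk_category(level))
```

The sketch works in nmol/L only; the abstract's mg/dL figures are given as paired thresholds rather than as a conversion rule, so no unit conversion is attempted here.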
Affiliation(s)
- Marlys L Koschinsky
- Department of Physiology & Pharmacology and Robarts Research Institute, Schulich School of Medicine & Dentistry, Western University, London, Ontario, Canada (Drs Koschinsky, Boffa)
- Archna Bajaj
- Department of Medicine, University of Pennsylvania, Philadelphia, PA, USA (Drs Bajaj, Soffer)
- Michael B Boffa
- Department of Physiology & Pharmacology and Robarts Research Institute, Schulich School of Medicine & Dentistry, Western University, London, Ontario, Canada (Drs Koschinsky, Boffa)
- Dave L Dixon
- Department of Pharmacotherapy & Outcomes Science, Virginia Commonwealth University School of Pharmacy, Richmond, VA, USA (Dr Dixon)
- Keith C Ferdinand
- Department of Medicine, Tulane University School of Medicine, New Orleans, LA, USA (Dr Ferdinand)
- Samuel S Gidding
- Department of Genomic Health, Geisinger, Danville, PA, USA (Dr Gidding)
- Edward A Gill
- Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA (Dr Gill)
- Terry A Jacobson
- Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA (Dr Jacobson)
- Erin D Michos
- Division of Cardiology, Johns Hopkins University School of Medicine, Baltimore, MD, USA (Dr Michos)
- Maya S Safarova
- Division of Cardiovascular Medicine, Department of Internal Medicine, Froedtert & the Medical College of Wisconsin, Milwaukee, WI, USA (Dr Safarova)
- Daniel E Soffer
- Department of Medicine, University of Pennsylvania, Philadelphia, PA, USA (Drs Bajaj, Soffer)
- Pam R Taub
- Department of Medicine, University of California San Diego, La Jolla, CA, USA (Drs Taub, Wilkinson)
- Michael J Wilkinson
- Department of Medicine, University of California San Diego, La Jolla, CA, USA (Drs Taub, Wilkinson)
- Don P Wilson
- Department of Pediatric Endocrinology and Diabetes, Cook Children's Medical Center, Fort Worth, TX, USA (Dr Wilson)
- Christie M Ballantyne
- Department of Medicine, Baylor College of Medicine, Houston, TX, USA (Dr Ballantyne).
13
Fages V, Bourre F, Larrue R, Wenzel A, Gibier JB, Bonte F, Dhaenens CM, Kidd K, Kmoch S, Bleyer A, Glowacki F, Grunewald O. Description of a New Simple and Cost-Effective Molecular Testing That Could Simplify MUC1 Variant Detection. Kidney Int Rep 2024; 9:1451-1457. [PMID: 38707821 PMCID: PMC11068942 DOI: 10.1016/j.ekir.2024.01.058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 01/23/2024] [Accepted: 01/29/2024] [Indexed: 05/07/2024] Open
Abstract
Introduction: Patients with autosomal dominant tubulointerstitial kidney disease (ADTKD) usually present with nonspecific progressive chronic kidney disease (CKD), mild or absent proteinuria, and a family history. ADTKD-MUC1 leads to the formation of a frameshift protein that accumulates in the cytoplasm, leading to tubulointerstitial damage. ADTKD-MUC1 prevalence remains unclear because MUC1 variants are not routinely detected by standard next-generation sequencing (NGS) techniques. Methods: We developed a bioinformatic counting script that detects specific genetic sequences and counts their occurrences. We used DNA samples from 27 patients for validation: 11 were patients from the Lille University Hospital in France and 16 were from the Wake Forest Hospital, NC. All patients from Lille were tested with an NGS gene panel and our script, and all patients from Wake Forest Hospital were tested with the snapshot reference technique. Between January 2018 and February 2023, we collected data on all patients diagnosed with MUC1 variants with this script. Results: All 27 samples were tested anonymously by the Broad Institute reference technique for confirmation, and concordance for MUC1 diagnosis was 100%. Clinico-biologic characteristics in our cohort were similar to those previously described in ADTKD-MUC1. Conclusion: We describe a new, simple and cost-effective method for molecular testing of ADTKD-MUC1. Genetic analyses in our cohort suggest that MUC1 might be the leading cause of ADTKD. Increasing the availability of MUC1 diagnosis tools will contribute to a better understanding of the disease and to the development of specific treatments.
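The counting idea described in the Methods can be illustrated with a minimal sketch. This is not the authors' script: the function names, the example motif, and the reads are hypothetical placeholders, and a real pipeline would read sequences from FASTQ or BAM files rather than from a Python list.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def count_occurrences(read: str, motif: str) -> int:
    """Count occurrences of motif in read, allowing overlapping matches."""
    if not motif:
        raise ValueError("motif must be a non-empty sequence")
    read, motif = read.upper(), motif.upper()
    count = 0
    start = read.find(motif)
    while start != -1:
        count += 1
        # Step one base forward so overlapping matches are also counted.
        start = read.find(motif, start + 1)
    return count


def count_motif_in_reads(reads, motif: str) -> int:
    """Total occurrences of motif, on either strand, across all reads."""
    rc = reverse_complement(motif)
    total = 0
    for read in reads:
        total += count_occurrences(read, motif)
        if rc != motif:  # avoid double counting palindromic motifs
            total += count_occurrences(read, rc)
    return total


# Hypothetical toy input: two short reads and a three-base target sequence.
example_reads = ["ACGTACGT", "TTTT"]
example_total = count_motif_in_reads(example_reads, "ACG")
```

In practice the count for a variant-specific sequence would presumably be compared with the count for the corresponding reference sequence; that comparison and any decision threshold are left out because the abstract does not describe them.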
Affiliation(s)
- Victor Fages
- Nephrology, Centre Hospitalier Regional Universitaire de Lille, Lille, France
| | - Florentin Bourre
- Nephrology, Centre Hospitalier Regional Universitaire de Lille, Lille, France
| | - Romain Larrue
- Service de Toxicologie et Génopathies, CHU Lille, Lille, France
| | - Andrea Wenzel
- Institute of Human Genetics, Center for Molecular Medicine Cologne, Cologne, Germany
| | | | - Fabrice Bonte
- Functional and Structural Platform, Université de Lille, Lille, France
| | - Claire-Marie Dhaenens
- Department of Biochemistry and Molecular Biology, Institut National de la Santé et de la Recherche Médicale, Centre Hospitalier Universitaire de Lille, Lille, France
| | - Kendrah Kidd
- Section on Nephrology, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
| | - Stanislav Kmoch
- First Faculty of Medicine, Charles University, Nové Město, Czechia
| | - Anthony Bleyer
- Section on Nephrology, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
| | - François Glowacki
- Nephrology, Centre Hospitalier Regional Universitaire de Lille, Lille, France
| | - Olivier Grunewald
- Neuroscience and Cognition, University Lille, Inserm, CHU Lille, Lille, France
| |
|
14
|
English AC, Dolzhenko E, Ziaei Jam H, McKenzie SK, Olson ND, De Coster W, Park J, Gu B, Wagner J, Eberle MA, Gymrek M, Chaisson MJP, Zook JM, Sedlazeck FJ. Analysis and benchmarking of small and large genomic variants across tandem repeats. Nat Biotechnol 2024:10.1038/s41587-024-02225-z. [PMID: 38671154 DOI: 10.1038/s41587-024-02225-z] [Received: 10/30/2023] [Accepted: 03/28/2024] [Indexed: 04/28/2024]
Abstract
Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits and are linked to over 60 disease phenotypes. However, they are often excluded from at-scale studies because of challenges with variant calling and representation, as well as a lack of a genome-wide standard. Here, to promote the development of TR methods, we created a catalog of TR regions and explored TR properties across 86 haplotype-resolved long-read human assemblies. We curated variants from the Genome in a Bottle (GIAB) HG002 individual to create a TR dataset to benchmark existing and future TR analysis methods. We also present an improved variant comparison method that handles variants greater than 4 bp in length and varying allelic representation. The 8.1% of the genome covered by the TR catalog holds ~24.9% of variants per individual, including 124,728 small and 17,988 large variants for the GIAB HG002 'truth-set' TR benchmark. We demonstrate the utility of this pipeline across short-read and long-read technologies.
Affiliation(s)
- Adam C English
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
| | | | - Helyaneh Ziaei Jam
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
| | | | - Nathan D Olson
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Wouter De Coster
- Applied and Translational Neurogenomics Group, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
- Applied and Translational Neurogenomics Group, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Jonghun Park
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
| | - Bida Gu
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Justin Wagner
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | | | - Melissa Gymrek
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
- Department of Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Mark J P Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Justin M Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
- Department of Computer Science, Rice University, Houston, TX, USA.
| |
|
15
|
Hujoel MLA, Handsaker RE, Sherman MA, Kamitaki N, Barton AR, Mukamel RE, Terao C, McCarroll SA, Loh PR. Protein-altering variants at copy number-variable regions influence diverse human phenotypes. Nat Genet 2024; 56:569-578. [PMID: 38548989 PMCID: PMC11018521 DOI: 10.1038/s41588-024-01684-z] [Received: 06/23/2023] [Accepted: 02/08/2024] [Indexed: 04/09/2024]
Abstract
Copy number variants (CNVs) are among the largest genetic variants, yet CNVs have not been effectively ascertained in most genetic association studies. Here we ascertained protein-altering CNVs from UK Biobank whole-exome sequencing data (n = 468,570) using haplotype-informed methods capable of detecting subexonic CNVs and variation within segmental duplications. Incorporating CNVs into analyses of rare variants predicted to cause gene loss of function (LOF) identified 100 associations of predicted LOF variants with 41 quantitative traits. A low-frequency partial deletion of RGL3 exon 6 conferred one of the strongest protective effects of gene LOF on hypertension risk (odds ratio = 0.86 (0.82-0.90)). Protein-coding variation in rapidly evolving gene families within segmental duplications, previously invisible to most analysis methods, generated some of the human genome's largest contributions to variation in type 2 diabetes risk, chronotype and blood cell traits. These results illustrate the potential for new genetic insights from genomic variation that has escaped large-scale analysis to date.
Affiliation(s)
- Margaux L A Hujoel
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| | - Robert E Handsaker
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Maxwell A Sherman
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Serinus Biosciences Inc., New York, NY, USA
| | - Nolan Kamitaki
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Alison R Barton
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | - Ronen E Mukamel
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- Department of Applied Genetics, School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
| | - Steven A McCarroll
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Po-Ru Loh
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| |
|
16
|
Yu Z, Coorens THH, Uddin MM, Ardlie KG, Lennon N, Natarajan P. Genetic variation across and within individuals. Nat Rev Genet 2024:10.1038/s41576-024-00709-x. [PMID: 38548833 DOI: 10.1038/s41576-024-00709-x] [Accepted: 02/09/2024] [Indexed: 04/12/2024]
Abstract
Germline variation and somatic mutation are intricately connected and together shape human traits and disease risks. Germline variants are present from conception, but they vary between individuals and accumulate over generations. By contrast, somatic mutations accumulate throughout life in a mosaic manner within an individual due to intrinsic and extrinsic sources of mutations and selection pressures acting on cells. Recent advancements, such as improved detection methods and increased resources for association studies, have drastically expanded our ability to investigate germline and somatic genetic variation and compare underlying mutational processes. A better understanding of the similarities and differences in the types, rates and patterns of germline and somatic variants, as well as their interplay, will help elucidate the mechanisms underlying their distinct yet interlinked roles in human health and biology.
Affiliation(s)
- Zhi Yu
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | | | - Md Mesbah Uddin
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | | | - Niall Lennon
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Pradeep Natarajan
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA.
- Department of Medicine, Harvard Medical School, Boston, MA, USA.
| |
|
17
|
Wu Z, Li T, Jiang Z, Zheng J, Gu Y, Liu Y, Liu Y, Xie Z. Human pangenome analysis of sequences missing from the reference genome reveals their widespread evolutionary, phenotypic, and functional roles. Nucleic Acids Res 2024; 52:2212-2230. [PMID: 38364871 PMCID: PMC10954445 DOI: 10.1093/nar/gkae086] [Received: 05/30/2023] [Revised: 01/18/2024] [Accepted: 01/27/2024] [Indexed: 02/18/2024] Open
Abstract
Nonreference sequences (NRSs) are DNA sequences present in global populations but absent in the current human reference genome. However, the extent and functional significance of NRSs in human genomes and populations remain unclear. Here, we de novo assembled 539 genomes from five genetically divergent human populations using long-read sequencing technology, resulting in the identification of 5.1 million NRSs. These were merged into 45,284 unique NRSs, with 29.7% being novel discoveries. Among these NRSs, 38.7% were common across the five populations, and 35.6% were population specific. The use of a graph-based pangenome approach allowed for the detection of 565 transcript expression quantitative trait loci on NRSs, with 426 of these being novel findings. Moreover, 26 NRS candidates displayed evidence of adaptive selection within human populations. Genes situated in close proximity to or intersecting with these candidates may be associated with metabolism and type 2 diabetes. Genome-wide association studies revealed 14 NRSs to be significantly associated with eight phenotypes. Additionally, 154 NRSs were found to be in strong linkage disequilibrium with 258 phenotype-associated SNPs in the GWAS catalogue. Our work expands the understanding of human NRSs and provides novel insights into their functions, facilitating evolutionary and biomedical research.
Affiliation(s)
- Zhikun Wu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Tong Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Zehang Jiang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Jingjing Zheng
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Yizhou Gu
- Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China
- University of Wisconsin-Madison, WI, USA
| | - Yizhi Liu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Yun Liu
- MOE Key Laboratory of Metabolism and Molecular Medicine, Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences and Shanghai Xuhui Central Hospital, Fudan University, Shanghai, China
| | - Zhi Xie
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China
| |
|
18
|
Wang Z, Fu G, Ma G, Wang C, Wang Q, Lu C, Fu L, Zhang X, Cong B, Li S. The association between DNA methylation and human height and a prospective model of DNA methylation-based height prediction. Hum Genet 2024; 143:401-421. [PMID: 38507014 DOI: 10.1007/s00439-024-02659-0] [Received: 09/12/2023] [Accepted: 02/13/2024] [Indexed: 03/22/2024]
Abstract
As a vital anthropometric characteristic, human height information not only helps to understand overall developmental status and genetic risk factors, but is also important for forensic DNA phenotyping. We utilized linear regression analysis to test the association between each CpG probe and the height phenotype. Next, we designed a methylation sequencing panel targeting 959 CpGs and subsequent height inference models were constructed for the Chinese population. A total of 11,730 height-associated sites were identified. By employing KPCA and deep neural networks, a prediction model was developed, of which the cross-validation RMSE, MAE and R² were 5.62 cm, 4.45 cm and 0.64, respectively. Genetic factors could explain 39.4% of the methylation level variance of sites used in the height inference models. Collectively, we demonstrated an association between height and DNA methylation status through an EWAS. Targeted methylation sequencing of only 959 CpGs combined with deep learning techniques could provide a model to estimate human height with higher accuracy than SNP-based prediction models.
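For readers unfamiliar with the three accuracy measures quoted above, the following sketch shows how RMSE, MAE and the coefficient of determination are conventionally computed from observed and predicted values. It uses the standard textbook definitions, not the authors' code, and the heights in the example are invented for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for paired observed and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [pred - obs for obs, pred in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)                   # root mean squared error
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((obs - mean_true) ** 2 for obs in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot                     # coefficient of determination
    return rmse, mae, r2


# Hypothetical heights in cm, for illustration only.
observed = [160.0, 170.0, 180.0]
predicted = [162.0, 170.0, 178.0]
rmse, mae, r2 = regression_metrics(observed, predicted)
```

RMSE and MAE are in the units of the trait (here centimetres), whereas R² is unitless, which is why the abstract reports the first two in cm and the third as a proportion.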
Affiliation(s)
- Zhonghua Wang
- College of Forensic Medicine, Hebei Key Laboratory of Forensic Medicine, Collaborative Innovation Center of Forensic Medical Molecular Identification, Research Unit of Digestive Tract Microecosystem Pharmacology and Toxicology, Hebei Medical University, Chinese Academy of Medical Sciences, Shijiazhuang, 050017, Hebei, China
| | - Guangping Fu
- College of Forensic Medicine, Hebei Key Laboratory of Forensic Medicine, Collaborative Innovation Center of Forensic Medical Molecular Identification, Research Unit of Digestive Tract Microecosystem Pharmacology and Toxicology, Hebei Medical University, Chinese Academy of Medical Sciences, Shijiazhuang, 050017, Hebei, China
| | - Guanju Ma
- College of Forensic Medicine, Hebei Key Laboratory of Forensic Medicine, Collaborative Innovation Center of Forensic Medical Molecular Identification, Research Unit of Digestive Tract Microecosystem Pharmacology and Toxicology, Hebei Medical University, Chinese Academy of Medical Sciences, Shijiazhuang, 050017, Hebei, China
| | - Chunyan Wang
- Physical Examination Center of Shijiazhuang People's Hospital, Shijiazhuang, 050011, Hebei, China
| | - Qian Wang
- College of Forensic Medicine, Hebei Key Laboratory of Forensic Medicine, Collaborative Innovation Center of Forensic Medical Molecular Identification, Research Unit of Digestive Tract Microecosystem Pharmacology and Toxicology, Hebei Medical University, Chinese Academy of Medical Sciences, Shijiazhuang, 050017, Hebei, China
| | - Chaolong Lu
- College of Forensic Medicine, Hebei Key Laboratory of Forensic Medicine, Collaborative Innovation Center of Forensic Medical Molecular Identification, Research Unit of Digestive Tract Microecosystem Pharmacology and Toxicology, Hebei Medical University, Chinese Academy of Medical Sciences, Shijiazhuang, 050017, Hebei, China
| | - Lihong Fu
- College of Forensic Medicine, Hebei Key Laboratory of Forensic Medicine, Collaborative Innovation Center of Forensic Medical Molecular Identification, Research Unit of Digestive Tract Microecosystem Pharmacology and Toxicology, Hebei Medical University, Chinese Academy of Medical Sciences, Shijiazhuang, 050017, Hebei, China
| | - Xiaojing Zhang
- College of Forensic Medicine, Hebei Key Laboratory of Forensic Medicine, Collaborative Innovation Center of Forensic Medical Molecular Identification, Research Unit of Digestive Tract Microecosystem Pharmacology and Toxicology, Hebei Medical University, Chinese Academy of Medical Sciences, Shijiazhuang, 050017, Hebei, China
| | - Bin Cong
- College of Forensic Medicine, Hebei Key Laboratory of Forensic Medicine, Collaborative Innovation Center of Forensic Medical Molecular Identification, Research Unit of Digestive Tract Microecosystem Pharmacology and Toxicology, Hebei Medical University, Chinese Academy of Medical Sciences, Shijiazhuang, 050017, Hebei, China
| | - Shujin Li
- College of Forensic Medicine, Hebei Key Laboratory of Forensic Medicine, Collaborative Innovation Center of Forensic Medical Molecular Identification, Research Unit of Digestive Tract Microecosystem Pharmacology and Toxicology, Hebei Medical University, Chinese Academy of Medical Sciences, Shijiazhuang, 050017, Hebei, China.
| |
|
19
|
Ling E, Nemesh J, Goldman M, Kamitaki N, Reed N, Handsaker RE, Genovese G, Vogelgsang JS, Gerges S, Kashin S, Ghosh S, Esposito JM, Morris K, Meyer D, Lutservitz A, Mullally CD, Wysoker A, Spina L, Neumann A, Hogan M, Ichihara K, Berretta S, McCarroll SA. A concerted neuron-astrocyte program declines in ageing and schizophrenia. Nature 2024; 627:604-611. [PMID: 38448582 PMCID: PMC10954558 DOI: 10.1038/s41586-024-07109-5] [Received: 12/04/2022] [Accepted: 01/23/2024] [Indexed: 03/08/2024]
Abstract
Human brains vary across people and over time; such variation is not yet understood in cellular terms. Here we describe a relationship between people's cortical neurons and cortical astrocytes. We used single-nucleus RNA sequencing to analyse the prefrontal cortex of 191 human donors aged 22-97 years, including healthy individuals and people with schizophrenia. Latent-factor analysis of these data revealed that, in people whose cortical neurons more strongly expressed genes encoding synaptic components, cortical astrocytes more strongly expressed distinct genes with synaptic functions and genes for synthesizing cholesterol, an astrocyte-supplied component of synaptic membranes. We call this relationship the synaptic neuron and astrocyte program (SNAP). In schizophrenia and ageing, two conditions that involve declines in cognitive flexibility and plasticity [1,2], cells divested from SNAP: astrocytes, glutamatergic (excitatory) neurons and GABAergic (inhibitory) neurons all showed reduced SNAP expression to corresponding degrees. The distinct astrocytic and neuronal components of SNAP both involved genes in which genetic risk factors for schizophrenia were strongly concentrated. SNAP, which varies quantitatively even among healthy people of similar age, may underlie many aspects of normal human interindividual differences and may be an important point of convergence for multiple kinds of pathophysiology.
Affiliation(s)
- Emi Ling
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Genetics, Harvard Medical School, Boston, MA, USA.
| | - James Nemesh
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Melissa Goldman
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Nolan Kamitaki
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Nora Reed
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Robert E Handsaker
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Giulio Genovese
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Jonathan S Vogelgsang
- McLean Hospital, Belmont, MA, USA
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA
| | - Sherif Gerges
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Seva Kashin
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Sulagna Ghosh
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | | | | | - Daniel Meyer
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Alyssa Lutservitz
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Christopher D Mullally
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Alec Wysoker
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Liv Spina
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Anna Neumann
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Marina Hogan
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Kiku Ichihara
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Sabina Berretta
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- McLean Hospital, Belmont, MA, USA.
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA.
- Program in Neuroscience, Harvard Medical School, Boston, MA, USA.
| | - Steven A McCarroll
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Genetics, Harvard Medical School, Boston, MA, USA.
| |
|
20
|
Fazzari V, Moo-Choy A, Panoyan MA, Abbatangelo CL, Polimanti R, Novroski NM, Wendt FR. Multi-ancestry tandem repeat association study of hair colour using exome-wide sequencing. bioRxiv 2024:2024.02.24.581865. [PMID: 38464141 PMCID: PMC10925195 DOI: 10.1101/2024.02.24.581865] [Indexed: 03/12/2024]
Abstract
Hair colour variation is influenced by hundreds of positions across the human genome, but this genetic contribution has only been narrowly explored. Genome-wide association studies identified single nucleotide polymorphisms (SNPs) influencing hair colour, but the biology underlying these associations is challenging to interpret. We report 16 tandem repeats (TRs) with effects on different models of hair colour plus two TRs associated with hair colour in diverse ancestry groups. Several of these TRs expand or contract amino acid coding regions of their localized protein such that structure, and by extension function, may be altered. We also demonstrate that, independent of SNP variation, these TRs can be used to create an additive polygenic score that predicts darker hair colour. This work adds to the growing body of evidence regarding TR influence on human traits with relatively large and independent effects relative to surrounding SNP variation.
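An additive polygenic score of the kind mentioned above is, in its generic form, a weighted sum of per-locus genotype values. The sketch below assumes that each TR genotype is summarised as the sum of the two allele repeat counts and that each locus has a single effect weight. Both choices, along with the locus names and numbers, are hypothetical and are not taken from the preprint.

```python
def additive_tr_score(genotypes, weights):
    """Weighted sum of summed allele lengths over loci that have a weight.

    genotypes: dict mapping locus name -> (allele1_repeats, allele2_repeats)
    weights:   dict mapping locus name -> per-repeat-unit effect size
    """
    score = 0.0
    for locus, alleles in genotypes.items():
        if locus not in weights:
            continue  # loci without an effect estimate do not contribute
        dosage = sum(alleles)  # additive coding: total repeat units carried
        score += weights[locus] * dosage
    return score


# Hypothetical individual and effect sizes, for illustration only.
person = {"TR_A": (10, 12), "TR_B": (5, 5), "TR_C": (7, 9)}
effects = {"TR_A": 0.1, "TR_B": -0.2}
example_score = additive_tr_score(person, effects)
```

Other genotype summaries, such as the longer allele only or the deviation from a reference length, would change the dosage line but not the additive structure of the score.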
|
21
|
Hong EP, Ramos EM, Aziz NA, Massey TH, McAllister B, Lobanov S, Jones L, Holmans P, Kwak S, Orth M, Ciosi M, Lomeikaite V, Monckton DG, Long JD, Lucente D, Wheeler VC, Gillis T, MacDonald ME, Sequeiros J, Gusella JF, Lee JM. Modification of Huntington's disease by short tandem repeats. Brain Commun 2024; 6:fcae016. [PMID: 38449714 PMCID: PMC10917446 DOI: 10.1093/braincomms/fcae016] [Received: 07/26/2023] [Revised: 12/20/2023] [Accepted: 01/22/2024] [Indexed: 03/08/2024] Open
Abstract
Expansions of glutamine-coding CAG trinucleotide repeats cause a number of neurodegenerative diseases, including Huntington's disease and several spinocerebellar ataxias. In general, age-at-onset of the polyglutamine diseases is inversely correlated with the size of the respective inherited expanded CAG repeat. Expanded CAG repeats are also somatically unstable in certain tissues, and age-at-onset of Huntington's disease corrected for individual HTT CAG repeat length (i.e. residual age-at-onset) is modified by repeat instability-related DNA maintenance/repair genes, as demonstrated by recent genome-wide association studies. Modification of one polyglutamine disease (e.g. Huntington's disease) by the repeat length of another (e.g. ATXN3, CAG expansions in which cause spinocerebellar ataxia 3) has also been hypothesized. Consequently, we determined whether age-at-onset in Huntington's disease is modified by the CAG repeats of other polyglutamine disease genes. We found that the measured CAG repeat sizes of other polyglutamine disease genes were polymorphic in Huntington's disease participants but did not influence Huntington's disease age-at-onset. Additional analysis focusing specifically on ATXN3 in a larger sample set (n = 1388) confirmed the lack of association between Huntington's disease residual age-at-onset and ATXN3 CAG repeat length. Additionally, neither our Huntington's disease onset modifier genome-wide association study single nucleotide polymorphism data nor imputed short tandem repeat data supported the involvement of other polyglutamine disease genes in modifying Huntington's disease. By contrast, our genome-wide association studies based on imputed short tandem repeats revealed significant modification signals for other genomic regions. Together, our short tandem repeat genome-wide association studies show that modification of Huntington's disease is associated with short tandem repeats that do not involve other polyglutamine disease-causing genes, refining the landscape of Huntington's disease modification and highlighting the importance of rigorous data analysis, especially in genetic studies testing candidate modifiers.
Affiliation(s)
- Eun Pyo Hong
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
| | - Eliana Marisa Ramos
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
| | - N Ahmad Aziz
- Population & Clinical Neuroepidemiology, German Center for Neurodegenerative Diseases, 53127 Bonn, Germany
- Department of Neurology, Faculty of Medicine, University of Bonn, Bonn D-53113, Germany
| | - Thomas H Massey
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Branduff McAllister
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Sergey Lobanov
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Lesley Jones
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Peter Holmans
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Seung Kwak
- Molecular System Biology, CHDI Foundation, Princeton, NJ 08540, USA
| | - Michael Orth
- University Hospital of Old Age Psychiatry and Psychotherapy, Bern University, CH-3000 Bern 60, Switzerland
| | - Marc Ciosi
- School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
| | - Vilija Lomeikaite
- School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
| | - Darren G Monckton
- School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
| | - Jeffrey D Long
- Department of Psychiatry, Carver College of Medicine and Department of Biostatistics, College of Public Health, University of Iowa, Iowa City, IA 52242, USA
| | - Diane Lucente
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Vanessa C Wheeler
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
| | - Tammy Gillis
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Marcy E MacDonald
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
| | - Jorge Sequeiros
- UnIGENe, IBMC—Institute for Molecular and Cell Biology, i3S—Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto 420-135, Portugal
- ICBAS School of Medicine and Biomedical Sciences, University of Porto, Porto 420-135, Portugal
| | - James F Gusella
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
| | - Jong-Min Lee
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
| |
|
22
|
Manigbas CA, Jadhav B, Garg P, Shadrina M, Lee W, Martin-Trujillo A, Sharp AJ. A phenome-wide association study of tandem repeat variation in 168,554 individuals from the UK Biobank. medRxiv 2024:2024.01.22.24301630. [PMID: 38343850 PMCID: PMC10854328 DOI: 10.1101/2024.01.22.24301630] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/15/2024]
Abstract
Most genetic association studies focus on binary variants. To identify the effects of multi-allelic variation of tandem repeats (TRs) on human traits, we performed direct TR genotyping and phenome-wide association studies in 168,554 individuals from the UK Biobank, identifying 47 TRs showing causal associations with 73 traits. We replicated 23 of 31 (74%) of these causal associations in the All of Us cohort. While this set included several known repeat expansion disorders, novel associations we found were attributable to common polymorphic variation in TR length rather than rare expansions and include e.g. a coding polyhistidine motif in HRCT1 influencing risk of hypertension and a poly(CGC) in the 5'UTR of GNB2 influencing heart rate. Causal TRs were strongly enriched for associations with local gene expression and DNA methylation. Our study highlights the contribution of multi-allelic TRs to the "missing heritability" of the human genome.
|
23
|
Ling E, Nemesh J, Goldman M, Kamitaki N, Reed N, Handsaker RE, Genovese G, Vogelgsang JS, Gerges S, Kashin S, Ghosh S, Esposito JM, French K, Meyer D, Lutservitz A, Mullally CD, Wysoker A, Spina L, Neumann A, Hogan M, Ichihara K, Berretta S, McCarroll SA. Concerted neuron-astrocyte gene expression declines in aging and schizophrenia. bioRxiv 2024:2024.01.07.574148. [PMID: 38260461 PMCID: PMC10802483 DOI: 10.1101/2024.01.07.574148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Human brains vary across people and over time; such variation is not yet understood in cellular terms. Here we describe a striking relationship between people's cortical neurons and cortical astrocytes. We used single-nucleus RNA-seq to analyze the prefrontal cortex of 191 human donors ages 22-97 years, including healthy individuals and persons with schizophrenia. Latent-factor analysis of these data revealed that in persons whose cortical neurons more strongly expressed genes for synaptic components, cortical astrocytes more strongly expressed distinct genes with synaptic functions and genes for synthesizing cholesterol, an astrocyte-supplied component of synaptic membranes. We call this relationship the Synaptic Neuron-and-Astrocyte Program (SNAP). In schizophrenia and aging - two conditions that involve declines in cognitive flexibility and plasticity 1,2 - cells had divested from SNAP: astrocytes, glutamatergic (excitatory) neurons, and GABAergic (inhibitory) neurons all reduced SNAP expression to corresponding degrees. The distinct astrocytic and neuronal components of SNAP both involved genes in which genetic risk factors for schizophrenia were strongly concentrated. SNAP, which varies quantitatively even among healthy persons of similar age, may underlie many aspects of normal human interindividual differences and be an important point of convergence for multiple kinds of pathophysiology.
Affiliation(s)
- Emi Ling
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - James Nemesh
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Melissa Goldman
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Nolan Kamitaki
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Nora Reed
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Robert E. Handsaker
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Giulio Genovese
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Jonathan S. Vogelgsang
- McLean Hospital, Belmont, MA 02478, USA
- Department of Psychiatry, Harvard Medical School, Boston, MA 02215, USA
| | - Sherif Gerges
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Seva Kashin
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Sulagna Ghosh
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | | | | | - Daniel Meyer
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Alyssa Lutservitz
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Christopher D. Mullally
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Alec Wysoker
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Liv Spina
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Anna Neumann
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Marina Hogan
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Kiku Ichihara
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Sabina Berretta
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- McLean Hospital, Belmont, MA 02478, USA
- Department of Psychiatry, Harvard Medical School, Boston, MA 02215, USA
- Program in Neuroscience, Harvard Medical School, Boston, MA 02215, USA
| | - Steven A. McCarroll
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| |
|
24
|
Parikh K, Quintero Reis A, Wendt FR. Association between suicidal ideation and tandem repeats in contactins. Front Psychiatry 2024; 14:1236540. [PMID: 38239902 PMCID: PMC10794671 DOI: 10.3389/fpsyt.2023.1236540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 12/13/2023] [Indexed: 01/22/2024] Open
Abstract
Background Death by suicide is one of the leading causes of death among adolescents. Genome-wide association studies (GWAS) have identified loci that associate with suicidal ideation and related behaviours. One such group of loci are the six contactin genes (CNTN1-6) that are critical to neurodevelopment through regulating neurite structure. Because single nucleotide polymorphisms (SNPs) detected by GWAS often map to non-coding intergenic regions, we investigated whether repetitive variants in CNTNs associated with suicidality in a young cohort aged 8 to 21. Understanding the genetic liability of suicidal thought and behavior in this age group will promote early intervention and treatment. Methods Genotypic and phenotypic data were obtained from the Philadelphia Neurodevelopment Cohort (PNC). Across six CNTNs, 232 short tandem repeats (STRs) were analyzed in up to 4,595 individuals of European ancestry who expressed current, previous, or no suicidal ideation. STRs were imputed into SNP arrays using a phased SNP-STR haplotype reference panel from the 1000 Genomes Project. We tested several additive and interactive models of locus-level burden (i.e., sum of STR alleles) with respect to suicidal ideation. Additive models included sex, birth year, developmental stage ("DevStage"), and the first 10 principal components of ancestry as covariates; interactive models assessed the effect of STR-by-DevStage considering all other covariates. Results CNTN1-[T]N interacted with DevStage to increase risk for current suicidal ideation (CNTN1-[T]N-by-DevStage; p = 0.00035). Compared to the youngest age group, the middle (OR = 1.80, p = 0.0514) and oldest (OR = 3.82, p = 0.0002) participant groups had significantly higher odds of suicidal ideation as their STR length expanded; this result was independent of polygenic scores for suicidal ideation. 
Discussion These findings highlight diversity in the genetic effects (i.e., SNP and STR) acting on suicidal thoughts and behavior and advance our understanding of suicidal ideation across childhood and adolescence.
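The locus-level burden mentioned in the Methods ("sum of STR alleles") can be illustrated with a minimal Python sketch. The function name, the data layout (a mapping from STR identifier to the repeat counts of the two alleles), and the example values are hypothetical and are not taken from the study; the authors' actual pipeline may define the burden differently.

```python
def locus_burden(genotypes):
    """Sum the repeat counts of both alleles over every STR in one locus.

    `genotypes` maps a hypothetical STR identifier to a pair of
    repeat counts (one per allele) for a single individual.
    """
    return sum(allele_1 + allele_2 for allele_1, allele_2 in genotypes.values())


# Hypothetical individual with two STRs in one gene (values are made up).
example_genotypes = {"STR_1": (12, 14), "STR_2": (9, 9)}
burden = locus_burden(example_genotypes)
print(burden)
```

A score of this kind could then enter a regression model as a single predictor per gene, alongside the covariates listed in the abstract.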
Affiliation(s)
- Kairavi Parikh
- Forensic Science Program, University of Toronto, Mississauga, ON, Canada
| | - Andrea Quintero Reis
- Forensic Science Program, University of Toronto, Mississauga, ON, Canada
- Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Department of Anthropology, University of Toronto, Mississauga, ON, Canada
| | - Frank R. Wendt
- Forensic Science Program, University of Toronto, Mississauga, ON, Canada
- Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Department of Anthropology, University of Toronto, Mississauga, ON, Canada
| |
|
25
|
Birnbaum R. Rediscovering tandem repeat variation in schizophrenia: challenges and opportunities. Transl Psychiatry 2023; 13:402. [PMID: 38123544 PMCID: PMC10733427 DOI: 10.1038/s41398-023-02689-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 11/23/2023] [Accepted: 11/27/2023] [Indexed: 12/23/2023] Open
Abstract
Tandem repeats (TRs) are prevalent throughout the genome, constituting at least 3% of the genome, and are often highly polymorphic. The high mutation rate of TRs, which can be orders of magnitude higher than that of single-nucleotide polymorphisms and indels, indicates that they are likely to make significant contributions to phenotypic variation, yet their contribution to schizophrenia has been largely ignored by recent genome-wide association studies (GWAS). Tandem repeat expansions are already known causative factors for over 50 disorders, while common tandem repeat variation is increasingly being identified as significantly associated with complex disease and gene regulation. The current review summarizes key background concepts of tandem repeat variation as it pertains to disease risk, elucidating their potential for schizophrenia association. An overview of next-generation sequencing-based methods that may be applied for TR genome-wide identification is provided, and some key methodological challenges in TR analyses are delineated.
Affiliation(s)
- Rebecca Birnbaum
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| |
|
26
|
Panoyan MA, Wendt FR. The role of tandem repeat expansions in brain disorders. Emerg Top Life Sci 2023; 7:249-263. [PMID: 37401564 DOI: 10.1042/etls20230022] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/14/2023] [Revised: 06/05/2023] [Accepted: 06/19/2023] [Indexed: 07/05/2023]
Abstract
The human genome contains numerous genetic polymorphisms contributing to different health and disease outcomes. Tandem repeat (TR) loci are highly polymorphic yet under-investigated in large genomic studies, which has prompted research efforts to identify novel variations and gain a deeper understanding of their role in human biology and disease outcomes. We summarize the current understanding of TRs and their implications for human health and disease, including an overview of the challenges encountered when conducting TR analyses and potential solutions to overcome these challenges. By shedding light on these issues, this article aims to contribute to a better understanding of the impact of TRs on the development of new disease treatments.
Affiliation(s)
- Mary Anne Panoyan
- Department of Anthropology, University of Toronto, Mississauga, ON, Canada
| | - Frank R Wendt
- Department of Anthropology, University of Toronto, Mississauga, ON, Canada
- Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Forensic Science Program, University of Toronto, Mississauga, ON, Canada
| |
|
27
|
Chaisson MJP, Sulovari A, Valdmanis PN, Miller DE, Eichler EE. Advances in the discovery and analyses of human tandem repeats. Emerg Top Life Sci 2023; 7:361-381. [PMID: 37905568 PMCID: PMC10806765 DOI: 10.1042/etls20230074] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/12/2023] [Revised: 10/18/2023] [Accepted: 10/18/2023] [Indexed: 11/02/2023]
Abstract
Long-read sequencing platforms provide unparalleled access to the structure and composition of all classes of tandemly repeated DNA from STRs to satellite arrays. This review summarizes our current understanding of their organization within the human genome, their importance with respect to disease, as well as the advances and challenges in understanding their genetic diversity and functional effects. Novel computational methods are being developed to visualize and associate these complex patterns of human variation with disease, expression, and epigenetic differences. We predict accurate characterization of this repeat-rich form of human variation will become increasingly relevant to both basic and clinical human genetics.
Affiliation(s)
- Mark J P Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, U.S.A
- The Genomic and Epigenomic Regulation Program, USC Norris Cancer Center, University of Southern California, Los Angeles, CA 90089, U.S.A
| | - Arvis Sulovari
- Computational Biology, Cajal Neuroscience Inc, Seattle, WA 98102, U.S.A
| | - Paul N Valdmanis
- Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA 98195, U.S.A
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, U.S.A
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, U.S.A
| | - Danny E Miller
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, U.S.A
- Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA 98195, U.S.A
- Department of Pediatrics, University of Washington, Seattle, WA 98195, U.S.A
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, U.S.A
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, U.S.A
| |
|
28
|
Loh PR. Uncovering complex trait heritability hidden in the repeatome. Cell Genomics 2023; 3:100461. [PMID: 38116125 PMCID: PMC10726486 DOI: 10.1016/j.xgen.2023.100461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/21/2023]
Abstract
Short tandem repeats (STRs) account for a substantial fraction of human genetic variation, but their contribution to complex human phenotypes is largely unknown. Margoliash et al. perform detailed genome-wide association analysis and fine-mapping of STRs in UK Biobank, identifying many STRs likely to influence variation in blood and serum traits.
Affiliation(s)
- Po-Ru Loh
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| |
|
29
|
Margoliash J, Fuchs S, Li Y, Zhang X, Massarat A, Goren A, Gymrek M. Polymorphic short tandem repeats make widespread contributions to blood and serum traits. Cell Genomics 2023; 3:100458. [PMID: 38116119 PMCID: PMC10726533 DOI: 10.1016/j.xgen.2023.100458] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/21/2022] [Revised: 09/09/2023] [Accepted: 11/07/2023] [Indexed: 12/21/2023]
Abstract
Short tandem repeats (STRs) are genomic regions consisting of repeated sequences of 1-6 bp in succession. Single-nucleotide polymorphism (SNP)-based genome-wide association studies (GWASs) do not fully capture STR effects. To study these effects, we imputed 445,720 STRs into genotype arrays from 408,153 White British UK Biobank participants and tested for association with 44 blood phenotypes. Using two fine-mapping methods, we identify 119 candidate causal STR-trait associations and estimate that STRs account for 5.2%-7.6% of causal variants identifiable from GWASs for these traits. These are among the strongest associations for multiple phenotypes, including a coding CTG repeat associated with apolipoprotein B levels, a promoter CGG repeat with platelet traits, and an intronic poly(A) repeat with mean platelet volume. Our study suggests that STRs make widespread contributions to complex traits, provides stringently selected candidate causal STRs, and demonstrates the need to consider a more complete view of genetic variation in GWASs.
Affiliation(s)
- Jonathan Margoliash
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
| | - Shai Fuchs
- Pediatric Endocrine and Diabetes Unit, Edmond and Lily Safra Children’s Hospital, Sheba Medical Center, Ramat Gan, Israel
| | - Yang Li
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
| | - Xuan Zhang
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
| | - Arya Massarat
- Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA 92093, USA
| | - Alon Goren
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
| | - Melissa Gymrek
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
| |
|
30
|
Hannan AJ. Expanding horizons of tandem repeats in biology and medicine: Why 'genomic dark matter' matters. Emerg Top Life Sci 2023; 7:ETLS20230075. [PMID: 38088823 PMCID: PMC10754335 DOI: 10.1042/etls20230075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 11/27/2023] [Accepted: 11/27/2023] [Indexed: 12/30/2023]
Abstract
Approximately half of the human genome includes repetitive sequences, and these DNA sequences (as well as their transcribed repetitive RNA and translated amino-acid repeat sequences) are known as the repeatome. Within this repeatome there are a couple of million tandem repeats, dispersed throughout the genome. These tandem repeats have been estimated to constitute ∼8% of the entire human genome. These tandem repeats can be located throughout exons, introns and intergenic regions, thus potentially affecting the structure and function of tandemly repetitive DNA, RNA and protein sequences. Over more than three decades, more than 60 monogenic human disorders have been found to be caused by tandem-repeat mutations. These monogenic tandem-repeat disorders include Huntington's disease, a variety of ataxias, amyotrophic lateral sclerosis and frontotemporal dementia, as well as many other neurodegenerative diseases. Furthermore, tandem-repeat disorders can include fragile X syndrome, related fragile X disorders, as well as other neurological and psychiatric disorders. However, these monogenic tandem-repeat disorders, which were discovered via their dominant or recessive modes of inheritance, may represent the 'tip of the iceberg' with respect to tandem-repeat contributions to human disorders. A previous proposal that tandem repeats may contribute to the 'missing heritability' of various common polygenic human disorders has recently been supported by a variety of new evidence. This includes genome-wide studies that associate tandem-repeat mutations with autism, schizophrenia, Parkinson's disease and various types of cancers. In this article, I will discuss how tandem-repeat mutations and polymorphisms could contribute to a wide range of common disorders, along with some of the many major challenges of tandem-repeat biology and medicine. 
Finally, I will discuss the potential of tandem repeats to be therapeutically targeted, so as to prevent and treat an expanding range of human disorders.
Affiliation(s)
- Anthony J Hannan
- Florey Institute of Neuroscience and Mental Health, University of Melbourne, Parkville, Victoria 3010, Australia
- Department of Anatomy and Physiology, University of Melbourne, Parkville, Victoria 3010, Australia
| |
|
31
|
Panoyan MA, Shi Y, Abbatangelo CL, Adler N, Moo-Choy A, Parra EJ, Polimanti R, Hu P, Wendt FR. Exome-wide tandem repeats confer large effects on subcortical volumes in UK Biobank participants. medRxiv 2023:2023.12.11.23299818. [PMID: 38168307 PMCID: PMC10760277 DOI: 10.1101/2023.12.11.23299818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
The human subcortex is involved in memory and cognition. Structural and functional changes in subcortical regions are implicated in psychiatric conditions. We performed an association study of subcortical volumes using 15,941 tandem repeats (TRs) derived from whole exome sequencing (WES) data in 16,527 unrelated European ancestry participants. We identified 17 loci, most of which were associated with accumbens volume, and nine of which had fine-mapping probability supporting their causal effect on subcortical volume independent of surrounding variation. The most significant association involved NTN1-[GCGG]N and increased accumbens volume (β=5.93, P=8.16×10⁻⁹). Three exonic TRs had large effects on thalamus volume (LAT2-[CATC]N β=-949, P=3.84×10⁻⁶ and SLC39A4-[CAG]N β=-1599, P=2.42×10⁻⁸) and pallidum volume (MCM2-[AGG]N β=-404.9, P=147×10⁻⁷). These genetic effects were consistent measurements of per-repeat expansion/contraction effects on organism fitness. With 3-dimensional modeling, we reinforced these effects to show that the expanded and contracted LAT2-[CATC]N repeat causes a frameshift mutation that prevents appropriate protein folding. These TRs also exhibited independent effects on several psychiatric symptoms, including LAT2-[CATC]N and the tiredness/low energy symptom of depression (β=0.340, P=0.003). These findings link genetic variation to tractable biology in the brain and relevant psychiatric symptoms. We also chart one pathway for TR prioritization in future complex trait genetic studies.
|
32
|
Reeskamp LF, Tromp TR, Patel AP, Ibrahim S, Trinder M, Haidermota S, Hovingh GK, Stroes ESG, Natarajan P, Khera AV. Concordance of a High Lipoprotein(a) Concentration Among Relatives. JAMA Cardiol 2023; 8:1111-1118. [PMID: 37819667 PMCID: PMC10568442 DOI: 10.1001/jamacardio.2023.3548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 08/14/2023] [Indexed: 10/13/2023]
Abstract
Importance: Lipoprotein(a) (Lp[a]) concentrations are a highly heritable and potential causal risk factor for atherosclerotic cardiovascular disease (ASCVD). Recent consensus statements by the European Atherosclerosis Society and American Heart Association recommend screening of relatives of individuals with high Lp(a) concentrations, but the expected yield of this approach has not been quantified in large populations. Objective: To measure the prevalence of high Lp(a) concentrations among first- and second-degree relatives of individuals with high Lp(a) concentrations compared with unrelated participants. Design, Setting, and Participants: In this cross-sectional analysis, pairs of first-degree (n = 19 899) and second-degree (n = 9715) relatives with measured Lp(a) levels from the UK Biobank study and random pairs of unrelated individuals (n = 184 764) were compared. Data for this study were collected from March 2006 to August 2010 and analyzed from December 2021 to August 2023. Exposure: Serum Lp(a) levels, with a high Lp(a) level defined as at least 125 nmol/L. Main Outcome and Measure: Concordance of clinically relevant high Lp(a) levels in first- and second-degree relatives of index participants with high Lp(a) levels. Results: A total of 52 418 participants were included in the analysis (mean [SD] age, 57.3 [8.0] years; 29 825 [56.9%] women). Levels of Lp(a) were correlated among pairs of first-degree (Spearman ρ = 0.45; P < .001) and second-degree (Spearman ρ = 0.22; P < .001) relatives. A total of 1607 of 3420 (47.0% [95% CI, 45.3%-48.7%]) first-degree and 514 of 1614 (31.8% [95% CI, 29.6%-34.2%]) second-degree relatives of index participants with high Lp(a) levels also had elevated concentrations compared with 4974 of 30 258 (16.4% [95% CI, 16.0%-16.9%]) pairs of unrelated individuals. The concordance in high Lp(a) levels was generally consistent among subgroups (eg, those with prior ASCVD, postmenopausal women, and statin users). The odds ratios for relatives to have high Lp(a) levels if their index relative had a high Lp(a) level compared with those whose index relatives did not have high Lp(a) levels were 7.4 (95% CI, 6.8-8.1) for first-degree relatives and 3.0 (95% CI, 2.7-3.4) for second-degree relatives. Conclusions and Relevance: The findings of this cross-sectional study suggest that the yield of cascade screening of first-degree relatives of individuals with high Lp(a) levels is over 40%. These findings support recent recommendations to use this approach to identify additional individuals at ASCVD risk based on Lp(a) concentrations.
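The concordance percentages in this abstract can be rechecked from the quoted counts. The sketch below is illustrative only: the abstract does not say which interval method the authors used, so the Wilson score interval is an assumption, and the names `wilson_interval`, `summarize`, and `groups` are hypothetical helpers introduced here.

```python
import math

def wilson_interval(successes, total, z=1.959964):
    """Wilson score interval for a proportion (assumed method, not stated in the abstract)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half

def summarize(successes, total):
    """Return (percentage, lower bound, upper bound), each rounded to one decimal place."""
    low, high = wilson_interval(successes, total)
    return tuple(round(100 * x, 1) for x in (successes / total, low, high))

# Counts quoted in the abstract: number with high Lp(a) / group size
groups = {
    "first-degree": (1607, 3420),
    "second-degree": (514, 1614),
    "unrelated": (4974, 30258),
}

for name, (k, n) in groups.items():
    pct, low, high = summarize(k, n)
    print(f"{name}: {pct}% (95% CI, {low}%-{high}%)")
```

Under this assumption the computed percentages and intervals agree with the values quoted in the abstract to one decimal place. The odds ratios cannot be rechecked the same way, because the abstract does not give counts for relatives of index participants without high Lp(a).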
Affiliation(s)
- Laurens F. Reeskamp
  - Department of Vascular Medicine, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, the Netherlands
  - Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, Massachusetts
- Tycho R. Tromp
  - Department of Vascular Medicine, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, the Netherlands
- Aniruddh P. Patel
  - Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, Massachusetts
  - Division of Cardiology and Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston
  - Department of Medicine, Harvard Medical School, Boston, Massachusetts
- Shirin Ibrahim
  - Department of Vascular Medicine, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, the Netherlands
- Mark Trinder
  - Centre for Heart Lung Innovation, Vancouver, British Columbia, Canada
- Sara Haidermota
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts
- G. Kees Hovingh
  - Department of Vascular Medicine, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, the Netherlands
  - Novo Nordisk, Copenhagen, Denmark
- Erik S. G. Stroes
  - Department of Vascular Medicine, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, the Netherlands
- Pradeep Natarajan
  - Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, Massachusetts
  - Division of Cardiology and Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston
  - Department of Medicine, Harvard Medical School, Boston, Massachusetts
- Amit V. Khera
  - Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, Massachusetts
  - Verve Therapeutics, Boston, Massachusetts

33
Tang D, Freudenberg J, Dahl A. Factorizing polygenic epistasis improves prediction and uncovers biological pathways in complex traits. Am J Hum Genet 2023; 110:1875-1887. [PMID: 37922884 PMCID: PMC10645564 DOI: 10.1016/j.ajhg.2023.10.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 10/04/2023] [Accepted: 10/05/2023] [Indexed: 11/07/2023] Open
Abstract
Epistasis is central in many domains of biology, but it has not yet been proven useful for understanding the etiology of complex traits. This is partly because complex-trait epistasis involves polygenic interactions that are poorly captured in current models. To address this gap, we developed a model called Epistasis Factor Analysis (EFA). EFA assumes that polygenic epistasis can be factorized into interactions between a few epistasis factors (EFs), which represent latent polygenic components of the observed complex trait. The statistical goals of EFA are to improve polygenic prediction and to increase power to detect epistasis, while the biological goal is to unravel genetic effects into more-homogeneous units. We mathematically characterize EFA and use simulations to show that EFA outperforms current epistasis models when its assumptions approximately hold. Applied to predicting yeast growth rates, EFA outperforms the additive model for several traits with large epistasis heritability and uniformly outperforms the standard epistasis model. We replicate these prediction improvements in a second dataset. We then apply EFA to four previously characterized traits in the UK Biobank and find statistically significant epistasis in all four, including two that are robust to scale transformation. Moreover, we find that the inferred EFs partly recover pre-defined biological pathways for two of the traits. Our results demonstrate that more realistic models can identify biologically and statistically meaningful epistasis in complex traits, indicating that epistasis has potential for precision medicine and characterizing the biology underlying GWAS results.
Affiliation(s)
- David Tang
  - Section of Genetic Medicine, University of Chicago, Chicago, IL, USA
  - Program in Bioinformatics and Integrative Genomics, Harvard Medical School, Boston, MA, USA.
- Jerome Freudenberg
  - Section of Genetic Medicine, University of Chicago, Chicago, IL, USA
  - Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Andy Dahl
  - Section of Genetic Medicine, University of Chicago, Chicago, IL, USA.

34
English A, Dolzhenko E, Jam HZ, Mckenzie S, Olson ND, De Coster W, Park J, Gu B, Wagner J, Eberle MA, Gymrek M, Chaisson MJP, Zook JM, Sedlazeck FJ. Benchmarking of small and large variants across tandem repeats. bioRxiv 2023:2023.10.29.564632. [PMID: 37961319 PMCID: PMC10634962 DOI: 10.1101/2023.10.29.564632] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 11/15/2023]
Abstract
Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits, and are linked to over 60 disease phenotypes. However, their complexity often excludes them from at-scale studies due to challenges with variant calling, representation, and lack of a genome-wide standard. To promote TR methods development, we create a comprehensive catalog of TR regions and explore its properties across 86 samples. We then curate variants from the GIAB HG002 individual to create a tandem repeat benchmark. We also present a variant comparison method that handles small and large alleles and varying allelic representation. The 8.1% of the genome covered by the TR catalog holds ∼24.9% of variants per individual, including 124,728 small and 17,988 large variants for the GIAB HG002 TR benchmark. We work with the GIAB community to demonstrate the utility of this benchmark across short and long read technologies.

35
Ziaei Jam H, Li Y, DeVito R, Mousavi N, Ma N, Lujumba I, Adam Y, Maksimov M, Huang B, Dolzhenko E, Qiu Y, Kakembo FE, Joseph H, Onyido B, Adeyemi J, Bakhtiari M, Park J, Javadzadeh S, Jjingo D, Adebiyi E, Bafna V, Gymrek M. A deep population reference panel of tandem repeat variation. Nat Commun 2023; 14:6711. [PMID: 37872149 PMCID: PMC10593948 DOI: 10.1038/s41467-023-42278-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 03/10/2023] [Accepted: 10/05/2023] [Indexed: 10/25/2023] Open
Abstract
Tandem repeats (TRs) represent one of the largest sources of genetic variation in humans and are implicated in a range of phenotypes. Here we present a deep characterization of TR variation based on high coverage whole genome sequencing from 3550 diverse individuals from the 1000 Genomes Project and H3Africa cohorts. We develop a method, EnsembleTR, to integrate genotypes from four separate methods resulting in high-quality genotypes at more than 1.7 million TR loci. Our catalog reveals novel sequence features influencing TR heterozygosity, identifies population-specific trinucleotide expansions, and finds hundreds of novel eQTL signals. Finally, we generate a phased haplotype panel which can be used to impute most TRs from nearby single nucleotide polymorphisms (SNPs) with high accuracy. Overall, the TR genotypes and reference haplotype panel generated here will serve as valuable resources for future genome-wide and population-wide studies of TRs and their role in human phenotypes.
Affiliation(s)
- Helyaneh Ziaei Jam
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Yang Li
  - Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Ross DeVito
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Nima Mousavi
  - Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA, USA
- Nichole Ma
  - Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Ibra Lujumba
  - The African Center of Excellence in Bioinformatics and Data Intensive Sciences, the Infectious Diseases Institute, Makerere University, Kampala, Uganda
- Yagoub Adam
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun, 112233, Nigeria
- Mikhail Maksimov
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Bonnie Huang
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
- Yunjiang Qiu
  - Illumina Incorporated, San Diego, CA, 92122, USA
- Fredrick Elishama Kakembo
  - The African Center of Excellence in Bioinformatics and Data Intensive Sciences, the Infectious Diseases Institute, Makerere University, Kampala, Uganda
- Habi Joseph
  - The African Center of Excellence in Bioinformatics and Data Intensive Sciences, the Infectious Diseases Institute, Makerere University, Kampala, Uganda
- Blessing Onyido
  - Department of Computer & Information Sciences, Covenant University, Ota, Ogun, 112233, Nigeria
  - Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun, 112233, Nigeria
- Jumoke Adeyemi
  - Department of Computer & Information Sciences, Covenant University, Ota, Ogun, 112233, Nigeria
  - Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun, 112233, Nigeria
- Mehrdad Bakhtiari
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Jonghun Park
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Sara Javadzadeh
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Daudi Jjingo
  - The African Center of Excellence in Bioinformatics and Data Intensive Sciences, the Infectious Diseases Institute, Makerere University, Kampala, Uganda
  - Department of Computer Science, Makerere University, Kampala, Uganda
- Ezekiel Adebiyi
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun, 112233, Nigeria
  - Department of Computer & Information Sciences, Covenant University, Ota, Ogun, 112233, Nigeria
  - Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun, 112233, Nigeria
  - Applied Bioinformatics Division, German Cancer Research Center (DKFZ), Heidelberg, Baden-Württemberg, 69120, Germany
- Vineet Bafna
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Melissa Gymrek
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA.
  - Department of Medicine, University of California San Diego, La Jolla, CA, USA.

36
Nakanishi T, Willett J, Farjoun Y, Allen RJ, Guillen-Guio B, Adra D, Zhou S, Richards JB. Alternative splicing in lung influences COVID-19 severity and respiratory diseases. Nat Commun 2023; 14:6198. [PMID: 37794074 PMCID: PMC10550956 DOI: 10.1038/s41467-023-41912-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/05/2022] [Accepted: 09/21/2023] [Indexed: 10/06/2023] Open
Abstract
Alternative splicing generates functional diversity in isoforms, impacting immune response to infection. Here, we evaluate the causal role of alternative splicing in COVID-19 severity and susceptibility by applying two-sample Mendelian randomization to cis-splicing quantitative trait loci and the results from COVID-19 Host Genetics Initiative. We identify that alternative splicing in lung, rather than total expression of OAS1, ATP11A, DPP9 and NPNT, is associated with COVID-19 severity. MUC1 and PMF1 splicing is associated with COVID-19 susceptibility. Colocalization analyses support a shared genetic mechanism between COVID-19 severity with idiopathic pulmonary fibrosis at the ATP11A and DPP9 loci, and with chronic obstructive lung diseases at the NPNT locus. Last, we show that ATP11A, DPP9, NPNT, and MUC1 are highly expressed in lung alveolar epithelial cells, both in COVID-19 uninfected and infected samples. These findings clarify the importance of alternative splicing in lung for COVID-19 and respiratory diseases, providing isoform-based targets for drug discovery.
Affiliation(s)
- Tomoko Nakanishi
  - Department of Human Genetics, McGill University, Montréal, QC, Canada.
  - Lady Davis Institute, Jewish General Hospital, McGill University, Montréal, QC, Canada.
  - Kyoto-McGill International Collaborative Program in Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan.
  - Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan.
  - Research Fellow, Japan Society for the Promotion of Science, Tokyo, Japan.
- Julian Willett
  - Lady Davis Institute, Jewish General Hospital, McGill University, Montréal, QC, Canada
  - Quantitative Life Sciences Program, McGill University, Montréal, Canada
  - McGill Genome Centre, McGill University, Montréal, QC, Canada
- Yossi Farjoun
  - Lady Davis Institute, Jewish General Hospital, McGill University, Montréal, QC, Canada
  - Five Prime Sciences Inc, Montréal, QC, Canada
- Richard J Allen
  - Department of Population Health Sciences, University of Leicester, Leicester, United Kingdom
  - National Institute for Health Research, Leicester Respiratory Biomedical Research Centre, Glenfield Hospital, Leicester, UK
- Beatriz Guillen-Guio
  - Department of Population Health Sciences, University of Leicester, Leicester, United Kingdom
  - National Institute for Health Research, Leicester Respiratory Biomedical Research Centre, Glenfield Hospital, Leicester, UK
- Darin Adra
  - Lady Davis Institute, Jewish General Hospital, McGill University, Montréal, QC, Canada
- Sirui Zhou
  - Department of Human Genetics, McGill University, Montréal, QC, Canada
  - Quantitative Life Sciences Program, McGill University, Montréal, Canada
  - McGill Genome Centre, McGill University, Montréal, QC, Canada
- J Brent Richards
  - Lady Davis Institute, Jewish General Hospital, McGill University, Montréal, QC, Canada.
  - Five Prime Sciences Inc, Montréal, QC, Canada.
  - Departments of Medicine, Human Genetics, Epidemiology and Biostatistics, McGill University, Montréal, QC, Canada.
  - Department of Twin Research, King's College London, London, UK.

37
Chiesa G, Zenti MG, Baragetti A, Barbagallo CM, Borghi C, Colivicchi F, Maggioni AP, Noto D, Pirro M, Rivellese AA, Sampietro T, Sbrana F, Arca M, Averna M, Catapano AL. Consensus document on Lipoprotein(a) from the Italian Society for the Study of Atherosclerosis (SISA). Nutr Metab Cardiovasc Dis 2023; 33:1866-1877. [PMID: 37586921 DOI: 10.1016/j.numecd.2023.07.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 07/07/2023] [Accepted: 07/13/2023] [Indexed: 08/18/2023]
Abstract
AIMS: In view of the consolidating evidence on the causal role of Lp(a) in cardiovascular disease, the Italian Society for the Study of Atherosclerosis (SISA) has assembled a consensus on Lp(a) genetics and epidemiology, together with recommendations for its measurement and current and emerging therapeutic approaches to reduce its plasma levels. Data on the Italian population are also provided. DATA SYNTHESIS: Lp(a) consists of one apo(a) molecule and a lipoprotein closely resembling a low-density lipoprotein (LDL). Its similarity to LDL, together with its ability to carry oxidized phospholipids, are considered the two main features that make Lp(a) harmful to cardiovascular health. Plasma Lp(a) concentrations vary about 1000-fold among humans and are genetically determined, so they are quite stable in any individual. Mendelian randomization studies have suggested a causal role of Lp(a) in atherosclerotic cardiovascular disease (ASCVD) and aortic valve stenosis, and observational studies indicate a direct linear correlation between cardiovascular disease and Lp(a) plasma levels. Lp(a) measurement is strongly recommended once in a patient's lifetime, particularly in FH subjects, but also as part of the initial lipid screening to assess cardiovascular risk. The apo(a) size polymorphism represents a challenge for Lp(a) measurement in plasma, but new strategies are overcoming these difficulties. A reduction of Lp(a) levels can currently be attained only by plasma apheresis and, moderately, with PCSK9 inhibitor treatment. CONCLUSIONS: While the approval of selective Lp(a)-lowering drugs is awaited, intensive management of the other risk factors is strongly recommended for individuals with elevated Lp(a) levels.
Affiliation(s)
- Giulia Chiesa
  - Department of Pharmacological and Biomolecular Sciences "Rodolfo Paoletti", Università Degli Studi di Milano, Milan, Italy.
- Maria Grazia Zenti
  - Section of Diabetes and Metabolism, Pederzoli Hospital, Peschiera Del Garda, Verona, Italy.
- Andrea Baragetti
  - Department of Pharmacological and Biomolecular Sciences "Rodolfo Paoletti", Università Degli Studi di Milano, Milan, Italy
  - IRCCS MultiMedica, Sesto San Giovanni, Milan, Italy
- Carlo M Barbagallo
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties (PROMISE), University of Palermo, Palermo, Italy
- Claudio Borghi
  - Department of Cardiovascular Medicine, IRCCS AOU S. Orsola, Bologna, Italy
- Furio Colivicchi
  - Division of Clinical Cardiology, San Filippo Neri Hospital, Rome, Italy
- Aldo P Maggioni
  - ANMCO Research Center, Heart Care Foundation, Firenze, Italy
- Davide Noto
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties (PROMISE), University of Palermo, Palermo, Italy
- Matteo Pirro
  - Unit of Internal Medicine, Angiology and Arteriosclerosis Diseases, Department of Medicine and Surgery, University of Perugia, Italy
- Angela A Rivellese
  - Department of Clinical Medicine and Surgery, Federico II University, Naples, Italy
- Tiziana Sampietro
  - Lipoapheresis Unit, Reference Center for Diagnosis and Treatment of Inherited Dyslipidemias, Fondazione Toscana Gabriele Monasterio, Pisa, Italy
- Francesco Sbrana
  - Lipoapheresis Unit, Reference Center for Diagnosis and Treatment of Inherited Dyslipidemias, Fondazione Toscana Gabriele Monasterio, Pisa, Italy
- Marcello Arca
  - Department of Translational and Precision Medicine (DTPM), Sapienza University of Rome, Policlinico Umberto I, Rome, Italy
- Maurizio Averna
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties (PROMISE), University of Palermo, Palermo, Italy
  - Institute of Biophysics, National Council of Researches, Palermo, Italy
- Alberico L Catapano
  - Department of Pharmacological and Biomolecular Sciences "Rodolfo Paoletti", Università Degli Studi di Milano, Milan, Italy
  - IRCCS MultiMedica, Sesto San Giovanni, Milan, Italy

38
Liao X, Zhu W, Zhou J, Li H, Xu X, Zhang B, Gao X. Repetitive DNA sequence detection and its role in the human genome. Commun Biol 2023; 6:954. [PMID: 37726397 PMCID: PMC10509279 DOI: 10.1038/s42003-023-05322-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/29/2023] [Accepted: 09/04/2023] [Indexed: 09/21/2023] Open
Abstract
Repetitive DNA sequences play critical roles in driving evolution, inducing variation, and regulating gene expression. In this review, we summarize the definition, arrangement, and structural characteristics of repeats. We also introduce the diverse biological functions of repeats and review existing methods for automatic repeat detection, classification, and masking. Finally, we analyze the type, structure, and regulation of repeats in the human genome and their role in the induction of complex diseases. We believe that this review will facilitate a comprehensive understanding of repeats and provide guidance for repeat annotation and in-depth exploration of their association with human diseases.
Affiliation(s)
- Xingyu Liao
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Wufei Zhu
  - Department of Endocrinology, Yichang Central People's Hospital, The First College of Clinical Medical Science, China Three Gorges University, 443000, Yichang, P.R. China
- Juexiao Zhou
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Haoyang Li
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Xiaopeng Xu
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Bin Zhang
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Xin Gao
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia.

39
Naas S, Krüger R, Knaup KX, Naas J, Grampp S, Schiffer M, Wiesener M, Schödel J. Hypoxia controls expression of kidney-pathogenic MUC1 variants. Life Sci Alliance 2023; 6:e202302078. [PMID: 37316299 PMCID: PMC10267510 DOI: 10.26508/lsa.202302078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 05/31/2023] [Accepted: 06/01/2023] [Indexed: 06/16/2023] Open
Abstract
The interplay between genetic and environmental factors influences the course of chronic kidney disease (CKD). In this context, genetic alterations in the kidney disease gene MUC1 (Mucin1) predispose to the development of CKD. These variations comprise the polymorphism rs4072037, which alters splicing of MUC1 mRNA; the length of a region with a variable number of tandem repeats (VNTR); and rare autosomal-dominant inherited dominant-negative mutations in or 5' to the VNTR that cause autosomal dominant tubulointerstitial kidney disease (ADTKD-MUC1). As hypoxia plays a pivotal role in states of acute and chronic kidney injury, we explored the effects of hypoxia-inducible transcription factors (HIF) on the expression of MUC1 and its pathogenic variants in isolated primary human renal tubular cells. We defined a HIF-binding DNA regulatory element in the promoter-proximal region of MUC1 from which hypoxia or treatment with HIF stabilizers, which were recently approved as an anti-anemic therapy in CKD patients, increased levels of wild-type MUC1 and the disease-associated variants. Thus, application of these compounds might exert unfavorable effects in patients carrying MUC1 risk variants.
Affiliation(s)
- Stephanie Naas
  - Department of Nephrology and Hypertension, Uniklinikum Erlangen und Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- René Krüger
  - Department of Nephrology and Hypertension, Uniklinikum Erlangen und Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Karl Xaver Knaup
  - Department of Nephrology and Hypertension, Uniklinikum Erlangen und Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Julia Naas
  - Center for Integrative Bioinformatics Vienna (CIBIV), Max Perutz Labs, University of Vienna and Medical University of Vienna, Wien, Austria
- Steffen Grampp
  - Department of Nephrology and Hypertension, Uniklinikum Erlangen und Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Mario Schiffer
  - Department of Nephrology and Hypertension, Uniklinikum Erlangen und Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Michael Wiesener
  - Department of Nephrology and Hypertension, Uniklinikum Erlangen und Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Johannes Schödel
  - Department of Nephrology and Hypertension, Uniklinikum Erlangen und Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany

40
Mukamel RE, Handsaker RE, Sherman MA, Barton AR, Hujoel MLA, McCarroll SA, Loh PR. Repeat polymorphisms underlie top genetic risk loci for glaucoma and colorectal cancer. Cell 2023; 186:3659-3673.e23. [PMID: 37527660 PMCID: PMC10528368 DOI: 10.1016/j.cell.2023.07.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/04/2022] [Revised: 04/07/2023] [Accepted: 07/03/2023] [Indexed: 08/03/2023]
Abstract
Many regions in the human genome vary in length among individuals due to variable numbers of tandem repeats (VNTRs). To assess the phenotypic impact of VNTRs genome-wide, we applied a statistical imputation approach to estimate the lengths of 9,561 autosomal VNTR loci in 418,136 unrelated UK Biobank participants and 838 GTEx participants. Association and statistical fine-mapping analyses identified 58 VNTRs that appeared to influence a complex trait in UK Biobank, 18 of which also appeared to modulate expression or splicing of a nearby gene. Non-coding VNTRs at TMCO1 and EIF3H appeared to generate the largest known contributions of common human genetic variation to risk of glaucoma and colorectal cancer, respectively. Each of these two VNTRs associated with a >2-fold range of risk across individuals. These results reveal a substantial and previously unappreciated role of non-coding VNTRs in human health and gene regulation.
Affiliation(s)
- Ronen E Mukamel
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Robert E Handsaker
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA.
- Maxwell A Sherman
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Alison R Barton
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Bioinformatics and Integrative Genomics Program, Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Margaux L A Hujoel
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Steven A McCarroll
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA.
- Po-Ru Loh
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.

41
Cuomo ASE, Nathan A, Raychaudhuri S, MacArthur DG, Powell JE. Single-cell genomics meets human genetics. Nat Rev Genet 2023; 24:535-549. [PMID: 37085594 PMCID: PMC10784789 DOI: 10.1038/s41576-023-00599-5] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Accepted: 03/29/2023] [Indexed: 04/23/2023]
Abstract
Single-cell genomic technologies are revealing the cellular composition, identities and states in tissues at unprecedented resolution. They have now scaled to the point that it is possible to query samples at the population level, across thousands of individuals. Combining single-cell information with genotype data at this scale provides opportunities to link genetic variation to the cellular processes underpinning key aspects of human biology and disease. This strategy has potential implications for disease diagnosis, risk prediction and development of therapeutic solutions. But, effectively integrating large-scale single-cell genomic data, genetic variation and additional phenotypic data will require advances in data generation and analysis methods. As single-cell genetics begins to emerge as a field in its own right, we review its current state and the challenges and opportunities ahead.
Affiliation(s)
- Anna S E Cuomo
- Garvan Institute of Medical Research, Darlinghurst, Sydney, New South Wales, Australia.
- Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia.
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia.
| | - Aparna Nathan
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Divisions of Rheumatology and Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Divisions of Rheumatology and Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Daniel G MacArthur
- Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
| | - Joseph E Powell
- Garvan Institute of Medical Research, Darlinghurst, Sydney, New South Wales, Australia.
- UNSW Cellular Genomics Futures Institute, University of New South Wales, Sydney, New South Wales, Australia.
| |
|
42
|
Lee MP, Dimos SF, Raffield LM, Wang Z, Ballou AF, Downie CG, Arehart CH, Correa A, de Vries PS, Du Z, Gignoux CR, Gordon-Larsen P, Guo X, Haessler J, Howard AG, Hu Y, Kassahun H, Kent ST, Lopez JAG, Monda KL, North KE, Peters U, Preuss MH, Rich SS, Rhodes SL, Yao J, Yarosh R, Tsai MY, Rotter JI, Kooperberg CL, Loos RJF, Ballantyne C, Avery CL, Graff M. Ancestral diversity in lipoprotein(a) studies helps address evidence gaps. Open Heart 2023; 10:e002382. [PMID: 37648373 PMCID: PMC10471864 DOI: 10.1136/openhrt-2023-002382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Accepted: 08/04/2023] [Indexed: 09/01/2023] Open
Abstract
INTRODUCTION The independent and causal cardiovascular disease risk factor lipoprotein(a) (Lp(a)) is elevated in >1.5 billion individuals worldwide, but studies have prioritised European populations. METHODS Here, we examined how ancestrally diverse studies could clarify Lp(a)'s genetic architecture, inform efforts examining application of Lp(a) polygenic risk scores (PRS), enable causal inference and identify unexpected Lp(a) phenotypic effects using data from African (n=25 208), East Asian (n=2895), European (n=362 558), South Asian (n=8192) and Hispanic/Latino (n=8946) populations. RESULTS Fourteen genome-wide significant loci with numerous population specific signals of large effect were identified that enabled construction of Lp(a) PRS of moderate (R²=15% in East Asians) to high (R²=50% in Europeans) accuracy. For all populations, PRS showed promise as a 'rule out' for elevated Lp(a) because certainty of assignment to the low-risk threshold was high (88.0%-99.9%) across PRS thresholds (80th-99th percentile). Causal effects of increased Lp(a) with increased glycated haemoglobin were estimated for Europeans (p value=1.4×10⁻⁶), although inverse effects in Africans and East Asians suggested the potential for heterogeneous causal effects. Finally, Hispanic/Latinos were the only population in which known associations with coronary atherosclerosis and ischaemic heart disease were identified in external testing of Lp(a) PRS phenotypic effects. CONCLUSIONS Our results emphasise the merits of prioritising ancestral diversity when addressing Lp(a) evidence gaps.
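The 'rule out' idea in the abstract above can be illustrated with a small sketch. The abstract does not define how "certainty of assignment" was computed, so this assumes it behaves like a negative predictive value: among people whose PRS falls below a percentile cut-off, the share whose measured Lp(a) is not elevated. The function name, the nearest-rank percentile rule, and the toy data are all hypothetical and are not taken from the study.

```python
import math


def rule_out_certainty(prs, elevated, percentile):
    """Share of people below a PRS cut-off whose measured Lp(a) is NOT elevated.

    Assumption (not from the paper): 'certainty of assignment' is treated
    like a negative predictive value at a nearest-rank percentile cut-off.
    """
    if len(prs) != len(elevated) or not prs:
        raise ValueError("prs and elevated must be non-empty and equal length")
    ranked = sorted(prs)
    # Nearest-rank percentile: index of the score at the requested percentile.
    k = max(0, math.ceil(percentile * len(ranked) / 100) - 1)
    cutoff = ranked[k]
    # Flags of everyone who would be "ruled out" (score strictly below cut-off).
    below = [flag for score, flag in zip(prs, elevated) if score < cutoff]
    if not below:
        return float("nan")
    return sum(1 for flag in below if not flag) / len(below)


# Toy data: ten people with scores 1..10; True marks measured elevated Lp(a).
toy_prs = list(range(1, 11))
toy_elevated = [False, False, True, False, False,
                False, False, True, True, True]
print(rule_out_certainty(toy_prs, toy_elevated, 80))
```

With the toy data, the 80th-percentile cut-off is the score 8, so seven people fall below it and one of them has elevated Lp(a). Real evaluations would use held-out individuals and a clinically defined Lp(a) threshold.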
Affiliation(s)
- Moa P Lee
- Department of Epidemiology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Sofia F Dimos
- Department of Epidemiology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Laura M Raffield
- Department of Genetics, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Zhe Wang
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Anna F Ballou
- Department of Epidemiology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Carolina G Downie
- Department of Epidemiology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Christopher H Arehart
- Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
| | - Adolfo Correa
- Department of Population Health Science, The University of Mississippi Medical Center, Jackson, Mississippi, USA
| | - Paul S de Vries
- Department of Epidemiology, Human Genetics, and Environmental Sciences, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Zhaohui Du
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Christopher R Gignoux
- Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
| | - Penny Gordon-Larsen
- Department of Nutrition, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Xiuqing Guo
- Department of Pediatrics, UCLA Medical Center, Los Angeles, California, USA
| | - Jeffrey Haessler
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Annie Green Howard
- Department of Biostatistics, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Yao Hu
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Helina Kassahun
- Global Development, Amgen Inc, Thousand Oaks, California, USA
| | - Shia T Kent
- Center for Observational Research, Amgen Inc, Thousand Oaks, California, USA
| | | | - Keri L Monda
- Center for Observational Research, Amgen Inc, Thousand Oaks, California, USA
| | - Kari E North
- Department of Epidemiology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Ulrike Peters
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Michael H Preuss
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Stephen S Rich
- University of Virginia School of Medicine, Charlottesville, Virginia, USA
| | - Shannon L Rhodes
- Center for Observational Research, Amgen Inc, Thousand Oaks, California, USA
| | - Jie Yao
- Department of Pediatrics, UCLA Medical Center, Los Angeles, California, USA
| | - Rina Yarosh
- Department of Epidemiology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Michael Y Tsai
- Department of Laboratory Medicine & Pathology, University of Minnesota, Minneapolis, Minnesota, USA
| | - Jerome I Rotter
- Department of Pediatrics, UCLA Medical Center, Los Angeles, California, USA
| | - Charles L Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Ruth J F Loos
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Kobenhavn, Denmark
| | - Christie Ballantyne
- Department of Medicine, Section of Cardiology, Baylor College of Medicine, Houston, Texas, USA
| | - Christy L Avery
- Department of Epidemiology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Mariaelisa Graff
- Department of Epidemiology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| |
|
43
|
Ren J, Gu B, Chaisson MJP. vamos: variable-number tandem repeats annotation using efficient motif sets. Genome Biol 2023; 24:175. [PMID: 37501141 PMCID: PMC10373352 DOI: 10.1186/s13059-023-03010-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/23/2022] [Accepted: 07/06/2023] [Indexed: 07/29/2023] Open
Abstract
Roughly 3% of the human genome is composed of variable-number tandem repeats (VNTRs): arrays of motifs of at least six bases. These loci are highly polymorphic, yet current approaches that define and merge variants based on alignment breakpoints do not capture their full diversity. Here we present vamos (VNTR Annotation using efficient Motif Sets), a method that instead annotates VNTRs using repeat composition under different levels of motif diversity. Using vamos we estimate 7.4-16.7 alleles per locus when applied to 74 haplotype-resolved human assemblies, compared to breakpoint-based approaches that estimate 4.0-5.5 alleles per locus.
Affiliation(s)
- Jingwen Ren
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, US
| | - Bida Gu
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, US
| | - Mark J. P. Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, US
| |
|
44
|
Wang X, Huang M, Budowle B, Ge J. TRcaller: a novel tool for precise and ultrafast tandem repeat variant genotyping in massively parallel sequencing reads. Front Genet 2023; 14:1227176. [PMID: 37533432 PMCID: PMC10390829 DOI: 10.3389/fgene.2023.1227176] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/22/2023] [Accepted: 06/13/2023] [Indexed: 08/04/2023] Open
Abstract
Calling tandem repeat (TR) variants from DNA sequences is of both theoretical and practical significance. Some bioinformatics tools have been developed for detecting or genotyping TRs. However, few studies have addressed genotyping TR alleles from long-read sequencing data, and the accuracy of genotyping TR alleles from next-generation sequencing data still needs to be improved. Herein, a novel algorithm is described to retrieve TR regions from sequence alignments, and a software program, TRcaller, has been developed and integrated into a web portal to call TR alleles from both short- and long-read sequences, covering both whole-genome and targeted sequences generated on multiple sequencing platforms. All TR alleles are genotyped as haplotypes and robust alleles are reported, including multiple alleles in a DNA mixture. TRcaller could provide substantially higher accuracy (>99% in 289 human individuals) in detecting TR alleles, at speeds orders of magnitude faster (e.g., ∼2 s for 300x human sequence data) than mainstream software tools. The web portal offers 119 preselected TR loci drawn from forensic, genealogical, and disease-related loci. TRcaller is validated to be scalable in various applications, such as DNA forensics and disease diagnosis, and could be extended to other fields such as breeding programs. Availability: TRcaller is available at https://www.trcaller.com/SignIn.aspx.
Affiliation(s)
- Xuewen Wang
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
| | - Meng Huang
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
| | - Bruce Budowle
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
- Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, United States
| | - Jianye Ge
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
- Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, United States
| |
|
45
|
Mieno MN, Yamasaki M, Kuchiba A, Yamaji T, Ide K, Tanaka N, Sawada N, Inoue M, Tsugane S, Sawabe M, Iwasaki M. Lack of significant associations between single nucleotide polymorphisms in LPAL2-LPA genetic region and all cancer incidence and mortality in Japanese population: The Japan public health center-based prospective study. Cancer Epidemiol 2023; 85:102395. [PMID: 37321067 DOI: 10.1016/j.canep.2023.102395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Revised: 05/02/2023] [Accepted: 05/25/2023] [Indexed: 06/17/2023]
Abstract
BACKGROUND A high lipoprotein(a) level is an established cardiovascular risk factor, but its association with non-cardiovascular diseases, especially cancer, is controversial. Serum lipoprotein(a) levels vary widely by genetic background and are largely determined by genetic variation in the apolipoprotein(a) gene, LPA. In this study, we investigated the association between SNPs in the LPA region and cancer incidence and mortality in a Japanese population. METHODS A genetic cohort study was conducted using data from 9923 participants in the Japan Public Health Center-based Prospective Study (JPHC Study). Twenty-five SNPs in the LPAL2-LPA region were selected from the genome-wide genotyped data. Cox regression analyses, adjusted for covariates and competing risks of death from other causes, were used to estimate the relative risk (hazard ratios (HR) with 95% confidence intervals (CI)) of overall and site-specific cancer incidence and mortality for each SNP. RESULTS No significant association was found between SNPs in the LPAL2-LPA region and cancer incidence or mortality (overall or site-specific). In men, however, HRs for stomach cancer incidence were estimated to be higher than 1.5 for 18 SNPs (e.g., 2.15 for rs13202636, model free, 95% CI: 1.28-3.62), and those for stomach cancer mortality for 2 SNPs (rs9365171, rs1367211) were estimated at 2.13 (recessive, 95% CI: 1.04-4.37) and 1.61 (additive, 95% CI: 1.00-2.59). Additionally, the minor allele of SNP rs3798220 showed an increased risk of death from colorectal cancer (CRC) in men (HR: 3.29, 95% CI: 1.59-6.81) and a decreased risk of CRC incidence in women (HR: 0.46, 95% CI: 0.22-0.94). Carriers of the minor allele of any of 4 SNPs could be at risk of prostate cancer incidence (e.g., rs9365171 dominant, HR: 1.71, 95% CI: 1.06-2.77). CONCLUSIONS None of the 25 SNPs in the LPAL2-LPA region was found to be significantly associated with cancer incidence or mortality. Considering the possible association between SNPs in the LPAL2-LPA region and colorectal, prostate and stomach cancer incidence or mortality, further analysis using different cohorts is warranted.
Affiliation(s)
- Makiko Naka Mieno
- Department of Medical Informatics, Center for Information, Jichi Medical University, Shimotsuke 329-0498, Japan; Health Data Science Research Section, Healthy Aging Innovation Center, Tokyo Metropolitan Geriatric Research Institute, Tokyo 173-0015, Japan
| | - Maria Yamasaki
- Health Data Science Research Section, Healthy Aging Innovation Center, Tokyo Metropolitan Geriatric Research Institute, Tokyo 173-0015, Japan
| | - Aya Kuchiba
- Biostatistics Division, Center for Research Administration and Support/Division of Biostatistical Research, Institute for Cancer Control, National Cancer Center, Tokyo 104-0045, Japan; Graduate School of Health Innovation, Kanagawa University of Human Services, Kanagawa, 210-0821, Japan
| | - Taiki Yamaji
- Division of Epidemiology, National Cancer Center Institute for Cancer Control, Tokyo 104-0045, Japan.
| | - Keigo Ide
- Health Data Science Research Section, Healthy Aging Innovation Center, Tokyo Metropolitan Geriatric Research Institute, Tokyo 173-0015, Japan; Department of Life Science and Medical Bioscience, Graduate School of Advanced Science and Engineering, Waseda University, Tokyo 162-8480, Japan
| | - Noriko Tanaka
- Health Data Science Research Section, Healthy Aging Innovation Center, Tokyo Metropolitan Geriatric Research Institute, Tokyo 173-0015, Japan.
| | - Norie Sawada
- Division of Cohort Research, National Cancer Center Institute for Cancer Control, Tokyo 104-0045, Japan
| | - Manami Inoue
- Division of Cohort Research, National Cancer Center Institute for Cancer Control, Tokyo 104-0045, Japan; Division of Prevention, National Cancer Center Institute for Cancer Control, Tokyo 104-0045, Japan
| | - Shoichiro Tsugane
- Division of Cohort Research, National Cancer Center Institute for Cancer Control, Tokyo 104-0045, Japan; National Institute of Health and Nutrition, National Institutes of Biomedical Innovation, Health and Nutrition, Tokyo 162-8636, Japan
| | - Motoji Sawabe
- Department of Molecular Pathology, Graduate School of Health Care Sciences, Tokyo Medical and Dental University, Tokyo 113-8519, Japan
| | - Motoki Iwasaki
- Division of Epidemiology, National Cancer Center Institute for Cancer Control, Tokyo 104-0045, Japan; Division of Cohort Research, National Cancer Center Institute for Cancer Control, Tokyo 104-0045, Japan
| |
|
46
|
Hujoel ML, Handsaker RE, Sherman MA, Kamitaki N, Barton AR, Mukamel RE, Terao C, McCarroll SA, Loh PR. Hidden protein-altering variants influence diverse human phenotypes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.06.07.544066. [PMID: 37333244 PMCID: PMC10274781 DOI: 10.1101/2023.06.07.544066] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/20/2023]
Abstract
Structural variants (SVs) comprise the largest genetic variants, altering from 50 base pairs to megabases of DNA. However, SVs have not been effectively ascertained in most genetic association studies, leaving a key gap in our understanding of human complex trait genetics. We ascertained protein-altering SVs from UK Biobank whole-exome sequencing data (n=468,570) using haplotype-informed methods capable of detecting sub-exonic SVs and variation within segmental duplications. Incorporating SVs into analyses of rare variants predicted to cause gene loss-of-function (pLoF) identified 100 associations of pLoF variants with 41 quantitative traits. A low-frequency partial deletion of RGL3 exon 6 appeared to confer one of the strongest protective effects of gene LoF on hypertension risk (OR = 0.86 [0.82-0.90]). Protein-coding variation in rapidly-evolving gene families within segmental duplications, previously invisible to most analysis methods, appeared to generate some of the human genome's largest contributions to variation in type 2 diabetes risk, chronotype, and blood cell traits. These results illustrate the potential for new genetic insights from genomic variation that has escaped large-scale analysis to date.
Affiliation(s)
- Margaux L.A. Hujoel
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Robert E. Handsaker
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard University, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Maxwell A. Sherman
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Nolan Kamitaki
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Alison R. Barton
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Ronen E. Mukamel
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- Department of Applied Genetics, School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
| | - Steven A. McCarroll
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard University, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Po-Ru Loh
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| |
|
47
|
Zhang BC, Biddanda A, Gunnarsson ÁF, Cooper F, Palamara PF. Biobank-scale inference of ancestral recombination graphs enables genealogical analysis of complex traits. Nat Genet 2023; 55:768-776. [PMID: 37127670 PMCID: PMC10181934 DOI: 10.1038/s41588-023-01379-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 10/10/2021] [Accepted: 03/22/2023] [Indexed: 05/03/2023]
Abstract
Genome-wide genealogies compactly represent the evolutionary history of a set of genomes and inferring them from genetic data has the potential to facilitate a wide range of analyses. We introduce a method, ARG-Needle, for accurately inferring biobank-scale genealogies from sequencing or genotyping array data, as well as strategies to utilize genealogies to perform association and other complex trait analyses. We use these methods to build genome-wide genealogies using genotyping data for 337,464 UK Biobank individuals and test for association across seven complex traits. Genealogy-based association detects more rare and ultra-rare signals (N = 134, frequency range 0.0007-0.1%) than genotype imputation using ~65,000 sequenced haplotypes (N = 64). In a subset of 138,039 exome sequencing samples, these associations strongly tag (average r = 0.72) underlying sequencing variants enriched (4.8×) for loss-of-function variation. These results demonstrate that inferred genome-wide genealogies may be leveraged in the analysis of complex traits, complementing approaches that require the availability of large, population-specific sequencing panels.
Affiliation(s)
- Brian C Zhang
- Department of Statistics, University of Oxford, Oxford, UK
| | - Arjun Biddanda
- Department of Statistics, University of Oxford, Oxford, UK
| | - Árni Freyr Gunnarsson
- Department of Statistics, University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
| | - Fergus Cooper
- Department of Computer Science, University of Oxford, Oxford, UK
| | - Pier Francesco Palamara
- Department of Statistics, University of Oxford, Oxford, UK.
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK.
| |
|
48
|
Behera S, Belyeu JR, Chen X, Paulin LF, Nguyen NQH, Newman E, Mahmoud M, Menon VK, Qi Q, Joshi P, Marcovina S, Rossi M, Roller E, Han J, Onuchic V, Avery CL, Ballantyne CM, Rodriguez CJ, Kaplan RC, Muzny DM, Metcalf GA, Gibbs R, Yu B, Boerwinkle E, Eberle MA, Sedlazeck FJ. Identification of allele-specific KIV-2 repeats and impact on Lp(a) measurements for cardiovascular disease risk. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.04.24.538128. [PMID: 37163057 PMCID: PMC10168217 DOI: 10.1101/2023.04.24.538128] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/11/2023]
Abstract
The abundance of Lp(a) protein holds significant implications for the risk of cardiovascular disease (CVD), which is directly impacted by the copy number (CN) of KIV-2, a 5.5 kbp sub-region. KIV-2 is highly polymorphic in the population and accurate analysis is challenging. In this study, we present the DRAGEN KIV-2 CN caller, which utilizes short reads. Data across 166 WGS show that the caller has high accuracy compared to optical mapping and can further phase ~50% of the samples. We compared KIV-2 CN estimates to 24 previously postulated KIV-2 relevant SNVs, revealing that many are ineffective predictors of KIV-2 copy number. Population studies, including USA-based cohorts, showed distinct KIV-2 CN distributions for European-, African-, and Hispanic-American populations and further underscored the limitations of SNV predictors. We demonstrate that the CN estimates correlate significantly with the available Lp(a) protein levels and that phasing is highly important.
Affiliation(s)
- S Behera
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | | | - X Chen
- Illumina Inc., San Diego, CA, USA
| | - L F Paulin
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - N Q H Nguyen
- School of Public Health, University of Texas Health Science Center at Houston, TX, USA
| | - E Newman
- Illumina Inc., San Diego, CA, USA
| | - M Mahmoud
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - V K Menon
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Q Qi
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
| | - P Joshi
- Medpace Reference Laboratories, Cincinnati, OH, USA
| | - S Marcovina
- University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - M Rossi
- Illumina Inc., San Diego, CA, USA
| | - E Roller
- Illumina Inc., San Diego, CA, USA
| | - J Han
- Illumina Inc., San Diego, CA, USA
| | | | - C L Avery
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - C M Ballantyne
- Department of Medicine, Baylor College of Medicine, Houston, TX, USA
| | - C J Rodriguez
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
| | - R C Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
- Fred Hutchinson Cancer Center, Public Health Sciences Division, Seattle WA 98109
| | - D M Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - G A Metcalf
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - R Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - B Yu
- School of Public Health, University of Texas Health Science Center at Houston, TX, USA
| | - E Boerwinkle
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- School of Public Health, University of Texas Health Science Center at Houston, TX, USA
| | | | - F J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Computer Science, Rice University, 6100 Main Street, Houston, TX, USA
| |
|
49
|
Park J, Kaufman E, Valdmanis PN, Bafna V. TRviz: a Python library for decomposing and visualizing tandem repeat sequences. BIOINFORMATICS ADVANCES 2023; 3:vbad058. [PMID: 37168281 PMCID: PMC10166586 DOI: 10.1093/bioadv/vbad058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Revised: 04/14/2023] [Accepted: 04/24/2023] [Indexed: 05/13/2023]
Abstract
SUMMARY TRviz is an open-source Python library for decomposing, encoding, aligning and visualizing tandem repeat (TR) sequences. TRviz takes a collection of alleles (TR-containing sequences) and one or more motifs as input and generates a plot showing the motif composition of the TR sequences. AVAILABILITY AND IMPLEMENTATION TRviz is an open-source Python library and freely available at https://github.com/Jong-hun-Park/trviz. Detailed documentation is available at https://trviz.readthedocs.io. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics Advances online.
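To make the "decomposing" and "encoding" steps in the abstract above concrete, here is a minimal sketch in plain Python. It does not use the TRviz API and is not the library's algorithm: it assumes a simple greedy left-to-right match against exact motifs, and the function names `decompose` and `encode` and the example alleles are hypothetical.

```python
def decompose(sequence, motifs):
    """Greedily split `sequence` into known motifs, left to right.

    Longer motifs are tried first. A base that starts no motif is emitted
    as a single-character piece, so the pieces always rejoin to the input.
    """
    ordered = sorted((m for m in motifs if m), key=len, reverse=True)
    pieces = []
    i = 0
    while i < len(sequence):
        for motif in ordered:
            if sequence.startswith(motif, i):
                pieces.append(motif)
                i += len(motif)
                break
        else:
            # No motif matches here: keep the stray base as its own piece.
            pieces.append(sequence[i])
            i += 1
    return pieces


def encode(pieces, motifs):
    """Map each motif to a letter (A, B, ...); unmatched pieces become '?'."""
    labels = {m: chr(ord("A") + k) for k, m in enumerate(motifs)}
    return "".join(labels.get(p, "?") for p in pieces)


# Hypothetical alleles at one TR locus, decomposed against two motifs.
example_motifs = ["CAG", "CAA"]
for allele in ["CAGCAGCAACAG", "CAGCAGCAG", "CAGTCAACAG"]:
    parts = decompose(allele, example_motifs)
    print(encode(parts, example_motifs), parts)
```

The encoded strings (one letter per motif) are the kind of compact representation that can then be aligned across alleles and drawn as coloured blocks. A greedy exact match is the simplest possible choice and will fragment alleles that carry motif variants, which is why real tools use more tolerant decomposition.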
Affiliation(s)
- Jonghun Park
- Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA 92093, USA
| | - Eli Kaufman
- Division of Medical Genetics, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Paul N Valdmanis
- Division of Medical Genetics, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Vineet Bafna
- Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA 92093, USA
| |
|
50
|
Mallard TT, Grotzinger AD, Smoller JW. Examining the shared etiology of psychopathology with genome-wide association studies. Physiol Rev 2023; 103:1645-1665. [PMID: 36634217 PMCID: PMC9988537 DOI: 10.1152/physrev.00016.2022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Revised: 12/19/2022] [Accepted: 01/03/2023] [Indexed: 01/13/2023] Open
Abstract
Genome-wide association studies (GWASs) have ushered in a new era of reproducible discovery in psychiatric genetics. The field has now identified hundreds of common genetic variants that are associated with mental disorders, and many of them influence more than one disorder. By advancing the understanding of causal biology underlying psychopathology, GWAS results are poised to inform the development of novel therapeutics, stratification of at-risk patients, and perhaps even the revision of top-down classification systems in psychiatry. Here, we provide a concise review of GWAS findings with an emphasis on findings that have elucidated the shared genetic etiology of psychopathology, summarizing insights at three levels of analysis: 1) genome-wide architecture; 2) networks, pathways, and gene sets; and 3) individual variants/genes. Three themes emerge from these efforts. First, all psychiatric phenotypes are heritable, highly polygenic, and influenced by many pleiotropic variants with incomplete penetrance. Second, GWAS results highlight the broad etiological roles of neuronal biology, system-wide effects over localized effects, and early neurodevelopment as a critical period. Third, many loci that are robustly associated with multiple forms of psychopathology harbor genes that are involved in synaptic structure and function. Finally, we conclude our review by discussing the implications that GWAS results hold for the field of psychiatry, as well as expected challenges and future directions in the next stage of psychiatric genetics.
Affiliation(s)
- Travis T Mallard
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States
- Department of Psychiatry, Harvard Medical School, Boston, Massachusetts, United States
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Boston, Massachusetts, United States
| | - Andrew D Grotzinger
- Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado, United States
- Department of Psychology and Neuroscience, University of Colorado Boulder, Boulder, Colorado, United States
| | - Jordan W Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States
- Department of Psychiatry, Harvard Medical School, Boston, Massachusetts, United States
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Boston, Massachusetts, United States
| |
|