1. Lamkin M, Gymrek M. The emerging role of tandem repeats in complex traits. Nat Rev Genet 2024; 25:452-453. [PMID: 38714860 DOI: 10.1038/s41576-024-00736-8] [Indexed: 06/21/2024]
Affiliation(s)
- Michael Lamkin
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Melissa Gymrek
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
  - Department of Medicine, University of California San Diego, La Jolla, CA, USA
2. Tanudisastro HA, Deveson IW, Dashnow H, MacArthur DG. Sequencing and characterizing short tandem repeats in the human genome. Nat Rev Genet 2024; 25:460-475. [PMID: 38366034 DOI: 10.1038/s41576-024-00692-3] [Accepted: 12/06/2023] [Indexed: 02/18/2024]
Abstract
Short tandem repeats (STRs) are highly polymorphic sequences throughout the human genome that are composed of repeated copies of a 1-6-bp motif. Over 1 million variable STR loci are known, some of which regulate gene expression and influence complex traits, such as height. Moreover, variants in at least 60 STR loci cause genetic disorders, including Huntington disease and fragile X syndrome. Accurately identifying and genotyping STR variants is challenging, in particular mapping short reads to repetitive regions and inferring expanded repeat lengths. Recent advances in sequencing technology and computational tools for STR genotyping from sequencing data promise to help overcome this challenge and solve genetically unresolved cases and the 'missing heritability' of polygenic traits. Here, we compare STR genotyping methods, analytical tools and their applications to understand the effect of STR variation on health and disease. We identify emergent opportunities to refine genotyping and quality-control approaches as well as to integrate STRs into variant-calling workflows and large cohort analyses.
Affiliation(s)
- Hope A Tanudisastro
  - Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
  - Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
  - Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia
  - Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Ira W Deveson
  - Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia
  - Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Harriet Dashnow
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Daniel G MacArthur
  - Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
  - Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
  - Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia
3. Maciocha F, Suchanecka A, Chmielowiec K, Chmielowiec J, Ciechanowicz A, Boroń A. Correlations of the CNR1 Gene with Personality Traits in Women with Alcohol Use Disorder. Int J Mol Sci 2024; 25:5174. [PMID: 38791212 PMCID: PMC11121729 DOI: 10.3390/ijms25105174] [Received: 04/08/2024] [Revised: 05/02/2024] [Accepted: 05/07/2024] [Indexed: 05/26/2024] Open
Abstract
Alcohol use disorder (AUD) is a significant issue affecting women, with severe consequences for society, the economy, and most importantly, health. Both personality and alcohol use disorders are phenotypically very complex, and elucidating their shared heritability is a challenge for medical genetics. Therefore, our study investigated the correlations between the microsatellite polymorphism (AAT)n of the Cannabinoid Receptor 1 (CNR1) gene and personality traits in women with AUD. The study group included 187 female subjects. Of these, 93 were diagnosed with alcohol use disorder, and 94 were controls. Repeat length polymorphism of microsatellite regions (AAT)n in the CNR1 gene was identified with PCR. All participants were assessed with the Mini-International Neuropsychiatric Interview and completed the NEO Five-Factor and State-Trait Anxiety Inventories. In the group of AUD subjects, significantly fewer (AAT)n repeats were present when compared with controls (p = 0.0380). While comparing the alcohol use disorder subjects (AUD) and the controls, we observed significantly higher scores on the STAI trait (p < 0.00001) and state scales (p = 0.0001) and on the NEO Five-Factor Inventory Neuroticism (p < 0.00001) and Openness (p = 0.0237; insignificant after Bonferroni correction) scales. Significantly lower results were obtained on the NEO-FFI Extraversion (p = 0.00003), Agreeability (p < 0.00001) and Conscientiousness (p < 0.00001) scales by the AUD subjects when compared to controls. There was no statistically significant Pearson's linear correlation between the number of (AAT)n repeats in the CNR1 gene and the STAI and NEO Five-Factor Inventory scores in the group of AUD subjects. 
In contrast, Pearson's linear correlation analysis in controls showed a positive correlation between the number of the (AAT)n repeats and the STAI state scale (r = 0.184; p = 0.011; insignificant after Bonferroni correction) and a negative correlation with the NEO-FFI Openness scale (r = -0.241; p = 0.001). Interestingly, our study provided data on two separate complex issues, i.e., (1) the association of (AAT)n CNR1 repeats with the AUD in females; (2) the correlation of (AAT)n CNR1 repeats with anxiety as a state and Openness in non-alcohol dependent subjects. In conclusion, our study provided a plethora of valuable data for improving our understanding of alcohol use disorder and anxiety.
Affiliation(s)
- Filip Maciocha
  - Department of Clinical and Molecular Biochemistry, Pomeranian Medical University in Szczecin, Powstańców Wielkopolskich 72 St., 70-111 Szczecin, Poland
- Aleksandra Suchanecka
  - Independent Laboratory of Behavioral Genetics and Epigenetics, Pomeranian Medical University in Szczecin, Powstańców Wielkopolskich 72 St., 70-111 Szczecin, Poland
- Krzysztof Chmielowiec
  - Department of Hygiene and Epidemiology, Collegium Medicum, University of Zielona Góra, 28 Zyty St., 65-046 Zielona Góra, Poland
- Jolanta Chmielowiec
  - Department of Hygiene and Epidemiology, Collegium Medicum, University of Zielona Góra, 28 Zyty St., 65-046 Zielona Góra, Poland
- Andrzej Ciechanowicz
  - Department of Clinical and Molecular Biochemistry, Pomeranian Medical University in Szczecin, Powstańców Wielkopolskich 72 St., 70-111 Szczecin, Poland
- Agnieszka Boroń
  - Department of Clinical and Molecular Biochemistry, Pomeranian Medical University in Szczecin, Powstańców Wielkopolskich 72 St., 70-111 Szczecin, Poland
4. Tajeddin N, Arabfard M, Alizadeh S, Salesi M, Khamse S, Delbari A, Ohadi M. Novel islands of GGC and GCC repeats coincide with human evolution. Gene 2024; 902:148194. [PMID: 38262548 DOI: 10.1016/j.gene.2024.148194] [Received: 08/21/2023] [Revised: 10/29/2023] [Accepted: 01/18/2024] [Indexed: 01/25/2024]
Abstract
BACKGROUND Because of high mutation rate, overrepresentation in genic regions, and link with various neurological, neurodegenerative, and movement disorders, GGC and GCC short tandem repeats (STRs) are prone to natural selection. Among a number of lacking data, the 3-repeats of these STRs remain widely unexplored. RESULTS In a genome-wide search in human, here we mapped GGC and GCC STRs of ≥3-repeats, and found novel islands of up to 45 of those STRs, populating spans of 1 to 2 kb of genomic DNA. RGPD4 and NOC4L harbored the densest (GGC)3 (probability 3.09061E-71) and (GCC)3 (probability 1.72376E-61) islands, respectively, and were human-specific. We also found prime instances of directional incremented density of STRs at specific loci in human versus other species, including the FOXK2 and SKI GGC islands. The genes containing those islands significantly diverged in expression in human versus other species, and the proteins encoded by those genes interact closely in a physical interaction network, consequence of which may be human-specific characteristics such as higher order brain functions. CONCLUSION We report novel islands of GGC and GCC STRs of evolutionary relevance to human. The density, and in some instances, periodicity of these islands support them as a novel genomic entity, which need to be further explored in evolutionary, mechanistic, and functional platforms.
Affiliation(s)
- N Tajeddin
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- M Arabfard
  - Chemical Injuries Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- S Alizadeh
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- M Salesi
  - Chemical Injuries Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- S Khamse
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- A Delbari
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- M Ohadi
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
5. Arthur TD, Nguyen JP, D'Antonio-Chronowska A, Jaureguy J, Silva N, Henson B, Panopoulos AD, Belmonte JCI, D'Antonio M, McVicker G, Frazer KA. Multi-omic QTL mapping in early developmental tissues reveals phenotypic and temporal complexity of regulatory variants underlying GWAS loci. bioRxiv 2024:2024.04.10.588874. [PMID: 38645112 PMCID: PMC11030419 DOI: 10.1101/2024.04.10.588874] [Indexed: 04/23/2024]
Abstract
Most GWAS loci are presumed to affect gene regulation, however, only ∼43% colocalize with expression quantitative trait loci (eQTLs). To address this colocalization gap, we identify eQTLs, chromatin accessibility QTLs (caQTLs), and histone acetylation QTLs (haQTLs) using molecular samples from three early developmental (EDev) tissues. Through colocalization, we annotate 586 GWAS loci for 17 traits by QTL complexity, QTL phenotype, and QTL temporal specificity. We show that GWAS loci are highly enriched for colocalization with complex QTL modules that affect multiple elements (genes and/or peaks). We also demonstrate that caQTLs and haQTLs capture regulatory variations not associated with eQTLs and explain ∼49% of the functionally annotated GWAS loci. Additionally, we show that EDev-unique QTLs are strongly depleted for colocalizing with GWAS loci. By conducting one of the largest multi-omic QTL studies to date, we demonstrate that many GWAS loci exhibit phenotypic complexity and therefore, are missed by traditional eQTL analyses.
6. Chen F, Zhang Y, Sedlazeck FJ, Creighton CJ. Germline structural variation globally impacts the cancer transcriptome including disease-relevant genes. Cell Rep Med 2024; 5:101446. [PMID: 38442712 PMCID: PMC10983041 DOI: 10.1016/j.xcrm.2024.101446] [Received: 09/26/2023] [Revised: 01/01/2024] [Accepted: 02/06/2024] [Indexed: 03/07/2024]
Abstract
Germline variation and somatic alterations contribute to the molecular profile of cancers. We combine RNA with whole genome sequencing across 1,218 cancer patients to determine the extent germline structural variants (SVs) impact expression of nearby genes. For hundreds of genes, recurrent and common germline SV breakpoints within 100 kb associate with increased or decreased expression in tumors spanning various tissues of origin. A significant fraction of germline SV expression associations involves duplication of intergenic enhancers or 3' UTR disruption. Genes altered by both somatic and germline SVs include ATRX and CEBPA. Genes essential in cancer cell lines include BARD1 and IRS2. Genes with both expression and germline SV breakpoint patterns associated with patient survival include GCLM. Our results capture a class of phenotypic variation at work in the disease setting, including genes with cancer roles. Specific germline SVs represent potential cancer risk variants for genetic testing, including those involving genes with targeting implications.
Affiliation(s)
- Fengju Chen
  - Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA
- Yiqun Zhang
  - Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Computer Science, Rice University, Houston, TX 77005, USA
- Chad J Creighton
  - Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
7. Arthur TD, Nguyen JP, D'Antonio-Chronowska A, Matsui H, Silva NS, Joshua IN, Luchessi AD, Greenwald WWY, D'Antonio M, Pera MF, Frazer KA. Complex regulatory networks influence pluripotent cell state transitions in human iPSCs. Nat Commun 2024; 15:1664. [PMID: 38395976 PMCID: PMC10891157 DOI: 10.1038/s41467-024-45506-6] [Received: 05/05/2023] [Accepted: 01/26/2024] [Indexed: 02/25/2024] Open
Abstract
Stem cells exist in vitro in a spectrum of interconvertible pluripotent states. Analyzing hundreds of hiPSCs derived from different individuals, we show the proportions of these pluripotent states vary considerably across lines. We discover 13 gene network modules (GNMs) and 13 regulatory network modules (RNMs), which are highly correlated with each other suggesting that the coordinated co-accessibility of regulatory elements in the RNMs likely underlie the coordinated expression of genes in the GNMs. Epigenetic analyses reveal that regulatory networks underlying self-renewal and pluripotency are more complex than previously realized. Genetic analyses identify thousands of regulatory variants that overlapped predicted transcription factor binding sites and are associated with chromatin accessibility in the hiPSCs. We show that the master regulator of pluripotency, the NANOG-OCT4 Complex, and its associated network are significantly enriched for regulatory variants with large effects, suggesting that they play a role in the varying cellular proportions of pluripotency states between hiPSCs. Our work bins tens of thousands of regulatory elements in hiPSCs into discrete regulatory networks, shows that pluripotency and self-renewal processes have a surprising level of regulatory complexity, and suggests that genetic factors may contribute to cell state transitions in human iPSC lines.
Affiliation(s)
- Timothy D Arthur
  - Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA, 92093, USA
  - Division of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA
- Jennifer P Nguyen
  - Division of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, 92093, USA
- Hiroko Matsui
  - Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Nayara S Silva
  - Northeast Biotechnology Network (RENORBIO), Graduate Program in Biotechnology, Federal University of Rio Grande do Norte, Natal, Brazil
- Isaac N Joshua
  - Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- André D Luchessi
  - Northeast Biotechnology Network (RENORBIO), Graduate Program in Biotechnology, Federal University of Rio Grande do Norte, Natal, Brazil
  - Department of Clinical and Toxicological Analysis, Federal University of Rio Grande do Norte, Natal, Brazil
- William W Young Greenwald
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, 92093, USA
- Matteo D'Antonio
  - Division of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA
  - Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Kelly A Frazer
  - Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA
  - Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
8. Arabfard M, Tajeddin N, Alizadeh S, Salesi M, Bayat H, Khorram Khorshid HR, Khamse S, Delbari A, Ohadi M. Dyads of GGC and GCC form hotspot colonies that coincide with the evolution of human and other great apes. BMC Genom Data 2024; 25:21. [PMID: 38383300 PMCID: PMC10880355 DOI: 10.1186/s12863-024-01207-z] [Received: 07/31/2023] [Accepted: 02/11/2024] [Indexed: 02/23/2024] Open
Abstract
BACKGROUND GGC and GCC short tandem repeats (STRs) are of various evolutionary, biological, and pathological implications. However, the fundamental two-repeats (dyads) of these STRs are widely unexplored. RESULTS On a genome-wide scale, we mapped (GGC)2 and (GCC)2 dyads in human, and found monumental colonies (distance between each dyad < 500 bp) of extraordinary density, and in some instances periodicity. The largest (GCC)2 and (GGC)2 colonies were intergenic, homogeneous, and human-specific, consisting of 219 (GCC)2 on chromosome 2 (probability < 1.545E-219) and 70 (GGC)2 on chromosome 9 (probability = 1.809E-148). We also found that several colonies were shared in other great apes, and directionally increased in density and complexity in human, such as a colony of 99 (GCC)2 on chromosome 20, that specifically expanded in great apes, and reached maximum complexity in human (probability 1.545E-220). Numerous other colonies of evolutionary relevance in human were detected in other largely overlooked regions of the genome, such as chromosome Y and pseudogenes. Several of the genes containing or nearest to those colonies were divergently expressed in human. CONCLUSION In conclusion, (GCC)2 and (GGC)2 form unprecedented genomic colonies that coincide with the evolution of human and other great apes. The extent of the genomic rearrangements leading to those colonies support overlooked recombination hotspots, shared across great apes. The identified colonies deserve to be studied in mechanistic, evolutionary, and functional platforms.
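The colony definition in the abstract above (neighbouring dyads less than 500 bp apart) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy sequence, the start-to-start distance measure, and the minimum colony size of two are all assumptions made for illustration.

```python
import re

def find_dyads(seq, motif="GCC"):
    """Return 0-based start positions of two back-to-back copies of `motif`.

    A lookahead is used so that overlapping dyads inside longer repeat
    tracts are all reported (an assumption, not a published rule).
    """
    pattern = re.compile(f"(?=({motif}{motif}))")
    return [m.start() for m in pattern.finditer(seq.upper())]

def group_colonies(positions, max_gap=500, min_size=2):
    """Group sorted dyad positions into colonies.

    Consecutive dyads stay in the same colony while their start-to-start
    distance is below `max_gap`; groups smaller than `min_size` are dropped.
    """
    colonies, current = [], []
    for pos in positions:
        if current and pos - current[-1] >= max_gap:
            if len(current) >= min_size:
                colonies.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_size:
        colonies.append(current)
    return colonies

# Hypothetical toy sequence: two nearby dyads, then one isolated dyad.
toy = "GCCGCC" + "A" * 100 + "GCCGCC" + "A" * 600 + "GCCGCC"
positions = find_dyads(toy)
colonies = group_colonies(positions)
```

On a real genome the same grouping would be run per chromosome, and the reported probabilities would need a separate null model that this sketch does not attempt.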
Affiliation(s)
- M Arabfard
  - Chemical Injuries Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- N Tajeddin
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
  - Department of Biology, Central Tehran Branch, Islamic Azad University, Tehran, Iran
- S Alizadeh
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- M Salesi
  - Chemical Injuries Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
  - Research Center for Prevention of Oral and Dental Diseases, Baqiyatallah University of Medical Sciences, Tehran, Iran
- H Bayat
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- H R Khorram Khorshid
  - Personalized Medicine and Genometabolomics Research Center, Hope Generation Foundation, Tehran, Iran
- S Khamse
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- A Delbari
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- M Ohadi
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
9. Bayat H, Mirahmadi M, Azarshin Z, Ohadi H, Delbari A, Ohadi M. CRISPR/Cas9-mediated deletion of a GA-repeat in human GPM6B leads to disruption of neural cell differentiation from NT2 cells. Sci Rep 2024; 14:2136. [PMID: 38273037 PMCID: PMC10810867 DOI: 10.1038/s41598-024-52675-3] [Received: 11/24/2023] [Accepted: 01/22/2024] [Indexed: 01/27/2024] Open
Abstract
The human neuron-specific gene, GPM6B (Glycoprotein membrane 6B), is considered a key gene in neural cell functionality. This gene contains an exceptionally long and strictly monomorphic short tandem repeat (STR) of 9-repeats, (GA)9. STRs in regulatory regions, may impact on the expression of nearby genes. We used CRISPR-based tool to delete this GA-repeat in NT2 cells, and analyzed the consequence of this deletion on GPM6B expression. Subsequently, the edited cells were induced to differentiate into neural cells, using retinoic acid (RA) treatment. Deletion of the GA-repeat significantly decreased the expression of GPM6B at the RNA (p < 0.05) and protein (40%) levels. Compared to the control cells, the edited cells showed dramatic decrease of the astrocyte and neural cell markers, including GFAP (0.77-fold), TUBB3 (0.57-fold), and MAP2 (0.2-fold). Subsequent sorting of the edited cells showed an increased number of NES (p < 0.01), but a decreased number of GFAP (p < 0.001), TUBB3 (p < 0.05), and MAP2 (p < 0.01), compared to the control cells. In conclusion, CRISPR/Cas9-mediated deletion of a GA-repeat in human GPM6B, led to decreased expression of this gene, which in turn, disrupted differentiation of NT2 cells into neural cells.
Affiliation(s)
- Hadi Bayat
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Postal Code: 1985713834, Iran
  - Department of Molecular Genetics, Faculty of Biological Sciences, Tarbiat Modares University, Postal Box: 331-14115, Tehran, Iran
- Maryam Mirahmadi
  - Department of Medical Genetics, Faculty of Medical Sciences, Tarbiat Modares University, Postal Box: 331-14115, Tehran, Iran
  - Department of Exomine, PardisGene Company, Tehran, Postal Code: 1917635816, Iran
- Zohreh Azarshin
  - Department of Molecular Genetics, Faculty of Biological Sciences, Tarbiat Modares University, Postal Box: 331-14115, Tehran, Iran
- Hamid Ohadi
  - School of Physics and Astronomy, University of St Andrews, St Andrews, KY16 9SS, UK
- Ahmad Delbari
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Postal Code: 1985713834, Iran
- Mina Ohadi
  - Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Postal Code: 1985713834, Iran
10. Zhang J, Zhu B. Short, but matters: short tandem repeats confer variation in transcription factor-DNA binding. Sci Bull (Beijing) 2024; 69:9-10. [PMID: 38042705 DOI: 10.1016/j.scib.2023.11.050] [Indexed: 12/04/2023]
Affiliation(s)
- Jing Zhang
  - National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; Key Laboratory of Epigenetic Regulation and Intervention, Chinese Academy of Sciences, Beijing 100101, China; New Cornerstone Science Laboratory, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Bing Zhu
  - National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; Key Laboratory of Epigenetic Regulation and Intervention, Chinese Academy of Sciences, Beijing 100101, China; New Cornerstone Science Laboratory, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
11. Guo MH, Lee WP, Vardarajan B, Schellenberg GD, Phillips-Cremins J. Polygenic burden of short tandem repeat expansions promote risk for Alzheimer's disease. medRxiv 2023:2023.11.16.23298623. [PMID: 38014121 PMCID: PMC10680900 DOI: 10.1101/2023.11.16.23298623] [Indexed: 11/29/2023]
Abstract
Studies of the genetics of Alzheimer's disease (AD) have largely focused on single nucleotide variants and short insertions/deletions. However, most of the disease heritability has yet to be uncovered, suggesting that there is substantial genetic risk conferred by other forms of genetic variation. There are over one million short tandem repeats (STRs) in the genome, and their link to AD risk has not been assessed. As pathogenic expansions of STR cause over 30 neurologic diseases, it is important to ascertain whether STRs may also be implicated in AD risk. Here, we genotyped 321,742 polymorphic STR tracts genome-wide using PCR-free whole genome sequencing data from 2,981 individuals (1,489 AD case and 1,492 control individuals). We implemented an approach to identify STR expansions as STRs with tract lengths that are outliers from the population. We then tested for differences in aggregate burden of expansions in case versus control individuals. AD patients had a 1.19-fold increase of STR expansions compared to healthy elderly controls (p = 8.27 × 10⁻³, two-sided Mann-Whitney test). Individuals carrying > 30 STR expansions had 3.62-fold higher odds of having AD and had more severe AD neuropathology. AD STR expansions were highly enriched within active promoters in post-mortem hippocampal brain tissues and particularly within SINE-VNTR-Alu (SVA) retrotransposons. Together, these results demonstrate that expanded STRs within active promoter regions of the genome promote risk of AD.
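The two-step analysis described in the abstract above (flag outlier tract lengths as expansions, then compare the per-individual expansion burden between groups) might be sketched as follows. The abstract does not state the outlier rule, so the Q3 + 3 × IQR cutoff, the one-length-per-sample-per-locus input layout, and every function name here are assumptions for illustration only. The U statistic is computed directly and no p-value is derived.

```python
from statistics import quantiles

def expansion_threshold(lengths, k=3.0):
    """Upper outlier cutoff for one locus: Q3 + k * IQR (an assumed rule)."""
    q1, _, q3 = quantiles(lengths, n=4)
    return q3 + k * (q3 - q1)

def expansion_burden(genotypes):
    """Count, per sample, the loci where its tract length is an upper outlier.

    `genotypes` maps sample ID -> list of tract lengths, one per locus
    (hypothetical layout; e.g. the longer allele at each locus).
    """
    samples = list(genotypes)
    n_loci = len(genotypes[samples[0]])
    burden = dict.fromkeys(samples, 0)
    for locus in range(n_loci):
        lengths = [genotypes[s][locus] for s in samples]
        cutoff = expansion_threshold(lengths)
        for s in samples:
            if genotypes[s][locus] > cutoff:
                burden[s] += 1
    return burden

def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for x versus y, counting ties as one half."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)

# Hypothetical single-locus toy data: only sample s8 carries an expansion.
toy = {"s1": [10], "s2": [10], "s3": [10], "s4": [11],
       "s5": [11], "s6": [12], "s7": [12], "s8": [40]}
burden = expansion_burden(toy)
```

In practice the burden values for case and control individuals would be passed to a full two-sided Mann-Whitney test from a statistics library to obtain a p-value.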
Affiliation(s)
- Michael H Guo
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Neurology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Wan-Ping Lee
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Badri Vardarajan
  - Department of Neurology, College of Physicians and Surgeons, Columbia University, New York, NY
- Gerard D Schellenberg
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Jennifer Phillips-Cremins
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA
  - Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
12. DeGorter MK, Goddard PC, Karakoc E, Kundu S, Yan SM, Nachun D, Abell N, Aguirre M, Carstensen T, Chen Z, Durrant M, Dwaracherla VR, Feng K, Gloudemans MJ, Hunter N, Moorthy MPS, Pomilla C, Rodrigues KB, Smith CJ, Smith KS, Ungar RA, Balliu B, Fellay J, Flicek P, McLaren PJ, Henn B, McCoy RC, Sugden L, Kundaje A, Sandhu MS, Gurdasani D, Montgomery SB. Transcriptomics and chromatin accessibility in multiple African population samples. bioRxiv 2023:2023.11.04.564839. [PMID: 37986808 PMCID: PMC10659267 DOI: 10.1101/2023.11.04.564839] [Indexed: 11/22/2023]
Abstract
Mapping the functional human genome and impact of genetic variants is often limited to European-descendent population samples. To aid in overcoming this limitation, we measured gene expression using RNA sequencing in lymphoblastoid cell lines (LCLs) from 599 individuals from six African populations to identify novel transcripts including those not represented in the hg38 reference genome. We used whole genomes from the 1000 Genomes Project and 164 Maasai individuals to identify 8,881 expression and 6,949 splicing quantitative trait loci (eQTLs/sQTLs), and 2,611 structural variants associated with gene expression (SV-eQTLs). We further profiled chromatin accessibility using ATAC-Seq in a subset of 100 representative individuals, to identify chromatin accessibility quantitative trait loci (caQTLs) and allele-specific chromatin accessibility, and provide predictions for the functional effect of 78.9 million variants on chromatin accessibility. Using this map of eQTLs and caQTLs we fine-mapped GWAS signals for a range of complex diseases. Combined, this work expands global functional genomic data to identify novel transcripts, functional elements and variants, understand population genetic history of molecular quantitative trait loci, and further resolve the genetic basis of multiple human traits and disease.
Collapse
Affiliation(s)
| | - Page C Goddard
- Department of Genetics, Stanford University, Stanford, CA
| | - Emre Karakoc
- Human Genetics, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
| | - Soumya Kundu
- Department of Computer Science, Stanford University, Stanford CA
| | | | - Daniel Nachun
- Department of Pathology, Stanford University, Stanford, CA
| | - Nathan Abell
- Department of Genetics, Stanford University, Stanford, CA
| | - Matthew Aguirre
- Department of Biomedical Data Science, Stanford University, Stanford, CA
| | - Tommy Carstensen
- Human Genetics, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
| | - Ziwei Chen
- Department of Computer Science, Stanford University, Stanford CA
| | | | | | - Karen Feng
- Department of Biomedical Data Science, Stanford University, Stanford, CA
| | | | - Naiomi Hunter
- Department of Genetics, Stanford University, Stanford, CA
| | | | - Cristina Pomilla
- Human Genetics, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
| | | | | | - Kevin S Smith
- Department of Pathology, Stanford University, Stanford, CA
| | - Rachel A Ungar
- Department of Genetics, Stanford University, Stanford, CA
| | - Brunilda Balliu
- Department of Pathology and Laboratory Medicine, University of California Los Angeles, Los Angeles, CA and Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA
| | - Jacques Fellay
- School of Life Sciences, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland and Precision Medicine Unit, Biomedical Data Science Center, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
| | - Paul Flicek
- Department of Genetics, University of Cambridge, Cambridge, UK
| | - Paul J McLaren
- Sexually Transmitted and Blood-Borne Infections Division at JC Wilt Infectious Diseases Research Centre, National Microbiology Laboratory Branch, Public Health Agency of Canada, Winnipeg, Canada and Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, Canada
| | - Brenna Henn
- Department of Anthropology, University of California Davis, Davis CA and Genome Center, University of California Davis, Davis CA
| | - Rajiv C McCoy
- Department of Biology, Johns Hopkins University, Baltimore
| | - Lauren Sugden
- Department of Mathematics and Computer Science, Duquesne University, Pittsburgh, PA
| | - Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA
- Department of Computer Science, Stanford University, Stanford CA
| | | | - Deepti Gurdasani
- William Harvey Research Institute, Queen Mary University of London, London, UK; Kirby Institute, University of New South Wales, Australia; School of Medicine, University of Western Australia, Australia
|
13
|
Bhati M, Mapel XM, Lloret-Villas A, Pausch H. Structural variants and short tandem repeats impact gene expression and splicing in bovine testis tissue. Genetics 2023; 225:iyad161. [PMID: 37655920 PMCID: PMC10627265 DOI: 10.1093/genetics/iyad161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 06/05/2023] [Accepted: 08/24/2023] [Indexed: 09/02/2023] Open
Abstract
Structural variants (SVs) and short tandem repeats (STRs) are significant sources of genetic variation. However, the impacts of these variants on gene regulation have not been investigated in cattle. Here, we genotyped and characterized 19,408 SVs and 374,821 STRs in 183 bovine genomes and investigated their impact on molecular phenotypes derived from testis transcriptomes. We found that 71% of STRs were multiallelic. The vast majority (95%) of STRs and SVs were in intergenic and intronic regions. Only 37% of SVs and 40% of STRs were in high linkage disequilibrium (LD) (R2 > 0.8) with surrounding SNPs/insertions and deletions (Indels), indicating that SNP-based association testing and genomic prediction are blind to a nonnegligible portion of genetic variation. We showed that both SVs and STRs were more than 2-fold enriched among expression and splicing QTL (e/sQTL) relative to SNPs/Indels and were often associated with differential expression and splicing of multiple genes. Deletions and duplications had larger impacts on splicing and expression than any other type of SV. Exonic duplications predominantly increased gene expression either through alternative splicing or other mechanisms, whereas expression- and splicing-associated STRs primarily resided in intronic regions and exhibited bimodal effects on the molecular phenotypes investigated. Most e/sQTL resided within 100 kb of the affected genes or splicing junctions. We pinpoint candidate causal STRs and SVs associated with the expression of SLC13A4 and TTC7B and alternative splicing of a lncRNA and CAPP1. We provide a catalog of STRs and SVs for taurine cattle and show that these variants contribute substantially to gene expression and splicing variation.
Affiliation(s)
- Meenu Bhati
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
| | - Xena Marie Mapel
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
| | | | - Hubert Pausch
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
|
14
|
Zong W, Wang J, Zhao R, Niu N, Su Y, Hu Z, Liu X, Hou X, Wang L, Wang L, Zhang L. Associations of genome-wide structural variations with phenotypic differences in cross-bred Eurasian pigs. J Anim Sci Biotechnol 2023; 14:136. [PMID: 37805653 PMCID: PMC10559557 DOI: 10.1186/s40104-023-00929-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Accepted: 08/03/2023] [Indexed: 10/09/2023] Open
Abstract
BACKGROUND During approximately 10,000 years of domestication and selection, a large number of structural variations (SVs) have emerged in the genome of pig breeds, profoundly influencing their phenotypes and the ability to adapt to the local environment. SVs (≥ 50 bp) are widely distributed in the genome, mainly in the form of insertion (INS), mobile element insertion (MEI), deletion (DEL), duplication (DUP), inversion (INV), and translocation (TRA). While studies have investigated the SVs in pig genomes, genome-wide association studies (GWAS) based on SVs have rarely been conducted. RESULTS Here, we obtained a high-quality SV map containing 123,151 SVs from 15 Large White and 15 Min pigs through integrating the power of several SV tools, with 53.95% of the SVs being reported for the first time. These high-quality SVs were used to recover the population genetic structure, confirming the accuracy of genotyping. Potential functional SV loci were then identified based on positional effects and breed stratification. Finally, GWAS were performed for 36 traits by genotyping the screened potential causal loci in the F2 population according to their corresponding genomic positions. We identified a large number of loci involved in 8 carcass traits and 6 skeletal traits on chromosome 7, with FKBP5 containing the most significant SV locus for almost all traits. In addition, we found several significant loci in intramuscular fat, abdominal circumference, heart weight, and liver weight, etc. CONCLUSIONS We constructed a high-quality SV map using high-coverage sequencing data and then analyzed them by performing GWAS for 25 carcass traits, 7 skeletal traits, and 4 meat quality traits to determine that SVs may affect body size between European and Chinese pig breeds.
Affiliation(s)
- Wencheng Zong
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Jinbu Wang
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Runze Zhao
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
- College of Animal Science, Shanxi Agricultural University, Jinzhong, 030801, China
| | - Naiqi Niu
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Yanfang Su
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Ziping Hu
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
- College of Animal Science and Technology, Qingdao Agricultural University, Qingdao, 266109, China
| | - Xin Liu
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Xinhua Hou
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Ligang Wang
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Lixian Wang
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China.
| | - Longchao Zhang
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China.
|
15
|
Liao X, Zhu W, Zhou J, Li H, Xu X, Zhang B, Gao X. Repetitive DNA sequence detection and its role in the human genome. Commun Biol 2023; 6:954. [PMID: 37726397 PMCID: PMC10509279 DOI: 10.1038/s42003-023-05322-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/29/2023] [Accepted: 09/04/2023] [Indexed: 09/21/2023] Open
Abstract
Repetitive DNA sequences play critical roles in driving evolution, inducing variation, and regulating gene expression. In this review, we summarized the definition, arrangement, and structural characteristics of repeats. Besides, we introduced diverse biological functions of repeats and reviewed existing methods for automatic repeat detection, classification, and masking. Finally, we analyzed the type, structure, and regulation of repeats in the human genome and their role in the induction of complex diseases. We believe that this review will facilitate a comprehensive understanding of repeats and provide guidance for repeat annotation and in-depth exploration of their association with human diseases.
Affiliation(s)
- Xingyu Liao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
| | - Wufei Zhu
- Department of Endocrinology, Yichang Central People's Hospital, The First College of Clinical Medical Science, China Three Gorges University, 443000, Yichang, P.R. China
| | - Juexiao Zhou
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
| | - Haoyang Li
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
| | - Xiaopeng Xu
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
| | - Bin Zhang
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
| | - Xin Gao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia.
|
16
|
Arthur TD, Nguyen JP, D'Antonio-Chronowska A, Matsui H, Silva NS, Joshua IN, Luchessi AD, Young Greenwald WW, D'Antonio M, Pera MF, Frazer KA. Analysis of regulatory network modules in hundreds of human stem cell lines reveals complex epigenetic and genetic factors contribute to pluripotency state differences between subpopulations. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.20.541447. [PMID: 37292794 PMCID: PMC10245835 DOI: 10.1101/2023.05.20.541447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Stem cells exist in vitro in a spectrum of interconvertible pluripotent states. Analyzing hundreds of hiPSCs derived from different individuals, we show the proportions of these pluripotent states vary considerably across lines. We discovered 13 gene network modules (GNMs) and 13 regulatory network modules (RNMs), which were highly correlated with each other, suggesting that the coordinated co-accessibility of regulatory elements in the RNMs likely underlies the coordinated expression of genes in the GNMs. Epigenetic analyses revealed that regulatory networks underlying self-renewal and pluripotency have a surprising level of complexity. Genetic analyses identified thousands of regulatory variants that overlapped predicted transcription factor binding sites and were associated with chromatin accessibility in the hiPSCs. We show that the master regulator of pluripotency, the NANOG-OCT4 Complex, and its associated network were significantly enriched for regulatory variants with large effects, suggesting that they may play a role in the varying cellular proportions of pluripotency states between hiPSCs. Our work captures the coordinated activity of tens of thousands of regulatory elements in hiPSCs and bins these elements into discrete functionally characterized regulatory networks, shows that regulatory elements in pluripotency networks harbor variants with large effects, and provides a rich resource for future pluripotent stem cell research.
|
17
|
Lutz MW, Chiba-Falek O. Bioinformatics pipeline to guide post-GWAS studies in Alzheimer's: A new catalogue of disease candidate short structural variants. Alzheimers Dement 2023; 19:4094-4109. [PMID: 37253165 PMCID: PMC10524333 DOI: 10.1002/alz.13168] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/22/2022] [Revised: 04/27/2023] [Accepted: 05/08/2023] [Indexed: 06/01/2023]
Abstract
BACKGROUND Short structural variants (SSVs), including insertions/deletions (indels), are common in the human genome and impact disease risk. The role of SSVs in late-onset Alzheimer's disease (LOAD) has been understudied. In this study, we developed a bioinformatics pipeline of SSVs within LOAD-genome-wide association study (GWAS) regions to prioritize regulatory SSVs based on the strength of their predicted effect on transcription factor (TF) binding sites. METHODS The pipeline utilized publicly available functional genomics data sources including candidate cis-regulatory elements (cCREs) from ENCODE and single-nucleus (sn)RNA-seq data from LOAD patient samples. RESULTS We catalogued 1581 SSVs in candidate cCREs in LOAD GWAS regions that disrupted 737 TF sites. That included SSVs that disrupted the binding of RUNX3, SPI1, and SMAD3, within the APOE-TOMM40, SPI1, and MS4A6A LOAD regions. CONCLUSIONS The pipeline developed here prioritized non-coding SSVs in cCREs and characterized their putative effects on TF binding. The approach integrates multiomics datasets for validation experiments using disease models.
Affiliation(s)
- Michael W. Lutz
- Division of Translational Brain Sciences, Department of Neurology, Duke University Medical Center, Durham, NC 27710, USA
| | - Ornit Chiba-Falek
- Division of Translational Brain Sciences, Department of Neurology, Duke University Medical Center, Durham, NC 27710, USA
- Center for Genomic and Computational Biology, Duke University Medical Center, Durham, NC 27710, USA
|
18
|
Billingsley KJ, Ding J, Jerez PA, Illarionova A, Levine K, Grenn FP, Makarious MB, Moore A, Vitale D, Reed X, Hernandez D, Torkamani A, Ryten M, Hardy J, Chia R, Scholz SW, Traynor BJ, Dalgard CL, Ehrlich DJ, Tanaka T, Ferrucci L, Beach T, Serrano GE, Quinn JP, Bubb VJ, Collins RL, Zhao X, Walker M, Pierce-Hoffman E, Brand H, Talkowski ME, Casey B, Cookson MR, Markham A, Nalls MA, Mahmoud M, Sedlazeck FJ, Blauwendraat C, Gibbs JR, Singleton AB. Genome-Wide Analysis of Structural Variants in Parkinson Disease. Ann Neurol 2023; 93:1012-1022. [PMID: 36695634 PMCID: PMC10192042 DOI: 10.1002/ana.26608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2022] [Revised: 01/03/2023] [Accepted: 01/16/2023] [Indexed: 01/26/2023]
Abstract
OBJECTIVE Identification of genetic risk factors for Parkinson disease (PD) has to date been primarily limited to the study of single nucleotide variants, which only represent a small fraction of the genetic variation in the human genome. Consequently, causal variants for most PD risk are not known. Here we focused on structural variants (SVs), which represent a major source of genetic variation in the human genome. We aimed to discover SVs associated with PD risk by performing the first large-scale characterization of SVs in PD. METHODS We leveraged a recently developed computational pipeline to detect and genotype SVs from 7,772 Illumina short-read whole genome sequencing samples. Using this set of SV variants, we performed a genome-wide association study using 2,585 cases and 2,779 controls and identified SVs associated with PD risk. Furthermore, to validate the presence of these variants, we generated a subset of matched whole-genome long-read sequencing data. RESULTS We genotyped and tested 3,154 common SVs, representing over 412 million nucleotides of previously uncatalogued genetic variation. Using long-read sequencing data, we validated the presence of three novel deletion SVs that are associated with risk of PD from our initial association analysis, including a 2 kb intronic deletion within the gene LRRN4. INTERPRETATION We identified three SVs associated with genetic risk of PD. This study represents the most comprehensive assessment of the contribution of SVs to the genetic risk of PD to date. ANN NEUROL 2023;93:1012-1022.
Affiliation(s)
- Kimberley J. Billingsley
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
- Center for Alzheimer’s and Related Dementias, National Institute on Aging, Bethesda, Maryland, USA
| | - Jinhui Ding
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
| | - Pilar Alvarez Jerez
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
- Center for Alzheimer’s and Related Dementias, National Institute on Aging, Bethesda, Maryland, USA
| | | | | | - Francis P. Grenn
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
| | - Mary B. Makarious
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
| | - Anni Moore
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
| | - Daniel Vitale
- Center for Alzheimer’s and Related Dementias, National Institute on Aging, Bethesda, Maryland, USA
- Data Tecnica International, Washington, DC, USA
| | - Xylena Reed
- Center for Alzheimer’s and Related Dementias, National Institute on Aging, Bethesda, Maryland, USA
| | - Dena Hernandez
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
| | - Ali Torkamani
- The Scripps Research Institute, La Jolla, CA, 92037, USA
| | - Mina Ryten
- NIHR Great Ormond Street Hospital Biomedical Research Centre, University College London, London, UK
- Department of Genetics and Genomic Medicine, Great Ormond Street Institute of Child Health, University College London, London, UK
| | - John Hardy
- UK Dementia Research Institute and Department of Neurodegenerative Disease and Reta Lila Weston Institute, UCL Queen Square Institute of Neurology and UCL Movement Disorders Centre, University College London, London, UK
- Institute for Advanced Study, The Hong Kong University of Science and Technology, Hong Kong SAR, China
| | | | - Ruth Chia
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
| | - Sonja W. Scholz
- Neurodegenerative Diseases Research Unit, National Institute of Neurological Disorders and Stroke, Bethesda, Maryland, USA
- Department of Neurology, Johns Hopkins University Medical Center, Baltimore, Maryland, USA
| | - Bryan J. Traynor
- Department of Neurology, Johns Hopkins University Medical Center, Baltimore, Maryland, USA
- Neuromuscular Diseases Research Section, Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
- Therapeutic Development Branch, National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
- National Institute of Neurological Disorders and Stroke, Bethesda, MD 20892
- Reta Lila Weston Institute, UCL Queen Square Institute of Neurology, University College London, London WC1N 1PJ, UK
| | - Clifton L. Dalgard
- Department of Anatomy Physiology & Genetics, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- The American Genome Center, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
| | - Debra J. Ehrlich
- Parkinson’s Disease Clinic, Office of the Clinical Director, National Institute of Neurological Disorders and Stroke, Bethesda, Maryland, USA
| | - Toshiko Tanaka
- Translational Gerontology Branch, National Institute on Aging, NIH, Baltimore, MD 21224, USA
| | - Luigi Ferrucci
- Translational Gerontology Branch, National Institute on Aging, NIH, Baltimore, MD 21224, USA
| | - Thomas G. Beach
- Civin Laboratory for Neuropathology, Banner Sun Health Research Institute, Sun City, AZ
| | - Geidy E. Serrano
- Civin Laboratory for Neuropathology, Banner Sun Health Research Institute, Sun City, AZ
| | - John P. Quinn
- Department of Pharmacology and Therapeutics, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 3BX, UK
| | - Vivien J. Bubb
- Department of Pharmacology and Therapeutics, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 3BX, UK
| | - Ryan L Collins
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology (M.I.T) and Harvard USA Cambridge, MA 02142, USA
- Division of Medical Sciences and Department of Medicine, Harvard Medical School, Boston, MA 02115
| | - Xuefang Zhao
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology (M.I.T) and Harvard USA Cambridge, MA 02142, USA
| | - Mark Walker
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology (M.I.T) and Harvard USA Cambridge, MA 02142, USA
- Data Sciences Platform, Broad Institute of Massachusetts Institute of Technology (M.I.T) and Harvard USA Cambridge, MA 02142, USA
| | - Emma Pierce-Hoffman
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology (M.I.T) and Harvard USA Cambridge, MA 02142, USA
- Data Sciences Platform, Broad Institute of Massachusetts Institute of Technology (M.I.T) and Harvard USA Cambridge, MA 02142, USA
| | - Harrison Brand
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology (M.I.T) and Harvard USA Cambridge, MA 02142, USA
- Division of Medical Sciences and Department of Medicine, Harvard Medical School, Boston, MA 02115
| | - Michael E. Talkowski
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology (M.I.T) and Harvard USA Cambridge, MA 02142, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Bradford Casey
- The Michael J. Fox Foundation for Parkinson’s Research, New York, NY 10001
| | - Mark R Cookson
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
| | | | - Mike A. Nalls
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
- Center for Alzheimer’s and Related Dementias, National Institute on Aging, Bethesda, Maryland, USA
- Data Tecnica International, Washington, DC, USA
| | - Medhat Mahmoud
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030
- Department of Computer Science, Rice University, 6100 Main Street, Houston, TX, 77005, US
| | - Cornelis Blauwendraat
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
- Center for Alzheimer’s and Related Dementias, National Institute on Aging, Bethesda, Maryland, USA
| | - J. Raphael Gibbs
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
| | - Andrew B. Singleton
- Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland, USA
- Center for Alzheimer’s and Related Dementias, National Institute on Aging, Bethesda, Maryland, USA
|
19
|
Shi Y, Niu Y, Zhang P, Luo H, Liu S, Zhang S, Wang J, Li Y, Liu X, Song T, Xu T, He S. Characterization of genome-wide STR variation in 6487 human genomes. Nat Commun 2023; 14:2092. [PMID: 37045857 PMCID: PMC10097659 DOI: 10.1038/s41467-023-37690-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 08/04/2022] [Accepted: 03/27/2023] [Indexed: 04/14/2023] Open
Abstract
Short tandem repeats (STRs) are abundant and highly mutagenic in the human genome. Many STR loci have been associated with a range of human genetic disorders. However, most population-scale studies on STR variation in humans have focused on European ancestry cohorts or are limited by sequencing depth. Here, we depicted a comprehensive map of 366,013 polymorphic STRs (pSTRs) constructed from 6487 deeply sequenced genomes, comprising 3983 Chinese samples (~31.5x, NyuWa) and 2504 samples from the 1000 Genomes Project (~33.3x, 1KGP). We found that STR mutations were affected by motif length, chromosome context and epigenetic features. We identified 3273 and 1117 pSTRs whose repeat numbers were associated with gene expression and 3'UTR alternative polyadenylation, respectively. We also implemented population analysis, investigated population differentiated signatures, and genotyped 60 known disease-causing STRs. Overall, this study further extends the scale of STR variation in humans and propels our understanding of the semantics of STRs.
Affiliation(s)
- Yirong Shi
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Yiwei Niu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Peng Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Huaxia Luo
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Shuai Liu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Sijia Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Jiajia Wang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Yanyan Li
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Xinyue Liu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Tingrui Song
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Tao Xu
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, 250117, Shandong, China.
| | - Shunmin He
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China.
|
20
|
Hong J, Su S, Wang L, Bai S, Xu J, Li Z, Betts N, Liang W, Wang W, Shi J, Zhang D. Combined genome-wide association study and epistasis analysis reveal multifaceted genetic architectures of plant height in Asian cultivated rice. PLANT, CELL & ENVIRONMENT 2023; 46:1295-1311. [PMID: 36734269 DOI: 10.1111/pce.14557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Revised: 01/08/2023] [Accepted: 02/01/2023] [Indexed: 06/18/2023]
Abstract
Plant height (PH) in rice (Oryza sativa) is an important trait for its adaptation and agricultural performance. Discovery of the semi-dwarf1 (SD1) mutation initiated the Green Revolution, boosting rice yield and fitness, but the underlying genetic regulation of PH in rice remains largely unknown. Here, we performed genome-wide association study (GWAS) and identified 12 non-repetitive QTL/genes regulating PH variation in 619 Asian cultivated rice accessions. One of these was an SD1 structural variant, not normally detected in standard GWAS analyses. Given the strong effect of SD1 on PH, we also divided 619 accessions into subgroups harbouring distinct SD1 haplotypes, and found a further 85 QTL/genes for PH, revealing genetic heterogeneity that may be missed by analysing a broad, diverse population. Moreover, we uncovered two epistatic interaction networks of PH-associated QTL/genes in the japonica (Geng)-dominant SD1NIP subgroup. In one of them, the hub QTL/gene qphSN1.4/GAMYB interacted with qphSN3.1/OsINO80, qphSN3.4/HD16/EL1, qphSN6.2/LOC_Os06g11130, and qphSN10.2/MADS56. Sequence variations in GAMYB and MADS56 were associated with their expression levels and PH variations, and MADS56 was shown to physically interact with MADS57 to coregulate expression of gibberellin (GA) metabolic genes OsGA2ox3 and Elongated Uppermost Internode1 (EUI1). Our study uncovered the multifaceted genetic architectures of rice PH, and provided novel and abundant genetic resources for breeding semi-dwarf rice and new candidates for further mechanistic studies on regulation of PH in rice.
Affiliation(s)
- Jun Hong
- Joint International Research Laboratory of Metabolic and Developmental Sciences, State Key Laboratory of Hybrid Rice, School of Life Sciences and Biotechnology, Yazhou Bay Institute of Deepsea Sci-Tech, Shanghai Jiao Tong University, Shanghai, China
| | - Su Su
- Joint International Research Laboratory of Metabolic and Developmental Sciences, State Key Laboratory of Hybrid Rice, School of Life Sciences and Biotechnology, Yazhou Bay Institute of Deepsea Sci-Tech, Shanghai Jiao Tong University, Shanghai, China
| | - Li Wang
- Joint International Research Laboratory of Metabolic and Developmental Sciences, State Key Laboratory of Hybrid Rice, School of Life Sciences and Biotechnology, Yazhou Bay Institute of Deepsea Sci-Tech, Shanghai Jiao Tong University, Shanghai, China
| | - Shaoxing Bai
- Joint International Research Laboratory of Metabolic and Developmental Sciences, State Key Laboratory of Hybrid Rice, School of Life Sciences and Biotechnology, Yazhou Bay Institute of Deepsea Sci-Tech, Shanghai Jiao Tong University, Shanghai, China
| | - Jianlong Xu
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Zhikang Li
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Natalie Betts
- School of Agriculture, Food and Wine, University of Adelaide, Urrbrae, South Australia, Australia
| | - Wanqi Liang
- Joint International Research Laboratory of Metabolic and Developmental Sciences, State Key Laboratory of Hybrid Rice, School of Life Sciences and Biotechnology, Yazhou Bay Institute of Deepsea Sci-Tech, Shanghai Jiao Tong University, Shanghai, China
| | - Wensheng Wang
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Jianxin Shi
- Joint International Research Laboratory of Metabolic and Developmental Sciences, State Key Laboratory of Hybrid Rice, School of Life Sciences and Biotechnology, Yazhou Bay Institute of Deepsea Sci-Tech, Shanghai Jiao Tong University, Shanghai, China
| | - Dabing Zhang
- Joint International Research Laboratory of Metabolic and Developmental Sciences, State Key Laboratory of Hybrid Rice, School of Life Sciences and Biotechnology, Yazhou Bay Institute of Deepsea Sci-Tech, Shanghai Jiao Tong University, Shanghai, China
- School of Agriculture, Food and Wine, University of Adelaide, Urrbrae, South Australia, Australia
| |
|
21
|
Wright SE, Todd PK. Native functions of short tandem repeats. eLife 2023; 12:e84043. [PMID: 36940239 PMCID: PMC10027321 DOI: 10.7554/elife.84043] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/11/2022] [Accepted: 03/08/2023] [Indexed: 03/21/2023] Open
Abstract
Over a third of the human genome is comprised of repetitive sequences, including more than a million short tandem repeats (STRs). While studies of the pathologic consequences of repeat expansions that cause syndromic human diseases are extensive, the potential native functions of STRs are often ignored. Here, we summarize a growing body of research into the normal biological functions for repetitive elements across the genome, with a particular focus on the roles of STRs in regulating gene expression. We propose reconceptualizing the pathogenic consequences of repeat expansions as aberrancies in normal gene regulation. From this altered viewpoint, we predict that future work will reveal broader roles for STRs in neuronal function and as risk alleles for more common human neurological diseases.
Affiliation(s)
- Shannon E Wright
- Department of Neurology, University of Michigan–Ann Arbor, Ann Arbor, United States
- Neuroscience Graduate Program, University of Michigan–Ann Arbor, Ann Arbor, United States
- Department of Neuroscience, Picower Institute, Cambridge, United States
| | - Peter K Todd
- Department of Neurology, University of Michigan–Ann Arbor, Ann Arbor, United States
- VA Ann Arbor Healthcare System, Ann Arbor, United States
| |
|
22
|
Revisiting mutagenesis at non-B DNA motifs in the human genome. Nat Struct Mol Biol 2023; 30:417-424. [PMID: 36914796 DOI: 10.1038/s41594-023-00936-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/23/2021] [Accepted: 02/03/2023] [Indexed: 03/16/2023]
Abstract
Non-B DNA structures formed by repetitive sequence motifs are known instigators of mutagenesis in experimental systems. Analyzing this phenomenon computationally in the human genome requires careful disentangling of intrinsic confounding factors, including overlapping and interrupted motifs and recurrent sequencing errors. Here, we show that accounting for these factors eliminates all signals of repeat-induced mutagenesis that extend beyond the motif boundary, and eliminates or dramatically shrinks the magnitude of mutagenesis within some motifs, contradicting previous reports. Mutagenesis not attributable to artifacts revealed several biological mechanisms. Polymerase slippage generates frequent indels within every variety of short tandem repeat motif, implicating slipped-strand structures. Interruption-correcting single nucleotide variants within short tandem repeats may originate from error-prone polymerases. Secondary-structure formation promotes single nucleotide variants within palindromic repeats and duplications within direct repeats. G-quadruplex motifs cause recurrent sequencing errors, whereas mutagenesis at Z-DNAs is conspicuously absent.
|
23
|
Jun G, English AC, Metcalf GA, Yang J, Chaisson MJP, Pankratz N, Menon VK, Salerno WJ, Krasheninina O, Smith AV, Lane JA, Blackwell T, Kang HM, Salvi S, Meng Q, Shen H, Pasham D, Bhamidipati S, Kottapalli K, Arnett DK, Ashley-Koch A, Auer PL, Beutel KM, Bis JC, Blangero J, Bowden DW, Brody JA, Cade BE, Chen YDI, Cho MH, Curran JE, Fornage M, Freedman BI, Fingerlin T, Gelb BD, Hou L, Hung YJ, Kane JP, Kaplan R, Kim W, Loos RJ, Marcus GM, Mathias RA, McGarvey ST, Montgomery C, Naseri T, Nouraie SM, Preuss MH, Palmer ND, Peyser PA, Raffield LM, Ratan A, Redline S, Reupena S, Rotter JI, Rich SS, Rienstra M, Ruczinski I, Sankaran VG, Schwartz DA, Seidman CE, Seidman JG, Silverman EK, Smith JA, Stilp A, Taylor KD, Telen MJ, Weiss ST, Williams LK, Wu B, Yanek LR, Zhang Y, Lasky-Su J, Gingras MC, Dutcher SK, Eichler EE, Gabriel S, Germer S, Kim R, Viaud-Martinez KA, Nickerson DA, Luo J, Reiner A, Gibbs RA, Boerwinkle E, Abecasis G, Sedlazeck FJ. Structural variation across 138,134 samples in the TOPMed consortium. Research Square 2023:rs.3.rs-2515453. [PMID: 36778386 PMCID: PMC9915771 DOI: 10.21203/rs.3.rs-2515453/v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/05/2023]
Abstract
Ever larger Structural Variant (SV) catalogs highlighting the diversity within and between populations help researchers better understand the links between SVs and disease. The identification of SVs from DNA sequence data is non-trivial and requires a balance between comprehensiveness and precision. Here we present a catalog of 355,667 SVs (59.34% novel) across autosomes and the X chromosome (50bp+) from 138,134 individuals in the diverse TOPMed consortium. We describe our methodologies for SV inference resulting in high variant quality and >90% allele concordance compared to long-read de-novo assemblies of well-characterized control samples. We demonstrate utility through significant associations between SVs and various important cardio-metabolic and hematologic traits. We have identified 690 SV hotspots and deserts and those that potentially impact the regulation of medically relevant genes. This catalog characterizes SVs across multiple populations and will serve as a valuable tool to understand the impact of SV on disease development and progression.
Affiliation(s)
- Goo Jun
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston
| | - Adam C English
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Ginger A Metcalf
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Jianzhi Yang
- University of Southern California, Los Angeles, CA, USA
| | | | | | - Vipin K Menon
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | | | | | - Albert V Smith
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI
| | - John A Lane
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI
| | - Tom Blackwell
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI
| | - Hyun Min Kang
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI
| | - Sejal Salvi
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Qingchang Meng
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Hua Shen
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Divya Pasham
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Sravya Bhamidipati
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Kavya Kottapalli
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Donna K. Arnett
- Department of Epidemiology, University of Kentucky College of Public Health
| | - Allison Ashley-Koch
- Department of Medicine, Duke University Medical Center, Durham, NC
- Duke Molecular Physiology Institute, Duke University Medical Center, Durham, NC
| | - Paul L. Auer
- Division of Biostatistics and Cancer Center, Medical College of Wisconsin, Milwaukee WI
| | | | - Joshua C. Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - John Blangero
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas, Rio Grande Valley School of Medicine, Brownsville, TX
| | - Donald W. Bowden
- Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Jennifer A. Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Brian E. Cade
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA
| | - Yii-Der Ida Chen
- Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA USA
| | - Michael H. Cho
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
| | - Joanne E. Curran
- Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Myriam Fornage
- Brown Foundation Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX
| | - Barry I. Freedman
- Department of Internal Medicine, Section on Nephrology, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Tasha Fingerlin
- Center for Genes, Environment and Health, National Jewish Health, 1400 Jackson St., Denver, CO, 80206, USA
| | - Bruce D. Gelb
- Mindich Child Health and Development Institute and the Departments of Pediatrics and Genetics & Genomic Sciences, Icahn School of Medicine at Mount Sinai
| | | | - Yi-Jen Hung
- Institute of Preventive Medicine, National Defense Medical Center, Taiwan
| | - John P Kane
- Cardiovascular Research Institute, University of California, San Francisco
| | - Robert Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Wonji Kim
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
| | - Ruth J.F. Loos
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY
| | - Gregory M Marcus
- Division of Cardiology, University of California, San Francisco CA
| | - Rasika A. Mathias
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD
| | - Stephen T. McGarvey
- Department of Epidemiology, International Health Institute and Department of Anthropology, Brown University
| | - Courtney Montgomery
- Genes and Human Disease Research Program, Oklahoma Medical Research Foundation
| | - Take Naseri
- Ministry of Health, Government of Samoa, Apia, Samoa
| | - S. Mehdi Nouraie
- Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15213
| | - Michael H. Preuss
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY
| | | | - Patricia A. Peyser
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI USA
| | | | - Aakrosh Ratan
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA USA
| | - Susan Redline
- Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA
| | | | - Jerome I. Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA USA
- Department of Cardiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Stephen S. Rich
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI USA
| | - Michiel Rienstra
- Department of Cardiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Ingo Ruczinski
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
| | - Vijay G. Sankaran
- Division of Hematology/Oncology, Boston Children’s Hospital and Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02115
- Broad Institute of MIT and Harvard, Cambridge, MA 02142
| | | | - Christine E. Seidman
- Department of Genetics, Harvard Medical School
- Cardiovascular Division, Brigham & Women’s Hospital, Harvard University
- Howard Hughes Medical Institute, Harvard University
| | | | - Edwin K. Silverman
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA
| | - Jennifer A. Smith
- Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15213
| | - Adrienne Stilp
- Department of Biostatistics, University of Washington, Seattle, WA
| | - Kent D. Taylor
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA USA
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA USA
| | - Marilyn J. Telen
- Department of Medicine, Duke University Medical Center, Durham, NC
| | - Scott T. Weiss
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
| | - L. Keoki Williams
- Center for Individualized and Genomic Medicine Research (CIGMA), Department of Internal Medicine, Henry Ford Health System, Detroit, Michigan, United States of America
| | - Baojun Wu
- Center for Individualized and Genomic Medicine Research (CIGMA), Department of Internal Medicine, Henry Ford Health System, Detroit, Michigan, United States of America
| | - Lisa R. Yanek
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY
| | - Yingze Zhang
- Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15213
| | - Jessica Lasky-Su
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
| | | | - Susan K. Dutcher
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington, USA
| | | | | | - Ryan Kim
- Psomagen, Inc., Rockville, Maryland, USA
| | | | | | | | - James Luo
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Alex Reiner
- Department of Epidemiology, University of Washington, Seattle, WA 98109, USA
| | - Richard A Gibbs
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Eric Boerwinkle
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Goncalo Abecasis
- Regeneron Genetics Center
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI
| | - Fritz J Sedlazeck
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
- Department of Computer Science, Rice University, 6100 Main Street, Houston, TX, 77005, USA
| |
|
24
|
Verbiest M, Maksimov M, Jin Y, Anisimova M, Gymrek M, Bilgin Sonay T. Mutation and selection processes regulating short tandem repeats give rise to genetic and phenotypic diversity across species. J Evol Biol 2023; 36:321-336. [PMID: 36289560 PMCID: PMC9990875 DOI: 10.1111/jeb.14106] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 02/08/2022] [Revised: 06/29/2022] [Accepted: 08/01/2022] [Indexed: 02/03/2023]
Abstract
Short tandem repeats (STRs) are units of 1-6 bp that repeat in a tandem fashion in DNA. Along with single nucleotide polymorphisms and large structural variations, they are among the major genomic variants underlying genetic, and likely phenotypic, divergence. STRs experience mutation rates that are orders of magnitude higher than other well-studied genotypic variants. Frequent copy number changes result in a wide range of alleles, and provide unique opportunities for modulating complex phenotypes through variation in repeat length. While classical studies have identified key roles of individual STR loci, the advent of improved sequencing technology, high-quality genome assemblies for diverse species, and bioinformatics methods for genome-wide STR analysis now enable more systematic study of STR variation across wide evolutionary ranges. In this review, we explore mutation and selection processes that affect STR copy number evolution, and how these processes give rise to varying STR patterns both within and across species. Finally, we review recent examples of functional and adaptive changes linked to STRs.
Affiliation(s)
- Max Verbiest
- Institute of Computational Life Sciences, School of Life Sciences and Facility Management, Zürich University of Applied Sciences, Wädenswil, Switzerland
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Mikhail Maksimov
- Department of Computer Science & Engineering, University of California San Diego, La Jolla, California, USA
- Department of Medicine, University of California San Diego, La Jolla, California, USA
| | - Ye Jin
- Department of Medicine, University of California San Diego, La Jolla, California, USA
- Department of Bioengineering, University of California San Diego, La Jolla, California, USA
| | - Maria Anisimova
- Institute of Computational Life Sciences, School of Life Sciences and Facility Management, Zürich University of Applied Sciences, Wädenswil, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Melissa Gymrek
- Department of Computer Science & Engineering, University of California San Diego, La Jolla, California, USA
- Department of Medicine, University of California San Diego, La Jolla, California, USA
| | - Tugce Bilgin Sonay
- Institute of Ecology, Evolution and Environmental Biology, Columbia University, New York, New York, USA
| |
|
25
|
Nguyen TV, Vander Jagt CJ, Wang J, Daetwyler HD, Xiang R, Goddard ME, Nguyen LT, Ross EM, Hayes BJ, Chamberlain AJ, MacLeod IM. In it for the long run: perspectives on exploiting long-read sequencing in livestock for population scale studies of structural variants. Genet Sel Evol 2023; 55:9. [PMID: 36721111 PMCID: PMC9887926 DOI: 10.1186/s12711-023-00783-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Accepted: 01/23/2023] [Indexed: 02/02/2023] Open
Abstract
Studies have demonstrated that structural variants (SV) play a substantial role in the evolution of species and have an impact on Mendelian traits in the genome. However, unlike small variants (< 50 bp), it has been challenging to accurately identify and genotype SV at the population scale using short-read sequencing. Long-read sequencing technologies are becoming competitively priced and can address several of the disadvantages of short-read sequencing for the discovery and genotyping of SV. In livestock species, analysis of SV at the population scale still faces challenges due to the lack of resources, high costs, technological barriers, and computational limitations. In this review, we summarize recent progress in the characterization of SV in the major livestock species, the obstacles that still need to be overcome, as well as the future directions in this growing field. It seems timely that research communities pool resources to build global population-scale long-read sequencing consortiums for the major livestock species for which the application of genomic tools has become cost-effective.
Affiliation(s)
- Tuan V. Nguyen
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
| | - Christy J. Vander Jagt
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
| | - Jianghui Wang
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
| | - Hans D. Daetwyler
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083, Australia
| | - Ruidong Xiang
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
- Faculty of Veterinary & Agricultural Science, The University of Melbourne, Parkville, VIC 3052, Australia
| | - Michael E. Goddard
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
- Faculty of Veterinary & Agricultural Science, The University of Melbourne, Parkville, VIC 3052, Australia
| | - Loan T. Nguyen
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, St Lucia, QLD 4072, Australia
| | - Elizabeth M. Ross
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, St Lucia, QLD 4072, Australia
| | - Ben J. Hayes
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, St Lucia, QLD 4072, Australia
| | - Amanda J. Chamberlain
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083, Australia
| | - Iona M. MacLeod
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
| |
|
26
|
Jun G, English AC, Metcalf GA, Yang J, Chaisson MJP, Pankratz N, Menon VK, Salerno WJ, Krasheninina O, Smith AV, Lane JA, Blackwell T, Kang HM, Salvi S, Meng Q, Shen H, Pasham D, Bhamidipati S, Kottapalli K, Arnett DK, Ashley-Koch A, Auer PL, Beutel KM, Bis JC, Blangero J, Bowden DW, Brody JA, Cade BE, Chen YDI, Cho MH, Curran JE, Fornage M, Freedman BI, Fingerlin T, Gelb BD, Hou L, Hung YJ, Kane JP, Kaplan R, Kim W, Loos RJ, Marcus GM, Mathias RA, McGarvey ST, Montgomery C, Naseri T, Nouraie SM, Preuss MH, Palmer ND, Peyser PA, Raffield LM, Ratan A, Redline S, Reupena S, Rotter JI, Rich SS, Rienstra M, Ruczinski I, Sankaran VG, Schwartz DA, Seidman CE, Seidman JG, Silverman EK, Smith JA, Stilp A, Taylor KD, Telen MJ, Weiss ST, Williams LK, Wu B, Yanek LR, Zhang Y, Lasky-Su J, Gingras MC, Dutcher SK, Eichler EE, Gabriel S, Germer S, Kim R, Viaud-Martinez KA, Nickerson DA, Luo J, Reiner A, Gibbs RA, Boerwinkle E, Abecasis G, Sedlazeck FJ. Structural variation across 138,134 samples in the TOPMed consortium. bioRxiv 2023:2023.01.25.525428. [PMID: 36747810 PMCID: PMC9900832 DOI: 10.1101/2023.01.25.525428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2023]
Abstract
Ever larger Structural Variant (SV) catalogs highlighting the diversity within and between populations help researchers better understand the links between SVs and disease. The identification of SVs from DNA sequence data is non-trivial and requires a balance between comprehensiveness and precision. Here we present a catalog of 355,667 SVs (59.34% novel) across autosomes and the X chromosome (50bp+) from 138,134 individuals in the diverse TOPMed consortium. We describe our methodologies for SV inference resulting in high variant quality and >90% allele concordance compared to long-read de-novo assemblies of well-characterized control samples. We demonstrate utility through significant associations between SVs and various important cardio-metabolic and hematologic traits. We have identified 690 SV hotspots and deserts and those that potentially impact the regulation of medically relevant genes. This catalog characterizes SVs across multiple populations and will serve as a valuable tool to understand the impact of SV on disease development and progression.
Affiliation(s)
- Goo Jun
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston
| | - Adam C English
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Ginger A Metcalf
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Jianzhi Yang
- University of Southern California, Los Angeles, CA, USA
| | | | | | - Vipin K Menon
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | | | | | - Albert V Smith
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI
| | - John A Lane
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI
| | - Tom Blackwell
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI
| | - Hyun Min Kang
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI
| | - Sejal Salvi
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Qingchang Meng
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Hua Shen
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Divya Pasham
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Sravya Bhamidipati
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Kavya Kottapalli
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Donna K. Arnett
- Department of Epidemiology, University of Kentucky College of Public Health
| | - Allison Ashley-Koch
- Department of Medicine, Duke University Medical Center, Durham, NC
- Duke Molecular Physiology Institute, Duke University Medical Center, Durham, NC
| | - Paul L. Auer
- Division of Biostatistics and Cancer Center, Medical College of Wisconsin, Milwaukee WI
| | | | - Joshua C. Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - John Blangero
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas, Rio Grande Valley School of Medicine, Brownsville, TX
| | - Donald W. Bowden
- Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Jennifer A. Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Brian E. Cade
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA
| | - Yii-Der Ida Chen
- Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA USA
| | - Michael H. Cho
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Joanne E. Curran
- Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Myriam Fornage
- Brown Foundation Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX
| | - Barry I. Freedman
- Department of Internal Medicine, Section on Nephrology, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Tasha Fingerlin
- Center for Genes, Environment and Health, National Jewish Health, 1400 Jackson St., Denver, CO, 80206, USA
| | - Bruce D. Gelb
- Mindich Child Health and Development Institute and the Departments of Pediatrics and Genetics & Genomic Sciences, Icahn School of Medicine at Mount Sinai
| | | | - Yi-Jen Hung
- Institute of Preventive Medicine, National Defense Medical Center, Taiwan
| | - John P Kane
- Cardiovascular Research Institute, University of California, San Francisco
| | - Robert Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Wonji Kim
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Ruth J.F. Loos
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY
| | - Gregory M Marcus
- Division of Cardiology, University of California, San Francisco CA
| | - Rasika A. Mathias
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD
| | - Stephen T. McGarvey
- Department of Epidemiology, International Health Institute and Department of Anthropology, Brown University
| | - Courtney Montgomery
- Genes and Human Disease Research Program, Oklahoma Medical Research Foundation
| | - Take Naseri
- Ministry of Health, Government of Samoa, Apia, Samoa
| | - S. Mehdi Nouraie
- Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15213
| | - Michael H. Preuss
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY
| | | | - Patricia A. Peyser
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI USA
| | | | - Aakrosh Ratan
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA USA
| | - Susan Redline
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA
| | | | - Jerome I. Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA USA
- Department of Cardiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Stephen S. Rich
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI USA
| | - Michiel Rienstra
- Department of Cardiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Ingo Ruczinski
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
| | - Vijay G. Sankaran
- Division of Hematology/Oncology, Boston Children's Hospital and Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02115
- Broad Institute of MIT and Harvard, Cambridge, MA 02142
| | | | - Christine E. Seidman
- Department of Genetics, Harvard Medical School
- Cardiovascular Division, Brigham & Women’s Hospital, Harvard University
- Howard Hughes Medical Institute, Harvard University
| | | | - Edwin K. Silverman
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA
| | - Jennifer A. Smith
- Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15213
| | - Adrienne Stilp
- Department of Biostatistics, University of Washington, Seattle, WA
| | - Kent D. Taylor
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA USA
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA USA
| | - Marilyn J. Telen
- Department of Medicine, Duke University Medical Center, Durham, NC
| | - Scott T. Weiss
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - L. Keoki Williams
- Center for Individualized and Genomic Medicine Research (CIGMA), Department of Internal Medicine, Henry Ford Health System, Detroit, Michigan, United States of America
| | - Baojun Wu
- Center for Individualized and Genomic Medicine Research (CIGMA), Department of Internal Medicine, Henry Ford Health System, Detroit, Michigan, United States of America
| | - Lisa R. Yanek
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY
| | - Yingze Zhang
- Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15213
| | - Jessica Lasky-Su
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | | | - Susan K. Dutcher
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington, USA
| | | | | | - Ryan Kim
- Psomagen, Inc., Rockville, Maryland, USA
| | | | | | | | - James Luo
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Alex Reiner
- Department of Epidemiology, University of Washington, Seattle, WA 98109, USA
| | - Richard A Gibbs
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Eric Boerwinkle
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Goncalo Abecasis
- Regeneron Genetics Center
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI
| | - Fritz J Sedlazeck
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
- Department of Computer Science, Rice University, 6100 Main Street, Houston, TX, 77005, USA
| |
|
27
|
Wheeler MM, Stilp AM, Rao S, Halldórsson BV, Beyter D, Wen J, Mihkaylova AV, McHugh CP, Lane J, Jiang MZ, Raffield LM, Jun G, Sedlazeck FJ, Metcalf G, Yao Y, Bis JB, Chami N, de Vries PS, Desai P, Floyd JS, Gao Y, Kammers K, Kim W, Moon JY, Ratan A, Yanek LR, Almasy L, Becker LC, Blangero J, Cho MH, Curran JE, Fornage M, Kaplan RC, Lewis JP, Loos RJF, Mitchell BD, Morrison AC, Preuss M, Psaty BM, Rich SS, Rotter JI, Tang H, Tracy RP, Boerwinkle E, Abecasis GR, Blackwell TW, Smith AV, Johnson AD, Mathias RA, Nickerson DA, Conomos MP, Li Y, Þorsteinsdóttir U, Magnússon MK, Stefansson K, Pankratz ND, Bauer DE, Auer PL, Reiner AP. Whole genome sequencing identifies structural variants contributing to hematologic traits in the NHLBI TOPMed program. Nat Commun 2022; 13:7592. [PMID: 36481753 PMCID: PMC9732337 DOI: 10.1038/s41467-022-35354-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/21/2021] [Accepted: 11/29/2022] [Indexed: 12/13/2022]
Abstract
Genome-wide association studies have identified thousands of single nucleotide variants and small indels that contribute to variation in hematologic traits. While structural variants are known to cause rare blood or hematopoietic disorders, the genome-wide contribution of structural variants to quantitative blood cell trait variation is unknown. Here we utilized whole genome sequencing data in ancestrally diverse participants of the NHLBI Trans Omics for Precision Medicine program (N = 50,675) to detect structural variants associated with hematologic traits. Using single variant tests, we assessed the association of common and rare structural variants with red cell-, white cell-, and platelet-related quantitative traits and observed 21 independent signals (12 common and 9 rare) reaching genome-wide significance. The majority of these associations (N = 18) replicated in independent datasets. In genome-editing experiments, we provide evidence that a deletion associated with lower monocyte counts leads to disruption of an S1PR3 monocyte enhancer and decreased S1PR3 expression.
Affiliation(s)
- Marsha M Wheeler
- Department of Genome Sciences, University of Washington, Seattle, WA, 98105, USA
| | - Adrienne M Stilp
- Department of Biostatistics, University of Washington, Seattle, WA, 98105, USA
| | - Shuquan Rao
- Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA, 02115, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Harvard Stem Cell Institute, Boston, MA, 02138, USA
- Broad Institute, Cambridge, MA, 02142, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, 02115, USA
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Tianjin, 300020, China
| | - Bjarni V Halldórsson
- deCODE genetics/Amgen Inc., Reykjavik, Iceland
- School of Technology, Reykjavik University, Reykjavík, Iceland
| | | | - Jia Wen
- Departments of Biostatistics, Genetics, Computer Science, Applied Physical Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Anna V Mihkaylova
- Department of Biostatistics, University of Washington, Seattle, WA, 98105, USA
| | - Caitlin P McHugh
- Department of Biostatistics, University of Washington, Seattle, WA, 98105, USA
| | - John Lane
- Department of Laboratory Medicine and Pathology, University of Minnesota Medical School, Minneapolis, MN, 55455, USA
| | - Min-Zhi Jiang
- Departments of Biostatistics, Genetics, Computer Science, Applied Physical Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Laura M Raffield
- Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Goo Jun
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Ginger Metcalf
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Yao Yao
- Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA, 02115, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Harvard Stem Cell Institute, Boston, MA, 02138, USA
- Broad Institute, Cambridge, MA, 02142, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, 02115, USA
| | - Joshua B Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, 98101, USA
| | - Nathalie Chami
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Paul S de Vries
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Pinkal Desai
- Division of Hematology and Oncology, Weill Cornell Medical College, New York, NY, 10065, USA
| | - James S Floyd
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, 98101, USA
| | - Yan Gao
- Jackson Heart Study, Department of Medicine, University of Mississippi, Jackson, MS, 39216, USA
| | - Kai Kammers
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Wonji Kim
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Jee-Young Moon
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
| | - Aakrosh Ratan
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, 22908, USA
| | - Lisa R Yanek
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Laura Almasy
- Children's Hospital of Philadelphia and University of Pennsylvania School of Medicine, Philadelphia, PA, 19104, USA
| | - Lewis C Becker
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - John Blangero
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, 78520, USA
| | - Michael H Cho
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Joanne E Curran
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, 78520, USA
| | - Myriam Fornage
- Brown Foundation Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Robert C Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
| | - Joshua P Lewis
- Department of Medicine, Division of Endocrinology, Diabetes, and Nutrition, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Ruth J F Loos
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Braxton D Mitchell
- Department of Medicine, Division of Endocrinology, Diabetes, and Nutrition, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Alanna C Morrison
- Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Michael Preuss
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, 98101, USA
| | - Stephen S Rich
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, 22908, USA
| | - Jerome I Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, 90502, USA
| | - Hua Tang
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, 94305, USA
| | - Russell P Tracy
- Departments of Pathology & Laboratory Medicine and Biochemistry, Larner College of Medicine at the University of Vermont, Colchester, VT, 05446, USA
| | - Eric Boerwinkle
- Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Goncalo R Abecasis
- TOPMed Informatics Research Center, University of Michigan, Department of Biostatistics, Ann Arbor, MI, 48109, USA
| | - Thomas W Blackwell
- TOPMed Informatics Research Center, University of Michigan, Department of Biostatistics, Ann Arbor, MI, 48109, USA
| | - Albert V Smith
- TOPMed Informatics Research Center, University of Michigan, Department of Biostatistics, Ann Arbor, MI, 48109, USA
| | - Andrew D Johnson
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung and Blood Institute, Framingham, MA, 01702, USA
| | - Rasika A Mathias
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Deborah A Nickerson
- Department of Genome Sciences, University of Washington, Seattle, WA, 98105, USA
| | - Matthew P Conomos
- Department of Biostatistics, University of Washington, Seattle, WA, 98105, USA
| | - Yun Li
- Departments of Biostatistics, Genetics, Computer Science, Applied Physical Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Unnur Þorsteinsdóttir
- deCODE genetics/Amgen Inc., Reykjavik, Iceland
- Faculty of Medicine, University of Iceland, 101, Reykjavik, Iceland
| | - Magnús K Magnússon
- deCODE genetics/Amgen Inc., Reykjavik, Iceland
- Faculty of Medicine, University of Iceland, 101, Reykjavik, Iceland
| | - Kari Stefansson
- deCODE genetics/Amgen Inc., Reykjavik, Iceland
- Faculty of Medicine, University of Iceland, 101, Reykjavik, Iceland
| | - Nathan D Pankratz
- Department of Laboratory Medicine and Pathology, University of Minnesota Medical School, Minneapolis, MN, 55455, USA
| | - Daniel E Bauer
- Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA, 02115, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Harvard Stem Cell Institute, Boston, MA, 02138, USA
- Broad Institute, Cambridge, MA, 02142, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, 02115, USA
| | - Paul L Auer
- Division of Biostatistics, Institute for Health and Equity, and Cancer Center, Medical College of Wisconsin, Milwaukee, WI, 53226, USA.
| | - Alex P Reiner
- Department of Epidemiology, University of Washington, Seattle, WA, 98105, USA.
| |
|
28
|
Wang T, Liu J, Chen J, Qin B. Generation and Differentiation of Induced Pluripotent Stem Cells from Mononuclear Cells in An Age-Related Macular Degeneration Patient. CELL JOURNAL 2022; 24:764-773. [PMID: 36527349 PMCID: PMC9790072 DOI: 10.22074/cellj.2022.557559.1072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Indexed: 01/05/2023]
Abstract
OBJECTIVE We aimed to generate induced pluripotent stem cell (iPSC)-derived retinal pigmented epithelium (RPE) cells from the peripheral blood mononuclear cells (PBMCs) of an age-related macular degeneration (AMD) patient to provide potential cell sources for both basic scientific research and clinical application. MATERIALS AND METHODS In this experimental study, PBMCs were isolated from the whole blood of a 70-year-old female patient with AMD and reprogrammed into iPSCs by transfection of Sendai virus that contained Yamanaka factors (OCT4, SOX2, KLF4, and c-MYC). Flow cytometry, real-time quantitative polymerase chain reaction (qPCR), karyotype analysis, embryoid body (EB) formation, and teratoma detection were performed to confirm that AMD-iPSCs exhibited full pluripotency and maintained a normal karyotype after reprogramming. AMD-iPSCs were induced into RPE cells by stepwise induced differentiation, and specific markers of RPE cells were examined by immunofluorescence and flow cytometry. RESULTS The iPSC colonies started to form three weeks post-infection. AMD-iPSCs exhibited typical morphology including roundness, a large nucleus, sparse cytoplasm, and conspicuous nucleoli. qPCR data showed that AMD-iPSCs expressed pluripotency markers (endo-OCT4, endo-SOX2, NANOG and REX1). Flow cytometry indicated that 99.7% of generated iPSCs were TRA-1-60 positive. Methylation sequencing showed that the OCT4 and NANOG promoter regions were demethylated in iPSCs. EB and teratoma formation assays showed that iPSCs had strong differentiation potential and pluripotency. After a series of inductions with differentiation media, a monolayer of AMD-iPSC-RPE cells was observed on day 50. The AMD-iPSC-RPEs highly expressed specific RPE markers (MITF, ZO-1, Bestrophin, and PMEL17). CONCLUSION High-quality iPSCs could be established from the PBMCs obtained from an elderly AMD patient. The AMD-iPSCs displayed complete pluripotency, enabling scientific study, disease modeling, pharmacological testing, and therapeutic applications in personalized medicine. Collectively, we successfully differentiated the iPSCs into RPE with native RPE characteristics, which might provide potential regenerative treatments for AMD patients.
Affiliation(s)
- Tongmiao Wang
- Shenzhen Aier Eye Hospital, Shenzhen, China; Aier Eye Hospital, Jinan University, Shenzhen, China; Shenzhen Aier Ophthalmic Technology Institute, Shenzhen, China
| | - Jingwen Liu
- Shenzhen Aier Eye Hospital, Shenzhen, China; Aier Eye Hospital, Jinan University, Shenzhen, China; Shenzhen Aier Ophthalmic Technology Institute, Shenzhen, China
| | - Jianhua Chen
- Shenzhen Aier Eye Hospital, Shenzhen, China; Aier Eye Hospital, Jinan University, Shenzhen, China; Shenzhen Aier Ophthalmic Technology Institute, Shenzhen, China; Aier Eye Hospital Group, Changsha, China. *Corresponding Address: Shenzhen Aier Eye Hospital, Shenzhen, China
| | - Bo Qin
- Shenzhen Aier Eye Hospital, Shenzhen, China; Aier Eye Hospital, Jinan University, Shenzhen, China; Shenzhen Aier Ophthalmic Technology Institute, Shenzhen, China; Aier Eye Hospital Group, Changsha, China. *Corresponding Address: Shenzhen Aier Eye Hospital, Shenzhen, China
| |
|
29
|
Comparative analysis of microsatellites in coding regions provides insights into the adaptability of the giant panda, polar bear and brown bear. Genetica 2022; 150:355-366. [DOI: 10.1007/s10709-022-00173-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2021] [Accepted: 09/13/2022] [Indexed: 11/27/2022]
|
30
|
Wang Y, Ling Y, Gong J, Zhao X, Zhou H, Xie B, Lou H, Zhuang X, Jin L, Fan S, Zhang G, Xu S. PGG.SV: a whole-genome-sequencing-based structural variant resource and data analysis platform. Nucleic Acids Res 2022; 51:D1109-D1116. [PMID: 36243989 PMCID: PMC9825616 DOI: 10.1093/nar/gkac905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2022] [Revised: 09/21/2022] [Accepted: 10/04/2022] [Indexed: 01/30/2023]
Abstract
Structural variations (SVs) play important roles in human evolution and diseases, but there is a lack of data resources concerning representative samples, especially for East Asians. Taking advantage of both next-generation sequencing and third-generation sequencing data at the whole-genome level, we developed the database PGG.SV to provide a practical platform for both regionally and globally representative structural variants. In its current version, PGG.SV archives 584 277 SVs obtained from whole-genome sequencing data of 6048 samples, including 1030 long-read sequencing genomes representing 177 global populations. PGG.SV provides (i) high-quality SVs with fine-scale and precise genomic locations in both GRCh37 and GRCh38, covering underrepresented SVs in existing sequencing and microarray data; (ii) hierarchical estimation of SV prevalence in geographical populations; (iii) informative annotations of SV-related genes, potential functions and clinical effects; (iv) an analysis platform to facilitate SV-based case-control association studies and (v) various visualization tools for understanding the SV structures in the human genome. Taken together, PGG.SV provides a user-friendly online interface, easy-to-use analysis tools and a detailed presentation of results. PGG.SV is freely accessible via https://www.biosino.org/pggsv.
Affiliation(s)
| | | | | | - Xiaohan Zhao
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, Collaborative Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai 200438, China; Human Phenome Institute, Zhangjiang Fudan International Innovation Center, and Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai 201203, China
| | - Hanwen Zhou
- Key Laboratory of Computational Biology, National Genomics Data Center & Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Bo Xie
- Key Laboratory of Computational Biology, National Genomics Data Center & Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Haiyi Lou
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, Collaborative Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai 200438, China
| | - Xinhao Zhuang
- Key Laboratory of Computational Biology, National Genomics Data Center & Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Li Jin
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, Collaborative Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai 200438, China; Human Phenome Institute, Zhangjiang Fudan International Innovation Center, and Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai 201203, China
| | | | - Shaohua Fan
- Correspondence may also be addressed to Shaohua Fan.
| | - Guoqing Zhang
- Correspondence may also be addressed to Guoqing Zhang.
| | - Shuhua Xu
- To whom correspondence should be addressed. Tel: +86 21 31246617; Fax: +86 21 31246617;
| |
|
31
|
Lye Z, Choi JY, Purugganan MD. Deleterious mutations and the rare allele burden on rice gene expression. Mol Biol Evol 2022; 39:6693943. [PMID: 36073358 PMCID: PMC9512150 DOI: 10.1093/molbev/msac193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Deleterious genetic variation is maintained in populations at low frequencies. Under a model of stabilizing selection, rare (and presumably deleterious) genetic variants are associated with increase or decrease in gene expression from some intermediate optimum. We investigate this phenomenon in a population of largely Oryza sativa ssp. indica rice landraces under normal unstressed wet and stressful drought field conditions. We include single nucleotide polymorphisms, insertion/deletion mutations, and structural variants in our analysis and find a stronger association between rare variants and gene expression outliers under the stress condition. We also show an association of the strength of this rare variant effect with linkage, gene expression levels, network connectivity, local recombination rate, and fitness consequence scores, consistent with the stabilizing selection model of gene expression.
Affiliation(s)
- Zoe Lye
- Center for Genomics and Systems Biology, New York University, New York, NY 10003
| | - Jae Young Choi
- Center for Genomics and Systems Biology, New York University, New York, NY 10003
| | - Michael D Purugganan
- Center for Genomics and Systems Biology, New York University, New York, NY 10003; Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
| |
|
32
|
Berthold N, Pytte J, Bulik CM, Tschochner M, Medland SE, Akkari PA. Bridging the gap: Short structural variants in the genetics of anorexia nervosa. Int J Eat Disord 2022; 55:747-753. [PMID: 35470453 PMCID: PMC9545787 DOI: 10.1002/eat.23716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Revised: 03/30/2022] [Accepted: 03/31/2022] [Indexed: 11/07/2022]
Abstract
Anorexia nervosa (AN) is a devastating disorder with evidence of underexplored heritability. Twin and family studies estimate heritability (h²) to be 57%-64%, and genome-wide association studies (GWAS) reveal significant genetic correlations with psychiatric and anthropometric traits and a total of nine genome-wide significant loci. Whether significantly associated single nucleotide polymorphisms identified by GWAS are causal or tag true causal variants remains to be elucidated. We propose a novel method for bridging this knowledge gap by fine-mapping short structural variants (SSVs) in and around GWAS-identified loci. SSV fine-mapping of loci associated with complex disorders such as schizophrenia, amyotrophic lateral sclerosis, and Alzheimer's disease has uncovered genetic risk markers, phenotypic variability between patients, new pathological mechanisms, and potential therapeutic targets. We analyze previous investigations' methods and propose utilizing an evaluation algorithm to prioritize 10 SSVs for each of the top two AN GWAS-identified loci, followed by Sanger sequencing and fragment analysis via capillary electrophoresis to characterize these SSVs for case/control association studies. The success of previous SSV analyses in complex disorders and the effective utilization of similar methodologies support our proposed method. Furthermore, the structural and spatial properties of the 10 SSVs identified for each of the top two AN GWAS-associated loci, cell adhesion molecule 1 (CADM1) and NCK interacting protein with SH3 domain (NCKIPSD), are similar to those in previous studies. We propose that SSV fine-mapping of AN-associated loci will identify causal genetic architecture. A deeper understanding of AN may lead to novel therapeutic targets and subsequently increase quality-of-life for individuals living with the illness. PUBLIC SIGNIFICANCE STATEMENT: Anorexia nervosa is a severe and complex illness, arising from a combination of environmental and genetic factors. Recent studies estimate the contribution of genetic variability; however, the specific DNA sequences and how they contribute remain unknown. We present a novel approach, arguing that the genetic variant class, short structural variants, could answer this knowledge gap and allow development of biologically targeted therapeutics, improving quality-of-life and patient outcomes for affected individuals.
Affiliation(s)
- Natasha Berthold
- School of Nursing, Midwifery, Health Sciences & Physiotherapy, University of Notre Dame Australia, Fremantle, Western Australia, Australia
- Perron Institute for Neurological and Translational Science, Nedlands, Western Australia, Australia
- School of Human Sciences, University of Western Australia, Crawley, Western Australia, Australia
| | - Julia Pytte
- Perron Institute for Neurological and Translational Science, Nedlands, Western Australia, Australia
- School of Human Sciences, University of Western Australia, Crawley, Western Australia, Australia
| | - Cynthia M. Bulik
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Department of Nutrition, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Monika Tschochner
- School of Nursing, Midwifery, Health Sciences & Physiotherapy, University of Notre Dame Australia, Fremantle, Western Australia, Australia
| | - Sarah E. Medland
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Patrick Anthony Akkari
- Perron Institute for Neurological and Translational Science, Nedlands, Western Australia, Australia
- Centre for Molecular Medicine and Innovative Therapeutics, Murdoch University, Perth, Western Australia, Australia
- Centre for Neuromuscular and Neurological Disorders, University of Western Australia, Nedlands, Western Australia, Australia
- Department of Neurology, Duke University, Durham, North Carolina
| |
|
33
|
Liu Z, Zhao G, Xiao Y, Zeng S, Yuan Y, Zhou X, Fang Z, He R, Li B, Zhao Y, Pan H, Wang Y, Yu G, Peng IF, Wang D, Meng Q, Xu Q, Sun Q, Yan X, Shen L, Jiang H, Xia K, Wang J, Guo J, Liang F, Li J, Tang B. Profiling the Genome-Wide Landscape of Short Tandem Repeats by Long-Read Sequencing. Front Genet 2022; 13:810595. [PMID: 35601492 PMCID: PMC9117641 DOI: 10.3389/fgene.2022.810595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2021] [Accepted: 03/30/2022] [Indexed: 11/23/2022]
Abstract
Background: Short tandem repeats (STRs) are highly variable elements that play a pivotal role in multiple genetic diseases and the regulation of gene expression. Long-read sequencing (LRS) offers a potential solution to genome-wide STR analysis. However, characterizing STRs in human genomes using LRS on a large population scale has not been reported. Methods: We conducted a large LRS-based STR analysis in 193 unrelated samples of the Chinese population and performed genome-wide profiling of STR variation in the human genome. The repeat dynamic index (RDI) was introduced to evaluate the variability of STRs. We sourced expression data from the Genotype-Tissue Expression (GTEx) project to explore the tissue specificity of genes related to highly variable STRs. Enrichment analyses were also conducted to identify potential functional roles of the highly variable STRs. Results: This study reports a large-scale analysis of human STR variation by LRS and offers a reference STR database based on the LRS dataset. We found that disease-associated STRs (dSTRs) and STRs associated with the expression of nearby genes (eSTRs) were highly variable in the general population. Moreover, tissue-specific expression analysis showed that genes related to highly variable STRs had their highest expression levels in brain tissues, and pathway enrichment analysis found that those STRs are involved in synaptic function-related pathways. Conclusion: Our study profiled the genome-wide landscape of STRs using LRS and highlighted the highly variable STRs in the human genome, providing a valuable resource for studying the role of STRs in human disease and complex traits.
Affiliation(s)
- Zhenhua Liu
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Guihu Zhao
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Sheng Zeng
  - Department of Geriatrics, The Second Xiangya Hospital, Central South University, Changsha, China
- Yanchun Yuan
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Xun Zhou
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Zhenghuan Fang
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Runcheng He
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Bin Li
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Yuwen Zhao
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Hongxu Pan
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Yige Wang
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Qingtuan Meng
  - Multi-Omics Research Center for Brain Disorders, The First Affiliated Hospital of University of South China, Hengyang, China
- Qian Xu
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Qiying Sun
  - Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, China
- Xinxiang Yan
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Lu Shen
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Hong Jiang
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
  - Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, China
- Kun Xia
  - Centre for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- Junling Wang
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Jifeng Guo
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Fan Liang
  - GrandOmics Biosciences, Beijing, China
- Jinchen Li
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
  - Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, China
  - Centre for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- Beisha Tang
  - Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
  - Multi-Omics Research Center for Brain Disorders, The First Affiliated Hospital of University of South China, Hengyang, China
  - Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, China

*Correspondence: Beisha Tang; Jinchen Li; Fan Liang
34
Quan C, Lu H, Lu Y, Zhou G. Population-scale genotyping of structural variation in the era of long-read sequencing. Comput Struct Biotechnol J 2022; 20:2639-2647. [PMID: 35685364 PMCID: PMC9163579 DOI: 10.1016/j.csbj.2022.05.047] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/30/2022] [Revised: 05/24/2022] [Accepted: 05/24/2022] [Indexed: 11/29/2022]
Abstract
Population-scale studies of structural variation (SV) are growing rapidly worldwide with the development of long-read sequencing technology, yielding a considerable number of novel SVs and complete gap-closed genome assemblies. Herein, we highlight recent studies using a hybrid sequencing strategy and present the challenges toward large-scale genotyping for SVs due to the reference bias. Genotyping SVs at a population scale remains challenging, which severely impacts genotype-based population genetic studies or genome-wide association studies of complex diseases. We summarize academic efforts to improve genotype quality through linear or graph representations of reference and alternative alleles. Graph-based genotypers capable of integrating diverse genetic information are effectively applied to large and diverse cohorts, contributing to unbiased downstream analysis. Meanwhile, there is still an urgent need in this field for efficient tools to construct complex graphs and perform sequence-to-graph alignments.
Affiliation(s)
- Cheng Quan
  - Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, Beijing 100850, PR China
- Hao Lu
  - Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, Beijing 100850, PR China
- Yiming Lu
  - Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, Beijing 100850, PR China
  - Hebei University, Baoding, Hebei Province 071002, PR China
- Gangqiao Zhou
  - Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, Beijing 100850, PR China
  - Collaborative Innovation Center for Personalized Cancer Medicine, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu Province 211166, PR China
  - Medical College of Guizhou University, Guiyang, Guizhou Province 550025, PR China
  - Hebei University, Baoding, Hebei Province 071002, PR China

Corresponding authors at: Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing 100850, PR China (G. Zhou, Y. Lu).
35
Dynamics of nuclear matrix attachment regions during 5th instar posterior silk gland development in Bombyx mori. BMC Genomics 2022; 23:247. [PMID: 35361117 PMCID: PMC8973518 DOI: 10.1186/s12864-022-08446-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2021] [Accepted: 03/06/2022] [Indexed: 12/02/2022]
Abstract
Background: Chromatin architecture is critical for gene expression during development. Matrix attachment regions (MARs) control and regulate chromatin dynamics. The position of MARs in the genome determines the expression of genes in the organism. In this study, we set out to elucidate how MARs temporally regulate the expression of the fibroin heavy chain (FIBH) gene during development. We addressed this by identifying MARs and studying their distribution and differentiation in the posterior silk glands of Bombyx mori during 5th instar development. Results: Of the MARs identified on three different days, 7.15% were common to all 3 days, whereas 1.41%, 19.27% and 52.47% were unique to day 1, day 5 and day 7, respectively, highlighting the dynamic nature of matrix-associated DNA. The average chromatin loop length, based on the chromosome-wise distribution of MARs and the distances between these MAR regions, decreased from day 1 (253.91 kb) to day 5 (73.54 kb) to day 7 (39.19 kb). Furthermore, significant changes in the MARs in the vicinity of the FIBH gene were found across the days of 5th instar development, implying a role in the regulation and expression of the FIBH gene. Conclusions: The presence of MARs in the flanking regions of genes that exhibit differential expression during 5th instar development indicates their possible role in regulating the expression of these genes. This reiterates the importance of MARs in genomic functioning as regulators of molecular mechanisms in the nucleus. This is the first study that takes into account tissue-specific genome-wide MAR association and the potential role of these MARs in developmentally regulated gene expression. The current study lays a foundation for understanding the genome-wide regulation of chromatin during development.
36
Vialle RA, de Paiva Lopes K, Bennett DA, Crary JF, Raj T. Integrating whole-genome sequencing with multi-omic data reveals the impact of structural variants on gene regulation in the human brain. Nat Neurosci 2022; 25:504-514. [PMID: 35288716 PMCID: PMC9245608 DOI: 10.1038/s41593-022-01031-7] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 03/08/2021] [Accepted: 02/07/2022] [Indexed: 11/09/2022]
Abstract
Structural variants (SVs), genomic rearrangements of >50 bp, are an important source of genetic diversity and have been linked to many diseases. However, it remains unclear how they modulate human brain function and disease risk. Here, we report 170,996 SVs discovered using 1,760 short-read whole genomes from aged adults and Alzheimer’s disease individuals. By applying quantitative trait locus (SV-xQTL) analyses, we quantified the impact of cis-acting SVs on histone modifications, gene expression, splicing, and protein abundance in post-mortem brain tissues. More than 3,200 SVs were associated with at least one molecular phenotype. We found reproducibility of 65–99% SV-eQTLs across cohorts and brain regions. SV associations with mRNA and proteins shared the same direction of effect in more than 87% of SV-gene pairs. Mediation analysis showed ~8% of SV-eQTLs mediated by histone acetylation, and ~11% by splicing. Additionally, associations of SVs with progressive supranuclear palsy identified previously known and novel SVs.
Affiliation(s)
- Ricardo A Vialle
  - Nash Family Department of Neuroscience & Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Ronald M. Loeb Center for Alzheimer's Disease, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Genetics and Genomic Sciences & Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Estelle and Daniel Maggin Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- Katia de Paiva Lopes
  - Nash Family Department of Neuroscience & Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Ronald M. Loeb Center for Alzheimer's Disease, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Genetics and Genomic Sciences & Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Estelle and Daniel Maggin Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- David A Bennett
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- John F Crary
  - Nash Family Department of Neuroscience & Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Ronald M. Loeb Center for Alzheimer's Disease, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Towfique Raj
  - Nash Family Department of Neuroscience & Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Ronald M. Loeb Center for Alzheimer's Disease, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Genetics and Genomic Sciences & Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Estelle and Daniel Maggin Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
37
Kasai S, Nishizawa D, Hasegawa J, Fukuda KI, Ichinohe T, Nagashima M, Hayashida M, Ikeda K. Short Tandem Repeat Variation in the CNR1 Gene Associated With Analgesic Requirements of Opioids in Postoperative Pain Management. Front Genet 2022; 13:815089. [PMID: 35360861 PMCID: PMC8963810 DOI: 10.3389/fgene.2022.815089] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/15/2021] [Accepted: 02/02/2022] [Indexed: 11/25/2022]
Abstract
Short tandem repeats (STRs) and variable number of tandem repeats (VNTRs) that have been identified at approximately 0.7 and 0.5 million loci in the human genome, respectively, are highly multi-allelic variations rather than single-nucleotide polymorphisms. The number of repeats of more than a few thousand STRs was associated with the expression of nearby genes, indicating that STRs are influential genetic variations in human traits. Analgesics act on the central nervous system via their intrinsic receptors to produce analgesic effects. In the present study, we focused on STRs and VNTRs in the CNR1, GRIN2A, PENK, and PDYN genes and analyzed two peripheral pain sensation-related traits and seven analgesia-related traits in postoperative pain management. A total of 192 volunteers who underwent the peripheral pain sensation tests and 139 and 252 patients who underwent open abdominal and orthognathic cosmetic surgeries, respectively, were included in the study. None of the four STRs or VNTRs were associated with peripheral pain sensation. Short tandem repeats in the CNR1, GRIN2A, and PENK genes were associated with the frequency of fentanyl use, fentanyl dose, and visual analog scale pain scores 3 h after orthognathic cosmetic surgery (Spearman’s rank correlation coefficient ρ = 0.199, p = 0.002, ρ = 0.174, p = 0.006, and ρ = 0.135, p = 0.033, respectively), analgesic dose, including epidural analgesics after open abdominal surgery (ρ = −0.200, p = 0.018), and visual analog scale pain scores 24 h after orthognathic cosmetic surgery (ρ = 0.143, p = 0.023), respectively. The associations between STRs in the CNR1 gene and the frequency of fentanyl use and fentanyl dose after orthognathic cosmetic surgery were confirmed by Holm’s multiple-testing correction. These findings indicate that STRs in the CNR1 gene influence analgesia in the orofacial region.
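As a rough illustration of the Holm multiple-testing correction mentioned in the abstract above, the sketch below applies Holm's step-down procedure to the five p-values quoted there. Treating these five values as the complete family of tests is an assumption made only for illustration: the abstract does not state which tests were corrected together, so the adjusted values are not the authors' results. The function name `holm_adjust` is hypothetical.

```python
def holm_adjust(pvalues):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank k-1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# ASSUMPTION: the five p-values quoted in the abstract form one test family.
pvals = [0.002, 0.006, 0.033, 0.018, 0.023]
adjusted = holm_adjust(pvals)
significant = [p <= 0.05 for p in adjusted]
```

Under this illustrative family, only the two smallest p-values remain at or below 0.05 after adjustment; a different family size would change the adjusted values.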
Affiliation(s)
- Shinya Kasai
  - Addictive Substance Project, Department of Psychiatry and Behavioral Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- Daisuke Nishizawa
  - Addictive Substance Project, Department of Psychiatry and Behavioral Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- Junko Hasegawa
  - Addictive Substance Project, Department of Psychiatry and Behavioral Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- Ken-ichi Fukuda
  - Department of Oral Health Science, Tokyo Dental College, Tokyo, Japan
- Tatsuya Ichinohe
  - Department of Dental Anesthesiology, Tokyo Dental College, Tokyo, Japan
- Makoto Nagashima
  - Department of Surgery, Toho University Sakura Medical Center, Sakura, Japan
- Masakazu Hayashida
  - Department of Anesthesiology and Pain Medicine, Juntendo University School of Medicine, Tokyo, Japan
- Kazutaka Ikeda
  - Addictive Substance Project, Department of Psychiatry and Behavioral Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan

*Correspondence: Kazutaka Ikeda
38
In heart failure reactivation of RNA-binding proteins is associated with the expression of 1,523 fetal-specific isoforms. PLoS Comput Biol 2022; 18:e1009918. [PMID: 35226669 PMCID: PMC8912908 DOI: 10.1371/journal.pcbi.1009918] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/02/2021] [Revised: 03/10/2022] [Accepted: 02/10/2022] [Indexed: 01/03/2023]
Abstract
Reactivation of fetal-specific genes and isoforms occurs during heart failure. However, the underlying molecular mechanisms and the extent to which the fetal program switch occurs remain unclear. Limitations hindering transcriptome-wide analyses of alternative splicing differences (i.e. isoform switching) in cardiovascular system (CVS) tissues between fetal, healthy adult and heart failure states have included both cellular heterogeneity across bulk RNA-seq samples and limited availability of fetal tissue for research. To overcome these limitations, we deconvoluted the cellular compositions of 996 RNA-seq samples representing heart failure, healthy adult (heart and arteria), and fetal-like (iPSC-derived cardiovascular progenitor cells) CVS tissues. Comparison of the expression profiles revealed that reactivation of fetal-specific RNA-binding proteins (RBPs), and the accompanying re-expression of 1,523 fetal-specific isoforms, contribute to the transcriptome differences between heart failure and healthy adult heart. Of note, isoforms for 20 different RBPs were among those that reverted in heart failure to the fetal-like expression pattern. We determined that, compared with adult-specific isoforms, fetal-specific isoforms encode proteins that tend to have more functions, are more likely to harbor RBP binding sites, have canonical sequences at their splice sites, and contain typical upstream polypyrimidine tracts. Our study suggests that, compared with healthy adult cardiac tissue, fetal cardiac tissue requires stricter transcriptional regulation, and that reversion to this stricter regulation occurs during heart failure. Furthermore, we provide a resource of cardiac developmental stage-specific and heart failure-associated genes and isoforms, which are largely unexplored and can be exploited to investigate novel therapeutics for heart failure.

Heart failure is a chronic condition in which the heart does not pump enough blood. It has been shown that in heart failure, the adult heart reverts to a fetal-like metabolic state and oxygen consumption. Additionally, genes and isoforms that are expressed in the heart only during fetal development (i.e. not in the healthy adult heart) are turned on in heart failure. However, the underlying molecular mechanisms and the extent to which the switch to a fetal gene program occurs remain unclear. In this study, we initially characterized the differences between the fetal and adult heart transcriptomes (the entire set of expressed genes and isoforms). We found that RNA-binding proteins (RBPs), a family of genes that regulate multiple aspects of a transcript's maturation, including transcription, splicing and post-transcriptional modifications, play a central role in the differences between fetal and adult heart tissues. We observed that many RBPs that are expressed in the heart only during fetal development become reactivated in heart failure, resulting in the expression of 1,523 fetal-specific isoforms. These findings suggest that reactivation of fetal-specific RBPs in heart failure drives a transcriptome-wide switch to expression of fetal-specific isoforms, and hence that RBPs could potentially serve as novel therapeutic targets.
39
Wu Z, Gong H, Zhou Z, Jiang T, Lin Z, Li J, Xiao S, Yang B, Huang L. Mapping short tandem repeats for liver gene expression traits helps prioritize potential causal variants for complex traits in pigs. J Anim Sci Biotechnol 2022; 13:8. [PMID: 35034641 PMCID: PMC8762894 DOI: 10.1186/s40104-021-00658-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/15/2021] [Accepted: 11/25/2021] [Indexed: 12/31/2022]
Abstract
Background: Short tandem repeats (STRs) were recently found to have significant impacts on gene expression and diseases in humans, but their roles in gene expression and complex traits in pigs remain unexplored. This study investigates the effects of STRs on gene expression in liver tissues based on the whole-genome sequences and RNA-Seq data of a discovery cohort of 260 F6 individuals and a validation population of 296 F7 individuals from a heterogeneous population generated from crosses among eight pig breeds. Results: We identified 5203 and 5868 significant expression STRs (eSTRs, FDR < 1%) in the F6 and F7 populations, respectively, most of which could be reciprocally validated (π1 = 0.92). The eSTRs explained 27.5% of the cis-heritability of gene expression traits on average. We further identified 235 and 298 fine-mapped STRs through a Bayesian fine-mapping approach in the F6 and F7 pigs, respectively, which were significantly enriched in intron, ATAC peak, compartment A and H3K4me3 regions. We identified 20 fine-mapped STRs located in 100 kb windows upstream and downstream of published complex trait-associated SNPs, which colocalized with epigenetic markers such as H3K27ac and ATAC peaks. These included eSTRs of the CLPB, PGLS, PSMD6 and DHDH genes, which are linked with genome-wide association study (GWAS) SNPs for blood-related traits, leg conformation, growth-related traits, and meat quality traits, respectively. Conclusions: This study provides insights into the effects of STRs on gene expression traits. The identified eSTRs are valuable resources for prioritizing causal STRs for complex traits in pigs.
Affiliation(s)
- Zhongzi Wu
  - State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
- Huanfa Gong
  - State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
- Zhimin Zhou
  - State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
- Tao Jiang
  - State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
- Ziqi Lin
  - State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
- Jing Li
  - State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
- Shijun Xiao
  - State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
- Bin Yang
  - State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
- Lusheng Huang
  - State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
40
Hagemeijer YP, Guryev V, Horvatovich P. Accurate Prediction of Protein Sequences for Proteogenomics Data Integration. Methods Mol Biol 2021; 2420:233-260. [PMID: 34905178 DOI: 10.1007/978-1-0716-1936-0_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
This book chapter discusses proteogenomics data integration and provides an overview of the different omics layers involved in defining the proteome of a living organism. It discusses various aspects of genome variability that affect either the sequence or the abundance of proteins, such as the effect of single-nucleotide variants or larger genomic structural variants on the proteome. Next, various sequencing technologies, including short- and long-read sequencing, are introduced and discussed from a proteogenomics data integration perspective, along with their respective advantages and shortcomings for accurate protein variant prediction from genomic and transcriptomic sequencing data. Finally, the bioinformatics tools used to process and analyze DNA/RNA sequencing data are discussed, with the ultimate goal of obtaining accurately predicted sample-specific protein sequences that can be used as a drop-in replacement in existing approaches for peptide and protein identification with popular database search engines such as MSFragger and SearchGUI/PeptideShaker.
Affiliation(s)
- Yanick Paco Hagemeijer
  - Department of Analytical Biochemistry, University of Groningen, Groningen Research Institute of Pharmacy, Groningen, The Netherlands
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, Groningen, The Netherlands
- Victor Guryev
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, Groningen, The Netherlands
- Peter Horvatovich
  - Department of Analytical Biochemistry, University of Groningen, Groningen Research Institute of Pharmacy, Groningen, The Netherlands
41
Scott AJ, Chiang C, Hall IM. Structural variants are a major source of gene expression differences in humans and often affect multiple nearby genes. Genome Res 2021; 31:2249-2257. [PMID: 34544830 PMCID: PMC8647827 DOI: 10.1101/gr.275488.121] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/06/2021] [Accepted: 09/14/2021] [Indexed: 11/29/2022]
Abstract
Structural variants (SVs) are an important source of human genome diversity, but their functional effects are poorly understood. We mapped 61,668 SVs in 613 individuals from the GTEx project and measured their effects on gene expression. We estimate that common SVs are causal at 2.66% of eQTLs, a 10.5-fold enrichment relative to their abundance in the genome. Duplications and deletions were the most impactful variant types, whereas the contribution of mobile element insertions was small (0.12% of eQTLs, 1.9-fold enriched). Multitissue analysis of eQTLs revealed that gene-altering SVs show more constitutive effects than other variant types, with 62.09% of coding SV-eQTLs active in all tissues with eQTL activity compared with 23.08% of coding SNV- and indel-eQTLs. Noncoding SVs, SNVs and indels show broadly similar patterns. We also identified 539 rare SVs associated with nearby gene expression outliers. Of these, 62.34% are noncoding SVs that affect gene expression but have modest enrichment at regulatory elements, showing that rare noncoding SVs are a major source of gene expression differences but remain difficult to predict from current annotations. Both common and rare SVs often affect the expression of multiple genes: SV-eQTLs affect an average of 1.82 nearby genes, whereas SNV- and indel-eQTLs affect an average of 1.09 genes, and 21.34% of rare expression-altering SVs show effects on two to nine different genes. We also observe significant effects on rare gene expression changes extending 1 Mb from the SV. This provides a mechanism by which individual SVs may have strong or pleiotropic effects on phenotypic variation.
Affiliation(s)
- Alexandra J Scott
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108, USA
  - Department of Medicine, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Colby Chiang
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108, USA
  - Department of Medicine, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Ira M Hall
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108, USA
  - Department of Medicine, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
42
Novel implications of a strictly monomorphic (GCC) repeat in the human PRKACB gene. Sci Rep 2021; 11:20629. [PMID: 34667254 PMCID: PMC8526596 DOI: 10.1038/s41598-021-99932-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/16/2021] [Accepted: 10/05/2021] [Indexed: 02/07/2023]
Abstract
PRKACB (Protein Kinase cAMP-Activated Catalytic Subunit Beta) is predominantly expressed in the brain, and regulation of this gene is linked to neuroprotective effects against tau and Aβ-induced toxicity. Here we studied a (GCC)-repeat spanning the core promoter and 5′ UTR of this gene in 300 human subjects, consisting of individuals with late-onset neurocognitive disorder (NCD) (N = 150) and controls (N = 150). We also implemented several models to study the impact of this repeat on the three-dimensional (3D) structure of DNA. While the PRKACB (GCC)-repeat was strictly monomorphic at 7-repeats, we detected two 7/8 genotypes only in the NCD group. In all examined models, the (GCC)7 and its periodicals had the least range of divergence variation on the 3D structure of DNA in comparison to the 8-repeat periodicals and several hypothetical repeat lengths. A similar inert effect on the 3D structure was not detected in other classes of short tandem repeats (STRs) such as GA and CA repeats. In conclusion, we report monomorphism of a long (GCC)-repeat in the PRKACB gene in humans, its inert effect on DNA structure, and enriched divergence in late-onset NCD. This is the first indication of natural selection for a monomorphic (GCC)-repeat, which probably evolved to function as an “epigenetic knob”, without changing the regional DNA structure.
43
Yan SM, Sherman RM, Taylor DJ, Nair DR, Bortvin AN, Schatz MC, McCoy RC. Local adaptation and archaic introgression shape global diversity at human structural variant loci. eLife 2021; 10:e67615. [PMID: 34528508 PMCID: PMC8492059 DOI: 10.7554/elife.67615] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 02/18/2021] [Accepted: 09/14/2021] [Indexed: 12/13/2022]
Abstract
Large genomic insertions and deletions are a potent source of functional variation, but are challenging to resolve with short-read sequencing, limiting knowledge of the role of such structural variants (SVs) in human evolution. Here, we used a graph-based method to genotype long-read-discovered SVs in short-read data from diverse human genomes. We then applied an admixture-aware method to identify 220 SVs exhibiting extreme patterns of frequency differentiation - a signature of local adaptation. The top two variants traced to the immunoglobulin heavy chain locus, tagging a haplotype that swept to near fixation in certain southeast Asian populations, but is rare in other global populations. Further investigation revealed evidence that the haplotype traces to gene flow from Neanderthals, corroborating the role of immune-related genes as prominent targets of adaptive introgression. Our study demonstrates how recent technical advances can help resolve signatures of key evolutionary events that remained obscured within technically challenging regions of the genome.
Affiliation(s)
- Stephanie M Yan
- Department of Biology, Johns Hopkins University, Baltimore, United States
| | - Rachel M Sherman
- Department of Computer Science, Johns Hopkins University, Baltimore, United States
| | - Dylan J Taylor
- Department of Biology, Johns Hopkins University, Baltimore, United States
| | - Divya R Nair
- Department of Biology, Johns Hopkins University, Baltimore, United States
| | - Andrew N Bortvin
- Department of Biology, Johns Hopkins University, Baltimore, United States
| | - Michael C Schatz
- Department of Biology, Johns Hopkins University, Baltimore, United States
- Department of Computer Science, Johns Hopkins University, Baltimore, United States
| | - Rajiv C McCoy
- Department of Biology, Johns Hopkins University, Baltimore, United States
| |
|
44
|
Hollox EJ, Zuccherato LW, Tucci S. Genome structural variation in human evolution. Trends Genet 2021; 38:45-58. [PMID: 34284881 DOI: 10.1016/j.tig.2021.06.015] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 04/14/2021] [Revised: 06/21/2021] [Accepted: 06/22/2021] [Indexed: 01/01/2023]
Abstract
Structural variation (SV) is a large difference (typically >100 bp) in the genomic structure of two genomes and includes both copy number variation and variation that does not change copy number of a genomic region, such as an inversion. Improved reference genomes, combined with widespread genome sequencing using short-read sequencing technology, and increasingly using long-read sequencing, have reignited interest in SV. Recent large-scale studies and functional focused analyses have highlighted the role of SV in human evolution. In this review, we highlight human-specific SVs involved in changes in the brain, population-specific SVs that affect response to the environment, including adaptation to diet and infectious diseases, and summarise the contribution of archaic hominin admixture to present-day human SV.
Affiliation(s)
- Edward J Hollox
- Department of Genetics and Genome Biology, University of Leicester, UK.
| | - Luciana W Zuccherato
- Núcleo de Ensino e Pesquisa, Instituto Mário Penna, Belo Horizonte, Brazil; Departamento de Bioquímica e Imunologia, Universidade de Minas Gerais, Belo Horizonte, Brazil
| | - Serena Tucci
- Department of Anthropology, Yale University, New Haven, CT, USA
| |
|
45
|
Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network. Nat Commun 2021; 12:3297. [PMID: 34078885 PMCID: PMC8172540 DOI: 10.1038/s41467-021-23143-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/15/2020] [Accepted: 04/13/2021] [Indexed: 02/04/2023] Open
Abstract
Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism.
|
46
|
Tang H, He Z. Advances and challenges in quantitative delineation of the genetic architecture of complex traits. Quantitative Biology 2021; 9:168-184. [PMID: 35492964 PMCID: PMC9053444 DOI: 10.15302/j-qb-021-0249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Background: Genome-wide association studies (GWAS) have been widely adopted in studies of human complex traits and diseases. Results: This review surveys areas of active research: quantifying and partitioning trait heritability, fine mapping functional variants and integrative analysis, genetic risk prediction of phenotypes, and the analysis of sequencing studies that have identified millions of rare variants. Current challenges and opportunities are highlighted. Conclusion: GWAS have fundamentally transformed the field of human complex trait genetics. Novel statistical and computational methods have expanded the scope of GWAS and have provided valuable insights on the genetic architecture underlying complex phenotypes.
Affiliation(s)
- Hua Tang
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Zihuai He
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA 94305, USA
- Quantitative Sciences Unit, Department of Medicine, Stanford University, Stanford, CA 94305, USA
| |
|
47
|
Quan C, Li Y, Liu X, Wang Y, Ping J, Lu Y, Zhou G. Characterization of structural variation in Tibetans reveals new evidence of high-altitude adaptation and introgression. Genome Biol 2021; 22:159. [PMID: 34034800 PMCID: PMC8146648 DOI: 10.1186/s13059-021-02382-3] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 12/23/2020] [Accepted: 05/14/2021] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND: Structural variation (SV) acts as an essential mutational force shaping the evolution and function of the human genome. However, few studies have examined the role of SVs in high-altitude adaptation and little is known of adaptive introgressed SVs in Tibetans so far. RESULTS: Here, we generate a comprehensive catalog of SVs in a Chinese Tibetan (n = 15) and Han (n = 10) population using nanopore sequencing technology. Among a total of 38,216 unique SVs in the catalog, 27% are sequence-resolved for the first time. We systematically assess the distribution of these SVs across repeat sequences and functional genomic regions. Through genotyping in an additional 276 genomes, we identify 69 Tibetan-Han stratified SVs and 80 candidate adaptive genes. We also discover a few adaptive introgressed SV candidates and provide evidence for a deletion of 335 base pairs at 1p36.32. CONCLUSIONS: Overall, our results highlight the important role of SVs in the evolutionary processes of Tibetans' adaptation to the Qinghai-Tibet Plateau and provide a valuable resource for future high-altitude adaptation studies.
Affiliation(s)
- Cheng Quan
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Yuanfeng Li
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Xinyi Liu
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Yahui Wang
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Jie Ping
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Yiming Lu
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Hebei University, Baoding, Hebei Province 071002 People’s Republic of China
| | - Gangqiao Zhou
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Hebei University, Baoding, Hebei Province 071002 People’s Republic of China
- Collaborative Innovation Center for Personalized Cancer Medicine, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu Province 211166 People’s Republic of China
- Medical College of Guizhou University, Guiyang, Guizhou Province 550025 People’s Republic of China
| |
|
48
|
Wu Z, Gong H, Zhang M, Tong X, Ai H, Xiao S, Perez-Enciso M, Yang B, Huang L. A worldwide map of swine short tandem repeats and their associations with evolutionary and environmental adaptations. Genet Sel Evol 2021; 53:39. [PMID: 33892623 PMCID: PMC8063339 DOI: 10.1186/s12711-021-00631-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/06/2020] [Accepted: 04/09/2021] [Indexed: 11/10/2022] Open
Abstract
Background: Short tandem repeats (STRs) are genetic markers with a greater mutation rate than single nucleotide polymorphisms (SNPs) and are widely used in genetic studies and forensics. However, most studies in pigs have focused only on SNPs or on a limited number of STRs. Results: This study screened 394 deep-sequenced genomes from 22 domesticated pig breeds/populations worldwide, wild boars from both Europe and Asia, and numerous outgroup Suidaes, and identified a set of 878,967 polymorphic STRs (pSTRs), which represents the largest repository of pSTRs in pigs to date. We found multiple lines of evidence that pSTRs in coding regions were affected by purifying selection. The enrichment of trinucleotide pSTRs in coding sequences (CDS), 5′UTR and H3K4me3 regions suggests that trinucleotide STRs serve as important components in the exons and promoters of the corresponding genes. We demonstrated that, compared to SNPs, pSTRs provide comparable or even greater accuracy in determining the breed identity of individuals. We identified pSTRs that showed significant population differentiation between domestic pigs and wild boars in Asia and Europe. We also observed that some pSTRs were significantly associated with environmental variables, such as average annual temperature or altitude of the originating sites of Chinese indigenous breeds, among which we identified loss-of-function and/or expanded STRs overlapping with genes such as AHR, LAS1L and PDK1. Finally, our results revealed that several pSTRs show stronger signals in domestic pig-wild boar differentiation or association with the analysed environmental variables than the flanking SNPs within a 100-kb window. Conclusions: This study provides a genome-wide high-density map of pSTRs in diverse pig populations based on genome sequencing data, enabling a more comprehensive characterization of their roles in evolutionary and environmental adaptation.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12711-021-00631-4.
Affiliation(s)
- Zhongzi Wu
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Huanfa Gong
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Mingpeng Zhang
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Xinkai Tong
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Huashui Ai
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Shijun Xiao
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Miguel Perez-Enciso
- Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Campus UAB, Barcelona, Spain; ICREA, Passeig de Lluís Companys 23, Barcelona, Spain
| | - Bin Yang
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China.
| | - Lusheng Huang
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China.
| |
|
49
|
Evolving evidence on a link between the ZMYM3 exceptionally long GA-STR and human cognition. Sci Rep 2020; 10:19454. [PMID: 33173136 PMCID: PMC7655811 DOI: 10.1038/s41598-020-76461-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/27/2020] [Accepted: 10/29/2020] [Indexed: 02/06/2023] Open
Abstract
The human X-linked zinc finger MYM-type protein 3 (ZMYM3) contains the longest GA-STR identified across protein-coding gene 5′ UTR sequences, at 32-repeats. This exceptionally long GA-STR is located at a complex string of GA-STRs with a human-specific formula across the complex as follows: (GA)8-(GA)4-(GA)6-(GA)32 (ZMYM3-207 ENST00000373998.5). ZMYM3 was previously reported among the top three genes involved in the progression of late-onset Alzheimer’s disease. Here we sequenced the ZMYM3 GA-STR complex in 750 human male subjects, consisting of late-onset neurocognitive disorder (NCD) as a clinical entity (n = 268) and matched controls (n = 482). We detected strict monomorphism of the GA-STR complex, except for the exceptionally long STR, which was architecturally skewed with respect to allele distribution between the NCD cases and controls [F (1, 50) = 12.283; p = 0.001]. Moreover, extreme alleles of this STR at 17, 20, 42, and 43 repeats were detected in seven NCD patients and not in the control group (Mid-P exact = 0.0003). A number of these alleles overlapped with alleles previously found in schizophrenia and bipolar disorder patients. In conclusion, we propose selective advantage for the exceptional length of the ZMYM3 GA-STR in humans, and its link to a spectrum of diseases in which major cognition impairment is a predominant phenotype.
|
50
|
Jakubosky D, Smith EN, D'Antonio M, Jan Bonder M, Young Greenwald WW, D'Antonio-Chronowska A, Matsui H, Stegle O, Montgomery SB, DeBoever C, Frazer KA. Discovery and quality analysis of a comprehensive set of structural variants and short tandem repeats. Nat Commun 2020; 11:2928. [PMID: 32522985 PMCID: PMC7287045 DOI: 10.1038/s41467-020-16481-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/26/2019] [Accepted: 05/05/2020] [Indexed: 02/07/2023] Open
Abstract
Structural variants (SVs) and short tandem repeats (STRs) are important sources of genetic diversity but are not routinely analyzed in genetic studies because they are difficult to accurately identify and genotype. Because SVs and STRs range in size and type, it is necessary to apply multiple algorithms that incorporate different types of evidence from sequencing data and employ complex filtering strategies to discover a comprehensive set of high-quality and reproducible variants. Here we assemble a set of 719 deep whole genome sequencing (WGS) samples (mean 42×) from 477 distinct individuals which we use to discover and genotype a wide spectrum of SV and STR variants using five algorithms. We use 177 unique pairs of genetic replicates to identify factors that affect variant call reproducibility and develop a systematic filtering strategy to create one of the most complete and well characterized maps of SVs and STRs to date.
Affiliation(s)
- David Jakubosky
- Biomedical Sciences Graduate Program, University of California San Diego, La Jolla, CA, 92093-0419, USA
- Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, 92093-0419, USA
| | - Erin N Smith
- Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA
| | - Matteo D'Antonio
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
| | - Marc Jan Bonder
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - William W Young Greenwald
- Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
| | | | - Hiroko Matsui
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
| | - Oliver Stegle
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center, Heidelberg, Germany
| | - Stephen B Montgomery
- Department of Pathology, Stanford University, Stanford, CA, 94305, USA
- Department of Genetics, Stanford University, Stanford, CA, 94305, USA
| | - Christopher DeBoever
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
| | - Kelly A Frazer
- Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA.
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA.
| |
|