1
Tanudisastro HA, Deveson IW, Dashnow H, MacArthur DG. Sequencing and characterizing short tandem repeats in the human genome. Nat Rev Genet 2024; 25:460-475. [PMID: 38366034 DOI: 10.1038/s41576-024-00692-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/06/2023] [Indexed: 02/18/2024]
Abstract
Short tandem repeats (STRs) are highly polymorphic sequences throughout the human genome that are composed of repeated copies of a 1-6-bp motif. Over 1 million variable STR loci are known, some of which regulate gene expression and influence complex traits, such as height. Moreover, variants in at least 60 STR loci cause genetic disorders, including Huntington disease and fragile X syndrome. Accurately identifying and genotyping STR variants is challenging, in particular mapping short reads to repetitive regions and inferring expanded repeat lengths. Recent advances in sequencing technology and computational tools for STR genotyping from sequencing data promise to help overcome this challenge and solve genetically unresolved cases and the 'missing heritability' of polygenic traits. Here, we compare STR genotyping methods, analytical tools and their applications to understand the effect of STR variation on health and disease. We identify emergent opportunities to refine genotyping and quality-control approaches as well as to integrate STRs into variant-calling workflows and large cohort analyses.
Affiliation(s)
- Hope A Tanudisastro
- Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
- Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia
- Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Ira W Deveson
- Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Harriet Dashnow
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA.
- Daniel G MacArthur
- Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia.
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia.
- Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia.
2
Arthur TD, Nguyen JP, D'Antonio-Chronowska A, Jaureguy J, Silva N, Henson B, Panopoulos AD, Belmonte JCI, D'Antonio M, McVicker G, Frazer KA. Multi-omic QTL mapping in early developmental tissues reveals phenotypic and temporal complexity of regulatory variants underlying GWAS loci. bioRxiv 2024:2024.04.10.588874. [PMID: 38645112 PMCID: PMC11030419 DOI: 10.1101/2024.04.10.588874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Most GWAS loci are presumed to affect gene regulation, however, only ∼43% colocalize with expression quantitative trait loci (eQTLs). To address this colocalization gap, we identify eQTLs, chromatin accessibility QTLs (caQTLs), and histone acetylation QTLs (haQTLs) using molecular samples from three early developmental (EDev) tissues. Through colocalization, we annotate 586 GWAS loci for 17 traits by QTL complexity, QTL phenotype, and QTL temporal specificity. We show that GWAS loci are highly enriched for colocalization with complex QTL modules that affect multiple elements (genes and/or peaks). We also demonstrate that caQTLs and haQTLs capture regulatory variations not associated with eQTLs and explain ∼49% of the functionally annotated GWAS loci. Additionally, we show that EDev-unique QTLs are strongly depleted for colocalizing with GWAS loci. By conducting one of the largest multi-omic QTL studies to date, we demonstrate that many GWAS loci exhibit phenotypic complexity and therefore, are missed by traditional eQTL analyses.
3
Li X, Liu Q, Fu C, Li M, Li C, Li X, Zhao S, Zheng Z. Characterizing structural variants based on graph-genotyping provides insights into pig domestication and local adaption. J Genet Genomics 2024; 51:394-406. [PMID: 38056526 DOI: 10.1016/j.jgg.2023.11.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 11/23/2023] [Accepted: 11/24/2023] [Indexed: 12/08/2023]
Abstract
Structural variants (SVs), such as deletions (DELs) and insertions (INSs), contribute substantially to pig genetic diversity and phenotypic variation. Using a library of SVs discovered from long-read primary assemblies and short-read sequenced genomes, we map pig genomic SVs with a graph-based method for re-genotyping SVs in 402 genomes. Our results demonstrate that those SVs harboring specific trait-associated genes may greatly shape pig domestication and local adaptation. Further characterization of SVs reveals that some population-stratified SVs may alter the transcription of genes by affecting regulatory elements. We identify that the genotypes of two DELs (296-bp DEL, chr7: 52,172,101-52,172,397; 278-bp DEL, chr18: 23,840,143-23,840,421) located in muscle-specific enhancers are associated with the expression of target genes related to meat quality (FSD2) and muscle fiber hypertrophy (LMOD2 and WASL) in pigs. Our results highlight the role of SVs in domestic porcine evolution, and the identified candidate functional genes and SVs are valuable resources for future genomic research and breeding programs in pigs.
Affiliation(s)
- Xin Li
- Key Lab of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education and Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China
- Quan Liu
- Key Lab of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education and Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China
- Chong Fu
- Key Lab of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education and Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China
- Mengxun Li
- Key Lab of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education and Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China
- Changchun Li
- Key Lab of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education and Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China; The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, Hubei 430070, China
- Xinyun Li
- Key Lab of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education and Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China; The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, Hubei 430070, China; Hubei Hongshan Laboratory, Wuhan, Hubei 430070, China
- Shuhong Zhao
- Key Lab of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education and Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China; The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, Hubei 430070, China; Hubei Hongshan Laboratory, Wuhan, Hubei 430070, China.
- Zhuqing Zheng
- Key Lab of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education and Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China; Institute of Agricultural Biotechnology, Jingchu University of Technology, Jingmen, Hubei 448000, China.
4
Wang Y, Chen Y, Gao J, Xie H, Guo Y, Yang J, Liu J, Chen Z, Li Q, Li M, Ren J, Wen L, Tang F. Mapping crossover events of mouse meiotic recombination by restriction fragment ligation-based Refresh-seq. Cell Discov 2024; 10:26. [PMID: 38443370 PMCID: PMC10915157 DOI: 10.1038/s41421-023-00638-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 12/11/2023] [Indexed: 03/07/2024] Open
Abstract
Single-cell whole-genome sequencing methods have undergone great improvements over the past decade. However, allele dropout, which means the inability to detect both alleles simultaneously in an individual diploid cell, largely restricts the application of these methods particularly for medical applications. Here, we develop a new single-cell whole-genome sequencing method based on third-generation sequencing (TGS) platform named Refresh-seq (restriction fragment ligation-based genome amplification and TGS). It is based on restriction endonuclease cutting and ligation strategy in which two alleles in an individual cell can be cut into equal fragments and tend to be amplified simultaneously. As a new single-cell long-read genome sequencing method, Refresh-seq features much lower allele dropout rate compared with SMOOTH-seq. Furthermore, we apply Refresh-seq to 688 sperm cells and 272 female haploid cells (secondary polar bodies and parthenogenetic oocytes) from F1 hybrid mice. We acquire high-resolution genetic map of mouse meiosis recombination at low sequencing depth and reveal the sexual dimorphism in meiotic crossovers. We also phase the structure variations (deletions and insertions) in sperm cells and female haploid cells with high precision. Refresh-seq shows great performance in screening aneuploid sperm cells and oocytes due to the low allele dropout rate and has great potential for medical applications such as preimplantation genetic diagnosis.
Affiliation(s)
- Yan Wang
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Yijun Chen
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Junpeng Gao
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Emergency Center, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China
- Haoling Xie
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Yuqing Guo
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Jingwei Yang
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Jun'e Liu
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Zonggui Chen
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Changping Laboratory, Beijing, China
- Qingqing Li
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Mengyao Li
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Jie Ren
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Lu Wen
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China
- Fuchou Tang
- Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing, China.
- Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing, China.
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China.
- Changping Laboratory, Beijing, China.
5
Soto DC, Uribe-Salazar JM, Shew CJ, Sekar A, McGinty S, Dennis MY. Genomic structural variation: A complex but important driver of human evolution. American Journal of Biological Anthropology 2023; 181 Suppl 76:118-144. [PMID: 36794631 PMCID: PMC10329998 DOI: 10.1002/ajpa.24713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2022] [Revised: 01/21/2023] [Accepted: 02/05/2023] [Indexed: 02/17/2023]
Abstract
Structural variants (SVs)-including duplications, deletions, and inversions of DNA-can have significant genomic and functional impacts but are technically difficult to identify and assay compared with single-nucleotide variants. With the aid of new genomic technologies, it has become clear that SVs account for significant differences across and within species. This phenomenon is particularly well-documented for humans and other primates due to the wealth of sequence data available. In great apes, SVs affect a larger number of nucleotides than single-nucleotide variants, with many identified SVs exhibiting population and species specificity. In this review, we highlight the importance of SVs in human evolution by (1) how they have shaped great ape genomes resulting in sensitized regions associated with traits and diseases, (2) their impact on gene functions and regulation, which subsequently has played a role in natural selection, and (3) the role of gene duplications in human brain evolution. We further discuss how to incorporate SVs in research, including the strengths and limitations of various genomic approaches. Finally, we propose future considerations in integrating existing data and biospecimens with the ever-expanding SV compendium propelled by biotechnology advancements.
Affiliation(s)
- Daniela C. Soto
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
- Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
- José M. Uribe-Salazar
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
- Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
- Colin J. Shew
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
- Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
- Aarthi Sekar
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
- Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
- Sean McGinty
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
- Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
- Megan Y. Dennis
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
- Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
6
Kosugi S, Kamatani Y, Harada K, Tomizuka K, Momozawa Y, Morisaki T, Terao C. Detection of trait-associated structural variations using short-read sequencing. Cell Genomics 2023; 3:100328. [PMID: 37388916 PMCID: PMC10300613 DOI: 10.1016/j.xgen.2023.100328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2022] [Revised: 02/17/2023] [Accepted: 04/25/2023] [Indexed: 07/01/2023]
Abstract
Genomic structural variation (SV) affects genetic and phenotypic characteristics in diverse organisms, but the lack of reliable methods to detect SV has hindered genetic analysis. We developed a computational algorithm (MOPline) that includes missing call recovery combined with high-confidence SV call selection and genotyping using short-read whole-genome sequencing (WGS) data. Using 3,672 high-coverage WGS datasets, MOPline stably detected ∼16,000 SVs per individual, which is over ∼1.7-3.3-fold higher than previous large-scale projects while exhibiting a comparable level of statistical quality metrics. We imputed SVs from 181,622 Japanese individuals for 42 diseases and 60 quantitative traits. A genome-wide association study with the imputed SVs revealed 41 top-ranked or nearly top-ranked genome-wide significant SVs, including 8 exonic SVs with 5 novel associations and enriched mobile element insertions. This study demonstrates that short-read WGS data can be used to identify rare and common SVs associated with a variety of traits.
Affiliation(s)
- Shunichi Kosugi
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- Yoichiro Kamatani
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa-shi, Chiba 277-8562, Japan
- Katsutoshi Harada
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Kohei Tomizuka
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Yukihide Momozawa
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa 230-0045, Japan
- Takayuki Morisaki
- Division of Molecular Pathology, Institute of Medical Science, The University of Tokyo, 4-6-1, Shirokane-dai, Minato-ku, Tokyo 108-8639, Japan
- Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- The Department of Applied Genetics, The School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
7
Shi Y, Niu Y, Zhang P, Luo H, Liu S, Zhang S, Wang J, Li Y, Liu X, Song T, Xu T, He S. Characterization of genome-wide STR variation in 6487 human genomes. Nat Commun 2023; 14:2092. [PMID: 37045857 PMCID: PMC10097659 DOI: 10.1038/s41467-023-37690-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 08/04/2022] [Accepted: 03/27/2023] [Indexed: 04/14/2023] Open
Abstract
Short tandem repeats (STRs) are abundant and highly mutagenic in the human genome. Many STR loci have been associated with a range of human genetic disorders. However, most population-scale studies on STR variation in humans have focused on European ancestry cohorts or are limited by sequencing depth. Here, we depicted a comprehensive map of 366,013 polymorphic STRs (pSTRs) constructed from 6487 deeply sequenced genomes, comprising 3983 Chinese samples (~31.5x, NyuWa) and 2504 samples from the 1000 Genomes Project (~33.3x, 1KGP). We found that STR mutations were affected by motif length, chromosome context and epigenetic features. We identified 3273 and 1117 pSTRs whose repeat numbers were associated with gene expression and 3'UTR alternative polyadenylation, respectively. We also implemented population analysis, investigated population differentiated signatures, and genotyped 60 known disease-causing STRs. Overall, this study further extends the scale of STR variation in humans and propels our understanding of the semantics of STRs.
Affiliation(s)
- Yirong Shi
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Yiwei Niu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
- Peng Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- Huaxia Luo
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- Shuai Liu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
- Sijia Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
- Jiajia Wang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- Yanyan Li
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- Xinyue Liu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Tingrui Song
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- Tao Xu
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, 250117, Shandong, China.
- Shunmin He
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China.
8
Gochi L, Kawai Y, Fujimoto A. Comprehensive analysis of microsatellite polymorphisms in human populations. Hum Genet 2023; 142:45-57. [PMID: 36048238 DOI: 10.1007/s00439-022-02484-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Accepted: 08/24/2022] [Indexed: 01/18/2023]
Abstract
Microsatellites (MS) are tandem repeats of short units, and have been used for population genetics, individual identification, and medical genetics. However, studies of MS on a whole-genome level are limited, and genotyping methods for MS have yet to be established. Here, we analyzed approximately 8.5 million MS regions using a previously developed MS caller for short reads (MIVcall method) for three large publicly available human genome sequencing data sets: the Korean Personal Genome Project, Simons Genome Diversity Project, and Human Genome Diversity Project. Our analysis identified 253,114 polymorphic MS. A comparison among different populations suggests that MS in the coding region evolved by random genetic drift and natural selection. In an analysis of genetic structures, MS clearly revealed population structures as SNPs and detected clusters that were not found by SNPs in African and Oceanian populations. Based on the MS polymorphisms, we selected MS marker candidates for individual identification. Finally, we applied our method to a deep sequenced ancient DNA sample. This study provides a comprehensive picture of MS polymorphisms and application to human population studies.
Affiliation(s)
- Leo Gochi
- Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, 113-0003, Japan
- Yosuke Kawai
- Genome Medical Science Project, National Center for Global Health and Medicine, Tokyo, Japan
- Akihiro Fujimoto
- Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, 113-0003, Japan.
9
Berthold N, Pytte J, Bulik CM, Tschochner M, Medland SE, Akkari PA. Bridging the gap: Short structural variants in the genetics of anorexia nervosa. Int J Eat Disord 2022; 55:747-753. [PMID: 35470453 PMCID: PMC9545787 DOI: 10.1002/eat.23716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Revised: 03/30/2022] [Accepted: 03/31/2022] [Indexed: 11/07/2022]
Abstract
Anorexia nervosa (AN) is a devastating disorder with evidence of underexplored heritability. Twin and family studies estimate heritability (h²) to be 57%-64%, and genome-wide association studies (GWAS) reveal significant genetic correlations with psychiatric and anthropometric traits and a total of nine genome-wide significant loci. Whether significantly associated single nucleotide polymorphisms identified by GWAS are causal or tag true causal variants remains to be elucidated. We propose a novel method for bridging this knowledge gap by fine-mapping short structural variants (SSVs) in and around GWAS-identified loci. SSV fine-mapping of loci associated with complex disorders such as schizophrenia, amyotrophic lateral sclerosis, and Alzheimer's disease has uncovered genetic risk markers, phenotypic variability between patients, new pathological mechanisms, and potential therapeutic targets. We analyze previous investigations' methods and propose utilizing an evaluation algorithm to prioritize 10 SSVs for each of the top two AN GWAS-identified loci followed by Sanger sequencing and fragment analysis via capillary electrophoresis to characterize these SSVs for case/control association studies. Success of previous SSV analyses in complex disorders and effective utilization of similar methodologies supports our proposed method. Furthermore, the structural and spatial properties of the 10 SSVs identified for each of the top two AN GWAS-associated loci, cell adhesion molecule 1 (CADM1) and NCK interacting protein with SH3 domain (NCKIPSD), are similar to previous studies. We propose SSV fine-mapping of AN-associated loci will identify causal genetic architecture. Deepening understandings of AN may lead to novel therapeutic targets and subsequently increase quality-of-life for individuals living with the illness.
PUBLIC SIGNIFICANCE STATEMENT: Anorexia nervosa is a severe and complex illness, arising from a combination of environmental and genetic factors. Recent studies estimate the contribution of genetic variability; however, the specific DNA sequences and how they contribute remain unknown. We present a novel approach, arguing that the genetic variant class, short structural variants, could answer this knowledge gap and allow development of biologically targeted therapeutics, improving quality-of-life and patient outcomes for affected individuals.
Affiliation(s)
- Natasha Berthold
- School of Nursing, Midwifery, Health Sciences & Physiotherapy, University of Notre Dame Australia, Fremantle, Western Australia, Australia
- Perron Institute for Neurological and Translational Science, Nedlands, Western Australia, Australia
- School of Human Sciences, University of Western Australia, Crawley, Western Australia, Australia
- Julia Pytte
- Perron Institute for Neurological and Translational Science, Nedlands, Western Australia, Australia
- School of Human Sciences, University of Western Australia, Crawley, Western Australia, Australia
- Cynthia M. Bulik
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Department of Nutrition, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Monika Tschochner
- School of Nursing, Midwifery, Health Sciences & Physiotherapy, University of Notre Dame Australia, Fremantle, Western Australia, Australia
- Sarah E. Medland
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Patrick Anthony Akkari
- Perron Institute for Neurological and Translational Science, Nedlands, Western Australia, Australia
- Centre for Molecular Medicine and Innovative Therapeutics, Murdoch University, Perth, Western Australia, Australia
- Centre for Neuromuscular and Neurological Disorders, University of Western Australia, Nedlands, Western Australia, Australia
- Department of Neurology, Duke University, Durham, North Carolina
|
10
|
Reis ALM, Deveson IW, Madala BS, Wong T, Barker C, Xu J, Lennon N, Tong W, Mercer TR. Using synthetic chromosome controls to evaluate the sequencing of difficult regions within the human genome. Genome Biol 2022; 23:19. [PMID: 35022065 PMCID: PMC8753822 DOI: 10.1186/s13059-021-02579-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/30/2021] [Accepted: 12/16/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Next-generation sequencing (NGS) can identify mutations in the human genome that cause disease and has been widely adopted in clinical diagnosis. However, the human genome contains many polymorphic, low-complexity, and repetitive regions that are difficult to sequence and analyze. Despite their difficulty, these regions include many clinically important sequences that can inform the treatment of human diseases and improve the diagnostic yield of NGS. RESULTS: To evaluate the accuracy by which these difficult regions are analyzed with NGS, we built an in silico decoy chromosome, along with corresponding synthetic DNA reference controls, that encode difficult and clinically important human genome regions, including repeats, microsatellites, HLA genes, and immune receptors. These controls provide a known ground-truth reference against which to measure the performance of diverse sequencing technologies, reagents, and bioinformatic tools. Using this approach, we provide a comprehensive evaluation of short- and long-read sequencing instruments, library preparation methods, and software tools and identify the errors and systematic bias that confound our resolution of these remaining difficult regions. CONCLUSIONS: This study provides an analytical validation of diagnosis using NGS in difficult regions of the human genome and highlights the challenges that remain to resolve these difficult regions.
Affiliation(s)
- Andre L M Reis
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, NSW, Australia
| | - Ira W Deveson
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, NSW, Australia
- St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - Bindu Swapna Madala
- Genomics and Epigenetics Theme, Garvan Institute of Medical Research, Sydney, NSW, Australia
| | - Ted Wong
- Genomics and Epigenetics Theme, Garvan Institute of Medical Research, Sydney, NSW, Australia
| | - Chris Barker
- Genomics and Epigenetics Theme, Garvan Institute of Medical Research, Sydney, NSW, Australia
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Niall Lennon
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Tim R Mercer
- Genomics and Epigenetics Theme, Garvan Institute of Medical Research, Sydney, NSW, Australia
- Australian Institute for Biotechnology and Nanoengineering, University of Queensland, Brisbane, QLD, Australia
|
11
|
Han J, Munro JE, Kocoski A, Barry AE, Bahlo M. Population-level genome-wide STR discovery and validation for population structure and genetic diversity assessment of Plasmodium species. PLoS Genet 2022; 18:e1009604. [PMID: 35007277 PMCID: PMC8782505 DOI: 10.1371/journal.pgen.1009604] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/20/2021] [Revised: 01/21/2022] [Accepted: 12/14/2021] [Indexed: 11/18/2022]
Abstract
Short tandem repeats (STRs) are highly informative genetic markers that have been used extensively in population genetics analysis. They are an important source of genetic diversity and can also have functional impact. Despite the availability of bioinformatic methods that permit large-scale genome-wide genotyping of STRs from whole genome sequencing data, they have not previously been applied to sequencing data from large collections of malaria parasite field samples. Here, we have genotyped STRs using HipSTR in more than 3,000 Plasmodium falciparum and 174 Plasmodium vivax published whole-genome sequence data from samples collected across the globe. High levels of noise and variability in the resultant callset necessitated the development of a novel method for quality control of STR genotype calls. A set of high-quality STR loci (6,768 from P. falciparum and 3,496 from P. vivax) were used to study Plasmodium genetic diversity, population structures and genomic signatures of selection and these were compared to genome-wide single nucleotide polymorphism (SNP) genotyping data. In addition, the genome-wide information about genetic variation and other characteristics of STRs in P. falciparum and P. vivax have been available in an interactive web-based R Shiny application PlasmoSTR (https://github.com/bahlolab/PlasmoSTR).
Affiliation(s)
- Jiru Han
- Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, Australia
| | - Jacob E. Munro
- Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, Australia
| | - Anthony Kocoski
- Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Department of Mathematics and Statistics, The University of Melbourne, Melbourne, Australia
| | - Alyssa E. Barry
- Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, Australia
- Disease Elimination Program, Burnet Institute, Melbourne, Australia
- IMPACT Institute for Innovation in Mental and Physical Health and Clinical Translation, Deakin University, Geelong, Australia
| | - Melanie Bahlo
- Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, Australia
|
12
|
Xiao X, Zhang CY, Zhang Z, Hu Z, Li M, Li T. Revisiting tandem repeats in psychiatric disorders from perspectives of genetics, physiology, and brain evolution. Mol Psychiatry 2022; 27:466-475. [PMID: 34650204 DOI: 10.1038/s41380-021-01329-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/22/2021] [Revised: 09/16/2021] [Accepted: 09/28/2021] [Indexed: 01/28/2023]
Abstract
Genome-wide association studies (GWASs) have revealed substantial genetic components comprised of single nucleotide polymorphisms (SNPs) in the heritable risk of psychiatric disorders. However, genetic risk factors not covered by GWAS also play pivotal roles in these illnesses. Tandem repeats, which are likely functional but frequently overlooked by GWAS, may account for an important proportion in the "missing heritability" of psychiatric disorders. Despite difficulties in characterizing and quantifying tandem repeats in the genome, studies have been carried out in an attempt to describe impact of tandem repeats on gene regulation and human phenotypes. In this review, we have introduced recent research progress regarding the genomic distribution and regulatory mechanisms of tandem repeats. We have also summarized the current knowledge of the genetic architecture and biological underpinnings of psychiatric disorders brought by studies of tandem repeats. These findings suggest that tandem repeats, in candidate psychiatric risk genes or in different levels of linkage disequilibrium (LD) with psychiatric GWAS SNPs and haplotypes, may modulate biological phenotypes related to psychiatric disorders (e.g., cognitive function and brain physiology) through regulating alternative splicing, promoter activity, enhancer activity and so on. In addition, many tandem repeats undergo tight natural selection in the human lineage, and likely exert crucial roles in human brain evolution. Taken together, the putative roles of tandem repeats in the pathogenesis of psychiatric disorders is strongly implicated, and using examples from previous literatures, we wish to call for further attention to tandem repeats in the post-GWAS era of psychiatric disorders.
Affiliation(s)
- Xiao Xiao
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Chu-Yi Zhang
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Zhuohua Zhang
- Institute of Molecular Precision Medicine and Hunan Key Laboratory of Molecular Precision Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Center for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
| | - Zhonghua Hu
- Institute of Molecular Precision Medicine and Hunan Key Laboratory of Molecular Precision Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Center for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
- Department of Critical Care Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Animal Models for Human Diseases, School of Life Sciences, Central South University, Changsha, Hunan, China
- Eye Center of Xiangya Hospital and Hunan Key Laboratory of Ophthalmology, Central South University, Changsha, Hunan, China
- National Clinical Research Center on Mental Disorders, Changsha, Hunan, China
| | - Ming Li
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
- KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Tao Li
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Guangzhou, China
|
13
|
Kamimura S, Suga T, Hoki Y, Sunayama M, Imadome K, Fujita M, Nakamura M, Araki R, Abe M. Insertion/deletion and microsatellite alteration profiles in induced pluripotent stem cells. Stem Cell Reports 2021; 16:2503-2519. [PMID: 34559999 PMCID: PMC8514972 DOI: 10.1016/j.stemcr.2021.08.017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/22/2021] [Revised: 08/27/2021] [Accepted: 08/27/2021] [Indexed: 11/19/2022]
Abstract
We here demonstrate that microsatellite (MS) alterations are elevated in both mouse and human induced pluripotent stem cells (iPSCs), but importantly we have now identified a type of human iPSC in which these alterations are considerably reduced. We aimed in our present analyses to profile the InDels in iPSC/ntESC genomes, especially in MS regions. To detect somatic de novo mutations in particular, we generated 13 independent reprogramed stem cell lines (11 iPSC and 2 ntESC lines) from an identical parent somatic cell fraction of a C57BL/6 mouse. By using this cell set with an identical genetic background, we could comprehensively detect clone-specific alterations and, importantly, experimentally validate them. The effectiveness of employing sister clones for detecting somatic de novo mutations was thereby demonstrated. We then successfully applied this approach to human iPSCs. Our results require further careful genomic analysis but make an important inroad into solving the issue of genome abnormalities in iPSCs.
Affiliation(s)
- Satoshi Kamimura
- Department of Basic Medical Sciences for Radiation Damages, National Institute of Radiological Sciences, National Institutes for Quantum and Radiological Science and Technology, Chiba 263-8555, Japan
| | - Tomo Suga
- Department of Basic Medical Sciences for Radiation Damages, National Institute of Radiological Sciences, National Institutes for Quantum and Radiological Science and Technology, Chiba 263-8555, Japan
| | - Yuko Hoki
- Department of Basic Medical Sciences for Radiation Damages, National Institute of Radiological Sciences, National Institutes for Quantum and Radiological Science and Technology, Chiba 263-8555, Japan
| | - Misato Sunayama
- Department of Basic Medical Sciences for Radiation Damages, National Institute of Radiological Sciences, National Institutes for Quantum and Radiological Science and Technology, Chiba 263-8555, Japan
| | - Kaori Imadome
- Department of Basic Medical Sciences for Radiation Damages, National Institute of Radiological Sciences, National Institutes for Quantum and Radiological Science and Technology, Chiba 263-8555, Japan
| | - Mayumi Fujita
- Department of Basic Medical Sciences for Radiation Damages, National Institute of Radiological Sciences, National Institutes for Quantum and Radiological Science and Technology, Chiba 263-8555, Japan
| | - Miki Nakamura
- Department of Basic Medical Sciences for Radiation Damages, National Institute of Radiological Sciences, National Institutes for Quantum and Radiological Science and Technology, Chiba 263-8555, Japan
| | - Ryoko Araki
- Department of Basic Medical Sciences for Radiation Damages, National Institute of Radiological Sciences, National Institutes for Quantum and Radiological Science and Technology, Chiba 263-8555, Japan
| | - Masumi Abe
- Department of Basic Medical Sciences for Radiation Damages, National Institute of Radiological Sciences, National Institutes for Quantum and Radiological Science and Technology, Chiba 263-8555, Japan
|
14
|
Yan SM, Sherman RM, Taylor DJ, Nair DR, Bortvin AN, Schatz MC, McCoy RC. Local adaptation and archaic introgression shape global diversity at human structural variant loci. eLife 2021; 10:e67615. [PMID: 34528508 PMCID: PMC8492059 DOI: 10.7554/elife.67615] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 02/18/2021] [Accepted: 09/14/2021] [Indexed: 12/13/2022]
Abstract
Large genomic insertions and deletions are a potent source of functional variation, but are challenging to resolve with short-read sequencing, limiting knowledge of the role of such structural variants (SVs) in human evolution. Here, we used a graph-based method to genotype long-read-discovered SVs in short-read data from diverse human genomes. We then applied an admixture-aware method to identify 220 SVs exhibiting extreme patterns of frequency differentiation - a signature of local adaptation. The top two variants traced to the immunoglobulin heavy chain locus, tagging a haplotype that swept to near fixation in certain southeast Asian populations, but is rare in other global populations. Further investigation revealed evidence that the haplotype traces to gene flow from Neanderthals, corroborating the role of immune-related genes as prominent targets of adaptive introgression. Our study demonstrates how recent technical advances can help resolve signatures of key evolutionary events that remained obscured within technically challenging regions of the genome.
Affiliation(s)
- Stephanie M Yan
- Department of Biology, Johns Hopkins University, Baltimore, United States
| | - Rachel M Sherman
- Department of Computer Science, Johns Hopkins University, Baltimore, United States
| | - Dylan J Taylor
- Department of Biology, Johns Hopkins University, Baltimore, United States
| | - Divya R Nair
- Department of Biology, Johns Hopkins University, Baltimore, United States
| | - Andrew N Bortvin
- Department of Biology, Johns Hopkins University, Baltimore, United States
| | - Michael C Schatz
- Department of Biology, Johns Hopkins University, Baltimore, United States
- Department of Computer Science, Johns Hopkins University, Baltimore, United States
| | - Rajiv C McCoy
- Department of Biology, Johns Hopkins University, Baltimore, United States
|
15
|
Tang H, He Z. Advances and challenges in quantitative delineation of the genetic architecture of complex traits. Quantitative Biology 2021; 9:168-184. [PMID: 35492964 PMCID: PMC9053444 DOI: 10.15302/j-qb-021-0249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Background: Genome-wide association studies (GWAS) have been widely adopted in studies of human complex traits and diseases. Results: This review surveys areas of active research: quantifying and partitioning trait heritability, fine mapping functional variants and integrative analysis, genetic risk prediction of phenotypes, and the analysis of sequencing studies that have identified millions of rare variants. Current challenges and opportunities are highlighted. Conclusion: GWAS have fundamentally transformed the field of human complex trait genetics. Novel statistical and computational methods have expanded the scope of GWAS and have provided valuable insights on the genetic architecture underlying complex phenotypes.
Affiliation(s)
- Hua Tang
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Zihuai He
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA 94305, USA
- Quantitative Sciences Unit, Department of Medicine, Stanford University, Stanford, CA 94305, USA
|
16
|
Bonder MJ, Smail C, Gloudemans MJ, Frésard L, Jakubosky D, D'Antonio M, Li X, Ferraro NM, Carcamo-Orive I, Mirauta B, Seaton DD, Cai N, Vakili D, Horta D, Zhao C, Zastrow DB, Bonner DE, Wheeler MT, Kilpinen H, Knowles JW, Smith EN, Frazer KA, Montgomery SB, Stegle O. Identification of rare and common regulatory variants in pluripotent cells using population-scale transcriptomics. Nat Genet 2021; 53:313-321. [PMID: 33664507 PMCID: PMC7944648 DOI: 10.1038/s41588-021-00800-7] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 10/04/2019] [Accepted: 01/25/2021] [Indexed: 12/18/2022]
Abstract
Induced pluripotent stem cells (iPSCs) are an established cellular system to study the impact of genetic variants in derived cell types and developmental contexts. However, in their pluripotent state, the disease impact of genetic variants is less known. Here, we integrate data from 1,367 human iPSC lines to comprehensively map common and rare regulatory variants in human pluripotent cells. Using this population-scale resource, we report hundreds of novel colocalization events for human traits specific to iPSCs, and find increased power to identify rare regulatory variants compared with somatic tissues. Finally, we demonstrate how iPSCs enable the identification of causal genes for rare diseases.
Affiliation(s)
- Marc Jan Bonder
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Craig Smail
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Genomic Medicine Center, Children's Mercy Research Institute and Children's Mercy Kansas City, Kansas City, MO, USA
| | - Michael J Gloudemans
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
| | - Laure Frésard
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - David Jakubosky
- Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA, USA
- Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, USA
| | - Matteo D'Antonio
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA, USA
| | - Xin Li
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Nicole M Ferraro
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
| | - Ivan Carcamo-Orive
- Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Bogdan Mirauta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
| | - Daniel D Seaton
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
| | - Na Cai
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Helmholtz Pioneer Campus, Helmholtz Zentrum München, Neuherberg, Germany
| | - Dara Vakili
- UCL Great Ormond Street Institute of Child Health, University College London, London, UK
- Faculty of Medicine, Imperial College London, London, UK
| | - Danilo Horta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
| | - Chunli Zhao
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
| | - Diane B Zastrow
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
| | - Devon E Bonner
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
| | - Matthew T Wheeler
- Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
| | - Helena Kilpinen
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- UCL Great Ormond Street Institute of Child Health, University College London, London, UK
- Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
- Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland
| | - Joshua W Knowles
- Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Erin N Smith
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Kelly A Frazer
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA, USA
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Stephen B Montgomery
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Oliver Stegle
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
|
17
|
Jakubosky D, D'Antonio M, Bonder MJ, Smail C, Donovan MKR, Young Greenwald WW, Matsui H, D'Antonio-Chronowska A, Stegle O, Smith EN, Montgomery SB, DeBoever C, Frazer KA. Properties of structural variants and short tandem repeats associated with gene expression and complex traits. Nat Commun 2020; 11:2927. [PMID: 32522982 PMCID: PMC7286898 DOI: 10.1038/s41467-020-16482-4] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 07/26/2019] [Accepted: 05/05/2020] [Indexed: 12/14/2022]
Abstract
Structural variants (SVs) and short tandem repeats (STRs) comprise a broad group of diverse DNA variants which vastly differ in their sizes and distributions across the genome. Here, we identify genomic features of SV classes and STRs that are associated with gene expression and complex traits, including their locations relative to eGenes, likelihood of being associated with multiple eGenes, associated eGene types (e.g., coding, noncoding, level of evolutionary constraint), effect sizes, linkage disequilibrium with tagging single nucleotide variants used in GWAS, and likelihood of being associated with GWAS traits. We identify a set of high-impact SVs/STRs associated with the expression of three or more eGenes via chromatin loops and show that they are highly enriched for being associated with GWAS traits. Our study provides insights into the genomic properties of structural variant classes and short tandem repeats that are associated with gene expression and human traits.
Affiliation(s)
- David Jakubosky
- Biomedical Sciences Graduate Program, University of California San Diego, La Jolla, CA, 92093-0419, USA
- Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, 92093-0419, USA
| | - Matteo D'Antonio
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
| | - Marc Jan Bonder
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Craig Smail
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Department of Pathology, Stanford University, Stanford, California, 94305, USA
| | - Margaret K R Donovan
- Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, 92093-0419, USA
- Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
| | - William W Young Greenwald
- Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
| | - Hiroko Matsui
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
| | | | - Oliver Stegle
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center, Heidelberg, Germany
| | - Erin N Smith
- Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA
| | - Stephen B Montgomery
- Department of Pathology, Stanford University, Stanford, California, 94305, USA
- Department of Genetics, Stanford University, Stanford, California, 94305, USA
| | - Christopher DeBoever
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
| | - Kelly A Frazer
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA
|