1
Wang XB, Lu HW, Liu QY, Li AL, Zhou HL, Zhang Y, Zhu TQ, Ruan J. An effective strategy for assembling the sex-limited chromosome. Gigascience 2024; 13:giae015. [PMID: 38626722 PMCID: PMC11020242 DOI: 10.1093/gigascience/giae015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 01/17/2024] [Accepted: 03/15/2024] [Indexed: 04/18/2024] Open
Abstract
BACKGROUND: Most currently available reference genomes lack the sequence map of sex-limited (such as Y and W) chromosomes, which results in incomplete assemblies that hinder further research on sex chromosomes. Recent advancements in long-read sequencing and population sequencing have provided the opportunity to assemble sex-limited chromosomes without the traditional complicated experimental efforts. FINDINGS: We introduce the first computational method, Sorting long Reads of Y or other sex-limited chromosome (SRY), which achieves improved assembly results compared to flow sorting. Specifically, SRY outperforms flow sorting in the heterochromatic region and demonstrates comparable performance in other regions. Furthermore, SRY enhances the capabilities of hybrid assembly software, resulting in improved continuity and accuracy. CONCLUSIONS: Our method enables truly complete genome assembly and facilitates downstream research on sex-limited chromosomes.
Affiliation(s)
- Xiao-Bo Wang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- The Shennong Laboratory/Institute of Crop Molecular Breeding, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
- Hong-Wei Lu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Qing-You Liu
- Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, School of Life Science and Engineering, Foshan University, Foshan 528225, China
- A-Lun Li
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Hong-Ling Zhou
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Yong Zhang
- Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Tian-Qi Zhu
- National Center for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Jue Ruan
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
2
Wu S, Lin T, Xu Y. Polymorphic USP8 allele promotes Parkinson's disease by inducing the accumulation of α-synuclein through deubiquitination. Cell Mol Life Sci 2023; 80:363. [PMID: 37981592 PMCID: PMC11072815 DOI: 10.1007/s00018-023-05006-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 10/14/2023] [Accepted: 10/16/2023] [Indexed: 11/21/2023]
Abstract
Parkinson's disease (PD) is one of the most common neurodegenerative diseases and is characterized by α-synuclein accumulation and degeneration of dopaminergic neurons. Employing genome-wide sequencing, we identified a polymorphic USP8 allele (USP8D442G) significantly enriched in Chinese PD patients. To test the involvement of this polymorphism in PD pathogenesis, we derived dopaminergic neurons (DAn) from human induced pluripotent stem cells (hiPSCs) reprogrammed from fibroblasts of PD patients harboring the USP8D442G allele and of their healthy siblings. In addition, we knocked the D442G polymorphic site into the endogenous USP8 gene of human embryonic stem cells (hESCs) and derived DAn from these knock-in hESCs to explore their cellular phenotypes and molecular mechanism. We found that expression of USP8D442G in DAn induces the accumulation and abnormal subcellular localization of α-synuclein (α-Syn). Mechanistically, we demonstrate that the D442G polymorphism enhances the interaction between α-Syn and USP8 and thus increases the K63-specific deubiquitination and stability of α-Syn. We have thus discovered a pathogenic polymorphism for PD that represents a promising therapeutic and diagnostic target.
Affiliation(s)
- Shouhai Wu
- State Key Laboratory of Dampness Syndrome of Chinese Medicine, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
- Center for Regenerative and Translational Medicine, Guangdong Provincial Academy of Chinese Medical Sciences, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
- Tongxiang Lin
- Center for Regenerative and Translational Medicine, Guangdong Provincial Academy of Chinese Medical Sciences, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
- College of Animal Sciences, Fujian Agriculture and Forestry University, 15 ShangXiaDian Road, CangShan District, Fuzhou City, Fujian Province, China
- Yang Xu
- Department of Cardiology, Cardiovascular Key Lab of Zhejiang Province, State Key Laboratory of Transvascular Implantation Devices, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310009, Zhejiang, China
3
Huang T, Li J, Zhao H, Ngamphiw C, Tongsima S, Kantaputra P, Kittitharaphan W, Wang SM. Core promoter in TNBC is highly mutated with rich ethnic signature. Brief Funct Genomics 2023; 22:9-19. [PMID: 36307127 PMCID: PMC9853936 DOI: 10.1093/bfgp/elac035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 09/23/2022] [Accepted: 09/28/2022] [Indexed: 01/25/2023] Open
Abstract
The core promoter plays an essential role in regulating transcription initiation by controlling the interaction between transcription factors and sequence motifs in the core promoter. Although mutations in core promoter sequences are expected to cause abnormal gene expression with pathogenic consequences, only limited evidence supports the involvement of core promoter mutation in disease. Our previous study showed that the core promoter is highly polymorphic across worldwide human ethnic populations, reflecting human history and adaptation. Our recent characterization of the core promoter in a Chinese cohort of triple-negative breast cancer (TNBC), a subtype of breast cancer, revealed widespread core promoter mutation in TNBC. In the current study, we analyzed the core promoter in a Thai TNBC cohort and again observed abundant core promoter mutation. Comparing the Chinese and Thai TNBC cohorts, we observed substantial differences in core promoter mutation, as reflected by the mutation spectrum, the mutation-affected genes and functional categories, and altered gene expression. Our study confirms that the core promoter in TNBC is highly mutable and highly ethnic-specific.
Affiliation(s)
- San Ming Wang
- Corresponding author: S.M. Wang, Faculty of Health Sciences, University of Macau, Taipa, Macau 999078, China. Tel.: +(853) 8822-4836; E-mail:
4
Yao Y, Sun K, Yang Q, Zhou Z, Qian J, Li Z, Shao C, Qian X, Tang Q, Xie J. Development of a multiplex panel with 31 multi-allelic InDels for forensic DNA typing. Int J Legal Med 2023; 137:1-12. [PMID: 36326889 DOI: 10.1007/s00414-022-02907-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Accepted: 10/20/2022] [Indexed: 11/06/2022]
Abstract
Insertion/deletion (InDel) polymorphic genetic markers are abundant in human genomes. Diallelic InDel markers have been widely studied for forensic purposes, yet their low polymorphic information content limits their application, and current InDel panels leave room for improvement. In this study, multi-allelic InDels located outside low-complexity sequence regions were selected from datasets of East Asian populations, and a multiplex amplification system containing 31 multi-allelic InDel markers and the Amelogenin marker (FA-HID32plex) was constructed and optimized. A preliminary study of sensitivity, species specificity, inhibitor tolerance, mixture resolution, and the detection of degraded samples demonstrates that the FA-HID32plex is highly sensitive, specific, and robust for trace and degraded samples. The combined power of discrimination (CPD) of the 31 multi-allelic InDel markers was 0.999 999 999 999 999 999 85, and the cumulative probability of exclusion (CPE) was 0.999 920 in a Chinese Han population, which indicates high discrimination power. Altogether, the FA-HID32plex panel could provide reliable supplementary or stand-alone information in individual identification and paternity testing, especially for challenging samples.
Affiliation(s)
- Yining Yao
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, 138 Yixueyuan Road, Shanghai, 200032, China
- Kuan Sun
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, 138 Yixueyuan Road, Shanghai, 200032, China
- Department of Fetal Medicine and Prenatal Diagnosis Center, Shanghai First Maternity and Infant Hospital, Tongji University School of Medicine, 2699 West Gaoke Rd, 201204, Shanghai, China
- Shanghai Key Laboratory of Maternal Fetal Medicine, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, 200092, China
- Qinrui Yang
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, 138 Yixueyuan Road, Shanghai, 200032, China
- Zhihan Zhou
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, 138 Yixueyuan Road, Shanghai, 200032, China
- Jinglei Qian
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, 138 Yixueyuan Road, Shanghai, 200032, China
- Zhimin Li
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, 138 Yixueyuan Road, Shanghai, 200032, China
- Chengchen Shao
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, 138 Yixueyuan Road, Shanghai, 200032, China
- Xiaoqin Qian
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, 138 Yixueyuan Road, Shanghai, 200032, China
- Qiqun Tang
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fudan University, Shanghai, 200032, China
- Jianhui Xie
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, 138 Yixueyuan Road, Shanghai, 200032, China
5
Ishigaki K, Sakaue S, Terao C, Luo Y, Sonehara K, Yamaguchi K, Amariuta T, Too CL, Laufer VA, Scott IC, Viatte S, Takahashi M, Ohmura K, Murasawa A, Hashimoto M, Ito H, Hammoudeh M, Emadi SA, Masri BK, Halabi H, Badsha H, Uthman IW, Wu X, Lin L, Li T, Plant D, Barton A, Orozco G, Verstappen SMM, Bowes J, MacGregor AJ, Honda S, Koido M, Tomizuka K, Kamatani Y, Tanaka H, Tanaka E, Suzuki A, Maeda Y, Yamamoto K, Miyawaki S, Xie G, Zhang J, Amos CI, Keystone E, Wolbink G, van der Horst-Bruinsma I, Cui J, Liao KP, Carroll RJ, Lee HS, Bang SY, Siminovitch KA, de Vries N, Alfredsson L, Rantapää-Dahlqvist S, Karlson EW, Bae SC, Kimberly RP, Edberg JC, Mariette X, Huizinga T, Dieudé P, Schneider M, Kerick M, Denny JC, Matsuda K, Matsuo K, Mimori T, Matsuda F, Fujio K, Tanaka Y, Kumanogoh A, Traylor M, Lewis CM, Eyre S, Xu H, Saxena R, Arayssi T, Kochi Y, Ikari K, Harigai M, Gregersen PK, Yamamoto K, Louis Bridges S, Padyukov L, Martin J, Klareskog L, Okada Y, Raychaudhuri S. Multi-ancestry genome-wide association analyses identify novel genetic mechanisms in rheumatoid arthritis. Nat Genet 2022; 54:1640-1651. [PMID: 36333501 PMCID: PMC10165422 DOI: 10.1038/s41588-022-01213-w] [Citation(s) in RCA: 92] [Impact Index Per Article: 46.0] [Received: 12/01/2021] [Accepted: 09/26/2022] [Indexed: 11/06/2022]
Abstract
Rheumatoid arthritis (RA) is a highly heritable complex disease with unknown etiology. Multi-ancestry genetic research of RA promises to improve the power to detect genetic signals, fine-mapping resolution and the performance of polygenic risk scores (PRS). Here, we present a large-scale genome-wide association study (GWAS) of RA, which includes 276,020 samples from five ancestral groups. We conducted a multi-ancestry meta-analysis and identified 124 loci (P < 5 × 10⁻⁸), of which 34 are novel. Candidate genes at the novel loci suggest essential roles of the immune system (for example, TNIP2 and TNFRSF11A) and joint tissues (for example, WISP1) in RA etiology. Multi-ancestry fine-mapping identified putatively causal variants with biological insights (for example, LEF1). Moreover, PRS based on multi-ancestry GWAS outperformed PRS based on single-ancestry GWAS and had comparable performance between populations of European and East Asian ancestries. Our study provides several insights into the etiology of RA and improves the genetic predictability of RA.
Affiliation(s)
- Kazuyoshi Ishigaki
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Laboratory for Human Immunogenetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Saori Sakaue
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- The Department of Applied Genetics, The School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
- Yang Luo
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Kyuto Sonehara
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
- Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, Japan
- Kensuke Yamaguchi
- Department of Genomic Function and Diversity, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Tiffany Amariuta
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Chun Lai Too
- Immunogenetics Unit, Allergy and Immunology Research Center, Institute for Medical Research, National Institutes of Health Complex, Ministry of Health, Kuala Lumpur, Malaysia
- Department of Medicine, Division of Rheumatology, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden
- Vincent A Laufer
- Department of Clinical Immunology and Rheumatology, University of Alabama at Birmingham School of Medicine, Birmingham, AL, USA
- Department of Pathology, Michigan Medicine, Ann Arbor, MI, USA
- Ian C Scott
- Haywood Academic Rheumatology Centre, Haywood Hospital, Midlands Partnership NHS Foundation Trust, Burslem, UK
- Primary Care Centre Versus Arthritis, School of Medicine, Keele University, Keele, UK
- Sebastien Viatte
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Lydia Becker Institute of Immunology and Inflammation, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK
- NIHR Manchester Biomedical Research Centre, Manchester University Foundation Trust, Manchester, UK
- Meiko Takahashi
- Center for Genomic Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Koichiro Ohmura
- Department of Rheumatology and Clinical Immunology, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Akira Murasawa
- Department of Rheumatology, Niigata Rheumatic Center, Niigata, Japan
- Motomu Hashimoto
- Department of Advanced Medicine for Rheumatic Diseases, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Department of Clinical Immunology, Graduate School of Medicine, Osaka City University, Osaka, Japan
- Hiromu Ito
- Department of Advanced Medicine for Rheumatic Diseases, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Department of Orthopaedic Surgery, Kurashiki Central Hospital, Kurashiki, Japan
- Mohammed Hammoudeh
- Rheumatology Division, Department of Internal Medicine, Hamad Medical Corporation, Doha, Qatar
- Samar Al Emadi
- Rheumatology Division, Department of Internal Medicine, Hamad Medical Corporation, Doha, Qatar
- Basel K Masri
- Department of Internal Medicine, Jordan Hospital, Amman, Jordan
- Hussein Halabi
- Section of Rheumatology, Department of Internal Medicine, King Faisal Specialist Hospital and Research Center, Jeddah, Saudi Arabia
- Humeira Badsha
- Dr. Humeira Badsha Medical Center, Emirates Hospital, Dubai, United Arab Emirates
- Imad W Uthman
- Department of Rheumatology, American University of Beirut, Beirut, Lebanon
- Xin Wu
- Department of Rheumatology and Immunology, Shanghai Changzheng Hospital, The Second Military Medical University, Shanghai, China
- Li Lin
- Department of Rheumatology and Immunology, Shanghai Changzheng Hospital, The Second Military Medical University, Shanghai, China
- Ting Li
- Department of Rheumatology and Immunology, Shanghai Changzheng Hospital, The Second Military Medical University, Shanghai, China
- Darren Plant
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Anne Barton
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- NIHR Manchester Biomedical Research Centre, Manchester University Foundation Trust, Manchester, UK
- Gisela Orozco
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- NIHR Manchester Biomedical Research Centre, Manchester University Foundation Trust, Manchester, UK
- Suzanne M M Verstappen
- NIHR Manchester Biomedical Research Centre, Manchester University Foundation Trust, Manchester, UK
- Centre for Epidemiology Versus Arthritis, Centre for Musculoskeletal Research, Division of Musculoskeletal and Dermatological Sciences, The University of Manchester, Manchester, UK
- John Bowes
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- NIHR Manchester Biomedical Research Centre, Manchester University Foundation Trust, Manchester, UK
- Suguru Honda
- Institute of Rheumatology, Tokyo Women's Medical University Hospital, Tokyo, Japan
- Department of Rheumatology, Tokyo Women's Medical University School of Medicine, Tokyo, Japan
- Masaru Koido
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Kohei Tomizuka
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yoichiro Kamatani
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Hiroaki Tanaka
- The First Department of Internal Medicine, School of Medicine, University of Occupational and Environmental Health Japan, Kitakyushu, Japan
- Eiichi Tanaka
- Institute of Rheumatology, Tokyo Women's Medical University Hospital, Tokyo, Japan
- Department of Rheumatology, Tokyo Women's Medical University School of Medicine, Tokyo, Japan
- Akari Suzuki
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yuichi Maeda
- Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, Japan
- Department of Respiratory Medicine and Clinical Immunology, Osaka University Graduate School of Medicine, Suita, Japan
- Department of Immunopathology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, Japan
- Kenichi Yamamoto
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
- Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, Japan
- Department of Pediatrics, Osaka University Graduate School of Medicine, Suita, Japan
- Satoru Miyawaki
- Department of Neurosurgery, Faculty of Medicine, the University of Tokyo, Tokyo, Japan
- Gang Xie
- Lunenfeld-Tanenbaum Research Institute, Toronto, Ontario, Canada
- Jinyi Zhang
- Lunenfeld-Tanenbaum Research Institute, Toronto, Ontario, Canada
- Department of Medicine, University of Toronto, Toronto, Ontario, Canada
- Gertjan Wolbink
- Department of Rheumatology, Amsterdam Rheumatology and Immunology Center (ARC), Reade, Amsterdam, the Netherlands
- Irene van der Horst-Bruinsma
- Department of Rheumatology & Clinical Immunology/ARC, Amsterdam Institute for Infection and Immunity, Amsterdam UMC location Vrije Universiteit, Amsterdam, the Netherlands
- Jing Cui
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Katherine P Liao
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Massachusetts Veterans Epidemiology Research and Information Center, VA Boston Healthcare System, Boston, MA, USA
- Robert J Carroll
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
- Hye-Soon Lee
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Korea
- Hanyang University Institute for Rheumatology Research, Seoul, Korea
- So-Young Bang
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Korea
- Hanyang University Institute for Rheumatology Research, Seoul, Korea
- Katherine A Siminovitch
- Lunenfeld-Tanenbaum Research Institute, Toronto, Ontario, Canada
- Departments of Medicine and Immunology, University of Toronto, Toronto, Ontario, Canada
- Niek de Vries
- Department of Rheumatology & Clinical Immunology/ARC, Amsterdam Institute for Infection and Immunity, Amsterdam UMC location AMC/University of Amsterdam, Amsterdam, the Netherlands
- Lars Alfredsson
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
- Elizabeth W Karlson
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Sang-Cheol Bae
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Korea
- Hanyang University Institute for Rheumatology Research, Seoul, Korea
- Robert P Kimberly
- Center for Clinical and Translational Science, Division of Clinical Immunology and Rheumatology, Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Jeffrey C Edberg
- Center for Clinical and Translational Science, Division of Clinical Immunology and Rheumatology, Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Xavier Mariette
- Department of Rheumatology, Université Paris-Saclay, Assistance Publique - Hôpitaux de Paris, Hôpital Bicêtre, INSERM UMR1184, Le Kremlin Bicêtre, France
- Tom Huizinga
- Leiden University Medical Center, Leiden, the Netherlands
- Philippe Dieudé
- University of Paris Cité, Inserm, PHERE, F-75018, Paris, France
- Department of Rheumatology, Hôpital Bichat, APHP, Paris, France
- Matthias Schneider
- Department of Rheumatology & Hiller Research Unit Rheumatology, UKD, Heinrich-Heine University, Düsseldorf, Germany
- Martin Kerick
- Institute of Parasitology and Biomedicine Lopez-Neyra, CSIC, Granada, Spain
- Joshua C Denny
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
- All of Us Research Program, Office of the Director, National Institutes of Health, Bethesda, MD, USA
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- Koichi Matsuda
- Laboratory of Genome Technology, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Laboratory of Clinical Genome Sequencing, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Keitaro Matsuo
- Division of Cancer Epidemiology and Prevention, Department of Preventive Medicine, Aichi Cancer Center Research Institute, Nagoya, Japan
- Department of Cancer Epidemiology, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Tsuneyo Mimori
- Department of Rheumatology and Clinical Immunology, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Fumihiko Matsuda
- Center for Genomic Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Keishi Fujio
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Yoshiya Tanaka
- The First Department of Internal Medicine, School of Medicine, University of Occupational and Environmental Health Japan, Kitakyushu, Japan
- Atsushi Kumanogoh
- Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, Japan
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Department of Respiratory Medicine and Clinical Immunology, Osaka University Graduate School of Medicine, Suita, Japan
- Department of Immunopathology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, Japan
- Matthew Traylor
- Department of Medical & Molecular Genetics, King's College London, London, UK
- Department of Genetics, Novo Nordisk Research Centre Oxford, Oxford, UK
- Clinical Pharmacology, William Harvey Research Institute, Queen Mary University of London, London, UK
- Cathryn M Lewis
- Department of Medical & Molecular Genetics, King's College London, London, UK
- Social, Genetic and Developmental Psychiatry Centre, King's College London, London, UK
- Stephen Eyre
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- NIHR Manchester Biomedical Research Centre, Manchester University Foundation Trust, Manchester, UK
- Huji Xu
- Department of Rheumatology and Immunology, Shanghai Changzheng Hospital, The Second Military Medical University, Shanghai, China
- School of Clinical Medicine Tsinghua University, Beijing, China
- Peking-Tsinghua Center for Life Sciences, Tsinghua University, Beijing, China
- Richa Saxena
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Thurayya Arayssi
- Department of Internal Medicine, Weill Cornell Medicine-Qatar, Education City, Doha, Qatar
- Yuta Kochi
- Department of Genomic Function and Diversity, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Katsunori Ikari
- Institute of Rheumatology, Tokyo Women's Medical University Hospital, Tokyo, Japan
- Department of Orthopedic Surgery, Tokyo Women's Medical University School of Medicine, Tokyo, Japan
- Division of Multidisciplinary Management of Rheumatic Diseases, Tokyo Women's Medical University, Tokyo, Japan
| | - Masayoshi Harigai
- Institute of Rheumatology, Tokyo Women's Medical University Hospital, Tokyo, Japan
- Division of Rheumatology, Department of Internal Medicine, Tokyo Women's Medical University School of Medicine, Tokyo, Japan
| | - Peter K Gregersen
- Robert S. Boas Center for Genomics and Human Genetics, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
| | - Kazuhiko Yamamoto
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - S Louis Bridges
- Department of Medicine, Hospital for Special Surgery, New York, NY, USA
- Division of Rheumatology, Weill Cornell Medicine, New York, NY, USA
| | - Leonid Padyukov
- Department of Medicine, Division of Rheumatology, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden
| | - Javier Martin
- Institute of Parasitology and Biomedicine Lopez-Neyra, CSIC, Granada, Spain
| | - Lars Klareskog
- Department of Medicine, Division of Rheumatology, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden
| | - Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
- Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, Japan
- Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, Japan
- Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Center for Infectious Disease Education and Research (CiDER), Osaka University, Suita, Japan
- Department of Genome Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
| | - Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
|
6
|
Yu Y, Zhang Z, Dong X, Yang R, Duan Z, Xiang Z, Li J, Li G, Yan F, Xue H, Jiao D, Lu J, Lu H, Zhang W, Wei Y, Fan S, Li J, Jia J, Zhang J, Ji J, Liu P, Lu H, Zhao H, Chen S, Wei C, Chen H, Zhu Z. Pangenomic analysis of Chinese gastric cancer. Nat Commun 2022; 13:5412. [PMID: 36109518 PMCID: PMC9477819 DOI: 10.1038/s41467-022-33073-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/06/2022] [Accepted: 08/31/2022] [Indexed: 11/25/2022] Open
Abstract
Pangenomic studies may improve the completeness of the human reference genome (GRCh38) and promote precision medicine. Here, we use an automated pipeline for human pangenomic analysis to build a gastric cancer pan-genome from 185 paired deep-sequencing datasets (370 samples), and characterize gene presence-absence variations (PAVs) at the whole-genome level. The genes ACOT1, GSTM1, SIGLEC14 and UGT2B17 are identified as highly absent in the gastric cancer population. A set of genes is predicted from sequences that do not align to GRCh38. We successfully locate one of the predicted genes, GC0643, on chromosome 9q34.2. Overexpression of GC0643 significantly inhibits cell growth, cell migration and invasion, and cell cycle progression, and induces apoptosis in cancer cells. These tumor suppressor functions can be reversed by shGC0643 knockdown. GC0643 is approved by the NCBI database (GenBank: MW194843.1). Collectively, the robust pan-genome strategy provides a deeper understanding of gene PAVs in the human cancer genome. Human pan-genomics is increasing our knowledge of genomic diversity and genetic factors in disease. Here, the authors built a gastric cancer pan-genome that included the sequences of Chinese Han patients, and predicted putative and previously unaligned genes associated with gastric cancer.
|
7
|
Prodanov T, Bansal V. Robust and accurate estimation of paralog-specific copy number for duplicated genes using whole-genome sequencing. Nat Commun 2022; 13:3221. [PMID: 35680869 PMCID: PMC9184528 DOI: 10.1038/s41467-022-30930-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Accepted: 05/20/2022] [Indexed: 11/09/2022] Open
Abstract
The human genome contains hundreds of low-copy repeats (LCRs) that are challenging to analyze using short-read sequencing technologies due to extensive copy number variation and ambiguity in read mapping. Copy number and sequence variants in more than 150 duplicated genes that overlap LCRs have been implicated in monogenic and complex human diseases. We describe a computational tool, Parascopy, for estimating the aggregate and paralog-specific copy number of duplicated genes using whole-genome sequencing (WGS). Parascopy is an efficient method that jointly analyzes reads mapped to different repeat copies without the need for global realignment. It leverages multiple samples to mitigate sequencing bias and to identify reliable paralogous sequence variants (PSVs) that differentiate repeat copies. Analysis of WGS data for 2504 individuals from diverse populations showed that Parascopy is robust to sequencing bias, has higher accuracy compared to existing methods and enables prioritization of pathogenic copy number changes in duplicated genes.
Affiliation(s)
- Timofey Prodanov
- Bioinformatics and Systems Biology Graduate Program, University of California, La Jolla, San Diego, CA, 92093, USA
| | - Vikas Bansal
- Department of Pediatrics, School of Medicine, University of California, La Jolla, San Diego, CA, 92093, USA.
|
8
|
Quan C, Lu H, Lu Y, Zhou G. Population-scale genotyping of structural variation in the era of long-read sequencing. Comput Struct Biotechnol J 2022; 20:2639-2647. [PMID: 35685364 PMCID: PMC9163579 DOI: 10.1016/j.csbj.2022.05.047] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/30/2022] [Revised: 05/24/2022] [Accepted: 05/24/2022] [Indexed: 11/29/2022] Open
Abstract
Population-scale studies of structural variation (SV) are growing rapidly worldwide with the development of long-read sequencing technology, yielding a considerable number of novel SVs and complete gap-closed genome assemblies. Herein, we highlight recent studies using a hybrid sequencing strategy and present the challenges toward large-scale genotyping for SVs due to the reference bias. Genotyping SVs at a population scale remains challenging, which severely impacts genotype-based population genetic studies or genome-wide association studies of complex diseases. We summarize academic efforts to improve genotype quality through linear or graph representations of reference and alternative alleles. Graph-based genotypers capable of integrating diverse genetic information are effectively applied to large and diverse cohorts, contributing to unbiased downstream analysis. Meanwhile, there is still an urgent need in this field for efficient tools to construct complex graphs and perform sequence-to-graph alignments.
Affiliation(s)
- Cheng Quan
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, Beijing 100850, PR China
| | - Hao Lu
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, Beijing 100850, PR China
| | - Yiming Lu
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, Beijing 100850, PR China
- Hebei University, Baoding, Hebei Province 071002, PR China
| | - Gangqiao Zhou
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, Beijing 100850, PR China
- Collaborative Innovation Center for Personalized Cancer Medicine, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu Province 211166, PR China
- Medical College of Guizhou University, Guiyang, Guizhou Province 550025, PR China
- Hebei University, Baoding, Hebei Province 071002, PR China
|
9
|
Kakuta Y, Iwaki H, Umeno J, Kawai Y, Kawahara M, Takagawa T, Shimoyama Y, Naito T, Moroi R, Kuroha M, Shiga H, Watanabe K, Nakamura S, Nakase H, Sasaki M, Hanai H, Fuyuno Y, Hirano A, Matsumoto T, Kudo H, Minegishi N, Nakamura M, Hisamatsu T, Andoh A, Nagasaki M, Tokunaga K, Kinouchi Y, Masamune A. Crohn's Disease and Early Exposure to Thiopurines are Independent Risk Factors for Mosaic Chromosomal Alterations in Patients with Inflammatory Bowel Diseases. J Crohns Colitis 2022; 16:643-655. [PMID: 34751398 DOI: 10.1093/ecco-jcc/jjab199] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 02/08/2023]
Abstract
BACKGROUND AND AIMS Mosaic chromosomal alterations [mCAs] increase the risk for haematopoietic malignancies and may be risk factors for several other diseases. Inflammatory bowel diseases [IBDs], including Crohn's disease [CD] and ulcerative colitis [UC], are associated with mCAs, and patients may be at risk for haematopoietic malignancy development and/or modification of IBD phenotypes. In the present study, we screened patients with IBD for the presence of mCAs and explored the possible pathophysiological and genetic risk factors for mCAs. METHODS We analysed mCAs in peripheral blood from 3339 patients with IBD and investigated the clinical and genetic risk factors for mCAs. RESULTS CD and exposure to thiopurines before the age of 20 years were identified as novel independent risk factors for mCAs [odds ratio = 2.15 and 5.68, p = 1.17e-2 and 1.60e-3, respectively]. In contrast, there were no significant associations of disease duration, anti-tumour necrosis factor alpha antibodies, or other clinical factors with mCAs. Gene ontology enrichment analysis revealed that genes specifically located in the mCAs in patients with CD were significantly associated with factors related to mucosal immune responses. A genome-wide association study revealed that ERBIN, CD96, and AC068672.2 were significantly associated with mCAs in patients with CD [p = 1.56e-8, 1.65e-8, and 4.92e-8, respectively]. CONCLUSIONS The difference in mCAs between patients with CD and UC supports the higher incidence of haematopoietic malignancies in CD. Caution should be exercised when using thiopurines in young patients with IBD, particularly CD, in light of possible chromosomal alterations.
Affiliation(s)
- Yoichi Kakuta
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Hideya Iwaki
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Junji Umeno
- Department of Medicine and Clinical Science, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
| | - Yosuke Kawai
- National Center for Global Health and Medicine, Tokyo, Japan
| | - Masahiro Kawahara
- Division of Gastroenterology and Hematology, Department of Medicine, Shiga University of Medical Science, Shiga, Japan
| | - Tetsuya Takagawa
- Center for Inflammatory Bowel Disease, Division of Internal Medicine, Hyogo College of Medicine, Nishinomiya, Japan
| | - Yusuke Shimoyama
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Takeo Naito
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Rintaro Moroi
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Masatake Kuroha
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Hisashi Shiga
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Kenji Watanabe
- Center for Inflammatory Bowel Disease, Division of Internal Medicine, Hyogo College of Medicine, Nishinomiya, Japan
| | - Shiro Nakamura
- Center for Inflammatory Bowel Disease, Division of Internal Medicine, Hyogo College of Medicine, Nishinomiya, Japan
| | - Hiroshi Nakase
- Department of Gastroenterology and Hepatology, Sapporo Medical University School of Medicine, Sapporo, Japan
| | - Makoto Sasaki
- Division of Gastroenterology, Department of Internal Medicine, Aichi Medical University School of Medicine, Nagakute, Japan
| | | | - Yuta Fuyuno
- Department of Medicine and Clinical Science, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
| | - Atsushi Hirano
- Department of Medicine and Clinical Science, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
| | - Takayuki Matsumoto
- Department of Medicine and Clinical Science, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan; Division of Gastroenterology, Department of Internal Medicine, Iwate Medical University, Morioka, Japan
| | - Hisaaki Kudo
- Department of Biobank, Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan
| | - Naoko Minegishi
- Department of Biobank, Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan
| | - Minoru Nakamura
- Clinical Research Center, National Hospital Organization [NHO] Nagasaki Medical Center, Omura, Japan
| | - Tadakazu Hisamatsu
- Department of Gastroenterology and Hepatology, Kyorin University School of Medicine, Mitaka, Japan
| | - Akira Andoh
- Division of Gastroenterology and Hematology, Department of Medicine, Shiga University of Medical Science, Shiga, Japan
| | - Masao Nagasaki
- Human Biosciences Unit for the Top Global Course Center for the Promotion of Interdisciplinary Education and Research, Kyoto University, Kyoto, Japan
| | | | - Yoshitaka Kinouchi
- Student Healthcare Center, Institute for Excellence in Higher Education, Tohoku University, Sendai, Japan
| | - Atsushi Masamune
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
|
10
|
Huang T, Li J, Wang SM. Etiological roles of core promoter variation in triple-negative breast cancer. Genes Dis 2022; 10:228-238. [PMID: 37013029 PMCID: PMC10066267 DOI: 10.1016/j.gendis.2022.01.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2021] [Revised: 12/26/2021] [Accepted: 01/12/2022] [Indexed: 10/19/2022] Open
Abstract
Abnormal gene expression plays a key role in cancer development. A core promoter is located around the transcriptional start site. Through the interaction between core promoter sequences and transcription factors, the core promoter controls transcriptional initiation. We hypothesized that in cancer, core promoter sequences could be mutated so as to interfere with the interaction with transcription factors, resulting in altered transcriptional initiation, abnormal gene expression, and cancer development. We used triple-negative breast cancer (TNBC) as a model to test our hypothesis. We collected genome-wide core promoter variants from 279 TNBC genomes. After extensive filtering of normal genomic polymorphisms, we identified 19,427 recurrent somatic variants in 1,238 core promoters of 1,274 genes and 1,694 recurrent germline variants in 272 core promoters of 294 genes. Many of the affected genes were oncogenes and tumor suppressors. Analysis of RNA-seq data from the same patient cohort identified increased or decreased gene expression in 439 genes affected by somatic variants and 85 genes affected by germline variants, and the results were validated by luciferase reporter assay. By comparison with core promoter variation data from 610 unclassified breast cancers, we observed that core promoter variants in TNBC were highly TNBC-specific. We further identified drugs targeting the genes with core promoter variation. Our study demonstrates that the core promoter is highly mutable in cancer and can play etiological roles in TNBC and other types of cancer by influencing transcriptional initiation.
|
11
|
Natri HM, Hudjashov G, Jacobs G, Kusuma P, Saag L, Darusallam CC, Metspalu M, Sudoyo H, Cox MP, Gallego Romero I, Banovich NE. Genetic architecture of gene regulation in Indonesian populations identifies QTLs associated with global and local ancestries. Am J Hum Genet 2022; 109:50-65. [PMID: 34919805 PMCID: PMC8764200 DOI: 10.1016/j.ajhg.2021.11.017] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/20/2021] [Accepted: 11/16/2021] [Indexed: 02/07/2023] Open
Abstract
Lack of diversity in human genomics limits our understanding of the genetic underpinnings of complex traits, hinders precision medicine, and contributes to health disparities. To map genetic effects on gene regulation in the underrepresented Indonesian population, we have integrated genotype, gene expression, and CpG methylation data from 115 participants across three island populations that capture the major sources of genomic diversity in the region. In a comparison with European datasets, we identify eQTLs shared between Indonesia and Europe as well as population-specific eQTLs that exhibit differences in allele frequencies and/or overall expression levels between populations. By combining local ancestry and archaic introgression inference with eQTLs and methylQTLs, we identify regulatory loci driven by modern Papuan ancestry as well as introgressed Denisovan and Neanderthal variation. GWAS colocalization connects QTLs detected here to hematological traits, and further comparison with European datasets reflects the poor overall transferability of GWAS statistics across diverse populations. Our findings illustrate how population-specific genetic architecture, local ancestry, and archaic introgression drive variation in gene regulation across genetically distinct and in admixed populations and highlight the need for performing association studies on non-European populations.
Affiliation(s)
- Heini M Natri
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA; The Translational Genomics Research Institute, Phoenix, AZ 85004, USA
| | - Georgi Hudjashov
- Statistics and Bioinformatics Group, School of Fundamental Sciences, Massey University, Palmerston North 4410, New Zealand; Centre for Genomics, Evolution and Medicine, Institute of Genomics, University of Tartu, Tartu 51010, Estonia
| | - Guy Jacobs
- Leverhulme Centre for Human Evolutionary Studies, Department of Archaeology, University of Cambridge, Cambridge CB2 1QH, UK; Complexity Institute, Nanyang Technological University, Singapore, 637460
| | - Pradiptajati Kusuma
- Complexity Institute, Nanyang Technological University, Singapore, 637460; Laboratory of Genome Diversity and Disease, Eijkman Institute for Molecular Biology, Jakarta 10430, Indonesia
| | - Lauri Saag
- Institute of Genomics, University of Tartu, Tartu 51010, Estonia
| | - Chelzie Crenna Darusallam
- Laboratory of Genome Diversity and Disease, Eijkman Institute for Molecular Biology, Jakarta 10430, Indonesia
| | - Mait Metspalu
- Institute of Genomics, University of Tartu, Tartu 51010, Estonia
| | - Herawati Sudoyo
- Laboratory of Genome Diversity and Disease, Eijkman Institute for Molecular Biology, Jakarta 10430, Indonesia
| | - Murray P Cox
- Statistics and Bioinformatics Group, School of Fundamental Sciences, Massey University, Palmerston North 4410, New Zealand
| | - Irene Gallego Romero
- Centre for Genomics, Evolution and Medicine, Institute of Genomics, University of Tartu, Tartu 51010, Estonia; Melbourne Integrative Genomics, University of Melbourne, Parkville, VIC 3010, Australia; School of BioSciences, University of Melbourne, Parkville, VIC 3010, Australia; Centre for Stem Cell Systems, University of Melbourne, Parkville, VIC 3010, Australia
|
12
|
Suzuki K, Kakuta Y, Naito T, Takagawa T, Hanai H, Araki H, Sasaki Y, Sakuraba H, Sasaki M, Hisamatsu T, Motoya S, Matsumoto T, Onodera M, Ishiguro Y, Nakase H, Andoh A, Hiraoka S, Shinozaki M, Fujii T, Katsurada T, Kobayashi T, Fujiya M, Otsuka T, Oshima N, Suzuki Y, Sato Y, Hokari R, Noguchi M, Ohta Y, Matsuura M, Kawai Y, Tokunaga K, Nagasaki M, Kudo H, Minegishi N, Okamoto D, Shimoyama Y, Moroi R, Kuroha M, Shiga H, Li D, McGovern DPB, Kinouchi Y, Masamune A. Genetic Background of Mesalamine-induced Fever and Diarrhea in Japanese Patients with Inflammatory Bowel Disease. Inflamm Bowel Dis 2022; 28:21-31. [PMID: 33501934 DOI: 10.1093/ibd/izab004] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/23/2020] [Indexed: 12/13/2022]
Abstract
BACKGROUND Some patients with inflammatory bowel disease (IBD) who were under mesalamine treatment develop adverse reactions called "mesalamine allergy," which includes high fever and worsening diarrhea. Currently, there is no method to predict mesalamine allergy. Pharmacogenomic approaches may help identify these patients. Here we analyzed the genetic background of mesalamine intolerance in the first genome-wide association study of Japanese patients with IBD. METHODS Two independent pharmacogenetic IBD cohorts were analyzed: the MENDEL (n = 1523; as a discovery set) and the Tohoku (n = 788; as a replication set) cohorts. Genome-wide association studies were performed in each population, followed by a meta-analysis. In addition, we constructed a polygenic risk score model and combined genetic and clinical factors to model mesalamine intolerance. RESULTS In the combined cohort, mesalamine-induced fever and/or diarrhea was significantly more frequent in ulcerative colitis vs Crohn's disease. The genome-wide association studies and meta-analysis identified one significant association between rs144384547 (upstream of RGS17) and mesalamine-induced fever and diarrhea (P = 7.21e-09; odds ratio = 11.2). The estimated heritability of mesalamine allergy was 25.4%, suggesting a significant correlation with the genetic background. Furthermore, a polygenic risk score model was built to predict mesalamine allergy (P = 2.95e-2). The combined genetic/clinical prediction model yielded a higher area under the curve than did the polygenic risk score or clinical model alone (area under the curve, 0.89; sensitivity, 71.4%; specificity, 90.8%). CONCLUSIONS Mesalamine allergy was more common in ulcerative colitis than in Crohn's disease. We identified a novel genetic association with and developed a combined clinical/genetic model for this adverse event.
Affiliation(s)
- Kaoru Suzuki
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Yoichi Kakuta
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Takeo Naito
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan; F. Widjaja Foundation Inflammatory Bowel and Immunology Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Tetsuya Takagawa
- Center for Inflammatory Bowel Disease, Division of Internal Medicine, Hyogo College of Medicine, Nishinomiya, Japan
| | | | - Hiroshi Araki
- Department of Gastroenterology, Gifu University Graduate School of Medicine, Gifu, Japan
| | - Yu Sasaki
- Department of Gastroenterology, Yamagata University Faculty of Medicine, Yamagata, Japan
| | - Hirotake Sakuraba
- Department of Gastroenterology and Hematology, Hirosaki University Graduate School of Medicine, Hirosaki, Japan
| | - Makoto Sasaki
- Division of Gastroenterology, Department of Internal Medicine, Aichi Medical University School of Medicine, Nagakute, Japan
| | - Tadakazu Hisamatsu
- Department of Gastroenterology and Hepatology, Kyorin University School of Medicine, Mitaka, Japan
| | - Satoshi Motoya
- IBD Center, Sapporo-Kosei General Hospital, Sapporo, Japan
| | - Takayuki Matsumoto
- Division of Gastroenterology, Department of Internal Medicine, School of Medicine, Iwate Medical University, Morioka, Japan
| | - Motoyuki Onodera
- Department of Gastroenterology, Iwate Prefectural Isawa Hospital, Iwate, Japan
| | - Yoh Ishiguro
- Department of Gastroenterology and Hematology, Hirosaki National Hospital, Hirosaki, Japan
| | - Hiroshi Nakase
- Department of Gastroenterology and Hepatology, Sapporo Medical University School of Medicine, Sapporo, Japan
| | - Akira Andoh
- Department of Gastroenterology, Shiga University of Medical Science, Otsu, Japan
| | - Sakiko Hiraoka
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry, and Pharmaceutical Sciences, Okayama, Japan
| | - Masaru Shinozaki
- Department of Surgery, IMSUT Hospital, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
| | - Toshimitsu Fujii
- Department of Gastroenterology and Hepatology, Tokyo Medical and Dental University, Tokyo, Japan
| | - Takehiko Katsurada
- Department of Gastroenterology and Hepatology, Graduate School of Medicine, Hokkaido University, Sapporo, Japan
| | - Taku Kobayashi
- Center for Advanced IBD Research and Treatment, Kitasato University Kitasato Institute Hospital, Tokyo, Japan
| | - Mikihiro Fujiya
- Department of Medicine, Division of Gastroenterology and Hematology/Oncology, Asahikawa Medical University, Asahikawa, Japan
| | - Takafumi Otsuka
- Division of Gastroenterology, Department of Internal Medicine, Kobe University Graduate School of Medicine, Hyogo, Japan
| | - Naoki Oshima
- Department of Internal Medicine II, Shimane University Faculty of Medicine, Shimane, Japan
| | - Yasuo Suzuki
- Department of Internal Medicine, Toho University Sakura Medical Center, Sakura, Japan
| | - Yuichirou Sato
- Department of Gastroenterology, Osaki Citizen Hospital, Osaki, Japan
| | - Ryota Hokari
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, National Defense Medical College, Tokorozawa, Japan
| | | | - Yuki Ohta
- Department of Gastroenterology, Graduate School of Medicine, Chiba University, Chiba, Japan
| | - Minoru Matsuura
- Department of Gastroenterology and Hepatology, Kyorin University School of Medicine, Mitaka, Japan; Department of Gastroenterology and Hepatology, Graduate School of Medicine, Kyoto University, Japan
| | - Yosuke Kawai
- Genome Medical Science Project, National Center for Global Health and Medicine, Tokyo, Japan
| | - Katsushi Tokunaga
- Genome Medical Science Project, National Center for Global Health and Medicine, Tokyo, Japan
| | - Masao Nagasaki
- Human Biosciences Unit for the Top Global Course Center for the Promotion of Interdisciplinary Education and Research, Kyoto University, Kyoto, Japan
| | - Hisaaki Kudo
- Department of Biobank, Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan
| | - Naoko Minegishi
- Department of Biobank, Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan
| | - Daisuke Okamoto
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Yusuke Shimoyama
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Rintaro Moroi
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Masatake Kuroha
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
| | - Hisashi Shiga
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan; Department of Gastroenterology, Graduate School of Medicine, Akita University, Akita, Japan
| | - Dalin Li
- F. Widjaja Foundation Inflammatory Bowel and Immunology Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Dermot P B McGovern
- F. Widjaja Foundation Inflammatory Bowel and Immunology Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Yoshitaka Kinouchi
- Health Administration Center, Center for the Advancement of Higher Education, Tohoku University, Sendai, Japan
| | - Atsushi Masamune
- Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
|
13
|
Wahyudi F, Aghakhanian F, Rahman S, Teo YY, Szpak M, Dhaliwal J, Ayub Q. Prioritising positively selected variants in whole-genome sequencing data using FineMAV. BMC Bioinformatics 2021; 22:604. [PMID: 34922440 PMCID: PMC8684245 DOI: 10.1186/s12859-021-04506-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2021] [Accepted: 11/30/2021] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND In population genomics, polymorphisms that are highly differentiated between geographically separated populations are often suggestive of Darwinian positive selection. Genomic scans have highlighted several such regions in African and non-African populations, but only a handful of these have functional data that clearly identify the candidate variants driving the selection process. Fine-Mapping of Adaptive Variation (FineMAV) was developed to address this in a high-throughput manner using population-based whole-genome sequences generated by the 1000 Genomes Project. It pinpoints positively selected genetic variants in sequencing data by prioritizing high-frequency, population-specific and functional derived alleles. RESULTS We developed stand-alone software that implements the FineMAV statistic. To visualise the FineMAV scores graphically, it outputs the statistics as bigWig files, a common file format supported by many genome browsers. It is available with both a command-line and a graphical user interface. The software was tested by replicating the FineMAV scores obtained using 1000 Genomes Project African, European, East Asian and South Asian populations and was subsequently applied to whole-genome sequencing datasets from Singapore and China to highlight population-specific variants that can be subsequently modelled. The software tool is publicly available at https://github.com/fadilla-wahyudi/finemav. CONCLUSIONS The software tool described here determines genome-wide FineMAV scores, using low- or high-coverage whole-genome sequencing datasets, that can be used to prioritize a list of population-specific, highly differentiated candidate variants for in vitro or in vivo functional screens. The tool displays these scores on human genome browsers for easy visualisation, annotation and comparison between different genomic regions in worldwide human populations.
Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04506-9.
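As a rough illustration of how one score can combine the three properties named in the abstract (high frequency, population specificity, and functional impact), the sketch below multiplies a derived-allele purity term, the derived allele frequency, and a functional score. The function names, the penalty exponent of 3.5, and the example values are assumptions made for illustration only; the published FineMAV definition and the software at the GitHub address above remain the reference for the exact statistic.

```python
from typing import List


def derived_allele_purity(dafs: List[float], penalty: float = 3.5) -> float:
    """Purity term: close to 1 when the derived allele is confined to one
    population, much smaller when it is shared evenly across populations.
    `penalty` is a hypothetical exponent chosen for this sketch."""
    total = sum(dafs)
    if total == 0:
        return 0.0  # allele absent everywhere: nothing to prioritise
    return sum((f / total) ** penalty for f in dafs)


def finemav_like_scores(dafs: List[float], functional_score: float,
                        penalty: float = 3.5) -> List[float]:
    """One score per population: purity x derived allele frequency x
    functional score (for example, a scaled deleteriousness score)."""
    purity = derived_allele_purity(dafs, penalty)
    return [purity * f * functional_score for f in dafs]


if __name__ == "__main__":
    # Derived allele frequencies in three hypothetical populations.
    private_variant = [0.9, 0.0, 0.0]   # common in one population only
    shared_variant = [0.5, 0.5, 0.5]    # equally common everywhere
    print(finemav_like_scores(private_variant, functional_score=20.0))
    print(finemav_like_scores(shared_variant, functional_score=20.0))
```

In this sketch a variant that is common in a single population keeps nearly its full frequency-weighted functional score, whereas an evenly shared variant is heavily down-weighted by the purity term.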
Affiliation(s)
- Fadilla Wahyudi
- School of Science, Monash University Malaysia, 47500, Bandar Sunway, Selangor Darul Ehsan, Malaysia
| | - Farhang Aghakhanian
- Monash University Malaysia Genomics Facility, 47500, Bandar Sunway, Selangor Darul Ehsan, Malaysia.,Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104, USA
| | - Sadequr Rahman
- School of Science, Monash University Malaysia, 47500, Bandar Sunway, Selangor Darul Ehsan, Malaysia.,Tropical Medicine and Biology Multidisciplinary Platform, Monash University Malaysia, 47500, Bandar Sunway, Selangor Darul Ehsan, Malaysia
| | - Yik-Ying Teo
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
| | - Michał Szpak
- European Bioinformatics Institute, Hinxton, CB10 1SA, UK.,Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
| | - Jasbir Dhaliwal
- School of Information Technology, Monash University Malaysia, 47500, Bandar Sunway, Selangor Darul Ehsan, Malaysia.
| | - Qasim Ayub
- School of Science, Monash University Malaysia, 47500, Bandar Sunway, Selangor Darul Ehsan, Malaysia.,Monash University Malaysia Genomics Facility, 47500, Bandar Sunway, Selangor Darul Ehsan, Malaysia.,Tropical Medicine and Biology Multidisciplinary Platform, Monash University Malaysia, 47500, Bandar Sunway, Selangor Darul Ehsan, Malaysia.
|
14
|
NyuWa Genome resource: A deep whole-genome sequencing-based variation profile and reference panel for the Chinese population. Cell Rep 2021; 37:110017. [PMID: 34788621 DOI: 10.1016/j.celrep.2021.110017] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 05/15/2020] [Revised: 05/04/2021] [Accepted: 10/28/2021] [Indexed: 01/07/2023] Open
Abstract
The lack of haplotype reference panels and whole-genome sequencing resources specific to the Chinese population has greatly hindered genetic studies in the world's largest population. Here, we present the NyuWa genome resource, based on deep (26.2×) sequencing of 2,999 Chinese individuals, and construct a NyuWa reference panel of 5,804 haplotypes and 19.3 million variants, which is a high-quality publicly available Chinese population-specific reference panel with thousands of samples. Compared with other panels, the NyuWa reference panel reduces the Han Chinese imputation error rate by a margin ranging from 30% to 51%. Population structure and imputation simulation tests support the applicability of one integrated reference panel for northern and southern Chinese. In addition, a total of 22,504 loss-of-function variants in coding and noncoding genes are identified, including 11,493 novel variants. These results highlight the value of the NyuWa genome resource in facilitating genetic research in Chinese and Asian populations.
|
15
|
Li Q, Tian S, Yan B, Liu CM, Lam TW, Li R, Luo R. Building a Chinese pan-genome of 486 individuals. Commun Biol 2021; 4:1016. [PMID: 34462542 PMCID: PMC8405635 DOI: 10.1038/s42003-021-02556-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/12/2020] [Accepted: 08/13/2021] [Indexed: 02/07/2023] Open
Abstract
Pan-genome sequence analysis of human population ancestry is critical for expanding and better defining human genome sequence diversity. However, the amount of genetic variation missing from current human reference sequences is still unknown. Here, we used 486 deep-sequenced Han Chinese genomes to identify 276 Mbp of DNA sequences that, to our knowledge, are absent in the current human reference. We classified these sequences into individual-specific and common sequences, and propose that the common sequence size is uncapped with a growing population. The 46.646 Mbp common sequences obtained from the 486 individuals improved the accuracy of variant calling and mapping rate when added to the reference genome. We also analyzed the genomic positions of these common sequences and found that they came from genomic regions characterized by high mutation rate and low pathogenicity. Our study authenticates the Chinese pan-genome as representative of DNA sequences specific to the Han Chinese population missing from the GRCh38 reference genome and establishes the newly defined common sequences as candidates to supplement the current human reference.
Affiliation(s)
- Qiuhui Li
- Department of Computer Science, The University of Hong Kong, Hong Kong, China
| | - Shilin Tian
- Novogene Bioinformatics Institute, Beijing, China
| | - Bin Yan
- Department of Computer Science, The University of Hong Kong, Hong Kong, China
| | - Chi Man Liu
- Department of Computer Science, The University of Hong Kong, Hong Kong, China
| | - Tak-Wah Lam
- Department of Computer Science, The University of Hong Kong, Hong Kong, China.
| | - Ruiqiang Li
- Novogene Bioinformatics Institute, Beijing, China.
| | - Ruibang Luo
- Department of Computer Science, The University of Hong Kong, Hong Kong, China.
|
16
|
Quan C, Li Y, Liu X, Wang Y, Ping J, Lu Y, Zhou G. Characterization of structural variation in Tibetans reveals new evidence of high-altitude adaptation and introgression. Genome Biol 2021; 22:159. [PMID: 34034800 PMCID: PMC8146648 DOI: 10.1186/s13059-021-02382-3] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 12/23/2020] [Accepted: 05/14/2021] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Structural variation (SV) acts as an essential mutational force shaping the evolution and function of the human genome. However, few studies have examined the role of SVs in high-altitude adaptation, and little is known so far about adaptive introgressed SVs in Tibetans. RESULTS Here, we generate a comprehensive catalog of SVs in a Chinese Tibetan (n = 15) and Han (n = 10) population using nanopore sequencing technology. Among a total of 38,216 unique SVs in the catalog, 27% are sequence-resolved for the first time. We systematically assess the distribution of these SVs across repeat sequences and functional genomic regions. Through genotyping in an additional 276 genomes, we identify 69 Tibetan-Han stratified SVs and 80 candidate adaptive genes. We also discover a few adaptive introgressed SV candidates and provide evidence for a deletion of 335 base pairs at 1p36.32. CONCLUSIONS Overall, our results highlight the important role of SVs in the evolutionary processes of Tibetans' adaptation to the Qinghai-Tibet Plateau and provide a valuable resource for future high-altitude adaptation studies.
Affiliation(s)
- Cheng Quan
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Yuanfeng Li
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Xinyi Liu
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Yahui Wang
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Jie Ping
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Yiming Lu
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Hebei University, Baoding, Hebei Province 071002 People’s Republic of China
| | - Gangqiao Zhou
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Hebei University, Baoding, Hebei Province 071002 People’s Republic of China
- Collaborative Innovation Center for Personalized Cancer Medicine, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu Province 211166 People’s Republic of China
- Medical College of Guizhou University, Guiyang, Guizhou Province 550025 People’s Republic of China
|
17
|
Kausthubham N, Shukla A, Gupta N, Bhavani GS, Kulshrestha S, Das Bhowmik A, Moirangthem A, Bijarnia-Mahay S, Kabra M, Puri RD, Mandal K, Verma IC, Bielas SL, Phadke SR, Dalal A, Girisha KM. A data set of variants derived from 1455 clinical and research exomes is efficient in variant prioritization for early-onset monogenic disorders in Indians. Hum Mutat 2021; 42:e15-e61. [PMID: 33502066 PMCID: PMC10052794 DOI: 10.1002/humu.24172] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 11/02/2020] [Revised: 01/05/2021] [Accepted: 01/24/2021] [Indexed: 12/16/2022]
Abstract
Given their genomic uniqueness, a local data set is highly desirable for Indians, who are underrepresented in existing public databases. We hypothesize that patients with rare monogenic disorders and their family members can provide a reliable source of common variants in the population. Exome sequencing (ES) data from families with rare Mendelian disorders were aggregated from five centers in India. The data set was refined by excluding related individuals and removing the disease-causing variants (refined cohort). The efficiency of these data sets was assessed in a new set of 50 exomes against gnomAD and GenomeAsia. Our original cohort comprised 1455 individuals from 1203 families. The refined cohort had 836 unrelated individuals that retained 1,251,064 variants with 181,125 population-specific and 489,618 common variants. The allele frequencies from our cohort helped to define 97,609 rare variants in gnomAD and 44,520 rare variants in GenomeAsia as common variants in our population. Our variant data set provided an additional 1.7% and 0.1% efficiency for prioritizing heterozygous and homozygous variants, respectively, for rare monogenic disorders. We observed an additional 19 genes/human knockouts. We list carrier frequency for 142 recessive disorders. This is a large and useful resource of exonic variants for Indians. Despite limitations, data sets from patients are efficient tools for variant prioritization in a resource-limited setting.
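The abstract reports carrier frequencies for recessive disorders but does not state how they were computed. One common approximation, shown here purely as an assumption and not as the authors' method, is the Hardy-Weinberg heterozygote frequency 2pq, where q is the combined frequency of disease-causing alleles in a gene and p = 1 - q. The function name and the example frequency are hypothetical.

```python
def carrier_frequency(pathogenic_allele_freq: float) -> float:
    """Expected proportion of heterozygous carriers under Hardy-Weinberg
    equilibrium, given the combined pathogenic allele frequency q."""
    q = pathogenic_allele_freq
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    p = 1.0 - q
    return 2.0 * p * q


if __name__ == "__main__":
    q = 0.01  # hypothetical combined pathogenic allele frequency
    freq = carrier_frequency(q)
    print(f"carrier frequency: {freq:.4f}")
    print(f"about 1 carrier in {1 / freq:.0f} individuals")
```

The approximation assumes random mating and no selection against carriers; counting observed heterozygotes directly in a cohort is an alternative when genotypes are available.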
Affiliation(s)
- Neethukrishna Kausthubham
- Department of Medical Genetics, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, India
| | - Anju Shukla
- Department of Medical Genetics, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, India
| | - Neerja Gupta
- Division of Genetics, Department of Pediatrics, All India Institute of Medical Sciences, New Delhi, India
| | - Gandham S Bhavani
- Department of Medical Genetics, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, India
| | - Samarth Kulshrestha
- Institute of Medical Genetics and Genomics, Sir Ganga Ram Hospital, New Delhi, India
| | - Aneek Das Bhowmik
- Division of Diagnostics, Centre for DNA Fingerprinting and Diagnostics, Hyderabad, India.,ASPIRE (Diagnostics Facility), CSIR-Centre for Cellular & Molecular Biology, CCMB Annexe II, Hyderabad, India
| | - Amita Moirangthem
- Department of Medical Genetics, Sanjay Gandhi Postgraduate Institute of Medical Sciences, Lucknow, India
| | - Sunita Bijarnia-Mahay
- Institute of Medical Genetics and Genomics, Sir Ganga Ram Hospital, New Delhi, India
| | - Madhulika Kabra
- Division of Genetics, Department of Pediatrics, All India Institute of Medical Sciences, New Delhi, India
| | - Ratna D Puri
- Institute of Medical Genetics and Genomics, Sir Ganga Ram Hospital, New Delhi, India
| | - Kausik Mandal
- Department of Medical Genetics, Sanjay Gandhi Postgraduate Institute of Medical Sciences, Lucknow, India
| | - Ishwar C Verma
- Institute of Medical Genetics and Genomics, Sir Ganga Ram Hospital, New Delhi, India
| | - Stephanie L Bielas
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Shubha R Phadke
- Department of Medical Genetics, Sanjay Gandhi Postgraduate Institute of Medical Sciences, Lucknow, India
| | - Ashwin Dalal
- Division of Diagnostics, Centre for DNA Fingerprinting and Diagnostics, Hyderabad, India
| | - Katta M Girisha
- Department of Medical Genetics, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, India
|
18
|
Louzada S, Algady W, Weyell E, Zuccherato LW, Brajer P, Almalki F, Scliar MO, Naslavsky MS, Yamamoto GL, Duarte YAO, Passos-Bueno MR, Zatz M, Yang F, Hollox EJ. Structural variation of the malaria-associated human glycophorin A-B-E region. BMC Genomics 2020; 21:446. [PMID: 32600246 PMCID: PMC7325229 DOI: 10.1186/s12864-020-06849-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/07/2020] [Accepted: 06/18/2020] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND Approximately 5% of the human genome shows common structural variation, which is enriched for genes involved in the immune response and cell-cell interactions. A well-established region of extensive structural variation is the glycophorin gene cluster, comprising three tandemly-repeated regions about 120 kb in length and carrying the highly homologous genes GYPA, GYPB and GYPE. Glycophorin A (encoded by GYPA) and glycophorin B (encoded by GYPB) are glycoproteins present at high levels on the surface of erythrocytes, and they have been suggested to act as decoy receptors for viral pathogens. They are receptors for the invasion of the protist parasite Plasmodium falciparum, a causative agent of malaria. A particular complex structural variant, called DUP4, creates a GYPB-GYPA fusion gene known to confer resistance to malaria. Many other structural variants exist across the glycophorin gene cluster, and they remain poorly characterised. RESULTS Here, we analyse sequences from 3234 diploid genomes from across the world for structural variation at the glycophorin locus, confirming 15 variants in the 1000 Genomes project cohort, discovering 9 new variants, and characterising a selection of these variants using fibre-FISH and breakpoint mapping at the sequence level. We identify variants predicted to create novel fusion genes and a common inversion duplication variant at appreciable frequencies in West Africans. We show that almost all variants can be explained by non-allelic homologous recombination and by comparing the structural variant breakpoints with recombination hotspot maps, confirm the importance of a particular meiotic recombination hotspot on structural variant formation in this region. CONCLUSIONS We identify and validate large structural variants in the human glycophorin A-B-E gene cluster which may be associated with different clinical aspects of malaria.
Affiliation(s)
- Sandra Louzada
- Wellcome Sanger Institute, Hinxton, Cambridge, UK
- Present address: Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology, University of Trás-os-Montes and Alto Douro (UTAD), Vila Real, Portugal
- Present address: BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Lisbon, Portugal
| | - Walid Algady
- Department of Genetics and Genome Biology, University of Leicester, Leicester, UK
| | - Eleanor Weyell
- Department of Genetics and Genome Biology, University of Leicester, Leicester, UK
| | - Luciana W Zuccherato
- Department of Pathology, Faculty of Medicine, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
| | - Paulina Brajer
- Department of Genetics and Genome Biology, University of Leicester, Leicester, UK
| | - Faisal Almalki
- Department of Genetics and Genome Biology, University of Leicester, Leicester, UK
| | - Marilia O Scliar
- Human Genome and Stem Cell Research Center, Department of Genetics and Evolutionary Biology, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
| | - Michel S Naslavsky
- Human Genome and Stem Cell Research Center, Department of Genetics and Evolutionary Biology, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
| | - Guilherme L Yamamoto
- Human Genome and Stem Cell Research Center, Department of Genetics and Evolutionary Biology, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
| | - Yeda A O Duarte
- School of Nursing, Universidade de São Paulo, São Paulo, Brazil
| | - Maria Rita Passos-Bueno
- Human Genome and Stem Cell Research Center, Department of Genetics and Evolutionary Biology, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
| | - Mayana Zatz
- Human Genome and Stem Cell Research Center, Department of Genetics and Evolutionary Biology, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
| | - Edward J Hollox
- Department of Genetics and Genome Biology, University of Leicester, Leicester, UK.
|
19
|
Anderson-Trocmé L, Farouni R, Bourgey M, Kamatani Y, Higasa K, Seo JS, Kim C, Matsuda F, Gravel S. Legacy Data Confound Genomics Studies. Mol Biol Evol 2020; 37:2-10. [PMID: 31504792 DOI: 10.1093/molbev/msz201] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/18/2022] Open
Abstract
Recent reports have identified differences in the mutational spectra across human populations. Although some of these reports have been replicated in other cohorts, most have been reported only in the 1000 Genomes Project (1kGP) data. While investigating an intriguing putative population stratification within the Japanese population, we identified a previously unreported batch effect leading to spurious mutation calls in the 1kGP data and to the apparent population stratification. Because the 1kGP data are used extensively, we find that the batch effects also lead to incorrect imputation by leading imputation servers and a small number of suspicious GWAS associations. Lower quality data from the early phases of the 1kGP thus continue to contaminate modern studies in hidden ways. It may be time to retire or upgrade such legacy sequencing data.
Affiliation(s)
- Luke Anderson-Trocmé
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- McGill University and Genome Quebec Innovation Centre, Montreal, QC, Canada
| | - Rick Farouni
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- McGill University and Genome Quebec Innovation Centre, Montreal, QC, Canada
| | - Mathieu Bourgey
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- McGill University and Genome Quebec Innovation Centre, Montreal, QC, Canada
| | - Yoichiro Kamatani
- Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan
| | - Koichiro Higasa
- Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan
| | - Jeong-Sun Seo
- Bioinformatics Institute, Macrogen Inc, Seoul, Republic of Korea
- Precision Medicine Center, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
| | - Changhoon Kim
- Bioinformatics Institute, Macrogen Inc, Seoul, Republic of Korea
| | - Fumihiko Matsuda
- Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan
| | - Simon Gravel
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- McGill University and Genome Quebec Innovation Centre, Montreal, QC, Canada
|
20
|
Zeng J, Yuan N, Zhu J, Pan M, Zhang H, Wang Q, Shi S, Du Z, Xiao J. RETRACTED: CGVD: a genomic variation database for Chinese populations. Nucleic Acids Res 2020; 48:D1174-D1180. [PMID: 31665422 PMCID: PMC7145633 DOI: 10.1093/nar/gkz952] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/14/2019] [Revised: 10/09/2019] [Accepted: 10/10/2019] [Indexed: 12/21/2022] Open
Abstract
Precision medicine calls upon deeper coverage of population-based sequencing and thorough gene-content and phenotype-based analysis, which lead to a population-associated genomic variation map or database. The Chinese Genomic Variation Database (CGVD; https://bigd.big.ac.cn/cgvd/) is such a database that has combined 48.30 million (M) SNVs and 5.77 M small indels, identified from 991 Chinese individuals of the Chinese Academy of Sciences Precision Medicine Initiative Project (CASPMI) and 301 Chinese individuals of the 1000 Genomes Project (1KGP). The CASPMI project includes whole-genome sequencing data (WGS, 25–30×) from ∼1000 healthy individuals of the CASPMI cohort. To facilitate the usage of such variations for pharmacogenomics studies, star-allele frequencies of the drug-related genes in the CASPMI and 1KGP populations are calculated and provided in CGVD. As one of the important database resources in BIG Data Center, CGVD will continue to collect more genomic variations and to curate structural and functional annotations to support population-based healthcare projects and studies in China and worldwide.
Affiliation(s)
- Jingyao Zeng
- National Genomics Data Center, Beijing 100101, China.,BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Na Yuan
- National Genomics Data Center, Beijing 100101, China.,BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Junwei Zhu
- National Genomics Data Center, Beijing 100101, China.,BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Mengyu Pan
- National Genomics Data Center, Beijing 100101, China.,BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,University of Chinese Academy of Sciences, Beijing 100101, China
| | - Hao Zhang
- National Genomics Data Center, Beijing 100101, China.,BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,University of Chinese Academy of Sciences, Beijing 100101, China
| | - Qi Wang
- National Genomics Data Center, Beijing 100101, China.,BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,University of Chinese Academy of Sciences, Beijing 100101, China
| | - Shuo Shi
- National Genomics Data Center, Beijing 100101, China.,BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,University of Chinese Academy of Sciences, Beijing 100101, China
| | - Zhenglin Du
- National Genomics Data Center, Beijing 100101, China.,BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Jingfa Xiao
- National Genomics Data Center, Beijing 100101, China.,BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.,University of Chinese Academy of Sciences, Beijing 100101, China
|
21
|
Gao Y, Zhang C, Yuan L, Ling Y, Wang X, Liu C, Pan Y, Zhang X, Ma X, Wang Y, Lu Y, Yuan K, Ye W, Qian J, Chang H, Cao R, Yang X, Ma L, Ju Y, Dai L, Tang Y, Zhang G, Xu S. PGG.Han: the Han Chinese genome database and analysis platform. Nucleic Acids Res 2020; 48:D971-D976. [PMID: 31584086 PMCID: PMC6943055 DOI: 10.1093/nar/gkz829] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 08/13/2019] [Revised: 09/11/2019] [Accepted: 09/27/2019] [Indexed: 02/06/2023] Open
Abstract
As the largest ethnic group in the world, the Han Chinese population is nonetheless underrepresented in global efforts to catalogue the genomic variability of natural populations. Here, we developed the PGG.Han, a population genome database to serve as the central repository for the genomic data of the Han Chinese Genome Initiative (Phase I). In its current version, the PGG.Han archives whole-genome sequences or high-density genome-wide single-nucleotide variants (SNVs) of 114 783 Han Chinese individuals (a.k.a. the Han100K), representing geographical sub-populations covering 33 of the 34 administrative divisions of China, as well as Singapore. The PGG.Han provides: (i) an interactive interface for visualization of the fine-scale genetic structure of the Han Chinese population; (ii) genome-wide allele frequencies of hierarchical sub-populations; (iii) ancestry inference for individual samples and controlling population stratification based on nested ancestry informative markers (AIMs) panels; (iv) population-structure-aware shared control data for genotype-phenotype association studies (e.g. GWASs) and (v) a Han-Chinese-specific reference panel for genotype imputation. Computational tools are implemented into the PGG.Han, and an online user-friendly interface is provided for data analysis and results visualization. The PGG.Han database is freely accessible via http://www.pgghan.org or https://www.hanchinesegenomes.org.
Affiliation(s)
- Yang Gao
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Chao Zhang
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Liyun Yuan
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - YunChao Ling
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Xiaoji Wang
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Chang Liu
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Yuwen Pan
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Xiaoxi Zhang
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Xixian Ma
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Yuchen Wang
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Yan Lu
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Collaborative Innovation Center of Genetics and Development, Shanghai 200438, China
| | - Kai Yuan
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Wei Ye
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Jiaqiang Qian
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Huidan Chang
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Ruifang Cao
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Xiao Yang
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Ling Ma
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Yuanhu Ju
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Long Dai
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Yuanyuan Tang
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | | | - Guoqing Zhang
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Shuhua Xu
- Key Laboratory of Computational Biology, Bio-Med Big Data Center, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Collaborative Innovation Center of Genetics and Development, Shanghai 200438, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China

22
Bai WY, Zhu XW, Cong PK, Zhang XJ, Richards JB, Zheng HF. Genotype imputation and reference panel: a systematic evaluation on haplotype size and diversity. Brief Bioinform 2019; 21:bbz108. [PMID: 32002535 DOI: 10.1093/bib/bbz108] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 05/15/2019] [Revised: 07/12/2019] [Accepted: 07/31/2019] [Indexed: 12/12/2022] Open
Abstract
Here, 622 imputations were conducted with 394 customized reference panels for Han Chinese and European populations. Besides validating the fact that imputation accuracy could always benefit from the increased panel size when the reference panel was population specific, the results brought two new thoughts. First, when the haplotype size of the reference panel was fixed, the imputation accuracy of common and low-frequency variants (Minor Allele Frequency (MAF) > 0.5%) decreased while the population diversity of the reference panel increased, but for rare variants (MAF < 0.5%), a small fraction of diversity in panel could improve imputation accuracy. Second, when the haplotype size of the reference panel was increased with extra population-diverse samples, the imputation accuracy of common variants (MAF > 5%) for the European population could always benefit from the expanding sample size. However, for the Han Chinese population, the accuracy of all imputed variants reached the highest when reference panel contained a fraction of an extra diverse sample (8-21%). In addition, we evaluated the imputation performances in the existing reference panels, such as the Haplotype Reference Consortium (HRC), 1000 Genomes Project Phase 3 and the China, Oxford and Virginia Commonwealth University Experimental Research on Genetic Epidemiology (CONVERGE). For the European population, the HRC panel showed the best performance in our analysis. For the Han Chinese population, we proposed an optimum imputation reference panel constituent ratio if researchers would like to customize their own sequenced reference panel, but a high-quality and large-scale Chinese reference panel was still needed. Our findings could be generalized to the other populations with conservative genome; a tool was provided to investigate other populations of interest (https://github.com/Abyss-bai/reference-panel-reconstruction).
Affiliation(s)
- Wei-Yang Bai
- Diseases & Population (DaP) Geninfo Lab, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou 310024, China
| | - Xiao-Wei Zhu
- Diseases & Population (DaP) Geninfo Lab, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou 310024, China
| | - Pei-Kuan Cong
- Diseases & Population (DaP) Geninfo Lab, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou 310024, China
| | - Xue-Jun Zhang
- Institute of Dermatology and Department of Dermatology, Huashan Hospital, Fudan University, Shanghai, China
| | - J Brent Richards
- Lady Davis Institute, Jewish General Hospital, McGill University, Montréal, Québec, Canada
| | - Hou-Feng Zheng
- Diseases & Population (DaP) Geninfo Lab, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou 310024, China

23
Yoo SK, Kim CU, Kim HL, Kim S, Shin JY, Kim N, Yang JSW, Lo KW, Cho B, Matsuda F, Schuster SC, Kim C, Kim JI, Seo JS. NARD: whole-genome reference panel of 1779 Northeast Asians improves imputation accuracy of rare and low-frequency variants. Genome Med 2019; 11:64. [PMID: 31640730 PMCID: PMC6805399 DOI: 10.1186/s13073-019-0677-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 05/08/2019] [Accepted: 10/11/2019] [Indexed: 12/30/2022] Open
Abstract
Here, we present the Northeast Asian Reference Database (NARD), including whole-genome sequencing data of 1779 individuals from Korea, Mongolia, Japan, China, and Hong Kong. NARD provides the genetic diversity of Korean (n = 850) and Mongolian (n = 384) ancestries that were not present in the 1000 Genomes Project Phase 3 (1KGP3). We combined and re-phased the genotypes from NARD and 1KGP3 to construct a union set of haplotypes. This approach established a robust imputation reference panel for Northeast Asians, which yields the greatest imputation accuracy of rare and low-frequency variants compared with the existing panels. NARD imputation panel is available at https://nard.macrogen.com/ .
Affiliation(s)
- Seong-Keun Yoo
- Precision Medicine Center, Seoul National University Bundang Hospital, 172 Dolma-ro, Seongnam, Bundang-gu, Gyeonggi-do, 13605, Republic of Korea
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
| | - Chang-Uk Kim
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Republic of Korea
| | - Hie Lim Kim
- The Asian School of the Environment, Nanyang Technological University, Singapore, Singapore
- Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
| | - Sungjae Kim
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Republic of Korea
| | - Jong-Yeon Shin
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
| | - Namcheol Kim
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
| | | | - Kwok-Wai Lo
- Department of Anatomical & Cellular Pathology and State Key Laboratory of Translational Oncology, The Chinese University of Hong Kong, Hong Kong, China
| | - Belong Cho
- Department of Family Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Fumihiko Matsuda
- Center for Genomic Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | - Stephan C Schuster
- Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
- School of Biological Science, Nanyang Technological University, Singapore, Singapore
| | - Changhoon Kim
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
| | - Jong-Il Kim
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Republic of Korea
- Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
| | - Jeong-Sun Seo
- Precision Medicine Center, Seoul National University Bundang Hospital, 172 Dolma-ro, Seongnam, Bundang-gu, Gyeonggi-do, 13605, Republic of Korea
- Precision Medicine Institute, Macrogen Inc., Seongnam, Republic of Korea
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Republic of Korea
- Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
- Gong-Wu Genomic Medicine Institute, Seoul National University Bundang Hospital, Seongnam, Republic of Korea

24
Du Z, Ma L, Qu H, Chen W, Zhang B, Lu X, Zhai W, Sheng X, Sun Y, Li W, Lei M, Qi Q, Yuan N, Shi S, Zeng J, Wang J, Yang Y, Liu Q, Hong Y, Dong L, Zhang Z, Zou D, Wang Y, Song S, Liu F, Fang X, Chen H, Liu X, Xiao J, Zeng C. Whole Genome Analyses of Chinese Population and De Novo Assembly of A Northern Han Genome. GENOMICS PROTEOMICS & BIOINFORMATICS 2019; 17:229-247. [PMID: 31494266 PMCID: PMC6818495 DOI: 10.1016/j.gpb.2019.07.002] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 01/09/2019] [Revised: 07/07/2019] [Accepted: 08/07/2019] [Indexed: 12/20/2022]
Abstract
To unravel the genetic mechanisms of disease and physiological traits, it requires comprehensive sequencing analysis of large sample size in Chinese populations. Here, we report the primary results of the Chinese Academy of Sciences Precision Medicine Initiative (CASPMI) project launched by the Chinese Academy of Sciences, including the de novo assembly of a northern Han reference genome (NH1.0) and whole genome analyses of 597 healthy people coming from most areas in China. Given the two existing reference genomes for Han Chinese (YH and HX1) were both from the south, we constructed NH1.0, a new reference genome from a northern individual, by combining the sequencing strategies of PacBio, 10× Genomics, and Bionano mapping. Using this integrated approach, we obtained an N50 scaffold size of 46.63 Mb for the NH1.0 genome and performed a comparative genome analysis of NH1.0 with YH and HX1. In order to generate a genomic variation map of Chinese populations, we performed the whole-genome sequencing of 597 participants and identified 24.85 million (M) single nucleotide variants (SNVs), 3.85 M small indels, and 106,382 structural variations. In the association analysis with collected phenotypes, we found that the T allele of rs1549293 in KAT8 significantly correlated with the waist circumference in northern Han males. Moreover, significant genetic diversity in MTHFR, TCN2, FADS1, and FADS2, which associate with circulating folate, vitamin B12, or lipid metabolism, was observed between northerners and southerners. Especially, for the homocysteine-increasing allele of rs1801133 (MTHFR 677T), we hypothesize that there exists a "comfort" zone for a high frequency of 677T between latitudes of 35-45 degree North. Taken together, our results provide a high-quality northern Han reference genome and novel population-specific data sets of genetic variants for use in the personalized and precision medicine.
Affiliation(s)
- Zhenglin Du
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Liang Ma
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Hongzhu Qu
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Wei Chen
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Bing Zhang
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Xi Lu
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Weibo Zhai
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Xin Sheng
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yongqiao Sun
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Wenjie Li
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Meng Lei
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Qiuhui Qi
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Na Yuan
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Shuo Shi
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Jingyao Zeng
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Jinyue Wang
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yadong Yang
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Qi Liu
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yaqiang Hong
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Lili Dong
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Zhewen Zhang
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Dong Zou
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yanqing Wang
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Shuhui Song
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Fan Liu
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xiangdong Fang
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Hua Chen
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xin Liu
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jingfa Xiao
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Changqing Zeng
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China

25
Duan Z, Qiao Y, Lu J, Lu H, Zhang W, Yan F, Sun C, Hu Z, Zhang Z, Li G, Chen H, Xiang Z, Zhu Z, Zhao H, Yu Y, Wei C. HUPAN: a pan-genome analysis pipeline for human genomes. Genome Biol 2019; 20:149. [PMID: 31366358 PMCID: PMC6670167 DOI: 10.1186/s13059-019-1751-y] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 11/27/2018] [Accepted: 07/01/2019] [Indexed: 12/13/2022] Open
Abstract
The human reference genome is still incomplete, especially for those population-specific or individual-specific regions, which may have important functions. Here, we developed a HUman Pan-genome ANalysis (HUPAN) system to build the human pan-genome. We applied it to 185 deep sequencing and 90 assembled Han Chinese genomes and detected 29.5 Mb novel genomic sequences and at least 188 novel protein-coding genes missing in the human reference genome (GRCh38). It can be an important resource for the human genome-related biomedical studies, such as cancer genome analysis. HUPAN is freely available at http://cgm.sjtu.edu.cn/hupan/ and https://github.com/SJTU-CGM/HUPAN .
Affiliation(s)
- Zhongqu Duan
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Yuyang Qiao
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Jinyuan Lu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Huimin Lu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Wenmin Zhang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Fazhe Yan
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Chen Sun
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Zhiqiang Hu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Zhen Zhang
- Department of Radiation Oncology and Department of Oncology, Shanghai Medical College, Fudan University Shanghai Cancer Center, 270 Dong An Road, Shanghai, 200032, China
| | - Guichao Li
- Department of Radiation Oncology and Department of Oncology, Shanghai Medical College, Fudan University Shanghai Cancer Center, 270 Dong An Road, Shanghai, 200032, China
| | - Hongzhuan Chen
- Department of Pharmacology, Shanghai Key Laboratory For Translational Medicine, Shanghai Jiao Tong University School of Medicine, 227 South Chongqing Road, Shanghai, 200025, China
| | - Zhen Xiang
- Department of Surgery, Ruijin Hospital, Shanghai Key Laboratory for Gastric Neoplasms, Shanghai Jiao Tong University School of Medicine, 197 Ruijin Road, Shanghai, 200025, China
| | - Zhenggang Zhu
- Department of Surgery, Ruijin Hospital, Shanghai Key Laboratory for Gastric Neoplasms, Shanghai Jiao Tong University School of Medicine, 197 Ruijin Road, Shanghai, 200025, China
| | - Hongyu Zhao
- SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Department of Biostatistics, Yale University, 60 College Street, New Haven, CT, 06520, USA
| | - Yingyan Yu
- Department of Surgery, Ruijin Hospital, Shanghai Key Laboratory for Gastric Neoplasms, Shanghai Jiao Tong University School of Medicine, 197 Ruijin Road, Shanghai, 200025, China
| | - Chaochun Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Pudong District, Shanghai, 201203, China

26
Le VS, Tran KT, Bui HTP, Le HTT, Nguyen CD, Do DH, Ly HTT, Pham LTD, Dao LTM, Nguyen LT. A Vietnamese human genetic variation database. Hum Mutat 2019; 40:1664-1675. [PMID: 31180159 DOI: 10.1002/humu.23835] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 01/30/2019] [Revised: 05/14/2019] [Accepted: 06/05/2019] [Indexed: 12/29/2022]
Abstract
Large scale human genome projects have created tremendous human genome databases for some well-studied populations. Vietnam has about 95 million people (the 14th largest country by population in the world) of which more than 86% are Kinh people. To date, genetic studies for Vietnamese people mostly rely on genetic information from other populations. Building a Vietnamese human genetic variation database is a must for properly interpreting Vietnamese genetic variants. To this end, we sequenced 105 whole genomes and 200 whole exomes of 305 unrelated Kinh Vietnamese (KHV) people. We also included 101 other previously published KHV genomes to build a Vietnamese human genetic variation database of 406 KHV people. The KHV database contains 24.81 million variants (22.47 million single nucleotide polymorphisms (SNPs) and 2.34 million indels) of which 0.71 million variants are novel. It includes more than 99.3% of variants with a frequency of >1% in the KHV population. Noticeably, the KHV database revealed 107 variants reported in the human genome mutation database as pathological mutations with a frequency above 1% in the KHV population. The KHV database (available at https://genomes.vn) would be beneficial for genetic studies and medical applications not only for the Vietnamese population but also for other closely related populations.
Affiliation(s)
- Vinh S Le
- Vinmec Research Institute of Stem Cell and Gene Technology, Hanoi, Vietnam; Department of Gene Technology, Vinmec International Hospital Times City, Hanoi, Vietnam; Faculty of Information Technology, University of Engineering and Technology, Vietnam National University Hanoi, Hanoi, Vietnam
| | - Kien T Tran
- Vinmec Research Institute of Stem Cell and Gene Technology, Hanoi, Vietnam
| | - Hoa T P Bui
- Vinmec Research Institute of Stem Cell and Gene Technology, Hanoi, Vietnam; Department of Gene Technology, Vinmec International Hospital Times City, Hanoi, Vietnam; School of Environment and Life Science, University of Salford, Manchester, United Kingdom
| | - Huong T T Le
- Vinmec Research Institute of Stem Cell and Gene Technology, Hanoi, Vietnam; Department of Gene Technology, Vinmec International Hospital Times City, Hanoi, Vietnam
| | - Canh D Nguyen
- Faculty of Information Technology, University of Engineering and Technology, Vietnam National University Hanoi, Hanoi, Vietnam
| | - Duong H Do
- Vinmec Research Institute of Stem Cell and Gene Technology, Hanoi, Vietnam; Department of Gene Technology, Vinmec International Hospital Times City, Hanoi, Vietnam
| | - Ha T T Ly
- Vinmec Research Institute of Stem Cell and Gene Technology, Hanoi, Vietnam; Department of Gene Technology, Vinmec International Hospital Times City, Hanoi, Vietnam
| | - Linh T D Pham
- Vinmec Research Institute of Stem Cell and Gene Technology, Hanoi, Vietnam
| | - Lan T M Dao
- Vinmec Research Institute of Stem Cell and Gene Technology, Hanoi, Vietnam
| | - Liem T Nguyen
- Vinmec Research Institute of Stem Cell and Gene Technology, Hanoi, Vietnam

27
Ragsdale AP, Gravel S. Models of archaic admixture and recent history from two-locus statistics. PLoS Genet 2019; 15:e1008204. [PMID: 31181058 PMCID: PMC6586359 DOI: 10.1371/journal.pgen.1008204] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 01/08/2019] [Revised: 06/20/2019] [Accepted: 05/17/2019] [Indexed: 11/18/2022] Open
Abstract
We learn about population history and underlying evolutionary biology through patterns of genetic polymorphism. Many approaches to reconstruct evolutionary histories focus on a limited number of informative statistics describing distributions of allele frequencies or patterns of linkage disequilibrium. We show that many commonly used statistics are part of a broad family of two-locus moments whose expectation can be computed jointly and rapidly under a wide range of scenarios, including complex multi-population demographies with continuous migration and admixture events. A full inspection of these statistics reveals that widely used models of human history fail to predict simple patterns of linkage disequilibrium. To jointly capture the information contained in classical and novel statistics, we implemented a tractable likelihood-based inference framework for demographic history. Using this approach, we show that human evolutionary models that include archaic admixture in Africa, Asia, and Europe provide a much better description of patterns of genetic diversity across the human genome. We estimate that an unidentified, deeply diverged population admixed with modern humans within Africa both before and after the split of African and Eurasian populations, contributing 4 - 8% genetic ancestry to individuals in world-wide populations.
Affiliation(s)
- Aaron P Ragsdale
- Department of Human Genetics, McGill University, Montreal, QC, Canada
| | - Simon Gravel
- Department of Human Genetics, McGill University, Montreal, QC, Canada

28
Yu Q, Zhang W, Zhang X, Zeng Y, Wang Y, Wang Y, Xu L, Huang X, Li N, Zhou X, Lu J, Guo X, Li G, Hou Y, Liu S, Li B. Correction to: Population-wide sampling of retrotransposon insertion polymorphisms using deep sequencing and efficient detection. Gigascience 2019; 8:5308139. [PMID: 30753694 PMCID: PMC6365299 DOI: 10.1093/gigascience/giz008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Affiliation(s)
- Qichao Yu
- BGI Education Center, UCAS: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Wei Zhang
- BGI Education Center, UCAS: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Xiaolong Zhang
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Yongli Zeng
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Yeming Wang
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Yanhui Wang
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Liqin Xu
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Xiaoyun Huang
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Nannan Li
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Xinlan Zhou
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Jie Lu
- BGI College: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Xiaosen Guo
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Guibo Li
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; Department of Biology, University of Copenhagen: Nørregade 10, Copenhagen 1165, Denmark
| | - Yong Hou
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; Department of Biology, University of Copenhagen: Nørregade 10, Copenhagen 1165, Denmark
| | - Shiping Liu
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; School of Biology and Biological Engineering, SCUT: Postdoctoral Apartment Building, South China University of Technology, Wushan RD., TianHe District, Guangzhou, 510640, China
| | - Bo Li
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; BGI-Forensics: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| |
|
29
|
Lan T, Lin H, Zhu W, Laurent TCAM, Yang M, Liu X, Wang J, Wang J, Yang H, Xu X, Guo X. Deep whole-genome sequencing of 90 Han Chinese genomes. Gigascience 2018; 6:1-7. [PMID: 28938720 PMCID: PMC5603764 DOI: 10.1093/gigascience/gix067] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 10/08/2016] [Accepted: 07/20/2017] [Indexed: 12/30/2022] Open
Abstract
Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency < 5%), including 5 813 503 single nucleotide polymorphisms, 1 169 199 InDels, and 17 927 structural variants. Using deep sequencing data, we have built a greatly expanded spectrum of genetic variation for the Han Chinese genome. 
Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000 Genomes Project, as well as to other human genome projects.
Affiliation(s)
- Tianming Lan
- BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Haoxiang Lin
- BGI Genomics, BGI-Shenzhen, Building NO. 7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen, 518083, China
| | - Wenjuan Zhu
- BGI Genomics, BGI-Shenzhen, Building NO. 7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen, 518083, China
| | - Tellier Christian Asker Melchior Laurent
- BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; Department of Biology, University of Copenhagen, Nørregade 10, PO Box 2177 1017 Copenhagen, Denmark
| | - Mengcheng Yang
- BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Xin Liu
- BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Jun Wang
- BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; Department of Biology, University of Copenhagen, Nørregade 10, PO Box 2177 1017 Copenhagen, Denmark
| | - Jian Wang
- BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; James D. Watson Institute of Genome Sciences, 866 Yuhangtang Road, Hangzhou, Zhejiang Province, 310058, P. R. China
| | - Huanming Yang
- BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; James D. Watson Institute of Genome Sciences, 866 Yuhangtang Road, Hangzhou, Zhejiang Province, 310058, P. R. China
| | - Xun Xu
- BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Xiaosen Guo
- BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; Department of Biology, University of Copenhagen, Nørregade 10, PO Box 2177 1017 Copenhagen, Denmark; Shenzhen Key Laboratory of Neurogenomics, BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| |
|
30
|
Yu Q, Zhang W, Zhang X, Zeng Y, Wang Y, Wang Y, Xu L, Huang X, Li N, Zhou X, Lu J, Guo X, Li G, Hou Y, Liu S, Li B. Population-wide sampling of retrotransposon insertion polymorphisms using deep sequencing and efficient detection. Gigascience 2018; 6:1-11. [PMID: 28938719 PMCID: PMC5603766 DOI: 10.1093/gigascience/gix066] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 12/21/2016] [Accepted: 07/20/2017] [Indexed: 12/20/2022] Open
Abstract
Active retrotransposons play important roles during evolution and continue to shape our genomes today, especially in genetic polymorphisms underlying a diverse set of diseases. However, studies of human retrotransposon insertion polymorphisms (RIPs) based on whole-genome deep sequencing at the population level have not been sufficiently undertaken, despite the obvious need for a thorough characterization of RIPs in the general population. Herein, we present a novel and efficient computational tool called Specific Insertions Detector (SID) for the detection of non-reference RIPs. We demonstrate that SID is suitable for high-depth whole-genome sequencing data using paired-end reads obtained from simulated and real datasets. We construct a comprehensive RIP database using a large population of 90 Han Chinese individuals with a mean ×68 depth per individual. In total, we identify 9342 recent RIPs, and 8433 of these RIPs are novel compared with dbRIP, including 5826 Alu, 2169 long interspersed nuclear element 1 (L1), 383 SVA, and 55 long terminal repeats. Among the 9342 RIPs, 4828 were located in gene regions and 5 were located in protein-coding regions. We demonstrate that RIPs can, in principle, be an informative resource to perform population evolution and phylogenetic analyses. Taking the demographic effects into account, we identify a weak negative selection on SVA and L1 but an approximately neutral selection for Alu elements based on the frequency spectrum of RIPs. SID is a powerful open-source program for the detection of non-reference RIPs. We built a non-reference RIP dataset that greatly enhanced the diversity of RIPs detected in the general population, and it should be invaluable to researchers interested in many aspects of human evolution, genetics, and disease. As a proof of concept, we demonstrate that the RIPs can be used as biomarkers in a similar way as single nucleotide polymorphisms.
Affiliation(s)
- Qichao Yu
- BGI Education Center, UCAS: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Wei Zhang
- BGI Education Center, UCAS: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Xiaolong Zhang
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Yongli Zeng
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Yeming Wang
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Yanhui Wang
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Liqin Xu
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Xiaoyun Huang
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Nannan Li
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Xinlan Zhou
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Jie Lu
- BGI College: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Xiaosen Guo
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| | - Guibo Li
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; Department of Biology, University of Copenhagen: Nørregade 10, Copenhagen 1165, Denmark
| | - Yong Hou
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; Department of Biology, University of Copenhagen: Nørregade 10, Copenhagen 1165, Denmark
| | - Shiping Liu
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; School of Biology and Biological Engineering, SCUT: Postdoctoral Apartment Building, South China University of Technology, Wushan RD., TianHe District, Guangzhou, 510640, China
| | - Bo Li
- BGI-Shenzhen: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; BGI-Forensics: Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
| |
|