1
Zhou B, Arthur JG, Guo H, Kim T, Huang Y, Pattni R, Wang T, Kundu S, Luo JXJ, Lee H, Nachun DC, Purmann C, Monte EM, Weimer AK, Qu PP, Shi M, Jiang L, Yang X, Fullard JF, Bendl J, Girdhar K, Kim M, Chen X, Greenleaf WJ, Duncan L, Ji HP, Zhu X, Song G, Montgomery SB, Palejev D, Zu Dohna H, Roussos P, Kundaje A, Hallmayer JF, Snyder MP, Wong WH, Urban AE. Detection and analysis of complex structural variation in human genomes across populations and in brains of donors with psychiatric disorders. Cell 2024:S0092-8674(24)01032-8. [PMID: 39353437 DOI: 10.1016/j.cell.2024.09.014] [Received: 09/12/2023] [Revised: 07/01/2024] [Accepted: 09/10/2024] [Indexed: 10/04/2024]
Abstract
Complex structural variations (cxSVs) are often overlooked in genome analyses due to detection challenges. We developed ARC-SV, a probabilistic and machine-learning-based method that enables accurate detection and reconstruction of cxSVs from standard datasets. By applying ARC-SV across 4,262 genomes representing all continental populations, we identified cxSVs as a significant source of natural human genetic variation. Rare cxSVs have a propensity to occur in neural genes and loci that underwent rapid human-specific evolution, including those regulating corticogenesis. By performing single-nucleus multiomics in postmortem brains, we discovered cxSVs associated with differential gene expression and chromatin accessibility across various brain regions and cell types. Additionally, cxSVs detected in brains of psychiatric cases are enriched for linkage with psychiatric GWAS risk alleles detected in the same brains. Furthermore, our analysis revealed significantly decreased brain-region- and cell-type-specific expression of cxSV genes, specifically for psychiatric cases, implicating cxSVs in the molecular etiology of major neuropsychiatric disorders.
Affiliation(s)
- Bo Zhou
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA; Maternal and Child Health Research Institute, Stanford University School of Medicine, Stanford, CA 94305, USA.
- Joseph G Arthur
- Department of Statistics, Stanford University, Stanford, CA 94305, USA
- Hanmin Guo
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA; Maternal and Child Health Research Institute, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Statistics, Stanford University, Stanford, CA 94305, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Taeyoung Kim
- School of Computer Science and Engineering, Pusan National University, Busan 46241, South Korea
- Yiling Huang
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Reenal Pattni
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Tao Wang
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Soumya Kundu
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Jay X J Luo
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- HoJoon Lee
- Division of Oncology, Department of Medicine, Stanford University, Stanford, CA 94305, USA
- Daniel C Nachun
- Department of Pathology, Stanford University, Stanford, CA 94305, USA
- Carolin Purmann
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA; Maternal and Child Health Research Institute, Stanford University School of Medicine, Stanford, CA 94305, USA
- Emma M Monte
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Annika K Weimer
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Ping-Ping Qu
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Minyi Shi
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Lixia Jiang
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Xinqiong Yang
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- John F Fullard
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Jaroslav Bendl
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Kiran Girdhar
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Minsu Kim
- School of Computer Science and Engineering, Pusan National University, Busan 46241, South Korea
- Xi Chen
- Department of Statistics, Stanford University, Stanford, CA 94305, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Laramie Duncan
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA
- Hanlee P Ji
- Division of Oncology, Department of Medicine, Stanford University, Stanford, CA 94305, USA
- Xiang Zhu
- Department of Statistics, Stanford University, Stanford, CA 94305, USA; Department of Statistics, Pennsylvania State University, University Park, PA 16802, USA; Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
- Giltae Song
- School of Computer Science and Engineering, Pusan National University, Busan 46241, South Korea; Center for Artificial Intelligence Research, Pusan National University, Busan 46241, South Korea
- Stephen B Montgomery
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Maternal and Child Health Research Institute, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; Department of Pathology, Stanford University, Stanford, CA 94305, USA
- Dean Palejev
- Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, Sofia 1113, Bulgaria
- Heinrich Zu Dohna
- Department of Biology, American University of Beirut, Beirut 11-0236, Lebanon
- Panos Roussos
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Center for Precision Medicine and Translational Therapeutics, James J. Peters VA Medical Center, Bronx, NY 10468, USA; Mental Illness Research Education and Clinical Center (VISN 2 South), James J. Peters VA Medical Center, Bronx, NY 10468, USA
- Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Joachim F Hallmayer
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA
- Michael P Snyder
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Wing H Wong
- Department of Statistics, Stanford University, Stanford, CA 94305, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA.
- Alexander E Urban
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA; Maternal and Child Health Research Institute, Stanford University School of Medicine, Stanford, CA 94305, USA.
2
Zheng Y, Shang X. FindCSV: a long-read based method for detecting complex structural variations. BMC Bioinformatics 2024; 25:315. [PMID: 39342151 PMCID: PMC11439270 DOI: 10.1186/s12859-024-05937-w] [Received: 03/24/2024] [Accepted: 09/18/2024] [Indexed: 10/01/2024]
Abstract
BACKGROUND: Structural variations play a significant role in genetic diseases and evolutionary mechanisms. Extensive research has been conducted over the past decade to detect simple structural variations, leading to the development of well-established detection methods. However, recent studies have highlighted the potentially greater impact of complex structural variations on individuals compared to simple structural variations. Despite this, the field still lacks precise detection methods specifically designed for complex structural variations. Therefore, the development of a highly efficient and accurate detection method is of utmost importance. RESULT: In response to this need, we propose a novel method called FindCSV, which leverages deep learning techniques and consensus sequences to enhance the detection of SVs using long-read sequencing data. Compared to current methods, FindCSV performs better in detecting complex and simple structural variations. CONCLUSIONS: FindCSV is a new method to detect complex and simple structural variations with reasonable accuracy in real and simulated data. The source code for the program is available at https://github.com/nwpuzhengyan/FindCSV.
Affiliation(s)
- Yan Zheng
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China.
- Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China.
3
Junjun R, Zhengqian Z, Ying W, Jialiang W, Yongzhuang L. A comprehensive review of deep learning-based variant calling methods. Brief Funct Genomics 2024; 23:303-313. [PMID: 38366908 DOI: 10.1093/bfgp/elae003] [Received: 12/01/2023] [Revised: 01/14/2024] [Accepted: 01/18/2024] [Indexed: 02/18/2024]
Abstract
Genome sequencing data have become increasingly important in the field of personalized medicine and diagnosis. However, accurately detecting genomic variations remains a challenging task. Traditional variation detection methods rely on manual inspection or predefined rules, which can be time-consuming and prone to errors. Consequently, deep learning-based approaches for variation detection have gained attention due to their ability to automatically learn genomic features that distinguish between variants. In our review, we discuss the recent advancements in deep learning-based algorithms for detecting small variations and structural variations in genomic data, as well as their advantages and limitations.
Affiliation(s)
- Ren Junjun
- Harbin Institute of Technology, School of Computer Science and Technology, Harbin 150001, China
- Zhang Zhengqian
- Harbin Institute of Technology, School of Computer Science and Technology, Harbin 150001, China
- Wu Ying
- Harbin Institute of Technology, School of Computer Science and Technology, Harbin 150001, China
- Wang Jialiang
- Harbin Institute of Technology, School of Computer Science and Technology, Harbin 150001, China
- Liu Yongzhuang
- Harbin Institute of Technology, School of Computer Science and Technology, Harbin 150001, China
4
Lemire G, Sanchis-Juan A, Russell K, Baxter S, Chao KR, Singer-Berk M, Groopman E, Wong I, England E, Goodrich J, Pais L, Austin-Tse C, DiTroia S, O'Heir E, Ganesh VS, Wojcik MH, Evangelista E, Snow H, Osei-Owusu I, Fu J, Singh M, Mostovoy Y, Huang S, Garimella K, Kirkham SL, Neil JE, Shao DD, Walsh CA, Argilli E, Le C, Sherr EH, Gleeson JG, Shril S, Schneider R, Hildebrandt F, Sankaran VG, Madden JA, Genetti CA, Beggs AH, Agrawal PB, Bujakowska KM, Place E, Pierce EA, Donkervoort S, Bönnemann CG, Gallacher L, Stark Z, Tan TY, White SM, Töpf A, Straub V, Fleming MD, Pollak MR, Õunap K, Pajusalu S, Donald KA, Bruwer Z, Ravenscroft G, Laing NG, MacArthur DG, Rehm HL, Talkowski ME, Brand H, O'Donnell-Luria A. Exome copy number variant detection, analysis, and classification in a large cohort of families with undiagnosed rare genetic disease. Am J Hum Genet 2024; 111:863-876. [PMID: 38565148 PMCID: PMC11080278 DOI: 10.1016/j.ajhg.2024.03.008] [Received: 10/05/2023] [Revised: 03/09/2024] [Accepted: 03/11/2024] [Indexed: 04/04/2024]
Abstract
Copy number variants (CNVs) are significant contributors to the pathogenicity of rare genetic diseases and, with new innovative methods, can now reliably be identified from exome sequencing. Challenges still remain in accurate classification of CNV pathogenicity. CNV calling using GATK-gCNV was performed on exomes from a cohort of 6,633 families (15,759 individuals) with heterogeneous phenotypes and variable prior genetic testing collected at the Broad Institute Center for Mendelian Genomics of the Genomics Research to Elucidate the Genetics of Rare Diseases consortium and analyzed using the seqr platform. The addition of CNV detection to exome analysis identified causal CNVs for 171 families (2.6%). The estimated sizes of CNVs ranged from 293 bp to 80 Mb. The causal CNVs consisted of 140 deletions, 15 duplications, 3 suspected complex structural variants (SVs), 3 insertions, and 10 complex SVs, the latter two groups being identified by orthogonal confirmation methods. To classify CNV variant pathogenicity, we used the 2020 American College of Medical Genetics and Genomics/ClinGen CNV interpretation standards and developed additional criteria to evaluate allelic and functional data as well as variants on the X chromosome to further advance the framework. We interpreted 151 CNVs as likely pathogenic/pathogenic and 20 CNVs as high-interest variants of uncertain significance. Calling CNVs from existing exome data increases the diagnostic yield for individuals undiagnosed after standard testing approaches, providing a higher-resolution alternative to arrays at a fraction of the cost of genome sequencing. Our improvements to the classification approach advance the systematic framework to assess the pathogenicity of CNVs.
Affiliation(s)
- Gabrielle Lemire
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA.
- Alba Sanchis-Juan
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Kathryn Russell
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Samantha Baxter
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Katherine R Chao
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Moriel Singer-Berk
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Emily Groopman
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Isaac Wong
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Eleina England
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Julia Goodrich
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Lynn Pais
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Christina Austin-Tse
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Stephanie DiTroia
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Emily O'Heir
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Vijay S Ganesh
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA
- Monica H Wojcik
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Emily Evangelista
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Hana Snow
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ikeoluwa Osei-Owusu
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Jack Fu
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Mugdha Singh
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Yulia Mostovoy
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Steve Huang
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Kiran Garimella
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Samantha L Kirkham
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Jennifer E Neil
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA
- Diane D Shao
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; Department of Neurology, Boston Children's Hospital, Boston, MA, USA
- Christopher A Walsh
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA
- Emanuela Argilli
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA; Institute of Human Genetics and Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Carolyn Le
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA; Institute of Human Genetics and Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Elliott H Sherr
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA; Institute of Human Genetics and Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Joseph G Gleeson
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA, USA
- Shirlee Shril
- Harvard Medical School, Boston, MA, USA; Department of Pediatrics, Boston Children's Hospital, Boston, MA, USA
- Ronen Schneider
- Harvard Medical School, Boston, MA, USA; Department of Pediatrics, Boston Children's Hospital, Boston, MA, USA
- Friedhelm Hildebrandt
- Harvard Medical School, Boston, MA, USA; Department of Pediatrics, Boston Children's Hospital, Boston, MA, USA
- Vijay G Sankaran
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Division of Hematology/Oncology, Boston Children's Hospital and Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Jill A Madden
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Casie A Genetti
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Alan H Beggs
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Pankaj B Agrawal
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Kinga M Bujakowska
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Ocular Genomics Institute, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA
- Emily Place
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Ocular Genomics Institute, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA
- Eric A Pierce
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Ocular Genomics Institute, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA
- Sandra Donkervoort
- Neuromuscular and Neurogenetic Disorders of Childhood Section, Neurogenetics Branch, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Carsten G Bönnemann
- Neuromuscular and Neurogenetic Disorders of Childhood Section, Neurogenetics Branch, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Lyndon Gallacher
- Department of Paediatrics, University of Melbourne, Parkville, VIC, Australia; Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Parkville, VIC, Australia
- Zornitza Stark
- Department of Paediatrics, University of Melbourne, Parkville, VIC, Australia; Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Parkville, VIC, Australia
- Tiong Yang Tan
- Department of Paediatrics, University of Melbourne, Parkville, VIC, Australia; Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Parkville, VIC, Australia
- Susan M White
- Department of Paediatrics, University of Melbourne, Parkville, VIC, Australia; Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Parkville, VIC, Australia
- Ana Töpf
- John Walton Muscular Dystrophy Research Centre, Newcastle University and Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
- Volker Straub
- John Walton Muscular Dystrophy Research Centre, Newcastle University and Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
- Mark D Fleming
- Harvard Medical School, Boston, MA, USA; Department of Pathology, Boston Children's Hospital, Boston, MA, USA
- Martin R Pollak
- Harvard Medical School, Boston, MA, USA; Division of Nephrology, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Katrin Õunap
- Department of Clinical Genetics, Genetics and Personalized Medicine Clinic, Tartu University Hospital, Tartu, Estonia; Department of Genetics and Personalized Medicine, Institute of Clinical Medicine, Faculty of Medicine, University of Tartu, Tartu, Estonia
- Sander Pajusalu
- Department of Clinical Genetics, Genetics and Personalized Medicine Clinic, Tartu University Hospital, Tartu, Estonia; Department of Genetics and Personalized Medicine, Institute of Clinical Medicine, Faculty of Medicine, University of Tartu, Tartu, Estonia
- Kirsten A Donald
- Department of Paediatrics and Child Health, Red Cross War Memorial Children's Hospital, Cape Town, South Africa; Neuroscience Institute, University of Cape Town, Cape Town, South Africa
- Zandre Bruwer
- Department of Paediatrics and Child Health, Red Cross War Memorial Children's Hospital, Cape Town, South Africa; Neuroscience Institute, University of Cape Town, Cape Town, South Africa
- Gianina Ravenscroft
- University of Western Australia Centre for Medical Research, Harry Perkins Institute of Medical Research, QEII Medical Centre, Nedlands, WA, Australia
- Nigel G Laing
- University of Western Australia Centre for Medical Research, Harry Perkins Institute of Medical Research, QEII Medical Centre, Nedlands, WA, Australia
- Daniel G MacArthur
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Centre for Population Genomics, Garvan Institute of Medical Research and UNSW, Sydney, NSW, Australia; Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, VIC, Australia
- Heidi L Rehm
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Michael E Talkowski
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Harrison Brand
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Anne O'Donnell-Luria
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA.
5
Li L, Hong C, Xu J, Chung CYL, Leung AKY, Boncan DAT, Cheng L, Lo KW, Lai PBS, Wong J, Zhou J, Cheng ASL, Chan TF, Yue F, Yip KY. Accurate identification of structural variations from cancer samples. Brief Bioinform 2023; 25:bbad520. [PMID: 38233091 PMCID: PMC10794023 DOI: 10.1093/bib/bbad520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 12/11/2023] [Accepted: 12/18/2023] [Indexed: 01/19/2024] Open
Abstract
Structural variations (SVs) are commonly found in cancer genomes. They can cause gene amplification, deletion and fusion, among other functional consequences. With an average read length of hundreds of kilobases, nano-channel-based optical DNA mapping is powerful in detecting large SVs. However, existing SV calling methods are not tailored for cancer samples, which have special properties such as mixed cell types and sub-clones. Here we propose the Cancer Optical Mapping for detecting Structural Variations (COMSV) method that is specifically designed for cancer samples. It shows high sensitivity and specificity in benchmark comparisons. Applied to cancer cell lines and patient samples, COMSV identifies hundreds of novel SVs per sample.
Affiliation(s)
- Le Li
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
| | - Chenyang Hong
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
| | - Jie Xu
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, Illinois 60208, USA
| | - Claire Yik-Lok Chung
- School of Life Sciences and State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
| | - Alden King-Yung Leung
- School of Life Sciences and State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
| | - Delbert Almerick T Boncan
- School of Life Sciences and State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
| | - Lixin Cheng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
| | - Kwok-Wai Lo
- Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
| | - Paul B S Lai
- Department of Surgery, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
| | - John Wong
- Department of Surgery, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
| | - Jingying Zhou
- School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
| | - Alfred Sze-Lok Cheng
- School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
| | - Ting-Fung Chan
- School of Life Sciences and State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
| | - Feng Yue
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, Illinois 60208, USA
| | - Kevin Y Yip
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Sanford Burnham Prebys Medical Discovery Institute, La Jolla, California 92037, USA
6
Lemire G, Sanchis-Juan A, Russell K, Baxter S, Chao KR, Singer-Berk M, Groopman E, Wong I, England E, Goodrich J, Pais L, Austin-Tse C, DiTroia S, O’Heir E, Ganesh VS, Wojcik MH, Evangelista E, Snow H, Osei-Owusu I, Fu J, Singh M, Mostovoy Y, Huang S, Garimella K, Kirkham SL, Neil JE, Shao DD, Walsh CA, Argili E, Le C, Sherr EH, Gleeson J, Shril S, Schneider R, Hildebrandt F, Sankaran VG, Madden JA, Genetti CA, Beggs AH, Agrawal PB, Bujakowska KM, Place E, Pierce EA, Donkervoort S, Bönnemann CG, Gallacher L, Stark Z, Tan T, White SM, Töpf A, Straub V, Fleming MD, Pollak MR, Õunap K, Pajusalu S, Donald KA, Bruwer Z, Ravenscroft G, Laing NG, MacArthur DG, Rehm HL, Talkowski ME, Brand H, O’Donnell-Luria A. Exome copy number variant detection, analysis and classification in a large cohort of families with undiagnosed rare genetic disease. medRxiv 2023:2023.10.05.23296595. [PMID: 37873196 PMCID: PMC10593084 DOI: 10.1101/2023.10.05.23296595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Copy number variants (CNVs) are significant contributors to the pathogenicity of rare genetic diseases and, with new methods, can now reliably be identified from exome sequencing. Challenges still remain in accurate classification of CNV pathogenicity. CNV calling using GATK-gCNV was performed on exomes from a cohort of 6,633 families (15,759 individuals) with heterogeneous phenotypes and variable prior genetic testing collected at the Broad Institute Center for Mendelian Genomics of the GREGoR consortium. Each family's CNV data was analyzed using the seqr platform and candidate CNVs classified using the 2020 ACMG/ClinGen CNV interpretation standards. We developed additional evidence criteria to address situations not covered by the current standards. The addition of CNV calling to exome analysis identified causal CNVs for 173 families (2.6%). The estimated sizes of CNVs ranged from 293 bp to 80 Mb with estimates that 44% would not have been detected by standard chromosomal microarrays. The causal CNVs consisted of 141 deletions, 15 duplications, 4 suspected complex structural variants (SVs), 3 insertions and 10 complex SVs, the latter two groups being identified by orthogonal validation methods. We interpreted 153 CNVs as likely pathogenic/pathogenic and 20 CNVs as high interest variants of uncertain significance. Calling CNVs from existing exome data increases the diagnostic yield for individuals undiagnosed after standard testing approaches, providing a higher resolution alternative to arrays at a fraction of the cost of genome sequencing. Our improvements to the classification approach advance the systematic framework to assess the pathogenicity of CNVs.
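The counts quoted in this abstract can be cross-checked with simple arithmetic. The sketch below only tests the reported figures for internal consistency; the variable names are illustrative and do not come from the study.

```python
# Figures as reported in the abstract; variable names are illustrative only.
families_total = 6633
families_with_causal_cnv = 173

# Breakdown of causal CNVs by variant type, as listed in the abstract.
causal_by_type = {
    "deletion": 141,
    "duplication": 15,
    "suspected complex SV": 4,
    "insertion": 3,
    "complex SV": 10,
}

# Breakdown by interpretation category, as listed in the abstract.
by_classification = {
    "likely pathogenic/pathogenic": 153,
    "high-interest VUS": 20,
}

type_total = sum(causal_by_type.values())
classification_total = sum(by_classification.values())
diagnostic_yield_pct = round(100 * families_with_causal_cnv / families_total, 1)

print(type_total, classification_total, diagnostic_yield_pct)
```

Both breakdowns sum to the 173 causal findings, and 173 of 6,633 families rounds to the stated 2.6%.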
Affiliation(s)
- Gabrielle Lemire
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- These authors contributed equally
| | - Alba Sanchis-Juan
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- These authors contributed equally
| | - Kathryn Russell
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Samantha Baxter
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Katherine R. Chao
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Moriel Singer-Berk
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Emily Groopman
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
| | - Isaac Wong
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Eleina England
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Julia Goodrich
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Lynn Pais
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Christina Austin-Tse
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Stephanie DiTroia
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Emily O’Heir
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Vijay S. Ganesh
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Neurology, Brigham and Women’s Hospital, Boston, MA, USA
| | - Monica H. Wojcik
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Emily Evangelista
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Hana Snow
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Ikeoluwa Osei-Owusu
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Jack Fu
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Mugdha Singh
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Yulia Mostovoy
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Steve Huang
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Kiran Garimella
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Samantha L. Kirkham
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
| | - Jennifer E. Neil
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Howard Hughes Medical Institute, Boston Children’s Hospital, Boston, MA, USA
| | - Diane D. Shao
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Department of Neurology, Boston Children’s Hospital, Boston, MA, USA
| | - Christopher A. Walsh
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Howard Hughes Medical Institute, Boston Children’s Hospital, Boston, MA, USA
| | - Emanuela Argili
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Institute of Human Genetics and Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
| | - Carolyn Le
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Institute of Human Genetics and Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
| | - Elliott H. Sherr
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Institute of Human Genetics and Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
| | - Joseph Gleeson
- Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
- Rady Children’s Institute for Genomic Medicine, San Diego, CA, USA
| | - Shirlee Shril
- Harvard Medical School, Boston, MA, USA
- Department of Pediatrics, Boston Children’s Hospital, Boston, MA, USA
| | - Ronen Schneider
- Harvard Medical School, Boston, MA, USA
- Department of Pediatrics, Boston Children’s Hospital, Boston, MA, USA
| | - Friedhelm Hildebrandt
- Harvard Medical School, Boston, MA, USA
- Department of Pediatrics, Boston Children’s Hospital, Boston, MA, USA
| | - Vijay G. Sankaran
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Division of Hematology/Oncology, Boston Children’s Hospital and Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
| | - Jill A. Madden
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- The Manton Center for Orphan Disease Research, Boston Children’s Hospital, Boston, MA, USA
| | - Casie A. Genetti
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- The Manton Center for Orphan Disease Research, Boston Children’s Hospital, Boston, MA, USA
| | - Alan H. Beggs
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- The Manton Center for Orphan Disease Research, Boston Children’s Hospital, Boston, MA, USA
| | - Pankaj B. Agrawal
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- The Manton Center for Orphan Disease Research, Boston Children’s Hospital, Boston, MA, USA
| | - Kinga M. Bujakowska
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Ocular Genomics Institute, Department of Ophthalmology, Massachusetts Eye and Ear Infirmary, Boston, MA, USA
| | - Emily Place
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Ocular Genomics Institute, Department of Ophthalmology, Massachusetts Eye and Ear Infirmary, Boston, MA, USA
| | - Eric A. Pierce
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Ocular Genomics Institute, Department of Ophthalmology, Massachusetts Eye and Ear Infirmary, Boston, MA, USA
| | - Sandra Donkervoort
- Neuromuscular and Neurogenetic Disorders of Childhood Section, Neurogenetics Branch, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
| | - Carsten G. Bönnemann
- Neuromuscular and Neurogenetic Disorders of Childhood Section, Neurogenetics Branch, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
| | - Lyndon Gallacher
- Department of Paediatrics, University of Melbourne, Parkville, Victoria, Australia
- Victorian Clinical Genetics Services, Murdoch Children’s Research Institute, Parkville, Victoria, Australia
| | - Zornitza Stark
- Department of Paediatrics, University of Melbourne, Parkville, Victoria, Australia
- Victorian Clinical Genetics Services, Murdoch Children’s Research Institute, Parkville, Victoria, Australia
| | - Tiong Tan
- Department of Paediatrics, University of Melbourne, Parkville, Victoria, Australia
- Victorian Clinical Genetics Services, Murdoch Children’s Research Institute, Parkville, Victoria, Australia
| | - Susan M. White
- Department of Paediatrics, University of Melbourne, Parkville, Victoria, Australia
- Victorian Clinical Genetics Services, Murdoch Children’s Research Institute, Parkville, Victoria, Australia
| | - Ana Töpf
- John Walton Muscular Dystrophy Research Centre, Newcastle University and Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
| | - Volker Straub
- John Walton Muscular Dystrophy Research Centre, Newcastle University and Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
| | - Mark D. Fleming
- Harvard Medical School, Boston, MA, USA
- Department of Pathology, Boston Children’s Hospital, Boston, MA, USA
| | - Martin R. Pollak
- Harvard Medical School, Boston, MA, USA
- Division of Nephrology, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Katrin Õunap
- Department of Clinical Genetics, Genetics and Personalized Medicine Clinic, Tartu University Hospital, Tartu, Estonia
- Department of Clinical Genetics, Institute of Clinical Medicine, Faculty of Medicine, University of Tartu, Tartu, Estonia
| | - Sander Pajusalu
- Department of Clinical Genetics, Genetics and Personalized Medicine Clinic, Tartu University Hospital, Tartu, Estonia
- Department of Clinical Genetics, Institute of Clinical Medicine, Faculty of Medicine, University of Tartu, Tartu, Estonia
| | - Kirsten A. Donald
- Department of Paediatrics and Child Health, Red Cross War Memorial Children’s Hospital, Cape Town, South Africa
- University of Cape Town, Cape Town, South Africa
| | - Zandre Bruwer
- Department of Paediatrics and Child Health, Red Cross War Memorial Children’s Hospital, Cape Town, South Africa
- University of Cape Town, Cape Town, South Africa
| | - Gianina Ravenscroft
- University of Western Australia, Harry Perkins Institute of Medical Research, QEII Medical Centre, Nedlands, Australia
| | - Nigel G. Laing
- University of Western Australia, Harry Perkins Institute of Medical Research, QEII Medical Centre, Nedlands, Australia
| | - Daniel G. MacArthur
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Centre for Population Genomics, Garvan Institute, Sydney, Australia
- Centre for Population Genomics, Murdoch Children’s Research Institute, Melbourne, Australia
| | - Heidi L. Rehm
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Michael E. Talkowski
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Harrison Brand
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Senior authors
| | - Anne O’Donnell-Luria
- Broad Institute Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Senior authors
7
Du Q, Xu Q, Pan F, Shi Y, Yu F, Zhang T, Jiang J, Liu W, Pan X, Han D, Zhang H. Association between Intestinal Colonization and Extraintestinal Infection with Carbapenem-Resistant Klebsiella pneumoniae in Children. Microbiol Spectr 2023; 11:e0408822. [PMID: 36916927 PMCID: PMC10100809 DOI: 10.1128/spectrum.04088-22] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/07/2022] [Accepted: 02/15/2023] [Indexed: 03/16/2023] Open
Abstract
Carbapenem-resistant Klebsiella pneumoniae (CRKP) has become a critical public health threat. However, the association between intestinal colonization and parenteral infection among pediatric patients has not been elucidated. We collected 8 fecal CRKP strains and 10 corresponding CRKP strains responsible for extraintestinal infection from eight patients who did not manifest infection upon admission to the hospital. Paired isolates showed identical resistance to antimicrobials and identical virulence in vitro and in vivo. wzi capsule typing, multilocus sequence typing, and whole-genome sequencing (WGS) indicated high similarity between paired colonizing and infecting isolates. Mutations between colonizing and infecting isolate pairs found by WGS had a distinctive molecular signature of a high proportion of complex structural variants. The mutated genes were involved in pathways associated with infection-related physiological and pathogenic functions, including antibiotic resistance, virulence, and response to the extracellular environment. The latter is important for bacterial infection of environmental niches. Various mutations related to antibiotic resistance, virulence, and colonization that were not associated with any particular mutational hot spot correlated with an increased risk of extraintestinal infection. Notably, novel subclone carbapenem-resistant hypervirulent K. pneumoniae (CR-hvKP) KL19-ST15 exhibited hypervirulence in experimental assays that reflected the severe clinical symptoms of two patients infected with the clonal strains. Taken together, our findings indicate the association between CRKP intestinal colonization and extraintestinal infection, suggesting that active screening for colonization on admission could decrease infection risk in children.
IMPORTANCE Carbapenem-resistant Klebsiella pneumoniae (CRKP) causes an increasing number of nosocomial infections, which can be life-threatening, as carbapenems are last-resort antibiotics. K. pneumoniae is part of the healthy human microbiome, and this provides a potential advantage for infection. This study demonstrated that CRKP intestinal colonization is strongly linked to extraintestinal infection, based on the evidence given by whole-genome sequencing data and phenotypic assays of antimicrobial resistance and virulence. Apart from these findings, our in-depth analysis of point mutations and chromosome structural variants in patient-specific infecting isolates compared with colonizing isolates may contribute insights into bacterial adaptation underlying CRKP infection. In addition, a novel subclone of carbapenem-resistant hypervirulent K. pneumoniae (CR-hvKP) was observed in the study. This finding highlights the importance of CRKP active surveillance among children, targeting in particular the novel high-risk CR-hvKP clone.
Affiliation(s)
- Qingqing Du
- Department of Clinical Laboratory, Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Institute of Pediatric Infection, Immunity, and Critical Care Medicine, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Qi Xu
- Department of Infectious Diseases, Research Laboratory of Clinical Virology, Ruijin Hospital, Shanghai Jiao Tong University, School of Medicine, Shanghai, China
| | - Fen Pan
- Department of Clinical Laboratory, Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Institute of Pediatric Infection, Immunity, and Critical Care Medicine, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Yingying Shi
- Department of Clinical Laboratory, Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Institute of Pediatric Infection, Immunity, and Critical Care Medicine, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Fangyuan Yu
- Department of Clinical Laboratory, Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Institute of Pediatric Infection, Immunity, and Critical Care Medicine, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Tiandong Zhang
- Department of Clinical Laboratory, Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Institute of Pediatric Infection, Immunity, and Critical Care Medicine, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Jie Jiang
- Department of Clinical Laboratory, Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Institute of Pediatric Infection, Immunity, and Critical Care Medicine, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Wenxin Liu
- Department of Clinical Laboratory, Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Institute of Pediatric Infection, Immunity, and Critical Care Medicine, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Xiaozhou Pan
- Department of Clinical Laboratory, Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Institute of Pediatric Infection, Immunity, and Critical Care Medicine, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Dingding Han
- Department of Clinical Laboratory, Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Institute of Pediatric Infection, Immunity, and Critical Care Medicine, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Hong Zhang
- Department of Clinical Laboratory, Shanghai Children’s Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Institute of Pediatric Infection, Immunity, and Critical Care Medicine, Shanghai Jiao Tong University School of Medicine, Shanghai, China
8
Wang S, Zhang X, Qiang G, Wang J. DelInsCaller: An Efficient Algorithm for Identifying Delins and Estimating Haplotypes from Long Reads with High Level of Sequencing Errors. Genes (Basel) 2022; 14:4. [PMID: 36672745 PMCID: PMC9858578 DOI: 10.3390/genes14010004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 10/27/2022] [Accepted: 10/28/2022] [Indexed: 12/24/2022] Open
Abstract
Delins, also known as complex indel, is a combined genomic structural variation formed by deleting and inserting DNA fragments at a common genomic location. Recent studies emphasized the importance of delins in cancer diagnosis and treatment. Although the long reads from PacBio CLR sequencing significantly facilitate delins calling, the existing approaches still encounter computational challenges from the high level of sequencing errors, and often introduce errors in genotyping and phasing delins. In this paper, we propose an efficient algorithmic pipeline, named delInsCaller, to identify delins at haplotype resolution from the PacBio CLR sequencing data. delInsCaller uses a fault-tolerant method that calculates a variation density score, which helps to locate the candidate mutational regions under a high level of sequencing errors. It adopts a base association-based contig splicing method, which facilitates contig splicing in the presence of false-positive interference. We conducted a series of experiments on simulated datasets, and the results showed that delInsCaller outperformed several state-of-the-art approaches, e.g., SVseq3, across a wide range of parameter settings, such as read depth, sequencing error rates, etc. delInsCaller often obtained higher f-measures than other approaches; specifically, it was able to maintain advantages at ~15% sequencing errors. delInsCaller was also able to significantly improve the N50 values with almost no loss of haplotype accuracy compared with the existing approach.
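The abstract reports two evaluation metrics, the f-measure and N50, without defining them. The sketch below uses their standard textbook definitions and is not taken from delInsCaller; the function names and inputs are hypothetical. The abstract does not say whether N50 refers to contigs or haplotype blocks, and the definition applies to any list of lengths.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def n50(lengths: list[int]) -> int:
    """Largest length L such that pieces of length >= L cover at least half the total."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths; kept for completeness


# Hypothetical inputs, for illustration only.
example_f = f_measure(true_pos=8, false_pos=2, false_neg=2)
example_n50 = n50([2, 3, 4, 5, 6])
```

In the hypothetical inputs, precision and recall are both 0.8, and the two longest pieces (6 and 5) already cover more than half of the total length of 20.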
Affiliation(s)
- Shenjie Wang
- School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an 710049, China
- Xuanping Zhang
- School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an 710049, China
- Geng Qiang
- School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an 710049, China
- Jiayin Wang
- School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an 710049, China
|
9
|
Schuy J, Grochowski CM, Carvalho CMB, Lindstrand A. Complex genomic rearrangements: an underestimated cause of rare diseases. Trends Genet 2022; 38:1134-1146. [PMID: 35820967 PMCID: PMC9851044 DOI: 10.1016/j.tig.2022.06.003] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 02/15/2022] [Revised: 05/12/2022] [Accepted: 06/06/2022] [Indexed: 01/24/2023]
Abstract
Complex genomic rearrangements (CGRs) are known contributors to disease but are often missed during routine genetic screening. Identifying CGRs requires (i) identifying copy number variants (CNVs) concurrently with inversions, (ii) phasing multiple breakpoint junctions in cis, as well as (iii) detecting and resolving structural variants (SVs) within repeats. We demonstrate how combining cytogenetics and new sequencing methodologies is being successfully applied to gain insights into the genomic architecture of CGRs. In addition, we review CGR patterns and molecular features revealed by studying constitutional genomic disorders. These data offer invaluable lessons to individuals interested in investigating CGRs, evaluating their clinical relevance and frequency, as well as assessing their impact(s) on rare genetic diseases.
Affiliation(s)
- Jakob Schuy
- Department of Molecular Medicine and Surgery and Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
- Claudia M B Carvalho
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA; Pacific Northwest Research Institute, Seattle, WA, USA
- Anna Lindstrand
- Department of Molecular Medicine and Surgery and Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden; Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden.
|
10
|
Lin J, Wang S, Audano PA, Meng D, Flores JI, Kosters W, Yang X, Jia P, Marschall T, Beck CR, Ye K. SVision: a deep learning approach to resolve complex structural variants. Nat Methods 2022; 19:1230-1233. [PMID: 36109679 PMCID: PMC9985066 DOI: 10.1038/s41592-022-01609-w] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 01/18/2022] [Accepted: 08/11/2022] [Indexed: 01/23/2023]
Abstract
Complex structural variants (CSVs) encompass multiple breakpoints and are often missed or misinterpreted. We developed SVision, a deep-learning-based multi-object-recognition framework, to automatically detect and characterize CSVs from long-read sequencing data. SVision outperforms current callers at identifying the internal structure of complex events and has revealed 80 high-quality CSVs with 25 distinct structures from an individual genome. SVision directly detects CSVs without matching known structures, allowing sensitive detection of both common and previously uncharacterized complex rearrangements.
Affiliation(s)
- Jiadong Lin
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Genome Institute, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Leiden Institute of Advanced Computer Science, Faculty of Science, Leiden University, Leiden, the Netherlands
- Songbo Wang
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Genome Institute, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Peter A Audano
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Deyu Meng
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, China
- Macau Institute of Systems Engineering, Macau University of Science and Technology, Taipa, Macau
- Jacob I Flores
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Walter Kosters
- Leiden Institute of Advanced Computer Science, Faculty of Science, Leiden University, Leiden, the Netherlands
- Xiaofei Yang
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Peng Jia
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Genome Institute, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Tobias Marschall
- Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Dusseldorf, Germany
- Christine R Beck
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, University of Connecticut Health Center, Farmington, CT, USA
- Kai Ye
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China.
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China.
- Genome Institute, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China.
- The School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, China.
- Faculty of Science, Leiden University, Leiden, the Netherlands.
|
11
|
Uppuluri L, Wang Y, Young E, Wong JS, Abid HZ, Xiao M. Multiplex structural variant detection by whole-genome mapping and nanopore sequencing. Sci Rep 2022; 12:6512. [PMID: 35444207 PMCID: PMC9021263 DOI: 10.1038/s41598-022-10483-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2021] [Accepted: 04/08/2022] [Indexed: 11/26/2022] Open
Abstract
Identification of structural variant (SV) breakpoints is important in studying mutations, mutagenic causes, and functional impacts. Next-generation sequencing and whole-genome optical mapping are extensively used in SV discovery and characterization. However, multiple platforms and computational approaches are needed for comprehensive analysis, making it resource-intensive and expensive. Here, we propose a strategy combining optical mapping and Cas9-assisted targeted nanopore sequencing to analyze SVs. Optical mapping can economically and quickly detect SVs across a whole genome but does not provide sequence-level information or precisely resolve breakpoints. Furthermore, since only a subset of all SVs is known to affect biology, we attempted to type a subset of all SVs using targeted nanopore sequencing. Using our approach, we resolved the breakpoints of five deletions, five insertions, and an inversion in a single experiment.
Affiliation(s)
- Lahari Uppuluri
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA; Department of Mechanical Engineering and Mechanics, Drexel University, Philadelphia, PA, USA
- Yilin Wang
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Eleanor Young
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Jessica S Wong
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Heba Z Abid
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Ming Xiao
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA; Center for Genomic Sciences, Institute of Molecular Medicine and Infectious Disease, Drexel University, Philadelphia, PA, USA.
|
12
|
Saitou M, Masuda N, Gokcumen O. Similarity-Based Analysis of Allele Frequency Distribution among Multiple Populations Identifies Adaptive Genomic Structural Variants. Mol Biol Evol 2022; 39:msab313. [PMID: 34718708 PMCID: PMC8896759 DOI: 10.1093/molbev/msab313] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022] Open
Abstract
Structural variants have a considerable impact on human genomic diversity. However, their evolutionary history remains mostly unexplored. Here, we developed a new method to identify potentially adaptive structural variants based on a similarity-based analysis that incorporates genotype frequency data from 26 populations simultaneously. Using this method, we analyzed 57,629 structural variants and identified 576 structural variants that show unusual population differentiation. Of these putatively adaptive structural variants, we further showed that 24 variants are multiallelic and overlap with coding sequences, and 20 variants are significantly associated with GWAS traits. Closer inspection of the haplotypic variation associated with these putatively adaptive and functional structural variants reveals deviations from neutral expectations due to: 1) population differentiation of rapidly evolving multiallelic variants, 2) incomplete sweeps, and 3) recent population-specific negative selection. Overall, our study provides new methodological insights, documents hundreds of putatively adaptive variants, and introduces evolutionary models that may better explain the complex evolution of structural variants.
Affiliation(s)
- Marie Saitou
- Department of Biological Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL, USA
- Naoki Masuda
- Department of Mathematics, University at Buffalo, State University of New York, Buffalo, NY, USA
- Computational and Data-Enabled Science and Engineering Program, University at Buffalo, State University of New York, Buffalo, NY, USA
- Omer Gokcumen
- Department of Biological Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
|
13
|
Krinsky BH, Arthur RK, Xia S, Sosa D, Arsala D, White KP, Long M. Rapid Cis-Trans Coevolution Driven by a Novel Gene Retroposed from a Eukaryotic Conserved CCR4-NOT Component in Drosophila. Genes (Basel) 2021; 13:57. [PMID: 35052398 PMCID: PMC8774992 DOI: 10.3390/genes13010057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/20/2021] [Revised: 12/10/2021] [Accepted: 12/23/2021] [Indexed: 12/11/2022] Open
Abstract
Young, or newly evolved, genes arise ubiquitously across the tree of life, and they can rapidly acquire novel functions that influence a diverse array of biological processes. Previous work identified a young regulatory duplicate gene in Drosophila, Zeus that unexpectedly diverged rapidly from its parent, Caf40, an extremely conserved component in the CCR4-NOT machinery in post-transcriptional and post-translational regulation of eukaryotic cells, and took on roles in the male reproductive system. This neofunctionalization was accompanied by differential binding of the Zeus protein to loci throughout the Drosophila melanogaster genome. However, the way in which new DNA-binding proteins acquire and coevolve with their targets in the genome is not understood. Here, by comparing Zeus ChIP-Seq data from D. melanogaster and D. simulans to the ancestral Caf40 binding events from D. yakuba, a species that diverged before the duplication event, we found a dynamic pattern in which Zeus binding rapidly coevolved with a previously unknown DNA motif, which we term Caf40 and Zeus-Associated Motif (CAZAM), under the influence of positive selection. Interestingly, while both copies of Zeus acquired targets at male-biased and testis-specific genes, D. melanogaster and D. simulans proteins have specialized binding on different chromosomes, a pattern echoed in the evolution of the associated motif. Using CRISPR-Cas9-mediated gene knockout of Zeus and RNA-Seq, we found that Zeus regulated the expression of 661 differentially expressed genes (DEGs). Our results suggest that the evolution of young regulatory genes can be coupled to substantial rewiring of the transcriptional networks into which they integrate, even over short evolutionary timescales. Our results thus uncover dynamic genome-wide evolutionary processes associated with new genes.
Affiliation(s)
- Benjamin H. Krinsky
- Committee on Evolutionary Biology, University of Chicago, Chicago, IL 60637, USA
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Robert K. Arthur
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago and Argonne National Laboratory, Chicago, IL 60637, USA
- Shengqian Xia
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Dylan Sosa
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Deanna Arsala
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Kevin P. White
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago and Argonne National Laboratory, Chicago, IL 60637, USA
- Manyuan Long
- Committee on Evolutionary Biology, University of Chicago, Chicago, IL 60637, USA
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
|
14
|
Ferragut Cardoso AP, Banerjee M, Nail AN, Lykoudi A, States JC. miRNA dysregulation is an emerging modulator of genomic instability. Semin Cancer Biol 2021; 76:120-131. [PMID: 33979676 PMCID: PMC8576067 DOI: 10.1016/j.semcancer.2021.05.004] [Citation(s) in RCA: 54] [Impact Index Per Article: 18.0] [Received: 04/01/2021] [Revised: 04/30/2021] [Accepted: 05/03/2021] [Indexed: 12/12/2022]
Abstract
Genomic instability consists of a range of genetic alterations within the genome that contributes to tumor heterogeneity and drug resistance. It is a well-established characteristic of most cancer cells. Genome instability induction results from defects in DNA damage surveillance mechanisms, mitotic checkpoints and DNA repair machinery. Accumulation of genetic alterations ultimately sets cells towards malignant transformation. Recent studies suggest that miRNAs are key players in mediating genome instability. miRNAs are a class of small RNAs expressed in most somatic tissues and are part of the epigenome. Importantly, in many cancers, miRNA expression is dysregulated. Consequently, this review examines the role of miRNA dysregulation as a causal step for induction of genome instability and subsequent carcinogenesis. We focus specifically on mechanistic studies assessing miRNA(s) and specific subtypes of genome instability or known modes of genome instability. In addition, we provide insight on the existing knowledge gaps within the field and possible ways to address them.
Affiliation(s)
- Ana P Ferragut Cardoso
- Department of Pharmacology and Toxicology, University of Louisville, Louisville, KY, 40202, USA
- Mayukh Banerjee
- Department of Pharmacology and Toxicology, University of Louisville, Louisville, KY, 40202, USA
- Alexandra N Nail
- Department of Pharmacology and Toxicology, University of Louisville, Louisville, KY, 40202, USA
- Angeliki Lykoudi
- Department of Pharmacology and Toxicology, University of Louisville, Louisville, KY, 40202, USA
- J Christopher States
- Department of Pharmacology and Toxicology, University of Louisville, Louisville, KY, 40202, USA.
|
15
|
Nagura Y, Fujiwara K, Matsuura K, Iio E, Tanaka Y, Kataoka H. Complex structural variations in non-human primate hepatitis B virus. Virol J 2021; 18:200. [PMID: 34627299 PMCID: PMC8501659 DOI: 10.1186/s12985-021-01667-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2020] [Accepted: 09/21/2021] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND: Recent genome sequencing technology has revealed a novel type of genetic rearrangement referred to as complex structural variations (SVs). Previous studies have elucidated the complex SVs in human hepatitis B viruses (HBVs). In this study, we investigated the existence of complex SVs in HBVs from non-human primates (NHPs). METHODS: Searches for nucleotide sequences of NHP HBV were conducted using PubMed, and genetic sequences were retrieved from databases. The candidate genetic sequences harboring complex SVs were analyzed using the CLUSTALW program and MAFFT. Additional bioinformatical analyses were performed to determine strains with complex SVs and to elucidate characteristics of NHP HBV strains. RESULTS: One hundred and fifty-four HBV strains from NHPs were identified from databases. SVs and complex SVs were observed in 11 (7.1%) strains. Three gibbon HBV (GiHBV) strains showed complex SVs consisting of an insertion and a deletion in the pre-S1 region. One GiHBV strain possessed a 6-nt insertion in the Core region, which is normally specific to human HBV genotype A (HBV/A), and further analyses clarified that the 6-nt insertion was caused not by recombination but by simple insertion. Another chimpanzee HBV strain showed complex SVs in the pre-S1 region, which were composed of human HBV/E, G-specific polymorphic SV, and an additional 6-nt insertion. CONCLUSIONS: In this study, complex SVs were observed in HBV strains from NHPs, in addition to human HBV strains, as shown in previous studies. These data suggest that complex SVs could also be found in other members of hepadnaviruses, and may play a role in their genetic diversity.
Affiliation(s)
- Yoshihito Nagura
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Mizuho, Nagoya, Aichi, 467-8601, Japan
- Kei Fujiwara
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Mizuho, Nagoya, Aichi, 467-8601, Japan.
- Kentaro Matsuura
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Mizuho, Nagoya, Aichi, 467-8601, Japan
- Etsuko Iio
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Mizuho, Nagoya, Aichi, 467-8601, Japan
- Yasuhito Tanaka
- Department of Virology and Liver Unit, Nagoya City University Graduate School of Medicinal Sciences, Nagoya, 467-8601, Japan
- Hiromi Kataoka
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Mizuho, Nagoya, Aichi, 467-8601, Japan
|
16
|
Mako: A Graph-based Pattern Growth Approach to Detect Complex Structural Variants. Genomics Proteomics Bioinformatics 2021; 20:205-218. [PMID: 34224879 PMCID: PMC9510932 DOI: 10.1016/j.gpb.2021.03.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/17/2021] [Revised: 03/05/2021] [Accepted: 03/05/2021] [Indexed: 11/21/2022]
Abstract
Complex structural variants (CSVs) are genomic alterations that have more than two breakpoints and are considered as the simultaneous occurrence of simple structural variants. However, detecting the compounded mutational signals of CSVs is challenging through a commonly used model-match strategy. As a result, there has been limited progress for CSV discovery compared with simple structural variants. Here, we systematically analyzed the multi-breakpoint connection feature of CSVs, and proposed Mako, utilizing a bottom-up guided model-free strategy, to detect CSVs from paired-end short-read sequencing. Specifically, we implemented a graph-based pattern growth approach, where the graph depicts potential breakpoint connections, and pattern growth enables CSV detection without pre-defined models. Comprehensive evaluations on both simulated and real datasets revealed that Mako outperformed other algorithms. Notably, validation rates of CSVs on real data based on experimental and computational validations as well as manual inspections are around 70%, where the medians of experimental and computational breakpoint shift are 13 bp and 26 bp, respectively. Moreover, the Mako CSV subgraph effectively characterized the breakpoint connections of a CSV event and uncovered a total of 15 CSV types, including two novel types of adjacent segment swap and tandem dispersed duplication. Further analysis of these CSVs also revealed the impact of sequence homology on the formation of CSVs. Mako is publicly available at https://github.com/xjtu-omics/Mako.
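The abstract describes a graph whose nodes are breakpoints and whose edges are potential breakpoint connections. The sketch below does not reproduce Mako's pattern growth procedure; it uses a much simpler stand-in, connected components of an undirected breakpoint graph, to show how events with more than two linked breakpoints could be separated from simple two-breakpoint events. The function name, the `(chromosome, position)` node encoding, and the toy edges are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Breakpoint = Tuple[str, int]  # (chromosome, position), a hypothetical encoding


def complex_events(edges: List[Tuple[Breakpoint, Breakpoint]],
                   min_breakpoints: int = 3) -> List[List[Breakpoint]]:
    """Return connected components that contain at least min_breakpoints nodes.

    Each edge links two breakpoints that some read evidence connects
    (for example a discordant read pair or a split read).
    """
    graph: Dict[Breakpoint, Set[Breakpoint]] = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)

    seen: Set[Breakpoint] = set()
    events: List[List[Breakpoint]] = []
    for node in sorted(graph):
        if node in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack = [node]
        seen.add(node)
        component = []
        while stack:
            cur = stack.pop()
            component.append(cur)
            for nxt in graph[cur]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if len(component) >= min_breakpoints:
            events.append(sorted(component))
    return events


if __name__ == "__main__":
    toy_edges = [
        (("chr1", 100), ("chr1", 500)),
        (("chr1", 500), ("chr1", 900)),
        (("chr1", 900), ("chr1", 1300)),
        (("chr2", 50), ("chr2", 400)),  # simple event: only two breakpoints
    ]
    print(complex_events(toy_edges))
```

In this toy input the four chained chr1 breakpoints form one candidate complex event, while the isolated chr2 pair is left out because it has only two breakpoints.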
|
17
|
Badet T, Fouché S, Hartmann FE, Zala M, Croll D. Machine-learning predicts genomic determinants of meiosis-driven structural variation in a eukaryotic pathogen. Nat Commun 2021; 12:3551. [PMID: 34112792 PMCID: PMC8192914 DOI: 10.1038/s41467-021-23862-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/30/2020] [Accepted: 05/11/2021] [Indexed: 02/05/2023] Open
Abstract
Species harbor extensive structural variation underpinning recent adaptive evolution. However, the causality between genomic features and the induction of new rearrangements is poorly established. Here, we analyze a global set of telomere-to-telomere genome assemblies of a fungal pathogen of wheat to establish a nucleotide-level map of structural variation. We show that the recent emergence of pesticide resistance has been disproportionally driven by rearrangements. We use machine learning to train a model on structural variation events based on 30 chromosomal sequence features. We show that base composition and gene density are the major determinants of structural variation. Retrotransposons explain most inversion, indel and duplication events. We apply our model to Arabidopsis thaliana and show that our approach extends to more complex genomes. Finally, we analyze complete genomes of haploid offspring in a four-generation pedigree. Meiotic crossover locations are enriched for new rearrangements consistent with crossovers being mutational hotspots. The model trained on species-wide structural variation accurately predicts the position of >74% of newly generated variants along the pedigree. The predictive power highlights causality between specific sequence features and the induction of chromosomal rearrangements. Our work demonstrates that training sequence-derived models can accurately identify regions of intrinsic DNA instability in eukaryotic genomes.
Affiliation(s)
- Thomas Badet
- Laboratory of Evolutionary Genetics, Institute of Biology, University of Neuchâtel, Neuchâtel, Switzerland
- Simone Fouché
- Laboratory of Evolutionary Genetics, Institute of Biology, University of Neuchâtel, Neuchâtel, Switzerland
- Plant Pathology, Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Fanny E Hartmann
- Ecologie Systématique Evolution, Bâtiment 360, Univ. Paris-Sud, AgroParisTech, CNRS, Université Paris-Saclay, Orsay, France
- Marcello Zala
- Plant Pathology, Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Daniel Croll
- Laboratory of Evolutionary Genetics, Institute of Biology, University of Neuchâtel, Neuchâtel, Switzerland.
|
18
|
Novel Genetic Rearrangements in Hepatitis B Virus: Complex Structural Variations and Structural Variation Polymorphisms. Viruses 2021; 13:v13030473. [PMID: 33809245 PMCID: PMC8000817 DOI: 10.3390/v13030473] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/31/2021] [Revised: 03/06/2021] [Accepted: 03/11/2021] [Indexed: 12/11/2022] Open
Abstract
Chronic hepatitis B virus (HBV) causes serious clinical problems, such as liver cirrhosis and hepatocellular carcinoma. Current antiviral treatments suppress HBV; however, the clinical cure rate remains low. Basic research on HBV is indispensable to eradicate and cure HBV. Genetic alterations are defined by nucleotide substitutions and canonical forms of structural variations (SVs), such as insertion, deletion and duplication. Additionally, genetic changes inconsistent with the canonical forms have been reported, and these have been termed complex SVs. Detailed analyses of HBV using bioinformatical applications have detected complex SVs in HBV genomes. Sequence gaps and low sequence similarity have been observed in the region containing complex SVs. Additionally, insertional motif sequences have been observed in HBV strains with complex SVs. Following the analyses of complex SVs in the HBV genome, the role of SVs in the genetic diversity of orthohepadnavirus has been investigated. SV polymorphisms have been detected in comparisons of several species of orthohepadnaviruses. As mentioned, complex SVs are composed of multiple SVs. On the contrary, SV polymorphisms are observed as insertions of different SVs. Up to a certain point, nucleotide substitutions cause genetic differences. However, at some point, the nucleotide sequences are split into several particular patterns. These SVs have been observed as polymorphic changes. Different species of orthohepadnaviruses possess SVs which are unique and specific to a certain host of the virus. Studies have shown that SVs play an important role in the HBV genome. Further studies are required to elucidate their virologic and clinical roles.
|
19
|
van den Akker J, Hon L, Ondov A, Mahkovec Z, O'Connor R, Chan RC, Lock J, Zimmer AD, Rostamianfar A, Ginsberg J, Leon A, Topper S. Intronic Breakpoint Signatures Enhance Detection and Characterization of Clinically Relevant Germline Structural Variants. J Mol Diagn 2021; 23:612-629. [PMID: 33621668 DOI: 10.1016/j.jmoldx.2021.01.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/21/2020] [Revised: 12/14/2020] [Accepted: 01/27/2021] [Indexed: 12/16/2022] Open
Abstract
The relevance of large copy number variants (CNVs) to hereditary disorders has been long recognized, and population sequencing efforts have chronicled many common structural variants (SVs). However, limited data are available on the clinical contribution of rare germline SVs. Here, a detailed characterization of SVs identified using targeted next-generation sequencing was performed. Across 50 genes associated with hereditary cancer and cardiovascular disorders, a minimum of 828 unique SVs were reported, including 584 fully characterized SVs. Almost 40% of CNVs were <5 kb, with one in three deletions impacting a single exon. Additionally, 36 mid-range deletions/duplications (50 to 250 bp), 21 mobile element insertions, 6 inversions, and 27 complex rearrangements were detected. This data set was used to model SV detection in a bioinformatics pipeline solely relying on read depth, which revealed that genome sequencing (30×) allows detection of 71%, a 500× panel only targeting coding regions 53%, and exome sequencing (100×) <20% of characterized SVs. SVs accounted for 14.1% of all unique pathogenic variants, supporting the importance of SVs in hereditary disorders. Robust SV detection requires an ensemble of variant-calling algorithms that utilize sequencing of intronic regions. These algorithms should use distinct data features representative of each class of mutational mechanism, including recombination between two sequences sharing high similarity, covariants inserted between CNV breakpoints, and complex rearrangements containing inverted sequences.
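The abstract refers to a pipeline that relies solely on read depth. The study's actual pipeline is not described here, so the following is only a generic sketch of depth-ratio calling: each target's normalized sample depth is compared with a reference depth, and the log2 ratio is thresholded. The function name, the dictionary inputs, and the thresholds (-0.6 and 0.4, chosen to sit between a ratio of 0 and the values expected for a heterozygous deletion, log2(0.5) = -1, and a single-copy duplication, log2(1.5) ≈ 0.58) are assumptions.

```python
import math
from typing import Dict


def call_cnv(sample_depth: Dict[str, float],
             reference_depth: Dict[str, float],
             del_threshold: float = -0.6,
             dup_threshold: float = 0.4) -> Dict[str, str]:
    """Classify each target (e.g. an exon) by log2(sample depth / reference depth).

    Depths are assumed to be already normalized for library size.
    """
    calls: Dict[str, str] = {}
    for target, ref in reference_depth.items():
        obs = sample_depth.get(target, 0.0)
        if ref <= 0:
            # No usable baseline for this target.
            calls[target] = "no_call"
        elif obs <= 0:
            # No coverage at all in the sample: treat as a deletion.
            calls[target] = "deletion"
        else:
            ratio = math.log2(obs / ref)
            if ratio <= del_threshold:
                calls[target] = "deletion"
            elif ratio >= dup_threshold:
                calls[target] = "duplication"
            else:
                calls[target] = "normal"
    return calls


if __name__ == "__main__":
    reference = {"exon1": 100.0, "exon2": 100.0, "exon3": 100.0}
    sample = {"exon1": 100.0, "exon2": 50.0, "exon3": 150.0}
    print(call_cnv(sample, reference))
```

A depth-only caller of this kind sees copy number changes but not copy-neutral events such as inversions, which is consistent with the abstract's point that robust SV detection needs an ensemble of algorithms using additional data features.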
|
20
|
Luce L, Abelleyro MM, Carcione M, Mazzanti C, Rossetti L, Radic P, Szijan I, Menazzi S, Francipane L, Nevado J, Lapunzina P, De Brasi C, Giliberto F. Analysis of complex structural variants in the DMD gene in one family. Neuromuscul Disord 2021; 31:253-263. [PMID: 33451931 DOI: 10.1016/j.nmd.2020.11.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/07/2020] [Revised: 11/26/2020] [Accepted: 11/30/2020] [Indexed: 11/24/2022]
Abstract
This work describes a family with Duchenne muscular dystrophy (DMD) with a rare case of a symptomatic pregnant woman. The main aim was to perform prenatal molecular diagnosis to provide genetic counseling. The secondary aim was to suggest the molecular mechanisms causing the complex structural variant (cxSV) identified. To accomplish this, we used a multi-technique algorithm including segregation analysis, Multiplex Ligation-dependent Probe Amplification, PCR, X-chromosome inactivation studies, microarrays, whole genome sequencing and bioinformatics. We identified a duplication of exons 38-43 in the DMD gene in all affected and obligate carrier members, proving that this was the DMD-causing mutation. We also observed a skewed X-chromosome inactivation in the symptomatic woman that explained her symptomatology. In addition, we identified a cxSV (duplication of exons 38-43 and deletion of exons 45-54) in the affected boy. The molecular characterization and bioinformatic analyses of the breakpoint junctions allowed us to identify Double Strand Breaks stimulator motifs and suggested the replication-dependent Fork Stalling and Template Switching as the most probable mechanisms leading to the duplication. In addition, the de novo deletion might have been the result of a germline inter-chromosome non-allelic recombination involving the Non-Homologous End Joining mechanism. In conclusion, the diagnostic strategy used allowed us to provide accurate molecular diagnosis and genetic counseling. In addition, the familial molecular diagnosis together with the in-depth characterization of the cxSV helped to determine the chronology of the molecular events, and propose and understand the molecular mechanisms involved in the generation of this complex rearrangement.
Affiliation(s)
- Leonela Luce
- Facultad de Farmacia y Bioquímica, Departamento de Microbiología, Inmunología, Biotecnología y Genética, Cátedra de Genética, Laboratorio de Distrofinopatías, Universidad de Buenos Aires, Buenos Aires, Argentina; Instituto de Inmunología, Genética y Metabolismo (INIGEM), CONICET-Universidad de Buenos Aires, Buenos Aires, Argentina
- Martín M Abelleyro
- CONICET-Academia Nacional de Medicina, Instituto de Medicina Experimental (IMEX), Buenos Aires, Argentina
- Micaela Carcione
- Facultad de Farmacia y Bioquímica, Departamento de Microbiología, Inmunología, Biotecnología y Genética, Cátedra de Genética, Laboratorio de Distrofinopatías, Universidad de Buenos Aires, Buenos Aires, Argentina; Instituto de Inmunología, Genética y Metabolismo (INIGEM), CONICET-Universidad de Buenos Aires, Buenos Aires, Argentina
- Chiara Mazzanti
- Facultad de Farmacia y Bioquímica, Departamento de Microbiología, Inmunología, Biotecnología y Genética, Cátedra de Genética, Laboratorio de Distrofinopatías, Universidad de Buenos Aires, Buenos Aires, Argentina; Instituto de Inmunología, Genética y Metabolismo (INIGEM), CONICET-Universidad de Buenos Aires, Buenos Aires, Argentina
- Liliana Rossetti
- CONICET-Academia Nacional de Medicina, Instituto de Medicina Experimental (IMEX), Buenos Aires, Argentina
- Pamela Radic
- CONICET-Academia Nacional de Medicina, Instituto de Medicina Experimental (IMEX), Buenos Aires, Argentina
- Irene Szijan
- Facultad de Farmacia y Bioquímica, Departamento de Microbiología, Inmunología, Biotecnología y Genética, Cátedra de Genética, Laboratorio de Distrofinopatías, Universidad de Buenos Aires, Buenos Aires, Argentina
- Sebastián Menazzi
- Hospital de Clínicas "José de San Martín", División de Genética, Universidad de Buenos Aires, Buenos Aires, Argentina
- Liliana Francipane
- Hospital de Clínicas "José de San Martín", División de Genética, Universidad de Buenos Aires, Buenos Aires, Argentina
- Julián Nevado
- Instituto de Genética Médica y Molecular (INGEMM)-IdiPAZ, Hospital Universitario La Paz, Universidad Autónoma, Madrid, Spain; Centro de Investigaciones Biomédicas en Red para Enfermedades Raras (CIBERER), Madrid, Spain; ITHACA-ERN (European Reference Network), La Paz University Hospital, Madrid, Spain
- Pablo Lapunzina
- Instituto de Genética Médica y Molecular (INGEMM)-IdiPAZ, Hospital Universitario La Paz, Universidad Autónoma, Madrid, Spain; Centro de Investigaciones Biomédicas en Red para Enfermedades Raras (CIBERER), Madrid, Spain; ITHACA-ERN (European Reference Network), La Paz University Hospital, Madrid, Spain
- Carlos De Brasi
- CONICET-Academia Nacional de Medicina, Instituto de Medicina Experimental (IMEX), Buenos Aires, Argentina
- Florencia Giliberto
- Facultad de Farmacia y Bioquímica, Departamento de Microbiología, Inmunología, Biotecnología y Genética, Cátedra de Genética, Laboratorio de Distrofinopatías, Universidad de Buenos Aires, Buenos Aires, Argentina; Instituto de Inmunología, Genética y Metabolismo (INIGEM), CONICET-Universidad de Buenos Aires, Buenos Aires, Argentina.
|
21
|
Shiny-SoSV: A web-based performance calculator for somatic structural variant detection. PLoS One 2020; 15:e0238108. [PMID: 32853264 PMCID: PMC7451576 DOI: 10.1371/journal.pone.0238108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2020] [Accepted: 08/10/2020] [Indexed: 11/19/2022]
Abstract
Somatic structural variants are an important contributor to cancer development and evolution. Accurate detection of these complex variants from whole genome sequencing data is influenced by a multitude of parameters. However, there are currently no tools for guiding study design nor are there applications that could predict the performance of somatic structural variant detection. To address this gap, we developed Shiny-SoSV, a user-friendly web-based calculator for determining the impact of common variables on the sensitivity, precision and F1 score of somatic structural variant detection, including choice of variant detection tool, sequencing depth of coverage, variant allele fraction, and variant breakpoint resolution. Using simulation studies, we determined singular and combinatoric effects of these variables, modelled the results using a generalised additive model, allowing structural variant detection performance to be predicted for any combination of predictors. Shiny-SoSV provides an interactive and visual platform for users to easily compare individual and combined impact of different parameters. It predicts the performance of a proposed study design, on somatic structural variant detection, prior to the commencement of benchwork. Shiny-SoSV is freely available at https://hcpcg.shinyapps.io/Shiny-SoSV with accompanying user’s guide and example use-cases.
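The three performance measures named in this abstract (sensitivity, precision and F1 score) can be illustrated with a minimal sketch. It assumes a caller's output has already been compared against a truth set and reduced to counts of true positives, false positives and false negatives; the function name and the example counts are hypothetical and are not taken from Shiny-SoSV, which predicts these measures from a fitted model rather than computing them from counts.

```python
def sv_detection_metrics(true_positives: int, false_positives: int, false_negatives: int) -> dict:
    """Return sensitivity, precision and F1 score for a set of SV calls.

    Hypothetical helper for illustration only; not part of Shiny-SoSV.
    """
    called = true_positives + false_positives   # everything the caller reported
    real = true_positives + false_negatives     # everything present in the truth set

    # Guard against empty call sets or empty truth sets.
    sensitivity = true_positives / real if real else 0.0
    precision = true_positives / called if called else 0.0

    # F1 is the harmonic mean of precision and sensitivity.
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "precision": precision, "f1": f1}


# Made-up counts: 80 true SVs found, 20 spurious calls, 20 true SVs missed.
metrics = sv_detection_metrics(true_positives=80, false_positives=20, false_negatives=20)
print(metrics)
```

With these made-up counts, all three measures come out at about 0.8. Because F1 is a harmonic mean, it drops sharply when either precision or sensitivity falls, which is why it is commonly reported alongside the two individual measures.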
|
22
|
Hayes M, Mullins D, Nguyen A. Complex Variant Discovery Using Discordant Cluster Normalization. J Comput Biol 2020; 28:185-194. [PMID: 32783649 DOI: 10.1089/cmb.2020.0249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Complex genomic structural variants (CGSVs) are abnormalities that present with three or more breakpoints, making their discovery a challenge. The majority of existing algorithms for structural variant detection are only designed to find simple structural variants (SSVs) such as deletions and inversions; they fail to find more complex events such as deletion-inversions or deletion-duplications, for example. In this study, we present an algorithm named CleanBreak that employs a clique partitioning graph-based strategy to identify collections of SSV clusters and then subsequently identifies overlapping SSV clusters to examine the search space of possible CGSVs, choosing the one that is most concordant with local read depth. We evaluated CleanBreak's performance on whole genome simulated data and a real data set from the 1000 Genomes Project. We also compared CleanBreak with another algorithm for CGSV discovery. The results demonstrate CleanBreak's utility as an effective method to discover CGSVs.
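The idea of looking for overlapping simple SV calls as candidate complex events can be sketched as follows. This is only an illustration of the overlap step under simplifying assumptions: it is not the CleanBreak algorithm, it omits the clique-partitioning and read-depth scoring described in the abstract, and the `SimpleSV` type, the function name and all coordinates are hypothetical.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class SimpleSV(NamedTuple):
    """One simple SV call (hypothetical record layout)."""
    chrom: str
    start: int
    end: int
    kind: str  # e.g. "DEL", "INV", "DUP"


def overlapping_pairs(calls: List[SimpleSV]) -> List[Tuple[SimpleSV, SimpleSV]]:
    """Return pairs of calls on the same chromosome whose intervals overlap."""
    pairs = []
    for a, b in combinations(calls, 2):
        same_chrom = a.chrom == b.chrom
        # Two closed intervals overlap when each starts before the other ends.
        overlaps = a.start <= b.end and b.start <= a.end
        if same_chrom and overlaps:
            pairs.append((a, b))
    return pairs


# Made-up calls: a deletion and an inversion that overlap on chr1,
# plus an unrelated duplication on chr2.
calls = [
    SimpleSV("chr1", 1000, 5000, "DEL"),
    SimpleSV("chr1", 4500, 9000, "INV"),
    SimpleSV("chr2", 1000, 5000, "DUP"),
]
candidates = overlapping_pairs(calls)
for a, b in candidates:
    print(f"{a.chrom}: {a.kind} {a.start}-{a.end} overlaps {b.kind} {b.start}-{b.end}")
```

The pairwise scan is quadratic in the number of calls, which is acceptable for a sketch; a sorted sweep or an interval tree would be the usual choice at whole-genome scale.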
Affiliation(s)
- Matthew Hayes
- Department of Physics and Computer Science, Xavier University of Louisiana, New Orleans, Louisiana, USA
- Derrick Mullins
- Department of Physics and Computer Science, Xavier University of Louisiana, New Orleans, Louisiana, USA
- Angela Nguyen
- Department of Biology, Xavier University of Louisiana, New Orleans, Louisiana, USA
|
23
|
Abel HJ, Larson DE, Regier AA, Chiang C, Das I, Kanchi KL, Layer RM, Neale BM, Salerno WJ, Reeves C, Buyske S, Matise TC, Muzny DM, Zody MC, Lander ES, Dutcher SK, Stitziel NO, Hall IM. Mapping and characterization of structural variation in 17,795 human genomes. Nature 2020; 583:83-89. [PMID: 32460305 PMCID: PMC7547914 DOI: 10.1038/s41586-020-2371-0] [Citation(s) in RCA: 146] [Impact Index Per Article: 36.5] [Received: 12/29/2018] [Accepted: 05/18/2020] [Indexed: 12/18/2022]
Abstract
A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.
Affiliation(s)
- Haley J Abel
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- David E Larson
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- Allison A Regier
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Medicine, Washington University School of Medicine, St Louis, MO, USA
- Colby Chiang
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Indraniel Das
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Krishna L Kanchi
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Ryan M Layer
- BioFrontiers Institute, University of Colorado, Boulder, CO, USA
- Department of Computer Science, University of Colorado, Boulder, CO, USA
- Benjamin M Neale
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- William J Salerno
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Steven Buyske
- Department of Statistics, Rutgers University, Piscataway, NJ, USA
- Tara C Matise
- Department of Genetics, Rutgers University, Piscataway, NJ, USA
- Donna M Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Eric S Lander
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Susan K Dutcher
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- Nathan O Stitziel
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- Department of Medicine, Washington University School of Medicine, St Louis, MO, USA
- Ira M Hall
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA.
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA.
- Department of Medicine, Washington University School of Medicine, St Louis, MO, USA.
|
25
|
Yuan X, Gao M, Bai J, Duan J. SVSR: A Program to Simulate Structural Variations and Generate Sequencing Reads for Multiple Platforms. IEEE/ACM Trans Comput Biol Bioinform 2020; 17:1082-1091. [PMID: 30334804 DOI: 10.1109/tcbb.2018.2876527] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 06/08/2023]
Abstract
Structural variation accounts for a major fraction of mutations in the human genome and confers susceptibility to complex diseases. Next generation sequencing along with the rapid development of computational methods provides a cost-effective procedure to detect such variations. Simulation of structural variations and sequencing reads with real characteristics is essential for benchmarking the computational methods. Here, we develop a new program, SVSR, to simulate five types of structural variations (indels, tandem duplication, CNVs, inversions, and translocations) and SNPs for the human genome and to generate sequencing reads with features from popular platforms (Illumina, SOLiD, 454, and Ion Torrent). We adopt a selection model trained from real data to predict copy number states, starting from the first site of a particular genome to the end. Furthermore, we utilize references of microbial genomes to produce insertion fragments and design probabilistic models to imitate inversions and translocations. Moreover, we create platform-specific errors and base quality profiles to generate normal, tumor, or normal-tumor mixture reads. Experimental results show that SVSR could capture more features that are realistic and generate datasets with satisfactory quality scores. SVSR is able to evaluate the performance of structural variation detection methods and guide the development of new computational methods.
|
26
|
Shin W, Kim H, Oh DY, Kim DH, Han K. Quantitative evaluation of the molecular marker using droplet digital PCR. Genomics Inform 2020; 18:e4. [PMID: 32224837 PMCID: PMC7120350 DOI: 10.5808/gi.2020.18.1.e4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2019] [Accepted: 12/05/2019] [Indexed: 11/24/2022]
Abstract
Transposable elements (TEs) constitute approximately half of the bovine genome. They can be powerful species-specific markers without regression mutations, arising from structural variation (SV) during genome evolution. In a previous study, we identified a Hanwoo-specific SV generated by a TE-associated deletion event, using conventional PCR and Sanger sequencing validation. It could be used as a molecular marker to distinguish cattle breeds (i.e., Hanwoo vs. Holstein). However, conventional PCR yields variable final copy quantifications across samples. We therefore applied the droplet digital PCR (ddPCR) platform for accurate quantitative detection of the Hanwoo-specific SV. Although the samples show low allele frequency variation within the Hanwoo population, ddPCR achieved highly sensitive detection with absolute quantification. We aimed to use ddPCR for more accurate quantification than PCR. We suggest that the ddPCR platform is applicable to the quantitative evaluation of molecular markers.
Affiliation(s)
- Wonseok Shin
- Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea; Center for Bio-Medical Engineering Core Facility, Dankook University, Cheonan 31116, Korea
- Haneul Kim
- Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea; Center for Bio-Medical Engineering Core Facility, Dankook University, Cheonan 31116, Korea
- Dong-Yep Oh
- Livestock Research Institute, Yeongju 36052, Korea
- Dong Hee Kim
- Department of Anesthesiology and Pain Management, Dankook University College of Medicine, Cheonan 31116, Korea
- Kyudong Han
- Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea; Center for Bio-Medical Engineering Core Facility, Dankook University, Cheonan 31116, Korea
|
27
|
Li N, Yang J, Zhu W, Liang Y. MVSC: A Multi-variation Simulator of Cancer Genome. Comb Chem High Throughput Screen 2020; 23:326-333. [PMID: 32183666 DOI: 10.2174/1386207323666200317121136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/17/2019] [Revised: 11/29/2019] [Accepted: 02/27/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Many forms of variations exist in the genome, which are the main causes of individual phenotypic differences. The detection of variants, especially those located in the tumor genome, still faces many challenges due to the complexity of the genome structure. Thus, the performance assessment of variation detection tools using next-generation sequencing platforms is urgently needed. METHOD: We have created a software package called the Multi-Variation Simulator of Cancer genomes (MVSC) to simulate common genomic variants, including single nucleotide polymorphisms, small insertion and deletion polymorphisms, and structural variations (SVs), which are analogous to human somatically acquired variations. Three sets of variations embedded in genomic sequences in different periods were dynamically and sequentially simulated one by one. RESULTS: In cancer genome simulation, complex SVs are important because this type of variation is characteristic of the tumor genome structure. Overlapping variations of different sizes can also coexist in the same genome regions, adding to the complexity of cancer genome architecture. Our results show that MVSC can efficiently simulate a variety of genomic variants that cannot be simulated by existing software packages. CONCLUSION: The MVSC-simulated variants can be used to assess the performance of existing tools designed to detect SVs in next-generation sequencing data, and we also find that MVSC is memory and time-efficient compared with similar software packages.
Affiliation(s)
- Ning Li
- School of Information and Electronic Engineering, Wuzhou University, Wuzhou, China
- Jialiang Yang
- Department of Mathematics and Statistics, Hainan Normal University, Haikou, Hainan 571158, China
- Wen Zhu
- Department of Mathematics and Statistics, Hainan Normal University, Haikou, Hainan 571158, China; College of Computer Science and Electronic Engineering, Hunan University, Hunan, China
- Ying Liang
- College of Computer Science and Electronic Engineering, Hunan University, Hunan, China; College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 330000, China
|
28
|
Balachandran P, Beck CR. Structural variant identification and characterization. Chromosome Res 2020; 28:31-47. [PMID: 31907725 PMCID: PMC7131885 DOI: 10.1007/s10577-019-09623-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/15/2019] [Revised: 10/15/2019] [Accepted: 11/24/2019] [Indexed: 01/06/2023]
Abstract
Structural variant (SV) differences between human genomes can cause germline and mosaic disease as well as inter-individual variation. De-regulation of accurate DNA repair and genomic surveillance mechanisms results in a large number of SVs in cancer. Analysis of the DNA sequences at SV breakpoints can help identify pathways of mutagenesis and regions of the genome that are more susceptible to rearrangement. Large-scale SV analyses have been enabled by high-throughput genome-level sequencing on humans in the past decade. These studies have shed light on the mechanisms and prevalence of complex genomic rearrangements. Recent advancements in both sequencing and other mapping technologies as well as calling algorithms for detection of genomic rearrangements have helped propel SV detection into population-scale studies, and have begun to elucidate previously inaccessible regions of the genome. Here, we discuss the genomic organization of simple and complex SVs, the molecular mechanisms of their formation, and various ways to detect them. We also introduce methods for characterizing SVs and their consequences on human genomes.
Affiliation(s)
- Christine R Beck
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, University of Connecticut Health Center, Farmington, CT, 06030, USA.
|
29
|
Abstract
Identifying structural variation (SV) is essential for genome interpretation but has been historically difficult due to limitations inherent to available genome technologies. Detection methods that use ensemble algorithms and emerging sequencing technologies have enabled the discovery of thousands of SVs, uncovering information about their ubiquity, relationship to disease and possible effects on biological mechanisms. Given the variability in SV type and size, along with unique detection biases of emerging genomic platforms, multiplatform discovery is necessary to resolve the full spectrum of variation. Here, we review modern approaches for investigating SVs and proffer that, moving forwards, studies integrating biological information with detection will be necessary to comprehensively understand the impact of SV in the human genome.
Affiliation(s)
- Steve S Ho
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Alexander E Urban
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Ryan E Mills
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA.
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
|
30
|
Jenko Bizjan B, Katsila T, Tesovnik T, Šket R, Debeljak M, Matsoukas MT, Kovač J. Challenges in identifying large germline structural variants for clinical use by long read sequencing. Comput Struct Biotechnol J 2019; 18:83-92. [PMID: 32099591 PMCID: PMC7026727 DOI: 10.1016/j.csbj.2019.11.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 09/01/2019] [Revised: 11/07/2019] [Accepted: 11/21/2019] [Indexed: 12/30/2022]
Abstract
Genomic structural variations, previously considered rare events, are now widely recognized as a major source of inter-individual variability and hence a major hurdle in optimum patient stratification and disease management. Herein, we focus on large complex germline structural variations and present challenges towards targeted treatment via the synergy of state-of-the-art approaches and information technology tools. Detection of complex structural variation remains challenging, as there is no gold standard for identifying such genomic variations with long reads, especially when the chromosomal rearrangement in question is a few Mb in length. A clinical case with a large complex chromosomal rearrangement serves as a paradigm. We feel that functional validation and data interpretation are of utmost importance for information growth to be translated into knowledge growth; hence, new working practices are highlighted.
Affiliation(s)
- Barbara Jenko Bizjan
- Clinical Institute of Special Laboratory Diagnostics, University Children’s Hospital, UMC, Ljubljana, Slovenia
- Theodora Katsila
- Institute of Chemical Biology, National Hellenic Research Centre, Athens, Greece
- Tine Tesovnik
- Clinical Institute of Special Laboratory Diagnostics, University Children’s Hospital, UMC, Ljubljana, Slovenia
- Robert Šket
- Clinical Institute of Special Laboratory Diagnostics, University Children’s Hospital, UMC, Ljubljana, Slovenia
- Maruša Debeljak
- Clinical Institute of Special Laboratory Diagnostics, University Children’s Hospital, UMC, Ljubljana, Slovenia
- Jernej Kovač
- Clinical Institute of Special Laboratory Diagnostics, University Children’s Hospital, UMC, Ljubljana, Slovenia
|
31
|
Midha MK, Wu M, Chiu KP. Long-read sequencing in deciphering human genetics to a greater depth. Hum Genet 2019; 138:1201-1215. [PMID: 31538236 DOI: 10.1007/s00439-019-02064-y] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 05/22/2019] [Accepted: 09/13/2019] [Indexed: 12/12/2022]
Abstract
Through four decades of development, DNA sequencing has inched into the era of single-molecule sequencing (SMS), or third-generation sequencing (TGS), as represented by two distinct technical approaches developed independently by Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT). Historically, each generation of sequencing technologies was marked by innovative technological achievements and novel applications. Long reads (LRs) are considered the most advantageous feature of SMS, shared by both PacBio and ONT, distinguishing SMS from next-generation sequencing (NGS, or second-generation sequencing) and Sanger sequencing (first-generation sequencing). Long reads overcome the limitations of NGS and drastically improve the quality of genome assembly. In addition, ONT contributes several unique features, including ultra-long reads (ULRs) with read lengths above 300 kb and some close to 1 million bp, direct RNA sequencing, and superior portability made possible by the pocket-sized MinION sequencer. Here, we review the history of DNA sequencing technologies and associated applications, with a special focus on the advantages as well as the limitations of ULR sequencing in genome assembly.
Affiliation(s)
- Mohit K Midha
- Genomics Research Center, Academia Sinica, 128 Academia Road, Sec. 2, Nankang District, Taipei, 115, Taiwan; Institute of Biochemistry and Molecular Biology, National Yang-Ming University, Taipei, Taiwan
- Mengchu Wu
- Health GeneTech, 22F No. 99, Xin Pu 6th St., Taoyuan, Taiwan
- Kuo-Ping Chiu
- Genomics Research Center, Academia Sinica, 128 Academia Road, Sec. 2, Nankang District, Taipei, 115, Taiwan; Institute of Biochemistry and Molecular Biology, National Yang-Ming University, Taipei, Taiwan; Department of Life Sciences, College of Life Sciences, National Taiwan University, Taipei, Taiwan.
|
32
|
Fujiwara K, Matsuura K, Matsunami K, Iio E, Nagura Y, Nojiri S, Kataoka H. Novel Genetic Rearrangements Termed "Structural Variation Polymorphisms" Contribute to the Genetic Diversity of Orthohepadnaviruses. Viruses 2019; 11:v11090871. [PMID: 31533314 PMCID: PMC6783994 DOI: 10.3390/v11090871] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/22/2019] [Revised: 09/08/2019] [Accepted: 09/17/2019] [Indexed: 12/27/2022]
Abstract
The genetic diversity of orthohepadnaviruses is not yet fully understood. This study was conducted to investigate the role of structural variations (SVs) in their diversity. Genetic sequences of orthohepadnaviruses were retrieved from databases. The positions of sequence gaps were investigated, since they were found to be related to SVs, and they were further used to search for SVs. Then, a combination of pair-wise and multiple alignment analyses was performed to analyze the genomic structure. Unique patterns of SVs were observed; genetic sequences at certain genomic positions could be separated into multiple patterns, such as no SV, SV pattern 1, SV pattern 2, and SV pattern 3, which were observed as polymorphic changes. We provisionally referred to these genetic changes as SV polymorphisms. Our data showed a higher frequency of sequence gaps and lower genetic identity in the pre-S1-S2 region of various types of HBVs. Detailed examination of the genetic structure in the pre-S region by a combination of pair-wise and multiple alignment analyses showed that the genetic diversity of orthohepadnaviruses in the pre-S1 region could also have been induced by SV polymorphisms. Our data showed that novel genetic rearrangements, provisionally termed SV polymorphisms, were observed in various orthohepadnaviruses.
Affiliation(s)
- Kei Fujiwara
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan.
- Kentaro Matsuura
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan.
- Kayoko Matsunami
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan.
- Etsuko Iio
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan.
- Yoshihito Nagura
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan.
- Shunsuke Nojiri
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan.
- Hiromi Kataoka
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan.
|
33
|
Lal A, Ramazzotti D, Weng Z, Liu K, Ford JM, Sidow A. Comprehensive genomic characterization of breast tumors with BRCA1 and BRCA2 mutations. BMC Med Genomics 2019; 12:84. [PMID: 31182087 PMCID: PMC6558765 DOI: 10.1186/s12920-019-0545-0] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 10/19/2018] [Accepted: 05/31/2019] [Indexed: 12/16/2022]
Abstract
BACKGROUND Germline mutations in the BRCA1 and BRCA2 genes predispose carriers to breast and ovarian cancer, and there remains a need to identify the specific genomic mechanisms by which cancer evolves in these patients. Here we present a systematic genomic analysis of breast tumors with BRCA1 and BRCA2 mutations. METHODS We analyzed genomic data from breast tumors, with a focus on comparing tumors with BRCA1/BRCA2 gene mutations with common classes of sporadic breast tumors. RESULTS We identify differences between BRCA-mutated and sporadic breast tumors in patterns of point mutation, DNA methylation and structural variation. We show that structural variation disproportionately affects tumor suppressor genes and identify specific driver gene candidates that are enriched for structural variation. CONCLUSIONS Compared to sporadic tumors, BRCA-mutated breast tumors show signals of reduced DNA methylation, more ancestral cell divisions, and elevated rates of structural variation that tend to disrupt highly expressed protein-coding genes and known tumor suppressors. Our analysis suggests that BRCA-mutated tumors are more aggressive than sporadic breast cancers because loss of the BRCA pathway causes multiple processes of mutagenesis and gene dysregulation.
Affiliation(s)
- Avantika Lal
  - Department of Pathology, Stanford University, Stanford, CA 94305 USA
  - Present address: NVIDIA Corporation, 2788 San Tomas Expy, Santa Clara, CA 95051 USA
- Daniele Ramazzotti
  - Department of Pathology, Stanford University, Stanford, CA 94305 USA
  - Department of Computer Science, Stanford University, Stanford, CA 94305 USA
- Ziming Weng
  - Department of Pathology, Stanford University, Stanford, CA 94305 USA
- Keli Liu
  - Department of Statistics, Stanford University, Stanford, CA 94305 USA
- James M. Ford
  - Department of Medicine, Stanford University, Stanford, CA 94305 USA
  - Department of Genetics, Stanford University, Stanford, CA 94305 USA
- Arend Sidow
  - Department of Pathology, Stanford University, Stanford, CA 94305 USA
  - Department of Genetics, Stanford University, Stanford, CA 94305 USA
|
34
|
Lappalainen T, Scott AJ, Brandt M, Hall IM. Genomic Analysis in the Age of Human Genome Sequencing. Cell 2019; 177:70-84. [PMID: 30901550 PMCID: PMC6532068 DOI: 10.1016/j.cell.2019.02.032] [Citation(s) in RCA: 158] [Impact Index Per Article: 31.6] [Received: 01/20/2019] [Revised: 02/19/2019] [Accepted: 02/19/2019] [Indexed: 02/08/2023]
Abstract
Affordable genome sequencing technologies promise to revolutionize the field of human genetics by enabling comprehensive studies that interrogate all classes of genome variation, genome-wide, across the entire allele frequency spectrum. Ongoing projects worldwide are sequencing many thousands, and soon millions, of human genomes as part of various gene mapping studies, biobanking efforts, and clinical programs. However, while genome sequencing data production has become routine, genome analysis and interpretation remain challenging endeavors with many limitations and caveats. Here, we review the current state of technologies for genetic variant discovery, genotyping, and functional interpretation and discuss the prospects for future advances. We focus on germline variants discovered by whole-genome sequencing, genome-wide functional genomic approaches for predicting and measuring variant functional effects, and implications for studies of common and rare human disease.
Affiliation(s)
- Tuuli Lappalainen
  - New York Genome Center, New York, NY, USA; Department of Systems Biology, Columbia University, New York, NY, USA.
- Alexandra J Scott
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA; Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA; Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
- Margot Brandt
  - New York Genome Center, New York, NY, USA; Department of Systems Biology, Columbia University, New York, NY, USA
- Ira M Hall
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA; Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA; Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA.
|
35
|
Zhou B, Ho SS, Greer SU, Zhu X, Bell JM, Arthur JG, Spies N, Zhang X, Byeon S, Pattni R, Ben-Efraim N, Haney MS, Haraksingh RR, Song G, Ji HP, Perrin D, Wong WH, Abyzov A, Urban AE. Comprehensive, integrated, and phased whole-genome analysis of the primary ENCODE cell line K562. Genome Res 2019; 29:472-484. [PMID: 30737237 PMCID: PMC6396411 DOI: 10.1101/gr.234948.118] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 01/22/2018] [Accepted: 12/28/2018] [Indexed: 11/24/2022]
Abstract
K562 is widely used in biomedical research. It is one of three tier-one cell lines of ENCODE and also most commonly used for large-scale CRISPR/Cas9 screens. Although its functional genomic and epigenomic characteristics have been extensively studied, its genome sequence and genomic structural features have never been comprehensively analyzed. Such information is essential for the correct interpretation and understanding of the vast troves of existing functional genomics and epigenomics data for K562. We performed and integrated deep-coverage whole-genome (short-insert), mate-pair, and linked-read sequencing as well as karyotyping and array CGH analysis to identify a wide spectrum of genome characteristics in K562: copy numbers (CN) of aneuploid chromosome segments at high-resolution, SNVs and indels (both corrected for CN in aneuploid regions), loss of heterozygosity, megabase-scale phased haplotypes often spanning entire chromosome arms, structural variants (SVs), including small and large-scale complex SVs and nonreference retrotransposon insertions. Many SVs were phased, assembled, and experimentally validated. We identified multiple allele-specific deletions and duplications within the tumor suppressor gene FHIT. Taking aneuploidy into account, we reanalyzed K562 RNA-seq and whole-genome bisulfite sequencing data for allele-specific expression and allele-specific DNA methylation. We also show examples of how deeper insights into regulatory complexity are gained by integrating genomic variant information and structural context with functional genomics and epigenomics data. Furthermore, using K562 haplotype information, we produced an allele-specific CRISPR targeting map. This comprehensive whole-genome analysis serves as a resource for future studies that utilize K562 as well as a framework for the analysis of other cancer genomes.
Affiliation(s)
- Bo Zhou
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Steve S Ho
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Stephanie U Greer
  - Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, California 94305, USA
- Xiaowei Zhu
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- John M Bell
  - Stanford Genome Technology Center, Stanford University, Palo Alto, California 94304, USA
- Joseph G Arthur
  - Department of Statistics, Stanford University, Stanford, California 94305, USA
- Noah Spies
  - Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Pathology, Stanford University School of Medicine, Stanford, California 94305, USA; Genome-Scale Measurements Group, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, USA
- Xianglong Zhang
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Seunggyu Byeon
  - School of Computer Science and Engineering, College of Engineering, Pusan National University, Busan 46241, South Korea
- Reenal Pattni
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Noa Ben-Efraim
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Michael S Haney
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Rajini R Haraksingh
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Giltae Song
  - School of Computer Science and Engineering, College of Engineering, Pusan National University, Busan 46241, South Korea
- Hanlee P Ji
  - Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, California 94305, USA; Stanford Genome Technology Center, Stanford University, Palo Alto, California 94304, USA
- Dimitri Perrin
  - Science and Engineering Faculty, Queensland University of Technology, Brisbane, QLD 4001, Australia
- Wing H Wong
  - Department of Statistics, Stanford University, Stanford, California 94305, USA; Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, California 94305, USA
- Alexej Abyzov
  - Department of Health Sciences Research, Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota 55905, USA
- Alexander E Urban
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA; Tashia and John Morgridge Faculty Scholar, Stanford Child Health Research Institute, Stanford, California 94305, USA
|
36
|
Palumbo E, Russo A. Common fragile site instability in normal cells: Lessons and perspectives. Genes Chromosomes Cancer 2018; 58:260-269. [PMID: 30387295 DOI: 10.1002/gcc.22705] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/07/2018] [Revised: 09/25/2018] [Accepted: 10/01/2018] [Indexed: 12/26/2022] Open
Abstract
Mechanisms and events related to common fragile site (CFS) instability are well known in cancer cells. Here, we argue that normal cells remain an important experimental model to address questions related to CFS instability in the absence of alterations in cell cycle and DNA damage repair pathways, which are common features acquired in cancer. Furthermore, a major gap of knowledge concerns the stability of CFSs during gametogenesis. CFS instability in meiotic or postmeiotic stages of the germ cell line could generate chromosome deletions or large rearrangements. This in turn can lead to the functional loss of the several CFS-associated genes with tumor suppressor function. Our hypothesis is that such mutations can potentially result in genetic predisposition to develop cancer. Indirect evidence for CFS instability in human germ cells has been provided by genomic investigations in family pedigrees associated with genetic disease. The issue of CFS instability in the germ cell line should represent one of the future efforts, and may take advantage of the existence of sequence and functional conservation of CFSs between rodents and humans.
Affiliation(s)
- Elisa Palumbo
  - Department of Molecular Medicine, University of Padova, Padova, Italy
- Antonella Russo
  - Department of Molecular Medicine, University of Padova, Padova, Italy
|
37
|
Complex structural variants in Mendelian disorders: identification and breakpoint resolution using short- and long-read genome sequencing. Genome Med 2018; 10:95. [PMID: 30526634 PMCID: PMC6286558 DOI: 10.1186/s13073-018-0606-6] [Citation(s) in RCA: 102] [Impact Index Per Article: 17.0] [Received: 06/15/2018] [Accepted: 11/23/2018] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Studies have shown that complex structural variants (cxSVs) contribute to human genomic variation and can cause Mendelian disease. We aimed to identify cxSVs relevant to Mendelian disease using short-read whole-genome sequencing (WGS), resolve the precise variant configuration and investigate possible mechanisms of cxSV formation. METHODS We performed short-read WGS and analysis of breakpoint junctions to identify cxSVs in a cohort of 1324 undiagnosed rare disease patients. Long-read WGS and gene expression analysis were used to resolve one case. RESULTS We identified three pathogenic cxSVs: a de novo duplication-inversion-inversion-deletion affecting ARID1B, a de novo deletion-inversion-duplication affecting HNRNPU and a homozygous deletion-inversion-deletion affecting CEP78. Additionally, a de novo duplication-inversion-duplication overlapping CDKL5 was resolved by long-read WGS demonstrating the presence of both a disrupted and an intact copy of CDKL5 on the same allele, and gene expression analysis showed both parental alleles of CDKL5 were expressed. Breakpoint analysis in all the cxSVs revealed both microhomology and longer repetitive elements. CONCLUSIONS Our results corroborate that cxSVs cause Mendelian disease, and we recommend their consideration during clinical investigations. We show that resolution of breakpoints can be critical to interpret pathogenicity and present evidence of replication-based mechanisms in cxSV formation.
|
38
|
Fujiwara K, Matsuura K, Matsunami K, Iio E, Nojiri S. Characterization of hepatitis B virus with complex structural variations. BMC Microbiol 2018; 18:202. [PMID: 30509169 PMCID: PMC6276219 DOI: 10.1186/s12866-018-1350-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/30/2018] [Accepted: 11/20/2018] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Hepatitis B virus (HBV) infection is one of the most serious public health issues. Recent HBV genetic research has revealed novel genetic rearrangements termed complex structural variations (SVs), which are composed of combinations of SVs such as insertions, deletions, and duplications. An extensive search was made for complex SVs of HBV and their characteristics were analyzed. RESULTS Fifty-five HBV strains with complex SVs were identified by analyzing genetic sequences of HBV with bioinformatical tools. Along with 15 HBV strains with complex SVs in a previous report, a total of 70 HBV strains harboring complex SVs were analyzed. Complex SVs in the HBV genome were located frequently between nt 1500 and 2000. Insertions were observed in 65/70 (92.9%) of HBV strains with complex SVs. As insertional motif sequences, hepatocyte nuclear factor 1 binding site, a sequence complementary to part of box α in enhancer II, and insertions of unknown origins were observed. The complex SVs were classified into six groups, and combination of insertion and deletion was observed more frequently than other patterns. CONCLUSION Through an extensive search of HBV sequences, new strains with complex SVs were identified in this study. Characteristics of HBV with complex SVs were clarified by the analysis of 70 HBV strains harboring complex SVs. Further investigation is required to elucidate its role in pathogenesis of HBV-related liver disease.
Affiliation(s)
- Kei Fujiwara
  - Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Nagoya, Aichi, 467-8601, Japan.
- Kentaro Matsuura
  - Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Nagoya, Aichi, 467-8601, Japan
- Kayoko Matsunami
  - Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Nagoya, Aichi, 467-8601, Japan
- Etsuko Iio
  - Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Nagoya, Aichi, 467-8601, Japan
- Shunsuke Nojiri
  - Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Nagoya, Aichi, 467-8601, Japan
|
39
|
Genome-wide reconstruction of complex structural variants using read clouds. Nat Methods 2017; 14:915-920. [PMID: 28714986 PMCID: PMC5578891 DOI: 10.1038/nmeth.4366] [Citation(s) in RCA: 80] [Impact Index Per Article: 11.4] [Received: 11/07/2016] [Accepted: 06/15/2017] [Indexed: 12/16/2022]
Abstract
In read cloud approaches, microfluidic partitioning of long genomic DNA fragments and barcoding of shorter fragments derived from these fragments retains long-range information in short sequencing reads. This combination of short reads with long-range information represents a powerful alternative to single-molecule long-read sequencing. We develop Genome-wide Reconstruction of Complex Structural Variants (GROC-SVs) for SV detection and assembly from read cloud data and apply this method to Illumina-sequenced 10x Genomics sarcoma and breast cancer data sets. Compared with short-fragment sequencing, GROC-SVs substantially improves the specificity of breakpoint detection at comparable sensitivity. This approach also performs sequence assembly across multiple breakpoints simultaneously, enabling the reconstruction of events exhibiting remarkable complexity. We show that chromothriptic rearrangements occurred before copy number amplifications, and that rates of single-nucleotide variants and SVs are not correlated. Our results support the use of read cloud approaches to advance the characterization of large and complex structural variation.
|
40
|
vonHoldt BM, Shuldiner E, Koch IJ, Kartzinel RY, Hogan A, Brubaker L, Wanser S, Stahler D, Wynne CDL, Ostrander EA, Sinsheimer JS, Udell MAR. Structural variants in genes associated with human Williams-Beuren syndrome underlie stereotypical hypersociability in domestic dogs. Sci Adv 2017; 3:e1700398. [PMID: 28776031 PMCID: PMC5517105 DOI: 10.1126/sciadv.1700398] [Citation(s) in RCA: 94] [Impact Index Per Article: 13.4] [Received: 02/07/2017] [Accepted: 06/15/2017] [Indexed: 05/04/2023]
Abstract
Although considerable progress has been made in understanding the genetic basis of morphologic traits (for example, body size and coat color) in dogs and wolves, the genetic basis of their behavioral divergence is poorly understood. An integrative approach using both behavioral and genetic data is required to understand the molecular underpinnings of the various behavioral characteristics associated with domestication. We analyze a 5-Mb genomic region on chromosome 6 previously found to be under positive selection in domestic dog breeds. Deletion of this region in humans is linked to Williams-Beuren syndrome (WBS), a multisystem congenital disorder characterized by hypersocial behavior. We associate quantitative data on behavioral phenotypes symptomatic of WBS in humans with structural changes in the WBS locus in dogs. We find that hypersociability, a central feature of WBS, is also a core element of domestication that distinguishes dogs from wolves. We provide evidence that structural variants in GTF2I and GTF2IRD1, genes previously implicated in the behavioral phenotype of patients with WBS and contained within the WBS locus, contribute to extreme sociability in dogs. This finding suggests that there are commonalities in the genetic architecture of WBS and canine tameness and that directional selection may have targeted a unique set of linked behavioral genes of large phenotypic effect, allowing for rapid behavioral divergence of dogs and wolves, facilitating coexistence with humans.
Affiliation(s)
- Bridgett M. vonHoldt
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
  - Corresponding author.
- Emily Shuldiner
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
  - Translational Genetics and Genomics Unit, National Institute of Arthritis and Musculoskeletal and Skin Disorders, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, MD 20892, USA
- Ilana Janowitz Koch
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
- Rebecca Y. Kartzinel
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
- Andrew Hogan
  - Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Lauren Brubaker
  - Department of Animal and Rangeland Sciences, Oregon State University, OR 97331, USA
- Shelby Wanser
  - Department of Animal and Rangeland Sciences, Oregon State University, OR 97331, USA
- Daniel Stahler
  - Yellowstone Center for Resources, National Park Service, Yellowstone National Park, WY 82190, USA
- Clive D. L. Wynne
  - Department of Psychology, Arizona State University, Tempe, AZ 85287, USA
- Elaine A. Ostrander
  - Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Janet S. Sinsheimer
  - Departments of Human Genetics and Biomathematics, David Geffen School of Medicine at University of California, Los Angeles, Los Angeles, CA 90095, USA
- Monique A. R. Udell
  - Department of Animal and Rangeland Sciences, Oregon State University, OR 97331, USA
|
41
|
Abstract
BACKGROUND Genome rearrangements are critical oncogenic driver events in many malignancies. However, the identification and resolution of the structure of cancer genomic rearrangements remain challenging even with whole genome sequencing. METHODS To identify oncogenic genomic rearrangements and resolve their structure, we analyzed linked read sequencing. This approach relies on a microfluidic droplet technology to produce libraries derived from single, high molecular weight DNA molecules, 50 kb in size or greater. After sequencing, the barcoded sequence reads provide long range genomic information, identify individual high molecular weight DNA molecules, determine the haplotype context of genetic variants that occur across contiguous megabase-length segments of the genome and delineate the structure of complex rearrangements. We applied linked read sequencing of whole genomes to the analysis of a set of synchronous metastatic diffuse gastric cancers that occurred in the same individual. RESULTS When comparing metastatic sites, our analysis implicated a complex somatic rearrangement that was present in the metastatic tumor. The oncogenic event associated with the identified complex rearrangement resulted in an amplification of the known cancer driver gene FGFR2. With further investigation using these linked read data, the FGFR2 copy number alteration was determined to be a deletion-inversion motif that underwent tandem duplication, with unique breakpoints in each metastasis. Using a three-dimensional organoid tissue model, we functionally validated the metastatic potential of an FGFR2 amplification in gastric cancer. CONCLUSIONS Our study demonstrates that linked read sequencing is useful in characterizing oncogenic rearrangements in cancer metastasis. ELECTRONIC SUPPLEMENTARY MATERIAL The online version of this article (doi:10.1186/s13073-017-0447-8) contains supplementary material, which is available to authorized users.
|
42
|
Fujiwara K, Matsunami K, Iio E, Nojiri S, Joh T. Novel non-canonical genetic rearrangements termed "complex structural variations" in HBV genome. Virus Res 2017. [PMID: 28627394 DOI: 10.1016/j.virusres.2017.06.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 01/05/2023]
Abstract
BACKGROUND AND AIMS Chronic hepatitis B virus (HBV) infection is an important worldwide public health issue. Further knowledge of the characteristics of HBV will facilitate its eradication. Genome structural variations (SVs) are defined by their canonical forms, such as duplication, deletion, and insertion. However, recent studies have reported complex SVs that cannot be explained by those canonical SVs. An HBV strain (UK2) with an unusual genome structure rearrangement that was completely different from known mutations or rearrangements was previously reported. Thus, this study was conducted to confirm the rearrangement in UK2 as a novel complex SV, and to find additional HBV strains with complex SVs. Further, the contribution of complex SVs to hepadnavirus variability was investigated. METHODS The genome rearrangement pattern in UK2 was analyzed. Further, a search of online databases retrieved additional HBV strains that were candidates to harbor complex SVs. The architecture of each rearrangement in the candidate strains was analyzed with bioinformatics tools. In addition, alignment of woolly monkey hepatitis virus (WMHV) and HBV from human and non-human primates was performed to investigate the contribution of complex SVs to the variability of hepadnaviruses. RESULTS The rearrangement in UK2 was confirmed as a complex SV. An additional 15 HBV strains were retrieved from databases and confirmed as harboring complex SVs. Complex combinations of deletion, insertion, and duplication characterized the novel rearrangements. The complex SVs in six strains (37.5%) were composed of deletion, insertion, and duplication. The complex SVs in another six strains (37.5%) consisted of deletion and insertion, followed by insertion and duplication in three strains (18.8%), and deletion and duplication in one strain (6.3%). In addition, unique preS1 promoter insertions, which contained the hepatocyte nuclear factor 1 binding site, were observed in seven (43.8%) of 16 strains. Further, analysis of the genetic sequences of WMHV and HBV from human and non-human primates showed that complex combinations of deletions and insertions accounted for their genetic differences. CONCLUSIONS Non-canonical genetic rearrangements termed complex SVs were observed in HBV. Further, complex SVs accounted for the genetic differences of WMHV and HBV from human and non-human primates.
Affiliation(s)
- Kei Fujiwara
  - Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan.
- Kayoko Matsunami
  - Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Etsuko Iio
  - Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Shunsuke Nojiri
  - Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Takashi Joh
  - Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
|
43
|
Chiang C, Scott AJ, Davis JR, Tsang EK, Li X, Kim Y, Hadzic T, Damani FN, Ganel L, Montgomery SB, Battle A, Conrad DF, Hall IM. The impact of structural variation on human gene expression. Nat Genet 2017; 49:692-699. [PMID: 28369037 PMCID: PMC5406250 DOI: 10.1038/ng.3834] [Citation(s) in RCA: 267] [Impact Index Per Article: 38.1] [Received: 06/09/2016] [Accepted: 03/13/2017] [Indexed: 12/31/2022]
Abstract
Structural variants (SVs) are an important source of human genetic diversity, but their contribution to traits, disease and gene regulation remains unclear. We mapped cis expression quantitative trait loci (eQTLs) in 13 tissues via joint analysis of SVs, single-nucleotide variants (SNVs) and short insertion/deletion (indel) variants from deep whole-genome sequencing (WGS). We estimated that SVs are causal at 3.5-6.8% of eQTLs, a substantially higher fraction than prior estimates, and that expression-altering SVs have larger effect sizes than do SNVs and indels. We identified 789 putative causal SVs predicted to directly alter gene expression: most (88.3%) were noncoding variants enriched at enhancers and other regulatory elements, and 52 were linked to genome-wide association study loci. We observed a notable abundance of rare high-impact SVs associated with aberrant expression of nearby genes. These results suggest that comprehensive WGS-based SV analyses will increase the power of common- and rare-variant association studies.
Affiliation(s)
- Colby Chiang
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Alexandra J. Scott
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Joe R. Davis
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Emily K. Tsang
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
  - Biomedical Informatics Program, Stanford University School of Medicine, Stanford, CA, USA
- Xin Li
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Yungil Kim
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Tarik Hadzic
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Farhan N. Damani
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Liron Ganel
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Stephen B. Montgomery
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Computer Science, Stanford University, Stanford, CA, USA
- Alexis Battle
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Donald F. Conrad
  - Department of Pathology & Immunology, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
- Ira M. Hall
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Pathology & Immunology, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
|
44
|
Global analysis of somatic structural genomic alterations and their impact on gene expression in diverse human cancers. Proc Natl Acad Sci U S A 2016; 113:13768-13773. [PMID: 27856756 DOI: 10.1073/pnas.1606220113] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Indexed: 12/18/2022] Open
Abstract
Tumor genomes are mosaics of somatic structural variants (SVs) that may contribute to the activation of oncogenes or inactivation of tumor suppressors, for example, by altering gene copy number amplitude. However, there are multiple other ways in which SVs can modulate transcription, but the general impact of such events on tumor transcriptional output has not been systematically determined. Here we use whole-genome sequencing data to map SVs across 600 tumors and 18 cancers, and investigate the relationship between SVs, copy number alterations (CNAs), and mRNA expression. We find that 34% of CNA breakpoints can be clarified structurally and that most amplifications are due to tandem duplications. We observe frequent swapping of strong and weak promoters in the context of gene fusions, and find that this has a measurable global impact on mRNA levels. Interestingly, several long noncoding RNAs were strongly activated by this mechanism. Additionally, SVs were confirmed in telomere reverse transcriptase (TERT) upstream regions in several cancers, associated with elevated TERT mRNA levels. We also highlight high-confidence gene fusions supported by both genomic and transcriptomic evidence, including a previously undescribed paired box 8 (PAX8)-nuclear factor, erythroid 2 like 2 (NFE2L2) fusion in thyroid carcinoma. In summary, we combine SV, CNA, and expression data to provide insights into the structural basis of CNAs as well as the impact of SVs on gene expression in tumors.
|
45
|
Zhao X, Emery SB, Myers B, Kidd JM, Mills RE. Resolving complex structural genomic rearrangements using a randomized approach. Genome Biol 2016; 17:126. [PMID: 27287201 PMCID: PMC4901421 DOI: 10.1186/s13059-016-0993-1] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 04/20/2016] [Accepted: 05/25/2016] [Indexed: 12/27/2022] Open
Abstract
Complex chromosomal rearrangements are structural genomic alterations involving multiple instances of deletions, duplications, inversions, or translocations that co-occur either on the same chromosome or represent different overlapping events on homologous chromosomes. We present SVelter, an algorithm that identifies regions of the genome suspected to harbor a complex event and then resolves the structure by iteratively rearranging the local genome structure, in a randomized fashion, with each structure scored against characteristics of the observed sequencing data. SVelter is able to accurately reconstruct complex chromosomal rearrangements when compared to well-characterized genomes that have been deeply sequenced with both short and long reads.
Affiliation(s)
- Xuefang Zhao
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
- Sarah B Emery
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Bridget Myers
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Jeffrey M Kidd
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Ryan E Mills
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
|
46
|
Pedersen BS, Layer RM, Quinlan AR. Vcfanno: fast, flexible annotation of genetic variants. Genome Biol 2016; 17:118. [PMID: 27250555 PMCID: PMC4888505 DOI: 10.1186/s13059-016-0973-5] [Citation(s) in RCA: 108] [Impact Index Per Article: 13.5] [Received: 03/01/2016] [Accepted: 05/03/2016] [Indexed: 01/17/2023] Open
Abstract
The integration of genome annotations is critical to the identification of genetic variants that are relevant to studies of disease or other traits. However, comprehensive variant annotation with diverse file formats is difficult with existing methods. Here we describe vcfanno, which flexibly extracts and summarizes attributes from multiple annotation files and integrates the annotations within the INFO column of the original VCF file. By leveraging a parallel "chromosome sweeping" algorithm, we demonstrate substantial performance gains by annotating ~85,000 variants per second with 50 attributes from 17 commonly used genome annotation resources. Vcfanno is available at https://github.com/brentp/vcfanno under the MIT license.
Affiliation(s)
- Brent S Pedersen
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, 84105, USA
  - USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, 84105, USA
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, 84105, USA
- Ryan M Layer
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, 84105, USA
  - USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, 84105, USA
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, 84105, USA
- Aaron R Quinlan
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, 84105, USA
  - USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, 84105, USA
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, 84105, USA
|
47
|
Del Rey J, Santos M, González-Meneses A, Milà M, Fuster C. Heterogeneity of a Constitutional Complex Chromosomal Rearrangement in 2q. Cytogenet Genome Res 2016; 148:156-64. [PMID: 27216161 DOI: 10.1159/000445859] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Accepted: 03/03/2016] [Indexed: 11/19/2022] Open
Abstract
Complex chromosome rearrangements (CCRs) are unusual structural chromosome alterations found in humans, and to date only a few have been characterized molecularly. New mechanisms, such as chromothripsis, have been proposed to explain the presence of the CCRs in cancer cells and in patients with congenital disorders and/or mental retardation. The aim of the present study was the molecular characterization of a constitutional CCR in a girl with multiple congenital disorders and intellectual disability in order to determine the genotype-phenotype relation and to clarify whether the CCR could have been caused by chromosomal catastrophic events. The present CCR was characterized by G-banding, high-resolution CGH, multiplex ligation-dependent probe amplification and subtelomeric 2q-FISH analyses. Preliminary results indicate that the de novo CCR is unbalanced, showing a 2q37.3 deletion and 2q34q37.2 partial trisomy. Our patient shows some of the typical traits and intellectual disability described in patients with 2q37 deletion and also in carriers of 2q34q37.2 partial trisomy; thus, the clinical disorders could be explained by additional effects of both chromosome alterations (deletions and duplications). A subsequent sequential FISH study using BAC probes revealed the unexpected presence of at least 17 different reorganizations affecting 2q34q37.2, suggesting the existence of chromosome instability in this region. The present CCR is the first case described in the literature of heterogeneity of unbalanced CCRs affecting a small region of 2q, indicating that the mechanisms involved in constitutional chromosome rearrangement may be more complex than previously thought.
Affiliation(s)
- Javier Del Rey
  - Unitat de Biologia Cel·lular i Genètica Mèdica, Facultat de Medicina, Universitat Autònoma de Barcelona, Barcelona, Spain
|
48
|
Hazen JL, Faust GG, Rodriguez AR, Ferguson WC, Shumilina S, Clark RA, Boland MJ, Martin G, Chubukov P, Tsunemoto RK, Torkamani A, Kupriyanov S, Hall IM, Baldwin KK. The Complete Genome Sequences, Unique Mutational Spectra, and Developmental Potency of Adult Neurons Revealed by Cloning. Neuron 2016; 89:1223-1236. [PMID: 26948891 DOI: 10.1016/j.neuron.2016.02.004] [Citation(s) in RCA: 66] [Impact Index Per Article: 8.3] [Received: 10/16/2015] [Revised: 12/14/2015] [Accepted: 01/13/2016] [Indexed: 02/07/2023]
Abstract
Somatic mutation in neurons is linked to neurologic disease and implicated in cell-type diversification. However, the origin, extent, and patterns of genomic mutation in neurons remain unknown. We established a nuclear transfer method to clonally amplify the genomes of neurons from adult mice for whole-genome sequencing. Comprehensive mutation detection and independent validation revealed that individual neurons harbor ∼100 unique mutations from all classes but lack recurrent rearrangements. Most neurons contain at least one gene-disrupting mutation and rare (0-2) mobile element insertions. The frequency and gene bias of neuronal mutations differ from other lineages, potentially due to novel mechanisms governing postmitotic mutation. Fertile mice were cloned from several neurons, establishing the compatibility of mutated adult neuronal genomes with reprogramming to pluripotency and development.
Affiliation(s)
- Jennifer L Hazen
  - Department of Molecular and Cellular Neuroscience, The Scripps Research Institute, 10550 N Torrey Pines Road, La Jolla CA 92037, USA
- Gregory G Faust
  - Department of Biochemistry and Molecular Genetics, 1340 Jefferson Park Ave, University of Virginia School of Medicine, Charlottesville, VA 22901, USA
- Alberto R Rodriguez
  - Mouse Genetics Core Facility, The Scripps Research Institute, 10550 N. Torrey Pines Road, La Jolla, CA 92037, USA
- William C Ferguson
  - Department of Molecular and Cellular Neuroscience, The Scripps Research Institute, 10550 N Torrey Pines Road, La Jolla CA 92037, USA
- Svetlana Shumilina
  - Department of Biochemistry and Molecular Genetics, 1340 Jefferson Park Ave, University of Virginia School of Medicine, Charlottesville, VA 22901, USA
- Royden A Clark
  - Department of Biochemistry and Molecular Genetics, 1340 Jefferson Park Ave, University of Virginia School of Medicine, Charlottesville, VA 22901, USA
- Michael J Boland
  - Department of Molecular and Cellular Neuroscience, The Scripps Research Institute, 10550 N Torrey Pines Road, La Jolla CA 92037, USA
- Greg Martin
  - Mouse Genetics Core Facility, The Scripps Research Institute, 10550 N. Torrey Pines Road, La Jolla, CA 92037, USA
- Pavel Chubukov
  - Department of Molecular and Cellular Neuroscience, The Scripps Research Institute, 10550 N Torrey Pines Road, La Jolla CA 92037, USA
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 N. Torrey Pines Road, La Jolla, CA 92037, USA
- Rachel K Tsunemoto
  - Department of Molecular and Cellular Neuroscience, The Scripps Research Institute, 10550 N Torrey Pines Road, La Jolla CA 92037, USA
  - Neuroscience Graduate Program, 9500 Gilman Drive, University of California San Diego, La Jolla, California, USA
- Ali Torkamani
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 N. Torrey Pines Road, La Jolla, CA 92037, USA
- Sergey Kupriyanov
  - Mouse Genetics Core Facility, The Scripps Research Institute, 10550 N. Torrey Pines Road, La Jolla, CA 92037, USA
- Ira M Hall
  - McDonnell Genome Institute, Washington University School of Medicine, 4444 Forest Park Ave, St. Louis, MO 63108, USA
  - Department of Medicine, Washington University School of Medicine, 660 S Euclid Ave, St. Louis, MO 63110, USA
- Kristin K Baldwin
  - Department of Molecular and Cellular Neuroscience, The Scripps Research Institute, 10550 N Torrey Pines Road, La Jolla CA 92037, USA
  - Neuroscience Graduate Program, 9500 Gilman Drive, University of California San Diego, La Jolla, California, USA
|
49
|
Yamagata K, Yamanishi A, Kokubu C, Takeda J, Sese J. COSMOS: accurate detection of somatic structural variations through asymmetric comparison between tumor and normal samples. Nucleic Acids Res 2016; 44:e78. [PMID: 26833260 PMCID: PMC4856976 DOI: 10.1093/nar/gkw026] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/13/2015] [Accepted: 01/11/2016] [Indexed: 11/13/2022] Open
Abstract
An important challenge in cancer genomics is precise detection of structural variations (SVs) by high-throughput short-read sequencing, which is hampered by the high false discovery rates of existing analysis tools. Here, we propose an accurate SV detection method named COSMOS, which compares the statistics of the mapped read pairs in tumor samples with isogenic normal control samples in a distinct asymmetric manner. COSMOS also prioritizes the candidate SVs using strand-specific read-depth information. Performance tests on modeled tumor genomes revealed that COSMOS outperformed existing methods in terms of F-measure. We also applied COSMOS to an experimental mouse cell-based model, in which SVs were induced by genome engineering and gamma-ray irradiation, followed by polymerase chain reaction-based confirmation. The precision of COSMOS was 84.5%, while the next best existing method was 70.4%. Moreover, the sensitivity of COSMOS was the highest, indicating that COSMOS has great potential for cancer genome analysis.
Affiliation(s)
- Koichi Yamagata
  - Biotechnology Research Institute for Drug Discovery, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, 135-0064, Japan
- Ayako Yamanishi
  - Department of Genome Biology, Graduate School of Medicine, Osaka University, Osaka, 565-0871, Japan
- Chikara Kokubu
  - Department of Genome Biology, Graduate School of Medicine, Osaka University, Osaka, 565-0871, Japan
- Junji Takeda
  - Department of Genome Biology, Graduate School of Medicine, Osaka University, Osaka, 565-0871, Japan
- Jun Sese
  - Biotechnology Research Institute for Drug Discovery, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, 135-0064, Japan
  - Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, 135-0064, Japan
|
50
|
Structural Variant Detection by Large-scale Sequencing Reveals New Evolutionary Evidence on Breed Divergence between Chinese and European Pigs. Sci Rep 2016; 6:18501. [PMID: 26729041 PMCID: PMC4700453 DOI: 10.1038/srep18501] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/26/2015] [Accepted: 11/19/2015] [Indexed: 01/28/2023] Open
Abstract
In this study, we performed genome-wide SV detection among the genomes of thirteen pigs from diverse breeds of Chinese and European origin by next-generation sequencing, and constructed a single-nucleotide resolution map involving 56,930 putative SVs. We first identified an SV hotspot spanning a 35 Mb region on the X chromosome specifically in the genomes of individuals of Chinese origin. Further scrutinizing this region with large-scale sequencing data from an additional 111 individuals, we obtained confirmatory evidence for our initial finding. Moreover, thirty-five SV-related genes within the hotspot region, which are important for reproductive ability, showed significantly different evolutionary rates between breeds of Chinese and European origin. The SV hotspot identified herein offers novel evidence for assessing phylogenetic relationships, and likely explains the genetic differences underlying the corresponding phenotypes and features among Chinese and European pig breeds. Furthermore, we employed various SVs to infer the genetic structure of the individuals surveyed. We found that SVs can clearly detect differences in genetic background among individuals. This suggests that genome-wide SVs can capture the majority of genetic variation and be applied in cladistic analyses. Characterizing whole-genome SVs demonstrated that SVs are significantly enriched or depleted with respect to various genomic features.
|