101
Harris AM, DeGiorgio M. Identifying and Classifying Shared Selective Sweeps from Multilocus Data. Genetics 2020; 215:143-171. [PMID: 32152048 PMCID: PMC7198270 DOI: 10.1534/genetics.120.303137] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/22/2019] [Accepted: 02/29/2020] [Indexed: 11/18/2022] Open
Abstract
Positive selection causes beneficial alleles to rise to high frequency, resulting in a selective sweep of the diversity surrounding the selected sites. Accordingly, the signature of a selective sweep in an ancestral population may still remain in its descendants. Identifying signatures of selection in the ancestor that are shared among its descendants is important to contextualize the timing of a sweep, but few methods exist for this purpose. We introduce the statistic SS-H12, which can identify genomic regions under shared positive selection across populations and is based on the theory of the expected haplotype homozygosity statistic H12, which detects recent hard and soft sweeps from the presence of high-frequency haplotypes. SS-H12 is distinct from comparable statistics because it requires a minimum of only two populations, and properly identifies and differentiates between independent convergent sweeps and true ancestral sweeps, with high power and robustness to a variety of demographic models. Furthermore, we can apply SS-H12 in conjunction with the ratio of statistics we term [Formula: see text] and [Formula: see text] to further classify identified shared sweeps as hard or soft. Finally, we identified both previously reported and novel shared sweep candidates from human whole-genome sequences. Previously reported candidates include the well-characterized ancestral sweeps at LCT and SLC24A5 in Indo-Europeans, as well as GPHN worldwide. Novel candidates include an ancestral sweep at RGS18 in sub-Saharan Africans involved in regulating the platelet response and implicated in sudden cardiac death, and a convergent sweep at C2CD5 between European and East Asian populations that may explain their different insulin responses.
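As a rough illustration of the haplotype-homozygosity idea the abstract builds on, the following is a minimal sketch of single-population H12 as it is commonly defined: the two most frequent haplotype frequencies are pooled before squaring, and the squared frequencies of the remaining haplotypes are added. It is not SS-H12 and not the authors' implementation; the function name `h12` and the representation of haplotypes as strings within one analysis window are assumptions made for the example.

```python
from collections import Counter


def h12(haplotypes):
    """Sketch of H12 for one window: (p1 + p2)^2 + sum of p_i^2 for i >= 3.

    `haplotypes` is a hypothetical input: one hashable label (for example a
    string of alleles) per sampled haplotype in the window.
    """
    counts = Counter(haplotypes)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("at least one haplotype is required")
    # Haplotype frequencies, most frequent first.
    freqs = sorted((c / n for c in counts.values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    # Pool the two most frequent haplotypes, then add the remaining terms.
    return (p1 + p2) ** 2 + sum(p * p for p in freqs[2:])


if __name__ == "__main__":
    sample = ["AAGT"] * 5 + ["AACT"] * 3 + ["TAGT"] * 2
    print(h12(sample))
```

Pooling the top two frequencies is what lets the statistic stay elevated when two haplotypes have risen to high frequency together, which is the soft-sweep case mentioned in the abstract.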
Affiliation(s)
- Alexandre M Harris
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802
- Molecular, Cellular, and Integrative Biosciences at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802
- Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida 33431
102
Sheng Q, Yu H, Oyebamiji O, Wang J, Chen D, Ness S, Zhao YY, Guo Y. AnnoGen: annotating genome-wide pragmatic features. Bioinformatics 2020; 36:2899-2901. [PMID: 31930398 PMCID: PMC7203733 DOI: 10.1093/bioinformatics/btaa027] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/12/2019] [Revised: 12/19/2019] [Accepted: 01/08/2020] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Genome annotation is an important step for all in-depth bioinformatics analysis. It is imperative to augment the quantity and diversity of genome-wide annotation data for the latest reference genome to promote its adoption by ongoing and future impactful studies. RESULTS: We developed a Python toolkit, AnnoGen, which for the first time allows the annotation of three pragmatic genomic features for the GRCh38 genome in enormous base-wise quantities. The three features are chemical binding Energy, sequence information Entropy and Homology Score. The Homology Score is an exceptional feature that captures genome-wide homology through single-base-offset tiling windows of 100 consecutive nucleotide bases. AnnoGen can annotate its own pragmatic features for variable user-specified genomic regions and optionally compare two parallel sets of genomic regions. AnnoGen is characterized by simple utility modes and a succinct HTML report of informative statistical tables and plots. AVAILABILITY AND IMPLEMENTATION: https://github.com/shengqh/annogen.
Affiliation(s)
- Quanhu Sheng
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Hui Yu
- Department of Internal Medicine, Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM 87109, USA
- Olufunmilola Oyebamiji
- Department of Internal Medicine, Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM 87109, USA
- Jiandong Wang
- Department of Computer Science, University of South Carolina, Columbia, SC 29205, USA
- Danqian Chen
- Key Laboratory of Resource Biology and Biotechnology, Western China School of Life Sciences, Northwest University, Xi'an, Shaanxi, China
- Scott Ness
- Department of Internal Medicine, Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM 87109, USA
- Ying-Yong Zhao
- Key Laboratory of Resource Biology and Biotechnology, Western China School of Life Sciences, Northwest University, Xi'an, Shaanxi, China
- Yan Guo
- Department of Internal Medicine, Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM 87109, USA
103
Chow NA, Muñoz JF, Gade L, Berkow EL, Li X, Welsh RM, Forsberg K, Lockhart SR, Adam R, Alanio A, Alastruey-Izquierdo A, Althawadi S, Araúz AB, Ben-Ami R, Bharat A, Calvo B, Desnos-Ollivier M, Escandón P, Gardam D, Gunturu R, Heath CH, Kurzai O, Martin R, Litvintseva AP, Cuomo CA. Tracing the Evolutionary History and Global Expansion of Candida auris Using Population Genomic Analyses. mBio 2020; 11:e03364-19. [PMID: 32345637 PMCID: PMC7188998 DOI: 10.1128/mbio.03364-19] [Citation(s) in RCA: 197] [Impact Index Per Article: 49.3] [Received: 12/20/2019] [Accepted: 04/01/2020] [Indexed: 01/26/2023] Open
Abstract
Candida auris has emerged globally as a multidrug-resistant yeast that can spread via nosocomial transmission. An initial phylogenetic study of isolates from Japan, India, Pakistan, South Africa, and Venezuela revealed four populations (clades I, II, III, and IV) corresponding to these geographic regions. Since this description, C. auris has been reported in more than 30 additional countries. To trace this global emergence, we compared the genomes of 304 C. auris isolates from 19 countries on six continents. We found that four predominant clades persist across wide geographic locations. We observed phylogeographic mixing in most clades; clade IV, with isolates mainly from South America, demonstrated the strongest phylogeographic substructure. C. auris isolates from two clades with opposite mating types were detected contemporaneously in a single health care facility in Kenya. We estimated a Bayesian molecular clock phylogeny and dated the origin of each clade within the last 360 years; outbreak-causing clusters from clades I, III, and IV originated 36 to 38 years ago. We observed high rates of antifungal resistance in clade I, including four isolates resistant to all three major classes of antifungals. Mutations that contribute to resistance varied between the clades, with Y132F in ERG11 as the most widespread mutation associated with azole resistance and S639P in FKS1 for echinocandin resistance. Copy number variants in ERG11 predominantly appeared in clade III and were associated with fluconazole resistance. These results provide a global context for the phylogeography, population structure, and mechanisms associated with antifungal resistance in C. auris.
IMPORTANCE: In less than a decade, C. auris has emerged in health care settings worldwide; this species is capable of colonizing skin and causing outbreaks of invasive candidiasis. In contrast to other Candida species, C. auris is unique in its ability to spread via nosocomial transmission and its high rates of drug resistance. As part of the public health response, whole-genome sequencing has played a major role in characterizing transmission dynamics and detecting new C. auris introductions. Through a global collaboration, we assessed genome evolution of isolates of C. auris from 19 countries. Here, we described estimated timing of the expansion of each C. auris clade and of fluconazole resistance, characterized discrete phylogeographic population structure of each clade, and compared genome data to sensitivity measurements to describe how antifungal resistance mechanisms vary across the population. These efforts are critical for a sustained, robust public health response that effectively utilizes molecular epidemiology.
Affiliation(s)
- Nancy A Chow
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, U.S. Department of Health and Human Services, Atlanta, Georgia, USA
- José F Muñoz
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Lalitha Gade
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, U.S. Department of Health and Human Services, Atlanta, Georgia, USA
- Elizabeth L Berkow
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, U.S. Department of Health and Human Services, Atlanta, Georgia, USA
- Xiao Li
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Rory M Welsh
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, U.S. Department of Health and Human Services, Atlanta, Georgia, USA
- Kaitlin Forsberg
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, U.S. Department of Health and Human Services, Atlanta, Georgia, USA
- Shawn R Lockhart
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, U.S. Department of Health and Human Services, Atlanta, Georgia, USA
- Rodney Adam
- Department of Pathology, Aga Khan University Hospital, Nairobi, Kenya
- Alexandre Alanio
- Institut Pasteur, Molecular Mycology Unit, CNRS UMR2000, National Reference Center for Invasive Mycoses and Antifungals (NRCMA), Paris, France
- Laboratoire de Parasitologie-Mycologie, Hôpital Saint-Louis, Groupe Hospitalier Lariboisière, Saint-Louis, Fernand Widal, Assistance Publique-Hôpitaux de Paris (AP-HP), Paris, France
- Université Paris Diderot, Université de Paris, Paris, France
- Ana Alastruey-Izquierdo
- Mycology Reference Laboratory, National Centre for Microbiology, Instituto de Salud Carlos III, Majadahonda, Madrid, Spain
- Sahar Althawadi
- Department of Pathology and Laboratory Medicine, King Faisal Specialist Hospital and Research Centre, Riyadh, Saudi Arabia
- Ronen Ben-Ami
- Infectious Diseases Unit, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
- Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Amrita Bharat
- National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, Manitoba, Canada
- Belinda Calvo
- Department of Infectious Diseases, School of Medicine, Universidad del Zulia, Maracaibo, Venezuela
- Marie Desnos-Ollivier
- Institut Pasteur, Molecular Mycology Unit, CNRS UMR2000, National Reference Center for Invasive Mycoses and Antifungals (NRCMA), Paris, France
- Patricia Escandón
- Grupo de Microbiología, Instituto Nacional de Salud, Bogotá, Colombia
- Dianne Gardam
- Department of Microbiology, PathWest Laboratory Medicine FSH Network, Fiona Stanley Hospital, Murdoch, Australia
- Revathi Gunturu
- Department of Pathology, Aga Khan University Hospital, Nairobi, Kenya
- Christopher H Heath
- Department of Microbiology, PathWest Laboratory Medicine FSH Network, Fiona Stanley Hospital, Murdoch, Australia
- Department of Infectious Diseases, Fiona Stanley Hospital, Murdoch, Australia
- Infectious Diseases, Royal Perth Hospital, Perth, Australia
- Faculty of Health & Medical Sciences, University of Western Australia, Crawley, Western Australia, Australia
- Oliver Kurzai
- German National Reference Center for Invasive Fungal Infections NRZMyk, Leibniz Institute for Natural Product Research and Infection Biology-Hans-Knöll-Institute, Jena, Germany
- Institute for Hygiene and Microbiology, University of Würzburg, Würzburg, Germany
- Ronny Martin
- German National Reference Center for Invasive Fungal Infections NRZMyk, Leibniz Institute for Natural Product Research and Infection Biology-Hans-Knöll-Institute, Jena, Germany
- Institute for Hygiene and Microbiology, University of Würzburg, Würzburg, Germany
- Anastasia P Litvintseva
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, U.S. Department of Health and Human Services, Atlanta, Georgia, USA
104
Delmore K, Illera JC, Pérez-Tris J, Segelbacher G, Lugo Ramos JS, Durieux G, Ishigohoka J, Liedvogel M. The evolutionary history and genomics of European blackcap migration. eLife 2020; 9:e54462. [PMID: 32312383 PMCID: PMC7173969 DOI: 10.7554/elife.54462] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 12/16/2019] [Accepted: 03/13/2020] [Indexed: 12/19/2022] Open
Abstract
Seasonal migration is a taxonomically widespread behaviour that integrates across many traits. The European blackcap exhibits enormous variation in migration and is renowned for research on its evolution and genetic basis. We assembled a reference genome for blackcaps and obtained whole genome resequencing data from individuals across its breeding range. Analyses of population structure and demography suggested divergence began ~30,000 ya, with evidence for one admixture event between migrant and resident continent birds ~5000 ya. The propensity to migrate, orientation and distance of migration all map to a small number of genomic regions that do not overlap with results from other species, suggesting that there are multiple ways to generate variation in migration. Strongly associated single nucleotide polymorphisms (SNPs) were located in regulatory regions of candidate genes that may serve as major regulators of the migratory syndrome. Evidence for selection on shared variation was documented, providing a mechanism by which rapid changes may evolve.
Affiliation(s)
- Kira Delmore
- Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Juan Carlos Illera
- Research Unit of Biodiversity (UO-CSIC-PA), Oviedo University, Mieres, Spain
- Javier Pérez-Tris
- Department of Biodiversity, Ecology and Evolution, Complutense University of Madrid, Madrid, Spain
- Juan S Lugo Ramos
- Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Gillian Durieux
- Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Jun Ishigohoka
- Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Miriam Liedvogel
- Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
105
KaramiNejadRanjbar M, Sharifzadeh S, Wietek NC, Artibani M, El-Sahhar S, Sauka-Spengler T, Yau C, Tresp V, Ahmed AA. A highly accurate platform for clone-specific mutation discovery enables the study of active mutational processes. eLife 2020; 9:e55207. [PMID: 32255426 PMCID: PMC7228773 DOI: 10.7554/elife.55207] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/16/2020] [Accepted: 04/07/2020] [Indexed: 12/14/2022] Open
Abstract
Bulk whole genome sequencing (WGS) enables the analysis of tumor evolution but, because of depth limitations, can only identify old mutational events. The discovery of current mutational processes for predicting the tumor’s evolutionary trajectory requires dense sequencing of individual clones or single cells. Such studies, however, are inherently problematic because of the discovery of excessive false positive (FP) mutations when sequencing picogram quantities of DNA. Data pooling to increase confidence in the discovered mutations moves the discovery back in the past to a common ancestor. Here we report a robust WGS and analysis pipeline (DigiPico/MutLX) that virtually eliminates all FP results while retaining an excellent proportion of true positives. Using our method, we identified, for the first time, a hyper-mutation (kataegis) event in a group of ∼30 cancer cells from a recurrent ovarian carcinoma. This was unidentifiable from the bulk WGS data. Overall, we propose the DigiPico/MutLX method as a powerful framework for the identification of clone-specific variants at an unprecedented accuracy.
Affiliation(s)
- Mohammad KaramiNejadRanjbar
- Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, United Kingdom; Nuffield Department of Women's & Reproductive Health, University of Oxford, Oxford, United Kingdom
- Nina C Wietek
- Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, United Kingdom; Nuffield Department of Women's & Reproductive Health, University of Oxford, Oxford, United Kingdom
- Mara Artibani
- Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, United Kingdom; Nuffield Department of Women's & Reproductive Health, University of Oxford, Oxford, United Kingdom
- Salma El-Sahhar
- Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, United Kingdom; Nuffield Department of Women's & Reproductive Health, University of Oxford, Oxford, United Kingdom
- Tatjana Sauka-Spengler
- Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, United Kingdom; Radcliffe Department of Medicine, University of Oxford, Oxford, United Kingdom
- Christopher Yau
- Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, United Kingdom
- Volker Tresp
- Ludwig Maximilian University of Munich, Munich, Germany; Siemens AG, Corporate Technology, Munich, Germany
- Ahmed A Ahmed
- Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, United Kingdom; Nuffield Department of Women's & Reproductive Health, University of Oxford, Oxford, United Kingdom
106
Cresswell GD, Nichol D, Spiteri I, Tari H, Zapata L, Heide T, Maley CC, Magnani L, Schiavon G, Ashworth A, Barry P, Sottoriva A. Mapping the breast cancer metastatic cascade onto ctDNA using genetic and epigenetic clonal tracking. Nat Commun 2020; 11:1446. [PMID: 32221288 PMCID: PMC7101390 DOI: 10.1038/s41467-020-15047-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 08/07/2019] [Accepted: 02/18/2020] [Indexed: 02/06/2023] Open
Abstract
Circulating tumour DNA (ctDNA) allows tracking of the evolution of human cancers at high resolution, overcoming many limitations of tissue biopsies. However, exploiting ctDNA to determine how a patient's cancer is evolving in order to aid clinical decisions remains difficult. This is because ctDNA is a mix of fragmented alleles, and the contribution of different cancer deposits to ctDNA is largely unknown. Profiling ctDNA almost invariably requires prior knowledge of what genomic alterations to track. Here, we leverage a rapid autopsy programme to demonstrate that unbiased genomic characterisation of several metastatic sites and concomitant ctDNA profiling at whole-genome resolution reveals the extent to which ctDNA is representative of widespread disease. We also present a methylation profiling method that allows tracking of evolutionary changes in ctDNA at single-molecule resolution without prior knowledge. These results have critical implications for the use of liquid biopsies to monitor cancer evolution in humans and guide treatment.
Affiliation(s)
- George D Cresswell
- Evolutionary Genomics and Modelling Lab, Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK
- Daniel Nichol
- Evolutionary Genomics and Modelling Lab, Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK
- Inmaculada Spiteri
- Evolutionary Genomics and Modelling Lab, Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK
- Haider Tari
- Evolutionary Genomics and Modelling Lab, Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK
- Glioma Lab, The Institute of Cancer Research, London, UK
- Luis Zapata
- Evolutionary Genomics and Modelling Lab, Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK
- Timon Heide
- Evolutionary Genomics and Modelling Lab, Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK
- Carlo C Maley
- Arizona Cancer Evolution Center, Biodesign Institute, Arizona State University, Tempe, AZ, USA
- Luca Magnani
- Department of Surgery and Cancer, Imperial College London, London, UK
- Gaia Schiavon
- Breast Unit, Royal Marsden Hospital, London, UK
- AstraZeneca, Oncology R&D, Cambridge, UK
- Alan Ashworth
- UCSF Helen Diller Family Comprehensive Cancer Center, 1450 3rd St, San Francisco, CA, 94158, USA
- Peter Barry
- Breast Unit, Royal Marsden Hospital, London, UK
- Andrea Sottoriva
- Evolutionary Genomics and Modelling Lab, Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK
107
Caspar SM, Schneider T, Meienberg J, Matyas G. Added Value of Clinical Sequencing: WGS-Based Profiling of Pharmacogenes. Int J Mol Sci 2020; 21:ijms21072308. [PMID: 32225115 PMCID: PMC7178228 DOI: 10.3390/ijms21072308] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 02/07/2020] [Revised: 03/23/2020] [Accepted: 03/24/2020] [Indexed: 12/13/2022] Open
Abstract
Although several pharmacogenetic (PGx) predispositions affecting drug efficacy and safety are well established, drug selection and dosing as well as clinical trials are often performed in a non-pharmacogenetically-stratified manner, ultimately burdening healthcare systems. Pre-emptive PGx testing offers a solution which is often performed using microarrays or targeted gene panels, testing for common/known PGx variants. However, as an added value, whole-genome sequencing (WGS) could detect not only disease-causing but also pharmacogenetically-relevant variants in a single assay. Here, we present our WGS-based pipeline that extends the genetic testing of Mendelian diseases with PGx profiling, enabling the detection of rare/novel PGx variants as well. From our in-house WGS (PCR-free 60× PE150) data of 547 individuals we extracted PGx variants with drug-dosing recommendations of the Dutch Pharmacogenetics Working Group (DPWG). Furthermore, we explored the landscape of DPWG pharmacogenes in gnomAD and our in-house cohort as well as compared bioinformatic tools for WGS-based structural variant detection in CYP2D6. We show that although common/known PGx variants comprise the vast majority of detected DPWG pharmacogene alleles, for better precision medicine, PGx testing should move towards WGS-based approaches. Indeed, WGS-based PGx profiling is not only feasible and future-oriented but also the most comprehensive all-in-one approach without generating significant additional costs.
Affiliation(s)
- Sylvan M. Caspar
- Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, 8952 Schlieren-Zurich, Switzerland
- Laboratory of Translational Nutrition Biology, Department of Health Sciences and Technology, ETH Zurich, 8603 Schwerzenbach, Switzerland
- Timo Schneider
- Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, 8952 Schlieren-Zurich, Switzerland
- Janine Meienberg
- Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, 8952 Schlieren-Zurich, Switzerland
- Gabor Matyas
- Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, 8952 Schlieren-Zurich, Switzerland
- Zurich Center for Integrative Human Physiology, University of Zurich, 8057 Zurich, Switzerland
- Correspondence: Tel.: +41-43-433-86-86
108
Franssen SU, Durrant C, Stark O, Moser B, Downing T, Imamura H, Dujardin JC, Sanders MJ, Mauricio I, Miles MA, Schnur LF, Jaffe CL, Nasereddin A, Schallig H, Yeo M, Bhattacharyya T, Alam MZ, Berriman M, Wirth T, Schönian G, Cotton JA. Global genome diversity of the Leishmania donovani complex. eLife 2020; 9:e51243. [PMID: 32209228 PMCID: PMC7105377 DOI: 10.7554/elife.51243] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Received: 08/21/2019] [Accepted: 02/27/2020] [Indexed: 12/30/2022] Open
Abstract
Protozoan parasites of the Leishmania donovani complex - L. donovani and L. infantum - cause the fatal disease visceral leishmaniasis. We present the first comprehensive genome-wide global study, with 151 cultured field isolates representing most of the geographical distribution. L. donovani isolates separated into five groups that largely coincide with geographical origin but vary greatly in diversity. In contrast, the majority of L. infantum samples fell into one globally-distributed group with little diversity. This picture is complicated by several hybrid lineages. Identified genetic groups vary in heterozygosity and levels of linkage, suggesting different recombination histories. We characterise chromosome-specific patterns of aneuploidy and identify extensive structural variation, including known and suspected drug resistance loci. This study reveals greater genetic diversity than suggested by geographically-focused studies, provides a resource of genomic variation for future work and sets the scene for a new understanding of the evolution and genetics of the Leishmania donovani complex.
Affiliation(s)
- Caroline Durrant
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Tim Downing
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Dublin City University, Dublin, Ireland
- Jean-Claude Dujardin
- Institute of Tropical Medicine, Antwerp, Belgium
- Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Mandy J Sanders
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Isabel Mauricio
- Universidade Nova de Lisboa Instituto de Higiene e Medicina, Lisboa, Portugal
- Michael A Miles
- London School of Hygiene and Tropical Medicine, London, United Kingdom
- Lionel F Schnur
- Kuvin Centre for the Study of Infectious and Tropical Diseases, IMRIC, Hebrew University-Hadassah, Medical School, Jerusalem, Israel
- Charles L Jaffe
- Kuvin Centre for the Study of Infectious and Tropical Diseases, IMRIC, Hebrew University-Hadassah, Medical School, Jerusalem, Israel
- Abdelmajeed Nasereddin
- Kuvin Centre for the Study of Infectious and Tropical Diseases, IMRIC, Hebrew University-Hadassah, Medical School, Jerusalem, Israel
- Henk Schallig
- Amsterdam University Medical Centres – Academic Medical Centre at the University of Amsterdam, Department of Medical Microbiology – Experimental Parasitology, Amsterdam, Netherlands
- Matthew Yeo
- London School of Hygiene and Tropical Medicine, London, United Kingdom
- Mohammad Z Alam
- Department of Parasitology, Bangladesh Agricultural University, Mymensingh, Bangladesh
- Matthew Berriman
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Thierry Wirth
- Institut de Systématique, Evolution, Biodiversité, ISYEB, Muséum national d'Histoire naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France
- École Pratique des Hautes Études (EPHE), Paris Sciences & Lettres (PSL), Paris, France
- James A Cotton
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
109
Pericentromeric heterochromatin is hierarchically organized and spatially contacts H3K9me2 islands in euchromatin. PLoS Genet 2020; 16:e1008673. [PMID: 32203508 PMCID: PMC7147806 DOI: 10.1371/journal.pgen.1008673] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 10/28/2019] [Revised: 04/10/2020] [Accepted: 02/14/2020] [Indexed: 01/02/2023] Open
Abstract
Membraneless pericentromeric heterochromatin (PCH) domains play vital roles in chromosome dynamics and genome stability. However, our current understanding of 3D genome organization does not include PCH domains because of technical challenges associated with repetitive sequences enriched in PCH genomic regions. We investigated the 3D architecture of Drosophila melanogaster PCH domains and their spatial associations with the euchromatic genome by developing a novel analysis method that incorporates genome-wide Hi-C reads originating from PCH DNA. Combined with cytogenetic analysis, we reveal a hierarchical organization of the PCH domains into distinct “territories.” Strikingly, H3K9me2-enriched regions embedded in the euchromatic genome show prevalent 3D interactions with the PCH domain. These spatial contacts require H3K9me2 enrichment, are likely mediated by liquid-liquid phase separation, and may influence organismal fitness. Our findings have important implications for how PCH architecture influences the function and evolution of both repetitive heterochromatin and the gene-rich euchromatin.
The three dimensional (3D) organization of genomes in cell nuclei can influence a wide variety of genome functions. However, most of our understanding of this critical architecture has been limited to the gene-rich euchromatin, and largely ignores the gene-poor and repeat-rich pericentromeric heterochromatin, or PCH. PCH comprises a large part of most eukaryotic genomes, forms 3D membraneless PCH domains in nuclei, and plays a vital role in chromosome dynamics and genome stability. In this study, we developed a new method that overcomes the technical challenges imposed by the highly repetitive PCH DNA, and generated a comprehensive picture of its 3D organization. Combined with image analyses, we reveal a hierarchical organization of the PCH domains. Surprisingly, we showed that distant euchromatic regions enriched for repressive epigenetic marks also dynamically interact with the main PCH domains. These 3D interactions are likely mediated by liquid-liquid phase separation (similar to how oil and vinegar separate in salad dressing) and the resulting liquid-like fusion events, and can influence the fitness of individuals. Our discoveries have strong implications for how seemingly “junk” DNA could impact functions in the gene-rich euchromatin.
110
Boettcher S, Miller PG, Sharma R, McConkey M, Leventhal M, Krivtsov AV, Giacomelli AO, Wong W, Kim J, Chao S, Kurppa KJ, Yang X, Milenkowic K, Piccioni F, Root DE, Rücker FG, Flamand Y, Neuberg D, Lindsley RC, Jänne PA, Hahn WC, Jacks T, Döhner H, Armstrong SA, Ebert BL. A dominant-negative effect drives selection of TP53 missense mutations in myeloid malignancies. Science 2019; 365:599-604. [PMID: 31395785 DOI: 10.1126/science.aax3649] [Citation(s) in RCA: 238] [Impact Index Per Article: 59.5] [Received: 03/17/2019] [Accepted: 06/24/2019] [Indexed: 12/11/2022]
Abstract
TP53, which encodes the tumor suppressor p53, is the most frequently mutated gene in human cancer. The selective pressures shaping its mutational spectrum, dominated by missense mutations, are enigmatic, and neomorphic gain-of-function (GOF) activities have been implicated. We used CRISPR-Cas9 to generate isogenic human leukemia cell lines of the most common TP53 missense mutations. Functional, DNA-binding, and transcriptional analyses revealed loss of function but no GOF effects. Comprehensive mutational scanning of p53 single-amino acid variants demonstrated that missense variants in the DNA-binding domain exert a dominant-negative effect (DNE). In mice, the DNE of p53 missense variants confers a selective advantage to hematopoietic cells on DNA damage. Analysis of clinical outcomes in patients with acute myeloid leukemia showed no evidence of GOF for TP53 missense mutations. Thus, a DNE is the primary unit of selection for TP53 missense mutations in myeloid malignancies.
Affiliation(s)
- Steffen Boettcher
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA.,Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA.,Division of Hematology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Peter G Miller
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA.,Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA.,Division of Hematology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Rohan Sharma
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA.,Division of Hematology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Marie McConkey
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA.,Division of Hematology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Matthew Leventhal
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA.,Division of Hematology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Andrei V Krivtsov
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Andrew O Giacomelli
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA.,Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA.,The Campbell Family Institute for Breast Cancer Research, Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2M9, Canada
| | - Waihay Wong
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA.,Division of Hematology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Jesi Kim
- Division of Hematology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Sherry Chao
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA.,Department of Biomedical Informatics, Harvard University, Boston, MA 02115, USA
| | - Kari J Kurppa
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA.,Belfer Center for Applied Cancer Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Xiaoping Yang
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Kirsten Milenkowic
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Federica Piccioni
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - David E Root
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Frank G Rücker
- Department of Internal Medicine III, University of Ulm, 89081 Ulm, Germany
| | - Yael Flamand
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Donna Neuberg
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - R Coleman Lindsley
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA.,Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Pasi A Jänne
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA.,Belfer Center for Applied Cancer Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - William C Hahn
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA.,Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Tyler Jacks
- David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.,Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.,Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Hartmut Döhner
- Department of Internal Medicine III, University of Ulm, 89081 Ulm, Germany
| | - Scott A Armstrong
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Benjamin L Ebert
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA.,Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA.,Division of Hematology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA.,Howard Hughes Medical Institute, Dana-Farber Cancer Institute, Boston, MA 02215, USA
|
111
|
Pervasive Differential Splicing in Marek's Disease Virus can Discriminate CVI-988 Vaccine Strain from RB-1B Very Virulent Strain in Chicken Embryonic Fibroblasts. Viruses 2020; 12:v12030329. [PMID: 32197378 PMCID: PMC7150913 DOI: 10.3390/v12030329] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/17/2020] [Revised: 03/07/2020] [Accepted: 03/10/2020] [Indexed: 12/13/2022] Open
Abstract
Marek's disease is a major scourge challenging poultry health worldwide. It is caused by the highly contagious Marek's disease virus (MDV), an alphaherpesvirus. Here, we showed that, similar to other members of its Herpesviridae family, MDV also presents a complex landscape of splicing events, most of which are uncharacterised and/or not annotated. Quite strikingly, and although the biological relevance of this fact is unknown, we found that a number of viral splicing isoforms are strain-specific, despite the close sequence similarity of the strains considered: very virulent RB-1B and vaccine CVI-988. We validated our findings by devising an assay that discriminated infections caused by the two strains in chicken embryonic fibroblasts on the basis of the presence of some RNA species. To our knowledge, this study is the first to accomplish such a result, emphasizing how relevant a comprehensive picture of the viral transcriptome is to fully understand viral pathogenesis.
|
112
|
SVXplorer: Three-tier approach to identification of structural variants via sequential recombination of discordant cluster signatures. PLoS Comput Biol 2020; 16:e1007737. [PMID: 32182236 PMCID: PMC7100977 DOI: 10.1371/journal.pcbi.1007737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2019] [Revised: 03/27/2020] [Accepted: 02/18/2020] [Indexed: 11/19/2022] Open
Abstract
The identification of structural variants using short-read data remains challenging. Most approaches that use discordant paired-end sequences ignore non-trivial signatures presented by variants containing 3 breakpoints, such as those generated by various copy-paste and cut-paste mechanisms. This can result in lower precision and sensitivity in the identification of the more common structural variants such as deletions and duplications. We present SVXplorer, which uses a graph-based clustering approach streamlined by the integration of non-trivial signatures from discordant paired-end alignments, split-reads and read depth information to improve upon existing methods. We show that SVXplorer is more sensitive and precise compared to several existing approaches on multiple real and simulated datasets. SVXplorer is available for download at https://github.com/kunalkathuria/SVXplorer.
|
113
|
Zeng Y, Cao Y, Halevy RS, Nguyen P, Liu D, Zhang X, Ahituv N, Han JDJ. Characterization of functional transposable element enhancers in acute myeloid leukemia. SCIENCE CHINA-LIFE SCIENCES 2020; 63:675-687. [PMID: 32170627 DOI: 10.1007/s11427-019-1574-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/10/2019] [Accepted: 10/24/2019] [Indexed: 12/15/2022]
Abstract
Transposable elements (TEs) have been shown to have important gene regulatory functions and their alteration could lead to disease phenotypes. Acute myeloid leukemia (AML) develops as a consequence of a series of genetic changes in hematopoietic precursor cells, including mutations in epigenetic factors. Here, we set out to study the gene regulatory role of TEs in AML. We first explored the epigenetic landscape of TEs in AML patients using ATAC-seq data. We show that a large number of TEs in general, and more specifically mammalian-wide interspersed repeats (MIRs), are more enriched in AML cells than in normal blood cells. We obtained a similar finding when analyzing histone modification data in AML patients. Gene Ontology enrichment analysis showed that genes near MIRs in open chromatin regions are involved in leukemogenesis. To functionally validate their regulatory role, we selected 19 MIR regions in AML cells, and tested them for enhancer activity in an AML cell line (Kasumi-1) and a chronic myeloid leukemia (CML) cell line (K562); the results revealed several MIRs to be functional enhancers. Taken together, our results suggest that TEs are potentially involved in myeloid leukemogenesis and highlight these sequences as potential candidates harboring AML-associated variation.
Affiliation(s)
- Yingying Zeng
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Yaqiang Cao
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Rivka Sukenik Halevy
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, 94158, USA.,Institute for Human Genetics, University of California San Francisco, San Francisco, 94143, USA.,Sackler School of Medicine, Tel-Aviv University, Tel Aviv, 6997801, Israel
| | - Picard Nguyen
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, 94158, USA.,Institute for Human Genetics, University of California San Francisco, San Francisco, 94143, USA
| | - Denghui Liu
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Xiaoli Zhang
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, 94158, USA.,Institute for Human Genetics, University of California San Francisco, San Francisco, 94143, USA.
| | - Jing-Dong J Han
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China.,Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology, Peking University, Beijing, 100871, China.
|
114
|
Yi G, Wierenga ATJ, Petraglia F, Narang P, Janssen-Megens EM, Mandoli A, Merkel A, Berentsen K, Kim B, Matarese F, Singh AA, Habibi E, Prange KHM, Mulder AB, Jansen JH, Clarke L, Heath S, van der Reijden BA, Flicek P, Yaspo ML, Gut I, Bock C, Schuringa JJ, Altucci L, Vellenga E, Stunnenberg HG, Martens JHA. Chromatin-Based Classification of Genetically Heterogeneous AMLs into Two Distinct Subtypes with Diverse Stemness Phenotypes. Cell Rep 2020; 26:1059-1069.e6. [PMID: 30673601 PMCID: PMC6363099 DOI: 10.1016/j.celrep.2018.12.098] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 06/27/2018] [Revised: 09/27/2018] [Accepted: 12/21/2018] [Indexed: 12/19/2022] Open
Abstract
Global investigation of histone marks in acute myeloid leukemia (AML) remains limited. Analyses of 38 AML samples through integrated transcriptional and chromatin mark analysis expose 2 major subtypes. One subtype is dominated by patients with NPM1 mutations or MLL-fusion genes, shows activation of the regulatory pathways involving HOX-family genes as targets, and displays high self-renewal capacity and stemness. The second subtype is enriched for RUNX1 or spliceosome mutations, suggesting potential interplay between the 2 aberrations, and mainly depends on IRF family regulators. Cellular consequences in prognosis predict a relatively worse outcome for the first subtype. Our integrated profiling establishes a rich resource to probe AML subtypes on the basis of expression and chromatin data.
MESH Headings
- Chromatin/genetics
- Chromatin/metabolism
- Chromatin/pathology
- Core Binding Factor Alpha 2 Subunit/genetics
- Core Binding Factor Alpha 2 Subunit/metabolism
- Humans
- Leukemia, Myeloid, Acute/classification
- Leukemia, Myeloid, Acute/genetics
- Leukemia, Myeloid, Acute/metabolism
- Leukemia, Myeloid, Acute/pathology
- Mutation
- Nuclear Proteins/genetics
- Nuclear Proteins/metabolism
- Nucleophosmin
- Oncogene Proteins, Fusion/genetics
- Oncogene Proteins, Fusion/metabolism
Affiliation(s)
- Guoqiang Yi
- Department of Molecular Biology, Faculty of Science, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - Albertus T J Wierenga
- Department of Hematology, University Medical Center Groningen, University of Groningen, 9700 RB Groningen, the Netherlands; Department of Laboratory Medicine, University Medical Center Groningen, University of Groningen, 9700 RB Groningen, the Netherlands
| | - Francesca Petraglia
- Dipartimento di Biochimica, Biofisica e Patologia generale, Università degli Studi della Campania "Luigi Vanvitelli," Vico L. De Crecchio 7, 80138 Napoli, Italy
| | - Pankaj Narang
- Department of Molecular Biology, Faculty of Science, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - Eva M Janssen-Megens
- Department of Molecular Biology, Faculty of Science, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - Amit Mandoli
- Department of Molecular Biology, Faculty of Science, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - Angelika Merkel
- Centro Nacional de Análisis Genómico (CNAG), Parc Científic de Barcelona, Barcelona, Spain
| | - Kim Berentsen
- Department of Molecular Biology, Faculty of Science, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - Bowon Kim
- Department of Molecular Biology, Faculty of Science, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - Filomena Matarese
- Department of Molecular Biology, Faculty of Science, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - Abhishek A Singh
- Department of Molecular Biology, Faculty of Science, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - Ehsan Habibi
- Department of Molecular Biology, Faculty of Science, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - Koen H M Prange
- Department of Molecular Biology, Faculty of Science, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - André B Mulder
- Department of Hematology, University Medical Center Groningen, University of Groningen, 9700 RB Groningen, the Netherlands
| | - Joop H Jansen
- Department of Laboratory Medicine, Laboratory of Hematology, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - Laura Clarke
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Simon Heath
- Centro Nacional de Análisis Genómico (CNAG), Parc Científic de Barcelona, Barcelona, Spain
| | - Bert A van der Reijden
- Department of Laboratory Medicine, Laboratory of Hematology, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Marie-Laure Yaspo
- Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Ivo Gut
- Centro Nacional de Análisis Genómico (CNAG), Parc Científic de Barcelona, Barcelona, Spain
| | - Christoph Bock
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, 1090 Vienna, Austria; Department of Laboratory Medicine, Medical University of Vienna, 1090 Vienna, Austria; Max Planck Institute for Informatics, Saarland Informatics Campus, 66123 Saarbrücken, Germany
| | - Jan Jacob Schuringa
- Department of Hematology, University Medical Center Groningen, University of Groningen, 9700 RB Groningen, the Netherlands
| | - Lucia Altucci
- Dipartimento di Biochimica, Biofisica e Patologia generale, Università degli Studi della Campania "Luigi Vanvitelli," Vico L. De Crecchio 7, 80138 Napoli, Italy
| | - Edo Vellenga
- Department of Hematology, University Medical Center Groningen, University of Groningen, 9700 RB Groningen, the Netherlands
| | - Hendrik G Stunnenberg
- Department of Molecular Biology, Faculty of Science, Radboud University, 6525 GA Nijmegen, the Netherlands
| | - Joost H A Martens
- Department of Molecular Biology, Faculty of Science, Radboud University, 6525 GA Nijmegen, the Netherlands.
|
115
|
ChromID identifies the protein interactome at chromatin marks. Nat Biotechnol 2020; 38:728-736. [PMID: 32123383 PMCID: PMC7289633 DOI: 10.1038/s41587-020-0434-2] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Received: 07/01/2019] [Accepted: 01/23/2020] [Indexed: 01/05/2023]
Abstract
Chromatin modifications regulate genome function by recruiting protein factors to the genome. However, the protein composition at distinct chromatin modifications remains to be fully characterized. Here, we use natural protein domains as modular building blocks to develop engineered chromatin readers (eCRs) selective for DNA methylation and histone tri-methylation at H3K4, H3K9 and H3K27 residues. We first demonstrate their utility as selective chromatin binders in living cells by stably expressing eCRs in mouse embryonic stem cells and measuring their subnuclear localisation, genomic distribution and histone modification–binding preference. By fusing eCRs to the biotin ligase BASU, we establish ChromID, a method for identifying the chromatin-dependent protein interactome based on proximity biotinylation, and apply it to distinct chromatin modifications in mouse stem cells. Using a synthetic dual-modification reader, we also uncover the protein composition at bivalent promoters marked by H3K4me3 and H3K27me3. These results highlight the ability of ChromID to obtain a detailed view of protein interaction networks on chromatin.
|
116
|
Tattini L, Tellini N, Mozzachiodi S, D'Angiolo M, Loeillet S, Nicolas A, Liti G. Accurate Tracking of the Mutational Landscape of Diploid Hybrid Genomes. Mol Biol Evol 2020; 36:2861-2877. [PMID: 31397846 PMCID: PMC6878955 DOI: 10.1093/molbev/msz177] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Indexed: 12/22/2022] Open
Abstract
Mutations, recombinations, and genome duplications may promote genetic diversity and trigger evolutionary processes. However, quantifying these events in diploid hybrid genomes is challenging. Here, we present an integrated experimental and computational workflow to accurately track the mutational landscape of yeast diploid hybrids (MuLoYDH) in terms of single-nucleotide variants, small insertions/deletions, copy-number variants, aneuploidies, and loss-of-heterozygosity. Pairs of haploid Saccharomyces parents were combined to generate ancestor hybrids with phased genomes and varying levels of heterozygosity. These diploids were evolved under different laboratory protocols, in particular mutation accumulation experiments. Variant simulations enabled the efficient integration of competitive and standard mapping of short reads, depending on local levels of heterozygosity. Experimental validations proved the high accuracy and resolution of our computational approach. Finally, applying MuLoYDH to four different diploids revealed striking genetic background effects. Homozygous Saccharomyces cerevisiae showed a ∼4-fold higher mutation rate compared with its closely related species S. paradoxus. Intraspecies hybrids unveiled that a substantial fraction of the genome (∼250 bp per generation) was shaped by loss-of-heterozygosity, a process strongly inhibited in interspecies hybrids by high levels of sequence divergence between homologous chromosomes. In contrast, interspecies hybrids exhibited higher single-nucleotide mutation rates compared with intraspecies hybrids. MuLoYDH provided an unprecedented quantitative insight into the evolutionary processes that mold diploid yeast genomes and can be generalized to other genetic systems.
Affiliation(s)
- Lorenzo Tattini
- CNRS UMR7284, INSERM, IRCAN, Université Côte d'Azur, Nice, France
| | - Nicolò Tellini
- CNRS UMR7284, INSERM, IRCAN, Université Côte d'Azur, Nice, France
| | - Sophie Loeillet
- CNRS UMR3244, Institut Curie, PSL Research University, Paris, France
| | - Alain Nicolas
- CNRS UMR3244, Institut Curie, PSL Research University, Paris, France
| | - Gianni Liti
- CNRS UMR7284, INSERM, IRCAN, Université Côte d'Azur, Nice, France
|
117
|
Li R, Ren X, Ding Q, Bi Y, Xie D, Zhao Z. Direct full-length RNA sequencing reveals unexpected transcriptome complexity during Caenorhabditis elegans development. Genome Res 2020; 30:287-298. [PMID: 32024662 PMCID: PMC7050527 DOI: 10.1101/gr.251512.119] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 04/15/2019] [Accepted: 12/18/2019] [Indexed: 01/08/2023]
Abstract
Massively parallel sequencing of the polyadenylated RNAs has played a key role in delineating transcriptome complexity, including alternative use of an exon, promoter, 5′ or 3′ splice site or polyadenylation site, and RNA modification. However, reads derived from the current RNA-seq technologies are usually short and deprived of information on modification, compromising their potential in defining transcriptome complexity. Here, we applied a direct RNA sequencing method with ultralong reads using Oxford Nanopore Technologies to study the transcriptome complexity in Caenorhabditis elegans. We generated approximately six million reads using native poly(A)-tailed mRNAs from three developmental stages, with average read lengths ranging from 900 to 1100 nt. Around half of the reads represent full-length transcripts. To utilize the full-length transcripts in defining transcriptome complexity, we devised a method to classify the long reads as the same as existing transcripts or as a novel transcript using sequence mapping tracks rather than existing intron/exon structures, which allowed us to identify roughly 57,000 novel isoforms and recover at least 26,000 out of the 33,500 existing isoforms. The sets of genes with differential expression versus differential isoform usage over development are largely different, implying a fine-tuned regulation at isoform level. We also observed an unexpected increase in putative RNA modification in all bases in the coding region relative to the UTR, suggesting their possible roles in translation. The RNA reads and the method for read classification are expected to deliver new insights into RNA processing and modification and their underlying biology in the future.
Affiliation(s)
- Runsheng Li
- Department of Biology, Hong Kong Baptist University, Hong Kong, 999077, China
| | - Xiaoliang Ren
- Department of Biology, Hong Kong Baptist University, Hong Kong, 999077, China
| | - Qiutao Ding
- Department of Biology, Hong Kong Baptist University, Hong Kong, 999077, China
| | - Yu Bi
- Department of Biology, Hong Kong Baptist University, Hong Kong, 999077, China
| | - Dongying Xie
- Department of Biology, Hong Kong Baptist University, Hong Kong, 999077, China
| | - Zhongying Zhao
- Department of Biology, Hong Kong Baptist University, Hong Kong, 999077, China.,State Key Laboratory of Environmental and Biological Analysis, Hong Kong Baptist University, Hong Kong, 999077, China
|
118
|
Abstract
ChIP-Seq blacklists contain genomic regions that frequently produce artifacts and noise in ChIP-Seq experiments. To improve signal-to-noise ratio, ChIP-Seq pipelines often remove data points that map to blacklist regions. Existing blacklists have been compiled in a manual or semiautomated way. In this article we describe PeakPass, an efficient method to generate blacklists, and demonstrate that blacklists can increase ChIP-Seq data quality. PeakPass leverages machine learning and attempts to automate blacklist generation. PeakPass uses a random forest classifier in combination with genomic features such as sequence, annotated repeats, complexity, assembly gaps, and the ratio of multimapping to uniquely mapping reads to identify artifact regions. We have validated PeakPass on a large data set and tested it for the purpose of upgrading a blacklist to a new reference genome version. We trained PeakPass on the ENCODE blacklist for the hg19 human reference genome, and created an updated blacklist for hg38. To assess the performance of this blacklist, we tested 42 ChIP-Seq replicates from 24 experiments using 10 ChIP-Seq quality metrics including relative strand coefficient, standardized standard deviation, and enrichment of reads in promoter regions. Using the blacklist generated by PeakPass resulted in a statistically significant improvement for nine of these metrics.
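The filtering step mentioned in this abstract (removing data points that map to blacklist regions) can be illustrated with a minimal Python sketch. This is not PeakPass itself and not its random forest classifier; the function names, the half-open coordinate convention, and the example coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple

# (chromosome, start, end); half-open, 0-based coordinates are assumed here.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True when two intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def filter_blacklisted(peaks: List[Interval],
                       blacklist: List[Interval]) -> List[Interval]:
    """Keep only peaks that overlap no blacklist region (plain O(n*m) scan)."""
    return [p for p in peaks if not any(overlaps(p, b) for b in blacklist)]


# Hypothetical peaks and blacklist region, not taken from any real data set.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
blacklist = [("chr1", 150, 300)]

kept = filter_blacklisted(peaks, blacklist)
print(kept)  # [('chr1', 500, 600), ('chr2', 100, 200)]
```

For genome-scale inputs a sorted sweep or an interval tree would likely replace the quadratic scan; the simple version is kept here only to make the overlap rule explicit.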
Affiliation(s)
- Charles E Wimberley
- Department of Computer Science, NC State University, Raleigh, North Carolina
| | - Steffen Heber
- Department of Computer Science, NC State University, Raleigh, North Carolina
|
119
|
Vincenz C, Lovett JL, Wu W, Shedden K, Strassmann BI. Loss of Imprinting in Human Placentas Is Widespread, Coordinated, and Predicts Birth Phenotypes. Mol Biol Evol 2020; 37:429-441. [PMID: 31639821 PMCID: PMC6993844 DOI: 10.1093/molbev/msz226] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/25/2022] Open
Abstract
Genomic imprinting leads to mono-allelic expression of genes based on parent of origin. Therian mammals and angiosperms evolved this mechanism in nutritive tissues, the placenta and endosperm, where maternal and paternal genomes are in conflict with respect to resource allocation. We used RNA-seq to analyze allelic bias in the expression of 91 known imprinted genes in term human placentas from a prospective cohort study in Mali. A large fraction of the imprinted exons (39%) deviated from mono-allelic expression. Loss of imprinting (LOI) occurred in genes with either maternal or paternal expression bias, albeit more frequently in the former. We characterized LOI using binomial generalized linear mixed models. Variation in LOI was predominantly at the gene as opposed to the exon level, consistent with a single promoter driving the expression of most exons in a gene. Some genes were less prone to LOI than others; in particular, lncRNA genes were rarely expressed from the repressed allele. Further, some individuals had more LOI than others and, within a person, the expression bias of maternally and paternally imprinted genes was correlated. We hypothesize that trans-acting maternal effect genes mediate correlated LOI and provide the mother with an additional lever to control fetal growth by extending her influence to LOI of the paternally imprinted genes. Limited evidence exists to support associations between LOI and offspring phenotypes. We show that birth length and placental weight were associated with allelic bias, making this the first comprehensive report of an association between LOI and a birth phenotype.
Affiliation(s)
- Claudius Vincenz
- Research Center for Group Dynamics, Institute for Social Research, University of Michigan, Ann Arbor, MI
| | - Jennie L Lovett
- Department of Anthropology, University of Michigan, Ann Arbor, MI
| | - Weisheng Wu
- BRCF Bioinformatics Core, University of Michigan, Ann Arbor, MI
| | - Kerby Shedden
- Department of Statistics, University of Michigan, Ann Arbor, MI
| | - Beverly I Strassmann
- Research Center for Group Dynamics, Institute for Social Research, University of Michigan, Ann Arbor, MI
- Department of Anthropology, University of Michigan, Ann Arbor, MI
|
120
|
Rajagopalan R, Murrell JR, Luo M, Conlin LK. A highly sensitive and specific workflow for detecting rare copy-number variants from exome sequencing data. Genome Med 2020; 12:14. [PMID: 32000839 PMCID: PMC6993336 DOI: 10.1186/s13073-020-0712-0] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 10/31/2019] [Accepted: 01/13/2020] [Indexed: 12/12/2022] Open
Abstract
Background Exome sequencing (ES) is a first-tier diagnostic test for many suspected Mendelian disorders. While it is routine to detect small sequence variants, it is not a standard practice in clinical settings to detect germline copy-number variants (CNVs) from ES data due to several reasons relating to performance. In this work, we comprehensively characterized one of the most sensitive ES-based CNV tools, ExomeDepth, against SNP array, a standard of care test in clinical settings to detect genome-wide CNVs. Methods We propose a modified ExomeDepth workflow by excluding exons with low mappability prior to variant calling to drastically reduce the false positives originating from the repetitive regions of the genome, and an iterative variant calling framework to assess the reproducibility. We used a cohort of 307 individuals with clinical ES data and clinical SNP array to estimate the sensitivity and false discovery rate of the CNV detection using exome sequencing. Further, we performed targeted testing of the STRC gene in 1972 individuals. To reduce the number of variants for downstream analysis, we performed a large-scale iterative variant calling process with random control cohorts to assess the reproducibility of the CNVs. Results The modified workflow presented in this paper reduced the number of total variants identified by one third while retaining a higher sensitivity of 97% and resulted in an improved false discovery rate of 11.4% compared to the default ExomeDepth pipeline. The exclusion of exons with low mappability removes 4.5% of the exons, including a subset of exons (0.6%) in disease-associated genes which are intractable by short-read next-generation sequencing (NGS). Results from the reproducibility analysis showed that the clinically reported variants were reproducible 100% of the time and that the modified workflow can be used to rank variants from high to low confidence. 
Targeted testing of 30 CNVs identified in STRC, a challenging gene to ascertain by NGS, showed a 100% validation rate. Conclusions In summary, we introduced a modification to the default ExomeDepth workflow to reduce the false positives originating from the repetitive regions of the genome, created a large-scale iterative variant calling framework for reproducibility, and provided recommendations for implementation in clinical settings.
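The sensitivity and false discovery rate quoted above come from comparing exome-based CNV calls against SNP-array calls. The following is a minimal sketch of that kind of comparison, not the authors' workflow: the `CNV` record, the 50% reciprocal-overlap matching rule, and the toy coordinates are all assumptions made for illustration.

```python
from typing import NamedTuple


class CNV(NamedTuple):
    """Hypothetical CNV record; field names are illustrative only."""
    sample: str
    chrom: str
    start: int
    end: int
    kind: str  # "DEL" or "DUP"


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Shared length divided by the longer interval (0.0 if not comparable)."""
    if (a.sample, a.chrom, a.kind) != (b.sample, b.chrom, b.kind):
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return shared / max(a.end - a.start, b.end - b.start)


def evaluate(calls, truth, min_overlap=0.5):
    """Return (sensitivity, false discovery rate) of calls against truth."""
    detected = sum(
        any(reciprocal_overlap(t, c) >= min_overlap for c in calls) for t in truth
    )
    false_calls = sum(
        not any(reciprocal_overlap(c, t) >= min_overlap for t in truth) for c in calls
    )
    return detected / len(truth), false_calls / len(calls)


# Toy data (assumed): two array-confirmed CNVs, two exome-based calls.
truth = [
    CNV("s1", "chr1", 1000, 5000, "DEL"),
    CNV("s2", "chr2", 200, 900, "DUP"),
]
calls = [
    CNV("s1", "chr1", 1200, 5100, "DEL"),  # matches the first truth CNV
    CNV("s1", "chr3", 100, 400, "DUP"),    # no counterpart on the array
]

sensitivity, fdr = evaluate(calls, truth)
print(f"sensitivity = {sensitivity:.2f}, FDR = {fdr:.2f}")
```

The overlap threshold strongly affects both numbers, so any real comparison would need to state the matching rule it uses.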
Affiliation(s)
- Ramakrishnan Rajagopalan
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA.,School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
| | - Jill R Murrell
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA.,Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Minjie Luo
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA.,Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Laura K Conlin
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA.,Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
121
|
Clayton EA, Khalid S, Ban D, Wang L, Jordan IK, McDonald JF. Tumor suppressor genes and allele-specific expression: mechanisms and significance. Oncotarget 2020; 11:462-479. [PMID: 32064050 PMCID: PMC6996918 DOI: 10.18632/oncotarget.27468] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/21/2019] [Accepted: 01/13/2020] [Indexed: 12/12/2022] Open
Abstract
Recent findings indicate that allele-specific expression (ASE) at specific cancer driver gene loci may be important in the onset/progression of the disease. Of particular interest are loss-of-function (LOF) alleles of tumor suppressor genes (TSGs). While LOF tumor suppressor mutations are typically considered to be recessive, if these mutant alleles are significantly differentially expressed relative to wild-type alleles in heterozygotes, the clinical consequences could be significant. LOF TSG alleles are shown to be segregating at high frequencies in worldwide populations of normal/healthy individuals. Matched sets of normal and tumor tissues isolated from 233 cancer patients representing four diverse tumor types demonstrate functionally important changes in patterns of ASE in individuals heterozygous for LOF TSG alleles associated with cancer onset/progression. While a variety of molecular mechanisms were identified as potentially contributing to changes in ASE patterns in cancer, changes in DNA copy number and allele-specific alternative splicing possibly mediated by antisense RNA emerged as predominant factors. In conclusion, LOF TSG alleles are segregating in human populations at significant frequencies, indicating that many otherwise healthy individuals are at elevated risk of developing cancer. Changes in ASE between normal and cancer tissues indicate that LOF TSG alleles may contribute to cancer onset/progression even when heterozygous with wild-type functional alleles.
Affiliation(s)
- Evan A. Clayton
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
| | - Shareef Khalid
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
| | - Dongjo Ban
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
| | - Lu Wang
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- PanAmerican Bioinformatics Institute, Cali, Colombia
| | - I. King Jordan
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- PanAmerican Bioinformatics Institute, Cali, Colombia
- Applied Bioinformatics Laboratory, Atlanta, GA, USA
| | - John F. McDonald
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
|
122
|
Mechanisms governing the pioneering and redistribution capabilities of the non-classical pioneer PU.1. Nat Commun 2020; 11:402. [PMID: 31964861 PMCID: PMC6972792 DOI: 10.1038/s41467-019-13960-2] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 04/25/2019] [Accepted: 12/10/2019] [Indexed: 12/21/2022] Open
Abstract
Establishing gene regulatory networks during differentiation or reprogramming requires master or pioneer transcription factors (TFs) such as PU.1, a prototype master TF of hematopoietic lineage differentiation. To systematically determine molecular features that control its activity, here we analyze DNA-binding in vitro and genome-wide in vivo across different cell types with native or ectopic PU.1 expression. Although PU.1, in contrast to classical pioneer factors, is unable to access nucleosomal target sites in vitro, ectopic induction of PU.1 leads to the extensive remodeling of chromatin and redistribution of partner TFs. De novo chromatin access, stable binding, and redistribution of partner TFs all require PU.1's N-terminal acidic activation domain and its ability to recruit SWI/SNF remodeling complexes, suggesting that the latter may collect and distribute co-associated TFs in conjunction with the non-classical pioneer TF PU.1.
|
123
|
Seiden AH, Richter F, Patel N, Rodriguez OL, Deikus G, Shah H, Smith M, Roberts A, King EC, Sebra RP, Sharp AJ, Gelb BD. Elucidation of de novo small insertion/deletion biology with parent-of-origin phasing. Hum Mutat 2020; 41:800-806. [PMID: 31898844 DOI: 10.1002/humu.23971] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/30/2019] [Revised: 11/24/2019] [Accepted: 12/24/2019] [Indexed: 12/30/2022]
Abstract
The mechanisms underlying de novo insertion/deletion (indel) genesis, such as polymerase slippage, have been hypothesized but not well characterized in the human genome. We implemented two methodological improvements, which were leveraged to dissect indel mutagenesis. We assigned de novo variants to parent-of-origin (i.e., phasing) with low-coverage long-read whole-genome sequencing, achieving better phasing compared to short-read sequencing (medians of 84% and 23%, respectively). We then wrote an application programming interface to classify indels into three subtypes according to sequence context. Across three cohorts with different phasing methods (N_trios = 540, all cohorts), we observed that one de novo indel subtype, change in copy count (CCC), was significantly correlated with father's (p = 7.1 × 10⁻⁴) but not mother's (p = .45) age at conception. We replicated this effect in three cohorts without de novo phasing (p_paternal = 1.9 × 10⁻⁹, p_maternal = .61; N_trios = 3,391, all cohorts). Although this is consistent with polymerase slippage during spermatogenesis, the percentage of variance explained by paternal age was low, and we did not observe an association with replication timing. These results suggest that spermatogenesis-specific events have a minor role in CCC indel mutagenesis, one not observed for other indel subtypes nor for maternal age in general. These results have implications for indel modeling in evolution and disease.
Affiliation(s)
- Allison H Seiden
- Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Felix Richter
- Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Nihir Patel
- Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Oscar L Rodriguez
- Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, New York.,Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, New York.,Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Gintaras Deikus
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York.,Icahn Institute for Data Science and Genomics Technology, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Hardik Shah
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York.,Icahn Institute for Data Science and Genomics Technology, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Melissa Smith
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York.,Icahn Institute for Data Science and Genomics Technology, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Amy Roberts
- Division of Genetics, Department of Pediatrics and Department of Cardiology, Boston Children's Hospital, Boston, Massachusetts
| | - Eileen C King
- Division of Biostatistics and Epidemiology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio
| | - Robert P Sebra
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York.,Icahn Institute for Data Science and Genomics Technology, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Andrew J Sharp
- Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, New York.,Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Bruce D Gelb
- Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, New York.,Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York.,Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, New York
|
124
|
Franco I, Helgadottir HT, Moggio A, Larsson M, Vrtačnik P, Johansson A, Norgren N, Lundin P, Mas-Ponte D, Nordström J, Lundgren T, Stenvinkel P, Wennberg L, Supek F, Eriksson M. Whole genome DNA sequencing provides an atlas of somatic mutagenesis in healthy human cells and identifies a tumor-prone cell type. Genome Biol 2019; 20:285. [PMID: 31849330 PMCID: PMC6918713 DOI: 10.1186/s13059-019-1892-z] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 06/14/2019] [Accepted: 11/18/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND The lifelong accumulation of somatic mutations underlies age-related phenotypes and cancer. Mutagenic forces are thought to shape the genome of aging cells in a tissue-specific way. Whole genome analyses of somatic mutation patterns, based on both types and genomic distribution of variants, can shed light on specific processes active in different human tissues and their effect on the transition to cancer. RESULTS To analyze somatic mutation patterns, we compile a comprehensive genetic atlas of somatic mutations in healthy human cells. High-confidence variants are obtained from newly generated and publicly available whole genome DNA sequencing data from single non-cancer cells, clonally expanded in vitro. To enable a well-controlled comparison of different cell types, we obtain single genome data (92% mean coverage) from multi-organ biopsies from the same donors. These data show multiple cell types that are protected from mutagens and display a stereotyped mutation profile, despite their origin from different tissues. Conversely, the same tissue harbors cells with distinct mutation profiles associated with different differentiation states. Analyses of mutation rate in the coding and non-coding portions of the genome identify a cell type bearing a unique mutation pattern characterized by mutation enrichment in active chromatin, regulatory, and transcribed regions. CONCLUSIONS Our analysis of normal cells from healthy donors identifies a somatic mutation landscape that enhances the risk of tumor transformation in a specific cell population from the kidney proximal tubule. This unique pattern is characterized by a high rate of mutation accumulation during adult life and specific targeting of expressed genes and regulatory regions.
Affiliation(s)
- Irene Franco
- Department of Biosciences and Nutrition, Center for Innovative Medicine, Karolinska Institutet, Huddinge, Sweden.
| | - Hafdis T Helgadottir
- Department of Biosciences and Nutrition, Center for Innovative Medicine, Karolinska Institutet, Huddinge, Sweden
| | - Aldo Moggio
- Department of Medicine Huddinge, Integrated Cardio Metabolic Center, Karolinska Institutet, Huddinge, Sweden
| | - Malin Larsson
- Science for Life Laboratory, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
| | - Peter Vrtačnik
- Department of Biosciences and Nutrition, Center for Innovative Medicine, Karolinska Institutet, Huddinge, Sweden
| | - Anna Johansson
- Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - Nina Norgren
- Science for Life Laboratory, Department of Molecular Biology, Umeå University, Umeå, Sweden
| | - Pär Lundin
- Department of Biosciences and Nutrition, Center for Innovative Medicine, Karolinska Institutet, Huddinge, Sweden
- Science for Life Laboratory, Department of Biochemistry and Biophysics (DBB), Stockholm University, Stockholm, Sweden
| | - David Mas-Ponte
- Genome Data Science, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, 08028, Barcelona, Spain
| | - Johan Nordström
- Department of Clinical Sciences, Intervention and Technology, Karolinska Institutet, Division of Transplantation Surgery, Karolinska University Hospital, Huddinge, Sweden
| | - Torbjörn Lundgren
- Department of Clinical Sciences, Intervention and Technology, Karolinska Institutet, Division of Transplantation Surgery, Karolinska University Hospital, Huddinge, Sweden
| | - Peter Stenvinkel
- Department of Clinical Sciences, Intervention and Technology, Karolinska Institutet, Division of Renal Medicine, Karolinska University Hospital, Huddinge, Sweden
| | - Lars Wennberg
- Department of Clinical Sciences, Intervention and Technology, Karolinska Institutet, Division of Transplantation Surgery, Karolinska University Hospital, Huddinge, Sweden
| | - Fran Supek
- Genome Data Science, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, 08028, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - Maria Eriksson
- Department of Biosciences and Nutrition, Center for Innovative Medicine, Karolinska Institutet, Huddinge, Sweden.
|
125
|
Hu HJ, Lee MY, Cho DY, Oh M, Kwon YJ, Han YJ, Ryu HM, Kim YN, Won HS. Prospective clinical evaluation of Momguard non-invasive prenatal test in 1011 Korean high-risk pregnant women. J Obstet Gynaecol 2019; 40:1090-1095. [PMID: 31826681 DOI: 10.1080/01443615.2019.1680617] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/25/2022]
Abstract
Clinical performance of the Momguard non-invasive prenatal test (NIPT) was evaluated in a cohort of Korean pregnant women. The foetal trisomies 21, 18 and 13 (T21, T18 and T13) were screened by low-coverage massively parallel sequencing of maternal blood. Among the 1011 confirmed samples, 32 cases (3.2%) had positive NIPT results. Of these positive cases, 20 cases of T21, all cases of T18 and two cases of T13 had concordant karyotype findings. Only one case out of the remaining 979 negative NIPT samples showed a false negative result. The overall sensitivity and specificity of Momguard to detect the three chromosomal aneuploidies were 96.8% and 99.8%, respectively. Momguard is a clinically useful tool for the detection of T21, T18 and T13 in singleton pregnancy. However, like other NIPT tests, it carries the risk of false positive and false negative results. Hence, genetic counsellors should explain these limitations to the examinees.
Impact Statement
What is already known on this subject? The NIPT approach using massively parallel sequencing (MPS) showed high sensitivity and specificity in various clinical studies. These results are based on analysis systems using their own bioinformatics algorithms.
What do the results of this study add? When this NIPT technology was introduced in Korea, the first biological specimens collected in Korea were transported overseas for processing in overseas laboratories and analysed by other countries' analysis methods. We needed our own NIPT algorithm and developed Momguard NIPT for the first time in Korea. This study attempted to evaluate this Momguard NIPT protocol prospectively in a large number of samples obtained from three Korean hospitals.
What are the implications of these findings for clinical practice and/or further research? The overall sensitivity and specificity to identify T13, T18 and T21 were 96.8% and 99.8%, respectively. These accuracy values were comparable to those of other studies. From this study, we found that Momguard is a clinically useful tool for the detection of three chromosomal aneuploidies. However, like other NIPT tests, it carries the risk of false positive and false negative results. Hence, genetic counsellors should explain these limitations to the examinees.
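The reported sensitivity and specificity can be checked against the counts given in the abstract. In the sketch below, the totals (1011 samples, 32 positive results, 1 false negative) come from the text, while the true-positive count of 30 is an assumption: it is not stated explicitly and is inferred here as the only whole number consistent with a 96.8% sensitivity and one false negative.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard screening-test metrics from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts stated in the abstract.
total_samples = 1011
test_positive = 32
false_negative = 1

# Assumption: inferred from the reported sensitivity, not stated in the source.
true_positive = 30

false_positive = test_positive - true_positive
true_negative = total_samples - test_positive - false_negative

metrics = screening_metrics(true_positive, false_positive, false_negative, true_negative)
for name, value in metrics.items():
    print(f"{name} = {value:.1%}")
```

Under that assumption the sketch gives a sensitivity of 30/31 and a specificity of 978/980, which round to the 96.8% and 99.8% quoted above.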
Affiliation(s)
- Hae-Jin Hu
- LabGenomics Clinical Research Institute, LabGenomics, Seongnam, Korea
| | - Mi-Young Lee
- Department of Obstetrics and Gynaecology, University of Ulsan College of Medicine, Asan Medical Centre, Seoul, Korea
| | - Dae-Yeon Cho
- LabGenomics Clinical Research Institute, LabGenomics, Seongnam, Korea
| | - Mijin Oh
- LabGenomics Clinical Research Institute, LabGenomics, Seongnam, Korea
| | - Young-Jun Kwon
- LabGenomics Clinical Research Institute, LabGenomics, Seongnam, Korea
| | - You-Jung Han
- Department of Obstetrics and Gynaecology, CHA Gangnam Medical Centre, CHA University, Seoul, Korea
| | - Hyun Mee Ryu
- Department of Obstetrics and Gynaecology, CHA Bundang Medical Centre, CHA University, Seongnam, Korea
| | - Young Nam Kim
- Busan Paik Hospital, Inje University College of Medicine, Busan, Korea
| | - Hye-Sung Won
- Department of Obstetrics and Gynaecology, University of Ulsan College of Medicine, Asan Medical Centre, Seoul, Korea
|
126
|
Reliability of Whole-Exome Sequencing for Assessing Intratumor Genetic Heterogeneity. Cell Rep 2019; 25:1446-1457. [PMID: 30404001 PMCID: PMC6261536 DOI: 10.1016/j.celrep.2018.10.046] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 01/16/2018] [Revised: 05/20/2018] [Accepted: 10/11/2018] [Indexed: 12/11/2022] Open
Abstract
Multi-region sequencing is used to detect intratumor genetic heterogeneity (ITGH) in tumors. To assess whether genuine ITGH can be distinguished from sequencing artifacts, we performed whole-exome sequencing (WES) on three anatomically distinct regions of the same tumor with technical replicates to estimate technical noise. Somatic variants were detected with three different WES pipelines and subsequently validated by high-depth amplicon sequencing. The cancer-only pipeline was unreliable, with about 69% of the identified somatic variants being false positive. Even with matched normal DNA for which 82% of the somatic variants were detected reliably, only 36%-78% were found consistently in technical replicate pairs. Overall, 34%-80% of the discordant somatic variants, which could be interpreted as ITGH, were found to constitute technical noise. Excluding mutations affecting low-mappability regions or occurring in certain mutational contexts was found to reduce artifacts, yet detection of subclonal mutations by WES in the absence of orthogonal validation remains unreliable.
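The replicate comparison described above can be illustrated with plain set operations. This is a sketch under stated assumptions, not the study's pipeline: variants are represented as hypothetical `(chrom, pos, ref, alt)` tuples, and concordance is taken to be the share of the union of calls found in both replicates, which may differ from the definition the authors used.

```python
def replicate_concordance(rep_a: set, rep_b: set) -> float:
    """Fraction of all called variants that appear in both replicates."""
    union = rep_a | rep_b
    if not union:
        raise ValueError("no variants called in either replicate")
    return len(rep_a & rep_b) / len(union)


# Toy somatic calls (assumed) from two technical replicates of one tumor region.
replicate_1 = {
    ("chr1", 1000, "A", "T"),
    ("chr2", 2500, "G", "C"),
    ("chr7", 4200, "C", "T"),
}
replicate_2 = {
    ("chr2", 2500, "G", "C"),
    ("chr7", 4200, "C", "T"),
    ("chr9", 880, "T", "G"),
}

concordance = replicate_concordance(replicate_1, replicate_2)
# Calls seen in only one replicate cannot reflect true heterogeneity,
# because both replicates sample the same DNA.
technical_noise = replicate_1 ^ replicate_2

print(f"concordance = {concordance:.2f}")
print(f"replicate-specific calls = {len(technical_noise)}")
```

Variants that differ between tumor regions but also fail to replicate within a region are the ones the abstract classifies as technical noise rather than genuine ITGH.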
|
127
|
Identification of African-Specific Admixture between Modern and Archaic Humans. Am J Hum Genet 2019; 105:1254-1261. [PMID: 31809748 DOI: 10.1016/j.ajhg.2019.11.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/07/2018] [Accepted: 11/03/2019] [Indexed: 11/21/2022] Open
Abstract
Recent work has demonstrated that two archaic human groups (Neanderthals and Denisovans) interbred with modern humans and contributed to the contemporary human gene pool. These findings relied on the availability of high-coverage genomes from both Neanderthals and Denisovans. Here we search for evidence of archaic admixture from a worldwide panel of 1,667 individuals using an approach that does not require the presence of an archaic human reference genome. We find no evidence for archaic admixture in the Andaman Islands, as previously claimed, or on the island of Flores, where Homo floresiensis fossils have been found. However, we do find evidence for at least one archaic admixture event in sub-Saharan Africa, with the strongest signal in Khoesan and Pygmy individuals from Southern and Central Africa. The locations of these putative archaic admixture tracts are weighted against functional regions of the genome, consistent with the long-term effects of purifying selection against introgressed genetic material.
|
128
|
Beurton F, Stempor P, Caron M, Appert A, Dong Y, Chen RAJ, Cluet D, Couté Y, Herbette M, Huang N, Polveche H, Spichty M, Bedet C, Ahringer J, Palladino F. Physical and functional interaction between SET1/COMPASS complex component CFP-1 and a Sin3S HDAC complex in C. elegans. Nucleic Acids Res 2019; 47:11164-11180. [PMID: 31602465 PMCID: PMC6868398 DOI: 10.1093/nar/gkz880] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 03/01/2019] [Revised: 09/13/2019] [Accepted: 10/07/2019] [Indexed: 12/23/2022] Open
Abstract
The CFP1 CXXC zinc finger protein targets the SET1/COMPASS complex to non-methylated CpG rich promoters to implement tri-methylation of histone H3 Lys4 (H3K4me3). Although H3K4me3 is widely associated with gene expression, the effects of CFP1 loss vary, suggesting additional chromatin factors contribute to context dependent effects. Using a proteomics approach, we identified CFP1 associated proteins and an unexpected direct link between Caenorhabditis elegans CFP-1 and an Rpd3/Sin3 small (SIN3S) histone deacetylase complex. Supporting a functional connection, we find that mutants of COMPASS and SIN3 complex components genetically interact and have similar phenotypic defects including misregulation of common genes. CFP-1 directly binds SIN-3 through a region including the conserved PAH1 domain and recruits SIN-3 and the HDA-1/HDAC subunit to H3K4me3 enriched promoters. Our results reveal a novel role for CFP-1 in mediating interaction between SET1/COMPASS and a Sin3S HDAC complex at promoters.
Affiliation(s)
- Flore Beurton
- Laboratory of Biology and Modeling of the Cell, UMR5239 CNRS/Ecole Normale Supérieure de Lyon, INSERM U1210, UMS 3444 Biosciences Lyon Gerland, Université de Lyon, Lyon, France
| | - Przemyslaw Stempor
- The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge, UK
| | - Matthieu Caron
- Laboratory of Biology and Modeling of the Cell, UMR5239 CNRS/Ecole Normale Supérieure de Lyon, INSERM U1210, UMS 3444 Biosciences Lyon Gerland, Université de Lyon, Lyon, France
| | - Alex Appert
- The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge, UK
| | - Yan Dong
- The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge, UK
| | - Ron A-j Chen
- The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge, UK
| | - David Cluet
- Laboratory of Biology and Modeling of the Cell, UMR5239 CNRS/Ecole Normale Supérieure de Lyon, INSERM U1210, UMS 3444 Biosciences Lyon Gerland, Université de Lyon, Lyon, France
| | - Yohann Couté
- Grenoble Alpes, CEA, Inserm, BIG-BGE, 38000 Grenoble, France
| | - Marion Herbette
- Laboratory of Biology and Modeling of the Cell, UMR5239 CNRS/Ecole Normale Supérieure de Lyon, INSERM U1210, UMS 3444 Biosciences Lyon Gerland, Université de Lyon, Lyon, France
| | - Ni Huang
- The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge, UK
| | - Hélène Polveche
- INSERM UMR 861, I-STEM, 28, Rue Henri Desbruères, 91100 Corbeil-Essonnes, France
| | - Martin Spichty
- Laboratory of Biology and Modeling of the Cell, UMR5239 CNRS/Ecole Normale Supérieure de Lyon, INSERM U1210, UMS 3444 Biosciences Lyon Gerland, Université de Lyon, Lyon, France
| | - Cécile Bedet
- Laboratory of Biology and Modeling of the Cell, UMR5239 CNRS/Ecole Normale Supérieure de Lyon, INSERM U1210, UMS 3444 Biosciences Lyon Gerland, Université de Lyon, Lyon, France
| | - Julie Ahringer
- The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge, UK
| | - Francesca Palladino
- Laboratory of Biology and Modeling of the Cell, UMR5239 CNRS/Ecole Normale Supérieure de Lyon, INSERM U1210, UMS 3444 Biosciences Lyon Gerland, Université de Lyon, Lyon, France
|
129
|
Waples RK, Albrechtsen A, Moltke I. Allele frequency-free inference of close familial relationships from genotypes or low-depth sequencing data. Mol Ecol 2019; 28:35-48. [PMID: 30462358 PMCID: PMC6850436 DOI: 10.1111/mec.14954] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 03/21/2018] [Accepted: 10/12/2018] [Indexed: 01/03/2023]
Abstract
Knowledge of how individuals are related is important in many areas of research, and numerous methods for inferring pairwise relatedness from genetic data have been developed. However, the majority of these methods were not developed for situations where data are limited. Specifically, most methods rely on the availability of population allele frequencies, the relative genomic position of variants and accurate genotype data. But in studies of non-model organisms or ancient samples, such data are not always available. Motivated by this, we present a new method for pairwise relatedness inference, which requires neither allele frequency information nor information on genomic position. Furthermore, it can be applied not only to accurate genotype data but also to low-depth sequencing data from which genotypes cannot be accurately called. We evaluate it using data from a range of human populations and show that it can be used to infer close familial relationships with accuracy similar to that of a widely used method that relies on population allele frequencies. Additionally, we show that our method is robust to SNP ascertainment and applicable to low-depth sequencing data generated using different strategies, including resequencing and RADseq, which is important for application to a diverse range of populations and species.
Affiliation(s)
- Ryan K Waples
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen N, Denmark
| | - Anders Albrechtsen
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen N, Denmark
| | - Ida Moltke
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen N, Denmark
|
130
|
Sagar A, Xue B. Recent Advances in Machine Learning Based Prediction of RNA-protein Interactions. Protein Pept Lett 2019; 26:601-619. [PMID: 31215361 DOI: 10.2174/0929866526666190619103853] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/16/2018] [Revised: 04/04/2019] [Accepted: 06/01/2019] [Indexed: 12/18/2022]
Abstract
The interactions between RNAs and proteins play critical roles in many biological processes. Therefore, characterizing these interactions becomes critical for mechanistic, biomedical, and clinical studies. Many experimental methods can be used to determine RNA-protein interactions in multiple aspects. However, due to the facts that RNA-protein interactions are tissuespecific and condition-specific, as well as these interactions are weak and frequently compete with each other, those experimental techniques can not be made full use of to discover the complete spectrum of RNA-protein interactions. To moderate these issues, continuous efforts have been devoted to developing high quality computational techniques to study the interactions between RNAs and proteins. Many important progresses have been achieved with the application of novel techniques and strategies, such as machine learning techniques. Especially, with the development and application of CLIP techniques, more and more experimental data on RNA-protein interaction under specific biological conditions are available. These CLIP data altogether provide a rich source for developing advanced machine learning predictors. In this review, recent progresses on computational predictors for RNA-protein interaction were summarized in the following aspects: dataset, prediction strategies, and input features. Possible future developments were also discussed at the end of the review.
Affiliation(s)
- Amit Sagar
- Department of Cell Biology, Microbiology and Molecular Biology, School of Natural Sciences and Mathematics, College of Arts and Sciences, University of South Florida, Tampa, Florida 33620, United States
- Bin Xue
- Department of Cell Biology, Microbiology and Molecular Biology, School of Natural Sciences and Mathematics, College of Arts and Sciences, University of South Florida, Tampa, Florida 33620, United States
131
Hess JM, Bernards A, Kim J, Miller M, Taylor-Weiner A, Haradhvala NJ, Lawrence MS, Getz G. Passenger Hotspot Mutations in Cancer. Cancer Cell 2019; 36:288-301.e14. [PMID: 31526759 PMCID: PMC7371346 DOI: 10.1016/j.ccell.2019.08.002] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 11/21/2018] [Revised: 05/15/2019] [Accepted: 08/06/2019] [Indexed: 01/04/2023]
Abstract
Current statistical models for assessing hotspot significance do not properly account for variation in site-specific mutability, thereby yielding many false-positives. We thus (i) detail a Log-normal-Poisson (LNP) background model that accounts for this variability in a manner consistent with models of mutagenesis; (ii) use it to show that passenger hotspots arise from all common mutational processes; and (iii) apply it to a ∼10,000-patient cohort to nominate driver hotspots with far fewer false-positives compared with conventional methods. Overall, we show that many cancer hotspot mutations recurring at the same genomic site across multiple tumors are actually passenger events, recurring at inherently mutable genomic sites under no positive selection.
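To illustrate why a Log-normal-Poisson background nominates fewer hotspots than a plain Poisson one, the sketch below compares the probability of seeing at least k mutations at a site under both models. It is not the authors' implementation: the function names, the parameter values (mu, sigma), and the trapezoid integration are assumptions chosen to keep the example self-contained.

```python
import math

def poisson_pmf(k, lam):
    """P(N = k) for N ~ Poisson(lam), computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def lnp_pmf(k, mu, sigma, n_grid=801, z_max=8.0):
    """P(N = k) when N ~ Poisson(lam) and log(lam) ~ Normal(mu, sigma^2).

    The log-normal rate is integrated out with a trapezoid rule over a
    standard-normal variable z (an illustrative choice, not the paper's).
    """
    h = 2 * z_max / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        z = -z_max + i * h
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        phi = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
        total += weight * poisson_pmf(k, math.exp(mu + sigma * z)) * phi
    return total * h

def tail(pmf, k):
    """P(N >= k) for a pmf given as a function of the count."""
    return max(0.0, 1.0 - sum(pmf(j) for j in range(k)))

# Hypothetical site: both models share the same mean mutation rate.
mu, sigma = -1.0, 1.0
mean_rate = math.exp(mu + sigma ** 2 / 2)
p_lnp = tail(lambda j: lnp_pmf(j, mu, sigma), 5)
p_pois = tail(lambda j: poisson_pmf(j, mean_rate), 5)
print(f"P(N >= 5): LNP = {p_lnp:.2e}, Poisson = {p_pois:.2e}")
```

With the same mean rate, the LNP tail is heavier, so a recurrence count that looks significant under a Poisson background can be unremarkable once site-to-site variation in mutability is allowed for.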
Affiliation(s)
- Julian M Hess
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Andre Bernards
- Center for Cancer Research, Massachusetts General Hospital, Boston, MA 02114, USA; Harvard Medical School, 250 Longwood Avenue, Boston, MA 02115, USA
- Jaegil Kim
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Mendy Miller
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Nicholas J Haradhvala
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Cancer Research, Massachusetts General Hospital, Boston, MA 02114, USA
- Michael S Lawrence
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Cancer Research, Massachusetts General Hospital, Boston, MA 02114, USA; Harvard Medical School, 250 Longwood Avenue, Boston, MA 02115, USA; Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA.
- Gad Getz
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Cancer Research, Massachusetts General Hospital, Boston, MA 02114, USA; Harvard Medical School, 250 Longwood Avenue, Boston, MA 02115, USA; Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA.
132
Caspar SM, Dubacher N, Kopps AM, Meienberg J, Henggeler C, Matyas G. Clinical sequencing: From raw data to diagnosis with lifetime value. Clin Genet 2019; 93:508-519. [PMID: 29206278 DOI: 10.1111/cge.13190] [Citation(s) in RCA: 62] [Impact Index Per Article: 12.4] [Received: 10/06/2017] [Revised: 11/28/2017] [Accepted: 11/30/2017] [Indexed: 12/22/2022]
Abstract
High-throughput sequencing (HTS) has revolutionized genetics by enabling the detection of sequence variants at hitherto unprecedented large scale. Despite these advances, however, there are still remaining challenges in the complete coverage of targeted regions (genes, exome or genome) as well as in HTS data analysis and interpretation. Moreover, it is easy to get overwhelmed by the plethora of available methods and tools for HTS. Here, we review the step-by-step process from the generation of sequence data to molecular diagnosis of Mendelian diseases. Highlighting advantages and limitations, this review addresses the current state of (1) HTS technologies, considering targeted, whole-exome, and whole-genome sequencing on short- and long-read platforms; (2) read alignment, variant calling and interpretation; as well as (3) regulatory issues related to genetic counseling, reimbursement, and data storage.
Affiliation(s)
- S M Caspar
- Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich, Switzerland
- N Dubacher
- Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich, Switzerland
- A M Kopps
- Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich, Switzerland
- J Meienberg
- Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich, Switzerland
- C Henggeler
- Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich, Switzerland
- G Matyas
- Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich, Switzerland.,Zurich Center for Integrative Human Physiology, University of Zurich, Zurich, Switzerland
133
Schwabl P, Imamura H, Van den Broeck F, Costales JA, Maiguashca-Sánchez J, Miles MA, Andersson B, Grijalva MJ, Llewellyn MS. Meiotic sex in Chagas disease parasite Trypanosoma cruzi. Nat Commun 2019; 10:3972. [PMID: 31481692 PMCID: PMC6722143 DOI: 10.1038/s41467-019-11771-z] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 10/12/2018] [Accepted: 07/27/2019] [Indexed: 12/11/2022] Open
Abstract
Genetic exchange enables parasites to rapidly transform disease phenotypes and exploit new host populations. Trypanosoma cruzi, the parasitic agent of Chagas disease and a public health concern throughout Latin America, has for decades been presumed to exchange genetic material rarely and without classic meiotic sex. We present compelling evidence from 45 genomes sequenced from southern Ecuador that T. cruzi in fact maintains truly sexual, panmictic groups that can occur alongside others that remain highly clonal after past hybridization events. These groups with divergent reproductive strategies appear genetically isolated despite possible co-occurrence in vectors and hosts. We propose biological explanations for the fine-scale disconnectivity we observe and discuss the epidemiological consequences of flexible reproductive modes. Our study reinvigorates the hunt for the site of genetic exchange in the T. cruzi life cycle, provides tools to define the genetic determinants of parasite virulence, and reforms longstanding theory on clonality in trypanosomatid parasites.
Affiliation(s)
- Philipp Schwabl
- Institute of Biodiversity, Animal Health & Comparative Medicine, University of Glasgow, Glasgow, G12 8QQ, UK
- Hideo Imamura
- Unit of Molecular Parasitology, Institute of Tropical Medicine Antwerp, 155 Nationalestraat, 2000, Antwerp, Belgium
- Frederik Van den Broeck
- Unit of Molecular Parasitology, Institute of Tropical Medicine Antwerp, 155 Nationalestraat, 2000, Antwerp, Belgium
- Jaime A Costales
- Center for Research on Health in Latin America, School of Biological Sciences, Pontifical Catholic University of Ecuador, Quito, Ecuador
- Jalil Maiguashca-Sánchez
- Center for Research on Health in Latin America, School of Biological Sciences, Pontifical Catholic University of Ecuador, Quito, Ecuador
- Michael A Miles
- London School of Hygiene & Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- Bjorn Andersson
- Department of Cell and Molecular Biology, Science for Life Laboratory, Karolinska Institutet, Biomedicum 9C, 171 77, Stockholm, Sweden
- Mario J Grijalva
- Center for Research on Health in Latin America, School of Biological Sciences, Pontifical Catholic University of Ecuador, Quito, Ecuador
- Infectious and Tropical Disease Institute, Biomedical Sciences Department, Heritage College of Osteopathic Medicine, Ohio University, 45701, Athens, OH, USA
- Martin S Llewellyn
- Institute of Biodiversity, Animal Health & Comparative Medicine, University of Glasgow, Glasgow, G12 8QQ, UK.
134
Vegesna R, Tomaszkiewicz M, Medvedev P, Makova KD. Dosage regulation, and variation in gene expression and copy number of human Y chromosome ampliconic genes. PLoS Genet 2019; 15:e1008369. [PMID: 31525193 PMCID: PMC6772104 DOI: 10.1371/journal.pgen.1008369] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 04/30/2019] [Revised: 10/01/2019] [Accepted: 08/13/2019] [Indexed: 12/28/2022] Open
Abstract
The Y chromosome harbors nine multi-copy ampliconic gene families expressed exclusively in testis. The gene copies within each family are >99% identical to each other, which poses a major challenge in evaluating their copy number. Recent studies demonstrated high variation in Y ampliconic gene copy number among humans. However, how this variation affects expression levels in human testis remains understudied. Here we developed a novel computational tool Ampliconic Copy Number Estimator (AmpliCoNE) that utilizes read sequencing depth information to estimate Y ampliconic gene copy number per family. We applied this tool to whole-genome sequencing data of 149 men with matched testis expression data whose samples are part of the Genotype-Tissue Expression (GTEx) project. We found that the Y ampliconic gene families with low copy number in humans were deleted or pseudogenized in non-human great apes, suggesting relaxation of functional constraints. Among the Y ampliconic gene families, higher copy number leads to higher expression. Within the Y ampliconic gene families, copy number does not influence gene expression, rather a high tolerance for variation in gene expression was observed in testis of presumably healthy men. No differences in gene expression levels were found among major Y haplogroups. Age positively correlated with expression levels of the HSFY and PRY gene families in the African subhaplogroup E1b, but not in the European subhaplogroups R1b and I1. We also found that expression of five Y ampliconic gene families is coordinated with that of their non-Y (i.e. X or autosomal) homologs. Indeed, five ampliconic gene families had consistently lower expression levels when compared to their non-Y homologs suggesting dosage regulation, while the HSFY family had higher expression levels than its X homolog and thus lacked dosage regulation.
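The general idea of estimating copy number from sequencing depth can be sketched as a ratio of depths. This is not AmpliCoNE's algorithm, which the abstract does not spell out; it is a minimal example that assumes hypothetical per-position depth lists for a gene family and for single-copy control regions on the Y chromosome.

```python
from statistics import median

def estimate_copy_number(family_depths, control_depths):
    """Estimate copies per family as median family depth / median single-copy depth.

    Both arguments are hypothetical lists of per-position read depths.
    Medians are used so that a few outlier positions do not dominate.
    """
    baseline = median(control_depths)
    if baseline <= 0:
        raise ValueError("control depth must be positive")
    return median(family_depths) / baseline

# Hypothetical depths: the family region is covered about twice as deeply
# as the single-copy control, suggesting roughly two copies.
print(estimate_copy_number([60, 62, 58], [30, 31, 29]))
```

Because reads from near-identical copies pile up on one reference locus, depth relative to a single-copy baseline scales with the number of copies, which is why depth is informative when the copies themselves cannot be told apart.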
MESH Headings
- Animals
- Chromosomes, Human, Y/genetics
- Chromosomes, Human, Y/physiology
- DNA Copy Number Variations/genetics
- Databases, Genetic
- Dosage Compensation, Genetic/genetics
- Dosage Compensation, Genetic/physiology
- Epigenesis, Genetic/genetics
- Gene Dosage/genetics
- Gene Expression/genetics
- Gene Expression Regulation/genetics
- Genes, Y-Linked/genetics
- Genes, Y-Linked/physiology
- Heat Shock Transcription Factors/genetics
- Heat Shock Transcription Factors/metabolism
- Humans
- Male
- Multigene Family/genetics
- Sequence Analysis, DNA/methods
- Testis/metabolism
Affiliation(s)
- Rahulsimham Vegesna
- Bioinformatics and Genomics Graduate Program, The Huck Institutes for the Life Sciences, Pennsylvania State University, University Park, PA, United States of America
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, United States of America
- Marta Tomaszkiewicz
- Department of Biology, Pennsylvania State University, University Park, PA, United States of America
- Paul Medvedev
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, United States of America
- Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, United States of America
- Center for Computational Biology and Bioinformatics, Pennsylvania State University, University Park, PA, United States of America
- Center for Medical Genomics, Pennsylvania State University, University Park, PA, United States of America
- Kateryna D. Makova
- Bioinformatics and Genomics Graduate Program, The Huck Institutes for the Life Sciences, Pennsylvania State University, University Park, PA, United States of America
- Department of Biology, Pennsylvania State University, University Park, PA, United States of America
- Center for Computational Biology and Bioinformatics, Pennsylvania State University, University Park, PA, United States of America
- Center for Medical Genomics, Pennsylvania State University, University Park, PA, United States of America
135
Moreland BS, Oman KM, Bundschuh R. A model of pulldown alignments from SssI-treated DNA improves DNA methylation prediction. BMC Bioinformatics 2019; 20:431. [PMID: 31426747 PMCID: PMC6700779 DOI: 10.1186/s12859-019-3011-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2018] [Accepted: 07/29/2019] [Indexed: 11/12/2022] Open
Abstract
Background Protein pulldown using Methyl-CpG binding domain (MBD) proteins followed by high-throughput sequencing is a common method to determine DNA methylation. Algorithms have been developed to estimate absolute methylation level from read coverage generated by affinity enrichment-based techniques, but the most accurate one for MBD-seq data requires additional data from an SssI-treated Control experiment. Results Using our previous characterizations of Methyl-CpG/MBD2 binding in the context of an MBD pulldown experiment, we build a model of expected MBD pulldown reads as drawn from SssI-treated DNA. We use the program BayMeth to evaluate the effectiveness of this model by substituting calculated SssI Control data for the observed SssI Control data. By comparing methylation predictions against those from an RRBS data set, we find that BayMeth run with our modeled SssI Control data performs better than BayMeth run with observed SssI Control data, on both 100 bp and 10 bp windows. Adapting the model to an external data set solely by changing the average fragment length, our calculated data still informs the BayMeth program to a similar level as observed data in predicting methylation state on a pulldown data set with matching WGBS estimates. Conclusion In both internal and external MBD pulldown data sets tested in this study, BayMeth used with our modeled pulldown coverage performs better than BayMeth run without the inclusion of any estimate of SssI Control pulldown, and is comparable to – and in some cases better than – using observed SssI Control data with the BayMeth program. Thus, our MBD pulldown alignment model can improve methylation predictions without the need to perform additional control experiments.
Affiliation(s)
- Blythe S Moreland
- Department of Physics, The Ohio State University, Columbus, OH, USA.,Present address: Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Kenji M Oman
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Ralf Bundschuh
- Department of Physics, The Ohio State University, Columbus, OH, USA.,Department of Chemistry & Biochemistry, Division of Hematology, and Center for RNA Biology, The Ohio State University, Columbus, OH, USA.
136
Cao J, Chen L, Li H, Chen H, Yao J, Mu S, Liu W, Zhang P, Cheng Y, Liu B, Hu Z, Chen D, Kang H, Hu J, Wang A, Wang W, Yao M, Chrin G, Wang X, Zhao W, Li L, Xu L, Guo W, Jia J, Chen J, Wang K, Li G, Shi W. An Accurate and Comprehensive Clinical Sequencing Assay for Cancer Targeted and Immunotherapies. Oncologist 2019; 24:e1294-e1302. [PMID: 31409745 DOI: 10.1634/theoncologist.2019-0236] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 04/02/2019] [Accepted: 05/25/2019] [Indexed: 01/03/2023] Open
Abstract
BACKGROUND Incorporation of next-generation sequencing (NGS) technology into clinical utility in targeted and immunotherapies requires stringent validation, including the assessment of tumor mutational burden (TMB) and microsatellite instability (MSI) status by NGS as important biomarkers for response to immune checkpoint inhibitors. MATERIALS AND METHODS We designed an NGS assay, Cancer Sequencing YS panel (CSYS), and applied algorithms to detect five classes of genomic alterations and two genomic features of TMB and MSI. RESULTS By stringent validation, CSYS exhibited high sensitivity and positive predictive value of 99.7% and 99.9%, respectively, for single nucleotide variation; 100% and 99.9%, respectively, for short insertion and deletion (indel); and 95.5% and 100%, respectively, for copy number alteration (CNA). Moreover, CSYS achieved 100% specificity for both long indel (50-3,000 bp insertion and deletion) and gene rearrangement. Overall, we used 33 cell lines and 208 clinical samples to validate CSYS's NGS performance, and genomic alterations in clinical samples were also confirmed by fluorescence in situ hybridization, immunohistochemistry, and polymerase chain reaction (PCR). Importantly, the landscape of TMB across different cancers of Chinese patients (n = 3,309) was studied. TMB by CSYS exhibited a high correlation (Pearson correlation coefficient r = 0.98) with TMB by whole exome sequencing (WES). MSI measurement showed 98% accuracy and was confirmed by PCR. Application of CSYS in a clinical setting showed an unexpectedly high occurrence of long indel (6.3%) in a cohort of tumors from Chinese patients with cancer (n = 3,309), including TP53, RB1, FLT3, BRCA2, and other cancer driver genes with clinical impact. CONCLUSION CSYS proves to be clinically applicable and useful in disclosing genomic alterations relevant to cancer target therapies and revealing biomarkers for immune checkpoint inhibitors.
IMPLICATIONS FOR PRACTICE The study describes a specially designed sequencing panel assay to detect genomic alterations and features of 450 cancer genes, including its overall workflow and rigorous clinical and analytical validations. The distribution of pan-cancer tumor mutational burden, microsatellite instability, gene rearrangement, and long insertion and deletion mutations was assessed for the first time by this assay in a broad array of Chinese patients with cancer. The Cancer Sequencing YS panel and its validation study could serve as a blueprint for developing next-generation sequencing-based assays, particularly for the purpose of clinical application.
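For reference, the validation metrics quoted in the abstract follow the standard confusion-matrix definitions, sketched below. The function names and the counts in the usage lines are hypothetical and are not taken from the study.

```python
def sensitivity(tp, fn):
    """Fraction of true variants that the assay detects: TP / (TP + FN)."""
    return tp / (tp + fn)

def positive_predictive_value(tp, fp):
    """Fraction of reported variants that are real: TP / (TP + FP)."""
    return tp / (tp + fp)

def specificity(tn, fp):
    """Fraction of true negatives correctly left uncalled: TN / (TN + FP)."""
    return tn / (tn + fp)

# Hypothetical validation counts, for illustration only.
print(sensitivity(tp=997, fn=3))
print(positive_predictive_value(tp=999, fp=1))
print(specificity(tn=50, fp=0))
```

Sensitivity and positive predictive value answer different questions (missed variants versus false calls), which is why the abstract reports them as a pair for each alteration class.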
Affiliation(s)
- Jingyu Cao
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- Lijuan Chen
- OrigiMed, Shanghai, People's Republic of China
- Heng Li
- Department of Thoracic Surgery, The Third Affiliated Hospital of Kunming Medical University, Yunnan Tumor Hospital, Kunming, People's Republic of China
- Hui Chen
- OrigiMed, Shanghai, People's Republic of China
- Jicheng Yao
- OrigiMed, Shanghai, People's Republic of China
- Shuo Mu
- OrigiMed, Shanghai, People's Republic of China
- Wenjin Liu
- OrigiMed, Shanghai, People's Republic of China
- Peng Zhang
- OrigiMed, Shanghai, People's Republic of China
- Yuwei Cheng
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA
- Binbin Liu
- OrigiMed, Shanghai, People's Republic of China
- Hui Kang
- OrigiMed, Shanghai, People's Republic of China
- Jinwei Hu
- OrigiMed, Shanghai, People's Republic of China
- Aodi Wang
- OrigiMed, Shanghai, People's Republic of China
- Ming Yao
- OrigiMed, Shanghai, People's Republic of China
- Xiaoting Wang
- Department of Medicine, The First Affiliated Hospital, Zhejiang University, Hangzhou, People's Republic of China
- Wei Zhao
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- Lei Li
- Department of Hepatobiliary Surgery, Shandong Tumor Hospital, Jinan, People's Republic of China
- Luping Xu
- Department of General Surgery, The First Affiliated Hospital, Jiaxing College of Medicine, Jiangxi, People's Republic of China
- Weixin Guo
- Department of Chemotherapy, Meizhou People's Hospital, Meizhou, People's Republic of China
- Jun Jia
- Department of Oncology, Dongguan People's Hospital, Dongguan, People's Republic of China
- Jianhua Chen
- Department of Medical Oncology-Chest, Hunan Cancer Hospital, Changsha, People's Republic of China
- Kai Wang
- OrigiMed, Shanghai, People's Republic of China
- Zhejiang University International Hospital, Hangzhou, People's Republic of China
- Gaofeng Li
- Department of Thoracic Surgery, The Third Affiliated Hospital of Kunming Medical University, Yunnan Tumor Hospital, Kunming, People's Republic of China
- Weiwei Shi
- OrigiMed, Shanghai, People's Republic of China
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai, People's Republic of China
137
Maintenance of High Genome Integrity over Vegetative Growth in the Fairy-Ring Mushroom Marasmius oreades. Curr Biol 2019; 29:2758-2765.e6. [PMID: 31402298 DOI: 10.1016/j.cub.2019.07.025] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 04/16/2019] [Revised: 05/25/2019] [Accepted: 07/09/2019] [Indexed: 01/06/2023]
Abstract
Most mutations in coding regions of the genome are deleterious, causing selection to favor mechanisms that minimize the mutational load over time [1-5]. DNA replication during cell division is a major source of new mutations. It is therefore important to limit the number of cell divisions between generations, particularly for large and long-lived organisms [6-9]. The germline cells of animals and the slowly dividing cells in plant meristems are adaptations to control the number of mutations that accumulate over generations [9-11]. Fungi lack a separated germline while harboring species with very large and long-lived individuals that appear to maintain highly stable genomes within their mycelia [8, 12, 13]. Here, we studied genomic mutation accumulation in the fairy-ring mushroom Marasmius oreades. We generated a chromosome-level genome assembly using a combination of cutting-edge DNA sequencing technologies and re-sequenced 40 samples originating from six individuals of this fungus. The low number of mutations recovered in the sequencing data suggests the presence of an unknown mechanism that works to maintain extraordinary genome integrity over vegetative growth in M. oreades. The highly structured growth pattern of M. oreades allowed us to estimate the number of cell divisions leading up to each sample [14, 15], and from this data, we infer an incredibly low per mitosis mutation rate (3.8 × 10⁻¹² mutations per site and cell division) as one of several possible explanations for the low number of identified mutations.
138
Mutational processes contributing to the development of multiple myeloma. Blood Cancer J 2019; 9:60. [PMID: 31387987 PMCID: PMC6684612 DOI: 10.1038/s41408-019-0221-9] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 01/14/2019] [Revised: 04/26/2019] [Accepted: 05/08/2019] [Indexed: 12/20/2022] Open
Abstract
To gain insight into multiple myeloma (MM) tumorigenesis, we analyzed the mutational signatures in 874 whole-exome and 850 whole-genome data from the CoMMpass Study. We identified that coding and non-coding regions are differentially dominated by distinct single-nucleotide variant (SNV) mutational signatures, as well as five de novo structural rearrangement signatures. Mutational signatures reflective of different principal mutational processes, namely aging, defective DNA repair, and apolipoprotein B editing complex (APOBEC)/activation-induced deaminase activity, characterize MM. These mutational signatures show evidence of subgroup specificity: APOBEC-attributed signatures associated with MAF translocation t(14;16) and t(14;20) MM; potentially DNA repair deficiency with t(11;14) and t(4;14); and aging with hyperdiploidy. Mutational signatures beyond that associated with APOBEC are independent of established prognostic markers and appear to have relevance to predicting high-risk MM.
139
Field MA, Burgio G, Chuah A, Al Shekaili J, Hassan B, Al Sukaiti N, Foote SJ, Cook MC, Andrews TD. Recurrent miscalling of missense variation from short-read genome sequence data. BMC Genomics 2019; 20:546. [PMID: 31307400 PMCID: PMC6631443 DOI: 10.1186/s12864-019-5863-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/30/2022] Open
Abstract
Background Short-read resequencing of genomes produces abundant information on the genetic variation of individuals. Because these variants are so numerous, they are rarely exhaustively validated. Furthermore, low levels of undetected variant miscalling will have a systematic and disproportionate impact on the interpretation of individual genome sequence information, especially should these miscalls also be carried through into reference databases of genomic variation. Results We find that sequence variation from short-read sequence data is subject to recurrent-yet-intermittent miscalling that occurs in a sequence-intrinsic manner and is very sensitive to sequence read length. The miscalls arise from difficulties aligning short reads to redundant genomic regions, where the rate of sequencing error approaches the sequence diversity between redundant regions. We find the resultant miscalled variants to be sensitive to small sequence variations between genomes, and thereby often intrinsic to an individual, pedigree, strain or human ethnic group. In human exome sequences, we identify 2–300 recurrent false positive variants per individual, almost all of which are present in public databases of human genomic variation. From the exomes of non-reference strains of inbred mice, we identify 3–5000 recurrent false positive variants per mouse, with the number increasing with greater distance between an individual mouse strain and the reference C57BL6 mouse genome. We show that recurrently miscalled variants may be reproduced for a given genome from repeated simulation rounds of read resampling, realignment and recalling. As such, it is possible to identify more than two-thirds of false positive variation from only ten rounds of simulation. Conclusion Identification and removal of recurrent false positive variants from specific individual variant sets will improve overall data quality. Variant miscalls are highly sequence intrinsic and are often specific to an individual, pedigree or ethnicity. Further, read length is a strong determinant of whether given false variants will be called for any given genome, which has profound significance for cohort studies that pool datasets collected and sequenced at different points in time.
Affiliation(s)
- Matthew A Field
- Department of Immunology and Infectious Disease, The John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia.,Australian Institute of Tropical Health and Medicine, Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Cairns, Queensland, Australia
- Gaetan Burgio
- Department of Immunology and Infectious Disease, The John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
- Aaron Chuah
- Department of Immunology and Infectious Disease, The John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
- Jalila Al Shekaili
- Department of Microbiology and Immunology, Sultan Qaboos University Hospital, Seeb, Oman
- Batool Hassan
- Department of Medicine, Sultan Qaboos University Hospital, Muscat, Oman
- Nashat Al Sukaiti
- Department of Paediatrics, Allergy, and Clinical Immunology Unit, Royal Hospital, Muscat, Oman
- Simon J Foote
- Department of Immunology and Infectious Disease, The John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
- Matthew C Cook
- Department of Immunology and Infectious Disease, The John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia.,Department of Immunology, Canberra Hospital, Canberra, Australian Capital Territory, Australia
- T Daniel Andrews
- Department of Immunology and Infectious Disease, The John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia.
140
Sexton CE, Han MV. Paired-end mappability of transposable elements in the human genome. Mob DNA 2019; 10:29. [PMID: 31320939 PMCID: PMC6617613 DOI: 10.1186/s13100-019-0172-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 02/06/2019] [Accepted: 07/02/2019] [Indexed: 01/02/2023] Open
Abstract
Though transposable elements make up around half of the human genome, the repetitive nature of their sequences makes it difficult to accurately align conventional sequencing reads. However, in light of new advances in sequencing technology, such as increased read length and paired-end libraries, these repetitive regions are now becoming easier to align to. This study investigates the mappability of transposable elements with 50 bp, 76 bp and 100 bp paired-end read libraries. With respect to those read lengths and allowing for 3 mismatches during alignment, over 68, 85, and 88% of all transposable elements in the RepeatMasker database are uniquely mappable, suggesting that accurate locus-specific mapping of older transposable elements is well within reach.
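The effect of read length on mappability can be illustrated with a toy calculation: the fraction of k-length substrings that occur exactly once in a sequence. This is a deliberately simplified sketch, not the study's method. It ignores paired ends, mismatches, and the reverse strand, and the function name and toy sequence are assumptions.

```python
from collections import Counter

def unique_kmer_fraction(genome, k):
    """Fraction of k-mers in `genome` that occur exactly once (exact matches only)."""
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    return sum(1 for kmer in kmers if counts[kmer] == 1) / len(kmers)

# Toy repetitive sequence: short substrings are mostly ambiguous,
# while longer ones span the repeat boundary and become unique.
toy_genome = "ACGTACGTTT"
for k in (2, 5):
    print(k, unique_kmer_fraction(toy_genome, k))
```

Longer substrings extend past a repeat into distinguishing flanking sequence, which is the same intuition behind the gain in uniquely mappable elements as read length grows from 50 bp to 100 bp.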
Affiliation(s)
- Corinne E Sexton
- School of Life Sciences, University of Nevada, Las Vegas, NV 89154 USA.,Nevada Institute of Personalized Medicine, Las Vegas, NV 89154 USA
- Mira V Han
- School of Life Sciences, University of Nevada, Las Vegas, NV 89154 USA.,Nevada Institute of Personalized Medicine, Las Vegas, NV 89154 USA
| |
|
141
|
Bowler TG, Pradhan K, Kong Y, Bartenstein M, Morrone KA, Sridharan A, Kessel RM, Shastri A, Giricz O, Bhagat TD, Gordon-Mitchell S, Rohanizadegan M, Hooda L, Datt I, Przychodzen BP, Parmar S, Maqbool S, Maciejewski JP, Steidl U, Greally JM, Verma A. Misidentification of MLL3 and other mutations in cancer due to highly homologous genomic regions. Leuk Lymphoma 2019; 60:3132-3137. [PMID: 31288594 DOI: 10.1080/10428194.2019.1630620] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/26/2022]
Abstract
The MLL3 gene has been shown to be recurrently mutated in many malignancies, including in families with acute myeloid leukemia. We demonstrate that many MLL3 variant calls made by exome sequencing are false positives due to misalignment to homologous regions, including a region on chr21, and can only be validated by long-range PCR. Numerous other recurrently mutated genes reported in the COSMIC and TCGA databases have pseudogenes and likewise cannot be validated by conventional short-read sequencing approaches. Genome-wide identification of pseudogene regions demonstrates that the frequency of these homologous regions is increased with sequencing read lengths below 200 bp. To enable identification of poor-quality sequencing variants in prospective studies, we generated novel genome-wide maps of regions with poor mappability that can be used in variant calling algorithms. Taken together, our findings reveal that pseudogene regions are a source of false-positive mutations in cancers.
Affiliation(s)
- Timothy G Bowler
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | - Kith Pradhan
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | - Yu Kong
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | | | - Kerry A Morrone
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | - Ashwin Sridharan
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | - Rachel M Kessel
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | - Aditi Shastri
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | - Orsi Giricz
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | - Tushar D Bhagat
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | | | | | - Lauren Hooda
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | - Ishan Datt
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | | | | | - Shahina Maqbool
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | | | - Ulrich Steidl
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | - John M Greally
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| | - Amit Verma
- Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
| |
|
142
|
Karimzadeh M, Ernst C, Kundaje A, Hoffman MM. Umap and Bismap: quantifying genome and methylome mappability. Nucleic Acids Res 2019; 46:e120. [PMID: 30169659 PMCID: PMC6237805 DOI: 10.1093/nar/gky677] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 06/14/2017] [Accepted: 07/22/2018] [Indexed: 11/14/2022] Open
Abstract
Short-read sequencing enables assessment of genetic and biochemical traits of individual genomic regions, such as the location of genetic variation, protein binding and chemical modifications. Every region in a genome assembly has a property called 'mappability', which measures the extent to which it can be uniquely mapped by sequence reads. In regions of lower mappability, estimates of genomic and epigenomic characteristics from sequencing assays are less reliable. These regions have increased susceptibility to spurious mapping from reads from other regions of the genome with sequencing errors or unexpected genetic variation. Bisulfite sequencing approaches used to identify DNA methylation exacerbate these problems by introducing large numbers of reads that map to multiple regions. Both to correct assumptions of uniformity in downstream analysis and to identify regions where the analysis is less reliable, it is necessary to know the mappability of both ordinary and bisulfite-converted genomes. We introduce the Umap software for identifying uniquely mappable regions of any genome. Its Bismap extension identifies mappability of the bisulfite-converted genome. A Umap and Bismap track hub for human genome assemblies GRCh37/hg19 and GRCh38/hg38, and mouse assemblies GRCm37/mm9 and GRCm38/mm10 is available at https://bismap.hoffmanlab.org for use with genome browsers.
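The notion of mappability described in this abstract can be illustrated with a toy calculation. The sketch below is not the Umap algorithm; it assumes a single forward strand, exact matching, and single-end reads of length `k`, and the function name `unique_kmer_mask` and the toy sequence are hypothetical. A position counts as uniquely mappable here when the k-mer starting at it occurs exactly once in the sequence.

```python
from collections import Counter


def unique_kmer_mask(genome: str, k: int) -> list[bool]:
    """For each start position, report whether its k-mer occurs exactly once.

    Simplifying assumptions (not taken from the abstract): forward strand
    only, exact matches, no sequencing errors.
    """
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    return [counts[kmer] == 1 for kmer in kmers]


# Toy sequence in which "ACGT" occurs twice (positions 0 and 4).
toy_genome = "ACGTACGTTTGA"
mask = unique_kmer_mask(toy_genome, k=4)
fraction_unique = sum(mask) / len(mask)
print(mask)
print(f"fraction of uniquely mappable start positions: {fraction_unique:.2f}")
```

Increasing `k` generally raises the unique fraction, which mirrors the general observation that longer reads map more uniquely.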
Affiliation(s)
- Mehran Karimzadeh
- Princess Margaret Cancer Centre, M5G 1L7, Toronto, ON, Canada; Department of Medical Biophysics, M5G 1L7, University of Toronto, Toronto, ON, Canada; Vector Institute, M5G 1M1, Toronto, ON, Canada
| | - Carl Ernst
- Department of Human Genetics, McGill University, H3A 0C7, Montreal, QC, Canada
| | - Anshul Kundaje
- Department of Genetics, Stanford University, 94305-9025, Stanford, CA, USA; Department of Computer Science, Stanford University, 94305-5120, Stanford, CA, USA
| | - Michael M Hoffman
- Princess Margaret Cancer Centre, M5G 1L7, Toronto, ON, Canada; Department of Medical Biophysics, M5G 1L7, University of Toronto, Toronto, ON, Canada; Vector Institute, M5G 1M1, Toronto, ON, Canada; Department of Computer Science, University of Toronto, M5S 2E4, Toronto, ON, Canada
| |
|
143
|
Filia A, Droop A, Harland M, Thygesen H, Randerson-Moor J, Snowden H, Taylor C, Diaz JMS, Pozniak J, Nsengimana J, Laye J, Newton-Bishop JA, Bishop DT. High-Resolution Copy Number Patterns From Clinically Relevant FFPE Material. Sci Rep 2019; 9:8908. [PMID: 31222134 PMCID: PMC6586881 DOI: 10.1038/s41598-019-45210-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/03/2018] [Accepted: 05/07/2019] [Indexed: 11/09/2022] Open
Abstract
Systematic tumour profiling is essential for biomarker research and clinically for assessing response to therapy. Solving the challenge of delivering informative copy number (CN) profiles from formalin-fixed paraffin embedded (FFPE) material, the only likely readily available biospecimen for most cancers, involves successful processing of small quantities of degraded DNA. To investigate the potential for analysis of such lesions, whole-genome CNVseq was applied to 300 FFPE primary tumour samples, obtained from a large-scale epidemiological study of melanoma. The quality and the discriminatory power of CNVseq was assessed. Libraries were successfully generated for 93% of blocks, with input DNA quantity being the only predictor of success (success rate dropped to 65% if <20 ng available); 3% of libraries were dropped because of low sequence alignment rates. Technical replicates showed high reproducibility. Comparison with targeted CN assessment showed consistency with the Next Generation Sequencing (NGS) analysis. We were able to detect and distinguish CN changes with a resolution of ≤10 kb. To demonstrate performance, we report the spectrum of genomic CN alterations (CNAs) detected at 9p21, the major site of CN change in melanoma. This successful analysis of CN in FFPE material using NGS provides proof of principle for intensive examination of population-based samples.
Affiliation(s)
- Anastasia Filia
- Section of Epidemiology and Biostatistics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom
- Centre for Translational Research, Biomedical Research Foundation of the Academy of Athens (BRFAA), Athens, Greece
| | - Alastair Droop
- MRC Medical Bioinformatics Centre, Leeds Institute of Data Analytics, University of Leeds, Leeds, United Kingdom
| | - Mark Harland
- Section of Epidemiology and Biostatistics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom
| | - Helene Thygesen
- Section of Epidemiology and Biostatistics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom
| | - Juliette Randerson-Moor
- Section of Epidemiology and Biostatistics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom
| | - Helen Snowden
- Section of Epidemiology and Biostatistics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom
| | - Claire Taylor
- Section of Epidemiology and Biostatistics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom
| | - Joey Mark S Diaz
- Section of Epidemiology and Biostatistics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom
| | - Joanna Pozniak
- Section of Epidemiology and Biostatistics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom
| | - Jérémie Nsengimana
- Section of Epidemiology and Biostatistics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom
| | - Jon Laye
- Section of Epidemiology and Biostatistics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom
| | - Julia A Newton-Bishop
- Section of Epidemiology and Biostatistics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom
| | - D Timothy Bishop
- Section of Epidemiology and Biostatistics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom.
| |
|
144
|
Fabry MH, Ciabrelli F, Munafò M, Eastwood EL, Kneuss E, Falciatori I, Falconio FA, Hannon GJ, Czech B. piRNA-guided co-transcriptional silencing coopts nuclear export factors. eLife 2019; 8:e47999. [PMID: 31219034 PMCID: PMC6677536 DOI: 10.7554/elife.47999] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 04/26/2019] [Accepted: 06/19/2019] [Indexed: 01/25/2023] Open
Abstract
The PIWI-interacting RNA (piRNA) pathway is a small RNA-based immune system that controls the expression of transposons and maintains genome integrity in animal gonads. In Drosophila, piRNA-guided silencing is achieved, in part, via co-transcriptional repression of transposons by Piwi. This depends on Panoramix (Panx); however, precisely how an RNA binding event silences transcription remains to be determined. Here we show that Nuclear Export Factor 2 (Nxf2) and its co-factor, Nxt1, form a complex with Panx and are required for co-transcriptional silencing of transposons in somatic and germline cells of the ovary. Tethering of Nxf2 or Nxt1 to RNA results in silencing of target loci and the concomitant accumulation of repressive chromatin marks. Nxf2 and Panx proteins are mutually required for proper localization and stability. We mapped the protein domains crucial for the Nxf2/Panx complex formation and show that the amino-terminal portion of Panx is sufficient to induce transcriptional silencing.
Affiliation(s)
- Martin H Fabry
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Filippo Ciabrelli
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Marzia Munafò
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Evelyn L Eastwood
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Emma Kneuss
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Ilaria Falciatori
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Federica A Falconio
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Gregory J Hannon
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Benjamin Czech
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| |
|
145
|
Huang CC, Du M, Wang L. Bioinformatics Analysis for Circulating Cell-Free DNA in Cancer. Cancers (Basel) 2019; 11:805. [PMID: 31212602 PMCID: PMC6627444 DOI: 10.3390/cancers11060805] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 04/22/2019] [Revised: 06/03/2019] [Accepted: 06/06/2019] [Indexed: 12/28/2022] Open
Abstract
Molecular analysis of cell-free DNA (cfDNA) that circulates in plasma and other body fluids represents a "liquid biopsy" approach for non-invasive cancer screening or monitoring. The rapid development of sequencing technologies has made cfDNA a promising source to study cancer development and progression. Specific genetic and epigenetic alterations have been found in plasma, serum, and urine cfDNA and could potentially be used as diagnostic or prognostic biomarkers in various cancer types. In this review, we will discuss the molecular characteristics of cancer cfDNA and major bioinformatics approaches involved in the analysis of cfDNA sequencing data for detecting genetic mutation, copy number alteration, methylation change, and nucleosome positioning variation. We highlight specific challenges in sensitivity to detect genetic aberrations and robustness of statistical analysis. Finally, we provide perspectives regarding the standard and continuing development of bioinformatics analysis to move this promising screening tool into clinical practice.
Affiliation(s)
- Chiang-Ching Huang
- Zilber School of Public Health, University of Wisconsin, Milwaukee, WI 53205, USA.
| | - Meijun Du
- Department of Pathology and MCW Cancer Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA.
| | - Liang Wang
- Department of Pathology and MCW Cancer Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA.
| |
|
146
|
Pounraja VK, Jayakar G, Jensen M, Kelkar N, Girirajan S. A machine-learning approach for accurate detection of copy number variants from exome sequencing. Genome Res 2019; 29:1134-1143. [PMID: 31171634 PMCID: PMC6633262 DOI: 10.1101/gr.245928.118] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 11/02/2018] [Accepted: 06/04/2019] [Indexed: 11/25/2022]
Abstract
Copy number variants (CNVs) are a major cause of several genetic disorders, making their detection an essential component of genetic analysis pipelines. Current methods for detecting CNVs from exome-sequencing data are limited by high false-positive rates and low concordance because of inherent biases of individual algorithms. To overcome these issues, calls generated by two or more algorithms are often intersected using Venn diagram approaches to identify "high-confidence" CNVs. However, this approach is inadequate, because it misses potentially true calls that do not have consensus from multiple callers. Here, we present CN-Learn, a machine-learning framework that integrates calls from multiple CNV detection algorithms and learns to accurately identify true CNVs using caller-specific and genomic features from a small subset of validated CNVs. Using CNVs predicted by four exome-based CNV callers (CANOES, CODEX, XHMM, and CLAMMS) from 503 samples, we demonstrate that CN-Learn identifies true CNVs at higher precision (∼90%) and recall (∼85%) rates while maintaining robust performance even when trained with minimal data (∼30 samples). CN-Learn recovers twice as many CNVs compared to individual callers or Venn diagram-based approaches, with features such as exome capture probe count, caller concordance, and GC content providing the most discriminatory power. In fact, ∼58% of all true CNVs recovered by CN-Learn were either singletons or calls that lacked support from at least one caller. Our study underscores the limitations of current approaches for CNV identification and provides an effective method that yields high-quality CNVs for application in clinical diagnostics.
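The limitation of Venn diagram-style intersection mentioned in this abstract can be seen in a small example. The sketch below is not CN-Learn; it only counts how many callers support each call and keeps those above a threshold. The function `consensus_calls`, the caller names, and the CNV labels are hypothetical, and treating CNVs as matching labels is a simplification (real pipelines would compare genomic intervals by overlap).

```python
from collections import Counter


def consensus_calls(calls_by_caller: dict[str, set[str]], min_callers: int) -> set[str]:
    """Keep CNV labels reported by at least `min_callers` callers."""
    support = Counter(cnv for calls in calls_by_caller.values() for cnv in calls)
    return {cnv for cnv, n in support.items() if n >= min_callers}


# Hypothetical calls from three callers; labels stand in for genomic intervals.
toy_calls = {
    "caller_a": {"cnv1", "cnv2", "cnv3"},
    "caller_b": {"cnv1", "cnv3"},
    "caller_c": {"cnv1", "cnv4"},
}

strict = consensus_calls(toy_calls, min_callers=3)    # full intersection
pairwise = consensus_calls(toy_calls, min_callers=2)  # any two callers agree
union = consensus_calls(toy_calls, min_callers=1)     # every reported call

print(sorted(strict), sorted(pairwise), sorted(union))
```

In this toy data the singleton calls `cnv2` and `cnv4` are discarded by any consensus threshold above one, whether or not they are real, which is the kind of loss the abstract attributes to intersection-based filtering.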
Affiliation(s)
- Vijay Kumar Pounraja
- Bioinformatics and Genomics Graduate Program of the Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Gopal Jayakar
- The Schreyer Honors College, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Matthew Jensen
- Bioinformatics and Genomics Graduate Program of the Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Neil Kelkar
- The Schreyer Honors College, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Santhosh Girirajan
- Bioinformatics and Genomics Graduate Program of the Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA; Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA; Department of Anthropology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| |
|
147
|
Neumann T, Herzog VA, Muhar M, von Haeseler A, Zuber J, Ameres SL, Rescheneder P. Quantification of experimentally induced nucleotide conversions in high-throughput sequencing datasets. BMC Bioinformatics 2019; 20:258. [PMID: 31109287 PMCID: PMC6528199 DOI: 10.1186/s12859-019-2849-7] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Received: 10/04/2018] [Accepted: 04/25/2019] [Indexed: 11/15/2022] Open
Abstract
Background: Methods to read out naturally occurring or experimentally introduced nucleic acid modifications are emerging as powerful tools to study dynamic cellular processes. The recovery, quantification and interpretation of such events in high-throughput sequencing datasets demands specialized bioinformatics approaches. Results: Here, we present Digital Unmasking of Nucleotide conversions in K-mers (DUNK), a data analysis pipeline enabling the quantification of nucleotide conversions in high-throughput sequencing datasets. We demonstrate using experimentally generated and simulated datasets that DUNK allows constant mapping rates irrespective of nucleotide-conversion rates, promotes the recovery of multimapping reads and employs Single Nucleotide Polymorphism (SNP) masking to uncouple true SNPs from nucleotide conversions to facilitate a robust and sensitive quantification of nucleotide-conversions. As a first application, we implement this strategy as SLAM-DUNK for the analysis of SLAMseq profiles, in which 4-thiouridine-labeled transcripts are detected based on T > C conversions. SLAM-DUNK provides both raw counts of nucleotide-conversion containing reads as well as a base-content and read coverage normalized approach for estimating the fractions of labeled transcripts as readout. Conclusion: Beyond providing a readily accessible tool for analyzing SLAMseq and related time-resolved RNA sequencing methods (TimeLapse-seq, TUC-seq), DUNK establishes a broadly applicable strategy for quantifying nucleotide conversions.
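The idea of counting T > C conversions while masking known SNP positions can be sketched in a few lines. This is not SLAM-DUNK's implementation; it assumes an ungapped alignment given as two equal-length strings on the same strand, and the function name `count_tc_conversions` and the toy sequences are hypothetical.

```python
def count_tc_conversions(
    reference: str,
    read: str,
    snp_positions: frozenset[int] = frozenset(),
) -> tuple[int, int]:
    """Return (T>C conversions, reference T positions considered).

    Assumes an ungapped, same-strand alignment of equal-length strings.
    Positions listed in `snp_positions` are skipped (SNP masking).
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have equal length")
    conversions = 0
    t_considered = 0
    for pos, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "T" or pos in snp_positions:
            continue
        t_considered += 1
        if read_base == "C":
            conversions += 1
    return conversions, t_considered


toy_reference = "ATTGCTAT"
toy_read = "ACTGCCAT"  # T>C at positions 1 and 5

unmasked = count_tc_conversions(toy_reference, toy_read)
masked = count_tc_conversions(toy_reference, toy_read, frozenset({5}))
print(unmasked, masked)
```

Masking position 5 removes one apparent conversion from the count, which illustrates why a T > C SNP left unmasked would inflate the estimated labeling signal.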
Affiliation(s)
- Tobias Neumann
- Research Institute of Molecular Pathology (IMP), Campus-Vienna-Biocenter 1, Vienna BioCenter (VBC), 1030, Vienna, Austria.
| | - Veronika A Herzog
- Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Dr. Bohr-Gasse 3, VBC, 1030, Vienna, Austria
| | - Matthias Muhar
- Research Institute of Molecular Pathology (IMP), Campus-Vienna-Biocenter 1, Vienna BioCenter (VBC), 1030, Vienna, Austria
| | - Arndt von Haeseler
- Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna, Dr. Bohrgasse 9, VBC, 1030, Vienna, Austria; Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Waehringerstrasse 17, A-1090, Vienna, Austria
| | - Johannes Zuber
- Research Institute of Molecular Pathology (IMP), Campus-Vienna-Biocenter 1, Vienna BioCenter (VBC), 1030, Vienna, Austria; Medical University of Vienna, VBC, 1030, Vienna, Austria
| | - Stefan L Ameres
- Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Dr. Bohr-Gasse 3, VBC, 1030, Vienna, Austria
| | - Philipp Rescheneder
- Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna, Dr. Bohrgasse 9, VBC, 1030, Vienna, Austria.
| |
|
148
|
Tort F, Ugarteburu O, Texidó L, Gea-Sorlí S, García-Villoria J, Ferrer-Cortès X, Arias Á, Matalonga L, Gort L, Ferrer I, Guitart-Mampel M, Garrabou G, Vaz FM, Pristoupilova A, Rodríguez MIE, Beltran S, Cardellach F, Wanders RJ, Fillat C, García-Silva MT, Ribes A. Mutations in TIMM50 cause severe mitochondrial dysfunction by targeting key aspects of mitochondrial physiology. Hum Mutat 2019; 40:1700-1712. [PMID: 31058414 DOI: 10.1002/humu.23779] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 10/10/2018] [Revised: 04/26/2019] [Accepted: 04/28/2019] [Indexed: 01/16/2023]
Abstract
3-Methylglutaconic aciduria (3-MGA-uria) syndromes comprise a heterogeneous group of diseases associated with mitochondrial membrane defects. Whole-exome sequencing identified compound heterozygous mutations in TIMM50 (c.[341 G>A];[805 G>A]) in a boy with West syndrome, optic atrophy, neutropenia, cardiomyopathy, Leigh syndrome, and persistent 3-MGA-uria. A comprehensive analysis of the mitochondrial function was performed in fibroblasts of the patient to elucidate the molecular basis of the disease. TIMM50 protein was severely reduced in the patient fibroblasts, regardless of the normal mRNA levels, suggesting that the mutated residues might be important for TIMM50 protein stability. Severe morphological defects and ultrastructural abnormalities with aberrant mitochondrial cristae organization in muscle and fibroblasts were found. The levels of fully assembled OXPHOS complexes and supercomplexes were strongly reduced in fibroblasts from this patient. High-resolution respirometry demonstrated a significant reduction of the maximum respiratory capacity. A TIMM50-deficient HEK293T cell line that we generated using CRISPR/Cas9 mimicked the respiratory defect observed in the patient fibroblasts; notably, this defect was rescued by transfection with a plasmid encoding the TIMM50 wild-type protein. In summary, we demonstrated that TIMM50 deficiency causes a severe mitochondrial dysfunction by targeting key aspects of mitochondrial physiology, such as the maintenance of proper mitochondrial morphology, OXPHOS assembly, and mitochondrial respiratory capacity.
Affiliation(s)
- Frederic Tort
- Secció d'Errors Congènits del Metabolisme -IBC, Servei de Bioquímica i Genètica Molecular, Hospital Clínic, IDIBAPS, CIBERER, Barcelona, Spain
| | - Olatz Ugarteburu
- Secció d'Errors Congènits del Metabolisme -IBC, Servei de Bioquímica i Genètica Molecular, Hospital Clínic, IDIBAPS, CIBERER, Barcelona, Spain
| | - Laura Texidó
- Secció d'Errors Congènits del Metabolisme -IBC, Servei de Bioquímica i Genètica Molecular, Hospital Clínic, IDIBAPS, CIBERER, Barcelona, Spain
| | - Sabrina Gea-Sorlí
- Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Universitat de Barcelona, Barcelona, Spain
| | - Judit García-Villoria
- Secció d'Errors Congènits del Metabolisme -IBC, Servei de Bioquímica i Genètica Molecular, Hospital Clínic, IDIBAPS, CIBERER, Barcelona, Spain
| | - Xènia Ferrer-Cortès
- Secció d'Errors Congènits del Metabolisme -IBC, Servei de Bioquímica i Genètica Molecular, Hospital Clínic, IDIBAPS, CIBERER, Barcelona, Spain
| | - Ángela Arias
- Secció d'Errors Congènits del Metabolisme -IBC, Servei de Bioquímica i Genètica Molecular, Hospital Clínic, IDIBAPS, CIBERER, Barcelona, Spain
| | - Leslie Matalonga
- Secció d'Errors Congènits del Metabolisme -IBC, Servei de Bioquímica i Genètica Molecular, Hospital Clínic, IDIBAPS, CIBERER, Barcelona, Spain
| | - Laura Gort
- Secció d'Errors Congènits del Metabolisme -IBC, Servei de Bioquímica i Genètica Molecular, Hospital Clínic, IDIBAPS, CIBERER, Barcelona, Spain
| | - Isidre Ferrer
- Department of Pathology and Experimental Therapeutics, University of Barcelona; Bellvitge University Hospital; IDIBELL; Network Biomedical Research Center of Neurodegenerative diseases (CIBERNED), Hospitalet de Llobregat, Barcelona, Spain
| | - Mariona Guitart-Mampel
- Muscle Research and Mitochondrial Function Laboratory, Cellex-IDIBAPS, Faculty of Medicine and Health Science-University of Barcelona, Internal Medicine Service-Hospital Clínic of Barcelona, CIBERER, Barcelona, Spain
| | - Glòria Garrabou
- Muscle Research and Mitochondrial Function Laboratory, Cellex-IDIBAPS, Faculty of Medicine and Health Science-University of Barcelona, Internal Medicine Service-Hospital Clínic of Barcelona, CIBERER, Barcelona, Spain
| | - Frederick M Vaz
- Departments of Clinical Chemistry and Pediatrics, Laboratory Genetic Metabolic Diseases, University of Amsterdam, Amsterdam, The Netherlands
| | - Ana Pristoupilova
- Department of Pediatrics and Adolescent Medicine, Research Unit for Rare Diseases, First Faculty of Medicine, Charles University, Prague, Czech Republic; Centre for Genomic Regulation (CRG), CNAG-CRG, Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | | | - Sergi Beltran
- Centre for Genomic Regulation (CRG), CNAG-CRG, Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Francesc Cardellach
- Muscle Research and Mitochondrial Function Laboratory, Cellex-IDIBAPS, Faculty of Medicine and Health Science-University of Barcelona, Internal Medicine Service-Hospital Clínic of Barcelona, CIBERER, Barcelona, Spain
| | - Ronald JA Wanders
- Departments of Clinical Chemistry and Pediatrics, Laboratory Genetic Metabolic Diseases, University of Amsterdam, Amsterdam, The Netherlands
| | - Cristina Fillat
- Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Universitat de Barcelona, Barcelona, Spain
| | - María Teresa García-Silva
- Unidad de Enfermedades Mitocondriales- Enfermedades Metabólicas Hereditarias. Servicio de Pediatría. Universitary Hospital 12 de Octubre, U723 CIBERER, Universidad Complutense, Madrid, Spain
| | - Antonia Ribes
- Secció d'Errors Congènits del Metabolisme -IBC, Servei de Bioquímica i Genètica Molecular, Hospital Clínic, IDIBAPS, CIBERER, Barcelona, Spain
| |
|
149
|
Xu YC, Niu XM, Li XX, He W, Chen JF, Zou YP, Wu Q, Zhang YE, Busch W, Guo YL. Adaptation and Phenotypic Diversification in Arabidopsis through Loss-of-Function Mutations in Protein-Coding Genes. Plant Cell 2019; 31:1012-1025. [PMID: 30886128 PMCID: PMC6533021 DOI: 10.1105/tpc.18.00791] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 10/18/2018] [Revised: 02/25/2019] [Accepted: 03/17/2019] [Indexed: 05/07/2023]
Abstract
According to the less-is-more hypothesis, gene loss is an engine for evolutionary change. Loss-of-function (LoF) mutations resulting in the natural knockout of protein-coding genes not only provide information about gene function but also play important roles in adaptation and phenotypic diversification. Although the less-is-more hypothesis was proposed two decades ago, it remains to be explored on a large scale. In this study, we identified 60,819 LoF variants in 1071 Arabidopsis (Arabidopsis thaliana) genomes and found that 34% of Arabidopsis protein-coding genes annotated in the Columbia-0 genome do not have any LoF variants. We found that nucleotide diversity, transposable element density, and gene family size are strongly correlated with the presence of LoF variants. Intriguingly, 0.9% of LoF variants with minor allele frequency larger than 0.5% are associated with climate change. In addition, in the Yangtze River basin population, 1% of genes with LoF mutations were under positive selection, providing important insights into the contribution of LoF mutations to adaptation. In particular, our results demonstrate that LoF mutations shape diverse phenotypic traits. Overall, our results highlight the importance of the LoF variants for the adaptation and phenotypic diversification of plants.
Affiliation(s)
- Yong-Chao Xu
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xiao-Min Niu
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xin-Xin Li
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Wenrong He
- Salk Institute for Biological Studies, Plant Molecular and Cellular Biology Laboratory, La Jolla, California 92037
| | - Jia-Fu Chen
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yu-Pan Zou
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Qiong Wu
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
| | - Yong E Zhang
- University of Chinese Academy of Sciences, Beijing 100049, China
- State Key Laboratory of Integrated Management of Pest Insects and Rodents & Key Laboratory of the Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Wolfgang Busch
- Salk Institute for Biological Studies, Plant Molecular and Cellular Biology Laboratory, La Jolla, California 92037
| | - Ya-Long Guo
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
150
|
Salvadores M, Mas-Ponte D, Supek F. Passenger mutations accurately classify human tumors. PLoS Comput Biol 2019; 15:e1006953. [PMID: 30986244 PMCID: PMC6483366 DOI: 10.1371/journal.pcbi.1006953] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 10/16/2018] [Revised: 04/25/2019] [Accepted: 03/15/2019] [Indexed: 12/18/2022] Open
Abstract
Determining the cancer type and molecular subtype has important clinical implications. The primary site is, however, unknown for some malignancies discovered in the metastatic stage. Moreover, liquid biopsies may be used to screen for tumoral DNA, which upon detection needs to be assigned to a site of origin. Classifiers based on genomic features are a promising approach to prioritize the tumor anatomical site, type and subtype. We examined the predictive ability of causal (driver) somatic mutations in this task, comparing it against global patterns of non-selected (passenger) mutations, including features based on regional mutation density (RMD). In the task of distinguishing 18 cancer types, the driver mutations (mutated oncogenes or tumor suppressors, pathways and hotspots) classified 36% of the patients to the correct cancer type. In contrast, the features based on passenger mutations did so at 92% accuracy, with similar contributions from the RMD and the trinucleotide mutation spectra. The RMD and the spectra covered distinct sets of patients with predictions. In particular, introducing the RMD features into a combined classification model increased the fraction of diagnosed patients by 50 percentage points (at 20% FDR). Furthermore, RMD was able to discriminate molecular subtypes and/or anatomical site of six major cancers. The advantage of passenger mutations was upheld under high rates of false negative mutation calls and with exome sequencing, even though overall accuracy decreased. We suggest whole genome sequencing is valuable for classifying tumors because it captures global patterns emanating from mutational processes, which are informative of the underlying tumor biology.
Mutations accumulate throughout the lifetime of human somatic cells. While some (the ‘driver’ mutations) may affect oncogenes or tumor suppressor genes and cause tumors, most are thought to be of no consequence. The density of such ‘passenger’ mutations across the human chromosomes is very uneven and is correlated with replication time and gene expression in the cell type from which the tumor originated. This property can be used to classify a tumor, assigning it to a tissue of origin and also to a molecular subtype. This is useful for metastatic cancers where the location of the primary tumor is unknown, and is also of interest for the upcoming ‘liquid biopsy’ diagnostic approaches, where DNA is sequenced directly from bodily fluids to detect the presence of a cancer. The ability to type and subtype tumors is important to guide more detailed diagnostics and therapy, because the organ and the cell type that generated the tumor determine the response to a variety of therapies, including targeted drugs.
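As a rough illustration of what a regional mutation density feature might look like, the sketch below counts a sample's mutations in fixed-size genomic windows and normalizes the counts to fractions. The abstract does not state the window size, the normalization, or the classifier used, so the 1 Mb window, the function name `regional_mutation_density` and the toy data are all assumptions made for illustration, not the authors' pipeline.

```python
from collections import Counter

# Assumed window length in base pairs; the abstract does not specify one.
WINDOW_SIZE = 1_000_000


def regional_mutation_density(mutations, window_size=WINDOW_SIZE):
    """Map (chromosome, window index) to the fraction of mutations in that window.

    `mutations` is an iterable of (chromosome, 0-based position) pairs for one sample.
    """
    counts = Counter((chrom, pos // window_size) for chrom, pos in mutations)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalizing by the total makes samples with different mutation burdens comparable.
    return {window: n / total for window, n in counts.items()}


# Hypothetical toy sample: four mutations on two chromosomes.
toy_mutations = [
    ("chr1", 150_000),
    ("chr1", 900_000),
    ("chr1", 1_200_000),
    ("chr2", 50_000),
]
rmd = regional_mutation_density(toy_mutations)
```

A vector of such window fractions, one per sample, could then serve as input to any standard classifier; which model would be appropriate is not something this sketch addresses.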
Affiliation(s)
- Marina Salvadores
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, Barcelona, Spain
- David Mas-Ponte
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, Barcelona, Spain
- Fran Supek
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
|