1. Morrissey A, Shi J, James DQ, Mahony S. Accurate allocation of multimapped reads enables regulatory element analysis at repeats. Genome Res 2024; 34:937-951. [PMID: 38986578 PMCID: PMC11293539 DOI: 10.1101/gr.278638.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 06/14/2024] [Indexed: 07/12/2024]
Abstract
Transposable elements (TEs) and other repetitive regions have been shown to contain gene regulatory elements, including transcription factor binding sites. However, regulatory elements harbored by repeats have proven difficult to characterize using short-read sequencing assays such as ChIP-seq or ATAC-seq. Most regulatory genomics analysis pipelines discard "multimapped" reads that align equally well to multiple genomic locations. Because multimapped reads arise predominantly from repeats, current analysis pipelines fail to detect a substantial portion of regulatory events that occur in repetitive regions. To address this shortcoming, we developed Allo, a new approach to allocate multimapped reads in an efficient, accurate, and user-friendly manner. Allo combines probabilistic mapping of multimapped reads with a convolutional neural network that recognizes the read distribution features of potential peaks, offering enhanced accuracy in multimapping read assignment. Allo also provides read-level output in the form of a corrected alignment file, making it compatible with existing regulatory genomics analysis pipelines and downstream peak-finders. In a demonstration application on CTCF ChIP-seq data, we show that Allo results in the discovery of thousands of new CTCF peaks. Many of these peaks contain the expected cognate motif and/or serve as TAD boundaries. We additionally apply Allo to a diverse collection of ENCODE ChIP-seq data sets, resulting in multiple previously unidentified interactions between transcription factors and repetitive element families. Finally, we show that Allo may be particularly beneficial in identifying ChIP-seq peaks at centromeres, near segmentally duplicated genes, and in younger TEs, enabling new regulatory analyses in these regions.
Affiliation(s)
- Alexis Morrissey
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Jeffrey Shi
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Daniela Q James
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Shaun Mahony
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
2. Besedina E, Supek F. Copy number losses of oncogenes and gains of tumor suppressor genes generate common driver mutations. Nat Commun 2024; 15:6139. [PMID: 39033140 PMCID: PMC11271286 DOI: 10.1038/s41467-024-50552-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 07/11/2024] [Indexed: 07/23/2024]
Abstract
Cancer driver genes can undergo positive selection for various types of genetic alterations, including gain-of-function or loss-of-function mutations and copy number alterations (CNA). We investigated the landscape of different types of alterations affecting driver genes in 17,644 cancer exomes and genomes. We find that oncogenes may simultaneously exhibit signatures of positive selection and also negative selection in different gene segments, suggesting a method to identify additional tumor types where an oncogene is a driver or a vulnerability. Next, we characterize the landscape of CNA-dependent selection effects, revealing a general trend of increased positive selection on oncogene mutations not only upon CNA gains but also upon CNA deletions. Similarly, we observe a positive interaction between mutations and CNA gains in tumor suppressor genes. Thus, two-hit events involving point mutations and CNA are universally observed regardless of the type of CNA and may signal new therapeutic opportunities. An analysis with focus on the somatic CNA two-hit events can help identify additional driver genes relevant to a tumor type. By a global inference of point mutation and CNA selection signatures and interactions thereof across genes and tissues, we identify 9 evolutionary archetypes of driver genes, representing different mechanisms of (in)activation by genetic alterations.
Affiliation(s)
- Elizaveta Besedina
  - Institute for Research in Biomedicine (IRB Barcelona), 08028, Barcelona, Spain
- Fran Supek
  - Institute for Research in Biomedicine (IRB Barcelona), 08028, Barcelona, Spain
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, 2200, Copenhagen, Denmark
  - Catalan Institution for Research and Advanced Studies (ICREA), 08010, Barcelona, Spain
3. Popitsch N, Neumann T, von Haeseler A, Ameres SL. Splice_sim: a nucleotide conversion-enabled RNA-seq simulation and evaluation framework. Genome Biol 2024; 25:166. [PMID: 38918865 DOI: 10.1186/s13059-024-03313-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Accepted: 06/17/2024] [Indexed: 06/27/2024]
Abstract
Nucleotide conversion RNA sequencing techniques interrogate chemical RNA modifications in cellular transcripts, resulting in mismatch-containing reads. Biases in mapping the resulting reads to reference genomes remain poorly understood. We present splice_sim, a splice-aware RNA-seq simulation and evaluation pipeline that introduces user-defined nucleotide conversions at set frequencies, creates mixture models of converted and unconverted reads, and calculates mapping accuracies per genomic annotation. By simulating nucleotide conversion RNA-seq datasets under realistic experimental conditions, including metabolic RNA labeling and RNA bisulfite sequencing, we measure mapping accuracies of state-of-the-art spliced-read mappers for mouse and human transcripts and derive strategies to prevent biases in the data interpretation.
Affiliation(s)
- Niko Popitsch
  - Max Perutz Labs, Vienna Biocenter Campus (VBC), Vienna, A-1030, Austria
  - Max Perutz Labs, Department of Biochemistry and Cell Biology, University of Vienna, Vienna, A-1030, Austria
- Tobias Neumann
  - Quantro Therapeutics, Vienna, A-1030, Austria
  - Vienna Biocenter PhD Program, a Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, A-1030, Austria
  - Center for Integrative Bioinformatics Vienna, Max Perutz Labs, University of Vienna, Medical University of Vienna, Vienna, A-1030, Austria
- Arndt von Haeseler
  - Center for Integrative Bioinformatics Vienna, Max Perutz Labs, University of Vienna, Medical University of Vienna, Vienna, A-1030, Austria
  - Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, A-1090, Austria
- Stefan L Ameres
  - Max Perutz Labs, Vienna Biocenter Campus (VBC), Vienna, A-1030, Austria
  - Max Perutz Labs, Department of Biochemistry and Cell Biology, University of Vienna, Vienna, A-1030, Austria
  - Institute of Molecular Biotechnology, IMBA, Vienna Biocenter Campus (VBC), Vienna, A-1030, Austria
4. Salvadores M, Supek F. Cell cycle gene alterations associate with a redistribution of mutation risk across chromosomal domains in human cancers. Nat Cancer 2024; 5:330-346. [PMID: 38200245 DOI: 10.1038/s43018-023-00707-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Accepted: 12/11/2023] [Indexed: 01/12/2024]
Abstract
Mutations in human cells exhibit increased burden in heterochromatic, late DNA replication time (RT) chromosomal domains, with variation in mutation rates between tissues mirroring variation in heterochromatin and RT. We observed that regional mutation risk further varies between individual tumors in a manner independent of cell type, identifying three signatures of domain-scale mutagenesis in >4,000 tumor genomes. The major signature reflects remodeling of heterochromatin and of the RT program domains seen across tumors, tissues and cultured cells, and is robustly linked with higher expression of cell proliferation genes. Regional mutagenesis is associated with loss of activity of the tumor-suppressor genes RB1 and TP53, consistent with their roles in cell cycle control, with distinct mutational patterns generated by the two genes. Loss of regional heterogeneity in mutagenesis is associated with deficiencies in various DNA repair pathways. These mutation risk redistribution processes modify the mutation supply towards important genes, diverting the course of somatic evolution.
Affiliation(s)
- Marina Salvadores
  - Genome Data Science, Institute for Research in Biomedicine (IRB Barcelona), Barcelona Institute of Science and Technology, Barcelona, Spain
- Fran Supek
  - Genome Data Science, Institute for Research in Biomedicine (IRB Barcelona), Barcelona Institute of Science and Technology, Barcelona, Spain
  - Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
5. Hui R, Scheib CL, D’Atanasio E, Inskip SA, Cessford C, Biagini SA, Wohns AW, Ali MQ, Griffith SJ, Solnik A, Niinemäe H, Ge XJ, Rose AK, Beneker O, O’Connell TC, Robb JE, Kivisild T. Genetic history of Cambridgeshire before and after the Black Death. Sci Adv 2024; 10:eadi5903. [PMID: 38232165 PMCID: PMC10793959 DOI: 10.1126/sciadv.adi5903] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/05/2023] [Accepted: 12/14/2023] [Indexed: 01/19/2024]
Abstract
The extent of the devastation of the Black Death pandemic (1346-1353) on European populations is known from documentary sources and its bacterial source illuminated by studies of ancient pathogen DNA. What has remained less understood is the effect of the pandemic on human mobility and genetic diversity at the local scale. Here, we report 275 ancient genomes, including 109 with coverage >0.1×, from later medieval and postmedieval Cambridgeshire of individuals buried before and after the Black Death. Consistent with the function of the institutions, we found a lack of close relatives among the friars and the inmates of the hospital in contrast to their abundance in general urban and rural parish communities. While we detect long-term shifts in local genetic ancestry in Cambridgeshire, we find no evidence of major changes in genetic ancestry nor higher differentiation of immune loci between cohorts living before and after the Black Death.
Affiliation(s)
- Ruoyun Hui
  - Alan Turing Institute, London, UK
  - McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, UK
- Christiana L. Scheib
  - McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, UK
  - Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu, Estonia
  - St John’s College, University of Cambridge, Cambridge, UK
- Sarah A. Inskip
  - McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, UK
  - School of Archaeology and Ancient History, University of Leicester, Leicester, UK
- Craig Cessford
  - McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, UK
  - Cambridge Archaeological Unit, Department of Archaeology, University of Cambridge, Cambridge, UK
- Anthony W. Wohns
  - School of Medicine, Stanford University, Stanford, CA, USA
  - Department of Genetics and Biology, Stanford University, Stanford, CA, USA
- Samuel J. Griffith
  - Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Anu Solnik
  - Core Facility, Institute of Genomics, University of Tartu, Tartu, Estonia
- Helja Niinemäe
  - Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Xiangyu Jack Ge
  - Wellcome Genome Campus, Wellcome Sanger Institute, Hinxton, UK
- Alice K. Rose
  - McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, UK
  - Department of Archaeology, University of Durham, Durham, UK
- Owyn Beneker
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Tamsin C. O’Connell
  - McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, UK
- John E. Robb
  - Department of Archaeology, University of Cambridge, Cambridge, UK
- Toomas Kivisild
  - McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, UK
  - Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu, Estonia
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
6. Mandal AK. Recent insights into crosstalk between genetic parasites and their host genome. Brief Funct Genomics 2024; 23:15-23. [PMID: 36307128 PMCID: PMC10799329 DOI: 10.1093/bfgp/elac032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Revised: 09/14/2022] [Accepted: 09/21/2022] [Indexed: 01/21/2024]
Abstract
The bulk of higher order organismal genomes is comprised of transposable element (TE) copies, i.e. genetic parasites. The host-parasite relation is multi-faceted, varying across genomic region (genic versus intergenic), life-cycle stages, tissue-type and of course in health versus pathological state. The reach of functional genomics though, in investigating genotype-to-phenotype relations, has been limited when TEs are involved. The aim of this review is to highlight recent progress made in understanding how TE origin biochemical activity interacts with the central dogma stages of the host genome. Such interaction can also bring about modulation of the immune context and this could have important repercussions in disease state where immunity has a role to play. Thus, the review is to instigate ideas and action points around identifying evolutionary adaptations that the host genome and the genetic parasite have evolved and why they could be relevant.
Affiliation(s)
- Amit K Mandal
  - Corresponding author: A.K. Mandal, Nuffield Department of Surgical Sciences (NDS), University of Oxford, Old Road Campus Research building (ORCRB), Oxford OX3 7DQ, UK. Tel: +44 (0)1865 617123; Fax: +44 (0)1865 768876; E-mail:
7. Pečnerová P, Lord E, Garcia-Erill G, Hanghøj K, Rasmussen MS, Meisner J, Liu X, van der Valk T, Santander CG, Quinn L, Lin L, Liu S, Carøe C, Dalerum F, Götherström A, Måsviken J, Vartanyan S, Raundrup K, Al-Chaer A, Rasmussen L, Hvilsom C, Heide-Jørgensen MP, Sinding MHS, Aastrup P, Van Coeverden de Groot PJ, Schmidt NM, Albrechtsen A, Dalén L, Heller R, Moltke I, Siegismund HR. Population genomics of the muskox' resilience in the near absence of genetic variation. Mol Ecol 2024; 33:e17205. [PMID: 37971141 DOI: 10.1111/mec.17205] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/23/2023] [Revised: 10/07/2023] [Accepted: 11/01/2023] [Indexed: 11/19/2023]
Abstract
Genomic studies of species threatened by extinction are providing crucial information about evolutionary mechanisms and genetic consequences of population declines and bottlenecks. However, to understand how species avoid the extinction vortex, insights can be drawn by studying species that thrive despite past declines. Here, we studied the population genomics of the muskox (Ovibos moschatus), an Ice Age relict that was at the brink of extinction for thousands of years at the end of the Pleistocene yet appears to be thriving today. We analysed 108 whole genomes, including present-day individuals representing the current native range of both muskox subspecies, the white-faced and the barren-ground muskox (O. moschatus wardi and O. moschatus moschatus) and a ~21,000-year-old ancient individual from Siberia. We found that the muskox' demographic history was profoundly shaped by past climate changes and post-glacial re-colonizations. In particular, the white-faced muskox has the lowest genome-wide heterozygosity recorded in an ungulate. Yet, there is no evidence of inbreeding depression in native muskox populations. We hypothesize that this can be explained by the effect of long-term gradual population declines that allowed for purging of strongly deleterious mutations. This study provides insights into how species with a history of population bottlenecks, small population sizes and low genetic diversity survive against all odds.
Affiliation(s)
- Patrícia Pečnerová
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Copenhagen Zoo, Frederiksberg, Denmark
- Edana Lord
  - Centre for Palaeogenetics, Stockholm, Sweden
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
  - Department of Zoology, Stockholm University, Stockholm, Sweden
- Genís Garcia-Erill
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kristian Hanghøj
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Malthe Sebro Rasmussen
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Jonas Meisner
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Xiaodong Liu
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Tom van der Valk
  - Centre for Palaeogenetics, Stockholm, Sweden
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
- Cindy G Santander
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Liam Quinn
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
- Long Lin
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Shanlin Liu
  - Department of Entomology, College of Plant Protection, China Agricultural University, Beijing, China
  - The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Christian Carøe
  - The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Fredrik Dalerum
  - Department of Zoology, Stockholm University, Stockholm, Sweden
  - Biodiversity Research Institute (CSIC-UO-PA), Mieres, Spain
  - Department of Zoology and Entomology, Mammal Research Institute, University of Pretoria, Hatfield, South Africa
- Anders Götherström
  - Centre for Palaeogenetics, Stockholm, Sweden
  - Archaeological Research Laboratory, Department of Archaeology and Classical Studies, Stockholm University, Stockholm, Sweden
- Johannes Måsviken
  - Centre for Palaeogenetics, Stockholm, Sweden
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
  - Department of Zoology, Stockholm University, Stockholm, Sweden
- Sergey Vartanyan
  - North-East Interdisciplinary Scientific Research Institute N.A.N.A. Shilo, Russian Academy of Sciences, Magadan, Russia
- Amal Al-Chaer
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Linett Rasmussen
  - Copenhagen Zoo, Frederiksberg, Denmark
  - The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Mads Peter Heide-Jørgensen
  - Greenland Institute of Natural Resources, Nuuk, Greenland
  - Greenland Institute of Natural Resources, Copenhagen, Denmark
- Mikkel-Holger S Sinding
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Greenland Institute of Natural Resources, Nuuk, Greenland
- Peter Aastrup
  - Department of Ecoscience, Aarhus University, Roskilde, Denmark
  - Arctic Research Centre, Aarhus University, Aarhus, Denmark
- Niels Martin Schmidt
  - Department of Ecoscience, Aarhus University, Roskilde, Denmark
  - Arctic Research Centre, Aarhus University, Aarhus, Denmark
- Anders Albrechtsen
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Love Dalén
  - Centre for Palaeogenetics, Stockholm, Sweden
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
  - Department of Zoology, Stockholm University, Stockholm, Sweden
- Rasmus Heller
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Ida Moltke
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Hans Redlef Siegismund
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
8. Maeng JH, Jang HJ, Du AY, Tzeng SC, Wang T. Using long-read CAGE sequencing to profile cryptic-promoter-derived transcripts and their contribution to the immunopeptidome. Genome Res 2023; 33:2143-2155. [PMID: 38065624 PMCID: PMC10760525 DOI: 10.1101/gr.277061.122] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/19/2023] [Accepted: 11/13/2023] [Indexed: 01/04/2024]
Abstract
Recent studies have shown that the noncoding genome can produce unannotated proteins as antigens that induce immune response. One major source of this activity is the aberrant epigenetic reactivation of transposable elements (TEs). In tumors, TEs often provide cryptic or alternate promoters, which can generate transcripts that encode tumor-specific unannotated proteins. Thus, TE-derived transcripts (TE transcripts) have the potential to produce tumor-specific, but recurrent, antigens shared among many tumors. Identification of TE-derived tumor antigens holds the promise to improve cancer immunotherapy approaches; however, current genomics and computational tools are not optimized for their detection. Here we combined CAGE technology with full-length long-read transcriptome sequencing (long-read CAGE, or LRCAGE) and developed a suite of computational tools to significantly improve immunopeptidome detection by incorporating TE and other tumor transcripts into the proteome database. By applying our methods to human lung cancer cell line H1299 data, we show that long-read technology significantly improves mapping of promoters with low mappability scores and that LRCAGE guarantees accurate construction of uncharacterized 5' transcript structure. Augmenting a reference proteome database with newly characterized transcripts enabled us to detect noncanonical antigens from HLA-pulldown LC-MS/MS data. Lastly, we show that epigenetic treatment increased the number of noncanonical antigens, particularly those encoded by TE transcripts, which might expand the pool of targetable antigens for cancers with low mutational burden.
Affiliation(s)
- Ju Heon Maeng
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- H Josh Jang
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Alan Y Du
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Shin-Cheng Tzeng
  - Donald Danforth Plant Science Center, St. Louis, Missouri 63132, USA
- Ting Wang
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108, USA
9. Song S, Koh Y, Kim S, Lee SM, Kim HU, Ko JM, Lee SH, Yoon SS, Park S. Systematic analysis of Mendelian disease-associated gene variants reveals new classes of cancer-predisposing genes. Genome Med 2023; 15:107. [PMID: 38143269 PMCID: PMC10749499 DOI: 10.1186/s13073-023-01252-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 10/30/2023] [Indexed: 12/26/2023]
Abstract
BACKGROUND: Despite the acceleration of somatic driver gene discovery facilitated by recent large-scale tumor sequencing data, the contribution of inherited variants remains largely unexplored, primarily focusing on previously known cancer predisposition genes (CPGs) due to the low statistical power associated with detecting rare pathogenic variant-phenotype associations.
METHODS: Here, we introduce a generalized log-regression model to measure the excess of pathogenic variants within genes in cancer patients compared to control samples. It aims to measure gene-level cancer risk enrichment by collapsing rare pathogenic variants after controlling the population differences across samples.
RESULTS: In this study, we investigate whether pathogenic variants in Mendelian disease-associated genes (OMIM genes) are enriched in cancer patients compared to controls. Utilizing data from PCAWG and the 1,000 Genomes Project, we identify 103 OMIM genes demonstrating significant enrichment of pathogenic variants in cancer samples (FDR 20%). Through an integrative approach considering three distinct properties, we classify these CPG-like OMIM genes into four clusters, indicating potential diverse mechanisms underlying tumor progression. Further, we explore the function of PAH (a key metabolic enzyme associated with Phenylketonuria), the gene exhibiting the highest prevalence of pathogenic variants in a pan-cancer (1.8%) compared to controls (0.6%).
CONCLUSIONS: Our findings suggest a possible cancer progression mechanism through metabolic profile alterations. Overall, our data indicates that pathogenic OMIM gene variants contribute to cancer progression and introduces new CPG classifications potentially underpinning diverse tumorigenesis mechanisms.
Affiliation(s)
- Seulki Song
  - Cancer Research Institute, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Structural Biology Program, Centro Nacional de Investigaciones Oncológicas (CNIO), Calle de Melchor Fernández Almagro, 3, Madrid, 28029, Spain
- Youngil Koh
  - Cancer Research Institute, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Biomedical Research Institute and Departments of Internal Medicine, Seoul National University Hospital, Seoul, 03080, Republic of Korea
- Seokhyeon Kim
  - Cancer Research Institute, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
- Sang Mi Lee
  - Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
- Hyun Uk Kim
  - Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
- Jung Min Ko
  - Department of Pediatrics, Seoul National University Children's Hospital, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
- Se-Hoon Lee
  - Division of Hematology-Oncology, Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, 06351, Republic of Korea
- Sung-Soo Yoon
  - Cancer Research Institute, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
  - Biomedical Research Institute and Departments of Internal Medicine, Seoul National University Hospital, Seoul, 03080, Republic of Korea
- Solip Park
  - Structural Biology Program, Centro Nacional de Investigaciones Oncológicas (CNIO), Calle de Melchor Fernández Almagro, 3, Madrid, 28029, Spain
10. Queitsch K, Moore TW, O'Connell BL, Nichols RV, Muschler JL, Keith D, Lopez C, Sears RC, Mills GB, Yardımcı GG, Adey AC. Accessible high-throughput single-cell whole-genome sequencing with paired chromatin accessibility. Cell Rep Methods 2023; 3:100625. [PMID: 37918402 PMCID: PMC10694488 DOI: 10.1016/j.crmeth.2023.100625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 08/29/2023] [Accepted: 10/11/2023] [Indexed: 11/04/2023]
Abstract
Single-cell whole-genome sequencing (scWGS) enables the assessment of genome-level molecular differences between individual cells with particular relevance to genetically diverse systems like solid tumors. The application of scWGS was limited due to a dearth of accessible platforms capable of producing high-throughput profiles. We present a technique that leverages nucleosome disruption methodologies with the widely adopted 10× Genomics ATAC-seq workflow to produce scWGS profiles for high-throughput copy-number analysis without new equipment or custom reagents. We further demonstrate the use of commercially available indexed transposase complexes from ScaleBio for sample multiplexing, reducing the per-sample preparation costs. Finally, we demonstrate that sequential indexed tagmentation with an intervening nucleosome disruption step allows for the generation of both ATAC and WGS data from the same cell, producing comparable data to the unimodal assays. By exclusively utilizing accessible commercial reagents, we anticipate that these scWGS and scWGS+ATAC methods can be broadly adopted by the research community.
Affiliation(s)
- Konstantin Queitsch
  - Department of Molecular & Medical Genetics, Oregon Health & Science University, Portland, OR, USA
- Travis W Moore
  - Cancer Early Detection Advanced Research Center, Oregon Health & Science University, Portland, OR, USA
  - Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
- Brendan L O'Connell
  - Department of Molecular & Medical Genetics, Oregon Health & Science University, Portland, OR, USA
  - Cancer Early Detection Advanced Research Center, Oregon Health & Science University, Portland, OR, USA
  - Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
- Ruth V Nichols
  - Department of Molecular & Medical Genetics, Oregon Health & Science University, Portland, OR, USA
- John L Muschler
  - Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
  - Brenden-Colson Center for Pancreatic Care, Oregon Health & Science University, Portland, OR, USA
- Dove Keith
  - Brenden-Colson Center for Pancreatic Care, Oregon Health & Science University, Portland, OR, USA
- Charles Lopez
  - Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
- Rosalie C Sears
  - Department of Molecular & Medical Genetics, Oregon Health & Science University, Portland, OR, USA
  - Cancer Early Detection Advanced Research Center, Oregon Health & Science University, Portland, OR, USA
  - Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
  - Brenden-Colson Center for Pancreatic Care, Oregon Health & Science University, Portland, OR, USA
- Gordon B Mills
  - Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
  - Department of Cell, Developmental and Cancer Biology, Oregon Health & Science University, Portland, OR, USA
- Galip Gürkan Yardımcı
  - Cancer Early Detection Advanced Research Center, Oregon Health & Science University, Portland, OR, USA
  - Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
- Andrew C Adey
  - Department of Molecular & Medical Genetics, Oregon Health & Science University, Portland, OR, USA
  - Cancer Early Detection Advanced Research Center, Oregon Health & Science University, Portland, OR, USA
  - Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
  - Knight Cardiovascular Institute, Oregon Health & Science University, Portland, OR, USA
| |
|
11
|
Koh Y, Kim H, Joo SY, Song S, Choi YH, Kim HR, Moon B, Byun J, Hong J, Shin DY, Park S, Lee KH, Lee KT, Lee JK, Park D, Lee SH, Jang JY, Lee H, Kim JA, Yoon SS, Park JK. Genetic assessment of pathogenic germline alterations in lysosomal genes among Asian patients with pancreatic ductal adenocarcinoma. J Transl Med 2023; 21:730. [PMID: 37848935 PMCID: PMC10580633 DOI: 10.1186/s12967-023-04549-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Accepted: 09/20/2023] [Indexed: 10/19/2023] Open
Abstract
BACKGROUND Lysosomes are closely linked to autophagic activity, which plays a vital role in pancreatic ductal adenocarcinoma (PDAC) biology. The survival of PDAC patients is still poor, and the identification of novel genetic factors for prognosis and treatment is urgently needed to prevent PDAC-related deaths. This study investigated the germline variants related to lysosomal dysfunction in patients with PDAC and analyzed whether they contribute to the development of PDAC. METHODS The germline putative pathogenic variants (PPVs) in genes involved in lysosomal storage disease (LSD) were compared between patients with PDAC (n = 418) and healthy controls (n = 845) using targeted panel and whole-exome sequencing. Furthermore, pancreatic organoids from wild-type and KrasG12D mice were used to evaluate the effect of lysosomal dysfunction on PDAC development. RNA sequencing (RNA-seq) analysis was performed with established PDAC patient-derived organoids (PDOs) according to the PPV status. RESULTS The frequency of PPVs in LSD-related genes was higher in patients with PDAC than in healthy controls (8.13 vs. 4.26%, Log2 OR = 1.65, P = 3.08 × 10⁻³). The PPV carriers of LSD-related genes with PDAC were significantly younger than the non-carriers (mean age 61.5 vs. 65.3 years, P = 0.031). We further studied a variant of the lysosomal enzyme, galactosylceramidase (GALC), which was the most frequently detected LSD variant in our cohort. Autophagolysosomal activity was hampered when GALC was downregulated, which was accompanied by paradoxically elevated autophagic flux. Furthermore, the number of proliferating Ki-67+ cells increased significantly in pancreatic organoids derived from Galc knockout KrasG12D mice. Moreover, GALC PPV carriers tended to show drug resistance in both PDAC cell lines and PDAC PDOs, and RNA-seq analysis revealed that various metabolism and gene repair pathways were upregulated in PDAC PDOs harboring a GALC variant.
CONCLUSIONS Genetically defined lysosomal dysfunction is frequently observed in patients with young-onset PDAC. This might contribute to PDAC development by altering metabolism and impairing autophagolysosomal activity, which could potentially have implications for therapeutic applications in PDAC.
Affiliation(s)
- Youngil Koh
- Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Hyemin Kim
- Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
| | - So Young Joo
- Department of Biological Sciences, Institute of Molecular Biology and Genetics, Seoul National University, Seoul, Republic of Korea
| | - Seulki Song
- Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Young Hoon Choi
- Department of Internal Medicine, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
| | - Hyung Rae Kim
- Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
| | - Byul Moon
- Aging Convergence Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea
| | - Jamin Byun
- Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Junshik Hong
- Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Dong-Yeop Shin
- Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Solip Park
- Structural Biology Department, Centro Nacional de Investigaciones Oncológicas (CNIO), Madrid, Spain
| | - Kwang Hyuck Lee
- Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
| | - Kyu Taek Lee
- Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
| | - Jong Kyun Lee
- Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
| | - Daechan Park
- Department of Molecular Science and Technology, Department of Biological Sciences, Ajou University, Suwon, Republic of Korea
| | - Se-Hoon Lee
- Department of Hematology/Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
| | - Jin-Young Jang
- Department of Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea.
| | - Hyunsook Lee
- Department of Biological Sciences, Institute of Molecular Biology and Genetics, Seoul National University, Seoul, Republic of Korea.
| | - Jung-Ae Kim
- Aging Convergence Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Republic of Korea.
- Department of Functional Genomics, KRIBB School of Bioscience, University of Science and Technology, Daejeon, Republic of Korea.
| | - Sung-Soo Yoon
- Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea.
- Cancer Research Institute, Seoul National University School of Medicine, Seoul, Republic of Korea.
| | - Joo Kyung Park
- Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea.
- Department of Health Sciences and Technology, SAIHST, Sungkyunkwan University, Seoul, Republic of Korea.
| |
|
12
|
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor Decomposition-based Feature Extraction and Classification to Detect Natural Selection from Genomic Data. Mol Biol Evol 2023; 40:msad216. [PMID: 37772983 PMCID: PMC10581699 DOI: 10.1093/molbev/msad216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 08/10/2023] [Accepted: 09/14/2023] [Indexed: 09/30/2023] Open
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under nonconvex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
Affiliation(s)
- Md Ruhul Amin
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Mahmudul Hasan
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Sandipan Paul Arnab
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| |
|
13
|
Hecht V, Dong K, Rajesh S, Shpilker P, Wekhande S, Shoresh N. Analyzing histone ChIP-seq data with a bin-based probability of being signal. PLoS Comput Biol 2023; 19:e1011568. [PMID: 37862349 PMCID: PMC10619820 DOI: 10.1371/journal.pcbi.1011568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Revised: 11/01/2023] [Accepted: 10/02/2023] [Indexed: 10/22/2023] Open
Abstract
Histone ChIP-seq is one of the primary methods for charting the cellular epigenomic landscape, the components of which play a critical regulatory role in gene expression. Analyzing the activity of regulatory elements across datasets and cell types can be challenging due to shifting peak positions and normalization artifacts resulting from, for example, differing read depths, ChIP efficiencies, and target sizes. Moreover, broad regions of enrichment seen in repressive histone marks often evade detection by commonly used peak callers. Here, we present a simple and versatile method for identifying enriched regions in ChIP-seq data that relies on estimating a gamma distribution fit to non-overlapping 5 kB genomic bins to establish a global background. We use this distribution to assign a probability of being signal (PBS) between zero and one to each 5 kB bin. This approach, while lower in resolution than typical peak-calling methods, provides a straightforward way to identify enriched regions and compare enrichments among multiple datasets, by transforming the data to values that are universally normalized and can be readily visualized and integrated with downstream analysis methods. We demonstrate applications of PBS for both broad and narrow histone marks, and provide several illustrations of biological insights which can be gleaned by integrating PBS scores with downstream data types.
Affiliation(s)
- Vivian Hecht
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Kevin Dong
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Sreshtaa Rajesh
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Polina Shpilker
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Siddarth Wekhande
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Noam Shoresh
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| |
|
14
|
Gong M, He Y, Wang M, Zhang Y, Ding C. Interpretable single-cell transcription factor prediction based on deep learning with attention mechanism. Comput Biol Chem 2023; 106:107923. [PMID: 37598467 DOI: 10.1016/j.compbiolchem.2023.107923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 07/01/2023] [Accepted: 07/12/2023] [Indexed: 08/22/2023]
Abstract
Predicting transcription factor binding sites (TFBSs) across the whole genome is essential for exploring the rules of gene transcription control. Although many deep learning methods for predicting TFBSs have been proposed, prediction from single-cell ATAC-seq data with embedded attention mechanisms still needs improvement. To this end, we present IscPAM, an interpretable method based on deep learning with an attention mechanism to predict transcription factor binding in single cells. Our model adopts a convolutional neural network to extract data features and optimize the pre-trained model. In particular, the model achieves faster training and prediction due to the embedded attention mechanism. For datasets, we use ATAC-seq, ChIP-seq, and DNA sequence data for the pre-trained model, and single-cell ATAC-seq data to predict the TF binding graph in a given cell. We verify the interpretability of the model through ablation experiments and sensitivity analysis. IscPAM can efficiently predict the combination of whole genome transcription factors in single cells and study cellular heterogeneity through chromatin accessibility of related diseases.
Affiliation(s)
- Meiqin Gong
- West China Second University Hospital, Sichuan University, Chengdu 610041, China
| | - Yuchen He
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Maocheng Wang
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Yongqing Zhang
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Chunli Ding
- Sichuan Institute of Computer Sciences, Chengdu 610041, China.
| |
|
15
|
Davidson AL, Dressel U, Norris S, Canson DM, Glubb DM, Fortuno C, Hollway GE, Parsons MT, Vidgen ME, Holmes O, Koufariotis LT, Lakis V, Leonard C, Wood S, Xu Q, McCart Reed AE, Pickett HA, Al-Shinnag MK, Austin RL, Burke J, Cops EJ, Nichols CB, Goodwin A, Harris MT, Higgins MJ, Ip EL, Kiraly-Borri C, Lau C, Mansour JL, Millward MW, Monnik MJ, Pachter NS, Ragunathan A, Susman RD, Townshend SL, Trainer AH, Troth SL, Tucker KM, Wallis MJ, Walsh M, Williams RA, Winship IM, Newell F, Tudini E, Pearson JV, Poplawski NK, Mar Fan HG, James PA, Spurdle AB, Waddell N, Ward RL. The clinical utility and costs of whole-genome sequencing to detect cancer susceptibility variants-a multi-site prospective cohort study. Genome Med 2023; 15:74. [PMID: 37723522 PMCID: PMC10507925 DOI: 10.1186/s13073-023-01223-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Accepted: 08/18/2023] [Indexed: 09/20/2023] Open
Abstract
BACKGROUND Many families and individuals do not meet criteria for a known hereditary cancer syndrome but display unusual clusters of cancers. These families may carry pathogenic variants in cancer predisposition genes and be at higher risk for developing cancer. METHODS This multi-centre prospective study recruited 195 cancer-affected participants suspected to have a hereditary cancer syndrome for whom previous clinical targeted genetic testing was either not informative or not available. To identify pathogenic disease-causing variants explaining participant presentation, germline whole-genome sequencing (WGS) and a comprehensive cancer virtual gene panel analysis were undertaken. RESULTS Pathogenic variants consistent with the presenting cancer(s) were identified in 5.1% (10/195) of participants and pathogenic variants considered secondary findings with potential risk management implications were identified in another 9.7% (19/195) of participants. Health economic analysis estimated the marginal cost per case with an actionable variant was significantly lower for upfront WGS with virtual panel ($8744AUD) compared to standard testing followed by WGS ($24,894AUD). Financial analysis suggests that national adoption of diagnostic WGS testing would require a ninefold increase in government annual expenditure compared to conventional testing. CONCLUSIONS These findings make a case for replacing conventional testing with WGS to deliver clinically important benefits for cancer patients and families. The uptake of such an approach will depend on the perspectives of different payers on affordability.
Affiliation(s)
- Aimee L Davidson
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
- Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia
| | - Uwe Dressel
- Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia
| | - Sarah Norris
- Faculty of Medicine and Health, University of Sydney, L2.22 The Quadrangle (A14), Sydney, NSW, 2006, Australia
| | - Daffodil M Canson
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
- Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia
| | - Dylan M Glubb
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
- Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia
| | - Cristina Fortuno
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
- Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia
| | - Georgina E Hollway
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
| | - Michael T Parsons
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
| | - Miranda E Vidgen
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
- Australian Genomics, Melbourne, VIC, Australia
| | - Oliver Holmes
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
| | - Lambros T Koufariotis
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
| | - Vanessa Lakis
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
| | - Conrad Leonard
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
| | - Scott Wood
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
| | - Qinying Xu
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
| | - Amy E McCart Reed
- Centre for Clinical Research, University of Queensland, Brisbane, QLD, Australia
| | - Hilda A Pickett
- Children's Medical Research Institute, University of Sydney, Westmead, NSW, Australia
| | - Mohammad K Al-Shinnag
- Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia
- Genetic Health Queensland, Royal Brisbane and Women's Hospital, Herston, QLD, Australia
| | - Rachel L Austin
- Australian Genomics, Melbourne, VIC, Australia
- Genetic Health Queensland, Royal Brisbane and Women's Hospital, Herston, QLD, Australia
| | - Jo Burke
- Tasmanian Clinical Genetics Service, Royal Hobart Hospital, Hobart, TAS, Australia
| | - Elisa J Cops
- Parkville Familial Cancer Centre, Peter MacCallum Cancer Centre and Royal Melbourne Hospital, Melbourne, VIC, Australia
| | - Cassandra B Nichols
- Genetic Services of Western Australia, King Edward Memorial Hospital, Subiaco, WA, Australia
| | - Annabel Goodwin
- Cancer Genetics Department, Royal Prince Alfred Hospital, Sydney, NSW, Australia
- University of Sydney, Sydney, NSW, Australia
| | - Marion T Harris
- Monash Health Familial Cancer, Monash Health, Melbourne, VIC, Australia
- Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, VIC, Australia
| | - Megan J Higgins
- Genetic Health Queensland, Royal Brisbane and Women's Hospital, Herston, QLD, Australia
| | - Emilia L Ip
- Cancer Genetics, Liverpool Hospital, Sydney, NSW, Australia
| | | | - Chiyan Lau
- Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia
- Genomics, Pathology Queensland, Brisbane, QLD, Australia
| | - Julia L Mansour
- Tasmanian Clinical Genetics Service, Royal Hobart Hospital, Hobart, TAS, Australia
| | - Michael W Millward
- Tasmanian Clinical Genetics Service, Royal Hobart Hospital, Hobart, TAS, Australia
| | - Melissa J Monnik
- Adult Genetics Unit, Royal Adelaide Hospital, Adelaide, SA, Australia
| | - Nicholas S Pachter
- Genetic Services of Western Australia, King Edward Memorial Hospital, Subiaco, WA, Australia
- Faculty of Health and Medical Sciences, University of Western Australia, Perth, WA, Australia
| | - Abiramy Ragunathan
- Familial Cancer Services, The Crown Princess Mary Cancer Centre, Westmead Hospital, Westmead, NSW, Australia
| | - Rachel D Susman
- Genetic Health Queensland, Royal Brisbane and Women's Hospital, Herston, QLD, Australia
| | - Sharron L Townshend
- Genetic Services of Western Australia, King Edward Memorial Hospital, Subiaco, WA, Australia
| | - Alison H Trainer
- Parkville Familial Cancer Centre, Peter MacCallum Cancer Centre and Royal Melbourne Hospital, Melbourne, VIC, Australia
- Department of Medicine, University of Melbourne, Melbourne, VIC, Australia
| | - Simon L Troth
- Genetic Health Queensland, Royal Brisbane and Women's Hospital, Herston, QLD, Australia
| | - Katherine M Tucker
- Prince of Wales Clinical School, UNSW Medicine and Health, The University of New South Wales, Sydney, NSW, Australia
- Hereditary Cancer Centre, Prince of Wales Hospital, Sydney, NSW, Australia
| | - Mathew J Wallis
- Tasmanian Clinical Genetics Service, Royal Hobart Hospital, Hobart, TAS, Australia
- School of Medicine and Menzies Institute for Medical Research, University of Tasmania, Hobart, TAS, Australia
| | - Maie Walsh
- Parkville Familial Cancer Centre, Peter MacCallum Cancer Centre and Royal Melbourne Hospital, Melbourne, VIC, Australia
| | - Rachel A Williams
- Prince of Wales Clinical School, UNSW Medicine and Health, The University of New South Wales, Sydney, NSW, Australia
- Hereditary Cancer Centre, Prince of Wales Hospital, Sydney, NSW, Australia
| | - Ingrid M Winship
- Department of Medicine, University of Melbourne, Melbourne, VIC, Australia
- Genomic Medicine and Familial Cancer Clinic, Royal Melbourne Hospital, Melbourne, VIC, Australia
| | - Felicity Newell
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
| | - Emma Tudini
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
- Australian Genomics, Melbourne, VIC, Australia
| | - John V Pearson
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
| | - Nicola K Poplawski
- Adult Genetics Unit, Royal Adelaide Hospital, Adelaide, SA, Australia
- Adelaide Medical School, Faculty of Health and Medical Sciences, University of Adelaide, Adelaide, SA, Australia
| | - Helen G Mar Fan
- Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia
- Genetic Health Queensland, Royal Brisbane and Women's Hospital, Herston, QLD, Australia
| | - Paul A James
- Parkville Familial Cancer Centre, Peter MacCallum Cancer Centre and Royal Melbourne Hospital, Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, VIC, Australia
| | - Amanda B Spurdle
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia.
- Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia.
| | - Nicola Waddell
- QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Brisbane, QLD, Australia
- Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia
| | - Robyn L Ward
- Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia.
- Faculty of Medicine and Health, University of Sydney, L2.22 The Quadrangle (A14), Sydney, NSW, 2006, Australia.
| |
|
16
|
Morrissey A, Shi J, James DQ, Mahony S. Allo: Accurate allocation of multi-mapped reads enables regulatory element analysis at repeats. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.09.12.556916. [PMID: 37745557 PMCID: PMC10515862 DOI: 10.1101/2023.09.12.556916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]
Abstract
Transposable elements (TEs) and other repetitive regions have been shown to contain gene regulatory elements, including transcription factor binding sites. Unfortunately, regulatory elements harbored by repeats have proven difficult to characterize using short-read sequencing assays such as ChIP-seq or ATAC-seq. Most regulatory genomics analysis pipelines discard "multi-mapped" reads that align equally well to multiple genomic locations. Since multi-mapped reads arise predominantly from repeats, current analysis pipelines fail to detect a substantial portion of regulatory events that occur in repetitive regions. To address this shortcoming, we developed Allo, a new approach to allocate multi-mapped reads in an efficient, accurate, and user-friendly manner. Allo combines probabilistic mapping of multi-mapped reads with a convolutional neural network that recognizes the read distribution features of potential peaks, offering enhanced accuracy in multi-mapping read assignment. Allo also provides read-level output in the form of a corrected alignment file, making it compatible with existing regulatory genomics analysis pipelines and downstream peak-finders. In a demonstration application on CTCF ChIP-seq data, we show that Allo results in the discovery of thousands of new CTCF peaks. Many of these peaks contain the expected cognate motif and/or serve as TAD boundaries. We additionally apply Allo to a diverse collection of ENCODE ChIP-seq datasets, resulting in multiple previously unidentified interactions between transcription factors and repetitive element families. Finally, we show that Allo may be particularly effective in identifying ChIP-seq peaks in younger TEs, which hold evolutionary significance due to their emergence during human evolution from primates.
Affiliation(s)
- Alexis Morrissey
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
| | - Jeffrey Shi
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
| | - Daniela Q. James
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
| | - Shaun Mahony
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
| |
|
17
|
Meng S, Liu X, Zhu S, Xie P, Fang H, Pan Q, Fang K, Li F, Zhang J, Che Z, Zhang Q, Mao G, Wang Y, Hu P, Chen K, Sun F, Xie W, Luo Z, Lin C. Young LINE-1 transposon 5' UTRs marked by elongation factor ELL3 function as enhancers to regulate naïve pluripotency in embryonic stem cells. Nat Cell Biol 2023; 25:1319-1331. [PMID: 37591949 DOI: 10.1038/s41556-023-01211-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/14/2022] [Accepted: 07/19/2023] [Indexed: 08/19/2023]
Abstract
LINE-1s are the major clade of retrotransposons with autonomous retrotransposition activity. Despite the potential genotoxicity, LINE-1s are highly activated in early embryos. Here we show that a subset of young LINE-1s, L1Md_Ts, are marked by the RNA polymerase II elongation factor ELL3, and function as enhancers in mouse embryonic stem cells. ELL3 depletion dislodges the DNA hydroxymethylase TET1 and the co-repressor SIN3A from L1Md_Ts, but increases the enrichment of the Bromodomain protein BRD4, leading to loss of 5hmC, gain of H3K27ac, and upregulation of genes near L1Md_Ts. Specifically, ELL3 occupies and represses the L1Md_T-based enhancer located within Akt3, which encodes a key regulator of the AKT pathway. ELL3 is required for proper ERK activation and efficient shutdown of naïve pluripotency through inhibiting Akt3 during the naïve-primed transition. Our study reveals the enhancer function of a subset of young LINE-1s controlled by ELL3 in transcription regulation and mouse early embryo development.
Affiliation(s)
- Siyan Meng
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
  - Co-innovation Center of Neuroregeneration, Nantong University, Nantong, China
- Xiaoxu Liu
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
- Shiqi Zhu
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
- Peng Xie
  - School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Haitong Fang
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
- Qingyun Pan
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
- Ke Fang
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
- Fanfan Li
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
- Jin Zhang
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
- Zhuanzhuan Che
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
- Quanyong Zhang
  - State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming, China
  - Yunnan Key Laboratory of Primate Biomedical Research, Kunming, China
- Guangyao Mao
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
- Yan Wang
  - Department of Prenatal Diagnosis, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Care Hospital, Nanjing, China
- Ping Hu
  - Department of Prenatal Diagnosis, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Care Hospital, Nanjing, China
- Kai Chen
  - State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming, China
  - Yunnan Key Laboratory of Primate Biomedical Research, Kunming, China
- Fei Sun
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
- Wei Xie
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
- Zhuojuan Luo
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
  - Co-innovation Center of Neuroregeneration, Nantong University, Nantong, China
  - Jiangsu Provincial Key Laboratory of Critical Care Medicine, Southeast University, Nanjing, China
  - Shenzhen Research Institute, Southeast University, Shenzhen, China
- Chengqi Lin
  - Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
  - Co-innovation Center of Neuroregeneration, Nantong University, Nantong, China
  - Shenzhen Research Institute, Southeast University, Shenzhen, China
  - Jiangsu Province Hi-Tech Key Laboratory for Biomedical Research, Southeast University, Nanjing, China
18
Setton J, Hadi K, Choo ZN, Kuchin KS, Tian H, Da Cruz Paula A, Rosiene J, Selenica P, Behr J, Yao X, Deshpande A, Sigouros M, Manohar J, Nauseef JT, Mosquera JM, Elemento O, Weigelt B, Riaz N, Reis-Filho JS, Powell SN, Imieliński M. Long-molecule scars of backup DNA repair in BRCA1- and BRCA2-deficient cancers. Nature 2023; 621:129-137. [PMID: 37587346 PMCID: PMC10482687 DOI: 10.1038/s41586-023-06461-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/18/2021] [Accepted: 07/20/2023] [Indexed: 08/18/2023]
Abstract
Homologous recombination (HR) deficiency is associated with DNA rearrangements and cytogenetic aberrations [1]. Paradoxically, the types of DNA rearrangements that are specifically associated with HR-deficient cancers only minimally affect chromosomal structure [2]. Here, to address this apparent contradiction, we combined genome-graph analysis of short-read whole-genome sequencing (WGS) profiles across thousands of tumours with deep linked-read WGS of 46 BRCA1- or BRCA2-mutant breast cancers. These data revealed a distinct class of HR-deficiency-enriched rearrangements called reciprocal pairs. Linked-read WGS showed that reciprocal pairs with identical rearrangement orientations gave rise to one of two distinct chromosomal outcomes, distinguishable only with long-molecule data. Whereas one (cis) outcome corresponded to the copying and pasting of a small segment to a distant site, a second (trans) outcome was a quasi-balanced translocation or multi-megabase inversion with substantial (10 kb) duplications at each junction. We propose an HR-independent replication-restart repair mechanism to explain the full spectrum of reciprocal pair outcomes. Linked-read WGS also identified single-strand annealing as a repair pathway that is specific to BRCA2 deficiency in human cancers. Integrating these features in a classifier improved discrimination between BRCA1- and BRCA2-deficient genomes. In conclusion, our data reveal classes of rearrangements that are specific to BRCA1 or BRCA2 deficiency as a source of cytogenetic aberrations in HR-deficient cells.
Affiliation(s)
- Jeremy Setton
  - Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Kevin Hadi
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - New York Genome Center, New York, NY, USA
  - Physiology and Biophysics PhD program, Weill Cornell Medicine, New York, NY, USA
- Zi-Ning Choo
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - New York Genome Center, New York, NY, USA
  - Physiology and Biophysics PhD program, Weill Cornell Medicine, New York, NY, USA
- Katherine S Kuchin
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - New York Genome Center, New York, NY, USA
  - Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
- Huasong Tian
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - New York Genome Center, New York, NY, USA
- Arnaud Da Cruz Paula
  - Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Joel Rosiene
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - New York Genome Center, New York, NY, USA
- Pier Selenica
  - Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Julie Behr
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - New York Genome Center, New York, NY, USA
  - Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
- Xiaotong Yao
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - New York Genome Center, New York, NY, USA
  - Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
- Aditya Deshpande
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - New York Genome Center, New York, NY, USA
  - Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
- Michael Sigouros
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
- Jyothi Manohar
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
- Jones T Nauseef
  - New York Genome Center, New York, NY, USA
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
  - Division of Hematology and Medical Oncology, Department of Medicine, Weill Cornell Medicine, New York, NY, USA
  - Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
- Juan-Miguel Mosquera
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
  - Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
- Olivier Elemento
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
  - Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Britta Weigelt
  - Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Nadeem Riaz
  - Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Jorge S Reis-Filho
  - Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Simon N Powell
  - Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Marcin Imieliński
  - Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
  - New York Genome Center, New York, NY, USA
  - Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
  - Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
  - Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
  - Department of Pathology and Perlmutter Cancer Center, NYU Grossman School of Medicine, New York, NY, USA
19
Ryoo SB, Heo S, Lim Y, Lee W, Cho SH, Ahn J, Kang JK, Kim SY, Kim HP, Bang D, Kang SB, Yu CS, Oh ST, Park JW, Jeong SY, Kim YJ, Park KJ, Han SW, Kim TY. Personalised circulating tumour DNA assay with large-scale mutation coverage for sensitive minimal residual disease detection in colorectal cancer. Br J Cancer 2023; 129:374-381. [PMID: 37280413 PMCID: PMC10338477 DOI: 10.1038/s41416-023-02300-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/03/2022] [Revised: 04/25/2023] [Accepted: 05/02/2023] [Indexed: 06/08/2023]
Abstract
BACKGROUND: Postoperative minimal residual disease (MRD) detection using circulating tumour DNA (ctDNA) requires a highly sensitive analysis platform. We have developed a tumour-informed, hybrid-capture ctDNA sequencing MRD assay.
METHODS: Personalised target-capture panels for ctDNA detection were designed using individual variants identified in tumour whole-exome sequencing of each patient. MRD status was determined using ultra-high-depth sequencing data of plasma cell-free DNA. MRD positivity and its association with clinical outcome were analysed in stage II or III colorectal cancer (CRC).
RESULTS: In 98 CRC patients, personalised panels for ctDNA sequencing were built from tumour data, including a median of 185 variants per patient. In silico simulation showed that increasing the number of target variants increases MRD detection sensitivity at low ctDNA fractions (<0.01%). At 3 weeks postoperatively, 21.4% of patients were positive for MRD by ctDNA. Postoperative positive MRD was strongly associated with poor disease-free survival (DFS) (adjusted hazard ratio 8.40, 95% confidence interval 3.49-20.2). Patients with a negative conversion of MRD after adjuvant therapy showed significantly better DFS (P < 0.001).
CONCLUSION: A tumour-informed, hybrid-capture-based ctDNA assay that monitors a large number of patient-specific mutations is a sensitive strategy for MRD detection to predict recurrence in CRC.
Affiliation(s)
- Seung-Bum Ryoo
  - Department of Surgery, Seoul National University Hospital, Seoul, Korea
- Jongseong Ahn
  - IMBdx, Seoul, Korea
  - Department of Chemistry, Yonsei University, Seoul, Korea
- Duhee Bang
  - Department of Chemistry, Yonsei University, Seoul, Korea
- Sung-Bum Kang
  - Department of Surgery, Seoul National University Bundang Hospital, Seongnam, Korea
- Chang Sik Yu
  - Department of Surgery, Asan Medical Center, Seoul, Korea
- Seong Taek Oh
  - Department of Surgery, The Catholic University of Korea Uijeongbu St. Mary's Hospital, Uijeongbu, Korea
- Ji Won Park
  - Department of Surgery, Seoul National University Hospital, Seoul, Korea
- Seung-Yong Jeong
  - Department of Surgery, Seoul National University Hospital, Seoul, Korea
- Young-Joon Kim
  - Department of Biochemistry, Yonsei University, Seoul, Korea
- Kyu Joo Park
  - Department of Surgery, Seoul National University Hospital, Seoul, Korea
- Sae-Won Han
  - Department of Internal Medicine, Seoul National University Hospital, Seoul, Korea
  - Cancer Research Institute, Seoul National University, Seoul, Korea
- Tae-You Kim
  - IMBdx, Seoul, Korea
  - Department of Internal Medicine, Seoul National University Hospital, Seoul, Korea
  - Cancer Research Institute, Seoul National University, Seoul, Korea
20
Wang F, Cao H, Xia Q, Liu Z, Wang M, Gao F, Xu D, Deng B, Diao Y, Kapranov P. Lessons from discovery of true ADAR RNA editing sites in a human cell line. BMC Biol 2023; 21:160. [PMID: 37468903 PMCID: PMC10357658 DOI: 10.1186/s12915-023-01651-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/23/2023] [Accepted: 06/20/2023] [Indexed: 07/21/2023]
Abstract
BACKGROUND: Conversion or editing of adenosine (A) into inosine (I), catalyzed by specialized cellular enzymes, represents one of the most common post-transcriptional RNA modifications, with an emerging connection to disease. A-to-I conversions can happen at specific sites and lead to an increase in proteome diversity and changes in RNA stability, splicing, and regulation. Such sites can be detected as adenine-to-guanine sequence changes by next-generation RNA sequencing, which has resulted in millions of reported sites from multiple genome-wide surveys. Nonetheless, the lack of extensive independent validation in such endeavors, which is critical considering the relatively high error rate of next-generation sequencing, leads to lingering questions about the validity of the current compendiums of editing sites and the conclusions based on them.
RESULTS: Strikingly, we found that the current analytical methods suffer from very high false positive rates and that a significant fraction of sites in the public databases cannot be validated. In this work, we present potential solutions to these problems and provide a comprehensive and extensively validated list of A-to-I editing sites in a human cancer cell line. Our findings demonstrate that most true A-to-I editing sites in a human cancer cell line are located in non-coding transcripts, the so-called RNA 'dark matter'. On the other hand, many ADAR editing events occurring in exons of human protein-coding mRNAs, including those that can recode the transcriptome, represent false positives and need to be interpreted with caution. Nonetheless, as-yet-undiscovered authentic ADAR sites that increase the diversity of the human proteome exist and warrant further identification.
CONCLUSIONS: Accurate identification of human ADAR sites remains a challenging problem, particularly for sites in exons of protein-coding mRNAs. As a result, genome-wide surveys of the ADAR editome must still be accompanied by extensive Sanger validation efforts. However, given the vast number of unknown human ADAR sites, there is a need for further development of analytical techniques, potentially those based on deep learning, to provide quick and reliable identification of the editome in any sample.
Affiliation(s)
- Fang Wang
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Huifen Cao
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Qiu Xia
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Ziheng Liu
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Ming Wang
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Fan Gao
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Dongyang Xu
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Bolin Deng
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Yong Diao
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Philipp Kapranov
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
  - State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Xiamen University, Xiamen, 361102, China
21
Bevill SM, Casaní-Galdón S, El Farran CA, Cytrynbaum EG, Macias KA, Oldeman SE, Oliveira KJ, Moore MM, Hegazi E, Adriaens C, Najm FJ, Demetri GD, Cohen S, Mullen JT, Riggi N, Johnstone SE, Bernstein BE. Impact of supraphysiologic MDM2 expression on chromatin networks and therapeutic responses in sarcoma. Cell Genomics 2023; 3:100321. [PMID: 37492096 PMCID: PMC10363746 DOI: 10.1016/j.xgen.2023.100321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 03/09/2023] [Accepted: 04/14/2023] [Indexed: 07/27/2023]
Abstract
Amplification of MDM2 on supernumerary chromosomes is a common mechanism of P53 inactivation across tumors. Here, we investigated the impact of MDM2 overexpression on chromatin, gene expression, and cellular phenotypes in liposarcoma. Three independent regulatory circuits predominate in aggressive, dedifferentiated tumors. RUNX and AP-1 family transcription factors bind mesenchymal gene enhancers. P53 and MDM2 co-occupy enhancers and promoters associated with P53 signaling. When highly expressed, MDM2 also binds thousands of P53-independent growth and stress response genes, whose promoters engage in multi-way topological interactions. Overexpressed MDM2 concentrates within nuclear foci that co-localize with PML and YY1 and could also contribute to P53-independent phenotypes associated with supraphysiologic MDM2. Importantly, we observe striking cell-to-cell variability in MDM2 copy number and expression in tumors and models. Whereas liposarcoma cells are generally sensitive to MDM2 inhibitors and their combination with pro-apoptotic drugs, MDM2-high cells tolerate them and may underlie the poor clinical efficacy of these agents.
Affiliation(s)
- Samantha M. Bevill
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Departments of Cell Biology and Pathology, Harvard Medical School, Boston, MA 02115, USA
- Salvador Casaní-Galdón
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Departments of Cell Biology and Pathology, Harvard Medical School, Boston, MA 02115, USA
- Chadi A. El Farran
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Departments of Cell Biology and Pathology, Harvard Medical School, Boston, MA 02115, USA
- Eli G. Cytrynbaum
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Departments of Cell Biology and Pathology, Harvard Medical School, Boston, MA 02115, USA
  - Department of Pathology and Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Kevin A. Macias
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Departments of Cell Biology and Pathology, Harvard Medical School, Boston, MA 02115, USA
- Sylvie E. Oldeman
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Departments of Cell Biology and Pathology, Harvard Medical School, Boston, MA 02115, USA
- Kayla J. Oliveira
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Department of Pathology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Molly M. Moore
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Esmat Hegazi
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Departments of Cell Biology and Pathology, Harvard Medical School, Boston, MA 02115, USA
  - Department of Pathology and Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Carmen Adriaens
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Departments of Cell Biology and Pathology, Harvard Medical School, Boston, MA 02115, USA
- Fadi J. Najm
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- George D. Demetri
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215, USA
  - Ludwig Center at Harvard, Harvard Medical School, Boston, MA 02115, USA
- Sonia Cohen
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Departments of Cell Biology and Pathology, Harvard Medical School, Boston, MA 02115, USA
  - Department of Surgery, Massachusetts General Hospital, Boston, MA 02114, USA
- John T. Mullen
  - Department of Surgery, Massachusetts General Hospital, Boston, MA 02114, USA
- Nicolò Riggi
  - Department of Cell and Tissue Genomics (CTG), Genentech Inc, South San Francisco, CA 94080, USA
- Sarah E. Johnstone
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Department of Pathology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Bradley E. Bernstein
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Departments of Cell Biology and Pathology, Harvard Medical School, Boston, MA 02115, USA
  - Ludwig Center at Harvard, Harvard Medical School, Boston, MA 02115, USA
22
Sohn JI, Choi MH, Yi D, Menon VA, Kim YJ, Lee J, Park JW, Kyung S, Shin SH, Na B, Joung JG, Ju YS, Yeom MS, Koh Y, Yoon SS, Baek D, Kim TM, Nam JW. Ultrafast prediction of somatic structural variations by filtering out reads matched to pan-genome k-mer sets. Nat Biomed Eng 2023; 7:853-866. [PMID: 36536253 DOI: 10.1038/s41551-022-00980-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/26/2021] [Accepted: 11/01/2022] [Indexed: 12/24/2022]
Abstract
Variant callers typically produce massive numbers of false positives for structural variations, such as cancer-relevant copy-number alterations and fusion genes resulting from genome rearrangements. Here we describe an ultrafast and accurate detector of somatic structural variations that reduces read-mapping costs by filtering out reads matched to pan-genome k-mer sets. The detector, which we named ETCHING (for efficient detection of chromosomal rearrangements and fusion genes), reduces the number of false positives by leveraging machine-learning classifiers trained with six breakend-related features (clipped-read count, split-reads count, supporting paired-end read count, average mapping quality, depth difference and total length of clipped bases). When benchmarked against six callers on reference cell-free DNA, validated biomarkers of structural variants, matched tumour and normal whole genomes, and tumour-only targeted sequencing datasets, ETCHING was 11-fold faster than the second-fastest structural-variant caller at comparable performance and memory use. The speed and accuracy of ETCHING may aid large-scale genome projects and facilitate practical implementations in precision medicine.
Affiliation(s)
- Jang-Il Sohn
  - Department of Life Science, Hanyang University, Seoul, Republic of Korea
  - Research Institute for Convergence of Basic Sciences, Hanyang University, Seoul, Republic of Korea
- Min-Hak Choi
  - Department of Life Science, Hanyang University, Seoul, Republic of Korea
- Dohun Yi
  - Department of Life Science, Hanyang University, Seoul, Republic of Korea
- Vipin A Menon
  - Department of Life Science, Hanyang University, Seoul, Republic of Korea
- Yeon Jeong Kim
  - Samsung Genome Institute, Samsung Medical Center, Seoul, Republic of Korea
- Junehawk Lee
  - Center for Supercomputing Applications, Division of National Supercomputing, Korea Institute of Science and Technology Information, Daejeon, Republic of Korea
- Jung Woo Park
  - Center for Supercomputing Applications, Division of National Supercomputing, Korea Institute of Science and Technology Information, Daejeon, Republic of Korea
- Byunggook Na
  - Department of Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea
- Je-Gun Joung
  - Department of Biomedical Science, College of Life Science, CHA University, Seongnam, Republic of Korea
- Young Seok Ju
  - Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
  - Biomedical Science and Engineering Interdisciplinary Program, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Min Sun Yeom
  - Center for Supercomputing Applications, Division of National Supercomputing, Korea Institute of Science and Technology Information, Daejeon, Republic of Korea
- Youngil Koh
  - College of Medicine, Seoul National University, Seoul, Republic of Korea
- Sung-Soo Yoon
  - College of Medicine, Seoul National University, Seoul, Republic of Korea
- Daehyun Baek
  - School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
- Tae-Min Kim
  - Department of Medical Informatics and Cancer Research Institute, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Jin-Wu Nam
  - Department of Life Science, Hanyang University, Seoul, Republic of Korea
  - Research Institute for Convergence of Basic Sciences, Hanyang University, Seoul, Republic of Korea
  - Bio-BigData Center, Hanyang Institute of Bioscience and Biotechnology, Hanyang University, Seoul, Republic of Korea
23
Commins N, Sullivan MR, McGowen K, Koch EM, Rubin EJ, Farhat M. Mutation rates and adaptive variation among the clinically dominant clusters of Mycobacterium abscessus. Proc Natl Acad Sci U S A 2023; 120:e2302033120. [PMID: 37216535 PMCID: PMC10235944 DOI: 10.1073/pnas.2302033120] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/09/2023] [Accepted: 04/13/2023] [Indexed: 05/24/2023]
Abstract
Mycobacterium abscessus (Mab) is a multidrug-resistant pathogen increasingly responsible for severe pulmonary infections. Analysis of whole-genome sequences (WGS) of Mab demonstrates dense genetic clustering of clinical isolates collected from disparate geographic locations. This has been interpreted as supporting patient-to-patient transmission, but epidemiological studies have contradicted this interpretation. Here, we present evidence for a slowing of the Mab molecular clock rate coincident with the emergence of phylogenetic clusters. We performed phylogenetic inference using publicly available WGS from 483 Mab patient isolates. We implement a subsampling approach in combination with coalescent analysis to estimate the molecular clock rate along the long internal branches of the tree, indicating a faster long-term molecular clock rate compared to branches within phylogenetic clusters. We used ancestry simulation to predict the effects of clock rate variation on phylogenetic clustering and found that the degree of clustering in the observed phylogeny is more easily explained by a clock rate slowdown than by transmission. We also find that phylogenetic clusters are enriched in mutations affecting DNA repair machinery and report that clustered isolates have lower spontaneous mutation rates in vitro. We propose that Mab adaptation to the host environment through variation in DNA repair genes affects the organism's mutation rate and that this manifests as phylogenetic clustering. These results challenge the model that phylogenetic clustering in Mab is explained by person-to-person transmission and inform our understanding of transmission inference in emerging, facultative pathogens.
Affiliation(s)
- Nicoletta Commins
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115
- Mark R. Sullivan
  - Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115
- Kerry McGowen
  - Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115
- Evan M. Koch
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115
- Eric J. Rubin
  - Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115
  - Department of Microbiology, Harvard Medical School, Boston, MA 02115
- Maha Farhat
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115
  - Division of Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, MA 02114
24
Zhuo X, Hsu S, Purushotham D, Kuntala PK, Harrison JK, Du AY, Chen S, Li D, Wang T. Comparing genomic and epigenomic features across species using the WashU Comparative Epigenome Browser. Genome Res 2023; 33:824-835. [PMID: 37156621 PMCID: PMC10317122 DOI: 10.1101/gr.277550.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 05/03/2023] [Indexed: 05/10/2023]
Abstract
Genome browsers have become an intuitive and critical tool to visualize and analyze genomic features and data. Conventional genome browsers display data/annotations on a single reference genome/assembly; there are also genomic alignment viewer/browsers that help users visualize alignment, mismatch, and rearrangement between syntenic regions. However, there is a growing need for a comparative epigenome browser that can display genomic and epigenomic data sets across different species and enable users to compare them between syntenic regions. Here, we present the WashU Comparative Epigenome Browser. It allows users to load functional genomic data sets/annotations mapped to different genomes and display them over syntenic regions simultaneously. The browser also displays genetic differences between the genomes from single-nucleotide variants (SNVs) to structural variants (SVs) to visualize the association between epigenomic differences and genetic differences. Instead of anchoring all data sets to the reference genome coordinates, it creates independent coordinates of different genome assemblies to faithfully present features and data mapped to different genomes. It uses a simple, intuitive genome-align track to illustrate the syntenic relationship between different species. It extends the widely used WashU Epigenome Browser infrastructure and can be expanded to support multiple species. This new browser function will greatly facilitate comparative genomic/epigenomic research, as well as support the recent growing needs to directly compare and benchmark the T2T CHM13 assembly and other human genome assemblies.
Affiliation(s)
- Xiaoyu Zhuo
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Silas Hsu
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Deepak Purushotham
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Prashant Kumar Kuntala
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Jessica K Harrison
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Alan Y Du
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Samuel Chen
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Daofeng Li
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Ting Wang
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63110, USA
|
25
|
Olson ND, Wagner J, Dwarshuis N, Miga KH, Sedlazeck FJ, Salit M, Zook JM. Variant calling and benchmarking in an era of complete human genome sequences. Nat Rev Genet 2023:10.1038/s41576-023-00590-0. [PMID: 37059810 DOI: 10.1038/s41576-023-00590-0] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Accepted: 02/22/2023] [Indexed: 04/16/2023]
Abstract
Genetic variant calling from DNA sequencing has enabled understanding of germline variation in hundreds of thousands of humans. Sequencing technologies and variant-calling methods have advanced rapidly, routinely providing reliable variant calls in most of the human genome. We describe how advances in long reads, deep learning, de novo assembly and pangenomes have expanded access to variant calls in increasingly challenging, repetitive genomic regions, including medically relevant regions, and how new benchmark sets and benchmarking methods illuminate their strengths and limitations. Finally, we explore the possible future of more complete characterization of human genome variation in light of the recent completion of a telomere-to-telomere human genome reference assembly and human pangenomes, and we consider the innovations needed to benchmark their newly accessible repetitive regions and complex variants.
Affiliation(s)
- Nathan D Olson
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Justin Wagner
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Nathan Dwarshuis
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Karen H Miga
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Fritz J Sedlazeck
- Baylor College of Medicine, Human Genome Sequencing Center, Houston, TX, USA
| | | | - Justin M Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA.
|
26
|
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor decomposition based feature extraction and classification to detect natural selection from genomic data. bioRxiv 2023:2023.03.27.527731. [PMID: 37034767 PMCID: PMC10081272 DOI: 10.1101/2023.03.27.527731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under non-convex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
|
27
|
López-López D, Roldán G, Fernández-Rueda JL, Bostelmann G, Carmona R, Aquino V, Perez-Florido J, Ortuño F, Pita G, Núñez-Torres R, González-Neira A, Peña-Chilet M, Dopazo J. A crowdsourcing database for the copy-number variation of the Spanish population. Hum Genomics 2023; 17:20. [PMID: 36894999 PMCID: PMC9997023 DOI: 10.1186/s40246-023-00466-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/12/2022] [Accepted: 02/25/2023] [Indexed: 03/11/2023] Open
Abstract
BACKGROUND: Despite being a very common type of genetic variation, the distribution of copy-number variations (CNVs) in the population is still poorly understood. Knowledge of genetic variability, especially at the level of the local population, is a critical factor for distinguishing pathogenic from non-pathogenic variation in the discovery of new disease variants. RESULTS: Here, we present the SPAnish Copy Number Alterations Collaborative Server (SPACNACS), which currently contains copy number variation profiles obtained from more than 400 genomes and exomes of unrelated Spanish individuals. By means of a collaborative crowdsourcing effort, whole genome and whole exome sequencing data, produced by local genomic projects and for other purposes, are continuously collected. Once both the Spanish ancestry and the lack of kinship with other individuals in SPACNACS have been checked, CNVs are inferred for these sequences and used to populate the database. A web interface allows querying the database with different filters that include ICD10 upper categories. This allows discarding samples from the disease under study and obtaining pseudo-control CNV profiles from the local population. We also show here additional studies on the local impact of CNVs in some phenotypes and on pharmacogenomic variants. SPACNACS can be accessed at: http://csvs.clinbioinfosspa.es/spacnacs/. CONCLUSION: SPACNACS facilitates disease gene discovery by providing detailed information on the local variability of the population and exemplifies how to reuse genomic data produced for other purposes to build a local reference database.
Affiliation(s)
- Daniel López-López
- Computational Medicine Platform, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain
- Institute of Biomedicine of Seville, IBiS, University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain
- Centro de Investigación Biomédica en Red en Enfermedades Raras (CIBERER), ISCIII, Madrid, Spain
| | - Gema Roldán
- Computational Medicine Platform, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain
| | - Jose L Fernández-Rueda
- Computational Medicine Platform, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain
| | - Gerrit Bostelmann
- Computational Medicine Platform, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain
| | - Rosario Carmona
- Computational Medicine Platform, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain
- Centro de Investigación Biomédica en Red en Enfermedades Raras (CIBERER), ISCIII, Madrid, Spain
| | - Virginia Aquino
- Computational Medicine Platform, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain
| | - Javier Perez-Florido
- Computational Medicine Platform, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain
- Institute of Biomedicine of Seville, IBiS, University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain
| | - Francisco Ortuño
- Computational Medicine Platform, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain
- Department of Computer Architecture and Computer Technology, University of Granada, 18071, Granada, Spain
| | - Guillermo Pita
- Human Genotyping Unit-CeGen, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
| | - Rocío Núñez-Torres
- Human Genotyping Unit-CeGen, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
| | - Anna González-Neira
- Human Genotyping Unit-CeGen, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
| | | | - María Peña-Chilet
- Computational Medicine Platform, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain
- Institute of Biomedicine of Seville, IBiS, University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain
- Centro de Investigación Biomédica en Red en Enfermedades Raras (CIBERER), ISCIII, Madrid, Spain
| | - Joaquin Dopazo
- Computational Medicine Platform, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain
- Institute of Biomedicine of Seville, IBiS, University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain
- Centro de Investigación Biomédica en Red en Enfermedades Raras (CIBERER), ISCIII, Madrid, Spain
- FPS/ELIXIR-ES, Andalusian Public Foundation Progress and Health-FPS, 41013, Seville, Spain
|
28
|
Thatikonda V, Islam SMA, Autry RJ, Jones BC, Gröbner SN, Warsow G, Hutter B, Huebschmann D, Fröhling S, Kool M, Blattner-Johnson M, Jones DTW, Alexandrov LB, Pfister SM, Jäger N. Comprehensive analysis of mutational signatures reveals distinct patterns and molecular processes across 27 pediatric cancers. Nat Cancer 2023; 4:276-289. [PMID: 36702933 PMCID: PMC9970869 DOI: 10.1038/s43018-022-00509-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 05/29/2021] [Accepted: 12/21/2022] [Indexed: 01/27/2023]
Abstract
Analysis of mutational signatures can reveal underlying molecular mechanisms of the processes that have imprinted the somatic mutations found in cancer genomes. Here, we analyze single base substitutions and small insertions and deletions in pediatric cancers encompassing 785 whole-genome sequenced tumors from 27 molecularly defined cancer subtypes. We identified only a small number of mutational signatures active in pediatric cancers, compared with previously analyzed adult cancers. Further, we report a significant difference in the proportion of pediatric tumors showing homologous recombination repair defect signatures compared with previous analyses. In pediatric leukemias, we identified an indel signature, not previously reported, characterized by long insertions in nonrepeat regions, affecting mainly intronic and intergenic regions, but also exons of known cancer genes. We provide a systematic overview of COSMIC v.3 mutational signatures active across pediatric cancers, which is highly relevant for understanding tumor biology and enabling future research in defining biomarkers of treatment response.
Affiliation(s)
- Venu Thatikonda
- Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
- Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Global Computational Biology and Digital Sciences, Boehringer Ingelheim RCV GmbH, Vienna, Austria
| | - S M Ashiqul Islam
- Department of Cellular and Molecular Medicine and Department of Bioengineering, Moores Cancer Center, UC San Diego, La Jolla, CA, USA
| | - Robert J Autry
- Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
- Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
| | - Barbara C Jones
- Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Department of Pediatric Oncology, Hematology and Immunology, Heidelberg University Hospital, Heidelberg, Germany
- Pediatric Glioma Research Group, DKFZ, Heidelberg, Germany
| | - Susanne N Gröbner
- Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
- Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
| | - Gregor Warsow
- Omics IT and Data Management Core Facility (W610), DKFZ, Heidelberg, Germany
| | - Barbara Hutter
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Computational Oncology Group, Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT) Heidelberg, DKFZ, Heidelberg, Germany
- Division of Applied Bioinformatics, DKFZ, Heidelberg, Germany
| | - Daniel Huebschmann
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Computational Oncology Group, Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT) Heidelberg, DKFZ, Heidelberg, Germany
- Pattern Recognition and Digital Medicine, Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM), Heidelberg, Germany
| | - Stefan Fröhling
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Division of Translational Medical Oncology, NCT Heidelberg and DKFZ, Heidelberg, Germany
| | - Marcel Kool
- Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
- Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Princess Máxima Center for Pediatric Oncology, Utrecht, the Netherlands
| | - Mirjam Blattner-Johnson
- Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
- Pediatric Glioma Research Group, DKFZ, Heidelberg, Germany
| | - David T W Jones
- Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Pediatric Glioma Research Group, DKFZ, Heidelberg, Germany
| | - Ludmil B Alexandrov
- Department of Cellular and Molecular Medicine and Department of Bioengineering, Moores Cancer Center, UC San Diego, La Jolla, CA, USA
| | - Stefan M Pfister
- Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany.
- Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), Heidelberg, Germany.
- German Cancer Consortium (DKTK), Heidelberg, Germany.
- Department of Pediatric Oncology, Hematology and Immunology, Heidelberg University Hospital, Heidelberg, Germany.
| | - Natalie Jäger
- Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany.
- Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), Heidelberg, Germany.
- German Cancer Consortium (DKTK), Heidelberg, Germany.
|
29
|
Du C, Jiang J, Li Y, Yu M, Jin J, Chen S, Fan H, Macfarlan TS, Cao B, Sun MA. Regulation of endogenous retrovirus-derived regulatory elements by GATA2/3 and MSX2 in human trophoblast stem cells. Genome Res 2023; 33:197-207. [PMID: 36806146 PMCID: PMC10069462 DOI: 10.1101/gr.277150.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 01/10/2023] [Indexed: 02/19/2023]
Abstract
The placenta is an organ with extraordinary phenotypic diversity in eutherian mammals. Recent evidence suggests that numerous human placental enhancers evolved from lineage-specific insertions of endogenous retroviruses (ERVs), yet the transcription factors (TFs) underlying their regulation remain largely elusive. Here, by first focusing on MER41, a primate-specific ERV family previously linked to placenta and innate immunity, we uncover the binding motifs of multiple crucial trophoblast TFs (GATA2/3, MSX2, GRHL2) in addition to innate immunity TFs STAT1 and IRF1. Integration of ChIP-seq data confirms the binding of GATA2/3, MSX2, and their related factors on the majority of MER41-derived enhancers in human trophoblast stem cells (TSCs). MER41-derived enhancers that are constitutively active in human TSCs are distinct from those activated upon interferon stimulation, which is determined by the binding of relevant TFs and their subfamily compositions. We further demonstrate that GATA2/3 and MSX2 have prevalent binding to numerous other ERV families, indicating their broad impact on ERV-derived enhancers. Functionally, the derepression of many syncytiotrophoblast genes after MSX2 knockdown is likely to be mediated by regulatory elements derived from ERVs, suggesting that ERVs are also important for mediating transcriptional repression. Overall, this study characterizes the regulation of ERV-derived regulatory elements by GATA2/3, MSX2, and their cofactors in human TSCs, and provides mechanistic insights into the importance of ERVs in the human trophoblast regulatory network.
Affiliation(s)
- Cui Du
- Institute of Comparative Medicine, College of Veterinary Medicine, Yangzhou University, Yangzhou, Jiangsu 225009, China
| | - Jing Jiang
- Institute of Comparative Medicine, College of Veterinary Medicine, Yangzhou University, Yangzhou, Jiangsu 225009, China
| | - Yuzhuo Li
- Institute of Comparative Medicine, College of Veterinary Medicine, Yangzhou University, Yangzhou, Jiangsu 225009, China
| | - Miao Yu
- Fujian Provincial Key Laboratory of Reproductive Health Research, School of Medicine, Xiamen University, Xiamen, Fujian 361102, China
| | - Jian Jin
- Institute of Comparative Medicine, College of Veterinary Medicine, Yangzhou University, Yangzhou, Jiangsu 225009, China
| | - Shuai Chen
- Institute of Comparative Medicine, College of Veterinary Medicine, Yangzhou University, Yangzhou, Jiangsu 225009, China
| | - Hairui Fan
- Institute of Comparative Medicine, College of Veterinary Medicine, Yangzhou University, Yangzhou, Jiangsu 225009, China
| | - Todd S Macfarlan
- The Eunice Kennedy Shriver National Institute of Child Health and Human Development, NIH, Bethesda, Maryland 20892, USA
| | - Bin Cao
- Fujian Provincial Key Laboratory of Reproductive Health Research, School of Medicine, Xiamen University, Xiamen, Fujian 361102, China
| | - Ming-An Sun
- Institute of Comparative Medicine, College of Veterinary Medicine, Yangzhou University, Yangzhou, Jiangsu 225009, China
- Joint International Research Laboratory of Important Animal Infectious Diseases and Zoonoses of Jiangsu Higher Education Institutions, Yangzhou, Jiangsu 225009, China
- Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonosis, Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Yangzhou, Jiangsu 225009, China
|
30
|
Söylev A, Çokoglu SS, Koptekin D, Alkan C, Somel M. CONGA: Copy number variation genotyping in ancient genomes and low-coverage sequencing data. PLoS Comput Biol 2022; 18:e1010788. [PMID: 36516232 PMCID: PMC9873172 DOI: 10.1371/journal.pcbi.1010788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2022] [Revised: 01/24/2023] [Accepted: 12/03/2022] [Indexed: 12/15/2022] Open
Abstract
To date, ancient genome analyses have been largely confined to the study of single nucleotide polymorphisms (SNPs). Copy number variants (CNVs) are a major contributor to disease and evolutionary adaptation, but identifying CNVs in ancient shotgun-sequenced genomes is hampered by typically low genome coverage (<1×) and short fragments (<80 bps), which preclude the effective application of standard CNV detection software to ancient genomes. Here we present CONGA, tailored for genotyping CNVs at low coverage. Simulations and down-sampling experiments suggest that CONGA can genotype deletions >1 kbps with F-scores >0.75 at ≥1×, and distinguish between heterozygous and homozygous states. We used CONGA to genotype 10,002 outgroup-ascertained deletions across a heterogeneous set of 71 ancient human genomes spanning the last 50,000 years, produced using variable experimental protocols. A fraction of these (21/71) display divergent deletion profiles unrelated to their population origin, but attributable to technical factors such as coverage and read length. The majority of the sample (50/71), despite originating from nine different laboratories and having coverages ranging from 0.44× to 26× (median 4×) and average read lengths of 52-121 bps (median 69), exhibit coherent deletion frequencies. Across these 50 genomes, inter-individual genetic diversity measured using SNPs and CONGA-genotyped deletions are highly correlated. CONGA-genotyped deletions also display purifying selection signatures, as expected. CONGA thus paves the way for systematic CNV analyses in ancient genomes, despite the technical challenges posed by low and variable genome coverage.
Affiliation(s)
- Arda Söylev
- Department of Computer Engineering, Konya Food and Agriculture University, Konya, Turkey
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
| | | | - Dilek Koptekin
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Can Alkan
- Department of Computer Engineering, Bilkent University, Ankara, Turkey
| | - Mehmet Somel
- Department of Biology, Middle East Technical University, Ankara, Turkey
|
31
|
Zhou X, Zheng H, Fu H, Dillehay McKillip KL, Pinney SM, Liu Y. CRAG: de novo characterization of cell-free DNA fragmentation hotspots in plasma whole-genome sequencing. Genome Med 2022; 14:138. [PMID: 36482487 PMCID: PMC9733064 DOI: 10.1186/s13073-022-01141-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/18/2021] [Accepted: 11/14/2022] [Indexed: 12/13/2022] Open
Abstract
The fine-scale cell-free DNA fragmentation patterns in early-stage cancers are poorly understood. We developed a de novo approach to characterize the cell-free DNA fragmentation hotspots from plasma whole-genome sequencing. Hotspots are enriched in open chromatin regions and, interestingly, the 3' ends of transposons. Hotspots showed global hypo-fragmentation in early-stage liver cancers and are associated with genes involved in the initiation of hepatocellular carcinoma and associated with cancer stem cells. The hotspots varied across multiple early-stage cancers and demonstrated high performance for the diagnosis and identification of tissue-of-origin in early-stage cancers. We further validated the performance with a small number of independent case-control-matched early-stage cancer samples.
Affiliation(s)
- Xionghui Zhou
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229 USA
- Present address: Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070 China
| | - Haizi Zheng
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229 USA
| | - Hailu Fu
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229 USA
| | - Kelsey L. Dillehay McKillip
- University of Cincinnati Cancer Center, Cincinnati, OH 45229 USA
- Department of Pathology & Laboratory Medicine, University of Cincinnati College of Medicine, Cincinnati, OH 45229 USA
| | - Susan M. Pinney
- University of Cincinnati Cancer Center, Cincinnati, OH 45229 USA
- Department of Environmental and Public Health Sciences, University of Cincinnati College of Medicine, Cincinnati, OH 45229 USA
| | - Yaping Liu
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229 USA
- University of Cincinnati Cancer Center, Cincinnati, OH 45229 USA
- Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229 USA
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229 USA
- Department of Electrical Engineering and Computing Sciences, University of Cincinnati College of Engineering and Applied Science, Cincinnati, OH 45229 USA
|
32
|
Krishnamachari K, Lu D, Swift-Scott A, Yeraliyev A, Lee K, Huang W, Leng SN, Skanderup AJ. Accurate somatic variant detection using weakly supervised deep learning. Nat Commun 2022; 13:4248. [PMID: 35869060 PMCID: PMC9307817 DOI: 10.1038/s41467-022-31765-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Accepted: 06/29/2022] [Indexed: 11/09/2022] Open
Abstract
Identification of somatic mutations in tumor samples is commonly based on statistical methods in combination with heuristic filters. Here we develop VarNet, an end-to-end deep learning approach for identification of somatic variants from aligned tumor and matched normal DNA reads. VarNet is trained using image representations of 4.6 million high-confidence somatic variants annotated in 356 tumor whole genomes. We benchmark VarNet across a range of publicly available datasets, demonstrating performance often exceeding current state-of-the-art methods. Overall, our results demonstrate how a scalable deep learning approach could augment and potentially supplant human engineered features and heuristic filters in somatic variant calling.
|
33
|
Lu F, Sossin A, Abell N, Montgomery SB, He Z. Deep learning-assisted genome-wide characterization of massively parallel reporter assays. Nucleic Acids Res 2022; 50:11442-11454. [PMID: 36350674 PMCID: PMC9723615 DOI: 10.1093/nar/gkac990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2021] [Revised: 10/04/2022] [Accepted: 10/19/2022] [Indexed: 11/10/2022] Open
Abstract
The massively parallel reporter assay (MPRA) is a high-throughput method that enables the study of the regulatory activities of tens of thousands of DNA oligonucleotides in a single experiment. While MPRA experiments have grown in popularity, their small sample sizes compared to the scale of the human genome limit our understanding of the regulatory effects they detect. To address this, we develop a deep learning model, MpraNet, to distinguish potential MPRA targets from the background genome. This model achieves high discriminative performance (AUROC = 0.85) at differentiating MPRA positives from a set of control variants that mimic the background genome when applied to the lymphoblastoid cell line. We observe that existing functional scores represent very distinct functional effects, and most of them fail to characterize the regulatory effect that MPRA detects. Using MpraNet, we predict potential MPRA functional variants across the genome and identify the distributions of MPRA effect relative to other characteristics of genetic variation, including allele frequency, alternative functional annotations specified by FAVOR, and phenome-wide associations. We also observed that the predicted MPRA positives are not uniformly distributed across the genome; instead, they are clumped together in active regions comprising 9.95% of the genome and inactive regions comprising 89.07% of the genome. Furthermore, we propose our model as a screen to filter MPRA experiment candidates at genome-wide scale, enabling future experiments to be more cost-efficient by increasing precision relative to that observed from previous MPRAs.
Affiliation(s)
| | | | - Nathan Abell
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Stephen B Montgomery
- Department of Genetics, Stanford University, Stanford, CA 94305, USA,Department of Pathology, Stanford University, Stanford, CA 94305, USA
| | - Zihuai He
- To whom correspondence should be addressed. Tel: +1 718 869 4929;
|
34
|
Toal TW, Estrada-Florez AP, Polanco-Echeverry GM, Sahasrabudhe RM, Lott PC, Suarez-Olaya JJ, Guevara-Tique AA, Rocha S, Morales-Arana A, Castro-Valencia F, Urayama S, Kirane A, Wei D, Rios-Sarabia N, Medrano R, Mantilla A, Echeverry de Polanco M, Torres J, Bohorquez-Lozano ME, Carvajal-Carmona LG. Multiregional Sequencing Analysis Reveals Extensive Genetic Heterogeneity in Gastric Tumors from Latinos. Cancer Res Commun 2022; 2:1487-1496. [PMID: 36970058 PMCID: PMC10035402 DOI: 10.1158/2767-9764.crc-22-0149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2022] [Revised: 07/15/2022] [Accepted: 11/01/2022] [Indexed: 11/06/2022]
Abstract
Gastric cancer is a leading cause of cancer mortality and health disparities in Latinos. We evaluated gastric intratumoral heterogeneity using multiregional sequencing of >700 cancer genes in 115 tumor biopsies from 32 patients, 29 of whom were Latinos. Analyses focused on comparisons with The Cancer Genome Atlas (TCGA) and on mutation clonality, druggability, and signatures. We found that only approximately 30% of all mutations were clonal and that only 61% of the known TCGA gastric cancer drivers harbored clonal mutations. Multiple clonal mutations were found in new candidate gastric cancer drivers such as EYS, FAT4, PCDHA1, RAD50, EXO1, RECQL4, and FSIP2. The genomically stable (GS) molecular subtype, which has the worst prognosis, was identified in 48% of our Latino patients, a fraction that was >2.3-fold higher than in TCGA Asian and White patients. Only a third of all tumors harbored clonal pathogenic mutations in druggable genes, with most (93%) GS tumors lacking actionable clonal mutations. Mutation signature analyses revealed that, in microsatellite-stable (MSS) tumors, DNA repair mutations were common for both tumor initiation and progression, while tobacco, POLE, and inflammation signatures likely initiate carcinogenesis. MSS tumor progression was likely driven by aging- and aflatoxin-associated mutations, as these latter changes were usually nonclonal. In microsatellite-unstable tumors, nonclonal tobacco-associated mutations were common. Our study, therefore, contributes to advancing gastric cancer molecular diagnostics and suggests that clonal status is important to understanding gastric tumorigenesis. Our findings of a higher frequency of a poor-prognosis-associated molecular subtype in Latinos and a possible new aflatoxin gastric cancer etiology also advance cancer disparities research. Significance: Our study contributes to advancing our knowledge of gastric carcinogenesis, diagnostics, and cancer health disparities.
Affiliation(s)
- Ted W. Toal
- Genome Center, University of California, Davis, California
- Ana P. Estrada-Florez
- Genome Center, University of California, Davis, California
- Grupo de Citogenética, Filogenia y Evolución de las Poblaciones, Universidad del Tolima, Ibagué, Colombia
- Paul C. Lott
- Genome Center, University of California, Davis, California
- John J. Suarez-Olaya
- Grupo de Citogenética, Filogenia y Evolución de las Poblaciones, Universidad del Tolima, Ibagué, Colombia
- Alix A. Guevara-Tique
- Grupo de Citogenética, Filogenia y Evolución de las Poblaciones, Universidad del Tolima, Ibagué, Colombia
- Sienna Rocha
- Genome Center, University of California, Davis, California
- Fabian Castro-Valencia
- Grupo de Citogenética, Filogenia y Evolución de las Poblaciones, Universidad del Tolima, Ibagué, Colombia
- Shiro Urayama
- UC Davis Comprehensive Cancer Center, Sacramento, California
- Division of Gastroenterology & Hepatology, University of California, Davis, California
- Amanda Kirane
- UC Davis Comprehensive Cancer Center, Sacramento, California
- Dongguang Wei
- Department of Pathology and Laboratory Medicine, University of California, Davis, California
- Nora Rios-Sarabia
- Unidad de Investigación en Enfermedades Infecciosas y Parasitarias, Unidad Médica de Alta Especialidad en Pediatría, Instituto Mexicano del Seguro Social, México City, México
- Rafael Medrano
- Departamento de Sarcomas y Tubo Digestivo Alto, Unidad Medica de Alta Especialidad en Oncología, Instituto Mexicano del Seguro Social (IMSS), México City, México
- Alejandra Mantilla
- Departamento de Patología, Unidad Medica de Alta Especialidad en Oncología, Instituto Mexicano del Seguro Social (IMSS), México City, México
- Javier Torres
- Unidad de Investigación en Enfermedades Infecciosas y Parasitarias, Unidad Médica de Alta Especialidad en Pediatría, Instituto Mexicano del Seguro Social, México City, México
- Mabel E. Bohorquez-Lozano
- Grupo de Citogenética, Filogenia y Evolución de las Poblaciones, Universidad del Tolima, Ibagué, Colombia
- Luis G. Carvajal-Carmona
- Genome Center, University of California, Davis, California
- UC Davis Comprehensive Cancer Center, Sacramento, California
- Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis, California
35
Liu S, Gao Y, Canela-Xandri O, Wang S, Yu Y, Cai W, Li B, Xiang R, Chamberlain AJ, Pairo-Castineira E, D’Mellow K, Rawlik K, Xia C, Yao Y, Navarro P, Rocha D, Li X, Yan Z, Li C, Rosen BD, Van Tassell CP, Vanraden PM, Zhang S, Ma L, Cole JB, Liu GE, Tenesa A, Fang L. A multi-tissue atlas of regulatory variants in cattle. Nat Genet 2022; 54:1438-1447. [PMID: 35953587 PMCID: PMC7613894 DOI: 10.1038/s41588-022-01153-5] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Received: 11/23/2020] [Accepted: 07/07/2022] [Indexed: 12/12/2022]
Abstract
Characterization of genetic regulatory variants acting on livestock gene expression is essential for interpreting the molecular mechanisms underlying traits of economic value and for increasing the rate of genetic gain through artificial selection. Here we build a Cattle Genotype-Tissue Expression atlas (CattleGTEx) as part of the pilot phase of the Farm animal GTEx (FarmGTEx) project for the research community based on 7,180 publicly available RNA-sequencing (RNA-seq) samples. We describe the transcriptomic landscape of more than 100 tissues/cell types and report hundreds of thousands of genetic associations with gene expression and alternative splicing for 23 distinct tissues. We evaluate the tissue-sharing patterns of these genetic regulatory effects, and functionally annotate them using multiomics data. Finally, we link gene expression in different tissues to 43 economically important traits using both transcriptome-wide association and colocalization analyses to decipher the molecular regulatory mechanisms underpinning such agronomic traits in cattle.
Affiliation(s)
- Shuli Liu
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
- Yahui Gao
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- Department of Animal and Avian Sciences, University of Maryland, College Park, Maryland 20742, USA
- Oriol Canela-Xandri
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
- Sheng Wang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Ying Yu
- College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Wentao Cai
- Institute of Animal Science, Chinese Academy of Agricultural Science, Beijing 100193, China
- Bingjie Li
- Scotland’s Rural College (SRUC), Roslin Institute Building, Midlothian EH25 9RG, UK
- Ruidong Xiang
- Faculty of Veterinary & Agricultural Science, The University of Melbourne, Parkville 3052, Victoria, Australia
- Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, Victoria 3083, Australia
- Amanda J. Chamberlain
- Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, Victoria 3083, Australia
- Erola Pairo-Castineira
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian EH25 9RG, UK
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
- Kenton D’Mellow
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
- Konrad Rawlik
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian EH25 9RG, UK
- Charley Xia
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian EH25 9RG, UK
- Yuelin Yao
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
- Pau Navarro
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
- Dominique Rocha
- INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, F-78350, France
- Xiujin Li
- Guangdong Provincial Key Laboratory of Waterfowl Healthy Breeding, College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, Guangdong 510225, China
- Ze Yan
- College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Congjun Li
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- Benjamin D. Rosen
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- Curtis P. Van Tassell
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- Paul M. Vanraden
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- Shengli Zhang
- College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Li Ma
- Department of Animal and Avian Sciences, University of Maryland, College Park, Maryland 20742, USA
- John B. Cole
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- George E. Liu
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- Albert Tenesa
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian EH25 9RG, UK
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
- Lingzhao Fang
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
36
Han S, Dias GB, Basting PJ, Nelson MG, Patel S, Marzo M, Bergman CM. Ongoing transposition in cell culture reveals the phylogeny of diverse Drosophila S2 sublines. Genetics 2022; 221:iyac077. [PMID: 35536183 PMCID: PMC9252272 DOI: 10.1093/genetics/iyac077] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/10/2022] [Accepted: 04/28/2022] [Indexed: 11/13/2022]
Abstract
Cultured cells are widely used in molecular biology despite poor understanding of how cell line genomes change in vitro over time. Previous work has shown that Drosophila cultured cells have a higher transposable element content than whole flies, but whether this increase in transposable element content resulted from an initial burst of transposition during cell line establishment or ongoing transposition in cell culture remains unclear. Here, we sequenced the genomes of 25 sublines of Drosophila S2 cells and show that transposable element insertions provide abundant markers for the phylogenetic reconstruction of diverse sublines in a model animal cell culture system. DNA copy number evolution across S2 sublines revealed dramatically different patterns of genome organization that support the overall evolutionary history reconstructed using transposable element insertions. Analysis of transposable element insertion site occupancy and ancestral states support a model of ongoing transposition dominated by episodic activity of a small number of retrotransposon families. Our work demonstrates that substantial genome evolution occurs during long-term Drosophila cell culture, which may impact the reproducibility of experiments that do not control for subline identity.
Affiliation(s)
- Shunhua Han
- Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Guilherme B Dias
- Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Department of Genetics, University of Georgia, Athens, GA 30602, USA
- Preston J Basting
- Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Michael G Nelson
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, UK
- Sanjai Patel
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, UK
- Mar Marzo
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, UK
- Casey M Bergman
- Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Department of Genetics, University of Georgia, Athens, GA 30602, USA
37
Chang NC, Rovira Q, Wells J, Feschotte C, Vaquerizas JM. Zebrafish transposable elements show extensive diversification in age, genomic distribution, and developmental expression. Genome Res 2022; 32:1408-1423. [PMID: 34987056 PMCID: PMC9341512 DOI: 10.1101/gr.275655.121] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 04/15/2021] [Accepted: 12/30/2021] [Indexed: 12/02/2022]
Abstract
There is considerable interest in understanding the effect of transposable elements (TEs) on embryonic development. Studies in humans and mice are limited by the difficulty of working with mammalian embryos and by the relative scarcity of active TEs in these organisms. The zebrafish is an outstanding model for the study of vertebrate development, and over half of its genome consists of diverse TEs. However, zebrafish TEs remain poorly characterized. Here we describe the demography and genomic distribution of zebrafish TEs and their expression throughout embryogenesis using bulk and single-cell RNA sequencing data. These results reveal a highly dynamic genomic ecosystem comprising nearly 2000 distinct TE families, which vary in copy number by four orders of magnitude and span a wide range of ages. Longer retroelements tend to be retained in intergenic regions, whereas short interspersed nuclear elements (SINEs) and DNA transposons are more frequently found nearby or within genes. Locus-specific mapping of TE expression reveals extensive TE transcription during development. Although two-thirds of TE transcripts are likely driven by nearby gene promoters, we still observe stage- and tissue-specific expression patterns in self-regulated TEs. Long terminal repeat (LTR) retroelements are most transcriptionally active immediately following zygotic genome activation, whereas DNA transposons are enriched among transcripts expressed in later stages of development. Single-cell analysis reveals several endogenous retroviruses expressed in specific somatic cell lineages. Overall, our study provides a valuable resource for using zebrafish as a model to study the impact of TEs on vertebrate development.
Affiliation(s)
- Ni-Chen Chang
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14850, USA
- Quirze Rovira
- Max Planck Institute for Molecular Biomedicine, 48149 Muenster, Germany
- Jonathan Wells
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14850, USA
- Cédric Feschotte
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14850, USA
- Juan M Vaquerizas
- Max Planck Institute for Molecular Biomedicine, 48149 Muenster, Germany
- MRC London Institute of Medical Sciences, London W12 0NN, United Kingdom
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London W12 0NN, United Kingdom
38
Deng L, Xie B, Wang Y, Zhang X, Xu S. A protocol for applying a population-specific reference genome assembly to population genetics and medical studies. STAR Protoc 2022; 3:101440. [PMID: 35664259 PMCID: PMC9157554 DOI: 10.1016/j.xpro.2022.101440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
With a growing number of available de novo sequenced genomes, protocols for their applications to population genetics will benefit our understanding of the human genome. Here we detail analytic steps to apply an example de novo reference genome to map and detect variants of short-read sequences from corresponding populations and to discover variants of disease-relevant genes. Using this protocol, we can improve variant discovery, better investigate population-specific genome properties, and evaluate the potential of sequenced genomes in medical studies. For complete details on the use and execution of this protocol, please refer to Lou et al. (2022).
- Protocol for mapping and variants detection of short-read sequences
- Advantages of using a population-specific reference genome in population genomic studies
- Analytic steps to discover potential variants of disease-relevant genes
Publisher’s note: Undertaking any experimental protocol requires adherence to local institutional guidelines for laboratory safety and ethics.
Affiliation(s)
- Lian Deng
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, Collaborative Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai 200438, China
- Bo Xie
- Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Yimin Wang
- Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Xiaoxi Zhang
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Shuhua Xu
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, Collaborative Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai 200438, China
- Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Human Phenome Institute, Zhangjiang Fudan International Innovation Center, and Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai 201203, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
- Corresponding author
39
Eraslan G, Drokhlyansky E, Anand S, Fiskin E, Subramanian A, Slyper M, Wang J, Van Wittenberghe N, Rouhana JM, Waldman J, Ashenberg O, Lek M, Dionne D, Win TS, Cuoco MS, Kuksenko O, Tsankov AM, Branton PA, Marshall JL, Greka A, Getz G, Segrè AV, Aguet F, Rozenblatt-Rosen O, Ardlie KG, Regev A. Single-nucleus cross-tissue molecular reference maps toward understanding disease gene function. Science 2022; 376:eabl4290. [PMID: 35549429 PMCID: PMC9383269 DOI: 10.1126/science.abl4290] [Citation(s) in RCA: 157] [Impact Index Per Article: 78.5] [Indexed: 12/12/2022]
Abstract
Understanding gene function and regulation in homeostasis and disease requires knowledge of the cellular and tissue contexts in which genes are expressed. Here, we applied four single-nucleus RNA sequencing methods to eight diverse, archived, frozen tissue types from 16 donors and 25 samples, generating a cross-tissue atlas of 209,126 nuclei profiles, which we integrated across tissues, donors, and laboratory methods with a conditional variational autoencoder. Using the resulting cross-tissue atlas, we highlight shared and tissue-specific features of tissue-resident cell populations; identify cell types that might contribute to neuromuscular, metabolic, and immune components of monogenic diseases and the biological processes involved in their pathology; and determine cell types and gene modules that might underlie disease mechanisms for complex traits analyzed by genome-wide association studies.
Affiliation(s)
- Gökcen Eraslan
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Eugene Drokhlyansky
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Shankara Anand
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Evgenij Fiskin
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Ayshwarya Subramanian
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Michal Slyper
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Jiali Wang
- Department of Ophthalmology, Harvard Medical School, Boston, MA 02115, USA
- Ocular Genomics Institute, Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA 02114, USA
- Medical and Population Genetics Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- John M. Rouhana
- Department of Ophthalmology, Harvard Medical School, Boston, MA 02115, USA
- Ocular Genomics Institute, Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA 02114, USA
- Medical and Population Genetics Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Julia Waldman
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Orr Ashenberg
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Monkol Lek
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Danielle Dionne
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Thet Su Win
- Department of Dermatology, Brigham and Women’s Hospital, Boston, MA 02115, USA
- Michael S. Cuoco
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Olena Kuksenko
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Philip A. Branton
- The Joint Pathology Center Gynecologic/Breast Pathology, Silver Spring, MD 20910, USA
- Anna Greka
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Medicine, Brigham and Women’s Hospital, Boston, MA 02115, USA
- Gad Getz
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Center for Cancer Research and Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
- Harvard Medical School, Boston, MA 02115, USA
- Ayellet V. Segrè
- Department of Ophthalmology, Harvard Medical School, Boston, MA 02115, USA
- Ocular Genomics Institute, Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA 02114, USA
- Medical and Population Genetics Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- François Aguet
- The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Orit Rozenblatt-Rosen
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Aviv Regev
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
40
Olson ND, Wagner J, McDaniel J, Stephens SH, Westreich ST, Prasanna AG, Johanson E, Boja E, Maier EJ, Serang O, Jáspez D, Lorenzo-Salazar JM, Muñoz-Barrera A, Rubio-Rodríguez LA, Flores C, Kyriakidis K, Malousi A, Shafin K, Pesout T, Jain M, Paten B, Chang PC, Kolesnikov A, Nattestad M, Baid G, Goel S, Yang H, Carroll A, Eveleigh R, Bourgey M, Bourque G, Li G, Ma C, Tang L, Du Y, Zhang S, Morata J, Tonda R, Parra G, Trotta JR, Brueffer C, Demirkaya-Budak S, Kabakci-Zorlu D, Turgut D, Kalay Ö, Budak G, Narcı K, Arslan E, Brown R, Johnson IJ, Dolgoborodov A, Semenyuk V, Jain A, Tetikol HS, Jain V, Ruehle M, Lajoie B, Roddey C, Catreux S, Mehio R, Ahsan MU, Liu Q, Wang K, Ebrahim Sahraeian SM, Fang LT, Mohiyuddin M, Hung C, Jain C, Feng H, Li Z, Chen L, Sedlazeck FJ, Zook JM. PrecisionFDA Truth Challenge V2: Calling variants from short and long reads in difficult-to-map regions. Cell Genomics 2022; 2:S2666-979X(22)00058-1. [PMID: 35720974 PMCID: PMC9205427 DOI: 10.1016/j.xgen.2022.100129] [Citation(s) in RCA: 54] [Impact Index Per Article: 27.0] [Received: 12/07/2020] [Revised: 11/01/2021] [Accepted: 04/08/2022] [Indexed: 11/19/2022]
Abstract
The precisionFDA Truth Challenge V2 aimed to assess the state of the art of variant calling in challenging genomic regions. Starting with FASTQs, 20 challenge participants applied their variant-calling pipelines and submitted 64 variant call sets for one or more sequencing technologies (Illumina, PacBio HiFi, and Oxford Nanopore Technologies). Submissions were evaluated following best practices for benchmarking small variants with updated Genome in a Bottle benchmark sets and genome stratifications. Challenge submissions included numerous innovative methods, with graph-based and machine learning methods scoring best for short-read and long-read datasets, respectively. With machine learning approaches, combining multiple sequencing technologies performed particularly well. Recent developments in sequencing and variant calling have enabled benchmarking variants in challenging genomic regions, paving the way for the identification of previously unknown clinically relevant variants.
Affiliation(s)
- Nathan D. Olson
- Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD 20899, USA
- Justin Wagner
- Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD 20899, USA
- Jennifer McDaniel
- Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD 20899, USA
- Elaine Johanson
- Office of Health Informatics, Office of the Chief Scientist, Office of the Commissioner, US Food and Drug Administration, Silver Spring, MD, USA
- Emily Boja
- Office of Health Informatics, Office of the Chief Scientist, Office of the Commissioner, US Food and Drug Administration, Silver Spring, MD, USA
- Ezekiel J. Maier
- Booz Allen Hamilton, 8283 Greensboro Drive, Mclean, VA 22102, USA
- Omar Serang
- DNAnexus, Inc., 1975 W El Camino Real #204, Mountain View, CA 94040, USA
- David Jáspez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- José M. Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Adrián Muñoz-Barrera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Luis A. Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Carlos Flores
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Research Unit, Hospital Universitario N.S. de Candelaria, Santa Cruz de Tenerife, Spain
- Instituto de Tecnologías Biomédicas (ITB), Universidad de La Laguna, 38200 San Cristóbal de La Laguna, Spain
- Konstantinos Kyriakidis
- School of Pharmacy, Aristotle University of Thessaloniki (AUTH), 541 24 Thessaloniki, Greece
- Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation, 570 01 Thessaloniki, Greece
- Andigoni Malousi
- Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation, 570 01 Thessaloniki, Greece
- Laboratory of Biological Chemistry, School of Medicine, Aristotle University of Thessaloniki (AUTH), 541 24 Thessaloniki, Greece
- Kishwar Shafin
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA, USA
- Trevor Pesout
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA, USA
- Miten Jain
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA, USA
- Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA, USA
- Pi-Chuan Chang
- Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
- Maria Nattestad
- Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
- Gunjan Baid
- Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
- Sidharth Goel
- Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
- Howard Yang
- Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
- Andrew Carroll
- Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
- Robert Eveleigh
- The Canadian Center for Computational Genomics (C3G), Montréal, QC, Canada
- Mathieu Bourgey
- The Canadian Center for Computational Genomics (C3G), Montréal, QC, Canada
- Guillaume Bourque
- The Canadian Center for Computational Genomics (C3G), Montréal, QC, Canada
- Gen Li
- HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
- ChouXian Ma
- HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
- LinQi Tang
- HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
- YuanPing Du
- HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
- ShaoWei Zhang
- HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
- Jordi Morata
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Raúl Tonda
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Genís Parra
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Jean-Rémi Trotta
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Christian Brueffer
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden
| | | | | | - Deniz Turgut
- Seven Bridges Genomics, Inc, Charlestown, MA, USA
| | - Özem Kalay
- Seven Bridges Genomics, Inc, Charlestown, MA, USA
| | - Gungor Budak
- Seven Bridges Genomics, Inc, Charlestown, MA, USA
| | - Kübra Narcı
- Seven Bridges Genomics, Inc, Charlestown, MA, USA
| | - Elif Arslan
- Seven Bridges Genomics, Inc, Charlestown, MA, USA
| | | | | | | | | | - Amit Jain
- Seven Bridges Genomics, Inc, Charlestown, MA, USA
| | | | | | | | | | | | | | | | - Mian Umair Ahsan
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Qian Liu
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Kai Wang
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | | | - Li Tai Fang
- Roche Sequencing Solutions, Santa Clara, CA 95050, USA
| | | | | | - Chirag Jain
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | | | | | | - Fritz J. Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
| | - Justin M. Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD 20899, USA
|
41
|
Li Y, Baptista RP, Mei X, Kissinger JC. Small and intermediate size structural RNAs in the unicellular parasite Cryptosporidium parvum as revealed by sRNA-seq and comparative genomics. Microb Genom 2022; 8. [PMID: 35536609 PMCID: PMC9465071 DOI: 10.1099/mgen.0.000821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
Small and intermediate-size noncoding RNAs (sRNAs and is-ncRNAs) have been shown to play important regulatory roles in the development of several eukaryotic organisms. However, they have not been thoroughly explored in Cryptosporidium parvum, an obligate zoonotic protist parasite responsible for the diarrhoeal disease cryptosporidiosis. Using Illumina sequencing of a small RNA library, a systematic identification of novel small and is-ncRNAs was performed in C. parvum excysted sporozoites. A total of 79 novel is-ncRNA candidates, including antisense, intergenic and intronic is-ncRNAs, were identified, including 7 new small nucleolar RNAs (snoRNAs). Expression of select novel is-ncRNAs was confirmed by RT-PCR. Phylogenetic conservation was analysed using covariance models (CMs) in related Cryptosporidium and apicomplexan parasite genome sequences. A potential new type of small ncRNA derived from tRNA fragments was observed. Overall, a deep profiling analysis of novel is-ncRNAs in C. parvum and related species revealed structural features and conservation of these novel is-ncRNAs. Covariance models can be used to detect is-ncRNA genes in other closely related parasites. These findings provide important new sequences for additional functional characterization of novel is-ncRNAs in the protist pathogen C. parvum.
Affiliation(s)
- Yiran Li
- Institute of Bioinformatics, University of Georgia, Athens, GA, USA
| | - Rodrigo P Baptista
- Institute of Bioinformatics, University of Georgia, Athens, GA, USA; Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, GA, USA; Present address: Houston Methodist Research Institute, Houston, TX, USA
| | - Xiaohan Mei
- Department of Physiology and Pharmacology, University of Georgia, Athens, GA, USA
| | - Jessica C Kissinger
- Institute of Bioinformatics, University of Georgia, Athens, GA, USA; Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, GA, USA; Department of Genetics, University of Georgia, Athens, GA, USA
|
42
|
Ramudo-Cela L, Santana-Martínez S, García-Ramos M, Bergamino M, García-Giustiniani D, Vélez-Vieitez P, Hernández-Hernández JL, García-Ibarbia C, González-Bustos P, Ruíz-Martín P, González-Lozano J, Santomé-Collazo L, Grana-Fernandez A, Cabaleiro-Cerviño P, Ortíz M, Monserrat-Iglesias L. Combining familial hypercholesterolemia and statin genetic studies as a strategy for the implementation of pharmacogenomics. A multidisciplinary approach. The Pharmacogenomics Journal 2022; 22:180-187. [PMID: 35361995 DOI: 10.1038/s41397-022-00274-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2021] [Revised: 02/27/2022] [Accepted: 03/17/2022] [Indexed: 06/14/2023]
Abstract
The diagnostic process of familial hypercholesterolemia frequently involves the use of genetic studies. Patients are treated with lipid-lowering drugs, frequently statins. Although pharmacogenomic clinical practice guidelines focusing on genotype-based statin prescription have been published, their use in routine clinical practice remains very modest. We have implemented a new NGS strategy that combines a panel of genes related to familial hypercholesterolemia with genomic regions related to the pharmacogenomics of lipid-lowering drugs described in clinical practice guidelines and in EMA and FDA drug labels. A multidisciplinary team of doctors, biologists, and pharmacists creates a clinical report that provides diagnostic and therapeutic findings using a knowledge management and clinical decision support system, as well as an algorithm for treatment selection. Over 12 months, a total of 483 genetic diagnostic studies for familial hypercholesterolemia were carried out, of which 221 (45.8%) requested a complementary pharmacogenomic test. Of these 221 patients, 66.5% were carriers of actionable variants in at least one of the studied pharmacogenomic pathways: 46.6% of patients in one pathway, 19.0% in two pathways, and 0.9% in three pathways. 45.7% of patients could have a response to atorvastatin different from that of the reference population, 45.7% for simvastatin and lovastatin, 29.0% for fluvastatin, and 6.7% for pitavastatin. This implementation approach facilitates the incorporation of pharmacogenomic studies into clinical care practice, adds neither complexity nor additional steps to laboratory processes, and improves the pharmacotherapeutic process of patients.
Affiliation(s)
- Luis Ramudo-Cela
- Health in Code S.L., Scientific Department, A Coruña, Spain.
- Complexo Hospitalario Universitario A Coruña, A Coruña, Spain.
- Universidade da Coruña, A Coruña, Spain.
| | | | | | | | | | | | - Jose Luis Hernández-Hernández
- Department of Internal Medicine, Hospital Universitario Marqués de Valdecilla-IDIVAL, University of Cantabria, Santander, Spain
| | - Carmen García-Ibarbia
- Department of Internal Medicine, Hospital Universitario Marqués de Valdecilla-IDIVAL, University of Cantabria, Santander, Spain
| | | | - Patricia Ruíz-Martín
- Department of Cardiology, Hospital Regional Universitario de Málaga, Málaga, Spain
| | | | | | | | | | - Martín Ortíz
- Health in Code S.L., Scientific Department, A Coruña, Spain
|
43
|
Dietlein F, Wang AB, Fagre C, Tang A, Besselink NJM, Cuppen E, Li C, Sunyaev SR, Neal JT, Van Allen EM. Genome-wide analysis of somatic noncoding mutation patterns in cancer. Science 2022; 376:eabg5601. [PMID: 35389777 PMCID: PMC9092060 DOI: 10.1126/science.abg5601] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Indexed: 01/12/2023]
Abstract
We established a genome-wide compendium of somatic mutation events in 3949 whole cancer genomes representing 19 tumor types. Protein-coding events captured well-established drivers. Noncoding events near tissue-specific genes, such as ALB in the liver or KLK3 in the prostate, characterized localized passenger mutation patterns and may reflect tumor-cell-of-origin imprinting. Noncoding events in regulatory promoter and enhancer regions frequently involved cancer-relevant genes such as BCL6, FGFR2, RAD51B, SMC6, TERT, and XBP1 and represent possible drivers. Unlike most noncoding regulatory events, XBP1 mutations primarily accumulated outside the gene's promoter, and we validated their effect on gene expression using CRISPR-interference screening and luciferase reporter assays. Broadly, our study provides a blueprint for capturing mutation events across the entire genome to guide advances in biological discovery, therapies, and diagnostics.
Affiliation(s)
- Felix Dietlein
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215, USA; Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Corresponding author. (E.M.V.A.); (F.D.)
| | - Alex B. Wang
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Christian Fagre
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Anran Tang
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215, USA; Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Nicolle J. M. Besselink
- Center for Molecular Medicine and Oncode Institute, University Medical Center Utrecht, 3584 CX Utrecht, Netherlands
| | - Edwin Cuppen
- Center for Molecular Medicine and Oncode Institute, University Medical Center Utrecht, 3584 CX Utrecht, Netherlands; Hartwig Medical Foundation, 1098 XH Amsterdam, Netherlands
| | - Chunliang Li
- Department of Tumor Cell Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Shamil R. Sunyaev
- Division of Genetics, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - James T. Neal
- Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Eliezer M. Van Allen
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215, USA; Cancer Program, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Corresponding author. (E.M.V.A.); (F.D.)
|
44
|
DeGiorgio M, Szpiech ZA. A spatially aware likelihood test to detect sweeps from haplotype distributions. PLoS Genet 2022; 18:e1010134. [PMID: 35404934 PMCID: PMC9022890 DOI: 10.1371/journal.pgen.1010134] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/18/2021] [Revised: 04/21/2022] [Accepted: 03/04/2022] [Indexed: 01/13/2023]
Abstract
The inference of positive selection in genomes is a problem of great interest in evolutionary genomics. By identifying putative regions of the genome that contain adaptive mutations, we are able to learn about the biology of organisms and their evolutionary history. Here we introduce a composite likelihood method that identifies recently completed or ongoing positive selection by searching for extreme distortions in the spatial distribution of the haplotype frequency spectrum along the genome relative to the genome-wide expectation taken as neutrality. Furthermore, the method simultaneously infers two parameters of the sweep: the number of sweeping haplotypes and the "width" of the sweep, which is related to the strength and timing of selection. We demonstrate that this method outperforms the leading haplotype-based selection statistics, though strong signals in low-recombination regions merit extra scrutiny. As a positive control, we apply it to two well-studied human populations from the 1000 Genomes Project and examine haplotype frequency spectrum patterns at the LCT and MHC loci. We also apply it to a data set of brown rats sampled in NYC and identify genes related to olfactory perception. To facilitate use of this method, we have implemented it in user-friendly open source software.
Affiliation(s)
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida, United States of America
| | - Zachary A. Szpiech
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Institute for Computational and Data Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America
|
45
|
Hui S, Nielsen R. SCONCE: a method for profiling copy number alterations in cancer evolution using single-cell whole genome sequencing. Bioinformatics 2022; 38:1801-1808. [PMID: 35080614 PMCID: PMC8963318 DOI: 10.1093/bioinformatics/btac041] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/03/2021] [Revised: 12/23/2021] [Accepted: 01/24/2022] [Indexed: 02/04/2023]
Abstract
MOTIVATION: Copy number alterations (CNAs) are a significant driver in cancer growth and development, but remain poorly characterized on the single cell level. Although genome evolution in cancer cells is Markovian through evolutionary time, CNAs are not Markovian along the genome. However, existing methods call copy number profiles with Hidden Markov Models or change point detection algorithms based on changes in observed read depth, corrected by genome content, and do not account for the stochastic evolutionary process. RESULTS: We present a theoretical framework to use tumor evolutionary history to accurately call CNAs in a principled manner. To model the tumor evolutionary process and account for technical noise from low coverage single-cell whole genome sequencing data, we developed SCONCE, a method based on a Hidden Markov Model to analyze read depth data from tumor cells using matched normal cells as negative controls. Using a combination of public data sets and simulations, we show SCONCE accurately decodes copy number profiles, and provides a useful tool for understanding tumor evolution. AVAILABILITY AND IMPLEMENTATION: SCONCE is implemented in C++11 and is freely available from https://github.com/NielsenBerkeleyLab/sconce. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sandra Hui
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Rasmus Nielsen
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Department of Statistics, University of California, Berkeley, Berkeley, CA 94720, USA
|
46
|
Schmitz RJ, Marand AP, Zhang X, Mosher RA, Turck F, Chen X, Axtell MJ, Zhong X, Brady SM, Megraw M, Meyers BC. Quality control and evaluation of plant epigenomics data. The Plant Cell 2022; 34:503-513. [PMID: 34648025 PMCID: PMC8773985 DOI: 10.1093/plcell/koab255] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 08/02/2021] [Accepted: 10/08/2021] [Indexed: 05/22/2023]
Abstract
Epigenomics is the study of molecular signatures associated with discrete regions within genomes, many of which are important for a wide range of nuclear processes. The ability to profile the epigenomic landscape associated with genes, repetitive regions, transposons, transcription, differential expression, cis-regulatory elements, and 3D chromatin interactions has vastly improved our understanding of plant genomes. However, many epigenomic and single-cell genomic assays are challenging to perform in plants, leading to a wide range of data quality issues; thus, the data require rigorous evaluation prior to downstream analyses and interpretation. In this commentary, we provide considerations for the evaluation of plant epigenomics and single-cell genomics data quality with the aim of improving the quality and utility of studies using those data across diverse plant species.
Affiliation(s)
- Robert J Schmitz
- Department of Genetics, University of Georgia, Athens, Georgia 30602, USA
| | - Alexandre P Marand
- Department of Genetics, University of Georgia, Athens, Georgia 30602, USA
| | - Xuan Zhang
- Department of Genetics, University of Georgia, Athens, Georgia 30602, USA
| | - Rebecca A Mosher
- School of Plant Sciences, University of Arizona, Tucson, Arizona 85721, USA
| | - Franziska Turck
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, Köln, Germany
| | - Xuemei Chen
- Department of Botany and Plant Sciences, University of California, Riverside, California 92521, USA
| | - Michael J Axtell
- Department of Biology and Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16801, USA
| | - Xuehua Zhong
- Wisconsin Institute for Discovery & Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53715, USA
| | - Siobhan M Brady
- Department of Plant Biology and Genome Center, University of California Davis, Davis, California 95616, USA
| | - Molly Megraw
- Department of Botany and Plant Pathology, Center for Quantitative Life Sciences, Oregon State University, Corvallis, Oregon 97331, USA
| | - Blake C Meyers
- Donald Danforth Plant Science Center, St Louis, Missouri 63132, USA
- Division of Plant Sciences, University of Missouri, Columbia, Missouri 65211, USA
|
47
|
Taub MA, Conomos MP, Keener R, Iyer KR, Weinstock JS, Yanek LR, Lane J, Miller-Fleming TW, Brody JA, Raffield LM, McHugh CP, Jain D, Gogarten SM, Laurie CA, Keramati A, Arvanitis M, Smith AV, Heavner B, Barwick L, Becker LC, Bis JC, Blangero J, Bleecker ER, Burchard EG, Celedón JC, Chang YPC, Custer B, Darbar D, de las Fuentes L, DeMeo DL, Freedman BI, Garrett ME, Gladwin MT, Heckbert SR, Hidalgo BA, Irvin MR, Islam T, Johnson WC, Kaab S, Launer L, Lee J, Liu S, Moscati A, North KE, Peyser PA, Rafaels N, Seidman C, Weeks DE, Wen F, Wheeler MM, Williams LK, Yang IV, Zhao W, Aslibekyan S, Auer PL, Bowden DW, Cade BE, Chen Z, Cho MH, Cupples LA, Curran JE, Daya M, Deka R, Eng C, Fingerlin TE, Guo X, Hou L, Hwang SJ, Johnsen JM, Kenny EE, Levin AM, Liu C, Minster RL, Naseri T, Nouraie M, Reupena MS, Sabino EC, Smith JA, Smith NL, Lasky-Su J, Taylor JG, Telen MJ, Tiwari HK, Tracy RP, White MJ, Zhang Y, Wiggins KL, Weiss ST, Vasan RS, Taylor KD, Sinner MF, Silverman EK, Shoemaker MB, Sheu WHH, Sciurba F, Schwartz DA, Rotter JI, Roden D, Redline S, Raby BA, Psaty BM, Peralta JM, Palmer ND, Nekhai S, Montgomery CG, Mitchell BD, Meyers DA, McGarvey ST, Mak AC, Loos RJ, Kumar R, Kooperberg C, Konkle BA, Kelly S, Kardia SL, Kaplan R, He J, Gui H, Gilliland FD, Gelb BD, Fornage M, Ellinor PT, de Andrade M, Correa A, Chen YDI, Boerwinkle E, Barnes KC, Ashley-Koch AE, Arnett DK, Albert C, Laurie CC, Abecasis G, Nickerson DA, Wilson JG, Rich SS, Levy D, Ruczinski I, Aviv A, Blackwell TW, Thornton T, O’Connell J, Cox NJ, Perry JA, Armanios M, Battle A, Pankratz N, Reiner AP, Mathias RA. Genetic determinants of telomere length from 109,122 ancestrally diverse whole-genome sequences in TOPMed. Cell Genomics 2022; 2:S2666-979X(21)00105-1. [PMID: 35530816 PMCID: PMC9075703 DOI: 10.1016/j.xgen.2021.100084] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 11/19/2020] [Revised: 09/03/2021] [Accepted: 12/10/2021] [Indexed: 01/16/2023]
Abstract
Genetic studies on telomere length (TL) are important for understanding age-related diseases. Prior GWAS for leukocyte TL have been limited to European and Asian populations. Here, we report the first sequencing-based association study for TL across ancestrally diverse individuals (European, African, Asian and Hispanic/Latino) from the NHLBI Trans-Omics for Precision Medicine (TOPMed) program. We used whole genome sequencing (WGS) of whole blood for variant genotype calling and the bioinformatic estimation of telomere length in n = 109,122 individuals. We identified 59 sentinel variants (p-value < 5×10⁻⁹) in 36 loci associated with telomere length, including 20 newly associated loci (13 were replicated in external datasets). There was little evidence of effect size heterogeneity across populations. Fine-mapping at OBFC1 indicated the independent signals colocalized with cell-type specific eQTLs for OBFC1 (STN1). Using a multi-variant gene-based approach, we identified two genes newly implicated in telomere length, DCLRE1B (SNM1B) and PARN. In PheWAS, we demonstrated our TL polygenic trait scores (PTS) were associated with increased risk of cancer-related phenotypes.
Affiliation(s)
- Margaret A. Taub
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | - Matthew P. Conomos
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, WA, USA
| | - Rebecca Keener
- Department of Biomedical Engineering, Johns Hopkins Whiting School of Engineering, Baltimore, MD, USA
| | - Kruthika R. Iyer
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | - Joshua S. Weinstock
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Lisa R. Yanek
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - John Lane
- Department of Laboratory Medicine & Pathology, University of Minnesota, Minneapolis, MN, USA
| | - Tyne W. Miller-Fleming
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Jennifer A. Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Laura M. Raffield
- Department of Genetics, University of North Carolina, Chapel Hill, Chapel Hill, NC, USA
| | - Caitlin P. McHugh
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, WA, USA
| | - Deepti Jain
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, WA, USA
| | - Stephanie M. Gogarten
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, WA, USA
| | - Cecelia A. Laurie
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, WA, USA
| | - Ali Keramati
- Department of Cardiology, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Marios Arvanitis
- Department of Medicine, Division of Cardiology, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Albert V. Smith
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Benjamin Heavner
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, WA, USA
| | - Lucas Barwick
- LTRC Data Coordinating Center, The Emmes Company, LLC, Rockville, MD, USA
| | - Lewis C. Becker
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Joshua C. Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - John Blangero
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
| | - Eugene R. Bleecker
- Department of Medicine, Division of Genetics, Genomics, and Precision Medicine, University of Arizona, Tucson, AZ, USA
- Division of Pharmacogenomics, University of Arizona, Tucson, AZ, USA
| | - Esteban G. Burchard
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
| | - Juan C. Celedón
- Division of Pediatric Pulmonary Medicine, UPMC Children’s Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
| | - Yen Pei C. Chang
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Brian Custer
- Vitalant Research Institute, San Francisco, CA, USA
- Department of Laboratory Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Dawood Darbar
- Division of Cardiology, University of Illinois at Chicago, Chicago, IL, USA
| | - Lisa de las Fuentes
- Cardiovascular Division, Department of Medicine, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
| | - Dawn L. DeMeo
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Barry I. Freedman
- Department of Internal Medicine, Section on Nephrology, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Melanie E. Garrett
- Department of Medicine and Duke Comprehensive Sickle Cell Center, Duke University Medical Center, Durham, NC, USA
- Duke Molecular Physiology Institute, Duke University Medical Center, Durham, NC, USA
| | - Mark T. Gladwin
- Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - Susan R. Heckbert
- Cardiovascular Health Research Unit and Department of Epidemiology, University of Washington, Seattle, WA, USA
- Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
| | - Bertha A. Hidalgo
- Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Marguerite R. Irvin
- Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Talat Islam
- Division of Environmental Health, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, USA
| | - W. Craig Johnson
- Department of Biostatistics, Collaborative Health Studies Coordinating Center, University of Washington, Seattle, WA, USA
| | - Stefan Kaab
- Department of Medicine I, University Hospital Munich, Ludwig-Maximilian’s University, Munich, Germany
- German Centre for Cardiovascular Research (DZHK), partner site Munich Heart Alliance, Munich, Germany
| | - Lenore Launer
- Laboratory of Epidemiology and Population Science, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
| | - Jiwon Lee
- Department of Medicine, Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA, USA
| | - Simin Liu
- Department of Epidemiology and Brown Center for Global Cardiometabolic Health, Brown University, Providence, RI, USA
| | - Arden Moscati
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kari E. North
- Department of Epidemiology, University of North Carolina, Chapel Hill, Chapel Hill, NC, USA
| | - Patricia A. Peyser
- Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Nicholas Rafaels
- Department of Medicine, University of Colorado Denver, Anschutz Medical Campus, Aurora, CO, USA
| | | | - Daniel E. Weeks
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Fayun Wen
- Center for Sickle Cell Disease and Department of Medicine, College of Medicine, Howard University, Washington, DC 20059, USA
| | - Marsha M. Wheeler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - L. Keoki Williams
- Center for Individualized and Genomic Medicine Research (CIGMA), Department of Internal Medicine, Henry Ford Health System, Detroit, MI, USA
| | - Ivana V. Yang
- Department of Medicine, University of Colorado Denver, Anschutz Medical Campus, Aurora, CO, USA
| | - Wei Zhao
- Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Stella Aslibekyan
- Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Paul L. Auer
- Zilber School of Public Health, University of Wisconsin, Milwaukee, Milwaukee, WI, USA
| | - Donald W. Bowden
- Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Brian E. Cade
- Harvard Medical School, Boston, MA, USA
- Division of Sleep Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
| | - Zhanghua Chen
- Division of Environmental Health, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, USA
| | - Michael H. Cho
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
| | - L. Adrienne Cupples
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- The National Heart, Lung, and Blood Institute, Boston University’s Framingham Heart Study, Framingham, MA, USA
| | - Joanne E. Curran
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
| | - Michelle Daya
- Department of Medicine, University of Colorado Denver, Anschutz Medical Campus, Aurora, CO, USA
| | - Ranjan Deka
- Department of Environmental and Public Health Sciences, University of Cincinnati, Cincinnati, OH, USA
| | - Celeste Eng
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Tasha E. Fingerlin
- Center for Genes, Environment, and Health, National Jewish Health, Denver, CO, USA
- Department of Biostatistics and Informatics, University of Colorado, Denver, Aurora, CO, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Lifang Hou
- Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
| | - Shih-Jen Hwang
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Jill M. Johnsen
- Bloodworks Northwest Research Institute, Seattle, WA, USA
- University of Washington, Department of Medicine, Seattle, WA, USA
| | - Eimear E. Kenny
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Albert M. Levin
- Department of Public Health Sciences, Henry Ford Health System, Detroit, MI, USA
| | - Chunyu Liu
- The National Heart, Lung, and Blood Institute, Boston University’s Framingham Heart Study, Framingham, MA, USA
- The Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, MD, USA
| | - Ryan L. Minster
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Take Naseri
- Ministry of Health, Government of Samoa, Apia, Samoa
- Department of Epidemiology & International Health Institute, School of Public Health, Brown University, Providence, RI, USA
| | - Mehdi Nouraie
- Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | | | - Ester C. Sabino
- Instituto de Medicina Tropical da Faculdade de Medicina da Universidade de São Paulo, São Paulo, Brazil
| | - Jennifer A. Smith
- Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Nicholas L. Smith
- Cardiovascular Health Research Unit and Department of Epidemiology, University of Washington, Seattle, WA, USA
- Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
| | - Jessica Lasky-Su
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - James G. Taylor
- Center for Sickle Cell Disease and Department of Medicine, College of Medicine, Howard University, Washington, DC 20059, USA
| | - Marilyn J. Telen
- Department of Medicine and Duke Comprehensive Sickle Cell Center, Duke University Medical Center, Durham, NC, USA
- Duke Comprehensive Sickle Cell Center, Duke University Medical Center, Durham, NC, USA
| | - Hemant K. Tiwari
- Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Russell P. Tracy
- Departments of Pathology & Laboratory Medicine and Biochemistry, Larner College of Medicine, University of Vermont, Colchester, VT, USA
| | - Marquitta J. White
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Yingze Zhang
- Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - Kerri L. Wiggins
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Scott T. Weiss
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Ramachandran S. Vasan
- The National Heart, Lung, and Blood Institute, Boston University’s Framingham Heart Study, Framingham, MA, USA
- Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
| | - Kent D. Taylor
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Moritz F. Sinner
- Department of Medicine I, University Hospital Munich, Ludwig-Maximilian’s University, Munich, Germany
- German Centre for Cardiovascular Research (DZHK), partner site Munich Heart Alliance, Munich, Germany
| | - Edwin K. Silverman
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - M. Benjamin Shoemaker
- Departments of Medicine, Pharmacology, and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Wayne H.-H. Sheu
- Division of Endocrinology and Metabolism, Department of Internal Medicine, Taichung Veterans General Hospital, Taichung, Taiwan
| | - Frank Sciurba
- Division of Pulmonary, Allergy, and Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA
| | - David A. Schwartz
- Department of Medicine, University of Colorado Denver, Anschutz Medical Campus, Aurora, CO, USA
| | - Jerome I. Rotter
- Institute for Translational Genomics and Population Sciences, Departments of Pediatrics and Medicine, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Daniel Roden
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Susan Redline
- Division of Sleep Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
| | - Benjamin A. Raby
- Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Pulmonary Medicine, Boston Children’s Hospital, Boston, MA, USA
| | - Bruce M. Psaty
- Cardiovascular Health Research Unit, Departments of Medicine, Epidemiology, and Health Services, University of Washington, Seattle, WA, USA
| | - Juan M. Peralta
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
| | - Nicholette D. Palmer
- Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Sergei Nekhai
- Center for Sickle Cell Disease and Department of Medicine, College of Medicine, Howard University, Washington, DC 20059, USA
| | - Courtney G. Montgomery
- Genes and Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, USA
| | - Braxton D. Mitchell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Geriatrics Research and Education Clinical Center, Baltimore Veterans Administration Medical Center, Baltimore, MD, USA
| | - Deborah A. Meyers
- Department of Medicine, Division of Genetics, Genomics, and Precision Medicine, University of Arizona, Tucson, AZ, USA
- Division of Pharmacogenomics, University of Arizona, Tucson, AZ, USA
| | - Stephen T. McGarvey
- Department of Epidemiology & International Health Institute, School of Public Health, Brown University, Providence, RI, USA
| | | | - Angel C.Y. Mak
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Ruth J.F. Loos
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Rajesh Kumar
- Division of Allergy and Clinical Immunology, The Ann and Robert H. Lurie Children’s Hospital of Chicago, and Department of Pediatrics, Northwestern University, Chicago, IL, USA
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Barbara A. Konkle
- Bloodworks Northwest Research Institute, Seattle, WA, USA
- University of Washington, Department of Medicine, Seattle, WA, USA
| | - Shannon Kelly
- Vitalant Research Institute, San Francisco, CA, USA
- UCSF Benioff Children’s Hospital, Oakland, CA, USA
| | - Sharon L.R. Kardia
- Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Robert Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Jiang He
- Department of Medicine, Tulane University School of Medicine, New Orleans, LA, USA
| | - Hongsheng Gui
- Center for Individualized and Genomic Medicine Research (CIGMA), Department of Internal Medicine, Henry Ford Health System, Detroit, MI, USA
| | - Frank D. Gilliland
- Division of Environmental Health, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, USA
| | - Bruce D. Gelb
- Mindich Child Health and Development Institute, Departments of Pediatrics and Genetics & Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Myriam Fornage
- Brown Foundation Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, USA
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Patrick T. Ellinor
- Cardiology Division, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Mariza de Andrade
- Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA
| | - Adolfo Correa
- Jackson Heart Study and Departments of Medicine and Population Health Science, Jackson, MS, USA
| | - Yii-Der Ida Chen
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Eric Boerwinkle
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Kathleen C. Barnes
- Department of Medicine, University of Colorado Denver, Anschutz Medical Campus, Aurora, CO, USA
| | - Allison E. Ashley-Koch
- Department of Medicine and Duke Comprehensive Sickle Cell Center, Duke University Medical Center, Durham, NC, USA
- Duke Molecular Physiology Institute, Duke University Medical Center, Durham, NC, USA
| | - Donna K. Arnett
- College of Public Health, University of Kentucky, Lexington, KY, USA
| | - Christine Albert
- Harvard Medical School, Boston, MA, USA
- Division of Cardiovascular Medicine, Brigham and Women’s Hospital, Boston, MA, USA
| | | | | | | | - Cathy C. Laurie
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, WA, USA
| | - Goncalo Abecasis
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Regeneron Pharmaceuticals, Tarrytown, NY, USA
| | | | - James G. Wilson
- Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, MS, USA
| | - Stephen S. Rich
- Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
| | - Daniel Levy
- The National Heart, Lung, and Blood Institute, Boston University’s Framingham Heart Study, Framingham, MA, USA
- The Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, MD, USA
| | - Ingo Ruczinski
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | - Abraham Aviv
- Center of Human Development and Aging, Rutgers New Jersey Medical School, Newark, NJ, USA
| | - Thomas W. Blackwell
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Timothy Thornton
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Jeff O’Connell
- Division of Endocrinology, Diabetes, and Nutrition, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Nancy J. Cox
- Vanderbilt Genetics Institute and Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - James A. Perry
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Mary Armanios
- Department of Oncology, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins Whiting School of Engineering, Baltimore, MD, USA
- Departments of Computer Science and Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - Nathan Pankratz
- Department of Laboratory Medicine & Pathology, University of Minnesota, Minneapolis, MN, USA
| | - Alexander P. Reiner
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Rasika A. Mathias
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
|
48
|
Polani S, Dean M, Lichter-Peled A, Hendrickson S, Tsang S, Fang X, Feng Y, Qiao W, Avni G, Kahila Bar-Gal G. Sequence Variant in the TRIM39-RPP21 Gene Readthrough is Shared Across a Cohort of Arabian Foals Diagnosed with Juvenile Idiopathic Epilepsy. Journal of Genetic Mutation Disorders 2022; 1:103. [PMID: 35465405 PMCID: PMC9031527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Juvenile idiopathic epilepsy (JIE) is a self-limiting neurological disorder with a suspected genetic predisposition affecting young Arabian foals of the Egyptian lineage. The condition is characterized by tonic-clonic seizures with intermittent post-ictal blindness, in which most incidents are sporadic and unrecognized. This study aimed to identify genetic components shared across a local cohort of Arabian foals diagnosed with JIE via a combined whole-genome and targeted resequencing approach. Initial whole-genome comparisons between a small cohort of nine diagnosed foals (cases) and 27 controls from other horse breeds identified variants uniquely shared among the case cohort. Further validation via targeted resequencing of those variants that pertain to non-intergenic regions, in eleven additional case individuals, revealed a single 19 bp deletion coupled with a triple-C insertion (Δ19InsCCC) within the TRIM39-RPP21 gene readthrough that was uniquely shared across all case individuals and absent from three additional Arabian controls. Furthermore, we confirmed recent findings refuting potential linkage between JIE and other inherited diseases in the Arabian lineage, and refuted the potential linkage between JIE and genes predisposing to a similar disorder in human newborns. This is the first study to report a genetic variant shared across a sub-population cohort of Arabian foals diagnosed with JIE. Further evaluation of the sensitivity and specificity of the Δ19InsCCC allele within additional cohorts of the Arabian horse is warranted in order to validate its credibility as a marker for JIE, and to ascertain whether it has been introduced into other horse breeds by Arabian ancestry.
Affiliation(s)
- S Polani
- Koret School of Veterinary Medicine, The Robert H. Smith Faculty of Agriculture, Food and Environmental Sciences, The Hebrew University of Jerusalem, Rehovot, Israel
| | - M Dean
- National Cancer Institute, Division of Cancer Epidemiology & Genetics, Laboratory of Translational Genomics, USA
| | - A Lichter-Peled
- Koret School of Veterinary Medicine, The Robert H. Smith Faculty of Agriculture, Food and Environmental Sciences, The Hebrew University of Jerusalem, Rehovot, Israel
| | - S Hendrickson
- Department of Biology, Shepherd University, Shepherdstown, USA
| | | | - X Fang
- BGI-Shenzhen, Shenzhen, China
| | - Y Feng
- BGI-Shenzhen, Shenzhen, China
| | - W Qiao
- BGI-Shenzhen, Shenzhen, China
| | - G Avni
- Medisoos Equine Clinic, Kibbutz Magal, Israel
| | - G Kahila Bar-Gal
- Koret School of Veterinary Medicine, The Robert H. Smith Faculty of Agriculture, Food and Environmental Sciences, The Hebrew University of Jerusalem, Rehovot, Israel
|
49
|
Wang R, Jiang Y. Copy Number Variation Detection by Single-Cell DNA Sequencing with SCOPE. Methods Mol Biol 2022; 2493:279-288. [PMID: 35751822 DOI: 10.1007/978-1-0716-2293-3_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Whole-genome single-cell DNA sequencing (scDNA-seq) enables the characterization of copy number profiles at the cellular level. This circumvents the averaging effects associated with bulk-tissue sequencing and offers increased resolution and decreased ambiguity in deconvolving cancer subclones and elucidating cancer evolutionary history. However, scDNA-seq data are sparse, noisy, and highly variable even within a homogeneous cell population, due to the biases and artifacts that are introduced during library preparation and sequencing. Here, we describe SCOPE, a normalization and copy number estimation method for scDNA-seq data. We give an overview of the methodology and illustrate SCOPE with step-by-step demonstrations.
Affiliation(s)
- Rujin Wang
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
| | - Yuchao Jiang
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
- Department of Genetics, School of Medicine, University of North Carolina, Chapel Hill, NC, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, USA
|
50
|
Iannucci A, Benazzo A, Natali C, Arida EA, Zein MSA, Jessop TS, Bertorelle G, Ciofi C. Population structure, genomic diversity and demographic history of Komodo dragons inferred from whole-genome sequencing. Mol Ecol 2021; 30:6309-6324. [PMID: 34390519 PMCID: PMC9292392 DOI: 10.1111/mec.16121] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/31/2020] [Revised: 07/28/2021] [Accepted: 08/03/2021] [Indexed: 02/07/2023]
Abstract
Population and conservation genetics studies have greatly benefited from the development of new techniques and bioinformatic tools associated with next-generation sequencing. Analysis of extensive data sets from whole-genome sequencing of even a few individuals allows the detection of patterns of fine-scale population structure and detailed reconstruction of demographic dynamics through time. In this study, we investigated the population structure, genomic diversity and demographic history of the Komodo dragon (Varanus komodoensis), the world's largest lizard, by sequencing the whole genomes of 24 individuals from the five main Indonesian islands comprising the entire range of the species. Three main genomic groups were observed. The populations of the Island of Komodo and the northern coast of Flores, in particular, were identified as two distinct conservation units. Degrees of genomic divergence among island populations were interpreted as a result of changes in sea level affecting connectivity across islands. Demographic inference suggested that Komodo dragons probably experienced a relatively steep population decline over the last million years, reaching a relatively stable Ne during the Saalian glacial cycle (400-150 thousand years ago) followed by a rapid Ne decrease. Genomic diversity of Komodo dragons was similar to that found in endangered or already extinct reptile species. Overall, this study provides an example of how whole-genome analysis of a few individuals per population can help define population structure and intraspecific demographic dynamics. This is particularly important when applying population genomics data to conservation of rare or elusive endangered species.
Affiliation(s)
| | - Andrea Benazzo
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | - Chiara Natali
- Department of Biology, University of Florence, Firenze, Italy
| | - Evy Ayu Arida
- Research Center for Biology, The Indonesian Institute of Sciences (LIPI), Cibinong Science Center, Cibinong, Indonesia
| | - Moch Samsul Arifin Zein
- Research Center for Biology, The Indonesian Institute of Sciences (LIPI), Cibinong Science Center, Cibinong, Indonesia
| | - Tim S. Jessop
- School of Life and Environmental Sciences, Deakin University, Geelong, Vic., Australia
| | - Giorgio Bertorelle
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | - Claudio Ciofi
- Department of Biology, University of Florence, Firenze, Italy
|