1. La Fleur A, Shi Y, Seelig G. Decoding biology with massively parallel reporter assays and machine learning. Genes Dev 2024; 38:843-865. [PMID: 39362779 PMCID: PMC11535156 DOI: 10.1101/gad.351800.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2024]
Abstract
Massively parallel reporter assays (MPRAs) are powerful tools for quantifying the impacts of sequence variation on gene expression. Reading out molecular phenotypes with sequencing enables interrogating the impact of sequence variation beyond genome scale. Machine learning models integrate and codify information learned from MPRAs and enable generalization by predicting sequences outside the training data set. Models can provide a quantitative understanding of cis-regulatory codes controlling gene expression, enable variant stratification, and guide the design of synthetic regulatory elements for applications from synthetic biology to mRNA and gene therapy. This review focuses on cis-regulatory MPRAs, particularly those that interrogate cotranscriptional and post-transcriptional processes: alternative splicing, cleavage and polyadenylation, translation, and mRNA decay.
Affiliation(s)
- Alyssa La Fleur
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA
- Yongsheng Shi
  - Department of Microbiology and Molecular Genetics, School of Medicine, University of California, Irvine, Irvine, California 92697, USA
- Georg Seelig
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA
  - Department of Electrical & Computer Engineering, University of Washington, Seattle, Washington 98195, USA
2. Healey HM, Penn HB, Small CM, Bassham S, Goyal V, Woods MA, Cresko WA. Single Cell Sequencing Provides Clues about the Developmental Genetic Basis of Evolutionary Adaptations in Syngnathid Fishes. bioRxiv 2024:2024.04.08.588518. [PMID: 38645265 PMCID: PMC11030337 DOI: 10.1101/2024.04.08.588518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Seahorses, pipefishes, and seadragons are fishes from the family Syngnathidae that have evolved extraordinary traits including male pregnancy, elongated snouts, loss of teeth, and dermal bony armor. The developmental genetic and cellular changes that led to the evolution of these traits are largely unknown. Recent syngnathid genome assemblies revealed suggestive gene content differences and provide the opportunity for detailed genetic analyses. We created a single cell RNA sequencing atlas of Gulf pipefish embryos to understand the developmental basis of four traits: derived head shape, toothlessness, dermal armor, and male pregnancy. We completed marker gene analyses, built genetic networks, and examined spatial expression of select genes. We identified osteochondrogenic mesenchymal cells in the elongating face that express regulatory genes bmp4, sfrp1a, and prdm16. We found no evidence for tooth primordia cells, and we observed re-deployment of osteoblast genetic networks in developing dermal armor. Finally, we found that epidermal cells expressed nutrient processing and environmental sensing genes, potentially relevant for the brooding environment. The examined pipefish evolutionary innovations are composed of recognizable cell types, suggesting derived features originate from changes within existing gene networks. Future work addressing syngnathid gene networks across multiple stages and species is essential for understanding how their novelties evolved.
Affiliation(s)
- Hope M Healey
  - Institute of Ecology and Evolution, University of Oregon
- Hayden B Penn
  - Institute of Ecology and Evolution, University of Oregon
- Clayton M Small
  - Institute of Ecology and Evolution, University of Oregon
  - School of Computer and Data Science, University of Oregon
- Susan Bassham
  - Institute of Ecology and Evolution, University of Oregon
- Vithika Goyal
  - Institute of Ecology and Evolution, University of Oregon
- Micah A Woods
  - Institute of Ecology and Evolution, University of Oregon
- William A Cresko
  - Institute of Ecology and Evolution, University of Oregon
  - Knight Campus for Accelerating Scientific Impact, University of Oregon
3. Bond ML, Quiroga-Barber IY, D’Costa S, Wu Y, Bell JL, McAfee JC, Kramer NE, Lee S, Patrucco M, Phanstiel DH, Won H. Deciphering the functional impact of Alzheimer's Disease-associated variants in resting and proinflammatory immune cells. medRxiv 2024:2024.09.13.24313654. [PMID: 39371155 PMCID: PMC11451667 DOI: 10.1101/2024.09.13.24313654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/08/2024]
Abstract
Genome-wide association studies have identified loci associated with Alzheimer's Disease (AD), but identifying the exact causal variants and genes at each locus is challenging due to linkage disequilibrium and their largely non-coding nature. To address this, we performed a massively parallel reporter assay of 3,576 AD-associated variants in THP-1 macrophages in both resting and proinflammatory states and identified 47 expression-modulating variants (emVars). To understand the endogenous chromatin context of emVars, we built an activity-by-contact model using epigenomic maps of macrophage inflammation and inferred condition-specific enhancer-promoter pairs. Intersection of emVars with enhancer-promoter pairs and microglia expression quantitative trait loci allowed us to connect 39 emVars to 76 putative AD risk genes enriched for AD-associated molecular signatures. Overall, systematic characterization of AD-associated variants enhances our understanding of the regulatory mechanisms underlying AD pathogenesis.
Affiliation(s)
- Marielle L. Bond
  - Curriculum in Genetics & Molecular Biology, University of North Carolina at Chapel Hill
  - Thurston Arthritis Research Center, University of North Carolina at Chapel Hill
  - Department of Genetics, University of North Carolina at Chapel Hill
  - Neuroscience Center, University of North Carolina at Chapel Hill
- Susan D’Costa
  - Thurston Arthritis Research Center, University of North Carolina at Chapel Hill
- Yijia Wu
  - Thurston Arthritis Research Center, University of North Carolina at Chapel Hill
  - Department of Genetics, University of North Carolina at Chapel Hill
  - Neuroscience Center, University of North Carolina at Chapel Hill
- Jessica L. Bell
  - Department of Genetics, University of North Carolina at Chapel Hill
  - Neuroscience Center, University of North Carolina at Chapel Hill
- Jessica C. McAfee
  - Curriculum in Genetics & Molecular Biology, University of North Carolina at Chapel Hill
  - Department of Genetics, University of North Carolina at Chapel Hill
  - Neuroscience Center, University of North Carolina at Chapel Hill
- Nicole E. Kramer
  - Thurston Arthritis Research Center, University of North Carolina at Chapel Hill
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill
- Sool Lee
  - Department of Genetics, University of North Carolina at Chapel Hill
  - Neuroscience Center, University of North Carolina at Chapel Hill
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill
- Mary Patrucco
  - Department of Genetics, University of North Carolina at Chapel Hill
  - Neuroscience Center, University of North Carolina at Chapel Hill
- Douglas H. Phanstiel
  - Thurston Arthritis Research Center, University of North Carolina at Chapel Hill
  - Department of Cell Biology & Physiology, University of North Carolina at Chapel Hill
- Hyejung Won
  - Department of Genetics, University of North Carolina at Chapel Hill
  - Neuroscience Center, University of North Carolina at Chapel Hill
4. Broadaway KA, Brotman SM, Rosen JD, Currin KW, Alkhawaja AA, Etheridge AS, Wright F, Gallins P, Jima D, Zhou YH, Love MI, Innocenti F, Mohlke KL. Liver eQTL meta-analysis illuminates potential molecular mechanisms of cardiometabolic traits. Am J Hum Genet 2024; 111:1899-1913. [PMID: 39173627 PMCID: PMC11393674 DOI: 10.1016/j.ajhg.2024.07.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Revised: 07/24/2024] [Accepted: 07/25/2024] [Indexed: 08/24/2024]
Abstract
Understanding the molecular mechanisms of complex traits is essential for developing targeted interventions. We analyzed liver expression quantitative-trait locus (eQTL) meta-analysis data on 1,183 participants to identify conditionally distinct signals. We found 9,013 eQTL signals for 6,564 genes; 23% of eGenes had two signals, and 6% had three or more signals. We then integrated the eQTL results with data from 29 cardiometabolic genome-wide association study (GWAS) traits and identified 1,582 GWAS-eQTL colocalizations for 747 eGenes. Non-primary eQTL signals accounted for 17% of all colocalizations. Isolating signals by conditional analysis prior to coloc resulted in 37% more colocalizations than using marginal eQTL and GWAS data, highlighting the importance of signal isolation. Isolating signals also led to stronger evidence of colocalization: among 343 eQTL-GWAS signal pairs in multi-signal regions, analyses that isolated the signals of interest resulted in higher posterior probability of colocalization for 41% of tests. Leveraging allelic heterogeneity, we predicted causal effects of gene expression on liver traits for four genes. To predict functional variants and regulatory elements, we colocalized eQTL with liver chromatin accessibility QTL (caQTL) and found 391 colocalizations, including 73 with non-primary eQTL signals and 60 eQTL signals that colocalized with both a caQTL and a GWAS signal. Finally, we used publicly available massively parallel reporter assays in HepG2 to highlight 14 eQTL signals that include at least one expression-modulating variant. This multi-faceted approach to unraveling the genetic underpinnings of liver-related traits could lead to therapeutic development.
Affiliation(s)
- K Alaine Broadaway
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Sarah M Brotman
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Jonathan D Rosen
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Kevin W Currin
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Abdalla A Alkhawaja
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Amy S Etheridge
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Fred Wright
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695, USA
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Paul Gallins
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695, USA
- Dereje Jima
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695, USA
- Yi-Hui Zhou
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695, USA
- Michael I Love
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
- Federico Innocenti
  - Eshelman School of Pharmacy, Division of Pharmacotherapy and Experimental Therapeutics, University of North Carolina, Chapel Hill, NC 27599, USA
- Karen L Mohlke
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
5. Peterson RE, Choudhri A, Mitelut C, Tanelus A, Capo-Battaglia A, Williams AH, Schneider DM, Sanes DH. Unsupervised discovery of family specific vocal usage in the Mongolian gerbil. bioRxiv 2024:2023.03.11.532197. [PMID: 39282260 PMCID: PMC11398318 DOI: 10.1101/2023.03.11.532197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/21/2024]
Abstract
In nature, animal vocalizations can provide crucial information about identity, including kinship and hierarchy. However, lab-based vocal behavior is typically studied during brief interactions between animals with no prior social relationship, and under environmental conditions with limited ethological relevance. Here, we address this gap by establishing long-term acoustic recordings from Mongolian gerbil families, a core social group that uses an array of sonic and ultrasonic vocalizations. Three separate gerbil families were transferred to an enlarged environment and continuous 20-day audio recordings were obtained. Using a variational autoencoder (VAE) to quantify 583,237 vocalizations, we show that gerbils exhibit a more elaborate vocal repertoire than has been previously reported and that vocal repertoire usage differs significantly by family. By performing Gaussian mixture model clustering on the VAE latent space, we show that families preferentially use characteristic sets of vocal clusters and that these usage preferences remain stable over weeks. Furthermore, gerbils displayed family-specific transitions between vocal clusters. Since gerbils live naturally as extended families in complex underground burrows that are adjacent to other families, these results suggest the presence of a vocal dialect which could be exploited by animals to represent kinship. These findings position the Mongolian gerbil as a compelling animal model to study the neural basis of vocal communication and demonstrate the potential for using unsupervised machine learning with uninterrupted acoustic recordings to gain insights into naturalistic animal behavior.
Affiliation(s)
- Ralph E. Peterson
  - Center for Neural Science, New York University, New York, NY
  - Center for Computational Neuroscience, Flatiron Institute, New York, NY
- Catalin Mitelut
  - Center for Neural Science, New York University, New York, NY
- Aramis Tanelus
  - Center for Neural Science, New York University, New York, NY
  - Center for Computational Neuroscience, Flatiron Institute, New York, NY
- Alex H. Williams
  - Center for Neural Science, New York University, New York, NY
  - Center for Computational Neuroscience, Flatiron Institute, New York, NY
- Dan H. Sanes
  - Center for Neural Science, New York University, New York, NY
  - Department of Psychology, New York University, New York, NY
  - Department of Biology, New York University, New York, NY
  - Neuroscience Institute, New York University School of Medicine, New York, NY
6. Lyons EF, Devanneaux LC, Muller RY, Freitas AV, Meacham ZA, McSharry MV, Trinh VN, Rogers AJ, Ingolia NT, Lareau LF. Translation elongation as a rate limiting step of protein production. bioRxiv 2024:2023.11.27.568910. [PMID: 38076849 PMCID: PMC10705293 DOI: 10.1101/2023.11.27.568910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
The impact of synonymous codon choice on protein output has important implications for understanding endogenous gene expression and design of synthetic mRNAs. Synonymous codons are decoded at different speeds, but simple models predict that this should not drive protein output. Instead, translation initiation should be the rate limiting step for production of protein per mRNA, with little impact of codon choice. Previously, we used a neural network model to design a series of synonymous fluorescent reporters and showed that their protein output in yeast spanned a seven-fold range corresponding to their predicted translation elongation speed. Here, we show that this effect is not due primarily to the established impact of slow elongation on mRNA stability, but rather, that slow elongation further decreases the number of proteins made per mRNA. We combine simulations and careful experiments on fluorescent reporters to show that translation is limited on non-optimally encoded transcripts. Using a genome-wide CRISPRi screen, we find that impairing translation initiation attenuates the impact of slow elongation, showing a dynamic balance between rate limiting steps of protein production. Our results show that codon choice can directly limit protein production across the full range of endogenous variability in codon usage.
Affiliation(s)
- Elijah F Lyons
  - Department of Molecular and Cell Biology, University of California, Berkeley, California
- Lou C Devanneaux
  - Department of Molecular and Cell Biology, University of California, Berkeley, California
- Ryan Y Muller
  - Department of Molecular and Cell Biology, University of California, Berkeley, California
- Anna V Freitas
  - Department of Molecular and Cell Biology, University of California, Berkeley, California
- Zuriah A Meacham
  - Department of Molecular and Cell Biology, University of California, Berkeley, California
- Maria V McSharry
  - Department of Molecular and Cell Biology, University of California, Berkeley, California
- Van N Trinh
  - Department of Bioengineering, University of California, Berkeley, California
- Anna J Rogers
  - Department of Molecular and Cell Biology, University of California, Berkeley, California
- Nicholas T Ingolia
  - Department of Molecular and Cell Biology, University of California, Berkeley, California
- Liana F Lareau
  - Department of Molecular and Cell Biology, University of California, Berkeley, California
  - Department of Bioengineering, University of California, Berkeley, California
  - Chan Zuckerberg Biohub, San Francisco, California
7. Schuurmans IK, Smajlagic D, Baltramonaityte V, Malmberg ALK, Neumann A, Creasey N, Felix JF, Tiemeier H, Pingault JB, Czamara D, Raïkkönen K, Page CM, Lyle R, Havdahl A, Lahti J, Walton E, Bekkhus M, Cecil CAM. Genetic susceptibility to neurodevelopmental conditions associates with neonatal DNA methylation patterns in the general population: an individual participant data meta-analysis. medRxiv 2024:2024.07.01.24309384. [PMID: 39006433 PMCID: PMC11245083 DOI: 10.1101/2024.07.01.24309384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
Background: Autism spectrum disorder (ASD), attention-deficit/hyperactivity disorder (ADHD), and schizophrenia (SCZ) are highly heritable and linked to disruptions in foetal (neuro)development. While epigenetic processes are considered an important underlying pathway between genetic susceptibility and neurodevelopmental conditions, it is unclear (i) whether genetic susceptibility to these conditions is associated with epigenetic patterns, specifically DNA methylation (DNAm), already at birth; (ii) to what extent DNAm patterns are unique or shared across conditions; and (iii) whether these neonatal DNAm patterns can be leveraged to enhance genetic prediction of (neuro)developmental outcomes.
Methods: We conducted epigenome-wide meta-analyses of genetic susceptibility to ASD, ADHD, and schizophrenia, quantified using polygenic scores (PGSs) on cord blood DNAm, using four population-based cohorts (pooled n = 5,802), all North European. Heterogeneity statistics were used to estimate overlap in DNAm patterns between PGSs. Subsequently, DNAm-based measures of PGSs were built in a target sample, and used as predictors to test incremental variance explained over PGS in 130 (neuro)developmental outcomes spanning birth to 14 years.
Outcomes: In probe-level analyses, SCZ-PGS associated with neonatal DNAm at 246 loci (p < 9×10⁻⁸), predominantly in the major histocompatibility complex. Functional characterization of these DNAm loci confirmed strong genetic effects, significant blood-brain concordance and enrichment for immune-related pathways. Eight loci were identified for ASD-PGS (mapping to FDFT1 and MFHAS1), and none for ADHD-PGS. Regional analyses indicated a large number of differentially methylated regions for all PGSs (SCZ-PGS: 157, ASD-PGS: 130, ADHD-PGS: 166). DNAm signals showed little overlap between PGSs. We found suggestive evidence that incorporating DNAm-based measures of genetic susceptibility at birth increases explained variance for several child cognitive and motor outcomes over and above PGS.
Interpretation: Genetic susceptibility for neurodevelopmental conditions, particularly schizophrenia, is detectable in cord blood DNAm at birth in a population-based sample, with largely distinct DNAm patterns between PGSs. These findings support an early-origins perspective on schizophrenia.
Funding: HorizonEurope; European Research Council.
Affiliation(s)
- I K Schuurmans
  - Department of Epidemiology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - The Generation R Study Group, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Child and Adolescent Psychiatry and Psychology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
- D Smajlagic
  - PROMENTA Research Centre, Department of Psychology, University of Oslo, Oslo, Norway
- A L K Malmberg
  - Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- A Neumann
  - Department of Child and Adolescent Psychiatry and Psychology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
- N Creasey
  - The Generation R Study Group, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Child and Adolescent Psychiatry and Psychology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Clinical, Educational, and Health Psychology, Division of Psychology & Language Sciences, University College London, London, UK
- J F Felix
  - The Generation R Study Group, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Pediatrics, Erasmus MC, University Medical Center Rotterdam, Rotterdam, the Netherlands
- H Tiemeier
  - Department of Child and Adolescent Psychiatry and Psychology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- J B Pingault
  - Department of Clinical, Educational, and Health Psychology, Division of Psychology & Language Sciences, University College London, London, UK
  - Social, Genetic & Developmental Psychiatry (SGDP) Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- D Czamara
  - Max-Planck-Institute of Psychiatry, Department Genes and Environment, Munich, Germany
- K Raïkkönen
  - Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - Department of Obstetrics and Gynecology, Helsinki University Hospital and University of Helsinki, Helsinki, Finland
- C M Page
  - Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
- R Lyle
  - Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
  - Department of Medical Genetics, Oslo University Hospital, Oslo, Norway
- A Havdahl
  - PROMENTA Research Centre, Department of Psychology, University of Oslo, Oslo, Norway
  - PsychGen Centre for Genetic Epidemiology and Mental Health, Norwegian Institute of Public Health, Oslo, Norway
  - Nic Waals Institute, Lovisenberg Diaconal Hospital, Oslo, Norway
- J Lahti
  - Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- E Walton
  - Department of Psychology, University of Bath, Bath, United Kingdom
- M Bekkhus
  - PROMENTA Research Centre, Department of Psychology, University of Oslo, Oslo, Norway
- C A M Cecil
  - Department of Epidemiology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Child and Adolescent Psychiatry and Psychology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Molecular Epidemiology, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands
8. Lobel JH, Ingolia NT. Precise measurement of molecular phenotypes with barcode-based CRISPRi systems. bioRxiv 2024:2024.06.21.600132. [PMID: 38948701 PMCID: PMC11213135 DOI: 10.1101/2024.06.21.600132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Genome-wide CRISPR-Cas9 screens have untangled regulatory networks and revealed the genetic underpinnings of diverse biological processes. Their success relies on experimental designs that interrogate specific molecular phenotypes and distinguish key regulators from background effects. Here, we realize these goals with a generalizable platform for CRISPR interference with barcoded expression reporter sequencing (CiBER-seq) that dramatically improves the sensitivity and scope of genome-wide screens. We systematically address technical factors that distort phenotypic measurements by normalizing expression reporters against closely-matched control promoters, integrated together into the genome at single copy. To test our ability to capture post-transcriptional and post-translational regulation through sequencing, we screened for genes that affected nonsense-mediated mRNA decay and Doa10-mediated cytosolic protein decay. Our optimized CiBER-seq screens accurately capture the known components of well-studied RNA and protein quality control pathways with minimal background. These results demonstrate the precision and versatility of CiBER-seq for dissecting the genetic networks controlling cellular behaviors.
Affiliation(s)
- Joseph H. Lobel
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Nicholas T. Ingolia
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
  - Lead contact
9. Regner MJ, Garcia-Recio S, Thennavan A, Wisniewska K, Mendez-Giraldez R, Felsheim B, Spanheimer PM, Parker JS, Perou CM, Franco HL. Defining the Regulatory Logic of Breast Cancer Using Single-Cell Epigenetic and Transcriptome Profiling. bioRxiv 2024:2024.06.13.598858. [PMID: 38948758 PMCID: PMC11212881 DOI: 10.1101/2024.06.13.598858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Annotation of the cis-regulatory elements that drive transcriptional dysregulation in cancer cells is critical to improving our understanding of tumor biology. Herein, we present a compendium of matched chromatin accessibility (scATAC-seq) and transcriptome (scRNA-seq) profiles at single-cell resolution from human breast tumors and healthy mammary tissues processed immediately following surgical resection. We identify the most likely cell-of-origin for luminal breast tumors and basal breast tumors and then introduce a novel methodology that implements linear mixed-effects models to systematically quantify associations between regions of chromatin accessibility (i.e., regulatory elements) and gene expression in malignant cells versus normal mammary epithelial cells. These data unveil regulatory elements that switch from silencers of gene expression in normal cells to enhancers of gene expression in cancer cells, leading to the upregulation of clinically relevant oncogenes. To translate the utility of this dataset into tractable models, we generated matched scATAC-seq and scRNA-seq profiles for breast cancer cell lines, revealing, for each subtype, a conserved oncogenic gene expression program between in vitro and in vivo cells. Together, this work highlights the importance of non-coding regulatory mechanisms that underlie oncogenic processes and the ability of single-cell multi-omics to define the regulatory logic of BC cells at single-cell resolution.
Affiliation(s)
- Matthew J. Regner
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Susana Garcia-Recio
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Aatish Thennavan
  - Department of Systems Biology, UT MD Anderson Cancer Center, Houston, TX, 77030, USA
- Kamila Wisniewska
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Raul Mendez-Giraldez
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Brooke Felsheim
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Philip M. Spanheimer
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Surgery, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Joel S. Parker
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Charles M. Perou
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Pathology and Laboratory Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Hector L. Franco
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Division of Clinical and Translational Cancer Research, University of Puerto Rico Comprehensive Cancer Center, San Juan, PR 00935
10
|
Diamond PD, McGlincy NJ, Ingolia NT. Depletion of cap-binding protein eIF4E dysregulates amino acid metabolic gene expression. Mol Cell 2024; 84:2119-2134.e5. [PMID: 38848691 DOI: 10.1016/j.molcel.2024.05.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/21/2023] [Revised: 02/21/2024] [Accepted: 05/09/2024] [Indexed: 06/09/2024]
Abstract
Protein synthesis is metabolically costly and must be tightly coordinated with changing cellular needs and nutrient availability. The cap-binding protein eIF4E makes the earliest contact between mRNAs and the translation machinery, offering a key regulatory nexus. We acutely depleted this essential protein and found surprisingly modest effects on cell growth and recovery of protein synthesis. Paradoxically, impaired protein biosynthesis upregulated genes involved in the catabolism of aromatic amino acids simultaneously with the induction of the amino acid biosynthetic regulon driven by the integrated stress response factor GCN4. We further identified the translational control of Pho85 cyclin 5 (PCL5), a negative regulator of Gcn4, that provides a consistent protein-to-mRNA ratio under varied translation environments. This regulation depended in part on a uniquely long poly(A) tract in the PCL5 5' UTR and poly(A) binding protein. Collectively, these results highlight how eIF4E connects protein synthesis to metabolic gene regulation, uncovering mechanisms controlling translation during environmental challenges.
Affiliation(s)
- Paige D Diamond
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Nicholas J McGlincy
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Nicholas T Ingolia
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
  - Center for Computational Biology and California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA 94720, USA

11
Deng C, Whalen S, Steyert M, Ziffra R, Przytycki PF, Inoue F, Pereira DA, Capauto D, Norton S, Vaccarino FM, Pollen AA, Nowakowski TJ, Ahituv N, Pollard KS. Massively parallel characterization of regulatory elements in the developing human cortex. Science 2024; 384:eadh0559. [PMID: 38781390 DOI: 10.1126/science.adh0559] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/15/2023] [Accepted: 03/13/2024] [Indexed: 05/25/2024]
Abstract
Nucleotide changes in gene regulatory elements are important determinants of neuronal development and diseases. Using massively parallel reporter assays in primary human cells from mid-gestation cortex and cerebral organoids, we interrogated the cis-regulatory activity of 102,767 open chromatin regions, including thousands of sequences with cell type-specific accessibility and variants associated with brain gene regulation. In primary cells, we identified 46,802 active enhancer sequences and 164 variants that alter enhancer activity. Activity was comparable in organoids and primary cells, suggesting that organoids provide an adequate model for the developing cortex. Using deep learning we decoded the sequence basis and upstream regulators of enhancer activity. This work establishes a comprehensive catalog of functional gene regulatory elements and variants in human neuronal development.
Affiliation(s)
- Chengyu Deng
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
- Sean Whalen
  - Gladstone Institutes, San Francisco, CA 94158, USA
- Marilyn Steyert
  - Department of Anatomy, University of California, San Francisco, San Francisco, CA 94143, USA
  - Department of Psychiatry, University of California, San Francisco, San Francisco, CA 94143, USA
  - Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
  - Chan Zuckerberg Biohub, San Francisco, San Francisco, CA 94158, USA
  - Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, CA 94143, USA
  - Weill Institute for Neurosciences, University of California, San Francisco, CA 94158, USA
- Ryan Ziffra
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Department of Anatomy, University of California, San Francisco, San Francisco, CA 94143, USA
  - Department of Psychiatry, University of California, San Francisco, San Francisco, CA 94143, USA
- Fumitaka Inoue
  - Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto 606-8501, Japan
- Daniela A Pereira
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
  - Graduate Program of Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais 31270-901, Brazil
- Davide Capauto
  - Child Study Center, Yale University, New Haven, CT 06520, USA
- Scott Norton
  - Child Study Center, Yale University, New Haven, CT 06520, USA
- Flora M Vaccarino
  - Child Study Center, Yale University, New Haven, CT 06520, USA
  - Department of Neuroscience, Yale University, New Haven, CT 06520, USA
- Alex A Pollen
  - Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, CA 94143, USA
  - Weill Institute for Neurosciences, University of California, San Francisco, CA 94158, USA
  - Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA
  - Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA
- Tomasz J Nowakowski
  - Department of Anatomy, University of California, San Francisco, San Francisco, CA 94143, USA
  - Department of Psychiatry, University of California, San Francisco, San Francisco, CA 94143, USA
  - Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
  - Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, CA 94143, USA
  - Weill Institute for Neurosciences, University of California, San Francisco, CA 94158, USA
  - Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA
- Nadav Ahituv
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
- Katherine S Pollard
  - Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
  - Gladstone Institutes, San Francisco, CA 94158, USA
  - Chan Zuckerberg Biohub, San Francisco, San Francisco, CA 94158, USA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94158, USA

12
Chao KH, Heinz JM, Hoh C, Mao A, Shumate A, Pertea M, Salzberg SL. Combining DNA and protein alignments to improve genome annotation with LiftOn. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.16.593026. [PMID: 38798552 PMCID: PMC11118573 DOI: 10.1101/2024.05.16.593026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
As the number and variety of assembled genomes continues to grow, the number of annotated genomes is falling behind, particularly for eukaryotes. DNA-based mapping tools help to address this challenge, but they are only able to transfer annotation between closely-related species. Here we introduce LiftOn, a homology-based software tool that integrates DNA and protein alignments to enhance the accuracy of genome-scale annotation and to allow mapping between relatively distant species. LiftOn's protein-centric algorithm considers both types of alignments, chooses optimal open reading frames, resolves overlapping gene loci, and finds additional gene copies where they exist. LiftOn can reliably transfer annotation between genomes representing members of the same species, as we demonstrate on human, mouse, honey bee, rice, and Arabidopsis thaliana. It can further map annotation effectively across species pairs as far apart as mouse and rat or Drosophila melanogaster and D. erecta.
Affiliation(s)
- Kuan-Hao Chao
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Jakob M. Heinz
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Celine Hoh
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Alan Mao
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Alaina Shumate
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Mihaela Pertea
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Steven L Salzberg
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21211, USA

13
Ahern DT, Bansal P, Faustino IV, Glatt-Deeley HR, Massey R, Kondaveeti Y, Banda EC, Pinter SF. Isogenic hiPSC models of Turner syndrome development reveal shared roles of inactive X and Y in the human cranial neural crest network. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.03.08.531747. [PMID: 36945647 PMCID: PMC10028916 DOI: 10.1101/2023.03.08.531747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2023]
Abstract
Modeling the developmental etiology of viable human aneuploidy can be challenging in rodents due to syntenic boundaries, or primate-specific biology. In humans, monosomy-X (45,X) causes Turner syndrome (TS), altering craniofacial, skeletal, endocrine, and cardiovascular development, which in contrast remain unaffected in 39,X-mice. To learn how human monosomy-X may impact early embryonic development, we turned to human 45,X and isogenic euploid induced pluripotent stem cells (hiPSCs) from male and female mosaic donors. Because neural crest (NC) derived cell types are hypothesized to underpin craniofacial and cardiovascular changes in TS, we performed a highly-powered differential expression study on hiPSC-derived anterior neural crest cells (NCCs). Across three independent isogenic panels, 45,X NCCs show impaired acquisition of PAX7+SOX10+ markers, and disrupted expression of other NCC-specific genes, relative to their isogenic euploid controls. In particular, 45,X NCCs increase cholesterol biosynthesis genes while reducing transcripts that feature 5' terminal oligopyrimidine (TOP) motifs, including those of ribosomal protein and nuclear-encoded mitochondrial genes. Such metabolic pathways are also over-represented in weighted co-expression gene modules that are preserved in monogenic neurocristopathy. Importantly, these gene modules are also significantly enriched in 28% of all TS-associated terms of the human phenotype ontology. Our analysis identifies specific sex-linked genes that are expressed from two copies in euploid males and females alike and qualify as candidate haploinsufficient drivers of TS phenotypes in NC-derived lineages. This study demonstrates that isogenic hiPSC-derived NCC panels representing monosomy-X can serve as a powerful model of early NC development in TS and inform new hypotheses towards its etiology.
Affiliation(s)
- Darcy T. Ahern
  - Graduate Program in Genetics and Developmental Biology, UCONN Health, University of Connecticut, Farmington, CT, United States
  - Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, United States
- Prakhar Bansal
  - Graduate Program in Genetics and Developmental Biology, UCONN Health, University of Connecticut, Farmington, CT, United States
  - Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, United States
- Isaac V. Faustino
  - Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, United States
- Heather R. Glatt-Deeley
  - Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, United States
- Rachael Massey
  - Graduate Program in Genetics and Developmental Biology, UCONN Health, University of Connecticut, Farmington, CT, United States
  - Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, United States
  - Institute for Systems Genomics, University of Connecticut, Farmington, CT, United States
- Yuvabharath Kondaveeti
  - Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, United States
- Erin C. Banda
  - Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, United States
- Stefan F. Pinter
  - Graduate Program in Genetics and Developmental Biology, UCONN Health, University of Connecticut, Farmington, CT, United States
  - Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, United States
  - Institute for Systems Genomics, University of Connecticut, Farmington, CT, United States

14
Lee G, Wong C, Cho A, West JJ, Crawford AJ, Russo GC, Si BR, Kim J, Hoffner L, Jang C, Jung M, Leone RD, Konstantopoulos K, Ewald AJ, Wirtz D, Jeong S. Serine synthesis pathway upregulated by E-cadherin is essential for the proliferation and metastasis of breast cancers. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.05.24.541452. [PMID: 37292712 PMCID: PMC10245808 DOI: 10.1101/2023.05.24.541452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
The loss of E-cadherin (E-cad), an epithelial cell adhesion molecule, has been implicated in the epithelial-mesenchymal transition (EMT), promoting invasion and migration of cancer cells and, consequently, metastasis. However, recent studies have demonstrated that E-cad supports the survival and proliferation of metastatic cancer cells, suggesting that our understanding of E-cad in metastasis is far from comprehensive. Here, we report that E-cad upregulates the de novo serine synthesis pathway (SSP) in breast cancer cells. The SSP provides metabolic precursors for biosynthesis and resistance to oxidative stress, critically beneficial for E-cad-positive breast cancer cells to achieve faster tumor growth and more metastases. Inhibition of PHGDH, a rate-limiting enzyme in the SSP, significantly and specifically hampered the proliferation of E-cad-positive breast cancer cells and rendered them vulnerable to oxidative stress, inhibiting their metastatic potential. Our findings reveal that the E-cad adhesion molecule significantly reprograms cellular metabolism, promoting tumor growth and metastasis of breast cancers.

15
Farrow SL, Gokuladhas S, Schierding W, Pudjihartono M, Perry JK, Cooper AA, O'Sullivan JM. Identification of 27 allele-specific regulatory variants in Parkinson's disease using a massively parallel reporter assay. NPJ Parkinsons Dis 2024; 10:44. [PMID: 38413607 PMCID: PMC10899198 DOI: 10.1038/s41531-024-00659-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Accepted: 02/12/2024] [Indexed: 02/29/2024]
Abstract
Genome-wide association studies (GWAS) have identified a number of genomic loci that are associated with Parkinson's disease (PD) risk. However, the majority of these variants lie in non-coding regions, and thus the mechanisms by which they influence disease development, and/or potential subtypes, remain largely elusive. To address this, we used a massively parallel reporter assay (MPRA) to screen the regulatory function of 5254 variants that have a known or putative connection to PD. We identified 138 loci with enhancer activity, of which 27 exhibited allele-specific regulatory activity in HEK293 cells. The identified regulatory variant(s) typically did not match the original tag variant within the PD-associated locus, supporting the need for deeper exploration of these loci. The existence of allele-specific transcriptional impacts within HEK293 cells confirms that at least a subset of the PD-associated regions mark functional gene regulatory elements. Future functional studies that confirm the putative targets of the empirically verified regulatory variants will be crucial for gaining a greater understanding of how gene regulatory network(s) modulate PD risk.
Affiliation(s)
- Sophie L Farrow
  - Liggins Institute, The University of Auckland, Auckland, New Zealand
  - The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
- William Schierding
  - Liggins Institute, The University of Auckland, Auckland, New Zealand
  - The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
  - Department of Ophthalmology, The University of Auckland, Auckland, New Zealand
- Jo K Perry
  - Liggins Institute, The University of Auckland, Auckland, New Zealand
  - The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
- Antony A Cooper
  - Australian Parkinsons Mission, Garvan Institute of Medical Research, Sydney, NSW, Australia
  - St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
- Justin M O'Sullivan
  - Liggins Institute, The University of Auckland, Auckland, New Zealand
  - The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
  - Australian Parkinsons Mission, Garvan Institute of Medical Research, Sydney, NSW, Australia
  - Singapore Institute for Clinical Sciences, Agency for Science Technology and Research, Singapore, Singapore
  - MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, United Kingdom

16
Vanhove M, Schwabl P, Clementson C, Early AM, Laws M, Anthony F, Florimond C, Mathieu L, James K, Knox C, Singh N, Buckee CO, Musset L, Cox H, Niles-Robin R, Neafsey DE. Temporal and spatial dynamics of Plasmodium falciparum clonal lineages in Guyana. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.31.578156. [PMID: 38352461 PMCID: PMC10862847 DOI: 10.1101/2024.01.31.578156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/24/2024]
Abstract
Plasmodium parasites, the causal agents of malaria, are eukaryotic organisms that obligately undergo sexual recombination within mosquitoes. However, in low transmission settings where most mosquitoes become infected with only a single parasite clone, parasites recombine with themselves, and the clonal lineage is propagated rather than broken up by outcrossing. We investigated whether stochastic/neutral factors drive the persistence and abundance of Plasmodium falciparum clonal lineages in Guyana, a country with relatively low malaria transmission, but the only setting in the Americas in which an important artemisinin resistance mutation (pfk13 C580Y) has been observed. To investigate whether this clonality was potentially associated with the persistence and spatial spread of the mutation, we performed whole genome sequencing on 1,727 Plasmodium falciparum samples collected from infected patients across a five-year period (2016-2021). We characterized the relatedness between each pair of monoclonal infections (n=1,409) through estimation of identity by descent (IBD) and also typed each sample for known or candidate drug resistance mutations. A total of 160 clones (mean IBD ≥ 0.90) were circulating in Guyana during the study period, comprising 13 highly related clusters (mean IBD ≥ 0.40). In the five-year study period, we observed a decrease in frequency of a mutation associated with artemisinin partner drug (piperaquine) resistance (pfcrt C350R) and limited co-occurrence of pfcrt C350R with duplications of plasmepsin 2/3, an epistatic interaction associated with piperaquine resistance. We additionally report polymorphisms exhibiting evidence of selection for drug resistance or other phenotypes, as well as a novel pfk13 mutation (G718S) and 61 nonsynonymous substitutions that increased markedly in frequency. However, P. falciparum clonal dynamics in Guyana appear to be largely driven by stochastic factors, in contrast to other geographic regions.
The use of multiple artemisinin combination therapies in Guyana may have contributed to the disappearance of the pfk13 C580Y mutation.
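The two relatedness thresholds quoted in this abstract (IBD ≥ 0.90 for clones, IBD ≥ 0.40 for highly related clusters) can be illustrated with a small sketch. The abstract does not say how samples were grouped, so the code below assumes single-linkage grouping (connected components over pairs at or above the threshold); the function name `cluster_by_ibd`, the sample labels, and the IBD values are all hypothetical.

```python
from itertools import combinations


def cluster_by_ibd(samples, pairwise_ibd, threshold):
    """Group samples whose pairwise IBD is >= threshold (single linkage).

    Assumption: connected components are used as a stand-in for the
    grouping rule, which the abstract does not specify.
    """
    parent = {s: s for s in samples}

    def find(s):
        # Walk up to the root of the component, compressing the path.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(samples, 2):
        # Pairs missing from the table are treated as unrelated (IBD 0).
        ibd = pairwise_ibd.get((a, b), pairwise_ibd.get((b, a), 0.0))
        if ibd >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in samples:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy data, not taken from the study.
samples = ["s1", "s2", "s3", "s4"]
pairwise_ibd = {
    ("s1", "s2"): 0.97,
    ("s1", "s3"): 0.45,
    ("s2", "s3"): 0.42,
    ("s3", "s4"): 0.05,
}

clones = cluster_by_ibd(samples, pairwise_ibd, 0.90)
clusters = cluster_by_ibd(samples, pairwise_ibd, 0.40)
```

With these toy values, the stricter threshold keeps `s3` apart from the `s1`/`s2` clone, while the looser threshold merges all three into one cluster.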
Affiliation(s)
- Mathieu Vanhove
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Philipp Schwabl
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Angela M Early
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Margaret Laws
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Frank Anthony
  - National Malaria Program, Ministry of Health, Georgetown, Guyana
- Célia Florimond
  - Laboratoire de parasitologie, World Health Organization Collaborating Center for Surveillance of Antimalarial Drug Resistance, Center Nationale de Référence du Paludisme, Institut Pasteur de la Guyane, Cayenne, French Guiana
- Luana Mathieu
  - Laboratoire de parasitologie, World Health Organization Collaborating Center for Surveillance of Antimalarial Drug Resistance, Center Nationale de Référence du Paludisme, Institut Pasteur de la Guyane, Cayenne, French Guiana
- Kashana James
  - National Malaria Program, Ministry of Health, Georgetown, Guyana
- Cheyenne Knox
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Narine Singh
  - National Malaria Program, Ministry of Health, Georgetown, Guyana
- Caroline O Buckee
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Lise Musset
  - Laboratoire de parasitologie, World Health Organization Collaborating Center for Surveillance of Antimalarial Drug Resistance, Center Nationale de Référence du Paludisme, Institut Pasteur de la Guyane, Cayenne, French Guiana
- Horace Cox
  - National Malaria Program, Ministry of Health, Georgetown, Guyana
  - Caribbean Public Health Agency, Trinidad and Tobago
- Reza Niles-Robin
  - National Malaria Program, Ministry of Health, Georgetown, Guyana
- Daniel E Neafsey
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA

17
Feng Y, Xie N, Inoue F, Fan S, Saskin J, Zhang C, Zhang F, Hansen MEB, Nyambo T, Mpoloka SW, Mokone GG, Fokunang C, Belay G, Njamnshi AK, Marks MS, Oancea E, Ahituv N, Tishkoff SA. Integrative functional genomic analyses identify genetic variants influencing skin pigmentation in Africans. Nat Genet 2024; 56:258-272. [PMID: 38200130 PMCID: PMC11005318 DOI: 10.1038/s41588-023-01626-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 11/28/2023] [Indexed: 01/12/2024]
Abstract
Skin color is highly variable in Africans, yet little is known about the underlying molecular mechanism. Here we applied massively parallel reporter assays to screen 1,157 candidate variants influencing skin pigmentation in Africans and identified 165 single-nucleotide polymorphisms showing differential regulatory activities between alleles. We combine Hi-C, genome editing and melanin assays to identify regulatory elements for MFSD12, HMG20B, OCA2, MITF, LEF1, TRPS1, BLOC1S6 and CYB561A3 that impact melanin levels in vitro and modulate human skin color. We found that independent mutations in an OCA2 enhancer contribute to the evolution of human skin color diversity and detect signals of local adaptation at enhancers of MITF, LEF1 and TRPS1, which may contribute to the light skin color of Khoesan-speaking populations from Southern Africa. Additionally, we identified CYB561A3 as a novel pigmentation regulator that impacts genes involved in oxidative phosphorylation and melanogenesis. These results provide insights into the mechanisms underlying human skin color diversity and adaptive evolution.
Affiliation(s)
- Yuanqing Feng
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Ning Xie
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Fumitaka Inoue
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
  - Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
  - Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
- Shaohua Fan
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
  - Human Phenome Institute, School of Life Science, Fudan University, Shanghai, China
- Joshua Saskin
  - Department of Neuroscience, Brown University, Providence, RI, USA
- Chao Zhang
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Fang Zhang
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Matthew E B Hansen
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Thomas Nyambo
  - Department of Biochemistry and Molecular Biology, Hubert Kairuki Memorial University, Dar es Salaam, Tanzania
- Sununguko Wata Mpoloka
  - Department of Biological Sciences, Faculty of Sciences, University of Botswana, Gaborone, Botswana
- Charles Fokunang
  - Department of Pharmacotoxicology and Pharmacokinetics, Faculty of Medicine and Biomedical Sciences, The University of Yaoundé I, Yaoundé, Cameroon
- Gurja Belay
  - Department of Biology, Addis Ababa University, Addis Ababa, Ethiopia
- Alfred K Njamnshi
  - Brain Research Africa Initiative (BRAIN); Neuroscience Lab, Faculty of Medicine and Biomedical Sciences, The University of Yaoundé I, Department of Neurology, Central Hospital Yaoundé, Yaoundé, Cameroon
- Michael S Marks
  - Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia Research Institute, Philadelphia, PA, USA
- Elena Oancea
  - Department of Neuroscience, Brown University, Providence, RI, USA
- Nadav Ahituv
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
  - Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Sarah A Tishkoff
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
  - Center for Global Genomics and Health Equity, University of Pennsylvania, Philadelphia, PA, USA

18
Liu W, Pratte KA, Castaldi PJ, Hersh C, Bowler RP, Banaei-Kashani F, Kechris KJ. A Generalized Higher-order Correlation Analysis Framework for Multi-Omics Network Inference. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.22.576667. [PMID: 38328226 PMCID: PMC10849540 DOI: 10.1101/2024.01.22.576667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Multiple -omics (genomics, proteomics, etc.) profiles are commonly generated to gain insight into a disease or physiological system. Constructing multi-omics networks with respect to the trait(s) of interest provides an opportunity to understand relationships between molecular features but integration is challenging due to multiple data sets with high dimensionality. One approach is to use canonical correlation to integrate one or two omics types and a single trait of interest. However, these types of methods may be limited due to (1) not accounting for higher-order correlations existing among features, (2) computational inefficiency when extending to more than two omics data when using a penalty term-based sparsity method, and (3) lack of flexibility for focusing on specific correlations (e.g., omics-to-phenotype correlation versus omics-to-omics correlations). In this work, we have developed a novel multi-omics network analysis pipeline called Sparse Generalized Tensor Canonical Correlation Analysis Network Inference (SGTCCA-Net) that can effectively overcome these limitations. We also introduce an implementation to improve the summarization of networks for downstream analyses. Simulation and real-data experiments demonstrate the effectiveness of our novel method for inferring omics networks and features of interest.
Affiliation(s)
- Weixuan Liu
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | | | - Peter J. Castaldi
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, United States
| | - Craig Hersh
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, United States
| | - Russell P. Bowler
- Division of Pulmonary Medicine, Department of Medicine, National Jewish Health, Denver, CO, USA
| | - Farnoush Banaei-Kashani
- Department of Computer Science and Engineering, College of Engineering, Design and Computing, University of Colorado Denver, Denver, CO, USA
| | - Katerina J. Kechris
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| |
|
19
|
Wang L, Dong W, Yin Z, Sheng J, Ezeana CF, Yang L, Yu X, Wong SSY, Wan Z, Danforth RL, Han K, Gao D, Wong STC. Charting Single Cell Lineage Dynamics and Mutation Networks via Homing CRISPR. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.05.574236. [PMID: 38260351 PMCID: PMC10802354 DOI: 10.1101/2024.01.05.574236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Single cell lineage tracing, essential for unraveling cellular dynamics in disease evolution, is critical for developing targeted therapies. CRISPR-Cas9, known for inducing permanent and cumulative mutations, is a cornerstone in lineage tracing. The novel homing guide RNA (hgRNA) technology enhances this by enabling dynamic retargeting and facilitating ongoing genetic modifications. Charting these mutations, especially through successive hgRNA edits, poses a significant challenge. Our solution, LINEMAP, is a computational framework designed to trace and map these mutations with precision. LINEMAP meticulously discerns mutation alleles at single-cell resolution and maps their complex interrelationships through a mutation evolution network. By utilizing a Markov process model, we can predict mutation transition probabilities, revealing potential mutational routes and pathways. Our reconstruction algorithm, anchored in the Markov model's attributes, reconstructs cellular lineage pathways, shedding light on the cell's evolutionary journey to the minutiae of single-cell division. Our findings reveal an intricate network of mutation evolution paired with a predictive Markov model, advancing our capability to reconstruct single-cell lineage via hgRNA. This has substantial implications for advancing our understanding of biological mechanisms and propelling medical research forward.
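The Markov-process idea mentioned in the abstract can be illustrated with a minimal sketch: transition probabilities between mutation alleles are estimated by counting observed parent-to-child allele transitions and normalizing per parent. This is not LINEMAP's implementation, which this record does not describe; the helper `estimate_transitions` and the allele labels are hypothetical.

```python
from collections import Counter, defaultdict

def estimate_transitions(edges):
    """Maximum-likelihood transition probabilities from (parent, child) allele pairs."""
    counts = defaultdict(Counter)
    for parent, child in edges:
        counts[parent][child] += 1
    probs = {}
    for parent, children in counts.items():
        total = sum(children.values())
        # Normalize counts so each parent's outgoing probabilities sum to 1.
        probs[parent] = {child: n / total for child, n in children.items()}
    return probs

# Hypothetical observations: "WT" is the unedited allele; other labels are made up.
observed = [
    ("WT", "del3"),
    ("WT", "del3"),
    ("WT", "ins1"),
    ("del3", "del3+ins2"),
]
P = estimate_transitions(observed)
print(P["WT"])
```

Under these assumptions, each row of `P` is one state's outgoing distribution in a first-order Markov chain over alleles.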
Affiliation(s)
- Lin Wang
- Department of System Medicine and Bioengineering, Houston Methodist Neal Cancer Center, Houston, Texas 77030
| | - Wenjuan Dong
- Department of System Medicine and Bioengineering, Houston Methodist Neal Cancer Center, Houston, Texas 77030
| | - Zheng Yin
- Department of System Medicine and Bioengineering, Houston Methodist Neal Cancer Center, Houston, Texas 77030
- Biostatistics and Bioinformatics Shared Resource, Houston Methodist Neal Cancer Center, Houston, Texas 77030
| | - Jianting Sheng
- Department of System Medicine and Bioengineering, Houston Methodist Neal Cancer Center, Houston, Texas 77030
| | - Chika F. Ezeana
- Department of System Medicine and Bioengineering, Houston Methodist Neal Cancer Center, Houston, Texas 77030
| | - Li Yang
- T.T. and W. F. Chao Center for BRAIN, Houston Methodist Research Institute, Houston, Texas 77030
| | - Xiaohui Yu
- Department of System Medicine and Bioengineering, Houston Methodist Neal Cancer Center, Houston, Texas 77030
| | | | - Zhihao Wan
- Department of System Medicine and Bioengineering, Houston Methodist Neal Cancer Center, Houston, Texas 77030
| | - Rebecca L. Danforth
- Department of System Medicine and Bioengineering, Houston Methodist Neal Cancer Center, Houston, Texas 77030
| | - Kun Han
- Department of System Medicine and Bioengineering, Houston Methodist Neal Cancer Center, Houston, Texas 77030
| | - Dingcheng Gao
- Department of Cell & Development Biology, Weill Cornell Medical College, New York, NY 10065
| | - Stephen T. C. Wong
- Department of System Medicine and Bioengineering, Houston Methodist Neal Cancer Center, Houston, Texas 77030
- Departments of Radiology, Pathology and Genomic Medicine, Houston Methodist Hospital, Weill Cornell Medical College, Houston, TX 77030
| |
|
20
|
Walter KS, Cohen T, Mathema B, Colijn C, Sobkowiak B, Comas I, Goig GA, Croda J, Andrews JR. Signatures of transmission in within-host M. tuberculosis variation. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.12.28.23300451. [PMID: 38234741 PMCID: PMC10793532 DOI: 10.1101/2023.12.28.23300451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Background: Because M. tuberculosis evolves slowly, transmission clusters often contain multiple individuals with identical consensus genomes, making it difficult to reconstruct transmission chains. Finding additional sources of shared M. tuberculosis variation could help overcome this problem. Previous studies have reported M. tuberculosis diversity within infected individuals; however, whether within-host variation improves transmission inferences remains unclear. Methods: To evaluate the transmission information present in within-host M. tuberculosis variation, we re-analyzed publicly available sequence data from three household transmission studies, using household membership as a proxy for transmission linkage between donor-recipient pairs. Findings: We found moderate levels of minority variation present in M. tuberculosis sequence data from cultured isolates that varied significantly across studies (mean: 6, 7, and 170 minority variants above a 1% minor allele frequency threshold, outside of PE/PPE genes). Isolates from household members shared more minority variants than did isolates from unlinked individuals in the three studies (mean 98 shared minority variants vs. 10; 0.8 vs. 0.2, and 0.7 vs. 0.2, respectively). Shared within-host variation was significantly associated with household membership (OR: 1.51 [1.30,1.71], for one standard deviation increase in shared minority variants). Models that included shared within-host variation improved the accuracy of predicting household membership in all three studies as compared to models without within-host variation (AUC: 0.95 versus 0.92, 0.99 versus 0.95, and 0.93 versus 0.91). Interpretation: Within-host M. tuberculosis variation persists through culture and could enhance the resolution of transmission inferences. The substantial differences in minority variation recovered across studies highlight the need to optimize approaches to recover and incorporate within-host variation into automated phylogenetic and transmission inference. Funding: NIAID 5K01AI173385.
Affiliation(s)
| | - Ted Cohen
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, USA
| | - Barun Mathema
- Department of Epidemiology, Columbia University Mailman School of Public Health; New York, United States
| | - Caroline Colijn
- Department of Mathematics, Simon Fraser University; Burnaby, Canada
| | - Benjamin Sobkowiak
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, USA
| | - Iñaki Comas
- Institute of Biomedicine of Valencia (CSIC), Valencia, Spain
| | - Galo A Goig
- Swiss Tropical and Public Health Institute, Allschwil, Switzerland
- University of Basel, Basel, Switzerland
| | - Julio Croda
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, USA
- Federal University of Mato Grosso do Sul - UFMS, Campo Grande, MS, Brazil
- Oswaldo Cruz Foundation Mato Grosso do Sul, Mato Grosso do Sul, Brazil
| | - Jason R Andrews
- Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, CA, USA
| |
|
21
|
McAfee JC, Lee S, Lee J, Bell JL, Krupa O, Davis J, Insigne K, Bond ML, Zhao N, Boyle AP, Phanstiel DH, Love MI, Stein JL, Ruzicka WB, Davila-Velderrain J, Kosuri S, Won H. Systematic investigation of allelic regulatory activity of schizophrenia-associated common variants. CELL GENOMICS 2023; 3:100404. [PMID: 37868037 PMCID: PMC10589626 DOI: 10.1016/j.xgen.2023.100404] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/22/2022] [Revised: 02/23/2023] [Accepted: 08/21/2023] [Indexed: 10/24/2023]
Abstract
Genome-wide association studies (GWASs) have successfully identified 145 genomic regions that contribute to schizophrenia risk, but linkage disequilibrium makes it challenging to discern causal variants. We performed a massively parallel reporter assay (MPRA) on 5,173 fine-mapped schizophrenia GWAS variants in primary human neural progenitors and identified 439 variants with allelic regulatory effects (MPRA-positive variants). Transcription factor binding had modest predictive power, while fine-map posterior probability, enhancer overlap, and evolutionary conservation failed to predict MPRA-positive variants. Furthermore, 64% of MPRA-positive variants did not exhibit an expression quantitative trait loci signature, suggesting that MPRA could identify yet unexplored variants with regulatory potential. To predict the combinatorial effect of MPRA-positive variants on gene regulation, we propose an accessibility-by-contact model that combines MPRA-measured allelic activity with neuronal chromatin architecture.
Affiliation(s)
- Jessica C. McAfee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Sool Lee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jiseok Lee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jessica L. Bell
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Oleh Krupa
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jessica Davis
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Kimberly Insigne
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Marielle L. Bond
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Nanxiang Zhao
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Alan P. Boyle
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Douglas H. Phanstiel
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Thurston Arthritis Research Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Michael I. Love
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jason L. Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - W. Brad Ruzicka
- Laboratory for Epigenomics in Human Psychopathology, McLean Hospital, Belmont, MA 02141, USA
- Harvard Medical School, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | | | - Sriram Kosuri
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Hyejung Won
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| |
|
22
|
Duan YY, Chen XF, Zhu RJ, Jia YY, Huang XT, Zhang M, Yang N, Dong SS, Zeng M, Feng Z, Zhu DL, Wu H, Jiang F, Shi W, Hu WX, Ke X, Chen H, Liu Y, Jing RH, Guo Y, Li M, Yang TL. High-throughput functional dissection of noncoding SNPs with biased allelic enhancer activity for insulin resistance-relevant phenotypes. Am J Hum Genet 2023; 110:1266-1288. [PMID: 37506691 PMCID: PMC10432149 DOI: 10.1016/j.ajhg.2023.07.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/25/2023] [Revised: 07/04/2023] [Accepted: 07/05/2023] [Indexed: 07/30/2023]
Abstract
Most of the single-nucleotide polymorphisms (SNPs) associated with insulin resistance (IR)-relevant phenotypes by genome-wide association studies (GWASs) are located in noncoding regions, complicating their functional interpretation. Here, we utilized an adapted STARR-seq to evaluate the regulatory activities of 5,987 noncoding SNPs associated with IR-relevant phenotypes. We identified 876 SNPs with biased allelic enhancer activity effects (baaSNPs) across 133 loci in three IR-relevant cell lines (HepG2, preadipocyte, and A673), which showed pervasive cell specificity and significant enrichment for cell-specific open chromatin regions or enhancer-indicative markers (H3K4me1, H3K27ac). Further functional characterization suggested several transcription factors (TFs) with preferential allelic binding to baaSNPs. We also incorporated multi-omics data to prioritize 102 candidate regulatory target genes for baaSNPs and revealed prevalent long-range regulatory effects and cell-specific IR-relevant biological functional enrichment among them. Specifically, we experimentally verified the distal regulatory mechanism at the IRS1 locus, in which rs952227-A reinforces IRS1 expression by long-range chromatin interaction and preferential binding to the transcription factor HOXC6 to augment the enhancer activity. Finally, based on our STARR-seq screening data, we predicted the enhancer activity of 227,343 noncoding SNPs associated with IR-relevant phenotypes (fasting insulin adjusted for BMI, HDL cholesterol, and triglycerides) from the largest available GWAS summary statistics. We further provided an open resource (http://www.bigc.online/fnSNP-IR) to support a better understanding of the genetic regulatory mechanisms of IR-relevant phenotypes.
Affiliation(s)
- Yuan-Yuan Duan
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Xiao-Feng Chen
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Ren-Jie Zhu
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Ying-Ying Jia
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Xiao-Ting Huang
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Meng Zhang
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Ning Yang
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Shan-Shan Dong
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Mengqi Zeng
- Frontier Institute of Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Zhihui Feng
- Frontier Institute of Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Dong-Li Zhu
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Hao Wu
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Feng Jiang
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Wei Shi
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Wei-Xin Hu
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Xin Ke
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Hao Chen
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Yunlong Liu
- Department of Medical and Molecular Genetics, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
| | - Rui-Hua Jing
- Department of Ophthalmology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi 710000, China
| | - Yan Guo
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Meng Li
- Department of Orthopedics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi 710061, China
| | - Tie-Lin Yang
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
- Department of Orthopedics, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi 710061, China
| |
|
23
|
Myers MA, Arnold BJ, Bansal V, Mullen KM, Zaccaria S, Raphael BJ. HATCHet2: clone- and haplotype-specific copy number inference from bulk tumor sequencing data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.07.13.548855. [PMID: 37502835 PMCID: PMC10370020 DOI: 10.1101/2023.07.13.548855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Multi-region DNA sequencing of primary tumors and metastases from individual patients helps identify somatic aberrations driving cancer development. However, most methods to infer copy-number aberrations (CNAs) analyze individual samples. We introduce HATCHet2 to identify haplotype- and clone-specific CNAs simultaneously from multiple bulk samples. HATCHet2 introduces a novel statistic, the mirrored haplotype B-allele frequency (mhBAF), to identify mirrored-subclonal CNAs having different numbers of copies of parental haplotypes in different tumor clones. HATCHet2 also has high accuracy in identifying focal CNAs and extends the earlier HATCHet method in several directions. We demonstrate HATCHet2's improved accuracy using simulations and a single-cell sequencing dataset. HATCHet2 analysis of 50 prostate cancer samples from 10 patients reveals previously unreported mirrored-subclonal CNAs affecting cancer genes.
Affiliation(s)
- Matthew A. Myers
- Department of Computer Science, Princeton University, Princeton, USA
| | - Brian J. Arnold
- Center for Statistics and Machine Learning, Princeton University, Princeton, USA
| | - Vineet Bansal
- Princeton Research Computing, Princeton University, Princeton, NJ, USA
| | - Katelyn M. Mullen
- Human Oncology & Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Simone Zaccaria
- Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK
| | | |
|
24
|
Wang Y, Selvaraj MS, Li X, Li Z, Holdcraft JA, Arnett DK, Bis JC, Blangero J, Boerwinkle E, Bowden DW, Cade BE, Carlson JC, Carson AP, Chen YDI, Curran JE, de Vries PS, Dutcher SK, Ellinor PT, Floyd JS, Fornage M, Freedman BI, Gabriel S, Germer S, Gibbs RA, Guo X, He J, Heard-Costa N, Hildalgo B, Hou L, Irvin MR, Joehanes R, Kaplan RC, Kardia SLR, Kelly TN, Kim R, Kooperberg C, Kral BG, Levy D, Li C, Liu C, Lloyd-Jone D, Loos RJF, Mahaney MC, Martin LW, Mathias RA, Minster RL, Mitchell BD, Montasser ME, Morrison AC, Murabito JM, Naseri T, O’Connell JR, Palmer ND, Preuss MH, Psaty BM, Raffield LM, Rao DC, Redline S, Reiner AP, Rich SS, Ruepena MS, Sheu WHH, Smith JA, Smith A, Tiwari HK, Tsai MY, Viaud-Martinez KA, Wang Z, Yanek LR, Zhao W, Rotter JI, Lin X, Natarajan P, Peloso GM. Rare variants in long non-coding RNAs are associated with blood lipid levels in the TOPMed Whole Genome Sequencing Study. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.06.28.23291966. [PMID: 37425772 PMCID: PMC10327287 DOI: 10.1101/2023.06.28.23291966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Long non-coding RNAs (lncRNAs) are known to perform important regulatory functions. Large-scale whole genome sequencing (WGS) studies and new statistical methods for variant set tests now provide an opportunity to assess the associations between rare variants in lncRNA genes and complex traits across the genome. In this study, we used high-coverage WGS from 66,329 participants of diverse ancestries with blood lipid levels (LDL-C, HDL-C, TC, and TG) in the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) program to investigate the role of lncRNAs in lipid variability. We aggregated rare variants for 165,375 lncRNA genes based on their genomic locations and conducted rare variant aggregate association tests using the STAAR (variant-Set Test for Association using Annotation infoRmation) framework. We performed STAAR conditional analysis adjusting for common variants in known lipid GWAS loci and rare coding variants in nearby protein coding genes. Our analyses revealed 83 rare lncRNA variant sets significantly associated with blood lipid levels, all of which were located in known lipid GWAS loci (in a ±500 kb window of a Global Lipids Genetics Consortium index variant). Notably, 61 out of 83 signals (73%) were conditionally independent of common regulatory variations and rare protein coding variations at the same loci. We replicated 34 out of 61 (56%) conditionally independent associations using the independent UK Biobank WGS data. Our results expand the genetic architecture of blood lipids to rare variants in lncRNA, implicating new therapeutic opportunities.
Affiliation(s)
- Yuxuan Wang
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Margaret Sunitha Selvaraj
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
| | - Xihao Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Zilin Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indianapolis, IN, USA
- Center for Computational Biology & Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Jacob A. Holdcraft
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Donna K. Arnett
- Provost Office, University of South Carolina, Columbia, SC, USA
- Department of Epidemiology and Biostatistics, University of South Carolina Arnold School of Public Health, Columbia, SC, USA
| | - Joshua C. Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - John Blangero
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
| | - Eric Boerwinkle
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Donald W. Bowden
- Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Brian E. Cade
- Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
| | - Jenna C. Carlson
- Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Biostatistics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - April P. Carson
- Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
| | - Yii-Der Ida Chen
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Joanne E. Curran
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
| | - Paul S. de Vries
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Susan K. Dutcher
- The McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - Patrick T. Ellinor
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - James S. Floyd
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Myriam Fornage
- Center for Human Genetics, University of Texas Health at Houston, Houston, TX, USA
| | - Barry I. Freedman
- Department of Internal Medicine, Nephrology, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | | | | | - Richard A. Gibbs
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Jiang He
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
- Tulane University Translational Science Institute, New Orleans, LA, USA
| | - Nancy Heard-Costa
- Framingham Heart Study, Framingham, MA, USA
- Department of Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
| | - Bertha Hildalgo
- Department of Epidemiology, University of Alabama at Birmingham School of Public Health, Birmingham, AL, USA
| | - Lifang Hou
- Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
| | - Marguerite R. Irvin
- Department of Epidemiology, University of Alabama at Birmingham School of Public Health, Birmingham, AL, USA
| | - Roby Joehanes
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Robert C. Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - Sharon LR. Kardia
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, USA
| | - Tanika N. Kelly
- Department of Medicine, Division of Nephrology, University of Illinois Chicago, Chicago, IL, USA
| | - Ryan Kim
- Psomagen, Inc. (formerly Macrogen USA), Rockville, MD, USA
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - Brian G. Kral
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Daniel Levy
- Framingham Heart Study, Framingham, MA, USA
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Changwei Li
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
- Tulane University Translational Science Institute, New Orleans, LA, USA
| | - Chunyu Liu
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Framingham Heart Study, Framingham, MA, USA
| | - Don Lloyd-Jones
- Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
| | - Ruth JF. Loos
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- NNF Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark
| | - Michael C. Mahaney
- Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
| | - Lisa W. Martin
- George Washington University School of Medicine and Health Sciences, Washington, DC, USA
| | - Rasika A. Mathias
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Ryan L. Minster
- Department of Human Genetics and Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
| | - Braxton D. Mitchell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - May E. Montasser
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Alanna C. Morrison
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Joanne M. Murabito
- Framingham Heart Study, Framingham, MA, USA
- Department of Medicine, Boston Medical Center, Boston University Chobanian and Avedisian School of Medicine, Boston, MA, USA
| | | | - Jeffrey R. O’Connell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Nicholette D. Palmer
- Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Michael H. Preuss
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Bruce M. Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Department of Health Systems and Population Health, University of Washington, Seattle, WA, USA
| | - Laura M. Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Dabeeru C. Rao
- Division of Biostatistics, Washington University School of Medicine, St. Louis, MO, USA
| | - Susan Redline
- Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
| | | | - Stephen S. Rich
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | | | | | - Jennifer A. Smith
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, USA
| | - Albert Smith
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
| | - Hemant K. Tiwari
- Department of Biostatistics, University of Alabama, Birmingham, AL, USA
| | - Michael Y. Tsai
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
| | | | - Zhe Wang
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Lisa R. Yanek
- GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Wei Zhao
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, USA
| | | | - Jerome I. Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Xihong Lin
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Statistics, Harvard University, Cambridge, MA, USA
| | - Pradeep Natarajan
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
| | - Gina M. Peloso
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| |
|
25
|
Soneson C, Bendel AM, Diss G, Stadler MB. mutscan-a flexible R package for efficient end-to-end analysis of multiplexed assays of variant effect data. Genome Biol 2023; 24:132. [PMID: 37264470 DOI: 10.1186/s13059-023-02967-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 05/10/2023] [Indexed: 06/03/2023]
Abstract
Multiplexed assays of variant effect (MAVE) experimentally measure the effect of large numbers of sequence variants by selective enrichment of sequences with desirable properties followed by quantification by sequencing. mutscan is an R package for flexible analysis of such experiments, covering the entire workflow from raw reads up to statistical analysis and visualization. The core components are implemented in C++ for efficiency. Various experimental designs are supported, including single or paired reads with optional unique molecular identifiers. To find variants with changed relative abundance, mutscan employs established statistical models provided in the edgeR and limma packages. mutscan is available from https://github.com/fmicompbio/mutscan.
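The idea of a variant with "changed relative abundance" can be illustrated with a minimal Python sketch. It is not mutscan's API and does not reproduce the edgeR or limma models named in the abstract. It only computes a pseudocount-stabilized log2 fold change of each variant's share of reads between an input sample and a selected sample. The function name, the pseudocount, and the counts are all hypothetical.

```python
import math


def log2_enrichment(input_counts, selected_counts, pseudocount=0.5):
    """Per-variant log2 fold change of relative abundance (selected vs. input).

    Simplified illustration only: real analyses model replicates and
    count dispersion rather than taking a single ratio.
    """
    in_total = sum(input_counts.values())
    sel_total = sum(selected_counts.values())
    scores = {}
    for variant, in_count in input_counts.items():
        # Relative abundance in each sample, with a pseudocount to avoid log(0).
        in_freq = (in_count + pseudocount) / in_total
        sel_freq = (selected_counts.get(variant, 0) + pseudocount) / sel_total
        scores[variant] = math.log2(sel_freq / in_freq)
    return scores


# Hypothetical read counts for three variants before and after selection.
input_counts = {"WT": 1000, "A12G": 500, "L30P": 500}
selected_counts = {"WT": 2000, "A12G": 1000, "L30P": 0}
scores = log2_enrichment(input_counts, selected_counts)
```

In this toy data set, WT and A12G keep the same share relative to each other, while L30P drops out of the selected sample and receives a strongly negative score.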
Affiliation(s)
- Charlotte Soneson
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland.
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
| | - Alexandra M Bendel
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
| | - Guillaume Diss
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
| | - Michael B Stadler
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland.
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
- University of Basel, Basel, Switzerland.
| |
|
26
|
Brůna T, Li H, Guhlin J, Honsel D, Herbold S, Stanke M, Nenasheva N, Ebel M, Gabriel L, Hoff KJ. GALBA: Genome Annotation with Miniprot and AUGUSTUS. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.04.10.536199. [PMID: 37090650 PMCID: PMC10120627 DOI: 10.1101/2023.04.10.536199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
The Earth BioGenome Project has rapidly increased the number of available eukaryotic genomes, but most released genomes continue to lack annotation of protein-coding genes. In addition, no transcriptome data is available for some genomes. Various gene annotation tools have been developed but each has its limitations. Here, we introduce GALBA, a fully automated pipeline that utilizes miniprot, a rapid protein-to-genome aligner, in combination with AUGUSTUS to predict genes with high accuracy. Accuracy results indicate that GALBA is particularly strong in the annotation of large vertebrate genomes. We also present use cases in insects, vertebrates, and a previously unannotated land plant. GALBA is fully open source and available as a docker image for easy execution with Singularity in high-performance computing environments. Our pipeline addresses the critical need for accurate gene annotation in newly sequenced genomes, and we believe that GALBA will greatly facilitate genome annotation for diverse organisms.
Affiliation(s)
- Tomáš Brůna
- US Department of Energy Joint Genome Institute, Berkeley, CA 94720, USA
| | - Heng Li
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA 02215, USA & Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Joseph Guhlin
- Genomics Aotearoa and Laboratory for Evolution and Development, Department of Biochemistry, University of Otago, PO Box 56, Dunedin 9016, New Zealand
| | - Daniel Honsel
- Institute of Computer Science, University of Göttingen, 37077 Göttingen, Germany
| | - Steffen Herbold
- Faculty for Computer Science and Mathematics, University of Passau, 94032 Passau, Germany
| | - Mario Stanke
- Institute of Mathematics and Computer Science & Center for Functional Genomics of Microbes, University of Greifswald, 17489 Greifswald, Germany
| | - Natalia Nenasheva
- Institute of Mathematics and Computer Science & Center for Functional Genomics of Microbes, University of Greifswald, 17489 Greifswald, Germany
| | - Matthis Ebel
- Institute of Mathematics and Computer Science & Center for Functional Genomics of Microbes, University of Greifswald, 17489 Greifswald, Germany
| | - Lars Gabriel
- Institute of Mathematics and Computer Science & Center for Functional Genomics of Microbes, University of Greifswald, 17489 Greifswald, Germany
| | - Katharina J. Hoff
- Institute of Mathematics and Computer Science & Center for Functional Genomics of Microbes, University of Greifswald, 17489 Greifswald, Germany
| |
|
27
|
Speer RM, Nandi SP, Cooper KL, Zhou X, Yu H, Guo Y, Hudson LG, Alexandrov LB, Liu KJ. Arsenic is a potent co-mutagen of ultraviolet light. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.02.22.529578. [PMID: 36865271 PMCID: PMC9980120 DOI: 10.1101/2023.02.22.529578] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 02/25/2023]
Abstract
Environmental co-exposures are widespread and are major contributors to carcinogenic mechanisms. Two well-established environmental agents causing skin cancer are ultraviolet radiation (UVR) and arsenic. Arsenic is a known co-carcinogen that enhances UVR's carcinogenicity. However, the mechanisms of arsenic co-carcinogenesis are not well understood. In this study, we utilized primary human keratinocytes and a hairless mouse model to investigate the carcinogenic and mutagenic properties of co-exposure to arsenic and UVR. In vitro and in vivo exposures revealed that, on its own, arsenic is neither mutagenic nor carcinogenic. However, in combination with UVR, arsenic exposure has a synergistic effect leading to an accelerated mouse skin carcinogenesis as well as to more than 2-fold enrichment of UVR mutational burden. Notably, mutational signature ID13, previously found only in UVR-associated human skin cancers, was observed exclusively in mouse skin tumors and cell lines jointly exposed to arsenic and UVR. This signature was not observed in any model system exposed purely to arsenic or purely to UVR, making ID13 the first co-exposure signature to be reported using controlled experimental conditions. Analysis of existing genomics data from basal cell carcinomas and melanomas revealed that only a subset of human skin cancers harbor ID13 and, consistent with our experimental observations, these cancers exhibited an elevated UVR mutagenesis. Our results provide the first report of a unique mutational signature caused by a co-exposure to two environmental carcinogens and the first comprehensive evidence that arsenic is a potent co-mutagen and co-carcinogen of UVR. Importantly, our findings suggest that a large proportion of human skin cancers are not formed purely due to UVR exposure but rather due to a co-exposure of UVR and other co-mutagens such as arsenic.
Affiliation(s)
- Rachel M. Speer
- Department of Pharmaceutical Sciences, College of Pharmacy, University of New Mexico, Albuquerque, NM 87106, USA
| | - Shuvro P. Nandi
- Department of Cellular and Molecular Medicine, UC San Diego, La Jolla, CA, 92093, USA
- Moores Cancer Center, UC San Diego, La Jolla, CA, 92037, USA
| | - Karen L. Cooper
- Department of Pharmaceutical Sciences, College of Pharmacy, University of New Mexico, Albuquerque, NM 87106, USA
| | - Xixi Zhou
- Department of Pharmaceutical Sciences, College of Pharmacy, University of New Mexico, Albuquerque, NM 87106, USA
| | - Hui Yu
- Department of Internal Medicine, Division of Molecular Medicine, University of New Mexico, Albuquerque, NM 87106, USA
| | - Yan Guo
- Department of Internal Medicine, Division of Molecular Medicine, University of New Mexico, Albuquerque, NM 87106, USA
| | - Laurie G. Hudson
- Department of Pharmaceutical Sciences, College of Pharmacy, University of New Mexico, Albuquerque, NM 87106, USA
| | - Ludmil B. Alexandrov
- Department of Cellular and Molecular Medicine, UC San Diego, La Jolla, CA, 92093, USA
- Moores Cancer Center, UC San Diego, La Jolla, CA, 92037, USA
- Department of Bioengineering, UC San Diego, La Jolla, CA, 92093, USA
| | - Ke Jian Liu
- Department of Pharmaceutical Sciences, College of Pharmacy, University of New Mexico, Albuquerque, NM 87106, USA
- Stony Brook Cancer Center, Stony Brook University, Stony Brook NY 11794, USA
- Department of Pathology, Stony Brook University School of Medicine, Stony Brook, NY 11794, USA
| |
|
28
|
Gallego Romero I, Lea AJ. Leveraging massively parallel reporter assays for evolutionary questions. Genome Biol 2023; 24:26. [PMID: 36788564 PMCID: PMC9926830 DOI: 10.1186/s13059-023-02856-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 05/08/2022] [Accepted: 01/17/2023] [Indexed: 02/16/2023]
Abstract
A long-standing goal of evolutionary biology is to decode how gene regulation contributes to organismal diversity. Doing so is challenging because it is hard to predict function from non-coding sequence and to perform molecular research with non-model taxa. Massively parallel reporter assays (MPRAs) enable the testing of thousands to millions of sequences for regulatory activity simultaneously. Here, we discuss the execution, advantages, and limitations of MPRAs, with a focus on evolutionary questions. We propose solutions for extending MPRAs to rare taxa and those with limited genomic resources, and we underscore MPRA's broad potential for driving genome-scale, functional studies across organisms.
Affiliation(s)
- Irene Gallego Romero
- Melbourne Integrative Genomics, University of Melbourne, Royal Parade, Parkville, Victoria, 3010, Australia
- School of BioSciences, The University of Melbourne, Royal Parade, Parkville, 3010, Australia
- The Centre for Stem Cell Systems, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, 30 Royal Parade, Parkville, Victoria, 3010, Australia
- Center for Genomics, Evolution and Medicine, Institute of Genomics, University of Tartu, Riia 23b, 51010, Tartu, Estonia
| | - Amanda J. Lea
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37240, USA
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN 37240, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37240, USA
- Child and Brain Development Program, Canadian Institute for Advanced Study, Toronto, Canada
| |
|
29
|
Bhattarai KR, Mobley RJ, Barnett KR, Ferguson DC, Hansen BS, Diedrich JD, Bergeron BP, Yang W, Crews KR, Manring CS, Jabbour E, Paietta E, Litzow MR, Kornblau SM, Stock W, Inaba H, Jeha S, Pui CH, Cheng C, Pruett-Miller SM, Relling MV, Yang JJ, Evans WE, Savic D. Functional investigation of inherited noncoding genetic variation impacting the pharmacogenomics of childhood acute lymphoblastic leukemia treatment. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.02.10.23285762. [PMID: 36798219 PMCID: PMC9934807 DOI: 10.1101/2023.02.10.23285762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/13/2023]
Abstract
Although acute lymphoblastic leukemia (ALL) is the most common childhood cancer, there is limited understanding of the contribution of inherited genetic variation to inter-individual differences in chemotherapy response. Defining genetic factors impacting therapy failure can help better predict response and identify drug resistance mechanisms. We therefore mapped inherited noncoding variants associated with chemotherapeutic drug resistance and/or treatment outcome to ALL cis-regulatory elements and investigated their gene regulatory potential and genomic connectivity using massively parallel reporter assays and promoter capture Hi-C, respectively. We identified 53 variants with reproducible allele-specific effects on transcription and high-confidence gene targets. Subsequent functional interrogation of the top variant (rs1247117) determined that it disrupted a PU.1 consensus motif and PU.1 binding affinity. Importantly, deletion of the genomic interval containing rs1247117 sensitized ALL cells to vincristine. Together, these data demonstrate that noncoding regulatory variation associated with diverse pharmacological traits harbors significant effects on allele-specific transcriptional activity and impacts sensitivity to chemotherapeutic agents in ALL.
Affiliation(s)
- Kashi Raj Bhattarai
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN
| | - Robert J. Mobley
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN
| | - Kelly R. Barnett
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN
| | - Daniel C. Ferguson
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN
| | - Baranda S. Hansen
- Center for Advanced Genome Engineering, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Cell and Molecular Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Jonathan D. Diedrich
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN
| | - Brennan P. Bergeron
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN
- Graduate School of Biomedical Sciences, St. Jude Children's Research Hospital, Memphis, TN
| | - Wenjian Yang
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN
| | - Kristine R. Crews
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN
| | - Christopher S. Manring
- Alliance Hematologic Malignancy Biorepository; Clara D. Bloomfield Center for Leukemia Outcomes Research, Columbus, OH 43210, USA
| | - Elias Jabbour
- Department of Leukemia, The University of Texas MD Anderson Cancer Center, Houston, TX
| | | | - Mark R. Litzow
- Division of Hematology, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
| | - Steven M. Kornblau
- Department of Leukemia, The University of Texas MD Anderson Cancer Center, Houston, TX
| | - Wendy Stock
- Comprehensive Cancer Center, University of Chicago Medicine, Chicago, IL
| | - Hiroto Inaba
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN
| | - Sima Jeha
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN
| | - Ching-Hon Pui
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN
| | - Cheng Cheng
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN
| | - Shondra M. Pruett-Miller
- Center for Advanced Genome Engineering, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Cell and Molecular Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Mary V. Relling
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN
| | - Jun J. Yang
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN
- Graduate School of Biomedical Sciences, St. Jude Children's Research Hospital, Memphis, TN
- Integrated Biomedical Sciences Program, University of Tennessee Health Science Center, Memphis, TN
| | - William E. Evans
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN
| | - Daniel Savic
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN
- Graduate School of Biomedical Sciences, St. Jude Children's Research Hospital, Memphis, TN
- Integrated Biomedical Sciences Program, University of Tennessee Health Science Center, Memphis, TN
| |
|
30
|
Muhtaseb AW, Duan J. Modeling common and rare genetic risk factors of neuropsychiatric disorders in human induced pluripotent stem cells. Schizophr Res 2022:S0920-9964(22)00156-6. [PMID: 35459617 PMCID: PMC9735430 DOI: 10.1016/j.schres.2022.04.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/14/2022] [Revised: 04/05/2022] [Accepted: 04/07/2022] [Indexed: 12/13/2022]
Abstract
Recent genome-wide association studies (GWAS) and whole-exome sequencing of neuropsychiatric disorders, especially schizophrenia, have identified a plethora of common and rare disease risk variants/genes. Translating the mounting human genetic discoveries into novel disease biology and more tailored clinical treatments is tied to our ability to causally connect genetic risk variants to molecular and cellular phenotypes. When combined with the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/CRISPR-associated (Cas) nuclease-mediated genome editing system, human induced pluripotent stem cell (hiPSC)-derived neural cultures (both 2D and 3D organoids) provide a promising tractable cellular model for bridging the gap between genetic findings and disease biology. In this review, we first conceptualize the advances in understanding the disease polygenicity and convergence from the past decade of iPSC modeling of different types of genetic risk factors of neuropsychiatric disorders. We then discuss the major cell types and cellular phenotypes that are most relevant to neuropsychiatric disorders in iPSC modeling. Finally, we critically review the limitations of iPSC modeling of neuropsychiatric disorders and outline the need for implementing and developing novel methods to scale up the number of iPSC lines and disease risk variants in a systematic manner. Sufficiently scaled-up iPSC modeling and a better functional interpretation of genetic risk variants, in combination with cutting-edge CRISPR/Cas9 gene editing and single-cell multi-omics methods, will enable the field to identify the specific and convergent molecular and cellular phenotypes in precision for neuropsychiatric disorders.
Affiliation(s)
- Abdurrahman W Muhtaseb
- Center for Psychiatric Genetics, NorthShore University HealthSystem, Evanston, IL 60201, United States of America; Department of Human Genetics, The University of Chicago, Chicago, IL 60637, United States of America
| | - Jubao Duan
- Center for Psychiatric Genetics, NorthShore University HealthSystem, Evanston, IL 60201, United States of America; Department of Psychiatry and Behavioral Neuroscience, The University of Chicago, Chicago, IL 60637, United States of America.
| |
|
31
|
Muller RY, Meacham ZA, Ingolia NT. Plasmid and Sequencing Library Preparation for CRISPRi Barcoded Expression Reporter Sequencing (CiBER-seq) in Saccharomyces cerevisiae. Bio Protoc 2022; 12:e4376. [PMID: 35530514 PMCID: PMC9018440 DOI: 10.21769/bioprotoc.4376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2022] [Revised: 11/09/2021] [Accepted: 02/15/2022] [Indexed: 12/29/2022]
Abstract
Genetic networks regulate nearly all biological processes, including cellular differentiation, homeostasis, and immune responses. Determining the precise role of each gene within a regulatory network can explain its overall, integrated function, and pinpoint mechanisms underlying misregulation in disease states. Transcriptional reporter assays are a useful tool for dissecting these genetic networks, because they link a molecular process to a measurable readout, such as the expression of a fluorescent protein. Here, we introduce a new technique that uses expressed RNA barcodes as reporters, to measure transcriptional changes induced by CRISPRi-mediated genetic perturbation across a diverse, genome-wide library of guide RNAs. We describe an exemplary reporter based on the promoter that drives His4 expression in these guidelines, which can be used as a framework to interrogate other expression phenotypes. In this workflow, a library of plasmids is assembled, encoding a CRISPRi guide RNA (gRNA) along with one or more transcriptional reporters that drive expression of guide-specific nucleotide barcode sequences. For example, when interrogating regulation of the budding yeast HIS4 promoter normalized against a control housekeeping promoter that drives Pgk1 expression, this plasmid library contains a gRNA expression cassette, a HIS4 reporter driving expression of one gRNA-specific nucleotide barcode, and a PGK1 reporter driving expression of a second, gRNA-specific barcode. Long-read sequencing is used to determine which gRNA is associated with these nucleotide barcodes. The plasmid library is then transformed into yeast cells, where each cell receives one plasmid, and experiences a genetic perturbation driven by the guide on that plasmid. The expressed RNA barcodes are extracted in bulk and quantified using high-throughput sequencing, thereby measuring the effect of their corresponding gRNA on barcoded reporter expression. 
In the case of the HIS4 reporter described above, guides disrupting translation elongation will increase expression of the associated HIS4 barcode specifically, without changing expression of the PGK1 control barcode. It is further possible to quantify plasmid abundance by DNA sequencing, as an additional approach to normalize for differences in plasmid abundance within the population of cells. This protocol outlines the steps to prepare barcode reporter CRISPRi plasmid libraries, link guides to barcodes with long-read sequencing, and measure expression changes through barcode RNA and DNA sequencing. This method is ideal for probing transcriptional or post-transcriptional regulation, as it measures the effects of a genetic perturbation by directly quantifying reporter RNA abundance, rather than relying on indirect growth or fluorescence readouts.
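The normalization described in this abstract, in which each guide's HIS4 barcode is compared against its PGK1 control barcode, can be sketched in a few lines of Python. This is an illustrative assumption about one way to do the arithmetic, not the protocol's own analysis code. The function name, the pseudocount, the median centering, and the counts are hypothetical.

```python
import math
import statistics


def reporter_log2_ratios(his4_rna, pgk1_rna, pseudocount=1.0):
    """log2(HIS4 barcode / PGK1 barcode) per guide, centered on the median guide.

    Assumes most guides have no effect, so the median ratio serves as the
    baseline; this centering choice is an assumption of the sketch.
    """
    raw = {
        guide: math.log2(
            (his4_rna[guide] + pseudocount) / (pgk1_rna[guide] + pseudocount)
        )
        for guide in his4_rna
        if guide in pgk1_rna
    }
    baseline = statistics.median(raw.values())
    return {guide: value - baseline for guide, value in raw.items()}


# Hypothetical barcode RNA counts for two neutral guides and one hit.
his4_rna = {"g_ctrl1": 100, "g_ctrl2": 110, "g_hit": 800}
pgk1_rna = {"g_ctrl1": 200, "g_ctrl2": 220, "g_hit": 200}
ratios = reporter_log2_ratios(his4_rna, pgk1_rna)
```

A guide that raises HIS4 barcode expression without changing the PGK1 barcode, like `g_hit` here, stands out with a large positive centered ratio, while the neutral guides stay near zero. Barcode DNA counts could be folded in the same way to correct for plasmid abundance.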
Affiliation(s)
- Ryan Y Muller
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, USA
| | - Zuriah A Meacham
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, USA
| | - Nicholas T Ingolia
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, USA
- California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, USA
| |
|
32
|
Toropainen A, Stolze LK, Örd T, Whalen MB, Torrell PM, Link VM, Kaikkonen MU, Romanoski CE. Functional noncoding SNPs in human endothelial cells fine-map vascular trait associations. Genome Res 2022; 32:409-424. [PMID: 35193936 PMCID: PMC8896458 DOI: 10.1101/gr.276064.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/30/2021] [Accepted: 01/06/2022] [Indexed: 11/25/2022]
Abstract
Functional consequences of genetic variation in the noncoding human genome are difficult to ascertain despite demonstrated associations to common, complex disease traits. To elucidate properties of functional noncoding SNPs with effects in human endothelial cells (ECs), we utilized our previous molecular quantitative trait locus (molQTL) analysis for transcription factor binding, chromatin accessibility, and H3K27 acetylation to nominate a set of likely functional noncoding SNPs. Together with information from genome-wide association studies (GWASs) for vascular disease traits, we tested the ability of 34,344 variants to perturb enhancer function in ECs using the highly multiplexed STARR-seq assay. Of these, 5711 variants validated, whose enriched attributes included: (1) mutations to TF binding motifs for ETS or AP-1 that are regulators of the EC state; (2) location in accessible and H3K27ac-marked EC chromatin; and (3) molQTL associations whereby alleles associate with differences in chromatin accessibility and TF binding across genetically diverse ECs. Next, using pro-inflammatory IL1B as an activator of cell state, we observed robust evidence (>50%) of context-specific SNP effects, underscoring the prevalence of noncoding gene-by-environment (GxE) effects. Lastly, using these cumulative data, we fine-mapped vascular disease loci and highlighted evidence suggesting mechanisms by which noncoding SNPs at two loci affect risk for pulse pressure/large artery stroke and abdominal aortic aneurysm through respective effects on transcriptional regulation of POU4F1 and LDAH. Together, we highlight the attributes and context dependence of functional noncoding SNPs and provide new mechanisms underlying vascular disease risk.
Affiliation(s)
- Anu Toropainen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio 70211, Finland
| | - Lindsey K Stolze
- The Department of Cellular and Molecular Medicine, The University of Arizona, Tucson, Arizona 85721, USA; The Genetics Interdisciplinary Graduate Program, The University of Arizona, Tucson, Arizona 85721, USA
| | - Tiit Örd
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio 70211, Finland
| | - Michael B Whalen
- The Department of Cellular and Molecular Medicine, The University of Arizona, Tucson, Arizona 85721, USA
| | - Paula Martí Torrell
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio 70211, Finland
| | - Verena M Link
- Metaorganism Immunity Section, Laboratory of Host Immunity and Microbiome, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Minna U Kaikkonen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio 70211, Finland
| | - Casey E Romanoski
- The Department of Cellular and Molecular Medicine, The University of Arizona, Tucson, Arizona 85721, USA; The Genetics Interdisciplinary Graduate Program, The University of Arizona, Tucson, Arizona 85721, USA
| |
|
33
|
Örd T, Õunap K, Stolze LK, Aherrahrou R, Nurminen V, Toropainen A, Selvarajan I, Lönnberg T, Aavik E, Ylä-Herttuala S, Civelek M, Romanoski CE, Kaikkonen MU. Single-Cell Epigenomics and Functional Fine-Mapping of Atherosclerosis GWAS Loci. Circ Res 2021; 129:240-258. [PMID: 34024118 PMCID: PMC8260472 DOI: 10.1161/circresaha.121.318971] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Indexed: 12/15/2022]
Abstract
Genome-wide association studies have identified hundreds of loci associated with coronary artery disease (CAD). Many of these loci are enriched in cis-regulatory elements but are linked neither to cardiometabolic risk factors nor to candidate causal genes, complicating their functional interpretation.
Affiliation(s)
- Tiit Örd
- A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio (T.Ö., K.Õ., V.N., A.T., I.S., E.A., S.Y.-H., M.U.K.)
| | - Kadri Õunap
- A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio (T.Ö., K.Õ., V.N., A.T., I.S., E.A., S.Y.-H., M.U.K.)
| | - Lindsey K. Stolze
- Department of Cellular and Molecular Medicine, The College of Medicine, The University of Arizona, Tucson, AZ (L.K.S., C.E.R.)
| | - Redouane Aherrahrou
- Center for Public Health Genomics (R.A., M.C.), University of Virginia, Charlottesville
| | - Valtteri Nurminen
- A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio (T.Ö., K.Õ., V.N., A.T., I.S., E.A., S.Y.-H., M.U.K.)
| | - Anu Toropainen
- A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio (T.Ö., K.Õ., V.N., A.T., I.S., E.A., S.Y.-H., M.U.K.)
| | - Ilakya Selvarajan
- A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio (T.Ö., K.Õ., V.N., A.T., I.S., E.A., S.Y.-H., M.U.K.)
| | - Tapio Lönnberg
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Finland (T.L.)
| | - Einari Aavik
- A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio (T.Ö., K.Õ., V.N., A.T., I.S., E.A., S.Y.-H., M.U.K.)
| | - Seppo Ylä-Herttuala
- A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio (T.Ö., K.Õ., V.N., A.T., I.S., E.A., S.Y.-H., M.U.K.)
| | - Mete Civelek
- Center for Public Health Genomics (R.A., M.C.), University of Virginia, Charlottesville
- Department of Biomedical Engineering (M.C.), University of Virginia, Charlottesville
| | - Casey E. Romanoski
- Department of Cellular and Molecular Medicine, The College of Medicine, The University of Arizona, Tucson, AZ (L.K.S., C.E.R.)
| | - Minna U. Kaikkonen
- A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio (T.Ö., K.Õ., V.N., A.T., I.S., E.A., S.Y.-H., M.U.K.)
| |
|
34
|
Letiagina AE, Omelina ES, Ivankin AV, Pindyurin AV. MPRAdecoder: Processing of the Raw MPRA Data With a priori Unknown Sequences of the Region of Interest and Associated Barcodes. Front Genet 2021; 12:618189. [PMID: 34046055 PMCID: PMC8148044 DOI: 10.3389/fgene.2021.618189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2020] [Accepted: 03/25/2021] [Indexed: 11/13/2022] Open
Abstract
Massively parallel reporter assays (MPRAs) enable high-throughput functional evaluation of numerous DNA regulatory elements and/or their mutant variants. The assays are based on the construction of reporter plasmid libraries containing two variable parts, a region of interest (ROI) and a barcode (BC), located outside and within the transcription unit, respectively. Importantly, each plasmid molecule in such a highly diverse library is characterized by a unique BC-ROI association. The reporter constructs are delivered to target cells and expression of BCs at the transcript level is assayed by RT-PCR followed by next-generation sequencing (NGS). The obtained values are normalized to the abundance of BCs in the plasmid DNA sample. Altogether, this allows evaluating the regulatory potential of the associated ROI sequences. However, depending on the MPRA library construction design, the BC and ROI sequences as well as their associations can be a priori unknown. In such a case, the BC and ROI sequences, their possible mutant variants, and unambiguous BC-ROI associations have to be identified, whereas all uncertain cases have to be excluded from the analysis. Besides the preparation of additional "mapping" samples for NGS, this also requires specific bioinformatics tools. Here, we present a pipeline for processing raw MPRA data obtained by NGS for reporter construct libraries with a priori unknown sequences of BCs and ROIs. The pipeline robustly identifies unambiguous (so-called genuine) BCs and ROIs associated with them, calculates the normalized expression level for each BC and the averaged values for each ROI, and provides a graphical visualization of the processed data.
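The normalization described in this abstract (barcode abundance in the transcript sample divided by barcode abundance in the plasmid DNA sample, then averaged per ROI) can be illustrated with a minimal Python sketch. This is not the MPRAdecoder implementation: the function name `normalized_expression`, its parameters, the library-size scaling, and the `min_plasmid` cutoff are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def normalized_expression(cdna_counts, plasmid_counts, bc_to_roi, min_plasmid=1):
    """Hypothetical helper: per-barcode and per-ROI normalized expression.

    cdna_counts / plasmid_counts: dict mapping barcode -> read count.
    bc_to_roi: dict mapping each unambiguous barcode -> its ROI.
    min_plasmid: assumed cutoff; barcodes with fewer plasmid reads are skipped.
    """
    cdna_total = sum(cdna_counts.values())
    plasmid_total = sum(plasmid_counts.values())

    per_bc = {}
    for bc in bc_to_roi:
        plasmid = plasmid_counts.get(bc, 0)
        if plasmid < min_plasmid:
            continue  # no reliable denominator for this barcode
        cdna = cdna_counts.get(bc, 0)
        # Ratio of relative abundances: transcript sample over plasmid sample.
        per_bc[bc] = (cdna / cdna_total) / (plasmid / plasmid_total)

    # Average the barcode-level values that belong to the same ROI.
    grouped = defaultdict(list)
    for bc, value in per_bc.items():
        grouped[bc_to_roi[bc]].append(value)
    per_roi = {roi: mean(values) for roi, values in grouped.items()}
    return per_bc, per_roi


if __name__ == "__main__":
    # Toy counts, invented for illustration only.
    cdna = {"AAA": 30, "CCC": 10, "GGG": 60}
    plasmid = {"AAA": 10, "CCC": 10, "GGG": 20}
    mapping = {"AAA": "roi1", "CCC": "roi1", "GGG": "roi2"}
    print(normalized_expression(cdna, plasmid, mapping))
```

Dividing by the plasmid abundance keeps a barcode that is simply over-represented in the library from being mistaken for a strongly expressed one.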
Affiliation(s)
- Anna E Letiagina
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia; Faculty of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
| | - Evgeniya S Omelina
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| | - Anton V Ivankin
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| | - Alexey V Pindyurin
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| |
|
35
|
Korgaonkar A, Han C, Lemire AL, Siwanowicz I, Bennouna D, Kopec RE, Andolfatto P, Shigenobu S, Stern DL. A novel family of secreted insect proteins linked to plant gall development. Curr Biol 2021. [PMID: 33974861 DOI: 10.1101/2020.10.28.359562] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/13/2023]
Abstract
In an elaborate form of inter-species exploitation, many insects hijack plant development to induce novel plant organs called galls that provide the insect with a source of nutrition and a temporary home. Galls result from dramatic reprogramming of plant cell biology driven by insect molecules, but the roles of specific insect molecules in gall development have not yet been determined. Here we study the aphid Hormaphis cornu, which makes distinctive “cone” galls on leaves of witch hazel Hamamelis virginiana. We found that derived genetic variants in the aphid gene determinant of gall color (dgc) are associated with strong downregulation of dgc transcription in aphid salivary glands, upregulation in galls of seven genes involved in anthocyanin synthesis, and deposition of two red anthocyanins in galls. We hypothesize that aphids inject DGC protein into galls, and that this results in differential expression of a small number of plant genes. Dgc is a member of a large, diverse family of novel predicted secreted proteins characterized by a pair of widely spaced cysteine-tyrosine-cysteine (CYC) residues, which we named BICYCLE proteins. Bicycle genes are most strongly expressed in the salivary glands specifically of galling aphid generations, suggesting that they may regulate many aspects of gall development. Bicycle genes have experienced unusually frequent diversifying selection, consistent with their potential role controlling gall development in a molecular arms race between aphids and their host plants.
One Sentence Summary: Aphid bicycle genes, which encode diverse secreted proteins, contribute to plant gall development.
|
36
|
Uebbing S, Gockley J, Reilly SK, Kocher AA, Geller E, Gandotra N, Scharfe C, Cotney J, Noonan JP. Massively parallel discovery of human-specific substitutions that alter enhancer activity. Proc Natl Acad Sci U S A 2021; 118:e2007049118. [PMID: 33372131 PMCID: PMC7812811 DOI: 10.1073/pnas.2007049118] [Citation(s) in RCA: 64] [Impact Index Per Article: 21.3] [Indexed: 12/30/2022] Open
Abstract
Genetic changes that altered the function of gene regulatory elements have been implicated in the evolution of human traits such as the expansion of the cerebral cortex. However, identifying the particular changes that modified regulatory activity during human evolution remains challenging. Here we used massively parallel enhancer assays in neural stem cells to quantify the functional impact of >32,000 human-specific substitutions in >4,300 human accelerated regions (HARs) and human gain enhancers (HGEs), which include enhancers with novel activities in humans. We found that >30% of active HARs and HGEs exhibited differential activity between human and chimpanzee. We isolated the effects of human-specific substitutions from background genetic variation to identify the effects of genetic changes most relevant to human evolution. We found that substitutions interacted in both additive and nonadditive ways to modify enhancer function. Substitutions within HARs, which are highly constrained compared to HGEs, showed smaller effects on enhancer activity, suggesting that the impact of human-specific substitutions is buffered in enhancers with constrained ancestral functions. Our findings yield insight into how human-specific genetic changes altered enhancer function and provide a rich set of candidates for studies of regulatory evolution in humans.
Affiliation(s)
- Severin Uebbing
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510
| | - Jake Gockley
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510
| | - Steven K Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510
| | - Acadia A Kocher
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510
| | - Evan Geller
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510
| | - Neeru Gandotra
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510
| | - Curt Scharfe
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510
| | - Justin Cotney
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510
| | - James P Noonan
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510
- Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510
| |
|
37
|
Mulvey B, Lagunas T, Dougherty JD. Massively Parallel Reporter Assays: Defining Functional Psychiatric Genetic Variants Across Biological Contexts. Biol Psychiatry 2021; 89:76-89. [PMID: 32843144 PMCID: PMC7938388 DOI: 10.1016/j.biopsych.2020.06.011] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 02/11/2020] [Revised: 06/09/2020] [Accepted: 06/10/2020] [Indexed: 12/18/2022]
Abstract
Neuropsychiatric phenotypes have long been known to be influenced by heritable risk factors, directly confirmed by the past decade of genetic studies that have revealed specific genetic variants enriched in disease cohorts. However, the initial hope that a small set of genes would be responsible for a given disorder proved false. The more complex reality is that a given disorder may be influenced by myriad small-effect noncoding variants and/or by rare but severe coding variants, many de novo. Noncoding genomic sequences, for which molecular functions cannot usually be inferred, harbor a large portion of these variants, creating a substantial barrier to understanding higher-order molecular and biological systems of disease. Fortunately, novel genetic technologies (scalable oligonucleotide synthesis, RNA sequencing, and CRISPR [clustered regularly interspaced short palindromic repeats]) have opened novel avenues to experimentally identify biologically significant variants en masse. Massively parallel reporter assays (MPRAs) are an especially versatile technique resulting from such innovations. MPRAs are powerful molecular genetics tools that can be used to screen thousands of untranscribed or untranslated sequences and their variants for functional effects in a single experiment. This approach, though underutilized in psychiatric genetics, has several useful features for the field. We review methods for assaying putatively functional genetic variants and regions, emphasizing MPRAs and the opportunities they hold for dissection of psychiatric polygenicity. We discuss literature applying functional assays in neurogenetics, highlighting strengths, caveats, and design considerations, especially regarding disease-relevant variables (cell type, neurodevelopment, and sex), and we ultimately propose applications of MPRA to both computational and experimental neurogenetics of polygenic disease risk.
Affiliation(s)
- Bernard Mulvey
- Division of Biology and Biomedical Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Psychiatry, Washington University School of Medicine in St. Louis, St. Louis, Missouri
| | - Tomás Lagunas
- Division of Biology and Biomedical Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Psychiatry, Washington University School of Medicine in St. Louis, St. Louis, Missouri
| | - Joseph D Dougherty
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Psychiatry, Washington University School of Medicine in St. Louis, St. Louis, Missouri.
| |
|
38
|
Campoy JA, Sun H, Goel M, Jiao WB, Folz-Donahue K, Wang N, Rubio M, Liu C, Kukat C, Ruiz D, Huettel B, Schneeberger K. Gamete binning: chromosome-level and haplotype-resolved genome assembly enabled by high-throughput single-cell sequencing of gamete genomes. Genome Biol 2020; 21:306. [PMID: 33372615 DOI: 10.1101/2020.04.24.060046] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/15/2020] [Accepted: 12/11/2020] [Indexed: 05/26/2023] Open
Abstract
Generating chromosome-level, haplotype-resolved assemblies of heterozygous genomes remains challenging. To address this, we developed gamete binning, a method based on single-cell sequencing of haploid gametes enabling separation of the whole-genome sequencing reads into haplotype-specific read sets. After assembling the reads of each haplotype, the contigs are scaffolded to chromosome level using a genetic map derived from the gametes. We assemble the two genomes of a diploid apricot tree based on whole-genome sequencing of 445 individual pollen grains. The two haplotype assemblies (N50: 25.5 and 25.8 Mb) feature a haplotyping precision of greater than 99% and are accurately scaffolded to chromosome level.
Affiliation(s)
- José A Campoy
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
| | - Hequan Sun
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
- Faculty of Biology, LMU Munich, Großhaderner Str. 2, 82152, Planegg-Martinsried, Germany
| | - Manish Goel
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
| | - Wen-Biao Jiao
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
| | - Kat Folz-Donahue
- FACS & Imaging Core Facility, Max Planck Institute for Biology of Ageing, 50931, Cologne, Germany
| | - Nan Wang
- Center for Plant Molecular Biology (ZMBP), University of Tübingen, Auf der Morgenstelle 32, 72076, Tübingen, Germany
| | - Manuel Rubio
- Department of Plant Breeding, CEBAS-CSIC, PO Box 164, E-30100 Espinardo, Murcia, Spain
| | - Chang Liu
- Center for Plant Molecular Biology (ZMBP), University of Tübingen, Auf der Morgenstelle 32, 72076, Tübingen, Germany
- Institute of Biology, University of Hohenheim, Garbenstraße 30, 70599, Stuttgart, Germany
| | - Christian Kukat
- FACS & Imaging Core Facility, Max Planck Institute for Biology of Ageing, 50931, Cologne, Germany
| | - David Ruiz
- Department of Plant Breeding, CEBAS-CSIC, PO Box 164, E-30100 Espinardo, Murcia, Spain
| | - Bruno Huettel
- Max Planck-Genome-center Cologne, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
| | - Korbinian Schneeberger
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany.
- Faculty of Biology, LMU Munich, Großhaderner Str. 2, 82152, Planegg-Martinsried, Germany.
| |
|
39
|
Muller R, Meacham ZA, Ferguson L, Ingolia NT. CiBER-seq dissects genetic networks by quantitative CRISPRi profiling of expression phenotypes. Science 2020; 370:eabb9662. [PMID: 33303588 PMCID: PMC7819735 DOI: 10.1126/science.abb9662] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 03/30/2020] [Accepted: 10/22/2020] [Indexed: 12/12/2022]
Abstract
To realize the promise of CRISPR-Cas9-based genetics, approaches are needed to quantify a specific, molecular phenotype across genome-wide libraries of genetic perturbations. We addressed this challenge by profiling transcriptional, translational, and posttranslational reporters using CRISPR interference (CRISPRi) with barcoded expression reporter sequencing (CiBER-seq). Our barcoding approach allowed us to connect an entire library of guides to their individual phenotypic consequences using pooled sequencing. CiBER-seq profiling fully recapitulated the integrated stress response (ISR) pathway in yeast. Genetic perturbations causing uncharged transfer RNA (tRNA) accumulation activated ISR reporter transcription. Notably, tRNA insufficiency also activated the reporter, independent of the uncharged tRNA sensor. By uncovering alternate triggers for ISR activation, we illustrate how precise, comprehensive CiBER-seq profiling provides a powerful and broadly applicable tool for dissecting genetic networks.
Affiliation(s)
- Ryan Muller
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Zuriah A Meacham
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Lucas Ferguson
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Nicholas T Ingolia
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA.
- California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA 94720, USA
| |
|
40
|
Renganaath K, Chong R, Day L, Kosuri S, Kruglyak L, Albert FW. Systematic identification of cis-regulatory variants that cause gene expression differences in a yeast cross. eLife 2020; 9:e62669. [PMID: 33179598 PMCID: PMC7685706 DOI: 10.7554/elife.62669] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/02/2020] [Accepted: 11/11/2020] [Indexed: 02/06/2023] Open
Abstract
Sequence variation in regulatory DNA alters gene expression and shapes genetically complex traits. However, the identification of individual, causal regulatory variants is challenging. Here, we used a massively parallel reporter assay to measure the cis-regulatory consequences of 5832 natural DNA variants in the promoters of 2503 genes in the yeast Saccharomyces cerevisiae. We identified 451 causal variants, which underlie genetic loci known to affect gene expression. Several promoters harbored multiple causal variants. In five promoters, pairs of variants showed non-additive, epistatic interactions. Causal variants were enriched at conserved nucleotides, tended to have low derived allele frequency, and were depleted from promoters of essential genes, which is consistent with the action of negative selection. Causal variants were also enriched for alterations in transcription factor binding sites. Models integrating these features provided modest, but statistically significant, ability to predict causal variants. This work revealed a complex molecular basis for cis-acting regulatory variation.
Affiliation(s)
- Kaushik Renganaath
- Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
| | - Rockie Chong
- Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
| | - Laura Day
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
- Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, United States
- Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, United States
| | - Sriram Kosuri
- Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
| | - Leonid Kruglyak
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
- Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, United States
- Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, United States
| | - Frank W Albert
- Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
| |
|
41
|
Matoba N, Liang D, Sun H, Aygün N, McAfee JC, Davis JE, Raffield LM, Qian H, Piven J, Li Y, Kosuri S, Won H, Stein JL. Common genetic risk variants identified in the SPARK cohort support DDHD2 as a candidate risk gene for autism. Transl Psychiatry 2020; 10:265. [PMID: 32747698 PMCID: PMC7400671 DOI: 10.1038/s41398-020-00953-9] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 04/09/2020] [Accepted: 07/22/2020] [Indexed: 12/13/2022] Open
Abstract
Autism spectrum disorder (ASD) is a highly heritable neurodevelopmental disorder. Large genetically informative cohorts of individuals with ASD have led to the identification of a limited number of common genome-wide significant (GWS) risk loci to date. However, many more common genetic variants are expected to contribute to ASD risk given the high heritability. Here, we performed a genome-wide association study (GWAS) on 6222 case-pseudocontrol pairs from the Simons Foundation Powering Autism Research for Knowledge (SPARK) dataset to identify additional common genetic risk factors and molecular mechanisms underlying risk for ASD. We identified one novel GWS locus from the SPARK GWAS and four significant loci, including an additional novel locus from meta-analysis with a previous GWAS. We replicated the previous observation of significant enrichment of ASD heritability within regulatory regions of the developing cortex, indicating that disruption of gene regulation during neurodevelopment is critical for ASD risk. We further employed a massively parallel reporter assay (MPRA) and identified a putative causal variant at the novel locus from SPARK GWAS with strong impacts on gene regulation (rs7001340). Expression quantitative trait loci data demonstrated an association between the risk allele and decreased expression of DDHD2 (DDHD domain containing 2) in both adult and prenatal brains. In conclusion, by integrating genetic association data with multi-omic gene regulatory annotations and experimental validation, we fine-mapped a causal risk variant and demonstrated that DDHD2 is a novel gene associated with ASD risk.
Affiliation(s)
- Nana Matoba
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Dan Liang
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Huaigu Sun
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Nil Aygün
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Jessica C McAfee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Jessica E Davis
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Laura M Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Huijun Qian
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Joseph Piven
- Department of Psychiatry and the Carolina Institute for Developmental Disabilities, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Yun Li
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Sriram Kosuri
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Hyejung Won
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
| | - Jason L Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
| |
|
42
|
Kreimer A, Yosef N. Evaluation of Davis et al.: Exploring Sequence of Determinants of Transcriptional Regulation-The Case of c-AMP Response Element. Cell Syst 2020; 11:2-4. [PMID: 32702318 DOI: 10.1016/j.cels.2020.07.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
One snapshot of the peer review process for "Dissection of c-AMP Response Element Architecture by Using Genomic and Episomal Massively Parallel Reporter Assays" (Davis et al., 2020).
Affiliation(s)
- Anat Kreimer
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Electrical Engineering and Computer Sciences and Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Nir Yosef
- Department of Electrical Engineering and Computer Sciences and Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA; Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology, and Harvard University, Boston, MA, USA.
| |
|
43
|
Qiao D, Zigler CM, Cho MH, Silverman EK, Zhou X, Castaldi PJ, Laird NH. Statistical considerations for the analysis of massively parallel reporter assays data. Genet Epidemiol 2020; 44:785-794. [PMID: 32681690 DOI: 10.1002/gepi.22337] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/06/2020] [Revised: 06/12/2020] [Accepted: 07/03/2020] [Indexed: 01/23/2023]
Abstract
Noncoding DNA contains gene regulatory elements that alter gene expression, and the function of these elements can be modified by genetic variation. Massively parallel reporter assays (MPRA) enable high-throughput identification and characterization of functional genetic variants, but the statistical methods to identify allelic effects in MPRA data have not been fully developed. In this study, we demonstrate how the baseline allelic imbalance in MPRA libraries can produce biased results, and we propose a novel, nonparametric, adaptive testing method that is robust to this bias. We compare the performance of this method with other commonly used methods, and we demonstrate that our novel adaptive method controls Type I error in a wide range of scenarios while maintaining excellent power. We have implemented these tests along with routines for simulating MPRA data in the Analysis Toolset for MPRA (@MPRA), an R package for the design and analyses of MPRA experiments. It is publicly available at http://github.com/redaq/atMPRA.
Affiliation(s)
- Dandi Qiao
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Corwin M Zigler
  - Department of Statistics and Data Sciences, Department of Women's Health, University of Texas at Austin, Austin, Texas
- Michael H Cho
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Edwin K Silverman
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Xiaobo Zhou
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Peter J Castaldi
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
  - Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Nan H Laird
  - Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts
|
44
|
Ghazi AR, Kong X, Chen ES, Edelstein LC, Shaw CA. Bayesian modelling of high-throughput sequencing assays with malacoda. PLoS Comput Biol 2020; 16:e1007504. [PMID: 32692749 PMCID: PMC7394446 DOI: 10.1371/journal.pcbi.1007504] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/21/2019] [Revised: 07/31/2020] [Accepted: 06/09/2020] [Indexed: 12/13/2022] Open
Abstract
NGS studies have uncovered an ever-growing catalog of human variation while leaving an enormous gap between observed variation and experimental characterization of variant function. High-throughput screens powered by NGS have greatly increased the rate of variant functionalization, but the development of comprehensive statistical methods to analyze screen data has lagged. In the massively parallel reporter assay (MPRA), short barcodes are counted by sequencing DNA libraries transfected into cells and the cell's output RNA in order to simultaneously measure the shifts in transcription induced by thousands of genetic variants. These counts present many statistical challenges, including overdispersion, depth dependence, and uncertain DNA concentrations. So far, the statistical methods used have been rudimentary, employing transformations on count-level data and disregarding experimental and technical structure while failing to quantify uncertainty in the statistical model. We have developed an extensive framework for the analysis of NGS functionalization screens available as an R package called malacoda (available from github.com/andrewGhazi/malacoda). Our software implements a probabilistic, fully Bayesian model of screen data. The model uses the negative binomial distribution with gamma priors to model sequencing counts while accounting for effects from input library preparation and sequencing depth. The method leverages the high-throughput nature of the assay to estimate the priors empirically. External annotations such as ENCODE data or DeepSEA predictions can also be incorporated to obtain more informative priors, a transformative capability for data integration. The package also includes quality control and utility functions, including automated barcode counting and visualization methods. To validate our method, we analyzed several datasets using malacoda and alternative MPRA analysis methods. These data include experiments from the literature, simulated assays, and primary MPRA data. We also used luciferase assays to experimentally validate several hits from our primary data, as well as variants for which the various methods disagree and variants detectable only with the aid of external annotations.
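For orientation, the count-transformation approach that the abstract calls rudimentary can be sketched in a few lines. This is only an illustrative baseline and is not malacoda's Bayesian model: it scales barcode counts by sequencing depth, takes a per-barcode log2(RNA/DNA) ratio, and reports the difference in mean activity between alleles. The function names, the pseudocount, and the toy counts are all assumptions made for the example.

```python
import math
from statistics import mean


def log2_activity(rna_counts, dna_counts, rna_depth, dna_depth, pseudocount=1.0):
    """Per-barcode log2(RNA/DNA) after scaling counts to counts per million.

    The pseudocount (an assumption of this sketch) avoids log(0) for
    barcodes that were not observed in one of the libraries.
    """
    activities = []
    for rna, dna in zip(rna_counts, dna_counts):
        rna_cpm = (rna + pseudocount) / rna_depth * 1e6
        dna_cpm = (dna + pseudocount) / dna_depth * 1e6
        activities.append(math.log2(rna_cpm / dna_cpm))
    return activities


def allelic_shift(ref, alt, rna_depth, dna_depth):
    """Mean alternate-allele activity minus mean reference-allele activity."""
    ref_act = log2_activity(ref["rna"], ref["dna"], rna_depth, dna_depth)
    alt_act = log2_activity(alt["rna"], alt["dna"], rna_depth, dna_depth)
    return mean(alt_act) - mean(ref_act)


# Hypothetical toy data: three barcodes per allele.
ref = {"rna": [120, 95, 130], "dna": [100, 90, 110]}
alt = {"rna": [60, 40, 75], "dna": [105, 85, 115]}

shift = allelic_shift(ref, alt, rna_depth=2_000_000, dna_depth=1_500_000)
print(f"log2 allelic shift (alt - ref): {shift:.3f}")
```

A point estimate like this carries no measure of uncertainty and ignores overdispersion, which are among the limitations the abstract says a fully Bayesian count model is meant to address.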
Affiliation(s)
- Andrew R. Ghazi
  - Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, Texas, United States of America
- Xianguo Kong
  - Cardeza Foundation for Hematologic Research, Thomas Jefferson University, Philadelphia, Pennsylvania, United States of America
- Ed S. Chen
  - Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Leonard C. Edelstein
  - Cardeza Foundation for Hematologic Research, Thomas Jefferson University, Philadelphia, Pennsylvania, United States of America
- Chad A. Shaw
  - Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
|
45
|
de Jongh RP, van Dijk AD, Julsing MK, Schaap PJ, de Ridder D. Designing Eukaryotic Gene Expression Regulation Using Machine Learning. Trends Biotechnol 2020; 38:191-201. [DOI: 10.1016/j.tibtech.2019.07.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/08/2019] [Revised: 07/12/2019] [Accepted: 07/19/2019] [Indexed: 12/11/2022]
|
46
|
Myint L, Wang R, Boukas L, Hansen KD, Goff LA, Avramopoulos D. A screen of 1,049 schizophrenia and 30 Alzheimer's-associated variants for regulatory potential. Am J Med Genet B Neuropsychiatr Genet 2020; 183:61-73. [PMID: 31503409 PMCID: PMC7233147 DOI: 10.1002/ajmg.b.32761] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 06/05/2019] [Revised: 08/19/2019] [Accepted: 08/20/2019] [Indexed: 11/10/2022]
Abstract
Recent genome-wide association studies (GWAS) identified numerous schizophrenia (SZ) and Alzheimer's disease (AD) associated loci, most outside protein-coding regions and hypothesized to affect gene transcription. We used a massively parallel reporter assay to screen 1,049 SZ and 30 AD variants in 64 and nine loci, respectively, for allele differences in driving reporter gene expression. A library of synthetic oligonucleotides assaying each allele five times was transfected into K562 chronic myelogenous leukemia lymphoblasts and SK-SY5Y human neuroblastoma cells. One hundred forty-eight variants showed allelic differences in K562 and 53 in SK-SY5Y cells, on average 2.6 variants per locus. Nine showed significant differences in both lines, a modest overlap reflecting different regulatory landscapes of these lines that also differ significantly in chromatin marks. Eight of nine were in the same direction. We observe no preference for risk alleles to increase or decrease expression. We find a positive correlation between the number of SNPs in linkage disequilibrium and the proportion of functional SNPs, supporting combinatorial effects that may lead to haplotype selection. Our results prioritize future functional follow-up of disease-associated SNPs to determine the driver GWAS variant(s) at each locus and enhance our understanding of gene regulation dynamics.
Affiliation(s)
- Leslie Myint
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
- Ruihua Wang
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland
- Leandros Boukas
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland
- Kasper D. Hansen
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland
- Loyal A. Goff
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland
  - Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, Maryland
- Dimitrios Avramopoulos
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland
  - Department of Psychiatry, Johns Hopkins School of Medicine, Baltimore, Maryland
|
47
|
Ashuach T, Fischer DS, Kreimer A, Ahituv N, Theis FJ, Yosef N. MPRAnalyze: statistical framework for massively parallel reporter assays. Genome Biol 2019; 20:183. [PMID: 31477158 PMCID: PMC6717970 DOI: 10.1186/s13059-019-1787-z] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 01/27/2019] [Accepted: 08/09/2019] [Indexed: 11/10/2022] Open
Abstract
Massively parallel reporter assays (MPRAs) can measure the regulatory function of thousands of DNA sequences in a single experiment. Despite growing popularity, MPRA studies are limited by a lack of a unified framework for analyzing the resulting data. Here we present MPRAnalyze: a statistical framework for analyzing MPRA count data. Our model leverages the unique structure of MPRA data to quantify the function of regulatory sequences, compare sequences' activity across different conditions, and provide necessary flexibility in an evolving field. We demonstrate the accuracy and applicability of MPRAnalyze on simulated and published data and compare it with existing methods.
Affiliation(s)
- Tal Ashuach
  - Department of Electrical Engineering and Computer Sciences, University of California Berkeley, Berkeley, California, USA
  - Center for Computational Biology, University of California Berkeley, Berkeley, California, USA
- David S. Fischer
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Anat Kreimer
  - Department of Electrical Engineering and Computer Sciences, University of California Berkeley, Berkeley, California, USA
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, USA
  - Institute for Human Genetics, University of California San Francisco, San Francisco, California, USA
- Nadav Ahituv
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, USA
  - Institute for Human Genetics, University of California San Francisco, San Francisco, California, USA
- Fabian J. Theis
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Nir Yosef
  - Department of Electrical Engineering and Computer Sciences, University of California Berkeley, Berkeley, California, USA
  - Center for Computational Biology, University of California Berkeley, Berkeley, California, USA
  - Ragon Institute of MGH, MIT, and Harvard, Cambridge, MA, USA
  - Chan Zuckerberg BioHub, San Francisco, California, USA
|