1
Gilbert MA, Keefer-Jacques E, Jadhav T, Antfolk D, Ming Q, Valente N, Shaw GTW, Sottolano CJ, Matwijec G, Luca VC, Loomes KM, Rajagopalan R, Hayeck TJ, Spinner NB. Functional characterization of 2,832 JAG1 variants supports reclassification for Alagille syndrome and improves guidance for clinical variant interpretation. Am J Hum Genet 2024; 111:1656-1672. [PMID: 39043182 PMCID: PMC11339624 DOI: 10.1016/j.ajhg.2024.06.011] [Received: 04/10/2024] [Revised: 06/15/2024] [Accepted: 06/24/2024] [Indexed: 07/25/2024]
Abstract
Pathogenic variants in the JAG1 gene are a primary cause of the multi-system disorder Alagille syndrome. Although variant detection rates are high for this disease, there is uncertainty associated with the classification of missense variants that leads to reduced diagnostic yield. Consequently, up to 85% of reported JAG1 missense variants have uncertain or conflicting classifications. We generated a library of 2,832 JAG1 nucleotide variants within exons 1-7, a region with a high number of reported missense variants, and designed a high-throughput assay to measure JAG1 membrane expression, a requirement for normal function. After calibration using a set of 175 known or predicted pathogenic and benign variants included within the variant library, 486 variants were characterized as functionally abnormal (n = 277 abnormal and n = 209 likely abnormal), of which 439 (90.3%) were missense. We identified divergent membrane expression occurring at specific residues, indicating that loss of the wild-type residue itself does not drive pathogenicity, a finding supported by structural modeling data and with broad implications for clinical variant classification both for Alagille syndrome and globally across other disease genes. Of 144 uncertain variants reported in patients undergoing clinical or research testing, 27 had functionally abnormal membrane expression, and inclusion of our data resulted in the reclassification of 26 to likely pathogenic. Functional evidence augments the classification of genomic variants, reducing uncertainty and improving diagnostics. Inclusion of this repository of functional evidence during JAG1 variant reclassification will significantly affect resolution of variant pathogenicity, making a critical impact on the molecular diagnosis of Alagille syndrome.
Affiliation(s)
- Melissa A Gilbert
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA; Division of Pediatric Gastroenterology, Hepatology, and Nutrition, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
- Ernest Keefer-Jacques
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Tanaya Jadhav
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Daniel Antfolk
- Department of Immunology, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA
- Qianqian Ming
- Department of Immunology, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA
- Nicolette Valente
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Grace Tzun-Wen Shaw
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Christopher J Sottolano
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA
- Grace Matwijec
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Vincent C Luca
- Department of Immunology, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA
- Kathleen M Loomes
- Division of Pediatric Gastroenterology, Hepatology, and Nutrition, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pediatrics, The Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
- Ramakrishnan Rajagopalan
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA
- Tristan J Hayeck
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA
- Nancy B Spinner
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA
2
Fasano C, Lepore Signorile M, De Marco K, Forte G, Disciglio V, Sanese P, Grossi V, Simone C. In Silico Deciphering of the Potential Impact of Variants of Uncertain Significance in Hereditary Colorectal Cancer Syndromes. Cells 2024; 13:1314. [PMID: 39195204 DOI: 10.3390/cells13161314] [Received: 06/05/2024] [Revised: 07/23/2024] [Accepted: 08/03/2024] [Indexed: 08/29/2024]
Abstract
Colorectal cancer (CRC) ranks third in terms of cancer incidence worldwide and is responsible for 8% of all deaths globally. Approximately 10% of CRC cases are caused by inherited pathogenic mutations in driver genes involved in pathways that are crucial for CRC tumorigenesis and progression. These hereditary mutations significantly increase the risk of initial benign polyps or adenomas developing into cancer. In recent years, the rapid and accurate sequencing of CRC-specific multigene panels by next-generation sequencing (NGS) technologies has enabled the identification of several recurrent pathogenic variants with established functional consequences. In parallel, rare genetic variants that are not characterized and are, therefore, called variants of uncertain significance (VUSs) have also been detected. The classification of VUSs is a challenging task because each amino acid has specific biochemical properties and uniquely contributes to the structural stability and functional activity of proteins. In this scenario, the ability to computationally predict the effect of a VUS is crucial. In particular, in silico prediction methods can provide useful insights to assess the potential impact of a VUS and support additional clinical evaluation. This approach can further benefit from recent advances in artificial intelligence-based technologies. In this review, we describe the main in silico prediction tools that can be used to evaluate the structural and functional impact of VUSs and provide examples of their application in the analysis of gene variants involved in hereditary CRC syndromes.
Affiliation(s)
- Candida Fasano
- Medical Genetics, National Institute of Gastroenterology, IRCCS "Saverio de Bellis" Research Hospital, 70013 Castellana Grotte, Italy
- Martina Lepore Signorile
- Medical Genetics, National Institute of Gastroenterology, IRCCS "Saverio de Bellis" Research Hospital, 70013 Castellana Grotte, Italy
- Katia De Marco
- Medical Genetics, National Institute of Gastroenterology, IRCCS "Saverio de Bellis" Research Hospital, 70013 Castellana Grotte, Italy
- Giovanna Forte
- Medical Genetics, National Institute of Gastroenterology, IRCCS "Saverio de Bellis" Research Hospital, 70013 Castellana Grotte, Italy
- Vittoria Disciglio
- Medical Genetics, National Institute of Gastroenterology, IRCCS "Saverio de Bellis" Research Hospital, 70013 Castellana Grotte, Italy
- Paola Sanese
- Medical Genetics, National Institute of Gastroenterology, IRCCS "Saverio de Bellis" Research Hospital, 70013 Castellana Grotte, Italy
- Valentina Grossi
- Medical Genetics, National Institute of Gastroenterology, IRCCS "Saverio de Bellis" Research Hospital, 70013 Castellana Grotte, Italy
- Cristiano Simone
- Medical Genetics, National Institute of Gastroenterology, IRCCS "Saverio de Bellis" Research Hospital, 70013 Castellana Grotte, Italy
- Medical Genetics, Department of Precision and Regenerative Medicine and Jonic Area (DiMePRe-J), University of Bari Aldo Moro, 70124 Bari, Italy
3
Hong Z, Shimagaki KS, Barton JP. popDMS infers mutation effects from deep mutational scanning data. Bioinformatics 2024; 40:btae499. [PMID: 39115383 PMCID: PMC11335369 DOI: 10.1093/bioinformatics/btae499] [Received: 02/19/2024] [Revised: 07/10/2024] [Accepted: 08/06/2024] [Indexed: 08/22/2024]
Abstract
Summary: Deep mutational scanning (DMS) experiments provide a powerful method to measure the functional effects of genetic mutations at massive scales. However, the data generated from these experiments can be difficult to analyze, with significant variation between experimental replicates. To overcome this challenge, we developed popDMS, a computational method based on population genetics theory, to infer the functional effects of mutations from DMS data. Through extensive tests, we found that the functional effects of single mutations and epistasis inferred by popDMS are highly consistent across replicates, comparing favorably with existing methods. Our approach is flexible and can be widely applied to DMS data that includes multiple time points, multiple replicates, and different experimental conditions.
Availability and implementation: popDMS is implemented in Python and Julia, and is freely available on GitHub at https://github.com/bartonlab/popDMS.
Affiliation(s)
- Zhenchen Hong
- Department of Physics and Astronomy, University of California, Riverside, CA 92521, United States
- Kai S Shimagaki
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, PA 15260, United States
- John P Barton
- Department of Physics and Astronomy, University of California, Riverside, CA 92521, United States
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, PA 15260, United States
- Department of Physics and Astronomy, University of Pittsburgh, PA 15260, United States
4
Shephard VK, Brown ML, Thompson BA, Harpur A, McAlary L. Rapid classification of a novel ALS-causing I149S variant in superoxide dismutase-1. Amyotroph Lateral Scler Frontotemporal Degener 2024; 25:608-614. [PMID: 38742757 DOI: 10.1080/21678421.2024.2351177] [Received: 01/30/2024] [Revised: 04/15/2024] [Accepted: 04/30/2024] [Indexed: 05/16/2024]
Abstract
Variants of the oxygen free radical scavenging enzyme superoxide dismutase-1 (SOD1) are associated with the neurodegenerative disease amyotrophic lateral sclerosis (ALS). These variants occur in roughly 20% of familial ALS cases, and 1% of sporadic ALS cases. Here, we identified a novel SOD1 variant in a patient in their 50s who presented with movement deficiencies and neuropsychiatric features. The variant was heterozygous and resulted in the isoleucine at position 149 being substituted with a serine (I149S). In silico analysis predicted the variant to be destabilizing to the SOD1 protein structure. Expression of the SOD1^I149S variant with a C-terminal EGFP tag in neuronal-like NSC-34 cells resulted in extensive inclusion formation and reduced cell viability. Immunoblotting revealed that the intramolecular disulphide between Cys57 and Cys146 was fully reduced for SOD1^I149S. Furthermore, SOD1^I149S was highly susceptible to proteolytic digestion, suggesting a large degree of instability to the protein fold. Finally, fluorescence correlation spectroscopy and native-PAGE of cell lysates showed that SOD1^I149S was monomeric in solution in comparison to the dimeric SOD1^WT. These experimental data were obtained within 3 months and resulted in the rapid re-classification of the variant from a variant of unknown significance (VUS) to a clinically actionable likely pathogenic variant.
Affiliation(s)
- Victoria K Shephard
- Molecular Horizons and School of Chemistry and Molecular Bioscience, University of Wollongong, Wollongong, Australia
- Mikayla L Brown
- Molecular Horizons and School of Chemistry and Molecular Bioscience, University of Wollongong, Wollongong, Australia
- Bryony A Thompson
- Department of Pathology, Royal Melbourne Hospital, Melbourne, VIC, Australia
- Alisha Harpur
- Department of Genomic Medicine, Royal Melbourne Hospital, Melbourne, VIC, Australia
- Luke McAlary
- Molecular Horizons and School of Chemistry and Molecular Bioscience, University of Wollongong, Wollongong, Australia
5
Habib AM, Cox JJ, Okorokov AL. Out of the dark: the emerging roles of lncRNAs in pain. Trends Genet 2024; 40:694-705. [PMID: 38926010 DOI: 10.1016/j.tig.2024.04.009] [Received: 02/27/2024] [Revised: 04/16/2024] [Accepted: 04/17/2024] [Indexed: 06/28/2024]
Abstract
The dark genome, the nonprotein-coding part of the genome, is replete with long noncoding RNAs (lncRNAs). These functionally versatile transcripts, with specific temporal and spatial expression patterns, are critical gene regulators that play essential roles in health and disease. In recent years, FAAH-OUT was identified as the first lncRNA associated with an inherited human pain insensitivity disorder. Several other lncRNAs have also been studied for their contribution to chronic pain and genome-wide association studies are frequently identifying single nucleotide polymorphisms that map to lncRNAs. For a long time overlooked, lncRNAs are coming out of the dark and into the light as major players in human pain pathways and as potential targets for new RNA-based analgesic medicines.
Affiliation(s)
- Abdella M Habib
- College of Medicine, QU Health, Qatar University, PO Box 2713, Doha, Qatar
- James J Cox
- Wolfson Institute for Biomedical Research, Division of Medicine, University College London, London, WC1E 6BT, UK.
- Andrei L Okorokov
- Wolfson Institute for Biomedical Research, Division of Medicine, University College London, London, WC1E 6BT, UK.
6
Ozturk K, Panwala R, Sheen J, Ford K, Jayne N, Portell A, Zhang DE, Hutter S, Haferlach T, Ideker T, Mali P, Carter H. Interface-guided phenotyping of coding variants in the transcription factor RUNX1. Cell Rep 2024; 43:114436. [PMID: 38968069 PMCID: PMC11345852 DOI: 10.1016/j.celrep.2024.114436] [Received: 12/08/2023] [Revised: 05/15/2024] [Accepted: 06/19/2024] [Indexed: 07/07/2024]
Abstract
Single-gene missense mutations remain challenging to interpret. Here, we deploy scalable functional screening by sequencing (SEUSS), a Perturb-seq method, to generate mutations at protein interfaces of RUNX1 and quantify their effect on activities of downstream cellular programs. We evaluate single-cell RNA profiles of 115 mutations in myelogenous leukemia cells and categorize them into three functionally distinct groups, wild-type (WT)-like, loss-of-function (LoF)-like, and hypomorphic, that we validate in orthogonal assays. LoF-like variants dominate the DNA-binding site and are recurrent in cancer; however, recurrence alone does not predict functional impact. Hypomorphic variants share characteristics with LoF-like but favor protein interactions, promoting gene expression indicative of nerve growth factor (NGF) response and cytokine recruitment of neutrophils. Accessible DNA near differentially expressed genes frequently contains RUNX1-binding motifs. Finally, we reclassify 16 variants of uncertain significance and train a classifier to predict 103 more. Our work demonstrates the potential of targeting protein interactions to better define the landscape of phenotypes reachable by missense mutations.
Affiliation(s)
- Kivilcim Ozturk
- Division of Medical Genetics, Department of Medicine, University of California, San Diego, La Jolla, CA, USA; Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
- Rebecca Panwala
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Jeanna Sheen
- School of Biological Sciences, University of California, San Diego, La Jolla, CA, USA
- Kyle Ford
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Nathan Jayne
- School of Biological Sciences, University of California, San Diego, La Jolla, CA, USA; Moores Cancer Center, University of California, San Diego, La Jolla, CA, USA
- Andrew Portell
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Dong-Er Zhang
- Moores Cancer Center, University of California, San Diego, La Jolla, CA, USA
- Stephan Hutter
- MLL Munich Leukemia Laboratory, Max-Lebsche-Platz 31, 81377 Munich, Germany
- Torsten Haferlach
- MLL Munich Leukemia Laboratory, Max-Lebsche-Platz 31, 81377 Munich, Germany
- Trey Ideker
- Division of Medical Genetics, Department of Medicine, University of California, San Diego, La Jolla, CA, USA; Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA; Moores Cancer Center, University of California, San Diego, La Jolla, CA, USA
- Prashant Mali
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA.
- Hannah Carter
- Division of Medical Genetics, Department of Medicine, University of California, San Diego, La Jolla, CA, USA; Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA; Moores Cancer Center, University of California, San Diego, La Jolla, CA, USA.
7
Plowman JN, Matoy EJ, Uppala LV, Draves SB, Watson CJ, Sefranek BA, Stacey ML, Anderson SP, Belshan MA, Blue EE, Huff CD, Fu Y, Stessman HAF. Targeted sequencing for hereditary breast and ovarian cancer in BRCA1/2-negative families reveals complex genetic architecture and phenocopies. HGG Advances 2024; 5:100306. [PMID: 38734904 PMCID: PMC11166883 DOI: 10.1016/j.xhgg.2024.100306] [Received: 01/09/2024] [Revised: 05/07/2024] [Accepted: 05/07/2024] [Indexed: 05/13/2024]
Abstract
Approximately 20% of breast cancer cases are attributed to increased family risk, yet variation in BRCA1/2 can only explain 20%-25% of cases. Historically, only single gene or single variant testing were common in at-risk family members, and further sequencing studies were rarely offered after negative results. In this study, we applied an efficient and inexpensive targeted sequencing approach to provide molecular diagnoses in 245 human samples representing 134 BRCA mutation-negative (BRCAX) hereditary breast and ovarian cancer (HBOC) families recruited from 1973 to 2019 by Dr. Henry Lynch. Sequencing identified 391 variants, which were functionally annotated and ranked based on their predicted clinical impact. Known pathogenic CHEK2 breast cancer variants were identified in five BRCAX families in this study. While BRCAX was an inclusion criterion for this study, we still identified a pathogenic BRCA2 variant (p.Met192ValfsTer13) in one family. A portion of BRCAX families could be explained by other hereditary cancer syndromes that increase HBOC risk: Li-Fraumeni syndrome (gene: TP53) and Lynch syndrome (gene: MSH6). Interestingly, many families carried additional variants of undetermined significance (VOUSs) that may further modify phenotypes of syndromic family members. Ten families carried more than one potential VOUS, suggesting the presence of complex multi-variant families. Overall, nine BRCAX HBOC families in our study may be explained by known likely pathogenic/pathogenic variants, and six families carried potential VOUSs, which require further functional testing. To address this, we developed a functional assay where we successfully re-classified one family's PMS2 VOUS as benign.
Affiliation(s)
- Jocelyn N Plowman
- Department of Pharmacology and Neuroscience, Creighton University, Omaha, NE 68178, USA
- Evanjalina J Matoy
- Department of Pharmacology and Neuroscience, Creighton University, Omaha, NE 68178, USA
- Lavanya V Uppala
- Department of Pharmacology and Neuroscience, Creighton University, Omaha, NE 68178, USA
- Samantha B Draves
- Department of Pharmacology and Neuroscience, Creighton University, Omaha, NE 68178, USA
- Cynthia J Watson
- Creighton University Core Facilities, Creighton University, Omaha, NE 68178, USA
- Bridget A Sefranek
- Creighton University Core Facilities, Creighton University, Omaha, NE 68178, USA
- Mark L Stacey
- Creighton University Core Facilities, Creighton University, Omaha, NE 68178, USA
- Samuel P Anderson
- Creighton University Core Facilities, Creighton University, Omaha, NE 68178, USA
- Michael A Belshan
- Department of Medical Microbiology and Immunology, Creighton University, Omaha, NE 68178, USA
- Elizabeth E Blue
- Department of Medicine, Division of Medical Genetics, University of Washington, Seattle, WA 98195, USA; Institute for Public Health Genetics, University of Washington, Seattle, WA 98195, USA; Brotman Baty Institute, Seattle, WA 98195, USA
- Chad D Huff
- Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Yusi Fu
- Department of Biomedical Sciences, Creighton University, Omaha, NE 68178, USA
- Holly A F Stessman
- Department of Pharmacology and Neuroscience, Creighton University, Omaha, NE 68178, USA; Creighton University Core Facilities, Creighton University, Omaha, NE 68178, USA.
8
Murciano-Goroff YR, Uppal M, Chen M, Harada G, Schram AM. Basket Trials: Past, Present, and Future. Annual Review of Cancer Biology 2024; 8:59-80. [PMID: 38938274 PMCID: PMC11210107 DOI: 10.1146/annurev-cancerbio-061421-012927] [Indexed: 06/29/2024]
Abstract
Large-scale tumor molecular profiling has revealed that diverse cancer histologies are driven by common pathways with unifying biomarkers that can be exploited therapeutically. Disease-agnostic basket trials have been increasingly utilized to test biomarker-driven therapies across cancer types. These trials have led to drug approvals and improved the lives of patients while simultaneously advancing our understanding of cancer biology. This review focuses on the practicalities of implementing basket trials, with an emphasis on molecularly targeted trials. We examine the biologic subtleties of genomic biomarker and patient selection, discuss previous successes in drug development facilitated by basket trials, describe certain novel targets and drugs, and emphasize practical considerations for participant recruitment and study design. This review also highlights strategies for aiding patient access to basket trials. As basket trials become more common, steps to ensure equitable implementation of these studies will be critical for molecularly targeted drug development.
Affiliation(s)
- Manik Uppal
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Weill Cornell Medical College, New York, NY, USA
- Monica Chen
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Guilherme Harada
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Alison M Schram
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Weill Cornell Medical College, New York, NY, USA
9
Ma K, Gauthier LO, Cheung F, Huang S, Lek M. High-throughput assays to assess variant effects on disease. Dis Model Mech 2024; 17:dmm050573. [PMID: 38940340 PMCID: PMC11225591 DOI: 10.1242/dmm.050573] [Indexed: 06/29/2024]
Abstract
Interpreting the wealth of rare genetic variants discovered in population-scale sequencing efforts and deciphering their associations with human health and disease present a critical challenge due to the lack of sufficient clinical case reports. One promising avenue to overcome this problem is deep mutational scanning (DMS), a method of introducing and evaluating large-scale genetic variants in model cell lines. DMS allows unbiased investigation of variants, including those that are not found in clinical reports, thus improving rare disease diagnostics. Currently, the main obstacle limiting the full potential of DMS is the availability of functional assays that are specific to disease mechanisms. Thus, we explore high-throughput functional methodologies suitable to examine broad disease mechanisms. We specifically focus on methods that do not require robotics or automation but instead use well-designed molecular tools to transform biological mechanisms into easily detectable signals, such as cell survival rate, fluorescence or drug resistance. Here, we aim to bridge the gap between disease-relevant assays and their integration into the DMS framework.
Affiliation(s)
- Kaiyue Ma
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Logan O. Gauthier
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Frances Cheung
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Shushu Huang
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Monkol Lek
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
10
Linga BG, Mohammed SGAA, Farrell T, Rifai HA, Al-Dewik N, Qoronfleh MW. Genomic Newborn Screening for Pediatric Cancer Predisposition Syndromes: A Holistic Approach. Cancers (Basel) 2024; 16:2017. [PMID: 38893137 PMCID: PMC11171256 DOI: 10.3390/cancers16112017] [Received: 04/19/2024] [Revised: 05/23/2024] [Accepted: 05/24/2024] [Indexed: 06/21/2024]
Abstract
As next-generation sequencing (NGS) has become more widely used, germline and rare genetic variations responsible for inherited illnesses, including cancer predisposition syndromes (CPSs) that account for up to 10% of childhood malignancies, have been found. The CPSs are a group of germline genetic disorders that have been identified as risk factors for pediatric cancer development. Excluding a few "classic" CPSs, there is no agreement regarding when and how to conduct germline genetic diagnostic studies in children with cancer due to the constant evolution of knowledge in NGS technologies. Various clinical screening tools have been suggested to aid in the identification of individuals who are at greater risk, using diverse strategies and with varied outcomes. We present here an overview of the primary clinical and molecular characteristics of various CPSs and summarize the existing clinical genomics data on the prevalence of CPSs in pediatric cancer patients. Additionally, we discuss several ethical issues, challenges, limitations, cost-effectiveness, and integration of genomic newborn screening for CPSs into a healthcare system. Furthermore, we assess the effectiveness of commonly utilized decision-support tools in identifying patients who may benefit from genetic counseling and/or direct genetic testing. This investigation highlights a tailored and systematic approach utilizing medical newborn screening tools such as the genome sequencing of high-risk newborns for CPSs, which could be a practical and cost-effective strategy in pediatric cancer care.
Affiliation(s)
- BalaSubramani Gattu Linga
- Department of Research, Women’s Wellness and Research Center, Hamad Medical Corporation (HMC), P.O. Box 3050, Doha 0974, Qatar
- Translational and Precision Medicine Research, Women’s Wellness and Research Center (WWRC), Hamad Medical Corporation (HMC), Doha 0974, Qatar
- Thomas Farrell
- Department of Research, Women’s Wellness and Research Center, Hamad Medical Corporation (HMC), P.O. Box 3050, Doha 0974, Qatar
- Hilal Al Rifai
- Neonatal Intensive Care Unit (NICU), Newborn Screening Unit, Department of Pediatrics and Neonatology, Women’s Wellness and Research Center (WWRC), Hamad Medical Corporation (HMC), Doha 0974, Qatar
- Nader Al-Dewik
- Department of Research, Women’s Wellness and Research Center, Hamad Medical Corporation (HMC), P.O. Box 3050, Doha 0974, Qatar
- Translational and Precision Medicine Research, Women’s Wellness and Research Center (WWRC), Hamad Medical Corporation (HMC), Doha 0974, Qatar
- Neonatal Intensive Care Unit (NICU), Newborn Screening Unit, Department of Pediatrics and Neonatology, Women’s Wellness and Research Center (WWRC), Hamad Medical Corporation (HMC), Doha 0974, Qatar
- Genomics and Precision Medicine (GPM), College of Health & Life Science (CHLS), Hamad Bin Khalifa University (HBKU), Doha 0974, Qatar
- Faculty of Health and Social Care Sciences, Kingston University and St George’s University of London, Kingston upon Thames, Surrey, London KT1 2EE, UK
- M. Walid Qoronfleh
- Healthcare Research & Policy Division, Q3 Research Institute (QRI), Ann Arbor, MI 48197, USA
11
Parmar JM, Laing NG, Kennerson ML, Ravenscroft G. Genetics of inherited peripheral neuropathies and the next frontier: looking backwards to progress forwards. J Neurol Neurosurg Psychiatry 2024:jnnp-2024-333436. [PMID: 38744462 DOI: 10.1136/jnnp-2024-333436] [Received: 01/18/2024] [Accepted: 04/10/2024] [Indexed: 05/16/2024]
Abstract
Inherited peripheral neuropathies (IPNs) encompass a clinically and genetically heterogeneous group of disorders causing length-dependent degeneration of peripheral autonomic, motor and/or sensory nerves. Despite gold-standard diagnostic testing for pathogenic variants in over 100 known associated genes, many patients with IPN remain genetically unsolved. Providing patients with a diagnosis is critical for reducing their 'diagnostic odyssey', improving clinical care, and for informed genetic counselling. The last decade of massively parallel sequencing technologies has seen a rapid increase in the number of newly described IPN-associated gene variants contributing to IPN pathogenesis. However, the scarcity of additional families and functional data supporting variants in potential novel genes is prolonging patient diagnostic uncertainty and contributing to the missing heritability of IPNs. We review the last decade of IPN disease gene discovery to highlight novel genes, structural variation and short tandem repeat expansions contributing to IPN pathogenesis. From the lessons learnt, we provide our vision for IPN research as we anticipate the future, providing examples of emerging technologies, resources and tools that we propose that will expedite the genetic diagnosis of unsolved IPN families.
Affiliation(s)
- Jevin M Parmar
  - Rare Disease Genetics and Functional Genomics, Harry Perkins Institute of Medical Research, Perth, Western Australia, Australia
  - Centre for Medical Research, Faculty of Health and Medical Sciences, The University of Western Australia, Perth, Western Australia, Australia
- Nigel G Laing
  - Centre for Medical Research, Faculty of Health and Medical Sciences, The University of Western Australia, Perth, Western Australia, Australia
  - Preventive Genetics, Harry Perkins Institute of Medical Research, Perth, Western Australia, Australia
- Marina L Kennerson
  - Northcott Neuroscience Laboratory, ANZAC Research Institute, Concord, New South Wales, Australia
  - Molecular Medicine Laboratory, Concord Hospital, Concord, New South Wales, Australia
- Gianina Ravenscroft
  - Rare Disease Genetics and Functional Genomics, Harry Perkins Institute of Medical Research, Perth, Western Australia, Australia
  - Centre for Medical Research, Faculty of Health and Medical Sciences, The University of Western Australia, Perth, Western Australia, Australia

12
Ryu J, Barkal S, Yu T, Jankowiak M, Zhou Y, Francoeur M, Phan QV, Li Z, Tognon M, Brown L, Love MI, Bhat V, Lettre G, Ascher DB, Cassa CA, Sherwood RI, Pinello L. Joint genotypic and phenotypic outcome modeling improves base editing variant effect quantification. Nat Genet 2024; 56:925-937. [PMID: 38658794 DOI: 10.1038/s41588-024-01726-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 03/21/2024] [Indexed: 04/26/2024]
Abstract
CRISPR base editing screens enable analysis of disease-associated variants at scale; however, variable efficiency and precision confounds the assessment of variant-induced phenotypes. Here, we provide an integrated experimental and computational pipeline that improves estimation of variant effects in base editing screens. We use a reporter construct to measure guide RNA (gRNA) editing outcomes alongside their phenotypic consequences and introduce base editor screen analysis with activity normalization (BEAN), a Bayesian network that uses per-guide editing outcomes provided by the reporter and target site chromatin accessibility to estimate variant impacts. BEAN outperforms existing tools in variant effect quantification. We use BEAN to pinpoint common regulatory variants that alter low-density lipoprotein (LDL) uptake, implicating previously unreported genes. Additionally, through saturation base editing of LDLR, we accurately quantify missense variant pathogenicity that is consistent with measurements in UK Biobank patients and identify underlying structural mechanisms. This work provides a widely applicable approach to improve the power of base editing screens for disease-associated variant characterization.
Affiliation(s)
- Jayoung Ryu
  - Molecular Pathology Unit, Krantz Family Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Gene Regulation Observatory, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Sam Barkal
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Tian Yu
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Martin Jankowiak
  - Gene Regulation Observatory, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Yunzhuo Zhou
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Matthew Francoeur
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Quang Vinh Phan
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Zhijian Li
  - Molecular Pathology Unit, Krantz Family Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Gene Regulation Observatory, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Manuel Tognon
  - Molecular Pathology Unit, Krantz Family Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Gene Regulation Observatory, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Computer Science Department, University of Verona, Verona, Italy
- Lara Brown
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Michael I Love
  - Department of Genetics, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Vineel Bhat
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Guillaume Lettre
  - Montreal Heart Institute, Montréal, Quebec, Canada
  - Faculté de Médecine, Université de Montréal, Montréal, Quebec, Canada
- David B Ascher
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Christopher A Cassa
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Richard I Sherwood
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Luca Pinello
  - Molecular Pathology Unit, Krantz Family Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Gene Regulation Observatory, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA

13
Claussnitzer M, Parikh VN, Wagner AH, Arbesfeld JA, Bult CJ, Firth HV, Muffley LA, Nguyen Ba AN, Riehle K, Roth FP, Tabet D, Bolognesi B, Glazer AM, Rubin AF. Minimum information and guidelines for reporting a multiplexed assay of variant effect. Genome Biol 2024; 25:100. [PMID: 38641812 PMCID: PMC11027375 DOI: 10.1186/s13059-024-03223-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Accepted: 03/25/2024] [Indexed: 04/21/2024] Open
Abstract
Multiplexed assays of variant effect (MAVEs) have emerged as a powerful approach for interrogating thousands of genetic variants in a single experiment. The flexibility and widespread adoption of these techniques across diverse disciplines have led to a heterogeneous mix of data formats and descriptions, which complicates the downstream use of the resulting datasets. To address these issues and promote reproducibility and reuse of MAVE data, we define a set of minimum information standards for MAVE data and metadata and outline a controlled vocabulary aligned with established biomedical ontologies for describing these experimental designs.
Affiliation(s)
- Melina Claussnitzer
  - The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Cambridge, MA, 02142, USA
- Victoria N Parikh
  - Stanford Center for Inherited Cardiovascular Disease, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Alex H Wagner
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43215, USA
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, 43210, USA
- Jeremy A Arbesfeld
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43215, USA
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
- Carol J Bult
  - The Jackson Laboratory, Bar Harbor, ME, 04609, USA
- Helen V Firth
  - Wellcome Sanger Institute, Hinxton, Cambridge, UK
  - Dept of Medical Genetics, Cambridge University Hospitals NHS Trust, Cambridge, UK
- Lara A Muffley
  - Department of Genome Sciences, University of Washington, Seattle, WA, 98105, USA
- Alex N Nguyen Ba
  - Department of Biology, University of Toronto at Mississauga, Mississauga, ON, Canada
- Kevin Riehle
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Frederick P Roth
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Daniel Tabet
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Benedetta Bolognesi
  - Institute for Bioengineering of Catalunya (IBEC), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Andrew M Glazer
  - Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Alan F Rubin
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia

14
Wu H, Lin JH, Tang XY, Marenne G, Zou WB, Schutz S, Masson E, Génin E, Fichou Y, Le Gac G, Férec C, Liao Z, Chen JM. Combining full-length gene assay and SpliceAI to interpret the splicing impact of all possible SPINK1 coding variants. Hum Genomics 2024; 18:21. [PMID: 38414044 PMCID: PMC10898081 DOI: 10.1186/s40246-024-00586-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 02/13/2024] [Indexed: 02/29/2024] Open
Abstract
BACKGROUND: Single-nucleotide variants (SNVs) within gene coding sequences can significantly impact pre-mRNA splicing, bearing profound implications for pathogenic mechanisms and precision medicine. In this study, we aim to harness the well-established full-length gene splicing assay (FLGSA) in conjunction with SpliceAI to prospectively interpret the splicing effects of all potential coding SNVs within the four-exon SPINK1 gene, a gene associated with chronic pancreatitis.
RESULTS: Our study began with a retrospective analysis of 27 SPINK1 coding SNVs previously assessed using FLGSA, proceeded with a prospective analysis of 35 new FLGSA-tested SPINK1 coding SNVs, followed by data extrapolation, and ended with further validation. In total, we analyzed 67 SPINK1 coding SNVs, which account for 9.3% of the 720 possible coding SNVs. Among these 67 FLGSA-analyzed SNVs, 12 were found to impact splicing. Through detailed comparison of FLGSA results and SpliceAI predictions, we inferred that the remaining 653 untested coding SNVs in the SPINK1 gene are unlikely to significantly affect splicing. Of the 12 splice-altering events, nine produced both normally spliced and aberrantly spliced transcripts, while the remaining three only generated aberrantly spliced transcripts. These splice-impacting SNVs were found solely in exons 1 and 2, notably at the first and/or last coding nucleotides of these exons. Among the 12 splice-altering events, 11 were missense variants (2.17% of 506 potential missense variants), and one was synonymous (0.61% of 164 potential synonymous variants). Notably, adjusting the SpliceAI cut-off to 0.30 instead of the conventional 0.20 would improve specificity without reducing sensitivity.
CONCLUSIONS: By integrating FLGSA with SpliceAI, we have determined that less than 2% (1.67%) of all possible coding SNVs in SPINK1 significantly influence splicing outcomes. Our findings emphasize the critical importance of conducting splicing analysis within the broader genomic sequence context of the study gene and highlight the inherent uncertainties associated with intermediate SpliceAI scores (0.20 to 0.80). This study contributes to the field by being the first to prospectively interpret all potential coding SNVs in a disease-associated gene with a high degree of accuracy, representing a meaningful attempt at shifting from retrospective to prospective variant analysis in the era of exome and genome sequencing.
Affiliation(s)
- Hao Wu
  - Department of Gastroenterology, Changhai Hospital, Naval Medical University, 168 Changhai Road, Shanghai, 200433, China
  - Shanghai Institute of Pancreatic Diseases, Shanghai, China
- Jin-Huan Lin
  - Department of Gastroenterology, Changhai Hospital, Naval Medical University, 168 Changhai Road, Shanghai, 200433, China
  - Shanghai Institute of Pancreatic Diseases, Shanghai, China
- Xin-Ying Tang
  - Shanghai Institute of Pancreatic Diseases, Shanghai, China
  - Department of Prevention and Health Care, Eastern Hepatobiliary Surgery Hospital, Naval Medical University, Shanghai, China
- Gaëlle Marenne
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
- Wen-Bin Zou
  - Department of Gastroenterology, Changhai Hospital, Naval Medical University, 168 Changhai Road, Shanghai, 200433, China
  - Shanghai Institute of Pancreatic Diseases, Shanghai, China
- Sacha Schutz
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
  - Service de Génétique Médicale et de Biologie de La Reproduction, CHRU Brest, Brest, France
- Emmanuelle Masson
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
  - Service de Génétique Médicale et de Biologie de La Reproduction, CHRU Brest, Brest, France
- Yann Fichou
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
- Gerald Le Gac
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
  - Service de Génétique Médicale et de Biologie de La Reproduction, CHRU Brest, Brest, France
- Claude Férec
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
- Zhuan Liao
  - Department of Gastroenterology, Changhai Hospital, Naval Medical University, 168 Changhai Road, Shanghai, 200433, China
  - Shanghai Institute of Pancreatic Diseases, Shanghai, China
- Jian-Min Chen
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France

15
Hong Z, Barton JP. popDMS infers mutation effects from deep mutational scanning data. bioRxiv 2024:2024.01.29.577759. [PMID: 38352383 PMCID: PMC10862717 DOI: 10.1101/2024.01.29.577759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
Deep mutational scanning (DMS) experiments provide a powerful method to measure the functional effects of genetic mutations at massive scales. However, the data generated from these experiments can be difficult to analyze, with significant variation between experimental replicates. To overcome this challenge, we developed popDMS, a computational method based on population genetics theory, to infer the functional effects of mutations from DMS data. Through extensive tests, we found that the functional effects of single mutations and epistasis inferred by popDMS are highly consistent across replicates, comparing favorably with existing methods. Our approach is flexible and can be widely applied to DMS data that includes multiple time points, multiple replicates, and different experimental conditions.
Affiliation(s)
- Zhenchen Hong
  - Department of Physics and Astronomy, University of California, Riverside, USA
- John P. Barton
  - Department of Physics and Astronomy, University of California, Riverside, USA
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, USA
  - Department of Physics and Astronomy, University of Pittsburgh, USA

16
Fowler DM, Rehm HL. Will variants of uncertain significance still exist in 2030? Am J Hum Genet 2024; 111:5-10. [PMID: 38086381 PMCID: PMC10806733 DOI: 10.1016/j.ajhg.2023.11.005] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/22/2023] [Revised: 11/12/2023] [Accepted: 11/13/2023] [Indexed: 12/28/2023] Open
Abstract
In 2020, the National Human Genome Research Institute (NHGRI) made ten "bold predictions," including that "the clinical relevance of all encountered genomic variants will be readily predictable, rendering the diagnostic designation 'variant of uncertain significance (VUS)' obsolete." We discuss the prospects for this prediction, arguing that many, if not most, VUS in coding regions will be resolved by 2030. We outline a confluence of recent changes making this possible, especially advances in the standards for variant classification that better leverage diverse types of evidence, improvements in computational variant effect predictor performance, scalable multiplexed assays of variant effect capable of saturating the genome, and data-sharing efforts that will maximize the information gained from each new individual sequenced and variant interpreted. We suggest that clinicians and researchers can realize a future where VUSs have largely been eliminated, in line with the NHGRI's bold prediction. The length of time taken to reach this future, and thus whether we are able to achieve the goal of largely eliminating VUSs by 2030, is largely a consequence of the choices made now and in the next few years. We believe that investing in eliminating VUSs is worthwhile, since their predominance remains one of the biggest challenges to precision genomic medicine.
Affiliation(s)
- Douglas M Fowler
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Heidi L Rehm
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA

17
Maes S, Deploey N, Peelman F, Eyckerman S. Deep mutational scanning of proteins in mammalian cells. Cell Rep Methods 2023; 3:100641. [PMID: 37963462 PMCID: PMC10694495 DOI: 10.1016/j.crmeth.2023.100641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 07/06/2023] [Accepted: 10/20/2023] [Indexed: 11/16/2023]
Abstract
Protein mutagenesis is essential for unveiling the molecular mechanisms underlying protein function in health, disease, and evolution. In the past decade, deep mutational scanning methods have evolved to support the functional analysis of nearly all possible single-amino acid changes in a protein of interest. While historically these methods were developed in lower organisms such as E. coli and yeast, recent technological advancements have resulted in the increased use of mammalian cells, particularly for studying proteins involved in human disease. These advancements will aid significantly in the classification and interpretation of variants of unknown significance, which are being discovered at large scale due to the current surge in the use of whole-genome sequencing in clinical contexts. Here, we explore the experimental aspects of deep mutational scanning studies in mammalian cells and report the different methods used in each step of the workflow, ultimately providing a useful guide toward the design of such studies.
Affiliation(s)
- Stefanie Maes
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biochemistry and Microbiology, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Nick Deploey
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Frank Peelman
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Sven Eyckerman
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium

18
Derbel H, Zhao Z, Liu Q. Accurate prediction of functional effect of single amino acid variants with deep learning. Comput Struct Biotechnol J 2023; 21:5776-5784. [PMID: 38074467 PMCID: PMC10709104 DOI: 10.1016/j.csbj.2023.11.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 11/08/2023] [Accepted: 11/09/2023] [Indexed: 02/12/2024] Open
Abstract
The assessment of functional effect of amino acid variants is a critical biological problem in proteomics for clinical medicine and protein engineering. Although natively occurring variants offer insights into deleterious variants, high-throughput deep mutational experiments enable comprehensive investigation of amino acid variants for a given protein. However, these mutational experiments are too expensive to dissect millions of variants on thousands of proteins. Thus, computational approaches have been proposed, but they heavily rely on hand-crafted evolutionary conservation, limiting their accuracy. Recent advancement in transformers provides a promising solution to precisely estimate the functional effects of protein variants on high-throughput experimental data. Here, we introduce a novel deep learning model, namely Rep2Mut-V2, which leverages learned representation from transformer models. Rep2Mut-V2 significantly enhances the prediction accuracy for 27 types of measurements of functional effects of protein variants. In the evaluation of 38 protein datasets with 118,933 single amino acid variants, Rep2Mut-V2 achieved an average Spearman's correlation coefficient of 0.7. This surpasses the performance of six state-of-the-art methods, including the recently released methods ESM, DeepSequence and EVE. Even with limited training data, Rep2Mut-V2 outperforms ESM and DeepSequence, showing its potential to extend high-throughput experimental analysis for more protein variants to reduce experimental cost. In conclusion, Rep2Mut-V2 provides accurate predictions of the functional effects of single amino acid variants of protein coding sequences. This tool can significantly aid in the interpretation of variants in human disease studies.
Affiliation(s)
- Houssemeddine Derbel
  - Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, Las Vegas, NV 89154, USA
- Zhongming Zhao
  - Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Qian Liu
  - Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, Las Vegas, NV 89154, USA
  - School of Life Sciences, College of Sciences, University of Nevada, Las Vegas, Las Vegas, NV 89154, USA

19
Abakarova M, Marquet C, Rera M, Rost B, Laine E. Alignment-based Protein Mutational Landscape Prediction: Doing More with Less. Genome Biol Evol 2023; 15:evad201. [PMID: 37936309 PMCID: PMC10653582 DOI: 10.1093/gbe/evad201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 10/27/2023] [Accepted: 11/01/2023] [Indexed: 11/09/2023] Open
Abstract
The wealth of genomic data has boosted the development of computational methods predicting the phenotypic outcomes of missense variants. The most accurate ones exploit multiple sequence alignments, which can be costly to generate. Recent efforts for democratizing protein structure prediction have overcome this bottleneck by leveraging the fast homology search of MMseqs2. Here, we show the usefulness of this strategy for mutational outcome prediction through a large-scale assessment of 1.5M missense variants across 72 protein families. Our study demonstrates the feasibility of producing alignment-based mutational landscape predictions that are both high-quality and compute-efficient for entire proteomes. We provide the community with the whole human proteome mutational landscape and simplified access to our predictive pipeline.
Affiliation(s)
- Marina Abakarova
  - CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), Sorbonne Université, UMR 7238, Paris 75005, France
  - Université Paris Cité, INSERM UMR U1284, 75004 Paris, France
- Céline Marquet
  - Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748 Munich, Germany
  - TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748 Garching, Germany
- Michael Rera
  - Université Paris Cité, INSERM UMR U1284, 75004 Paris, France
- Burkhard Rost
  - Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748 Munich, Germany
  - Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, Garching, 85748 Munich, Germany
  - TUM School of Life Sciences Weihenstephan (TUM-WZW), Alte Akademie 8, Freising, Germany
- Elodie Laine
  - CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), Sorbonne Université, UMR 7238, Paris 75005, France
  - Institut universitaire de France (IUF)

20
Xie MJ, Cromie GA, Owens K, Timour MS, Tang M, Kutz JN, El-Hattab AW, McLaughlin RN, Dudley AM. Constructing and interpreting a large-scale variant effect map for an ultrarare disease gene: Comprehensive prediction of the functional impact of PSAT1 genotypes. PLoS Genet 2023; 19:e1010972. [PMID: 37812589 PMCID: PMC10561871 DOI: 10.1371/journal.pgen.1010972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 09/13/2023] [Indexed: 10/11/2023] Open
Abstract
Reduced activity of the enzymes encoded by PHGDH, PSAT1, and PSPH causes a set of ultrarare, autosomal recessive diseases known as serine biosynthesis defects. These diseases present in a broad phenotypic spectrum: at the severe end is Neu-Laxova syndrome, in the intermediate range are infantile serine biosynthesis defects with severe neurological manifestations and growth deficiency, and at the mild end is childhood disease with intellectual disability. However, L-serine supplementation, especially if started early, can ameliorate and in some cases even prevent symptoms. Therefore, knowledge of pathogenic variants can improve clinical outcomes. Here, we use a yeast-based assay to individually measure the functional impact of 1,914 SNV-accessible amino acid substitutions in PSAT. Results of our assay agree well with clinical interpretations and protein structure-function relationships, supporting the inclusion of our data as functional evidence as part of the ACMG variant interpretation guidelines. We use existing ClinVar variants, disease alleles reported in the literature and variants present as homozygotes in the primAD database to define assay ranges that could aid clinical variant interpretation for up to 98% of the tested variants. In addition to measuring the functional impact of individual variants in yeast haploid cells, we also assay pairwise combinations of PSAT1 alleles that recapitulate human genotypes, including compound heterozygotes, in yeast diploids. Results from our diploid assay successfully distinguish the genotypes of affected individuals from those of healthy carriers and agree well with disease severity. Finally, we present a linear model that uses individual allele measurements to predict the biallelic function of ~1.8 million allele combinations corresponding to potential human genotypes. Taken together, our work provides an example of how large-scale functional assays in model systems can be powerfully applied to the study of ultrarare diseases.
Affiliation(s)
- Michael J. Xie
  - Pacific Northwest Research Institute, Seattle, Washington, United States of America
  - Molecular Engineering Graduate Program, University of Washington, Seattle, Washington, United States of America
- Gareth A. Cromie
  - Pacific Northwest Research Institute, Seattle, Washington, United States of America
- Katherine Owens
  - Pacific Northwest Research Institute, Seattle, Washington, United States of America
  - Department of Applied Mathematics, University of Washington, Seattle, Washington, United States of America
- Martin S. Timour
  - Pacific Northwest Research Institute, Seattle, Washington, United States of America
- Michelle Tang
  - Pacific Northwest Research Institute, Seattle, Washington, United States of America
- J. Nathan Kutz
  - Department of Applied Mathematics, University of Washington, Seattle, Washington, United States of America
- Ayman W. El-Hattab
  - Department of Clinical Sciences, College of Medicine, University of Sharjah, Sharjah, United Arab Emirates
- Aimée M. Dudley
  - Pacific Northwest Research Institute, Seattle, Washington, United States of America
  - Molecular Engineering Graduate Program, University of Washington, Seattle, Washington, United States of America
|
21
|
Ryu J, Barkal S, Yu T, Jankowiak M, Zhou Y, Francoeur M, Phan QV, Li Z, Tognon M, Brown L, Love MI, Lettre G, Ascher DB, Cassa CA, Sherwood RI, Pinello L. Joint genotypic and phenotypic outcome modeling improves base editing variant effect quantification. medRxiv 2023:2023.09.08.23295253. [PMID: 37732177 PMCID: PMC10508837 DOI: 10.1101/2023.09.08.23295253] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/22/2023]
Abstract
CRISPR base editing screens are powerful tools for studying disease-associated variants at scale. However, the efficiency and precision of base editing perturbations vary, confounding the assessment of variant-induced phenotypic effects. Here, we provide an integrated pipeline that improves the estimation of variant impact in base editing screens. We perform high-throughput ABE8e-SpRY base editing screens with an integrated reporter construct to measure the editing efficiency and outcomes of each gRNA alongside their phenotypic consequences. We introduce BEAN, a Bayesian network that accounts for per-guide editing outcomes and target site chromatin accessibility to estimate variant impacts. We show this pipeline attains superior performance compared to existing tools in variant classification and effect size quantification. We use BEAN to pinpoint common variants that alter LDL uptake, implicating novel genes. Additionally, through saturation base editing of LDLR, we enable accurate quantitative prediction of the effects of missense variants on LDL-C levels, which aligns with measurements in UK Biobank individuals, and identify structural mechanisms underlying variant pathogenicity. This work provides a widely applicable approach to improve the power of base editor screens for disease-associated variant characterization.
Affiliation(s)
- Jayoung Ryu
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Sam Barkal
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
| | - Tian Yu
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
| | | | - Yunzhuo Zhou
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
| | - Matthew Francoeur
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
| | - Quang Vinh Phan
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
| | - Zhijian Li
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Manuel Tognon
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Computer Science Department, University of Verona, Verona, Italy
| | - Lara Brown
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
| | - Michael I. Love
- Department of Genetics, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC
| | - Guillaume Lettre
- Montreal Heart Institute, Montréal, QC H1T 1C8, Canada
- Faculté de Médecine, Université de Montréal, Montréal, QC H3T 1J4, Canada
| | - David B. Ascher
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
| | - Christopher A. Cassa
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
| | - Richard I. Sherwood
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
| | - Luca Pinello
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
|
22
|
Kleinschmidt H, Xu C, Bai L. Using Synthetic DNA Libraries to Investigate Chromatin and Gene Regulation. Chromosoma 2023; 132:167-189. [PMID: 37184694 PMCID: PMC10542970 DOI: 10.1007/s00412-023-00796-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 04/25/2023] [Accepted: 04/26/2023] [Indexed: 05/16/2023]
Abstract
Despite the recent explosion in genome-wide studies in chromatin and gene regulation, we are still far from extracting a set of genetic rules that can predict the function of the regulatory genome. One major reason for this deficiency is that gene regulation is a multi-layered process that involves an enormous variable space, which cannot be fully explored using native genomes. This problem can be partially solved by introducing synthetic DNA libraries into cells, a method that can test the regulatory roles of thousands to millions of sequences with limited variables. Here, we review recent applications of this method to study transcription factor (TF) binding, nucleosome positioning, and transcriptional activity. We discuss the design principles, experimental procedures, and major findings from these studies and compare the pros and cons of different approaches.
Affiliation(s)
- Holly Kleinschmidt
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
| | - Cheng Xu
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
| | - Lu Bai
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA.
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA.
- Department of Physics, The Pennsylvania State University, University Park, PA, 16802, USA.
|
23
|
Sahu S, Sullivan TL, Mitrophanov AY, Galloux M, Nousome D, Southon E, Caylor D, Mishra AP, Evans CN, Clapp ME, Burkett S, Malys T, Chari R, Biswas K, Sharan SK. Saturation genome editing of 11 codons and exon 13 of BRCA2 coupled with chemotherapeutic drug response accurately determines pathogenicity of variants. PLoS Genet 2023; 19:e1010940. [PMID: 37713444 PMCID: PMC10529611 DOI: 10.1371/journal.pgen.1010940] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 03/03/2023] [Revised: 09/27/2023] [Accepted: 08/28/2023] [Indexed: 09/17/2023] Open
Abstract
The unknown pathogenicity of a significant number of variants found in cancer-related genes is attributed to limited epidemiological data, resulting in their classification as variants of uncertain significance (VUS). To date, Breast Cancer gene-2 (BRCA2) has the highest number of VUSs, which has necessitated the development of several robust functional assays to determine their functional significance. Here we report the use of a humanized-mouse embryonic stem cell (mESC) line expressing a single copy of the human BRCA2 for a CRISPR-Cas9-based high-throughput functional assay. As a proof-of-principle, we have saturated 11 codons encoded by BRCA2 exons 3, 18, and 19 and all possible single-nucleotide variants in exon 13 and multiplexed these variants for their functional categorization. Specifically, we used a pool of 180-mer single-stranded donor DNA to generate all possible combinations of variants. Using a high-throughput sequencing-based approach, we show a significant drop in the frequency of non-functional variants, whereas functional variants are enriched in the pool of the cells. We further demonstrate the response of these variants to the DNA-damaging agents, cisplatin and olaparib, allowing us to use cellular survival and drug response as parameters for variant classification. Using this approach, we have categorized 599 BRCA2 variants including 93 single-nucleotide variants (SNVs) across the 11 codons, of which 28 are reported in ClinVar. We also functionally categorized 252 SNVs from exon 13 into 188 functional and 60 non-functional variants, demonstrating that saturation genome editing (SGE) coupled with drug sensitivity assays can enhance functional annotation of BRCA2 VUS.
Affiliation(s)
- Sounak Sahu
- Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland, United States of America
| | - Teresa L. Sullivan
- Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland, United States of America
| | - Alexander Y. Mitrophanov
- Statistical Consulting and Scientific Programming, Frederick National Laboratory for Cancer Research, National Institutes of Health, Frederick, Maryland, United States of America
| | | | - Darryl Nousome
- CCR Bioinformatics Resource, Leidos Biomedical Sciences, Inc. Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
| | - Eileen Southon
- Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland, United States of America
| | - Dylan Caylor
- Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland, United States of America
| | - Arun Prakash Mishra
- Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland, United States of America
| | - Christine N. Evans
- Genome Modification Core, Laboratory Animal Sciences Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
| | - Michelle E. Clapp
- Genome Modification Core, Laboratory Animal Sciences Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
| | - Sandra Burkett
- Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland, United States of America
| | - Tyler Malys
- Statistical Consulting and Scientific Programming, Frederick National Laboratory for Cancer Research, National Institutes of Health, Frederick, Maryland, United States of America
| | - Raj Chari
- Genome Modification Core, Laboratory Animal Sciences Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
| | - Kajal Biswas
- Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland, United States of America
| | - Shyam K. Sharan
- Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland, United States of America
|
24
|
Chen J, Woldring DR, Huang F, Huang X, Wei GW. Topological deep learning based deep mutational scanning. Comput Biol Med 2023; 164:107258. [PMID: 37506452 PMCID: PMC10528359 DOI: 10.1016/j.compbiomed.2023.107258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 06/28/2023] [Accepted: 07/08/2023] [Indexed: 07/30/2023]
Abstract
High-throughput deep mutational scanning (DMS) experiments have significantly impacted protein engineering, drug discovery, immunology, cancer biology, and evolutionary biology by enabling the systematic understanding of protein functions. However, the mutational space associated with proteins is astronomically large, making it overwhelming for current experimental capabilities. Therefore, alternative methods for DMS are imperative. We propose a topological deep learning (TDL) paradigm to facilitate in silico DMS. We utilize a new topological data analysis (TDA) technique based on the persistent spectral theory, also known as persistent Laplacian, to capture both topological invariants and the homotopic shape evolution of data. To validate our TDL-DMS model, we use SARS-CoV-2 datasets and show excellent accuracy and reliability for binding interface mutations. This finding is significant for SARS-CoV-2 variant forecasting and designing effective antibodies and vaccines. Our proposed model is expected to have a significant impact on drug discovery, vaccine design, precision medicine, and protein engineering.
Affiliation(s)
- Jiahui Chen
- Department of Mathematical Sciences, University of Arkansas, Fayetteville, AR 72701, USA
| | - Daniel R Woldring
- Department of Chemical Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Faqing Huang
- Department of Chemistry and Biochemistry, University of Southern Mississippi, Hattiesburg, MS 39406, USA
| | - Xuefei Huang
- Department of Chemistry, Michigan State University, MI 48824, USA; Department of Biomedical Engineering, Michigan State University, East Lansing, MI 48824, USA; The Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA; Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA; Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA.
|
25
|
Ozturk K, Panwala R, Sheen J, Ford K, Payne N, Zhang DE, Hutter S, Haferlach T, Ideker T, Mali P, Carter H. Interface-guided phenotyping of coding variants in the transcription factor RUNX1 with SEUSS. bioRxiv 2023:2023.08.03.551876. [PMID: 37577681 PMCID: PMC10418284 DOI: 10.1101/2023.08.03.551876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Understanding the consequences of single amino acid substitutions in cancer driver genes remains an unmet need. Perturb-seq provides a tool to investigate the effects of individual mutations on cellular programs. Here we deploy SEUSS, a Perturb-seq like approach, to generate and assay mutations at physical interfaces of the RUNX1 Runt domain. We measured the impact of 115 mutations on RNA profiles in single myelogenous leukemia cells and used the profiles to categorize mutations into three functionally distinct groups: wild-type (WT)-like, loss-of-function (LOF)-like and hypomorphic. Notably, the largest concentration of functional mutations (non-WT-like) clustered at the DNA binding site and contained many of the more frequently observed mutations in human cancers. Hypomorphic variants shared characteristics with loss of function variants but had gene expression profiles indicative of response to neural growth factor and cytokine recruitment of neutrophils. Additionally, DNA accessibility changes upon perturbations were enriched for RUNX1 binding motifs, particularly near differentially expressed genes. Overall, our work demonstrates the potential of targeting protein interaction interfaces to better define the landscape of prospective phenotypes reachable by amino acid substitutions.
|
26
|
Fowler DM, Adams DJ, Gloyn AL, Hahn WC, Marks DS, Muffley LA, Neal JT, Roth FP, Rubin AF, Starita LM, Hurles ME. An Atlas of Variant Effects to understand the genome at nucleotide resolution. Genome Biol 2023; 24:147. [PMID: 37394429 PMCID: PMC10316620 DOI: 10.1186/s13059-023-02986-x] [Citation(s) in RCA: 32] [Impact Index Per Article: 32.0] [Received: 06/02/2023] [Accepted: 06/13/2023] [Indexed: 07/04/2023] Open
Abstract
Sequencing has revealed hundreds of millions of human genetic variants, and continued efforts will only add to this variant avalanche. Insufficient information exists to interpret the effects of most variants, limiting opportunities for precision medicine and comprehension of genome function. A solution lies in experimental assessment of the functional effect of variants, which can reveal their biological and clinical impact. However, variant effect assays have generally been undertaken reactively for individual variants only after and, in most cases long after, their first observation. Now, multiplexed assays of variant effect can characterise massive numbers of variants simultaneously, yielding variant effect maps that reveal the function of every possible single nucleotide change in a gene or regulatory element. Generating maps for every protein encoding gene and regulatory element in the human genome would create an 'Atlas' of variant effect maps and transform our understanding of genetics and usher in a new era of nucleotide-resolution functional knowledge of the genome. An Atlas would reveal the fundamental biology of the human genome, inform human evolution, empower the development and use of therapeutics and maximize the utility of genomics for diagnosing and treating disease. The Atlas of Variant Effects Alliance is an international collaborative group comprising hundreds of researchers, technologists and clinicians dedicated to realising an Atlas of Variant Effects to help deliver on the promise of genomics.
Affiliation(s)
- Douglas M. Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA USA
- Department of Bioengineering, University of Washington, Seattle, WA USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA USA
| | | | - Anna L. Gloyn
- Department of Pediatrics & Department of Genetics, Division of Endocrinology, Stanford School of Medicine, Stanford University, Stanford, CA USA
| | - William C. Hahn
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA USA
- Broad Institute of MIT and Harvard, Cambridge, MA USA
| | - Debora S. Marks
- Broad Institute of MIT and Harvard, Cambridge, MA USA
- Department of Systems Biology, Harvard Medical School, Cambridge, USA
| | - Lara A. Muffley
- Department of Genome Sciences, University of Washington, Seattle, WA USA
| | - James T. Neal
- Broad Institute of MIT and Harvard, Cambridge, MA USA
- Novo Nordisk Foundation Center for Genomic Mechanisms of Disease at Broad Institute, Cambridge, MA USA
| | - Frederick P. Roth
- Donnelly Centre and Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON Canada
| | - Alan F. Rubin
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC Australia
- Department of Medical Biology, University of Melbourne, Melbourne, VIC Australia
| | - Lea M. Starita
- Department of Genome Sciences, University of Washington, Seattle, WA USA
- Department of Bioengineering, University of Washington, Seattle, WA USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA USA
|
27
|
Rong S, Neil CR, Welch A, Duan C, Maguire S, Meremikwu IC, Meyerson M, Evans BJ, Fairbrother WG. Large-scale functional screen identifies genetic variants with splicing effects in modern and archaic humans. Proc Natl Acad Sci U S A 2023; 120:e2218308120. [PMID: 37192163 PMCID: PMC10214146 DOI: 10.1073/pnas.2218308120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 04/12/2023] [Indexed: 05/18/2023] Open
Abstract
Humans coexisted and interbred with other hominins which later became extinct. These archaic hominins are known to us only through fossil records and, in two cases, genome sequences. Here, we engineer Neanderthal and Denisovan sequences into thousands of artificial genes to reconstruct the pre-mRNA processing patterns of these extinct populations. Of the 5,169 alleles tested in this massively parallel splicing reporter assay (MaPSy), we report 962 exonic splicing mutations that correspond to differences in exon recognition between extant and extinct hominins. Using MaPSy splicing variants, predicted splicing variants, and splicing quantitative trait loci, we show that splice-disrupting variants experienced greater purifying selection in anatomically modern humans than in Neanderthals. Adaptively introgressed variants were enriched for moderate-effect splicing variants, consistent with positive selection for alternatively spliced alleles following introgression. As particularly compelling examples, we characterized a unique tissue-specific alternative splicing variant at the adaptively introgressed innate immunity gene TLR1, as well as a unique Neanderthal introgressed alternative splicing variant in the gene HSPG2 that encodes perlecan. We further identified potentially pathogenic splicing variants found only in Neanderthals and Denisovans in genes related to sperm maturation and immunity. Finally, we found splicing variants that may contribute to variation among modern humans in total bilirubin, balding, hemoglobin levels, and lung capacity. Our findings provide unique insights into natural selection acting on splicing in human evolution and demonstrate how functional assays can be used to identify candidate causal variants underlying differences in gene regulation and phenotype.
Affiliation(s)
- Stephen Rong
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912
- Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
| | - Christopher R. Neil
- Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
| | - Anastasia Welch
- Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
| | - Chaorui Duan
- Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
| | - Samantha Maguire
- Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
| | - Ijeoma C. Meremikwu
- Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
| | - Malcolm Meyerson
- Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
| | - Ben J. Evans
- Department of Biology, McMaster University, Hamilton, ON L8S 4K1, Canada
| | - William G. Fairbrother
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912
- Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
- Hassenfeld Child Health Innovation Institute of Brown University, Providence, RI 02912
|
28
|
Tayebi N, Leon‐Ricardo B, McCall K, Mehinovic E, Engelstad K, Huynh V, Turner TN, Weisenberg J, Thio LL, Hruz P, Williams RSB, De Vivo DC, Petit V, Haller G, Gurnett CA. Quantitative determination of SLC2A1 variant functional effects in GLUT1 deficiency syndrome. Ann Clin Transl Neurol 2023; 10:787-801. [PMID: 37000947 PMCID: PMC10187726 DOI: 10.1002/acn3.51767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 03/08/2023] [Accepted: 03/12/2023] [Indexed: 04/03/2023] Open
Abstract
OBJECTIVE: The goal of this study is to demonstrate the utility of a growth assay to quantify the functional impact of single nucleotide variants (SNVs) in SLC2A1, the gene responsible for Glut1DS. METHODS: The functional impact of 40 SNVs in SLC2A1 was quantitatively determined in HAP1 cells in which SLC2A1 is required for growth. Donor libraries were introduced into the endogenous SLC2A1 gene in HAP1-Lig4KO cells using CRISPR/Cas9. Cell populations were harvested and sequenced to quantify the effect of variants on growth and generate a functional score. Quantitative functional scores were compared to 3-OMG uptake, SLC2A1 cell surface expression, CADD score, and clinical data, including CSF/blood glucose ratio. RESULTS: Nonsense variants (N = 3) were reduced in cell culture over time resulting in negative scores (mean score: -1.15 ± 0.17), whereas synonymous variants (N = 10) were not depleted (mean score: 0.25 ± 0.12) (P < 2e-16). Missense variants (N = 27) yielded a range of functional scores including slightly negative scores, supporting a partial function and intermediate phenotype. Several variants with normal results on either cell surface expression (p.N34S and p.W65R) or 3-OMG uptake (p.W65R) had negative functional scores. There is a moderate but significant correlation between our functional scores and CADD scores. INTERPRETATION: Cell growth is useful to quantitatively determine the functional effects of SLC2A1 variants. Nonsense variants were reliably distinguished from benign variants in this in vitro functional assay. For facilitating early diagnosis and therapeutic intervention, future work is needed to determine the functional effect of every possible variant in SLC2A1.
Affiliation(s)
- Naeimeh Tayebi
- Department of Neurology, Washington University in St Louis, St Louis, Missouri, USA
| | - Brian Leon‐Ricardo
- Department of Neurology, Washington University in St Louis, St Louis, Missouri, USA
| | - Kevin McCall
- Department of Neurology, Washington University in St Louis, St Louis, Missouri, USA
| | - Elvisa Mehinovic
- Department of Genetics, Washington University in St Louis, St Louis, Missouri, USA
| | - Kristin Engelstad
- Department of Neurology, Columbia University Irving Medical Center, New York, New York, USA
| | - Vincent Huynh
- Department of Neurology, Columbia University Irving Medical Center, New York, New York, USA
| | - Tychele N. Turner
- Department of Genetics, Washington University in St Louis, St Louis, Missouri, USA
| | - Judy Weisenberg
- Department of Neurology, Washington University in St Louis, St Louis, Missouri, USA
| | - Liu L. Thio
- Department of Neurology, Washington University in St Louis, St Louis, Missouri, USA
| | - Paul Hruz
- Department of Pediatrics, Washington University in St Louis, St Louis, Missouri, USA
| | - Robin S. B. Williams
- Centre for Biomedical Sciences, Department of Biological Sciences, Royal Holloway University of London, Egham, UK
| | - Darryl C. De Vivo
- Department of Neurology, Columbia University Irving Medical Center, New York, New York, USA
| | | | - Gabe Haller
- Department of Neurology, Washington University in St Louis, St Louis, Missouri, USA
- Department of Genetics, Washington University in St Louis, St Louis, Missouri, USA
- Department of Neurological Surgery, Washington University in St Louis, St Louis, Missouri, USA
|
29
|
Durrant MG, Fanton A, Tycko J, Hinks M, Chandrasekaran SS, Perry NT, Schaepe J, Du PP, Lotfy P, Bassik MC, Bintu L, Bhatt AS, Hsu PD. Systematic discovery of recombinases for efficient integration of large DNA sequences into the human genome. Nat Biotechnol 2023; 41:488-499. [PMID: 36217031 PMCID: PMC10083194 DOI: 10.1038/s41587-022-01494-w] [Citation(s) in RCA: 56] [Impact Index Per Article: 56.0] [Received: 11/09/2021] [Accepted: 09/01/2022] [Indexed: 11/09/2022]
Abstract
Large serine recombinases (LSRs) are DNA integrases that facilitate the site-specific integration of mobile genetic elements into bacterial genomes. Only a few LSRs, such as Bxb1 and PhiC31, have been characterized to date, with limited efficiency as tools for DNA integration in human cells. In this study, we developed a computational approach to identify thousands of LSRs and their DNA attachment sites, expanding known LSR diversity by >100-fold and enabling the prediction of their insertion site specificities. We tested their recombination activity in human cells, classifying them as landing pad, genome-targeting or multi-targeting LSRs. Overall, we achieved up to seven-fold higher recombination than Bxb1 and genome integration efficiencies of 40-75% with cargo sizes over 7 kb. We also demonstrate virus-free, direct integration of plasmid or amplicon libraries for improved functional genomics applications. This systematic discovery of recombinases directly from microbial sequencing data provides a resource of over 60 LSRs experimentally characterized in human cells for large-payload genome insertion without exposed DNA double-stranded breaks.
Affiliation(s)
- Matthew G Durrant
- Arc Institute, Palo Alto, CA, USA
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Alison Fanton
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- University of California, Berkeley-University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
| | - Josh Tycko
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Michaela Hinks
- Department of Bioengineering, Stanford University, Stanford, CA, USA
| | - Sita S Chandrasekaran
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- University of California, Berkeley-University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
| | - Nicholas T Perry
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- University of California, Berkeley-University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
| | - Julia Schaepe
- Department of Bioengineering, Stanford University, Stanford, CA, USA
| | - Peter P Du
- Department of Genetics, Stanford University, Stanford, CA, USA
- Cancer Biology Program, Stanford University, Stanford, CA, USA
| | - Peter Lotfy
- Laboratory of Molecular and Cell Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
| | | | - Lacramioara Bintu
- Department of Bioengineering, Stanford University, Stanford, CA, USA.
| | - Ami S Bhatt
- Department of Genetics, Stanford University, Stanford, CA, USA.
- Department of Medicine (Hematology), Stanford University, Stanford, CA, USA.
| | - Patrick D Hsu
- Arc Institute, Palo Alto, CA, USA.
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA.
- Laboratory of Molecular and Cell Biology, Salk Institute for Biological Studies, La Jolla, CA, USA.
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA.
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA.
|
30
|
Llargués-Sistac G, Bonjoch L, Castellvi-Bel S. HAP1, a new revolutionary cell model for gene editing using CRISPR-Cas9. Front Cell Dev Biol 2023; 11:1111488. [PMID: 36936678 PMCID: PMC10020200 DOI: 10.3389/fcell.2023.1111488] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/29/2022] [Accepted: 02/22/2023] [Indexed: 03/06/2023] Open
Abstract
The use of next-generation sequencing (NGS) technologies has been instrumental in the characterization of the mutational landscape of complex human diseases like cancer. But despite the enormous rise in the identification of disease candidate genetic variants, their functionality is yet to be fully elucidated in order to have a clear implication in patient care. Haploid human cell models have become the tool of choice for functional gene studies, since they only contain one copy of the genome and can therefore show the unmasked phenotype of genetic variants. Over the past few years, the human near-haploid cell line HAP1 has widely been consolidated as one of the favorite cell line models for functional genetic studies. Its rapid turnover coupled with the fact that only one allele needs to be modified in order to express the subsequent desired phenotype has made this human cell line a valuable tool for gene editing by CRISPR-Cas9 technologies. This review examines the recent uses of the HAP1 cell line model in functional genetic studies and high-throughput genetic screens using the CRISPR-Cas9 system. It covers its use in an attempt to develop new and relevant disease models to further elucidate gene function, and create new ways to understand the genetic basis of human diseases. We will cover the advantages and potential of the use of CRISPR-Cas9 technology on HAP1 to easily and efficiently study the functional interpretation of gene function and human single-nucleotide genetic variants of unknown significance identified through NGS technologies, and its implications for changes in clinical practice and patient care.
Affiliation(s)
- Gemma Llargués-Sistac
- Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Gastroenterology Department, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Barcelona, Spain
| | | | - Sergi Castellvi-Bel
- Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Gastroenterology Department, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Barcelona, Spain
|
31
|
Shah BP, Sleiman PM, Mc Donald J, Moeller IH, Kleyn P. Functional characterization of all missense variants in LEPR, PCSK1, and POMC genes arising from single-nucleotide variants. Expert Rev Endocrinol Metab 2023; 18:209-219. [PMID: 36864747 DOI: 10.1080/17446651.2023.2179985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Accepted: 02/09/2023] [Indexed: 02/18/2023]
Abstract
OBJECTIVE: Hyperphagia and early-onset, severe obesity are clinical characteristics of rare melanocortin-4 receptor (MC4R) pathway diseases due to loss-of-function (LOF) variants in genes comprising the MC4R pathway. In vitro functional characterization of 12,879 possible exonic missense variants from single-nucleotide variants (SNVs) of LEPR, POMC, and PCSK1 was performed to determine the impact of these variants on protein function. METHODS: SNVs of the three genes were transiently transfected into cell lines, and each variant was subsequently classified according to functional impact. We validated three assays by comparing classifications against functional characterization of 29 previously published variants. RESULTS: Our results significantly correlated with previously published pathogenic categories (r = 0.623; P = 3.03 × 10⁻⁴) of all potential missense variants arising from SNVs. Of all observed variants identified through available databases and a tested cohort of 16,061 patients with obesity, 8.6% of LEPR, 63.2% of PCSK1, and 10.6% of POMC variants exhibited LOF, including variants currently classified as a variant of uncertain significance (VUS). CONCLUSIONS: The functional data provided here can assist in the reclassification of several VUS in LEPR, PCSK1, and POMC and highlight their impact in MC4R pathway diseases.
Affiliation(s)
- Bhavik P Shah
- Rhythm Pharmaceuticals, Inc, Boston, MA, USA
- Bridgebio Pharma, Palo Alto, CA
| | | | | | - Ida H Moeller
- Rhythm Pharmaceuticals, Inc, Boston, MA, USA
- Sarepta Therapeutics, Cambridge, MA, USA
|
32
|
Gallego Romero I, Lea AJ. Leveraging massively parallel reporter assays for evolutionary questions. Genome Biol 2023; 24:26. [PMID: 36788564 PMCID: PMC9926830 DOI: 10.1186/s13059-023-02856-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 05/08/2022] [Accepted: 01/17/2023] [Indexed: 02/16/2023] Open
Abstract
A long-standing goal of evolutionary biology is to decode how gene regulation contributes to organismal diversity. Doing so is challenging because it is hard to predict function from non-coding sequence and to perform molecular research with non-model taxa. Massively parallel reporter assays (MPRAs) enable the testing of thousands to millions of sequences for regulatory activity simultaneously. Here, we discuss the execution, advantages, and limitations of MPRAs, with a focus on evolutionary questions. We propose solutions for extending MPRAs to rare taxa and those with limited genomic resources, and we underscore MPRA's broad potential for driving genome-scale, functional studies across organisms.
Affiliation(s)
- Irene Gallego Romero
- Melbourne Integrative Genomics, University of Melbourne, Royal Parade, Parkville, Victoria, 3010, Australia
- School of BioSciences, The University of Melbourne, Royal Parade, Parkville, 3010, Australia
- The Centre for Stem Cell Systems, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, 30 Royal Parade, Parkville, Victoria, 3010, Australia
- Center for Genomics, Evolution and Medicine, Institute of Genomics, University of Tartu, Riia 23b, 51010, Tartu, Estonia
| | - Amanda J. Lea
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37240, USA
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN 37240, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37240, USA
- Child and Brain Development Program, Canadian Institute for Advanced Study, Toronto, Canada
|
33
|
Xie MJ, Cromie GA, Owens K, Timour MS, Tang M, Kutz JN, El-Hattab AW, McLaughlin RN, Dudley AM. Predicting the functional effect of compound heterozygous genotypes from large scale variant effect maps. bioRxiv 2023:2023.01.11.523651. [PMID: 36711904 PMCID: PMC9882023 DOI: 10.1101/2023.01.11.523651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2023]
Abstract
Background: Pathogenic variants in PHGDH, PSAT1, and PSPH cause a set of rare, autosomal recessive diseases known as serine biosynthesis defects. Serine biosynthesis defects present in a broad phenotypic spectrum that includes, at the severe end, Neu-Laxova syndrome, a lethal multiple congenital anomaly disease, intermediately in the form of infantile serine biosynthesis defects with severe neurological manifestations and growth deficiency, and at the mild end, as childhood disease with intellectual disability. However, because L-serine supplementation, especially if started early, can ameliorate and in some cases even prevent symptoms, knowledge of pathogenic variants is highly actionable. Methods: Recently, our laboratory established a yeast-based assay for human PSAT1 function. We have now applied it at scale to assay the functional impact of 1,914 SNV-accessible amino acid substitutions. In addition to assaying the functional impact of individual variants in yeast haploid cells, we can assay pairwise combinations of PSAT1 alleles that recapitulate human genotypes, including compound heterozygotes, in yeast diploids. Results: Results of our assays of individual variants (in haploid yeast cells) agree well with clinical interpretations and protein structure-function relationships, supporting the use of our data as functional evidence under the ACMG interpretation guidelines. Results from our diploid assay successfully distinguish patient genotypes from those of healthy carriers and agree well with disease severity. Finally, we present a linear model that uses individual allele measurements (in haploid yeast cells) to accurately predict the biallelic function (in diploid yeast cells) of ~1.8 million allele combinations corresponding to potential human genotypes. Conclusions: Taken together, our work provides an example of how large-scale functional assays in model systems can be powerfully applied to the study of a rare disease.
|
34
|
Sora V, Laspiur AO, Degn K, Arnaudi M, Utichi M, Beltrame L, De Menezes D, Orlandi M, Stoltze UK, Rigina O, Sackett PW, Wadt K, Schmiegelow K, Tiberti M, Papaleo E. RosettaDDGPrediction for high-throughput mutational scans: From stability to binding. Protein Sci 2023; 32:e4527. [PMID: 36461907 PMCID: PMC9795540 DOI: 10.1002/pro.4527] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 09/02/2022] [Revised: 11/25/2022] [Accepted: 11/25/2022] [Indexed: 12/05/2022]
Abstract
Reliable prediction of free energy changes upon amino acid substitutions (ΔΔGs) is crucial to investigate their impact on protein stability and protein-protein interaction. Advances in experimental mutational scans allow high-throughput studies thanks to multiplex techniques. On the other hand, genomics initiatives provide a large amount of data on disease-related variants that can benefit from analyses with structure-based methods. Therefore, the computational field should keep the same pace and provide new tools for fast and accurate high-throughput ΔΔG calculations. In this context, the Rosetta modeling suite implements effective approaches to predict folding/unfolding ΔΔGs in a protein monomer upon amino acid substitutions and calculate the changes in binding free energy in protein complexes. However, their application can be challenging to users without extensive experience with Rosetta. Furthermore, Rosetta protocols for ΔΔG prediction are designed considering one variant at a time, making the setup of high-throughput screenings cumbersome. For these reasons, we devised RosettaDDGPrediction, a customizable Python wrapper designed to run free energy calculations on a set of amino acid substitutions using Rosetta protocols with little intervention from the user. Moreover, RosettaDDGPrediction assists with checking completed runs and aggregates raw data for multiple variants, as well as generates publication-ready graphics. We showed the potential of the tool in four case studies, including variants of uncertain significance in childhood cancer, proteins with known experimental unfolding ΔΔGs values, interactions between target proteins and disordered motifs, and phosphomimetics. RosettaDDGPrediction is available, free of charge and under GNU General Public License v3.0, at https://github.com/ELELAB/RosettaDDGPrediction.
Affiliation(s)
- Valentina Sora
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Adrian Otamendi Laspiur
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Kristine Degn
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Matteo Arnaudi
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Mattia Utichi
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Ludovica Beltrame
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Dayana De Menezes
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Matteo Orlandi
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Ulrik Kristoffer Stoltze
- Department of Clinical Genetics, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark
- Department of Pediatrics and Adolescent Medicine, University Hospital Rigshospitalet, Copenhagen, Denmark
- Institute of Clinical Medicine, Faculty of Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Olga Rigina
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Peter Wad Sackett
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Karin Wadt
- Department of Clinical Genetics, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark
- Institute of Clinical Medicine, Faculty of Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Kjeld Schmiegelow
- Department of Pediatrics and Adolescent Medicine, University Hospital Rigshospitalet, Copenhagen, Denmark
- Institute of Clinical Medicine, Faculty of Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Matteo Tiberti
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
| | - Elena Papaleo
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
|
35
|
Dace P, Findlay GM. Reducing uncertainty in genetic testing with Saturation Genome Editing. MED GENET-BERLIN 2022; 34:297-304. [PMID: 38836089 PMCID: PMC11006300 DOI: 10.1515/medgen-2022-2159] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/06/2024]
Abstract
Accurate interpretation of human genetic data is critical for optimizing outcomes in the era of genomic medicine. Powerful methods for testing genetic variants for functional effects are allowing researchers to characterize thousands of variants across disease genes. Here, we review experimental tools enabling highly scalable assays of variants, focusing specifically on Saturation Genome Editing (SGE). We discuss examples of how this technique is being implemented for variant testing at scale and describe how SGE data for BRCA1 have been clinically validated and used to aid variant interpretation. The initial success at predicting variant pathogenicity with SGE has spurred efforts to expand this and related techniques to many more genes.
Affiliation(s)
- Phoebe Dace
- The Genome Function Laboratory, The Francis Crick Institute, 1 Midland Rd, London, United Kingdom
| | - Gregory M Findlay
- The Genome Function Laboratory, The Francis Crick Institute, 1 Midland Rd, London, United Kingdom
|
36
|
Tabet D, Parikh V, Mali P, Roth FP, Claussnitzer M. Scalable Functional Assays for the Interpretation of Human Genetic Variation. Annu Rev Genet 2022; 56:441-465. [PMID: 36055970 DOI: 10.1146/annurev-genet-072920-032107] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Indexed: 12/24/2022]
Abstract
Scalable sequence-function studies have enabled the systematic analysis and cataloging of hundreds of thousands of coding and noncoding genetic variants in the human genome. This has improved clinical variant interpretation and provided insights into the molecular, biophysical, and cellular effects of genetic variants at an astonishing scale and resolution across the spectrum of allele frequencies. In this review, we explore current applications and prospects for the field and outline the principles underlying scalable functional assay design, with a focus on the study of single-nucleotide coding and noncoding variants.
Affiliation(s)
- Daniel Tabet
- Donnelly Centre, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
| | - Victoria Parikh
- Center for Inherited Cardiovascular Disease, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, California, USA
| | - Prashant Mali
- Department of Bioengineering, University of California, San Diego, California, USA
| | - Frederick P Roth
- Donnelly Centre, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
| | - Melina Claussnitzer
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Center for Genomic Medicine and Endocrine Division, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Harvard University, Boston, Massachusetts, USA
|
37
|
Azbukina N, Zharikova A, Ramensky V. Intragenic compensation through the lens of deep mutational scanning. Biophys Rev 2022; 14:1161-1182. [PMID: 36345285 PMCID: PMC9636336 DOI: 10.1007/s12551-022-01005-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/14/2022] [Accepted: 09/26/2022] [Indexed: 12/20/2022] Open
Abstract
A significant fraction of mutations in proteins are deleterious and result in adverse consequences for protein function, stability, or interaction with other molecules. Intragenic compensation is a specific case of positive epistasis when a neutral missense mutation cancels the effect of a deleterious mutation in the same protein. Permissive compensatory mutations facilitate protein evolution, since without them all sequences would be extremely conserved. Understanding compensatory mechanisms is an important scientific challenge at the intersection of protein biophysics and evolution. In human genetics, intragenic compensatory interactions are important since they may result in variable penetrance of pathogenic mutations or fixation of pathogenic human alleles in orthologous proteins from related species. The latter phenomenon complicates computational and clinical inference of an allele's pathogenicity. Deep mutational scanning is a relatively new technique that enables experimental studies of functional effects of thousands of mutations in proteins. We review the important aspects of the field and discuss existing limitations of current datasets. We reviewed ten published DMS datasets with quantified functional effects of single and double mutations and described rates and patterns of intragenic compensation in eight of them.
Affiliation(s)
- Nadezhda Azbukina
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
| | - Anastasia Zharikova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
| | - Vasily Ramensky
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
|
38
|
Boeck L, Burbaud S, Skwark M, Pearson WH, Sangen J, Wuest AW, Marshall EKP, Weimann A, Everall I, Bryant JM, Malhotra S, Bannerman BP, Kierdorf K, Blundell TL, Dionne MS, Parkhill J, Andres Floto R. Mycobacterium abscessus pathogenesis identified by phenogenomic analyses. Nat Microbiol 2022; 7:1431-1441. [PMID: 36008617 PMCID: PMC9418003 DOI: 10.1038/s41564-022-01204-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 05/14/2021] [Accepted: 07/19/2022] [Indexed: 12/12/2022]
Abstract
The medical and scientific response to emerging and established pathogens is often severely hampered by ignorance of the genetic determinants of virulence, drug resistance and clinical outcomes that could be used to identify therapeutic drug targets and forecast patient trajectories. Taking the newly emergent multidrug-resistant bacteria Mycobacterium abscessus as an example, we show that combining high-dimensional phenotyping with whole-genome sequencing in a phenogenomic analysis can rapidly reveal actionable systems-level insights into bacterial pathobiology. Through phenotyping of 331 clinical isolates, we discovered three distinct clusters of isolates, each with different virulence traits and associated with a different clinical outcome. We combined genome-wide association studies with proteome-wide computational structural modelling to define likely causal variants, and employed direct coupling analysis to identify co-evolving, and therefore potentially epistatic, gene networks. We then used in vivo CRISPR-based silencing to validate our findings and discover clinically relevant M. abscessus virulence factors including a secretion system, thus illustrating how phenogenomics can reveal critical pathways within emerging pathogenic bacteria.
Affiliation(s)
- Lucas Boeck
- Molecular Immunity Unit, University of Cambridge Department of Medicine, MRC Laboratory of Molecular Biology, Cambridge, UK
- Cambridge Centre for AI in Medicine, Cambridge, UK
- Wellcome Sanger Institute, Hinxton, UK
- Department of Biomedicine, University of Basel, Basel, Switzerland
| | - Sophie Burbaud
- Molecular Immunity Unit, University of Cambridge Department of Medicine, MRC Laboratory of Molecular Biology, Cambridge, UK
- Cambridge Centre for AI in Medicine, Cambridge, UK
| | - Marcin Skwark
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Will H Pearson
- MRC Centre for Molecular Bacteriology and Infection, Imperial College London, London, UK
- Department of Life Sciences, Imperial College London, London, UK
| | - Jasper Sangen
- Molecular Immunity Unit, University of Cambridge Department of Medicine, MRC Laboratory of Molecular Biology, Cambridge, UK
- Cambridge Centre for AI in Medicine, Cambridge, UK
| | - Andreas W Wuest
- Department of Biomedicine, University of Basel, Basel, Switzerland
| | - Eleanor K P Marshall
- MRC Centre for Molecular Bacteriology and Infection, Imperial College London, London, UK
- Department of Life Sciences, Imperial College London, London, UK
| | - Aaron Weimann
- Molecular Immunity Unit, University of Cambridge Department of Medicine, MRC Laboratory of Molecular Biology, Cambridge, UK
- Cambridge Centre for AI in Medicine, Cambridge, UK
| | | | - Josephine M Bryant
- Molecular Immunity Unit, University of Cambridge Department of Medicine, MRC Laboratory of Molecular Biology, Cambridge, UK
- Cambridge Centre for AI in Medicine, Cambridge, UK
| | - Sony Malhotra
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Scientific Computing Department, Science and Technology Facilities Council, Harwell, UK
| | - Bridget P Bannerman
- Molecular Immunity Unit, University of Cambridge Department of Medicine, MRC Laboratory of Molecular Biology, Cambridge, UK
- Cambridge Centre for AI in Medicine, Cambridge, UK
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Katrin Kierdorf
- MRC Centre for Molecular Bacteriology and Infection, Imperial College London, London, UK
- Department of Life Sciences, Imperial College London, London, UK
- Institute of Neuropathology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Tom L Blundell
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Marc S Dionne
- MRC Centre for Molecular Bacteriology and Infection, Imperial College London, London, UK
- Department of Life Sciences, Imperial College London, London, UK
| | - Julian Parkhill
- Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
| | - R Andres Floto
- Molecular Immunity Unit, University of Cambridge Department of Medicine, MRC Laboratory of Molecular Biology, Cambridge, UK.
- Cambridge Centre for AI in Medicine, Cambridge, UK.
- Cambridge Centre for Lung Infection, Royal Papworth Hospital, Cambridge, UK.
|
39
|
Pinglay S, Bulajić M, Rahe DP, Huang E, Brosh R, Mamrak NE, King BR, German S, Cadley JA, Rieber L, Easo N, Lionnet T, Mahony S, Maurano MT, Holt LJ, Mazzoni EO, Boeke JD. Synthetic regulatory reconstitution reveals principles of mammalian Hox cluster regulation. Science 2022; 377:eabk2820. [PMID: 35771912 PMCID: PMC9648154 DOI: 10.1126/science.abk2820] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 11/03/2022]
Abstract
Precise Hox gene expression is crucial for embryonic patterning. Intra-Hox transcription factor binding and distal enhancer elements have emerged as the major regulatory modules controlling Hox gene expression. However, quantifying their relative contributions has remained elusive. Here, we introduce "synthetic regulatory reconstitution," a conceptual framework for studying gene regulation, and apply it to the HoxA cluster. We synthesized and delivered variant rat HoxA clusters (130 to 170 kilobases) to an ectopic location in the mouse genome. We found that a minimal HoxA cluster recapitulated correct patterns of chromatin remodeling and transcription in response to patterning signals, whereas the addition of distal enhancers was needed for full transcriptional output. Synthetic regulatory reconstitution could provide a generalizable strategy for deciphering the regulatory logic of gene expression in complex genomes.
Affiliation(s)
- Sudarshan Pinglay
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
| | - Milica Bulajić
- Department of Biology, New York University, New York, NY 10003, USA
| | - Dylan P. Rahe
- Department of Biology, New York University, New York, NY 10003, USA
| | - Emily Huang
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
| | - Ran Brosh
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
| | - Nicholas E. Mamrak
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
| | - Benjamin R. King
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
| | - Sergei German
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
| | - John A. Cadley
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
| | - Lila Rieber
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
| | - Nicole Easo
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
| | - Timothée Lionnet
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
- Department of Cell Biology, NYU Langone Health, New York, NY 10016, USA
- Department of Biomedical Engineering, NYU Tandon School of Engineering, Brooklyn, NY 11201, USA
| | - Shaun Mahony
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
| | - Matthew T. Maurano
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
- Department of Pathology, NYU Langone Health, New York, NY 10016, USA
| | - Liam J. Holt
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
- Department of Biomedical Engineering, NYU Tandon School of Engineering, Brooklyn, NY 11201, USA
- Department of Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016, USA
| | | | - Jef D. Boeke
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
- Department of Biomedical Engineering, NYU Tandon School of Engineering, Brooklyn, NY 11201, USA
- Department of Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016, USA
|
40
|
Li B, Jin B, Capra JA, Bush WS. Integration of Protein Structure and Population-Scale DNA Sequence Data for Disease Gene Discovery and Variant Interpretation. Annu Rev Biomed Data Sci 2022; 5:141-161. [PMID: 35508071 DOI: 10.1146/annurev-biodatasci-122220-112147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
The experimental and computational techniques for capturing information about protein structures and genetic variation within the human genome have advanced dramatically in the past 20 years, generating extensive new data resources. In this review, we discuss these advances, along with new approaches for determining the impact a genetic variant has on protein function. We focus on the potential of new methods that integrate human genetic variation into protein structures to discover relationships to disease, including the discovery of mutational hotspots in cancer-related proteins, the localization of protein-altering variants within protein regions for common complex diseases, and the assessment of variants of unknown significance for Mendelian traits. We expect that approaches that integrate these data sources will play increasingly important roles in disease gene discovery and variant interpretation.
Affiliation(s)
- Bian Li
- Department of Biological Sciences and Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, USA
| | - Bowen Jin
- Graduate Program in Systems Biology and Bioinformatics, Department of Nutrition, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
| | - John A Capra
- Bakar Computational Health Sciences Institute and Department of Epidemiology and Biostatistics, University of California, San Francisco, California, USA
| | - William S Bush
- Cleveland Institute for Computational Biology, Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, Ohio, USA
| |
|
41
|
Biswas A, Haldane A, Levy RM. Limits to detecting epistasis in the fitness landscape of HIV. PLoS One 2022; 17:e0262314. [PMID: 35041711 PMCID: PMC8765623 DOI: 10.1371/journal.pone.0262314] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/15/2021] [Accepted: 12/20/2021] [Indexed: 02/05/2023] Open
Abstract
The rapid evolution of HIV is constrained by interactions between mutations which affect viral fitness. In this work, we explore the role of epistasis in determining the mutational fitness landscape of HIV for multiple drug target proteins, including Protease, Reverse Transcriptase, and Integrase. Epistatic interactions between residues modulate the mutation patterns involved in drug resistance, with unambiguous signatures of epistasis best seen in the comparison of the Potts model predicted and experimental HIV sequence “prevalences” expressed as higher-order marginals (beyond triplets) of the sequence probability distribution. In contrast, experimental measures of fitness such as viral replicative capacities generally probe fitness effects of point mutations in a single background, providing weak evidence for epistasis in viral systems. The detectable effects of epistasis are obscured by higher evolutionary conservation at sites. While double mutant cycles, in principle, provide one of the best ways to probe epistatic interactions experimentally without reference to a particular background, we show that the analysis is complicated by the small dynamic range of measurements. Overall, we show that global pairwise interaction Potts models are necessary for predicting the mutational landscape of viral proteins.
Affiliation(s)
- Avik Biswas
- Department of Physics, Temple University, Philadelphia, PA, United States of America
- Center for Biophysics and Computational Biology, Temple University, Philadelphia, PA, United States of America
| | - Allan Haldane
- Department of Physics, Temple University, Philadelphia, PA, United States of America
- Center for Biophysics and Computational Biology, Temple University, Philadelphia, PA, United States of America
| | - Ronald M. Levy
- Department of Physics, Temple University, Philadelphia, PA, United States of America
- Center for Biophysics and Computational Biology, Temple University, Philadelphia, PA, United States of America
- Department of Chemistry, Temple University, Philadelphia, PA, United States of America
| |
|
42
|
Tosi L, Chaikban L, Larman BH, Rosenfield J, Parekkadan B. Massively parallel DNA target capture using long adapter single stranded oligonucleotide (LASSO) probes assembled through a novel DNA recombinase mediated methodology. Biotechnol J 2022; 17:e2100240. [PMID: 34775678 PMCID: PMC8825753 DOI: 10.1002/biot.202100240] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/06/2021] [Revised: 11/05/2021] [Accepted: 11/05/2021] [Indexed: 02/03/2023]
Abstract
In an attempt to bridge the widening gap from DNA sequence to biological function, we developed a novel methodology to assemble Long-Adapter Single-Strand Oligonucleotide (LASSO) probe libraries that enabled the massively multiplexed capture of kilobase-sized DNA fragments for downstream long-read DNA sequencing or expression. This method uses short DNA oligonucleotides (pre-LASSO probes) and a plasmid vector that supplies the linker sequence for the mature LASSO probe through Cre-LoxP intramolecular recombination. This strategy generates high-quality LASSO probe libraries (≈46% of correct probes). We performed NGS analysis of the post-capture PCR amplification of DNA circles obtained from the LASSO capture of 3087 Escherichia coli ORFs ranging from 400 to 5000 bp. The median enrichment of all targeted ORFs versus untargeted ORFs was 30-fold. For ORFs up to 1 kb in size, targeted ORFs were enriched up to a median of 260-fold. Here, we show that LASSO probes obtained in this manner were able to capture full-length open reading frames from total human cDNA. Furthermore, we show that the LASSO capture specificity and sensitivity are sufficient for target capture from a total human genomic DNA template. This technology can be used for the preparation of long-read sequencing libraries and for massively multiplexed cloning of human sequences.
Affiliation(s)
- Lorenzo Tosi
- Department of Biomedical Engineering, Rutgers University, Piscataway, New Jersey 08854, USA
| | - Lamia Chaikban
- Department of Biomedical Engineering, Rutgers University, Piscataway, New Jersey 08854, USA
| | - Benjamin H. Larman
- Institute of Cell Engineering, Division of Immunology, Department of Pathology, Johns Hopkins University, Baltimore, MD, USA
| | - Jeffrey Rosenfield
- Cancer Institute of New Jersey, New Brunswick, New Jersey 08854, USA
- Department of Pathology, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA
| | - Biju Parekkadan
- Department of Biomedical Engineering, Rutgers University, Piscataway, New Jersey 08854, USA
- Cancer Institute of New Jersey, New Brunswick, New Jersey 08854, USA
- Correspondence and requests for materials should be addressed to B.P. (599 Taylor Road, Piscataway, NJ 08854)
| |
|
43
|
Fayer S, Horton C, Dines JN, Rubin AF, Richardson ME, McGoldrick K, Hernandez F, Pesaran T, Karam R, Shirts BH, Fowler DM, Starita LM. Closing the gap: Systematic integration of multiplexed functional data resolves variants of uncertain significance in BRCA1, TP53, and PTEN. Am J Hum Genet 2021; 108:2248-2258. [PMID: 34793697 DOI: 10.1016/j.ajhg.2021.11.001] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 07/28/2021] [Accepted: 10/29/2021] [Indexed: 12/13/2022] Open
Abstract
Clinical interpretation of missense variants is challenging because the majority identified by genetic testing are rare and their functional effects are unknown. Consequently, most variants are of uncertain significance and cannot be used for clinical diagnosis or management. Although not much can be done to ameliorate variant rarity, multiplexed assays of variant effect (MAVEs), where thousands of single-nucleotide variant effects are simultaneously measured experimentally, provide functional evidence that can help resolve variants of unknown significance (VUSs). However, a rigorous assessment of the clinical value of multiplexed functional data for variant interpretation is lacking. Thus, we systematically combined previously published BRCA1, TP53, and PTEN multiplexed functional data with phenotype and family history data for 324 VUSs identified by a single diagnostic testing laboratory. We curated 49,281 variant functional scores from MAVEs for these three genes and integrated four different TP53 multiplexed functional datasets into a single functional prediction for each variant by using machine learning. We then determined the strength of evidence provided by each multiplexed functional dataset and reevaluated 324 VUSs. Multiplexed functional data were effective in driving variant reclassification when combined with clinical data, eliminating 49% of VUSs for BRCA1, 69% for TP53, and 15% for PTEN. Thus, multiplexed functional data, which are being generated for numerous genes, are poised to have a major impact on clinical variant interpretation.
|
44
|
Matreyek KA, Stephany JJ, Ahler E, Fowler DM. Integrating thousands of PTEN variant activity and abundance measurements reveals variant subgroups and new dominant negatives in cancers. Genome Med 2021; 13:165. [PMID: 34649609 PMCID: PMC8518224 DOI: 10.1186/s13073-021-00984-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/19/2021] [Accepted: 09/30/2021] [Indexed: 01/22/2023] Open
Abstract
Background: PTEN is a multi-functional tumor suppressor protein regulating cell growth, immune signaling, neuronal function, and genome stability. Experimental characterization can help guide the clinical interpretation of the thousands of germline or somatic PTEN variants observed in patients. Two large-scale mutational datasets, one for PTEN variant intracellular abundance encompassing 4112 missense variants and one for lipid phosphatase activity encompassing 7244 variants, were recently published. The combined information from these datasets can reveal variant-specific phenotypes that may underlie various clinical presentations, but this has not been comprehensively examined, particularly for somatic PTEN variants observed in cancers.
Methods: Here, we add to these efforts by measuring the intracellular abundance of 764 new PTEN variants and refining abundance measurements for 3351 previously studied variants. We use this expanded and refined PTEN abundance dataset to explore the mutational patterns governing PTEN intracellular abundance, and then incorporate the phosphatase activity data to subdivide PTEN variants into four functionally distinct groups.
Results: This analysis revealed a set of highly abundant but lipid phosphatase defective variants that could act in a dominant-negative fashion to suppress PTEN activity. Two of these variants were, indeed, capable of dysregulating Akt signaling in cells harboring a WT PTEN allele. Both variants were observed in multiple breast or uterine tumors, demonstrating the disease relevance of these high abundance, inactive variants.
Conclusions: We show that multidimensional, large-scale variant functional data, when paired with public cancer genomics datasets and follow-up assays, can improve understanding of uncharacterized cancer-associated variants, and provide better insights into how they contribute to oncogenesis.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13073-021-00984-x.
Affiliation(s)
- Kenneth A Matreyek
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
| | - Jason J Stephany
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Ethan Ahler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Present Address: Revolution Medicines, Redwood City, CA, 94063, USA
| | - Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
| |
|
45
|
Sesta L, Uguzzoni G, Fernandez-de-Cossio-Diaz J, Pagnani A. AMaLa: Analysis of Directed Evolution Experiments via Annealed Mutational Approximated Landscape. Int J Mol Sci 2021; 22:10908. [PMID: 34681569 PMCID: PMC8535593 DOI: 10.3390/ijms222010908] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/28/2021] [Revised: 09/24/2021] [Accepted: 09/27/2021] [Indexed: 01/12/2023] Open
Abstract
We present Annealed Mutational approximated Landscape (AMaLa), a new method to infer fitness landscapes from the sequencing data of Directed Evolution experiments. Such experiments typically start from a single wild-type sequence, which undergoes Darwinian in vitro evolution via multiple rounds of mutation and selection for a target phenotype. In recent years, Directed Evolution has emerged as a powerful instrument to probe fitness landscapes under controlled experimental conditions and, thanks to high-throughput screening and sequencing, as a relevant testing ground for developing accurate statistical models and inference algorithms. Fitness landscape modeling either uses the enrichment of variant abundances as input, which requires observing the same variants at different rounds, or assumes that the last sequenced round is sampled from an equilibrium distribution. AMaLa aims at effectively leveraging the information encoded in the whole time evolution. To do so, while assuming statistical sampling independence between sequenced rounds, the possible trajectories in sequence space are gauged with a time-dependent statistical weight consisting of two contributions: (i) an energy term accounting for the selection process and (ii) a generalized Jukes-Cantor model for the purely mutational step. This simple scheme accurately describes the Directed Evolution dynamics and infers a fitness landscape that correctly reproduces the measures of the phenotype under selection (e.g., antibiotic drug resistance), notably outperforming widely used inference strategies. In addition, we assess the reliability of AMaLa by showing how the inferred statistical model could be used to predict relevant structural properties of the wild-type sequence.
Affiliation(s)
- Luca Sesta
- Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129 Torino, Italy
| | - Guido Uguzzoni
- Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129 Torino, Italy
| | - Jorge Fernandez-de-Cossio-Diaz
- Laboratory of Physics of the Ecole Normale Supérieure, CNRS UMR 8023 & PSL Research, Sorbonne Université, 24 rue Lhomond, 75005 Paris, France
- Center of Molecular Immunology, Systems Biology Department, Playa, Havana CP 11600, Cuba
| | - Andrea Pagnani
- Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129 Torino, Italy
- Italian Institute for Genomic Medicine, IRCCS Candiolo, SP-142, I-10060 Candiolo, Italy
- INFN, Sezione di Torino, I-10125 Torino, Italy
| |
|
46
|
Findlay GM. Linking genome variants to disease: scalable approaches to test the functional impact of human mutations. Hum Mol Genet 2021; 30:R187-R197. [PMID: 34338757 PMCID: PMC8490018 DOI: 10.1093/hmg/ddab219] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 07/19/2021] [Revised: 07/19/2021] [Accepted: 07/19/2021] [Indexed: 11/13/2022] Open
Abstract
The application of genomics to medicine has accelerated the discovery of mutations underlying disease and has enhanced our knowledge of the molecular underpinnings of diverse pathologies. As the amount of human genetic material queried via sequencing has grown exponentially in recent years, so too has the number of rare variants observed. Despite progress, our ability to distinguish which rare variants have clinical significance remains limited. Over the last decade, however, powerful experimental approaches have emerged to characterize variant effects orders of magnitude faster than before. Fueled by improved DNA synthesis and sequencing and, more recently, by CRISPR/Cas9 genome editing, multiplex functional assays provide a means of generating variant effect data in wide-ranging experimental systems. Here, I review recent applications of multiplex assays that link human variants to disease phenotypes and I describe emerging strategies that will enhance their clinical utility in coming years.
Affiliation(s)
- Gregory M Findlay
- The Francis Crick Institute, The Genome Function Laboratory, London NW1 1AT, UK
| |
|
47
|
Geck RC, Boyle G, Amorosi CJ, Fowler DM, Dunham MJ. Measuring Pharmacogene Variant Function at Scale Using Multiplexed Assays. Annu Rev Pharmacol Toxicol 2021; 62:531-550. [PMID: 34516287 DOI: 10.1146/annurev-pharmtox-032221-085807] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
Abstract
As costs of next-generation sequencing decrease, identification of genetic variants has far outpaced our ability to understand their functional consequences. This lack of understanding is a central challenge to a key promise of pharmacogenomics: using genetic information to guide drug selection and dosing. Recently developed multiplexed assays of variant effect enable experimental measurement of the function of thousands of variants simultaneously. Here, we describe multiplexed assays that have been performed on nearly 25,000 variants in eight key pharmacogenes (ADRB2, CYP2C9, CYP2C19, NUDT15, SLCO1B1, TPMT, VKORC1, and the LDLR promoter), discuss advances in experimental design, and explore key challenges that must be overcome to maximize the utility of multiplexed functional data.
Affiliation(s)
- Renee C Geck
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Gabriel Boyle
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Clara J Amorosi
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- Department of Bioengineering, University of Washington, Seattle, Washington 98195, USA
| | - Maitreya J Dunham
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| |
|
48
|
Sarfati H, Naftaly S, Papo N, Keasar C. Predicting mutant outcome by combining deep mutational scanning and machine learning. Proteins 2021; 90:45-57. [PMID: 34293212 DOI: 10.1002/prot.26184] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/22/2020] [Revised: 06/01/2021] [Accepted: 07/11/2021] [Indexed: 02/02/2023]
Abstract
Deep mutational scanning provides an unprecedented wealth of quantitative data regarding the functional outcome of mutations in proteins. A single experiment may measure properties (e.g., structural stability) of numerous protein variants. Leveraging the experimental data to gain insights about unexplored regions of the mutational landscape is a major computational challenge. Such insights may facilitate further experimental work and accelerate the development of novel protein variants with beneficial therapeutic or industrially relevant properties. Here we present a novel machine learning approach for the prediction of functional mutation outcome in the context of deep mutational screens. Using sequence (one-hot) features of variants with known properties, as well as structural features derived from models thereof, we train predictive statistical models to estimate the unknown properties of other variants. The utility of the new computational scheme is demonstrated using five sets of mutational scanning data, denoted "targets": (a) protease specificity of APPI (amyloid precursor protein inhibitor) variants; (b-d) three stability-related properties of IGBPG (immunoglobulin G-binding β1 domain of streptococcal protein G) variants; and (e) fluorescence of GFP (green fluorescent protein) variants. Performance is measured by the overall correlation of the predicted and observed properties, and by enrichment, that is, the ability to predict the most potent variants and presumably guide further experiments. Despite the diversity of the targets, the statistical models can generalize from the variant examples and predict the properties of test variants with both single and multiple mutations.
Affiliation(s)
- Hagit Sarfati
- Department of Computer Science, Ben-Gurion University of the Negev, Be'er Sheva, Israel
| | - Si Naftaly
- Avram and Stella Goldstein-Goren Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Be'er Sheva, Israel
| | - Niv Papo
- Avram and Stella Goldstein-Goren Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Be'er Sheva, Israel
| | - Chen Keasar
- Department of Computer Science, Ben-Gurion University of the Negev, Be'er Sheva, Israel
| |
|
49
|
Manrubia S, Cuesta JA, Aguirre J, Ahnert SE, Altenberg L, Cano AV, Catalán P, Diaz-Uriarte R, Elena SF, García-Martín JA, Hogeweg P, Khatri BS, Krug J, Louis AA, Martin NS, Payne JL, Tarnowski MJ, Weiß M. From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics. Phys Life Rev 2021; 38:55-106. [PMID: 34088608 DOI: 10.1016/j.plrev.2021.03.004] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 11/24/2020] [Accepted: 03/01/2021] [Indexed: 12/21/2022]
Abstract
Understanding how genotypes map onto phenotypes, fitness, and eventually organisms is arguably the next major missing piece in a fully predictive theory of evolution. We refer to this generally as the problem of the genotype-phenotype map. Though we are still far from achieving a complete picture of these relationships, our current understanding of simpler questions, such as the structure induced in the space of genotypes by sequences mapped to molecular structures, has revealed important facts that deeply affect the dynamical description of evolutionary processes. Empirical evidence supporting the fundamental relevance of features such as phenotypic bias is mounting as well, while the synthesis of conceptual and experimental progress leads to questioning current assumptions on the nature of evolutionary dynamics, with cancer progression models and synthetic biology approaches being notable examples. This work delves with a critical and constructive attitude into our current knowledge of how genotypes map onto molecular phenotypes and organismal functions, and discusses theoretical and empirical avenues to broaden and improve this comprehension. As a final goal, this community should aim at deriving an updated picture of evolutionary processes soundly relying on the structural properties of genotype spaces, as revealed by modern techniques of molecular and functional analysis.
Affiliation(s)
- Susanna Manrubia
- Department of Systems Biology, Centro Nacional de Biotecnología (CSIC), Madrid, Spain; Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain.
| | - José A Cuesta
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain; Instituto de Biocomputación y Física de Sistemas Complejos (BiFi), Universidad de Zaragoza, Spain; UC3M-Santander Big Data Institute (IBiDat), Getafe, Madrid, Spain
| | - Jacobo Aguirre
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Centro de Astrobiología, CSIC-INTA, ctra. de Ajalvir km 4, 28850 Torrejón de Ardoz, Madrid, Spain
| | - Sebastian E Ahnert
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK
| | | | - Alejandro V Cano
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Pablo Catalán
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain
| | - Ramon Diaz-Uriarte
- Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain; Instituto de Investigaciones Biomédicas "Alberto Sols" (UAM-CSIC), Madrid, Spain
| | - Santiago F Elena
- Instituto de Biología Integrativa de Sistemas, I(2)SysBio (CSIC-UV), València, Spain; The Santa Fe Institute, Santa Fe, NM, USA
| | | | - Paulien Hogeweg
- Theoretical Biology and Bioinformatics Group, Utrecht University, the Netherlands
| | - Bhavin S Khatri
- The Francis Crick Institute, London, UK; Department of Life Sciences, Imperial College London, London, UK
| | - Joachim Krug
- Institute for Biological Physics, University of Cologne, Köln, Germany
| | - Ard A Louis
- Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, UK
| | - Nora S Martin
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
| | - Joshua L Payne
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | | | - Marcel Weiß
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
| |
|
50
|
Froehlich JJ, Uyar B, Herzog M, Theil K, Glažar P, Akalin A, Rajewsky N. Parallel genetics of regulatory sequences using scalable genome editing in vivo. Cell Rep 2021; 35:108988. [PMID: 33852857 DOI: 10.1016/j.celrep.2021.108988] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/24/2020] [Revised: 01/13/2021] [Accepted: 03/23/2021] [Indexed: 12/27/2022] Open
Abstract
How regulatory sequences control gene expression is fundamental for explaining phenotypes in health and disease. Regulatory elements must ultimately be understood within their genomic environment and development- or tissue-specific contexts. Because this is technically challenging, few regulatory elements have been characterized in vivo. Here, we use inducible Cas9 and multiplexed guide RNAs to create hundreds of mutations in enhancers/promoters and 3' UTRs of 16 genes in C. elegans. Our software crispr-DART analyzes indel mutations in targeted DNA sequencing. We quantify the impact of mutations on expression and fitness by targeted RNA sequencing and DNA sampling. When applying our approach to the lin-41 3' UTR, generating hundreds of mutants, we find that the two adjacent binding sites for the miRNA let-7 can regulate lin-41 expression independently of each other. Finally, we map regulatory genotypes to phenotypic traits for several genes. Our approach enables parallel analysis of regulatory sequences directly in animals.
Affiliation(s)
- Jonathan J Froehlich
- Systems Biology of Gene Regulatory Elements, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
| | - Bora Uyar
- Bioinformatics and Omics Data Science Platform, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
| | - Margareta Herzog
- Systems Biology of Gene Regulatory Elements, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
| | - Kathrin Theil
- Systems Biology of Gene Regulatory Elements, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
| | - Petar Glažar
- Systems Biology of Gene Regulatory Elements, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
| | - Altuna Akalin
- Bioinformatics and Omics Data Science Platform, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany
| | - Nikolaus Rajewsky
- Systems Biology of Gene Regulatory Elements, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Hannoversche Str. 28, 10115 Berlin, Germany.
| |
|