1. Barrios-Núñez I, Martínez-Redondo G, Medina-Burgos P, Cases I, Fernández R, Rojas A. Decoding functional proteome information in model organisms using protein language models. NAR Genom Bioinform 2024; 6:lqae078. [PMID: 38962255 PMCID: PMC11217674 DOI: 10.1093/nargab/lqae078] [Received: 02/18/2024] [Revised: 05/31/2024] [Accepted: 06/26/2024] [Indexed: 07/05/2024]
Abstract
Protein language models have been tested and proved to be reliable when used on curated datasets but have not yet been applied to full proteomes. Accordingly, we tested how two different machine learning-based methods performed when decoding functional information from the proteomes of selected model organisms. We found that protein language models are more precise and informative than deep learning methods for all the species tested and across the three gene ontologies studied, and that they better recover functional information from transcriptomic experiments. The results obtained indicate that these language models are likely to be suitable for large-scale annotation and downstream analyses, and we recommend a guide for their use.
Affiliation(s)
- Israel Barrios-Núñez
- Computational Biology and Bioinformatics Group, Andalusian Center for Developmental Biology (CABD-CSIC), 41013 Sevilla, Spain
- Patricia Medina-Burgos
- Computational Biology and Bioinformatics Group, Andalusian Center for Developmental Biology (CABD-CSIC), 41013 Sevilla, Spain
- Ildefonso Cases
- Bioinformatics Unit, Andalusian Center for Developmental Biology (CABD-CSIC), 41013 Sevilla, Spain
- Rosa Fernández
- Metazoa Phylogenomics Lab, Institute of Evolutionary Biology (CSIC-UPF), 08003 Barcelona, Spain
- Ana M Rojas
- Computational Biology and Bioinformatics Group, Andalusian Center for Developmental Biology (CABD-CSIC), 41013 Sevilla, Spain
2. Kwon JJ, Pan J, Gonzalez G, Hahn WC, Zitnik M. On knowing a gene: A distributional hypothesis of gene function. Cell Syst 2024; 15:488-496. [PMID: 38810640 PMCID: PMC11189734 DOI: 10.1016/j.cels.2024.04.008] [Received: 04/19/2023] [Revised: 02/25/2024] [Accepted: 04/30/2024] [Indexed: 05/31/2024]
Abstract
As words can have multiple meanings that depend on sentence context, genes can have various functions that depend on the surrounding biological system. This pleiotropic nature of gene function is limited by ontologies, which annotate gene functions without considering biological contexts. We contend that the gene function problem in genetics may be informed by recent technological leaps in natural language processing, in which representations of word semantics can be automatically learned from diverse language contexts. In contrast to efforts to model semantics as "is-a" relationships in the 1990s, modern distributional semantics represents words as vectors in a learned semantic space and fuels current advances in transformer-based models such as large language models and generative pre-trained transformers. A similar shift in thinking of gene functions as distributions over cellular contexts may enable a similar breakthrough in data-driven learning from large biological datasets to inform gene function.
Affiliation(s)
- Jason J Kwon
- Dana-Farber Cancer Institute and Harvard Medical School, Department of Medical Oncology, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Joshua Pan
- Dana-Farber Cancer Institute and Harvard Medical School, Department of Medical Oncology, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Guadalupe Gonzalez
- Department of Computing, Faculty of Engineering, Imperial College, London SW7 2AZ, UK
- William C Hahn
- Dana-Farber Cancer Institute and Harvard Medical School, Department of Medical Oncology, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Marinka Zitnik
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Harvard Medical School, Department of Biomedical Informatics, Boston, MA 02115, USA; Harvard Data Science Initiative, Harvard University, Cambridge, MA 02138, USA; Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Allston, MA 02134, USA.
3. Lucaci AG, Pond SLK. AOC: Analysis of Orthologous Collections - an application for the characterization of natural selection in protein-coding sequences. arXiv 2024:arXiv:2406.09522v1. [PMID: 38947939 PMCID: PMC11213150] [Indexed: 07/02/2024]
Abstract
Motivation: Modern molecular sequence analysis increasingly relies on automated and robust software tools for interpretation, annotation, and biological insight. The Analysis of Orthologous Collections (AOC) application automates the identification of genomic sites and species/lineages influenced by natural selection in coding sequence analysis. AOC quantifies different types of selection: negative, diversifying or directional positive, or differential selection between groups of branches. We include all steps necessary to go from unaligned homologous sequences to complete results and interactive visualizations that are designed to aid in the useful interpretation and contextualization. Results: We are motivated by a desire to make evolutionary analyses as simple as possible, and to close the disparity in the literature between genes which draw a significant amount of interest and those that are largely overlooked and underexplored. We believe that such underappreciated and understudied genetic datasets can hold rich biological information and offer substantial insights into the diverse patterns and processes of evolution, especially if domain experts are able to perform the analyses themselves. Availability and implementation: A Snakemake [Mölder et al., 2021] application implementation is publicly available on GitHub at https://github.com/aglucaci/AnalysisOfOrthologousCollections and is accompanied by software documentation and a tutorial.
Affiliation(s)
- Alexander G Lucaci
- Department of Physiology and Biophysics, Weill Cornell Medicine, Cornell University, New York, NY 10021, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
4. Oba GM, Nakato R. Clover: An unbiased method for prioritizing differentially expressed genes using a data-driven approach. Genes Cells 2024; 29:456-470. [PMID: 38602264 PMCID: PMC11163938 DOI: 10.1111/gtc.13119] [Received: 01/05/2024] [Revised: 03/12/2024] [Accepted: 03/20/2024] [Indexed: 04/12/2024]
Abstract
Identifying key genes from a list of differentially expressed genes (DEGs) is a critical step in transcriptome analysis. However, current methods, including Gene Ontology analysis and manual annotation, essentially rely on existing knowledge, which is highly biased depending on the extent of the literature. As a result, understudied genes, some of which may be associated with important molecular mechanisms, are often ignored or remain obscure. To address this problem, we propose Clover, a data-driven scoring method to specifically highlight understudied genes. Clover aims to prioritize genes associated with important molecular mechanisms by integrating three metrics: the likelihood of appearing in the DEG list, tissue specificity, and number of publications. We applied Clover to Alzheimer's disease data and confirmed that it successfully detected known associated genes. Moreover, Clover effectively prioritized understudied but potentially druggable genes. Overall, our method offers a novel approach to gene characterization and has the potential to expand our understanding of gene functions. Clover is open-source software written in Python 3 and available on GitHub at https://github.com/G708/Clover.
Affiliation(s)
- Gina Miku Oba
- Laboratory of Computational Genomics, Institute for Quantitative Biosciences, University of Tokyo, Tokyo, Japan
- Department of Computational Biology and Medical Science, Graduate School of Frontier Science, University of Tokyo, Tokyo, Japan
- Ryuichiro Nakato
- Laboratory of Computational Genomics, Institute for Quantitative Biosciences, University of Tokyo, Tokyo, Japan
- Department of Computational Biology and Medical Science, Graduate School of Frontier Science, University of Tokyo, Tokyo, Japan
5. Ullman MT, Clark GM, Pullman MY, Lovelett JT, Pierpont EI, Jiang X, Turkeltaub PE. The neuroanatomy of developmental language disorder: a systematic review and meta-analysis. Nat Hum Behav 2024; 8:962-975. [PMID: 38491094 DOI: 10.1038/s41562-024-01843-6] [Received: 11/29/2022] [Accepted: 02/01/2024] [Indexed: 03/18/2024]
Abstract
Developmental language disorder (DLD) is a common neurodevelopmental disorder with adverse impacts that continue into adulthood. However, its neural bases remain unclear. Here we address this gap by systematically identifying and quantitatively synthesizing neuroanatomical studies of DLD using co-localization likelihood estimation, a recently developed neuroanatomical meta-analytic technique. Analyses of structural brain data (22 peer-reviewed papers, 577 participants) revealed highly consistent anomalies only in the basal ganglia (100% of participant groups in which this structure was examined, weighted by group sample sizes; 99.8% permutation-based likelihood the anomaly clustering was not due to chance). These anomalies were localized specifically to the anterior neostriatum (again 100% weighted proportion and 99.8% likelihood). As expected given the task dependence of activation, functional neuroimaging data (11 peer-reviewed papers, 414 participants) yielded less consistency, though anomalies again occurred primarily in the basal ganglia (79.0% and 95.1%). Multiple sensitivity analyses indicated that the patterns were robust. The meta-analyses elucidate the neuroanatomical signature of DLD, and implicate the basal ganglia in particular. The findings support the procedural circuit deficit hypothesis of DLD, have basic research and translational implications for the disorder, and advance our understanding of the neuroanatomy of language.
Affiliation(s)
- Michael T Ullman
- Brain and Language Laboratory, Department of Neuroscience, Georgetown University, Washington DC, USA.
- Gillian M Clark
- Cognitive Neuroscience Unit, School of Psychology, Deakin University, Geelong, Victoria, Australia
- Mariel Y Pullman
- Brain and Language Laboratory, Department of Neuroscience, Georgetown University, Washington DC, USA
- Mount Sinai Beth Israel, New York, NY, USA
- Jarrett T Lovelett
- Brain and Language Laboratory, Department of Neuroscience, Georgetown University, Washington DC, USA
- Department of Psychology, University of California, San Diego, La Jolla, CA, USA
- Elizabeth I Pierpont
- Department of Pediatrics, University of Minnesota Medical Center, Minneapolis, MN, USA
- Xiong Jiang
- Department of Neuroscience, Georgetown University, Washington DC, USA
- Peter E Turkeltaub
- Center for Brain Plasticity and Recovery, Georgetown University, Washington DC, USA
- Research Division, MedStar National Rehabilitation Network, Washington DC, USA
6. Richardson R, Tejedor Navarro H, Amaral LAN, Stoeger T. Meta-Research: Understudied genes are lost in a leaky pipeline between genome-wide assays and reporting of results. eLife 2024; 12:RP93429. [PMID: 38546716 PMCID: PMC10977968 DOI: 10.7554/elife.93429] [Indexed: 04/01/2024]
Abstract
Present-day publications on human genes primarily feature genes that already appeared in many publications prior to completion of the Human Genome Project in 2003. These patterns persist despite the subsequent adoption of high-throughput technologies, which routinely identify novel genes associated with biological processes and disease. Although several hypotheses for bias in the selection of genes as research targets have been proposed, their explanatory powers have not yet been compared. Our analysis suggests that understudied genes are systematically abandoned in favor of better-studied genes between the completion of -omics experiments and the reporting of results. Understudied genes remain abandoned by studies that cite these -omics experiments. Conversely, we find that publications on understudied genes may even accrue a greater number of citations. Among 45 biological and experimental factors previously proposed to affect which genes are being studied, we find that 33 are significantly associated with the choice of hit genes presented in titles and abstracts of -omics studies. To promote the investigation of understudied genes, we condense our insights into a tool, find my understudied genes (FMUG), that allows scientists to engage with potential bias during the selection of hits. We demonstrate the utility of FMUG through the identification of genes that remain understudied in vertebrate aging. FMUG is developed in Flutter and is available for download at fmug.amaral.northwestern.edu as a MacOS/Windows app.
Affiliation(s)
- Reese Richardson
- Interdisciplinary Biological Sciences, Northwestern University, Evanston, United States
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
- Heliodoro Tejedor Navarro
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, United States
- Luis A Nunes Amaral
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, United States
- Department of Molecular Biosciences, Northwestern University, Evanston, United States
- Department of Physics and Astronomy, Northwestern University, Evanston, United States
- Thomas Stoeger
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
- The Potocsnak Longevity Institute, Northwestern University, Chicago, United States
- Simpson Querrey Lung Institute for Translational Science, Northwestern University, Chicago, United States
7. Richardson RAK, Tejedor Navarro H, Amaral LAN, Stoeger T. Meta-Research: understudied genes are lost in a leaky pipeline between genome-wide assays and reporting of results. bioRxiv 2024:2023.02.28.530483. [PMID: 36909550 PMCID: PMC10002660 DOI: 10.1101/2023.02.28.530483] [Indexed: 03/06/2023]
Abstract
Present-day publications on human genes primarily feature genes that already appeared in many publications prior to completion of the Human Genome Project in 2003. These patterns persist despite the subsequent adoption of high-throughput technologies, which routinely identify novel genes associated with biological processes and disease. Although several hypotheses for bias in the selection of genes as research targets have been proposed, their explanatory powers have not yet been compared. Our analysis suggests that understudied genes are systematically abandoned in favor of better-studied genes between the completion of -omics experiments and the reporting of results. Understudied genes remain abandoned by studies that cite these -omics experiments. Conversely, we find that publications on understudied genes may even accrue a greater number of citations. Among 45 biological and experimental factors previously proposed to affect which genes are being studied, we find that 33 are significantly associated with the choice of hit genes presented in titles and abstracts of -omics studies. To promote the investigation of understudied genes, we condense our insights into a tool, find my understudied genes (FMUG), that allows scientists to engage with potential bias during the selection of hits. We demonstrate the utility of FMUG through the identification of genes that remain understudied in vertebrate aging. FMUG is developed in Flutter and is available for download at fmug.amaral.northwestern.edu as a MacOS/Windows app.
Affiliation(s)
- Reese AK Richardson
- Interdisciplinary Biological Sciences, Northwestern University
- Department of Chemical and Biological Engineering, Northwestern University
- Heliodoro Tejedor Navarro
- Department of Chemical and Biological Engineering, Northwestern University
- Northwestern Institute on Complex Systems, Northwestern University
- Luis A Nunes Amaral
- Department of Chemical and Biological Engineering, Northwestern University
- Northwestern Institute on Complex Systems, Northwestern University
- Department of Physics and Astronomy, Northwestern University
- Department of Molecular Biosciences, Northwestern University
- Thomas Stoeger
- Department of Chemical and Biological Engineering, Northwestern University
- The Potocsnak Longevity Institute, Northwestern University
- Simpson Querrey Lung Institute for Translational Science, Northwestern University
8. Koutrouli M, Nastou K, Piera Líndez P, Bouwmeester R, Rasmussen S, Martens L, Jensen LJ. FAVA: high-quality functional association networks inferred from scRNA-seq and proteomics data. Bioinformatics 2024; 40:btae010. [PMID: 38192003 PMCID: PMC10868155 DOI: 10.1093/bioinformatics/btae010] [Received: 09/16/2023] [Revised: 12/07/2023] [Accepted: 01/05/2024] [Indexed: 01/10/2024]
Abstract
Motivation: Protein networks are commonly used for understanding how proteins interact. However, they are typically biased by data availability, favoring well-studied proteins with more interactions. To uncover functions of understudied proteins, we must use data that are not affected by this literature bias, such as single-cell RNA-seq and proteomics. Due to data sparseness and redundancy, functional association analysis becomes complex. Results: To address this, we have developed FAVA (Functional Associations using Variational Autoencoders), which compresses high-dimensional data into a low-dimensional space. FAVA infers networks from high-dimensional omics data with much higher accuracy than existing methods, across a diverse collection of real as well as simulated datasets. FAVA can process large datasets with over 0.5 million conditions and has predicted 4210 interactions between 1039 understudied proteins. Our findings showcase FAVA's capability to offer novel perspectives on protein interactions. FAVA functions within the scverse ecosystem, employing AnnData as its input source. Availability and implementation: Source code, documentation, and tutorials for FAVA are accessible on GitHub at https://github.com/mikelkou/fava. FAVA can also be installed and used via pip/PyPI as well as via the scverse ecosystem https://github.com/scverse/ecosystem-packages/tree/main/packages/favapy.
Affiliation(s)
- Mikaela Koutrouli
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
- Katerina Nastou
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
- Pau Piera Líndez
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
- Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9052 Ghent, Belgium
- Simon Rasmussen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
- Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9052 Ghent, Belgium
- Lars Juhl Jensen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
9. Schubach M, Maass T, Nazaretyan L, Röner S, Kircher M. CADD v1.7: using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic Acids Res 2024; 52:D1143-D1154. [PMID: 38183205 PMCID: PMC10767851 DOI: 10.1093/nar/gkad989] [Received: 09/14/2023] [Revised: 10/14/2023] [Accepted: 10/17/2023] [Indexed: 01/07/2024]
Abstract
Machine Learning-based scoring and classification of genetic variants aids the assessment of clinical findings and is employed to prioritize variants in diverse genetic studies and analyses. Combined Annotation-Dependent Depletion (CADD) is one of the first methods for the genome-wide prioritization of variants across different molecular functions and has been continuously developed and improved since its original publication. Here, we present our most recent release, CADD v1.7. We explored and integrated new annotation features, among them state-of-the-art protein language model scores (Meta ESM-1v), regulatory variant effect predictions (from sequence-based convolutional neural networks) and sequence conservation scores (Zoonomia). We evaluated the new version on data sets derived from ClinVar, ExAC/gnomAD and 1000 Genomes variants. For coding effects, we tested CADD on 31 Deep Mutational Scanning (DMS) data sets from ProteinGym and, for regulatory effect prediction, we used saturation mutagenesis reporter assay data of promoter and enhancer sequences. The inclusion of new features further improved the overall performance of CADD. As with previous releases, all data sets, genome-wide CADD v1.7 scores, scripts for on-site scoring and an easy-to-use webserver are readily provided via https://cadd.bihealth.org/ or https://cadd.gs.washington.edu/ to the community.
Affiliation(s)
- Max Schubach
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Thorben Maass
- Institute of Human Genetics, University Hospital Schleswig-Holstein, University of Lübeck, Lübeck, Germany
- Lusiné Nazaretyan
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Sebastian Röner
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Martin Kircher
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Institute of Human Genetics, University Hospital Schleswig-Holstein, University of Lübeck, Lübeck, Germany
10. Rappsilber J. A dive into the unknome. Trends Genet 2024; 40:15-16. [PMID: 37968205 DOI: 10.1016/j.tig.2023.10.011] [Received: 10/23/2023] [Accepted: 10/23/2023] [Indexed: 11/17/2023]
Abstract
We may never understand the function of all genes, findings by Freeman, Munro and colleagues suggest, unless we rethink our approaches. They make a thorough attempt at quantifying the unknownness of protein-coding genes and experimentally prove that many neglected genes hold the seed of important discoveries.
Affiliation(s)
- Juri Rappsilber
- Technische Universität Berlin, Chair of Bioanalytics, 10623 Berlin, Germany; Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, UK; Si-M/'Der Simulierte Mensch', a Science Framework of Technische Universität Berlin and Charité - Universitätsmedizin Berlin, Berlin, Germany.
11. Kurt Z, Cheng J, Barrere-Cain R, McQuillen CN, Saleem Z, Hsu N, Jiang N, Pan C, Franzén O, Koplev S, Wang S, Björkegren J, Lusis AJ, Blencowe M, Yang X. Shared and distinct pathways and networks genetically linked to coronary artery disease between human and mouse. eLife 2023; 12:RP88266. [PMID: 38060277 PMCID: PMC10703441 DOI: 10.7554/elife.88266] [Indexed: 12/08/2023]
Abstract
Mouse models have been used extensively to study human coronary artery disease (CAD) or atherosclerosis and to test therapeutic targets. However, whether mouse and human share similar genetic factors and pathogenic mechanisms of atherosclerosis has not been thoroughly investigated in a data-driven manner. We conducted a cross-species comparison study to better understand atherosclerosis pathogenesis between species by leveraging multiomics data. Specifically, we compared genetically driven and thus CAD-causal gene networks and pathways, by using human GWAS of CAD from the CARDIoGRAMplusC4D consortium and mouse GWAS of atherosclerosis from the Hybrid Mouse Diversity Panel (HMDP) followed by integration with functional multiomics human (STARNET and GTEx) and mouse (HMDP) databases. We found that mouse and human shared >75% of CAD causal pathways. Based on network topology, we then predicted key regulatory genes for both the shared pathways and species-specific pathways, which were further validated through the use of single cell data and the latest CAD GWAS. In sum, our results should serve as much-needed guidance on which human CAD-causal pathways can or cannot be further evaluated for novel CAD therapies using mouse models.
Affiliation(s)
- Zeyneb Kurt
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, United States
- The Information School at the University of Sheffield, Sheffield, United Kingdom
- Jenny Cheng
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, United States
- Interdepartmental Program of Molecular, Cellular and Integrative Physiology, University of California, Los Angeles, Los Angeles, United States
- Rio Barrere-Cain
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, United States
- Caden N McQuillen
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, United States
- Zara Saleem
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, United States
- Neil Hsu
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, United States
- Nuoya Jiang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, United States
- Calvin Pan
- Department of Medicine, Division of Cardiology, University of California, Los Angeles, Los Angeles, United States
- Oscar Franzén
- Department of Genetics & Genomic Sciences, Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, United States
- Simon Koplev
- Department of Genetics & Genomic Sciences, Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, United States
- Susanna Wang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, United States
- Johan Björkegren
- Department of Genetics & Genomic Sciences, Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, United States
- Department of Medicine (Huddinge), Karolinska Institutet, Huddinge, Sweden
- Aldons J Lusis
- Department of Medicine, Division of Cardiology, University of California, Los Angeles, Los Angeles, United States
- Departments of Human Genetics & Microbiology, Immunology, and Molecular Genetics, UCLA, Los Angeles, United States
- Cardiovascular Research Laboratory, David Geffen School of Medicine, UCLA, Los Angeles, United States
- Montgomery Blencowe
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, United States
- Interdepartmental Program of Molecular, Cellular and Integrative Physiology, University of California, Los Angeles, Los Angeles, United States
- Xia Yang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, United States
- Interdepartmental Program of Molecular, Cellular and Integrative Physiology, University of California, Los Angeles, Los Angeles, United States
- Interdepartmental Program of Bioinformatics, University of California, Los Angeles, Los Angeles, United States
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, Los Angeles, United States
12. Gill K, Rajan JRS, Chow E, Ashbrook DG, Williams RW, Zwicker JG, Goldowitz D. Developmental coordination disorder: What can we learn from RI mice using motor learning tasks and QTL analysis. Genes Brain Behav 2023; 22:e12859. [PMID: 37553802 PMCID: PMC10733574 DOI: 10.1111/gbb.12859] [Received: 10/19/2022] [Revised: 07/13/2023] [Accepted: 07/16/2023] [Indexed: 08/10/2023]
Abstract
Developmental Coordination Disorder (DCD) is a neurodevelopmental disorder of unknown etiology that affects one in 20 children. There is an indication that DCD has an underlying genetic component due to its high heritability. Therefore, we explored the use of a recombinant inbred family of mice known as the BXD panel to understand the genetic basis of complex traits (i.e., motor learning) through identification of quantitative trait loci (QTLs). The overall aim of this study was to utilize the QTL approach to evaluate the genome-to-phenome correlation in BXD strains of mice in order to better understand the human presentation of DCD. Results of this current study confirm differences in motor learning in selected BXD strains and strains with altered cerebellar volume. Five strains - BXD15, BXD27, BXD28, BXD75, and BXD86 - exhibited the most DCD-like phenotype when compared with other BXD strains of interest. Results indicate that BXD15 and BXD75 struggled primarily with gross motor skills, BXD28 primarily had difficulties with fine motor skills, and BXD27 and BXD86 strains struggled with both fine and gross motor skills. The functional roles of genes within significant QTLs were assessed in relation to DCD-like behavior. Only Rab3a (Ras-related protein Rab-3A) emerged as a high likelihood candidate gene for the horizontal ladder rung task. This gene is associated with brain and skeletal muscle development, but lacked nonsynonymous polymorphisms. This study and that of Gill et al. (same issue) are the first studies to specifically examine the genetic linkage of DCD using BXD strains of mice.
Affiliation(s)
- Kamaldeep Gill
- Rehabilitation Sciences, University of British Columbia, Vancouver, British Columbia, Canada
- British Columbia Children's Hospital Research Institute, Vancouver, British Columbia, Canada
| | - Jeffy Rajan Soundara Rajan
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
| | - Eric Chow
- British Columbia Children's Hospital Research Institute, Vancouver, British Columbia, Canada
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
| | - David G. Ashbrook
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA
| | - Robert W. Williams
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA
| | - Jill G. Zwicker
- British Columbia Children's Hospital Research Institute, Vancouver, British Columbia, Canada
- Department of Occupational Science & Occupational Therapy, University of British Columbia, Vancouver, British Columbia, Canada
- Department of Pediatrics, University of British Columbia, Vancouver, British Columbia, Canada
| | - Daniel Goldowitz
- British Columbia Children's Hospital Research Institute, Vancouver, British Columbia, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
| |
|
13
|
Allayee H, Farber CR, Seldin MM, Williams EG, James DE, Lusis AJ. Systems genetics approaches for understanding complex traits with relevance for human disease. eLife 2023; 12:e91004. [PMID: 37962168 PMCID: PMC10645424 DOI: 10.7554/elife.91004] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/14/2023] [Accepted: 10/16/2023] [Indexed: 11/15/2023] Open
Abstract
Quantitative traits are often complex because of the contribution of many loci, with further complexity added by environmental factors. In medical research, systems genetics is a powerful approach for the study of complex traits, as it integrates intermediate phenotypes, such as RNA, protein, and metabolite levels, to understand molecular and physiological phenotypes linking discrete DNA sequence variation to complex clinical and physiological traits. The primary purpose of this review is to describe some of the resources and tools of systems genetics in humans and rodent models, so that researchers in many areas of biology and medicine can make use of the data.
Affiliation(s)
- Hooman Allayee
- Departments of Population & Public Health Sciences, University of Southern California, Los Angeles, United States
- Biochemistry & Molecular Medicine, Keck School of Medicine, University of Southern California, Los Angeles, United States
| | - Charles R Farber
- Center for Public Health Genomics, University of Virginia School of Medicine, Charlottesville, United States
- Departments of Biochemistry & Molecular Genetics, University of Virginia School of Medicine, Charlottesville, United States
- Public Health Sciences, University of Virginia School of Medicine, Charlottesville, United States
| | - Marcus M Seldin
- Department of Biological Chemistry, University of California, Irvine, Irvine, United States
| | - Evan Graehl Williams
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg, Luxembourg
| | - David E James
- School of Life and Environmental Sciences, University of Sydney, Camperdown, Australia
- Faculty of Medicine and Health, University of Sydney, Camperdown, Australia
- Charles Perkins Centre, University of Sydney, Camperdown, Australia
| | - Aldons J Lusis
- Departments of Human Genetics, University of California, Los Angeles, Los Angeles, United States
- Medicine, University of California, Los Angeles, Los Angeles, United States
- Microbiology, Immunology, & Molecular Genetics, David Geffen School of Medicine of UCLA, Los Angeles, United States
| |
|
14
|
Hogan CA, Gratz SJ, Dumouchel JL, Thakur RS, Delgado A, Lentini JM, Madhwani KR, Fu D, O'Connor‐Giles KM. Expanded tRNA methyltransferase family member TRMT9B regulates synaptic growth and function. EMBO Rep 2023; 24:e56808. [PMID: 37642556 PMCID: PMC10561368 DOI: 10.15252/embr.202356808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 08/03/2023] [Accepted: 08/14/2023] [Indexed: 08/31/2023] Open
Abstract
Nervous system function rests on the formation of functional synapses between neurons. We have identified TRMT9B as a new regulator of synapse formation and function in Drosophila. TRMT9B has been studied for its role as a tumor suppressor and is one of two metazoan homologs of yeast tRNA methyltransferase 9 (Trm9), which methylates tRNA wobble uridines. Whereas Trm9 homolog ALKBH8 is ubiquitously expressed, TRMT9B is enriched in the nervous system. However, in the absence of animal models, TRMT9B's role in the nervous system has remained unstudied. Here, we generate null alleles of TRMT9B and find it acts postsynaptically to regulate synaptogenesis and promote neurotransmission. Through liquid chromatography-mass spectrometry, we find that ALKBH8 catalyzes canonical tRNA wobble uridine methylation, raising the question of whether TRMT9B is a methyltransferase. Structural modeling studies suggest TRMT9B retains methyltransferase function and, in vivo, disruption of key methyltransferase residues blocks TRMT9B's ability to rescue synaptic overgrowth, but not neurotransmitter release. These findings reveal distinct roles for TRMT9B in the nervous system and highlight the significance of tRNA methyltransferase family diversification in metazoans.
Affiliation(s)
- Caley A Hogan
- Genetics Training Program, University of Wisconsin‐Madison, Madison, WI, USA
| | - Scott J Gratz
- Department of Neuroscience, Brown University, Providence, RI, USA
| | | | - Rajan S Thakur
- Department of Neuroscience, Brown University, Providence, RI, USA
| | - Ambar Delgado
- Department of Neuroscience, Brown University, Providence, RI, USA
| | - Jenna M Lentini
- Department of Biology, Center for RNA Biology, University of Rochester, Rochester, NY, USA
| | | | - Dragony Fu
- Department of Biology, Center for RNA Biology, University of Rochester, Rochester, NY, USA
| | - Kate M O'Connor‐Giles
- Department of Neuroscience, Brown University, Providence, RI, USA
- Carney Institute for Brain Science, Providence, RI, USA
| |
|
15
|
Rodríguez-López M, Bordin N, Lees J, Scholes H, Hassan S, Saintain Q, Kamrad S, Orengo C, Bähler J. Broad functional profiling of fission yeast proteins using phenomics and machine learning. eLife 2023; 12:RP88229. [PMID: 37787768 PMCID: PMC10547477 DOI: 10.7554/elife.88229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2023] Open
Abstract
Many proteins remain poorly characterized even in well-studied organisms, presenting a bottleneck for research. We applied phenomics and machine-learning approaches with Schizosaccharomyces pombe for broad cues on protein functions. We assayed colony-growth phenotypes to measure the fitness of deletion mutants for 3509 non-essential genes in 131 conditions with different nutrients, drugs, and stresses. These analyses exposed phenotypes for 3492 mutants, including 124 mutants of 'priority unstudied' proteins conserved in humans, providing varied functional clues. For example, over 900 proteins were newly implicated in the resistance to oxidative stress. Phenotype-correlation networks suggested roles for poorly characterized proteins through 'guilt by association' with known proteins. For complementary functional insights, we predicted Gene Ontology (GO) terms using machine learning methods exploiting protein-network and protein-homology data (NET-FF). We obtained 56,594 high-scoring GO predictions, of which 22,060 also featured high information content. Our phenotype-correlation data and NET-FF predictions showed a strong concordance with existing PomBase GO annotations and protein networks, with integrated analyses revealing 1675 novel GO predictions for 783 genes, including 47 predictions for 23 priority unstudied proteins. Experimental validation identified new proteins involved in cellular aging, showing that these predictions and phenomics data provide a rich resource to uncover new protein functions.
Affiliation(s)
- María Rodríguez-López
- University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
| | - Nicola Bordin
- University College London, Institute of Structural and Molecular Biology, London, United Kingdom
| | - Jon Lees
- University College London, Institute of Structural and Molecular Biology, London, United Kingdom
- University of Bristol, Bristol, United Kingdom
| | - Harry Scholes
- University College London, Institute of Structural and Molecular Biology, London, United Kingdom
| | - Shaimaa Hassan
- University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
- Helwan University, Faculty of Pharmacy, Cairo, Egypt
| | - Quentin Saintain
- University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
| | - Stephan Kamrad
- University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
| | - Christine Orengo
- University College London, Institute of Structural and Molecular Biology, London, United Kingdom
| | - Jürg Bähler
- University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
| |
|
16
|
Kurt Z, Cheng J, McQuillen CN, Saleem Z, Hsu N, Jiang N, Barrere-Cain R, Pan C, Franzen O, Koplev S, Wang S, Bjorkegren J, Lusis AJ, Blencowe M, Yang X. Shared and distinct pathways and networks genetically linked to coronary artery disease between human and mouse. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.06.08.544148. [PMID: 37333408 PMCID: PMC10274918 DOI: 10.1101/2023.06.08.544148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023]
Abstract
Mouse models have been used extensively to study human coronary artery disease (CAD) or atherosclerosis and to test therapeutic targets. However, whether mouse and human share similar genetic factors and pathogenic mechanisms of atherosclerosis has not been thoroughly investigated in a data-driven manner. We conducted a cross-species comparison study to better understand atherosclerosis pathogenesis between species by leveraging multiomics data. Specifically, we compared genetically driven and thus CAD-causal gene networks and pathways, by using human GWAS of CAD from the CARDIoGRAMplusC4D consortium and mouse GWAS of atherosclerosis from the Hybrid Mouse Diversity Panel (HMDP) followed by integration with functional multiomics human (STARNET and GTEx) and mouse (HMDP) databases. We found that mouse and human shared >75% of CAD causal pathways. Based on network topology, we then predicted key regulatory genes for both the shared pathways and species-specific pathways, which were further validated through the use of single cell data and the latest CAD GWAS. In sum, our results should serve as much-needed guidance on which human CAD-causal pathways can or cannot be further evaluated for novel CAD therapies using mouse models.
Affiliation(s)
- Zeyneb Kurt
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Department of Computer and Information Sciences, University of Northumbria, Ellison Pl, Newcastle upon Tyne NE1 8ST, UK
| | - Jenny Cheng
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular, Cellular and Integrative Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
| | - Caden N. McQuillen
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
| | - Zara Saleem
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
| | - Neil Hsu
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
| | - Nuoya Jiang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
| | - Rio Barrere-Cain
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
| | - Calvin Pan
- Department of Medicine, Division of Cardiology, University of California, Los Angeles, 650 Charles E Young Drive South, Los Angeles, CA 90095-1679, USA
| | - Oscar Franzen
- Department of Genetics & Genomic Sciences, Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029-6574, US
| | - Simon Koplev
- Department of Genetics & Genomic Sciences, Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029-6574, US
| | - Susanna Wang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
| | - Johan Bjorkegren
- Department of Genetics & Genomic Sciences, Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029-6574, US
- Department of Medicine, (Huddinge), Karolinska Institutet, 141 57 Huddinge, Sweden
| | - Aldons J. Lusis
- Department of Medicine, Division of Cardiology, University of California, Los Angeles, 650 Charles E Young Drive South, Los Angeles, CA 90095-1679, USA
- Departments of Human Genetics & Microbiology, Immunology, and Molecular Genetics, UCLA, CA 90095, USA
- Cardiovascular Research Laboratory, David Geffen School of Medicine, UCLA, CA 90095
| | - Montgomery Blencowe
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular, Cellular and Integrative Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
| | - Xia Yang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular, Cellular and Integrative Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Interdepartmental Program of Bioinformatics, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, USA
| |
|
17
|
Anderson B, Rosston P, Ong HW, Hossain MA, Davis-Gilbert ZW, Drewry DH. How many kinases are druggable? A review of our current understanding. Biochem J 2023; 480:1331-1363. [PMID: 37642371 PMCID: PMC10586788 DOI: 10.1042/bcj20220217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 08/11/2023] [Accepted: 08/15/2023] [Indexed: 08/31/2023]
Abstract
There are over 500 human kinases ranging from very well-studied to almost completely ignored. Kinases are tractable and implicated in many diseases, making them ideal targets for medicinal chemistry campaigns, but is it possible to discover a drug for each individual kinase? For every human kinase, we gathered data on its citation count, availability of chemical probes, approved and investigational drugs, PDB structures, and biochemical and cellular assays. Analysis of these factors highlights which kinase groups have a wealth of information available, and which groups still have room for progress. The data suggest a disproportionate focus on the more well characterized kinases while much of the kinome remains comparatively understudied. It is noteworthy that tool compounds for understudied kinases have already been developed, and there is still untapped potential for further development in this chemical space. Finally, this review discusses many of the different strategies employed to generate selectivity between kinases. Given the large volume of information available and the progress made over the past 20 years when it comes to drugging kinases, we believe it is possible to develop a tool compound for every human kinase. We hope this review will both prove to be a useful resource and inspire the discovery of a tool for every kinase.
Affiliation(s)
- Brian Anderson
- Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
| | - Peter Rosston
- Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
| | - Han Wee Ong
- Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
| | - Mohammad Anwar Hossain
- Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
| | - Zachary W. Davis-Gilbert
- Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
| | - David H. Drewry
- Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
- UNC Lineberger Comprehensive Cancer Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
| |
|
18
|
Tantoso E, Eisenhaber B, Sinha S, Jensen LJ, Eisenhaber F. Did the early full genome sequencing of yeast boost gene function discovery? Biol Direct 2023; 18:46. [PMID: 37574542 PMCID: PMC10424406 DOI: 10.1186/s13062-023-00403-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Accepted: 08/01/2023] [Indexed: 08/15/2023] Open
Abstract
BACKGROUND Although the genome of Saccharomyces cerevisiae (S. cerevisiae) was the first of a eukaryotic organism to be fully sequenced (in 1996), a complete understanding of the potential of encoded biomolecular mechanisms has not yet been achieved. Here, we wish to quantify how far away the goal of a full list of S. cerevisiae gene functions still is. RESULTS The scientific literature about S. cerevisiae protein-coding genes has been mapped onto the yeast genome via the mentioning of names for genomic regions in scientific publications. The match was quantified with the ratio of a given gene name's occurrences to those of any gene names in the article. We find that ~ 230 elite genes with ≥ 75 full publication equivalents (FPEs, FPE = 1 is an idealized publication referring to just a single gene) command ~ 45% of all literature. At the same time, about two thirds of the genes (each with less than 10 FPEs) are described in just 12% of the literature (on average each such gene has just ~ 1.5% of the literature of an elite gene). About 600 genes have not been mentioned in any dedicated article. Compared with other groups of genes, the literature growth rates were highest for uncharacterized or understudied genes until the late nineties of the twentieth century. Yet, these growth rates deteriorated and became negative thereafter. Thus, yeast function discovery for previously uncharacterized genes has returned to the level of ~ 1980. At the same time, literature for already well-studied genes (with a threshold T10 (≥ 10 FPEs) and higher) remains steadily growing. CONCLUSIONS Did the early full genome sequencing of yeast boost gene function discovery? The data prove that the moment of publishing the full genome in reality coincides with the onset of decline of gene function discovery for previously uncharacterized genes. If the current status of literature about yeast molecular mechanisms can be extrapolated into the future, it will take about another ~ 50 years to complete the yeast gene function list. We found that a small group of scientific journals contributed extraordinarily to publishing early reports relevant to yeast gene function discoveries.
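The ratio described in the abstract above can be illustrated with a minimal sketch. It assumes that each article contributes a share to a gene equal to that gene's mentions divided by all gene-name mentions in the article, and that a gene's FPE total is the sum of those shares over articles. The function names and the sample gene names are hypothetical, and the summing step is one reading of the abstract rather than the authors' confirmed procedure.

```python
from collections import Counter

def article_shares(gene_mentions):
    """Share of one article attributed to each gene.

    gene_mentions: list of gene names as they occur in the article,
    with repeats. An article naming a single gene yields a share of 1.0
    (the idealized publication, FPE = 1).
    """
    counts = Counter(gene_mentions)
    total = sum(counts.values())
    if total == 0:
        return {}  # article mentions no gene names
    return {gene: n / total for gene, n in counts.items()}

def total_fpe(articles):
    """Sum per-article shares into one FPE total per gene (assumed aggregation)."""
    totals = Counter()
    for mentions in articles:
        for gene, share in article_shares(mentions).items():
            totals[gene] += share
    return dict(totals)

# Hypothetical corpus: two articles with made-up mention lists.
corpus = [
    ["CDC28", "CDC28", "CDC28", "ACT1"],  # CDC28 gets 0.75, ACT1 gets 0.25
    ["ACT1"],                             # ACT1 gets 1.0
]
fpe = total_fpe(corpus)
```

Under this reading, a gene crosses the T10 threshold mentioned in the abstract once its summed shares reach 10, whether through ten dedicated articles or many articles that mention it in passing.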
Affiliation(s)
- Erwin Tantoso
- Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore.
- Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore.
| | - Birgit Eisenhaber
- Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore.
- Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore.
- LASA - Lausitz Advanced Scientific Applications gGmbH, Straße Der Einheit 2-24, 02943, Weißwasser, Federal Republic of Germany.
| | - Swati Sinha
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Lars Juhl Jensen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Frank Eisenhaber
- Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore.
- Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore.
- LASA - Lausitz Advanced Scientific Applications gGmbH, Straße Der Einheit 2-24, 02943, Weißwasser, Federal Republic of Germany.
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Republic of Singapore.
| |
|
19
|
Potter A, Hangas A, Goffart S, Huynen MA, Cabrera-Orefice A, Spelbrink JN. Uncharacterized protein C17orf80 - a novel interactor of human mitochondrial nucleoids. J Cell Sci 2023; 136:jcs260822. [PMID: 37401363 PMCID: PMC10445727 DOI: 10.1242/jcs.260822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Accepted: 06/26/2023] [Indexed: 07/05/2023] Open
Abstract
Molecular functions of many human proteins remain unstudied, despite the demonstrated association with diseases or pivotal molecular structures, such as mitochondrial DNA (mtDNA). This small genome is crucial for the proper functioning of mitochondria, the energy-converting organelles. In mammals, mtDNA is arranged into macromolecular complexes called nucleoids that serve as functional stations for its maintenance and expression. Here, we aimed to explore an uncharacterized protein C17orf80, which was previously detected close to the nucleoid components by proximity labelling mass spectrometry. To investigate the subcellular localization and function of C17orf80, we took advantage of immunofluorescence microscopy, interaction proteomics and several biochemical assays. We demonstrate that C17orf80 is a mitochondrial membrane-associated protein that interacts with nucleoids even when mtDNA replication is inhibited. In addition, we show that C17orf80 is not essential for mtDNA maintenance and mitochondrial gene expression in cultured human cells. These results provide a basis for uncovering the molecular function of C17orf80 and the nature of its association with nucleoids, possibly leading to new insights about mtDNA and its expression.
Affiliation(s)
- Alisa Potter
- Department of Pediatrics, Amalia Children's Hospital, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
- Radboud Center for Mitochondrial Medicine (RCMM), Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
| | - Anu Hangas
- Department of Environmental and Biological Sciences, University of Eastern Finland, Joensuu, 80101, Finland
| | - Steffi Goffart
- Department of Environmental and Biological Sciences, University of Eastern Finland, Joensuu, 80101, Finland
| | - Martijn A. Huynen
- Department of Medical BioSciences, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
| | - Alfredo Cabrera-Orefice
- Radboud Center for Mitochondrial Medicine (RCMM), Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
- Department of Medical BioSciences, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
| | - Johannes N. Spelbrink
- Department of Pediatrics, Amalia Children's Hospital, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
- Radboud Center for Mitochondrial Medicine (RCMM), Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
| |
|
20
|
Rocha JJ, Jayaram SA, Stevens TJ, Muschalik N, Shah RD, Emran S, Robles C, Freeman M, Munro S. Functional unknomics: Systematic screening of conserved genes of unknown function. PLoS Biol 2023; 21:e3002222. [PMID: 37552676 PMCID: PMC10409296 DOI: 10.1371/journal.pbio.3002222] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 01/12/2023] [Accepted: 06/27/2023] [Indexed: 08/10/2023] Open
Abstract
The human genome encodes approximately 20,000 proteins, many still uncharacterised. It has become clear that scientific research tends to focus on well-studied proteins, leading to a concern that poorly understood genes are unjustifiably neglected. To address this, we have developed a publicly available and customisable "Unknome database" that ranks proteins based on how little is known about them. We applied RNA interference (RNAi) in Drosophila to 260 unknown genes that are conserved between flies and humans. Knockdown of some genes resulted in loss of viability, and functional screening of the rest revealed hits for fertility, development, locomotion, protein quality control, and resilience to stress. CRISPR/Cas9 gene disruption validated a component of Notch signalling and 2 genes contributing to male fertility. Our work illustrates the importance of poorly understood genes, provides a resource to accelerate future research, and highlights a need to support database curation to ensure that misannotation does not erode our awareness of our own ignorance.
Affiliation(s)
- João J. Rocha
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| | | | - Tim J. Stevens
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| | | | - Rajen D. Shah
- Centre for Mathematical Sciences, University of Cambridge, Cambridge, United Kingdom
| | - Sahar Emran
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| | - Cristina Robles
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| | - Matthew Freeman
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Sir William Dunn School of Pathology, University of Oxford, Oxford, United Kingdom
| | - Sean Munro
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| |
|
21
|
Kratz A, Kim M, Kelly MR, Zheng F, Koczor CA, Li J, Ono K, Qin Y, Churas C, Chen J, Pillich RT, Park J, Modak M, Collier R, Licon K, Pratt D, Sobol RW, Krogan NJ, Ideker T. A multi-scale map of protein assemblies in the DNA damage response. Cell Syst 2023; 14:447-463.e8. [PMID: 37220749 PMCID: PMC10330685 DOI: 10.1016/j.cels.2023.04.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/15/2021] [Revised: 01/30/2023] [Accepted: 04/25/2023] [Indexed: 05/25/2023]
Abstract
The DNA damage response (DDR) ensures error-free DNA replication and transcription and is disrupted in numerous diseases. An ongoing challenge is to determine the proteins orchestrating DDR and their organization into complexes, including constitutive interactions and those responding to genomic insult. Here, we use multi-conditional network analysis to systematically map DDR assemblies at multiple scales. Affinity purifications of 21 DDR proteins, with/without genotoxin exposure, are combined with multi-omics data to reveal a hierarchical organization of 605 proteins into 109 assemblies. The map captures canonical repair mechanisms and proposes new DDR-associated proteins extending to stress, transport, and chromatin functions. We find that protein assemblies closely align with genetic dependencies in processing specific genotoxins and that proteins in multiple assemblies typically act in multiple genotoxin responses. Follow-up by DDR functional readouts newly implicates 12 assembly members in double-strand-break repair. The DNA damage response assemblies map is available for interactive visualization and query (ccmi.org/ddram/).
Affiliation(s)
- Anton Kratz
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA
| | - Minkyu Kim
- University of California San Francisco, Department of Cellular and Molecular Pharmacology, San Francisco, CA 94158, USA; The J. David Gladstone Institute of Data Science and Biotechnology, San Francisco, CA 94158, USA; Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA; University of Texas Health Science Center San Antonio, Department of Biochemistry and Structural Biology, San Antonio, TX 78229, USA
| | - Marcus R Kelly
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Fan Zheng
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA
| | - Christopher A Koczor
- University of South Alabama, Department of Pharmacology and Mitchell Cancer Institute, Mobile, AL 36604, USA
| | - Jianfeng Li
- University of South Alabama, Department of Pharmacology and Mitchell Cancer Institute, Mobile, AL 36604, USA
| | - Keiichiro Ono
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Yue Qin
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Christopher Churas
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Jing Chen
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Rudolf T Pillich
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Jisoo Park
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA
| | - Maya Modak
- University of California San Francisco, Department of Cellular and Molecular Pharmacology, San Francisco, CA 94158, USA; The J. David Gladstone Institute of Data Science and Biotechnology, San Francisco, CA 94158, USA; Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA
| | - Rachel Collier
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Kate Licon
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Dexter Pratt
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Robert W Sobol
- University of South Alabama, Department of Pharmacology and Mitchell Cancer Institute, Mobile, AL 36604, USA; Brown University, Department of Pathology and Laboratory Medicine and Legorreta Cancer Center, Providence, RI 02903, USA.
| | - Nevan J Krogan
- University of California San Francisco, Department of Cellular and Molecular Pharmacology, San Francisco, CA 94158, USA; The J. David Gladstone Institute of Data Science and Biotechnology, San Francisco, CA 94158, USA; Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA.
| | - Trey Ideker
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA.
|
22
|
Elsamad G, Mecawi AS, Pauža AG, Gillard B, Paterson A, Duque VJ, Šarenac O, Žigon NJ, Greenwood M, Greenwood MP, Murphy D. Ageing restructures the transcriptome of the hypothalamic supraoptic nucleus and alters the response to dehydration. NPJ Aging 2023; 9:12. [PMID: 37264028 PMCID: PMC10234251 DOI: 10.1038/s41514-023-00108-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Accepted: 05/04/2023] [Indexed: 06/03/2023]
Abstract
Ageing is associated with altered neuroendocrine function. In the context of the hypothalamic supraoptic nucleus, which makes the antidiuretic hormone vasopressin, ageing alters acute responses to hyperosmotic cues, rendering the elderly more susceptible to dehydration. Chronically, vasopressin has been associated with numerous diseases of old age, including type 2 diabetes and metabolic syndrome. Bulk RNAseq transcriptome analysis has been used to catalogue the polyadenylated supraoptic nucleus transcriptomes of adult (3 months) and aged (18 months) rats in basal euhydrated and stimulated dehydrated conditions. Gene ontology and Weighted Correlation Network Analysis revealed that ageing is associated with alterations in the expression of extracellular matrix genes. Interestingly, whilst the transcriptomic response to dehydration is overall blunted in aged animals compared to adults, there is a specific enrichment of differentially expressed genes related to neurodegenerative processes in the aged cohort, suggesting that dehydration itself may provoke degenerative consequences in aged rats.
Affiliation(s)
- Ghadir Elsamad
- Molecular Neuroendocrinology Research Group, Bristol Medical School: Translational Health Sciences, Dorothy Hodgkin Building, University of Bristol, Bristol, England
| | - André Souza Mecawi
- Laboratory of Molecular Neuroendocrinology, Department of Biophysics, Paulista School of Medicine, Federal University of São Paulo, São Paulo, Brazil
| | - Audrys G Pauža
- Molecular Neuroendocrinology Research Group, Bristol Medical School: Translational Health Sciences, Dorothy Hodgkin Building, University of Bristol, Bristol, England
- Translational Cardio-Respiratory Research Group, Department of Physiology, Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
| | - Benjamin Gillard
- Molecular Neuroendocrinology Research Group, Bristol Medical School: Translational Health Sciences, Dorothy Hodgkin Building, University of Bristol, Bristol, England
| | - Alex Paterson
- Molecular Neuroendocrinology Research Group, Bristol Medical School: Translational Health Sciences, Dorothy Hodgkin Building, University of Bristol, Bristol, England
- Insilico Consulting Ltd., Wapping Wharf, Bristol, England
| | - Victor J Duque
- Laboratory of Molecular Neuroendocrinology, Department of Biophysics, Paulista School of Medicine, Federal University of São Paulo, São Paulo, Brazil
| | - Olivera Šarenac
- Institute of Pharmacology, Clinical Pharmacology and Toxicology, Faculty of Medicine, University of Belgrade, Belgrade, Serbia
- Department of Safety Pharmacology, Abbvie, North Chicago, Illinois, USA
| | - Nina Japundžić Žigon
- Institute of Pharmacology, Clinical Pharmacology and Toxicology, Faculty of Medicine, University of Belgrade, Belgrade, Serbia
| | - Mingkwan Greenwood
- Molecular Neuroendocrinology Research Group, Bristol Medical School: Translational Health Sciences, Dorothy Hodgkin Building, University of Bristol, Bristol, England
| | - Michael P Greenwood
- Molecular Neuroendocrinology Research Group, Bristol Medical School: Translational Health Sciences, Dorothy Hodgkin Building, University of Bristol, Bristol, England
| | - David Murphy
- Molecular Neuroendocrinology Research Group, Bristol Medical School: Translational Health Sciences, Dorothy Hodgkin Building, University of Bristol, Bristol, England.
|
23
|
Muraleedharan A, Vanderperre B. The endo-lysosomal system in Parkinson's disease: expanding the horizon. J Mol Biol 2023:168140. [PMID: 37148997 DOI: 10.1016/j.jmb.2023.168140] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 01/26/2023] [Revised: 04/22/2023] [Accepted: 04/27/2023] [Indexed: 05/08/2023]
Abstract
Parkinson's disease (PD) is the second most common neurodegenerative disorder after Alzheimer's disease, and its prevalence is increasing with age. A wealth of genetic evidence indicates that the endo-lysosomal system is a major pathway driving PD pathogenesis with a growing number of genes encoding endo-lysosomal proteins identified as risk factors for PD, making it a promising target for therapeutic intervention. However, detailed knowledge and understanding of the molecular mechanisms linking these genes to the disease are available for only a handful of them (e.g. LRRK2, GBA1, VPS35). Taking on the challenge of studying poorly characterized genes and proteins can be daunting, due to the limited availability of tools and knowledge from previous literature. This review aims at providing a valuable source of molecular and cellular insights into the biology of lesser-studied PD-linked endo-lysosomal genes, to help and encourage researchers in filling the knowledge gap around these less popular genetic players. Specific endo-lysosomal pathways discussed range from endocytosis, sorting, and vesicular trafficking to the regulation of membrane lipids of these membrane-bound organelles and the specific enzymatic activities they contain. We also provide perspectives on future challenges that the community needs to tackle and propose approaches to move forward in our understanding of these poorly studied endo-lysosomal genes. This will help harness their potential in designing innovative and efficient treatments to ultimately re-establish neuronal homeostasis in PD but also other diseases involving endo-lysosomal dysfunction.
Affiliation(s)
- Amitha Muraleedharan
- Centre d'Excellence en Recherche sur les Maladies Orphelines - Fondation Courtois and Biological Sciences Department, Université du Québec à Montréal
| | - Benoît Vanderperre
- Centre d'Excellence en Recherche sur les Maladies Orphelines - Fondation Courtois and Biological Sciences Department, Université du Québec à Montréal
|
24
|
Sadegh S, Skelton J, Anastasi E, Maier A, Adamowicz K, Möller A, Kriege NM, Kronberg J, Haller T, Kacprowski T, Wipat A, Baumbach J, Blumenthal DB. Lacking mechanistic disease definitions and corresponding association data hamper progress in network medicine and beyond. Nat Commun 2023; 14:1662. [PMID: 36966134 PMCID: PMC10039912 DOI: 10.1038/s41467-023-37349-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/29/2022] [Accepted: 03/13/2023] [Indexed: 03/27/2023] Open
Abstract
A long-term objective of network medicine is to replace our current, mainly phenotype-based disease definitions with subtypes of health conditions corresponding to distinct pathomechanisms. For this, molecular and health data are modeled as networks and are mined for pathomechanisms. However, many such studies rely on large-scale disease association data where diseases are annotated using the very phenotype-based disease definitions the network medicine field aims to overcome. This raises the question of the extent to which the biases that mechanistically inadequate disease annotations introduce into disease association data distort the results of studies that use such data for pathomechanism mining. We address this question using global- and local-scale analyses of networks constructed from disease association data of various types. Our results indicate that large-scale disease association data should be used with care for pathomechanism mining and that analyses of such data should be accompanied by close-up analyses of molecular data for well-characterized patient cohorts.
Affiliation(s)
- Sepideh Sadegh
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | - James Skelton
- School of Computing, Newcastle University, Newcastle upon Tyne, UK
| | - Elisa Anastasi
- School of Computing, Newcastle University, Newcastle upon Tyne, UK
| | - Andreas Maier
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | - Klaudia Adamowicz
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | - Anna Möller
- Biomedical Network Science Lab, Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Nils M Kriege
- Faculty of Computer Science, University of Vienna, Vienna, Austria
- Research Network Data Science, University of Vienna, Vienna, Austria
| | - Jaanika Kronberg
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Toomas Haller
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Tim Kacprowski
- Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
| | - Anil Wipat
- School of Computing, Newcastle University, Newcastle upon Tyne, UK
| | - Jan Baumbach
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Computational Biomedicine Lab, Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
| | - David B Blumenthal
- Biomedical Network Science Lab, Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.
|
25
|
Zhu M, Tang M, Du Y. Identification of TAC1 Associated with Alzheimer's Disease Using a Robust Rank Aggregation Approach. J Alzheimers Dis 2023; 91:1339-1349. [PMID: 36617784 DOI: 10.3233/jad-220950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2023]
Abstract
BACKGROUND: Alzheimer's disease (AD) places a heavy burden on society and families. There is an urgent need to find effective methods for disease diagnosis and treatment. The robust rank aggregation (RRA) approach, which aggregates ranked gene lists, has been widely used in genomic data analysis. OBJECTIVE: To identify hub genes in AD using the RRA approach. METHODS: Seven frontal cortex microarray datasets from the GEO database were used to identify differentially expressed genes (DEGs) in AD patients using the RRA approach. STRING was used to explore protein-protein interactions (PPI). Gene Ontology enrichment and Kyoto Encyclopedia of Genes and Genomes pathway analyses were utilized for enrichment analysis. Human Gene Connectome and Gene Set Enrichment Analysis were used for functional annotation. Finally, the expression levels of hub genes were validated in the cortex of 5xFAD mice by quantitative real-time polymerase chain reaction. RESULTS: After RRA analysis, 473 DEGs (216 upregulated and 257 downregulated) were identified in AD samples. The PPI network of the DEGs had a total of 416 nodes and 2750 edges. These genes were divided into 17 clusters, each of which contained at least three genes. After functional annotation and enrichment analysis, TAC1 was identified as the hub gene and may be related to synaptic function and inflammation. In addition, Tac1 was found to be downregulated in the cortices of 5xFAD mice. CONCLUSION: In the current study, TAC1 is identified as a key gene in the frontal cortex in AD, providing insight into the possible pathogenesis of this disease and potential therapeutic targets for it.
Affiliation(s)
- Min Zhu
- Department of Neurology, Shandong Provincial Hospital, Shandong University, Jinan, Shandong, People's Republic of China; Department of Neurology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, Shandong, People's Republic of China
| | - Minglu Tang
- Department of Neurology, Shandong Provincial Hospital, Shandong University, Jinan, Shandong, People's Republic of China; Department of Neurology (Cognitive sleep ward), Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, Shandong, People's Republic of China
| | - Yifeng Du
- Department of Neurology, Shandong Provincial Hospital, Shandong University, Jinan, Shandong, People's Republic of China; Department of Neurology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, Shandong, People's Republic of China
|
26
|
Franchini L, Orlandi C. Probing the orphan receptors: Tools and directions. Progress in Molecular Biology and Translational Science 2023; 195:47-76. [PMID: 36707155 DOI: 10.1016/bs.pmbts.2022.06.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023]
Abstract
The endogenous ligands activating a large fraction of the G Protein Coupled Receptor (GPCR) family members have yet to be identified. These receptors are commonly labeled as orphans (oGPCRs), and because of the absence of available pharmacological tools they are currently understudied. Nonetheless, genome wide association studies, together with research using animal models identified many physiological functions regulated by oGPCRs. Similarly, mutations in some oGPCRs have been associated with rare genetic disorders or with an increased risk of developing pathologies. The once underestimated pharmacological potential of targeting oGPCRs is increasingly being exploited by the development of novel tools to understand their biology and by drug discovery endeavors aimed at identifying new modulators of their activity. Here, we summarize recent advancements in the field of oGPCRs and future directions.
Affiliation(s)
- Luca Franchini
- Department of Pharmacology and Physiology, University of Rochester Medical Center, Rochester, NY, United States
| | - Cesare Orlandi
- Department of Pharmacology and Physiology, University of Rochester Medical Center, Rochester, NY, United States.
|
27
|
Holm L, Laiho A, Törönen P, Salgado M. DALI shines a light on remote homologs: One hundred discoveries. Protein Sci 2023; 32:e4519. [PMID: 36419248 PMCID: PMC9793968 DOI: 10.1002/pro.4519] [Citation(s) in RCA: 129] [Impact Index Per Article: 129.0] [Received: 07/18/2022] [Revised: 11/15/2022] [Accepted: 11/20/2022] [Indexed: 11/25/2022]
Abstract
Structural comparison reveals remote homology that often fails to be detected by sequence comparison. The DALI web server (http://ekhidna2.biocenter.helsinki.fi/dali) is a platform for structural analysis that provides database searches and interactive visualization, including structural alignments annotated with secondary structure, protein families and sequence logos, and 3D structure superimposition supported by color-coded sequence and structure conservation. Here, we are using DALI to mine the AlphaFold Database version 1, which increased the structural coverage of protein families by 20%. We found 100 remote homologous relationships hitherto unreported in the current reference database for protein domains, Pfam 35.0. In particular, we linked 35 domains of unknown function (DUFs) to the previously characterized families, generating a functional hypothesis that can be explored downstream in structural biology studies. Other findings include gene fusions, tandem duplications, and adjustments to domain boundaries. The evidence for homology can be browsed interactively through live examples on DALI's website.
Affiliation(s)
- Liisa Holm
- Organismal and Evolutionary Biology Research Program, Faculty of Biological and Environmental Sciences & Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
| | - Aleksi Laiho
- Organismal and Evolutionary Biology Research Program, Faculty of Biological and Environmental Sciences & Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
| | - Petri Törönen
- Organismal and Evolutionary Biology Research Program, Faculty of Biological and Environmental Sciences & Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
| | - Marco Salgado
- Organismal and Evolutionary Biology Research Program, Faculty of Biological and Environmental Sciences & Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
|
28
|
Pauza AG, Murphy D, Paton JFR. Transcriptomics of the Carotid Body. Advances in Experimental Medicine and Biology 2023; 1427:1-11. [PMID: 37322330 DOI: 10.1007/978-3-031-32371-3_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
The carotid body (CB) has emerged as a potential therapeutic target for treating sympathetically mediated cardiovascular, respiratory, and metabolic diseases. In addition to its classical role as an arterial O2 sensor, the CB is a multimodal sensor activated by a range of stimuli in the circulation. However, consensus on how CB multimodality is achieved is lacking; even the best-studied modality, O2 sensing, appears to involve multiple convergent mechanisms. A strategy to understand multimodal sensing is to adopt a hypothesis-free, high-throughput transcriptomic approach. This has proven instrumental for understanding fundamental mechanisms of CB response to hypoxia and other stimulants, its developmental niche, cellular heterogeneity, laterality, and pathophysiological remodeling in disease states. Herein, we review this published work, which reveals novel molecular mechanisms underpinning multimodal sensing and exposes numerous gaps in knowledge that require experimental testing.
Affiliation(s)
- Audrys G Pauza
- Manaaki Manawa - The Centre for Heart Research, Department of Physiology, Faculty of Medical & Health Sciences, University of Auckland, Auckland, New Zealand.
| | - David Murphy
- Molecular Neuroendocrinology Research Group, Bristol Medical School, Translational Health Sciences, University of Bristol, Bristol, UK
| | - Julian F R Paton
- Manaaki Manawa - The Centre for Heart Research, Department of Physiology, Faculty of Medical & Health Sciences, University of Auckland, Auckland, New Zealand
|
29
|
De Paolis Kaluza MC, Jain S, Radivojac P. An Approach to Identifying and Quantifying Bias in Biomedical Data. Pacific Symposium on Biocomputing 2023; 28:311-322. [PMID: 36540987 PMCID: PMC9782737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022]
Abstract
Data biases are a known impediment to the development of trustworthy machine learning models and their application to many biomedical problems. When biased data is suspected, the assumption that the labeled data is representative of the population must be relaxed and methods that exploit typically representative unlabeled data must be developed. To mitigate the adverse effects of unrepresentative data, we consider a binary semi-supervised setting and focus on identifying whether the labeled data is biased and to what extent. We assume that the class-conditional distributions were generated by a family of component distributions represented at different proportions in labeled and unlabeled data. We also assume that the training data can be transformed to and subsequently modeled by a nested mixture of multivariate Gaussian distributions. We then develop a multi-sample expectation-maximization algorithm that learns all individual and shared parameters of the model from the combined data. Using these parameters, we develop a statistical test for the presence of the general form of bias in labeled data and estimate the level of this bias by computing the distance between corresponding class-conditional distributions in labeled and unlabeled data. We first study the new methods on synthetic data to understand their behavior and then apply them to real-world biomedical data to provide evidence that the bias estimation procedure is both possible and effective.
|
30
|
Papadakos KS, Ekström A, Slipek P, Skourti E, Reid S, Pietras K, Blom AM. Sushi domain-containing protein 4 binds to epithelial growth factor receptor and initiates autophagy in an EGFR phosphorylation independent manner. Journal of Experimental & Clinical Cancer Research 2022; 41:363. [PMID: 36578014 PMCID: PMC9798675 DOI: 10.1186/s13046-022-02565-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/02/2022] [Accepted: 12/07/2022] [Indexed: 12/29/2022]
Abstract
BACKGROUND: Sushi domain-containing protein 4 (SUSD4) is a recently discovered protein with unknown cellular functions. We previously revealed that SUSD4 can act as complement inhibitor and as a potential tumor suppressor. METHODS: In a syngeneic mouse model of breast cancer, tumors expressing SUSD4 had a smaller volume compared with the corresponding mock control tumors. Additionally, data from three different expression databases and online analysis tools confirm that for breast cancer patients, high mRNA expression of SUSD4 in the tumor tissue correlates with a better prognosis. In vitro experiments utilized triple-negative breast cancer cell lines (BT-20 and MDA-MB-468) stably expressing SUSD4. Moreover, we established a cell line based on BT-20 in which the gene for EGFR was knocked out with the CRISPR-Cas9 method. RESULTS: We discovered that the Epithelial Growth Factor Receptor (EGFR) interacts with SUSD4. Furthermore, triple-negative breast cancer cell lines stably expressing SUSD4 had higher autophagic flux. The initiation of autophagy required the expression of EGFR but not phosphorylation of the receptor. Expression of SUSD4 in the breast cancer cells led to activation of the tumor suppressor LKB1 and consequently to the activation of AMPKα1. Finally, autophagy was initiated after stimulation of the ULK1, Atg14 and Beclin-1 axis in SUSD4 expressing cells. CONCLUSIONS: In this study we provide novel insight into the molecular mechanism of action whereby SUSD4 acts as an EGFR inhibitor without affecting the phosphorylation of the receptor and may potentially influence the recycling of EGFR to the plasma membrane.
Affiliation(s)
- Konstantinos S. Papadakos
- Division of Medical Protein Chemistry, Department of Translational Medicine, Lund University, Inga Maria Nilsson’s street 53, 214 28 Malmö, Sweden
| | - Alexander Ekström
- Division of Medical Protein Chemistry, Department of Translational Medicine, Lund University, Inga Maria Nilsson’s street 53, 214 28 Malmö, Sweden
| | - Piotr Slipek
- Division of Medical Protein Chemistry, Department of Translational Medicine, Lund University, Inga Maria Nilsson’s street 53, 214 28 Malmö, Sweden
| | - Eleni Skourti
- Division of Medical Protein Chemistry, Department of Translational Medicine, Lund University, Inga Maria Nilsson’s street 53, 214 28 Malmö, Sweden
| | - Steven Reid
- Division of Translational Cancer Research, Department of Laboratory Medicine, Lund University, Lund, Sweden
| | - Kristian Pietras
- Division of Translational Cancer Research, Department of Laboratory Medicine, Lund University, Lund, Sweden
| | - Anna M. Blom
- Division of Medical Protein Chemistry, Department of Translational Medicine, Lund University, Inga Maria Nilsson’s street 53, 214 28 Malmö, Sweden
|
31
|
Delmas M, Filangi O, Duperier C, Paulhe N, Vinson F, Rodriguez-Mier P, Giacomoni F, Jourdan F, Frainay C. Suggesting disease associations for overlooked metabolites using literature from metabolic neighbors. Gigascience 2022; 12:giad065. [PMID: 37712592 PMCID: PMC10502579 DOI: 10.1093/gigascience/giad065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Revised: 06/13/2023] [Accepted: 07/28/2023] [Indexed: 09/16/2023] Open
Abstract
In human health research, metabolic signatures extracted from metabolomics data have a strong added value for stratifying patients and identifying biomarkers. Nevertheless, one of the main challenges is to interpret and relate these lists of discriminant metabolites to pathological mechanisms. This task requires experts to combine their knowledge with information extracted from databases and the scientific literature. However, we show that most compounds (>99%) in the PubChem database lack annotated literature. This dearth of available information can have a direct impact on the interpretation of metabolic signatures, which is often restricted to a subset of significant metabolites. To suggest potential pathological phenotypes related to overlooked metabolites that lack annotated literature, we extend the "guilt-by-association" principle to literature information by using a Bayesian framework. The underlying assumption is that the literature associated with the metabolic neighbors of a compound can provide valuable insights, or an a priori, into its biomedical context. The metabolic neighborhood of a compound can be defined from a metabolic network and corresponds to the metabolites to which it is connected through biochemical reactions. With the proposed approach, we suggest more than 35,000 associations between 1,047 overlooked metabolites and 3,288 diseases (or disease families). All these newly inferred associations are freely available on the FORUM ftp server (see information at https://github.com/eMetaboHUB/Forum-LiteraturePropagation).
Affiliation(s)
- Maxime Delmas
- Toxalim (Research Center in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, 31300 Toulouse, France
| | - Olivier Filangi
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Christophe Duperier
- Université Clermont Auvergne, INRAE, UNH, Plateforme d’Exploration du Métabolisme, MetaboHUB Clermont, F-63000 Clermont-Ferrand, France
| | - Nils Paulhe
- Université Clermont Auvergne, INRAE, UNH, Plateforme d’Exploration du Métabolisme, MetaboHUB Clermont, F-63000 Clermont-Ferrand, France
| | - Florence Vinson
- Toxalim (Research Center in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, 31300 Toulouse, France
- MetaboHUB-Metatoul, National Infrastructure of Metabolomics and Fluxomics, Toulouse, 31300, France
| | - Pablo Rodriguez-Mier
- Toxalim (Research Center in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, 31300 Toulouse, France
| | - Franck Giacomoni
- Université Clermont Auvergne, INRAE, UNH, Plateforme d’Exploration du Métabolisme, MetaboHUB Clermont, F-63000 Clermont-Ferrand, France
| | - Fabien Jourdan
- Toxalim (Research Center in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, 31300 Toulouse, France
- MetaboHUB-Metatoul, National Infrastructure of Metabolomics and Fluxomics, Toulouse, 31300, France
| | - Clément Frainay
- Toxalim (Research Center in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, 31300 Toulouse, France
|
32
|
Stankiewicz AM, Jaszczyk A, Goscik J, Juszczak GR. Stress and the brain transcriptome: Identifying commonalities and clusters in standardized data from published experiments. Prog Neuropsychopharmacol Biol Psychiatry 2022; 119:110558. [PMID: 35405299 DOI: 10.1016/j.pnpbp.2022.110558] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/06/2021] [Revised: 03/17/2022] [Accepted: 04/04/2022] [Indexed: 12/28/2022]
Abstract
Interpretation of transcriptomic experiments is hindered by many problems including false positives/negatives inherent to big-data methods and changes in gene nomenclature. To find the most consistent effect of stress on the brain transcriptome, we retrieved data from 79 studies applying animal models and 3 human studies investigating post-traumatic stress disorder (PTSD). The analyzed data were obtained either with microarrays or RNA sequencing applied to samples collected from more than 1887 laboratory animals and from 121 human subjects. Based on the initial database containing a quarter million differential expression effect sizes representing transcripts in three species, we identified the most frequently reported genes in 223 stress-control comparisons. Additionally, the analysis considers sex, individual vulnerability and the contribution of glucocorticoids. We also found an overlap between gene expression in PTSD patients and animals, which indicates the relevance of laboratory models for the human stress response. Our analysis points to genes that, as far as we know, were not specifically tested for their role in stress response (Pllp, Arrdc2, Midn, Mfsd2a, Ccn1, Htra1, Csrnp1, Tenm4, Tnfrsf25, Sema3b, Fmo2, Adamts4, Gjb1, Errfi1, Fgf18, Galnt6, Slc25a42, Ifi30, Slc4a1, Cemip, Klf10, Tom1, Dcdc2c, Fancd2, Luzp2, Trpm1, Abcc12, Osbpl1a, Ptp4a2). The provided transcriptomic resource will be useful for guiding new research.
Affiliation(s)
- Adrian M Stankiewicz
- Department of Molecular Biology, Institute of Genetics and Animal Biotechnology, Polish Academy of Sciences, Jastrzebiec, Poland
| | - Aneta Jaszczyk
- Department of Animal Behavior and Welfare, Institute of Genetics and Animal Biotechnology, Polish Academy of Sciences, Jastrzebiec, Poland
| | - Joanna Goscik
- Faculty of Computer Science, Bialystok University of Technology, Bialystok, Poland
| | - Grzegorz R Juszczak
- Department of Animal Behavior and Welfare, Institute of Genetics and Animal Biotechnology, Polish Academy of Sciences, Jastrzebiec, Poland.
| |
|
33
|
Byrne JA, Park Y, Richardson RAK, Pathmendra P, Sun M, Stoeger T. Protection of the human gene research literature from contract cheating organizations known as research paper mills. Nucleic Acids Res 2022; 50:12058-12070. [PMID: 36477580 PMCID: PMC9757046 DOI: 10.1093/nar/gkac1139] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/06/2022] [Revised: 11/08/2022] [Accepted: 11/14/2022] [Indexed: 12/12/2022] Open
Abstract
Human gene research generates new biology insights with translational potential, yet few studies have considered the health of the human gene literature. The accessibility of human genes for targeted research, combined with unreasonable publication pressures and recent developments in scholarly publishing, may have created a market for low-quality or fraudulent human gene research articles, including articles produced by contract cheating organizations known as paper mills. This review summarises the evidence that paper mills contribute to the human gene research literature at scale and outlines why targeted gene research may be particularly vulnerable to systematic research fraud. To raise awareness of targeted gene research from paper mills, we highlight features of problematic manuscripts and publications that can be detected by gene researchers and/or journal staff. As improved awareness and detection could drive the further evolution of paper mill-supported publications, we also propose changes to academic publishing to more effectively deter and correct problematic publications at scale. In summary, the threat of paper mill-supported gene research highlights the need for all researchers to approach the literature with a more critical mindset, and demand publications that are underpinned by plausible research justifications, rigorous experiments and fully transparent reporting.
Affiliation(s)
- Jennifer A Byrne
- To whom correspondence should be addressed. Tel: +61 2 4920 4135;
| | - Yasunori Park
- School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, NSW, Australia
| | - Reese A K Richardson
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, USA
| | - Pranujan Pathmendra
- School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, NSW, Australia
| | - Mengyi Sun
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, USA
| | - Thomas Stoeger
- To whom correspondence should be addressed. Tel: +61 2 4920 4135;
| |
|
34
|
Functional genomic tools for emerging model species. Trends Ecol Evol 2022; 37:1104-1115. [PMID: 35914975 DOI: 10.1016/j.tree.2022.07.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 02/14/2022] [Revised: 07/08/2022] [Accepted: 07/11/2022] [Indexed: 01/12/2023]
Abstract
Most studies in the field of ecology and evolution aiming to connect genotype to phenotype rarely validate identified loci using functional tools. Recent developments in RNA interference (RNAi) and clustered regularly interspaced palindromic repeats (CRISPR)-Cas genome editing have dramatically increased the feasibility of functional validation. However, these methods come with specific challenges when applied to emerging model organisms, including limited spatial control of gene silencing, low knock-in efficiencies, and low throughput of functional validation. Moreover, many functional studies to date do not recapitulate ecologically relevant variation, and this limits their scope for deeper insights into evolutionary processes. We therefore argue that increased use of gene editing by allelic replacement through homology-directed repair (HDR) would greatly benefit the field of ecology and evolution.
|
35
|
Yu JSL, Heineike BM, Hartl J, Aulakh SK, Correia-Melo C, Lehmann A, Lemke O, Agostini F, Lee CT, Demichev V, Messner CB, Mülleder M, Ralser M. Inorganic sulfur fixation via a new homocysteine synthase allows yeast cells to cooperatively compensate for methionine auxotrophy. PLoS Biol 2022; 20:e3001912. [PMID: 36455053 PMCID: PMC9757880 DOI: 10.1371/journal.pbio.3001912] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/10/2022] [Revised: 12/16/2022] [Accepted: 11/14/2022] [Indexed: 12/03/2022] Open
Abstract
The assimilation, incorporation, and metabolism of sulfur is a fundamental process across all domains of life, yet how cells deal with varying sulfur availability is not well understood. We studied an unresolved conundrum of sulfur fixation in yeast, in which organosulfur auxotrophy caused by deletion of the homocysteine synthase Met17p is overcome when cells are inoculated at high cell density. In combining the use of self-establishing metabolically cooperating (SeMeCo) communities with proteomic, genetic, and biochemical approaches, we discovered an uncharacterized gene product YLL058Wp, herein named Hydrogen Sulfide Utilizing-1 (HSU1). Hsu1p acts as a homocysteine synthase and allows the cells to substitute for Met17p by reassimilating hydrosulfide ions leaked from met17Δ cells into O-acetyl-homoserine and forming homocysteine. Our results show that cells can cooperate to achieve sulfur fixation, indicating that the collective properties of microbial communities facilitate their basic metabolic capacity to overcome sulfur limitation.
Affiliation(s)
- Jason S. L. Yu
- Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, United Kingdom
| | - Benjamin M. Heineike
- Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, United Kingdom
| | - Johannes Hartl
- Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
| | - Simran K. Aulakh
- Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, United Kingdom
| | - Clara Correia-Melo
- Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, United Kingdom
| | - Andrea Lehmann
- Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
| | - Oliver Lemke
- Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
| | - Federica Agostini
- Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
| | - Cory T. Lee
- Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
| | - Vadim Demichev
- Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
| | - Christoph B. Messner
- Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, United Kingdom
| | - Michael Mülleder
- Core Facility—High Throughput Mass Spectrometry, Charité Universitätsmedizin, Berlin, Germany
| | - Markus Ralser
- Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, United Kingdom
- Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
| |
|
36
|
Stoeger T, Grant RA, McQuattie-Pimentel AC, Anekalla KR, Liu SS, Tejedor-Navarro H, Singer BD, Abdala-Valencia H, Schwake M, Tetreault MP, Perlman H, Balch WE, Chandel NS, Ridge KM, Sznajder JI, Morimoto RI, Misharin AV, Budinger GRS, Nunes Amaral LA. Aging is associated with a systemic length-associated transcriptome imbalance. Nat Aging 2022; 2:1191-1206. [PMID: 37118543 PMCID: PMC10154227 DOI: 10.1038/s43587-022-00317-6] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 03/26/2021] [Accepted: 10/21/2022] [Indexed: 12/14/2022]
Abstract
Aging is among the most important risk factors for morbidity and mortality. To contribute toward a molecular understanding of aging, we analyzed age-resolved transcriptomic data from multiple studies. Here, we show that transcript length alone explains most transcriptional changes observed with aging in mice and humans. We present three lines of evidence supporting the biological importance of the uncovered transcriptome imbalance. First, in vertebrates the length association primarily displays a lower relative abundance of long transcripts in aging. Second, eight antiaging interventions of the Interventions Testing Program of the National Institute on Aging can counter this length association. Third, we find that in humans and mice the genes with the longest transcripts enrich for genes reported to extend lifespan, whereas those with the shortest transcripts enrich for genes reported to shorten lifespan. Our study opens fundamental questions on aging and the organization of transcriptomes.
Affiliation(s)
- Thomas Stoeger
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA.
- Center for Genetic Medicine, Northwestern University, Evanston, IL, USA.
| | - Rogan A Grant
- Department of Molecular Biosciences, Northwestern University, Evanston, IL, USA
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
| | | | - Kishore R Anekalla
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
| | - Sophia S Liu
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
| | | | - Benjamin D Singer
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
- Simpson Querrey Lung Institute for Translational Science at Northwestern University (SQLIFTSNU), Evanston, IL, USA
- Department of Biochemistry and Molecular Genetics, Northwestern University, Evanston, IL, USA
| | - Hiam Abdala-Valencia
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
| | - Michael Schwake
- Department of Neurology, Northwestern University, Evanston, IL, USA
- Faculty of Chemistry, University of Bielefeld, Bielefeld, Germany
| | - Marie-Pier Tetreault
- Division of Gastroenterology and Hepatology, Northwestern University, Evanston, IL, USA
| | - Harris Perlman
- Division of Rheumatology, Northwestern University, Evanston, IL, USA
| | | | - Navdeep S Chandel
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
- Simpson Querrey Lung Institute for Translational Science at Northwestern University (SQLIFTSNU), Evanston, IL, USA
| | - Karen M Ridge
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
- Simpson Querrey Lung Institute for Translational Science at Northwestern University (SQLIFTSNU), Evanston, IL, USA
| | - Jacob I Sznajder
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
- Simpson Querrey Lung Institute for Translational Science at Northwestern University (SQLIFTSNU), Evanston, IL, USA
| | - Richard I Morimoto
- Department of Molecular Biosciences, Northwestern University, Evanston, IL, USA.
- Rice Institute for Biomedical Research, Northwestern University, Evanston, IL, USA.
| | - Alexander V Misharin
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA.
- Simpson Querrey Lung Institute for Translational Science at Northwestern University (SQLIFTSNU), Evanston, IL, USA.
| | - G R Scott Budinger
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA.
- Simpson Querrey Lung Institute for Translational Science at Northwestern University (SQLIFTSNU), Evanston, IL, USA.
| | - Luis A Nunes Amaral
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA.
- Department of Physics and Astronomy, Northwestern University, Evanston, IL, USA.
| |
|
37
|
Kelleher KJ, Sheils TK, Mathias SL, Yang JJ, Metzger V, Siramshetty V, Nguyen DT, Jensen LJ, Vidović D, Schürer S, Holmes J, Sharma K, Pillai A, Bologa C, Edwards J, Mathé E, Oprea T. Pharos 2023: an integrated resource for the understudied human proteome. Nucleic Acids Res 2022; 51:D1405-D1416. [PMID: 36624666 PMCID: PMC9825581 DOI: 10.1093/nar/gkac1033] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 09/15/2022] [Revised: 10/12/2022] [Accepted: 11/28/2022] [Indexed: 11/30/2022] Open
Abstract
The Illuminating the Druggable Genome (IDG) project aims to improve our understanding of understudied proteins and our ability to study them in the context of disease biology by perturbing them with small molecules, biologics, or other therapeutic modalities. Two main products from the IDG effort are the Target Central Resource Database (TCRD) (http://juniper.health.unm.edu/tcrd/), which curates and aggregates information, and Pharos (https://pharos.nih.gov/), a web interface for users to extract and visualize data from TCRD. Since the 2021 release, TCRD/Pharos has focused on developing visualization and analysis tools that help reveal higher-level patterns in the underlying data. The current iterations of TCRD and Pharos enable users to perform enrichment calculations based on subsets of targets, diseases, or ligands and to create interactive heat maps and UpSet charts of many types of annotations. Using several examples, we show how to address disease biology and drug discovery questions through enrichment calculations and UpSet charts.
Affiliation(s)
- Keith J Kelleher
- National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, MD 20850, USA
| | - Timothy K Sheils
- National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, MD 20850, USA
| | - Stephen L Mathias
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, NM 87131, USA
| | - Jeremy J Yang
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, NM 87131, USA
| | - Vincent T Metzger
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, NM 87131, USA
| | - Vishal B Siramshetty
- National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, MD 20850, USA
| | - Dac-Trung Nguyen
- National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, MD 20850, USA
| | - Lars Juhl Jensen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen 2200, Copenhagen, Denmark
| | - Dušica Vidović
- Institute for Data Science and Computing, University of Miami, Coral Gables, FL 33146, USA, Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
| | - Stephan C Schürer
- Institute for Data Science and Computing, University of Miami, Coral Gables, FL 33146, USA, Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA, Sylvester Comprehensive Cancer Center, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
| | - Jayme Holmes
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, NM 87131, USA
| | - Karlie R Sharma
- National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, MD 20850, USA
| | - Ajay Pillai
- National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, MD 20850, USA
| | - Cristian G Bologa
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, NM 87131, USA
| | - Jeremy S Edwards
- Correspondence may also be addressed to Jeremy Edwards. Tel: +1 505 277 6655;
| | - Ewy A Mathé
- To whom correspondence should be addressed. Tel: +1 301 402 8953;
| | - Tudor I Oprea
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, NM 87131, USA
| |
|
38
|
Sales de Queiroz A, Sales Santa Cruz G, Jean-Marie A, Mazauric D, Roux J, Cazals F. Gene prioritization based on random walks with restarts and absorbing states, to define gene sets regulating drug pharmacodynamics from single-cell analyses. PLoS One 2022; 17:e0268956. [PMID: 36342924 PMCID: PMC9639845 DOI: 10.1371/journal.pone.0268956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2021] [Accepted: 05/12/2022] [Indexed: 11/09/2022] Open
Abstract
Prioritizing genes for their role in drug sensitivity is an important step in understanding drug mechanisms of action and discovering new molecular targets for co-treatment. To formalize this problem, we consider two sets of genes X and P respectively composing the gene signature of cell sensitivity at the drug IC50 and the genes involved in its mechanism of action, as well as a protein interaction network (PPIN) containing the products of X and P as nodes. We introduce Genetrank, a method to prioritize the genes in X for their likelihood to regulate the genes in P. Genetrank uses asymmetric random walks with restarts, absorbing states, and a suitable renormalization scheme. Using novel so-called saturation indices, we show that the conjunction of absorbing states and renormalization yields an exploration of the PPIN which is much more progressive than that afforded by random walks with restarts only. Using MINT as the underlying network, we apply Genetrank to a predictive gene signature of cancer cell sensitivity to tumor-necrosis-factor-related apoptosis-inducing ligand (TRAIL), performed in single cells. Our ranking provides biological insights on drug sensitivity and a gene set considerably enriched in genes regulating TRAIL pharmacodynamics when compared to the most significant differentially expressed genes obtained from a statistical analysis framework alone. We also introduce gene expression radars, a visualization tool embedded in MA plots to assess all pairwise interactions at a glance on graphical representations of transcriptomics data. Genetrank is made available in the Structural Bioinformatics Library (https://sbl.inria.fr/doc/Genetrank-user-manual.html). It should prove useful for mining gene sets in conjunction with a signaling pathway, whenever other approaches yield relatively large sets of genes.
Affiliation(s)
| | | | | | | | - Jérémie Roux
- CNRS UMR 7284, Inserm U 1081, Institut de Recherche sur le Cancer et le Vieillissement de Nice, Centre Antoine Lacassagne, Universite Côte d’Azur, Nice, France
| | - Frédéric Cazals
- Inria, Université Côte d’Azur, Nice, France
| |
|
39
|
Seale C, Tepeli Y, Gonçalves JP. Overcoming selection bias in synthetic lethality prediction. Bioinformatics 2022; 38:4360-4368. [PMID: 35876858 PMCID: PMC9477536 DOI: 10.1093/bioinformatics/btac523] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/19/2021] [Revised: 07/13/2022] [Accepted: 07/22/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION Synthetic lethality (SL) between two genes occurs when simultaneous loss of function leads to cell death. This holds great promise for developing anti-cancer therapeutics that target synthetic lethal pairs of endogenously disrupted genes. Identifying novel SL relationships through exhaustive experimental screens is challenging, due to the vast number of candidate pairs. Computational SL prediction is therefore sought to identify promising SL gene pairs for further experimentation. However, current SL prediction methods lack consideration for generalizability in the presence of selection bias in SL data. RESULTS We show that SL data exhibit considerable gene selection bias. Our experiments designed to assess the robustness of SL prediction reveal that models driven by the topology of known SL interactions (e.g. graph, matrix factorization) are especially sensitive to selection bias. We introduce selection bias-resilient synthetic lethality (SBSL) prediction using regularized logistic regression or random forests. Each gene pair is described by 27 molecular features derived from cancer cell line, cancer patient tissue and healthy donor tissue samples. SBSL models are built and tested using approximately 8000 experimentally derived SL pairs across breast, colon, lung and ovarian cancers. Compared to other SL prediction methods, SBSL showed higher predictive performance, better generalizability and robustness to selection bias. Gene dependency, quantifying the essentiality of a gene for cell survival, contributed most to SBSL predictions. Random forests were superior to linear models in the absence of dependency features, highlighting the relevance of mutual exclusivity of somatic mutations, co-expression in healthy tissue and differential expression in tumour samples. AVAILABILITY AND IMPLEMENTATION https://github.com/joanagoncalveslab/sbsl. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Colm Seale
- Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft 2628 XE, The Netherlands
- Holland Proton Therapy Center (HollandPTC), Delft 2600 AC, The Netherlands
| | - Yasin Tepeli
- Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft 2628 XE, The Netherlands
| | - Joana P Gonçalves
- Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft 2628 XE, The Netherlands
| |
|
40
|
Gable AL, Szklarczyk D, Lyon D, Matias Rodrigues JF, von Mering C. Systematic assessment of pathway databases, based on a diverse collection of user-submitted experiments. Brief Bioinform 2022; 23:6695266. [PMID: 36088548 PMCID: PMC9487593 DOI: 10.1093/bib/bbac355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Revised: 07/13/2022] [Accepted: 07/30/2022] [Indexed: 11/14/2022] Open
Abstract
A knowledge-based grouping of genes into pathways or functional units is essential for describing and understanding cellular complexity. However, it is not always clear a priori how and at what level of specificity functionally interconnected genes should be partitioned into pathways, for a given application. Here, we assess and compare nine existing and two conceptually novel functional classification systems, with respect to their discovery power and generality in gene set enrichment testing. We base our assessment on a collection of nearly 2000 functional genomics datasets provided by users of the STRING database. With these real-life and diverse queries, we assess which systems typically provide the most specific and complete enrichment results. We find many structural and performance differences between classification systems. Overall, the well-established, hierarchically organized pathway annotation systems yield the best enrichment performance, despite covering substantial parts of the human genome in general terms only. On the other hand, the more recent unsupervised annotation systems perform strongest in understudied areas and organisms, and in detecting more specific pathways, albeit with less informative labels.
Affiliation(s)
- Annika L Gable
- Department of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
| | - Damian Szklarczyk
- Department of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - David Lyon
- Department of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | | | - Christian von Mering
- Department of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| |
|
41
|
de Crécy-lagard V, Amorin de Hegedus R, Arighi C, Babor J, Bateman A, Blaby I, Blaby-Haas C, Bridge AJ, Burley SK, Cleveland S, Colwell LJ, Conesa A, Dallago C, Danchin A, de Waard A, Deutschbauer A, Dias R, Ding Y, Fang G, Friedberg I, Gerlt J, Goldford J, Gorelik M, Gyori BM, Henry C, Hutinet G, Jaroch M, Karp PD, Kondratova L, Lu Z, Marchler-Bauer A, Martin MJ, McWhite C, Moghe GD, Monaghan P, Morgat A, Mungall CJ, Natale DA, Nelson WC, O’Donoghue S, Orengo C, O’Toole KH, Radivojac P, Reed C, Roberts RJ, Rodionov D, Rodionova IA, Rudolf JD, Saleh L, Sheynkman G, Thibaud-Nissen F, Thomas PD, Uetz P, Vallenet D, Carter EW, Weigele PR, Wood V, Wood-Charlson EM, Xu J. A roadmap for the functional annotation of protein families: a community perspective. Database (Oxford) 2022; 2022:6663924. [PMID: 35961013 PMCID: PMC9374478 DOI: 10.1093/database/baac062] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/07/2022] [Revised: 06/28/2022] [Accepted: 08/03/2022] [Indexed: 12/23/2022]
Abstract
Over the last 25 years, biology has entered the genomic era and is becoming a science of ‘big data’. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by more than 500 000 genomes sequenced to date. By different estimates, only half the predicted sequenced proteins carry an accurate functional annotation, and this percentage varies drastically between different organismal lineages. Such a large gap in knowledge hampers all aspects of biological enterprise and, thereby, is standing in the way of genomic biology reaching its full potential. A brainstorming meeting to address this issue funded by the National Science Foundation was held during 3–4 February 2022. Bringing together data scientists, biocurators, computational biologists and experimentalists within the same venue allowed for a comprehensive assessment of the current state of functional annotations of protein families. Further, major issues that were obstructing the field were identified and discussed, which ultimately allowed for the proposal of solutions on how to move forward.
Affiliation(s)
- Valérie de Crécy-lagard
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | | | - Cecilia Arighi
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19713, USA
| | - Jill Babor
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Ian Blaby
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Crysten Blaby-Haas
- Biology Department, Brookhaven National Laboratory, Upton, NY 11973, USA
| | - Alan J Bridge
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva 4 CH-1211, Switzerland
| | - Stephen K Burley
- RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Stacey Cleveland
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Lucy J Colwell
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
| | - Ana Conesa
- Spanish National Research Council, Institute for Integrative Systems Biology, Paterna, Valencia 46980, Spain
| | - Christian Dallago
- TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, i12, Boltzmannstr. 3, Garching/Munich 85748, Germany
| | - Antoine Danchin
- School of Biomedical Sciences, Li KaShing Faculty of Medicine, The University of Hong Kong, 21 Sassoon Road, Pokfulam, SAR Hong Kong 999077, China
| | - Anita de Waard
- Research Collaboration Unit, Elsevier, Jericho, VT 05465, USA
| | - Adam Deutschbauer
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Raquel Dias
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Yousong Ding
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, USA
| | - Gang Fang
- NYU-Shanghai, Shanghai 200120, China
| | - Iddo Friedberg
- Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA 50011, USA
| | - John Gerlt
- Institute for Genomic Biology and Departments of Biochemistry and Chemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Joshua Goldford
- Physics of Living Systems, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Mark Gorelik
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Benjamin M Gyori
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 02115, USA
| | - Christopher Henry
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
| | - Geoffrey Hutinet
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Marshall Jaroch
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Peter D Karp
- Bioinformatics Research Group, SRI International, Menlo Park, CA 94025, USA
| | | | - Zhiyong Lu
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
| | - Aron Marchler-Bauer
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
| | - Maria-Jesus Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Claire McWhite
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA
| | - Gaurav D Moghe
- Plant Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
| | - Paul Monaghan
- Department of Agricultural Education and Communication, University of Florida, Gainesville, FL 32611, USA
| | - Anne Morgat
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire , Geneva 4 CH-1211, Switzerland
| | - Christopher J Mungall
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory , Berkeley, CA 94720, USA
| | - Darren A Natale
- Georgetown University Medical Center , Washington, DC 20007, USA
| | - William C Nelson
- Biological Sciences Division, Pacific Northwest National Laboratories , Richland, WA 99354, USA
| | - Seán O’Donoghue
- School of Biotechnology and Biomolecular Sciences, University of NSW , Sydney, NSW 2052, Australia
| | - Christine Orengo
- Department of Structural and Molecular Biology, University College London , London WC1E 6BT, UK
| | | | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University , Boston, MA 02115, USA
| | - Colbie Reed
- Department of Microbiology and Cell Sciences, University of Florida , Gainesville, FL 32611, USA
| | | | - Dmitri Rodionov
- Sanford Burnham Prebys Medical Discovery Institute , La Jolla, CA 92037, USA
| | - Irina A Rodionova
- Department of Bioengineering, Division of Engineering, University of California at San Diego , La Jolla, CA 92093-0412, USA
| | - Jeffrey D Rudolf
- Department of Chemistry, University of Florida , Gainesville, FL 32611, USA
| | - Lana Saleh
- New England Biolabs , Ipswich, MA 01938, USA
| | - Gloria Sheynkman
- Department of Molecular Physiology and Biological Physics, University of Virginia , Charlottesville, VA, USA
| | - Francoise Thibaud-Nissen
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH) , 8600 Rockville Pike, Bethesda, MD 20817, USA
| | - Paul D Thomas
- Department of Population and Public Health Sciences, University of Southern California , Los Angeles, CA 90033, USA
| | - Peter Uetz
- Center for Biological Data Science, Virginia Commonwealth University , Richmond, VA 23284, USA
| | - David Vallenet
- LABGeM, Génomique Métabolique, CEA, Genoscope, Institut François Jacob, Université d’Évry, Université Paris-Saclay, CNRS , Evry 91057, France
| | - Erica Watson Carter
- Department of Plant Pathology, University of Florida Citrus Research and Education Center , 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
| | | | - Valerie Wood
- Department of Biochemistry, University of Cambridge , Cambridge CB2 1GA, UK
| | - Elisha M Wood-Charlson
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory , Berkeley, CA 94720, USA
| | - Jin Xu
- Department of Plant Pathology, University of Florida Citrus Research and Education Center , 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
|
42
|
Li C, Feng Y, Fu Z, Deng J, Gu Y, Wang H, Wu X, Huang Z, Zhu Y, Liu Z, Huang M, Wang T, Hu S, Yao B, Zeng Y, Zhou CJ, Brown SDM, Liu Y, Vidal-Puig A, Dong Y, Xu Y. Human-specific gene CT47 blocks PRMT5 degradation to lead to meiosis arrest. Cell Death Discov 2022; 8:345. [PMID: 35918318 PMCID: PMC9345867 DOI: 10.1038/s41420-022-01139-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2022] [Revised: 07/13/2022] [Accepted: 07/18/2022] [Indexed: 11/25/2022] Open
Abstract
Exploring the functions of human-specific genes (HSGs) is challenging due to the lack of a tractable genetic model system. Testosterone is essential for maintaining human spermatogenesis and fertility, but the underlying mechanism is unclear. Here, we identified Cancer/Testis Antigen gene family 47 (CT47) as an essential regulator of human-specific spermatogenesis by stabilizing arginine methyltransferase 5 (PRMT5). A humanized mouse model revealed that CT47 functions to arrest spermatogenesis by interacting with and regulating CT47/PRMT5 accumulation in the nucleus during the leptotene/zygotene-to-pachytene transition of meiosis. We demonstrate that testosterone induces nuclear depletion of CT47/PRMT5 and rescues leptotene-arrested spermatocyte progression in humanized testes. Loss of CT47 in human embryonic stem cells (hESCs) by CRISPR/Cas9 led to an increase in haploid cells but blocked the testosterone-induced increase in haploid cells when hESCs were differentiated into haploid spermatogenic cells. Moreover, CT47 levels were decreased in nonobstructive azoospermia. Together, these results established CT47 as a crucial regulator of human spermatogenesis by preventing meiosis initiation before the testosterone surge.
Affiliation(s)
- Chao Li
- Cambridge-Su Genomic Resource Center, Jiangsu Key Laboratory of Neuropsychiatric Diseases, Medical School of Soochow University, Suzhou, Jiangsu, 215123, China
| | - Yuming Feng
- Department of Reproductive Medical Center, Jinling Hospital, Medical School of Nanjing University, Nanjing, Jiangsu, 210002, China
| | - Zhenxin Fu
- Cambridge-Su Genomic Resource Center, Jiangsu Key Laboratory of Neuropsychiatric Diseases, Medical School of Soochow University, Suzhou, Jiangsu, 215123, China
| | - Junjie Deng
- Cambridge-Su Genomic Resource Center, Jiangsu Key Laboratory of Neuropsychiatric Diseases, Medical School of Soochow University, Suzhou, Jiangsu, 215123, China
| | - Yue Gu
- Cambridge-Su Genomic Resource Center, Jiangsu Key Laboratory of Neuropsychiatric Diseases, Medical School of Soochow University, Suzhou, Jiangsu, 215123, China
| | - Hanben Wang
- State Key Laboratory of Reproductive Medicine (SKLRM), Nanjing Medical University, Nanjing, Jiangsu, 210029, China
| | - Xin Wu
- State Key Laboratory of Reproductive Medicine (SKLRM), Nanjing Medical University, Nanjing, Jiangsu, 210029, China
| | - Zhengyun Huang
- Cambridge-Su Genomic Resource Center, Jiangsu Key Laboratory of Neuropsychiatric Diseases, Medical School of Soochow University, Suzhou, Jiangsu, 215123, China
| | - Yichen Zhu
- Cambridge-Su Genomic Resource Center, Jiangsu Key Laboratory of Neuropsychiatric Diseases, Medical School of Soochow University, Suzhou, Jiangsu, 215123, China
| | - Zhiwei Liu
- Cambridge-Su Genomic Resource Center, Jiangsu Key Laboratory of Neuropsychiatric Diseases, Medical School of Soochow University, Suzhou, Jiangsu, 215123, China
| | - Moli Huang
- Cambridge-Su Genomic Resource Center, Jiangsu Key Laboratory of Neuropsychiatric Diseases, Medical School of Soochow University, Suzhou, Jiangsu, 215123, China
| | - Tao Wang
- Cambridge-Su Genomic Resource Center, Jiangsu Key Laboratory of Neuropsychiatric Diseases, Medical School of Soochow University, Suzhou, Jiangsu, 215123, China
| | - Shijun Hu
- Department of Cardiovascular Surgery of the First Affiliated Hospital & Institute for Cardiovascular Science, Collaborative Innovation Center of Hematology, State Key Laboratory of Radiation Medicine and Protection, Medical College, Soochow University, Suzhou, 215000, China
| | - Bing Yao
- Department of Reproductive Medical Center, Jinling Hospital, Medical School of Nanjing University, Nanjing, Jiangsu, 210002, China
| | - Yizhun Zeng
- Cambridge-Su Genomic Resource Center, Jiangsu Key Laboratory of Neuropsychiatric Diseases, Medical School of Soochow University, Suzhou, Jiangsu, 215123, China
| | - Chengji J Zhou
- Department of Biochemistry and Molecular Medicine, University of California at Davis, School of Medicine, Sacramento, CA, USA
| | - Steve D M Brown
- Medical Research Council (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, UK
| | - Yi Liu
- Department of Physiology, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - Antonio Vidal-Puig
- University of Cambridge Metabolic Research Laboratories, Institute of Metabolic Science, MDU MRC, Cambridge, UK
| | - Yingying Dong
- Cambridge-Su Genomic Resource Center, Jiangsu Key Laboratory of Neuropsychiatric Diseases, Medical School of Soochow University, Suzhou, Jiangsu, 215123, China.
| | - Ying Xu
- Cambridge-Su Genomic Resource Center, Jiangsu Key Laboratory of Neuropsychiatric Diseases, Medical School of Soochow University, Suzhou, Jiangsu, 215123, China.
|
43
|
Wasilewska K, Gambin T, Rydzanicz M, Szczałuba K, Płoski R. Postzygotic mutations and where to find them - Recent advances and future implications in the field of non-neoplastic somatic mosaicism. Mutation Research. Reviews in Mutation Research 2022; 790:108426. [PMID: 35690331 DOI: 10.1016/j.mrrev.2022.108426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2021] [Revised: 05/05/2022] [Accepted: 06/03/2022] [Indexed: 01/01/2023]
Abstract
The technological progress of massively parallel sequencing (MPS) has triggered a remarkable development in the research on postzygotic mutations. Although the overwhelming majority of studies in the field focus on oncogenesis, non-neoplastic diseases are attracting more and more attention. The aim of this review was to summarize some of the most recent findings in the field of somatic mosaicism in diseases other than neoplastic events. We discuss the abundance and role of postzygotic mutations, with a special emphasis on disorders which occur only in a mosaic form (obligatory mosaic diseases; OMDs). Based on the list of OMDs compiled from the published literature and three databases (OMIM, Orphanet and MosaicBase), we demonstrate the prevalence of cancer-related genes across OMDs and suggest other sources to further explore OMDs and OMD-related genes. Additionally, we comment on some practical aspects related to mosaic diseases, such as approaches to tissue sampling, the MPS coverage required to detect variants at a very low frequency, as well as on bioinformatic and molecular tools dedicated to detect somatic mutations in MPS data.
Affiliation(s)
- Krystyna Wasilewska
- Department of Medical Genetics, Medical University of Warsaw, ul. Pawińskiego 3c, 02-106 Warsaw, Poland
| | - Tomasz Gambin
- Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland
| | - Małgorzata Rydzanicz
- Department of Medical Genetics, Medical University of Warsaw, ul. Pawińskiego 3c, 02-106 Warsaw, Poland
| | - Krzysztof Szczałuba
- Department of Medical Genetics, Medical University of Warsaw, ul. Pawińskiego 3c, 02-106 Warsaw, Poland
| | - Rafał Płoski
- Department of Medical Genetics, Medical University of Warsaw, ul. Pawińskiego 3c, 02-106 Warsaw, Poland.
|
44
|
|
45
|
Sharma VS, Fossati A, Ciuffa R, Buljan M, Williams EG, Chen Z, Shao W, Pedrioli PGA, Purcell AW, Martínez MR, Song J, Manica M, Aebersold R, Li C. PCfun: a hybrid computational framework for systematic characterization of protein complex function. Brief Bioinform 2022; 23:6611913. [PMID: 35724564 PMCID: PMC9310514 DOI: 10.1093/bib/bbac239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Revised: 05/05/2022] [Accepted: 05/21/2022] [Indexed: 11/14/2022] Open
Abstract
In molecular biology, it is a general assumption that the ensemble of expressed molecules, their activities and interactions determine biological function, cellular states and phenotypes. Stable protein complexes—or macromolecular machines—are, in turn, the key functional entities mediating and modulating most biological processes. Although identifying protein complexes and their subunit composition can now be done inexpensively and at scale, determining their function remains challenging and labor intensive. This study describes Protein Complex Function predictor (PCfun), the first computational framework for the systematic annotation of protein complex functions using Gene Ontology (GO) terms. PCfun is built upon a word embedding using natural language processing techniques based on 1 million open access PubMed Central articles. Specifically, PCfun leverages two approaches for accurately identifying protein complex function, including: (i) an unsupervised approach that obtains the nearest neighbor (NN) GO term word vectors for a protein complex query vector and (ii) a supervised approach using Random Forest (RF) models trained specifically for recovering the GO terms of protein complex queries described in the CORUM protein complex database. PCfun consolidates both approaches by performing a hypergeometric statistical test to enrich the top NN GO terms within the child terms of the GO terms predicted by the RF models. The documentation and implementation of the PCfun package are available at https://github.com/sharmavaruns/PCfun. We anticipate that PCfun will serve as a useful tool and novel paradigm for the large-scale characterization of protein complex function.
Affiliation(s)
- Varun S Sharma
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland.,CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
| | - Andrea Fossati
- Quantitative Biosciences Institute (QBI) and Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158, USA.,J. David Gladstone Institutes, San Francisco, CA 94158, USA
| | - Rodolfo Ciuffa
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland
| | - Marija Buljan
- Empa - Swiss Federal Laboratories for Materials Science and Technology, St. Gallen, Switzerland.,Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Evan G Williams
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Zhen Chen
- Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China
| | - Wenguang Shao
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland
| | - Patrick G A Pedrioli
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland
| | - Anthony W Purcell
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
| | | | - Jiangning Song
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia.,Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
| | | | - Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland.,Faculty of Science, University of Zurich, Switzerland
| | - Chen Li
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland.,Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
|
46
|
Kustatscher G, Collins T, Gingras AC, Guo T, Hermjakob H, Ideker T, Lilley KS, Lundberg E, Marcotte EM, Ralser M, Rappsilber J. An open invitation to the Understudied Proteins Initiative. Nat Biotechnol 2022; 40:815-817. [PMID: 35534555 DOI: 10.1038/s41587-022-01316-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 01/01/2023]
Affiliation(s)
- Georg Kustatscher
- Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, UK.
| | | | - Anne-Claude Gingras
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Sinai Health System, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Tiannan Guo
- Zhejiang Provincial Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, China.,Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, China
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Trey Ideker
- Division of Genetics, Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Kathryn S Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Emma Lundberg
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH-Royal Institute of Technology, Stockholm, Sweden.,Department of Bioengineering, Stanford University, Stanford, CA, USA.,Department of Pathology, Stanford University, Stanford, CA, USA.,Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Edward M Marcotte
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX, USA
| | - Markus Ralser
- Department of Biochemistry, Charité University Medicine, Berlin, Germany.,The Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, UK
| | - Juri Rappsilber
- Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, UK.,Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, Berlin, Germany.,Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, UK
|
47
|
Amaral LAN. A cautionary tale from the machine scientist. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00491-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
48
|
|
49
|
Park Y, West RA, Pathmendra P, Favier B, Stoeger T, Capes-Davis A, Cabanac G, Labbé C, Byrne JA. Identification of human gene research articles with wrongly identified nucleotide sequences. Life Sci Alliance 2022; 5:e202101203. [PMID: 35022248 PMCID: PMC8807875 DOI: 10.26508/lsa.202101203] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 08/20/2021] [Revised: 12/27/2021] [Accepted: 12/28/2021] [Indexed: 01/01/2023] Open
Abstract
Nucleotide sequence reagents underpin molecular techniques that have been applied across hundreds of thousands of publications. We have previously reported wrongly identified nucleotide sequence reagents in human research publications and described a semi-automated screening tool Seek & Blastn to fact-check their claimed status. We applied Seek & Blastn to screen >11,700 publications across five literature corpora, including all original publications in Gene from 2007 to 2018 and all original open-access publications in Oncology Reports from 2014 to 2018. After manually checking Seek & Blastn outputs for >3,400 human research articles, we identified 712 articles across 78 journals that described at least one wrongly identified nucleotide sequence. Verifying the claimed identities of >13,700 sequences highlighted 1,535 wrongly identified sequences, most of which were claimed targeting reagents for the analysis of 365 human protein-coding genes and 120 non-coding RNAs. The 712 problematic articles have received >17,000 citations, including citations by human clinical trials. Given our estimate that approximately one-quarter of problematic articles may misinform the future development of human therapies, urgent measures are required to address unreliable gene research articles.
Affiliation(s)
- Yasunori Park
- Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
| | - Rachael A West
- Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Children's Cancer Research Unit, Kids Research, The Children's Hospital at Westmead, Westmead, Australia
| | | | - Bertrand Favier
- Université Grenoble Alpes, Translationnelle et Innovation en Médecine et Complexité, Grenoble, France
| | - Thomas Stoeger
- Successful Clinical Response in Pneumonia Therapy Systems Biology Center, Northwestern University, Evanston, IL, USA
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Center for Genetic Medicine, Northwestern University School of Medicine, Chicago, IL, USA
| | - Amanda Capes-Davis
- Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- CellBank Australia, Children's Medical Research Institute, Westmead, Australia
| | - Guillaume Cabanac
- Computer Science Department, Institut de Recherche en Informatique de Toulouse, Unité Mixte de Recherche 5505 Centre National de la Recherche Scientifique (CNRS), University of Toulouse, Toulouse, France
| | - Cyril Labbé
- Université Grenoble Alpes, CNRS, Grenoble INP, Laboratoire d'Informatique de Grenoble, Grenoble, France
| | - Jennifer A Byrne
- Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- New South Wales Health Statewide Biobank, New South Wales Health Pathology, Camperdown, Australia
|
50
|
Cho NH, Cheveralls KC, Brunner AD, Kim K, Michaelis AC, Raghavan P, Kobayashi H, Savy L, Li JY, Canaj H, Kim JYS, Stewart EM, Gnann C, McCarthy F, Cabrera JP, Brunetti RM, Chhun BB, Dingle G, Hein MY, Huang B, Mehta SB, Weissman JS, Gómez-Sjöberg R, Itzhak DN, Royer LA, Mann M, Leonetti MD. OpenCell: Endogenous tagging for the cartography of human cellular organization. Science 2022; 375:eabi6983. [PMID: 35271311 DOI: 10.1126/science.abi6983] [Citation(s) in RCA: 146] [Impact Index Per Article: 73.0] [Indexed: 12/13/2022]
Abstract
Elucidating the wiring diagram of the human cell is a central goal of the postgenomic era. We combined genome engineering, confocal live-cell imaging, mass spectrometry, and data science to systematically map the localization and interactions of human proteins. Our approach provides a data-driven description of the molecular and spatial networks that organize the proteome. Unsupervised clustering of these networks delineates functional communities that facilitate biological discovery. We found that remarkably precise functional information can be derived from protein localization patterns, which often contain enough information to identify molecular interactions, and that RNA binding proteins form a specific subgroup defined by unique interaction and localization properties. Paired with a fully interactive website (opencell.czbiohub.org), our work constitutes a resource for the quantitative cartography of human cellular organization.
Affiliation(s)
| | | | - Andreas-David Brunner
- Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Kibeom Kim
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - André C Michaelis
- Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | | | | | - Laura Savy
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Jason Y Li
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Hera Canaj
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | | | | | - Christian Gnann
- Chan Zuckerberg Biohub, San Francisco, CA, USA.,Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH-Royal Institute of Technology, Stockholm, Sweden
| | | | | | - Rachel M Brunetti
- Department of Biochemistry and Biophysics, University of California, San Francisco, CA, USA
| | | | - Greg Dingle
- Chan Zuckerberg Initiative, Redwood City, CA, USA
| | | | - Bo Huang
- Chan Zuckerberg Biohub, San Francisco, CA, USA.,Department of Biochemistry and Biophysics, University of California, San Francisco, CA, USA.,Department of Pharmaceutical Chemistry, University of California, San Francisco, CA, USA
| | | | - Jonathan S Weissman
- Whitehead Institute, Koch Institute, Howard Hughes Medical Institute, and Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA.,Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA, USA
| | | | | | | | - Matthias Mann
- Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany.,NNF Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
|