1
Nayar G, Altman RB. Heterogeneous network approaches to protein pathway prediction. Comput Struct Biotechnol J 2024; 23:2727-2739. [PMID: 39035835 PMCID: PMC11260399 DOI: 10.1016/j.csbj.2024.06.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Revised: 06/17/2024] [Accepted: 06/18/2024] [Indexed: 07/23/2024] Open
Abstract
Understanding protein-protein interactions (PPIs) and the pathways they comprise is essential for comprehending cellular functions and their links to specific phenotypes. Despite the prevalence of molecular data generated by high-throughput sequencing technologies, a significant gap remains in translating this data into functional information regarding the series of interactions that underlie phenotypic differences. In this review, we present an in-depth analysis of heterogeneous network methodologies for modeling protein pathways, highlighting the critical role of integrating multifaceted biological data. We outline the process of constructing these networks, from data representation to machine learning-driven predictions and evaluations. This work underscores the potential of heterogeneous networks in capturing the complexity of proteomic interactions, thereby offering enhanced accuracy in pathway prediction. This approach not only deepens our understanding of cellular processes but also opens up new possibilities in disease treatment and drug discovery by leveraging the predictive power of comprehensive proteomic data analysis.
Affiliation(s)
- Gowri Nayar
  - Department of Biomedical Data Science, Stanford University, United States
- Russ B. Altman
  - Department of Biomedical Data Science, Stanford University, United States
  - Department of Genetics, Stanford University, United States
  - Department of Medicine, Stanford University, United States
  - Department of Bioengineering, Stanford University, United States
2
Gillani M, Pollastri G. Protein subcellular localization prediction tools. Comput Struct Biotechnol J 2024; 23:1796-1807. [PMID: 38707539 PMCID: PMC11066471 DOI: 10.1016/j.csbj.2024.04.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 04/11/2024] [Accepted: 04/11/2024] [Indexed: 05/07/2024] Open
Abstract
Protein subcellular localization prediction is of great significance in bioinformatics and biological research. Because most proteins do not have experimentally determined localization information, computational prediction methods and tools have been an active research area for more than two decades. Knowledge of the subcellular location of a protein provides valuable information about its functionalities, the functioning of the cell, and other possible interactions with proteins. Fast, reliable, and accurate predictors provide platforms to harness the abundance of sequence data to predict subcellular locations accordingly. During the last decade, there has been a considerable amount of research effort aimed at developing subcellular localization predictors. This paper reviews recent subcellular localization prediction tools in the Eukaryotic, Prokaryotic, and Virus-based categories, followed by a detailed analysis. Each predictor is discussed based on its main features, strengths, weaknesses, algorithms used, prediction techniques, and analysis. This review is supported by prediction tool taxonomies that highlight their relevant areas and examples for straightforward categorization and ease of understanding. These taxonomies help users find suitable tools according to their needs. Furthermore, recent research gaps and challenges are discussed to cover areas that need the utmost attention. This survey provides an in-depth analysis of the most recent prediction tools to assist readers and can be considered a quick guide for researchers to identify and explore recent advancements in the literature.
Affiliation(s)
- Maryam Gillani
  - School of Computer Science, University College Dublin (UCD), Dublin, D04 V1W8, Ireland
- Gianluca Pollastri
  - School of Computer Science, University College Dublin (UCD), Dublin, D04 V1W8, Ireland
3
Li Y, Zeng GH, Liang YJ, Yang HR, Zhu XL, Zhai YJ, Duan LX, Xu YY. Improving quantitative prediction of protein subcellular locations in fluorescence images through deep generative models. Comput Biol Med 2024; 179:108913. [PMID: 39047508 DOI: 10.1016/j.compbiomed.2024.108913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 06/21/2024] [Accepted: 07/15/2024] [Indexed: 07/27/2024]
Abstract
Machine learning has been employed in recognizing protein localization at the subcellular level, which greatly facilitates protein function studies, especially for multi-label proteins that localize in more than one organelle. However, existing works mostly study the qualitative classification of protein subcellular locations, ignoring the fraction of a multi-label protein in each of its locations. In fact, about 50% of proteins are multi-label proteins, and the neglect of quantitative information severely restricts the understanding of their spatial distribution and functional mechanism. One reason for the lack of quantitative study is the insufficiency of quantitative annotations. To address the data shortage problem, here we proposed a generative model, PLocGAN, which could generate cell images with conditional quantitative annotation of the fluorescence distribution. The model was a conditional generative adversarial network, in which the condition learning utilized partial label learning to overcome the lack of training labels and allowed training with only qualitative labels. Meanwhile, it used contrastive learning to enhance the diversity of the generated images. We assessed PLocGAN on four pixel-fused synthetic datasets and one real dataset, and demonstrated that the model could generate images with good fidelity and diversity, outperforming existing state-of-the-art generative methods. To verify the utility of PLocGAN in the quantitative prediction of protein subcellular locations, we replaced the training images with generated quantitative images and built prediction models, and found that they had a boosting effect on the quantitative estimation. This work demonstrates the effectiveness of deep generative models in bioimage analysis, and provides a new solution for quantitative subcellular proteomics.
Affiliation(s)
- Yu Li
  - School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China
- Guo-Hua Zeng
  - School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China
- Yong-Jia Liang
  - School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China
- Hong-Rui Yang
  - School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China
- Xi-Liang Zhu
  - School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China
- Yu-Jia Zhai
  - Cancer Center, Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524000, China
- Li-Xia Duan
  - Guangzhou Red Cross Hospital, Medical College, Jinan University, Guangzhou, 510220, China
- Ying-Ying Xu
  - School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China
4
Li MM, Huang Y, Sumathipala M, Liang MQ, Valdeolivas A, Ananthakrishnan AN, Liao K, Marbach D, Zitnik M. Contextual AI models for single-cell protein biology. Nat Methods 2024:10.1038/s41592-024-02341-3. [PMID: 39039335 DOI: 10.1038/s41592-024-02341-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 06/10/2024] [Indexed: 07/24/2024]
Abstract
Understanding protein function and developing molecular therapies require deciphering the cell types in which proteins act as well as the interactions between proteins. However, modeling protein interactions across biological contexts remains challenging for existing algorithms. Here we introduce PINNACLE, a geometric deep learning approach that generates context-aware protein representations. Leveraging a multiorgan single-cell atlas, PINNACLE learns on contextualized protein interaction networks to produce 394,760 protein representations from 156 cell type contexts across 24 tissues. PINNACLE's embedding space reflects cellular and tissue organization, enabling zero-shot retrieval of the tissue hierarchy. Pretrained protein representations can be adapted for downstream tasks: enhancing 3D structure-based representations for resolving immuno-oncological protein interactions, and investigating drugs' effects across cell types. PINNACLE outperforms state-of-the-art models in nominating therapeutic targets for rheumatoid arthritis and inflammatory bowel diseases and pinpoints cell type contexts with higher predictive capability than context-free models. PINNACLE's ability to adjust its outputs on the basis of the context in which it operates paves the way for large-scale context-specific predictions in biology.
Affiliation(s)
- Michelle M Li
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Yepeng Huang
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Marissa Sumathipala
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Man Qing Liang
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Alberto Valdeolivas
  - Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
- Ashwin N Ananthakrishnan
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Gastroenterology, Massachusetts General Hospital, Boston, MA, USA
- Katherine Liao
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Rheumatology, Inflammation, and Immunity, Brigham and Women's Hospital, Boston, MA, USA
- Daniel Marbach
  - Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
- Marinka Zitnik
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Allston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Harvard Data Science Initiative, Cambridge, MA, USA
5
Li MM, Huang Y, Sumathipala M, Liang MQ, Valdeolivas A, Ananthakrishnan AN, Liao K, Marbach D, Zitnik M. Contextual AI models for single-cell protein biology. bioRxiv 2024:2023.07.18.549602. [PMID: 37503080 PMCID: PMC10370131 DOI: 10.1101/2023.07.18.549602] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/29/2023]
Abstract
Understanding protein function and developing molecular therapies require deciphering the cell types in which proteins act as well as the interactions between proteins. However, modeling protein interactions across biological contexts remains challenging for existing algorithms. Here, we introduce Pinnacle, a geometric deep learning approach that generates context-aware protein representations. Leveraging a multi-organ single-cell atlas, Pinnacle learns on contextualized protein interaction networks to produce 394,760 protein representations from 156 cell type contexts across 24 tissues. Pinnacle's embedding space reflects cellular and tissue organization, enabling zero-shot retrieval of the tissue hierarchy. Pretrained protein representations can be adapted for downstream tasks: enhancing 3D structure-based representations for resolving immuno-oncological protein interactions, and investigating drugs' effects across cell types. Pinnacle outperforms state-of-the-art models in nominating therapeutic targets for rheumatoid arthritis and inflammatory bowel diseases, and pinpoints cell type contexts with higher predictive capability than context-free models. Pinnacle's ability to adjust its outputs based on the context in which it operates paves the way for large-scale context-specific predictions in biology.
Affiliation(s)
- Michelle M. Li
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Yepeng Huang
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Marissa Sumathipala
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Man Qing Liang
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Alberto Valdeolivas
  - Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
- Ashwin N. Ananthakrishnan
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Gastroenterology, Massachusetts General Hospital, Boston, MA, USA
- Katherine Liao
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Rheumatology, Inflammation, and Immunity, Brigham and Women’s Hospital, Boston, MA, USA
- Daniel Marbach
  - Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
- Marinka Zitnik
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Allston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Harvard Data Science Initiative, Cambridge, MA, USA
6
Fliri A, Kajiji S. Effects of vitamin D signaling in cardiovascular disease: centrality of macrophage polarization. Front Cardiovasc Med 2024; 11:1388025. [PMID: 38984353 PMCID: PMC11232491 DOI: 10.3389/fcvm.2024.1388025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Accepted: 05/24/2024] [Indexed: 07/11/2024] Open
Abstract
Among the leading causes of natural death are cardiovascular diseases, cancer, and respiratory diseases. Factors causing illness include genetic predisposition, aging, stress, chronic inflammation, environmental factors, declining autophagy, and endocrine abnormalities including insufficient vitamin D levels. Inconclusive clinical outcomes of vitamin D supplements in cardiovascular diseases demonstrate the need to identify cause-effect relationships without bias. In this study, we employed a spectral clustering methodology capable of analyzing large, diverse datasets to examine the role of vitamin D's genomic and non-genomic signaling in disease. The results of this investigation showed the following: (1) vitamin D regulates multiple reciprocal feedback loops including p53, macrophage autophagy, nitric oxide, and redox-signaling; (2) these regulatory schemes are involved in over 2,000 diseases. Furthermore, the balance between genomic and non-genomic signaling by vitamin D affects autophagy regulation of macrophage polarization in tissue homeostasis. These findings provide a deeper understanding of how interactions between genomic and non-genomic signaling affect vitamin D pharmacology and offer opportunities for increasing the efficacy of vitamin D-centered treatment of cardiovascular disease and for extending healthy lifespans.
Affiliation(s)
- Anton Fliri
  - Emergent System Analytics LLC, Clinton, CT, United States
- Shama Kajiji
  - Emergent System Analytics LLC, Clinton, CT, United States
7
Kwon JJ, Pan J, Gonzalez G, Hahn WC, Zitnik M. On knowing a gene: A distributional hypothesis of gene function. Cell Syst 2024; 15:488-496. [PMID: 38810640 PMCID: PMC11189734 DOI: 10.1016/j.cels.2024.04.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Revised: 02/25/2024] [Accepted: 04/30/2024] [Indexed: 05/31/2024]
Abstract
As words can have multiple meanings that depend on sentence context, genes can have various functions that depend on the surrounding biological system. This pleiotropic nature of gene function is limited by ontologies, which annotate gene functions without considering biological contexts. We contend that the gene function problem in genetics may be informed by recent technological leaps in natural language processing, in which representations of word semantics can be automatically learned from diverse language contexts. In contrast to efforts to model semantics as "is-a" relationships in the 1990s, modern distributional semantics represents words as vectors in a learned semantic space and fuels current advances in transformer-based models such as large language models and generative pre-trained transformers. A similar shift in thinking of gene functions as distributions over cellular contexts may enable a similar breakthrough in data-driven learning from large biological datasets to inform gene function.
Affiliation(s)
- Jason J Kwon
  - Dana-Farber Cancer Institute and Harvard Medical School, Department of Medical Oncology, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Joshua Pan
  - Dana-Farber Cancer Institute and Harvard Medical School, Department of Medical Oncology, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Guadalupe Gonzalez
  - Department of Computing, Faculty of Engineering, Imperial College, London SW7 2AZ, UK
- William C Hahn
  - Dana-Farber Cancer Institute and Harvard Medical School, Department of Medical Oncology, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Marinka Zitnik
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Harvard Medical School, Department of Biomedical Informatics, Boston, MA 02115, USA; Harvard Data Science Initiative, Harvard University, Cambridge, MA 02138, USA; Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Allston, MA 02134, USA
8
Adnane M, de Almeida AM, Chapwanya A. Unveiling the power of proteomics in advancing tropical animal health and production. Trop Anim Health Prod 2024; 56:182. [PMID: 38825622 DOI: 10.1007/s11250-024-04037-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 05/20/2024] [Indexed: 06/04/2024]
Abstract
Proteomics, the large-scale study of proteins in biological systems, has emerged as a pivotal tool in the field of animal and veterinary sciences, mainly for investigating local and rustic breeds. Proteomics provides valuable insights into biological processes underlying animal growth, reproduction, health, and disease. In this review, we highlight the key proteomics technologies, methodologies, and their applications in domestic animals, particularly in the tropical context. We also discuss advances in proteomics research, including integration of multi-omics data, single-cell proteomics, and proteogenomics, all of which are promising for improving animal health, adaptation, welfare, and productivity. However, proteomics research in domestic animals faces challenges, such as sample preparation variation, data quality control, and privacy and ethical considerations relating to animal welfare. We also provide recommendations for overcoming these challenges, emphasizing the importance of following best practices in sample preparation, data quality control, and ethical compliance. With this review, we therefore aim to help harness the full potential of proteomics in advancing our understanding of animal biology and ultimately improve animal health and productivity in local breeds of diverse animal species in a tropical context.
Affiliation(s)
- Mounir Adnane
  - Department of Biomedicine, Institute of Veterinary Sciences, University of Tiaret, Tiaret, 14000, Algeria
- André M de Almeida
  - LEAF-Linking Landscape, Environment, Agriculture and Food Research Center, Associate Laboratory TERRA, Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, Lisboa, 1349-017, Portugal
- Aspinas Chapwanya
  - Department of Clinical Sciences, Ross University School of Veterinary Medicine, Basseterre, 00265, Saint Kitts and Nevis
9
Munro V, Kelly V, Messner CB, Kustatscher G. Cellular control of protein levels: A systems biology perspective. Proteomics 2024; 24:e2200220. [PMID: 38012370 DOI: 10.1002/pmic.202200220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2023] [Revised: 11/13/2023] [Accepted: 11/15/2023] [Indexed: 11/29/2023]
Abstract
How cells regulate protein levels is a central question of biology. Over the past decades, molecular biology research has provided profound insights into the mechanisms and the molecular machinery governing each step of the gene expression process, from transcription to protein degradation. Recent advances in transcriptomics and proteomics have complemented our understanding of these fundamental cellular processes with a quantitative, systems-level perspective. Multi-omic studies revealed significant quantitative, kinetic and functional differences between the genome, transcriptome and proteome. While protein levels often correlate with mRNA levels, quantitative investigations have demonstrated a substantial impact of translation and protein degradation on protein expression control. In addition, protein-level regulation appears to play a crucial role in buffering protein abundances against undesirable mRNA expression variation. These findings have practical implications for many fields, including gene function prediction and precision medicine.
Affiliation(s)
- Victoria Munro
  - Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, UK
- Van Kelly
  - Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, UK
- Christoph B Messner
  - Precision Proteomics Center, Swiss Institute of Allergy and Asthma Research (SIAF), University of Zurich, Davos, Switzerland
- Georg Kustatscher
  - Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, UK
10
Rutherford KM, Lera-Ramírez M, Wood V. PomBase: a Global Core Biodata Resource-growth, collaboration, and sustainability. Genetics 2024; 227:iyae007. [PMID: 38376816 PMCID: PMC11075564 DOI: 10.1093/genetics/iyae007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 01/13/2024] [Indexed: 02/21/2024] Open
Abstract
PomBase (https://www.pombase.org), the model organism database (MOD) for fission yeast, was recently awarded Global Core Biodata Resource (GCBR) status by the Global Biodata Coalition (GBC; https://globalbiodata.org/) after a rigorous selection process. In this MOD review, we present PomBase's continuing growth and improvement over the last 2 years. We describe these improvements in the context of the qualitative GCBR indicators related to scientific quality, comprehensivity, accelerating science, user stories, and collaborations with other biodata resources. This review also showcases the depth of existing connections both within the biocuration ecosystem and between PomBase and its user community.
Affiliation(s)
- Kim M Rutherford
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
- Manuel Lera-Ramírez
  - Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
- Valerie Wood
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
11
Sun Z, Ning Z, Figeys D. The Landscape and Perspectives of the Human Gut Metaproteomics. Mol Cell Proteomics 2024; 23:100763. [PMID: 38608842 PMCID: PMC11098955 DOI: 10.1016/j.mcpro.2024.100763] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 02/26/2024] [Accepted: 04/09/2024] [Indexed: 04/14/2024] Open
Abstract
The human gut microbiome is closely associated with human health and diseases. Metaproteomics has emerged as a valuable tool for studying the functionality of the gut microbiome by analyzing the entire set of proteins present in microbial communities. Recent advancements in liquid chromatography and tandem mass spectrometry (LC-MS/MS) techniques have expanded the detection range of metaproteomics. However, the overall coverage of the proteome in metaproteomics is still limited. While metagenomics studies have revealed substantial microbial diversity and functional potential of the human gut microbiome, few studies have summarized and studied the human gut microbiome landscape revealed by metaproteomics. In this article, we present the current landscape of human gut metaproteomics studies by re-analyzing the identification results from 15 published studies. We quantified the limited proteome coverage in metaproteomics and revealed a high proportion of annotation coverage of metaproteomics-identified proteins. We conducted a preliminary comparison between the metaproteomics view and the metagenomics view of the human gut microbiome, identifying key areas of consistency and divergence. Based on the current landscape of human gut metaproteomics, we discuss the feasibility of using metaproteomics to study functionally unknown proteins and propose a whole-workflow peptide-centric analysis. Additionally, we suggest enhancing metaproteomics analysis by refining taxonomic classification and calculating confidence scores, as well as developing tools for analyzing the interaction between taxonomy and function.
Affiliation(s)
- Zhongzhi Sun
  - School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada; Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada
- Zhibin Ning
  - School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada
- Daniel Figeys
  - School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada; Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada
12
Coorssen JR, Padula MP. Proteomics-The State of the Field: The Definition and Analysis of Proteomes Should Be Based in Reality, Not Convenience. Proteomes 2024; 12:14. [PMID: 38651373 PMCID: PMC11036260 DOI: 10.3390/proteomes12020014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 04/17/2024] [Accepted: 04/17/2024] [Indexed: 04/25/2024] Open
Abstract
With growing recognition and acknowledgement of the genuine complexity of proteomes, we are finally entering the post-proteogenomic era. Routine assessment of proteomes as inferred correlates of gene sequences (i.e., canonical 'proteins') cannot provide the necessary critical analysis of systems-level biology that is needed to understand underlying molecular mechanisms and pathways or identify the most selective biomarkers and therapeutic targets. These critical requirements demand the analysis of proteomes at the level of proteoforms/protein species, the actual active molecular players. Currently, only highly refined integrated or integrative top-down proteomics (iTDP) enables the analytical depth necessary to provide routine, comprehensive, and quantitative proteome assessments across the widest range of proteoforms inherent to native systems. Here we provide a broad perspective of the field, taking in historical and current realities, to establish a more balanced understanding of where the field has come from (in particular during the ten years since Proteomes was launched), current issues, and how things likely need to proceed if necessary deep proteome analyses are to succeed. We base this in our firm belief that the best proteomic analyses reflect, as closely as possible, the native sample at the moment of sampling. We also seek to emphasise that this and future analytical approaches are likely best based on the broad recognition and exploitation of the complementarity of currently successful approaches. This also emphasises the need to continuously evaluate and further optimize established approaches, to avoid complacency in thinking and expectations but also to promote the critical and careful development and introduction of new approaches, most notably those that address proteoforms. 
Above all, we wish to emphasise that a rigorous focus on analytical quality must override current thinking that largely values analytical speed; the latter would certainly be nice, if only proteoforms could thus be effectively, routinely, and quantitatively assessed. Alas, proteomes are composed of proteoforms, not molecular species that can be amplified or that directly mirror genes (i.e., 'canonical'). The problem is hard, and we must accept and address it as such, but the payoff in playing this longer game of rigorous deep proteome analyses is the promise of far more selective biomarkers, drug targets, and truly personalised or even individualised medicine.
Affiliation(s)
- Jens R. Coorssen
  - Department of Biological Sciences, Faculty of Mathematics and Science, Brock University, St. Catharines, ON L2S 3A1, Canada
  - Institute for Globally Distributed Open Research and Education (IGDORE), St. Catharines, ON L2N 4X2, Canada
- Matthew P. Padula
  - School of Life Sciences and Proteomics, Lipidomics and Metabolomics Core Facility, Faculty of Science, University of Technology Sydney, Sydney, NSW 2007, Australia
|
13
|
Malatesta M, Fornasier E, Di Salvo ML, Tramonti A, Zangelmi E, Peracchi A, Secchi A, Polverini E, Giachin G, Battistutta R, Contestabile R, Percudani R. One substrate many enzymes virtual screening uncovers missing genes of carnitine biosynthesis in human and mouse. Nat Commun 2024; 15:3199. [PMID: 38615009 PMCID: PMC11016064 DOI: 10.1038/s41467-024-47466-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 03/26/2024] [Indexed: 04/15/2024] Open
Abstract
The increasing availability of experimental and computational protein structures entices their use for function prediction. Here we develop an automated procedure to identify enzymes involved in metabolic reactions by assessing substrate conformations docked to a library of protein structures. By screening AlphaFold-modeled vitamin B6-dependent enzymes, we find that a metric based on catalytically favorable conformations at the enzyme active site performs best (AUROC Score=0.84) in identifying genes associated with known reactions. Applying this procedure, we identify the mammalian gene encoding hydroxytrimethyllysine aldolase (HTMLA), the second enzyme of carnitine biosynthesis. Upon experimental validation, we find that the top-ranked candidates, serine hydroxymethyl transferase (SHMT) 1 and 2, catalyze the HTMLA reaction. However, a mouse protein absent in humans (threonine aldolase; Tha1) catalyzes the reaction more efficiently. Tha1 did not rank highest based on the AlphaFold model, but its rank improved to second place using the experimental crystal structure we determined at 2.26 Å resolution. Our findings suggest that humans have lost a gene involved in carnitine biosynthesis, with HTMLA activity of SHMT partially compensating for its function.
Affiliation(s)
- Marco Malatesta
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Martino Luigi Di Salvo
  - Istituto Pasteur Italia-Fondazione Cenci Bolognetti and Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza University of Rome, Rome, Italy
- Angela Tramonti
  - Institute of Molecular Biology and Pathology, Italian National Research Council, Rome, Italy
- Erika Zangelmi
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Alessio Peracchi
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Andrea Secchi
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Eugenia Polverini
  - Department of Mathematical, Physical and Computer Sciences, University of Parma, Parma, Italy
- Gabriele Giachin
  - Department of Chemical Sciences, University of Padua, Padova, Italy
- Roberto Contestabile
  - Istituto Pasteur Italia-Fondazione Cenci Bolognetti and Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza University of Rome, Rome, Italy
- Riccardo Percudani
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
|
14
|
Zhao Y, Yang Z, Wang L, Zhang Y, Lin H, Wang J. Predicting Protein Functions Based on Heterogeneous Graph Attention Technique. IEEE J Biomed Health Inform 2024; 28:2408-2415. [PMID: 38319781 DOI: 10.1109/jbhi.2024.3357834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2024]
Abstract
In bioinformatics, protein function prediction stands as a fundamental area of research and plays a crucial role in addressing various biological challenges, such as the identification of potential targets for drug discovery and the elucidation of disease mechanisms. However, known functional annotation databases usually provide positive experimental annotations that proteins carry out a given function, and rarely record negative experimental annotations that proteins do not carry out a given function. Therefore, existing computational methods based on deep learning models focus on these positive annotations for prediction and ignore these scarce but informative negative annotations, leading to an underestimation of precision. To address this issue, we introduce a deep learning method that utilizes a heterogeneous graph attention technique. The method first constructs a heterogeneous graph that covers the protein-protein interaction network, ontology structure, and positive and negative annotation information. Then, it learns embedding representations of proteins and ontology terms by using the heterogeneous graph attention technique. Finally, it leverages these learned representations to reconstruct the positive protein-term associations and score unobserved functional annotations. It can enhance the predictive performance by incorporating these known limited negative annotations into the constructed heterogeneous graph. Experimental results on three species (i.e., Human, Mouse, and Arabidopsis) demonstrate that our method can achieve better performance in predicting new protein annotations than state-of-the-art methods.
|
15
|
Idrees S, Paudel KR, Sadaf T, Hansbro PM. Uncovering domain motif interactions using high-throughput protein-protein interaction detection methods. FEBS Lett 2024; 598:725-742. [PMID: 38439692 DOI: 10.1002/1873-3468.14841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 01/09/2024] [Accepted: 02/18/2024] [Indexed: 03/06/2024]
Abstract
Protein-protein interactions (PPIs) are often mediated by short linear motifs (SLiMs) in one protein and a domain in another, known as domain-motif interactions (DMIs). During the past decade, SLiMs have been studied to find their role in cellular functions such as post-translational modifications, regulatory processes, protein scaffolding, cell cycle progression, cell adhesion, cell signalling and substrate selection for proteasomal degradation. This review provides a comprehensive overview of the current PPI detection techniques and resources, focusing on their relevance to capturing interactions mediated by SLiMs. We also address the challenges associated with capturing DMIs. Moreover, a case study analysing the BioGrid database as a source of DMI prediction revealed significant enrichment of known DMIs across different PPI detection methods. Overall, current high-throughput PPI detection methods can be a reliable source for predicting DMIs.
Affiliation(s)
- Sobia Idrees
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
  - Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
- Keshav Raj Paudel
  - Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
- Tayyaba Sadaf
  - Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
- Philip M Hansbro
  - Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
|
16
|
Richardson R, Tejedor Navarro H, Amaral LAN, Stoeger T. Meta-Research: Understudied genes are lost in a leaky pipeline between genome-wide assays and reporting of results. eLife 2024; 12:RP93429. [PMID: 38546716 PMCID: PMC10977968 DOI: 10.7554/elife.93429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/01/2024] Open
Abstract
Present-day publications on human genes primarily feature genes that already appeared in many publications prior to completion of the Human Genome Project in 2003. These patterns persist despite the subsequent adoption of high-throughput technologies, which routinely identify novel genes associated with biological processes and disease. Although several hypotheses for bias in the selection of genes as research targets have been proposed, their explanatory powers have not yet been compared. Our analysis suggests that understudied genes are systematically abandoned in favor of better-studied genes between the completion of -omics experiments and the reporting of results. Understudied genes remain abandoned by studies that cite these -omics experiments. Conversely, we find that publications on understudied genes may even accrue a greater number of citations. Among 45 biological and experimental factors previously proposed to affect which genes are being studied, we find that 33 are significantly associated with the choice of hit genes presented in titles and abstracts of -omics studies. To promote the investigation of understudied genes, we condense our insights into a tool, find my understudied genes (FMUG), that allows scientists to engage with potential bias during the selection of hits. We demonstrate the utility of FMUG through the identification of genes that remain understudied in vertebrate aging. FMUG is developed in Flutter and is available for download at fmug.amaral.northwestern.edu as a MacOS/Windows app.
Affiliation(s)
- Reese Richardson
  - Interdisciplinary Biological Sciences, Northwestern University, Evanston, United States
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
- Heliodoro Tejedor Navarro
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
  - Northwestern Institute on Complex Systems, Northwestern University, Evanston, United States
- Luis A Nunes Amaral
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
  - Northwestern Institute on Complex Systems, Northwestern University, Evanston, United States
  - Department of Molecular Biosciences, Northwestern University, Evanston, United States
  - Department of Physics and Astronomy, Northwestern University, Evanston, United States
- Thomas Stoeger
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
  - The Potocsnak Longevity Institute, Northwestern University, Chicago, United States
  - Simpson Querrey Lung Institute for Translational Science, Northwestern University, Chicago, United States
|
17
|
Song FV, Su J, Huang S, Zhang N, Li K, Ni M, Liao M. DeepSS2GO: protein function prediction from secondary structure. Brief Bioinform 2024; 25:bbae196. [PMID: 38701416 PMCID: PMC11066904 DOI: 10.1093/bib/bbae196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 03/31/2024] [Accepted: 04/10/2024] [Indexed: 05/05/2024] Open
Abstract
Predicting protein function is crucial for understanding biological life processes, preventing diseases and developing new drug targets. In recent years, methods based on sequence, structure and biological networks for protein function annotation have been extensively researched. Although obtaining a protein's three-dimensional structure through experimental or computational methods enhances the accuracy of function prediction, the sheer volume of proteins sequenced by high-throughput technologies presents a significant challenge. To address this issue, we introduce a deep neural network model, DeepSS2GO (Secondary Structure to Gene Ontology). It is a predictor incorporating secondary structure features along with primary sequence and homology information. The algorithm combines the speed of sequence-based information with the accuracy of structure-based features while streamlining the redundant data in primary sequences and bypassing the time-consuming challenges of tertiary structure analysis. The results show that the prediction performance surpasses state-of-the-art algorithms. It has the ability to predict key functions by effectively utilizing secondary structure information, rather than broadly predicting general Gene Ontology terms. Additionally, DeepSS2GO predicts five times faster than advanced algorithms, making it highly applicable to massive sequencing data. The source code and trained models are available at https://github.com/orca233/DeepSS2GO.
Affiliation(s)
- Fu V Song
  - Department of Chemical Biology, School of Life Sciences, Southern University of Science and Technology, Xueyuan Avenue, 518055, Shenzhen, China
- Jiaqi Su
  - Department of Chemical Biology, School of Life Sciences, Southern University of Science and Technology, Xueyuan Avenue, 518055, Shenzhen, China
- Sixing Huang
  - Gemini Data Japan, Kitaku Oujikamiya 1-11-11, 115-0043, Tokyo, Japan
- Neng Zhang
  - Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, E1 4NS, London, UK
- Kaiyue Li
  - Department of Chemical Biology, School of Life Sciences, Southern University of Science and Technology, Xueyuan Avenue, 518055, Shenzhen, China
- Ming Ni
  - MGI Tech, Beishan Industrial Zone, 518083, Shenzhen, China
- Maofu Liao
  - Department of Chemical Biology, School of Life Sciences, Southern University of Science and Technology, Xueyuan Avenue, 518055, Shenzhen, China
  - Institute for Biological Electron Microscopy, Southern University of Science and Technology, Xueyuan Avenue, 518055, Shenzhen, China
|
18
|
Schäfer PSL, Dimitrov D, Villablanca EJ, Saez-Rodriguez J. Integrating single-cell multi-omics and prior biological knowledge for a functional characterization of the immune system. Nat Immunol 2024; 25:405-417. [PMID: 38413722 DOI: 10.1038/s41590-024-01768-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 01/16/2024] [Indexed: 02/29/2024]
Abstract
The immune system comprises diverse specialized cell types that cooperate to defend the host against a wide range of pathogenic threats. Recent advancements in single-cell and spatial multi-omics technologies provide rich information about the molecular state of immune cells. Here, we review how the integration of single-cell and spatial multi-omics data with prior knowledge, gathered from decades of detailed biochemical studies, allows us to obtain functional insights, focusing on gene regulatory processes and cell-cell interactions. We present diverse applications in immunology and critically assess underlying assumptions and limitations. Finally, we offer a perspective on the ongoing technological and algorithmic developments that promise to get us closer to a systemic mechanistic understanding of the immune system.
Affiliation(s)
- Philipp Sven Lars Schäfer
  - Institute for Computational Bioscience, Faculty of Medicine and Heidelberg University Hospital, Heidelberg University, Heidelberg, Germany
- Daniel Dimitrov
  - Institute for Computational Bioscience, Faculty of Medicine and Heidelberg University Hospital, Heidelberg University, Heidelberg, Germany
- Eduardo J Villablanca
  - Division of Immunology and Allergy, Department of Medicine Solna, Karolinska Institute and Karolinska University Hospital, Stockholm, Sweden
  - Center of Molecular Medicine, Stockholm, Sweden
- Julio Saez-Rodriguez
  - Institute for Computational Bioscience, Faculty of Medicine and Heidelberg University Hospital, Heidelberg University, Heidelberg, Germany
|
19
|
Wicke D, Neumann P, Gößringer M, Chernev A, Davydov S, Poehlein A, Daniel R, Urlaub H, Hartmann R, Ficner R, Stülke J. The previously uncharacterized RnpM (YlxR) protein modulates the activity of ribonuclease P in Bacillus subtilis in vitro. Nucleic Acids Res 2024; 52:1404-1419. [PMID: 38050972 PMCID: PMC10853771 DOI: 10.1093/nar/gkad1171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 11/22/2023] [Indexed: 12/07/2023] Open
Abstract
Even though Bacillus subtilis is one of the most studied organisms, no function has been identified for about 20% of its proteins. Among these unknown proteins are several RNA- and ribosome-binding proteins, suggesting that they exert functions in cellular information processing. In this work, we have investigated the RNA-binding protein YlxR. This protein is widely conserved in bacteria and strongly constitutively expressed in B. subtilis, suggesting an important function. We have identified the RNA subunit of the essential RNase P as the binding partner of YlxR. The main activity of RNase P is the processing of 5' ends of pre-tRNAs. In vitro processing assays demonstrated that the presence of YlxR results in reduced RNase P activity. Chemical cross-linking studies followed by in silico docking analysis and experiments with site-directed mutant proteins suggest that YlxR binds to the region of the RNase P RNA that is important for binding and cleavage of the pre-tRNA substrate. We conclude that the YlxR protein is a novel interaction partner of the RNA subunit of RNase P that serves to fine-tune RNase P activity to ensure appropriate amounts of mature tRNAs for translation. We rename the YlxR protein RnpM for RNase P modulator.
Affiliation(s)
- Dennis Wicke
  - Department of General Microbiology, GZMB, Georg-August-University Göttingen, Göttingen, Germany
- Piotr Neumann
  - Department of Molecular Structural Biology, GZMB, Georg-August-University Göttingen, Göttingen, Germany
- Markus Gößringer
  - Institute for the Pharmaceutical Chemistry, Philipps-University Marburg, Marburg, Germany
- Aleksandar Chernev
  - Bioanalytical Mass Spectrometry, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Swetlana Davydov
  - Institute for the Pharmaceutical Chemistry, Philipps-University Marburg, Marburg, Germany
- Anja Poehlein
  - Department of Genomic and Applied Microbiology & Göttingen Genomics Laboratory, GZMB, Georg-August-University Göttingen, Göttingen, Germany
- Rolf Daniel
  - Department of Genomic and Applied Microbiology & Göttingen Genomics Laboratory, GZMB, Georg-August-University Göttingen, Göttingen, Germany
- Henning Urlaub
  - Bioanalytical Mass Spectrometry, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
  - Institute of Clinical Chemistry, GZMB, University Medical Centre Göttingen, Germany
  - Cluster of Excellence “Multiscale Bioimaging: from Molecular Machines to Networks of Excitable Cells” (MBExC), Georg-August-University Göttingen, Germany
- Roland K Hartmann
  - Institute for the Pharmaceutical Chemistry, Philipps-University Marburg, Marburg, Germany
- Ralf Ficner
  - Department of Molecular Structural Biology, GZMB, Georg-August-University Göttingen, Göttingen, Germany
- Jörg Stülke
  - Department of General Microbiology, GZMB, Georg-August-University Göttingen, Göttingen, Germany
|
20
|
Richardson RAK, Tejedor Navarro H, Amaral LAN, Stoeger T. Meta-Research: understudied genes are lost in a leaky pipeline between genome-wide assays and reporting of results. bioRxiv 2024:2023.02.28.530483. [PMID: 36909550 PMCID: PMC10002660 DOI: 10.1101/2023.02.28.530483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
Abstract
Present-day publications on human genes primarily feature genes that already appeared in many publications prior to completion of the Human Genome Project in 2003. These patterns persist despite the subsequent adoption of high-throughput technologies, which routinely identify novel genes associated with biological processes and disease. Although several hypotheses for bias in the selection of genes as research targets have been proposed, their explanatory powers have not yet been compared. Our analysis suggests that understudied genes are systematically abandoned in favor of better-studied genes between the completion of -omics experiments and the reporting of results. Understudied genes remain abandoned by studies that cite these -omics experiments. Conversely, we find that publications on understudied genes may even accrue a greater number of citations. Among 45 biological and experimental factors previously proposed to affect which genes are being studied, we find that 33 are significantly associated with the choice of hit genes presented in titles and abstracts of -omics studies. To promote the investigation of understudied genes, we condense our insights into a tool, find my understudied genes (FMUG), that allows scientists to engage with potential bias during the selection of hits. We demonstrate the utility of FMUG through the identification of genes that remain understudied in vertebrate aging. FMUG is developed in Flutter and is available for download at fmug.amaral.northwestern.edu as a MacOS/Windows app.
Affiliation(s)
- Reese AK Richardson
  - Interdisciplinary Biological Sciences, Northwestern University
  - Department of Chemical and Biological Engineering, Northwestern University
- Heliodoro Tejedor Navarro
  - Department of Chemical and Biological Engineering, Northwestern University
  - Northwestern Institute on Complex Systems, Northwestern University
- Luis A Nunes Amaral
  - Department of Chemical and Biological Engineering, Northwestern University
  - Northwestern Institute on Complex Systems, Northwestern University
  - Department of Physics and Astronomy, Northwestern University
  - Department of Molecular Biosciences, Northwestern University
- Thomas Stoeger
  - Department of Chemical and Biological Engineering, Northwestern University
  - The Potocsnak Longevity Institute, Northwestern University
  - Simpson Querrey Lung Institute for Translational Science, Northwestern University
|
21
|
Maio G, Smith M, Bhawal R, Zhang S, Baskin JM, Li J, Lin H. Interactome Analysis Identifies the Role of BZW2 in Promoting Endoplasmic Reticulum-Mitochondria Contact and Mitochondrial Metabolism. Mol Cell Proteomics 2024; 23:100709. [PMID: 38154691 PMCID: PMC10835002 DOI: 10.1016/j.mcpro.2023.100709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Revised: 12/19/2023] [Accepted: 12/24/2023] [Indexed: 12/30/2023] Open
Abstract
Understanding the molecular functions of less-studied proteins is an important task of life science research. Despite reports of basic leucine zipper and W2 domain-containing protein 2 (BZW2) promoting cancer progression first emerging in 2017, little is known about its molecular function. Using a quantitative proteomic approach to identify its interacting proteins, we found that BZW2 interacts with both endoplasmic reticulum (ER) and mitochondrial proteins. We thus hypothesized that BZW2 localizes to and promotes the formation of ER-mitochondria contact sites and that such localization would promote calcium transport from ER to the mitochondria and promote ATP production. Indeed, we found that BZW2 localized to ER-mitochondria contact sites and that BZW2 knockdown decreased ER-mitochondria contact, mitochondrial calcium levels, and ATP production. These findings provide key insights into molecular functions of BZW2, the potential role of BZW2 in cancer progression, and highlight the utility of interactome data in understanding the function of less-studied proteins.
Affiliation(s)
- George Maio
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York, USA
- Mike Smith
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York, USA
- Ruchika Bhawal
  - Proteomics and Metabolomics Facility, Cornell University, Ithaca, New York, USA
- Sheng Zhang
  - Proteomics and Metabolomics Facility, Cornell University, Ithaca, New York, USA
- Jeremy M Baskin
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, USA
- Jenny Li
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York, USA
- Hening Lin
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York, USA; Howard Hughes Medical Institute, Cornell University, Ithaca, New York, USA; Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
|
22
|
Macedo-da-Silva J, Mule SN, Rosa-Fernandes L, Palmisano G. A computational pipeline elucidating functions of conserved hypothetical Trypanosoma cruzi proteins based on public proteomic data. Adv Protein Chem Struct Biol 2024; 138:401-428. [PMID: 38220431 DOI: 10.1016/bs.apcsb.2023.07.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
The proteome is complex, dynamic, and functionally diverse. Functional proteomics aims to characterize the functions of proteins in biological systems. However, there is a delay in annotating the function of proteins, even in model organisms. This gap is even greater in other organisms, including Trypanosoma cruzi, the causative agent of the parasitic, systemic, and sometimes fatal disease called Chagas disease. About 99.8% of the Trypanosoma cruzi proteome is not manually annotated (unreviewed), of which >25% are conserved hypothetical proteins (CHPs), calling attention to the knowledge gap on the protein content of this organism. CHPs are proteins conserved among different species of various evolutionary lineages; however, they lack functional validation. This study describes a bioinformatics pipeline applied to public proteomic data to infer possible biological functions of conserved hypothetical Trypanosoma cruzi proteins. Here, the adopted strategy consisted of collecting differentially expressed proteins between the epimastigote and metacyclic trypomastigote stages of Trypanosoma cruzi, followed by the functional characterization of these CHPs by applying a manifold learning technique for dimension reduction and 3D structure homology analysis (Spalog). We found a panel of 25 and 26 upregulated proteins in the epimastigote and metacyclic trypomastigote stages, respectively; among these, 18 CHPs (8 in the epimastigote stage and 10 in the metacyclic stage) were characterized. The data generated corroborate the literature and complement the functional analyses of differentially regulated proteins at each stage, as they attribute potential functions to CHPs, which are frequently identified in Trypanosoma cruzi proteomics studies. However, it is important to point out that experimental validation is required to deepen our understanding of the CHPs.
Affiliation(s)
- Janaina Macedo-da-Silva
  - GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, Sao Paulo, Brazil
- Simon Ngao Mule
  - GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, Sao Paulo, Brazil
- Livia Rosa-Fernandes
  - GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, Sao Paulo, Brazil; Centre for Motor Neuron Disease Research, Faculty of Medicine, Health & Human Sciences, Macquarie Medical School, Sydney, NSW, Australia
- Giuseppe Palmisano
  - GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, Sao Paulo, Brazil; School of Natural Sciences, Macquarie University, Sydney, NSW, Australia
|
23
|
Rappsilber J. A dive into the unknome. Trends Genet 2024; 40:15-16. [PMID: 37968205 DOI: 10.1016/j.tig.2023.10.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 10/23/2023] [Indexed: 11/17/2023]
Abstract
We may never understand the function of all genes, findings by Freeman, Munro and colleagues suggest, unless we rethink our approaches. They make a thorough attempt at quantifying the unknownness of protein-coding genes and experimentally prove that many neglected genes hold the seed of important discoveries.
Affiliation(s)
- Juri Rappsilber
  - Technische Universität Berlin, Chair of Bioanalytics, 10623 Berlin, Germany; Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, UK; Si-M/'Der Simulierte Mensch', a Science Framework of Technische Universität Berlin and Charité - Universitätsmedizin Berlin, Berlin, Germany
|
24
|
Skinnider MA, Akinlaja MO, Foster LJ. Mapping protein states and interactions across the tree of life with co-fractionation mass spectrometry. Nat Commun 2023; 14:8365. [PMID: 38102123 PMCID: PMC10724252 DOI: 10.1038/s41467-023-44139-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2023] [Accepted: 12/01/2023] [Indexed: 12/17/2023] Open
Abstract
We present CFdb, a harmonized resource of interaction proteomics data from 411 co-fractionation mass spectrometry (CF-MS) datasets spanning 21,703 fractions. Meta-analysis of this resource charts protein abundance, phosphorylation, and interactions throughout the tree of life, including a reference map of the human interactome. We show how large-scale CF-MS data can enhance analyses of individual CF-MS datasets, and exemplify this strategy by mapping the honey bee interactome.
Affiliation(s)
- Michael A Skinnider
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
  - Ludwig Institute for Cancer Research, Princeton University, Princeton, NJ, USA
- Mopelola O Akinlaja
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Leonard J Foster
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
|
25
|
Rahban M, Ahmad F, Piatyszek MA, Haertlé T, Saso L, Saboury AA. Stabilization challenges and aggregation in protein-based therapeutics in the pharmaceutical industry. RSC Adv 2023; 13:35947-35963. [PMID: 38090079 PMCID: PMC10711991 DOI: 10.1039/d3ra06476j] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/22/2023] [Accepted: 11/30/2023] [Indexed: 04/26/2024] Open
Abstract
Protein-based therapeutics have revolutionized the pharmaceutical industry and become vital components in the development of future therapeutics. They offer several advantages over traditional small molecule drugs, including high affinity, potency and specificity, while demonstrating low toxicity and minimal adverse effects. However, the development and manufacturing processes of protein-based therapeutics present challenges related to protein folding, purification, stability and immunogenicity that should be addressed. These proteins, like other biological molecules, are prone to chemical and physical instabilities. The stability of protein-based drugs throughout the entire manufacturing, storage and delivery process is essential. Structural instability resulting from misfolding, unfolding, and modifications, as well as aggregation, poses a significant risk to the efficacy of these drugs, overshadowing their promising attributes. Gaining insight into structural alterations caused by aggregation and their impact on immunogenicity is vital for the advancement and refinement of protein therapeutics. Hence, in this review, we discuss some features of protein aggregation during production, formulation and storage, as well as stabilization strategies in protein engineering and computational methods to prevent aggregation.
Affiliation(s)
- Mahdie Rahban
- Neuroscience Research Center, Institute of Neuropharmacology, Kerman University of Medical Sciences, Kerman, Iran
- Faizan Ahmad
- Department of Biochemistry, School of Chemical & Life Sciences, Jamia Hamdard, New Delhi-110062, India
- Luciano Saso
- Department of Physiology and Pharmacology "Vittorio Erspamer", Sapienza University, Rome, Italy
- Ali Akbar Saboury
- Institute of Biochemistry and Biophysics, University of Tehran, Tehran 1417614335, Iran
26
van Breugel ME, van Kruijsbergen I, Mittal C, Lieftink C, Brouwer I, van den Brand T, Kluin RJC, Hoekman L, Menezes RX, van Welsem T, Del Cortona A, Malik M, Beijersbergen RL, Lenstra TL, Verstrepen KJ, Pugh BF, van Leeuwen F. Locus-specific proteome decoding reveals Fpt1 as a chromatin-associated negative regulator of RNA polymerase III assembly. Mol Cell 2023; 83:4205-4221.e9. [PMID: 37995691 PMCID: PMC11289708 DOI: 10.1016/j.molcel.2023.10.037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 09/27/2023] [Accepted: 10/26/2023] [Indexed: 11/25/2023]
Abstract
Transcription of tRNA genes by RNA polymerase III (RNAPIII) is tuned by signaling cascades. The emerging notion of differential tRNA gene regulation implies the existence of additional regulatory mechanisms. However, tRNA gene-specific regulators have not been described. Decoding the local chromatin proteome of a native tRNA gene in yeast revealed reprogramming of the RNAPIII transcription machinery upon nutrient perturbation. Among the dynamic proteins, we identified Fpt1, a protein of unknown function that uniquely occupied RNAPIII-regulated genes. Fpt1 binding at tRNA genes correlated with the efficiency of RNAPIII eviction upon nutrient perturbation and required the transcription factors TFIIIB and TFIIIC but not RNAPIII. In the absence of Fpt1, eviction of RNAPIII was reduced, and the shutdown of ribosome biogenesis genes was impaired upon nutrient perturbation. Our findings provide support for a chromatin-associated mechanism required for RNAPIII eviction from tRNA genes and tuning the physiological response to changing metabolic demands.
Affiliation(s)
- Maria Elize van Breugel
- Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands
- Ila van Kruijsbergen
- Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands
- Chitvan Mittal
- Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA; Department of Molecular Biology and Genetics, Biotechnology Building, Cornell University, Ithaca, NY 14853, USA
- Cor Lieftink
- Division of Molecular Carcinogenesis and Robotics and Screening Center, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands
- Ineke Brouwer
- Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands; Division of Gene Regulation, Netherlands Cancer Institute, Oncode Institute, Amsterdam 1066 CX, the Netherlands
- Teun van den Brand
- Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands
- Roelof J C Kluin
- Genomics Core Facility, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands
- Liesbeth Hoekman
- Proteomics Facility, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands
- Renée X Menezes
- Biostatistics Centre and Division of Psychosocial Research and Epidemiology, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands
- Tibor van Welsem
- Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands
- Andrea Del Cortona
- VIB-KU Leuven Center for Microbiology, KU Leuven, 3001 Heverlee-Leuven, Belgium
- Muddassir Malik
- Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands
- Roderick L Beijersbergen
- Division of Molecular Carcinogenesis and Robotics and Screening Center, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands; Genomics Core Facility, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands
- Tineke L Lenstra
- Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands; Division of Gene Regulation, Netherlands Cancer Institute, Oncode Institute, Amsterdam 1066 CX, the Netherlands
- Kevin J Verstrepen
- VIB-KU Leuven Center for Microbiology, KU Leuven, 3001 Heverlee-Leuven, Belgium
- B Franklin Pugh
- Department of Molecular Biology and Genetics, Biotechnology Building, Cornell University, Ithaca, NY 14853, USA
- Fred van Leeuwen
- Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam 1066 CX, the Netherlands; Department of Medical Biology, Amsterdam UMC, University of Amsterdam, Amsterdam 1105 AZ, the Netherlands.
27
Brechtmann F, Bechtler T, Londhe S, Mertes C, Gagneur J. Evaluation of input data modality choices on functional gene embeddings. NAR Genom Bioinform 2023; 5:lqad095. [PMID: 37942285 PMCID: PMC10629286 DOI: 10.1093/nargab/lqad095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 09/07/2023] [Accepted: 09/28/2023] [Indexed: 11/10/2023] Open
Abstract
Functional gene embeddings, numerical vectors capturing gene function, provide a promising way to integrate functional gene information into machine learning models. These embeddings are learnt by applying self-supervised machine-learning algorithms on various data types including quantitative omics measurements, protein-protein interaction networks and literature. However, downstream evaluations comparing alternative data modalities used to construct functional gene embeddings have been lacking. Here we benchmarked functional gene embeddings obtained from various data modalities for predicting disease-gene lists, cancer drivers, phenotype-gene associations and scores from genome-wide association studies. Off-the-shelf predictors trained on precomputed embeddings matched or outperformed dedicated state-of-the-art predictors, demonstrating their high utility. Embeddings based on literature and protein-protein interactions inferred from low-throughput experiments outperformed embeddings derived from genome-wide experimental data (transcriptomics, deletion screens and protein sequence) when predicting curated gene lists. In contrast, they did not perform better when predicting genome-wide association signals and were biased towards highly-studied genes. These results indicate that embeddings derived from literature and low-throughput experiments appear favourable in many existing benchmarks because they are biased towards well-studied genes and should therefore be considered with caution. Altogether, our study and precomputed embeddings will facilitate the development of machine-learning models in genetics and related fields.
Affiliation(s)
- Felix Brechtmann
- TUM School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Munich Center for Machine Learning, Munich, Germany
- Thibault Bechtler
- TUM School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Shubhankar Londhe
- TUM School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Christian Mertes
- TUM School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Munich Data Science Institute, Technical University of Munich, Garching, Germany
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Julien Gagneur
- TUM School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Computational Health Center, Helmholtz Center Munich, Neuherberg, Germany
28
Huang HJ, Li YY, Ye ZX, Li LL, Hu QL, He YJ, Qi YH, Zhang Y, Li T, Lu G, Mao QZ, Zhuo JC, Lu JB, Xu ZT, Sun ZT, Yan F, Chen JP, Zhang CX, Li JM. Co-option of a non-retroviral endogenous viral element in planthoppers. Nat Commun 2023; 14:7264. [PMID: 37945658 PMCID: PMC10636211 DOI: 10.1038/s41467-023-43186-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Accepted: 11/02/2023] [Indexed: 11/12/2023] Open
Abstract
Non-retroviral endogenous viral elements (nrEVEs) are widely dispersed throughout the genomes of eukaryotes. Although nrEVEs are known to be involved in host antiviral immunity, it remains an open question whether they can be domesticated as functional proteins to serve cellular innovations in arthropods. In this study, we found that endogenous toti-like viral elements (ToEVEs) are ubiquitously integrated into the genomes of three planthopper species, with highly variable distributions and polymorphism levels in planthopper populations. Three ToEVEs display exon‒intron structures and active transcription, suggesting that they might have been domesticated by planthoppers. CRISPR/Cas9 experiments revealed that one ToEVE in Nilaparvata lugens, NlToEVE14, has been co-opted by its host and plays essential roles in planthopper development and fecundity. Large-scale analysis of ToEVEs in arthropod genomes indicated that the number of arthropod nrEVEs is currently underestimated and that they may contribute to the functional diversity of arthropod genes.
Affiliation(s)
- Hai-Jian Huang
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Yi-Yuan Li
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Zhuang-Xin Ye
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- College of Forestry, Nanjing Forestry University, Nanjing, 210037, China
- Li-Li Li
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Qing-Ling Hu
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Yu-Juan He
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Yu-Hua Qi
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Yan Zhang
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Ting Li
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Gang Lu
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Qian-Zhuo Mao
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Ji-Chong Zhuo
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Jia-Bao Lu
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Zhong-Tian Xu
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Zong-Tao Sun
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Fei Yan
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China
- Jian-Ping Chen
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China.
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China.
- College of Forestry, Nanjing Forestry University, Nanjing, 210037, China.
- Chuan-Xi Zhang
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China.
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China.
- Jun-Min Li
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China.
- Key Laboratory of Biotechnology in Plant Protection of Ministry of Agriculture and Zhejiang Province, Institute of Plant Virology, Ningbo University, Ningbo, 315211, China.
29
Ning J, Yang M, Liu W, Luo X, Yue X. Proteomics and Peptidomics As a Tool to Compare the Proteins and Endogenous Peptides in Human, Cow, and Donkey Milk. J Agric Food Chem 2023; 71:16435-16451. [PMID: 37882656 DOI: 10.1021/acs.jafc.3c04534] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 10/27/2023]
Abstract
Cow's milk is the most widely used ingredient in infant formulas. However, its specific protein composition can cause allergic reactions. Finding alternatives to replace cow's milk and fill the nutritional gap with human milk is essential for the health of infants. Proteomic and peptidomic techniques have supported the elucidation of milk's nutritional ingredients. Recently, omics approaches have attracted increasing interest in the investigation of milk because of their high throughput, precision, sensitivity, and reproducibility. This review offers a significant overview of recent developments in proteomics and peptidomics used to study the differences in human, cow, and donkey milk. All three types of milks were identified to have critical biological functions in human health, particularly in infants. Donkey milk proteins were closer in composition to human milk, were less likely to cause allergic reactions, and may be developed as novel raw materials for formula milk powders.
Affiliation(s)
- Jianting Ning
- College of Food Science, Shenyang Agricultural University, Shenyang 110866, People's Republic of China
- Mei Yang
- College of Food Science, Shenyang Agricultural University, Shenyang 110866, People's Republic of China
- Wanting Liu
- College of Food Science, Shenyang Agricultural University, Shenyang 110866, People's Republic of China
- Xue Luo
- College of Food Science, Shenyang Agricultural University, Shenyang 110866, People's Republic of China
- Xiqing Yue
- College of Food Science, Shenyang Agricultural University, Shenyang 110866, People's Republic of China
30
Liu Y, Fu Y, Xue X, Tang G, Si L. BRD2 protects the rat H9C2 cardiomyocytes from hypoxia‑reoxygenation injury by targeting Nrf2/HO‑1 signaling pathway. Exp Ther Med 2023; 26:542. [PMID: 37869639 PMCID: PMC10587885 DOI: 10.3892/etm.2023.12241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 08/08/2023] [Indexed: 10/24/2023] Open
Abstract
Myocardial ischemia-reperfusion (I/R) injury is a common complication of acute myocardial infarction following percutaneous coronary intervention, but there are currently no effective pharmacological targets for adjuvant therapy due to a lack of knowledge of I/R injury mechanisms in cardiomyocytes. To evaluate the effects of hypoxia-reoxygenation on the plasma proteome of cardiomyocytes and prospective therapeutic targets, five sets of H9C2 cardiomyocytes from rats were cultured under various hypoxic circumstances. Using Cell Counting Kit-8 (CCK8) and lactate dehydrogenase (LDH) release assays, the cell viability and LDH release of H9C2 cells were analyzed. Proteome sequencing was then performed on cardiomyocytes to show the quantitative protein changes during the I/R injury process. After hypoxia/reoxygenation, bromodomain-containing protein 2 (BRD2) expression was evaluated. After administering the BRD2 inhibitor dBET1, the expression of nuclear factor erythroid 2-related factor 2/haem oxygenase-1 (Nrf2/HO-1) was identified. The results showed that in the group exposed to 4 h of hypoxia followed by 4 h of reoxygenation (H/R4), the cell survival rate was dramatically reduced, whereas the apoptotic rate and LDH release were much higher than in the normal oxygen group. In addition, the expressions of 2,325 proteins differed considerably between these two groups, with 128 upregulated and 122 downregulated proteins being discovered in the H/R4 group. After 4 h of reoxygenation, the BRD2 expression was increased. Following the addition of dBET1 to suppress BRD2, the expression of Nrf2/HO-1 was reduced, but the rate of apoptosis increased. In conclusion, through the Nrf2/HO-1 signaling pathway, BRD2 protects cardiomyocytes from damage caused by hypoxia/reoxygenation. This may have implications for novel treatment targets to minimize I/R damage to the myocardium.
Affiliation(s)
- Yingcun Liu
- Department of Cardiology, The Third Affiliated Hospital, Chongqing Medical University, Chongqing 401120, P.R. China
- Yuqing Fu
- Department of Cardiology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen, Guangdong 518000, P.R. China
- Xin Xue
- Department of Cardiology, The Third Affiliated Hospital, Chongqing Medical University, Chongqing 401120, P.R. China
- Gang Tang
- Department of Cardiovascular Medicine, Shapingba Hospital Affiliated to Chongqing University, Chongqing 400030, P.R. China
- Liangyi Si
- Department of Cardiology, The Third Affiliated Hospital, Chongqing Medical University, Chongqing 401120, P.R. China
31
Kipen J, Jaldén J. Beam search decoder for enhancing sequence decoding speed in single-molecule peptide sequencing data. PLoS Comput Biol 2023; 19:e1011345. [PMID: 37934778 PMCID: PMC10656014 DOI: 10.1371/journal.pcbi.1011345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 11/17/2023] [Accepted: 10/15/2023] [Indexed: 11/09/2023] Open
Abstract
Next-generation single-molecule protein sequencing technologies have the potential to significantly accelerate biomedical research. These technologies offer sensitivity and scalability for proteomic analysis. One auspicious method is fluorosequencing, which involves cutting naturalized proteins into peptides, attaching fluorophores to specific amino acids, and observing variations in light intensity as one amino acid is removed at a time. The original peptide is classified from the sequence of light-intensity reads, and proteins can subsequently be recognized with this information. The stepwise amino acid removal is achieved by attaching the peptides to a wall at the C-terminal and using a process called Edman degradation to remove an amino acid from the N-terminal. Even though a framework (Whatprot) has been proposed for the peptide classification task, processing times remain restrictive due to the massively parallel data acquisition system. In this paper, we propose a new beam search decoder with a novel state formulation that obtains considerably lower processing times at the expense of only a slight accuracy drop compared to Whatprot. Furthermore, we explore how our novel state formulation may lead to even faster decoders in the future.
Affiliation(s)
- Javier Kipen
- Division of Information Science and Engineering, Kungliga Tekniska Högskolan, Stockholm, Stockholm, Sweden
- Joakim Jaldén
- Division of Information Science and Engineering, Kungliga Tekniska Högskolan, Stockholm, Stockholm, Sweden
32
Kolbowski L, Fischer L, Rappsilber J. Cleavable Cross-Linkers Redefined by a Novel MS3-Trigger Algorithm. Anal Chem 2023; 95:15461-15464. [PMID: 37816155 PMCID: PMC10603603 DOI: 10.1021/acs.analchem.3c01673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 09/22/2023] [Indexed: 10/12/2023]
Abstract
Cross-linking mass spectrometry (MS) is currently transitioning from a routine tool in structural biology to enabling structural systems biology. MS-cleavable cross-linkers could substantially reduce the associated search space expansion by allowing a MS3-based approach for identifying cross-linked peptides. However, MS2 (MS/MS)-based approaches currently outperform approaches utilizing MS3. We show here that the sensitivity and specificity of triggering MS3 have been hampered algorithmically. Our four-step MS3-trigger algorithm greatly outperformed currently employed methods and comes close to reaching the theoretical limit.
Affiliation(s)
- Lars Kolbowski
- Technische Universität Berlin, Chair of Bioanalytics, 10623 Berlin, Germany
- Lutz Fischer
- Technische Universität Berlin, Chair of Bioanalytics, 10623 Berlin, Germany
- Juri Rappsilber
- Technische Universität Berlin, Chair of Bioanalytics, 10623 Berlin, Germany
- Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh EH9 3BF, United Kingdom
- Si-M/“Der Simulierte Mensch”, a Science Framework of Technische Universität Berlin and Charité - Universitätsmedizin Berlin, 10623 Berlin, Germany
33
Derry A, Altman RB. Explainable protein function annotation using local structure embeddings. bioRxiv 2023:2023.10.13.562298. [PMID: 37905033 PMCID: PMC10614799 DOI: 10.1101/2023.10.13.562298] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 11/02/2023]
Abstract
The rapid expansion of protein sequence and structure databases has resulted in a significant number of proteins with ambiguous or unknown function. While advances in machine learning techniques hold great potential to fill this annotation gap, current methods for function prediction are unable to associate global function reliably to the specific residues responsible for that function. We address this issue by introducing PARSE (Protein Annotation by Residue-Specific Enrichment), a knowledge-based method which combines pre-trained embeddings of local structural environments with traditional statistical techniques to identify enriched functions with residue-level explainability. For the task of predicting the catalytic function of enzymes, PARSE achieves comparable or superior global performance to state-of-the-art machine learning methods (F1 score > 85%) while simultaneously annotating the specific residues involved in each function with much greater precision. Since it does not require supervised training, our method can make one-shot predictions for very rare functions and is not limited to a particular type of functional label (e.g. Enzyme Commission numbers or Gene Ontology codes). Finally, we leverage the AlphaFold Structure Database to perform functional annotation at a proteome scale. By applying PARSE to the dark proteome (predicted structures which cannot be classified into known structural families), we predict several novel bacterial metalloproteases. Each of these proteins shares a strongly conserved catalytic site despite highly divergent sequences and global folds, illustrating the value of local structure representations for new function discovery.
Affiliation(s)
- Alexander Derry
- Department of Biomedical Data Science, Stanford University, Stanford, CA
- Russ B Altman
- Department of Biomedical Data Science, Stanford University, Stanford, CA
- Departments of Bioengineering, Genetics, and Medicine, Stanford University, Stanford, CA
34
Gaiteri C, Connell DR, Sultan FA, Iatrou A, Ng B, Szymanski BK, Zhang A, Tasaki S. Robust, scalable, and informative clustering for diverse biological networks. Genome Biol 2023; 24:228. [PMID: 37828545 PMCID: PMC10571258 DOI: 10.1186/s13059-023-03062-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Accepted: 09/19/2023] [Indexed: 10/14/2023] Open
Abstract
Clustering molecular data into informative groups is a primary step in extracting robust conclusions from big data. However, due to foundational issues in how they are defined and detected, such clusters are not always reliable, leading to unstable conclusions. We compare popular clustering algorithms across thousands of synthetic and real biological datasets, including a new consensus clustering algorithm, SpeakEasy2: Champagne. These tests identify trends in performance, show no single method is universally optimal, and allow us to examine factors behind variation in performance. Multiple metrics indicate SpeakEasy2 generally provides robust, scalable, and informative clusters for a range of applications.
Affiliation(s)
- Chris Gaiteri
- Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY, USA.
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA.
- Department of Neurological Sciences, Rush University Medical Center, Chicago, IL, USA.
- David R Connell
- Rush University Graduate College, Rush University Medical Center, Chicago, IL, USA
- Faraz A Sultan
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- Artemis Iatrou
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- Department of Psychiatry, McLean Hospital, Harvard Medical School, Harvard University, Belmont, MA, USA
| | - Bernard Ng
- Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY, USA
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
| | - Boleslaw K Szymanski
- Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA
- Network Science and Technology Center, Rensselaer Polytechnic Institute, Troy, NY, USA
- Academy of Social Sciences, Łódź, Poland
| | - Ada Zhang
- Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY, USA
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
| | - Shinya Tasaki
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- Department of Neurological Sciences, Rush University Medical Center, Chicago, IL, USA
35
Heil L, Damoc E, Arrey TN, Pashkova A, Denisov E, Petzoldt J, Peterson AC, Hsu C, Searle BC, Shulman N, Riffle M, Connolly B, MacLean BX, Remes PM, Senko MW, Stewart HI, Hock C, Makarov AA, Hermanson D, Zabrouskov V, Wu CC, MacCoss MJ. Evaluating the Performance of the Astral Mass Analyzer for Quantitative Proteomics Using Data-Independent Acquisition. J Proteome Res 2023; 22:3290-3300. [PMID: 37683181 PMCID: PMC10563156 DOI: 10.1021/acs.jproteome.3c00357] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Received: 06/14/2023] [Indexed: 09/10/2023]
Abstract
We evaluate the quantitative performance of the newly released Asymmetric Track Lossless (Astral) analyzer. Using data-independent acquisition, the Thermo Scientific Orbitrap Astral mass spectrometer quantifies 5 times more peptides per unit time than state-of-the-art Thermo Scientific Orbitrap mass spectrometers, which have long been the gold standard for high-resolution quantitative proteomics. Our results demonstrate that the Orbitrap Astral mass spectrometer can produce high-quality quantitative measurements across a wide dynamic range. We also use a newly developed extracellular vesicle enrichment protocol to reach new depths of coverage in the plasma proteome, quantifying over 5000 plasma proteins in a 60 min gradient with the Orbitrap Astral mass spectrometer.
Affiliation(s)
- Lilian R. Heil
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Eugen Damoc
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Tabiwang N. Arrey
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Anna Pashkova
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Eduard Denisov
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Johannes Petzoldt
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Chris Hsu
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Brian C. Searle
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
  - Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio 43210, United States
- Nicholas Shulman
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Michael Riffle
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Brian Connolly
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Brendan X. MacLean
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Philip M. Remes
  - Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
- Michael W. Senko
  - Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
- Hamish I. Stewart
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Christian Hock
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Daniel Hermanson
  - Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
- Vlad Zabrouskov
  - Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
- Christine C. Wu
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Michael J. MacCoss
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
36
Rodríguez-López M, Bordin N, Lees J, Scholes H, Hassan S, Saintain Q, Kamrad S, Orengo C, Bähler J. Broad functional profiling of fission yeast proteins using phenomics and machine learning. eLife 2023; 12:RP88229. [PMID: 37787768 PMCID: PMC10547477 DOI: 10.7554/elife.88229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2023] Open
Abstract
Many proteins remain poorly characterized even in well-studied organisms, presenting a bottleneck for research. We applied phenomics and machine-learning approaches with Schizosaccharomyces pombe for broad cues on protein functions. We assayed colony-growth phenotypes to measure the fitness of deletion mutants for 3509 non-essential genes in 131 conditions with different nutrients, drugs, and stresses. These analyses exposed phenotypes for 3492 mutants, including 124 mutants of 'priority unstudied' proteins conserved in humans, providing varied functional clues. For example, over 900 proteins were newly implicated in the resistance to oxidative stress. Phenotype-correlation networks suggested roles for poorly characterized proteins through 'guilt by association' with known proteins. For complementary functional insights, we predicted Gene Ontology (GO) terms using machine learning methods exploiting protein-network and protein-homology data (NET-FF). We obtained 56,594 high-scoring GO predictions, of which 22,060 also featured high information content. Our phenotype-correlation data and NET-FF predictions showed a strong concordance with existing PomBase GO annotations and protein networks, with integrated analyses revealing 1675 novel GO predictions for 783 genes, including 47 predictions for 23 priority unstudied proteins. Experimental validation identified new proteins involved in cellular aging, showing that these predictions and phenomics data provide a rich resource to uncover new protein functions.
Affiliation(s)
- María Rodríguez-López
  - University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
- Nicola Bordin
  - University College London, Institute of Structural and Molecular Biology, London, United Kingdom
- Jon Lees
  - University College London, Institute of Structural and Molecular Biology, London, United Kingdom
  - University of Bristol, Bristol, United Kingdom
- Harry Scholes
  - University College London, Institute of Structural and Molecular Biology, London, United Kingdom
- Shaimaa Hassan
  - University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
  - Helwan University, Faculty of Pharmacy, Cairo, Egypt
- Quentin Saintain
  - University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
- Stephan Kamrad
  - University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
- Christine Orengo
  - University College London, Institute of Structural and Molecular Biology, London, United Kingdom
- Jürg Bähler
  - University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
37
Kelly K, Lewis PA, Plun-Favreau H, Manzoni C. Protein network analysis links the NSL complex to Parkinson's disease via mitochondrial and nuclear biology. Mol Omics 2023; 19:668-679. [PMID: 37427757 DOI: 10.1039/d2mo00325b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Whilst the majority of Parkinson's Disease (PD) cases are sporadic, much of our understanding of the pathophysiological basis of the disease can be traced back to the study of rare, monogenic forms of PD. In the past decade, the availability of genome-wide association studies (GWAS) has facilitated a shift in focus, toward identifying common risk variants conferring increased risk of developing PD across the population. A recent mitophagy screening assay of GWAS candidates has functionally implicated the non-specific lethal (NSL) complex in the regulation of PINK1-mitophagy. Here, a bioinformatics approach has been taken to investigate the proteome of the NSL complex, to unpick its relevance to PD pathogenesis. The NSL interactome has been built, using 3 online tools: PINOT, HIPPIE and MIST, to mine curated, literature-derived protein-protein interaction (PPI) data. We built (i) the 'mitochondrial' NSL interactome exploring its relevance to PD genetics and (ii) the PD-oriented NSL interactome to uncover biological pathways underpinning the NSL/PD association. In this study, we find the mitochondrial NSL interactome to be significantly enriched for the protein products of PD-associated genes, including the Mendelian PD genes LRRK2 and VPS35. In addition, we find nuclear processes to be amongst those most significantly enriched within the PD-associated NSL interactome. These findings strengthen the role of the NSL complex in sporadic and familial PD, mediated by both its mitochondrial and nuclear functions.
Affiliation(s)
- Katie Kelly
  - UCL Queen Square Institute of Neurology, Queen Square, London, WC1N 3BG, UK
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
- Patrick A Lewis
  - UCL Queen Square Institute of Neurology, Queen Square, London, WC1N 3BG, UK
  - Royal Veterinary College, University of London, Royal College Street, Camden, NW1 0TU, UK
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
- Helene Plun-Favreau
  - UCL Queen Square Institute of Neurology, Queen Square, London, WC1N 3BG, UK
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
- Claudia Manzoni
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
  - UCL School of Pharmacy, Brunswick Square, London, WC1N 1AX, UK
38
Liao Y, Savage SR, Dou Y, Shi Z, Yi X, Jiang W, Lei JT, Zhang B. A proteogenomics data-driven knowledge base of human cancer. Cell Syst 2023; 14:777-787.e5. [PMID: 37619559 PMCID: PMC10530292 DOI: 10.1016/j.cels.2023.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 05/11/2023] [Accepted: 07/25/2023] [Indexed: 08/26/2023]
Abstract
By combining mass-spectrometry-based proteomics and phosphoproteomics with genomics, epi-genomics, and transcriptomics, proteogenomics provides comprehensive molecular characterization of cancer. Using this approach, the Clinical Proteomic Tumor Analysis Consortium (CPTAC) has characterized over 1,000 primary tumors spanning 10 cancer types, many with matched normal tissues. Here, we present LinkedOmicsKB, a proteogenomics data-driven knowledge base that makes consistently processed and systematically precomputed CPTAC pan-cancer proteogenomics data available to the public through ∼40,000 gene-, protein-, mutation-, and phenotype-centric web pages. Visualization techniques facilitate efficient exploration and reasoning of complex, interconnected data. Using three case studies, we illustrate the practical utility of LinkedOmicsKB in providing new insights into genes, phosphorylation sites, somatic mutations, and cancer phenotypes. With precomputed results of 19,701 coding genes, 125,969 phosphosites, and 256 genotypes and phenotypes, LinkedOmicsKB provides a comprehensive resource to accelerate proteogenomics data-driven discoveries to improve our understanding and treatment of human cancer. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Yuxing Liao
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Sara R Savage
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Yongchao Dou
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Zhiao Shi
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Xinpei Yi
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Wen Jiang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Jonathan T Lei
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Bing Zhang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
39
Tantoso E, Eisenhaber B, Sinha S, Jensen LJ, Eisenhaber F. Did the early full genome sequencing of yeast boost gene function discovery? Biol Direct 2023; 18:46. [PMID: 37574542 PMCID: PMC10424406 DOI: 10.1186/s13062-023-00403-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Accepted: 08/01/2023] [Indexed: 08/15/2023] Open
Abstract
BACKGROUND Although the genome of Saccharomyces cerevisiae (S. cerevisiae) was the first one of a eukaryote organism that was fully sequenced (in 1996), a complete understanding of the potential of encoded biomolecular mechanisms has not yet been achieved. Here, we wish to quantify how far the goal of a full list of S. cerevisiae gene functions still is. RESULTS The scientific literature about S. cerevisiae protein-coding genes has been mapped onto the yeast genome via the mentioning of names for genomic regions in scientific publications. The match was quantified with the ratio of a given gene name's occurrences to those of any gene names in the article. We find that ~ 230 elite genes with ≥ 75 full publication equivalents (FPEs, FPE = 1 is an idealized publication referring to just a single gene) command ~ 45% of all literature. At the same time, about two thirds of the genes (each with less than 10 FPEs) are described in just 12% of the literature (in average each such gene has just ~ 1.5% of the literature of an elite gene). About 600 genes have not been mentioned in any dedicated article. Compared with other groups of genes, the literature growth rates were highest for uncharacterized or understudied genes until late nineties of the twentieth century. Yet, these growth rates deteriorated and became negative thereafter. Thus, yeast function discovery for previously uncharacterized genes has returned to the level of ~ 1980. At the same time, literature for anyhow well-studied genes (with a threshold T10 (≥ 10 FPEs) and higher) remains steadily growing. CONCLUSIONS Did the early full genome sequencing of yeast boost gene function discovery? The data proves that the moment of publishing the full genome in reality coincides with the onset of decline of gene function discovery for previously uncharacterized genes. 
If the current status of literature about yeast molecular mechanisms can be extrapolated into the future, it will take about another ~ 50 years to complete the yeast gene function list. We found that a small group of scientific journals contributed extraordinarily to publishing early reports relevant to yeast gene function discoveries.
Affiliation(s)
- Erwin Tantoso
  - Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore
  - Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
- Birgit Eisenhaber
  - Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore
  - Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
  - LASA - Lausitz Advanced Scientific Applications gGmbH, Straße Der Einheit 2-24, 02943, Weißwasser, Federal Republic of Germany
- Swati Sinha
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Lars Juhl Jensen
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Frank Eisenhaber
  - Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore
  - Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
  - LASA - Lausitz Advanced Scientific Applications gGmbH, Straße Der Einheit 2-24, 02943, Weißwasser, Federal Republic of Germany
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Republic of Singapore
40
Heil LR, Damoc E, Arrey TN, Pashkova A, Denisov E, Petzoldt J, Peterson AC, Hsu C, Searle BC, Shulman N, Riffle M, Connolly B, MacLean BX, Remes PM, Senko MW, Stewart HI, Hock C, Makarov AA, Hermanson D, Zabrouskov V, Wu CC, MacCoss MJ. Evaluating the Performance of the Astral Mass Analyzer for Quantitative Proteomics Using Data Independent Acquisition. bioRxiv 2023:2023.06.03.543570. [PMID: 37398334 PMCID: PMC10312564 DOI: 10.1101/2023.06.03.543570] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 07/04/2023]
Abstract
We evaluate the quantitative performance of the newly released Asymmetric Track Lossless (Astral) analyzer. Using data independent acquisition, the Thermo Scientific™ Orbitrap™ Astral™ mass spectrometer quantifies 5 times more peptides per unit time than state-of-the-art Thermo Scientific™ Orbitrap™ mass spectrometers, which have long been the gold standard for high resolution quantitative proteomics. Our results demonstrate that the Orbitrap Astral mass spectrometer can produce high quality quantitative measurements across a wide dynamic range. We also use a newly developed extra-cellular vesicle enrichment protocol to reach new depths of coverage in the plasma proteome, quantifying over 5,000 plasma proteins in a 60-minute gradient with the Orbitrap Astral mass spectrometer.
Affiliation(s)
- Lilian R. Heil
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Eugen Damoc
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Tabiwang N. Arrey
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Anna Pashkova
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Eduard Denisov
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Johannes Petzoldt
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Chris Hsu
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Brian C. Searle
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
  - Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio 43210, United States
- Nicholas Shulman
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Michael Riffle
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Brian Connolly
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Brendan X. MacLean
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Philip M. Remes
  - Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
- Michael W. Senko
  - Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
- Hamish I. Stewart
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Christian Hock
  - Thermo Fisher Scientific, Hanna-Kunath Ste. 11, 28199 Bremen, Germany
- Daniel Hermanson
  - Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
- Vlad Zabrouskov
  - Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
- Christine C. Wu
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
- Michael J. MacCoss
  - Department of Genome Sciences, University of Washington, 3720 15th Street NE, Seattle, Washington 98195, United States
41
Palukuri MV, Patil RS, Marcotte EM. Molecular complex detection in protein interaction networks through reinforcement learning. BMC Bioinformatics 2023; 24:306. [PMID: 37532987 PMCID: PMC10394916 DOI: 10.1186/s12859-023-05425-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Accepted: 07/20/2023] [Indexed: 08/04/2023] Open
Abstract
BACKGROUND Proteins often assemble into higher-order complexes to perform their biological functions. Such protein-protein interactions (PPI) are often experimentally measured for pairs of proteins and summarized in a weighted PPI network, to which community detection algorithms can be applied to define the various higher-order protein complexes. Current methods include unsupervised and supervised approaches, often assuming that protein complexes manifest only as dense subgraphs. Utilizing supervised approaches, the focus is not on how to find them in a network, but only on learning which subgraphs correspond to complexes, currently solved using heuristics. However, learning to walk trajectories on a network to identify protein complexes leads naturally to a reinforcement learning (RL) approach, a strategy not extensively explored for community detection. Here, we develop and evaluate a reinforcement learning pipeline for community detection on weighted protein-protein interaction networks to detect new protein complexes. The algorithm is trained to calculate the value of different subgraphs encountered while walking on the network to reconstruct known complexes. A distributed prediction algorithm then scales the RL pipeline to search for novel protein complexes on large PPI networks. RESULTS The reinforcement learning pipeline is applied to a human PPI network consisting of 8k proteins and 60k PPI, which results in 1,157 protein complexes. The method demonstrated competitive accuracy with improved speed compared to previous algorithms. We highlight protein complexes such as C4orf19, C18orf21, and KIAA1522 which are currently minimally characterized. Additionally, the results suggest TMC04 be a putative additional subunit of the KICSTOR complex and confirm the involvement of C15orf41 in a higher-order complex with HIRA, CDAN1, ASF1A, and by 3D structural modeling. 
CONCLUSIONS Reinforcement learning offers several distinct advantages for community detection, including scalability and knowledge of the walk trajectories defining those communities. Applied to currently available human protein interaction networks, this method had comparable accuracy with other algorithms and notable savings in computational time, and in turn, led to clear predictions of protein function and interactions for several uncharacterized human proteins.
Affiliation(s)
- Meghana V Palukuri
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
  - Oden Institute for Computational Engineering and Sciences, University of Texas, Austin, TX, 78712, USA
- Ridhi S Patil
  - Department of Biomedical Engineering, University of Texas, Austin, TX, 78712, USA
- Edward M Marcotte
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
  - Oden Institute for Computational Engineering and Sciences, University of Texas, Austin, TX, 78712, USA
42
Potter A, Hangas A, Goffart S, Huynen MA, Cabrera-Orefice A, Spelbrink JN. Uncharacterized protein C17orf80 - a novel interactor of human mitochondrial nucleoids. J Cell Sci 2023; 136:jcs260822. [PMID: 37401363 PMCID: PMC10445727 DOI: 10.1242/jcs.260822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Accepted: 06/26/2023] [Indexed: 07/05/2023] Open
Abstract
Molecular functions of many human proteins remain unstudied, despite the demonstrated association with diseases or pivotal molecular structures, such as mitochondrial DNA (mtDNA). This small genome is crucial for the proper functioning of mitochondria, the energy-converting organelles. In mammals, mtDNA is arranged into macromolecular complexes called nucleoids that serve as functional stations for its maintenance and expression. Here, we aimed to explore an uncharacterized protein C17orf80, which was previously detected close to the nucleoid components by proximity labelling mass spectrometry. To investigate the subcellular localization and function of C17orf80, we took advantage of immunofluorescence microscopy, interaction proteomics and several biochemical assays. We demonstrate that C17orf80 is a mitochondrial membrane-associated protein that interacts with nucleoids even when mtDNA replication is inhibited. In addition, we show that C17orf80 is not essential for mtDNA maintenance and mitochondrial gene expression in cultured human cells. These results provide a basis for uncovering the molecular function of C17orf80 and the nature of its association with nucleoids, possibly leading to new insights about mtDNA and its expression.
Affiliation(s)
- Alisa Potter
  - Department of Pediatrics, Amalia Children's Hospital, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
  - Radboud Center for Mitochondrial Medicine (RCMM), Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
- Anu Hangas
  - Department of Environmental and Biological Sciences, University of Eastern Finland, Joensuu, 80101, Finland
- Steffi Goffart
  - Department of Environmental and Biological Sciences, University of Eastern Finland, Joensuu, 80101, Finland
- Martijn A. Huynen
  - Department of Medical BioSciences, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
- Alfredo Cabrera-Orefice
  - Radboud Center for Mitochondrial Medicine (RCMM), Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
  - Department of Medical BioSciences, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
- Johannes N. Spelbrink
  - Department of Pediatrics, Amalia Children's Hospital, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
  - Radboud Center for Mitochondrial Medicine (RCMM), Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
43
Rocha JJ, Jayaram SA, Stevens TJ, Muschalik N, Shah RD, Emran S, Robles C, Freeman M, Munro S. Functional unknomics: Systematic screening of conserved genes of unknown function. PLoS Biol 2023; 21:e3002222. [PMID: 37552676 PMCID: PMC10409296 DOI: 10.1371/journal.pbio.3002222] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 01/12/2023] [Accepted: 06/27/2023] [Indexed: 08/10/2023] Open
Abstract
The human genome encodes approximately 20,000 proteins, many still uncharacterised. It has become clear that scientific research tends to focus on well-studied proteins, leading to a concern that poorly understood genes are unjustifiably neglected. To address this, we have developed a publicly available and customisable "Unknome database" that ranks proteins based on how little is known about them. We applied RNA interference (RNAi) in Drosophila to 260 unknown genes that are conserved between flies and humans. Knockdown of some genes resulted in loss of viability, and functional screening of the rest revealed hits for fertility, development, locomotion, protein quality control, and resilience to stress. CRISPR/Cas9 gene disruption validated a component of Notch signalling and 2 genes contributing to male fertility. Our work illustrates the importance of poorly understood genes, provides a resource to accelerate future research, and highlights a need to support database curation to ensure that misannotation does not erode our awareness of our own ignorance.
Affiliation(s)
- João J. Rocha
  - MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Tim J. Stevens
  - MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Rajen D. Shah
  - Centre for Mathematical Sciences, University of Cambridge, Cambridge, United Kingdom
- Sahar Emran
  - MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Cristina Robles
  - MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Matthew Freeman
  - MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
  - Sir William Dunn School of Pathology, University of Oxford, Oxford, United Kingdom
- Sean Munro
  - MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
|
44
|
Lanzer JD, Valdeolivas A, Pepin M, Hund H, Backs J, Frey N, Friederich HC, Schultz JH, Saez-Rodriguez J, Levinson RT. A network medicine approach to study comorbidities in heart failure with preserved ejection fraction. BMC Med 2023; 21:267. [PMID: 37488529 PMCID: PMC10367269 DOI: 10.1186/s12916-023-02922-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Accepted: 06/05/2023] [Indexed: 07/26/2023] Open
Abstract
BACKGROUND: Comorbidities are expected to impact the pathophysiology of heart failure (HF) with preserved ejection fraction (HFpEF). However, comorbidity profiles are usually reduced to a few comorbid disorders. Systems medicine approaches can model phenome-wide comorbidity profiles to improve our understanding of HFpEF and infer associated genetic profiles. METHODS: We retrospectively explored 569 comorbidities in 29,047 HF patients, including 8062 HFpEF and 6585 HF with reduced ejection fraction (HFrEF) patients from a German university hospital. We assessed differences in comorbidity profiles between HF subtypes via multiple correspondence analysis. Then, we used machine learning classifiers to identify distinctive comorbidity profiles of HFpEF and HFrEF patients. Moreover, we built a comorbidity network (HFnet) to identify the main disease clusters that summarized the phenome-wide comorbidity. Lastly, we predicted novel gene candidates for HFpEF by linking the HFnet to a multilayer gene network, integrating multiple databases. To corroborate HFpEF candidate genes, we collected transcriptomic data in a murine HFpEF model. We compared predicted genes with the murine disease signature as well as with the literature. RESULTS: We found a high degree of variance between the comorbidity profiles of HFpEF and HFrEF, while each was more similar to HFmrEF. The comorbidities present in HFpEF patients were more diverse than those in HFrEF and included neoplastic, osteologic and rheumatoid disorders. Disease communities in the HFnet captured important comorbidity concepts of HF patients which could be assigned to HF subtypes, age groups, and sex. Based on the HFpEF comorbidity profile, we predicted and recovered gene candidates, including genes involved in fibrosis (COL3A1, LOX, SMAD9, PTHL), hypertrophy (GATA5, MYH7), oxidative stress (NOS1, GSST1, XDH), and endoplasmic reticulum stress (ATF6). Finally, predicted genes were significantly overrepresented in the murine transcriptomic disease signature, providing additional plausibility for their relevance. CONCLUSIONS: We applied systems medicine concepts to analyze comorbidity profiles in a HF patient cohort. We were able to identify disease clusters that helped to characterize HF patients. We derived a distinct comorbidity profile for HFpEF, which was leveraged to suggest novel candidate genes via network propagation. The identification of distinctive comorbidity profiles and candidate genes from routine clinical data provides insights that may be leveraged to improve diagnosis and identify treatment targets for HFpEF patients.
Affiliation(s)
- Jan D Lanzer
- Institute for Computational Biomedicine, Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Bioquant, Heidelberg, Germany.
- Department of General Internal Medicine and Psychosomatics, Heidelberg University Hospital, Heidelberg, Germany.
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany.
- Informatics for Life, Heidelberg, Germany.
| | - Alberto Valdeolivas
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
| | - Mark Pepin
- Institute of Experimental Cardiology, Medical Faculty Heidelberg, Heidelberg University, Heidelberg, Germany
- DZHK (German Centre for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Im Neuenheimer Feld 669, 69120, Heidelberg, Germany
| | - Hauke Hund
- Department of Cardiology, Internal Medicine III, Heidelberg University Hospital, Heidelberg, Germany
| | - Johannes Backs
- Institute of Experimental Cardiology, Medical Faculty Heidelberg, Heidelberg University, Heidelberg, Germany
- DZHK (German Centre for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Im Neuenheimer Feld 669, 69120, Heidelberg, Germany
| | - Norbert Frey
- Department of Cardiology, Internal Medicine III, Heidelberg University Hospital, Heidelberg, Germany
| | - Hans-Christoph Friederich
- Department of General Internal Medicine and Psychosomatics, Heidelberg University Hospital, Heidelberg, Germany
- Informatics for Life, Heidelberg, Germany
| | - Jobst-Hendrik Schultz
- Department of General Internal Medicine and Psychosomatics, Heidelberg University Hospital, Heidelberg, Germany
- Informatics for Life, Heidelberg, Germany
| | - Julio Saez-Rodriguez
- Institute for Computational Biomedicine, Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Bioquant, Heidelberg, Germany
- Informatics for Life, Heidelberg, Germany
| | - Rebecca T Levinson
- Institute for Computational Biomedicine, Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Bioquant, Heidelberg, Germany.
- Department of General Internal Medicine and Psychosomatics, Heidelberg University Hospital, Heidelberg, Germany.
- Informatics for Life, Heidelberg, Germany.
|
45
|
Kratz A, Kim M, Kelly MR, Zheng F, Koczor CA, Li J, Ono K, Qin Y, Churas C, Chen J, Pillich RT, Park J, Modak M, Collier R, Licon K, Pratt D, Sobol RW, Krogan NJ, Ideker T. A multi-scale map of protein assemblies in the DNA damage response. Cell Syst 2023; 14:447-463.e8. [PMID: 37220749 PMCID: PMC10330685 DOI: 10.1016/j.cels.2023.04.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/15/2021] [Revised: 01/30/2023] [Accepted: 04/25/2023] [Indexed: 05/25/2023]
Abstract
The DNA damage response (DDR) ensures error-free DNA replication and transcription and is disrupted in numerous diseases. An ongoing challenge is to determine the proteins orchestrating DDR and their organization into complexes, including constitutive interactions and those responding to genomic insult. Here, we use multi-conditional network analysis to systematically map DDR assemblies at multiple scales. Affinity purifications of 21 DDR proteins, with/without genotoxin exposure, are combined with multi-omics data to reveal a hierarchical organization of 605 proteins into 109 assemblies. The map captures canonical repair mechanisms and proposes new DDR-associated proteins extending to stress, transport, and chromatin functions. We find that protein assemblies closely align with genetic dependencies in processing specific genotoxins and that proteins in multiple assemblies typically act in multiple genotoxin responses. Follow-up by DDR functional readouts newly implicates 12 assembly members in double-strand-break repair. The DNA damage response assemblies map is available for interactive visualization and query (ccmi.org/ddram/).
Affiliation(s)
- Anton Kratz
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA
| | - Minkyu Kim
- University of California San Francisco, Department of Cellular and Molecular Pharmacology, San Francisco, CA 94158, USA; The J. David Gladstone Institute of Data Science and Biotechnology, San Francisco, CA 94158, USA; Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA; University of Texas Health Science Center San Antonio, Department of Biochemistry and Structural Biology, San Antonio, TX 78229, USA
| | - Marcus R Kelly
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Fan Zheng
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA
| | - Christopher A Koczor
- University of South Alabama, Department of Pharmacology and Mitchell Cancer Institute, Mobile, AL 36604, USA
| | - Jianfeng Li
- University of South Alabama, Department of Pharmacology and Mitchell Cancer Institute, Mobile, AL 36604, USA
| | - Keiichiro Ono
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Yue Qin
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Christopher Churas
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Jing Chen
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Rudolf T Pillich
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Jisoo Park
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA
| | - Maya Modak
- University of California San Francisco, Department of Cellular and Molecular Pharmacology, San Francisco, CA 94158, USA; The J. David Gladstone Institute of Data Science and Biotechnology, San Francisco, CA 94158, USA; Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA
| | - Rachel Collier
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Kate Licon
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Dexter Pratt
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA
| | - Robert W Sobol
- University of South Alabama, Department of Pharmacology and Mitchell Cancer Institute, Mobile, AL 36604, USA; Brown University, Department of Pathology and Laboratory Medicine and Legorreta Cancer Center, Providence, RI 02903, USA.
| | - Nevan J Krogan
- University of California San Francisco, Department of Cellular and Molecular Pharmacology, San Francisco, CA 94158, USA; The J. David Gladstone Institute of Data Science and Biotechnology, San Francisco, CA 94158, USA; Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA.
| | - Trey Ideker
- University of California San Diego, Department of Medicine, San Diego, CA 92093, USA; The Cancer Cell Map Initiative, San Francisco and La Jolla, CA, USA.
|
46
|
Meng K, Lu S, Li Y, Hu L, Zhang J, Cao Y, Wang Y, Zhang CZ, He Q. LINC00493-encoded microprotein SMIM26 exerts anti-metastatic activity in renal cell carcinoma. EMBO Rep 2023; 24:e56282. [PMID: 37009826 PMCID: PMC10240204 DOI: 10.15252/embr.202256282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 03/08/2023] [Accepted: 03/17/2023] [Indexed: 04/04/2023] Open
Abstract
Human microproteins encoded by long non-coding RNAs (lncRNAs) are increasingly being discovered; however, functional characterization of these emerging proteins remains sparse. Here, we show that LINC00493-encoded SMIM26, an understudied microprotein localized in mitochondria, tends to be downregulated in clear cell renal cell carcinoma (ccRCC) and is correlated with poor overall survival. LINC00493 is recognized by RNA-binding protein PABPC4 and transferred to ribosomes for translation of a 95-amino-acid protein SMIM26. SMIM26, but not LINC00493, suppresses ccRCC growth and metastatic lung colonization by interacting with acylglycerol kinase (AGK) and glutathione transport regulator SLC25A11 via its N-terminus. This interaction increases the mitochondrial localization of AGK and subsequently inhibits AGK-mediated AKT phosphorylation. Moreover, the formation of the SMIM26-AGK-SLC25A11 complex maintains mitochondrial glutathione import and respiratory efficiency, which is abrogated by AGK overexpression or SLC25A11 knockdown. This study functionally characterizes the LINC00493-encoded microprotein SMIM26 and establishes its anti-metastatic role in ccRCC, and therefore illuminates the importance of hidden proteins in human cancers.
Affiliation(s)
- Kun Meng
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, China
- The First Affiliated Hospital of Jinan University and MOE Key Laboratory of Tumor Molecular Biology, Jinan University, Guangzhou, China
| | - Shaohua Lu
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, China
- Sino-French Hoffmann Institute, School of Basic Medical Sciences, State Key Laboratory of Respiratory Disease, Guangzhou Medical University, Guangzhou, China
| | - Yu-Ying Li
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, China
| | - Li-Ling Hu
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, China
| | - Jing Zhang
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, China
- The First Affiliated Hospital of Jinan University and MOE Key Laboratory of Tumor Molecular Biology, Jinan University, Guangzhou, China
| | - Yun Cao
- Department of Pathology, State Key Laboratory of Oncology in South China, Sun Yat-sen University Cancer Center, Guangzhou, China
| | - Yang Wang
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, China
| | - Chris Zhiyi Zhang
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, China
| | - Qing-Yu He
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, China
- The First Affiliated Hospital of Jinan University and MOE Key Laboratory of Tumor Molecular Biology, Jinan University, Guangzhou, China
|
47
|
Liu L, Jones BF, Uzzi B, Wang D. Data, measurement and empirical methods in the science of science. Nat Hum Behav 2023:10.1038/s41562-023-01562-4. [PMID: 37264084 DOI: 10.1038/s41562-023-01562-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/30/2022] [Accepted: 02/17/2023] [Indexed: 06/03/2023]
Abstract
The advent of large-scale datasets that trace the workings of science has encouraged researchers from many different disciplinary backgrounds to turn scientific methods into science itself, cultivating a rapidly expanding 'science of science'. This Review considers this growing, multidisciplinary literature through the lens of data, measurement and empirical methods. We discuss the purposes, strengths and limitations of major empirical approaches, seeking to increase understanding of the field's diverse methodologies and expand researchers' toolkits. Overall, new empirical developments provide enormous capacity to test traditional beliefs and conceptual frameworks about science, discover factors associated with scientific productivity, predict scientific outcomes and design policies that facilitate scientific progress.
Affiliation(s)
- Lu Liu
- Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA
- Kellogg School of Management, Northwestern University, Evanston, IL, USA
- College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, USA
| | - Benjamin F Jones
- Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA
- Kellogg School of Management, Northwestern University, Evanston, IL, USA
- National Bureau of Economic Research, Cambridge, MA, USA
- Brookings Institution, Washington, DC, USA
| | - Brian Uzzi
- Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA
- Kellogg School of Management, Northwestern University, Evanston, IL, USA
| | - Dashun Wang
- Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA.
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA.
- Kellogg School of Management, Northwestern University, Evanston, IL, USA.
- McCormick School of Engineering, Northwestern University, Evanston, IL, USA.
|
48
|
Tuncay A, Crabtree DR, Muggeridge DJ, Husi H, Cobley JN. Performance benchmarking microplate-immunoassays for quantifying target-specific cysteine oxidation reveals their potential for understanding redox-regulation and oxidative stress. Free Radic Biol Med 2023; 204:252-265. [PMID: 37192685 DOI: 10.1016/j.freeradbiomed.2023.05.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/02/2023] [Revised: 04/24/2023] [Accepted: 05/05/2023] [Indexed: 05/18/2023]
Abstract
The antibody-linked oxi-state assay (ALISA) for quantifying target-specific cysteine oxidation can benefit specialist and non-specialist users. Specialists can benefit from time-efficient analysis and high-throughput target and/or sample n-plex capacities. The simple and accessible "off-the-shelf" nature of ALISA brings the benefits of oxidative damage assays to non-specialists studying redox-regulation. Until performance benchmarking establishes confidence in the "unseen" microplate results, ALISA is unlikely to be widely adopted. Here, we implemented pre-set pass/fail criteria to benchmark ALISA by evaluating immunoassay performance in diverse contexts. ELISA-mode ALISA assays were accurate, reliable, and sensitive. For example, the average inter-assay CV for detecting 20%- and 40%-oxidised PRDX2 or GAPDH standards was 4.6% (range: 3.6-7.4%). ALISA displayed target-specificity. Immunodepleting the target decreased the signal by ∼75%. Single-antibody formatted ALISA failed to quantify the matrix-facing alpha subunit of the mitochondrial ATP synthase. However, RedoxiFluor quantified the alpha subunit displaying exceptional performance in the single-antibody format. ALISA discovered that (1) monocyte-to-macrophage differentiation amplified PRDX2-oxidation in THP-1 cells and (2) exercise increased GAPDH-specific oxidation in human erythrocytes. The "unseen" microplate data were "seen-to-be-believed" via orthogonal visually displayed immunoassays like the dimer method. Finally, we established target (n = 3) and sample (n = 100) n-plex capacities in ∼4 h with 50-70 min hands-on time. Our work showcases the potential of ALISA to advance our understanding of redox-regulation and oxidative stress.
Affiliation(s)
- Ahmet Tuncay
- Division of Biomedical Science, Life Science Innovation Centre, University of the Highlands and Islands, Inverness, IV2 5NA, Scotland, UK
| | - Daniel R Crabtree
- Division of Biomedical Science, Life Science Innovation Centre, University of the Highlands and Islands, Inverness, IV2 5NA, Scotland, UK
| | | | - Holger Husi
- Division of Biomedical Science, Life Science Innovation Centre, University of the Highlands and Islands, Inverness, IV2 5NA, Scotland, UK
| | - James N Cobley
- Division of Biomedical Science, Life Science Innovation Centre, University of the Highlands and Islands, Inverness, IV2 5NA, Scotland, UK; Cysteine Redox Technology Group, Life Science Innovation Centre, University of the Highlands and Islands, Inverness, IV2 5NA, Scotland, UK.
|
49
|
Elfmann C, Stülke J. PAE viewer: a webserver for the interactive visualization of the predicted aligned error for multimer structure predictions and crosslinks. Nucleic Acids Res 2023:7151339. [PMID: 37140053 DOI: 10.1093/nar/gkad350] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 03/05/2023] [Revised: 04/14/2023] [Accepted: 04/21/2023] [Indexed: 05/05/2023] Open
Abstract
The development of AlphaFold for protein structure prediction has opened a new era in structural biology. This is even more the case for AlphaFold-Multimer for the prediction of protein complexes. The interpretation of these predictions has become more important than ever, but it is difficult for the non-specialist. While an evaluation of the prediction quality is provided for monomeric protein predictions by the AlphaFold Protein Structure Database, such a tool is missing for predicted complex structures. Here, we present the PAE Viewer webserver (http://www.subtiwiki.uni-goettingen.de/v4/paeViewerDemo), an online tool for the integrated visualization of predicted protein complexes using a 3D structure display combined with an interactive representation of the Predicted Aligned Error (PAE). This metric allows an estimation of the quality of the prediction. Importantly, our webserver also allows the integration of experimental cross-linking data which helps to interpret the reliability of the structure predictions. With the PAE Viewer, the user obtains a unique online tool which for the first time allows the intuitive evaluation of the PAE for protein complex structure predictions with integrated crosslinks.
Affiliation(s)
- Christoph Elfmann
- Department of General Microbiology, Georg-August-University Göttingen, GZMB, 37077 Göttingen, Germany
| | - Jörg Stülke
- Department of General Microbiology, Georg-August-University Göttingen, GZMB, 37077 Göttingen, Germany
|
50
|
Messner CB, Demichev V, Muenzner J, Aulakh SK, Barthel N, Röhl A, Herrera-Domínguez L, Egger AS, Kamrad S, Hou J, Tan G, Lemke O, Calvani E, Szyrwiel L, Mülleder M, Lilley KS, Boone C, Kustatscher G, Ralser M. The proteomic landscape of genome-wide genetic perturbations. Cell 2023; 186:2018-2034.e21. [PMID: 37080200 PMCID: PMC7615649 DOI: 10.1016/j.cell.2023.03.026] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 05/17/2022] [Revised: 01/20/2023] [Accepted: 03/21/2023] [Indexed: 04/22/2023]
Abstract
Functional genomic strategies have become fundamental for annotating gene function and regulatory networks. Here, we combined functional genomics with proteomics by quantifying protein abundances in a genome-scale knockout library in Saccharomyces cerevisiae, using data-independent acquisition mass spectrometry. We find that global protein expression is driven by a complex interplay of (1) general biological properties, including translation rate, protein turnover, the formation of protein complexes, growth rate, and genome architecture, followed by (2) functional properties, such as the connectivity of a protein in genetic, metabolic, and physical interaction networks. Moreover, we show that functional proteomics complements current gene annotation strategies through the assessment of proteome profile similarity, protein covariation, and reverse proteome profiling. Thus, our study reveals principles that govern protein expression and provides a genome-spanning resource for functional annotation.
Affiliation(s)
- Christoph B Messner
- The Francis Crick Institute, Molecular Biology of Metabolism Laboratory, London NW1 1AT, UK; Precision Proteomics Center, Swiss Institute of Allergy and Asthma Research (SIAF), University of Zurich, 7265 Davos, Switzerland
| | - Vadim Demichev
- The Francis Crick Institute, Molecular Biology of Metabolism Laboratory, London NW1 1AT, UK; Charité Universitätsmedizin Berlin, Department of Biochemistry, 10117 Berlin, Germany; Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Cambridge CB2 1QW, UK
| | - Julia Muenzner
- Charité Universitätsmedizin Berlin, Department of Biochemistry, 10117 Berlin, Germany
| | - Simran K Aulakh
- The Francis Crick Institute, Molecular Biology of Metabolism Laboratory, London NW1 1AT, UK
| | - Natalie Barthel
- Charité Universitätsmedizin Berlin, Department of Biochemistry, 10117 Berlin, Germany
| | - Annika Röhl
- Charité Universitätsmedizin Berlin, Department of Biochemistry, 10117 Berlin, Germany
| | | | - Anna-Sophia Egger
- The Francis Crick Institute, Molecular Biology of Metabolism Laboratory, London NW1 1AT, UK
| | - Stephan Kamrad
- The Francis Crick Institute, Molecular Biology of Metabolism Laboratory, London NW1 1AT, UK
| | - Jing Hou
- The Donnelly Centre, University of Toronto, Toronto, ON M5S3E1, Canada
| | - Guihong Tan
- The Donnelly Centre, University of Toronto, Toronto, ON M5S3E1, Canada
| | - Oliver Lemke
- Charité Universitätsmedizin Berlin, Department of Biochemistry, 10117 Berlin, Germany
| | - Enrica Calvani
- The Francis Crick Institute, Molecular Biology of Metabolism Laboratory, London NW1 1AT, UK
| | - Lukasz Szyrwiel
- The Francis Crick Institute, Molecular Biology of Metabolism Laboratory, London NW1 1AT, UK; Charité Universitätsmedizin Berlin, Department of Biochemistry, 10117 Berlin, Germany
| | - Michael Mülleder
- Charité Universitätsmedizin, Core Facility - High Throughput Mass Spectrometry, 10117 Berlin, Germany
| | - Kathryn S Lilley
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Cambridge CB2 1QW, UK
| | - Charles Boone
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S3E1, Canada; The Donnelly Centre, University of Toronto, Toronto, ON M5S3E1, Canada; RIKEN Center for Sustainable Resource Science, Wako, 351-0198 Saitama, Japan
| | - Georg Kustatscher
- Wellcome Centre for Cell Biology, University of Edinburgh, Max Born Crescent, Edinburgh EH9 3BF, Scotland, UK.
| | - Markus Ralser
- The Francis Crick Institute, Molecular Biology of Metabolism Laboratory, London NW1 1AT, UK; Charité Universitätsmedizin Berlin, Department of Biochemistry, 10117 Berlin, Germany; The Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany.
|