1
Kumar A. Proteogenomics for Non-model Ocean-Derived Fungi. Methods Mol Biol 2025; 2859:197-210. [PMID: 39436603 DOI: 10.1007/978-1-0716-4152-1_11] [Indexed: 10/23/2024]
Abstract
Most biological and biomedical experiments are designed and studied using the most common model organisms (MOs) like humans, mice, Escherichia coli, Saccharomyces cerevisiae, Neurospora crassa, worms, fruit flies, zebrafish, and Arabidopsis thaliana. These model organisms have been extensively studied and have a well-established set of genetic, physiological, and other tools available for research. In contrast, non-model organisms (NMOs) are those that are not traditionally used in scientific research and do not have a well-established set of genetic or other biological tools available for their study. The majority of MOs are associated with land habitats but rarely with ocean environments. The ocean forms the largest portion of our planet, yet ocean-derived organisms are the least explored, and these organisms are primarily NMOs. However, these are thrilling living entities, such as ocean-derived fungi (ODF). These ODFs are a diverse group of fungi that live in different ocean sectors, including the ocean, estuaries, and coastal ecosystems. These fungi are found to colonize and adapt to different substrates. They are important decomposers in marine ecosystems, breaking down dead organic matter and recycling nutrients. ODFs have adapted to survive in the unique and challenging conditions of the ocean environment, including high salt concentrations, low nutrient availability, and exposure to waves and currents. ODFs are potent producers of natural compounds with pharmaceutical and industrial applications, such as antibiotics, anticancer agents, antivirals, and enzymes for industrial processes. ODFs are an exciting group of fungi; however, these are the least studied because of the nonavailability of MOs from this group. 
Hence, there is a massive scope of expanding our current knowledge about ODFs, their genetic traits, potential future drug-producing capabilities, and lifestyle traits. With the advent of next-generation DNA sequencing, there is huge potential for the characterization of the genetic material of ODF as NMOs. Parallel proteomic methods also hold huge potential. A marriage of NGS and proteomic methods generates a new avenue called proteogenomics, which focuses on better annotation of existing genomic data. Both methods are getting cheaper and more accessible to the research community for studying the proteogenomics of NMOs. Herein, the proteogenomic protocol development and data analyses are illustrated for the ocean-derived fungus Scopulariopsis brevicaulis.
Affiliation(s)
- Abhishek Kumar
  - Manipal Academy of Higher Education (MAHE), Manipal & Institute of Bioinformatics, Bangalore, India.
2
Singer F, Kuhring M, Renard BY, Muth T. Moving Toward Metaproteogenomics: A Computational Perspective on Analyzing Microbial Samples via Proteogenomics. Methods Mol Biol 2025; 2859:297-318. [PMID: 39436609 DOI: 10.1007/978-1-0716-4152-1_17] [Indexed: 10/23/2024]
Abstract
Microbial sample analysis has received growing attention within the last decade, driven by important findings in microbiome research and promising applications in the biotechnological field. Modern mass spectrometry-based methodology has been established in this context, providing sufficient sensitivity, resolution, dynamic range, and throughput to analyze the so-called metaproteome of complex microbial mixtures from clinical or environmental samples. While proteomic analyses were previously restricted to common model organisms, next-generation sequencing technologies nowadays allow for the rapid and cost-efficient characterization of whole metagenomes of microbial consortia and specific genomes from non-model organisms to which microbes contribute by significant amounts. This proteogenomic approach, meaning the combined application of genomic and proteomic methods, enables researchers to create a protein database that presents a tailored blueprint of the microbial sample under investigation. This contribution provides an overview of the computational challenges and opportunities in proteogenomics and metaproteomics as of January 2018. For practical application, we first showcase an integrative proteogenomic method that circumvents existing reference databases by creating sample-specific transcripts. The underlying algorithm uses a graph network approach that combines RNA-Seq and peptide information. As a second example, we provide a tutorial for a simulation tool that estimates the computational limits of detecting microbial non-model organisms. This method evaluates the potential influence of error-tolerant searches and proteogenomic approaches on databases of interest. Finally, we discuss recommendations for developing future strategies that may help overcome present limitations by combining the strengths of genome- and proteome-based methods and moving toward an integrated metaproteogenomics approach.
Affiliation(s)
- Franziska Singer
  - NEXUS Personalized Health Technologies, ETH Zürich, Zürich, Switzerland
  - Research Group Bioinformatics (NG4), Robert Koch Institute, Berlin, Germany
- Mathias Kuhring
  - Core Unit Bioinformatics, Berlin Institute of Health (BIH) at Charité, Berlin, Germany
- Bernhard Y Renard
  - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, Germany.
  - Bioinformatics Unit, Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany.
- Thilo Muth
  - Domain Data Competence Center (MF2), Department for Research Infrastructure and Information Technology, Robert Koch Institute, Berlin, Germany
3
Sommer B, Jaeger-Honz S. From Gene to Whole Cell: Modeling, Visualization, and Analysis. Methods Mol Biol 2025; 2859:65-92. [PMID: 39436597 DOI: 10.1007/978-1-0716-4152-1_5] [Indexed: 10/23/2024]
Abstract
Proteogenomics combines proteomic and genetic data to gain new insights in molecular mechanisms. Here, we extend this approach toward structural biology from a tool perspective. The chapter starts with tools which can be used to explore genetic information and then enrich those with proteomic data. Based on the corresponding identifiers, three-dimensional structures of proteins are identified and used to embed them in their molecular environment, here the surrounding membrane. This membrane is then mapped onto the surface of an interpretative three-dimensional cell model. Then, the embedded protein and the cell environment are associated with a metabolic pathway, again based on the identifiers provided by biomedical databases. Accompanying the different chapters, related work is discussed which can alternatively be used. Finally, an outlook toward immersive analytics is given.
Affiliation(s)
- Bjorn Sommer
  - Innovation Design Engineering, School of Design, Royal College of Art, London, UK.
- Sabrina Jaeger-Honz
  - Life Science Informatics, Department of Computer and Information Science, University of Konstanz, Konstanz, Germany
4
Parveen A, Kumar A. Introduction to Integrated Proteogenomic Pipeline for Dealing with Pathogenic Missense SNPs. Methods Mol Biol 2025; 2859:93-107. [PMID: 39436598 DOI: 10.1007/978-1-0716-4152-1_6] [Indexed: 10/23/2024]
Abstract
Proteogenomics is a multi-omics setup combining mass spectrometry and next-generation sequencing (NGS) technologies (using genomics and/or transcriptomics) with the main aims of improving genome annotation and facilitating characterization of proteo-isoforms. However, working with a proteogenomic approach is very challenging, as it involves generating multi-omics data and integrating these data to interpret results for biological or clinical implications. There is an urgent need for the development of protocols for integrated proteogenomics approaches. Genome resequencing yields massive data for missense single-nucleotide polymorphisms (SNPs), and the pathogenic nature of SNPs is not yet fully covered by proteogenomic approaches. In this chapter, we present such a protocol for dealing with pathogenic missense SNPs using an integrated proteogenomics pipeline combining several steps: DNA-Seq, RNA-Seq, mass spectrometry (MS), making customized databases of produced datasets, and screening and filtering for useful MS spectra. This protocol also provides users with tricks and tips for modifications, based on the requirements of the projects.
Affiliation(s)
- Alisha Parveen
  - Manipal Academy of Higher Education (MAHE), Manipal & Institute of Bioinformatics, Bangalore, India
  - , Manipal, India
- Abhishek Kumar
  - Manipal Academy of Higher Education (MAHE), Manipal & Institute of Bioinformatics, Bangalore, India.
  - , Manipal, India.
5
Liang W, Zhu Z, Zheng C. Application of Proteomics Technology Based on LC-MS Combined with Western Blotting and Co-IP in Antiviral Innate Immunity. Methods Mol Biol 2025; 2854:93-106. [PMID: 39192122 DOI: 10.1007/978-1-0716-4108-8_11] [Indexed: 08/29/2024]
Abstract
As an interferon-stimulating factor protein, STING plays a role in the response and downstream liaison in antiviral natural immunity. Upon viral invasion, the immediate response of STING protein leads to a series of changes in downstream proteins, which ultimately leads to an antiviral immune response in the form of proinflammatory cytokines and type I interferons, thus triggering an innate immune response, an adaptive immune response in vivo, and long-term protection of the host. In the field of antiviral natural immunity, it is particularly important to rigorously and sequentially probe the dynamic changes in the antiviral natural immunity connector protein STING caused by the entire anti-inflammatory and anti-pathway mechanism and the differences in upstream and downstream proteins. Traditionally, proteomics technology has been validated by detecting proteins in a 2D platform, for which it is difficult to sensitively identify changes in the nature and abundance of target proteins. With the development of mass spectrometry (MS) technology, MS-based proteomics has made important contributions to characterizing the dynamic changes in the natural immune proteome induced by viral infections. MS analytical techniques have several advantages, such as high throughput, rapidity, sensitivity, accuracy, and automation. The most common techniques for detecting complex proteomes are liquid chromatography (LC) and mass spectrometry (MS). LC-MS (Liquid Chromatography-Mass Spectrometry), which combines the physical separation capability of LC and the mass analysis capability of MS, is a powerful technique mainly used for analyzing the proteome of cells, tissues, and body fluids. 
Here, we explore the combination of traditional proteomics techniques, such as Western blotting and Co-IP (co-immunoprecipitation), with the latest LC-MS methods to probe the anti-inflammatory pathway and the differential changes in upstream and downstream proteins induced by the antiviral natural immune junction protein STING.
Affiliation(s)
- Weizheng Liang
  - Central Laboratory, The First Affiliated Hospital of Hebei North University, Zhangjiakou, Hebei, China
  - Department of General Surgery, The First Affiliated Hospital of Hebei North University, Zhangjiakou, Hebei, China
- Zhenpeng Zhu
  - Department of General Surgery, The First Affiliated Hospital of Hebei North University, Zhangjiakou, Hebei, China
- Chunfu Zheng
  - Department of Microbiology, Immunology and Infectious Diseases, University of Calgary, Calgary, AB, Canada
6
Kumar P, Johnson JE, McGowan T, Chambers MC, Heydarian M, Mehta S, Easterly C, Griffin TJ, Jagtap PD. Discovering Novel Proteoforms Using Proteogenomic Workflows Within the Galaxy Bioinformatics Platform. Methods Mol Biol 2025; 2859:109-128. [PMID: 39436599 DOI: 10.1007/978-1-0716-4152-1_7] [Indexed: 10/23/2024]
Abstract
Proteogenomics is a growing "multi-omics" research area that combines mass spectrometry-based proteomics and high-throughput nucleotide sequencing technologies. Proteogenomics has helped in genomic annotation for organisms whose complete genome sequences became available by using high-throughput DNA sequencing technologies. Apart from genome annotation, this multi-omics approach has also helped researchers confirm expression of variant proteins belonging to unique proteoforms that could have resulted from single-nucleotide polymorphism (SNP), insertion and deletions (Indels), splice isoforms, or other genome or transcriptome variations. A proteogenomic study depends on a multistep informatics workflow, requiring different software at each step. These integrated steps include creating an appropriate protein sequence database, matching spectral data against these sequences, and finally identifying peptide sequences corresponding to novel proteoforms followed by variant classification and functional analysis. The disparate software required for a proteogenomic study is difficult for most researchers to access and use, especially those lacking computational expertise. Furthermore, using them disjointedly can be error-prone as it requires setting up individual parameters for each software. Consequently, reproducibility suffers. Managing output files from each software is an additional challenge. One solution for these challenges in proteogenomics is the open-source Web-based computational platform Galaxy. Its capability to create and manage workflows comprised of disparate software while recording and saving all important parameters promotes both usability and reproducibility. Here, we describe a workflow that can perform proteogenomic analysis on a Galaxy-based platform.
This Galaxy workflow facilitates matching of spectral data with a customized protein sequence database, identifying novel protein variants, assessing quality of results, and classifying variants along with visualization against the genome.
Affiliation(s)
- Praveen Kumar
  - Data Sciences & Quantitative Biology, Discovery Sciences, AstraZeneca, Waltham, MA, USA
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
  - Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN, USA
- James E Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, USA
- Thomas McGowan
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, USA
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Caleb Easterly
  - Carolina Population Center, University of North Carolina, Chapel Hill, NC, USA
- Timothy J Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA.
7
Ren Y, Yue Y, Li X, Weng S, Xu H, Liu L, Cheng Q, Luo P, Zhang T, Liu Z, Han X. Proteogenomics offers a novel avenue in neoantigen identification for cancer immunotherapy. Int Immunopharmacol 2024; 142:113147. [PMID: 39270345 DOI: 10.1016/j.intimp.2024.113147] [Received: 05/11/2024] [Revised: 08/11/2024] [Accepted: 09/08/2024] [Indexed: 09/15/2024]
Abstract
Cancer neoantigens are tumor-specific non-synonymous mutant peptides that activate the immune system to produce an anti-tumor response. Personalized cancer vaccines based on neoantigens are currently one of the most promising therapeutic approaches for cancer treatment. By utilizing the unique mutations within each patient's tumor, these vaccines aim to elicit a strong and specific immune response against cancer cells. However, the identification of neoantigens remains challenging due to the low accuracy of current prediction tools and the high false-positive rate of candidate neoantigens. Since the concept of "proteogenomics" emerged in 2004, it has evolved rapidly with the increased sequencing depth of next-generation sequencing technologies and the maturation of mass spectrometry-based proteomics technologies to become a more comprehensive approach to neoantigen identification, allowing the discovery of high-confidence candidate neoantigens. In this review, we summarize the reasons why cancer neoantigens have become attractive targets for immunotherapy, the mechanism of cancer vaccines, and the advances in cancer immunotherapy. Considerations relevant to the application of emerging proteogenomics technologies for neoantigen identification and challenges in this field are described.
Affiliation(s)
- Yuqing Ren
  - Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China; Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Yi Yue
  - Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Xinyang Li
  - Department of Oncology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Siyuan Weng
  - Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Hui Xu
  - Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Long Liu
  - Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Quan Cheng
  - Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China
- Peng Luo
  - Department of Oncology, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Tengfei Zhang
  - Department of Oncology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China.
- Zaoqu Liu
  - Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100730, China.
- Xinwei Han
  - Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China; Interventional Institute of Zhengzhou University, Zhengzhou, Henan 450052, China; Interventional Treatment and Clinical Research Center of Henan Province, Zhengzhou, Henan 450052, China.
8
Ghanegolmohammadi F, Eslami M, Ohya Y. Systematic data analysis pipeline for quantitative morphological cell phenotyping. Comput Struct Biotechnol J 2024; 23:2949-2962. [PMID: 39104709 PMCID: PMC11298594 DOI: 10.1016/j.csbj.2024.07.012] [Received: 05/06/2024] [Revised: 07/09/2024] [Accepted: 07/10/2024] [Indexed: 08/07/2024] Open
Abstract
Quantitative morphological phenotyping (QMP) is an image-based method used to capture morphological features at both the cellular and population level. Its interdisciplinary nature, spanning from data collection to result analysis and interpretation, can lead to uncertainties, particularly among those new to this actively growing field. High analytical specificity for a typical QMP is achieved through sophisticated approaches that can leverage subtle cellular morphological changes. Here, we outline a systematic workflow to refine the QMP methodology. For a practical review, we describe the main steps of a typical QMP; in each step, we discuss the available methods, their applications, advantages, and disadvantages, along with the R functions and packages for easy implementation. This review does not cover theoretical backgrounds, but provides several references for interested researchers. It aims to broaden the horizons for future phenome studies and demonstrate how to exploit years of endeavors to achieve more with less.
Affiliation(s)
- Farzan Ghanegolmohammadi
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
- Mohammad Eslami
  - Harvard Ophthalmology AI Lab, Schepen’s Eye Research Institute of Massachusetts Eye and Ear Infirmary, Harvard Medical School, Boston, USA
- Yoshikazu Ohya
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
9
Raj A, Aggarwal S, Singh P, Yadav AK, Dash D. PgxSAVy: A tool for comprehensive evaluation of variant peptide quality in proteogenomics - catching the (un)usual suspects. Comput Struct Biotechnol J 2024; 23:711-722. [PMID: 38292474 PMCID: PMC10825656 DOI: 10.1016/j.csbj.2023.12.033] [Received: 10/04/2023] [Revised: 12/19/2023] [Accepted: 12/23/2023] [Indexed: 02/01/2024] Open
Abstract
Variant peptides resulting from single nucleotide polymorphisms (SNPs) can lead to aberrant protein functions and have translational potential for disease diagnosis and personalized therapy. Variant peptides detected by proteogenomics are fraught with high number of false positives, but there is no uniform and comprehensive approach to assess variant quality across analysis pipelines. Despite class-specific FDR along with ad-hoc filters, the problem is far from solved. These protocols are typically manual and tedious, and thus not uniform across labs. We demonstrate that variant peptide rescoring, integrated with intensity, variant event information and search result features, allows better discrimination of correct variant peptides. Implemented into PgxSAVy - a tool for quality control of variant peptides, this method can tackle the high rate of false positives. PgxSAVy provides a rigorous framework for quality control and annotations of variant peptides on the basis of (i) variant quality, (ii) isobaric masses, and (iii) disease annotation. PgxSAVy demonstrated high accuracy by identifying true variants with 98.43% accuracy on simulated data. Large-scale proteogenomic reanalysis of ∼2.8 million spectra (PXD004010 and PXD001468) resulted in 12,705 variant peptide spectrum matches (PSMs), of which PgxSAVy evaluated 3028 (23.8%), 1409 (11.1%) and 8268 (65.1%) as confident, semi-confident and doubtful respectively. PgxSAVy also annotates the variants based on their pathogenicity and provides support for assisted manual validation. The analysis of proteins carrying variants can provide fine granularity in discovering important pathways. PgxSAVy will advance personalized medicine by providing a comprehensive framework for quality control and prioritization of proteogenomics variants. PgxSAVy is freely available at https://pgxsavy.igib.res.in/ as a webserver and https://github.com/anuragraj/PgxSAVy as a stand-alone tool.
Affiliation(s)
- Anurag Raj
  - G. N. Ramachandran Knowledge Centre for Genomics Informatics, CSIR – Institute of Genomics and Integrative Biology, New Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Suruchi Aggarwal
  - Computational and Mathematical Biology Centre (CMBC), 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
  - Centre for Drug Discovery (CDD), 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
  - Centre for Microbial Research (CMR), Translational Health Science and Technology Institute, NCR Biotech Science Cluster, 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Prateek Singh
  - G. N. Ramachandran Knowledge Centre for Genomics Informatics, CSIR – Institute of Genomics and Integrative Biology, New Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Amit Kumar Yadav
  - Computational and Mathematical Biology Centre (CMBC), 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
  - Centre for Drug Discovery (CDD), 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
  - Centre for Microbial Research (CMR), Translational Health Science and Technology Institute, NCR Biotech Science Cluster, 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Debasis Dash
  - G. N. Ramachandran Knowledge Centre for Genomics Informatics, CSIR – Institute of Genomics and Integrative Biology, New Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
10
Ragucci S, Landi N, Di Maro A. Myoglobin as a molecular biomarker for meat authentication and traceability. Food Chem 2024; 458:140326. [PMID: 38970962 DOI: 10.1016/j.foodchem.2024.140326] [Received: 01/30/2024] [Revised: 06/17/2024] [Accepted: 07/02/2024] [Indexed: 07/08/2024]
Abstract
The global incidence of economically motivated meat adulteration represents a crucial issue for the food industry. Undeclared addition of cheaper or low-quality species to meat products of high commercial value has become a common practice that needs to be countered with specific measures. In this framework, myoglobin (Mb) is a sarcoplasmic haemoprotein, primarily responsible for meat colour and has been successfully used in meat fraud authentication. Mb is highly soluble in water, easily monitored at 409 nm and species-specific. Knowing that various analytical DNA-based and protein-based methods, as well as spectroscopic techniques have been developed over the years for the detection of meat fraud, the aim of the present review is to take stock of the situation regarding the possible use of Mb as a molecular biomarker for the easy and rapid detection of undeclared species in meat products, avoiding the need of sophisticated or expensive equipment and specialised operators.
Affiliation(s)
- Sara Ragucci
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies (DiSTABiF), University of Campania 'Luigi Vanvitelli', Via Vivaldi 43, 81100-Caserta, Italy.
- Nicola Landi
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies (DiSTABiF), University of Campania 'Luigi Vanvitelli', Via Vivaldi 43, 81100-Caserta, Italy; Institute of Crystallography, National Research Council of Italy, Via Vivaldi 43, 81100-Caserta, Italy
- Antimo Di Maro
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies (DiSTABiF), University of Campania 'Luigi Vanvitelli', Via Vivaldi 43, 81100-Caserta, Italy.
11
Vasylieva V, Arefiev I, Bourassa F, Trifiro FA, Brunet MA. Proteomics Can Rise to the Challenge of Pseudogenes' Coding Nature. J Proteome Res 2024. [PMID: 39486438 DOI: 10.1021/acs.jproteome.4c00116] [Indexed: 11/04/2024]
Abstract
Throughout the past decade, technological advances in genomics and transcriptomics have revealed pervasive translation throughout mammalian genomes. These putative proteins are usually excluded from proteomics analyses, as they are absent from common protein repositories. A sizable portion of these noncanonical proteins is translated from pseudogenes. Pseudogenes are commonly termed defective copies of coding genes unable to produce proteins. Here, we suggest that proteomics can help in their annotation. First, we define important terms and review specific examples underlining the caveats in pseudogene annotation and their coding potential. Then, we will discuss the challenges inherent to pseudogenes that have thus far rendered complex their confidence in omics data. Finally, we identify recent developments in experimental procedures, instrumentation, and computational methods in proteomics that put the field in a unique position to solve the pseudogene annotation conundrum.
Affiliation(s)
- Valeriia Vasylieva
  - Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
  - Centre de Recherche du Centre hospitalier de l'université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
- Ihor Arefiev
  - Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
  - Centre de Recherche du Centre hospitalier de l'université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
- Francis Bourassa
  - Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
  - Centre de Recherche du Centre hospitalier de l'université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
- Félix-Antoine Trifiro
  - Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
  - Centre de Recherche du Centre hospitalier de l'université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
- Marie A Brunet
  - Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
  - Centre de Recherche du Centre hospitalier de l'université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
12
Desai H, Andrews KH, Bergersen KV, Ofori S, Yu F, Shikwana F, Arbing MA, Boatner LM, Villanueva M, Ung N, Reed EF, Nesvizhskii AI, Backus KM. Chemoproteogenomic stratification of the missense variant cysteinome. Nat Commun 2024; 15:9284. [PMID: 39468056 PMCID: PMC11519605 DOI: 10.1038/s41467-024-53520-x] [Received: 08/28/2023] [Accepted: 10/15/2024] [Indexed: 10/30/2024] Open
Abstract
Cancer genomes are rife with genetic variants; one key outcome of this variation is widespread gain-of-cysteine mutations. These acquired cysteines can be both driver mutations and sites targeted by precision therapies. However, despite their ubiquity, nearly all acquired cysteines remain unidentified via chemoproteomics; identification is a critical step to enable functional analysis, including assessment of potential druggability and susceptibility to oxidation. Here, we pair cysteine chemoproteomics (a technique that enables proteome-wide pinpointing of functional, redox sensitive, and potentially druggable residues) with genomics to reveal the hidden landscape of cysteine genetic variation. Our chemoproteogenomics platform integrates chemoproteomic, whole exome, and RNA-seq data, with a customized two-stage false discovery rate (FDR) error controlled proteomic search, which is further enhanced with a user-friendly FragPipe interface. Chemoproteogenomics analysis reveals that cysteine acquisition is a ubiquitous feature of both healthy and cancer genomes that is further elevated in the context of decreased DNA repair. Reference cysteines proximal to missense variants are also found to be pervasive, supporting heretofore untapped opportunities for variant-specific chemical probe development campaigns. As chemoproteogenomics is further distinguished by sample-matched combinatorial variant databases and is compatible with redox proteomics and small molecule screening, we expect widespread utility in guiding proteoform-specific biology and therapeutic discovery.
Affiliation(s)
- Heta Desai
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, USA
| | - Katrina H Andrews
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
| | - Kristina V Bergersen
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
| | - Samuel Ofori
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
| | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Flowreen Shikwana
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, USA
| | - Mark A Arbing
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- UCLA-DOE Institute for Genomics and Proteomics, UCLA, Los Angeles, CA, USA
| | - Lisa M Boatner
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, USA
| | - Miranda Villanueva
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, USA
| | - Nicholas Ung
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
| | - Elaine F Reed
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
| | - Alexey I Nesvizhskii
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Keriann M Backus
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA.
- Molecular Biology Institute, UCLA, Los Angeles, CA, USA.
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, USA.
- UCLA-DOE Institute for Genomics and Proteomics, UCLA, Los Angeles, CA, USA.
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, UCLA, Los Angeles, CA, USA.
- Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, CA, USA.
| |
|
13
|
Vorauer C, Boniche-Alfaro C, Murphree T, Matsui T, Weiss T, Fries BC, Guttman M. Direct Mapping of Polyclonal Epitopes in Serum by HDX-MS. Anal Chem 2024; 96:16758-16767. [PMID: 39434663 DOI: 10.1021/acs.analchem.4c03274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2024]
Abstract
Elucidating the interactions that drive antigen recognition is central to understanding antibody-mediated protection and is vital for the rational design of immunogens. Often, structural knowledge of epitopes targeted by antibodies is derived from isolated studies of monoclonal antibodies, for which numerous structural techniques exist. In contrast, there are very few approaches capable of mapping the full scope of antigen surfaces targeted by polyclonal sera through the course of a natural antibody response. Here, we develop an approach using immobilized antigen coupled to hydrogen/deuterium exchange with mass spectrometry (HDX-MS) to probe epitope targeting in the context of the fully native serum environment. Using the well-characterized Staphylococcal enterotoxin B (SEB) as a model system, we show that complex combinations of epitopes can be detected and subtle differences across different antisera can be discerned. This work reveals new insight into how neutralizing antibodies and antisera target SEB and, more importantly, establishes a novel method for directly mapping the epitope landscape of polyclonal sera.
Affiliation(s)
- Clint Vorauer
- Department of Medicinal Chemistry, University of Washington, Seattle, Washington 98195, United States
| | - Camila Boniche-Alfaro
- Department of Medicine, Infectious Disease Division, Stony Brook University, Stony Brook, New York 11794, United States
- Veteran's Administration Medical Center, Northport, New York 11768, United States
| | - Taylor Murphree
- Department of Medicinal Chemistry, University of Washington, Seattle, Washington 98195, United States
| | - Tsutomu Matsui
- Stanford Synchrotron Radiation Laboratory, SLAC, Menlo Park, California 94025, United States
| | - Thomas Weiss
- Stanford Synchrotron Radiation Laboratory, SLAC, Menlo Park, California 94025, United States
| | - Bettina C Fries
- Department of Medicine, Infectious Disease Division, Stony Brook University, Stony Brook, New York 11794, United States
- Veteran's Administration Medical Center, Northport, New York 11768, United States
- Department of Microbiology and Immunology, Stony Brook University, Stony Brook, New York 11794, United States
| | - Miklos Guttman
- Department of Medicinal Chemistry, University of Washington, Seattle, Washington 98195, United States
| |
|
14
|
Hsiao Y, Zhang H, Li GX, Deng Y, Yu F, Valipour Kahrood H, Steele JR, Schittenhelm RB, Nesvizhskii AI. Analysis and Visualization of Quantitative Proteomics Data Using FragPipe-Analyst. J Proteome Res 2024; 23:4303-4315. [PMID: 39254081 DOI: 10.1021/acs.jproteome.4c00294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Abstract
The FragPipe computational proteomics platform is gaining widespread popularity among the proteomics research community because of its fast processing speed and user-friendly graphical interface. Although FragPipe produces well-formatted output tables that are ready for analysis, there is still a need for an easy-to-use and user-friendly downstream statistical analysis and visualization tool. FragPipe-Analyst addresses this need by providing an R shiny web server to assist FragPipe users in conducting downstream analyses of the resulting quantitative proteomics data. It supports major quantification workflows, including label-free quantification, tandem mass tags, and data-independent acquisition. FragPipe-Analyst offers a range of useful functionalities, such as various missing value imputation options, data quality control, unsupervised clustering, differential expression (DE) analysis using Limma, and gene ontology and pathway enrichment analysis using Enrichr. To support advanced analysis and customized visualizations, we also developed FragPipeAnalystR, an R package encompassing all FragPipe-Analyst functionalities that is extended to support site-specific analysis of post-translational modifications (PTMs). FragPipe-Analyst and FragPipeAnalystR are both open-source and freely available.
Affiliation(s)
- Yi Hsiao
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Haijian Zhang
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Ginny Xiaohe Li
- Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Yamei Deng
- Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Hossein Valipour Kahrood
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
- Monash Genomics & Bioinformatics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Joel R Steele
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Ralf B Schittenhelm
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Alexey I Nesvizhskii
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
| |
|
15
|
Avella I, Schulte L, Hurka S, Damm M, Eichberg J, Schiffmann S, Henke M, Timm T, Lochnit G, Hardes K, Vilcinskas A, Lüddecke T. Proteogenomics-guided functional venomics resolves the toxin arsenal and activity of Deinagkistrodon acutus venom. Int J Biol Macromol 2024; 278:135041. [PMID: 39182889 DOI: 10.1016/j.ijbiomac.2024.135041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2024] [Revised: 08/20/2024] [Accepted: 08/22/2024] [Indexed: 08/27/2024]
Abstract
Snakebite primarily impacts rural communities of Africa, Asia, and Latin America. The sharp-nosed viper (Deinagkistrodon acutus) is among the snakes of highest medical importance in Asia. Despite various studies on its venom using modern venomics techniques, a comprehensive understanding of composition and function of this species' venom remains lacking. We combined proteogenomics with extensive bioactivity profiling to present the first genome-level catalogue of D. acutus venom proteins and their exochemistry. Our analysis identified an unusually simple venom containing 45 components from 20 distinct protein families. Relative toxin abundances indicate that C-type lectin and C-type lectin-related protein (CTL), snake venom metalloproteinase (svMP), snake venom serine protease (svSP), and phospholipase A2 (PLA2) constitute 90 % of the venom. Bioassays targeting key aspects of viperid envenomation showed considerable concentration-dependent cytotoxicity, particularly in kidney and lung cells, and potent protease and PLA2 activity. Factor Xa and thrombin activities were minor, and no plasmin activity was observed. Effects on haemolysis, intracellular calcium (Ca2+) release, and nitric oxide (NO) synthesis were negligible. Our analysis provides the first holistic genome-based overview of the toxin arsenal of D. acutus, predicting the molecular and functional basis of its life-threatening effects, and opens novel avenues for treating envenomation by this highly dangerous snake.
Affiliation(s)
- Ignazio Avella
- Animal Venomics Lab, Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Ohlebergsweg 12, 35392 Giessen, Germany; Institute for Insect Biotechnology, Justus Liebig University Giessen, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany; LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Natural Product Genomics, Senckenberganlage 25, 60325 Frankfurt am Main, Germany.
| | - Lennart Schulte
- Animal Venomics Lab, Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Ohlebergsweg 12, 35392 Giessen, Germany; Institute for Insect Biotechnology, Justus Liebig University Giessen, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany; LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Natural Product Genomics, Senckenberganlage 25, 60325 Frankfurt am Main, Germany; Branch for Bioresources, Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Ohlebergsweg 12, 35392 Giessen, Germany
| | - Sabine Hurka
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Natural Product Genomics, Senckenberganlage 25, 60325 Frankfurt am Main, Germany; Branch for Bioresources, Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Ohlebergsweg 12, 35392 Giessen, Germany; BMBF Junior Research Group in Bioeconomy (BioKreativ) "SymBioÖkonomie", Ohlebergsweg 12, 35392 Giessen, Germany
| | - Maik Damm
- Animal Venomics Lab, Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Ohlebergsweg 12, 35392 Giessen, Germany; Institute for Insect Biotechnology, Justus Liebig University Giessen, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany; LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Natural Product Genomics, Senckenberganlage 25, 60325 Frankfurt am Main, Germany
| | - Johanna Eichberg
- Branch for Bioresources, Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Ohlebergsweg 12, 35392 Giessen, Germany; BMBF Junior Research Group in Infection Research "ASCRIBE", Ohlebergsweg 12, 35392 Giessen, Germany
| | - Susanne Schiffmann
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Natural Product Genomics, Senckenberganlage 25, 60325 Frankfurt am Main, Germany; Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP), 60596 Frankfurt am Main, Germany
| | - Marina Henke
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Natural Product Genomics, Senckenberganlage 25, 60325 Frankfurt am Main, Germany; Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP), 60596 Frankfurt am Main, Germany
| | - Thomas Timm
- Protein Analytics, Institute of Biochemistry, Faculty of Medicine, Justus Liebig University Giessen, Friedrichstrasse 24, 35392 Giessen, Germany
| | - Günther Lochnit
- Protein Analytics, Institute of Biochemistry, Faculty of Medicine, Justus Liebig University Giessen, Friedrichstrasse 24, 35392 Giessen, Germany
| | - Kornelia Hardes
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Natural Product Genomics, Senckenberganlage 25, 60325 Frankfurt am Main, Germany; Branch for Bioresources, Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Ohlebergsweg 12, 35392 Giessen, Germany; BMBF Junior Research Group in Infection Research "ASCRIBE", Ohlebergsweg 12, 35392 Giessen, Germany
| | - Andreas Vilcinskas
- Institute for Insect Biotechnology, Justus Liebig University Giessen, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany; LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Natural Product Genomics, Senckenberganlage 25, 60325 Frankfurt am Main, Germany; Branch for Bioresources, Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Ohlebergsweg 12, 35392 Giessen, Germany
| | - Tim Lüddecke
- Animal Venomics Lab, Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Ohlebergsweg 12, 35392 Giessen, Germany; LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Natural Product Genomics, Senckenberganlage 25, 60325 Frankfurt am Main, Germany; Branch for Bioresources, Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Ohlebergsweg 12, 35392 Giessen, Germany.
| |
|
16
|
Zhu C, Liu LY, Ha A, Yamaguchi TN, Zhu H, Hugh-White R, Livingstone J, Patel Y, Kislinger T, Boutros PC. moPepGen: Rapid and Comprehensive Identification of Non-canonical Peptides. bioRxiv 2024:2024.03.28.587261. [PMID: 38585946 PMCID: PMC10996593 DOI: 10.1101/2024.03.28.587261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Gene expression is a multi-step transformation of biological information from its storage form (DNA) into functional forms (protein and some RNAs). Regulatory activities at each step of this transformation multiply a single gene into a myriad of proteoforms. Proteogenomics is the study of how genomic and transcriptomic variation creates this proteomic diversity, and is limited by the challenges of modeling the complexities of gene-expression. We therefore created moPepGen, a graph-based algorithm that comprehensively generates non-canonical peptides in linear time. moPepGen works with multiple technologies, in multiple species and on all types of genetic and transcriptomic data. In human cancer proteomes, it enumerates previously unobservable noncanonical peptides arising from germline and somatic genomic variants, noncoding open reading frames, RNA fusions and RNA circularization. By enabling efficient detection and quantitation of previously hidden proteins in both existing and new proteomic data, moPepGen facilitates all proteogenomics applications. It is available at: https://github.com/uclahs-cds/package-moPepGen.
Affiliation(s)
- Chenghao Zhu
- Department of Human Genetics, University of California, Los Angeles, CA, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA
- Institute for Precision Health, University of California, Los Angeles, CA, USA
- Department of Urology, University of California, Los Angeles, CA, USA
| | - Lydia Y. Liu
- Department of Human Genetics, University of California, Los Angeles, CA, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA
- Department of Medical Biophysics, University of Toronto, Toronto, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
- Vector Institute for Artificial Intelligence, Toronto, Canada
| | - Annie Ha
- Department of Medical Biophysics, University of Toronto, Toronto, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
| | - Takafumi N. Yamaguchi
- Department of Human Genetics, University of California, Los Angeles, CA, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA
- Institute for Precision Health, University of California, Los Angeles, CA, USA
| | - Helen Zhu
- Department of Medical Biophysics, University of Toronto, Toronto, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
- Vector Institute for Artificial Intelligence, Toronto, Canada
| | - Rupert Hugh-White
- Department of Human Genetics, University of California, Los Angeles, CA, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA
- Institute for Precision Health, University of California, Los Angeles, CA, USA
| | - Julie Livingstone
- Department of Human Genetics, University of California, Los Angeles, CA, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA
- Institute for Precision Health, University of California, Los Angeles, CA, USA
| | - Yash Patel
- Department of Human Genetics, University of California, Los Angeles, CA, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA
- Institute for Precision Health, University of California, Los Angeles, CA, USA
| | - Thomas Kislinger
- Department of Medical Biophysics, University of Toronto, Toronto, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
| | - Paul C. Boutros
- Department of Human Genetics, University of California, Los Angeles, CA, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA
- Institute for Precision Health, University of California, Los Angeles, CA, USA
- Department of Urology, University of California, Los Angeles, CA, USA
- Department of Medical Biophysics, University of Toronto, Toronto, Canada
| |
|
17
|
Tariq U, Saeed F. Predicting peptide properties from mass spectrometry data using deep attention-based multitask network and uncertainty quantification. bioRxiv 2024:2024.08.21.609035. [PMID: 39229185 PMCID: PMC11370541 DOI: 10.1101/2024.08.21.609035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
Database search algorithms reduce the number of potential candidate peptides against which scoring needs to be performed using a single (i.e. mass) property for filtering. While useful, filtering based on one property may lead to exclusion of non-abundant spectra and uncharacterized peptides - potentially exacerbating the streetlight effect. Here we present ProteoRift, a novel attention and multitask deep-network, which can predict multiple peptide properties (length, missed cleavages, and modification status) directly from spectra. We demonstrate that ProteoRift can predict these properties with up to 97% accuracy resulting in search-space reduction by more than 90%. As a result, our end-to-end pipeline is shown to exhibit 8x to 12x speedups with peptide deduction accuracy comparable to algorithmic techniques. We also formulate two uncertainty estimation metrics, which can distinguish between in-distribution and out-of-distribution data (ROC-AUC 0.99) and predict high-scoring mass spectra against correct peptide (ROC-AUC 0.94). These models and metrics are integrated in an end-to-end ML pipeline available at https://github.com/pcdslab/ProteoRift.
Affiliation(s)
- Usman Tariq
- Knight Foundation School of Computing and Information Sciences, Florida International University (FIU), Miami, FL, USA
| | - Fahad Saeed
- Knight Foundation School of Computing and Information Sciences, Florida International University (FIU), Miami, FL, USA
- Biomolecular Sciences Institute (BSI), Florida International University, Miami, FL, USA
- Department of Human and Molecular Genetics, Herbert Wertheim School of Medicine, Florida International University, Miami, FL, USA
| |
|
18
|
Piana D, Iavarone F, De Paolis E, Daniele G, Parisella F, Minucci A, Greco V, Urbani A. Phenotyping Tumor Heterogeneity through Proteogenomics: Study Models and Challenges. Int J Mol Sci 2024; 25:8830. [PMID: 39201516 PMCID: PMC11354793 DOI: 10.3390/ijms25168830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2024] [Revised: 07/31/2024] [Accepted: 08/06/2024] [Indexed: 09/02/2024] Open
Abstract
Tumor heterogeneity refers to the diversity observed among tumor cells: both between different tumors (inter-tumor heterogeneity) and within a single tumor (intra-tumor heterogeneity). These cells can display distinct morphological and phenotypic characteristics, including variations in cellular morphology, metastatic potential, and variability in treatment responses among patients. Therefore, a comprehensive understanding of such heterogeneity is necessary for deciphering tumor-specific mechanisms that may be diagnostically and therapeutically valuable. Innovative and multidisciplinary approaches are needed to understand this complex feature. In this context, proteogenomics has been emerging as a significant resource for integrating omics fields such as genomics and proteomics. By combining data obtained from both Next-Generation Sequencing (NGS) technologies and mass spectrometry (MS) analyses, proteogenomics aims to provide a comprehensive view of tumor heterogeneity. This approach reveals molecular alterations and phenotypic features related to tumor subtypes, potentially identifying therapeutic biomarkers. Many achievements have been made; however, despite continuous advances in proteogenomics-based methodologies, several challenges remain: in particular the limitations in sensitivity and specificity and the lack of optimal study models. This review highlights the impact of proteogenomics on characterizing tumor phenotypes, focusing on the critical challenges and current limitations of its use in different clinical and preclinical models for tumor phenotypic characterization.
Affiliation(s)
- Diletta Piana
- Department of Basic Biotechnological Sciences, Intensivological and Perioperative Clinics, Università Cattolica del Sacro Cuore, 00168 Rome, Italy
- Departmental Unit of Chemistry, Biochemistry and Clinical Molecular Biology, Department of Diagnostic and Laboratory Medicine, Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy
| | - Federica Iavarone
- Department of Basic Biotechnological Sciences, Intensivological and Perioperative Clinics, Università Cattolica del Sacro Cuore, 00168 Rome, Italy
- Departmental Unit of Chemistry, Biochemistry and Clinical Molecular Biology, Department of Diagnostic and Laboratory Medicine, Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy
| | - Elisa De Paolis
- Departmental Unit of Chemistry, Biochemistry and Clinical Molecular Biology, Department of Diagnostic and Laboratory Medicine, Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy
- Departmental Unit of Molecular and Genomic Diagnostics, Genomics Core Facility, Gemelli Science and Technology Park (G-STeP), Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy
| | - Gennaro Daniele
- Phase 1 Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy
| | - Federico Parisella
- Department of Basic Biotechnological Sciences, Intensivological and Perioperative Clinics, Università Cattolica del Sacro Cuore, 00168 Rome, Italy
| | - Angelo Minucci
- Departmental Unit of Chemistry, Biochemistry and Clinical Molecular Biology, Department of Diagnostic and Laboratory Medicine, Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy
- Departmental Unit of Molecular and Genomic Diagnostics, Genomics Core Facility, Gemelli Science and Technology Park (G-STeP), Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy
| | - Viviana Greco
- Department of Basic Biotechnological Sciences, Intensivological and Perioperative Clinics, Università Cattolica del Sacro Cuore, 00168 Rome, Italy
- Departmental Unit of Chemistry, Biochemistry and Clinical Molecular Biology, Department of Diagnostic and Laboratory Medicine, Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy
| | - Andrea Urbani
- Department of Basic Biotechnological Sciences, Intensivological and Perioperative Clinics, Università Cattolica del Sacro Cuore, 00168 Rome, Italy
- Departmental Unit of Chemistry, Biochemistry and Clinical Molecular Biology, Department of Diagnostic and Laboratory Medicine, Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy
| |
|
19
|
Rodriguez JM, Abascal F, Cerdán-Vélez D, Gómez LM, Vázquez J, Tress ML. Evidence for widespread translation of 5' untranslated regions. Nucleic Acids Res 2024; 52:8112-8126. [PMID: 38953162 DOI: 10.1093/nar/gkae571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Revised: 06/07/2024] [Accepted: 06/19/2024] [Indexed: 07/03/2024] Open
Abstract
Ribosome profiling experiments support the translation of a range of novel human open reading frames. By contrast, most peptides from large-scale proteomics experiments derive from just one source, 5' untranslated regions. Across the human genome we find evidence for 192 translated upstream regions, most of which would produce protein isoforms with extended N-terminal ends. Almost all of these N-terminal extensions are from highly abundant genes, which suggests that the novel regions we detect are just the tip of the iceberg. These upstream regions have characteristics that are not typical of coding exons. Their GC-content is remarkably high, even higher than 5' regions in other genes, and a large majority have non-canonical start codons. Although some novel upstream regions have cross-species conservation - five have orthologues in invertebrates for example - the reading frames of two thirds are not conserved beyond simians. These non-conserved regions also have no evidence of purifying selection, which suggests that much of this translation is not functional. In addition, non-conserved upstream regions have significantly more peptides in cancer cell lines than would be expected, a strong indication that an aberrant or noisy translation initiation process may play an important role in translation from upstream regions.
Affiliation(s)
- Jose Manuel Rodriguez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain
- CIBER de Enfermedades Cardiovasculares (CIBERCV), 28029 Madrid, Spain
| | - Federico Abascal
- Somatic Evolution Group, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
| | - Daniel Cerdán-Vélez
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
| | - Laura Martínez Gómez
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
| | - Jesús Vázquez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain
- CIBER de Enfermedades Cardiovasculares (CIBERCV), 28029 Madrid, Spain
| | - Michael L Tress
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
| |
|
20
|
Vieira de Souza E, L Bookout A, Barnes CA, Miller B, Machado P, Basso LA, Bizarro CV, Saghatelian A. Rp3: Ribosome profiling-assisted proteogenomics improves coverage and confidence during microprotein discovery. Nat Commun 2024; 15:6839. [PMID: 39122697 PMCID: PMC11316118 DOI: 10.1038/s41467-024-50301-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 07/08/2024] [Indexed: 08/12/2024] Open
Abstract
There has been a dramatic increase in the identification of non-canonical translation and a significant expansion of the protein-coding genome. Among the strategies used to identify unannotated small Open Reading Frames (smORFs) that encode microproteins, Ribosome profiling (Ribo-Seq) is the gold standard for the annotation of novel coding sequences by reporting on smORF translation. In Ribo-Seq, ribosome-protected footprints (RPFs) that map to multiple genomic sites are removed since they cannot be unambiguously assigned to a specific genomic location. Furthermore, RPFs necessarily result in short (25-34 nucleotides) reads, increasing the chance of multi-mapping alignments, such that smORFs residing in these regions cannot be identified by Ribo-Seq. Moreover, it has been challenging to identify protein evidence for Ribo-Seq. To solve this, we developed Rp3, a pipeline that integrates proteogenomics and Ribosome profiling to provide unambiguous evidence for a subset of microproteins missed by current Ribo-Seq pipelines. Here, we show that Rp3 maximizes proteomics detection and confidence of microprotein-encoding smORFs.
Affiliation(s)
- Eduardo Vieira de Souza
- Centro de Pesquisas em Biologia Molecular e Funcional (CPBMF) and Instituto Nacional de Ciência e Tecnologia em Tuberculose (INCT-TB), Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil
- Programa de Pós-Graduação em Biologia Celular e Molecular, Pontifícia Universidade Católica do Rio Grande do Sul, 90616-900, Porto Alegre, Rio Grande do Sul, Brazil
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
| | | | | | - Brendan Miller
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Pablo Machado
- Centro de Pesquisas em Biologia Molecular e Funcional (CPBMF) and Instituto Nacional de Ciência e Tecnologia em Tuberculose (INCT-TB), Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil
- Programa de Pós-Graduação em Biologia Celular e Molecular, Pontifícia Universidade Católica do Rio Grande do Sul, 90616-900, Porto Alegre, Rio Grande do Sul, Brazil
| | - Luiz A Basso
- Centro de Pesquisas em Biologia Molecular e Funcional (CPBMF) and Instituto Nacional de Ciência e Tecnologia em Tuberculose (INCT-TB), Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil
- Programa de Pós-Graduação em Biologia Celular e Molecular, Pontifícia Universidade Católica do Rio Grande do Sul, 90616-900, Porto Alegre, Rio Grande do Sul, Brazil
| | - Cristiano V Bizarro
- Centro de Pesquisas em Biologia Molecular e Funcional (CPBMF) and Instituto Nacional de Ciência e Tecnologia em Tuberculose (INCT-TB), Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil.
- Programa de Pós-Graduação em Biologia Celular e Molecular, Pontifícia Universidade Católica do Rio Grande do Sul, 90616-900, Porto Alegre, Rio Grande do Sul, Brazil.
| | - Alan Saghatelian
- Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA.
| |
|
21
|
Vincent D, Appels R. Community Resource: Large-Scale Proteogenomics to Refine Wheat Genome Annotations. Int J Mol Sci 2024; 25:8614. [PMID: 39201310 PMCID: PMC11354340 DOI: 10.3390/ijms25168614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Revised: 08/04/2024] [Accepted: 08/05/2024] [Indexed: 09/02/2024] Open
Abstract
Triticum aestivum is an important crop whose reference genome (International Wheat Genome Sequencing Consortium (IWGSC) RefSeq v2.1) offers a valuable resource for understanding wheat genetic structure, improving agronomic traits, and developing new cultivars. A key aspect of gene model annotation is protein-level evidence of gene expression obtained from proteomics studies, followed up by proteogenomics to physically map proteins to the genome. In this research, we have retrieved the largest recent wheat proteomics datasets publicly available and applied the Basic Local Alignment Search Tool (tBLASTn) algorithm to map the 861,759 identified unique peptides against IWGSC RefSeq v2.1. Of the 92,719 hits, 83,015 unique peptides aligned along 33,612 High Confidence (HC) genes, thus validating 31.4% of all wheat HC gene models. Furthermore, 6685 unique peptides were mapped against 3702 Low Confidence (LC) gene models, and we argue that these gene models should be considered for HC status. The remaining 2934 orphan peptides can be used for novel gene discovery, as exemplified here on chromosome 4D. We demonstrated that tBLASTn could not map peptides exhibiting a mid-sequence frameshift. We supply all our proteogenomics results, Galaxy workflow and Python code, as well as Browser Extensible Data (BED) files, as a resource for the wheat community via the Apollo JBrowse and GitHub repositories. Our workflow could be applied to other proteomics datasets to expand this resource with proteins and peptides from biotically and abiotically stressed samples. This would help tease out wheat gene expression under various environmental conditions, both spatially and temporally.
Affiliation(s)
| | - Rudi Appels
- Faculty of Science, University of Melbourne, Parkville, VIC 3010, Australia
| |
|
22
|
Meng W, Takeuchi Y, Ward JP, Sultan H, Arthur CD, Mardis ER, Artyomov MN, Lichti CF, Schreiber RD. Improvement of Tumor Neoantigen Detection by High-Field Asymmetric Waveform Ion Mobility Mass Spectrometry. Cancer Immunol Res 2024; 12:988-1006. [PMID: 38768391 PMCID: PMC11456315 DOI: 10.1158/2326-6066.cir-23-0900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 03/05/2024] [Accepted: 05/17/2024] [Indexed: 05/22/2024]
Abstract
Cancer neoantigens have been shown to elicit cancer-specific T-cell responses and have garnered much attention for their roles in both spontaneous and therapeutically induced antitumor responses. Mass spectrometry (MS) profiling of tumor immunopeptidomes has been used, in part, to identify MHC-bound mutant neoantigen ligands. However, under standard conditions, MS-based detection of such rare but clinically relevant neoantigens is relatively insensitive, requiring 300 million cells or more. Here, to quantitatively define the minimum detectable amounts of therapeutically relevant MHC-I and MHC-II neoantigen peptides, we analyzed different dilutions of immunopeptidomes isolated from the well-characterized T3 mouse methylcholanthrene (MCA)-induced cell line by MS. Using either data-dependent acquisition or parallel reaction monitoring (PRM), we established the minimum amount of material required to detect the major T3 neoantigens in the presence or absence of high field asymmetric waveform ion mobility spectrometry (FAIMS). This analysis yielded a 14-fold enhancement of sensitivity in detecting the major T3 MHC-I neoantigen (mLama4) with FAIMS-PRM compared with PRM without FAIMS, allowing ex vivo detection of this neoantigen from an individual 100 mg T3 tumor. These findings were then extended to two other independent MCA-sarcoma lines (1956 and F244). This study demonstrates that FAIMS substantially increases the sensitivity of MS-based characterization of validated neoantigens from tumors.
Affiliation(s)
- Wei Meng
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
- The Andrew M. and Jane M. Bursky Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
| | - Yoshiko Takeuchi
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
- The Andrew M. and Jane M. Bursky Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
| | - Jeffrey P. Ward
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
- The Andrew M. and Jane M. Bursky Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
- Department of Medicine, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
| | - Hussein Sultan
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
- The Andrew M. and Jane M. Bursky Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
| | - Cora D. Arthur
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
| | - Elaine R. Mardis
- The Steve and Cindy Rasmussen Institute for Genomic Medicine at Nationwide Children’s Hospital, Columbus, OH 43215, U.S.A
| | - Maxim N. Artyomov
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
- The Andrew M. and Jane M. Bursky Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
| | - Cheryl F. Lichti
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
- The Andrew M. and Jane M. Bursky Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
| | - Robert D. Schreiber
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
- The Andrew M. and Jane M. Bursky Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, Saint Louis, MO 63110, U.S.A
| |
|
23
|
Zargar SM, Hami A, Manzoor M, Mir RA, Mahajan R, Bhat KA, Gani U, Sofi NR, Sofi PA, Masi A. Buckwheat OMICS: present status and future prospects. Crit Rev Biotechnol 2024; 44:717-734. [PMID: 37482536 DOI: 10.1080/07388551.2023.2229511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Revised: 03/31/2023] [Accepted: 06/01/2023] [Indexed: 07/25/2023]
Abstract
Buckwheat (Fagopyrum spp.) is an underutilized, resilient crop of the North Western Himalayas belonging to the family Polygonaceae and is a source of essential nutrients and therapeutics. Common Buckwheat and Tartary Buckwheat are the two main cultivated species used as food. It is the only grain crop possessing rutin, an important metabolite with high nutraceutical potential. Due to its inherent tolerance to various biotic and abiotic stresses and a short life cycle, Buckwheat has been proposed as a model crop plant. Nutritional security is one of the major concerns, and breeding for a nutrient-dense crop such as Buckwheat will provide a sustainable solution. Efforts toward improving Buckwheat for nutrition and yield are limited by the lack of available genetic, genomic, transcriptomic and metabolomic resources. In order to harness the agricultural importance of Buckwheat, an integrated breeding and OMICS platform needs to be established that can pave the way for a better understanding of crop biology and the development of commercial varieties. This, coupled with the availability of the genome sequences of both Buckwheat species in the public domain, should facilitate the identification of alleles/QTLs and candidate genes. There is a need to further our understanding of the molecular basis of the genetic regulation that controls various economically important traits. The present review focuses on the food and nutritional importance of Buckwheat, its various omics resources, the utilization of omics approaches in understanding Buckwheat biology and, finally, how an integrated platform of breeding and omics will help in developing commercially high-yielding, nutrient-rich Buckwheat cultivars.
Affiliation(s)
- Sajad Majeed Zargar
- Proteomics Laboratory, Division of Plant Biotechnology, Sher-e-Kashmir University of Agricultural Sciences & Technology of Kashmir, Srinagar, India
| | - Ammarah Hami
- Proteomics Laboratory, Division of Plant Biotechnology, Sher-e-Kashmir University of Agricultural Sciences & Technology of Kashmir, Srinagar, India
| | - Madhiya Manzoor
- Proteomics Laboratory, Division of Plant Biotechnology, Sher-e-Kashmir University of Agricultural Sciences & Technology of Kashmir, Srinagar, India
| | - Rakeeb Ahmad Mir
- Department of Biotechnology, School of Life Sciences, Central University of Kashmir, Ganderbal, India
| | - Reetika Mahajan
- Proteomics Laboratory, Division of Plant Biotechnology, Sher-e-Kashmir University of Agricultural Sciences & Technology of Kashmir, Srinagar, India
| | - Kaiser A Bhat
- Proteomics Laboratory, Division of Plant Biotechnology, Sher-e-Kashmir University of Agricultural Sciences & Technology of Kashmir, Srinagar, India
| | - Umar Gani
- Plant Sciences and Agrotechnology Division, CSIR-Indian Institute of Integrative Medicine, Jammu, India
| | - Najeebul Rehman Sofi
- MRCFC, Sher-E-Kashmir University of Agricultural Sciences and Technology of Kashmir, India
| | - Parvaze A Sofi
- Division of Plant Breeding and Genetics, Sher-e-Kashmir University of Agricultural Sciences & Technology of Kashmir, Srinagar, India
| | - Antonio Masi
- Department of Agronomy, Food, Natural Resources, Animals, and Environment, University of Padova, Padua, Italy
| |
|
24
|
Fröhlich K, Fahrner M, Brombacher E, Seredynska A, Maldacker M, Kreutz C, Schmidt A, Schilling O. Data-Independent Acquisition: A Milestone and Prospect in Clinical Mass Spectrometry-Based Proteomics. Mol Cell Proteomics 2024; 23:100800. [PMID: 38880244 PMCID: PMC11380018 DOI: 10.1016/j.mcpro.2024.100800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 06/08/2024] [Accepted: 06/13/2024] [Indexed: 06/18/2024] Open
Abstract
Data-independent acquisition (DIA) has revolutionized the field of mass spectrometry (MS)-based proteomics over the past few years. DIA stands out for its ability to systematically sample all peptides in a given m/z range, allowing an unbiased acquisition of proteomics data. This greatly mitigates the issue of missing values and significantly enhances quantitative accuracy, precision, and reproducibility compared to many traditional methods. This review focuses on the critical role of DIA analysis software tools, primarily focusing on their capabilities and the challenges they address in proteomic research. Advances in MS technology, such as trapped ion mobility spectrometry, or high field asymmetric waveform ion mobility spectrometry require sophisticated analysis software capable of handling the increased data complexity and exploiting the full potential of DIA. We identify and critically evaluate leading software tools in the DIA landscape, discussing their unique features, and the reliability of their quantitative and qualitative outputs. We present the biological and clinical relevance of DIA-MS and discuss crucial publications that paved the way for in-depth proteomic characterization in patient-derived specimens. Furthermore, we provide a perspective on emerging trends in clinical applications and present upcoming challenges including standardization and certification of MS-based acquisition strategies in molecular diagnostics. While we emphasize the need for continuous development of software tools to keep pace with evolving technologies, we advise researchers against uncritically accepting the results from DIA software tools. Each tool may have its own biases, and some may not be as sensitive or reliable as others. Our overarching recommendation for both researchers and clinicians is to employ multiple DIA analysis tools, utilizing orthogonal analysis approaches to enhance the robustness and reliability of their findings.
Affiliation(s)
- Klemens Fröhlich
- Proteomics Core Facility, Biozentrum Basel, University of Basel, Basel, Switzerland
| | - Matthias Fahrner
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany
| | - Eva Brombacher
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center-University of Freiburg, Freiburg, Germany; Centre for Integrative Biological Signaling Studies (CIBSS), University of Freiburg, Freiburg, Germany; Spemann Graduate School of Biology and Medicine (SGBM), University of Freiburg, Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany
| | - Adrianna Seredynska
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany
| | - Maximilian Maldacker
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany
| | - Clemens Kreutz
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center-University of Freiburg, Freiburg, Germany; Centre for Integrative Biological Signaling Studies (CIBSS), University of Freiburg, Freiburg, Germany
| | - Alexander Schmidt
- Proteomics Core Facility, Biozentrum Basel, University of Basel, Basel, Switzerland
| | - Oliver Schilling
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany.
| |
|
25
|
Cui M, Deng F, Disis ML, Cheng C, Zhang L. Advances in the Clinical Application of High-throughput Proteomics. Exploratory Research and Hypothesis in Medicine 2024; 9:209-220. [PMID: 39148720 PMCID: PMC11326426 DOI: 10.14218/erhm.2024.00006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
High-throughput proteomics has become an exciting field and a potential frontier of modern medicine since the early 2000s. While significant progress has been made in the technical aspects of the field, translating proteomics to clinical applications has been challenging. This review summarizes recent advances in clinical applications of high-throughput proteomics and discusses the associated challenges, advantages, and future directions. We focus on research progress and clinical applications of high-throughput proteomics in breast cancer, bladder cancer, laryngeal squamous cell carcinoma, gastric cancer, colorectal cancer, and coronavirus disease 2019. The future application of high-throughput proteomics will face challenges such as varying protein properties, limitations of statistical modeling, technical and logistical difficulties in data deposition, integration, and harmonization, as well as regulatory requirements for clinical validation and considerations. However, there are several noteworthy advantages of high-throughput proteomics, including the identification of novel global protein networks, the discovery of new proteins, and the synergistic incorporation with other omic data. We look forward to participating in and embracing future advances in high-throughput proteomics, such as proteomics-based single-cell biology and its clinical applications, individualized proteomics, pathology informatics, digital pathology, and deep learning models for high-throughput proteomics. Several new proteomic technologies are noteworthy, including data-independent acquisition mass spectrometry, nanopore-based proteomics, 4-D proteomics, and secondary ion mass spectrometry. In summary, we believe high-throughput proteomics will drastically shift the paradigm of translational research, clinical practice, and public health in the near future.
Affiliation(s)
- Miao Cui
- Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Pathology, Mount Sinai West, New York, NY, USA
| | - Fei Deng
- Department of Chemical Biology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ, USA
| | - Mary L Disis
- UW Medicine Cancer Vaccine Institute, University of Washington, Seattle, WA, USA
| | - Chao Cheng
- Department of Medicine, Baylor College of Medicine, Houston, TX, USA
- Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, USA
| | - Lanjing Zhang
- Department of Chemical Biology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ, USA
- Department of Pathology, Princeton Medical Center, Plainsboro, NJ, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
| |
|
26
|
Erban T, Sopko B. Understanding bacterial pathogen diversity: A proteogenomic analysis and use of an array of genome assemblies to identify novel virulence factors of the honey bee bacterial pathogen Paenibacillus larvae. Proteomics 2024; 24:e2300280. [PMID: 38742951 DOI: 10.1002/pmic.202300280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 03/07/2024] [Accepted: 04/08/2024] [Indexed: 05/16/2024]
Abstract
Mass spectrometry proteomics data are typically evaluated against publicly available annotated sequences, but the proteogenomics approach is a useful alternative. A single genome is commonly utilized in custom proteomic and proteogenomic data analysis. We pose the question of whether utilizing numerous different genome assemblies in a search database would be beneficial. We reanalyzed raw data from the exoprotein fraction of four reference Enterobacterial Repetitive Intergenic Consensus (ERIC) I-IV genotypes of the honey bee bacterial pathogen Paenibacillus larvae and evaluated them against three reference databases (from NCBI-protein, RefSeq, and UniProt) together with an array of protein sequences generated by six-frame direct translation of 15 genome assemblies from GenBank. The wide search yielded 453 protein hits/groups, which UpSet analysis categorized into 50 groups based on the success of protein identification by the 18 database components. Nine hits that were not identified by a unique peptide were not considered for marker selection, which discarded the only protein that was not identified by the reference databases. We propose that the variability in successful identifications between genome assemblies is useful for marker mining. The results suggest that various strains of P. larvae can exhibit specific traits that set them apart from the established genotypes ERIC I-V.
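The six-frame direct translation mentioned in this abstract can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the standard codon assignments are assumed (bacterial table 11 differs only in its start codons), and any codon containing a non-ACGT character is emitted as `X`.

```python
from itertools import product

# Standard codon assignments, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame(seq):
    """Translate a DNA sequence in all three frames on both strands."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames
```

A search database would then typically be built by splitting each frame at stop codons (`*`) and keeping the resulting segments above a minimum length; that cut-off is a user choice and is not specified in the abstract.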
Affiliation(s)
- Tomas Erban
- Proteomics and Metabolomics Laboratory, Crop Research Institute, Prague, Czechia
| | - Bruno Sopko
- Proteomics and Metabolomics Laboratory, Crop Research Institute, Prague, Czechia
| |
|
27
|
Kalhor M, Lapin J, Picciani M, Wilhelm M. Rescoring Peptide Spectrum Matches: Boosting Proteomics Performance by Integrating Peptide Property Predictors Into Peptide Identification. Mol Cell Proteomics 2024; 23:100798. [PMID: 38871251 PMCID: PMC11269915 DOI: 10.1016/j.mcpro.2024.100798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 05/26/2024] [Accepted: 06/09/2024] [Indexed: 06/15/2024] Open
Abstract
Rescoring of peptide spectrum matches originating from database search engines enabled by peptide property predictors is exceeding the performance of peptide identification from traditional database search engines. In contrast to the peptide spectrum match scores calculated by traditional database search engines, rescoring peptide spectrum matches generates scores based on comparing observed and predicted peptide properties, such as fragment ion intensities and retention times. These newly generated scores enable a more efficient discrimination between correct and incorrect peptide spectrum matches. This approach was shown to lead to substantial improvements in the number of confidently identified peptides, facilitating the analysis of challenging datasets in various fields such as immunopeptidomics, metaproteomics, proteogenomics, and single-cell proteomics. In this review, we summarize the key elements leading up to the recent introduction of multiple data-driven rescoring pipelines. We provide an overview of relevant post-processing rescoring tools, introduce prominent data-driven rescoring pipelines for various applications, and highlight limitations, opportunities, and future perspectives of this approach and its impact on mass spectrometry-based proteomics.
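As an illustration of a score built from comparing observed and predicted peptide properties, the sketch below computes a normalized spectral angle between two fragment ion intensity vectors. The choice of this particular similarity measure, the function name `spectral_angle`, and the assumption that both vectors are already aligned ion by ion are ours; the abstract does not prescribe a specific feature.

```python
import math


def spectral_angle(observed, predicted):
    """Normalized spectral angle between two aligned intensity vectors.

    Returns a value in [0, 1] for non-negative intensities: 1.0 for identical
    direction, 0.0 for orthogonal vectors or when either vector is all zeros.
    """
    dot = sum(o * p for o, p in zip(observed, predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    if norm_o == 0 or norm_p == 0:
        return 0.0
    # Clamp to guard against floating-point drift just outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_o * norm_p)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi
```

A rescoring tool would typically feed a feature of this kind, alongside others such as the retention time error, into a classifier that separates correct from incorrect peptide spectrum matches.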
Affiliation(s)
- Mostafa Kalhor
- Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
| | - Joel Lapin
- Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
| | - Mario Picciani
- Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
| | - Mathias Wilhelm
- Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany; Munich Data Science Institute, Technical University of Munich, Garching, Germany.
| |
|
28
|
Thiery J, Fahrner M. Integration of proteomics in the molecular tumor board. Proteomics 2024; 24:e2300002. [PMID: 38143279 DOI: 10.1002/pmic.202300002] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/24/2023] [Revised: 12/03/2023] [Accepted: 12/05/2023] [Indexed: 12/26/2023]
Abstract
Cancer remains one of the most complex and challenging diseases of mankind. To address the need for a personalized treatment approach for particularly complex tumor cases, molecular tumor boards (MTBs) have been initiated. MTBs are interdisciplinary teams that perform in-depth molecular diagnostics to jointly advise on the best therapeutic strategy. Current molecular diagnostics are routinely performed on the transcriptomic and genomic levels, aiming to identify tumor-driving mutations. However, these approaches can only partially capture the actual phenotype and the molecular key players of tumor growth and progression. Thus, direct investigation of the expressed proteins and activated signaling pathways provides valuable complementary information on the tumor-driving molecular characteristics of the tissue. Technological advancements in mass spectrometry-based proteomics enable the robust, rapid, and sensitive detection of thousands of proteins in minimal sample amounts, paving the way for clinical proteomics and the probing of oncogenic signaling activity. Therefore, proteomics is currently being integrated into molecular diagnostics within MTBs and holds promising potential in aiding tumor classification and identifying personalized treatment strategies. This review introduces MTBs, describes current clinical proteomics and its potential in precision oncology, and highlights the benefits of multi-omic data integration.
Affiliation(s)
- Johanna Thiery
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Matthias Fahrner
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany
| |
|
29
|
Lian F, Yang H, Hong R, Xu H, Yu T, Sun G, Zheng G, Xie B. Evaluation of the antitumor effect of neoantigen peptide vaccines derived from the translatome of lung cancer. Cancer Immunol Immunother 2024; 73:129. [PMID: 38744688 PMCID: PMC11093939 DOI: 10.1007/s00262-024-03670-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 03/08/2024] [Indexed: 05/16/2024]
Abstract
Emerging evidence suggests that tumor-specific neoantigens are ideal targets for cancer immunotherapy. However, how to predict tumor neoantigens from translatome data remains unclear. Through the extraction of ribosome-nascent chain complexes (RNCs) from LLC cells, followed by RNC-mRNA extraction, RNC-mRNA sequencing, and comprehensive bioinformatic analysis, we identified mutated proteins undergoing translation in the cells. Candidate neoantigens were then selected on the basis of high-affinity binding to the Major Histocompatibility Complex (MHC). Neoantigen immunogenicity was assessed by enzyme-linked immunospot assay (ELISpot). Finally, in vivo experiments in mice were conducted to evaluate the antitumor effects of translatome-derived neoantigen peptides on lung cancer. Ten neoantigen peptides were identified from the translatome data of LLC cells and synthesized; 8 of the 10 had strong immunogenicity. The neoantigen peptide vaccine group exhibited significant tumor growth inhibition. In conclusion, a neoantigen peptide vaccine derived from the translatome of lung cancer significantly inhibited tumor growth.
Affiliation(s)
- Fenbao Lian
- Shengli Clinical Medical College, Fujian Medical University, No. 134 East Street, Fuzhou City, 350001, Fujian Province, China
- Department of Respiratory Medicine and Critical Care Medicine, Fujian Provincial Hospital, No. 134 East Street, Fuzhou, 350001, China
| | - Haitao Yang
- Shengli Clinical Medical College, Fujian Medical University, No. 134 East Street, Fuzhou City, 350001, Fujian Province, China
- Department of Respiratory Medicine and Critical Care Medicine, Fujian Provincial Hospital, No. 134 East Street, Fuzhou, 350001, China
| | - Rujun Hong
- Shengli Clinical Medical College, Fujian Medical University, No. 134 East Street, Fuzhou City, 350001, Fujian Province, China
- Department of Respiratory Medicine and Critical Care Medicine, Fujian Provincial Hospital, No. 134 East Street, Fuzhou, 350001, China
| | - Hang Xu
- Shengli Clinical Medical College, Fujian Medical University, No. 134 East Street, Fuzhou City, 350001, Fujian Province, China
- Department of Respiratory Medicine and Critical Care Medicine, Fujian Provincial Hospital, No. 134 East Street, Fuzhou, 350001, China
| | - Tingting Yu
- Department of Thoracic Oncology, The Affiliated Tumor Hospital of Xinjiang Medical University, Urumqi, 830011, Xinjiang, China
| | - Gang Sun
- Department of Breast and Thyroid Surgery, The Affiliated Tumor Hospital of Xinjiang Medical University, 789 East Suzhou Street, Xinshi District, Urumqi, 830011, Xinjiang, China.
- Xinjiang Cancer Center/Key Laboratory of Oncology of Xinjiang Uyghur Autonomous Region, Urumqi, 830011, Xinjiang, China.
| | - Guanying Zheng
- Shengli Clinical Medical College, Fujian Medical University, No. 134 East Street, Fuzhou City, 350001, Fujian Province, China.
- Department of Respiratory Medicine and Critical Care Medicine, Fujian Provincial Hospital, No. 134 East Street, Fuzhou, 350001, China.
| | - Baosong Xie
- Shengli Clinical Medical College, Fujian Medical University, No. 134 East Street, Fuzhou City, 350001, Fujian Province, China.
- Department of Respiratory Medicine and Critical Care Medicine, Fujian Provincial Hospital, No. 134 East Street, Fuzhou, 350001, China.
| |
|
30
|
Raj-Kumar PK, Lin X, Liu T, Sturtz LA, Gritsenko MA, Petyuk VA, Sagendorf TJ, Deyarmin B, Liu J, Praveen-Kumar A, Wang G, McDermott JE, Shukla AK, Moore RJ, Monroe ME, Webb-Robertson BJM, Hooke JA, Fantacone-Campbell L, Mostoller B, Kvecher L, Kane J, Melley J, Somiari S, Soon-Shiong P, Smith RD, Mural RJ, Rodland KD, Shriver CD, Kovatich AJ, Hu H. Proteogenomic characterization of difficult-to-treat breast cancer with tumor cells enriched through laser microdissection. Breast Cancer Res 2024; 26:76. [PMID: 38745208 PMCID: PMC11094977 DOI: 10.1186/s13058-024-01835-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Accepted: 05/05/2024] [Indexed: 05/16/2024] Open
Abstract
BACKGROUND Breast cancer (BC) is the most commonly diagnosed cancer and the leading cause of cancer death among women globally. Despite advances, there is considerable variation in clinical outcomes for patients with non-luminal A tumors, classified as difficult-to-treat breast cancers (DTBC). This study aims to delineate the proteogenomic landscape of DTBC tumors compared to luminal A (LumA) tumors. METHODS We retrospectively collected a total of 117 untreated primary breast tumor specimens, focusing on DTBC subtypes. Breast tumors were processed by laser microdissection (LMD) to enrich tumor cells. DNA, RNA, and protein were simultaneously extracted from each tumor preparation, followed by whole genome sequencing, paired-end RNA sequencing, global proteomics and phosphoproteomics. Differential feature analysis, pathway analysis and survival analysis were performed to better understand DTBC and investigate biomarkers. RESULTS We observed distinct variations in gene mutations, structural variations, and chromosomal alterations between DTBC and LumA breast tumors. DTBC tumors predominantly had more mutations in TP53, PLXNB3, and zinc finger genes, and fewer mutations in SDC2, CDH1, PIK3CA, SVIL, and PTEN. Notably, cytoband 1q21, which contains numerous cell proliferation-related genes, was significantly amplified in the DTBC tumors. LMD successfully minimized stromal components and increased RNA-protein concordance, as evidenced by stromal score comparisons and proteomic analysis. Distinct DTBC and LumA-enriched clusters were observed by proteomic and phosphoproteomic clustering analysis, some with survival differences. Phosphoproteomics identified two distinct phosphoproteomic profiles for high relapse-risk and low relapse-risk basal-like tumors, involving several genes known to be associated with breast cancer oncogenesis and progression, including KIAA1522, DCK, FOXO3, MYO9B, ARID1A, EPRS, ZC3HAV1, and RBM14. Lastly, an integrated pathway analysis of multi-omics data highlighted a robust enrichment of proliferation pathways in DTBC tumors. CONCLUSIONS This study provides an integrated proteogenomic characterization of DTBC vs LumA with tumor cells enriched through laser microdissection. We identified many common features of DTBC tumors and the phosphopeptides that could serve as potential biomarkers for high/low relapse-risk basal-like BC and possibly guide treatment selections.
Affiliation(s)
- Praveen-Kumar Raj-Kumar
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA, USA
- Murtha Cancer Center Research Program, Department of Surgery, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
| | - Xiaoying Lin
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA, USA
- Murtha Cancer Center Research Program, Department of Surgery, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
| | - Tao Liu
- Pacific Northwest National Laboratory, Richland, WA, USA
| | - Lori A Sturtz
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA, USA
- Murtha Cancer Center Research Program, Department of Surgery, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
| | | | | | | | - Brenda Deyarmin
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA, USA
| | - Jianfang Liu
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA, USA
| | | | - Guisong Wang
- Murtha Cancer Center Research Program, Department of Surgery, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- The Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc, Bethesda, MD, USA
| | | | - Anil K Shukla
- Pacific Northwest National Laboratory, Richland, WA, USA
| | - Ronald J Moore
- Pacific Northwest National Laboratory, Richland, WA, USA
| | | | | | - Jeffrey A Hooke
- Murtha Cancer Center Research Program, Department of Surgery, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- The Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc, Bethesda, MD, USA
| | - Leigh Fantacone-Campbell
- Murtha Cancer Center Research Program, Department of Surgery, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- The Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc, Bethesda, MD, USA
| | - Brad Mostoller
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA, USA
| | - Leonid Kvecher
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA, USA
- Murtha Cancer Center Research Program, Department of Surgery, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
| | - Jennifer Kane
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA, USA
| | - Jennifer Melley
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA, USA
| | - Stella Somiari
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA, USA
| | | | | | - Richard J Mural
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA, USA
| | | | - Craig D Shriver
- Murtha Cancer Center Research Program, Department of Surgery, Uniformed Services University of the Health Sciences, Bethesda, MD, USA.
- Department of Surgery, Walter Reed National Military Medical Center, Bethesda, MD, USA.
| | - Albert J Kovatich
- Murtha Cancer Center Research Program, Department of Surgery, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- The Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc, Bethesda, MD, USA
| | - Hai Hu
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA, USA.
- Murtha Cancer Center Research Program, Department of Surgery, Uniformed Services University of the Health Sciences, Bethesda, MD, USA.
31
Bernal-Gallardo JJ, de Folter S. Plant genome information facilitates plant functional genomics. Planta 2024; 259:117. [PMID: 38592421 PMCID: PMC11004055 DOI: 10.1007/s00425-024-04397-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Accepted: 03/20/2024] [Indexed: 04/10/2024]
Abstract
MAIN CONCLUSION In this review, we give an overview of plant sequencing efforts and how they impact plant functional genomics research. Plant genome sequence information greatly facilitates studies of plant biology, functional genomics, the evolution of genomes and genes, domestication processes, and phylogenetic relationships, among many others. More than two decades of sequencing efforts have boosted the number of available sequenced plant genomes. The first plant genome, that of Arabidopsis, was published in 2000, and currently 4604 plant genomes from 1482 plant species have been published. Various large sequencing initiatives are under way that plan to produce tens of thousands of sequenced plant genomes in the near future. In this review, we give an overview of the status of sequenced plant genomes and of the use of genome information in different research areas.
Affiliation(s)
- Judith Jazmin Bernal-Gallardo
- Unidad de Genómica Avanzada (UGA-Langebio), Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (Cinvestav), Irapuato, Mexico
| | - Stefan de Folter
- Unidad de Genómica Avanzada (UGA-Langebio), Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (Cinvestav), Irapuato, Mexico.
32
Choi S, Paek E. pXg: Comprehensive Identification of Noncanonical MHC-I-Associated Peptides From De Novo Peptide Sequencing Using RNA-Seq Reads. Mol Cell Proteomics 2024; 23:100743. [PMID: 38403075 PMCID: PMC10979277 DOI: 10.1016/j.mcpro.2024.100743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 02/19/2024] [Accepted: 02/21/2024] [Indexed: 02/27/2024] Open
Abstract
Discovering noncanonical peptides has been a common application of proteogenomics. Recent studies suggest that certain noncanonical peptides, known as noncanonical major histocompatibility complex-I (MHC-I)-associated peptides (ncMAPs), that bind to MHC-I may make good immunotherapeutic targets. De novo peptide sequencing is a great way to find ncMAPs since it can detect peptide sequences from their tandem mass spectra without using any sequence databases. However, this strategy has not been widely applied for ncMAP identification because there is not a good way to estimate its false-positive rates. In order to completely and accurately identify immunopeptides using de novo peptide sequencing, we describe a unique pipeline called proteomics X genomics. In contrast to current pipelines, it makes use of genomic data, RNA-Seq abundance and sequencing quality, in addition to proteomic features to increase the sensitivity and specificity of peptide identification. We show that the peptide-spectrum match quality and genetic traits have a clear relationship, showing that they can be utilized to evaluate peptide-spectrum matches. From 10 samples, we found 24,449 canonical MHC-I-associated peptides and 956 ncMAPs by using a target-decoy competition. Three hundred eighty-seven ncMAPs and 1611 canonical MHC-I-associated peptides were new identifications that had not yet been published. We discovered 11 ncMAPs produced from a squirrel monkey retrovirus in human cell lines in addition to the two ncMAPs originating from a complementarity determining region 3 in an antibody thanks to the unrestricted search space assumed by de novo sequencing. These entirely new identifications show that proteomics X genomics can make the most of de novo peptide sequencing's advantages and its potential use in the search for new immunotherapeutic targets.
Affiliation(s)
- Seunghyuk Choi
- Department of Computer Science, Hanyang University, Seoul, Republic of Korea
| | - Eunok Paek
- Department of Computer Science, Hanyang University, Seoul, Republic of Korea; Institute for Artificial Intelligence Research, Hanyang University, Seoul, Republic of Korea.
33
Ferreira HJ, Stevenson BJ, Pak H, Yu F, Almeida Oliveira J, Huber F, Taillandier-Coindard M, Michaux J, Ricart-Altimiras E, Kraemer AI, Kandalaft LE, Speiser DE, Nesvizhskii AI, Müller M, Bassani-Sternberg M. Immunopeptidomics-based identification of naturally presented non-canonical circRNA-derived peptides. Nat Commun 2024; 15:2357. [PMID: 38490980 PMCID: PMC10943130 DOI: 10.1038/s41467-024-46408-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 02/16/2024] [Indexed: 03/18/2024] Open
Abstract
Circular RNAs (circRNAs) are covalently closed non-coding RNAs lacking the 5' cap and the poly-A tail. Nevertheless, it has been demonstrated that certain circRNAs can undergo active translation. Therefore, aberrantly expressed circRNAs in human cancers could be an unexplored source of tumor-specific antigens, potentially mediating anti-tumor T cell responses. This study presents an immunopeptidomics workflow with a specific focus on generating a circRNA-specific protein fasta reference. The main goal of this workflow is to streamline the process of identifying and validating human leukocyte antigen (HLA) bound peptides potentially originating from circRNAs. We increase the analytical stringency of our workflow by retaining peptides identified independently by two mass spectrometry search engines and/or by applying a group-specific FDR for canonical-derived and circRNA-derived peptides. A subset of circRNA-derived peptides specifically encoded by the region spanning the back-splice junction (BSJ) are validated with targeted MS, and with direct Sanger sequencing of the respective source transcripts. Our workflow identifies 54 unique BSJ-spanning circRNA-derived peptides in the immunopeptidome of melanoma and lung cancer samples. Our approach enlarges the catalog of source proteins that can be explored for immunotherapy.
Affiliation(s)
- Humberto J Ferreira
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
| | - Brian J Stevenson
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
| | - HuiSong Pak
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
| | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Jessica Almeida Oliveira
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
| | - Florian Huber
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
| | - Marie Taillandier-Coindard
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
| | - Justine Michaux
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
| | - Emma Ricart-Altimiras
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
| | - Anne I Kraemer
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
| | - Lana E Kandalaft
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
- Center of Experimental Therapeutics, Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
| | - Daniel E Speiser
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
| | - Alexey I Nesvizhskii
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Markus Müller
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
| | - Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland.
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland.
- Agora Cancer Research Centre, Lausanne, Switzerland.
- Center of Experimental Therapeutics, Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland.
34
Hsiao Y, Zhang H, Li GX, Deng Y, Yu F, Kahrood HV, Steele JR, Schittenhelm RB, Nesvizhskii AI. Analysis and visualization of quantitative proteomics data using FragPipe-Analyst. bioRxiv 2024:2024.03.05.583643. [PMID: 38496650 PMCID: PMC10942459 DOI: 10.1101/2024.03.05.583643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
The FragPipe computational proteomics platform is gaining widespread popularity among the proteomics research community because of its fast processing speed and user-friendly graphical interface. Although FragPipe produces well-formatted output tables that are ready for analysis, there is still a need for an easy-to-use and user-friendly downstream statistical analysis and visualization tool. FragPipe-Analyst addresses this need by providing an R shiny web server to assist FragPipe users in conducting downstream analyses of the resulting quantitative proteomics data. It supports major quantification workflows including label-free quantification, tandem mass tags, and data-independent acquisition. FragPipe-Analyst offers a range of useful functionalities, such as various missing value imputation options, data quality control, unsupervised clustering, differential expression (DE) analysis using Limma, and gene ontology and pathway enrichment analysis using Enrichr. To support advanced analysis and customized visualizations, we also developed FragPipeAnalystR, an R package encompassing all FragPipe-Analyst functionalities that is extended to support site-specific analysis of post-translational modifications (PTMs). FragPipe-Analyst and FragPipeAnalystR are both open-source and freely available.
Affiliation(s)
- Yi Hsiao
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Haijian Zhang
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Ginny Xiaohe Li
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yamei Deng
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Hossein Valipour Kahrood
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
- Monash Genomics & Bioinformatics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Joel R. Steele
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Ralf B. Schittenhelm
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Alexey I. Nesvizhskii
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
35
Li Y, He Q, Guo H, Shuai SC, Cheng J, Liu L, Shuai J. AttnPep: A Self-Attention-Based Deep Learning Method for Peptide Identification in Shotgun Proteomics. J Proteome Res 2024; 23:834-843. [PMID: 38252705 DOI: 10.1021/acs.jproteome.3c00729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
In shotgun proteomics, the proteome search engine analyzes experimentally obtained mass spectra and reports a peptide-spectrum match (PSM) for each spectrum. However, most of the PSMs identified are incorrect, and various postprocessing tools have therefore been developed to rerank the peptide identifications. Yet these methods suffer from issues such as dependency on distribution, reliance on shallow models, and limited effectiveness. In this work, we propose AttnPep, a deep learning model for rescoring PSMs that utilizes the Self-Attention module. This module helps the neural network focus on features relevant to the classification of PSMs and ignore irrelevant features, which allows AttnPep to analyze the output of different search engines and improve PSM discrimination accuracy. We considered a PSM to be correct if it achieved a q-value <0.01 and compared AttnPep with the existing mainstream tools PeptideProphet, Percolator, and proteoTorch. The results indicated that AttnPep found an average increase in correct PSMs of 9.29% relative to the other methods. Additionally, AttnPep was able to better distinguish between correct and incorrect PSMs and found more synthetic peptides in the complex SWATH data set.
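The q-value <0.01 criterion above can be illustrated with a minimal sketch. The abstract does not say how q-values were computed, so the target-decoy estimate below is an assumption, and the function name `target_decoy_qvalues` and the `(score, is_decoy)` input layout are hypothetical.

```python
from typing import List, Tuple


def target_decoy_qvalues(psms: List[Tuple[float, bool]]) -> List[float]:
    """Return one q-value per PSM, in input order.

    Each PSM is (score, is_decoy); a higher score means a better match.
    FDR at a rank is estimated as decoys / targets among all PSMs scoring
    at least as well (a common convention, assumed here).
    """
    # Indices of PSMs from best score to worst.
    order = sorted(range(len(psms)), key=lambda i: psms[i][0], reverse=True)

    fdrs = []
    targets = decoys = 0
    for i in order:
        if psms[i][1]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # The q-value is the smallest FDR at this rank or any worse rank,
    # which makes q-values non-decreasing as the score drops.
    qvals = [0.0] * len(psms)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        qvals[order[rank]] = running_min
    return qvals


# Toy input: three target PSMs and two decoy PSMs.
example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, True)]
qvalues = target_decoy_qvalues(example)
accepted = [i for i, (_, is_decoy) in enumerate(example)
            if not is_decoy and qvalues[i] < 0.01]
```

On this toy input only the two top-scoring target PSMs fall below the 0.01 threshold. A rescoring model changes the scores that feed such a procedure, not the thresholding itself.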
Affiliation(s)
- Yulin Li
- Department of Physics, Xiamen University, Xiamen 361005, China
| | - Qingzu He
- Department of Physics, Xiamen University, Xiamen 361005, China
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Huan Guo
- Department of Physics, Xiamen University, Xiamen 361005, China
| | - Stella C Shuai
- Biological Science, Northwestern University, Evanston, Illinois 60208, United States
| | - Jinyan Cheng
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Liyu Liu
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Jianwei Shuai
- Department of Physics, Xiamen University, Xiamen 361005, China
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
36
Cao X, Sun S, Xing J. A Massive Proteogenomic Screen Identifies Thousands of Novel Peptides From the Human "Dark" Proteome. Mol Cell Proteomics 2024; 23:100719. [PMID: 38242438 PMCID: PMC10867589 DOI: 10.1016/j.mcpro.2024.100719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 01/01/2024] [Accepted: 01/16/2024] [Indexed: 01/21/2024] Open
Abstract
Although the human gene annotation has been continuously improved over the past 2 decades, numerous studies demonstrated the existence of a "dark proteome", consisting of proteins that were critical for biological processes but not included in widely used gene catalogs. The Genotype-Tissue Expression project generated more than 15,000 RNA-seq datasets from multiple tissues, which modeled 30 million transcripts in the human genome. To provide a resource of high-confidence novel proteins from the dark proteome, we screened 50,000 mass spectrometry runs from over 900 projects to identify proteins translated from the Genotype-Tissue Expression transcript model with proteomic support. We also integrated 3.8 million common genetic variants from the gnomAD database to improve peptide identification. As a result, we identified 170,529 novel peptides with proteomic evidence, of which 6048 passed the strictest standard we defined and were supported by PepQuery. We provided a user-friendly website (https://ncorf.genes.fun/) for researchers to check the evidence of novel peptides from their studies. The findings will improve our understanding of coding genes and facilitate genomic data interpretation in biomedical research.
Affiliation(s)
- Xiaolong Cao
- Department of Anesthesiology, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Siqi Sun
- Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jinchuan Xing
- Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA.
37
Afonin AM, Piironen AK, de Sousa Maciel I, Ivanova M, Alatalo A, Whipp AM, Pulkkinen L, Rose RJ, van Kamp I, Kaprio J, Kanninen KM. Proteomic insights into mental health status: plasma markers in young adults. Transl Psychiatry 2024; 14:55. [PMID: 38267423 PMCID: PMC10808121 DOI: 10.1038/s41398-024-02751-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 01/05/2024] [Accepted: 01/08/2024] [Indexed: 01/26/2024] Open
Abstract
Global emphasis on enhancing prevention and treatment strategies necessitates an increased understanding of the biological mechanisms of psychopathology. Plasma proteomics is a powerful tool that has been applied in the context of specific mental disorders for biomarker identification. The p-factor, also known as the "general psychopathology factor", is a concept in psychopathology suggesting that there is a common underlying factor that contributes to the development of various forms of mental disorders. It has been proposed that the p-factor can be used to understand the overall mental health status of an individual. Here, we aimed to discover plasma proteins associated with the p-factor in 775 young adults in the FinnTwin12 cohort. Using liquid chromatography-tandem mass spectrometry, 13 proteins with a significant connection with the p-factor were identified, 8 of which were linked to epidermal growth factor receptor (EGFR) signaling. This exploratory study provides new insight into biological alterations associated with mental health status in young adults.
Affiliation(s)
- Alexey M Afonin
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
| | - Aino-Kaisa Piironen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
| | - Izaque de Sousa Maciel
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
| | - Mariia Ivanova
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
| | - Arto Alatalo
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
| | - Alyce M Whipp
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
| | - Lea Pulkkinen
- Department of Psychology, University of Jyvaskyla, Jyvaskyla, Finland
| | - Richard J Rose
- Department of Psychological & Brain Sciences, Indiana University, Bloomington, IN, USA
| | - Irene van Kamp
- Centre for Sustainability, Environment and Health, National Institute for Public Health and the Environment, Bilthoven, the Netherlands
| | - Jaakko Kaprio
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Department of Public Health, University of Helsinki, Helsinki, Finland
| | - Katja M Kanninen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland.
38
Santos LGC, Parreira VDSC, da Silva EMG, Santos MDM, Fernandes ADF, Neves-Ferreira AGDC, Carvalho PC, Freitas FCDP, Passetti F. SpliceProt 2.0: A Sequence Repository of Human, Mouse, and Rat Proteoforms. Int J Mol Sci 2024; 25:1183. [PMID: 38256255 PMCID: PMC10816255 DOI: 10.3390/ijms25021183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/15/2023] [Accepted: 01/03/2024] [Indexed: 01/24/2024] Open
Abstract
SpliceProt 2.0 is a public proteogenomics database that aims to list the sequence of known proteins and potential new proteoforms in human, mouse, and rat proteomes. This updated repository provides an even broader range of computationally translated proteins and serves, for example, to aid with proteomic validation of splice variants absent from the reference UniProtKB/SwissProt database. We demonstrate the value of SpliceProt 2.0 to predict orthologous proteins between humans and murines based on transcript reconstruction, sequence annotation and detection at the transcriptome and proteome levels. In this release, the annotation data used in the reconstruction of transcripts based on the methodology of ternary matrices were acquired from new databases such as Ensembl, UniProt, and APPRIS. Another innovation implemented in the pipeline is the exclusion of transcripts predicted to be susceptible to degradation through the NMD pathway. Taken together, our repository and its applications represent a valuable resource for the proteogenomics community.
Affiliation(s)
- Letícia Graziela Costa Santos
- Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
| | - Vinícius da Silva Coutinho Parreira
- Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
| | - Esdras Matheus Gomes da Silva
- Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
- Laboratory of Toxinology, Oswaldo Cruz Institute, Fundação Oswaldo Cruz (FIOCRUZ), Av. Brazil 4036, Campus Maré, Rio de Janeiro 21040-361, RJ, Brazil
| | - Marlon Dias Mariano Santos
- Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
| | - Alexander da Franca Fernandes
- Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
| | - Ana Gisele da Costa Neves-Ferreira
- Laboratory of Toxinology, Oswaldo Cruz Institute, Fundação Oswaldo Cruz (FIOCRUZ), Av. Brazil 4036, Campus Maré, Rio de Janeiro 21040-361, RJ, Brazil
| | - Paulo Costa Carvalho
- Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
| | - Flávia Cristina de Paula Freitas
- Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
- Departamento de Genética e Evolução, Universidade Federal de São Carlos (UFSCar), Rodovia Washington Luis, Km 235, São Carlos 13565-905, SP, Brazil
| | - Fabio Passetti
- Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
| |
|
39
|
Lyapina I, Fesenko I. Intracellular and Extracellular Peptidomes of the Model Plant, Physcomitrium patens. Methods Mol Biol 2024; 2758:375-385. [PMID: 38549025 DOI: 10.1007/978-1-0716-3646-6_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]
Abstract
Here, we report our approach to peptidomic analysis of the plant model Physcomitrium patens. Intracellular and extracellular peptides were extracted under conditions preventing proteolytic digestion by endogenous proteases. The extracts were fractionated on size exclusion columns to isolate intracellular peptides and on reversed-phase cartridges to isolate extracellular peptides, with the isolated peptides subjected to LC-MS/MS analysis. Mass spectrometry data were analyzed for the presence of peptides derived from the known proteins or microproteins encoded by small open reading frames (<100 aa, smORFs) predicted in the moss genome. Experimental details are provided for each step.
Affiliation(s)
- Irina Lyapina
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia
| | - Igor Fesenko
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia
| |
|
40
|
Huang D, Zhu X, Ye S, Zhang J, Liao J, Zhang N, Zeng X, Wang J, Yang B, Zhang Y, Lao L, Chen J, Xin M, Nie Y, Saw PE, Su S, Song E. Tumour circular RNAs elicit anti-tumour immunity by encoding cryptic peptides. Nature 2024; 625:593-602. [PMID: 38093017 DOI: 10.1038/s41586-023-06834-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/11/2022] [Accepted: 11/03/2023] [Indexed: 12/23/2023]
Abstract
Emerging data have shown that previously defined noncoding genomes might encode peptides that bind human leukocyte antigen (HLA) as cryptic antigens to stimulate adaptive immunity [1,2]. However, the significance and mechanisms of action of cryptic antigens in anti-tumour immunity remain unclear. Here mass spectrometry of the HLA class I (HLA-I) peptidome coupled with ribosome sequencing of human breast cancer samples identified HLA-I-binding cryptic antigenic peptides that were noncanonically translated by a tumour-specific circular RNA (circRNA): circFAM53B. The cryptic peptides efficiently primed naive CD4+ and CD8+ T cells in an antigen-specific manner and induced anti-tumour immunity. Clinically, the expression of circFAM53B and its encoded peptides was associated with substantial infiltration of antigen-specific CD8+ T cells and better survival in patients with breast cancer and patients with melanoma. Mechanistically, circFAM53B-encoded peptides had strong binding affinity to both HLA-I and HLA-II molecules. In vivo, administration of vaccines consisting of tumour-specific circRNA or its encoded peptides in mice bearing breast cancer tumours or melanoma induced enhanced infiltration of tumour-antigen-specific cytotoxic T cells, which led to effective tumour control. Overall, our findings reveal that noncanonical translation of circRNAs can drive efficient anti-tumour immunity, which suggests that vaccination exploiting tumour-specific circRNAs may serve as an immunotherapeutic strategy against malignant tumours.
Affiliation(s)
- Di Huang
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Xiaofeng Zhu
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Shuying Ye
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Jiahui Zhang
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Jianyou Liao
- Medical Research Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Ning Zhang
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Xin Zeng
- Program of Molecular Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
| | - Jiawen Wang
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Bing Yang
- Medical Research Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Yin Zhang
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Liyan Lao
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Jianing Chen
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Min Xin
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Yan Nie
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Phei Er Saw
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Shicheng Su
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China.
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China.
- Department of Immunology, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China.
- Department of Infectious Diseases, the Third Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China.
- Biotherapy Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China.
| | - Erwei Song
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China.
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China.
| |
|
41
|
Provencher N, Leblanc S, Jacques JF, Roucou X. Exploring the Alternative Proteome with OpenProt and Mass Spectrometry. Methods Mol Biol 2024; 2836:3-17. [PMID: 38995532 DOI: 10.1007/978-1-0716-4007-4_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2024]
Abstract
Proteogenomics has revealed the translation of unannotated open reading frames (ORFs) present in mRNAs and in noncoding RNAs (ncRNAs). OpenProt annotates all ORFs with a minimum of 30 codons in the transcriptome of several species and displays many functional features associated with the corresponding proteins. Two types of proteins are annotated: reference or canonical proteins which are proteins already annotated in UniProt, RefSeq, or Ensembl and noncanonical proteins. Noncanonical proteins form two groups: predicted novel isoforms that display a significant level of homology with a reference protein and alternative proteins that are new proteins with no significant homology to known proteins. This chapter describes how to check whether a gene and/or transcript contains multiple open reading frames and how to use OpenProt databases for the detection of alternative proteins and novel isoforms by mass spectrometry-based proteomics.
Affiliation(s)
- Nicolas Provencher
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Sébastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Jean-François Jacques
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada.
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC, Canada.
| |
|
42
|
Olaya‐Abril A, Biełło K, Rodríguez‐Caballero G, Cabello P, Sáez LP, Moreno‐Vivián C, Luque‐Almagro VM, Roldán MD. Bacterial tolerance and detoxification of cyanide, arsenic and heavy metals: Holistic approaches applied to bioremediation of industrial complex wastes. Microb Biotechnol 2024; 17:e14399. [PMID: 38206076 PMCID: PMC10832572 DOI: 10.1111/1751-7915.14399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 12/19/2023] [Accepted: 12/22/2023] [Indexed: 01/12/2024] [Open Access]
Abstract
Cyanide is a highly toxic compound found in wastewaters generated by different industrial activities, such as mining or jewellery manufacturing. These residues usually contain high concentrations of other toxic pollutants, like arsenic and heavy metals, that may form different complexes with cyanide. To develop bioremediation strategies, it is necessary to know the metabolic processes involved in the tolerance and detoxification of these pollutants, but most current studies focus on characterizing the microbial responses to each of these environmental hazards individually, and the effect of co-contaminated wastes on microbial metabolism has hardly been addressed. This work summarizes the main strategies developed by bacteria to alleviate the effects of cyanide, arsenic and heavy metals, analysing interactions among these toxic chemicals. Additionally, the role of systems biology and synthetic biology as tools for developing bioremediation strategies for complex industrial wastes and co-contaminated sites is discussed, emphasizing the importance of, and progress derived from, meta-omic studies.
Affiliation(s)
- Alfonso Olaya‐Abril
- Departamento de Bioquímica y Biología Molecular, Edificio Severo Ochoa, Campus de Rabanales, Universidad de Córdoba, Córdoba, Spain
| | - Karolina Biełło
- Departamento de Bioquímica y Biología Molecular, Edificio Severo Ochoa, Campus de Rabanales, Universidad de Córdoba, Córdoba, Spain
| | - Gema Rodríguez‐Caballero
- Departamento de Bioquímica y Biología Molecular, Edificio Severo Ochoa, Campus de Rabanales, Universidad de Córdoba, Córdoba, Spain
| | - Purificación Cabello
- Departamento de Botánica, Ecología y Fisiología Vegetal, Edificio Celestino Mutis, Campus de Rabanales, Universidad de Córdoba, Córdoba, Spain
| | - Lara P. Sáez
- Departamento de Bioquímica y Biología Molecular, Edificio Severo Ochoa, Campus de Rabanales, Universidad de Córdoba, Córdoba, Spain
| | - Conrado Moreno‐Vivián
- Departamento de Bioquímica y Biología Molecular, Edificio Severo Ochoa, Campus de Rabanales, Universidad de Córdoba, Córdoba, Spain
| | - Víctor Manuel Luque‐Almagro
- Departamento de Bioquímica y Biología Molecular, Edificio Severo Ochoa, Campus de Rabanales, Universidad de Córdoba, Córdoba, Spain
| | - María Dolores Roldán
- Departamento de Bioquímica y Biología Molecular, Edificio Severo Ochoa, Campus de Rabanales, Universidad de Córdoba, Córdoba, Spain
| |
|
43
|
Solovyeva EM, Utzinger S, Vissières A, Mitchelmore J, Ahrné E, Hermes E, Poetsch T, Ronco M, Bidinosti M, Merkl C, Serluca FC, Fessenden J, Naumann U, Voshol H, Meyer AS, Hoersch S. Integrative Proteogenomics for Differential Expression and Splicing Variation in a DM1 Mouse Model. Mol Cell Proteomics 2024; 23:100683. [PMID: 37993104 PMCID: PMC10770608 DOI: 10.1016/j.mcpro.2023.100683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2022] [Revised: 09/02/2023] [Accepted: 11/17/2023] [Indexed: 11/24/2023] [Open Access]
Abstract
Dysregulated mRNA splicing is involved in the pathogenesis of many diseases including cancer, neurodegenerative diseases, and muscular dystrophies such as myotonic dystrophy type 1 (DM1). Comprehensive assessment of dysregulated splicing at the transcriptome and proteome levels has been methodologically challenging, and thus investigations have often targeted only a few genes. Here, we performed a large-scale coordinated transcriptomic and proteomic analysis to characterize a DM1 mouse model (HSALR) in comparison to wild type. Our integrative proteogenomics approach comprised gene- and splicing-level assessments for mRNAs and proteins. It recapitulated many known instances of aberrant mRNA splicing in DM1 and identified new ones. It enabled the design and targeting of splicing-specific peptides and confirmed the translation of known instances of aberrantly spliced disease-related genes (e.g., Atp2a1, Bin1, Ryr1), complemented by novel findings (Flnc and Ywhae). Comparative analysis of large-scale mRNA and protein expression data showed quantitative agreement of differentially expressed genes and splicing patterns between disease and wild type. We hence propose this work as a suitable blueprint for a robust and scalable integrative proteogenomic strategy geared toward advancing our understanding of splicing-based disorders. With such a strategy, splicing-based biomarker candidates emerge as an attractive and accessible option, as they can be efficiently asserted at the mRNA and protein levels in a coordinated fashion.
Affiliation(s)
- Elizaveta M Solovyeva
- Research Informatics, Biomedical Research at Novartis, Basel, Switzerland; V.L. Talrose Institute for Energy Problems of Chemical Physics, N.N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, Russia.
| | - Stephan Utzinger
- Diseases of Aging and Regenerative Medicine, Biomedical Research at Novartis, Basel, Switzerland
| | | | - Joanna Mitchelmore
- Diseases of Aging and Regenerative Medicine, Biomedical Research at Novartis, Basel, Switzerland
| | - Erik Ahrné
- Discovery Sciences, Biomedical Research at Novartis, Basel, Switzerland
| | - Erwin Hermes
- Discovery Sciences, Biomedical Research at Novartis, Basel, Switzerland
| | - Tania Poetsch
- Discovery Sciences, Biomedical Research at Novartis, Basel, Switzerland
| | - Marie Ronco
- Diseases of Aging and Regenerative Medicine, Biomedical Research at Novartis, Basel, Switzerland
| | - Michael Bidinosti
- Diseases of Aging and Regenerative Medicine, Biomedical Research at Novartis, Basel, Switzerland
| | - Claudia Merkl
- Diseases of Aging and Regenerative Medicine, Biomedical Research at Novartis, Basel, Switzerland
| | - Fabrizio C Serluca
- Research Informatics, Biomedical Research at Novartis, Cambridge, Massachusetts, USA
| | - James Fessenden
- Neurodegenerative Diseases, Biomedical Research at Novartis, Cambridge, Massachusetts, USA
| | - Ulrike Naumann
- Discovery Sciences, Biomedical Research at Novartis, Basel, Switzerland
| | - Hans Voshol
- Discovery Sciences, Biomedical Research at Novartis, Basel, Switzerland
| | - Angelika S Meyer
- Diseases of Aging and Regenerative Medicine, Biomedical Research at Novartis, Basel, Switzerland
| | - Sebastian Hoersch
- Research Informatics, Biomedical Research at Novartis, Basel, Switzerland.
| |
|
44
|
Genth J, Schäfer K, Cassidy L, Graspeuntner S, Rupp J, Tholey A. Identification of proteoforms of short open reading frame-encoded peptides in Blautia producta under different cultivation conditions. Microbiol Spectr 2023; 11:e0252823. [PMID: 37782090 PMCID: PMC10715070 DOI: 10.1128/spectrum.02528-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Accepted: 08/14/2023] [Indexed: 10/03/2023] [Open Access]
Abstract
IMPORTANCE: The identification of short open reading frame-encoded peptides (SEP) and different proteoforms in single cultures of gut microbes offers new insights into a largely neglected part of the microbial proteome landscape. This is of particular importance as SEP provide various predicted functions, such as acting as antimicrobial peptides, maintaining cell homeostasis under stress conditions, or even contributing to the virulence pattern. They thus play a poorly understood role in the structure and function of microbial networks in the human body. A better understanding of SEP in the context of human health requires a precise understanding of the abundance of SEP in both commensal microbes and pathogens. For the gut-beneficial B. producta, we demonstrate the importance of specific environmental conditions for biosynthesis of SEP, expanding previous findings about their role in microbial interactions.
Affiliation(s)
- Jerome Genth
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Kathrin Schäfer
- Department of Infectious Diseases and Microbiology, University of Lübeck, Lübeck, Germany
| | - Liam Cassidy
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Simon Graspeuntner
- Department of Infectious Diseases and Microbiology, University of Lübeck, Lübeck, Germany
- German Center for Infection Research (DZIF), Partner Site Hamburg-Lübeck-Borstel-Riems, Lübeck, Germany
| | - Jan Rupp
- Department of Infectious Diseases and Microbiology, University of Lübeck, Lübeck, Germany
- German Center for Infection Research (DZIF), Partner Site Hamburg-Lübeck-Borstel-Riems, Lübeck, Germany
| | - Andreas Tholey
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| |
|
45
|
Lin MS, Varunjikar MS, Lie KK, Søfteland L, Dellafiora L, Ørnsrud R, Sanden M, Berntssen MHG, Dorne JLCM, Bafna V, Rasinger JD. Multi-tissue proteogenomic analysis for mechanistic toxicology studies in non-model species. Environ Int 2023; 182:108309. [PMID: 37980879 DOI: 10.1016/j.envint.2023.108309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Revised: 08/15/2023] [Accepted: 11/04/2023] [Indexed: 11/21/2023]
Abstract
New approach methodologies (NAM), including omics and in vitro approaches, are contributing to the implementation of 3R (reduction, refinement and replacement) strategies in regulatory science and risk assessment. In this study, we present an integrative transcriptomics and proteomics analysis workflow for the validation and revision of complex fish genomes and demonstrate how proteogenomics expression matrices can be used to support multi-level omics data integration in non-model species in vivo and in vitro. Using Atlantic salmon as an example, we constructed proteogenomic databases from publicly available transcriptomic data and in-house generated RNA-Seq and LC-MS/MS data. Our analysis identified ∼80,000 peptides, providing direct evidence of translation for over 40,000 RefSeq structures. The data also highlighted 183 co-located peptide groups that supported a single transcript each, and in each case, either corrected a previous annotation, supported Ensembl annotations not present in RefSeq, or identified novel previously unannotated genes. Proteogenomics data-derived expression matrices revealed distinct profiles for the different tissue types analyzed. Focusing on proteins involved in defense against xenobiotics, we detected distinct expression patterns across different salmon tissues and observed homology in the expression of chemical defense proteins between in vivo and in vitro liver systems. Our study demonstrates the potential of proteogenomic analyses in extending our understanding of complex fish genomes and provides an advanced bioinformatic toolkit to support the further development of NAMs and their application in regulatory science and (eco)toxicological studies of non-model species.
Affiliation(s)
- M S Lin
- Bioinformatics and Systems Biology Program, UC San Diego, San Diego, CA, United States.
| | | | - K K Lie
- Institute of Marine Research, Bergen, Norway.
| | - L Søfteland
- Institute of Marine Research, Bergen, Norway.
| | - L Dellafiora
- Department of Food and Drug, University of Parma, Parco Area delle Scienze 27/A, 43124 Parma, Italy.
| | - R Ørnsrud
- Institute of Marine Research, Bergen, Norway.
| | - M Sanden
- Institute of Marine Research, Bergen, Norway.
| | | | - J L C M Dorne
- European Food Safety Authority, Methodological and Scientific Support Unit, Via Carlo Magno 1A, 43121 Parma, Italy.
| | - V Bafna
- Computer Science & Engineering and HDSI, UC San Diego, San Diego, CA, United States.
| | | |
|
46
|
Song YC, Das D, Zhang Y, Chen MX, Fernie AR, Zhu FY, Han J. Proteogenomics-based functional genome research: approaches, applications, and perspectives in plants. Trends Biotechnol 2023; 41:1532-1548. [PMID: 37365082 DOI: 10.1016/j.tibtech.2023.05.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Revised: 05/17/2023] [Accepted: 05/30/2023] [Indexed: 06/28/2023]
Abstract
Proteogenomics (PG) integrates the proteome with the genome and transcriptome to refine gene models and annotation. Coupled with single-cell (SC) assays, PG effectively distinguishes heterogeneity among cell groups. Affiliating spatial information to PG reveals the high-resolution circuitry within SC atlases. Additionally, PG can investigate dynamic changes in protein-coding genes in plants across growth and development as well as stress and external stimulation, significantly contributing to the functional genome. Here we summarize existing PG research in plants and introduce the technical features of various methods. Combining PG with other omics, such as metabolomics and peptidomics, can offer even deeper insights into gene functions. We argue that the application of PG will represent an important font of foundational knowledge for plants.
Affiliation(s)
- Yu-Chen Song
- State Key Laboratory of Tree Genetics and Breeding, Co-Innovation Center for Sustainable Forestry in Southern China, Key Laboratory of Tree Genetics and Biotechnology of Educational Department of China, Key Laboratory of State Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China; College of Biology and Environment, Nanjing Forestry University, Nanjing 210037, China
| | - Debatosh Das
- College of Agriculture, Food and Natural Resources (CAFNR), Division of Plant Sciences and Technology, 52 Agricultural Building, University of Missouri-Columbia, MO 65201, USA
| | - Youjun Zhang
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany; Center of Plant Systems Biology and Biotechnology, Plovdiv, Bulgaria
| | - Mo-Xian Chen
- State Key Laboratory of Tree Genetics and Breeding, Co-Innovation Center for Sustainable Forestry in Southern China, Key Laboratory of Tree Genetics and Biotechnology of Educational Department of China, Key Laboratory of State Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China; College of Biology and Environment, Nanjing Forestry University, Nanjing 210037, China.
| | - Alisdair R Fernie
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany; Center of Plant Systems Biology and Biotechnology, Plovdiv, Bulgaria.
| | - Fu-Yuan Zhu
- State Key Laboratory of Tree Genetics and Breeding, Co-Innovation Center for Sustainable Forestry in Southern China, Key Laboratory of Tree Genetics and Biotechnology of Educational Department of China, Key Laboratory of State Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China; College of Biology and Environment, Nanjing Forestry University, Nanjing 210037, China.
| | - Jiangang Han
- State Key Laboratory of Tree Genetics and Breeding, Co-Innovation Center for Sustainable Forestry in Southern China, Key Laboratory of Tree Genetics and Biotechnology of Educational Department of China, Key Laboratory of State Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China; College of Biology and Environment, Nanjing Forestry University, Nanjing 210037, China.
| |
|
47
|
Fuchs S, Engelmann S. Small proteins in bacteria - Big challenges in prediction and identification. Proteomics 2023; 23:e2200421. [PMID: 37609810 DOI: 10.1002/pmic.202200421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 08/03/2023] [Accepted: 08/10/2023] [Indexed: 08/24/2023]
Abstract
Proteins with up to 100 amino acids have been largely overlooked due to the challenges associated with predicting and identifying them using traditional methods. Recent advances in bioinformatics and machine learning, DNA sequencing, RNA and Ribo-seq technologies, and mass spectrometry (MS) have greatly facilitated the detection and characterisation of these elusive proteins in recent years. This has revealed their crucial role in various cellular processes including regulation, signalling and transport, as toxins and as folding helpers for protein complexes. Consequently, the systematic identification and characterisation of these proteins in bacteria have emerged as a prominent field of interest within the microbial research community. This review provides an overview of different strategies for predicting and identifying these proteins on a large scale, leveraging the power of these advanced technologies. Furthermore, the review offers insights into the future developments that may be expected in this field.
Affiliation(s)
- Stephan Fuchs
- Genome Competence Center (MF1), Department MFI, Robert-Koch-Institut, Berlin, Germany
| | - Susanne Engelmann
- Institute for Microbiology, Technische Universität Braunschweig, Braunschweig, Germany
- Microbial Proteomics, Helmholtzzentrum für Infektionsforschung GmbH, Braunschweig, Germany
| |
|
48
|
Wacholder A, Carvunis AR. Biological factors and statistical limitations prevent detection of most noncanonical proteins by mass spectrometry. PLoS Biol 2023; 21:e3002409. [PMID: 38048358 PMCID: PMC10721188 DOI: 10.1371/journal.pbio.3002409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Revised: 12/14/2023] [Accepted: 10/30/2023] [Indexed: 12/06/2023] [Open Access]
Abstract
Ribosome profiling experiments indicate pervasive translation of short open reading frames (ORFs) outside of annotated protein-coding genes. However, shotgun mass spectrometry (MS) experiments typically detect only a small fraction of the predicted protein products of this noncanonical translation. The rarity of detection could indicate that most predicted noncanonical proteins are rapidly degraded and not present in the cell; alternatively, it could reflect technical limitations. Here, we leveraged recent advances in ribosome profiling and MS to investigate the factors limiting detection of noncanonical proteins in yeast. We show that the low detection rate of noncanonical ORF products can largely be explained by small size and low translation levels and does not indicate that they are unstable or biologically insignificant. In particular, proteins encoded by evolutionarily young genes, including those with well-characterized biological roles, are too short and too lowly expressed to be detected by shotgun MS at current detection sensitivities. Additionally, we find that decoy biases can give misleading estimates of noncanonical protein false discovery rates, potentially leading to false detections. After accounting for these issues, we found strong evidence for 4 noncanonical proteins in MS data, which were also supported by evolution and translation data. These results illustrate the power of MS to validate unannotated genes predicted by ribosome profiling, but also its substantial limitations in finding many biologically relevant lowly expressed proteins.
Affiliation(s)
- Aaron Wacholder
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Anne-Ruxandra Carvunis
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
|
49
|
Meng W, Schreiber RD, Lichti CF. Recent advances in immunopeptidomic-based tumor neoantigen discovery. Adv Immunol 2023; 160:1-36. [PMID: 38042584 DOI: 10.1016/bs.ai.2023.10.001] [Indexed: 12/04/2023]
Abstract
The role of aberrantly expressed proteins in tumors in driving immune-mediated control of cancer has been well documented for more than five decades. Today, we know that both aberrantly expressed normal proteins and mutant proteins (neoantigens) can function as tumor antigens in humans and mice. Next-generation sequencing (NGS) and high-resolution mass spectrometry (MS) technologies have advanced significantly since the early 2010s, enabling detection of rare but clinically relevant neoantigens recognized by T cells. MS profiling of tumor-specific immunopeptidomes remains the most direct method to identify mutant peptides bound to cellular MHC. However, the need for large numbers of cells or significant amounts of tumor tissue to achieve neoantigen detection has historically limited the application of MS. Newer, more sensitive MS technologies have recently demonstrated the capacity to detect neoantigens from fewer cells. Here, we highlight recent advancements in immunopeptidomics-based characterization of tumor-specific neoantigens. Various tumor antigen categories and neoantigen identification approaches are also discussed. Furthermore, we summarize recent reports that achieved successful tumor neoantigen detection by MS using a variety of starting materials, MS acquisition modes, and novel ion mobility devices.
Affiliation(s)
- Wei Meng
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, United States; The Andrew M. and Jane M. Bursky Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, Saint Louis, MO, United States
- Robert D Schreiber
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, United States; The Andrew M. and Jane M. Bursky Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, Saint Louis, MO, United States
- Cheryl F Lichti
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, United States; The Andrew M. and Jane M. Bursky Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, Saint Louis, MO, United States
|
50
|
Brantl S, Ul Haq I. Small proteins in Gram-positive bacteria. FEMS Microbiol Rev 2023; 47:fuad064. [PMID: 38052429 PMCID: PMC10730256 DOI: 10.1093/femsre/fuad064] [Received: 10/25/2023] [Revised: 11/27/2023] [Accepted: 12/04/2023] [Indexed: 12/07/2023] Open
Abstract
Small proteins comprising fewer than 100 amino acids have often been ignored in bacterial genome annotations. About 10 years ago, focused efforts started to investigate whole peptidomes, which resulted in the discovery of a multitude of small proteins, but only some of them have been characterized in detail. Generally, small proteins can be either membrane or cytosolic proteins. The latter interact with larger proteins, RNA, or even metal ions. Here, we summarize our current knowledge of small proteins from Gram-positive bacteria, with special emphasis on the model organism Bacillus subtilis. Our examples include membrane-bound toxins of type I toxin-antitoxin systems and proteins that block the assembly of higher-order structures, regulate sporulation, or modulate the RNA degradosome. We do not consider antimicrobial peptides. Furthermore, we present methods for the identification and investigation of small proteins.
Affiliation(s)
- Sabine Brantl
- AG Bakteriengenetik, Matthias-Schleiden-Institut, Friedrich-Schiller-Universität Jena, Philosophenweg 12, Jena D-07743, Germany
- Inam Ul Haq
- AG Bakteriengenetik, Matthias-Schleiden-Institut, Friedrich-Schiller-Universität Jena, Philosophenweg 12, Jena D-07743, Germany
|