1
Raj A, Aggarwal S, Singh P, Yadav AK, Dash D. PgxSAVy: A tool for comprehensive evaluation of variant peptide quality in proteogenomics - catching the (un)usual suspects. Comput Struct Biotechnol J 2024; 23:711-722. [PMID: 38292474 PMCID: PMC10825656 DOI: 10.1016/j.csbj.2023.12.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 12/19/2023] [Accepted: 12/23/2023] [Indexed: 02/01/2024] Open
Abstract
Variant peptides resulting from single nucleotide polymorphisms (SNPs) can lead to aberrant protein functions and have translational potential for disease diagnosis and personalized therapy. Variant peptides detected by proteogenomics are fraught with a high number of false positives, but there is no uniform and comprehensive approach to assess variant quality across analysis pipelines. Despite class-specific FDR along with ad-hoc filters, the problem is far from solved. These protocols are typically manual and tedious, and thus not uniform across labs. We demonstrate that variant peptide rescoring, integrated with intensity, variant event information and search result features, allows better discrimination of correct variant peptides. Implemented in PgxSAVy, a tool for quality control of variant peptides, this method can tackle the high rate of false positives. PgxSAVy provides a rigorous framework for quality control and annotation of variant peptides on the basis of (i) variant quality, (ii) isobaric masses, and (iii) disease annotation. PgxSAVy identified true variants with 98.43% accuracy on simulated data. Large-scale proteogenomic reanalysis of ∼2.8 million spectra (PXD004010 and PXD001468) resulted in 12,705 variant peptide spectrum matches (PSMs), of which PgxSAVy evaluated 3028 (23.8%), 1409 (11.1%) and 8268 (65.1%) as confident, semi-confident and doubtful respectively. PgxSAVy also annotates the variants based on their pathogenicity and provides support for assisted manual validation. The analysis of proteins carrying variants can provide fine granularity in discovering important pathways. PgxSAVy will advance personalized medicine by providing a comprehensive framework for quality control and prioritization of proteogenomics variants. PgxSAVy is freely available at https://pgxsavy.igib.res.in/ as a webserver and https://github.com/anuragraj/PgxSAVy as a stand-alone tool.
Affiliation(s)
- Anurag Raj
- G. N. Ramachandran Knowledge Centre for Genomics Informatics, CSIR – Institute of Genomics and Integrative Biology, New Delhi, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Suruchi Aggarwal
- Computational and Mathematical Biology Centre (CMBC), 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Centre for Drug Discovery (CDD), 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Centre for Microbial Research (CMR), Translational Health Science and Technology Institute, NCR Biotech Science Cluster, 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Prateek Singh
- G. N. Ramachandran Knowledge Centre for Genomics Informatics, CSIR – Institute of Genomics and Integrative Biology, New Delhi, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Amit Kumar Yadav
- Computational and Mathematical Biology Centre (CMBC), 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Centre for Drug Discovery (CDD), 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Centre for Microbial Research (CMR), Translational Health Science and Technology Institute, NCR Biotech Science Cluster, 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Debasis Dash
- G. N. Ramachandran Knowledge Centre for Genomics Informatics, CSIR – Institute of Genomics and Integrative Biology, New Delhi, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
2
Edsjö A, Russnes HG, Lehtiö J, Tamborero D, Hovig E, Stenzinger A, Rosenquist R. High-throughput molecular assays for inclusion in personalised oncology trials - State-of-the-art and beyond. J Intern Med 2024; 295:785-803. [PMID: 38698538 DOI: 10.1111/joim.13785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
In the last decades, the development of high-throughput molecular assays has revolutionised cancer diagnostics, paving the way for the concept of personalised cancer medicine. This progress has been driven by the introduction of such technologies through biomarker-driven oncology trials. In this review, strengths and limitations of various state-of-the-art sequencing technologies, including gene panel sequencing (DNA and RNA), whole-exome/whole-genome sequencing and whole-transcriptome sequencing, are explored, focusing on their ability to identify clinically relevant biomarkers with diagnostic, prognostic and/or predictive impact. This includes the need to assess complex biomarkers, for example microsatellite instability, tumour mutation burden and homologous recombination deficiency, to identify patients suitable for specific therapies, including immunotherapy. Furthermore, the crucial role of biomarker analysis and multidisciplinary molecular tumour boards in selecting patients for trial inclusion is discussed in relation to various trial concepts, including drug repurposing. Recognising that today's exploratory techniques will evolve into tomorrow's routine diagnostics and clinical study inclusion assays, the importance of emerging technologies for multimodal diagnostics, such as proteomics and in vivo drug sensitivity testing, is also discussed. In addition, key regulatory aspects and the importance of patient engagement in all phases of a clinical trial are described. Finally, we propose a set of recommendations for consideration when planning a new precision cancer medicine trial.
Affiliation(s)
- Anders Edsjö
- Department of Clinical Genetics, Pathology and Molecular Diagnostics, Office for Medical Services, Region Skåne, Lund, Sweden
- Division of Pathology, Department of Clinical Sciences, Lund University, Lund, Sweden
- Hege G Russnes
- Department of Pathology, Oslo University Hospital, Oslo, Norway
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Janne Lehtiö
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
- Cancer genomics and proteomics, Karolinska University Hospital, Solna, Sweden
- David Tamborero
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
- Eivind Hovig
- Center for Bioinformatics, Department of Informatics, University of Oslo, Oslo, Norway
- Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Albrecht Stenzinger
- Institute of Pathology, Division of Molecular Pathology, University Hospital Heidelberg, Heidelberg, Germany
- Richard Rosenquist
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
- Clinical Genetics and Genomics, Karolinska University Hospital, Solna, Sweden
3
Kurgan N, Kjærgaard Larsen J, Deshmukh AS. Harnessing the power of proteomics in precision diabetes medicine. Diabetologia 2024; 67:783-797. [PMID: 38345659 DOI: 10.1007/s00125-024-06097-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 12/20/2023] [Indexed: 03/21/2024]
Abstract
Precision diabetes medicine (PDM) aims to reduce errors in prevention programmes, diagnosis thresholds, prognosis prediction and treatment strategies. However, its advancement and implementation are difficult due to the heterogeneity of complex molecular processes and environmental exposures that influence an individual's disease trajectory. To address this challenge, it is imperative to develop robust screening methods for all areas of PDM. Innovative proteomic technologies, alongside genomics, have proven effective in precision cancer medicine and are showing promise in diabetes research for potential translation. This narrative review highlights how proteomics is well-positioned to help improve PDM. Specifically, a critical assessment of widely adopted affinity-based proteomic technologies in large-scale clinical studies and evidence of the benefits and feasibility of using MS-based plasma proteomics is presented. We also present a case for the use of proteomics to identify predictive protein panels for type 2 diabetes subtyping and the development of clinical prediction models for prevention, diagnosis, prognosis and treatment strategies. Lastly, we discuss the importance of plasma and tissue proteomics and its integration with genomics (proteogenomics) for identifying unique type 2 diabetes intra- and inter-subtype aetiology. We conclude with a call for action formed on advancing proteomics technologies, benchmarking their performance and standardisation across sites, with an emphasis on data sharing and the inclusion of diverse ancestries in large cohort studies. These efforts should foster collaboration with key stakeholders and align with ongoing academic programmes such as the Precision Medicine in Diabetes Initiative consortium.
Affiliation(s)
- Nigel Kurgan
- Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark
- Jeppe Kjærgaard Larsen
- Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark
- Atul S Deshmukh
- Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark.
4
Peng Z, Li J, Jiang X, Wan C. sOCP: a framework predicting smORF coding potential based on TIS and in-frame features and effectively applied in the human genome. Brief Bioinform 2024; 25:bbae147. [PMID: 38600664 PMCID: PMC11006793 DOI: 10.1093/bib/bbae147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Revised: 02/25/2024] [Accepted: 03/19/2024] [Indexed: 04/12/2024] Open
Abstract
Small open reading frames (smORFs) are known to play various roles in essential biological pathways and to affect human health in conditions ranging from diabetes to tumorigenesis. Predicting smORFs in silico is a prerequisite for processing omics data. Here, we propose the smORF-coding-potential-predicting framework, sOCP, which provides functions to construct a model for predicting novel smORFs in a given species. The sOCP model constructed for human was based on in-frame features and the nucleotide bias around the start codon; this small feature subset proved sufficient while avoiding the overfitting problems of more complicated models. It showed better prediction metrics than previous methods and correlated closely with experimental evidence in a heterogeneous dataset. The model was applied to Rattus norvegicus and exhibited satisfactory performance. We then scanned smORFs with ATG and non-ATG start codons from the human genome and generated a database containing about a million novel smORFs with coding potential. Around 72 000 smORFs are located in the lncRNA regions of the genome. The smORF-encoded peptides may be involved in biological pathways rare for canonical proteins, including the glucocorticoid catabolic process and the prokaryotic defense system. Our work provides a model and database for human smORF investigation and a convenient tool for further smORF prediction in other species.
Affiliation(s)
- Zhao Peng
- School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, Hubei, People’s Republic of China
- Jiaqiang Li
- School of Computer Science, and Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, Hubei, People’s Republic of China
- Xingpeng Jiang
- School of Computer Science, and Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, Hubei, People’s Republic of China
- Cuihong Wan
- School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, Hubei, People’s Republic of China
5
Kore H, Datta KK, Nagaraj SH, Gowda H. Protein-coding potential of non-canonical open reading frames in human transcriptome. Biochem Biophys Res Commun 2023; 684:149040. [PMID: 37897910 DOI: 10.1016/j.bbrc.2023.09.068] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/13/2023] [Revised: 09/09/2023] [Accepted: 09/23/2023] [Indexed: 10/30/2023]
Abstract
In recent years, proteogenomics and ribosome profiling studies have identified a large number of proteins encoded by noncoding regions in the human genome. They are encoded by small open reading frames (sORFs) in the untranslated regions (UTRs) of mRNAs and long non-coding RNAs (lncRNAs). These sORF encoded proteins (SEPs) are often <150AA and show poor evolutionary conservation. A subset of them have been functionally characterized and shown to play an important role in fundamental biological processes including cardiac and muscle function, DNA repair, embryonic development and various human diseases. How many novel protein-coding regions exist in the human genome and what fraction of them are functionally important remains a mystery. In this review, we discuss current progress in unraveling SEPs, approaches used for their identification, their limitations and reliability of these identifications. We also discuss functionally characterized SEPs and their involvement in various biological processes and diseases. Lastly, we provide insights into their distinctive features compared to canonical proteins and challenges associated with annotating these in protein reference databases.
Affiliation(s)
- Hitesh Kore
- Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Queensland, 4059, Australia; Cancer Precision Medicine Group, QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston, Queensland, 4006, Australia; Faculty of Health, Queensland University of Technology, Brisbane, Queensland, 4059, Australia.
- Keshava K Datta
- Proteomics and Metabolomics Platform, La Trobe University, Melbourne, VIC, 3083, Australia
- Shivashankar H Nagaraj
- Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Queensland, 4059, Australia; Faculty of Health, Queensland University of Technology, Brisbane, Queensland, 4059, Australia
- Harsha Gowda
- Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Queensland, 4059, Australia; Cancer Precision Medicine Group, QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston, Queensland, 4006, Australia; Faculty of Health, Queensland University of Technology, Brisbane, Queensland, 4059, Australia; Faculty of Medicine, The University of Queensland, Queensland, 4072, Australia.
6
Wang XY, Xu YM, Lau ATY. Proteogenomics in Cancer: Then and Now. J Proteome Res 2023; 22:3103-3122. [PMID: 37725793 DOI: 10.1021/acs.jproteome.3c00196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/21/2023]
Abstract
For years, sequencing technologies and mass spectrometry have developed in isolation, each with its own unique culture and expertise. These two technologies are crucial for inspecting complementary aspects of the molecular phenotype across the central dogma. Integrative multiomics strives to bridge the analysis gap among different fields to build a more comprehensive picture of the mechanisms of life events and diseases. Proteogenomics is one integrated multiomics field. In this review, we mainly summarize and discuss three aspects: the workflow of proteogenomics, proteogenomics applications in cancer research, and the SWOT (Strengths, Weaknesses, Opportunities, Threats) analysis of proteogenomics in cancer research. In conclusion, proteogenomics has a promising future as it clarifies the functional consequences of many unannotated genomic abnormalities or noncanonical variants and identifies driver genes and novel therapeutic targets across cancers, which would substantially accelerate the development of precision oncology.
Affiliation(s)
- Xiu-Yun Wang
- Laboratory of Cancer Biology and Epigenetics, Department of Cell Biology and Genetics, Shantou University Medical College, Shantou, Guangdong 515041, People's Republic of China
- Yan-Ming Xu
- Laboratory of Cancer Biology and Epigenetics, Department of Cell Biology and Genetics, Shantou University Medical College, Shantou, Guangdong 515041, People's Republic of China
- Andy T Y Lau
- Laboratory of Cancer Biology and Epigenetics, Department of Cell Biology and Genetics, Shantou University Medical College, Shantou, Guangdong 515041, People's Republic of China
7
Lei JT, Jaehnig EJ, Smith H, Holt MV, Li X, Anurag M, Ellis MJ, Mills GB, Zhang B, Labrie M. The Breast Cancer Proteome and Precision Oncology. Cold Spring Harb Perspect Med 2023; 13:a041323. [PMID: 37137501 PMCID: PMC10547392 DOI: 10.1101/cshperspect.a041323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2023]
Abstract
The goal of precision oncology is to translate the molecular features of cancer into predictive and prognostic tests that can be used to individualize treatment leading to improved outcomes and decreased toxicity. Success for this strategy in breast cancer is exemplified by efficacy of trastuzumab in tumors overexpressing ERBB2 and endocrine therapy for tumors that are estrogen receptor positive. However, other effective treatments, including chemotherapy, immune checkpoint inhibitors, and CDK4/6 inhibitors are not associated with strong predictive biomarkers. Proteomics promises another tier of information that, when added to genomic and transcriptomic features (proteogenomics), may create new opportunities to improve both treatment precision and therapeutic hypotheses. Here, we review both mass spectrometry-based and antibody-dependent proteomics as complementary approaches. We highlight how these methods have contributed toward a more complete understanding of breast cancer and describe the potential to guide diagnosis and treatment more accurately.
Affiliation(s)
- Jonathan T Lei
- Lester and Sue Smith Breast Center and Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas 77030, USA
- Eric J Jaehnig
- Lester and Sue Smith Breast Center and Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas 77030, USA
- Hannah Smith
- Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon 97239, USA
- Matthew V Holt
- Lester and Sue Smith Breast Center and Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas 77030, USA
- Xi Li
- Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon 97239, USA
- Meenakshi Anurag
- Lester and Sue Smith Breast Center and Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas 77030, USA
- Matthew J Ellis
- Lester and Sue Smith Breast Center and Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas 77030, USA
- Gordon B Mills
- Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon 97239, USA
- Bing Zhang
- Lester and Sue Smith Breast Center and Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas 77030, USA
- Marilyne Labrie
- Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon 97239, USA
8
Desai H, Ofori S, Boatner L, Yu F, Villanueva M, Ung N, Nesvizhskii AI, Backus K. Multi-omic stratification of the missense variant cysteinome. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.08.12.553095. [PMID: 37645963 PMCID: PMC10461992 DOI: 10.1101/2023.08.12.553095] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 08/31/2023]
Abstract
Cancer genomes are rife with genetic variants; one key outcome of this variation is gain-of-cysteine, which is the most frequently acquired amino acid due to missense variants in COSMIC. Acquired cysteines are both driver mutations and sites targeted by precision therapies. However, despite their ubiquity, nearly all acquired cysteines remain uncharacterized. Here, we pair cysteine chemoproteomics (a technique that enables proteome-wide pinpointing of functional, redox-sensitive, and potentially druggable residues) with genomics to reveal the hidden landscape of cysteine acquisition. For both cancer and healthy genomes, we find that cysteine acquisition is a ubiquitous consequence of genetic variation that is further elevated in the context of decreased DNA repair. Our chemoproteogenomics platform integrates chemoproteomic, whole exome, and RNA-seq data, with a customized 2-stage false discovery rate (FDR) error-controlled proteomic search, further enhanced with a user-friendly FragPipe interface. Integration of CADD predictions of deleteriousness revealed marked enrichment for likely damaging variants that result in acquisition of cysteine. By deploying chemoproteogenomics across eleven cell lines, we identify 116 gain-of-cysteines, of which 10 were liganded by electrophilic druglike molecules. Reference cysteines proximal to missense variants were also found to be pervasive, 791 in total, supporting heretofore untapped opportunities for proteoform-specific chemical probe development campaigns. As chemoproteogenomics is further distinguished by sample-matched combinatorial variant databases and compatible with redox proteomics and small molecule screening, we expect widespread utility in guiding proteoform-specific biology and therapeutic discovery.
Affiliation(s)
- Heta Desai
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Samuel Ofori
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, 90095, USA
- Lisa Boatner
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
- Miranda Villanueva
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Nicholas Ung
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
- Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- DOE Institute for Genomics and Proteomics, UCLA, Los Angeles, CA, 90095, USA
- Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, CA, 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, UCLA, Los Angeles, CA, 90095, USA
- Alexey I Nesvizhskii
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
- Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
- Keriann Backus
- Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- DOE Institute for Genomics and Proteomics, UCLA, Los Angeles, CA, 90095, USA
- Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, CA, 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, UCLA, Los Angeles, CA, 90095, USA
9
Li Z, Vacanti NM. A Tale of Three Proteomes: Visualizing Protein and Transcript Abundance Relationships in the Breast Cancer Proteome Portal. J Proteome Res 2023; 22:2727-2733. [PMID: 37493333 DOI: 10.1021/acs.jproteome.3c00290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2023]
Abstract
Molecular characterization is transforming research on novel therapeutics in breast cancer. High-throughput methodologies are unbiased to hypotheses; thus, data produced are relevant to address unlimited questions and provide resources for the experimental design process. However, the opportunity is often overlooked because data are not readily accessed or analyzed. Herein, the Breast Cancer Proteome Portal, the only online tool for analyzing protein and transcript abundances across the three breast cancer proteomics studies, is presented. The tool is applied to demonstrate that cofunctioning protein abundances are highly correlated and, conversely, high abundance correlation may be an indicator of cofunction. Furthermore, the cofunction-correlation relationship is less resolved at the transcript level. By applying analysis and visualization tools within the Breast Cancer Proteome Portal, insights are garnered about serine synthesis and the compartmentalization of one-carbon metabolism in breast cancer, and a transcription factor tumorigenic regulatory network of glutamine deamination and oxidation is proposed, illustrating that the Breast Cancer Proteome Portal provides an interface for garnering insights from the information-rich studies of the breast cancer proteome.
Affiliation(s)
- Zhuoheng Li
- Division of Nutritional Sciences, Cornell University, Ithaca, New York 14853-0001, United States
- Nathaniel M Vacanti
- Division of Nutritional Sciences, Cornell University, Ithaca, New York 14853-0001, United States
10
Hassel KR, Brito-Estrada O, Makarewich CA. Microproteins: Overlooked regulators of physiology and disease. iScience 2023; 26:106781. [PMID: 37213226 PMCID: PMC10199267 DOI: 10.1016/j.isci.2023.106781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/23/2023] Open
Abstract
Ongoing efforts to generate a complete and accurate annotation of the genome have revealed a significant blind spot for small proteins (<100 amino acids) originating from short open reading frames (sORFs). The recent discovery of numerous sORF-encoded proteins, termed microproteins, that play diverse roles in critical cellular processes has ignited the field of microprotein biology. Large-scale efforts are currently underway to identify sORF-encoded microproteins in diverse cell-types and tissues and specialized methods and tools have been developed to aid in their discovery, validation, and functional characterization. Microproteins that have been identified thus far play important roles in fundamental processes including ion transport, oxidative phosphorylation, and stress signaling. In this review, we discuss the optimized tools available for microprotein discovery and validation, summarize the biological functions of numerous microproteins, outline the promise for developing microproteins as therapeutic targets, and look forward to the future of the field of microprotein biology.
Affiliation(s)
- Keira R. Hassel
- The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
- University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
- Omar Brito-Estrada
- The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
- University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
- Catherine A. Makarewich
- The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
11
Kwiatek L, Landry-Voyer AM, Latour M, Yague-Sanz C, Bachand F. PABPN1 prevents the nuclear export of an unspliced RNA with a constitutive transport element and controls human gene expression via intron retention. RNA (NEW YORK, N.Y.) 2023; 29:644-662. [PMID: 36754576 PMCID: PMC10158996 DOI: 10.1261/rna.079294.122] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/06/2022] [Accepted: 01/12/2023] [Indexed: 05/06/2023]
Abstract
Intron retention is a type of alternative splicing where one or more introns remain unspliced in a polyadenylated transcript. Although many viral systems are known to translate proteins from mRNAs with retained introns, restriction mechanisms generally prevent export and translation of incompletely spliced mRNAs. Here, we provide evidence that the human nuclear poly(A)-binding protein, PABPN1, functions in such restrictions. Using a reporter construct in which nuclear export of an incompletely spliced mRNA is enhanced by a viral constitutive transport element (CTE), we show that PABPN1 depletion results in a significant increase in export and translation from the unspliced CTE-containing transcript. Unexpectedly, we find that inactivation of poly(A)-tail exosome targeting by depletion of PAXT components had no effect on export and translation of the unspliced reporter mRNA, suggesting a mechanism largely independent of nuclear RNA decay. Interestingly, a PABPN1 mutant selectively defective in stimulating poly(A) polymerase elongation strongly enhanced the expression of the unspliced, but not of intronless, reporter transcripts. Analysis of RNA-seq data also revealed that PABPN1 controls the expression of many human genes via intron retention. Notably, PABPN1-dependent intron retention events mostly affected 3'-terminal introns and were insensitive to PAXT and NEXT deficiencies. Our findings thus disclose a role for PABPN1 in restricting nuclear export of intron-retained transcripts and reinforce the interdependence between terminal intron splicing, 3' end processing, and polyadenylation.
Affiliation(s)
- Lauren Kwiatek
  - RNA Group, Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada J1E 4K8
- Anne-Marie Landry-Voyer
  - RNA Group, Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada J1E 4K8
- Mélodie Latour
  - RNA Group, Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada J1E 4K8
- Carlo Yague-Sanz
  - RNA Group, Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada J1E 4K8
- Francois Bachand
  - RNA Group, Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada J1E 4K8
|
12
|
Abstract
Our defenses against infection rely on the ability of the immune system to distinguish invading pathogens from self. This task is exceptionally challenging, if not seemingly impossible, in the case of retroviruses that have integrated almost seamlessly into the host. This review examines the limits of innate and adaptive immune responses elicited by endogenous retroviruses and other retroelements, the targets of immune recognition, and the consequences for host health and disease. Contrary to theoretical expectation, endogenous retroelements retain substantial immunogenicity, which manifests most profoundly when their epigenetic repression is compromised, contributing to autoinflammatory and autoimmune disease and age-related inflammation. Nevertheless, recent evidence suggests that regulated immune reactivity to endogenous retroelements is integral to immune system development and function, underpinning cancer immunosurveillance, resistance to infection, and responses to the microbiota. Elucidation of the interaction points with endogenous retroelements will therefore deepen our understanding of immune system function and contribution to disease.
Affiliation(s)
- George Kassiotis
  - Retroviral Immunology Laboratory, The Francis Crick Institute, London, United Kingdom
  - Department of Infectious Disease, Faculty of Medicine, Imperial College London, London, United Kingdom
|
13
|
Reilly L, Seddighi S, Singleton AB, Cookson MR, Ward ME, Qi YA. Variant biomarker discovery using mass spectrometry-based proteogenomics. Front Aging 2023; 4:1191993. [PMID: 37168844 PMCID: PMC10165118 DOI: 10.3389/fragi.2023.1191993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Accepted: 04/13/2023] [Indexed: 05/13/2023]
Abstract
Genomic diversity plays critical roles in the risk of disease pathogenesis and in diagnosis. While genomic variants (including single nucleotide variants, frameshift variants, and mis-splicing isoforms) are commonly detected at the DNA or RNA level, their translated variant protein or polypeptide products are ultimately the functional units of the associated disease. These products are often released in biofluids and could be leveraged for clinical diagnosis and patient stratification. The recent emergence of integrated analysis of genomics with mass spectrometry-based proteomics for biomarker discovery, also known as proteogenomics, has significantly advanced the understanding of disease risk variants, precision medicine, and biomarker discovery. In this review, we discuss variant proteins in the context of cancers and neurodegenerative diseases, outline current and emerging proteogenomic approaches for biomarker discovery, and provide a comprehensive proteogenomic strategy for detection of putative biomarker candidates in human biospecimens. This strategy can be implemented for proteogenomic studies in any field of enquiry. Our review is a timely response to the need for biomarkers for aging-related diseases.
Affiliation(s)
- Luke Reilly
  - Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States
- Sahba Seddighi
  - National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States
- Andrew B. Singleton
  - Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, United States
- Mark R. Cookson
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, United States
- Michael E. Ward
  - National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States
- Yue A. Qi
  - Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States
|
14
|
Proteogenomic analysis of acute myeloid leukemia associates relapsed disease with reprogrammed energy metabolism both in adults and children. Leukemia 2023; 37:550-559. [PMID: 36572751 PMCID: PMC9991901 DOI: 10.1038/s41375-022-01796-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/21/2022] [Revised: 12/06/2022] [Accepted: 12/08/2022] [Indexed: 12/27/2022]
Abstract
Despite improvement of current treatment strategies and novel targeted drugs, relapse and treatment resistance largely determine the outcome for acute myeloid leukemia (AML) patients. To identify the underlying molecular characteristics, numerous studies have aimed to decipher the genomic and transcriptomic landscape of AML. Nevertheless, further molecular changes allowing malignant cells to escape treatment remain to be elucidated. Mass spectrometry is a powerful tool enabling detailed insights into proteomic changes that could explain AML relapse and resistance. Here, we investigated AML samples from 47 adult and 22 pediatric patients at serial time-points during disease progression using mass spectrometry-based in-depth proteomics. We show that the proteomic profile at relapse is enriched for mitochondrial ribosomal proteins and subunits of the respiratory chain complex, indicative of reprogrammed energy metabolism from diagnosis to relapse. Further, higher levels of granzymes and lower levels of the anti-inflammatory protein CR1/CD35 suggest an inflammatory signature promoting disease progression. Finally, through a proteogenomic approach, we detected novel peptides, which present a promising repertoire in the search for biomarkers and tumor-specific druggable targets. Altogether, this study highlights the importance of proteomic studies in holistic approaches to improve treatment and survival of AML patients.
|
15
|
Wang N, Guo S, Hao F, Zhang Y, Chen Y, Fei X, Wang J. Pseudogene SNRPFP1 derived long non-coding RNA facilitates hepatocellular carcinoma progress in vitro by sponging tumor-suppressive miR-126-5p. Sci Rep 2022; 12:21867. [PMID: 36535956 PMCID: PMC9763376 DOI: 10.1038/s41598-022-24597-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2022] [Accepted: 11/17/2022] [Indexed: 12/23/2022] Open
Abstract
Pseudogene-derived transcripts, especially those barely transcribed in normal tissues, have been regarded as a kind of non-coding RNA and have potential functions in tumorigenicity and tumor development in humans. However, their exact effects on hepatocellular carcinoma (HCC) remain largely unknown. On the basis of our previous research and our online database of non-coding RNAs related to HCC, a series of pseudogene transcripts have been discovered, and SNRPFP1, the homologous pseudogene of SNRPF, was found to produce an anomalously highly expressed long non-coding RNA in HCC. In this study, we validated the expression of the SNRPFP1 transcript in both HCC tissues and cell lines. An adverse correlation between SNRPFP1 expression and patients' outcomes was observed, and depletion of SNRPFP1 in HCC cells significantly suppressed cell proliferation and apoptosis resistance. Meanwhile, the motility of HCC cells was potently impaired. Interestingly, miR-126-5p, a tumor suppressor commonly decreased in HCC, was found to be negatively correlated with SNRPFP1 expression, and a specific region of the SNRPFP1 transcript binds directly to miR-126-5p as a molecular sponge. In the rescue experiment, knocking out miR-126-5p significantly reversed the cell growth suppression and the higher ratio of cell apoptosis induced by SNRPFP1 depletion. We conclude that SNRPFP1 is a pseudogene active in HCC, and that its abnormally over-expressed transcript is a strong promoter of HCC cell progress in vitro by sponging miR-126-5p. We believe that the findings in this study provide new strategies for HCC prevention and therapeutic treatment.
Affiliation(s)
- Nan Wang
  - Department of General Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197, Rui Jin Er Road, Shanghai, 200025 People’s Republic of China
- Simin Guo
  - Department of Infectious Disease, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197, Rui Jin Er Road, Shanghai, 200025 People’s Republic of China
- Fengjie Hao
  - Department of General Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197, Rui Jin Er Road, Shanghai, 200025 People’s Republic of China
- Yifan Zhang
  - Department of General Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197, Rui Jin Er Road, Shanghai, 200025 People’s Republic of China
- Yongjun Chen
  - Department of General Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197, Rui Jin Er Road, Shanghai, 200025 People’s Republic of China
- Xiaochun Fei
  - Department of Pathology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197, Rui Jin Er Road, Shanghai, 200025 People’s Republic of China
- Junqing Wang
  - Department of General Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197, Rui Jin Er Road, Shanghai, 200025 People’s Republic of China
|
16
|
Tatari N, Khan S, Livingstone J, Zhai K, Mckenna D, Ignatchenko V, Chokshi C, Gwynne WD, Singh M, Revill S, Mikolajewicz N, Zhu C, Chan J, Hawkins C, Lu JQ, Provias JP, Ask K, Morrissy S, Brown S, Weiss T, Weller M, Han H, Greenspoon JN, Moffat J, Venugopal C, Boutros PC, Singh SK, Kislinger T. The proteomic landscape of glioblastoma recurrence reveals novel and targetable immunoregulatory drivers. Acta Neuropathol 2022; 144:1127-1142. [PMID: 36178522 PMCID: PMC10187978 DOI: 10.1007/s00401-022-02506-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 05/05/2022] [Revised: 09/23/2022] [Accepted: 09/24/2022] [Indexed: 01/26/2023]
Abstract
Glioblastoma (GBM) is characterized by extensive cellular and genetic heterogeneity. Its initial presentation as primary disease (pGBM) has been subject to exhaustive molecular and cellular profiling. By contrast, our understanding of how GBM evolves to evade the selective pressure of therapy is starkly limited. The proteomic landscape of recurrent GBM (rGBM), which is refractory to most treatments used for pGBM, is poorly known. We, therefore, quantified the transcriptome and proteome of 134 patient-derived pGBM and rGBM samples, including 40 matched pGBM-rGBM pairs. GBM subtypes transition from pGBM to rGBM towards a preferentially mesenchymal state at recurrence, consistent with the increasingly invasive nature of rGBM. We identified immune regulatory/suppressive genes as important drivers of rGBM and in particular 2'-5'-oligoadenylate synthase 2 (OAS2) as an essential gene in recurrent disease. Our data identify a new class of therapeutic targets that emerge specifically in recurrent disease from the adaptive response of pGBM to therapy and may provide new therapeutic opportunities absent at pGBM diagnosis.
Affiliation(s)
- Nazanin Tatari
  - Centre for Discovery in Cancer Research, Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
- Shahbaz Khan
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Julie Livingstone
  - Department of Human Genetics and Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA
- Kui Zhai
  - Department of Surgery, McMaster University, Hamilton, ON, Canada
- Dillon Mckenna
  - Department of Surgery, McMaster University, Hamilton, ON, Canada
- Chirayu Chokshi
  - Centre for Discovery in Cancer Research, Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
- William D Gwynne
  - Centre for Discovery in Cancer Research, Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
- Manoj Singh
  - Centre for Discovery in Cancer Research, Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
  - Department of Surgery, McMaster University, Hamilton, ON, Canada
- Spencer Revill
  - McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
- Nicholas Mikolajewicz
  - Department of Molecular Genetics - Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Chenghao Zhu
  - Department of Human Genetics and Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA
- Jennifer Chan
  - Department of Pathology and Laboratory Medicine, University of Calgary, Calgary, AB, Canada
- Cynthia Hawkins
  - Department of Pediatric Laboratory Medicine, Hospital for Sick Children, Toronto, Canada
- Jian-Qiang Lu
  - Department of Pathology, Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
- John P Provias
  - Department of Pathology, Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
- Kjetil Ask
  - McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
- Sorana Morrissy
  - Department of Biochemistry and Molecular Biology, The University of Calgary, Calgary, AB, Canada
- Samuel Brown
  - Department of Biochemistry and Molecular Biology, The University of Calgary, Calgary, AB, Canada
- Tobias Weiss
  - Department of Neurology and Clinical Neuroscience Center, University Hospital Zurich and University of Zurich, Zurich, Switzerland
- Michael Weller
  - Department of Neurology and Clinical Neuroscience Center, University Hospital Zurich and University of Zurich, Zurich, Switzerland
- Hong Han
  - Department of Molecular Genetics - Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Jeffrey N Greenspoon
  - Juravinski Cancer Center, Department of Oncology, Radiation Oncology, McMaster University, Hamilton, ON, Canada
- Jason Moffat
  - Department of Molecular Genetics - Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Chitra Venugopal
  - Department of Surgery, McMaster University, Hamilton, ON, Canada
- Paul C Boutros
  - Department of Human Genetics and Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
- Sheila K Singh
  - Centre for Discovery in Cancer Research, Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
  - Department of Surgery, McMaster University, Hamilton, ON, Canada
- Thomas Kislinger
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
|
17
|
Proteogenomic analysis of lung adenocarcinoma reveals tumor heterogeneity, survival determinants, and therapeutically relevant pathways. Cell Rep Med 2022; 3:100819. [PMID: 36384096 PMCID: PMC9729884 DOI: 10.1016/j.xcrm.2022.100819] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 04/19/2022] [Revised: 05/09/2022] [Accepted: 10/18/2022] [Indexed: 11/17/2022]
Abstract
We present a deep proteogenomic profiling study of 87 lung adenocarcinoma (LUAD) tumors from the United States, integrating whole-genome sequencing, transcriptome sequencing, proteomics and phosphoproteomics by mass spectrometry, and reverse-phase protein arrays. We identify three subtypes from somatic genome signature analysis, including a transition-high subtype enriched with never smokers, a transversion-high subtype enriched with current smokers, and a structurally altered subtype enriched with former smokers, TP53 alterations, and genome-wide structural alterations. We show that within-tumor correlations of RNA and protein expression associate with tumor purity and immune cell profiles. We detect and independently validate expression signatures of RNA and protein that predict patient survival. Additionally, among co-measured genes, we found that protein expression is more often associated with patient survival than RNA. Finally, integrative analysis characterizes three expression subtypes with divergent mutations, proteomic regulatory networks, and therapeutic vulnerabilities. This proteogenomic characterization provides a foundation for molecularly informed medicine in LUAD.
|
18
|
da Silva EMG, Rebello KM, Choi YJ, Gregorio V, Paschoal AR, Mitreva M, McKerrow JH, Neves-Ferreira AGDC, Passetti F. Identification of Novel Genes and Proteoforms in Angiostrongylus costaricensis through a Proteogenomic Approach. Pathogens 2022; 11:1273. [PMID: 36365024 PMCID: PMC9694666 DOI: 10.3390/pathogens11111273] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/21/2022] [Revised: 10/15/2022] [Accepted: 10/20/2022] [Indexed: 07/22/2023] Open
Abstract
RNA sequencing (RNA-Seq) and mass-spectrometry-based proteomics data are often integrated in proteogenomic studies to assist in the prediction of eukaryote genome features, such as genes, splicing, single-nucleotide variants (SNVs), and single-amino-acid variants (SAAVs). Most genomes of parasitic nematodes are draft versions that lack transcript- and protein-level information and whose gene annotations rely only on computational predictions. Angiostrongylus costaricensis is a roundworm species that causes an intestinal inflammatory disease, known as abdominal angiostrongyliasis (AA). Currently, there is no drug available that acts directly on this parasite, mostly due to the sparse understanding of its molecular characteristics. The available genome of A. costaricensis, specific to the Costa Rica strain, is a draft version that is not supported by transcript- or protein-level evidence. This study used RNA-Seq and MS/MS data to perform an in-depth annotation of the A. costaricensis genome. Our prediction improved the reference annotation with (a) novel coding and non-coding genes; (b) evidence of alternative splicing generating new proteoforms; and (c) a list of SNVs between the Brazilian (Crissiumal) and the Costa Rica strain. To the best of our knowledge, this is the first time that a multi-omics approach has been used to improve the genome annotation of A. costaricensis. We hope this improved genome annotation can assist in the future development of drugs, kits, and vaccines to treat, diagnose, and prevent AA caused by either the Brazil strain (Crissiumal) or the Costa Rica strain.
Affiliation(s)
- Esdras Matheus Gomes da Silva
  - Instituto Carlos Chagas, Fiocruz, Curitiba 81350-010, PR, Brazil
  - Laboratory of Toxinology, Oswaldo Cruz Institute, Fiocruz, Rio de Janeiro 21040-900, RJ, Brazil
- Karina Mastropasqua Rebello
  - Laboratory of Toxinology, Oswaldo Cruz Institute, Fiocruz, Rio de Janeiro 21040-900, RJ, Brazil
  - Laboratory of Integrated Studies in Protozoology, Oswaldo Cruz Institute, Fiocruz, Rio de Janeiro 21040-360, RJ, Brazil
- Young-Jun Choi
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Vitor Gregorio
  - Bioinformatics and Pattern Recognition Group (Bioinfo-CP), Department of Computer Science (DACOM), Federal University of Technology-Parana (UTFPR), Cornélio Procópio 86300-000, PR, Brazil
- Alexandre Rossi Paschoal
  - Bioinformatics and Pattern Recognition Group (Bioinfo-CP), Department of Computer Science (DACOM), Federal University of Technology-Parana (UTFPR), Cornélio Procópio 86300-000, PR, Brazil
- Makedonka Mitreva
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- James H. McKerrow
  - Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, CA 92093, USA
- Fabio Passetti
  - Instituto Carlos Chagas, Fiocruz, Curitiba 81350-010, PR, Brazil
|
19
|
Hao F, Wang N, Gui H, Zhang Y, Wu Z, Wang J. Pseudogene UBE2MP1 derived transcript enhances in vitro cell proliferation and apoptosis resistance of hepatocellular carcinoma cells through miR-145-5p/RGS3 axis. Aging (Albany NY) 2022; 14:7906-7925. [PMID: 36214767 PMCID: PMC9596209 DOI: 10.18632/aging.204319] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/13/2022] [Accepted: 09/05/2022] [Indexed: 11/25/2022]
Abstract
Pseudogenes are barely transcribed under normal conditions, and their anomalous transcripts are mostly regarded as long non-coding RNAs (lncRNAs), which have potential functions in human tumorigenicity and development. The exact effects of pseudogene-derived transcripts on hepatocellular carcinoma (HCC) are unclear. According to our previous research and our database of HCC-related lncRNAs, we noticed that UBE2MP1, a pseudogene of the ubiquitin-conjugating enzyme member UBE2M, was transcriptionally activated in HCC. In this study, we validated the high expression of the UBE2MP1 transcript in HCC and its association with dismal outcomes for patients. UBE2MP1 depletion at the transcript level significantly impaired cell proliferation and apoptosis resistance in HCC cell lines. Notably, we discovered that the UBE2MP1 transcript contains a specific sequence that binds the miR-145-5p seed region with a typical ceRNA effect. We also verified a miR-145-5p/RGS3 axis in HCC cells, which significantly promoted cell proliferation and apoptosis resistance. Modulation of UBE2MP1 remarkably affected RGS3 expression and consequently influenced HCC cell growth in vitro. Rescue experiments modulating either miR-145-5p or RGS3 further indicated UBE2MP1 as an upstream regulator of the axis in promoting HCC cell growth and maintenance. Thus, our findings provide new strategies for HCC prevention and individualized treatment.
Affiliation(s)
- Fengjie Hao
  - Department of General Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, People’s Republic of China
- Nan Wang
  - Department of General Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, People’s Republic of China
- Honglian Gui
  - Department of Infectious Disease, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, People’s Republic of China
- Yifan Zhang
  - Department of General Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, People’s Republic of China
- Zhiyuan Wu
  - Department of Interventional Radiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, People’s Republic of China
- Junqing Wang
  - Department of General Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, People’s Republic of China
|
20
|
Urbiola-Salvador V, Miroszewska D, Jabłońska A, Qureshi T, Chen Z. Proteomics approaches to characterize the immune responses in cancer. Biochim Biophys Acta Mol Cell Res 2022; 1869:119266. [PMID: 35390423 DOI: 10.1016/j.bbamcr.2022.119266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2021] [Revised: 03/01/2022] [Accepted: 03/28/2022] [Indexed: 06/14/2023]
Abstract
Despite the dynamic development of cancer research, millions of people die of cancer every year. The human immune system is the major 'guard' against tumor development. Unfortunately, cancer cells have the ability to evade the immune system and continue to grow. A proper understanding of the intricate immune response in tumorigenesis remains the holy grail of cancer immunology and of designing effective immunotherapy. To decode the immune responses in cancer, proteomics studies have received considerable attention in recent years. Proteomics studies focus on the detection and quantification of proteins, which are the effectors of biological functions and, as such, have been shown to reflect the cell state more accurately than genomic or transcriptomic studies. In this review, we discuss the proteomics studies applied to characterize the immune responses in cancer and tumor immune microenvironment heterogeneity. Further, we describe emerging single-cell proteomics approaches that have the potential to be applied in cancer immunity studies.
Affiliation(s)
- Víctor Urbiola-Salvador
  - Intercollegiate Faculty of Biotechnology of University of Gdańsk and Medical University of Gdańsk, University of Gdańsk, Poland
- Dominika Miroszewska
  - Intercollegiate Faculty of Biotechnology of University of Gdańsk and Medical University of Gdańsk, University of Gdańsk, Poland
- Agnieszka Jabłońska
  - Intercollegiate Faculty of Biotechnology of University of Gdańsk and Medical University of Gdańsk, University of Gdańsk, Poland
- Talha Qureshi
  - Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
- Zhi Chen
  - Intercollegiate Faculty of Biotechnology of University of Gdańsk and Medical University of Gdańsk, University of Gdańsk, Poland
  - Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
|
21
|
Corvigno S, Johnson AM, Wong KK, Cho MS, Afshar-Kharghan V, Menter DG, Sood AK. Novel Markers for Liquid Biopsies in Cancer Management: Circulating Platelets and Extracellular Vesicles. Mol Cancer Ther 2022; 21:1067-1075. [PMID: 35545008 DOI: 10.1158/1535-7163.mct-22-0087] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/01/2022] [Revised: 04/05/2022] [Accepted: 05/05/2022] [Indexed: 02/03/2023]
Abstract
Although radiologic imaging and histologic assessment of tumor tissues are classic approaches for diagnosis and monitoring of treatment response, they have many limitations. These include challenges in distinguishing benign from malignant masses, difficult access to the tumor, high cost of the procedures, and tumor heterogeneity. In this setting, liquid biopsy has emerged as a potential alternative for both diagnostic and monitoring purposes. The approaches to liquid biopsy include cell-free DNA/circulating tumor DNA, long and micro noncoding RNAs, proteins/peptides, carbohydrates/lectins, lipids, and metabolites. Other approaches include detection and analysis of circulating tumor cells, extracellular vesicles, and tumor-activated platelets. Ultimately, reliable use of liquid biopsies requires bioinformatics and statistical integration of multiple datasets to achieve approval in a Clinical Laboratory Improvement Amendments setting. This review provides a balanced and critical assessment of recent discoveries regarding tumor-derived biomarkers in liquid biopsies along with the potential and pitfalls for cancer detection and longitudinal monitoring.
Affiliation(s)
- Sara Corvigno
  - Department of Gynecologic Oncology & Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Anna Maria Johnson
  - Department of Gynecologic Oncology & Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Kwong-Kwok Wong
  - Department of Gynecologic Oncology & Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas
  - The University of Texas Graduate School of Biomedical Sciences at Houston, Houston, Texas
- Min Soon Cho
  - Division of Internal Medicine, Benign Hematology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Vahid Afshar-Kharghan
  - Division of Internal Medicine, Benign Hematology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- David G Menter
  - Department of Gastrointestinal Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Anil K Sood
  - Department of Gynecologic Oncology & Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas
  - Center for RNA Interference and Non-Coding RNA, The University of Texas MD Anderson Cancer Center, Houston, Texas
|
22
|
Bogaert A, Fijalkowska D, Staes A, Van de Steene T, Demol H, Gevaert K. Limited evidence for protein products of non-coding transcripts in the HEK293T cellular cytosol. Mol Cell Proteomics 2022; 21:100264. [PMID: 35788065 PMCID: PMC9396073 DOI: 10.1016/j.mcpro.2022.100264] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/21/2022] [Revised: 06/22/2022] [Accepted: 06/30/2022] [Indexed: 10/25/2022] Open
Abstract
Ribosome profiling has revealed translation outside of canonical coding sequences (CDSs) including translation of short upstream ORFs, long non-coding RNAs, overlapping ORFs, ORFs in UTRs or ORFs in alternative reading frames. Studies combining mass spectrometry, ribosome profiling and CRISPR-based screens showed that hundreds of ORFs derived from non-coding transcripts produce (micro)proteins, while other studies failed to find evidence for such types of non-canonical translation products. Here, we attempted to discover translation products from non-coding regions by strongly reducing the complexity of the sample prior to mass spectrometric analysis. We used an extended database as the search space and applied stringent filtering of the identified peptides to find evidence for novel translation events. We show that, theoretically, our strategy facilitates the detection of translation events of transcripts from non-coding regions, but experimentally we find only 19 peptides that might originate from such translation events. Finally, Virotrap-based interactome analysis of two N-terminal proteoforms originating from non-coding regions showed the functional potential of these novel proteins.
Affiliation(s)
- Annelies Bogaert
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
- Daria Fijalkowska
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
- An Staes
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
- Tessa Van de Steene
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
- Hans Demol
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
- Kris Gevaert
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium

23
Liu Y, Zeng S, Wu M. Novel insights into noncanonical open reading frames in cancer. Biochim Biophys Acta Rev Cancer 2022; 1877:188755. [PMID: 35777601 DOI: 10.1016/j.bbcan.2022.188755] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/21/2022] [Revised: 06/11/2022] [Accepted: 06/23/2022] [Indexed: 12/12/2022]
Abstract
With technological advances, previously neglected noncanonical open reading frames (nORFs) are drawing ever-increasing attention. However, the translation potential of numerous putative nORFs remains elusive, and the functions of noncanonical peptides have not been systematically summarized. Moreover, the relationship between noncanonical peptides and their counterpart protein or RNA products remains unclear, and the clinical implementation of noncanonical peptides has not been explored. In this review, we highlight how recent technological advances such as ribosome profiling, bioinformatics approaches and CRISPR/Cas9 facilitate research on noncanonical peptides. We delineate the features of each nORF category and the evolutionary process underlying the nORFs. Most importantly, we summarize the diversified functions of noncanonical peptides in cancer based on their subcellular location, which reflect their extensive participation in key pathways and essential cellular activities in cancer cells. Meanwhile, the equilibrium between noncanonical peptides and their corresponding transcripts or counterpart products may be dysregulated under pathological states, which is essential for their roles in cancer. Lastly, we explore their underestimated potential in clinical application as diagnostic biomarkers and treatment targets against cancer.
Affiliation(s)
- Yihan Liu
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410013, Hunan, China; The Key Laboratory of Carcinogenesis of the Chinese Ministry of Health, The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan 410008, China; Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; Key Laboratory for Molecular Radiation Oncology of Hunan Province, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Shan Zeng
- Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; Key Laboratory for Molecular Radiation Oncology of Hunan Province, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Minghua Wu
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410013, Hunan, China; The Key Laboratory of Carcinogenesis of the Chinese Ministry of Health, The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan 410008, China

24
Choong WK, Sung TY. Multiaspect Examinations of Possible Alternative Mappings of Identified Variant Peptides: A Case Study on the HEK293 Cell Line. ACS Omega 2022; 7:16454-16467. [PMID: 35601313 PMCID: PMC9118379 DOI: 10.1021/acsomega.2c00466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2022] [Accepted: 04/20/2022] [Indexed: 06/15/2023]
Abstract
Adopting a proteogenomics approach to validate single nucleotide variation events by identifying the corresponding single amino acid variant peptides from mass spectrometry (MS)-based proteomics data facilitates translational and clinical research. Although variant peptides are usually identified from MS data with a stringent false discovery rate (FDR), FDR control can fail to eliminate dubious results caused by several issues; thus, postexamination to eliminate dubious results is required. However, comprehensive postexaminations of identification results are still lacking. Therefore, we propose a framework of three bottom-up levels (peptide-spectrum match, peptide, and variant event) that consists of rigorous 11-aspect examinations from the MS perspective to further confirm the reliability of variant events. As a proof of concept and to show feasibility, we demonstrate the 11 examinations on the variant peptides identified from an HEK293 cell line data set, where various database search strategies were applied to maximize the number of identified variant PSMs with an FDR <1% for postexaminations. The results showed that the FDR criterion alone is insufficient to validate identified variant peptides and that the 11 postexaminations can reveal low-confidence variant events detected by shotgun proteomics experiments. Therefore, we suggest that postexaminations of identified variant events based on the proposed framework are necessary for proteogenomics studies.

25
Aggarwal S, Raj A, Kumar D, Dash D, Yadav AK. False discovery rate: the Achilles' heel of proteogenomics. Brief Bioinform 2022; 23:6582880. [PMID: 35534181 DOI: 10.1093/bib/bbac163] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/30/2021] [Revised: 03/14/2022] [Accepted: 04/12/2022] [Indexed: 12/25/2022] Open
Abstract
Proteogenomics refers to the integrated analysis of the genome and proteome that leverages mass-spectrometry (MS)-based proteomics data to improve genome annotations, understand gene expression control through proteoforms and find sequence variants to develop novel insights for disease classification and therapeutic strategies. However, proteogenomic studies often suffer from reduced sensitivity and specificity due to inflated database size. To control the error rates, proteogenomics depends on the target-decoy search strategy, the de-facto method for false discovery rate (FDR) estimation in proteomics. The proteogenomic databases constructed from three- or six-frame nucleotide database translation not only increase the search space and compute-time but also violate the equivalence of target and decoy databases. These searches result in poorer separation between target and decoy scores, leading to stringent FDR thresholds. Understanding these factors and applying modified strategies such as two-pass database search or peptide-class-specific FDR can result in a better interpretation of MS data without introducing additional statistical biases. Based on these considerations, a user can interpret the proteogenomics results appropriately and control false positives and negatives in a more informed manner. In this review, first, we briefly discuss the proteogenomic workflows and limitations in database construction, followed by various considerations that can influence potential novel discoveries in a proteogenomic study. We conclude with suggestions to counter these challenges for better proteogenomic data interpretation.
Affiliation(s)
- Suruchi Aggarwal
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, 3rd milestone, PO Box No. 04, Faridabad-Gurgaon Expressway, Faridabad-121001, Haryana, India
- Anurag Raj
- GN Ramachandran Knowledge Centre for Genome Informatics, CSIR-Institute of Genomics & Integrative Biology, South Campus, Mathura Road, New Delhi 110025, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad-201002, India
- Dhirendra Kumar
- GN Ramachandran Knowledge Centre for Genome Informatics, CSIR-Institute of Genomics & Integrative Biology, South Campus, Mathura Road, New Delhi 110025, India
- Debasis Dash
- GN Ramachandran Knowledge Centre for Genome Informatics, CSIR-Institute of Genomics & Integrative Biology, South Campus, Mathura Road, New Delhi 110025, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad-201002, India
- Amit Kumar Yadav
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, 3rd milestone, PO Box No. 04, Faridabad-Gurgaon Expressway, Faridabad-121001, Haryana, India

26
Zhang Z, Li Y, Yuan W, Wang Z, Wan C. Proteomic-driven identification of short open reading frame-encoded peptides. Proteomics 2022; 22:e2100312. [PMID: 35384297 DOI: 10.1002/pmic.202100312] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/25/2022] [Revised: 03/29/2022] [Accepted: 03/30/2022] [Indexed: 11/10/2022]
Abstract
Accumulating evidence has shown that a large number of short open reading frames (sORFs) also have the ability to encode proteins. The discovery of sORFs opens up a new research area, leading to the identification and functional study of sORF-encoded peptides (SEPs) at the omics level. Besides bioinformatics prediction and ribosome profiling, mass spectrometry (MS) has become a significant tool as it directly detects the sequence of SEPs. Though MS-based proteomics methods have proved to be effective for qualitative and quantitative analysis of SEPs, the detection of SEPs is still a great challenge due to their low abundance and short sequence. To illustrate the progress in method development, we describe and discuss the main steps of large-scale proteomics identification of SEPs, including SEP extraction and enrichment, MS detection, data processing and quality control, quantification, and function prediction and validation methods.
Affiliation(s)
- Zheng Zhang
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Yujie Li
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Wenqian Yuan
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Zhiwei Wang
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Cuihong Wan
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China

27
Hari PS, Balakrishnan L, Kotyada C, Everad John A, Tiwary S, Shah N, Sirdeshmukh R. Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications. Mol Cell Proteomics 2022; 21:100220. [PMID: 35227895 PMCID: PMC9020135 DOI: 10.1016/j.mcpro.2022.100220] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/09/2021] [Revised: 01/16/2022] [Accepted: 02/24/2022] [Indexed: 11/30/2022] Open
Abstract
We have carried out proteogenomic analysis of the breast cancer transcriptomic and proteomic data, available at The Clinical Proteomic Tumor Analysis Consortium resource, to identify novel peptides arising from alternatively spliced events as well as other noncanonical expressions. We used a pipeline that consisted of de novo transcript assembly, a six-frame-translated custom database, and a combination of search engines to identify novel peptides. A portfolio of 4,387 novel peptide sequences initially identified was further screened through the PepQuery validation tool (Clinical Proteomic Tumor Analysis Consortium), which yielded 1,558 novel peptides. We considered the dataset of 1,558 peptides validated through PepQuery to understand their functional and clinical significance, leaving the rest to be further verified using other validation tools and approaches. The novel peptides mapped to known gene sequences as well as to genomic regions not yet defined for translation: 580 novel peptides mapped to known protein-coding genes, 147 to non–protein-coding genes, and 831 belonged to novel translational sequences. The novel peptides belonging to protein-coding genes represented alternatively spliced events or 5′ or 3′ extensions, whereas others represented translation from pseudogenes, long noncoding RNAs, or novel peptides originating from uncharacterized protein-coding sequences, mostly from the intronic regions of known genes. Seventy-six of the 580 protein-coding genes were associated with cancer hallmark genes, which included key oncogenes, transcription factors, kinases, and cell surface receptors. Survival association analysis of the 76 novel peptide sequences revealed 10 of them to be significant, and we present a panel of six novel peptides whose high expression was found to be strongly associated with poor survival of patients with the human epidermal growth factor receptor 2–enriched subtype.
Our analysis represents a landscape of novel peptides of different types that may be expressed in breast cancer tissues, whereas their presence in full-length functional proteins needs further investigations. Novel protein variants and peptides from noncoding sequences are rapidly emerging. Mining of mass spectrometry data using proteogenomic analysis reveals such entities. Novel peptides from coding and noncoding sequences identified in breast cancer. Novel peptides mapped to cancer hallmark genes in breast cancer. Panel of novel peptides with prognostic potential found for HER2-enriched subtype.
Affiliation(s)
- P S Hari
- Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India
- Lavanya Balakrishnan
- Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India
- Chaithanya Kotyada
- Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India
- Shivani Tiwary
- Simulation and Modeling Sciences, Pfizer Pharma GmbH, Berlin, Germany
- Nameeta Shah
- Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India
- Ravi Sirdeshmukh
- Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India; Institute of Bioinformatics, International Tech Park, Bangalore, India; Health Sciences, Manipal Academy of Higher Education, Manipal, India

28
Leong AZX, Lee PY, Mohtar MA, Syafruddin SE, Pung YF, Low TY. Short open reading frames (sORFs) and microproteins: an update on their identification and validation measures. J Biomed Sci 2022; 29:19. [PMID: 35300685 PMCID: PMC8928697 DOI: 10.1186/s12929-022-00802-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/31/2021] [Accepted: 03/09/2022] [Indexed: 12/17/2022] Open
Abstract
A short open reading frame (sORF) comprises ≤ 300 bases and encodes a microprotein or sORF-encoded protein (SEP) of ≤ 100 amino acids. Traditionally dismissed by genome annotation pipelines as meaningless noise, sORFs were found to possess coding potential with ribosome profiling (RIBO-Seq), which unveiled sORF-based transcripts at various genome locations. Nonetheless, the existence of corresponding microproteins that are stable and functional was initially little substantiated by experimental evidence. With recent advancements in multi-omics, the identification, validation, and functional characterisation of sORFs and microproteins have become feasible. In this review, we discuss the history and development of the emerging research field of sORFs and microproteins. In particular, we focus on an array of bioinformatics and OMICS approaches used for predicting, sequencing, validating, and characterizing these recently discovered entities. These strategies include RIBO-Seq, which detects sORF transcripts via ribosome footprints, and mass spectrometry (MS)-based proteomics for sequencing the resultant microproteins. Subsequently, our discussion extends to the functional characterisation of microproteins by incorporating CRISPR/Cas9 screens and protein–protein interaction (PPI) studies. Our review not only discusses detection methodologies but also highlights the challenges and potential solutions in identifying and validating sORFs and their microproteins. The novelty of this review lies within its validation for the functional role of microproteins, which could contribute towards the future landscape of microproteomics.
Affiliation(s)
- Alyssa Zi-Xin Leong
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia
- Pey Yee Lee
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia
- M Aiman Mohtar
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia
- Saiful Effendi Syafruddin
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia
- Yuh-Fen Pung
- Division of Biomedical Science, School of Pharmacy, University of Nottingham Malaysia, Semenyih, 43500, Selangor, Malaysia
- Teck Yew Low
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia

29
Identification of multiple TAR DNA binding protein retropseudogene lineages during the evolution of primates. Sci Rep 2022; 12:3823. [PMID: 35264686 PMCID: PMC8907276 DOI: 10.1038/s41598-022-07908-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2021] [Accepted: 02/22/2022] [Indexed: 11/08/2022] Open
Abstract
The TAR DNA Binding Protein (TARDBP) gene has become relevant after the discovery of several pathogenic mutations in it. The lack of knowledge about its evolutionary history contrasts with the number of studies found in the literature. This study investigated the evolutionary dynamics associated with the retrotransposition of the TARDBP gene in primates. We identified novel retropseudogenes that likely originated in the ancestors of anthropoids, catarrhines, and lemuriformes, i.e. the strepsirrhine clade that inhabits Madagascar. We also found species-specific retropseudogenes in the Philippine tarsier, Bolivian squirrel monkey, capuchin monkey and vervet. The identification of a retropseudocopy of the TARDBP gene overlapping a lncRNA that is potentially expressed opens a new avenue to investigate TARDBP gene regulation, especially in the context of TARDBP-associated pathologies.

30
Bateman NW, Tarney CM, Abulez TS, Hood BL, Conrads KA, Zhou M, Soltis AR, Teng PN, Jackson A, Tian C, Dalgard CL, Wilkerson MD, Kessler MD, Goecker Z, Loffredo J, Shriver CD, Hu H, Cote M, Parker GJ, Segars J, Al-Hendy A, Risinger JI, Phippen NT, Casablanca Y, Darcy KM, Maxwell GL, Conrads TP, O'Connor TD. Peptide ancestry informative markers in uterine neoplasms from women of European, African, and Asian ancestry. iScience 2021; 25:103665. [PMID: 35036865 PMCID: PMC8753123 DOI: 10.1016/j.isci.2021.103665] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/20/2021] [Revised: 10/29/2021] [Accepted: 12/17/2021] [Indexed: 02/07/2023] Open
Abstract
Characterization of ancestry-linked peptide variants in disease-relevant patient tissues represents a foundational step to connect patient ancestry with disease pathogenesis. Nonsynonymous single-nucleotide polymorphisms encoding missense substitutions within tryptic peptides exhibiting high allele frequencies in European, African, and East Asian populations, termed peptide ancestry informative markers (pAIMs), were prioritized from 1000 Genomes. In silico analysis identified that as few as 20 pAIMs can determine ancestry proportions similarly to >260K SNPs (R² = 0.99). Multiplexed proteomic analysis of >100 human endometrial cancer cell lines and uterine leiomyoma tissues combined resulted in the quantitation of 62 pAIMs that correlate with patient race and genotype-confirmed ancestry. Candidates include a D451E substitution in GC vitamin D-binding protein previously associated with altered vitamin D levels in African and European populations. pAIMs will support generalized proteoancestry assessment as well as efforts investigating the impact of ancestry on the human proteome and how this relates to the pathogenesis of uterine neoplasms.
Affiliation(s)
- Nicholas W. Bateman
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; The John P. Murtha Cancer Center, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., 6720A Rockledge Dr., Suite 100, Bethesda, MD 20817, USA; Corresponding author: 3289 Woodburn Rd, Suite 375, Annandale, VA 22003
- Christopher M. Tarney
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA
- Tamara S. Abulez
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., 6720A Rockledge Dr., Suite 100, Bethesda, MD 20817, USA
- Brian L. Hood
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., 6720A Rockledge Dr., Suite 100, Bethesda, MD 20817, USA
- Kelly A. Conrads
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., 6720A Rockledge Dr., Suite 100, Bethesda, MD 20817, USA
- Ming Zhou
- Department of Obstetrics and Gynecology, Inova Fairfax Medical Campus, 3300 Gallows Road, Falls Church, VA 22042, USA
- Anthony R. Soltis
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., 6720A Rockledge Dr., Suite 100, Bethesda, MD 20817, USA; The American Genome Center, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA
- Pang-Ning Teng
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., 6720A Rockledge Dr., Suite 100, Bethesda, MD 20817, USA
- Amanda Jackson
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA
- Chunqiao Tian
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., 6720A Rockledge Dr., Suite 100, Bethesda, MD 20817, USA
- Clifton L. Dalgard
- The American Genome Center, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA; Department of Anatomy Physiology and Genetics, Uniformed Services University, 4301 Jones Bridge Road, Bethesda, MD 20814, USA
- Matthew D. Wilkerson
- The John P. Murtha Cancer Center, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., 6720A Rockledge Dr., Suite 100, Bethesda, MD 20817, USA; The American Genome Center, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA; Department of Anatomy Physiology and Genetics, Uniformed Services University, 4301 Jones Bridge Road, Bethesda, MD 20814, USA
- Michael D. Kessler
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Zachary Goecker
- University of California, Davis, One Shields Avenue, Davis, CA 95616, USA
- Jeremy Loffredo
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA
- Craig D. Shriver
- The John P. Murtha Cancer Center, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA
- Hai Hu
- The John P. Murtha Cancer Center, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA 15963, USA
- Glendon J. Parker
- University of California, Davis, One Shields Avenue, Davis, CA 95616, USA
- James Segars
- Johns Hopkins University Medical Center, Baltimore, MD 21218, USA
- Ayman Al-Hendy
- The University of Illinois College of Medicine, Chicago, IL 60612, USA
- John I. Risinger
- Department of Obstetrics and Gynecology, Michigan State University, East Lansing, MI 48824, USA
- Neil T. Phippen
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; The John P. Murtha Cancer Center, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA
- Yovanni Casablanca
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; The John P. Murtha Cancer Center, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA
- Kathleen M. Darcy
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; The John P. Murtha Cancer Center, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., 6720A Rockledge Dr., Suite 100, Bethesda, MD 20817, USA
- G. Larry Maxwell
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; The John P. Murtha Cancer Center, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; Department of Obstetrics and Gynecology, Inova Fairfax Medical Campus, 3300 Gallows Road, Falls Church, VA 22042, USA
- Thomas P. Conrads
- Gynecologic Cancer Center of Excellence, Department of Gynecologic Surgery and Obstetrics, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; The John P. Murtha Cancer Center, Uniformed Services University and Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda, MD 20889, USA; Department of Obstetrics and Gynecology, Inova Fairfax Medical Campus, 3300 Gallows Road, Falls Church, VA 22042, USA
- Timothy D. O'Connor
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA; Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA; Marlene and Stewart Greenebaum Comprehensive Cancer, University of Maryland School of Medicine, Baltimore, MD 21201, USA

31
Umer HM, Audain E, Zhu Y, Pfeuffer J, Sachsenberg T, Lehtiö J, Branca RM, Perez-Riverol Y. Generation of ENSEMBL-based proteogenomics databases boosts the identification of non-canonical peptides. Bioinformatics 2021; 38:1470-1472. [PMID: 34904638 PMCID: PMC8825679 DOI: 10.1093/bioinformatics/btab838] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/28/2021] [Revised: 12/07/2021] [Accepted: 12/10/2021] [Indexed: 01/06/2023] Open
Abstract
SUMMARY: We have implemented the pypgatk package and the pgdb workflow to create proteogenomics databases based on ENSEMBL resources. The tools allow the generation of protein sequences from novel protein-coding transcripts by performing a three-frame translation of pseudogenes, lncRNAs and other non-canonical transcripts, such as those produced by alternative splicing events. It also includes exonic out-of-frame translation from otherwise canonical protein-coding mRNAs. Moreover, the tool enables the generation of variant protein sequences from multiple sources of genomic variants including COSMIC, cBioportal, gnomAD and mutations detected from sequencing of patient samples. pypgatk and pgdb provide multiple functionalities for database handling including optimized target/decoy generation by the algorithm DecoyPyrat. Finally, we have reanalyzed six public datasets in PRIDE by generating cell-type specific databases for 65 cell lines using the pypgatk and pgdb workflow, revealing a wealth of non-canonical or cryptic peptides amounting to >5% of the total number of peptides identified. AVAILABILITY AND IMPLEMENTATION: The software is freely available. pypgatk: https://github.com/bigbio/py-pgatk/ and pgdb: https://nf-co.re/pgdb. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Husen M Umer
- Department of Oncology‐Pathology, Science for Life Laboratory, Karolinska Institutet, Stockholm 17165, Sweden
- Enrique Audain
- Department of Congenital Heart Disease and Pediatric Cardiology, Universitätsklinikum Schleswig-Holstein Kiel, Kiel 24105, Germany
- Yafeng Zhu
- Medical Research Center, Sun Yat-Sen Memorial Hospital, Sun Yat-sen University, Guangzhou 510120, China
- Julianus Pfeuffer
- Algorithmic Bioinformatics, Freie Universität Berlin, Berlin 14195, Germany; Visualization and Data Analysis, Zuse Institute Berlin, Berlin 14195, Germany
- Timo Sachsenberg
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, 72076 Tübingen, Germany
- Janne Lehtiö
- Department of Oncology‐Pathology, Science for Life Laboratory, Karolinska Institutet, Stockholm 17165, Sweden
- Rui M Branca
- Department of Oncology‐Pathology, Science for Life Laboratory, Karolinska Institutet, Stockholm 17165, Sweden. To whom correspondence should be addressed.
- Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. To whom correspondence should be addressed.
32. Qi F, Tan Y, Yao A, Yang X, He Y. Psoriasis to Psoriatic Arthritis: The Application of Proteomics Technologies. Front Med (Lausanne) 2021; 8:681172. [PMID: 34869404 PMCID: PMC8635007 DOI: 10.3389/fmed.2021.681172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2021] [Accepted: 10/18/2021] [Indexed: 11/13/2022] [Open access]
Abstract
Psoriatic disease (PsD) is a spectrum of diseases that affect both the skin [cutaneous psoriasis (PsC)] and the musculoskeletal system [psoriatic arthritis (PsA)]. A considerable number of patients with PsC have asymptomatic synovio-entheseal inflammation, and approximately one-third of those eventually progress to PsA through a poorly understood mechanism. Published studies have shown that intervention in very early-stage PsA can effectively prevent substantial bone destruction or deformity, which points to an unmet need for early PsA biomarkers. Proteomics technologies provide a comprehensive view of the proteins involved in the transition to PsA, offer a unique chance to map all potential peptides, and allow a direct head-to-head comparison of interaction pathways in PsC and PsA. This review summarizes the latest developments in proteomics technologies, highlights their application in PsA biomarker discovery, and discusses the possible clinically detectable PsA risk factors in patients with PsC.
Affiliation(s)
- Fei Qi
- Department of Dermatology, Capital Medical University Affiliated Beijing Chaoyang Hospital, Beijing, China
- Yaqi Tan
- Department of Dermatology, Capital Medical University Affiliated Beijing Chaoyang Hospital, Beijing, China
- Amin Yao
- Department of Dermatology, Capital Medical University Affiliated Beijing Chaoyang Hospital, Beijing, China
- Xutong Yang
- Department of Dermatology, Capital Medical University Affiliated Beijing Chaoyang Hospital, Beijing, China
- Yanling He
- Department of Dermatology, Capital Medical University Affiliated Beijing Chaoyang Hospital, Beijing, China
33. Chen L, Yang Y, Zhang Y, Li K, Cai H, Wang H, Zhao Q. The Small Open Reading Frame-Encoded Peptides: Advances in Methodologies and Functional Studies. Chembiochem 2021; 23:e202100534. [PMID: 34862721 DOI: 10.1002/cbic.202100534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2021] [Revised: 11/15/2021] [Indexed: 11/07/2022]
Abstract
Small open reading frames (sORFs) are an important class of genes with fewer than 100 codons. They were historically annotated as noncoding or even junk sequences. In recent years, accumulating evidence has suggested that sORFs could encode a considerable number of polypeptides, many of which play important roles in both physiology and disease pathology. However, it has been technically challenging to directly detect sORF-encoded peptides (SEPs). Here, we discuss the latest advances in methodologies for identifying SEPs with mass spectrometry, as well as the progress on functional studies of SEPs.
Affiliation(s)
- Lei Chen
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China; Laboratory for Synthetic Chemistry and Chemical Biology Limited, Hong Kong Science and Technology Park, New Territories, Hong Kong SAR, 999077, P. R. China
- Ying Yang
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
- Yuanliang Zhang
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
- Kecheng Li
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
- Hongmin Cai
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510623, P. R. China
- Hongwei Wang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, 510623, P. R. China
- Qian Zhao
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
34. Lehtiö J, Arslan T, Siavelis I, Pan Y, Socciarelli F, Berkovska O, Umer HM, Mermelekas G, Pirmoradian M, Jönsson M, Brunnström H, Brustugun OT, Purohit KP, Cunningham R, Asl HF, Isaksson S, Arbajian E, Aine M, Karlsson A, Kotevska M, Hansen CG, Haakensen VD, Helland Å, Tamborero D, Johansson HJ, Branca RM, Planck M, Staaf J, Orre LM. Proteogenomics of non-small cell lung cancer reveals molecular subtypes associated with specific therapeutic targets and immune evasion mechanisms. Nat Cancer 2021; 2:1224-1242. [PMID: 34870237 PMCID: PMC7612062 DOI: 10.1038/s43018-021-00259-9] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Indexed: 12/30/2022]
Abstract
Despite major advancements in lung cancer treatment, long-term survival is still rare, and a deeper understanding of molecular phenotypes would allow the identification of specific cancer dependencies and immune evasion mechanisms. Here we performed in-depth mass spectrometry (MS)-based proteogenomic analysis of 141 tumors representing all major histologies of non-small cell lung cancer (NSCLC). We identified six distinct proteome subtypes with striking differences in immune cell composition and subtype-specific expression of immune checkpoints. Unexpectedly, high neoantigen burden was linked to global hypomethylation and complex neoantigens mapped to genomic regions, such as endogenous retroviral elements and introns, in immune-cold subtypes. Further, we linked immune evasion with LAG3 via STK11 mutation-dependent HNF1A activation and FGL1 expression. Finally, we develop a data-independent acquisition MS-based NSCLC subtype classification method, validate it in an independent cohort of 208 NSCLC cases and demonstrate its clinical utility by analyzing an additional cohort of 84 late-stage NSCLC biopsy samples.
Affiliation(s)
- Janne Lehtiö
- Department of Oncology and Pathology, Karolinska Institutet, SciLifeLab, Solna, Sweden
- Taner Arslan
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, SE-17165, Sweden
- Ioannis Siavelis
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, SE-17165, Sweden
- Yanbo Pan
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, SE-17165, Sweden
- Fabio Socciarelli
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, SE-17165, Sweden
- Olena Berkovska
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, SE-17165, Sweden
- Husen M. Umer
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, SE-17165, Sweden
- Georgios Mermelekas
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, SE-17165, Sweden
- Mohammad Pirmoradian
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, SE-17165, Sweden
- Mats Jönsson
- Division of Oncology, Department of Clinical Sciences, Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
- Hans Brunnström
- Department of Pathology, Laboratory Medicine Region Skåne, Lund, Sweden; Division of Pathology, Department of Clinical Sciences, Lund, Lund University, Lund, Sweden
- Odd Terje Brustugun
- Section of Oncology, Drammen Hospital, Vestre Viken Health Trust, Drammen, Norway; Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Krishna Pinganksha Purohit
- University of Edinburgh Centre for Inflammation Research, Institute for Regeneration and Repair, Queen’s Medical Research Institute, Edinburgh bioQuarter, 47 Little France Crescent, Edinburgh EH16 4TJ, UK; MRC Centre for Regenerative Medicine, Institute for Regeneration and Repair, University of Edinburgh, Edinburgh bioQuarter, 5 Little France Drive, Edinburgh EH16 4UU, UK
- Richard Cunningham
- University of Edinburgh Centre for Inflammation Research, Institute for Regeneration and Repair, Queen’s Medical Research Institute, Edinburgh bioQuarter, 47 Little France Crescent, Edinburgh EH16 4TJ, UK; MRC Centre for Regenerative Medicine, Institute for Regeneration and Repair, University of Edinburgh, Edinburgh bioQuarter, 5 Little France Drive, Edinburgh EH16 4UU, UK
- Hassan Foroughi Asl
- Genomic Medicine Center, Karolinska University Hospital, Stockholm, Sweden; Clinical Genomics Facility, Department of Microbiology, Tumour and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Sofi Isaksson
- Division of Oncology, Department of Clinical Sciences, Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
- Elsa Arbajian
- Division of Oncology, Department of Clinical Sciences, Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
- Mattias Aine
- Division of Oncology, Department of Clinical Sciences, Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
- Anna Karlsson
- Division of Oncology, Department of Clinical Sciences, Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
- Marija Kotevska
- Division of Oncology, Department of Clinical Sciences, Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden; Department of Respiratory Medicine and Allergology, Skåne University Hospital, Lund, Sweden
- Carsten Gram Hansen
- University of Edinburgh Centre for Inflammation Research, Institute for Regeneration and Repair, Queen’s Medical Research Institute, Edinburgh bioQuarter, 47 Little France Crescent, Edinburgh EH16 4TJ, UK; MRC Centre for Regenerative Medicine, Institute for Regeneration and Repair, University of Edinburgh, Edinburgh bioQuarter, 5 Little France Drive, Edinburgh EH16 4UU, UK
- Vilde Drageset Haakensen
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway; Department of Oncology, Oslo University Hospital, Oslo, Norway
- Åslaug Helland
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway; Department of Oncology, Oslo University Hospital, Oslo, Norway; Faculty of Medicine, University of Oslo, Norway
- David Tamborero
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, SE-17165, Sweden
- Henrik J. Johansson
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, SE-17165, Sweden
- Rui M. Branca
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, SE-17165, Sweden
- Maria Planck
- Division of Oncology, Department of Clinical Sciences, Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden; Department of Respiratory Medicine and Allergology, Skåne University Hospital, Lund, Sweden
- Johan Staaf
- Division of Oncology, Department of Clinical Sciences, Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
- Lukas M. Orre
- Department of Oncology and Pathology, Karolinska Institutet, Science for Life Laboratory, Solna, SE-17165, Sweden
35. Fijalkowski I, Peeters MKR, Van Damme P. Small Protein Enrichment Improves Proteomics Detection of sORF Encoded Polypeptides. Front Genet 2021; 12:713400. [PMID: 34721520 PMCID: PMC8554064 DOI: 10.3389/fgene.2021.713400] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/22/2021] [Accepted: 10/01/2021] [Indexed: 11/13/2022] [Open access]
Abstract
With the rapid growth in the number of sequenced genomes, genome annotation efforts have become almost exclusively reliant on automated pipelines. Despite their unquestionable utility, these methods have been shown to underestimate the true complexity of the studied genomes, with small open reading frames (sORFs; ORFs typically considered shorter than 300 nucleotides) and, in consequence, their protein products (sORF encoded polypeptides or SEPs) being the primary example of a poorly annotated and highly underexplored class of genomic elements. With the advent of advanced translatomics such as ribosome profiling, reannotation efforts have progressed a great deal in providing translation evidence for numerous, previously unannotated sORFs. However, proteomics validation of these riboproteogenomics discoveries remains challenging due to their short length and often highly variable physicochemical properties. In this work we evaluate and compare tailored, yet easily adaptable, protein extraction methodologies for their efficacy in the extraction and concomitant proteomics detection of SEPs expressed in the prokaryotic model pathogen Salmonella typhimurium (S. typhimurium). Further, an optimized protocol for the enrichment and efficient detection of SEPs, making use of the amphipathic polymer amphipol A8-35 and relying on differential peptide vs. protein solubility, was developed and compared with global extraction methods making use of chaotropic agents. Given the versatile biological functions SEPs have been shown to exert, this work provides an accessible protocol for proteomics exploration of this fascinating class of small proteins.
Affiliation(s)
- Igor Fijalkowski
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, Gent, Belgium
- Marlies K. R. Peeters
- BioBix, Department of Data Analysis and Mathematical Modelling, Ghent University, Gent, Belgium
- Petra Van Damme
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, Gent, Belgium
36. Babu N, Bhat MY, John AE, Chatterjee A. The role of proteomics in the multiplexed analysis of gene alterations in human cancer. Expert Rev Proteomics 2021; 18:737-756. [PMID: 34602018 DOI: 10.1080/14789450.2021.1984884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Abstract
Introduction: Proteomics has played a pivotal role in identifying proteins perturbed in disease conditions when compared with healthy samples. The study of dysregulated proteins aids in identifying diagnostic markers and potential therapeutic targets. Cancer is an outcome of the interplay of several such dysregulated proteins and molecular pathways that perturb cellular homeostasis, resulting in transformation. In this review, we discuss various facets of proteomic approaches, including tools and technological advancements, that aid in understanding differentially expressed molecules and signaling mechanisms. Areas covered: In this review, we document the different methods of proteomic studies, ranging from labeling techniques and data analysis methods to the nature of the molecule detected. We summarize each technique and provide a glimpse of cancer research carried out using it, highlighting its advantages and drawbacks in comparison with others. Literature searches were carried out using online resources such as PubMed and Google Scholar. Expert opinion: Technological advancements in proteomics studies have come a long way from the two-dimensional mapping of proteins separated on gels in the early 1970s. Higher precision in molecular identification and quantification (high throughput) and a greater number of samples analyzed have been the focus of researchers.
Affiliation(s)
- Niraj Babu
- Institute of Bioinformatics, International Technology Park, Bangalore, Bangalore, 560066, India; Manipal Academy of Higher Education (MAHE), Manipal, India
- Mohd Younis Bhat
- Institute of Bioinformatics, International Technology Park, Bangalore, Bangalore, 560066, India
- Aditi Chatterjee
- Institute of Bioinformatics, International Technology Park, Bangalore, Bangalore, 560066, India; Manipal Academy of Higher Education (MAHE), Manipal, India
37. Vitorino R, Choudhury M, Guedes S, Ferreira R, Thongboonkerd V, Sharma L, Amado F, Srivastava S. Peptidomics and proteogenomics: background, challenges and future needs. Expert Rev Proteomics 2021; 18:643-659. [PMID: 34517741 DOI: 10.1080/14789450.2021.1980388] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/20/2022]
Abstract
Introduction: With available genomic data and related information, it is becoming possible to better highlight mutations or genomic alterations associated with a particular disease or disorder. The advent of high-throughput sequencing technologies has greatly advanced diagnostics, prognostics, and drug development. Areas covered: Peptidomics and proteogenomics are the two post-genomic technologies that enable the simultaneous study of peptides and proteins/transcripts/genes. Both technologies add a remarkably large amount of data to the pool of information on various peptides associated with gene mutations or genome remodeling. Literature search was performed in the PubMed database and is up to date. Expert opinion: This article lists various techniques used for peptidomic and proteogenomic analyses. It also explains various bioinformatics workflows developed to understand differentially expressed peptides/proteins and their role in disease pathogenesis. Their role in deciphering disease pathways, cancer research, and biomarker discovery using biofluids is highlighted. Finally, the challenges and future requirements to overcome the current limitations for their effective clinical use are also discussed.
Affiliation(s)
- Rui Vitorino
- Faculdade de Medicina da Universidade do Porto, Porto, Portugal; iBiMED, Department of Medical Sciences, University of Aveiro, Aveiro, Portugal; LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Manisha Choudhury
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, Powai, India
- Sofia Guedes
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Rita Ferreira
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Visith Thongboonkerd
- Medical Proteomics Unit, Office for Research and Development, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Francisco Amado
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Sanjeeva Srivastava
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, Powai, India
38. Hyung D, Baek MJ, Lee J, Cho J, Kim HS, Park C, Cho SY. Protein-gene Expression Nexus: Comprehensive characterization of human cancer cell lines with proteogenomic analysis. Comput Struct Biotechnol J 2021; 19:4759-4769. [PMID: 34504668 PMCID: PMC8405889 DOI: 10.1016/j.csbj.2021.08.022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/28/2021] [Revised: 08/13/2021] [Accepted: 08/14/2021] [Indexed: 12/30/2022] [Open access]
Abstract
Researchers have gained new therapeutic insights using multi-omics platform approaches to study DNA, RNA, and proteins of comprehensively characterized human cancer cell lines. To improve our understanding of the molecular features associated with oncogenic modulation in cancer, we propose a proteogenomic database for human cancer cell lines, called Protein-gene Expression Nexus (PEN). We have expanded the characterization of cancer cell lines to include genetic, mRNA, and protein data of 145 cancer cell lines from various public studies. PEN contains proteomic and phosphoproteomic data on 4,129,728 peptides, 13,862 proteins, 7,138 phosphorylation site-associated genomic variations, 117 studies, and 12 cancer types. We analyzed functional characterizations along with the integrated datasets, such as cis/trans association for copy number alteration (CNA), single amino acid variation (SAAV) for coding genes, post-translational modification site variation for SAAVs, and novel peptide expression for noncoding regions and fusion genes. PEN provides a user-friendly interface for searching, browsing, and downloading data and also supports the visualization of genome-wide association between CNA and expression, novel peptide landscape, mRNA-protein abundance, and functional annotation. Together, this dataset and the PEN data portal provide a resource to accelerate cancer research using model cancer cell lines. PEN is freely accessible at http://combio.snu.ac.kr/pen.
Affiliation(s)
- Daejin Hyung
- National Cancer Center, 323 Ilsan-ro, Goyang-si, Gyeonggi-do 10408, Republic of Korea
- Min-Jeong Baek
- National Cancer Center, 323 Ilsan-ro, Goyang-si, Gyeonggi-do 10408, Republic of Korea
- Jongkeun Lee
- National Cancer Center, 323 Ilsan-ro, Goyang-si, Gyeonggi-do 10408, Republic of Korea
- Juyeon Cho
- National Cancer Center, 323 Ilsan-ro, Goyang-si, Gyeonggi-do 10408, Republic of Korea
- Hyoun Sook Kim
- National Cancer Center, 323 Ilsan-ro, Goyang-si, Gyeonggi-do 10408, Republic of Korea
- Charny Park
- National Cancer Center, 323 Ilsan-ro, Goyang-si, Gyeonggi-do 10408, Republic of Korea
- Soo Young Cho
- National Cancer Center, 323 Ilsan-ro, Goyang-si, Gyeonggi-do 10408, Republic of Korea; Department of Molecular and Life Science, Hanyang University, Ansan 15588, Republic of Korea
39. Hirsch FR, Walker J, Higgs BW, Cooper ZA, Raja RG, Wistuba II. The Combiome Hypothesis: Selecting Optimal Treatment for Cancer Patients. Clin Lung Cancer 2021; 23:1-13. [PMID: 34645581 DOI: 10.1016/j.cllc.2021.08.011] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/28/2021] [Revised: 08/16/2021] [Accepted: 08/19/2021] [Indexed: 01/10/2023]
Abstract
Existing approaches for cancer diagnosis are inefficient in the use of diagnostic tissue, and decision-making is often sequential, typically resulting in delayed treatment initiation. Future diagnostic testing needs to be faster and optimize increasingly complex treatment decisions. We envision a future where comprehensive testing is routine. Our approach, termed the "combiome," combines holistic information from the tumor and the patient's immune system. The combiome model proposed here advocates synchronized up-front testing with a panel of sensitive assays, revealing a more complete understanding of the patient phenotype and improved targeting and sequencing of treatments. Development and eventual adoption of the combiome model for diagnostic testing may provide better outcomes for all cancer patients, but will require significant changes in workflows, technology, regulations, and administration. In this review, we discuss the current and future testing landscape, targeting of personalized treatments, and technological and regulatory advances necessary to achieve the combiome.
Affiliation(s)
- Fred R Hirsch
- Center for Thoracic Oncology, Icahn School of Medicine at Mount Sinai, Tisch Cancer Institute, New York, NY
- Jill Walker
- Precision Medicine, R&D Oncology, AstraZeneca, Cambridge, UK
- Brandon W Higgs
- Translational and Clinical Data Sciences, Genmab, Princeton, NJ
- Zachary A Cooper
- Translational Medicine, R&D Oncology, AstraZeneca, Gaithersburg, MD
- Rajiv G Raja
- Translational Medicine, R&D Oncology, AstraZeneca, Gaithersburg, MD
- Ignacio I Wistuba
- Department of Translational Molecular Pathology, Division of Pathology and Laboratory Medicine, University of Texas MD Anderson Cancer Center, Houston, TX
40. He C, Guo J, Tian W, Wong CCL. Proteogenomics Integrating Novel Junction Peptide Identification Strategy Discovers Three Novel Protein Isoforms of Human NHSL1 and EEF1B2. J Proteome Res 2021; 20:5294-5303. [PMID: 34420305 DOI: 10.1021/acs.jproteome.1c00373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
In eukaryotes, alternative pre-mRNA splicing allows a single gene to encode different protein isoforms that function in many biological processes, and these isoforms are used as biomarkers or therapeutic targets for diseases. Although protein isoforms in the human genome are well annotated, we speculate that some low-abundance protein isoforms may still be under-annotated because most genes have a primary coding product and alternative protein isoforms tend to be under-expressed. A peptide co-encoded by a novel exon and an annotated exon separated by an intron is known as a novel junction peptide. In the absence of known transcripts and homologous proteins, traditional whole-genome six-frame translation-based proteogenomics cannot identify novel junction peptides, and it cannot capture novel alternative splice sites. In this article, we first propose a strategy and tool for identifying novel junction peptides, called CJunction, which we then integrate into a proteogenomics process specifically designed for novel protein isoform discovery and apply to the analysis of a deep-coverage HeLa mass spectrometry data set with identifier PXD004452 in ProteomeXchange. We succeeded in identifying and validating three novel protein isoforms of two functionally important genes, NHSL1 (causative gene of Nance-Horan syndrome) and EEF1B2 (translation elongation factor), which validates our hypothesis. These novel protein isoforms have significant sequence differences from the annotated gene-coding products, introduced by the novel N-terminus, suggesting that they may have substantially different functions.
Affiliation(s)
- Cuitong He
- Peking-Tsinghua Centre for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, 100871 Beijing, China; Center for Precision Medicine Multi-Omics Research, Peking University Health Science Center, 100191 Beijing, China
- Jiangtao Guo
- Center for Precision Medicine Multi-Omics Research, Peking University Health Science Center, 100191 Beijing, China
- Wenmin Tian
- Center for Precision Medicine Multi-Omics Research, Peking University Health Science Center, 100191 Beijing, China; School of Basic Medical Sciences, Peking University Health Science Center, 100191 Beijing, China
- Catherine C L Wong
- Peking-Tsinghua Centre for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, 100871 Beijing, China; Center for Precision Medicine Multi-Omics Research, Peking University Health Science Center, 100191 Beijing, China; School of Basic Medical Sciences, Peking University Health Science Center, 100191 Beijing, China; Peking University First Hospital, 100034 Beijing, China; Advanced Innovation Center for Human Brain Protection, Capital Medical University, 100069 Beijing, China
41. Hampel H, Nisticò R, Seyfried NT, Levey AI, Modeste E, Lemercier P, Baldacci F, Toschi N, Garaci F, Perry G, Emanuele E, Valenzuela PL, Lucia A, Urbani A, Sancesario GM, Mapstone M, Corbo M, Vergallo A, Lista S. Omics sciences for systems biology in Alzheimer's disease: State-of-the-art of the evidence. Ageing Res Rev 2021; 69:101346. [PMID: 33915266 DOI: 10.1016/j.arr.2021.101346] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 09/09/2020] [Revised: 04/06/2021] [Accepted: 04/22/2021] [Indexed: 12/12/2022]
Abstract
Alzheimer's disease (AD) is characterized by non-linear, genetic-driven pathophysiological dynamics with high heterogeneity in biological alterations and disease spatial-temporal progression. Human in-vivo and post-mortem studies point to a failure of the multi-level biological networks underlying AD pathophysiology, including proteostasis (amyloid-β and tau), synaptic homeostasis, inflammatory and immune responses, lipid and energy metabolism, and oxidative stress. Therefore, a holistic, systems-level approach is needed to fully capture the multi-faceted pathophysiology of AD. Omics sciences (genomics, epigenomics, transcriptomics, proteomics, metabolomics, lipidomics) embedded in the systems biology (SB) theoretical and computational framework can generate explainable readouts describing the entire biological continuum of a disease. Such a path in neurology is encouraged by the promising results of omics sciences and SB approaches in oncology, where stage-driven pathway-based therapies have been developed in line with the precision medicine paradigm. Multi-omics data integrated in SB network approaches will help detect and chart AD upstream pathomechanistic alterations and downstream molecular effects occurring in preclinical stages. Finally, integrating omics and neuroimaging data (i.e., neuroimaging-omics) will identify multi-dimensional biological signatures essential to track the clinical-biological trajectories at the subpopulation or even individual level.
|
42
|
Zhang J, Li S, Xie N, Nie G, Tang A, Zhang XE, Liang M, Yan X. A natural nanozyme in life is found: the iron core within ferritin shows superoxide dismutase catalytic activity. Sci China Life Sci 2021; 64:1375-1378. [PMID: 33481166 DOI: 10.1007/s11427-020-1865-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/01/2020] [Accepted: 12/10/2020] [Indexed: 01/18/2023]
Affiliation(s)
- Jianlin Zhang
  - CAS Engineering Laboratory for Nanozyme, Key Laboratory of Protein and Peptide Pharmaceutical, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
  - Institute of Translational Medicine, Shenzhen Second People's Hospital, The First Affiliation Hospital of Shenzhen University, Health Science Center, Shenzhen, 518035, China
  - Guangdong Provincial Key Laboratory of Brain Function and Disease, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
  - National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- Shimin Li
  - CAS Engineering Laboratory for Nanozyme, Key Laboratory of Protein and Peptide Pharmaceutical, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
  - National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Ni Xie
  - Institute of Translational Medicine, Shenzhen Second People's Hospital, The First Affiliation Hospital of Shenzhen University, Health Science Center, Shenzhen, 518035, China
- Guohui Nie
  - Institute of Translational Medicine, Shenzhen Second People's Hospital, The First Affiliation Hospital of Shenzhen University, Health Science Center, Shenzhen, 518035, China
- Aifa Tang
  - Institute of Translational Medicine, Shenzhen Second People's Hospital, The First Affiliation Hospital of Shenzhen University, Health Science Center, Shenzhen, 518035, China
  - Shenzhen Luohu Hospital Group, The 3rd Affiliated Hospital of Shenzhen University, Shenzhen, 518001, China
- Xian-En Zhang
  - National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- Minmin Liang
  - Experimental Center of Advanced Materials, School of Materials Science & Engineering, Beijing Institute of Technology, Beijing, 100081, China
- Xiyun Yan
  - CAS Engineering Laboratory for Nanozyme, Key Laboratory of Protein and Peptide Pharmaceutical, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
|
43
|
Poli G, Hasan S, Belia S, Cenciarini M, Tucker SJ, Imbrici P, Shehab S, Pessia M, Brancorsini S, D’Adamo MC. Kcnj16 (Kir5.1) Gene Ablation Causes Subfertility and Increases the Prevalence of Morphologically Abnormal Spermatozoa. Int J Mol Sci 2021; 22:5972. [PMID: 34205849 PMCID: PMC8199489 DOI: 10.3390/ijms22115972] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/26/2021] [Revised: 05/26/2021] [Accepted: 05/27/2021] [Indexed: 12/16/2022] Open
Abstract
The ability of spermatozoa to swim towards an oocyte and fertilize it depends on precise K+ permeability changes. Kir5.1 is an inwardly-rectifying potassium (Kir) channel with high sensitivity to intracellular H+ (pHi) and extracellular K+ concentration [K+]o, and hence provides a link between pHi and [K+]o changes and membrane potential. The intrinsic pHi sensitivity of Kir5.1 suggests a possible role for this channel in the pHi-dependent processes that take place during fertilization. However, despite the localization of Kir5.1 in murine spermatozoa, and its increased expression with age and sexual maturity, the role of the channel in sperm morphology, maturity, motility, and fertility is unknown. Here, we confirmed the presence of Kir5.1 in spermatozoa and showed strong expression of Kir4.1 channels in smooth muscle and epithelial cells lining the epididymal ducts. In contrast, Kir4.2 expression was not detected in testes. To examine the possible role of Kir5.1 in sperm physiology, we bred mice with a deletion of the Kcnj16 (Kir5.1) gene and observed that 20% of Kir5.1 knock-out male mice were infertile. Furthermore, 50% of knock-out mice older than 3 months were unable to breed. By contrast, 100% of wild-type (WT) mice were fertile. The genetic inactivation of Kcnj16 also resulted in smaller testes and a greater percentage of sperm with folded flagellum compared to WT littermates. Nevertheless, the abnormal sperm from mutant animals displayed increased progressive motility. Thus, ablation of the Kcnj16 gene identifies Kir5.1 channel as an important element contributing to testis development, sperm flagellar morphology, motility, and fertility. These findings are potentially relevant to the understanding of the complex pHi- and [K+]o-dependent interplay between different sperm ion channels, and provide insight into their role in fertilization and infertility.
Affiliation(s)
- Giulia Poli
  - Section of Pathology, Department of Medicine and Surgery, University of Perugia, 06132 Perugia, Italy
- Sonia Hasan
  - Department of Physiology, Faculty of Medicine, Kuwait University, Safat 13110, Kuwait
- Silvia Belia
  - Department of Chemistry Biology and Biotechnology, University of Perugia, 06123 Perugia, Italy
- Marta Cenciarini
  - Section of Physiology & Biochemistry, Department of Medicine and Surgery, University of Perugia, 06132 Perugia, Italy
- Stephen J. Tucker
  - Clarendon Laboratory, Department of Physics, University of Oxford, Oxford OX1 3PU, UK
- Paola Imbrici
  - Department of Pharmacy-Drug Sciences, University of Bari “Aldo Moro”, 70125 Bari, Italy
- Safa Shehab
  - Department of Anatomy, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 17666, United Arab Emirates
- Mauro Pessia
  - Department of Physiology & Biochemistry, Faculty of Medicine and Surgery, University of Malta, MSD 2080 Msida, Malta
  - Department of Physiology, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 17666, United Arab Emirates
- Stefano Brancorsini
  - Section of Pathology, Department of Medicine and Surgery, University of Perugia, 06132 Perugia, Italy
- Maria Cristina D’Adamo
  - Department of Physiology & Biochemistry, Faculty of Medicine and Surgery, University of Malta, MSD 2080 Msida, Malta
|
44
|
Mou T, Pawitan Y, Stahl M, Vesterlund M, Deng W, Jafari R, Bohlin A, Österroos A, Siavelis L, Bäckvall H, Erkers T, Kiviluoto S, Seashore‐Ludlow B, Östling P, Orre LM, Kallioniemi O, Lehmann S, Lehtiö J, Vu TN. The transcriptome-wide landscape of molecular subtype-specific mRNA expression profiles in acute myeloid leukemia. Am J Hematol 2021; 96:580-588. [PMID: 33625756 DOI: 10.1002/ajh.26141] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/11/2021] [Revised: 02/17/2021] [Accepted: 02/23/2021] [Indexed: 12/19/2022]
Abstract
Molecular classification of acute myeloid leukemia (AML) aids prognostic stratification and clinical management. Our aim in this study is to identify transcriptome-wide mRNAs that are specific to each of the molecular subtypes of AML. We analyzed RNA-sequencing data of 955 AML samples from three cohorts, including the BeatAML project, the Cancer Genome Atlas, and a cohort of Swedish patients to provide a comprehensive transcriptome-wide view of subtype-specific mRNA expression. We identified 729 subtype-specific mRNAs, discovered in the BeatAML project and validated in the other two cohorts. Using unique proteomics data, we also validated the presence of subtype-specific mRNAs at the protein level, yielding a rich collection of potential protein-based biomarkers for the AML community. To enable the exploration of subtype-specific mRNA expression by the broader scientific community, we provide an interactive resource to the public.
Affiliation(s)
- Tian Mou
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
  - School of Biomedical Engineering, Shenzhen University, Shenzhen, China
- Yudi Pawitan
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Matthias Stahl
  - Department of Oncology Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
- Mattias Vesterlund
  - Department of Oncology Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
- Wenjiang Deng
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Rozbeh Jafari
  - Department of Oncology Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
- Anna Bohlin
  - Department of Medicine Huddinge, Karolinska Institutet, Unit for Hematology, Karolinska University Hospital Huddinge, Stockholm, Sweden
- Albin Österroos
  - Department of Medical Sciences, Section of Hematology, Uppsala University Hospital, Uppsala, Sweden
- Loannis Siavelis
  - Department of Oncology Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
- Helena Bäckvall
  - Department of Oncology Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
- Tom Erkers
  - Department of Oncology Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
- Santeri Kiviluoto
  - Department of Oncology Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
- Brinton Seashore‐Ludlow
  - Department of Oncology Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
- Päivi Östling
  - Department of Oncology Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Lukas M. Orre
  - Department of Oncology Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
- Olli Kallioniemi
  - Department of Oncology Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Sören Lehmann
  - Department of Medicine Huddinge, Karolinska Institutet, Unit for Hematology, Karolinska University Hospital Huddinge, Stockholm, Sweden
  - Department of Medical Sciences, Section of Hematology, Uppsala University Hospital, Uppsala, Sweden
- Janne Lehtiö
  - Department of Oncology Pathology, Karolinska Institutet, Science for Life Laboratory, Stockholm, Sweden
- Trung Nghia Vu
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
|
45
|
Xiang R, Ma L, Yang M, Zheng Z, Chen X, Jia F, Xie F, Zhou Y, Li F, Wu K, Zhu Y. Increased expression of peptides from non-coding genes in cancer proteomics datasets suggests potential tumor neoantigens. Commun Biol 2021; 4:496. [PMID: 33888849 PMCID: PMC8062694 DOI: 10.1038/s42003-021-02007-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/05/2020] [Accepted: 03/22/2021] [Indexed: 02/05/2023] Open
Abstract
Neoantigen-based immunotherapy has yielded promising results in clinical trials. However, it is limited to tumor-specific mutations, and is often tailored to individual patients. Identifying suitable tumor-specific antigens is still a major challenge. Previous proteogenomics studies have identified peptides encoded by predicted non-coding sequences in the human genome. To investigate whether tumors express specific peptides encoded by non-coding genes, we analyzed published proteomics data from five cancer types including 933 tumor samples and 275 matched normal samples and compared these to data from 31 different healthy human tissues. Our results reveal that many predicted non-coding genes such as DGCR9 and RHOXF1P3 encode peptides that are overexpressed in tumors compared to normal controls. Furthermore, from the non-coding gene-encoded peptides specifically detected in cancers, we predict a large number of “dark antigens” (neoantigens from non-coding genomic regions), which may provide an alternative source of neoantigens beyond standard tumor-specific mutations.
Rong Xiang et al. analyze the expression of non-coding gene-encoded peptides in publicly available proteomics data from five cancer types and matched controls. They identify peptides from non-coding genes including DGCR9 and RHOXF1P3 that are upregulated in tumors compared to controls, suggesting that non-coding gene-encoded peptides may be a source of neoantigens in some cancers.
Affiliation(s)
- Rong Xiang
  - BGI Education Center, University of Chinese Academy of Sciences, Shenzhen, China
  - BGI-Shenzhen, Shenzhen, China
- Leyao Ma
  - BGI-Shenzhen, Shenzhen, China
  - Southeast University, Nanjing, China
- Yiming Zhou
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Medical Research Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Fuqiang Li
  - BGI-Shenzhen, Shenzhen, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen, China
- Kui Wu
  - BGI-Shenzhen, Shenzhen, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen, China
- Yafeng Zhu
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Medical Research Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
|
46
|
Tooley JG, Catlin JP, Schaner Tooley CE. CREB-mediated transcriptional activation of NRMT1 drives muscle differentiation. Transcription 2021; 12:72-88. [PMID: 34403304 PMCID: PMC8555533 DOI: 10.1080/21541264.2021.1963627] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/24/2021] [Revised: 07/28/2021] [Accepted: 07/29/2021] [Indexed: 12/29/2022] Open
Abstract
The N-terminal methyltransferase NRMT1 is an important regulator of protein/DNA interactions and plays a role in many cellular processes, including mitosis, cell cycle progression, chromatin organization, DNA damage repair, and transcriptional regulation. Accordingly, loss of NRMT1 results in both developmental pathologies and oncogenic phenotypes. Though NRMT1 plays such important and diverse roles in the cell, little is known about its own regulation. To better understand the mechanisms governing NRMT1 expression, we first identified its predominant transcriptional start site and minimal promoter region with predicted transcription factor motifs. We then used a combination of luciferase and binding assays to confirm CREB1 as the major regulator of NRMT1 transcription. We tested which conditions known to activate CREB1 also activated NRMT1 transcription, and found CREB1-mediated NRMT1 expression was increased during recovery from serum starvation and muscle cell differentiation. To determine how NRMT1 expression affects myoblast differentiation, we used CRISPR/Cas9 technology to knock out NRMT1 expression in immortalized C2C12 mouse myoblasts. C2C12 cells depleted of NRMT1 lacked Pax7 expression and were unable to proceed down the muscle differentiation pathway. Instead, they took on characteristics of C2C12 cells that have transdifferentiated into osteoblasts, including increased alkaline phosphatase and type I collagen expression and decreased proliferation. These data implicate NRMT1 as an important downstream target of CREB1 during muscle cell differentiation.
Affiliation(s)
- John G. Tooley
  - Department of Biochemistry, Jacobs School of Medicine and Biomedical Sciences, State University of New York at Buffalo, Buffalo, NY, USA
- James P. Catlin
  - Department of Biochemistry, Jacobs School of Medicine and Biomedical Sciences, State University of New York at Buffalo, Buffalo, NY, USA
- Christine E. Schaner Tooley
  - Department of Biochemistry, Jacobs School of Medicine and Biomedical Sciences, State University of New York at Buffalo, Buffalo, NY, USA
|
47
|
LncMachine: a machine learning algorithm for long noncoding RNA annotation in plants. Funct Integr Genomics 2021; 21:195-204. [PMID: 33635499 DOI: 10.1007/s10142-021-00769-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/13/2020] [Revised: 01/20/2021] [Accepted: 01/25/2021] [Indexed: 12/09/2022]
Abstract
Following the elucidation of the critical roles they play in numerous important biological processes, long noncoding RNAs (lncRNAs) have gained vast attention in recent years. Manual annotation of lncRNAs is restricted by known gene annotations and is prone to false prediction due to the incompleteness of available data. However, with the advent of high-throughput sequencing technologies, a large amount of high-quality data has become available for annotation, especially for plant species such as wheat. Here, we compared the prediction accuracies of several machine learning algorithms using 10-fold cross-validation. This study includes a comprehensive feature selection step to filter out irrelevant and repeated features. We present a crop-specific, alignment-free coding potential prediction tool, LncMachine, that achieves higher prediction accuracies than the currently available popular tools (CPC2, CPAT, and CNIT) when used with the Random Forest algorithm. Further, LncMachine with Random Forest performed well on human and mouse data, with an average accuracy of 92.67%. LncMachine requires only a FASTA file or a TAB-separated CSV file containing features as input. LncMachine can deploy several user-provided algorithms in real time and can therefore be effortlessly applied to a wide range of studies.
|
48
|
Schlesinger D, Elsässer SJ. Revisiting sORFs: overcoming challenges to identify and characterize functional microproteins. FEBS J 2021; 289:53-74. [PMID: 33595896 DOI: 10.1111/febs.15769] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 08/21/2020] [Revised: 01/17/2021] [Accepted: 02/15/2021] [Indexed: 02/07/2023]
Abstract
Short ORFs (sORFs), that is, occurrences of a start and stop codon within 100 codons or less, can be found in organisms of all domains of life, outnumbering annotated protein-coding ORFs by orders of magnitude. Even though functional proteins smaller than 100 amino acids are known, the coding potential of sORFs has often been overlooked, as it is not trivial to predict and test for functionality within the large number of sORFs. Recent advances in ribosome profiling and mass spectrometry approaches, together with refined bioinformatic predictions, have enabled a huge leap forward in this field and identified thousands of likely coding sORFs. A relatively low number of small proteins or microproteins produced from these sORFs have been characterized so far on the molecular, structural, and/or mechanistic level. These however display versatile and, in some cases, essential cellular functions, allowing for the exciting possibility that many more, previously unknown small proteins might be encoded in the genome, waiting to be discovered. This review will give an overview of the steadily growing microprotein field, focusing on eukaryotic small proteins. We will discuss emerging themes in the molecular action of microproteins, as well as advances and challenges in microprotein identification and characterization.
Affiliation(s)
- Dörte Schlesinger
  - Science for Life Laboratory, Division of Genome Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
  - Ming Wai Lau Centre for Reparative Medicine, Stockholm node, Karolinska Institutet, Stockholm, Sweden
- Simon J Elsässer
  - Science for Life Laboratory, Division of Genome Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
  - Ming Wai Lau Centre for Reparative Medicine, Stockholm node, Karolinska Institutet, Stockholm, Sweden
|
49
|
Gunnarsson S, Prabakaran S. In silico identification of novel open reading frames in Plasmodium falciparum oocyte and salivary gland sporozoites using proteogenomics framework. Malar J 2021; 20:71. [PMID: 33546698 PMCID: PMC7866754 DOI: 10.1186/s12936-021-03598-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/11/2020] [Accepted: 01/16/2021] [Indexed: 11/25/2022] Open
Abstract
Background: Plasmodium falciparum causes the deadliest form of malaria, which remains one of the most prevalent infectious diseases. Unfortunately, the only licensed vaccine showed limited protection and resistance to anti-malarial drugs is increasing, which can be largely attributed to the biological complexity of the parasite’s life cycle. The progression from one developmental stage to another in P. falciparum involves drastic changes in gene expression, and its infectivity to human hosts varies greatly depending on the stage. Approaches to identify candidate genes that are responsible for the development of infectivity to human hosts typically involve differential gene expression analysis between stages. However, the detection may be limited to annotated proteins and open reading frames (ORFs) predicted using restrictive criteria.
Methods: The above problem is particularly relevant for P. falciparum, whose genome annotation is relatively incomplete given its clinical significance. In this work, a systems proteogenomics approach was used to address this challenge, as it allows computational detection of unannotated, novel open reading frames (nORFs), which are neglected by conventional analyses. Two transcriptome/proteome pairs were obtained from a previous study, where one was collected in the mosquito-infectious oocyst sporozoite stage and the other in the salivary gland sporozoite stage with human infectivity. They were then re-analysed using the proteogenomics framework to identify nORFs in each stage.
Results: Translational products of nORFs that map to antisense, intergenic, intronic, 3′ UTR and 5′ UTR regions, as well as alternative reading frames of canonical proteins, were detected. Some of these nORFs also showed differential expression between the two life cycle stages studied. Their regulatory roles were explored through further bioinformatics analyses including the expression regulation on the parent reference genes, in silico structure prediction, and gene ontology term enrichment analysis.
Conclusion: The identification of nORFs in P. falciparum sporozoites highlights the biological complexity of the parasite. Although the analyses are solely computational, these results provide a starting point for further experimental validation of the existence and functional roles of these nORFs.
Affiliation(s)
- Sophie Gunnarsson
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Sudhakaran Prabakaran
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
|
50
|
Erady C, Boxall A, Puntambekar S, Suhas Jagannathan N, Chauhan R, Chong D, Meena N, Kulkarni A, Kasabe B, Prathivadi Bhayankaram K, Umrania Y, Andreani A, Nel J, Wayland MT, Pina C, Lilley KS, Prabakaran S. Pan-cancer analysis of transcripts encoding novel open-reading frames (nORFs) and their potential biological functions. NPJ Genom Med 2021; 6:4. [PMID: 33495453 PMCID: PMC7835362 DOI: 10.1038/s41525-020-00167-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 07/06/2020] [Accepted: 11/18/2020] [Indexed: 12/13/2022] Open
Abstract
Uncharacterized and unannotated open-reading frames, which we refer to as novel open reading frames (nORFs), may sometimes encode peptides that remain unexplored for novel therapeutic opportunities. To our knowledge, no systematic identification and characterization of transcripts encoding nORFs or their translation products in cancer, or in any other physiological process, has been performed. We use our curated nORFs database (nORFs.org), together with RNA-Seq data from The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) consortiums, to identify transcripts containing nORFs that are expressed frequently in cancer or matched normal tissue across 22 cancer types. We show nORFs are subject to extensive dysregulation at the transcript level in cancer tissue and that a small subset of nORFs are associated with overall patient survival, suggesting that nORFs may have prognostic value. We also show that nORF products can form protein-like structures with post-translational modifications. Finally, we perform in silico screening for inhibitors against nORF-encoded proteins that are disrupted in stomach and esophageal cancer, showing that they can potentially be targeted by inhibitors. We hope this work will guide and motivate future studies that perform in-depth characterization of nORF functions in cancer and other diseases.
Affiliation(s)
- Chaitanya Erady
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Adam Boxall
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Shraddha Puntambekar
  - Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
- N Suhas Jagannathan
  - Cancer and Stem Cell Biology Programme, and Centre for Computational Biology, Duke-NUS Medical School, Singapore, 169857, Singapore
- Ruchi Chauhan
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- David Chong
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Narendra Meena
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Apurv Kulkarni
  - Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
- Bhagyashri Kasabe
  - Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
- Yagnesh Umrania
  - Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Adam Andreani
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Jean Nel
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Matthew T Wayland
  - Department of Zoology, University of Cambridge, Downing Street, Cambridge, CB2 3EJ, UK
- Cristina Pina
  - Department of Haematology, Cambridge Biomedical Campus, Cambridge, CB2 0PT, UK
- Kathryn S Lilley
  - Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Sudhakaran Prabakaran
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
|