1. Song YC, Das D, Zhang Y, Chen MX, Fernie AR, Zhu FY, Han J. Proteogenomics-based functional genome research: approaches, applications, and perspectives in plants. Trends Biotechnol 2023; 41:1532-1548. [PMID: 37365082 DOI: 10.1016/j.tibtech.2023.05.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Revised: 05/17/2023] [Accepted: 05/30/2023] [Indexed: 06/28/2023]
Abstract
Proteogenomics (PG) integrates the proteome with the genome and transcriptome to refine gene models and annotation. Coupled with single-cell (SC) assays, PG effectively distinguishes heterogeneity among cell groups. Affiliating spatial information to PG reveals the high-resolution circuitry within SC atlases. Additionally, PG can investigate dynamic changes in protein-coding genes in plants across growth and development as well as stress and external stimulation, significantly contributing to the functional genome. Here we summarize existing PG research in plants and introduce the technical features of various methods. Combining PG with other omics, such as metabolomics and peptidomics, can offer even deeper insights into gene functions. We argue that the application of PG will represent an important font of foundational knowledge for plants.
Affiliation(s)
- Yu-Chen Song
- State Key Laboratory of Tree Genetics and Breeding, Co-Innovation Center for Sustainable Forestry in Southern China, Key Laboratory of Tree Genetics and Biotechnology of Educational Department of China, Key Laboratory of State Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China; College of Biology and Environment, Nanjing Forestry University, Nanjing 210037, China
- Debatosh Das
- College of Agriculture, Food and Natural Resources (CAFNR), Division of Plant Sciences and Technology, 52 Agricultural Building, University of Missouri-Columbia, MO 65201, USA
- Youjun Zhang
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany; Center of Plant Systems Biology and Biotechnology, Plovdiv, Bulgaria
- Mo-Xian Chen
- State Key Laboratory of Tree Genetics and Breeding, Co-Innovation Center for Sustainable Forestry in Southern China, Key Laboratory of Tree Genetics and Biotechnology of Educational Department of China, Key Laboratory of State Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China; College of Biology and Environment, Nanjing Forestry University, Nanjing 210037, China.
- Alisdair R Fernie
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany; Center of Plant Systems Biology and Biotechnology, Plovdiv, Bulgaria.
- Fu-Yuan Zhu
- State Key Laboratory of Tree Genetics and Breeding, Co-Innovation Center for Sustainable Forestry in Southern China, Key Laboratory of Tree Genetics and Biotechnology of Educational Department of China, Key Laboratory of State Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China; College of Biology and Environment, Nanjing Forestry University, Nanjing 210037, China.
- Jiangang Han
- State Key Laboratory of Tree Genetics and Breeding, Co-Innovation Center for Sustainable Forestry in Southern China, Key Laboratory of Tree Genetics and Biotechnology of Educational Department of China, Key Laboratory of State Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China; College of Biology and Environment, Nanjing Forestry University, Nanjing 210037, China.

2. Manuel JM, Guilloy N, Khatir I, Roucou X, Laurent B. Re-evaluating the impact of alternative RNA splicing on proteomic diversity. Front Genet 2023; 14:1089053. [PMID: 36845399 PMCID: PMC9947481 DOI: 10.3389/fgene.2023.1089053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 01/23/2023] [Indexed: 02/11/2023] [Open access]
Abstract
Alternative splicing (AS) constitutes a mechanism by which protein-coding genes and long non-coding RNA (lncRNA) genes produce more than a single mature transcript. From plants to humans, AS is a powerful process that increases transcriptome complexity. Importantly, splice variants produced from AS can potentially encode for distinct protein isoforms which can lose or gain specific domains and, hence, differ in their functional properties. Advances in proteomics have shown that the proteome is indeed diverse due to the presence of numerous protein isoforms. For the past decades, with the help of advanced high-throughput technologies, numerous alternatively spliced transcripts have been identified. However, the low detection rate of protein isoforms in proteomic studies raised debatable questions on whether AS contributes to proteomic diversity and on how many AS events are really functional. We propose here to assess and discuss the impact of AS on proteomic complexity in the light of the technological progress, updated genome annotation, and current scientific knowledge.
Affiliation(s)
- Jeru Manoj Manuel
- Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada; Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
- Noé Guilloy
- Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
- Inès Khatir
- Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada; Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
- Xavier Roucou
- Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada; Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC, Canada; Quebec Network for Research on Protein Function Structure and Engineering, PROTEO, Québec, QC, Canada
- Benoit Laurent
- Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada; Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada; *Correspondence: Benoit Laurent

3. Reixachs‐Solé M, Eyras E. Uncovering the impacts of alternative splicing on the proteome with current omics techniques. Wiley Interdiscip Rev RNA 2022; 13:e1707. [PMID: 34979593 PMCID: PMC9542554 DOI: 10.1002/wrna.1707] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 09/26/2021] [Revised: 11/27/2021] [Accepted: 11/29/2021] [Indexed: 12/15/2022]
Abstract
The high-throughput sequencing of cellular RNAs has underscored a broad effect of isoform diversification through alternative splicing on the transcriptome. Moreover, the differential production of transcript isoforms from gene loci has been recognized as a critical mechanism in cell differentiation, organismal development, and disease. Yet, the extent of the impact of alternative splicing on protein production and cellular function remains a matter of debate. Multiple experimental and computational approaches have been developed in recent years to address this question. These studies have unveiled how molecular changes at different steps in the RNA processing pathway can lead to differences in protein production and have functional effects. New and emerging experimental technologies open exciting new opportunities to develop new methods to fully establish the connection between messenger RNA expression and protein production and to further investigate how RNA variation impacts the proteome and cell function. This article is categorized under: RNA Processing > Splicing Regulation/Alternative Splicing; Translation > Regulation; RNA Evolution and Genomics > Computational Analyses of RNA.
Affiliation(s)
- Marina Reixachs‐Solé
- The John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- EMBL Australia Partner Laboratory Network and the Australian National University, Canberra, Australian Capital Territory, Australia
- Eduardo Eyras
- The John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- EMBL Australia Partner Laboratory Network and the Australian National University, Canberra, Australian Capital Territory, Australia
- Catalan Institution for Research and Advanced Studies, Barcelona, Spain
- Hospital del Mar Medical Research Institute (IMIM), Barcelona, Spain

4. Wu C, Lu X, Lu S, Wang H, Li D, Zhao J, Jin J, Sun Z, He QY, Chen Y, Zhang G. Efficient Detection of the Alternative Spliced Human Proteome Using Translatome Sequencing. Front Mol Biosci 2022; 9:895746. [PMID: 35720116 PMCID: PMC9201276 DOI: 10.3389/fmolb.2022.895746] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/14/2022] [Accepted: 04/28/2022] [Indexed: 01/08/2023] [Open access]
Abstract
Alternative splicing (AS) isoforms create numerous proteoforms, expanding the complexity of the genome. Highly similar sequences, incomplete reference databases and the insufficient sequence coverage of mass spectrometry limit the identification of AS proteoforms. Here, we demonstrated full-length translating mRNAs (ribosome nascent-chain complex-bound mRNAs, RNC-mRNAs) sequencing (RNC-seq) strategy to sequence the entire translating mRNA using next-generation sequencing, including short-read and long-read technologies, to construct a protein database containing all translating AS isoforms. Taking the advantage of read length, short-read RNC-seq identified up to 15,289 genes and 15,906 AS isoforms in a single human cell line, much more than the Ribo-seq. The single-molecule long-read RNC-seq supplemented 4,429 annotated AS isoforms that were not identified by short-read datasets, and 4,525 novel AS isoforms that were not included in the public databases. Using such RNC-seq-guided database, we identified 6,766 annotated protein isoforms and 50 novel protein isoforms in mass spectrometry datasets. These results demonstrated the potential of full-length RNC-seq in investigating the proteome of AS isoforms.
Affiliation(s)
- Chun Wu
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou, China
- Xiaolong Lu
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou, China
- Shaohua Lu
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou, China
- State Key Laboratory of Respiratory Disease, School of Basic Medical Sciences, Sino-French Hoffmann Institute, Guangzhou Medical University, Guangzhou, China
- Hongwei Wang
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou, China
- Dehua Li
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou, China
- Jing Zhao
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou, China
- Jingjie Jin
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou, China
- Zhenghua Sun
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou, China
- Qing-Yu He
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou, China
- Yang Chen
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou, China
- Gong Zhang
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou, China

5. Hari PS, Balakrishnan L, Kotyada C, Everad John A, Tiwary S, Shah N, Sirdeshmukh R. Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications. Mol Cell Proteomics 2022; 21:100220. [PMID: 35227895 PMCID: PMC9020135 DOI: 10.1016/j.mcpro.2022.100220] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/09/2021] [Revised: 01/16/2022] [Accepted: 02/24/2022] [Indexed: 11/30/2022] [Open access]
Abstract
We have carried out proteogenomic analysis of the breast cancer transcriptomic and proteomic data, available at The Clinical Proteomic Tumor Analysis Consortium resource, to identify novel peptides arising from alternatively spliced events as well as other noncanonical expressions. We used a pipeline that consisted of de novo transcript assembly, six frame-translated custom database, and a combination of search engines to identify novel peptides. A portfolio of 4,387 novel peptide sequences initially identified was further screened through PepQuery validation tool (Clinical Proteomic Tumor Analysis Consortium), which yielded 1,558 novel peptides. We considered the dataset of 1,558 validated through PepQuery to understand their functional and clinical significance, leaving the rest to be further verified using other validation tools and approaches. The novel peptides mapped to the known gene sequences as well as to genomic regions yet undefined for translation, 580 novel peptides mapped to known protein-coding genes, 147 to non–protein-coding genes, and 831 belonged to novel translational sequences. The novel peptides belonging to protein-coding genes represented alternatively spliced events or 5′ or 3′ extensions, whereas others represented translation from pseudogenes, long noncoding RNAs, or novel peptides originating from uncharacterized protein-coding sequences—mostly from the intronic regions of known genes. Seventy-six of the 580 protein-coding genes were associated with cancer hallmark genes, which included key oncogenes, transcription factors, kinases, and cell surface receptors. Survival association analysis of the 76 novel peptide sequences revealed 10 of them to be significant, and we present a panel of six novel peptides, whose high expression was found to be strongly associated with poor survival of patients with human epidermal growth factor receptor 2–enriched subtype. 
Our analysis represents a landscape of novel peptides of different types that may be expressed in breast cancer tissues, whereas their presence in full-length functional proteins needs further investigations. Novel protein variants and peptides from noncoding sequences are rapidly emerging. Mining of mass spectrometry data using proteogenomic analysis reveals such entities. Novel peptides from coding and noncoding sequences identified in breast cancer. Novel peptides mapped to cancer hallmark genes in breast cancer. Panel of novel peptides with prognostic potential found for HER2-enriched subtype.
Affiliation(s)
- P S Hari
- Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India
- Lavanya Balakrishnan
- Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India
- Chaithanya Kotyada
- Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India
- Shivani Tiwary
- Simulation and Modeling Sciences, Pfizer Pharma GmbH, Berlin, Germany
- Nameeta Shah
- Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India.
- Ravi Sirdeshmukh
- Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India; Institute of Bioinformatics, International Tech Park, Bangalore, India; Health Sciences, Manipal Academy of Higher Education, Manipal, India.

6. Miller RM, Jordan BT, Mehlferber MM, Jeffery ED, Chatzipantsiou C, Kaur S, Millikin RJ, Dai Y, Tiberi S, Castaldi PJ, Shortreed MR, Luckey CJ, Conesa A, Smith LM, Deslattes Mays A, Sheynkman GM. Enhanced protein isoform characterization through long-read proteogenomics. Genome Biol 2022; 23:69. [PMID: 35241129 PMCID: PMC8892804 DOI: 10.1186/s13059-022-02624-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 07/13/2021] [Accepted: 02/02/2022] [Indexed: 02/04/2023] [Open access]
Abstract
BACKGROUND The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio or Oxford Nanopore) provides full-length transcripts which can be used to predict full-length protein isoforms. RESULTS We describe here a long-read proteogenomics approach for integrating sample-matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm for the direct incorporation of long-read transcriptome data to enable detection of protein isoforms previously intractable to MS-based detection. We have released an open-source Nextflow pipeline that integrates long-read sequencing in a proteomic workflow for isoform-resolved analysis. CONCLUSIONS Our work suggests that the incorporation of long-read sequencing and proteomic data can facilitate improved characterization of human protein isoform diversity. Our first-generation pipeline provides a strong foundation for future development of long-read proteogenomics and its adoption for both basic and translational research.
Affiliation(s)
- Rachel M Miller
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Ben T Jordan
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Madison M Mehlferber
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, USA
- Erin D Jeffery
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Simi Kaur
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Robert J Millikin
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Yunxiang Dai
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Simone Tiberi
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
- Peter J Castaldi
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Division of General Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
- Chance John Luckey
- Department of Pathology, University of Virginia, Charlottesville, VA, USA
- Ana Conesa
- Institute for Integrative Systems Biology, Spanish National Research Council (CSIC), Paterna, Spain
- Microbiology and Cell Science Department, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, USA
- Lloyd M Smith
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Anne Deslattes Mays
- Office of Data Science and Sharing, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Rockville, MD, USA
- Gloria M Sheynkman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA.
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA.
- UVA Cancer Center, University of Virginia, Charlottesville, VA, USA.

7. Halperin RF, Hegde A, Lang JD, Raupach EA, Legendre C, Liang WS, LoRusso PM, Sekulic A, Sosman JA, Trent JM, Rangasamy S, Pirrotte P, Schork NJ. Improved methods for RNAseq-based alternative splicing analysis. Sci Rep 2021; 11:10740. [PMID: 34031440 PMCID: PMC8144374 DOI: 10.1038/s41598-021-89938-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 08/31/2020] [Accepted: 04/13/2021] [Indexed: 01/04/2023] [Open access]
Abstract
The robust detection of disease-associated splice events from RNAseq data is challenging due to the potential confounding effect of gene expression levels and the often limited number of patients with relevant RNAseq data. Here we present a novel statistical approach to splicing outlier detection and differential splicing analysis. Our approach tests for differences in the percentages of sequence reads representing local splice events. We describe a software package called Bisbee which can predict the protein-level effect of splice alterations, a key feature lacking in many other splicing analysis resources. We leverage Bisbee's prediction of protein level effects as a benchmark of its capabilities using matched sets of RNAseq and mass spectrometry data from normal tissues. Bisbee exhibits improved sensitivity and specificity over existing approaches and can be used to identify tissue-specific splice variants whose protein-level expression can be confirmed by mass spectrometry. We also applied Bisbee to assess evidence for a pathogenic splicing variant contributing to a rare disease and to identify tumor-specific splice isoforms associated with an oncogenic mutation. Bisbee was able to rediscover previously validated results in both of these cases and also identify common tumor-associated splice isoforms replicated in two independent melanoma datasets.
Affiliation(s)
- Rebecca F Halperin
- Quantitative Medicine and Systems Biology Division, Translational Genomics Research Institute, Phoenix, AZ, USA.
- Apurva Hegde
- Collaborative Center for Translational Mass Spectrometry, Translational Genomics Research Institute, Phoenix, AZ, USA
- Jessica D Lang
- Integrated Cancer Genomics Division, Translational Genomics Research Institute, Phoenix, AZ, USA
- Elizabeth A Raupach
- Integrated Cancer Genomics Division, Translational Genomics Research Institute, Phoenix, AZ, USA
- Christophe Legendre
- Integrated Cancer Genomics Division, Translational Genomics Research Institute, Phoenix, AZ, USA
- Winnie S Liang
- Integrated Cancer Genomics Division, Translational Genomics Research Institute, Phoenix, AZ, USA
- Neurogenomics Division, Translational Genomics Research Institute, Phoenix, AZ, USA
- Jeffrey M Trent
- Integrated Cancer Genomics Division, Translational Genomics Research Institute, Phoenix, AZ, USA
- Patrick Pirrotte
- Collaborative Center for Translational Mass Spectrometry, Translational Genomics Research Institute, Phoenix, AZ, USA
- Nicholas J Schork
- Quantitative Medicine and Systems Biology Division, Translational Genomics Research Institute, Phoenix, AZ, USA

8. Agosto LM, Gazzara MR, Radens CM, Sidoli S, Baeza J, Garcia BA, Lynch KW. Deep profiling and custom databases improve detection of proteoforms generated by alternative splicing. Genome Res 2019; 29:2046-2055. [PMID: 31727681 PMCID: PMC6886501 DOI: 10.1101/gr.248435.119] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 01/15/2019] [Accepted: 09/16/2019] [Indexed: 02/05/2023]
Abstract
Alternative pre-mRNA splicing has long been proposed to contribute greatly to proteome complexity. However, the extent to which mature mRNA isoforms are successfully translated into protein remains controversial. Here, we used high-throughput RNA sequencing and mass spectrometry (MS)–based proteomics to better evaluate the translation of alternatively spliced mRNAs. To increase proteome coverage and improve protein quantitation, we optimized cell fractionation and sample processing steps at both the protein and peptide level. Furthermore, we generated a custom peptide database trained on analysis of RNA-seq data with MAJIQ, an algorithm optimized to detect and quantify differential and unannotated splice junction usage. We matched tandem mass spectra acquired by data-dependent acquisition (DDA) against our custom RNA-seq based database, as well as SWISS-PROT and RefSeq databases to improve identification of splicing-derived proteoforms by 28% compared with use of the SWISS-PROT database alone. Altogether, we identified peptide evidence for 554 alternate proteoforms corresponding to 274 genes. Our increased depth and detection of proteins also allowed us to track changes in the transcriptome and proteome induced by T-cell stimulation, as well as fluctuations in protein subcellular localization. In sum, our data here confirm that use of generic databases in proteomic studies underestimates the number of spliced mRNA isoforms that are translated into protein and provides a workflow that improves isoform detection in large-scale proteomic experiments.
Affiliation(s)
- Laura M Agosto
- Biochemistry and Molecular Biophysics Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Matthew R Gazzara
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Caleb M Radens
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Genetics and Epigenetics, Cell & Molecular Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Simone Sidoli
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Josue Baeza
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Benjamin A Garcia
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Kristen W Lynch
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA

9. Paik YK, Overall CM, Corrales F, Deutsch EW, Lane L, Omenn GS. Toward Completion of the Human Proteome Parts List: Progress Uncovering Proteins That Are Missing or Have Unknown Function and Developing Analytical Methods. J Proteome Res 2018; 17:4023-4030. [PMID: 30985145 PMCID: PMC6288998 DOI: 10.1021/acs.jproteome.8b00885] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 02/07/2023]
Affiliation(s)
- Young-Ki Paik
- Yonsei Proteome Research Center, College of Life Science and Technology, Yonsei University
- Christopher M Overall
- Centre for Blood Research, Departments of Oral Biological & Medical Sciences and Biochemistry & Molecular Biology, Faculty of Dentistry, University of British Columbia
- Fernando Corrales
- Functional Proteomics Laboratory National Center of Biotechnology, CSIC
- Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics and Department of Microbiology and Molecular Medicine, Faculty of Medicine, CMU, University of Geneva
- Gilbert S Omenn
- Institute for Systems Biology, Departments of Computational Medicine & Bioinformatics, Internal Medicine, and Human Genetics & School of Public Health, University of Michigan