1. Wang XY, Xu YM, Lau ATY. Proteogenomics in Cancer: Then and Now. J Proteome Res 2023; 22:3103-3122. [PMID: 37725793 DOI: 10.1021/acs.jproteome.3c00196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/21/2023]
Abstract
For years, the paths of sequencing technologies and mass spectrometry have occurred in isolation, with each developing its own unique culture and expertise. These two technologies are crucial for inspecting complementary aspects of the molecular phenotype across the central dogma. Integrative multiomics strives to bridge the analysis gap among different fields to complete more comprehensive mechanisms of life events and diseases. Proteogenomics is one integrated multiomics field. Here in this review, we mainly summarize and discuss three aspects: workflow of proteogenomics, proteogenomics applications in cancer research, and the SWOT (Strengths, Weaknesses, Opportunities, Threats) analysis of proteogenomics in cancer research. In conclusion, proteogenomics has a promising future as it clarifies the functional consequences of many unannotated genomic abnormalities or noncanonical variants and identifies driver genes and novel therapeutic targets across cancers, which would substantially accelerate the development of precision oncology.
Affiliation(s)
- Xiu-Yun Wang
  - Laboratory of Cancer Biology and Epigenetics, Department of Cell Biology and Genetics, Shantou University Medical College, Shantou, Guangdong 515041, People's Republic of China
- Yan-Ming Xu
  - Laboratory of Cancer Biology and Epigenetics, Department of Cell Biology and Genetics, Shantou University Medical College, Shantou, Guangdong 515041, People's Republic of China
- Andy T Y Lau
  - Laboratory of Cancer Biology and Epigenetics, Department of Cell Biology and Genetics, Shantou University Medical College, Shantou, Guangdong 515041, People's Republic of China

2. Varshney N, Mishra AK. Deep Learning in Phosphoproteomics: Methods and Application in Cancer Drug Discovery. Proteomes 2023; 11:16. [PMID: 37218921 DOI: 10.3390/proteomes11020016] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/13/2023] [Revised: 04/24/2023] [Accepted: 04/25/2023] [Indexed: 05/24/2023] Open
Abstract
Protein phosphorylation is a key post-translational modification (PTM) that is a central regulatory mechanism of many cellular signaling pathways. Several protein kinases and phosphatases precisely control this biochemical process. Defects in the functions of these proteins have been implicated in many diseases, including cancer. Mass spectrometry (MS)-based analysis of biological samples provides in-depth coverage of phosphoproteome. A large amount of MS data available in public repositories has unveiled big data in the field of phosphoproteomics. To address the challenges associated with handling large data and expanding confidence in phosphorylation site prediction, the development of many computational algorithms and machine learning-based approaches have gained momentum in recent years. Together, the emergence of experimental methods with high resolution and sensitivity and data mining algorithms has provided robust analytical platforms for quantitative proteomics. In this review, we compile a comprehensive collection of bioinformatic resources used for the prediction of phosphorylation sites, and their potential therapeutic applications in the context of cancer.
Affiliation(s)
- Neha Varshney
  - Division of Biological Sciences, Department of Cellular and Molecular Medicine, University of California, San Diego, CA 93093, USA
  - Ludwig Institute for Cancer Research, La Jolla, CA 92093, USA
- Abhinava K Mishra
  - Molecular, Cellular and Developmental Biology Department, University of California, Santa Barbara, CA 93106, USA

3. Liu J, Wang Q, Kang Y, Xu S, Pang D. Unconventional protein post-translational modifications: the helmsmen in breast cancer. Cell Biosci 2022; 12:22. [PMID: 35216622 PMCID: PMC8881842 DOI: 10.1186/s13578-022-00756-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 09/24/2021] [Accepted: 02/07/2022] [Indexed: 01/10/2023] Open
Abstract
Breast cancer is the most prevalent malignant tumor and a leading cause of mortality among females worldwide. The tumorigenesis and progression of breast cancer involve complex pathophysiological processes, which may be mediated by post-translational modifications (PTMs) of proteins, stimulated by various genes and signaling pathways. Studies into PTMs have long been dominated by the investigation of protein phosphorylation and histone epigenetic modifications. However, with great advances in proteomic techniques, several other PTMs, such as acetylation, glycosylation, sumoylation, methylation, ubiquitination, citrullination, and palmitoylation have been confirmed in breast cancer. Nevertheless, the mechanisms, effects, and inhibitors of these unconventional PTMs (particularly, the non-histone modifications other than phosphorylation) received comparatively little attention. Therefore, in this review, we illustrate the functions of these PTMs and highlight their impact on the oncogenesis and progression of breast cancer. Identification of novel potential therapeutic drugs targeting PTMs and development of biological markers for the detection of breast cancer would be significantly valuable for the efficient selection of therapeutic regimens and prediction of disease prognosis in patients with breast cancer.

4. Li L, Ching WK, Liu ZP. Robust biomarker screening from gene expression data by stable machine learning-recursive feature elimination methods. Comput Biol Chem 2022; 100:107747. [DOI: 10.1016/j.compbiolchem.2022.107747] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/14/2022] [Revised: 06/17/2022] [Accepted: 07/25/2022] [Indexed: 11/03/2022]

5. Urban J. A review on recent trends in the phosphoproteomics workflow. From sample preparation to data analysis. Anal Chim Acta 2022; 1199:338857. [DOI: 10.1016/j.aca.2021.338857] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/07/2021] [Revised: 07/14/2021] [Accepted: 07/15/2021] [Indexed: 12/12/2022]

6. Chen J, Li K, Yang J, Gu J. Bimetallic Ordered Large-Pore MesoMOFs for Simultaneous Enrichment and Dephosphorylation of Phosphopeptides. ACS Appl Mater Interfaces 2021; 13:60173-60181. [PMID: 34882408 DOI: 10.1021/acsami.1c18201] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 06/13/2023]
Abstract
Despite the fact that bimetallic metal-organic frameworks (MOFs) could afford multiple functionalities by a synergistic effect of individual metallic centers, their intrinsic microporous structure frequently restricts their wide applications with bulky molecules involved. An urgent need is consequently triggered to design bimetallic hierarchical mesoporous MOFs (mesoMOFs). Herein, Zr/Ce mesoMOFs with a uniform pore size of up to 8 nm was successfully synthesized by a copolymer template strategy with the aid of a Hoffmeister ion. The obtained Zr/Ce mesoMOFs feature high porosity, good chemical and thermal stabilities, and tunable element components, and up to 70% Zr could be incorporated into the mesoporous Ce-based framework without deteriorating its crystallinity. Thanks to the synergistic effect of inherent Ce and Zr as well as the large and open pore channels, a broad range of phosphopeptides with different molecule sizes could be effectively checked out, thanks to their simultaneous enrichment and dephosphorylation capabilities. Such an ability to efficiently concentrate phosphopeptides remained intact even in the presence of abundant non-phosphorylated species. The practical detection of phosphopeptides from human serum was also verified, prefiguring the great potentials of bimetallic large-pore mesoMOFs for the proteome applications.
Affiliation(s)
- Jingwen Chen
  - Key Laboratory for Ultrafine Materials of Ministry of Education, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Ke Li
  - Key Laboratory for Ultrafine Materials of Ministry of Education, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Jian Yang
  - Key Laboratory for Ultrafine Materials of Ministry of Education, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Jinlou Gu
  - Key Laboratory for Ultrafine Materials of Ministry of Education, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai 200237, China

7. Vitorino R, Choudhury M, Guedes S, Ferreira R, Thongboonkerd V, Sharma L, Amado F, Srivastava S. Peptidomics and proteogenomics: background, challenges and future needs. Expert Rev Proteomics 2021; 18:643-659. [PMID: 34517741 DOI: 10.1080/14789450.2021.1980388] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/20/2022]
Abstract
INTRODUCTION: With available genomic data and related information, it is becoming possible to better highlight mutations or genomic alterations associated with a particular disease or disorder. The advent of high-throughput sequencing technologies has greatly advanced diagnostics, prognostics, and drug development.
AREAS COVERED: Peptidomics and proteogenomics are the two post-genomic technologies that enable the simultaneous study of peptides and proteins/transcripts/genes. Both technologies add a remarkably large amount of data to the pool of information on various peptides associated with gene mutations or genome remodeling. Literature search was performed in the PubMed database and is up to date.
EXPERT OPINION: This article lists various techniques used for peptidomic and proteogenomic analyses. It also explains various bioinformatics workflows developed to understand differentially expressed peptides/proteins and their role in disease pathogenesis. Their role in deciphering disease pathways, cancer research, and biomarker discovery using biofluids is highlighted. Finally, the challenges and future requirements to overcome the current limitations for their effective clinical use are also discussed.
Affiliation(s)
- Rui Vitorino
  - Faculdade de Medicina da Universidade do Porto, Porto, Portugal
  - iBiMED, Department of Medical Sciences, University of Aveiro, Aveiro, Portugal
  - Laqv/requimte, Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Manisha Choudhury
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, Powai, India
- Sofia Guedes
  - Laqv/requimte, Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Rita Ferreira
  - Laqv/requimte, Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Visith Thongboonkerd
  - Medical Proteomics Unit, Office for Research and Development, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Francisco Amado
  - Laqv/requimte, Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Sanjeeva Srivastava
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, Powai, India

8. Zhou M, Li H, Wang X, Guan Y. Evidence of widespread, independent sequence signature for transcription factor cobinding. Genome Res 2021; 31:265-278. [PMID: 33303494 PMCID: PMC7849410 DOI: 10.1101/gr.267310.120] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/12/2020] [Accepted: 12/03/2020] [Indexed: 01/03/2023]
Abstract
Transcription factors (TFs) are the vocabulary that genomes use to regulate gene expression and phenotypes. The interactions among TFs enrich this vocabulary and orchestrate diverse biological processes. Although simple models identify open chromatin and the presence of TF motifs as the two major contributors to TF binding patterns, it remains elusive what contributes to the in vivo TF cobinding landscape. In this study, we developed a machine learning algorithm to explore the contributors of the cobinding patterns. The algorithm substantially outperforms the state-of-the-field models for TF cobinding prediction. Game theory-based feature importance analysis reveals that, for most of the TF pairs we studied, independent motif sequences contribute one or more of the two TFs under investigation to their cobinding patterns. Such independent motif sequences include, but are not limited to, transcription initiation-related proteins and known TF complexes. We found the motif sequence signatures and the TFs are rarely mutual, corroborating a hierarchical and directional organization of the regulatory network and refuting the possibility of artifacts caused by shared sequence similarity with the TFs under investigation. We modeled such regulatory language with directed graphs, which reveal shared, global factors that are related to many binding and cobinding patterns.
Affiliation(s)
- Manqi Zhou
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Hongyang Li
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Xueqing Wang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA

9. Protein Phosphorylation in Serine Residues Correlates with Progression from Precancerous Lesions to Cervical Cancer in Mexican Patients. Biomed Res Int 2020; 2020:5058928. [PMID: 32337254 PMCID: PMC7157794 DOI: 10.1155/2020/5058928] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/14/2020] [Accepted: 03/12/2020] [Indexed: 12/24/2022]
Abstract
Protein phosphorylation is a posttranslational modification that is essential for normal cellular processes; however, abnormal phosphorylation is one of the prime causes for alteration of many structural, functional, and regulatory proteins in disease conditions. In cancer, changes in the states of protein phosphorylation in tyrosine residues have been more studied than phosphorylation in threonine or serine residues, which also undergo alterations with greater predominance. In general, serine phosphorylation leads to the formation of multimolecular signaling complexes that regulate diverse biological processes, but in pathological conditions such as tumorigenesis, anomalous phosphorylation may result in the deregulation of some signaling pathways. Cervical cancer (CC), the main neoplasm associated with human papillomavirus (HPV) infection, is the fourth most frequent cancer worldwide. Persistent infection of the cervix with high-risk human papillomaviruses produces precancerous lesions starting with low-grade squamous intraepithelial lesions (LSIL), progressing to high-grade squamous intraepithelial lesions (HSIL) until CC is generated. Here, we compared the proteomic profile of phosphorylated proteins in serine residues from healthy, LSIL, HSIL, and CC samples. Our data show an increase in the number of phosphorylated proteins in serine residues as the grade of injury rises. These results provide a support for future studies focused on phosphorylated proteins and their possible correlation with the progression of cervical lesions.

10. Deng K, Li H, Guan Y. Treatment Stratification of Patients with Metastatic Castration-Resistant Prostate Cancer by Machine Learning. iScience 2020; 23:100804. [PMID: 31978751 PMCID: PMC6976944 DOI: 10.1016/j.isci.2019.100804] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 09/04/2019] [Revised: 11/22/2019] [Accepted: 12/19/2019] [Indexed: 11/28/2022] Open
Abstract
Prostate cancer is the most common cancer in men in the Western world. One-third of the patients with prostate cancer will develop resistance to hormonal therapy and progress into metastatic castration-resistant prostate cancer (mCRPC). Currently, docetaxel is a preferred treatment for mCRPC. However, about 20% of the patients will undergo early therapeutic failure owing to adverse events induced by docetaxel-based chemotherapy. There is an emergent need for a computational model that can accurately stratify patients into docetaxel-tolerable and docetaxel-intolerable groups. Here we present the best-performing algorithm in the Prostate Cancer DREAM Challenge for predicting adverse events caused by docetaxel treatment. We integrated the survival status and severity of adverse events into our model, which is an innovative way to complement and stratify the treatment discontinuation information. Critical stratification biomarkers were further identified in determining the treatment discontinuation. Our model has the potential to improve future personalized treatment in mCRPC.
Affiliation(s)
- Kaiwen Deng
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Hongyang Li
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
  - Department of Internal Medicine, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA

11. Li H, Siddiqui O, Zhang H, Guan Y. Joint learning improves protein abundance prediction in cancers. BMC Biol 2019; 17:107. [PMID: 31870366 PMCID: PMC6929375 DOI: 10.1186/s12915-019-0730-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/15/2019] [Accepted: 12/04/2019] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND: The classic central dogma in biology is the information flow from DNA to mRNA to protein, yet complicated regulatory mechanisms underlying protein translation often lead to weak correlations between mRNA and protein abundances. This is particularly the case in cancer samples and when evaluating the same gene across multiple samples.
RESULTS: Here, we report a method for predicting proteome from transcriptome, using a training dataset provided by NCI-CPTAC and TCGA, consisting of transcriptome and proteome data from 77 breast and 105 ovarian cancer samples. First, we establish a generic model capturing the correlation between mRNA and protein abundance of a single gene. Second, we build a gene-specific model capturing the interdependencies among multiple genes in a regulatory network. Third, we create a cross-tissue model by joint learning the information of shared regulatory networks and pathways across cancer tissues. Our method ranked first in the NCI-CPTAC DREAM Proteogenomics Challenge, and the predictive performance is close to the accuracy of experimental replicates. Key functional pathways and network modules controlling the proteomic abundance in cancers were revealed, in particular metabolism-related genes.
CONCLUSIONS: We present a method to predict proteome from transcriptome, leveraging data from different cancer tissues to build a trans-tissue model, and suggest how to integrate information from multiple cancers to provide a foundation for further research.
Affiliation(s)
- Hongyang Li
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA
- Omer Siddiqui
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA
- Hongjiu Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA
  - Department of Internal Medicine, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA