1
Giuliani P, De Simone C, Febo G, Bellasame A, Tupone N, Di Virglio V, di Giuseppe F, Ciccarelli R, Di Iorio P, Angelucci S. Proteomics Studies on Extracellular Vesicles Derived from Glioblastoma: Where Do We Stand? Int J Mol Sci 2024; 25:9778. [PMID: 39337267 PMCID: PMC11431518 DOI: 10.3390/ijms25189778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2024] [Revised: 08/30/2024] [Accepted: 09/03/2024] [Indexed: 09/30/2024] Open
Abstract
Like most tumors, glioblastoma multiforme (GBM), the deadliest brain tumor in human adulthood, releases extracellular vesicles (EVs). Their content, reflecting that of the tumor of origin, can be donated to nearby and distant cells which, by acquiring it, become more aggressive. Therefore, the study of EV-transported molecules has become very important. Particular attention has been paid to EV proteins to uncover new GBM biomarkers and potential druggable targets. Proteomic studies have mainly been performed by "bottom-up" mass spectrometry (MS) analysis of EVs isolated by different procedures from conditioned media of cultured GBM cells and biological fluids from GBM patients. Although a great number of dysregulated proteins have been identified, the translation of these findings into clinics remains elusive, probably due to multiple factors, including the lack of standardized procedures for isolation/characterization of EVs and analysis of their proteome. Thus, it is time to change research strategies by adopting, in addition to harmonized EV selection techniques, different MS methods aimed at identifying selected tumoral protein mutations and/or isoforms due to post-translational modifications, which more deeply influence the tumor behavior. Hopefully, these data integrated with those from other "omics" disciplines will lead to the discovery of druggable pathways for novel GBM therapies.
Affiliation(s)
- Patricia Giuliani
  - Department of Medical, Oral and Biotechnological Sciences, ‘G. D’Annunzio’ University of Chieti-Pescara, Via Vestini 31, 66100 Chieti, Italy; (P.G.); (C.D.S.); (G.F.); (A.B.); (P.D.I.)
  - Center for Advanced Studies and Technology (CAST), ‘G. D’Annunzio’ University of Chieti-Pescara, Via L Polacchi 13, 66100 Chieti, Italy; (N.T.); (V.D.V.); (F.d.G.)
- Chiara De Simone
  - Department of Medical, Oral and Biotechnological Sciences, ‘G. D’Annunzio’ University of Chieti-Pescara, Via Vestini 31, 66100 Chieti, Italy; (P.G.); (C.D.S.); (G.F.); (A.B.); (P.D.I.)
  - Center for Advanced Studies and Technology (CAST), ‘G. D’Annunzio’ University of Chieti-Pescara, Via L Polacchi 13, 66100 Chieti, Italy; (N.T.); (V.D.V.); (F.d.G.)
- Giorgia Febo
  - Department of Medical, Oral and Biotechnological Sciences, ‘G. D’Annunzio’ University of Chieti-Pescara, Via Vestini 31, 66100 Chieti, Italy; (P.G.); (C.D.S.); (G.F.); (A.B.); (P.D.I.)
  - Center for Advanced Studies and Technology (CAST), ‘G. D’Annunzio’ University of Chieti-Pescara, Via L Polacchi 13, 66100 Chieti, Italy; (N.T.); (V.D.V.); (F.d.G.)
- Alessia Bellasame
  - Department of Medical, Oral and Biotechnological Sciences, ‘G. D’Annunzio’ University of Chieti-Pescara, Via Vestini 31, 66100 Chieti, Italy; (P.G.); (C.D.S.); (G.F.); (A.B.); (P.D.I.)
  - Center for Advanced Studies and Technology (CAST), ‘G. D’Annunzio’ University of Chieti-Pescara, Via L Polacchi 13, 66100 Chieti, Italy; (N.T.); (V.D.V.); (F.d.G.)
- Nicola Tupone
  - Center for Advanced Studies and Technology (CAST), ‘G. D’Annunzio’ University of Chieti-Pescara, Via L Polacchi 13, 66100 Chieti, Italy; (N.T.); (V.D.V.); (F.d.G.)
  - Department of Innovative Technologies in Medicine and Dentistry, ‘G. D’Annunzio’ University of Chieti-Pescara, Via Vestini 31, 66100 Chieti, Italy
- Vimal Di Virglio
  - Center for Advanced Studies and Technology (CAST), ‘G. D’Annunzio’ University of Chieti-Pescara, Via L Polacchi 13, 66100 Chieti, Italy; (N.T.); (V.D.V.); (F.d.G.)
  - Department of Innovative Technologies in Medicine and Dentistry, ‘G. D’Annunzio’ University of Chieti-Pescara, Via Vestini 31, 66100 Chieti, Italy
- Fabrizio di Giuseppe
  - Center for Advanced Studies and Technology (CAST), ‘G. D’Annunzio’ University of Chieti-Pescara, Via L Polacchi 13, 66100 Chieti, Italy; (N.T.); (V.D.V.); (F.d.G.)
  - Department of Innovative Technologies in Medicine and Dentistry, ‘G. D’Annunzio’ University of Chieti-Pescara, Via Vestini 31, 66100 Chieti, Italy
- Renata Ciccarelli
  - Center for Advanced Studies and Technology (CAST), ‘G. D’Annunzio’ University of Chieti-Pescara, Via L Polacchi 13, 66100 Chieti, Italy; (N.T.); (V.D.V.); (F.d.G.)
- Patrizia Di Iorio
  - Department of Medical, Oral and Biotechnological Sciences, ‘G. D’Annunzio’ University of Chieti-Pescara, Via Vestini 31, 66100 Chieti, Italy; (P.G.); (C.D.S.); (G.F.); (A.B.); (P.D.I.)
  - Center for Advanced Studies and Technology (CAST), ‘G. D’Annunzio’ University of Chieti-Pescara, Via L Polacchi 13, 66100 Chieti, Italy; (N.T.); (V.D.V.); (F.d.G.)
- Stefania Angelucci
  - Department of Innovative Technologies in Medicine and Dentistry, ‘G. D’Annunzio’ University of Chieti-Pescara, Via Vestini 31, 66100 Chieti, Italy
  - Stem TeCh Group, Via L Polacchi 13, 66100 Chieti, Italy
2
Vitorino R. Transforming Clinical Research: The Power of High-Throughput Omics Integration. Proteomes 2024; 12:25. [PMID: 39311198 PMCID: PMC11417901 DOI: 10.3390/proteomes12030025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Revised: 08/31/2024] [Accepted: 09/02/2024] [Indexed: 09/26/2024] Open
Abstract
High-throughput omics technologies have dramatically changed biological research, providing unprecedented insights into the complexity of living systems. This review presents a comprehensive examination of the current landscape of high-throughput omics pipelines, covering key technologies, data integration techniques and their diverse applications. It looks at advances in next-generation sequencing, mass spectrometry and microarray platforms and highlights their contribution to data volume and precision. In addition, this review looks at the critical role of bioinformatics tools and statistical methods in managing the large datasets generated by these technologies. By integrating multi-omics data, researchers can gain a holistic understanding of biological systems, leading to the identification of new biomarkers and therapeutic targets, particularly in complex diseases such as cancer. The review also looks at the integration of omics data into electronic health records (EHRs) and the potential for cloud computing and big data analytics to improve data storage, analysis and sharing. Despite significant advances, there are still challenges such as data complexity, technical limitations and ethical issues. Future directions include the development of more sophisticated computational tools and the application of advanced machine learning techniques, which are critical for addressing the complexity and heterogeneity of omics datasets. This review aims to serve as a valuable resource for researchers and practitioners, highlighting the transformative potential of high-throughput omics technologies in advancing personalized medicine and improving clinical outcomes.
Affiliation(s)
- Rui Vitorino
- iBiMED, Department of Medical Sciences, University of Aveiro, 3810-193 Aveiro, Portugal;
- Department of Surgery and Physiology, Cardiovascular R&D Centre—UnIC@RISE, Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal
3
Vizza P, Aracri F, Guzzi PH, Gaspari M, Veltri P, Tradigo G. Machine learning pipeline to analyze clinical and proteomics data: experiences on a prostate cancer case. BMC Med Inform Decis Mak 2024; 24:93. [PMID: 38584282 PMCID: PMC11000316 DOI: 10.1186/s12911-024-02491-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2024] [Accepted: 03/25/2024] [Indexed: 04/09/2024] Open
Abstract
Proteomic-based analysis is used to identify biomarkers in blood samples and tissues. Data produced by devices such as mass spectrometers require platforms to identify and quantify proteins (or peptides). Clinical information can be related to mass spectrometry data to identify diseases at an early stage. Machine learning techniques can be used to support physicians and biologists in studying and classifying pathologies. We present the application of machine learning techniques to define a pipeline aimed at studying and classifying proteomics data enriched with clinical information. The pipeline allows users to relate established blood biomarkers with clinical parameters and proteomics data. The proposed pipeline entails three main phases: (i) feature selection, (ii) model training, and (iii) model ensembling. We report the experience of applying such a pipeline to prostate-related diseases. Models have been trained on several biological datasets. We report experimental results on two datasets that result from the integration of clinical and mass spectrometry-based data in the contexts of serum and urine analysis. The pipeline receives input data from blood analytes, tissue samples, proteomic analysis, and urine biomarkers. It then trains different models for feature selection, classification and voting. The presented pipeline has been applied to two datasets obtained in a 2-year research project which aimed to extract hidden information from mass spectrometry, serum, and urine samples from hundreds of patients. We report results from analyzing a prostate serum dataset with 143 samples, including 79 PCa and 84 BPH patients, and a urine dataset with 121 samples, including 67 PCa and 54 BPH patients. The pipeline identified interesting peptides in the two datasets, 6 in the first and 2 in the second. The best model for both serum (AUC=0.87, Accuracy=0.83, F1=0.81, Sensitivity=0.84, Specificity=0.81) and urine (AUC=0.88, Accuracy=0.83, F1=0.83, Sensitivity=0.85, Specificity=0.80) datasets showed good predictive performance. We made the pipeline code available on GitHub and we are confident that it will be successfully adopted in similar clinical setups.
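For readers less familiar with the metrics quoted in this abstract, the following is a minimal sketch of how accuracy, sensitivity, specificity and F1 are derived from the counts of a binary confusion matrix (for example, PCa as the positive class and BPH as the negative class). The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study or its GitHub code; AUC is omitted because it requires ranked prediction scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    f1: float


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> BinaryMetrics:
    """Derive common classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. both classes are present
    and at least one sample was predicted positive.
    """
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return BinaryMetrics(accuracy, sensitivity, specificity, f1)


# Hypothetical counts for illustration only (not data from the study).
example = binary_metrics(tp=40, fp=10, tn=40, fn=10)
print(example)
```

Sensitivity and specificity are reported separately because a classifier can score well on accuracy alone while still missing most cases of the less frequent class.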
Affiliation(s)
- Patrizia Vizza
  - Department of Surgical and Medical Sciences, Magna Græcia University, 88100, Catanzaro, Italy
- Federica Aracri
  - Department of Surgical and Medical Sciences, Magna Græcia University, 88100, Catanzaro, Italy
- Pietro Hiram Guzzi
  - Department of Surgical and Medical Sciences, Magna Græcia University, 88100, Catanzaro, Italy
- Marco Gaspari
  - Department of Experimental and Clinical Medicine, Magna Græcia University, 88100, Catanzaro, Italy
- Pierangelo Veltri
  - Department of Computers, Modeling, Electronics and Systems Engineering, University of Calabria, 87036, Rende, Italy
- Giuseppe Tradigo
  - Department of Theoretical and Applied Sciences, eCampus University, 22060, Novedrate, CO, Italy
4
Plouviez M, Dubreucq E. Key Proteomics Tools for Fundamental and Applied Microalgal Research. Proteomes 2024; 12:13. [PMID: 38651372 PMCID: PMC11036299 DOI: 10.3390/proteomes12020013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 03/28/2024] [Accepted: 04/02/2024] [Indexed: 04/25/2024] Open
Abstract
Microscopic, photosynthetic prokaryotes and eukaryotes, collectively referred to as microalgae, are widely studied to improve our understanding of key metabolic pathways (e.g., photosynthesis) and for the development of biotechnological applications. Omics technologies, which are now common tools in biological research, have been shown to be critical in microalgal research. In the past decade, significant technological advancements have allowed omics technologies to become more affordable and efficient, with huge datasets being generated. In particular, whereas studies decades ago focused on a single protein or a few proteins, it is now possible to study the whole proteome of a microalga. The development of mass spectrometry-based methods has provided this leap forward with the high-throughput identification and quantification of proteins. This review specifically provides an overview of the use of proteomics in fundamental (e.g., photosynthesis) and applied (e.g., lipid production for biofuel) microalgal research, and presents future research directions in this field.
Affiliation(s)
- Maxence Plouviez
  - School of Agriculture and Environment, Massey University, Palmerston North 4410, New Zealand
  - The Cawthron Institute, Nelson 7010, New Zealand
- Eric Dubreucq
  - Agropolymer Engineering and Emerging Technologies, L’Institut Agro Montpellier, 34060 Montpellier, France
5
Chatterjee S, Zaia J. Proteomics-based mass spectrometry profiling of SARS-CoV-2 infection from human nasopharyngeal samples. MASS SPECTROMETRY REVIEWS 2024; 43:193-229. [PMID: 36177493 PMCID: PMC9538640 DOI: 10.1002/mas.21813] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/20/2022] [Revised: 09/07/2022] [Accepted: 09/09/2022] [Indexed: 05/12/2023]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of the on-going global pandemic of coronavirus disease 2019 (COVID-19) that continues to pose a significant threat to public health worldwide. SARS-CoV-2 encodes four structural proteins namely membrane, nucleocapsid, spike, and envelope proteins that play essential roles in viral entry, fusion, and attachment to the host cell. Extensively glycosylated spike protein efficiently binds to the host angiotensin-converting enzyme 2 initiating viral entry and pathogenesis. Reverse transcriptase polymerase chain reaction on nasopharyngeal swab is the preferred method of sample collection and viral detection because it is a rapid, specific, and high-throughput technique. Alternate strategies such as proteomics and glycoproteomics-based mass spectrometry enable a more detailed and holistic view of the viral proteins and host-pathogen interactions and help in detection of potential disease markers. In this review, we highlight the use of mass spectrometry methods to profile the SARS-CoV-2 proteome from clinical nasopharyngeal swab samples. We also highlight the necessity for a comprehensive glycoproteomics mapping of SARS-CoV-2 from biological complex matrices to identify potential COVID-19 markers.
Affiliation(s)
- Sayantani Chatterjee
  - Department of Biochemistry, Center for Biomedical Mass Spectrometry, Boston University School of Medicine, Boston, Massachusetts, USA
- Joseph Zaia
  - Department of Biochemistry, Center for Biomedical Mass Spectrometry, Boston University School of Medicine, Boston, Massachusetts, USA
  - Bioinformatics Program, Boston University School of Medicine, Boston, Massachusetts, USA
6
Bichmann L, Gupta S, Röst H. Data-Independent Acquisition Peptidomics. Methods Mol Biol 2024; 2758:77-88. [PMID: 38549009 DOI: 10.1007/978-1-0716-3646-6_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]
Abstract
In recent years, data-independent acquisition (DIA) has emerged as a powerful analysis method in biological mass spectrometry (MS). Compared to the previously predominant data-dependent acquisition (DDA), it offers a way to achieve greater reproducibility, sensitivity, and dynamic range in MS measurements. To make DIA accessible to non-expert users, a multifunctional, automated high-throughput pipeline DIAproteomics was implemented in the computational workflow framework "Nextflow" ( https://nextflow.io ). This allows high-throughput processing of proteomics and peptidomics DIA datasets on diverse computing infrastructures. This chapter provides a short summary and usage protocol guide for the most important modes of operation of this pipeline regarding the analysis of peptidomics datasets using the command line. In brief, DIAproteomics is a wrapper around the OpenSwathWorkflow and relies on either existing or ad-hoc generated spectral libraries from matching DDA runs. The OpenSwathWorkflow extracts chromatograms from the DIA runs and performs chromatographic peak-picking. Further downstream of the pipeline, these peaks are scored, aligned, and statistically evaluated for qualitative and quantitative differences across conditions depending on the user's interest. DIAproteomics is open-source and available under a permissive license. We encourage the scientific community to use or modify the pipeline to meet their specific requirements.
Affiliation(s)
- Leon Bichmann
  - Department of Computer Science, Applied Bioinformatics, University of Tübingen, Tübingen, Germany
- Shubham Gupta
  - Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Hannes Röst
  - Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
7
Hay BN, Akinlaja MO, Baker TC, Houfani AA, Stacey RG, Foster LJ. Integration of data-independent acquisition (DIA) with co-fractionation mass spectrometry (CF-MS) to enhance interactome mapping capabilities. Proteomics 2023; 23:e2200278. [PMID: 37144656 DOI: 10.1002/pmic.202200278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 04/03/2023] [Accepted: 04/14/2023] [Indexed: 05/06/2023]
Abstract
Proteomics technologies are continually advancing, providing opportunities to develop stronger and more robust protein interaction networks (PINs). In part, this is due to the ever-growing number of high-throughput proteomics methods that are available. This review discusses how data-independent acquisition (DIA) and co-fractionation mass spectrometry (CF-MS) can be integrated to enhance interactome mapping abilities. Furthermore, integrating these two techniques can improve data quality and network generation through extended protein coverage, less missing data, and reduced noise. CF-DIA-MS shows promise in expanding our knowledge of interactomes, notably for non-model organisms (NMOs). CF-MS is a valuable technique on its own, but upon the integration of DIA, the potential to develop robust PINs increases, offering a unique approach for researchers to gain an in-depth understanding of the dynamics of numerous biological processes.
Affiliation(s)
- Brenna N Hay
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Mopelola O Akinlaja
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Teesha C Baker
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Aicha Asma Houfani
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- R Greg Stacey
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Leonard J Foster
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
8
Devadasan MJ, Ramesha KP, Ramesh P, Kootimole CN, Jeyakumar S, Ashwitha A, Ammankallu S, Rai AB, Kumaresan A, Vedamurthy VG, Raju R, Das DN, Kataktalware MA, Prasad TSK. Exploring molecular dynamic indicators associated with reproductive performance of Bos indicus cattle in blood plasma samples through data-independent acquisition mass spectrometry. J Proteomics 2023; 285:104950. [PMID: 37321300 DOI: 10.1016/j.jprot.2023.104950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Revised: 06/07/2023] [Accepted: 06/09/2023] [Indexed: 06/17/2023]
Abstract
Improving reproductive performance of cattle is of paramount importance for sustainable dairy farming. Poor reproductive performance (RP) hinders the genetic improvement of important Bos indicus cattle breeds. It is well known that incorporating molecular information alongside conventional breeding methods is far better than using conventional methods alone for the genetic improvement of reproductive performance traits in cattle. Therefore, the present study sought to investigate the plasma proteome of Deoni cows in cyclical (n = 6) and pregnant (n = 6) reproductive phases with varying reproductive performance (high and low). High-throughput data-independent acquisition (DIA)-based proteomics was performed to characterize the corresponding proteome. We identified a total of 430 plasma proteins. Among cyclical cows, twenty proteins were differentially regulated in low RP as compared to high RP. BARD1 and AFP were upregulated in cyclical cows; their upregulation has been reported to affect reproductive performance in cattle. Among the pregnant cows, thirty-five proteins were differentially regulated, including the downregulation of FGL2 and ZNFX1, which modulate the maternal immune response required for successful implantation of the embryo. Also, proteins such as AHSG, CLU and SERPINA6 were upregulated in the pregnant cows; their upregulation has been reported to reduce reproductive performance. The results of this study will be helpful in establishing a framework for future research on improving reproductive performance in Bos indicus cattle breeds. SIGNIFICANCE: The Indian subcontinent is the center of domestication for Bos indicus cattle breeds, which are known for their disease resistance, heat tolerance, and ability to survive in low-input regimes and harsh climatic conditions. In recent times, the population of many important Bos indicus breeds, including Deoni cattle, has been declining due to various factors, especially poor reproductive performance. Traditional breeding methods are not sufficient to understand and improve reproductive performance traits in important Bos indicus cattle breeds. Proteomics is a promising approach for understanding the complex biological factors that lead to poor reproductive performance in cattle. The present study utilized DIA-based LC-MS/MS analysis to identify the plasma proteins associated with reproductive performance in cyclical and pregnant cows. This study, if improved further, can be used to develop potential protein markers associated with reproductive performance, which would be useful for the selection and genetic improvement of important Bos indicus breeds.
Affiliation(s)
- M Joel Devadasan
  - Southern Regional Station, ICAR-National Dairy Research Institute, Bangalore 560030, India
- Kerekoppa P Ramesha
  - Southern Regional Station, ICAR-National Dairy Research Institute, Bangalore 560030, India
- Poornima Ramesh
  - Centre for System Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Chinmaya Narayana Kootimole
  - Centre for System Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Sakthivel Jeyakumar
  - Southern Regional Station, ICAR-National Dairy Research Institute, Bangalore 560030, India
- A Ashwitha
  - Southern Regional Station, ICAR-National Dairy Research Institute, Bangalore 560030, India
- Shruthi Ammankallu
  - Centre for System Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Akhila Balakrishna Rai
  - Centre for System Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Arumugam Kumaresan
  - Southern Regional Station, ICAR-National Dairy Research Institute, Bangalore 560030, India
- Veerappa G Vedamurthy
  - Southern Regional Station, ICAR-National Dairy Research Institute, Bangalore 560030, India
- Rajesh Raju
  - Centre for System Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- D N Das
  - Southern Regional Station, ICAR-National Dairy Research Institute, Bangalore 560030, India
- Mukund A Kataktalware
  - Southern Regional Station, ICAR-National Dairy Research Institute, Bangalore 560030, India
9
Allen C, Meinl R, Paez JS, Searle BC, Just S, Pino LK, Fondrie WE. nf-encyclopedia: A Cloud-Ready Pipeline for Chromatogram Library Data-Independent Acquisition Proteomics Workflows. J Proteome Res 2023; 22:2743-2749. [PMID: 37417926 DOI: 10.1021/acs.jproteome.2c00613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/08/2023]
Abstract
Data-independent acquisition (DIA) mass spectrometry methods provide systematic and comprehensive quantification of the proteome; yet, relatively few open-source tools are available to analyze DIA proteomics experiments. Fewer still are tools that can leverage gas phase fractionated (GPF) chromatogram libraries to enhance the detection and quantification of peptides in these experiments. Here, we present nf-encyclopedia, an open-source NextFlow pipeline that connects three open-source tools, MSConvert, EncyclopeDIA, and MSstats, to analyze DIA proteomics experiments with or without chromatogram libraries. We demonstrate that nf-encyclopedia is reproducible when run on either a cloud platform or a local workstation and provides robust peptide and protein quantification. Additionally, we found that MSstats enhances protein-level quantitative performance over EncyclopeDIA alone. Finally, we benchmarked the ability of nf-encyclopedia to scale to large experiments in the cloud by leveraging the parallelization of compute resources. The nf-encyclopedia pipeline is available under a permissive Apache 2.0 license; run it on your desktop, cluster, or in the cloud: https://github.com/TalusBio/nf-encyclopedia.
Affiliation(s)
- Carolyn Allen
  - Talus Bioscience, Seattle, Washington 98122, United States
- Rico Meinl
  - Talus Bioscience, Seattle, Washington 98122, United States
- Brian C Searle
  - Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio 43210, United States
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
  - Proteome Software, Inc., Portland, Oregon 97219, United States
- Seth Just
  - Proteome Software, Inc., Portland, Oregon 97219, United States
- Lindsay K Pino
  - Talus Bioscience, Seattle, Washington 98122, United States
10
In-Depth Proteomic Analysis of Blood Circulating Small Extracellular Vesicles. Methods Mol Biol 2023; 2628:279-289. [PMID: 36781792 DOI: 10.1007/978-1-0716-2978-9_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
Circulating small extracellular vesicles (sEVs), also called exosomes, are key players in the investigation of cell-cell communication mechanisms and in the identification of new potential biomarkers. These particles can carry proteins, DNA, mRNA, miRNA, lipids and metabolites that are transported all over the human body, potentially reaching all the cells. In particular, proteins, which are well-known biological actors in cell signalling, will be discussed in this context. In this article, we present a mass spectrometry approach for the in-depth characterization of the sEVs proteome. The protocols include strategies for the isolation and purification of sEVs, for the extraction of proteins and the purification of sEVs proteins by the immunodepletion of the most abundant plasmatic proteins. Finally, bioinformatic analysis for the extraction of the most important biological features associated with the proteomic content of sEVs is reported.
11
Plasma proteomic profiling in postural orthostatic tachycardia syndrome (POTS) reveals new disease pathways. Sci Rep 2022; 12:20051. [PMID: 36414707 PMCID: PMC9681882 DOI: 10.1038/s41598-022-24729-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/31/2022] [Accepted: 11/18/2022] [Indexed: 11/23/2022] Open
Abstract
Postural orthostatic tachycardia syndrome (POTS) is a cardiovascular autonomic disorder characterized by excessive heart rate increase on standing, leading to debilitating symptoms with limited therapeutic possibilities. Proteomics is the large-scale study of proteins that enables a systematic, unbiased view of disease and health, allowing stratification of patients based on their protein background. The aim of the present study was to determine plasma protein biomarkers of POTS and to reveal proteomic pathways differentially regulated in POTS. We performed an age- and sex-matched, case-control study in 130 individuals (case-control ratio 1:1) including POTS and healthy controls. Mean age in POTS was 30 ± 9.8 years (84.6% women) versus controls 31 ± 9.8 years (80.0% women). We analyzed plasma proteins using data-independent acquisition (DIA) mass spectrometry. Pathway analysis of significantly differentially expressed proteins was executed using a cutoff log2 fold change set to 1.2 and false discovery rate (p-value) of < 0.05. A total of 393 differential plasma proteins were identified. Label-free quantification of DIA data identified 30 differentially expressed proteins in POTS compared with healthy controls. Pathway analysis identified the strongest network interactions particularly for proteins involved in thrombogenicity and enhanced platelet activity, but also inflammation, cardiac contractility and hypertrophy, and increased adrenergic activity. Our observations, generated by the first use of label-free unbiased quantification, reveal the proteomic footprint of POTS in terms of a hypercoagulable state, proinflammatory state, enhanced cardiac contractility and hypertrophy, skeletal muscle expression, and adrenergic activity. These findings support the hypothesis that POTS may be an autoimmune, inflammatory and hyperadrenergic disorder.
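The filtering step mentioned in this abstract (a log2 fold-change cutoff of 1.2 together with a false discovery rate below 0.05) can be illustrated with a minimal sketch. It assumes that the FDR is controlled with the Benjamini-Hochberg procedure and that the cutoff applies to the absolute log2 fold change; the abstract does not state either detail, so both are assumptions. The function names and the four example records are hypothetical and are not data from the study.

```python
from typing import List, Tuple


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is
    # the minimum of p * m / rank over this rank and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_differential(
    records: List[Tuple[str, float, float]],
    min_abs_log2fc: float = 1.2,   # cutoff quoted in the abstract
    max_fdr: float = 0.05,         # FDR threshold quoted in the abstract
) -> List[str]:
    """Keep proteins passing both the fold-change and the FDR cutoff.

    Each record is (protein name, log2 fold change, raw p-value).
    """
    adjusted = benjamini_hochberg([p for _, _, p in records])
    return [
        name
        for (name, log2fc, _), q in zip(records, adjusted)
        if abs(log2fc) >= min_abs_log2fc and q < max_fdr
    ]


# Hypothetical records for illustration only (not data from the study).
records = [("A", 1.5, 0.01), ("B", -2.0, 0.04), ("C", 0.5, 0.03), ("D", 3.0, 0.5)]
selected = select_differential(records)
print(selected)
```

In this toy input, protein B has a large fold change and a raw p-value below 0.05, yet it is excluded once the p-values are adjusted for multiple testing, which is the reason an FDR threshold is used instead of a raw p-value threshold.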
12
Sun X, Yu Z, Liang C, Xie S, Wen J, Wang H, Wang J, Yang Y, Han R. Developmental changes in proteins of casein micelles in goat milk using data-independent acquisition-based proteomics methods during the lactation cycle. J Dairy Sci 2022; 106:47-60. [PMID: 36333141 DOI: 10.3168/jds.2022-22032] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/01/2022] [Accepted: 08/12/2022] [Indexed: 11/05/2022]
Abstract
Casein micelles (CM) play an important role in milk secretion, stability, and processing. The composition and content of milk proteins are affected by physiological factors, which have been widely investigated. However, the variation in CM proteins in goat milk throughout the lactation cycle has yet to be fully clarified. In the current study, milk samples were collected at d 1, 3, 30, 90, 150, and 240 of lactation from 15 dairy goats. The size of CM was determined using laser light scattering, and CM proteins were separated, digested, and identified using data-independent acquisition (DIA) and data-dependent acquisition (DDA)-based proteomics approaches. According to clustering and principal component analysis, protein profiles identified using DIA were similar to those identified using the DDA approach. Significant differences in the abundance of 115 proteins during the lactation cycle were identified using the DIA approach. Developmental changes in the CM proteome corresponding to lactation stages were revealed: levels of lecithin cholesterol acyltransferase, folate receptor α, and prominin 2 increased from 1 to 240 d, whereas levels of growth/differentiation factor 8, peptidoglycan-recognition protein, and 45 kDa calcium-binding protein decreased in the same period. In addition, lipoprotein lipase, glycoprotein IIIb, and α-lactalbumin levels increased from 1 to 90 d and then decreased to 240 d, which is consistent with the change in CM size. Protein-protein interaction analysis showed that fibronectin, albumin, and apolipoprotein E interacted more with other proteins at the central node. These findings indicate that changes in the CM proteome during lactation could be related to requirements of newborn development, as well as mammary gland development, and may thus contribute to elucidating the physical and chemical properties of CM.
Affiliation(s)
- Xueheng Sun
  - College of Food Science and Engineering, Qingdao Agricultural University, Qingdao 266109, Shandong, China
- Zhongna Yu
  - Haidu College, Qingdao Agricultural University, Laiyang 265200, Shandong, China
- Chuozi Liang
  - College of Food Science and Engineering, Qingdao Agricultural University, Qingdao 266109, Shandong, China
- Shubin Xie
  - College of Food Science and Engineering, Qingdao Agricultural University, Qingdao 266109, Shandong, China
- Jing Wen
  - College of Food Science and Engineering, Qingdao Agricultural University, Qingdao 266109, Shandong, China
- Hexiang Wang
  - College of Food Science and Engineering, Qingdao Agricultural University, Qingdao 266109, Shandong, China
- Jun Wang
  - College of Food Science and Engineering, Qingdao Agricultural University, Qingdao 266109, Shandong, China
- Yongxin Yang
  - College of Food Science and Engineering, Qingdao Agricultural University, Qingdao 266109, Shandong, China
- Rongwei Han
  - College of Food Science and Engineering, Qingdao Agricultural University, Qingdao 266109, Shandong, China
|
13
|
Sun X, Yu Z, Liang C, Xie S, Wang H, Wang J, Yang Y, Han R. Comparative analysis of changes in whey proteins of goat milk throughout the lactation cycle using quantitative proteomics. J Dairy Sci 2022; 106:792-806. [DOI: 10.3168/jds.2022-21800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Accepted: 08/25/2022] [Indexed: 11/23/2022]
|
14
|
Perez-Riverol Y. Proteomic repository data submission, dissemination, and reuse: key messages. Expert Rev Proteomics 2022; 19:297-310. [PMID: 36529941 PMCID: PMC7614296 DOI: 10.1080/14789450.2022.2160324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 12/07/2022] [Indexed: 12/23/2022]
Abstract
INTRODUCTION: The creation of ProteomeXchange data workflows in 2012 transformed the field of proteomics by standardizing data submission and dissemination and enabling the widespread reanalysis of public MS proteomics data worldwide. ProteomeXchange has triggered a growing trend toward public dissemination of proteomics data, facilitating the assessment, reuse, comparative analyses, and extraction of new findings from public datasets. As of 2022, the consortium comprises PRIDE, PeptideAtlas, MassIVE, jPOST, iProX, and Panorama Public. AREAS COVERED: Here, we review and discuss the current ecosystem of resources, guidelines, and file formats for proteomics data dissemination and reanalysis. Special attention is drawn to new quantitative and post-translational modification-oriented resources. The challenges and future directions for data deposition are discussed, including the lack of metadata and of cloud-based, high-performance software solutions for fast and reproducible reanalysis of the available data. EXPERT OPINION: The success of ProteomeXchange and the amount of proteomics data available in the public domain have triggered the creation and/or growth of other protein knowledgebase resources. Data reuse is a leading, active, and evolving field, supporting the creation of new formats, tools, and workflows to rediscover and reshape the public proteomics data.
Affiliation(s)
- Yasset Perez-Riverol
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
|
15
|
Walzer M, García-Seisdedos D, Prakash A, Brack P, Crowther P, Graham RL, George N, Mohammed S, Moreno P, Papatheodorou I, Hubbard SJ, Vizcaíno JA. Implementing the reuse of public DIA proteomics datasets: from the PRIDE database to Expression Atlas. Sci Data 2022; 9:335. [PMID: 35701420 PMCID: PMC9197839 DOI: 10.1038/s41597-022-01380-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/15/2021] [Accepted: 05/12/2022] [Indexed: 11/14/2022] Open
Abstract
The number of mass spectrometry (MS)-based proteomics datasets in the public domain keeps increasing, particularly those generated by Data Independent Acquisition (DIA) approaches such as SWATH-MS. Unlike Data Dependent Acquisition datasets, the re-use of DIA datasets has been rather limited to date, despite its high potential, due to the technical challenges involved. We introduce a (re-)analysis pipeline for public SWATH-MS datasets which includes a combination of metadata annotation protocols, automated workflows for MS data analysis, statistical analysis, and the integration of the results into the Expression Atlas resource. Automation is orchestrated with Nextflow, using containerised open analysis software tools, rendering the pipeline readily available and reproducible. To demonstrate its utility, we reanalysed 10 public DIA datasets from the PRIDE database, comprising 1,278 SWATH-MS runs. The robustness of the analysis was evaluated, and the results compared to those obtained in the original publications. The final expression values were integrated into Expression Atlas, making SWATH-MS experiments more widely available and combining them with expression data originating from other proteomics and transcriptomics datasets.
Affiliation(s)
- Mathias Walzer
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
- David García-Seisdedos
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
- Ananth Prakash
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
- Paul Brack
  - Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Oxford Road, Manchester, M13 9PT, United Kingdom
- Peter Crowther
  - Melandra Limited, 16 Brook Road, Urmston, Manchester, M41 5RY, United Kingdom
- Robert L Graham
  - School of Biological Sciences, Chlorine Gardens, Queen's University Belfast, Belfast, BT9 5DL, United Kingdom
- Nancy George
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
- Suhaib Mohammed
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
- Pablo Moreno
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
- Irene Papatheodorou
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
- Simon J Hubbard
  - Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Oxford Road, Manchester, M13 9PT, United Kingdom
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
|
16
|
Gotti C, Roux-Dalvai F, Joly-Beauparlant C, Mangnier L, Leclercq M, Droit A. DIA proteomics data from a UPS1-spiked E.coli protein mixture processed with six software tools. Data Brief 2022; 41:107829. [PMID: 35198661 PMCID: PMC8841991 DOI: 10.1016/j.dib.2022.107829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Revised: 12/21/2021] [Accepted: 01/11/2022] [Indexed: 11/26/2022] Open
Abstract
In this article, we provide a proteomic reference dataset that was initially generated for benchmarking software tools for Data-Independent Acquisition (DIA) analysis. This large dataset includes 96 DIA .raw files acquired from a complex proteomic standard composed of an E. coli protein background spiked with 8 different concentrations of 48 human proteins (UPS1 Sigma). These 8 samples were analyzed in triplicate on an Orbitrap mass spectrometer with 4 different DIA window schemes. We also provide the spectral libraries and FASTA file used for their analysis and the software outputs of the six tools used in this study: DIA-NN, Spectronaut, ScaffoldDIA, DIA-Umpire, Skyline and OpenSWATH. This dataset also contains post-processed quantification tables in which the peptides and proteins have been validated, their intensities normalized and the missing values imputed with a noise value. All the files are available on ProteomeXchange. Altogether, these files represent the most comprehensive DIA reference dataset acquired on an Orbitrap instrument ever published. It will be a useful resource for proteomic scientists to assess the performance of DIA software tools or to test their processing pipelines, for software developers to improve their tools or develop new ones, and for students training in proteomics data analysis.
|
17
|
Fahrner M, Föll MC, Grüning BA, Bernt M, Röst H, Schilling O. Democratizing data-independent acquisition proteomics analysis on public cloud infrastructures via the Galaxy framework. Gigascience 2022; 11:giac005. [PMID: 35166338 PMCID: PMC8848309 DOI: 10.1093/gigascience/giac005] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/27/2021] [Revised: 11/26/2021] [Accepted: 01/12/2022] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND: Data-independent acquisition (DIA) has become an important approach in global, mass spectrometric proteomic studies because it provides in-depth insights into the molecular variety of biological systems. However, DIA data analysis remains challenging owing to the high complexity and large data and sample size, which require specialized software and vast computing infrastructures. Most available open-source DIA software necessitates basic programming skills and covers only a fraction of a complete DIA data analysis. In consequence, DIA data analysis often requires usage of multiple software tools and compatibility thereof, severely limiting the usability and reproducibility. FINDINGS: To overcome this hurdle, we have integrated a suite of open-source DIA tools in the Galaxy framework for reproducible and version-controlled data processing. The DIA suite includes OpenSwath, PyProphet, diapysef, and swath2stats. We have compiled functional Galaxy pipelines for DIA processing, which provide a web-based graphical user interface to these pre-installed and pre-configured tools for their use on freely accessible, powerful computational resources of the Galaxy framework. This approach also enables seamless sharing of workflows with full configuration, in addition to sharing raw data and results. We demonstrate the usability of an all-in-one DIA pipeline in Galaxy by the analysis of a spike-in case study dataset. Additionally, extensive training material is provided to further increase access for the proteomics community. CONCLUSION: The integration of an open-source DIA analysis suite in the web-based and user-friendly Galaxy framework, in combination with extensive training material, empowers a broad community of researchers to perform reproducible and transparent DIA data analysis.
Affiliation(s)
- Matthias Fahrner
  - Institute for Surgical Pathology, Medical Center–University of Freiburg, Faculty of Medicine, University of Freiburg, Breisacher Straße 115a, D-79106 Freiburg, Germany
  - Faculty of Biology, Albert-Ludwigs-University Freiburg, Schänzlestraße 1, D-79104 Freiburg, Germany
  - Spemann Graduate School of Biology and Medicine (SGBM), University of Freiburg, Albertstraße 19A, D-79104, Germany
- Melanie Christine Föll
  - Institute for Surgical Pathology, Medical Center–University of Freiburg, Faculty of Medicine, University of Freiburg, Breisacher Straße 115a, D-79106 Freiburg, Germany
  - Khoury College of Computer Sciences, Northeastern University, 440 Huntington Ave, Boston, MA 02115, USA
- Björn Andreas Grüning
  - Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, D-79110 Freiburg, Germany
- Matthias Bernt
  - Young Investigators Group Bioinformatics and Transcriptomics, Helmholtz Centre for Environmental Research–UFZ, Permoserstraße 15, D-04318 Leipzig, Germany
- Hannes Röst
  - Donnelly Centre, University of Toronto, 160 College St, Toronto, ON M5S 3E1, Canada
- Oliver Schilling
  - Institute for Surgical Pathology, Medical Center–University of Freiburg, Faculty of Medicine, University of Freiburg, Breisacher Straße 115a, D-79106 Freiburg, Germany
  - German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Hugstetter Straße 55, D-79106 Freiburg, Germany
  - BIOSS Centre for Biological Signaling Studies, University of Freiburg, Schänzlestraße 18, D-79104 Freiburg, Germany
|