1
Peng H, Wang H, Kong W, Li J, Goh WWB. Optimizing differential expression analysis for proteomics data via high-performing rules and ensemble inference. Nat Commun 2024; 15:3922. [PMID: 38724498 PMCID: PMC11082229 DOI: 10.1038/s41467-024-47899-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Accepted: 04/16/2024] [Indexed: 05/12/2024] [Open Access]
Abstract
Identification of differentially expressed proteins in a proteomics workflow typically encompasses five key steps: raw data quantification, expression matrix construction, matrix normalization, missing value imputation (MVI), and differential expression analysis. The plethora of options in each step makes it challenging to identify optimal workflows that maximize the identification of differentially expressed proteins. To identify optimal workflows and their common properties, we conduct an extensive study involving 34,576 combinatoric experiments on 24 gold standard spike-in datasets. Applying frequent pattern mining techniques to top-ranked workflows, we uncover high-performing rules that demonstrate optimality has conserved properties. Via machine learning, we confirm optimal workflows are indeed predictable, with average cross-validation F1 scores and Matthews correlation coefficients surpassing 0.84. We introduce an ensemble inference to integrate results from individual top-performing workflows to expand differential proteome coverage and resolve inconsistencies. Ensemble inference provides gains in pAUC (up to 4.61%) and G-mean (up to 11.14%) and facilitates effective aggregation of information across varied quantification approaches such as topN, directLFQ, MaxLFQ intensities, and spectral counts. However, further development and evaluation are needed to establish acceptable frameworks for conducting ensemble inference on multiple proteomics workflows.
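The abstract reports F1 scores, Matthews correlation coefficients, and G-mean without defining them. The sketch below computes the three from a confusion matrix using their usual textbook definitions; it assumes G-mean is the geometric mean of sensitivity and specificity, which the abstract does not state. The function names and the toy spike-in labels are hypothetical and are not taken from the paper.

```python
import math

def confusion_counts(truth, called):
    """Count TP, TN, FP, FN for binary truth labels and binary calls."""
    tp = sum(1 for t, c in zip(truth, called) if t and c)
    tn = sum(1 for t, c in zip(truth, called) if not t and not c)
    fp = sum(1 for t, c in zip(truth, called) if not t and c)
    fn = sum(1 for t, c in zip(truth, called) if t and not c)
    return tp, tn, fp, fn

def f1_score(tp, tn, fp, fn):
    """Harmonic mean of precision and recall (tn is unused by definition)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def g_mean(tp, tn, fp, fn):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)

if __name__ == "__main__":
    # Toy example: 1 = truly differential (spiked-in), 0 = background.
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    called = [1, 1, 0, 1, 0, 0, 0, 0]
    counts = confusion_counts(truth, called)
    print(counts, f1_score(*counts), mcc(*counts), g_mean(*counts))
```

Unlike F1, both MCC and G-mean take true negatives into account, which matters when background proteins greatly outnumber the spiked-in ones.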
Affiliation(s)
- Hui Peng
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- He Wang
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Weijia Kong
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Jinyan Li
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Wilson Wen Bin Goh
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
  - Center for Biomedical Informatics, Nanyang Technological University, Singapore, Singapore
  - Center of AI in Medicine, Nanyang Technological University, Singapore, Singapore
  - Division of Neurology, Department of Brain Sciences, Faculty of Medicine, Imperial College London, London, UK
2
Hemandhar Kumar S, Tapken I, Kuhn D, Claus P, Jung K. bootGSEA: a bootstrap and rank aggregation pipeline for multi-study and multi-omics enrichment analyses. Frontiers in Bioinformatics 2024; 4:1380928. [PMID: 38633435 PMCID: PMC11021641 DOI: 10.3389/fbinf.2024.1380928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Accepted: 03/18/2024] [Indexed: 04/19/2024] [Open Access]
Abstract
Introduction: Gene set enrichment analysis (GSEA) subsequent to differential expression analysis is a standard step in transcriptomics and proteomics data analysis. Although many tools for this step are available, the results are often difficult to reproduce because set annotations can change in the databases, that is, new features can be added or existing features can be removed. Finally, such changes in set compositions can have an impact on biological interpretation. Methods: We present bootGSEA, a novel computational pipeline, to study the robustness of GSEA. By repeating GSEA based on bootstrap samples, the variability and robustness of results can be studied. In our pipeline, not all genes or proteins are involved in the different bootstrap replicates of the analyses. Finally, we aggregate the ranks from the bootstrap replicates to obtain a score per gene set that shows whether it gains or loses evidence compared to the ranking of the standard GSEA. Rank aggregation is also used to combine GSEA results from different omics levels or from multiple independent studies at the same omics level. Results: By applying our approach to six independent cancer transcriptomics datasets, we showed that bootstrap GSEA can aid in the selection of more robust enriched gene sets. Additionally, we applied our approach to paired transcriptomics and proteomics data obtained from a mouse model of spinal muscular atrophy (SMA), a neurodegenerative and neurodevelopmental disease associated with multi-system involvement. After obtaining a robust ranking at both omics levels, both ranking lists were combined to aggregate the findings from the transcriptomics and proteomics results. Furthermore, we constructed the new R-package "bootGSEA," which implements the proposed methods and provides graphical views of the findings. In the example datasets, bootstrap-based GSEA was able to identify gene or protein sets that were less robust when the set composition changed during bootstrap analysis. Discussion: The rank aggregation step was useful for combining bootstrap results and making them comparable to the original findings on the single-omics level or for combining findings from multiple different omics levels.
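The rank aggregation idea described in the Methods can be illustrated with a deliberately simple stand-in: rank the gene sets within each bootstrap replicate, average those ranks, and compare the average with the rank from the standard analysis. This is a sketch only. It uses plain mean-rank aggregation, which may differ from what the bootGSEA R package implements, it is written in Python rather than R, and every function name and number below is hypothetical.

```python
from statistics import mean

def rank_sets(scores):
    """Rank gene sets by ascending p-value-like score (1 = strongest).

    Ties are broken alphabetically by set name to keep ranks deterministic.
    """
    ordered = sorted(scores, key=lambda name: (scores[name], name))
    return {name: position + 1 for position, name in enumerate(ordered)}

def aggregate_ranks(replicate_scores):
    """Mean rank of each gene set across bootstrap replicates.

    Assumes every replicate reports a score for every gene set.
    """
    ranks = [rank_sets(scores) for scores in replicate_scores]
    return {name: mean(r[name] for r in ranks) for name in ranks[0]}

def rank_shift(original_scores, replicate_scores):
    """Original rank minus aggregated bootstrap rank, per gene set.

    Positive values mean the set ranks better across bootstrap replicates
    than in the standard analysis (it gains evidence); negative values
    mean it loses evidence.
    """
    original = rank_sets(original_scores)
    aggregated = aggregate_ranks(replicate_scores)
    return {name: original[name] - aggregated[name] for name in original}

if __name__ == "__main__":
    # Hypothetical enrichment p-values for three gene sets.
    original = {"A": 0.01, "B": 0.02, "C": 0.03}
    replicates = [
        {"A": 0.02, "B": 0.01, "C": 0.05},
        {"A": 0.04, "B": 0.01, "C": 0.03},
        {"A": 0.01, "B": 0.02, "C": 0.03},
    ]
    print(rank_shift(original, replicates))
```

In this toy input, set A is top-ranked in the standard analysis but drops in two of the three replicates, so its shift is negative, which is the kind of non-robust set the abstract describes.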
Affiliation(s)
- Shamini Hemandhar Kumar
  - Institute for Animal Genomics, University of Veterinary Medicine, Foundation, Hannover, Germany
  - Center for Systems Neuroscience (ZSN), University of Veterinary Medicine, Foundation, Hannover, Germany
- Ines Tapken
  - Center for Systems Neuroscience (ZSN), University of Veterinary Medicine, Foundation, Hannover, Germany
  - SMATHERIA gGmbH—Non-Profit Biomedical Research Institute, Hannover, Germany
- Daniela Kuhn
  - SMATHERIA gGmbH—Non-Profit Biomedical Research Institute, Hannover, Germany
  - Clinic for Conservative Dentistry, Periodontology and Preventive Dentistry, Hannover Medical School, Hannover, Germany
- Peter Claus
  - Center for Systems Neuroscience (ZSN), University of Veterinary Medicine, Foundation, Hannover, Germany
  - SMATHERIA gGmbH—Non-Profit Biomedical Research Institute, Hannover, Germany
- Klaus Jung
  - Institute for Animal Genomics, University of Veterinary Medicine, Foundation, Hannover, Germany
  - Center for Systems Neuroscience (ZSN), University of Veterinary Medicine, Foundation, Hannover, Germany
3
Strauss MT, Bludau I, Zeng WF, Voytik E, Ammar C, Schessner JP, Ilango R, Gill M, Meier F, Willems S, Mann M. AlphaPept: a modern and open framework for MS-based proteomics. Nat Commun 2024; 15:2168. [PMID: 38461149 PMCID: PMC10924963 DOI: 10.1038/s41467-024-46485-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 02/20/2024] [Indexed: 03/11/2024] [Open Access]
Abstract
In common with other omics technologies, mass spectrometry (MS)-based proteomics produces ever-increasing amounts of raw data, making efficient analysis a principal challenge. A plethora of different computational tools can process the MS data to derive peptide and protein identification and quantification. However, during the last years there has been dramatic progress in computer science, including collaboration tools that have transformed research and industry. To leverage these advances, we develop AlphaPept, a Python-based open-source framework for efficient processing of large high-resolution MS data sets. Numba for just-in-time compilation on CPU and GPU achieves hundred-fold speed improvements. AlphaPept uses the Python scientific stack of highly optimized packages, reducing the code base to domain-specific tasks while accessing the latest advances. We provide an easy on-ramp for community contributions through the concept of literate programming, implemented in Jupyter Notebooks. Large datasets can rapidly be processed as shown by the analysis of hundreds of proteomes in minutes per file, many-fold faster than acquisition. AlphaPept can be used to build automated processing pipelines with web-serving functionality and compatibility with downstream analysis tools. It provides easy access via one-click installation, a modular Python library for advanced users, and via an open GitHub repository for developers.
Affiliation(s)
- Maximilian T Strauss
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
  - NNF Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark
- Isabell Bludau
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Wen-Feng Zeng
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Eugenia Voytik
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Constantin Ammar
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Julia P Schessner
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Florian Meier
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
  - Functional Proteomics, Jena University Hospital, Jena, Germany
- Sander Willems
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Matthias Mann
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
  - NNF Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark
4
Bianco M, Ventura G, Calvano CD, Losito I, Cataldi TRI. Food allergen detection by mass spectrometry: From common to novel protein ingredients. Proteomics 2023; 23:e2200427. [PMID: 37691088 DOI: 10.1002/pmic.202200427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 08/04/2023] [Accepted: 08/10/2023] [Indexed: 09/12/2023]
Abstract
Food allergens are molecules, mainly proteins, that trigger immune responses in susceptible individuals upon consumption even when they would otherwise be harmless. Symptoms of a food allergy can range from mild to acute; this last effect is a severe and potentially life-threatening reaction. The European Union (EU) has identified 14 common food allergens, but new allergens are likely to emerge with constantly changing food habits. Mass spectrometry (MS) is a promising alternative to traditional antibody-based assays for quantifying multiple allergenic proteins in complex matrices with high sensitivity and selectivity. Here, the main allergenic proteins and the advantages and drawbacks of some MS acquisition protocols, such as multiple reaction monitoring (MRM) and data-dependent analysis (DDA) for identifying and quantifying common allergenic proteins in processed foodstuffs are summarized. Sections dedicated to novel foods like microalgae and insects as new sources of allergenic proteins are included, emphasizing the significance of establishing stable marker peptides and validated methods using database searches. The discussion involves the in-silico digestion of allergenic proteins, providing insights into their potential impact on immunogenicity. Finally, case studies focussing on microalgae highlight the value of MS as an effective analytical tool for ensuring regulatory compliance throughout the food control chain.
Affiliation(s)
- Mariachiara Bianco
  - Dipartimento di Chimica, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Giovanni Ventura
  - Dipartimento di Chimica, Università degli Studi di Bari Aldo Moro, Bari, Italy
  - Centro interdipartimentale SMART, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Cosima D Calvano
  - Dipartimento di Chimica, Università degli Studi di Bari Aldo Moro, Bari, Italy
  - Centro interdipartimentale SMART, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Ilario Losito
  - Dipartimento di Chimica, Università degli Studi di Bari Aldo Moro, Bari, Italy
  - Centro interdipartimentale SMART, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Tommaso R I Cataldi
  - Dipartimento di Chimica, Università degli Studi di Bari Aldo Moro, Bari, Italy
  - Centro interdipartimentale SMART, Università degli Studi di Bari Aldo Moro, Bari, Italy
5
Harris L, Fondrie WE, Oh S, Noble WS. Evaluating Proteomics Imputation Methods with Improved Criteria. J Proteome Res 2023; 22:3427-3438. [PMID: 37861703 PMCID: PMC10949645 DOI: 10.1021/acs.jproteome.3c00205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2023]
Abstract
Quantitative measurements produced by tandem mass spectrometry proteomics experiments typically contain a large proportion of missing values. Missing values hinder reproducibility, reduce statistical power, and make it difficult to compare across samples or experiments. Although many methods exist for imputing missing values, in practice, the most commonly used methods are among the worst performing. Furthermore, previous benchmarking studies have focused on relatively simple measurements of error such as the mean-squared error between imputed and held-out values. Here we evaluate the performance of commonly used imputation methods using three practical, "downstream-centric" criteria. These criteria measure the ability to identify differentially expressed peptides, generate new quantitative peptides, and improve the peptide lower limit of quantification. Our evaluation comprises several experiment types and acquisition strategies, including data-dependent and data-independent acquisition. We find that imputation does not necessarily improve the ability to identify differentially expressed peptides but that it can identify new quantitative peptides and improve the peptide lower limit of quantification. We find that MissForest is generally the best performing method per our downstream-centric criteria. We also argue that existing imputation methods do not properly account for the variance of peptide quantifications and highlight the need for methods that do.
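For context, the "relatively simple" error measure that the abstract contrasts with its downstream-centric criteria can be sketched as follows: hide some observed values, impute them, and compute the mean-squared error against the hidden truth. The row-mean imputer here is only a placeholder (it is not MissForest or any method evaluated in the paper), and the function names and toy matrix are hypothetical.

```python
def impute_row_mean(matrix):
    """Replace None entries with the mean of the observed values in the same row.

    Rows stand for peptides and columns for samples. A row with no observed
    values is filled with 0.0, an arbitrary choice made for this sketch.
    """
    imputed = []
    for row in matrix:
        observed = [value for value in row if value is not None]
        fill = sum(observed) / len(observed) if observed else 0.0
        imputed.append([fill if value is None else value for value in row])
    return imputed

def holdout_mse(matrix, held_out, impute):
    """Mask the (row, column) positions in held_out, impute, and return the MSE.

    Every held-out position must be observed (not None) in the input matrix,
    since the original value serves as the ground truth.
    """
    masked = [list(row) for row in matrix]
    for i, j in held_out:
        masked[i][j] = None
    imputed = impute(masked)
    squared_errors = [(imputed[i][j] - matrix[i][j]) ** 2 for i, j in held_out]
    return sum(squared_errors) / len(squared_errors)

if __name__ == "__main__":
    # Toy log-intensity matrix: 2 peptides x 3 samples.
    quant = [[10.0, 12.0, 14.0],
             [20.0, 20.0, 26.0]]
    print(holdout_mse(quant, [(0, 2), (1, 0)], impute_row_mean))
```

A low held-out error of this kind says nothing directly about whether differential expression calls improve, which is the gap the downstream-centric criteria are meant to address.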
Affiliation(s)
- Lincoln Harris
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Sewoong Oh
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States
- William S Noble
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States
6
Yang M, Unsihuay D, Hu H, Nguele Meke F, Qu Z, Zhang ZY, Laskin J. Nano-DESI Mass Spectrometry Imaging of Proteoforms in Biological Tissues with High Spatial Resolution. Anal Chem 2023; 95:5214-5222. [PMID: 36917636 DOI: 10.1021/acs.analchem.2c04795] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Indexed: 03/16/2023]
Abstract
Mass spectrometry imaging (MSI) is a powerful tool for label-free mapping of the spatial distribution of proteins in biological tissues. We have previously demonstrated imaging of individual proteoforms in biological tissues using nanospray desorption electrospray ionization (nano-DESI), an ambient liquid extraction-based MSI technique. Nano-DESI MSI generates multiply charged protein ions, which is advantageous for their identification using top-down proteomics analysis. In this study, we demonstrate proteoform mapping in biological tissues with a spatial resolution down to 7 μm using nano-DESI MSI. A substantial decrease in protein signals observed in high-spatial-resolution MSI makes these experiments challenging. We have enhanced the sensitivity of nano-DESI MSI experiments by optimizing the design of the capillary-based probe and the thickness of the tissue section. In addition, we demonstrate that oversampling may be used to further improve spatial resolution at little or no expense to sensitivity. These developments represent a new step in MSI-based spatial proteomics, which complements targeted imaging modalities widely used for studying biological systems.
Affiliation(s)
- Manxi Yang
  - Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, United States
- Daisy Unsihuay
  - Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, United States
- Hang Hu
  - Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, United States
- Frederick Nguele Meke
  - Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, United States
- Zihan Qu
  - Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, United States
  - Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, United States
- Zhong-Yin Zhang
  - Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, United States
  - Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, United States
- Julia Laskin
  - Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, United States
7
Kong W, Hui HWH, Peng H, Goh WWB. Dealing with missing values in proteomics data. Proteomics 2022; 22:e2200092. [PMID: 36349819 DOI: 10.1002/pmic.202200092] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 07/14/2022] [Revised: 09/15/2022] [Accepted: 10/11/2022] [Indexed: 11/10/2022]
Abstract
Proteomics data are often plagued with missingness issues. These missing values (MVs) threaten the integrity of subsequent statistical analyses by reduction of statistical power, introduction of bias, and failure to represent the true sample. Over the years, several categories of missing value imputation (MVI) methods have been developed and adapted for proteomics data. These MVI methods perform their tasks based on different prior assumptions (e.g., data is normally or independently distributed) and operating principles (e.g., the algorithm is built to address random missingness only), resulting in varying levels of performance even when dealing with the same dataset. Thus, to achieve a satisfactory outcome, a suitable MVI method must be selected. To guide decision making on a suitable MVI method, we provide a decision chart which facilitates strategic considerations on datasets presenting different characteristics. We also bring attention to other issues that can impact proper MVI, such as the presence of confounders (e.g., batch effects), which can influence MVI performance. Thus, these too should be considered during or before MVI.
Affiliation(s)
- Weijia Kong
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Harvard Wai Hann Hui
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Hui Peng
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Wilson Wen Bin Goh
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
  - Centre for Biomedical Informatics, Nanyang Technological University, Singapore, Singapore
8
Yan S, Bhawal R, Yin Z, Thannhauser TW, Zhang S. Recent advances in proteomics and metabolomics in plants. Molecular Horticulture 2022; 2:17. [PMID: 37789425 PMCID: PMC10514990 DOI: 10.1186/s43897-022-00038-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/24/2022] [Accepted: 06/20/2022] [Indexed: 10/05/2023]
Abstract
Over the past decade, systems biology and plant-omics have increasingly become the mainstream in plant biology research. New developments in mass spectrometry and bioinformatics tools, and methodological schema to integrate multi-omics data have leveraged recent advances in proteomics and metabolomics. This progress is driving a rapid evolution in the field of plant research, greatly facilitating our understanding of the mechanistic aspects of plant metabolism and the interactions of plants with their external environment. Here, we review the recent progress in MS-based proteomics and metabolomics tools and workflows with a special focus on their applications to plant biology research using several case studies related to mechanistic understanding of stress response, gene/protein function characterization, metabolic and signaling pathways exploration, and natural product discovery. We also present a projection concerning future perspectives in MS-based proteomics and metabolomics development including their applications to and challenges for systems biology. This review is intended to provide readers with an overview of how advanced MS technology and integrated application of proteomics and metabolomics can be used to advance plant systems biology research.
Affiliation(s)
- Shijuan Yan
  - Guangdong Key Laboratory for Crop Germplasm Resources Preservation and Utilization, Agro-biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Ruchika Bhawal
  - Proteomics and Metabolomics Facility, Institute of Biotechnology, Cornell University, 139 Biotechnology Building, 526 Campus Road, Ithaca, NY, 14853, USA
- Zhibin Yin
  - Guangdong Key Laboratory for Crop Germplasm Resources Preservation and Utilization, Agro-biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Sheng Zhang
  - Proteomics and Metabolomics Facility, Institute of Biotechnology, Cornell University, 139 Biotechnology Building, 526 Campus Road, Ithaca, NY, 14853, USA
9
Tibolone Pre-Treatment Ameliorates the Dysregulation of Protein Translation and Transport Generated by Palmitic Acid-Induced Lipotoxicity in Human Astrocytes: A Label-Free MS-Based Proteomics and Network Analysis. Int J Mol Sci 2022; 23:ijms23126454. [PMID: 35742897 PMCID: PMC9223656 DOI: 10.3390/ijms23126454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Revised: 05/27/2022] [Accepted: 05/30/2022] [Indexed: 01/27/2023] [Open Access]
Abstract
Excessive accumulation and release of fatty acids (FAs) in adipose and non-adipose tissue are characteristic of obesity and are associated with the leading causes of death worldwide. Chronic exposure to high concentrations of FAs such as palmitic acid (pal) is a risk factor for developing different neurodegenerative diseases (NDs) through several mechanisms. In the brain, astrocytic dysregulation plays an essential role in detrimental processes like metabolic inflammatory state, oxidative stress, endoplasmic reticulum stress, and autophagy impairment. Evidence shows that tibolone, a synthetic steroid, induces neuroprotective effects, but its molecular mechanisms upon exposure to pal remain largely unknown. Because proteomics can identify changes across the whole set of proteins and their interactions, allowing a deeper understanding, we used a proteomic approach on normal human astrocytes under supraphysiological levels of pal as a model to induce cytotoxicity, finding changes of expression in proteins related to translation, transport, autophagy, and apoptosis. Additionally, tibolone pre-treatment showed protective effects by restoring those same pal-altered processes and increasing the expression of proteins from cell survival processes. Interestingly, ARF3 and IPO7 were identified as relevant proteins, presenting a high weight in the protein-protein interaction network and significant differences in expression levels. These proteins are related to transport and translation processes, and their expression was restored by tibolone. This work suggests that the damage caused by pal in astrocytes simultaneously involves different mechanisms that tibolone can partially reverse, making tibolone interesting for further research to understand how to modulate this damage.
10
Abdul-Ghani S, Skeffington KL, Kim M, Moscarelli M, Lewis PA, Heesom K, Fiorentino F, Emanueli C, Reeves BC, Punjabi PP, Angelini GD, Suleiman MS. Effect of cardioplegic arrest and reperfusion on left and right ventricular proteome/phosphoproteome in patients undergoing surgery for coronary or aortic valve disease. Int J Mol Med 2022; 49:77. [PMID: 35425992 PMCID: PMC9083849 DOI: 10.3892/ijmm.2022.5133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/25/2021] [Accepted: 02/15/2022] [Indexed: 11/18/2022] [Open Access]
Abstract
Our earlier work has shown inter-disease and intra-disease differences in the cardiac proteome between right (RV) and left (LV) ventricles of patients with aortic valve stenosis (AVS) or coronary artery disease (CAD). Whether disease remodeling also affects acute changes occurring in the proteome during surgical intervention is unknown. This study investigated the effects of cardioplegic arrest on cardiac proteins/phosphoproteins in LV and RV of CAD (n=6) and AVS (n=6) patients undergoing cardiac surgery. LV and RV biopsies were collected during surgery before ischemic cold blood cardioplegic arrest (pre) and 20 min after reperfusion (post). Tissues were snap frozen, proteins extracted, and the extracts were used for proteomic and phosphoproteomic analysis using Tandem Mass Tag (TMT) analysis. The results were analysed using QuickGO and Ingenuity Pathway Analysis software. For each comparison, our proteomic analysis identified more than 3,000 proteins which could be detected in both the pre and post samples. Cardioplegic arrest and reperfusion were associated with significant differential expression of 24 (LV) and 120 (RV) proteins in the CAD patients, which were linked to mitochondrial function, inflammation and cardiac contraction. By contrast, AVS patients showed differential expression of only 3 LV proteins and 2 RV proteins, despite a significantly longer duration of ischaemic cardioplegic arrest. The relative expression of 41 phosphoproteins was significantly altered in CAD patients, with 18 phosphoproteins showing altered expression in AVS patients. Inflammatory pathways were implicated in the changes in phosphoprotein expression in both groups. Inter-disease comparison for the same ventricular chamber at both timepoints revealed differences relating to inflammation and adrenergic and calcium signalling. In conclusion, the present study found that ischemic arrest and reperfusion trigger different changes in the proteomes and phosphoproteomes of LV and RV of CAD and AVS patients undergoing surgery, with markedly more changes in CAD patients despite a significantly shorter ischaemic period.
Affiliation(s)
- Safa Abdul-Ghani
  - Bristol Heart Institute and Bristol Medical School, University of Bristol, Bristol BS2 8HW, UK
  - Department of Physiology, Faculty of Medicine, Al-Quds University, Abu-Dis, Palestine
- Katie L. Skeffington
  - Bristol Heart Institute and Bristol Medical School, University of Bristol, Bristol BS2 8HW, UK
- Minjoo Kim
  - Bristol Heart Institute and Bristol Medical School, University of Bristol, Bristol BS2 8HW, UK
- Marco Moscarelli
  - National Heart and Lung Institute, Imperial College, London SW3 6LY, UK
  - GVM Care and Research, Anthea Hospital, I-70124 Bari, Italy
- Philip A. Lewis
  - University of Bristol Proteomics/Bioinformatics Facility, University of Bristol, Bristol BS8 1TD, UK
- Kate Heesom
  - University of Bristol Proteomics/Bioinformatics Facility, University of Bristol, Bristol BS8 1TD, UK
- Costanza Emanueli
  - National Heart and Lung Institute, Imperial College, London SW3 6LY, UK
- Barnaby C. Reeves
  - Bristol Heart Institute and Bristol Medical School, University of Bristol, Bristol BS2 8HW, UK
- Gianni D. Angelini
  - Bristol Heart Institute and Bristol Medical School, University of Bristol, Bristol BS2 8HW, UK
- M-Saadeh Suleiman
  - Bristol Heart Institute and Bristol Medical School, University of Bristol, Bristol BS2 8HW, UK
11
Krueger A, Mohamed A, Kolka CM, Stoll T, Zaugg J, Linedale R, Morrison M, Soyer HP, Hugenholtz P, Frazer IH, Hill MM. Skin Cancer-Associated S. aureus Strains Can Induce DNA Damage in Human Keratinocytes by Downregulating DNA Repair and Promoting Oxidative Stress. Cancers (Basel) 2022; 14:2143. [PMID: 35565272 PMCID: PMC9106025 DOI: 10.3390/cancers14092143] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/29/2022] [Revised: 04/14/2022] [Accepted: 04/18/2022] [Indexed: 12/19/2022] [Open Access]
Abstract
Actinic keratosis (AK) is a premalignant lesion, common on severely photodamaged skin, that can progress over time to cutaneous squamous cell carcinoma (SCC). A high bacterial load of Staphylococcus aureus is associated with AK and SCC, but it is unknown whether this has a direct impact on skin cancer development. To determine whether S. aureus can have cancer-promoting effects on skin cells, we performed RNA sequencing and shotgun proteomics on primary human keratinocytes after challenge with sterile culture supernatant ('secretome') from four S. aureus clinical strains isolated from AK and SCC. Secretomes of two of the S. aureus strains induced keratinocytes to overexpress biomarkers associated with skin carcinogenesis and upregulated the expression of enzymes linked to reduced skin barrier function. Further, these strains induced oxidative stress markers and all secretomes downregulated DNA repair mechanisms. Subsequent experiments on an expanded set of lesion-associated S. aureus strains confirmed that exposure to their secretomes led to increased oxidative stress and DNA damage in primary human keratinocytes. A significant correlation between the concentration of S. aureus phenol soluble modulin toxins in secretome and the secretome-induced level of oxidative stress and genotoxicity in keratinocytes was observed. Taken together, these data demonstrate that secreted compounds from lesion-associated clinical isolates of S. aureus can have cancer-promoting effects in keratinocytes that may be relevant to skin oncogenesis.
Affiliation(s)
- Annika Krueger
- The University of Queensland Diamantina Institute, Faculty of Medicine, The University of Queensland, Translational Research Institute, Woolloongabba, QLD 4102, Australia; (A.K.); (R.L.); (M.M.); (I.H.F.)
- QIMR Berghofer Medical Research Institute, Herston, Brisbane, QLD 4006, Australia; (A.M.); (C.M.K.); (T.S.)
| | - Ahmed Mohamed
- QIMR Berghofer Medical Research Institute, Herston, Brisbane, QLD 4006, Australia; (A.M.); (C.M.K.); (T.S.)
| | - Cathryn M. Kolka
- QIMR Berghofer Medical Research Institute, Herston, Brisbane, QLD 4006, Australia; (A.M.); (C.M.K.); (T.S.)
| | - Thomas Stoll
- QIMR Berghofer Medical Research Institute, Herston, Brisbane, QLD 4006, Australia; (A.M.); (C.M.K.); (T.S.)
| | - Julian Zaugg
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072, Australia; (J.Z.); (P.H.)
| | - Richard Linedale
- The University of Queensland Diamantina Institute, Faculty of Medicine, The University of Queensland, Translational Research Institute, Woolloongabba, QLD 4102, Australia; (A.K.); (R.L.); (M.M.); (I.H.F.)
| | - Mark Morrison
- The University of Queensland Diamantina Institute, Faculty of Medicine, The University of Queensland, Translational Research Institute, Woolloongabba, QLD 4102, Australia; (A.K.); (R.L.); (M.M.); (I.H.F.)
| | - H. Peter Soyer
- Dermatology Research Centre, The University of Queensland Diamantina Institute, The University of Queensland, Brisbane, QLD 4102, Australia;
- Dermatology Department, Princess Alexandra Hospital, Brisbane, QLD 4102, Australia
| | - Philip Hugenholtz
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072, Australia; (J.Z.); (P.H.)
| | - Ian H. Frazer
- The University of Queensland Diamantina Institute, Faculty of Medicine, The University of Queensland, Translational Research Institute, Woolloongabba, QLD 4102, Australia; (A.K.); (R.L.); (M.M.); (I.H.F.)
| | - Michelle M. Hill
- The University of Queensland Diamantina Institute, Faculty of Medicine, The University of Queensland, Translational Research Institute, Woolloongabba, QLD 4102, Australia; (A.K.); (R.L.); (M.M.); (I.H.F.)
- QIMR Berghofer Medical Research Institute, Herston, Brisbane, QLD 4006, Australia; (A.M.); (C.M.K.); (T.S.)
- The University of Queensland Centre for Clinical Research, Faculty of Medicine, The University of Queensland, Herston, QLD 4006, Australia
| |
|
12
|
Barber J, Al-Majdoub ZM, Couto N, Vasilogianni AM, Tillmann A, Alrubia S, Rostami-Hodjegan A, Achour B. Label-Free but Still Constrained: Assessment of Global Proteomic Strategies for the Quantification of Hepatic Enzymes and Transporters. Drug Metab Dispos 2022; 50:762-769. [DOI: 10.1124/dmd.121.000780] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/14/2021] [Accepted: 03/04/2022] [Indexed: 11/22/2022] Open
|
13
|
Abstract
Multi-omics data analysis is an important aspect of cancer molecular biology studies and has led to ground-breaking discoveries. Many efforts have been made to develop machine learning methods that automatically integrate omics data. Here, we review machine learning tools categorized as either general-purpose or task-specific, covering both supervised and unsupervised learning for integrative analysis of multi-omics data. We benchmark the performance of five machine learning approaches using data from the Cancer Cell Line Encyclopedia, reporting accuracy on cancer type classification and mean absolute error on drug response prediction, and evaluating runtime efficiency. This review provides recommendations to researchers regarding suitable machine learning method selection for their specific applications. It should also promote the development of novel machine learning methodologies for data integration, which will be essential for drug discovery, clinical trial design, and personalized treatments. Featuring a balance of both biological and technical content; categorizing the reviewed tools into general-purpose and task-specific; performing an independent benchmarking analysis using a publicly available dataset.
Affiliation(s)
- Zhaoxiang Cai
- ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, 214 Hawkesbury Rd, Westmead, NSW 2145, Australia
| | - Rebecca C Poulos
- ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, 214 Hawkesbury Rd, Westmead, NSW 2145, Australia
| | - Jia Liu
- ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, 214 Hawkesbury Rd, Westmead, NSW 2145, Australia.,Faculty of Medicine, Western Sydney University, Campbelltown, NSW, Australia
| | - Qing Zhong
- ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, 214 Hawkesbury Rd, Westmead, NSW 2145, Australia
| |
|
14
|
Borvinskaya EV, Kochneva AA, Drozdova PB, Balan OV, Zgoda VG. Temperature-induced reorganisation of Schistocephalus solidus (Cestoda) proteome during the transition to the warm-blooded host. Biol Open 2021; 10:bio058719. [PMID: 34787304 PMCID: PMC8609239 DOI: 10.1242/bio.058719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2021] [Accepted: 08/10/2021] [Indexed: 11/30/2022] Open
Abstract
The protein composition of the cestode Schistocephalus solidus was measured in an experiment simulating the trophic transmission of the parasite from a cold-blooded to a warm-blooded host. The first hour of host colonisation was studied in a model experiment, in which sticklebacks Gasterosteus aculeatus infected with S. solidus were heated at 40°C for 1 h. As a result, a decrease in the content of one tegument protein was detected in the plerocercoids of S. solidus. Sexual maturation of the parasites was initiated in an experiment where S. solidus larvae were taken from fish and cultured in vitro at 40°C for 48 h. Temperature-independent changes in the parasite proteome were investigated by incubating plerocercoids at 22°C for 48 h in culture medium. Analysis of the proteome allowed us to distinguish the temperature-induced genes of S. solidus, as well as to specify the molecular markers of the plerocercoid and adult worms. The main conclusion of the study is that the key enzymes of long-term metabolic changes (glycogen consumption, protein production, etc.) in parasites during colonisation of a warm-blooded host are induced by temperature.
Affiliation(s)
| | - Albina A. Kochneva
- Institute of Biology, Karelian Research Centre of the Russian Academy of Sciences, 11 Pushkinskaya Street, 185910 Petrozavodsk, Karelia, Russia
| | - Polina B. Drozdova
- Institute of Biology, Irkutsk State University, 3 Lenin St, 664025 Irkutsk, Russia
| | - Olga V. Balan
- Institute of Biology, Karelian Research Centre of the Russian Academy of Sciences, 11 Pushkinskaya Street, 185910 Petrozavodsk, Karelia, Russia
| | - Victor G. Zgoda
- Department of Proteomic Research and Mass Spectrometry, Institute of Biomedical Chemistry (IBMC), 10 Pogodinskaya street, 119121 Moscow, Russia
| |
|
15
|
Differential urine proteome analysis of a ventilator-induced lung injury rat model by label-free quantitative and parallel reaction monitoring proteomics. Sci Rep 2021; 11:21446. [PMID: 34728735 PMCID: PMC8563714 DOI: 10.1038/s41598-021-01007-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/28/2021] [Accepted: 10/18/2021] [Indexed: 12/17/2022] Open
Abstract
Urine is a promising resource for biomarker research. Therefore, the purpose of this study was to investigate potential urinary biomarkers to monitor the disease activity of ventilator-induced lung injury (VILI). In the discovery phase, a label-free data-dependent acquisition (DDA) quantitative proteomics method was used to profile the urinary proteomes of VILI rats. For further validation, the differential proteins were verified by parallel reaction monitoring (PRM)-targeted quantitative proteomics. In total, 727 high-confidence proteins were identified with at least 1 unique peptide (FDR ≤ 1%). Compared to the control group, 110 proteins (65 upregulated, 45 downregulated) were significantly changed in the VILI group (1.5-fold change, P < 0.05). The canonical pathways and protein-protein interaction analyses revealed that the differentially expressed proteins were enriched in multiple functions, including oxidative stress and inflammatory responses. Finally, thirteen proteins were identified as candidate biomarkers for VILI by PRM validation. Among these PRM-validated proteins, AMPN, MEP1B, LYSC1, DPP4 and CYC were previously reported as lung-associated disease biomarkers. SLC31, MEP1A, S15A2, NHRF1, XPP2, GGT1, HEXA, and ATPB were newly discovered in this study. Our results suggest that the urinary proteome might reflect the pathophysiological changes associated with VILI. These differential proteins are potential urinary biomarkers for the activity of VILI.
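The selection rule quoted in this abstract (1.5-fold change, P < 0.05) can be illustrated with a minimal Python sketch. It assumes linear-scale group means and a p-value computed elsewhere; the function name, the symmetric treatment of downregulation (ratio at or below 1/1.5), and the example values are illustrative assumptions, not taken from the study.

```python
from typing import Literal

Call = Literal["up", "down", "unchanged"]


def classify_protein(
    mean_case: float,
    mean_control: float,
    p_value: float,
    fold_cutoff: float = 1.5,
    alpha: float = 0.05,
) -> Call:
    """Label one protein from linear-scale group means and a precomputed p-value."""
    if mean_case <= 0 or mean_control <= 0:
        raise ValueError("group means must be positive on the linear scale")
    if p_value >= alpha:
        return "unchanged"  # not significant, regardless of effect size
    ratio = mean_case / mean_control
    if ratio >= fold_cutoff:
        return "up"
    if ratio <= 1.0 / fold_cutoff:  # assumed symmetric cutoff for decreases
        return "down"
    return "unchanged"  # significant, but the change is too small


# Hypothetical values for illustration only: (case mean, control mean, p-value).
examples = {
    "protein_A": (300.0, 100.0, 0.01),   # 3-fold higher, significant
    "protein_B": (50.0, 100.0, 0.03),    # 2-fold lower, significant
    "protein_C": (120.0, 100.0, 0.001),  # significant, below the fold cutoff
    "protein_D": (400.0, 100.0, 0.20),   # large change, not significant
}
calls = {name: classify_protein(*values) for name, values in examples.items()}
```

Requiring both criteria keeps proteins with large but noisy changes, and proteins with tiny but consistent changes, out of the candidate list.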
|
16
|
Dataset of single nucleotide polymorphisms and comprehensive proteomic analysis of Streptococcus equi subsp. equi ATCC 39506. Data Brief 2021; 38:107402. [PMID: 34621931 PMCID: PMC8479396 DOI: 10.1016/j.dib.2021.107402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Revised: 09/13/2021] [Accepted: 09/16/2021] [Indexed: 12/02/2022] Open
Abstract
Streptococcus equi subspecies equi (S. equi) is an opportunistic pathogen and a major causative agent of equine strangles, a contagious respiratory infection in horses and other equines. In this study, we provide the dataset associated with our research publication “Streptococcus equi-derived extracellular vesicles as a vaccine candidate against Streptococcus equi infections” [1]. We describe the genomic differences between S. equi 4047 and S. equi ATCC 39506 and outline the comprehensive proteome information of various fractions, including the whole cell lysate, membrane proteome, secretory proteome, and extracellular vesicle proteome. In addition, we included a dataset of highly immunoreactive proteins identified through immunoprecipitation. The specifications table provides a detailed summary of the gene annotation and quantitative information obtained for each proteome. The proteomics data were analyzed using shotgun proteomics with LTQ Velos and Q Exactive mass spectrometry in the data-dependent acquisition mode. We have deposited the acquired data, including the mass spectrometry raw files and exported MASCOT search results, in the PRIDE public repository under the accession numbers PXD025152 and PXD025527.
|
17
|
Kundu S. ProTG4: A Web Server to Approximate the Sequence of a Generic Protein From an in Silico Library of Translatable G-Quadruplex (TG4)-Mapped Peptides. Bioinform Biol Insights 2021; 15:11779322211045878. [PMID: 34602814 PMCID: PMC8482721 DOI: 10.1177/11779322211045878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2021] [Accepted: 08/13/2021] [Indexed: 11/25/2022] Open
Abstract
An RNA G-quadruplex in the protein coding segment of mRNA is translatable (TG4) and may potentially impact protein translation. This can be consequent to staggered ribosomal synthesis and/or result in an increased frequency of missense translational events. A mathematical model of the peptides that encompass the substituted amino acids, i.e., the TG4-mapped peptidome, has been previously studied. However, the significance and relevance to disease biology of this model remains to be established. ProTG4 computes a confidence-of-sequence-identity (γ)-score, which is the average weighted length of every matched TG4-mapped peptide in a generic protein sequence. The weighted length is the product of the length of the peptide and the probability of its non-random occurrence in a library of randomly generated sequences of equivalent lengths. This is then averaged over the entire length of the protein sequence. ProTG4 is simple to operate, has clear instructions, and is accompanied by a set of ready-to-use examples. The rationale of the study, algorithms deployed, and the computational pipeline deployed are also part of the web page. Analyses by ProTG4 of taxonomically diverse protein sequences suggest that there is significant homology to TG4-mapped peptides. These findings, especially in potentially infectious and infesting agents, offer plausible explanations into the aetiology and pathogenesis of certain proteopathies. ProTG4 can also provide a quantitative measure to identify and annotate the canonical form of a generic protein sequence from its known isoforms. The article presents several case studies and discusses the relevance of ProTG4-assisted peptide analysis in gaining insights into various mechanisms of disease biology (mistranslation, alternate splicing, amino acid substitutions).
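One way to read the γ-score description in this abstract is sketched below in Python. This is an interpretation under stated assumptions, not the ProTG4 implementation: matching is taken to be exact substring search, the non-random-occurrence probabilities are supplied by the caller rather than estimated from a random library, and the function name and example sequences are hypothetical.

```python
def gamma_score(protein: str, peptide_probs: dict[str, float]) -> float:
    """Sum of (peptide length x probability) over matched peptides, per residue.

    peptide_probs maps each library peptide to an assumed probability of
    non-random occurrence; how ProTG4 estimates that probability is not
    reproduced here.
    """
    if not protein:
        raise ValueError("protein sequence must be non-empty")
    total_weighted_length = 0.0
    for peptide, prob in peptide_probs.items():
        if peptide and peptide in protein:  # assumed: exact substring match
            total_weighted_length += len(peptide) * prob
    # Average over the entire length of the protein sequence.
    return total_weighted_length / len(protein)


# Hypothetical library and sequence for illustration only.
library = {"ACDE": 0.9, "GHIK": 0.5, "WWWW": 0.8}
score = gamma_score("MACDEGHIKL", library)  # two of the three peptides match
```

Dividing by the protein length makes scores comparable across sequences of different sizes, which matches the abstract's statement that the weighted lengths are averaged over the whole protein.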
Affiliation(s)
- Siddhartha Kundu
- Department of Biochemistry, All India Institute of Medical Sciences, New Delhi, India
| |
|
18
|
Gotti C, Roux-Dalvai F, Joly-Beauparlant C, Mangnier L, Leclercq M, Droit A. Extensive and Accurate Benchmarking of DIA Acquisition Methods and Software Tools Using a Complex Proteomic Standard. J Proteome Res 2021; 20:4801-4814. [PMID: 34472865 DOI: 10.1021/acs.jproteome.1c00490] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Indexed: 01/15/2023]
Abstract
Over the past decade, the data-independent acquisition mode has gained popularity for broad coverage of complex proteomes by LC-MS/MS and quantification of low-abundance proteins. However, there is no consensus in the literature on the best data acquisition parameters and processing tools to use for this specific application. Here, we present the most comprehensive comparison of DIA workflows on Orbitrap instruments published so far in the field of proteomics. Using a standard human 48 proteins mixture (UPS1-Sigma) at 8 different concentrations in an E. coli proteome background, we tested 36 workflows including 4 different DIA window acquisition schemes and 6 different software tools (DIA-NN, DIA-Umpire, OpenSWATH, ScaffoldDIA, Skyline, and Spectronaut) with or without the use of a DDA spectral library. On the basis of the number of proteins identified, quantification linearity and reproducibility, as well as sensitivity and specificity in 28 pairwise comparisons of different UPS1 concentrations, we summarize the major considerations and propose guidelines for choosing the DIA workflow best suited for LC-MS/MS proteomic analyses. Our 96 DIA raw files and software outputs have been deposited on ProteomeXchange for testing or developing new DIA processing tools.
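The count of 28 pairwise comparisons follows directly from the 8 UPS1 concentrations, since the number of unordered pairs is 8 × 7 / 2 = 28. A short Python check, using placeholder labels because the actual concentration values are not given in the abstract:

```python
from itertools import combinations

# Placeholder labels C1..C8; the real concentration values are not listed here.
concentrations = [f"C{i}" for i in range(1, 9)]

# Every unordered pair of distinct concentrations is one differential comparison.
pairs = list(combinations(concentrations, 2))
n_pairs = len(pairs)
```

Each pair defines one two-group comparison in which the spiked UPS1 proteins are expected to differ and the E. coli background is not, which is what allows sensitivity and specificity to be scored.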
Affiliation(s)
- Clarisse Gotti
- Proteomics Platform, CHU de Québec - Université Laval Research Centre, Québec City, Québec G1V 4G2, Canada.,Computational Biology Laboratory, CHU de Québec - Université Laval Research Centre, Québec City, Québec G1V 4G2, Canada
| | - Florence Roux-Dalvai
- Proteomics Platform, CHU de Québec - Université Laval Research Centre, Québec City, Québec G1V 4G2, Canada.,Computational Biology Laboratory, CHU de Québec - Université Laval Research Centre, Québec City, Québec G1V 4G2, Canada
| | - Charles Joly-Beauparlant
- Computational Biology Laboratory, CHU de Québec - Université Laval Research Centre, Québec City, Québec G1V 4G2, Canada
| | - Loïc Mangnier
- Computational Biology Laboratory, CHU de Québec - Université Laval Research Centre, Québec City, Québec G1V 4G2, Canada
| | - Mickaël Leclercq
- Computational Biology Laboratory, CHU de Québec - Université Laval Research Centre, Québec City, Québec G1V 4G2, Canada
| | - Arnaud Droit
- Proteomics Platform, CHU de Québec - Université Laval Research Centre, Québec City, Québec G1V 4G2, Canada.,Computational Biology Laboratory, CHU de Québec - Université Laval Research Centre, Québec City, Québec G1V 4G2, Canada
| |
|
19
|
Khudyakov JI, Treat MD, Shanafelt MC, Deyarmin JS, Neely BA, van Breukelen F. Liver proteome response to torpor in a basoendothermic mammal, Tenrec ecaudatus, provides insights into the evolution of homeothermy. Am J Physiol Regul Integr Comp Physiol 2021; 321:R614-R624. [PMID: 34431404 DOI: 10.1152/ajpregu.00150.2021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
Abstract
Many mammals use adaptive heterothermy (e.g., torpor, hibernation) to reduce metabolic demands of maintaining high body temperature (Tb). Torpor is typically characterized by coordinated declines in Tb and metabolic rate (MR) followed by active rewarming. Most hibernators experience periods of euthermy between bouts of torpor during which homeostatic processes are restored. In contrast, the common tenrec, a basoendothermic Afrotherian mammal, hibernates without interbout arousals and displays extreme flexibility in Tb and MR. We investigated the molecular basis of this plasticity in tenrecs by profiling the liver proteome of animals that were active or torpid with high and more stable Tb (∼32°C) or lower Tb (∼14°C). We identified 768 tenrec liver proteins, of which 50.9% were differentially abundant between torpid and active animals. Protein abundance was significantly more variable in active cold and torpid compared with active warm animals, suggesting poor control of proteostasis. Our data suggest that torpor in tenrecs may lead to mismatches in protein pools due to poor coordination of anabolic and catabolic processes. We propose that the evolution of endothermy leading to a more realized homeothermy of boreoeutherians likely led to greater coordination of homeostatic processes and reduced mismatches in thermal sensitivities of metabolic pathways.
Affiliation(s)
- Jane I Khudyakov
- Biological Sciences Department, University of the Pacific, Stockton, California
| | - Michael D Treat
- School of Life Sciences, University of Nevada, Las Vegas, Nevada
| | - Mikayla C Shanafelt
- Biological Sciences Department, University of the Pacific, Stockton, California
| | - Jared S Deyarmin
- Biological Sciences Department, University of the Pacific, Stockton, California
| | - Benjamin A Neely
- National Institute of Standards and Technology, Charleston, South Carolina
| | | |
|
20
|
Gardner ML, Freitas MA. Multiple Imputation Approaches Applied to the Missing Value Problem in Bottom-Up Proteomics. Int J Mol Sci 2021; 22:ijms22179650. [PMID: 34502557 PMCID: PMC8431783 DOI: 10.3390/ijms22179650] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/20/2021] [Revised: 08/28/2021] [Accepted: 08/31/2021] [Indexed: 01/15/2023] Open
Abstract
Analysis of differential abundance in proteomics data sets requires careful application of missing value imputation. Missing abundance values widely vary when performing comparisons across different sample treatments. For example, one would expect a consistent rate of “missing at random” (MAR) across batches of samples and varying rates of “missing not at random” (MNAR) depending on the inherent difference in sample treatments within the study. The missing value imputation strategy must thus be selected that best accounts for both MAR and MNAR simultaneously. Several important issues must be considered when deciding the appropriate missing value imputation strategy: (1) when it is appropriate to impute data; (2) how to choose a method that reflects the combinatorial manner of MAR and MNAR that occurs in an experiment. This paper provides an evaluation of missing value imputation strategies used in proteomics and presents a case for the use of hybrid left-censored missing value imputation approaches that can handle the MNAR problem common to proteomics data.
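The idea of handling MAR and MNAR together can be made concrete with a small Python sketch. It is a simplified illustration and not the strategy evaluated in the paper: the decision rule (a protein missing in every replicate of a condition is treated as MNAR-like and filled with the lowest observed value, otherwise as MAR-like and filled with the condition mean), the function name, and the toy table are all assumptions.

```python
from statistics import mean
from typing import Optional

Replicates = list[Optional[float]]
Table = dict[str, dict[str, Replicates]]  # protein -> condition -> replicate values


def hybrid_impute(table: Table) -> dict[str, dict[str, list[float]]]:
    """Fill missing values (None) with a left-censored or a mean-based estimate."""
    observed = [
        v
        for conditions in table.values()
        for replicates in conditions.values()
        for v in replicates
        if v is not None
    ]
    if not observed:
        raise ValueError("no observed values to impute from")
    floor = min(observed)  # assumed stand-in for a left-censored fill value

    imputed: dict[str, dict[str, list[float]]] = {}
    for protein, conditions in table.items():
        imputed[protein] = {}
        for condition, replicates in conditions.items():
            seen = [v for v in replicates if v is not None]
            # No observation in the whole condition: treat as MNAR-like.
            # Partial observation: treat as MAR-like and use the condition mean.
            fill = mean(seen) if seen else floor
            imputed[protein][condition] = [
                fill if v is None else v for v in replicates
            ]
    return imputed


# Hypothetical log-intensity values for illustration only.
table: Table = {
    "P1": {"control": [20.0, None, 22.0], "treated": [None, None, None]},
    "P2": {"control": [15.0, 16.0, 17.0], "treated": [18.0, None, 20.0]},
}
imputed = hybrid_impute(table)
```

In practice the left-censored fill is usually drawn from a low-intensity distribution rather than set to a single minimum, so the constant used here should be read as a placeholder.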
Affiliation(s)
- Miranda L. Gardner
- Ohio State Biochemistry Program, Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA;
- Cancer Biology and Genetics, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
| | - Michael A. Freitas
- Ohio State Biochemistry Program, Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA;
- Cancer Biology and Genetics, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
| |
|
21
|
Abstract
Biological mass spectrometry (MS) encompasses a range of methods for characterizing proteins and other biomolecules. MS is uniquely powerful for the structural analysis of endogenous protein complexes, which are often heterogeneous, poorly abundant, and refractive to characterization by other methods. Here, we focus on how biological MS can contribute to the study of endogenous protein complexes, which we define as complexes expressed in the physiological host and purified intact, as opposed to reconstituted complexes assembled from heterologously expressed components. Biological MS can yield information on complex stoichiometry, heterogeneity, topology, stability, activity, modes of regulation, and even structural dynamics. We begin with a review of methods for isolating endogenous complexes. We then describe the various biological MS approaches, focusing on the type of information that each method yields. We end with future directions and challenges for these MS-based methods.
Affiliation(s)
- Rivkah Rogawski
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
| | - Michal Sharon
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
| |
|
22
|
Tang J, Fu J, Wang Y, Li B, Li Y, Yang Q, Cui X, Hong J, Li X, Chen Y, Xue W, Zhu F. ANPELA: analysis and performance assessment of the label-free quantification workflow for metaproteomic studies. Brief Bioinform 2021; 21:621-636. [PMID: 30649171 PMCID: PMC7299298 DOI: 10.1093/bib/bby127] [Citation(s) in RCA: 131] [Impact Index Per Article: 43.7] [Received: 09/30/2018] [Revised: 11/19/2018] [Accepted: 12/06/2018] [Indexed: 12/13/2022] Open
Abstract
Label-free quantification (LFQ) with a specific and sequentially integrated workflow of acquisition technique, quantification tool and processing method has emerged as the popular technique employed in metaproteomic research to provide a comprehensive landscape of the adaptive response of microbes to external stimuli and their interactions with other organisms or host cells. The performance of a specific LFQ workflow is highly dependent on the studied data. Hence, it is essential to discover the most appropriate one for a specific data set. However, it is challenging to perform such discovery due to the large number of possible workflows and the multifaceted nature of the evaluation criteria. Herein, a web server ANPELA (https://idrblab.org/anpela/) was developed and validated as the first tool enabling performance assessment of whole LFQ workflow (collective assessment by five well-established criteria with distinct underlying theories), and it enabled the identification of the optimal LFQ workflow(s) by a comprehensive performance ranking. ANPELA not only automatically detects the diverse formats of data generated by all quantification tools but also provides the most complete set of processing methods among the available web servers and stand-alone tools. Systematic validation using metaproteomic benchmarks revealed ANPELA's capabilities in (1) discovering well-performing workflow(s), (2) enabling assessment from multiple perspectives and (3) validating LFQ accuracy using spiked proteins. ANPELA has a unique ability to evaluate the performance of whole LFQ workflow and enables the discovery of the optimal LFQs by the comprehensive performance ranking of all 560 workflows. Therefore, it has great potential for applications in metaproteomic and other studies requiring LFQ techniques, as many features are shared among proteomic studies.
Affiliation(s)
- Jing Tang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.,School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Jianbo Fu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Yunxia Wang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Bo Li
- School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Yinghong Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.,School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Qingxia Yang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.,School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Xuejiao Cui
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.,School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Jiajun Hong
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Xiaofeng Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.,School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Yuzong Chen
- Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, Singapore, Singapore
| | - Weiwei Xue
- School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.,School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| |
|
23
|
Mohamed A, Hill MM. LipidSuite: interactive web server for lipidomics differential and enrichment analysis. Nucleic Acids Res 2021; 49:W346-W351. [PMID: 33950258 PMCID: PMC8262688 DOI: 10.1093/nar/gkab327] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 03/03/2021] [Revised: 04/06/2021] [Accepted: 04/19/2021] [Indexed: 11/14/2022] Open
Abstract
Advances in mass spectrometry enabled high throughput profiling of lipids but differential analysis and biological interpretation of lipidomics datasets remains challenging. To overcome this barrier, we present LipidSuite, an end-to-end differential lipidomics data analysis server. LipidSuite offers a step-by-step workflow for preprocessing, exploration, differential analysis and enrichment analysis of untargeted and targeted lipidomics. Three lipidomics data formats are accepted for upload: mwTab file from Metabolomics Workbench, Skyline CSV Export, and a numerical matrix. Experimental variables to be used in analysis are uploaded in a separate file. Conventional lipid names are automatically parsed to enable lipid class and chain length analyses. Users can interactively explore data, choose subsets based on sample types or lipid classes or characteristics, and conduct univariate, multivariate and unsupervised analyses. For complex experimental designs and clinical cohorts, LipidSuite offers confounding variables adjustment. Finally, data tables and plots can be both interactively viewed or downloaded for publication or reports. Overall, we anticipate this free, user-friendly webserver to facilitate differential lipidomics data analysis and re-analysis, and fully harness biological interpretation from lipidomics datasets. LipidSuite is freely available at http://suite.lipidr.org.
Affiliation(s)
- Ahmed Mohamed
- Precision & Systems Biomedicine Laboratory, QIMR Berghofer Medical Research Institute, Herston, QLD 4006, Australia
| | - Michelle M Hill
- Precision & Systems Biomedicine Laboratory, QIMR Berghofer Medical Research Institute, Herston, QLD 4006, Australia.,Centre for Clinical Research, Faculty of Medicine, The University of Queensland, Herston, QLD 4006, Australia
| |
|
24
|
Kochneva A, Borvinskaya E, Smirnov L. Zone of Interaction Between the Parasite and the Host: Protein Profile of the Body Cavity Fluid of Gasterosteus aculeatus L. Infected with the Cestode Schistocephalus solidus (Muller, 1776). Acta Parasitol 2021; 66:569-583. [PMID: 33387269 DOI: 10.1007/s11686-020-00318-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/04/2020] [Accepted: 11/17/2020] [Indexed: 12/18/2022]
Abstract
PURPOSE: During infection, the host and the parasite "communicate" with each other through various molecules, including proteins. The aim of this study was to describe the excretory-secretory proteins from the helminth Schistocephalus solidus and its intermediate host, the three-spined stickleback Gasterosteus aculeatus L., which are likely to be involved in interactions between them. METHODS: Combined samples of washes from the G. aculeatus sticklebacks cavity infected with the S. solidus, and washes from the parasite surface were used as experimental samples, while washes from the uninfected fish body cavity were used as control. The obtained samples were analyzed using mass-spectrometry nLC-MS/MS. RESULTS: As a result of mass-spectrometry analysis 215 proteins were identified. Comparative quantitative analysis revealed significant differences in LFQ intensity between experimental and control samples for 20 stickleback proteins. In the experimental samples, we found an increase in the content of serpins, plasminogen, angiotensin 1-10, complement component C9, and a decrease in the content of triosephosphate isomerase, creatine kinase, fructose-biphosphate aldolase, superoxide dismutase, peroxidoxin-1, homocysteine-binding and fatty acid-binding proteins, compared to uninfected fish samples. In the experimental group washes, 30 S. solidus proteins were found, including malate dehydrogenase, annexin family proteins, serpins, peptidyl-prolyl cis-trans isomerase and fatty acid-binding protein. CONCLUSIONS: Thus, the protein composition of washes from the helminth S. solidus surface and the body cavity of infected and uninfected stickleback G. aculeatus were studied. As a result, it was shown that various components of the immune defense system predominated in the washes of infected fish and helminths.
|
25
|
Egert J, Brombacher E, Warscheid B, Kreutz C. DIMA: Data-Driven Selection of an Imputation Algorithm. J Proteome Res 2021; 20:3489-3496. [PMID: 34062065 DOI: 10.1021/acs.jproteome.1c00119] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
Imputation is a prominent strategy when dealing with missing values (MVs) in proteomics data analysis pipelines. However, the performance of different imputation methods is difficult to assess and varies strongly depending on data characteristics. To overcome this issue, we present the concept of a data-driven selection of an imputation algorithm (DIMA). The performance and broad applicability of DIMA are demonstrated on 142 quantitative proteomics data sets from the PRoteomics IDEntifications (PRIDE) database and on simulated data consisting of 5-50% MVs with different proportions of missing not at random and missing completely at random values. DIMA reliably suggests a high-performing imputation algorithm, which is always among the three best algorithms and results in a root mean square error difference (ΔRMSE) ≤ 10% in 80% of the cases. DIMA implementation is available in MATLAB at github.com/kreutz-lab/OmicsData and in R at github.com/kreutz-lab/DIMAR.
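The general idea of choosing an imputation algorithm from the data itself can be illustrated with a minimal sketch: hide some observed values, impute them with each candidate method, and rank the candidates by RMSE against the hidden truth. This is not the DIMA implementation. The function names (`rank_imputers`, `impute_row_mean`, `impute_row_min`), the one-value-per-row masking scheme, and the two toy imputers are assumptions made for illustration only.

```python
import math
import random


def rmse(truth, estimate):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))


def impute_row_mean(row):
    """Toy imputer (assumption): fill missing entries with the row mean."""
    observed = [v for v in row if v is not None]
    if not observed:
        return list(row)
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in row]


def impute_row_min(row):
    """Toy imputer (assumption): fill missing entries with the row minimum."""
    observed = [v for v in row if v is not None]
    if not observed:
        return list(row)
    fill = min(observed)
    return [fill if v is None else v for v in row]


def rank_imputers(matrix, imputers, seed=0):
    """Hide one observed value per sufficiently complete row, impute with
    each candidate, and return (name, RMSE) pairs sorted best-first."""
    rng = random.Random(seed)
    masked = [list(row) for row in matrix]
    hidden = []  # (row index, column index, true value)
    for i, row in enumerate(matrix):
        observed_cols = [j for j, v in enumerate(row) if v is not None]
        if len(observed_cols) >= 3:  # keep at least two values for imputation
            j = rng.choice(observed_cols)
            hidden.append((i, j, row[j]))
            masked[i][j] = None
    if not hidden:
        raise ValueError("no row has enough observed values to mask")
    truth = [v for _, _, v in hidden]
    scores = {}
    for name, impute in imputers.items():
        completed = [impute(row) for row in masked]
        scores[name] = rmse(truth, [completed[i][j] for i, j, _ in hidden])
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical log-intensity matrix (proteins x samples), None = missing.
    data = [[10.0, 10.2, 9.8, None], [5.0, 5.5, 4.5, 5.1], [None, None, 3.0, None]]
    print(rank_imputers(data, {"row_mean": impute_row_mean, "row_min": impute_row_min}))
```

A realistic evaluation would mask values with a pattern that resembles the missingness observed in the data rather than one value per row.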
Affiliation(s)
- Janine Egert
- Institute of Medical Biometry and Statistics (IMBI), Institute of Medicine and Medical Center Freiburg, 79104 Freiburg im Breisgau, Germany.,Centre for Integrative Biological Signalling Studies (CIBSS), Albert-Ludwigs-Universität Freiburg, 79104 Freiburg, Germany
| | - Eva Brombacher
- Institute of Medical Biometry and Statistics (IMBI), Institute of Medicine and Medical Center Freiburg, 79104 Freiburg im Breisgau, Germany.,Centre for Integrative Biological Signalling Studies (CIBSS), Albert-Ludwigs-Universität Freiburg, 79104 Freiburg, Germany.,Spemann Graduate School of Biology and Medicine (SGBM), Albert-Ludwigs-Universität Freiburg, 79104 Freiburg, Germany.,Faculty of Biology, Albert-Ludwigs-Universität Freiburg, 79104 Freiburg im Breisgau, Germany
| | - Bettina Warscheid
- Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, Albert-Ludwigs-Universität Freiburg, 79104 Freiburg im Breisgau, Germany.,Signalling Research Centres BIOSS and CIBSS, Albert-Ludwigs-Universität Freiburg, 79104 Freiburg im Breisgau, Germany
| | - Clemens Kreutz
- Institute of Medical Biometry and Statistics (IMBI), Institute of Medicine and Medical Center Freiburg, 79104 Freiburg im Breisgau, Germany.,Signalling Research Centres BIOSS and CIBSS, Albert-Ludwigs-Universität Freiburg, 79104 Freiburg im Breisgau, Germany.,Center for Data Analysis and Modeling (FDM), Albert-Ludwigs-Universität Freiburg, 79104 Freiburg im Breisgau, Germany
| |
|
26
|
Palomba A, Abbondio M, Fiorito G, Uzzau S, Pagnozzi D, Tanca A. Comparative Evaluation of MaxQuant and Proteome Discoverer MS1-Based Protein Quantification Tools. J Proteome Res 2021; 20:3497-3507. [PMID: 34038140 PMCID: PMC8280745 DOI: 10.1021/acs.jproteome.1c00143] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 01/15/2023]
Abstract
MS1-based label-free quantification can compare precursor ion peaks across runs, allowing reproducible protein measurements. Among bioinformatic platforms enabling MS1-based quantification, MaxQuant (MQ) is one of the most used, while Proteome Discoverer (PD) has recently introduced the Minora tool. Here, we present a comparative evaluation of six MS1-based quantification methods available in MQ and PD. Intensity (MQ and PD) and area (PD only) of the precursor ion peaks were measured and then subjected or not to normalization. The six methods were applied to data sets simulating various differential proteomics scenarios and covering a wide range of protein abundance ratios and amounts. PD outperformed MQ in terms of quantification yield, dynamic range, and reproducibility, although neither platform reached a fully satisfactory quality of measurements at low-abundance ranges. PD methods including normalization were the most accurate in estimating the abundance ratio between groups and the most sensitive when comparing groups with a narrow abundance ratio; on the contrary, MQ methods generally reached slightly higher specificity, accuracy, and precision values. Moreover, we found that applying an optimized log ratio-based threshold can maximize specificity, accuracy, and precision. Taken together, these results can help researchers choose the most appropriate MS1-based protein quantification strategy for their studies.
Affiliation(s)
- Antonio Palomba
- Porto Conte Ricerche, Loc. Tramariglio, 07041 Alghero, Italy
| | - Marcello Abbondio
- Department of Biomedical Sciences, University of Sassari, Viale San Pietro 43/B, 07100 Sassari, Italy
| | - Giovanni Fiorito
- Department of Biomedical Sciences, University of Sassari, Viale San Pietro 43/B, 07100 Sassari, Italy.,MRC Centre for Environment and Health, Imperial College London, Norfolk Place, W2 1PG London, U.K
| | - Sergio Uzzau
- Department of Biomedical Sciences, University of Sassari, Viale San Pietro 43/B, 07100 Sassari, Italy
| | | | - Alessandro Tanca
- Department of Biomedical Sciences, University of Sassari, Viale San Pietro 43/B, 07100 Sassari, Italy
| |
|
27
|
Dabke K, Kreimer S, Jones MR, Parker SJ. A Simple Optimization Workflow to Enable Precise and Accurate Imputation of Missing Values in Proteomic Data Sets. J Proteome Res 2021; 20:3214-3229. [PMID: 33939434 DOI: 10.1021/acs.jproteome.1c00070] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 01/07/2023]
Abstract
Missing values in proteomic data sets have real consequences on downstream data analysis and reproducibility. Although several imputation methods exist to handle missing values, no single imputation method is best suited for a diverse range of data sets, and no clear strategy exists for evaluating imputation methods for clinical DIA-MS data sets, especially at different levels of protein quantification. To navigate through the different imputation strategies available in the literature, we have established a strategy to assess imputation methods on clinical label-free DIA-MS data sets. We used three DIA-MS data sets with real missing values to evaluate eight imputation methods with multiple parameters at different levels of protein quantification: a dilution series data set, a small pilot data set, and a clinical proteomic data set comparing paired tumor and stroma tissue. We found that imputation methods based on local structures within the data, like local least-squares (LLS) and random forest (RF), worked well in our dilution series data set, whereas imputation methods based on global structures within the data, like BPCA, performed well in the other two data sets. We also found that imputation at the most basic protein quantification level (the fragment level) improved accuracy and the number of proteins quantified. With this analytical framework, we quickly and cost-effectively evaluated different imputation methods using two smaller complementary data sets to narrow down the most accurate methods for the larger proteomic data set. This acquisition strategy allowed us to provide reproducible evidence of the accuracy of the imputation method, even in the absence of a ground truth. Overall, this study indicates that the most suitable imputation method relies on the overall structure of the data set and provides an example of an analytic framework that may assist in identifying the most appropriate imputation strategies for the differential analysis of proteins.
Affiliation(s)
- Kruttika Dabke
- Center for Bioinformatics and Functional Genomics, Department of Biomedical Science, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States.,Graduate Program in Biomedical Sciences, Department of Biomedical Science, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Simion Kreimer
- Advanced Clinical Biosystems Research Institute, Smidt Heart Institute, Departments of Cardiology and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Michelle R Jones
- Center for Bioinformatics and Functional Genomics, Department of Biomedical Science, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Sarah J Parker
- Advanced Clinical Biosystems Research Institute, Smidt Heart Institute, Departments of Cardiology and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| |
|
28
|
Zhou Y, Hill C, Yao L, Li J, Hancock D, Downward J, Jones MG, Davies DE, Ewing RM, Skipp P, Wang Y. Quantitative Proteomic Analysis in Alveolar Type II Cells Reveals the Different Capacities of RAS and TGF-β to Induce Epithelial-Mesenchymal Transition. Front Mol Biosci 2021; 8:595712. [PMID: 33869273 PMCID: PMC8048883 DOI: 10.3389/fmolb.2021.595712] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/17/2020] [Accepted: 01/15/2021] [Indexed: 12/12/2022] Open
Abstract
Alveolar type II (ATII) epithelial cells function as stem cells, contributing to alveolar renewal, repair and cancer. Therefore, they are a highly relevant model for studying a number of lung diseases, including acute injury, fibrosis and cancer, in which signals transduced by RAS and transforming growth factor (TGF)-β play critical roles. To identify downstream molecular events following RAS and/or TGF-β activation, we performed proteomic analysis using a quantitative label-free approach (LC-HDMSE) to provide in-depth proteome coverage and estimates of protein concentration in absolute amounts. Data are available via ProteomeXchange with identifier PXD023720. We chose ATIIER:KRASV12 as an experimental cell line in which RAS is activated by adding 4-hydroxytamoxifen (4-OHT). Proteomic analysis of ATII cells treated with 4-OHT or TGF-β demonstrated that RAS activation induces an epithelial–mesenchymal transition (EMT) signature. In contrast, under the same conditions, activation of TGF-β signaling alone only induces a partial EMT. EMT is a dynamic and reversible biological process by which epithelial cells lose their cell polarity and down-regulate cadherin-mediated cell–cell adhesion to gain migratory properties, and is involved in embryonic development, wound healing, fibrosis and cancer metastasis. Thus, these results could help to focus research on the identification of processes that are potentially driving EMT-related human disease.
Affiliation(s)
- Yilu Zhou
- Biological Sciences, Faculty of Environmental and Life Sciences, University of Southampton, Southampton, United Kingdom.,Institute for Life Sciences, University of Southampton, Southampton, United Kingdom
| | - Charlotte Hill
- Biological Sciences, Faculty of Environmental and Life Sciences, University of Southampton, Southampton, United Kingdom
| | - Liudi Yao
- Biological Sciences, Faculty of Environmental and Life Sciences, University of Southampton, Southampton, United Kingdom
| | - Juanjuan Li
- Biological Sciences, Faculty of Environmental and Life Sciences, University of Southampton, Southampton, United Kingdom
| | - David Hancock
- Oncogene Biology, The Francis Crick Institute, London, United Kingdom
| | - Julian Downward
- Oncogene Biology, The Francis Crick Institute, London, United Kingdom
| | - Mark G Jones
- Institute for Life Sciences, University of Southampton, Southampton, United Kingdom.,Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom.,NIHR Southampton Biomedical Research Centre, University Hospital Southampton, Southampton, United Kingdom
| | - Donna E Davies
- Institute for Life Sciences, University of Southampton, Southampton, United Kingdom.,Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom.,NIHR Southampton Biomedical Research Centre, University Hospital Southampton, Southampton, United Kingdom
| | - Rob M Ewing
- Biological Sciences, Faculty of Environmental and Life Sciences, University of Southampton, Southampton, United Kingdom.,Institute for Life Sciences, University of Southampton, Southampton, United Kingdom
| | - Paul Skipp
- Biological Sciences, Faculty of Environmental and Life Sciences, University of Southampton, Southampton, United Kingdom.,Institute for Life Sciences, University of Southampton, Southampton, United Kingdom.,Centre for Proteomic Research, Institute for Life Sciences, University of Southampton, Southampton, United Kingdom
| | - Yihua Wang
- Biological Sciences, Faculty of Environmental and Life Sciences, University of Southampton, Southampton, United Kingdom.,Institute for Life Sciences, University of Southampton, Southampton, United Kingdom.,NIHR Southampton Biomedical Research Centre, University Hospital Southampton, Southampton, United Kingdom
| |
|
29
|
McCabe A, Jones AR. lcmsWorld: High-Performance 3D Visualization Software for Mass Spectrometry. J Proteome Res 2021; 20:1981-1985. [PMID: 33710902 DOI: 10.1021/acs.jproteome.0c00618] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Abstract
Complex biological samples, in particular, in proteomics and metabolomics research, are often analyzed using mass spectrometry paired with liquid chromatography or gas chromatography. The chromatography stage adds a third dimension (retention time) to the usual 2D mass spectrometry output (mass/charge, detected ion counts). Experimental results are often discovered by complex computational analysis, but it is not always possible to know if the data has been correctly interpreted. To perform quality-control checks, it can often be helpful to verify the results by manually examining the raw data, and it is typically easier to understand the data in a graphical, rather than numerical, form. 3D graphics hardware is present in most modern computers but is rarely utilized by bioinformatics software, even when the data to be viewed are naturally 3D. lcmsWorld is new software that uses graphics hardware to quickly and smoothly examine and compare LC-MS data. A preprocessing step allows the software to subsequently access any area of the data instantly at multiple levels of detail. The data can then be freely navigated while the software automatically selects, loads, and displays the most appropriate detail. lcmsWorld is open source. Releases, source code, and example data files are available via https://github.com/PGB-LIV/lcmsWorld.
Affiliation(s)
- Antony McCabe
- Computational Biology Facility, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - Andrew R Jones
- Computational Biology Facility, University of Liverpool, Liverpool L69 7ZB, United Kingdom.,Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| |
|
30
|
Willforss J, Siino V, Levander F. OmicLoupe: facilitating biological discovery by interactive exploration of multiple omic datasets and statistical comparisons. BMC Bioinformatics 2021; 22:107. [PMID: 33663372 PMCID: PMC7931979 DOI: 10.1186/s12859-021-04043-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/11/2020] [Accepted: 02/22/2021] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Visual exploration of gene product behavior across multiple omic datasets can pinpoint technical limitations in data and reveal biological trends. Still, such exploration is challenging as there is a need for visualizations that are tailored for the purpose. RESULTS The OmicLoupe software was developed to facilitate visual data exploration and provides more than 15 interactive cross-dataset visualizations for omics data. It expands visualizations to multiple datasets for quality control, statistical comparisons and overlap and correlation analyses, while allowing for rapid inspection and downloading of selected features. The usage of OmicLoupe is demonstrated in three different studies, where it allowed for detection of both technical data limitations and biological trends across different omic layers. An example is an analysis of SARS-CoV-2 infection based on two previously published studies, where OmicLoupe facilitated the identification of gene products with consistent expression changes across datasets at both the transcript and protein levels. CONCLUSIONS OmicLoupe provides fast exploration of omics data with tailored visualizations for comparisons within and across data layers. The interactive visualizations are highly informative and are expected to be useful in various analyses of both newly generated and previously published data. OmicLoupe is available at quantitativeproteomics.org/omicloupe.
Affiliation(s)
- Jakob Willforss
- Department of Immunotechnology, Lund University, Lund, Sweden
| | - Valentina Siino
- Department of Immunotechnology, Lund University, Lund, Sweden
| | - Fredrik Levander
- Department of Immunotechnology, Lund University, Lund, Sweden.
- Science for Life Laboratory, National Bioinformatics Infrastructure Sweden (NBIS), Lund University, Lund, Sweden.
| |
|
31
|
Wang X, Wilkinson R, Kildey K, Ungerer JPJ, Hill MM, Shah AK, Mohamed A, Dutt M, Molendijk J, Healy H, Kassianos AJ. Molecular and functional profiling of apical versus basolateral small extracellular vesicles derived from primary human proximal tubular epithelial cells under inflammatory conditions. J Extracell Vesicles 2021; 10:e12064. [PMID: 33643548 PMCID: PMC7886702 DOI: 10.1002/jev2.12064] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 10/18/2020] [Revised: 01/08/2021] [Accepted: 01/13/2021] [Indexed: 12/14/2022] Open
Abstract
Proximal tubular epithelial cells (PTEC) are central players in inflammatory kidney diseases. However, the complex signalling mechanism/s via which polarized PTEC mediate disease progression are poorly understood. Small extracellular vesicles (sEV), including exosomes, are recognized as fundamental components of cellular communication and signalling courtesy of their molecular cargo (lipids, microRNA, proteins). In this study, we examined the molecular content and function of sEV secreted from the apical versus basolateral surfaces of polarized human primary PTEC under inflammatory diseased conditions. PTEC were cultured under normal and inflammatory conditions on Transwell inserts to enable separate collection and isolation of apical/basolateral sEV. Significantly increased numbers of apical and basolateral sEV were secreted under inflammatory conditions compared with equivalent normal conditions. Multi‐omics analysis revealed distinct molecular profiles (lipids, microRNA, proteins) between inflammatory and normal conditions for both apical and basolateral sEV. Biological pathway analyses of significantly differentially expressed molecules associated apical inflammatory sEV with processes of cell survival and immunological disease, while basolateral inflammatory sEV were linked to pathways of immune cell trafficking and cell‐to‐cell signalling. In line with this mechanistic concept, functional assays demonstrated significantly increased production of chemokines (monocyte chemoattractant protein‐1, interleukin‐8) and immuno‐regulatory cytokine interleukin‐10 by peripheral blood mononuclear cells activated with basolateral sEV derived from inflammatory PTEC. 
We propose that the distinct molecular composition of sEV released from the apical versus basolateral membranes of human inflammatory PTEC may reflect specialized functional roles, with basolateral‐derived sEV pivotal in modulating tubulointerstitial inflammatory responses observed in many immune‐mediated kidney diseases. These findings provide a rationale to further evaluate these sEV‐mediated inflammatory pathways as targets for biomarker and therapeutic development.
Affiliation(s)
- Xiangju Wang
- Conjoint Internal Medicine Laboratory, Chemical Pathology, Pathology Queensland, Brisbane, Queensland, Australia.,Kidney Health Service, Royal Brisbane and Women's Hospital, Brisbane, Queensland, Australia
| | - Ray Wilkinson
- Conjoint Internal Medicine Laboratory, Chemical Pathology, Pathology Queensland, Brisbane, Queensland, Australia.,Kidney Health Service, Royal Brisbane and Women's Hospital, Brisbane, Queensland, Australia.,Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Queensland, Australia.,Faculty of Medicine, University of Queensland, Brisbane, Queensland, Australia
| | - Katrina Kildey
- Conjoint Internal Medicine Laboratory, Chemical Pathology, Pathology Queensland, Brisbane, Queensland, Australia.,Kidney Health Service, Royal Brisbane and Women's Hospital, Brisbane, Queensland, Australia
| | - Jacobus P J Ungerer
- Conjoint Internal Medicine Laboratory, Chemical Pathology, Pathology Queensland, Brisbane, Queensland, Australia.,Faculty of Medicine, University of Queensland, Brisbane, Queensland, Australia
| | - Michelle M Hill
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Alok K Shah
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Ahmed Mohamed
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Mriga Dutt
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Jeffrey Molendijk
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Helen Healy
- Conjoint Internal Medicine Laboratory, Chemical Pathology, Pathology Queensland, Brisbane, Queensland, Australia.,Kidney Health Service, Royal Brisbane and Women's Hospital, Brisbane, Queensland, Australia.,Faculty of Medicine, University of Queensland, Brisbane, Queensland, Australia
| | - Andrew J Kassianos
- Conjoint Internal Medicine Laboratory, Chemical Pathology, Pathology Queensland, Brisbane, Queensland, Australia.,Kidney Health Service, Royal Brisbane and Women's Hospital, Brisbane, Queensland, Australia.,Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Queensland, Australia.,Faculty of Medicine, University of Queensland, Brisbane, Queensland, Australia
| |
|
32
|
Dowell JA, Wright LJ, Armstrong EA, Denu JM. Benchmarking Quantitative Performance in Label-Free Proteomics. ACS Omega 2021; 6:2494-2504. [PMID: 33553868 PMCID: PMC7859943 DOI: 10.1021/acsomega.0c04030] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 08/20/2020] [Accepted: 01/11/2021] [Indexed: 05/07/2023]
Abstract
Previous benchmarking studies have demonstrated the importance of instrument acquisition methodology and statistical analysis on quantitative performance in label-free proteomics. However, the effects of these parameters in combination with replicate number and false discovery rate (FDR) corrections are not known. Using a benchmarking standard, we systematically evaluated the combined impact of acquisition methodology, replicate number, statistical approach, and FDR corrections. These analyses reveal a complex interaction between these parameters that greatly impacts the quantitative fidelity of protein- and peptide-level quantification. At a high replicate number (n = 8), both data-dependent acquisition (DDA) and data-independent acquisition (DIA) methodologies yield accurate protein quantification across statistical approaches. However, at a low replicate number (n = 4), only DIA in combination with linear models for microarrays (LIMMA) and reproducibility-optimized test statistic (ROTS) produced a high level of quantitative fidelity. Quantitative accuracy at low replicates is also greatly impacted by FDR corrections, with Benjamini-Hochberg and Storey corrections yielding variable true positive rates for DDA workflows. For peptide quantification, replicate number and acquisition methodology are even more critical. A higher number of replicates in combination with DIA and LIMMA produce high quantitative fidelity, while DDA performs poorly regardless of replicate number or statistical approach. These results underscore the importance of pairing instrument acquisition methodology with the appropriate replicate number and statistical approach for optimal quantification performance.
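As background for the FDR corrections named in this abstract, the Benjamini-Hochberg adjustment can be written in a few lines. The sketch below is the standard step-up procedure, not code from the cited study. The function name `benjamini_hochberg` and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), and a
    running minimum from the largest rank downwards keeps the adjusted
    values monotone and capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical per-protein p-values from a differential expression test.
    p = [0.01, 0.04, 0.03, 0.005]
    q = benjamini_hochberg(p)
    significant = [i for i, value in enumerate(q) if value <= 0.05]
    print(q, significant)
```

Proteins whose adjusted value falls at or below the chosen FDR level (0.05 in the usage lines) are reported as differentially abundant.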
Affiliation(s)
- James A. Dowell
- Wisconsin Institute for Discovery, University of Wisconsin−Madison, 330 North Orchard Street, Madison, Wisconsin 53715, United States
| | - Logan J. Wright
- Wisconsin Institute for Discovery, University of Wisconsin−Madison, 330 North Orchard Street, Madison, Wisconsin 53715, United States
| | - Eric A. Armstrong
- Wisconsin Institute for Discovery, University of Wisconsin−Madison, 330 North Orchard Street, Madison, Wisconsin 53715, United States
| | - John M. Denu
- Wisconsin Institute for Discovery, University of Wisconsin−Madison, 330 North Orchard Street, Madison, Wisconsin 53715, United States
- Department of Biomolecular Chemistry, University of Wisconsin−Madison, 420 Henry Mall Room 1135 Biochemistry Building, Madison, Wisconsin 53706, United States
| |
|
33
|
A comparative study of evaluating missing value imputation methods in label-free proteomics. Sci Rep 2021; 11:1760. [PMID: 33469060 PMCID: PMC7815892 DOI: 10.1038/s41598-021-81279-4] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 09/03/2020] [Accepted: 12/31/2020] [Indexed: 12/29/2022] Open
Abstract
The presence of missing values (MVs) in label-free quantitative proteomics greatly reduces the completeness of data. Imputation has been widely utilized to handle MVs, and selection of the proper method is critical for the accuracy and reliability of imputation. Here we present a comparative study that evaluates the performance of seven popular imputation methods with a large-scale benchmark dataset and an immune cell dataset. Simulated MVs were incorporated into the complete part of each dataset with different combinations of MV rates and missing not at random (MNAR) rates. Normalized root mean square error (NRMSE) was applied to evaluate the accuracy of protein abundances and intergroup protein ratios after imputation. Detection of true positives (TPs) and false altered-protein discovery rate (FADR) between groups were also compared using the benchmark dataset. Furthermore, the accuracy of handling real MVs was assessed by comparing enriched pathways and signature genes of cell activation after imputing the immune cell dataset. We observed that the accuracy of imputation is primarily affected by the MNAR rate rather than the MV rate, and downstream analysis can be largely impacted by the selection of imputation methods. A random forest-based imputation method consistently outperformed other popular methods by achieving the lowest NRMSE, a high number of TPs with an average FADR < 5%, and the best detection of relevant pathways and signature genes, highlighting it as the most suitable method for label-free proteomics.
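The evaluation design described here, simulating MVs at a chosen MV rate and MNAR rate and then scoring imputed values by NRMSE, can be sketched as follows. The abstract does not give the exact simulation procedure or NRMSE formula, so both are assumptions: MNAR is approximated by removing the lowest intensities, the remaining MVs are drawn completely at random, and NRMSE is taken as the root of the mean squared error divided by the variance of the true values. The names `simulate_missing` and `nrmse` are hypothetical.

```python
import math
import random
import statistics


def simulate_missing(values, mv_rate, mnar_rate, seed=0):
    """Replace a fraction `mv_rate` of values with None.

    A fraction `mnar_rate` of the removed values is taken from the lowest
    intensities (a simple stand-in for MNAR); the rest is removed completely
    at random (MCAR).
    """
    rng = random.Random(seed)
    n_missing = round(len(values) * mv_rate)
    n_mnar = round(n_missing * mnar_rate)
    by_intensity = sorted(range(len(values)), key=lambda i: values[i])
    mnar_idx = set(by_intensity[:n_mnar])
    remaining = [i for i in range(len(values)) if i not in mnar_idx]
    mcar_idx = set(rng.sample(remaining, n_missing - n_mnar))
    hidden = mnar_idx | mcar_idx
    return [None if i in hidden else v for i, v in enumerate(values)]


def nrmse(truth, estimate):
    """Assumed definition: sqrt(mean squared error / variance of the truth)."""
    variance = statistics.pvariance(truth)
    if variance == 0:
        raise ValueError("NRMSE is undefined when the true values do not vary")
    mse = sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth)
    return math.sqrt(mse / variance)


if __name__ == "__main__":
    # Hypothetical complete log-intensities for one sample.
    complete = [float(v) for v in range(1, 11)]
    with_mvs = simulate_missing(complete, mv_rate=0.4, mnar_rate=0.5)
    observed = [v for v in with_mvs if v is not None]
    fill = sum(observed) / len(observed)  # mean imputation as a placeholder
    hidden_truth = [t for t, v in zip(complete, with_mvs) if v is None]
    print(nrmse(hidden_truth, [fill] * len(hidden_truth)))
```

Repeating this over a grid of MV rates and MNAR rates gives the kind of comparison the abstract describes, with any imputation method substituted for the placeholder mean fill.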
|
34
|
Abstract
Biotinylation identification (BioID) is a method designed to provide new cellular location and functional knowledge of the protein of interest through the identification of those proteins surrounding and in direct contact. A biotin ligase is fused onto the protein of interest and expressed in cells where it can biotinylate even short-lived transient protein complexes. In addition, due to the proximity labeling nature of the experiment, cellular localization and functional enrichment information can also be obtained. Since labeling occurs only after the addition of biotin, temporal relationships and localization changes (e.g., cytoplasmic to nuclear) can also be identified. Labeled proteins are easily purified, and contaminants minimized, using the strong interaction between biotin and streptavidin. Mass spectrometry analysis of the purified proteins allows for the identification of potential interactors for further validation and characterization.
|
35
|
Dou Y, Kalmykova S, Pashkova M, Oghbaie M, Jiang H, Molloy KR, Chait BT, Rout MP, Fenyö D, Jensen TH, Altukhov I, LaCava J. Affinity proteomic dissection of the human nuclear cap-binding complex interactome. Nucleic Acids Res 2020; 48:10456-10469. [PMID: 32960270 PMCID: PMC7544204 DOI: 10.1093/nar/gkaa743] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/07/2020] [Revised: 08/22/2020] [Accepted: 08/25/2020] [Indexed: 12/14/2022] Open
Abstract
A 5′,7-methylguanosine cap is a quintessential feature of RNA polymerase II-transcribed RNAs, and a textbook aspect of co-transcriptional RNA processing. The cap is bound by the cap-binding complex (CBC), canonically consisting of nuclear cap-binding proteins 1 and 2 (NCBP1/2). Interest in the CBC has recently renewed due to its participation in RNA-fate decisions via interactions with RNA productive factors as well as with adapters of the degradative RNA exosome. A novel cap-binding protein, NCBP3, was recently proposed to form an alternative CBC together with NCBP1, and to interact with the canonical CBC along with the protein SRRT. The theme of post-transcriptional RNA fate, and how it relates to co-transcriptional ribonucleoprotein assembly, is abundant with complicated, ambiguous, and likely incomplete models. In an effort to clarify the compositions of NCBP1-, 2- and 3-related macromolecular assemblies, we have applied an affinity capture-based interactome screen where the experimental design and data processing have been modified to quantitatively identify interactome differences between targets under a range of experimental conditions. This study generated a comprehensive view of NCBP-protein interactions in the ribonucleoprotein context and demonstrates the potential of our approach to benefit the interpretation of complex biological pathways.
Affiliation(s)
- Yuhui Dou
- Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark
- Maria Pashkova
- Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Mehrnoosh Oghbaie
- Laboratory of Cellular and Structural Biology, The Rockefeller University, New York, USA; European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Hua Jiang
- Laboratory of Cellular and Structural Biology, The Rockefeller University, New York, USA
- Kelly R Molloy
- Laboratory of Mass Spectrometry and Gaseous Ion Chemistry, The Rockefeller University, New York, USA
- Brian T Chait
- Laboratory of Mass Spectrometry and Gaseous Ion Chemistry, The Rockefeller University, New York, USA
- Michael P Rout
- Laboratory of Cellular and Structural Biology, The Rockefeller University, New York, USA
- David Fenyö
- Department of Biochemistry and Molecular Pharmacology, Institute for Systems Genetics, NYU Langone Health, New York, USA
- Torben Heick Jensen
- Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark
- Ilya Altukhov
- Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- John LaCava
- Laboratory of Cellular and Structural Biology, The Rockefeller University, New York, USA; European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
36
Song M, Greenbaum J, Luttrell J, Zhou W, Wu C, Shen H, Gong P, Zhang C, Deng HW. A Review of Integrative Imputation for Multi-Omics Datasets. Front Genet 2020; 11:570255. [PMID: 33193667 PMCID: PMC7594632 DOI: 10.3389/fgene.2020.570255] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2020] [Accepted: 09/16/2020] [Indexed: 01/05/2023] Open
Abstract
Multi-omics studies, which explore the interactions between multiple types of biological factors, have significant advantages over single-omics analysis for their ability to provide a more holistic view of biological processes, uncover the causal and functional mechanisms for complex diseases, and facilitate new discoveries in precision medicine. However, omics datasets often contain missing values, and in multi-omics study designs it is common for individuals to be represented for some omics layers but not all. Since most statistical analyses cannot be applied directly to the incomplete datasets, imputation is typically performed to infer the missing values. Integrative imputation techniques which make use of the correlations and shared information among multi-omics datasets are expected to outperform approaches that rely on single-omics information alone, resulting in more accurate results for the subsequent downstream analyses. In this review, we provide an overview of the currently available imputation methods for handling missing values in bioinformatics data with an emphasis on multi-omics imputation. In addition, we also provide a perspective on how deep learning methods might be developed for the integrative imputation of multi-omics datasets.
Affiliation(s)
- Meng Song
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, United States
- Jonathan Greenbaum
- Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, United States
- Joseph Luttrell
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, United States
- Weihua Zhou
- College of Computing, Michigan Technological University, Houghton, MI, United States
- Chong Wu
- Department of Statistics, Florida State University, Tallahassee, FL, United States
- Hui Shen
- Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, United States
- Ping Gong
- Environmental Laboratory, U.S. Army Engineer Research and Development Center, Vicksburg, MS, United States
- Chaoyang Zhang
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, United States
- Hong-Wen Deng
- Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, United States
37
Liu YC, Lu JJ, Lin LC, Lin HC, Chen CJ. Protein Biomarker Discovery for Methicillin-Sensitive, Heterogeneous Vancomycin-Intermediate and Vancomycin-Intermediate Staphylococcus aureus Strains Using Label-Free Data-Independent Acquisition Proteomics. J Proteome Res 2020; 20:164-171. [PMID: 33058664 DOI: 10.1021/acs.jproteome.0c00134] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Rapid identification of methicillin-sensitive Staphylococcus aureus (MSSA), heterogeneous vancomycin-intermediate S. aureus (hVISA), and vancomycin-intermediate S. aureus (VISA) is important for accurate treatment, timely intervention, and prevention of outbreaks. Here, 90 S. aureus isolates were analyzed for protein biomarker discovery, including MSSA, vancomycin-susceptible S. aureus (VSSA), hVISA, and VISA strains. Label-free data-independent acquisition proteomics was used to identify protein biomarkers that allow for discrimination among MSSA, hVISA, and VISA strains. There were 8786 nonredundant peptides identified, corresponding to 418 different annotated nonredundant proteins. Two VISA protein biomarkers, two hVISA protein biomarkers, and one MSSA protein biomarker with high sensitivities and specificities were discovered and verified. Data are available via MassIVE with identifier MSV000085776.
Affiliation(s)
- Yu-Ching Liu
- Graduate Institute of Integrated Medicine, China Medical University, 91, Hsueh-Shih Rd, Taichung 40402, Taiwan
- Jang-Jih Lu
- Department of Laboratory Medicine, Chang Gung Memorial Hospital, Linkou, Taoyuan 33305, Taiwan; Department of Medical Biotechnology and Laboratory Science, College of Medicine, Chang Gung University, Taoyuan 33302, Taiwan
- Lee-Chung Lin
- Department of Laboratory Medicine, Chang Gung Memorial Hospital, Linkou, Taoyuan 33305, Taiwan
- Hsiao-Chuan Lin
- School of Medicine, China Medical University, 91, Hsueh-Shih Rd, Taichung 40402, Taiwan; Department of Pediatric Infectious Diseases, China Medical University Children's Hospital, Taichung 40447, Taiwan
- Chao-Jung Chen
- Graduate Institute of Integrated Medicine, China Medical University, 91, Hsueh-Shih Rd, Taichung 40402, Taiwan; Proteomics Core Laboratory, Department of Medical Research, China Medical University Hospital, Taichung 404, Taiwan
38
Cozzolino F, Landolfi A, Iacobucci I, Monaco V, Caterino M, Celentano S, Zuccato C, Cattaneo E, Monti M. New label-free methods for protein relative quantification applied to the investigation of an animal model of Huntington Disease. PLoS One 2020; 15:e0238037. [PMID: 32886703 PMCID: PMC7473538 DOI: 10.1371/journal.pone.0238037] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2020] [Accepted: 08/07/2020] [Indexed: 12/27/2022] Open
Abstract
Spectral count (SpC) approaches are widely employed for comparing protein expression profiles in label-free (LF) differential proteomics applications. Like other comparative methods, SpC-based approaches require a normalization procedure before fold change (FC) calculation. Here, we propose new Complexity Based Normalization (CBN) methods that introduce a variable adjustment factor (f), related to the complexity of the sample, in terms of either the total number of identified proteins (CBN(P)) or the total number of spectral counts (CBN(S)). Both new methods were compared with the Normalized Spectral Abundance Factor (NSAF) and the Spectral Counts log Ratio (Rsc) using standard protein mixtures. Finally, to test the robustness and effectiveness of the CBN methods, they were employed for the comparative analysis of cortical protein extracts from the brains of zQ175 mice, a model of Huntington Disease (HD), and control animals (raw data available via ProteomeXchange with identifier PXD017471). LF data were also validated by western blot and MRM-based experiments. On standard mixtures, both CBN methods showed excellent reproducibility and coefficients of variation (CVs) in comparison to the other SpC approaches. Overall, the CBN(P) method was demonstrated to be the most reliable and sensitive in detecting small differences in protein amounts when applied to biological samples.
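As an illustration of the NSAF baseline against which the CBN methods are compared, the following minimal Python sketch computes NSAF values for one sample. NSAF divides each protein's spectral count by its length and then normalizes by the sum of these ratios over all proteins in the sample. The function name `nsaf` and the toy counts and lengths are hypothetical; the CBN adjustment factor (f) is not reproduced here because the abstract does not give its formula.

```python
def nsaf(spectral_counts, lengths):
    """Normalized Spectral Abundance Factor for each protein in one sample.

    spectral_counts: dict mapping protein id -> spectral count (SpC)
    lengths: dict mapping protein id -> protein length in residues
    """
    # Spectral abundance factor: spectral count scaled by protein length.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("sample has no spectral counts")
    # Normalize so that the values sum to 1 within the sample.
    return {p: value / total for p, value in saf.items()}


# Toy data (hypothetical): P2 has three times the counts of P1,
# but is also three times longer, so both get the same NSAF.
values = nsaf({"P1": 10, "P2": 30}, {"P1": 100, "P2": 300})
print(values)
```

Because NSAF values sum to 1 within each sample, fold changes between samples are usually computed on these normalized values rather than on raw counts.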
Affiliation(s)
- Flora Cozzolino
- Department of Chemical Sciences, University of Naples "Federico II", Naples, Italy
- CEINGE Advanced Biotechnologies, Naples, Italy
- Alfredo Landolfi
- Department of Chemical Sciences, University of Naples "Federico II", Naples, Italy
- CEINGE Advanced Biotechnologies, Naples, Italy
- Ilaria Iacobucci
- Department of Chemical Sciences, University of Naples "Federico II", Naples, Italy
- CEINGE Advanced Biotechnologies, Naples, Italy
- Marianna Caterino
- Department of Molecular Medicine and Medical Biotechnologies, University of Naples "Federico II", Naples, Italy
- Chiara Zuccato
- Department of Biosciences, University of Milan, Milan, Italy
- Istituto Nazionale di Genetica Molecolare "Romeo ed Enrica Invernizzi", Milan, Italy
- Elena Cattaneo
- Department of Biosciences, University of Milan, Milan, Italy
- Istituto Nazionale di Genetica Molecolare "Romeo ed Enrica Invernizzi", Milan, Italy
- Maria Monti
- Department of Chemical Sciences, University of Naples "Federico II", Naples, Italy
- CEINGE Advanced Biotechnologies, Naples, Italy
39
Skeffington KL, Bond AR, Bigotti MG, AbdulGhani S, Iacobazzi D, Kang SL, Heesom KJ, Wilson MC, Stoica S, Martin R, Caputo M, Suleiman MS, Ghorbel MT. Changes in inflammation and oxidative stress signalling pathways in coarcted aorta triggered by bicuspid aortic valve and growth in young children. Exp Ther Med 2020; 20:48. [PMID: 32973936 PMCID: PMC7506967 DOI: 10.3892/etm.2020.9171] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2020] [Accepted: 06/24/2020] [Indexed: 12/11/2022] Open
Abstract
Neonates with coarctation of the aorta (CoA) combined with a bicuspid aortic valve (BAV) show significant structural differences compared to neonatal CoA patients with a normal tricuspid aortic valve (TAV). These effects are likely to change over time in response to growth. This study investigated proteomic differences between coarcted aortic tissue of BAV and TAV patients in children older than one month. Aortic tissue just proximal to the coarctation site was collected from 10 children (BAV: n=6, 1.9±1.7 years; TAV: n=4, 1.7±1.5 years; mean ± SEM, P=0.92). Tissues were snap frozen, proteins extracted, and the extracts used for proteomic and phosphoproteomic analysis by Tandem Mass Tag (TMT) analysis. A total of 1811 protein and 76 phosphoprotein accession numbers were detected, of which 40 proteins and 6 phosphoproteins were significantly differentially expressed between BAV and TAV patients. Several canonical pathways involved in inflammation demonstrated enriched protein expression, including acute phase response signalling, EIF2 signalling and macrophage production of IL12 and reactive oxygen species. Acute phase response signalling also demonstrated enriched phosphoprotein expression, as did Th17 activation. Other pathways with significantly enriched protein expression include degradation of superoxide radicals and several pathways involved in apoptosis. This work suggests that BAV CoA patients older than one month have an altered proteome consistent with changes in inflammation, apoptosis and oxidative stress compared to TAV CoA patients of the same age. There is no evidence of structural differences, suggesting the pathology associated with BAV evolves with age in paediatric CoA patients.
Affiliation(s)
- Katie L Skeffington
- Bristol Heart Institute, Research Floor Level 7, Bristol Royal Infirmary, Bristol BS2 8HW, UK
- Andrew R Bond
- Bristol Heart Institute, Research Floor Level 7, Bristol Royal Infirmary, Bristol BS2 8HW, UK
- M Giulia Bigotti
- Bristol Heart Institute, Research Floor Level 7, Bristol Royal Infirmary, Bristol BS2 8HW, UK
- Safa AbdulGhani
- Bristol Heart Institute, Research Floor Level 7, Bristol Royal Infirmary, Bristol BS2 8HW, UK; Department of Congenital Heart Disease, Bristol Children's Hospital, Bristol BS2 8JB, UK
- Dominga Iacobazzi
- Bristol Heart Institute, Research Floor Level 7, Bristol Royal Infirmary, Bristol BS2 8HW, UK
- Sok-Leng Kang
- Department of Physiology, Faculty of Medicine, Al-Quds University, P.O Box 89, Abu Dis, Palestine
- Kate J Heesom
- Proteomics Facility, University of Bristol, Bristol BS8 1RJ, UK
- Serban Stoica
- Department of Physiology, Faculty of Medicine, Al-Quds University, P.O Box 89, Abu Dis, Palestine
- Robin Martin
- Department of Physiology, Faculty of Medicine, Al-Quds University, P.O Box 89, Abu Dis, Palestine
- Massimo Caputo
- Bristol Heart Institute, Research Floor Level 7, Bristol Royal Infirmary, Bristol BS2 8HW, UK; Department of Physiology, Faculty of Medicine, Al-Quds University, P.O Box 89, Abu Dis, Palestine
- M Saadeh Suleiman
- Bristol Heart Institute, Research Floor Level 7, Bristol Royal Infirmary, Bristol BS2 8HW, UK
- Mohamed T Ghorbel
- Bristol Heart Institute, Research Floor Level 7, Bristol Royal Infirmary, Bristol BS2 8HW, UK
40
Precursor Intensity-Based Label-Free Quantification Software Tools for Proteomic and Multi-Omic Analysis within the Galaxy Platform. Proteomes 2020; 8:proteomes8030015. [PMID: 32650610 PMCID: PMC7563855 DOI: 10.3390/proteomes8030015] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2020] [Revised: 07/06/2020] [Accepted: 07/07/2020] [Indexed: 01/15/2023] Open
Abstract
For mass spectrometry-based peptide and protein quantification, label-free quantification (LFQ) based on precursor mass peak (MS1) intensities is considered reliable due to its dynamic range, reproducibility, and accuracy. LFQ enables peptide-level quantitation, which is useful in proteomics (analyzing peptides carrying post-translational modifications) and multi-omics studies such as metaproteomics (analyzing taxon-specific microbial peptides) and proteogenomics (analyzing non-canonical sequences). Bioinformatics workflows accessible via the Galaxy platform have proven useful for analysis of such complex multi-omic studies. However, workflows within the Galaxy platform have lacked well-tested LFQ tools. In this study, we have evaluated moFF and FlashLFQ, two open-source LFQ tools, and implemented them within the Galaxy platform to offer access and use via established workflows. Through rigorous testing and communication with the tool developers, we have optimized the performance of each tool. Software features evaluated include: (a) match-between-runs (MBR); (b) using multiple file-formats as input for improved quantification; (c) use of containers and/or conda packages; (d) parameters needed for analyzing large datasets; and (e) optimization and validation of software performance. This work establishes a process for software implementation, optimization, and validation, and offers access to two robust software tools for LFQ-based analysis within the Galaxy platform.
41
Minadakis G, Sokratous K, Spyrou GM. ProtExA: A tool for post-processing proteomics data providing differential expression metrics, co-expression networks and functional analytics. Comput Struct Biotechnol J 2020; 18:1695-1703. [PMID: 32670509 PMCID: PMC7340977 DOI: 10.1016/j.csbj.2020.06.036] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2020] [Revised: 06/17/2020] [Accepted: 06/20/2020] [Indexed: 12/31/2022] Open
Abstract
ProTExA is a web-tool that provides a post-processing workflow for the analysis of protein and gene expression datasets. Using network-based bioinformatics approaches, ProTExA facilitates differential expression analysis and co-expression network analysis as well as pathway and post-pathway analysis. Specifically, for a given set of protein-gene expression data across samples, ProTExA: (1) performs statistical analysis and filtering to highlight the differentially expressed proteins-genes, (2) performs enrichment analysis to identify top-scored pathways, (3) generates pathway-to-pathway and pathway-to-gene networks, (4) generates protein and gene co-expression networks using a variety of methodologies, and (5) applies clustering methodologies to identify sub-networks of co-expressed proteins-genes. The proposed web-tool is a simple yet informative tool for understanding and exploiting protein and gene expression datasets, especially for those who do not have the expertise and local resources to replicate specific analyses in the context of collaborative and scientific data exchange.
Affiliation(s)
- George Minadakis
- Department of Bioinformatics, The Cyprus Institute of Neurology & Genetics, 6 International Airport Avenue, 2370 Nicosia, P.O. Box 23462, 1683 Nicosia, Cyprus
- The Cyprus School of Molecular Medicine, The Cyprus Institute of Neurology & Genetics, 6 International Airport Avenue, 2370 Nicosia, P.O. Box 23462, 1683 Nicosia, Cyprus
- Kleitos Sokratous
- Department of Bioinformatics, The Cyprus Institute of Neurology & Genetics, 6 International Airport Avenue, 2370 Nicosia, P.O. Box 23462, 1683 Nicosia, Cyprus
- OMass Therapeutics, The Schrödinger Building, Heatley Road, The Oxford Science Park, Oxford OX4 4GE, UK
- George M Spyrou
- Department of Bioinformatics, The Cyprus Institute of Neurology & Genetics, 6 International Airport Avenue, 2370 Nicosia, P.O. Box 23462, 1683 Nicosia, Cyprus
- The Cyprus School of Molecular Medicine, The Cyprus Institute of Neurology & Genetics, 6 International Airport Avenue, 2370 Nicosia, P.O. Box 23462, 1683 Nicosia, Cyprus
42
Boonekamp FJ, Dashko S, Duiker D, Gehrmann T, van den Broek M, den Ridder M, Pabst M, Robert V, Abeel T, Postma ED, Daran JM, Daran-Lapujade P. Design and Experimental Evaluation of a Minimal, Innocuous Watermarking Strategy to Distinguish Near-Identical DNA and RNA Sequences. ACS Synth Biol 2020; 9:1361-1375. [PMID: 32413257 PMCID: PMC7309318 DOI: 10.1021/acssynbio.0c00045] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
The construction of powerful cell factories requires intensive and extensive remodelling of microbial genomes. Considering the rapidly increasing number of these synthetic biology endeavors, there is an increasing need for DNA watermarking strategies that enable the discrimination between synthetic and native gene copies. While it is well documented that codon usage can affect translation, and most likely mRNA stability in eukaryotes, remarkably few quantitative studies explore the impact of watermarking on transcription, protein expression, and physiology in the popular model and industrial yeast Saccharomyces cerevisiae. The present study, using S. cerevisiae as eukaryotic paradigm, designed, implemented, and experimentally validated a systematic strategy to watermark DNA with minimal alteration of yeast physiology. The 13 genes encoding proteins involved in the major pathway for sugar utilization (i.e., glycolysis and alcoholic fermentation) were simultaneously watermarked in a yeast strain using the previously published pathway swapping strategy. Carefully swapping codons of these naturally codon optimized, highly expressed genes, did not affect yeast physiology and did not alter transcript abundance, protein abundance, and protein activity besides a mild effect on Gpm1. The markerQuant bioinformatics method could reliably discriminate native from watermarked genes and transcripts. Furthermore, presence of watermarks enabled selective CRISPR/Cas genome editing, specifically targeting the native gene copy while leaving the synthetic, watermarked variant intact. This study offers a validated strategy to simply watermark genes in S. cerevisiae.
Affiliation(s)
- Francine J. Boonekamp
- Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629HZ Delft, The Netherlands
- Sofia Dashko
- Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629HZ Delft, The Netherlands
- Donna Duiker
- Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629HZ Delft, The Netherlands
- Thies Gehrmann
- Westerdijk Institute, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Marcel van den Broek
- Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629HZ Delft, The Netherlands
- Maxime den Ridder
- Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629HZ Delft, The Netherlands
- Martin Pabst
- Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629HZ Delft, The Netherlands
- Vincent Robert
- Westerdijk Institute, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Thomas Abeel
- Intelligent Systems − Delft Bioinformatics Lab, Delft University of Technology, Van Mourik Broekmanweg 6, 2628XE Delft, The Netherlands
- Eline D. Postma
- Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629HZ Delft, The Netherlands
- Jean-Marc Daran
- Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629HZ Delft, The Netherlands
- Pascale Daran-Lapujade
- Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629HZ Delft, The Netherlands
43
A new opening for the tricky untargeted investigation of natural and modified short peptides. Talanta 2020; 219:121262. [PMID: 32887153 DOI: 10.1016/j.talanta.2020.121262] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2020] [Revised: 06/08/2020] [Accepted: 06/09/2020] [Indexed: 12/16/2022]
Abstract
Short peptides are of extreme interest in clinical and food research fields; nevertheless, they still represent a crucial analytical issue. The main aim of this paper was the development of an analytical platform for a considerable advancement in short peptide identification. For the first time, short sequences presenting both natural and post-translationally modified amino acids were comprehensively studied thanks to the generation of specific databases. Short peptide databases had a dual purpose. First, they were employed as inclusion lists for a suspect screening mass-spectrometric analysis, overcoming the limits of data dependent acquisition mode and allowing the fragmentation of such low-abundance substances. Moreover, the databases were implemented in Compound Discoverer 3.0, a software dedicated to the analysis of short molecules, for the creation of a data processing workflow specifically dedicated to short peptide tentative identification. For this purpose, a detailed study of short peptide fragmentation pathways was carried out for the first time. The proposed method was applied to the study of short peptide sequences in enriched urine samples and led to the tentative identification of more than 200 natural and modified short peptides, the highest number ever reported.
44
Zhang T, Gaffrey MJ, Monroe ME, Thomas DG, Weitz KK, Piehowski PD, Petyuk VA, Moore RJ, Thrall BD, Qian WJ. Block Design with Common Reference Samples Enables Robust Large-Scale Label-Free Quantitative Proteome Profiling. J Proteome Res 2020; 19:2863-2872. [PMID: 32407631 DOI: 10.1021/acs.jproteome.0c00310] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]
Abstract
Label-free quantitative proteomics has become an increasingly popular tool for profiling global protein abundances. However, one major limitation is the potential performance drift of the LC-MS platform over time, which, in turn, limits its utility for analyzing large-scale sample sets. To address this, we introduce an experimental and data analysis scheme based on a block design with common references within each block for enabling large-scale label-free quantification. In this scheme, a large number of samples (e.g., >100 samples) are analyzed in smaller and more manageable blocks, minimizing instrument drift and variability within individual blocks. Each designated block also contains common reference samples (e.g., controls) for normalization across all blocks. We demonstrated the robustness of this approach by profiling the proteome response of human macrophage THP-1 cells to 11 engineered nanomaterials at two different doses. A total of 116 samples were analyzed in six blocks, yielding an average coverage of 4500 proteins per sample. Following a common reference-based correction, 2537 proteins were quantified with high reproducibility without any imputation of missing values from 116 data sets. The data revealed the consistent quantification of proteins across all six blocks, as illustrated by the highly consistent abundances of house-keeping proteins in all samples and the high levels of correlation among samples from different blocks. The data also demonstrated that label-free quantification is robust and accurate enough to quantify even very subtle abundance changes as well as large fold-changes. Our streamlined workflow is easy to implement and can be readily adapted to other large cohort studies for reproducible label-free proteome quantification.
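One way to picture the common-reference correction is the sketch below, which works on log2 intensities of a single protein. It assumes the correction is a simple per-block additive shift that aligns each block's mean reference value to the mean over all blocks; the abstract does not specify the exact procedure, and the function name, data layout, and numbers are hypothetical.

```python
from statistics import mean


def reference_correct(blocks):
    """Align blocks using their common reference samples (one protein).

    blocks maps block id -> {"reference": [log2 intensities of the
    common reference samples], "samples": {sample id: log2 intensity}}.
    Assumption: block effects are additive in log2 space.
    """
    # Mean reference intensity within each block.
    ref_means = {b: mean(d["reference"]) for b, d in blocks.items()}
    # Target level: average of the per-block reference means.
    grand_ref = mean(list(ref_means.values()))
    corrected = {}
    for b, d in blocks.items():
        shift = grand_ref - ref_means[b]
        for sample, value in d["samples"].items():
            corrected[sample] = value + shift
    return corrected


# Toy data (hypothetical): block2 reads about 1 log2 unit higher than block1.
blocks = {
    "block1": {"reference": [20.0, 20.2], "samples": {"s1": 22.1}},
    "block2": {"reference": [21.0, 21.2], "samples": {"s2": 23.1}},
}
print(reference_correct(blocks))
```

In this toy case the two samples sit the same distance above their block's reference, so they end up at the same corrected value once the block offset is removed.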
Affiliation(s)
- Tong Zhang
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Matthew J Gaffrey
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Matthew E Monroe
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Dennis G Thomas
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Karl K Weitz
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Paul D Piehowski
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Vladislav A Petyuk
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Ronald J Moore
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Brian D Thrall
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Wei-Jun Qian
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
45
Paternal Resistance Training Induced Modifications in the Left Ventricle Proteome Independent of Offspring Diet. Oxid Med Cell Longev 2020; 2020:5603580. [PMID: 32454941 PMCID: PMC7218999 DOI: 10.1155/2020/5603580] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/15/2019] [Accepted: 12/18/2019] [Indexed: 01/13/2023]
Abstract
Ancestral obesogenic exposure is able to trigger harmful effects in the offspring left ventricle (LV) which could lead to cardiovascular diseases. However, the impact of the father's lifestyle on the offspring LV is largely unexplored. The aim of this study was to investigate the effects of 8 weeks of paternal resistance training (RT) on the LV proteome of offspring exposed to control or high-fat (HF) diet. Wistar rats were randomly divided into two groups: sedentary fathers and trained fathers (8 weeks, 3 times per week with weights secured to the animals' tails). The offspring were obtained by mating with sedentary females. Upon weaning, male offspring were divided into 4 groups (5 animals per group): offspring from sedentary fathers, exposed to control diet (SFO-C); offspring from trained fathers, exposed to control diet (TFO-C); offspring from sedentary fathers, exposed to high-fat diet (SFO-HF); and offspring from trained fathers, exposed to high-fat diet (TFO-HF). The LC-MS/MS analysis revealed 537 regulated proteins among groups. Offspring exposure to HF diet caused a reduction in the abundance levels of proteins related to cell component organization, metabolic processes, and transport. Proteins related to antioxidant activity, transport, and transcription regulation were increased in TFO-C and TFO-HF as compared with the SFO-C and SFO-HF groups. Paternal RT proved to be an important intervention capable of inducing significant effects on the LV proteome regardless of offspring diet, due to the increase of proteins involved in LV homeostasis maintenance. This study contributes to a better understanding of the molecular aspects involved in transgenerational inheritance.
46
Abstract
Brucella spp. are Gram negative intracellular bacteria responsible for brucellosis, a worldwide distributed zoonosis. A prominent aspect of the Brucella life cycle is its ability to invade, survive and multiply within host cells. Comprehensive approaches, such as proteomics, have aided in unravelling the molecular mechanisms underlying Brucella pathogenesis. Technological and methodological advancements such as increased instrument performance and multiplexed quantification have broadened the range of proteome studies, enabling new and improved analyses, providing deeper and more accurate proteome coverage. Indeed, proteomics has demonstrated its contribution to key research questions in Brucella biology, i.e., immunodominant proteins, host-cell interaction, stress response, antibiotic targets and resistance, protein secretion. Here, we review the proteomics of Brucella with a focus on more recent works and novel findings, ranging from reconfiguration of the intracellular bacterial proteome and studies on proteomic profiles of Brucella infected tissues, to the identification of Brucella extracellular proteins with putative roles in cell signaling and pathogenesis. In conclusion, proteomics has yielded copious new candidates and hypotheses that require future verification. It is expected that proteomics will continue to be an invaluable tool for Brucella and applications will further extend to the currently ill-explored aspects including, among others, protein processing and post-translational modification.
47
Millikin RJ, Shortreed MR, Scalf M, Smith LM. A Bayesian Null Interval Hypothesis Test Controls False Discovery Rates and Improves Sensitivity in Label-Free Quantitative Proteomics. J Proteome Res 2020; 19:1975-1981. [PMID: 32243168 DOI: 10.1021/acs.jproteome.9b00796] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
Abstract
Statistical significance tests are a common feature in quantitative proteomics workflows. The Student's t-test is widely used to compute the statistical significance of a protein's change between two groups of samples. However, the t-test's null hypothesis asserts that the difference in means between two groups is exactly zero, often marking small but uninteresting fold-changes as statistically significant. Compensations to address this issue are widely used in quantitative proteomics, but we suggest that a replacement of the t-test with a Bayesian approach offers a better path forward. In this article, we describe a Bayesian hypothesis test in which the null hypothesis is an interval rather than a single point at zero; the width of the interval is estimated from population statistics. The improved sensitivity of the method substantially increases the number of truly changing proteins detected in two benchmark data sets (ProteomeXchange identifiers PXD005590 and PXD016470). The method has been implemented within FlashLFQ, an open-source software program that quantifies bottom-up proteomics search results obtained from any search tool. FlashLFQ is rapid, sensitive, and accurate and is available both as an easy-to-use graphical user interface (Windows) and as a command-line tool (Windows/Linux/OSX).
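The difference between a point null and an interval null can be illustrated with the sketch below. It is not the FlashLFQ implementation: it assumes the posterior of a protein's log2 fold-change is approximately normal and takes the interval half-width as a given number, whereas the article estimates the width from population statistics. All names and values are hypothetical.

```python
from statistics import NormalDist


def prob_in_null_interval(post_mean, post_sd, half_width):
    """Posterior probability that the log2 fold-change lies inside
    the null interval [-half_width, +half_width].

    Assumption: the posterior is approximated by a normal distribution
    with the given mean and standard deviation.
    """
    posterior = NormalDist(mu=post_mean, sigma=post_sd)
    return posterior.cdf(half_width) - posterior.cdf(-half_width)


# A small, precisely measured change: five standard deviations away
# from zero, yet almost certainly inside a +/-0.5 null interval.
small_change = prob_in_null_interval(post_mean=0.1, post_sd=0.02, half_width=0.5)

# A large change: almost no posterior mass inside the null interval.
large_change = prob_in_null_interval(post_mean=1.5, post_sd=0.2, half_width=0.5)

print(small_change, large_change)
```

A point-null test would flag both proteins as changing, while the interval null separates the uninteresting small shift from the large one, which is the behaviour the abstract describes.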
Affiliation(s)
- Robert J Millikin
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Michael R Shortreed
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Mark Scalf
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Lloyd M Smith
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
48
Mohamed A, Collins J, Jiang H, Molendijk J, Stoll T, Torta F, Wenk MR, Bird RJ, Marlton P, Mollee P, Markey KA, Hill MM. Concurrent lipidomics and proteomics on malignant plasma cells from multiple myeloma patients: Probing the lipid metabolome. PLoS One 2020; 15:e0227455. [PMID: 31914155 PMCID: PMC6948732 DOI: 10.1371/journal.pone.0227455] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/22/2019] [Accepted: 12/18/2019] [Indexed: 12/31/2022] Open
Abstract
Background: Multiple myeloma (MM) is a hematological malignancy characterized by the clonal expansion of malignant plasma cells. Though durable remissions are possible, MM is considered incurable, with relapse occurring in almost all patients. Limited data have been reported on lipid metabolism changes in plasma cells during MM progression. Here, we evaluated the feasibility of concurrent lipidomics and proteomics analyses from patient plasma cells, and report these data on a limited number of patient samples, demonstrating the feasibility of the method and establishing hypotheses to be evaluated in the future.
Methods: Plasma cells were purified from fresh bone marrow aspirates using CD138 microbeads. Proteins and lipids were extracted using a bi-phasic solvent system with methanol, methyl tert-butyl ether, and water. Untargeted proteomics and untargeted and targeted lipidomics were performed on 7 patient samples using liquid chromatography-mass spectrometry. Two comparisons were conducted: high versus low risk, and relapse versus newly diagnosed. Proteins and pathways enriched in the relapsed group were compared to a public transcriptomic dataset from the Multiple Myeloma Research Consortium reference collection (n = 222) at the gene and pathway levels.
Results: From one million purified plasma cells, we were able to extract material and complete untargeted lipidomics (~6000 and ~3600 features in positive and negative mode, respectively) and targeted lipidomics (313 lipids), as well as untargeted proteomics analysis (~4100 reviewed proteins). Comparative analyses revealed limited differences between high and low risk groups (according to the standard clinical criteria), hence we focused on drawing comparisons between the relapsed and newly diagnosed patients. Untargeted and targeted lipidomics indicated significant down-regulation of phosphatidylcholines (PCs) in relapsed MM. Although there was limited overlap of the differential proteins/transcripts, 76 significantly enriched pathways in relapsed MM were common between proteomics and transcriptomics data. Further evaluation of the transcriptomics data for the lipid metabolism network revealed that PC, ceramide, cardiolipin, arachidonic acid and cholesterol metabolism pathways were correlated exclusively among relapsed patients and not among newly diagnosed patients.
Conclusions: This study establishes the feasibility of, and a workflow for, conducting integrated lipidomics and proteomics analyses on patient-derived plasma cells. Potential lipid metabolism changes associated with MM relapse warrant further investigation.
Affiliation(s)
- Ahmed Mohamed
  - The University of Queensland Diamantina Institute, Faculty of Medicine, University of Queensland, Woolloongabba, Brisbane, Australia
  - QIMR Berghofer Medical Research Institute, Herston, Brisbane, Australia
- Joel Collins
  - Princess Alexandra Hospital, Division of Cancer Care Services, Department of Haematology, Woolloongabba, Brisbane, Australia
  - Toowoomba Hospital, Cancer Care Services, Toowoomba, Australia
  - The University of Queensland Faculty of Medicine, Brisbane, Australia
- Hui Jiang
  - The University of Queensland Diamantina Institute, Faculty of Medicine, University of Queensland, Woolloongabba, Brisbane, Australia
- Jeffrey Molendijk
  - The University of Queensland Diamantina Institute, Faculty of Medicine, University of Queensland, Woolloongabba, Brisbane, Australia
  - QIMR Berghofer Medical Research Institute, Herston, Brisbane, Australia
- Thomas Stoll
  - The University of Queensland Diamantina Institute, Faculty of Medicine, University of Queensland, Woolloongabba, Brisbane, Australia
  - QIMR Berghofer Medical Research Institute, Herston, Brisbane, Australia
- Federico Torta
  - Memorial Sloan Kettering Cancer Center, New York, NY, United States of America
- Markus R. Wenk
  - Memorial Sloan Kettering Cancer Center, New York, NY, United States of America
- Robert J. Bird
  - Princess Alexandra Hospital, Division of Cancer Care Services, Department of Haematology, Woolloongabba, Brisbane, Australia
- Paula Marlton
  - Princess Alexandra Hospital, Division of Cancer Care Services, Department of Haematology, Woolloongabba, Brisbane, Australia
  - The University of Queensland Faculty of Medicine, Brisbane, Australia
- Peter Mollee
  - Princess Alexandra Hospital, Division of Cancer Care Services, Department of Haematology, Woolloongabba, Brisbane, Australia
  - The University of Queensland Faculty of Medicine, Brisbane, Australia
- Kate A. Markey
  - Princess Alexandra Hospital, Division of Cancer Care Services, Department of Haematology, Woolloongabba, Brisbane, Australia
  - The University of Queensland Faculty of Medicine, Brisbane, Australia
  - SLING, Department of Biochemistry, National University of Singapore, Singapore
- Michelle M. Hill
  - The University of Queensland Diamantina Institute, Faculty of Medicine, University of Queensland, Woolloongabba, Brisbane, Australia
  - QIMR Berghofer Medical Research Institute, Herston, Brisbane, Australia
  - * E-mail:
49
O'Rourke MB, Town SEL, Dalla PV, Bicknell F, Koh Belic N, Violi JP, Steele JR, Padula MP. What is Normalization? The Strategies Employed in Top-Down and Bottom-Up Proteome Analysis Workflows. Proteomes 2019; 7:proteomes7030029. [PMID: 31443461 PMCID: PMC6789750 DOI: 10.3390/proteomes7030029] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 07/08/2019] [Revised: 08/19/2019] [Accepted: 08/20/2019] [Indexed: 12/20/2022] Open
Abstract
The accurate quantification of changes in the abundance of proteins is one of the main applications of proteomics. The maintenance of accuracy can be affected by bias and error that can occur at many points in the experimental process, and normalization strategies are crucial to attempt to overcome this bias and return the sample to its regular biological condition, or normal state. Much work has been published on performing normalization on data post-acquisition with many algorithms and statistical processes available. However, there are many other sources of bias that can occur during experimental design and sample handling that are currently unaddressed. This article aims to cast light on the potential sources of bias and where normalization could be applied to return the sample to its normal state. Throughout we suggest solutions where possible but, in some cases, solutions are not available. Thus, we see this article as a starting point for discussion of the definition of and the issues surrounding the concept of normalization as it applies to the proteomic analysis of biological samples. Specifically, we discuss a wide range of different normalization techniques that can occur at each stage of the sample preparation and analysis process.
Affiliation(s)
- Matthew B O'Rourke
  - Bowel Cancer & Biomarker Lab, Northern Clinical School, Faculty of Medicine and Health, The University of Sydney, Lvl 8, Kolling Institute, Royal North Shore Hospital, St. Leonards, NSW 2065, Australia
- Stephanie E L Town
  - School of Life Sciences and Proteomics Core Facility, Faculty of Science, The University of Technology Sydney, Ultimo 2007, Australia
- Penelope V Dalla
  - School of Life Sciences and Proteomics Core Facility, Faculty of Science, The University of Technology Sydney, Ultimo 2007, Australia
  - Respiratory Cellular and Molecular Biology, Woolcock Institute of Medical Research, The University of Sydney, Glebe 2037, Australia
- Fiona Bicknell
  - School of Life Sciences and Proteomics Core Facility, Faculty of Science, The University of Technology Sydney, Ultimo 2007, Australia
- Naomi Koh Belic
  - School of Life Sciences and Proteomics Core Facility, Faculty of Science, The University of Technology Sydney, Ultimo 2007, Australia
- Jake P Violi
  - School of Life Sciences and Proteomics Core Facility, Faculty of Science, The University of Technology Sydney, Ultimo 2007, Australia
- Joel R Steele
  - School of Life Sciences and Proteomics Core Facility, Faculty of Science, The University of Technology Sydney, Ultimo 2007, Australia
- Matthew P Padula
  - School of Life Sciences and Proteomics Core Facility, Faculty of Science, The University of Technology Sydney, Ultimo 2007, Australia
50
Tang J, Fu J, Wang Y, Luo Y, Yang Q, Li B, Tu G, Hong J, Cui X, Chen Y, Yao L, Xue W, Zhu F. Simultaneous Improvement in the Precision, Accuracy, and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains. Mol Cell Proteomics 2019; 18:1683-1699. [PMID: 31097671 PMCID: PMC6682996 DOI: 10.1074/mcp.ra118.001169] [Citation(s) in RCA: 93] [Impact Index Per Article: 18.6] [Received: 10/29/2018] [Revised: 04/28/2019] [Indexed: 12/13/2022] Open
Abstract
Label-free proteome quantification (LFQ) is a multistep workflow, collectively defined by quantification tools and subsequent data manipulation methods, that has been extensively applied in current biomedical, agricultural, and environmental studies. Despite recent advances, in-depth and high-quality quantification remains extremely challenging and requires the optimization of LFQs by comparatively evaluating their performance. However, the evaluation results using different criteria (precision, accuracy, and robustness) vary greatly, and the huge number of potential LFQs has become one of the bottlenecks in comprehensively optimizing proteome quantification. In this study, a novel strategy was therefore proposed that enables the discovery of LFQs with simultaneously enhanced performance from thousands of workflows (integrating 18 quantification tools with 3,128 manipulation chains). First, the feasibility of achieving simultaneous improvement in the precision, accuracy, and robustness of LFQ was systematically assessed by collectively optimizing its multistep manipulation chains. Second, based on a variety of benchmark datasets acquired by various quantification measurements of different modes of acquisition, this novel strategy successfully identified a number of manipulation chains that simultaneously improved the performance across multiple criteria. Finally, to further enhance proteome quantification and discover LFQs with optimal performance, an online tool (https://idrblab.org/anpela/) enabling collective performance assessment (from multiple perspectives) of the entire LFQ workflow was developed. This study confirmed the feasibility of achieving simultaneous improvement in precision, accuracy, and robustness. The novel strategy proposed and validated in this study, together with the online tool, might provide useful guidance for research fields requiring the mass-spectrometry-based LFQ technique.
Affiliation(s)
- Jing Tang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
  - Department of Bioinformatics, Chongqing Medical University, Chongqing 400016, China
- Jianbo Fu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yunxia Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yongchao Luo
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Qingxia Yang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Bo Li
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Gao Tu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Jiajun Hong
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Xuejiao Cui
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Yuzong Chen
  - Department of Pharmacy, National University of Singapore, Singapore 117543, Singapore
- Lixia Yao
  - Department of Health Sciences Research, Mayo Clinic, Rochester MN 55905, United States
- Weiwei Xue
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Feng Zhu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China