1
Mehnert S, Davidson JT, Adeoye A, Lowe BD, Ruiz EA, King JR, Jackson GP. Expert Algorithm for Substance Identification Using Mass Spectrometry: Application to the Identification of Cocaine on Different Instruments Using Binary Classification Models. Journal of the American Society for Mass Spectrometry 2023; 34:1235-1247. [PMID: 37254938 PMCID: PMC10326919 DOI: 10.1021/jasms.3c00090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Revised: 05/09/2023] [Accepted: 05/15/2023] [Indexed: 06/01/2023]
Abstract
This is the second of two manuscripts describing how general linear modeling (GLM) of a selection of the most abundant normalized fragment ion abundances of replicate mass spectra from one laboratory can be used in conjunction with binary classifiers to enable specific and selective identifications with reportable error rates of spectra from other laboratories. Here, the proof-of-concept uses a training set of 128 replicate cocaine spectra from one crime laboratory as the basis of GLM modeling. GLM models for the 20 most abundant fragments of cocaine were then applied to 175 additional test/validation cocaine spectra collected in more than a dozen crime laboratories and 716 known negative spectra, which included 10 spectra of three diastereomers of cocaine. Spectral similarity and dissimilarity between the measured and predicted abundances were assessed using a variety of conventional measures, including the mean absolute residual and NIST's spectral similarity score. For each spectral measure, GLM predictions were compared to the traditional exemplar approach, which used the average of the cocaine training set as the consensus spectrum for comparisons. In unsupervised models, the expert algorithm for substance identification (EASI) provided better than a 95% true positive rate for cocaine with a 0% false positive rate. A supervised binary logistic regression model provided 100% accuracy and no errors using EASI-predicted abundances of only four peaks at m/z 152, 198, 272, and 303. Regardless of the measure of spectral similarity, error rates for identifications using EASI were superior to the traditional exemplar/consensus approach. As a supervised binary classifier, EASI was more reliable than using Mahalanobis distances.
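The exemplar comparison described in this abstract can be illustrated with a small sketch. It is not the authors' EASI implementation and does not reproduce NIST's similarity score: it only shows, under stated assumptions, how a consensus spectrum is averaged from training replicates and how a mean absolute residual and a plain (unweighted) cosine similarity could be computed against it. All function names and abundance values are hypothetical.

```python
import math


def consensus_spectrum(replicates):
    """Exemplar approach: per-peak mean of the training replicates."""
    n = len(replicates)
    return [sum(column) / n for column in zip(*replicates)]


def normalize_to_base_peak(abundances):
    """Scale abundances so the most abundant peak equals 1.0."""
    base = max(abundances)
    return [a / base for a in abundances]


def mean_absolute_residual(measured, predicted):
    """Average of |measured - predicted| over all peaks."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def cosine_similarity(measured, predicted):
    """Unweighted cosine similarity (a stand-in, not NIST's score)."""
    dot = sum(m * p for m, p in zip(measured, predicted))
    norm = math.sqrt(sum(m * m for m in measured)) * math.sqrt(
        sum(p * p for p in predicted)
    )
    return dot / norm


# Hypothetical abundances for three peaks in two training replicates.
training = [[100.0, 50.0, 10.0], [100.0, 54.0, 14.0]]
exemplar = normalize_to_base_peak(consensus_spectrum(training))

# Hypothetical query spectrum measured on another instrument.
query = normalize_to_base_peak([200.0, 100.0, 30.0])

print(mean_absolute_residual(query, exemplar))
print(cosine_similarity(query, exemplar))
```

In the approach the abstract describes, the predicted abundances would come from the GLM models rather than from a fixed average; the similarity measures themselves are applied the same way in both cases.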
Affiliation(s)
- Samantha A. Mehnert
  - Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
- J. Tyler Davidson
  - Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
- Alexandra Adeoye
  - Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
- Brandon D. Lowe
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
- Emily A. Ruiz
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
- Jacob R. King
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
- Glen P. Jackson
  - Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
2
Zhang J, Gaowa N, Wang Y, Li H, Cao Z, Yang H, Zhang X, Li S. Complementary hepatic metabolomics and proteomics reveal the adaptive mechanisms of dairy cows to the transition period. J Dairy Sci 2023; 106:2071-2088. [PMID: 36567250 DOI: 10.3168/jds.2022-22224] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/24/2022] [Accepted: 09/06/2022] [Indexed: 12/24/2022]
Abstract
The transition period from late pregnancy to early lactation is a vital time in the lifecycle of dairy cows because of its marked metabolic challenges, and the liver is the pivot point of metabolism in cattle. Nevertheless, hepatic molecular adaptation during the transition period has not been elucidated, especially from a metabolomics and proteomics perspective. Therefore, the present study aimed to investigate the hepatic metabolic alterations in transition cows using integrative metabolomics and proteomics methods. Gas chromatography quadrupole-time-of-flight mass spectrometry-based metabolomics and data-independent acquisition-based quantitative proteomics methods were used to analyze liver tissues collected from 8 healthy multiparous Holstein dairy cows 21 d before and after calving. In total, 44 of the 233 metabolites and 250 of the 3,539 proteins detected in the liver biopsies were identified as differentially expressed during the transition period. Complementary functional analysis of the differential metabolites and proteins indicated upregulation of gluconeogenesis, the tricarboxylic acid cycle, amino acid degradation, fatty acid oxidation, the AMP-activated protein kinase signaling pathway, the peroxisome proliferator-activated receptor signaling pathway, and ribosomal proteins in postpartum dairy cows. Among the metabolites and proteins, glucose-6-phosphate, fructose-6-phosphate, carnitine palmitoyltransferase 1A, and phosphoenolpyruvate carboxykinase played a significant role in these pathways. The upregulated oxidative status may accompany the pathways mentioned above. In addition, the upregulated glucagon and insulin signaling pathways indicated a significant requirement for glucose in postpartum dairy cows. These outcomes, from the view of global metabolites and proteins, may improve understanding of the biology of the transition period, which can be helpful in further developing nutritional regulation strategies targeting the liver to help cows overcome this metabolically challenging time.
Affiliation(s)
- Jun Zhang
  - College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
  - State Key Laboratory of Animal Nutrition, Beijing Engineering Technology Research Center of Raw Milk Quality and Safety Control, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Naren Gaowa
  - State Key Laboratory of Animal Nutrition, Beijing Engineering Technology Research Center of Raw Milk Quality and Safety Control, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Yajing Wang
  - State Key Laboratory of Animal Nutrition, Beijing Engineering Technology Research Center of Raw Milk Quality and Safety Control, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Huanxu Li
  - Beijing Oriental Kingherd Biotechnology Company, Beijing 100193, China
- Zhijun Cao
  - State Key Laboratory of Animal Nutrition, Beijing Engineering Technology Research Center of Raw Milk Quality and Safety Control, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Hongjian Yang
  - State Key Laboratory of Animal Nutrition, Beijing Engineering Technology Research Center of Raw Milk Quality and Safety Control, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Xiaoming Zhang
  - State Key Laboratory of Animal Nutrition, Beijing Engineering Technology Research Center of Raw Milk Quality and Safety Control, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Shengli Li
  - State Key Laboratory of Animal Nutrition, Beijing Engineering Technology Research Center of Raw Milk Quality and Safety Control, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
3
Leeming MG, Ang CS, Nie S, Varshney S, Williamson NA. Simulation of mass spectrometry-based proteomics data with Synthedia. Bioinformatics Advances 2022; 3:vbac096. [PMID: 36698761 PMCID: PMC9825309 DOI: 10.1093/bioadv/vbac096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Revised: 11/08/2022] [Accepted: 12/16/2022] [Indexed: 12/24/2022]
Abstract
Motivation: A large number of experimental and bioinformatic parameters must be set to identify and quantify peptides in mass spectrometry experiments and each of these will impact the results. An ability to simulate raw data with known contents would allow researchers to rapidly explore the effects of varying experimental parameters and systematically investigate downstream processing software. A range of data simulators are available for established data-dependent acquisition methodologies, but these do not extend to the rapidly developing field of data-independent acquisition (DIA) strategies. Results: Here, we present Synthedia, a software package to simulate DIA liquid chromatography-mass spectrometry for bottom-up proteomics experiments. Synthedia can generate datasets with known peptide precursor ions and fragments and allows for the customization of a wide variety of chromatographic and mass spectrometry parameters. Availability and implementation: Synthedia is freely available via the internet and can be used through a graphical website (https://synthedia.org/) or locally via the command line (https://github.com/mgleeming/synthedia/). Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Ching-Seng Ang
  - Bio21 Molecular Science & Biotechnology Institute, Melbourne Mass Spectrometry and Proteomics Facility, The University of Melbourne, Melbourne, VIC 3052, Australia
- Shuai Nie
  - Bio21 Molecular Science & Biotechnology Institute, Melbourne Mass Spectrometry and Proteomics Facility, The University of Melbourne, Melbourne, VIC 3052, Australia
- Swati Varshney
  - Bio21 Molecular Science & Biotechnology Institute, Melbourne Mass Spectrometry and Proteomics Facility, The University of Melbourne, Melbourne, VIC 3052, Australia
4
Tang J, Fu J, Wang Y, Li B, Li Y, Yang Q, Cui X, Hong J, Li X, Chen Y, Xue W, Zhu F. ANPELA: analysis and performance assessment of the label-free quantification workflow for metaproteomic studies. Brief Bioinform 2021; 21:621-636. [PMID: 30649171 PMCID: PMC7299298 DOI: 10.1093/bib/bby127] [Citation(s) in RCA: 131] [Impact Index Per Article: 43.7] [Received: 09/30/2018] [Revised: 11/19/2018] [Accepted: 12/06/2018] [Indexed: 12/13/2022]
Abstract
Label-free quantification (LFQ) with a specific and sequentially integrated workflow of acquisition technique, quantification tool and processing method has emerged as the popular technique employed in metaproteomic research to provide a comprehensive landscape of the adaptive response of microbes to external stimuli and their interactions with other organisms or host cells. The performance of a specific LFQ workflow is highly dependent on the studied data. Hence, it is essential to discover the most appropriate one for a specific data set. However, it is challenging to perform such discovery due to the large number of possible workflows and the multifaceted nature of the evaluation criteria. Herein, a web server ANPELA (https://idrblab.org/anpela/) was developed and validated as the first tool enabling performance assessment of the whole LFQ workflow (collective assessment by five well-established criteria with distinct underlying theories), and it enabled the identification of the optimal LFQ workflow(s) by a comprehensive performance ranking. ANPELA not only automatically detects the diverse formats of data generated by all quantification tools but also provides the most complete set of processing methods among the available web servers and stand-alone tools. Systematic validation using metaproteomic benchmarks revealed ANPELA's capabilities in (1) discovering well-performing workflow(s), (2) enabling assessment from multiple perspectives and (3) validating LFQ accuracy using spiked proteins. ANPELA has a unique ability to evaluate the performance of the whole LFQ workflow and enables the discovery of the optimal LFQs by the comprehensive performance ranking of all 560 workflows. Therefore, it has great potential for applications in metaproteomic and other studies requiring LFQ techniques, as many features are shared among proteomic studies.
Affiliation(s)
- Jing Tang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
  - School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
- Jianbo Fu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Yunxia Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Bo Li
  - School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
- Yinghong Li
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
  - School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
- Qingxia Yang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
  - School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
- Xuejiao Cui
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
  - School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
- Jiajun Hong
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Xiaofeng Li
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
  - School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
- Yuzong Chen
  - Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, Singapore, Singapore
- Weiwei Xue
  - School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
- Feng Zhu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
  - School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
5
Singh VK, Seed TM, Cheema AK. Metabolomics-based predictive biomarkers of radiation injury and countermeasure efficacy: current status and future perspectives. Expert Rev Mol Diagn 2021; 21:641-654. [PMID: 34024238 DOI: 10.1080/14737159.2021.1933448] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 12/15/2022]
Abstract
INTRODUCTION: There is an urgent need for specific and sensitive bioassays to augment biodosimetric assessments of unwanted and excessive radiation exposures that originate from unexpected nuclear/radiological events, including nuclear accidents, acts of terrorism, or the use of a radiological dispersal device. If sufficiently intense, such ionizing radiation exposures are likely to impact normal metabolic processes within the cells and organs of the body, thus inducing multifaceted biological responses. AREAS COVERED: This review covers the application of metabolomics, an emerging and promising technology based on quantitative and qualitative determinations of small molecules in biological samples for the rapid assessment of an individual's exposure to ionizing radiation. Recent advancements in the analytics of high-resolution chromatography, mass spectrometry, and bioinformatics have led to untargeted (global) and targeted (quantitative phase) approaches to identify biomarkers of radiation injury and countermeasure efficacy. Biomarkers are deemed essential for both assessing the radiation exposure levels and for extrapolative processes involved in determining scaling factors of a given radiation countering medicinal between experimental animals and humans. EXPERT OPINION: The discipline of metabolomics appears to be highly informative in assessing radiation exposure levels and for identifying biomarkers of radiation injury and countermeasure efficacy.
Affiliation(s)
- Vijay K Singh
  - Division of Radioprotectants, Department of Pharmacology and Molecular Therapeutics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
  - Scientific Research Department, Armed Forces Radiobiology Research Institute, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Amrita K Cheema
  - Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, Washington, DC, USA
  - Department of Biochemistry, Molecular and Cellular Biology, Georgetown University Medical Center, Washington, DC, USA
6
Zhang Y, Bernau C, Parmigiani G, Waldron L. The impact of different sources of heterogeneity on loss of accuracy from genomic prediction models. Biostatistics 2020; 21:253-268. [PMID: 30202918 DOI: 10.1093/biostatistics/kxy044] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/08/2018] [Revised: 07/22/2018] [Accepted: 08/04/2018] [Indexed: 11/13/2022]
Abstract
Cross-study validation (CSV) of prediction models is an alternative to traditional cross-validation (CV) in domains where multiple comparable datasets are available. Although many studies have noted potential sources of heterogeneity in genomic studies, to our knowledge none have systematically investigated their intertwined impacts on prediction accuracy across studies. We employ a hybrid parametric/non-parametric bootstrap method to realistically simulate publicly available compendia of microarray, RNA-seq, and whole metagenome shotgun microbiome studies of health outcomes. Three types of heterogeneity between studies are manipulated and studied: (i) imbalances in the prevalence of clinical and pathological covariates, (ii) differences in gene covariance that could be caused by batch, platform, or tumor purity effects, and (iii) differences in the "true" model that associates gene expression and clinical factors to outcome. We assess model accuracy while altering these factors. Lower accuracy is seen in CSV than in CV. Surprisingly, heterogeneity in known clinical covariates and differences in gene covariance structure contribute very little to the loss of accuracy when validating in new studies. However, forcing identical generative models greatly reduces the within/across study difference. These results, observed consistently for multiple disease outcomes and omics platforms, suggest that the most easily identifiable sources of study heterogeneity are not necessarily the primary ones that undermine the ability to replicate the accuracy of omics prediction models in new studies. Unidentified heterogeneity, such as could arise from unmeasured confounding, may be more important.
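The difference between within-study CV and cross-study validation can be sketched as follows. This is a toy illustration, not the authors' bootstrap simulation: it assumes a nearest-centroid classifier and two hypothetical one-feature "studies" whose class boundary is shifted, which mimics a difference in the "true" model between studies. All names and numbers are invented for the example.

```python
def fit_centroids(samples):
    """samples: list of (features, label). Returns {label: centroid}."""
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean)."""
    def sq_dist(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def accuracy(centroids, samples):
    hits = sum(predict(centroids, f) == y for f, y in samples)
    return hits / len(samples)


def leave_one_out_cv(samples):
    """Within-study CV: hold out each sample once, train on the rest."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        hits += predict(fit_centroids(train), features) == label
    return hits / len(samples)


def cross_study_validation(studies):
    """Train on each study, test on every other study."""
    results = {}
    for train_name, train in studies.items():
        model = fit_centroids(train)
        for test_name, test in studies.items():
            if test_name != train_name:
                results[(train_name, test_name)] = accuracy(model, test)
    return results


# Hypothetical studies; study_b's class boundary is shifted by +3.
studies = {
    "study_a": [([0.0], 0), ([1.0], 0), ([4.0], 1), ([5.0], 1)],
    "study_b": [([3.0], 0), ([4.0], 0), ([7.0], 1), ([8.0], 1)],
}

for name, samples in studies.items():
    print(name, "within-study CV accuracy:", leave_one_out_cv(samples))
print("cross-study accuracy:", cross_study_validation(studies))
```

With this toy data, each study is perfectly separable on its own, while a model trained on one study misclassifies half of the other, which is the qualitative pattern (lower accuracy in CSV than in CV) that the abstract reports.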
Affiliation(s)
- Yuqing Zhang
  - Graduate Program in Bioinformatics, Boston University, 24 Cummington Mall, Boston, MA, USA
- Christoph Bernau
  - Department of Medical Informatics, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, Munich, Germany
- Giovanni Parmigiani
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 3 Blackfan Cir, Boston, MA, USA
  - Department of Biostatistics, Harvard TH Chan School of Public Health, 677 Huntington Ave, Boston, MA, USA
- Levi Waldron
  - Graduate School of Public Health and Health Policy, Institute for Implementation Science in Population Health, City University of New York, 55 W 125th St, New York, NY, USA
7
Beccaria M, Siqueira ALM, Maniquet A, Giusti P, Piparo M, Stefanuto PH, Focant JF. Advanced mono- and multi-dimensional gas chromatography-mass spectrometry techniques for oxygen-containing compound characterization in biomass and biofuel samples. J Sep Sci 2020; 44:115-134. [PMID: 33185940 DOI: 10.1002/jssc.202000907] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/01/2020] [Revised: 11/05/2020] [Accepted: 11/06/2020] [Indexed: 11/08/2022]
Abstract
A wide variety of biomass types, from triglycerides to lignocellulosic-based feedstock, are among the promising candidates to substitute for crude oils as primary sources of chemical energy feedstock. During the feedstock processing carried out to increase the H:C ratio of the products, heteroatom-containing compounds can promote corrosion, thus limiting and/or deactivating catalytic processes needed to transform the biomass into fuel. The use of advanced gas chromatography techniques, in particular multi-dimensional gas chromatography, both heart-cutting and comprehensive, coupled to mass spectrometry, has been widely exploited in the field of petroleomics over the past 30 years and has also been successfully applied to the characterization of volatile and semi-volatile compounds during the processing of biomass feedstock. This review describes advanced gas chromatography-mass spectrometry-based techniques, focusing mainly on the period from 2011 to early 2020. Particular emphasis is given to multi-dimensional gas chromatography-mass spectrometry techniques for the isolation and characterization of the oxygen-containing compounds in biomass feedstock. Within this context, the most recent advances in sample preparation, derivatization, gas chromatography instrumentation, mass spectrometry ionization, identification, and data handling in the biomass industry are described.
Affiliation(s)
- Marco Beccaria
  - Organic and Biological Analytical Chemistry Group, MolSys Research Unit, University of Liège, Liège, Belgium
- Anna Luiza Mendes Siqueira
  - TOTAL Marketing Services, Research Center, Solaize, France
  - International Joint Laboratory - iC2MC: Complex Matrices Molecular Characterization, TRTG, Harfleur, France
- Adrien Maniquet
  - TOTAL Marketing Services, Research Center, Solaize, France
  - International Joint Laboratory - iC2MC: Complex Matrices Molecular Characterization, TRTG, Harfleur, France
- Pierre Giusti
  - TOTAL Refining and Chemicals, Total Research and Technologies Gonfreville, Harfleur, France
  - International Joint Laboratory - iC2MC: Complex Matrices Molecular Characterization, TRTG, Harfleur, France
- Marco Piparo
  - TOTAL Refining and Chemicals, Total Research and Technologies Gonfreville, Harfleur, France
  - International Joint Laboratory - iC2MC: Complex Matrices Molecular Characterization, TRTG, Harfleur, France
- Pierre-Hugues Stefanuto
  - Organic and Biological Analytical Chemistry Group, MolSys Research Unit, University of Liège, Liège, Belgium
- Jean-François Focant
  - Organic and Biological Analytical Chemistry Group, MolSys Research Unit, University of Liège, Liège, Belgium
8
Vicente E, Vujaskovic Z, Jackson IL. A Systematic Review of Metabolomic and Lipidomic Candidates for Biomarkers in Radiation Injury. Metabolites 2020; 10:E259. [PMID: 32575772 PMCID: PMC7344731 DOI: 10.3390/metabo10060259] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 05/15/2020] [Revised: 06/09/2020] [Accepted: 06/13/2020] [Indexed: 12/16/2022]
Abstract
A large-scale nuclear event has the ability to inflict mass casualties requiring point-of-care and laboratory-based diagnostic and prognostic biomarkers to inform victim triage and appropriate medical intervention. Extensive progress has been made to develop post-exposure point-of-care biodosimetry assays and to identify biomarkers that may be used in early phase testing to predict the course of the disease. Screening for biomarkers has recently extended to identify specific metabolomic and lipidomic responses to radiation using animal models. The objective of this review was to determine which metabolites or lipids most frequently experienced perturbations post-ionizing irradiation (IR) in preclinical studies using animal models of acute radiation sickness (ARS) and delayed effects of acute radiation exposure (DEARE). Upon review of approximately 65 manuscripts published in the peer-reviewed literature, the most frequently referenced metabolites showing clear changes in IR induced injury were found to be citrulline, citric acid, creatine, taurine, carnitine, xanthine, creatinine, hypoxanthine, uric acid, and threonine. Each metabolite was evaluated by specific study parameters to determine whether trends were in agreement across several studies. A select few show agreement across variable animal models, IR doses and timepoints, indicating that they may be ubiquitous and appropriate for use in diagnostic or prognostic biomarker panels.
Affiliation(s)
- Isabel L. Jackson
  - Division of Translational Radiation Sciences, Department of Radiation Oncology, University of Maryland School of Medicine, Baltimore, MD 21201, USA
9
Translating 'big data': better understanding of host-pathogen interactions to control bacterial foodborne pathogens in poultry. Anim Health Res Rev 2020; 21:15-35. [PMID: 31907101 DOI: 10.1017/s1466252319000124] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/20/2022]
Abstract
Recent technological advances have led to the generation, storage, and sharing of colossal sets of information ('big data'), and the expansion of 'omics' in science. To date, genomics/metagenomics, transcriptomics, proteomics, and metabolomics are arguably the most groundbreaking approaches in food and public safety. Here we review some of the recent studies of foodborne pathogens (Campylobacter spp., Salmonella spp., and Escherichia coli) in poultry using big data. Genomic/metagenomic approaches have revealed the importance of the gut microbiota in health and disease. They have also been used to identify, monitor, and understand the epidemiology of antibiotic-resistance mechanisms and provide concrete evidence about the role of poultry in human infections. Transcriptomics studies have increased our understanding of the pathophysiology and immunopathology of foodborne pathogens in poultry and have led to the identification of host-resistance mechanisms. Proteomic/metabolomic approaches have aided in identifying biomarkers and in the rapid detection of low levels of foodborne pathogens. Overall, 'omics' approaches complement each other and may provide, at least in part, a solution to our current food-safety issues by facilitating the development of new rapid diagnostics, therapeutic drugs, and vaccines to control foodborne pathogens in poultry. However, at this time most 'omics' approaches remain underutilized due to their high cost and the high level of technical skills required.
10
Tang J, Fu J, Wang Y, Luo Y, Yang Q, Li B, Tu G, Hong J, Cui X, Chen Y, Yao L, Xue W, Zhu F. Simultaneous Improvement in the Precision, Accuracy, and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains. Mol Cell Proteomics 2019; 18:1683-1699. [PMID: 31097671 PMCID: PMC6682996 DOI: 10.1074/mcp.ra118.001169] [Citation(s) in RCA: 93] [Impact Index Per Article: 18.6] [Received: 10/29/2018] [Revised: 04/28/2019] [Indexed: 12/13/2022]
Abstract
Label-free proteome quantification (LFQ) is a multistep workflow, collectively defined by quantification tools and subsequent data manipulation methods, that has been extensively applied in current biomedical, agricultural, and environmental studies. Despite recent advances, in-depth and high-quality quantification remains extremely challenging and requires the optimization of LFQs by comparatively evaluating their performance. However, the evaluation results using different criteria (precision, accuracy, and robustness) vary greatly, and the huge number of potential LFQs becomes one of the bottlenecks in comprehensively optimizing proteome quantification. In this study, a novel strategy, enabling the discovery of the LFQs of simultaneously enhanced performance from thousands of workflows (integrating 18 quantification tools with 3,128 manipulation chains), was therefore proposed. First, the feasibility of achieving simultaneous improvement in the precision, accuracy, and robustness of LFQ was systematically assessed by collectively optimizing its multistep manipulation chains. Second, based on a variety of benchmark datasets acquired by various quantification measurements of different modes of acquisition, this novel strategy successfully identified a number of manipulation chains that simultaneously improved the performance across multiple criteria. Finally, to further enhance proteome quantification and discover the LFQs of optimal performance, an online tool (https://idrblab.org/anpela/) enabling collective performance assessment (from multiple perspectives) of the entire LFQ workflow was developed. This study confirmed the feasibility of achieving simultaneous improvement in precision, accuracy, and robustness. The novel strategy proposed and validated in this study, together with the online tool, might provide useful guidance for research fields requiring the mass-spectrometry-based LFQ technique.
Collapse
Affiliation(s)
- Jing Tang
- ‡College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China; §School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China; ¶Department of Bioinformatics, Chongqing Medical University, Chongqing 400016, China
| | - Jianbo Fu
- ‡College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Yunxia Wang
- ‡College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Yongchao Luo
- ‡College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Qingxia Yang
- ‡College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China; §School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
| | - Bo Li
- §School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
| | - Gao Tu
- ‡College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China; §School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
| | - Jiajun Hong
- ‡College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Xuejiao Cui
- §School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
| | - Yuzong Chen
- ‖Department of Pharmacy, National University of Singapore, Singapore 117543, Singapore
| | - Lixia Yao
- **Department of Health Sciences Research, Mayo Clinic, Rochester MN 55905, United States
| | - Weiwei Xue
- §School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
| | - Feng Zhu
- ‡College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China; §School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China.
|
11
|
Gutierrez M, Handy K, Smith R. XNet: A Bayesian Approach to Extracted Ion Chromatogram Clustering for Precursor Mass Spectrometry Data. J Proteome Res 2019; 18:2771-2778. [PMID: 31179699 DOI: 10.1021/acs.jproteome.9b00068] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
Liquid chromatography mass spectrometry is a popular technique for high throughput analysis of biological samples. Identification and quantification of molecular species via mass spectrometry output requires postexperimental computational analysis of the raw instrument output. While tandem mass spectrometry remains a primary method for identification and quantification, species-resolved precursor data provides a rich source of unexploited information. Several algorithms have been proposed to resolve raw precursor signals into species-resolved isotopic envelopes. Many methods are particularly dependent on user parameters, and because they lack a means to optimize parameters, tend to perform poorly. To this end we present XNet, a parameter-less Bayesian machine learning approach to isotopic envelope extraction through the clustering of extracted ion chromatograms. We evaluate the performance of XNet and other prevalent methods on a quantitative ground truth data set. XNet is publicly available with an Apache license.
Affiliation(s)
- Mathew Gutierrez
- Department of Computer Science, University of Montana, Missoula, Montana 59812, United States
| | - Kyle Handy
- Department of Computer Science, University of Montana, Missoula, Montana 59812, United States
| | - Rob Smith
- Department of Computer Science, University of Montana, Missoula, Montana 59812, United States
|
12
|
Muth T, Renard BY. Evaluating de novo sequencing in proteomics: already an accurate alternative to database-driven peptide identification? Brief Bioinform 2019; 19:954-970. [PMID: 28369237 DOI: 10.1093/bib/bbx033] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 12/08/2016] [Indexed: 01/24/2023] Open
Abstract
While peptide identifications in mass spectrometry (MS)-based shotgun proteomics are mostly obtained using database search methods, high-resolution spectrum data from modern MS instruments nowadays offer the prospect of improving the performance of computational de novo peptide sequencing. The major benefit of de novo sequencing is that it does not require a reference database to deduce full-length or partial tag-based peptide sequences directly from experimental tandem mass spectrometry spectra. Although various algorithms have been developed for automated de novo sequencing, the prediction accuracy of proposed solutions has been rarely evaluated in independent benchmarking studies. The main objective of this work is to provide a detailed evaluation on the performance of de novo sequencing algorithms on high-resolution data. For this purpose, we processed four experimental data sets acquired from different instrument types from collision-induced dissociation and higher energy collisional dissociation (HCD) fragmentation mode using the software packages Novor, PEAKS and PepNovo. Moreover, the accuracy of these algorithms is also tested on ground truth data based on simulated spectra generated from peak intensity prediction software. We found that Novor shows the overall best performance compared with PEAKS and PepNovo with respect to the accuracy of correct full peptide, tag-based and single-residue predictions. In addition, the same tool outpaced the commercial competitor PEAKS in terms of running time speedup by factors of around 12-17. Despite around 35% prediction accuracy for complete peptide sequences on HCD data sets, taken as a whole, the evaluated algorithms perform moderately on experimental data but show a significantly better performance on simulated data (up to 84% accuracy). Further, we describe the most frequently occurring de novo sequencing errors and evaluate the influence of missing fragment ion peaks and spectral noise on the accuracy. Finally, we discuss the potential of de novo sequencing for now becoming more widely used in the field.
Affiliation(s)
- Thilo Muth
- Research Group Bioinformatics, Robert Koch Institute, Berlin, Germany
| | - Bernhard Y Renard
- Research Group Bioinformatics, Robert Koch Institute, Berlin, Germany
|
13
|
Stratton K, Webb-Robertson BJM, McCue LA, Stanfill B, Claborne D, Godinez I, Johansen T, Thompson AM, Burnum-Johnson KE, Waters KM, Bramer LM. pmartR: Quality Control and Statistics for Mass Spectrometry-Based Biological Data. J Proteome Res 2019; 18:1418-1425. [PMID: 30638385 PMCID: PMC6750869 DOI: 10.1021/acs.jproteome.8b00760] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 09/25/2018] [Indexed: 02/06/2023]
Abstract
Prior to statistical analysis of mass spectrometry (MS) data, quality control (QC) of the identified biomolecule peak intensities is imperative for reducing process-based sources of variation and extreme biological outliers. Without this step, statistical results can be biased. Additionally, liquid chromatography-MS proteomics data present inherent challenges due to large amounts of missing data that require special consideration during statistical analysis. While a number of R packages exist to address these challenges individually, there is no single R package that addresses all of them. We present pmartR, an open-source R package, for QC (filtering and normalization), exploratory data analysis (EDA), visualization, and statistical analysis robust to missing data. Example analysis using proteomics data from a mouse study comparing smoke exposure to control demonstrates the core functionality of the package and highlights the capabilities for handling missing data. In particular, using a combined quantitative and qualitative statistical test, 19 proteins whose statistical significance would have been missed by a quantitative test alone were identified. The pmartR package provides a single software tool for QC, EDA, and statistical comparisons of MS data that is robust to missing data and includes numerous visualization capabilities.
Affiliation(s)
- Kelly G. Stratton
- National Security Directorate, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99354, United States
| | - Bobbie-Jo M. Webb-Robertson
- National Security Directorate, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99354, United States
| | - Lee Ann McCue
- Earth & Biological Sciences Directorate, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99354, United States
| | - Bryan Stanfill
- National Security Directorate, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99354, United States
| | - Daniel Claborne
- National Security Directorate, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99354, United States
| | - Iobani Godinez
- National Security Directorate, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99354, United States
| | - Thomas Johansen
- Department of Statistics, Florida State University, 117 North Woodward Avenue, Tallahassee, Florida 32306, United States
| | - Allison M. Thompson
- Earth & Biological Sciences Directorate, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99354, United States
| | - Kristin E. Burnum-Johnson
- Earth & Biological Sciences Directorate, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99354, United States
| | - Katrina M. Waters
- Earth & Biological Sciences Directorate, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99354, United States
| | - Lisa M. Bramer
- National Security Directorate, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99354, United States
|
14
|
Slama P, Hoopmann MR, Moritz RL, Geman D. Robust determination of differential abundance in shotgun proteomics using nonparametric statistics. Mol Omics 2018; 14:424-436. [PMID: 30259924 PMCID: PMC6490964 DOI: 10.1039/c8mo00077h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
Label-free shotgun mass spectrometry enables the detection of significant changes in protein abundance between different conditions. Due to often limited cohort sizes or replication, large ratios of potential protein markers to number of samples, as well as multiple null measurements pose important technical challenges to conventional parametric models. From a statistical perspective, a scenario similar to that of unlabeled proteomics is encountered in genomics when looking for differentially expressed genes. Still, the difficulty of detecting a large fraction of the true positives without a high false discovery rate is arguably greater in proteomics due to even smaller sample sizes and peptide-to-peptide variability in detectability. These constraints argue for nonparametric (or distribution-free) tests on normalized peptide values, thus minimizing the number of free parameters, as well as for measuring significance with permutation testing. We propose such a procedure with a class-based statistic, no parametric assumptions, and no parameters to select other than a nominal false discovery rate. Our method was tested on a new dataset which is available via ProteomeXchange with identifier PXD006447. The dataset was prepared using a standard proteolytic digest of a human protein mixture at 1.5-fold to 3-fold protein concentration changes and diluted into a constant background of yeast proteins. We demonstrate its superiority relative to other approaches in terms of the realized sensitivity and realized false discovery rates determined by ground truth, and recommend it for detecting differentially abundant proteins from MS data.
Affiliation(s)
- Patrick Slama
- Center for Imaging Science, Institute for Computational Medicine, Johns Hopkins University, USA.
- Independent Researcher, Paris, France
| | | | - Robert L. Moritz
- Institute for Systems Biology, 401 Terry Avenue N, Seattle, WA, USA 98109
| | - Donald Geman
- Center for Imaging Science, Institute for Computational Medicine, Johns Hopkins University, USA.
- Department of Applied Mathematics and Statistics, Johns Hopkins University, 3400 N. Charles St., Baltimore MD, 21218
|
15
|
Henning J, Tostengard A, Smith R. A Peptide-Level Fully Annotated Data Set for Quantitative Evaluation of Precursor-Aware Mass Spectrometry Data Processing Algorithms. J Proteome Res 2018; 18:392-398. [PMID: 30394759 DOI: 10.1021/acs.jproteome.8b00659] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/31/2022]
Abstract
Modern label-free quantitative mass spectrometry workflows are complex experimental chains for devising the composition of biological samples. With benchtop and in silico experimental steps that each have a significant effect on the accuracy, coverage, and statistical significance of the study result, it is crucial to understand the efficacy and biases of each protocol decision. Although many studies have been conducted on wet lab experimental protocols, postacquisition data processing methods have not been adequately evaluated in large part due to a lack of available ground truth data. In this study, we provide a novel ground truth data set for mass spectrometry data analysis at the precursor (MS1) signal level comprised of isolated peptide signals from UPS2, a popular complex standard for proteomics analysis, requiring more than 1000 h of manual curation. The data set consists of more than 62 million points with 1,294,008 grouped into 57,518 extracted ion chromatograms and those grouped into 14,111 isotopic envelopes. This data set can be used to evaluate many aspects of mass spectrometry data processing, including precursor mapping and signal extraction algorithms.
Affiliation(s)
- Jessica Henning
- Department of Computer Science, University of Montana, Missoula, Montana 59812, United States
| | - Annika Tostengard
- Department of Computer Science, University of Montana, Missoula, Montana 59812, United States
| | - Rob Smith
- Department of Computer Science, University of Montana, Missoula, Montana 59812, United States; Prime Laboratories, Inc., Missoula, Montana, United States
|
16
|
Bittremieux W, Meysman P, Noble WS, Laukens K. Fast Open Modification Spectral Library Searching through Approximate Nearest Neighbor Indexing. J Proteome Res 2018; 17:3463-3474. [PMID: 30184435 PMCID: PMC6173621 DOI: 10.1021/acs.jproteome.8b00359] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Indexed: 12/21/2022]
Abstract
Open modification searching (OMS) is a powerful search strategy that identifies peptides carrying any type of modification by allowing a modified spectrum to match against its unmodified variant by using a very wide precursor mass window. A drawback of this strategy, however, is that it leads to a large increase in search time. Although performing an open search can be done using existing spectral library search engines by simply setting a wide precursor mass window, none of these tools have been optimized for OMS, leading to excessive runtimes and suboptimal identification results. We present the ANN-SoLo tool for fast and accurate open spectral library searching. ANN-SoLo uses approximate nearest neighbor indexing to speed up OMS by selecting only a limited number of the most relevant library spectra to compare to an unknown query spectrum. This approach is combined with a cascade search strategy to maximize the number of identified unmodified and modified spectra while strictly controlling the false discovery rate as well as a shifted dot product score to sensitively match modified spectra to their unmodified counterparts. ANN-SoLo achieves state-of-the-art performance in terms of speed and the number of identifications. On a previously published human cell line data set, ANN-SoLo confidently identifies more spectra than SpectraST or MSFragger and achieves a speedup of an order of magnitude compared with SpectraST. ANN-SoLo is implemented in Python and C++. It is freely available under the Apache 2.0 license at https://github.com/bittremieux/ANN-SoLo .
Affiliation(s)
- Wout Bittremieux
- Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Pieter Meysman
- Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
| | - William Stafford Noble
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States
| | - Kris Laukens
- Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
|
17
|
Bittremieux W, Tabb DL, Impens F, Staes A, Timmerman E, Martens L, Laukens K. Quality control in mass spectrometry-based proteomics. MASS SPECTROMETRY REVIEWS 2018; 37:697-711. [PMID: 28802010 DOI: 10.1002/mas.21544] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Received: 03/04/2017] [Revised: 07/24/2017] [Accepted: 07/24/2017] [Indexed: 05/21/2023]
Abstract
Mass spectrometry is a highly complex analytical technique and mass spectrometry-based proteomics experiments can be subject to a large variability, which forms an obstacle to obtaining accurate and reproducible results. Therefore, a comprehensive and systematic approach to quality control is an essential requirement to inspire confidence in the generated results. A typical mass spectrometry experiment consists of multiple different phases including the sample preparation, liquid chromatography, mass spectrometry, and bioinformatics stages. We review potential sources of variability that can impact the results of a mass spectrometry experiment occurring in all of these steps, and we discuss how to monitor and remedy the negative influences on the experimental results. Furthermore, we describe how specialized quality control samples of varying sample complexity can be incorporated into the experimental workflow and how they can be used to rigorously assess detailed aspects of the instrument performance.
Affiliation(s)
- Wout Bittremieux
- Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Biomedical Informatics Research Center Antwerp (Biomina), University of Antwerp/Antwerp University Hospital, Edegem, Belgium
| | - David L Tabb
- Division of Molecular Biology and Human Genetics, Stellenbosch University Faculty of Medicine and Health Sciences, Tygerberg Hospital, Cape Town, South Africa
| | - Francis Impens
- VIB Proteomics Core, Ghent, Belgium
- VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
- Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium
| | - An Staes
- VIB Proteomics Core, Ghent, Belgium
- VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
- Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium
| | - Evy Timmerman
- VIB Proteomics Core, Ghent, Belgium
- VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
- Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
- Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Zwijnaarde, Belgium
| | - Kris Laukens
- Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Biomedical Informatics Research Center Antwerp (Biomina), University of Antwerp/Antwerp University Hospital, Edegem, Belgium
|
18
|
Davidson JT, Lum BJ, Nano G, Jackson GP. Comparison of measured and recommended acceptance criteria for the analysis of seized drugs using Gas Chromatography–Mass Spectrometry (GC–MS). Forensic Chem 2018. [DOI: 10.1016/j.forc.2018.07.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 01/25/2023]
|
19
|
Fu J, Tang J, Wang Y, Cui X, Yang Q, Hong J, Li X, Li S, Chen Y, Xue W, Zhu F. Discovery of the Consistently Well-Performed Analysis Chain for SWATH-MS Based Pharmacoproteomic Quantification. Front Pharmacol 2018; 9:681. [PMID: 29997509 PMCID: PMC6028727 DOI: 10.3389/fphar.2018.00681] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 04/27/2018] [Accepted: 06/05/2018] [Indexed: 12/20/2022] Open
Abstract
Sequential windowed acquisition of all theoretical fragment ion mass spectra (SWATH-MS) has emerged as one of the most popular techniques for label-free proteome quantification in current pharmacoproteomic research. It provides more comprehensive detection and more accurate quantitation of proteins compared with traditional techniques. The performance of SWATH-MS is highly susceptible to the selection of processing method. Until now, ≥27 methods (transformation, normalization, and missing-value imputation) have been sequentially applied to construct numerous analysis chains for SWATH-MS, but it is still not clear which analysis chain gives the optimal quantification performance. Herein, the performances of 560 analysis chains for quantifying pharmacoproteomic data were comprehensively assessed. Firstly, the most complete set of the publicly available SWATH-MS based pharmacoproteomic data was collected by comprehensive literature review. Secondly, substantial variations among the performances of various analysis chains were observed, and the consistently well-performed analysis chains (CWPACs) across various datasets were for the first time generalized. Finally, the log and power transformations sequentially followed by the total ion current normalization were found to be among the best-performing analysis chains for the quantification of SWATH-MS based pharmacoproteomic data. In sum, the CWPACs identified here provided important guidance to the quantification of proteomic data and could therefore facilitate cutting-edge research in any pharmacoproteomic studies requiring the SWATH-MS technique.
Affiliation(s)
- Jianbo Fu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Jing Tang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China; School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Yunxia Wang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Xuejiao Cui
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China; School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Qingxia Yang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China; School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Jiajun Hong
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Xiaoxu Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China; School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Shuang Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China; School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Yuzong Chen
- Bioinformatics and Drug Design Group, Department of Pharmacy, Center for Computational Science and Engineering, National University of Singapore, Singapore, Singapore
| | - Weiwei Xue
- School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China; School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
|
20
|
Ceciliani F, Lecchi C, Urh C, Sauerwein H. Proteomics and metabolomics characterizing the pathophysiology of adaptive reactions to the metabolic challenges during the transition from late pregnancy to early lactation in dairy cows. J Proteomics 2017; 178:92-106. [PMID: 29055723 DOI: 10.1016/j.jprot.2017.10.010] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Received: 08/21/2017] [Revised: 10/04/2017] [Accepted: 10/15/2017] [Indexed: 01/15/2023]
Abstract
The transition from late pregnancy to early lactation is a critical period in a dairy cow's life due to the rapidly increasing drain of nutrients from the maternal organism towards the foetus and into colostrum and milk. In order to cope with the challenges of parturition and lactation, comprehensive adaptive reactions comprising the endocrine and the immune system need to be accomplished. There is high variation in this coping ability and both metabolic and infectious diseases, summarized as "production diseases", such as hypocalcaemia (milk fever), fatty liver syndrome, laminitis and ketosis, may occur and impact welfare, productive lifespan and economic outcomes. Proteomics and metabolomics have emerged as valuable techniques to characterize proteins and metabolite assets from tissue and biological fluids, such as milk, blood and urine. In this review we provide an overview on metabolic status and physiological changes during the transition period and the related production diseases in dairy cows, and summarize the state of the art on proteomics and metabolomics of biological fluids and tissues involved in metabolic stress during the peripartum period. We also provide a current and prospective view of the application of the recent achievements generated by omics for biomarker discovery and their potential in diagnosis.
BIOLOGICAL SIGNIFICANCE: For high-yielding dairy cows there are several "occupational diseases" that occur mainly during the metabolic challenges related to the transition from pregnancy to lactation. Such diseases and their sequelae form a major concern for dairy production, and often lead to early culling of animals. Besides the economic perspective, metabolic stress may severely influence animal welfare. There is a multitude of studies about the metabolic backgrounds of such so-called production diseases like ketosis, fatty liver, or hypocalcaemia, although the investigations aiming to assess the complexity of the pathophysiological reactions are largely focused on gene expression, i.e. transcriptomics. For extending the knowledge towards the proteome and the metabolome, the respective technologies are of increasing importance and can provide an overall view of how dairy cows react to metabolic stress, which is needed for an in-depth understanding of the molecular mechanisms of the related diseases. We herein review the current findings from studies applying proteomics and metabolomics to transition-related diseases, including fatty liver, ketosis, endometritis, hypocalcaemia and laminitis. For each disease, a brief overview of the up-to-date knowledge about its pathogenesis is provided, followed by an insight into the most recent achievements on the proteome and metabolome of tissues and biological fluids, such as blood serum and urine, highlighting potential biomarkers. We believe that this review would help readers to become more familiar with recent progress on the molecular background of transition-related diseases, thus encouraging research in this field.
Affiliation(s)
- Fabrizio Ceciliani
- Department of Veterinary Medicine, Università degli Studi di Milano, Milano, Italy.
| | - Cristina Lecchi
- Department of Veterinary Medicine, Università degli Studi di Milano, Milano, Italy
| | - Christiane Urh
- Institute of Animal Science, Physiology & Hygiene Unit, University of Bonn, Bonn, Germany
| | - Helga Sauerwein
- Institute of Animal Science, Physiology & Hygiene Unit, University of Bonn, Bonn, Germany
|
21
|
Hornung R, Causeur D, Bernau C, Boulesteix AL. Improving cross-study prediction through addon batch effect adjustment or addon normalization. Bioinformatics 2017; 33:397-404. [PMID: 27797760 DOI: 10.1093/bioinformatics/btw650] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/03/2016] [Accepted: 10/11/2016] [Indexed: 12/22/2022] Open
Abstract
Motivation: To date most medical tests derived by applying classification methods to high-dimensional molecular data are hardly used in clinical practice. This is partly because the prediction error resulting when applying them to external data is usually much higher than internal error as evaluated through within-study validation procedures. We suggest the use of addon normalization and addon batch effect removal techniques in this context to reduce systematic differences between external data and the original dataset with the aim to improve prediction performance. Results: We evaluate the impact of addon normalization and seven batch effect removal methods on cross-study prediction performance for several common classifiers using a large collection of microarray gene expression datasets, showing that some of these techniques reduce prediction error. Availability and Implementation: All investigated addon methods are implemented in our R package bapred. Contact: hornung@ibe.med.uni-muenchen.de. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Roman Hornung
- Department of Medical Informatics, Biometry and Epidemiology, University of Munich, Munich, Germany
| | - David Causeur
- Applied Mathematics Department, Agrocampus Ouest, Rennes, France
| | | | - Anne-Laure Boulesteix
- Department of Medical Informatics, Biometry and Epidemiology, University of Munich, Munich, Germany
|
22
|
Böcker S. Searching molecular structure databases using tandem MS data: are we there yet? Curr Opin Chem Biol 2017; 36:1-6. [DOI: 10.1016/j.cbpa.2016.12.010] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 09/13/2016] [Revised: 12/06/2016] [Accepted: 12/07/2016] [Indexed: 10/20/2022]
|
23
|
A Golden Age for Working with Public Proteomics Data. Trends Biochem Sci 2017; 42:333-341. [PMID: 28118949 PMCID: PMC5414595 DOI: 10.1016/j.tibs.2017.01.001] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Received: 10/02/2016] [Revised: 12/13/2016] [Accepted: 01/02/2017] [Indexed: 11/23/2022]
Abstract
Data sharing in mass spectrometry (MS)-based proteomics is becoming a common scientific practice, as is now common in the case of other, more mature ‘omics’ disciplines like genomics and transcriptomics. We want to highlight that this situation, unprecedented in the field, opens a plethora of opportunities for data scientists. First, we explain in some detail some of the work already achieved, such as systematic reanalysis efforts. We also explain existing applications of public proteomics data, such as proteogenomics and the creation of spectral libraries and spectral archives. Finally, we discuss the main existing challenges and mention the first attempts to combine public proteomics data with other types of omics data sets. The field of proteomics has matured and diversified substantially over the past 10 years. Proteomics data are increasingly shared through centralized, public repositories. Standardization efforts have ensured that a large proportion of these public data can be read and processed by any interested researcher. Because any proteomics data set is only partially understood, there is great opportunity for (orthogonal) reuse of public data. While public proteomics data has so far remained outside ethics and privacy discussions, recent work indicates that there is an inherent risk.
|
24
|
Navarro P, Kuharev J, Gillet LC, Bernhardt OM, MacLean B, Röst HL, Tate SA, Tsou CC, Reiter L, Distler U, Rosenberger G, Perez-Riverol Y, Nesvizhskii AI, Aebersold R, Tenzer S. A multicenter study benchmarks software tools for label-free proteome quantification. Nat Biotechnol 2016; 34:1130-1136. [PMID: 27701404 PMCID: PMC5120688 DOI: 10.1038/nbt.3685] [Citation(s) in RCA: 227] [Impact Index Per Article: 28.4] [Received: 05/10/2016] [Accepted: 08/26/2016] [Indexed: 12/12/2022]
Abstract
Consistent and accurate quantification of proteins by mass spectrometry (MS)-based proteomics depends on the performance of instruments, acquisition methods and data analysis software. In collaboration with the software developers, we evaluated OpenSWATH, SWATH 2.0, Skyline, Spectronaut and DIA-Umpire, five of the most widely used software methods for processing data from sequential window acquisition of all theoretical fragment-ion spectra (SWATH)-MS, which uses data-independent acquisition (DIA) for label-free protein quantification. We analyzed high-complexity test data sets from hybrid proteome samples of defined quantitative composition acquired on two different MS instruments using different SWATH isolation-window setups. For consistent evaluation, we developed LFQbench, an R package, to calculate metrics of precision and accuracy in label-free quantitative MS and report the identification performance, robustness and specificity of each software tool. Our reference data sets enabled developers to improve their software tools. After optimization, all tools provided highly convergent identification and reliable quantification performance, underscoring their robustness for label-free quantitative proteomics.
Affiliation(s)
- Pedro Navarro
  - Institute for Immunology, University Medical Center of the Johannes-Gutenberg University Mainz, Mainz, Germany
- Jörg Kuharev
  - Institute for Immunology, University Medical Center of the Johannes-Gutenberg University Mainz, Mainz, Germany
- Ludovic C Gillet
  - Department of Biology, Institute of Molecular Systems Biology, Eidgenoessische Technische Hochschule (IMSB-ETH) Zurich, Zurich, Switzerland
- Brendan MacLean
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Hannes L. Röst
  - Department of Biology, Institute of Molecular Systems Biology, Eidgenoessische Technische Hochschule (IMSB-ETH) Zurich, Zurich, Switzerland
- Chih-Chiang Tsou
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
- Ute Distler
  - Institute for Immunology, University Medical Center of the Johannes-Gutenberg University Mainz, Mainz, Germany
- George Rosenberger
  - Department of Biology, Institute of Molecular Systems Biology, Eidgenoessische Technische Hochschule (IMSB-ETH) Zurich, Zurich, Switzerland
  - PhD Program in Systems Biology, University of Zurich and Eidgenoessische Technische Hochschule (ETH) Zurich, Zurich, Switzerland
- Yasset Perez-Riverol
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Alexey I. Nesvizhskii
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA
- Ruedi Aebersold
  - Department of Biology, Institute of Molecular Systems Biology, Eidgenoessische Technische Hochschule (IMSB-ETH) Zurich, Zurich, Switzerland
  - Faculty of Science, University of Zurich, Zurich, Switzerland
- Stefan Tenzer
  - Institute for Immunology, University Medical Center of the Johannes-Gutenberg University Mainz, Mainz, Germany
|
25
|
Kohlbacher O, Vitek O, Weintraub ST. Challenges in Large-Scale Computational Mass Spectrometry and Multiomics. J Proteome Res 2016; 15:681-2. [DOI: 10.1021/acs.jproteome.6b00067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Oliver Kohlbacher
  - Center for Bioinformatics, Quantitative Biology Center, Department of Computer Science and Faculty of Medicine, University of Tübingen and Max Planck Institute for Developmental Biology
- Olga Vitek
  - Sy and Laurie Sternberg Interdisciplinary Associate Professor, College of Science College of Computer and Information Science, Northeastern University
- Susan T. Weintraub
  - Department of Biochemistry, The University of Texas Health Science Center at San Antonio
|