1
Heuckeroth S, Damiani T, Smirnov A, Mokshyna O, Brungs C, Korf A, Smith JD, Stincone P, Dreolin N, Nothias LF, Hyötyläinen T, Orešič M, Karst U, Dorrestein PC, Petras D, Du X, van der Hooft JJJ, Schmid R, Pluskal T. Reproducible mass spectrometry data processing and compound annotation in MZmine 3. Nat Protoc 2024:10.1038/s41596-024-00996-y. [PMID: 38769143 DOI: 10.1038/s41596-024-00996-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 02/26/2024] [Indexed: 05/22/2024]
Abstract
Untargeted mass spectrometry (MS) experiments produce complex, multidimensional data that are practically impossible to investigate manually. For this reason, computational pipelines are needed to extract relevant information from raw spectral data and convert it into a more comprehensible format. Depending on the sample type and/or goal of the study, a variety of MS platforms can be used for such analysis. MZmine is an open-source software for the processing of raw spectral data generated by different MS platforms. Examples include liquid chromatography-MS, gas chromatography-MS and MS-imaging. These data might typically be associated with various applications including metabolomics and lipidomics. Moreover, the third version of the software, described herein, supports the processing of ion mobility spectrometry (IMS) data. The present protocol provides three distinct procedures to perform feature detection and annotation of untargeted MS data produced by different instrumental setups: liquid chromatography-(IMS-)MS, gas chromatography-MS and (IMS-)MS imaging. For training purposes, example datasets are provided together with configuration batch files (i.e., list of processing steps and parameters) to allow new users to easily replicate the described workflows. Depending on the number of data files and available computing resources, we anticipate this to take between 2 and 24 h for new MZmine users and nonexperts. Within each procedure, we provide a detailed description for all processing parameters together with instructions/recommendations for their optimization. The main generated outputs are represented by aligned feature tables and fragmentation spectra lists that can be used by other third-party tools for further downstream analysis.
Affiliation(s)
- Tito Damiani
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Olena Mokshyna
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Corinna Brungs
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Ansgar Korf
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Joshua David Smith
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
  - First Faculty of Medicine, Charles University, Prague, Czech Republic
- Louis-Félix Nothias
  - University of Geneva, Geneva, Switzerland
  - Université Côte d'Azur, CNRS, ICN, Nice, France
- Matej Orešič
  - Örebro University, Örebro, Sweden
  - University of Turku and Åbo Akademi University, Turku, Finland
- Uwe Karst
  - University of Münster, Münster, Germany
- Pieter C Dorrestein
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Daniel Petras
  - University of Tuebingen, Tuebingen, Germany
  - University of California Riverside, Riverside, CA, USA
- Xiuxia Du
  - University of North Carolina at Charlotte, Charlotte, NC, USA
- Justin J J van der Hooft
  - Wageningen University & Research, Wageningen, the Netherlands
  - University of Johannesburg, Johannesburg, South Africa
- Robin Schmid
  - University of Münster, Münster, Germany
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Tomáš Pluskal
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
2
Bos TS, Pirok BWJ, Karlson L, Schantz S, Dahlseid TA, Stoll DR, Somsen GW. Fingerprinting of hydroxy propyl methyl cellulose by comprehensive two-dimensional liquid chromatography-mass spectrometry of monomers resulting from acid hydrolysis. J Chromatogr A 2024; 1722:464874. [PMID: 38598893 DOI: 10.1016/j.chroma.2024.464874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 03/19/2024] [Accepted: 04/03/2024] [Indexed: 04/12/2024]
Abstract
Hydroxypropyl methyl cellulose (HPMC) is a cellulose derivative with properties that make it useful in, for example, the food, cosmetics, and pharmaceutical industries. The substitution degree and composition of the β-glucose subunits of HPMC affect its physical and functional properties, but HPMC characterization is challenging due to its high structural heterogeneity, including many isomers. In this study, comprehensive two-dimensional liquid chromatography-mass spectrometry was used to examine substituted glucose monomers originating from complete acid hydrolysis of HPMC. Resolution between the different monomers was achieved using a C18 column in the first LC dimension and a cyano column in the second. The data analysis process was structured to obtain fingerprints of the monomers of interest. The results revealed that isomers of the respective monomers could be selectively separated based on the position of substituents. The examination of two industrial HPMC products revealed differences in overall monomer composition. While both products contained monomers with a similar degree of substitution, they exhibited distinct regioselectivity.
Affiliation(s)
- Tijmen S Bos
  - Division of Bioanalytical Chemistry, Amsterdam Institute of Molecular and Life Sciences, Vrije Universiteit Amsterdam, De Boelelaan 1085, 1081 HV Amsterdam, the Netherlands; Centre for Analytical Sciences Amsterdam (CASA), the Netherlands
- Bob W J Pirok
  - Van 't Hoff Institute for Molecular Science (HIMS), University of Amsterdam, Science Park 904, 1098 XH Amsterdam, the Netherlands; Centre for Analytical Sciences Amsterdam (CASA), the Netherlands
- Leif Karlson
  - Nouryon Chemicals, Zutphenseweg 10, 7418 AJ Deventer, the Netherlands
- Staffan Schantz
  - Oral Product Development, Pharmaceutical Technology & Development, Operations, AstraZeneca, SE-431 83, Mölndal, Sweden
- Tina A Dahlseid
  - Department of Chemistry, Gustavus Adolphus College, Saint Peter, Minnesota 56082, United States
- Dwight R Stoll
  - Department of Chemistry, Gustavus Adolphus College, Saint Peter, Minnesota 56082, United States
- Govert W Somsen
  - Division of Bioanalytical Chemistry, Amsterdam Institute of Molecular and Life Sciences, Vrije Universiteit Amsterdam, De Boelelaan 1085, 1081 HV Amsterdam, the Netherlands; Centre for Analytical Sciences Amsterdam (CASA), the Netherlands
3
Tong J, Lu M, Wang R, An S, Wang J, Wang T, Xie C, Yu C. How Much Storage Precision Can Be Lost: Guidance for Near-Lossless Compression of Untargeted Metabolomics Mass Spectrometry Data. J Proteome Res 2024; 23:1702-1712. [PMID: 38640356 DOI: 10.1021/acs.jproteome.3c00851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/21/2024]
Abstract
Several lossy compressors have achieved superior compression rates for mass spectrometry (MS) data at the cost of storage precision. The impacts of precision losses on MS data processing have not yet been thoroughly evaluated, although such an evaluation is critical for the future development of lossy compressors. We first evaluated different storage precisions (32 bit and 64 bit) in lossless mzML files. We then applied 10 truncation transformations to generate precision-lossy files: five relative errors for intensities and five absolute errors for m/z values. MZmine3 and XCMS were used for feature detection and GNPS for compound annotation. Lastly, we compared Precision, Recall, F1-score, and file sizes between lossy files and lossless files under different conditions. Overall, we revealed that the discrepancy between 32 and 64 bit precision was under 1%. We proposed an absolute m/z error of 10⁻⁴ and a relative intensity error of 2 × 10⁻², adhering to a 5% error threshold (F1-scores above 95%). For a stricter 1% error threshold (F1-scores above 99%), an absolute m/z error of 2 × 10⁻⁵ and a relative intensity error of 2 × 10⁻³ were advised. This guidance aims to help researchers improve lossy compression algorithms and minimize the negative effects of precision losses on downstream data processing.
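The error bounds and the F1 comparison described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the truncation transformations were implemented, so the grid-rounding scheme below is an assumption chosen only because it respects the stated bounds, and the function names `quantize_mz`, `quantize_intensity`, and `f1_score` are hypothetical.

```python
import math


def quantize_mz(mz: float, abs_err: float = 1e-4) -> float:
    """Snap an m/z value to a grid of spacing abs_err.

    Assumed scheme (not taken from the paper): rounding to the nearest
    grid point keeps the absolute error at or below abs_err / 2.
    """
    return round(mz / abs_err) * abs_err


def quantize_intensity(intensity: float, rel_err: float = 2e-2) -> float:
    """Snap a positive intensity to a logarithmic grid with ratio (1 + rel_err).

    Assumed scheme: rounding in log space keeps the relative error below rel_err.
    Non-positive intensities are stored as zero.
    """
    if intensity <= 0:
        return 0.0
    step = math.log1p(rel_err)
    return math.exp(round(math.log(intensity) / step) * step)


def f1_score(lossless_features: set, lossy_features: set) -> float:
    """F1-score of features found in the lossy file, using the lossless file as reference."""
    true_positives = len(lossless_features & lossy_features)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(lossy_features)
    recall = true_positives / len(lossless_features)
    return 2 * precision * recall / (precision + recall)
```

In this sketch a feature is any hashable identifier; matching real features between files would need m/z and retention-time tolerances, which the abstract does not specify.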
Affiliation(s)
- Junjie Tong
  - Central Hospital Affiliated to Shandong First Medical University, Jinan 250000, Shandong, China
  - Key Laboratory of Tropical Medicinal Plant Chemistry of Ministry of Education, College of Chemistry and Chemical Engineering, Hainan Normal University, Haikou 571158, Hainan, China
- Miaoshan Lu
  - Central Hospital Affiliated to Shandong First Medical University, Jinan 250000, Shandong, China
- Ruimin Wang
  - Central Hospital Affiliated to Shandong First Medical University, Jinan 250000, Shandong, China
  - Fudan University, Shanghai 200000, China
  - Westlake University, Hangzhou 310024, Zhejiang, China
- Shaowei An
  - Fudan University, Shanghai 200000, China
  - Westlake University, Hangzhou 310024, Zhejiang, China
- Jinyin Wang
  - Westlake University, Hangzhou 310024, Zhejiang, China
  - Zhejiang University, Hangzhou 310009, Zhejiang, China
- Tong Wang
  - Central Hospital Affiliated to Shandong First Medical University, Jinan 250000, Shandong, China
- Cong Xie
  - Central Hospital Affiliated to Shandong First Medical University, Jinan 250000, Shandong, China
  - Key Laboratory of Tropical Medicinal Plant Chemistry of Ministry of Education, College of Chemistry and Chemical Engineering, Hainan Normal University, Haikou 571158, Hainan, China
- Changbin Yu
  - Central Hospital Affiliated to Shandong First Medical University, Jinan 250000, Shandong, China
4
Rupasinghe M, Bersaglieri C, Leslie Pedrioli DM, Pedrioli PG, Panatta M, Hottiger MO, Cinelli P, Santoro R. PRAMEL7 and CUL2 decrease NuRD stability to establish ground-state pluripotency. EMBO Rep 2024; 25:1453-1468. [PMID: 38332149 PMCID: PMC10933316 DOI: 10.1038/s44319-024-00083-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 01/05/2024] [Accepted: 01/10/2024] [Indexed: 02/10/2024] [Open Access]
Abstract
Pluripotency is established in the E4.5 preimplantation epiblast. Embryonic stem cells (ESCs) represent the immortalization of pluripotency; however, their gene expression signature only partially resembles that of the developmental ground state. Induced expression of PRAMEL7, a protein highly expressed in the ICM but lowly expressed in ESCs, reprograms developmentally advanced ESC+serum into ground-state pluripotency by inducing a gene expression signature close to the developmental ground state. However, how PRAMEL7 reprograms gene expression remains elusive. Here we show that PRAMEL7 associates with Cullin2 (CUL2) and that this interaction is required to establish ground-state gene expression. PRAMEL7 recruits CUL2 to chromatin and targets regulators of repressive chromatin, including the NuRD complex, for proteasomal degradation. PRAMEL7 antagonizes NuRD-mediated repression of genes implicated in pluripotency by decreasing NuRD stability and promoter association in a CUL2-dependent manner. Our data link proteasome degradation pathways to ground-state gene expression, offering insights for generating in vitro models that reproduce in vivo ground-state pluripotency.
Affiliation(s)
- Meneka Rupasinghe
  - Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, 8057, Zurich, Switzerland
  - Molecular Life Science Program, Life Science Zurich Graduate School, University of Zurich, 8057, Zurich, Switzerland
- Cristiana Bersaglieri
  - Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, 8057, Zurich, Switzerland
- Deena M Leslie Pedrioli
  - Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, 8057, Zurich, Switzerland
- Patrick Ga Pedrioli
  - Department of Health Sciences and Technology, ETH Zurich, 8093, Zurich, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Martina Panatta
  - Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, 8057, Zurich, Switzerland
  - RNA Biology Program, Life Science Zurich Graduate School, University of Zurich, Zurich, Switzerland
- Michael O Hottiger
  - Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, 8057, Zurich, Switzerland
- Paolo Cinelli
  - Department of Trauma Surgery, University Hospital Zurich, University of Zurich, Rämistrasse 100, 8091, Zurich, Switzerland
- Raffaella Santoro
  - Department of Molecular Mechanisms of Disease, DMMD, University of Zurich, 8057, Zurich, Switzerland
5
Sandström H, Rissanen M, Rousu J, Rinke P. Data-Driven Compound Identification in Atmospheric Mass Spectrometry. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2024; 11:e2306235. [PMID: 38095508 PMCID: PMC10885664 DOI: 10.1002/advs.202306235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/04/2023] [Indexed: 02/24/2024]
Abstract
Aerosol particles found in the atmosphere affect the climate and worsen air quality. To mitigate these adverse impacts, aerosol particle formation and aerosol chemistry in the atmosphere need to be better mapped out and understood. Currently, mass spectrometry is the single most important analytical technique in atmospheric chemistry and is used to track and identify compounds and processes. Large amounts of data are collected in each measurement of current time-of-flight and orbitrap mass spectrometers using modern rapid data acquisition practices. However, compound identification remains a major bottleneck during data analysis due to a lack of reference libraries and analysis tools. Data-driven compound identification approaches could alleviate the problem, yet remain rare to non-existent in atmospheric science. In this perspective, the authors review the current state of data-driven compound identification with mass spectrometry in atmospheric science and discuss current challenges and possible future steps toward a digital era for atmospheric mass spectrometry.
Affiliation(s)
- Hilda Sandström
  - Department of Applied Physics, Aalto University, P.O. Box 11000, FI-00076, Aalto, Espoo, Finland
- Matti Rissanen
  - Aerosol Physics Laboratory, Tampere University, FI-33720, Tampere, Finland
  - Department of Chemistry, University of Helsinki, P.O. Box 55, A.I. Virtasen aukio 1, FI-00560, Helsinki, Finland
- Juho Rousu
  - Department of Computer Science, Aalto University, P.O. Box 11000, FI-00076, Aalto, Espoo, Finland
- Patrick Rinke
  - Department of Applied Physics, Aalto University, P.O. Box 11000, FI-00076, Aalto, Espoo, Finland
6
Lu M, Tong J, Fang W, Wang J, An S, Wang R, Jiang H, Yu C. Column storage enables edge computation of biological big data on 5G networks. Mathematical Biosciences and Engineering: MBE 2023; 20:17197-17219. [PMID: 37920052 DOI: 10.3934/mbe.2023766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2023]
Abstract
With the continuous improvement of biological detection technology, the scale of biological data is also increasing, which overloads the central computing server. The use of edge computing in 5G networks can provide higher processing performance for large biological data analysis, reduce bandwidth consumption and improve data security. An appropriate data compression and reading strategy is the key technology for implementing edge computing. We introduce the column storage strategy into mass spectrum data so that some analysis scenarios can be completed by edge computing. Data produced by mass spectrometry are typical biological big data. A blood sample analysed by mass spectrometry can produce a 10-gigabyte digital file. By introducing the column storage strategy and combining it with prior knowledge of mass spectrometry, the structure of the mass spectrum data is reorganized, and the resulting file is effectively compressed. Data can be processed immediately near the scientific instrument, reducing the bandwidth requirements and the pressure on the central server. Here, we present Aird-Slice, a mass spectrum data format using the column storage strategy. Aird-Slice reduces volume by 48% compared to vendor files and speeds up the critical computational step of ion chromatography extraction by an average of 116 times over the test dataset. Aird-Slice provides the ability to analyze biological data using an edge computing architecture on 5G networks.
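Why a column-oriented layout can speed up chromatogram extraction can be illustrated with a small Python sketch. It is not the Aird-Slice format, whose on-disk structure the abstract does not describe. It only assumes that data points are reorganized so they are sorted by m/z rather than grouped by scan, and the function names `to_columns`, `xic_rows`, and `xic_columns` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def to_columns(spectra):
    """Reorganize scan-by-scan spectra into parallel arrays sorted by m/z.

    `spectra` is a list of scans, each a list of (mz, intensity) pairs.
    """
    points = sorted(
        (mz, scan, intensity)
        for scan, spectrum in enumerate(spectra)
        for mz, intensity in spectrum
    )
    mzs = [p[0] for p in points]
    scans = [p[1] for p in points]
    intensities = [p[2] for p in points]
    return mzs, scans, intensities


def xic_rows(spectra, lo, hi):
    """Row layout: every point of every scan must be inspected."""
    return [sum(i for mz, i in spectrum if lo <= mz <= hi) for spectrum in spectra]


def xic_columns(columns, n_scans, lo, hi):
    """Column layout: binary search finds the m/z window, then only that slice is read."""
    mzs, scans, intensities = columns
    trace = [0.0] * n_scans
    for k in range(bisect_left(mzs, lo), bisect_right(mzs, hi)):
        trace[scans[k]] += intensities[k]
    return trace
```

Both functions return the same per-scan intensity trace; the column version touches only the points inside the requested m/z window, which is the general idea behind reading less data per query.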
Affiliation(s)
- Miaoshan Lu
  - Zhejiang University, Hangzhou 310009, Zhejiang Province, China
  - School of Engineering, Westlake University, Hangzhou, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, China
  - Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, China
- Junjie Tong
  - Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, China
- Weidong Fang
  - Guangxi Key Laboratory of Wireless Wideband Communication and Signal Processing, Guilin University of Electronic Technology, Guilin 541004, China
- Jinyin Wang
  - Zhejiang University, Hangzhou 310009, Zhejiang Province, China
- Hengxuan Jiang
  - Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, China
- Changbin Yu
  - Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, China
7
Kim MS, Kim BY, Kim JI, Lee J, Jeon WK. Mumefural Improves Recognition Memory and Alters ERK-CREB-BDNF Signaling in a Mouse Model of Chronic Cerebral Hypoperfusion. Nutrients 2023; 15:3271. [PMID: 37513692 PMCID: PMC10383324 DOI: 10.3390/nu15143271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 07/20/2023] [Accepted: 07/23/2023] [Indexed: 07/30/2023] [Open Access]
Abstract
Cognitive impairment resulting from chronic cerebral hypoperfusion (CCH) is known as vascular dementia (VaD) and is associated with cerebral atrophy and cholinergic deficiencies. Mumefural (MF), a bioactive compound found in a heated fruit of Prunus mume Sieb. et Zucc, was recently found to improve cognitive impairment in a rat CCH model. However, additional evidence is necessary to validate the efficacy of MF administration for treating VaD. Therefore, we evaluated MF effects in a mouse CCH model using unilateral common carotid artery occlusion (UCCAO). Mice were subjected to UCCAO or sham surgery and orally treated with MF daily for 8 weeks. Behavioral tests were used to investigate cognitive function and locomotor activity. Changes in body and brain weights were measured, and levels of hippocampal proteins (brain-derived neurotrophic factor (BDNF), extracellular signal-regulated kinase (ERK), cyclic AMP-response element-binding protein (CREB), and acetylcholinesterase (AChE)) were assessed. Additionally, proteomic analysis was conducted to examine the alterations in protein profiles induced by MF treatment. Our study showed that MF administration significantly improved cognitive deficits. Brain atrophy was attenuated and MF treatment reversed the increase in AChE levels. Furthermore, MF significantly upregulated p-ERK/ERK, p-CREB/CREB, and BDNF levels after UCCAO. Thus, MF treatment ameliorates CCH-induced cognitive impairment by regulating ERK/CREB/BDNF signaling, suggesting that MF is a therapeutic candidate for treating CCH.
Affiliation(s)
- Min-Soo Kim
  - KM Convergence Research Division, Korea Institute of Oriental Medicine, Daejeon 34054, Republic of Korea
  - Department of Biohealth Regulatory Science, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Bu-Yeo Kim
  - KM Convergence Research Division, Korea Institute of Oriental Medicine, Daejeon 34054, Republic of Korea
- Jung Im Kim
  - KM Convergence Research Division, Korea Institute of Oriental Medicine, Daejeon 34054, Republic of Korea
- Won Kyung Jeon
  - KM Convergence Research Division, Korea Institute of Oriental Medicine, Daejeon 34054, Republic of Korea
8
Valle A, de la Calle ME, Muhamadali H, Hollywood KA, Xu Y, Lloyd JR, Goodacre R, Cantero D, Bolivar J. Metabolomics of Escherichia coli for Disclosing Novel Metabolic Engineering Strategies for Enhancing Hydrogen and Ethanol Production. Int J Mol Sci 2023; 24:11619. [PMID: 37511377 PMCID: PMC10380867 DOI: 10.3390/ijms241411619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2023] [Revised: 07/11/2023] [Accepted: 07/12/2023] [Indexed: 07/30/2023] [Open Access]
Abstract
The biological production of hydrogen is an appealing approach to mitigating the environmental problems caused by the diminishing supply of fossil fuels and the need for greener energy. Escherichia coli is one of the best-characterized microorganisms capable of consuming glycerol, a waste product of the biodiesel industry, and producing H2 and ethanol. However, the natural capacity of E. coli to generate these compounds is insufficient for commercial or industrial purposes. Metabolic engineering allows for the rewiring of the carbon source towards H2 production, although the strategies for achieving this aim are difficult to foresee. In this work, we use GC-MS and FT-IR metabolomics platforms to detect metabolic bottlenecks in the engineered ΔldhΔgndΔfrdBC::kan (M4) and ΔldhΔgndΔfrdBCΔtdcE::kan (M5) E. coli strains, previously reported as improved H2 and ethanol producers. In the M5 strain, increased intracellular citrate and malate were detected by GC-MS. These metabolites can be redirected towards acetyl-CoA and formate by the overexpression of the citrate lyase (CIT) enzyme and by co-overexpressing the anaplerotic human phosphoenolpyruvate carboxykinase (hPEPCK) or malic (MaeA) enzymes using inducible promoter vectors. These strategies enhanced specific H2 production by up to 1.25- and 1.49-fold, respectively, compared to the reference strains. Other parameters, such as ethanol and H2 yields, were also enhanced. However, these vectors may provoke metabolic burden in anaerobic conditions. Therefore, alternative strategies for a tighter control of protein expression should be addressed in order to avoid undesirable effects in the metabolic network.
Affiliation(s)
- Antonio Valle
  - Department of Biomedicine, Biotechnology and Public Health-Biochemistry and Molecular Biology, Campus Universitario de Puerto Real, University of Cadiz, 11510 Puerto Real, Spain
  - Institute of Viticulture and Agri-Food Research (IVAGRO)-International Campus of Excellence (ceiA3), University of Cadiz, 11510 Puerto Real, Spain
- Maria Elena de la Calle
  - Department of Biomedicine, Biotechnology and Public Health-Biochemistry and Molecular Biology, Campus Universitario de Puerto Real, University of Cadiz, 11510 Puerto Real, Spain
  - Department of Chemical Engineering and Food Technology, Campus Universitario de Puerto Real, University of Cadiz, 11510 Puerto Real, Spain
- Howbeer Muhamadali
  - Centre for Metabolomics Research, Department of Biochemistry, Cell and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, BioSciences Building, Crown Street, Liverpool L69 7ZB, UK
  - Manchester Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, The University of Manchester, Manchester M1 7DN, UK
- Katherine A Hollywood
  - Manchester Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, The University of Manchester, Manchester M1 7DN, UK
  - Department of Chemistry, Faculty of Science and Engineering, Manchester Institute of Biotechnology, The University of Manchester, Manchester M1 7DN, UK
- Yun Xu
  - Centre for Metabolomics Research, Department of Biochemistry, Cell and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, BioSciences Building, Crown Street, Liverpool L69 7ZB, UK
  - Manchester Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, The University of Manchester, Manchester M1 7DN, UK
- Jonathan R Lloyd
  - Williamson Research Centre, School of Earth & Environmental Sciences, University of Manchester, Manchester M13 9PL, UK
- Royston Goodacre
  - Centre for Metabolomics Research, Department of Biochemistry, Cell and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, BioSciences Building, Crown Street, Liverpool L69 7ZB, UK
  - Manchester Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, The University of Manchester, Manchester M1 7DN, UK
- Domingo Cantero
  - Institute of Viticulture and Agri-Food Research (IVAGRO)-International Campus of Excellence (ceiA3), University of Cadiz, 11510 Puerto Real, Spain
  - Department of Chemical Engineering and Food Technology, Campus Universitario de Puerto Real, University of Cadiz, 11510 Puerto Real, Spain
- Jorge Bolivar
  - Department of Biomedicine, Biotechnology and Public Health-Biochemistry and Molecular Biology, Campus Universitario de Puerto Real, University of Cadiz, 11510 Puerto Real, Spain
  - Institute of Biomolecules (INBIO), University of Cadiz, 11510 Puerto Real, Spain
9
Naumann L, Haun A, Höchsmann A, Mohr M, Novák M, Flottmann D, Neusüß C. Augmented region of interest for untargeted metabolomics mass spectrometry (AriumMS) of multi-platform-based CE-MS and LC-MS data. Anal Bioanal Chem 2023:10.1007/s00216-023-04715-6. [PMID: 37225900 DOI: 10.1007/s00216-023-04715-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 04/16/2023] [Accepted: 04/20/2023] [Indexed: 05/26/2023]
Abstract
In mass spectrometry (MS)-based metabolomics, there is a great need to combine different analytical separation techniques to cover metabolites of different polarities and apply appropriate multi-platform data processing. Here, we introduce AriumMS (augmented region of interest for untargeted metabolomics mass spectrometry) as a reliable toolbox for multi-platform metabolomics. AriumMS offers augmented data analysis of several separation techniques utilizing a region-of-interest algorithm. To demonstrate the capabilities of AriumMS, five datasets were combined. This includes three newly developed capillary electrophoresis (CE)-Orbitrap MS methods using the recently introduced nanoCEasy CE-MS interface and two hydrophilic interaction liquid chromatography (HILIC)-Orbitrap MS methods. AriumMS provides a novel mid-level data fusion approach for multi-platform data analysis to simplify and speed up multi-platform data processing and evaluation. The key feature of AriumMS lies in the optimized data processing strategy, including parallel processing of datasets and flexible parameterization for processing of individual separation methods with different peak characteristics. As a case study, Saccharomyces cerevisiae (yeast) was treated with a growth inhibitor, and AriumMS successfully differentiated the metabolome based on the augmented multi-platform CE-MS and HILIC-MS investigation. As a result, AriumMS is proposed as a powerful tool to improve the accuracy and selectivity of metabolome analysis through the integration of several HILIC-MS/CE-MS techniques.
Affiliation(s)
- Lukas Naumann
  - Department of Chemistry, Aalen University, Beethovenstraße 1, 73430, Aalen, Germany
- Adrian Haun
  - Department of Chemistry, Aalen University, Beethovenstraße 1, 73430, Aalen, Germany
- Alisa Höchsmann
  - Department of Chemistry, Aalen University, Beethovenstraße 1, 73430, Aalen, Germany
- Michael Mohr
  - Department of Chemistry, Aalen University, Beethovenstraße 1, 73430, Aalen, Germany
- Martin Novák
  - Department of Chemistry, Aalen University, Beethovenstraße 1, 73430, Aalen, Germany
- Dirk Flottmann
  - Department of Chemistry, Aalen University, Beethovenstraße 1, 73430, Aalen, Germany
- Christian Neusüß
  - Department of Chemistry, Aalen University, Beethovenstraße 1, 73430, Aalen, Germany
10
Feraud M, O'Brien JW, Samanipour S, Dewapriya P, van Herwerden D, Kaserzon S, Wood I, Rauert C, Thomas KV. InSpectra - A platform for identifying emerging chemical threats. Journal of Hazardous Materials 2023; 455:131486. [PMID: 37172382 DOI: 10.1016/j.jhazmat.2023.131486] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/10/2022] [Revised: 04/20/2023] [Accepted: 04/23/2023] [Indexed: 05/14/2023]
Abstract
Non-target analysis (NTA) employing high-resolution mass spectrometry (HRMS) coupled with liquid chromatography is increasingly being used to identify chemicals of biological relevance. HRMS datasets are large and complex, making the identification of potentially relevant chemicals extremely challenging. As they are recorded in vendor-specific formats, interpreting them often relies on vendor-specific software that may not accommodate advancements in data processing. Here we present InSpectra, a vendor-independent automated platform for the systematic detection of newly identified emerging chemical threats. InSpectra is web-based, open-source/access and modular, providing highly flexible and extensible NTA and suspect screening workflows. As a cloud-based platform, InSpectra exploits parallel computing and big data archiving capabilities, with a focus on sharing and community curation of HRMS data. InSpectra offers a reproducible and transparent approach for the identification, tracking and prioritisation of emerging chemical threats.
Collapse
Affiliation(s)
- Mathieu Feraud
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia
| | - Jake W O'Brien
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia; Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Netherlands.
| | - Saer Samanipour
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia; Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Netherlands; UvA Data Science Center, University of Amsterdam, Netherlands.
| | - Pradeep Dewapriya
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia
| | - Denice van Herwerden
- Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Netherlands
| | - Sarit Kaserzon
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia
| | - Ian Wood
- School of Mathematics and Physics, The University of Queensland, Australia
| | - Cassandra Rauert
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia
| | - Kevin V Thomas
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia
|
11
|
Novák J, Schug KA, Havlíček V. Quantitation of small molecules from liquid chromatography-mass spectrometric accurate mass datasets using CycloBranch. European Journal of Mass Spectrometry (Chichester, England) 2023; 29:102-110. [PMID: 37000628 DOI: 10.1177/14690667231164766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Gaussian and exponentially modified Gaussian functions were incorporated into integrating algorithms used by an open-source, cross-platform tool called CycloBranch. The quantitation is demonstrated on bacterial pyoverdines separated by fine isotope features. Using our algorithm, we can separate the m/z values 694.25802 and 694.26731 (a 0.009 Da difference), where the former belongs to the most intense peak of pyoverdine D (PvdD), and the latter to the second most intense peak of pyoverdine E (PvdE) in the respective isotopic clusters of [M + Fe-H]2+ ions. The areas under chromatographic curves of standards were analyzed for the limit of detection (LOD), limit of quantitation (LOQ), and regression coefficient calculations. The quantitative module returned a LOD and LOQ of 1.4 and 4.3 ng/mL, respectively, for both PvdD and PvdE in human urine. If present and detected in mass spectra, the intensities of user-defined [M + H]+, [M + Na]+, [M + K]+, [M + Fe-H]2+, or other ion types, can be accumulated and used for quantitation. The quantitation result is returned by CycloBranch in seconds or minutes, contrary to an hours-long manual approach, prone to user-born errors originating from necessary copying among various software environments. Native Bruker, Waters, Thermo, txt, mgf, mzML, and mzXML data formats are supported in CycloBranch, which is freely available at https://ms.biomed.cas.cz/cyclobranch.
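The two peak models named in this abstract can be sketched as follows. This is a minimal illustration in Python using only the standard library, not CycloBranch's own code: the function names, the peak parameters (retention time, width, tailing constant) and the trapezoidal integration are assumptions chosen for the example. The last lines work through the arithmetic behind the quoted m/z pair, estimating the resolving power (m/Δm) needed to tell the two values apart.

```python
import math


def gaussian(t, area, mu, sigma):
    """Gaussian peak scaled so that its integral equals `area`."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return area / norm * math.exp(-0.5 * ((t - mu) / sigma) ** 2)


def emg(t, area, mu, sigma, tau):
    """Exponentially modified Gaussian: a Gaussian (mu, sigma) convolved
    with an exponential decay of time constant tau; integral equals `area`."""
    lam = 1.0 / tau
    exponent = 0.5 * lam * (2.0 * mu + lam * sigma ** 2 - 2.0 * t)
    erfc_arg = (mu + lam * sigma ** 2 - t) / (math.sqrt(2.0) * sigma)
    return area * 0.5 * lam * math.exp(exponent) * math.erfc(erfc_arg)


def trapezoid_area(times, intensities):
    """Area under a sampled chromatographic curve (trapezoidal rule)."""
    return sum(
        0.5 * (intensities[i] + intensities[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )


# Hypothetical peak: apex near 5.0 min, sigma 0.05 min, tailing constant 0.2 min.
times = [4.0 + 0.001 * i for i in range(4001)]  # 4.0 to 8.0 min
gauss_area = trapezoid_area(times, [gaussian(t, 1000.0, 5.0, 0.05) for t in times])
emg_area = trapezoid_area(times, [emg(t, 1000.0, 5.0, 0.05, 0.2) for t in times])

# Arithmetic for the m/z pair quoted in the abstract.
delta_mz = 694.26731 - 694.25802          # about 0.009 Da
resolving_power = 694.25802 / delta_mz    # m / delta-m needed to separate them
```

Both models are parameterised by area, so a fitted `area` can be read off directly as the quantity used for calibration; the tailing constant `tau` only changes the shape of the peak.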
Affiliation(s)
- Jiří Novák
- Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic
- Faculty of Information Technology, Czech Technical University in Prague, Prague, Czech Republic
| | - Kevin A Schug
- Department of Chemistry and Biochemistry, The University of Texas Arlington, Arlington, TX, USA
| | - Vladimír Havlíček
- Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic
|
12
|
Deutsch EW, Mendoza L, Shteynberg DD, Hoopmann MR, Sun Z, Eng JK, Moritz RL. Trans-Proteomic Pipeline: Robust Mass Spectrometry-Based Proteomics Data Analysis Suite. J Proteome Res 2023; 22:615-624. [PMID: 36648445 PMCID: PMC10166710 DOI: 10.1021/acs.jproteome.2c00624] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Indexed: 01/18/2023]
Abstract
The Trans-Proteomic Pipeline (TPP) mass spectrometry data analysis suite has been in continual development and refinement since its first tools, PeptideProphet and ProteinProphet, were published 20 years ago. The current release provides a large complement of tools for spectrum processing, spectrum searching, search validation, abundance computation, protein inference, and more. Many of the tools include machine-learning modeling to extract the most information from data sets and build robust statistical models to compute the probabilities that derived information is correct. Here we present the latest information on the many TPP tools, and how TPP can be deployed on various platforms from personal Windows laptops to Linux clusters and expansive cloud computing environments. We describe tutorials on how to use TPP in a variety of ways and describe synergistic projects that leverage TPP. We conclude with plans for continued development of TPP.
Affiliation(s)
- Eric W Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Luis Mendoza
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | | | | | - Zhi Sun
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Jimmy K Eng
- Proteomics Resource, University of Washington, Seattle, Washington 98195, United States
| | - Robert L Moritz
- Institute for Systems Biology, Seattle, Washington 98109, United States
|
13
|
Bilbao A, Ross DH, Lee JY, Donor MT, Williams SM, Zhu Y, Ibrahim YM, Smith RD, Zheng X. MZA: A Data Conversion Tool to Facilitate Software Development and Artificial Intelligence Research in Multidimensional Mass Spectrometry. J Proteome Res 2023; 22:508-513. [PMID: 36414245 PMCID: PMC9898216 DOI: 10.1021/acs.jproteome.2c00313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
Modern mass spectrometry-based workflows employing hybrid instrumentation and orthogonal separations collect multidimensional data, potentially allowing deeper understanding in omics studies through adoption of artificial intelligence methods. However, the large volume of these rich spectra challenges existing data storage and access technologies, therefore precluding informatics advancements. We present MZA (pronounced m-za), the mass-to-charge (m/z) generic data storage and access tool designed to facilitate software development and artificial intelligence research in multidimensional mass spectrometry measurements. Composed of a data conversion tool and a simple file structure based on the HDF5 format, MZA provides easy, cross-platform and cross-programming language access to raw MS-data, enabling fast development of new tools in data science programming languages such as Python and R. The software executable, example MS-data and example Python and R scripts are freely available at https://github.com/PNNL-m-q/mza.
Affiliation(s)
- Aivett Bilbao
- Pacific Northwest National Laboratory, Richland, WA, 99352, USA. Corresponding authors: Aivett Bilbao (Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, 99352, United States) and Xueyun Zheng (Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, 99352, United States)
| | - Dylan H. Ross
- Pacific Northwest National Laboratory, Richland, WA, 99352, USA
| | - Joon-Yong Lee
- Pacific Northwest National Laboratory, Richland, WA, 99352, USA
| | - Micah T. Donor
- Pacific Northwest National Laboratory, Richland, WA, 99352, USA
| | | | - Ying Zhu
- Pacific Northwest National Laboratory, Richland, WA, 99352, USA
| | | | | | - Xueyun Zheng
- Pacific Northwest National Laboratory, Richland, WA, 99352, USA (corresponding author)
|
14
|
Applications of mass spectroscopy in understanding cancer proteomics. Proteomics 2023. [DOI: 10.1016/b978-0-323-95072-5.00007-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023]
|
15
|
Morabito A, De Simone G, Ferrario M, Falcetta F, Pastorelli R, Brunelli L. EASY-FIA: A Readably Usable Standalone Tool for High-Resolution Mass Spectrometry Metabolomics Data Pre-Processing. Metabolites 2022; 13:metabo13010013. [PMID: 36676938 PMCID: PMC9861133 DOI: 10.3390/metabo13010013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 12/15/2022] [Accepted: 12/16/2022] [Indexed: 12/24/2022] Open
Abstract
Flow injection analysis coupled with high-resolution mass spectrometry (FIA-HRMS) is a fair trade-off between resolution and speed. However, the free software available for data pre-processing is scarce, web-based, and often requires advanced user specialization. These tools rarely embed blank and noise evaluation strategies or direct feature annotation. We developed EASY-FIA, a free standalone application that can be employed for FIA-HRMS metabolomic data pre-processing by users with no bioinformatics/programming skills. We validated the tool's performance and applicability in two clinical metabolomics case studies. The main functions of our application are blank subtraction, alignment of the metabolites, and direct feature annotation by means of the Human Metabolome Database (HMDB) using a minimum number of mass spectrometry parameters. In a scenario where FIA-HRMS is increasingly recognized as a reliable strategy for fast metabolomics analysis, EASY-FIA could become a standardized and feasible tool easily usable by all scientists dealing with MS-based metabolomics. EASY-FIA was implemented in MATLAB with the App Designer tool and it is freely available for download.
Affiliation(s)
- Aurelia Morabito
- Laboratory of Mass Spectrometry, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156 Milan, Italy
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133 Milan, Italy
| | - Giulia De Simone
- Laboratory of Mass Spectrometry, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156 Milan, Italy
- Department of Biotechnologies and Biosciences, Università degli Studi Milano Bicocca, 20126 Milan, Italy
| | - Manuela Ferrario
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133 Milan, Italy
| | - Francesca Falcetta
- Unit of Biophysics, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156 Milan, Italy
| | - Roberta Pastorelli
- Laboratory of Mass Spectrometry, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156 Milan, Italy
| | - Laura Brunelli
- Laboratory of Mass Spectrometry, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156 Milan, Italy
- Correspondence: Tel.: +39-0239014742
|
16
|
Bittremieux W, Wang M, Dorrestein PC. The critical role that spectral libraries play in capturing the metabolomics community knowledge. Metabolomics 2022; 18:94. [PMID: 36409434 DOI: 10.1007/s11306-022-01947-y] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 08/01/2022] [Accepted: 10/19/2022] [Indexed: 11/22/2022]
Abstract
BACKGROUND Spectral library searching is currently the most common approach for compound annotation in untargeted metabolomics. Spectral libraries applicable to liquid chromatography mass spectrometry have grown in size over the past decade to include hundreds of thousands to millions of mass spectra and tens of thousands of compounds, forming an essential knowledge base for the interpretation of metabolomics experiments. AIM OF REVIEW We describe existing spectral library resources, highlight different strategies for compiling spectral libraries, and discuss quality considerations that should be taken into account when interpreting spectral library searching results. Finally, we describe how spectral libraries are empowering the next generation of machine learning tools in computational metabolomics, and discuss several opportunities for using increasingly accessible large spectral libraries. KEY SCIENTIFIC CONCEPTS OF REVIEW This review focuses on the current state of spectral libraries for untargeted LC-MS/MS based metabolomics. We show how the number of entries in publicly accessible spectral libraries has increased more than 60-fold in the past eight years to aid molecular interpretation and we discuss how the role of spectral libraries in untargeted metabolomics will evolve in the near future.
Affiliation(s)
- Wout Bittremieux
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
| | - Mingxun Wang
- Department of Computer Science, University of California Riverside, Riverside, CA, 92507, USA
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA.
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA.
|
17
|
Gassaway BM, Li J, Rad R, Mintseris J, Mohler K, Levy T, Aguiar M, Beausoleil SA, Paulo JA, Rinehart J, Huttlin EL, Gygi SP. A multi-purpose, regenerable, proteome-scale, human phosphoserine resource for phosphoproteomics. Nat Methods 2022; 19:1371-1375. [PMID: 36280721 PMCID: PMC9847208 DOI: 10.1038/s41592-022-01638-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 02/03/2022] [Accepted: 09/06/2022] [Indexed: 01/21/2023]
Abstract
Mass-spectrometry-based phosphoproteomics has become indispensable for understanding cellular signaling in complex biological systems. Despite the central role of protein phosphorylation, the field still lacks inexpensive, regenerable, and diverse phosphopeptides with ground-truth phosphorylation positions. Here, we present Iterative Synthetically Phosphorylated Isomers (iSPI), a proteome-scale library of human-derived phosphoserine-containing phosphopeptides that is inexpensive, regenerable, and diverse, with precisely known positions of phosphorylation. We demonstrate possible uses of iSPI, including use as a phosphopeptide standard, a tool to evaluate and optimize phosphorylation-site localization algorithms, and a benchmark to compare performance across data analysis pipelines. We also present AScorePro, an updated version of the AScore algorithm specifically optimized for phosphorylation-site localization in higher energy fragmentation spectra, and the FLR viewer, a web tool for phosphorylation-site localization, to enable community use of the iSPI resource. iSPI and its associated data constitute a useful, multi-purpose resource for the phosphoproteomics community.
Affiliation(s)
- Brandon M. Gassaway
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA. These authors contributed equally: Brandon M. Gassaway, Jiaming Li
| | - Jiaming Li
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA. These authors contributed equally: Brandon M. Gassaway, Jiaming Li
| | - Ramin Rad
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
| | - Julian Mintseris
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
| | - Kyle Mohler
- Department of Cellular and Molecular Physiology and Systems Biology Institute, Yale Medical School, New Haven, CT, USA
| | - Tyler Levy
- Cell Signaling Technology, Danvers, MA, USA
| | | | | | - Joao A. Paulo
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
| | - Jesse Rinehart
- Department of Cellular and Molecular Physiology and Systems Biology Institute, Yale Medical School, New Haven, CT, USA
| | | | - Steven P. Gygi
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA. Correspondence and requests for materials should be addressed to Steven P. Gygi.
|
18
|
Finch JP, Wilson T, Lyons L, Phillips H, Beckmann M, Draper J. Spectral binning as an approach to post-acquisition processing of high resolution FIE-MS metabolome fingerprinting data. Metabolomics 2022; 18:64. [PMID: 35917032 PMCID: PMC9345815 DOI: 10.1007/s11306-022-01923-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2022] [Accepted: 07/16/2022] [Indexed: 12/01/2022]
Abstract
INTRODUCTION Flow infusion electrospray high resolution mass spectrometry (FIE-HRMS) fingerprinting produces complex, high dimensional data sets which require specialist in-silico software tools to process the data prior to analysis. OBJECTIVES Present spectral binning as a pragmatic approach to post-acquisition processing of FIE-HRMS metabolome fingerprinting data. METHODS A spectral binning approach was developed that included the elimination of single scan m/z events, the binning of spectra and the averaging of spectra across the infusion profile. The modal accurate m/z was then extracted for each bin. This approach was assessed using four different biological matrices and a mix of 31 known chemical standards analysed by FIE-HRMS using an Exactive Orbitrap. Bin purity and centrality metrics were developed to objectively assess the distribution and position of accurate m/z within an individual bin, respectively. RESULTS The optimal spectral binning width was found to be 0.01 amu. Of the extracted accurate m/z matched to predicted ionisation products of the chemical standards mix, 80.8% were found to have an error below 3 ppm. The open-source R package binneR was developed as a user-friendly implementation of the approach. This was able to process 100 data files using 4 central processing unit (CPU) workers in only 55 seconds with a maximum memory usage of 1.36 GB. CONCLUSION Spectral binning is a fast and robust method for the post-acquisition processing of FIE-HRMS data. The open-source R package binneR allows users to efficiently process data from FIE-HRMS experiments with the resources available on a standard desktop computer.
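The binning steps listed in the METHODS of this abstract can be sketched roughly as below. This is an illustrative Python sketch, not the binneR implementation (binneR is an R package): the function names, the input layout (a list of scans, each a list of `(m/z, intensity)` pairs), the rule that a bin seen in fewer than two scans counts as a single-scan event, and the choice of rounding accurate m/z to five decimals before taking the intensity-weighted mode are all assumptions made for the example.

```python
from collections import defaultdict


def bin_spectra(scans, width=0.01):
    """Bin centroided scans at `width` amu, drop bins seen in only one scan,
    average intensity over all scans, and report a modal accurate m/z per bin."""
    n_scans = len(scans)
    scans_seen = defaultdict(set)                        # bin -> scan indices
    total_intensity = defaultdict(float)                 # bin -> summed intensity
    mz_weight = defaultdict(lambda: defaultdict(float))  # bin -> m/z -> intensity

    for scan_index, scan in enumerate(scans):
        for mz, intensity in scan:
            key = round(mz / width)                      # integer bin index
            scans_seen[key].add(scan_index)
            total_intensity[key] += intensity
            mz_weight[key][round(mz, 5)] += intensity

    result = []
    for key in sorted(scans_seen):
        if len(scans_seen[key]) < 2:                     # single-scan m/z event
            continue
        modal_mz = max(mz_weight[key], key=mz_weight[key].get)
        result.append((key * width, modal_mz, total_intensity[key] / n_scans))
    return result


def ppm_error(observed_mz, theoretical_mz):
    """Mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical three-scan infusion profile.
scans = [
    [(100.00412, 50.0), (200.10301, 10.0)],
    [(100.00412, 70.0), (150.5, 5.0)],
    [(100.00398, 30.0)],
]
binned = bin_spectra(scans)
```

In this toy input only the bin around m/z 100.00 survives, because the other two signals each appear in a single scan; `ppm_error` can then be applied to the modal m/z of a surviving bin against a predicted ion mass.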
Affiliation(s)
- Jasen P Finch
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, SY23 3DA, UK.
| | - Thomas Wilson
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, SY23 3DA, UK
| | - Laura Lyons
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, SY23 3DA, UK
| | - Helen Phillips
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, SY23 3DA, UK
| | - Manfred Beckmann
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, SY23 3DA, UK
| | - John Draper
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, SY23 3DA, UK
|
19
|
Valle A, Soto Z, Muhamadali H, Hollywood KA, Xu Y, Lloyd JR, Goodacre R, Cantero D, Cabrera G, Bolivar J. Metabolomics for the design of new metabolic engineering strategies for improving aerobic succinic acid production in Escherichia coli. Metabolomics 2022; 18:56. [PMID: 35857216 PMCID: PMC9300530 DOI: 10.1007/s11306-022-01912-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/16/2021] [Accepted: 06/17/2022] [Indexed: 11/24/2022]
Abstract
INTRODUCTION Glycerol is a byproduct from the biodiesel industry that can be biotransformed by Escherichia coli to high added-value products such as succinate under aerobic conditions. The main genetic engineering strategies to achieve this aim involve the mutation of the succinate dehydrogenase (sdhA) gene and also of those responsible for acetate synthesis, including acetate kinase, phosphate acetyl transferase and pyruvate oxidase, encoded by the ackA, pta and pox genes respectively, in the ΔsdhAΔack-ptaΔpox (M4) mutant. Other genetic manipulations to rewire the metabolism toward succinate consist of the activation of the glyoxylate shunt or the blockage of the pentose phosphate pathway (PPP) by deletion of the isocitrate lyase repressor (iclR) or gluconate dehydrogenase (gnd) genes in the M4-ΔiclR and M4-Δgnd mutants, respectively. OBJECTIVE To better understand the effect of blocking the pentose phosphate pathway (PPP) or activating the glyoxylate shunt, metabolite profiles were analyzed in the M4-Δgnd, M4-ΔiclR and M4 mutants. METHODS Metabolomics was performed by FT-IR and GC-MS for metabolite fingerprinting and HPLC for quantification of succinate and glycerol. RESULTS Most of the 65 identified metabolites showed lower relative levels in the M4-ΔiclR and M4-Δgnd mutants than in M4. However, the relative concentrations of fructose 1,6-biphosphate, trehalose, isovaleric acid and mannitol were increased in the M4-ΔiclR and M4-Δgnd mutants. To further improve succinate production, the synthesis of mannitol was suppressed by deletion of mannitol dehydrogenase (mtlD) in the M4-ΔgndΔmtlD mutant, which showed an increase of ~20% with respect to M4-Δgnd. CONCLUSION Metabolomics can serve as a holistic tool to identify bottlenecks in metabolic pathways by a non-rational design. Genetic manipulation to release these restrictions could increase the production of succinate.
Affiliation(s)
- Antonio Valle
- Department of Biomedicine, Biotechnology and Public Health-Biochemistry and Molecular Biology, University of Cadiz, Campus Universitario de Puerto Real, 11510, Puerto Real, Cádiz, Spain.
- Institute of Viticulture and Agri-Food Research (IVAGRO) - International Campus of Excellence (ceiA3), University of Cadiz, 11510, Puerto Real, Cádiz, Spain.
| | - Zamira Soto
- Department of Biomedicine, Biotechnology and Public Health-Biochemistry and Molecular Biology, University of Cadiz, Campus Universitario de Puerto Real, 11510, Puerto Real, Cádiz, Spain
- Department of Chemical Engineering and Food Technology, University of Cadiz, Campus Universitario de Puerto Real, 11510, Puerto Real, Cádiz, Spain
- Faculty of Basic and Biomedical Sciences, Universidad Simón Bolívar, 080020, Barranquilla, Colombia
| | - Howbeer Muhamadali
- School of Chemistry, Manchester Institute of Biotechnology, University of Manchester, Manchester, M1 7DN, UK
- Department of Biochemistry and Systems Biology, Institute of Integrative Systems, Molecular and Integrative Biology, University of Liverpool, Biosciences Building, Crown Street, Liverpool, L69 7ZB, UK
| | - Katherine A Hollywood
- Manchester Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, The University of Manchester, Manchester, M1 7DN, UK
| | - Yun Xu
- School of Chemistry, Manchester Institute of Biotechnology, University of Manchester, Manchester, M1 7DN, UK
- Department of Biochemistry and Systems Biology, Institute of Integrative Systems, Molecular and Integrative Biology, University of Liverpool, Biosciences Building, Crown Street, Liverpool, L69 7ZB, UK
| | - Jonathan R Lloyd
- Williamson Research Centre, School of Earth & Environmental Sciences, University of Manchester, Manchester, M13 9PL, UK
| | - Royston Goodacre
- School of Chemistry, Manchester Institute of Biotechnology, University of Manchester, Manchester, M1 7DN, UK
- Department of Biochemistry and Systems Biology, Institute of Integrative Systems, Molecular and Integrative Biology, University of Liverpool, Biosciences Building, Crown Street, Liverpool, L69 7ZB, UK
| | - Domingo Cantero
- Department of Chemical Engineering and Food Technology, University of Cadiz, Campus Universitario de Puerto Real, 11510, Puerto Real, Cádiz, Spain
- Institute of Viticulture and Agri-Food Research (IVAGRO) - International Campus of Excellence (ceiA3), University of Cadiz, 11510, Puerto Real, Cádiz, Spain
| | - Gema Cabrera
- Department of Chemical Engineering and Food Technology, University of Cadiz, Campus Universitario de Puerto Real, 11510, Puerto Real, Cádiz, Spain
- Institute of Viticulture and Agri-Food Research (IVAGRO) - International Campus of Excellence (ceiA3), University of Cadiz, 11510, Puerto Real, Cádiz, Spain
| | - Jorge Bolivar
- Department of Biomedicine, Biotechnology and Public Health-Biochemistry and Molecular Biology, University of Cadiz, Campus Universitario de Puerto Real, 11510, Puerto Real, Cádiz, Spain.
- Institute of Biomolecules (INBIO), University of Cadiz, 11510, Puerto Real, Cádiz, Spain.
|
20
|
Driessen M, van der Plas-Duivesteijn S, Kienhuis AS, van den Brandhof EJ, Roodbergen M, van de Water B, Spaink HP, Palmblad M, van der Ven LTM, Pennings JLA. Identification of proteome markers for drug-induced liver injury in zebrafish embryos. Toxicology 2022; 477:153262. [PMID: 35868597 DOI: 10.1016/j.tox.2022.153262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2022] [Revised: 06/26/2022] [Accepted: 07/18/2022] [Indexed: 10/17/2022]
Abstract
The zebrafish embryo (ZFE) is a promising alternative non-rodent model in toxicology, and initial studies suggested its applicability in detecting hepatic responses related to drug-induced liver injury (DILI). Here, we hypothesize that detailed analysis of underlying mechanisms of hepatotoxicity in ZFE contributes to the improved identification of hepatotoxic properties of compounds and to the reduction of rodents used for hepatotoxicity assessment. ZFEs were exposed to nine reference hepatotoxicants, targeted at induction of steatosis, cholestasis, and necrosis, and effects compared with negative controls. Protein profiles of the individual compounds were generated using LC-MS/MS. We identified differentially expressed proteins and pathways, but as these showed considerable overlap, phenotype-specific responses could not be distinguished. This led us to identify a set of common hepatotoxicity marker proteins. At the pathway level, these were mainly associated with cellular adaptive stress-responses, whereas single proteins could be linked to common hepatotoxicity-associated processes. Applying several stringency criteria to our proteomics data as well as information from other data sources resulted in a set of potential robust protein markers, notably Igf2bp1, Cox5ba, Ahnak, Itih3b.2, Psma6b, Srsf3a, Ces2b, Ces2a, Tdo2b, and Anxa1c, for the detection of adverse responses.
Affiliation(s)
- Marja Driessen
- Centre for Health Protection, National Institute for Public Health and the Environment (RIVM), P.O.Box 1, 3720 BA Bilthoven, the Netherlands; Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, the Netherlands
| | | | - Anne S Kienhuis
- Centre for Health Protection, National Institute for Public Health and the Environment (RIVM), P.O.Box 1, 3720 BA Bilthoven, the Netherlands
| | - Evert-Jan van den Brandhof
- Centre for Environmental Quality, National Institute for Public Health and the Environment (RIVM), P.O.Box 1, 3720 BA Bilthoven, the Netherlands
| | - Marianne Roodbergen
- Centre for Health Protection, National Institute for Public Health and the Environment (RIVM), P.O.Box 1, 3720 BA Bilthoven, the Netherlands; Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, the Netherlands
| | - Bob van de Water
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, the Netherlands
| | - Herman P Spaink
- Institute of Biology, Leiden University, Einsteinweg 55, 2333 CC Leiden, the Netherlands
| | - Magnus Palmblad
- Center for Proteomics and Metabolomics, Leiden University Medical Center, Leiden, the Netherlands
| | - Leo T M van der Ven
- Centre for Health Protection, National Institute for Public Health and the Environment (RIVM), P.O.Box 1, 3720 BA Bilthoven, the Netherlands
| | - Jeroen L A Pennings
- Centre for Health Protection, National Institute for Public Health and the Environment (RIVM), P.O.Box 1, 3720 BA Bilthoven, the Netherlands.
|
21
|
Duong VA, Park JM, Lee H. A review of suspension trapping digestion method in bottom-up proteomics. J Sep Sci 2022; 45:3150-3168. [PMID: 35770343 DOI: 10.1002/jssc.202200297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2022] [Revised: 06/23/2022] [Accepted: 06/27/2022] [Indexed: 11/05/2022]
Abstract
The standard bottom-up proteomic workflow is comprised of sample preparation, data acquisition, and data analysis. While the latter two parts have made considerable advances in the last decade, sample preparation has remained an important challenge within the workflow due to the multi-step nature of complex biological samples, and still requires much development. Several sample preparation methods have been developed and used in the last two decades, including in-gel, in-solution, on-bead, filter-aided sample preparation, and suspension trapping, to improve reproducibility, efficiency, scalability, and reduce handling time of this process. One of the most recent methods developed and applied in proteomics studies in recent years is suspension trapping, which combines rapid detergent removal, reactor-type protein digestion, and peptide clean-up in a tip or spin column. Suspension trapping is a simple, rapid, and reproducible digestion method that can effectively handle proteins in low microgram or sub-microgram amounts. This review discusses the benefits of the suspension trapping digestion method in relation to its development and application in bottom-up proteomics studies. We also discuss recent applications of suspension trapping digestion to different sample types and the features of the suspension trapping digestion method compared with other sample preparation methods.
Affiliation(s)
- Van-An Duong
- College of Pharmacy, Gachon University, Incheon, 21936, South Korea
| | - Jong-Moon Park
- College of Pharmacy, Gachon University, Incheon, 21936, South Korea
| | - Hookeun Lee
- College of Pharmacy, Gachon University, Incheon, 21936, South Korea
|
22
|
Hoffmann N, Mayer G, Has C, Kopczynski D, Al Machot F, Schwudke D, Ahrends R, Marcus K, Eisenacher M, Turewicz M. A Current Encyclopedia of Bioinformatics Tools, Data Formats and Resources for Mass Spectrometry Lipidomics. Metabolites 2022; 12:metabo12070584. [PMID: 35888710 PMCID: PMC9319858 DOI: 10.3390/metabo12070584] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/25/2022] [Revised: 06/17/2022] [Accepted: 06/19/2022] [Indexed: 12/13/2022] Open
Abstract
Mass spectrometry is a widely used technology to identify and quantify biomolecules such as lipids, metabolites and proteins necessary for biomedical research. In this study, we catalogued freely available software tools, libraries, databases, repositories and resources that support lipidomics data analysis and determined the scope of currently used analytical technologies. Because of the tremendous importance of data interoperability, we assessed the support of standardized data formats in mass spectrometric (MS)-based lipidomics workflows. We included tools in our comparison that support targeted as well as untargeted analysis using direct infusion/shotgun (DI-MS), liquid chromatography−mass spectrometry, ion mobility or MS imaging approaches on MS1 and potentially higher MS levels. As a result, we determined that the Human Proteome Organization-Proteomics Standards Initiative standard data formats, mzML and mzTab-M, are already supported by a substantial number of recent software tools. We further discuss how mzTab-M can serve as a bridge between data acquisition and lipid bioinformatics tools for interpretation, capturing their output and transmitting rich annotated data for downstream processing. However, we identified several challenges of currently available tools and standards. Potential areas for improvement were: adaptation of common nomenclature and standardized reporting to enable high throughput lipidomics and improve its data handling. Finally, we suggest specific areas where tools and repositories need to improve to become FAIRer.
Affiliation(s)
- Nils Hoffmann
  - Forschungszentrum Jülich GmbH, Institute for Bio- and Geosciences (IBG-5), 52425 Jülich, Germany
  - Correspondence: (N.H.); (M.T.); Tel.: +49-(0)521-106-86780 (N.H.)
- Gerhard Mayer
  - Institute of Medical Systems Biology, Ulm University, 89081 Ulm, Germany
- Canan Has
  - Biological Mass Spectrometry, Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
  - University Hospital Carl Gustav Carus, 01307 Dresden, Germany
  - CENTOGENE GmbH, 18055 Rostock, Germany
- Dominik Kopczynski
  - Department of Analytical Chemistry, University of Vienna, 1090 Vienna, Austria; (D.K.); (R.A.)
- Fadi Al Machot
  - Faculty of Science and Technology, Norwegian University for Life Science (NMBU), 1433 Ås, Norway
- Dominik Schwudke
  - Bioanalytical Chemistry, Forschungszentrum Borstel, Leibniz Lung Center, 23845 Borstel, Germany
  - Airway Research Center North, German Center for Lung Research (DZL), 23845 Borstel, Germany
  - German Center for Infection Research (DZIF), TTU Tuberculosis, 23845 Borstel, Germany
- Robert Ahrends
  - Department of Analytical Chemistry, University of Vienna, 1090 Vienna, Austria; (D.K.); (R.A.)
- Katrin Marcus
  - Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Ruhr University Bochum, 44801 Bochum, Germany; (K.M.); (M.E.)
- Martin Eisenacher
  - Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Ruhr University Bochum, 44801 Bochum, Germany; (K.M.); (M.E.)
  - Faculty of Medicine, Medizinisches Proteom-Center, Ruhr University Bochum, 44801 Bochum, Germany
- Michael Turewicz
  - Institute for Clinical Biochemistry and Pathobiochemistry, German Diabetes Center (DDZ), Leibniz Center for Diabetes Research at Heinrich-Heine-University Düsseldorf, 40225 Düsseldorf, Germany
  - German Center for Diabetes Research (DZD), Partner Düsseldorf, 85764 Neuherberg, Germany
  - Correspondence: (N.H.); (M.T.); Tel.: +49-(0)521-106-86780 (N.H.)
23
Pinter N, Glätzer D, Fahrner M, Fröhlich K, Johnson J, Grüning BA, Warscheid B, Drepper F, Schilling O, Föll MC. MaxQuant and MSstats in Galaxy Enable Reproducible Cloud-Based Analysis of Quantitative Proteomics Experiments for Everyone. J Proteome Res 2022; 21:1558-1565. [PMID: 35503992 DOI: 10.1021/acs.jproteome.2c00051] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/29/2022]
Abstract
Quantitative mass spectrometry-based proteomics has become a high-throughput technology for the identification and quantification of thousands of proteins in complex biological samples. Two frequently used tools, MaxQuant and MSstats, allow for the analysis of raw data and finding proteins with differential abundance between conditions of interest. To enable accessible and reproducible quantitative proteomics analyses in a cloud environment, we have integrated MaxQuant (including TMTpro 16/18plex), Proteomics Quality Control (PTXQC), MSstats, and MSstatsTMT into the open-source Galaxy framework. This enables the web-based analysis of label-free and isobaric labeling proteomics experiments via Galaxy's graphical user interface on public clouds. MaxQuant and MSstats in Galaxy can be applied in conjunction with thousands of existing Galaxy tools and integrated into standardized, sharable workflows. Galaxy tracks all metadata and intermediate results in analysis histories, which can be shared privately for collaborations or publicly, allowing full reproducibility and transparency of published analysis. To further increase accessibility, we provide detailed hands-on training materials. The integration of MaxQuant and MSstats into the Galaxy framework enables their usage in a reproducible way on accessible large computational infrastructures, hence realizing the foundation for high-throughput proteomics data science for everyone.
Affiliation(s)
- Niko Pinter
  - Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany
- Damian Glätzer
  - Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Matthias Fahrner
  - Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany
  - Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Klemens Fröhlich
  - Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany
  - Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
  - Spemann Graduate School of Biology and Medicine (SGBM), Albert-Ludwigs-University Freiburg, 79104 Freiburg, Germany
- James Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Bettina Warscheid
  - Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
  - Faculty of Chemistry and Pharmacy, Department of Biochemistry, Julius Maximilian University of Würzburg, 97074 Würzburg, Germany
- Friedel Drepper
  - Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Oliver Schilling
  - Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany
  - German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), 79106 Freiburg, Germany
- Melanie Christine Föll
  - Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany
  - Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts 02115, United States
24
Shang Z, Tian Y, Yi Y, Li K, Qiao X, Ye M. Comparative bioactivity evaluation and chemical profiling of different parts of the medicinal plant Glycyrrhiza uralensis. J Pharm Biomed Anal 2022; 215:114793. [PMID: 35489249 DOI: 10.1016/j.jpba.2022.114793] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/18/2022] [Revised: 04/16/2022] [Accepted: 04/20/2022] [Indexed: 11/19/2022]
Abstract
Glycyrrhiza uralensis is a popular medicinal plant worldwide. Its roots and rhizomes are used as the traditional Chinese medicine Gan-Cao. However, little is known on medicinal potential and chemistry of the other parts of the plant. In this work, the biological activities and chemical components of the roots, stems, leaves, and seeds of G. uralensis were investigated comparatively. The four parts exhibited different but noticeable biological activities. The chemicals in the four parts were globally characterized by liquid chromatography coupled with mass spectrometry (LC/MS) on a Thermo Vanquish UHPLC system connected to a Q-Exactive quadrupole Orbitrap mass spectrometer. By integrating molecular networking, compound spectral matching, MS2LDA-based substructure recognition, and reference standards comparison, a total of 1301 compounds were rapidly characterized. Three flavonoid C-glycosides were purified and their structures were identified by NMR spectroscopic analysis. Orthogonal partial least squares-discriminate analysis (OPLS-DA) further revealed 196 differential chemicals for the four parts. This work will promote the medicinal resource utilization of G. uralensis.
Affiliation(s)
- Zhanpeng Shang
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, 38 Xueyuan Road, Beijing 100191, China
- Yungang Tian
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, 38 Xueyuan Road, Beijing 100191, China
- Yang Yi
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, 38 Xueyuan Road, Beijing 100191, China
- Kai Li
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, 38 Xueyuan Road, Beijing 100191, China
- Xue Qiao
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, 38 Xueyuan Road, Beijing 100191, China
- Min Ye
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, 38 Xueyuan Road, Beijing 100191, China
  - Yunnan Baiyao International Medical Research Center, Peking University, 38 Xueyuan Road, Beijing 100191, China
25
Yang R, Ma J, Zhang S, Zheng Y, Wang L, Zhu D. mzMD: visualization-oriented MS data storage and retrieval. Bioinformatics 2022; 38:2333-2340. [PMID: 35171986 DOI: 10.1093/bioinformatics/btac098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2021] [Revised: 01/23/2022] [Accepted: 02/14/2022] [Indexed: 02/03/2023]
Abstract
Motivation: Drawing peaks in a data window of an MS dataset happens at all time in MS data visualization applications. This asks to retrieve from an MS dataset some selected peaks in a data window whose image in a display window reflects the visual feature of all peaks in the data window. If an algorithm for this purpose is asked to output high-quality solutions in real time, then the most fundamental dependence of it is on the storage format of the MS dataset.
Results: We present mzMD, a new storage format of MS datasets and an algorithm to query this format of a storage system for a summary (a set of selected representative peaks) of a given data window. We propose a criterion Q-score to examine the quality of data window summaries. Experimental statistics on real MS datasets verified the high speed of mzMD in retrieving high-quality data window summaries. mzMD reported summaries of data windows whose Q-score outperforms those mzTree reported. The query speed of mzMD is the same as that of mzTree whereas its query speed stability is better than that of mzTree.
Availability and implementation: The source code is freely available at https://github.com/yrm9837/mzMD-java.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Runmin Yang
  - School of Computer Science and Technology, Shandong University, Qingdao 266237, China
- Jingjing Ma
  - School of Computer Science and Technology, Shandong University, Qingdao 266237, China
- Shu Zhang
  - School of Computer Science and Technology, Shandong University, Qingdao 266237, China
- Yu Zheng
  - School of Computer Science and Technology, Shandong University, Qingdao 266237, China
- Lusheng Wang
  - Department of Computer Science, City University of Hong Kong, Hong Kong, China
  - City University of Hong Kong Shenzhen Research Institute, Shenzhen 518057, China
- Daming Zhu
  - School of Computer Science and Technology, Shandong University, Qingdao 266237, China
26
Minkus S, Bieber S, Letzel T. Spotlight on mass spectrometric non-target screening analysis: Advanced data processing methods recently communicated for extracting, prioritizing and quantifying features. Anal Sci Adv 2022; 3:103-112. [PMID: 38715638 PMCID: PMC10989605 DOI: 10.1002/ansa.202200001] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/01/2022] [Revised: 03/22/2022] [Accepted: 03/24/2022] [Indexed: 06/13/2024]
Abstract
Non-target screening of trace organic compounds complements routine monitoring of water bodies. So-called features need to be extracted from the raw data that preferably represent a chemical compound. Relevant features need to be prioritized and further be interpreted, for instance by identifying them. Finally, quantitative data is required to assess the risks of a detected compound. This review presents recent and noteworthy contributions to the processing of non-target screening (NTS) data, prioritization of features as well as (semi-) quantitative methods that do not require analytical standards. The focus lies on environmental water samples measured by liquid chromatography, electrospray ionization and high-resolution mass spectrometry. Examples for fully-integrated data processing workflows are given with options for parameter optimization and choosing between different feature extraction algorithms to increase feature coverage. The regions of interest-multivariate curve resolution method is reviewed which combines a data compression alternative with chemometric feature extraction. Furthermore, prioritization strategies based on a confined chemical space for annotation, guidance by targeted analysis and signal intensity are presented. Exploiting the retention time (RT) as diagnostic evidence for NTS investigations is highlighted by discussing RT indexing and prediction using quantitative structure-retention relationship models. Finally, a seminal technology for quantitative NTS is discussed without the need for analytical standards based on predicting ionization efficiencies.
Affiliation(s)
- Susanne Minkus
  - AFIN-TS GmbH, Augsburg, Germany
  - Technical University of Munich (Chair of Urban Water Systems Engineering), Munich, Germany
- Thomas Letzel
  - AFIN-TS GmbH, Augsburg, Germany
  - Technical University of Munich (Chair of Urban Water Systems Engineering), Munich, Germany
27
StackZDPD: a novel encoding scheme for mass spectrometry data optimized for speed and compression ratio. Sci Rep 2022; 12:5384. [PMID: 35354909 PMCID: PMC8967824 DOI: 10.1038/s41598-022-09432-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2021] [Accepted: 03/23/2022] [Indexed: 11/29/2022]
Abstract
As the pervasive, standardized format for interchange and deposition of raw mass spectrometry (MS) proteomics and metabolomics data, text-based mzML is inefficiently utilized on various analysis platforms due to its sheer volume of samples and limited read/write speed. Most research on compression algorithms rarely provides flexible random file reading scheme. Database-developed solution guarantees the efficiency of random file reading, but nevertheless the efforts in compression and third-party software support are insufficient. Under the premise of ensuring the efficiency of decompression, we propose an encoding scheme “Stack-ZDPD” that is optimized for storage of raw MS data, designed for the format “Aird”, a computation-oriented format with fast accessing and decoding time, where the core compression algorithm is “ZDPD”. Stack-ZDPD reduces the volume of data stored in mzML format by around 80% or more, depending on the data acquisition pattern, and the compression ratio is approximately 30% compared to ZDPD for data generated using Time of Flight technology. Our approach is available on AirdPro, for file conversion and the Java-API Aird-SDK, for data parsing.
28
Mao J, Zhu H, Liu L, Fang Z, Dong M, Qin H, Ye M. MS-Decipher: a user-friendly proteome database search software with an emphasis on deciphering the spectra of O-linked glycopeptides. Bioinformatics 2022; 38:1911-1919. [PMID: 35020790 DOI: 10.1093/bioinformatics/btac014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/03/2021] [Revised: 12/29/2021] [Accepted: 01/08/2022] [Indexed: 02/03/2023]
Abstract
Motivation: The interpretation of mass spectrometry (MS) data is a crucial step in proteomics analysis, and the identification of post-translational modifications (PTMs) is vital for the understanding of the regulation mechanism of the living system. Among various PTMs, glycosylation is one of the most diverse ones. Though many search engines have been developed to decipher proteomic data, some of them are difficult to operate and have poor performance on glycoproteomic datasets compared to advanced glycoproteomic software.
Results: To simplify the analysis of proteomic datasets, especially O-glycoproteomic datasets, here, we present a user-friendly proteomic database search platform, MS-Decipher, for the identification of peptides from MS data. Two scoring schemes can be chosen for peptide-spectra matching. It was found that MS-Decipher had the same sensitivity and confidence in peptide identification compared to traditional database searching software. In addition, a special search mode, O-Search, is integrated into MS-Decipher to identify O-glycopeptides for O-glycoproteomic analysis. Compared with Mascot, MetaMorpheus and MSFragger, MS-Decipher can obtain about 139.9%, 48.8% and 6.9% more O-glycopeptide-spectrum matches. A useful tool is provided in MS-Decipher for the visualization of O-glycopeptide-spectra matches. MS-Decipher has a user-friendly graphical user interface, making it easier to operate. Several file formats are available in the searching and validation steps. MS-Decipher is implemented with Java, and can be used cross-platform.
Availability and implementation: MS-Decipher is freely available at https://github.com/DICP-1809/MS-Decipher for academic use. For detailed implementation steps, please see the user guide.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jiawei Mao
  - CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Science, Dalian 116023, China
- He Zhu
  - CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Science, Dalian 116023, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Luyao Liu
  - CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Science, Dalian 116023, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Zheng Fang
  - CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Science, Dalian 116023, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Mingming Dong
  - CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Science, Dalian 116023, China
  - School of Bioengineering, Dalian University of Technology, Dalian 116024, China
- Hongqiang Qin
  - CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Science, Dalian 116023, China
- Mingliang Ye
  - CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Science, Dalian 116023, China
29
Structural basis for safe and efficient energy conversion in a respiratory supercomplex. Nat Commun 2022; 13:545. [PMID: 35087070 PMCID: PMC8795186 DOI: 10.1038/s41467-022-28179-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/16/2021] [Accepted: 01/10/2022] [Indexed: 12/19/2022]
Abstract
Proton-translocating respiratory complexes assemble into supercomplexes that are proposed to increase the efficiency of energy conversion and limit the production of harmful reactive oxygen species during aerobic cellular respiration. Cytochrome bc complexes and cytochrome aa3 oxidases are major drivers of the proton motive force that fuels ATP generation via respiration, but how wasteful electron- and proton transfer is controlled to enhance safety and efficiency in the context of supercomplexes is not known. Here, we address this question with the 2.8 Å resolution cryo-EM structure of the cytochrome bcc-aa3 (III2-IV2) supercomplex from the actinobacterium Corynebacterium glutamicum. Menaquinone, substrate mimics, lycopene, an unexpected Qc site, dioxygen, proton transfer routes, and conformational states of key protonable residues are resolved. Our results show how safe and efficient energy conversion is achieved in a respiratory supercomplex through controlled electron and proton transfer. The structure may guide the rational design of drugs against actinobacteria that cause diphtheria and tuberculosis. Aerobic energy metabolism is driven by proton-pumping respiratory supercomplexes. The study reports the structural basis for energy conversion in such supercomplex. It may aid metabolic engineering and drug design against diphtheria and tuberculosis.
30
Lu M, An S, Wang R, Wang J, Yu C. Aird: a computation-oriented mass spectrometry data format enables a higher compression ratio and less decoding time. BMC Bioinformatics 2022; 23:35. [PMID: 35021987 PMCID: PMC8756627 DOI: 10.1186/s12859-021-04490-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/15/2020] [Accepted: 11/23/2021] [Indexed: 12/27/2022]
Abstract
Background: With the precision of the mass spectrometry (MS) going higher, the MS file size increases rapidly. Beyond the widely-used open format mzML, near-lossless or lossless compression algorithms and formats emerged in scenarios with different precision requirements. The data precision is often related to the instrument and subsequent processing algorithms. Unlike storage-oriented formats, which focus more on lossless compression rate, computation-oriented formats concentrate as much on decoding speed as the compression rate.
Results: Here we introduce “Aird”, an opensource and computation-oriented format with controllable precision, flexible indexing strategies, and high compression rate. Aird provides a novel compressor called Zlib-Diff-PforDelta (ZDPD) for m/z data. Compared with Zlib only, m/z data size is about 55% lower in Aird average. With the high-speed decoding and encoding performance of the single instruction multiple data technology used in the ZDPD, Aird merely takes 33% decoding time compared with Zlib. We have downloaded seven datasets from ProteomeXchange and Metabolights. They are from different SCIEX, Thermo, and Agilent instruments. Then we convert the raw data into mzML, mgf, and mz5 file formats by MSConvert and compare them with Aird format. Aird uses JavaScript Object Notation for metadata storage. Aird-SDK is written in Java, and AirdPro is a GUI client for vendor file converting written in C#. They are freely available at https://github.com/CSi-Studio/Aird-SDK and https://github.com/CSi-Studio/AirdPro.
Conclusions: With the innovation of MS acquisition mode, MS data characteristics are also constantly changing. New data features can bring more effective compression methods and new index modes to achieve high search performance. The MS data storage mode will also become professional and customized. ZDPD uses multiple MS digital features, and researchers also can use it in other formats like mzML. Aird is designed to become a computing-oriented data format with high scalability, compression rate, and fast decoding speed.
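As a rough illustration of why a difference step helps before general-purpose compression of m/z data, the sketch below quantizes sorted m/z values to integers at a chosen number of decimals, delta-encodes them, and compresses the result with zlib. This is a simplified sketch under stated assumptions, not Aird's ZDPD implementation: it omits the PforDelta and SIMD stages entirely, and the function names `encode_mz` and `decode_mz` and the `decimals` parameter are hypothetical and not part of Aird-SDK.

```python
import struct
import zlib


def encode_mz(mz_values, decimals=4):
    """Quantize sorted m/z values, delta-encode them, then zlib-compress.

    Hypothetical helper for illustration; not the Aird-SDK API.
    """
    scale = 10 ** decimals
    # Controllable precision: keep `decimals` digits as integers.
    ints = [round(mz * scale) for mz in mz_values]
    # Difference step: sorted values yield small, repetitive deltas.
    deltas = []
    previous = 0
    for value in ints:
        deltas.append(value - previous)
        previous = value
    # Pack as little-endian signed 64-bit integers before compressing.
    raw = struct.pack(f"<{len(deltas)}q", *deltas)
    return zlib.compress(raw)


def decode_mz(blob, decimals=4):
    """Invert encode_mz: decompress, undo the deltas, rescale to floats."""
    scale = 10 ** decimals
    raw = zlib.decompress(blob)
    deltas = struct.unpack(f"<{len(raw) // 8}q", raw)
    values = []
    running = 0
    for delta in deltas:
        running += delta
        values.append(running / scale)
    return values
```

The round trip is lossy only up to the chosen precision: `decode_mz(encode_mz(values))` returns values within half a unit of the last retained decimal, which mirrors the abstract's notion of controllable precision without reproducing its actual codec.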
31
Paggi RA, Albaum SP, Poetsch A, Cerletti M. Proteome Turnover Analysis in Haloferax volcanii by a Heavy Isotope Multilabeling Approach. Methods Mol Biol 2022; 2522:267-286. [PMID: 36125756 DOI: 10.1007/978-1-0716-2445-6_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
The cellular protein repertoire is highly dynamic and responsive to internal or external stimuli. Its changes are largely the consequence of the combination of protein synthesis and degradation, referred collectively as protein turnover. Different proteomics techniques have been developed to determine the whole proteome turnover of a cell, but very few have been applied to archaea. In this chapter we describe a heavy isotope multilabeling method that allowed the successful analysis of relative protein synthesis and degradation rates on the proteome scale of the halophilic archaeon Haloferax volcanii. This method combines 15N and 13C isotope metabolic labeling with high-resolution mass spectrometry and data analysis tools (QuPE web-based platform) and could be applied to different archaea.
Affiliation(s)
- Roberto A Paggi
  - Instituto de Investigaciones Biológicas, FCEyN, Universidad Nacional de Mar del Plata (UNMDP), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Mar del Plata, Argentina
- Stefan P Albaum
  - Bioinformatics Resource Facility, Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
- Ansgar Poetsch
  - College of Marine Life Sciences, Ocean University of China, Qingdao, China
  - Queen Mary School, Medical College, Nanchang University, Nanchang, China
  - Plant Biochemistry, Ruhr University Bochum, Bochum, Germany
- Micaela Cerletti
  - Instituto de Investigaciones Biológicas, FCEyN, Universidad Nacional de Mar del Plata (UNMDP), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Mar del Plata, Argentina
32
Range J, Halupczok C, Lohmann J, Swainston N, Kettner C, Bergmann FT, Weidemann A, Wittig U, Schnell S, Pleiss J. EnzymeML-a data exchange format for biocatalysis and enzymology. FEBS J 2021; 289:5864-5874. [PMID: 34890097 DOI: 10.1111/febs.16318] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/22/2021] [Revised: 11/15/2021] [Accepted: 12/09/2021] [Indexed: 11/30/2022]
Abstract
EnzymeML is an XML-based data exchange format that supports the comprehensive documentation of enzymatic data by describing reaction conditions, time courses of substrate and product concentrations, the kinetic model, and the estimated kinetic constants. EnzymeML is based on the Systems Biology Markup Language, which was extended by implementing the STRENDA Guidelines. An EnzymeML document serves as a container to transfer data between experimental platforms, modeling tools, and databases. EnzymeML supports the scientific community by introducing a standardized data exchange format to make enzymatic data findable, accessible, interoperable, and reusable according to the FAIR data principles. An application programming interface in Python supports the integration of software tools for data acquisition, data analysis, and publication. The feasibility of a seamless data flow using EnzymeML is demonstrated by creating an EnzymeML document from a structured spreadsheet or from a STRENDA DB database entry, by kinetic modeling using the modeling platform COPASI, and by uploading to the enzymatic reaction kinetics database SABIO-RK.
Affiliation(s)
- Jan Range
  - Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Germany
- Colin Halupczok
  - Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Germany
- Jens Lohmann
  - Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Germany
- Neil Swainston
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, UK
- Ulrike Wittig
  - Heidelberg Institute for Theoretical Studies, Germany
- Santiago Schnell
  - Department of Molecular & Integrative Physiology, University of Michigan Medical School, Ann Arbor, MI, USA
  - Department of Computational Medicine & Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
- Jürgen Pleiss
  - Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Germany
33
Bioinformatics in Lipidomics: Automating Large-Scale LC-MS-Based Untargeted Lipidomics Profiling with SimLipid Software. Methods Mol Biol 2021. [PMID: 34786685 DOI: 10.1007/978-1-0716-1822-6_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Liquid chromatography-mass spectrometry (LC-MS) provides one of the most popular platforms for untargeted plant lipidomics analysis (Shulaev and Chapman, Biochim Biophys Acta 1862(8):786-791, 2017; Rupasinghe and Roessner, Methods Mol Biol 1778:125-135, 2018; Welti et al., Front Biosci 12:2494-506, 2007; Shiva et al., Plant Methods 14:14, 2018). We have developed SimLipid software in order to streamline the analysis of large-volume datasets generated by LC-MS-based untargeted lipidomics methods. SimLipid contains a customizable library of lipid species; graphical user interfaces (GUIs) for visualization of raw data; the identified lipid molecules and their associated mass spectra annotated with fragment ions and parent ions; and detailed information of each identified lipid species all in a single workbench enabling users to rapidly review the results by examining the data for confident identifications of lipid molecular species. In this chapter, we present the functionality of the software and workflow for automating large-scale LC-MS-based untargeted lipidomics profiling.
34
Gao J, Liu Y, Yang F, Chen X, Cravatt BF, Wang C. CIMAGE2.0: An Expanded Tool for Quantitative Analysis of Activity-Based Protein Profiling (ABPP) Data. J Proteome Res 2021; 20:4893-4900. [PMID: 34495668 DOI: 10.1021/acs.jproteome.1c00455] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 11/29/2022]
Abstract
Activity-based protein profiling (ABPP) is a powerful chemical proteomic method for studying protein activity, modifications, and interactions in a high-throughput manner. In ABPP experiments, accurate quantification is crucial to determine the extent of probe labeling at the level of either target proteins or specific amino acid side chains. CIMAGE has been developed as an in-house quantification software specifically designed for ABPP data analysis that incorporates (1) a relaxed peak extraction algorithm and (2) stringent post-quantification checks for efficient and accurate quantification. It also can generate table and image data for users to conveniently visualize their results. Here we provide a retrospective introduction of the software and describe our recent upgrade efforts to enable (1) interfacing with different database search engines as input, (2) triplex quantification of ABPP data by reductive dimethylation, and (3) envelope checking for chemical elements with special isotopic distributions. We show that the updated CIMAGE can maintain its ability to quantify ABPP data with dramatic depth and high accuracy, and it also has similar quantification performance in benchmarked SILAC data as compared with MaxQuant. We believe that CIMAGE2.0 will continue to serve as a powerful analytical tool for ABPP studies.
Affiliation(s)
- Jinjun Gao
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China; Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Yuan Liu
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Fan Yang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Xuemin Chen
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Benjamin F Cravatt
- Department of Chemical Physiology, The Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, California 92037, United States
- Chu Wang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China; Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
|
35
|
Hanau F, Röst H, Ochoa I. mspack: efficient lossless and lossy mass spectrometry data compression. Bioinformatics 2021; 37:3923-3925. [PMID: 34478503 DOI: 10.1093/bioinformatics/btab636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2021] [Revised: 08/16/2021] [Accepted: 09/01/2021] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Mass spectrometry data, used for proteomics and metabolomics analyses, have seen considerable growth in the last years. Aiming at reducing the associated storage costs, dedicated compression algorithms for Mass Spectrometry (MS) data have been proposed, such as MassComp and MSNumpress. However, these algorithms focus on either lossless or lossy compression, respectively, and do not exploit the additional redundancy existing across scans contained in a single file. We introduce mspack, a compression algorithm for MS data that exploits this additional redundancy and that supports both lossless and lossy compression, as well as the mzML and the legacy mzXML formats. mspack applies several preprocessing lossless transforms and optional lossy transforms with a configurable error, followed by the general purpose compressors gzip or bsc to achieve a higher compression ratio. RESULTS: We tested mspack on several datasets generated by commonly used mass spectrometry instruments. When used with the bsc compression backend, mspack achieves on average 76% smaller file sizes for lossless compression and 94% smaller file sizes for lossy compression, as compared to the original files. Lossless mspack achieves 10-60% lower file sizes than MassComp, and lossy mspack compresses 36-60% better than the lossy MSNumpress, for the same error, while exhibiting comparable accuracy and running time. AVAILABILITY: mspack is implemented in C++ and freely available at https://github.com/fhanau/mspack under the Apache license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
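The general pattern described in this abstract (an optional lossy transform with a configurable error, a lossless preprocessing transform, then a general-purpose compressor) can be illustrated with a small sketch. This is not mspack's code or its actual transforms: delta coding and uniform quantization are stand-ins chosen for illustration, and the function names (`quantize`, `delta_encode`, `compress_ints`, and so on) are hypothetical.

```python
import gzip
import struct


def quantize(values, max_error):
    """Lossy step (assumed for illustration): snap floats to a grid of
    width 2 * max_error, so each value moves by at most max_error."""
    step = 2.0 * max_error
    return [round(v / step) for v in values]


def dequantize(ints, max_error):
    """Invert quantize() up to the configured error bound."""
    step = 2.0 * max_error
    return [i * step for i in ints]


def delta_encode(ints):
    """Lossless step: store each integer as its difference from the previous one."""
    out, prev = [], 0
    for v in ints:
        out.append(v - prev)
        prev = v
    return out


def delta_decode(deltas):
    """Invert delta_encode() with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def pack_ints(ints):
    """Serialize integers as little-endian signed 64-bit values."""
    return struct.pack(f"<{len(ints)}q", *ints)


def unpack_ints(raw):
    """Invert pack_ints()."""
    return list(struct.unpack(f"<{len(raw) // 8}q", raw))


def compress_ints(ints):
    """Delta-encode, serialize, then hand the bytes to gzip."""
    return gzip.compress(pack_ints(delta_encode(ints)))


def decompress_ints(blob):
    """Invert compress_ints() exactly."""
    return delta_decode(unpack_ints(gzip.decompress(blob)))
```

On a sorted, regularly spaced integer array, the delta-encoded stream is highly repetitive, so gzip has much more redundancy to exploit than in the raw values. The quantization step is the only place information is lost, and its error bound is set by the caller.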
Affiliation(s)
- Felix Hanau
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Hannes Röst
- Department of Molecular Genetics, Donnelly Center, University of Toronto, Toronto, ON M5S 3E1, Canada
- Idoia Ochoa
- Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Electrical Engineering, University of Navarra, Tecnun, Donostia 20018, Spain
|
36
|
Sánchez Brotons A, Eriksson JO, Kwiatkowski M, Wolters JC, Kema IP, Barcaru A, Kuipers F, Bakker SJL, Bischoff R, Suits F, Horvatovich P. Pipelines and Systems for Threshold-Avoiding Quantification of LC-MS/MS Data. Anal Chem 2021; 93:11215-11224. [PMID: 34355890 PMCID: PMC8374884 DOI: 10.1021/acs.analchem.1c01892] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/25/2022]
Abstract
The accurate processing of complex liquid chromatography coupled to tandem mass spectrometry (LC–MS/MS) data from biological samples is a major challenge for metabolomics, proteomics, and related approaches. Here, we present the pipelines and systems for threshold-avoiding quantification (PASTAQ) LC–MS/MS preprocessing toolset, which allows highly accurate quantification of data-dependent acquisition LC–MS/MS datasets. PASTAQ performs compound quantification using single-stage (MS1) data and implements novel algorithms for high-performance and accurate quantification, retention time alignment, feature detection, and linking annotations from multiple identification engines. PASTAQ offers straightforward parameterization and automatic generation of quality control plots for data and preprocessing assessment. This design results in smaller variance when analyzing replicates of proteomes mixed with known ratios and allows the detection of peptides over a larger dynamic concentration range compared to widely used proteomics preprocessing tools. The performance of the pipeline is also demonstrated in a biological human serum dataset for the identification of gender-related proteins.
Affiliation(s)
- Alejandro Sánchez Brotons
- Department of Analytical Biochemistry, Groningen Research Institute of Pharmacy, University of Groningen, 9713 AV Groningen, The Netherlands
- Jonatan O Eriksson
- Department of Biomedical Engineering, Lund University, 221 84 Lund, Sweden
- Marcel Kwiatkowski
- Department of Analytical Biochemistry, Groningen Research Institute of Pharmacy, University of Groningen, 9713 AV Groningen, The Netherlands; Functional Proteo-Metabolomics, Department of Biochemistry, University of Innsbruck, A-6020 Innsbruck, Austria
- Justina C Wolters
- Department of Pediatrics, University Medical Center Groningen, University of Groningen, 9713 GZ Groningen, The Netherlands
- Ido P Kema
- Department of Laboratory Medicine, University Medical Center Groningen, University of Groningen, 9700 RB Groningen, The Netherlands
- Andrei Barcaru
- Department of Analytical Biochemistry, Groningen Research Institute of Pharmacy, University of Groningen, 9713 AV Groningen, The Netherlands
- Folkert Kuipers
- Department of Pediatrics, University Medical Center Groningen, University of Groningen, 9713 GZ Groningen, The Netherlands; Department of Laboratory Medicine, University Medical Center Groningen, University of Groningen, 9700 RB Groningen, The Netherlands
- Stephan J L Bakker
- Department of Internal Medicine, Division of Nephrology, University Medical Center Groningen, University of Groningen, 9713 GZ Groningen, The Netherlands
- Rainer Bischoff
- Department of Analytical Biochemistry, Groningen Research Institute of Pharmacy, University of Groningen, 9713 AV Groningen, The Netherlands
- Frank Suits
- IBM Research-Australia, Southbank, 3006 Victoria, Australia
- Péter Horvatovich
- Department of Analytical Biochemistry, Groningen Research Institute of Pharmacy, University of Groningen, 9713 AV Groningen, The Netherlands
|
37
|
Cadow J, Manica M, Mathis R, Guo T, Aebersold R, Rodríguez Martínez M. On the feasibility of deep learning applications using raw mass spectrometry data. Bioinformatics 2021; 37:i245-i253. [PMID: 34252933 PMCID: PMC8275322 DOI: 10.1093/bioinformatics/btab311] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/25/2022] Open
Abstract
SUMMARY In recent years, SWATH-MS has become the proteomic method of choice for data-independent-acquisition, as it enables high proteome coverage, accuracy and reproducibility. However, data analysis is convoluted and requires prior information and expert curation. Furthermore, as quantification is limited to a small set of peptides, potentially important biological information may be discarded. Here we demonstrate that deep learning can be used to learn discriminative features directly from raw MS data, eliminating hence the need of elaborate data processing pipelines. Using transfer learning to overcome sample sparsity, we exploit a collection of publicly available deep learning models already trained for the task of natural image classification. These models are used to produce feature vectors from each mass spectrometry (MS) raw image, which are later used as input for a classifier trained to distinguish tumor from normal prostate biopsies. Although the deep learning models were originally trained for a completely different classification task and no additional fine-tuning is performed on them, we achieve a highly remarkable classification performance of 0.876 AUC. We investigate different types of image preprocessing and encoding. We also investigate whether the inclusion of the secondary MS2 spectra improves the classification performance. Throughout all tested models, we use standard protein expression vectors as gold standards. Even with our naïve implementation, our results suggest that the application of deep learning and transfer learning techniques might pave the way to the broader usage of raw mass spectrometry data in real-time diagnosis. AVAILABILITY AND IMPLEMENTATION The open source code used to generate the results from MS images is available on GitHub: https://ibm.biz/mstransc. The raw MS data underlying this article cannot be shared publicly for the privacy of individuals that participated in the study. 
Processed data including the MS images, their encodings, classification labels and results can be accessed at the following link: https://ibm.box.com/v/mstc-supplementary. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Joris Cadow
- Cognitive Computing & Industry Solutions, IBM Research Europe - Zurich, Rueschlikon 8803, Switzerland
- Matteo Manica
- Cognitive Computing & Industry Solutions, IBM Research Europe - Zurich, Rueschlikon 8803, Switzerland
- Roland Mathis
- Cognitive Computing & Industry Solutions, IBM Research Europe - Zurich, Rueschlikon 8803, Switzerland
- Tiannan Guo
- Institute of Basic Medical Sciences, School of Life Science, Westlake University, Hangzhou 310024, China
- Ruedi Aebersold
- Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich 8093, Switzerland
- María Rodríguez Martínez
- Cognitive Computing & Industry Solutions, IBM Research Europe - Zurich, Rueschlikon 8803, Switzerland
|
38
|
Khalid MF, Iman K, Ghafoor A, Saboor M, Ali A, Muaz U, Basharat AR, Tahir T, Abubakar M, Akhter MA, Nabi W, Vanderbauwhede W, Ahmad F, Wajid B, Chaudhary SU. PERCEPTRON: an open-source GPU-accelerated proteoform identification pipeline for top-down proteomics. Nucleic Acids Res 2021; 49:W510-W515. [PMID: 33999207 PMCID: PMC8262694 DOI: 10.1093/nar/gkab368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2021] [Revised: 04/10/2021] [Accepted: 04/25/2021] [Indexed: 11/12/2022] Open
Abstract
PERCEPTRON is a next-generation freely available web-based proteoform identification and characterization platform for top-down proteomics (TDP). PERCEPTRON search pipeline brings together algorithms for (i) intact protein mass tuning, (ii) de novo sequence tags-based filtering, (iii) characterization of terminal as well as post-translational modifications, (iv) identification of truncated proteoforms, (v) in silico spectral comparison, and (vi) weight-based candidate protein scoring. High-throughput performance is achieved through the execution of optimized code via multiple threads in parallel, on graphics processing units (GPUs) using NVidia Compute Unified Device Architecture (CUDA) framework. An intuitive graphical web interface allows for setting up of search parameters as well as for visualization of results. The accuracy and performance of the tool have been validated on several TDP datasets and against available TDP software. Specifically, results obtained from searching two published TDP datasets demonstrate that PERCEPTRON outperforms all other tools by up to 135% in terms of reported proteins and 10-fold in terms of runtime. In conclusion, the proposed tool significantly enhances the state-of-the-art in TDP search software and is publicly available at https://perceptron.lums.edu.pk. Users can also create in-house deployments of the tool by building code available on the GitHub repository (http://github.com/BIRL/Perceptron).
Affiliation(s)
- Muhammad Farhan Khalid
- Biomedical Informatics Research Laboratory, Department of Biology, Lahore University of Management Sciences, Lahore, Pakistan
- Kanzal Iman
- Biomedical Informatics Research Laboratory, Department of Biology, Lahore University of Management Sciences, Lahore, Pakistan
- Amna Ghafoor
- Biomedical Informatics Research Laboratory, Department of Biology, Lahore University of Management Sciences, Lahore, Pakistan
- Mujtaba Saboor
- Biomedical Informatics Research Laboratory, Department of Biology, Lahore University of Management Sciences, Lahore, Pakistan
- Ahsan Ali
- Biomedical Informatics Research Laboratory, Department of Biology, Lahore University of Management Sciences, Lahore, Pakistan
- Urwa Muaz
- Biomedical Informatics Research Laboratory, Department of Biology, Lahore University of Management Sciences, Lahore, Pakistan
- Abdul Rehman Basharat
- Biomedical Informatics Research Laboratory, Department of Biology, Lahore University of Management Sciences, Lahore, Pakistan
- Taha Tahir
- Biomedical Informatics Research Laboratory, Department of Biology, Lahore University of Management Sciences, Lahore, Pakistan
- Muhammad Abubakar
- Biomedical Informatics Research Laboratory, Department of Biology, Lahore University of Management Sciences, Lahore, Pakistan
- Momina Amer Akhter
- Biomedical Informatics Research Laboratory, Department of Biology, Lahore University of Management Sciences, Lahore, Pakistan
- Waqar Nabi
- School of Computing Science, University of Glasgow, Glasgow, G12 8QQ, UK
- Wim Vanderbauwhede
- School of Computing Science, University of Glasgow, Glasgow, G12 8QQ, UK
- Fayyaz Ahmad
- Department of Statistics, University of Gujrat, Gujrat, Pakistan
- Bilal Wajid
- Department of Electrical Engineering, University of Engineering and Technology, Lahore, Pakistan
- Department of Computer Science, University of Management and Technology, Lahore, Pakistan
- Division of Research and Development, Sabz-Qalam, Lahore, Pakistan
- Safee Ullah Chaudhary
- Biomedical Informatics Research Laboratory, Department of Biology, Lahore University of Management Sciences, Lahore, Pakistan
|
39
|
Pure Ion Chromatograms Combined with Advanced Machine Learning Methods Improve Accuracy of Discriminant Models in LC-MS-Based Untargeted Metabolomics. Molecules 2021; 26:2715. [PMID: 34063107 PMCID: PMC8125400 DOI: 10.3390/molecules26092715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2021] [Revised: 04/30/2021] [Accepted: 05/03/2021] [Indexed: 11/17/2022] Open
Abstract
Untargeted metabolomics based on liquid chromatography coupled with mass spectrometry (LC-MS) can detect thousands of features in samples and produce highly complex datasets. The accurate extraction of meaningful features and the building of discriminant models are two crucial steps in the data analysis pipeline of untargeted metabolomics. In this study, pure ion chromatograms were extracted from a liquor dataset and left-sided colon cancer (LCC) dataset by K-means-clustering-based Pure Ion Chromatogram extraction method version 2.0 (KPIC2). Then, the nonlinear low-dimensional embedding by uniform manifold approximation and projection (UMAP) showed the separation of samples from different groups in reduced dimensions. The discriminant models were established by extreme gradient boosting (XGBoost) based on the features extracted by KPIC2. Results showed that features extracted by KPIC2 achieved 100% classification accuracy on the test sets of the liquor dataset and the LCC dataset, which demonstrated the rationality of the XGBoost model based on KPIC2 compared with the results of XCMS (92% and 96% for liquor and LCC datasets respectively). Finally, XGBoost can achieve better performance than the linear method and traditional nonlinear modeling methods on these datasets. UMAP and XGBoost are integrated into KPIC2 package to extend its performance in complex situations, which are not only able to effectively process nonlinear dataset but also can greatly improve the accuracy of data analysis in non-target metabolomics.
|
40
|
Abstract
Metabolomics is a technology that generates large amounts of data and contributes to obtaining wide and integral explanations of the biochemical state of a living organism. Plants are continuously affected by abiotic stresses such as water scarcity, high temperatures and high salinity, and metabolomics has the potential for elucidating the response-to-stress mechanisms and develop resistance strategies in affected cultivars. This review describes the characteristics of each of the stages of metabolomic studies in plants and the role of metabolomics in the characterization of the response of various plant species to abiotic stresses.
|
41
|
Abstract
Proteomics, the large-scale study of all proteins of an organism or system, is a powerful tool for studying biological systems. It can provide a holistic view of the physiological and biochemical states of given samples through identification and quantification of large numbers of peptides and proteins. In forensic science, proteomics can be used as a confirmatory and orthogonal technique for well-built genomic analyses. Proteomics is highly valuable in cases where nucleic acids are absent or degraded, such as hair and bone samples. It can be used to identify body fluids, ethnic group, gender, individual, and estimate post-mortem interval using bone, muscle, and decomposition fluid samples. Compared to genomic analysis, proteomics can provide a better global picture of a sample. It has been used in forensic science for a wide range of sample types and applications. In this review, we briefly introduce proteomic methods, including sample preparation techniques, data acquisition using liquid chromatography-tandem mass spectrometry, and data analysis using database search, spectral library search, and de novo sequencing. We also summarize recent applications in the past decade of proteomics in forensic science with a special focus on human samples, including hair, bone, body fluids, fingernail, muscle, brain, and fingermark, and address the challenges, considerations, and future developments of forensic proteomics.
|
42
|
Discovery of Post-Translational Modifications in Emiliania huxleyi. Molecules 2021; 26:2027. [PMID: 33918234 PMCID: PMC8038017 DOI: 10.3390/molecules26072027] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/09/2021] [Revised: 03/31/2021] [Accepted: 04/01/2021] [Indexed: 11/17/2022] Open
Abstract
Emiliania huxleyi is a cosmopolitan coccolithophore that plays an essential role in global carbon and sulfur cycling, and contributes to marine cloud formation and climate regulation. Previously, the proteomic profile of Emiliania huxleyi was investigated using a three-dimensional separation strategy combined with liquid chromatography-tandem mass spectrometry (LC-MS/MS). The current study reuses the MS/MS spectra obtained, for the global discovery of post-translational modifications (PTMs) in this species without specific enrichment methods. Twenty-five different PTM types were examined using Trans-Proteomic Pipeline (Comet and PeptideProphet). Overall, 13,483 PTMs were identified in 7421 proteins. Methylation was the most frequent PTM with more than 2800 modified sites, and lysine was the most frequently modified amino acid with more than 4000 PTMs. The number of proteins identified increased by 22.5% to 18,780 after performing the PTM search. Compared to intact peptides, the intensities of some modified peptides were superior or equivalent. The intensities of some proteins increased dramatically after the PTM search. Gene ontology analysis revealed that protein persulfidation was related to photosynthesis in Emiliania huxleyi. Additionally, various membrane proteins were found to be phosphorylated. Thus, our global PTM discovery platform provides an overview of PTMs in the species and prompts further studies to uncover their biological functions. The combination of a three-dimensional separation method with global PTM search is a promising approach for the identification and discovery of PTMs in other species.
|
43
|
Degnan DJ, Bramer LM, White AM, Zhou M, Bilbao A, McCue LA. PSpecteR: A User-Friendly and Interactive Application for Visualizing Top-Down and Bottom-Up Proteomics Data in R. J Proteome Res 2021; 20:2014-2020. [PMID: 33661636 DOI: 10.1021/acs.jproteome.0c00857] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Abstract
Visual examination of mass spectrometry data is necessary to assess data quality and to facilitate data exploration. Graphics provide the means to evaluate spectral properties, test alternative peptide/protein sequence matches, prepare annotated spectra for publication, and fine-tune parameters during wet lab procedures. Visual inspection of LC-MS data is constrained by proteomics visualization software designed for particular workflows or vendor-specific tools without open-source code. We built PSpecteR, an open-source and interactive R Shiny web application for visualization of LC-MS data, with support for several steps of proteomics data processing, including reading various mass spectrometry files, running open-source database search engines, labeling spectra with fragmentation patterns, testing post-translational modifications, plotting where identified fragments map to reference sequences, and visualizing algorithmic output and metadata. All figures, tables, and spectra are exportable within one easy-to-use graphical user interface. Our current software provides a flexible and modern R framework to support fast implementation of additional features. The open-source code is readily available (https://github.com/EMSL-Computing/PSpecteR), and a PSpecteR Docker container (https://hub.docker.com/r/emslcomputing/pspecter) is available for easy local installation.
|
44
|
Rodrigues AM, Miguel C, Chaves I, António C. Mass spectrometry-based forest tree metabolomics. Mass Spectrometry Reviews 2021; 40:126-157. [PMID: 31498921 DOI: 10.1002/mas.21603] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/07/2019] [Accepted: 08/05/2019] [Indexed: 05/24/2023]
Abstract
Research in forest tree species has advanced slowly when compared with other agricultural crops and model organisms, mainly due to the long life cycles, large genome sizes, and lack of genomic tools. Additionally, trees are complex matrices, and the presence of interferents (e.g., oleoresins and cellulose) challenges the analysis of tree tissues with mass spectrometry (MS)-based analytical platforms. In this review, advances in MS-based forest tree metabolomics are discussed. Given their economic and ecological significance, particular focus is given to Pinus, Quercus, and Eucalyptus forest tree species to better understand their metabolite responses to abiotic and biotic stresses in the current climate change scenario. Furthermore, MS-based metabolomics technologies produce large and complex datasets that require expertise to adequately manage, process, analyze, and store the data in dedicated repositories. To ensure that the full potential of forest tree metabolomics data is translated into new knowledge, these data should comply with the FAIR principles (i.e., Findable, Accessible, Interoperable, and Re-usable). It is essential that adequate standards are implemented to annotate metadata from forest tree metabolomics studies as is already required by many science and governmental agencies and some major scientific publishers. © 2019 John Wiley & Sons Ltd. Mass Spec Rev 40:126-157, 2021.
Affiliation(s)
- Ana Margarida Rodrigues
- Plant Metabolomics Laboratory, GreenIT-Bioresources for Sustainability, Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa (ITQB NOVA), Avenida da República, Oeiras, 2780-157, Portugal
- Célia Miguel
- Forest Genomics & Molecular Genetics Lab, BioISI-Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, 1749-016, Lisboa, Portugal
- Instituto de Biologia Experimental e Tecnológica (iBET), 2780-157, Oeiras, Portugal
- Inês Chaves
- Forest Genomics & Molecular Genetics Lab, BioISI-Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, 1749-016, Lisboa, Portugal
- Instituto de Biologia Experimental e Tecnológica (iBET), 2780-157, Oeiras, Portugal
- Carla António
- Plant Metabolomics Laboratory, GreenIT-Bioresources for Sustainability, Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa (ITQB NOVA), Avenida da República, Oeiras, 2780-157, Portugal
|
45
|
Concepcion FA, Khan MN, Ju Wang JD, Wei AD, Ojemann JG, Ko AL, Shi Y, Eng JK, Ramirez JM, Poolos NP. HCN Channel Phosphorylation Sites Mapped by Mass Spectrometry in Human Epilepsy Patients and in an Animal Model of Temporal Lobe Epilepsy. Neuroscience 2021; 460:13-30. [PMID: 33571596 DOI: 10.1016/j.neuroscience.2021.01.038] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/30/2020] [Revised: 01/07/2021] [Accepted: 01/26/2021] [Indexed: 10/22/2022]
Abstract
Because hyperpolarization-activated cyclic nucleotide-gated (HCN) ion channels modulate the excitability of cortical and hippocampal principal neurons, these channels play a key role in the hyperexcitability that occurs during the development of epilepsy after a brain insult, or epileptogenesis. In epileptic rats generated by pilocarpine-induced status epilepticus, HCN channel activity is downregulated by two main mechanisms: a hyperpolarizing shift in gating and a decrease in amplitude of the current mediated by HCN channels, Ih. Because these mechanisms are modulated by various phosphorylation signaling pathways, we hypothesized that phosphorylation changes occur at individual HCN channel amino acid residues (phosphosites) during epileptogenesis. We collected CA1 hippocampal tissue from male Sprague Dawley rats made epileptic by pilocarpine-induced status epilepticus, and age-matched naïve controls. We also included resected human brain tissue containing epileptogenic zones (EZs) where seizures arise for comparison to our chronically epileptic rats. After enrichment for HCN1 and HCN2 isoforms by immunoprecipitation and trypsin in-gel digestion, the samples were analyzed by mass spectrometry. We identified numerous phosphosites from HCN1 and HCN2 channels, representing a novel survey of phosphorylation sites within HCN channels. We found high levels of HCN channel phosphosite homology between humans and rats. We also identified a novel HCN1 channel phosphosite S791, which underwent significantly increased phosphorylation during the chronic epilepsy stage. Heterologous expression of a phosphomimetic mutant, S791D, replicated a hyperpolarizing shift in Ih gating seen in neurons from chronically epileptic rats. These results show that HCN1 channel phosphorylation is altered in epilepsy and may be of pathogenic importance.
Affiliation(s)
- F A Concepcion
- Department of Neurology and Regional Epilepsy Center, University of Washington, Seattle, WA, United States
- M N Khan
- Department of Neurology and Regional Epilepsy Center, University of Washington, Seattle, WA, United States
- J-D Ju Wang
- Seattle Children's Research Institute, Center for Integrative Brain Research, Seattle, WA, United States
- A D Wei
- Seattle Children's Research Institute, Center for Integrative Brain Research, Seattle, WA, United States
- J G Ojemann
- Seattle Children's Research Institute, Center for Integrative Brain Research, Seattle, WA, United States; Department of Neurological Surgery, University of Washington, Seattle, WA, United States
- A L Ko
- Department of Neurological Surgery, University of Washington, Seattle, WA, United States
- Y Shi
- Department of Electrical and Computer Engineering, University of Washington, Seattle, WA, United States
- J K Eng
- Proteomics Resource, University of Washington, Seattle, WA, United States
- J-M Ramirez
- Seattle Children's Research Institute, Center for Integrative Brain Research, Seattle, WA, United States; Department of Neurological Surgery, University of Washington, Seattle, WA, United States
- N P Poolos
- Department of Neurology and Regional Epilepsy Center, University of Washington, Seattle, WA, United States.
|
46
|
Phapale P, Palmer A, Gathungu RM, Kale D, Brügger B, Alexandrov T. Public LC-Orbitrap Tandem Mass Spectral Library for Metabolite Identification. J Proteome Res 2021; 20:2089-2097. [PMID: 33529026 DOI: 10.1021/acs.jproteome.0c00930] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/14/2022]
Abstract
Liquid chromatography-mass spectrometry (LC-MS)-based untargeted metabolomics studies require high-quality spectral libraries for reliable metabolite identification. We have constructed EMBL-MCF (European Molecular Biology Laboratory-Metabolomics Core Facility), an open LC-MS/MS spectral library that currently contains over 1600 fragmentation spectra from 435 authentic standards of endogenous metabolites and lipids. The unique features of the library include the presence of chromatographic profiles acquired with different LC-MS methods and coverage of different adduct ions. The library covers many biologically important metabolites with some unique metabolites and lipids as compared with other public libraries. The EMBL-MCF spectral library is created and shared using an in-house-developed web application at https://curatr.mcf.embl.de/. The library is freely available online and also integrated with other mass spectral repositories.
Affiliation(s)
- Prasad Phapale
- Metabolomics Core Facility, EMBL, Heidelberg 69117, Germany
- Andrew Palmer
- Metabolomics Core Facility, EMBL, Heidelberg 69117, Germany
- Dipali Kale
- Heidelberg University Biochemistry Center (BZH), Heidelberg 69120, Germany
- Britta Brügger
- Heidelberg University Biochemistry Center (BZH), Heidelberg 69120, Germany
- Theodore Alexandrov
- Metabolomics Core Facility, EMBL, Heidelberg 69117, Germany; Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg 69117, Germany; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States

47
Rusconi F. Free Open Source Software for Protein and Peptide Mass Spectrometry-based Science. Curr Protein Pept Sci 2021; 22:134-147. [PMID: 33461461 DOI: 10.2174/1389203722666210118160946] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2020] [Revised: 10/12/2020] [Accepted: 01/04/2021] [Indexed: 12/28/2022]
Abstract
In the field of biology, and specifically in protein and peptide science, the power of mass spectrometry is that it is applicable to a vast spectrum of applications. Mass spectrometry can be applied to identify proteins and peptides in complex mixtures, to identify and locate post-translational modifications, to characterize the structure of proteins and peptides to the most detailed level or to detect protein-ligand non-covalent interactions. Thanks to the Free and Open Source Software (FOSS) movement, scientists have limitless opportunities to deepen their skills in software development to code software that solves mass spectrometric data analysis problems. After the conversion of raw data files into open standard format files, the entire spectrum of data analysis tasks can now be performed integrally on FOSS platforms, like GNU/Linux, and only with FOSS solutions. This review presents a brief history of mass spectrometry open file formats and goes on with the description of FOSS projects that are commonly used in protein and peptide mass spectrometry fields of endeavor: identification projects that involve mostly automated pipelines, like proteomics and peptidomics, and bio-structural characterization projects that most often involve manual scrutiny of the mass data. Projects of the last kind usually involve software that allows the user to delve into the mass data in an interactive graphics-oriented manner. Software projects are thus categorized on the basis of these criteria: software libraries for software developers vs desktop-based graphical user interface, software for the end-user and automated pipeline-based data processing vs interactive graphics-based mass data scrutiny.
Affiliation(s)
- Filippo Rusconi
- PAPPSO, Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190, Gif-sur-Yvette, France

48
Helmus R, Ter Laak TL, van Wezel AP, de Voogt P, Schymanski EL. patRoon: open source software platform for environmental mass spectrometry based non-target screening. J Cheminform 2021; 13:1. [PMID: 33407901 PMCID: PMC7789171 DOI: 10.1186/s13321-020-00477-w] [Citation(s) in RCA: 77] [Impact Index Per Article: 25.7] [Received: 06/19/2020] [Accepted: 11/23/2020] [Indexed: 12/22/2022]
Abstract
Mass spectrometry based non-target analysis is increasingly adopted in environmental sciences to screen and identify numerous chemicals simultaneously in highly complex samples. However, current data processing software either lacks functionality for environmental sciences, solves only part of the workflow, is not openly available and/or is restricted in input data formats. In this paper we present patRoon, a new R based open-source software platform, which provides comprehensive, fully tailored and straightforward non-target analysis workflows. This platform makes the use, evaluation and mixing of well-tested algorithms seamless by harmonizing various common (primarily open) software tools under a consistent interface. In addition, patRoon offers various functionality and strategies to simplify and perform automated processing of complex (environmental) data effectively. patRoon implements several effective optimization strategies to significantly reduce computational times. The ability of patRoon to perform time-efficient and automated non-target data annotation of environmental samples is demonstrated with a simple and reproducible workflow using open-access data of spiked samples from a drinking water treatment plant study. In addition, the ability to easily use, combine and evaluate different algorithms was demonstrated for three commonly used feature finding algorithms. This article, combined with already published works, demonstrates that patRoon helps make comprehensive (environmental) non-target analysis readily accessible to a wider community of researchers.
Affiliation(s)
- Rick Helmus
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, P.O. Box 94240, 1090 GE, Amsterdam, The Netherlands
- Thomas L Ter Laak
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, P.O. Box 94240, 1090 GE, Amsterdam, The Netherlands; KWR Water Research Institute, Chemical Water Quality and Health, P.O. Box 1072, 3430 BB, Nieuwegein, The Netherlands
- Annemarie P van Wezel
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, P.O. Box 94240, 1090 GE, Amsterdam, The Netherlands
- Pim de Voogt
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, P.O. Box 94240, 1090 GE, Amsterdam, The Netherlands
- Emma L Schymanski
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, L-4367, Belvaux, Luxembourg

49
Bhamber RS, Jankevics A, Deutsch EW, Jones AR, Dowsey AW. mzMLb: A Future-Proof Raw Mass Spectrometry Data Format Based on Standards-Compliant mzML and Optimized for Speed and Storage Requirements. J Proteome Res 2021; 20:172-183. [PMID: 32864978 PMCID: PMC7871438 DOI: 10.1021/acs.jproteome.0c00192] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/24/2020] [Indexed: 12/24/2022]
Abstract
With ever-increasing amounts of data produced by mass spectrometry (MS) proteomics and metabolomics, and the sheer volume of samples now analyzed, the need for a common open format possessing both file size efficiency and faster read/write speeds has become paramount to drive the next generation of data analysis pipelines. The Proteomics Standards Initiative (PSI) has established a clear and precise extensible markup language (XML) representation for data interchange, mzML, receiving substantial uptake; nevertheless, storage and file access efficiency has not been the main focus. We propose an HDF5 file format "mzMLb" that is optimized for both read/write speed and storage of the raw mass spectrometry data. We provide an extensive validation of the write speed, random read speed, and storage size, demonstrating a flexible format that with or without compression is faster than all existing approaches in virtually all cases, while with compression is comparable in size to proprietary vendor file formats. Since our approach uniquely preserves the XML encoding of the metadata, the format implicitly supports future versions of mzML and is straightforward to implement: mzMLb's design adheres to both HDF5 and NetCDF4 standard implementations, which allows it to be easily utilized by third parties due to their widespread programming language support. A reference implementation within the established ProteoWizard toolkit is provided.
Affiliation(s)
- Ranjeet S. Bhamber
- Department of Population Health Sciences and Bristol Veterinary School, University of Bristol, Bristol BS8 2BN, United Kingdom
- Andris Jankevics
- School of Biosciences and Phenome Centre Birmingham, University of Birmingham, Birmingham B15 2TT, United Kingdom
- Eric W. Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Andrew R. Jones
- Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Andrew W. Dowsey
- Department of Population Health Sciences and Bristol Veterinary School, University of Bristol, Bristol BS8 2BN, United Kingdom

50
Rabe A, Gesell Salazar M, Völker U. Bottom-Up Community Proteome Analysis of Saliva Samples and Tongue Swabs by Data-Dependent Acquisition Nano LC-MS/MS Mass Spectrometry. Methods Mol Biol 2021; 2327:221-238. [PMID: 34410648 DOI: 10.1007/978-1-0716-1518-8_13] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
Abstract
Analysis using mass spectrometry enables the characterization of metaproteomes in their native environments and overcomes the limitation of proteomics of pure cultures. Metaproteomics is a promising approach to link functions of currently actively expressed genes to the phylogenetic composition of the microbiome in their habitat. In this chapter, we describe the preparation of saliva samples and tongue swabs for nLC-MS/MS measurements and their bioinformatic analysis based on the Trans-Proteomic Pipeline and Prophane to study the oral microbiome.
Affiliation(s)
- Alexander Rabe
- Department of Functional Genomics, Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
- Manuela Gesell Salazar
- Department of Functional Genomics, Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
- Uwe Völker
- Department of Functional Genomics, Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
