1
Metaproteogenomic analysis of saliva samples from Parkinson's disease patients with cognitive impairment. NPJ Biofilms Microbiomes 2023; 9:86. [PMID: 37980417 PMCID: PMC10657361 DOI: 10.1038/s41522-023-00452-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Accepted: 10/30/2023] [Indexed: 11/20/2023] Open
Abstract
Cognitive impairment (CI) is very common in patients with Parkinson's disease (PD) and develops progressively along a spectrum from mild cognitive impairment (PD-MCI) to full dementia (PDD). Identifying PD patients at risk of cognitive decline is therefore an unmet clinical need in managing the disease. Previous studies reported that the oral microbiota of PD patients is altered even at early stages and that poor oral hygiene is associated with dementia. However, data from single modalities are often unable to explain complex chronic diseases in the brain and cannot reliably predict the risk of disease progression. Here, we performed integrative metaproteogenomic characterization of salivary microbiota and tested the hypothesis that biological molecules of saliva and saliva microbiota dynamically shift in association with the progression of cognitive decline and harbor discriminatory key signatures across the spectrum of CI in PD. We recruited a cohort of 115 participants in a multi-center study and employed multi-omics factor analysis (MOFA) to integrate amplicon sequencing and metaproteomic analysis to identify signature taxa and proteins in saliva. Our baseline analyses revealed a contrasting interplay between the genus Neisseria and the genera Lactobacillus and Ligilactobacillus across the spectrum of CI. The group-specific signature profiles enabled us to identify bacterial genera and protein groups associated with CI stages in PD. Our study describes the compositional dynamics of saliva across the spectrum of CI in PD and paves the way for developing non-invasive biomarker strategies to predict the risk of CI progression in PD.
2
Integrated multi-omics analyses of microbial communities: a review of the current state and future directions. Mol Omics 2023; 19:607-623. [PMID: 37417894 DOI: 10.1039/d3mo00089c] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/08/2023]
Abstract
Integrated multi-omics analyses of microbiomes have become increasingly common in recent years as emerging omics technologies provide an unprecedented opportunity to better understand the structural and functional properties of microbial communities. Consequently, there is a growing need for and interest in the concepts, approaches, considerations, and available tools for investigating diverse environmental and host-associated microbial communities in an integrative manner. In this review, we first provide a general overview of each omics analysis type, including a brief history, typical workflow, primary applications, strengths, and limitations. We then discuss experimental design and bioinformatics analysis considerations in integrated multi-omics analyses, elaborate on the current approaches and commonly used tools, and highlight the current challenges. Finally, we discuss the expected key advances, emerging trends, potential implications for fields ranging from human health to biotechnology, and future directions.
3
The importance of graph databases and graph learning for clinical applications. Database (Oxford) 2023; 2023:baad045. [PMID: 37428679 PMCID: PMC10332447 DOI: 10.1093/database/baad045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 05/26/2023] [Accepted: 06/16/2023] [Indexed: 07/12/2023]
Abstract
The increasing amount and complexity of clinical data require an appropriate way of storing and analyzing those data. Traditional approaches use a tabular structure (relational databases) for storing data and thereby complicate storing and retrieving interlinked data from the clinical domain. Graph databases address this by storing data in a graph as nodes (vertices) that are connected by edges (links). The underlying graph structure can be used for the subsequent data analysis (graph learning). Graph learning consists of two parts: graph representation learning and graph analytics. Graph representation learning aims to reduce high-dimensional input graphs to low-dimensional representations. Graph analytics then uses the obtained representations for analytical tasks such as visualization, classification, link prediction and clustering, which can be used to solve domain-specific problems. In this survey, we review current state-of-the-art graph database management systems, graph learning algorithms and a variety of graph applications in the clinical domain. Furthermore, we provide a comprehensive use case for a clearer understanding of complex graph learning algorithms.
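As a rough illustration of the node-and-edge storage and the link-prediction task mentioned in this abstract, the following minimal Python sketch builds an adjacency structure from edge triples and scores unconnected node pairs by their number of shared neighbors. All node names, relation names, and function names are hypothetical, and the common-neighbors heuristic is a simple stand-in chosen for brevity, not one of the graph representation learning methods the survey reviews.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: (source node, relation, target node).
EDGES = [
    ("patient_1", "diagnosed_with", "condition_A"),
    ("patient_1", "prescribed", "drug_X"),
    ("patient_2", "diagnosed_with", "condition_A"),
    ("patient_2", "prescribed", "drug_X"),
    ("patient_3", "diagnosed_with", "condition_A"),
]


def build_adjacency(edges):
    """Store the graph as an undirected adjacency map: node -> set of neighbors."""
    adjacency = defaultdict(set)
    for source, _relation, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def common_neighbor_scores(adjacency):
    """Score every unconnected node pair by its number of shared neighbors."""
    scores = {}
    for a, b in combinations(sorted(adjacency), 2):
        if b in adjacency[a]:
            continue  # already linked, nothing to predict
        shared = adjacency[a] & adjacency[b]
        if shared:
            scores[(a, b)] = len(shared)
    return scores


adjacency = build_adjacency(EDGES)
scores = common_neighbor_scores(adjacency)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, score)
```

Pairs with higher scores are the more plausible candidate links under this heuristic; a learned embedding would replace the shared-neighbor count with a similarity between low-dimensional node representations.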
4
Mistle: bringing spectral library predictions to metaproteomics with an efficient search index. Bioinformatics 2023; 39:btad376. [PMID: 37294786 PMCID: PMC10313348 DOI: 10.1093/bioinformatics/btad376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 05/11/2023] [Accepted: 06/08/2023] [Indexed: 06/11/2023] Open
Abstract
MOTIVATION: Deep learning has moved to the forefront of tandem mass spectrometry-driven proteomics and authentic prediction for peptide fragmentation is more feasible than ever. Still, at this point spectral prediction is mainly used to validate database search results or for confined search spaces. Fully predicted spectral libraries have not yet been efficiently adapted to large search space problems that often occur in metaproteomics or proteogenomics.
RESULTS: In this study, we showcase a workflow that uses Prosit for spectral library predictions on two common metaproteomes and implement an indexing and search algorithm, Mistle, to efficiently identify experimental mass spectra within the library. Hence, the workflow emulates a classic protein sequence database search with protein digestion but builds a searchable index from spectral predictions as an in-between step. We compare Mistle to popular search engines, both on a spectral and database search level, and provide evidence that this approach is more accurate than a database search using MSFragger. Mistle outperforms other spectral library search engines in terms of run time and proves to be extremely memory efficient with a 4- to 22-fold decrease in RAM usage. This makes Mistle universally applicable to large search spaces, e.g. covering comprehensive sequence databases of diverse microbiomes.
AVAILABILITY AND IMPLEMENTATION: Mistle is freely available on GitHub at https://github.com/BAMeScience/Mistle.
5
PepGM: A probabilistic graphical model for taxonomic inference of viral proteome samples with associated confidence scores. Bioinformatics 2023; 39:7147900. [PMID: 37129543 PMCID: PMC10182852 DOI: 10.1093/bioinformatics/btad289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2022] [Revised: 03/15/2023] [Accepted: 04/28/2023] [Indexed: 05/03/2023] Open
Abstract
MOTIVATION: Inferring taxonomy in mass spectrometry-based shotgun proteomics is a complex task. In multi-species or viral samples of unknown taxonomic origin, the presence of proteins and corresponding taxa must be inferred from a list of identified peptides, which is often complicated by protein homology: many proteins share peptides not only within a taxon but also between taxa. However, correct taxonomic inference is crucial when identifying different viral strains with high sequence homology, considering, e.g., the different epidemiological characteristics of the various strains of SARS-CoV-2. Additionally, many viruses mutate frequently, further complicating the correct identification of viral proteomic samples.
RESULTS: We present PepGM, a probabilistic graphical model for the taxonomic assignment of virus proteomic samples with strain-level resolution and associated confidence scores. PepGM combines the results of a standard proteomic database search algorithm with belief propagation to calculate the marginal distributions, and thus confidence scores, for potential taxonomic assignments. We demonstrate the performance of PepGM using several publicly available virus proteomic datasets, showing its strain-level resolution performance. In two out of eight cases, the taxonomic assignments were only correct on the species level, which PepGM clearly indicates by lower confidence scores.
AVAILABILITY AND IMPLEMENTATION: PepGM is written in Python and embedded into a Snakemake workflow. It is available on https://github.com/BAMeScience/PepGM.
6
Comprehensive evaluation of peptide de novo sequencing tools for monoclonal antibody assembly. Brief Bioinform 2022; 24:6955273. [PMID: 36545804 PMCID: PMC9851299 DOI: 10.1093/bib/bbac542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Revised: 10/25/2022] [Accepted: 11/10/2022] [Indexed: 12/24/2022] Open
Abstract
Monoclonal antibodies are biotechnologically produced proteins with various applications in research, therapeutics and diagnostics. Their ability to recognize and bind to specific molecule structures makes them essential research tools and therapeutic agents. Sequence information of antibodies is helpful for understanding antibody-antigen interactions and ensuring their affinity and specificity. De novo protein sequencing based on mass spectrometry is a valuable method to obtain the amino acid sequence of peptides and proteins without a priori knowledge. In this study, we evaluated six recently developed de novo peptide sequencing algorithms (Novor, pNovo 3, DeepNovo, SMSNet, PointNovo and Casanovo), which were not specifically designed for antibody data. We validated their ability to identify and assemble antibody sequences on three multi-enzymatic data sets. The deep learning-based tools Casanovo and PointNovo showed an increased peptide recall across different enzymes and data sets compared with spectrum-graph-based approaches. We evaluated different error types of de novo peptide sequencing tools and their performance for different numbers of missing cleavage sites, noisy spectra and peptides of various lengths. We achieved a sequence coverage of 97.69-99.53% on the light chains of three different antibody data sets using the de Bruijn assembler ALPS and the predictions from Casanovo. However, low sequence coverage and accuracy on the heavy chains demonstrate that complete de novo protein sequencing remains a challenging issue in proteomics that requires improved de novo error correction, alternative digestion strategies and hybrid approaches such as homology search to achieve high accuracy on long protein sequences.
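The sequence coverage figures quoted in this abstract can be read as the percentage of reference positions matched by at least one assembled or predicted sequence. The sketch below shows one way such a percentage could be computed in Python. The function name, the toy sequences, and the use of exact substring matching are assumptions made for illustration; the study's actual evaluation procedure is not described in this listing and may differ, for example in how isobaric residues are handled.

```python
def sequence_coverage(reference, peptides):
    """Return the percentage of reference positions covered by any peptide.

    Peptides are located by exact substring matching, and every occurrence
    of a peptide in the reference counts toward coverage.
    """
    if not reference:
        return 0.0
    covered = [False] * len(reference)
    for peptide in peptides:
        if not peptide:
            continue  # skip empty strings, which would match everywhere
        start = reference.find(peptide)
        while start != -1:
            for position in range(start, start + len(peptide)):
                covered[position] = True
            start = reference.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(reference)


# Hypothetical toy data: a 10-residue reference and three predicted peptides.
reference = "ACDEFGHIKL"
peptides = ["ACDE", "DEFG", "KL"]
coverage = sequence_coverage(reference, peptides)
print(f"{coverage:.2f}%")
```

Overlapping peptides are counted once per position, so the result never exceeds 100%.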
7
Ad hoc learning of peptide fragmentation from mass spectra enables an interpretable detection of phosphorylated and cross-linked peptides. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00467-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/09/2022]
Abstract
Mass spectrometry-based proteomics provides a holistic snapshot of the entire protein set of living cells on a molecular level. Currently, only a few deep learning approaches exist that involve peptide fragmentation spectra, which represent partial sequence information of proteins. Commonly, these approaches lack the ability to characterize less studied or even unknown patterns in spectra because of their use of explicit domain knowledge. Here, to elevate unrestricted learning from spectra, we introduce ‘ad hoc learning of fragmentation’ (AHLF), a deep learning model that is end-to-end trained on 19.2 million spectra from several phosphoproteomic datasets. AHLF is interpretable, and we show that peak-level feature importance values and pairwise interactions between peaks are in line with corresponding peptide fragments. We demonstrate our approach by detecting post-translational modifications, specifically protein phosphorylation based on only the fragmentation spectrum without a database search. AHLF increases the area under the receiver operating characteristic curve (AUC) by an average of 9.4% on recent phosphoproteomic data compared with the current state of the art on this task. Furthermore, use of AHLF in rescoring search results increases the number of phosphopeptide identifications by a margin of up to 15.1% at a constant false discovery rate. To show the broad applicability of AHLF, we use transfer learning to also detect cross-linked peptides, as used in protein structure analysis, with an AUC of up to 94%.
8
The Metaproteomics Initiative: a coordinated approach for propelling the functional characterization of microbiomes. Microbiome 2021; 9:243. [PMID: 34930457 PMCID: PMC8690404 DOI: 10.1186/s40168-021-01176-w] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 06/16/2021] [Accepted: 10/10/2021] [Indexed: 05/04/2023]
Abstract
Through connecting genomic and metabolic information, metaproteomics is an essential approach for understanding how microbiomes function in space and time. The international metaproteomics community is delighted to announce the launch of the Metaproteomics Initiative (www.metaproteomics.org), the goal of which is to promote dissemination of metaproteomics fundamentals, advancements, and applications through collaborative networking in microbiome research. The Initiative aims to be the central information hub and open meeting place where newcomers and experts interact to communicate, standardize, and accelerate experimental and bioinformatic methodologies in this field. We invite the entire microbiome community to join and discuss potential synergies at the interfaces with other disciplines, and to collectively promote innovative approaches to gain deeper insights into microbiome functions and dynamics.
9
Critical Assessment of MetaProteome Investigation (CAMPI): a multi-laboratory comparison of established workflows. Nat Commun 2021; 12:7305. [PMID: 34911965 PMCID: PMC8674281 DOI: 10.1038/s41467-021-27542-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/02/2021] [Accepted: 11/24/2021] [Indexed: 12/17/2022] Open
Abstract
Metaproteomics has matured into a powerful tool to assess functional interactions in microbial communities. While many metaproteomic workflows are available, the impact of method choice on results remains unclear. Here, we carry out a community-driven, multi-laboratory comparison in metaproteomics: the critical assessment of metaproteome investigation study (CAMPI). Based on well-established workflows, we evaluate the effect of sample preparation, mass spectrometry, and bioinformatic analysis using two samples: a simplified, laboratory-assembled human intestinal model and a human fecal sample. We observe that variability at the peptide level is predominantly due to sample processing workflows, with a smaller contribution of bioinformatic pipelines. These peptide-level differences largely disappear at the protein group level. While differences are observed for predicted community composition, similar functional profiles are obtained across workflows. CAMPI demonstrates the robustness of present-day metaproteomics research, serves as a template for multi-laboratory studies in metaproteomics, and provides publicly available data sets for benchmarking future developments.
10
Critical Assessment of MetaProteome Investigation (CAMPI): a multi-laboratory comparison of established workflows. Nat Commun 2021; 12:7305. [PMID: 34911965 DOI: 10.1101/2021.03.05.433915] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/02/2021] [Accepted: 11/24/2021] [Indexed: 05/21/2023] Open
11
Tracking changes in adaptation to suspension growth for MDCK cells: cell growth correlates with levels of metabolites, enzymes and proteins. Appl Microbiol Biotechnol 2021; 105:1861-1874. [PMID: 33582836 PMCID: PMC7907048 DOI: 10.1007/s00253-021-11150-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/08/2020] [Revised: 01/20/2021] [Accepted: 01/26/2021] [Indexed: 11/17/2022]
Abstract
Adaptations of animal cells to growth in suspension culture concern in particular viral vaccine production, where very specific aspects of virus-host cell interaction need to be taken into account to achieve high cell specific yields and overall process productivity. So far, the complexity of alterations on the metabolism, enzyme, and proteome level required for adaptation is only poorly understood. In this study, for the first time, we combined several complex analytical approaches with the aim to track cellular changes on different levels and to unravel interconnections and correlations. Therefore, a Madin-Darby canine kidney (MDCK) suspension cell line, adapted earlier to growth in suspension, was cultivated in a 1-L bioreactor. Cell concentrations and cell volumes, extracellular metabolite concentrations, and intracellular enzyme activities were determined. The experimental data set was used as the input for a segregated growth model that was already applied to describe the growth dynamics of the parental adherent cell line. In addition, the cellular proteome was analyzed by liquid chromatography coupled to tandem mass spectrometry using a label-free protein quantification method to unravel altered cellular processes for the suspension and the adherent cell line. Four regulatory mechanisms were identified as a response of the adaptation of adherent MDCK cells to growth in suspension. These regulatory mechanisms were linked to the proteins caveolin, cadherin-1, and pirin. Combining cell, metabolite, enzyme, and protein measurements with mathematical modeling generated a more holistic view on cellular processes involved in the adaptation of an adherent cell line to suspension growth.
Key points
• Less and more efficient glucose utilization for suspension cell growth
• Concerted alteration of metabolic enzyme activity and protein expression
• Protein candidates to interfere glycolytic activity in MDCK cells
12
Survey of metaproteomics software tools for functional microbiome analysis. PLoS One 2020; 15:e0241503. [PMID: 33170893 PMCID: PMC7654790 DOI: 10.1371/journal.pone.0241503] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/12/2020] [Accepted: 10/15/2020] [Indexed: 11/23/2022] Open
Abstract
To gain a thorough appreciation of microbiome dynamics, researchers characterize the functional relevance of expressed microbial genes or proteins. This can be accomplished through metaproteomics, which characterizes the protein expression of microbiomes. Several software tools exist for analyzing microbiomes at the functional level by measuring their combined proteome-level response to environmental perturbations. In this survey, we explore the performance of six available tools, to enable researchers to make informed decisions regarding software choice based on their research goals. Tandem mass spectrometry-based proteomic data obtained from dental caries plaque samples grown with and without sucrose in paired biofilm reactors were used as representative data for this evaluation. Microbial peptides from one sample pair were identified by the X! Tandem search algorithm via SearchGUI and subjected to functional analysis using software tools including eggNOG-mapper, MEGAN5, MetaGOmics, MetaProteomeAnalyzer (MPA), ProPHAnE, and Unipept to generate functional annotation through Gene Ontology (GO) terms. Among these software tools, notable differences in functional annotation were detected after comparing differentially expressed protein functional groups. Based on the GO terms generated by these tools, we performed a peptide-level comparison to evaluate the quality of their functional annotations. A BLAST analysis against the NCBI non-redundant database revealed that the sensitivity and specificity of functional annotation varied between tools. For example, eggNOG-mapper mapped to the largest number of GO terms, while Unipept generated more accurate GO terms. Based on our evaluation, metaproteomics researchers can choose the software according to their analytical needs and developers can use the resulting feedback to further optimize their algorithms.
To make more of these tools accessible via scalable metaproteomics workflows, eggNOG-mapper and Unipept 4.0 were incorporated into the Galaxy platform.
13
Abstract
One of the most widely used methods to detect an acute viral infection in clinical specimens is diagnostic real-time polymerase chain reaction. However, because of the COVID-19 pandemic, mass-spectrometry-based proteomics is currently being discussed as a potential diagnostic method for viral infections. Because proteomics is not yet applied in routine virus diagnostics, here we discuss its potential to detect viral infections. Apart from theoretical considerations, the current status and technical limitations are considered. Finally, the challenges that have to be overcome to establish proteomics in routine virus diagnostics are highlighted.
14
Erratum: gNOMO: a multi-omics pipeline for integrated host and microbiome analysis of non-model organisms. NAR Genom Bioinform 2020; 2:lqaa083. [PMID: 33577626 PMCID: PMC7671335 DOI: 10.1093/nargab/lqaa083] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022] Open
Abstract
[This corrects the article DOI: 10.1093/nargab/lqaa058.]
15
A complete and flexible workflow for metaproteomics data analysis based on MetaProteomeAnalyzer and Prophane. Nat Protoc 2020; 15:3212-3239. [PMID: 32859984 DOI: 10.1038/s41596-020-0368-7] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 12/02/2019] [Accepted: 05/29/2020] [Indexed: 12/14/2022]
Abstract
Metaproteomics, the study of the collective protein composition of multi-organism systems, provides deep insights into the biodiversity of microbial communities and the complex functional interplay between microbes and their hosts or environment. Thus, metaproteomics has become an indispensable tool in various fields such as microbiology and related medical applications. The computational challenges in the analysis of corresponding datasets differ from those of pure-culture proteomics, e.g., due to the higher complexity of the samples and the larger reference databases demanding specific computing pipelines. Corresponding data analyses usually consist of numerous manual steps that must be closely synchronized. With MetaProteomeAnalyzer and Prophane, we have established two open-source software solutions specifically developed and optimized for metaproteomics. Among other features, peptide-spectrum matching is improved by combining different search engines and, compared to similar tools, metaproteome annotation benefits from the most comprehensive set of available databases (such as NCBI, UniProt, EggNOG, PFAM, and CAZy). The workflow described in this protocol combines both tools and leads the user through the entire data analysis process, including protein database creation, database search, protein grouping and annotation, and results visualization. To the best of our knowledge, this protocol presents the most comprehensive, detailed and flexible guide to metaproteomics data analysis to date. While beginners are provided with robust, easy-to-use, state-of-the-art data analysis in a reasonable time (a few hours, depending on, among other factors, the protein database size and the number of identified peptides and inferred proteins), advanced users benefit from the flexibility and adaptability of the workflow.
16
gNOMO: a multi-omics pipeline for integrated host and microbiome analysis of non-model organisms. NAR Genom Bioinform 2020; 2:lqaa058. [PMID: 33575609 PMCID: PMC7671378 DOI: 10.1093/nargab/lqaa058] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/31/2020] [Revised: 06/19/2020] [Accepted: 08/03/2020] [Indexed: 01/14/2023] Open
Abstract
The study of bacterial symbioses has grown exponentially in the recent past. However, existing bioinformatic workflows for microbiome data analysis commonly do not integrate multiple meta-omics levels and are mainly geared toward human microbiomes. Microbiota are better understood when analyzed in their biological context, that is, together with their host or environment. Nevertheless, this is a limitation when studying non-model organisms, mainly due to the lack of well-annotated sequence references. Here, we present gNOMO, a bioinformatic pipeline that is specifically designed to process and analyze non-model organism samples of up to three meta-omics levels (metagenomics, metatranscriptomics and metaproteomics) in an integrative manner. The pipeline has been developed using the workflow management framework Snakemake in order to obtain an automated and reproducible pipeline. Using experimental datasets of the German cockroach Blattella germanica, a non-model organism with a very complex gut microbiome, we show the capabilities of gNOMO with regard to meta-omics data integration, expression ratio comparison, taxonomic and functional analysis as well as intuitive output visualization. In conclusion, gNOMO is an easily configurable bioinformatic pipeline for integrating and analyzing multiple meta-omics data types and for producing output visualizations, specifically designed for integrating paired-end sequencing data with mass spectrometry from non-model organisms.
17
Connecting MetaProteomeAnalyzer and PeptideShaker to Unipept for Seamless End-to-End Metaproteomics Data Analysis. J Proteome Res 2020; 19:3562-3566. [DOI: 10.1021/acs.jproteome.0c00136] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 02/08/2023]
18
TaxIt: An Iterative Computational Pipeline for Untargeted Strain-Level Identification Using MS/MS Spectra from Pathogenic Single-Organism Samples. J Proteome Res 2020; 19:2501-2510. [PMID: 32362126 DOI: 10.1021/acs.jproteome.9b00714] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]
Abstract
Untargeted accurate strain-level classification of a priori unidentified organisms using tandem mass spectrometry is a challenging task. Reference databases often lack taxonomic depth, limiting peptide assignments to the species level. However, the extension with detailed strain information increases runtime and decreases statistical power. In addition, larger databases contain a higher number of similar proteomes. We present TaxIt, an iterative workflow to address the increasing search space required for MS/MS-based strain-level classification of samples with unknown taxonomic origin. TaxIt first applies reference sequence data for initial identification of species candidates, followed by automated acquisition of relevant strain sequences for low level classification. Furthermore, proteome similarities resulting in ambiguous taxonomic assignments are addressed with an abundance weighting strategy to increase the confidence in candidate taxa. For benchmarking the performance of our method, we apply our iterative workflow on several samples of bacterial and viral origin. In comparison to noniterative approaches using unique peptides or advanced abundance correction, TaxIt identifies microbial strains correctly in all examples presented (with one tie), thereby demonstrating the potential for untargeted and deeper taxonomic classification. TaxIt makes extensive use of public, unrestricted, and continuously growing sequence resources such as the NCBI databases and is available under open-source BSD license at https://gitlab.com/rki_bioinformatics/TaxIt.
19
An environment for sustainable research software in Germany and beyond: current state, open challenges, and call for action. F1000Res 2020; 9:295. [PMID: 33552475 PMCID: PMC7845155 DOI: 10.12688/f1000research.23224.1] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Accepted: 04/09/2020] [Indexed: 08/22/2023] Open
Abstract
Research software has become a central asset in academic research. It optimizes existing and enables new research methods, implements and embeds research knowledge, and constitutes an essential research product in itself. Research software must be sustainable in order to understand, replicate, reproduce, and build upon existing research or conduct new research effectively. In other words, software must be available, discoverable, usable, and adaptable to new needs, both now and in the future. Research software therefore requires an environment that supports sustainability. Hence, a change is needed in the way research software development and maintenance are currently motivated, incentivized, funded, structurally and infrastructurally supported, and legally treated. Failing to do so will threaten the quality and validity of research. In this paper, we identify challenges for research software sustainability in Germany and beyond, in terms of motivation, selection, research software engineering personnel, funding, infrastructure, and legal aspects. Besides researchers, we specifically address political and academic decision-makers to increase awareness of the importance and needs of sustainable research software practices. In particular, we recommend strategies and measures to create an environment for sustainable research software, with the ultimate goal to ensure that software-driven research is valid, reproducible and sustainable, and that software is recognized as a first class citizen in research. This paper is the outcome of two workshops run in Germany in 2019, at deRSE19 - the first International Conference of Research Software Engineers in Germany - and a dedicated DFG-supported follow-up workshop in Berlin.
|
20
|
An environment for sustainable research software in Germany and beyond: current state, open challenges, and call for action. F1000Res 2020; 9:295. [PMID: 33552475 PMCID: PMC7845155 DOI: 10.12688/f1000research.23224.2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Accepted: 01/11/2021] [Indexed: 11/20/2022] Open
Abstract
Research software has become a central asset in academic research. It optimizes existing and enables new research methods, implements and embeds research knowledge, and constitutes an essential research product in itself. Research software must be sustainable in order to understand, replicate, reproduce, and build upon existing research or conduct new research effectively. In other words, software must be available, discoverable, usable, and adaptable to new needs, both now and in the future. Research software therefore requires an environment that supports sustainability. Hence, a change is needed in the way research software development and maintenance are currently motivated, incentivized, funded, structurally and infrastructurally supported, and legally treated. Failing to do so will threaten the quality and validity of research. In this paper, we identify challenges for research software sustainability in Germany and beyond, in terms of motivation, selection, research software engineering personnel, funding, infrastructure, and legal aspects. Besides researchers, we specifically address political and academic decision-makers to increase awareness of the importance and needs of sustainable research software practices. In particular, we recommend strategies and measures to create an environment for sustainable research software, with the ultimate goal to ensure that software-driven research is valid, reproducible and sustainable, and that software is recognized as a first class citizen in research. This paper is the outcome of two workshops run in Germany in 2019, at deRSE19 - the first International Conference of Research Software Engineers in Germany - and a dedicated DFG-supported follow-up workshop in Berlin.
|
21
|
A Robust and Universal Metaproteomics Workflow for Research Studies and Routine Diagnostics Within 24 h Using Phenol Extraction, FASP Digest, and the MetaProteomeAnalyzer. Front Microbiol 2019; 10:1883. [PMID: 31474963 PMCID: PMC6707425 DOI: 10.3389/fmicb.2019.01883] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 02/20/2019] [Accepted: 07/30/2019] [Indexed: 01/29/2023] Open
Abstract
The investigation of microbial proteins by mass spectrometry (metaproteomics) is a key technology for simultaneously assessing the taxonomic composition and the functionality of microbial communities in medical, environmental, and biotechnological applications. We present an improved metaproteomics workflow using an updated sample preparation and a new version of the MetaProteomeAnalyzer software for data analysis. High resolution by multidimensional separation (GeLC, MudPIT) was sacrificed to aim at fast analysis of a broad range of different samples in less than 24 h. The improved workflow generated at least twice as many protein identifications as our previous workflow, and a drastic increase in taxonomic and functional annotations. Improvements of all aspects of the workflow, particularly the speed, are first steps toward potential routine clinical diagnostics (i.e., fecal samples) and analysis of technical and environmental samples. The MetaProteomeAnalyzer is provided to the scientific community as a central remote server solution at www.mpa.ovgu.de.
|
22
|
Challenges and promise at the interface of metaproteomics and genomics: an overview of recent progress in metaproteogenomic data analysis. Expert Rev Proteomics 2019; 16:375-390. [PMID: 31002542 DOI: 10.1080/14789450.2019.1609944] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Indexed: 02/07/2023]
Abstract
INTRODUCTION The study of microbial communities based on the combined analysis of genomic and proteomic data - called metaproteogenomics - has gained increased research attention in recent years. This relatively young field aims to elucidate the functional and taxonomic interplay of proteins in microbiomes and its implications on human health and the environment. Areas covered: This article reviews bioinformatics methods and software tools dedicated to the analysis of data from metaproteomics and metaproteogenomics experiments. In particular, it focuses on the creation of tailored protein sequence databases, on the optimal use of database search algorithms including methods of error rate estimation, and finally on taxonomic and functional annotation of peptide and protein identifications. Expert opinion: Recently, various promising strategies and software tools have been proposed for handling typical data analysis issues in metaproteomics. However, severe challenges remain that are highlighted and discussed in this article; these include: (i) robust false-positive assessment of peptide and protein identifications, (ii) complex protein inference against a background of highly redundant data, (iii) taxonomic and functional post-processing of identification data, and finally, (iv) the assessment and provision of metrics and tools for quantitative analysis.
|
23
|
Evaluating de novo sequencing in proteomics: already an accurate alternative to database-driven peptide identification? Brief Bioinform 2019; 19:954-970. [PMID: 28369237 DOI: 10.1093/bib/bbx033] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 12/08/2016] [Indexed: 01/24/2023] Open
Abstract
While peptide identifications in mass spectrometry (MS)-based shotgun proteomics are mostly obtained using database search methods, high-resolution spectrum data from modern MS instruments nowadays offer the prospect of improving the performance of computational de novo peptide sequencing. The major benefit of de novo sequencing is that it does not require a reference database to deduce full-length or partial tag-based peptide sequences directly from experimental tandem mass spectrometry spectra. Although various algorithms have been developed for automated de novo sequencing, the prediction accuracy of proposed solutions has been rarely evaluated in independent benchmarking studies. The main objective of this work is to provide a detailed evaluation on the performance of de novo sequencing algorithms on high-resolution data. For this purpose, we processed four experimental data sets acquired from different instrument types from collision-induced dissociation and higher energy collisional dissociation (HCD) fragmentation mode using the software packages Novor, PEAKS and PepNovo. Moreover, the accuracy of these algorithms is also tested on ground truth data based on simulated spectra generated from peak intensity prediction software. We found that Novor shows the overall best performance compared with PEAKS and PepNovo with respect to the accuracy of correct full peptide, tag-based and single-residue predictions. In addition, the same tool outpaced the commercial competitor PEAKS in terms of running time speedup by factors of around 12-17. Despite around 35% prediction accuracy for complete peptide sequences on HCD data sets, taken as a whole, the evaluated algorithms perform moderately on experimental data but show a significantly better performance on simulated data (up to 84% accuracy). Further, we describe the most frequently occurring de novo sequencing errors and evaluate the influence of missing fragment ion peaks and spectral noise on the accuracy. 
Finally, we discuss the potential for de novo sequencing to become more widely used in the field.
|
24
|
Editorial for Special Issue: Metaproteomics. Proteomes 2019; 7:proteomes7010009. [PMID: 30841491 PMCID: PMC6473379 DOI: 10.3390/proteomes7010009] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/26/2019] [Accepted: 02/28/2019] [Indexed: 11/16/2022] Open
Abstract
As the proteome-level counterpart of metagenomics, metaproteomics extends conventional single-organism proteomics and allows researchers to characterize the entire protein complement of complex microbiomes on a large scale [...].
|
25
|
Identical and Nonidentical Twins: Risk and Factors Involved in Development of Islet Autoimmunity and Type 1 Diabetes. Diabetes Care 2019; 42:192-199. [PMID: 30061316 PMCID: PMC6341285 DOI: 10.2337/dc18-0288] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 02/07/2018] [Accepted: 06/28/2018] [Indexed: 02/03/2023]
Abstract
OBJECTIVE There are variable reports of risk of concordance for progression to islet autoantibodies and type 1 diabetes in identical twins after one twin is diagnosed. We examined development of positive autoantibodies and type 1 diabetes and the effects of genetic factors and common environment on autoantibody positivity in identical twins, nonidentical twins, and full siblings. RESEARCH DESIGN AND METHODS Subjects from the TrialNet Pathway to Prevention Study (N = 48,026) were screened from 2004 to 2015 for islet autoantibodies (GAD antibody [GADA], insulinoma-associated antigen 2 [IA-2A], and autoantibodies against insulin [IAA]). Of these subjects, 17,226 (157 identical twins, 283 nonidentical twins, and 16,786 full siblings) were followed for autoantibody positivity or type 1 diabetes for a median of 2.1 years. RESULTS At screening, identical twins were more likely to have positive GADA, IA-2A, and IAA than nonidentical twins or full siblings (all P < 0.0001). Younger age, male sex, and genetic factors were significant factors for expression of IA-2A, IAA, one or more positive autoantibodies, and two or more positive autoantibodies (all P ≤ 0.03). Initially autoantibody-positive identical twins had a 69% risk of diabetes by 3 years compared with 1.5% for initially autoantibody-negative identical twins. In nonidentical twins, type 1 diabetes risk by 3 years was 72% for initially multiple autoantibody-positive, 13% for single autoantibody-positive, and 0% for initially autoantibody-negative nonidentical twins. Full siblings had a 3-year type 1 diabetes risk of 47% for multiple autoantibody-positive, 12% for single autoantibody-positive, and 0.5% for initially autoantibody-negative subjects. CONCLUSIONS Risk of type 1 diabetes at 3 years is high for initially multiple and single autoantibody-positive identical twins and multiple autoantibody-positive nonidentical twins. 
Genetic predisposition, age, and male sex are significant risk factors for development of positive autoantibodies in twins.
|
26
|
A Type 1 Diabetes Genetic Risk Score Predicts Progression of Islet Autoimmunity and Development of Type 1 Diabetes in Individuals at Risk. Diabetes Care 2018; 41:1887-1894. [PMID: 30002199 PMCID: PMC6105323 DOI: 10.2337/dc18-0087] [Citation(s) in RCA: 86] [Impact Index Per Article: 14.3] [Received: 01/11/2018] [Accepted: 06/06/2018] [Indexed: 02/03/2023]
Abstract
OBJECTIVE We tested the ability of a type 1 diabetes (T1D) genetic risk score (GRS) to predict progression of islet autoimmunity and T1D in at-risk individuals. RESEARCH DESIGN AND METHODS We studied the 1,244 TrialNet Pathway to Prevention study participants (T1D patients' relatives without diabetes and with one or more positive autoantibodies) who were genotyped with Illumina ImmunoChip (median [range] age at initial autoantibody determination 11.1 years [1.2-51.8], 48% male, 80.5% non-Hispanic white, median follow-up 5.4 years). Of 291 participants with a single positive autoantibody at screening, 157 converted to multiple autoantibody positivity and 55 developed diabetes. Of 953 participants with multiple positive autoantibodies at screening, 419 developed diabetes. We calculated the T1D GRS from 30 T1D-associated single nucleotide polymorphisms. We used multivariable Cox regression models, time-dependent receiver operating characteristic curves, and area under the curve (AUC) measures to evaluate prognostic utility of T1D GRS, age, sex, Diabetes Prevention Trial-Type 1 (DPT-1) Risk Score, positive autoantibody number or type, HLA DR3/DR4-DQ8 status, and race/ethnicity. We used recursive partitioning analyses to identify cut points in continuous variables. RESULTS Higher T1D GRS significantly increased the rate of progression to T1D adjusting for DPT-1 Risk Score, age, number of positive autoantibodies, sex, and ethnicity (hazard ratio [HR] 1.29 for a 0.05 increase, 95% CI 1.06-1.6; P = 0.011). Progression to T1D was best predicted by a combined model with GRS, number of positive autoantibodies, DPT-1 Risk Score, and age (7-year time-integrated AUC = 0.79, 5-year AUC = 0.73). Higher GRS was significantly associated with increased progression rate from single to multiple positive autoantibodies after adjusting for age, autoantibody type, ethnicity, and sex (HR 2.27 for GRS >0.295, 95% CI 1.47-3.51; P = 0.0002). 
CONCLUSIONS The T1D GRS independently predicts progression to T1D and improves prediction along T1D stages in autoantibody-positive relatives.
|
27
|
A Potential Golden Age to Come-Current Tools, Recent Use Cases, and Future Avenues for De Novo Sequencing in Proteomics. Proteomics 2018; 18:e1700150. [PMID: 29968278 DOI: 10.1002/pmic.201700150] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 03/23/2018] [Revised: 05/23/2018] [Indexed: 01/15/2023]
Abstract
In shotgun proteomics, peptide and protein identification is most commonly conducted using database search engines, the method of choice when reference protein sequences are available. Despite its widespread use the database-driven approach is limited, mainly because of its static search space. In contrast, de novo sequencing derives peptide sequence information in an unbiased manner, using only the fragment ion information from the tandem mass spectra. In recent years, with the improvements in MS instrumentation, various new methods have been proposed for de novo sequencing. This review article provides an overview of existing de novo sequencing algorithms and software tools ranging from peptide sequencing to sequence-to-protein mapping. Various use cases are described for which de novo sequencing was successfully applied. Finally, limitations of current methods are highlighted and new directions are discussed for a wider acceptance of de novo sequencing in the community.
|
28
|
Disseminating Metaproteomic Informatics Capabilities and Knowledge Using the Galaxy-P Framework. Proteomes 2018; 6:proteomes6010007. [PMID: 29385081 PMCID: PMC5874766 DOI: 10.3390/proteomes6010007] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 12/11/2017] [Revised: 01/26/2018] [Accepted: 01/26/2018] [Indexed: 01/12/2023] Open
Abstract
The impact of microbial communities, also known as the microbiome, on human health and the environment is receiving increased attention. Studying translated gene products (proteins) and comparing metaproteomic profiles may elucidate how microbiomes respond to specific environmental stimuli, and interact with host organisms. Characterizing proteins expressed by a complex microbiome and interpreting their functional signature requires sophisticated informatics tools and workflows tailored to metaproteomics. Additionally, there is a need to disseminate these informatics resources to researchers undertaking metaproteomic studies, who could use them to make new and important discoveries in microbiome research. The Galaxy for proteomics platform (Galaxy-P) offers an open source, web-based bioinformatics platform for disseminating metaproteomics software and workflows. Within this platform, we have developed easily-accessible and documented metaproteomic software tools and workflows aimed at training researchers in their operation and disseminating the tools for more widespread use. The modular workflows encompass the core requirements of metaproteomic informatics: (a) database generation; (b) peptide spectral matching; (c) taxonomic analysis and (d) functional analysis. Much of the software available via the Galaxy-P platform was selected, packaged and deployed through an online metaproteomics "Contribution Fest" undertaken by a unique consortium of expert software developers and users from the metaproteomics research community, who have co-authored this manuscript. These resources are documented on GitHub and freely available through the Galaxy Toolshed, as well as a publicly accessible metaproteomics gateway Galaxy instance. These documented workflows are well suited for the training of novice metaproteomics researchers, through online resources such as the Galaxy Training Network, as well as hands-on training workshops. 
Here, we describe the metaproteomics tools available within these Galaxy-based resources, as well as the process by which they were selected and implemented in our community-based work. We hope this description will increase access to and utilization of metaproteomics tools, as well as offer a framework for continued community-based development and dissemination of cutting edge metaproteomics software.
|
29
|
Abstract
Metaproteomics, the mass spectrometry-based analysis of proteins from multispecies samples, faces severe challenges concerning data analysis and results interpretation. To overcome these shortcomings, we here introduce the MetaProteomeAnalyzer (MPA) Portable software. In contrast to the original server-based MPA application, this newly developed tool no longer requires computational expertise for installation and is now independent of any relational database system. In addition, MPA Portable now supports state-of-the-art database search engines and a convenient command line interface for high-performance data processing tasks. While search engine results can easily be combined to increase the protein identification yield, an additional two-step workflow is implemented to provide sufficient analysis resolution for further postprocessing steps, such as protein grouping as well as taxonomic and functional annotation. Our new application has been developed with a focus on intuitive usability, adherence to data standards, and adaptation to Web-based workflow platforms. The open source software package can be found at https://github.com/compomics/meta-proteome-analyzer.
|
30
|
The impact of sequence database choice on metaproteomic results in gut microbiota studies. Microbiome 2016; 4:51. [PMID: 27671352 PMCID: PMC5037606 DOI: 10.1186/s40168-016-0196-8] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Received: 06/15/2016] [Accepted: 09/12/2016] [Indexed: 05/23/2023]
Abstract
BACKGROUND Elucidating the role of gut microbiota in physiological and pathological processes has recently emerged as a key research aim in life sciences. In this respect, metaproteomics, the study of the whole protein complement of a microbial community, can provide a unique contribution by revealing which functions are actually being expressed by specific microbial taxa. However, its wide application to gut microbiota research has been hindered by challenges in data analysis, especially related to the choice of the proper sequence databases for protein identification. RESULTS Here, we present a systematic investigation of variables concerning database construction and annotation and evaluate their impact on human and mouse gut metaproteomic results. We found that both publicly available and experimental metagenomic databases lead to the identification of unique peptide assortments, suggesting parallel database searches as a means to gain more complete information. In particular, the contribution of experimental metagenomic databases was revealed to be mandatory when dealing with mouse samples. Moreover, the use of a "merged" database, containing all metagenomic sequences from the population under study, was found to be generally preferable over the use of sample-matched databases. We also observed that taxonomic and functional results are strongly database-dependent, in particular when analyzing the mouse gut microbiota. As a striking example, the Firmicutes/Bacteroidetes ratio varied up to tenfold depending on the database used. Finally, assembling reads into longer contigs provided significant advantages in terms of functional annotation yields. CONCLUSIONS This study helps to identify host- and database-specific biases that need to be taken into account in a metaproteomic experiment, providing meaningful insights on how to design gut microbiota studies and to perform metaproteomic data analysis. 
In particular, the use of multiple databases and annotation tools has to be encouraged, even though this requires appropriate bioinformatic resources.
|
31
|
Metaproteomic data analysis at a glance: advances in computational microbial community proteomics. Expert Rev Proteomics 2016; 13:757-69. [DOI: 10.1080/14789450.2016.1209418] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Indexed: 02/07/2023]
|
32
|
Mental health among currently enrolled medical students in Germany. Public Health 2016; 132:92-100. [PMID: 26880490 DOI: 10.1016/j.puhe.2015.12.014] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Received: 06/15/2015] [Revised: 12/25/2015] [Accepted: 12/30/2015] [Indexed: 11/25/2022]
Abstract
OBJECTIVES The study identifies the prevalence of common mental disorders according to the patient health questionnaire (PHQ) and the use of psychotropic substances in a sample of currently enrolled medical students. STUDY DESIGN A cross-sectional survey with a self-administered questionnaire. METHODS All newly enrolled medical students at the University of Dusseldorf who began their studies in either 2012 or 2013 were invited to participate. The evaluation was based on 590 completed questionnaires. Mental health outcomes were measured by the PHQ, including major depression, other depressive symptoms (subthreshold depression), anxiety, panic disorders and psychosomatic complaints. Moreover, information about psychotropic substance use (including medication) was obtained. Multiple logistic regression analysis was used to estimate associations between sociodemographic and socio-economic factors and mental health outcomes. RESULTS The prevalence rates, measured by the PHQ, were 4.7% for major depression, 5.8% for other depressive symptoms, 4.4% for anxiety, 1.9% for panic disorders, and 15.7% for psychosomatic complaints. These prevalence rates were higher than those reported in the general population, but lower than in medical students in the course of medical training. In all, 10.7% of the students reported regular psychotropic substance use: 5.1% of students used medication 'to calm down,' 4.6% 'to improve their sleep,' 4.4% 'to elevate mood,' and 3.1% 'to improve cognitive performance.' In the fully adjusted model, expected financial difficulties were significantly associated with poor mental health (odds ratio [OR]: 2.14; 95% confidence interval [CI]: 1.31-3.48), psychosomatic symptoms (OR: 1.85; 95% CI: 1.11-3.09) and psychotropic substance use (OR: 2.68; 95% CI: 1.51-4.75). CONCLUSION The high rates of mental disorders among currently enrolled medical students call for the promotion of mental health, with a special emphasis on vulnerable groups.
|
33
|
Colonic metaproteomic signatures of active bacteria and the host in obesity. Proteomics 2015; 15:3544-52. [DOI: 10.1002/pmic.201500049] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Received: 02/02/2015] [Revised: 07/03/2015] [Accepted: 07/24/2015] [Indexed: 11/10/2022]
|
34
|
Abstract
The study aimed at surveying and analysing the prevailing risks for medical students due to so-called needlestick injuries, i.e., injuries to the skin from handling sharp objects by which blood of patients can be transmitted to the health professional. After introducing preventive measures in a typical German university hospital, a total of 1,903 students of human medicine in their clinical period from 2009 to 2012 (from a total of 2,024 subjects, a rate of 94.0%) were questioned in detail about potential needlestick or other injuries related to their work. The results show that such injuries happen particularly during the clinical period of medical studies: while only 20.6% of the students indicated a needlestick injury at the beginning of this period, half of the students (50.9%) had experienced at least one injury by the end of the clinical period. The activities mentioned most frequently were taking blood samples and giving injections. Needlestick injuries happened most frequently in surgical units, in internal medicine, and in gynaecology. Accidents happened mostly during secondary employment, medical traineeship, or in the context of practical nursing. Consequently, measures to improve primary prevention should start with training on the one hand: briefing alone seems to be insufficient, whereas intensive exercises in using stick-proof instruments seem more promising. On the other hand, the comprehensive introduction of stick-proof instruments has to be supported.
|
35
|
Navigating through metaproteomics data: a logbook of database searching. Proteomics 2015; 15:3439-53. [PMID: 25778831 DOI: 10.1002/pmic.201400560] [Citation(s) in RCA: 90] [Impact Index Per Article: 10.0] [Received: 11/28/2014] [Revised: 02/13/2015] [Accepted: 03/06/2015] [Indexed: 11/12/2022]
Abstract
Metaproteomic research involves various computational challenges during the identification of fragmentation spectra acquired from the proteome of a complex microbiome. These issues are manifold and range from the construction of customized sequence databases, the optimal setting of search parameters to limitations in the identification search algorithms themselves. In order to assess the importance of these individual factors, we studied the effect of strategies to combine different search algorithms, explored the influence of chosen database search settings, and investigated the impact of the size of the protein sequence database used for identification. Furthermore, we applied de novo sequencing as a complementary approach to classic database searching. All evaluations were performed on a human intestinal metaproteome dataset. Pyrococcus furiosus proteome data were used to contrast database searching of metaproteomic data to a classic proteomic experiment. Searching against subsets of metaproteome databases and the use of multiple search engines increased the number of identifications. The integration of P. furiosus sequences in a metaproteomic sequence database showcased the limitation of the target-decoy-controlled false discovery rate approach in combination with large sequence databases. The selection of varying search engine parameters and the application of de novo sequencing represented useful methods to increase the reliability of the results. Based on our findings, we provide recommendations for the data analysis that help researchers to establish or improve analysis workflows in metaproteomics.
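The abstract above refers to the target-decoy-controlled false discovery rate without spelling it out. As a rough illustration only, the sketch below uses the commonly applied simple estimator (decoy hits divided by target hits above a score threshold); it is not taken from the paper, and the function names, the score scale, and the example data are all hypothetical.

```python
from typing import List, Optional, Tuple

# Each peptide-spectrum match (PSM) is (score, is_decoy); higher score = better match.
PSM = Tuple[float, bool]


def target_decoy_fdr(psms: List[PSM], threshold: float) -> float:
    """Estimate the FDR as decoy hits / target hits among PSMs scoring >= threshold."""
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # no accepted target hits, so nothing to estimate
    return decoys / targets


def score_cutoff_for_fdr(psms: List[PSM], max_fdr: float) -> Optional[float]:
    """Return the lowest observed score whose estimated FDR is <= max_fdr, or None."""
    best = None
    for threshold in sorted({score for score, _ in psms}, reverse=True):
        if target_decoy_fdr(psms, threshold) <= max_fdr:
            best = threshold
    return best


# Hypothetical toy data: six target hits and two decoy hits.
example_psms: List[PSM] = [
    (9.1, False), (8.7, False), (8.2, False), (7.9, True),
    (7.5, False), (6.8, False), (6.1, True), (5.9, False),
]

print(target_decoy_fdr(example_psms, 7.5))      # 1 decoy / 4 targets
print(score_cutoff_for_fdr(example_psms, 0.1))  # strictest cutoff region without decoys
```

One intuition this sketch may help with: when the searched database grows, more random high-scoring matches (targets and decoys alike) appear, so the score cutoff needed to hold a fixed FDR tends to rise and fewer true identifications are retained.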
|
36
|
The MetaProteomeAnalyzer: A Powerful Open-Source Software Suite for Metaproteomics Data Analysis and Interpretation. J Proteome Res 2015; 14:1557-65. [DOI: 10.1021/pr501246w] [Citation(s) in RCA: 124] [Impact Index Per Article: 13.8] [Indexed: 02/08/2023]
|
37
|
Viewing the proteome: how to visualize proteomics data? Proteomics 2015; 15:1341-55. [PMID: 25504833 DOI: 10.1002/pmic.201400412] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 08/25/2014] [Revised: 10/23/2014] [Accepted: 12/05/2014] [Indexed: 01/18/2023]
Abstract
Proteomics has become one of the main approaches for analyzing and understanding biological systems. Yet similar to other high-throughput analysis methods, the presentation of the large amounts of obtained data in easily interpretable ways remains challenging. In this review, we present an overview of the different ways in which proteomics software supports the visualization and interpretation of proteomics data. The unique challenges and current solutions for visualizing the different aspects of proteomics data, from acquired spectra via protein identification and quantification to pathway analysis, are discussed, and examples of the most useful visualization approaches are highlighted. Finally, we offer our ideas about future directions for proteomics data visualization.
|
38
|
Sample prefractionation with liquid isoelectric focusing enables in depth microbial metaproteome analysis of mesophilic and thermophilic biogas plants. Anaerobe 2014; 29:59-67. [DOI: 10.1016/j.anaerobe.2013.11.009] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 06/11/2013] [Revised: 11/22/2013] [Accepted: 11/25/2013] [Indexed: 12/20/2022]
|
39
|
Comparative performance of four methods for high-throughput glycosylation analysis of immunoglobulin G in genetic and epidemiological research. Mol Cell Proteomics 2014; 13:1598-610. [PMID: 24719452 PMCID: PMC4047478 DOI: 10.1074/mcp.m113.037465] [Citation(s) in RCA: 129] [Impact Index Per Article: 12.9] [Received: 01/02/2014] [Revised: 03/14/2014] [Indexed: 11/06/2022] Open
Abstract
The biological and clinical relevance of glycosylation is becoming increasingly recognized, leading to a growing interest in large-scale clinical and population-based studies. In the past few years, several methods for high-throughput analysis of glycans have been developed, but thorough validation and standardization of these methods is required before significant resources are invested in large-scale studies. In this study, we compared liquid chromatography, capillary gel electrophoresis, and two MS methods for quantitative profiling of N-glycosylation of IgG in the same data set of 1201 individuals. To evaluate the accuracy of the four methods we then performed analysis of association with genetic polymorphisms and age. Chromatographic methods with either fluorescent or MS-detection yielded slightly stronger associations than MS-only and multiplexed capillary gel electrophoresis, but at the expense of lower levels of throughput. Advantages and disadvantages of each method were identified, which should inform the selection of the most appropriate method in future studies.
|
40
|
FRI0200 Is the Measure of Effort-Reward Imbalance at Work Valid in Patients with Systemic Lupus Erythematosus and Rheumatoid Arthritis? Ann Rheum Dis 2014. [DOI: 10.1136/annrheumdis-2014-eular.2517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
|
41
|
DeNovoGUI: an open source graphical user interface for de novo sequencing of tandem mass spectra. J Proteome Res 2014; 13:1143-6. [PMID: 24295440 PMCID: PMC3923451 DOI: 10.1021/pr4008078] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022]
Abstract
De novo sequencing is a popular technique in proteomics for identifying peptides from tandem mass spectra without having to rely on a protein sequence database. Despite the strong potential of de novo sequencing algorithms, their adoption threshold remains quite high. We here present a user-friendly and lightweight graphical user interface called DeNovoGUI for running parallelized versions of the freely available de novo sequencing software PepNovo+, greatly simplifying the use of de novo sequencing in proteomics. Our platform-independent software is freely available under the permissible Apache2 open source license. Source code, binaries, and additional documentation are available at http://denovogui.googlecode.com.
|
42
|
Searching for a needle in a stack of needles: challenges in metaproteomics data analysis. Mol Biosyst 2013; 9:578-85. [PMID: 23238088 DOI: 10.1039/c2mb25415h] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]
Abstract
In the past years the integral study of microbial communities of varying complexity has gained increasing research interest. Mass spectrometry-driven metaproteomics enables the analysis of such communities on the functional level, but this fledgling field still faces various technical and semantic challenges regarding experimental data analysis and interpretation. In the present review, we outline the hurdles involved and attempt to cover the most valuable methods and software implementations available to researchers in the field today. Beyond merely focusing on protein identification, we provide an overview on different data pre- and post-processing steps, such as metabolic pathway analysis, that can be useful in a typical metaproteomics workflow. Finally, we briefly discuss directions for future work.
|
43
|
ProteoCloud: A full-featured open source proteomics cloud computing pipeline. J Proteomics 2013; 88:104-8. [DOI: 10.1016/j.jprot.2012.12.026] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2012] [Revised: 12/17/2012] [Accepted: 12/21/2012] [Indexed: 01/08/2023]
|
44
|
glyXalign: High-throughput migration time alignment preprocessing of electrophoretic data retrieved via multiplexed capillary gel electrophoresis with laser-induced fluorescence detection-based glycoprofiling. Electrophoresis 2013; 34:2311-5. [DOI: 10.1002/elps.201200696] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2012] [Revised: 03/07/2013] [Accepted: 03/08/2013] [Indexed: 12/19/2022]
|
45
|
FRI0537 Self-reported health status and effort-reward imbalance in patients with rheumatoid arthritis and systemic lupus erythematosus. Ann Rheum Dis 2013. [DOI: 10.1136/annrheumdis-2013-eular.1664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]
|
46
|
FRI0538 Gender-specific effort-reward imbalance in patients with systemic lupus erythematosus? Ann Rheum Dis 2013. [DOI: 10.1136/annrheumdis-2013-eular.1665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]
|
47
|
JDet: interactive calculation and visualization of function-related conservation patterns in multiple sequence alignments and structures. Bioinformatics 2011; 28:584-6. [DOI: 10.1093/bioinformatics/btr688] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
|
48
|
XTandem Parser: an open-source library to parse and analyse X!Tandem MS/MS search results. Proteomics 2010; 10:1522-4. [PMID: 20140905 DOI: 10.1002/pmic.200900759] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
Identification of proteins by MS plays an important role in proteomics. A crucial step concerns the identification of peptides from MS/MS spectra. The X!Tandem Project (http://www.thegpm.org/tandem) supplies an open-source search engine for this purpose. In this study, we present an open-source Java library called XTandem Parser that parses X!Tandem XML result files into an easily accessible and fully functional object model (http://xtandem-parser.googlecode.com). In addition, a graphical user interface is provided that functions as a usage example and an end-user visualization tool.
|
49
|
Erratum: ms_lims, a simple yet powerful open source laboratory information management system for MS-driven proteomics. Proteomics 2010. [DOI: 10.1002/pmic.201090056] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
|
50
|
Erratum: XTandem Parser: An open-source library to parse and analyse X!Tandem MS/MS search results. Proteomics 2010. [DOI: 10.1002/pmic.201090058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
|