1
Ivanov MV, Kopeykina AS, Gorshkov MV. Reanalysis of DIA Data Demonstrates the Capabilities of MS/MS-Free Proteomics to Reveal New Biological Insights in Disease-Related Samples. J Am Soc Mass Spectrom 2024; 35:1775-1785. [PMID: 38938158 DOI: 10.1021/jasms.4c00134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024]
Abstract
Data-independent acquisition (DIA) at the shortened data acquisition time is becoming a method of choice for quantitative proteomic applications requiring high throughput analysis of large cohorts of samples. With the advent of the combination of high resolution mass spectrometry with an asymmetric track lossless analyzer, these DIA capabilities were further extended with the recent demonstration of quantitative analyses at the speed of up to hundreds of samples per day. In particular, the proteomic data for the brain samples related to multiple system atrophy disease were acquired using 7 and 28 min chromatography gradients (Guzman et al., Nat. Biotech. 2024). In this work, we applied the recently introduced DirectMS1 method to reanalysis of these data using only MS1 spectra. Both DirectMS1 and DIA results were matched against long gradient DDA analysis from the earlier study of the same sample cohort. While the quantitation efficiency of DirectMS1 was comparable with DIA on the same data sets, we found an additional five proteins of biological significance relevant to the analyzed tissue samples. Among the findings, DirectMS1 was able to detect decreased caspase activity for Vimentin protein in the multiple system atrophy samples missed by the MS/MS-based quantitation methods. Our study suggests that DirectMS1 can be an efficient MS1-only addition to the analysis of DIA data in high-throughput quantitative proteomic studies.
Affiliation(s)
- Mark V Ivanov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Anna S Kopeykina
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Mikhail V Gorshkov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
2
Lou R, Shui W. Acquisition and Analysis of DIA-Based Proteomic Data: A Comprehensive Survey in 2023. Mol Cell Proteomics 2024; 23:100712. [PMID: 38182042 PMCID: PMC10847697 DOI: 10.1016/j.mcpro.2024.100712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/27/2023] [Accepted: 01/02/2024] [Indexed: 01/07/2024] Open
Abstract
Data-independent acquisition (DIA) mass spectrometry (MS) has emerged as a powerful technology for high-throughput, accurate, and reproducible quantitative proteomics. This review provides a comprehensive overview of recent advances in both the experimental and computational methods for DIA proteomics, from data acquisition schemes to analysis strategies and software tools. DIA acquisition schemes are categorized based on the design of precursor isolation windows, highlighting wide-window, overlapping-window, narrow-window, scanning quadrupole-based, and parallel accumulation-serial fragmentation-enhanced DIA methods. For DIA data analysis, major strategies are classified into spectrum reconstruction, sequence-based search, library-based search, de novo sequencing, and sequencing-independent approaches. A wide array of software tools implementing these strategies are reviewed, with details on their overall workflows and scoring approaches at different steps. The generation and optimization of spectral libraries, which are critical resources for DIA analysis, are also discussed. Publicly available benchmark datasets covering global proteomics and phosphoproteomics are summarized to facilitate performance evaluation of various software tools and analysis workflows. Continued advances and synergistic developments of versatile components in DIA workflows are expected to further enhance the power of DIA-based proteomics.
Affiliation(s)
- Ronghui Lou
  - iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China.
- Wenqing Shui
  - iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China.
3
The M, Picciani M, Jensen C, Gabriel W, Kuster B, Wilhelm M. AI-Assisted Processing Pipeline to Boost Protein Isoform Detection. Methods Mol Biol 2024; 2836:157-181. [PMID: 38995541 DOI: 10.1007/978-1-0716-4007-4_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2024]
Abstract
Proteomics, the study of proteins within biological systems, has seen remarkable advancements in recent years, with protein isoform detection emerging as one of the next major frontiers. One of the primary challenges is achieving the necessary peptide and protein coverage to confidently differentiate isoforms as a result of the protein inference problem and protein false discovery rate estimation challenge in large data. In this chapter, we describe the application of artificial intelligence-assisted peptide property prediction for database search engine rescoring by Oktoberfest, an approach that has proven effective, particularly for complex samples and extensive search spaces, which can greatly increase peptide coverage. Further, it illustrates a method for increasing isoform coverage by the PickedGroupFDR approach that is designed to excel when applied on large data. Real-world examples are provided to illustrate the utility of the tools in the context of rescoring, protein grouping, and false discovery rate estimation. By implementing these cutting-edge techniques, researchers can achieve a substantial increase in both peptide and isoform coverage, thus unlocking the potential of protein isoform detection in their studies and shedding light on their roles and functions in biological processes.
Affiliation(s)
- Matthew The
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Mario Picciani
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Cecilia Jensen
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Wassim Gabriel
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Bernhard Kuster
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Mathias Wilhelm
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany.
4
Bayer FP, Gander M, Kuster B, The M. CurveCurator: a recalibrated F-statistic to assess, classify, and explore significance of dose-response curves. Nat Commun 2023; 14:7902. [PMID: 38036588 PMCID: PMC10689459 DOI: 10.1038/s41467-023-43696-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/25/2023] [Accepted: 11/16/2023] [Indexed: 12/02/2023] Open
Abstract
Dose-response curves are key metrics in pharmacology and biology to assess phenotypic or molecular actions of bioactive compounds in a quantitative fashion. Yet, it is often unclear whether or not a measured response significantly differs from a curve without regulation, particularly in high-throughput applications or unstable assays. Treating potency and effect size estimates from random and true curves with the same level of confidence can lead to incorrect hypotheses and issues in training machine learning models. Here, we present CurveCurator, an open-source software that provides reliable dose-response characteristics by computing p-values and false discovery rates based on a recalibrated F-statistic and a target-decoy procedure that considers dataset-specific effect size distributions. The application of CurveCurator to three large-scale datasets enables a systematic drug mode of action analysis and demonstrates its scalable utility across several application areas, facilitated by a performant, interactive dashboard for fast data exploration.
Affiliation(s)
- Florian P Bayer
  - Proteomics and Bioanalytics, School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
- Manuel Gander
  - Proteomics and Bioanalytics, School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
- Bernhard Kuster
  - Proteomics and Bioanalytics, School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
  - German Cancer Consortium (DKTK), Partner Site Munich, 80336, Munich, Germany
- Matthew The
  - Proteomics and Bioanalytics, School of Life Sciences, Technical University of Munich, 85354, Freising, Germany.
5
Postoenko VI, Garibova LA, Levitsky LI, Bubis JA, Gorshkov MV, Ivanov MV. IQMMA: Efficient MS1 Intensity Extraction Pipeline Using Multiple Feature Detection Algorithms for DDA Proteomics. J Proteome Res 2023; 22:2827-2835. [PMID: 37579078 DOI: 10.1021/acs.jproteome.3c00075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/16/2023]
Abstract
One of the key steps in data dependent acquisition (DDA) proteomics is detection of peptide isotopic clusters, also called "features", in MS1 spectra and matching them to MS/MS-based peptide identifications. A number of peptide feature detection tools became available in recent years, each relying on its own matching algorithm. Here, we provide an integrated solution, the intensity-based Quantitative Mix and Match Approach (IQMMA), which integrates a number of untargeted peptide feature detection algorithms and returns the most probable intensity values for the MS/MS-based identifications. IQMMA was tested using available proteomic data acquired for both well-characterized (ground truth) and real-world biological samples, including a mix of Yeast and E. coli digests spiked at different concentrations into the Human K562 digest used as a background, and a set of glioblastoma cell lines. Three open-source feature detection algorithms were integrated: Dinosaur, biosaur2, and OpenMS FeatureFinder. None of them was found optimal when applied individually to all the data sets employed in this work; however, their combined use in IQMMA improved efficiency of subsequent protein quantitation. The software implementing IQMMA is freely available at https://github.com/PostoenkoVI/IQMMA under Apache 2.0 license.
Affiliation(s)
- Valeriy I Postoenko
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
  - Moscow Institute of Physics and Technology, National Research University, G. Dolgoprudny, Institutsky Lane 9, Dolgoprudny 141701, Russia
- Leyla A Garibova
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
  - Moscow Institute of Physics and Technology, National Research University, G. Dolgoprudny, Institutsky Lane 9, Dolgoprudny 141701, Russia
- Lev I Levitsky
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Julia A Bubis
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Mikhail V Gorshkov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Mark V Ivanov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
6
Neely BA, Dorfer V, Martens L, Bludau I, Bouwmeester R, Degroeve S, Deutsch EW, Gessulat S, Käll L, Palczynski P, Payne SH, Rehfeldt TG, Schmidt T, Schwämmle V, Uszkoreit J, Vizcaíno JA, Wilhelm M, Palmblad M. Toward an Integrated Machine Learning Model of a Proteomics Experiment. J Proteome Res 2023; 22:681-696. [PMID: 36744821 PMCID: PMC9990124 DOI: 10.1021/acs.jproteome.2c00711] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 02/07/2023]
Abstract
In recent years machine learning has made extensive progress in modeling many aspects of mass spectrometry data. We brought together proteomics data generators, repository managers, and machine learning experts in a workshop with the goals to evaluate and explore machine learning applications for realistic modeling of data from multidimensional mass spectrometry-based proteomics analysis of any sample or organism. Following this sample-to-data roadmap helped identify knowledge gaps and define needs. Being able to generate bespoke and realistic synthetic data has legitimate and important uses in system suitability, method development, and algorithm benchmarking, while also posing critical ethical questions. The interdisciplinary nature of the workshop informed discussions of what is currently possible and future opportunities and challenges. In the following perspective we summarize these discussions in the hope of conveying our excitement about the potential of machine learning in proteomics and to inspire future research.
Affiliation(s)
- Benjamin A Neely
  - National Institute of Standards and Technology, Charleston, South Carolina 29412, United States
- Viktoria Dorfer
  - Bioinformatics Research Group, University of Applied Sciences Upper Austria, Softwarepark 11, 4232 Hagenberg, Austria
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
- Isabell Bludau
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
- Robbin Bouwmeester
  - VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
- Sven Degroeve
  - VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
- Eric W Deutsch
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Lukas Käll
  - Science for Life Laboratory, KTH - Royal Institute of Technology, 171 21 Solna, Sweden
- Pawel Palczynski
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense, Denmark
- Samuel H Payne
  - Department of Biology, Brigham Young University, Provo, Utah 84602, United States
- Tobias Greisager Rehfeldt
  - Institute for Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark
- Veit Schwämmle
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense, Denmark
- Julian Uszkoreit
  - Medical Proteome Analysis, Center for Protein Diagnostics (ProDi), Ruhr University Bochum, 44801 Bochum, Germany
  - Medizinisches Proteom-Center, Medical Faculty, Ruhr University Bochum, 44801 Bochum, Germany
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Mathias Wilhelm
  - Computational Mass Spectrometry, Technical University of Munich (TUM), 85354 Freising, Germany
- Magnus Palmblad
  - Leiden University Medical Center, Postbus 9600, 2300 RC Leiden, The Netherlands
7
The M, Käll L. Integrating Identification and Quantification Uncertainty for Differential Protein Abundance Analysis with Triqler. Methods Mol Biol 2023; 2426:91-117. [PMID: 36308686 DOI: 10.1007/978-1-0716-1967-4_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Protein quantification for shotgun proteomics is a complicated process where errors can be introduced in each of the steps. Triqler is a Python package that estimates and integrates errors of the different parts of the label-free protein quantification pipeline into a single Bayesian model. Specifically, it weighs the quantitative values by the confidence we have in the correctness of the corresponding PSM. Furthermore, it treats missing values in a way that reflects their uncertainty relative to observed values. Finally, it combines these error estimates in a single differential abundance FDR that not only reflects the errors and uncertainties in quantification but also in identification. In this tutorial, we show how to (1) generate input data for Triqler from quantification packages such as MaxQuant and Quandenser, (2) run Triqler and what the different options are, (3) interpret the results, (4) investigate the posterior distributions of a protein of interest in detail, and (5) verify that the hyperparameter estimations are sensible.
Affiliation(s)
- Matthew The
  - Chair of Proteomics and Bioanalytics, Technische Universität München, Freising, Germany.
- Lukas Käll
  - Science for Life Laboratory, KTH Royal Institute of Technology, Solna, Sweden
8
Ryu SY, Yun MP, Kim S. Integrating Multiple Quantitative Proteomic Analyses Using MetaMSD. Methods Mol Biol 2023; 2426:361-374. [PMID: 36308697 DOI: 10.1007/978-1-0716-1967-4_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
MetaMSD is a proteomic software that integrates multiple quantitative mass spectrometry data analysis results using statistical summary combination approaches. By utilizing this software, scientists can combine results from their pilot and main studies to maximize their biomarker discovery while effectively controlling false discovery rates. It also works for combining proteomic datasets generated by different labeling techniques and/or different types of mass spectrometry instruments. With these advantages, MetaMSD enables biological researchers to explore various proteomic datasets in public repositories to discover new biomarkers and generate interesting hypotheses for future studies. In this protocol, we provide a step-by-step procedure on how to install and perform a meta-analysis for quantitative proteomics using MetaMSD.
Affiliation(s)
- So Young Ryu
  - School of Public Health, University of Nevada Reno, Reno, NV, USA.
- Miriam P Yun
  - Department of Psychology Institute for Neuroscience, University of Nevada Reno, Reno, NV, USA
- Sujung Kim
  - School of Public Health, University of Nevada Reno, Reno, NV, USA
9
Ivanov MV, Bubis JA, Gorshkov V, Tarasova IA, Levitsky LI, Solovyeva EM, Lipatova AV, Kjeldsen F, Gorshkov MV. DirectMS1Quant: Ultrafast Quantitative Proteomics with MS/MS-Free Mass Spectrometry. Anal Chem 2022; 94:13068-13075. [PMID: 36094425 DOI: 10.1021/acs.analchem.2c02255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Recently, we presented the DirectMS1 method of ultrafast proteome-wide analysis based on minute-long LC gradients and MS1-only mass spectra acquisition. Currently, the method provides the depth of human cell proteome coverage of 2500 proteins at a 1% false discovery rate (FDR) when using 5 min LC gradients and 7.3 min runtime in total. While the standard MS/MS approaches provide 4000-5000 protein identifications within a couple of hours of instrumentation time, we advocate here that the higher number of identified proteins does not always translate into better quantitation quality of the proteome analysis. To further elaborate on this issue, we performed a one-on-one comparison of quantitation results obtained using DirectMS1 with three popular MS/MS-based quantitation methods: label-free (LFQ) and tandem mass tag (TMT) quantitation, both based on data-dependent acquisition (DDA), and data-independent acquisition (DIA). For comparison, we performed a series of proteome-wide analyses of well-characterized (ground truth) and biologically relevant samples, including a mix of UPS1 proteins spiked at different concentrations into an Escherichia coli digest used as a background and a set of glioblastoma cell lines. MS1-only data was analyzed using a novel quantitation workflow called DirectMS1Quant developed in this work. The results obtained in this study demonstrated comparable quantitation efficiency of 5 min DirectMS1 with both TMT and DIA methods, yet the latter two utilized a 10-20-fold longer instrumentation time.
Affiliation(s)
- Mark V Ivanov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
- Julia A Bubis
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
- Vladimir Gorshkov
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
- Irina A Tarasova
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
- Lev I Levitsky
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
- Elizaveta M Solovyeva
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
- Anastasiya V Lipatova
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
- Frank Kjeldsen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
- Mikhail V Gorshkov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
10
Pelosi B. Developing a bioinformatics pipeline for comparative protein classification analysis. BMC Genom Data 2022; 23:43. [PMID: 35668373 PMCID: PMC9172112 DOI: 10.1186/s12863-022-01045-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Accepted: 03/11/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND: Protein classification is a task of paramount importance in various fields of biology. Despite the great momentum of modern implementation of protein classification, machine learning techniques such as Random Forest and Neural Network could not always be used for several reasons: data collection, unbalanced classification or labelling of the data. As an alternative, I propose the use of a bioinformatics pipeline to search for and classify information from protein databases. Hence, to evaluate the efficiency and accuracy of the pipeline, I focused on the carotenoid biosynthetic genes and developed a filtering approach to retrieve ortholog clusters in two well-studied plants that belong to the Brassicaceae family: Arabidopsis thaliana and Brassica rapa Pekinensis group. The result obtained has been compared with previous studies on carotenoid biosynthetic genes in B. rapa where phylogenetic analysis was conducted.
RESULTS: The developed bioinformatics pipeline relies on commercial software and multiple databases including the use of phylogeny, Gene Ontology terms (GOs) and Protein Families (Pfams) at a protein level. Furthermore, the phylogeny is coupled with "population analysis" to evaluate the potential orthologs. All the steps taken together give a final table of potential orthologs. The phylogenetic tree gives a result of 43 putative orthologs conserved in B. rapa Pekinensis group. Different A. thaliana proteins have more than one syntenic ortholog as also shown in a previous finding (Li et al., BMC Genomics 16(1):1-11, 2015).
CONCLUSIONS: This study demonstrates that, when the biological features of proteins of interest are not specific, I can rely on a computational approach in filtering steps for classification purposes. The comparison of the results obtained here for the carotenoid biosynthetic genes with previous research confirmed the accuracy of the developed pipeline which can therefore be applied for filtering different types of datasets.
Affiliation(s)
- Benedetta Pelosi
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden.
11
Wang B, Wang Y, Chen Y, Gao M, Ren J, Guo Y, Situ C, Qi Y, Zhu H, Li Y, Guo X. DeepSCP: utilizing deep learning to boost single-cell proteome coverage. Brief Bioinform 2022; 23:6598882. [DOI: 10.1093/bib/bbac214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Revised: 04/20/2022] [Accepted: 05/06/2022] [Indexed: 11/12/2022] Open
Abstract
Multiplexed single-cell proteome (SCP) quantification by mass spectrometry greatly improves SCP coverage. However, it still suffers from a low number of protein identifications, and there is much room to boost protein identification by computational methods. In this study, we present a novel framework, DeepSCP, utilizing deep learning to boost SCP coverage. DeepSCP constructs a series of features of peptide-spectrum matches (PSMs) by predicting the retention time based on the multiple SCP sample sets and fragment ion intensities based on deep learning, and predicts PSM labels with an optimized-ensemble learning model. Evaluation of DeepSCP on public and in-house SCP datasets showed superior performance compared with other state-of-the-art methods. DeepSCP identified more confident peptides and proteins by controlling the q-value at 0.01 using the target–decoy competition method. As a convenient and low-cost computing framework, DeepSCP will help boost single-cell proteome identification and facilitate the future development and application of single-cell proteomics.
Affiliation(s)
- Bing Wang
  - School of Medicine, Southeast University, Nanjing 210009, China
  - Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, China
- Yue Wang
  - Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, China
- Yu Chen
  - Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, China
- Mengmeng Gao
  - Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, China
- Jie Ren
  - Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, China
- Yueshuai Guo
  - Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, China
- Chenghao Situ
  - Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, China
- Yaling Qi
  - Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, China
- Hui Zhu
  - Department of Clinical Laboratory, Sir Run Run Hospital, Nanjing Medical University, Nanjing 211166, China
- Yan Li
  - School of Medicine, Southeast University, Nanjing 210009, China
  - Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, China
- Xuejiang Guo
  - Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, China
12
Plubell DL, Käll L, Webb-Robertson BJM, Bramer LM, Ives A, Kelleher NL, Smith LM, Montine TJ, Wu CC, MacCoss MJ. Putting Humpty Dumpty Back Together Again: What Does Protein Quantification Mean in Bottom-Up Proteomics? J Proteome Res 2022; 21:891-898. [PMID: 35220718 PMCID: PMC8976764 DOI: 10.1021/acs.jproteome.1c00894] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Indexed: 01/16/2023]
Abstract
Bottom-up proteomics provides peptide measurements and has been invaluable for moving proteomics into large-scale analyses. Commonly, a single quantitative value is reported for each protein-coding gene by aggregating peptide quantities into protein groups following protein inference or parsimony. However, given the complexity of both RNA splicing and post-translational protein modification, it is overly simplistic to assume that all peptides that map to a singular protein-coding gene will demonstrate the same quantitative response. By assuming that all peptides from a protein-coding sequence are representative of the same protein, we may miss the discovery of important biological differences. To capture the contributions of existing proteoforms, we need to reconsider the practice of aggregating protein values to a single quantity per protein-coding gene.
Affiliation(s)
- Deanna L. Plubell
  - Department of Genome Sciences, University of Washington, Seattle, WA, 98195 USA
- Lukas Käll
  - Science for Life Laboratory, KTH - Royal Institute of Technology, Box 1031, 17121, Solna, Sweden
- Lisa M. Bramer
  - Pacific Northwest National Laboratory, Richland, WA 99352
- Ashley Ives
  - Proteomics Center of Excellence & Departments of Chemistry and Molecular Biosciences, Northwestern University, Evanston, IL 60208
- Neil L. Kelleher
  - Proteomics Center of Excellence & Departments of Chemistry and Molecular Biosciences, Northwestern University, Evanston, IL 60208
- Lloyd M. Smith
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, 53706
- Christine C. Wu
  - Department of Genome Sciences, University of Washington, Seattle, WA, 98195 USA
- Michael J. MacCoss
  - Department of Genome Sciences, University of Washington, Seattle, WA, 98195 USA
13
|
Crook OM, Chung CW, Deane CM. Challenges and Opportunities for Bayesian Statistics in Proteomics. J Proteome Res 2022; 21:849-864. [PMID: 35258980 PMCID: PMC8982455 DOI: 10.1021/acs.jproteome.1c00859] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/08/2021] [Indexed: 12/27/2022]
Abstract
Proteomics is a data-rich science with complex experimental designs and an intricate measurement process. To obtain insights from the large data sets produced, statistical methods, including machine learning, are routinely applied. For a quantity of interest, many of these approaches only produce a point estimate, such as a mean, leaving little room for more nuanced interpretations. By contrast, Bayesian statistics allows quantification of uncertainty through the use of probability distributions. These probability distributions enable scientists to ask complex questions of their proteomics data. Bayesian statistics also offers a modular framework for data analysis by making dependencies between data and parameters explicit. Hence, specifying complex hierarchies of parameter dependencies is straightforward in the Bayesian framework. This allows us to use a statistical methodology which equals, rather than neglects, the sophistication of experimental design and instrumentation present in proteomics. Here, we review Bayesian methods applied to proteomics, demonstrating their potential power, alongside the challenges posed by adopting this new statistical framework. To illustrate our review, we give a walk-through of the development of a Bayesian model for dynamic organic orthogonal phase-separation (OOPS) data.
Affiliation(s)
- Oliver M. Crook
- Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
- Chun-wa Chung
- Structural and Biophysical Sciences, GlaxoSmithKline R&D, Stevenage SG1 2NY, United Kingdom
- Charlotte M. Deane
- Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
14
|
Schallert K, Verschaffelt P, Mesuere B, Benndorf D, Martens L, Van Den Bossche T. Pout2Prot: An Efficient Tool to Create Protein (Sub)groups from Percolator Output Files. J Proteome Res 2022; 21:1175-1180. [PMID: 35143215 DOI: 10.1021/acs.jproteome.1c00685] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
In metaproteomics, the study of the collective proteome of microbial communities, the protein inference problem is more challenging than in single-species proteomics. Indeed, a peptide sequence can be present not only in multiple proteins or protein isoforms of the same species, but also in homologous proteins from closely related species. To assign the taxonomy and functions of the microbial species, specialized tools have been developed, such as Prophane. This tool, however, is not directly compatible with post-processing tools such as Percolator. In this manuscript we therefore present Pout2Prot, which takes Percolator Output (.pout) files from multiple experiments and creates protein group and protein subgroup output files (.tsv) that can be used directly with Prophane. We investigated different grouping strategies and compared existing protein grouping tools to develop an advanced protein grouping algorithm that offers a variety of different approaches, allows grouping for multiple files, and uses a weighted spectral count for protein (sub)groups to reflect abundance. Pout2Prot is available as a web application at https://pout2prot.ugent.be and is installable via pip as a standalone command line tool and reusable software library. All code is open source under the Apache License 2.0 and is available at https://github.com/compomics/pout2prot.
Affiliation(s)
- Kay Schallert
- Bioprocess Engineering, Otto-von-Guericke University Magdeburg, 39104 Magdeburg, Germany; Bioprocess Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, 39104 Magdeburg, Germany
- Pieter Verschaffelt
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, 9000 Ghent, Belgium; VIB - UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Bart Mesuere
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, 9000 Ghent, Belgium; VIB - UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Dirk Benndorf
- Bioprocess Engineering, Otto-von-Guericke University Magdeburg, 39104 Magdeburg, Germany; Bioprocess Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, 39104 Magdeburg, Germany; Microbiology, Department of Applied Biosciences and Process Technology, Anhalt University of Applied Sciences, 06366 Köthen, Germany
- Lennart Martens
- VIB - UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
- Tim Van Den Bossche
- VIB - UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
15
|
Gardner ML, Freitas MA. Multiple Imputation Approaches Applied to the Missing Value Problem in Bottom-Up Proteomics. Int J Mol Sci 2021; 22:9650. [PMID: 34502557 PMCID: PMC8431783 DOI: 10.3390/ijms22179650] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/20/2021] [Revised: 08/28/2021] [Accepted: 08/31/2021] [Indexed: 01/15/2023] Open
Abstract
Analysis of differential abundance in proteomics data sets requires careful application of missing value imputation. Missing abundance values widely vary when performing comparisons across different sample treatments. For example, one would expect a consistent rate of “missing at random” (MAR) across batches of samples and varying rates of “missing not at random” (MNAR) depending on the inherent difference in sample treatments within the study. The missing value imputation strategy must thus be selected that best accounts for both MAR and MNAR simultaneously. Several important issues must be considered when deciding the appropriate missing value imputation strategy: (1) when it is appropriate to impute data; (2) how to choose a method that reflects the combinatorial manner of MAR and MNAR that occurs in an experiment. This paper provides an evaluation of missing value imputation strategies used in proteomics and presents a case for the use of hybrid left-censored missing value imputation approaches that can handle the MNAR problem common to proteomics data.
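The MAR/MNAR distinction drawn in this abstract can be illustrated with a small Python sketch. It is not one of the strategies evaluated in the paper: the function name `impute_hybrid`, the `mnar_fraction` threshold, and the choice of a group mean versus a fixed low value are all assumptions made here for illustration only.

```python
def impute_hybrid(values, low_value, mnar_fraction=0.5):
    """Fill missing (None) log-intensities for one protein in one sample group.

    Hypothetical heuristic: if more than `mnar_fraction` of the replicates are
    missing, treat the gaps as left-censored (MNAR) and fill with `low_value`;
    otherwise treat them as missing at random and fill with the group mean.
    """
    observed = [v for v in values if v is not None]
    n_missing = len(values) - len(observed)
    if not observed or n_missing / len(values) > mnar_fraction:
        fill = low_value  # left-censored: signal assumed below detection limit
    else:
        fill = sum(observed) / len(observed)  # sporadic gap: use the group mean
    return [fill if v is None else v for v in values]


mostly_observed = impute_hybrid([20.0, None, 22.0, 21.0], low_value=12.0)
mostly_missing = impute_hybrid([None, None, None, 15.0], low_value=12.0)
```

In this toy case the single gap in the first group is filled with the group mean, while the second group, missing in three of four replicates, is filled with the assumed low value.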
Affiliation(s)
- Miranda L. Gardner
- Ohio State Biochemistry Program, Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA
- Cancer Biology and Genetics, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
- Michael A. Freitas
- Ohio State Biochemistry Program, Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA
- Cancer Biology and Genetics, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
16
|
Gabdrakhmanov IT, Gorshkov MV, Tarasova IA. Proteomics of Cellular Response to Stress: Taking Control of False Positive Results. Biochemistry (Moscow) 2021; 86:338-349. [PMID: 33838633 DOI: 10.1134/s0006297921030093] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022]
Abstract
One of the main goals of quantitative proteomics is molecular profiling of cellular response to stress at the protein level. To perform this profiling, statistical analysis of experimental data involves multiple testing of a hypothesis about the equality of protein concentrations between the cells under normal and stress conditions. This analysis is then associated with the multiple testing problem dealing with the increased chance of obtaining false positive results. A number of solutions to this problem are known, yet, they may lead to the loss of potentially important biological information when applied with commonly accepted thresholds of statistical significance. Using the proteomic data obtained earlier for the yeast samples containing proteins at known concentrations and the biological models of early and late cellular responses to stress, we analyzed dependences of distributions of false positive and false negative rates on the protein fold changes and thresholds of statistical significance. Based on the analysis of the density of data points in the volcano plots, Benjamini-Hochberg method, and gene ontology analysis, visual approach for optimization of the statistical threshold and selection of the differentially regulated proteins has been suggested, which could be useful for researchers working in the field of quantitative proteomics.
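For readers unfamiliar with the Benjamini-Hochberg correction named in this abstract, the following is a minimal Python sketch of the standard procedure. The function name `benjamini_hochberg` and the example p-values are illustrative assumptions and are not taken from the paper.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for five proteins.
p = [0.001, 0.008, 0.039, 0.041, 0.27]
q = benjamini_hochberg(p)
significant = [qi <= 0.05 for qi in q]
```

With these made-up values, the third and fourth proteins have raw p-values below 0.05 but adjusted values above it, which is the kind of loss at conventional thresholds that the abstract discusses.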
Affiliation(s)
- Mikhail V Gorshkov
- Moscow Institute of Physics and Technology (State University), Dolgoprudny, Moscow Region, 141701, Russia; Talrose Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Irina A Tarasova
- Talrose Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
17
|
Abstract
Proteomics studies rely on the accurate assignment of peptides to the acquired tandem mass spectra, a task where machine learning algorithms have proven invaluable. We describe mokapot, which provides a flexible semisupervised learning algorithm that allows for highly customized analyses. We demonstrate some of the unique features of mokapot by improving the detection of RNA-cross-linked peptides from an analysis of RNA-binding proteins and increasing the consistency of peptide detection in a single-cell proteomics study.
Affiliation(s)
- William E. Fondrie
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- William S. Noble
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States
18
|
The M, Käll L. Triqler for MaxQuant: Enhancing Results from MaxQuant by Bayesian Error Propagation and Integration. J Proteome Res 2021; 20:2062-2068. [PMID: 33661646 PMCID: PMC8041382 DOI: 10.1021/acs.jproteome.0c00902] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Abstract
Error estimation for differential protein quantification by label-free shotgun proteomics is challenging due to the multitude of error sources, each contributing uncertainty to the final results. We have previously designed a Bayesian model, Triqler, to combine such error terms into one combined quantification error. Here we present an interface for Triqler that takes MaxQuant results as input, allowing quick reanalysis of already processed data. We demonstrate that Triqler outperforms the original processing for a large set of both engineered and clinical/biological relevant data sets. Triqler and its interface to MaxQuant are available as a Python module under an Apache 2.0 license from https://pypi.org/project/triqler/.
Affiliation(s)
- Matthew The
- Chair of Proteomics and Bioanalytics, Technische Universität München, Emil-Erlenmeyer Forum 5, 85354 Freising, Germany
- Lukas Käll
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, Royal Institute of Technology - KTH, Box 1031, 17121 Solna, Sweden
19
|
The M, Käll L. Focus on the spectra that matter by clustering of quantification data in shotgun proteomics. Nat Commun 2020; 11:3234. [PMID: 32591519 PMCID: PMC7319958 DOI: 10.1038/s41467-020-17037-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/30/2020] [Accepted: 06/08/2020] [Indexed: 02/02/2023] Open
Abstract
In shotgun proteomics, the analysis of label-free quantification experiments is typically limited by the identification rate and the noise level in the quantitative data. This generally causes a low sensitivity in differential expression analysis. Here, we propose a quantification-first approach for peptides that reverses the classical identification-first workflow, thereby preventing valuable information from being discarded in the identification stage. Specifically, we introduce a method, Quandenser, that applies unsupervised clustering on both MS1 and MS2 level to summarize all analytes of interest without assigning identities. This reduces search time due to the data reduction. We can now employ open modification and de novo searches to identify analytes of interest that would have gone unnoticed in traditional pipelines. Quandenser+Triqler outperforms the state-of-the-art method MaxQuant+Perseus, consistently reporting more differentially abundant proteins for all tested datasets. Software is available for all major operating systems at https://github.com/statisticalbiotechnology/quandenser, under Apache 2.0 license.
Affiliation(s)
- Matthew The
- Science for Life Laboratory, KTH Royal Institute of Technology, Box 1031, 17121, Solna, Sweden
- Lukas Käll
- Science for Life Laboratory, KTH Royal Institute of Technology, Box 1031, 17121, Solna, Sweden
20
|
Millikin RJ, Shortreed MR, Scalf M, Smith LM. A Bayesian Null Interval Hypothesis Test Controls False Discovery Rates and Improves Sensitivity in Label-Free Quantitative Proteomics. J Proteome Res 2020; 19:1975-1981. [PMID: 32243168 DOI: 10.1021/acs.jproteome.9b00796] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 02/08/2023]
Abstract
Statistical significance tests are a common feature in quantitative proteomics workflows. The Student's t-test is widely used to compute the statistical significance of a protein's change between two groups of samples. However, the t-test's null hypothesis asserts that the difference in means between two groups is exactly zero, often marking small but uninteresting fold-changes as statistically significant. Compensations to address this issue are widely used in quantitative proteomics, but we suggest that a replacement of the t-test with a Bayesian approach offers a better path forward. In this article, we describe a Bayesian hypothesis test in which the null hypothesis is an interval rather than a single point at zero; the width of the interval is estimated from population statistics. The improved sensitivity of the method substantially increases the number of truly changing proteins detected in two benchmark data sets (ProteomeXchange identifiers PXD005590 and PXD016470). The method has been implemented within FlashLFQ, an open-source software program that quantifies bottom-up proteomics search results obtained from any search tool. FlashLFQ is rapid, sensitive, and accurate and is available both as an easy-to-use graphical user interface (Windows) and as a command-line tool (Windows/Linux/OSX).
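The idea of an interval-valued null hypothesis can be illustrated with a short Python sketch. This is a simplified stand-in and not the FlashLFQ implementation: it assumes a normal approximation to the posterior of the mean log2 fold change (flat prior, spread estimated from the replicates), and the function name `prob_in_null_interval` and the fixed `half_width` are assumptions made here, whereas the paper estimates the interval width from population statistics.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def prob_in_null_interval(log2_fold_changes, half_width):
    """Approximate posterior probability that the true mean log2 fold change
    lies inside the null interval [-half_width, +half_width].

    Assumes a normal posterior centred on the sample mean with the standard
    error as its spread; needs at least two replicates with non-zero spread.
    """
    n = len(log2_fold_changes)
    centre = mean(log2_fold_changes)
    standard_error = stdev(log2_fold_changes) / sqrt(n)
    posterior = NormalDist(centre, standard_error)
    return posterior.cdf(half_width) - posterior.cdf(-half_width)


# Made-up replicate log2 fold changes for two proteins.
small_change = prob_in_null_interval([0.10, 0.12, 0.08, 0.11, 0.09], half_width=0.3)
large_change = prob_in_null_interval([1.1, 0.9, 1.0, 1.2, 0.8], half_width=0.3)
```

The first protein changes very consistently, so a point-null t-test would likely call it significant, yet nearly all of its posterior mass sits inside the null interval. The second protein has almost no posterior mass there and would be reported as changing.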
Affiliation(s)
- Robert J Millikin
- Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Michael R Shortreed
- Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Mark Scalf
- Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Lloyd M Smith
- Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
21
|
Peshkin L, Gupta M, Ryazanova L, Wühr M. Bayesian Confidence Intervals for Multiplexed Proteomics Integrate Ion-statistics with Peptide Quantification Concordance. Mol Cell Proteomics 2019; 18:2108-2120. [PMID: 31311848 PMCID: PMC6773559 DOI: 10.1074/mcp.tir119.001317] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 01/12/2019] [Revised: 06/11/2019] [Indexed: 01/28/2023] Open
Abstract
Multiplexed proteomics has emerged as a powerful tool to measure relative protein expression levels across multiple conditions. The relative protein abundances are inferred by comparing the signals generated by isobaric tags, which encode the samples' origins. Intuitively, the trust associated with a protein measurement depends on the similarity of ratios from the protein's peptides and the signal-strength of these measurements. However, typically the average peptide ratio is reported as the estimate of relative protein abundance, which is only the most likely ratio with a very naive model. Moreover, there is no sense on the confidence in these measurements. Here, we present a mathematically rigorous approach that integrates peptide signal strengths and peptide-measurement agreement into an estimation of the true protein ratio and the associated confidence (BACIQ). The main advantages of BACIQ are: (1) It removes the need to threshold reported peptide signal based on an arbitrary cut-off, thereby reporting more measurements from a given experiment; (2) Confidence can be assigned without replicates; (3) For repeated experiments BACIQ provides confidence intervals for the union, not the intersection, of quantified proteins; (4) For repeated experiments, BACIQ confidence intervals are more predictive than confidence intervals based on protein measurement agreement. To demonstrate the power of BACIQ we reanalyzed previously published data on subcellular protein movement on treatment with an Exportin-1 inhibiting drug. We detect ∼2× more highly significant movers, down to subcellular localization changes of ∼1%. Thus, our method drastically increases the value obtainable from quantitative proteomics experiments, helping researchers to interpret their data and prioritize resources. To make our approach easily accessible we distribute it via a Python/Stan package.
Affiliation(s)
- Leonid Peshkin
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
- Meera Gupta
- Department of Molecular Biology & the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544; DOE Center for Advanced Bioenergy and Bioproducts Innovation, Princeton, NJ 08544
- Lillia Ryazanova
- Department of Molecular Biology & the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544; DOE Center for Advanced Bioenergy and Bioproducts Innovation, Princeton, NJ 08544
- Martin Wühr
- Department of Molecular Biology & the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544; DOE Center for Advanced Bioenergy and Bioproducts Innovation, Princeton, NJ 08544