1
|
Stanevich V, Oyeniran O, Somani S. Modeling Chromatography Binding through Molecular Dynamics Simulations with Resin Fragments. J Phys Chem B 2024; 128:5557-5566. [PMID: 38809811 PMCID: PMC11181327 DOI: 10.1021/acs.jpcb.4c00578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 03/31/2024] [Accepted: 04/02/2024] [Indexed: 05/31/2024]
Abstract
Accurate atomistic modeling of the interactions of a chromatography resin with a solute can inform the selection of purification conditions for a product, an important problem in the biotech and pharmaceutical industries. We present a molecular dynamics simulation-based approach for the qualitative prediction of interaction sites (specificity) and retention times (affinity) of a protein for a given chromatography resin. We mimicked the resin with an unrestrained ligand composed of the resin headgroup coupled with successively larger fragments of the agarose backbone. The interactions of the ligand with the protein are simulated in an explicit solvent using the Replica Exchange Molecular Dynamics enhanced sampling approach in conjunction with Hydrogen Mass Repartitioning (REMD-HMR). We computed the ligand interaction surface from the simulation trajectories and correlated the features of the interaction surface with experimentally determined retention times. The simulation and analysis protocol were first applied to a series of ubiquitin mutants for which retention times on Capto MMC resin are available. The ubiquitin simulations helped identify the optimal ligand that was used in subsequent simulations on six proteins for which Capto MMC elution times are available. For each of the six proteins, we computed the interaction surface and characterized it in terms of a range of simulation-averaged residue-level physicochemical descriptors. Modeling of the salt concentrations required for elution with respect to the descriptors resulted in a linear fit in terms of aromaphilicity and Kyte-Doolittle hydrophobicity that was robust to outliers, showed high correlation, and correctly ranked the protein elution order. The physics-based model building approach described here does not require a large experimental data set and can be readily applied to different resins and diverse biomolecules.
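The linear model mentioned at the end of this abstract can be illustrated with a minimal sketch. It fits an elution salt concentration against two residue-level descriptors by ordinary least squares. This is a stand-in: the abstract does not state the fitting procedure, and plain least squares is not itself robust to outliers. The descriptor values, the coefficients, and the helper names `fit_linear` and `predict` are hypothetical and synthetic, not data or code from the paper.

```python
# Synthetic illustration only: descriptor values and coefficients are invented.

def fit_linear(X, y):
    """Ordinary least squares with an intercept, solved via the normal equations."""
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    n = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for k in range(col + 1, n):
            f = A[k][col] / A[col][col]
            for c in range(col, n):
                A[k][c] -= f * A[col][c]
            b[k] -= f * b[col]
    # Back substitution
    coef = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - tail) / A[i][i]
    return coef  # [intercept, slope_1, slope_2, ...]


def predict(coef, x):
    """Evaluate the fitted linear model for one descriptor tuple."""
    return coef[0] + sum(c * xi for c, xi in zip(coef[1:], x))


# Hypothetical (aromaphilicity, Kyte-Doolittle hydrophobicity) values for six proteins.
descriptors = [(0.10, -1.0), (0.20, -0.5), (0.15, 0.3),
               (0.30, -0.2), (0.25, 0.6), (0.40, 0.1)]
# Synthetic "observed" elution salt concentrations generated from a known line.
elution_salt = [0.2 + 1.5 * a + 0.1 * h for a, h in descriptors]

coef = fit_linear(descriptors, elution_salt)
predicted = [predict(coef, x) for x in descriptors]

# Compare the elution order implied by the model with the observed order.
observed_order = sorted(range(len(descriptors)), key=lambda i: elution_salt[i])
predicted_order = sorted(range(len(descriptors)), key=lambda i: predicted[i])
```

Because the synthetic data lie exactly on a plane, the fit recovers the generating coefficients and the two orderings agree; with real retention data the ranking comparison is the informative check.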
Affiliation(s)
- Vitali Stanevich
  - Protein Therapeutics API Development, Janssen Research & Development, LLC, a Johnson & Johnson company, Malvern, Pennsylvania 19355, United States
- Oluyemi Oyeniran
  - Statistics and Decision Sciences, Janssen Research & Development, LLC, a Johnson & Johnson company, Spring House, Pennsylvania 19002, United States
- Sandeep Somani
  - In Silico Discovery, Janssen Research & Development, LLC, a Johnson & Johnson company, Spring House, Pennsylvania 19002, United States
|
2
|
Zhang K, Barbieri E, LeBarre J, Rameez S, Mostafa S, Menegatti S. Peptonics: A new family of cell-protecting surfactants for the recombinant expression of therapeutic proteins in mammalian cell cultures. Biotechnol J 2024; 19:e2300261. [PMID: 37844203 DOI: 10.1002/biot.202300261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Revised: 08/08/2023] [Accepted: 10/05/2023] [Indexed: 10/18/2023]
Abstract
Polymer surfactants are key components of cell culture media as they prevent mechanical damage during fermentation in stirred bioreactors. Among cell-protecting surfactants, Pluronics are widely utilized in biomanufacturing to ensure high cell viability and productivity. Monodispersity of monomer sequence and length is critical for the effectiveness of Pluronics, since minor deviations can damage the cells, but is challenging to achieve due to the stochastic nature of polymerization. Responding to this challenge, this study introduces Peptonics, a novel family of peptide and peptoid surfactants whose monomer composition and sequence are designed to achieve high cell viability and productivity at a fraction of chain length and cost of Pluronics. A designed ensemble of Peptonics was initially characterized via light scattering and tensiometry to select sequences whose phase behavior and tensioactivity align with those of Pluronics. Selected sequences were evaluated as cell-protecting surfactants using Chinese hamster ovary (CHO) cells expressing therapeutic monoclonal antibodies (mAb). Peptonics IH-T1010, ih-T1010, and ih-T1020 afforded high cell density (up to 3 × 10⁷ cells mL⁻¹) and viability (up to 95% within 10 days of culture), while reducing the accumulation of ammonia (a toxic metabolite) by ≈10% compared to Pluronic F-68. Improved cell viability afforded high mAb titer (up to 5.5 mg mL⁻¹) and extended the production window beyond 14 days; notably, Peptonic IH-T1020 decreased mAb fragmentation and aggregation by ≈5%, and lowered the titer of host cell proteins by 16% compared to Pluronic F-68. These features can significantly improve the purification of mAbs, thus increasing their availability at a lower cost to patients.
Affiliation(s)
- Ka Zhang
  - Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, North Carolina, USA
  - KBI Biopharma, Durham, North Carolina, USA
- Eduardo Barbieri
  - Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, North Carolina, USA
  - LigaTrap Technologies LLC, Raleigh, North Carolina, USA
- Jacob LeBarre
  - Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, North Carolina, USA
- Stefano Menegatti
  - Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, North Carolina, USA
  - LigaTrap Technologies LLC, Raleigh, North Carolina, USA
  - Biomanufacturing Training and Education Center (BTEC), North Carolina State University, Raleigh, North Carolina, USA
  - North Carolina Viral Vector Initiative in Research and Learning (NC-VVIRAL), North Carolina State University, Raleigh, North Carolina, USA
|
3
|
Wang J, Yu A, Cho BG, Mechref Y. Assessing the hydrophobicity of glycopeptides using reversed-phase liquid chromatography and tandem mass spectrometry. J Chromatogr A 2023; 1706:464237. [PMID: 37523904 DOI: 10.1016/j.chroma.2023.464237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 07/19/2023] [Accepted: 07/20/2023] [Indexed: 08/02/2023]
Abstract
Retention time is one of the most important parameters and has been widely used to report separation results obtained from liquid chromatography (LC) platforms. However, retention time can shift when samples are tested on different instruments and in different laboratories, which hinders the identification of analytes when comparing data collected from different LC systems. To address this problem, a hydrophobicity index was introduced for retention time normalization of glycopeptides separated by reversed-phase LC (RPLC). Tandem MS was used for the detection and identification of glycopeptides. In addition, the influence of different types of glycans on the hydrophobicity of peptide backbones was studied by comparing the retention times of glycopeptides with those of their non-glycosylated counterparts. The hydrophobicity of tryptic digested glycopeptides derived from model glycoproteins, including bovine fetuin, α1-acid glycoprotein, and haptoglobin from human plasma, was evaluated based on the hydrophobicity index of the standard peptides from a peptide retention time calibration mixture. A reduction in the hydrophobicity of multiple peptide backbones was observed due to the hydrophilic glycan structures. Day-to-day and lab-to-lab comparisons of the hydrophobicity index of glycopeptides collected at different times and on different instruments suggested high reliability and reproducibility of this approach. The RSD% of the hydrophobicity index from inter-lab experiments was 1.2%, while the RSD% of retention time was 5.1%. The applications of this method were then demonstrated on complex glycopeptide samples extracted from human blood serum. The hydrophobicity index can be applied to address the retention time shift when using different instruments, thereby boosting confidence in the characterization of glycopeptides.
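The normalization idea in this abstract can be sketched as follows: on each instrument, a line is fitted from the retention times of calibration peptides to their known hydrophobicity index values, and analyte retention times are then mapped onto that common scale. The sketch assumes a linear relationship and uses invented calibration values; the helper names `fit_line`, `rt_to_hi`, and `rsd_percent` are hypothetical and do not come from the paper.

```python
from statistics import mean, stdev


def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def rt_to_hi(cal_rt, cal_hi, rt):
    """Convert a retention time to a hydrophobicity index using calibration peptides."""
    slope, intercept = fit_line(cal_rt, cal_hi)
    return slope * rt + intercept


def rsd_percent(values):
    """Relative standard deviation (sample standard deviation / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


# Invented calibration data: the same standard peptides measured on two instruments.
cal_hi = [5.0, 15.0, 25.0, 35.0, 45.0]      # assumed known index values
cal_rt_a = [4.5, 9.5, 14.5, 19.5, 24.5]     # instrument A retention times (min)
cal_rt_b = [6.0, 12.0, 18.0, 24.0, 30.0]    # instrument B retention times (min)

# One analyte eluting at different times on the two instruments.
analyte_rt_a, analyte_rt_b = 12.0, 15.0
hi_a = rt_to_hi(cal_rt_a, cal_hi, analyte_rt_a)
hi_b = rt_to_hi(cal_rt_b, cal_hi, analyte_rt_b)

rt_spread = rsd_percent([analyte_rt_a, analyte_rt_b])
hi_spread = rsd_percent([hi_a, hi_b])
```

In this synthetic case the two retention times differ noticeably while the two index values coincide, which is the behaviour the RSD% comparison in the abstract is meant to capture.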
Affiliation(s)
- Junyao Wang
  - Department of Chemistry and Biochemistry, Texas Tech University, United States
- Aiying Yu
  - Department of Chemistry and Biochemistry, Texas Tech University, United States
- Byeong Gwan Cho
  - Department of Chemistry and Biochemistry, Texas Tech University, United States
- Yehia Mechref
  - Department of Chemistry and Biochemistry, Texas Tech University, United States
|
4
|
Neely BA, Dorfer V, Martens L, Bludau I, Bouwmeester R, Degroeve S, Deutsch EW, Gessulat S, Käll L, Palczynski P, Payne SH, Rehfeldt TG, Schmidt T, Schwämmle V, Uszkoreit J, Vizcaíno JA, Wilhelm M, Palmblad M. Toward an Integrated Machine Learning Model of a Proteomics Experiment. J Proteome Res 2023; 22:681-696. [PMID: 36744821 PMCID: PMC9990124 DOI: 10.1021/acs.jproteome.2c00711] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/01/2022] [Indexed: 02/07/2023]
Abstract
In recent years machine learning has made extensive progress in modeling many aspects of mass spectrometry data. We brought together proteomics data generators, repository managers, and machine learning experts in a workshop with the goals to evaluate and explore machine learning applications for realistic modeling of data from multidimensional mass spectrometry-based proteomics analysis of any sample or organism. Following this sample-to-data roadmap helped identify knowledge gaps and define needs. Being able to generate bespoke and realistic synthetic data has legitimate and important uses in system suitability, method development, and algorithm benchmarking, while also posing critical ethical questions. The interdisciplinary nature of the workshop informed discussions of what is currently possible and future opportunities and challenges. In the following perspective we summarize these discussions in the hope of conveying our excitement about the potential of machine learning in proteomics and to inspire future research.
Affiliation(s)
- Benjamin A. Neely
  - National Institute of Standards and Technology, Charleston, South Carolina 29412, United States
- Viktoria Dorfer
  - Bioinformatics Research Group, University of Applied Sciences Upper Austria, Softwarepark 11, 4232 Hagenberg, Austria
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
- Isabell Bludau
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
- Robbin Bouwmeester
  - VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
- Sven Degroeve
  - VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
- Eric W. Deutsch
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Lukas Käll
  - Science for Life Laboratory, KTH - Royal Institute of Technology, 171 21 Solna, Sweden
- Pawel Palczynski
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense, Denmark
- Samuel H. Payne
  - Department of Biology, Brigham Young University, Provo, Utah 84602, United States
- Tobias Greisager Rehfeldt
  - Institute for Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark
- Veit Schwämmle
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense, Denmark
- Julian Uszkoreit
  - Medical Proteome Analysis, Center for Protein Diagnostics (ProDi), Ruhr University Bochum, 44801 Bochum, Germany
  - Medizinisches Proteom-Center, Medical Faculty, Ruhr University Bochum, 44801 Bochum, Germany
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Mathias Wilhelm
  - Computational Mass Spectrometry, Technical University of Munich (TUM), 85354 Freising, Germany
- Magnus Palmblad
  - Leiden University Medical Center, Postbus 9600, 2300 RC Leiden, The Netherlands
|
5
|
Yeung D, Spicer V, Zahedi RP, Krokhin O. Exploring the variable space of shallow machine learning models for reversed-phase retention time prediction. Comput Struct Biotechnol J 2023; 21:2446-2453. [PMID: 37090433 PMCID: PMC10113922 DOI: 10.1016/j.csbj.2023.02.047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 02/24/2023] [Accepted: 02/24/2023] [Indexed: 03/02/2023] Open
Abstract
Peptide retention time (RT) prediction algorithms are tools to study and identify the physicochemical properties that drive the peptide-sorbent interaction. Traditional RT algorithms use multiple linear regression with manually curated parameters to determine the degree of direct contribution of each parameter, and improvements in RT prediction accuracy relied on superior feature engineering. Deep learning led to a significant increase in RT prediction accuracy and automated feature engineering by chaining multiple learning modules. However, the significance and the identity of these extracted variables are not well understood due to the inherent complexity of interpreting the "relationships-of-relationships" found in deep learning variables. To achieve both accuracy and interpretability, we isolated individual modules used in deep learning; these isolated modules are the shallow learners employed for RT prediction in this work. Using a shallow convolutional neural network (CNN) and gated recurrent unit (GRU), we find that the spatial features obtained via the CNN correlate with real-world physicochemical properties, namely collisional cross sections (CCS) and variations of accessible surface area (ASA). Furthermore, we determined that the discovered parameters are "micro-coefficients" that contribute to the "macro-coefficient", hydrophobicity. Manually embedding CCS and the variations of ASA into the GRU model yielded an R² = 0.981 using only 525 variables and can represent 88% of the ∼110,000 tryptic peptides used in our dataset. This work highlights that the feature discovery process of our shallow learners can outperform traditional RT models and offers better interpretability than the deep learning RT algorithms found in the literature.
|
6
|
Al Musaimi O, Valenzo OMM, Williams DR. Prediction of peptides retention behavior in reversed-phase liquid chromatography based on their hydrophobicity. J Sep Sci 2023; 46:e2200743. [PMID: 36349538 PMCID: PMC10098489 DOI: 10.1002/jssc.202200743] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 09/14/2022] [Revised: 10/30/2022] [Accepted: 10/31/2022] [Indexed: 11/11/2022]
Abstract
Hydrophobicity is an important physicochemical property of peptides and proteins. It is responsible for their conformational changes and stability, as well as various chemical intramolecular and intermolecular interactions. Enormous efforts have been invested in studying the extent of hydrophobicity and how it could influence various biological processes, in addition to its crucial role in separation and purification. Here, we have reviewed various studies that were carried out to determine hydrophobicity, starting from (i) the solubility behavior of simple amino acids, (ii) the experimental approach undertaken in the reversed-phase liquid chromatography mode, and ending with (iii) some examples of more advanced computational and machine learning models.
|
7
|
Lenčo J, Jadeja S, Naplekov DK, Krokhin OV, Khalikova MA, Chocholouš P, Urban J, Broeckhoven K, Nováková L, Švec F. Reversed-Phase Liquid Chromatography of Peptides for Bottom-Up Proteomics: A Tutorial. J Proteome Res 2022; 21:2846-2892. [PMID: 36355445 DOI: 10.1021/acs.jproteome.2c00407] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 11/12/2022]
Abstract
The performance of the current bottom-up liquid chromatography hyphenated with mass spectrometry (LC-MS) analyses has undoubtedly been fueled by spectacular progress in mass spectrometry. It is thus not surprising that the MS instrument attracts the most attention during LC-MS method development, whereas optimizing conditions for peptide separation using reversed-phase liquid chromatography (RPLC) remains somewhat in its shadow. Consequently, the wisdom of the fundaments of chromatography is slowly vanishing from some laboratories. However, the full potential of advanced MS instruments cannot be achieved without highly efficient RPLC. This is impossible to attain without understanding fundamental processes in the chromatographic system and the properties of peptides important for their chromatographic behavior. We wrote this tutorial intending to give practitioners an overview of critical aspects of peptide separation using RPLC to facilitate setting the LC parameters so that they can leverage the full capabilities of their MS instruments. After briefly introducing the gradient separation of peptides, we discuss their properties that affect the quality of LC-MS chromatograms the most. Next, we address the in-column and extra-column broadening. The last section is devoted to key parameters of LC-MS methods. We also extracted trends in practice from recent bottom-up proteomics studies and correlated them with the current knowledge on peptide RPLC separation.
Affiliation(s)
- Juraj Lenčo
  - Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- Siddharth Jadeja
  - Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- Denis K Naplekov
  - Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- Oleg V Krokhin
  - Department of Internal Medicine, Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg R3E 3P4, Manitoba, Canada
- Maria A Khalikova
  - Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- Petr Chocholouš
  - Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- Jiří Urban
  - Department of Chemistry, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- Ken Broeckhoven
  - Department of Chemical Engineering (CHIS), Faculty of Engineering, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussel, Belgium
- Lucie Nováková
  - Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- František Švec
  - Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
|
8
|
Waibl F, Fernández-Quintero ML, Wedl FS, Kettenberger H, Georges G, Liedl KR. Comparison of hydrophobicity scales for predicting biophysical properties of antibodies. Front Mol Biosci 2022; 9:960194. [PMID: 36120542 PMCID: PMC9475378 DOI: 10.3389/fmolb.2022.960194] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 06/02/2022] [Accepted: 08/09/2022] [Indexed: 11/13/2022] Open
Abstract
While antibody-based therapeutics have grown to be one of the major classes of novel medicines, some antibody development candidates face significant challenges regarding expression levels, solubility, as well as stability and aggregation, under physiological and storage conditions. A major determinant of those properties is surface hydrophobicity, which promotes unspecific interactions and has repeatedly proven problematic in the development of novel antibody-based drugs. Multiple computational methods have been devised for in-silico prediction of antibody hydrophobicity, often using hydrophobicity scales to assign values to each amino acid. Those approaches are usually validated by their ability to rank potential therapeutic antibodies in terms of their experimental hydrophobicity. However, there is significant diversity both in the hydrophobicity scales and in the experimental methods, and consequently in the performance of in-silico methods to predict experimental results. In this work, we investigate hydrophobicity of monoclonal antibodies using hydrophobicity scales. We implement several scoring schemes based on the solvent-accessibility and the assigned hydrophobicity values, and compare the different scores and scales based on their ability to predict retention times from hydrophobic interaction chromatography. We provide an overview of the strengths and weaknesses of several commonly employed hydrophobicity scales, thereby improving the understanding of hydrophobicity in antibody development. Furthermore, we test several datasets, both publicly available and proprietary, and find that the diversity of the dataset affects the performance of hydrophobicity scores. We expect that this work will provide valuable guidelines for the optimization of biophysical properties in future drug discovery campaigns.
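A scoring scheme of the general kind described in this abstract, combining solvent accessibility with per-residue hydrophobicity values, can be sketched as an accessibility-weighted average. This is only one possible scheme and is not claimed to be any of the scores implemented by the authors. The scale values are the published Kyte-Doolittle hydropathy values; the relative accessibilities in the example and the function name `sasa_weighted_hydrophobicity` are hypothetical.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sasa_weighted_hydrophobicity(residues):
    """Average hydropathy weighted by relative solvent accessibility.

    residues: list of (one_letter_code, relative_sasa) pairs, where
    relative_sasa is a non-negative weight (0 = buried, 1 = fully exposed).
    """
    total_weight = sum(weight for _, weight in residues)
    if total_weight == 0:
        raise ValueError("no solvent-exposed residues to score")
    weighted = sum(KYTE_DOOLITTLE[aa] * weight for aa, weight in residues)
    return weighted / total_weight


# Hypothetical surface patch: residue identities and accessibilities are invented.
patch = [("I", 0.2), ("K", 0.9), ("F", 0.5), ("D", 0.8)]
patch_score = sasa_weighted_hydrophobicity(patch)
```

Swapping `KYTE_DOOLITTLE` for another scale while keeping the same weighting is the kind of comparison the abstract describes; how well any such score tracks experimental retention times would still have to be checked against data.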
Affiliation(s)
- Franz Waibl
  - Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
- Florian S. Wedl
  - Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
- Hubert Kettenberger
  - Large Molecule Research, Roche Pharma Research and Early Development, Roche Innovation Center Munich, Penzberg, Germany
- Guy Georges
  - Large Molecule Research, Roche Pharma Research and Early Development, Roche Innovation Center Munich, Penzberg, Germany
- Klaus R. Liedl
  - Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
  - *Correspondence: Klaus R. Liedl,
|
9
|
Yeung D, Anderson G, Spicer V, Krokhin OV. Chromatographic behaviour of peptides modified with amine-reacting tags for relative protein quantitation in proteomic applications. J Chromatogr A 2022; 1679:463391. [DOI: 10.1016/j.chroma.2022.463391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2022] [Revised: 07/27/2022] [Accepted: 07/29/2022] [Indexed: 10/16/2022]
|
10
|
Chen H, Lu Y, Shi S, Zhang Q, Cao X, Sun L, An D, Zhang X, Kong X, Liu J. Design and Development of a New Glucagon-Like Peptide-1 Receptor Agonist to Obtain High Oral Bioavailability. Pharm Res 2022; 39:1891-1906. [PMID: 35698011 DOI: 10.1007/s11095-022-03265-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2022] [Accepted: 04/18/2022] [Indexed: 11/26/2022]
Abstract
PURPOSE: Semaglutide is the only oral GLP-1RA on the market, but its oral bioavailability is generally limited to the range of 0.4-1%. In this study, a new GLP-1RA named SHR-2042 was developed to achieve higher oral bioavailability than semaglutide. METHOD: Self-association of SHR-2042, semaglutide, and liraglutide was assessed using SEC-MALS. The intestinal perfusion test in SD rats was used to select permeation enhancers (PEs) including SNAC, C10 and LCC. ITC, CD and DLS were used to explore the interaction between SHR-2042 and SNAC. A gastric administration test in SD rats was used to screen SHR-2042 granules with different SHR-2042/SNAC ratios. The oral bioavailability of SHR-2042 was studied in rats and monkeys. RESULT: The designed GLP-1RA, SHR-2042, shows better solubility and lipophilicity than semaglutide, while forming an oligomer similar to that of semaglutide. During the selection of PEs, SNAC showed better exposure than the competing PEs C10 and LCC. SHR-2042 and SNAC bind quickly and exhibit hydrophobic interaction. SNAC could promote monomerization of SHR-2042 and form micelles to trap the monomerized SHR-2042. The oral bioavailability of SHR-2042 paired with SNAC is 0.041% (1:0, w/w), 0.083% (1:10, w/w), 0.32% (1:30, w/w) and 2.83% (1:60, w/w) in rats. The oral bioavailability of SHR-2042 paired with SNAC is 3.39% (1:30, w/w) in monkeys, which is over 10 times higher than that of semaglutide. CONCLUSION: We believe that the design and development of oral SHR-2042 will provide a new way to design more GLP-1RAs with high oral bioavailability in the future.
Affiliation(s)
- Hao Chen
  - School of Pharmacy, China Pharmaceutical University, Nanjing, 210009, People's Republic of China
  - Jiangsu Hengrui Pharmaceuticals Co., Ltd., Lianyungang, 222000, People's Republic of China
- Yun Lu
  - Jiangsu Hengrui Pharmaceuticals Co., Ltd., Lianyungang, 222000, People's Republic of China
- Shuai Shi
  - Jiangsu Hengrui Pharmaceuticals Co., Ltd., Lianyungang, 222000, People's Republic of China
- Qiang Zhang
  - Jiangsu Hengrui Pharmaceuticals Co., Ltd., Lianyungang, 222000, People's Republic of China
- Xiaoli Cao
  - Jiangsu Hengrui Pharmaceuticals Co., Ltd., Lianyungang, 222000, People's Republic of China
- Lei Sun
  - Jiangsu Hengrui Pharmaceuticals Co., Ltd., Lianyungang, 222000, People's Republic of China
- Dong An
  - Jiangsu Hengrui Pharmaceuticals Co., Ltd., Lianyungang, 222000, People's Republic of China
- Xiaojie Zhang
  - Jiangsu Hengrui Pharmaceuticals Co., Ltd., Lianyungang, 222000, People's Republic of China
- Xianglin Kong
  - Jiangsu Hengrui Pharmaceuticals Co., Ltd., Lianyungang, 222000, People's Republic of China
- Jianping Liu
  - School of Pharmacy, China Pharmaceutical University, Nanjing, 210009, People's Republic of China
|
11
|
Borkar MR, Coutinho E. Amalgamation of comparative protein modeling with quantitative structure-retention relationship for prediction of the chromatographic behavior of peptides. J Chromatogr A 2022; 1669:462967. [DOI: 10.1016/j.chroma.2022.462967] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/14/2021] [Revised: 03/09/2022] [Accepted: 03/11/2022] [Indexed: 10/18/2022]
|
12
|
Yang Y, Lin L, Qiao L. Deep learning approaches for data-independent acquisition proteomics. Expert Rev Proteomics 2021; 18:1031-1043. [PMID: 34918987 DOI: 10.1080/14789450.2021.2020654] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/30/2023]
Abstract
INTRODUCTION: Data-independent acquisition (DIA) is an emerging technology for large-scale proteomic studies. DIA data analysis methods are evolving rapidly, and deep learning has cut a conspicuous figure in this field. AREAS COVERED: This review discusses and provides an overview of the deep learning methods that are used for DIA data analysis, including spectral library prediction, feature scoring, and statistical control in peptide-centric analysis, as well as de novo peptide sequencing. Literature searches were performed for articles, including preprints, up to December 2021 from PubMed, Scopus, and Web of Science databases. EXPERT OPINION: While spectral library prediction has broken through the limitation on proteome coverage of experimental libraries, the statistical burden due to the large query space is the remaining challenge of utilizing proteome-wide predicted libraries. Analysis of post-translational modifications is another promising direction of deep learning-based DIA methods.
Affiliation(s)
- Yi Yang
  - Department of Chemistry, Shanghai Stomatological Hospital, and Minhang Hospital, Fudan University, Shanghai, China
- Ling Lin
  - Department of Chemistry, Shanghai Stomatological Hospital, and Minhang Hospital, Fudan University, Shanghai, China
- Liang Qiao
  - Department of Chemistry, Shanghai Stomatological Hospital, and Minhang Hospital, Fudan University, Shanghai, China
|
13
|
Mizero B, Yeung D, Spicer V, Krokhin OV. Peptide retention time prediction for peptides with post-translational modifications: N-terminal (α-amine) and lysine (ε-amine) acetylation. J Chromatogr A 2021; 1657:462584. [PMID: 34619563 DOI: 10.1016/j.chroma.2021.462584] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/23/2021] [Revised: 09/23/2021] [Accepted: 09/24/2021] [Indexed: 10/20/2022]
Abstract
Development of a peptide retention prediction model in reversed-phase chromatography is reported for acetylated peptides - both N-terminal (α-) and side chain of Lys (ε-amine) residues. Large-scale proteomic 2D LC-MS analyses of acetylated/non-acetylated tryptic digest of whole human cell lysate have been used to assemble representative retention data sets of 25,000+ modified/non-modified pairs. This allowed elucidating chromatographic behaviour of modified peptides in three different separation modes: high pH reversed-phase, HILIC separation on amide phase (first dimension of 2D) and reversed-phase separation with formic acid as ion-pairing modifier in the second dimension. On average, N-terminal acetylation increases peptide RP retention at acidic pH by 5 Hydrophobicity Index units (% acetonitrile). Acetylation of first lysine adds another 4.1%. The magnitude of the retention shift varies greatly depending on the number of modified amines, peptide length, and N-terminal peptide sequence. Large retention shifts have been observed for peptides with hydrophobic N-termini and specifically peptides carrying sequences characteristic for amphipathic helical structures - all in complete agreement with major sequence-specific features of RP retention mechanism. The utility of the modified Sequence Specific Retention Calculator model has been verified for the in-vivo N-terminally acetylated peptides detected by 2D LC-MS/MS analysis of a yeast tryptic digest. The effect of N-terminal acetylation was also evaluated for six different HILIC columns, strong cation- and strong anion exchange separations using previously acquired 2D LC-MS/MS data.
Affiliation(s)
- Benilde Mizero: Department of Chemistry, University of Manitoba, 360 Parker Building, 144 Dysart Road, Winnipeg, R3T 2N2, Canada
- Darien Yeung: Department of Biochemistry and Medical Genetics, University of Manitoba, 336 BMSB, 745 Bannatyne Avenue, Winnipeg, R3E 0J9, Canada
- Vic Spicer: Manitoba Centre for Proteomics and Systems Biology, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada
- Oleg V Krokhin: Department of Chemistry, University of Manitoba, 360 Parker Building, 144 Dysart Road, Winnipeg, R3T 2N2, Canada; Department of Biochemistry and Medical Genetics, University of Manitoba, 336 BMSB, 745 Bannatyne Avenue, Winnipeg, R3E 0J9, Canada; Manitoba Centre for Proteomics and Systems Biology, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada; Department of Internal Medicine, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada
|
14
|
Boone M, Ramasamy P, Zuallaert J, Bouwmeester R, Van Moer B, Maddelein D, Turan D, Hulstaert N, Eeckhaut H, Vandermarliere E, Martens L, Degroeve S, De Neve W, Vranken W, Callewaert N. Massively parallel interrogation of protein fragment secretability using SECRiFY reveals features influencing secretory system transit. Nat Commun 2021; 12:6414. [PMID: 34741024 PMCID: PMC8571348 DOI: 10.1038/s41467-021-26720-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/25/2020] [Accepted: 10/15/2021] [Indexed: 11/09/2022]
Abstract
While transcriptome- and proteome-wide technologies to assess processes in protein biogenesis are now widely available, we still lack global approaches to assay post-ribosomal biogenesis events, in particular those occurring in the eukaryotic secretory system. We here develop a method, SECRiFY, to simultaneously assess the secretability of >10^5 protein fragments by two yeast species, S. cerevisiae and P. pastoris, using custom fragment libraries, surface display and a sequencing-based readout. Screening human proteome fragments with a median size of 50-100 amino acids, we generate datasets that enable datamining into protein features underlying secretability, revealing a striking role for intrinsic disorder and chain flexibility. The SECRiFY methodology generates sufficient amounts of annotated data for advanced machine learning methods to deduce secretability patterns. The finding that secretability is indeed a learnable feature of protein sequences provides a solid base for application-focused studies.
Affiliation(s)
- Morgane Boone: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biochemistry and Microbiology, Faculty of Sciences, Ghent University, Ghent, Belgium; Department of Biochemistry and Biophysics, UCSF, San Francisco, CA, USA
- Pathmanaban Ramasamy: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium; Structural Biology Brussels, VUB, Brussels, Belgium; Structural Biology Research Center, VIB, Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels (IB)², ULB-VUB, Brussels, Belgium
- Jasper Zuallaert: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biochemistry and Microbiology, Faculty of Sciences, Ghent University, Ghent, Belgium; Center for Biotech Data Science, Ghent University Global Campus, Songdo, Incheon, South Korea; IDLab, ELIS, UGent, Ghent, Belgium
- Robbin Bouwmeester: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Berre Van Moer: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biochemistry and Microbiology, Faculty of Sciences, Ghent University, Ghent, Belgium
- Davy Maddelein: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Demet Turan: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Niels Hulstaert: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Hannah Eeckhaut: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biochemistry and Microbiology, Faculty of Sciences, Ghent University, Ghent, Belgium
- Elien Vandermarliere: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Lennart Martens: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Sven Degroeve: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Wesley De Neve: Center for Biotech Data Science, Ghent University Global Campus, Songdo, Incheon, South Korea; IDLab, ELIS, UGent, Ghent, Belgium
- Wim Vranken: Structural Biology Brussels, VUB, Brussels, Belgium; Structural Biology Research Center, VIB, Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels (IB)², ULB-VUB, Brussels, Belgium
- Nico Callewaert: Center for Medical Biotechnology, VIB, Zwijnaarde, Belgium; Department of Biochemistry and Microbiology, Faculty of Sciences, Ghent University, Ghent, Belgium
|
15
|
Bouwmeester R, Gabriels R, Hulstaert N, Martens L, Degroeve S. DeepLC can predict retention times for peptides that carry as-yet unseen modifications. Nat Methods 2021; 18:1363-1369. [PMID: 34711972 DOI: 10.1038/s41592-021-01301-5] [Citation(s) in RCA: 82] [Impact Index Per Article: 27.3] [Received: 04/15/2020] [Accepted: 09/13/2021] [Indexed: 11/09/2022]
Abstract
The inclusion of peptide retention time prediction promises to remove peptide identification ambiguity in complex liquid chromatography-mass spectrometry identification workflows. However, due to the way peptides are encoded in current prediction models, accurate retention times cannot be predicted for modified peptides. This is especially problematic for fledgling open searches, which will benefit from accurate retention time prediction for modified peptides to reduce identification ambiguity. We present DeepLC, a deep learning peptide retention time predictor using peptide encoding based on atomic composition that allows the retention time of (previously unseen) modified peptides to be predicted accurately. We show that DeepLC performs similarly to current state-of-the-art approaches for unmodified peptides and, more importantly, accurately predicts retention times for modifications not seen during training. Moreover, we show that DeepLC's ability to predict retention times for any modification enables potentially incorrect identifications to be flagged in an open search of a wide variety of proteome data.
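As a rough illustration of why an encoding based on atomic composition can cover modifications never seen in training, the sketch below reduces a peptide to counts of C, H, N, O and S atoms. It is a minimal sketch, not DeepLC's actual feature pipeline: the function name `atomic_composition`, the `modifications` argument and the fixed atom order are assumptions made for this example. The residue formulas and the net acetylation change (+C2H2O) are standard chemistry.

```python
from collections import Counter

# Elemental composition of each amino acid residue (free amino acid minus H2O).
RESIDUE_ATOMS = {
    "G": {"C": 2, "H": 3, "N": 1, "O": 1},
    "A": {"C": 3, "H": 5, "N": 1, "O": 1},
    "S": {"C": 3, "H": 5, "N": 1, "O": 2},
    "P": {"C": 5, "H": 7, "N": 1, "O": 1},
    "V": {"C": 5, "H": 9, "N": 1, "O": 1},
    "T": {"C": 4, "H": 7, "N": 1, "O": 2},
    "C": {"C": 3, "H": 5, "N": 1, "O": 1, "S": 1},
    "L": {"C": 6, "H": 11, "N": 1, "O": 1},
    "I": {"C": 6, "H": 11, "N": 1, "O": 1},
    "N": {"C": 4, "H": 6, "N": 2, "O": 2},
    "D": {"C": 4, "H": 5, "N": 1, "O": 3},
    "Q": {"C": 5, "H": 8, "N": 2, "O": 2},
    "K": {"C": 6, "H": 12, "N": 2, "O": 1},
    "E": {"C": 5, "H": 7, "N": 1, "O": 3},
    "M": {"C": 5, "H": 9, "N": 1, "O": 1, "S": 1},
    "H": {"C": 6, "H": 7, "N": 3, "O": 1},
    "F": {"C": 9, "H": 9, "N": 1, "O": 1},
    "R": {"C": 6, "H": 12, "N": 4, "O": 1},
    "Y": {"C": 9, "H": 9, "N": 1, "O": 2},
    "W": {"C": 11, "H": 10, "N": 2, "O": 1},
}
WATER = {"H": 2, "O": 1}              # terminal H and OH of the free peptide
ACETYL = {"C": 2, "H": 2, "O": 1}     # net change when an amine H becomes COCH3
ATOM_ORDER = ("C", "H", "N", "O", "S")  # illustrative feature order (assumption)


def atomic_composition(sequence, modifications=()):
    """Return atom counts [C, H, N, O, S] for a peptide.

    `modifications` is an iterable of atom-count deltas (hypothetical
    interface); any modification expressible as added or removed atoms
    can be encoded, whether or not it was seen before.
    """
    total = Counter(WATER)
    for residue in sequence:
        total.update(RESIDUE_ATOMS[residue])
    for delta in modifications:
        total.update(delta)
    return [total[atom] for atom in ATOM_ORDER]


plain = atomic_composition("GA")
acetylated = atomic_composition("GA", modifications=[ACETYL])
```

Because a modification enters only as a change in atom counts, a model trained on such features never needs a dedicated symbol for it, which is the property the abstract attributes to the atomic-composition encoding.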
Affiliation(s)
- Robbin Bouwmeester: VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Ralf Gabriels: VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Niels Hulstaert: VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Lennart Martens: VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Sven Degroeve: VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
|
16
|
Chang CH, Yeung D, Spicer V, Ogata K, Krokhin O, Ishihama Y. Sequence-Specific Model for Predicting Peptide Collision Cross Section Values in Proteomic Ion Mobility Spectrometry. J Proteome Res 2021; 20:3600-3610. [PMID: 34133192 DOI: 10.1021/acs.jproteome.1c00185] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/16/2022]
Abstract
The contribution of peptide amino acid sequence to collision cross section values (CCS) has been investigated using a dataset of ∼134 000 peptides of four different charge states (1+ to 4+). The migration data were acquired using a two-dimensional liquid chromatography (LC)/trapped ion mobility spectrometry/quadrupole/time-of-flight mass spectrometry (MS) analysis of HeLa cell digests created using seven different proteases and were converted to CCS values. Following the previously reported modeling approaches using intrinsic size parameters (ISP), we extended this methodology to encode the position of individual residues within a peptide sequence. A generalized prediction model was built by dividing the dataset into eight groups (four charges for both tryptic/nontryptic peptides). Position-dependent ISPs were independently optimized for the eight subsets of peptides, resulting in prediction accuracy of ∼0.981 for the entire population of peptides. We find that ion mobility is strongly affected by the peptide's ability to solvate the positively charged sites. Internal positioning of polar residues and proline leads to decreased CCS values as they improve charge solvation; conversely, this ability decreases with increasing peptide charge due to electrostatic repulsion. Furthermore, higher helical propensity and peptide hydrophobicity result in a preferential formation of extended structures with higher than predicted CCS values. Finally, acidic/basic residues exhibit position-dependent ISP behavior consistent with electrostatic interaction with the peptide macrodipole, which affects the peptide helicity. The MS raw data files have been deposited with the ProteomeXchange Consortium via the jPOST partner repository (http://jpostdb.org) with the dataset identifiers PXD021440/JPST000959, PXD022800/JPST001017, and PXD026087/JPST001176.
Affiliation(s)
- Chih-Hsiang Chang: Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
- Darien Yeung: Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba R3E 0J9, Canada; Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, Winnipeg, Manitoba R3E 3P4, Canada; Department of Internal Medicine, University of Manitoba, Winnipeg, Manitoba R3E 3P4, Canada
- Victor Spicer: Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, Winnipeg, Manitoba R3E 3P4, Canada
- Kosuke Ogata: Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
- Oleg Krokhin: Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba R3E 0J9, Canada; Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, Winnipeg, Manitoba R3E 3P4, Canada; Department of Internal Medicine, University of Manitoba, Winnipeg, Manitoba R3E 3P4, Canada; Department of Chemistry, University of Manitoba, 360 Parker Building, Winnipeg, Manitoba R3T 2N2, Canada
- Yasushi Ishihama: Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan; Laboratory of Clinical and Analytical Chemistry, National Institute of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka 567-0085, Japan
|
17
|
Pang KT, Tay SJ, Wan C, Walsh I, Choo MSF, Yang YS, Choo A, Ho YS, Nguyen-Khuong T. Semi-Automated Glycoproteomic Data Analysis of LC-MS Data Using GlycopeptideGraphMS in Process Development of Monoclonal Antibody Biologics. Front Chem 2021; 9:661406. [PMID: 34084765 PMCID: PMC8167043 DOI: 10.3389/fchem.2021.661406] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/30/2021] [Accepted: 04/30/2021] [Indexed: 11/13/2022]
Abstract
The glycosylation of antibody-based proteins is vital in translating the right therapeutic outcomes of the patient. Despite this, significant infrastructure is required to analyse biologic glycosylation in various unit operations from biologic development, process development to QA/QC in bio-manufacturing. Simplified mass spectrometers offer ease of operation as well as the portability of method development across various operations. Furthermore, data analysis would need to have a degree of automation to relay information back to the manufacturing line. We set out to investigate the applicability of using a semiautomated data analysis workflow to investigate glycosylation in different biologic development test cases. The workflow involves data acquisition using a BioAccord LC-MS system with a data-analytical tool called GlycopeptideGraphMS along with Progenesis QI to semi-automate glycoproteomic characterisation and quantitation with an LC-MS1 dataset of glycopeptides and peptides. Data analysis, which involved identifying glycopeptides and their quantitative glycosylation, was performed in 30 min with minimal user intervention. To demonstrate the effectiveness of the antibody and biologic glycopeptide assignment in various scenarios akin to biologic development activities, we demonstrate the effectiveness in the filtering of IgG1 and IgG2 subclasses from human serum IgG as well as innovator drugs trastuzumab and adalimumab and glycoforms by virtue of their glycosylation pattern. We demonstrate a high correlation between conventional released glycan analysis with fluorescent tagging and glycopeptide assignment derived from GraphMS. The GraphMS workflow was then used to monitor the glycoform of our in-house trastuzumab biosimilar produced in fed-batch cultures. The demonstrated utility of GraphMS to semi-automate quantitation and qualitative identification of glycopeptides proves to be an easy data analysis method that can complement emerging multi-attribute monitoring (MAM) analytical toolsets in bioprocess environments.
Affiliation(s)
- Kuin Tian Pang: Bioprocessing Technology Institute, Agency for Science Technology and Research (ASTAR), Queenstown, Singapore
- Shi Jie Tay: Bioprocessing Technology Institute, Agency for Science Technology and Research (ASTAR), Queenstown, Singapore
- Corrine Wan: Bioprocessing Technology Institute, Agency for Science Technology and Research (ASTAR), Queenstown, Singapore
- Ian Walsh: Bioprocessing Technology Institute, Agency for Science Technology and Research (ASTAR), Queenstown, Singapore
- Matthew S F Choo: Bioprocessing Technology Institute, Agency for Science Technology and Research (ASTAR), Queenstown, Singapore
- Yuan Sheng Yang: Bioprocessing Technology Institute, Agency for Science Technology and Research (ASTAR), Queenstown, Singapore
- Andre Choo: Bioprocessing Technology Institute, Agency for Science Technology and Research (ASTAR), Queenstown, Singapore
- Ying Swan Ho: Bioprocessing Technology Institute, Agency for Science Technology and Research (ASTAR), Queenstown, Singapore
- Terry Nguyen-Khuong: Bioprocessing Technology Institute, Agency for Science Technology and Research (ASTAR), Queenstown, Singapore
|
18
|
Dannenhoffer-Lafage T, Best RB. A Data-Driven Hydrophobicity Scale for Predicting Liquid-Liquid Phase Separation of Proteins. J Phys Chem B 2021; 125:4046-4056. [PMID: 33876938 DOI: 10.1021/acs.jpcb.0c11479] [Citation(s) in RCA: 59] [Impact Index Per Article: 19.7] [Indexed: 12/22/2022]
Abstract
An accurate model for macroscale disordered assemblies of biological macromolecules such as those formed in so-called membraneless organelles would greatly assist in studying their structure, function, and dynamics. Recent evidence has suggested that liquid-liquid phase separation (LLPS) underlies the formation of membraneless organelles. While the general mechanism of exchange of macromolecule/water for macromolecule/macromolecule interactions is known to be the driving force for LLPS, the specific interactions involved are not well understood. One way that protein-water and protein-protein interactions have been understood historically is via hydrophobicity scales. However, these scales are typically optimized for describing these relative interactions in certain cases, such as protein folding or insertion of proteins into membranes. To better describe the relative interactions of proteins that undergo LLPS, we have developed a new, data-driven hydrophobicity scale. To determine the new scale, we used coarse-grained molecular dynamics simulations using the hydrophobicity scale coarse-grained model, which relates the interactions between amino acids to their hydrophobicity. Hydrophobicity values were determined via the force-balance method on a library of proteins that includes unfolded, intrinsically disordered, and phase-separating proteins (PSP). The resulting hydrophobicity scale can better predict whether a given protein will undergo LLPS at physiological conditions by using coarse-grained molecular dynamics simulations than existing hydrophobicity scales. This new scale confirms the importance of π-π interactions between amino acids as important drivers of LLPS. This new hydrophobicity scale provides a convenient and compact description of protein-protein interactions for proteins that undergo LLPS and could be used to develop new models to describe interactions between PSP and other components, such as nucleic acids.
Affiliation(s)
- Thomas Dannenhoffer-Lafage: Laboratory of Chemical Physics, National Institute for Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892-0520, United States
- Robert B Best: Laboratory of Chemical Physics, National Institute for Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892-0520, United States
|
19
|
Ivanov MV, Bubis JA, Gorshkov V, Abdrakhimov DA, Kjeldsen F, Gorshkov MV. Boosting MS1-only Proteomics with Machine Learning Allows 2000 Protein Identifications in Single-Shot Human Proteome Analysis Using 5 min HPLC Gradient. J Proteome Res 2021; 20:1864-1873. [PMID: 33720732 DOI: 10.1021/acs.jproteome.0c00863] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 01/04/2023]
Abstract
Proteome-wide analyses rely on tandem mass spectrometry and the extensive separation of proteolytic mixtures. This imposes considerable instrumental time consumption, which is one of the main obstacles in the broader acceptance of proteomics in biomedical and clinical research. Recently, we presented a fast proteomic method termed DirectMS1 based on ultrashort LC gradients as well as MS1-only mass spectra acquisition and data processing. The method allows significant reduction of the proteome-wide analysis time to a few minutes at the depth of quantitative proteome coverage of 1000 proteins at 1% false discovery rate (FDR). In this work, to further increase the capabilities of the DirectMS1 method, we explored the opportunities presented by the recent progress in the machine-learning area and applied the LightGBM decision tree boosting algorithm to the scoring of peptide feature matches when processing MS1 spectra. Furthermore, we integrated the peptide feature identification algorithm of DirectMS1 with the recently introduced peptide retention time prediction utility, DeepLC. Additional approaches to improve the performance of the DirectMS1 method are discussed and demonstrated, such as using FAIMS for gas-phase ion separation. As a result of all improvements to DirectMS1, we succeeded in identifying more than 2000 proteins at 1% FDR from the HeLa cell line in a 5 min gradient LC-FAIMS/MS1 analysis. The data sets generated and analyzed during the current study have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD023977.
Affiliation(s)
- Mark V Ivanov: V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
- Julia A Bubis: V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
- Vladimir Gorshkov: Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
- Daniil A Abdrakhimov: V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia; Moscow Institute of Physics and Technology, Institutsky lane 9, Dolgoprudny, Moscow Region 141700, Russia
- Frank Kjeldsen: Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
- Mikhail V Gorshkov: V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
|
20
|
Wen B, Zeng W, Liao Y, Shi Z, Savage SR, Jiang W, Zhang B. Deep Learning in Proteomics. Proteomics 2020; 20:e1900335. [PMID: 32939979 PMCID: PMC7757195 DOI: 10.1002/pmic.201900335] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Received: 05/27/2020] [Revised: 09/14/2020] [Indexed: 12/17/2022]
Abstract
Proteomics, the study of all the proteins in biological systems, is becoming a data-rich science. Protein sequences and structures are comprehensively catalogued in online databases. With recent advancements in tandem mass spectrometry (MS) technology, protein expression and post-translational modifications (PTMs) can be studied in a variety of biological systems at the global scale. Sophisticated computational algorithms are needed to translate the vast amount of data into novel biological insights. Deep learning automatically extracts data representations at high levels of abstraction from data, and it thrives in data-rich scientific research domains. Here, a comprehensive overview of deep learning applications in proteomics, including retention time prediction, MS/MS spectrum prediction, de novo peptide sequencing, PTM prediction, major histocompatibility complex-peptide binding prediction, and protein structure prediction, is provided. Limitations and the future directions of deep learning in proteomics are also discussed. This review will provide readers an overview of deep learning and how it can be used to analyze proteomics data.
Affiliation(s)
- Bo Wen: Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Wen-Feng Zeng: Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Chinese Academy of Sciences, Institute of Computing Technology, Beijing 100190, China
- Yuxing Liao: Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Zhiao Shi: Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Sara R. Savage: Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Wen Jiang: Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Bing Zhang: Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
|
21
|
Charoenkwan P, Kanthawong S, Nantasenamat C, Hasan MM, Shoombuatong W. iDPPIV-SCM: A Sequence-Based Predictor for Identifying and Analyzing Dipeptidyl Peptidase IV (DPP-IV) Inhibitory Peptides Using a Scoring Card Method. J Proteome Res 2020; 19:4125-4136. [PMID: 32897718 DOI: 10.1021/acs.jproteome.0c00590] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Indexed: 12/12/2022]
Abstract
The inhibition of dipeptidyl peptidase IV (DPP-IV, E.C.3.4.14.5) is well recognized as a new avenue for the treatment of Type 2 diabetes (T2D). Until now, peptide-like DPP-IV inhibitors have been shown to normalize the blood glucose concentration in T2D subjects. To the best of our knowledge, there is yet no computational model for predicting and analyzing DPP-IV inhibitory peptides using sequence information. In this study, we present for the first time a simple and easily interpretable sequence-based predictor using the scoring card method (SCM) for modeling the bioactivity of DPP-IV inhibitory peptides (iDPPIV-SCM). Particularly, the iDPPIV-SCM was developed by employing the SCM method together with the propensity scores of amino acids. Rigorous independent test results demonstrated that the proposed iDPPIV-SCM was superior to well-known machine learning (ML) classifiers (e.g., k-nearest neighbor, logistic regression, and decision tree) with demonstrated improvements of 2-11, 4-22, and 7-10% for accuracy, MCC, and AUC, respectively, while also achieving results comparable to those of the support vector machine. Furthermore, the analysis of estimated propensity scores of amino acids as derived from the iDPPIV-SCM was performed so as to provide a more in-depth understanding of the molecular basis for enhancing the DPP-IV inhibitory potency. Taken together, these results revealed that iDPPIV-SCM was superior to other well-known ML classifiers owing to its simplicity, interpretability, and validity. For the convenience of biologists, the predictive model is deployed as a publicly accessible web server at http://camt.pythonanywhere.com/iDPPIV-SCM. It is anticipated that iDPPIV-SCM can serve as an important tool for the rapid screening of promising DPP-IV inhibitory peptides prior to their synthesis.
Affiliation(s)
- Phasit Charoenkwan: Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand
- Sakawrat Kanthawong: Department of Microbiology, Faculty of Medicine, Khon Kaen University, Khon Kaen 40002, Thailand
- Chanin Nantasenamat: Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Md Mehedi Hasan: Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
- Watshara Shoombuatong: Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
|
22
|
Porto DL, da Silva ARR, Oliveira ADS, Nogueira FHA, Pedrosa MDFF, Aragão CFS. Development and validation of a stability indicating HPLC-DAD method for the determination of the peptide stigmurin. Microchem J 2020. [DOI: 10.1016/j.microc.2020.104921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
|
23
|
Timmons PB, Hewage CM. HAPPENN is a novel tool for hemolytic activity prediction for therapeutic peptides which employs neural networks. Sci Rep 2020; 10:10869. [PMID: 32616760 PMCID: PMC7331684 DOI: 10.1038/s41598-020-67701-3] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 03/03/2020] [Accepted: 06/09/2020] [Indexed: 12/11/2022]
Abstract
The growing prevalence of resistance to antibiotics motivates the search for new antibacterial agents. Antimicrobial peptides are a diverse class of well-studied membrane-active peptides which function as part of the innate host defence system, and form a promising avenue in antibiotic drug research. Some antimicrobial peptides exhibit toxicity against eukaryotic membranes, typically characterised by hemolytic activity assays, but currently, the understanding of what differentiates hemolytic and non-hemolytic peptides is limited. This study leverages advances in machine learning research to produce a novel artificial neural network classifier for the prediction of hemolytic activity from a peptide's primary sequence. The classifier achieves best-in-class performance, with cross-validated accuracy of [Formula: see text] and Matthews correlation coefficient of 0.71. This innovative classifier is available as a web server at https://research.timmons.eu/happenn, allowing the research community to utilise it for in silico screening of peptide drug candidates for high therapeutic efficacies.
Affiliation(s)
- Patrick Brendan Timmons: UCD School of Biomolecular and Biomedical Science, UCD Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Dublin 4, Ireland
- Chandralal M Hewage: UCD School of Biomolecular and Biomedical Science, UCD Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Dublin 4, Ireland
|
24
|
Wang G, Wan H, Jian X, Li Y, Ouyang J, Tan X, Zhao Y, Lin Y, Xie L. INeo-Epp: A Novel T-Cell HLA Class-I Immunogenicity or Neoantigenic Epitope Prediction Method Based on Sequence-Related Amino Acid Features. Biomed Res Int 2020; 2020:5798356. [PMID: 32626747 PMCID: PMC7315274 DOI: 10.1155/2020/5798356] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/26/2020] [Accepted: 05/23/2020] [Indexed: 12/30/2022]
Abstract
In silico T-cell epitope prediction plays an important role in immunization experimental design and vaccine preparation. Currently, most epitope prediction research focuses on peptide processing and presentation, e.g., proteasomal cleavage, transporter associated with antigen processing (TAP), and major histocompatibility complex (MHC) combination. To date, however, the mechanism for the immunogenicity of epitopes remains unclear. It is generally agreed upon that T-cell immunogenicity may be influenced by the foreignness, accessibility, molecular weight, molecular structure, molecular conformation, chemical properties, and physical properties of target peptides to different degrees. In this work, we tried to combine these factors. Firstly, we collected significant experimental HLA-I T-cell immunogenic peptide data, as well as the potential immunogenic amino acid properties. Several characteristics were extracted, including the amino acid physicochemical property of the epitope sequence, peptide entropy, eluted ligand likelihood percentile rank (EL rank(%)) score, and frequency score for an immunogenic peptide. Subsequently, a random forest classifier for T-cell immunogenic HLA-I presenting antigen epitopes and neoantigens was constructed. The classification results for the antigen epitopes outperformed the previous research (the optimal AUC = 0.81, external validation data set AUC = 0.77). As mutational epitopes generated by the coding region contain only the alterations of one or two amino acids, we assume that these characteristics might also be applied to the classification of the endogenic mutational neoepitopes also called "neoantigens." Based on mutation information and sequence-related amino acid characteristics, a prediction model of a neoantigen was established as well (the optimal AUC = 0.78). Further, an easy-to-use web-based tool "INeo-Epp" was developed for the prediction of human immunogenic antigen epitopes and neoantigen epitopes.
Affiliation(s)
- Guangzhi Wang
- College of Food Science and Technology, Shanghai Ocean University, Shanghai 201306, China
- Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai 201203, China
| | - Huihui Wan
- Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai 201203, China
- School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
| | - Xingxing Jian
- Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai 201203, China
- Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education and Key Laboratory of Carcinogenesis, National Health and Family Planning Commission, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Yuyu Li
- College of Food Science and Technology, Shanghai Ocean University, Shanghai 201306, China
| | - Jian Ouyang
- Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai 201203, China
| | - Xiaoxiu Tan
- School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
| | - Yong Zhao
- College of Food Science and Technology, Shanghai Ocean University, Shanghai 201306, China
| | - Yong Lin
- School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
| | - Lu Xie
- College of Food Science and Technology, Shanghai Ocean University, Shanghai 201306, China
- Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai 201203, China
| |
|
25
|
Yang Y, Liu X, Shen C, Lin Y, Yang P, Qiao L. In silico spectral libraries by deep learning facilitate data-independent acquisition proteomics. Nat Commun 2020; 11:146. [PMID: 31919359 PMCID: PMC6952453 DOI: 10.1038/s41467-019-13866-z] [Citation(s) in RCA: 102] [Impact Index Per Article: 25.5] [Received: 06/16/2019] [Accepted: 12/04/2019] [Indexed: 11/12/2022] Open
Abstract
Data-independent acquisition (DIA) is an emerging technology for quantitative proteomic analysis of large cohorts of samples. However, sample-specific spectral libraries built by data-dependent acquisition (DDA) experiments are required prior to DIA analysis, which is time-consuming and limits the identification/quantification by DIA to the peptides identified by DDA. Herein, we propose DeepDIA, a deep learning-based approach to generate in silico spectral libraries for DIA analysis. We demonstrate that the quality of in silico libraries predicted by instrument-specific models using DeepDIA is comparable to that of experimental libraries, and outperforms libraries generated by global models. With peptide detectability prediction, in silico libraries can be built directly from protein sequence databases. We further illustrate that DeepDIA can break through the limitation of DDA on peptide/protein detection, and enhance DIA analysis on human serum samples compared to the state-of-the-art protocol using a DDA library. We expect this work to expand the toolbox for DIA proteomics.
Affiliation(s)
- Yi Yang
- Department of Chemistry, Shanghai Stomatological Hospital, and Institutes of Biomedical Sciences, Fudan University, Shanghai, 200000, China
| | - Xiaohui Liu
- Department of Chemistry, Shanghai Stomatological Hospital, and Institutes of Biomedical Sciences, Fudan University, Shanghai, 200000, China
| | - Chengpin Shen
- Shanghai Omicsolution Co., Ltd., Shanghai, 200000, China
| | - Yu Lin
- College of Engineering and Computer Science, The Australian National University, Canberra, ACT 0200, Australia
| | - Pengyuan Yang
- Department of Chemistry, Shanghai Stomatological Hospital, and Institutes of Biomedical Sciences, Fudan University, Shanghai, 200000, China
| | - Liang Qiao
- Department of Chemistry, Shanghai Stomatological Hospital, and Institutes of Biomedical Sciences, Fudan University, Shanghai, 200000, China.
| |
|
26
|
Abstract
In bottom-up proteomics, proteins are typically identified by enzymatic digestion into peptides, tandem mass spectrometry and comparison of the tandem mass spectra with those predicted from a sequence database for peptides within measurement uncertainty from the experimentally obtained mass. Although now decreasingly common, isolated proteins or simple protein mixtures can also be identified by measuring only the masses of the peptides resulting from the enzymatic digest, without any further fragmentation. Separation methods such as liquid chromatography and electrophoresis are often used to fractionate complex protein or peptide mixtures prior to analysis by mass spectrometry. Although the primary reason for this is to avoid ion suppression and improve data quality, these separations are based on physical and chemical properties of the peptides or proteins and therefore also provide information about them. Depending on the separation method, this could be protein molecular weight (SDS-PAGE), isoelectric point (IEF), charge at a known pH (ion exchange chromatography), or hydrophobicity (reversed phase chromatography). These separations produce approximate measurements on properties that to some extent can be predicted from amino acid sequences. In the case of molecular weight of proteins without posttranslational modifications this is straightforward: simply add the molecular weights of the amino acid residues in the protein. For IEF, charge and hydrophobicity, the order of the amino acids, and folding state of the peptide or protein also matter, but it is nevertheless possible to predict the behavior of peptides and proteins in these separation methods to a degree which renders such predictions useful. 
This chapter reviews the topic of using data from separation methods for identification and validation in proteomics, with special emphasis on predicting retention times of tryptic peptides in reversed-phase chromatography under acidic conditions, as this is one of the most commonly used separation methods in bottom-up proteomics.
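The statement above that an unmodified protein's molecular weight is obtained by adding the residue masses can be sketched as follows. This is a minimal illustration, assuming standard average residue masses (in Da) and adding one water molecule for the free termini; the function and table names are hypothetical and not from the chapter.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153  # accounts for the N-terminal H and C-terminal OH


def molecular_weight(sequence):
    """Sum residue masses plus one water; no modifications are considered."""
    total = WATER_MASS
    for residue in sequence.upper():
        if residue not in AVERAGE_RESIDUE_MASS:
            raise ValueError(f"Unknown residue: {residue!r}")
        total += AVERAGE_RESIDUE_MASS[residue]
    return total


# Illustrative dipeptide Gly-Ala.
print(f"{molecular_weight('GA'):.3f} Da")
```

Isoelectric point and retention time cannot be computed by such a simple sum, because, as the abstract notes, residue order and folding state also contribute.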
|
27
|
Fontaine NT, Cadet XF, Vetrivel I. Novel Descriptors and Digital Signal Processing-Based Method for Protein Sequence Activity Relationship Study. Int J Mol Sci 2019; 20:ijms20225640. [PMID: 31718061 PMCID: PMC6888668 DOI: 10.3390/ijms20225640] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 10/13/2019] [Revised: 11/04/2019] [Accepted: 11/07/2019] [Indexed: 12/18/2022] Open
Abstract
The work aiming to unravel the correlation between protein sequence and function in the absence of structural information can be highly rewarding. We present a new way of considering descriptors from the amino acids index database for modeling and predicting the fitness value of a polypeptide chain. This approach includes the following steps: (i) Calculating Q elementary numerical sequences (Ele_SEQ) depending on the encoding of the amino acid residues, (ii) determining an extended numerical sequence (Ext_SEQ) by concatenating the Q elementary numerical sequences, wherein at least one elementary numerical sequence is a protein spectrum obtained by applying fast Fourier transformation (FFT), and (iii) predicting a value of fitness for polypeptide variants (train and/or validation set). These new descriptors were tested on four sets of proteins of different lengths (GLP-2, TNF alpha, cytochrome P450, and epoxide hydrolase) and activities (cAMP activation, binding affinity, thermostability and enantioselectivity). We show that the use of multiple physicochemical descriptors coupled with the implementation of the FFT, taking into account the interactions between residues of amino acids within the protein sequence, could lead to very significant improvement in the quality of models and predictions. The choice of the descriptor or of the combination of descriptors and/or FFT is dependent on the couple protein/fitness. This approach can provide potential users with value added to existing mutant libraries where screening efforts have so far been unsuccessful in finding improved polypeptide mutants for useful applications.
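Steps (i) and (ii) above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the Kyte-Doolittle hydropathy index stands in for one arbitrary amino acid index descriptor, the "protein spectrum" is taken to be the magnitude of the discrete Fourier transform, and a plain DFT (which yields the same values as an FFT) is used to keep the example dependency-free. All function names are hypothetical, and step (iii), the regression model, is omitted.

```python
import cmath

# Kyte-Doolittle hydropathy values, used here as one example descriptor.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence, scale):
    """Step (i): map each residue to its descriptor value (an Ele_SEQ)."""
    return [scale[residue] for residue in sequence]


def magnitude_spectrum(values):
    """Magnitude of the discrete Fourier transform of a numeric sequence."""
    n = len(values)
    return [
        abs(sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, v in enumerate(values)))
        for k in range(n)
    ]


def extended_sequence(sequence, scale=KYTE_DOOLITTLE):
    """Step (ii): concatenate the encoded sequence with its spectrum (Ext_SEQ)."""
    ele_seq = encode(sequence, scale)
    return ele_seq + magnitude_spectrum(ele_seq)


# Illustrative four-residue sequence.
ext_seq = extended_sequence("ACDG")
print(len(ext_seq))
```

The resulting vector could then be fed to any regression model for step (iii); which descriptors and whether to include the spectrum would, per the abstract, depend on the protein and fitness being modelled.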
Affiliation(s)
- Nicolas T Fontaine
- PEACCEL, Protein Engineering ACCELerator, 6 Square Albin Cachot, box 42, 75013 Paris, France
| | - Xavier F Cadet
- PEACCEL, Protein Engineering ACCELerator, 6 Square Albin Cachot, box 42, 75013 Paris, France
| | - Iyanar Vetrivel
- PEACCEL, Protein Engineering ACCELerator, 6 Square Albin Cachot, box 42, 75013 Paris, France
| |
|
28
|
Palmblad M, Lamprecht AL, Ison J, Schwämmle V. Automated workflow composition in mass spectrometry-based proteomics. Bioinformatics 2019; 35:656-664. [PMID: 30060113 PMCID: PMC6378944 DOI: 10.1093/bioinformatics/bty646] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 01/29/2018] [Revised: 07/06/2018] [Accepted: 07/26/2018] [Indexed: 11/28/2022] Open
Abstract
Motivation: Numerous software utilities operating on mass spectrometry (MS) data are described in the literature and provide specific operations as building blocks for the assembly of on-purpose workflows. Working out which tools and combinations are applicable or optimal in practice is often hard. Thus researchers face difficulties in selecting practical and effective data analysis pipelines for a specific experimental design. Results: We provide a toolkit to support researchers in identifying, comparing and benchmarking multiple workflows from individual bioinformatics tools. Automated workflow composition is enabled by the tools' semantic annotation in terms of the EDAM ontology. To demonstrate the practical use of our framework, we created and evaluated a number of logically and semantically equivalent workflows for four use cases representing frequent tasks in MS-based proteomics. Indeed we found that the results computed by the workflows could vary considerably, emphasizing the benefits of a framework that facilitates their systematic exploration. Availability and implementation: The project files and workflows are available from https://github.com/bio-tools/biotoolsCompose/tree/master/Automatic-Workflow-Composition. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Magnus Palmblad
- Center for Proteomics and Metabolomics, Leiden University Medical Center, RC Leiden, The Netherlands
| | - Anna-Lena Lamprecht
- Department of Information and Computing Sciences, Utrecht University, CC Utrecht, The Netherlands
| | - Jon Ison
- National Life Science Supercomputing Center, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Veit Schwämmle
- Department of Biochemistry and Molecular Biology and VILLUM Center for Bioanalytical Sciences, University of Southern Denmark, Odense, Denmark
| |
|
29
|
Chen AT, Franks A, Slavov N. DART-ID increases single-cell proteome coverage. PLoS Comput Biol 2019; 15:e1007082. [PMID: 31260443 PMCID: PMC6625733 DOI: 10.1371/journal.pcbi.1007082] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 08/29/2018] [Revised: 07/12/2019] [Accepted: 05/06/2019] [Indexed: 01/09/2023] Open
Abstract
Analysis by liquid chromatography and tandem mass spectrometry (LC-MS/MS) can identify and quantify thousands of proteins in microgram-level samples, such as those comprised of thousands of cells. This process, however, remains challenging for smaller samples, such as the proteomes of single mammalian cells, because reduced protein levels reduce the number of confidently sequenced peptides. To alleviate this reduction, we developed Data-driven Alignment of Retention Times for IDentification (DART-ID). DART-ID implements principled Bayesian frameworks for global retention time (RT) alignment and for incorporating RT estimates towards improved confidence estimates of peptide-spectrum-matches. When applied to bulk or to single-cell samples, DART-ID increased the number of data points by 30-50% at 1% FDR, and thus decreased missing data. Benchmarks indicate excellent quantification of peptides upgraded by DART-ID and support their utility for quantitative analysis, such as identifying cell types and cell-type specific proteins. The additional datapoints provided by DART-ID boost the statistical power and double the number of proteins identified as differentially abundant in monocytes and T-cells. DART-ID can be applied to diverse experimental designs and is freely available at http://dart-id.slavovlab.net.
Affiliation(s)
- Albert Tian Chen
- Department of Bioengineering, Northeastern University, Boston, Massachusetts, United States of America
- Barnett Institute, Northeastern University, Boston, Massachusetts, United States of America
| | - Alexander Franks
- Department of Statistics and Applied Probability, University of California Santa Barbara, California, United States of America
| | - Nikolai Slavov
- Department of Bioengineering, Northeastern University, Boston, Massachusetts, United States of America
- Barnett Institute, Northeastern University, Boston, Massachusetts, United States of America
- Department of Biology, Northeastern University, Boston, Massachusetts, United States of America
| |
|
30
|
Tienaho J, Karonen M, Muilu-Mäkelä R, Wähälä K, Leon Denegri E, Franzén R, Karp M, Santala V, Sarjala T. Metabolic Profiling of Water-Soluble Compounds from the Extracts of Dark Septate Endophytic Fungi (DSE) Isolated from Scots Pine (Pinus sylvestris L.) Seedlings Using UPLC-Orbitrap-MS. Molecules 2019; 24:E2330. [PMID: 31242564 PMCID: PMC6630819 DOI: 10.3390/molecules24122330] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/24/2019] [Revised: 06/14/2019] [Accepted: 06/22/2019] [Indexed: 01/23/2023] Open
Abstract
Endophytes are microorganisms living inside plant hosts and are known to be beneficial for the host plant vitality. In this study, we isolated three endophytic fungus species from the roots of Scots pine seedlings growing in a drained Finnish peatland setting. The isolated fungi belonged to dark septate endophytes (DSE). The metabolic profiles of the hot water extracts of the fungi were investigated using Ultrahigh Performance Liquid Chromatography with Diode Array Detection and Electrospray Ionization source Mass Spectrometry with Orbitrap analyzer (UPLC-DAD-ESI-MS-Orbitrap). Out of 318 metabolites, we were able to identify 220, the majority of which were amino acids and peptides. Additionally, opine amino acids, amino acid quinones, Amadori compounds, cholines, nucleobases, nucleosides, nucleotides, siderophores, sugars, sugar alcohols and disaccharides were found, as well as other previously reported metabolites from plants or endophytes. Some differences in the metabolic profiles, regarding the amount and identity of the metabolites found, were observed even though the fungi were isolated from the same host. Many of the discovered metabolites have been described as possessing biological activities and properties, which may make a favorable contribution to the host plant nutrient availability or abiotic and biotic stress tolerance.
Affiliation(s)
- Jenni Tienaho
- Faculty of Natural Sciences and Engineering, Tampere University, FI-33101 Tampere, Finland.
- Natural Resources Institute Finland (Luke), FI-00791 Helsinki, Finland.
| | - Maarit Karonen
- Natural Chemistry Research Group, Department of Chemistry, University of Turku, FI-20014 Turku, Finland.
| | | | - Kristiina Wähälä
- Department of Chemistry, University of Helsinki, FI-00014 Helsinki, Finland.
| | | | - Robert Franzén
- School of Chemical Engineering, Department of Chemistry and Materials Science, Aalto University, FI-00076 Espoo, Finland.
| | - Matti Karp
- Faculty of Natural Sciences and Engineering, Tampere University, FI-33101 Tampere, Finland.
| | - Ville Santala
- Faculty of Natural Sciences and Engineering, Tampere University, FI-33101 Tampere, Finland.
| | - Tytti Sarjala
- Natural Resources Institute Finland (Luke), FI-00791 Helsinki, Finland.
| |
|
31
|
Hoffmann W, Langenhan J, Huhmann S, Moschner J, Chang R, Accorsi M, Seo J, Rademann J, Meijer G, Koksch B, Bowers MT, von Helden G, Pagel K. Eine intrinsische Hydrophobieskala für Aminosäuren und ihre Anwendung auf fluorierte Verbindungen. Angew Chem Int Ed Engl 2019. [DOI: 10.1002/ange.201813954] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
Affiliation(s)
- Waldemar Hoffmann
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustraße 3 / Königin-Luise-Straße 2+4, 14195 Berlin, Germany
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4–6, 14195 Berlin, Germany
| | - Jennifer Langenhan
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4–6, 14195 Berlin, Germany
| | - Susanne Huhmann
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustraße 3 / Königin-Luise-Straße 2+4, 14195 Berlin, Germany
| | - Johann Moschner
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustraße 3 / Königin-Luise-Straße 2+4, 14195 Berlin, Germany
| | - Rayoon Chang
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustraße 3 / Königin-Luise-Straße 2+4, 14195 Berlin, Germany
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4–6, 14195 Berlin, Germany
| | - Matteo Accorsi
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustraße 3 / Königin-Luise-Straße 2+4, 14195 Berlin, Germany
| | - Jongcheol Seo
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4–6, 14195 Berlin, Germany
- present address: University of Science and Technology (POSTECH), Department of Chemistry, 77 Cheongam-ro, Pohang 37673, Republic of Korea
| | - Jörg Rademann
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustraße 3 / Königin-Luise-Straße 2+4, 14195 Berlin, Germany
| | - Gerard Meijer
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4–6, 14195 Berlin, Germany
| | - Beate Koksch
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustraße 3 / Königin-Luise-Straße 2+4, 14195 Berlin, Germany
| | - Michael T. Bowers
- University of California Santa Barbara, Department of Chemistry & Biochemistry, Santa Barbara, California 93106, USA
| | - Gert von Helden
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4–6, 14195 Berlin, Germany
| | - Kevin Pagel
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustraße 3 / Königin-Luise-Straße 2+4, 14195 Berlin, Germany
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4–6, 14195 Berlin, Germany
| |
|
32
|
Hoffmann W, Langenhan J, Huhmann S, Moschner J, Chang R, Accorsi M, Seo J, Rademann J, Meijer G, Koksch B, Bowers MT, von Helden G, Pagel K. An Intrinsic Hydrophobicity Scale for Amino Acids and Its Application to Fluorinated Compounds. Angew Chem Int Ed Engl 2019; 58:8216-8220. [PMID: 30958917 DOI: 10.1002/anie.201813954] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 12/07/2018] [Revised: 03/01/2019] [Indexed: 11/10/2022]
Abstract
More than 100 hydrophobicity scales have been introduced, with each being based on a distinct condensed-phase approach. However, a comparison of the hydrophobicity values gained from different techniques, and their relative ranking, is not straightforward, as the interactions between the environment and the amino acid are unique to each method. Here, we overcome this limitation by studying the properties of amino acids in the clean-room environment of the gas phase. In the gas phase, entropic contributions from the hydrophobic effect are by default absent and only the polarity of the side chain dictates the self-assembly. This allows for the derivation of a novel hydrophobicity scale, which is based solely on the interaction between individual amino acid units within the cluster and thus more accurately reflects the intrinsic nature of a side chain. This principle can be further applied to classify non-natural derivatives, as shown here for fluorinated amino acid variants.
Affiliation(s)
- Waldemar Hoffmann
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustrasse 3/Königin-Luise-Strasse 2+4, 14195, Berlin, Germany; Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4-6, 14195, Berlin, Germany
| | - Jennifer Langenhan
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4-6, 14195, Berlin, Germany
| | - Susanne Huhmann
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustrasse 3/Königin-Luise-Strasse 2+4, 14195, Berlin, Germany
| | - Johann Moschner
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustrasse 3/Königin-Luise-Strasse 2+4, 14195, Berlin, Germany
| | - Rayoon Chang
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustrasse 3/Königin-Luise-Strasse 2+4, 14195, Berlin, Germany; Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4-6, 14195, Berlin, Germany
| | - Matteo Accorsi
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustrasse 3/Königin-Luise-Strasse 2+4, 14195, Berlin, Germany
| | - Jongcheol Seo
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4-6, 14195, Berlin, Germany; present address: University of Science and Technology (POSTECH), Department of Chemistry, 77 Cheongam-ro, Pohang, 37673, Korea
| | - Jörg Rademann
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustrasse 3/Königin-Luise-Strasse 2+4, 14195, Berlin, Germany
| | - Gerard Meijer
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4-6, 14195, Berlin, Germany
| | - Beate Koksch
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustrasse 3/Königin-Luise-Strasse 2+4, 14195, Berlin, Germany
| | - Michael T Bowers
- University of California Santa Barbara, Department of Chemistry & Biochemistry, Santa Barbara, California, 93106, USA
| | - Gert von Helden
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4-6, 14195, Berlin, Germany
| | - Kevin Pagel
- Freie Universität Berlin, Department of Biology, Chemistry and Pharmacy, Takustrasse 3/Königin-Luise-Strasse 2+4, 14195, Berlin, Germany; Fritz-Haber-Institut der Max-Planck-Gesellschaft, Department of Molecular Physics, Faradayweg 4-6, 14195, Berlin, Germany
| |
|
33
|
Bouwmeester R, Martens L, Degroeve S. Comprehensive and Empirical Evaluation of Machine Learning Algorithms for Small Molecule LC Retention Time Prediction. Anal Chem 2019; 91:3694-3703. [DOI: 10.1021/acs.analchem.8b05820] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Indexed: 01/05/2023]
Affiliation(s)
- Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Sven Degroeve
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| |
|
34
|
Levitsky LI, Klein JA, Ivanov MV, Gorshkov MV. Pyteomics 4.0: Five Years of Development of a Python Proteomics Framework. J Proteome Res 2019; 18:709-714. [PMID: 30576148 DOI: 10.1021/acs.jproteome.8b00717] [Citation(s) in RCA: 89] [Impact Index Per Article: 17.8] [Indexed: 11/29/2022]
Abstract
Many of the novel ideas that drive today's proteomic technologies are focused essentially on experimental or data-processing workflows. The latter are implemented and published in a number of ways, from custom scripts and programs, to projects built using general-purpose or specialized workflow engines; a large part of routine data processing is performed manually or with custom scripts that remain unpublished. Facilitating the development of reproducible data-processing workflows becomes essential for increasing the efficiency of proteomic research. To assist in overcoming the bioinformatics challenges in the daily practice of proteomic laboratories, 5 years ago we developed and announced Pyteomics, a freely available open-source library providing Python interfaces to proteomic data. We summarize the new functionality of Pyteomics developed during the time since its introduction.
Affiliation(s)
- Lev I Levitsky
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia; V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
| | - Joshua A Klein
- Bioinformatics Program, Boston University, Boston, Massachusetts 02215, United States
| | - Mark V Ivanov
- V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
| | - Mikhail V Gorshkov
- V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
| |
|
35
|
Gielbert A, Thorne JK, Plater JM, Thorne L, Griffiths PC, Simmons MM, Cassar CA. Molecular characterisation of atypical BSE prions by mass spectrometry and changes following transmission to sheep and transgenic mouse models. PLoS One 2018; 13:e0206505. [PMID: 30408075 PMCID: PMC6224059 DOI: 10.1371/journal.pone.0206505] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/20/2018] [Accepted: 10/14/2018] [Indexed: 11/18/2022] Open
Abstract
The prion hypothesis proposes a causal relationship between the misfolded prion protein (PrPSc) molecular entity and the disease transmissible spongiform encephalopathy (TSE). Variations in the conformation of PrPSc are associated with different forms of TSE and different risks to animal and human health. Since the discovery of atypical forms of bovine spongiform encephalopathy (BSE) in 2003, scientists have progressed the molecular characterisation of the associated PrPSc in order to better understand these risks, both in cattle as the natural host and following experimental transmission to other species. Here we report the development of a mass spectrometry based assay for molecular characterisation of bovine proteinase K (PK) treated PrPSc (PrPres) by quantitative identification of its N-terminal amino acid profiles (N-TAAPs) and tryptic peptides. We have applied the assay to classical, H-type and L-type BSE prions purified from cattle, transgenic (Tg) mice expressing the bovine (Tg110 and Tg1896) or ovine (TgEM16) prion protein gene, and sheep brain. We determined that, for classical BSE in cattle, the G96 N-terminal cleavage site dominated, while the range of cleavage sites was wider following transmission to Tg mice and sheep. For L-BSE in cattle and Tg bovinised mice, a C-terminal shift was identified in the N-TAAP distribution compared to classical BSE, consistent with observations by Western blot (WB). For L-BSE transmitted to sheep, both N-TAAP and tryptic peptide profiles were found to be changed compared to cattle, but less so following transmission to Tg ovinised mice. Relative abundances of aglycosyl peptides were found to be significantly different between the atypical BSE forms in cattle as well as in other hosts. The enhanced resolution provided by molecular analysis of PrPres using mass spectrometry has improved insight into the molecular changes following transmission of atypical BSE to other species.
Affiliation(s)
- Adriana Gielbert
- Animal and Plant Health Agency-Weybridge, Addlestone, Surrey, United Kingdom
| | - Jemma K. Thorne
- Animal and Plant Health Agency-Weybridge, Addlestone, Surrey, United Kingdom
| | - Jane M. Plater
- Animal and Plant Health Agency-Weybridge, Addlestone, Surrey, United Kingdom
| | - Leigh Thorne
- Animal and Plant Health Agency-Weybridge, Addlestone, Surrey, United Kingdom
| | - Peter C. Griffiths
- Animal and Plant Health Agency-Weybridge, Addlestone, Surrey, United Kingdom
| | - Marion M. Simmons
- Animal and Plant Health Agency-Weybridge, Addlestone, Surrey, United Kingdom
| | - Claire A. Cassar
- Animal and Plant Health Agency-Weybridge, Addlestone, Surrey, United Kingdom
| |
Collapse
|
36
|
Poorinmohammad N, Hamedi J, Moghaddam MHAM. Sequence-based analysis and prediction of lantibiotics: A machine learning approach. Comput Biol Chem 2018; 77:199-206. [PMID: 30342319 DOI: 10.1016/j.compbiolchem.2018.10.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/11/2018] [Revised: 08/15/2018] [Accepted: 10/05/2018] [Indexed: 10/28/2022]
Abstract
Lantibiotics, an important group of ribosomally synthesized peptides, represent an important arsenal of novel promising antimicrobials showing high potency in fighting against the prevalence of antibiotic resistance among microbial pathogens. However, due to the lack of high throughput strategies for the isolation and identification of these compounds, our information regarding their structure and especially sequence-based properties is far from complete. Therefore, in the present study, a comprehensive sequence-based analysis of these peptides was performed with the help of a machine learning approach together with a feature selection technique. Meanwhile, an attempt to develop an accurate computational model for prediction of lantibiotics was made via constructing two datasets of 280 and 190 lantibiotic and non-lantibiotic antimicrobial peptide sequences, respectively. Based on the conducted approach and as a result of our search for a subset of relevant features of lantibiotics, particular types of sequence-based features were observed to be preferred in lantibiotics, the knowledge-based implementation of which can be used as strategies for lantibiotic bioengineering purposes. Moreover, a SMO-based classifier was developed for the prediction of lantibiotics with accuracy and specificity values of 88.5% and 94%, respectively, which shows the great potential of the developed algorithm for the prediction of lantibiotics. In conclusion, the accurate predictor algorithm as well as the identified sequence-based distinctive properties of lantibiotics can give valuable information in both the fields of lantibiotic discovery and bioengineering.
Affiliation(s)
- Naghmeh Poorinmohammad
  - Department of Microbial Biotechnology, School of Biology and Center of Excellence in Phylogeny of Living Organisms, College of Science, University of Tehran, Tehran, Iran
  - Microbial Technology and Products Research Center, University of Tehran, Tehran, Iran
- Javad Hamedi
  - Department of Microbial Biotechnology, School of Biology and Center of Excellence in Phylogeny of Living Organisms, College of Science, University of Tehran, Tehran, Iran
  - Microbial Technology and Products Research Center, University of Tehran, Tehran, Iran
- Mohammad Hossein Abbaspour Motlagh Moghaddam
  - Department of Microbial Biotechnology, School of Biology and Center of Excellence in Phylogeny of Living Organisms, College of Science, University of Tehran, Tehran, Iran
  - Microbial Technology and Products Research Center, University of Tehran, Tehran, Iran
|
37
|
Tarasova IA, Masselon CD, Gorshkov AV, Gorshkov MV. Predictive chromatography of peptides and proteins as a complementary tool for proteomics. Analyst 2018; 141:4816-4832. [PMID: 27419248 DOI: 10.1039/c6an00919k] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Indexed: 01/25/2023]
Abstract
In the last couple of decades, considerable effort has been focused on developing methods for quantitative and qualitative proteome characterization. The method of choice in this characterization is mass spectrometry used in combination with sample separation. One of the most widely used separation techniques at the front end of a mass spectrometer is high performance liquid chromatography (HPLC). A unique feature of HPLC is its specificity to the amino acid sequence of separated peptides and proteins. This specificity may provide additional information about the peptides or proteins under study which is complementary to the mass spectrometry data. The value of this information for proteomics has been recognized in the past few decades, which has stimulated significant effort in the development and implementation of computational and theoretical models for the prediction of peptide retention time for a given sequence. Here we review the advances in this area and the utility of predicted retention times for proteomic applications.
Affiliation(s)
- Irina A Tarasova
  - Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Christophe D Masselon
  - CEA, iRTSV-BGE, Laboratoire d'Etude de la Dynamique des Protéomes, Grenoble, F-38000, France
  - INSERM, U1038-BGE, F-38000, Grenoble, France
- Alexander V Gorshkov
  - N.N. Semenov Institute of Chemical Physics, Russian Academy of Sciences, Moscow 119991, Russia
- Mikhail V Gorshkov
  - Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
  - Moscow Institute of Physics and Technology (State University), Dolgoprudny, Moscow region 141700, Russia
|
38
|
Mohammed Y, Palmblad M. Visualization and application of amino acid retention coefficients obtained from modeling of peptide retention. J Sep Sci 2018; 41:3644-3653. [PMID: 30047222 PMCID: PMC6175132 DOI: 10.1002/jssc.201800488] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/05/2018] [Revised: 07/17/2018] [Accepted: 07/18/2018] [Indexed: 11/08/2022]
Abstract
We introduce a method for data inspection in liquid separations of peptides using amino acid retention coefficients and their relative change across experiments. Our method allows for the direct comparison between actual experimental conditions, regardless of sample content and without the use of internal standards. The modeling uses linear regression of peptide retention time as a function of amino acid composition. We demonstrate the pH dependency of the model in a control experiment where the pH of the mobile phase was changed in a controlled way. We introduce a score to identify the false discovery rate on peptide spectrum match level that corresponds to the set of most robust models, i.e. to maximize the shared agreement between experiments. We demonstrate the method's utility in reversed-phase liquid chromatography using 24 datasets with minimal peptide overlap. We apply our method to datasets obtained from a public repository representing various separation designs, including one-dimensional reversed-phase liquid chromatography followed by tandem mass spectrometry, and two-dimensional online strong cation exchange coupled to reversed-phase liquid chromatography followed by tandem mass spectrometry, and highlight new insights. Our method provides a simple yet powerful way to inspect data quality, in particular for multidimensional separations, improving comparability of data at no additional experimental cost.
Affiliation(s)
- Yassene Mohammed
  - Center for Proteomics and Metabolomics, Leiden University Medical Center, Leiden, Netherlands
  - University of Victoria-Genome British Columbia Proteomics Centre, University of Victoria, Victoria, Canada
- Magnus Palmblad
  - Center for Proteomics and Metabolomics, Leiden University Medical Center, Leiden, Netherlands
|
39
|
Levitsky LI, Ivanov MV, Lobas AA, Bubis JA, Tarasova IA, Solovyeva EM, Pridatchenko ML, Gorshkov MV. IdentiPy: An Extensible Search Engine for Protein Identification in Shotgun Proteomics. J Proteome Res 2018; 17:2249-2255. [PMID: 29682971 DOI: 10.1021/acs.jproteome.7b00640] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Indexed: 12/21/2022]
Abstract
We present an open-source, extensible search engine for shotgun proteomics. Implemented in the Python programming language, IdentiPy shows competitive processing speed and sensitivity compared with the state-of-the-art search engines. It is equipped with a user-friendly web interface, IdentiPy Server, enabling the use of a single server installation accessed from multiple workstations. Using a simplified version of the X!Tandem scoring algorithm and its novel "autotune" feature, IdentiPy outperforms the popular alternatives on high-resolution data sets. Autotune adjusts the search parameters for the particular data set, resulting in improved search efficiency and simplifying the user experience. IdentiPy with the autotune feature shows higher sensitivity compared with the evaluated search engines. IdentiPy Server has built-in postprocessing and protein inference procedures and provides graphic visualization of the statistical properties of the data set and the search results. It is open-source and can be freely extended to use third-party scoring functions or processing algorithms and allows customization of the search workflow for specialized applications.
Affiliation(s)
- Lev I Levitsky
  - Moscow Institute of Physics and Technology, 9 Institutskiy per., Dolgoprudny, Moscow Region 141700, Russian Federation
  - V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
- Mark V Ivanov
  - V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
- Anna A Lobas
  - V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
- Julia A Bubis
  - V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
- Irina A Tarasova
  - V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
- Elizaveta M Solovyeva
  - V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
- Marina L Pridatchenko
  - V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
- Mikhail V Gorshkov
  - V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
|
40
|
Lobas AA, Levitsky LI, Fichtenbaum A, Surin AK, Pridatchenko ML, Mitulovic G, Gorshkov AV, Gorshkov MV. Predictive Liquid Chromatography of Peptides Based on Hydrophilic Interactions for Mass Spectrometry-Based Proteomics. Journal of Analytical Chemistry 2018. [DOI: 10.1134/s1061934817140076] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
|
41
|
Badgett MJ, Boyes B, Orlando R. Peptide retention prediction using hydrophilic interaction liquid chromatography coupled to mass spectrometry. J Chromatogr A 2018; 1537:58-65. [PMID: 29338870 PMCID: PMC5805588 DOI: 10.1016/j.chroma.2017.12.055] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/10/2017] [Revised: 12/12/2017] [Accepted: 12/20/2017] [Indexed: 10/18/2022]
Abstract
A model that predicts retention for peptides using a HALO® penta-HILIC column and gradient elution was created. Coefficients for each amino acid were derived using linear regression analysis and these coefficients can be summed to predict the retention of peptides. This model has a high correlation between experimental and predicted retention times (0.946), which is on par with previous RP and HILIC models. External validation of the model was performed using a set of H. pylori samples on the same LC-MS system used to create the model, and the deviation from actual to predicted times was low. Apart from amino acid composition, length and location of amino acid residues on a peptide were examined and two site-specific corrections for hydrophobic residues at the N-terminus as well as hydrophobic residues one spot over from the N-terminus were created.
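The additive idea in this abstract (one coefficient per amino acid, derived by linear regression and summed to predict retention) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the peptide sequences, the retention values, the four-letter alphabet and the function names are hypothetical and do not come from the paper, and the site-specific N-terminal corrections mentioned in the abstract are not modeled. The fit uses NumPy's ordinary least-squares solver.

```python
import numpy as np


def composition_matrix(peptides, alphabet):
    """Count how often each residue in `alphabet` occurs in each peptide."""
    return np.array(
        [[p.count(aa) for aa in alphabet] for p in peptides], dtype=float
    )


def fit_retention_coefficients(peptides, retention_times):
    """Least-squares fit of one additive coefficient per amino acid."""
    alphabet = sorted(set("".join(peptides)))
    X = composition_matrix(peptides, alphabet)
    y = np.asarray(retention_times, dtype=float)
    coeffs, _residuals, _rank, _sv = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(alphabet, coeffs))


def predict_retention(peptide, coeffs):
    """Predicted retention is the sum of the per-residue coefficients."""
    return sum(coeffs[aa] for aa in peptide)


# Hypothetical, noise-free toy data (not taken from the paper): retention
# values are generated from made-up coefficients so the fit can be checked.
true_coeffs = {"G": 1.0, "K": 3.0, "L": 0.5, "S": 2.5}
train_peptides = ["GLK", "LLS", "GGS", "KKL", "SLK", "GSK"]
train_times = [sum(true_coeffs[aa] for aa in p) for p in train_peptides]

fitted = fit_retention_coefficients(train_peptides, train_times)
predicted = predict_retention("GKLS", fitted)
# Close to 7.0 (1.0 + 3.0 + 0.5 + 2.5) for this noise-free toy data.
print(round(float(predicted), 3))
```

With real measurements the fit would not be exact, and the composition matrix needs enough distinct peptides for every coefficient to be determined; the toy set above was chosen so that it has full column rank.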
Affiliation(s)
- Majors J Badgett
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602 United States
- Barry Boyes
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602 United States
  - Advanced Materials Technology, Wilmington, DE 19810 United States
- Ron Orlando
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602 United States
|
42
|
Wang JR, Huang WL, Tsai MJ, Hsu KT, Huang HL, Ho SY. ESA-UbiSite: accurate prediction of human ubiquitination sites by identifying a set of effective negatives. Bioinformatics 2017; 33:661-668. [PMID: 28062441 DOI: 10.1093/bioinformatics/btw701] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 12/23/2015] [Accepted: 11/08/2016] [Indexed: 01/20/2023] Open
Abstract
Motivation: Numerous ubiquitination sites remain undiscovered because of the limitations of mass spectrometry-based methods. Existing prediction methods use randomly selected non-validated sites as non-ubiquitination sites to train ubiquitination site prediction models. Results: We propose an evolutionary screening algorithm (ESA) to select effective negatives among non-validated sites and an ESA-based prediction method, ESA-UbiSite, to identify human ubiquitination sites. The ESA selects non-validated sites least likely to be ubiquitination sites as training negatives. Moreover, the ESA and ESA-UbiSite use a set of well-selected physicochemical properties together with a support vector machine for accurate prediction. Experimental results show that ESA-UbiSite with effective negatives achieved 0.92 test accuracy and a Matthews correlation coefficient of 0.48, better than existing prediction methods. The ESA increased ESA-UbiSite's test accuracy from 0.75 to 0.92 and can improve other post-translational modification site prediction methods. Availability and Implementation: An ESA-UbiSite-based web server has been established at http://iclab.life.nctu.edu.tw/iclab_webtools/ESAUbiSite/. Contact: syho@mail.nctu.edu.tw. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jyun-Rong Wang
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Wen-Lin Huang
  - Department and Institute of Industrial Engineering and Management, Minghsin University of Science and Technology, Hsinchu 300, Taiwan
- Ming-Ju Tsai
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Kai-Ti Hsu
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Hui-Ling Huang
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
- Shinn-Ying Ho
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
|
43
|
Lamers SL, Fogel GB, Liu ES, Barbier AE, Rodriguez CW, Singer EJ, Nolan DJ, Rose R, McGrath MS. Brain-specific HIV Nef identified in multiple patients with neurological disease. J Neurovirol 2017; 24:1-15. [PMID: 29063512 DOI: 10.1007/s13365-017-0586-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/03/2017] [Revised: 08/28/2017] [Accepted: 10/03/2017] [Indexed: 12/11/2022]
Abstract
HIV-1 Nef is a flexible, multifunctional protein with several cellular targets that is required for pathogenicity of the virus. This protein maintains a high degree of genetic variation among intra- and inter-host isolates. HIV Nef is relevant to HIV-associated neurological diseases (HAND) in patients treated with combined antiretroviral therapy because of the protein's role in promoting survival and migration of infected brain macrophages. In this study, we analyzed 2020 HIV Nef sequences derived from 22 different tissues and 31 subjects using a novel computational approach. This approach combines statistical regression and evolved neural networks (ENNs) to classify brain sequences based on the physical and chemical characteristics of functional Nef domains. Based on training, testing, and validation data, the method successfully classified brain Nef sequences at 84.5% and provided informative features for further examination. These included physicochemical features associated with the Src-homology-3 binding domain, the Nef loop (including the AP-2 Binding region), and a cytokine-binding domain. Non-brain sequences from patients with HIV-associated neurological disease were frequently classified as brain, suggesting that the approach could indicate neurological risk using blood-derived virus or for the development of biomarkers for use in assay systems aimed at drug efficacy studies for the treatment of HIV-associated neurological diseases.
|
44
|
Williams BJ, Ciavarini SJ, Devlin C, Cohn SM, Xie R, Vissers JPC, Martin LB, Caswell A, Langridge JI, Geromanos SJ. Multi-mode acquisition (MMA): An MS/MS acquisition strategy for maximizing selectivity, specificity and sensitivity of DIA product ion spectra. Proteomics 2017; 16:2284-301. [PMID: 27296928 DOI: 10.1002/pmic.201500492] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 11/23/2015] [Revised: 05/16/2016] [Accepted: 06/10/2016] [Indexed: 01/08/2023]
Abstract
In proteomics studies, it is generally accepted that depth of coverage and dynamic range are limited in data-directed acquisitions. The serial nature of the method limits both sensitivity and the number of precursor ions that can be sampled. To that end, a number of data-independent acquisition (DIA) strategies have been introduced. These methods are, for the most part, immune to the sampling issue; nevertheless, some do have other limitations with respect to sensitivity. The major limitation with DIA approaches is interference, i.e., MS/MS spectra are highly chimeric and often incapable of being identified using conventional database search engines. Utilizing each available dimension of separation prior to ion detection, we present a new multi-mode acquisition (MMA) strategy multiplexing both narrowband and wideband DIA acquisitions in a single analytical workflow. The iterative nature of the MMA workflow limits the adverse effects of interference with minimal loss in sensitivity. Qualitative identification can be performed by selected ion chromatograms or conventional database search strategies.
Affiliation(s)
- Rong Xie
  - Waters Corporation, Milford, MA, USA
|
45
|
Moruz L, Käll L. Peptide retention time prediction. Mass Spectrometry Reviews 2017; 36:615-623. [PMID: 26799864 DOI: 10.1002/mas.21488] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Received: 01/27/2015] [Accepted: 11/12/2015] [Indexed: 06/05/2023]
Abstract
Most methods for interpreting data from shotgun proteomics experiments are to a large degree dependent on being able to predict properties of peptide-ions. Often such predicted properties are limited to molecular mass and fragment spectra, but here we put focus on a perhaps underutilized property, a peptide's chromatographic retention time. We review a couple of different principles of retention time prediction, and their applications within computational proteomics. © 2016 Wiley Periodicals, Inc. Mass Spec Rev 36:615-623, 2017.
Affiliation(s)
- Luminita Moruz
  - Science for Life Laboratory, School of Biotechnology, Royal Institute of Technology - KTH, Stockholm, Sweden
- Lukas Käll
  - Science for Life Laboratory, School of Biotechnology, Royal Institute of Technology - KTH, Stockholm, Sweden
|
46
|
Jain T, Boland T, Lilov A, Burnina I, Brown M, Xu Y, Vásquez M. Prediction of delayed retention of antibodies in hydrophobic interaction chromatography from sequence using machine learning. Bioinformatics 2017; 33:3758-3766. [DOI: 10.1093/bioinformatics/btx519] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 04/10/2017] [Accepted: 08/11/2017] [Indexed: 12/16/2022] Open
Affiliation(s)
- Tushar Jain
  - Computational Biology, Adimab, Palo Alto, CA, USA
- Todd Boland
  - Computational Biology, Adimab, Palo Alto, CA, USA
- Yingda Xu
  - Protein Analytics, Adimab, Lebanon, NH, USA
|
47
|
Cervantes-Torres J, Segura-Velázquez R, Padilla P, Sciutto E, Fragoso G. High stability of the immunomodulatory GK-1 synthetic peptide measured by a reversed phase high-performance liquid chromatography method. J Chromatogr B Analyt Technol Biomed Life Sci 2017; 1060:97-102. [DOI: 10.1016/j.jchromb.2017.05.027] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 02/22/2017] [Revised: 05/24/2017] [Accepted: 05/25/2017] [Indexed: 10/19/2022]
|
48
|
Krokhin OV, Ezzati P, Spicer V. Peptide Retention Time Prediction in Hydrophilic Interaction Liquid Chromatography: Data Collection Methods and Features of Additive and Sequence-Specific Models. Anal Chem 2017; 89:5526-5533. [DOI: 10.1021/acs.analchem.7b00537] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 01/05/2023]
Affiliation(s)
- Oleg V. Krokhin
  - Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, Manitoba R3E 3P4, Canada
  - Department of Internal Medicine, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, Manitoba R3E 3P4, Canada
- Peyman Ezzati
  - Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, Manitoba R3E 3P4, Canada
- Vic Spicer
  - Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, Manitoba R3E 3P4, Canada
|
49
|
Nekrasova NA, Kurbatova SV. Quantitative structure–chromatographic retention correlations of quinoline derivatives. J Chromatogr A 2017; 1492:55-60. [DOI: 10.1016/j.chroma.2017.02.063] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/30/2016] [Revised: 02/07/2017] [Accepted: 02/26/2017] [Indexed: 12/19/2022]
|
50
|
Meglič A, Pecman A, Rozina T, Leštan D, Sedmak B. Electrochemical inactivation of cyanobacteria and microcystin degradation using a boron-doped diamond anode - A potential tool for cyanobacterial bloom control. J Environ Sci (China) 2017; 53:248-261. [PMID: 28372749 DOI: 10.1016/j.jes.2016.02.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/15/2015] [Revised: 10/21/2015] [Accepted: 02/19/2016] [Indexed: 06/07/2023]
Abstract
Cyanobacterial blooms are global phenomena that can occur in calm and nutrient-rich (eutrophic) fresh and marine waters. Human exposure to cyanobacteria and their biologically active products is possible during water sports and various water activities, or by ingestion of contaminated water. Although the vast majority of harmful cyanobacterial products are confined to the interior of the cells, these are eventually released into the surrounding water following natural or artificially induced cell death. Electrochemical oxidation has been used here to damage cyanobacteria to halt their proliferation, and for microcystin degradation under in-vitro conditions. Partially spent Jaworski growth medium with no addition of supporting electrolytes was used. Electrochemical treatment resulted in the cyanobacterial loss of cell-buoyancy regulation, cell proliferation arrest, and eventual cell death. Microcystin degradation was studied separately in two basic modes of treatment: batch-wise flow, and constant flow, for electrolytic-cell exposure. Batch-wise exposure simulates treatment under environmental conditions, while constant flow is more appropriate for the study of boron-doped diamond electrode efficacy under laboratory conditions. The effectiveness of microcystin degradation was established using high-performance liquid chromatography-photodiode array detector analysis, while the biological activities of the products were estimated using a colorimetric protein phosphatase-1 inhibition assay. The results indicate potential for the application of electro-oxidation methods for the control of bloom events by taking advantage of specific intrinsic ecological characteristics of bloom-forming cyanobacteria. The applicability of the use of boron-doped diamond electrodes in remediation of water exposed to cyanobacteria bloom events is discussed.
Affiliation(s)
- Andrej Meglič
  - Arhel Ltd., Pustovrhova c. 63, SI-1000 Ljubljana, Slovenia
- Anja Pecman
  - Centre for Soil and Environmental Sciences, Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, SI-1000 Ljubljana, Slovenia
- Domen Leštan
  - Centre for Soil and Environmental Sciences, Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, SI-1000 Ljubljana, Slovenia
- Bojan Sedmak
  - Department of Genetic Toxicology and Cancer Biology, National Institute of Biology, Večna pot 111, SI-1000 Ljubljana, Slovenia
|