1
Wang J, Yu A, Cho BG, Mechref Y. Assessing the hydrophobicity of glycopeptides using reversed-phase liquid chromatography and tandem mass spectrometry. J Chromatogr A 2023; 1706:464237. [PMID: 37523904 DOI: 10.1016/j.chroma.2023.464237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 07/19/2023] [Accepted: 07/20/2023] [Indexed: 08/02/2023]
Abstract
Retention time is one of the most important parameters that has been widely used to demonstrate the separation results obtained from liquid chromatography (LC) platforms. However, retention time can shift when samples are tested with different instruments and laboratories, which hinders the identification process of analytes when comparing data collected from different LC systems. To address this problem, hydrophobicity index was introduced for retention time normalization of the glycopeptides separated by reversed-phase LC (RPLC). Tandem MS was used for the detection and identification of glycopeptides. In addition, the influence of different types of glycans on the hydrophobicity of peptide backbones was studied by comparing the retention time of glycopeptides with their non-glycosylated counterparts. The hydrophobicity of tryptic digested glycopeptides derived from model glycoproteins, including bovine fetuin, α1-acid glycoprotein, and haptoglobin from human plasma, were evaluated based on the hydrophobicity index of the standard peptides from a peptide retention time calibration mixture. The reduction of hydrophobicity of multiple peptide backbones was observed due to the hydrophilic glycan structures. By comparing the hydrophobicity index of glycopeptides collected from different time and instruments, the day-to-day and lab-to-lab comparisons suggested high reliability and reproducibility of this approach. The RSD% of hydrophobicity index from inter-lab experiments was 1.2%, while the RSD% of retention time was 5.1%. Then, the applications of this method were demonstrated on complex glycopeptide samples extracted from human blood serum. The hydrophobicity index can be applied to address the retention time shift when using different instruments, thereby boosting confidence of the characterization of glycopeptides.
Affiliation(s)
- Junyao Wang
- Department of Chemistry and Biochemistry, Texas Tech University, United States
- Aiying Yu
- Department of Chemistry and Biochemistry, Texas Tech University, United States
- Byeong Gwan Cho
- Department of Chemistry and Biochemistry, Texas Tech University, United States
- Yehia Mechref
- Department of Chemistry and Biochemistry, Texas Tech University, United States.
2
Sandmann CL, Schulz JF, Ruiz-Orera J, Kirchner M, Ziehm M, Adami E, Marczenke M, Christ A, Liebe N, Greiner J, Schoenenberger A, Muecke MB, Liang N, Moritz RL, Sun Z, Deutsch EW, Gotthardt M, Mudge JM, Prensner JR, Willnow TE, Mertins P, van Heesch S, Hubner N. Evolutionary origins and interactomes of human, young microproteins and small peptides translated from short open reading frames. Mol Cell 2023; 83:994-1011.e18. [PMID: 36806354 PMCID: PMC10032668 DOI: 10.1016/j.molcel.2023.01.023] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 09/08/2022] [Revised: 12/12/2022] [Accepted: 01/25/2023] [Indexed: 02/19/2023]
Abstract
All species continuously evolve short open reading frames (sORFs) that can be templated for protein synthesis and may provide raw materials for evolutionary adaptation. We analyzed the evolutionary origins of 7,264 recently cataloged human sORFs and found that most were evolutionarily young and had emerged de novo. We additionally identified 221 previously missed sORFs potentially translated into peptides of up to 15 amino acids-all of which are smaller than the smallest human microprotein annotated to date. To investigate the bioactivity of sORF-encoded small peptides and young microproteins, we subjected 266 candidates to a mass-spectrometry-based interactome screen with motif resolution. Based on these interactomes and additional cellular assays, we can associate several candidates with mRNA splicing, translational regulation, and endocytosis. Our work provides insights into the evolutionary origins and interaction potential of young and small proteins, thereby helping to elucidate this underexplored territory of the human proteome.
Affiliation(s)
- Clara-L Sandmann
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany
- Jana F Schulz
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany
- Jorge Ruiz-Orera
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Marieluise Kirchner
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Core Facility Proteomics, 10117 Berlin, Germany
- Matthias Ziehm
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Core Facility Proteomics, 10117 Berlin, Germany
- Eleonora Adami
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Maike Marczenke
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Annabel Christ
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Nina Liebe
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Johannes Greiner
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Aaron Schoenenberger
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Michael B Muecke
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany; Charité-Universitätsmedizin, 10117 Berlin, Germany
- Ning Liang
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Zhi Sun
- Institute for Systems Biology, Seattle, WA 98109, USA
- Michael Gotthardt
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany; Charité-Universitätsmedizin, 10117 Berlin, Germany
- Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- John R Prensner
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Division of Pediatric Hematology/Oncology, Boston Children's Hospital, Boston, MA 02115, USA
- Thomas E Willnow
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Department of Biomedicine, Aarhus University, 8000 Aarhus, Denmark
- Philipp Mertins
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Core Facility Proteomics, 10117 Berlin, Germany
- Norbert Hubner
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany; Charité-Universitätsmedizin, 10117 Berlin, Germany.
3
Ryan KA, Bruening ML. Online protein digestion in membranes between capillary electrophoresis and mass spectrometry. Analyst 2023; 148:1611-1619. [PMID: 36912593 DOI: 10.1039/d3an00106g] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/10/2023]
Abstract
This research employs pepsin-containing membranes to digest proteins online after a capillary electrophoresis (CE) separation and prior to tandem mass spectrometry. Proteolysis after the separation allows the peptides from a given protein to enter the mass spectrometer in a single plug. Thus, migration time can serve as an additional criterion for confirming the identification of a peptide. The membrane resides in a sheath-flow electrospray ionization (ESI) source to enable digestion immediately before spray into the mass spectrometer, thus limiting separation of the digested peptides. Using the same membrane, digestion occurred reproducibly during 20 consecutive CE analyses performed over a 10 h period. Additionally, after separating a mixture of six unreduced proteins with CE, online digestion facilitated protein identification with at least 2 identifiable peptides for all the proteins. Sequence coverages were >75% for myoglobin and carbonic anhydrase II but much lower for proteins containing disulfide bonds. Development of methods for efficient separation of reduced proteins or identification of cross-linked peptides should enhance sequence coverages for proteins with disulfide bonds. Migration times for the peptides identified from a specific protein differed by <∼30 s, which allows for rejection of some spurious peptide identifications.
Affiliation(s)
- Kendall A Ryan
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA.
- Merlin L Bruening
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA; Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
4
Neely BA, Dorfer V, Martens L, Bludau I, Bouwmeester R, Degroeve S, Deutsch EW, Gessulat S, Käll L, Palczynski P, Payne SH, Rehfeldt TG, Schmidt T, Schwämmle V, Uszkoreit J, Vizcaíno JA, Wilhelm M, Palmblad M. Toward an Integrated Machine Learning Model of a Proteomics Experiment. J Proteome Res 2023; 22:681-696. [PMID: 36744821 PMCID: PMC9990124 DOI: 10.1021/acs.jproteome.2c00711] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 02/07/2023]
Abstract
In recent years machine learning has made extensive progress in modeling many aspects of mass spectrometry data. We brought together proteomics data generators, repository managers, and machine learning experts in a workshop with the goals to evaluate and explore machine learning applications for realistic modeling of data from multidimensional mass spectrometry-based proteomics analysis of any sample or organism. Following this sample-to-data roadmap helped identify knowledge gaps and define needs. Being able to generate bespoke and realistic synthetic data has legitimate and important uses in system suitability, method development, and algorithm benchmarking, while also posing critical ethical questions. The interdisciplinary nature of the workshop informed discussions of what is currently possible and future opportunities and challenges. In the following perspective we summarize these discussions in the hope of conveying our excitement about the potential of machine learning in proteomics and to inspire future research.
Affiliation(s)
- Benjamin A Neely
- National Institute of Standards and Technology, Charleston, South Carolina 29412, United States
- Viktoria Dorfer
- Bioinformatics Research Group, University of Applied Sciences Upper Austria, Softwarepark 11, 4232 Hagenberg, Austria
- Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium; Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
- Isabell Bludau
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
- Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium; Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
- Sven Degroeve
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium; Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
- Eric W Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Lukas Käll
- Science for Life Laboratory, KTH - Royal Institute of Technology, 171 21 Solna, Sweden
- Pawel Palczynski
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense, Denmark
- Samuel H Payne
- Department of Biology, Brigham Young University, Provo, Utah 84602, United States
- Tobias Greisager Rehfeldt
- Institute for Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark
- Veit Schwämmle
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense, Denmark
- Julian Uszkoreit
- Medical Proteome Analysis, Center for Protein Diagnostics (ProDi), Ruhr University Bochum, 44801 Bochum, Germany; Medizinisches Proteom-Center, Medical Faculty, Ruhr University Bochum, 44801 Bochum, Germany
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Mathias Wilhelm
- Computational Mass Spectrometry, Technical University of Munich (TUM), 85354 Freising, Germany
- Magnus Palmblad
- Leiden University Medical Center, Postbus 9600, 2300 RC Leiden, The Netherlands
5
Scrosati PM, Konermann L. Atomistic Details of Peptide Reversed-Phase Liquid Chromatography from Molecular Dynamics Simulations. Anal Chem 2023; 95:3892-3900. [PMID: 36745777 DOI: 10.1021/acs.analchem.2c05667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023]
Abstract
Peptide separations by reversed-phase liquid chromatography (RPLC) are an integral part of bottom-up proteomics. These separations typically employ C18 columns with water/acetonitrile gradient elution in the presence of formic acid. Despite the widespread use of such workflows, the exact nature of peptide interactions with the stationary and mobile phases is poorly understood. Here, we employ microsecond molecular dynamics (MD) simulations to uncover details of peptide RPLC. We examined two tryptic peptides, a hydrophobic and a hydrophilic species, in a slit pore lined with C18 chains that were grafted onto SiO2 support. Our simulations explored peptide trapping, followed by desorption and elution. Trapping in an aqueous mobile phase was initiated by C18 contacts with Lys butyl moieties. This was followed by extensive anchoring of nonpolar side chains (Leu/Ile/Val) in the C18 layer. Exposure to water/acetonitrile triggered peptide desorption in a stepwise fashion; charged sites close to the termini were the first to lift off, followed by the other residues. During water/acetonitrile elution, both peptides preferentially resided close to the pore center. The hydrophilic peptide exhibited no contacts with the stationary phase under these conditions. In contrast, the hydrophobic species underwent multiple transient Leu/Ile/Val binding interactions with C18 chains. These nonpolar interactions represent the foundation of differential peptide retention, in agreement with the experimental elution behavior of the two peptides. Extensive peptide/formate ion pairing was observed in water/acetonitrile, particularly at N-terminal sites. Overall, this work uncovers an unprecedented level of RPLC molecular details, paving the way for MD simulations as a future tool for improving retention prediction algorithms and for the design of novel column materials.
Affiliation(s)
- Pablo M Scrosati
- Department of Chemistry, The University of Western Ontario, London, Ontario, N6A 5B7, Canada
- Lars Konermann
- Department of Chemistry, The University of Western Ontario, London, Ontario, N6A 5B7, Canada
6
Pauletti BA, Granato DC, M Carnielli C, Câmara GA, Normando AGC, Telles GP, Leme AFP. Typic: A Practical and Robust Tool to Rank Proteotypic Peptides for Targeted Proteomics. J Proteome Res 2023; 22:539-545. [PMID: 36480281 DOI: 10.1021/acs.jproteome.2c00585] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/13/2022]
Abstract
The selection of a suitable proteotypic peptide remains a challenge for designing a targeted quantitative proteomics assay. Although the criteria are well-established in the literature, the selection of these peptides is often performed in a subjective and time-consuming manner. Here, we have developed a practical and semiautomated workflow implemented in an open-source program named Typic. Typic is designed to run in a command line and a graphical interface to help selecting a list of proteotypic peptides for targeted quantitation. The tool combines the input data and downloads additional data from public repositories to produce a file per protein as output. Each output file includes relevant information to the selection of proteotypic peptides organized in a table, a colored ranking of peptides according to their potential value as targets for quantitation and auxiliary plots to assist users in the task of proteotypic peptides selection. Taken together, Typic leads to a practical and straightforward data extraction from multiple data sets, allowing the identification of most suitable proteotypic peptides based on established criteria, in an unbiased and standardized manner, ultimately leading to a more robust targeted proteomics assay.
Affiliation(s)
- Bianca A Pauletti
- Laboratório de Espectrometria de Massas, Laboratório Nacional de Biociências (LNBio), Centro Nacional de Pesquisa em Energia e Materiais (CNPEM), Campinas, 13083-970 São Paulo, Brazil
- Daniela C Granato
- Laboratório de Espectrometria de Massas, Laboratório Nacional de Biociências (LNBio), Centro Nacional de Pesquisa em Energia e Materiais (CNPEM), Campinas, 13083-970 São Paulo, Brazil
- Carolina M Carnielli
- Laboratório de Espectrometria de Massas, Laboratório Nacional de Biociências (LNBio), Centro Nacional de Pesquisa em Energia e Materiais (CNPEM), Campinas, 13083-970 São Paulo, Brazil
- Guilherme A Câmara
- Laboratório de Espectrometria de Massas, Laboratório Nacional de Biociências (LNBio), Centro Nacional de Pesquisa em Energia e Materiais (CNPEM), Campinas, 13083-970 São Paulo, Brazil
- Ana Gabriela C Normando
- Laboratório de Espectrometria de Massas, Laboratório Nacional de Biociências (LNBio), Centro Nacional de Pesquisa em Energia e Materiais (CNPEM), Campinas, 13083-970 São Paulo, Brazil
- Guilherme P Telles
- Instituto de Computação, Universidade Estadual de Campinas (UNICAMP), Campinas, 13083-852 São Paulo, Brazil
- Adriana F Paes Leme
- Laboratório de Espectrometria de Massas, Laboratório Nacional de Biociências (LNBio), Centro Nacional de Pesquisa em Energia e Materiais (CNPEM), Campinas, 13083-970 São Paulo, Brazil
7
Merlotti A, Sadacca B, Arribas YA, Ngoma M, Burbage M, Goudot C, Houy A, Rocañín-Arjó A, Lalanne A, Seguin-Givelet A, Lefevre M, Heurtebise-Chrétien S, Baudon B, Oliveira G, Loew D, Carrascal M, Wu CJ, Lantz O, Stern MH, Girard N, Waterfall JJ, Amigorena S. Noncanonical splicing junctions between exons and transposable elements represent a source of immunogenic recurrent neo-antigens in patients with lung cancer. Sci Immunol 2023; 8:eabm6359. [PMID: 36735774 DOI: 10.1126/sciimmunol.abm6359] [Citation(s) in RCA: 27] [Impact Index Per Article: 27.0] [Received: 09/30/2021] [Accepted: 01/12/2023] [Indexed: 02/05/2023]
Abstract
Although most characterized tumor antigens are encoded by canonical transcripts (such as differentiation or tumor-testis antigens) or mutations (both driver and passenger mutations), recent results have shown that noncanonical transcripts including long noncoding RNAs and transposable elements (TEs) can also encode tumor-specific neo-antigens. Here, we investigate the presentation and immunogenicity of tumor antigens derived from noncanonical mRNA splicing events between coding exons and TEs. Comparing human non-small cell lung cancer (NSCLC) and diverse healthy tissues, we identified a subset of splicing junctions that is both tumor specific and shared across patients. We used HLA-I peptidomics to identify peptides encoded by tumor-specific junctions in primary NSCLC samples and lung tumor cell lines. Recurrent junction-encoded peptides were immunogenic in vitro, and CD8+ T cells specific for junction-encoded epitopes were present in tumors and tumor-draining lymph nodes from patients with NSCLC. We conclude that noncanonical splicing junctions between exons and TEs represent a source of recurrent, immunogenic tumor-specific antigens in patients with NSCLC.
Affiliation(s)
- Antonela Merlotti
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
- Benjamin Sadacca
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
- INSERM U830, PSL Research University, Institute Curie Research Center, Paris, France
- Department of Translational Research, PSL Research University, Institut Curie Research Center, Paris, France
- Yago A Arribas
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
- Mercia Ngoma
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
- Marianne Burbage
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
- Christel Goudot
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
- Alexandre Houy
- INSERM U830, PSL Research University, Institute Curie Research Center, Paris, France
- Ares Rocañín-Arjó
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
- Ana Lalanne
- Institut Curie, Laboratory of Clinical immunology, 75005 Paris, France
- Institut Curie, CIC-BT1428, 75005 Paris, France
- Agathe Seguin-Givelet
- Thoracic Surgery Department, Curie-Montsouris Thorax Institute - Institut Mutualiste Montsouris, Paris, France
- Paris 13 University, Sorbonne Paris Cité, Faculty of Medicine SMBH, Bobigny, France
- Marine Lefevre
- Department of Pathology, Institute Mutualiste Montsouris, Paris, France
- Blandine Baudon
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
- Giacomo Oliveira
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Damarys Loew
- Institut Curie, Centre de Recherche, Laboratoire de Spectrométrie de Masse Protéomique, PSL Research University, Paris cedex 05, France
- Montserrat Carrascal
- Biological and Environmental Proteomics, Institut d'Investigacions Biomèdiques de Barcelona-CSIC, IDIBAPS, Roselló 161, 6a planta, 08036 Barcelona, Spain
- Catherine J Wu
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Olivier Lantz
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
- Institut Curie, Laboratory of Clinical immunology, 75005 Paris, France
- Institut Curie, CIC-BT1428, 75005 Paris, France
- Marc-Henri Stern
- INSERM U830, PSL Research University, Institute Curie Research Center, Paris, France
- Nicolas Girard
- Thoracic Surgery Department, Curie-Montsouris Thorax Institute - Institut Mutualiste Montsouris, Paris, France
- Joshua J Waterfall
- INSERM U830, PSL Research University, Institute Curie Research Center, Paris, France
- Department of Translational Research, PSL Research University, Institut Curie Research Center, Paris, France
- Sebastian Amigorena
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
8
Yeung D, Spicer V, Zahedi RP, Krokhin O. Exploring the variable space of shallow machine learning models for reversed-phase retention time prediction. Comput Struct Biotechnol J 2023; 21:2446-2453. [PMID: 37090433 PMCID: PMC10113922 DOI: 10.1016/j.csbj.2023.02.047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 02/24/2023] [Accepted: 02/24/2023] [Indexed: 03/02/2023] [Open access]
Abstract
Peptide retention time (RT) prediction algorithms are tools to study and identify the physicochemical properties that drive the peptide-sorbent interaction. Traditional RT algorithms use multiple linear regression with manually curated parameters to determine the degree of direct contribution for each parameter and improvements to RT prediction accuracies relied on superior feature engineering. Deep learning led to a significant increase in RT prediction accuracy and automated feature engineering via chaining multiple learning modules. However, the significance and the identity of these extracted variables are not well understood due to the inherent complexity when interpreting "relationships-of-relationships" found in deep learning variables. To achieve both accuracy and interpretability simultaneously, we isolated individual modules used in deep learning and the isolated modules are the shallow learners employed for RT prediction in this work. Using a shallow convolutional neural network (CNN) and gated recurrent unit (GRU), we find that the spatial features obtained via the CNN correlate with real-world physicochemical properties namely cross-collisional sections (CCS) and variations of assessable surface area (ASA). Furthermore, we determined that the discovered parameters are "micro-coefficients" that contribute to the "macro-coefficient" - hydrophobicity. Manually embedding CCS and the variations of ASA to the GRU model yielded an R2 = 0.981 using only 525 variables and can represent 88% of the ∼110,000 tryptic peptides used in our dataset. This work highlights the feature discovery process of our shallow learners can achieve beyond traditional RT models in performance and have better interpretability when compared with the deep learning RT algorithms found in the literature.
9
Lenčo J, Jadeja S, Naplekov DK, Krokhin OV, Khalikova MA, Chocholouš P, Urban J, Broeckhoven K, Nováková L, Švec F. Reversed-Phase Liquid Chromatography of Peptides for Bottom-Up Proteomics: A Tutorial. J Proteome Res 2022; 21:2846-2892. [PMID: 36355445 DOI: 10.1021/acs.jproteome.2c00407] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 11/12/2022]
Abstract
The performance of the current bottom-up liquid chromatography hyphenated with mass spectrometry (LC-MS) analyses has undoubtedly been fueled by spectacular progress in mass spectrometry. It is thus not surprising that the MS instrument attracts the most attention during LC-MS method development, whereas optimizing conditions for peptide separation using reversed-phase liquid chromatography (RPLC) remains somewhat in its shadow. Consequently, the wisdom of the fundaments of chromatography is slowly vanishing from some laboratories. However, the full potential of advanced MS instruments cannot be achieved without highly efficient RPLC. This is impossible to attain without understanding fundamental processes in the chromatographic system and the properties of peptides important for their chromatographic behavior. We wrote this tutorial intending to give practitioners an overview of critical aspects of peptide separation using RPLC to facilitate setting the LC parameters so that they can leverage the full capabilities of their MS instruments. After briefly introducing the gradient separation of peptides, we discuss their properties that affect the quality of LC-MS chromatograms the most. Next, we address the in-column and extra-column broadening. The last section is devoted to key parameters of LC-MS methods. We also extracted trends in practice from recent bottom-up proteomics studies and correlated them with the current knowledge on peptide RPLC separation.
Affiliation(s)
- Juraj Lenčo
- Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- Siddharth Jadeja
- Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- Denis K Naplekov
- Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- Oleg V Krokhin
- Department of Internal Medicine, Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg R3E 3P4, Manitoba, Canada
- Maria A Khalikova
- Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- Petr Chocholouš
- Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- Jiří Urban
- Department of Chemistry, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- Ken Broeckhoven
- Department of Chemical Engineering (CHIS), Faculty of Engineering, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussel, Belgium
- Lucie Nováková
- Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- František Švec
- Department of Analytical Chemistry, Faculty of Pharmacy in Hradec Králové, Charles University, Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
10
Polunin KE, Fedotkina OS, Polunina IA, Buryak AK. Optimizing the Chromatographic Separation of Antibacterial Peptides of Galleria mellonella. Russian Journal of Physical Chemistry A 2022. [DOI: 10.1134/s0036024422080209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
11
Nonspecific adsorption evaluation and general minimization strategy in peptide analysis based on ultra-performance liquid chromatography-mass spectrometry. Se Pu 2022; 40:616-624. [PMID: 35791600 PMCID: PMC9404093 DOI: 10.3724/sp.j.1123.2021.12012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022] [Open access]
Abstract
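Proteomics techniques are widely used in the development of novel peptide and protein therapeutics and in the discovery of biomarkers for clinical diagnosis. However, the nonspecific adsorption of peptide and protein macromolecules poses a major challenge for analytical method development, and a general strategy is urgently needed to evaluate and reduce the negative impact of nonspecific adsorption on the detection of macromolecules by ultra-performance liquid chromatography-mass spectrometry (UPLC-MS). Using bovine serum albumin (BSA) as a model, this study examined the correlation between the physicochemical properties of the peptides in its enzymatic digest and their degree of adsorption, and designed a classification strategy based on peptide response and degree of adsorption. For Class II peptides, which show high response and strong adsorption, experiments were designed across the whole workflow, from the choice of centrifuge tubes and sample vials in sample preparation to the choice of column stationary phase, flow rate, gradient, column temperature, and needle wash solution in the liquid chromatography system, in order to explore the factors that influence nonspecific adsorption and a general strategy for minimizing it. The results showed that the degree of peptide adsorption correlated significantly (p<0.05) with physicochemical parameters such as the HPLC Index and peptide length, but these parameters alone explained only 30% of the extent of peptide adsorption. Modified polypropylene materials gave higher recoveries of peptide solutions during storage or pretreatment (recovery above 80% within 24 h). During the investigation and optimization of liquid chromatography conditions, a column with C8 packing, a high flow rate, a shallow gradient, and a strong needle wash solution reduced carryover to a minimum (to 1/150 of the original level). The effect of column temperature on carryover varied considerably between peptides, so individual peptides need to be examined case by case to achieve low carryover. The study used detailed data to investigate and minimize the nonspecific adsorption of a model peptide set during analysis, and it points to the influencing factors that deserve particular attention, together with effective solutions, when establishing analytical methods for protein macromolecular drugs.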
|
12
|
Polunin KE, Fedotkina OS, Polunina IA, Buryak AK. Modeling the Chromatographic Behavior of Antibacterial Peptides under Conditions of RP HPLC. Russian Journal of Physical Chemistry A 2022. [DOI: 10.1134/s0036024422060188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
13
|
Chen W, McCool EN, Sun L, Zang Y, Ning X, Liu X. Evaluation of Machine Learning Models for Proteoform Retention and Migration Time Prediction in Top-Down Mass Spectrometry. J Proteome Res 2022; 21:1736-1747. [PMID: 35616364 PMCID: PMC9250612 DOI: 10.1021/acs.jproteome.2c00124] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Reversed-phase liquid chromatography (RPLC) and capillary zone electrophoresis (CZE) are two primary proteoform separation methods in mass spectrometry (MS)-based top-down proteomics. Proteoform retention time (RT) prediction in RPLC and migration time (MT) prediction in CZE provide additional information for accurate proteoform identification and quantification. While existing methods are mainly focused on peptide RT and MT prediction in bottom-up MS, there is still a lack of methods for proteoform RT and MT prediction in top-down MS. We systematically evaluated eight machine learning models and a transfer learning method for proteoform RT prediction and five models and the transfer learning method for proteoform MT prediction. Experimental results showed that a gated recurrent unit (GRU)-based model with transfer learning achieved a high accuracy (R = 0.978) for proteoform RT prediction and that the GRU-based model and a fully connected neural network model obtained a high accuracy of R = 0.982 and 0.981 for proteoform MT prediction, respectively.
Affiliation(s)
- Wenrong Chen
- Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, Indiana 46202, United States
| | - Elijah N McCool
- Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
| | - Liangliang Sun
- Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
| | - Yong Zang
- Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
| | - Xia Ning
- Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio 43210, United States.,Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio 43210, United States.,Translational Data Analytics Institute, The Ohio State University, Columbus, Ohio 43210, United States
| | - Xiaowen Liu
- Tulane Center for Biomedical Informatics and Genomics, Tulane University, New Orleans, Louisiana 70112, United States.,Deming Department of Medicine, Tulane University, New Orleans, Louisiana 70112, United States
| |
|
14
|
Vibert J, Saulnier O, Collin C, Petit F, Borgman KJE, Vigneau J, Gautier M, Zaidi S, Pierron G, Watson S, Gruel N, Hénon C, Postel-Vinay S, Deloger M, Raynal V, Baulande S, Laud-Duval K, Hill V, Grossetête S, Dingli F, Loew D, Torrejon J, Ayrault O, Orth MF, Grünewald TGP, Surdez D, Coulon A, Waterfall JJ, Delattre O. Oncogenic chimeric transcription factors drive tumor-specific transcription, processing, and translation of silent genomic regions. Mol Cell 2022; 82:2458-2471.e9. [PMID: 35550257 DOI: 10.1016/j.molcel.2022.04.019] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/15/2021] [Revised: 02/20/2022] [Accepted: 04/14/2022] [Indexed: 12/11/2022]
Abstract
Many cancers are characterized by gene fusions encoding oncogenic chimeric transcription factors (TFs) such as EWS::FLI1 in Ewing sarcoma (EwS). Here, we find that EWS::FLI1 induces the robust expression of a specific set of novel spliced and polyadenylated transcripts within otherwise transcriptionally silent regions of the genome. These neogenes (NGs) are virtually undetectable in large collections of normal tissues or non-EwS tumors and can be silenced by CRISPR interference at regulatory EWS::FLI1-bound microsatellites. Ribosome profiling and proteomics further show that some NGs are translated into highly EwS-specific peptides. More generally, we show that hundreds of NGs can be detected in diverse cancers characterized by chimeric TFs. Altogether, this study identifies the transcription, processing, and translation of novel, specific, highly expressed multi-exonic transcripts from otherwise silent regions of the genome as a new activity of aberrant TFs in cancer.
Affiliation(s)
- Julien Vibert
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France; INSERM U830, Integrative Functional Genomics of Cancer Lab, PSL Research University, Institut Curie Research Center, Paris, France; Department of Translational Research, PSL Research University, Institut Curie Research Center, Paris, France
| | - Olivier Saulnier
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France
| | - Céline Collin
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France
| | - Floriane Petit
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France
| | - Kyra J E Borgman
- Institut Curie, PSL Research University, Sorbonne Université, CNRS UMR 3664, Laboratoire Dynamique du Noyau, 75005 Paris, France; Institut Curie, PSL Research University, Sorbonne Université, CNRS UMR168, Laboratoire Physico Chimie Curie, 75005 Paris, France
| | - Jérômine Vigneau
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France
| | - Maud Gautier
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France
| | - Sakina Zaidi
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France
| | - Gaëlle Pierron
- Unité de Génétique Somatique, Service d'oncogénétique, Institut Curie, Centre Hospitalier, Paris, France
| | - Sarah Watson
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France; Medical Oncology Department, PSL Research University, Institut Curie Hospital, Paris, France
| | - Nadège Gruel
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France; Department of Translational Research, PSL Research University, Institut Curie Research Center, Paris, France
| | - Clémence Hénon
- ATIP-Avenir group, Inserm Unit U981, Gustave Roussy, Villejuif, France
| | - Sophie Postel-Vinay
- ATIP-Avenir group, Inserm Unit U981, Gustave Roussy, Villejuif, France; Drug Development Department, DITEP, Gustave Roussy, Villejuif, France
| | - Marc Deloger
- Bioinformatics and Computational Systems Biology of Cancer, PSL Research University, Mines Paris Tech, INSERM U900, Paris, France
| | - Virginie Raynal
- Institut Curie Genomics of Excellence (ICGex) Platform, PSL Research University, Institut Curie Research Center, Paris, France
| | - Sylvain Baulande
- Institut Curie Genomics of Excellence (ICGex) Platform, PSL Research University, Institut Curie Research Center, Paris, France
| | - Karine Laud-Duval
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France
| | - Véronique Hill
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France
| | - Sandrine Grossetête
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France
| | - Florent Dingli
- Laboratoire de Spectrométrie de Masse Protéomique, PSL Research University, Institut Curie Research Center, Paris, France
| | - Damarys Loew
- Laboratoire de Spectrométrie de Masse Protéomique, PSL Research University, Institut Curie Research Center, Paris, France
| | - Jacob Torrejon
- Institut Curie, CNRS UMR3347, INSERM, PSL Research University, Orsay, France; CNRS UMR 3347, INSERM U1021, Université Paris Sud, Université Paris-Saclay, Orsay, France
| | - Olivier Ayrault
- Institut Curie, CNRS UMR3347, INSERM, PSL Research University, Orsay, France; CNRS UMR 3347, INSERM U1021, Université Paris Sud, Université Paris-Saclay, Orsay, France
| | - Martin F Orth
- Max-Eder Research Group for Pediatric Sarcoma Biology, Institute of Pathology, Faculty of Medicine, LMU Munich, Munich, Germany
| | - Thomas G P Grünewald
- Division of Translational Pediatric Sarcoma Research, German Cancer Research Center (DKFZ), German Cancer Consortium (DKTK), Heidelberg, Germany; Hopp-Children's Cancer Center (KiTZ), Heidelberg, Germany; Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
| | - Didier Surdez
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France
| | - Antoine Coulon
- Institut Curie, PSL Research University, Sorbonne Université, CNRS UMR 3664, Laboratoire Dynamique du Noyau, 75005 Paris, France; Institut Curie, PSL Research University, Sorbonne Université, CNRS UMR168, Laboratoire Physico Chimie Curie, 75005 Paris, France
| | - Joshua J Waterfall
- INSERM U830, Integrative Functional Genomics of Cancer Lab, PSL Research University, Institut Curie Research Center, Paris, France; Department of Translational Research, PSL Research University, Institut Curie Research Center, Paris, France.
| | - Olivier Delattre
- INSERM U830, Équipe Labellisée LNCC, Diversity and Plasticity of Childhood Tumors Lab, PSL Research University, SIREDO Oncology Center, Institut Curie Research Center, Paris, France; Institut Curie, PSL Research University, Sorbonne Université, CNRS UMR 3664, Laboratoire Dynamique du Noyau, 75005 Paris, France.
| |
|
15
|
Almeida A, Gabriel M, Firlej V, Martin‐Jaular L, Lejars M, Cipolla R, Petit F, Vogt N, San‐Roman M, Dingli F, Loew D, Destouches D, Vacherot F, de la Taille A, Théry C, Morillon A. Urinary extracellular vesicles contain mature transcriptome enriched in circular and long noncoding RNAs with functional significance in prostate cancer. J Extracell Vesicles 2022; 11:e12210. [PMID: 35527349 PMCID: PMC9081490 DOI: 10.1002/jev2.12210] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/20/2021] [Revised: 02/22/2022] [Accepted: 03/15/2022] [Indexed: 12/14/2022] Open
Abstract
Long noncoding (lnc)RNAs modulate gene expression and are an unexpected source of neoantigens. Despite the immense interest in them, their ability to be transferred to and control adjacent cells is unknown. Extracellular Vesicles (EVs) offer a protective environment for nucleic acids, with pro- and antitumourigenic functions exerted by controlling the immune response. In contrast to extracellular nonvesicular RNA, few studies have addressed the full RNA content of EVs from human fluids or compared it with that of their tissue of origin. Here, we performed Total RNA-Sequencing on six Formalin-Fixed-Paraffin-Embedded (FFPE) prostate cancer (PCa) tumour tissues and their paired urinary (u)EVs to provide the first whole transcriptome comparison from the same patients. UEVs contain a simplified transcriptome with intron-free cytoplasmic transcripts and enriched lnc/circular (circ)RNAs, strikingly common to an independent urinary cohort of 20 patients. Our full cellular and EV transcriptome comparison within three PCa cell lines identified a set of 14 overlapping uEV-circRNAs characterized as essential for prostate cell proliferation in vitro and 28 uEV-lncRNAs belonging to the cancer-related lncRNA census (CLC2). In addition, we found 15 uEV-lncRNAs predicted to encode 768 high-affinity neoantigens, three of whose encoded ORFs produced unmodified peptides detectable by mass spectrometry. Our dual analysis of EV lnc/circRNAs in both urinary and in vitro EVs provides a fundamental resource for future phenotypic characterization of uEV-lnc/circRNAs involved in PCa.
Affiliation(s)
- Anna Almeida
- CNRS UMR3244, Sorbonne University, PSL University, Institut Curie, Centre de Recherche, Paris, France
- Departement de Recherche Translationnelle, PSL University, Institut Curie, Centre de Recherche, Paris, France
| | - Marc Gabriel
- CNRS UMR3244, Sorbonne University, PSL University, Institut Curie, Centre de Recherche, Paris, France
| | - Virginie Firlej
- AP‐HP, Hôpital H. Mondor, Plateforme de Ressources Biologiques, Créteil, France
- Univ Paris Est Creteil, UR TRePCa, Créteil, France
| | - Lorena Martin‐Jaular
- INSERM U932, PSL University, Institut Curie, Centre de Recherche, Paris, France
- Curie Core Tech Extracellular Vesicles, Institut Curie, Centre de Recherche, Paris, France
| | - Matthieu Lejars
- CNRS UMR3244, Sorbonne University, PSL University, Institut Curie, Centre de Recherche, Paris, France
| | - Rocco Cipolla
- CNRS UMR3244, Sorbonne University, PSL University, Institut Curie, Centre de Recherche, Paris, France
| | - Floriane Petit
- Tumour Biology, INSERM U820, Sorbonne Université, PSL University, Institut Curie, Centre de Recherche, Paris, France
| | - Nicolas Vogt
- CNRS UMR3244, Sorbonne University, PSL University, Institut Curie, Centre de Recherche, Paris, France
| | - Mabel San‐Roman
- CNRS UMR3215, Sorbonne Université, PSL University, Institut Curie, Centre de Recherche, Paris, France
| | - Florent Dingli
- Laboratoire de Spectrométrie de Masse Protéomique, PSL Research University, Institut Curie, Centre de Recherche, Paris, France
| | - Damarys Loew
- Laboratoire de Spectrométrie de Masse Protéomique, PSL Research University, Institut Curie, Centre de Recherche, Paris, France
| | | | | | | | - Clotilde Théry
- INSERM U932, PSL University, Institut Curie, Centre de Recherche, Paris, France
- Curie Core Tech Extracellular Vesicles, Institut Curie, Centre de Recherche, Paris, France
| | - Antonin Morillon
- CNRS UMR3244, Sorbonne University, PSL University, Institut Curie, Centre de Recherche, Paris, France
| |
|
16
|
Mizero B, Villacrés C, Spicer V, Viner R, Saba J, Patel B, Snovida S, Jensen P, Huhmer A, Krokhin OV. Retention Time Prediction for TMT-Labeled Peptides in Proteomic LC-MS Experiments. J Proteome Res 2022; 21:1218-1228. [PMID: 35363494 DOI: 10.1021/acs.jproteome.1c00833] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Abstract
We present the first detailed study of chromatographic behavior of peptides labeled with tandem mass tags (TMT and TMTpro) in 2D LC for proteomic applications. Carefully designed experimental procedures have permitted generating data sets of over 100,000 nonlabeled and TMT-labeled peptide pairs for the low pH RP in the second separation dimension and data sets of over 10,000 peptide pairs for high-pH RP, HILIC (amide and silica), and SCX separations in the first separation dimension. The average increase in peptide RPLC (0.1% formic acid) retention upon TMT labeling was found to be 3.3% acetonitrile (linear water/acetonitrile gradients), spanning a range of -4 to 10.3%. In addition to the bulk peptide properties such as length, hydrophobicity, and the number of labeled residues, we found several sequence-dependent features mostly associated with differences in N-terminal chemistry. The behavior of TMTpro-labeled peptides was found to be very similar except for a slightly higher hydrophobicity: an average retention shift of 3.7% acetonitrile. The respective versions of the sequence-specific retention calculator (SSRCalc) model have been developed to accommodate both TMT chemistries, showing identical prediction accuracy (R2 ∼ 0.98) for labeled and nonlabeled peptides. Higher retention for TMT-labeled peptides was observed for high-pH RP and HILIC separations, while SCX selectivity remained virtually unchanged.
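The shifts in this abstract are expressed in % acetonitrile, so their size in minutes depends on the gradient slope. The following is a minimal Python sketch of that conversion, assuming an ideal linear water/acetonitrile gradient with no delay effects. The function name and the slope of 1% acetonitrile per minute are hypothetical illustrations and are not taken from the study.

```python
def shift_in_minutes(shift_pct_acn: float, slope_pct_per_min: float) -> float:
    """Convert a retention shift in % acetonitrile to minutes on a linear gradient."""
    if slope_pct_per_min <= 0:
        raise ValueError("gradient slope must be positive")
    return shift_pct_acn / slope_pct_per_min


# Average shifts reported in the abstract (% acetonitrile).
TMT_SHIFT = 3.3
TMTPRO_SHIFT = 3.7

# Hypothetical gradient slope (assumption): 1% acetonitrile per minute.
slope = 1.0
tmt_minutes = shift_in_minutes(TMT_SHIFT, slope)
tmtpro_minutes = shift_in_minutes(TMTPRO_SHIFT, slope)
print(f"TMT: {tmt_minutes:.1f} min, TMTpro: {tmtpro_minutes:.1f} min")
```

A shallower gradient stretches the same % acetonitrile shift over more minutes, which is one reason a shift reported in % acetonitrile is easier to compare across methods than a shift in minutes.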
Affiliation(s)
- Benilde Mizero
- Department of Chemistry, University of Manitoba, Winnipeg R3T 2N2, Canada
| | - Carina Villacrés
- Manitoba Centre for Proteomics and Systems Biology, Winnipeg R3E 3P4, Canada
| | - Victor Spicer
- Manitoba Centre for Proteomics and Systems Biology, Winnipeg R3E 3P4, Canada
| | - Rosa Viner
- Thermo Fisher Scientific, San Jose, California 95134, United States
| | - Julian Saba
- Thermo Fisher Scientific, San Jose, California 95134, United States
| | | | - Sergei Snovida
- Thermo Fisher Scientific, Rockford, Illinois 61101, United States
| | - Penny Jensen
- Thermo Fisher Scientific, Rockford, Illinois 61101, United States
| | - Andreas Huhmer
- Thermo Fisher Scientific, San Jose, California 95134, United States
| | - Oleg V Krokhin
- Department of Chemistry, University of Manitoba, Winnipeg R3T 2N2, Canada.,Manitoba Centre for Proteomics and Systems Biology, Winnipeg R3E 3P4, Canada.,Department of Internal Medicine, University of Manitoba, Winnipeg R3E 3P4, Canada
| |
|
17
|
Borkar MR, Coutinho E. Amalgamation of comparative protein modeling with quantitative structure-retention relationship for prediction of the chromatographic behavior of peptides. J Chromatogr A 2022; 1669:462967. [DOI: 10.1016/j.chroma.2022.462967] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/14/2021] [Revised: 03/09/2022] [Accepted: 03/11/2022] [Indexed: 10/18/2022]
|
18
|
Enmark M, Häggström J, Samuelsson J, Fornstedt T. Building machine-learning-based model for retention time and resolution predictions in ion pair chromatography of oligonucleotides. J Chromatogr A 2022; 1671:462999. [DOI: 10.1016/j.chroma.2022.462999] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/08/2021] [Revised: 03/22/2022] [Accepted: 03/25/2022] [Indexed: 01/29/2023]
|
19
|
Polunin KE, Fedotkina OS, Polunina IA, Buryak AK. Influence of the Hydrophobicity of Galleria Mellonella Antibacterial Peptides on the Parameters of Their Chromatographic Retention. Colloid Journal 2022. [DOI: 10.1134/s1061933x21060089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
|
20
|
Yang Y, Lin L, Qiao L. Deep learning approaches for data-independent acquisition proteomics. Expert Rev Proteomics 2021; 18:1031-1043. [PMID: 34918987 DOI: 10.1080/14789450.2021.2020654] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/30/2023]
Abstract
INTRODUCTION: Data-independent acquisition (DIA) is an emerging technology for large-scale proteomic studies. DIA data analysis methods are evolving rapidly, and deep learning has cut a conspicuous figure in this field. AREAS COVERED: This review discusses and provides an overview of the deep learning methods that are used for DIA data analysis, including spectral library prediction, feature scoring, and statistical control in peptide-centric analysis, as well as de novo peptide sequencing. Literature searches were performed for articles, including preprints, up to December 2021 from PubMed, Scopus, and Web of Science databases. EXPERT OPINION: While spectral library prediction has broken through the limitation on proteome coverage of experimental libraries, the statistical burden due to the large query space is the remaining challenge of utilizing proteome-wide predicted libraries. Analysis of post-translational modifications is another promising direction of deep learning-based DIA methods.
Affiliation(s)
- Yi Yang
- Department of Chemistry, Shanghai Stomatological Hospital, and Minhang Hospital, Fudan University, Shanghai, China
| | - Ling Lin
- Department of Chemistry, Shanghai Stomatological Hospital, and Minhang Hospital, Fudan University, Shanghai, China
| | - Liang Qiao
- Department of Chemistry, Shanghai Stomatological Hospital, and Minhang Hospital, Fudan University, Shanghai, China
| |
|
21
|
Mizero B, Yeung D, Spicer V, Krokhin OV. Peptide retention time prediction for peptides with post-translational modifications: N-terminal (α-amine) and lysine (ε-amine) acetylation. J Chromatogr A 2021; 1657:462584. [PMID: 34619563 DOI: 10.1016/j.chroma.2021.462584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2021] [Revised: 09/23/2021] [Accepted: 09/24/2021] [Indexed: 10/20/2022]
Abstract
Development of a peptide retention prediction model in reversed-phase chromatography is reported for acetylated peptides - both N-terminal (α-) and side chain of Lys (ε-amine) residues. Large-scale proteomic 2D LC-MS analyses of acetylated/non-acetylated tryptic digest of whole human cell lysate have been used to assemble representative retention data sets of 25,000+ modified/non-modified pairs. This allowed elucidating chromatographic behaviour of modified peptides in three different separation modes: high pH reversed-phase, HILIC separation on amide phase (first dimension of 2D) and reversed-phase separation with formic acid as ion-pairing modifier in the second dimension. On average, N-terminal acetylation increases peptide RP retention at acidic pH by 5 Hydrophobicity Index units (% acetonitrile). Acetylation of first lysine adds another 4.1%. The magnitude of the retention shift varies greatly depending on the number of modified amines, peptide length, and N-terminal peptide sequence. Large retention shifts have been observed for peptides with hydrophobic N-termini and specifically peptides carrying sequences characteristic for amphipathic helical structures - all in complete agreement with major sequence-specific features of RP retention mechanism. The utility of the modified Sequence Specific Retention Calculator model has been verified for the in-vivo N-terminally acetylated peptides detected by 2D LC-MS/MS analysis of a yeast tryptic digest. The effect of N-terminal acetylation was also evaluated for six different HILIC columns, strong cation- and strong anion exchange separations using previously acquired 2D LC-MS/MS data.
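The two average values quoted in this abstract can be combined into a rough, first-order estimate of the retention shift for an N-terminally acetylated peptide. The Python sketch below does only that. It is not the Sequence Specific Retention Calculator model, the function name is hypothetical, and it deliberately ignores the dependence on peptide length and N-terminal sequence that the abstract describes as large.

```python
# Average shifts reported in the abstract, in Hydrophobicity Index units (% acetonitrile).
N_TERMINAL_SHIFT = 5.0
FIRST_LYSINE_SHIFT = 4.1


def average_acetylation_shift(first_lysine_acetylated: bool) -> float:
    """Average retention shift for an N-terminally acetylated peptide.

    Assumption: the first-lysine contribution simply adds to the
    N-terminal contribution, as the word "another" in the abstract suggests.
    """
    shift = N_TERMINAL_SHIFT
    if first_lysine_acetylated:
        shift += FIRST_LYSINE_SHIFT
    return shift


for lysine_flag in (False, True):
    print(lysine_flag, round(average_acetylation_shift(lysine_flag), 1))
```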
Affiliation(s)
- Benilde Mizero
- Department of Chemistry, University of Manitoba, 360 Parker Building, 144 Dysart Road, Winnipeg, R3T 2N2, Canada
| | - Darien Yeung
- Department of Biochemistry and Medical Genetics, University of Manitoba, 336 BMSB, 745 Bannatyne Avenue, Winnipeg, R3E 0J9, Canada
| | - Vic Spicer
- Manitoba Centre for Proteomics and Systems Biology, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada
| | - Oleg V Krokhin
- Department of Chemistry, University of Manitoba, 360 Parker Building, 144 Dysart Road, Winnipeg, R3T 2N2, Canada; Department of Biochemistry and Medical Genetics, University of Manitoba, 336 BMSB, 745 Bannatyne Avenue, Winnipeg, R3E 0J9, Canada; Manitoba Centre for Proteomics and Systems Biology, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada; Department of Internal Medicine, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada.
| |
|
22
|
Polunin KE, Fedotkina OS, Polunina IA, Buryak AK. Effect of 1,1-Dimethylhydrazine on the Induction of Peptides in Galleria Mellonella Hemolymph. Colloid Journal 2021. [DOI: 10.1134/s1061933x21050124] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/29/2023]
|
23
|
Lee S, Ju S, Kim SJ, Choi JO, Kim K, Kim D, Jeon ES, Lee C. tipNrich: A Tip-Based N-Terminal Proteome Enrichment Method. Anal Chem 2021; 93:14088-14098. [PMID: 34615347 DOI: 10.1021/acs.analchem.1c01722] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
The mass spectrometry-based analysis of protein post-translational modifications requires large amounts of sample, complicating the analysis of samples with limited amounts of proteins such as clinical biopsies. Here, we present a tip-based N-terminal analysis method, tipNrich. The entire procedure is carried out in a single pipette tip to minimize sample loss and is optimized to analyze small amounts of protein, down to the femtomole scale for a single protein. With tipNrich, we investigated various single proteins purified from different organisms using a low-resolution mass spectrometer and identified several N-terminal peptides with different Nt-modifications such as ragged N-termini. Furthermore, we applied matrix-assisted laser desorption ionization time-of-flight mass spectrometry to our method to shorten the analysis time. Moreover, we showed that our method could be utilized in disease diagnosis, as exemplified by the characterization of wild-type transthyretin amyloidosis patients compared to healthy individuals based on N-terminome profiling. In summary, tipNrich will satisfy the need to identify N-terminal peptides even from highly scarce amounts of protein and to shorten processing time when checking the quality of protein products or characterizing N-terminal proteoform-related diseases.
Affiliation(s)
- Seonjeong Lee
- Center for Theragnosis, Korea Institute of Science and Technology, Seoul 02792, Republic of Korea.,Division of Bio-Medical Science and Technology, KIST School, Korea University of Science and Technology, Seoul 02792, Republic of Korea
| | - Shinyeong Ju
- Center for Theragnosis, Korea Institute of Science and Technology, Seoul 02792, Republic of Korea
| | - Seok Jin Kim
- Department of Health Sciences and Technology, Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University, Seoul 02792, Korea.,Division of Hematology-Oncology, Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 02792, Korea
| | - Jin-Oh Choi
- Division of Cardiology, Department of Medicine, Heart Vascular Stroke Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 02792, Korea
| | - Kihyun Kim
- Division of Hematology-Oncology, Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 02792, Korea
| | - Darae Kim
- Division of Cardiology, Department of Medicine, Heart Vascular Stroke Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 02792, Korea
| | - Eun-Seok Jeon
- Division of Cardiology, Department of Medicine, Heart Vascular Stroke Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 02792, Korea
| | - Cheolju Lee
- Center for Theragnosis, Korea Institute of Science and Technology, Seoul 02792, Republic of Korea.,Division of Bio-Medical Science and Technology, KIST School, Korea University of Science and Technology, Seoul 02792, Republic of Korea
| |
|
24
|
van Bentum M, Selbach M. An Introduction to Advanced Targeted Acquisition Methods. Mol Cell Proteomics 2021; 20:100165. [PMID: 34673283 PMCID: PMC8600983 DOI: 10.1016/j.mcpro.2021.100165] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 04/03/2021] [Revised: 10/11/2021] [Accepted: 10/13/2021] [Indexed: 01/13/2023] Open
Abstract
Targeted proteomics via selected reaction monitoring (SRM) or parallel reaction monitoring (PRM) enables fast and sensitive detection of a preselected set of target peptides. However, the number of peptides that can be monitored in conventional targeting methods is usually rather small. Recently, a series of methods has been described that employ intelligent acquisition strategies to increase the efficiency of mass spectrometers to detect target peptides. These methods are based on one of two strategies. First, retention time adjustment-based methods enable intelligent scheduling of target peptide retention times. These include Picky, iRT, as well as spike-in free real-time adjustment methods such as MaxQuant.Live. Second, in spike-in triggered acquisition methods such as SureQuant, Pseudo-PRM, TOMAHAQ, and Scout-MRM, targeted scans are initiated by abundant labeled synthetic peptides added to samples before the run. Both strategies enable the mass spectrometer to better focus data acquisition time on target peptides. This either enables more sensitive detection or a higher number of targets per run. Here, we provide an overview of available advanced targeting methods and highlight their intrinsic strengths and weaknesses and compatibility with specific experimental setups. Our goal is to provide a basic introduction to advanced targeting methods for people starting to work in this field. Advanced acquisition methods improve focus of mass spectrometers on target peptides. This review discusses existing methods based on two strategies. Retention time adjustment-based methods enable intelligent scheduling of peptide RTs. In spike-in triggered acquisition methods targeted scans are initiated by spike-ins.
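The idea behind retention time adjustment can be illustrated with a small calibration example: observed retention times of reference peptides are regressed against their index values, and the fitted line is used to place a scheduling window for a target. The Python sketch below is only an illustration under stated assumptions. The reference values, the 2 min half-width, and the function names are hypothetical, a simple linear relation is assumed, and none of it reproduces the behaviour of the tools named in the abstract.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical reference peptides: index value and observed retention time (min).
reference_index = [0.0, 25.0, 50.0, 75.0, 100.0]
observed_rt = [5.0, 15.0, 25.0, 35.0, 45.0]
slope, intercept = fit_line(reference_index, observed_rt)


def scheduled_window(index_value, half_width_min=2.0):
    """Return (start, end) of the acquisition window around the predicted RT."""
    center = slope * index_value + intercept
    return center - half_width_min, center + half_width_min


window = scheduled_window(60.0)
print(f"Monitor target from {window[0]:.1f} to {window[1]:.1f} min")
```

Refitting the line on each instrument or run moves every window together, so narrower windows, and therefore more targets per run, become practical.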
Affiliation(s)
- Mirjam van Bentum
- Proteome Dynamics, Max Delbrück Center for Molecular Medicine, Berlin, Germany; Charité-Universitätsmedizin Berlin, Berlin, Germany
| | - Matthias Selbach
- Proteome Dynamics, Max Delbrück Center for Molecular Medicine, Berlin, Germany; Charité-Universitätsmedizin Berlin, Berlin, Germany.
| |
|
25
|
Watts E, Potts GK, Ready DB, George Thompson AM, Lee J, Escobar EE, Patterson MJ, Brodbelt JS. Characterization of HLA-A*02:01 MHC Immunopeptide Antigens Enhanced by Ultraviolet Photodissociation Mass Spectrometry. Anal Chem 2021; 93:13134-13142. [PMID: 34553926 DOI: 10.1021/acs.analchem.1c01002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Abstract
Identifying major histocompatibility complex (MHC) class I immunopeptide antigens represents a key step in the development of immune-based targeted therapeutics and vaccines. However, the complete characterization of these antigens by tandem mass spectrometry remains challenging due to their short sequence length, high degree of hydrophobicity, and/or lack of sufficiently basic amino acids. This study seeks to address the potential for 193 nm ultraviolet photodissociation (UVPD) to improve the analysis of MHC class I immunopeptides by offering enhanced characterization of these sequences in lower charge states and differentiation of prominent isomeric leucine and isoleucine residues in the HLA-A*02:01 motif. Although electron transfer dissociation-higher energy collisional dissociation (EThcD) offered some success in the differentiation of leucine and isoleucine, 193 nm UVPD was able to confirm the identity of nearly 60% of leucine and isoleucine residues in a synthetic peptide mixture. Furthermore, 193 nm UVPD led to significantly more peptide identifications and higher scoring metrics than EThcD for peptides obtained from immunoprecipitation of MHC class I immunopeptides from in vitro cell culture. Additionally, 193 nm UVPD represents a promising complementary technique to higher-energy collisional dissociation (HCD), in which 424 of the 2593 peptides identified by 193 nm UVPD were not identified by HCD in HLA-A*02:01-specific immunoprecipitation and 804 of the 3300 peptides identified by 193 nm UVPD were not identified by HCD for pan HLA-A, -B, and -C immunoprecipitation. These results highlight that 193 nm UVPD offers an option for the characterization of immunopeptides, including differentiation of leucine and isoleucine residues.
Affiliation(s)
- Eleanor Watts
  - Department of Chemistry, University of Texas at Austin, Austin 78712-1139, Texas, United States
- Gregory K Potts
  - AbbVie, Inc., North Chicago 60064-1802, Illinois, United States
- Damien B Ready
  - AbbVie, Inc., North Chicago 60064-1802, Illinois, United States
- Janice Lee
  - AbbVie, Inc., North Chicago 60064-1802, Illinois, United States
- Edwin E Escobar
  - Department of Chemistry, University of Texas at Austin, Austin 78712-1139, Texas, United States
- Jennifer S Brodbelt
  - Department of Chemistry, University of Texas at Austin, Austin 78712-1139, Texas, United States
26
Kulyyassov A, Fresnais M, Longuespée R. Targeted liquid chromatography-tandem mass spectrometry analysis of proteins: Basic principles, applications, and perspectives. Proteomics 2021; 21:e2100153. [PMID: 34591362 DOI: 10.1002/pmic.202100153] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/25/2021] [Revised: 09/08/2021] [Accepted: 09/24/2021] [Indexed: 12/25/2022]
Abstract
Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) is now the main analytical method for the identification and quantification of peptides and proteins in biological samples. In modern research, identification of biomarkers and their quantitative comparison between samples are becoming increasingly important for discovery, validation, and monitoring. Such data can be obtained following specific signals after fragmentation of peptides using multiple reaction monitoring (MRM) and parallel reaction monitoring (PRM) methods, with high specificity, accuracy, and reproducibility. In addition, these methods allow measurement of the amount of post-translationally modified forms and isoforms of proteins. This review article describes the basic principles of MRM assays, guidelines for sample preparation, recent advanced MRM-based strategies, applications and illustrative perspectives of MRM/PRM methods in clinical research and molecular biology.
Affiliation(s)
- Margaux Fresnais
  - Department of Clinical Pharmacology and Pharmacoepidemiology, Heidelberg University Hospital, Heidelberg, Germany
- Rémi Longuespée
  - Department of Clinical Pharmacology and Pharmacoepidemiology, Heidelberg University Hospital, Heidelberg, Germany
27
Nagler A, Kalaora S, Barbolin C, Gangaev A, Ketelaars SLC, Alon M, Pai J, Benedek G, Yahalom-Ronen Y, Erez N, Greenberg P, Yagel G, Peri A, Levin Y, Satpathy AT, Bar-Haim E, Paran N, Kvistborg P, Samuels Y. Identification of presented SARS-CoV-2 HLA class I and HLA class II peptides using HLA peptidomics. Cell Rep 2021; 35:109305. [PMID: 34166618 PMCID: PMC8185308 DOI: 10.1016/j.celrep.2021.109305] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 11/03/2020] [Revised: 01/17/2021] [Accepted: 06/03/2021] [Indexed: 02/07/2023] Open
Abstract
The human leukocyte antigen (HLA)-bound viral antigens serve as an immunological signature that can be selectively recognized by T cells. As viruses evolve by acquiring mutations, it is essential to identify a range of presented viral antigens. Using HLA peptidomics, we are able to identify severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-derived peptides presented by highly prevalent HLA class I (HLA-I) molecules by using infected cells as well as overexpression of SARS-CoV-2 genes. We find 26 HLA-I peptides and 36 HLA class II (HLA-II) peptides. Among the identified peptides, some are shared between different cells and some are derived from out-of-frame open reading frames (ORFs). Seven of these peptides were previously shown to be immunogenic, and we identify two additional immunoreactive peptides by using HLA multimer staining. These results may aid the development of the next generation of SARS-CoV-2 vaccines based on presented viral-specific antigens that span several of the viral genes.
Affiliation(s)
- Adi Nagler
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Shelly Kalaora
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Chaya Barbolin
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Anastasia Gangaev
  - Division of Molecular Oncology and Immunology, the Netherlands Cancer Institute, the Netherlands
- Steven L C Ketelaars
  - Division of Molecular Oncology and Immunology, the Netherlands Cancer Institute, the Netherlands
- Michal Alon
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Joy Pai
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Gil Benedek
  - Tissue Typing and Immunogenetics Unit, Hadassah Medical Organization and Faculty of Medicine, Hebrew University of Jerusalem, Israel
- Yfat Yahalom-Ronen
  - Department of Infectious Diseases, Israel Institute for Biological Research, Ness Ziona, Israel
- Noam Erez
  - Department of Infectious Diseases, Israel Institute for Biological Research, Ness Ziona, Israel
- Polina Greenberg
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Gal Yagel
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Aviyah Peri
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Yishai Levin
  - The de Botton Institute for Protein Profiling, The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot, Israel
- Ansuman T Satpathy
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Erez Bar-Haim
  - Department of Biochemistry and Molecular Genetics, Israel Institute for Biological Research, Ness Ziona, Israel
- Nir Paran
  - Department of Infectious Diseases, Israel Institute for Biological Research, Ness Ziona, Israel
- Pia Kvistborg
  - Tissue Typing and Immunogenetics Unit, Hadassah Medical Organization and Faculty of Medicine, Hebrew University of Jerusalem, Israel
- Yardena Samuels
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel.
28
Chang CH, Yeung D, Spicer V, Ogata K, Krokhin O, Ishihama Y. Sequence-Specific Model for Predicting Peptide Collision Cross Section Values in Proteomic Ion Mobility Spectrometry. J Proteome Res 2021; 20:3600-3610. [PMID: 34133192 DOI: 10.1021/acs.jproteome.1c00185] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/16/2022]
Abstract
The contribution of peptide amino acid sequence to collision cross section (CCS) values has been investigated using a dataset of ∼134 000 peptides of four different charge states (1+ to 4+). The migration data were acquired using a two-dimensional liquid chromatography (LC)/trapped ion mobility spectrometry/quadrupole/time-of-flight mass spectrometry (MS) analysis of HeLa cell digests created using seven different proteases and were converted to CCS values. Following the previously reported modeling approaches using intrinsic size parameters (ISP), we extended this methodology to encode the position of individual residues within a peptide sequence. A generalized prediction model was built by dividing the dataset into eight groups (four charges for both tryptic/nontryptic peptides). Position-dependent ISPs were independently optimized for the eight subsets of peptides, resulting in prediction accuracy of ∼0.981 for the entire population of peptides. We find that ion mobility is strongly affected by the peptide's ability to solvate the positively charged sites. Internal positioning of polar residues and proline leads to decreased CCS values as they improve charge solvation; conversely, this ability decreases with increasing peptide charge due to electrostatic repulsion. Furthermore, higher helical propensity and peptide hydrophobicity result in a preferential formation of extended structures with higher than predicted CCS values. Finally, acidic/basic residues exhibit position-dependent ISP behavior consistent with electrostatic interaction with the peptide macrodipole, which affects the peptide helicity. The MS raw data files have been deposited with the ProteomeXchange Consortium via the jPOST partner repository (http://jpostdb.org) with the dataset identifiers PXD021440/JPST000959, PXD022800/JPST001017, and PXD026087/JPST001176.
Affiliation(s)
- Chih-Hsiang Chang
  - Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
- Darien Yeung
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba R3E 0J9, Canada
  - Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, Winnipeg, Manitoba R3E 3P4, Canada
  - Department of Internal Medicine, University of Manitoba, Winnipeg, Manitoba R3E 3P4, Canada
- Victor Spicer
  - Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, Winnipeg, Manitoba R3E 3P4, Canada
- Kosuke Ogata
  - Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
- Oleg Krokhin
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba R3E 0J9, Canada
  - Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, Winnipeg, Manitoba R3E 3P4, Canada
  - Department of Internal Medicine, University of Manitoba, Winnipeg, Manitoba R3E 3P4, Canada
  - Department of Chemistry, University of Manitoba, 360 Parker Building, Winnipeg, Manitoba R3T 2N2, Canada
- Yasushi Ishihama
  - Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
  - Laboratory of Clinical and Analytical Chemistry, National Institute of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka 567-0085, Japan
29
Wen B, Zeng W, Liao Y, Shi Z, Savage SR, Jiang W, Zhang B. Deep Learning in Proteomics. Proteomics 2020; 20:e1900335. [PMID: 32939979 PMCID: PMC7757195 DOI: 10.1002/pmic.201900335] [Citation(s) in RCA: 64] [Impact Index Per Article: 16.0] [Received: 05/27/2020] [Revised: 09/14/2020] [Indexed: 12/17/2022]
Abstract
Proteomics, the study of all the proteins in biological systems, is becoming a data-rich science. Protein sequences and structures are comprehensively catalogued in online databases. With recent advancements in tandem mass spectrometry (MS) technology, protein expression and post-translational modifications (PTMs) can be studied in a variety of biological systems at the global scale. Sophisticated computational algorithms are needed to translate the vast amount of data into novel biological insights. Deep learning automatically extracts data representations at high levels of abstraction from data, and it thrives in data-rich scientific research domains. Here, a comprehensive overview of deep learning applications in proteomics, including retention time prediction, MS/MS spectrum prediction, de novo peptide sequencing, PTM prediction, major histocompatibility complex-peptide binding prediction, and protein structure prediction, is provided. Limitations and the future directions of deep learning in proteomics are also discussed. This review will provide readers an overview of deep learning and how it can be used to analyze proteomics data.
Affiliation(s)
- Bo Wen
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Wen‐Feng Zeng
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Chinese Academy of Sciences, Institute of Computing Technology, Beijing 100190, China
- Yuxing Liao
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Zhiao Shi
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Sara R. Savage
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Wen Jiang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Bing Zhang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
30
Lau E, Han Y, Williams DR, Thomas CT, Shrestha R, Wu JC, Lam MPY. Splice-Junction-Based Mapping of Alternative Isoforms in the Human Proteome. Cell Rep 2020; 29:3751-3765.e5. [PMID: 31825849 PMCID: PMC6961840 DOI: 10.1016/j.celrep.2019.11.026] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 04/18/2019] [Revised: 09/24/2019] [Accepted: 11/06/2019] [Indexed: 12/18/2022] Open
Abstract
The protein-level translational status and function of many alternative splicing events remain poorly understood. We use an RNA sequencing (RNA-seq)-guided proteomics method to identify protein alternative splicing isoforms in the human proteome by constructing tissue-specific protein databases that prioritize transcript splice junction pairs with high translational potential. Using the custom databases to reanalyze ~80 million mass spectra in public proteomics datasets, we identify more than 1,500 noncanonical protein isoforms across 12 human tissues, including ~400 sequences undocumented on TrEMBL and RefSeq databases. We apply the method to original quantitative mass spectrometry experiments and observe widespread isoform regulation during human induced pluripotent stem cell cardiomyocyte differentiation. On a proteome scale, alternative isoform regions overlap frequently with disordered sequences and post-translational modification sites, suggesting that alternative splicing may regulate protein function through modulating intrinsically disordered regions. The described approach may help elucidate functional consequences of alternative splicing and expand the scope of proteomics investigations in various systems. The translation and function of many alternative splicing events await confirmation at the protein level. Lau et al. use an integrated proteotranscriptomics approach to identify non-canonical and undocumented isoforms from 12 organs in the human proteome. Alternative isoforms interfere with functional sequence features and are differentially regulated during iPSC cardiomyocyte differentiation.
Affiliation(s)
- Edward Lau
  - Stanford Cardiovascular Institute, Department of Medicine, Stanford University, Palo Alto, CA, USA
- Yu Han
  - Consortium for Fibrosis Research and Translation, Anschutz Medical Campus, University of Colorado, Aurora, CO, USA; Departments of Medicine-Cardiology and Biochemistry and Molecular Genetics, Anschutz Medical Campus, University of Colorado, Aurora, CO, USA
- Damon R Williams
  - Stanford Cardiovascular Institute, Department of Medicine, Stanford University, Palo Alto, CA, USA
- Cody T Thomas
  - Departments of Medicine-Cardiology and Biochemistry and Molecular Genetics, Anschutz Medical Campus, University of Colorado, Aurora, CO, USA
- Rajani Shrestha
  - Stanford Cardiovascular Institute, Department of Medicine, Stanford University, Palo Alto, CA, USA
- Joseph C Wu
  - Stanford Cardiovascular Institute, Department of Medicine, Stanford University, Palo Alto, CA, USA; Department of Radiology, School of Medicine, Stanford University, Palo Alto, CA, USA
- Maggie P Y Lam
  - Consortium for Fibrosis Research and Translation, Anschutz Medical Campus, University of Colorado, Aurora, CO, USA; Departments of Medicine-Cardiology and Biochemistry and Molecular Genetics, Anschutz Medical Campus, University of Colorado, Aurora, CO, USA.
31
Porto DL, da Silva ARR, Oliveira ADS, Nogueira FHA, Pedrosa MDFF, Aragão CFS. Development and validation of a stability indicating HPLC-DAD method for the determination of the peptide stigmurin. Microchem J 2020. [DOI: 10.1016/j.microc.2020.104921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
32
Gussakovsky D, Anderson G, Spicer V, Krokhin OV. Peptide separation selectivity in proteomics LC-MS experiments: Comparison of formic and mixed formic/heptafluorobutyric acids ion-pairing modifiers. J Sep Sci 2020; 43:3830-3839. [PMID: 32818315 DOI: 10.1002/jssc.202000578] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
Separation selectivity and detection sensitivity of reversed-phase high-performance liquid chromatography with tandem mass spectrometry analyses were compared for formic (0.1%) and formic/heptafluorobutyric (0.1%/0.005%) acid based eluents using a proteomic data set of ∼12 000 paired peptides. The addition of a small amount of hydrophobic heptafluorobutyric acid ion-pairing modifier increased peptide retention by up to 10% acetonitrile depending on peptide charge, size, and hydrophobicity. Retention increase was greatest for peptides that were short, highly charged, and hydrophilic. There was an ∼3.75-fold reduction in MS signal observed across the whole population of peptides following the addition of heptafluorobutyric acid. This resulted in ∼36% and ∼21% reduction of detected proteins and unique peptides for the whole cell lysate digests, respectively. We also confirmed that the separation selectivity of the formic/heptafluorobutyric acid system was very similar to the commonly used conditions of 0.1% trifluoroacetic acid, and developed a new version of the Sequence-Specific Retention calculator model for the formic/heptafluorobutyric acid system showing the same ∼0.98 R² accuracy as the Sequence-Specific Retention calculator formic acid model. In silico simulation of peptide distribution in separation space showed that the addition of 0.005% heptafluorobutyric acid to the 0.1% formic acid system increased potential proteome coverage by ∼11% of detectable species (tryptic peptides ≥ four amino acids).
Affiliation(s)
- Daniel Gussakovsky
  - Department of Chemistry, University of Manitoba, Winnipeg, Manitoba, Canada
- Geoff Anderson
  - Department of Chemistry, University of Manitoba, Winnipeg, Manitoba, Canada
- Vic Spicer
  - Manitoba Centre for Proteomics and Systems Biology, Winnipeg, Manitoba, Canada
- Oleg V Krokhin
  - Manitoba Centre for Proteomics and Systems Biology, Winnipeg, Manitoba, Canada; Department of Internal Medicine, University of Manitoba, Winnipeg, Manitoba, Canada
33
34
Yeung D, Klaassen N, Mizero B, Spicer V, Krokhin OV. Peptide retention time prediction in hydrophilic interaction liquid chromatography: Zwitter-ionic sulfoalkylbetaine and phosphorylcholine stationary phases. J Chromatogr A 2020; 1619:460909. [PMID: 32007221 DOI: 10.1016/j.chroma.2020.460909] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/15/2019] [Revised: 01/17/2020] [Accepted: 01/21/2020] [Indexed: 01/01/2023]
Abstract
Peptide retention time prediction models have been developed for zwitter-ionic ZIC-HILIC and ZIC-cHILIC stationary phases (pH 4.5 eluents) using proteomics-derived retention datasets of ~30 thousand tryptic peptides each. Overall, hydrophilicity of these stationary phases was found to be similar to the previously studied Amide HILIC phase, but lower compared to bare silicas. Peptide retention is driven by interactions of all charged (hydrophilic) residues at pH 4.5 (Asp, Glu, Arg, Lys, His), but shows specificity according to the orientation of functional groups in the zwitter-ionic pair. Thus, ZIC-cHILIC exhibits an increased contribution of negatively charged Asp and Glu due to the distal positioning of positively charged quaternary amines on the stationary phase. These findings confirm that HILIC interactions are driven both by peptide distribution between the mobile phase and the water layer adsorbed on the stationary phase and by interactions specific to functional groups of the packing material. Sequence-Specific Retention Calculator HILIC models were optimized for these columns, showing R² values of 0.967-0.976 between experimental and predicted retention values. ZIC-HILIC separations represent a good choice as a first dimension in 2D LC-MS of peptide mixtures, with correlations between HILIC and RPLC retention values found at R² values of 0.197 (ZIC-HILIC) and 0.137 (ZIC-cHILIC), confirming good orthogonality.
Affiliation(s)
- Darien Yeung
  - Manitoba Centre for Proteomics and Systems Biology, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada; Department of Biochemistry and Medical Genetics, University of Manitoba, 336 BMSB, 745 Bannatyne Avenue, Winnipeg, R3E 0J9, Canada
- Nicole Klaassen
  - Manitoba Centre for Proteomics and Systems Biology, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada; Department of Chemistry, University of Manitoba, 360 Parker Building, 144 Dysart Road, Winnipeg, R3T 2N2, Canada
- Benilde Mizero
  - Manitoba Centre for Proteomics and Systems Biology, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada; Department of Chemistry, University of Manitoba, 360 Parker Building, 144 Dysart Road, Winnipeg, R3T 2N2, Canada
- Victor Spicer
  - Manitoba Centre for Proteomics and Systems Biology, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada
- Oleg V Krokhin
  - Manitoba Centre for Proteomics and Systems Biology, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada; Department of Biochemistry and Medical Genetics, University of Manitoba, 336 BMSB, 745 Bannatyne Avenue, Winnipeg, R3E 0J9, Canada; Department of Chemistry, University of Manitoba, 360 Parker Building, 144 Dysart Road, Winnipeg, R3T 2N2, Canada; Department of Internal Medicine, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada.
35
In silico spectral libraries by deep learning facilitate data-independent acquisition proteomics. Nat Commun 2020; 11:146. [PMID: 31919359 PMCID: PMC6952453 DOI: 10.1038/s41467-019-13866-z] [Citation(s) in RCA: 94] [Impact Index Per Article: 23.5] [Received: 06/16/2019] [Accepted: 12/04/2019] [Indexed: 11/12/2022] Open
Abstract
Data-independent acquisition (DIA) is an emerging technology for quantitative proteomic analysis of large cohorts of samples. However, sample-specific spectral libraries built by data-dependent acquisition (DDA) experiments are required prior to DIA analysis, which is time-consuming and limits the identification/quantification by DIA to the peptides identified by DDA. Herein, we propose DeepDIA, a deep learning-based approach to generate in silico spectral libraries for DIA analysis. We demonstrate that the quality of in silico libraries predicted by instrument-specific models using DeepDIA is comparable to that of experimental libraries, and outperforms libraries generated by global models. With peptide detectability prediction, in silico libraries can be built directly from protein sequence databases. We further illustrate that DeepDIA can break through the limitation of DDA on peptide/protein detection, and enhance DIA analysis on human serum samples compared to the state-of-the-art protocol using a DDA library. We expect this work to expand the toolbox for DIA proteomics. Data-independent acquisition (DIA) is an emerging technology in proteomics but it typically relies on spectral libraries built by data-dependent acquisition (DDA). Here, the authors use deep learning to generate in silico spectral libraries directly from protein sequences that enable more comprehensive DIA experiments than DDA-based libraries.
36
Iwaniak A, Minkiewicz P, Hrynkiewicz M, Bucholska J, Darewicz M. Hybrid Approach in the Analysis of Bovine Milk Protein Hydrolysates as a Source of Peptides Containing Di- and Tripeptide Bitterness Indicators. POL J FOOD NUTR SCI 2020. [DOI: 10.31883/pjfns/113532] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/27/2022] Open
37
Abstract
In bottom-up proteomics, proteins are typically identified by enzymatic digestion into peptides, tandem mass spectrometry and comparison of the tandem mass spectra with those predicted from a sequence database for peptides within measurement uncertainty from the experimentally obtained mass. Although now decreasingly common, isolated proteins or simple protein mixtures can also be identified by measuring only the masses of the peptides resulting from the enzymatic digest, without any further fragmentation. Separation methods such as liquid chromatography and electrophoresis are often used to fractionate complex protein or peptide mixtures prior to analysis by mass spectrometry. Although the primary reason for this is to avoid ion suppression and improve data quality, these separations are based on physical and chemical properties of the peptides or proteins and therefore also provide information about them. Depending on the separation method, this could be protein molecular weight (SDS-PAGE), isoelectric point (IEF), charge at a known pH (ion exchange chromatography), or hydrophobicity (reversed phase chromatography). These separations produce approximate measurements on properties that to some extent can be predicted from amino acid sequences. In the case of molecular weight of proteins without posttranslational modifications this is straightforward: simply add the molecular weights of the amino acid residues in the protein. For IEF, charge and hydrophobicity, the order of the amino acids, and folding state of the peptide or protein also matter, but it is nevertheless possible to predict the behavior of peptides and proteins in these separation methods to a degree which renders such predictions useful. 
This chapter reviews the topic of using data from separation methods for identification and validation in proteomics, with special emphasis on predicting retention times of tryptic peptides in reversed-phase chromatography under acidic conditions, as this is one of the most commonly used separation methods in bottom-up proteomics.
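The molecular weight case mentioned in this abstract, where residue masses are simply summed, can be sketched as follows. This is a minimal illustration for unmodified sequences built from the 20 standard amino acids. The function name is hypothetical, and the table holds commonly tabulated average residue masses that should be checked against a reference source before being relied on.

```python
# Average residue masses in Da (amino acid minus one water), as commonly
# tabulated; treat these values as an assumption to verify before use.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def molecular_weight(sequence):
    """Average molecular weight (Da) of an unmodified peptide or protein."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    try:
        residue_sum = sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq)
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None
    return residue_sum + WATER
```

Because the sum ignores residue order, two peptides with the same composition get the same value, which is why this property is easy to predict while isoelectric point, charge, and hydrophobicity need more elaborate models.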
38
Genereux JC. Mass spectrometric approaches for profiling protein folding and stability. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2019; 118:111-144. [PMID: 31928723 DOI: 10.1016/bs.apcsb.2019.09.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/10/2023]
Abstract
Protein stability reports on protein homeostasis, function, and binding interactions, such as to other proteins, metabolites and drugs. As such, there is a pressing need for technologies that can report on protein stability. The ideal technique could be applied to in vitro or in vivo systems, proteome-wide, independently of matrix, under native conditions, with residue-level resolution, and on protein at endogenous levels. Mass spectrometry has rapidly become a preferred technology for identifying and quantifying proteins. As such, it has been increasingly incorporated into methodologies for interrogating protein stability and folding. Although no single technology can satisfy all desired applications, several emerging approaches have shown outstanding success at providing biological insight into the stability of the proteome. This chapter outlines some of these recent emerging technologies.
Affiliation(s)
- Joseph C Genereux
  - Department of Chemistry, University of California, Riverside, CA, United States
39
Samuelsson J, Eiriksson FF, Åsberg D, Thorsteinsdóttir M, Fornstedt T. Determining gradient conditions for peptide purification in RPLC with machine-learning-based retention time predictions. J Chromatogr A 2019; 1598:92-100. [DOI: 10.1016/j.chroma.2019.03.043] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 02/02/2019] [Revised: 03/20/2019] [Accepted: 03/21/2019] [Indexed: 01/22/2023]
40
Chen AT, Franks A, Slavov N. DART-ID increases single-cell proteome coverage. PLoS Comput Biol 2019; 15:e1007082. [PMID: 31260443 PMCID: PMC6625733 DOI: 10.1371/journal.pcbi.1007082] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 08/29/2018] [Revised: 07/12/2019] [Accepted: 05/06/2019] [Indexed: 01/09/2023] Open
Abstract
Analysis by liquid chromatography and tandem mass spectrometry (LC-MS/MS) can identify and quantify thousands of proteins in microgram-level samples, such as those comprised of thousands of cells. This process, however, remains challenging for smaller samples, such as the proteomes of single mammalian cells, because reduced protein levels reduce the number of confidently sequenced peptides. To alleviate this reduction, we developed Data-driven Alignment of Retention Times for IDentification (DART-ID). DART-ID implements principled Bayesian frameworks for global retention time (RT) alignment and for incorporating RT estimates towards improved confidence estimates of peptide-spectrum-matches. When applied to bulk or to single-cell samples, DART-ID increased the number of data points by 30-50% at 1% FDR, and thus decreased missing data. Benchmarks indicate excellent quantification of peptides upgraded by DART-ID and support their utility for quantitative analysis, such as identifying cell types and cell-type specific proteins. The additional datapoints provided by DART-ID boost the statistical power and double the number of proteins identified as differentially abundant in monocytes and T-cells. DART-ID can be applied to diverse experimental designs and is freely available at http://dart-id.slavovlab.net.
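The general idea of folding retention time evidence into the confidence of a peptide-spectrum match can be sketched with Bayes' rule. The sketch below is a deliberately simplified illustration and not the DART-ID model: it assumes a normal density around the aligned retention time for correct matches and a uniform density over the run for incorrect ones. The function names and all parameter values are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def update_pep(pep, observed_rt, expected_rt, sigma, run_length):
    """Update a posterior error probability (PEP) with RT evidence.

    Assumed likelihoods (illustrative only):
      correct match   -> normal around expected_rt with std dev sigma
      incorrect match -> uniform over the run (0 to run_length minutes)
    """
    if not 0.0 <= pep <= 1.0:
        raise ValueError("pep must lie in [0, 1]")
    if sigma <= 0 or run_length <= 0:
        raise ValueError("sigma and run_length must be positive")
    like_correct = normal_pdf(observed_rt, expected_rt, sigma)
    like_incorrect = 1.0 / run_length
    numerator = pep * like_incorrect
    denominator = numerator + (1.0 - pep) * like_correct
    # Fall back to the prior if both terms vanish numerically.
    return numerator / denominator if denominator > 0 else pep


# A match observed at its expected RT becomes more credible;
# one observed 10 minutes away becomes less credible.
on_time = update_pep(0.5, observed_rt=30.0, expected_rt=30.0, sigma=0.5, run_length=60.0)
off_time = update_pep(0.5, observed_rt=40.0, expected_rt=30.0, sigma=0.5, run_length=60.0)
```

Under these assumptions, a low-scoring spectrum that elutes where its peptide is expected can be rescued, which is the mechanism by which retention time evidence can add data points at a fixed FDR.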
Affiliation(s)
- Albert Tian Chen
  - Department of Bioengineering, Northeastern University, Boston, Massachusetts, United States of America
  - Barnett Institute, Northeastern University, Boston, Massachusetts, United States of America
- Alexander Franks
  - Department of Statistics and Applied Probability, University of California Santa Barbara, California, United States of America
- Nikolai Slavov
  - Department of Bioengineering, Northeastern University, Boston, Massachusetts, United States of America
  - Barnett Institute, Northeastern University, Boston, Massachusetts, United States of America
  - Department of Biology, Northeastern University, Boston, Massachusetts, United States of America
|
41
|
Engineered peptide barcodes for in-depth analyses of binding protein libraries. Nat Methods 2019; 16:421-428. [PMID: 31011184 DOI: 10.1038/s41592-019-0389-8] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 05/04/2018] [Accepted: 03/08/2019] [Indexed: 01/02/2023]
Abstract
Binding protein generation typically relies on laborious screening cascades that process candidate molecules individually. We have developed NestLink, a binder selection and identification technology able to biophysically characterize thousands of library members at once without the need to handle individual clones at any stage of the process. NestLink uses genetically encoded barcoding peptides termed flycodes, which were designed for maximal detectability by mass spectrometry and support accurate deep sequencing. We demonstrate NestLink's capacity to overcome the current limitations of binder-generation methods in three applications. First, we show that hundreds of binder candidates can be simultaneously ranked according to kinetic parameters. Next, we demonstrate deep mining of a nanobody immune repertoire for membrane protein binders, carried out entirely in solution without target immobilization. Finally, we identify rare binders against an integral membrane protein directly in the cellular environment of a human pathogen. NestLink opens avenues for the selection of tailored binder characteristics directly in tissues or in living organisms.
|
42
|
Aalizadeh R, Nika MC, Thomaidis NS. Development and application of retention time prediction models in the suspect and non-target screening of emerging contaminants. Journal of Hazardous Materials 2019; 363:277-285. [PMID: 30312924 DOI: 10.1016/j.jhazmat.2018.09.047] [Citation(s) in RCA: 93] [Impact Index Per Article: 18.6] [Received: 08/02/2018] [Revised: 09/16/2018] [Accepted: 09/17/2018] [Indexed: 05/13/2023]
Abstract
Hydrophilic interaction liquid chromatography (HILIC) and reversed phase LC (RPLC) coupled to high resolution mass spectrometry (HRMS) are widely used for the identification of suspects and unknown compounds in the environment. For the identification of unknowns, apart from mass accuracy and isotopic fitting, retention time (tR) and MS/MS spectra evaluation are required. In this context, a novel comprehensive workflow was developed to study the tR behavior of large groups of emerging contaminants using Quantitative Structure-Retention Relationships (QSRR). 682 compounds were analyzed by HILIC-HRMS in positive electrospray ionization (ESI) mode. Moreover, an extensive dataset was built for RPLC-HRMS including 1830 and 308 compounds for positive and negative ESI, respectively. Support Vector Machines (SVM) were used to model the tR data. The applicability domains of the models were studied by Monte Carlo Sampling (MCS) methods. The MCS method was also used to calculate the acceptable error windows for the predicted tR from various LC conditions. This paper provides validated models for predicting tR in HILIC/RPLC-HRMS platforms to facilitate identification of new emerging contaminants by suspect and non-target HRMS screening; the models were applied to the identification of transformation products (TPs) of emerging contaminants and biocides in wastewater and sludge.
Affiliation(s)
- Reza Aalizadeh
  - Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zographou, 15771, Athens, Greece
- Maria-Christina Nika
  - Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zographou, 15771, Athens, Greece
- Nikolaos S Thomaidis
  - Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zographou, 15771, Athens, Greece
|
43
|
Perez JJ, Chen CY. Implementation of normalized retention time (iRT) for bottom-up proteomic analysis of the aminoglycoside phosphotransferase enzyme facilitating method distribution. Anal Bioanal Chem 2018; 411:4701-4708. [DOI: 10.1007/s00216-018-1377-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/06/2018] [Revised: 08/15/2018] [Accepted: 09/13/2018] [Indexed: 01/05/2023]
|
44
|
Frank Y, Hruz T, Tschager T, Venzin V. Improved de novo peptide sequencing using LC retention time information. Algorithms Mol Biol 2018; 13:14. [PMID: 30181767 PMCID: PMC6114869 DOI: 10.1186/s13015-018-0132-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2017] [Accepted: 08/20/2018] [Indexed: 12/03/2022] Open
Abstract
Background: Liquid chromatography combined with tandem mass spectrometry is an important tool in proteomics for peptide identification. Liquid chromatography temporally separates the peptides in a sample. The peptides that elute one after another are analyzed via tandem mass spectrometry by measuring the mass-to-charge ratio of a peptide and its fragments. De novo peptide sequencing is the problem of reconstructing the amino acid sequence of a peptide from this measurement data. Past de novo sequencing algorithms solely consider the mass spectrum of the fragments for reconstructing a sequence. Results: We propose to additionally exploit the information obtained from liquid chromatography. We study the problem of computing a sequence that is not only in accordance with the experimental mass spectrum, but also with the chromatographic retention time. We consider three models for predicting the retention time and develop algorithms for de novo sequencing for each model. Conclusions: Based on an evaluation for two prediction models on experimental data from synthesized peptides we conclude that the identification rates are improved by exploiting the chromatographic information. In our evaluation, we compare our algorithms using the retention time information with algorithms using the same scoring model, but not the retention time.
|
45
|
Lobas AA, Levitsky LI, Fichtenbaum A, Surin AK, Pridatchenko ML, Mitulovic G, Gorshkov AV, Gorshkov MV. Predictive Liquid Chromatography of Peptides Based on Hydrophilic Interactions for Mass Spectrometry-Based Proteomics. Journal of Analytical Chemistry 2018. [DOI: 10.1134/s1061934817140076] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
|
46
|
Badgett MJ, Boyes B, Orlando R. Peptide retention prediction using hydrophilic interaction liquid chromatography coupled to mass spectrometry. J Chromatogr A 2018; 1537:58-65. [PMID: 29338870 PMCID: PMC5805588 DOI: 10.1016/j.chroma.2017.12.055] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/10/2017] [Revised: 12/12/2017] [Accepted: 12/20/2017] [Indexed: 10/18/2022]
Abstract
A model that predicts retention for peptides using a HALO® penta-HILIC column and gradient elution was created. Coefficients for each amino acid were derived using linear regression analysis and these coefficients can be summed to predict the retention of peptides. This model has a high correlation between experimental and predicted retention times (0.946), which is on par with previous RP and HILIC models. External validation of the model was performed using a set of H. pylori samples on the same LC-MS system used to create the model, and the deviation from actual to predicted times was low. Apart from amino acid composition, length and location of amino acid residues on a peptide were examined and two site-specific corrections for hydrophobic residues at the N-terminus as well as hydrophobic residues one spot over from the N-terminus were created.
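The additive scheme described in this abstract (one coefficient per amino acid, summed over the peptide) can be sketched roughly as follows. This is only an illustration of the summation step, assuming an intercept term; the coefficient values, the intercept, and the function name `predict_retention` are hypothetical and are not taken from the paper, which derives its coefficients by linear regression and adds N-terminal corrections not shown here.

```python
from collections import Counter

# Hypothetical per-residue retention coefficients (minutes per residue).
# These values are invented for illustration and are NOT from the paper.
COEFFS = {"A": 0.5, "G": 0.25, "K": 2.0, "L": -1.0, "S": 0.75}
INTERCEPT = 5.0  # hypothetical constant offset of the linear model


def predict_retention(peptide: str) -> float:
    """Sum the coefficient of every residue in the peptide, plus the intercept."""
    counts = Counter(peptide)
    unknown = set(counts) - set(COEFFS)
    if unknown:
        raise ValueError(f"no coefficient for residues: {sorted(unknown)}")
    return INTERCEPT + sum(COEFFS[aa] * n for aa, n in counts.items())
```

Because the model depends only on residue counts, any two peptides with the same composition get the same prediction, which is why position-specific corrections such as those mentioned in the abstract are layered on top of the sum.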
Affiliation(s)
- Majors J Badgett
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602 United States
- Barry Boyes
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602 United States; Advanced Materials Technology, Wilmington, DE 19810 United States
- Ron Orlando
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602 United States
|
47
|
Spicer V, Krokhin OV. Peptide retention time prediction in hydrophilic interaction liquid chromatography. Comparison of separation selectivity between bare silica and bonded stationary phases. J Chromatogr A 2018; 1534:75-84. [DOI: 10.1016/j.chroma.2017.12.046] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 10/31/2017] [Revised: 12/14/2017] [Accepted: 12/16/2017] [Indexed: 01/01/2023]
|
48
|
Maboudi Afkham H, Qiu X, The M, Käll L. Uncertainty estimation of predictions of peptides' chromatographic retention times in shotgun proteomics. Bioinformatics 2017; 33:508-513. [PMID: 27797755 DOI: 10.1093/bioinformatics/btw619] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 07/01/2016] [Accepted: 09/20/2016] [Indexed: 12/17/2022] Open
Abstract
Motivation: Liquid chromatography is frequently used as a means to reduce the complexity of peptide mixtures in shotgun proteomics. For such systems, the time when a peptide is released from a chromatography column and registered in the mass spectrometer is referred to as the peptide's retention time. Using heuristics or machine learning techniques, previous studies have demonstrated that it is possible to predict the retention time of a peptide from its amino acid sequence. In this paper, we apply Gaussian Process Regression to the feature representation of a previously described predictor, Elude. Using this framework, we demonstrate that it is possible to estimate the uncertainty of the prediction made by the model. Here we show how this uncertainty relates to the actual error of the prediction. Results: In our experiments, we observe a strong correlation between the estimated uncertainty provided by Gaussian Process Regression and the actual prediction error. This relation provides us with new means for assessment of the predictions. We demonstrate how a subset of the peptides can be selected with lower prediction error compared to the whole set. We also demonstrate how such predicted standard deviations can be used for designing adaptive windowing strategies. Contact: lukas.kall@scilifelab.se. Availability and Implementation: Our software and the data used in our experiments are publicly available and can be downloaded from https://github.com/statisticalbiotechnology/GPTime.
Affiliation(s)
- Heydar Maboudi Afkham
  - Science for Life Laboratory, School of Biotechnology, KTH - Royal Institute of Technology, 17121 Solna, Sweden
- Xuanbin Qiu
  - Science for Life Laboratory, School of Biotechnology, KTH - Royal Institute of Technology, 17121 Solna, Sweden
- Matthew The
  - Science for Life Laboratory, School of Biotechnology, KTH - Royal Institute of Technology, 17121 Solna, Sweden
- Lukas Käll
  - Science for Life Laboratory, School of Biotechnology, KTH - Royal Institute of Technology, 17121 Solna, Sweden
|
49
|
Targeted Quantification of Isoforms of a Thylakoid-Bound Protein: MRM Method Development. Methods Mol Biol 2017; 1696:147-162. [PMID: 29086402 DOI: 10.1007/978-1-4939-7411-5_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 04/19/2023]
Abstract
Targeted mass spectrometric methods such as selected/multiple reaction monitoring (SRM/MRM) have found intense application in protein detection and quantification, where they compete with classical immunoaffinity techniques. They provide a universal procedure to develop a fast, highly specific, sensitive, accurate, and cheap methodology for targeted detection and quantification of proteins based on the direct analysis of their surrogate peptides, typically generated by tryptic digestion. This methodology can be advantageously applied in the field of plant proteomics, particularly for non-model species, since immunoreagents are scarcely available. Here, we describe the issues to take into consideration in order to develop an MRM method to detect and quantify isoforms of the thylakoid-bound protein polyphenol oxidase from the non-model and database-underrepresented species Eriobotrya japonica Lindl.
|
50
|
Gussakovsky D, Neustaeter H, Spicer V, Krokhin OV. Sequence-Specific Model for Peptide Retention Time Prediction in Strong Cation Exchange Chromatography. Anal Chem 2017; 89:11795-11802. [PMID: 28971681 DOI: 10.1021/acs.analchem.7b03436] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 01/05/2023]
Abstract
The development of a peptide retention prediction model for strong cation exchange (SCX) separation on a Polysulfoethyl A column is reported. Off-line 2D LC-MS/MS analysis (SCX-RPLC) of S. cerevisiae whole cell lysate was used to generate a retention dataset of ∼30 000 peptides, sufficient for identifying the major sequence-specific features of peptide retention mechanisms in SCX. In contrast to RPLC/hydrophilic interaction liquid chromatography (HILIC) separation modes, where retention is driven by hydrophobic/hydrophilic contributions of all individual residues, SCX interactions depend mainly on peptide charge (number of basic residues at acidic pH) and size. An additive model (incorporating the contributions of all 20 residues into the peptide retention) combined with a peptide length correction yields a prediction accuracy of R² = 0.976, significantly higher than the additive models for either HILIC or RPLC. Position-dependent effects on peptide retention for different residues were driven by the spatial orientation of tryptic peptides upon interaction with the negatively charged surface functional groups. The positively charged N-termini serve as a primary point of interaction. For example, basic residues (Arg, His, Lys) increase peptide retention when located closer to the N-terminus. We also found that hydrophobic interactions, which could lead to a mixed-mode separation mechanism, are largely suppressed at 20-30% of acetonitrile in the eluent. The accuracy of the final Sequence-Specific Retention Calculator (SSRCalc) SCX model (R² ≈ 0.99) exceeds all previously reported predictors for peptide LC separations. This also provides a solid platform for method development in 2D LC-MS protocols in proteomics and peptide retention prediction filtering of false positive identifications.
Affiliation(s)
- Daniel Gussakovsky
  - Department of Chemistry, University of Manitoba, 360 Parker Building, Winnipeg, Manitoba R3T 2N2, Canada
- Haley Neustaeter
  - Department of Chemistry, University of Manitoba, 360 Parker Building, Winnipeg, Manitoba R3T 2N2, Canada
- Victor Spicer
  - Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, Manitoba R3E 3P4, Canada
- Oleg V Krokhin
  - Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, Manitoba R3E 3P4, Canada; Department of Internal Medicine, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, Manitoba R3E 3P4, Canada
|