1
|
Almotairi S, Badr E, Abdelbaky I, Elhakeem M, Abdul Salam M. Hybrid transformer-CNN model for accurate prediction of peptide hemolytic potential. Sci Rep 2024; 14:14263. [PMID: 38902287 PMCID: PMC11190137 DOI: 10.1038/s41598-024-63446-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Accepted: 05/29/2024] [Indexed: 06/22/2024] Open
Abstract
Hemolysis is a crucial factor in various biomedical and pharmaceutical contexts, driving our interest in developing advanced computational techniques for precise prediction. Our proposed approach takes advantage of the unique capabilities of convolutional neural networks (CNNs) and transformers to detect complex patterns inherent in the data. The integration of CNN and transformers' attention mechanisms allows for the extraction of relevant information, leading to accurate predictions of hemolytic potential. The proposed method was trained on three distinct data sets of peptide sequences known as recurrent neural network-hemolytic (RNN-Hem), Hlppredfuse, and Combined. Our computational results demonstrated the superior efficacy of our models compared to existing methods. The proposed approach demonstrated impressive Matthews correlation coefficients of 0.5962, 0.9111, and 0.7788 respectively, indicating its effectiveness in predicting hemolytic activity. With its potential to guide experimental efforts in peptide design and drug development, this method holds great promise for practical applications. Integrating CNNs and transformers proves to be a powerful tool in the fields of bioinformatics and therapeutic research, highlighting their potential to drive advancement in this area.
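The Matthews correlation coefficients quoted in this abstract summarize a binary confusion matrix in a single value between -1 and 1. The sketch below shows the standard formula only; the function name and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Standard MCC from confusion-matrix counts (hypothetical helper)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Illustrative counts only (not data from the study).
perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)
chance = matthews_corrcoef(tp=25, tn=25, fp=25, fn=25)
```

A value near 1 indicates that hemolytic and non-hemolytic peptides are both classified correctly, while a value near 0 indicates performance no better than chance.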
Affiliation(s)
- Sultan Almotairi
- Department of Computer Science, Faculty of College of Computer and Information Sciences, Majmaah University, 11952, Majmaah, Saudi Arabia
- Department of Computer Science, Faculty of Computer and Information Systems, Islamic University of Madinah, 42351, Medinah, Saudi Arabia
| | - Elsayed Badr
- Scientific Computing Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt.
- The Egyptian School of Data Science (ESDS), Benha, Egypt.
| | - Ibrahim Abdelbaky
- Artificial Intelligence Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt
| | - Mohamed Elhakeem
- Artificial Intelligence Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt.
| | - Mustafa Abdul Salam
- Artificial Intelligence Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt
- Department of Computer Science, College of Arts and Science, Wadi Addawasir, Prince Sattam Bin Abdulaziz University, 16273, Al-Kharj, Saudi Arabia
| |
|
2
|
Guzman YA, Sakellari D, Papadimitriou K, Floudas CA. High-throughput proteomic analysis of candidate biomarker changes in gingival crevicular fluid after treatment of chronic periodontitis. J Periodontal Res 2018; 53:853-860. [PMID: 29900535 DOI: 10.1111/jre.12575] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Accepted: 04/20/2018] [Indexed: 12/27/2022]
Abstract
BACKGROUND AND OBJECTIVE Untargeted, high-throughput proteomics methodologies have great potential to aid in identifying biomarkers for the diagnosis of periodontal disease. The application of such methods to the discovery of candidate biomarkers for the resolution of periodontal inflammation after periodontal therapy has been investigated. MATERIAL AND METHODS Gingival crevicular fluid samples were collected from 10 patients diagnosed with chronic periodontitis at baseline and 1, 5, 9 and 13 weeks after completion of mechanical periodontal treatment. Clinical indices of periodontal disease, including probing depth, recession, clinical attachment level and bleeding on probing, were recorded at baseline and 13 weeks. Samples were analyzed using an online liquid chromatography-nanoelectrospray-hybrid ion trap-Orbitrap mass spectrometer. Spectra were processed with the PILOT_PROTEIN proteomics software suite. RESULTS Clinical parameters were significantly improved 13 weeks after treatment (Wilcoxon signed ranks test, P < .05). From the substantial number of identified proteins, a small subset was extracted by filter methods that included temporal pattern matching, logistic function fitting and mixed-integer linear optimization. This subset includes azurocidin, lysozyme C and myosin-9 as candidate biomarkers prominent at baseline and alpha-smooth muscle actin as prominent 13 weeks after treatment. Cross-validation studies yielded average predictive accuracy and area under the curve of 0.900 and 0.930, respectively. CONCLUSION High-throughput proteomic analysis can contribute to identifying endpoints of periodontal therapy. These candidate biomarkers should be evaluated for clinical efficacy.
Affiliation(s)
- Y A Guzman
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, USA; Texas A&M Energy Institute, Texas A&M University, College Station, USA; Department of Chemical and Biological Engineering, Princeton University, Princeton, USA
| | - D Sakellari
- Department of Preventive Dentistry, Periodontology and Implant Biology, Aristotle University of Thessaloniki, Thessaloniki, Greece
| | - K Papadimitriou
- Department of Preventive Dentistry, Periodontology and Implant Biology, Aristotle University of Thessaloniki, Thessaloniki, Greece
| | - C A Floudas
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, USA; Texas A&M Energy Institute, Texas A&M University, College Station, USA
| |
|
3
|
Giese SH, Zickmann F, Renard BY. Detection of Unknown Amino Acid Substitutions Using Error-Tolerant Database Search. Methods Mol Biol 2016; 1362:247-264. [PMID: 26519182 DOI: 10.1007/978-1-4939-3106-4_16] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
Recent studies have demonstrated that mass spectrometry-based variant detection is feasible. Typically, either genomic variant databases or transcript data are used to construct customized target databases for the identification of single-amino acid variants in mass spectrometry data. However, both approaches require additional data to perform the identification of SAAVs. Here, we discuss the application of an error-tolerant peptide search engine such as BICEPS for identifying variants exclusively based on standard Uniprot databases. Thereby, unnecessary and redundant extensions of the search space are avoided. The workflow provides an unbiased view on the data; the search space is not limited to known variants and simultaneously does not require additional data. In a subsequent step a second identification search is performed to verify the initially identified variant peptides and aggregate information on the protein level.
Affiliation(s)
- Sven H Giese
- Research Group Bioinformatics (NG4), Robert Koch-Institute, Nordufer 20, 13353, Berlin, Germany
- Department of Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, 13355, Berlin, Germany
- Wellcome Trust Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JR, UK
| | - Franziska Zickmann
- Research Group Bioinformatics (NG4), Robert Koch-Institute, Nordufer 20, 13353, Berlin, Germany
| | - Bernhard Y Renard
- Research Group Bioinformatics (NG4), Robert Koch-Institute, Nordufer 20, 13353, Berlin, Germany.
| |
|
4
|
Guzman YA, Sakellari D, Arsenakis M, Floudas CA. Proteomics for the discovery of biomarkers and diagnosis of periodontitis: a critical review. Expert Rev Proteomics 2013; 11:31-41. [PMID: 24308552 DOI: 10.1586/14789450.2014.864953] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 01/13/2023]
Abstract
Periodontitis is a common chronic and destructive disease whose pathogenetic mechanisms remain unclear. Due to their sensitivity and global scale, proteomics studies offer the opportunity to uncover critical host and pathogen activity indicators and can elucidate clinically applicable biomarkers for improved diagnosis and treatment of the disease. This review summarizes the literature of proteomics studies on periodontitis and comprehensively discusses commonly found candidate biomarkers. Key considerations in the design of an experimental proteomics platform are also outlined. The applicability of protein biomarkers across the progression of periodontitis and unexplored areas of research are highlighted.
Affiliation(s)
- Yannis A Guzman
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ, USA
| | | | | | | |
|
5
|
Baliban RC, Sakellari D, Li Z, Guzman YA, Garcia BA, Floudas CA. Discovery of biomarker combinations that predict periodontal health or disease with high accuracy from GCF samples based on high-throughput proteomic analysis and mixed-integer linear optimization. J Clin Periodontol 2012. [PMID: 23190455 DOI: 10.1111/jcpe.12037] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 01/12/2023]
Abstract
AIM To identify optimal combination(s) of proteomic based biomarkers in gingival crevicular fluid (GCF) samples from chronic periodontitis (CP) and periodontally healthy individuals and validate the predictions through known and blind test sets. MATERIALS AND METHODS GCF samples were collected from 96 CP and periodontally healthy subjects and analysed using high-performance liquid chromatography, tandem mass spectrometry and the PILOT_PROTEIN algorithm. A mixed-integer linear optimization (MILP) model was then developed to identify the optimal combination of biomarkers which could clearly distinguish a blind subject sample as healthy or diseased. RESULTS A thorough cross-validation of the MILP model capability was performed on a training set of 55 samples and greater than 99% accuracy was consistently achieved when annotating the testing set samples as healthy or diseased. The model was then trained on all 55 samples and tested on two different blind test sets, and using an optimal combination of 7 human proteins and 3 bacterial proteins, the model was able to correctly predict 40 out of 41 healthy and diseased samples. CONCLUSIONS The proposed large-scale proteomic analysis and MILP model led to the identification of novel combinations of biomarkers for consistent diagnosis of periodontal status with greater than 95% predictive accuracy.
Affiliation(s)
- Richard C Baliban
- Department of Chemical and Biological Engineering, Princeton University, Princeton, USA
| | | | | | | | | | | |
|
6
|
Baliban RC, Dimaggio PA, Plazas-Mayorca MD, Garcia BA, Floudas CA. PILOT_PROTEIN: identification of unmodified and modified proteins via high-resolution mass spectrometry and mixed-integer linear optimization. J Proteome Res 2012; 11:4615-29. [PMID: 22788846 DOI: 10.1021/pr300418j] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 01/06/2023]
Abstract
A novel protein identification framework, PILOT_PROTEIN, has been developed to construct a comprehensive list of all unmodified proteins that are present in a living sample. It uses the peptide identification results from the PILOT_SEQUEL algorithm to initially determine all unmodified proteins within the sample. Using a rigorous biclustering approach that groups incorrect peptide sequences with other homologous sequences, the number of false positives reported is minimized. A sequence tag procedure is then incorporated along with the untargeted PTM identification algorithm, PILOT_PTM, to determine a list of all modification types and sites for each protein. The unmodified protein identification algorithm, PILOT_PROTEIN, is compared to the methods SEQUEST, InsPecT, X!Tandem, VEMS, and ProteinProspector using both prepared protein samples and a more complex chromatin digest. The algorithm demonstrates superior protein identification accuracy with a lower false positive rate. All materials are freely available to the scientific community at http://pumpd.princeton.edu.
Affiliation(s)
- Richard C Baliban
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, USA
| | | | | | | | | |
|
7
|
Renard BY, Xu B, Kirchner M, Zickmann F, Winter D, Korten S, Brattig NW, Tzur A, Hamprecht FA, Steen H. Overcoming species boundaries in peptide identification with Bayesian information criterion-driven error-tolerant peptide search (BICEPS). Mol Cell Proteomics 2012; 11:M111.014167. [PMID: 22493179 PMCID: PMC3394943 DOI: 10.1074/mcp.m111.014167] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 12/31/2022] Open
Abstract
Currently, the reliable identification of peptides and proteins is only feasible when thoroughly annotated sequence databases are available. Although sequencing capacities continue to grow, many organisms remain without reliable, fully annotated reference genomes required for proteomic analyses. Standard database search algorithms fail to identify peptides that are not exactly contained in a protein database. De novo searches are generally hindered by their restricted reliability, and current error-tolerant search strategies are limited by global, heuristic tradeoffs between database and spectral information. We propose a Bayesian information criterion-driven error-tolerant peptide search (BICEPS) and offer an open source implementation based on this statistical criterion to automatically balance the information of each single spectrum and the database, while limiting the run time. We show that BICEPS performs as well as current database search algorithms when such algorithms are applied to sequenced organisms, whereas BICEPS only uses a remotely related organism database. For instance, we use a chicken instead of a human database corresponding to an evolutionary distance of more than 300 million years (International Chicken Genome Sequencing Consortium (2004) Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432, 695–716). We demonstrate the successful application to cross-species proteomics with a 33% increase in the number of identified proteins for a filarial nematode sample of Litomosoides sigmodontis.
Affiliation(s)
- Bernhard Y Renard
- Research Group Bioinformatics (NG4), Robert Koch Institute, Berlin 13353, Germany.
| | | | | | | | | | | | | | | | | | | |
|
8
|
Baliban RC, Sakellari D, Li Z, DiMaggio PA, Garcia BA, Floudas CA. Novel protein identification methods for biomarker discovery via a proteomic analysis of periodontally healthy and diseased gingival crevicular fluid samples. J Clin Periodontol 2011; 39:203-12. [PMID: 22092770 DOI: 10.1111/j.1600-051x.2011.01805.x] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.4] [Accepted: 08/22/2011] [Indexed: 01/08/2023]
Abstract
AIM To identify possible novel biomarkers in gingival crevicular fluid (GCF) samples from chronic periodontitis (CP) and periodontally healthy individuals using high-throughput proteomic analysis. MATERIALS AND METHODS Gingival crevicular fluid samples were collected from 12 CP and 12 periodontally healthy subjects. Samples were digested with trypsin, eluted using high-performance liquid chromatography, and fragmented using tandem mass spectrometry (MS/MS). MS/MS spectra were analysed using PILOT_PROTEIN to identify all unmodified proteins within the samples. RESULTS Using the database derived from Homo sapiens taxonomy and all bacterial taxonomies, 432 human (120 new) and 30 bacterial proteins were identified. The human proteins, angiotensinogen, clusterin and thymidine phosphorylase were identified as biomarker candidates based on their high-scoring only in samples from periodontal health. Similarly, neutrophil defensin-1, carbonic anhydrase-1 and elongation factor-1 gamma were associated with CP. Candidate bacterial biomarkers include 33 kDa chaperonin, iron uptake protein A2 and phosphoenolpyruvate carboxylase (health-associated) and ribulose bisphosphate carboxylase, a probable succinyl-CoA:3-ketoacid-coenzyme A transferase, and DNA-directed RNA polymerase subunit beta (CP-associated). Most of these human and bacterial proteins have not been previously evaluated as biomarkers of periodontal conditions and require further investigation. CONCLUSIONS The proposed methods for large-scale comprehensive proteomic analysis may lead to the identification of novel biomarkers of periodontal health or disease.
Affiliation(s)
- Richard C Baliban
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
| | | | | | | | | | | |
|
9
|
Sun HC, Zhang JY, Liu H, Zhang W, Xu CM, Ma HB, Zhu YP, Xie HW. Algorithm Development of de novo Peptide Sequencing Via Tandem Mass Spectrometry. Prog Biochem Biophys 2011. [DOI: 10.3724/sp.j.1206.2010.00226] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022]
|
10
|
Balch WE, Yates JR. Application of mass spectrometry to study proteomics and interactomics in cystic fibrosis. Methods Mol Biol 2011; 742:227-247. [PMID: 21547736 DOI: 10.1007/978-1-61779-120-8_14] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 05/30/2023]
Abstract
The cystic fibrosis transmembrane conductance regulator (CFTR) does not function in isolation, but rather in a complex network of protein-protein interactions that dictate the physiology of a healthy cell and tissue and, when defective, the pathophysiology characteristic of cystic fibrosis (CF) disease. To begin to address the organization and operation of the extensive cystic fibrosis protein network dictated by simultaneous and sequential interactions, it will be necessary to understand the global protein environment (the proteome) in which CFTR functions in the cell and the local network that dictates CFTR folding, trafficking, and function at the cell surface. Emerging mass spectrometry (MS) technologies and methodologies offer an unprecedented opportunity to fully characterize both the proteome and the protein interactions directing normal CFTR function and to define what goes wrong in disease. Below we provide the CF investigator with a general introduction to the capabilities of modern mass spectrometry technologies and methodologies with the goal of inspiring further application of these technologies for development of a basic understanding of the disease and for the identification of novel pathways that may be amenable to therapeutic intervention in the clinic.
Affiliation(s)
- William E Balch
- Department of Cell Biology, The Scripps Research Institute, La Jolla, CA 92037, USA.
| | | |
|
11
|
Baliban RC, DiMaggio PA, Plazas-Mayorca MD, Young NL, Garcia BA, Floudas CA. A novel approach for untargeted post-translational modification identification using integer linear optimization and tandem mass spectrometry. Mol Cell Proteomics 2010; 9:764-79. [PMID: 20103568 DOI: 10.1074/mcp.m900487-mcp200] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Indexed: 12/27/2022] Open
Abstract
A novel algorithm, PILOT_PTM, has been developed for the untargeted identification of post-translational modifications (PTMs) on a template sequence. The algorithm consists of an analysis of an MS/MS spectrum via an integer linear optimization model to output a rank-ordered list of PTMs that best match the experimental data. Each MS/MS spectrum is analyzed by a preprocessing algorithm to reduce spectral noise and label potential complementary, offset, isotope, and multiply charged peaks. Postprocessing of the rank-ordered list from the integer linear optimization model will resolve fragment mass errors and will reorder the list of PTMs based on the cross-correlation between the experimental and theoretical MS/MS spectrum. PILOT_PTM is instrument-independent, capable of handling multiple fragmentation technologies, and can address the universe of PTMs for every amino acid on the template sequence. The various features of PILOT_PTM are presented, and it is tested on several modified and unmodified data sets including chemically synthesized phosphopeptides, histone H3-(1-50) polypeptides, histone H3-(1-50) tryptic fragments, and peptides generated from proteins extracted from chromatin-enriched fractions. The data sets consist of spectra derived from fragmentation via collision-induced dissociation, electron transfer dissociation, and electron capture dissociation. The capability of PILOT_PTM is then benchmarked using five state-of-the-art methods, InsPecT, Virtual Expert Mass Spectrometrist (VEMS), Mod(i), Mascot, and X!Tandem. PILOT_PTM demonstrates superior accuracy on both the small and large scale proteome experiments. A protocol is finally developed for the analysis of a complete LC-MS/MS scan using template sequences generated from SEQUEST and is demonstrated on over 270,000 MS/MS spectra collected from a total chromatin digest.
Affiliation(s)
- Richard C Baliban
- Department of Chemical Engineering, Princeton University, Princeton, New Jersey 08544, USA
| | | | | | | | | | | |
|
12
|
DiMaggio PA, Young NL, Baliban RC, Garcia BA, Floudas CA. A mixed integer linear optimization framework for the identification and quantification of targeted post-translational modifications of highly modified proteins using multiplexed electron transfer dissociation tandem mass spectrometry. Mol Cell Proteomics 2009; 8:2527-43. [PMID: 19666874 DOI: 10.1074/mcp.m900144-mcp200] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.6] [Indexed: 11/06/2022] Open
Abstract
Here we present a novel methodology for the identification of the targeted post-translational modifications present in highly modified proteins using mixed integer linear optimization and electron transfer dissociation (ETD) tandem mass spectrometry. For a given ETD tandem mass spectrum, the rigorous set of modified forms that satisfy the mass of the precursor ion, within some tolerance error, are enumerated by solving a feasibility problem via mixed integer linear optimization. The enumeration of the entire superset of modified forms enables the method to normalize the relative contributions of the individual modification sites. Given the entire set of modified forms, a superposition problem is then formulated using mixed integer linear optimization to determine the relative fractions of the modified forms that are present in the multiplexed ETD tandem mass spectrum. Chromatographic information in the mass and time dimension is utilized to assess the likelihood of the assigned modification states, to average several tandem mass spectra for confident identification of lower level forms, and to infer modification states of partially assigned spectra. The utility of the proposed computational framework is demonstrated on an entire LC-MS/MS ETD experiment corresponding to a mixture of highly modified histone peptides. This new computational method will facilitate the unprecedented LC-MS/MS ETD analysis of many hypermodified proteins and offer novel biological insight into these previously understudied systems.
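The enumeration step described in this abstract, finding every modified form whose mass matches the precursor ion within a tolerance, can be illustrated with a small sketch. The authors solve it as a mixed integer linear feasibility problem; the code below substitutes plain brute-force enumeration, which is only practical for a handful of sites. The function name, the modification labels, and the example masses are assumptions for illustration, with approximate monoisotopic mass shifts.

```python
from itertools import product

# Approximate monoisotopic mass shifts in Da (illustrative subset, assumed here).
MOD_DELTAS = {
    "none": 0.0,
    "me1": 14.01565,  # monomethylation
    "me2": 28.03130,  # dimethylation
    "me3": 42.04695,  # trimethylation
    "ac": 42.01057,   # acetylation
}


def enumerate_modified_forms(base_mass, n_sites, observed_mass, tol):
    """Brute-force stand-in for the MILP feasibility step (hypothetical helper).

    Returns every per-site assignment of modifications whose total mass
    lies within `tol` Da of the observed precursor mass.
    """
    forms = []
    for combo in product(MOD_DELTAS, repeat=n_sites):
        mass = base_mass + sum(MOD_DELTAS[m] for m in combo)
        if abs(mass - observed_mass) <= tol:
            forms.append(combo)
    return forms


# Made-up peptide: unmodified mass 1000.0 Da, two modifiable sites.
forms = enumerate_modified_forms(1000.0, 2, 1056.02622, tol=0.01)
```

With a tight tolerance only the acetyl plus methyl assignments survive; loosening the tolerance to 0.05 Da also admits the near-isobaric trimethyl plus methyl and dimethyl plus dimethyl forms, which is why the feasible set must be enumerated in full before relative fractions can be assigned.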
Affiliation(s)
- Peter A DiMaggio
- Department of Chemical Engineering, Princeton University, Princeton, New Jersey 08544-5263, USA
| | | | | | | | | |
|
13
|
Yates JR, Ruse CI, Nakorchevsky A. Proteomics by Mass Spectrometry: Approaches, Advances, and Applications. Annu Rev Biomed Eng 2009; 11:49-79. [DOI: 10.1146/annurev-bioeng-061008-124934] [Citation(s) in RCA: 798] [Impact Index Per Article: 53.2] [Indexed: 11/09/2022]
Affiliation(s)
- John R. Yates
- Department of Chemical Physiology and Cell Biology, The Scripps Research Institute, La Jolla, California 92037
| | - Cristian I. Ruse
- Department of Chemical Physiology and Cell Biology, The Scripps Research Institute, La Jolla, California 92037
| | - Aleksey Nakorchevsky
- Department of Chemical Physiology and Cell Biology, The Scripps Research Institute, La Jolla, California 92037
| |
|
14
|
Perry RH, Cooks RG, Noll RJ. Orbitrap mass spectrometry: instrumentation, ion motion and applications. Mass Spectrom Rev 2008; 27:661-99. [PMID: 18683895 DOI: 10.1002/mas.20186] [Citation(s) in RCA: 273] [Impact Index Per Article: 17.1] [Indexed: 05/11/2023]
Abstract
Since its introduction, the orbitrap has proven to be a robust mass analyzer that can routinely deliver high resolving power and mass accuracy. Unlike conventional ion traps such as the Paul and Penning traps, the orbitrap uses only electrostatic fields to confine and to analyze injected ion populations. In addition, its relatively low cost, simple design and high space-charge capacity make it suitable for tackling complex scientific problems in which high performance is required. This review begins with a brief account of the set of inventions that led to the orbitrap, followed by a qualitative description of ion capture, ion motion in the trap and modes of detection. Various orbitrap instruments, including the commercially available linear ion trap-orbitrap hybrid mass spectrometers, are also discussed with emphasis on the different methods used to inject ions into the trap. Figures of merit such as resolving power, mass accuracy, dynamic range and sensitivity of each type of instrument are compared. In addition, experimental techniques that allow mass-selective manipulation of the motion of confined ions and their potential application in tandem mass spectrometry in the orbitrap are described. Finally, some specific applications are reviewed to illustrate the performance and versatility of the orbitrap mass spectrometers.
Affiliation(s)
- Richard H Perry
- Department of Chemistry, Purdue University, West Lafayette, IN 47907, USA
| | | | | |
|
15
|
Shen Y, Tolić N, Hixson KK, Purvine SO, Anderson GA, Smith RD. De novo sequencing of unique sequence tags for discovery of post-translational modifications of proteins. Anal Chem 2008; 80:7742-54. [PMID: 18783246 DOI: 10.1021/ac801123p] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Indexed: 11/29/2022]
Abstract
De novo sequencing is a spectrum analysis approach for mass spectrometry data to discover post-translational modifications in proteins; however, such an approach is still in its infancy and is still not widely applied to proteomic practices due to its limited reliability. In this work, we describe a de novo sequencing approach for the discovery of protein modifications based on identification of the proteome UStags (Shen, Y.; Tolić, N.; Hixson, K. K.; Purvine, S. O.; Pasa-Tolić, L.; Qian, W. J.; Adkins, J. N.; Moore, R. J.; Smith, R. D. Anal. Chem. 2008, 80, 1871-1882). The de novo information was obtained from Fourier-transform tandem mass spectrometry data for peptides and polypeptides from a yeast lysate, and the de novo sequences obtained were selected based on filter levels designed to provide a limited yet high quality subset of UStags. The DNA-predicted database protein sequences were then compared to the UStags, and the differences observed across or in the UStags (i.e., the UStags' prefix and suffix sequences and the UStags themselves) were used to infer possible sequence modifications. With this de novo-UStag approach, we uncovered some unexpected variances within several yeast protein sequences due to amino acid mutations and/or multiple modifications to the predicted protein sequences. To determine false discovery rates, two random (false) databases were independently used for sequence matching, and ~3% false discovery rates were estimated for the de novo-UStag approach. The factors affecting the reliability (e.g., existence of de novo sequencing noise residues and redundant sequences) and the sensitivity of the approach were investigated and described. The combined de novo-UStag approach complements the UStag method previously reported by enabling the discovery of new protein modifications.
Affiliation(s)
- Yufeng Shen
- Biological Science Division, Pacific Northwest National Laboratory, Richland, Washington 99352, USA.
| | | | | | | | | | | |
|