151
|
Spirin V, Shpunt A, Seebacher J, Gentzel M, Shevchenko A, Gygi S, Sunyaev S. Assigning spectrum-specific P-values to protein identifications by mass spectrometry. Bioinformatics 2011; 27:1128-34. [PMID: 21349864 DOI: 10.1093/bioinformatics/btr089] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Although many methods and statistical approaches have been developed for protein identification by mass spectrometry, the problem of accurate assessment of statistical significance of protein identifications remains an open question. The main issues are as follows: (i) statistical significance of inferring peptide from experimental mass spectra must be platform independent and spectrum specific and (ii) individual spectrum matches at the peptide level must be combined into a single statistical measure at the protein level. RESULTS: We present a method and software to assign statistical significance to protein identifications from search engines for mass spectrometric data. The approach is based on asymptotic theory of order statistics. The parameters of the asymptotic distributions of identification scores are estimated for each spectrum individually. The method relies on new unbiased estimators for parameters of extreme value distribution. The estimated parameters are used to assign a spectrum-specific P-value to each peptide-spectrum match. The protein-level confidence measure combines P-values of peptide-to-spectrum matches. CONCLUSION: We extensively tested the method using triplicate mouse and yeast high-throughput proteomic experiments. The proposed statistical approach improves the sensitivity of protein identifications without compromising specificity. While the method was primarily designed to work with Mascot, it is platform-independent and is applicable to any search engine which outputs a single score for a peptide-spectrum match. We demonstrate this by testing the method in conjunction with X!Tandem. AVAILABILITY: The software is available for download at ftp://genetics.bwh.harvard.edu/SSPV/. CONTACT: ssunyaev@rics.bwh.harvard.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
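The pipeline this abstract describes (fit an extreme value distribution per spectrum, turn a match score into a P-value, then combine P-values per protein) can be illustrated with a minimal sketch. It is not the authors' software: the abstract does not give their estimators or their combination rule, so this sketch assumes a Gumbel distribution fitted by the method of moments and Fisher's method for combining P-values. All function names are hypothetical.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(scores):
    """Method-of-moments estimates (mu, beta) of a Gumbel distribution.

    Assumption: a swapped-in textbook estimator, not the unbiased
    estimators mentioned in the abstract.
    """
    beta = stdev(scores) * math.sqrt(6) / math.pi
    mu = mean(scores) - EULER_GAMMA * beta
    return mu, beta


def gumbel_p_value(score, mu, beta):
    """Upper-tail probability P(S >= score) under Gumbel(mu, beta)."""
    z = (score - mu) / beta
    # 1 - exp(-exp(-z)), written with expm1 for accuracy at small p.
    return -math.expm1(-math.exp(-z))


def fisher_combine(p_values):
    """Combine independent P-values with Fisher's method.

    -2 * sum(ln p) follows a chi-square distribution with 2k degrees of
    freedom, whose survival function has a closed form for even 2k.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # statistic / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total
```

In use, one would fit `fit_gumbel_moments` to the candidate scores of a single spectrum, score the top match with `gumbel_p_value`, and pass the P-values of all peptides assigned to one protein to `fisher_combine`. Fisher's method assumes the P-values are independent, which may not hold for overlapping peptides.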
Affiliation(s)
- Victor Spirin
- Division of Genetics, Brigham and Women's Hospital, Department of Cell Biology, Harvard Medical School, 240 Longwood Avenue, Boston, MA 02115, USA
|
152
|
Shashilov VA, Lednev IK. Advanced statistical and numerical methods for spectroscopic characterization of protein structural evolution. Chem Rev 2011; 110:5692-713. [PMID: 20593900 DOI: 10.1021/cr900152h] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.2] [Indexed: 11/28/2022]
Affiliation(s)
- Victor A Shashilov
- Aegis Analytical Corporation, 1380 Forest Park Circle, Suite 200, Lafayette, Colorado 80026, USA
|
153
|
Matthiesen R, Azevedo L, Amorim A, Carvalho AS. Discussion on common data analysis strategies used in MS-based proteomics. Proteomics 2011; 11:604-19. [PMID: 21241018 DOI: 10.1002/pmic.201000404] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 06/30/2010] [Revised: 10/29/2010] [Accepted: 11/02/2010] [Indexed: 11/07/2022]
Abstract
Current proteomics technology is limited in resolving the proteome complexity of biological systems. The main issue at stake is to increase throughput and spectra quality so that spatiotemporal dimensions, population parameters and the complexity of protein modifications on a quantitative scale can be considered. MS-based proteomics and protein arrays are the main players in large-scale proteome analysis and an integration of these two methodologies is powerful but presently not sufficient for detailed quantitative and spatiotemporal proteome characterization. Improvements of instrumentation for MS-based proteomics have been achieved recently resulting in data sets of approximately one million spectra which is a large step in the right direction. The corresponding raw data range from 50 to 100 Gb and are frequently made available. Multidimensional LC-MS data sets have been demonstrated to identify and quantitate 2000-8000 proteins from whole cell extracts. The analysis of the resulting data sets requires several steps from raw data processing, to database-dependent search, statistical evaluation of the search result, quantitative algorithms and statistical analysis of quantitative data. A large number of software tools have been proposed for the above-mentioned tasks. However, it is not the aim of this review to cover all software tools, but rather discuss common data analysis strategies used by various algorithms for each of the above-mentioned steps in a non-redundant approach and to argue that there are still some areas which need improvements.
Affiliation(s)
- Rune Matthiesen
- Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal.
|
154
|
Dasari S, Chambers MC, Codreanu SG, Liebler DC, Collins BC, Pennington SR, Gallagher WM, Tabb DL. Sequence tagging reveals unexpected modifications in toxicoproteomics. Chem Res Toxicol 2011; 24:204-16. [PMID: 21214251 DOI: 10.1021/tx100275t] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 12/17/2022]
Abstract
Toxicoproteomic samples are rich in posttranslational modifications (PTMs) of proteins. Identifying these modifications via standard database searching can incur significant performance penalties. Here, we describe the latest developments in TagRecon, an algorithm that leverages inferred sequence tags to identify modified peptides in toxicoproteomic data sets. TagRecon identifies known modifications more effectively than the MyriMatch database search engine. TagRecon outperformed state of the art software in recognizing unanticipated modifications from LTQ, Orbitrap, and QTOF data sets. We developed user-friendly software for detecting persistent mass shifts from samples. We follow a three-step strategy for detecting unanticipated PTMs in samples. First, we identify the proteins present in the sample with a standard database search. Next, identified proteins are interrogated for unexpected PTMs with a sequence tag-based search. Finally, additional evidence is gathered for the detected mass shifts with a refinement search. Application of this technology on toxicoproteomic data sets revealed unintended cross-reactions between proteins and sample processing reagents. Twenty-five proteins in rat liver showed signs of oxidative stress when exposed to potentially toxic drugs. These results demonstrate the value of mining toxicoproteomic data sets for modifications.
Affiliation(s)
- Surendra Dasari
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee 37232-0006, United States
|
155
|
Faherty BK, Gerber SA. MacroSEQUEST: efficient candidate-centric searching and high-resolution correlation analysis for large-scale proteomics data sets. Anal Chem 2010; 82:6821-9. [PMID: 20684545 DOI: 10.1021/ac100783x] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 11/30/2022]
Abstract
Modern mass spectrometers are now capable of producing tens of thousands of tandem mass (MS/MS) spectra per hour of operation, resulting in an ever-increasing burden on the computational tools required to translate these raw MS/MS spectra into peptide sequences. In the present work, we describe our efforts to improve the performance of one of the earliest and most commonly used algorithms, SEQUEST, through a wholesale redesign of its processing architecture. We call this new program MacroSEQUEST, which exhibits a dramatic improvement in processing speed by transiently indexing the array of MS/MS spectra prior to searching FASTA databases. We demonstrate the performance of MacroSEQUEST relative to a suite of other programs commonly encountered in proteomics research. We also extend the capability of SEQUEST by implementing a parameter in MacroSEQUEST that allows for scalable sparse arrays of experimental and theoretical spectra to be implemented for high-resolution correlation analysis and demonstrate the advantages of high-resolution MS/MS searching to the sensitivity of large-scale proteomics data sets.
Affiliation(s)
- Brendan K Faherty
- Department of Genetics, Dartmouth Medical School, Lebanon, New Hampshire 03756, USA
|
156
|
RAId_aPS: MS/MS analysis with multiple scoring functions and spectrum-specific statistics. PLoS One 2010; 5:e15438. [PMID: 21103371 PMCID: PMC2982831 DOI: 10.1371/journal.pone.0015438] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 07/29/2010] [Accepted: 09/20/2010] [Indexed: 11/26/2022] Open
Abstract
Statistically meaningful comparison/combination of peptide identification results from various search methods is impeded by the lack of a universal statistical standard. Providing an E-value calibration protocol, we demonstrated earlier the feasibility of translating either the score or heuristic E-value reported by any method into the textbook-defined E-value, which may serve as the universal statistical standard. This protocol, although robust, may lose spectrum-specific statistics and might require a new calibration when changes in experimental setup occur. To mitigate these issues, we developed a new MS/MS search tool, RAId_aPS, that is able to provide spectrum-specific E-values for additive scoring functions. Given a selection of scoring functions out of RAId score, K-score, Hyperscore and XCorr, RAId_aPS generates the corresponding score histograms of all possible peptides using dynamic programming. Using these score histograms to assign E-values enables a calibration-free protocol for accurate significance assignment for each scoring function. RAId_aPS features four different modes: (i) compute the total number of possible peptides for a given molecular mass range, (ii) generate the score histogram given a MS/MS spectrum and a scoring function, (iii) reassign E-values for a list of candidate peptides given a MS/MS spectrum and the scoring functions chosen, and (iv) perform database searches using selected scoring functions. In modes (iii) and (iv), RAId_aPS is also capable of combining results from different scoring functions using spectrum-specific statistics. The web link is http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/raid_aps/index.html. Relevant binaries for Linux, Windows, and Mac OS X are available from the same page.
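Mode (i) above, counting all possible peptides in a mass range, lends itself to dynamic programming. The sketch below is an illustration under simplifying assumptions, not the RAId_aPS implementation. It uses integer nominal residue masses, ignores terminal groups and modifications, and counts leucine and isoleucine as distinct residues. The function names are hypothetical.

```python
# Nominal (integer) residue masses in Da for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99,
    "T": 101, "C": 103, "L": 113, "I": 113, "N": 114,
    "D": 115, "Q": 128, "K": 128, "E": 129, "M": 131,
    "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def count_peptides_by_mass(max_mass, residue_masses=RESIDUE_MASS):
    """Return counts[m] = number of residue sequences with total mass m.

    Recurrence: a sequence of mass m ends in some residue of mass r,
    preceded by any sequence of mass m - r.
    """
    counts = [0] * (max_mass + 1)
    counts[0] = 1  # the empty sequence
    for m in range(1, max_mass + 1):
        counts[m] = sum(
            counts[m - r] for r in residue_masses.values() if r <= m
        )
    return counts


def count_in_range(low, high):
    """Total number of sequences with residue mass in [low, high]."""
    counts = count_peptides_by_mass(high)
    return sum(counts[low:high + 1])
```

A score histogram for an additive scoring function could in principle be built with the same recurrence by carrying a score dimension alongside the mass, which is presumably why the abstract restricts the approach to additive scores.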
|
157
|
Sharma V, Eng JK, Feldman S, von Haller PD, MacCoss MJ, Noble WS. Precursor charge state prediction for electron transfer dissociation tandem mass spectra. J Proteome Res 2010; 9:5438-44. [PMID: 20731383 DOI: 10.1021/pr1006685] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/30/2022]
Abstract
Electron-transfer dissociation (ETD) induces fragmentation along the peptide backbone by transferring an electron from a radical anion to a protonated peptide. In contrast with collision-induced dissociation, side chains and modifications such as phosphorylation are left intact through the ETD process. Because the precursor charge state is an important input to MS/MS sequence database search tools, the ability to accurately determine the precursor charge is helpful for the identification process. Furthermore, because ETD can be applied to large, highly charged peptides, the need for accurate precursor charge state determination is magnified. Otherwise, each spectrum must be searched repeatedly using a large range of possible precursor charge states. To address this problem, we have developed an ETD charge state prediction tool based on support vector machine classifiers that is demonstrated to exhibit superior classification accuracy while minimizing the overall number of predicted charge states. The tool is freely available, open source, cross platform compatible, and demonstrated to perform well when compared with an existing charge state prediction tool. The program is available from http://code.google.com/p/etdz/.
Affiliation(s)
- Vagisha Sharma
- Department of Biochemistry, University of Washington, Seattle, Washington, USA
|
158
|
Shireman LM, Kripps KA, Balogh LM, Conner KP, Whittington D, Atkins WM. Glutathione transferase A4-4 resists adduction by 4-hydroxynonenal. Arch Biochem Biophys 2010; 504:182-9. [PMID: 20836986 DOI: 10.1016/j.abb.2010.09.005] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 08/12/2010] [Revised: 09/06/2010] [Accepted: 09/08/2010] [Indexed: 12/30/2022]
Abstract
4-Hydroxy-2-trans-nonenal (HNE) is a lipid peroxidation product that contributes to the pathophysiology of several diseases with components of oxidative stress. The electrophilic nature of HNE results in covalent adduct formation with proteins, fatty acids and DNA. However, it remains unclear whether enzymes that metabolize HNE avoid inactivation by it. Glutathione transferase A4-4 (GST A4-4) plays a significant role in the elimination of HNE by conjugating it with glutathione (GSH), with catalytic activity toward HNE that is dramatically higher than the homologous GST A1-1 or distantly related GSTs. To determine whether enzymes that metabolize HNE resist its covalent adduction, the rates of adduction of these GST isoforms were compared and the functional effects of adduction on catalytic properties were determined. Although GST A4-4 and GST A1-1 have striking structural similarity, GST A4-4 was insensitive to adduction by HNE under conditions that yield modest adduction of GST A1-1 and extensive adduction of GST P1-1. Furthermore, adduction of GST P1-1 by HNE eliminated its activity toward the substrates 1-chloro-2,4-dinitrobenzene (CDNB) and toward HNE itself. HNE effects on GST A4-4 and A1-1 were less significant. The results indicate that enzymes that metabolize HNE may have evolved structurally to resist covalent adduction by it.
Affiliation(s)
- Laura M Shireman
- Department of Medicinal Chemistry, University of Washington, Seattle, 98195-7610, USA
|
159
|
McIlwain S, Draghicescu P, Singh P, Goodlett DR, Noble WS. Detecting cross-linked peptides by searching against a database of cross-linked peptide pairs. J Proteome Res 2010; 9:2488-95. [PMID: 20349954 DOI: 10.1021/pr901163d] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Indexed: 12/16/2022]
Abstract
Mass spectrometric identification of cross-linked peptides can provide valuable information about the structure of protein complexes. We describe a straightforward database search scheme that identifies and assigns statistical confidence estimates to spectra from cross-linked peptides. The method is well suited to targeted analysis of a single protein complex, without requiring an isotope labeling strategy. Our approach uses a SEQUEST-style search procedure in which the database is comprised of a mixture of single peptides with and without linkers attached and cross-linked products. In contrast to several previous approaches, we generate theoretical spectra that account for all of the expected peaks from a cross-linked product, and we employ an empirical curve-fitting procedure to estimate statistical confidence measures. We show that our fully automated procedure successfully reidentifies spectra from a previous study, and we provide evidence that our statistical confidence estimates are accurate.
Affiliation(s)
- Sean McIlwain
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
|
160
|
Cysteine S-conjugate β-lyases: important roles in the metabolism of naturally occurring sulfur and selenium-containing compounds, xenobiotics and anticancer agents. Amino Acids 2010; 41:7-27. [PMID: 20306345 DOI: 10.1007/s00726-010-0552-0] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.9] [Received: 01/05/2010] [Accepted: 03/01/2010] [Indexed: 12/13/2022]
Abstract
Cysteine S-conjugate β-lyases are pyridoxal 5'-phosphate-containing enzymes that catalyze β-elimination reactions with cysteine S-conjugates that possess a good leaving group in the β-position. The end products are aminoacrylate and a sulfur-containing fragment. The aminoacrylate tautomerizes and hydrolyzes to pyruvate and ammonia. The mammalian cysteine S-conjugate β-lyases thus far identified are enzymes involved in amino acid metabolism that catalyze β-lyase reactions as non-physiological side reactions. Most are aminotransferases. In some cases the lyase is inactivated by reaction products. The cysteine S-conjugate β-lyases are of much interest to toxicologists because they play a key role in the bioactivation (toxication) of halogenated alkenes, some of which are produced on an industrial scale and are environmental contaminants. The cysteine S-conjugate β-lyases have been reviewed in this journal previously (Cooper and Pinto in Amino Acids 30:1-15, 2006). Here, we focus on more recent findings regarding: (1) the identification of enzymes associated with high-Mr cysteine S-conjugate β-lyases in the cytosolic and mitochondrial fractions of rat liver and kidney; (2) the mechanism of syncatalytic inactivation of rat liver mitochondrial aspartate aminotransferase by the nephrotoxic β-lyase substrate S-(1,1,2,2-tetrafluoroethyl)-L-cysteine (the cysteine S-conjugate of tetrafluoroethylene); (3) toxicant channeling of reactive fragments from the active site of mitochondrial aspartate aminotransferase to susceptible proteins in the mitochondria; (4) the involvement of cysteine S-conjugate β-lyases in the metabolism/bioactivation of drugs and natural products; and (5) the role of cysteine S-conjugate β-lyases in the metabolism of selenocysteine Se-conjugates. This review emphasizes the fact that the cysteine S-conjugate β-lyases are biologically more important than hitherto appreciated.
|
161
|
Kwon KH. Analytical methods for proteome data obtained from SDS-PAGE multi-dimensional separation and mass spectrometry. J Anal Sci Technol 2010. [DOI: 10.5355/jast.2010.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/11/2022] Open
|
162
|
Abstract
The peptide identification problem lies at the heart of modern proteomic methodology, from which the presence of a particular protein or proteins in a sample may be inferred. The challenge is to find the most likely amino acid sequence, which corresponds to each tandem mass spectrum that has been collected, and produce some kind of score and associated statistical measure that the putative identification is correct. This approach assumes that the peptide (and parent protein) sequence in question is known and is present in the database which is to be searched, as opposed to de novo methods, which seek to identify the peptide ab initio. This chapter will provide an overview of the methods that common, popular software tools employ to search protein sequence databases to provide the non-expert reader with sufficient background to appreciate the choices they can make. This will cover the approaches used to compare experimental and theoretical spectra and some of the methods used to validate and provide higher confidence in the assignments.
Affiliation(s)
- Simon J Hubbard
- Faculty of Life Sciences, University of Manchester, Michael Smith Building, Manchester, UK.
|
163
|
Abstract
The review describes methods of de novo sequencing of peptides by mass spectrometry. De novo methods utilize computational approaches to deduce the sequence or partial sequence of peptides directly from the experimental MS/MS spectra. The concepts behind a number of de novo sequencing methods are discussed. The other approach to identify peptides by tandem mass spectrometry is to match the fragment ions with virtual peptide ions generated from a genomic or protein database. De novo methods are essential to identify proteins when the genomes are not known but they are also extremely useful even when the genomes are known since they are not affected by errors in a search database. Another advantage of de novo methods is that the partial sequence can be used to search for posttranslation modifications or for the identification of mutations by homology based software.
Affiliation(s)
- Christopher Hughes
- Department of Biochemistry, University of Western Ontario, London, ON, Canada
|
164
|
Denekamp NY, Reinhardt R, Kube M, Lubzens E. Late embryogenesis abundant (LEA) proteins in nondesiccated, encysted, and diapausing embryos of rotifers. Biol Reprod 2009; 82:714-24. [PMID: 20018906 DOI: 10.1095/biolreprod.109.081091] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 11/01/2022] Open
Abstract
Two genes encoding late embryogenesis abundant proteins (LEAs) are expressed in encysted diapausing embryos (or resting eggs) of rotifers (Brachionus plicatilis O.F. Müller) and females forming them. The two genes (bpa-leaa and bpa-leab) share approximately 50% of their nucleotide sequence, and bpa-leaa is more than twofold longer than bpa-leab. The deduced amino acid sequences show high abundance of alanine, glycine, lysine, and glutamic acid; a hydropathy index of lower than one; and a relatively high (81-82%) predicted probability of forming alpha-helices in their secondary structure, all of which are characteristic features of LEAs. The predicted molecular masses of bpa-LEAA (approximately 67 kDa) and bpa-LEAB (approximately 27 kDa) are similar to the molecular mass determined by Western-blot analyses, suggesting a low probability of posttranslational modifications. In silico analysis reveals that the two LEAs resemble group 3 LEAs based on the repeats for 11mer motifs, although they also display several putative amino acids typical of the 20mer motif of group 1 LEAs. The rotifer LEAs do not contain a predicted target sequence and are more likely localized in the cytosol. LEAs were expressed in resting eggs and females producing them, but not in other female forms or males. LEA transcripts and proteins are degraded during hatching, suggesting that LEAs are developmentally programmed during resting egg formation and hatching. LEAs probably equip the resting eggs to withstand desiccation if that occurs during dormancy. The present study expands our knowledge about the biological pathways associated with formation of rotifer resting eggs and also demonstrates the occurrence of LEAs in dormant, nondesiccated, encysted animal embryos.
Affiliation(s)
- Nadav Y Denekamp
- Department of Marine Biology, Israel Oceanographic and Limnological Research, Haifa, Israel
|
165
|
Tsai YS, Scherl A, Shaw JL, MacKay CL, Shaffer SA, Langridge-Smith PRR, Goodlett DR. Precursor ion independent algorithm for top-down shotgun proteomics. J Am Soc Mass Spectrom 2009; 20:2154-2166. [PMID: 19773183 DOI: 10.1016/j.jasms.2009.07.024] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Received: 03/21/2009] [Revised: 07/29/2009] [Accepted: 07/29/2009] [Indexed: 05/28/2023]
Abstract
We present a precursor ion independent top-down algorithm (PIITA) for use in automated assignment of protein identifications from tandem mass spectra of whole proteins. To acquire the data, we utilize data-dependent acquisition to select protein precursor ions eluting from a C4-based HPLC column for collision induced dissociation in the linear ion trap of an LTQ-Orbitrap mass spectrometer. Gas-phase fractionation is used to increase the number of acquired tandem mass spectra, all of which are recorded in the Orbitrap mass analyzer. To identify proteins, the PIITA algorithm compares deconvoluted, deisotoped, observed tandem mass spectra to all possible theoretical tandem mass spectra for each protein in a genomic sequence database without regard for measured parent ion mass. Only after a protein is identified, is any difference in measured and theoretical precursor mass used to identify and locate post-translation modifications. We demonstrate the application of PIITA to data generated via our wet-lab approach on a Salmonella typhimurium outer membrane extract and compare these results to bottom-up analysis. From these data, we identify 154 proteins by top-down analysis, 73 of which were not identified in a parallel bottom-up analysis. We also identify 201 unique isoforms of these 154 proteins at a false discovery rate (FDR) of <1%.
Affiliation(s)
- Yihsuan S Tsai
- Department of Medicinal Chemistry, University of Washington, Seattle, Washington 98195-7610, USA
|
166
|
Nunn BL, Aker JR, Shaffer SA, Tsai S, Strzepek RF, Boyd PW, Freeman TL, Brittnacher M, Malmström L, Goodlett DR. Deciphering diatom biochemical pathways via whole-cell proteomics. Aquat Microb Ecol 2009; 55:241-253. [PMID: 19829762 PMCID: PMC2761042 DOI: 10.3354/ame01284] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 05/19/2023]
Abstract
Diatoms play a critical role in the oceans' carbon and silicon cycles; however, a mechanistic understanding of the biochemical processes that contribute to their ecological success remains elusive. Completion of the Thalassiosira pseudonana genome provided 'blueprints' for the potential biochemical machinery of diatoms, but offers only a limited insight into their biology under various environmental conditions. Using high-throughput shotgun proteomics, we identified a total of 1928 proteins expressed by T. pseudonana cultured under optimal growth conditions, enabling us to analyze this diatom's primary metabolic and biosynthetic pathways. Of the proteins identified, 70% are involved in cellular metabolism, while 11% are involved in the transport of molecules. We identified all of the enzymes involved in the urea cycle, thereby describing the complete pathway to convert ammonia to urea, along with urea transporters, and the urea-degrading enzyme urease. Although metabolic exchange between these pathways remains ambiguous, their constitutive presence suggests complex intracellular nitrogen recycling. In addition, all C4-related enzymes for carbon fixation have been identified to be in abundance, with high protein sequence coverage. Quantification of mass spectra acquisitions demonstrated that the 20 most abundant proteins included an unexpectedly high expression of clathrin, which is the primary structural protein involved in endocytic transport. This result highlights a previously overlooked mechanism for the inter- and intra-cellular transport of nutrients and macromolecules in diatoms, potentially providing a missing link to organelle communication and metabolite exchange. Our results demonstrate the power of proteomics, and lay the groundwork for future comparative proteomic studies and directed analyses of specifically expressed proteins and biochemical pathways of oceanic diatoms.
Affiliation(s)
- Brook L. Nunn
- Medicinal Chemistry Department, University of Washington, Box 335351, Seattle, Washington 98195, USA
- Jocelyn R. Aker
- Medicinal Chemistry Department, University of Washington, Box 335351, Seattle, Washington 98195, USA
- Scott A. Shaffer
- Medicinal Chemistry Department, University of Washington, Box 335351, Seattle, Washington 98195, USA
- Shannon Tsai
- Medicinal Chemistry Department, University of Washington, Box 335351, Seattle, Washington 98195, USA
- Philip W. Boyd
- NIWA Centre for Chemical and Physical Oceanography, Department of Chemistry, University of Otago, Dunedin, New Zealand
- Theodore Larson Freeman
- Medicinal Chemistry Department, University of Washington, Box 335351, Seattle, Washington 98195, USA
- Department of Genomic Sciences, University of Washington, Box 355065, Seattle, Washington 98195, USA
- Mitchell Brittnacher
- Department of Genomic Sciences, University of Washington, Box 355065, Seattle, Washington 98195, USA
- Lars Malmström
- Medicinal Chemistry Department, University of Washington, Box 335351, Seattle, Washington 98195, USA
- David R. Goodlett
- Medicinal Chemistry Department, University of Washington, Box 335351, Seattle, Washington 98195, USA
|
167
|
Critical Evaluation of Product Ion Selection and Spectral Correlation Analysis for Biomarker Screening Using Targeted Peptide Multiple Reaction Monitoring. Clin Proteomics 2009. [DOI: 10.1007/s12014-009-9023-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022] Open
Abstract
Introduction
Tandem mass spectrometry (MS/MS) has emerged as a cornerstone of proteomic screens aimed at discovering putative protein biomarkers of disease with potential clinical applications. Systematic validation of lead candidates in large numbers of samples from patient cohorts remains an important challenge. One particularly promising high-throughput technique is multiple reaction monitoring (MRM), a targeted form of MS/MS by which precise peptide precursor–product ion combinations, or transitions, are selectively tracked as informative probes. Despite recent progress, however, many important computational and statistical issues remain unresolved. These include the selection of an optimal set of transitions so as to achieve sufficiently high specificity and sensitivity when profiling complex biological specimens, and the corresponding generation of a suitable scoring function to reliably confirm tentative molecular identities based on noisy spectra.
Methods
In this study, we investigate various empirical criteria that are helpful to consider when developing and interpreting MRM-style assays based on the similarity between experimental and annotated reference spectra. We also rigorously evaluate and compare the performance of conventional spectral similarity measures, based on only a few pre-selected representative transitions, with a generic scoring metric, termed Tcorr, wherein a selected product ion profile is used to score spectral comparisons.
Conclusions
Our analyses demonstrate that Tcorr is potentially more suitable and effective for detecting biomarkers in complex biological mixtures than more traditional spectral library searches.
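The abstract does not define Tcorr, so the sketch below does not attempt to reproduce it. It only illustrates two generic ways to score an observed product ion intensity profile against a reference profile: a normalized dot product and a Pearson correlation. The function names and the assumption that both profiles list intensities for the same transitions in the same order are hypothetical.

```python
import math


def _check_lengths(observed, reference):
    # Both profiles must cover the same transitions in the same order.
    if len(observed) != len(reference) or not observed:
        raise ValueError("profiles must be non-empty and of equal length")


def normalized_dot_product(observed, reference):
    """Cosine similarity between two intensity profiles (0 if either is all zero)."""
    _check_lengths(observed, reference)
    dot = sum(o * r for o, r in zip(observed, reference))
    norm = math.sqrt(sum(o * o for o in observed)) * math.sqrt(
        sum(r * r for r in reference)
    )
    return dot / norm if norm else 0.0


def pearson_correlation(observed, reference):
    """Pearson correlation between two intensity profiles (0 if either is constant)."""
    _check_lengths(observed, reference)
    n = len(observed)
    mean_o = sum(observed) / n
    mean_r = sum(reference) / n
    cov = sum((o - mean_o) * (r - mean_r) for o, r in zip(observed, reference))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    denom = math.sqrt(var_o * var_r)
    return cov / denom if denom else 0.0
```

With only a few pre-selected transitions, both measures are computed from very short vectors, which is one plausible reason a score based on a fuller product ion profile could be more robust to noise.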
|