1
Bhat GR, Sethi I, Rah B, Kumar R, Afroze D. Innovative in Silico Approaches for Characterization of Genes and Proteins. Front Genet 2022; 13:865182. [PMID: 35664302 PMCID: PMC9159363 DOI: 10.3389/fgene.2022.865182] [Received: 01/29/2022] [Accepted: 04/11/2022]
Abstract
Bioinformatics is an amalgamation of biology, mathematics and computer science. It gathers molecular-level information from biology and applies informatics techniques to understand and organize that data in a useful manner. With the help of bioinformatics, experimentally generated data are stored in several online databases, such as nucleotide databases, protein databases, GenBank and others. The data stored in these databases are used as a reference for experimental evaluation and validation. To date, several online tools have been developed to analyze genomic, transcriptomic, proteomic, epigenomic and metabolomic data. Some of them include Human Splicing Finder (HSF), Exonic Splicing Enhancer Mutation taster, and others. A number of SNPs are observed in non-coding, intronic regions and play a role in gene regulation, which may or may not directly affect protein expression. Many mutations are thought to influence the splicing mechanism by affecting existing splice sites or creating new ones. HSF was developed to predict the effect of a mutation (SNP) on splicing signals. The tool can thus provide data for a better understanding of intronic mutations, which can then be validated experimentally. Additionally, rapid advances in proteomics have steered researchers to organize the study of protein structure, function, relationships, and dynamics in space and time. The effective integration of all of these technologies will eventually drive next-generation systems biology, which will provide valuable biological insights for research, diagnostics, therapeutics and the development of personalized medicine.
Affiliation(s)
- Gh. Rasool Bhat
- Advanced Centre for Human Genetics, Sher-i-Kashmir Institute of Medical Sciences, Soura, India
- Itty Sethi
- Institute of Human Genetics, University of Jammu, Jammu, India
- Bilal Rah
- Advanced Centre for Human Genetics, Sher-i-Kashmir Institute of Medical Sciences, Soura, India
- Rakesh Kumar
- School of Biotechnology, Shri Mata Vaishno Devi University, Katra, India
- Dil Afroze
- Advanced Centre for Human Genetics, Sher-i-Kashmir Institute of Medical Sciences, Soura, India
2
Tsiatsiani L, Giansanti P, Scheltema RA, van den Toorn H, Overall CM, Altelaar AFM, Heck AJR. Opposite Electron-Transfer Dissociation and Higher-Energy Collisional Dissociation Fragmentation Characteristics of Proteolytic K/R(X)n and (X)nK/R Peptides Provide Benefits for Peptide Sequencing in Proteomics and Phosphoproteomics. J Proteome Res 2016; 16:852-861. [PMID: 28111955 DOI: 10.1021/acs.jproteome.6b00825]
Abstract
A key step in shotgun proteomics is the digestion of proteins into peptides amenable to mass spectrometry. Tryptic peptides can be readily sequenced and identified by collision-induced dissociation (CID) or higher-energy collisional dissociation (HCD) because the fragmentation rules are well understood. Here, we investigate LysargiNase, a perfect trypsin mirror protease, because it cleaves with equal specificity at arginine and lysine residues, albeit on their N-terminal side. LysargiNase peptides are therefore practically tryptic-like in length and sequence, except that following ESI, the two protons are now both positioned at the N-terminus. Here, we compare side-by-side the chromatographic separation properties, gas-phase fragmentation characteristics, and (phospho)proteome sequence coverage of tryptic (i.e., (X)nK/R) and LysargiNase (i.e., K/R(X)n) peptides using primarily electron-transfer dissociation (ETD) and, for comparison, HCD. We find that tryptic and LysargiNase peptides fragment nearly as mirror images. For LysargiNase, predominantly N-terminal peptide ions (c-ions (ETD) and b-ions (HCD)) are formed, whereas for trypsin, C-terminal fragment ions dominate (z-ions (ETD) and y-ions (HCD)) in a homologous mixture of complementary ions. Especially during ETD, LysargiNase peptides fragment into low-complexity but information-rich sequence ladders. Trypsin and LysargiNase chart distinct parts of the proteome, and therefore, the combined use of these enzymes will benefit a more in-depth and reliable analysis of (phospho)proteomes.
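The mirror-image cleavage described in this abstract can be illustrated with a minimal in silico digestion sketch. It assumes idealized rules only: trypsin cuts after every K/R, LysargiNase cuts before every K/R, with no missed cleavages and no proline exception. The function name `digest` and the flag `cleave_before` are hypothetical and are not taken from the paper.

```python
def digest(protein: str, cleave_before: bool = False) -> list[str]:
    """Split a protein sequence at every lysine (K) or arginine (R).

    cleave_before=False: cut on the C-terminal side of K/R (trypsin-like, (X)nK/R).
    cleave_before=True:  cut on the N-terminal side of K/R (LysargiNase-like, K/R(X)n).
    Simplifying assumption: complete digestion, no proline rule.
    """
    # Collect the string indices at which the backbone is cut.
    cut_sites = [
        (i if cleave_before else i + 1)
        for i, residue in enumerate(protein)
        if residue in "KR"
    ]

    peptides = []
    start = 0
    for cut in cut_sites:
        if start < cut:  # skip empty fragments (e.g. K/R at the very first position)
            peptides.append(protein[start:cut])
            start = cut
    if start < len(protein):  # trailing peptide after the last cut
        peptides.append(protein[start:])
    return peptides


if __name__ == "__main__":
    seq = "AAKGGRTT"  # made-up toy sequence
    print("trypsin    :", digest(seq))
    print("LysargiNase:", digest(seq, cleave_before=True))
```

In this toy case the two peptide sets differ only in which side of the peptide carries the basic residue, which is the property the abstract links to the mirrored fragment ion series.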
Affiliation(s)
- Christopher M Overall
- Centre for Blood Research, Department of Oral Biological and Medical Sciences, and Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver V6T 1Z3, BC, Canada
3
Expanding the detectable HLA peptide repertoire using electron-transfer/higher-energy collision dissociation (EThcD). Proc Natl Acad Sci U S A 2014; 111:4507-12. [PMID: 24616531 DOI: 10.1073/pnas.1321458111]
Abstract
The identification of peptides presented by human leukocyte antigen (HLA) class I is tremendously important for the understanding of antigen presentation mechanisms under healthy or diseased conditions. Currently, mass spectrometry-based methods represent the best methodology for the identification of HLA class I-associated peptides. However, the HLA class I peptide repertoire remains largely unexplored because the variable nature of endogenous peptides poses difficulties for conventional peptide fragmentation technology. Here, we substantially enhanced (about threefold) the identification success rate of peptides presented by HLA class I using combined electron-transfer/higher-energy collision dissociation (EThcD), reporting over 12,000 high-confidence (false discovery rate <1%) peptides from a single human B-cell line. The direct importance of such an unprecedentedly large dataset is highlighted by the discovery of unique features in antigen presentation. The observation that a substantial part of proteins is sampled across different HLA alleles, and the common occurrence of HLA class I nested sets, suggest that the constraints of HLA class I to comprehensively present the health states of cells are not as tight as previously thought. Our dataset contains a substantial set of peptides bearing a variety of posttranslational modifications presented with marked allele-specific differences. We propose that EThcD should become the method of choice in analyzing HLA class I-presented peptides.
4
Abstract
Moving past the discovery phase of proteomics, the term targeted proteomics covers multiple approaches that investigate a certain set of proteins in more detail. One such targeted proteomics approach is the combination of liquid chromatography and selected or multiple reaction monitoring mass spectrometry (SRM, MRM). SRM-MS requires prior knowledge of the fragmentation pattern of peptides, as the presence of the analyte in a sample is determined by measuring the m/z values of predefined precursor and fragment ions. Using scheduled SRM-MS, many analytes can be robustly monitored, allowing for high-throughput sample analysis of the same set of proteins over many conditions. In this chapter, the fundamentals of SRM-MS are explained, as well as an optimized SRM pipeline from assay generation to data analysis.
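Because SRM-MS monitors predefined precursor and fragment m/z values, a transition list can be sketched directly from a peptide sequence. The following is a minimal illustration, not the pipeline described in the chapter: it assumes unmodified peptides, standard monoisotopic residue masses, and singly charged y-ions only. The function names are hypothetical.

```python
# Standard monoisotopic residue masses in Da (rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O
PROTON = 1.007276   # mass of one proton


def precursor_mz(peptide: str, charge: int = 2) -> float:
    """m/z of the intact peptide carrying `charge` protons."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(peptide: str, n: int) -> float:
    """m/z of the singly charged y_n ion (the last n residues)."""
    if not 1 <= n < len(peptide):
        raise ValueError("n must be between 1 and len(peptide) - 1")
    return sum(RESIDUE_MASS[aa] for aa in peptide[-n:]) + WATER + PROTON


def transitions(peptide: str, charge: int = 2) -> list[tuple[float, float, str]]:
    """List (precursor m/z, fragment m/z, label) for every y-ion of the peptide."""
    q1 = precursor_mz(peptide, charge)
    return [(q1, y_ion_mz(peptide, n), f"y{n}") for n in range(1, len(peptide))]


if __name__ == "__main__":
    for q1, q3, label in transitions("GASK"):  # made-up toy peptide
        print(f"{label}: Q1 = {q1:.4f}, Q3 = {q3:.4f}")
```

In practice, assays typically keep only the few most intense and interference-free transitions per peptide, which this sketch does not attempt to select.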
Affiliation(s)
- H Alexander Ebhardt
- Institute of Molecular Systems Biology, Eidgenössische Technische Hochschule (ETH) Zürich, Zürich, Switzerland
5
Robotham SA, Kluwe C, Cannon JR, Ellington A, Brodbelt JS. De novo sequencing of peptides using selective 351 nm ultraviolet photodissociation mass spectrometry. Anal Chem 2013; 85:9832-8. [PMID: 24050806 DOI: 10.1021/ac402309h]
Abstract
Although in silico database search methods remain more popular for shotgun proteomics methods, de novo sequencing offers the ability to identify peptides derived from proteins lacking sequenced genomes and ones with subtle splice variants or truncations. Ultraviolet photodissociation (UVPD) of peptides derivatized by selective attachment of a chromophore at the N-terminus generates a characteristic series of y ions. The UVPD spectra of the chromophore-labeled peptides are simplified and thus amenable to de novo sequencing. This method resulted in an observed sequence coverage of 79% for cytochrome C (eight peptides), 47% for β-lactoglobulin (five peptides), 25% for carbonic anhydrase (six peptides), and 51% for bovine serum albumin (33 peptides). This strategy also allowed differentiation of proteins with high sequence homology as evidenced by de novo sequencing of two variants of green fluorescent protein.
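Sequence coverage figures such as those quoted above are commonly computed as the fraction of protein residues that fall inside at least one identified peptide. The sketch below shows that calculation under the assumption of exact substring matching; the abstract does not state how coverage was computed, and the function name and toy sequences are hypothetical.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Fraction of residues in `protein` covered by at least one peptide.

    Assumption: peptides are matched as exact substrings, and every
    occurrence of a peptide in the protein counts as covered.
    """
    if not protein:
        raise ValueError("protein sequence must not be empty")

    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue  # ignore empty strings
        start = protein.find(peptide)
        while start != -1:
            for k in range(start, start + len(peptide)):
                covered[k] = True
            start = protein.find(peptide, start + 1)  # look for further occurrences
    return sum(covered) / len(protein)


if __name__ == "__main__":
    # Made-up toy data: overlapping peptides are only counted once.
    print(sequence_coverage("ABCDEFGHIJ", ["ABC", "CDE", "HIJ"]))
```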
Affiliation(s)
- Scott A Robotham
- Department of Chemistry, University of Texas, Austin, Texas 78712, United States
6
Richards AL, Vincent CE, Guthals A, Rose CM, Westphall MS, Bandeira N, Coon JJ. Neutron-encoded signatures enable product ion annotation from tandem mass spectra. Mol Cell Proteomics 2013; 12:3812-23. [PMID: 24043425 DOI: 10.1074/mcp.m113.028951]
Abstract
We report the use of neutron-encoded (NeuCode) stable isotope labeling of amino acids in cell culture for the purpose of C-terminal product ion annotation. Two NeuCode labeling isotopologues of lysine, ¹³C₆¹⁵N₂ and ²H₈, which differ by 36 mDa, were metabolically embedded in a sample proteome, and the resultant labeled proteins were combined, digested, and analyzed via liquid chromatography and mass spectrometry. With MS/MS scan resolving powers of ~50,000 or higher, product ions containing the C terminus (i.e. lysine) appear as a doublet spaced by exactly 36 mDa, whereas N-terminal fragments exist as a single m/z peak. Through theory and experiment, we demonstrate that over 90% of all y-type product ions have detectable doublets. We report on an algorithm that can extract these neutron signatures with high sensitivity and specificity. In other words, of 15,503 y-type product ion peaks, the y-type ion identification algorithm correctly identified 14,552 (93.2%) based on detection of the NeuCode doublet; 6.8% were misclassified (i.e. other ion types that were assigned as y-type products). Searching NeuCode labeled yeast with PepNovo(+) resulted in a 34% increase in correct de novo identifications relative to searching through MS/MS only. We use this tool to simplify spectra prior to database searching, to sort unmatched tandem mass spectra for spectral richness, for correlation of co-fragmented ions to their parent precursor, and for de novo sequence identification.
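The doublet signature described above can be illustrated with a simple peak-pairing sketch. This is not the authors' algorithm: it assumes a centroided peak list, a known fragment charge, one lysine per fragment, and a fixed absolute tolerance. The function name, the tolerance value and the example peaks are hypothetical.

```python
def find_doublets(mz_values, spacing=0.036, tol=0.003, charge=1):
    """Pair peaks separated by the NeuCode mass difference.

    spacing: mass difference between the two lysine isotopologues in Da (36 mDa).
    tol:     allowed deviation from the expected m/z spacing (assumed value).
    charge:  fragment charge; the observed m/z spacing is spacing / charge.
    Returns (doublets, singlets): paired peaks are candidate C-terminal
    (lysine-containing) ions, unpaired peaks are candidate N-terminal ions.
    """
    peaks = sorted(mz_values)
    expected = spacing / charge
    used = set()
    doublets = []

    for i, lower in enumerate(peaks):
        if i in used:
            continue
        for j in range(i + 1, len(peaks)):
            delta = peaks[j] - lower
            if delta > expected + tol:
                break  # peaks are sorted, so no later peak can match
            if j not in used and abs(delta - expected) <= tol:
                doublets.append((lower, peaks[j]))
                used.update((i, j))
                break

    singlets = [p for k, p in enumerate(peaks) if k not in used]
    return doublets, singlets


if __name__ == "__main__":
    # Made-up peak list: two doublets and one single peak.
    spectrum = [147.1128, 147.1488, 200.1000, 300.2000, 300.2360]
    pairs, singles = find_doublets(spectrum)
    print("candidate C-terminal ions:", pairs)
    print("candidate N-terminal ions:", singles)
```

Resolving a 36 mDa spacing requires the high MS/MS resolving power mentioned in the abstract; at lower resolution the two peaks of a doublet would merge and this pairing step would find nothing.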
Affiliation(s)
- Alicia L Richards
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
7
An M, Zou X, Wang Q, Zhao X, Wu J, Xu LM, Shen HY, Xiao X, He D, Ji J. High-confidence de novo peptide sequencing using positive charge derivatization and tandem MS spectra merging. Anal Chem 2013; 85:4530-7. [PMID: 23536960 DOI: 10.1021/ac4001699]
Abstract
De novo peptide sequencing holds great promise for discovering new protein sequences and modifications but has often been hindered by the low success rate of mass spectra interpretation, mainly due to the diversity of fragment ion types and insufficient information for each ion series. Here, we describe a novel methodology that combines highly efficient on-tip charge derivatization and tandem MS spectra merging, which greatly boosts the performance of interpretation. TMPP-Ac-OSu (succinimidyloxycarbonylmethyl tris(2,4,6-trimethoxyphenyl)phosphonium bromide) was used to derivatize peptides at their N-termini on tips to reduce mass spectra complexity. Then, a novel approach of spectra merging was adopted to combine the benefits of collision-induced dissociation (CID) and electron transfer dissociation (ETD) fragmentation. We applied this methodology to rat C6 glioma cells and Cyprinus carpio and searched the resulting peptide sequences against the protein database. We obtained thousands of high-confidence peptide sequences, a level that conventional de novo sequencing methods could not reach. Next, we identified dozens of novel peptide sequences by homology searching of sequences that were fully backbone covered but unmatched during the database search. Furthermore, we randomly chose 34 sequences discovered in rat C6 cells and verified them. Finally, we conclude that this novel methodology that combines on-tip positive charge derivatization and tandem MS spectra merging will greatly facilitate the discovery of novel proteins and the proteome analysis of nonmodel organisms.
Affiliation(s)
- Mingrui An
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Peking University, Beijing 100871, China
8
Altelaar AFM, Munoz J, Heck AJR. Next-generation proteomics: towards an integrative view of proteome dynamics. Nat Rev Genet 2012. [PMID: 23207911 DOI: 10.1038/nrg3356]
Abstract
Next-generation sequencing allows the analysis of genomes, including those representing disease states. However, the causes of most disorders are multifactorial, and systems-level approaches, including the analysis of proteomes, are required for a more comprehensive understanding. The proteome is extremely multifaceted owing to splicing and protein modifications, and this is further amplified by the interconnectivity of proteins into complexes and signalling networks that are highly divergent in time and space. Proteome analysis heavily relies on mass spectrometry (MS). MS-based proteomics is starting to mature and to deliver through a combination of developments in instrumentation, sample preparation and computational analysis. Here we describe this emerging next generation of proteomics and highlight recent applications.
Affiliation(s)
- A F Maarten Altelaar
- Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
9
Chong KF, Leong HW. Tutorial on de novo peptide sequencing using MS/MS mass spectrometry. J Bioinform Comput Biol 2012; 10:1231002. [DOI: 10.1142/s0219720012310026]
Abstract
This paper is a self-contained introductory tutorial on the problem in proteomics known as peptide sequencing using tandem mass spectrometry. This tutorial deals specifically with de novo sequencing methods (as opposed to database search methods). We first give an introduction to peptide sequencing, its importance and history and some background on proteins. Next we show the relationship between a peptide and the final spectrum produced from a tandem mass spectrometer, together with a description of the various sources of complications that arise during the process of generating the mass spectrum. From there we model the computational problem of de novo peptide sequencing, which is basically the reverse problem of identifying the peptide which produced the spectrum. We then present several major approaches to solve it (including reviewing some of the current algorithms in each approach), and also discuss related problems and post-processing approaches.
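The core idea behind de novo sequencing, reading residues from the mass differences between consecutive fragment peaks, can be shown with a toy sketch. It is not one of the algorithms reviewed in the tutorial: it assumes a complete, noise-free ladder from a single ion series, whereas real methods must handle missing peaks, noise and mixed ion types. The function name and tolerance are hypothetical.

```python
# Standard monoisotopic residue masses in Da. Isoleucine is omitted because
# it is isobaric with leucine and cannot be distinguished by mass alone.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(peaks, tol=0.01):
    """Translate gaps between consecutive ladder peaks into residues.

    peaks: m/z values of one singly charged ion series (e.g. b1, b2, ...).
    tol:   maximum allowed mass error in Da (assumed value).
    Gaps that match no residue within `tol` are reported as '?'.
    """
    ordered = sorted(peaks)
    sequence = []
    for lower, upper in zip(ordered, ordered[1:]):
        gap = upper - lower
        # Pick the residue whose mass is closest to the observed gap.
        best = min(RESIDUE_MASS, key=lambda aa: abs(RESIDUE_MASS[aa] - gap))
        sequence.append(best if abs(RESIDUE_MASS[best] - gap) <= tol else "?")
    return "".join(sequence)


if __name__ == "__main__":
    # Build an ideal b-ion ladder for a made-up peptide, then read it back.
    proton = 1.007276
    ladder, running = [], proton
    for aa in "SAMPLE":
        running += RESIDUE_MASS[aa]
        ladder.append(running)
    print(read_ladder(ladder))  # the first residue is not recoverable from gaps alone
```

The first residue is lost because only differences between peaks are used; the approaches surveyed in the tutorial recover it, and cope with imperfect spectra, by modelling the whole spectrum rather than a single clean ladder.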
Affiliation(s)
- Ket Fah Chong
- Department of Computer Science, National University of Singapore, 3 Science Drive 2, Singapore 117543, Singapore
- Hon Wai Leong
- Department of Computer Science, National University of Singapore, 3 Science Drive 2, Singapore 117543, Singapore