1
Shirvanizadeh N, Vihinen M. VariBench, new variation benchmark categories and data sets. Frontiers in Bioinformatics 2023; 3:1248732. [PMID: 37795169 PMCID: PMC10546188 DOI: 10.3389/fbinf.2023.1248732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 09/08/2023] [Indexed: 10/06/2023] Open
Affiliation(s)
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, Lund, Sweden
2
Yang Y, Chong Z, Vihinen M. PON-Fold: Prediction of Substitutions Affecting Protein Folding Rate. Int J Mol Sci 2023; 24:13023. [PMID: 37629203 PMCID: PMC10455311 DOI: 10.3390/ijms241613023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 08/08/2023] [Accepted: 08/09/2023] [Indexed: 08/27/2023] Open
Abstract
Most proteins fold into characteristic three-dimensional structures. The rate of folding and unfolding varies widely and can be affected by variations in proteins. We developed a novel machine-learning-based method for the prediction of the folding rate effects of amino acid substitutions in two-state folding proteins. We collected a data set of experimentally defined folding rates for variants and used them to train a gradient boosting algorithm starting with 1161 features. Two predictors were designed. The three-class classifier had, in blind tests, specificity and sensitivity ranging from 0.324 to 0.419 and from 0.256 to 0.451, respectively. The other tool was a regression predictor that showed a Pearson correlation coefficient of 0.525. The error measures, mean absolute error and mean squared error, were 0.581 and 0.603, respectively. One of the previously presented tools could be used for comparison with the blind test data set; our method, called PON-Fold, showed superior performance on all measures used. The applicability of the tool was tested by predicting all possible substitutions in a protein domain. Predictions for different conformations of proteins, open and closed forms of a protein kinase, and apo and holo forms of an enzyme indicated that the choice of the structure had a large impact on the outcome. PON-Fold is freely available.
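The three regression measures named in this abstract (Pearson correlation coefficient, mean absolute error, mean squared error) can be illustrated with a minimal sketch. The function names and the observed/predicted values below are hypothetical and are not taken from PON-Fold or its data set; the sketch only shows how such measures are conventionally computed.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical folding-rate changes, for illustration only.
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.0, 1.5, 3.0]

print(pearson(observed, predicted))
print(mean_absolute_error(observed, predicted))
print(mean_squared_error(observed, predicted))
```

Because the squared error penalizes large deviations more heavily than the absolute error, reporting both, as the abstract does, gives some indication of whether errors are dominated by a few outliers.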
Affiliation(s)
- Yang Yang
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China
  - Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210000, China
- Zhang Chong
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, BMC B13, SE-221 84 Lund, Sweden
3
Verma V, Kumar A, Partap M, Thakur M, Bhargava B. CRISPR-Cas: A robust technology for enhancing consumer-preferred commercial traits in crops. Frontiers in Plant Science 2023; 14:1122940. [PMID: 36824195 PMCID: PMC9941649 DOI: 10.3389/fpls.2023.1122940] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 12/13/2022] [Accepted: 01/16/2023] [Indexed: 06/18/2023]
Abstract
The acceptance of new crop varieties by consumers is contingent on the presence of consumer-preferred traits, which include sensory attributes, nutritional value, and the production of industrial products and bioactive compounds. Recent developments in genome editing technologies provide novel insight for identifying gene functions and improving the various qualitative and quantitative traits of commercial importance in plants. Various conventional as well as advanced gene-mutagenesis techniques, such as physical and chemical mutagenesis, CRISPR-Cas9, Cas12 and base editors, are used for trait improvement in crops. To meet consumer demand, breakthrough biotechnologies, especially CRISPR-Cas, have received a fair share of scientific and industrial interest, particularly in plant genome editing. CRISPR-Cas is a versatile tool that can be used to knock out, replace and knock in the desired gene fragments at targeted locations in the genome, resulting in heritable mutations of interest. This review highlights the existing literature and recent developments in CRISPR-Cas technologies (base editing, prime editing, multiplex gene editing, epigenome editing, gene delivery methods) for reliable and precise gene editing in plants. This review also discusses the potential of gene editing exhibited in crops for the improvement of consumer-demanded traits such as higher nutritional value, colour, texture, aroma/flavour, and production of industrial products such as biofuel, fibre, rubber and pharmaceuticals. In addition, the bottlenecks and challenges associated with gene editing systems, such as off-targeting, ploidy level and the ability to edit organelle genomes, are also discussed.
Affiliation(s)
- Vipasha Verma
  - Floriculture Laboratory, Agrotechnology Division, Council of Scientific and Industrial Research (CSIR) –Institute of Himalayan Bioresource Technology (IHBT), Palampur, India
- Akhil Kumar
  - Floriculture Laboratory, Agrotechnology Division, Council of Scientific and Industrial Research (CSIR) –Institute of Himalayan Bioresource Technology (IHBT), Palampur, India
- Mahinder Partap
  - Floriculture Laboratory, Agrotechnology Division, Council of Scientific and Industrial Research (CSIR) –Institute of Himalayan Bioresource Technology (IHBT), Palampur, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India
- Meenakshi Thakur
  - Floriculture Laboratory, Agrotechnology Division, Council of Scientific and Industrial Research (CSIR) –Institute of Himalayan Bioresource Technology (IHBT), Palampur, India
- Bhavya Bhargava
  - Floriculture Laboratory, Agrotechnology Division, Council of Scientific and Industrial Research (CSIR) –Institute of Himalayan Bioresource Technology (IHBT), Palampur, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India
4
Zhai L, Wang L, Hu H, Liu Q, Lee S, Tan M, Zhang Y. PBC, an easy and efficient strategy for high-throughput protein C-terminome profiling. Front Cell Dev Biol 2022; 10:995590. [PMID: 36120566 PMCID: PMC9471192 DOI: 10.3389/fcell.2022.995590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2022] [Accepted: 08/01/2022] [Indexed: 11/13/2022] Open
Abstract
High-throughput profiling of protein C-termini is still a challenging task. Proteomics provides a powerful technology for systematic and high-throughput study of protein C-termini. Various C-terminal peptide enrichment strategies based on chemical derivatization and chromatography separation have been reported. However, they are still costly and time-consuming, with low enrichment efficiency for C-terminal peptides. In this study, by taking advantage of the high reaction selectivity of 2-pyridinecarboxaldehyde (2-PCA) with an α-amino group on the peptide N-terminus and the high affinity between biotin and streptavidin, we developed a 2-PCA- and biotin labeling-based C-terminomic (PBC) strategy for high-efficiency and high-throughput analysis of the protein C-terminome. Triplicates of PBC experiments identified a total of 1,975 C-terminal peptides corresponding to 1,190 proteins from the 293T cell line, which is 180% higher than the highest reported number of C-terminal peptides identified from mammalian cells by a chemical derivatization-based C-terminomics study. The enrichment efficiency (68%) is the highest among the C-terminomics methods currently reported. In addition, we not only uncovered 50 proteins with truncated C-termini, which were significantly enriched in extracellular exosome, vesicle, and ribosome by a bioinformatic analysis, but also systematically characterized the PTMs on C-termini in 293T cells, suggesting PBC as a powerful tool for protein C-terminal degradomics and PTM investigation. In conclusion, the PBC strategy would benefit high-efficiency and high-throughput profiling of the protein C-terminome.
Affiliation(s)
- Linhui Zhai
  - School of Chinese Materia Medica, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing, Jiangsu, China
  - Jiangsu Key Laboratory for Functional Substances of Chinese Medicine, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing, Jiangsu, China
  - State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Le Wang
  - School of Chinese Materia Medica, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing, Jiangsu, China
  - State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Hao Hu
  - State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Quan Liu
  - State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Sangkyu Lee
  - College of Pharmacy and Research Institute of Pharmaceutical Sciences, Kyungpook National University, Daegu, South Korea
- Minjia Tan
  - School of Chinese Materia Medica, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing, Jiangsu, China
  - State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Yinan Zhang
  - School of Chinese Materia Medica, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing, Jiangsu, China
  - Jiangsu Key Laboratory for Functional Substances of Chinese Medicine, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing, Jiangsu, China
5
Merlotti A, Menichetti G, Fariselli P, Capriotti E, Remondini D. Network-based strategies for protein characterization. Advances in Protein Chemistry and Structural Biology 2021; 127:217-248. [PMID: 34340768 DOI: 10.1016/bs.apcsb.2021.05.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Abstract
Protein structure characterization is fundamental to understanding protein properties, such as the folding process and protein resistance to thermal stress, up to unveiling organism pathologies (e.g., prion disease). In this chapter, we provide an overview of how the spectral properties of the networks reconstructed from the Protein Contact Map (PCM) can be used to generate informative observables. As a specific case study, we apply two different network approaches to an example protein dataset, with the aim of discriminating protein folding state and reconstructing protein 3D structure.
Affiliation(s)
- Giulia Menichetti
  - Center for Complex Network Research, Department of Physics, Northeastern University, Boston, MA, United States; Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States
- Piero Fariselli
  - Department of Medical Sciences, University of Torino, Turin, Italy
- Emidio Capriotti
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Daniel Remondini
  - Department of Physics and Astronomy, University of Bologna, Bologna, Italy
6
Prathiviraj R, Chellapandi P. Deciphering Molecular Virulence Mechanism of Mycobacterium tuberculosis Dop isopeptidase Based on Its Sequence-Structure-Function Linkage. Protein J 2020; 39:33-45. [PMID: 31760575 DOI: 10.1007/s10930-019-09876-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/04/2023]
Abstract
The pupylation pathway marks proteins for prokaryotic ubiquitin-like protein (Pup)-proteasomal degradation and is a survival strategy of mycobacteria inside host macrophages. Deamidase of Pup (Dop) plays a central role in the pupylation pathway. The function of Dop in the virulence of the mycobacterial lineage is still under investigation. Hence, the present study was intended to describe the sequence-structure-function-virulence link of Dop for understanding the molecular virulence mechanism of Mycobacterium tuberculosis H37Rv (Mtb). Phylogenetic analysis in this study indicated that Dop has extensively diverged across the proteasome-harboring bacteria. The functional part of Dop has converged across the pathogenic mycobacterial lineage. The genome-wide analysis pointed out that the pupylation gene locus was identical across species, but its genome neighborhood differed from species to species. Molecular modeling and dynamics studies indicated that the predicted structure of Mtb Dop was energetically stable and had low conformational freedom. Moreover, evolutionary constraints in Mtb Dop were intensively analyzed for inferring its sequence-structure-function relationships for the full virulence of Mtb. The analysis indicated that evolutionary optimization was extensively required to stabilize its local structural environment at the side chains of mutable residues. The sequence-structure-function-virulence link of Dop might have been retained in Mtb by reordering hydrophobic and hydrogen bonding patterns in the local structural environment. Thus, the results of our study provide a basis for understanding the molecular virulence and pathogenesis mechanisms of Mtb during the infection process.
Affiliation(s)
- R Prathiviraj
  - Molecular Systems Engineering Lab, Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
- P Chellapandi
  - Molecular Systems Engineering Lab, Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
7
Liu L, Ma M, Cui J. A novel model-based on FCM-LM algorithm for prediction of protein folding rate. J Bioinform Comput Biol 2017; 15:1750012. [PMID: 28513252 DOI: 10.1142/s0219720017500123] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022]
Abstract
The prediction of protein folding rates is of paramount importance in describing the protein folding mechanism, which has broad applications in fields such as enzyme engineering and protein engineering. Therefore, predicting protein folding rates using the first-order of protein sequence, secondary structure and amino acid properties has become a very active research topic in recent years. This paper presents a new fuzzy cognitive map (FCM) model based on deep learning neural networks which uses data obtained from biological experiments to predict the protein folding rate. FCM extracts the important data features from the protein sequence which then initializes the deep neural networks effectively. It was found that the Levenberg-Marquardt (LM) algorithm for deep neural networks can improve the prediction accuracy of the protein folding rates. The correlation coefficient between the predicted values and those real values obtained from experiments reached 0.94 and 0.9 in two independent numerical tests.
Affiliation(s)
- Longlong Liu
  - Department of Mathematics, Ocean University of China, Qingdao 266000, P. R. China
- Mingjiao Ma
  - Department of Mathematics, Ocean University of China, Qingdao 266000, P. R. China
- Jing Cui
  - Department of Mathematics, Ocean University of China, Qingdao 266000, P. R. China
8
Molecular Evolutionary Constraints that Determine the Avirulence State of Clostridium botulinum C2 Toxin. J Mol Evol 2017; 84:174-186. [DOI: 10.1007/s00239-017-9791-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 06/01/2016] [Accepted: 03/30/2017] [Indexed: 10/19/2022]
9
Mallik S, Das S, Kundu S. Predicting protein folding rate change upon point mutation using residue-level coevolutionary information. Proteins 2015; 84:3-8. [DOI: 10.1002/prot.24960] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 07/15/2015] [Revised: 11/11/2015] [Accepted: 11/11/2015] [Indexed: 11/10/2022]
Affiliation(s)
- Saurav Mallik
  - Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, Kolkata 700009, India
  - Center of Excellence in Systems Biology and Biomedical Engineering (TEQIP Phase-II), University of Calcutta, Kolkata 700009, India
- Smita Das
  - Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, Kolkata 700009, India
- Sudip Kundu
  - Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, Kolkata 700009, India
  - Center of Excellence in Systems Biology and Biomedical Engineering (TEQIP Phase-II), University of Calcutta, Kolkata 700009, India
10
Co-evolutionary constraints of globular proteins correlate with their folding rates. FEBS Lett 2015; 589:2179-85. [DOI: 10.1016/j.febslet.2015.06.032] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 03/18/2015] [Revised: 06/09/2015] [Accepted: 06/24/2015] [Indexed: 11/20/2022]
11
Chaudhary P, Naganathan AN, Gromiha MM. Folding RaCe: a robust method for predicting changes in protein folding rates upon point mutations. Bioinformatics 2015; 31:2091-7. [PMID: 25686635 DOI: 10.1093/bioinformatics/btv091] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 08/11/2014] [Accepted: 02/10/2015] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Protein engineering methods are commonly employed to decipher the folding mechanism of proteins and enzymes. However, such experiments are exceedingly time- and resource-intensive. It would therefore be advantageous to develop a simple computational tool to predict changes in folding rates upon mutations. Such a method should be able to rapidly provide the sequence position and chemical nature to modulate through mutation, to effect a particular change in rate. This can be of importance in protein folding, function or mechanistic studies. RESULTS: We have developed a robust knowledge-based methodology to predict the changes in folding rates upon mutations, formulated from amino acid properties using a multiple linear regression approach. We benchmarked this method against an experimental database of 790 point mutations from 26 two-state proteins. Mutants were first classified according to secondary structure, accessible surface area and position along the primary sequence. Three prime amino acid features eliciting the best relationship with folding rate change were then shortlisted for each class along with an optimized window length. We obtained a self-consistent mean absolute error (MAE) of 0.36 s⁻¹ and a mean Pearson correlation coefficient (PCC) of 0.81. The jack-knife test resulted in an MAE of 0.42 s⁻¹ and a PCC of 0.73. Moreover, our method highlights the importance of outlier detection and of studying the implications of outliers in the folding mechanism. AVAILABILITY AND IMPLEMENTATION: A web server 'Folding RaCe' has been developed and is available at http://www.iitm.ac.in/bioinfo/proteinfolding/foldingrace.html. CONTACT: gromiha@iitm.ac.in SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
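The difference between the self-consistent and jack-knife figures reported above comes from the validation scheme: in a jack-knife (leave-one-out) test, each mutant is predicted by a model fitted on all the other mutants. A minimal sketch follows, assuming a single-feature least-squares fit for brevity; the abstract describes multiple linear regression with three features per class. The function names and data values are hypothetical, not taken from Folding RaCe.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def self_consistent_mae(xs, ys):
    """MAE when the model is fitted and evaluated on the same data."""
    slope, intercept = fit_line(xs, ys)
    return sum(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)) / len(xs)

def jackknife_mae(xs, ys):
    """MAE when each point is predicted by a model fitted without it."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(abs(ys[i] - (slope * xs[i] + intercept)))
    return sum(errors) / len(errors)

# Hypothetical property differences and folding-rate changes, for illustration only.
feature = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5]
rate_change = [0.2, 0.5, 0.9, 1.1, 1.6, 1.7]

print(self_consistent_mae(feature, rate_change))
print(jackknife_mae(feature, rate_change))
```

The jack-knife error is usually the larger of the two because the held-out point never influences the fit, which is consistent with the direction of the difference reported in the abstract.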
Affiliation(s)
- Priyashree Chaudhary
  - Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600 036, India
- Athi N Naganathan
  - Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600 036, India
- M Michael Gromiha
  - Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600 036, India
12
Gupta S, Chavan S, Deobagkar DN, Deobagkar DD. Bio/chemoinformatics in India: an outlook. Brief Bioinform 2014; 16:710-31. [PMID: 25159593 DOI: 10.1093/bib/bbu028] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/21/2014] [Accepted: 07/28/2014] [Indexed: 12/25/2022] Open
Abstract
With the significant establishment and development of Internet facilities and computational infrastructure, an overview of bio/chemoinformatics is presented along with its multidisciplinary facts, promises and challenges. The Government of India has paved the way for more profound research in the biological field through the use of computational facilities and schemes/projects for collaborating with scientists from different disciplines. Simultaneously, the growth of available biomedical data has provided fresh insight into the nature of redundant and compensatory data. Today, bioinformatics research in India is characterized by powerful grid computing systems, a great variety of biological questions addressed and close collaborations between scientists and clinicians, with a full spectrum of focuses ranging from database building and methods development to biological discoveries. This outlook provides a resourceful platform highlighting the funding agencies, institutes and industries working in this direction, which should be of great help to students seeking a career in bioinformatics. In short, this review highlights the current trends, education, status, diverse applicability and demands for further development of bio/chemoinformatics.
13
Compiani M, Capriotti E. Computational and theoretical methods for protein folding. Biochemistry 2013; 52:8601-24. [PMID: 24187909 DOI: 10.1021/bi4001529] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Indexed: 12/12/2022]
Abstract
A computational approach is essential whenever the complexity of the process under study is such that direct theoretical or experimental approaches are not viable. This is the case for protein folding, for which a significant amount of data are being collected. This paper reports on the essential role of in silico methods and the unprecedented interplay of computational and theoretical approaches, which is a defining point of the interdisciplinary investigations of the protein folding process. Besides giving an overview of the available computational methods and tools, we argue that computation plays not merely an ancillary role but has a more constructive function in that computational work may precede theory and experiments. More precisely, computation can provide the primary conceptual clues to inspire subsequent theoretical and experimental work even in a case where no preexisting evidence or theoretical frameworks are available. This is cogently manifested in the application of machine learning methods to come to grips with the folding dynamics. These close relationships suggested complementing the review of computational methods within the appropriate theoretical context to provide a self-contained outlook of the basic concepts that have converged into a unified description of folding and have grown in a synergic relationship with their computational counterpart. Finally, the advantages and limitations of current computational methodologies are discussed to show how the smart analysis of large amounts of data and the development of more effective algorithms can improve our understanding of protein folding.
Affiliation(s)
- Mario Compiani
  - School of Sciences and Technology, University of Camerino, Camerino, Macerata 62032, Italy
14
Gromiha MM, Harini K, Sowdhamini R, Fukui K. Relationship between amino acid properties and functional parameters in olfactory receptors and discrimination of mutants with enhanced specificity. BMC Bioinformatics 2012; 13 Suppl 7:S1. [PMID: 22594995 PMCID: PMC3348020 DOI: 10.1186/1471-2105-13-s7-s1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Olfactory receptors are key components in signal transduction. Mutations in olfactory receptors alter the odor response, which is a fundamental response of organisms to their immediate environment. Understanding the relationship between odorant response and mutations in olfactory receptors is an important problem in bioinformatics and computational biology. In this work, we have systematically analyzed the relationship between various physical, chemical, energetic and conformational properties of amino acid residues, and the change of odor response/compound's potency/half maximal effective concentration (EC50) due to amino acid substitutions. RESULTS: We observed that both the characteristics of the odorant molecule (ligand) and amino acid properties are important for odor response and EC50. Additional information on neighboring and surrounding residues of the mutants enhanced the correlation between amino acid properties and EC50. Further, amino acid properties have been combined systematically using multiple regression techniques, and we obtained a correlation of 0.90-0.98 with odor response/EC50 of goldfish, mouse and human olfactory receptors. In addition, we have utilized machine learning methods to discriminate the mutants that enhance or reduce EC50 values upon mutation, and we obtained an accuracy of 93% and 79% for self-consistency and jack-knife tests, respectively. CONCLUSIONS: Our analysis provides deep insights for understanding the odor response of olfactory receptor mutants, and the present method could be used for identifying mutants with enhanced specificity.
Affiliation(s)
- M Michael Gromiha
  - Department of Biotechnology, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
15
Grading amino acid properties increased accuracies of single point mutation on protein stability prediction. BMC Bioinformatics 2012; 13:44. [PMID: 22435732 PMCID: PMC3820156 DOI: 10.1186/1471-2105-13-44] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/08/2011] [Accepted: 03/22/2012] [Indexed: 11/23/2022] Open
Abstract
Background: Protein stability can sometimes be affected by point mutations introduced into the protein. Current sequence-information-based encoding schemes for machine learning approaches to protein stability prediction include sparse encoding and amino acid property encoding. Property encoding schemes employ physical-chemical information about the mutated protein environments; however, they also add complexity when many properties are joined in the scheme. The complexity introduces noise that affects the accuracy of machine learning algorithms. To overcome this problem, we describe a new encoding scheme that grades the twenty amino acids into groups according to their specific property values. Results: We employed three predefined values, 0.1, 0.5, and 0.9, to represent the 'weak', 'middle', and 'strong' groups for each amino acid property, and introduced two thresholds for each property to split the twenty amino acids into one of the three groups according to their property values. Each amino acid can take only one of three predefined values rather than twenty different values for each property. The complexity and noise in the encoding schemes were reduced in this way. An average accuracy improvement of more than 7% was found in the graded amino acid property encoding schemes by 20-fold cross validation. The overall accuracy of our method is more than 72% when performed on the independent test sets starting from sequence information with three-state prediction definitions. Conclusions: Grading the numeric values of amino acid properties can reduce the noise and complexity of the input information. It is in accordance with biochemical concepts for amino acid properties and simplifies the input data at the same time. The idea of graded property encoding schemes may be applied to protein-related predictions with machine learning approaches.
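The grading step described in this abstract (two thresholds per property, three predefined values 0.1, 0.5 and 0.9) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the threshold values, and the choice of which side of each threshold is inclusive are hypothetical, and the property values are a small subset of the Kyte-Doolittle hydropathy scale chosen only as an example property.

```python
# Predefined grade values named in the abstract.
WEAK, MIDDLE, STRONG = 0.1, 0.5, 0.9

def grade_property(values, low_threshold, high_threshold):
    """Map each residue's raw property value to one of three grades.

    Assumption: values below low_threshold are 'weak', values at or above
    high_threshold are 'strong', and everything in between is 'middle'.
    """
    graded = {}
    for residue, value in values.items():
        if value < low_threshold:
            graded[residue] = WEAK
        elif value < high_threshold:
            graded[residue] = MIDDLE
        else:
            graded[residue] = STRONG
    return graded

def encode_sequence(sequence, graded):
    """Encode a sequence as a list of graded property values."""
    return [graded[residue] for residue in sequence]

# Example property: Kyte-Doolittle hydropathy for a subset of residues.
hydropathy = {"I": 4.5, "V": 4.2, "L": 3.8, "A": 1.8,
              "G": -0.4, "S": -0.8, "K": -3.9, "R": -4.5}

# Hypothetical thresholds, chosen only for this illustration.
graded_hydropathy = grade_property(hydropathy, low_threshold=-1.0, high_threshold=2.0)

print(encode_sequence("IAK", graded_hydropathy))
```

With this encoding, each property contributes one of three values per residue rather than up to twenty distinct values, which is the reduction in input complexity that the abstract describes.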
16
Real value prediction of protein folding rate change upon point mutation. J Comput Aided Mol Des 2012; 26:339-47. [DOI: 10.1007/s10822-012-9560-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/21/2011] [Accepted: 03/02/2012] [Indexed: 10/28/2022]
17
Carugo O. Participation of protein sequence termini in crystal contacts. Protein Sci 2011; 20:2121-4. [PMID: 21739502 DOI: 10.1002/pro.690] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 04/12/2011] [Revised: 06/27/2011] [Accepted: 06/30/2011] [Indexed: 11/08/2022]
Abstract
The analysis of the crystal packing interactions in a nonredundant set of high-resolution, monomeric globular protein crystal structures shows that the residues located at the N- and C-termini of the sequence tend to participate in packing interactions more often than expected and that they often interact with each other. Since the sequence termini are, in general, conformationally very flexible and since they host electrical charges of opposite sign, it can be hypothesized that they play a crucial role in the early formation of the nonphysiological contacts that lead to protein crystallization. It is thus not surprising that modest lengthening or shortening of the sequence termini often has a dramatic effect on protein crystallogenesis.