1
Nikam R, Jemimah S, Gromiha MM. DeepPPAPredMut: deep ensemble method for predicting the binding affinity change in protein-protein complexes upon mutation. Bioinformatics 2024; 40:btae309. [PMID: 38718170 PMCID: PMC11112046 DOI: 10.1093/bioinformatics/btae309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 04/08/2024] [Accepted: 05/08/2024] [Indexed: 05/24/2024] Open
Abstract
MOTIVATION: Protein-protein interactions underpin many cellular processes, and their disruption due to mutations can lead to disease. With the evolution of protein structure prediction methods such as AlphaFold2 and the availability of extensive experimental affinity data, there is a pressing need for updated computational tools that can efficiently predict changes in binding affinity caused by mutations in protein-protein complexes. RESULTS: We developed a deep ensemble model that leverages protein sequences, predicted structure-based features, and protein functional classes to accurately predict the change in binding affinity due to mutations. The model achieved a correlation of 0.97 and a mean absolute error (MAE) of 0.35 kcal/mol on the training dataset, and maintained robust performance on the test set with a correlation of 0.72 and an MAE of 0.83 kcal/mol. Further validation using Leave-One-Out Complex (LOOC) cross-validation gave a correlation of 0.83 and an MAE of 0.51 kcal/mol, indicating consistent performance. AVAILABILITY AND IMPLEMENTATION: https://web.iitm.ac.in/bioinfo2/DeepPPAPredMut/index.html.
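For readers unfamiliar with the evaluation terms in this abstract, the following is a minimal sketch of the Pearson correlation, the mean absolute error, and a leave-one-complex-out split. It is not the authors' code: the function names and the idea of grouping mutations by a `complex_ids` list are assumptions made for illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def mean_absolute_error(xs, ys):
    """Average absolute difference, e.g. in kcal/mol for affinity changes."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


def leave_one_complex_out(complex_ids):
    """Yield (held_out_complex, train_indices, test_indices).

    Hypothetical helper: every mutation belonging to one complex is held
    out together, so no complex appears in both training and test folds.
    """
    for held_out in sorted(set(complex_ids)):
        test_idx = [i for i, c in enumerate(complex_ids) if c == held_out]
        train_idx = [i for i, c in enumerate(complex_ids) if c != held_out]
        yield held_out, train_idx, test_idx
```

Grouping by complex, rather than by individual mutation, avoids the situation where mutations from the same complex are seen during training and then scored at test time.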
2
Praveen Kumar PK, Sundar H, Balakrishnan K, Subramaniam S, Ramachandran H, Kevin M, Michael Gromiha M. The Role of HSP90 and TRAP1 Targets on Treatment in Hepatocellular Carcinoma. Mol Biotechnol 2024:10.1007/s12033-024-01151-4. [PMID: 38684604 DOI: 10.1007/s12033-024-01151-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 03/18/2024] [Indexed: 05/02/2024]
Abstract
Hepatocellular carcinoma (HCC) is the predominant form of liver cancer and arises from dysregulation of the cell cycle control machinery. Heat shock protein 90 (HSP90) and mitochondrial HSP90, also referred to as TRAP1, are critical chaperone targets for the early diagnosis and targeting of HCC. Expression of both HSP90 and TRAP1 was found to be higher in HCC patients. Hence, the mechanisms of HSP90 and TRAP1 inhibitors and the mitochondria-targeted delivery of those inhibitors are widely studied. This review also focuses on the importance of protein-protein interactions of HSP90 and TRAP1 and the association of their interacting proteins with various pathways of HCC. To further elucidate the mechanism, systems biology and computational biology studies on the inhibition of HSP90 and its mitochondrial type by herbal plant molecules in HCC are also explored.
3
Shanmugam NRS, Kulandaisamy A, Veluraja K, Gromiha MM. CarbDisMut: database on neutral and disease-causing mutations in human carbohydrate-binding proteins. Glycobiology 2024; 34:cwae011. [PMID: 38335248 DOI: 10.1093/glycob/cwae011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 01/03/2024] [Indexed: 02/12/2024] Open
Abstract
Protein-carbohydrate interactions are involved in several cellular and biological functions. Integrating the structure and function of carbohydrate-binding proteins with disease-causing mutations helps to understand the molecular basis of diseases. Although databases are available for protein-carbohydrate complexes based on structure, binding affinity and function, no specific database for mutations in human carbohydrate-binding proteins is reported in the literature. We have developed a novel database, CarbDisMut, a comprehensive integrated resource for disease-causing mutations with sequence and structural features. It has 1.17 million disease-associated mutations and 38,636 neutral mutations from 7,187 human carbohydrate-binding proteins. The database is freely available at https://web.iitm.ac.in/bioinfo2/carbdismut. The website is implemented using HTML, PHP and JavaScript and supports recent versions of all major browsers, such as Firefox, Chrome and Opera.
4
Ridha F, Gromiha MM. MPA-Pred: A machine learning approach for predicting the binding affinity of membrane protein-protein complexes. Proteins 2024; 92:499-508. [PMID: 37949651 DOI: 10.1002/prot.26633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 10/05/2023] [Accepted: 10/25/2023] [Indexed: 11/12/2023]
Abstract
Membrane protein-protein interactions are essential for several functions including cell signaling, ion transport, and enzymatic activity. These interactions are mainly dictated by their binding affinities. Although several methods are available for predicting the binding affinity of protein-protein complexes, there exists no specific method for membrane protein-protein complexes. In this work, we collected the experimental binding affinity data for a set of 114 membrane protein-protein complexes and derived several structure and sequence-based features. Our analysis of the relationship between binding affinity and the features revealed that the important factors mainly depend on the type of membrane protein and the functional class of the protein. Specifically, aromatic and charged residues at the interface, and aromatic-aromatic and electrostatic interactions, are found to be important for understanding the binding affinity. Further, we developed a method, MPA-Pred, for predicting the binding affinity of membrane protein-protein complexes using a machine learning approach. It showed an average correlation and mean absolute error of 0.83 and 0.91 kcal/mol, respectively, using the jack-knife test on a set of 114 complexes. We have also developed a web server, which is available at https://web.iitm.ac.in/bioinfo2/MPA-Pred/. This method can be used for predicting the affinity of membrane protein-protein complexes at a large scale and can help improve drug design strategies.
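The jack-knife test mentioned here is a leave-one-out procedure: each complex is predicted by a model fitted on all the others. The sketch below illustrates the idea with a single-feature least-squares line. This is an assumption for brevity; the published method uses machine learning on several features, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the training x values are not all identical
    return slope, my - slope * mx


def jackknife_predictions(xs, ys):
    """Predict each point from a model trained on every other point."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        preds.append(slope * xs[i] + intercept)
    return preds
```

The correlation and MAE are then computed between `ys` and the returned predictions, so every reported number comes from data the fitted model did not see.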
5
Sankaran S, Krishnan SR, Sayed Y, Gromiha MM. Mechanism of drug resistance in HIV-1 protease subtype C in the presence of Atazanavir. Curr Res Struct Biol 2024; 7:100132. [PMID: 38435053 PMCID: PMC10907180 DOI: 10.1016/j.crstbi.2024.100132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 02/09/2024] [Accepted: 02/13/2024] [Indexed: 03/05/2024] Open
Abstract
AIDS, caused by HIV, is one of the deadliest diseases in the history of humankind. Despite technological development, curtailing the viral infection inside the human host remains a challenge. Therapies such as HAART use a combination of drugs to inhibit viral activity. One of the important targets is HIV protease, and inhibiting its activity minimizes the production of mature structural proteins. However, genetic diversity and the occurrence of drug-resistant mutations add complexity to effective drug design. In this study, we aimed at understanding the drug binding mechanism of one such subtype, namely subtype C, and its insertion variant L38HL. We performed multiple molecular dynamics simulations along with binding free energy analysis of wild-type and L38HL bound to Atazanavir (ATV). The analysis revealed that the insertion alters the hydrogen bond and hydrophobic interaction networks. The alterations in the interaction networks increase flexibility at the hinge-fulcrum interface, and these changes in turn affect flap tip curling. Moreover, the changes in the hinge-fulcrum-cantilever interface alter the concerted motion of the functional regions, changing the direction of flap movement and thus causing a subtle change in the active site volume. Additionally, the formation of intramolecular hydrogen bonds in the ATV docked to L38HL restricted the movement of the R1 and R2 groups, thereby altering the interactions. Overall, the changes in flap flexibility, together with the changes in the active site volume and compactness of the ligand, provide insights into the increased binding affinity of ATV with L38HL.
6
Sharma D, Rawat P, Greiff V, Janakiraman V, Gromiha MM. Predicting the immune escape of SARS-CoV-2 neutralizing antibodies upon mutation. Biochim Biophys Acta Mol Basis Dis 2024; 1870:166959. [PMID: 37967796 DOI: 10.1016/j.bbadis.2023.166959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 10/25/2023] [Accepted: 11/07/2023] [Indexed: 11/17/2023]
Abstract
COVID-19 has resulted in millions of deaths and a severe impact on economies worldwide. Moreover, the emergence of SARS-CoV-2 variants presented significant challenges in controlling the pandemic, particularly their potential to avoid the immune system and evade vaccine immunity. This has led to a growing need for research to predict how mutations in SARS-CoV-2 reduce the ability of antibodies to neutralize the virus. In this study, we assembled a set of 1813 mutations from the interface of SARS-CoV-2 spike protein's receptor binding domain (RBD) and neutralizing antibody complexes and developed a machine learning model to classify high or low escape mutations using interaction energy, inter-residue contacts and predicted binding free energy change. Our approach achieved an Area under the Receiver Operating Characteristics (ROC) Curve (AUC) of 0.91 using the Random Forest classifier on the test dataset with 217 mutations. The model was further utilized to predict the escape mutations on a dataset of 29,165 mutations located at the interface of 83 RBD-neutralizing antibody complexes. A small subset of this dataset was also validated based on available experimental data. We found that the top 10% of high escape mutations were dominated by charged to nonpolar mutations, whereas low escape mutations were dominated by polar to nonpolar mutations. We believe that the present method will allow prioritization of high/low escape mutations in the context of neutralizing antibodies targeting the SARS-CoV-2 RBD region and assist antibody design for current and emerging variants.
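The AUC quoted above can be read as the probability that a randomly chosen high escape mutation receives a higher score than a randomly chosen low escape one. A minimal sketch of that pairwise definition follows; the function name and the 1/0 label encoding are assumptions, and this is not the authors' pipeline.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. high escape), 0 for negative.
    scores: classifier scores, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The double loop is quadratic in the number of samples, which is fine for a few hundred test mutations; a rank-based formulation would be preferable for much larger sets.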
7
Harini K, Sekijima M, Gromiha MM. PRA-Pred: Structure-based prediction of protein-RNA binding affinity. Int J Biol Macromol 2024; 259:129490. [PMID: 38224813 DOI: 10.1016/j.ijbiomac.2024.129490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 01/10/2024] [Accepted: 01/12/2024] [Indexed: 01/17/2024]
Abstract
Understanding crucial factors that affect the binding affinity of protein-RNA complexes is vital for comprehending their recognition mechanisms. This study involved compiling experimentally measured binding affinity (ΔG) values of 217 protein-RNA complexes and extracting numerous structure-based features, considering RNA, protein, and interactions between protein and RNA. Our findings indicate the significance of RNA base-step parameters, interaction energies, number of atomic contacts in the complex, hydrogen bonds, and contact potentials in understanding the binding affinity. Further, we observed that these factors are influenced by the type of RNA strand and the function of the protein in a protein-RNA complex. Multiple regression equations were developed for different classes of complexes to predict the binding affinity between the protein and RNA. We evaluated the models using the jack-knife test and achieved an overall correlation of 0.77 between the experimental and predicted binding affinities with a mean absolute error of 1.02 kcal/mol. Furthermore, we introduced a web server, PRA-Pred, intended for the prediction of protein-RNA binding affinity, and it is freely accessible through https://web.iitm.ac.in/bioinfo2/prapred/. We propose that our approach could function as a potential resource for investigating protein-RNA recognition and developing therapeutic strategies.
8
Krishnan SR, Roy A, Gromiha MM. Reliable method for predicting the binding affinity of RNA-small molecule interactions using machine learning. Brief Bioinform 2024; 25:bbae002. [PMID: 38261341 PMCID: PMC10805179 DOI: 10.1093/bib/bbae002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 12/21/2023] [Accepted: 12/24/2023] [Indexed: 01/24/2024] Open
Abstract
Ribonucleic acids (RNAs) play important roles in cellular regulation. Consequently, dysregulation of both coding and non-coding RNAs has been implicated in several disease conditions in the human body. In this regard, there is growing interest in probing the potential of RNAs to act as drug targets in disease conditions. To accelerate this search for disease-associated novel RNA targets and their small molecular inhibitors, machine learning models for binding affinity prediction were developed specific to six RNA subtypes, namely aptamers, miRNAs, repeats, ribosomal RNAs, riboswitches and viral RNAs. We found that differences in RNA sequence composition, flexibility and polar nature of RNA-binding ligands are important for predicting the binding affinity. Our method showed an average Pearson correlation (r) of 0.83 and a mean absolute error of 0.66 upon evaluation using the jack-knife test, indicating the models' reliability despite the low amount of data available for several RNA subtypes. Further, the models were validated with external blind test datasets, on which they outperformed other existing quantitative structure-activity relationship (QSAR) models. We have developed a web server to host the models, RNA-Small molecule binding Affinity Predictor, which is freely available at: https://web.iitm.ac.in/bioinfo2/RSAPred/.
9
Pandey M, Shah SK, Gromiha MM. Computational approaches for identifying disease-causing mutations in proteins. Adv Protein Chem Struct Biol 2023; 139:141-171. [PMID: 38448134 DOI: 10.1016/bs.apcsb.2023.11.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2024]
Abstract
Advancements in genome sequencing have expanded the scope of investigating mutations in proteins across different diseases. Amino acid mutations in a protein alter its structure, stability and function and some of them lead to diseases. Identification of disease-causing mutations is a challenging task and it will be helpful for designing therapeutic strategies. Hence, mutation data available in the literature have been curated and stored in several databases, which have been effectively utilized for developing computational methods to identify deleterious mutations (drivers), using sequence and structure-based properties of proteins. In this chapter, we describe the contents of specific databases that have information on disease-causing and neutral mutations followed by sequence and structure-based properties. Further, characteristic features of disease-causing mutations will be discussed along with computational methods for identifying cancer hotspot residues and disease-causing mutations in proteins.
10
Nikam R, Yugandhar K, Gromiha MM. Deep learning-based method for predicting and classifying the binding affinity of protein-protein complexes. Biochim Biophys Acta Proteins Proteom 2023; 1871:140948. [PMID: 37567456 DOI: 10.1016/j.bbapap.2023.140948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Revised: 08/05/2023] [Accepted: 08/08/2023] [Indexed: 08/13/2023]
Abstract
Protein-protein interactions (PPIs) play a critical role in various biological processes. Accurately estimating the binding affinity of PPIs is essential for understanding the underlying molecular recognition mechanisms. In this study, we employed a deep learning approach to predict the binding affinity (ΔG) of protein-protein complexes. To this end, we compiled a dataset of 903 protein-protein complexes, each with its corresponding experimental binding affinity, which belong to six functional classes. We extracted 8 to 20 non-redundant features from the sequence information as well as the predicted three-dimensional structures using feature selection methods for each protein functional class. Our method showed an overall mean absolute error of 1.05 kcal/mol and a correlation of 0.79 between experimental and predicted ΔG values. Additionally, we evaluated our model for discriminating high and low affinity protein-protein complexes and it achieved an accuracy of 87% with an F1 score of 0.86 using 10-fold cross-validation on the selected features. Our approach presents an efficient tool for studying PPIs and provides crucial insights into the underlying mechanisms of the molecular recognition process. The web server can be freely accessed at https://web.iitm.ac.in/bioinfo2/DeepPPAPred/index.html.
11
Ramakrishna Reddy P, Kulandaisamy A, Michael Gromiha M. TMH Stab-pred: Predicting the stability of α-helical membrane proteins using sequence and structural features. Methods 2023; 218:118-124. [PMID: 37572768 DOI: 10.1016/j.ymeth.2023.08.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Revised: 08/02/2023] [Accepted: 08/04/2023] [Indexed: 08/14/2023] Open
Abstract
The folding and stability of transmembrane proteins (TMPs) are governed by the insertion of secondary structural elements into the cell membrane followed by their assembly. Understanding the important features that dictate the stability of TMPs is important for elucidating their functions. In this work, we related sequence and structure-based parameters with the free energy (ΔG°) of α-helical membrane proteins. Our results showed that the free energy transfer of hydrophobic peptides, relative contact order, total interaction energy, number of hydrogen bonds and lipid accessibility of transmembrane regions are important for stability. Further, we have developed multiple-regression models to predict the stability of α-helical membrane proteins using these features, and our method can predict the stability with a correlation and mean absolute error (MAE) of 0.89 and 1.21 kcal/mol, respectively, in a jack-knife test. The method was validated with a blind test set of three recently reported experimental ΔG° values, for which it predicted the stability with an average MAE of 0.51 kcal/mol. Further, we developed a web server for predicting the stability, and it is freely available at https://web.iitm.ac.in/bioinfo2/TMHS/. The importance of selected parameters and limitations are discussed.
12
Nikam R, Yugandhar K, Gromiha MM. DeepBSRPred: deep learning-based binding site residue prediction for proteins. Amino Acids 2023; 55:1305-1316. [PMID: 36574037 DOI: 10.1007/s00726-022-03228-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Accepted: 12/15/2022] [Indexed: 12/28/2022]
Abstract
MOTIVATION: Protein-protein interactions (PPIs) govern several cellular activities. Amino acid residues located at the interface are known as binding sites, and information about binding sites helps to understand the binding affinities and functions of protein-protein complexes. RESULTS: We have developed a deep neural network-based method, DeepBSRPred, for predicting the binding sites using protein sequence information and predicted structures from AlphaFold2. Specific sequence and structure-based features include position-specific scoring matrix (PSSM), solvent accessible surface area, conservation score and amino acid properties, and residue depth, respectively. Our method predicted the binding sites with an average F1 score of 0.73 in a dataset of 1236 proteins. Further, we compared the performance with other existing methods in the literature using four benchmark datasets, and our method outperformed those methods. AVAILABILITY AND IMPLEMENTATION: The DeepBSRPred web server can be found at https://web.iitm.ac.in/bioinfo2/deepbsrpred/index.html, along with all datasets used in this study. The trained models, the DeepBSRPred standalone source code, and the feature computation pipeline are freely available at https://web.iitm.ac.in/bioinfo2/deepbsrpred/download.html.
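The F1 score reported here balances precision and recall, which matters because binding site residues are usually a minority of all residues. A minimal sketch under stated assumptions follows: labels are encoded per residue as 1 (binding site) or 0 (not), and the function name is illustrative rather than taken from the DeepBSRPred code.

```python
def f1_score(true_labels, pred_labels):
    """F1 score for binary per-residue labels (1 = binding site)."""
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly found sites
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed sites
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

An average F1 over a dataset could be obtained by calling this once per protein and averaging, although whether the paper averages per protein or pools residues is not stated in the abstract.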
13
Sun J, Kulandaisamy A, Ru J, Gromiha MM, Cribbs AP. TMKit: a Python interface for computational analysis of transmembrane proteins. Brief Bioinform 2023; 24:bbad288. [PMID: 37594311 PMCID: PMC10516361 DOI: 10.1093/bib/bbad288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 07/07/2023] [Accepted: 07/18/2023] [Indexed: 08/19/2023] Open
Abstract
Transmembrane proteins are receptors, enzymes, transporters and ion channels that are instrumental in regulating a variety of cellular activities, such as signal transduction and cell communication. Despite tremendous progress in computational capacities to support protein research, there is still a significant gap in the availability of specialized computational analysis toolkits for transmembrane protein research. Here, we introduce TMKit, an open-source Python programming interface that is modular, scalable and specifically designed for processing transmembrane protein data. TMKit is a one-stop computational analysis tool for transmembrane proteins, enabling users to perform database wrangling, engineer features at the mutational, domain and topological levels, and visualize protein-protein interaction interfaces. In addition, TMKit includes seqNetRR, a high-performance computing library that allows customized construction of a large number of residue connections. This library is particularly well suited for assigning correlation matrix-based features at a fast speed. TMKit should serve as a useful tool for researchers in assisting the study of transmembrane protein sequences and structures. TMKit is publicly available through https://github.com/2003100127/tmkit and https://tmkit-guide.herokuapp.com/doc/overview.
14
Sneha NP, Dharshini SAP, Taguchi YH, Gromiha MM. Investigating Neuron Degeneration in Huntington's Disease Using RNA-Seq Based Transcriptome Study. Genes (Basel) 2023; 14:1801. [PMID: 37761940 PMCID: PMC10530489 DOI: 10.3390/genes14091801] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/15/2023] [Revised: 09/02/2023] [Accepted: 09/11/2023] [Indexed: 09/29/2023] Open
Abstract
Huntington's disease (HD) is a progressive neurodegenerative disorder caused by a CAG repeat expansion in the huntingtin (HTT) gene. The primary symptoms of HD include motor dysfunction such as chorea, dystonia, and involuntary movements. The primary motor cortex (BA4) is the key brain region responsible for executing motor/movement activities. Investigating patient and control samples from the BA4 region will provide a deeper understanding of the genes responsible for neuron degeneration and help to identify potential markers. Previous studies have focused on overall differential gene expression and associated biological functions. In this study, we illustrate the relationship between variants and differentially expressed genes/transcripts. We identified variants and their associated genes along with the quantification of genes and transcripts. We also predicted the effect of variants on various regulatory activities and found that many variants regulate gene expression. Variants affecting miRNAs and their targets are also highlighted in our study. Co-expression network studies revealed the role of novel genes. Functional interaction network analysis unveiled the importance of genes involved in vesicle-mediated transport. From this unified approach, we propose that genes expressed in immune cells are crucial for reducing neuron death in HD.
15
Gromiha MM, Harini K. Comment on 'Thermodynamic database supports deciphering protein-nucleic acid interactions'. Trends Biotechnol 2023; 41:988-989. [PMID: 37117054 DOI: 10.1016/j.tibtech.2023.03.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Accepted: 03/22/2023] [Indexed: 04/30/2023]
Abstract
Mei and colleagues reported a thermodynamic database, PNATDB for protein-nucleic acid interactions, which contains 12 635 experimentally determined thermodynamic parameters. They claimed that extracting data from existing databases is difficult. ProNAB, which has more than 20 000 experimental data points for binding affinities of protein-nucleic acid complexes and other information, was not discussed.
16
Gromiha MM, Kundrotas P, Marti MA, Venclovas Č, Li M. Editorial: Protein recognition and associated diseases. Front Bioinform 2023; 3:1215141. [PMID: 37283696 PMCID: PMC10240056 DOI: 10.3389/fbinf.2023.1215141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Accepted: 05/09/2023] [Indexed: 06/08/2023] Open
17
Harini K, Kihara D, Michael Gromiha M. PDA-Pred: Predicting the binding affinity of protein-DNA complexes using machine learning techniques and structural features. Methods 2023; 213:10-17. [PMID: 36924867 PMCID: PMC10563387 DOI: 10.1016/j.ymeth.2023.03.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/10/2022] [Revised: 02/17/2023] [Accepted: 03/11/2023] [Indexed: 03/17/2023] Open
Abstract
Protein-DNA interactions play an important role in various biological processes such as gene expression, replication, and transcription. Understanding the important features that dictate the binding affinity of protein-DNA complexes and predicting their affinities is important for elucidating their recognition mechanisms. In this work, we have collected the experimental binding free energy (ΔG) for a set of 391 protein-DNA complexes and derived several structure-based features such as interaction energy, contact potentials, volume and surface area of binding site residues, base step parameters of the DNA and contacts between different types of atoms. Our analysis of the relationship between binding affinity and structural features revealed that the important factors mainly depend on the number of DNA strands as well as the functional and structural classes of proteins. Specifically, binding site properties such as the number of atom contacts between the DNA and protein and the volume of protein binding sites, and interaction-based features such as interaction energies and contact potentials, are important for understanding the binding affinity. Further, we developed multiple regression equations for predicting the binding affinity of protein-DNA complexes belonging to different structural and functional classes. Our method showed an average correlation and mean absolute error of 0.78 and 0.98 kcal/mol, respectively, between the experimental and predicted binding affinities in a jack-knife test. We have developed a web server, PDA-Pred (Protein-DNA Binding affinity predictor), for predicting the affinity of protein-DNA complexes, and it is freely available at https://web.iitm.ac.in/bioinfo2/pdapred/.
18
Pandey M, Gromiha MM. MutBLESS: A tool to identify disease-prone sites in cancer using deep learning. Biochim Biophys Acta Mol Basis Dis 2023; 1869:166721. [PMID: 37105446 DOI: 10.1016/j.bbadis.2023.166721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 04/07/2023] [Accepted: 04/12/2023] [Indexed: 04/29/2023]
Abstract
Understanding the molecular basis and impact of mutations at different stages of cancer is a long-standing challenge in cancer biology. Identification of driver mutations from experiments is expensive and time intensive. In the present study, we collected the data for experimentally known driver mutations in 22 different cancer types and classified them into six categories: breast cancer (BRCA), acute myeloid leukaemia (LAML), endometrial carcinoma (EC), stomach cancer (STAD), skin cancer (SKCM), and other cancer types. The dataset contains 5747 disease-prone and 5514 neutral sites in 516 proteins. The analysis of amino acid distribution along mutant sites revealed that the motifs AAA and LR are preferred in disease-prone sites, whereas QPP and QF are dominant in neutral sites. Further, we developed a method using deep neural networks to predict disease-prone sites with amino acid sequence-based features such as physicochemical properties, secondary structure, tri-peptide motifs and conservation scores. We obtained an average AUC of 0.97 in the five cancer types BRCA, LAML, EC, STAD and SKCM in a test dataset and 0.72 in all other cancer types together. Our method showed excellent performance for identifying cancer-specific mutations, with an average sensitivity, specificity, and accuracy of 96.56%, 97.39%, and 97.64%, respectively. We developed a web server for identifying cancer-prone sites, and it is available at https://web.iitm.ac.in/bioinfo2/MutBLESS/index.html. We suggest that our method can serve as an effective way to identify disease-prone sites and assist in developing therapeutic strategies.
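To make the motif analysis above concrete, here is one possible way to count short sequence motifs (such as the tri-peptide AAA or the di-peptide LR) that overlap a set of sites. This is only a sketch of the general idea: the abstract does not describe how motifs were extracted, so the windowing rule, the 0-based site indexing, and both function names are assumptions.

```python
from collections import Counter


def motifs_at_site(sequence, site, k):
    """Return every k-residue window that contains 0-based position `site`."""
    start_min = max(0, site - k + 1)
    start_max = min(site, len(sequence) - k)
    return [sequence[s:s + k] for s in range(start_min, start_max + 1)]


def motif_counts(sequence, sites, k):
    """Count k-residue motifs overlapping any of the given sites."""
    counts = Counter()
    for site in sites:
        counts.update(motifs_at_site(sequence, site, k))
    return counts
```

Comparing such counts between disease-prone and neutral sites, normalized by the number of sites in each group, would give a rough view of which motifs are enriched in either class.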
19
Krishnan SR, Soares RRG, Madaboosi N, Gromiha MM. AutoPLP: A Padlock Probe Design Pipeline for Zoonotic Pathogens. ACS Infect Dis 2023; 9:459-469. [PMID: 36790094 DOI: 10.1021/acsinfecdis.2c00436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
Abstract
Emergence of novel zoonotic infections among the human population has increased the burden on global healthcare systems to curb their spread. To meet the evolutionary agility of pathogens, it is essential to revamp the existing diagnostic methods for early detection and characterization of the pathogens at the molecular level. Padlock probes (PLPs), which can leverage the power of isothermal nucleic acid amplification techniques (NAAT) such as rolling circle amplification (RCA), are known for their high sensitivity and specificity in detecting a diverse pathogen panel of interest. However, due to the complexity involved in deciding the target regions for PLP design and the need for optimization of multiple experimental parameters, the applicability of RCA has been limited in point-of-care testing for pathogen detection. To address this gap, we have developed a novel and integrated PLP design pipeline named AutoPLP, which can automate the probe design process for a diverse pathogen panel of interest. The pipeline is composed of three modules which can perform sequence data curation, multiple sequence alignment, conservation analysis, filtration based on experimental parameters (Tm, GC content, and secondary structure formation), and in silico probe validation via potential cross-hybridization check with host genome. The modules can also take into account the backbone and restriction site information, appropriate combinations of which are incorporated along with the probe arms to design a complete probe sequence. The potential applications of AutoPLP are showcased through the design of PLPs for the detection of rabies virus and drug-resistant strains of Mycobacterium tuberculosis.
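The filtration step on experimental parameters can be pictured as a simple pass/fail check on each candidate probe arm. The sketch below computes GC content and a rough melting temperature using the Wallace rule (2 °C per A/T, 4 °C per G/C), which is a textbook approximation for short oligonucleotides and is not necessarily what AutoPLP uses. The threshold ranges and function names are hypothetical, and secondary structure checks are omitted.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate Tm in degrees Celsius via the Wallace rule.

    Only a rough estimate, intended for short oligos (about 14-20 nt).
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def passes_filters(seq, gc_range=(0.4, 0.6), tm_range=(50, 70)):
    """Keep a candidate arm only if GC content and Tm fall in range.

    The default ranges are placeholders, not values from the paper.
    """
    gc_ok = gc_range[0] <= gc_content(seq) <= gc_range[1]
    tm_ok = tm_range[0] <= wallace_tm(seq) <= tm_range[1]
    return gc_ok and tm_ok
```

For example, `passes_filters("ATGCATGCATGCATGCATGC")` evaluates a 20-nt arm with 50% GC and an estimated Tm of 60 °C, which falls inside the placeholder ranges.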
|
20
|
Venkatachalam S, Murlidharan N, Krishnan SR, Ramakrishnan C, Setshedi M, Pandian R, Barh D, Tiwari S, Azevedo V, Sayed Y, Gromiha MM. Understanding Drug Resistance of Wild-Type and L38HL Insertion Mutant of HIV-1 C Protease to Saquinavir. Genes (Basel) 2023; 14:533. [PMID: 36833460 PMCID: PMC9957153 DOI: 10.3390/genes14020533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2023] [Revised: 02/16/2023] [Accepted: 02/17/2023] [Indexed: 02/25/2023] Open
Abstract
Acquired immunodeficiency syndrome (AIDS) is one of the most challenging infectious diseases to treat on a global scale. Understanding the mechanisms underlying the development of drug resistance is necessary for novel therapeutics. HIV subtype C is known to harbor mutations at critical positions of HIV aspartic protease compared to HIV subtype B, which affects the binding affinity. Recently, a novel double-insertion mutation at codon 38 (L38HL) was characterized in HIV subtype C protease, whose effects on the interaction with protease inhibitors are hitherto unknown. In this study, the potential of L38HL double-insertion in HIV subtype C protease to induce a drug resistance phenotype towards the protease inhibitor, Saquinavir (SQV), was probed using various computational techniques, such as molecular dynamics simulations, binding free energy calculations, local conformational changes and principal component analysis. The results indicate that the L38HL mutation exhibits an increase in flexibility at the hinge and flap regions with a decrease in the binding affinity of SQV in comparison with wild-type HIV protease C. Further, we observed a wide opening at the binding site in the L38HL variant due to an alteration in flap dynamics, leading to a decrease in interactions with the binding site of the mutant protease. It is supported by an altered direction of motion of flap residues in the L38HL variant compared with the wild-type. These results provide deep insights into understanding the potential drug resistance phenotype in infected individuals.
|
21
|
Sun J, Kulandaisamy A, Liu J, Hu K, Gromiha MM, Zhang Y. Machine learning in computational modelling of membrane protein sequences and structures: From methodologies to applications. Comput Struct Biotechnol J 2023; 21:1205-1226. [PMID: 36817959 PMCID: PMC9932300 DOI: 10.1016/j.csbj.2023.01.036] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2022] [Revised: 01/16/2023] [Accepted: 01/25/2023] [Indexed: 01/29/2023] Open
Abstract
Membrane proteins mediate a wide spectrum of biological processes, such as signal transduction and cell communication. Due to the arduous and costly nature inherent to the experimental process, membrane proteins have long been devoid of well-resolved atomic-level tertiary structures and, consequently, the understanding of their functional roles underlying a multitude of life activities has been hampered. Currently, computational tools dedicated to furthering the structure-function understanding are primarily focused on utilizing intelligent algorithms to address a variety of site-wise prediction problems (e.g., topology and interaction sites), but are scattered across different computing sources. Moreover, the recent advent of deep learning techniques has immensely expedited the development of computational tools for membrane protein-related prediction problems. Given the growing number of applications optimized particularly by manifold deep neural networks, we herein provide a review on the current status of computational strategies mainly in membrane protein type classification, topology identification, interaction site detection, and pathogenic effect prediction. Meanwhile, we provide an overview of how the entire prediction process proceeds, including database collection, data pre-processing, feature extraction, and method selection. This review is expected to be useful for developing more extendable computational tools specific to membrane proteins.
|
22
|
Shanmugam A, Venkattappan A, Gromiha MM. Structure based Drug Designing Approaches in SARS-CoV-2 Spike Inhibitor Design. Curr Top Med Chem 2023; 22:2396-2409. [PMID: 36330617 DOI: 10.2174/1568026623666221103091658] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2022] [Revised: 09/14/2022] [Accepted: 10/06/2022] [Indexed: 11/06/2022]
Abstract
The COVID-19 outbreak and the ensuing pandemic have spurred the research community to design novel drugs and vaccines against its causative agent, SARS-CoV-2. The spike glycoprotein present on the surface of this pathogen plays an immense role in viral entry and antigenicity. Hence, it is considered an important drug target in COVID-19 drug design. Several three-dimensional crystal structures of the SARS-CoV-2 spike protein have been determined and deposited in the Protein Data Bank during the pandemic period. This accelerated research in computer-aided drug design, especially in the field of structure-based drug design. This review summarizes various structure-based drug design approaches applied to the SARS-CoV-2 spike protein and their findings. Specifically, it focuses on different structure-based approaches such as molecular docking, high-throughput virtual screening, molecular dynamics simulation, drug repurposing, and target-based pharmacophore modelling and screening. These structural approaches have been applied to different ligands and datasets such as FDA-approved drugs, small molecular chemical compounds, chemical libraries, chemical databases, structural analogs, and natural compounds, which resulted in the prediction of spike inhibitors, spike-ACE-2 interface inhibitors, and allosteric inhibitors.
|
23
|
Sharma D, Baas T, Nogales A, Martinez-Sobrido L, Gromiha MM. CoDe: a web-based tool for codon deoptimization. BIOINFORMATICS ADVANCES 2023; 3:vbac102. [PMID: 36698765 PMCID: PMC9832946 DOI: 10.1093/bioadv/vbac102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/11/2022] [Revised: 11/27/2022] [Accepted: 12/30/2022] [Indexed: 01/03/2023]
Abstract
Summary We have developed a web-based tool, CoDe (Codon Deoptimization) that deoptimizes genetic sequences based on different codon usage bias, ultimately reducing expression of the corresponding protein. The tool could also deoptimize the sequence for a specific region and/or selected amino acid(s). Moreover, CoDe can highlight sites targeted by restriction enzymes in the wild-type and codon-deoptimized sequences. Importantly, our web-based tool has a user-friendly interface with flexible options to download results. Availability and implementation The web-based tool CoDe is freely available at https://web.iitm.ac.in/bioinfo2/codeop/landing_page.html. Supplementary information Supplementary data are available at Bioinformatics Advances online.
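The basic idea of codon deoptimization can be sketched as follows. This is not CoDe's algorithm: the usage table below is a toy with made-up frequencies covering only three amino acids, and the function name `deoptimize` is hypothetical. Each codon found in the table is swapped for the least-used synonymous codon, so the encoded protein is unchanged.

```python
# Toy codon-usage table: amino acid -> {codon: relative frequency}.
# The frequencies are illustrative placeholders, not real usage data.
TOY_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAG": 0.57, "AAA": 0.43},
    "L": {"CTG": 0.40, "CTC": 0.20, "TTG": 0.13,
          "CTT": 0.13, "TTA": 0.08, "CTA": 0.06},
}

# Reverse lookup: codon -> amino acid.
CODON_TO_AA = {codon: aa for aa, codons in TOY_USAGE.items() for codon in codons}


def deoptimize(seq: str) -> str:
    """Replace each known codon with its least-frequent synonymous codon.

    Codons absent from the toy table (including stop codons) are left as-is.
    """
    seq = seq.upper()
    out = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        aa = CODON_TO_AA.get(codon)
        if aa is None:
            out.append(codon)
        else:
            usage = TOY_USAGE[aa]
            out.append(min(usage, key=usage.get))
    return "".join(out)


original = "ATGAAGCTGTAA"          # Met-Lys-Leu-stop
deoptimized = deoptimize(original)
```

Restricting the loop to a sub-range of codons, or to a chosen set of amino acids, would mirror the region- and residue-specific options the abstract describes.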
|
24
|
Kulandaisamy A, Parvathy Dharshini SA, Gromiha MM. Alz-Disc: A Tool to Discriminate Disease-causing and Neutral Mutations in Alzheimer's Disease. Comb Chem High Throughput Screen 2023; 26:769-777. [PMID: 35619290 DOI: 10.2174/1386207325666220520102316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2021] [Revised: 03/17/2022] [Accepted: 04/07/2022] [Indexed: 11/22/2022]
Abstract
BACKGROUND Alzheimer's disease (AD) is the most common neurodegenerative disorder that affects the neuronal system and leads to memory loss. Many coding gene variants are associated with this disease and it is important to characterize their annotations. METHODS We collected the Alzheimer's disease-causing and neutral mutations from different databases. For each mutation, we computed different features from the protein sequence. Further, these features were used to build a Bayes network-based machine-learning algorithm to discriminate between the disease-causing and neutral mutations in AD. RESULTS We have constructed a comprehensive dataset of 314 Alzheimer's disease-causing and 370 neutral mutations and explored their characteristic features such as conservation scores, position-specific scoring matrix (PSSM) profile, the change in hydrophobicity, different amino acid residue substitution matrices and neighboring residue information for identifying the disease-causing mutations. Utilizing these features, we have developed a disease-specific tool named Alz-disc, for discriminating the disease-causing and neutral mutations using sequence information alone. The present method showed an accuracy of 89% on an independent test set, which is 13% higher than available generic methods. This method is freely available as a web server at https://web.iitm.ac.in/bioinfo2/alzdisc/. CONCLUSIONS This study is useful to annotate the effect of new variants and develop mutation-specific drug design strategies for Alzheimer's disease.
|
25
|
Harini K, Christoffer C, Gromiha MM, Kihara D. Pairwise and Multi-chain Protein Docking Enhanced Using LZerD Web Server. Methods Mol Biol 2023; 2690:355-373. [PMID: 37450159 PMCID: PMC10561630 DOI: 10.1007/978-1-0716-3327-4_28] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/18/2023]
Abstract
Interactions of proteins with other macromolecules have important structural and functional roles in the basic processes of living cells. To understand and elucidate the mechanisms of interactions, it is important to know the 3D structures of the complexes. Proteomes contain numerous protein-protein complexes, for which experimentally determined structures often do not exist. Computational techniques can be a practical alternative to obtain useful complex structure models. Here, we present a web server that provides access to the LZerD and Multi-LZerD protein docking tools, which can perform both pairwise and multi-chain docking. The web server is user-friendly, with options to visualize the distribution and structures of binding poses of top-scoring models. The LZerD web server is available at https://lzerd.kiharalab.org. This chapter describes the algorithm and the step-by-step procedure to model monomeric structures with AttentiveDist, and also details pairwise LZerD docking and Multi-LZerD. It also provides case studies for each of the three modules.
|
26
|
Sharma D, Notarte KI, Fernandez RA, Lippi G, Gromiha MM, Henry BM. In silico evaluation of the impact of Omicron variant of concern sublineage BA.4 and BA.5 on the sensitivity of RT-qPCR assays for SARS-CoV-2 detection using whole genome sequencing. J Med Virol 2023; 95:e28241. [PMID: 36263448 PMCID: PMC9874926 DOI: 10.1002/jmv.28241] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2022] [Revised: 10/10/2022] [Accepted: 10/17/2022] [Indexed: 01/27/2023]
Abstract
BACKGROUND Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variant of concern (VoC) Omicron (B.1.1.529) has rapidly spread around the world, presenting a new threat to global public health. Due to the large number of mutations accumulated by SARS-CoV-2 Omicron, concerns have emerged over potentially reduced diagnostic accuracy of reverse-transcription quantitative polymerase chain reaction (RT-qPCR), the gold standard test for diagnosing coronavirus disease 2019 (COVID-19). Thus, we aimed to assess the impact of the currently endemic Omicron sublineages BA.4 and BA.5 on the integrity and sensitivity of RT-qPCR assays used for COVID-19 diagnosis via in silico analysis. We employed whole genome sequencing data and evaluated the potential for false negatives or test failure due to mismatches between primers/probes and the Omicron VoC viral genome. METHODS In silico sensitivity of 12 RT-qPCR tests (containing 30 primer and probe sets) developed for detection of SARS-CoV-2, reported by the World Health Organization (WHO) or available in the literature, was assessed for specifically detecting SARS-CoV-2 Omicron BA.4 and BA.5 sublineages, obtained after removing redundancy from publicly available genomes from National Center for Biotechnology Information (NCBI) and Global Initiative on Sharing Avian Influenza Data (GISAID) databases. Mismatches between amplicon regions of SARS-CoV-2 Omicron VoC and primer and probe sets were evaluated, and clustering analysis of corresponding amplicon sequences was carried out. RESULTS From the 1164 representative SARS-CoV-2 Omicron VoC BA.4 sublineage genomes analyzed, a substitution in the first five nucleotides (C to T) of the amplicon's 3'-end was observed in all samples resulting in 0% sensitivity for assays HKUnivRdRp/Hel (mismatch in reverse primer) and CoremCharite N (mismatch in both forward and reverse primers).
Due to a mismatch in the forward primer's 5'-end (3-nucleotide substitution, GGG to AAC), the sensitivity of the ChinaCDC N assay was 0.69%. The 10 nucleotide mismatches in the reverse primer resulted in 0.09% sensitivity for Omicron sublineage BA.4 for the Thai N assay. Of the 1926 BA.5 sublineage genomes, the HKUnivRdRp/Hel assay also had 0% sensitivity. A sensitivity of 3.06% was observed for the ChinaCDC N assay because of a mismatch in the forward primer's 5'-end (3-nucleotide substitution, GGG to AAC). Similarly, due to the 10 nucleotide mismatches in the reverse primer, the Thai N assay's sensitivity was low at 0.21% for sublineage BA.5. Further, eight assays for the BA.4 sublineage retained high sensitivity (more than 97%) and nine assays for the BA.5 sublineage retained more than 99% sensitivity. CONCLUSION We observed four assays (HKUnivRdRp/Hel, ChinaCDC N, Thai N, CoremCharite N) that could potentially result in false negative results for SARS-CoV-2 Omicron VoC BA.4 and BA.5 sublineages. Interestingly, CoremCharite N had 0% sensitivity for Omicron VoC BA.4 but 99.53% sensitivity for BA.5. In addition, 66.67% of the assays for the BA.4 sublineage and 75% of the assays for the BA.5 sublineage retained high sensitivity. Further, amplicon clustering and additional substitution analysis along with sensitivity analysis could be used for the modification and development of RT-qPCR assays for detecting SARS-CoV-2 Omicron VoC sublineages.
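An in silico sensitivity figure of this kind can be approximated as the percentage of genomes in which a primer still finds an acceptable binding site. The sketch below is an assumption-laden illustration, not the study's protocol: the function names, the ungapped sliding-window comparison, and the `max_mismatches` threshold are hypothetical, and reverse primers would first need to be reverse-complemented.

```python
def min_mismatches(primer: str, target: str) -> int:
    """Smallest number of mismatches between the primer and any
    same-length window of the target (ungapped comparison)."""
    n = len(primer)
    if len(target) < n:
        return n
    return min(
        sum(a != b for a, b in zip(primer, target[i:i + n]))
        for i in range(len(target) - n + 1)
    )


def sensitivity(primer: str, genomes: list[str], max_mismatches: int = 0) -> float:
    """Percentage of genomes with a site within the mismatch tolerance."""
    hits = sum(min_mismatches(primer, g) <= max_mismatches for g in genomes)
    return 100.0 * hits / len(genomes)


# Toy sequences for illustration only.
genomes = ["TTGGGATT", "TTAACATT", "GGGATTTT", "CCCCCCCC"]
strict = sensitivity("GGGA", genomes)                      # exact matches only
relaxed = sensitivity("GGGA", genomes, max_mismatches=3)   # tolerate 3 mismatches
```

A realistic analysis would weight mismatches near the primer's 3'-end more heavily, since those are the ones most likely to abolish amplification.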
|
27
|
Jino Blessy J, Siva Shanmugam NR, Veluraja K, Michael Gromiha M. Investigations on the binding specificity of β-galactoside analogues with human galectin-1 using molecular dynamics simulations. J Biomol Struct Dyn 2022; 40:10094-10105. [PMID: 34219624 DOI: 10.1080/07391102.2021.1939788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
Abstract
Galectin-1 (Gal-1) is the first member of the galectin family; its carbohydrate recognition domain binds specifically to β-galactoside-containing oligosaccharides. Owing to its association with carbohydrates, Gal-1 is involved in many biological processes such as cell signaling and adhesion, and in pathological pathways such as metastasis, apoptosis and increased tumour cell survival. The development of β-galactoside based inhibitors would help to control Gal-1 expression. In the current study, we carried out molecular dynamics (MD) simulations to examine the structural and dynamic behaviour of the Gal-1-thiodigalactoside (TDG), Gal-1-lactobionic acid (LBA) and Gal-1-beta-(1→6)-galactobiose (G16G) complexes. The analysis of glycosidic torsional angles revealed that the β-galactoside analogues TDG and LBA have a single binding mode (BM1) whereas G16G has two binding modes (BM1 and BM2) for interacting with the Gal-1 protein. We computed the binding free energies for the complexes Gal-1-TDG, Gal-1-LBA and Gal-1-G16G using MM/PBSA, obtaining -6.45, -6.22 and -3.08 kcal/mol, respectively. This trend agrees well with experiments showing that the binding of Gal-1 with TDG is stronger than with LBA. Further analysis revealed that direct and water-mediated hydrogen bonds play a significant role in the structural stability of the complexes. The results obtained from this study are useful to formulate a set of rules and derive pharmacophore-based features for designing inhibitors against galectin-1. Communicated by Ramaswamy H. Sarma.
|
28
|
Ramaswamy Krishnan S, Roy A, Michael Gromiha M. R-SIM: A database of binding affinities for RNA-small molecule interactions. J Mol Biol 2022:167914. [DOI: 10.1016/j.jmb.2022.167914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2022] [Revised: 11/28/2022] [Accepted: 12/01/2022] [Indexed: 12/12/2022]
|
29
|
Pandey M, Anoosha P, Yesudhas D, Gromiha MM. Identification of potential driver mutations in glioblastoma using machine learning. Brief Bioinform 2022; 23:6764546. [PMID: 36266243 DOI: 10.1093/bib/bbac451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2022] [Revised: 09/13/2022] [Accepted: 09/22/2022] [Indexed: 12/14/2022] Open
Abstract
Glioblastoma is a fast and aggressively growing tumor in the brain and spinal cord. Mutation of amino acid residues in target proteins, which are involved in glioblastoma, alters their structure and function and may lead to disease. In this study, we collected a set of 9386 disease-causing (driver) mutations, based on recurrence in patient samples and experimental annotation as pathogenic, and 8728 neutral (passenger) mutations. We observed that Arg is highly preferred at the mutant sites of drivers, whereas Met and Ile showed preferences in passengers. Inspecting neighboring residues at the mutant sites revealed that the motifs YP, CP and GRH are preferred in drivers, whereas SI, IQ and TVI are dominant in passengers. In addition, we have computed other sequence-based features such as conservation scores, Position Specific Scoring Matrices (PSSM) and physicochemical properties, and developed a machine learning-based method, GBMDriver (GlioBlastoma Multiforme Drivers), for distinguishing between driver and passenger mutations. Our method showed an accuracy and AUC of 73.59% and 0.82, respectively, on 10-fold cross-validation and 81.99% and 0.87 in a blind set of 1809 mutants. The tool is available at https://web.iitm.ac.in/bioinfo2/GBMDriver/index.html. We envisage that the present method is helpful to prioritize driver mutations in glioblastoma and assist in identifying therapeutic targets.
|
30
|
Parvathy Dharshini SA, Sneha NP, Yesudhas D, Kulandaisamy A, Rangaswamy U, Shanmugam A, Taguchi YH, Gromiha MM. Exploring plausible therapeutic targets for Alzheimer's disease using multi-omics approach, machine learning and docking. Curr Top Med Chem 2022; 22:1868-1879. [PMID: 36056872 DOI: 10.2174/1568026622666220902110115] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2022] [Revised: 06/22/2022] [Accepted: 07/29/2022] [Indexed: 11/22/2022]
Abstract
The progressive deterioration of neurons leads to Alzheimer's disease (AD), and developing a drug for this disorder is challenging. Substantial gene/transcriptome variability from multiple cell types leads to downstream pathophysiologic consequences that represent the heterogeneity of this disease. Identifying potential biomarkers for promising therapeutics is difficult because it is unclear whether the transcriptomic, epigenetic, or proteomic changes detected in patients are the cause or the consequence of the disease, which complicates drug discovery efforts. The advancement in scRNA-sequencing technologies helps to identify cell type-specific biomarkers that may guide the selection of the pathways and related targets specific to different stages of the disease progression. This review focuses on the analysis of multi-omics data from various perspectives (genomic and transcriptomic variants, and single-cell expression), which provide insights to identify plausible molecular targets to combat this complex disease. Further, we briefly outline the developments in machine learning techniques to prioritize the risk-associated genes, predict probable mutations and identify promising drug candidates from natural products.
|
31
|
Rawat P, Sharma D, Prabakaran R, Ridha F, Mohkhedkar M, Janakiraman V, Gromiha MM. Ab-CoV: a curated database for binding affinity and neutralization profiles of coronavirus-related antibodies. Bioinformatics 2022; 38:4051-4052. [PMID: 35771624 DOI: 10.1093/bioinformatics/btac439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2022] [Revised: 06/05/2022] [Accepted: 06/28/2022] [Indexed: 12/24/2022] Open
Abstract
SUMMARY We have developed a database, Ab-CoV, which contains manually curated experimental interaction profiles of 1780 coronavirus-related neutralizing antibodies. It contains more than 3200 datapoints on half maximal inhibitory concentration (IC50), half maximal effective concentration (EC50) and binding affinity (KD). Each entry with an experimentally known three-dimensional structure is complemented with the predicted change in stability and affinity for all possible point mutations of interface residues. Ab-CoV also includes information on epitopes and paratopes, structural features of viral proteins, sequentially similar therapeutic antibodies and Collier de Perles plots. It supports structure visualization and offers options to search, display and download the data. AVAILABILITY AND IMPLEMENTATION Ab-CoV database is freely available at https://web.iitm.ac.in/bioinfo2/ab-cov/home. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
|
32
|
Kulandaisamy A, Ridha F, Frishman D, Gromiha MM. Computational approaches for investigating disease-causing mutations in membrane proteins: database development, analysis and prediction. Curr Top Med Chem 2022; 22:1766-1775. [PMID: 35894475 DOI: 10.2174/1568026622666220726124705] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2022] [Revised: 05/27/2022] [Accepted: 06/03/2022] [Indexed: 11/22/2022]
Abstract
Membrane proteins (MPs) play an essential role in a broad range of cellular functions, serving as transporters, enzymes, receptors, and communicators, and ~60% of membrane proteins are primarily used as drug targets. These proteins adopt either α-helical or β-barrel structures in the lipid bilayer of a cell/organelle membrane. Mutations in membrane proteins alter their structure and function and may lead to diseases. Accumulated data on disease-causing and neutral mutations in membrane proteins are available in the MutHTP and TMSNP databases, which provide additional features based on sequence, structure, topology, and diseases. These databases have been effectively utilized for analysing sequence and structure-based features in disease-causing and neutral mutations in membrane proteins, exploring disease-causing mechanisms, elucidating the relationship between sequence/structural parameters and diseases, and developing computational tools. Further, machine learning based tools have been developed for identifying disease-causing mutations using diverse features such as evolutionary information, physicochemical properties, atomic contacts, contact potentials, and the contribution of different energetic terms. These membrane protein-specific tools are helpful to characterize the effect of new variants in the whole human membrane proteome. In this review, we provide a discussion of the available databases for disease-causing mutations in membrane proteins followed by a statistical analysis of membrane protein mutations using sequence and structural features. In addition, available prediction tools for identifying disease-causing and neutral mutations in membrane proteins are described with their performances. This comprehensive review provides deep insights to design mutation-specific strategies for different diseases.
|
33
|
Sharma D, Rawat P, Janakiraman V, Gromiha MM. Elucidating important structural features for the binding affinity of spike - SARS-CoV-2 neutralizing antibody complexes. Proteins 2022; 90:824-834. [PMID: 34761442 PMCID: PMC8661754 DOI: 10.1002/prot.26277] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2021] [Revised: 11/04/2021] [Accepted: 11/07/2021] [Indexed: 12/23/2022]
Abstract
The coronavirus disease 2019 (COVID-19) has affected the lives of millions of people around the world. In an effort to develop therapeutic interventions and control the pandemic, scientists have isolated several neutralizing antibodies against SARS-CoV-2 from the vaccinated and convalescent individuals. These antibodies can be explored further to understand SARS-CoV-2 specific antigen-antibody interactions and biophysical parameters related to binding affinity, which can be utilized to engineer more potent antibodies for current and emerging SARS-CoV-2 variants. In the present study, we have analyzed the interface between spike protein of SARS-CoV-2 and neutralizing antibodies in terms of amino acid residue propensity, pair preference, and atomic interaction energy. We observed that Tyr residues containing contacts are highly preferred and energetically favorable at the interface of spike protein-antibody complexes. We have also developed a regression model to relate the experimental binding affinity for antibodies using structural features, which showed a correlation of 0.93. Moreover, several mutations at the spike protein-antibody interface were identified, which may lead to immune escape (epitope residues) and improved affinity (paratope residues) in current/emerging variants. Overall, the work provides insights into spike protein-antibody interactions, structural parameters related to binding affinity and mutational effects on binding affinity change, which can be helpful to develop better therapeutics against COVID-19.
|
34
|
Yesudhas D, Dharshini SAP, Taguchi YH, Gromiha MM. Tumor Heterogeneity and Molecular Characteristics of Glioblastoma Revealed by Single-Cell RNA-Seq Data Analysis. Genes (Basel) 2022; 13:428. [PMID: 35327982 PMCID: PMC8955282 DOI: 10.3390/genes13030428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2022] [Revised: 02/23/2022] [Accepted: 02/24/2022] [Indexed: 11/16/2022] Open
Abstract
Glioblastoma multiforme (GBM) is the most common infiltrating lethal tumor of the brain. Tumor heterogeneity and the precise characterization of GBM remain challenging, and disease-specific and effective biomarkers are not available at present. To understand GBM heterogeneity and the disease prognosis mechanism, we carried out a single-cell transcriptome data analysis of 3389 cells from four primary IDH-WT (isocitrate dehydrogenase wild type) glioblastoma patients and compared the characteristic features of the tumor and periphery cells. We observed that the marker gene expression profiles of different cell types and the copy number variations (CNVs) are heterogeneous in the GBM samples. Further, we have identified 94 differentially expressed genes (DEGs) between tumor and periphery cells. We constructed a tissue-specific co-expression network and protein-protein interaction network for the DEGs and identified several hub genes, including CX3CR1, GAPDH, FN1, PDGFRA, HTRA1, ANXA2, THBS1, GFAP, PTN, TNC, and VIM. The DEGs were significantly enriched with proliferation and migration pathways related to glioblastoma. Additionally, we were able to identify the differentiation state of microglia and changes in the transcriptome in the presence of glioblastoma that might support tumor growth. This study provides insights into GBM heterogeneity and suggests novel potential disease-specific biomarkers which could help to identify the therapeutic targets in GBM.
|
35
|
Gromiha MM, Orengo CA, Sowdhamini R, Thornton AJM. Srinivasan (1962-2021) in Bioinformatics and beyond. Bioinformatics 2022; 38:2377-2379. [PMID: 35134112 PMCID: PMC9004639 DOI: 10.1093/bioinformatics/btac054] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2022] [Accepted: 01/27/2022] [Indexed: 02/05/2023] Open
|
36
|
Prabakaran R, Jemimah S, Rawat P, Sharma D, Gromiha MM. A novel hybrid SEIQR model incorporating the effect of quarantine and lockdown regulations for COVID-19. Sci Rep 2021; 11:24073. [PMID: 34912038 PMCID: PMC8674241 DOI: 10.1038/s41598-021-03436-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2021] [Accepted: 11/29/2021] [Indexed: 12/12/2022] Open
Abstract
Mitigating the devastating effect of COVID-19 is necessary to control the infectivity and mortality rates. Hence, several strategies such as quarantine of exposed and infected individuals and restricting movement through lockdown of geographical regions have been implemented in most countries. On the other hand, standard SEIR based mathematical models have been developed to understand the disease dynamics of COVID-19, and the proper inclusion of these restrictions is the rate-limiting step for the success of these models. In this work, we have developed a hybrid Susceptible-Exposed-Infected-Quarantined-Removed (SEIQR) model to explore the influence of quarantine and lockdown on disease propagation dynamics. The model is multi-compartmental, and it considers everyday variations in lockdown regulations, testing rate and quarantined individuals. Our model predicts a considerable difference between reported and actual recovered and deceased cases, in qualitative agreement with recent reports.
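To make the compartment structure concrete, here is a minimal generic SEIQR sketch integrated with explicit Euler steps. It is not the hybrid model from the paper, whose equations are not given here: the flow terms, the parameter names (`beta`, `sigma`, `q`, `gamma`) and all numerical values are assumptions chosen only for illustration, and daily variation in lockdown or testing is not modelled.

```python
def seiqr_step(state, beta, sigma, q, gamma, dt):
    """One explicit Euler step of a generic SEIQR model.

    state = (S, E, I, Q, R); beta: transmission rate, sigma: E->I rate,
    q: I->Q quarantine rate, gamma: removal rate from I and Q.
    """
    S, E, I, Q, R = state
    N = S + E + I + Q + R
    new_exposed = beta * S * I / N   # quarantined individuals do not transmit
    dS = -new_exposed
    dE = new_exposed - sigma * E
    dI = sigma * E - (q + gamma) * I
    dQ = q * I - gamma * Q
    dR = gamma * (I + Q)
    return (S + dS * dt, E + dE * dt, I + dI * dt, Q + dQ * dt, R + dR * dt)


def simulate(state, days, dt=0.1, **params):
    """Integrate for the given number of days and return daily snapshots."""
    steps_per_day = round(1 / dt)
    history = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = seiqr_step(state, dt=dt, **params)
        history.append(state)
    return history


# Hypothetical starting population and parameters.
initial = (9990.0, 0.0, 10.0, 0.0, 0.0)
history = simulate(initial, days=60, beta=0.5, sigma=0.2, q=0.1, gamma=0.1)
```

A time-dependent `beta` or `q` passed per day would be one simple way to represent changing lockdown rules or testing rates in such a sketch.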
|
37
|
Kulandaisamy A, Nikam R, Harini K, Sharma D, Gromiha MM. Illustrative Tutorials for ProThermDB: Thermodynamic Database for Proteins and Mutants. Curr Protoc 2021; 1:e306. [PMID: 34826364 DOI: 10.1002/cpz1.306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
ProThermDB (https://web.iitm.ac.in/bioinfo2/prothermdb/index.html) is a primary resource for protein stability, which contains experimentally determined thermodynamic data for proteins and their mutants. The most recent version of ProThermDB accumulates the data obtained from both high- and low-throughput experimental biophysical methods. It includes comprehensive information at four different levels, i.e.: (i) protein sequence and structure; (ii) experimental conditions; (iii) thermodynamic parameters such as Gibbs free energy, melting temperature, enthalpy, etc.; and (iv) literature. In the following protocols, we present detailed tutorials for retrieving data using different search, display and sorting options, interpretation of search results, description of each entry-level information category, data upload and download, cross-links with other databases, and visualization options. This protocol consists of six pictorial exercises, which are useful for biologists/users to understand the contents and organization of data in ProThermDB. Further, potential applications of ProThermDB in protein engineering are discussed. © 2021 Wiley Periodicals LLC. Basic Protocol 1: Retrieval of experimental thermodynamic data for wild-type and mutants of a specific protein using a simple query Basic Protocol 2: Retrieval of stabilizing point mutations, which are located at the interior of α-helical regions, and obtaining data by thermal denaturation methods Basic Protocol 3: Retrieval of destabilizing point mutations, which are in β-sheets of exposed regions, and obtaining data by chemical denaturation methods (urea and GdnHCl) Basic Protocol 4: Retrieval of stabilizing and destabilizing point mutations in a range of physiological conditions (pH: 6-9 and T: 20°C-25°C) and publication years (2010-2020) Support Protocol: Downloading the entire data of the database for academic research purposes and submission of new data in ProThermDB.
|
38
|
Harini K, Srivastava A, Kulandaisamy A, Gromiha MM. ProNAB: database for binding affinities of protein-nucleic acid complexes and their mutants. Nucleic Acids Res 2021; 50:D1528-D1534. [PMID: 34606614 PMCID: PMC8728258 DOI: 10.1093/nar/gkab848] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 08/12/2021] [Revised: 09/08/2021] [Accepted: 09/10/2021] [Indexed: 11/16/2022] Open
Abstract
Protein–nucleic acid interactions are involved in various biological processes such as gene expression, replication, transcription, translation and packaging. The binding affinities of protein–DNA and protein–RNA complexes are important for elucidating the mechanism of protein–nucleic acid recognition. Although experimental data on binding affinity are reported abundantly in the literature, no well-curated database is currently available for protein–nucleic acid binding affinity. We have developed a database, ProNAB, which contains more than 20 000 experimental data for the binding affinities of protein–DNA and protein–RNA complexes. Each entry provides comprehensive information on sequence and structural features of a protein, nucleic acid and its complex, experimental conditions, thermodynamic parameters such as dissociation constant (Kd), binding free energy (ΔG) and change in binding free energy upon mutation (ΔΔG), and literature information. ProNAB is cross-linked with GenBank, UniProt, PDB, ProThermDB, PROSITE, DisProt and Pubmed. It provides a user-friendly web interface with options for search, display, sorting, visualization, download and upload the data. ProNAB is freely available at https://web.iitm.ac.in/bioinfo2/pronab/ and it has potential applications such as understanding the factors influencing the affinity, development of prediction tools, binding affinity change upon mutation and design complexes with the desired affinity.
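The three thermodynamic quantities listed in this abstract are linked by the standard relation ΔG = RT ln(Kd), with Kd expressed in molar units relative to a 1 M standard state. The sketch below applies that relation; the function names are hypothetical, the default temperature of 298.15 K is an assumption, and the sign convention ΔΔG = ΔG(mutant) − ΔG(wild type) is one common choice that should be checked against the convention used by any given dataset.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Binding free energy in kcal/mol from a dissociation constant in M."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def ddg_upon_mutation(kd_wild, kd_mutant, temperature_k=298.15):
    """Change in binding free energy, defined here as mutant minus wild type.

    With this convention a positive value means the mutation weakens binding.
    """
    return (binding_free_energy(kd_mutant, temperature_k)
            - binding_free_energy(kd_wild, temperature_k))


# Illustrative values only: a 1 nM complex whose mutant binds at 10 nM.
dg_wild = binding_free_energy(1e-9)        # about -12.3 kcal/mol
ddg = ddg_upon_mutation(1e-9, 1e-8)        # about +1.36 kcal/mol
```

A tenfold change in Kd at room temperature corresponds to roughly 1.4 kcal/mol, which is a useful rule of thumb when reading ΔΔG values.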
|
39
|
Prabakaran R, Rawat P, Kumar S, Gromiha MM. Erratum to: Evaluation of in silico tools for the prediction of protein and peptide aggregation on diverse datasets. Brief Bioinform 2021; 23:6367722. [PMID: 34499131 DOI: 10.1093/bib/bbab369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2021] [Revised: 05/18/2021] [Accepted: 06/02/2021] [Indexed: 11/13/2022] Open
|
40
|
Fernandez RA, Quimque MT, Notarte KI, Manzano JA, Pilapil DY, de Leon VN, San Jose JJ, Villalobos O, Muralidharan NH, Gromiha MM, Brogi S, Macabeo APG. Myxobacterial depsipeptide chondramides interrupt SARS-CoV-2 entry by targeting its broad, cell tropic spike protein. J Biomol Struct Dyn 2021; 40:12209-12220. [PMID: 34463219 PMCID: PMC8436362 DOI: 10.1080/07391102.2021.1969281] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/27/2021] [Accepted: 08/12/2021] [Indexed: 12/24/2022]
Abstract
The severity of the COVID-19 pandemic has necessitated the search for drugs against SARS-CoV-2. In this study, we explored via in silico approaches myxobacterial secondary metabolites against various receptor-binding regions of the SARS-CoV-2 spike that are responsible for recognition of and attachment to host cell receptors, namely ACE2, GRP78, and NRP1. In general, cyclic depsipeptide chondramides conferred high affinities toward the spike RBD, showing strong binding to the known viral hot spots Arg403, Gln493 and Gln498 and better selectivity compared to most host cell receptors studied. Among them, chondramide C3 (1) exhibited a binding energy which remained relatively constant when docked against most of the spike variants. Chondramide C (2) on the other hand exhibited strong affinity against spike variants identified in the United Kingdom (N501Y), South Africa (N501Y, E484K, K417N) and Brazil (N501Y, E484K, K417T). Chondramide C6 (9) showed the highest binding energy (BE) towards GRP78 RBD. Molecular dynamics simulations were also performed for chondramides 1 and 2 against SARS-CoV-2 spike RBD of the Wuhan wild-type and the South African variant, respectively, where resulting complexes demonstrated dynamic stability within a 120-ns simulation time. Protein-protein binding experiments using HADDOCK illustrated weaker binding affinity for complexed chondramide ligands in the RBD against the studied host cell receptors. The chondramide derivatives in general possessed favorable pharmacokinetic properties, highlighting their potential as prototypic anti-COVID-19 drugs limiting viral attachment and possibly minimizing viral infection. Communicated by Ramaswamy H. Sarma.
|
41
|
Prabakaran R, Rawat P, Yasuo N, Sekijima M, Kumar S, Gromiha MM. Effect of charged mutation on aggregation of a pentapeptide: Insights from molecular dynamics simulations. Proteins 2021; 90:405-417. [PMID: 34460128 DOI: 10.1002/prot.26230] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/16/2021] [Revised: 06/30/2021] [Accepted: 08/24/2021] [Indexed: 12/14/2022]
Abstract
Aggregation of therapeutic monoclonal antibodies (mAbs) can negatively affect their chemistry, manufacturing, and control attributes and lead to undesirable immune responses in patients. Therefore, optimization of lead mAb drug candidates during discovery stages to mitigate aggregation is increasingly becoming an integral part of their developability assessments. The disruption of short sequence motifs called aggregation prone regions (APRs) found in amino acid sequences of mAb candidates can potentially mitigate their aggregation. In this work, we have performed molecular dynamics simulations to study the aggregation of an APR (VLVIY) found in λ light chains of human antibodies and its single point mutant KLVIY. Eighteen different multicopy peptide simulation systems of "VLVIY" and "KLVIY" were constructed by varying their concentrations, temperatures, termini capping, and flanking gate-keeper regions. Within 20 ns of the simulation, peptide "VLVIY" formed an aggregate of 100 peptides at ~0.1 M concentration with a 60% reduction in solvent accessible surface area (SASA). Furthermore, analysis of the SASA change, peptide cluster distribution, and water residence time demonstrated how Val ➔ Lys mutation resists aggregation and improves solubility. Presence of Lys slows down aggregation kinetics via charge-charge repulsions and by raising the kinetic barrier to formation of large oligomers. However, the effect of the Val ➔ Lys mutation is dependent on sequence and structural contexts around the APR. This mutation also alters the solvation shell around the peptide by favoring solute-solvent interactions, thereby increasing its solubility. This work has provided a detailed mechanistic explanation of how APR disruption can mitigate aggregation in biotherapeutics and improve their developability.
|
42
|
Rawat P, Prabakaran R, Kumar S, Gromiha MM. Exploring the sequence features determining amyloidosis in human antibody light chains. Sci Rep 2021; 11:13785. [PMID: 34215782 PMCID: PMC8253744 DOI: 10.1038/s41598-021-93019-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/09/2021] [Accepted: 06/18/2021] [Indexed: 02/06/2023] Open
Abstract
The light chain (AL) amyloidosis is caused by the aggregation of light chain of antibodies into amyloid fibrils. There are plenty of computational resources available for the prediction of short aggregation-prone regions within proteins. However, it is still a challenging task to predict the amyloidogenic nature of the whole protein using sequence/structure information. In the case of antibody light chains, common architecture and known binding sites can provide vital information for the prediction of amyloidogenicity at physiological conditions. Here, in this work, we have compared classical sequence-based, aggregation-related features (such as hydrophobicity, presence of gatekeeper residues, disorderness, β-propensity, etc.) calculated for the CDR, FR or VL regions of amyloidogenic and non-amyloidogenic antibody light chains and implemented the insights gained in a machine learning-based webserver called "VLAmY-Pred" ( https://web.iitm.ac.in/bioinfo2/vlamy-pred/ ). The model shows prediction accuracy of 79.7% (sensitivity: 78.7% and specificity: 79.9%) with a ROC value of 0.88 on a dataset of 1828 variable region sequences of the antibody light chains. This model will be helpful towards improved prognosis for patients that may likely suffer from diseases caused by light chain amyloidosis, understanding origins of aggregation in antibody-based biotherapeutics, large-scale in-silico analysis of antibody sequences generated by next generation sequencing, and finally towards rational engineering of aggregation resistant antibodies.
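The accuracy, sensitivity and specificity quoted in this abstract are standard confusion-matrix quantities. A minimal sketch of how they are computed follows; the function name and the counts in the usage example are made up for illustration and are not taken from the VLAmY-Pred dataset.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn : positive (e.g. amyloidogenic) cases predicted correctly/incorrectly
    tn/fp : negative (e.g. non-amyloidogenic) cases predicted correctly/incorrectly
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,   # true positive rate
        "specificity": tn / negatives,   # true negative rate
    }


# Illustrative counts only (not from the paper).
metrics = classification_metrics(tp=80, fn=20, tn=150, fp=50)
```

When the two classes differ greatly in size, accuracy is dominated by the larger class, which is why sensitivity and specificity are reported alongside it.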
|
43
|
Prabakaran R, Rawat P, Kumar S, Gromiha MM. Evaluation of in silico tools for the prediction of protein and peptide aggregation on diverse datasets. Brief Bioinform 2021; 22:6309925. [PMID: 34181000 DOI: 10.1093/bib/bbab240] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/08/2021] [Revised: 05/18/2021] [Accepted: 06/02/2021] [Indexed: 01/09/2023] Open
Abstract
Several prediction algorithms and tools have been developed in the last two decades to predict protein and peptide aggregation. These in silico tools aid to predict the aggregation propensity and amyloidogenicity as well as the identification of aggregation-prone regions. Despite the immense interest in the field, it is of prime importance to systematically compare these algorithms for their performance. In this review, we have provided a rigorous performance analysis of nine prediction tools using a variety of assessments. The assessments were carried out on several non-redundant datasets ranging from hexapeptides to protein sequences as well as amyloidogenic antibody light chains to soluble protein sequences. Our analysis reveals the robustness of the current prediction tools and the scope for improvement in their predictive performances. Insights gained from this work provide critical guidance to the scientific community on advantages and limitations of different aggregation prediction methods and make informed decisions about their research needs.
|
44
|
Yesudhas D, Srivastava A, Sekijima M, Gromiha MM. Tackling Covid-19 using disordered-to-order transition of residues in the spike protein upon angiotensin-converting enzyme 2 binding. Proteins 2021; 89:1158-1166. [PMID: 33893649 PMCID: PMC8251098 DOI: 10.1002/prot.26088] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/23/2020] [Revised: 02/18/2021] [Accepted: 04/09/2021] [Indexed: 01/09/2023]
Abstract
The 2019-novel coronavirus also known as severe acute respiratory syndrome-coronavirus 2 (SARS-CoV-2) is a common threat to animals and humans, and is responsible for the human SARS pandemic in 2019 to 2021. The infection of SARS-CoV-2 in humans involves a viral surface glycoprotein, the spike protein, which binds to the human angiotensin-converting enzyme 2 (ACE2) protein. Particularly, the receptor binding domains (RBDs) mediate the interaction and contain several disordered regions, which help in the binding. Investigations on the influence of disordered residues/regions in stability and binding of spike protein with ACE2 help to understand the disease pathogenesis, which has not yet been studied. In this study, we have used molecular-dynamics simulations to characterize the structural changes in disordered regions of the spike protein that result from ACE2 binding. We observed that the disordered regions undergo disorder-to-order transition (DOT) upon binding with ACE2, and the DOT residues are located at functionally important regions of RBD. Although the RBD has a rigid structure, DOT residues make conformational rearrangements for the spike protein to attach with ACE2. The binding is strengthened via hydrophilic and aromatic amino acids mainly present in the DOTs. The positively correlated motions of the DOT residues with their nearby residues also explain the binding profile of RBD with ACE2, and the residues are observed to contribute more favorable binding energies for the spike-ACE2 complex formation. This study emphasizes that intrinsically disordered residues in the RBD of spike protein may provide insights into its etiology and be useful for drug and vaccine discovery.
|
45
|
Yesudhas D, Srivastava A, Gromiha MM. COVID-19 outbreak: history, mechanism, transmission, structural studies and therapeutics. Infection 2021; 49:199-213. [PMID: 32886331 PMCID: PMC7472674 DOI: 10.1007/s15010-020-01516-2] [Citation(s) in RCA: 98] [Impact Index Per Article: 32.7] [Received: 07/06/2020] [Accepted: 08/25/2020] [Indexed: 01/08/2023]
Abstract
PURPOSE The coronavirus outbreak emerged as a severe pandemic, claiming more than 0.8 million lives across the world and raised a major global health concern. We survey the history and mechanism of coronaviruses, and the structural characteristics of the spike protein and its key residues responsible for human transmissions. METHODS We have carried out a systematic review to summarize the origin, transmission and etiology of COVID-19. The structural analysis of the spike protein and its disordered residues explains the mechanism of the viral transmission. A meta-data analysis of the therapeutic compounds targeting the SARS-CoV-2 is also included. RESULTS Coronaviruses can cross the species barrier and infect humans with unexpected consequences for public health. The transmission rate of SARS-CoV-2 infection is higher compared to that of the closely related SARS-CoV infections. In SARS-CoV-2 infection, intrinsically disordered regions are observed at the interface of the spike protein and ACE2 receptor, providing a shape complementarity to the complex. The key residues of the spike protein have stronger binding affinity with ACE2. These can be probable reasons for the higher transmission rate of SARS-CoV-2. In addition, we have also discussed the therapeutic compounds and the vaccines to target SARS-CoV-2, which can help researchers to develop effective drugs/vaccines for COVID-19. The overall history and mechanism of entry of SARS-CoV-2 along with structural study of spike-ACE2 complex provide insights to understand disease pathogenesis and development of vaccines and drugs.
|
46
|
Srivastava A, Yesudhas D, Ahmad S, Gromiha MM. Understanding disorder-to-order transitions in protein-RNA complexes using molecular dynamics simulations. J Biomol Struct Dyn 2021; 40:7915-7925. [PMID: 33779503 DOI: 10.1080/07391102.2021.1904005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
Abstract
Intrinsically disordered regions (IDRs) in proteins are characterized by their flexibilities and low complexity regions, which lack unique 3D structures in solution. IDRs play a significant role in signaling, regulation, and binding multiple partners, including DNA, RNA, and proteins. Although various experiments have shown the role of disordered regions in binding with RNA, a detailed computational analysis is required to understand their binding and recognition mechanism. In this work, we performed molecular dynamics simulations of 10 protein-RNA complexes to understand the binding governed by intrinsically disordered regions. The simulation results show that most of the disordered regions are important for RNA-binding and have a transition from disordered-to-ordered conformation upon binding, which often contributes significantly towards the binding affinity. Interestingly, most of the disordered residues are present at the interface or located as a linker between two regions having similar movements. The disorder-to-order transition (DOT) regions overlap with or are flanked by experimentally reported functionally important residues in the recognition of protein-RNA complexes. This study provides additional insights for understanding the role and recognition mechanism of disordered regions in protein-RNA complexes. Communicated by Ramaswamy H. Sarma.
|
47
|
Dharshini SAP, Jemimah S, Taguchi YH, Gromiha MM. Exploring Common Therapeutic Targets for Neurodegenerative Disorders Using Transcriptome Study. Front Genet 2021; 12:639160. [PMID: 33815473 PMCID: PMC8017312 DOI: 10.3389/fgene.2021.639160] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/08/2020] [Accepted: 02/22/2021] [Indexed: 11/13/2022] Open
Abstract
Alzheimer's disease (AD) and Parkinson's disease (PD) are well-known neuronal degenerative disorders that share common pathological events. Approved medications alleviate symptoms but do not address the root cause of the disease. Energy dysfunction in the neuronal population leads to various pathological events and ultimately results in neuronal death. Identifying common therapeutic targets for these disorders may help in the drug discovery process. The Brodmann area 9 (BA9) region is affected in both the disease conditions and plays an essential role in cognitive, motor, and memory-related functions. Analyzing transcriptome data of BA9 provides deep insights related to common pathological pathways involved in AD and PD. In this work, we map the preprocessed BA9 fastq files generated by RNA-seq for disease and control samples with reference hg38 genomic assembly and identify common variants and differentially expressed genes (DEG). These variants are predominantly located in the 3' UTR (non-promoter) region, affecting the conserved transcription factor (TF) binding motifs involved in the methylation and acetylation process. We have constructed BA9-specific functional interaction networks, which show the relationship between TFs and DEGs. Based on expression signature analysis, we propose that MAPK1, VEGFR1/FLT1, and FGFR1 are promising drug targets to restore blood-brain barrier functionality by reducing neuroinflammation and may save neurons.
|
48
|
Prabakaran R, Rawat P, Thangakani AM, Kumar S, Gromiha MM. Protein aggregation: in silico algorithms and applications. Biophys Rev 2021; 13:71-89. [PMID: 33747245 PMCID: PMC7930180 DOI: 10.1007/s12551-021-00778-w] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 11/06/2020] [Accepted: 01/01/2021] [Indexed: 01/08/2023] Open
Abstract
Protein aggregation is a topic of immense interest to the scientific community due to its role in several neurodegenerative diseases/disorders and industrial importance. Several in silico techniques, tools, and algorithms have been developed to predict aggregation in proteins and understand the aggregation mechanisms. This review attempts to provide an essence of the vast developments in in silico approaches, resources available, and future perspectives. It reviews aggregation-related databases, mechanistic models (aggregation-prone region and aggregation propensity prediction), kinetic models (aggregation rate prediction), and molecular dynamics studies related to aggregation. With a multitude of prediction models related to aggregation already available to the scientific community, the field of protein aggregation is rapidly maturing to tackle new applications.
|
49
|
Pandey M, Gromiha MM. Predicting potential residues associated with lung cancer using deep neural network. Mutat Res 2021; 822:111737. [PMID: 33508631 DOI: 10.1016/j.mrfmmm.2020.111737] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/25/2020] [Revised: 10/30/2020] [Accepted: 12/09/2020] [Indexed: 12/31/2022]
Abstract
Lung cancer is a prominent type of cancer, which leads to a high mortality rate worldwide. The major lung cancers, lung adenocarcinoma (LUAD) and lung squamous carcinoma (LUSC), occur mainly due to somatic driver mutations in proteins, and screening of such mutations is often cost and time intensive. Hence, in the present study, we systematically analyzed the preferred residues, residue pairs and motifs of 4172 disease prone sites in 195 proteins and compared them with 4137 neutral sites. We observed that the motifs LG, QF and TST are preferred in disease prone sites whereas GK, KA and ISL are predominant in neutral sites. In addition, Gly, Asp, Glu, Gln and Trp are preferred in disease prone sites whereas Ile, Val, Lys, Asn and Phe are preferred in neutral sites. Further, utilizing deep neural networks, we have developed a method for predicting disease prone sites with amino acid sequence based features such as physicochemical properties, conservation scores, secondary structure and di- and tri-peptide motifs. The model is able to predict the disease prone sites at an accuracy of 81% with sensitivity, specificity and AUC of 82%, 78% and 0.91, respectively, on 10-fold cross-validation. When the model was tested with a set of 417 disease-causing and 413 neutral sites, we obtained an accuracy and AUC of 80% and 0.89, respectively. We suggest that our method can serve as an effective method to identify the disease causing and neutral sites in lung cancer. We have developed a web server CanProSite for identifying the disease prone sites and it is freely available at https://web.iitm.ac.in/bioinfo2/CanProSite/.
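The motif comparison described in this abstract amounts to counting short peptide motifs around annotated sites and comparing the counts between two groups. The sketch below shows one simple way to do the counting. It assumes a motif is the run of residues starting at the site itself, which is a simplification made here and not a definition taken from the paper; the function name, the toy sequence and the site positions are all hypothetical.

```python
from collections import Counter


def motif_counts(sequence, sites, length):
    """Count motifs of a given length that start at each 1-based site position.

    Sites too close to the C-terminus to yield a full-length motif are skipped.
    """
    counts = Counter()
    for position in sites:
        start = position - 1  # convert 1-based residue number to 0-based index
        motif = sequence[start:start + length]
        if len(motif) == length:
            counts[motif] += 1
    return counts


# Toy example: a made-up sequence with made-up site positions.
toy_sequence = "MLGKTSTLG"
toy_sites = [2, 5, 8]
dipeptides = motif_counts(toy_sequence, toy_sites, 2)
tripeptides = motif_counts(toy_sequence, toy_sites, 3)
```

Running the same function on disease prone and neutral site lists and comparing the resulting counters gives the kind of preference table the abstract summarizes, for example by ranking motifs on the difference in relative frequency.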
|
50
|
Nikam R, Kulandaisamy A, Harini K, Sharma D, Gromiha MM. ProThermDB: thermodynamic database for proteins and mutants revisited after 15 years. Nucleic Acids Res 2021; 49:D420-D424. [PMID: 33196841 PMCID: PMC7778892 DOI: 10.1093/nar/gkaa1035] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Received: 09/14/2020] [Revised: 10/14/2020] [Accepted: 10/26/2020] [Indexed: 11/12/2022] Open
Abstract
ProThermDB is an updated version of the thermodynamic database for proteins and mutants (ProTherm), which has ∼31 500 data on protein stability, an increase of 84% from the previous version. It contains several thermodynamic parameters such as melting temperature, free energy obtained with thermal and denaturant denaturation, enthalpy change and heat capacity change along with experimental methods and conditions, sequence, structure and literature information. Besides, the current version of the database includes about 120 000 thermodynamic data obtained for different organisms and cell lines, which are determined by recent high throughput proteomics techniques using whole-cell approaches. In addition, we provided a graphical interface for visualization of mutations at sequence and structure levels. ProThermDB is cross-linked with other relevant databases, PDB, UniProt, PubMed etc. It is freely available at https://web.iitm.ac.in/bioinfo2/prothermdb/index.html without any login requirements. It is implemented in Python, HTML and JavaScript, and supports the latest versions of major browsers, such as Firefox, Chrome and Safari.
|