1
Aravind A, Nandakumar R, Ahmed M, Nisar M, Palollathil A, Kanichery A, Sreelan S, Sinan KM, Balaya RDA, Vijayakumar M, Prasad TSK, Raju R. REMEMProt: a resource of membrane-enriched proteome profiles, their disease associations, and biomarker status. Life Sci Alliance 2024; 7:e202302443. [PMID: 38719747 PMCID: PMC11077588 DOI: 10.26508/lsa.202302443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 04/26/2024] [Accepted: 04/26/2024] [Indexed: 05/12/2024] Open
Abstract
The differential expression of plasma membrane proteins is analyzed for diagnostic, prognostic, and therapeutic applications in diverse clinical manifestations. Their global and relative quantitation necessarily relies on distinct membrane protein enrichment methods and mass spectrometry platforms. As the first resource of its kind, we compiled membrane-associated proteomes in human and mouse systems into a database named Resource of Experimental Membrane-Enriched Mass spectrometry-derived Proteome (REMEMProt). It currently hosts 14,626 proteins (9,507 proteins in Homo sapiens; 5,119 proteins in Mus musculus) with information on their membrane-protein enrichment methods, experimental/physiological context of detection in cells or tissues, transmembrane domain analysis, and their current attribution as biomarkers. Based on these annotations and the transmembrane domain analysis in proteins or their binary/complex protein-protein interactors, REMEMProt facilitates the assessment of the plasma membrane localization potential of proteins through batch query. A cross-study enrichment analysis platform is enabled in REMEMProt for comparative analysis of proteomes using novel/modified membrane enrichment methods and evaluation of methods for targeted enrichment of membrane proteins. REMEMProt data are freely accessible to explore and download at https://rememprot.ciods.in/.
Affiliation(s)
- Anjana Aravind
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India
- Revathy Nandakumar
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India
- Mukhtar Ahmed
  - https://ror.org/02f81g417 Department of Zoology, College of Science, King Saud University, Riyadh, Kingdom of Saudi Arabia
- Mahammad Nisar
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore, India
- Akhina Palollathil
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India
- Anagha Kanichery
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India
- Sourav Sreelan
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore, India
  - Yenepoya Institute of Technology, Yenepoya (Deemed to be University), Mangalore, India
- Kp Munavvar Sinan
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore, India
  - Yenepoya Institute of Technology, Yenepoya (Deemed to be University), Mangalore, India
- Manavalan Vijayakumar
  - Department of Surgical Oncology, Yenepoya Medical College, Yenepoya (Deemed to be University), Mangalore, India
- Rajesh Raju
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to be University), Mangalore, India
2
Duart G, Graña-Montes R, Pastor-Cantizano N, Mingarro I. Experimental and computational approaches for membrane protein insertion and topology determination. Methods 2024; 226:102-119. [PMID: 38604415 DOI: 10.1016/j.ymeth.2024.03.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 03/13/2024] [Accepted: 03/22/2024] [Indexed: 04/13/2024] Open
Abstract
Membrane proteins play pivotal roles in a wide array of cellular processes and constitute approximately a quarter of the protein-coding genes across all organisms. Despite their ubiquity and biological significance, our understanding of these proteins remains notably less comprehensive compared to their soluble counterparts. This disparity in knowledge can be attributed, in part, to the inherent challenges associated with employing specialized techniques for the investigation of membrane protein insertion and topology. This review will center on a discussion of molecular biology methodologies and computational prediction tools designed to elucidate the insertion and topology of helical membrane proteins.
Affiliation(s)
- Gerard Duart
  - Departament de Bioquímica i Biologia Molecular, Institut Universitari de Biotecnologia i Biomedicina (BIOTECMED), Universitat de València, E-46100 Burjassot, Spain
- Ricardo Graña-Montes
  - Departament de Bioquímica i Biologia Molecular, Institut Universitari de Biotecnologia i Biomedicina (BIOTECMED), Universitat de València, E-46100 Burjassot, Spain
- Noelia Pastor-Cantizano
  - Departament de Bioquímica i Biologia Molecular, Institut Universitari de Biotecnologia i Biomedicina (BIOTECMED), Universitat de València, E-46100 Burjassot, Spain
- Ismael Mingarro
  - Departament de Bioquímica i Biologia Molecular, Institut Universitari de Biotecnologia i Biomedicina (BIOTECMED), Universitat de València, E-46100 Burjassot, Spain
3
Li H, Sun X, Cui W, Xu M, Dong J, Ekundayo BE, Ni D, Rao Z, Guo L, Stahlberg H, Yuan S, Vogel H. Computational drug development for membrane protein targets. Nat Biotechnol 2024; 42:229-242. [PMID: 38361054 DOI: 10.1038/s41587-023-01987-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/13/2023] [Accepted: 09/13/2023] [Indexed: 02/17/2024]
Abstract
The application of computational biology in drug development for membrane protein targets has experienced a boost from recent developments in deep learning-driven structure prediction, increased speed and resolution of structure elucidation, machine learning structure-based design and the evaluation of big data. Recent protein structure predictions based on machine learning tools have delivered surprisingly reliable results for water-soluble and membrane proteins but have limitations for development of drugs that target membrane proteins. Structural transitions of membrane proteins have a central role during transmembrane signaling and are often influenced by therapeutic compounds. Resolving the structural and functional basis of dynamic transmembrane signaling networks, especially within the native membrane or cellular environment, remains a central challenge for drug development. Tackling this challenge will require an interplay between experimental and computational tools, such as super-resolution optical microscopy for quantification of the molecular interactions of cellular signaling networks and their modulation by potential drugs, cryo-electron microscopy for determination of the structural transitions of proteins in native cell membranes and entire cells, and computational tools for data analysis and prediction of the structure and function of cellular signaling networks, as well as generation of promising drug candidates.
Affiliation(s)
- Haijian Li
  - Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
- Xiaolin Sun
  - Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
- Wenqiang Cui
  - Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
  - University of Chinese Academy of Sciences, Beijing, China
- Marc Xu
  - Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
  - University of Chinese Academy of Sciences, Beijing, China
- Junlin Dong
  - Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
  - University of Chinese Academy of Sciences, Beijing, China
- Babatunde Edukpe Ekundayo
  - Laboratory of Biological Electron Microscopy, IPHYS, SB, EPFL and Department of Fundamental Microbiology, Faculty of Biology and Medicine, University of Lausanne, Lausanne, Switzerland
- Dongchun Ni
  - Laboratory of Biological Electron Microscopy, IPHYS, SB, EPFL and Department of Fundamental Microbiology, Faculty of Biology and Medicine, University of Lausanne, Lausanne, Switzerland
- Zhili Rao
  - Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
- Liwei Guo
  - Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
- Henning Stahlberg
  - Laboratory of Biological Electron Microscopy, IPHYS, SB, EPFL and Department of Fundamental Microbiology, Faculty of Biology and Medicine, University of Lausanne, Lausanne, Switzerland
- Shuguang Yuan
  - Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
- Horst Vogel
  - Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
  - Institut des Sciences et Ingénierie Chimiques (ISIC), Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
4
Liu Z, Bao Y, Wang W, Pan L, Wang H, Lin GN. Emden: A novel method integrating graph and transformer representations for predicting the effect of mutations on clinical drug response. Comput Biol Med 2023; 167:107678. [PMID: 37976823 DOI: 10.1016/j.compbiomed.2023.107678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 10/22/2023] [Accepted: 11/06/2023] [Indexed: 11/19/2023]
Abstract
Precision medicine based on personalized genomics provides promising strategies to enhance the efficacy of molecular-targeted therapies. However, the clinical effectiveness of drugs has been severely limited by genetic variations that lead to drug resistance. Predicting the impact of missense mutations on clinical drug response is an essential way to reduce the cost of clinical trials and understand genetic diseases. Here, we present Emden, a novel method integrating graph and transformer representations that predicts the effect of missense mutations on drug response through binary classification with interpretability. Emden utilized protein sequence-based features and drug structures as inputs for rapid prediction, employing competitive representation learning and demonstrating strong generalization capabilities and robustness. Our study showed promising potential for clinical drug guidance and deep insight into computer-assisted precision medicine. Emden is freely available as a web server at https://www.psymukb.net/Emden.
Affiliation(s)
- Zhe Liu
  - Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Yihang Bao
  - Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Weidi Wang
  - Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Liangwei Pan
  - Department of Thoracic Surgery, Zhongshan Hospital, Fudan University, Shanghai, China
- Han Wang
  - School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
- Guan Ning Lin
  - Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
5
Abstract
A wide range of biomaterials and engineered cell surfaces are composed of bioconjugates embedded in liposome membranes, surface-immobilized bilayers, or the plasma membranes of living cells. This review article summarizes the various ways that Nature anchors integral and peripheral proteins in a cell membrane and describes the strategies devised by chemical biologists to label a membrane protein in living cells. Also discussed are modern synthetic and semisynthetic methods to produce lipidated proteins. Subsequent sections describe methods to anchor a three-component synthetic construct that is composed of a lipophilic membrane anchor, hydrophilic linker, and exposed functional component. The surface exposed payload can be a fluorophore, aptamer, oligonucleotide, polypeptide, peptide nucleic acid, polysaccharide, branched dendrimer, or linear polymer. Hydrocarbon chains are commonly used as the membrane anchor, and a general experimental trend is that a two chain lipid anchor has higher membrane affinity than a cholesteryl or single chain lipid anchor. Amphiphilic fluorescent dyes are effective molecular probes for cell membrane imaging and a zwitterionic linker between the fluorophore and the lipid anchor promotes high persistence in the plasma membrane of living cells. A relatively new advance is the development of switchable membrane anchors as molecular tools for fundamental studies or as technology platforms for applied biomaterials.
Affiliation(s)
- Rananjaya S Gamage
  - Department of Chemistry and Biochemistry, 251 Nieuwland Science Hall, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Jordan L Chasteen
  - Department of Chemistry and Biochemistry, 251 Nieuwland Science Hall, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Bradley D Smith
  - Department of Chemistry and Biochemistry, 251 Nieuwland Science Hall, University of Notre Dame, Notre Dame, Indiana 46556, United States
6
Artificial intelligence-based HDX (AI-HDX) prediction reveals fundamental characteristics to protein dynamics: Mechanisms on SARS-CoV-2 immune escape. iScience 2023; 26:106282. [PMID: 36910327 PMCID: PMC9968663 DOI: 10.1016/j.isci.2023.106282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Revised: 01/10/2023] [Accepted: 02/23/2023] [Indexed: 03/03/2023] Open
Abstract
Three-dimensional structure and dynamics are essential for protein function. Advancements in hydrogen-deuterium exchange (HDX) techniques enable probing protein dynamic information in physiologically relevant conditions. HDX-coupled mass spectrometry (HDX-MS) has been broadly applied in pharmaceutical industries. However, it is challenging to obtain dynamics information at the single amino acid resolution and time consuming to perform the experiments and process the data. Here, we demonstrate the first deep learning model, artificial intelligence-based HDX (AI-HDX), that predicts intrinsic protein dynamics based on the protein sequence. It uncovers the protein structural dynamics by combining deep learning, experimental HDX, sequence alignment, and protein structure prediction. AI-HDX can be broadly applied to drug discovery, protein engineering, and biomedical studies. As a demonstration, we elucidated receptor-binding domain structural dynamics as a potential mechanism of anti-severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) antibody efficacy and immune escape. AI-HDX fundamentally differs from the current AI tools for protein analysis and may transform protein design for various applications.
7
Gao T, Zhao Y, Zhang L, Wang H. Secondary and Topological Structural Merge Prediction of Alpha-Helical Transmembrane Proteins Using a Hybrid Model Based on Hidden Markov and Long Short-Term Memory Neural Networks. Int J Mol Sci 2023; 24:ijms24065720. [PMID: 36982795 PMCID: PMC10057634 DOI: 10.3390/ijms24065720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Revised: 03/11/2023] [Accepted: 03/13/2023] [Indexed: 03/19/2023] Open
Abstract
Alpha-helical transmembrane proteins (αTMPs) play essential roles in drug targeting and disease treatments. Due to the challenges of using experimental methods to determine their structure, αTMPs have far fewer known structures than soluble proteins. The topology of transmembrane proteins (TMPs) can determine the spatial conformation relative to the membrane, while the secondary structure helps to identify their functional domain. They are highly correlated on αTMPs sequences, and achieving a merge prediction is instructive for further understanding the structure and function of αTMPs. In this study, we implemented a hybrid model combining Deep Learning Neural Networks (DNNs) with a Class Hidden Markov Model (CHMM), namely HDNNtopss. DNNs extract rich contextual features through stacked attention-enhanced Bidirectional Long Short-Term Memory (BiLSTM) networks and Convolutional Neural Networks (CNNs), and CHMM captures state-associative temporal features. The hybrid model not only reasonably considers the probability of the state path but also has a fitting and feature-extraction capability for deep learning, which enables flexible prediction and makes the resulting sequence more biologically meaningful. It outperforms current advanced merge-prediction methods with a Q4 of 0.779 and an MCC of 0.673 on the independent test dataset, which have practical, solid significance. In comparison to advanced prediction methods for topological and secondary structures, it achieves the highest topology prediction with a Q2 of 0.884, which has a strong comprehensive performance. At the same time, we implemented a joint training method, Co-HDNNtopss, and achieved a good performance to provide an important reference for similar hybrid-model training.
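The Q4, Q2, and MCC values quoted in this abstract are per-residue agreement scores. As a minimal sketch, assuming predictions and reference labels are equal-length strings with one state letter per residue, Q-accuracy is the fraction of matching positions, and a binary MCC can be computed for one state against all others. The function names and the `M`/`N` label alphabet are illustrative assumptions, not taken from HDNNtopss, and the paper may use a multi-class form of MCC instead.

```python
import math


def q_accuracy(predicted, observed):
    """Fraction of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def binary_mcc(predicted, observed, positive):
    """Matthews correlation coefficient for one state versus all others."""
    tp = fp = tn = fn = 0
    for p, o in zip(predicted, observed):
        if p == positive and o == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif o == positive:
            fn += 1
        else:
            tn += 1
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical labels: M = membrane residue, N = non-membrane residue.
    observed = "MMMNNNNN"
    predicted = "MMMMNNNN"
    print(q_accuracy(predicted, observed))
    print(binary_mcc(predicted, observed, "M"))
```

With two states the first function corresponds to a Q2-style score, and with four states to a Q4-style score; only the label alphabet changes.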
Affiliation(s)
- Ting Gao
  - School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun 130117, China; (T.G.); (Y.Z.)
- Yutong Zhao
  - School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun 130117, China; (T.G.); (Y.Z.)
- Li Zhang
  - School of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China;
- Han Wang
  - School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun 130117, China; (T.G.); (Y.Z.)
  - Correspondence:
8
Konda Mani S, Thiyagarajan R, Yli-Harja O, Kandhavelu M, Murugesan A. Structural analysis of human G-protein-coupled receptor 17 ligand binding sites. J Cell Biochem 2023; 124:533-544. [PMID: 36791278 DOI: 10.1002/jcb.30388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Revised: 01/17/2023] [Accepted: 02/03/2023] [Indexed: 02/17/2023]
Abstract
The human G protein-coupled membrane receptor GPR17, a sensor of brain damage, has been identified as a biomarker for many neurological diseases. In human brain tissue, GPR17 exists in two isoforms, long and short. While cryo-electron microscopy has provided the structure of the long isoform of GPR17 in complex with Gi, the structure of the short isoform and its activation mechanism remain unclear. Recently, we theoretically modeled the structure of the short isoform of GPR17 with Gi signaling protein and identified novel ligands. In the present work, we demonstrated the presence of two distinct ligand binding sites in the short isoform of GPR17. Molecular docking of GPR17 with endogenous (UDP) and synthetic ligands (T0510.3657, MDL29950) identified two distinct binding pockets. Our observations revealed that the endogenous ligand UDP binds more strongly in the two binding pockets, as evidenced by Glide and AutoDock Vina scores, whereas the other two ligands bind GPR17 with weaker docking scores. Analysis of the receptor-UDP interactions shows that the complexes are stable in the lipid environment over 100 ns atomic molecular dynamics simulations. The amino acid residues VAL83, ARG87, and PHE111 constitute ligand binding site 1, whereas ASN67, ARG129, and LYS232 constitute site 2. Root mean square fluctuation analysis showed that residues 83, 87, and 232 had higher fluctuations during molecular dynamics simulation in both binding pockets. Our findings imply that the residues of GPR17's two binding sites are crucial, and their interaction with UDP reveals the protein's hidden signaling and communication properties. Furthermore, this finding may assist in the development of targeted therapies for the treatment of neurological diseases.
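The root mean square fluctuation (RMSF) analysis mentioned in this abstract measures how far each atom or residue moves around its time-averaged position. The following is a minimal sketch, assuming the trajectory is a list of frames that have already been aligned to a common reference and that each frame holds one `(x, y, z)` tuple per atom. The function name `rmsf` and the data layout are illustrative assumptions, not the authors' analysis pipeline.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF for pre-aligned frames of (x, y, z) coordinates."""
    if not trajectory:
        raise ValueError("trajectory must contain at least one frame")
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for atom in range(n_atoms):
        # Time-averaged position of this atom over all frames.
        mean = tuple(
            sum(frame[atom][axis] for frame in trajectory) / n_frames
            for axis in range(3)
        )
        # Mean squared displacement from the average position.
        msd = sum(
            math.dist(frame[atom], mean) ** 2 for frame in trajectory
        ) / n_frames
        fluctuations.append(math.sqrt(msd))
    return fluctuations


if __name__ == "__main__":
    # Two hypothetical frames: atom 0 moves along x, atom 1 stays fixed.
    frames = [
        [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
        [(2.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    ]
    print(rmsf(frames))
```

Residues with larger values in the returned list are the more mobile ones, which is the sense in which the abstract reports higher fluctuations for residues 83, 87, and 232.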
Affiliation(s)
- Saravanan Konda Mani
  - Department of Biotechnology, Bharath Institute of Higher Education & Research, Chennai, Tamilnadu, India
- Ramesh Thiyagarajan
  - Department of Basic Medical Sciences, College of Medicine, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
- Olli Yli-Harja
  - Computational Systems Biology Group, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - Institute for Systems Biology, Seattle, Washington, USA
- Meenakshisundaram Kandhavelu
  - Molecular Signaling Group, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - BioMeditech and Tays Cancer Center, Tampere University Hospital, Tampere, Finland
- Akshaya Murugesan
  - BioMeditech and Tays Cancer Center, Tampere University Hospital, Tampere, Finland
  - Department of Biotechnology, Lady Doak College, Madurai Kamaraj University, Madurai, India
9
Nallasamy V, Seshiah M. Energy Profile Bayes and Thompson Optimized Convolutional Neural Network protein structure prediction. Neural Comput Appl 2023; 35:1983-2006. [PMID: 36245797 PMCID: PMC9542649 DOI: 10.1007/s00521-022-07868-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2021] [Accepted: 09/21/2022] [Indexed: 01/12/2023]
Abstract
In living organisms, proteins are considered the executants of biological functions. Because protein structure plays a pivotal role in protein folding patterns, comprehending it is a challenging issue. Moreover, owing to the large number of protein sequences deposited in protein data banks and the complexity of protein structures, experimental methods are found to be inadequate for protein structural class prediction. Hence, it is advantageous to design a reliable computational method to predict protein structural classes from protein sequences. In recent years there has been elevated interest in using deep learning to assist protein structure prediction, as protein structure prediction models can be utilized to screen a large number of novel sequences. In this regard, we propose a model employing an Energy Profile for atom pairs in conjunction with the Legion-Class Bayes function, called the Energy Profile Legion-Class Bayes Protein Structure Identification model. Following this, we use a Thompson Optimized convolutional neural network to extract features between amino acids, and then the Thompson Optimized SoftMax function is employed to extract associations between protein sequences for predicting secondary protein structure. The proposed Energy Profile Bayes and Thompson Optimized Convolutional Neural Network (EPB-OCNN) method was tested on distinct unique protein data and compared to the state-of-the-art methods: the Template-Based Modeling, Protein Design using Deep Graph Neural Networks, a deep learning-based S-glutathionylation sites prediction tool called a Computational Framework, the Deep Learning and a distance-based protein structure prediction using deep learning. The results, obtained with the Biopython tool, are measured with respect to protein structure prediction time, protein structure prediction accuracy, specificity, recall, F-measure, and precision. The proposed EPB-OCNN method outperformed the state-of-the-art methods, thereby corroborating the objective.
Affiliation(s)
- Varanavasi Nallasamy
  - Cognizant Technology Solutions Pvt. Ltd, CHIL SEZ IT Park, Keeranatham, Saravanam Patti, Coimbatore, Tamil Nadu 641035 India
- Malarvizhi Seshiah
  - Department of Computer Science, Thiruvalluvar Government Arts College, Rasipuram, Namakkal, Tamil Nadu India
10
Ismi DP, Pulungan R, Afiahayati. Deep learning for protein secondary structure prediction: Pre and post-AlphaFold. Comput Struct Biotechnol J 2022; 20:6271-6286. [PMID: 36420164 PMCID: PMC9678802 DOI: 10.1016/j.csbj.2022.11.012] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/06/2022] [Revised: 11/05/2022] [Accepted: 11/05/2022] [Indexed: 11/13/2022] Open
Abstract
This paper aims to provide a comprehensive review of the trends and challenges of deep neural networks for protein secondary structure prediction (PSSP). In recent years, deep neural networks have become the primary method for protein secondary structure prediction. Previous studies showed that deep neural networks have raised the accuracy of three-state secondary structure prediction to more than 80%. Favored deep learning methods, such as convolutional neural networks, recurrent neural networks, inception networks, and graph neural networks, have been implemented in protein secondary structure prediction. Methods adapted from natural language processing (NLP) and computer vision are also employed, including attention mechanisms, ResNet, and U-shape networks. In the post-AlphaFold era, PSSP studies focus on different objectives, such as enhancing the quality of evolutionary information and exploiting protein language models as the PSSP input. The recent trend of using pre-trained language models as input features for secondary structure prediction provides a new direction for PSSP studies. Moreover, the state-of-the-art accuracy achieved by previous PSSP models is still below its theoretical limit. There is still room for improvement in the field.
Affiliation(s)
- Dewi Pramudi Ismi
  - Department of Computer Science and Electronics, Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Yogyakarta, Indonesia
  - Department of Informatics, Faculty of Industrial Technology, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
- Reza Pulungan
  - Department of Computer Science and Electronics, Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Yogyakarta, Indonesia
- Afiahayati
  - Department of Computer Science and Electronics, Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Yogyakarta, Indonesia
11
Evaluation of the Effectiveness of Derived Features of AlphaFold2 on Single-Sequence Protein Binding Site Prediction. BIOLOGY 2022; 11:biology11101454. [PMID: 36290358 PMCID: PMC9598995 DOI: 10.3390/biology11101454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Revised: 09/30/2022] [Accepted: 09/30/2022] [Indexed: 11/06/2022]
Abstract
Simple Summary: With the development of artificial intelligence, researchers can roughly predict the crystal structure of a protein by computer without the need for biological experiments, which provides new ideas and solutions to problems, such as protein-protein interaction and drug-target predictions. In this study, we proposed strategies to combine predicted protein structures with deep learning networks and evaluated them on different protein binding site prediction tasks. Our computational experiment results showed that all proposed strategies could effectively encode structural information for deep learning models.
Abstract: Though AlphaFold2 has attained considerably high precision on protein structure prediction, it is reported that directly inputting coordinates into deep learning networks cannot achieve desirable results on downstream tasks. Thus, how to process and encode the predicted results into effective forms that deep learning models can understand to improve the performance of downstream tasks is worth exploring. In this study, we tested the effects of five processing strategies of coordinates on two single-sequence protein binding site prediction tasks. These five strategies are spatial filtering, the singular value decomposition of a distance map, calculating the secondary structure feature, and the relative accessible surface area feature of proteins. The computational experiment results showed that all strategies were suitable and effective methods to encode structural information for deep learning models. In addition, by performing a case study of a mutated protein, we showed that the spatial filtering strategy could introduce structural changes into HHblits profiles and deep learning networks when protein mutation happens. In sum, this work provides new insight into the downstream tasks of protein-molecule interaction prediction, such as predicting the binding residues of proteins and estimating the effects of mutations.
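One of the strategies named in this abstract is the singular value decomposition of a distance map. The following is a minimal sketch, assuming one 3D coordinate per residue (for example a Cα atom) taken from a predicted structure. It builds the pairwise distance map and estimates only its largest singular value by power iteration. The helper names are hypothetical, power iteration stands in here for a full SVD routine from a linear algebra library, and nothing below is the authors' implementation.

```python
import math


def distance_map(coords):
    """Symmetric matrix of Euclidean distances between all residue pairs."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def leading_singular_value(matrix, iterations=200):
    """Estimate the largest singular value via power iteration on M^T M."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    if cols == 0:
        return 0.0
    # A uniform start vector overlaps the leading singular vector of a
    # non-negative matrix such as a distance map.
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iterations):
        u = [sum(row[j] * v[j] for j in range(cols)) for row in matrix]
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    # The singular value is the length of M v for the converged unit vector v.
    u = [sum(row[j] * v[j] for j in range(cols)) for row in matrix]
    return math.sqrt(sum(x * x for x in u))


if __name__ == "__main__":
    # Hypothetical coordinates for three residues.
    residues = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
    dmap = distance_map(residues)
    print(leading_singular_value(dmap))
```

A fixed number of leading singular values and vectors gives a fixed-size summary of a structure, which is one plausible reason to prefer it over raw coordinates as model input.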