1
Duart G, Graña-Montes R, Pastor-Cantizano N, Mingarro I. Experimental and computational approaches for membrane protein insertion and topology determination. Methods 2024; 226:102-119. [PMID: 38604415 DOI: 10.1016/j.ymeth.2024.03.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 03/13/2024] [Accepted: 03/22/2024] [Indexed: 04/13/2024]
Abstract
Membrane proteins play pivotal roles in a wide array of cellular processes and constitute approximately a quarter of the protein-coding genes across all organisms. Despite their ubiquity and biological significance, our understanding of these proteins remains notably less comprehensive compared to their soluble counterparts. This disparity in knowledge can be attributed, in part, to the inherent challenges associated with employing specialized techniques for the investigation of membrane protein insertion and topology. This review will center on a discussion of molecular biology methodologies and computational prediction tools designed to elucidate the insertion and topology of helical membrane proteins.
Affiliation(s)
- Gerard Duart
- Departament de Bioquímica i Biologia Molecular, Institut Universitari de Biotecnologia i Biomedicina (BIOTECMED), Universitat de València, E-46100 Burjassot, Spain
- Ricardo Graña-Montes
- Departament de Bioquímica i Biologia Molecular, Institut Universitari de Biotecnologia i Biomedicina (BIOTECMED), Universitat de València, E-46100 Burjassot, Spain
- Noelia Pastor-Cantizano
- Departament de Bioquímica i Biologia Molecular, Institut Universitari de Biotecnologia i Biomedicina (BIOTECMED), Universitat de València, E-46100 Burjassot, Spain
- Ismael Mingarro
- Departament de Bioquímica i Biologia Molecular, Institut Universitari de Biotecnologia i Biomedicina (BIOTECMED), Universitat de València, E-46100 Burjassot, Spain.
2
DeKryger W, Chroneos ZC. Emerging concepts of myosin 18A isoform mechanobiology in organismal and immune system physiology, development, and function. FASEB J 2024; 38:e23649. [PMID: 38776246 DOI: 10.1096/fj.202400350r] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 04/17/2024] [Accepted: 04/22/2024] [Indexed: 05/24/2024]
Abstract
Alternative and combinatorial splicing of myosin 18A (MYO18A) gene transcripts results in expression of MYO18A protein isoforms and isoform variants with different membrane and subcellular localizations, and functional properties. MYO18A proteins are members of the myosin superfamily consisting of a myosin-like motor domain, an IQ motif, and a coiled-coil domain. MYO18A isoforms, however, lack the ability to hydrolyze ATP and do not perform ATP-dependent motor activity. MYO18A isoforms are distinguished by different amino- and carboxy-terminal extensions and domains. The domain organization and functions of MYO18Aα, MYO18Aβ, and MYO18Aγ have been studied experimentally. MYO18Aα and MYO18Aβ have a common carboxy-terminal extension but differ by the presence or absence of an amino-terminal KE repeat and PDZ domain, respectively. The amino- and carboxy-terminal extensions of MYO18Aγ contain unique proline and serine-rich domains. Computationally predicted MYO18Aε and MYO18Aδ isoforms contain the carboxy-terminal serine-rich extension but differ by the presence or absence of the amino-terminal KE/PDZ extension. Additional isoform variants within each category arise by alternative utilization or inclusion/exclusion of small exons. MYO18Aα variants are expressed in somatic cells and mature immune cells, whereas MYO18Aβ variants occur mainly in myeloid and natural killer cells. MYO18Aγ expression is selective to cardiac and skeletal muscle. In the present review perspective, we discuss current and emerging concepts of the functional specialization of MYO18A proteins in membrane and cytoskeletal dynamics, cellular communication and signaling, endocytic and exocytic organelle movement, viral infection, and as the SP-R210 receptor for surfactant protein A.
Affiliation(s)
- William DeKryger
- Department of Pediatrics, Division of Neonatal-Perinatal Medicine, Pulmonary Immunology and Physiology Laboratory, Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Zissis C Chroneos
- Department of Pediatrics, Division of Neonatal-Perinatal Medicine, Pulmonary Immunology and Physiology Laboratory, Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Department of Microbiology and Immunology, Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
3
Sağsöz ME, Sağlam B, Arslan K, Baştuğ T, Çavuş M, Puralı N. Structural, Functional and Molecular Dynamics Examination of a de novo cloned Otopetrin-like Proton Channel in crayfish. Cell Biochem Biophys 2024:10.1007/s12013-024-01310-z. [PMID: 38811473 DOI: 10.1007/s12013-024-01310-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/11/2024] [Indexed: 05/31/2024]
Abstract
Proton channels play a crucial role in many biological functions, as they are responsible for the selective transport of protons across cell membranes. Recently, Otopetrins, a family of eukaryotic proton-selective ion channels, have attracted significant attention due to their diverse physiological roles. Despite the importance of Otopetrins, their structural and functional properties remain relatively unexplored. As a model organism, crayfish have been extensively studied to gain insights into the functioning of the nervous system. These studies cover a wide range of aspects, including the properties of individual neurons and behavioral science. However, studying the physiological systems of crayfish poses challenges for molecular research due to the limited molecular sequence information available for these organisms. In the present work, a de novo cloned mRNA encoding an Otopetrin-like proton channel was identified in the crayfish. The encoded protein was modeled in silico, and possible conduction mechanisms and pathways were revealed. A plasmid carrying the cloned mRNA sequence was heterologously expressed in HEK293T cells. Functional experiments on transfected cells indicated that expression of the cloned mRNA was coupled to proton conduction across the cell membrane.
Affiliation(s)
- Mustafa Erdem Sağsöz
- Biophysics Department, Hacettepe University, Faculty of Medicine, Ankara, Türkiye
- Biophysics Department, Atatürk University, Faculty of Medicine, Erzurum, Türkiye
- Berk Sağlam
- Biophysics Department, Hacettepe University, Faculty of Medicine, Ankara, Türkiye
- Kaan Arslan
- Biophysics Department, Hacettepe University, Faculty of Medicine, Ankara, Türkiye
- Turgut Baştuğ
- Biophysics Department, Hacettepe University, Faculty of Medicine, Ankara, Türkiye
- Murat Çavuş
- Bozok University, Faculty of Education, Mathematics and Science Education, Yozgat, Türkiye
- Nuhan Puralı
- Biophysics Department, Hacettepe University, Faculty of Medicine, Ankara, Türkiye.
4
Graffam D, Cutlan M, Storm AR, Hulse-Kemp AM, Stoeckman AK. Gossypium hirsutum gene of unknown function Gohir.A02G161000 encodes a potential transmembrane Root UVB Sensitive 4 Protein with a putative protein-protein interaction interface. microPublication Biology 2024; 2024:10.17912/micropub.biology.000869. [PMID: 38495582 PMCID: PMC10943365 DOI: 10.17912/micropub.biology.000869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Revised: 02/05/2024] [Accepted: 02/27/2024] [Indexed: 03/19/2024]
Abstract
A gene of unknown function, Gohir.A02G161000.1, identified in Gossypium hirsutum was studied using computational sequence and structure bioinformatics tools. The associated protein GhRUS4-A0A1U8JPV7 (UniProt A0A1U8JPV7) is predicted to be a plastid-localized, transmembrane root UVB-sensitive 4 (RUS4) protein with a newly identified potential dimerization surface. Evidence from homology and sequence conservation suggests involvement in auxin transport and pollen maturation.
Affiliation(s)
- Marissa Cutlan
- Chemistry Department, Bethel University, Saint Paul, MN USA
- Amanda R Storm
- Department of Biology, Western Carolina University, Cullowhee, NC USA
- Amanda M Hulse-Kemp
- Genomics and Bioinformatics Research Unit, The Agricultural Research Service of U.S. Department of Agriculture, Raleigh, NC USA
- Department of Crop and Soil Sciences, North Carolina State University, Raleigh, NC USA
5
Li H, Sun X, Cui W, Xu M, Dong J, Ekundayo BE, Ni D, Rao Z, Guo L, Stahlberg H, Yuan S, Vogel H. Computational drug development for membrane protein targets. Nat Biotechnol 2024; 42:229-242. [PMID: 38361054 DOI: 10.1038/s41587-023-01987-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/13/2023] [Accepted: 09/13/2023] [Indexed: 02/17/2024]
Abstract
The application of computational biology in drug development for membrane protein targets has experienced a boost from recent developments in deep learning-driven structure prediction, increased speed and resolution of structure elucidation, machine learning structure-based design and the evaluation of big data. Recent protein structure predictions based on machine learning tools have delivered surprisingly reliable results for water-soluble and membrane proteins but have limitations for development of drugs that target membrane proteins. Structural transitions of membrane proteins have a central role during transmembrane signaling and are often influenced by therapeutic compounds. Resolving the structural and functional basis of dynamic transmembrane signaling networks, especially within the native membrane or cellular environment, remains a central challenge for drug development. Tackling this challenge will require an interplay between experimental and computational tools, such as super-resolution optical microscopy for quantification of the molecular interactions of cellular signaling networks and their modulation by potential drugs, cryo-electron microscopy for determination of the structural transitions of proteins in native cell membranes and entire cells, and computational tools for data analysis and prediction of the structure and function of cellular signaling networks, as well as generation of promising drug candidates.
Affiliation(s)
- Haijian Li
- Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
- Xiaolin Sun
- Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
- Wenqiang Cui
- Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
- Marc Xu
- Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
- Junlin Dong
- Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
- Babatunde Edukpe Ekundayo
- Laboratory of Biological Electron Microscopy, IPHYS, SB, EPFL and Department of Fundamental Microbiology, Faculty of Biology and Medicine, University of Lausanne, Lausanne, Switzerland
- Dongchun Ni
- Laboratory of Biological Electron Microscopy, IPHYS, SB, EPFL and Department of Fundamental Microbiology, Faculty of Biology and Medicine, University of Lausanne, Lausanne, Switzerland
- Zhili Rao
- Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
- Liwei Guo
- Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China
- Henning Stahlberg
- Laboratory of Biological Electron Microscopy, IPHYS, SB, EPFL and Department of Fundamental Microbiology, Faculty of Biology and Medicine, University of Lausanne, Lausanne, Switzerland.
- Shuguang Yuan
- Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China.
- Horst Vogel
- Center for Computer-Aided Drug Discovery, Faculty of Pharmaceutical Sciences, Shenzhen Institute of Advanced Technology/Chinese Academy of Sciences (SIAT/CAS), Shenzhen, China.
- Institut des Sciences et Ingénierie Chimiques (ISIC), Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
6
Savojardo C, Martelli PL, Casadio R. Finding functional motifs in protein sequences with deep learning and natural language models. Curr Opin Struct Biol 2023; 81:102641. [PMID: 37385080 DOI: 10.1016/j.sbi.2023.102641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 04/17/2023] [Accepted: 05/24/2023] [Indexed: 07/01/2023]
Abstract
Prediction of structural/functional motifs in protein sequences has recently taken advantage of powerful machine learning-based approaches. Protein encoding now adopts protein language models, which surpass standard procedures. Different combinations of machine learning and encoding schemes are available for predicting different structural/functional motifs. Particularly interesting is the adoption of protein language models to encode proteins in addition to evolutionary information and physicochemical parameters. A thorough analysis of recent predictors developed for annotating transmembrane regions, sorting signals, lipidation and phosphorylation sites allows the state of the art to be investigated, focusing on the relevance of protein language models for the different tasks. This highlights that more experimental data are necessary to exploit the powerful machine learning methods available.
Affiliation(s)
- Castrense Savojardo
- Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Via San Giacomo 9/2, 40126 Bologna, Italy
- Pier Luigi Martelli
- Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Via San Giacomo 9/2, 40126 Bologna, Italy
- Rita Casadio
- Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Via San Giacomo 9/2, 40126 Bologna, Italy.
7
Xue Y, Wang X, Liu W. Reconstitution of the Linaridin Pathway Provides Access to the Family-Determining Activity of Two Membrane-Associated Proteins in the Formation of Structurally Underestimated Cypemycin. J Am Chem Soc 2023; 145:7040-7047. [PMID: 36921096 DOI: 10.1021/jacs.3c01730] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Indexed: 03/17/2023]
Abstract
Cypemycin is a parent linaridin peptide known to contain nonproteinogenic dehydrobutyrine, N,N-dimethylalanine, and aminovinyl-cysteine residues. The enzymatic process by which this ribosomally synthesized peptide is formed remains elusive, largely because of limited knowledge of the post-translational modifications (PTMs) conducted by CypH and CypL, the two membrane-associated enzymes unique to linaridin biosynthesis. Based on heterologous reconstitution of the pathway in Streptomyces coelicolor, we report the detailed structural characterization of cypemycin as a previously unknown, d-amino acid-rich linaridin. In particular, the unprecedented family-determining activity of CypH and CypL was revealed, which, in addition to hydrolysis for removal of the N-terminal leader peptide, leads to transformation of the core peptide part of the precursor peptide through 16 mechanistically related reactions for residue epimerization (11 amino acids), dehydration (4 Thr), and dethiolation (Cys19). Subsequent functionalization for linaridin maturation includes CypD-involved aminovinyl-cysteine formation and N,N-dimethylation of the newly exposed N-terminal d-Ala residue, which requires CypM activity. Genetic, chemical, biochemical, engineering, and modeling approaches were used to access the structure of cypemycin and the versatility of the CypH and CypL combination that is achieved in catalysis. This work furthers the appreciation of PTM chemistry and facilitates efforts for expanding linaridin structural diversity using synthetic biology methods.
Affiliation(s)
- Yanqing Xue
- State Key Laboratory of Bioorganic and Natural Products Chemistry, Center for Excellence in Molecular Synthesis, Shanghai Institute of Organic Chemistry, University of Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, China
- Xiaofeng Wang
- State Key Laboratory of Bioorganic and Natural Products Chemistry, Center for Excellence in Molecular Synthesis, Shanghai Institute of Organic Chemistry, University of Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, China
- School of Chemistry and Materials Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Sublane Xiangshan, Hangzhou 310024, China
- Wen Liu
- State Key Laboratory of Bioorganic and Natural Products Chemistry, Center for Excellence in Molecular Synthesis, Shanghai Institute of Organic Chemistry, University of Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, China
8
Sun J, Kulandaisamy A, Liu J, Hu K, Gromiha MM, Zhang Y. Machine learning in computational modelling of membrane protein sequences and structures: From methodologies to applications. Comput Struct Biotechnol J 2023; 21:1205-1226. [PMID: 36817959 PMCID: PMC9932300 DOI: 10.1016/j.csbj.2023.01.036] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/16/2022] [Revised: 01/16/2023] [Accepted: 01/25/2023] [Indexed: 01/29/2023]
Abstract
Membrane proteins mediate a wide spectrum of biological processes, such as signal transduction and cell communication. Due to the arduous and costly nature inherent to the experimental process, membrane proteins have long been devoid of well-resolved atomic-level tertiary structures and, consequently, the understanding of their functional roles underlying a multitude of life activities has been hampered. Currently, computational tools dedicated to furthering the structure-function understanding are primarily focused on utilizing intelligent algorithms to address a variety of site-wise prediction problems (e.g., topology and interaction sites), but are scattered across different computing sources. Moreover, the recent advent of deep learning techniques has immensely expedited the development of computational tools for membrane protein-related prediction problems. Given the growing number of applications optimized particularly by manifold deep neural networks, we herein provide a review on the current status of computational strategies mainly in membrane protein type classification, topology identification, interaction site detection, and pathogenic effect prediction. Meanwhile, we provide an overview of how the entire prediction process proceeds, including database collection, data pre-processing, feature extraction, and method selection. This review is expected to be useful for developing more extendable computational tools specific to membrane proteins.
Affiliation(s)
- Jianfeng Sun
- Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Headington, Oxford OX3 7LD, UK
- Arulsamy Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- Jacklyn Liu
- UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, UK
- Kai Hu
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China
- M. Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India (corresponding author)
- Yuan Zhang
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China (corresponding author)
9
González-Magaña A, Altuna J, Queralt-Martín M, Largo E, Velázquez C, Montánchez I, Bernal P, Alcaraz A, Albesa-Jové D. The P. aeruginosa effector Tse5 forms membrane pores disrupting the membrane potential of intoxicated bacteria. Commun Biol 2022; 5:1189. [PMID: 36335275 PMCID: PMC9637101 DOI: 10.1038/s42003-022-04140-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2021] [Accepted: 10/20/2022] [Indexed: 11/08/2022]
Abstract
The type VI secretion system (T6SS) of Pseudomonas aeruginosa injects effector proteins into neighbouring competitors and host cells, providing a fitness advantage that allows this opportunistic nosocomial pathogen to persist and prevail during the onset of infections. However, despite the high clinical relevance of P. aeruginosa, the identity and mode of action of most P. aeruginosa T6SS-dependent effectors remain to be discovered. Here, we report the molecular mechanism of Tse5-CT, the toxic auto-proteolytic product of the P. aeruginosa T6SS exported effector Tse5. Our results demonstrate that Tse5-CT is a pore-forming toxin that can transport ions across the membrane, causing membrane depolarisation and bacterial death. The membrane potential regulates a wide range of essential cellular functions; therefore, membrane depolarisation is an efficient strategy to compete with other microorganisms in polymicrobial environments.
Affiliation(s)
- Amaia González-Magaña
- Fundación Biofísica Bizkaia/Biofisika Bizkaia Fundazioa (FBB) and Departamento de Bioquímica y Biología Molecular, Instituto Biofisika (CSIC, UPV/EHU), University of the Basque Country, 48940, Leioa, Spain
- Jon Altuna
- Fundación Biofísica Bizkaia/Biofisika Bizkaia Fundazioa (FBB) and Departamento de Bioquímica y Biología Molecular, Instituto Biofisika (CSIC, UPV/EHU), University of the Basque Country, 48940, Leioa, Spain
- María Queralt-Martín
- Laboratory of Molecular Biophysics, Department of Physics, University Jaume I, 12071, Castellón, Spain
- Eneko Largo
- Fundación Biofísica Bizkaia/Biofisika Bizkaia Fundazioa (FBB) and Departamento de Bioquímica y Biología Molecular, Instituto Biofisika (CSIC, UPV/EHU), University of the Basque Country, 48940, Leioa, Spain
- Departamento de Inmunología, Microbiología y Parasitología, University of the Basque Country, 48940, Leioa, Spain
- Carmen Velázquez
- Fundación Biofísica Bizkaia/Biofisika Bizkaia Fundazioa (FBB) and Departamento de Bioquímica y Biología Molecular, Instituto Biofisika (CSIC, UPV/EHU), University of the Basque Country, 48940, Leioa, Spain
- Itxaso Montánchez
- Departamento de Inmunología, Microbiología y Parasitología, University of the Basque Country, 48940, Leioa, Spain
- Patricia Bernal
- Departamento de Microbiología, Facultad de Biología, Universidad de Sevilla, 41012, Sevilla, Spain
- Antonio Alcaraz
- Laboratory of Molecular Biophysics, Department of Physics, University Jaume I, 12071, Castellón, Spain
- David Albesa-Jové
- Fundación Biofísica Bizkaia/Biofisika Bizkaia Fundazioa (FBB) and Departamento de Bioquímica y Biología Molecular, Instituto Biofisika (CSIC, UPV/EHU), University of the Basque Country, 48940, Leioa, Spain.
- Ikerbasque, Basque Foundation for Science, 48013, Bilbao, Spain.
10
Wang L, Zhong H, Xue Z, Wang Y. Res-Dom: predicting protein domain boundary from sequence using deep residual network and Bi-LSTM. Bioinformatics Advances 2022; 2:vbac060. [PMID: 36699417 PMCID: PMC9710680 DOI: 10.1093/bioadv/vbac060] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/22/2022] [Revised: 07/01/2022] [Accepted: 08/30/2022] [Indexed: 01/28/2023]
Abstract
Motivation: Protein domains are the basic units of proteins that can fold, function and evolve independently. Protein domain boundary partition plays an important role in protein structure prediction, understanding their biological functions, annotating their evolutionary mechanisms and protein design. Although many methods have been developed over the past two decades to predict domain boundaries from protein sequence, there is still much room for improvement.
Results: In this article, a novel domain boundary prediction tool called Res-Dom was developed, which is based on a deep residual network, bidirectional long short-term memory (Bi-LSTM) and transfer learning. We used deep residual neural networks to extract higher-order residue-related information. In addition, we used a pre-trained protein language model called ESM to extract sequence-embedded features, which summarize sequence context information more richly. To improve the global representation of these deep residual networks, a Bi-LSTM network was also designed to consider long-range interactions between residues. Res-Dom was then tested on an independent test set of 342 proteins and generated correct single-domain and multi-domain classifications with a Matthews correlation coefficient of 0.668, which was 17.6% higher than the second-best compared method. For domain boundaries, the normalized domain overlapping score of Res-Dom was 0.849, which was 5% higher than the second-best compared method. Furthermore, Res-Dom required significantly less time than most of the recently developed state-of-the-art domain prediction methods.
Availability and implementation: All source code, datasets and model are available at http://isyslab.info/Res-Dom/.
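The Matthews correlation coefficient reported in this abstract can be computed directly from a binary confusion matrix. The following is a minimal sketch for illustration only: the function name and the example counts are hypothetical, treating multi-domain proteins as the positive class is an assumption, and nothing here is taken from the Res-Dom source code.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Assumption for this sketch: 'positive' means multi-domain and
    'negative' means single-domain.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts that sum to 342 proteins; not the paper's results.
    print(matthews_corrcoef(tp=120, tn=150, fp=40, fn=32))
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect agreement), and unlike plain accuracy it remains informative when the two classes are unbalanced.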
Affiliation(s)
- Lei Wang
- Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong 264003, China
- School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Haolin Zhong
- School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Zhidong Xue
- Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong 264003, China
- School of Software Engineering, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Yan Wang
- Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong 264003, China
- School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
11
Wang L, Zhong H, Xue Z, Wang Y. Improving the topology prediction of α-helical transmembrane proteins with deep transfer learning. Comput Struct Biotechnol J 2022; 20:1993-2000. [PMID: 35521551 PMCID: PMC9062415 DOI: 10.1016/j.csbj.2022.04.024] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/16/2021] [Revised: 04/09/2022] [Accepted: 04/17/2022] [Indexed: 11/11/2022]
Abstract
Transmembrane proteins (TMPs) are essential for cell recognition and communication, and they serve as important drug targets in humans. The 3D structures of transmembrane proteins are critical for understanding their functions and for drug design, but are hard to determine even by experimental methods. Although some computational methods have been developed to predict transmembrane helices (TMHs) and their orientation, there is still room for improvement. Considering that a pre-trained language model can make full use of massive unlabeled protein sequences to obtain latent feature representations for TMPs and reduce the dependence on evolutionary information, we proposed DeepTMpred, which uses pre-trained self-supervised language models called ESM, convolutional neural networks, an attentive neural network and conditional random fields for alpha-TMP topology prediction. Compared with the current state-of-the-art tools on a non-redundant dataset of TMPs, DeepTMpred demonstrated superior predictive performance in most evaluation metrics, especially at the TMH level. Furthermore, DeepTMpred could also obtain reliable predictions within a few seconds for TMPs that have little evolutionary information. A tutorial on how to use DeepTMpred can be found in the colab notebook (https://colab.research.google.com/github/ISYSLAB-HUST/DeepTMpred/blob/master/notebook/test.ipynb).
Affiliation(s)
- Lei Wang
- School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong 264003, China
- Haolin Zhong
- School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Zhidong Xue
- Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong 264003, China
- School of Software Engineering, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Yan Wang
- School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong 264003, China
12
Feng SH, Xia CQ, Zhang PD, Shen HB. Ab-Initio Membrane Protein Amphipathic Helix Structure Prediction Using Deep Neural Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:795-805. [PMID: 33026978 DOI: 10.1109/tcbb.2020.3029274] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
The amphipathic helix (AH) features the segregation of polar and nonpolar residues and plays important roles in many membrane-associated biological processes by interacting with both the lipid and the soluble phases. Although the AH structure has long been known, few ab initio machine learning-based prediction models have been reported, owing to the limited amount of training data. In this study, we report a new deep learning-based prediction model composed of a residual neural network and an uneven-thresholds decision algorithm. It is built on 121 membrane proteins, 51,640 residue samples in total, curated from an up-to-date membrane protein structure database. Through a rigorous 10-fold nested cross-validation experiment, we demonstrate that our model achieves promising predictions and exceeds current state-of-the-art approaches in this field. This presents a new avenue for accurately predicting AHs. Analysis of the contribution of the input residues and of selected cases further reveals the high interpretability and generalization of our model.
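The segregation of polar and nonpolar residues that defines an amphipathic helix is classically quantified with the helical hydrophobic moment. The sketch below illustrates that classical heuristic, not the deep learning model described in this abstract. It assumes an ideal α-helix with 100° of rotation per residue and swaps in the Kyte-Doolittle hydropathy scale for convenience; the function name and the example peptide are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(segment: str, angle_deg: float = 100.0) -> float:
    """Mean hydrophobic moment of a peptide segment.

    Each residue contributes a vector whose length is its hydropathy and
    whose direction advances by angle_deg per residue (100 degrees for an
    ideal alpha-helix). The magnitude of the vector sum, divided by the
    segment length, is large when hydrophobic residues cluster on one face.
    """
    if not segment:
        raise ValueError("segment must contain at least one residue")
    angle = math.radians(angle_deg)
    sin_sum = sum(KYTE_DOOLITTLE[aa] * math.sin(i * angle)
                  for i, aa in enumerate(segment))
    cos_sum = sum(KYTE_DOOLITTLE[aa] * math.cos(i * angle)
                  for i, aa in enumerate(segment))
    return math.hypot(sin_sum, cos_sum) / len(segment)


if __name__ == "__main__":
    # Hypothetical Leu/Lys peptide with a roughly helical repeat,
    # compared with a uniformly hydrophobic segment of the same length.
    print(hydrophobic_moment("LKKLLKKLLKKLLKKLLK"))
    print(hydrophobic_moment("L" * 18))
```

A uniformly hydrophobic 18-residue segment gives a moment close to zero because its residue vectors cancel around the helix, whereas a sequence that alternates hydrophobic and charged residues with a near-helical period gives a clearly larger value.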
|
13
|
Wang L, Zhang J, Wang D, Song C. Membrane contact probability: An essential and predictive character for the structural and functional studies of membrane proteins. PLoS Comput Biol 2022; 18:e1009972. [PMID: 35353812 PMCID: PMC9000120 DOI: 10.1371/journal.pcbi.1009972] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/11/2021] [Revised: 04/11/2022] [Accepted: 02/25/2022] [Indexed: 11/20/2022]
Abstract
One of the unique traits of membrane proteins is that a significant fraction of their hydrophobic amino acids is exposed to the hydrophobic core of lipid bilayers rather than being embedded in the protein interior, which is often not explicitly considered in protein structure and function predictions. Here, we propose a characteristic and predictive quantity, the membrane contact probability (MCP), to describe the likelihood of the amino acids of a given sequence being in direct contact with the acyl chains of lipid molecules. We show that MCP is complementary to solvent accessibility in characterizing the outer surface of membrane proteins, and that it can be predicted for any given sequence with a machine learning-based method trained on a dataset extracted from MemProtMD, a database generated from molecular dynamics simulations of membrane proteins with a known structure. As the first of many potential applications, we demonstrate that MCP can be used to systematically improve the prediction precision of protein contact maps and structures.
The distribution of residues on protein surfaces is largely determined by the surrounding environment. For soluble proteins, most of the residues on the outer surface are hydrophilic, and the quantity “solvent accessibility” is used to describe and predict these surface residues. In contrast, for membrane proteins that are embedded in a lipid bilayer, many of the surface residues are hydrophobic and membrane-contacting, but there is not yet a widely accepted quantity for describing or predicting this characteristic property. Here, we propose a new quantity termed “membrane contact probability (MCP)”, which can be used to describe and predict the membrane-contacting surface residues of proteins. We also propose a machine learning-based method to predict MCP from protein sequences, utilizing the dataset generated by physics-based computer simulations. We demonstrate that a quantity such as MCP is helpful for protein structure prediction, and we believe that it will find broad applications in the structure and function studies of membrane proteins.
Affiliation(s)
- Lei Wang
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Jiangguo Zhang
  - School of Life Sciences, Peking University, Beijing, China
- Dali Wang
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Chen Song
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
|
14
|
Feng SH, Xia CQ, Shen HB. CoCoPRED: coiled-coil protein structural feature prediction from amino acid sequence using deep neural networks. Bioinformatics 2022; 38:720-729. [PMID: 34718416 DOI: 10.1093/bioinformatics/btab744] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/01/2021] [Revised: 10/08/2021] [Accepted: 10/27/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION: A coiled coil is composed of two or more helices that are wound around each other. Coiled coils occur widely in proteins and play a variety of critical roles in biological processes. Generally, there are three types of structural features in a coiled coil: coiled-coil domain (CCD), oligomeric state and register. However, most existing computational tools focus on only one of them.
RESULTS: Here, we describe a new deep learning model, CoCoPRED, which is based on convolutional layers, bidirectional long short-term memory, and an attention mechanism. It has three networks, i.e. a CCD network, an oligomeric state network, and a register network, corresponding to the three types of structural features in a coiled coil. CoCoPRED can therefore provide comprehensive predictions for coiled-coil proteins. Through a 5-fold cross-validation experiment, we demonstrate that CoCoPRED can achieve better performance than state-of-the-art models on both CCD prediction and oligomeric state prediction. Further analysis suggests that the CCD prediction may be a performance indicator of the oligomeric state prediction in CoCoPRED. The attention heads in CoCoPRED indicate that registers a, b and e are more crucial for the oligomeric state prediction.
AVAILABILITY AND IMPLEMENTATION: CoCoPRED is available at http://www.csbio.sjtu.edu.cn/bioinf/CoCoPRED. The datasets used in this research can also be downloaded from the website.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shi-Hao Feng
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Chun-Qiu Xia
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Hong-Bin Shen
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
  - Department of Computer Science, Shanghai Jiao Tong University, Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai 200240, China
|
15
|
Sanchez-Pulido L, Ponting CP. Extending the Horizon of Homology Detection with Coevolution-based Structure Prediction. J Mol Biol 2021; 433:167106. [PMID: 34139218 PMCID: PMC8527833 DOI: 10.1016/j.jmb.2021.167106] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/28/2021] [Revised: 06/09/2021] [Accepted: 06/09/2021] [Indexed: 12/12/2022]
Abstract
Traditional sequence analysis algorithms fail to identify distant homologies when they lie beyond a detection horizon. In this review, we discuss how co-evolution-based contact and distance prediction methods are pushing back this homology detection horizon, thereby yielding new functional insights and experimentally testable hypotheses. Based on correlated substitutions, these methods divine three-dimensional constraints among amino acids in protein sequences that were previously devoid of all annotated domains and repeats. The new algorithms discern hidden structure in an otherwise featureless sequence landscape. Their revelatory impact promises to be as profound as the use, by archaeologists, of ground-penetrating radar to discern long-hidden, subterranean structures. As examples of this, we describe how triplicated structures reflecting longin domains in MON1A-like proteins, or UVR-like repeats in DISC1, emerge from their predicted contact and distance maps. These methods also help to resolve structures that do not conform to a "beads-on-a-string" model of protein domains. In one such example, we describe CFAP298 whose ubiquitin-like domain was previously challenging to perceive owing to a large sequence insertion within it. More generally, the new algorithms permit an easier appreciation of domain families and folds whose evolution involved structural insertion or rearrangement. As we exemplify with α1-antitrypsin, coevolution-based predicted contacts may also yield insights into protein dynamics and conformational change. This new combination of structure prediction (using innovative co-evolution based methods) and homology inference (using more traditional sequence analysis approaches) shows great promise for bringing into view a sea of evolutionary relationships that had hitherto lain far beyond the horizon of homology detection.
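To make the idea of "correlated substitutions" concrete, the sketch below scores a pair of alignment columns by their mutual information. This is a deliberately simple stand-in and is not one of the contact and distance prediction methods discussed in the review. The toy alignment and the function name `column_mutual_information` are hypothetical.

```python
from collections import Counter
from math import log2


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    msa is a list of equal-length aligned sequences. High values mean that
    the residue observed at column i is informative about the residue at
    column j, which is the raw signal behind correlated substitutions.
    """
    n = len(msa)
    if n == 0:
        raise ValueError("alignment must contain at least one sequence")
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    counts_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), c in counts_ij.items():
        p_ab = c / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 0 and 1 change together (A with K,
# G with E), whereas column 3 varies independently of column 0.
toy_msa = ["AKLE", "AKLD", "GELE", "GELD"]
print(column_mutual_information(toy_msa, 0, 1))
print(column_mutual_information(toy_msa, 0, 3))
```

Raw mutual information is confounded by phylogeny and by indirect couplings, so it should be read only as an illustration of the underlying signal.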
Affiliation(s)
- Luis Sanchez-Pulido
  - Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK.
- Chris P Ponting
  - Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK.
|
16
|
Lomize AL, Schnitzer KA, Todd SC, Pogozheva ID. Thermodynamics-Based Molecular Modeling of α-Helices in Membranes and Micelles. J Chem Inf Model 2021; 61:2884-2896. [PMID: 34029472 DOI: 10.1021/acs.jcim.1c00161] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Abstract
The Folding of Membrane-Associated Peptides (FMAP) method was developed for modeling α-helix formation by linear peptides in micelles and lipid bilayers. FMAP 2.0 identifies locations of α-helices in the amino acid sequence, generates their three-dimensional models in planar bilayers or spherical micelles, and estimates their thermodynamic stabilities and tilt angles, depending on temperature and pH. The method was tested for 723 peptides (926 data points) experimentally studied in different environments and for 170 single-pass transmembrane (TM) proteins with available crystal structures. FMAP 2.0 detected more than 95% of experimentally observed α-helices with an average error in helix end determination of around 2, 3, 4, and 5 residues per helix for peptides in water, micelles, bilayers, and TM proteins, respectively. Helical and nonhelical residue states were predicted with an accuracy from 0.86 to 0.96, and the Matthews correlation coefficient was from 0.64 to 0.88 depending on the environment. Experimental micelle- and membrane-binding energies and tilt angles of peptides were reproduced with a root-mean-square deviation of around 2 kcal/mol and 7°, respectively. The TM and non-TM states of hydrophobic and pH-triggered α-helical peptides in various lipid bilayers were reproduced in more than 95% of cases. The FMAP 2.0 web server (https://membranome.org/fmap) is publicly available to explore the structural polymorphism of antimicrobial, cell-penetrating, fusion, and other membrane-binding peptides, which is important for understanding the mechanisms of their biological activities.
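The accuracy and Matthews correlation coefficient quoted above are per-residue statistics over helical and nonhelical states. The sketch below shows how such values can be computed from an observed and a predicted state string. The `H`/`-` encoding, the example strings, and the function names are assumptions for illustration and are not taken from FMAP.

```python
from math import sqrt


def confusion_counts(observed, predicted, positive="H"):
    """Return (tp, fp, tn, fn) for two equal-length per-residue state strings."""
    if len(observed) != len(predicted):
        raise ValueError("state strings must have the same length")
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if pred == positive and obs == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif obs == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of residues whose state was predicted correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical 12-residue peptide: the predicted helix is shifted by one
# residue relative to the observed helix.
observed = "--HHHHHHHH--"
predicted = "---HHHHHHHH-"
counts = confusion_counts(observed, predicted)
print(counts, accuracy(*counts), matthews_cc(*counts))
```

In this toy case the one-residue shift at each helix end leaves 10 of 12 residues correct, yet the correlation coefficient is only 0.625, which illustrates why the two metrics are reported together.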
Affiliation(s)
- Andrei L Lomize
  - Department of Medicinal Chemistry, College of Pharmacy, University of Michigan, 428 Church Street, Ann Arbor, Michigan 48109-1065, United States
- Kevin A Schnitzer
  - Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, 1221 Beal Avenue, Ann Arbor, Michigan 48109-2102, United States
- Spencer C Todd
  - Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, 1221 Beal Avenue, Ann Arbor, Michigan 48109-2102, United States
- Irina D Pogozheva
  - Department of Medicinal Chemistry, College of Pharmacy, University of Michigan, 428 Church Street, Ann Arbor, Michigan 48109-1065, United States
|
17
|
Partial proteolysis improves the identification of the extracellular segments of transmembrane proteins by surface biotinylation. Sci Rep 2020; 10:8880. [PMID: 32483232 PMCID: PMC7264363 DOI: 10.1038/s41598-020-65831-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/06/2020] [Accepted: 05/08/2020] [Indexed: 01/11/2023]
Abstract
Transmembrane proteins (TMP) play a crucial role in several physiological processes. Despite their importance and diversity, only a few TMP structures have been determined by high-resolution protein structure characterization methods so far. Due to the low number of determined TMP structures, the parallel development of various bioinformatics and experimental methods was necessary for their topological characterization. The combination of these methods is a powerful approach in the determination of TMP topology as in the Constrained Consensus TOPology prediction. To support the prediction, we previously developed a high-throughput topology characterization method based on primary amino group-labelling that is still limited in identifying all TMPs and their extracellular segments on the surface of a particular cell type. In order to generate more topology information, a new step, a partial proteolysis of the cell surface has been introduced to our method. This step results in new primary amino groups in the proteins that can be biotinylated with a membrane-impermeable agent while the cells still remain intact. Pre-digestion also promotes the emergence of modified peptides that are more suitable for MS/MS analysis. The modified sites can be utilized as extracellular constraints in topology predictions and may contribute to the refined topology of these proteins.
|