1
Dubey NK, Kumar V, Goswami C. Sperm-Specific CatSper is Not Conserved in All Vertebrates and May Not be the Only Progesterone-Responsive Ion Channel Present in Sperm. J Membr Biol 2024:10.1007/s00232-024-00316-1. [PMID: 38970681 DOI: 10.1007/s00232-024-00316-1] [Received: 05/05/2024] [Accepted: 06/24/2024] [Indexed: 07/08/2024]
Abstract
Progesterone (P4) acts as a key conserved signalling molecule in vertebrate reproduction. P4 is especially important for mature sperm physiology and subsequent reproductive success. "CatSpermasome", a multi-unit molecular complex, has been suggested to be the main if not the only P4-responsive atypical Ca²⁺-ion channel present in mature sperm. Here we analyse the protein sequences of CatSper1-4 from more than 500 vertebrates ranging from early fishes to humans. CatSper1 becomes longer in mammals due to sequence gain mainly at the N-terminus. Overall, the conservation of full-length CatSper1-4 as well as of the individual TM regions remains low. The lipid-water-interface residues (i.e. a 5 amino acid stretch present on both sides of each TM region) also remain highly diverged. No specific patterns of amino acid distributions were observed. The total frequencies of positively and negatively charged residues, and their ratios, do not follow any specific pattern. Similarly, the frequencies of total hydrophobic and total hydrophilic residues, and even their ratios, appear random and do not follow any specific pattern. We noted that the CatSper1-4 genes are missing in amphibians and the CatSper1 gene is missing in birds. The high variability of CatSper1-4 and gene loss in certain clades indicate that the "CatSpermasome" is not the only P4-responsive ion channel. Data indicate that the molecular evolution of CatSper is mostly guided by diverse hydrophobic ligands rather than only P4. The comparative data also suggest possibilities of other Ca²⁺-channel/s in vertebrate sperm that can also respond to P4.
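The interface-residue and charge-frequency quantities named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the 0-based half-open TM coordinates, and the choice to count only K/R as positive and D/E as negative are assumptions made here for illustration.

```python
from collections import Counter

# Assumption for illustration: K/R counted as positive, D/E as negative.
POSITIVE = set("KR")
NEGATIVE = set("DE")


def interface_stretches(seq, tm_regions, flank=5):
    """Return the (before, after) flanking stretches for each TM region.

    tm_regions holds (start, end) pairs, 0-based with end exclusive
    (a hypothetical convention chosen for this sketch).
    """
    stretches = []
    for start, end in tm_regions:
        before = seq[max(0, start - flank):start]
        after = seq[end:end + flank]
        stretches.append((before, after))
    return stretches


def charge_summary(fragment):
    """Count charged residues in a fragment and report their ratio."""
    counts = Counter(fragment)
    pos = sum(counts[aa] for aa in POSITIVE)
    neg = sum(counts[aa] for aa in NEGATIVE)
    ratio = pos / neg if neg else None  # undefined when no negative residues
    return {"positive": pos, "negative": neg, "ratio": ratio}
```

Applied across orthologous sequences, per-stretch summaries like these are the kind of values whose distribution the abstract describes as following no specific pattern.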
Affiliation(s)
- Nishant Kumar Dubey
- School of Biological Sciences, National Institute of Science Education and Research Bhubaneswar, P.O. Jatni, Khurda, 752050, Odisha, India.
- Training School Complex, Homi Bhabha National Institute, Anushakti Nagar, Mumbai, 400094, India.
- Vikash Kumar
- School of Biological Sciences, National Institute of Science Education and Research Bhubaneswar, P.O. Jatni, Khurda, 752050, Odisha, India
- Training School Complex, Homi Bhabha National Institute, Anushakti Nagar, Mumbai, 400094, India
- Chandan Goswami
- School of Biological Sciences, National Institute of Science Education and Research Bhubaneswar, P.O. Jatni, Khurda, 752050, Odisha, India.
- Training School Complex, Homi Bhabha National Institute, Anushakti Nagar, Mumbai, 400094, India.
2
Duart G, Graña-Montes R, Pastor-Cantizano N, Mingarro I. Experimental and computational approaches for membrane protein insertion and topology determination. Methods 2024; 226:102-119. [PMID: 38604415 DOI: 10.1016/j.ymeth.2024.03.012] [Received: 11/07/2023] [Revised: 03/13/2024] [Accepted: 03/22/2024] [Indexed: 04/13/2024] Open
Abstract
Membrane proteins play pivotal roles in a wide array of cellular processes and constitute approximately a quarter of the protein-coding genes across all organisms. Despite their ubiquity and biological significance, our understanding of these proteins remains notably less comprehensive compared to their soluble counterparts. This disparity in knowledge can be attributed, in part, to the inherent challenges associated with employing specialized techniques for the investigation of membrane protein insertion and topology. This review will center on a discussion of molecular biology methodologies and computational prediction tools designed to elucidate the insertion and topology of helical membrane proteins.
Affiliation(s)
- Gerard Duart
- Departament de Bioquímica i Biologia Molecular, Institut Universitari de Biotecnologia i Biomedicina (BIOTECMED), Universitat de València, E-46100 Burjassot, Spain
- Ricardo Graña-Montes
- Departament de Bioquímica i Biologia Molecular, Institut Universitari de Biotecnologia i Biomedicina (BIOTECMED), Universitat de València, E-46100 Burjassot, Spain
- Noelia Pastor-Cantizano
- Departament de Bioquímica i Biologia Molecular, Institut Universitari de Biotecnologia i Biomedicina (BIOTECMED), Universitat de València, E-46100 Burjassot, Spain
- Ismael Mingarro
- Departament de Bioquímica i Biologia Molecular, Institut Universitari de Biotecnologia i Biomedicina (BIOTECMED), Universitat de València, E-46100 Burjassot, Spain.
3
Vetriselvan Y, Manoharan A, Murugan M, Jayakumar S, Govindasamy C, Ravikumar S. In Silico Characterization of Pathogenic Homeodomain Missense Mutations in the PITX2 Gene. Biochem Genet 2024:10.1007/s10528-024-10836-z. [PMID: 38802693 DOI: 10.1007/s10528-024-10836-z] [Received: 02/07/2024] [Accepted: 05/09/2024] [Indexed: 05/29/2024]
Abstract
Paired homologous domain transcription factor 2 (PITX2) is critically involved in ocular and cardiac development. Mutations in PITX2 are consistently reported in association with Axenfeld-Rieger syndrome, an autosomal dominant genetic disorder, and atrial fibrillation, a common cardiac arrhythmia. In this study, we mined missense mutations in the PITX2 gene from the NCBI-dbSNP and Ensembl databases and evaluated the pathogenicity of the missense variants in the homeodomain and C-terminal region using five in silico prediction tools: SIFT, PolyPhen2, GERP, Mutation Assessor and CADD. Fifteen homeodomain mutations (G42V, G42R, R45W, S49Y, R53W, E53D, E55V, R62H, P65S, R69H, G75R, R84G, R86K, R87W, R91P), found to be highly pathogenic by both SIFT and PolyPhen2, were further functionally characterized using I-Mutant 2.0, Consurf, MutPred and Project Hope. The findings of the study can be used for prioritizing mutations in the context of genetic studies.
Affiliation(s)
- Yogesh Vetriselvan
- Department of Medical Biotechnology, Aarupadai Veedu Medical College and Hospital, Vinayaka Mission's Research Foundation (DU), Kirumampakkam, Puducherry, 607403, India
- Aarthi Manoharan
- Department of Medical Biotechnology, Aarupadai Veedu Medical College and Hospital, Vinayaka Mission's Research Foundation (DU), Kirumampakkam, Puducherry, 607403, India
- Manoranjani Murugan
- Department of Medical Biotechnology, Aarupadai Veedu Medical College and Hospital, Vinayaka Mission's Research Foundation (DU), Kirumampakkam, Puducherry, 607403, India
- Swetha Jayakumar
- Department of Medical Biotechnology, Aarupadai Veedu Medical College and Hospital, Vinayaka Mission's Research Foundation (DU), Kirumampakkam, Puducherry, 607403, India
- Chandramohan Govindasamy
- Department of Community Health Sciences, College of Applied Medical Sciences, King Saud University, P.O. Box 10219, 11433, Riyadh, Saudi Arabia
- Sambandam Ravikumar
- Department of Medical Biotechnology, Aarupadai Veedu Medical College and Hospital, Vinayaka Mission's Research Foundation (DU), Kirumampakkam, Puducherry, 607403, India.
4
Patoliya J, Thaker K, Rabadiya K, Patel D, Jain NK, Joshi R. Uncovering the Interaction Interface Between Harpin (Hpa1) and Rice Aquaporin (OsPIP1;3) Through Protein-Protein Docking: An In Silico Approach. Mol Biotechnol 2024; 66:756-768. [PMID: 36807270 DOI: 10.1007/s12033-023-00690-6] [Received: 09/26/2022] [Accepted: 02/07/2023] [Indexed: 02/23/2023]
Abstract
Hpa1 (a type of harpin) is involved in T3SS (Type III Secretion System) assembly in the infection mechanism of Xanthomonas oryzae pv. oryzae (Xoo). Hpa1 interacts with the plasma membrane components of plants, thereby assisting effector proteins toward the cytoplasm, wherein effectors execute their pathological functions. Independently, harpins also induce hypersensitive response and systemic acquired resistance in plants. However, lack of knowledge regarding the plant-harpin interaction mechanism constrains its agricultural application. Although an in vitro study proved that Hpa1 protein can interact with OsPIP1;3, a rice aquaporin, the structural basis of the interaction is yet to be discovered. The presented work is the first of its kind where an in silico approach is used to study the PPI (protein-protein interaction) of a harpin protein. The study discovered participation of Hpa1 N-terminal amino acids at the interface. In addition, MD simulation studies were performed to assess the stability. RMSD values were 0.35 ± 0.049, 0.73 ± 0.11, and 0.50 ± 0.065 nm for OsPIP1;3, Hpa1, and the Hpa1-OsPIP1;3 complex, respectively. Residue-wise fluctuations were also studied post-MDS. Taken together, these findings not only give a solid foundation for a deeper knowledge of various interacting target molecules with harpin protein orthologs but also open a new avenue for the structural-functional relationship study of harpin proteins.
Affiliation(s)
- Jaimini Patoliya
- Department of Biochemistry and Forensic Science, University School of Sciences, Gujarat University, Ahmedabad, Gujarat, 380009, India
- Khushali Thaker
- Department of Biochemistry and Forensic Science, University School of Sciences, Gujarat University, Ahmedabad, Gujarat, 380009, India
- Khushbu Rabadiya
- Department of Microbiology and Biotechnology, University School of Sciences, Gujarat University, Ahmedabad, Gujarat, 380009, India
- Dhaval Patel
- Gujarat Biotechnology University, Gandhinagar, Gujarat, 382355, India
- Nayan K Jain
- Department of Life Science, University School of Sciences, Gujarat University, Ahmedabad, Gujarat, 380009, India
- Rushikesh Joshi
- Department of Biochemistry and Forensic Science, University School of Sciences, Gujarat University, Ahmedabad, Gujarat, 380009, India.
5
Aliper ET, Efremov RG. Inconspicuous Yet Indispensable: The Coronavirus Spike Transmembrane Domain. Int J Mol Sci 2023; 24:16421. [PMID: 38003610 PMCID: PMC10671605 DOI: 10.3390/ijms242216421] [Received: 08/31/2023] [Revised: 11/07/2023] [Accepted: 11/12/2023] [Indexed: 11/26/2023] Open
Abstract
Membrane-spanning portions of proteins' polypeptide chains are commonly known as their transmembrane domains (TMDs). The structural organisation and dynamic behaviour of TMDs from proteins of various families, be that receptors, ion channels, enzymes etc., have been under scrutiny on the part of the scientific community for the last few decades. The reason for such attention is that, apart from their obvious role as an "anchor" in ensuring the correct orientation of the protein's extra-membrane domains (in most cases functionally important), TMDs often actively and directly contribute to the operation of "the protein machine". They are capable of transmitting signals across the membrane, interacting with adjacent TMDs and membrane-proximal domains, as well as with various ligands, etc. Structural data on TMD arrangement are still fragmentary at best due to their complex molecular organisation as, most commonly, dynamic oligomers, as well as due to the challenges related to experimental studies thereof. Inter alia, this is especially true for viral fusion proteins, which have been the focus of numerous studies for quite some time, but have provoked unprecedented interest in view of the SARS-CoV-2 pandemic. However, despite numerous structure-centred studies of the spike (S) protein effectuating target cell entry in coronaviruses, structural data on the TMD as part of the entire spike protein are still incomplete, whereas this segment is known to be crucial to the spike's fusogenic activity. Therefore, in attempting to bring together currently available data on the structure and dynamics of spike proteins' TMDs, the present review aims to tackle a highly pertinent task and contribute to a better understanding of the molecular mechanisms underlying virus-mediated fusion, also offering a rationale for the design of novel efficacious methods for the treatment of infectious diseases caused by SARS-CoV-2 and related viruses.
Affiliation(s)
- Elena T. Aliper
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow 117997, Russia
- Roman G. Efremov
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow 117997, Russia
- Department of Applied Mathematics, National Research University Higher School of Economics, Moscow 101000, Russia
- L.D. Landau School of Physics, Moscow Institute of Physics and Technology (State University), Dolgoprudny 141701, Russia
6
Veith T, Bleicker T, Eschbach-Bludau M, Brünink S, Mühlemann B, Schneider J, Beheim-Schwarzbach J, Rakotondranary SJ, Ratovonamana YR, Tsagnangara C, Ernest R, Randriantafika F, Sommer S, Stetter N, Jones TC, Drosten C, Ganzhorn JU, Corman VM. Non-structural genes of novel lemur adenoviruses reveal codivergence of virus and host. Virus Evol 2023; 9:vead024. [PMID: 37091898 PMCID: PMC10121206 DOI: 10.1093/ve/vead024] [Received: 05/11/2022] [Revised: 03/06/2023] [Accepted: 03/27/2023] [Indexed: 03/29/2023] Open
Abstract
Adenoviruses (AdVs) are important human and animal pathogens and are frequently used as vectors for gene therapy and vaccine delivery. Surprisingly, there are only scant data regarding primate AdV origin and evolution, especially in the most basal primate hosts. We detect and sequence AdVs from faeces of two Madagascan lemur species. Complete genome sequence analyses define a new AdV species with a particularly large gene encoding a protein of unknown function in the early gene region 3. Unexpectedly, the new AdV species is not most similar to human or other simian AdVs but to bat adenovirus C. Genome characterisation shows signals of virus-host codivergence in non-structural genes, which show lower diversity than structural genes. Outside a lemur species mixing zone, recombination less frequently separates structural genes, as in human adenovirus C. The evolutionary history of lemur AdVs likely involves both a host switch and codivergence with the lemur hosts.
Affiliation(s)
- Talitha Veith
- Institute of Virology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Charitéplatz 1, Berlin 10117, Germany
- Tobias Bleicker
- Institute of Virology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Charitéplatz 1, Berlin 10117, Germany
- Monika Eschbach-Bludau
- Institute of Virology, University Hospital, University of Bonn, Venusberg-Campus 1, Bonn 53127, Germany
- Sebastian Brünink
- Institute of Virology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Charitéplatz 1, Berlin 10117, Germany
- Barbara Mühlemann
- Institute of Virology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Charitéplatz 1, Berlin 10117, Germany
- German Centre for Infection Research (DZIF), Partner Site Berlin, Charitéplatz 1, Berlin 10117, Germany
- Julia Schneider
- Institute of Virology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Charitéplatz 1, Berlin 10117, Germany
- German Centre for Infection Research (DZIF), Partner Site Berlin, Charitéplatz 1, Berlin 10117, Germany
- Jörn Beheim-Schwarzbach
- Institute of Virology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Charitéplatz 1, Berlin 10117, Germany
- S Jacques Rakotondranary
- Institute of Cell and Systems Biology of Animals, Universität Hamburg, Martin-Luther-King Platz 3, Hamburg 20146, Germany
- Département Biologie Animale, Faculté des Sciences, Université d’Antananarivo, P.O. Box 906, Antananarivo 101, Madagascar
- Yedidya R Ratovonamana
- Institute of Cell and Systems Biology of Animals, Universität Hamburg, Martin-Luther-King Platz 3, Hamburg 20146, Germany
- Département Biologie Animale, Faculté des Sciences, Université d’Antananarivo, P.O. Box 906, Antananarivo 101, Madagascar
- Cedric Tsagnangara
- Tropical Biodiversity and Social Enterprise SARL, Immeuble CNAPS, premier étage, Fort Dauphin 614, Madagascar
- Refaly Ernest
- Tropical Biodiversity and Social Enterprise SARL, Immeuble CNAPS, premier étage, Fort Dauphin 614, Madagascar
- Simone Sommer
- Institute of Evolutionary Ecology and Conservation Genomics, University of Ulm, Albert-Einstein Allee 11, Ulm 89069, Germany
- Nadine Stetter
- Institute of Cell and Systems Biology of Animals, Universität Hamburg, Martin-Luther-King Platz 3, Hamburg 20146, Germany
- Bernhard Nocht Institute for Tropical Medicine, Bernhard-Nocht-Straße 74, Hamburg 20359, Germany
- Terry C Jones
- Institute of Virology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Charitéplatz 1, Berlin 10117, Germany
- Centre for Pathogen Evolution, Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK
- Christian Drosten
- Institute of Virology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Charitéplatz 1, Berlin 10117, Germany
- German Centre for Infection Research (DZIF), Partner Site Berlin, Charitéplatz 1, Berlin 10117, Germany
- Jörg U Ganzhorn
- Institute of Cell and Systems Biology of Animals, Universität Hamburg, Martin-Luther-King Platz 3, Hamburg 20146, Germany
- Victor M Corman
- Institute of Virology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Charitéplatz 1, Berlin 10117, Germany
- German Centre for Infection Research (DZIF), Partner Site Berlin, Charitéplatz 1, Berlin 10117, Germany
- Labor Berlin, Charité—Vivantes GmbH, Sylter Straße 2, Berlin 13353, Germany
7
Vargas RA, Soto-Aguilera S, Parra M, Herrera S, Santibañez A, Kossack C, Saavedra CP, Mora O, Pineda M, Gonzalez O, Gonzalez A, Maisey K, Torres-Maravilla E, Bermúdez-Humarán LG, Suárez-Villota EY, Tello M. Analysis of microbiota-host communication mediated by butyrate in Atlantic Salmon. Comput Struct Biotechnol J 2023; 21:2558-2578. [PMID: 37122632 PMCID: PMC10130356 DOI: 10.1016/j.csbj.2023.03.050] [Received: 06/22/2022] [Revised: 03/28/2023] [Accepted: 03/29/2023] [Indexed: 04/03/2023] Open
Abstract
Butyrate is a microbiota-produced metabolite, sensed by host short-chain fatty acid receptors FFAR2 (Gpr43), FFAR3 (Gpr41), HCAR2 (Gpr109A), and histone deacetylase (HDAC), that promotes microbiota-host crosstalk. Butyrate influences energy uptake, developmental and immune response in mammals. This microbial metabolite is produced by around 79 anaerobic genera present in the mammalian gut, yet little is known about the role of butyrate in the host-microbiota interaction in salmonid fish. To further our knowledge of this interaction, we analyzed the intestinal microbiota and genome of Atlantic salmon (Salmo salar), searching for butyrate-producing genera and host butyrate receptors. We identified Firmicutes, Proteobacteria, and Actinobacteria as the main butyrate-producing bacteria in the salmon gut microbiota. In the Atlantic salmon genome, we identified an expansion of genes orthologous to FFAR2 and HCAR2 receptors, and class I and IIa HDACs that are sensitive to butyrate. In addition, we determined the expression levels of orthologues of HCAR2 in the gut, spleen, and head-kidney, and of FFAR2 in RTgutGC cells. The effect of butyrate on the Atlantic salmon immune response was evaluated by analyzing the pro- and anti-inflammatory cytokine response in vitro in SHK-1 cells by RT-qPCR. Butyrate decreased the expression of the pro-inflammatory cytokine IL-1β and increased the anti-inflammatory cytokines IL-10 and TGF-β. Butyrate also reduced the expression of interferon-alpha, Mx, and PKR, and decreased the viral load at a higher concentration (4 mM) in cells treated with this molecule before infection with Infectious Pancreatic Necrosis Virus (IPNV), by mechanisms independent of FFAR2, FFAR3 and HCAR2 expression that probably inhibit HDAC. Moreover, butyrate modified phosphorylation of cytoplasmic proteins in RTgutGC cells. Our data allow us to infer that Atlantic salmon have the ability to sense butyrate produced by their gut microbiota via different specific targets, through which butyrate modulates the immune response of pro- and anti-inflammatory cytokines and the antiviral response.
8
Lu Y, Shimada K, Tang S, Zhang J, Ogawa Y, Noda T, Shibuya H, Ikawa M. 1700029I15Rik orchestrates the biosynthesis of acrosomal membrane proteins required for sperm-egg interaction. Proc Natl Acad Sci U S A 2023. [PMID: 36787362 DOI: 10.1101/2022.04.15.488448] [Indexed: 05/13/2023] Open
Abstract
Sperm acrosomal membrane proteins, such as Izumo sperm-egg fusion 1 (IZUMO1) and sperm acrosome-associated 6 (SPACA6), play essential roles in mammalian gamete binding or fusion. How their biosynthesis is regulated during spermiogenesis has largely remained elusive. Here, we show that 1700029I15Rik knockout male mice are severely subfertile and their spermatozoa do not fuse with eggs. 1700029I15Rik is a type-II transmembrane protein expressed in early round spermatids but not in mature spermatozoa. It interacts with proteins involved in N-linked glycosylation, disulfide isomerization, and endoplasmic reticulum (ER)-Golgi trafficking, suggesting a potential role in nascent protein processing. The ablation of 1700029I15Rik destabilizes non-catalytic subunits of the oligosaccharyltransferase (OST) complex that are pivotal for N-glycosylation. The knockout testes exhibit normal expression of sperm plasma membrane proteins, but decreased abundance of multiple acrosomal membrane proteins involved in fertilization. The knockout sperm show upregulated chaperones related to ER-associated degradation (ERAD) and elevated protein ubiquitination; strikingly, SPACA6 becomes undetectable. Our results support a specific, 1700029I15Rik-mediated pathway underpinning the biosynthesis of acrosomal membrane proteins during spermiogenesis.
Affiliation(s)
- Yonggang Lu
- Immunology Frontier Research Center, Osaka University, Osaka 565-0871, Japan
- Department of Experimental Genome Research, Research Institute for Microbial Diseases, Osaka University, Osaka 565-0871, Japan
- Kentaro Shimada
- Department of Experimental Genome Research, Research Institute for Microbial Diseases, Osaka University, Osaka 565-0871, Japan
- Graduate School of Pharmaceutical Sciences, Osaka University, Osaka 565-0871, Japan
- Shaogeng Tang
- Sarafan ChEM-H, Stanford University, Stanford, CA 94305
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305
- Jingjing Zhang
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg SE-41390, Sweden
- Yo Ogawa
- Department of Experimental Genome Research, Research Institute for Microbial Diseases, Osaka University, Osaka 565-0871, Japan
- Graduate School of Pharmaceutical Sciences, Osaka University, Osaka 565-0871, Japan
- Taichi Noda
- Department of Experimental Genome Research, Research Institute for Microbial Diseases, Osaka University, Osaka 565-0871, Japan
- Division of Reproductive Biology, Institute of Resource Development and Analysis, Kumamoto University, Kumamoto 860-0811, Japan
- Priority Organization for Innovation and Excellence, Kumamoto University, Kumamoto 860-8555, Japan
- Hiroki Shibuya
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg SE-41390, Sweden
- Masahito Ikawa
- Immunology Frontier Research Center, Osaka University, Osaka 565-0871, Japan
- Department of Experimental Genome Research, Research Institute for Microbial Diseases, Osaka University, Osaka 565-0871, Japan
- Graduate School of Pharmaceutical Sciences, Osaka University, Osaka 565-0871, Japan
- Laboratory of Reproductive Systems Biology, Institute of Medical Science, The University of Tokyo, Tokyo 108-8639, Japan
- Center for Infectious Disease Education and Research, Osaka University, Osaka 565-0871, Japan
9
Graf F, Zehentner B, Fellner L, Scherer S, Neuhaus K. Three Novel Antisense Overlapping Genes in E. coli O157:H7 EDL933. Microbiol Spectr 2023; 11:e0235122. [PMID: 36533921 PMCID: PMC9927249 DOI: 10.1128/spectrum.02351-22] [Received: 06/22/2022] [Accepted: 12/03/2022] [Indexed: 12/23/2022] Open
Abstract
The abundance of long overlapping genes in prokaryotic genomes is likely to be significantly underestimated. To date, only a few examples of such genes are fully established. Using RNA sequencing and ribosome profiling, we found expression of novel overlapping open reading frames in Escherichia coli O157:H7 EDL933 (EHEC). Indeed, the overlapping candidate genes are equipped with typical structural elements required for transcription and translation, i.e., promoters, transcription start sites, as well as terminators, all of which were experimentally verified. Translationally arrested mutants, unable to produce the overlapping encoded protein, were found to have a growth disadvantage when grown competitively against the wild type. Thus, the phenotypes found imply biological functionality of the genes at the level of proteins produced. The addition of 3 more examples of prokaryotic overlapping genes to the currently limited, yet constantly growing pool of such genes emphasizes the underestimated coding capacity of bacterial genomes.
IMPORTANCE: The abundance of long overlapping genes in prokaryotic genomes is likely to be significantly underestimated, since such genes are not allowed in genome annotations. However, ribosome profiling catches mRNA at the moment it serves as a template for protein production. Using this technique and subsequent experiments, we verified 3 novel overlapping genes encoded in antisense to known genes. This adds more examples of prokaryotic overlapping genes to the currently limited, yet constantly growing pool of such genes.
Affiliation(s)
- Franziska Graf
- Core Facility Microbiome, ZIEL – Institute for Food & Health, Technische Universität München, Freising, Germany
- Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
- Barbara Zehentner
- Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
- Lea Fellner
- Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
- Siegfried Scherer
- Core Facility Microbiome, ZIEL – Institute for Food & Health, Technische Universität München, Freising, Germany
- Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
- Klaus Neuhaus
- Core Facility Microbiome, ZIEL – Institute for Food & Health, Technische Universität München, Freising, Germany
- Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
10
Sun J, Kulandaisamy A, Liu J, Hu K, Gromiha MM, Zhang Y. Machine learning in computational modelling of membrane protein sequences and structures: From methodologies to applications. Comput Struct Biotechnol J 2023; 21:1205-1226. [PMID: 36817959 PMCID: PMC9932300 DOI: 10.1016/j.csbj.2023.01.036] [Received: 11/16/2022] [Revised: 01/16/2023] [Accepted: 01/25/2023] [Indexed: 01/29/2023] Open
Abstract
Membrane proteins mediate a wide spectrum of biological processes, such as signal transduction and cell communication. Due to the arduous and costly nature inherent to the experimental process, membrane proteins have long been devoid of well-resolved atomic-level tertiary structures and, consequently, the understanding of their functional roles underlying a multitude of life activities has been hampered. Currently, computational tools dedicated to furthering the structure-function understanding are primarily focused on utilizing intelligent algorithms to address a variety of site-wise prediction problems (e.g., topology and interaction sites), but are scattered across different computing sources. Moreover, the recent advent of deep learning techniques has immensely expedited the development of computational tools for membrane protein-related prediction problems. Given the growing number of applications optimized particularly by manifold deep neural networks, we herein provide a review on the current status of computational strategies mainly in membrane protein type classification, topology identification, interaction site detection, and pathogenic effect prediction. Meanwhile, we provide an overview of how the entire prediction process proceeds, including database collection, data pre-processing, feature extraction, and method selection. This review is expected to be useful for developing more extendable computational tools specific to membrane proteins.
Affiliation(s)
- Jianfeng Sun
- Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Headington, Oxford OX3 7LD, UK
- Arulsamy Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- Jacklyn Liu
- UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, UK
- Kai Hu
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China
| | - M. Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India,Corresponding authors.
| | - Yuan Zhang
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China,Corresponding authors.
| |
11
Südfeld C, Kiyani A, Wefelmeier K, Wijffels RH, Barbosa MJ, D’Adamo S. Expression of glycerol-3-phosphate acyltransferase increases non-polar lipid accumulation in Nannochloropsis oceanica. Microb Cell Fact 2023; 22:12. [PMID: 36647076 PMCID: PMC9844033 DOI: 10.1186/s12934-022-01987-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 12/09/2022] [Indexed: 01/18/2023]
Abstract
Microalgae are considered a suitable production platform for high-value lipids and oleochemicals. Several species including Nannochloropsis oceanica produce large amounts of essential ω-3 polyunsaturated fatty acids (PUFAs) which are integral components of food and feed and have been associated with health-promoting effects. N. oceanica can further accumulate high contents of non-polar lipids with chemical properties that render them a potential replacement for plant oils such as palm oil. However, biomass and lipid productivities obtained with microalgae need to be improved to reach commercial feasibility. Genetic engineering can improve biomass and lipid productivities, for instance by increasing carbon flux to lipids. Here, we report the overexpression of glycerol-3-phosphate acyltransferase (GPAT) in N. oceanica during favorable growth conditions as a strategy to increase non-polar lipid content. Transformants overproducing either an endogenous (NoGPAT) or a heterologous (Acutodesmus obliquus GPAT) GPAT enzyme targeted to the endoplasmic reticulum had up to 42% and 51% increased non-polar lipid contents, respectively, compared to the wild type. Biomass productivities of transformant strains were not substantially impaired, resulting in lipid productivities that were increased by up to 37% and 42% for NoGPAT and AoGPAT transformants, respectively. When exposed to nutrient stress, transformants and wild type had similar lipid contents, suggesting that the GPAT enzyme exerts strong flux control on lipid synthesis in N. oceanica under favorable growth conditions. NoGPAT transformants further accumulated PUFAs in non-polar lipids, reaching a total of 6.8% PUFAs per biomass, an increase of 24% relative to the wild type. Overall, our results indicate that GPAT is an interesting target for engineering lipid metabolism in microalgae to improve non-polar lipid and PUFA accumulation.
Affiliation(s)
- Christian Südfeld
  - Wageningen University, Bioprocess Engineering, PO Box 16, 6700 AA Wageningen, Netherlands
- Aamna Kiyani
  - Wageningen University, Bioprocess Engineering, PO Box 16, 6700 AA Wageningen, Netherlands
  - Department of Microbiology, Quaid-I-Azam University, Islamabad, 45320 Pakistan
- Katrin Wefelmeier
  - Wageningen University, Bioprocess Engineering, PO Box 16, 6700 AA Wageningen, Netherlands
- René H. Wijffels
  - Wageningen University, Bioprocess Engineering, PO Box 16, 6700 AA Wageningen, Netherlands
  - Faculty of Biosciences and Aquaculture, Nord University, N-8049 Bodø, Norway
- Maria J. Barbosa
  - Wageningen University, Bioprocess Engineering, PO Box 16, 6700 AA Wageningen, Netherlands
- Sarah D’Adamo
  - Wageningen University, Bioprocess Engineering, PO Box 16, 6700 AA Wageningen, Netherlands
12
Elnaggar A, Heinzinger M, Dallago C, Rehawi G, Wang Y, Jones L, Gibbs T, Feher T, Angerer C, Steinegger M, Bhowmik D, Rost B. ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:7112-7127. [PMID: 34232869 DOI: 10.1109/tpami.2021.3095381] [Citation(s) in RCA: 335] [Impact Index Per Article: 167.5] [Indexed: 05/10/2023]
Abstract
Computational biology and bioinformatics provide vast data gold-mines from protein sequences, ideal for Language Models (LMs) taken from Natural Language Processing (NLP). These LMs reach for new prediction frontiers at low inference costs. Here, we trained two auto-regressive models (Transformer-XL, XLNet) and four auto-encoder models (BERT, Albert, Electra, T5) on data from UniRef and BFD containing up to 393 billion amino acids. The protein LMs (pLMs) were trained on the Summit supercomputer using 5616 GPUs and TPU Pod up-to 1024 cores. Dimensionality reduction revealed that the raw pLM-embeddings from unlabeled data captured some biophysical features of protein sequences. We validated the advantage of using the embeddings as exclusive input for several subsequent tasks: (1) a per-residue (per-token) prediction of protein secondary structure (3-state accuracy Q3=81%-87%); (2) per-protein (pooling) predictions of protein sub-cellular location (ten-state accuracy: Q10=81%) and membrane versus water-soluble (2-state accuracy Q2=91%). For secondary structure, the most informative embeddings (ProtT5) for the first time outperformed the state-of-the-art without multiple sequence alignments (MSAs) or evolutionary information thereby bypassing expensive database searches. Taken together, the results implied that pLMs learned some of the grammar of the language of life. All our models are available through https://github.com/agemagician/ProtTrans.
13
Elnaggar A, Heinzinger M, Dallago C, Rehawi G, Wang Y, Jones L, Gibbs T, Feher T, Angerer C, Steinegger M, Bhowmik D, Rost B. ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022. [PMID: 34232869 DOI: 10.1101/2020.07.12.199554] [Citation(s) in RCA: 66] [Impact Index Per Article: 33.0] [Indexed: 05/15/2023]
14
Aliper ET, Krylov NA, Nolde DE, Polyansky AA, Efremov RG. A Uniquely Stable Trimeric Model of SARS-CoV-2 Spike Transmembrane Domain. Int J Mol Sci 2022; 23:ijms23169221. [PMID: 36012488 PMCID: PMC9409440 DOI: 10.3390/ijms23169221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2022] [Revised: 08/13/2022] [Accepted: 08/15/2022] [Indexed: 11/16/2022]
Abstract
Understanding fusion mechanisms employed by SARS-CoV-2 spike protein entails realistic transmembrane domain (TMD) models, while no reliable approaches towards predicting the 3D structure of transmembrane (TM) trimers exist. Here, we propose a comprehensive computational framework to model the spike TMD only based on its primary structure. We performed amino acid sequence pattern matching and compared the molecular hydrophobicity potential (MHP) distribution on the helix surface against TM homotrimers with known 3D structures and selected an appropriate template for homology modeling. We then iteratively built a model of spike TMD, adjusting “dynamic MHP portraits” and residue variability motifs. The stability of this model, with and without palmitoyl modifications downstream of the TMD, and several alternative configurations (including a recent NMR structure), was tested in all-atom molecular dynamics simulations in a POPC bilayer mimicking the viral envelope. Our model demonstrated unique stability under the conditions applied and conforms to known basic principles of TM helix packing. The original computational framework looks promising and could potentially be employed in the construction of 3D models of TM trimers for a wide range of membrane proteins.
Affiliation(s)
- Elena T. Aliper
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10 Miklukho-Maklaya St., 117997 Moscow, Russia
- Nikolay A. Krylov
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10 Miklukho-Maklaya St., 117997 Moscow, Russia
  - National Research University Higher School of Economics, 101000 Moscow, Russia
- Dmitry E. Nolde
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10 Miklukho-Maklaya St., 117997 Moscow, Russia
  - National Research University Higher School of Economics, 101000 Moscow, Russia
- Anton A. Polyansky
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10 Miklukho-Maklaya St., 117997 Moscow, Russia
  - Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna BioCenter 5, A-1030 Vienna, Austria
- Roman G. Efremov
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10 Miklukho-Maklaya St., 117997 Moscow, Russia
  - National Research University Higher School of Economics, 101000 Moscow, Russia
  - Moscow Institute of Physics and Technology (State University), 141701 Dolgoprudny, Russia
  - Correspondence:
15
Beirne C, McCann E, McDowell A, Miliotis G. Genetic determinants of antimicrobial resistance in three multi-drug resistant strains of Cutibacterium acnes isolated from patients with acne: a predictive in silico study. Access Microbiol 2022; 4:acmi000404. [PMID: 36133174 PMCID: PMC9484663 DOI: 10.1099/acmi.0.000404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2021] [Accepted: 07/06/2022] [Indexed: 01/09/2023]
Abstract
Objectives. Using available whole genome data, the objective of this in silico study was to identify genetic mechanisms that could explain the antimicrobial resistance profile of three multi-drug resistant (MDR) strains (CA17, CA51, CA39) of the skin bacterium Cutibacterium acnes previously recovered from patients with acne. In particular, we were interested in detecting novel genetic determinants associated with resistance to fluoroquinolone and macrolide antibiotics that could then be confirmed experimentally.
Methods. A range of open source bioinformatics tools were used to ‘mine’ genetic determinants of antimicrobial resistance and plasmid borne contigs, and to characterise the phylogenetic diversity of the MDR strains.
Results. As probable mechanisms of resistance to fluoroquinolones, we identified a previously described resistance associated allelic variant of the gyrA gene with a ‘deleterious’ S101L mutation in type IA1 strains CA51 (ST1) and CA39 (ST1), as well as a novel E761R ‘deleterious’ mutation in the type II strain CA17 (ST153). A distinct genomic sequence of the efflux protein YfmO which is potentially associated with resistance to MLSB antibiotics was also present in CA17; homologues in CA51, CA39, and other strains of Cutibacterium acnes were also found but differed in amino acid content. Strikingly, in CA17 we also identified a circular 2.7 kb non-conjugative plasmid (designated pCA17) that closely resembled a 4.8 kb plasmid (pYU39) from the MDR Salmonella enterica strain YU39.
Conclusions. This study has provided a detailed explanation of potential genetic determinants for MDR in the Cutibacterium acnes strains CA17, CA39 and CA51. Further laboratory investigations will be required to validate these in silico results, especially in relation to pCA17.
Affiliation(s)
- Catriona Beirne
  - Antimicrobial Resistance and Microbial Ecology Group, School of Medicine, National University of Ireland, Galway, Ireland
- Emily McCann
  - Antimicrobial Resistance and Microbial Ecology Group, School of Medicine, National University of Ireland, Galway, Ireland
- Andrew McDowell
  - Nutrition Innovation Centre for Food and Health, (NICHE), School of Biomedical Sciences, Ulster University, Coleraine, Ireland
- Georgios Miliotis
  - Antimicrobial Resistance and Microbial Ecology Group, School of Medicine, National University of Ireland, Galway, Ireland
16
Bernhofer M, Rost B. TMbed: transmembrane proteins predicted through language model embeddings. BMC Bioinformatics 2022; 23:326. [PMID: 35941534 PMCID: PMC9358067 DOI: 10.1186/s12859-022-04873-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 06/12/2022] [Accepted: 08/03/2022] [Indexed: 12/30/2022]
Abstract
BACKGROUND Despite the immense importance of transmembrane proteins (TMP) for molecular biology and medicine, experimental 3D structures for TMPs remain about 4-5 times underrepresented compared to non-TMPs. Today's top methods such as AlphaFold2 accurately predict 3D structures for many TMPs, but annotating transmembrane regions remains a limiting step for proteome-wide predictions. RESULTS Here, we present TMbed, a novel method that takes embeddings from protein Language Models (pLMs, here ProtT5) as input to predict for each residue one of four classes: transmembrane helix (TMH), transmembrane strand (TMB), signal peptide, or other. TMbed completes predictions for entire proteomes within hours on a single consumer-grade desktop machine at performance levels similar to or better than those of methods that use evolutionary information from multiple sequence alignments (MSAs) of protein families. On the per-protein level, TMbed correctly identified 94 ± 8% of the beta barrel TMPs (53 of 57) and 98 ± 1% of the alpha helical TMPs (557 of 571) in a non-redundant data set, at false positive rates well below 1% (erred on 30 of 5654 non-membrane proteins). On the per-segment level, TMbed correctly placed, on average, 9 of 10 transmembrane segments within five residues of the experimental observation. Our method can handle sequences of up to 4200 residues on standard graphics cards used in desktop PCs (e.g., NVIDIA GeForce RTX 3060). CONCLUSIONS Based on embeddings from pLMs and two novel filters (Gaussian and Viterbi), TMbed predicts alpha helical and beta barrel TMPs at least as accurately as any other method but at lower false positive rates. Given the few false positives and its outstanding speed, TMbed might be ideal to sieve through millions of 3D structures soon to be predicted, e.g., by AlphaFold2.
Affiliation(s)
- Michael Bernhofer
  - Department of Informatics, Bioinformatics and Computational Biology ‑ i12, Technical University of Munich (TUM), Boltzmannstr. 3, 85748, Garching, Germany
  - TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
- Burkhard Rost
  - Department of Informatics, Bioinformatics and Computational Biology ‑ i12, Technical University of Munich (TUM), Boltzmannstr. 3, 85748, Garching, Germany
  - Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748, Garching, Germany
  - TUM School of Life Sciences Weihenstephan (TUM-WZW), Alte Akademie 8, Freising, Germany
17
Wang L, Zhong H, Xue Z, Wang Y. Improving the topology prediction of α-helical transmembrane proteins with deep transfer learning. Comput Struct Biotechnol J 2022; 20:1993-2000. [PMID: 35521551 PMCID: PMC9062415 DOI: 10.1016/j.csbj.2022.04.024] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/16/2021] [Revised: 04/09/2022] [Accepted: 04/17/2022] [Indexed: 11/11/2022]
Abstract
Transmembrane proteins (TMPs) are essential for cell recognition and communication, and they serve as important drug targets in humans. The 3D structures of transmembrane proteins are critical for determining their functions and for drug design but are hard to determine even by experimental methods. Although some computational methods have been developed to predict transmembrane helices (TMHs) and orientation, there is still room for improvement. Considering that a pre-trained language model can make full use of massive unlabeled protein sequences to obtain latent feature representations for TMPs and reduce the dependence on evolutionary information, we proposed DeepTMpred, which used pre-trained self-supervised language models called ESM, convolutional neural networks, an attentive neural network and conditional random fields for alpha-TMP topology prediction. Compared with the current state-of-the-art tools on a non-redundant dataset of TMPs, DeepTMpred demonstrated superior predictive performance in most evaluation metrics, especially at the TMH level. Furthermore, DeepTMpred could also obtain reliable prediction results within a few seconds for TMPs with little evolutionary information. A tutorial on how to use DeepTMpred can be found in the colab notebook (https://colab.research.google.com/github/ISYSLAB-HUST/DeepTMpred/blob/master/notebook/test.ipynb).
18
Feng SH, Xia CQ, Zhang PD, Shen HB. Ab-Initio Membrane Protein Amphipathic Helix Structure Prediction Using Deep Neural Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:795-805. [PMID: 33026978 DOI: 10.1109/tcbb.2020.3029274] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
The amphipathic helix (AH) features the segregation of polar and nonpolar residues and plays important roles in many membrane-associated biological processes by interacting with both the lipid and the soluble phases. Although the AH structure has long been known, few ab initio machine learning-based prediction models have been reported, owing to the limited amount of training data. In this study, we report a new deep learning-based prediction model, which is composed of a residual neural network and the uneven-thresholds decision algorithm. It is constructed on 121 membrane proteins, totaling 51640 residue samples, which are curated from an up-to-date membrane protein structure database. Through a rigorous 10-fold nested cross-validation experiment, we demonstrate that our model can achieve promising predictions and exceed current state-of-the-art approaches in this field. This presents a new avenue for accurately predicting AHs. Analysis of the contribution of the input residues and of selected cases further reveals the high interpretability and generalization of our model.
19
Monjarás Feria J, Valvano MA. Exploring the Topology of Cytoplasmic Membrane Proteins Involved in Lipopolysaccharide Biosynthesis by in Silico and Biochemical Analyses. Methods Mol Biol 2022; 2548:71-82. [PMID: 36151492 DOI: 10.1007/978-1-0716-2581-1_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
In the absence of a three-dimensional structure, revealing the topology of a membrane protein provides relevant information to identify the number and orientation of transmembrane helices and the localization of critical amino acid residues, contributing to a better understanding of function and intermolecular associations. Topology can be predicted in silico by bioinformatic analysis or solved by biochemical methods. In this chapter, we describe a pipeline employing bioinformatic approaches for the prediction of membrane protein topology, followed by experimental validation through the substituted-cysteine accessibility method and the analysis of the protein's oligomerization state.
Affiliation(s)
- Julia Monjarás Feria
  - Wellcome-Wolfson Institute for Experimental Medicine, Queen's University Belfast, Belfast, UK
- Miguel A Valvano
  - Wellcome-Wolfson Institute for Experimental Medicine, Queen's University Belfast, Belfast, UK
20
Yang Y, Yu J, Liu Z, Wang X, Wang H, Ma Z, Xu D. An Improved Topology Prediction of Alpha-Helical Transmembrane Protein Based on Deep Multi-Scale Convolutional Neural Network. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:295-304. [PMID: 32750879 DOI: 10.1109/tcbb.2020.3005813] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
Alpha-helical transmembrane proteins (αTMPs) are essential in various biological processes. Although their tertiary structures are crucial for revealing complex functions, experimental structure determination remains challenging and costly. In the past decades, various sequence-based topology prediction methods have been developed to bridge the gap between sequences and structures by characterizing structural features, but significant improvements are still required. Deep learning brings a great opportunity because of its powerful capability to learn representations from limited original data. In this work, we improved our αTMP topology prediction method DMCTOP using deep learning; it is composed of two deep convolutional blocks that simultaneously extract local and global contextual features. Consequently, the inputs were simplified to reflect the original features of the sequence, including a protein sequence feature and an evolutionary conservation feature. DMCTOP can efficiently and accurately identify all topological types and the N-terminal orientation for an αTMP sequence. To validate the effectiveness of our method, we benchmarked DMCTOP against 13 peer methods according to the whole sequence, the transmembrane segment and the traditional criterion in testing experiments. All the results reveal that our method achieved the highest prediction accuracy and outperformed all the previous methods. The method is available at https://icdtools.nenu.edu.cn/dmctop.
21
O’Donoghue SI, Schafferhans A, Sikta N, Stolte C, Kaur S, Ho BK, Anderson S, Procter JB, Dallago C, Bordin N, Adcock M, Rost B. SARS-CoV-2 structural coverage map reveals viral protein assembly, mimicry, and hijacking mechanisms. Mol Syst Biol 2021; 17:e10079. [PMID: 34519429 PMCID: PMC8438690 DOI: 10.15252/msb.202010079] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/24/2020] [Revised: 08/05/2021] [Accepted: 08/06/2021] [Indexed: 01/18/2023]
Abstract
We modeled 3D structures of all SARS-CoV-2 proteins, generating 2,060 models that span 69% of the viral proteome and provide details not available elsewhere. We found that ~6% of the proteome mimicked human proteins, while ~7% was implicated in hijacking mechanisms that reverse post-translational modifications, block host translation, and disable host defenses; a further ~29% self-assembled into heteromeric states that provided insight into how the viral replication and translation complex forms. To make these 3D models more accessible, we devised a structural coverage map, a novel visualization method to show what is, and is not, known about the 3D structure of the viral proteome. We integrated the coverage map into an accompanying online resource (https://aquaria.ws/covid) that can be used to find and explore models corresponding to the 79 structural states identified in this work. The resulting Aquaria-COVID resource helps scientists use emerging structural data to understand the mechanisms underlying coronavirus infection and draws attention to the 31% of the viral proteome that remains structurally unknown or dark.
MESH Headings
- Amino Acid Transport Systems, Neutral/chemistry
- Amino Acid Transport Systems, Neutral/genetics
- Amino Acid Transport Systems, Neutral/metabolism
- Angiotensin-Converting Enzyme 2/chemistry
- Angiotensin-Converting Enzyme 2/genetics
- Angiotensin-Converting Enzyme 2/metabolism
- Binding Sites
- COVID-19/genetics
- COVID-19/metabolism
- COVID-19/virology
- Computational Biology/methods
- Coronavirus Envelope Proteins/chemistry
- Coronavirus Envelope Proteins/genetics
- Coronavirus Envelope Proteins/metabolism
- Coronavirus Nucleocapsid Proteins/chemistry
- Coronavirus Nucleocapsid Proteins/genetics
- Coronavirus Nucleocapsid Proteins/metabolism
- Host-Pathogen Interactions/genetics
- Humans
- Mitochondrial Membrane Transport Proteins/chemistry
- Mitochondrial Membrane Transport Proteins/genetics
- Mitochondrial Membrane Transport Proteins/metabolism
- Mitochondrial Precursor Protein Import Complex Proteins
- Models, Molecular
- Molecular Mimicry
- Neuropilin-1/chemistry
- Neuropilin-1/genetics
- Neuropilin-1/metabolism
- Phosphoproteins/chemistry
- Phosphoproteins/genetics
- Phosphoproteins/metabolism
- Protein Binding
- Protein Conformation, alpha-Helical
- Protein Conformation, beta-Strand
- Protein Interaction Domains and Motifs
- Protein Interaction Mapping/methods
- Protein Multimerization
- Protein Processing, Post-Translational
- SARS-CoV-2/chemistry
- SARS-CoV-2/genetics
- SARS-CoV-2/metabolism
- Spike Glycoprotein, Coronavirus/chemistry
- Spike Glycoprotein, Coronavirus/genetics
- Spike Glycoprotein, Coronavirus/metabolism
- Viral Matrix Proteins/chemistry
- Viral Matrix Proteins/genetics
- Viral Matrix Proteins/metabolism
- Viroporin Proteins/chemistry
- Viroporin Proteins/genetics
- Viroporin Proteins/metabolism
- Virus Replication
Affiliation(s)
- Seán I O’Donoghue
  - Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
  - CSIRO Data61, Canberra, ACT, Australia
  - School of Biotechnology and Biomolecular Sciences (UNSW), Kensington, NSW, Australia
- Andrea Schafferhans
  - Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
  - Department of Bioengineering Sciences, Weihenstephan‐Tr. University of Applied Sciences, Freising, Germany
  - Department of Informatics, Bioinformatics & Computational Biology, Technical University of Munich, Munich, Germany
- Neblina Sikta
  - Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
- Sandeep Kaur
  - Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences (UNSW), Kensington, NSW, Australia
- Bosco K Ho
  - Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
- Christian Dallago
  - Department of Informatics, Bioinformatics & Computational Biology, Technical University of Munich, Munich, Germany
- Nicola Bordin
  - Institute of Structural and Molecular Biology, University College London, London, UK
- Burkhard Rost
  - Department of Informatics, Bioinformatics & Computational Biology, Technical University of Munich, Munich, Germany
22
Spectrum of Protein Location in Proteomes Captures Evolutionary Relationship Between Species. J Mol Evol 2021; 89:544-553. [PMID: 34328525 PMCID: PMC8379119 DOI: 10.1007/s00239-021-10022-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/05/2020] [Accepted: 07/16/2021] [Indexed: 11/10/2022]
Abstract
The native subcellular location (also referred to as localization or cellular compartment) of a protein is the one in which it acts most frequently; it is one aspect of protein function. Do ten eukaryotic model organisms differ in their location spectrum, i.e., the fraction of each proteome in each of seven major cellular compartments? As experimental annotations of locations remain biased and incomplete, we need prediction methods to answer this question. After systematic bias corrections, the complete but faulty prediction methods appeared to be more appropriate to compare location spectra between species than the incomplete but more accurate experimental data. This work compared the location spectra for ten eukaryotes: Homo sapiens (human), Gorilla gorilla (gorilla), Pan troglodytes (chimpanzee), Mus musculus (mouse), Rattus norvegicus (rat), Drosophila melanogaster (fruit/vinegar fly), Anopheles gambiae (African malaria mosquito), Caenorhabditis elegans (nematode), Saccharomyces cerevisiae (baker’s yeast), and Schizosaccharomyces pombe (fission yeast). The two largest classes were predicted to be the nucleus and the cytoplasm, together accounting for 47–62% of all proteins, while 7–21% of the proteins were predicted in the plasma membrane and 4–15% to be secreted. Overall, the predicted location spectra were largely similar. However, in detail, the differences sufficed to plot trees (UPGMA) and 2D (PCA) maps relating the ten organisms using a simple Euclidean distance in seven states (location classes). The relations based on the simple predicted location spectra captured aspects of cross-species comparisons usually revealed only by much more detailed evolutionary comparisons. Most interestingly, known phylogenetic relations were reproduced better by paralog-only than by ortholog-only trees.
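As a rough illustration of the comparison described in this abstract, the following Python sketch computes a simple Euclidean distance between location spectra, each represented as a vector of seven fractions. The species labels, the numeric values, and the function names are hypothetical placeholders chosen for this sketch; they are not data or code from the paper, and the UPGMA and PCA steps are not shown.

```python
import math
from itertools import combinations


def euclidean_distance(spectrum_a, spectrum_b):
    """Euclidean distance between two location spectra of equal length."""
    if len(spectrum_a) != len(spectrum_b):
        raise ValueError("spectra must have the same number of location classes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(spectrum_a, spectrum_b)))


def pairwise_distances(spectra):
    """Return {(name_1, name_2): distance} for every unordered pair of species."""
    return {
        (n1, n2): euclidean_distance(spectra[n1], spectra[n2])
        for n1, n2 in combinations(sorted(spectra), 2)
    }


# Made-up spectra (NOT from the paper): fraction of the proteome assigned to
# each of seven location classes; each vector sums to 1.
example_spectra = {
    "species_A": [0.30, 0.25, 0.15, 0.10, 0.10, 0.05, 0.05],
    "species_B": [0.28, 0.27, 0.15, 0.10, 0.10, 0.05, 0.05],
    "species_C": [0.20, 0.20, 0.20, 0.15, 0.10, 0.10, 0.05],
}

if __name__ == "__main__":
    for pair, dist in pairwise_distances(example_spectra).items():
        print(pair, round(dist, 4))
```

A pairwise distance table of this kind is the sort of input that a hierarchical clustering routine would take to build a tree; which tool the authors used for that step is not stated in the abstract.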
23
Bernhofer M, Dallago C, Karl T, Satagopam V, Heinzinger M, Littmann M, Olenyi T, Qiu J, Schütze K, Yachdav G, Ashkenazy H, Ben-Tal N, Bromberg Y, Goldberg T, Kajan L, O’Donoghue S, Sander C, Schafferhans A, Schlessinger A, Vriend G, Mirdita M, Gawron P, Gu W, Jarosz Y, Trefois C, Steinegger M, Schneider R, Rost B. PredictProtein - Predicting Protein Structure and Function for 29 Years. Nucleic Acids Res 2021; 49:W535-W540. [PMID: 33999203 PMCID: PMC8265159 DOI: 10.1093/nar/gkab354] [Citation(s) in RCA: 103] [Impact Index Per Article: 34.3] [Received: 02/23/2021] [Revised: 04/06/2021] [Accepted: 05/10/2021] [Indexed: 12/12/2022]
Abstract
Since 1992, PredictProtein (https://predictprotein.org) has been a one-stop online resource for protein sequence analysis; its main site is hosted at the Luxembourg Centre for Systems Biomedicine (LCSB) and was queried monthly by over 3,000 users in 2020. PredictProtein was the first Internet server for protein predictions. It pioneered combining evolutionary information and machine learning. Given a protein sequence as input, the server outputs multiple sequence alignments, predictions of protein structure in 1D and 2D (secondary structure, solvent accessibility, transmembrane segments, disordered regions, protein flexibility, and disulfide bridges) and predictions of protein function (functional effects of sequence variation or point mutations, Gene Ontology (GO) terms, subcellular localization, and protein-, RNA-, and DNA binding). PredictProtein's infrastructure has moved to the LCSB, increasing throughput; the use of MMseqs2 sequence search reduced runtime five-fold (apparently without lowering performance of prediction methods); user interface elements improved usability; and new prediction methods were added. PredictProtein recently included predictions from deep learning embeddings (GO and secondary structure) and a method for the prediction of proteins and residues binding DNA, RNA, or other proteins. PredictProtein.org aspires to provide reliable predictions to computational and experimental biologists alike. All scripts and methods are freely available for offline execution in high-throughput settings.
Affiliation(s)
- Michael Bernhofer
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
  - TUM Graduate School CeDoSIA, Boltzmannstr 11, 85748 Garching, Germany
- Christian Dallago
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
  - TUM Graduate School CeDoSIA, Boltzmannstr 11, 85748 Garching, Germany
- Tim Karl
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
- Venkata Satagopam
  - Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
  - ELIXIR Luxembourg (ELIXIR-LU) Node, University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
- Michael Heinzinger
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
  - TUM Graduate School CeDoSIA, Boltzmannstr 11, 85748 Garching, Germany
- Maria Littmann
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
  - TUM Graduate School CeDoSIA, Boltzmannstr 11, 85748 Garching, Germany
- Tobias Olenyi
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
- Jiajun Qiu
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
  - Department of Otolaryngology Head & Neck Surgery, The Ninth People's Hospital & Ear Institute, School of Medicine & Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai Jiao Tong University, Shanghai, China
- Konstantin Schütze
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
- Guy Yachdav
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
- Haim Ashkenazy
  - Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
  - The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, 69978 Tel Aviv, Israel
- Nir Ben-Tal
  - Department of Biochemistry & Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, 69978 Tel Aviv, Israel
- Yana Bromberg
  - Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08901, USA
- Tatyana Goldberg
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
- Laszlo Kajan
  - Roche Polska Sp. z o.o., Domaniewska 39B, 02–672 Warsaw, Poland
- Chris Sander
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Department of Cell Biology, Harvard Medical School, Boston, MA 02215, USA
  - Broad Institute of MIT and Harvard, Boston, MA 02142, USA
- Andrea Schafferhans
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
  - HSWT (Hochschule Weihenstephan Triesdorf | University of Applied Sciences), Department of Bioengineering Sciences, Am Hofgarten 10, 85354 Freising, Germany
- Avner Schlessinger
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Milot Mirdita
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Piotr Gawron
  - Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
- Wei Gu
  - Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
  - ELIXIR Luxembourg (ELIXIR-LU) Node, University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
- Yohan Jarosz
  - Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
  - ELIXIR Luxembourg (ELIXIR-LU) Node, University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
- Christophe Trefois
  - Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
  - ELIXIR Luxembourg (ELIXIR-LU) Node, University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
- Martin Steinegger
  - School of Biological Sciences, Seoul National University, Seoul, South Korea
  - Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
- Reinhard Schneider
  - Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
  - ELIXIR Luxembourg (ELIXIR-LU) Node, University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
- Burkhard Rost
  - TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
  - Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748 Garching/Munich, Germany
  - TUM School of Life Sciences Weihenstephan (WZW), Alte Akademie 8, Freising, Germany
|
24
|
Lomize AL, Schnitzer KA, Todd SC, Pogozheva ID. Thermodynamics-Based Molecular Modeling of α-Helices in Membranes and Micelles. J Chem Inf Model 2021; 61:2884-2896. [PMID: 34029472 DOI: 10.1021/acs.jcim.1c00161] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
The Folding of Membrane-Associated Peptides (FMAP) method was developed for modeling α-helix formation by linear peptides in micelles and lipid bilayers. FMAP 2.0 identifies locations of α-helices in the amino acid sequence, generates their three-dimensional models in planar bilayers or spherical micelles, and estimates their thermodynamic stabilities and tilt angles, depending on temperature and pH. The method was tested for 723 peptides (926 data points) experimentally studied in different environments and for 170 single-pass transmembrane (TM) proteins with available crystal structures. FMAP 2.0 detected more than 95% of experimentally observed α-helices with an average error in helix end determination of around 2, 3, 4, and 5 residues per helix for peptides in water, micelles, bilayers, and TM proteins, respectively. Helical and nonhelical residue states were predicted with an accuracy from 0.86 to 0.96, and the Matthews correlation coefficient was from 0.64 to 0.88 depending on the environment. Experimental micelle- and membrane-binding energies and tilt angles of peptides were reproduced with a root-mean-square deviation of around 2 kcal/mol and 7°, respectively. The TM and non-TM states of hydrophobic and pH-triggered α-helical peptides in various lipid bilayers were reproduced in more than 95% of cases. The FMAP 2.0 web server (https://membranome.org/fmap) is publicly available to explore the structural polymorphism of antimicrobial, cell-penetrating, fusion, and other membrane-binding peptides, which is important for understanding the mechanisms of their biological activities.
Affiliation(s)
- Andrei L Lomize
  - Department of Medicinal Chemistry, College of Pharmacy, University of Michigan, 428 Church Street, Ann Arbor, Michigan 48109-1065, United States
- Kevin A Schnitzer
  - Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, 1221 Beal Avenue, Ann Arbor, Michigan 48109-2102, United States
- Spencer C Todd
  - Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, 1221 Beal Avenue, Ann Arbor, Michigan 48109-2102, United States
- Irina D Pogozheva
  - Department of Medicinal Chemistry, College of Pharmacy, University of Michigan, 428 Church Street, Ann Arbor, Michigan 48109-1065, United States
|
25
|
Computational prediction of secreted proteins in gram-negative bacteria. Comput Struct Biotechnol J 2021; 19:1806-1828. [PMID: 33897982 PMCID: PMC8047123 DOI: 10.1016/j.csbj.2021.03.019] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 11/19/2020] [Revised: 03/18/2021] [Accepted: 03/18/2021] [Indexed: 12/29/2022] Open
Abstract
Gram-negative bacteria harness multiple protein secretion systems and secrete a large proportion of the proteome. Proteins can be exported to the periplasmic space, integrated into the membrane, transported into the extracellular milieu, or translocated into the cytoplasm of contacting cells. Accurate, genome-wide annotation of the secreted proteins and their secretion pathways is important. In this review, we systematically classified the secreted proteins according to the types of secretion systems in Gram-negative bacteria, summarized the known features of these proteins, and reviewed the algorithms and tools for their prediction.
|
26
|
Batra S, Pancholi P, Roy M, Kaushik S, Jyoti A, Verma K, Srivastava VK. Exploring insights of syntaxin superfamily proteins from Entamoeba histolytica: a prospective simulation, protein-protein interaction, and docking study. J Mol Recognit 2021; 34:e2886. [DOI: 10.1002/jmr.2886] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/01/2020] [Revised: 12/02/2020] [Accepted: 12/17/2020] [Indexed: 12/25/2022]
Affiliation(s)
- Sagar Batra
  - Amity Institute of Biotechnology, Amity University Rajasthan, Jaipur, India
- Puranjaya Pancholi
  - Amity Institute of Biotechnology, Amity University Rajasthan, Jaipur, India
- Mrinalini Roy
  - Amity Institute of Biotechnology, Amity University Rajasthan, Jaipur, India
- Sanket Kaushik
  - Amity Institute of Biotechnology, Amity University Rajasthan, Jaipur, India
- Anupam Jyoti
  - Amity Institute of Biotechnology, Amity University Rajasthan, Jaipur, India
- Kuldeep Verma
  - Institute of Science, Nirma University, Ahmedabad, Gujarat, India
|
27
|
Waas M, Littrell J, Gundry RL. CIRFESS: An Interactive Resource for Querying the Set of Theoretically Detectable Peptides for Cell Surface and Extracellular Enrichment Proteomic Studies. J Am Soc Mass Spectrom 2020; 31:1389-1397. [PMID: 32212654 PMCID: PMC8116119 DOI: 10.1021/jasms.0c00021] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 05/11/2023]
Abstract
Cell surface transmembrane, extracellular, and secreted proteins are high-value targets for immunophenotyping, drug development, and studies related to intercellular communication in health and disease. As the number of specific and validated affinity reagents that target this subproteome is limited, mass spectrometry (MS)-based approaches will continue to play a critical role in enabling discovery and quantitation of these molecules. Given the technical considerations that make MS-based cell surface proteome studies uniquely challenging, it can be difficult to select an appropriate experimental approach. To this end, we have integrated multiple prediction strategies and annotations into a single online resource, Compiled Interactive Resource for Extracellular and Surface Studies (CIRFESS). CIRFESS enables rapid interrogation of the human proteome to reveal the cell surface proteome theoretically detectable by current approaches and highlights where current prediction strategies provide concordant and discordant information. We applied CIRFESS to identify the percentage of various subsets of the proteome that are expected to be captured by targeted enrichment strategies, including two established methods and one that is possible but not yet demonstrated. These results will inform the selection of available proteomic strategies and development of new strategies to enhance coverage of the cell surface and extracellular proteome. CIRFESS is available at www.cellsurfer.net/cirfess.
Affiliation(s)
- Matthew Waas
  - CardiOmics Program, Center for Heart and Vascular Research, Division of Cardiovascular Medicine, and Department of Cellular and Integrative Physiology, University of Nebraska Medical Center, Omaha, Nebraska 68198, United States
- Jack Littrell
  - CardiOmics Program, Center for Heart and Vascular Research, Division of Cardiovascular Medicine, and Department of Cellular and Integrative Physiology, University of Nebraska Medical Center, Omaha, Nebraska 68198, United States
- Rebekah L Gundry
  - CardiOmics Program, Center for Heart and Vascular Research, Division of Cardiovascular Medicine, and Department of Cellular and Integrative Physiology, University of Nebraska Medical Center, Omaha, Nebraska 68198, United States
|
28
|
von der Heyde EL, Hallmann A. Babo1, formerly Vop1 and Cop1/2, is no eyespot photoreceptor but a basal body protein illuminating cell division in Volvox carteri. Plant J 2020; 102:276-298. [PMID: 31778231 DOI: 10.1111/tpj.14623] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/22/2019] [Revised: 10/29/2019] [Accepted: 11/19/2019] [Indexed: 06/10/2023]
Abstract
In photosynthetic organisms many processes are light dependent and sensing of light requires light-sensitive proteins. The supposed eyespot photoreceptor protein Babo1 (formerly Vop1) has previously been classified as an opsin due to the capacity for binding retinal. Here, we analyze Babo1 and provide evidence that it is no opsin. Due to the localization at the basal bodies, the former Vop1 and Cop1/2 proteins were renamed V.c. Babo1 and C.r. Babo1. We reveal a large family of more than 60 Babo1-related proteins from a wide range of species. The detailed subcellular localization of fluorescence-tagged Babo1 shows that it accumulates at the basal apparatus. More precisely, it is located predominantly at the basal bodies and to a lesser extent at the four strands of rootlet microtubules. We trace Babo1 during basal body separation and cell division. Dynamic structural rearrangements of Babo1 particularly occur right before the first cell division. In four-celled embryos Babo1 was exclusively found at the oldest basal bodies of the embryo and on the corresponding d-roots. The unequal distribution of Babo1 in four-celled embryos could be an integral part of a geometrical system in early embryogenesis, which establishes the anterior-posterior polarity and influences the spatial arrangement of all embryonic structures and characteristics. Due to its retinal-binding capacity, Babo1 could also be responsible for the unequal distribution of retinoids, knowing that such concentration gradients of retinoids can be essential for the correct patterning during embryogenesis of more complex organisms. Thus, our findings push the Babo1 research in another direction.
Affiliation(s)
- Eva L von der Heyde
  - Department of Cellular and Developmental Biology of Plants, University of Bielefeld, Universitätsstr 25, 33615, Bielefeld, Germany
- Armin Hallmann
  - Department of Cellular and Developmental Biology of Plants, University of Bielefeld, Universitätsstr 25, 33615, Bielefeld, Germany
|
29
|
Feng SH, Zhang WX, Yang J, Yang Y, Shen HB. Topology Prediction Improvement of α-helical Transmembrane Proteins Through Helix-tail Modeling and Multiscale Deep Learning Fusion. J Mol Biol 2020; 432:1279-1296. [DOI: 10.1016/j.jmb.2019.12.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/16/2019] [Revised: 12/02/2019] [Accepted: 12/04/2019] [Indexed: 12/18/2022]
|
30
|
Kulandaisamy A, Priya SB, Sakthivel R, Frishman D, Gromiha MM. Statistical analysis of disease-causing and neutral mutations in human membrane proteins. Proteins 2019; 87:452-466. [DOI: 10.1002/prot.25667] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/20/2018] [Revised: 01/16/2019] [Accepted: 01/31/2019] [Indexed: 11/11/2022]
Affiliation(s)
- A. Kulandaisamy
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- S. Binny Priya
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- R. Sakthivel
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- Dmitrij Frishman
  - Department of Bioinformatics, Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russian Federation
  - Department of Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
- M. Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
  - Advanced Computational Drug Discovery Unit (ACDD), Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Kanagawa, Japan
|
31
|
Ibrahim MAA, Hassan AMA. Comparative Modeling and Evaluation of Leukotriene B4 Receptors for Selective Drug Discovery Towards the Treatment of Inflammatory Diseases. Protein J 2019; 37:518-530. [PMID: 30267300 DOI: 10.1007/s10930-018-9797-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 10/28/2022]
Abstract
Leukotriene B4 (LTB4) exerts its biological effects through stimulation of specific G protein-coupled receptors (GPCRs), namely BLT1 and BLT2. Due to the absence of human BLT1 and BLT2 crystal structures, the current study was set to predict the 3D structures of these two receptors for structure-based anti-inflammatory drug discovery. A homology model of the BLT1 receptor was first constructed, based on various X-ray and NMR GPCR templates, followed by molecular dynamics (MD) refinement. Using a single-template approach, nine well-established alignment methods and ten secondary structure prediction methods were implemented and assessed during backbone generation. The binding sites of the BLT1 receptor were then mapped using fifteen chemical probes with the help of FTMAP and AutoDock Vina 4.2 software. Model validation was performed through the docking of eight specific antagonists that have experimental inhibition constants (Ki) towards BLT1. The antagonist-BLT1 docked structures were then subjected to AMBER-based molecular mechanical minimization, and the corresponding binding energies were calculated using the molecular mechanics-generalized Born surface area (MM/GBSA) approach. According to the results, the most energetically stable models were constructed using the SAlign method for the alignment process and PSIPRED for secondary structure prediction. In comparison, the refined BLT1 model built on 2KS9 as an NMR template had a lower DOPE energy than those built on 4EA3 and 4XT1 as X-ray templates. According to the mapping results, two main binding sites were identified: one among TMs II, III and VII and the other among TMs III, IV and V. For the antagonists, the binding energies were in good agreement with the experimental data, with a correlation coefficient (R² value) of 0.91. Due to the great amino acid sequence similarity between the BLT1 and BLT2 receptors (calculated as 45.2%), the BLT2 model was constructed based on the predicted BLT1 model.
Affiliation(s)
- Mahmoud A A Ibrahim
  - Computational Chemistry Laboratory, Chemistry Department, Faculty of Science, Minia University, Minia, 61519, Egypt
- Alaa M A Hassan
  - Computational Chemistry Laboratory, Chemistry Department, Faculty of Science, Minia University, Minia, 61519, Egypt
|
32
|
Merilahti JAM, Elenius K. Gamma-secretase-dependent signaling of receptor tyrosine kinases. Oncogene 2018; 38:151-163. [PMID: 30166589 PMCID: PMC6756091 DOI: 10.1038/s41388-018-0465-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 06/08/2018] [Revised: 07/26/2018] [Accepted: 07/27/2018] [Indexed: 12/28/2022]
Abstract
The human genome harbors 55 receptor tyrosine kinases (RTKs). At least half of the RTKs have been reported to be cleaved by gamma-secretase-mediated regulated intramembrane proteolysis. The two-step process involves releasing the RTK ectodomain to the extracellular space by proteolytic cleavage called shedding, followed by cleavage in the RTK transmembrane domain by the gamma-secretase complex, resulting in release of a soluble RTK intracellular domain (ICD). This intracellular domain, including the tyrosine kinase domain, can in turn translocate to various cellular compartments, such as the nucleus or proteasome. The soluble intracellular domain may interact with transcriptional regulators and other proteins to induce specific effects on cell survival, proliferation, and differentiation, establishing an additional signaling mode for the cleavable RTKs. On the other hand, the same process can facilitate RTK turnover and proteasomal degradation. In this review we focus on the regulation of RTK shedding and gamma-secretase cleavage, as well as signaling promoted by the soluble RTK ICDs. In addition, the therapeutic implications of increased knowledge of RTK cleavage for cancer drug development are discussed.
Affiliation(s)
- Johannes A M Merilahti
  - Institute of Biomedicine, University of Turku, 20520, Turku, Finland
  - Medicity Research Laboratory, University of Turku, 20520, Turku, Finland
  - Turku Doctoral Programme of Molecular Medicine, University of Turku, 20520, Turku, Finland
- Klaus Elenius
  - Institute of Biomedicine, University of Turku, 20520, Turku, Finland
  - Medicity Research Laboratory, University of Turku, 20520, Turku, Finland
  - Department of Oncology, Turku University Hospital, 20520, Turku, Finland
|
33
|
Hücker SM, Vanderhaeghen S, Abellan-Schneyder I, Scherer S, Neuhaus K. The Novel Anaerobiosis-Responsive Overlapping Gene ano Is Overlapping Antisense to the Annotated Gene ECs2385 of Escherichia coli O157:H7 Sakai. Front Microbiol 2018; 9:931. [PMID: 29867840 PMCID: PMC5960689 DOI: 10.3389/fmicb.2018.00931] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 02/01/2018] [Accepted: 04/23/2018] [Indexed: 12/26/2022] Open
Abstract
Current notion presumes that only one protein is encoded at a given bacterial genetic locus. However, transcription and translation of an overlapping open reading frame (ORF) of 186 bp length were discovered by RNAseq and RIBOseq experiments. This ORF is almost completely embedded in the annotated L,D-transpeptidase gene ECs2385 of Escherichia coli O157:H7 Sakai in the antisense reading frame -3. The ORF is transcribed as part of a bicistronic mRNA, which includes the annotated upstream gene ECs2384, encoding a murein lipoprotein. The transcriptional start site of the operon resides 38 bp upstream of the ECs2384 start codon and is driven by a predicted σ70 promoter, which is constitutively active under different growth conditions. The bicistronic operon contains a ρ-independent terminator just upstream of the novel gene, significantly decreasing its transcription. The novel gene can be stably expressed as an EGFP-fusion protein and a translationally arrested mutant of ano, unable to produce the protein, shows a growth advantage in competitive growth experiments compared to the wild type under anaerobiosis. Therefore, the novel antisense overlapping gene is named ano (anaerobiosis responsive overlapping gene). A phylostratigraphic analysis indicates that ano originated very recently de novo by overprinting after the Escherichia/Shigella clade separated from other enterobacteria. Therefore, ano is one of the very rare cases of overlapping genes known in the genus Escherichia.
Affiliation(s)
- Sarah M Hücker
  - Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Sonja Vanderhaeghen
  - Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Siegfried Scherer
  - Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
  - Institute for Food & Health, Technical University of Munich, Freising, Germany
- Klaus Neuhaus
  - Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
  - Core Facility Microbiome/NGS, Institute for Food & Health, Technical University of Munich, Freising, Germany
|
34
|
Hücker SM, Vanderhaeghen S, Abellan-Schneyder I, Wecko R, Simon S, Scherer S, Neuhaus K. A novel short L-arginine responsive protein-coding gene (laoB) antiparallel overlapping to a CadC-like transcriptional regulator in Escherichia coli O157:H7 Sakai originated by overprinting. BMC Evol Biol 2018; 18:21. [PMID: 29433444 PMCID: PMC5810103 DOI: 10.1186/s12862-018-1134-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 06/22/2017] [Accepted: 01/31/2018] [Indexed: 11/10/2022] Open
Abstract
Background: Due to the DNA triplet code, it is possible that the sequences of two or more protein-coding genes overlap to a large degree. However, such non-trivial overlaps are usually excluded by genome annotation pipelines and, thus, only a few overlapping gene pairs have been described in bacteria. In contrast, transcriptome and translatome sequencing reveals many signals originating from the antisense strand of annotated genes, of which we analyzed an example gene pair in more detail. Results: A small open reading frame of Escherichia coli O157:H7 strain Sakai (EHEC), designated laoB (L-arginine responsive overlapping gene), is embedded in reading frame −2 in the antisense strand of ECs5115, encoding a CadC-like transcriptional regulator. This overlapping gene shows evidence of transcription and translation in Luria-Bertani (LB) and brain-heart infusion (BHI) medium based on RNA sequencing (RNAseq) and ribosomal-footprint sequencing (RIBOseq). The transcriptional start site is 289 base pairs (bp) upstream of the start codon and transcription termination is 155 bp downstream of the stop codon. Overexpression of LaoB fused to an enhanced green fluorescent protein (EGFP) reporter was possible. The sequence upstream of the transcriptional start site displayed strong promoter activity under different conditions, whereas promoter activity was significantly decreased in the presence of L-arginine. A strand-specific translationally arrested mutant of laoB provided a significant growth advantage in competitive growth experiments in the presence of L-arginine compared to the wild type, which returned to wild type level after complementation of laoB in trans. A phylostratigraphic analysis indicated that the novel gene is restricted to the Escherichia/Shigella clade and might have originated recently by overprinting leading to the expression of part of the antisense strand of ECs5115.
Conclusions: Here, we present evidence of a novel small protein-coding gene laoB encoded in the antisense frame −2 of the annotated gene ECs5115. Clearly, laoB is evolutionarily young and it originated in the Escherichia/Shigella clade by overprinting, a process which may cause the de novo evolution of bacterial genes like laoB. Electronic supplementary material: The online version of this article (10.1186/s12862-018-1134-0) contains supplementary material, which is available to authorized users.
Collapse
Affiliation(s)
- Sarah M Hücker
  - Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
  - Fraunhofer ITEM-R, Am Biopark 9, 93053, Regensburg, Germany
- Sonja Vanderhaeghen
  - Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Isabel Abellan-Schneyder
  - Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
  - Core Facility Microbiome/NGS, ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Romy Wecko
  - Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Svenja Simon
  - Department of Computer and Information Science, University of Konstanz, Box 78, 78457, Konstanz, Germany
- Siegfried Scherer
  - Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
  - ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Klaus Neuhaus
  - Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
  - Core Facility Microbiome/NGS, ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
|
35
|
Membrane proteins structures: A review on computational modeling tools. Biochimica et Biophysica Acta - Biomembranes 2017; 1859:2021-2039. [DOI: 10.1016/j.bbamem.2017.07.008] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Received: 03/23/2017] [Revised: 07/04/2017] [Accepted: 07/13/2017] [Indexed: 01/02/2023]
|
36
|
Hücker SM, Ardern Z, Goldberg T, Schafferhans A, Bernhofer M, Vestergaard G, Nelson CW, Schloter M, Rost B, Scherer S, Neuhaus K. Discovery of numerous novel small genes in the intergenic regions of the Escherichia coli O157:H7 Sakai genome. PLoS One 2017; 12:e0184119. [PMID: 28902868 PMCID: PMC5597208 DOI: 10.1371/journal.pone.0184119] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 05/16/2017] [Accepted: 08/20/2017] [Indexed: 12/29/2022]
Abstract
In the past, short protein-coding genes were often disregarded by genome annotation pipelines. Transcriptome sequencing (RNAseq) signals outside of annotated genes have usually been interpreted to indicate either ncRNA or pervasive transcription. Therefore, in addition to the transcriptome, the translatome (RIBOseq) of the enteric pathogen Escherichia coli O157:H7 strain Sakai was determined at two optimal growth conditions and a severe stress condition combining low temperature and high osmotic pressure. All intergenic open reading frames potentially encoding a protein of ≥ 30 amino acids were investigated with regard to coverage by transcription and translation signals and their translatability expressed by the ribosomal coverage value. This led to discovery of 465 unique, putative novel genes not yet annotated in this E. coli strain, which are evenly distributed over both DNA strands of the genome. For 255 of the novel genes, annotated homologs in other bacteria were found, and a machine-learning algorithm, trained on small protein-coding E. coli genes, predicted that 89% of these translated open reading frames represent bona fide genes. The remaining 210 putative novel genes without annotated homologs were compared to the 255 novel genes with homologs and to 250 short annotated genes of this E. coli strain. All three groups turned out to be similar with respect to their translatability distribution, fractions of differentially regulated genes, secondary structure composition, and the distribution of evolutionary constraint, suggesting that both novel groups represent legitimate genes. However, the machine-learning algorithm only recognized a small fraction of the 210 genes without annotated homologs. It is possible that these genes represent a novel group of genes, which have unusual features dissimilar to the genes of the machine-learning algorithm training set.
Affiliation(s)
- Sarah M. Hücker
  - Chair for Microbial Ecology, Technische Universität München, Freising, Germany
  - ZIEL - Institute for Food & Health, Technische Universität München, Freising, Germany
- Zachary Ardern
  - Chair for Microbial Ecology, Technische Universität München, Freising, Germany
  - ZIEL - Institute for Food & Health, Technische Universität München, Freising, Germany
- Tatyana Goldberg
  - Department of Informatics—Bioinformatics & TUM-IAS, Technische Universität München, Garching, Germany
- Andrea Schafferhans
  - Department of Informatics—Bioinformatics & TUM-IAS, Technische Universität München, Garching, Germany
- Michael Bernhofer
  - Department of Informatics—Bioinformatics & TUM-IAS, Technische Universität München, Garching, Germany
- Gisle Vestergaard
  - Research Unit Environmental Genomics, Helmholtz Zentrum München, Neuherberg, Germany
- Chase W. Nelson
  - Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, New York, United States of America
- Michael Schloter
  - Research Unit Environmental Genomics, Helmholtz Zentrum München, Neuherberg, Germany
- Burkhard Rost
  - Department of Informatics—Bioinformatics & TUM-IAS, Technische Universität München, Garching, Germany
- Siegfried Scherer
  - Chair for Microbial Ecology, Technische Universität München, Freising, Germany
  - ZIEL - Institute for Food & Health, Technische Universität München, Freising, Germany
- Klaus Neuhaus
  - Chair for Microbial Ecology, Technische Universität München, Freising, Germany
  - Core Facility Microbiome/NGS, ZIEL - Institute for Food & Health, Technische Universität München, Freising, Germany
|