1
Wang X, Hu R, Zhang Y, Tian L, Liu S, Huang Z, Wang L, Lu Y, Wang L, Wang Y, Wu Y, Cong Y, Yang G. Mechanistic analysis of thermal stability in a novel thermophilic polygalacturonase MlPG28B derived from the marine fungus Mucor lusitanicus. Int J Biol Macromol 2024; 280:136007. [PMID: 39326595 DOI: 10.1016/j.ijbiomac.2024.136007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2024] [Revised: 09/23/2024] [Accepted: 09/23/2024] [Indexed: 09/28/2024]
Abstract
In this study, heterologous MlPG28B expression was obtained by cloning the Mucor lusitanicus gene screened from a marine environment. The enzyme activity of MlPG28B was maximum at 60 °C, 30 % of the enzyme activity was retained after incubation at 100 °C for 30 min, and enzyme activity was still present after 60 min incubation, one of the best thermostable polygalacturonases characterized until now. The high-purity oligosaccharide standards (DP2-DP7) were prepared with polygalacturonic acid as a substrate. Kinetic parameters showed that MlPG28B at the optimum temperature has a low Km value (3055 ± 1104 mg/L), indicating high substrate affinity. Sequence alignment analysis inferred key residues Cys276, Cys284, Lys107, and Gln237 for MlPG28B thermal stability. Molecular docking and molecular dynamics simulation results indicated that MlPG28B has flexible T1 and T3 loops conducive to substrate recognition, binding, and catalysis and forms a hydrogen bond to the substrate by a highly conserved residue Asn161 in the active-site cleft. Based on site-directed mutation results, the five residues are key in determining MlPG28B thermal stability. Therefore, MlPG28B is a promising candidate for industrial enzymes in feed preparation.
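To make the reported Km easier to interpret, here is a minimal sketch that assumes simple Michaelis-Menten kinetics (the abstract does not state which kinetic model was fitted). The function name and the relative Vmax of 1.0 are hypothetical; only the Km value is taken from the abstract.

```python
def michaelis_menten_rate(substrate_mg_per_l, vmax, km_mg_per_l):
    """Initial reaction rate v = Vmax * [S] / (Km + [S])."""
    if substrate_mg_per_l < 0 or km_mg_per_l <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return vmax * substrate_mg_per_l / (km_mg_per_l + substrate_mg_per_l)


KM_MG_PER_L = 3055.0  # mean Km reported in the abstract (+/- 1104 mg/L)
VMAX_RELATIVE = 1.0   # hypothetical: rates are expressed as a fraction of Vmax

# Print the relative rate at zero substrate, at [S] = Km, and at 10 x Km.
for s in (0.0, KM_MG_PER_L, 10 * KM_MG_PER_L):
    v = michaelis_menten_rate(s, VMAX_RELATIVE, KM_MG_PER_L)
    print(f"[S] = {s:8.0f} mg/L -> v/Vmax = {v:.3f}")
```

Under this model the rate is half of Vmax when the substrate concentration equals Km, so a lower Km means the enzyme approaches saturation at lower substrate concentrations.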
Affiliation(s)
- Xin Wang
- College of Fisheries and Life Science, National Demonstration Center for Experimental Aquaculture Education (Dalian Ocean University), Ministry of Education, Dalian 116023, China; Dalian Key Laboratory of Breeding, Reproduction and Aquaculture of Crustaceans, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian 116023, China
- Ruitong Hu
- College of Fisheries and Life Science, National Demonstration Center for Experimental Aquaculture Education (Dalian Ocean University), Ministry of Education, Dalian 116023, China; Dalian Key Laboratory of Breeding, Reproduction and Aquaculture of Crustaceans, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian 116023, China
- Yu Zhang
- College of Fisheries and Life Science, National Demonstration Center for Experimental Aquaculture Education (Dalian Ocean University), Ministry of Education, Dalian 116023, China; Dalian Key Laboratory of Breeding, Reproduction and Aquaculture of Crustaceans, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian 116023, China
- Linfang Tian
- College of Fisheries and Life Science, National Demonstration Center for Experimental Aquaculture Education (Dalian Ocean University), Ministry of Education, Dalian 116023, China; Dalian Key Laboratory of Breeding, Reproduction and Aquaculture of Crustaceans, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian 116023, China
- Siyi Liu
- College of Fisheries and Life Science, National Demonstration Center for Experimental Aquaculture Education (Dalian Ocean University), Ministry of Education, Dalian 116023, China; Dalian Key Laboratory of Breeding, Reproduction and Aquaculture of Crustaceans, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian 116023, China
- Zhe Huang
- College of Fisheries and Life Science, National Demonstration Center for Experimental Aquaculture Education (Dalian Ocean University), Ministry of Education, Dalian 116023, China; Dalian Key Laboratory of Breeding, Reproduction and Aquaculture of Crustaceans, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian 116023, China
- Lianshun Wang
- College of Fisheries and Life Science, National Demonstration Center for Experimental Aquaculture Education (Dalian Ocean University), Ministry of Education, Dalian 116023, China; Dalian Key Laboratory of Breeding, Reproduction and Aquaculture of Crustaceans, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian 116023, China
- Yanan Lu
- College of Fisheries and Life Science, National Demonstration Center for Experimental Aquaculture Education (Dalian Ocean University), Ministry of Education, Dalian 116023, China; Dalian Key Laboratory of Breeding, Reproduction and Aquaculture of Crustaceans, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian 116023, China
- Li Wang
- College of Fisheries and Life Science, National Demonstration Center for Experimental Aquaculture Education (Dalian Ocean University), Ministry of Education, Dalian 116023, China; Dalian Key Laboratory of Breeding, Reproduction and Aquaculture of Crustaceans, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian 116023, China
- Yuan Wang
- College of Fisheries and Life Science, National Demonstration Center for Experimental Aquaculture Education (Dalian Ocean University), Ministry of Education, Dalian 116023, China; Dalian Key Laboratory of Breeding, Reproduction and Aquaculture of Crustaceans, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian 116023, China
- Yuntian Wu
- Agricultural Service Center, Huanren Manchu Autonomous County, Benxi 117200, China.
- Yuting Cong
- College of Fisheries and Life Science, National Demonstration Center for Experimental Aquaculture Education (Dalian Ocean University), Ministry of Education, Dalian 116023, China; Dalian Key Laboratory of Breeding, Reproduction and Aquaculture of Crustaceans, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian 116023, China.
- Guojun Yang
- College of Fisheries and Life Science, National Demonstration Center for Experimental Aquaculture Education (Dalian Ocean University), Ministry of Education, Dalian 116023, China; Dalian Key Laboratory of Breeding, Reproduction and Aquaculture of Crustaceans, Dalian 116023, China; Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian 116023, China.
2
Aggarwal T, Kondabagil K. Proteome-scale structural prediction of the giant Marseillevirus reveals conserved folds and putative homologs of the hypothetical proteins. Arch Virol 2024; 169:222. [PMID: 39414627 DOI: 10.1007/s00705-024-06155-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Accepted: 09/02/2024] [Indexed: 10/18/2024]
Abstract
A significant proportion of the highly divergent and novel proteins of giant viruses are termed "hypothetical" due to the absence of detectable homologous sequences in the existing databases. The quality of genome and proteome annotations often relies on the identification of signature sequences and motifs in order to assign putative functions to the gene products. These annotations serve as the first set of information for researchers to develop workable hypotheses for further experimental research. The structure-function relationship of proteins suggests that proteins with similar functions may also exhibit similar folding patterns. Here, we report the first proteome-wide structure prediction of the giant Marseillevirus. We use AlphaFold-predicted structures and their comparative analysis with the experimental structures in the PDB database to preliminarily annotate the viral proteins. Our work highlights the conservation of structural folds in proteins with highly divergent sequences and reveals potentially paralogous relationships among them. We also provide evidence for gene duplication and fusion as contributing factors to giant viral genome expansion and evolution. With the easily accessible AlphaFold and other advanced bioinformatics tools for high-confidence de novo structure prediction, we propose a combined sequence and predicted-structure-based proteome annotation approach for the initial characterization of novel and complex organisms or viruses.
Affiliation(s)
- Tanvi Aggarwal
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai, India
- Kiran Kondabagil
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai, India.
3
Torres MDT, Brooks EF, Cesaro A, Sberro H, Gill MO, Nicolaou C, Bhatt AS, de la Fuente-Nunez C. Mining human microbiomes reveals an untapped source of peptide antibiotics. Cell 2024; 187:5453-5467.e15. [PMID: 39163860 DOI: 10.1016/j.cell.2024.07.027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 05/09/2024] [Accepted: 07/17/2024] [Indexed: 08/22/2024]
Abstract
Drug-resistant bacteria are outpacing traditional antibiotic discovery efforts. Here, we computationally screened 444,054 previously reported putative small protein families from 1,773 human metagenomes for antimicrobial properties, identifying 323 candidates encoded in small open reading frames (smORFs). To test our computational predictions, 78 peptides were synthesized and screened for antimicrobial activity in vitro, with 70.5% displaying antimicrobial activity. As these compounds were different compared with previously reported antimicrobial peptides, we termed them smORF-encoded peptides (SEPs). SEPs killed bacteria by targeting their membrane, synergizing with each other, and modulating gut commensals, indicating a potential role in reconfiguring microbiome communities in addition to counteracting pathogens. The lead candidates were anti-infective in both murine skin abscess and deep thigh infection models. Notably, prevotellin-2 from Prevotella copri presented activity comparable to the commonly used antibiotic polymyxin B. Our report supports the existence of hundreds of antimicrobials in the human microbiome amenable to clinical translation.
Affiliation(s)
- Marcelo D T Torres
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104, USA; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
- Erin F Brooks
- Department of Medicine (Hematology; Blood and Marrow Transplantation), Stanford University, Stanford, CA 94305, USA
- Angela Cesaro
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104, USA; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
- Hila Sberro
- Department of Medicine (Hematology; Blood and Marrow Transplantation), Stanford University, Stanford, CA 94305, USA
- Matthew O Gill
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Cosmos Nicolaou
- Department of Medicine (Hematology; Blood and Marrow Transplantation), Stanford University, Stanford, CA 94305, USA
- Ami S Bhatt
- Department of Medicine (Hematology; Blood and Marrow Transplantation), Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA.
- Cesar de la Fuente-Nunez
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104, USA; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA.
4
Scat S, Weissman KJ, Chagot B. Insights into docking in megasynthases from the investigation of the toblerol trans-AT polyketide synthase: many α-helical means to an end. RSC Chem Biol 2024; 5:669-683. [PMID: 38966669 PMCID: PMC11221535 DOI: 10.1039/d4cb00075g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 05/16/2024] [Indexed: 07/06/2024] Open
Abstract
The fidelity of biosynthesis by modular polyketide synthases (PKSs) depends on specific moderate affinity interactions between successive polypeptide subunits mediated by docking domains (DDs). These sequence elements are notably portable, allowing their transplantation into alternative biosynthetic and metabolic contexts. Herein, we use integrative structural biology to characterize a pair of DDs from the toblerol trans-AT PKS. Both are intrinsically disordered regions (IDRs) that fold into a 3 α-helix docking complex of unprecedented topology. The C-terminal docking domain (CDD) resembles the 4 α-helix type (4HB) CDDs, which shows that the same type of DD can be redeployed to form complexes of distinct geometry. By carefully re-examining known DD structures, we further extend this observation to type 2 docking domains, establishing previously unsuspected structural relations between DD types. Taken together, these data illustrate the plasticity of α-helical DDs, which allow the formation of a diverse topological spectrum of docked complexes. The newly identified DDs should also find utility in modular PKS genetic engineering.
Affiliation(s)
- Serge Scat
- Université de Lorraine, CNRS, IMoPA F-54000 Nancy France
5
Hamamsy T, Morton JT, Blackwell R, Berenberg D, Carriero N, Gligorijevic V, Strauss CEM, Leman JK, Cho K, Bonneau R. Protein remote homology detection and structural alignment using deep learning. Nat Biotechnol 2024; 42:975-985. [PMID: 37679542 PMCID: PMC11180608 DOI: 10.1038/s41587-023-01917-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 09/07/2022] [Accepted: 07/26/2023] [Indexed: 09/09/2023]
Abstract
Exploiting sequence-structure-function relationships in biotechnology requires improved methods for aligning proteins that have low sequence similarity to previously annotated proteins. We develop two deep learning methods to address this gap, TM-Vec and DeepBLAST. TM-Vec allows searching for structure-structure similarities in large sequence databases. It is trained to accurately predict TM-scores as a metric of structural similarity directly from sequence pairs without the need for intermediate computation or solution of structures. Once structurally similar proteins have been identified, DeepBLAST can structurally align proteins using only sequence information by identifying structurally homologous regions between proteins. It outperforms traditional sequence alignment methods and performs similarly to structure-based alignment methods. We show the merits of TM-Vec and DeepBLAST on a variety of datasets, including better identification of remotely homologous proteins compared with state-of-the-art sequence alignment and structure prediction methods.
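For readers unfamiliar with the TM-score that TM-Vec is trained to predict, the sketch below computes the standard TM-score formula from a list of distances between aligned residue pairs. It is not TM-Vec itself (which predicts the score from sequence alone) and it skips the search over superpositions that a structural aligner performs. The function names and the 0.5 Å floor on d0 for short chains are assumptions made for illustration.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    if target_length <= 15:
        return 0.5  # assumed floor for very short chains
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(aligned_distances, target_length):
    """TM-score for one fixed superposition.

    aligned_distances: distances (angstroms) between aligned residue pairs.
    target_length: length of the target protein used for normalisation.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = tm_d0(target_length)
    # Each aligned pair contributes 1 / (1 + (d / d0)^2); unaligned residues
    # contribute nothing, so partial alignments score below 1.
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in aligned_distances)
    return total / target_length


# Hypothetical example: 80 of 100 target residues aligned at 2 angstroms each.
print(round(tm_score([2.0] * 80, 100), 3))
```

A score of 1 corresponds to a perfect superposition of every target residue, and scores fall toward 0 as aligned pairs move apart or fewer residues are aligned.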
Grants
- R35GM122515 National Science Foundation (NSF)
- IOS-1546218 National Science Foundation (NSF)
- R35 GM122515 NIGMS NIH HHS
- R01 DK103358 NIDDK NIH HHS
- CBET- 1728858 National Science Foundation (NSF)
- R01 AI130945 NIAID NIH HHS
- This research was supported by NIH R01DK103358, the Simons Foundation, NSF- IOS-1546218, R35GM122515, NSF CBET- 1728858, NIH R01AI130945, to T.H. This research was supported by the intramural research program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) to J.T.M. This research was supported by the Flatiron Institute as part of the Simons Foundation to Robert Blackwell, J.K.L., and N.C. This research was supported by Los Alamos National Lab to C.S. This research was supported by the Samsung Advanced Institute of Technology (Next Generation Deep Learning: from pattern recognition to AI), Samsung Research (Improving Deep Learning using Latent Structure), and NSF Award 1922658 to K.C.
- Simons Foundation
- U.S. Department of Health & Human Services | NIH | Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD)
Affiliation(s)
- Tymor Hamamsy
- Center for Data Science, New York University, New York, NY, USA
- James T Morton
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
- Biostatistics and Bioinformatics Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
- Robert Blackwell
- Scientific Computing Core, Flatiron Institute, Simons Foundation, New York, NY, USA
- Daniel Berenberg
- Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
- Prescient Design, New York, NY, USA
- Nicholas Carriero
- Scientific Computing Core, Flatiron Institute, Simons Foundation, New York, NY, USA
- Julia Koehler Leman
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
- Kyunghyun Cho
- Center for Data Science, New York University, New York, NY, USA.
- Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA.
- Prescient Design, New York, NY, USA.
- CIFAR, Toronto, Ontario, Canada.
- Richard Bonneau
- Center for Data Science, New York University, New York, NY, USA.
- Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA.
- Prescient Design, New York, NY, USA.
- Department of Biology, New York University, New York, NY, USA.
6
Middendorf L, Eicholt LA. Random, de novo, and conserved proteins: How structure and disorder predictors perform differently. Proteins 2024; 92:757-767. [PMID: 38226524 DOI: 10.1002/prot.26652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 10/18/2023] [Accepted: 12/01/2023] [Indexed: 01/17/2024]
Abstract
Understanding the emergence and structural characteristics of de novo and random proteins is crucial for unraveling protein evolution and designing novel enzymes. However, experimental determination of their structures remains challenging. Recent advancements in protein structure prediction, particularly with AlphaFold2 (AF2), have expanded our knowledge of protein structures, but their applicability to de novo and random proteins is unclear. In this study, we investigate the structural predictions and confidence scores of AF2 and protein language model-based predictor ESMFold for de novo and conserved proteins from Drosophila and a dataset of comparable random proteins. We find that the structural predictions for de novo and random proteins differ significantly from conserved proteins. Interestingly, a positive correlation between disorder and confidence scores (pLDDT) is observed for de novo and random proteins, in contrast to the negative correlation observed for conserved proteins. Furthermore, the performance of structure predictors for de novo and random proteins is hampered by the lack of sequence identity. We also observe fluctuating median predicted disorder among different sequence length quartiles for random proteins, suggesting an influence of sequence length on disorder predictions. In conclusion, while structure predictors provide initial insights into the structural composition of de novo and random proteins, their accuracy and applicability to such proteins remain limited. Experimental determination of their structures is necessary for a comprehensive understanding. The positive correlation between disorder and pLDDT could imply a potential for conditional folding and transient binding interactions of de novo and random proteins.
Affiliation(s)
- Lasse Middendorf
- Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
- Lars A Eicholt
- Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
7
Versini R, Sritharan S, Aykac Fas B, Tubiana T, Aimeur SZ, Henri J, Erard M, Nüsse O, Andreani J, Baaden M, Fuchs P, Galochkina T, Chatzigoulas A, Cournia Z, Santuz H, Sacquin-Mora S, Taly A. A Perspective on the Prospective Use of AI in Protein Structure Prediction. J Chem Inf Model 2024; 64:26-41. [PMID: 38124369 DOI: 10.1021/acs.jcim.3c01361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
AlphaFold2 (AF2) and RoseTTaFold (RF) have revolutionized structural biology, serving as highly reliable and effective methods for predicting protein structures. This article explores their impact and limitations, focusing on their integration into experimental pipelines and their application in diverse protein classes, including membrane proteins, intrinsically disordered proteins (IDPs), and oligomers. In experimental pipelines, AF2 models help X-ray crystallography in resolving the phase problem, while complementarity with mass spectrometry and NMR data enhances structure determination and protein flexibility prediction. Predicting the structure of membrane proteins remains challenging for both AF2 and RF due to difficulties in capturing conformational ensembles and interactions with the membrane. Improvements in incorporating membrane-specific features and predicting the structural effect of mutations are crucial. For intrinsically disordered proteins, AF2's confidence score (pLDDT) serves as a competitive disorder predictor, but integrative approaches including molecular dynamics (MD) simulations or hydrophobic cluster analyses are advocated for accurate dynamics representation. AF2 and RF show promising results for oligomeric models, outperforming traditional docking methods, with AlphaFold-Multimer showing improved performance. However, some caveats remain in particular for membrane proteins. Real-life examples demonstrate AF2's predictive capabilities in unknown protein structures, but models should be evaluated for their agreement with experimental data. Furthermore, AF2 models can be used complementarily with MD simulations. In this Perspective, we propose a "wish list" for improving deep-learning-based protein folding prediction models, including using experimental data as constraints and modifying models with binding partners or post-translational modifications. Additionally, a meta-tool for ranking and suggesting composite models is suggested, driving future advancements in this rapidly evolving field.
Affiliation(s)
- Raphaelle Versini
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Sujith Sritharan
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Burcu Aykac Fas
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Thibault Tubiana
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Sana Zineb Aimeur
- Université Paris-Saclay, CNRS, Institut de Chimie Physique, 91405 Orsay, France
- Julien Henri
- Sorbonne Université, CNRS, Laboratoire de Biologie, Computationnelle et Quantitative UMR 7238, Institut de Biologie Paris-Seine, 4 Place Jussieu, F-75005 Paris, France
- Marie Erard
- Université Paris-Saclay, CNRS, Institut de Chimie Physique, 91405 Orsay, France
- Oliver Nüsse
- Université Paris-Saclay, CNRS, Institut de Chimie Physique, 91405 Orsay, France
- Jessica Andreani
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Marc Baaden
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Patrick Fuchs
- Sorbonne Université, École Normale Supérieure, PSL University, CNRS, Laboratoire des Biomolécules, LBM, 75005 Paris, France
- Université de Paris, UFR Sciences du Vivant, 75013 Paris, France
- Tatiana Galochkina
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75014 Paris, France
- Alexios Chatzigoulas
- Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
- Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, 15784 Athens, Greece
- Zoe Cournia
- Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
- Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, 15784 Athens, Greece
- Hubert Santuz
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Sophie Sacquin-Mora
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Antoine Taly
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
8
Varabyou A, Sommer MJ, Erdogdu B, Shinder I, Minkin I, Chao KH, Park S, Heinz J, Pockrandt C, Shumate A, Rincon N, Puiu D, Steinegger M, Salzberg SL, Pertea M. CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure. Genome Biol 2023; 24:249. [PMID: 37904256 PMCID: PMC10614308 DOI: 10.1186/s13059-023-03088-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/21/2022] [Accepted: 10/16/2023] [Indexed: 11/01/2023] Open
Abstract
CHESS 3 represents an improved human gene catalog based on nearly 10,000 RNA-seq experiments across 54 body sites. It significantly improves current genome annotation by integrating the latest reference data and algorithms, machine learning techniques for noise filtering, and new protein structure prediction methods. CHESS 3 contains 41,356 genes, including 19,839 protein-coding genes and 158,377 transcripts, with 14,863 protein-coding transcripts not in other catalogs. It includes all MANE transcripts and at least one transcript for most RefSeq and GENCODE genes. On the CHM13 human genome, the CHESS 3 catalog contains an additional 129 protein-coding genes. CHESS 3 is available at http://ccb.jhu.edu/chess .
Affiliation(s)
- Ales Varabyou
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA.
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA.
- Markus J Sommer
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA
- Beril Erdogdu
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA
- Ida Shinder
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Cross Disciplinary Graduate Program in Biomedical Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Ilia Minkin
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA
- Kuan-Hao Chao
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Sukhwan Park
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
- Jakob Heinz
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA
- Christopher Pockrandt
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA
- Alaina Shumate
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA
- Natalia Rincon
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA
- Daniela Puiu
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA
- Martin Steinegger
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
- Institute of Molecular Biology and Genetics, Seoul National University, Seoul, South Korea
- Steven L Salzberg
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA.
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA.
- Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA.
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA.
| | - Mihaela Pertea
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA.
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA.
- Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA.
| |
|
9
|
Varadi M, Tsenkov M, Velankar S. Challenges in bridging the gap between protein structure prediction and functional interpretation. Proteins 2023. [PMID: 37850517 DOI: 10.1002/prot.26614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 09/26/2023] [Accepted: 10/04/2023] [Indexed: 10/19/2023]
Abstract
The rapid evolution of protein structure prediction tools has significantly broadened access to protein structural data. Although predicted structure models have the potential to accelerate and impact fundamental and translational research significantly, it is essential to note that they are not validated and cannot be considered the ground truth. Thus, challenges persist, particularly in capturing protein dynamics, predicting multi-chain structures, interpreting protein function, and assessing model quality. Interdisciplinary collaborations are crucial to overcoming these obstacles. Databases like the AlphaFold Protein Structure Database, the ESM Metagenomic Atlas, and initiatives like the 3D-Beacons Network provide FAIR access to these data, enabling their interpretation and application across a broader scientific community. Whilst substantial advancements have been made in protein structure prediction, further progress is required to address the remaining challenges. Developing training materials, nurturing collaborations, and ensuring open data sharing will be paramount in this pursuit. The continued evolution of these tools and methodologies will deepen our understanding of protein function and accelerate disease pathogenesis and drug development discoveries.
Affiliation(s)
- Mihaly Varadi
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Maxim Tsenkov
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| |
|
10
|
Torres MDT, Brooks E, Cesaro A, Sberro H, Nicolaou C, Bhatt AS, de la Fuente-Nunez C. Human gut metagenomic mining reveals an untapped source of peptide antibiotics. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.08.31.555711. [PMID: 37693399 PMCID: PMC10491270 DOI: 10.1101/2023.08.31.555711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2023]
Abstract
Drug-resistant bacteria are outpacing traditional antibiotic discovery efforts. Here, we computationally mined 444,054 families of putative small proteins from 1,773 human gut metagenomes, identifying 323 peptide antibiotics encoded in small open reading frames (smORFs). To test our computational predictions, 78 peptides were synthesized and screened for antimicrobial activity in vitro, with 59% displaying activity against either pathogens or commensals. Since these peptides were unique compared to previously reported antimicrobial peptides, we termed them smORF-encoded peptides (SEPs). SEPs killed bacteria by targeting their membrane, synergized with each other, and modulated gut commensals, indicating that they may play a role in reconfiguring microbiome communities in addition to counteracting pathogens. The lead candidates were anti-infective in both murine skin abscess and deep thigh infection models. Notably, prevotellin-2 from Prevotella copri presented activity comparable to the commonly used antibiotic polymyxin B. We report the discovery of hundreds of peptide sequences in the human gut.
Affiliation(s)
- Marcelo D. T. Torres
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States of America
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States of America
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States of America
| | - Erin Brooks
- Department of Medicine (Hematology; Blood and Marrow Transplantation), Stanford University, Stanford, CA, United States of America
| | - Angela Cesaro
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States of America
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States of America
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States of America
| | - Hila Sberro
- Department of Medicine (Hematology; Blood and Marrow Transplantation), Stanford University, Stanford, CA, United States of America
| | - Cosmos Nicolaou
- Department of Medicine (Hematology; Blood and Marrow Transplantation), Stanford University, Stanford, CA, United States of America
| | - Ami S. Bhatt
- Department of Medicine (Hematology; Blood and Marrow Transplantation), Stanford University, Stanford, CA, United States of America
- Department of Genetics, Stanford University, Stanford, CA, United States of America
| | - Cesar de la Fuente-Nunez
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States of America
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States of America
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States of America
| |
|
11
|
Alotaibi BS, Ajmal A, Hakami MA, Mahmood A, Wadood A, Hu J. New drug target identification in Vibrio vulnificus by subtractive genome analysis and their inhibitors through molecular docking and molecular dynamics simulations. Heliyon 2023; 9:e17650. [PMID: 37449110 PMCID: PMC10336522 DOI: 10.1016/j.heliyon.2023.e17650] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 04/13/2023] [Revised: 05/29/2023] [Accepted: 06/24/2023] [Indexed: 07/18/2023] Open
Abstract
Vibrio vulnificus is a rod-shaped, Gram-negative bacterium that causes sepsis (with a greater than 50% mortality rate), necrotizing fasciitis, gastroenteritis, skin and soft tissue infection, wound infection, peritonitis, meningitis, pneumonia, keratitis, and arthritis. Based on pathogenicity, V. vulnificus is categorized into three biotypes. Biotypes 1 and 3 cause disease in humans, while biotype 2 causes disease in eels and fish. Owing to the indiscriminate use of antibiotics, V. vulnificus has developed resistance to many of them, which makes treatment a serious challenge. V. vulnificus is resistant to cefazolin, streptomycin, tetracycline, aztreonam, tobramycin, cefepime, and gentamycin. Subtractive genome analysis is the most effective method for drug target identification. The method is based on subtracting the proteins that are homologous between pathogen and host. This process identifies the set of proteins that are present only in the pathogen and perform essential functions in it. The entire proteome of Vibrio vulnificus strain ATCC 27562 was reduced step by step to a single protein predicted as the drug target. AlphaFold2 is one application of deep learning algorithms in biomedicine and is rightly considered a game changer in structural biology. Accuracy and speed are the major strengths of AlphaFold2. No crystal structure of the predicted drug target was present in the PDB; therefore, a Colab notebook was used to predict the 3D structure with AlphaFold2, and the predicted model was subsequently validated. Potent inhibitors against the new target were predicted by virtual screening and molecular docking. The most stable compound, ZINC01318774, binds tightly to the binding pocket of bisphosphoglycerate-independent phosphoglycerate mutase. The time-dependent molecular dynamics simulation revealed that compound ZINC01318774 was superior to the standard drug tetracycline in terms of stability.
The availability of V. vulnificus strain ATCC 27562 has allowed in silico identification of a drug target, which will provide a basis for the discovery of specific therapeutic targets against Vibrio vulnificus.
Affiliation(s)
- Bader S. Alotaibi
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Al-Quwayiyah, Shaqra University, Riyadh, Saudi Arabia
| | - Amar Ajmal
- Department of Biochemistry, Computational Medicinal Chemistry Laboratory, UCSS, Abdul Wali Khan University, Mardan, Pakistan
| | - Mohammed Ageeli Hakami
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Al-Quwayiyah, Shaqra University, Riyadh, Saudi Arabia
| | - Arif Mahmood
- Center for Medical Genetics and Human Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, 410078, Hunan, China
| | - Abdul Wadood
- Department of Biochemistry, Computational Medicinal Chemistry Laboratory, UCSS, Abdul Wali Khan University, Mardan, Pakistan
| | - Junjian Hu
- Department of Central Laboratory, SSL, Central Hospital of Gongguan City, Affiliated Dongguan Shilong People's Hospital of Southern Medical University, Dongguan, China
| |
|
12
|
Ruperti F, Papadopoulos N, Musser JM, Mirdita M, Steinegger M, Arendt D. Cross-phyla protein annotation by structural prediction and alignment. Genome Biol 2023; 24:113. [PMID: 37173746 PMCID: PMC10176882 DOI: 10.1186/s13059-023-02942-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/02/2022] [Accepted: 04/18/2023] [Indexed: 05/15/2023] Open
Abstract
BACKGROUND Protein annotation is a major goal in molecular biology, yet experimentally determined knowledge is typically limited to a few model organisms. In non-model species, the sequence-based prediction of gene orthology can be used to infer protein identity; however, this approach loses predictive power at longer evolutionary distances. Here we propose a workflow for protein annotation using structural similarity, exploiting the fact that similar protein structures often reflect homology and are more conserved than protein sequences. RESULTS We propose a workflow of openly available tools for the functional annotation of proteins via structural similarity (MorF: MorphologFinder) and use it to annotate the complete proteome of a sponge. Sponges are highly relevant for inferring the early history of animals, yet their proteomes remain sparsely annotated. MorF accurately predicts the functions of proteins with known homology in [Formula: see text] cases and annotates an additional [Formula: see text] of the proteome beyond standard sequence-based methods. We uncover new functions for sponge cell types, including extensive FGF, TGF, and Ephrin signaling in sponge epithelia, and redox metabolism and control in myopeptidocytes. Notably, we also annotate genes specific to the enigmatic sponge mesocytes, proposing they function to digest cell walls. CONCLUSIONS Our work demonstrates that structural similarity is a powerful approach that complements and extends sequence similarity searches to identify homologous proteins over long evolutionary distances. We anticipate this will be a powerful approach that boosts discovery in numerous -omics datasets, especially for non-model organisms.
Affiliation(s)
- Fabian Ruperti
- Developmental Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Faculty of Biosciences, Collaboration for joint Ph.D. degree between EMBL and Heidelberg University, Heidelberg, Germany
| | - Nikolaos Papadopoulos
- Developmental Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Department for Evolutionary Biology, University of Vienna, Vienna, Austria
| | - Jacob M Musser
- Developmental Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Milot Mirdita
- School of Biological Sciences, Seoul National University, Seoul, South Korea
| | - Martin Steinegger
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
| | - Detlev Arendt
- Developmental Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany.
- Centre for Organismal Studies, University of Heidelberg, Heidelberg, Germany.
| |
|
13
|
Mészáros B, Park E, Malinverni D, Sejdiu BI, Immadisetty K, Sandhu M, Lang B, Babu MM. Recent breakthroughs in computational structural biology harnessing the power of sequences and structures. Curr Opin Struct Biol 2023; 80:102608. [PMID: 37182396 DOI: 10.1016/j.sbi.2023.102608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 04/12/2023] [Accepted: 04/17/2023] [Indexed: 05/16/2023]
Abstract
Recent advances in computational approaches and their integration into structural biology enable tackling increasingly complex questions. Here, we discuss several key areas, highlighting breakthroughs and remaining challenges. Theoretical modeling has provided tools to accurately predict and design protein structures on a scale currently difficult to achieve using experimental approaches. Molecular Dynamics simulations have become faster and more precise, delivering actionable information inaccessible by current experimental methods. Virtual screening workflows allow a high-throughput approach to discover ligands that bind and modulate protein function, while Machine Learning methods enable the design of proteins with new functionalities. Integrative structural biology combines several of these approaches, pushing the frontiers of structural and functional characterization to ever larger systems, advancing towards a complete understanding of the living cell. These breakthroughs will accelerate and significantly impact diverse areas of science.
Affiliation(s)
- Bálint Mészáros
- Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA.
| | - Electa Park
- Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA.
| | - Duccio Malinverni
- Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA. https://twitter.com/DucMalinverni
| | - Besian I Sejdiu
- Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA. https://twitter.com/bisejdiu
| | - Kalyan Immadisetty
- Department of Bone Marrow Transplantation & Cellular Therapy, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA. https://twitter.com/k_immadisetty
| | - Manbir Sandhu
- Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA. https://twitter.com/M5andhu
| | - Benjamin Lang
- Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA. https://twitter.com/langbnj
| | - M Madan Babu
- Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA.
| |
|
14
|
Liu J, Yuan R, Shao W, Wang J, Silman I, Sussman JL. Do "Newly Born" orphan proteins resemble "Never Born" proteins? A study using three deep learning algorithms. Proteins 2023. [PMID: 37092778 DOI: 10.1002/prot.26496] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/02/2022] [Revised: 02/26/2023] [Accepted: 04/01/2023] [Indexed: 04/25/2023]
Abstract
"Newly Born" proteins, devoid of detectable homology to any other proteins, known as orphan proteins, occur in a single species or within a taxonomically restricted gene family. They are generated by the expression of novel open reading frames, and appear throughout evolution. We were curious if three recently developed programs for predicting protein structures, namely, AlphaFold2, RoseTTAFold, and ESMFold, might be of value for comparison of such "Newly Born" proteins to random polypeptides with amino acid content similar to that of native proteins, which have been called "Never Born" proteins. The programs were used to compare the structures of two sets of "Never Born" proteins that had been expressed: Group 1, which had been shown experimentally to possess substantial secondary structure, and Group 3, which had been shown to be intrinsically disordered. Overall, although the models generated were scored as being of low quality, they nevertheless revealed some general principles. Specifically, all four members of Group 1 were predicted to be compact by all three algorithms, in agreement with the experimental data, whereas the members of Group 3 were predicted to be very extended, as would be expected for intrinsically disordered proteins, again consistent with the experimental data. These predicted differences were shown to be statistically significant by comparing their accessible surface areas. The three programs were then used to predict the structures of three orphan proteins whose crystal structures had been solved, two of which display novel folds. Surprisingly, only for the protein which did not have a novel fold, and was taxonomically restricted, rather than being a true orphan, did all three algorithms predict very similar, high-quality structures, closely resembling the crystal structure. Finally, they were used to predict the structures of seven orphan proteins with well-identified biological functions, whose 3D structures are not known.
Two proteins, which were predicted to be disordered based on their sequences, are predicted by all three structure algorithms to be extended structures. The other five were predicted to be compact structures with only two exceptions in the case of AlphaFold2. All three prediction algorithms make remarkably similar and high-quality predictions for one large protein, HCO_11565, from a nematode. It is conjectured that this is due to many homologs in the taxonomically restricted family of which it is a member, and to the fact that the Dali server revealed several nonrelated proteins with similar folds. An animated Interactive 3D Complement (I3DC) is available in Proteopedia at http://proteopedia.org/w/Journal:Proteins:3.
Affiliation(s)
- Jing Liu
- Department of Biotechnology and Food Engineering, Guangdong Technion-Israel Institute of Technology, Shantou, China
- Faculty of Biotechnology and Food Engineering, Technion-Israel Institute of Technology, Haifa, Israel
| | - Rongqing Yuan
- Department of Chemistry, Tsinghua University, Beijing, China
| | - Wei Shao
- School of Chemistry and Chemical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Jitong Wang
- Department of Chemistry, Tsinghua University, Beijing, China
| | - Israel Silman
- Department of Brain Sciences, The Weizmann Institute of Science, Rehovot, Israel
| | - Joel L Sussman
- Department of Chemical and Structural Biology, The Weizmann Institute of Science, Rehovot, Israel
| |
|
15
|
Aubel M, Eicholt L, Bornberg-Bauer E. Assessing structure and disorder prediction tools for de novo emerged proteins in the age of machine learning. F1000Res 2023; 12:347. [PMID: 37113259 PMCID: PMC10126731 DOI: 10.12688/f1000research.130443.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Accepted: 03/17/2023] [Indexed: 03/31/2023] Open
Abstract
Background: De novo protein coding genes emerge from scratch in the non-coding regions of the genome and have, per definition, no homology to other genes. Therefore, their encoded de novo proteins belong to the so-called "dark protein space". So far, only four de novo protein structures have been experimentally approximated. Low homology, presumed high disorder and limited structures result in low confidence structural predictions for de novo proteins in most cases. Here, we look at the most widely used structure and disorder predictors and assess their applicability for de novo emerged proteins. Since AlphaFold2 is based on the generation of multiple sequence alignments and was trained on solved structures of largely conserved and globular proteins, its performance on de novo proteins remains unknown. More recently, natural language models of proteins have been used for alignment-free structure predictions, potentially making them more suitable for de novo proteins than AlphaFold2. Methods: We applied different disorder predictors (IUPred3 short/long, flDPnn) and structure predictors, AlphaFold2 on the one hand and language-based models (Omegafold, ESMfold, RGN2) on the other hand, to four de novo proteins with experimental evidence on structure. We compared the resulting predictions between the different predictors as well as to the existing experimental evidence. Results: Results from IUPred, the most widely used disorder predictor, depend heavily on the choice of parameters and differ significantly from flDPnn which has been found to outperform most other predictors in a comparative assessment study recently. Similarly, different structure predictors yielded varying results and confidence scores for de novo proteins. 
Conclusions: We suggest that, while in some cases protein language model based approaches might be more accurate than AlphaFold2, the structure prediction of de novo emerged proteins remains a difficult task for any predictor, be it disorder or structure.
Affiliation(s)
- Margaux Aubel
- Institute for Evolution and Biodiversity, University of Muenster, Muenster, 48149, Germany
| | - Lars Eicholt
- Institute for Evolution and Biodiversity, University of Muenster, Muenster, 48149, Germany
| | - Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Muenster, Muenster, 48149, Germany
- Department Protein Evolution, Max Planck-Institute for Biology, Tuebingen, 72076, Germany
| |
|
16
|
McDonald EF, Jones T, Plate L, Meiler J, Gulsevin A. Benchmarking AlphaFold2 on peptide structure prediction. Structure 2023; 31:111-119.e2. [PMID: 36525975 PMCID: PMC9883802 DOI: 10.1016/j.str.2022.11.012] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Received: 02/18/2022] [Revised: 10/15/2022] [Accepted: 11/18/2022] [Indexed: 12/23/2022]
Abstract
Recent advancements in computational tools have allowed protein structure prediction with high accuracy. Computational prediction methods have been used for modeling many soluble and membrane proteins, but the performance of these methods in modeling peptide structures has not yet been systematically investigated. We benchmarked the accuracy of AlphaFold2 in predicting 588 peptide structures between 10 and 40 amino acids using experimentally determined NMR structures as reference. Our results showed AlphaFold2 predicts α-helical, β-hairpin, and disulfide-rich peptides with high accuracy. AlphaFold2 performed at least as well if not better than alternative methods developed specifically for peptide structure prediction. AlphaFold2 showed several shortcomings in predicting Φ/Ψ angles, disulfide bond patterns, and the lowest RMSD structures failed to correlate with lowest pLDDT ranked structures. In summary, computation can be a powerful tool to predict peptide structures, but additional steps may be necessary to analyze and validate the results.
Affiliation(s)
- Eli Fritz McDonald
- Department of Chemistry, Vanderbilt University, Nashville, TN 37212, USA; Center for Structural Biology, Vanderbilt University, Nashville, TN 37212, USA
| | - Taylor Jones
- Department of Chemistry, Vanderbilt University, Nashville, TN 37212, USA; Center for Structural Biology, Vanderbilt University, Nashville, TN 37212, USA
| | - Lars Plate
- Department of Chemistry, Vanderbilt University, Nashville, TN 37212, USA; Department of Biological Sciences, Vanderbilt University, Nashville, TN 37212, USA
| | - Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, TN 37212, USA; Center for Structural Biology, Vanderbilt University, Nashville, TN 37212, USA; Institute for Drug Discovery, Leipzig University Medical School, 04103 Leipzig, Germany.
| | - Alican Gulsevin
- Department of Chemistry, Vanderbilt University, Nashville, TN 37212, USA; Center for Structural Biology, Vanderbilt University, Nashville, TN 37212, USA.
| |
|
17
|
Nabi A, Dilekoglu B, Adebali O, Tastan O. Discovering misannotated lncRNAs using deep learning training dynamics. Bioinformatics 2023; 39:6960922. [PMID: 36571493 PMCID: PMC9825752 DOI: 10.1093/bioinformatics/btac821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Revised: 10/05/2022] [Accepted: 12/23/2022] [Indexed: 12/27/2022] Open
Abstract
MOTIVATION Recent experimental evidence has shown that some long non-coding RNAs (lncRNAs) contain small open reading frames (sORFs) that are translated into functional micropeptides, suggesting that these lncRNAs are misannotated as non-coding. Current methods to detect misannotated lncRNAs rely on ribosome-profiling (Ribo-Seq) and mass-spectrometry experiments, which are cell-type dependent and expensive. RESULTS Here, we propose a computational method to identify possible misannotated lncRNAs from sequence information alone. Our approach first builds deep learning models to discriminate coding and non-coding transcripts and leverages these models' training dynamics to identify misannotated lncRNAs, i.e. lncRNAs with coding potential. The set of misannotated lncRNAs we identified overlaps significantly with experimentally validated ones and closely resembles coding protein sequences, as evidenced by significant BLAST hits. Our analysis of a subset of misannotated lncRNA candidates also shows that some ORFs they contain yield high-confidence folded structures as predicted by AlphaFold2. This methodology offers promising potential for assisting experimental efforts in characterizing the hidden proteome encoded by misannotated lncRNAs and for curating better datasets for building coding potential predictors. AVAILABILITY AND IMPLEMENTATION Source code is available at https://github.com/nabiafshan/DetectingMisannotatedLncRNAs. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Afshan Nabi
- Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
| | - Berke Dilekoglu
- Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
| | - Ogun Adebali
- Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
| | | |
|
18
|
Garcia-Calvo E, García-García A, Rodríguez S, Farrais S, Martín R, García T. Construction of a Fab Library Merging Chains from Semisynthetic and Immune Origin, Suitable for Developing New Tools for Gluten Immunodetection in Food. Foods 2022; 12:149. [PMID: 36613365 PMCID: PMC9818130 DOI: 10.3390/foods12010149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 12/20/2022] [Accepted: 12/23/2022] [Indexed: 12/29/2022] Open
Abstract
The observed increase in the prevalence of gluten-related disorders has prompted the development of novel immunological systems for gluten detection in foodstuffs. Innovation in these methods relies on the generation of new antibodies, which might alternatively be obtained by molecular evolution methods such as phage display. This work presents a novel approach for the generation of a Fab library by merging semi-synthetic heavy chains built from a pre-existing recombinant antibody fragment (dAb8E) with an immune light chain set derived from celiac donors. From the initial phage population (10^7 candidates) and after three rounds of selection and amplification, four different clones were isolated for further characterization. The phage Fab8E-4 presented the best features to be applied in an indirect ELISA for the detection of gluten in foods, resulting in improved specificity and sensitivity.
Affiliation(s)
- Eduardo Garcia-Calvo
- Departamento de Nutrición y Ciencia de los Alimentos, Facultad de Veterinaria, Universidad Complutense de Madrid, 28040 Madrid, Spain
- Aina García-García
- Departamento de Nutrición y Ciencia de los Alimentos, Facultad de Veterinaria, Universidad Complutense de Madrid, 28040 Madrid, Spain
- Santiago Rodríguez
- Departamento de Nutrición y Ciencia de los Alimentos, Facultad de Veterinaria, Universidad Complutense de Madrid, 28040 Madrid, Spain
- Sergio Farrais
- Servicio de Medicina Digestiva, Hospital Universitario Fundación Jiménez Díaz, 28040 Madrid, Spain
- Rosario Martín
- Departamento de Nutrición y Ciencia de los Alimentos, Facultad de Veterinaria, Universidad Complutense de Madrid, 28040 Madrid, Spain
- Teresa García
- Departamento de Nutrición y Ciencia de los Alimentos, Facultad de Veterinaria, Universidad Complutense de Madrid, 28040 Madrid, Spain
|
19
|
Ilzhöfer D, Heinzinger M, Rost B. SETH predicts nuances of residue disorder from protein embeddings. Frontiers in Bioinformatics 2022; 2:1019597. [PMID: 36304335 PMCID: PMC9580958 DOI: 10.3389/fbinf.2022.1019597] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/15/2022] [Accepted: 09/20/2022] [Indexed: 11/07/2022] Open
Abstract
Predictions for millions of protein three-dimensional structures are only a few clicks away since the release of AlphaFold2 results for UniProt. However, many proteins have so-called intrinsically disordered regions (IDRs) that do not adopt unique structures in isolation. These IDRs are associated with several diseases, including Alzheimer's Disease. We showed that three recent disorder measures of AlphaFold2 predictions (pLDDT, "experimentally resolved" prediction and "relative solvent accessibility") correlated to some extent with IDRs. However, expert methods predict IDRs more reliably by combining complex machine learning models with expert-crafted input features and evolutionary information from multiple sequence alignments (MSAs). MSAs are not always available, especially for IDRs, and are computationally expensive to generate, limiting the scalability of the associated tools. Here, we present the novel method SETH that predicts residue disorder from embeddings generated by the protein Language Model ProtT5, which explicitly only uses single sequences as input. Thereby, our method, relying on a relatively shallow convolutional neural network, outperformed much more complex solutions while being much faster, allowing predictions for the human proteome to be created in about 1 hour on a consumer-grade PC with one NVIDIA GeForce RTX 3060. Trained on a continuous disorder scale (CheZOD scores), our method captured subtle variations in disorder, thereby providing important information beyond the binary classification of most methods. High performance paired with speed revealed that SETH's nuanced disorder predictions for entire proteomes capture aspects of the evolution of organisms. Additionally, SETH could also be used to filter out regions or proteins with probable low-quality AlphaFold2 3D structures to prioritize running the compute-intensive predictions for large data sets. SETH is freely and publicly available at: https://github.com/Rostlab/SETH.
Affiliation(s)
- Dagmar Ilzhöfer
- Faculty of Informatics, TUM (Technical University of Munich), Munich, Germany
- Michael Heinzinger
- Faculty of Informatics, TUM (Technical University of Munich), Munich, Germany
- Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), TUM Graduate School, Garching, Germany
- Burkhard Rost
- Faculty of Informatics, TUM (Technical University of Munich), Munich, Germany
- Institute for Advanced Study (TUM-IAS), TUM (Technical University of Munich), Garching, Germany
- TUM School of Life Sciences Weihenstephan (WZW), TUM (Technical University of Munich), Freising, Germany
|
20
|
Kamaraj R, Drastik M, Maixnerova J, Pavek P. Allosteric Antagonism of the Pregnane X Receptor (PXR): Current-State-of-the-Art and Prediction of Novel Allosteric Sites. Cells 2022; 11:2974. [PMID: 36230936 PMCID: PMC9563780 DOI: 10.3390/cells11192974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Revised: 09/20/2022] [Accepted: 09/20/2022] [Indexed: 11/26/2022] Open
Abstract
The pregnane X receptor (PXR, NR1I2) is a xenobiotic-activated transcription factor with high levels of expression in the liver. It not only plays a key role in drug metabolism and elimination, but also promotes tumor growth, drug resistance, and metabolic diseases. It has been proposed as a therapeutic target for type II diabetes, metabolic syndrome, and inflammatory bowel disease, and PXR antagonists have recently been considered as a therapy for colon cancer. There are currently no PXR antagonists that can be used in a clinical setting. Nevertheless, due to the large and complex ligand-binding pocket (LBP) of the PXR, it is challenging to discover PXR antagonists at the orthosteric site. Alternative ligand binding sites of the PXR have also been proposed and are currently being studied. Recently, the AF-2 allosteric binding site of the PXR has been identified, with several compounds modulating the site discovered. Herein, we aimed to summarize our current knowledge of allosteric modulation of the PXR as well as our attempt to unlock novel allosteric sites. We describe the novel binding function 3 (BF-3) site of PXR, which is also common for other nuclear receptors. In addition, we also mention a novel allosteric site III based on in silico prediction. The identified allosteric sites of the PXR provide new insights into the development of safe and efficient allosteric modulators of the PXR receptor. We therefore propose that novel PXR allosteric sites might be promising targets for treating chronic metabolic diseases and some cancers.
Affiliation(s)
- Rajamanikkam Kamaraj
- Department of Pharmacology and Toxicology, Faculty of Pharmacy, Charles University in Prague, Heyrovskeho 1203, 50005 Hradec Kralove, Czech Republic
- Martin Drastik
- Department of Physical Chemistry and Biophysics, Faculty of Pharmacy, Charles University in Prague, Heyrovskeho 1203, 50005 Hradec Kralove, Czech Republic
- Jana Maixnerova
- Department of Pharmacology and Toxicology, Faculty of Pharmacy, Charles University in Prague, Heyrovskeho 1203, 50005 Hradec Kralove, Czech Republic
- Petr Pavek
- Department of Pharmacology and Toxicology, Faculty of Pharmacy, Charles University in Prague, Heyrovskeho 1203, 50005 Hradec Kralove, Czech Republic
|
21
|
Geffen Y, Ofran Y, Unger R. DistilProtBert: a distilled protein language model used to distinguish between real proteins and their randomly shuffled counterparts. Bioinformatics 2022; 38:ii95-ii98. [PMID: 36124789 DOI: 10.1093/bioinformatics/btac474] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 12/25/2022] Open
Abstract
SUMMARY: Recently, deep learning models, initially developed in the field of natural language processing (NLP), were applied successfully to analyze protein sequences. A major drawback of these models is their size in terms of the number of parameters needed to be fitted and the amount of computational resources they require. Recently, 'distilled' models using the concept of student and teacher networks have been widely used in NLP. Here, we adapted this concept to the problem of protein sequence analysis, by developing DistilProtBert, a distilled version of the successful ProtBert model. Implementing this approach, we reduced the size of the network and the running time by 50%, and the computational resources needed for pretraining by 98% relative to the ProtBert model. Using two published tasks, we showed that the performance of the distilled model approaches that of the full model. We next tested the ability of DistilProtBert to distinguish between real and random protein sequences. The task is highly challenging if the composition is maintained on the level of singlet, doublet and triplet amino acids. Indeed, traditional machine-learning algorithms have difficulties with this task. Here, we show that DistilProtBert performs very well on singlet, doublet and even triplet-shuffled versions of the human proteome, with AUC of 0.92, 0.91 and 0.87, respectively. Finally, we suggest that by examining the small number of false-positive classifications (i.e. shuffled sequences classified as proteins by DistilProtBert), we may be able to identify de novo potential natural-like proteins based on random shuffling of amino acid sequences. AVAILABILITY AND IMPLEMENTATION: https://github.com/yarongef/DistilProtBert.
Affiliation(s)
- Yaron Geffen
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 5290002, Israel
- Yanay Ofran
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 5290002, Israel
- Ron Unger
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 5290002, Israel
|
22
|
Eicholt LA, Aubel M, Berk K, Bornberg‐Bauer E, Lange A. Heterologous expression of naturally evolved putative de novo proteins with chaperones. Protein Sci 2022; 31:e4371. [PMID: 35900020 PMCID: PMC9278007 DOI: 10.1002/pro.4371] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/27/2022] [Revised: 05/03/2022] [Accepted: 05/14/2022] [Indexed: 11/23/2022]
Abstract
Over the past decade, evidence has accumulated that new protein-coding genes can emerge de novo from previously non-coding DNA. Most studies have focused on large scale computational predictions of de novo protein-coding genes across a wide range of organisms. In contrast, experimental data concerning the folding and function of de novo proteins are scarce. This might be due to difficulties in handling de novo proteins in vitro, as most are short and predicted to be disordered. Here, we propose a guideline for the effective expression of eukaryotic de novo proteins in Escherichia coli. We used 11 sequences from Drosophila melanogaster and 10 from Homo sapiens, that are predicted de novo proteins from former studies, for heterologous expression. The candidate de novo proteins have varying secondary structure and disorder content. Using multiple combinations of purification tags, E. coli expression strains, and chaperone systems, we were able to increase the number of solubly expressed putative de novo proteins from 30% to 62%. Our findings indicate that the best combination for expressing putative de novo proteins in E. coli is a GST-tag with T7 Express cells and co-expressed chaperones. We found that, overall, proteins with higher predicted disorder were easier to express. STATEMENT: Today, we know that proteins do not only evolve by duplication and divergence of existing proteins but also arise from previously non-coding DNA. These proteins are called de novo proteins. Their properties are still poorly understood and their experimental analysis faces major obstacles. Here, we aim to present a starting point for soluble expression of de novo proteins with the help of chaperones and thereby enable further characterization.
Affiliation(s)
- Lars A. Eicholt
- Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
- Margaux Aubel
- Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
- Katrin Berk
- Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
- Erich Bornberg‐Bauer
- Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
- Max Planck‐Institute for Biology Tuebingen, Tübingen, Germany
- Andreas Lange
- Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
|
23
|
Bæk KT, Kepp KP. Assessment of AlphaFold2 for Human Proteins via Residue Solvent Exposure. J Chem Inf Model 2022; 62:3391-3400. [PMID: 35785970 DOI: 10.1021/acs.jcim.2c00243] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 11/29/2022]
Abstract
As only 35% of human proteins feature (often partial) PDB structures, the protein structure prediction tool AlphaFold2 (AF2) could have a massive impact on the fields of human biology and medicine, making independent benchmarks of interest. We studied AF2's ability to describe the backbone solvent exposure as a functionally important and easily interpretable "natural coordinate" of protein conformation, using human proteins as a test case. After screening for appropriate comparative sets, we matched 1818 human proteins predicted by AF2 against 7585 unique experimental PDBs, and after curation for sequence overlap, we assessed 1264 comparative pairs comprising 115 unique AF2 structures and 652 unique experimental structures. AF2 performed markedly worse for multimers, whereas ligands, cofactors, and experimental resolution were interestingly not very important for performance. AF2 performed excellently for monomer proteins. Challenges relating to specific groups of residues and multimers were analyzed. We identified larger deviations for lower-confidence scores (pLDDT), and exposed residues and polar residues (e.g., Asp, Glu, Asn) being less accurately described than hydrophobic residues. Proline conformations were the hardest to predict, probably due to a common location in dynamic solvent-accessible parts. In summary, using solvent exposure as a metric, we quantified the performance of AF2 for human proteins and provided estimates of the expected agreement as a function of ligand presence, multimer/monomer status, local residue solvent exposure, pLDDT, and amino acid type. Overall performance was found to be excellent.
Affiliation(s)
- Kristoffer T Bæk
- DTU Chemistry, Technical University of Denmark, Building 206, Kgs. Lyngby 2800, Denmark
- Kasper P Kepp
- DTU Chemistry, Technical University of Denmark, Building 206, Kgs. Lyngby 2800, Denmark
|