1. Sousa JPM, Ramos MJ, Fernandes PA. QM/MM Study of the Reaction Mechanism of L-Tyrosine Hydroxylation Catalyzed by the Enzyme CYP76AD1. J Phys Chem B 2024; 128:9447-9454. [PMID: 39185757 DOI: 10.1021/acs.jpcb.4c05209] [Indexed: 08/27/2024]
Abstract
We have studied the hydroxylation mechanism of l-Tyr by the heme-dependent enzyme CYP76AD1 from the sugar beet (Beta vulgaris). This enzyme has a promising biotechnological application in modified yeast strains to produce medicinal alkaloids, an alternative to the traditional opium poppy harvest. A generative machine learning software based on AlphaFold was used to build the structure of CYP76AD1 since there are no structural data for this specific enzyme. After model validation, l-Tyr was docked in the active site of CYP76AD1 to assemble the reactive complex, whose catalytic distances remained stable throughout the 100 ns of MD simulation. Subsequent QM/MM calculations elucidated that l-Tyr hydroxylation occurs in two steps: hydrogen abstraction from l-Tyr by CpdI, forming an l-Tyr radical, and subsequent radical rebound, corresponding to a rate-limiting step of 16.0 kcal·mol⁻¹. Our calculations suggest that the hydrogen abstraction step should occur in the doublet state, while the radical rebound should happen in the quartet state. The clarification of the reaction mechanism of CYP76AD1 provides insights into the rational optimization of the biosynthesis of alkaloids to eliminate the use of opium poppy.
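To give a feel for what a 16.0 kcal·mol⁻¹ rate-limiting step implies kinetically, the sketch below applies the Eyring equation from transition state theory. It assumes, without confirmation from the abstract, that the reported value can be treated as an activation free energy, that the temperature is 298.15 K, and that the transmission coefficient is 1; the function name `eyring_rate` is illustrative.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R_KCAL = 1.987204e-3  # gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal_mol, temperature_k=298.15):
    """Eyring rate constant (s^-1) for a given activation free energy.

    Assumes a transmission coefficient of 1 (an assumption, not a value
    taken from the paper).
    """
    prefactor = K_B * temperature_k / H  # k_B*T/h in s^-1
    return prefactor * math.exp(-barrier_kcal_mol / (R_KCAL * temperature_k))


# 16.0 kcal/mol is the barrier quoted in the abstract; 298.15 K is assumed.
k = eyring_rate(16.0)
print(f"k = {k:.1f} s^-1")
```

Under these assumptions the rate constant comes out on the order of ten per second; a barrier lower by 1 kcal·mol⁻¹ would raise it roughly fivefold.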
Affiliation(s)
- João P M Sousa
- LAQV-REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências Universidade do Porto, Rua do Campo Alegre, s/n, Porto 4169-007, Portugal
- Maria J Ramos
- LAQV-REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências Universidade do Porto, Rua do Campo Alegre, s/n, Porto 4169-007, Portugal
- Pedro A Fernandes
- LAQV-REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências Universidade do Porto, Rua do Campo Alegre, s/n, Porto 4169-007, Portugal
2. Poudel B, Vanegas JM. Structural Rearrangement of the AT1 Receptor Modulated by Membrane Thickness and Tension. J Phys Chem B 2024; 128:9470-9481. [PMID: 39298653 DOI: 10.1021/acs.jpcb.4c03325] [Indexed: 09/22/2024]
Abstract
Membrane-embedded mechanosensitive (MS) proteins, including ion channels and G-protein coupled receptors (GPCRs), are essential for the transduction of external mechanical stimuli into biological signals. The angiotensin II type 1 (AT1) receptor plays many important roles in cardiovascular regulation and is associated with diseases such as hypertension and congestive heart failure. The membrane-mediated activation of the AT1 receptor is not well understood, despite this being one of the most widely studied GPCRs within the context of biased agonism. Here, we use extensive molecular dynamics (MD) simulations to characterize the effect of the local membrane environment on the activation of the AT1 receptor. We show that membrane thickness plays an important role in the stability of active and inactive states of the receptor, as well as the dynamic interchange between states. Furthermore, our simulation results show that membrane tension is effective in driving large-scale structural changes in the inactive state such as the outward movement of transmembrane helix 6 to stabilize intermediate active-like conformations. We conclude by comparing our simulation observations with AlphaFold 2 predictions, as a proxy to experimental structures, to provide a framework for how membrane mediated stimuli can facilitate activation of the AT1 receptor through the β-arrestin signaling pathway.
Affiliation(s)
- Bharat Poudel
- Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, United States
- Juan M Vanegas
- Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, United States
3. Liang H, Luo Y, van der Donk WA. Substrate Specificity of a Methyltransferase Involved in the Biosynthesis of the Lantibiotic Cacaoidin. Biochemistry 2024; 63:2493-2505. [PMID: 39271288 PMCID: PMC11447909 DOI: 10.1021/acs.biochem.4c00150] [Indexed: 09/15/2024]
Abstract
Modification of the N- and C-termini of peptides enhances their stability against degradation by exopeptidases. The biosynthetic pathways of many peptidic natural products feature enzymatic modification of their termini, and these enzymes may represent a valuable pool of biocatalysts. The lantibiotic cacaoidin carries an N,N-dimethylated N-terminal amine group. Its biosynthetic gene cluster encodes the putative methyltransferase Cao4. In this work, we present reconstitution of the activity of the enzyme, which we termed CaoSC following standardized lanthipeptide nomenclature, using a heterologously produced peptide as the model substrate. In vitro methylation of diverse lanthipeptides revealed the substrate requirements of CaoSC. The enzyme accepts peptides of varying lengths and C-terminal sequences but requires dehydroalanine or dehydrobutyrine at the second position. CaoSC-mediated dimethylation of natural lantibiotics resulted in modestly enhanced antimicrobial activity of the lantibiotic haloduracin compared to that of the native compound. Improved activity and/or metabolic stability as a result of methylation illustrates the potential future application of CaoSC in the bioengineering of therapeutic peptides.
Affiliation(s)
- Haoqian Liang
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Youran Luo
- Department of Chemistry and Howard Hughes Medical Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Wilfred A van der Donk
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Department of Chemistry and Howard Hughes Medical Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
4. Kouraki A, Zheng AS, Miller S, Kelly A, Ashraf W, Bazzani D, Bonadiman A, Tonidandel G, Bolzan M, Vijay A, Nightingale J, Menni C, Ollivere BJ, Valdes AM. Metagenomic changes in response to antibiotic treatment in severe orthopedic trauma patients. iScience 2024; 27:110783. [PMID: 39286492 PMCID: PMC11403444 DOI: 10.1016/j.isci.2024.110783] [Received: 05/13/2024] [Revised: 06/21/2024] [Accepted: 08/19/2024] [Indexed: 09/19/2024] Open
Abstract
We investigated changes in microbiome composition and abundance of antimicrobial resistance (AMR) genes post-antibiotic treatment in severe trauma patients. Shotgun sequencing revealed beta diversity (Bray-Curtis) differences between 16 hospitalized patients with multiple rib fractures and 10 age- and sex-matched controls (p = 0.043), and between antibiotic-treated and untreated patients (p = 0.015). Antibiotic-treated patients had lower alpha diversity (Shannon) at discharge (p = 0.003) and at 12 weeks post-discharge (p = 0.007). At 12 weeks, they also exhibited a 5.50-fold (95% confidence interval [CI]: 2.86-8.15) increase in Escherichia coli (p = 0.0004) compared to controls. Differential analysis identified nine AMRs that increased in antibiotic-treated compared to untreated patients between hospital discharge and the 6- and 12-week follow-ups (false discovery rate [FDR] < 0.20). Two aminoglycoside genes and a beta-lactamase gene were directly related to antibiotics administered, while five were unrelated. In trauma patients, lower alpha diversity, higher abundance of pathobionts, and increases in AMRs persisted for 12 weeks post-discharge, suggesting prolonged microbiome disruption. Probiotic or symbiotic therapies may offer future treatment avenues.
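The two diversity measures named in this abstract, Shannon alpha diversity and Bray-Curtis dissimilarity, can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions, not the authors' pipeline; the function names and the taxon counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon alpha diversity H' = -sum(p_i * ln(p_i)) over non-zero taxa."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


# Hypothetical taxon counts (illustrative only, not data from the study).
even_sample = [10, 10, 10, 10]   # four taxa, evenly represented
skewed_sample = [37, 1, 1, 1]    # same taxa, dominated by one

print(shannon_index(even_sample), shannon_index(skewed_sample))
print(bray_curtis(even_sample, skewed_sample))
```

The evenly distributed sample has the higher Shannon index, which is the sense in which a community dominated by a single organism shows "lower alpha diversity".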
Affiliation(s)
- Afroditi Kouraki
- Academic Unit of Injury, Recovery and Inflammation Sciences, School of Medicine, University of Nottingham, Nottingham NG7 2UH, UK
- NIHR Nottingham Biomedical Research Centre, Nottingham University Hospitals NHS Trust and the University of Nottingham, Nottingham NG7 2UH, UK
- Amy S Zheng
- Academic Unit of Injury, Recovery and Inflammation Sciences, School of Medicine, University of Nottingham, Nottingham NG7 2UH, UK
- NIHR Nottingham Biomedical Research Centre, Nottingham University Hospitals NHS Trust and the University of Nottingham, Nottingham NG7 2UH, UK
- Suzanne Miller
- Academic Unit of Injury, Recovery and Inflammation Sciences, School of Medicine, University of Nottingham, Nottingham NG7 2UH, UK
- NIHR Nottingham Biomedical Research Centre, Nottingham University Hospitals NHS Trust and the University of Nottingham, Nottingham NG7 2UH, UK
- Anthony Kelly
- Academic Unit of Injury, Recovery and Inflammation Sciences, School of Medicine, University of Nottingham, Nottingham NG7 2UH, UK
- NIHR Nottingham Biomedical Research Centre, Nottingham University Hospitals NHS Trust and the University of Nottingham, Nottingham NG7 2UH, UK
- Waheed Ashraf
- Academic Unit of Injury, Recovery and Inflammation Sciences, School of Medicine, University of Nottingham, Nottingham NG7 2UH, UK
- NIHR Nottingham Biomedical Research Centre, Nottingham University Hospitals NHS Trust and the University of Nottingham, Nottingham NG7 2UH, UK
- Amrita Vijay
- Academic Unit of Injury, Recovery and Inflammation Sciences, School of Medicine, University of Nottingham, Nottingham NG7 2UH, UK
- NIHR Nottingham Biomedical Research Centre, Nottingham University Hospitals NHS Trust and the University of Nottingham, Nottingham NG7 2UH, UK
- Jessica Nightingale
- Academic Unit of Injury, Recovery and Inflammation Sciences, School of Medicine, University of Nottingham, Nottingham NG7 2UH, UK
- NIHR Nottingham Biomedical Research Centre, Nottingham University Hospitals NHS Trust and the University of Nottingham, Nottingham NG7 2UH, UK
- Cristina Menni
- Department of Twin Research, King's College London, London SE1 7EH, UK
- Benjamin J Ollivere
- Academic Unit of Injury, Recovery and Inflammation Sciences, School of Medicine, University of Nottingham, Nottingham NG7 2UH, UK
- NIHR Nottingham Biomedical Research Centre, Nottingham University Hospitals NHS Trust and the University of Nottingham, Nottingham NG7 2UH, UK
- Ana M Valdes
- Academic Unit of Injury, Recovery and Inflammation Sciences, School of Medicine, University of Nottingham, Nottingham NG7 2UH, UK
- NIHR Nottingham Biomedical Research Centre, Nottingham University Hospitals NHS Trust and the University of Nottingham, Nottingham NG7 2UH, UK
5. Li C, Zhang Y, Shi W, Peng Y, Han Y, Jiang S, Dong X, Zhang R. Viral diversity within marine biofilms and interactions with corrosive microbes. Environmental Research 2024; 263:119991. [PMID: 39276831 DOI: 10.1016/j.envres.2024.119991] [Received: 07/12/2024] [Revised: 08/25/2024] [Accepted: 09/11/2024] [Indexed: 09/17/2024]
Abstract
In marine environments, a wide variety of microbes, such as bacteria and archaea, influence the corrosion of materials. Viruses are widely distributed in biofilms among these microbes and may affect the corrosion process through interactions with key corrosive prokaryotes. However, understanding of the viral communities within biofilms and their interactions with corrosive microbes remains limited. To address this knowledge gap, 53 metagenomes were used to investigate the diversity of viruses within biofilms on 8 different materials and their interactions with corrosive microbes. Notably, the viruses within biofilms predominantly belonged to Caudoviricetes, and phylogenetic analysis of Caudoviricetes and protein-sharing networks with other environments revealed the presence of numerous novel viral clades in biofilms. The virus‒host linkages revealed a close association between viruses and corrosive microbes in biofilms. This means that viruses may modulate host corrosion-related metabolism through auxiliary metabolic genes. It was observed that viruses could enhance host resistance to metals and antibiotics via horizontal gene transfer. Interestingly, viruses could protect themselves from host antiviral systems through anti-defense systems. This study illustrates the diversity of viruses within biofilms formed on materials and the intricate interactions between viruses and corrosive microbes, showing the potential roles of viruses in corrosive biofilms.
Affiliation(s)
- Chengpeng Li
- Key Laboratory of Advanced Marine Materials, Key Laboratory of Marine Environmental Corrosion and Bio-fouling, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, China; Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, 361005, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Yimeng Zhang
- Key Laboratory of Advanced Marine Materials, Key Laboratory of Marine Environmental Corrosion and Bio-fouling, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, China
- Wenqing Shi
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Science, Xiamen University, Xiamen, 361102, China
- Yongyi Peng
- Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, 361005, China; School of Marine Sciences, Sun Yat-Sen University, Zhuhai, 519082, China
- Yingchun Han
- Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, 361005, China
- Shuqing Jiang
- Key Laboratory of Advanced Marine Materials, Key Laboratory of Marine Environmental Corrosion and Bio-fouling, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Xiyang Dong
- Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, 361005, China.
- Ruiyong Zhang
- Key Laboratory of Advanced Marine Materials, Key Laboratory of Marine Environmental Corrosion and Bio-fouling, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, China; Institute of Marine Corrosion Protection, Guangxi Key Laboratory of Marine Environmental Science, Guangxi Academy of Sciences, Nanning, China.
6. Anselmi NK, Vanyo ST, Clark ND, Rodriguez DML, Jones MM, Rosenthal S, Patel D, Marconi RT, Visser MB. Topology and functional characterization of major outer membrane proteins of Treponema maltophilum and Treponema lecithinolyticum. Mol Oral Microbiol 2024. [PMID: 39263909 DOI: 10.1111/omi.12484] [Received: 05/09/2024] [Revised: 08/21/2024] [Accepted: 08/22/2024] [Indexed: 09/13/2024]
Abstract
Numerous Treponema species are prevalent in the dysbiotic subgingival microbial community during periodontitis. The major outer sheath protein (Msp) is a highly expressed virulence factor of the well-characterized species Treponema denticola. Msp forms an oligomeric membrane protein complex with adhesin and porin properties and contributes to host-microbial interaction. Treponema maltophilum and Treponema lecithinolyticum are also prominent during periodontitis but are relatively understudied. Msp-like membrane surface proteins exist in T. maltophilum (MspA) and T. lecithinolyticum (MspTL), but limited information exists regarding their structural features or functionality. Protein profiling reveals numerous differences between these species, but minimal differences between strains of the same species. Using protein modeling tools, we predict MspA and MspTL monomeric forms to be large β-barrel structures composed of 20 all-next-neighbor antiparallel β strands which most likely adopt a homotrimer formation. Using cell fractionation, Triton X-114 phase partitioning, heat modifiability, and chemical and detergent release assays, we found evidence of amphiphilic integral membrane-associated oligomerization for both native MspA and MspTL in intact spirochetes. Proteinase K accessibility and immunofluorescence assays demonstrate surface exposure of MspA and MspTL. Functionally, purified recombinant MspA or MspTL monomer proteins can impair neutrophil chemotaxis. Expression of MspA or MspTL with a PelB leader sequence in Escherichia coli also demonstrates surface exposure and can impair neutrophil chemotaxis in an in vivo air pouch model of inflammation. Collectively, our data demonstrate that MspA and MspTL membrane proteins can contribute to pathogenesis of these understudied oral spirochete species.
Affiliation(s)
- Natalie K Anselmi
- Department of Oral Biology, School of Dental Medicine, University at Buffalo, The State University of New York, Buffalo, New York, USA
- Stephen T Vanyo
- Department of Oral Biology, School of Dental Medicine, University at Buffalo, The State University of New York, Buffalo, New York, USA
- Nicholas D Clark
- Department of Structural Biology, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, The State University of New York, Buffalo, New York, USA
- Dayron M Leyva Rodriguez
- Department of Oral Biology, School of Dental Medicine, University at Buffalo, The State University of New York, Buffalo, New York, USA
- Megan M Jones
- Department of Oral Biology, School of Dental Medicine, University at Buffalo, The State University of New York, Buffalo, New York, USA
- Sara Rosenthal
- Department of Oral Biology, School of Dental Medicine, University at Buffalo, The State University of New York, Buffalo, New York, USA
- Dhara Patel
- Department of Microbiology and Immunology, Virginia Commonwealth University Medical Center, Richmond, Virginia, USA
- Richard T Marconi
- Department of Microbiology and Immunology, Virginia Commonwealth University Medical Center, Richmond, Virginia, USA
- Michelle B Visser
- Department of Oral Biology, School of Dental Medicine, University at Buffalo, The State University of New York, Buffalo, New York, USA
7. Italiya G, Subramanian S. Leveraging new approach methodologies: ecotoxicological modelling of endocrine disrupting chemicals to Danio rerio through machine learning and toxicity studies. Toxicol Mech Methods 2024:1-17. [PMID: 39223866 DOI: 10.1080/15376516.2024.2400324] [Received: 05/14/2024] [Revised: 07/30/2024] [Accepted: 07/31/2024] [Indexed: 09/04/2024]
Abstract
New approach methodologies (NAMs) offer information tailored to the intended application while reducing the use of animals. NAMs aim to develop quantitative structure-activity relationship (QSAR) and quantitative read-across structure-activity relationship (q-RASAR) models to predict and categorize the acute toxicity of known and unknown endocrine-disrupting chemicals (EDCs) against zebrafish. EDCs are a diverse group of toxic substances that disrupt the endocrine system of humans and animals. The q-RASAR model was constructed and verified using validation metrics (R² = 0.886 and Q² = 0.814) and was found to be a more reliable model than the QSAR model. The substructure fingerprint was well-fitted for the classification model, which was validated using 10-fold average accuracy (Q = 86.88%), specificity (Sp = 88.89%), Matthews correlation coefficient (MCC = 0.621) and receiver operating characteristic (ROC = 0.828). The dataset of unknown substances revealed that phenolphthalein (Php) exhibited a significant level of toxicity based on the q-RASAR model. The docking and simulation study indicated that the computationally derived important features successfully bound to the target zebrafish sex hormone binding globulin (zfSHBG). The experimental LC50 value of 0.790 mg L⁻¹ was very close to the predicted value of 0.763 mg L⁻¹, which provides high confidence in the developed model.
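The classification metrics quoted in this abstract (accuracy, specificity, and the Matthews correlation coefficient) all derive from a binary confusion matrix. The sketch below shows the standard formulas; the function name and the confusion-matrix counts are hypothetical and are not taken from the study.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, specificity, MCC) from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    specificity = tn / (tn + fp)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return accuracy, specificity, mcc


# Hypothetical counts (illustrative only, not the paper's results).
accuracy, specificity, mcc = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print(f"accuracy={accuracy:.3f} specificity={specificity:.3f} MCC={mcc:.3f}")
```

MCC ranges from -1 to 1 and accounts for all four cells of the confusion matrix, which is why it is often reported alongside accuracy when classes are imbalanced.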
Affiliation(s)
- Gopal Italiya
- School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
- Sangeetha Subramanian
- School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
8. Bao J, Su B, Chen Z, Sun Z, Peng J, Zhao S. A UTP3-dependent nucleolar translocation pathway facilitates pre-rRNA 5'ETS processing. Nucleic Acids Res 2024; 52:9671-9694. [PMID: 39036955 PMCID: PMC11381329 DOI: 10.1093/nar/gkae631] [Received: 10/02/2023] [Revised: 06/27/2024] [Accepted: 07/09/2024] [Indexed: 07/23/2024] Open
Abstract
The ribosome small subunit (SSU) is assembled by the SSU processome, which contains approximately 70 non-ribosomal protein factors. Whilst the biochemical mechanisms of the SSU processome in 18S rRNA processing and maturation have been extensively studied, how SSU processome components enter the nucleolus has yet to be systematically investigated. Here, in examining the nucleolar localization of 50 human SSU processome components, we found that UTP3, together with another 24 proteins, enters the nucleolus autonomously. For the remaining 25 proteins we found that UTP3/SAS10 assists the nucleolar localization of five proteins (MPP10, UTP25, EMG1 and the two UTP-B components UTP12 and UTP13), likely through its interaction with nuclear importin α. This 'ferrying' function of UTP3 was then confirmed as conserved in the zebrafish. We also found that knockdown of human UTP3 impairs cleavage at the A0-site while loss-of-function of either utp3/sas10 or utp13/tbl3 in zebrafish causes the accumulation of aberrantly processed 5'ETS products, which highlights the crucial role of UTP3 in mediating 5'ETS processing. Mechanistically, we found that UTP3 facilitates the degradation of processed 5'ETS by recruiting the RNA exosome component EXOSC10 to the nucleolus. These findings lay the groundwork for studying the mechanism of cytoplasm-to-nucleolus trafficking of SSU processome components.
Affiliation(s)
- Jiayang Bao
- MOE Key Laboratory for Molecular Animal Nutrition, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
- Baochun Su
- MOE Key Laboratory for Molecular Animal Nutrition, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
- Zheyan Chen
- College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Zhaoxiang Sun
- College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Jinrong Peng
- MOE Key Laboratory for Molecular Animal Nutrition, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
- Shuyi Zhao
- MOE Key Laboratory for Molecular Animal Nutrition, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
9. Zeng W, Dou Y, Pan L, Xu L, Peng S. Improving prediction performance of general protein language model by domain-adaptive pretraining on DNA-binding protein. Nat Commun 2024; 15:7838. [PMID: 39244557 PMCID: PMC11380688 DOI: 10.1038/s41467-024-52293-7] [Received: 12/18/2023] [Accepted: 08/29/2024] [Indexed: 09/09/2024] Open
Abstract
DNA-protein interactions form the foundation of many pivotal biological processes, such as DNA replication, transcription, and gene regulation. However, accurate and efficient computational methods for identifying these interactions are still lacking. In this study, we propose a method, ESM-DBP, built by refining the DNA-binding protein sequence repertory and applying domain-adaptive pretraining based on a general protein language model. Because general language models have not sufficiently explored domain-specific knowledge of DNA-binding proteins, we select 170,264 DNA-binding protein sequences to construct the domain-adaptive language model. Experimental results on four downstream tasks show that ESM-DBP provides a better feature representation of DNA-binding proteins than the original language model, resulting in improved prediction performance and outperforming the state-of-the-art methods. Moreover, ESM-DBP can still perform well even for sequences with only a few homologous sequences. ChIP-seq on two predicted cases further supports the validity of the proposed method.
Affiliation(s)
- Wenwu Zeng
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Yutao Dou
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Liangrui Pan
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Liwen Xu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China.
- Shaoliang Peng
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China.
10. Yu J, Zhang Y, Fang Y, Paulo JA, Yaghoubi D, Hua X, Shipkovenska G, Toda T, Zhang Z, Gygi SP, Jia S, Li Q, Moazed D. A replisome-associated histone H3-H4 chaperone required for epigenetic inheritance. Cell 2024; 187:5010-5028.e24. [PMID: 39094570 PMCID: PMC11380579 DOI: 10.1016/j.cell.2024.07.006] [Received: 08/17/2023] [Revised: 03/17/2024] [Accepted: 07/03/2024] [Indexed: 08/04/2024]
Abstract
Faithful transfer of parental histones to newly replicated daughter DNA strands is critical for inheritance of epigenetic states. Although replication proteins that facilitate parental histone transfer have been identified, how intact histone H3-H4 tetramers travel from the front to the back of the replication fork remains unknown. Here, we use AlphaFold-Multimer structural predictions combined with biochemical and genetic approaches to identify the Mrc1/CLASPIN subunit of the replisome as a histone chaperone. Mrc1 contains a conserved histone-binding domain that forms a brace around the H3-H4 tetramer mimicking nucleosomal DNA and H2A-H2B histones, is required for heterochromatin inheritance, and promotes parental histone recycling during replication. We further identify binding sites for the FACT histone chaperone in Swi1/TIMELESS and DNA polymerase α that are required for heterochromatin inheritance. We propose that Mrc1, in concert with FACT acting as a mobile co-chaperone, coordinates the distribution of parental histones to newly replicated DNA.
Affiliation(s)
- Juntao Yu
- Howard Hughes Medical Institute, Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Yujie Zhang
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Yimeng Fang
- Department of Biological Sciences, Columbia University, New York, NY 10027, USA
- Joao A Paulo
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Dadmehr Yaghoubi
- Howard Hughes Medical Institute, Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Xu Hua
- Institute for Cancer Genetics, Department of Pediatrics, and Department of Genetics and Development, Columbia University Irving Medical Center, New York, NY 10032, USA
- Gergana Shipkovenska
- Howard Hughes Medical Institute, Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Takenori Toda
- Department of Biological Sciences, Columbia University, New York, NY 10027, USA
- Zhiguo Zhang
- Institute for Cancer Genetics, Department of Pediatrics, and Department of Genetics and Development, Columbia University Irving Medical Center, New York, NY 10032, USA
- Steven P Gygi
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Songtao Jia
- Department of Biological Sciences, Columbia University, New York, NY 10027, USA
- Qing Li
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China.
- Danesh Moazed
- Howard Hughes Medical Institute, Department of Cell Biology, Harvard Medical School, Boston, MA, USA.
11. Singh H, Wiscovitch-Russo R, Kuelbs C, Espinoza J, Appel AE, Lyons RJ, Vashee S, Förtsch HEA, Foster JE, Ramdath D, Hayes VM, Nelson KE, Gonzalez-Juarbe N. Multiomic Insights into Human Health: Gut Microbiomes of Hunter-Gatherer, Agropastoral, and Western Urban Populations. bioRxiv 2024:2024.09.03.611095. [PMID: 39282340 PMCID: PMC11398329 DOI: 10.1101/2024.09.03.611095] [Indexed: 09/22/2024]
Abstract
Societies with exposure to preindustrial diets exhibit improved markers of health. Our study used a comprehensive multi-omic approach to reveal that the gut microbiome of the Ju/'hoansi hunter-gatherers, one of the most remote KhoeSan groups, exhibits higher diversity and richness, with an abundance of microbial species lost in the Western population. The Ju/'hoansi microbiome showed enhanced global transcription and enrichment of complex carbohydrate metabolic and energy generation pathways. The Ju/'hoansi also show high abundance of short-chain fatty acids that are associated with health and optimal immune function. In contrast, these pathways and their respective species were found in low abundance or completely absent in Western populations. Amino acid and fatty acid metabolism pathways were observed to be prevalent in the Western population and were associated with biomarkers of chronic inflammation. Our study provides the first in-depth multi-omic characterization of the Ju/'hoansi microbiome, revealing uncharacterized species and functional pathways that are associated with health.
12. Mifsud JCO, Lytras S, Oliver MR, Toon K, Costa VA, Holmes EC, Grove J. Mapping glycoprotein structure reveals Flaviviridae evolutionary history. Nature 2024; 633:695-703. [PMID: 39232167 PMCID: PMC11410658 DOI: 10.1038/s41586-024-07899-8] [Received: 02/10/2024] [Accepted: 08/01/2024] [Indexed: 09/06/2024]
Abstract
Viral glycoproteins drive membrane fusion in enveloped viruses and determine host range, tissue tropism and pathogenesis¹. Despite their importance, there is a fragmentary understanding of glycoproteins within the Flaviviridae², a large virus family that includes pathogens such as hepatitis C, dengue and Zika viruses, and numerous other human, animal and emergent viruses. For many flaviviruses the glycoproteins have not yet been identified; for others, such as the hepaciviruses, the molecular mechanisms of membrane fusion remain uncharacterized³. Here we combine phylogenetic analyses with protein structure prediction to survey glycoproteins across the entire Flaviviridae. We find class II fusion systems, homologous to the Orthoflavivirus E glycoprotein, in most species, including highly divergent jingmenviruses and large genome flaviviruses. However, the E1E2 glycoproteins of the hepaciviruses, pegiviruses and pestiviruses are structurally distinct, may represent a novel class of fusion mechanism, and are strictly associated with infection of vertebrate hosts. By mapping glycoprotein distribution onto the underlying phylogeny, we reveal a complex evolutionary history marked by the capture of bacterial genes and potentially inter-genus recombination. These insights, made possible through protein structure prediction, refine our understanding of viral fusion mechanisms and reveal the events that have shaped the diverse virology and ecology of the Flaviviridae.
Affiliation(s)
- Jonathon C O Mifsud
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney, New South Wales, Australia
| | - Spyros Lytras
- MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
- Division of Systems Virology, Department of Microbiology and Immunology, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
| | - Michael R Oliver
- MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
| | - Kamilla Toon
- MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
| | - Vincenzo A Costa
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney, New South Wales, Australia
| | - Edward C Holmes
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney, New South Wales, Australia
- Laboratory of Data Discovery for Health Limited, Hong Kong SAR, China
| | - Joe Grove
- MRC-University of Glasgow Centre for Virus Research, Glasgow, UK.
|
13
|
Rahimzadeh F, Mohammad Khanli L, Salehpoor P, Golabi F, PourBahrami S. Unveiling the evolution of policies for enhancing protein structure predictions: A comprehensive analysis. Comput Biol Med 2024; 179:108815. [PMID: 38986287 DOI: 10.1016/j.compbiomed.2024.108815] [Received: 03/04/2024] [Revised: 06/09/2024] [Accepted: 06/24/2024] [Indexed: 07/12/2024]
Abstract
Predicting protein structure is both fascinating and formidable, playing a crucial role in structure-based drug discovery and unraveling diseases with elusive origins. The Critical Assessment of Protein Structure Prediction (CASP) serves as a biennial battleground where global scientists converge to untangle the intricate relationships within amino acid chains. Two primary methods, Template-Based Modeling (TBM) and Template-Free (TF) strategies, dominate protein structure prediction. The trend has shifted towards Template-Free predictions due to their broader sequence coverage with fewer templates. The predictive process can be broadly classified into contact map, binned-distance, and real-valued distance predictions, each with distinctive strengths and limitations manifested through tailored loss functions. We have also introduced revolutionary end-to-end and all-atom diffusion-based techniques that have transformed protein structure predictions. Recent advancements in deep learning techniques have significantly improved prediction accuracy, although the effectiveness is contingent upon the quality of input features derived from natural bio-physicochemical attributes and Multiple Sequence Alignments (MSA). Hence, the generation of high-quality MSA data holds paramount importance in harnessing informative input features for enhanced prediction outcomes. Remarkable successes have been achieved in protein structure prediction accuracy, but not yet enough for the purposes structural knowledge is meant to serve, which implies a need for development in other aspects of the predictions. In this regard, scientists have opened other frontiers for protein structural prediction. The utilization of subsampling in multiple sequence alignment (MSA) and protein language modeling appears to be particularly promising in enhancing the accuracy and efficiency of predictions, ultimately aiding in drug discovery efforts. The exploration of predicting protein complex structure also opens up exciting opportunities to deepen our knowledge of molecular interactions and design therapeutics that are more effective. In this article, we have discussed the vicissitudes that scientists have gone through to improve prediction accuracy, and examined the effective policies in prediction from different aspects, including the construction of high-quality MSA, the provision of informative input features, and progress in deep learning approaches. We have also briefly touched upon transitioning from predicting single-chain protein structures to predicting protein complex structures. Our findings point towards promoting open research environments to support the objectives of protein structure prediction.
Affiliation(s)
- Faezeh Rahimzadeh
- Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
| | | | - Pedram Salehpoor
- Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
| | - Faegheh Golabi
- Department of Biomedical Engineering, Faculty of Advanced Medical Sciences, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Shahin PourBahrami
- Department of Computer Engineering, Technical and Vocational University (TVU), Tehran, Iran
|
14
|
Bryant P, Noé F. Structure prediction of alternative protein conformations. Nat Commun 2024; 15:7328. [PMID: 39187507 PMCID: PMC11347660 DOI: 10.1038/s41467-024-51507-2] [Received: 10/11/2023] [Accepted: 08/07/2024] [Indexed: 08/28/2024] Open
Abstract
Proteins are dynamic molecules whose movements result in different conformations with different functions. Neural networks such as AlphaFold2 can predict the structure of single-chain proteins with conformations most likely to exist in the PDB. However, almost all protein structures with multiple conformations represented in the PDB have been used while training these models. Therefore, it is unclear whether alternative protein conformations can be genuinely predicted using these networks, or if they are simply reproduced from memory. Here, we train a structure prediction network, Cfold, on a conformational split of the PDB to generate alternative conformations. Cfold enables efficient exploration of the conformational landscape of monomeric protein structures. Over 50% of experimentally known nonredundant alternative protein conformations evaluated here are predicted with high accuracy (TM-score > 0.8).
Affiliation(s)
- Patrick Bryant
- Department of Mathematics and Informatics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany.
- The Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Svante Arrhenius väg 20C, 114 18, Stockholm, Sweden.
- Science for Life Laboratory, 172 21, Solna, Sweden.
| | - Frank Noé
- Department of Mathematics and Informatics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
- Microsoft Research AI4Science, Karl-Liebknecht Str. 32, 10178, Berlin, Germany
|
15
|
Çakar MM, Milčić N, Andreadaki T, Charnock S, Fessner WD, Blažević ZF. Kinetic characterization of two neuraminic acid synthases and evaluation of their application potential. Appl Microbiol Biotechnol 2024; 108:446. [PMID: 39167161 PMCID: PMC11339185 DOI: 10.1007/s00253-024-13277-1] [Received: 05/20/2024] [Revised: 08/02/2024] [Accepted: 08/06/2024] [Indexed: 08/23/2024]
Abstract
Neuraminic acid synthases are an important yet underexplored group of enzymes. Thus, in this research, we performed a detailed kinetic and stability analysis and a comparison of a previously known neuraminic acid synthase from Neisseria meningitidis and a novel enzyme, PNH5, obtained from a metagenomic library. A systematic analysis revealed a high level of similarity of PNH5 to other known neuraminic acid synthases, except for its pH optimum, which was found to be at 5.5 for the novel enzyme. This is the first reported enzyme from this family that prefers an acidic pH value. The effect of different metal cofactors on enzyme activity, i.e. Co2+, Mn2+ and Mg2+, was studied systematically. The kinetics of neuraminic acid synthesis was completely elucidated, and an appropriate kinetic model was proposed. The enzyme stability study revealed that the purified enzyme exhibits changes in its structure over time, as observed by differential light scattering, which cause a drop in its activity and protein concentration. The operational enzyme stability of the neuraminic acid synthase from N. meningitidis is excellent: no activity drop was observed during the batch reactor experiments. In the case of PNH5, some activity drop was observed at higher substrate concentrations. The obtained results present a solid platform for the future application of these enzymes in the synthesis of sialic acids. KEY POINTS: • A novel neuraminic acid synthase was characterized. • The effect of cofactors on NeuS activity was elucidated. • Kinetic and stability characterization of two neuraminic acid synthases was performed.
Affiliation(s)
- Mehmet Mervan Çakar
- University of Zagreb, Faculty of Chemical Engineering and Technology, Trg Marka Marulića 19, 10000, Zagreb, Croatia
| | - Nevena Milčić
- University of Zagreb, Faculty of Chemical Engineering and Technology, Trg Marka Marulića 19, 10000, Zagreb, Croatia
| | | | - Simon Charnock
- Prozomix Limited, Station Court, Haltwhistle, Northumberland, NE49 9HN, UK
| | - Wolf-Dieter Fessner
- Institute of Organic Chemistry and Biochemistry, Technical University of Darmstadt, Peter-Grünberg-Strasse 4, 64287, Darmstadt, Germany
| | - Zvjezdana Findrik Blažević
- University of Zagreb, Faculty of Chemical Engineering and Technology, Trg Marka Marulića 19, 10000, Zagreb, Croatia.
|
16
|
Young VL, McSweeney AM, Edwards MJ, Ward VK. The Disorderly Nature of Caliciviruses. Viruses 2024; 16:1324. [PMID: 39205298 PMCID: PMC11360831 DOI: 10.3390/v16081324] [Received: 06/27/2024] [Revised: 08/07/2024] [Accepted: 08/17/2024] [Indexed: 09/04/2024] Open
Abstract
An intrinsically disordered protein (IDP) or region (IDR) lacks or has little protein structure but still maintains function. This lack of structure creates flexibility and fluidity, allowing multiple protein conformations and potentially transient interactions with more than one partner. Caliciviruses are positive-sense ssRNA viruses with a relatively small genome of 7.6-8.6 kb and a broad host range. Many viral proteins are known to contain IDRs, which benefit smaller viral genomes by expanding the functional proteome through the multifunctional nature of the IDR. The percentage of intrinsically disordered residues within the total proteome for each calicivirus type species can range between 8 and 23%, and IDRs have been experimentally identified in NS1-2, VPg and RdRP proteins. The IDRs within a protein are not well conserved across the genera, and whether this correlates to different activities or increased tolerance to mutations, driving virus adaptation to new selection pressures, is unknown. The function of norovirus NS1-2 has not yet been fully elucidated but includes involvement in host cell tropism, the promotion of viral spread and the suppression of host interferon-λ responses. These functions and the presence of host cell-like linear motifs that interact with host cell caspases and VAPA/B are all found in or affected by the disordered region of norovirus NS1-2. The IDRs of calicivirus VPg are involved in viral transcription and translation, RNA binding, nucleotidylylation and cell cycle arrest, and the N-terminal IDR within the human norovirus RdRP could potentially drive liquid-liquid phase separation. This review identifies and summarises the IDRs of proteins within the Caliciviridae family and their importance during viral replication and subsequent host interactions.
Affiliation(s)
| | | | | | - Vernon K. Ward
- Department of Microbiology & Immunology, School of Biomedical Sciences, University of Otago, P.O. Box 56, Dunedin 9054, New Zealand
|
17
|
Corre MH, Rey B, David SC, Torii S, Chiappe D, Kohn T. The early communication stages between serine proteases and enterovirus capsids in the race for viral disintegration. Commun Biol 2024; 7:969. [PMID: 39122806 PMCID: PMC11316004 DOI: 10.1038/s42003-024-06627-2] [Received: 11/10/2023] [Accepted: 07/24/2024] [Indexed: 08/12/2024] Open
Abstract
Serine proteases are important environmental contributors to enterovirus biocontrol. However, the structural features of molecular interaction accounting for the susceptibility of enteroviruses to proteases remain unexplained. Here, we describe the molecular mechanisms involved in the recruitment of serine proteases to viral capsids. Among the virus types used, coxsackievirus A9 (CVA9), but not CVB5 and echovirus 11 (E11), was inactivated by Subtilisin A in a host-independent manner, while Bovine Pancreatic Trypsin (BPT) only reduced CVA9 infectivity in a host-dependent manner. Predictive interaction models of each protease with capsid protomers indicate the main targets as internal disordered protein (IDP) segments exposed either on the 5-fold vertex (DE loop VP1) or at the 5/2-fold intersection (C-terminal end VP1) of viral capsids. We further show that functional protease/capsid binding depends on both the strength and the evolution over time of protease-VP1 complexes, and lastly on the local adaptation of proteases to surrounding viral regions. Finally, we predicted three residues on the CVA9 capsid that trigger cleavage by Subtilisin A, one of which may act as a sensor residue contributing to enzyme recognition on the DE loop. Overall, this study describes an important biological mechanism involved in enterovirus biocontrol.
Affiliation(s)
- Marie-Hélène Corre
- Laboratory of Environmental Virology, Environmental Engineering Institute (IIE), School of Architecture, Civil and Environmental Engineering (ENAC), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015-CH, Lausanne, Switzerland.
| | - Benjamin Rey
- Laboratory of Environmental Virology, Environmental Engineering Institute (IIE), School of Architecture, Civil and Environmental Engineering (ENAC), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015-CH, Lausanne, Switzerland
| | - Shannon C David
- Laboratory of Environmental Virology, Environmental Engineering Institute (IIE), School of Architecture, Civil and Environmental Engineering (ENAC), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015-CH, Lausanne, Switzerland
| | - Shotaro Torii
- Laboratory of Environmental Virology, Environmental Engineering Institute (IIE), School of Architecture, Civil and Environmental Engineering (ENAC), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015-CH, Lausanne, Switzerland
| | - Diego Chiappe
- Proteomics Core Facility, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015-CH, Lausanne, Switzerland
| | - Tamar Kohn
- Laboratory of Environmental Virology, Environmental Engineering Institute (IIE), School of Architecture, Civil and Environmental Engineering (ENAC), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015-CH, Lausanne, Switzerland
|
18
|
Jang YJ, Qin QQ, Huang SY, Peter ATJ, Ding XM, Kornmann B. Accurate prediction of protein function using statistics-informed graph networks. Nat Commun 2024; 15:6601. [PMID: 39097570 PMCID: PMC11297950 DOI: 10.1038/s41467-024-50955-0] [Received: 05/17/2023] [Accepted: 07/15/2024] [Indexed: 08/05/2024] Open
Abstract
Understanding protein function is pivotal in comprehending the intricate mechanisms that underlie many crucial biological activities, with far-reaching implications in the fields of medicine, biotechnology, and drug development. However, more than 200 million proteins remain uncharacterized, and computational efforts heavily rely on protein structural information to predict annotations of varying quality. Here, we present a method that utilizes statistics-informed graph networks to predict a protein's functions solely from its sequence. Our method inherently characterizes evolutionary signatures, allowing for a quantitative assessment of the significance of residues that carry out specific functions. PhiGnet not only demonstrates superior performance compared to alternative approaches but also narrows the sequence-function gap, even in the absence of structural information. Our findings indicate that applying deep learning to evolutionary data can highlight functional sites at the residue level, providing valuable support for interpreting both existing properties and new functionalities of proteins in research and biomedicine.
Affiliation(s)
- Yaan J Jang
- Department of Biochemistry, University of Oxford, Oxford, UK.
- AmoAi Technologies, Oxford, UK.
| | - Qi-Qi Qin
- AmoAi Technologies, Oxford, UK
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China
| | - Si-Yu Huang
- AmoAi Technologies, Oxford, UK
- Oxford Martin School, University of Oxford, Oxford, UK
- School of Systems Science, Beijing Normal University, Beijing, China
| | | | - Xue-Ming Ding
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China
| | - Benoît Kornmann
- Department of Biochemistry, University of Oxford, Oxford, UK.
|
19
|
Ahdritz G, Bouatta N, Floristean C, Kadyan S, Xia Q, Gerecke W, O'Donnell TJ, Berenberg D, Fisk I, Zanichelli N, Zhang B, Nowaczynski A, Wang B, Stepniewska-Dziubinska MM, Zhang S, Ojewole A, Guney ME, Biderman S, Watkins AM, Ra S, Lorenzo PR, Nivon L, Weitzner B, Ban YEA, Chen S, Zhang M, Li C, Song SL, He Y, Sorger PK, Mostaque E, Zhang Z, Bonneau R, AlQuraishi M. OpenFold: retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization. Nat Methods 2024; 21:1514-1524. [PMID: 38744917 DOI: 10.1038/s41592-024-02272-z] [Received: 08/14/2023] [Accepted: 04/03/2024] [Indexed: 05/16/2024]
Abstract
AlphaFold2 revolutionized structural biology with the ability to predict protein structures with exceptionally high accuracy. Its implementation, however, lacks the code and data required to train new models. These are necessary to (1) tackle new tasks, like protein-ligand complex structure prediction, (2) investigate the process by which the model learns and (3) assess the model's capacity to generalize to unseen regions of fold space. Here we report OpenFold, a fast, memory efficient and trainable implementation of AlphaFold2. We train OpenFold from scratch, matching the accuracy of AlphaFold2. Having established parity, we find that OpenFold is remarkably robust at generalizing even when the size and diversity of its training set is deliberately limited, including near-complete elisions of classes of secondary structure elements. By analyzing intermediate structures produced during training, we also gain insights into the hierarchical manner in which OpenFold learns to fold. In sum, our studies demonstrate the power and utility of OpenFold, which we believe will prove to be a crucial resource for the protein modeling community.
Affiliation(s)
- Gustaf Ahdritz
- Department of Systems Biology, Columbia University, New York, NY, USA
- Harvard University, Cambridge, MA, USA
| | - Nazim Bouatta
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA.
| | | | - Sachin Kadyan
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Qinghui Xia
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - William Gerecke
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
| | | | - Daniel Berenberg
- Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
| | - Ian Fisk
- Flatiron Institute, New York, NY, USA
| | | | - Bo Zhang
- Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, UT, USA
| | | | | | | | | | | | | | - Stella Biderman
- EleutherAI, New York, NY, USA
- Booz Allen Hamilton, McLean, VA, USA
| | | | - Stephen Ra
- Prescient Design, Genentech, New York, NY, USA
| | | | | | | | | | | | - Minjia Zhang
- University of Illinois at Urbana-Champaign, Champaign, IL, USA
| | | | | | | | - Peter K Sorger
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
| | | | - Zhao Zhang
- Rutgers University, New Brunswick, NJ, USA
|
20
|
Mei Z, Wang F, Bhosle A, Dong D, Mehta R, Ghazi A, Zhang Y, Liu Y, Rinott E, Ma S, Rimm EB, Daviglus M, Willett WC, Knight R, Hu FB, Qi Q, Chan AT, Burk RD, Stampfer MJ, Shai I, Kaplan RC, Huttenhower C, Wang DD. Strain-specific gut microbial signatures in type 2 diabetes identified in a cross-cohort analysis of 8,117 metagenomes. Nat Med 2024; 30:2265-2276. [PMID: 38918632 DOI: 10.1038/s41591-024-03067-7] [Received: 12/01/2023] [Accepted: 05/14/2024] [Indexed: 06/27/2024]
Abstract
The association of gut microbial features with type 2 diabetes (T2D) has been inconsistent due in part to the complexity of this disease and variation in study design. Even in cases in which individual microbial species have been associated with T2D, it has not been possible to attribute mechanisms to these associations based on specific microbial strains. We conducted a comprehensive study of the T2D microbiome, analyzing 8,117 shotgun metagenomes from 10 cohorts of individuals with T2D, prediabetes, and normoglycemic status in the United States, Europe, Israel and China. Dysbiosis in 19 phylogenetically diverse species was associated with T2D (false discovery rate < 0.10), for example, enriched Clostridium bolteae and depleted Butyrivibrio crossotus. These microorganisms also contributed to community-level functional changes potentially underlying T2D pathogenesis, for example, perturbations in glucose metabolism. Our study identifies within-species phylogenetic diversity for strains of 27 species that explain inter-individual differences in T2D risk, such as Eubacterium rectale. In some cases, these were explained by strain-specific gene carriage, including loci involved in various mechanisms of horizontal gene transfer and novel biological processes underlying metabolic risk, for example, quorum sensing. In summary, our study provides robust cross-cohort microbial signatures in a strain-resolved manner and offers new mechanistic insights into T2D.
Affiliation(s)
- Zhendong Mei
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Fenglei Wang
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Amrisha Bhosle
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Danyue Dong
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Raaj Mehta
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
| | - Andrew Ghazi
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Yancong Zhang
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Yuxi Liu
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Ehud Rinott
- Department of Medicine, Hebrew University and Hadassah Medical Center, Jerusalem, Israel
| | - Siyuan Ma
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Eric B Rimm
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Martha Daviglus
- Institute for Minority Health Research, University of Illinois Chicago, Chicago, IL, USA
| | - Walter C Willett
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Rob Knight
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
| | - Frank B Hu
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Qibin Qi
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Andrew T Chan
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Robert D Burk
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
- Department of Pediatrics, Albert Einstein College of Medicine, Bronx, NY, USA
- Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, NY, USA
- Department of Obstetrics, Gynecology and Women's Health, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Meir J Stampfer
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Iris Shai
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Faculty of Health Sciences, The Health and Nutrition Innovative International Research Center, Ben-Gurion University of the Negev, Be'er Sheva, Israel
| | - Robert C Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Curtis Huttenhower
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Harvard Chan Microbiome in Public Health Center, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
| | - Dong D Wang
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
|
21
|
Sürmeli Y, Durmuş N, Şanlı-Mohamed G. Exploring the Structural Insights of Thermostable Geobacillus esterases by Computational Characterization. ACS OMEGA 2024; 9:32931-32941. [PMID: 39100300 PMCID: PMC11292637 DOI: 10.1021/acsomega.4c03818] [Received: 04/21/2024] [Revised: 07/10/2024] [Accepted: 07/15/2024] [Indexed: 08/06/2024]
Abstract
This study conducted an in silico analysis of two biochemically characterized thermostable esterases, Est2 and Est3, from Geobacillus strains. To achieve this, the amino acid sequences of Est2 and Est3 were examined to assess their biophysicochemical properties, evolutionary connections, and sequence similarities. Three-dimensional models were constructed and validated through diverse bioinformatics tools. Molecular dynamics (MD) simulation was employed on a pNP-C2 ligand to explore interactions between the enzymes and the ligand. Biophysicochemical property analysis indicated that the aliphatic indices and theoretical Tm values of the enzymes were 82-83 and 55-65 °C, respectively. Molecular phylogeny placed Est2 and Est3 within Family XIII, alongside other Geobacillus esterases. DeepMSA2 revealed that Est2, Est3, and homologous sequences shared 12 conserved residues in their core domain (L39, D50, G53, G55, S57, G92, S94, G96, P108, P184, D193, and H223). BANΔIT analysis indicated that Est2 and Est3 had a significantly more rigid cap domain compared to Est30. Salt bridge analysis revealed that E150-R136, E124-K165, E137-R141, and E154-K157 salt bridges made Est2 and Est3 more stable compared to Est30. MD simulation indicated that Est3 exhibited greater fluctuations in the N-terminal region including conserved F25, cap domain, and C-terminal region, notably including H223, suggesting that these regions might influence esterase catalysis. The common residues in the ligand-binding sites of Est2-Est3 were determined as F25 and L167. The analysis of root mean square fluctuation (RMSF) revealed that region 1, encompassing F25 within the β2-α1 loop of Est3, exhibited higher fluctuations compared to those of Est2. Overall, this study might provide valuable insights for future investigations aimed at improving esterase thermostability and catalytic efficiency, critical industrial traits, through targeted amino acid modifications within the N-terminal region, cap domain, and C-terminal region using rational protein engineering techniques.
Affiliation(s)
- Yusuf Sürmeli
- Department of Agricultural Biotechnology, Tekirdağ Namık Kemal University, 59030 Tekirdağ, Turkey
| | - Naciye Durmuş
- Department of Molecular Biology and Genetics, İstanbul Technical University, 34485 İstanbul, Turkey
| | | |
|
22
|
Zhang H, Lan J, Wang H, Lu R, Zhang N, He X, Yang J, Chen L. AlphaFold2 in biomedical research: facilitating the development of diagnostic strategies for disease. Front Mol Biosci 2024; 11:1414916. [PMID: 39139810 PMCID: PMC11319189 DOI: 10.3389/fmolb.2024.1414916] [Received: 04/09/2024] [Accepted: 07/15/2024] [Indexed: 08/15/2024]
Abstract
Proteins, as the primary executors of physiological activity, serve as a key factor in disease diagnosis and treatment. Research into their structures, functions, and interactions is essential to better understand disease mechanisms and potential therapies. DeepMind's AlphaFold2, a deep-learning protein structure prediction model, has proven to be remarkably accurate, and it is widely employed in various aspects of diagnostic research, such as the study of disease biomarkers, microorganism pathogenicity, antigen-antibody structures, and missense mutations. Thus, AlphaFold2 serves as an exceptional tool to bridge fundamental protein research with breakthroughs in disease diagnosis, developments in diagnostic strategies, and the design of novel therapeutic approaches and enhancements in precision medicine. This review outlines the architecture, highlights, and limitations of AlphaFold2, placing particular emphasis on its applications within diagnostic research grounded in disciplines such as immunology, biochemistry, molecular biology, and microbiology.
Affiliation(s)
- Hong Zhang
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
| | - Jiajing Lan
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
| | - Huijie Wang
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
| | - Ruijie Lu
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
| | - Nanqi Zhang
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
| | - Xiaobai He
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
- Key Laboratory of Biomarkers and In Vitro Diagnosis Translation of Zhejiang Province, Hangzhou, China
| | - Jun Yang
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
| | - Linjie Chen
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
- Zhejiang Engineering Research Centre for Key Technology of Diagnostic Testing, Hangzhou, China
| |
|
23
|
Fenster JA, Azzinaro PA, Dinhobl M, Borca MV, Spinard E, Gladue DP. African Swine Fever Virus Protein-Protein Interaction Prediction. Viruses 2024; 16:1170. [PMID: 39066332 PMCID: PMC11281715 DOI: 10.3390/v16071170] [Received: 06/04/2024] [Revised: 07/05/2024] [Accepted: 07/12/2024] [Indexed: 07/28/2024]
Abstract
The African swine fever virus (ASFV) causes an often deadly disease in swine and poses a threat to swine livestock and swine producers. The virus has a complex genome containing more than 150 coding regions, and developing effective vaccines remains a challenge because basic knowledge is lacking about viral protein function and about protein-protein interactions, both between viral proteins and between viral and host proteins. In this work, we identified ASFV-ASFV protein-protein interactions (PPIs) using artificial intelligence-powered protein structure prediction tools. We benchmarked our PPI identification workflow on the Vaccinia virus, a widely studied nucleocytoplasmic large DNA virus, and found that, in a genome-wide computational screen, it could identify gold-standard PPIs that have been validated in vitro. We applied this workflow to more than 18,000 pairwise combinations of ASFV proteins and identified seventeen novel PPIs, many of which are corroborated by experimental or bioinformatic evidence, further validating their relevance. Two of these interactions, between I267L and I8L (I267L__I8L) and between B175L and DP79L (B175L__DP79L), are novel PPIs involving viral proteins known to modulate the host immune response.
Affiliation(s)
- Jacob A. Fenster
- Oak Ridge Institute for Science and Education (ORISE), Oak Ridge, TN 37830, USA
- Plum Island Animal Disease Center, Foreign Animal Disease Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Orient, NY 11957, USA
- National Bio and Agro-Defense Facility, Foreign Animal Disease Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Manhattan, KS 66502, USA
| | - Paul A. Azzinaro
- Plum Island Animal Disease Center, Foreign Animal Disease Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Orient, NY 11957, USA
- National Bio and Agro-Defense Facility, Foreign Animal Disease Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Manhattan, KS 66502, USA
| | - Mark Dinhobl
- Plum Island Animal Disease Center, Foreign Animal Disease Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Orient, NY 11957, USA
- National Bio and Agro-Defense Facility, Foreign Animal Disease Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Manhattan, KS 66502, USA
| | - Manuel V. Borca
- Plum Island Animal Disease Center, Foreign Animal Disease Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Orient, NY 11957, USA
- National Bio and Agro-Defense Facility, Foreign Animal Disease Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Manhattan, KS 66502, USA
| | - Edward Spinard
- Plum Island Animal Disease Center, Foreign Animal Disease Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Orient, NY 11957, USA
- National Bio and Agro-Defense Facility, Foreign Animal Disease Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Manhattan, KS 66502, USA
| | - Douglas P. Gladue
- Plum Island Animal Disease Center, Foreign Animal Disease Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Orient, NY 11957, USA
- National Bio and Agro-Defense Facility, Foreign Animal Disease Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Manhattan, KS 66502, USA
| |
|
24
|
Cuturello F, Celoria M, Ansuini A, Cazzaniga A. Enhancing predictions of protein stability changes induced by single mutations using MSA-based Language Models. Bioinformatics 2024; 40:btae447. [PMID: 39012369 PMCID: PMC11269464 DOI: 10.1093/bioinformatics/btae447] [Received: 04/16/2024] [Revised: 06/19/2024] [Accepted: 07/10/2024] [Indexed: 07/17/2024]
Abstract
MOTIVATION: Protein Language Models offer a new perspective for addressing challenges in structural biology, while relying solely on sequence information. Recent studies have investigated their effectiveness in forecasting shifts in thermodynamic stability caused by single amino acid mutations, a task known for its complexity due to the sparse availability of data, constrained by experimental limitations. To tackle this problem, we introduce two key novelties: leveraging a Protein Language Model that incorporates Multiple Sequence Alignments to capture evolutionary information, and using a recently released mega-scale dataset with rigorous data pre-processing to mitigate overfitting.
RESULTS: We ensure comprehensive comparisons by fine-tuning various pre-trained models, taking advantage of analyses such as ablation studies and baselines evaluation. Our methodology introduces a stringent policy to reduce the widespread issue of data leakage, rigorously removing sequences from the training set when they exhibit significant similarity with the test set. The MSA Transformer emerges as the most accurate among the models under investigation, given its capability to leverage co-evolution signals encoded in aligned homologous sequences. Moreover, the optimized MSA Transformer outperforms existing methods and exhibits enhanced generalization power, leading to a notable improvement in predicting changes in protein stability resulting from point mutations.
AVAILABILITY AND IMPLEMENTATION: Code and data at https://github.com/RitAreaSciencePark/PLM4Muts.
SUPPLEMENTARY INFORMATION: Supplementary Information is available at Bioinformatics online.
Affiliation(s)
- Francesca Cuturello
- Research and Technology Institute, AREA Science Park, Trieste 34149, Italy
| | - Marco Celoria
- Research and Technology Institute, AREA Science Park, Trieste 34149, Italy
- HPC Department, CINECA National Supercomputing Center, Bologna 40033, Italy
| | - Alessio Ansuini
- Research and Technology Institute, AREA Science Park, Trieste 34149, Italy
| | - Alberto Cazzaniga
- Research and Technology Institute, AREA Science Park, Trieste 34149, Italy
| |
|
25
|
Launay R, Chobert SC, Abby SS, Pierrel F, André I, Esque J. Structural Reconstruction of E. coli Ubi Metabolon Using an AlphaFold2-Based Computational Framework. J Chem Inf Model 2024; 64:5175-5193. [PMID: 38710096 DOI: 10.1021/acs.jcim.4c00304] [Indexed: 05/08/2024]
Abstract
Ubiquinone (UQ) is a redox polyisoprenoid lipid found in the membranes of bacteria and eukaryotes that has important roles, notably in respiratory metabolism, which sustains cellular bioenergetics. In Escherichia coli, several steps of UQ biosynthesis take place in the cytosol. These reactions involve a supramolecular assembly called the Ubi metabolon. The assembly is composed of seven proteins (UbiE, UbiG, UbiF, UbiH, UbiI, UbiJ, and UbiK), and both its structural organization and its protein stoichiometry are unknown. In this study, a computational framework has been designed to predict the structure of this macromolecular assembly. In several successive steps, we explored the possible protein interactions as well as the protein stoichiometry, to finally obtain a structural organization of the complex. The use of AlphaFold2-based methods combined with evolutionary information enabled us to predict several models whose quality and confidence were further analyzed using different metrics and scores. Our work led to the identification of a "core assembly" that will guide functional and structural characterization of the Ubi metabolon.
Affiliation(s)
- Romain Launay
- Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
| | - Sophie-Carole Chobert
- Univ. Grenoble Alpes, CNRS, UMR 5525, VetAgro Sup, Grenoble INP, TIMC, 38000 Grenoble, France
| | - Sophie S Abby
- Univ. Grenoble Alpes, CNRS, UMR 5525, VetAgro Sup, Grenoble INP, TIMC, 38000 Grenoble, France
| | - Fabien Pierrel
- Univ. Grenoble Alpes, CNRS, UMR 5525, VetAgro Sup, Grenoble INP, TIMC, 38000 Grenoble, France
| | - Isabelle André
- Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
| | - Jérémy Esque
- Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
| |
|
26
|
Norn C, Oliveira F, André I. Improved prediction of site-rates from structure with averaging across homologs. Protein Sci 2024; 33:e5086. [PMID: 38923241 PMCID: PMC11196898 DOI: 10.1002/pro.5086] [Received: 02/27/2024] [Revised: 05/12/2024] [Accepted: 06/04/2024] [Indexed: 06/28/2024]
Abstract
Variation in mutation rates at sites in proteins can largely be understood by the constraint that proteins must fold into stable structures. Models that calculate site-specific rates based on protein structure and a thermodynamic stability model have shown a significant but modest ability to predict empirical site-specific rates calculated from sequence. Models that use detailed atomistic models of protein energetics do not outperform simpler approaches using packing density. We demonstrate that a fundamental reason for this is that empirical site-specific rates are the result of the average effect of many different microenvironments in a phylogeny. By analyzing the results of evolutionary dynamics simulations, we show how averaging site-specific rates across many extant protein structures can lead to correct recovery of site-rate prediction. This result is also demonstrated in natural protein sequences and experimental structures. Using predicted structures, we demonstrate that atomistic models can improve upon contact density metrics in predicting site-specific rates from a structure. The results give fundamental insights into the factors governing the distribution of site-specific rates in protein families.
Affiliation(s)
- Christoffer Norn
- Department of Biochemistry and Structural Biology, Lund University, Lund, Sweden
- Bioinnovation Institute Foundation, København, Denmark
| | - Fábio Oliveira
- Department of Biochemistry and Structural Biology, Lund University, Lund, Sweden
| | - Ingemar André
- Department of Biochemistry and Structural Biology, Lund University, Lund, Sweden
| |
|
27
|
Tütüncü HE, Durmuş N, Sürmeli Y. Unraveling the potential of uninvestigated thermoalkaliphilic lipases by molecular docking and molecular dynamic simulation: an in silico characterization study. 3 Biotech 2024; 14:179. [PMID: 38882640 PMCID: PMC11176153 DOI: 10.1007/s13205-024-04023-5] [Received: 11/17/2023] [Accepted: 05/29/2024] [Indexed: 06/18/2024]
Abstract
Thermoalkaliphilic lipase enzymes are mostly favored for use in the detergent industry. While there has been considerable research on Geobacillus lipases, a significant portion of these enzymes remains unexplored or undocumented in the scientific literature. This work performed in silico phylogeny, sequence alignment, structural, and enzyme-substrate interaction analyses of five thermoalkaliphilic lipases belonging to different Geobacillus species (Geobacillus stearothermophilus lipase = GsLip, Geobacillus sp. B4113_201601 lipase = Gb4Lip, Geobacillus kaustophilus HTA426 lipase = GkLip, Geobacillus sp. SP22 lipase = GspLip, Geobacillus sp. NTU 03 lipase = GntLip). For this purpose, unreviewed enzyme sequences of the five Geobacillus thermoalkaliphilic lipases were analyzed at the sequence and phylogeny levels. 3D homology enzyme models were built, validated, and investigated with different bioinformatics tools. Ligand interactions were screened using seven para-nitrophenyl (pNP) esters, and enzyme-ligand interactions were analyzed for Gb4Lip:pNP-C12 and BTL2:pNP-C12 by MD simulation. Biophysicochemical characteristic analysis showed that Gb4Lip had a theoretical Tm value above 65 °C and a higher aliphatic index, indicating greater thermal stability. Sequence alignment showed a hydrophilic threonine in the α6 helix of Gb4Lip, indicating high enzymatic activity. A normalized temperature factor B (B'-factor) analysis showed that the lid domains of the five lipases had significantly lower B'-factor values than G. thermocatenulatus lipase 2 (BTL2), indicating higher rigidity. Molecular docking results indicated that the five lipases had the highest binding affinity toward pNP-C12. The RMSF investigation revealed that the thermostability of Gb4Lip is influenced by specific molecular elements: D202-S203 within the αB region of the lid domain, E274-Q275 within the b3 strand, W278 in the b3-b4 loop, and H282 in the b4 strand of the Ca2+-binding region. MD simulation analysis showed that catalytic residue S114 and at least one oxyanion hole residue (F17 and/or Q114) in Gb4Lip frequently formed hydrogen bonds with the pNP-C12 ligand at 343 K and 348 K throughout the simulation, indicating that Gb4Lip might catalyze the relatively long-chain ligand pNP-C12 with high performance. In conclusion, Gb4Lip might be a more suitable candidate as a detergent additive. In addition, this investigation can offer valuable perspectives on Family I.5 lipases such as Gb4Lip for future exploration in the field of protein engineering. Supplementary Information: The online version contains supplementary material available at 10.1007/s13205-024-04023-5.
Affiliation(s)
- Havva Esra Tütüncü
- Department of Nutrition and Dietetics, Malatya Turgut Özal University, 44210 Malatya, Turkey
| | - Naciye Durmuş
- Department of Molecular Biology and Genetics, İstanbul Technical University, 34485 Istanbul, Turkey
| | - Yusuf Sürmeli
- Department of Agricultural Biotechnology, Tekirdağ Namık Kemal University, 59030 Tekirdağ, Turkey
| |
|
28
|
Bryant P, Noé F. Improved protein complex prediction with AlphaFold-multimer by denoising the MSA profile. PLoS Comput Biol 2024; 20:e1012253. [PMID: 39052676 DOI: 10.1371/journal.pcbi.1012253] [Received: 10/20/2023] [Revised: 08/06/2024] [Accepted: 06/14/2024] [Indexed: 07/27/2024]
Abstract
Structure prediction of protein complexes has improved significantly with AlphaFold2 and AlphaFold-multimer (AFM), but only 60% of dimers are accurately predicted. Here, we learn a bias to the MSA representation that improves the predictions by performing gradient descent through the AFM network. We demonstrate the performance on seven difficult targets from CASP15 and increase the average MMscore to 0.76 compared to 0.63 with AFM. We evaluate the procedure on 487 protein complexes where AFM fails and obtain an increased success rate (MMscore>0.75) of 33% on these difficult targets. Our protocol, AFProfile, provides a way to direct predictions towards a defined target function guided by the MSA. We expect gradient descent over the MSA to be useful for different tasks.
Collapse
Affiliation(s)
- Patrick Bryant
- Department of Mathematics and Informatics, Freie Universität Berlin, Germany
- The Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Science for Life Laboratory, Solna, Sweden
| | - Frank Noé
- Department of Mathematics and Informatics, Freie Universität Berlin, Germany
- Microsoft Research AI4Science, Berlin, Germany
| |
|
29
|
Si Y, Zou J, Gao Y, Chuai G, Liu Q, Chen L. Foundation models in molecular biology. Biophysics Reports 2024; 10:135-151. [PMID: 39027316 PMCID: PMC11252241 DOI: 10.52601/bpr.2024.240006] [Received: 01/23/2024] [Accepted: 03/04/2024] [Indexed: 07/20/2024]
Abstract
Determining correlations between molecules at various levels is an important topic in molecular biology. Large language models have demonstrated a remarkable ability to capture correlations from large amounts of data in natural language processing as well as image generation. Because the correlations captured in this way can also be applied to a wide range of specific tasks, large language models are also referred to as foundation models. The massive amount of data that exists in molecular biology provides an excellent basis for developing foundation models, and their recent emergence in the field has pushed the entire field forward. We summarize the foundation models developed based on RNA sequence data, DNA sequence data, protein sequence data, single-cell transcriptome data, and spatial transcriptome data, respectively, and further discuss research directions for the development of foundation models in molecular biology.
Affiliation(s)
- Yunda Si
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
| | - Jiawei Zou
- Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai 200031, China
| | - Yicheng Gao
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai 201804, China
| | - Guohui Chuai
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai 201804, China
| | - Qi Liu
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai 201804, China
| | - Luonan Chen
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
- Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai 200031, China
| |
|
30
|
Sorgenfrei FA, Sloan JJ, Weissensteiner F, Zechner M, Mehner NA, Ellinghaus TL, Schachtschabel D, Seemayer S, Kroutil W. Solvent concentration at 50% protein unfolding may reform enzyme stability ranking and process window identification. Nat Commun 2024; 15:5420. [PMID: 38926341 PMCID: PMC11208486 DOI: 10.1038/s41467-024-49774-0] [Received: 07/19/2023] [Accepted: 06/19/2024] [Indexed: 06/28/2024]
Abstract
Because water-miscible organic co-solvents are often required in enzyme reactions, e.g., to improve the solubility of the substrate in the aqueous medium, an enzyme is needed that displays high stability in the presence of the co-solvent. Consequently, it is of utmost importance to identify the most suitable enzyme or the appropriate reaction conditions. Until now, the melting temperature has generally been used as a measure of enzyme stability. The experiments here show that the melting temperature does not correlate with the activity observed in the presence of the solvent. As an alternative parameter, the concentration of the co-solvent at the point of 50% protein unfolding at a specific temperature T, in short $c_{U50}^{T}$, is introduced. In an analysis of a set of ene reductases, $c_{U50}^{T}$ is shown to indicate the co-solvent concentration at which the activity of the enzyme also drops fastest. Comparing possible rankings of enzymes according to melting temperature and $c_{U50}^{T}$ reveals a clearly diverging outcome that also depends on the specific solvent used. Additionally, plots of $c_{U50}$ versus temperature enable fast identification of possible reaction windows to deduce tolerated solvent concentrations and temperatures.
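As an illustration of the parameter described in this abstract (the co-solvent concentration at 50% protein unfolding at a fixed temperature), the following minimal sketch estimates that midpoint from a sampled unfolding curve. It is not the authors' procedure: the function name `c_u50`, the input format, and the use of linear interpolation between neighbouring measurements are all assumptions made for this example, and the sketch assumes the unfolded fraction rises with co-solvent concentration.

```python
def c_u50(concentrations, fraction_unfolded):
    """Estimate the co-solvent concentration at 50% unfolding.

    Hypothetical helper: takes co-solvent concentrations (e.g. in % v/v)
    and the matching fractions of unfolded protein (0.0 to 1.0), all
    measured at one temperature, and linearly interpolates the point
    where the fraction crosses 0.5. Returns None if no crossing exists.
    """
    # Sort the measurements by concentration so neighbours are adjacent.
    points = sorted(zip(concentrations, fraction_unfolded))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        # Look for the first interval in which the curve rises through 0.5.
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            return c_lo + (0.5 - f_lo) * (c_hi - c_lo) / (f_hi - f_lo)
    return None


if __name__ == "__main__":
    # Made-up illustrative data, not values from the study.
    conc = [0, 10, 20, 30]
    unfolded = [0.0, 0.2, 0.8, 1.0]
    print(c_u50(conc, unfolded))
```

Repeating this estimate at several temperatures would give the kind of midpoint-versus-temperature plot the abstract mentions; a fitted sigmoidal unfolding model could replace the linear interpolation if the curve is sparsely sampled.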
Affiliation(s)
- Frieda A Sorgenfrei
- Austrian Centre of Industrial Biotechnology c/o University of Graz, Heinrichstrasse 28, 8010, Graz, Austria
| | - Jeremy J Sloan
- BASF SE, Carl-Bosch-Strasse 38, 67056, Ludwigshafen, Germany
| | - Florian Weissensteiner
- Austrian Centre of Industrial Biotechnology c/o University of Graz, Heinrichstrasse 28, 8010, Graz, Austria
- Department of Chemistry, University of Graz, NAWI Graz, Heinrichstrasse 28, 8010, Graz, Austria
| | - Marco Zechner
- Austrian Centre of Industrial Biotechnology c/o University of Graz, Heinrichstrasse 28, 8010, Graz, Austria
| | - Niklas A Mehner
- BASF SE, Carl-Bosch-Strasse 38, 67056, Ludwigshafen, Germany
| | | | | | - Stefan Seemayer
- BASF SE, Carl-Bosch-Strasse 38, 67056, Ludwigshafen, Germany
| | - Wolfgang Kroutil
- Austrian Centre of Industrial Biotechnology c/o University of Graz, Heinrichstrasse 28, 8010, Graz, Austria
- Department of Chemistry, University of Graz, NAWI Graz, Heinrichstrasse 28, 8010, Graz, Austria
- BioTechMed Graz, 8010, Graz, Austria
- Field of Excellence BioHealth, University of Graz, 8010, Graz, Austria
| |
|
31
|
Nicolas Y, Bret H, Cannavo E, Acharya A, Cejka P, Borde V, Guerois R. Molecular insights into the activation of Mre11-Rad50 endonuclease activity by Sae2/CtIP. Mol Cell 2024; 84:2223-2237.e4. [PMID: 38870937 DOI: 10.1016/j.molcel.2024.05.019] [Received: 07/20/2023] [Revised: 02/25/2024] [Accepted: 05/20/2024] [Indexed: 06/15/2024]
Abstract
In Saccharomyces cerevisiae (S. cerevisiae), Mre11-Rad50-Xrs2 (MRX)-Sae2 nuclease activity is required for the resection of DNA breaks with secondary structures or protein blocks, while in humans, the MRE11-RAD50-NBS1 (MRN) homolog with CtIP is needed to initiate DNA end resection of all breaks. Phosphorylated Sae2/CtIP stimulates the endonuclease activity of MRX/N. Structural insights into the activation of the Mre11 nuclease are available only for organisms lacking Sae2/CtIP, so little is known about how Sae2/CtIP activates the nuclease ensemble. Here, we uncover the mechanism of Mre11 activation by Sae2 using a combination of AlphaFold2 structural modeling and biochemical and genetic assays. We show that Sae2 stabilizes the Mre11 nuclease in a conformation poised to cleave substrate DNA. Several designs of compensatory mutations establish how Sae2 activates MRX in vitro and in vivo, supporting the structural model. Finally, our study uncovers how human CtIP, despite considerable sequence divergence, employs a similar mechanism to activate MRN.
Affiliation(s)
- Yoann Nicolas
- Institut Curie, PSL University, Sorbonne Université, CNRS UMR3244, Dynamics of Genetic Information, 75005 Paris, France
| | - Hélène Bret
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Elda Cannavo
- Institute for Research in Biomedicine, Università della Svizzera italiana (USI), Faculty of Biomedical Sciences, Bellinzona 6500, Switzerland
| | - Ananya Acharya
- Institute for Research in Biomedicine, Università della Svizzera italiana (USI), Faculty of Biomedical Sciences, Bellinzona 6500, Switzerland
| | - Petr Cejka
- Institute for Research in Biomedicine, Università della Svizzera italiana (USI), Faculty of Biomedical Sciences, Bellinzona 6500, Switzerland
| | - Valérie Borde
- Institut Curie, PSL University, Sorbonne Université, CNRS UMR3244, Dynamics of Genetic Information, 75005 Paris, France
| | - Raphaël Guerois
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| |
|
32
|
Yañez AJ, Barrientos CA, Isla A, Aguilar M, Flores-Martin SN, Yuivar Y, Ojeda A, Ibieta P, Hernández M, Figueroa J, Avendaño-Herrera R, Mancilla M. Discovery and Characterization of the ddx41 Gene in Atlantic Salmon: Evolutionary Implications, Structural Functions, and Innate Immune Responses to Piscirickettsia salmonis and Renibacterium salmoninarum Infections. Int J Mol Sci 2024; 25:6346. [PMID: 38928053 PMCID: PMC11204154 DOI: 10.3390/ijms25126346] [Received: 04/17/2024] [Revised: 05/31/2024] [Accepted: 06/03/2024] [Indexed: 06/28/2024]
Abstract
The innate immune response in Salmo salar, mediated by pattern recognition receptors (PRRs), is crucial for defending against pathogens. This study examined DDX41 protein functions as a cytosolic/nuclear sensor for cyclic dinucleotides, RNA, and DNA from invasive intracellular bacteria. The investigation determined the existence, conservation, and functional expression of the ddx41 gene in S. salar. In silico predictions and experimental validations identified a single ddx41 gene on chromosome 5 in S. salar, showing 83.92% homology with its human counterpart. Transcriptomic analysis in salmon head kidney confirmed gene transcriptional integrity. Proteomic identification through mass spectrometry characterized three unique peptides with 99.99% statistical confidence. Phylogenetic analysis demonstrated significant evolutionary conservation across species. Functional gene expression analysis in SHK-1 cells infected by Piscirickettsia salmonis and Renibacterium salmoninarum indicated significant upregulation of DDX41, correlated with increased proinflammatory cytokine levels and activation of irf3 and interferon signaling pathways. In vivo studies corroborated DDX41 activation in immune responses, particularly when S. salar was challenged with P. salmonis, underscoring its potential in enhancing disease resistance. This is the first study to identify the DDX41 pathway as a key component in S. salar innate immune response to invading pathogens, establishing a basis for future research in salmonid disease resistance.
Affiliation(s)
- Alejandro J. Yañez
- Laboratorio de Diagnóstico y Terapia, Facultad de Ciencias, Universidad Austral de Chile, Valdivia 5090000, Chile
- Interdisciplinary Center for Aquaculture Research (INCAR), Concepción 4030000, Chile
| | - Claudia A. Barrientos
- Laboratorio de Diagnóstico y Terapia, Facultad de Ciencias, Universidad Austral de Chile, Valdivia 5090000, Chile
| | - Adolfo Isla
- Laboratorio de Diagnóstico y Terapia, Facultad de Ciencias, Universidad Austral de Chile, Valdivia 5090000, Chile
- Interdisciplinary Center for Aquaculture Research (INCAR), Concepción 4030000, Chile
- Departamento de Ciencias Básicas, Facultad de Ciencias, Universidad Santo Tomas, Valdivia 5090000, Chile
| | - Marcelo Aguilar
- Laboratorio de Diagnóstico y Terapia, Facultad de Ciencias, Universidad Austral de Chile, Valdivia 5090000, Chile
| | - Sandra N. Flores-Martin
- Laboratorio de Diagnóstico y Terapia, Facultad de Ciencias, Universidad Austral de Chile, Valdivia 5090000, Chile
| | - Yassef Yuivar
- ADL Diagnostic Chile, Sector la Vara, Puerto Montt 5480000, Chile
| | - Adriana Ojeda
- ADL Diagnostic Chile, Sector la Vara, Puerto Montt 5480000, Chile
| | - Pablo Ibieta
- TEKBios Ltda, Camino Pargua Km 8, Maullín 5580000, Chile
| | - Mauricio Hernández
- Division of Biotechnology, MELISA Institute, San Pedro de la Paz 4133515, Chile
| | - Jaime Figueroa
- Interdisciplinary Center for Aquaculture Research (INCAR), Concepción 4030000, Chile
- Laboratorio de Biología Molecular de Peces, Instituto de Bioquímica y Microbiología, Facultad de Ciencias, Universidad Austral de Chile, Valdivia 5090000, Chile
| | - Rubén Avendaño-Herrera
- Interdisciplinary Center for Aquaculture Research (INCAR), Concepción 4030000, Chile
- Laboratorio de Patología de Organismos Acuáticos y Biotecnología Acuícola, Facultad de Ciencias de la Vida, Universidad Andrés Bello, Viña del Mar 2520000, Chile
| | - Marcos Mancilla
- ADL Diagnostic Chile, Sector la Vara, Puerto Montt 5480000, Chile
33
Ding X, Chen X, Sullivan EE, Shay TF, Gradinaru V. Fast, accurate ranking of engineered proteins by target-binding propensity using structure modeling. Mol Ther 2024; 32:1687-1700. [PMID: 38582966 PMCID: PMC11184338 DOI: 10.1016/j.ymthe.2024.04.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 02/08/2024] [Accepted: 04/03/2024] [Indexed: 04/08/2024] Open
Abstract
Deep-learning-based methods for protein structure prediction have achieved unprecedented accuracy, yet their utility in the engineering of protein-based binders remains constrained due to a gap between the ability to predict the structures of candidate proteins and the ability to prioritize proteins by their potential to bind to a target. To bridge this gap, we introduce Automated Pairwise Peptide-Receptor Analysis for Screening Engineered proteins (APPRAISE), a method for predicting the target-binding propensity of engineered proteins. After generating structural models of engineered proteins competing for binding to a target using an established structure prediction tool such as AlphaFold-Multimer or ESMFold, APPRAISE performs a rapid (under 1 CPU second per model) scoring analysis that takes into account biophysical and geometrical constraints. As proof-of-concept cases, we demonstrate that APPRAISE can accurately classify receptor-dependent vs. receptor-independent adeno-associated viral vectors and diverse classes of engineered proteins such as miniproteins targeting the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike, nanobodies targeting a G-protein-coupled receptor, and peptides that specifically bind to transferrin receptor or programmed death-ligand 1 (PD-L1). APPRAISE is accessible through a web-based notebook interface using Google Colaboratory (https://tiny.cc/APPRAISE). With its accuracy, interpretability, and generalizability, APPRAISE promises to expand the utility of protein structure prediction and accelerate protein engineering for biomedical applications.
Affiliation(s)
- Xiaozhe Ding
- Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA.
| | - Xinhong Chen
- Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA
| | - Erin E Sullivan
- Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA
| | - Timothy F Shay
- Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA
| | - Viviana Gradinaru
- Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA.
34
Zhang B, Hou Z, Yang Y, Wong KC, Zhu H, Li X. SOFB is a comprehensive ensemble deep learning approach for elucidating and characterizing protein-nucleic-acid-binding residues. Commun Biol 2024; 7:679. [PMID: 38830995 PMCID: PMC11148103 DOI: 10.1038/s42003-024-06332-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Accepted: 05/15/2024] [Indexed: 06/05/2024] Open
Abstract
Proteins and nucleic-acids are essential components of living organisms that interact in critical cellular processes. Accurate prediction of nucleic acid-binding residues in proteins can contribute to a better understanding of protein function. However, the discrepancy between protein sequence information and obtained structural and functional data renders most current computational models ineffective. Therefore, it is vital to design computational models based on protein sequence information to identify nucleic acid binding sites in proteins. Here, we implement an ensemble deep learning model-based nucleic-acid-binding residues on proteins identification method, called SOFB, which characterizes protein sequences by learning the semantics of biological dynamics contexts, and then develop an ensemble deep learning-based sequence network to learn feature representation and classification by explicitly modeling dynamic semantic information. Among them, the language learning model, which is constructed from natural language to biological language, captures the underlying relationships of protein sequences, and the ensemble deep learning-based sequence network consisting of different convolutional layers together with Bi-LSTM refines various features for optimal performance. Meanwhile, to address the imbalanced issue, we adopt ensemble learning to train multiple models and then incorporate them. Our experimental results on several DNA/RNA nucleic-acid-binding residue datasets demonstrate that our proposed model outperforms other state-of-the-art methods. In addition, we conduct an interpretability analysis of the identified nucleic acid binding residue sequences based on the attention weights of the language learning model, revealing novel insights into the dynamic semantic information that supports the identified nucleic acid binding residues. 
SOFB is available at https://github.com/Encryptional/SOFB and https://figshare.com/articles/online_resource/SOFB_figshare_rar/25499452 .
Affiliation(s)
- Bin Zhang
- School of Artificial Intelligence, Jilin University, Changchun, China
| | - Zilong Hou
- School of Artificial Intelligence, Jilin University, Changchun, China
| | - Yuning Yang
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
| | - Ka-Chun Wong
- Department of Computer Science, City University of Hong Kong, Hong Kong, Hong Kong SAR
| | - Haoran Zhu
- School of Artificial Intelligence, Jilin University, Changchun, China.
| | - Xiangtao Li
- School of Artificial Intelligence, Jilin University, Changchun, China.
35
Gaschignard G, Millet M, Bruley A, Benzerara K, Dezi M, Skouri-Panet F, Duprat E, Callebaut I. AlphaFold2-guided description of CoBaHMA, a novel family of bacterial domains within the heavy-metal-associated superfamily. Proteins 2024; 92:776-794. [PMID: 38258321 DOI: 10.1002/prot.26668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 12/22/2023] [Accepted: 01/01/2024] [Indexed: 01/24/2024]
Abstract
Three-dimensional (3D) structure information, now available at the proteome scale, may facilitate the detection of remote evolutionary relationships in protein superfamilies. Here, we illustrate this with the identification of a novel family of protein domains related to the ferredoxin-like superfold, by combining (i) transitive sequence similarity searches, (ii) clustering approaches, and (iii) the use of AlphaFold2 3D structure models. Domains of this family were initially identified in relation with the intracellular biomineralization of calcium carbonates by Cyanobacteria. They are part of the large heavy-metal-associated (HMA) superfamily, departing from the latter by specific sequence and structural features. In particular, most of them share conserved basic amino acids (hence their name CoBaHMA for Conserved Basic residues HMA), forming a positively charged surface, which is likely to interact with anionic partners. CoBaHMA domains are found in diverse modular organizations in bacteria, existing in the form of monodomain proteins or as part of larger proteins, some of which are membrane proteins involved in transport or lipid metabolism. This suggests that the CoBaHMA domains may exert a regulatory function, involving interactions with anionic lipids. This hypothesis might have a particular resonance in the context of the compartmentalization observed for cyanobacterial intracellular calcium carbonates.
Affiliation(s)
- Geoffroy Gaschignard
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
| | - Maxime Millet
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
| | - Apolline Bruley
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
| | - Karim Benzerara
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
| | - Manuela Dezi
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
| | - Feriel Skouri-Panet
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
| | - Elodie Duprat
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
| | - Isabelle Callebaut
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
36
Lenda R, Zhukova L, Ożyhar A, Bystranowska D. Deciphering the dual nature of nesfatin-1: a tale of zinc ion's Janus-faced influence. Cell Commun Signal 2024; 22:298. [PMID: 38812013 PMCID: PMC11134965 DOI: 10.1186/s12964-024-01675-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 05/21/2024] [Indexed: 05/31/2024] Open
Abstract
BACKGROUND Nucleobindin-2 (Nucb2) and nesfatin-1 (N1) are widely distributed hormones that regulate numerous physiological processes, from energy homeostasis to carcinogenesis. However, the role of nesfatin-2 (N2), the second product of Nucb2 proteolytic processing, remains elusive. To elucidate the relationship between the structure and function of nesfatins, we investigated the properties of chicken and human homologs of N1, as well as a fragment of Nucb2 consisting of N1 and N2 conjoined in a head-to-tail manner (N1/2). RESULTS Our findings indicate that Zn(II) sensing, in the case of N1, is conserved between chicken and human species. However, the data presented here reveal significant differences in the molecular features of the analyzed peptides, particularly in the presence of Zn(II). We demonstrated that Zn(II) has a Janus effect on the M30 region (a crucial anorexigenic core) of N1 and N1/2. In N1 homologs, Zn(II) binding results in the concealment of the M30 region driven by a disorder-to-order transition and adoption of the amyloid fold. In contrast, in N1/2 molecules, Zn(II) binding causes the exposure of the M30 region and its destabilization, resulting in strong exposure of the region recognized by prohormone convertases within the N1/2 molecule. CONCLUSIONS In conclusion, we found that Zn(II) binding is conserved between chicken and human N1. However, despite the high homology of chicken and human N1, their interaction modes with Zn(II) appear to differ. Furthermore, Zn(II) binding might be essential for regulating the function of nesfatins by spatiotemporally hindering the N1 anorexigenic M30 core and concomitantly facilitating N1 release from Nucb2.
Affiliation(s)
- Rafał Lenda
- Department of Biochemistry, Molecular Biology and Biotechnology, Wrocław University of Science and Technology, Wybrzeże Wyspiańskiego 27, Wrocław, 50-370, Poland
| | - Lilia Zhukova
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5a, Warsaw, 02-106, Poland
| | - Andrzej Ożyhar
- Department of Biochemistry, Molecular Biology and Biotechnology, Wrocław University of Science and Technology, Wybrzeże Wyspiańskiego 27, Wrocław, 50-370, Poland
| | - Dominika Bystranowska
- Department of Biochemistry, Molecular Biology and Biotechnology, Wrocław University of Science and Technology, Wybrzeże Wyspiańskiego 27, Wrocław, 50-370, Poland.
37
Bryant P, Kelkar A, Guljas A, Clementi C, Noé F. Structure prediction of protein-ligand complexes from sequence information with Umol. Nat Commun 2024; 15:4536. [PMID: 38806453 PMCID: PMC11133481 DOI: 10.1038/s41467-024-48837-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 05/15/2024] [Indexed: 05/30/2024] Open
Abstract
Protein-ligand docking is an established tool in drug discovery and development to narrow down potential therapeutics for experimental testing. However, a high-quality protein structure is required and often the protein is treated as fully or partially rigid. Here we develop an AI system that can predict the fully flexible all-atom structure of protein-ligand complexes directly from sequence information. We find that classical docking methods are still superior, but depend upon having crystal structures of the target protein. In addition to predicting flexible all-atom structures, predicted confidence metrics (plDDT) can be used to select accurate predictions as well as to distinguish between strong and weak binders. The advances presented here suggest that the goal of AI-based drug discovery is one step closer, but there is still a way to go to grasp the complexity of protein-ligand interactions fully. Umol is available at: https://github.com/patrickbryant1/Umol .
Affiliation(s)
- Patrick Bryant
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany.
- The Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Svante Arrhenius väg 20C, 114 18, Stockholm, Sweden.
- Science for Life Laboratory, 172 21, Solna, Sweden.
| | - Atharva Kelkar
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
| | - Andrea Guljas
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
| | - Cecilia Clementi
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
| | - Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
- Microsoft Research AI4Science, Karl-Liebknecht Str. 32, 10178, Berlin, Germany
38
Bai H, Lewitus E, Li Y, Thomas PV, Zemil M, Merbah M, Peterson CE, Thuraisamy T, Rees PA, Hajduczki A, Dussupt V, Slike B, Mendez-Rivera L, Schmid A, Kavusak E, Rao M, Smith G, Frey J, Sims A, Wieczorek L, Polonis V, Krebs SJ, Ake JA, Vasan S, Bolton DL, Joyce MG, Townsley S, Rolland M. Contemporary HIV-1 consensus Env with AI-assisted redesigned hypervariable loops promote antibody binding. Nat Commun 2024; 15:3924. [PMID: 38724518 PMCID: PMC11082178 DOI: 10.1038/s41467-024-48139-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Accepted: 04/19/2024] [Indexed: 05/12/2024] Open
Abstract
An effective HIV-1 vaccine must elicit broadly neutralizing antibodies (bnAbs) against highly diverse Envelope glycoproteins (Env). Since Env with the longest hypervariable (HV) loops is more resistant to the cognate bnAbs than Env with shorter HV loops, we redesigned hypervariable loops for updated Env consensus sequences of subtypes B and C and CRF01_AE. Using modeling with AlphaFold2, we reduced the length of V1, V2, and V5 HV loops while maintaining the integrity of the Env structure and glycan shield, and modified the V4 HV loop. Spacers are designed to limit strain-specific targeting. All updated Env are infectious as pseudoviruses. Preliminary structural characterization suggests that the modified HV loops have a limited impact on Env's conformation. Binding assays show improved binding to modified subtype B and CRF01_AE Env but not to subtype C Env. Neutralization assays show increases in sensitivity to bnAbs, although not always consistently across clades. Strikingly, the HV loop modification renders the resistant CRF01_AE Env sensitive to 10-1074 despite the absence of a glycan at N332.
Affiliation(s)
- Hongjun Bai
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Eric Lewitus
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Yifan Li
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Paul V Thomas
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Emerging Infectious Disease Branch, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
| | - Michelle Zemil
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Mélanie Merbah
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Caroline E Peterson
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Emerging Infectious Disease Branch, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
| | - Thujitha Thuraisamy
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Phyllis A Rees
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Emerging Infectious Disease Branch, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
| | - Agnes Hajduczki
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Emerging Infectious Disease Branch, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
| | - Vincent Dussupt
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Bonnie Slike
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Letzibeth Mendez-Rivera
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Annika Schmid
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Erin Kavusak
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Mekhala Rao
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Gabriel Smith
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Jessica Frey
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Alicea Sims
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Lindsay Wieczorek
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Victoria Polonis
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
| | - Shelly J Krebs
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
| | - Julie A Ake
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
| | - Sandhya Vasan
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Diane L Bolton
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - M Gordon Joyce
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Emerging Infectious Disease Branch, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
| | - Samantha Townsley
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Morgane Rolland
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA.
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA.
39
Yin S, Mi X, Shukla D. Leveraging machine learning models for peptide-protein interaction prediction. RSC Chem Biol 2024; 5:401-417. [PMID: 38725911 PMCID: PMC11078210 DOI: 10.1039/d3cb00208j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 02/07/2024] [Indexed: 05/12/2024] Open
Abstract
Peptides play a pivotal role in a wide range of biological activities by participating in up to 40% of protein-protein interactions in cellular processes. They also demonstrate remarkable specificity and efficacy, making them promising candidates for drug development. However, predicting peptide-protein complexes by traditional computational approaches, such as docking and molecular dynamics simulations, remains a challenge due to the high computational cost, the flexible nature of peptides, and the limited structural information on peptide-protein complexes. In recent years, the surge of available biological data has given rise to the development of an increasing number of machine learning models for predicting peptide-protein interactions. These models offer efficient solutions to address the challenges associated with traditional computational approaches. Furthermore, they offer enhanced accuracy, robustness, and interpretability in their predictive outcomes. This review presents a comprehensive overview of machine learning and deep learning models that have emerged in recent years for the prediction of peptide-protein interactions.
Affiliation(s)
- Song Yin
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign Urbana 61801 Illinois USA
| | - Xuenan Mi
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign Urbana IL 61801 USA
| | - Diwakar Shukla
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign Urbana 61801 Illinois USA
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign Urbana IL 61801 USA
- Department of Bioengineering, University of Illinois Urbana-Champaign Urbana IL 61801 USA
40
Grybchuk D, Galan A, Klocek D, Macedo DH, Wolf YI, Votýpka J, Butenko A, Lukeš J, Neri U, Záhonová K, Kostygov AY, Koonin EV, Yurchenko V. Identification of diverse RNA viruses in Obscuromonas flagellates (Euglenozoa: Trypanosomatidae: Blastocrithidiinae). Virus Evol 2024; 10:veae037. [PMID: 38774311 PMCID: PMC11108086 DOI: 10.1093/ve/veae037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 04/03/2024] [Accepted: 04/29/2024] [Indexed: 05/24/2024] Open
Abstract
Trypanosomatids (Euglenozoa) are a diverse group of unicellular flagellates predominately infecting insects (monoxenous species) or circulating between insects and vertebrates or plants (dixenous species). Monoxenous trypanosomatids harbor a wide range of RNA viruses belonging to the families Narnaviridae, Totiviridae, Qinviridae, Leishbuviridae, and a putative group of tombus-like viruses. Here, we focus on the subfamily Blastocrithidiinae, a previously unexplored divergent group of monoxenous trypanosomatids comprising two related genera: Obscuromonas and Blastocrithidia. Members of the genus Blastocrithidia employ a unique genetic code, in which all three stop codons are repurposed to encode amino acids, with TAA also used to terminate translation. Obscuromonas isolates studied here bear viruses of three families: Narnaviridae, Qinviridae, and Mitoviridae. The latter viral group is documented in trypanosomatid flagellates for the first time. While other known mitoviruses replicate in the mitochondria, those of trypanosomatids appear to reside in the cytoplasm. Although no RNA viruses were detected in Blastocrithidia spp., we identified an endogenous viral element in the genome of B. triatomae indicating its past encounter(s) with tombus-like viruses.
Affiliation(s)
- Danyil Grybchuk
- Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava 710 00, Czechia
- Central European Institute of Technology, Masaryk University, Brno 625 00, Czechia
| | - Arnau Galan
- Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava 710 00, Czechia
| | - Donnamae Klocek
- Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava 710 00, Czechia
| | - Diego H Macedo
- Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava 710 00, Czechia
| | - Yuri I Wolf
- National Center for Biotechnology Information, NLM, National Institutes of Health, Bethesda 20894, USA
| | - Jan Votýpka
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice 370 05, Czechia
- Department of Parasitology, Faculty of Science, Charles University, Prague 128 00, Czechia
| | - Anzhelika Butenko
- Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava 710 00, Czechia
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice 370 05, Czechia
- Faculty of Science, University of South Bohemia, České Budějovice 370 05, Czechia
| | - Julius Lukeš
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice 370 05, Czechia
- Faculty of Science, University of South Bohemia, České Budějovice 370 05, Czechia
| | - Uri Neri
- The Shmunis School of Biomedicine and Cancer Research, Tel Aviv University, Tel Aviv 39040, Israel
| | - Kristína Záhonová
- Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava 710 00, Czechia
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice 370 05, Czechia
- Department of Parasitology, Faculty of Science, Charles University, BIOCEV, Vestec 252 50, Czechia
- Division of Infectious Diseases, Department of Medicine, University of Alberta, Edmonton, Alberta T6G 2G3, Canada
| | - Alexei Yu Kostygov
- Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava 710 00, Czechia
- Zoological Institute of the Russian Academy of Sciences, St. Petersburg 199034, Russia
| | - Eugene V Koonin
- National Center for Biotechnology Information, NLM, National Institutes of Health, Bethesda 20894, USA
| | - Vyacheslav Yurchenko
- Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava 710 00, Czechia
41
Lee S, Kim G, Karin EL, Mirdita M, Park S, Chikhi R, Babaian A, Kryshtafovych A, Steinegger M. Petabase-Scale Homology Search for Structure Prediction. Cold Spring Harb Perspect Biol 2024; 16:a041465. [PMID: 38316555 PMCID: PMC11065157 DOI: 10.1101/cshperspect.a041465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2024]
Abstract
The recent CASP15 competition highlighted the critical role of multiple sequence alignments (MSAs) in protein structure prediction, as demonstrated by the success of the top AlphaFold2-based prediction methods. To push the boundaries of MSA utilization, we conducted a petabase-scale search of the Sequence Read Archive (SRA), resulting in gigabytes of aligned homologs for CASP15 targets. These were merged with default MSAs produced by ColabFold-search and provided to ColabFold-predict. By using SRA data, we achieved highly accurate predictions (GDT_TS > 70) for 66% of the non-easy targets, whereas using ColabFold-search default MSAs scored highly in only 52%. Next, we tested the effect of deep homology search and ColabFold's advanced features, such as more recycles, on prediction accuracy. While SRA homologs were most significant for improving ColabFold's CASP15 ranking from 11th to 3rd place, other strategies contributed too. We analyze these in the context of existing strategies to improve prediction.
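For reference, the GDT_TS threshold quoted in this abstract can be illustrated with a simplified calculation. The sketch below assumes that the model and reference Cα coordinates are already superimposed and matched one-to-one; the full GDT_TS procedure also searches over superpositions, which is omitted here. The function names and the toy coordinates and scores are hypothetical and are not taken from the paper or from ColabFold.

```python
import math

CUTOFFS = (1.0, 2.0, 4.0, 8.0)  # distance cutoffs in angstroms


def gdt_ts(model, reference, cutoffs=CUTOFFS):
    """Simplified GDT_TS (0-100) for pre-superimposed C-alpha coordinates."""
    if len(model) != len(reference) or len(reference) == 0:
        raise ValueError("model and reference need the same non-zero length")
    # Per-residue distance between matched C-alpha atoms.
    distances = [math.dist(m, r) for m, r in zip(model, reference)]
    # Fraction of residues within each cutoff, averaged over the cutoffs.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


def fraction_above(scores, threshold=70.0):
    """Share of targets whose score exceeds the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


# Toy data (hypothetical): four residues with deviations of 0.5, 1.5, 3.0 and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 10.0, 0.0)]

print(gdt_ts(model, reference))
print(fraction_above([82.0, 65.0, 71.5, 40.0]))
```

A per-benchmark figure such as "66% of targets above 70" would then correspond to applying `fraction_above` to the list of per-target scores.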
Affiliation(s)
- Sewon Lee
- School of Biological Sciences, Seoul National University, Gwanak-gu, Seoul 08826, South Korea
| | - Gyuri Kim
- School of Biological Sciences, Seoul National University, Gwanak-gu, Seoul 08826, South Korea
| | | | - Milot Mirdita
- School of Biological Sciences, Seoul National University, Gwanak-gu, Seoul 08826, South Korea
| | - Sukhwan Park
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, South Korea
| | - Rayan Chikhi
- Institut Pasteur, Université Paris Cité, G5 Sequence Bioinformatics, 75015 Paris, France
| | - Artem Babaian
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
| | | | - Martin Steinegger
- School of Biological Sciences, Seoul National University, Gwanak-gu, Seoul 08826, South Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, South Korea
- Artificial Intelligence Institute, Seoul National University, Seoul 08826, South Korea
- Institute of Molecular Biology and Genetics, Seoul National University, Seoul 08826, South Korea
42
Chen Y, Chen G, Chen CYC. MFTrans: A multi-feature transformer network for protein secondary structure prediction. Int J Biol Macromol 2024; 267:131311. [PMID: 38599417 DOI: 10.1016/j.ijbiomac.2024.131311] [Received: 05/07/2023] [Revised: 03/21/2024] [Accepted: 03/30/2024] [Indexed: 04/12/2024]
Abstract
In the rapidly evolving field of computational biology, accurate prediction of protein secondary structures is crucial for understanding protein functions, facilitating drug discovery, and advancing disease diagnostics. In this paper, we propose MFTrans, a deep learning-based multi-feature fusion network aimed at enhancing the precision and efficiency of Protein Secondary Structure Prediction (PSSP). This model employs a Multiple Sequence Alignment (MSA) Transformer in combination with a multi-view deep learning architecture to effectively capture both global and local features of protein sequences. MFTrans integrates diverse features generated by protein sequences, including MSA, sequence information, evolutionary information, and hidden state information, using a multi-feature fusion strategy. The MSA Transformer is utilized to interleave row and column attention across the input MSA, while a Transformer encoder and decoder are introduced to enhance the extracted high-level features. A hybrid network architecture, combining a convolutional neural network with a bidirectional Gated Recurrent Unit (BiGRU) network, is used to further extract high-level features after feature fusion. In independent tests, our experimental results show that MFTrans has superior generalization ability, outperforming other state-of-the-art PSSP models by 3 % on average on public benchmarks including CASP12, CASP13, CASP14, TEST2016, TEST2018, and CB513. Case studies further highlight its advanced performance in predicting mutation sites. MFTrans contributes significantly to the protein science field, opening new avenues for drug discovery, disease diagnosis, and protein.
Affiliation(s)
- Yifu Chen
- Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong 518107, China
| | - Guanxing Chen
- Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong 518107, China
| | - Calvin Yu-Chian Chen
- Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong 518107, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, Shenzhen, Guangdong 518055, China; School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen, Guangdong 518055, China; Department of Medical Research, China Medical University Hospital, Taichung 40447, Taiwan; Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan.
43
Zhao Q, Bertolli S, Park YJ, Tan Y, Cutler KJ, Srinivas P, Asfahl KL, Fonesca-García C, Gallagher LA, Li Y, Wang Y, Coleman-Derr D, DiMaio F, Zhang D, Peterson SB, Veesler D, Mougous JD. Streptomyces umbrella toxin particles block hyphal growth of competing species. Nature 2024; 629:165-173. [PMID: 38632398 PMCID: PMC11062931 DOI: 10.1038/s41586-024-07298-z] [Received: 12/05/2023] [Accepted: 03/11/2024] [Indexed: 04/19/2024]
Abstract
Streptomyces are a genus of ubiquitous soil bacteria from which the majority of clinically utilized antibiotics derive [1]. The production of these antibacterial molecules reflects the relentless competition Streptomyces engage in with other bacteria, including other Streptomyces species [1,2]. Here we show that in addition to small-molecule antibiotics, Streptomyces produce and secrete antibacterial protein complexes that feature a large, degenerate repeat-containing polymorphic toxin protein. A cryo-electron microscopy structure of these particles reveals an extended stalk topped by a ringed crown comprising the toxin repeats scaffolding five lectin-tipped spokes, which led us to name them umbrella particles. Streptomyces coelicolor encodes three umbrella particles with distinct toxin and lectin composition. Notably, supernatant containing these toxins specifically and potently inhibits the growth of select Streptomyces species from among a diverse collection of bacteria screened. For one target, Streptomyces griseus, inhibition relies on a single toxin and that intoxication manifests as rapid cessation of vegetative hyphal growth. Our data show that Streptomyces umbrella particles mediate competition among vegetative mycelia of related species, a function distinct from small-molecule antibiotics, which are produced at the onset of reproductive growth and act broadly [3,4]. Sequence analyses suggest that this role of umbrella particles extends beyond Streptomyces, as we identified umbrella loci in nearly 1,000 species across Actinobacteria.
Affiliation(s)
- Qinqin Zhao
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - Savannah Bertolli
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - Young-Jun Park
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Department of Biochemistry, University of Washington, Seattle, WA, USA
| | - Yongjun Tan
- Department of Biology, St Louis University, St Louis, MO, USA
| | - Kevin J Cutler
- Department of Microbiology, University of Washington, Seattle, WA, USA
- Department of Physics, University of Washington, Seattle, WA, USA
| | - Pooja Srinivas
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - Kyle L Asfahl
- Department of Microbiology, University of Washington, Seattle, WA, USA
- Microbial Interactions and Microbiome Center, University of Washington, Seattle, WA, USA
| | - Citlali Fonesca-García
- Plant Gene Expression Center, USDA-ARS, Albany, CA, USA
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
| | - Larry A Gallagher
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - Yaqiao Li
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - Yaxi Wang
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - Devin Coleman-Derr
- Plant Gene Expression Center, USDA-ARS, Albany, CA, USA
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
| | - Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Dapeng Zhang
- Department of Biology, St Louis University, St Louis, MO, USA
- Program of Bioinformatic and Computational Biology, St Louis University, St Louis, MO, USA
| | - S Brook Peterson
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - David Veesler
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Department of Biochemistry, University of Washington, Seattle, WA, USA
| | - Joseph D Mougous
- Department of Microbiology, University of Washington, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
- Microbial Interactions and Microbiome Center, University of Washington, Seattle, WA, USA.
44
Poon BK, Terwilliger TC, Adams PD. The Phenix-AlphaFold webservice: Enabling AlphaFold predictions for use in Phenix. Protein Sci 2024; 33:e4992. [PMID: 38647406 PMCID: PMC11034488 DOI: 10.1002/pro.4992] [Received: 11/04/2023] [Revised: 03/01/2024] [Accepted: 03/31/2024] [Indexed: 04/25/2024]
Abstract
Advances in machine learning have enabled sufficiently accurate predictions of protein structure to be used in macromolecular structure determination with crystallography and cryo-electron microscopy data. The Phenix software suite has AlphaFold predictions integrated into an automated pipeline that can start with an amino acid sequence and data, and automatically perform model-building and refinement to return a protein model fitted into the data. Due to the steep technical requirements of running AlphaFold efficiently, we have implemented a Phenix-AlphaFold webservice that enables all Phenix users to run AlphaFold predictions remotely from the Phenix GUI starting with the official 1.21 release. This webservice will be improved based on how it is used by the research community and the future research directions for Phenix.
Affiliation(s)
- Billy K. Poon
- Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Thomas C. Terwilliger
- New Mexico Consortium, Los Alamos, New Mexico, USA
- Los Alamos National Laboratory, Los Alamos, New Mexico, USA
| | - Paul D. Adams
- Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Department of Bioengineering, University of California, Berkeley, Berkeley, California, USA
45
Wachter F, Nowak RP, Ficarro S, Marto J, Fischer ES. Structural characterization of methylation-independent PP2A assembly guides alphafold2Multimer prediction of family-wide PP2A complexes. J Biol Chem 2024; 300:107268. [PMID: 38582449 PMCID: PMC11087950 DOI: 10.1016/j.jbc.2024.107268] [Received: 11/30/2023] [Revised: 03/31/2024] [Accepted: 04/02/2024] [Indexed: 04/08/2024] Open
Abstract
Dysregulation of phosphorylation-dependent signaling is a hallmark of tumorigenesis. Protein phosphatase 2A (PP2A) is an essential regulator of cell growth. One scaffold subunit (A) binds to a catalytic subunit (C) to form a core AC heterodimer, which together with one of many regulatory (B) subunits forms the active trimeric enzyme. The combinatorial number of distinct PP2A complexes is large, which results in diverse substrate specificity and subcellular localization. The detailed mechanism of PP2A assembly and regulation remains elusive, and reports about an important role of methylation of the carboxy terminus of PP2A C are conflicting. A better understanding of the molecular underpinnings of PP2A assembly and regulation is critical to dissecting PP2A function in physiology and disease. Here, we combined biochemical reconstitution, mass spectrometry, X-ray crystallography, and functional assays to characterize the assembly of trimeric PP2A. In vitro studies demonstrated that methylation of the carboxy terminus of PP2A C was dispensable for PP2A assembly. To corroborate these findings, we determined the X-ray crystal structure of the unmethylated PP2A Aα-B56ε-Cα trimer complex to 3.1 Å resolution. The experimental structure superimposed well with an Alphafold2Multimer prediction of the PP2A trimer. We then predicted models of all canonical PP2A complexes, providing a framework for structural analysis of PP2A. In conclusion, methylation was dispensable for trimeric PP2A assembly, and integrative structural biology studies of PP2A offered predictive models for all canonical PP2A complexes.
Affiliation(s)
- Franziska Wachter
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA; Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, USA
| | - Radosław P Nowak
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA; Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, USA
| | - Scott Ficarro
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
| | - Jarrod Marto
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
| | - Eric S Fischer
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA; Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, USA.
46
Xie T, Huang J. Can Protein Structure Prediction Methods Capture Alternative Conformations of Membrane Transporters? J Chem Inf Model 2024; 64:3524-3536. [PMID: 38564295 DOI: 10.1021/acs.jcim.3c01936] [Indexed: 04/04/2024]
Abstract
Understanding the conformational dynamics of proteins, such as the inward-facing (IF) and outward-facing (OF) transition observed in transporters, is vital for elucidating their functional mechanisms. Despite significant advances in protein structure prediction (PSP) over the past three decades, most efforts have been focused on single-state prediction, leaving multistate or alternative conformation prediction (ACP) relatively unexplored. This discrepancy has led to the development of highly accurate PSP methods such as AlphaFold, yet their capabilities for ACP remain limited. To investigate the performance of current PSP methods in ACP, we curated a data set, named IOMemP, consisting of 32 experimentally determined high-resolution IF and OF structures of 16 membrane proteins with substantial conformational changes. We benchmarked 12 representative PSP methods, along with two recent multistate methods based on AlphaFold, against this data set. Our findings reveal a remarkably consistent preference for specific states across various PSP methods. We elucidated how coevolution information in MSAs influences state preference. Moreover, we showed that AlphaFold, when excluding coevolution information, estimated similar energies between the experimental IF and OF conformations, indicating that the energy model learned by AlphaFold is not biased toward any particular state. Our IOMemP data set and benchmark results are anticipated to advance the development of robust ACP methods.
Affiliation(s)
- Tengyu Xie
- College of Life Science, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang 310024, China
| | - Jing Huang
- College of Life Science, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang 310024, China
47
Ackmann J, Brüge A, Gotina L, Lim S, Jahreis K, Vollbrecht AL, Kim YK, Pae AN, Labus J, Ponimaskin E. Structural determinants for activation of the Tau kinase CDK5 by the serotonin receptor 5-HT7R. Cell Commun Signal 2024; 22:233. [PMID: 38641599 PMCID: PMC11031989 DOI: 10.1186/s12964-024-01612-y] [Received: 12/07/2023] [Accepted: 04/11/2024] [Indexed: 04/21/2024] Open
Abstract
BACKGROUND Multiple neurodegenerative diseases are induced by the formation and deposition of protein aggregates. In particular, the microtubule-associated protein Tau leads to the development of so-called tauopathies characterized by the aggregation of hyperphosphorylated Tau within neurons. We recently showed that the constitutive activity of the serotonin receptor 7 (5-HT7R) is required for Tau hyperphosphorylation and aggregation through activation of the cyclin-dependent kinase 5 (CDK5). We also demonstrated physical interaction between 5-HT7R and CDK5 at the plasma membrane suggesting that the 5-HT7R/CDK5 complex is an integral part of the signaling network involved in Tau-mediated pathology.
METHODS Using biochemical, microscopic, molecular biological, computational and AI-based approaches, we investigated structural requirements for the formation of 5-HT7R/CDK5 complex.
RESULTS We demonstrated that 5-HT7R domains responsible for coupling to Gs proteins are not involved in receptor interaction with CDK5. We also created a structural model of the 5-HT7R/CDK5 complex and refined the interaction interface. The model predicted two conserved phenylalanine residues, F278 and F281, within the third intracellular loop of 5-HT7R to be potentially important for complex formation. While site-directed mutagenesis of these residues did not influence Gs protein-mediated receptor signaling, replacement of both phenylalanines by alanine residues significantly reduced 5-HT7R/CDK5 interaction and receptor-mediated CDK5 activation, leading to reduced Tau hyperphosphorylation and aggregation. Molecular dynamics simulations of 5-HT7R/CDK5 complex for wild-type and receptor mutants confirmed binding interface stability of the initial model.
CONCLUSIONS Our results provide a structural basis for the development of novel drugs targeting the 5-HT7R/CDK5 interaction interface for the selective treatment of Tau-related disorders, including frontotemporal dementia and Alzheimer's disease.
Affiliation(s)
- Jana Ackmann
- Department of Cellular Neurophysiology, Institute for Neurophysiology, Hannover Medical School, Carl-Neuberg-Str. 1, 30625, Hannover, Germany
| | - Alina Brüge
- Department of Cellular Neurophysiology, Institute for Neurophysiology, Hannover Medical School, Carl-Neuberg-Str. 1, 30625, Hannover, Germany
| | - Lizaveta Gotina
- Brain Science Institute, Korea Institute of Science and Technology (KIST), Seoul, Republic of Korea
- Division of Bio-Medical Science & Technology, KIST School, Korea University of Science and Technology (UST), Daejeon, Republic of Korea
| | - Sungsu Lim
- Brain Science Institute, Korea Institute of Science and Technology (KIST), Seoul, Republic of Korea
| | - Kathrin Jahreis
- Department of Cellular Neurophysiology, Institute for Neurophysiology, Hannover Medical School, Carl-Neuberg-Str. 1, 30625, Hannover, Germany
| | - Anna-Lena Vollbrecht
- Department of Cellular Neurophysiology, Institute for Neurophysiology, Hannover Medical School, Carl-Neuberg-Str. 1, 30625, Hannover, Germany
| | - Yun Kyung Kim
- Brain Science Institute, Korea Institute of Science and Technology (KIST), Seoul, Republic of Korea
- Division of Bio-Medical Science & Technology, KIST School, Korea University of Science and Technology (UST), Daejeon, Republic of Korea
| | - Ae Nim Pae
- Brain Science Institute, Korea Institute of Science and Technology (KIST), Seoul, Republic of Korea
- Division of Bio-Medical Science & Technology, KIST School, Korea University of Science and Technology (UST), Daejeon, Republic of Korea
| | - Josephine Labus
- Department of Cellular Neurophysiology, Institute for Neurophysiology, Hannover Medical School, Carl-Neuberg-Str. 1, 30625, Hannover, Germany
| | - Evgeni Ponimaskin
- Department of Cellular Neurophysiology, Institute for Neurophysiology, Hannover Medical School, Carl-Neuberg-Str. 1, 30625, Hannover, Germany.
48
Yuan Q, Tian C, Yang Y. Genome-scale annotation of protein binding sites via language model and geometric deep learning. eLife 2024; 13:RP93695. [PMID: 38630609 PMCID: PMC11023698 DOI: 10.7554/elife.93695] [Indexed: 04/19/2024] Open
Abstract
Revealing protein binding sites with other molecules, such as nucleic acids, peptides, or small ligands, sheds light on disease mechanism elucidation and novel drug design. With the explosive growth of proteins in sequence databases, how to accurately and efficiently identify these binding sites from sequences becomes essential. However, current methods mostly rely on expensive multiple sequence alignments or experimental protein structures, limiting their genome-scale applications. Besides, these methods haven't fully explored the geometry of the protein structures. Here, we propose GPSite, a multi-task network for simultaneously predicting binding residues of DNA, RNA, peptide, protein, ATP, HEM, and metal ions on proteins. GPSite was trained on informative sequence embeddings and predicted structures from protein language models, while comprehensively extracting residual and relational geometric contexts in an end-to-end manner. Experiments demonstrate that GPSite substantially surpasses state-of-the-art sequence-based and structure-based approaches on various benchmark datasets, even when the structures are not well-predicted. The low computational cost of GPSite enables rapid genome-scale binding residue annotations for over 568,000 sequences, providing opportunities to unveil unexplored associations of binding sites with molecular functions, biological processes, and genetic variants. The GPSite webserver and annotation database can be freely accessed at https://bio-web1.nscc-gz.cn/app/GPSite.
Affiliation(s)
- Qianmu Yuan
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Chong Tian
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
49
Wright E. Accurately clustering biological sequences in linear time by relatedness sorting. Nat Commun 2024; 15:3047. [PMID: 38589369 PMCID: PMC11001989 DOI: 10.1038/s41467-024-47371-9] [Received: 10/27/2022] [Accepted: 03/28/2024] [Indexed: 04/10/2024] Open
Abstract
Clustering biological sequences into similar groups is an increasingly important task as the number of available sequences continues to grow exponentially. Search-based approaches to clustering scale super-linearly with the number of input sequences, making it impractical to cluster very large sets of sequences. Approaches to clustering sequences in linear time currently lack the accuracy of super-linear approaches. Here, I set out to develop and characterize a strategy for clustering with linear time complexity that retains the accuracy of less scalable approaches. The resulting algorithm, named Clusterize, sorts sequences by relatedness to linearize the clustering problem. Clusterize produces clusters with accuracy rivaling popular programs (CD-HIT, MMseqs2, and UCLUST) but exhibits linear asymptotic scalability. Clusterize generates higher accuracy and oftentimes much larger clusters than Linclust, a fast linear time clustering algorithm. I demonstrate the utility of Clusterize by accurately solving different clustering problems involving millions of nucleotide or protein sequences.
Affiliation(s)
- Erik Wright
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA.
- Center for Evolutionary Biology and Medicine, Pittsburgh, PA, USA.
50
Jing X, Wu F, Luo X, Xu J. Single-sequence protein structure prediction by integrating protein language models. Proc Natl Acad Sci U S A 2024; 121:e2308788121. [PMID: 38507445 PMCID: PMC10990103 DOI: 10.1073/pnas.2308788121] [Received: 05/26/2023] [Accepted: 02/05/2024] [Indexed: 03/22/2024] Open
Abstract
Protein structure prediction has been greatly improved by deep learning in the past few years. However, the most successful methods rely on multiple sequence alignment (MSA) of the sequence homologs of the protein under prediction. In nature, a protein folds in the absence of its sequence homologs, and thus an MSA-free structure prediction method is desired. Here, we develop a single-sequence-based protein structure prediction method, RaptorX-Single, by integrating several protein language models and a structure generation module, and then study its advantage over MSA-based methods. Our experimental results indicate that in addition to running much faster than MSA-based methods such as AlphaFold2, RaptorX-Single outperforms AlphaFold2 and other MSA-free methods in predicting the structure of antibodies (after fine-tuning on antibody data), proteins of very few sequence homologs, and single mutation effects. By comparing different protein language models, our results show that not only the scale but also the training data of protein language models will impact the performance. RaptorX-Single also compares favorably to MSA-based AlphaFold2 when the protein under prediction has a large number of sequence homologs.
Affiliation(s)
| | - Fandi Wu
- MoleculeMind Ltd., Beijing 100084, China
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
| | - Xiao Luo
- Toyota Technological Institute at Chicago, Chicago, IL 60637
- Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China
| | - Jinbo Xu
- MoleculeMind Ltd., Beijing 100084, China
- Toyota Technological Institute at Chicago, Chicago, IL 60637