1
|
Xie X, Deng X, Chen L, Yuan J, Chen H, Wei C, Feng C, Liu X, Qiu G. From Gene to Structure: Unraveling Genomic Dark Matter in Ca. Accumulibacter. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024. [PMID: 39699575 DOI: 10.1021/acs.est.4c09948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2024]
Abstract
"Candidatus Accumulibacter" is a unique and pivotal genus of polyphosphate-accumulating organisms prevalent in wastewater treatment plants and plays mainstay roles in the global phosphorus cycle. However, the efforts to fully understand their genetic and metabolic characteristics are largely hindered by major limitations in existing sequence-based annotation methods. Here, we reported an integrated approach combining pangenome analysis, protein structure prediction and clustering, and meta-omic characterization, to uncover genetic and metabolic traits previously unexplored for Ca. Accumulibacter. The identification of a previously overlooked pyrophosphate-fructose 6-phosphate 1-phosphotransferase gene (pfp) suggested that all Ca. Accumulibacter encoded a complete Embden-Meyerhof-Parnas pathway. A homologue of the phosphate-specific transport system accessory protein (PhoU) was suggested to be an inorganic phosphate transport (Pit) accessory protein (Pap) conferring effective and efficient phosphate transport. Additional lineage members were found to encode complete denitrification pathways. A pipeline was built, generating a pan-Ca. Accumulibacter annotation reference database, covering >200,000 proteins and their encoding genes. Benchmarking on 27 Ca. Accumulibacter genomes showed major improvement in the average annotation coverage from 51% to 82%. This pipeline is readily applicable to diverse cultured and uncultured bacteria to establish high-coverage annotation reference databases, facilitating the exploration of genomic dark matter in the bacterial domain.
Affiliation(s)
- Xiaojing Xie
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Xuhan Deng
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Liping Chen
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Jing Yuan
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Hang Chen
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Chaohai Wei
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
  - Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Recycling, Guangzhou 510006, China
- Chunhua Feng
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
  - Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Recycling, Guangzhou 510006, China
- Xianghui Liu
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore 637551, Singapore
- Guanglei Qiu
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore 637551, Singapore
  - Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Recycling, Guangzhou 510006, China
  - The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters, Ministry of Education, Guangzhou 510006, China
|
2
|
Song S, Li T, Stevens AO, Shorty T, He Y. Molecular Dynamics Reveal Key Steps in BAR-Related Membrane Remodeling. Pathogens 2024; 13:902. [PMID: 39452773 PMCID: PMC11510478 DOI: 10.3390/pathogens13100902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2024] [Revised: 10/08/2024] [Accepted: 10/13/2024] [Indexed: 10/26/2024]
Abstract
Endocytosis plays a complex role in pathogen-host interactions. It serves as a pathway for pathogens to enter the host cell and acts as a part of the immune defense mechanism. Endocytosis involves the formation of lipid membrane vesicles and the reshaping of the cell membrane, a task predominantly managed by proteins containing BAR (Bin1/Amphiphysin/yeast RVS167) domains. Insights into how BAR domains can remodel and reshape cell membranes provide crucial information on infections and can aid the development of treatment. Aiming at deciphering the roles of the BAR dimers in lipid membrane bending and remodeling, we conducted extensive all-atom molecular dynamics simulations and discovered that the presence of helix kinks divides the BAR monomer into two segments, the "arm segment" and the "core segment", which exhibit distinct movement patterns. Contrary to the prior hypothesis of BAR domains working as a rigid scaffold, we found that it functions in an "Arms-Hands" mode. These findings enhance the understanding of endocytosis, potentially advancing research on pathogen-host interactions and aiding in the identification of new treatment strategies targeting BAR domains.
Affiliation(s)
- Shenghan Song
  - Department of Chemistry & Chemical Biology, The University of New Mexico, Albuquerque, NM 87131, USA
- Tongtong Li
  - Department of Chemistry & Chemical Biology, The University of New Mexico, Albuquerque, NM 87131, USA
- Amy O. Stevens
  - Department of Chemistry & Chemical Biology, The University of New Mexico, Albuquerque, NM 87131, USA
- Temair Shorty
  - Department of Chemistry & Chemical Biology, The University of New Mexico, Albuquerque, NM 87131, USA
- Yi He
  - Department of Chemistry & Chemical Biology, The University of New Mexico, Albuquerque, NM 87131, USA
  - Translational Informatics Division, Department of Internal Medicine, The University of New Mexico, Albuquerque, NM 87131, USA
|
3
|
Sabsay KR, te Velthuis AJW. Using structure prediction of negative sense RNA virus nucleoproteins to assess evolutionary relationships. Virus Evol 2024; 10:veae058. [PMID: 39129834 PMCID: PMC11315766 DOI: 10.1093/ve/veae058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2024] [Revised: 05/21/2024] [Accepted: 07/19/2024] [Indexed: 08/13/2024]
Abstract
Negative sense RNA viruses (NSV) include some of the most detrimental human pathogens, including the influenza, Ebola, and measles viruses. NSV genomes consist of one or multiple single-stranded RNA molecules that are encapsidated into one or more ribonucleoprotein (RNP) complexes. These RNPs consist of viral RNA, a viral RNA polymerase, and many copies of the viral nucleoprotein (NP). Current evolutionary relationships within the NSV phylum are based on the alignment of conserved RNA-dependent RNA polymerase (RdRp) domain amino acid sequences. However, the RdRp domain-based phylogeny does not address whether NP, the other core protein in the NSV genome, evolved along the same trajectory or whether several RdRp-NP pairs evolved through convergent evolution in the segmented and non-segmented NSV genome architectures. Addressing how NP and the RdRp domain evolved may help us better understand NSV diversity. Since NP sequences are too short to infer robust phylogenetic relationships, we here used experimentally obtained and AlphaFold 2.0-predicted NP structures to probe whether evolutionary relationships can be estimated using NSV NP sequences. Following flexible structure alignments of modeled structures, we find that the structural homology of the NSV NPs reveals phylogenetic clusters that are consistent with RdRp-based clustering. In addition, we were able to assign viruses for which RdRp sequences are currently missing to phylogenetic clusters based on the available NP sequence. Both our RdRp-based and NP-based relationships deviate from the current NSV classification of the segmented Naedrevirales, which cluster with the other segmented NSVs in our analysis. Overall, our results suggest that the NSV RdRp and NP genes largely evolved along similar trajectories and even short pieces of genetic, protein-coding information can be used to infer evolutionary relationships, potentially making metagenomic analyses more valuable.
Affiliation(s)
- Kimberly R Sabsay
  - Lewis Thomas Laboratory, Department of Molecular Biology, Princeton University, Washington Road, Princeton, NJ 08544, United States
  - Lewis Sigler Institute, Princeton University, Washington Road, Princeton, NJ 08544, United States
- Aartjan J W te Velthuis
  - Lewis Thomas Laboratory, Department of Molecular Biology, Princeton University, Washington Road, Princeton, NJ 08544, United States
|
4
|
Dong Y, Quan H, Ma C, Shan L, Deng L. TGC-ARG: Anticipating Antibiotic Resistance via Transformer-Based Modeling and Contrastive Learning. Int J Mol Sci 2024; 25:7228. [PMID: 39000335 PMCID: PMC11241484 DOI: 10.3390/ijms25137228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Revised: 06/25/2024] [Accepted: 06/27/2024] [Indexed: 07/16/2024]
Abstract
In various domains, including everyday activities, agricultural practices, and medical treatments, the escalating challenge of antibiotic resistance poses a significant concern. Traditional approaches to studying antibiotic resistance genes (ARGs) often require substantial time and effort and are limited in accuracy. Moreover, the decentralized nature of existing data repositories complicates comprehensive analysis of antibiotic resistance gene sequences. In this study, we introduce a novel computational framework named TGC-ARG designed to predict potential ARGs. This framework takes protein sequences as input, utilizes SCRATCH-1D for protein secondary structure prediction, and employs feature extraction techniques to derive distinctive features from both sequence and structural data. Subsequently, a Siamese network is employed to foster a contrastive learning environment, enhancing the model's ability to effectively represent the data. Finally, a multi-layer perceptron (MLP) integrates and processes sequence embeddings alongside predicted secondary structure embeddings to forecast ARG presence. To evaluate our approach, we curated a pioneering open dataset termed ARSS (Antibiotic Resistance Sequence Statistics). Comprehensive comparative experiments demonstrate that our method surpasses current state-of-the-art methodologies. Additionally, through detailed case studies, we illustrate the efficacy of our approach in predicting potential ARGs.
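The abstract above says a Siamese network is used for contrastive learning but does not state the loss. As a rough illustration only, the sketch below uses the classic margin-based pairwise contrastive loss as a stand-in; the function name, the `margin` value, and the toy embeddings are assumptions and are not taken from TGC-ARG.

```python
import math
from typing import Sequence


def contrastive_loss(emb_a: Sequence[float], emb_b: Sequence[float],
                     same_class: bool, margin: float = 1.0) -> float:
    """Classic pairwise contrastive loss on two embeddings.

    Pairs from the same class are pulled together (penalty d^2);
    pairs from different classes are pushed apart until their
    distance exceeds `margin` (penalty max(0, margin - d)^2).
    """
    d = math.dist(emb_a, emb_b)  # Euclidean distance between the two embeddings
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy embeddings (hypothetical), as if produced by the two branches
# of a Siamese encoder that share the same weights.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=True))
print(contrastive_loss([0.0, 0.0], [0.0, 0.5], same_class=False))
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=False))
```

In a real model the embeddings would come from a trained encoder and the loss would be averaged over a batch of sequence pairs.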
Affiliation(s)
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China; (Y.D.); (H.Q.); (C.M.); (L.S.)
|
5
|
Chen Z, Wang R, Guo J, Wang X. The role and future prospects of artificial intelligence algorithms in peptide drug development. Biomed Pharmacother 2024; 175:116709. [PMID: 38713945 DOI: 10.1016/j.biopha.2024.116709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 05/01/2024] [Accepted: 05/02/2024] [Indexed: 05/09/2024]
Abstract
Peptide medications have become better known in recent years owing to their many benefits, including few side effects, high biological activity, specificity, and effectiveness. Over 100 peptide medications have been introduced to the market to treat a variety of illnesses. Most of these peptide medications are developed on the basis of endogenous peptides or natural peptides, which frequently require expensive, time-consuming, and extensive tests to confirm. As artificial intelligence advances quickly, it is now possible to build machine learning or deep learning models that screen a large number of candidate sequences for therapeutic peptides. Therapeutic peptides, such as those with antibacterial or anticancer properties, have been developed by the application of artificial intelligence algorithms. The process of finding and developing peptide drugs is outlined in this review, along with a few related cases that were helped by AI and conventional methods. These resources will open up new avenues for peptide drug development and discovery, helping to meet the pressing needs of clinical patients for disease treatment. Although peptide drugs are a new class of biopharmaceuticals distinct from chemical and small-molecule drugs, their clinical purpose and value cannot be ignored. However, traditional peptide drug research and development has a long development cycle and requires high investment; the AI-assisted (AI+) mode is expected to substantially hasten the creation of peptide medications, offering a new boost for combating diseases.
Affiliation(s)
- Zhiheng Chen
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
- Ruoxi Wang
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
- Junqi Guo
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
- Xiaogang Wang
  - Guangdong Provincial Key Laboratory of Bone and Joint Degenerative Diseases, The Third Affiliated Hospital of Southern Medical University, Guangzhou, Guangdong 510630, China.
|
6
|
Sabsay KR, te Velthuis AJ. Using structure prediction of negative sense RNA virus nucleoproteins to assess evolutionary relationships. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.16.580771. [PMID: 38405982 PMCID: PMC10888975 DOI: 10.1101/2024.02.16.580771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
Negative sense RNA viruses (NSV) include some of the most detrimental human pathogens, including the influenza, Ebola and measles viruses. NSV genomes consist of one or multiple single-stranded RNA molecules that are encapsidated into one or more ribonucleoprotein (RNP) complexes. These RNPs consist of viral RNA, a viral RNA polymerase, and many copies of the viral nucleoprotein (NP). Current evolutionary relationships within the NSV phylum are based on alignment of conserved RNA-directed RNA polymerase (RdRp) domain amino acid sequences. However, the RdRp domain-based phylogeny does not address whether NP, the other core protein in the NSV genome, evolved along the same trajectory or whether several RdRp-NP pairs evolved through convergent evolution in the segmented and non-segmented NSV genome architectures. Addressing how NP and the RdRp domain evolved may help us better understand NSV diversity. Since NP sequences are too short to infer robust phylogenetic relationships, we here used experimentally obtained and AlphaFold 2.0-predicted NP structures to probe whether evolutionary relationships can be estimated using NSV NP sequences. Following flexible structure alignments of modeled structures, we find that the structural homology of the NSV NPs reveals phylogenetic clusters that are consistent with RdRp-based clustering. In addition, we were able to assign viruses for which RdRp sequences are currently missing to phylogenetic clusters based on the available NP sequence. Both our RdRp-based and NP-based relationships deviate from the current NSV classification of the segmented Naedrevirales, which cluster with the other segmented NSVs in our analysis. Overall, our results suggest that the NSV RdRp and NP genes largely evolved along similar trajectories and that even short pieces of genetic, protein-coding information can be used to infer evolutionary relationships, potentially making metagenomic analyses more valuable.
Affiliation(s)
- Kimberly R. Sabsay
  - Lewis Thomas Laboratory, Department of Molecular Biology, Princeton University, Princeton, NJ 08544, United States
  - Lewis Sigler Institute, Princeton University, Princeton, NJ 08544, United States
- Aartjan J.W. te Velthuis
  - Lewis Thomas Laboratory, Department of Molecular Biology, Princeton University, Princeton, NJ 08544, United States
|
7
|
Wu C, Guo D. Identification of Two Flip-Over Genes in Grass Family as Potential Signature of C4 Photosynthesis Evolution. Int J Mol Sci 2023; 24:14165. [PMID: 37762466 PMCID: PMC10531853 DOI: 10.3390/ijms241814165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2023] [Revised: 09/05/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023]
Abstract
In flowering plants, C4 photosynthesis is superior to the C3 type in carbon fixation efficiency and adaptation to extreme environmental conditions, but the mechanisms behind the assembly of C4 machinery remain elusive. This study attempts to dissect the evolutionary divergence from C3 to C4 photosynthesis in five photosynthetic model plants from the grass family, using a combination of comparative transcriptomics and deep learning. By examining and comparing gene expression levels in bundle sheath and mesophyll cells of five model plants, we identified 16 differentially expressed signature genes showing cell-specific expression patterns in C3 and C4 plants. Among them, two showed distinctively opposite cell-specific expression patterns in C3 vs. C4 plants (named flip-over genes, FOGs). The in silico physicochemical analysis of the two FOGs illustrated that C3 homologous proteins of LHCA6 had low and stable pI values of ~6, while the pI values of LHCA6 homologs increased drastically in C4 plants Setaria viridis (7), Zea mays (8), and Sorghum bicolor (over 9), suggesting this protein may have different functions in C3 and C4 plants. Interestingly, based on pairwise protein sequence/structure similarities between each homologous FOG protein, one FOG, PGRL1A, showed local inconsistency between sequence similarity and structure similarity. To find more examples of the evolutionary characteristics of FOG proteins, we investigated the protein sequence/structure similarities of other FOGs (transcription factors) and found that FOG proteins have diversified incompatibility between sequence and structure similarities during grass family evolution. This raised an interesting question as to whether the sequence similarity is related to structure similarity during C4 photosynthesis evolution.
Affiliation(s)
- Dianjing Guo
  - State Key Laboratory of Agrobiotechnology, School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR, China
|
8
|
Rappoport D, Jinich A. Enzyme Substrate Prediction from Three-Dimensional Feature Representations Using Space-Filling Curves. J Chem Inf Model 2023; 63:1637-1648. [PMID: 36802628 DOI: 10.1021/acs.jcim.3c00005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]
Abstract
Compact and interpretable structural feature representations are required for accurately predicting properties and function of proteins. In this work, we construct and evaluate three-dimensional feature representations of protein structures based on space-filling curves (SFCs). We focus on the problem of enzyme substrate prediction, using two ubiquitous enzyme families as case studies: the short-chain dehydrogenase/reductases (SDRs) and the S-adenosylmethionine-dependent methyltransferases (SAM-MTases). Space-filling curves such as the Hilbert curve and the Morton curve generate a reversible mapping from discretized three-dimensional to one-dimensional representations and thus help to encode three-dimensional molecular structures in a system-independent way and with only a few adjustable parameters. Using three-dimensional structures of SDRs and SAM-MTases generated using AlphaFold2, we assess the performance of the SFC-based feature representations in predictions on a new benchmark database of enzyme classification tasks including their cofactor and substrate selectivity. Gradient-boosted tree classifiers yield binary prediction accuracy of 0.77-0.91 and area under curve (AUC) characteristics of 0.83-0.92 for the classification tasks. We investigate the effects of amino acid encoding, spatial orientation, and (the few) parameters of SFC-based encodings on the accuracy of the predictions. Our results suggest that geometry-based approaches such as SFCs are promising for generating protein structural representations and are complementary to the existing protein feature representations such as evolutionary scale modeling (ESM) sequence embeddings.
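The reversible mapping from a discretized three-dimensional grid to a one-dimensional index mentioned above can be sketched for the Morton (Z-order) curve by interleaving the bits of the voxel coordinates. This is a minimal illustration, not the authors' implementation: the function names, the 10-bit grid resolution, and the choice of x as the lowest interleaved bit are all assumptions, and the Hilbert curve is not shown.

```python
from typing import Iterable, List, Tuple


def _spread_bits(value: int, bits: int) -> int:
    """Move bit i of `value` to position 3*i, leaving two zero bits in between."""
    out = 0
    for i in range(bits):
        out |= ((value >> i) & 1) << (3 * i)
    return out


def _compact_bits(code: int, bits: int) -> int:
    """Inverse of _spread_bits: collect every third bit of `code`."""
    out = 0
    for i in range(bits):
        out |= ((code >> (3 * i)) & 1) << i
    return out


def morton_encode(x: int, y: int, z: int, bits: int = 10) -> int:
    """Map integer voxel coordinates to a single Morton index."""
    if not all(0 <= c < (1 << bits) for c in (x, y, z)):
        raise ValueError("coordinates must fit in the chosen number of bits")
    return (_spread_bits(x, bits)
            | (_spread_bits(y, bits) << 1)
            | (_spread_bits(z, bits) << 2))


def morton_decode(code: int, bits: int = 10) -> Tuple[int, int, int]:
    """Recover the voxel coordinates from a Morton index."""
    return (_compact_bits(code, bits),
            _compact_bits(code >> 1, bits),
            _compact_bits(code >> 2, bits))


def flatten(voxels: Iterable[Tuple[int, int, int]], bits: int = 10) -> List[int]:
    """Order occupied voxels along the curve, giving a 1D representation."""
    return sorted(morton_encode(x, y, z, bits) for x, y, z in voxels)


print(morton_encode(5, 9, 3))
print(morton_decode(morton_encode(5, 9, 3)))
print(flatten([(1, 1, 1), (0, 0, 0), (1, 0, 0)]))
```

Because the encoding is a bijection on the grid, voxels that are close along the curve tend to be close in space, which is what makes the one-dimensional ordering usable as a structural feature.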
Affiliation(s)
- Dmitrij Rappoport
  - Department of Chemistry, University of California, Irvine, 1102 Natural Sciences 2, Irvine, California 92697, United States
- Adrian Jinich
  - Weill Cornell Medicine, 1300 York Avenue, Box 65, New York, New York 10065, United States
|
9
|
In Search of a Dynamical Vocabulary: A Pipeline to Construct a Basis of Shared Traits in Large-Scale Motions of Proteins. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12147157] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/04/2023]
Abstract
The paradigmatic sequence–structure–dynamics–function relation in proteins is currently well established in the scientific community; in particular, a large effort has been made to probe the first connection, indeed providing convincing evidence of its strength and rationalizing it in a quantitative and general framework. In contrast, however, the role of dynamics as a link between structure and function has eluded a similarly clear-cut verification and description. In this work, we propose a pipeline aimed at building a basis for the quantitative characterization of the large-scale dynamics of a set of proteins, starting from the sole knowledge of their native structures. The method hinges on a dynamics-based clusterization, which allows a straightforward comparison with structural and functional protein classifications. The resulting basis set, obtained through the application to a group of related proteins, is shown to reproduce the salient large-scale dynamical features of the dataset. Most interestingly, the basis set is shown to encode the fluctuation patterns of homologous proteins not belonging to the initial dataset, thus highlighting the general applicability of the pipeline used to build it.
|
10
|
Toopaang W, Bunnak W, Srisuksam C, Wattananukit W, Tanticharoen M, Yang YL, Amnuaykanjanasin A. Microbial polyketides and their roles in insect virulence: from genomics to biological functions. Nat Prod Rep 2022; 39:2008-2029. [PMID: 35822627 DOI: 10.1039/d1np00058f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
Covering: May 1966 up to January 2022
Entomopathogenic microorganisms have potential for biological control of insect pests. Their main secondary metabolites include polyketides, nonribosomal peptides, and polyketide-nonribosomal peptide (PK-NRP) hybrids. Among these secondary metabolites, polyketides have mainly been studied for structural identification, pathway engineering, and for their contributions to medicine. However, little is known about the function of polyketides in insect virulence. This review focuses on the role of bacterial and fungal polyketides, as well as PK-NRP hybrids, in insect infection and killing. We also discuss gene distribution and evolutionary relationships among different microbial species. Further, the role of microbial polyketides and the hybrids in modulating insect-microbial symbiosis is also explored. Understanding the mechanisms of polyketides in insect pathogenesis, how compounds moderate the host-fungus interaction, and the distribution of PKS genes across different fungi and bacteria will facilitate the discovery and development of novel polyketide-derived bio-insecticides.
Affiliation(s)
- Wachiraporn Toopaang
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, 113 Thailand Science Park, Phahonyothin Rd., Khlong Nueng, Amphoe Khlong Luang, Pathum Thani 12120, Thailand.
  - Molecular and Biological Agricultural Sciences, Taiwan International Graduate Program, Academia Sinica and National Chung Hsing University, Taiwan.
  - Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan.
- Warapon Bunnak
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, 113 Thailand Science Park, Phahonyothin Rd., Khlong Nueng, Amphoe Khlong Luang, Pathum Thani 12120, Thailand.
- Chettida Srisuksam
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, 113 Thailand Science Park, Phahonyothin Rd., Khlong Nueng, Amphoe Khlong Luang, Pathum Thani 12120, Thailand.
- Wilawan Wattananukit
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, 113 Thailand Science Park, Phahonyothin Rd., Khlong Nueng, Amphoe Khlong Luang, Pathum Thani 12120, Thailand.
- Morakot Tanticharoen
  - School of Bioresources and Technology, King Mongkut's University of Technology Thonburi, Bangkok 10140, Thailand
- Yu-Liang Yang
  - Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan.
  - Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan 711010, Taiwan
- Alongkorn Amnuaykanjanasin
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, 113 Thailand Science Park, Phahonyothin Rd., Khlong Nueng, Amphoe Khlong Luang, Pathum Thani 12120, Thailand.
|
11
|
George MW, Abreu BL, Boufroura H, Moore JC, Poliakoff M. Telescoped Continuous Flow Synthesis of 2-Substituted 1,4-Benzoquinones via Oxidative Dearomatisation of para-Substituted Phenols Using Singlet Oxygen in Supercritical CO2. SYNTHESIS-STUTTGART 2022. [DOI: 10.1055/s-0041-1737413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
This paper describes a continuous multi-step synthesis in supercritical CO2. A continuous flow synthesis of 2-substituted 1,4-benzoquinones is reported, and details of the high-pressure reactors are given. This proceeds via the telescoped dearomatisation of p-substituted phenols using singlet oxygen in supercritical CO2 and an acid-mediated C–C migration. The process has a short residence time of 30 minutes, with overall yields and projected productivities of up to 83% and 9 g/day, respectively. This methodology enables a safe and efficient synthesis of 2-substituted 1,4-benzoquinones from photo-generated singlet oxygen and cheap, readily available p-substituted phenols. The procedure has high atom efficiency and low photocatalyst loading, and it replaces potentially hazardous and corrosive reagents and solvents with molecular oxygen, CO2, and the less hazardous solid-supported acid Amberlyst-15.
|
12
|
Zhao W, Luo S, Wu H, Jiang X, He T, Hu X. A multi-label learning framework for predicting antibiotic resistance genes via dual-view modeling. Brief Bioinform 2022; 23:6546259. [PMID: 35272349 DOI: 10.1093/bib/bbac052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2021] [Revised: 01/27/2022] [Accepted: 01/31/2022] [Indexed: 11/13/2022]
Abstract
The increasing prevalence of antibiotic resistance has become a global health crisis. For the purpose of safety regulation, it is of high importance to identify antibiotic resistance genes (ARGs) in bacteria. Although culture-based methods can identify ARGs relatively more accurately, the identifying process is time-consuming and specialized knowledge is required. With the rapid development of whole genome sequencing technology, researchers attempt to identify ARGs by computing sequence similarity from public databases. However, these computational methods might fail to detect ARGs due to the low sequence identity to known ARGs. Moreover, existing methods cannot effectively address the issue of multidrug resistance prediction for ARGs, which is a great challenge to clinical treatments. To address the challenges, we propose an end-to-end multi-label learning framework for predicting ARGs. More specifically, the task of ARGs prediction is modeled as a problem of multi-label learning, and a deep neural network-based end-to-end framework is proposed, in which a specific loss function is introduced to employ the advantage of multi-label learning for ARGs prediction. In addition, a dual-view modeling mechanism is employed to make full use of the semantic associations among two views of ARGs, i.e. sequence-based information and structure-based information. Extensive experiments are conducted on publicly available data, and experimental results demonstrate the effectiveness of the proposed framework on the task of ARGs prediction.
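Treating ARG prediction as multi-label learning means one gene can carry several resistance labels at once. The abstract does not specify its loss function, so the sketch below falls back on the generic per-label sigmoid with binary cross-entropy as an illustration; the function names, the 0.5 threshold, and the drug-class names are hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def multilabel_bce(logits: Sequence[float], targets: Sequence[int]) -> float:
    """Mean binary cross-entropy over independent labels.

    Each label gets its own sigmoid, so several labels can be
    predicted as present for the same input (unlike softmax).
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1.0 - p + eps)
    return total / len(logits)


def predict_labels(logits: Sequence[float], names: Sequence[str],
                   threshold: float = 0.5) -> List[str]:
    """Return every label whose predicted probability reaches the threshold."""
    return [n for z, n in zip(logits, names) if sigmoid(z) >= threshold]


# Hypothetical drug classes and model outputs for a single gene.
classes = ["beta-lactam", "tetracycline", "aminoglycoside"]
scores = [2.0, -1.0, 0.5]
print(predict_labels(scores, classes))
print(multilabel_bce(scores, [1, 0, 1]))
```

A dual-view model would produce the logits from combined sequence and structure features; only the output layer and loss are illustrated here.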
Affiliation(s)
- Weizhong Zhao
  - School of Computer, Central China Normal University, Wuhan, Hubei, 430079, PR China
- Shujie Luo
  - School of Computer, Central China Normal University, Wuhan, Hubei, 430079, PR China
- Haifang Wu
  - School of Computer, Central China Normal University, Wuhan, Hubei, 430079, PR China
- Xingpeng Jiang
  - School of Computer, Central China Normal University, Wuhan, Hubei, 430079, PR China
- Tingting He
  - School of Computer, Central China Normal University, Wuhan, Hubei, 430079, PR China
- Xiaohua Hu
  - College of Computing & Informatics, Drexel University, Philadelphia, PA 19104, USA
|
13
|
Wei L, Ye X, Xue Y, Sakurai T, Wei L. ATSE: a peptide toxicity predictor by exploiting structural and evolutionary information based on graph neural network and attention mechanism. Brief Bioinform 2021; 22:6209691. [PMID: 33822870 DOI: 10.1093/bib/bbab041] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 12/17/2020] [Revised: 01/11/2021] [Accepted: 01/28/2021] [Indexed: 12/13/2022]
Abstract
MOTIVATION: Peptides have recently emerged as promising therapeutic agents against various diseases. For both research and safety regulation purposes, it is of high importance to develop computational methods to accurately predict the potential toxicity of peptides within the vast number of candidate peptides.
RESULTS: In this study, we proposed ATSE, a peptide toxicity predictor that exploits structural and evolutionary information based on graph neural networks and an attention mechanism. More specifically, it consists of four modules: (i) a sequence processing module for converting peptide sequences to molecular graphs and evolutionary profiles, (ii) a feature extraction module designed to learn discriminative features from graph structural information and evolutionary information, (iii) an attention module employed to optimize the features and (iv) an output module determining a peptide as toxic or non-toxic, using optimized features from the attention module.
CONCLUSION: Comparative studies demonstrate that the proposed ATSE significantly outperforms all other competing methods. We found that structural information is complementary to the evolutionary information, effectively improving the predictive performance. Importantly, the data-driven features learned by ATSE can be interpreted and visualized, providing additional information for further analysis. Moreover, we present a user-friendly online computational platform that implements the proposed ATSE, which is now available at http://server.malab.cn/ATSE. We expect that it can be a powerful and useful tool for interested researchers.
Affiliation(s)
- Lesong Wei
  - Department of Computer Science, University of Tsukuba, Tsukuba, Japan, 3058577
- Xiucai Ye
  - Department of Computer Science, University of Tsukuba, Tsukuba, Japan, 3058577
- Yuyang Xue
  - Department of Computer Science, University of Tsukuba, Tsukuba, Japan, 3058577
- Tetsuya Sakurai
  - Department of Computer Science, University of Tsukuba, Tsukuba, Japan, 3058577
- Leyi Wei
  - School of Software, Shandong University, Jinan, China
|
14
|
Abstract
We use a bioinformatic description of amino acid dynamic properties, based on residue-specific average B factors, to construct a dynamics-based, large-scale description of a space of protein sequences. We examine the relationship between that space and an independently constructed, structure-based space comprising the same sequences. It is demonstrated that structure and dynamics are only moderately correlated. It is further shown that helical proteins fall into two classes with very different structure-dynamics relationships. We suggest that dynamics in the two helical classes are dominated by distinctly different modes: pseudo-one-dimensional, localized helical modes in one case, and pseudo-three-dimensional (3D) global modes in the other. Sheet/barrel and mixed-α/β proteins exhibit more conventional structure-dynamics relationships. It is found that the strongest correlation between structure and dynamic properties arises when the latter are represented by the sequence average of the dynamic index, which corresponds physically to the overall mobility of the protein. None of these results are accessible to bioinformatic methods hitherto available.
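The sequence average of a per-residue dynamic index, which the abstract ties to overall mobility, can be sketched as below. This is only an illustration under stated assumptions: the table `TOY_DYNAMIC_INDEX`, the function names, and the two example sequences are hypothetical, and the numbers are arbitrary placeholders rather than the residue-specific average B factors used in the paper.

```python
from statistics import mean

# Hypothetical per-residue dynamic index values. These are arbitrary
# placeholders for illustration, NOT published average B factors.
TOY_DYNAMIC_INDEX = {"A": 0.40, "G": 0.60, "L": 0.30, "K": 0.70, "S": 0.55}


def dynamic_profile(sequence, index_table):
    """Map each residue to its dynamic index (KeyError on unknown residues)."""
    return [index_table[residue] for residue in sequence.upper()]


def sequence_average_index(sequence, index_table):
    """Average the per-residue index over the whole sequence."""
    return mean(dynamic_profile(sequence, index_table))


# Two made-up sequences: one built from low-index residues, one from high.
rigid_like = sequence_average_index("ALLAL", TOY_DYNAMIC_INDEX)
mobile_like = sequence_average_index("GKSKG", TOY_DYNAMIC_INDEX)
```

With a real index table, the per-residue profile would describe a sequence in the dynamics-based space, while the single averaged value is the coarse "overall mobility" summary mentioned in the abstract.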
|
15
|
Paul S, Ainavarapu SRK, Venkatramani R. Variance of Atomic Coordinates as a Dynamical Metric to Distinguish Proteins and Protein-Protein Interactions in Molecular Dynamics Simulations. J Phys Chem B 2020; 124:4247-4262. [PMID: 32281802 DOI: 10.1021/acs.jpcb.0c01191] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/05/2023]
Abstract
Protein dynamics is a manifestation of the complex trajectories of these biomolecules on a multidimensional rugged potential energy surface (PES) driven by thermal energy. At present, computational methods such as atomistic molecular dynamics (MD) simulations can describe thermal protein conformational changes in fully solvated environments over millisecond timescales. Despite these advances, a quantitative assessment of protein dynamics remains a complicated topic, intricately linked to issues such as sampling convergence and the identification of appropriate reaction coordinates/structural features to describe protein conformational states and motions. Here, we present the cumulative variance of atomic coordinate fluctuations (CVCF) along trajectories as an intuitive PES-sensitive metric to assess both the extent of sampling and protein dynamics captured in MD simulations. We first examine the sampling problem in model one- (1D) and two-dimensional (2D) PES to demonstrate that the CVCF when traced as a function of the sampling variable (time in MD simulations) can identify local and global equilibria. Further, even far from global equilibrium, a situation representative of standard MD trajectories of proteins, the CVCF can distinguish different PES and therefore resolve the resultant protein dynamics. We demonstrate the utility of our CVCF analysis by applying it to distinguish the dynamics of structurally homologous proteins from the ubiquitin family (ubiquitin, SUMO1, SUMO2) and ubiquitin protein-protein interactions. Our CVCF analysis reveals that differential side-chain dynamics from the structured part of the protein (the conserved β-grasp fold) present distinct protein PES to distinguish ubiquitin from SUMO isoforms. Upon binding to two functionally distinct protein partners (UBCH5A and UEV), intrinsic ubiquitin dynamics changes to reflect the binding context even though the two proteins have similar binding modes, which lead to negligible (sub-angstrom scale) structural changes.
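A cumulative-variance trace of the kind described above might be sketched as follows. The abstract does not give the exact formula, so this assumes one plausible reading: at each frame t, sum over all coordinates the population variance computed from frames 0 to t. The function names and the Gaussian toy trajectories are hypothetical stand-ins for real MD data, not the authors' implementation.

```python
import random


def cumulative_variance_trace(frames):
    """For each frame t, return the population variance of every coordinate
    over frames 0..t, summed across coordinates.

    `frames` is a list of equal-length lists of floats (flattened atomic
    coordinates). Uses Welford's running update, so it makes a single pass.
    """
    n_coords = len(frames[0])
    mean = [0.0] * n_coords
    m2 = [0.0] * n_coords  # running sum of squared deviations per coordinate
    trace = []
    for count, frame in enumerate(frames, start=1):
        for i, x in enumerate(frame):
            delta = x - mean[i]
            mean[i] += delta / count
            m2[i] += delta * (x - mean[i])
        trace.append(sum(m2) / count)
    return trace


def toy_trajectory(n_frames, width, seed=0):
    """Hypothetical stand-in for an MD trajectory: one particle whose x, y, z
    coordinates fluctuate as independent Gaussians of standard deviation
    `width` (a narrow or wide harmonic well)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, width) for _ in range(3)] for _ in range(n_frames)]


narrow = cumulative_variance_trace(toy_trajectory(2000, 0.5))
wide = cumulative_variance_trace(toy_trajectory(2000, 2.0))
```

Under this reading, a trace that levels off suggests that the sampled region has been explored, and the plateau height differs between the narrow and wide wells, which mirrors the abstract's claim that the metric is sensitive to the underlying energy surface.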
Affiliation(s)
- Sanjoy Paul
  - Department of Chemical Sciences, Tata Institute of Fundamental Research, Dr. Homi Bhabha Road, Colaba, Mumbai 400005, Maharashtra, India
- Sri Rama Koti Ainavarapu
  - Department of Chemical Sciences, Tata Institute of Fundamental Research, Dr. Homi Bhabha Road, Colaba, Mumbai 400005, Maharashtra, India
- Ravindra Venkatramani
  - Department of Chemical Sciences, Tata Institute of Fundamental Research, Dr. Homi Bhabha Road, Colaba, Mumbai 400005, Maharashtra, India
|
16
|
Beyond Supersecondary Structure: Physics-Based Sequence Alignment. Methods Mol Biol 2019. [PMID: 30945228 DOI: 10.1007/978-1-4939-9161-7_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Traditional approaches to sequence alignment are based on evolutionary ideas. As a result, they are prebiased toward results which are in accord with initial expectations. We present here a method of sequence alignment which is based entirely on the physical properties of the amino acids. This approach has no inherent bias, eliminates much of the computational complexity associated with methods currently in use, and has been shown to give good results for structures which were poorly predicted by traditional methods in recent CASP competitions and to identify sequence differences which correlate with structural and dynamic differences not detectable by traditional methods.
|
17
|
Sun Z, Liu Q, Qu G, Feng Y, Reetz MT. Utility of B-Factors in Protein Science: Interpreting Rigidity, Flexibility, and Internal Motion and Engineering Thermostability. Chem Rev 2019; 119:1626-1665. [PMID: 30698416 DOI: 10.1021/acs.chemrev.8b00290] [Citation(s) in RCA: 313] [Impact Index Per Article: 52.2] [Indexed: 12/24/2022]
Affiliation(s)
- Zhoutong Sun
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 32 West Seventh Avenue, Tianjin Airport Economic Area, Tianjin 300308, China
- Qian Liu
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Ge Qu
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 32 West Seventh Avenue, Tianjin Airport Economic Area, Tianjin 300308, China
- Yan Feng
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Manfred T. Reetz
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 32 West Seventh Avenue, Tianjin Airport Economic Area, Tianjin 300308, China
  - Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
  - Chemistry Department, Philipps-University, Hans-Meerwein-Strasse 4, 35032 Marburg, Germany
|
18
|
Ng ML, Rahmat ZB, Bin Omar MSS. Molecular Modeling and Simulation of Transketolase from Orthosiphon stamineus. Curr Comput Aided Drug Des 2018; 15:308-317. [PMID: 30345923 DOI: 10.2174/1573409914666181022141753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2018] [Revised: 08/21/2018] [Accepted: 10/17/2018] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Orthosiphon stamineus is a traditional medicinal plant in Southeast Asian countries with various well-known pharmacological activities, such as antidiabetic, diuretic and antitumor activities. Transketolase is one of the proteins identified in the leaves of the plant, and it is believed to lower blood sugar levels in humans through a non-pancreatic mechanism. To understand the behavioral properties of the protein, a 3D model of transketolase and an analysis of its structure are of obvious interest. METHODS: In the present study, a 3D model of transketolase was constructed and its atomic characteristics were revealed. In addition, molecular dynamics simulations of the protein at 310 K and 368 K indicated that transketolase may be a thermophilic protein, as the structure does not distort even at elevated temperature. The structures simulated at 310 K and 368 K were also resimulated in a 310 K environment. RESULTS: The results revealed that the protein is stable under all conditions, which suggests that it has a high capacity to adapt to different environments, not only at high temperature but also on returning from high to low temperature, with the structure remaining unchanged while retaining protein function. CONCLUSION: The thermostability of transketolase is beneficial for the pharmaceutical industry, as most drug-making processes run at high temperature.
Affiliation(s)
- Mei Ling Ng
  - Faculty of Biosciences and Medical Engineering, Universiti Teknologi Malaysia, 81310 (Skudai), Johor, Malaysia
- Zaidah Binti Rahmat
  - Faculty of Biosciences and Medical Engineering, Universiti Teknologi Malaysia, 81310 (Skudai), Johor, Malaysia
|
19
|
Cao K, Li N, Wang H, Cao X, He J, Zhang B, He QY, Zhang G, Sun X. Two zinc-binding domains in the transporter AdcA from Streptococcus pyogenes facilitate high-affinity binding and fast transport of zinc. J Biol Chem 2018; 293:6075-6089. [PMID: 29491141 DOI: 10.1074/jbc.m117.818997] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 09/21/2017] [Revised: 02/25/2018] [Indexed: 11/06/2022] Open
Abstract
Zinc is an essential metal in bacteria. One important bacterial zinc transporter is AdcA, and most bacteria possess AdcA homologs that are single-domain small proteins due to better efficiency of protein biogenesis. However, a double-domain AdcA with two zinc-binding sites is significantly overrepresented in Streptococcus species, many of which are major human pathogens. Using molecular simulation and experimental validations of AdcA from Streptococcus pyogenes, we found here that the two AdcA domains sequentially stabilize the structure upon zinc binding, indicating an organization required for both increased zinc affinity and transfer speed. This structural organization appears to endow Streptococcus species with distinct advantages in zinc-depleted environments, which would not be achieved by each single AdcA domain alone. This enhanced zinc transport mechanism sheds light on the significance of the evolution of the AdcA domain fusion, provides new insights into double-domain transporter proteins with two binding sites for the same ion, and indicates a potential target of antimicrobial drugs against pathogenic Streptococcus species.
Affiliation(s)
- Kun Cao
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, 601 Huang-Pu Avenue West, Guangzhou 510632, China
- Nan Li
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, 601 Huang-Pu Avenue West, Guangzhou 510632, China
- Hongcui Wang
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, 601 Huang-Pu Avenue West, Guangzhou 510632, China
- Xin Cao
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, 601 Huang-Pu Avenue West, Guangzhou 510632, China
- Jiaojiao He
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, 601 Huang-Pu Avenue West, Guangzhou 510632, China
- Bing Zhang
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, 601 Huang-Pu Avenue West, Guangzhou 510632, China
- Qing-Yu He
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, 601 Huang-Pu Avenue West, Guangzhou 510632, China
- Gong Zhang
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, 601 Huang-Pu Avenue West, Guangzhou 510632, China
- Xuesong Sun
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, 601 Huang-Pu Avenue West, Guangzhou 510632, China
|