1. Nakatsu G, Ko D, Michaud M, Franzosa EA, Morgan XC, Huttenhower C, Garrett WS. Virulence factor discovery identifies associations between the Fic gene family and Fap2+ fusobacteria in colorectal cancer microbiomes. mBio 2025; 16:e0373224. [PMID: 39807864 PMCID: PMC11796403 DOI: 10.1128/mbio.03732-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2024] [Accepted: 12/13/2024] [Indexed: 01/16/2025]
Abstract
Fusobacterium is a bacterium associated with colorectal cancer (CRC) tumorigenesis, progression, and metastasis. Fap2 is a fusobacteria-specific outer membrane galactose-binding lectin that mediates Fusobacterium adherence to and invasion of CRC tumors. Advances in omics analyses provide an opportunity to profile and identify microbial genomic features that correlate with the cancer-associated bacterial virulence factor Fap2. Here, we analyze genomes of Fusobacterium colon tumor isolates and find that a family of post-translational modification enzymes containing Fic domains is associated with Fap2 positivity in these strains. We demonstrate that Fic family genes expand with the presence of Fap2 in the fusobacterial pangenome. Through comparative genomic analysis, we find that Fap2+ Fusobacteriota are highly enriched with Fic gene families compared to other cancer-associated and human gut microbiome bacterial taxa. Using a global data set of CRC shotgun metagenomes, we show that fusobacterial Fic and Fap2 genes frequently co-occur in the fecal microbiomes of individuals with late-stage CRC. We further characterize specific Fic gene families harbored by Fap2+ Fusobacterium animalis genomes and detect recombination events and elements of horizontal gene transfer via synteny analysis of Fic gene loci. Exposure of an F. animalis strain to a colon adenocarcinoma cell line increases gene expression of fusobacterial Fic and virulence-associated adhesins. Finally, we demonstrate that Fic proteins are synthesized by F. animalis as Fic peptides are detectable in F. animalis monoculture supernatants. Taken together, our study uncovers Fic genes as potential virulence factors in Fap2+ fusobacterial genomes.
IMPORTANCE Accumulating data support that bacterial members of the intra-tumoral microbiota critically influence colorectal cancer progression. Yet, relatively little is known about non-adhesin fusobacterial virulence factors that may influence carcinogenesis. Our genomic analysis and expression assays in fusobacteria identify Fic domain-containing genes, well-studied virulence factors in pathogenic bacteria, as potential fusobacterial virulence features. The Fic family proteins that we find are encoded by fusobacteria and expressed by Fusobacterium animalis merit future investigation to assess their roles in colorectal cancer development and progression.
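The co-occurrence claim in this abstract can be made concrete with a small sketch. The abstract does not say how co-occurrence was quantified, so the Jaccard index used here is an assumption chosen for illustration, and the sample names and gene calls are hypothetical, not data from the study.

```python
def cooccurrence(samples, gene_a, gene_b):
    """Return (n_both, n_either, jaccard) for two genes across samples.

    `samples` maps a sample ID to the set of gene labels detected in it.
    """
    both = sum(1 for genes in samples.values()
               if gene_a in genes and gene_b in genes)
    either = sum(1 for genes in samples.values()
                 if gene_a in genes or gene_b in genes)
    # Jaccard index: shared detections over all detections of either gene.
    jaccard = both / either if either else 0.0
    return both, either, jaccard


# Hypothetical presence/absence calls per metagenome (illustration only).
samples = {
    "S1": {"fic", "fap2"},
    "S2": {"fic"},
    "S3": {"fic", "fap2"},
    "S4": set(),
    "S5": {"fap2"},
}

both, either, jaccard = cooccurrence(samples, "fic", "fap2")
print(both, either, jaccard)
```

A value near 1 would mean the two genes are almost always detected together; a real analysis would also need to account for sequencing depth and detection thresholds.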
Affiliation(s)
- Geicho Nakatsu
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Harvard T.H. Chan Microbiome in Public Health Center, Boston, Massachusetts, USA
- Duhyun Ko
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Harvard T.H. Chan Microbiome in Public Health Center, Boston, Massachusetts, USA
- Monia Michaud
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Harvard T.H. Chan Microbiome in Public Health Center, Boston, Massachusetts, USA
- Eric A. Franzosa
  - Harvard T.H. Chan Microbiome in Public Health Center, Boston, Massachusetts, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Xochitl C. Morgan
  - Harvard T.H. Chan Microbiome in Public Health Center, Boston, Massachusetts, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Curtis Huttenhower
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Harvard T.H. Chan Microbiome in Public Health Center, Boston, Massachusetts, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Wendy S. Garrett
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Harvard T.H. Chan Microbiome in Public Health Center, Boston, Massachusetts, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
2. Mall R, Singh A, Patel CN, Guirimand G, Castiglione F. VISH-Pred: an ensemble of fine-tuned ESM models for protein toxicity prediction. Brief Bioinform 2024; 25:bbae270. [PMID: 38842509 PMCID: PMC11154842 DOI: 10.1093/bib/bbae270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 05/06/2024] [Accepted: 05/23/2024] [Indexed: 06/07/2024]
Abstract
Peptide- and protein-based therapeutics are becoming a promising treatment regimen for myriad diseases. Toxicity of proteins is the primary hurdle for protein-based therapies. Thus, there is an urgent need for accurate in silico methods for determining toxic proteins to filter the pool of potential candidates. At the same time, it is imperative to precisely identify non-toxic proteins to expand the possibilities for protein-based biologics. To address this challenge, we proposed an ensemble framework, called VISH-Pred, comprising models built by fine-tuning ESM2 transformer models on a large, experimentally validated, curated dataset of protein and peptide toxicities. The primary steps in the VISH-Pred framework are to efficiently estimate protein toxicities taking just the protein sequence as input, employing an undersampling technique to handle the severe class imbalance in the data and learning representations from fine-tuned ESM2 protein language models, which are then fed to machine learning techniques such as LightGBM and XGBoost. The VISH-Pred framework is able to correctly identify both peptides/proteins with potential toxicity and non-toxic proteins, achieving a Matthews correlation coefficient of 0.737, 0.716 and 0.322 and F1-score of 0.759, 0.696 and 0.713 on three non-redundant blind tests, respectively, outperforming other methods by over 10% on these quality metrics. Moreover, VISH-Pred achieved the best accuracy and area under the receiver operating characteristic curve scores on these independent test sets, highlighting the robustness and generalization capability of the framework. By making VISH-Pred available as an easy-to-use web server, we expect it to serve as a valuable asset for future endeavors aimed at discerning the toxicity of peptides and enabling efficient protein-based therapeutics.
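For readers unfamiliar with the metrics quoted in this abstract, the following is a minimal sketch of how the Matthews correlation coefficient and F1-score are computed from confusion-matrix counts, plus a simple random undersampling step. The counts, the helper names, and the seeded sampler are assumptions for illustration; they are not taken from VISH-Pred.

```python
import math
import random


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def f1(tp, fp, fn):
    """F1-score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def undersample(majority, minority, seed=0):
    """Randomly drop majority-class items until both classes have equal size."""
    rng = random.Random(seed)
    k = min(len(majority), len(minority))
    return rng.sample(list(majority), k) + list(minority)


# Hypothetical, imbalanced test set: 50 toxic vs. 950 non-toxic proteins.
tp, fp, tn, fn = 40, 10, 940, 10
accuracy = (tp + tn) / (tp + fp + tn + fn)
print(accuracy, mcc(tp, fp, tn, fn), f1(tp, fp, fn))
```

On imbalanced data such as this, accuracy stays high even when a fifth of the toxic proteins are missed, which is one reason MCC and F1 are commonly reported alongside it.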
Affiliation(s)
- Raghvendra Mall
  - Biotechnology Research Center, Technology Innovation Institute, P.O. Box 9639, Abu Dhabi, United Arab Emirates
- Ankita Singh
  - Biotechnology Research Center, Technology Innovation Institute, P.O. Box 9639, Abu Dhabi, United Arab Emirates
- Chirag N Patel
  - Biotechnology Research Center, Technology Innovation Institute, P.O. Box 9639, Abu Dhabi, United Arab Emirates
- Gregory Guirimand
  - Biotechnology Research Center, Technology Innovation Institute, P.O. Box 9639, Abu Dhabi, United Arab Emirates
  - Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe, 657-8501, Japan
- Filippo Castiglione
  - Biotechnology Research Center, Technology Innovation Institute, P.O. Box 9639, Abu Dhabi, United Arab Emirates
  - Institute for Applied Computing, National Research Council of Italy, Via dei Taurini, 19, 00185, Rome, Italy
3. Yin CF, Nie Y, Li T, Zhou NY. AlmA involved in the long-chain n-alkane degradation pathway in Acinetobacter baylyi ADP1 is a Baeyer-Villiger monooxygenase. Appl Environ Microbiol 2024; 90:e0162523. [PMID: 38168668 PMCID: PMC10807437 DOI: 10.1128/aem.01625-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 11/22/2023] [Indexed: 01/05/2024]
Abstract
Many Acinetobacter species can grow on n-alkanes of varying lengths (≤C40). AlmA, a unique flavoprotein in these Acinetobacter strains, is the only enzyme proven to be required for the degradation of long-chain (LC) n-alkanes, including C32 and C36 alkanes. Although it is commonly presumed to be a terminal hydroxylase, its role in n-alkane degradation remains elusive. In this study, we conducted physiological, biochemical, and bioinformatics analyses of AlmA to determine its role in n-alkane degradation by Acinetobacter baylyi ADP1. Consistent with previous reports, gene deletion analysis showed that almA was vital for the degradation of LC n-alkanes (C26-C36). Additionally, enzymatic analysis revealed that AlmA catalyzed the conversion of aliphatic 2-ketones (C10-C16) to their corresponding esters, but it did not conduct n-alkane hydroxylation under the same conditions, thus suggesting that AlmA in strain ADP1 possesses Baeyer-Villiger monooxygenase (BVMO) activity. These results were further confirmed by bioinformatics analysis, which revealed that AlmA was closer to functionally identified BVMOs than to hydroxylases. Altogether, the results of our study suggest that LC n-alkane degradation by strain ADP1 possibly follows a novel subterminal oxidation pathway that is distinct from the terminal oxidation pathway followed for short-chain n-alkane degradation. Furthermore, our findings suggest that AlmA catalyzes the third reaction in the LC n-alkane degradation pathway.
IMPORTANCE Many microbial studies on n-alkane degradation are focused on the genes involved in short-chain n-alkane (≤C16) degradation; however, reports on the genes involved in long-chain (LC) n-alkane (>C20) degradation are limited. Thus far, only AlmA has been reported to be involved in LC n-alkane degradation by Acinetobacter spp.; however, its role in the n-alkane degradation pathway remains elusive. In this study, we conducted a detailed characterization of AlmA in A. baylyi ADP1 and found that AlmA exhibits Baeyer-Villiger monooxygenase activity, thus indicating the presence of a novel LC n-alkane biodegradation mechanism in strain ADP1.
Affiliation(s)
- Chao-Fan Yin
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, and School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Yong Nie
  - College of Engineering, Peking University, Beijing, China
- Tao Li
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, and School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Ning-Yi Zhou
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, and School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, China
4. Hassan SS, Shams R, Camps I, Basharat Z, Sohail S, Khan Y, Ullah A, Irfan M, Ali J, Bilal M, Morel CM. Subtractive sequence analysis aided druggable targets mining in Burkholderia cepacia complex and finding inhibitors through bioinformatics approach. Mol Divers 2023; 27:2823-2847. [PMID: 36567421 PMCID: PMC9790820 DOI: 10.1007/s11030-022-10584-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/04/2022] [Accepted: 12/05/2022] [Indexed: 12/27/2022]
Abstract
Burkholderia cepacia complex (BCC) is a group of gram-negative bacteria composed of at least 20 different species that cause diseases in plants, animals, and humans (cystic fibrosis and airway infection). Here, we analyzed the proteomic data of 47 BCC strains by classifying them into three groups. Phylogenetic analyses were performed, followed by individual core region identification for each group. Comparative analysis of the three individual core protein fractions resulted in 1766 orthologous proteins. Non-human homologous proteins from the core region gave 1680 proteins. Essential protein analyses reduced the target list to 37 proteins, which were further compared to a closely related out-group, the Burkholderia gladioli ATCC 10248 strain, resulting in 21 proteins. 3D structure modeling, validation, and druggability assessment gave six targets that were subjected to further target prioritization parameters, which ultimately resulted in two BCC targets. A library of 12,000 ZINC drug-like compounds was screened, where only the top hits were selected for docking orientations. These included ZINC01405842 (against chorismate synthase aroC) and ZINC06055530 (against bifunctional N-acetylglucosamine-1-phosphate uridyltransferase/glucosamine-1-phosphate acetyltransferase glmU). Finally, dynamics simulation (200 ns) was performed for each ligand-receptor complex, followed by ADMET profiling. The applicability of these proteins as drug targets has not yet been elucidated experimentally, which makes our predictions novel; further wet-lab experiments are suggested to test the identified BCC targets and the ZINC scaffolds proposed to inhibit them.
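The subtractive filtering described in this abstract is a funnel, and the arithmetic is easier to follow when the counts are lined up. The sketch below uses only the protein counts stated in the abstract; the step labels are paraphrases and the helper name `retention` is an assumption for illustration.

```python
# Protein counts reported at each stage of the subtractive pipeline.
funnel = [
    ("core orthologous proteins shared by the three groups", 1766),
    ("non-homologous to human proteins", 1680),
    ("essential proteins", 37),
    ("retained after comparison with the B. gladioli out-group", 21),
    ("modeled, validated, and druggable", 6),
    ("final prioritized targets", 2),
]


def retention(steps):
    """Return (label, count, fraction kept from the previous step) per filter."""
    rows = []
    for (_, previous), (label, current) in zip(steps, steps[1:]):
        rows.append((label, current, current / previous))
    return rows


for label, count, fraction in retention(funnel):
    print(f"{label}: {count} ({fraction:.1%} of previous step)")

# Fraction of the starting set that survives every filter.
overall = funnel[-1][1] / funnel[0][1]
print(f"overall: {overall:.2%}")
```

Laid out this way, the essentiality filter stands out as the most selective step, cutting 1680 proteins to 37.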
Affiliation(s)
- Syed Shah Hassan
  - Jamil–ur–Rehman Center for Genome Research, Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270 Pakistan
  - Centre for Technological Development in Health (CDTS), Oswaldo Cruz Foundation (Fiocruz), Building “Expansão”, 8th Floor Room 814, Av. Brasil 4036, Manguinhos, Rio de Janeiro, RJ 21040-361 Brazil
  - Department of Chemistry, Islamia College Peshawar, Peshawar, 25000 KP Pakistan
- Rida Shams
  - Department of Chemistry, Islamia College Peshawar, Peshawar, 25000 KP Pakistan
- Ihosvany Camps
  - Laboratório de Modelagem Computacional—LaModel, Instituto de Ciências Exatas—ICEx. Universidade Federal de Alfenas—UNIFAL-MG, Alfenas, Minas Gerais Brazil
  - High Performance & Quantum Computing Labs, Waterloo, Canada
- Zarrin Basharat
  - Jamil–ur–Rehman Center for Genome Research, Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270 Pakistan
- Saman Sohail
  - Department of Chemistry, Islamia College Peshawar, Peshawar, 25000 KP Pakistan
- Yasmin Khan
  - Jamil–ur–Rehman Center for Genome Research, Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270 Pakistan
- Asad Ullah
  - Department of Chemistry, Islamia College Peshawar, Peshawar, 25000 KP Pakistan
- Muhammad Irfan
  - Jamil–ur–Rehman Center for Genome Research, Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270 Pakistan
- Javed Ali
  - Department of Chemistry, Kohat University of Science & Technology–KUST, Kohat, KP Pakistan
- Muhammad Bilal
  - Department of Chemistry, Kohat University of Science & Technology–KUST, Kohat, KP Pakistan
- Carlos M. Morel
  - Centre for Technological Development in Health (CDTS), Oswaldo Cruz Foundation (Fiocruz), Building “Expansão”, 8th Floor Room 814, Av. Brasil 4036, Manguinhos, Rio de Janeiro, RJ 21040-361 Brazil
5. Beliaeva MA, Wilmanns M, Zimmermann M. Decipher enzymes from human microbiota for drug discovery and development. Curr Opin Struct Biol 2023; 80:102567. [PMID: 36963164 DOI: 10.1016/j.sbi.2023.102567] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2022] [Revised: 02/03/2023] [Accepted: 02/13/2023] [Indexed: 03/26/2023]
Abstract
The human microbiota plays an important role in human health and contributes to the metabolism of therapeutic drugs, affecting their potency. However, the current knowledge on human gut bacterial metabolism is limited and lacks an understanding of the underlying mechanisms of observed drug biotransformations. Despite the complexity of the gut microbial community, genomic and metagenomic sequencing provides insights into the diversity of chemical reactions that can be carried out by the microbiota and poses the new challenge of functionally annotating thousands of bacterial enzymes. Here, we outline methods to systematically address the structural and functional space of the human microbiome, highlighting a combination of in silico and in vitro approaches. Systematic knowledge about microbial enzymes could eventually be applied for personalized therapy, the development of prodrugs and modulators of unwanted bacterial activity, and the further discovery of new antibiotics.
Affiliation(s)
- Mariia A Beliaeva
  - European Molecular Biology Laboratory, Heidelberg, Germany; European Molecular Biology Laboratory, Hamburg Unit, Hamburg, Germany. https://twitter.com/@MariiaABeliaeva
- Matthias Wilmanns
  - European Molecular Biology Laboratory, Hamburg Unit, Hamburg, Germany. https://twitter.com/@WilmannsGroup
6. CSM-Toxin: A Web-Server for Predicting Protein Toxicity. Pharmaceutics 2023; 15:pharmaceutics15020431. [PMID: 36839752 PMCID: PMC9966851 DOI: 10.3390/pharmaceutics15020431] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 12/19/2022] [Revised: 01/17/2023] [Accepted: 01/18/2023] [Indexed: 01/31/2023]
Abstract
Biologics are one of the most rapidly expanding classes of therapeutics, but can be associated with a range of toxic properties. In small-molecule drug development, early identification of potential toxicity led to a significant reduction in clinical trial failures; however, we currently lack robust qualitative rules or predictive tools for peptide- and protein-based biologics. To address this, we have manually curated the largest set of high-quality experimental data on peptide and protein toxicities, and developed CSM-Toxin, a novel in silico protein toxicity classifier, which relies solely on the protein primary sequence. Our approach encodes the protein sequence information using a deep learning natural language model to understand "biological" language, where residues are treated as words and protein sequences as sentences. CSM-Toxin was able to accurately identify peptides and proteins with potential toxicity, achieving an MCC of up to 0.66 across both cross-validation and multiple non-redundant blind tests, outperforming other methods and highlighting the robust and generalisable performance of our model. We strongly believe CSM-Toxin will serve as a valuable platform to minimise potential toxicity in the biologic development pipeline. Our method is freely available as an easy-to-use webserver.
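The "residues as words, sequences as sentences" idea in this abstract can be illustrated with a minimal tokenizer. This is only a sketch of the encoding step under assumed conventions: the vocabulary, the ID assignment, and the reserved unknown token are hypothetical and are not CSM-Toxin's actual tokenizer or model.

```python
# The 20 standard amino acids, one letter each; IDs start at 1.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
VOCAB = {residue: index + 1 for index, residue in enumerate(AMINO_ACIDS)}
UNKNOWN_ID = 0  # assumed placeholder for non-standard or ambiguous residues


def tokenize(sequence):
    """Map a protein sequence (the 'sentence') to one integer ID per residue (the 'words')."""
    return [VOCAB.get(residue, UNKNOWN_ID) for residue in sequence.upper()]


print(tokenize("ACDX"))
```

A language model would then learn an embedding for each ID and read the sequence in order, much as it would read the words of a sentence.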
7.
Abstract
Microcins are a class of antimicrobial peptides produced by certain Gram-negative bacterial species to kill or inhibit the growth of competing bacteria. Only 10 unique, experimentally validated class II microcins have been identified, and the majority of these come from Escherichia coli. Although the current representation of microcins is sparse, they exhibit a diverse array of molecular functionalities, uptake mechanisms, and target specificities. This broad diversity from such a small representation suggests that microcins may have untapped potential for bioprospecting peptide antibiotics from genomic data sets. We used a systematic bioinformatics approach to search for verified and novel class II microcins in E. coli and other species within its family, Enterobacteriaceae. Nearly one-quarter of the E. coli genome assemblies contained one or more microcins, where the prevalence of hits to specific microcins varied by isolate phylogroup. E. coli isolates from human extraintestinal and poultry meat sources were enriched for microcins, while those from freshwater were depleted. Putative microcins were found in various abundances across all five distinct phylogenetic lineages of Enterobacteriaceae, with a particularly high prevalence in the "Klebsiella" clade. Representative genome assemblies from species across the Enterobacterales order, as well as a few outgroup species, also contained putative microcin sequences. This study suggests that microcins have a complicated evolutionary history, spanning far beyond our limited knowledge of the currently validated microcins. Efforts to functionally characterize these newly identified microcins have great potential to open a new field of peptide antibiotics and microbiome modulators and elucidate the ways in which bacteria compete with each other.
IMPORTANCE Class II microcins are small bacteriocins produced by strains of Gram-negative bacteria in the Enterobacteriaceae. They are generally understood to play a role in interbacterial competition, although direct evidence of this is limited, and they could prove informative in developing new peptide antibiotics. However, few examples of verified class II microcins exist, and novel microcins are difficult to identify due to their sequence diversity, making it complicated to study them as a group. Here, we overcome this limitation by developing a bioinformatics pipeline to detect microcins in silico. Using this pipeline, we demonstrate that both verified and novel class II microcins are widespread within and outside the Enterobacteriaceae, which has not been systematically shown previously. The observed prevalence of class II microcins suggests that they are ecologically important, and the elucidation of novel microcins provides a resource that can be used to expand our knowledge of the structure and function of microcins as antibacterials.
8. A novel capsid protein network allows the characteristic internal membrane structure of Marseilleviridae giant viruses. Sci Rep 2022; 12:21428. [PMID: 36504202 PMCID: PMC9742146 DOI: 10.1038/s41598-022-24651-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2022] [Accepted: 11/18/2022] [Indexed: 12/14/2022]
Abstract
Marseilleviridae is a family of giant viruses, showing a characteristic internal membrane with extrusions underneath the icosahedral vertices. However, such large objects, with a maximum diameter of 250 nm, are technically difficult to examine at sub-nanometre resolution by cryo-electron microscopy. Here, we tested the utility of 1 MV high-voltage cryo-EM (cryo-HVEM) for single particle structural analysis (SPA) of giant viruses using tokyovirus, a species of Marseilleviridae, and revealed the capsid structure at 7.7 Å resolution. The capsid enclosing the viral DNA consisted primarily of four layers: (1) major capsid proteins (MCPs) and penton proteins, (2) minor capsid proteins (mCPs), (3) scaffold protein components (ScPCs), and (4) internal membrane. The mCPs showed a novel capsid lattice consisting of eight protein components. ScPCs connecting the icosahedral vertices supported the formation of the membrane extrusions, and possibly act like tape measure proteins reported in other giant viruses. The density on top of the MCP trimer was suggested to include glycoproteins. This is the first attempt at cryo-HVEM SPA. We found the primary limitations to be the lack of automated data acquisition and of software support for collection and processing, which in turn limit the achievable resolution. However, the results pave the way for using cryo-HVEM for structural analysis of larger biological specimens.
9. Valorization of Biomasses from Energy Crops for the Discovery of Novel Thermophilic Glycoside Hydrolases through Metagenomic Analysis. Int J Mol Sci 2022; 23:ijms231810505. [PMID: 36142415 PMCID: PMC9505709 DOI: 10.3390/ijms231810505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Revised: 09/06/2022] [Accepted: 09/08/2022] [Indexed: 11/16/2022]
Abstract
The increasing interest in environmentally friendly technologies is driving the transition from a fossil-based economy to a bioeconomy. A key enabler for a circular bioeconomy is the valorization of renewable biomasses as feedstock to extract high value-added chemicals. Within this transition, the discovery and use of robust biocatalysts to replace toxic chemical catalysts play a significant role as technology drivers. To meet both demands, we performed microbial enrichments on two energy crops, used as low-cost feed for extremophilic consortia. A culture-dependent approach coupled to metagenomic analysis led to the discovery of more than 300 glycoside hydrolases and to the characterization of a new α-glucosidase from an unknown hyperthermophilic archaeon. Aglu1 proved to be the most active archaeal GH31 on 4Np-α-Glc and showed unexpected specificity toward kojibiose, making it a promising candidate for biotechnological applications such as the liquefaction/saccharification of starch.
10. Perera M, Wijesundera S, Wijayarathna CD, Seneviratne G, Jayasena S. Identification of long-chain alkane-degrading (LadA) monooxygenases in Aspergillus flavus via in silico analysis. Front Microbiol 2022; 13:898456. [PMID: 36110294 PMCID: PMC9468676 DOI: 10.3389/fmicb.2022.898456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Accepted: 07/20/2022] [Indexed: 11/25/2022]
Abstract
Efficient degradation of alkanes in crude oil by the isolated Aspergillus flavus MM1 alluded to the presence of highly active alkane-degrading enzymes in this fungus. A long-chain alkane-degrading, LadA-like enzyme family in A. flavus was identified, and possible substrate-binding modes were analyzed using a computational approach. By analyzing publicly available protein databases, we identified six uncharacterized proteins in A. flavus NRRL 3357, of which five were identified as class LadAα and one as class LadAβ, which are eukaryotic homologs of bacterial long-chain alkane monooxygenase (LadA). Computational models of A. flavus LadAα homologs (Af1-Af5) showed overall structural similarity to the bacterial LadA and the unique sequence and structural elements that bind the cofactor flavin mononucleotide (FMN). A receptor-cofactor-substrate docking protocol was established and validated to demonstrate the substrate binding in the A. flavus LadAα homologs. The modeled Af1, Af3, Af4, and Af5 captured long-chain n-alkanes inside the active pocket, above the bound FMN. The isoalloxazine ring of reduced FMN formed a π–alkyl interaction with the terminal carbon atom of captured alkanes, C16–C30 in Af3–Af5 and C16–C24 in Af1. Our results confirmed the ability of the identified A. flavus LadAα monooxygenases to bind long-chain alkanes inside the active pocket. Hence, A. flavus LadAα monooxygenases potentially initiate the degradation of long-chain alkanes by oxidizing bound long-chain alkanes into their corresponding alcohols.
Affiliation(s)
- Madushika Perera
  - Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Colombo, Colombo, Sri Lanka
- Sulochana Wijesundera
  - Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Colombo, Colombo, Sri Lanka
- Sharmila Jayasena
  - Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Colombo, Colombo, Sri Lanka
  - *Correspondence: Sharmila Jayasena,
11. Possible Regulation of Toll-Like Receptor 4 By Lysine Acetylation Through LPCAT2 Activity in RAW264.7 Cells. Biosci Rep 2022; 42:231468. [PMID: 35735109 PMCID: PMC9289797 DOI: 10.1042/bsr20220251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2022] [Revised: 06/16/2022] [Accepted: 06/22/2022] [Indexed: 11/17/2022]
Abstract
Inflammation is central to several diseases. TLR4 mediates inflammation by recognising and binding to bacterial lipopolysaccharides and interacting with other proteins in the TLR4 signalling pathway. Although there is extensive research on TLR4-mediated inflammation, there are gaps in understanding its mechanisms. Recently, TLR4 was found to co-localise with LPCAT2, a lysophospholipid acetyltransferase. LPCAT2 is already known to influence lipopolysaccharide-induced inflammation; however, the mechanism by which it does so is not understood. The present study combined computational analysis with biochemical analysis to investigate the influence of LPCAT2 on lysine acetylation in LPS-treated RAW264.7 cells. The results suggest for the first time that LPCAT2 influences lysine acetylation in LPS-treated RAW264.7 cells. Moreover, we detected acetylated lysine residues on TLR4. The present study lays a foundation for further research on the role of lysine acetylation in TLR4 signalling. Further research is also required to characterise LPCAT2 as a protein acetyltransferase.
12. He L, Huang X, Zhang G, Yuan L, Shen E, Zhang L, Zhang XH, Zhang T, Tao L, Ju F. Distinctive signatures of pathogenic and antibiotic resistant potentials in the hadal microbiome. Environmental Microbiome 2022; 17:19. [PMID: 35468809 PMCID: PMC9036809 DOI: 10.1186/s40793-022-00413-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/22/2021] [Accepted: 04/06/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND The hadal zone of the deep-sea trenches accommodates microbial life under extreme energy limitations and environmental conditions, such as low temperature, high pressure, and low organic matter down to 11,000 m below sea level. However, microbial pathogenicity, resistance, and adaptation therein remain unknown. Here we used culture-independent metagenomic approaches to explore the virulence and antibiotic resistance in the hadal microbiota of the Mariana Trench.
RESULTS The results indicate that the 10,898 m Challenger Deep bottom sediment harbored prosperous microbiota with contrasting signatures of virulence factors and antibiotic resistance, compared with the neighboring but shallower 6038 m steep wall site and the more nearshore 5856 m Pacific basin site. Virulence genes, including several well-known large translocating virulence genes (e.g., botulinum neurotoxins, tetanus neurotoxin, and Clostridium difficile toxins), were uniquely detected in the trench bottom. However, the shallower and more nearshore site sediment had a higher abundance and richer diversity of known antibiotic resistance genes (ARGs), especially clinically relevant ones (e.g., fosX, sul1, and TEM-family extended-spectrum beta-lactamases), revealing resistance selection under anthropogenic stresses. Further analysis of the mobilome (i.e., the collection of mobile genetic elements, MGEs) suggests horizontal gene transfer mediated by phage and integrase as the major mechanism for the evolution of Mariana Trench sediment bacteria. Notably, contig-level co-occurrence and taxonomic analysis shows emerging evidence for substantial co-selection of virulence genes and ARGs in taxonomically diverse bacteria in the hadal sediment, especially for the Challenger Deep bottom, where mobilized ARGs and virulence genes are favorably enriched in largely unexplored bacteria.
CONCLUSIONS This study reports the landscape of virulence factors, antibiotic resistome, and mobilome in the sediment and seawater microbiota residing in the hadal environment of the deepest ocean bottom on earth. Our work unravels the contrasting and unique features of virulence genes, ARGs, and MGEs in the Mariana Trench bottom, providing new insights into the eco-environmental and biological processes underlying microbial pathogenicity, resistance, and adaptive evolution in the hadal environment.
Affiliation(s)
- Liuqing He
- Center for Infectious Disease Research, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, 310024 Zhejiang China
- Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, 310024 Zhejiang China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, 310024 Zhejiang China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, 310024 Zhejiang China
- Xinyu Huang
- Center for Infectious Disease Research, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, 310024 Zhejiang China
- Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, 310024 Zhejiang China
- Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, 310024 Zhejiang China
- Guoqing Zhang
- Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, 310024 Zhejiang China
- Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, 310024 Zhejiang China
- Ling Yuan
- Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, 310024 Zhejiang China
- Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, 310024 Zhejiang China
- Enhui Shen
- Center for Infectious Disease Research, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, 310024 Zhejiang China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, 310024 Zhejiang China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, 310024 Zhejiang China
- Lu Zhang
- Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, 310024 Zhejiang China
- Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, 310024 Zhejiang China
- Xiao-Hua Zhang
- College of Marine Life Sciences, and Institute of Evolution & Marine Biodiversity, Ocean University of China, Qingdao, 266003 Shandong China
- Tong Zhang
- Environmental Microbiome Engineering and Biotechnology Laboratory, The University of Hong Kong, Hong Kong SAR, China
- Liang Tao
- Center for Infectious Disease Research, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, 310024 Zhejiang China
- Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, 310024 Zhejiang China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, 310024 Zhejiang China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, 310024 Zhejiang China
- Feng Ju
- Center for Infectious Disease Research, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, 310024 Zhejiang China
- Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, 310024 Zhejiang China
- Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, 310024 Zhejiang China

13
Moyo AC, Dufossé L, Giuffrida D, van Zyl LJ, Trindade M. Structure and biosynthesis of carotenoids produced by a novel Planococcus sp. isolated from South Africa. Microb Cell Fact 2022; 21:43. [PMID: 35305628 PMCID: PMC8933910 DOI: 10.1186/s12934-022-01752-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/22/2021] [Accepted: 01/26/2022] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND The genus Planococcus comprises halophilic bacteria generally reported for the production of carotenoid pigments and biosurfactants. In previous work, we showed that culturing the orange-pigmented Planococcus sp. CP5-4 isolate increased the evaporation rate of industrial wastewater brine effluent, which we attributed to the orange pigment. This demonstrated the potential application of this bacterium for industrial brine effluent management in evaporation ponds for inland desalination plants. Here we identified a C30-carotenoid biosynthetic gene cluster responsible for pigment biosynthesis in Planococcus sp. CP5-4 through isolation of mutants and genome sequencing. We further compared the core genes of the carotenoid biosynthetic gene clusters (BGCs) identified from different Planococcus species' genomes, which grouped into gene cluster families containing BGCs linked to different carotenoid product chemotypes. Lastly, LC-MS analysis of saponified and unsaponified pigment extracts obtained from cultures of Planococcus sp. CP5-4 revealed the structure of the main glucosylated C30-carotenoid fatty acid ester produced by this strain. RESULTS Genome sequence comparisons of isolated mutant strains of Planococcus sp. CP5-4 showed deletions of 146 kb and 3 kb for the non-pigmented and "yellow" mutants, respectively. Eight candidate genes, likely responsible for C30-carotenoid biosynthesis, were identified in the wild-type genome region corresponding to the deleted segment in the non-pigmented mutant. Six of the eight candidate genes formed a biosynthetic gene cluster. A truncation of crtP was responsible for the "yellow" mutant phenotype. Genome annotation revealed that the genes encoded 4,4'-diapolycopene oxygenase (CrtNb), 4,4'-diapolycopen-4-al dehydrogenase (CrtNc), 4,4'-diapophytoene desaturase (CrtN), 4,4'-diaponeurosporene oxygenase (CrtP), glycerol acyltransferase (Agpat), family 2 glucosyl transferase 2 (Gtf2), phytoene/squalene synthase (CrtM), and cytochrome P450 hydroxylase enzymes. Carotenoid analysis showed that a glucosylated C30-carotenoid fatty acid ester, methyl 5-(6-C17:3)-glucosyl-5, 6'-dihydro-apo-4, 4'-lycopenoate, was the main carotenoid compound produced by Planococcus sp. CP5-4. CONCLUSION We identified and characterized the carotenoid biosynthetic gene cluster and the C30-carotenoid compound produced by Planococcus sp. CP5-4. Mass-spectrometry guided analysis of the saponified and unsaponified pigment extracts showed that methyl 5-glucosyl-5, 6-dihydro-apo-4, 4'-lycopenoate is esterified to heptadecatrienoic acid (C17:3). Furthermore, through phylogenetic analysis of the core carotenoid BGCs of Planococcus species, we show that various C30-carotenoid product chemotypes, apart from methyl 5-glucosyl-5, 6-dihydro-apo-4, 4'-lycopenoate and 5-glucosyl-4, 4-diaponeurosporen-4'-ol-4-oic acid, may be produced that could offer opportunities for a variety of applications.
Affiliation(s)
- Anesu Conrad Moyo
- Institute for Microbial Biotechnology and Metagenomics (IMBM), Department of Biotechnology, University of the Western Cape, Bellville, Cape Town, 7535, South Africa
- BioCiTi Laboratory, 4th Floor Block B, Bandwidth Barn, Woodstock Exchange Building, 66-68 Albert Road, Woodstock, Cape Town, 7925, South Africa
- Laurent Dufossé
- Chemistry and Biotechnology of Natural Products, CHEMBIOPRO, ESIROI Agroalimentaire, Université de La Réunion, 15 Avenue René Cassin, CS 92003, CEDEX 9, F-97744, Saint-Denis, France
- Daniele Giuffrida
- Università Degli Studi Di Messina, Dip. B.I.O.M.O.R.F, Polo Annunziata, 98168, Messina, ME, Italy
- Leonardo Joaquim van Zyl
- Institute for Microbial Biotechnology and Metagenomics (IMBM), Department of Biotechnology, University of the Western Cape, Bellville, Cape Town, 7535, South Africa
- Marla Trindade
- Institute for Microbial Biotechnology and Metagenomics (IMBM), Department of Biotechnology, University of the Western Cape, Bellville, Cape Town, 7535, South Africa.

14
Yadav M, Rathore JS. Functional and transcriptional analysis of chromosomal encoded hipBA Xn2 type II toxin-antitoxin (TA) module from Xenorhabdus nematophila. Microb Pathog 2021; 162:105309. [PMID: 34839000 DOI: 10.1016/j.micpath.2021.105309] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/01/2021] [Revised: 10/26/2021] [Accepted: 11/22/2021] [Indexed: 02/06/2023]
Abstract
Xenorhabdus nematophila is an entomopathogenic bacterium that synthesizes numerous toxins and kills its larval insect host. Apart from such toxins, its genome also has a plethora of toxin-antitoxin (TA) systems. The role of TA systems in bacterial physiology is debated; however, they are associated with maintaining bacterial genomic stability and survival under adverse environmental conditions. Here, we explored the functionality and transcriptional regulation of the type II hipBAXn2 TA system. This TA system was identified in the genome of X. nematophila ATCC 19061 and consists of the hipAXn2 toxin gene, encoding 278 amino acid residues, and the hipBXn2 antitoxin gene, encoding 135 amino acid residues. We showed that overexpression of the HipAXn2 toxin reduced the growth of Escherichia coli cells in a bacteriostatic manner, and that amino acids G8, H164, N167, and S169 were key residues for this growth reduction. Promoter activity and expression profiling of the hipBAXn2 TA system showed that transcription was induced in both E. coli and X. nematophila upon exposure to different stress conditions. Further, we characterized the binding of the HipAXn2 toxin and HipBXn2 antitoxin to their promoter. This study provides evidence for the presence of a functional and well-regulated hipBAXn2 TA system in X. nematophila.
Affiliation(s)
- Mohit Yadav
- School of Biotechnology, Gautam Buddha University, Yamuna Expressway, Greater Noida, Uttar Pradesh, India
- Jitendra Singh Rathore
- School of Biotechnology, Gautam Buddha University, Yamuna Expressway, Greater Noida, Uttar Pradesh, India.

15
Nouioui I, Dye T. Heat-killed Mycolicibacterium aurum Aogashima: An environmental nonpathogenic actinobacteria under development as a safe novel food ingredient. Food Sci Nutr 2021; 9:4839-4854. [PMID: 34531996 PMCID: PMC8441333 DOI: 10.1002/fsn3.2413] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/13/2021] [Revised: 05/25/2021] [Accepted: 05/29/2021] [Indexed: 12/17/2022] Open
Abstract
Over the last few decades, a wealth of evidence has formed the basis for "the Old Friends hypothesis" suggesting that, in contrast to the past, increasingly people are living in environments with limited and less diverse microbial exposure, with potential consequences for their health. Hence, including safe live or heat-killed microbes in the diet may be beneficial in promoting and maintaining human health. In order to assess the safety of microbes beyond the current use of standardized cultures and probiotic supplements, new approaches are being developed. Here, we present evidence for the safety of heat-killed Mycolicibacterium aurum Aogashima as a novel food, utilizing the decision tree approach developed by Pariza and colleagues (2015). We provide evidence that the genome of M. aurum Aogashima is free of (1) genetic elements associated with pathogenicity or toxigenicity, (2) transferable antibiotic resistance gene DNA, and (3) genes coding for antibiotics used in human or veterinary medicine. Moreover, a 90-day oral toxicity study in rats showed that (4) the no observed adverse effect level (NOAEL) was the highest concentration tested, namely 2000 μg/kg BW/day. We conclude that oral consumption of heat-killed M. aurum Aogashima is safe and warrants further evaluation as a novel food ingredient.
Affiliation(s)
- Imen Nouioui
- Devonshire Building, Newcastle University School of Natural and Environmental Sciences, Newcastle Upon Tyne, United Kingdom of Great Britain and Northern Ireland

16
Chen YH, Chiang PW, Rogozin DY, Degermendzhy AG, Chiu HH, Tang SL. Salvaging high-quality genomes of microbial species from a meromictic lake using a hybrid sequencing approach. Commun Biol 2021; 4:996. [PMID: 34426638 PMCID: PMC8382752 DOI: 10.1038/s42003-021-02510-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/13/2021] [Accepted: 08/01/2021] [Indexed: 11/08/2022] Open
Abstract
Most of Earth's bacteria have yet to be cultivated. The metabolic and functional potentials of these uncultivated microorganisms thus remain mysterious, and the metagenome-assembled genome (MAG) approach is the most robust method for uncovering these potentials. However, MAGs discovered by conventional metagenomic assembly and binning are usually highly fragmented genomes with heterogeneous sequence contamination. In this study, we combined Illumina and Nanopore data to develop a new workflow to reconstruct 233 MAGs (six novel bacterial orders, 20 families, 66 genera, and 154 species) from Lake Shunet, a secluded meromictic lake in Siberia. With our workflow, the average N50 of reconstructed MAGs increased 10- to 40-fold compared with the conventional Illumina assembly and binning method. More importantly, six complete MAGs were recovered from our datasets. The recovery of 154 novel species MAGs from a rarely explored lake greatly expands the current bacterial genome encyclopedia.
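The abstract above reports assembly contiguity as N50. For readers unfamiliar with the statistic, the following is a minimal sketch of the standard definition (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly size). The function name and the toy contig lengths are illustrative assumptions and are not taken from the cited study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical example: two assemblies of the same total size (600 bp).
fragmented = [100, 100, 100, 100, 100, 100]
contiguous = [400, 100, 100]
print(n50(fragmented), n50(contiguous))
```

Both toy assemblies have the same total length, but the less fragmented one has the higher N50, which is why the statistic is commonly used to compare assembly contiguity.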
Affiliation(s)
- Yu-Hsiang Chen
- Bioinformatics Program, Taiwan International Graduate Program, National Taiwan University, Taipei, Taiwan
- Bioinformatics Program, Institute of Information Science, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Pei-Wen Chiang
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Denis Yu Rogozin
- Institute of Biophysics, Siberian Branch of Russian Academy of Sciences, Krasnoyarsk, Russia
- Siberian Federal University, Krasnoyarsk, Russia
- Andrey G Degermendzhy
- Institute of Biophysics, Siberian Branch of Russian Academy of Sciences, Krasnoyarsk, Russia
- Hsiu-Hui Chiu
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Sen-Lin Tang
- Bioinformatics Program, Institute of Information Science, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan.
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan.

17
Discovery and mining of enzymes from the human gut microbiome. Trends Biotechnol 2021; 40:240-254. [PMID: 34304905 DOI: 10.1016/j.tibtech.2021.06.008] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/28/2021] [Revised: 06/24/2021] [Accepted: 06/25/2021] [Indexed: 12/19/2022]
Abstract
Advances in technological and bioinformatics approaches have led to the generation of a plethora of human gut metagenomic datasets. Metabolomics has also provided substantial data regarding the small metabolites produced and modified by the microbiota. Comparatively, the microbial enzymes mediating the transformation of metabolites have not been intensively investigated. Here, we discuss the recent efforts and technologies used for discovering and mining enzymes from the human gut microbiota. The wealth of knowledge on metabolites, reactions, genome sequences, and structures of proteins, may drive the development of strategies for enzyme mining. Ongoing efforts to annotate gut microbiota enzymes will explain catalytic mechanisms that may guide the clinical applications of the gut microbiome for diagnostic and therapeutic purposes.

18
Jain A, Terashi G, Kagaya Y, Maddhuri Venkata Subramaniya SR, Christoffer C, Kihara D. Analyzing effect of quadruple multiple sequence alignments on deep learning based protein inter-residue distance prediction. Sci Rep 2021; 11:7574. [PMID: 33828153 PMCID: PMC8027171 DOI: 10.1038/s41598-021-87204-z] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 02/05/2021] [Accepted: 03/25/2021] [Indexed: 12/12/2022] Open
Abstract
Protein 3D structure prediction has advanced significantly in recent years due to improving contact prediction accuracy. This improvement has been largely due to deep learning approaches that predict inter-residue contacts and, more recently, distances using multiple sequence alignments (MSAs). In this work we present AttentiveDist, a novel approach that uses different MSAs generated with different E-values in a single model to increase the co-evolutionary information provided to the model. To determine the importance of each MSA's feature at the inter-residue level, we added an attention layer to the deep neural network. We show that combining four MSAs of different E-value cutoffs improved the model prediction performance as compared to single E-value MSA features. A further improvement was observed when an attention layer was used, and even more when additional bond angle prediction tasks were added. The improvements in distance prediction were successfully transferred to achieve better protein tertiary structure modeling.
Affiliation(s)
- Aashish Jain
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Yuki Kagaya
- Graduate School of Information Sciences, Tohoku University, Sendai, Japan
- Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA.
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA.

19
Du H, Song Z, Zhang M, Nie Y, Xu Y. The deletion of Schizosaccharomyces pombe decreased the production of flavor-related metabolites during traditional Baijiu fermentation. Food Res Int 2021; 140:109872. [PMID: 33648190 DOI: 10.1016/j.foodres.2020.109872] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/12/2020] [Revised: 10/15/2020] [Accepted: 10/29/2020] [Indexed: 01/03/2023]
Abstract
The microbiota in traditional solid-state fermentation is a complex community that plays a key role in the production of feed, fuel, food and pharmaceutical products. The function of the microbiota is an important factor dictating the quantity and quality of products. Core functional species play key metabolic roles in the microbiota, and their disappearance could result in an abnormal fermentation process. In this work, we combined Baijiu production and laboratory experiments to explore the keystone microbes and their metabolites. We found that the deletion of a core functional microbe resulted in the loss of multiple metabolites, including many alcohols and acids. In traditional Baijiu production, the absence or presence of Schizosaccharomyces pombe caused content divergence in 227 flavor-related metabolites, especially ethanol, butanol and pentanoic acid, between the abnormal and normal groups (each content > 1 mg/kg and the content ratio of normal/abnormal group > 2). S. pombe increased the expression level of related genes, including alcohol dehydrogenase (ADH), acyl-CoA oxidase (ACOX) and trans-2-enoyl-CoA reductase (TER). Moreover, in the laboratory verification experiment, the absence or presence of Schizosaccharomyces pombe C-11 caused content divergence in 136 flavor-related metabolites, especially ethanol, butanol and pentanoic acid, between the Sp- and Sp+ groups (each content > 1 mg/kg and the content ratio of Sp+/Sp- group > 2). Our results identified specific members that are essential for the function of the fermentation microbiota. This study also suggests that species deletions from fermentation microbiota and synthetic consortia could be a useful approach to illustrate relevant microbe-metabolite associations and define metabolic roles in traditional solid-state fermentation.
Affiliation(s)
- Hai Du
- State Key Laboratory of Food Science and Technology, Key Laboratory of Industrial Biotechnology of Ministry of Education, Synergetic Innovation Center of Food Safety and Nutrition, School of Biotechnology, Jiangnan University, Wuxi 214122, Jiangsu, China
- Zhewei Song
- State Key Laboratory of Food Science and Technology, Key Laboratory of Industrial Biotechnology of Ministry of Education, Synergetic Innovation Center of Food Safety and Nutrition, School of Biotechnology, Jiangnan University, Wuxi 214122, Jiangsu, China
- Menghui Zhang
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Microbiology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Yao Nie
- State Key Laboratory of Food Science and Technology, Key Laboratory of Industrial Biotechnology of Ministry of Education, Synergetic Innovation Center of Food Safety and Nutrition, School of Biotechnology, Jiangnan University, Wuxi 214122, Jiangsu, China.
- Yan Xu
- State Key Laboratory of Food Science and Technology, Key Laboratory of Industrial Biotechnology of Ministry of Education, Synergetic Innovation Center of Food Safety and Nutrition, School of Biotechnology, Jiangnan University, Wuxi 214122, Jiangsu, China.

20
Wang C, Kurgan L. Survey of Similarity-Based Prediction of Drug-Protein Interactions. Curr Med Chem 2021; 27:5856-5886. [PMID: 31393241 DOI: 10.2174/0929867326666190808154841] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/07/2017] [Revised: 04/16/2018] [Accepted: 10/23/2018] [Indexed: 12/20/2022]
Abstract
The therapeutic activity of a significant majority of drugs is determined by their interactions with proteins. Databases of drug-protein interactions (DPIs) primarily focus on the therapeutic protein targets, while the knowledge of the off-targets is fragmented and partial. One way to bridge this knowledge gap is to employ computational methods to predict protein targets for a given drug molecule, or interacting drugs for given protein targets. We survey a comprehensive set of 35 methods that were published in high-impact venues and that predict DPIs based on similarity between drugs and similarity between protein targets. We analyze the internal databases of known DPIs that these methods utilize to compute similarities, and investigate how they are linked to the 12 publicly available source databases. We discuss the contents, impact and relationships between these internal and source databases, as well as the timeline of their releases and publications. The 35 predictors exploit and often combine three types of similarities that consider drug structures, drug profiles, and target sequences. We review the predictive architectures of these methods and their impact, and we explain how their internal DPI databases are linked to the source databases. We also include a detailed timeline of the development of these predictors and discuss the underlying limitations of the current resources and predictive tools. Finally, we provide several recommendations concerning the future development of the related databases and methods.
Affiliation(s)
- Chen Wang
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States

21
Grasso S, van Rij T, van Dijl JM. GP4: an integrated Gram-Positive Protein Prediction Pipeline for subcellular localization mimicking bacterial sorting. Brief Bioinform 2020; 22:5998864. [PMID: 33227814 PMCID: PMC8294519 DOI: 10.1093/bib/bbaa302] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/29/2020] [Revised: 10/08/2020] [Accepted: 10/09/2020] [Indexed: 11/17/2022] Open
Abstract
Subcellular localization is a critical aspect of protein function and the potential application of proteins either as drugs or drug targets, or in industrial and domestic applications. However, the experimental determination of protein localization is time consuming and expensive. Therefore, various localization predictors have been developed for particular groups of species. Intriguingly, despite their major representation amongst biotechnological cell factories and pathogens, a meta-predictor based on sorting signals and specific for Gram-positive bacteria was still lacking. Here we present GP4, a protein subcellular localization meta-predictor mainly for Firmicutes, but also Actinobacteria, based on the combination of multiple tools, each specific for different sorting signals and compartments. Novelty elements include improved cell-wall protein prediction, including differentiation of the type of interaction, prediction of non-canonical secretion pathway target proteins, separate prediction of lipoproteins and better user experience in terms of parsability and interpretability of the results. GP4 aims at mimicking protein sorting as it would happen in a bacterial cell. As GP4 is not homology based, it has a broad applicability and does not depend on annotated databases with homologous proteins. Non-canonical usage may include little studied or novel species, synthetic and engineered organisms, and even re-use of the prediction data to develop custom prediction algorithms. Our benchmark analysis highlights the improved performance of GP4 compared to other widely used subcellular protein localization predictors. A webserver running GP4 is available at http://gp4.hpc.rug.nl/
Affiliation(s)
- Jan Maarten van Dijl
- University of Groningen and the University Medical Center Groningen, the Netherlands

22
Daly M, Bromilow SN, Nitride C, Shewry PR, Gethings LA, Mills ENC. Mapping Coeliac Toxic Motifs in the Prolamin Seed Storage Proteins of Barley, Rye, and Oats Using a Curated Sequence Database. Front Nutr 2020; 7:87. [PMID: 32766270 PMCID: PMC7379453 DOI: 10.3389/fnut.2020.00087] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/31/2019] [Accepted: 05/12/2020] [Indexed: 12/20/2022] Open
Abstract
Wheat gluten, and related prolamin proteins in rye, barley and oats, cause the immune-mediated gluten intolerance syndrome, coeliac disease. Foods labelled as gluten-free, which can be safely consumed by coeliac patients, must not contain gluten above a level of 20 mg/kg. Current immunoassay methods for detection of gluten can give conflicting results and may underestimate levels of gluten in foods. Mass spectrometry methods have great potential as an orthogonal method, but require curated protein sequence databases to support method development. The GluPro database has been updated to include avenin-like sequences from bread wheat (n = 685; GluPro v1.1) and genes from the sequenced wheat genome (n = 699; GluPro v1.2) and Triticum turgidum ssp durum (n = 210; GluPro v2.1). Companion databases have been developed for prolamin sequences from barley (n = 64; GluPro v3.0), rye (n = 41; GluPro v4.0), and oats (n = 27; GluPro v5.0) and combined to provide a complete cereal prolamin database, GluPro v6.1, comprising 1,041 sequences. Analysis of the coeliac toxic motifs in the curated sequences showed that they were absent from the minor avenin-like proteins in bread and durum wheat and barley, unlike the related avenin proteins from oats. A comparison of prolamin proteins from the different cereal species also showed that α- and γ-gliadins in bread and durum wheat, and the sulphur-poor prolamins in all cereals, had the highest density of coeliac toxic motifs. Analysis of ion-mobility mass spectrometry data for bread wheat (cvs Chinese Spring and Hereward) showed an increased number of identifications when using the GluPro v1.0, v1.1 and v1.2 databases compared to the limited number of verified bread wheat sequences in reviewed UniProt. This family of databases will provide a basis for proteomic profiling of gluten proteins from all the gluten-containing cereals and support identification of specific peptide markers for use in development of new methods for gluten quantitation based on coeliac toxic motifs found in all relevant cereal species.
Affiliation(s)
- Matthew Daly
- Division of Infection, Immunity and Respiratory Medicine, Faculty of Biology, Medicine and Health, Manchester Institute of Biotechnology, University of Manchester, Manchester, United Kingdom
- Sophie N Bromilow
- Division of Infection, Immunity and Respiratory Medicine, Faculty of Biology, Medicine and Health, Manchester Institute of Biotechnology, University of Manchester, Manchester, United Kingdom
- Chiara Nitride
- Division of Infection, Immunity and Respiratory Medicine, Faculty of Biology, Medicine and Health, Manchester Institute of Biotechnology, University of Manchester, Manchester, United Kingdom
- Peter R Shewry
- Centre for Crop Genetic Improvement, Rothamsted Research, Harpenden, United Kingdom
- E N Clare Mills
- Division of Infection, Immunity and Respiratory Medicine, Faculty of Biology, Medicine and Health, Manchester Institute of Biotechnology, University of Manchester, Manchester, United Kingdom

23
Liu C, Zhou N, Du MX, Sun YT, Wang K, Wang YJ, Li DH, Yu HY, Song Y, Bai BB, Xin Y, Wu L, Jiang CY, Feng J, Xiang H, Zhou Y, Ma J, Wang J, Liu HW, Liu SJ. The Mouse Gut Microbial Biobank expands the coverage of cultured bacteria. Nat Commun 2020; 11:79. [PMID: 31911589 PMCID: PMC6946648 DOI: 10.1038/s41467-019-13836-5] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 02/11/2019] [Accepted: 11/25/2019] [Indexed: 02/07/2023] Open
Abstract
Mice are widely used as experimental models for gut microbiome (GM) studies, yet the majority of mouse GM members remain uncharacterized. Here, we report the construction of a mouse gut microbial biobank (mGMB) that contains 126 species, represented by 244 strains that have been deposited in the China General Microorganism Culture Collection. We sequence and phenotypically characterize 77 potential new species and propose their nomenclatures. The mGMB includes 22 and 17 species that are significantly enriched in ob/ob and wild-type C57BL/6J mouse cecal samples, respectively. The genomes of the 126 species in the mGMB cover 52% of the metagenomic nonredundant gene catalog (sequence identity ≥ 60%) and represent 93-95% of the KEGG-Orthology-annotated functions of the sampled mouse GMs. The microbial and genome data assembled in the mGMB enlarge the taxonomic characterization of mouse GMs and represent a useful resource for studies of host-microbe interactions and of GM functions associated with host health and diseases.
Affiliation(s)
- Chang Liu
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
- Environmental Microbiology Research Center, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Nan Zhou
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
- Environmental Microbiology Research Center, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Meng-Xuan Du
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Yu-Tong Sun
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Kai Wang
- University of Chinese Academy of Sciences, Beijing, 100049, P. R. China
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Yu-Jing Wang
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
- University of Chinese Academy of Sciences, Beijing, 100049, P. R. China
| | - Dan-Hua Li
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Hai-Ying Yu
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Yuqin Song
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
- Environmental Microbiology Research Center, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Bing-Bing Bai
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Yuhua Xin
- Microbial Resources and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Linhuan Wu
- Microbial Resources and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Cheng-Ying Jiang
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
- Environmental Microbiology Research Center, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
- University of Chinese Academy of Sciences, Beijing, 100049, P. R. China
| | - Jie Feng
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
- Environmental Microbiology Research Center, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Hua Xiang
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Yuguang Zhou
- Microbial Resources and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Juncai Ma
- Microbial Resources and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Jun Wang
- University of Chinese Academy of Sciences, Beijing, 100049, P. R. China
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China
| | - Hong-Wei Liu
- University of Chinese Academy of Sciences, Beijing, 100049, P. R. China.
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China.
| | - Shuang-Jiang Liu
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China.
- Environmental Microbiology Research Center, Institute of Microbiology, Chinese Academy of Sciences, No. 1 Beichenxi Road, Chaoyang District, Beijing, 100101, P. R. China.
- University of Chinese Academy of Sciences, Beijing, 100049, P. R. China.
| |
|
24
|
Functional Gene Network of Prenyltransferases in Arabidopsis thaliana. Molecules 2019; 24:molecules24244556. [PMID: 31842481 PMCID: PMC6943727 DOI: 10.3390/molecules24244556] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/07/2019] [Revised: 12/09/2019] [Accepted: 12/10/2019] [Indexed: 12/17/2022] Open
Abstract
Prenyltransferases (PTs) are enzymes that catalyze prenyl chain elongation. Some are highly similar to each other at the amino acid level. Therefore, it is difficult to assign their function based solely on their sequence homology to functional orthologs. Other experiments, such as in vitro enzymatic assay, mutant analysis, and mutant complementation are necessary to assign their precise function. Moreover, subcellular localization can also influence the functionality of the enzymes within the pathway network, because different isoprenoid end products are synthesized in the cytosol, mitochondria, or plastids from prenyl diphosphate (prenyl-PP) substrates. In addition to in vivo functional experiments, in silico approaches, such as co-expression analysis, can provide information about the topology of PTs within the isoprenoid pathway network. There has been huge progress in the last few years in the characterization of individual Arabidopsis PTs, resulting in better understanding of their function and their topology within the isoprenoid pathway. Here, we summarize these findings and present the updated topological model of PTs in the Arabidopsis thaliana isoprenoid pathway.
|
25
|
Marchant A, Cisneros AF, Dubé AK, Gagnon-Arsenault I, Ascencio D, Jain H, Aubé S, Eberlein C, Evans-Yamamoto D, Yachie N, Landry CR. The role of structural pleiotropy and regulatory evolution in the retention of heteromers of paralogs. eLife 2019; 8:46754. [PMID: 31454312 PMCID: PMC6711710 DOI: 10.7554/elife.46754] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 03/11/2019] [Accepted: 08/11/2019] [Indexed: 01/07/2023] Open
Abstract
Gene duplication is a driver of the evolution of new functions. The duplication of genes encoding homomeric proteins leads to the formation of homomers and heteromers of paralogs, creating new complexes after a single duplication event. The loss of these heteromers may be required for the two paralogs to evolve independent functions. Using yeast as a model, we find that heteromerization is frequent among duplicated homomers and correlates with functional similarity between paralogs. Using in silico evolution, we show that for homomers and heteromers sharing binding interfaces, mutations in one paralog can have structural pleiotropic effects on both interactions, resulting in highly correlated responses of the complexes to selection. Therefore, heteromerization could be preserved indirectly due to selection for the maintenance of homomers, thus slowing down functional divergence between paralogs. We suggest that paralogs can overcome the obstacle of structural pleiotropy by regulatory evolution at the transcriptional and post-translational levels.
Affiliation(s)
- Axelle Marchant
- Département de biochimie, de microbiologie et de bio-informatique, Université Laval, Québec, Canada; PROTEO, le réseau québécois de recherche sur la fonction, la structure et l'ingénierie des protéines, Université Laval, Québec, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, Canada; Département de biologie, Université Laval, Québec, Canada
| | - Angel F Cisneros
- Département de biochimie, de microbiologie et de bio-informatique, Université Laval, Québec, Canada; PROTEO, le réseau québécois de recherche sur la fonction, la structure et l'ingénierie des protéines, Université Laval, Québec, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, Canada
| | - Alexandre K Dubé
- Département de biochimie, de microbiologie et de bio-informatique, Université Laval, Québec, Canada; PROTEO, le réseau québécois de recherche sur la fonction, la structure et l'ingénierie des protéines, Université Laval, Québec, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, Canada; Département de biologie, Université Laval, Québec, Canada
| | - Isabelle Gagnon-Arsenault
- Département de biochimie, de microbiologie et de bio-informatique, Université Laval, Québec, Canada; PROTEO, le réseau québécois de recherche sur la fonction, la structure et l'ingénierie des protéines, Université Laval, Québec, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, Canada; Département de biologie, Université Laval, Québec, Canada
| | - Diana Ascencio
- Département de biochimie, de microbiologie et de bio-informatique, Université Laval, Québec, Canada; PROTEO, le réseau québécois de recherche sur la fonction, la structure et l'ingénierie des protéines, Université Laval, Québec, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, Canada; Département de biologie, Université Laval, Québec, Canada
| | - Honey Jain
- Département de biochimie, de microbiologie et de bio-informatique, Université Laval, Québec, Canada; PROTEO, le réseau québécois de recherche sur la fonction, la structure et l'ingénierie des protéines, Université Laval, Québec, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, Canada; Department of Biological Sciences, Birla Institute of Technology and Sciences, Pilani, India
| | - Simon Aubé
- Département de biochimie, de microbiologie et de bio-informatique, Université Laval, Québec, Canada; PROTEO, le réseau québécois de recherche sur la fonction, la structure et l'ingénierie des protéines, Université Laval, Québec, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, Canada
| | - Chris Eberlein
- PROTEO, le réseau québécois de recherche sur la fonction, la structure et l'ingénierie des protéines, Université Laval, Québec, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, Canada; Département de biologie, Université Laval, Québec, Canada
| | - Daniel Evans-Yamamoto
- Research Center for Advanced Science and Technology, University of Tokyo, Tokyo, Japan; Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan; Graduate School of Media and Governance, Keio University, Fujisawa, Japan
| | - Nozomu Yachie
- Research Center for Advanced Science and Technology, University of Tokyo, Tokyo, Japan; Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan; Graduate School of Media and Governance, Keio University, Fujisawa, Japan; Department of Biological Sciences, Graduate School of Science, University of Tokyo, Tokyo, Japan
| | - Christian R Landry
- Département de biochimie, de microbiologie et de bio-informatique, Université Laval, Québec, Canada; PROTEO, le réseau québécois de recherche sur la fonction, la structure et l'ingénierie des protéines, Université Laval, Québec, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, Canada; Département de biologie, Université Laval, Québec, Canada
| |
|
26
|
Zhang F, Song H, Zeng M, Li Y, Kurgan L, Li M. DeepFunc: A Deep Learning Framework for Accurate Prediction of Protein Functions from Protein Sequences and Interactions. Proteomics 2019; 19:e1900019. [PMID: 30941889 DOI: 10.1002/pmic.201900019] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Received: 01/12/2019] [Revised: 03/18/2019] [Indexed: 01/06/2023]
Abstract
Annotation of protein functions plays an important role in understanding life at the molecular level. High-throughput sequencing produces massive numbers of raw protein sequences and only about 1% of them have been manually annotated with functions. Experimental annotations of functions are expensive, time-consuming and do not keep up with the rapid growth of the sequence numbers. This motivates the development of computational approaches that predict protein functions. A novel deep learning framework, DeepFunc, is proposed which accurately predicts protein functions from protein sequence- and network-derived information. More precisely, DeepFunc uses a long and sparse binary vector to encode information concerning domains, families, and motifs collected from the InterPro tool that is associated with the input protein sequence. This vector is processed with two neural layers to obtain a low-dimensional vector which is combined with topological information extracted from protein-protein interactions (PPIs) and functional linkages. The combined information is processed by a deep neural network that predicts protein functions. DeepFunc is empirically and comparatively tested on a benchmark testing dataset and the Critical Assessment of protein Function Annotation algorithms (CAFA) 3 dataset. The experimental results demonstrate that DeepFunc outperforms current methods on the testing dataset and that it secures the highest Fmax = 0.54 and AUC = 0.94 on the CAFA3 dataset.
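The "long and sparse binary vector" mentioned in the abstract above can be pictured as a multi-hot encoding of signature identifiers. The following is only a minimal sketch under that assumption; the function names, protein names, and signature IDs are hypothetical placeholders and are not taken from DeepFunc or InterPro.

```python
def build_vocabulary(annotations):
    """Assign every distinct signature ID a fixed position in the vector."""
    ids = sorted({sig for sigs in annotations.values() for sig in sigs})
    return {sig: idx for idx, sig in enumerate(ids)}


def encode(signatures, vocabulary):
    """Return a binary vector with a 1 at the position of each known signature."""
    vector = [0] * len(vocabulary)
    for sig in signatures:
        idx = vocabulary.get(sig)
        if idx is not None:  # signatures outside the vocabulary are ignored
            vector[idx] = 1
    return vector


# Hypothetical example data: "protA", "protB", and "D1".."D3" are placeholders.
annotations = {
    "protA": {"D1", "D3"},
    "protB": {"D2"},
}
vocab = build_vocabulary(annotations)
vec_a = encode(annotations["protA"], vocab)
vec_b = encode(annotations["protB"], vocab)
```

With a realistic vocabulary of many thousands of signatures, almost every position is zero for any single protein, which is why the abstract describes the vector as long and sparse.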
Affiliation(s)
- Fuhao Zhang
- School of Computer Science and Engineering, Central South University, Changsha, 410083, P. R. China
| | - Hong Song
- School of Computer Science and Engineering, Central South University, Changsha, 410083, P. R. China
| | - Min Zeng
- School of Computer Science and Engineering, Central South University, Changsha, 410083, P. R. China
| | - Yaohang Li
- School of Computer Science and Engineering, Central South University, Changsha, 410083, P. R. China; Department of Computer Science, Old Dominion University, Norfolk, VA, 23529, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
| | - Min Li
- School of Computer Science and Engineering, Central South University, Changsha, 410083, P. R. China
| |
|
27
|
Buchholz PCF, Ferrario V, Pohl M, Gardossi L, Pleiss J. Navigating within thiamine diphosphate-dependent decarboxylases: Sequences, structures, functional positions, and binding sites. Proteins 2019; 87:774-785. [PMID: 31070804 DOI: 10.1002/prot.25706] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/26/2019] [Revised: 04/23/2019] [Accepted: 05/05/2019] [Indexed: 11/10/2022]
Abstract
Thiamine diphosphate-dependent decarboxylases catalyze both cleavage and formation of C-C bonds in various reactions, which have been assigned to different homologous sequence families. This work compares 53 ThDP-dependent decarboxylases with known crystal structures. Both sequence and structural information were analyzed synergistically and data were analyzed for global and local properties by means of statistical approaches (principal component analysis and principal coordinate analysis) enabling complexity reduction. The different results obtained both locally and globally, that is, individual positions compared with the overall protein sequence or structure, revealed challenges in the assignment of separated homologous families. The methods applied herein support the comparison of enzyme families and the identification of functionally relevant positions. The findings for the family of ThDP-dependent decarboxylases underline that global sequence identity alone is not sufficient to distinguish enzyme function. Instead, local sequence similarity, defined by comparisons of structurally equivalent positions, allows for a better navigation within several groups of homologous enzymes. The differentiation between homologous sequences is further enhanced by taking structural information into account, such as BioGPS analysis of the active site properties or pairwise structural superimpositions. The methods applied herein are expected to be transferable to other enzyme families, to facilitate family assignments for homologous protein sequences.
Affiliation(s)
- Patrick C F Buchholz
- Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Stuttgart, Germany
| | - Valerio Ferrario
- Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Stuttgart, Germany; Laboratory of Applied and Computational Biocatalysis, Department of Chemical and Pharmaceutical Sciences, Università degli Studi di Trieste, Trieste, Italy
| | - Martina Pohl
- Forschungszentrum Jülich GmbH, IBG-1: Biotechnology, Jülich, Germany
| | - Lucia Gardossi
- Laboratory of Applied and Computational Biocatalysis, Department of Chemical and Pharmaceutical Sciences, Università degli Studi di Trieste, Trieste, Italy
| | - Jürgen Pleiss
- Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Stuttgart, Germany
| |
|
28
|
Identification of Differentiating Metabolic Pathways between Infant Gut Microbiome Populations Reveals Depletion of Function-Level Adaptation to Human Milk in the Finnish Population. mSphere 2019; 4:4/2/e00152-19. [PMID: 30894435 PMCID: PMC6429046 DOI: 10.1128/mspheredirect.00152-19] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/24/2022] Open
Abstract
Knowing the limitations of taxonomy-based research, there is an emerging need for the development of higher-resolution techniques. The significance of this research is demonstrated by the novel method used for the analysis of function-level metagenomes. BiomeScout, the presented technology, utilizes proprietary algorithms for the detection of differences between functionalities present in metagenomic samples. A variety of autoimmune and allergy events are becoming increasingly common, especially in Western countries. Some pieces of research link such conditions with the composition of microbiota during infancy. In this period, the predominant form of nutrition for gut microbiota is oligosaccharides from human milk (HMO). A number of gut-colonizing strains, such as Bifidobacterium and Bacteroides, are able to utilize HMO, but only some Bifidobacterium strains have evolved to digest the specific composition of human oligosaccharides. Differences in the proportions of the two genera that are able to utilize HMO have already been associated with the frequency of allergies and autoimmune diseases in the Finnish and the Russian populations. Our results show that differences in terms of the taxonomic annotation do not explain the reason for the differences in the Bifidobacterium/Bacteroides ratio between the Finnish and the Russian populations. In this paper, we present the results of function-level analysis. Unlike the typical workflow for gene abundance analysis, BiomeScout technology explains the differences in the Bifidobacterium/Bacteroides ratio. Our research shows the differences in the abundances of the two enzymes that are crucial for the utilization of short type 1 oligosaccharides.
|
29
|
A Robust Methodology for Assessing Differential Homeolog Contributions to the Transcriptomes of Allopolyploids. Genetics 2018; 210:883-894. [PMID: 30213855 DOI: 10.1534/genetics.118.301564] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 05/01/2018] [Accepted: 09/07/2018] [Indexed: 12/18/2022] Open
Abstract
Polyploidy has played a pivotal and recurring role in angiosperm evolution. Allotetraploids arise from hybridization between species and possess duplicated gene copies (homeologs) that serve redundant roles immediately after polyploidization. Although polyploidization is a major contributor to plant evolution, it remains poorly understood. We describe an analytical approach for assessing homeolog-specific expression that begins with de novo assembly of parental transcriptomes and effectively (i) reduces redundancy in de novo assemblies, (ii) identifies putative orthologs, (iii) isolates common regions between orthologs, and (iv) assesses homeolog-specific expression using a robust Bayesian Poisson-Gamma model to account for sequence bias when mapping polyploid reads back to parental references. Using this novel methodology, we examine differential homeolog contributions to the transcriptome in the recently formed allopolyploids Tragopogon mirus and T. miscellus (Compositae). Notably, we assess a larger Tragopogon gene set than previous studies of this system. Using carefully identified orthologous regions and filtering biased orthologs, we find in both allopolyploids largely balanced expression with no strong parental bias. These new methods can be used to examine homeolog expression in any tetrapolyploid system without requiring a reference genome.
|
30
|
Hüdig M, Schmitz J, Engqvist MKM, Maurino VG. Biochemical control systems for small molecule damage in plants. PLANT SIGNALING & BEHAVIOR 2018; 13:e1477906. [PMID: 29944438 PMCID: PMC6103286 DOI: 10.1080/15592324.2018.1477906] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/12/2018] [Accepted: 05/11/2018] [Indexed: 05/29/2023]
Abstract
As a system, plant metabolism is far from perfect: small molecules (metabolites, cofactors, coenzymes, and inorganic molecules) are frequently damaged by unwanted enzymatic or spontaneous reactions. Here, we discuss the emerging principles in small molecule damage biology. We propose that plants evolved at least three distinct systems to control small molecule damage: (i) repair, which returns a damaged molecule to its original state; (ii) scavenging, which converts reactive molecules to harmless products; and (iii) steering, in which the possible formation of a damaged molecule is suppressed. We illustrate the concept of small molecule damage control in plants by describing specific examples for each of these three categories. We highlight interesting insights that we expect future research will provide on those systems, and we discuss promising strategies to discover new small molecule damage-control systems in plants.
Affiliation(s)
- M. Hüdig
- Plant Molecular Physiology and Biotechnology Group, Institute of Developmental and Molecular Biology of Plants, Heinrich Heine University, and Cluster of Excellence on Plant Sciences (CEPLAS), Düsseldorf, Germany
| | - J. Schmitz
- Plant Molecular Physiology and Biotechnology Group, Institute of Developmental and Molecular Biology of Plants, Heinrich Heine University, and Cluster of Excellence on Plant Sciences (CEPLAS), Düsseldorf, Germany
| | - M. K. M. Engqvist
- Department of Biology and Biological engineering, Division of Systems and Synthetic Biology, Chalmers University of Technology, Gothenburg, Sweden
| | - V. G. Maurino
- Plant Molecular Physiology and Biotechnology Group, Institute of Developmental and Molecular Biology of Plants, Heinrich Heine University, and Cluster of Excellence on Plant Sciences (CEPLAS), Düsseldorf, Germany
| |
|
31
|
Gil N, Fiser A. Identifying functionally informative evolutionary sequence profiles. Bioinformatics 2018; 34:1278-1286. [PMID: 29211823 PMCID: PMC5905606 DOI: 10.1093/bioinformatics/btx779] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 06/28/2017] [Accepted: 11/29/2017] [Indexed: 01/06/2023] Open
Abstract
Motivation: Multiple sequence alignments (MSAs) can provide essential input to many bioinformatics applications, including protein structure prediction and functional annotation. However, the optimal selection of sequences to obtain biologically informative MSAs for such purposes is poorly explored, and has traditionally been performed manually. Results: We present Selection of Alignment by Maximal Mutual Information (SAMMI), an automated, sequence-based approach to objectively select an optimal MSA from a large set of alternatives sampled from a general sequence database search. The hypothesis of this approach is that the mutual information among MSA columns will be maximal for those MSAs that contain the most diverse set possible of the most structurally and functionally homogeneous protein sequences. SAMMI was tested to select MSAs for functional site residue prediction by analysis of conservation patterns on a set of 435 proteins obtained from protein-ligand (peptides, nucleic acids and small substrates) and protein-protein interaction databases. Availability and implementation: A freely accessible program, including source code, implementing SAMMI is available at https://github.com/nelsongil92/SAMMI.git. Contact: andras.fiser@einstein.yu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
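The quantity the abstract above builds on, mutual information among MSA columns, can be illustrated with a short sketch. This is not the SAMMI implementation: the abstract does not say how SAMMI aggregates or normalizes the scores, so averaging plain mutual information (in bits) over all column pairs is an assumption made here for illustration, and both function names are hypothetical.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Mutual information (bits) between two equal-length alignment columns."""
    if len(col_a) != len(col_b) or not col_a:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b), written with raw counts to avoid extra divisions
        mi += p_ab * log2(p_ab * n * n / (count_a[a] * count_b[b]))
    return mi


def mean_pairwise_mi(msa):
    """Average mutual information over all column pairs of an MSA.

    `msa` is a list of equal-length aligned sequences (strings).
    Averaging over pairs is an illustrative choice, not SAMMI's documented score.
    """
    columns = list(zip(*msa))
    pairs = [(i, j) for i in range(len(columns)) for j in range(i + 1, len(columns))]
    if not pairs:
        return 0.0
    total = sum(column_mutual_information(columns[i], columns[j]) for i, j in pairs)
    return total / len(pairs)
```

Two perfectly co-varying columns such as `"AACC"` and `"GGTT"` score 1 bit, while statistically independent columns such as `"AACC"` and `"GTGT"` score 0; a selection procedure in the spirit of the abstract would compare such scores across candidate alignments.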
Affiliation(s)
- Nelson Gil
- Department of Systems & Computational Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Andras Fiser
- Department of Systems & Computational Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| |
|
32
|
Affiliation(s)
- Jacquelyn S. Fetrow
- Office of the President, Albright College, Reading, Pennsylvania, United States of America
| | - Patricia C. Babbitt
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, United States of America
| |
|
33
|
McManus HA, Fučíková K, Lewis PO, Lewis LA, Karol KG. Organellar phylogenomics inform systematics in the green algal family Hydrodictyaceae (Chlorophyceae) and provide clues to the complex evolutionary history of plastid genomes in the green algal tree of life. AMERICAN JOURNAL OF BOTANY 2018; 105:315-329. [PMID: 29722901 DOI: 10.1002/ajb2.1066] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 07/05/2017] [Accepted: 12/19/2017] [Indexed: 05/11/2023]
Abstract
PREMISE OF THE STUDY: Phylogenomic analyses across the green algae are resolving relationships at the class, order, and family levels and highlighting dynamic patterns of evolution in organellar genomes. Here we present a within-family phylogenomic study to resolve genera and species relationships in the family Hydrodictyaceae (Chlorophyceae), for which poor resolution in previous phylogenetic studies, along with divergent morphological traits, have precluded taxonomic revisions. METHODS: Complete plastome sequences and mitochondrial protein-coding gene sequences were acquired from representatives of the Hydrodictyaceae using next-generation sequencing methods. Plastomes were characterized, and gene order and content were compared with plastomes spanning the Sphaeropleales. Single-gene and concatenated-gene phylogenetic analyses of plastid and mitochondrial genes were performed. KEY RESULTS: The Hydrodictyaceae contain the largest sphaeroplealean plastomes thus far fully sequenced. Conservation of plastome gene order within Hydrodictyaceae is striking compared with more dynamic patterns revealed across Sphaeropleales. Phylogenetic analyses resolve Hydrodictyon sister to a monophyletic Pediastrum, though the morphologically distinct P. angulosum and P. duplex continue to be polyphyletic. Analyses of plastid data supported the neochloridacean genus Chlorotetraëdron as sister to Hydrodictyaceae, while conflicting signal was found in the mitochondrial data. CONCLUSIONS: A phylogenomic approach resolved within-family relationships not obtainable with previous phylogenetic analyses. Denser taxon sampling across Sphaeropleales is necessary to capture patterns in plastome evolution, and further taxa and studies are needed to fully resolve the sister lineage to Hydrodictyaceae and polyphyly of Pediastrum angulosum and P. duplex.
Affiliation(s)
- Hilary A McManus
- Department of Biological and Environmental Sciences, Le Moyne College, 1419 Salt Springs Road, Syracuse, New York, 13066, USA
| | - Karolina Fučíková
- Department of Natural Sciences, Assumption College, 500 Salisbury Street, Worcester, Massachusetts, 01609, USA
| | - Paul O Lewis
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, 06269, USA
| | - Louise A Lewis
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, 06269, USA
| | - Kenneth G Karol
- Lewis B. and Dorothy Cullman Program for Molecular Systematics, The New York Botanical Garden, 2900 Southern Boulevard, Bronx, New York, 10458, USA
| |
|
34
|
Contrasting carbon metabolism in saprotrophic and pathogenic microascalean fungi from Protea trees. FUNGAL ECOL 2017. [DOI: 10.1016/j.funeco.2017.09.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/28/2022]
|
35
|
Abstract
Surveys of public sequence resources show that experimentally supported functional information is still completely missing for a considerable fraction of known proteins and is clearly incomplete for an even larger portion. Bioinformatics methods have long made use of very diverse data sources alone or in combination to predict protein function, with the understanding that different data types help elucidate complementary biological roles. This chapter focuses on methods accepting amino acid sequences as input and producing GO term assignments directly as outputs; the relevant biological and computational concepts are presented along with the advantages and limitations of individual approaches.
Affiliation(s)
- Domenico Cozzetto
- Bioinformatics Group, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK
- David T Jones
- Bioinformatics Group, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK
|
36
|
Koehorst JJ, Saccenti E, Schaap PJ, Martins Dos Santos VAP, Suarez-Diez M. Protein domain architectures provide a fast, efficient and scalable alternative to sequence-based methods for comparative functional genomics. F1000Res 2016; 5:1987. [PMID: 27703668 PMCID: PMC5031134 DOI: 10.12688/f1000research.9416.3] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Accepted: 06/26/2017] [Indexed: 11/20/2022] Open
Abstract
A functional comparative genome analysis is essential to understand the mechanisms underlying bacterial evolution and adaptation. Detection of functional orthologs using standard global sequence similarity methods faces several problems: the need to define arbitrary acceptance thresholds for similarity and alignment length, lateral gene acquisition, and the high computational cost of finding bi-directional best matches at a large scale. We investigated the use of protein domain architectures for large scale functional comparative analysis as an alternative method. The performance of both approaches was assessed through functional comparison of 446 bacterial genomes sampled at different taxonomic levels. We show that protein domain architectures provide a fast and efficient alternative to methods based on sequence similarity to identify groups of functionally equivalent proteins within and across taxonomic boundaries, and that the approach is suitable for large scale comparative analysis. Running both methods in parallel pinpoints potential functional adaptations that may add to bacterial fitness.
Affiliation(s)
- Jasper J Koehorst
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands
- Edoardo Saccenti
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands
- Peter J Schaap
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands
- Vitor A P Martins Dos Santos
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands; LifeGlimmer GmbH, Berlin, Germany
- Maria Suarez-Diez
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands
|
37
|
Wheeler NE, Barquist L, Kingsley RA, Gardner PP. A profile-based method for identifying functional divergence of orthologous genes in bacterial genomes. Bioinformatics 2016; 32:3566-3574. [PMID: 27503221 PMCID: PMC5181535 DOI: 10.1093/bioinformatics/btw518] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 05/03/2016] [Revised: 07/17/2016] [Accepted: 08/02/2016] [Indexed: 02/04/2023] Open
Abstract
Motivation: Next generation sequencing technologies have provided us with a wealth of information on genetic variation, but predicting the functional significance of this variation is a difficult task. While many comparative genomics studies have focused on gene flux and large scale changes, relatively little attention has been paid to quantifying the effects of single nucleotide polymorphisms and indels on protein function, particularly in bacterial genomics. Results: We present a hidden Markov model based approach we call delta-bitscore (DBS) for identifying orthologous proteins that have diverged at the amino acid sequence level in a way that is likely to impact biological function. We benchmark this approach with several widely used datasets and apply it to a proof-of-concept study of orthologous proteomes in an investigation of host adaptation in Salmonella enterica. We highlight the value of the method in identifying functional divergence of genes, and suggest that this tool may be a better approach than the commonly used dN/dS metric for identifying functionally significant genetic changes occurring in recently diverged organisms. Availability and Implementation: A program implementing DBS for pairwise genome comparisons is freely available at: https://github.com/UCanCompBio/deltaBS. Contact: nicole.wheeler@pg.canterbury.ac.nz or lars.barquist@uni-wuerzburg.de. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Nicole E Wheeler
- School of Biological Sciences, University of Canterbury, Christchurch, New Zealand; Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand
- Lars Barquist
- Institute for Molecular Infection Biology, University of Wuerzburg, Wuerzburg, Germany
- Robert A Kingsley
- Institute of Food Research, Norwich Research Park, Norwich, UK; Wellcome Trust Sanger Institute, Hinxton, UK
- Paul P Gardner
- School of Biological Sciences, University of Canterbury, Christchurch, New Zealand; Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand; Bio-protection Research Centre, University of Canterbury, Christchurch, New Zealand
|
38
|
Lee D, Das S, Dawson NL, Dobrijevic D, Ward J, Orengo C. Novel Computational Protocols for Functionally Classifying and Characterising Serine Beta-Lactamases. PLoS Comput Biol 2016; 12:e1004926. [PMID: 27332861 PMCID: PMC4917113 DOI: 10.1371/journal.pcbi.1004926] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 12/02/2015] [Accepted: 04/19/2016] [Indexed: 11/23/2022] Open
Abstract
Beta-lactamases represent the main bacterial mechanism of resistance to beta-lactam antibiotics and are a significant challenge to modern medicine. We have developed an automated classification and analysis protocol that exploits structure- and sequence-based approaches and which allows us to propose a grouping of serine beta-lactamases that more consistently captures and rationalizes the existing three classification schemes: Classes (A, C and D, which vary in their implementation of the mechanism of action); Types (that largely reflect evolutionary distance measured by sequence similarity); and Variant groups (which largely correspond with the Bush-Jacoby clinical groups). Our analysis platform exploits a suite of in-house and public tools to identify Functional Determinants (FDs), i.e. residue sites, responsible for conferring different phenotypes between different classes, different types and different variants. We focused on Class A beta-lactamases, the most highly populated and clinically relevant class, to identify FDs implicated in the distinct phenotypes associated with different Class A Types and Variants. We show that our FunFHMMer method can separate the known beta-lactamase classes and identify those positions likely to be responsible for the different implementations of the mechanism of action in these enzymes. Two novel algorithms, ASSP and SSPA, allow detection of FD sites likely to contribute to the broadening of the substrate profiles. Using our approaches, we recognise 151 Class A types in UniProt. Finally, we used our beta-lactamase FunFams and ASSP profiles to detect 4 novel Class A types in microbiome samples. Our platforms have been validated by literature studies, in silico analysis and some targeted experimental verification. Although developed for the serine beta-lactamases, they could be used to classify and analyse any diverse protein superfamily where sub-families have diverged over both long and short evolutionary timescales.
Affiliation(s)
- David Lee
- Institute of Structural and Molecular Biology, University College London, London, United Kingdom
- Sayoni Das
- Institute of Structural and Molecular Biology, University College London, London, United Kingdom
- Natalie L. Dawson
- Institute of Structural and Molecular Biology, University College London, London, United Kingdom
- Dragana Dobrijevic
- Department of Biochemical Engineering, University College London, London, United Kingdom
- John Ward
- Department of Biochemical Engineering, University College London, London, United Kingdom
- Christine Orengo
- Institute of Structural and Molecular Biology, University College London, London, United Kingdom
|
39
|
Das S, Orengo CA. Protein function annotation using protein domain family resources. Methods 2016; 93:24-34. [DOI: 10.1016/j.ymeth.2015.09.029] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 07/02/2015] [Revised: 09/28/2015] [Accepted: 09/29/2015] [Indexed: 01/25/2023] Open
|
40
|
Wu Z, Hu G, Yang J, Peng Z, Uversky VN, Kurgan L. In various protein complexes, disordered protomers have large per-residue surface areas and area of protein-, DNA- and RNA-binding interfaces. FEBS Lett 2015; 589:2561-9. [DOI: 10.1016/j.febslet.2015.08.014] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 06/28/2015] [Revised: 07/31/2015] [Accepted: 08/03/2015] [Indexed: 11/28/2022]
|
41
|
Rozenberg A, Parida M, Leese F, Weiss LC, Tollrian R, Manak JR. Transcriptional profiling of predator-induced phenotypic plasticity in Daphnia pulex. Front Zool 2015. [PMID: 26213557 PMCID: PMC4514973 DOI: 10.1186/s12983-015-0109-x] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 11/14/2022] Open
Abstract
Background: Predator-induced defences are a prominent example of phenotypic plasticity found from single-celled organisms to vertebrates. The water flea Daphnia pulex is a very convenient ecological genomic model for studying predator-induced defences as it exhibits substantial morphological changes under predation risk. Most importantly, however, genetically identical clones can be transcriptionally profiled under both control and predation risk conditions and be compared due to the availability of the sequenced reference genome. Earlier gene expression analyses of candidate genes as well as a tiled genomic microarray expression experiment have provided insights into some genes involved in predator-induced phenotypic plasticity. Here we performed the first RNA-Seq analysis to identify genes that were differentially expressed in defended vs. undefended D. pulex specimens in order to explore the genetic mechanisms underlying predator-induced defences at a qualitatively novel level. Results: We report 230 differentially expressed genes (158 up- and 72 down-regulated) identified in at least two of three different assembly approaches. Several of the differentially regulated genes belong to families of paralogous genes. The most prominent classes amongst the up-regulated genes include cuticle genes, zinc-metalloproteinases and vitellogenin genes. Furthermore, several genes from this group code for proteins recruited in chromatin-reorganization or regulation of the cell cycle (cyclins). Down-regulated gene classes include C-type lectins, proteins involved in lipogenesis, and other families, some of which encode proteins with no known molecular function. Conclusions: The RNA-Seq transcriptome data presented in this study provide important insights into gene regulatory patterns underlying predator-induced defences. 
In particular, we characterized different effector genes and gene families found to be regulated in Daphnia in response to the presence of an invertebrate predator. These effector genes are mostly in agreement with expectations based on observed phenotypic changes including morphological alterations, i.e., expression of proteins involved in formation of protective structures and in cuticle strengthening, as well as proteins required for resource re-allocation. Our findings identify key genetic pathways associated with anti-predator defences.
Affiliation(s)
- Andrey Rozenberg
- Department of Animal Ecology, Evolution and Biodiversity, Ruhr University Bochum, Universitaetsstrasse 150, Bochum, 44801 Germany
- Mrutyunjaya Parida
- Departments of Biology and Pediatrics and the Roy J. Carver Center for Genomics, 459 Biology Building, University of Iowa, Iowa City, IA 52242 USA
- Florian Leese
- Department of Animal Ecology, Evolution and Biodiversity, Ruhr University Bochum, Universitaetsstrasse 150, Bochum, 44801 Germany; Present address: University of Duisburg-Essen, Aquatic Ecosystems Research, Universitaetsstrasse 5, Essen, 45141 Germany
- Linda C Weiss
- Department of Animal Ecology, Evolution and Biodiversity, Ruhr University Bochum, Universitaetsstrasse 150, Bochum, 44801 Germany; Environmental Genomics Group, School of Biosciences, University of Birmingham, Birmingham, B15 2TT UK
- Ralph Tollrian
- Department of Animal Ecology, Evolution and Biodiversity, Ruhr University Bochum, Universitaetsstrasse 150, Bochum, 44801 Germany
- J Robert Manak
- Departments of Biology and Pediatrics and the Roy J. Carver Center for Genomics, 459 Biology Building, University of Iowa, Iowa City, IA 52242 USA
|
42
|
Wang T, Mori H, Zhang C, Kurokawa K, Xing XH, Yamada T. DomSign: a top-down annotation pipeline to enlarge enzyme space in the protein universe. BMC Bioinformatics 2015; 16:96. [PMID: 25888481 PMCID: PMC4389672 DOI: 10.1186/s12859-015-0499-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 10/07/2014] [Accepted: 02/18/2015] [Indexed: 12/27/2022] Open
Abstract
Background: Computational predictions of catalytic function are vital for in-depth understanding of enzymes. Because several novel approaches performing better than the common BLAST tool are rarely applied in research, we hypothesized that there is a large gap between the number of known annotated enzymes and the actual number in the protein universe, which significantly limits our ability to extract additional biologically relevant functional information from the available sequencing data. To reliably expand the enzyme space, we developed DomSign, a highly accurate domain signature–based enzyme functional prediction tool to assign Enzyme Commission (EC) digits. Results: DomSign is a top-down prediction engine that yields results comparable, or superior, to those from many benchmark EC number prediction tools, including BLASTP, when a homolog with an identity >30% is not available in the database. Performance tests showed that DomSign is a highly reliable enzyme EC number annotation tool. After multiple tests, the accuracy is thought to be greater than 90%. Thus, DomSign can be applied to large-scale datasets, with the goal of expanding the enzyme space with high fidelity. Using DomSign, we successfully increased the percentage of EC-tagged enzymes from 12% to 30% in UniProt-TrEMBL. In the Kyoto Encyclopedia of Genes and Genomes bacterial database, the percentage of EC-tagged enzymes for each bacterial genome could be increased from 26.0% to 33.2% on average. Metagenomic mining was also efficient, as exemplified by the application of DomSign to the Human Microbiome Project dataset, recovering nearly one million new EC-labeled enzymes. Conclusions: Our results offer preliminary confirmation of the existence of the hypothesized huge number of “hidden enzymes” in the protein universe, the identification of which could substantially further our understanding of the metabolisms of diverse organisms and also facilitate bioengineering by providing a richer enzyme resource. 
Furthermore, our results highlight the necessity of using more advanced computational tools than BLAST in protein database annotations to extract additional biologically relevant functional information from the available biological sequences.
Affiliation(s)
- Tianmin Wang
- Department of Biological Information, Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 2-12-1 M6-3, Ookayama, Meguro-ku, Tokyo, 152-8550, Japan; Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
- Hiroshi Mori
- Department of Biological Information, Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 2-12-1 M6-3, Ookayama, Meguro-ku, Tokyo, 152-8550, Japan; Earth-Life Science Institute, Tokyo Institute of Technology, 2-12-1-E3-10 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan
- Chong Zhang
- Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
- Ken Kurokawa
- Department of Biological Information, Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 2-12-1 M6-3, Ookayama, Meguro-ku, Tokyo, 152-8550, Japan; Earth-Life Science Institute, Tokyo Institute of Technology, 2-12-1-E3-10 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan
- Xin-Hui Xing
- Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
- Takuji Yamada
- Department of Biological Information, Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 2-12-1 M6-3, Ookayama, Meguro-ku, Tokyo, 152-8550, Japan
|
43
|
Sanchez-Pulido L, Ponting CP. TM6SF2 and MAC30, new enzyme homologs in sterol metabolism and common metabolic disease. Front Genet 2014; 5:439. [PMID: 25566323 PMCID: PMC4263179 DOI: 10.3389/fgene.2014.00439] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 10/16/2014] [Accepted: 11/27/2014] [Indexed: 12/14/2022] Open
Abstract
Carriers of the Glu167Lys coding variant in the TM6SF2 gene have recently been identified as being more susceptible to non-alcoholic fatty liver disease (NAFLD), yet exhibit lower levels of circulating lipids and hence are protected against cardiovascular disease. Despite the physiological importance of these observations, the molecular function of TM6SF2 remains unknown, and no sequence similarity with functionally characterized proteins has been identified. In order to trace its evolutionary history and to identify functional domains, we embarked on a computational protein sequence analysis of TM6SF2. We identified a new domain, the EXPERA domain, which is conserved among TM6SF, MAC30/TMEM97 and EBP (Δ8, Δ7 sterol isomerase) protein families. EBP mutations are the cause of chondrodysplasia punctata 2 X-linked dominant (CDPX2), also known as Conradi-Hünermann-Happle syndrome, a defective cholesterol biosynthesis disorder. Our analysis of evolutionary conservation among EXPERA domain-containing families, together with the previously suggested catalytic mechanism for the EBP enzyme, indicates that TM6SF and MAC30/TMEM97 families are both highly likely to possess, as for the EBP family, catalytic activity as sterol isomerases. This unexpected prediction of enzymatic functions for TM6SF and MAC30/TMEM97 is important because it now permits detailed experiments to investigate the function of these key proteins in various human pathologies, from cardiovascular disease to cancer.
Affiliation(s)
- Luis Sanchez-Pulido
- Medical Research Council Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK
- Chris P Ponting
- Medical Research Council Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK
|
44
|
Mizianty MJ, Fan X, Yan J, Chalmers E, Woloschuk C, Joachimiak A, Kurgan L. Covering complete proteomes with X-ray structures: a current snapshot. ACTA CRYSTALLOGRAPHICA. SECTION D, BIOLOGICAL CRYSTALLOGRAPHY 2014; 70:2781-93. [PMID: 25372670 PMCID: PMC4220968 DOI: 10.1107/s1399004714019427] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 04/23/2014] [Accepted: 08/27/2014] [Indexed: 12/23/2022]
Abstract
Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.
Affiliation(s)
- Marcin J. Mizianty
- Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
- Xiao Fan
- Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
- Jing Yan
- Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
- Eric Chalmers
- Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
- Christopher Woloschuk
- Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
- Andrzej Joachimiak
- Midwest Center for Structural Genomics, Argonne National Laboratory, Argonne, IL 60439, USA
- Lukasz Kurgan
- Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
|
45
|
Zayner JP, Antoniou C, French AR, Hause RJ, Sosnick TR. Investigating models of protein function and allostery with a widespread mutational analysis of a light-activated protein. Biophys J 2014; 105:1027-36. [PMID: 23972854 DOI: 10.1016/j.bpj.2013.07.010] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Received: 04/30/2013] [Revised: 06/17/2013] [Accepted: 07/01/2013] [Indexed: 10/26/2022] Open
Abstract
To investigate the relationship between a protein's sequence and its biophysical properties, we studied the effects of more than 100 mutations in Avena sativa light-oxygen-voltage domain 2, a model protein of the Per-Arnt-Sim family. The A. sativa light-oxygen-voltage domain 2 undergoes a photocycle with a conformational change involving the unfolding of the terminal helices. Whereas selection studies typically search for winners in a large population and fail to characterize many sites, we characterized the biophysical consequences of mutations throughout the protein using NMR, circular dichroism, and ultraviolet/visible spectroscopy. Despite our intention to introduce highly disruptive substitutions, most had modest or no effect on function, and many could even be considered to be more photoactive. Substitutions at evolutionarily conserved sites can have minimal effect, whereas those at nonconserved positions can have large effects, contrary to the view that the effects of mutations, especially at conserved positions, are predictable. Using predictive models, we found that the effects of mutations on biophysical function and allostery reflect a complex mixture of multiple characteristics including location, character, electrostatics, and chemistry.
Affiliation(s)
- Josiah P Zayner
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, IL, USA
|
46
|
Nagao C, Nagano N, Mizuguchi K. Prediction of detailed enzyme functions and identification of specificity determining residues by random forests. PLoS One 2014; 9:e84623. [PMID: 24416252 PMCID: PMC3885575 DOI: 10.1371/journal.pone.0084623] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 06/27/2013] [Accepted: 11/15/2013] [Indexed: 12/03/2022] Open
Abstract
Determining enzyme functions is essential for a thorough understanding of cellular processes. Although many prediction methods have been developed, it remains a significant challenge to predict enzyme functions at the fourth-digit level of the Enzyme Commission numbers. Functional specificity of enzymes often changes drastically by mutations of a small number of residues and therefore, information about these critical residues can potentially help discriminate detailed functions. However, because these residues must be identified by mutagenesis experiments, the available information is limited, and the lack of experimentally verified specificity determining residues (SDRs) has hindered the development of detailed function prediction methods and computational identification of SDRs. Here we present a novel method for predicting enzyme functions by random forests, EFPrf, along with a set of putative SDRs, the random forests derived SDRs (rf-SDRs). EFPrf consists of a set of binary predictors for enzymes in each CATH superfamily and the rf-SDRs are the residue positions corresponding to the most highly contributing attributes obtained from each predictor. EFPrf showed a precision of 0.98 and a recall of 0.89 in a cross-validated benchmark assessment. The rf-SDRs included many residues, whose importance for specificity had been validated experimentally. The analysis of the rf-SDRs revealed both a general tendency that functionally diverged superfamilies tend to include more active site residues in their rf-SDRs than in less diverged superfamilies, and superfamily-specific conservation patterns of each functional residue. EFPrf and the rf-SDRs will be an effective tool for annotating enzyme functions and for understanding how enzyme functions have diverged within each superfamily.
Affiliation(s)
- Chioko Nagao
- National Institute of Biomedical Innovation, Ibaraki, Osaka, Japan
- Nozomi Nagano
- Computational Biology Research Center, AIST, Koto-ku, Tokyo, Japan
- Kenji Mizuguchi
- National Institute of Biomedical Innovation, Ibaraki, Osaka, Japan
|
47
|
Wang P, Lai WF, Li MJ, Xu F, Yalamanchili HK, Lovell-Badge R, Wang J. Inference of gene-phenotype associations via protein-protein interaction and orthology. PLoS One 2013; 8:e77478. [PMID: 24194887 PMCID: PMC3806783 DOI: 10.1371/journal.pone.0077478] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 06/27/2013] [Accepted: 08/30/2013] [Indexed: 01/23/2023] Open
Abstract
One of the fundamental goals of genetics is to understand gene functions and their associated phenotypes. To achieve this goal, in this study we developed a computational algorithm that uses orthology and protein-protein interaction information to infer gene-phenotype associations for multiple species. Furthermore, we developed a web server that provides genome-wide phenotype inference for six species: fly, human, mouse, worm, yeast, and zebrafish. We evaluated our inference method by comparing the inferred results with known gene-phenotype associations. The high Area Under the Curve values suggest a significant performance of our method. By applying our method to two human representative diseases, Type 2 Diabetes and Breast Cancer, we demonstrated that our method is able to identify related Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways. The web server can be used to infer functions and putative phenotypes of a gene along with the candidate genes of a phenotype, and thus aids in disease candidate gene discovery. Our web server is available at http://jjwanglab.org/PhenoPPIOrth.
Affiliation(s)
- Panwen Wang
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, China
- Wing-Fu Lai
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Mulin Jun Li
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, China
- Feng Xu
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, China
- Hari Krishna Yalamanchili
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, China
- Robin Lovell-Badge
- Division of Developmental Genetics, MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London, United Kingdom
- Junwen Wang
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, China
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
|
48
|
Prediction and experimental validation of enzyme substrate specificity in protein structures. Proc Natl Acad Sci U S A 2013; 110:E4195-202. [PMID: 24145433 DOI: 10.1073/pnas.1305162110] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 11/18/2022] Open
Abstract
Structural Genomics aims to elucidate protein structures to identify their functions. Unfortunately, the variation of just a few residues can be enough to alter activity or binding specificity and limit the functional resolution of annotations based on sequence and structure; in enzymes, substrates are especially difficult to predict. Here, large-scale controls and direct experiments show that the local similarity of five or six residues selected because they are evolutionarily important and on the protein surface can suffice to identify an enzyme activity and substrate. A motif of five residues predicted that a previously uncharacterized Silicibacter sp. protein was a carboxylesterase for short fatty acyl chains, similar to hormone-sensitive-lipase-like proteins that share less than 20% sequence identity. Assays and directed mutations confirmed this activity and showed that the motif was essential for catalysis and substrate specificity. We conclude that evolutionary and structural information may be combined on a Structural Genomics scale to create motifs of mixed catalytic and noncatalytic residues that identify enzyme activity and substrate specificity.
|
49
|
Tan K, Chang C, Cuff M, Osipiuk J, Landorf E, Mack JC, Zerbs S, Joachimiak A, Collart FR. Structural and functional characterization of solute binding proteins for aromatic compounds derived from lignin: p-coumaric acid and related aromatic acids. Proteins 2013; 81:1709-26. [PMID: 23606130 DOI: 10.1002/prot.24305] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 08/07/2012] [Revised: 03/12/2013] [Accepted: 03/28/2013] [Indexed: 11/10/2022]
Abstract
Lignin comprises 15-25% of plant biomass and represents a major environmental carbon source for utilization by soil microorganisms. Access to this energy resource requires the action of fungal and bacterial enzymes to break down the lignin polymer into a complex assortment of aromatic compounds that can be transported into the cells. To improve our understanding of the utilization of lignin by microorganisms, we characterized the molecular properties of solute binding proteins of ATP-binding cassette transporter proteins that interact with these compounds. A combination of functional screens and structural studies characterized the binding specificity of the solute binding proteins for aromatic compounds derived from lignin such as p-coumarate, 3-phenylpropionic acid and compounds with more complex ring substitutions. A ligand screen based on thermal stabilization identified several binding protein clusters that exhibit preferences based on the size or number of aromatic ring substituents. Multiple X-ray crystal structures of protein-ligand complexes for these clusters identified the molecular basis of the binding specificity for the lignin-derived aromatic compounds. The screens and structural data provide new functional assignments for these solute-binding proteins which can be used to infer their transport specificity. This knowledge of the functional roles and molecular binding specificity of these proteins will support the identification of the specific enzymes and regulatory proteins of peripheral pathways that funnel these compounds to central metabolic pathways and will improve the predictive power of sequence-based functional annotation methods for this family of proteins.
Affiliation(s)
- Kemin Tan
- Biosciences Division, Argonne National Laboratory, Lemont, Illinois, 60439; The Midwest Center for Structural Genomics, Argonne National Laboratory, Lemont, Illinois, 60439; Structural Biology Center, Argonne National Laboratory, Lemont, Illinois, 60439
|
50
|
Primmer CR, Papakostas S, Leder EH, Davis MJ, Ragan MA. Annotated genes and nonannotated genomes: cross-species use of Gene Ontology in ecology and evolution research. Mol Ecol 2013; 22:3216-41. [DOI: 10.1111/mec.12309] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Received: 11/07/2012] [Revised: 02/22/2013] [Accepted: 02/26/2013] [Indexed: 02/01/2023]
Affiliation(s)
- C. R. Primmer
- Department of Biology; University of Turku; 20014 Turku Finland
- S. Papakostas
- Department of Biology; University of Turku; 20014 Turku Finland
- E. H. Leder
- Department of Biology; University of Turku; 20014 Turku Finland
- M. J. Davis
- Institute for Molecular Bioscience; The University of Queensland; Brisbane Qld 4072 Australia
- M. A. Ragan
- Institute for Molecular Bioscience; The University of Queensland; Brisbane Qld 4072 Australia
|