Reference Citation Analysis

For: Godzik A. Metagenomics and the protein universe. Curr Opin Struct Biol 2011;21:398-403. [PMID: 21497084 DOI: 10.1016/j.sbi.2011.03.010] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Received: 01/25/2011] [Revised: 03/07/2011] [Accepted: 03/24/2011] [Indexed: 02/07/2023]

Cited by Other Article(s)

Hogg BN, Schnepel C, Finnigan JD, Charnock SJ, Hayes MA, Turner NJ. The Impact of Metagenomics on Biocatalysis. Angew Chem Int Ed Engl 2024;63:e202402316. [PMID: 38494442 DOI: 10.1002/anie.202402316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 03/11/2024] [Accepted: 03/12/2024] [Indexed: 03/19/2024]

Thermophilic Carboxylesterases from Hydrothermal Vents of the Volcanic Island of Ischia Active on Synthetic and Biobased Polymers and Mycotoxins. Appl Environ Microbiol 2023;89:e0170422. [PMID: 36719236 PMCID: PMC9972953 DOI: 10.1128/aem.01704-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]

Abstract

Hydrothermal vents are geographically widespread and host microorganisms with robust enzymes useful in various industrial applications. We examined microbial communities and carboxylesterases of two terrestrial hydrothermal vents of the volcanic island of Ischia (Italy) predominantly composed of Firmicutes, Proteobacteria, and Bacteroidota. High-temperature enrichment cultures with the polyester plastics polyhydroxybutyrate and polylactic acid (PLA) resulted in an increase of Thermus and Geobacillus species and to some extent Fontimonas and Schleiferia species. The screening at 37 to 70°C of metagenomic fosmid libraries from above enrichment cultures identified three hydrolases (IS10, IS11, and IS12), all derived from yet-uncultured Chloroflexota and showing low sequence identity (33 to 56%) to characterized enzymes. Enzymes expressed in Escherichia coli exhibited maximal esterase activity at 70 to 90°C, with IS11 showing the highest thermostability (90% activity after 20-min incubation at 80°C). IS10 and IS12 were highly substrate promiscuous and hydrolyzed all 51 monoester substrates tested. Enzymes were active with PLA, polyethylene terephthalate model substrate, and mycotoxin T-2 (IS12). IS10 and IS12 had a classical α/β-hydrolase core domain with a serine hydrolase catalytic triad (Ser155, His280, and Asp250) in their hydrophobic active sites. The crystal structure of IS11 resolved at 2.92 Å revealed the presence of a N-terminal β-lactamase-like domain and C-terminal lipocalin domain. The catalytic cleft of IS11 included catalytic Ser68, Lys71, Tyr160, and Asn162, whereas the lipocalin domain enclosed the catalytic cleft like a lid and contributed to substrate binding. Our study identified novel thermotolerant carboxylesterases with a broad substrate range, including polyesters and mycotoxins, for potential applications in biotechnology. 
IMPORTANCE High-temperature-active microbial enzymes are important biocatalysts for many industrial applications, including recycling of synthetic and biobased polyesters increasingly used in textiles, fibers, coatings and adhesives. Here, we identified three novel thermotolerant carboxylesterases (IS10, IS11, and IS12) from high-temperature enrichment cultures from Ischia hydrothermal vents and incubated with biobased polymers. The identified metagenomic enzymes originated from uncultured Chloroflexota and showed low sequence similarity to known carboxylesterases. Active sites of IS10 and IS12 had the largest effective volumes among the characterized prokaryotic carboxylesterases and exhibited high substrate promiscuity, including hydrolysis of polyesters and mycotoxin T-2 (IS12). Though less promiscuous than IS10 and IS12, IS11 had a higher thermostability with a high temperature optimum (80 to 90°C) for activity and hydrolyzed polyesters, and its crystal structure revealed an unusual lipocalin domain likely involved in substrate binding. The polyesterase activity of these enzymes makes them attractive candidates for further optimization and potential application in plastics recycling.

Akdel M, Pires DEV, Pardo EP, Jänes J, Zalevsky AO, Mészáros B, Bryant P, Good LL, Laskowski RA, Pozzati G, Shenoy A, Zhu W, Kundrotas P, Serra VR, Rodrigues CHM, Dunham AS, Burke D, Borkakoti N, Velankar S, Frost A, Basquin J, Lindorff-Larsen K, Bateman A, Kajava AV, Valencia A, Ovchinnikov S, Durairaj J, Ascher DB, Thornton JM, Davey NE, Stein A, Elofsson A, Croll TI, Beltrao P. A structural biology community assessment of AlphaFold2 applications. Nat Struct Mol Biol 2022;29:1056-1067. [PMID: 36344848 PMCID: PMC9663297 DOI: 10.1038/s41594-022-00849-w] [Citation(s) in RCA: 198] [Impact Index Per Article: 99.0] [Received: 10/25/2021] [Accepted: 09/20/2022] [Indexed: 11/09/2022]

Affiliation(s)

Mehmet Akdel Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
Douglas E V Pires School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
Eduard Porta Pardo Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain; Barcelona Supercomputing Center (BSC), Barcelona, Spain
Jürgen Jänes European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
Arthur O Zalevsky Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, Russian Federation
Bálint Mészáros European Molecular Biology Laboratory, Heidelberg, Germany
Patrick Bryant Dep of Biochemistry and Biophysics and Science for Life Laboratory, Solna, Sweden
Lydia L Good Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
Roman A Laskowski European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
Gabriele Pozzati Dep of Biochemistry and Biophysics and Science for Life Laboratory, Solna, Sweden
Aditi Shenoy Dep of Biochemistry and Biophysics and Science for Life Laboratory, Solna, Sweden
Wensi Zhu Dep of Biochemistry and Biophysics and Science for Life Laboratory, Solna, Sweden
Petras Kundrotas Dep of Biochemistry and Biophysics and Science for Life Laboratory, Solna, Sweden
Victoria Ruiz Serra Barcelona Supercomputing Center (BSC), Barcelona, Spain
Carlos H M Rodrigues School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
Alistair S Dunham European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
David Burke European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
Neera Borkakoti European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
Sameer Velankar European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
Adam Frost Department of Biochemistry and Biophysics, University of California, San Francisco, CA, USA
Jérôme Basquin Department of Structural Cell Biology, Max Planck Institute of Biochemistry, Martinsried, Germany
Kresten Lindorff-Larsen Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
Alex Bateman European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
Andrey V Kajava Université de Montpellier, Centre de Recherche en Biologie Cellulaire de Montpellier (CRBM) CNRS, Montpellier, France
Alfonso Valencia Barcelona Supercomputing Center (BSC), Barcelona, Spain
Sergey Ovchinnikov Faculty of Arts and Sciences, Division of Science, Harvard University, Cambridge, MA, USA
Janani Durairaj Biozentrum, University of Basel, Basel, Switzerland
David B Ascher School of Chemistry and Molecular Biology, University of Queensland, Brisbane, Queensland, Australia
Janet M Thornton European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
Norman E Davey Institute of Cancer Research, London, UK
Amelie Stein Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
Arne Elofsson Dep of Biochemistry and Biophysics and Science for Life Laboratory, Solna, Sweden
Tristan I Croll Cambridge Institute for Medical Research, Department of Haematology, The University of Cambridge, Cambridge, UK
Pedro Beltrao European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK; Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland

Semwal R, Aier I, Tyagi P, Varadwaj PK. DeEPn: a deep neural network based tool for enzyme functional annotation. J Biomol Struct Dyn 2020;39:2733-2743. [PMID: 32274968 DOI: 10.1080/07391102.2020.1754292] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/24/2022]

Laurenceau R, Bliem C, Osburne MS, Becker JW, Biller SJ, Cubillos-Ruiz A, Chisholm SW. Toward a genetic system in the marine cyanobacterium Prochlorococcus. Access Microbiol 2020;2:acmi000107. [PMID: 33005871 PMCID: PMC7523629 DOI: 10.1099/acmi.0.000107] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/11/2019] [Accepted: 01/30/2020] [Indexed: 11/26/2022]

Gao R, Wang M, Zhou J, Fu Y, Liang M, Guo D, Nie J. Prediction of Enzyme Function Based on Three Parallel Deep CNN and Amino Acid Mutation. Int J Mol Sci 2019;20:E2845. [PMID: 31212665 PMCID: PMC6600291 DOI: 10.3390/ijms20112845] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 05/05/2019] [Revised: 06/03/2019] [Accepted: 06/04/2019] [Indexed: 01/28/2023]

Hu G, Wang K, Song J, Uversky VN, Kurgan L. Taxonomic Landscape of the Dark Proteomes: Whole-Proteome Scale Interplay Between Structural Darkness, Intrinsic Disorder, and Crystallization Propensity. Proteomics 2018;18:e1800243. [PMID: 30198635 DOI: 10.1002/pmic.201800243] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 07/04/2018] [Revised: 08/30/2018] [Indexed: 12/14/2022]

Unique function words characterize genomic proteins. Proc Natl Acad Sci U S A 2018;115:6703-6708. [PMID: 29895692 PMCID: PMC6042118 DOI: 10.1073/pnas.1801182115] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/11/2022]

Abstract

The vast, mostly unknown protein universe can be explored by analyzing protein sequences as a string of domains. A broader coverage can be achieved when these domains, the essential blocks in protein evolution, are detected using sequence profiles. Using clustering to collapse redundant profiles into unique function words (UFWs), we find that over the years 2009–2016, the number of UFWs saturates while the number of sequences matched by a combination of two or more UFWs grows exponentially.

Between 2009 and 2016 the number of protein sequences from known species increased 10-fold from 8 million to 85 million. About 80% of these sequences contain at least one region recognized by the conserved domain architecture retrieval tool (CDART) as a sequence motif. Motifs provide clues to biological function but CDART often matches the same region of a protein by two or more profiles. Such synonyms complicate estimates of functional complexity. We do full-linkage clustering of redundant profiles by finding maximum disjoint cliques: Each cluster is replaced by a single representative profile to give what we term a unique function word (UFW). From 2009 to 2016, the number of sequence profiles used by CDART increased by 80%; the number of UFWs increased more slowly by 30%, indicating that the number of UFWs may be saturating. The number of sequences matched by a single UFW (sequences with single domain architectures) increased as slowly as the number of different words, whereas the number of sequences matched by a combination of two or more UFWs in sequences with multiple domain architectures (MDAs) increased at the same rate as the total number of sequences. This combinatorial arrangement of a limited number of UFWs in MDAs accounts for the genomic diversity of protein sequences. Although eukaryotes and prokaryotes use very similar sets of “words” or UFWs (57% shared), the “sentences” (MDAs) are different (1.3% shared).
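As a rough illustration of the clustering step described in this abstract, the sketch below groups redundant profiles so that every pair inside a cluster is linked (full linkage), by repeatedly extracting the largest remaining clique from a redundancy graph. The profile names, the redundancy pairs, the greedy extraction order, and the choice of the alphabetically first member as the representative are all assumptions made for this example; the abstract does not say how the authors implemented the procedure.

```python
from typing import Dict, Iterable, List, Set, Tuple


def maximal_cliques(adj: Dict[str, Set[str]]) -> List[Set[str]]:
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    found: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        if not p and not x:
            found.append(r)  # r cannot be extended, so it is maximal
            return
        for v in sorted(p):  # snapshot of p; p itself shrinks below
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def disjoint_clique_clusters(
    nodes: Iterable[str], pairs: Iterable[Tuple[str, str]]
) -> List[List[str]]:
    """Greedily peel off the largest clique until no profile is left."""
    adj: Dict[str, Set[str]] = {n: set() for n in nodes}
    for a, b in pairs:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    clusters: List[List[str]] = []
    while adj:
        # Largest clique first; ties broken by member names for determinism.
        best = min(maximal_cliques(adj), key=lambda c: (-len(c), sorted(c)))
        clusters.append(sorted(best))
        for v in best:
            del adj[v]
        for neighbours in adj.values():
            neighbours -= best  # drop edges to the removed profiles
    return clusters


# Hypothetical profiles: A, B, C match the same region; so do D and E.
profiles = ["A", "B", "C", "D", "E", "F"]
redundant = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]

clusters = disjoint_clique_clusters(profiles, redundant)
representatives = [members[0] for members in clusters]  # assumed selection rule
print(clusters)
print(representatives)
```

In this toy input, six profiles collapse to three representatives. Profile C is linked to D, but it is assigned only to the larger A, B, C clique, which is what keeps the clusters disjoint.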

Nicoludis JM, Gaudet R. Applications of sequence coevolution in membrane protein biochemistry. Biochim Biophys Acta Biomembr 2018;1860:895-908. [PMID: 28993150 PMCID: PMC5807202 DOI: 10.1016/j.bbamem.2017.10.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 07/19/2017] [Revised: 09/28/2017] [Accepted: 10/02/2017] [Indexed: 12/22/2022]

Popovic A, Hai T, Tchigvintsev A, Hajighasemi M, Nocek B, Khusnutdinova AN, Brown G, Glinos J, Flick R, Skarina T, Chernikova TN, Yim V, Brüls T, Le Paslier D, Yakimov MM, Joachimiak A, Ferrer M, Golyshina OV, Savchenko A, Golyshin PN, Yakunin AF. Activity screening of environmental metagenomic libraries reveals novel carboxylesterase families. Sci Rep 2017;7:44103. [PMID: 28272521 PMCID: PMC5341072 DOI: 10.1038/srep44103] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Received: 12/01/2016] [Accepted: 02/01/2017] [Indexed: 11/29/2022]

Affiliation(s)

Ana Popovic Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
Tran Hai School of Biological Sciences, Bangor University, Gwynedd LL57 2UW, UK
Anatoly Tchigvintsev Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
Mahbod Hajighasemi Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
Boguslaw Nocek Midwest Center for Structural Genomics and Structural Biology Center, Biosciences Division, Argonne National Laboratory, Argonne, Illinois 60439, USA
Anna N Khusnutdinova Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
Greg Brown Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
Julia Glinos Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
Robert Flick Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
Tatiana Skarina Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
Tatyana N Chernikova School of Biological Sciences, Bangor University, Gwynedd LL57 2UW, UK
Veronica Yim Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
Thomas Brüls Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Direction de la Recherche Fondamentale, Institut de Génomique, Université d'Evry Val d'Essonne (UEVE), Centre National de la Recherche Scientifique (CNRS), UMR8030, Génomique métabolique, Evry, France
Denis Le Paslier Université d'Evry Val d'Essonne (UEVE), Centre National de la Recherche Scientifique (CNRS), UMR8030, Génomique métabolique, Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Direction de la Recherche Fondamentale, Institut de Génomique, Evry, France
Michail M Yakimov Institute for Coastal Marine Environment, CNR, 98122 Messina, Italy
Andrzej Joachimiak Midwest Center for Structural Genomics and Structural Biology Center, Biosciences Division, Argonne National Laboratory, Argonne, Illinois 60439, USA
Manuel Ferrer Institute of Catalysis, CSIC, Madrid 28049, Spain
Olga V Golyshina School of Biological Sciences, Bangor University, Gwynedd LL57 2UW, UK
Alexei Savchenko Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
Peter N Golyshin School of Biological Sciences, Bangor University, Gwynedd LL57 2UW, UK
Alexander F Yakunin Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada

Exploring the dark foldable proteome by considering hydrophobic amino acids topology. Sci Rep 2017;7:41425. [PMID: 28134276 PMCID: PMC5278394 DOI: 10.1038/srep41425] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 09/12/2016] [Accepted: 12/19/2016] [Indexed: 12/18/2022]

Raad MD, Modavi C, Sukovich DJ, Anderson JC. Observing Biosynthetic Activity Utilizing Next Generation Sequencing and the DNA Linked Enzyme Coupled Assay. ACS Chem Biol 2017;12:191-199. [PMID: 28103681 DOI: 10.1021/acschembio.6b00652] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/30/2022]

Pearson VM, Caudle SB, Rokyta DR. Viral recombination blurs taxonomic lines: examination of single-stranded DNA viruses in a wastewater treatment plant. PeerJ 2016;4:e2585. [PMID: 27781171 PMCID: PMC5075696 DOI: 10.7717/peerj.2585] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 06/10/2016] [Accepted: 09/19/2016] [Indexed: 12/26/2022]

Discovery of Nigri/nox and Panto/pox site-specific recombinase systems facilitates advanced genome engineering. Sci Rep 2016;6:30130. [PMID: 27444945 PMCID: PMC4957104 DOI: 10.1038/srep30130] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 04/25/2016] [Accepted: 06/27/2016] [Indexed: 12/21/2022]

Lobb B, Doxey AC. Novel function discovery through sequence and structural data mining. Curr Opin Struct Biol 2016;38:53-61. [DOI: 10.1016/j.sbi.2016.05.017] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 03/19/2016] [Revised: 05/17/2016] [Accepted: 05/24/2016] [Indexed: 01/30/2023]

Wessels HJCT, de Almeida NM, Kartal B, Keltjens JT. Bacterial Electron Transfer Chains Primed by Proteomics. Adv Microb Physiol 2016;68:219-352. [PMID: 27134025 DOI: 10.1016/bs.ampbs.2016.02.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/01/2023]

Addis MF, Tanca A, Uzzau S, Oikonomou G, Bicalho RC, Moroni P. The bovine milk microbiota: insights and perspectives from -omics studies. Mol Biosyst 2016;12:2359-72. [DOI: 10.1039/c6mb00217j] [Citation(s) in RCA: 134] [Impact Index Per Article: 16.8] [Indexed: 12/20/2022]

Punta M, Mistry J. Homology-Based Annotation of Large Protein Datasets. Methods Mol Biol 2016;1415:153-176. [PMID: 27115632 DOI: 10.1007/978-1-4939-3572-7_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]

Yutin N, Shevchenko S, Kapitonov V, Krupovic M, Koonin EV. A novel group of diverse Polinton-like viruses discovered by metagenome analysis. BMC Biol 2015;13:95. [PMID: 26560305 PMCID: PMC4642659 DOI: 10.1186/s12915-015-0207-4] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.8] [Received: 08/14/2015] [Accepted: 10/28/2015] [Indexed: 01/08/2023]

Abstract

Background

The rapidly growing metagenomic databases provide increasing opportunities for computational discovery of new groups of organisms. Identification of new viruses is particularly straightforward given the comparatively small size of viral genomes, although fast evolution of viruses complicates the analysis of novel sequences. Here we report the metagenomic discovery of a distinct group of diverse viruses that are distantly related to the eukaryotic virus-like transposons of the Polinton superfamily.

Results

The sequence of the putative major capsid protein (MCP) of the unusual linear virophage associated with Phaeocystis globosa virus (PgVV) was used as a bait to identify potential related viruses in metagenomic databases. Assembly of the contigs encoding the PgVV MCP homologs followed by comprehensive sequence analysis of the proteins encoded in these contigs resulted in the identification of a large group of Polinton-like viruses (PLV) that resemble Polintons (polintoviruses) and virophages in genome size, and share with them a conserved minimal morphogenetic module that consists of major and minor capsid proteins and the packaging ATPase. With a single exception, the PLV lack the retrovirus-type integrase that is encoded in the genomes of all Polintons and the Mavirus group of virophages. However, some PLV encode a newly identified tyrosine recombinase-integrase that is common in bacteria and bacteriophages and is also found in the Organic Lake virophage group. Although several PLV genomes and individual genes are integrated into algal genomes, it appears likely that most of the PLV are viruses. Given the absence of protease and retrovirus-type integrase, the PLV could resemble the ancestral polintoviruses that evolved from bacterial tectiviruses. Apart from the conserved minimal morphogenetic module, the PLV widely differ in their genome complements but share a gene network with Polintons and virophages, suggestive of multiple gene exchanges within a shared gene pool.

Conclusions

The discovery of PLV substantially expands the emerging class of eukaryotic viruses and transposons that also includes Polintons and virophages. This class of selfish elements is extremely widespread and might have been a hotbed of eukaryotic virus, transposon and plasmid evolution. New families of these elements are expected to be discovered.

Electronic supplementary material

The online version of this article (doi:10.1186/s12915-015-0207-4) contains supplementary material, which is available to authorized users.

Masuch T, Kusnezowa A, Nilewski S, Bautista JT, Kourist R, Leichert LI. A combined bioinformatics and functional metagenomics approach to discovering lipolytic biocatalysts. Front Microbiol 2015;6:1110. [PMID: 26528261 PMCID: PMC4602143 DOI: 10.3389/fmicb.2015.01110] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 07/31/2015] [Accepted: 09/25/2015] [Indexed: 11/30/2022]

Abstract

The majority of protein sequence data published today is of metagenomic origin. However, our ability to assign functions to these sequences is often hampered by our general inability to cultivate the larger part of microbial species and the sheer amount of sequence data generated in these projects. Here we present a combination of bioinformatics, synthetic biology, and Escherichia coli genetics to discover biocatalysts in metagenomic datasets. We created a subset of the Global Ocean Sampling dataset, the largest metagenomic project published to date, by removing all proteins that matched Hidden Markov Models of known protein families from PFAM and TIGRFAM with high confidence (E-value < 10⁻⁵). This essentially left us with proteins with low or no homology to known protein families, still encompassing ~1.7 million different sequences. In this subset, we then identified protein families de novo with a Markov clustering algorithm. For each protein family, we defined a single representative based on its phylogenetic relationship to all other members in that family. This reduced the dataset to ~17,000 representatives of protein families with more than 10 members. Based on conserved regions typical for lipases and esterases, we selected a representative gene from a family of 27 members for synthesis. This protein, when expressed in E. coli, showed lipolytic activity toward para-nitrophenyl (pNP) esters. The K_m value of the enzyme was 66.68 μM for pNP-butyrate and 68.08 μM for pNP-palmitate, with k_cat/K_m values of 3.4 × 10⁶ and 6.6 × 10⁵ M⁻¹ s⁻¹, respectively. Hydrolysis of model substrates showed enantiopreference for the R-form. Reactions yielded 43 and 61% enantiomeric excess of products with ibuprofen methyl ester and 2-phenylpropanoic acid ethyl ester, respectively. The enzyme retains 50% of its maximum activity at temperatures as low as 10°C, and its activity is enhanced in artificial seawater and in buffers with higher salt concentrations, with an optimum osmolarity of 3,890 mosmol/l.
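The kinetic figures in this abstract imply turnover numbers that are not stated explicitly. The short sketch below recovers them by multiplying the reported catalytic efficiency by the reported K_m. The function name is hypothetical and the resulting k_cat values are back-calculated here for illustration, not reported by the authors.

```python
def kcat_from_efficiency(efficiency_per_molar_per_s: float, km_micromolar: float) -> float:
    """Return k_cat in s^-1 from k_cat/K_m (M^-1 s^-1) and K_m (micromolar)."""
    km_molar = km_micromolar * 1e-6  # convert micromolar to molar
    return efficiency_per_molar_per_s * km_molar


# Values quoted in the abstract for the two pNP ester substrates.
kcat_butyrate = kcat_from_efficiency(3.4e6, 66.68)
kcat_palmitate = kcat_from_efficiency(6.6e5, 68.08)

print(f"pNP-butyrate:  k_cat ~ {kcat_butyrate:.0f} s^-1")
print(f"pNP-palmitate: k_cat ~ {kcat_palmitate:.0f} s^-1")
```

Because the two K_m values are nearly equal, the roughly fivefold difference in catalytic efficiency between the substrates comes almost entirely from the turnover number.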

An assessment of the amount of untapped fold level novelty in under-sampled areas of the tree of life. Sci Rep 2015;5:14717. [PMID: 26434770 PMCID: PMC4592975 DOI: 10.1038/srep14717] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/27/2015] [Accepted: 09/07/2015] [Indexed: 11/14/2022]

Lobb B, Kurtz DA, Moreno-Hagelsieb G, Doxey AC. Remote homology and the functions of metagenomic dark matter. Front Genet 2015;6:234. [PMID: 26257768 PMCID: PMC4508852 DOI: 10.3389/fgene.2015.00234] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 03/05/2015] [Accepted: 06/22/2015] [Indexed: 01/26/2023]

Abstract

Predicted open reading frames (ORFs) that lack detectable homology to known proteins are termed ORFans. Despite their prevalence in metagenomes, the extent to which ORFans encode real proteins, the degree to which they can be annotated, and their functional contributions, remain unclear. To gain insights into these questions, we applied sensitive remote-homology detection methods to functionally analyze ORFans from soil, marine, and human gut metagenome collections. ORFans were identified, clustered into sequence families, and annotated through profile-profile comparison to proteins of known structure. We found that a considerable number of metagenomic ORFans (73,896 of 484,121, 15.3%) exhibit significant remote homology to structurally characterized proteins, providing a means for ORFan functional profiling. The extent of detected remote homology far exceeds that obtained for artificial protein families (1.4%). As expected for real genes, the predicted functions of ORFans are significantly similar to the functions of their gene neighbors (p < 0.001). Compared to the functional profiles predicted through standard homology searches, ORFans show biologically intriguing differences. Many ORFan-enriched functions are virus-related and tend to reflect biological processes associated with extreme sequence diversity. Each environment also possesses a large number of unique ORFan families and functions, including some known to play important community roles such as gut microbial polysaccharide digestion. Lastly, ORFans are a valuable resource for finding novel enzymes of interest, as we demonstrate through the identification of hundreds of novel ORFan metalloproteases that all possess a signature catalytic motif despite a general lack of similarity to known proteins. Our ORFan functional predictions are a valuable resource for discovering novel protein families and exploring the boundaries of protein sequence space. 
All remote homology predictions are available at http://doxey.uwaterloo.ca/ORFans.

Sikosek T, Chan HS. Biophysics of protein evolution and evolutionary protein biophysics. J R Soc Interface 2015;11:20140419. [PMID: 25165599 DOI: 10.1098/rsif.2014.0419] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Indexed: 01/05/2023]

Bengtsson-Palme J, Alm Rosenblad M, Molin M, Blomberg A. Metagenomics reveals that detoxification systems are underrepresented in marine bacterial communities. BMC Genomics 2014;15:749. [PMID: 25179155 PMCID: PMC4161860 DOI: 10.1186/1471-2164-15-749] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 04/27/2014] [Accepted: 08/26/2014] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Environmental shotgun sequencing (metagenomics) provides a new way to study communities in microbial ecology. We here use sequence data from the Global Ocean Sampling (GOS) expedition to investigate toxicant selection pressures revealed by the presence of detoxification genes in marine bacteria. To capture a broad range of potential toxicants we selected detoxification protein families representing systems protecting microorganisms from a variety of stressors, such as metals, organic compounds, antibiotics and oxygen radicals.

RESULTS

Using a bioinformatics procedure based on comparative analysis to finished bacterial genomes we found that the amount of detoxification genes present in marine microorganisms seems surprisingly small. The underrepresentation is particularly evident for toxicant transporters and proteins involved in detoxifying metals. Exceptions are enzymes involved in oxidative stress defense where peroxidase enzymes are more abundant in marine bacteria compared to bacteria in general. In contrast, catalases are almost completely absent from the open ocean environment, suggesting that peroxidases and peroxiredoxins constitute a core line of defense against reactive oxygen species (ROS) in the marine milieu.

CONCLUSIONS

We found no indication that detoxification systems would be generally more abundant close to the coast compared to the open ocean. On the contrary, for several of the protein families that displayed a significant geographical distribution, like peroxidase, penicillin binding transpeptidase and divalent ion transport protein, the open ocean samples showed the highest abundance. Along the same lines, the abundance of most detoxification proteins did not increase with estimated pollution. The low level of detoxification systems in marine bacteria indicates that the majority of marine bacteria have a low capacity to adapt to increased pollution. Our study exemplifies the use of metagenomics data in ecotoxicology, and in particular how anthropogenic consequences on life in the sea can be examined.

Bošnjak I, Bojović V, Šegvić-Bubić T, Bielen A. Occurrence of protein disulfide bonds in different domains of life: a comparison of proteins from the Protein Data Bank. Protein Eng Des Sel 2014;27:65-72. [PMID: 24407015 DOI: 10.1093/protein/gzt063] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.3] [Indexed: 11/14/2022]

Kwon K, Peterson SN. High-throughput cloning for biophysical applications. Methods Mol Biol 2014;1140:61-74. [PMID: 24590709 DOI: 10.1007/978-1-4939-0354-2_5] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 06/03/2023]

Sharpton TJ. An introduction to the analysis of shotgun metagenomic data. Front Plant Sci 2014;5:209. [PMID: 24982662 PMCID: PMC4059276 DOI: 10.3389/fpls.2014.00209] [Citation(s) in RCA: 280] [Impact Index Per Article: 28.0] [Received: 02/14/2014] [Accepted: 04/29/2014] [Indexed: 05/19/2023]

Buj R, Iglesias N, Planas AM, Santalucía T. A plasmid toolkit for cloning chimeric cDNAs encoding customized fusion proteins into any Gateway destination expression vector. BMC Mol Biol 2013;14:18. [PMID: 23957834 PMCID: PMC3765358 DOI: 10.1186/1471-2199-14-18] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 04/08/2013] [Accepted: 08/12/2013] [Indexed: 12/31/2022]

Abstract

Background

Valuable clone collections encoding the complete ORFeomes for some model organisms have been constructed following the completion of their genome sequencing projects. These libraries are based on Gateway cloning technology, which facilitates the study of protein function by simplifying the subcloning of open reading frames (ORF) into any suitable destination vector. The expression of proteins of interest as fusions with functional modules is a frequent approach in their initial functional characterization. A limited number of Gateway destination expression vectors allow the construction of fusion proteins from ORFeome-derived sequences, but they are restricted to the possibilities offered by their inbuilt functional modules and their pre-defined model organism-specificity. Thus, the availability of cloning systems that overcome these limitations would be highly advantageous.

Results

We present a versatile cloning toolkit for constructing fully-customizable three-part fusion proteins based on the MultiSite Gateway cloning system. The fusion protein components are encoded in the three plasmids integral to the kit. These can recombine with any purposely-engineered destination vector that uses a heterologous promoter external to the Gateway cassette, leading to the in-frame cloning of an ORF of interest flanked by two functional modules. In contrast to previous systems, a third part becomes available for peptide-encoding as it no longer needs to contain a promoter, resulting in an increased number of possible fusion combinations. We have constructed the kit’s component plasmids and demonstrate its functionality by providing proof-of-principle data on the expression of prototype fluorescent fusions in transiently-transfected cells.

Conclusions

We have developed a toolkit for creating fusion proteins with customized N- and C-terminal modules from Gateway entry clones encoding ORFs of interest. Importantly, our method allows entry clones obtained from ORFeome collections to be used without prior modifications. Using this technology, any existing Gateway destination expression vector with its model-specific properties could be easily adapted for expressing fusion proteins.

Serine/threonine kinases and E2-ubiquitin conjugating enzymes in Planctomycetes: unexpected findings. Antonie van Leeuwenhoek 2013;104:509-20. [DOI: 10.1007/s10482-013-9993-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 05/03/2013] [Accepted: 07/26/2013] [Indexed: 12/25/2022]

Bornberg-Bauer E, Albà MM. Dynamics and adaptive benefits of modular protein evolution. Curr Opin Struct Biol 2013;23:459-66. [PMID: 23562500 DOI: 10.1016/j.sbi.2013.02.012] [Citation(s) in RCA: 80] [Impact Index Per Article: 7.3] [Received: 01/08/2013] [Revised: 02/15/2013] [Accepted: 02/15/2013] [Indexed: 11/29/2022]

New nuclear markers and exploration of the relationships among Serraniformes (Acanthomorpha, Teleostei): The importance of working at multiple scales. Mol Phylogenet Evol 2013;67:140-55. [DOI: 10.1016/j.ympev.2012.12.020] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 08/02/2012] [Revised: 11/30/2012] [Accepted: 12/28/2012] [Indexed: 01/20/2023]

Protein structure prediction from sequence variation. Nat Biotechnol 2013;30:1072-80. [PMID: 23138306 DOI: 10.1038/nbt.2419] [Citation(s) in RCA: 430] [Impact Index Per Article: 39.1] [Received: 08/28/2012] [Accepted: 10/15/2012] [Indexed: 02/07/2023]

Minkiewicz P, Bucholska J, Darewicz M, Borawska J. Epitopic hexapeptide sequences from Baltic cod parvalbumin beta (allergen Gad c 1) are common in the universal proteome. Peptides 2012;38:105-9. [PMID: 22940202 DOI: 10.1016/j.peptides.2012.08.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/18/2012] [Revised: 08/14/2012] [Accepted: 08/14/2012] [Indexed: 01/25/2023]

Dougherty MJ, D'haeseleer P, Hazen TC, Simmons BA, Adams PD, Hadi MZ. Glycoside hydrolases from a targeted compost metagenome, activity-screening and functional characterization. BMC Biotechnol 2012;12:38. [PMID: 22759983 PMCID: PMC3477009 DOI: 10.1186/1472-6750-12-38] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Received: 12/22/2011] [Accepted: 07/03/2012] [Indexed: 11/29/2022]

Svenson J. MabCent: Arctic marine bioprospecting in Norway. Phytochem Rev 2012;12:567-578. [PMID: 24078803 PMCID: PMC3777186 DOI: 10.1007/s11101-012-9239-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 11/08/2011] [Accepted: 05/12/2012] [Indexed: 05/24/2023]

Yin Y, Mao X, Yang J, Chen X, Mao F, Xu Y. dbCAN: a web resource for automated carbohydrate-active enzyme annotation. Nucleic Acids Res 2012;40:W445-51. [PMID: 22645317 PMCID: PMC3394287 DOI: 10.1093/nar/gks479] [Citation(s) in RCA: 1209] [Impact Index Per Article: 100.8] [Indexed: 12/14/2022]

Collison M, Hirt RP, Wipat A, Nakjang S, Sanseau P, Brown JR. Data mining the human gut microbiota for therapeutic targets. Brief Bioinform 2012;13:751-68. [DOI: 10.1093/bib/bbs002] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 12/23/2022]

Thomas T, Gilbert J, Meyer F. Metagenomics - a guide from sampling to data analysis. Microb Inform Exp 2012;2:3. [PMID: 22587947 PMCID: PMC3351745 DOI: 10.1186/2042-5783-2-3] [Citation(s) in RCA: 419] [Impact Index Per Article: 34.9] [Received: 10/13/2011] [Accepted: 02/09/2012] [Indexed: 12/13/2022]