1
Kundu P, Beura S, Mondal S, Das AK, Ghosh A. Machine learning for the advancement of genome-scale metabolic modeling. Biotechnol Adv 2024; 74:108400. [PMID: 38944218 DOI: 10.1016/j.biotechadv.2024.108400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 05/13/2024] [Accepted: 06/23/2024] [Indexed: 07/01/2024]
Abstract
Constraint-based modeling (CBM) has evolved as the core systems biology tool to map the interrelations between genotype, phenotype, and external environment. The recent advancement of high-throughput experimental approaches and multi-omics strategies has generated a plethora of new and precise information from wide-ranging biological domains. On the other hand, the continuously growing field of machine learning (ML) and its specialized branch of deep learning (DL) provide essential computational architectures for decoding complex and heterogeneous biological data. In recent years, both multi-omics and ML have assisted in the escalation of CBM. Condition-specific omics data, such as transcriptomics and proteomics, helped contextualize the model prediction while analyzing a particular phenotypic signature. At the same time, the advanced ML tools have eased the model reconstruction and analysis to increase the accuracy and prediction power. However, the development of these multi-disciplinary methodological frameworks mainly occurs independently, which limits the concatenation of biological knowledge from different domains. Hence, we have reviewed the potential of integrating multi-disciplinary tools and strategies from various fields, such as synthetic biology, CBM, omics, and ML, to explore the biochemical phenomenon beyond the conventional biological dogma. How the integrative knowledge of these intersected domains has improved bioengineering and biomedical applications has also been highlighted. We categorically explained the conventional genome-scale metabolic model (GEM) reconstruction tools and their improvement strategies through ML paradigms. Further, the crucial role of ML and DL in omics data restructuring for GEM development has also been briefly discussed. Finally, the case-study-based assessment of the state-of-the-art method for improving biomedical and metabolic engineering strategies has been elaborated. 
Therefore, this review demonstrates how integrating experimental and in silico strategies can help map the ever-expanding knowledge of biological systems driven by condition-specific cellular information. This multiview approach will elevate the application of ML-based CBM in the biomedical and bioengineering fields for the betterment of society and the environment.
Affiliation(s)
- Pritam Kundu
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Satyajit Beura
  - Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Suman Mondal
  - P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Amit Kumar Das
  - Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Amit Ghosh
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India; P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India.
2
Bell KL, Turo KJ, Lowe A, Nota K, Keller A, Encinas‐Viso F, Parducci L, Richardson RT, Leggett RM, Brosi BJ, Burgess KS, Suyama Y, de Vere N. Plants, pollinators and their interactions under global ecological change: The role of pollen DNA metabarcoding. Mol Ecol 2023; 32:6345-6362. [PMID: 36086900 PMCID: PMC10947134 DOI: 10.1111/mec.16689] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 01/10/2022] [Revised: 08/18/2022] [Accepted: 08/30/2022] [Indexed: 11/28/2022]
Abstract
Anthropogenic activities are triggering global changes in the environment, causing entire communities of plants, pollinators and their interactions to restructure, and ultimately leading to species declines. To understand the mechanisms behind community shifts and declines, as well as monitoring and managing impacts, a global effort must be made to characterize plant-pollinator communities in detail, across different habitat types, latitudes, elevations, and levels and types of disturbances. Generating data of this scale will only be feasible with rapid, high-throughput methods. Pollen DNA metabarcoding provides advantages in throughput, efficiency and taxonomic resolution over traditional methods, such as microscopic pollen identification and visual observation of plant-pollinator interactions. This makes it ideal for understanding complex ecological networks and their responses to change. Pollen DNA metabarcoding is currently being applied to assess plant-pollinator interactions, survey ecosystem change and model the spatiotemporal distribution of allergenic pollen. Where samples are available from past collections, pollen DNA metabarcoding has been used to compare contemporary and past ecosystems. New avenues of research are possible with the expansion of pollen DNA metabarcoding to intraspecific identification, analysis of DNA in ancient pollen samples, and increased use of museum and herbarium specimens. Ongoing developments in sequencing technologies can accelerate progress towards these goals. Global ecological change is happening rapidly, and we anticipate that high-throughput methods such as pollen DNA metabarcoding are critical for understanding the evolutionary and ecological processes that support biodiversity, and predicting and responding to the impacts of change.
Affiliation(s)
- Karen L. Bell
  - CSIRO Health & Biosecurity and CSIRO Land & Water, Floreat, WA, Australia
  - School of Biological Sciences, University of Western Australia, Crawley, WA, Australia
- Katherine J. Turo
  - Department of Ecology, Evolution, and Natural Resources, Rutgers University, New Brunswick, New Jersey, USA
- Kevin Nota
  - Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Alexander Keller
  - Organismic and Cellular Networks, Faculty of Biology, Biocenter, Ludwig‐Maximilians‐Universität München, Planegg, Germany
- Francisco Encinas‐Viso
  - Centre for Australian National Biodiversity Research, CSIRO, Black Mountain, Australian Capital Territory, Australia
- Laura Parducci
  - Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
  - Department of Environmental Biology, Sapienza University of Rome, Rome, Italy
- Rodney T. Richardson
  - Appalachian Laboratory, University of Maryland Center for Environmental Science, Frostburg, Maryland, USA
- Berry J. Brosi
  - Department of Biology, University of Washington, Seattle, Washington, USA
- Kevin S. Burgess
  - Department of Biology, College of Letters and Sciences, Columbus State University, University System of Georgia, Atlanta, Georgia, USA
- Yoshihisa Suyama
  - Field Science Center, Graduate School of Agricultural Science, Tohoku University, Osaki, Miyagi, Japan
- Natasha de Vere
  - Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
3
Khan A, Sohail S, Yaseen S, Fatima S, Wisal A, Ahmed S, Nasir M, Irfan M, Karim A, Basharat Z, Khan Y, Aurongzeb M, Raza SK, Alshahrani MY, Morel CM, Hassan SS. Exploring and targeting potential druggable antimicrobial resistance targets ArgS, SecY, and MurA in Staphylococcus sciuri with TCM inhibitors through a subtractive genomics strategy. Funct Integr Genomics 2023; 23:254. [PMID: 37495774 DOI: 10.1007/s10142-023-01179-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/20/2023] [Revised: 07/14/2023] [Accepted: 07/14/2023] [Indexed: 07/28/2023]
Abstract
Staphylococcus sciuri (currently also known as Mammaliicoccus sciuri) is a facultatively anaerobic, non-motile bacterium that causes significant human disease, including endocarditis, wound infections, peritonitis, UTI, and septic shock. Methicillin-resistant S. sciuri (MRSS) strains also infect animals, including healthy broilers, cattle, dogs, and pigs. The emergence of MRSS strains thereby poses a serious health threat and drives the scientific community towards novel treatment options. Herein, we investigated the druggable genome of S. sciuri by employing subtractive genomics, which yielded seven genes/proteins, of which only three were predicted as final targets. Further mining of the literature showed that ArgS (WP_058610923), SecY (WP_058611897), and MurA (WP_058612677) are involved in the multi-drug resistance phenomenon. After constructing and verifying the 3D protein homology models, a screening process was carried out using a library of Traditional Chinese Medicine (TCM) compounds (consisting of 36,043 compounds). The molecular docking and simulation studies revealed the physicochemical stability parameters of the docked TCM inhibitors in the druggable cavities of each protein target by identifying their druggability potential and maximum hydrogen bonding interactions. The simulated receptor-ligand complexes showed the conformational changes and stability index of the secondary structure elements. The root mean square deviation (RMSD) graph showed fluctuations due to structural changes in the helix-coil-helix and beta-turn-beta changes at specific points where the pattern of the RMSD and root mean square fluctuation (RMSF) (< 1.0 Å) support any major domain shifts within the structural framework of the protein-ligand complex and placement of ligand was well complemented within the binding site. The β-factor values demonstrated instability at a few points, while the radius of gyration, a measure of structural compactness as a function of time over the 100-ns simulation of the protein-ligand complexes, showed favorable average values and denoted the stability of all complexes. It is assumed that such findings might help researchers to robustly discover and develop effective therapeutics against S. sciuri alongside other enteric infections.
Affiliation(s)
- Aafareen Khan
  - Department of Chemistry, Islamia College Peshawar, Peshawar, 25000, KP, Pakistan
- Saman Sohail
  - Department of Chemistry, Islamia College Peshawar, Peshawar, 25000, KP, Pakistan
- Seerat Yaseen
  - Abbasi Shaheed Hospital, Karachi Medical and Dental College, Karachi, Pakistan
- Sareen Fatima
  - Department of Microbiology, University of Balochistan, Quetta, Balochistan, Pakistan
- Ayesha Wisal
  - Department of Chemistry, Islamia College Peshawar, Peshawar, 25000, KP, Pakistan
- Sufyan Ahmed
  - Abbasi Shaheed Hospital, Karachi Medical and Dental College, Karachi, Pakistan
- Mahrukh Nasir
  - Dr. Panjwani Center for Molecular Medicine, International Center for Chemical and Biological Sciences (ICCBS-PCMD), University of Karachi, Karachi, 75270, Pakistan
- Muhammad Irfan
  - Dr. Panjwani Center for Molecular Medicine, International Center for Chemical and Biological Sciences (ICCBS-PCMD), University of Karachi, Karachi, 75270, Pakistan
- Asad Karim
  - Dr. Panjwani Center for Molecular Medicine, International Center for Chemical and Biological Sciences (ICCBS-PCMD), University of Karachi, Karachi, 75270, Pakistan
- Zarrin Basharat
  - Alpha Genomics (Private) Limited, Islamabad, 44710, Pakistan
- Yasmin Khan
  - Dr. Panjwani Center for Molecular Medicine, International Center for Chemical and Biological Sciences (ICCBS-PCMD), University of Karachi, Karachi, 75270, Pakistan
- Muhammad Aurongzeb
  - Faculty of Engineering Sciences & Technology, Hamdard University, Karachi, 74600, Pakistan
- Syed Kashif Raza
  - Faculty of Rehabilitation and Allied Health Sciences (FRAHS), Riphah International University, Faisalabad, Pakistan
- Mohammad Y Alshahrani
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Khalid University, P.O. Box 61413, Abha, 9088, Saudi Arabia
- Carlos M Morel
  - Centre for Technological Development in Health (CDTS), Oswaldo Cruz Foundation (Fiocruz), Building "Expansão", 8th Floor Room 814, Av. Brasil 4036 - Manguinhos, Rio de Janeiro, RJ, 21040-361, Brazil.
- Syed S Hassan
  - Dr. Panjwani Center for Molecular Medicine, International Center for Chemical and Biological Sciences (ICCBS-PCMD), University of Karachi, Karachi, 75270, Pakistan.
  - Centre for Technological Development in Health (CDTS), Oswaldo Cruz Foundation (Fiocruz), Building "Expansão", 8th Floor Room 814, Av. Brasil 4036 - Manguinhos, Rio de Janeiro, RJ, 21040-361, Brazil.
4
Mukherjee S, Stamatis D, Li C, Ovchinnikova G, Bertsch J, Sundaramurthi J, Kandimalla M, Nicolopoulos P, Favognano A, Chen IM, Kyrpides N, Reddy TBK. Twenty-five years of Genomes OnLine Database (GOLD): data updates and new features in v.9. Nucleic Acids Res 2023; 51:D957-D963. [PMID: 36318257 PMCID: PMC9825498 DOI: 10.1093/nar/gkac974] [Citation(s) in RCA: 36] [Impact Index Per Article: 36.0] [Received: 09/13/2022] [Revised: 10/05/2022] [Accepted: 10/16/2022] [Indexed: 01/09/2023] Open
Abstract
The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) at the Department of Energy Joint Genome Institute (DOE-JGI) continues to maintain its role as one of the flagship genomic metadata repositories of the world. The ever-increasing number of projects and metadata are freely available to the user community world-wide. GOLD's metadata is consumed by scientists and remains an important source for large-scale comparative genomics analysis initiatives. Encouraged by this active user engagement and growth, GOLD has continued to add new components and capabilities. The new features such as a public Application Programming Interface (API) and Ecosystem landing page as well as the growth of different entities in this current GOLD v.9 edition are described in detail in this manuscript.
Affiliation(s)
- Supratim Mukherjee
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Dimitri Stamatis
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Cindy Tianqing Li
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Galina Ovchinnikova
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Jon Bertsch
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Mahathi Kandimalla
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Paul A Nicolopoulos
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Alessandro Favognano
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- I-Min A Chen
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Nikos C Kyrpides
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- T B K Reddy
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
5
Patra P, B R D, Kundu P, Das M, Ghosh A. Recent advances in machine learning applications in metabolic engineering. Biotechnol Adv 2023; 62:108069. [PMID: 36442697 DOI: 10.1016/j.biotechadv.2022.108069] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 07/17/2022] [Revised: 10/18/2022] [Accepted: 11/22/2022] [Indexed: 11/27/2022]
Abstract
Metabolic engineering encompasses several widely used strategies that currently hold a prominent place in biotechnology, with its potential manifesting through a plethora of research and commercial products with a strong societal impact. The genomic revolution that occurred almost three decades ago initiated the generation of large omics datasets, which have helped in gaining a better understanding of cellular behavior. The course that metabolic engineering has taken on the basis of these large datasets has allowed researchers to gain detailed insights and a reasonable understanding of the intricacies of biosystems. However, the existing trial-and-error approaches for metabolic engineering are laborious and time-intensive when it comes to the production of target compounds with high yields through genetic manipulations in host organisms. Machine learning (ML) coupled with the available metabolic engineering test instances and omics data brings a comprehensive and multidisciplinary approach that enables scientists to evaluate various parameters for effective strain design. This vast amount of biological data should be standardized through knowledge engineering to train different ML models for providing accurate predictions in gene circuit design, modification of proteins, optimization of bioprocess parameters for scaling up, and screening of hyper-producing robust cell factories. This review briefly introduces the premise of ML, followed by various ML methods and algorithms alongside the numerous omics datasets available to train ML models for predicting metabolic outcomes with high accuracy. The interplay between ML algorithms and biological datasets through knowledge engineering has guided the recent advancements in applications such as CRISPR/Cas systems, gene circuits, protein engineering, metabolic pathway reconstruction, and bioprocess engineering. Finally, this review addresses the probable challenges of applying ML in metabolic engineering, which will guide researchers toward novel techniques to overcome the limitations.
Affiliation(s)
- Pradipta Patra
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Disha B R
  - B.M.S College of Engineering, Basavanagudi, Bengaluru, Karnataka 560019, India
- Pritam Kundu
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Manali Das
  - School of Bioscience, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Amit Ghosh
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India; P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India.
6
Montero-Calasanz MDC, Yaramis A, Rohde M, Schumann P, Klenk HP, Meier-Kolthoff JP. Genotype-phenotype correlations within the Geodermatophilaceae. Front Microbiol 2022; 13:975365. [PMID: 36439792 PMCID: PMC9686282 DOI: 10.3389/fmicb.2022.975365] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/22/2022] [Accepted: 10/11/2022] [Indexed: 11/11/2022] Open
Abstract
The integration of genomic information into microbial systematics along with physiological and chemotaxonomic parameters provides for a reliable classification of prokaryotes. In silico analysis of chemotaxonomic traits is now being introduced to replace characteristics traditionally determined in the laboratory with the dual goal of both increasing the speed of the description of taxa and the accuracy and consistency of taxonomic reports. Genomics has already successfully been applied in the taxonomic rearrangement of Geodermatophilaceae (Actinomycetota) but in the light of new genomic data the taxonomy of the family needs to be revisited. In conjunction with the taxonomic characterisation of four strains phylogenetically located within the family, we conducted a phylogenetic analysis of the whole proteomes of the sequenced type strains and established genotype-phenotype correlations for traits related to chemotaxonomy, cell morphology and metabolism. Results indicated that the four isolates under study represent four novel species within the genus Blastococcus. Additionally, the genera Blastococcus, Geodermatophilus and Modestobacter were shown to be paraphyletic. Consequently, the new genera Trujillonella, Pleomorpha and Goekera were proposed within the Geodermatophilaceae and Blastococcus endophyticus was reclassified as Trujillonella endophytica comb. nov., Geodermatophilus daqingensis as Pleomorpha daqingensis comb. nov. and Modestobacter deserti as Goekera deserti comb. nov. Accordingly, we also proposed emended descriptions of Blastococcus aggregatus, Blastococcus jejuensis, Blastococcus saxobsidens and Blastococcus xanthilyniticus. In silico chemotaxonomic results were overall consistent with wet-lab results. Even though in silico discriminatory levels varied depending on the respective chemotaxonomic trait, this approach is promising for effectively replacing and/or complementing chemotaxonomic analyses at taxonomic ranks above the species level. 
Finally, interesting but previously overlooked insights regarding morphology and ecology were revealed by the presence of a repertoire of genes related to flagellum synthesis, chemotaxis, spore production and pilus assembly in all representatives of the family. A rich carbon metabolism including four different CO2 fixation pathways and a battery of enzymes able to degrade complex carbohydrates were also identified in Blastococcus genomes.
Affiliation(s)
- Maria del Carmen Montero-Calasanz
  - IFAPA Las Torres-Andalusian Institute of Agricultural and Fisheries Research and Training, Junta de Andalucía, Seville, Spain
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- Adnan Yaramis
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- Manfred Rohde
  - Central Facility for Microscopy, HZI – Helmholtz Centre for Infection Research, Braunschweig, Germany
- Peter Schumann
  - Leibniz Institute DSMZ – German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- Hans-Peter Klenk
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- Jan P. Meier-Kolthoff
  - Department Bioinformatics and Databases, Leibniz Institute DSMZ – German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
7
Di Carlo P, Serra N, Alduina R, Guarino R, Craxì A, Giammanco A, Fasciana T, Cascio A, Sergi CM. A systematic review on omics data (metagenomics, metatranscriptomics, and metabolomics) in the role of microbiome in gallbladder disease. Front Physiol 2022; 13:888233. [PMID: 36111147 PMCID: PMC9468903 DOI: 10.3389/fphys.2022.888233] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/02/2022] [Accepted: 07/11/2022] [Indexed: 12/04/2022] Open
Abstract
Microbiotas are the range of microorganisms (mainly bacteria and fungi) colonizing multicellular, macroscopic organisms. They are crucial for several metabolic functions affecting the health of the host. However, difficulties in cultivating microorganisms in standard growth media hamper the investigation of microbiota composition. For this reason, our knowledge of microbiota can benefit from the analysis of microbial macromolecules (DNA, transcripts, proteins, or by-products) present in various samples collected from the host. Various omics technologies are used to obtain different data. Metagenomics provides a taxonomical profile of the sample. It can also be used to obtain potential functional information. At the same time, metatranscriptomics can characterize members of a microbiome responsible for specific functions and elucidate genes that drive the microbiota's relationship with its host. Thus, while microbiota refers to microorganisms living in a determined environment (taxonomy of microorganisms identified), microbiome refers to the microorganisms and their genes living in a determined environment and, of course, metagenomics focuses on the genes and collective functions of identified microorganisms. Metabolomics completes this framework by determining the metabolite fluxes and the products released into the environment. The gallbladder is a sac localized under the liver in the human body and is difficult to access for bile and tissue sampling. It concentrates the bile produced in the hepatocytes, which drains into bile canaliculi. Bile promotes fat digestion and is released from the gallbladder into the upper small intestine in response to food. Although bile was originally considered sterile, recent data indicate that the bile microbiota is associated with inflammation and carcinogenesis of the biliary tract. The sample size is relevant for omic studies of rare diseases, such as gallbladder carcinoma.
Although in its infancy, the study of the biliary microbiota has begun taking advantage of several omics strategies, mainly based on metagenomics, metabolomics, and mouse models. Here, we show that omics analyses from the literature may provide a more comprehensive image of the biliary microbiota. We review studies performed in this environmental niche and focus on network-based approaches for integrative studies.
Affiliation(s)
- Paola Di Carlo
  - Department of Health Promotion, Maternal-Childhood, Internal Medicine of Excellence G. D’Alessandro, Section of Infectious Disease, University of Palermo, Palermo, Italy
- Nicola Serra
  - Department of Public Health, University “Federico II”, Naples, Italy
- Rosa Alduina
  - Department of Biological, Chemical and Pharmaceutical Sciences and Technologies (STEBICEF), University of Palermo, Palermo, Italy
- Riccardo Guarino
  - Department of Biological, Chemical and Pharmaceutical Sciences and Technologies (STEBICEF), University of Palermo, Palermo, Italy
- Antonio Craxì
  - Department of Health Promotion, Maternal-Childhood, Internal Medicine of Excellence G. D’Alessandro, Section of Gastroenterology, University of Palermo, Palermo, Italy
- Anna Giammanco
  - Department of Health Promotion, Maternal-Childhood, Internal Medicine of Excellence G. D’Alessandro, Section of Microbiology, University of Palermo, Palermo, Italy
- Teresa Fasciana
  - Department of Health Promotion, Maternal-Childhood, Internal Medicine of Excellence G. D’Alessandro, Section of Microbiology, University of Palermo, Palermo, Italy
- Antonio Cascio
  - Department of Health Promotion, Maternal-Childhood, Internal Medicine of Excellence G. D’Alessandro, Section of Infectious Disease, University of Palermo, Palermo, Italy
- Consolato M. Sergi
  - Children’s Hospital of Eastern Ontario (CHEO), University of Ottawa, Ottawa, ON, Canada
  - Department of Pediatrics, Stollery Children’s Hospital, University of Alberta, Edmonton, AB, Canada
  - *Correspondence: Consolato M. Sergi,
8
Li X, Ren W, Li Y, Shi Y, Sun H, Wang L, Wu L, Xie Y, Du Y, Jiang Z, Hong B. Production of chain-extended cinnamoyl compounds by overexpressing two adjacent cluster-situated LuxR regulators in Streptomyces globisporus C-1027. Front Microbiol 2022; 13:931180. [PMID: 35992673 PMCID: PMC9381841 DOI: 10.3389/fmicb.2022.931180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 07/04/2022] [Indexed: 11/17/2022] Open
Abstract
Natural products from microorganisms are important sources for drug discovery. With the development of high-throughput sequencing technology and bioinformatics, a large number of uncharacterized biosynthetic gene clusters (BGCs) have been found in microorganisms, which show the potential for novel natural product production. Nine BGCs containing PKS and/or NRPS in Streptomyces globisporus C-1027 were transcriptionally low/silent under the experimental fermentation conditions, and the products of these clusters are unknown. Thus, we tried to activate these BGCs to explore cryptic products of this strain. We constructed cluster-situated regulator-overexpressing strains, which contained regulator gene(s) under the control of the constitutive promoter ermE*p in S. globisporus C-1027. Overexpression of regulators in cluster 26 resulted in significant transcriptional upregulation of biosynthetic genes. With the separation and identification of products from the overexpressing strain OELuxR1R2, three ortho-methyl phenyl alkenoic acids (compounds 1-3) were obtained. Gene disruption showed that production of compounds 1 and 2 was completely abolished in the mutant GlaEKO, but was hardly affected by deletion of the genes orf3 or echA in cluster 26. The type II PKS biosynthetic pathway of chain-extended cinnamoyl compounds was deduced by bioinformatics analysis. This study showed that overexpression of the two adjacent cluster-situated LuxR regulators is an effective strategy to connect the orphan BGC to its products.
Affiliation(s)
- Bin Hong
  - NHC Key Laboratory of Biotechnology of Antibiotics, CAMS Key Laboratory of Synthetic Biology for Drug Innovation, Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
9
MiDSystem: A comprehensive online system for de novo assembly and analysis of microbial genomes. N Biotechnol 2021; 65:42-52. [PMID: 34411700 DOI: 10.1016/j.nbt.2021.08.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/08/2021] [Revised: 08/13/2021] [Accepted: 08/14/2021] [Indexed: 12/12/2022]
Abstract
The substantial reduction in experimental cost of next-generation sequencing techniques makes it feasible to assemble a bacterial genome of unknown species de novo and acquire substantial genetic information from environmental samples. Many bioinformatics tools and algorithms have also been developed for prokaryotes, but complex parameter settings and command line-based user interfaces cause a significant entry barrier for novices. Efficient construction of pipelines that integrate all the available genomic data poses a major challenge to the understanding of unknown pathogens. MiDSystem is a comprehensive online system for analyzing genomic data from microbiomes. With a user-friendly interface, MiDSystem supports both de novo assembly and metagenomic analysis pipelines. It is designed to automatically analyze whole genome shotgun sequencing data of bacteria submitted by users. Multiple analytical steps can be performed directly on the system, and the results generated from the embedded tools are visualized in an online summary report to make it more interpretable. Constructing a genome de novo has gradually become the foundation of bacterial studies. Taking both single species and metagenomic samples into consideration, MiDSystem can greatly reduce the time and effort for analysis of bacterial genomic data. Use of MiDSystem will enable more focus to be placed on understanding the etiology of bacterial infections and microorganism ecologies.
|
10
|
Vieira AZ, Raittz RT, Faoro H. Origin and evolution of nonulosonic acid synthases and their relationship with bacterial pathogenicity revealed by a large-scale phylogenetic analysis. Microb Genom 2021; 7:000563. [PMID: 33848237 PMCID: PMC8208679 DOI: 10.1099/mgen.0.000563] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/02/2020] [Accepted: 03/16/2021] [Indexed: 12/28/2022]
Abstract
Nonulosonic acids (NulOs) are a group of nine-carbon monosaccharides with different functions in nature. N-acetylneuraminic acid (Neu5Ac) is the most common NulO. It covers the membrane surface of all human cells and is a central molecule in the process of self-recognition via SIGLECS receptors. Some pathogenic bacteria escape the immune system by copying the sialylation of the host cell membrane. Neu5Ac production in these bacteria is catalysed by the enzyme NeuB. Some bacteria can also produce other NulOs named pseudaminic and legionaminic acids, through the NeuB homologues PseI and LegI, respectively. In Opisthokonta eukaryotes, the biosynthesis of Neu5Ac is catalysed by the enzyme NanS. In this study, we used publicly available sequence data for NulO synthases to investigate their distribution within the three domains of life and their relationship with pathogenic bacteria. We mined the KEGG database and found 425 NeuB sequences. Most NeuB sequences (58.74%) from the KEGG orthology database were classified as from environmental bacteria; however, sequences from pathogenic bacteria showed higher conservation and prevalence of a specific domain named SAF. Using the HMM profile we identified 13 941 NulO synthase sequences in UniProt. Phylogenetic analysis of these sequences showed that the synthases were divided into three main groups that can be related to the lifestyle of these bacteria: (I) predominantly environmental, (II) intermediate and (III) predominantly pathogenic. NeuB was widely distributed in the groups. However, LegI and PseI were more concentrated in groups II and III, respectively. We also found that PseI appeared later in the evolutionary process, derived from NeuB. We used this same methodology to retrieve sialic acid synthase sequences from Archaea and Eukarya. A large-scale phylogenetic analysis showed that while the Archaea sequences are spread across the tree, the eukaryotic NanS sequences were grouped in a specific branch in group II. None of the bacterial NanS sequences grouped with the eukaryotic branch. The analysis of conserved residues showed that the synthases of Archaea and Eukarya present a mutation in one of the three catalytic residues, an E134D change, related to a Neisseria meningitidis reference sequence. We also found that the conservation profile is higher between NeuB of pathogenic bacteria and NanS of eukaryotes than between NeuB of environmental bacteria and NanS of eukaryotes. Our large-scale analysis brings new perspectives on the evolution of NulO synthases, suggesting their presence in the last common universal ancestor.
Affiliation(s)
- Alexandre Zanatta Vieira
- Laboratory for Applied Science and Technology in Health, Carlos Chagas Institute, Fiocruz-PR, Algacyr Munhoz Mader street, 3775, Curitiba, Paraná, Brazil
- Graduation Program on Bioinformatics – Universidade Federal do Paraná, Alcides Viera Arcoverde street 1225, Curitiba, Paraná, Brazil
| | - Roberto Tadeu Raittz
- Graduation Program on Bioinformatics – Universidade Federal do Paraná, Alcides Viera Arcoverde street 1225, Curitiba, Paraná, Brazil
| | - Helisson Faoro
- Laboratory for Applied Science and Technology in Health, Carlos Chagas Institute, Fiocruz-PR, Algacyr Munhoz Mader street, 3775, Curitiba, Paraná, Brazil
- Graduation Program on Bioinformatics – Universidade Federal do Paraná, Alcides Viera Arcoverde street 1225, Curitiba, Paraná, Brazil
|
11
|
Xavier JC, Gerhards RE, Wimmer JLE, Brueckner J, Tria FDK, Martin WF. The metabolic network of the last bacterial common ancestor. Commun Biol 2021; 4:413. [PMID: 33772086 PMCID: PMC7997952 DOI: 10.1038/s42003-021-01918-4] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 08/24/2020] [Accepted: 02/26/2021] [Indexed: 02/03/2023]
Abstract
Bacteria are the most abundant cells on Earth. They are generally regarded as ancient, but due to striking diversity in their metabolic capacities and widespread lateral gene transfer, the physiology of the first bacteria is unknown. From 1089 reference genomes of bacterial anaerobes, we identified 146 protein families that trace to the last bacterial common ancestor, LBCA, and form the conserved predicted core of its metabolic network, which requires only nine genes to encompass all universal metabolites. Our results indicate that LBCA performed gluconeogenesis towards cell wall synthesis, and had numerous RNA modifications and multifunctional enzymes that permitted life with low gene content. In accordance with recent findings for LUCA and LACA, analyses of thousands of individual gene trees indicate that LBCA was rod-shaped and the first lineage to diverge from the ancestral bacterial stem was most similar to modern Clostridia, followed by other autotrophs that harbor the acetyl-CoA pathway.
Affiliation(s)
- Joana C Xavier
- Institute for Molecular Evolution, Heinrich-Heine-University, 40225, Düsseldorf, Germany
| | - Rebecca E Gerhards
- Institute for Molecular Evolution, Heinrich-Heine-University, 40225, Düsseldorf, Germany
| | - Jessica L E Wimmer
- Institute for Molecular Evolution, Heinrich-Heine-University, 40225, Düsseldorf, Germany
| | - Julia Brueckner
- Institute for Molecular Evolution, Heinrich-Heine-University, 40225, Düsseldorf, Germany
| | - Fernando D K Tria
- Institute for Molecular Evolution, Heinrich-Heine-University, 40225, Düsseldorf, Germany
| | - William F Martin
- Institute for Molecular Evolution, Heinrich-Heine-University, 40225, Düsseldorf, Germany
|
12
|
Thorsen J, Stokholm J, Rasmussen MA, Mortensen MS, Brejnrod AD, Hjelmsø M, Shah S, Chawes B, Bønnelykke K, Sørensen SJ, Bisgaard H. The Airway Microbiota Modulates Effect of Azithromycin Treatment for Episodes of Recurrent Asthma-like Symptoms in Preschool Children: A Randomized Clinical Trial. Am J Respir Crit Care Med 2021; 204:149-158. [PMID: 33730519 DOI: 10.1164/rccm.202008-3226oc] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Indexed: 11/16/2022]
Abstract
Rationale: Childhood asthma is often preceded by recurrent episodes of asthma-like symptoms, which can be triggered by both viral and bacterial agents. Recent randomized controlled trials have shown that azithromycin treatment reduces episode duration and severity through yet undefined mechanisms. Objectives: To study the influence of the airway microbiota on the effect of azithromycin treatment during acute episodes of asthma-like symptoms. Methods: Children from the COPSAC2010 (Copenhagen Prospective Studies on Asthma in Childhood 2010) cohort with recurrent asthma-like symptoms aged 12-36 months were randomized during acute episodes to azithromycin or placebo as previously reported. Before randomization, hypopharyngeal aspirates were collected and examined by 16S ribosomal RNA gene amplicon sequencing. Measurements and Main Results: In 139 airway samples from 68 children, episode duration after randomization was associated with microbiota richness (7.5% increased duration per 10 additional operational taxonomic units [OTUs]; 95% confidence interval, 1-14%; P = 0.025), with 15 individual OTUs (including several Neisseria and Veillonella), and with microbial pneumotypes defined from weighted UniFrac distances (longest durations in a Neisseria-dominated pneumotype). Microbiota richness before treatment increased the effect of azithromycin by 10% per 10 additional OTUs, and more OTUs were positively versus negatively associated with an increased azithromycin effect (82 vs. 58; P = 0.0032). Furthermore, effect modification of azithromycin was found for five individual OTUs (three OTUs increased and two OTUs decreased the effect; q < 0.05). Conclusions: The airway microbiota in acute episodes of asthma-like symptoms is associated with episode duration and modifies the effect of azithromycin treatment of the episodes in preschool children with recurrent asthma-like symptoms. Clinical trial registered with www.clinicaltrials.gov (NCT01233297).
Affiliation(s)
- Jonathan Thorsen
- Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital; Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences
| | - Jakob Stokholm
- Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital; Department of Food Science, Faculty of Science; Department of Pediatrics, Slagelse Hospital, Slagelse, Denmark
| | - Morten Arendt Rasmussen
- Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital; Department of Food Science, Faculty of Science
| | - Martin Steen Mortensen
- Section for Microbiology, Department of Biology, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
| | - Asker Daniel Brejnrod
- Section for Microbiology, Department of Biology, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
| | - Mathis Hjelmsø
- Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital
| | - Shiraz Shah
- Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital
| | - Bo Chawes
- Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital
| | - Klaus Bønnelykke
- Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital
| | - Søren Johannes Sørensen
- Section for Microbiology, Department of Biology, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
| | - Hans Bisgaard
- Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital
|
13
|
Mukherjee S, Stamatis D, Bertsch J, Ovchinnikova G, Sundaramurthi J, Lee J, Kandimalla M, Chen IMA, Kyrpides NC, Reddy TBK. Genomes OnLine Database (GOLD) v.8: overview and updates. Nucleic Acids Res 2021; 49:D723-D733. [PMID: 33152092 PMCID: PMC7778979 DOI: 10.1093/nar/gkaa983] [Citation(s) in RCA: 109] [Impact Index Per Article: 36.3] [Received: 09/14/2020] [Revised: 10/08/2020] [Accepted: 10/19/2020] [Indexed: 12/28/2022]
Abstract
The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) is a manually curated, daily updated collection of genome projects and their metadata accumulated from around the world. The current version of the database includes over 1.17 million entries organized broadly into Studies (45 770), Organisms (387 382) or Biosamples (101 207), Sequencing Projects (355 364) and Analysis Projects (283 481). These four levels contain over 600 metadata fields, which includes 76 controlled vocabulary (CV) tables containing 3873 terms. GOLD provides an interactive web user interface for browsing and searching by a wide range of project and metadata fields. Users can enter details about their own projects in GOLD, which acts as a gatekeeper to ensure that metadata is accurately documented before submitting sequence information to the Integrated Microbial Genomes (IMG) system for analysis. In order to maintain a reference dataset for use by members of the scientific community, GOLD also imports projects from public repositories such as GenBank and SRA. The current status of the database, along with recent updates and improvements are described in this manuscript.
Affiliation(s)
- Supratim Mukherjee
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Dimitri Stamatis
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Jon Bertsch
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Galina Ovchinnikova
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Janey Lee
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Mahathi Kandimalla
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - I-Min A Chen
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Nikos C Kyrpides
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - T B K Reddy
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
|
14
|
Helmy M, Smith D, Selvarajoo K. Systems biology approaches integrated with artificial intelligence for optimized metabolic engineering. Metab Eng Commun 2020; 11:e00149. [PMID: 33072513 PMCID: PMC7546651 DOI: 10.1016/j.mec.2020.e00149] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 09/01/2020] [Revised: 10/01/2020] [Accepted: 10/07/2020] [Indexed: 12/05/2022]
Abstract
Metabolic engineering aims to maximize the production of bio-economically important substances (compounds, enzymes, or other proteins) through the optimization of the genetics, cellular processes and growth conditions of microorganisms. This requires detailed understanding of underlying metabolic pathways involved in the production of the targeted substances, and how the cellular processes or growth conditions are regulated by the engineering. To achieve this goal, a large system of experimental techniques, compound libraries, computational methods and data resources, including multi-omics data, are used. The recent advent of multi-omics systems biology approaches significantly impacted the field by opening new avenues to perform dynamic and large-scale analyses that deepen our knowledge on the manipulations. However, with the enormous transcriptomics, proteomics and metabolomics available, it is a daunting task to integrate the data for a more holistic understanding. Novel data mining and analytics approaches, including Artificial Intelligence (AI), can provide breakthroughs where traditional low-throughput experiment-alone methods cannot easily achieve. Here, we review the latest attempts of combining systems biology and AI in metabolic engineering research, and highlight how this alliance can help overcome the current challenges facing industrial biotechnology, especially for food-related substances and compounds using microorganisms.
Affiliation(s)
- Mohamed Helmy
- Singapore Institute of Food and Biotechnology Innovation (SIFBI), Agency for Science, Technology and Research (A∗STAR), Singapore, Singapore
| | - Derek Smith
- Singapore Institute of Food and Biotechnology Innovation (SIFBI), Agency for Science, Technology and Research (A∗STAR), Singapore, Singapore
| | - Kumar Selvarajoo
- Singapore Institute of Food and Biotechnology Innovation (SIFBI), Agency for Science, Technology and Research (A∗STAR), Singapore, Singapore
- Synthetic Biology for Clinical and Technological Innovation (SynCTI), National University of Singapore (NUS), Singapore, Singapore
|
15
|
Canon F, Mariadassou M, Maillard MB, Falentin H, Parayre S, Madec MN, Valence F, Henry G, Laroute V, Daveran-Mingot ML, Cocaign-Bousquet M, Thierry A, Gagnaire V. Function-Driven Design of Lactic Acid Bacteria Co-cultures to Produce New Fermented Food Associating Milk and Lupin. Front Microbiol 2020; 11:584163. [PMID: 33329449 PMCID: PMC7717992 DOI: 10.3389/fmicb.2020.584163] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/16/2020] [Accepted: 10/13/2020] [Indexed: 11/17/2022]
Abstract
Designing bacterial co-cultures adapted to ferment mixes of vegetal and animal resources for food diversification and sustainability is becoming a challenge. Among bacteria used in food fermentation, lactic acid bacteria (LAB) are good candidates, as they are used as starter or adjunct in numerous fermented foods, where they allow preservation, enhanced digestibility, and improved flavor. We developed here a strategy to design LAB co-cultures able to ferment a new food made of bovine milk and lupin flour, consisting in: (i) in silico preselection of LAB species for targeted carbohydrate degradation; (ii) in vitro screening of 97 strains of the selected species for their ability to ferment carbohydrates and hydrolyze proteins from milk and lupin and clustering strains that displayed similar phenotypes; and (iii) assembling strains randomly sampled from clusters that showed complementary phenotypes. The designed co-cultures successfully expressed the targeted traits i.e., hydrolyzed proteins and degraded raffinose family oligosaccharides of lupin and lactose of milk in a large range of concentrations. They also reduced an off-flavor-generating volatile, hexanal, and produced various desirable flavor compounds. Most of the strains in co-cultures achieved higher cell counts than in monoculture, suggesting positive interactions. This work opens new avenues for the development of innovative fermented food products based on functionally complementary strains in the world-wide context of diet diversification.
Affiliation(s)
- Valérie Laroute
- Université de Toulouse, CNRS, INRAE, INSA, TBI, Toulouse, France
|
16
|
Reyes-Prieto M, Vargas-Chávez C, Llabrés M, Palmer P, Latorre A, Moya A. An update on the Symbiotic Genomes Database (SymGenDB): a collection of metadata, genomic, genetic and protein sequences, orthologs and metabolic networks of symbiotic organisms. Database: The Journal of Biological Databases and Curation 2020; 2020:5735476. [PMID: 32055857 PMCID: PMC7018611 DOI: 10.1093/database/baz160] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/12/2018] [Revised: 07/20/2019] [Accepted: 12/31/2019] [Indexed: 11/14/2022]
Abstract
The Symbiotic Genomes Database (SymGenDB; http://symbiogenomesdb.uv.es/) is a public resource of manually curated associations between organisms involved in symbiotic relationships, maintaining a catalog of completely sequenced/finished bacterial genomes exclusively. It originally consisted of three modules where users could search for the bacteria involved in a specific symbiotic relationship, their genomes and their genes (including their orthologs). In this update, we present an additional module that includes a representation of the metabolic network of each organism included in the database, as Directed Acyclic Graphs (MetaDAGs). This module provides unique opportunities to explore the metabolism of each individual organism and/or to evaluate the shared and joint metabolic capabilities of the organisms of the same genera included in our listing, allowing users to construct predictive analyses of metabolic associations and complementation between systems. We also report a ~25% increase in manually curated content in the database, i.e. bacterial genomes and their associations, with a final count of 2328 bacterial genomes associated to 498 hosts. We describe new querying possibilities for all the modules, as well as new display features for the MetaDAGs module, providing a relevant range of content and utility. This update continues to improve SymGenDB and can help elucidate the mechanisms by which organisms depend on each other.
Affiliation(s)
- Mariana Reyes-Prieto
- Evolutionary Systems Biology of Symbionts, Institute for Integrative Systems Biology (I2SysBio), Universitat de València, Paterna, València, Spain; Sequencing and Bioinformatics Service, Foundation for the Promotion of Sanitary and Biomedical Research of the Valencia Region (FISABIO), València, Spain
| | - Carlos Vargas-Chávez
- Evolutionary Systems Biology of Symbionts, Institute for Integrative Systems Biology (I2SysBio), Universitat de València, Paterna, València, Spain; Functional and Evolutionary Genomics, Institute of Evolutionary Biology (IBE), CSIC-Universitat Pompeu Fabra, Barcelona, Spain
| | - Mercè Llabrés
- Department of Mathematics and Computer Science, University of the Balearic Islands, Palma, Balearic Islands, Spain
| | - Pere Palmer
- Department of Mathematics and Computer Science, University of the Balearic Islands, Palma, Balearic Islands, Spain
| | - Amparo Latorre
- Evolutionary Systems Biology of Symbionts, Institute for Integrative Systems Biology (I2SysBio), Universitat de València, Paterna, València, Spain; Genomic and Health Area, Foundation for the Promotion of Sanitary and Biomedical Research of the Valencia Region (FISABIO), València, Spain
| | - Andrés Moya
- Evolutionary Systems Biology of Symbionts, Institute for Integrative Systems Biology (I2SysBio), Universitat de València, Paterna, València, Spain; Genomic and Health Area, Foundation for the Promotion of Sanitary and Biomedical Research of the Valencia Region (FISABIO), València, Spain; CIBER in Epidemiology and Public Health (CIBEResp), Madrid, Spain
|
17
|
Phase separation by ssDNA binding protein controlled via protein-protein and protein-DNA interactions. Proc Natl Acad Sci U S A 2020; 117:26206-26217. [PMID: 33020264 PMCID: PMC7584906 DOI: 10.1073/pnas.2000761117] [Citation(s) in RCA: 73] [Impact Index Per Article: 18.3] [Indexed: 12/17/2022]
Abstract
Cells must rapidly and efficiently react to DNA damage to avoid its harmful consequences. Here we report a molecular mechanism that gives rise to a model of how bacterial cells mobilize DNA repair proteins for timely response to genomic stress and initiation of DNA repair upon exposure of single-stranded DNA. We found that bacterial single-stranded DNA binding protein (SSB), a central player in genome metabolism, can undergo dynamic phase separation under physiological conditions. SSB condensates can store a wide array of DNA repair proteins that specifically interact with SSB. However, elevated levels of single-stranded DNA during genomic stress can dissolve SSB condensates, enabling rapid mobilization of SSB and SSB-interacting proteins to sites of DNA damage. Bacterial single-stranded (ss)DNA-binding proteins (SSB) are essential for the replication and maintenance of the genome. SSBs share a conserved ssDNA-binding domain, a less conserved intrinsically disordered linker (IDL), and a highly conserved C-terminal peptide (CTP) motif that mediates a wide array of protein−protein interactions with DNA-metabolizing proteins. Here we show that the Escherichia coli SSB protein forms liquid−liquid phase-separated condensates in cellular-like conditions through multifaceted interactions involving all structural regions of the protein. SSB, ssDNA, and SSB-interacting molecules are highly concentrated within the condensates, whereas phase separation is overall regulated by the stoichiometry of SSB and ssDNA. Together with recent results on subcellular SSB localization patterns, our results point to a conserved mechanism by which bacterial cells store a pool of SSB and SSB-interacting proteins. Dynamic phase separation enables rapid mobilization of this protein pool to protect exposed ssDNA and repair genomic loci affected by DNA damage.
|
18
|
Phylogeny resolved, metabolism revealed: functional radiation within a widespread and divergent clade of sponge symbionts. ISME J 2020; 15:503-519. [PMID: 33011742 DOI: 10.1038/s41396-020-00791-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 05/28/2020] [Revised: 09/09/2020] [Accepted: 09/21/2020] [Indexed: 01/17/2023]
Abstract
The symbiosis between bacteria and sponges has arguably the longest evolutionary history for any extant metazoan lineage, yet little is known about bacterial evolution or adaptation in this process. An example of often dominant and widespread bacterial symbionts of sponges is a clade of uncultured and uncharacterised Proteobacteria. Here we set out to characterise this group using metagenomics, in-depth phylogenetic analyses, metatranscriptomics, and fluorescence in situ hybridisation microscopy. We obtained five metagenome-assembled-genomes (MAGs) from different sponge species that, together with a previously published MAG (AqS2), comprise two families within a new gammaproteobacterial order that we named UTethybacterales. Members of this order share a heterotrophic lifestyle but vary in their predicted ability to use various carbon, nitrogen and sulfur sources, including taurine, spermidine and dimethylsulfoniopropionate. The deep branching of the UTethybacterales within the Gammaproteobacteria and their almost exclusive presence in sponges suggests they have entered a symbiosis with their host relatively early in evolutionary time and have subsequently functionally radiated. This is reflected in quite distinct lifestyles of various species of UTethybacterales, most notably their diverse morphologies, predicted substrate preferences, and localisation within the sponge tissue. This study provides new insight into the evolution of metazoan-bacteria symbiosis.
|
19
|
Neubauer V, Petri RM, Humer E, Kröger I, Reisinger N, Baumgartner W, Wagner M, Zebeli Q. Starch-Rich Diet Induced Rumen Acidosis and Hindgut Dysbiosis in Dairy Cows of Different Lactations. Animals (Basel) 2020; 10:ani10101727. [PMID: 32977653 PMCID: PMC7598178 DOI: 10.3390/ani10101727] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 08/05/2020] [Revised: 09/17/2020] [Accepted: 09/19/2020] [Indexed: 01/23/2023]
Abstract
Simple Summary: High-producing dairy cows receive high-energy diets for maintenance and production. This study showed that 60% concentrate in the diet, containing 27.7% starch, changed the fecal-microbial community and lowered its diversity, suggesting hindgut dysbiosis. Both ruminal and fecal pH decreased with high-starch feeding, which suggests further investigations in fecal pH as rumen- and hindgut-acidosis diagnostic tool. Cows in the third lactation spent more time below the threshold for subacute-ruminal acidosis (pH 6.0) than second or fourth-or-below lactation cows. Their higher susceptibility was caused by their high dry matter intake but missing counter-regulation by increased rumination activity. Further, we suggest that body weight and rumen size might play a role in the absorptive capacity of short-chain fatty acids. The study also identified indicator-bacterial phylotypes that changed with starch-rich diet and lactation number. In conclusion, we suggest including lactation number as a factor in practical feeding management for identification of high risk-cows for acidosis, and in dairy cow research.
Starch-rich diets can cause subacute ruminal acidosis (SARA) in dairy cows with potentially different susceptibility according to lactation number. We wanted to evaluate the bacterial community and the fermentation end products in feces to study susceptibility to hindgut acidosis and dysbiosis. Sixteen dairy cows received a medium-concentrate diet (MC, 40% concentrate, 18.8% starch) for one week and a high-concentrate diet (HC, 60% concentrate, 27.7% starch, DM) for four weeks. Milk yield, dry-matter intake, chewing activity, ruminal pH, milk constituents, and fecal samples for short-chain fatty acids (SCFA), pH, and 16S rRNA-gene sequencing were investigated. The HC feeding caused a reduction in fecal pH, bacterial diversity and richness, an increase in total SCFA, and a separate phylogenetic clustering of MC and HC samples. Ruminal and fecal pH had fair correlation (r = 0.5). Cows in the second lactation (2ndL) had lower dry matter intake (DMI) than cows of third or fourth or more lactations (3rdL; ≥4 L), whereas DMI/kg body weight was lower for ≥4 L than for 2ndL and 3rdL cows. The mean ruminal pH was highest in ≥4 L, whereas the time spent below the SARA threshold was highest for 3rdL cows. The latter also had higher total SCFA in the feces. Our results suggest that hindgut dysbiosis is caused by increased substrate flow to the hindgut, but further investigations are needed to define hindgut acidosis. The 3rdL cows were most susceptible to rumen acidosis and hindgut dysbiosis due to high DMI level, but missing counter regulations, as suggested happening in 2ndL and ≥4 L cows.
Affiliation(s)
- Viktoria Neubauer
- Unit of Food Microbiology, Institute of Food Safety, Food Technology, and Veterinary Public Health, University of Veterinary Medicine, 1210 Vienna, Austria
- FFoQSI GmbH—Austrian Competence Centre for Feed and Food Quality, Safety & Innovation, 3430 Tulln, Austria
| | - Renee M. Petri
- Institute of Animal Nutrition and Functional Plant Compounds, University of Veterinary Medicine, 1210 Vienna, Austria
| | - Elke Humer
- Institute of Animal Nutrition and Functional Plant Compounds, University of Veterinary Medicine, 1210 Vienna, Austria
- Department for Psychotherapy and Biopsychosocial Health, Danube University Krems, 3500 Krems, Austria
| | - Iris Kröger
- Institute of Animal Nutrition and Functional Plant Compounds, University of Veterinary Medicine, 1210 Vienna, Austria
| | - Nicole Reisinger
- BIOMIN Research Center, BIOMIN Holding GmbH, 3430 Tulln, Austria
| | - Walter Baumgartner
- University Clinic for Ruminants, University of Veterinary Medicine, 1210 Vienna, Austria
| | - Martin Wagner
- Unit of Food Microbiology, Institute of Food Safety, Food Technology, and Veterinary Public Health, University of Veterinary Medicine, 1210 Vienna, Austria
- FFoQSI GmbH—Austrian Competence Centre for Feed and Food Quality, Safety & Innovation, 3430 Tulln, Austria
| | - Qendrim Zebeli
- Institute of Animal Nutrition and Functional Plant Compounds, University of Veterinary Medicine, 1210 Vienna, Austria
|
20
|
Schmiedová L, Kreisinger J, Požgayová M, Honza M, Martin JF, Procházka P. Gut microbiota in a host-brood parasite system: insights from common cuckoos raised by two warbler species. FEMS Microbiol Ecol 2020; 96:5872480. [PMID: 32672792 DOI: 10.1093/femsec/fiaa143] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/27/2020] [Accepted: 07/15/2020] [Indexed: 11/13/2022]
Abstract
An animal's gut microbiota (GM) is shaped by a range of environmental factors affecting the bacterial sources invading the host. At the same time, animal hosts are equipped with intrinsic mechanisms enabling regulation of GM. However, there is limited knowledge on the relative importance of these forces. To assess the significance of host-intrinsic vs environmental factors, we studied GM in nestlings of an obligate brood parasite, the common cuckoo (Cuculus canorus), raised by two foster species, great reed warblers (Acrocephalus arundinaceus) and Eurasian reed warblers (A. scirpaceus), and compared these with GM of the fosterers' own nestlings. We show that fecal GM varied between cuckoo and warbler nestlings when accounting for the effect of foster/parent species, highlighting the importance of host-intrinsic regulatory mechanisms. In addition to feces, cuckoos also expel a deterrent secretion, which provides protection against olfactory predators. We observed an increased abundance of bacterial genera capable of producing repulsive volatile molecules in the deterrent secretion. Consequently, our results support the hypothesis that microbiota play a role in this antipredator mechanism. Interestingly, fosterer/parent identity affected only cuckoo deterrent secretion and warbler feces microbiota, but not that of cuckoo feces, suggesting a strong selection of bacterial strains in the GM by cuckoo nestlings.
Affiliation(s)
- Lucie Schmiedová
  - Department of Zoology, Faculty of Science, Charles University, Viničná 7, CZ-12800 Prague, Czech Republic
- Jakub Kreisinger
  - Department of Zoology, Faculty of Science, Charles University, Viničná 7, CZ-12800 Prague, Czech Republic
- Milica Požgayová
  - Institute of Vertebrate Biology, Czech Academy of Sciences, Květná 8, CZ-60365 Brno, Czech Republic
- Marcel Honza
  - Institute of Vertebrate Biology, Czech Academy of Sciences, Květná 8, CZ-60365 Brno, Czech Republic
- Petr Procházka
  - Institute of Vertebrate Biology, Czech Academy of Sciences, Květná 8, CZ-60365 Brno, Czech Republic
21
Moi D, Kilchoer L, Aguilar PS, Dessimoz C. Scalable phylogenetic profiling using MinHash uncovers likely eukaryotic sexual reproduction genes. PLoS Comput Biol 2020; 16:e1007553. [PMID: 32697802 PMCID: PMC7423146 DOI: 10.1371/journal.pcbi.1007553] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 11/20/2019] [Revised: 08/12/2020] [Accepted: 05/18/2020] [Indexed: 01/09/2023]
Abstract
Phylogenetic profiling is a computational method to predict genes involved in the same biological process by identifying protein families which tend to be jointly lost or retained across the tree of life. Phylogenetic profiling has customarily been more widely used with prokaryotes than eukaryotes, because the method is thought to require many diverse genomes. There are now many eukaryotic genomes available, but these are considerably larger, and typical phylogenetic profiling methods require at least quadratic time as a function of the number of genes. We introduce a fast, scalable phylogenetic profiling approach entitled HogProf, which leverages hierarchical orthologous groups for the construction of large profiles and locality-sensitive hashing for efficient retrieval of similar profiles. We show that the approach outperforms Enhanced Phylogenetic Tree, a phylogeny-based method, and use the tool to reconstruct networks and query for interactors of the kinetochore complex as well as conserved proteins involved in sexual reproduction: Hap2, Spo11 and Gex1. HogProf enables large-scale phylogenetic profiling across the three domains of life, and will be useful to predict biological pathways among the hundreds of thousands of eukaryotic species that will become available in the coming few years. HogProf is available at https://github.com/DessimozLab/HogProf.
Genes that are involved in the same biological process tend to co-evolve. This property is exploited by the technique of phylogenetic profiling, which identifies co-evolving (and therefore likely functionally related) genes through patterns of correlated gene retention and loss in evolution and across species. However, conventional methods for computing and clustering these correlated genes do not scale with increasing numbers of genomes. HogProf is a novel phylogenetic profiling tool built on probabilistic data structures. It allows the user to construct searchable databases containing the evolutionary history of hundreds of thousands of protein families. Such fast detection of coevolution takes advantage of the rapidly increasing amount of genomic data publicly available, and can uncover unknown biological networks and guide in-vivo research and experimentation. We have applied our tool to describe the biological networks underpinning sexual reproduction in eukaryotes.
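The retrieval idea in this abstract, comparing phylogenetic profiles through hashing rather than all-against-all comparison, can be illustrated with a small sketch. The code below is not HogProf's implementation: it is a plain, unweighted MinHash over hypothetical presence sets (the taxon names and the helper names `minhash_signature` and `estimate_jaccard` are illustrative assumptions). It shows only how a fixed-length signature can approximate the Jaccard similarity between two profiles.

```python
import hashlib


def _hash(seed: int, item: str) -> int:
    """Deterministic 64-bit hash of an item under a given seed."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_perm: int = 256) -> list:
    """Return a MinHash signature: the minimum hash per seeded hash function."""
    items = set(items)
    if not items:
        raise ValueError("cannot sketch an empty profile")
    return [min(_hash(seed, x) for x in items) for seed in range(num_perm)]


def estimate_jaccard(sig_a, sig_b) -> float:
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def exact_jaccard(a, b) -> float:
    """Exact Jaccard similarity, used here only as a reference value."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


# Hypothetical profiles: the set of taxa in which each protein family is retained.
profile_a = {f"taxon{i}" for i in range(0, 80)}
profile_b = {f"taxon{i}" for i in range(20, 100)}    # overlaps profile_a
profile_c = {f"taxon{i}" for i in range(200, 280)}   # disjoint from profile_a

sig_a = minhash_signature(profile_a)
sig_b = minhash_signature(profile_b)
sig_c = minhash_signature(profile_c)

print("exact  J(a, b):", exact_jaccard(profile_a, profile_b))
print("approx J(a, b):", estimate_jaccard(sig_a, sig_b))
print("approx J(a, c):", estimate_jaccard(sig_a, sig_c))
```

In a locality-sensitive hashing index, signatures of this kind are typically split into bands so that similar profiles collide in at least one bucket, which is what avoids the quadratic cost of pairwise comparison mentioned in the abstract.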
Affiliation(s)
- David Moi
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - * E-mail: (DM); (CD)
- Laurent Kilchoer
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Pablo S. Aguilar
  - Instituto de Investigaciones Biotecnologicas (IIBIO), Universidad Nacional de San Martín, Buenos Aires, Argentina
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE-CONICET), Buenos Aires, Argentina
- Christophe Dessimoz
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Genetics, Evolution, and Environment, University College London, London, United Kingdom
  - Department of Computer Science, University College London, London, United Kingdom
  - * E-mail: (DM); (CD)
22
Villar E, Cabrol L, Heimbürger-Boavida LE. Widespread microbial mercury methylation genes in the global ocean. Environmental Microbiology Reports 2020. [PMID: 32090489 DOI: 10.1101/648329] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 05/16/2023]
Abstract
Methylmercury is a neurotoxin that bioaccumulates from seawater to high concentrations in marine fish, putting human and ecosystem health at risk. High methylmercury levels have been found in the oxic subsurface waters of all oceans, but only anaerobic microorganisms have been shown to efficiently produce methylmercury in anoxic environments. The microaerophilic nitrite-oxidizing bacteria Nitrospina have previously been suggested as possible mercury methylating bacteria in Antarctic sea ice. However, the microorganisms responsible for processing inorganic mercury into methylmercury in oxic seawater remain unknown. Here, we show metagenomic and metatranscriptomic evidence that the genetic potential for microbial methylmercury production is widespread in oxic seawater. We find high abundance and expression of the key mercury methylating genes hgcAB across all ocean basins, corresponding to the taxonomic relatives of known mercury methylating bacteria from Deltaproteobacteria, Firmicutes and Chloroflexi. Our results identify Nitrospina as the predominant and widespread microorganism carrying and actively expressing hgcAB. The highest hgcAB abundance and expression occurs in the oxic subsurface waters of the global ocean where the highest MeHg concentrations are typically observed.
Affiliation(s)
- Emilie Villar
  - Aix Marseille Université, Univ Toulon, CNRS, IRD, Mediterranean Institute of Oceanography (MIO) UM 110, 13288, Marseille, France
  - Sorbonne Université, Université Pierre et Marie Curie - Paris 6, CNRS, UMR 7144 (AD2M), Station Biologique de Roscoff, Place Georges Teissier, CS90074, Roscoff, 29688, France
- Léa Cabrol
  - Aix Marseille Université, Univ Toulon, CNRS, IRD, Mediterranean Institute of Oceanography (MIO) UM 110, 13288, Marseille, France
  - Instituto de Ecologia y Biodiversidad, Departamento de Ciencias Ecologicas, Facultad de Ciencias, Universidad de Chile, Santiago de Chile, Chile
- Lars-Eric Heimbürger-Boavida
  - Aix Marseille Université, Univ Toulon, CNRS, IRD, Mediterranean Institute of Oceanography (MIO) UM 110, 13288, Marseille, France
23
Chen IMA, Chu K, Palaniappan K, Pillay M, Ratner A, Huang J, Huntemann M, Varghese N, White JR, Seshadri R, Smirnova T, Kirton E, Jungbluth SP, Woyke T, Eloe-Fadrosh EA, Ivanova NN, Kyrpides NC. IMG/M v.5.0: an integrated data management and comparative analysis system for microbial genomes and microbiomes. Nucleic Acids Res 2019; 47:D666-D677. [PMID: 30289528 PMCID: PMC6323987 DOI: 10.1093/nar/gky901] [Citation(s) in RCA: 547] [Impact Index Per Article: 136.8] [Received: 09/05/2018] [Accepted: 09/24/2018] [Indexed: 11/12/2022]
Abstract
The Integrated Microbial Genomes & Microbiomes system v.5.0 (IMG/M: https://img.jgi.doe.gov/m/) contains annotated datasets categorized into: archaea, bacteria, eukarya, plasmids, viruses, genome fragments, metagenomes, cell enrichments, single particle sorts, and metatranscriptomes. Source datasets include those generated by the DOE's Joint Genome Institute (JGI), submitted by external scientists, or collected from public sequence data archives such as NCBI. All submissions are typically processed through the IMG annotation pipeline and then loaded into the IMG data warehouse. IMG's web user interface provides a variety of analytical and visualization tools for comparative analysis of isolate genomes and metagenomes in IMG. IMG/M allows open access to all public genomes in the IMG data warehouse, while its expert review (ER) system (IMG/MER: https://img.jgi.doe.gov/mer/) allows registered users to access their private genomes and to store their private datasets in workspace for sharing and for further analysis. IMG/M data content has grown by 60% since the last report published in the 2017 NAR Database Issue. IMG/M v.5.0 has a new and more powerful genome search feature, new statistical tools, and supports metagenome binning.
Affiliation(s)
- I-Min A Chen
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Ken Chu
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Krishna Palaniappan
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Manoj Pillay
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Anna Ratner
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Jinghua Huang
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Marcel Huntemann
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Neha Varghese
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Rekha Seshadri
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Tatyana Smirnova
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Edward Kirton
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Sean P Jungbluth
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Tanja Woyke
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Emiley A Eloe-Fadrosh
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Natalia N Ivanova
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Nikos C Kyrpides
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
24
Paez-Espino D, Roux S, Chen IMA, Palaniappan K, Ratner A, Chu K, Huntemann M, Reddy TBK, Pons JC, Llabrés M, Eloe-Fadrosh EA, Ivanova NN, Kyrpides NC. IMG/VR v.2.0: an integrated data management and analysis system for cultivated and environmental viral genomes. Nucleic Acids Res 2019; 47:D678-D686. [PMID: 30407573 PMCID: PMC6323928 DOI: 10.1093/nar/gky1127] [Citation(s) in RCA: 114] [Impact Index Per Article: 28.5] [Received: 09/15/2018] [Accepted: 10/31/2018] [Indexed: 01/06/2023]
Abstract
The Integrated Microbial Genome/Virus (IMG/VR) system v.2.0 (https://img.jgi.doe.gov/vr/) is the largest publicly available data management and analysis platform dedicated to viral genomics. Since the last report published in the 2016 NAR Database Issue, the data has tripled in size and currently contains genomes of 8389 cultivated reference viruses, 12 498 previously published curated prophages derived from cultivated microbial isolates, and 735 112 viral genomic fragments computationally predicted from assembled shotgun metagenomes. Nearly 60% of the viral genomes and genome fragments are clustered into 110 384 viral Operational Taxonomic Units (vOTUs) with two or more members. To improve data quality and predictions of host specificity, IMG/VR v.2.0 now separates prokaryotic and eukaryotic viruses, utilizes known prophage sequences to improve taxonomic assignments, and provides viral genome quality scores based on the estimated genome completeness. New features also include enhanced BLAST search capabilities for external queries. Finally, geographic map visualization to locate user-selected viral genomes or genome fragments has been implemented and download options have been extended. All of these features make IMG/VR v.2.0 a key resource for the study of viruses.
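The counts quoted in this abstract can be combined into a rough size estimate. The sketch below sums the three sequence categories and, assuming that "nearly 60%" applies to that combined total and can be read as exactly 0.60 (both are assumptions made only for illustration), derives an approximate mean number of members per clustered vOTU. The variable names are hypothetical.

```python
# Counts quoted in the abstract.
cultivated_reference_viruses = 8_389
curated_prophages = 12_498
metagenomic_viral_fragments = 735_112
votus_with_two_or_more_members = 110_384

total_sequences = (
    cultivated_reference_viruses
    + curated_prophages
    + metagenomic_viral_fragments
)

# Assumption: "nearly 60%" is treated as exactly 0.60 of the combined total.
clustered_fraction = 0.60
clustered_sequences = total_sequences * clustered_fraction

# Approximate average cluster size among vOTUs with two or more members.
mean_members_per_votu = clustered_sequences / votus_with_two_or_more_members

print("total sequences:", total_sequences)
print("approx. members per vOTU:", round(mean_members_per_votu, 1))
```

Under these assumptions the clustered vOTUs hold about four sequences each on average; the figure is only as precise as the rounded percentage it is derived from.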
Affiliation(s)
- Simon Roux
  - Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA
- I-Min A Chen
  - Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, USA
- Krishna Palaniappan
  - Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, USA
- Anna Ratner
  - Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, USA
- Ken Chu
  - Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, USA
- Marcel Huntemann
  - Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA
- T B K Reddy
  - Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA
- Joan Carles Pons
  - Department of Mathematics and Computer Science, University of the Balearic Islands, Spain
- Mercè Llabrés
  - Department of Mathematics and Computer Science, University of the Balearic Islands, Spain
- Nikos C Kyrpides
  - Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA
25
Pathogenomics and Management of Fusarium Diseases in Plants. Pathogens 2020; 9:pathogens9050340. [PMID: 32369942 PMCID: PMC7281180 DOI: 10.3390/pathogens9050340] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 04/03/2020] [Revised: 04/25/2020] [Accepted: 04/28/2020] [Indexed: 12/16/2022]
Abstract
There is an urgency to supplant the heavy reliance on chemical control of Fusarium diseases in different economically important, staple food crops due to development of resistance in the pathogen population, the high cost of production to the risk-averse grower, and the concomitant environmental impacts. Pathogenomics has enabled (i) the creation of genetic inventories which identify those putative genes, regulators, and effectors that are associated with virulence, pathogenicity, and primary and secondary metabolism; (ii) comparison of such genes among related pathogens; (iii) identification of potential genetic targets for chemical control; and (iv) better characterization of the complex dynamics of host–microbe interactions that lead to disease. This type of genomic data serves to inform host-induced gene silencing (HIGS) technology for targeted disruption of transcription of select genes for the control of Fusarium diseases. This review discusses the various repositories and browser access points for comparison of genomic data, the strategies for identification and selection of pathogenicity- and virulence-associated genes and effectors in different Fusarium species, HIGS and successful Fusarium disease control trials with a consideration of loss of RNAi, off-target effects, and future challenges in applying HIGS for management of Fusarium diseases.
26
The Great Oxidation Event expanded the genetic repertoire of arsenic metabolism and cycling. Proc Natl Acad Sci U S A 2020; 117:10414-10421. [PMID: 32350143 PMCID: PMC7229686 DOI: 10.1073/pnas.2001063117] [Citation(s) in RCA: 75] [Impact Index Per Article: 18.8] [Indexed: 12/11/2022]
Abstract
The rise of oxygen on the early Earth about 2.4 billion years ago reorganized the redox cycle of harmful metal(loids), including that of arsenic, which doubtlessly imposed substantial barriers to the physiology and diversification of life. Evaluating the adaptive biological responses to these environmental challenges is inherently difficult because of the paucity of fossil records. Here we applied molecular clock analyses to 13 gene families participating in principal pathways of arsenic resistance and cycling, to explore the nature of early arsenic biogeocycles and decipher feedbacks associated with planetary oxygenation. Our results reveal the advent of nascent arsenic resistance systems under the anoxic environment predating the Great Oxidation Event (GOE), with the primary function of detoxifying reduced arsenic compounds that were abundant in Archean environments. To cope with the increased toxicity of oxidized arsenic species that occurred as oxygen built up in Earth's atmosphere, we found that parts of preexisting detoxification systems for trivalent arsenicals were merged with newly emerged pathways that originated via convergent evolution. Further expansion of arsenic resistance systems was made feasible by incorporation of oxygen-dependent enzymatic pathways into the detoxification network. These genetic innovations, together with adaptive responses to other redox-sensitive metals, provided organisms with novel mechanisms for adaption to changes in global biogeocycles that emerged as a consequence of the GOE.
27
Zoller R, Zehavi M, Ziv-Ukelson M. A New Paradigm for Identifying Reconciliation-Scenario Altering Mutations Conferring Environmental Adaptation. J Comput Biol 2020; 27:1561-1580. [PMID: 32250165 DOI: 10.1089/cmb.2019.0472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
An important goal in microbial computational genomics is to identify crucial events in the evolution of a gene that severely alter the duplication, loss, and mobilization patterns of the gene within the genomes in which it disseminates. In this article, we formalize this microbiological goal as a new pattern-matching problem in the domain of gene tree and species tree reconciliation, denoted "Reconciliation-Scenario Altering Mutation (RSAM) Discovery." We propose an [Formula: see text] time algorithm to solve this new problem, where m and n are the number of vertices of the input gene tree and species tree, respectively, and k is a user-specified parameter that bounds from above the number of optimal solutions of interest. The algorithm first constructs a hypergraph representing the k highest scoring reconciliation scenarios between the given gene tree and species tree, and then interrogates this hypergraph for subtrees matching a prespecified RSAM pattern. Our algorithm is optimal in the sense that the number of hypernodes in the hypergraph can be lower bounded by [Formula: see text]. We implement the new algorithm as a tool, called RSAM-finder, and demonstrate its application to the identification of RSAMs in toxins and drug resistance elements across a data set spanning hundreds of species.
Affiliation(s)
- Roni Zoller
  - Department of Computer Science, Ben Gurion University of the Negev, Beer-Sheva, Israel
- Meirav Zehavi
  - Department of Computer Science, Ben Gurion University of the Negev, Beer-Sheva, Israel
- Michal Ziv-Ukelson
  - Department of Computer Science, Ben Gurion University of the Negev, Beer-Sheva, Israel
28
Blumer-Schuette SE. Insights into Thermophilic Plant Biomass Hydrolysis from Caldicellulosiruptor Systems Biology. Microorganisms 2020; 8:E385. [PMID: 32164310 PMCID: PMC7142884 DOI: 10.3390/microorganisms8030385] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/01/2020] [Revised: 03/06/2020] [Accepted: 03/07/2020] [Indexed: 11/16/2022]
Abstract
Plant polysaccharides continue to serve as a promising feedstock for bioproduct fermentation. However, the recalcitrant nature of plant biomass requires certain key enzymes, including cellobiohydrolases, for efficient solubilization of polysaccharides. Thermostable carbohydrate-active enzymes are sought for their stability and tolerance to other process parameters. Plant biomass degrading microbes found in biotopes like geothermally heated water sources, compost piles, and thermophilic digesters are a common source of thermostable enzymes. While traditional thermophilic enzyme discovery first focused on microbe isolation followed by functional characterization, metagenomic sequences are negating the initial need for species isolation. Here, we summarize the current state of knowledge about the extremely thermophilic genus Caldicellulosiruptor, including genomic and metagenomic analyses in addition to recent breakthroughs in enzymology and genetic manipulation of the genus. Ten years after completing the first Caldicellulosiruptor genome sequence, the tools required for systems biology of this non-model environmental microorganism are in place.
29
Adeyemi JA, Peters SO, De Donato M, Cervantes AP, Ogunade IM. Effects of a blend of Saccharomyces cerevisiae-based direct-fed microbial and fermentation products on plasma carbonyl-metabolome and fecal bacterial community of beef steers. J Anim Sci Biotechnol 2020; 11:14. [PMID: 32095237 PMCID: PMC7025411 DOI: 10.1186/s40104-019-0419-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/30/2019] [Accepted: 12/22/2019] [Indexed: 01/08/2023]
Abstract
BACKGROUND Previous studies have evaluated the metabolic status of animals fed direct-fed microbial (DFM) using enzyme-based assays which are time-consuming and limited to a few metabolites. In addition, little emphasis has been placed on investigating the effects of DFM on hindgut microbiota. We examined the effects of dietary supplementation of a blend of Saccharomyces cerevisiae-based DFM and fermentation products on the plasma concentrations of carbonyl-containing metabolites via a metabolomics approach, and fecal bacterial community, via 16S rRNA gene sequencing, of beef steers during a 42-day receiving period. Forty newly weaned steers were randomly assigned to receive a basal diet with no additive (CON; n = 20) or a basal diet supplemented with 19 g of Commence™ (PROB; n = 20) for a 42-day period. Commence™ (PMI, Arden Hills, MN) is a blend of 6.2 × 10^11 cfu/g of S. cerevisiae, 3.5 × 10^10 cfu/g of a mixture of Enterococcus lactis, Bacillus subtilis, Enterococcus faecium, and Lactobacillus casei, and the fermentation products of these aforementioned microorganisms and those of Aspergillus oryzae and Aspergillus niger. On d 0 and 40, rectal fecal samples were collected randomly from 10 steers from each treatment group. On d 42, blood was collected for plasma preparation. RESULTS A total number of 812 plasma metabolites were detected. Up to 305 metabolites [fold change (FC) ≥ 1.5, FDR ≤ 0.01] including glucose, hippuric acid, and 5-hydroxykynurenamine were increased by PROB supplementation, whereas 199 metabolites (FC ≤ 0.63, FDR ≤ 0.01) including acetoacetate were reduced. Supplementation of PROB increased (P ≤ 0.05) the relative abundance of Prevotellaceae UCG-003, Megasphaera, Dorea, Acetitomaculum, and Blautia. In contrast, the relative abundance of Elusimicrobium, Moheibacter, Stenotrophomonas, Comamonas, and uncultured bacterium belonging to family p-2534-18B5 gut group (phylum Bacteroidetes) were reduced (P ≤ 0.05). 
CONCLUSIONS The results of this study demonstrated that supplementation of PROB altered both the plasma carbonyl metabolome towards increased glucose concentration suggesting an improved energy status, and fecal bacterial community, suggesting an increased hindgut fermentation of the beef steers.
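The thresholds reported in the RESULTS (FC ≥ 1.5 or FC ≤ 0.63, each with FDR ≤ 0.01) amount to a simple three-way classification rule. The sketch below applies that rule to made-up values; the `Metabolite` class, the `classify` function, and every number in the example list are hypothetical and are not data from the study. It also assumes that fold change is expressed as the PROB/CON ratio, which the abstract does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metabolite:
    name: str
    fold_change: float  # assumed to be the PROB / CON ratio
    fdr: float          # false discovery rate (adjusted p-value)


def classify(m: Metabolite, up: float = 1.5, down: float = 0.63,
             max_fdr: float = 0.01) -> str:
    """Label a metabolite using the fold-change and FDR cut-offs from the abstract."""
    if m.fdr > max_fdr:
        return "unchanged"   # not significant, whatever the fold change
    if m.fold_change >= up:
        return "increased"
    if m.fold_change <= down:
        return "reduced"
    return "unchanged"       # significant but below the effect-size cut-offs


# Hypothetical values, for illustration only.
examples = [
    Metabolite("metabolite_A", fold_change=2.10, fdr=0.001),
    Metabolite("metabolite_B", fold_change=0.50, fdr=0.005),
    Metabolite("metabolite_C", fold_change=1.80, fdr=0.200),
    Metabolite("metabolite_D", fold_change=1.20, fdr=0.001),
]

labels = {m.name: classify(m) for m in examples}
for name, label in labels.items():
    print(name, label)
```

Requiring both an effect-size cut-off and an FDR cut-off keeps large but statistically unsupported changes, and significant but small ones, out of the "increased" and "reduced" groups.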
Affiliation(s)
- James A. Adeyemi
  - College of Agriculture, Communities, and the Environment, Kentucky State University, Frankfort, KY 40601 USA
- Sunday O. Peters
  - Department of Animal Science, Berry College, Mount Berry, GA 30149 USA
- Marcos De Donato
  - Tecnologico de Monterrey, Escuela de Ingenieria y Ciencias, Queretaro, Mexico
- Andres Pech Cervantes
  - Agricultural Research Station, Fort Valley State University, Fort Valley, GA 31030 USA
- Ibukun M. Ogunade
  - College of Agriculture, Communities, and the Environment, Kentucky State University, Frankfort, KY 40601 USA
30
Pérez-Losada M, Arenas M, Galán JC, Bracho MA, Hillung J, García-González N, González-Candelas F. High-throughput sequencing (HTS) for the analysis of viral populations. Infection, Genetics and Evolution 2020; 80:104208. [PMID: 32001386 DOI: 10.1016/j.meegid.2020.104208] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 09/30/2019] [Revised: 01/21/2020] [Accepted: 01/24/2020] [Indexed: 12/12/2022]
Abstract
The development of High-Throughput Sequencing (HTS) technologies is having a major impact on the genomic analysis of viral populations. Current HTS platforms can capture nucleic acid variation across millions of genes for both selected amplicons and full viral genomes. HTS has already facilitated the discovery of new viruses, hinted at new taxonomic classifications and provided a deeper and broader understanding of their diversity, population and genetic structure. Hence, HTS has already replaced standard Sanger sequencing in basic and applied research fields, but the next step is its implementation as a routine technology for the analysis of viruses in clinical settings. The most likely application of this implementation will be the analysis of viral genomics, because the huge population sizes, high mutation rates and very fast replacement of viral populations have demonstrated the limited information obtained with Sanger technology. In this review, we describe new technologies and provide guidelines for the high-throughput sequencing and genetic and evolutionary analyses of viral populations and metaviromes, including software applications. With the development of new HTS technologies, new and refurbished molecular and bioinformatic tools are also constantly being developed to process and integrate HTS data. These allow assembling viral genomes and inferring viral population diversity and dynamics. Finally, we also present several applications of these approaches to the analysis of viral clinical samples including transmission clusters and outbreak characterization.
Affiliation(s)
- Marcos Pérez-Losada
  - Computational Biology Institute, Milken Institute School of Public Health, George Washington University, Washington, DC, USA; CIBIO-InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, Vairão 4485-661, Portugal
- Miguel Arenas
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, 36310 Vigo, Spain; Biomedical Research Center (CINBIO), University of Vigo, 36310 Vigo, Spain.
- Juan Carlos Galán
  - Microbiology Service, Hospital Ramón y Cajal, Madrid, Spain; CIBER in Epidemiology and Public Health, Spain.
- Mª Alma Bracho
  - CIBER in Epidemiology and Public Health, Spain; Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain.
- Julia Hillung
  - Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain; Institute for Integrative Systems Biology (I2SysBio), CSIC-University of Valencia, Valencia, Spain.
- Neris García-González
  - Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain; Institute for Integrative Systems Biology (I2SysBio), CSIC-University of Valencia, Valencia, Spain.
- Fernando González-Candelas
  - CIBER in Epidemiology and Public Health, Spain; Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain; Institute for Integrative Systems Biology (I2SysBio), CSIC-University of Valencia, Valencia, Spain.
31
Tracking microbial evolution in the human gut using Hi-C reveals extensive horizontal gene transfer, persistence and adaptation. Nat Microbiol 2019; 5:343-353. [PMID: 31873203 PMCID: PMC6992475 DOI: 10.1038/s41564-019-0625-0] [Citation(s) in RCA: 80] [Impact Index Per Article: 16.0] [Received: 04/26/2019] [Accepted: 10/30/2019] [Indexed: 12/15/2022]
Abstract
Despite the importance of horizontal gene transfer for rapid bacterial evolution, reliable assignment of mobile genetic elements to their microbial hosts in natural communities such as the human gut microbiota is lacking. We used high-throughput chromosomal conformation capture coupled with probabilistic modelling of experimental noise to resolve 88 strain-level metagenome-assembled genomes of distal gut bacteria from two participants, including 12,251 accessory elements. Comparisons of two samples collected 10 years apart for each of the participants revealed extensive in situ exchange of accessory elements as well as evidence of adaptive evolution in core genomes. Accessory elements were predominantly promiscuous and prevalent in the distal gut metagenomes of 218 adult individuals. This research provides a foundation and approach for studying microbial evolution in natural environments.
32
Paez-Espino D, Zhou J, Roux S, Nayfach S, Pavlopoulos GA, Schulz F, McMahon KD, Walsh D, Woyke T, Ivanova NN, Eloe-Fadrosh EA, Tringe SG, Kyrpides NC. Diversity, evolution, and classification of virophages uncovered through global metagenomics. Microbiome 2019; 7:157. [PMID: 31823797 PMCID: PMC6905037 DOI: 10.1186/s40168-019-0768-5] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 06/05/2019] [Accepted: 11/11/2019] [Indexed: 05/19/2023]
Abstract
BACKGROUND Virophages are small viruses with double-stranded DNA genomes that replicate along with giant viruses and co-infect eukaryotic cells. Due to the paucity of virophage reference genomes, a collective understanding of the global virophage diversity, distribution, and evolution is lacking. RESULTS Here we screened a public collection of over 14,000 metagenomes using the virophage-specific major capsid protein (MCP) as "bait." We identified 44,221 assembled virophage sequences, of which 328 represent high-quality (complete or near-complete) genomes from diverse habitats including the human gut, plant rhizosphere, and terrestrial subsurface. Comparative genomic analysis confirmed the presence of four core genes in a conserved block. We used these genes to establish a revised virophage classification including 27 clades with consistent genome length, gene content, and habitat distribution. Moreover, for eight high-quality virophage genomes, we computationally predicted putative eukaryotic virus hosts. CONCLUSION Overall, our approach has increased the number of known virophage genomes by 10-fold and revealed patterns of genome evolution and global virophage distribution. We anticipate that the expanded diversity presented here will provide the backbone for further virophage studies.
Affiliation(s)
- David Paez-Espino
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Dr., Walnut Creek, 94598 USA
- Jinglie Zhou
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Dr., Walnut Creek, 94598 USA
- Simon Roux
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Dr., Walnut Creek, 94598 USA
- Stephen Nayfach
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Dr., Walnut Creek, 94598 USA
- Georgios A. Pavlopoulos
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Dr., Walnut Creek, 94598 USA
  - BSRC “Alexander Fleming”, 34 Fleming Street, Vari, 16672 Athens, Greece
- Frederik Schulz
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Dr., Walnut Creek, 94598 USA
- Katherine D. McMahon
  - Departments of Civil and Environmental Engineering and Bacteriology, University of Wisconsin Madison, 1550 Linden Drive, Madison, WI 53726 USA
- David Walsh
  - Department of Biology, Concordia University, 7141 Sherbrooke St. West, Montreal, QC, H4B 1R6 Canada
- Tanja Woyke
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Dr., Walnut Creek, 94598 USA
- Natalia N. Ivanova
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Dr., Walnut Creek, 94598 USA
- Emiley A. Eloe-Fadrosh
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Dr., Walnut Creek, 94598 USA
- Susannah G. Tringe
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Dr., Walnut Creek, 94598 USA
- Nikos C. Kyrpides
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Dr., Walnut Creek, 94598 USA
|
33
|
Sarsaiya S, Shi J, Chen J. Bioengineering tools for the production of pharmaceuticals: current perspective and future outlook. Bioengineered 2019; 10:469-492. [PMID: 31656120 PMCID: PMC6844412 DOI: 10.1080/21655979.2019.1682108] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 07/27/2019] [Revised: 09/08/2019] [Accepted: 10/11/2019] [Indexed: 01/18/2023]
Abstract
Bioengineering tools offer significant advantages: they are less time-consuming and serve as a promising stage for producing pharmaceutical bioproducts on a single platform. This review highlights the advantages of, and current improvements in, plant, animal and microbial bioengineering tools and outlines feasible approaches at the biological and process bioengineering levels for advancing the economic feasibility of pharmaceutical production. The critical analysis revealed that systems biology and synthetic biology, along with advanced bioengineering tools such as transcriptome, proteome, metabolome and nano-bioengineering tools, have shown a promising impact on the development of pharmaceutical bioproducts. Tools to overcome the challenges that accompany pharmaceutical production, including nano-bioengineering tools, are also discussed. As a summary and prospect, the review also gives new insight into the challenges and possible breakthroughs in the development of pharmaceutical bioproducts through bioengineering tools.
Affiliation(s)
- Surendra Sarsaiya
  - Key Laboratory of Basic Pharmacology and Joint International Research Laboratory of Ethnomedicine of Ministry of Education, Zunyi Medical University, Zunyi, China
  - Bioresource Institute for Healthy Utilization, Zunyi Medical University, Zunyi, China
- Jingshan Shi
  - Key Laboratory of Basic Pharmacology and Joint International Research Laboratory of Ethnomedicine of Ministry of Education, Zunyi Medical University, Zunyi, China
- Jishuang Chen
  - Bioresource Institute for Healthy Utilization, Zunyi Medical University, Zunyi, China
  - College of Biotechnology and Pharmaceutical Engineering, Nanjing Tech University, Nanjing, China
|
34
|
Infant airway microbiota and topical immune perturbations in the origins of childhood asthma. Nat Commun 2019; 10:5001. [PMID: 31676759 PMCID: PMC6825176 DOI: 10.1038/s41467-019-12989-7] [Citation(s) in RCA: 93] [Impact Index Per Article: 18.6] [Received: 02/27/2019] [Accepted: 10/14/2019] [Indexed: 12/24/2022]
Abstract
Asthma is believed to arise through early life aberrant immune development in response to environmental exposures that may influence the airway microbiota. Here, we examine the airway microbiota during the first three months of life by 16S rRNA gene amplicon sequencing in the population-based Copenhagen Prospective Studies on Asthma in Childhood 2010 (COPSAC2010) cohort consisting of 700 children monitored for the development of asthma since birth. Microbial diversity and the relative abundances of Veillonella and Prevotella in the airways at age one month are associated with asthma by age 6 years, both individually and with additional taxa in a multivariable model. Higher relative abundance of these bacteria is furthermore associated with an airway immune profile dominated by reduced TNF-α and IL-1β and increased CCL2 and CCL17, which itself is an independent predictor for asthma. These findings suggest a mechanism of microbiota-immune interactions in early infancy that predisposes to childhood asthma. Here, Thorsen et al. examine the microbiota during the first three months of life in a cohort of 700 children and find that microbial diversity and the relative abundances of Veillonella and Prevotella in the airways at one month of age are associated with topical immune mediators and asthma by age 6 years.
|
35
|
Cruz F, Lagoa D, Mendes J, Rocha I, Ferreira EC, Rocha M, Dias O. SamPler - a novel method for selecting parameters for gene functional annotation routines. BMC Bioinformatics 2019; 20:454. [PMID: 31488049 PMCID: PMC6727554 DOI: 10.1186/s12859-019-3038-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/26/2018] [Accepted: 08/21/2019] [Indexed: 11/17/2022]
Abstract
BACKGROUND As genome sequencing projects grow rapidly, the diversity of organisms with recently assembled genome sequences peaks at an unprecedented scale, thereby highlighting the need to make gene functional annotations fast and efficient. However, the (high) quality of such annotations must be guaranteed, as this is the first indicator of the genomic potential of every organism. Automatic procedures help accelerate the annotation process, though they decrease the confidence and reliability of the outcomes. Manually curating a genome-wide annotation of gene, enzyme and transporter protein functions is a highly time-consuming, tedious and impractical task, even for the most proficient curator. Hence, a semi-automated procedure, which balances the two approaches, will increase the reliability of the annotation while speeding up the process. In fact, a prior analysis of the annotation algorithm may improve its performance, by manipulating its parameters, hastening the downstream processing and the manual curation of assigning functions to genes encoding proteins. RESULTS Here, SamPler, a novel strategy to select parameters for gene functional annotation routines, is presented. This semi-automated method is based on the manual curation of a randomly selected set of genes/proteins. Then, in a multi-dimensional array, this sample is used to assess the automatic annotations for all possible combinations of the algorithm's parameters. These assessments allow creating an array of confusion matrices, for which several metrics are calculated (accuracy, precision and negative predictive value) and used to reach optimal values for the parameters. CONCLUSIONS The potential of this methodology is demonstrated with four genome functional annotations performed in merlin, an in-house user-friendly computational framework for genome-scale metabolic annotation and model reconstruction. For that, SamPler was implemented as a new plugin for the merlin tool.
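The parameter-selection idea in this abstract (score every parameter combination against a manually curated sample, build one confusion matrix per combination, then compare accuracy, precision and negative predictive value) can be illustrated with a minimal Python sketch. This is not the SamPler implementation or the merlin API: the two per-gene scores, the `alpha` weight, the `threshold` parameter and the toy sample are all hypothetical stand-ins chosen only to make the grid evaluation concrete.

```python
from itertools import product


def metrics(tp, fp, tn, fn):
    """Accuracy, precision and negative predictive value from confusion counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "npv": tn / (tn + fn) if (tn + fn) else 0.0,
    }


def evaluate_grid(sample, alphas, thresholds):
    """Build one confusion matrix per (alpha, threshold) pair.

    `sample` holds hypothetical tuples (score_a, score_b, is_correct), where
    `is_correct` is the curator's verdict on the automatic annotation.
    """
    results = {}
    for alpha, threshold in product(alphas, thresholds):
        tp = fp = tn = fn = 0
        for score_a, score_b, is_correct in sample:
            combined = alpha * score_a + (1 - alpha) * score_b
            accepted = combined >= threshold
            if accepted and is_correct:
                tp += 1  # accepted a correct annotation
            elif accepted:
                fp += 1  # accepted a wrong annotation
            elif is_correct:
                fn += 1  # rejected a correct annotation
            else:
                tn += 1  # rejected a wrong annotation
        results[(alpha, threshold)] = metrics(tp, fp, tn, fn)
    return results


# Toy curated sample (illustrative values only).
sample = [
    (0.9, 0.8, True),
    (0.7, 0.9, True),
    (0.4, 0.3, False),
    (0.6, 0.2, False),
    (0.3, 0.6, True),
]
grid = evaluate_grid(sample, alphas=[0.0, 0.5, 1.0], thresholds=[0.5, 0.7])
best = max(grid, key=lambda key: grid[key]["accuracy"])
print(best, grid[best])
```

Picking the best cell by accuracy alone is a simplification; the abstract indicates that several metrics are weighed together when choosing the final parameter values.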
Affiliation(s)
- Fernando Cruz
  - Centre of Biological Engineering, University of Minho, 4710-057 Braga, Portugal
- Davide Lagoa
  - Centre of Biological Engineering, University of Minho, 4710-057 Braga, Portugal
- João Mendes
  - Centre of Biological Engineering, University of Minho, 4710-057 Braga, Portugal
- Isabel Rocha
  - Centre of Biological Engineering, University of Minho, 4710-057 Braga, Portugal
  - Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, 2780-157 Oeiras, Portugal
- Eugénio C. Ferreira
  - Centre of Biological Engineering, University of Minho, 4710-057 Braga, Portugal
- Miguel Rocha
  - Centre of Biological Engineering, University of Minho, 4710-057 Braga, Portugal
- Oscar Dias
  - Centre of Biological Engineering, University of Minho, 4710-057 Braga, Portugal
|
36
|
Mier P, Andrade-Navarro MA. Toward completion of the Earth's proteome: an update a decade later. Brief Bioinform 2019; 20:463-470. [PMID: 29040399 DOI: 10.1093/bib/bbx127] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/12/2017] [Revised: 09/08/2017] [Indexed: 12/13/2022]
Abstract
Protein databases are steadily growing, driven by the spread of new, more efficient sequencing techniques. This growth is dominated by an increase in redundancy (homologous proteins with various degrees of sequence similarity) and by the incapability to process and curate sequence entries as fast as they are created. To understand these trends and aid bioinformatic resources that might be compromised by the increasing size of the protein sequence databases, we have created a less-redundant protein data set. In parallel, we analyzed the evolution of protein sequence databases in terms of size and redundancy. While the SwissProt database has decelerated its growth mostly because of a focus on increasing the level of annotation of its sequences, its counterpart TrEMBL, much less limited by curation steps, is still in a phase of accelerated growth. However, we predict that before 2020, almost all entries deposited in UniProtKB will be homologous to known proteins. We propose that new sequencing projects can be made more useful if they are driven to sequencing voids, parts of the tree of life far from already sequenced species or model organisms. We show these voids are present in the Archaea and Eukarya domains of life. The approach to the certainty of the redundancy of new protein sequence entries leads to the consideration that most of the protein diversity on Earth has already been described, which we estimate at around 3.75 million proteins, revising down the prediction we made a decade ago.
Affiliation(s)
- Pablo Mier
  - Faculty of Biology, Johannes Gutenberg University Mainz, Gresemundweg, Mainz, Germany
|
37
|
Garcia AK, Kaçar B. How to resurrect ancestral proteins as proxies for ancient biogeochemistry. Free Radic Biol Med 2019; 140:260-269. [PMID: 30951835 DOI: 10.1016/j.freeradbiomed.2019.03.033] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 09/17/2018] [Revised: 02/11/2019] [Accepted: 03/26/2019] [Indexed: 10/27/2022]
Abstract
Throughout the history of life, enzymes have served as the primary molecular mediators of biogeochemical cycles by catalyzing the metabolic pathways that interact with geochemical substrates. The byproducts of enzymatic activities have been preserved as chemical and isotopic signatures in the geologic record. However, interpretations of these signatures are limited by the assumption that such enzymes have remained functionally conserved over billions of years of molecular evolution. By reconstructing ancient genetic sequences in conjunction with laboratory enzyme resurrection, preserved biogeochemical signatures can instead be related to experimentally constrained, ancestral enzymatic properties. We may thereby investigate instances within molecular evolutionary trajectories potentially tied to significant biogeochemical transitions evidenced in the geologic record. Here, we survey recent enzyme resurrection studies to provide a reasoned assessment of areas of success and common pitfalls relevant to ancient biogeochemical applications. We conclude by considering the Great Oxidation Event, which provides a constructive example of a significant biogeochemical transition that warrants investigation with ancestral enzyme resurrection. This event also serves to highlight the pitfalls of facile interpretation of paleophenotype models and data, as applied to two examples of enzymes that likely both influenced and were influenced by the rise of atmospheric oxygen - RuBisCO and nitrogenase.
Affiliation(s)
- Amanda K Garcia
  - Department of Molecular and Cell Biology, University of Arizona, Tucson, AZ, 85721, USA
- Betül Kaçar
  - Department of Molecular and Cell Biology, University of Arizona, Tucson, AZ, 85721, USA
  - Department of Astronomy and Steward Observatory, University of Arizona, Tucson, AZ, 85721, USA
|
38
|
Chen KT, Lu CL. CSAR-web: a web server of contig scaffolding using algebraic rearrangements. Nucleic Acids Res 2018; 46:W55-W59. [PMID: 29733393 PMCID: PMC6030906 DOI: 10.1093/nar/gky337] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 02/13/2018] [Accepted: 04/19/2018] [Indexed: 01/23/2023]
Abstract
CSAR-web is a web-based tool that allows the users to efficiently and accurately scaffold (i.e. order and orient) the contigs of a target draft genome based on a complete or incomplete reference genome from a related organism. It takes as input a target genome in multi-FASTA format and a reference genome in FASTA or multi-FASTA format, depending on whether the reference genome is complete or incomplete, respectively. In addition, it requires the users to choose either ‘NUCmer on nucleotides’ or ‘PROmer on translated amino acids’ for CSAR-web to identify conserved genomic markers (i.e. matched sequence regions) between the target and reference genomes, which are used by the rearrangement-based scaffolding algorithm in CSAR-web to order and orient the contigs of the target genome based on the reference genome. In the output page, CSAR-web displays its scaffolding result in a graphical mode (i.e. scalable dotplot) allowing the users to visually validate the correctness of scaffolded contigs and in a tabular mode allowing the users to view the details of scaffolds. CSAR-web is available online at http://genome.cs.nthu.edu.tw/CSAR-web.
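The input convention described in this abstract (a target genome in multi-FASTA format, and a reference in FASTA or multi-FASTA format) can be checked locally before uploading. The following Python sketch only counts FASTA header lines to tell the two formats apart; the function names are hypothetical, it is not part of CSAR-web, and it does not assess whether a genome is biologically complete.

```python
def count_fasta_records(text):
    """Count sequence records by counting header lines that start with '>'."""
    return sum(1 for line in text.splitlines() if line.startswith(">"))


def fasta_kind(text):
    """Return 'FASTA' for a single record and 'multi-FASTA' for several."""
    records = count_fasta_records(text)
    if records == 0:
        raise ValueError("no FASTA records found")
    return "FASTA" if records == 1 else "multi-FASTA"


# Illustrative inputs: a one-sequence reference and a two-contig draft.
reference = ">chromosome\nACGTACGT\nTTGA\n"
draft = ">contig_1\nACGT\n>contig_2\nGGCC\n"
print(fasta_kind(reference), fasta_kind(draft))
```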
Affiliation(s)
- Kun-Tze Chen
  - Department of Computer Science, National Tsing Hua University, Hsinchu 30013, Taiwan
- Chin Lung Lu
  - Department of Computer Science, National Tsing Hua University, Hsinchu 30013, Taiwan
|
39
|
Klemetsen T, Raknes IA, Fu J, Agafonov A, Balasundaram SV, Tartari G, Robertsen E, Willassen NP. The MAR databases: development and implementation of databases specific for marine metagenomics. Nucleic Acids Res 2018; 46:D692-D699. [PMID: 29106641 PMCID: PMC5753341 DOI: 10.1093/nar/gkx1036] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 08/14/2017] [Accepted: 10/18/2017] [Indexed: 12/03/2022]
Abstract
We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database for completely sequenced marine prokaryotic genomes, which represent a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search in the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/.
Affiliation(s)
- Terje Klemetsen
  - Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
- Inge A Raknes
  - Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
- Juan Fu
  - Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
- Alexander Agafonov
  - Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
- Sudhagar V Balasundaram
  - Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
- Giacomo Tartari
  - Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
  - Department of Information Technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
- Espen Robertsen
  - Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
- Nils P Willassen
  - Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
|
40
|
Graells T, Ishak H, Larsson M, Guy L. The all-intracellular order Legionellales is unexpectedly diverse, globally distributed and lowly abundant. FEMS Microbiol Ecol 2018; 94:5110392. [PMID: 30973601 PMCID: PMC6167759 DOI: 10.1093/femsec/fiy185] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 06/20/2018] [Accepted: 09/08/2018] [Indexed: 12/14/2022]
Abstract
Legionellales is an order of the Gammaproteobacteria, composed only of host-adapted, intracellular bacteria, including the accidental human pathogens Legionella pneumophila and Coxiella burnetii. Although the diversity in terms of lifestyle is large across the order, only a few genera have been sequenced, owing to the difficulty of growing intracellular bacteria in pure culture. In particular, we know little about their global distribution and abundance. Here, we analyze 16/18S rDNA amplicons both from tens of thousands of published studies and from two separate sampling campaigns in and around ponds and in a silver mine. We demonstrate that the diversity of the order is much larger than previously thought, with over 450 uncultured genera. We show that Legionellales are found in about half of the samples from freshwater, soil and marine environments and are quasi-ubiquitous in man-made environments. Their abundance is low, typically 0.1%, with few samples up to 1%. Most Legionellales OTUs are globally distributed, while many do not belong to a previously identified species. This study sheds new light on the ubiquity and diversity of one major group of host-adapted bacteria. It also emphasizes the need to use metagenomics to better understand the role of host-adapted bacteria in all environments.
Affiliation(s)
- Tiscar Graells
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Box 582, 75123 Uppsala, Sweden
  - Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, Edifici C, Carrer de la Vall Moronta, 08193 Bellaterra, Spain
- Helena Ishak
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Box 582, 75123 Uppsala, Sweden
- Madeleine Larsson
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Box 582, 75123 Uppsala, Sweden
- Lionel Guy
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Box 582, 75123 Uppsala, Sweden
|
41
|
Roux S, Krupovic M, Daly RA, Borges AL, Nayfach S, Schulz F, Sharrar A, Matheus Carnevali PB, Cheng JF, Ivanova NN, Bondy-Denomy J, Wrighton KC, Woyke T, Visel A, Kyrpides NC, Eloe-Fadrosh EA. Cryptic inoviruses revealed as pervasive in bacteria and archaea across Earth's biomes. Nat Microbiol 2019; 4:1895-1906. [PMID: 31332386 PMCID: PMC6813254 DOI: 10.1038/s41564-019-0510-x] [Citation(s) in RCA: 153] [Impact Index Per Article: 30.6] [Received: 02/12/2019] [Accepted: 06/05/2019] [Indexed: 01/02/2023]
Abstract
Bacteriophages from the Inoviridae family (inoviruses) are characterized by their unique morphology, genome content and infection cycle. One of the most striking features of inoviruses is their ability to establish a chronic infection whereby the viral genome resides within the cell in either an exclusively episomal state or integrated into the host chromosome and virions are continuously released without killing the host. To date, a relatively small number of inovirus isolates have been extensively studied, either for biotechnological applications, such as phage display, or because of their effect on the toxicity of known bacterial pathogens including Vibrio cholerae and Neisseria meningitidis. Here, we show that the current 56 members of the Inoviridae family represent a minute fraction of a highly diverse group of inoviruses. Using a machine learning approach leveraging a combination of marker gene and genome features, we identified 10,295 inovirus-like sequences from microbial genomes and metagenomes. Collectively, our results call for reclassification of the current Inoviridae family into a viral order including six distinct proposed families associated with nearly all bacterial phyla across virtually every ecosystem. Putative inoviruses were also detected in several archaeal genomes, suggesting that, collectively, members of this supergroup infect hosts across the domains Bacteria and Archaea. Finally, we identified an expansive diversity of inovirus-encoded toxin–antitoxin and gene expression modulation systems, alongside evidence of both synergistic (CRISPR evasion) and antagonistic (superinfection exclusion) interactions with co-infecting viruses, which we experimentally validated in a Pseudomonas model. Capturing this previously obscured component of the global virosphere may spark new avenues for microbial manipulation approaches and innovative biotechnological applications. 
A machine learning approach was used to recover over 10,000 inovirus-like sequences from existing microbial genomes and metagenomes, consequently proposing the reclassification of the Inoviridae family to a viral order, and uncover the previously unrecognized diversity of these viruses across hosts and environments.
Affiliation(s)
- Simon Roux
  - DOE Joint Genome Institute, Walnut Creek, CA, USA
- Mart Krupovic
  - Department of Microbiology, Institut Pasteur, Paris, France
- Rebecca A Daly
  - Department of Soil and Crop Sciences, Colorado State University, Fort Collins, CO, USA
- Adair L Borges
  - Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA, USA
- Allison Sharrar
  - Department of Earth & Planetary Sciences, University of California, Berkeley, Berkeley, CA, USA
- Joseph Bondy-Denomy
  - Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA, USA
  - Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Kelly C Wrighton
  - Department of Soil and Crop Sciences, Colorado State University, Fort Collins, CO, USA
- Tanja Woyke
  - DOE Joint Genome Institute, Walnut Creek, CA, USA
- Axel Visel
  - DOE Joint Genome Institute, Walnut Creek, CA, USA
|
42
|
Lund JB, List M, Baumbach J. Interactive microbial distribution analysis using BioAtlas. Nucleic Acids Res 2017; 45:W509-W513. [PMID: 28460071 PMCID: PMC5570126 DOI: 10.1093/nar/gkx304] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 02/12/2017] [Accepted: 04/12/2017] [Indexed: 02/01/2023]
Abstract
Massive amounts of 16S rRNA sequencing data have been stored in publicly accessible databases, such as GOLD, SILVA, GreenGenes (GG), and the Ribosomal Database Project (RDP). Many of these sequences are tagged with geo-locations. Nevertheless, researchers currently lack a user-friendly tool to analyze microbial distribution in a location-specific context. BioAtlas is an interactive web application that closes this gap between sequence databases, taxonomy profiling and geo/body-location information. It enables users to browse taxonomically annotated sequences across (i) the world map, (ii) human body maps and (iii) user-defined maps. It further allows for (iv) uploading of own sample data, which can be placed on existing maps to (v) browse the distribution of the associated taxonomies. Finally, BioAtlas enables users to (vi) contribute custom maps (e.g. for plants or animals) and to map taxonomies to pre-defined map locations. In summary, BioAtlas facilitates map-supported browsing of public 16S rRNA sequence data and analyses of user-provided sequences without requiring manual mapping to taxonomies and existing databases. Availability: http://bioatlas.compbio.sdu.dk/
Affiliation(s)
- Jesper Beltoft Lund
  - Department of Mathematics and Computer Science (IMADA), University of Southern Denmark, 5000 Odense, Denmark
- Markus List
  - Max Planck Institute for Informatics, Saarland Informatics Campus, 66123 Saarbrücken, Germany
- Jan Baumbach
  - Department of Mathematics and Computer Science (IMADA), University of Southern Denmark, 5000 Odense, Denmark
  - Max Planck Institute for Informatics, Saarland Informatics Campus, 66123 Saarbrücken, Germany
|
43
|
diCenzo GC, Mengoni A, Perrin E. Chromids Aid Genome Expansion and Functional Diversification in the Family Burkholderiaceae. Mol Biol Evol 2019; 36:562-574. [PMID: 30608550 DOI: 10.1093/molbev/msy248] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 01/17/2023]
Abstract
Multipartite genomes, containing at least two large replicons, are found in diverse bacteria; however, the advantage of this genome structure remains incompletely understood. Here, we perform comparative genomics of hundreds of finished β-proteobacterial genomes to gain insights into the role and emergence of multipartite genomes. Almost all essential secondary replicons (chromids) of the β-proteobacteria are found in the family Burkholderiaceae. These replicons arose from just two plasmid acquisition events, and they were likely stabilized early in their evolution by the presence of core genes. On average, Burkholderiaceae genera with multipartite genomes had a larger total genome size, but smaller chromosome, than genera without secondary replicons. Pangenome-level functional enrichment analyses suggested that interreplicon functional biases are partially driven by the enrichment of secondary replicons in the accessory pangenome fraction. Nevertheless, the small overlap in orthologous groups present in each replicon's pangenome indicated a clear functional separation of the replicons. Chromids appeared biased to environmental adaptation, as the functional categories enriched on chromids were also overrepresented on the chromosomes of the environmental genera (Paraburkholderia and Cupriavidus) compared with the pathogenic genera (Burkholderia and Ralstonia). Using ancestral state reconstruction, it was predicted that the rate of accumulation of modern-day genes by chromids was more rapid than the rate of gene accumulation by the chromosomes. Overall, the data are consistent with a model where the primary advantage of secondary replicons is in facilitating increased rates of gene acquisition through horizontal gene transfer, consequently resulting in replicons enriched in genes associated with adaptation to novel environments.
Affiliation(s)
- George C diCenzo
  - Department of Biology, University of Florence, Sesto Fiorentino, Florence, Italy
- Alessio Mengoni
  - Department of Biology, University of Florence, Sesto Fiorentino, Florence, Italy
- Elena Perrin
  - Department of Biology, University of Florence, Sesto Fiorentino, Florence, Italy
|
44
|
Dunivin TK, Yeh SY, Shade A. A global survey of arsenic-related genes in soil microbiomes. BMC Biol 2019; 17:45. [PMID: 31146755 PMCID: PMC6543643 DOI: 10.1186/s12915-019-0661-5] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 11/19/2018] [Accepted: 05/02/2019] [Indexed: 01/21/2023]
Abstract
BACKGROUND Environmental resistomes include transferable microbial genes. One important resistome component is resistance to arsenic, a ubiquitous and toxic metalloid that can have negative and chronic consequences for human and animal health. The distribution of arsenic resistance and metabolism genes in the environment is not well understood. However, microbial communities and their resistomes mediate key transformations of arsenic that are expected to impact both biogeochemistry and local toxicity. RESULTS We examined the phylogenetic diversity, genomic location (chromosome or plasmid), and biogeography of arsenic resistance and metabolism genes in 922 soil genomes and 38 metagenomes. To do so, we developed a bioinformatic toolkit that includes BLAST databases, hidden Markov models and resources for gene-targeted assembly of nine arsenic resistance and metabolism genes: acr3, aioA, arsB, arsC (grx), arsC (trx), arsD, arsM, arrA, and arxA. Though arsenic-related genes were common, they were not universally detected, contradicting the common conjecture that all organisms have them. From major clades of arsenic-related genes, we inferred their potential for horizontal and vertical transfer. Different types and proportions of genes were detected across soils, suggesting microbial community composition will, in part, determine local arsenic toxicity and biogeochemistry. While arsenic-related genes were globally distributed, particular sequence variants were highly endemic (e.g., acr3), suggesting dispersal limitation. The gene encoding arsenic methylase arsM was unexpectedly abundant in soil metagenomes (median 48%), suggesting that it plays a prominent role in global arsenic biogeochemistry. CONCLUSIONS Our analysis advances understanding of arsenic resistance, metabolism, and biogeochemistry, and our approach provides a roadmap for the ecological investigation of environmental resistomes.
Affiliation(s)
- Taylor K Dunivin
  - Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, 48824, USA
  - Environmental and Integrative Toxicological Sciences Doctoral Program, Michigan State University, East Lansing, MI, 48824, USA
- Susanna Y Yeh
  - Institute for Cyber-Enabled Research, Michigan State University, East Lansing, MI, 48824, USA
- Ashley Shade
  - Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, 48824, USA
  - Program in Ecology, Evolutionary Biology and Behavior, Michigan State University, East Lansing, MI, 48824, USA
  - Department of Plant, Soil, and Microbial Sciences, Michigan State University, East Lansing, MI, 48824, USA
  - Plant Resilience Institute, Michigan State University, East Lansing, MI, 48834, USA
|
45
|
Vinatzer BA, Heath LS, Almohri HMJ, Stulberg MJ, Lowe C, Li S. Cyberbiosecurity Challenges of Pathogen Genome Databases. Front Bioeng Biotechnol 2019; 7:106. [PMID: 31157218 PMCID: PMC6529814 DOI: 10.3389/fbioe.2019.00106] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 02/21/2019] [Accepted: 04/25/2019] [Indexed: 01/21/2023]
Abstract
Pathogen detection, identification, and tracking is shifting from non-molecular methods, DNA fingerprinting methods, and single gene methods to methods relying on whole genomes. Viral Ebola and influenza genome data are being used for real-time tracking, while food-borne bacterial pathogen outbreaks and hospital outbreaks are investigated using whole genomes in the UK, Canada, the USA and other countries. Also, plant pathogen genomes are starting to be used to investigate plant disease epidemics such as the wheat blast outbreak in Bangladesh. While these genome-based approaches provide never-seen advantages over all previous approaches with regard to public health and biosecurity, they also come with new vulnerabilities and risks with regard to cybersecurity. The more we rely on genome databases, the more likely these databases will become targets for cyber-attacks to interfere with public health and biosecurity systems by compromising their integrity, taking them hostage, or manipulating the data they contain. Also, while there is the potential to collect pathogen genomic data from infected individuals or agricultural and food products during disease outbreaks to improve disease modeling and forecasting, how to protect the privacy of individuals, growers, and retailers is another major cyberbiosecurity challenge. As data become linkable to other data sources, individuals and groups become identifiable and potential malicious activities targeting those identified become feasible. Here, we define a number of potential cybersecurity weaknesses in today's pathogen genome databases to raise awareness, and we provide potential solutions to strengthen cyberbiosecurity during the development of the next generation of pathogen genome databases.
Affiliation(s)
- Boris A. Vinatzer
  - School of Plant and Environmental Sciences, College of Agriculture and Life Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Lenwood S. Heath
  - Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Michael J. Stulberg
  - Animal and Plant Health Inspection Service (USDA), Riverdale Park, MD, United States
- Christopher Lowe
  - Beltsville Agricultural Research Center, Agricultural Research Service (USDA), Beltsville, MD, United States
- Song Li
  - School of Plant and Environmental Sciences, College of Agriculture and Life Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
|
46
|
Presnell KV, Alper HS. Systems Metabolic Engineering Meets Machine Learning: A New Era for Data-Driven Metabolic Engineering. Biotechnol J 2019; 14:e1800416. [PMID: 30927499 DOI: 10.1002/biot.201800416] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 01/17/2019] [Revised: 02/20/2019] [Indexed: 12/30/2022]
Abstract
The recent increase in high-throughput capacity of 'omics datasets combined with advances and interest in machine learning (ML) have created great opportunities for systems metabolic engineering. In this regard, data-driven modeling methods have become increasingly valuable to metabolic strain design. In this review, the nature of 'omics is discussed and a broad introduction to the ML algorithms combining these datasets into predictive models of metabolism and metabolic rewiring is provided. Next, this review highlights recent work in the literature that utilizes such data-driven methods to inform various metabolic engineering efforts for different classes of application including product maximization, understanding and profiling phenotypes, de novo metabolic pathway design, and creation of robust system-scale models for biotechnology. Overall, this review aims to highlight the potential and promise of using ML algorithms with metabolic engineering and systems biology related datasets.
Affiliation(s)
- Kristin V Presnell
  - McKetta Department of Chemical Engineering, The University of Texas at Austin, 200 E Dean Keeton St. Stop C0400, Austin, TX, 78712, USA
- Hal S Alper
  - McKetta Department of Chemical Engineering, The University of Texas at Austin, 200 E Dean Keeton St. Stop C0400, Austin, TX, 78712, USA
  - Institute for Cellular and Molecular Biology, The University of Texas at Austin, 100 E 24 St., Austin, TX, 78712, USA
|
47
|
Dutta A, Peoples LM, Gupta A, Bartlett DH, Sar P. Exploring the piezotolerant/piezophilic microbial community and genomic basis of piezotolerance within the deep subsurface Deccan traps. Extremophiles 2019; 23:421-433. [PMID: 31049708 DOI: 10.1007/s00792-019-01094-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/20/2019] [Accepted: 04/23/2019] [Indexed: 01/22/2023]
Abstract
The deep biosphere is often characterized by multiple extreme physical-chemical conditions, of which pressure is an important parameter that influences life but remains less studied. This geomicrobiology study was designed to understand the response of a subterranean microbial community of the Deccan traps to high-pressure conditions and to elucidate their genomic properties. Groundwater from a deep basaltic aquifer of the Deccan traps was used to ascertain the community response to 25 MPa and 50 MPa pressure following enrichment in high-salt and low-salt organic media. Quantitative PCR data indicated a decrease in bacterial and archaeal cell numbers with increasing pressure. 16S rRNA gene sequencing displayed substantial changes in the microbial community in which Acidovorax appeared to be the most dominant genus in the low-salt medium and Microbacteriaceae emerged as the major family in the high-salt medium under both pressure conditions. Genes present in metagenome-associated genomes which have previously been associated with piezotolerance include those related to nutrient uptake and extracytoplasmic stress (omp, rseC), protein folding and unfolding (dnaK, groEL and others), and DNA repair mechanisms (mutT, uvr and others). We hypothesize that these genes facilitate tolerance to high pressure by certain groups of microbes residing in subsurface Deccan traps.
Affiliation(s)
- Avishek Dutta
  - Environmental Microbiology and Genomics Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, 721302, India
  - School of Bioscience, Indian Institute of Technology Kharagpur, Kharagpur, 721302, India
- Logan M Peoples
  - Marine Biology Research Division, Scripps Institution of Oceanography, University of California San Diego, La Jolla, San Diego, CA, 92093, USA
- Abhishek Gupta
  - Environmental Microbiology and Genomics Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, 721302, India
- Douglas H Bartlett
  - Marine Biology Research Division, Scripps Institution of Oceanography, University of California San Diego, La Jolla, San Diego, CA, 92093, USA
- Pinaki Sar
  - Environmental Microbiology and Genomics Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, 721302, India
|
48
|
Mathema VB, Dondorp AM, Imwong M. OSTRFPD: Multifunctional Tool for Genome-Wide Short Tandem Repeat Analysis for DNA, Transcripts, and Amino Acid Sequences with Integrated Primer Designer. Evol Bioinform Online 2019; 15:1176934319843130. [PMID: 31040636 PMCID: PMC6482647 DOI: 10.1177/1176934319843130] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/26/2019] [Accepted: 03/15/2019] [Indexed: 01/18/2023]
Abstract
Microsatellite mining is a common outcome of the in silico approach to genomic studies. The resulting short tandemly repeated DNA could be used as molecular markers for studying polymorphism, genotyping and forensics. The omni short tandem repeat finder and primer designer (OSTRFPD) is among the few versatile, platform-independent open-source tools written in Python that enables researchers to identify and analyse genome-wide short tandem repeats in both nucleic acids and protein sequences. OSTRFPD is designed to run either in a user-friendly fully featured graphical interface or in a command line interface mode for advanced users. OSTRFPD can detect both perfect and imperfect repeats of low complexity with customisable scores. Moreover, the software has built-in architecture to simultaneously filter selection of flanking regions in DNA and generate microsatellite-targeted primers implementing the Primer3 platform. The software has built-in motif-sequence generator engines and an additional option to use the dictionary mode for custom motif searches. The software generates search results including general statistics containing motif categorisation, repeat frequencies, densities, coverage, guanine–cytosine (GC) content, and simple text-based imperfect alignment visualisation. Thus, OSTRFPD presents users with a quick single-step solution package to assist development of microsatellite markers and categorise tandemly repeated amino acids in proteome databases. Practical implementation of OSTRFPD was demonstrated using publicly available whole-genome sequences of selected Plasmodium species. OSTRFPD is freely available and open-sourced for improvement and user-specific adaptation.
Affiliation(s)
- Vivek Bhakta Mathema
  - Department of Molecular Tropical Medicine and Genetics, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
- Arjen M Dondorp
  - Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
  - Centre for Tropical Medicine, Churchill Hospital, Oxford, UK
- Mallika Imwong
  - Department of Molecular Tropical Medicine and Genetics, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
  - Mallika Imwong, Department of Molecular Tropical Medicine and Genetics, Faculty of Tropical Medicine, Mahidol University, Bangkok 10400, Thailand.
|
49
|
Corel E, Méheust R, Watson AK, McInerney JO, Lopez P, Bapteste E. Bipartite Network Analysis of Gene Sharings in the Microbial World. Mol Biol Evol 2018; 35:899-913. [PMID: 29346651 PMCID: PMC5888944 DOI: 10.1093/molbev/msy001] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 12/19/2022]
Abstract
Extensive microbial gene flows affect how we understand virology, microbiology, medical sciences, genetic modification, and evolutionary biology. Phylogenies only provide a narrow view of these gene flows: plasmids and viruses, lacking core genes, cannot be attached to cellular life on phylogenetic trees. Yet viruses and plasmids have a major impact on cellular evolution, affecting both the gene content and the dynamics of microbial communities. Using bipartite graphs that connect up to 149,000 clusters of homologous genes with 8,217 related and unrelated genomes, we can in particular show patterns of gene sharing that do not map neatly with the organismal phylogeny. Homologous genes are recycled by lateral gene transfer, and multiple copies of homologous genes are carried by otherwise completely unrelated (and possibly nested) genomes, that is, viruses, plasmids and prokaryotes. When a homologous gene is present on at least one plasmid or virus and at least one chromosome, a process of "gene externalization," affected by a postprocessed selected functional bias, takes place, especially in Bacteria. Bipartite graphs give us a view of vertical and horizontal gene flow beyond classic taxonomy on a single very large, analytically tractable, graph that goes beyond the cellular Web of Life.
Affiliation(s)
- Eduardo Corel
  - Unité Mixte de Recherche 7138 Evolution Paris-Seine, Centre National de la Recherche Scientifique, Institut de Biologie Paris-Seine, Sorbonne Université, Université Pierre et Marie Curie, Paris, France
- Raphaël Méheust
  - Unité Mixte de Recherche 7138 Evolution Paris-Seine, Centre National de la Recherche Scientifique, Institut de Biologie Paris-Seine, Sorbonne Université, Université Pierre et Marie Curie, Paris, France
- Andrew K Watson
  - Unité Mixte de Recherche 7138 Evolution Paris-Seine, Centre National de la Recherche Scientifique, Institut de Biologie Paris-Seine, Sorbonne Université, Université Pierre et Marie Curie, Paris, France
- James O McInerney
  - Chair in Evolutionary Biology, The University of Manchester, United Kingdom
- Philippe Lopez
  - Unité Mixte de Recherche 7138 Evolution Paris-Seine, Centre National de la Recherche Scientifique, Institut de Biologie Paris-Seine, Sorbonne Université, Université Pierre et Marie Curie, Paris, France
- Eric Bapteste
  - Unité Mixte de Recherche 7138 Evolution Paris-Seine, Centre National de la Recherche Scientifique, Institut de Biologie Paris-Seine, Sorbonne Université, Université Pierre et Marie Curie, Paris, France
|
50
|
Nguyen TTH, Myrold DD, Mueller RS. Distributions of Extracellular Peptidases Across Prokaryotic Genomes Reflect Phylogeny and Habitat. Front Microbiol 2019; 10:413. [PMID: 30891022 PMCID: PMC6411800 DOI: 10.3389/fmicb.2019.00413] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 12/14/2018] [Accepted: 02/18/2019] [Indexed: 11/19/2022]
Abstract
Proteinaceous compounds are abundant forms of organic nitrogen in soil and aquatic ecosystems, and the rate of protein depolymerization, which is accomplished by a diverse range of microbial secreted peptidases, often limits nitrogen turnover in the environment. To determine if the distribution of secreted peptidases reflects the ecological and evolutionary histories of different taxa, we analyzed their distribution across prokaryotic lineages. Peptidase gene sequences of 147 archaeal and 2,191 bacterial genomes from the MEROPS database were screened for secretion signals, resulting in 55,072 secreted peptidases belonging to 148 peptidase families. These data, along with their corresponding 16S rRNA sequences, were used in our analysis. Overall, Bacteria had a much wider collection of secreted peptidases, higher average numbers of secreted peptidases per genome, and more unique peptidase families than Archaea. We found that the distribution of secreted peptidases corresponded to phylogenetic relationships among Bacteria and Archaea and often segregated according to microbial lifestyles, suggesting that the secreted peptidase complements of microbial taxa are optimized for the environmental microhabitats they occupy. Our analyses provide the groundwork for examining the specific functional role of families of secreted peptidases in relationship to the organisms and the corresponding environments in which they function.
Affiliation(s)
- Trang T. H. Nguyen
  - Department of Crop and Soil Science, Oregon State University, Corvallis, OR, United States
- David D. Myrold
  - Department of Crop and Soil Science, Oregon State University, Corvallis, OR, United States
- Ryan S. Mueller
  - Department of Microbiology, Oregon State University, Corvallis, OR, United States
|