1
Theodosiou T, Vrettos K, Baltsavia I, Baltoumas F, Papanikolaou N, Antonakis AΝ, Mossialos D, Ouzounis CA, Promponas VJ, Karaglani M, Chatzaki E, Brandau S, Pavlopoulos GA, Andreakos E, Iliopoulos I. BioTextQuest v2.0: An evolved tool for biomedical literature mining and concept discovery. Comput Struct Biotechnol J 2024; 23:3247-3253. [PMID: 39279874 PMCID: PMC11399685 DOI: 10.1016/j.csbj.2024.08.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Revised: 08/05/2024] [Accepted: 08/15/2024] [Indexed: 09/18/2024] Open
Abstract
The process of navigating through the landscape of biomedical literature and performing searches or combining them with bioinformatics analyses can be daunting, considering the exponential growth of scientific corpora and the plethora of tools designed to mine PubMed® and related repositories. Herein, we present BioTextQuest v2.0, a tool for biomedical literature mining. BioTextQuest v2.0 is an open-source online web portal for document clustering based on sets of selected biomedical terms, offering efficient management of information derived from PubMed abstracts. Employing established machine learning algorithms, the tool facilitates document clustering while allowing users to customize the analysis by selecting terms of interest. BioTextQuest v2.0 streamlines the process of uncovering valuable insights from biomedical research articles, serving as an agent that connects the identification of key terms like genes/proteins, diseases, chemicals, Gene Ontology (GO) terms, functions, and others through named entity recognition, and their application in biological research. Instead of manually sifting through articles, researchers can enter their PubMed-like query and receive extracted information in two user-friendly formats, tables and word clouds, simplifying the comprehension of key findings. The latest update of BioTextQuest leverages the EXTRACT named entity recognition tagger, enhancing its ability to pinpoint various biological entities within text. BioTextQuest v2.0 acts as a research assistant, significantly reducing the time and effort required for researchers to identify and present relevant information from the biomedical literature.
Affiliation(s)
- Theodosios Theodosiou
  - Division of Basic Sciences, University of Crete Medical School, Heraklion 71110, Greece
- Konstantinos Vrettos
  - Division of Basic Sciences, University of Crete Medical School, Heraklion 71110, Greece
- Ismini Baltsavia
  - Division of Basic Sciences, University of Crete Medical School, Heraklion 71110, Greece
- Fotis Baltoumas
  - Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Athens 16672, Greece
- Nikolas Papanikolaou
  - Division of Basic Sciences, University of Crete Medical School, Heraklion 71110, Greece
- Andreas Ν Antonakis
  - Division of Basic Sciences, University of Crete Medical School, Heraklion 71110, Greece
- Dimitrios Mossialos
  - Department of Biochemistry and Biotechnology, University of Thessaly, 41500 Larissa, Greece
- Christos A Ouzounis
  - Biological Computation & Computational Biology Group, AIIA Lab, School of Informatics, Aristotle University of Thessalonica, 57001 Thessalonica, Greece
- Vasilis J Promponas
  - Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia 1678, Cyprus
- Makrina Karaglani
  - Medical School, Democritus University of Thrace, 68100 Alexandroupolis, Greece
- Ekaterini Chatzaki
  - Medical School, Democritus University of Thrace, 68100 Alexandroupolis, Greece
- Sven Brandau
  - Experimental and Translational Research, Department of Otorhinolaryngology, University Hospital Essen, Essen, Germany
- Georgios A Pavlopoulos
  - Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Athens 16672, Greece
- Evangelos Andreakos
  - Center for Immunology and Transplantation, Biomedical Research Foundation Academy of Athens, Athens, Greece
- Ioannis Iliopoulos
  - Division of Basic Sciences, University of Crete Medical School, Heraklion 71110, Greece

2
Pechlivanis N, Karakatsoulis G, Kyritsis K, Tsagiopoulou M, Sgardelis S, Kappas I, Psomopoulos F. Microbial co-occurrence network demonstrates spatial and climatic trends for global soil diversity. Sci Data 2024; 11:672. [PMID: 38909071 PMCID: PMC11193810 DOI: 10.1038/s41597-024-03528-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Accepted: 06/14/2024] [Indexed: 06/24/2024] Open
Abstract
Despite recent research efforts to explore the co-occurrence patterns of diverse microbes within soil microbial communities, a substantial knowledge gap persists regarding global climate influences on soil microbiota behaviour. Comprehending co-occurrence patterns within distinct geoclimatic groups is pivotal for unravelling the ecological structure of microbial communities, which are crucial for preserving ecosystem functions and services. Our study addresses this gap by examining global climatic patterns of microbial diversity. Using data from the Earth Microbiome Project, we analyse a meta-community co-occurrence network for bacterial communities. This method unveils substantial shifts in topological features, highlighting regional and climatic trends. Arid, Polar, and Tropical zones show lower diversity but maintain denser networks, whereas Temperate and Cold zones display higher diversity alongside more modular networks. Furthermore, it identifies significant co-occurrence patterns across diverse climatic regions. Central taxa associated with different climates are pinpointed, highlighting climate's pivotal role in community structure. In conclusion, our study identifies significant correlations between microbial interactions in diverse climatic regions, contributing valuable insights into the intricate dynamics of soil microbiota.
Affiliation(s)
- Nikos Pechlivanis
  - Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thermi, 57001, Thessaloniki, Greece
  - Department of Genetics, Development and Molecular Biology, School of Biology, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece
- Georgios Karakatsoulis
  - Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thermi, 57001, Thessaloniki, Greece
- Konstantinos Kyritsis
  - Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thermi, 57001, Thessaloniki, Greece
- Maria Tsagiopoulou
  - Centro Nacional de Analisis Genomico (CNAG), C/Baldiri Reixac 4, 08028, Barcelona, Spain
- Stefanos Sgardelis
  - Department of Ecology, School of Biology, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece
- Ilias Kappas
  - Department of Genetics, Development and Molecular Biology, School of Biology, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece
- Fotis Psomopoulos
  - Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thermi, 57001, Thessaloniki, Greece

3
Ahmad HI, Mahmood S, Hassan M, Sajid M, Ahmed I, Shokrollahi B, Shahzad AH, Abbas S, Raza S, Khan K, Muhammad SA, Fouad D, Ataya FS, Li Z. Genomic insights into Yak (Bos grunniens) adaptations for nutrient assimilation in high-altitudes. Sci Rep 2024; 14:5650. [PMID: 38453987 PMCID: PMC10920680 DOI: 10.1038/s41598-024-55712-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 02/27/2024] [Indexed: 03/09/2024] Open
Abstract
High-altitude environments present formidable challenges for survival and reproduction, with organisms facing limited oxygen availability and scarce nutrient resources. The yak (Bos grunniens), indigenous to the Tibetan Plateau, has notably adapted to these extreme conditions. This study delves into the genomic basis of the yak's adaptation, focusing on the positive selection acting on genes involved in nutrient assimilation pathways. Employing techniques in comparative genomics and molecular evolutionary analyses, we selected genes in the yak that show signs of positive selection associated with nutrient metabolism, absorption, and transport. Our findings reveal specific genetic adaptations related to nutrient metabolism in harsh climatic conditions. Notably, genes involved in energy metabolism, oxygen transport, and thermoregulation exhibited signs of positive selection, suggesting their crucial role in the yak's successful colonization of high-altitude regions. The study also sheds light on the yak's immune system adaptations, emphasizing genes involved in response to various stresses prevalent at elevated altitudes. Insights into the yak's genomic makeup provide valuable information for understanding the broader implications of high-altitude adaptations in mammalian evolution. They may contribute to efforts in enhancing livestock resilience to environmental challenges.
Affiliation(s)
- Hafiz Ishfaq Ahmad
  - Department of Animal Breeding and Genetics, Faculty of Veterinary and Animal Sciences, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
- Sammina Mahmood
  - Department of Botany, Division of Science and Technology, University of Education, Lahore, Pakistan
- Mubashar Hassan
  - Department of Clinical Sciences, College of Veterinary and Animal Sciences (Sub campus UVAS, Lahore), Jhang, 35200, Pakistan
- Muhammad Sajid
  - Department of Pathobiology, College of Veterinary and Animal Sciences (Sub campus UVAS, Lahore), Jhang, 35200, Pakistan
- Irfan Ahmed
  - Department of Animal Nutrition, Faculty of Veterinary and Animal Sciences, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
- Borhan Shokrollahi
  - Hanwoo Research Institute, National Institute of Animal Science, Pyeongchang, 25340, Korea
- Abid Hussain Shahzad
  - Department of Clinical Sciences, College of Veterinary and Animal Sciences (Sub campus UVAS, Lahore), Jhang, 35200, Pakistan
- Shaista Abbas
  - Department of Physiology and Biochemistry, College of Veterinary and Animal Sciences, Jhang, 35200, Pakistan
- Sanan Raza
  - Department of Clinical Sciences, College of Veterinary and Animal Sciences (Sub campus UVAS, Lahore), Jhang, 35200, Pakistan
- Komal Khan
  - Department of Basic Sciences, Anatomy Section, College of Veterinary and Animal Sciences, Jhang, 35200, Pakistan
- Sayyed Aun Muhammad
  - Department of Clinical Sciences, College of Veterinary and Animal Sciences (Sub campus UVAS, Lahore), Jhang, 35200, Pakistan
- Dalia Fouad
  - Department of Zoology, College of Science, King Saud University, PO Box 22452, Riyadh, 11495, Saudi Arabia
- Farid S Ataya
  - Department of Biochemistry, College of Science, King Saud University, PO Box 2455, 11495, Riyadh, Saudi Arabia
- Zhengtian Li
  - Qujing Normal University, College of Biological Resource and Food Engineering, 655011, Yunnan, China

4
Huang Q, Zhang H, Zhang L, Xu B. Bacterial microbiota in different types of processed meat products: diversity, adaptation, and co-occurrence. Crit Rev Food Sci Nutr 2023:1-16. [PMID: 37905560 DOI: 10.1080/10408398.2023.2272770] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 11/02/2023]
Abstract
As a double-edged sword, some bacterial microbes can improve the quality and shelf life of meat products, but others are mainly responsible for deterioration of the safety and quality of meat products. This review aims to present a landscape of the bacterial microbiota in different types of processed meat products. After demonstrating a panoramic view of the bacterial genera in meat products, the diversity of bacterial microbiota was evaluated in two dimensions, namely different types of processed meat products and different meats. Then, the influence of environmental factors on bacterial communities was evaluated according to the storage temperature, packaging conditions, and sterilization methods. Furthermore, microbes are not independent. To explore interactions among those genera, co-occurrence patterns were examined. In these respects, this review highlighted the recent advances in fundamental principles that underlie the environmental adaptation strategies and explain why some species tend to occur together frequently, such as metabolic cross-feeding, co-aggregation at the microscale, and intercellular signaling systems. Further investigations are required to unveil the underlying molecular mechanisms that govern microbial community systems, ultimately contributing to developing new strategies to harness beneficial microorganisms and control harmful microorganisms.
Affiliation(s)
- Qianli Huang
  - Engineering Research Center of Bio-process, Ministry of Education, Hefei University of Technology, Hefei, China
  - School of Food and Biological Engineering, Hefei University of Technology, Hefei, China
- Huijuan Zhang
  - Engineering Research Center of Bio-process, Ministry of Education, Hefei University of Technology, Hefei, China
  - School of Food and Biological Engineering, Hefei University of Technology, Hefei, China
- Li Zhang
  - Engineering Research Center of Bio-process, Ministry of Education, Hefei University of Technology, Hefei, China
  - School of Food and Biological Engineering, Hefei University of Technology, Hefei, China
- Baocai Xu
  - Engineering Research Center of Bio-process, Ministry of Education, Hefei University of Technology, Hefei, China
  - School of Food and Biological Engineering, Hefei University of Technology, Hefei, China

5
Karatzas E, Baltoumas FA, Aplakidou E, Kontou PI, Stathopoulos P, Stefanis L, Bagos PG, Pavlopoulos GA. Flame (v2.0): advanced integration and interpretation of functional enrichment results from multiple sources. Bioinformatics 2023; 39:btad490. [PMID: 37540207 PMCID: PMC10423032 DOI: 10.1093/bioinformatics/btad490] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/17/2023] [Revised: 05/31/2023] [Accepted: 08/03/2023] [Indexed: 08/05/2023] Open
Abstract
Functional enrichment is the process of identifying implicated functional terms from a given input list of genes or proteins. In this article, we present Flame (v2.0), a web tool which offers a combinatorial approach through merging and visualizing results from widely used functional enrichment applications while also allowing various flexible input options. In this version, Flame utilizes the aGOtool, g:Profiler, WebGestalt, and Enrichr pipelines and presents their outputs separately or in combination following a visual analytics approach. For intuitive representations and easier interpretation, it uses interactive plots such as parameterizable networks, heatmaps, barcharts, and scatter plots. Users can also: (i) handle multiple protein/gene lists and analyse union and intersection sets simultaneously through interactive UpSet plots, (ii) automatically extract genes and proteins from free text through text-mining and Named Entity Recognition (NER) techniques, (iii) upload single nucleotide polymorphisms (SNPs) and extract their relative genes, or (iv) analyse multiple lists of differentially expressed proteins/genes after selecting them interactively from a parameterizable volcano plot. Compared to the previous version of 197 supported organisms, Flame (v2.0) currently allows enrichment for 14 436 organisms. AVAILABILITY AND IMPLEMENTATION Web Application: http://flame.pavlopouloslab.info. Code: https://github.com/PavlopoulosLab/Flame. Docker: https://hub.docker.com/r/pavlopouloslab/flame.
Affiliation(s)
- Evangelos Karatzas
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari (Athens), 16672, Greece
- Fotis A Baltoumas
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari (Athens), 16672, Greece
- Eleni Aplakidou
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari (Athens), 16672, Greece
- Panagiota I Kontou
  - Department of Mathematics, University of Thessaly, Lamia, 35100, Greece
  - Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, 35131, Greece
- Panos Stathopoulos
  - 1st Department of Neurology, Eginition Hospital, Athens, 11528, Greece
  - School of Medicine, National and Kapodistrian University of Athens, Athens, 11527, Greece
- Leonidas Stefanis
  - 1st Department of Neurology, Eginition Hospital, Athens, 11528, Greece
- Pantelis G Bagos
  - Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, 35131, Greece
- Georgios A Pavlopoulos
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari (Athens), 16672, Greece
  - Center of Basic Research, Biomedical Research Foundation of the Academy of Athens, Athens, 11527, Greece
  - Hellenic Army Academy, Vari, 16673, Greece

6
Kokoli M, Karatzas E, Baltoumas FA, Schneider R, Pafilis E, Paragkamian S, Doncheva NT, Jensen L, Pavlopoulos G. Arena3D web: interactive 3D visualization of multilayered networks supporting multiple directional information channels, clustering analysis and application integration. NAR Genom Bioinform 2023; 5:lqad053. [PMID: 37260509 PMCID: PMC10227371 DOI: 10.1093/nargab/lqad053] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/15/2023] [Revised: 04/25/2023] [Accepted: 05/18/2023] [Indexed: 06/02/2023] Open
Abstract
Arena3Dweb is an interactive web tool that visualizes multi-layered networks in 3D space. In this update, Arena3Dweb supports directed networks as well as up to nine different types of connections between pairs of nodes with the use of Bézier curves. It comes with different color schemes (light/gray/dark mode), custom channel coloring, four node clustering algorithms which one can run on-the-fly, visualization in VR mode and predefined layer layouts (zig-zag, star and cube). This update also includes enhanced navigation controls (mouse orbit controls, layer dragging and layer/node selection), while its newly developed API allows integration with external applications as well as saving and loading of sessions in JSON format. Finally, a dedicated Cytoscape app has been developed, through which users can automatically send their 2D networks from Cytoscape to Arena3Dweb for 3D multi-layer visualization. Arena3Dweb is accessible at http://arena3d.pavlopouloslab.info or http://arena3d.org.
Affiliation(s)
- Fotis A Baltoumas
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari 16672, Greece
- Reinhard Schneider
  - University of Luxembourg, Luxembourg Centre for Systems Biomedicine, Bioinformatics Core, Esch-sur-Alzette, Luxembourg
- Evangelos Pafilis
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Former U.S. Base of Gournes, Heraklion 71003, Greece
- Savvas Paragkamian
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Former U.S. Base of Gournes, Heraklion 71003, Greece
  - Department of Biology, University of Crete, Voutes University Campus, P.O. Box 2208, 70013 Heraklion, Crete, Greece
- Nadezhda T Doncheva
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen N DK-2200, Denmark
- Lars Juhl Jensen
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen N DK-2200, Denmark

7
Baltoumas FA, Karatzas E, Paez-Espino D, Venetsianou NK, Aplakidou E, Oulas A, Finn RD, Ovchinnikov S, Pafilis E, Kyrpides NC, Pavlopoulos GA. Exploring microbial functional biodiversity at the protein family level-From metagenomic sequence reads to annotated protein clusters. FRONTIERS IN BIOINFORMATICS 2023; 3:1157956. [PMID: 36959975 PMCID: PMC10029925 DOI: 10.3389/fbinf.2023.1157956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 02/21/2023] [Indexed: 03/06/2023] Open
Abstract
Metagenomics has enabled accessing the genetic repertoire of natural microbial communities. Metagenome shotgun sequencing has become the method of choice for studying and classifying microorganisms from various environments. To this end, several methods have been developed to process and analyze the sequence data from raw reads to end-products such as predicted protein sequences or families. In this article, we provide a thorough review to simplify such processes and discuss the alternative methodologies that can be followed in order to explore biodiversity at the protein family level. We provide details for analysis tools and we comment on their scalability as well as their advantages and disadvantages. Finally, we report the available data repositories and recommend various approaches for protein family annotation related to phylogenetic distribution, structure prediction and metadata enrichment.
Affiliation(s)
- Fotis A. Baltoumas
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Evangelos Karatzas
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- David Paez-Espino
  - Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States
- Nefeli K. Venetsianou
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Eleni Aplakidou
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Anastasis Oulas
  - The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Robert D. Finn
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom
- Sergey Ovchinnikov
  - John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, United States
- Evangelos Pafilis
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Heraklion, Greece
- Nikos C. Kyrpides
  - Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States
- Georgios A. Pavlopoulos
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
  - Center of New Biotechnologies and Precision Medicine, Department of Medicine, School of Health Sciences, National and Kapodistrian University of Athens, Athens, Greece
  - Hellenic Army Academy, Vari, Greece

8
Dérozier S, Bossy R, Deléger L, Ba M, Chaix E, Harlé O, Loux V, Falentin H, Nédellec C. Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach. PLoS One 2023; 18:e0272473. [PMID: 36662691 PMCID: PMC9858090 DOI: 10.1371/journal.pone.0272473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Accepted: 01/04/2023] [Indexed: 01/21/2023] Open
Abstract
The dramatic increase in the number of microbe descriptions in databases, reports, and papers presents a two-fold challenge for accessing the information: integration of heterogeneous data in a standard ontology-based representation and normalization of the textual descriptions by semantic analysis. Recent text mining methods offer powerful ways to extract textual information and generate ontology-based representation. This paper describes the design of the Omnicrobe application that gathers comprehensive information on habitats, phenotypes, and usages of microbes from scientific sources of high interest to the microbiology community. The Omnicrobe database contains around 1 million descriptions of microbe properties. These descriptions are created by analyzing and combining six information sources of various kinds, i.e. biological resource catalogs, sequence databases and scientific literature. The microbe properties are indexed by the Ontobiotope ontology and their taxa are indexed by an extended version of the taxonomy maintained by the National Center for Biotechnology Information. The Omnicrobe application covers all domains of microbiology. With simple or rich ontology-based queries, it provides easy-to-use support in the resolution of scientific questions related to the habitats, phenotypes, and uses of microbes. We illustrate the potential of Omnicrobe with a use case from the food innovation domain.
Affiliation(s)
- Sandra Dérozier
  - Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
- Robert Bossy
  - Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
- Louise Deléger
  - Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
- Mouhamadou Ba
  - Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
  - Université Paris-Saclay, INRAE, BioinfOmics, MIGALE Bioinformatics Facility, Jouy-en-Josas, France
- Estelle Chaix
  - Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
- Valentin Loux
  - Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
  - Université Paris-Saclay, INRAE, BioinfOmics, MIGALE Bioinformatics Facility, Jouy-en-Josas, France
- Claire Nédellec
  - Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France

9
Zafeiropoulos H, Beracochea M, Ninidakis S, Exter K, Potirakis A, De Moro G, Richardson L, Corre E, Machado J, Pafilis E, Kotoulas G, Santi I, Finn RD, Cox CJ, Pavloudi C. metaGOflow: a workflow for the analysis of marine Genomic Observatories shotgun metagenomics data. Gigascience 2022; 12:giad078. [PMID: 37850871 PMCID: PMC10583283 DOI: 10.1093/gigascience/giad078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 06/30/2023] [Accepted: 09/11/2023] [Indexed: 10/19/2023] Open
Abstract
BACKGROUND Genomic Observatories (GOs) are sites of long-term scientific study that undertake regular assessments of the genomic biodiversity. The European Marine Omics Biodiversity Observation Network (EMO BON) is a network of GOs that conduct regular biological community samplings to generate environmental and metagenomic data of microbial communities from designated marine stations around Europe. The development of an effective workflow is essential for the analysis of the EMO BON metagenomic data in a timely and reproducible manner. FINDINGS Based on the established MGnify resource, we developed metaGOflow. metaGOflow supports the fast inference of taxonomic profiles from GO-derived data based on ribosomal RNA genes and their functional annotation using the raw reads. Thanks to the Research Object Crate packaging, relevant metadata about the sample under study, and the details of the bioinformatics analysis it has been subjected to, are inherited to the data product while its modular implementation allows running the workflow partially. The analysis of 2 EMO BON samples and 1 Tara Oceans sample was performed as a use case. CONCLUSIONS metaGOflow is an efficient and robust workflow that scales to the needs of projects producing big metagenomic data such as EMO BON. It highlights how containerization technologies along with modern workflow languages and metadata package approaches can support the needs of researchers when dealing with ever-increasing volumes of biological data. Despite being initially oriented to address the needs of EMO BON, metaGOflow is a flexible and easy-to-use workflow that can be broadly used for one-sample-at-a-time analysis of shotgun metagenomics data.
Affiliation(s)
- Haris Zafeiropoulos
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Former U.S. Base of Gournes, 71003 Heraklion, Crete, Greece
  - KU Leuven, Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, Laboratory of Molecular Bacteriology, 3000 Leuven, Belgium
- Martin Beracochea
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stelios Ninidakis
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Former U.S. Base of Gournes, 71003 Heraklion, Crete, Greece
- Katrina Exter
  - Flanders Marine Institute (VLIZ), 8400 Oostende, Belgium
- Antonis Potirakis
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Former U.S. Base of Gournes, 71003 Heraklion, Crete, Greece
- Gianluca De Moro
  - Centro de Ciências do Mar (CCMAR), Universidade do Algarve, Campus de Gambelas, 8005-139 Faro, Portugal
- Lorna Richardson
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Erwan Corre
  - CNRS, FR 2424, ABiMS Platform, Station Biologique de Roscoff (SBR), 29680 Roscoff, France
- João Machado
  - Centro de Ciências do Mar (CCMAR), Universidade do Algarve, Campus de Gambelas, 8005-139 Faro, Portugal
- Evangelos Pafilis
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Former U.S. Base of Gournes, 71003 Heraklion, Crete, Greece
- Georgios Kotoulas
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Former U.S. Base of Gournes, 71003 Heraklion, Crete, Greece
- Ioulia Santi
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Former U.S. Base of Gournes, 71003 Heraklion, Crete, Greece
  - European Marine Biological Resource Centre (EMBRC-ERIC), 75005 Paris, France
- Robert D Finn
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Cymon J Cox
  - Centro de Ciências do Mar (CCMAR), Universidade do Algarve, Campus de Gambelas, 8005-139 Faro, Portugal
- Christina Pavloudi
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Former U.S. Base of Gournes, 71003 Heraklion, Crete, Greece
  - Department of Biological Sciences, The George Washington University, 20052 Washington, DC, USA

10
Ahmed SAJA, Bapatdhar N, Kumar BP, Ghosh S, Yachie A, Palaniappan SK. Large scale text mining for deriving useful insights: A case study focused on microbiome. Front Physiol 2022; 13:933069. [PMID: 36117696 PMCID: PMC9473635 DOI: 10.3389/fphys.2022.933069] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/30/2022] [Accepted: 07/18/2022] [Indexed: 11/23/2022] Open
Abstract
Text mining has been shown to be an auxiliary but key driver for modeling, data harmonization, and interpretation in biomedicine. Scientific literature holds a wealth of information, embodies cumulative knowledge, and remains the core basis on which mechanistic pathways, molecular databases, and models are built and refined. Text mining provides the necessary tools to automatically harness the potential of text. In this study, we show the potential of large-scale text mining for deriving novel insights, with a focus on the growing field of microbiome. We first collected the complete set of abstracts relevant to the microbiome from PubMed and used our text mining and intelligence platform Taxila for analysis. We demonstrate the usefulness of text mining using two case studies. First, we analyze the geographical distribution of research and study locations for the field of microbiome by extracting geo mentions from text. Using this analysis, we were able to draw useful insights on the state of research in microbiome with respect to geographical distribution and economic drivers. Next, to understand the relationships between diseases, microbiome, and food, which are central to the field, we construct semantic relationship networks between these concepts. We show how such networks can yield useful insight with no prior knowledge encoded.
Affiliation(s)
- Samik Ghosh
- SBX Corporation Inc., Tokyo, Japan
- The NLP Group, The Systems Biology Institute, Tokyo, Japan
| | - Ayako Yachie
- SBX Corporation Inc., Tokyo, Japan
- The NLP Group, The Systems Biology Institute, Tokyo, Japan
| | - Sucheendra K. Palaniappan
- SBX Corporation Inc., Tokyo, Japan
- The NLP Group, The Systems Biology Institute, Tokyo, Japan
- *Correspondence: Sucheendra K. Palaniappan,
|
11
|
Nassar M, Rogers AB, Talo' F, Sanchez S, Shafique Z, Finn RD, McEntyre J. A machine learning framework for discovery and enrichment of metagenomics metadata from open access publications. Gigascience 2022; 11:giac077. [PMID: 35950838 PMCID: PMC9366992 DOI: 10.1093/gigascience/giac077] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/16/2022] [Revised: 06/13/2022] [Accepted: 07/12/2022] [Indexed: 11/17/2022] Open
Abstract
Metagenomics is a culture-independent method for studying the microbes inhabiting a particular environment. Comparing the composition of samples (functionally/taxonomically), either from a longitudinal study or cross-sectional studies, can provide clues into how the microbiota has adapted to the environment. However, a recurring challenge, especially when comparing results between independent studies, is that key metadata about the sample and molecular methods used to extract and sequence the genetic material are often missing from sequence records, making it difficult to account for confounding factors. Nevertheless, these missing metadata may be found in the narrative of publications describing the research. Here, we describe a machine learning framework that automatically extracts essential metadata for a wide range of metagenomics studies from the literature contained in Europe PMC. This framework has enabled the extraction of metadata from 114,099 publications in Europe PMC, including 19,900 publications describing metagenomics studies in the European Nucleotide Archive (ENA) and MGnify. Using this framework, a new metagenomics annotations pipeline was developed and integrated into Europe PMC to regularly enrich up-to-date ENA and MGnify metagenomics studies with metadata extracted from research articles. These metadata are now available for researchers to explore and retrieve on the MGnify and Europe PMC websites, as well as through the Europe PMC annotations API.
Affiliation(s)
- Maaly Nassar
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Current affiliation: SciBite - an Elsevier Company, Wellcome Genome Campus, Hinxton, Cambridge CB10 1DR, UK
| | - Alexander B Rogers
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Francesco Talo'
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Santiago Sanchez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Zunaira Shafique
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Johanna McEntyre
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
12
|
New-Generation Sequencing Technology in Diagnosis of Fungal Plant Pathogens: A Dream Comes True? J Fungi (Basel) 2022; 8:jof8070737. [PMID: 35887492 PMCID: PMC9320658 DOI: 10.3390/jof8070737] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/03/2022] [Revised: 07/01/2022] [Accepted: 07/11/2022] [Indexed: 02/01/2023] Open
Abstract
The fast and continued progress of high-throughput sequencing (HTS) and the drastic reduction of its costs have boosted new and unpredictable developments in the field of plant pathology. The cost of whole-genome sequencing, which, until a few years ago, was prohibitive for many projects, is now so affordable that a new branch, phylogenomics, is being developed. Fungal taxonomy is being deeply influenced by genome comparison, too. It is now easier to discover new genes as potential targets for an accurate diagnosis of new or emerging pathogens, notably those of quarantine concern. Similarly, with the development of metabarcoding and metagenomics techniques, it is now possible to unravel complex diseases or answer crucial questions, such as "What's in my soil?", to a good approximation, including fungi, bacteria, nematodes, etc. The new technologies make it possible to redraw disease control strategies by considering the pathogens within their environment and deciphering the complex interactions between microorganisms and the cultivated crops. This kind of analysis usually generates big data that need sophisticated bioinformatic tools (machine learning, artificial intelligence) for their management. Herein, examples of the use of new technologies for research in fungal diversity and diagnosis of some fungal pathogens are reported.
|
13
|
Darling: A Web Application for Detecting Disease-Related Biomedical Entity Associations with Literature Mining. Biomolecules 2022; 12:biom12040520. [PMID: 35454109 PMCID: PMC9028073 DOI: 10.3390/biom12040520] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/01/2022] [Revised: 03/24/2022] [Accepted: 03/28/2022] [Indexed: 12/15/2022] Open
Abstract
Finding, exploring and filtering frequent sentence-based associations between a disease and a biomedical entity, co-mentioned in disease-related PubMed literature, is a challenge as the volume of publications increases. Darling is a web application that uses Named Entity Recognition to identify human-related biomedical terms in PubMed articles mentioned in OMIM, DisGeNET and Human Phenotype Ontology (HPO) disease records, and generates an interactive biomedical entity association network. Nodes in this network represent genes, proteins, chemicals, functions, tissues, diseases, environments and phenotypes. Users can search by identifiers, terms/entities or free text and explore the relevant abstracts in an annotated format.
|