1
|
Disclosing the Genomic Diversity among Members of the Bifidobacterium Genus of Canine and Feline Origin with Respect to Those from Human. Appl Environ Microbiol 2022; 88:e0203821. [PMID: 35285708 DOI: 10.1128/aem.02038-21] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/22/2022]
Abstract
In recent decades, much scientific attention has been paid to characterizing members of the genus Bifidobacterium due to their well-accepted ability to exert various beneficial effects upon their host. However, despite the well-accepted status of dogs and cats as principal companion animals of humans, the bifidobacterial communities that colonize their gut still represent a rather unexplored research area. To expand and further investigate the bifidobacterial ecosystem inhabiting the canine and feline intestine, strains belonging to this genus were isolated from fecal samples of dogs and cats and subjected to de novo sequencing. The obtained sequencing data, together with publicly available genomes of strains belonging to the same bifidobacterial species as our isolates, of both human and animal origin, were employed for in-depth comparative genome analyses. These phylogenomic investigations highlighted a different degree of genetic variability between human- or pet-derived bifidobacteria depending on the considered species, with B. pseudocatenulatum strains of pet origin showing higher genetic variability than human-derived strains of the same bifidobacterial species. Furthermore, in silico evaluation of metabolic activities coupled with in vitro growth assays revealed the crucial role of diet in driving the genetic assembly of bifidobacteria as a result of their adaptation to the specific ecological niche they colonize. IMPORTANCE Despite cats and dogs being well recognized as the most intimate companion animals to humans, current knowledge of canine and feline gut microbial consortia is still far from complete compared to the significant advances achieved for other microbial ecosystems, such as the human gut microbiota. In this context, a combination of in silico genome-based analysis and in vitro carbohydrate growth assays allowed us to further explore the canine and feline bifidobacterial community with respect to that inhabiting the human intestine. Specifically, these data revealed how strains of different bifidobacterial species seem to have evolved a different degree of host-specific adaptation. In detail, genotypic and phenotypic evidence is provided of how diet can be considered the main factor in this host-specific adaptation.
|
2
|
Chen X, Xie H, Li Z, Cheng G. Topic analysis and development in knowledge graph research: A bibliometric review on three decades. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.02.098] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/30/2022]
|
3
|
Wang L, Xie H, Han W, Yang X, Shi L, Dong J, Jiang K, Wu H. Construction of a knowledge graph for diabetes complications from expert-reviewed clinical evidences. Comput Assist Surg (Abingdon) 2021; 25:29-35. [PMID: 33275462 DOI: 10.1080/24699322.2020.1850866] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/23/2022]
Abstract
A knowledge graph is a structured representation of data that can express entity and relational knowledge. More attention has been paid to the study of clinical knowledge graphs, especially in the field of chronic diseases. However, knowledge graph construction is based mainly on electronic medical records and other data sources, and the authority of the constructed knowledge graph presents some problems. Therefore, with regard to the quality of evidence, this study, in combination with experimental research on systematic evaluation and meta-analysis, presents some new information. On the basis of evidence-based medicine (EBM), the secondary results of systematic evaluations and meta-analyses of social, psychological, and behavioral aspects were extracted as data for the core nodes and edges of a knowledge graph to construct a graph of type 2 diabetes (T2D) and its complications. In this study, relevant lifestyle evidence on risk factors for diabetic retinopathy (DR), diabetic nephropathy (DN), diabetic foot (DF), and diabetic depression (DD), and the results of several relevant clinical tests, including bariatric surgery, myopia, lipid-lowering drugs, lipid-lowering drug duration, blood glucose control, disease course, glycosylated hemoglobin, fasting blood glucose, hypertension, sex, smoking, and other common lifestyle characteristics, were finally extracted. The evidence-based knowledge graph of diabetes mellitus (DM) complications was constructed by extracting relevant diseases, risk factors, risk outcomes, and other diabetes entities, together with the strength of the odds ratio (OR) or relative risk (RR) correlations, from clinical evidence. Moreover, the risk prediction models constructed using a logistic model were incorporated into the knowledge graph to visualize the risk score of DM complications for each user. In short, the EBM-powered construction of the knowledge graph could provide high-quality information to support decisions for the prevention and control of diabetes and its complications.
Affiliation(s)
- Lei Wang
- Department of Medical Informatics, Medical School of Nantong University, Nantong, China
- Huimin Xie
- Department of Medical Informatics, Medical School of Nantong University, Nantong, China
- Wentao Han
- Department of Medical Informatics, Medical School of Nantong University, Nantong, China
- Xiao Yang
- Department of Medical Informatics, Medical School of Nantong University, Nantong, China
- Lili Shi
- Department of Medical Informatics, Medical School of Nantong University, Nantong, China
- Jiancheng Dong
- Department of Medical Informatics, Medical School of Nantong University, Nantong, China
- Kui Jiang
- Department of Medical Informatics, Medical School of Nantong University, Nantong, China
- Huiqun Wu
- Department of Medical Informatics, Medical School of Nantong University, Nantong, China
|
4
|
Comparative genome analyses of Lactobacillus crispatus isolated from different ecological niches reveal an environmental adaptation of this species to the human vaginal environment. Appl Environ Microbiol 2021; 87:AEM.02899-20. [PMID: 33579685 PMCID: PMC8091109 DOI: 10.1128/aem.02899-20] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/11/2022]
Abstract
The vaginal microbiota is defined as the community of bacteria residing in the human vaginal tract. Recent studies have demonstrated that the vaginal microbiota is dominated by members of the Lactobacillus genus, whose relative abundance and microbial taxa composition depend on the health status of this human body site. In particular, among members of this genus, a high prevalence of Lactobacillus crispatus is commonly associated with a healthy vaginal environment. In the current study, we assessed the microbial composition of 94 healthy vaginal microbiome samples through shotgun metagenomics analyses. Based on our results, we observed that L. crispatus was the most representative species and correlated negatively with bacteria involved in vaginal infections. Therefore, we isolated fifteen L. crispatus strains from different environments in which this species is abundant, ranging from vaginal swabs of healthy women to chicken fecal samples. The genomes of these strains were decoded and their genetic content was analyzed and correlated with their physiological features. An extensive comparative genomic analysis, encompassing all publicly available genome sequences of L. crispatus combined with those decoded in this study, revealed a genetic adaptation of strains to their ecological niche. In addition, in vitro growth experiments involving all isolated L. crispatus strains together with a synthetic vaginal microbiota revealed how this species is able to modulate the composition of the vaginal microbial consortia at the strain level. Overall, our findings suggest that L. crispatus plays an important ecological role in reducing the complexity of the vaginal microbiota by depleting pathogenic bacteria. IMPORTANCE The vaginal microbiota is defined as the community of bacteria residing in the human vaginal tract. Recent studies have demonstrated that a high prevalence of Lactobacillus crispatus is commonly associated with a healthy vaginal environment. In the current study, we assessed the microbial composition of 94 public healthy vaginal samples through shotgun metagenomics analyses. Results showed that L. crispatus was the most representative species and correlated negatively with bacteria involved in vaginal infections. Moreover, we isolated and sequenced the genomes of new L. crispatus strains from different environments, and the comparative genomics analysis revealed a genetic adaptation of strains to their ecological niche. In addition, in vitro growth experiments demonstrated the capability of this species to modulate the composition of the vaginal microbial consortia. Overall, our findings suggest an ecological role played by L. crispatus in reducing the complexity of the vaginal microbiota toward a depletion of pathogenic bacteria.
|
5
|
Zhang L, Hu J, Xu Q, Li F, Rao G, Tao C. A semantic relationship mining method among disorders, genes, and drugs from different biomedical datasets. BMC Med Inform Decis Mak 2020; 20:283. [PMID: 33317518 PMCID: PMC7734713 DOI: 10.1186/s12911-020-01274-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/21/2020] [Accepted: 09/22/2020] [Indexed: 12/16/2022]
Abstract
BACKGROUND Semantic web technology has been applied widely in the biomedical informatics field. Large numbers of biomedical datasets are available online in the resource description framework (RDF) format. Semantic relationship mining among genes, disorders, and drugs is widely used in, for example, precision medicine and drug repositioning. However, most existing studies have focused on a single dataset. It is not easy to find the most current disorder-gene-drug relationships, since they are distributed across heterogeneous datasets. How to mine these semantic relationships from different biomedical datasets is an important issue. METHODS First, a variety of biomedical datasets were converted into RDF triple data; then, multisource biomedical datasets were integrated into a storage system using a data integration algorithm. Second, nine query patterns among genes, disorders, and drugs from different biomedical datasets were designed. Third, the gene-disorder-drug semantic relationship mining algorithm is presented. This algorithm can query the relationships among various entities from different datasets. RESULTS AND CONCLUSIONS We focused on mining the putative and the most current disorder-gene-drug relationships for Parkinson's disease (PD). The results demonstrate that our method has significant advantages in mining and integrating multisource heterogeneous biomedical datasets. Twenty-five new relationships among the genes, disorders, and drugs were mined from four different datasets. The query results showed that most of them came from different datasets. The precision of the method increased by 2.51% compared to that of the multisource linked open data fusion method presented in the 4th International Workshop on Semantics-Powered Data Mining and Analytics (SEPDA 2019). Moreover, the number of query results increased by 7.7%, and the number of correct queries increased by 9.5%.
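The integrate-then-query idea in the METHODS above can be illustrated with a minimal sketch. It is not the authors' algorithm: the `Statement` type, the `integrate` and `disorder_gene_drug` functions, the predicate names `associated_with` and `targets`, and all identifiers and dataset names are hypothetical, and a plain Python set stands in for a real RDF store. The sketch tags each triple with its source dataset and then follows one drug-gene-disorder pattern across datasets.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One triple plus the (hypothetical) dataset it came from."""
    subject: str
    predicate: str
    obj: str
    source: str


def integrate(datasets):
    """Merge triples from several datasets into one store, keeping provenance."""
    store = set()
    for source, triples in datasets.items():
        for subject, predicate, obj in triples:
            store.add(Statement(subject, predicate, obj, source))
    return store


def disorder_gene_drug(store, disorder):
    """One illustrative query pattern: drug -targets-> gene -associated_with-> disorder."""
    gene_links = [s for s in store
                  if s.predicate == "associated_with" and s.obj == disorder]
    drug_links = [s for s in store if s.predicate == "targets"]
    paths = []
    for gene in gene_links:
        for drug in drug_links:
            if drug.obj == gene.subject:
                # Record both sources so cross-dataset paths are visible.
                paths.append((drug.subject, gene.subject, disorder,
                              drug.source, gene.source))
    return sorted(paths)


# Invented example data; none of these identifiers come from the paper.
datasets = {
    "dataset_a": [("gene:G1", "associated_with", "disorder:PD")],
    "dataset_b": [("drug:D1", "targets", "gene:G1"),
                  ("drug:D2", "targets", "gene:G2")],
}
store = integrate(datasets)
paths = disorder_gene_drug(store, "disorder:PD")
```

Keeping the source on every statement is what lets a result be reported as spanning two datasets, which is the property the abstract highlights.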
Affiliation(s)
- Li Zhang
- School of Economics and Management, Tianjin University of Science and Technology, Tianjin, 300457 China
- Jiamei Hu
- School of Economics and Management, Tianjin University of Science and Technology, Tianjin, 300457 China
- Qianzhi Xu
- School of Economics and Management, Tianjin University of Science and Technology, Tianjin, 300457 China
- Fang Li
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin St Suite 600, Houston, TX 77030 USA
- Guozheng Rao
- College of Intelligence and Computing, Tianjin University, Tianjin, 300350 China
- Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin, 300350 China
- Cui Tao
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin St Suite 600, Houston, TX 77030 USA
|
6
|
Lugli GA, Tarracchini C, Alessandri G, Milani C, Mancabelli L, Turroni F, Neuzil-Bunesova V, Ruiz L, Margolles A, Ventura M. Decoding the Genomic Variability among Members of the Bifidobacterium dentium Species. Microorganisms 2020; 8:E1720. [PMID: 33152994 PMCID: PMC7693768 DOI: 10.3390/microorganisms8111720] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/30/2020] [Revised: 10/27/2020] [Accepted: 10/30/2020] [Indexed: 12/16/2022]
Abstract
Members of the Bifidobacterium dentium species are usually identified in the oral cavity of humans and associated with the development of plaque and dental caries. Nevertheless, they have also been detected in fecal samples, highlighting a widespread distribution among mammals. To explore the genetic variability of this species, we isolated and sequenced the genomes of 18 different B. dentium strains collected from fecal samples of several primate species and an Ursus arctos. Thus, we investigated the genomic variability and metabolic abilities of the new B. dentium isolates together with 20 public genome sequences. Comparative genomic analyses provided insights into the vast metabolic repertoire of the species, highlighting 19 glycosyl hydrolase families shared by all analyzed strains. Phylogenetic analysis of the B. dentium taxon, involving 1140 conserved genes, revealed a very close phylogenetic relatedness among members of this species. Furthermore, low genomic variability between strains was also confirmed by an average nucleotide identity analysis showing values higher than 98.2%. Investigating the genetic features of each strain, few putative functional mobile elements were identified. In addition, a consistent occurrence of defense mechanisms such as CRISPR-Cas and restriction-modification systems may be responsible for the high genome synteny identified among members of this taxon.
Affiliation(s)
- Gabriele Andrea Lugli
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, 43124 Parma, Italy
- Chiara Tarracchini
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, 43124 Parma, Italy
- Giulia Alessandri
- Department of Veterinary Medical Science, University of Parma, 43126 Parma, Italy
- Christian Milani
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, 43124 Parma, Italy
- Microbiome Research Hub, University of Parma, 13121 Parma, Italy
- Leonardo Mancabelli
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, 43124 Parma, Italy
- Francesca Turroni
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, 43124 Parma, Italy
- Vera Neuzil-Bunesova
- Department of Microbiology, Nutrition and Dietetics, Czech University of Life Sciences Prague, Kamycka 129, 16500 Prague, Czech Republic
- Lorena Ruiz
- Department of Microbiology and Biochemistry, Dairy Research Institute of Asturias, Spanish National Research Council (IPLA-CSIC), Paseo Río Linares s/n, Villaviciosa, 33300 Asturias, Spain
- MicroHealth Group, Instituto de Investigación Sanitaria del Principado de Asturias (ISPA), Oviedo, 33011 Asturias, Spain
- Abelardo Margolles
- Department of Microbiology and Biochemistry, Dairy Research Institute of Asturias, Spanish National Research Council (IPLA-CSIC), Paseo Río Linares s/n, Villaviciosa, 33300 Asturias, Spain
- MicroHealth Group, Instituto de Investigación Sanitaria del Principado de Asturias (ISPA), Oviedo, 33011 Asturias, Spain
- Marco Ventura
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, 43124 Parma, Italy
- Microbiome Research Hub, University of Parma, 13121 Parma, Italy
|
7
|
Du J, Li X. A Knowledge Graph of Combined Drug Therapies Using Semantic Predications From Biomedical Literature: Algorithm Development. JMIR Med Inform 2020; 8:e18323. [PMID: 32343247 PMCID: PMC7218597 DOI: 10.2196/18323] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/21/2020] [Revised: 03/26/2020] [Accepted: 03/29/2020] [Indexed: 11/13/2022]
Abstract
BACKGROUND Combination therapy plays an important role in the effective treatment of malignant neoplasms and precision medicine. Numerous clinical studies have been carried out to investigate combination drug therapies. Automated knowledge discovery of these combinations and their graphic representation in knowledge graphs will enable pattern recognition and identification of drug combinations used to treat a specific type of cancer, improve drug efficacy and treatment of human disorders. OBJECTIVE This paper aims to develop an automated, visual approach to discover knowledge about combination therapies from biomedical literature, especially from those studies with high-level evidence such as clinical trial reports and clinical practice guidelines. METHODS Based on semantic predications, which consist of a triple structure of subject-predicate-object (SPO), we proposed an automated algorithm to discover knowledge of combination drug therapies using the following rules: 1) two or more semantic predications (S1-P-O and Si-P-O, i = 2, 3…) can be extracted from one conclusive claim (sentence) in the abstract of a given publication, and 2) these predications have an identical predicate (that closely relates to human disease treatment, eg, "treat") and object (eg, disease name) but different subjects (eg, drug names). A customized knowledge graph organizes and visualizes these combinations, improving the traditional semantic triples. After automatic filtering of broad concepts such as "pharmacologic actions" and generic disease names, a set of combination drug therapies were identified and characterized through manual interpretation. RESULTS We retrieved 22,263 clinical trial reports and 31 clinical practice guidelines from PubMed abstracts by searching "antineoplastic agents" for drug restriction (published between Jan 2009 and Oct 2019). 
There were 15,603 conclusive claims locally parsed using the search terms "conclusion*" and "conclude*" ready for semantic predications extraction by SemRep, and 325 candidate groups of semantic predications about combined medications were automatically discovered within 316 conclusive claims. Based on manual analysis, we determined that 255/316 claims (78.46%) were accurately identified as describing combination therapies and adopted these to construct the customized knowledge graph. We also identified two categories (and 4 subcategories) to characterize the inaccurate results: limitations of SemRep and limitations of proposal. We further learned the predominant patterns of drug combinations based on mechanism of action for new combined medication studies and discovered 4 obvious markers ("combin*," "coadministration," "co-administered," and "regimen") to identify potential combination therapies to enable development of a machine learning algorithm. CONCLUSIONS Semantic predications from conclusive claims in the biomedical literature can be used to support automated knowledge discovery and knowledge graph construction for combination therapies. A machine learning approach is warranted to take full advantage of the identified markers and other contextual features.
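The two rules stated in the METHODS above (several predications from one conclusive sentence, sharing predicate and object but differing in subject) can be sketched as a simple grouping step. This is an illustration only, not the authors' implementation: the `Predication` type, the `find_combinations` function, the `TREATS` predicate label, and the example drugs and sentence identifiers are all assumptions made for the sketch.

```python
from collections import defaultdict
from typing import NamedTuple


class Predication(NamedTuple):
    """A subject-predicate-object triple tied to the sentence it came from."""
    sentence_id: str
    subject: str
    predicate: str
    obj: str


def find_combinations(predications, treatment_predicates=frozenset({"TREATS"})):
    """Group predications by (sentence, predicate, object) and keep groups
    with two or more distinct subjects, i.e. candidate drug combinations."""
    groups = defaultdict(set)
    for p in predications:
        if p.predicate in treatment_predicates:
            groups[(p.sentence_id, p.predicate, p.obj)].add(p.subject)
    return {key: sorted(subjects)
            for key, subjects in groups.items()
            if len(subjects) >= 2}


# Invented example predications; none are taken from the paper's data.
predications = [
    Predication("s1", "drug_a", "TREATS", "disease_x"),
    Predication("s1", "drug_b", "TREATS", "disease_x"),
    Predication("s2", "drug_c", "TREATS", "disease_x"),
    Predication("s1", "drug_a", "INTERACTS_WITH", "drug_b"),
]
combos = find_combinations(predications)
```

Here only sentence `s1` yields a candidate group, because `s2` contributes a single subject and the `INTERACTS_WITH` predication is not a treatment predicate. The abstract's later filtering of broad concepts and manual interpretation would follow this step.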
Affiliation(s)
- Jian Du
- National Institute of Health Data Science, Peking University, Beijing, China
- Xiaoying Li
- Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing, China
|
8
|
Lugli GA, Duranti S, Albert K, Mancabelli L, Napoli S, Viappiani A, Anzalone R, Longhi G, Milani C, Turroni F, Alessandri G, Sela DA, van Sinderen D, Ventura M. Unveiling Genomic Diversity among Members of the Species Bifidobacterium pseudolongum, a Widely Distributed Gut Commensal of the Animal Kingdom. Appl Environ Microbiol 2019; 85:e03065-18. [PMID: 30737347 PMCID: PMC6450028 DOI: 10.1128/aem.03065-18] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 12/20/2018] [Accepted: 02/03/2019] [Indexed: 12/31/2022]
Abstract
Bifidobacteria are commensals of the animal gut and are commonly found in mammals, birds, and social insects. Specifically, strains of Bifidobacterium adolescentis, Bifidobacterium bifidum, Bifidobacterium longum, and Bifidobacterium pseudolongum are widely distributed in the mammalian gut. In this context, we investigated the genetic variability and metabolic abilities of the B. pseudolongum taxon, whose genomic characterization has so far not received much attention. Phylogenomic analysis of the genome sequences of 60 B. pseudolongum strains revealed that B. pseudolongum subsp. globosum and B. pseudolongum subsp. pseudolongum may actually represent two distinct bifidobacterial species. Furthermore, our analysis highlighted metabolic differences between members of these two subspecies. Moreover, comparative analyses of genetic strategies to prevent invasion of foreign DNA revealed that the B. pseudolongum subsp. globosum group exhibits greater genome plasticity. In fact, the obtained findings indicate that B. pseudolongum subsp. globosum is more adaptable to different ecological niches, such as the mammalian and avian gut, than is B. pseudolongum subsp. pseudolongum. IMPORTANCE Currently, little information exists on the genetics of the B. pseudolongum taxon due to the limited number of sequenced genomes belonging to this species. In order to survey genome variability within this species and explore how members of this taxon evolved as commensals of the animal gut, we isolated and decoded the genomes of 51 newly isolated strains. Comparative genomics coupled with growth profiles on different carbohydrates has further provided insights concerning the genotype and phenotype of members of the B. pseudolongum taxon.
Affiliation(s)
- Gabriele Andrea Lugli
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Sabrina Duranti
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Korin Albert
- Department of Food Science, University of Massachusetts, Amherst, Massachusetts, USA
- Molecular and Cellular Biology Graduate Program, University of Massachusetts, Amherst, Massachusetts, USA
- Leonardo Mancabelli
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Stefania Napoli
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Rosaria Anzalone
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Christian Milani
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Francesca Turroni
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Microbiome Research Hub, University of Parma, Parma, Italy
- Giulia Alessandri
- Department of Veterinary Medical Science, University of Parma, Parma, Italy
- David A Sela
- Department of Food Science, University of Massachusetts, Amherst, Massachusetts, USA
- Molecular and Cellular Biology Graduate Program, University of Massachusetts, Amherst, Massachusetts, USA
- Department of Microbiology and Physiological Systems, University of Massachusetts Medical School, Worcester, Massachusetts, USA
- Douwe van Sinderen
- APC Microbiome Institute and School of Microbiology, Bioscience Institute, National University of Ireland, Cork, Ireland
- Marco Ventura
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Microbiome Research Hub, University of Parma, Parma, Italy
|
9
|
Hatz S, Spangler S, Bender A, Studham M, Haselmayer P, Lacoste AMB, Willis VC, Martin RL, Gurulingappa H, Betz U. Identification of pharmacodynamic biomarker hypotheses through literature analysis with IBM Watson. PLoS One 2019; 14:e0214619. [PMID: 30958864 PMCID: PMC6453528 DOI: 10.1371/journal.pone.0214619] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/21/2018] [Accepted: 03/16/2019] [Indexed: 12/12/2022]
Abstract
BACKGROUND Pharmacodynamic biomarkers are becoming increasingly valuable for assessing drug activity and target modulation in clinical trials. However, identifying quality biomarkers is challenging due to the increasing volume and heterogeneity of relevant data describing the biological networks that underlie disease mechanisms. A biological pathway network typically includes entities (e.g. genes, proteins and chemicals/drugs) as well as the relationships between these and is typically curated or mined from structured databases and textual co-occurrence data. We propose a hybrid Natural Language Processing and directed relationships-based network analysis approach using IBM Watson for Drug Discovery to rank all human genes and identify potential candidate biomarkers, requiring only an initial determination of a specific target-disease relationship. METHODS Through natural language processing of scientific literature, Watson for Drug Discovery creates a network of semantic relationships between biological concepts such as genes, drugs, and diseases. Using Bruton's tyrosine kinase as a case study, Watson for Drug Discovery's automatically extracted relationship network was compared with a prominent manually curated physical interaction network. Additionally, potential biomarkers for Bruton's tyrosine kinase inhibition were predicted using a matrix factorization approach and subsequently compared with expert-generated biomarkers. RESULTS Watson's natural language processing generated a relationship network matching 55 (86%) genes upstream of BTK and 98 (95%) genes downstream of Bruton's tyrosine kinase in a prominent manually curated physical interaction network. Matrix factorization analysis predicted 11 of 13 genes identified by Merck subject matter experts in the top 20% of Watson for Drug Discovery's 13,595 ranked genes, with 7 in the top 5%. 
CONCLUSION Taken together, these results suggest that Watson for Drug Discovery's automatic relationship network identifies the majority of upstream and downstream genes in biological pathway networks and can be used to help with the identification and prioritization of pharmacodynamic biomarker evaluation, accelerating the early phases of disease hypothesis generation.
Affiliation(s)
- Sonja Hatz
- Merck KGaA, Frankfurter Straße, Darmstadt, Germany
- Scott Spangler
- IBM Watson Health, Almaden, California, United States of America
- Andrew Bender
- EMD Serono, Middlesex Turnpike, Billerica, United States of America
- Matthew Studham
- EMD Serono, Middlesex Turnpike, Billerica, United States of America
- Van C. Willis
- IBM Watson Health, Cambridge, Massachusetts, United States of America
- Richard L. Martin
- IBM Watson Health, Cambridge, Massachusetts, United States of America
- Ulrich Betz
- Merck KGaA, Frankfurter Straße, Darmstadt, Germany
|
10
|
Dissecting the Evolutionary Development of the Species Bifidobacterium animalis through Comparative Genomics Analyses. Appl Environ Microbiol 2019; 85:AEM.02806-18. [PMID: 30709821 DOI: 10.1128/aem.02806-18] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/21/2018] [Accepted: 01/28/2019] [Indexed: 12/20/2022]
Abstract
Bifidobacteria are members of the gut microbiota of animals, including mammals, birds, and social insects. In this study, we analyzed and determined the pangenome of Bifidobacterium animalis species, encompassing B. animalis subsp. animalis and the B. animalis subsp. lactis taxon, which is one of the most intensely exploited probiotic bifidobacterial species. In order to reveal differences within the B. animalis species, detailed comparative genomics and phylogenomics analyses were performed, indicating that these two subspecies recently arose through divergent evolutionary events. A subspecies-specific core genome was identified for both B. animalis subspecies, revealing the existence of subspecies-defining genes involved in carbohydrate metabolism. Notably, these in silico analyses coupled with carbohydrate profiling assays suggest genetic adaptations toward a distinct glycan milieu for each member of the B. animalis subspecies, resulting in a divergent evolutionary development of the two subspecies. IMPORTANCE The majority of characterized B. animalis strains have been isolated from human fecal samples. In order to explore genome variability within this species, we isolated 15 novel strains from the gastrointestinal tracts of different animals, including mammals and birds. The present study allowed us to reconstruct the pangenome of this taxon, including the genome contents of 56 B. animalis strains. Through careful assessment of subspecies-specific core genes of the B. animalis subsp. animalis/lactis taxon, we identified genes encoding enzymes involved in carbohydrate transport and metabolism, while unveiling specific gene acquisition and loss events that caused the evolutionary emergence of these two subspecies.
|
11
|
Thilakaratne M, Falkner K, Atapattu T. A systematic review on literature-based discovery workflow. PeerJ Comput Sci 2019; 5:e235. [PMID: 33816888 PMCID: PMC7924697 DOI: 10.7717/peerj-cs.235] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 07/08/2019] [Accepted: 10/17/2019] [Indexed: 05/02/2023]
Abstract
As scientific publication rates increase, knowledge acquisition and the research development process have become more complex and time-consuming. Literature-Based Discovery (LBD), which supports automated knowledge discovery, facilitates this process by eliciting novel knowledge through the analysis of existing scientific literature. This systematic review provides a comprehensive overview of the LBD workflow by answering nine research questions related to its major components (i.e., input, process, output, and evaluation). With regard to the input component, we discuss the data types and data sources used in the literature. For the process component, we present filtering techniques, ranking/thresholding techniques, domains, generalisability levels, and resources. For the output component, we focus on the visualisation techniques used in the LBD discipline. For the evaluation component, we outline the evaluation techniques, their generalisability, and the quantitative measures used to validate results. To conclude, we summarise the findings of the review for each component and highlight possible future research directions.
Affiliation(s)
- Menasha Thilakaratne: Faculty of Engineering, Computer and Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Katrina Falkner: Faculty of Engineering, Computer and Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Thushari Atapattu: Faculty of Engineering, Computer and Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
|
12
|
Vlietstra WJ, Vos R, Sijbers AM, van Mulligen EM, Kors JA. Using predicate and provenance information from a knowledge graph for drug efficacy screening. J Biomed Semantics 2018; 9:23. [PMID: 30189889 PMCID: PMC6127943 DOI: 10.1186/s13326-018-0189-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 04/10/2018] [Accepted: 08/01/2018] [Indexed: 12/11/2022] Open
Abstract
Background: Biomedical knowledge graphs have become important tools to computationally analyse the comprehensive body of biomedical knowledge. They represent knowledge as subject-predicate-object triples, in which the predicate indicates the relationship between subject and object. A triple can also contain provenance information, which consists of references to the sources of the triple (e.g. scientific publications or database entries). Knowledge graphs have been used to classify drug-disease pairs for drug efficacy screening, but existing computational methods have often ignored predicate and provenance information. Using this information, we aimed to develop a supervised machine learning classifier and determine the added value of predicate and provenance information for drug efficacy screening. To ensure the biological plausibility of our method, we performed our research on the protein level, where drugs are represented by their drug target proteins, and diseases by their disease proteins.
Results: Using random forests with repeated 10-fold cross-validation, our method achieved areas under the ROC curve (AUCs) of 78.1% and 74.3% for two reference sets. We benchmarked against a state-of-the-art knowledge-graph technique that does not use predicate and provenance information, obtaining AUCs of 65.6% and 64.6%, respectively. Classifiers that used only predicate information outperformed classifiers that used only provenance information, but using both performed best.
Conclusion: We conclude that both predicate and provenance information provide added value for drug efficacy screening.
Electronic supplementary material: The online version of this article (10.1186/s13326-018-0189-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Wytze J Vlietstra: Department of Medical Informatics, Erasmus University Medical Centre, Rotterdam, 3015 GE, the Netherlands
- Rein Vos: Department of Medical Informatics, Erasmus University Medical Centre, Rotterdam, 3015 GE, the Netherlands; Department of Methodology and Statistics, Maastricht University, Maastricht, 6200 MD, the Netherlands
- Anneke M Sijbers: Centre for Molecular and Biomolecular Informatics, Radboudumc, Nijmegen, 6525 GA, the Netherlands
- Erik M van Mulligen: Department of Medical Informatics, Erasmus University Medical Centre, Rotterdam, 3015 GE, the Netherlands
- Jan A Kors: Department of Medical Informatics, Erasmus University Medical Centre, Rotterdam, 3015 GE, the Netherlands
|