1
Abstract
Large, open-source DNA sequence databases have been generated, in part, through the collection of microbial pathogens by swabbing surfaces in built environments. Analyzing these data in aggregate through public health surveillance requires digitization of the complex, domain-specific metadata associated with the swab site locations. However, swab site location information is currently collected in a single free-text "isolation source" field, which promotes poorly detailed descriptions with varying word order, granularity, and linguistic errors; this makes automation difficult and reduces machine-actionability. We assessed 1,498 free-text swab site descriptions generated during routine foodborne pathogen surveillance. The lexicon of the free-text metadata was evaluated to determine the informational facets and the number of unique terms used by data collectors. Open Biological Ontologies (OBO) Foundry libraries were used to develop hierarchical vocabularies, connected by logical relationships, that describe swab site locations. Content analysis identified five informational facets described by 338 unique terms. Term hierarchy facets were developed, as were statements (called axioms) about how the entities within these five domains are related. The schema developed through this study has been integrated into a publicly available pathogen metadata standard, facilitating ongoing surveillance and investigations. The One Health Enteric Package became available at NCBI BioSample in 2022. The collective use of metadata standards increases the interoperability of DNA sequence databases and enables large-scale approaches to data sharing and artificial intelligence as well as big-data solutions to food safety.

IMPORTANCE The regular analysis of whole-genome sequence data in collections such as NCBI's Pathogen Detection Database is used by many public health organizations to detect outbreaks of infectious disease. However, isolate metadata in these databases are often incomplete and of poor quality. These complex, raw metadata must often be reorganized and manually formatted for use in aggregate analyses. These processes are inefficient and time-consuming, increasing the interpretative labor needed by public health groups to extract actionable information. The future use of open genomic epidemiology networks will be supported through the development of an internationally applicable vocabulary system with which swab site locations can be described.
2
Plomp E, Stantis C, James HF, Cheung C, Snoeck C, Kootker L, Kharobi A, Borges C, Moreiras Reynaga DK, Pospieszny Ł, Fulminante F, Stevens R, Alaica AK, Becker A, de Rochefort X, Salesse K. The IsoArcH initiative: Working towards an open and collaborative isotope data culture in bioarchaeology. Data Brief 2022; 45:108595. [PMID: 36188136 PMCID: PMC9516382 DOI: 10.1016/j.dib.2022.108595]
3
Vitali F, Zinno P, Schifano E, Gori A, Costa A, De Filippo C, Koroušić Seljak B, Panov P, Devirgiliis C, Cavalieri D. Semantics of Dairy Fermented Foods: A Microbiologist’s Perspective. Foods 2022; 11:1939. [PMID: 35804753 PMCID: PMC9265904 DOI: 10.3390/foods11131939] [Received: 05/19/2022] [Revised: 06/17/2022] [Accepted: 06/26/2022]
Abstract
Food ontologies are acquiring a central role in human nutrition, providing a standardized terminology for a proper description of intervention and observational trials. In addition to bioactive molecules, several fermented foods, particularly dairy products, provide the host with live microorganisms, thus carrying potential “genetic/functional” nutrients. To date, a proper ontology to structure and formalize the concepts used to describe fermented foods is lacking. Here we describe a semantic representation of concepts revolving around what consuming fermented foods entails, from both a technological and a health point of view, focusing on kefir and Parmigiano Reggiano as representatives of fresh and ripened dairy products. We included concepts related to the connection of specific microbial taxa to the dairy fermentation process, demonstrating the potential of ontologies to formalize the various gene pathways involved in raw ingredient transformation, connect them to resulting metabolites, and finally to their consequences on the fermented product, including technological, health and sensory aspects. Our work is a step toward a harmonized semantic model for integrating different aspects of modern nutritional science. Such a model, besides formalizing multifaceted knowledge, will be pivotal for a rich annotation of data in public repositories, as a prerequisite to generalized meta-analysis.
Affiliation(s)
- Francesco Vitali
  - Institute of Agricultural Biology and Biotechnology (IBBA), National Research Council (CNR), Via Moruzzi 1, 56124 Pisa, Italy
  - Research Centre for Agriculture and Environment, CREA (Consiglio per la Ricerca in Agricoltura e l’Analisi dell’Economia Agraria), Via di Lanciola 12/A, 50125 Florence, Italy
- Paola Zinno
  - Research Centre for Food and Nutrition, CREA (Consiglio per la Ricerca in Agricoltura e l’Analisi dell’Economia Agraria), Via Ardeatina 546, 00178 Rome, Italy
- Emily Schifano
  - Research Centre for Food and Nutrition, CREA (Consiglio per la Ricerca in Agricoltura e l’Analisi dell’Economia Agraria), Via Ardeatina 546, 00178 Rome, Italy
- Agnese Gori
  - Department of Biology, University of Florence, Via Madonna del Piano 6, 50019 Sesto Fiorentino, Italy
- Ana Costa
  - Department of Biology, University of Florence, Via Madonna del Piano 6, 50019 Sesto Fiorentino, Italy
- Carlotta De Filippo
  - Institute of Agricultural Biology and Biotechnology (IBBA), National Research Council (CNR), Via Moruzzi 1, 56124 Pisa, Italy
- Barbara Koroušić Seljak
  - Computer Systems Department, Jozef Stefan Institute, Jamova Cesta 39, 1000 Ljubljana, Slovenia
- Panče Panov
  - Department of Knowledge Technologies, Jozef Stefan Institute, Jamova Cesta 39, 1000 Ljubljana, Slovenia
- Chiara Devirgiliis
  - Research Centre for Food and Nutrition, CREA (Consiglio per la Ricerca in Agricoltura e l’Analisi dell’Economia Agraria), Via Ardeatina 546, 00178 Rome, Italy
  - Corresponding author
- Duccio Cavalieri
  - Department of Biology, University of Florence, Via Madonna del Piano 6, 50019 Sesto Fiorentino, Italy
  - Corresponding author
4
Min W, Liu C, Xu L, Jiang S. Applications of knowledge graphs for food science and industry. Patterns (N Y) 2022; 3:100484. [PMID: 35607620 PMCID: PMC9122965 DOI: 10.1016/j.patter.2022.100484]
Abstract
The deployment of various networks (e.g., Internet of Things [IoT] and mobile networks), databases (e.g., nutrition tables and food compositional databases), and social media (e.g., Instagram and Twitter) generates huge amounts of food data, which present researchers with an unprecedented opportunity to study various problems and applications in food science and industry via data-driven computational methods. However, these multi-source heterogeneous food data appear as information silos, leading to difficulty in fully exploiting these food data. The knowledge graph provides a unified and standardized conceptual terminology in a structured form, and thus can effectively organize these food data to benefit various applications. In this review, we provide a brief introduction to knowledge graphs and the evolution of food knowledge organization mainly from food ontology to food knowledge graphs. We then summarize seven representative applications of food knowledge graphs, such as new recipe development, diet-disease correlation discovery, and personalized dietary recommendation. We also discuss future directions in this field, such as multimodal food knowledge graph construction and food knowledge graphs for human health.
Affiliation(s)
- Weiqing Min
  - Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Chunlin Liu
  - Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Leyi Xu
  - Soochow University, Suzhou, Jiangsu 215006, China
- Shuqiang Jiang
  - Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
5
Amiri M, Li J, Roy S. Meal Planning for Alzheimer's Disease Using an Ontology-Assisted Multiple Criteria Decision-Making Approach. International Journal of E-Health and Medical Communications 2022. [DOI: 10.4018/ijehmc.316133]
Abstract
As healthy diets and nutrition are crucial for people with Alzheimer's disease (AD), caregivers need to provide a balanced diet with the correct nutrients to boost patients' health and well-being. However, this is challenging: patients are likely to suffer from aging-related problems (such as teeth or gum problems) that make eating uncomfortable, and the meal planners, who are usually patients' family members, generally face high pressure and busy schedules and have little experience. To help non-professional caregivers of patients with AD plan meals with the right nutrition and flavors, the authors propose a meal planning mechanism that uses a multiple criteria decision-making approach to integrate the various factors that affect a caregiver's choice of meals. Ontology-based knowledge is used to model personal preferences and characteristics and to customize general diet recommendations. Case studies demonstrate the feasibility and usability of the proposed approach.
Affiliation(s)
- Juan Li
  - North Dakota State University, USA
6
Kim H, Jung J, Choi J. Developing a Dietary Lifestyle Ontology (DILON) to Improve the Interoperability of Dietary Data: A Proof-of-Concept Study (Preprint). JMIR Form Res 2021; 6:e34962. [PMID: 35451991 PMCID: PMC9073603 DOI: 10.2196/34962] [Received: 11/16/2021] [Revised: 03/12/2022] [Accepted: 03/16/2022]
Abstract
Background: Dietary habits offer crucial information on one's health and form a considerable part of the patient-generated health data. Dietary data are collected through various channels and formats; thus, interoperability is a significant challenge to reusing this type of data. The vast scope of dietary concepts and the colloquial expression style add difficulty to standardizing the data. The interoperability issues of dietary data can be addressed through Common Data Elements with metadata annotation to some extent. However, making culture-specific dietary habits and questionnaire-based dietary assessment data interoperable still requires substantial efforts.

Objective: The main goal of this study was to address the interoperability challenge of questionnaire-based dietary data from different cultural backgrounds by combining ontological curation and metadata annotation of dietary concepts. Specifically, this study aimed to develop a Dietary Lifestyle Ontology (DILON) and demonstrate the improved interoperability of questionnaire-based dietary data by annotating its main semantics with DILON.

Methods: By analyzing 1158 dietary assessment data elements (367 in Korean and 791 in English), 515 dietary concepts were extracted and used to construct DILON. To demonstrate the utility of DILON in addressing the interoperability challenges of questionnaire-based multicultural dietary data, we developed 10 competency questions that asked to identify data elements sharing the same dietary topics and assessment properties. We instantiated 68 data elements on dietary habits selected from Korean and English questionnaires and annotated them with DILON to answer the competency questions. We translated the competency questions into Semantic Query-Enhanced Web Rule Language and reviewed the query results for accuracy.

Results: DILON was built with 262 concept classes and validated with ontology validation tools. A small overlap (72 concepts) in the concepts extracted from the questionnaires in 2 languages indicates that we need to pay closer attention to representing culture-specific dietary concepts. The Semantic Query-Enhanced Web Rule Language queries reflecting the 10 competency questions yielded correct results.

Conclusions: Ensuring the interoperability of dietary lifestyle data is a demanding task due to its vast scope and variations in expression. This study demonstrated that we could improve the interoperability of dietary data generated in different cultural contexts and expressed in various styles by annotating their core semantics with DILON.
Affiliation(s)
- Hyeoneui Kim
  - The Research Institute of Nursing Science, College of Nursing, Seoul National University, Seoul, Republic of Korea
- Jinsun Jung
  - The Research Institute of Nursing Science, College of Nursing, Seoul National University, Seoul, Republic of Korea
- Jisung Choi
  - Samsung Medical Center, Seoul, Republic of Korea
7
Zeb A, Soininen JP, Sozer N. Data harmonisation as a key to enable digitalisation of the food sector: A review. Food and Bioproducts Processing 2021. [DOI: 10.1016/j.fbp.2021.02.005]
8
Youn J, Naravane T, Tagkopoulos I. Using Word Embeddings to Learn a Better Food Ontology. Front Artif Intell 2021; 3:584784. [PMID: 33733222 PMCID: PMC7861243 DOI: 10.3389/frai.2020.584784] [Received: 07/18/2020] [Accepted: 10/14/2020]
Abstract
Food ontologies require significant effort to create and maintain as they involve manual and time-consuming tasks, often with limited alignment to the underlying food science knowledge. We propose a semi-supervised framework for the automated ontology population from an existing ontology scaffold by using word embeddings. Having applied this on the domain of food and subsequent evaluation against an expert-curated ontology, FoodOn, we observe that the food word embeddings capture the latent relationships and characteristics of foods. The resulting ontology, which utilizes word embeddings trained from the Wikipedia corpus, has an improvement of 89.7% in precision when compared to the expert-curated ontology FoodOn (0.34 vs. 0.18, respectively, p value = 2.6 × 10^-138), and it has a 43.6% shorter path distance (hops) between predicted and actual food instances (2.91 vs. 5.16, respectively, p value = 4.7 × 10^-84) when compared to other methods. This work demonstrates how high-dimensional representations of food can be used to populate ontologies and paves the way for learning ontologies that integrate contextual information from a variety of sources and types.
Affiliation(s)
- Jason Youn
  - Department of Computer Science, University of California at Davis, Davis, CA, United States
  - Genome Center, University of California at Davis, Davis, CA, United States
- Tarini Naravane
  - Genome Center, University of California at Davis, Davis, CA, United States
  - Biological Systems Engineering, University of California at Davis, Davis, CA, United States
- Ilias Tagkopoulos
  - Department of Computer Science, University of California at Davis, Davis, CA, United States
  - Genome Center, University of California at Davis, Davis, CA, United States
9
Tao D, Yang P, Feng H. Utilization of text mining as a big data analysis tool for food science and nutrition. Compr Rev Food Sci Food Saf 2020; 19:875-894. [PMID: 33325182 DOI: 10.1111/1541-4337.12540] [Received: 08/22/2019] [Revised: 12/26/2019] [Accepted: 01/13/2020]
Abstract
Big data analysis has found applications in many industries due to its ability to turn huge amounts of data into insights for informed business and operational decisions. Advanced data mining techniques have been applied in many sectors of supply chains in the food industry. However, previous work has mainly focused on the analysis of instrument-generated data such as those from hyperspectral imaging, spectroscopy, and biometric receptors. The importance of digital text data in food and nutrition has only recently gained attention due to advancements in big data analytics. The purpose of this review is to provide an overview of the data sources, computational methods, and applications of text data in the food industry. Text mining techniques such as word-level analysis (e.g., frequency analysis), word association analysis (e.g., network analysis), and advanced techniques (e.g., text classification, text clustering, topic modeling, information retrieval, and sentiment analysis) are discussed. Applications of text data analysis are illustrated with respect to food safety and food fraud surveillance, dietary pattern characterization, consumer-opinion mining, new-product development, food knowledge discovery, food supply-chain management, and online food services. The goal is to provide insights for intelligent decision-making to improve food production, food safety, and human nutrition.
Affiliation(s)
- Dandan Tao
  - Department of Food Science and Human Nutrition, College of Agricultural, Consumer and Environmental Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Pengkun Yang
  - Department of Electrical Engineering, Princeton University, Princeton, New Jersey
- Hao Feng
  - Department of Food Science and Human Nutrition, College of Agricultural, Consumer and Environmental Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois