1
Cui Y, Riley M, Moreno MV, Cepeda MM, Perez IA, Wen Y, Lim LX, Andre E, Nguyen A, Liu C, Lerno L, Nichols PK, Schmitz H, Tagkopoulos I, Kennedy JA, Oberholster A, Siegel JB. Discovery of Potent Glycosidases Enables Quantification of Smoke-Derived Phenolic Glycosides through Enzymatic Hydrolysis. J Agric Food Chem 2024. [PMID: 38728580 DOI: 10.1021/acs.jafc.4c01247] [Indexed: 05/12/2024]
Abstract
When grapes are exposed to wildfire smoke, certain smoke-related volatile phenols (VPs) can be absorbed into the fruit, where they can then be converted into VP glycosides through glycosylation. These VP glycosides are particularly problematic for winemaking because they can be hydrolyzed, releasing VPs that contribute to smoke-related off-flavors. Current methods for quantitating VP glycosides present several challenges, including the need for expensive capital equipment, limited accuracy due to the molecular complexity of the glycosides, and the use of harsh reagents. To address these challenges, we propose an enzymatic hydrolysis method enabled by a tailored enzyme cocktail of novel glycosidases discovered through genome mining; the VPs released from the VP glycosides can then be quantitated by gas chromatography-mass spectrometry (GC-MS). The enzyme cocktails displayed high activities and a broad substrate scope when tested on commercially available VP glycosides. When evaluated in an industrially relevant matrix of Cabernet Sauvignon wine and grapes, the enzymatic cocktail consistently achieved efficacy comparable to that of acid hydrolysis. The proposed method offers a simple, safe, and affordable option for smoke taint analysis.
Affiliation(s)
- Youtian Cui
  - Genome Center, University of California, Davis, California 95616, United States
  - VinZymes, LLC, Davis, California 95616, United States
- Mary Riley
  - Genome Center, University of California, Davis, California 95616, United States
  - Microbiology Graduate Group, University of California, Davis, California 95616, United States
- Marcus V Moreno
  - Genome Center, University of California, Davis, California 95616, United States
- Mateo M Cepeda
  - Department of Chemistry, University of California, Davis, California 95616, United States
- Ignacio Arias Perez
  - Department of Viticulture & Enology, University of California, Davis, California 95616, United States
- Yan Wen
  - Department of Viticulture & Enology, University of California, Davis, California 95616, United States
- Lik Xian Lim
  - Department of Food Science & Technology, University of California, Davis, California 95616, United States
  - UC Davis Coffee Center, University of California, Davis, California 95616, United States
- Eric Andre
  - Genome Center, University of California, Davis, California 95616, United States
- An Nguyen
  - Genome Center, University of California, Davis, California 95616, United States
- Cody Liu
  - Genome Center, University of California, Davis, California 95616, United States
- Larry Lerno
  - Department of Viticulture & Enology, University of California, Davis, California 95616, United States
  - Food Safety and Measurement Facility, University of California, Davis, California 95616, United States
- Harold Schmitz
  - March Capital US, LLC, Davis, California 95616, United States
  - T.O.P., LLC, Davis, California 95616, United States
  - Graduate School of Management, University of California, Davis, California 95616, United States
- Ilias Tagkopoulos
  - Genome Center, University of California, Davis, California 95616, United States
  - Department of Computer Science, USDA/NSF AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, California 95616, United States
  - PIPA, LLC, Davis, California 95616, United States
- Anita Oberholster
  - Department of Viticulture & Enology, University of California, Davis, California 95616, United States
- Justin B Siegel
  - Genome Center, University of California, Davis, California 95616, United States
  - Microbiology Graduate Group, University of California, Davis, California 95616, United States
  - Department of Chemistry, University of California, Davis, California 95616, United States
  - Department of Biochemistry and Molecular Medicine, University of California, Davis, California 95616, United States

2
Tan CE, Neupane BP, Wen Y, Lim LX, Medina Plaza C, Oberholster A, Tagkopoulos I. Volatile Organic Compound-Based Predictive Modeling of Smoke Taint in Wine. J Agric Food Chem 2024; 72:8060-8071. [PMID: 38533667 PMCID: PMC11010234 DOI: 10.1021/acs.jafc.3c07019] [Received: 09/27/2023] [Revised: 03/11/2024] [Accepted: 03/12/2024] [Indexed: 03/28/2024]
Abstract
Smoke taint in wine has become a critical issue in the wine industry due to its significant negative impact on wine quality. Data-driven approaches, including univariate analysis and predictive modeling, are applied to a data set containing concentrations of 20 VOCs in 48 grape samples and 56 corresponding wine samples with a taster-evaluated smoke taint index. The resulting regression models for predicting the smoke taint index of wines are highly predictive when the inputs are log-transformed VOC concentrations from both grapes and wines (Pearson correlation coefficient, PCC = 0.82; R² = 0.68) and less so when only grape VOCs are used (PCC = 0.76; R² = 0.56). The classification models also show the capacity to detect smoke-tainted wines using both wine and grape VOC concentrations (Recall = 0.76; Precision = 0.92; F1 = 0.82) or using only grape VOC concentrations (Recall = 0.74; Precision = 0.92; F1 = 0.80). The performance of the predictive model shows the possibility of predicting the smoke taint index of the wine and grape samples before fermentation. The corresponding code for data analysis and predictive modeling of smoke taint in wine is available in the GitHub repository (https://github.com/IBPA/smoke_taint_prediction).
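The abstract reports both a Pearson correlation coefficient and an R² for the regression models. The two answer different questions: PCC measures how linearly related predictions and observations are, while R² measures how close the predictions are to the observed values. The sketch below computes both in plain Python. The function names and the sample numbers are illustrative assumptions, not code or data from the linked repository.

```python
import math
from typing import Sequence


def pearson(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation coefficient between observations and predictions."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return cov / (sd_t * sd_p)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical smoke taint indices (made-up values, for illustration only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]  # perfectly correlated but systematically too high

print("PCC:", pearson(observed, predicted))
print("R^2:", r_squared(observed, predicted))
```

Predictions that track the observations but are off in scale score a high PCC and a poor R², which is one reason to report the two together.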
Affiliation(s)
- Cheng-En Tan
  - Department of Computer Science, University of California, Davis, Davis, California 95616, United States
  - Genome Center, University of California, Davis, Davis, California 95616, United States
  - USDA/NSF AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, Davis, California 95616, United States
- Bishnu Prasad Neupane
  - Department of Viticulture and Enology, University of California, Davis, Davis, California 95616, United States
- Yan Wen
  - Department of Viticulture and Enology, University of California, Davis, Davis, California 95616, United States
- Lik Xian Lim
  - Department of Viticulture and Enology, University of California, Davis, Davis, California 95616, United States
- Cristina Medina Plaza
  - Department of Viticulture and Enology, University of California, Davis, Davis, California 95616, United States
- Anita Oberholster
  - Department of Viticulture and Enology, University of California, Davis, Davis, California 95616, United States
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, Davis, California 95616, United States
  - Genome Center, University of California, Davis, Davis, California 95616, United States
  - USDA/NSF AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, Davis, California 95616, United States

3
Furuya H, Nguyen CT, Chan T, Marusina AI, Merleev AA, Garcia-Hernandez MDLL, Hsieh SL, Tsokos GC, Ritchlin CT, Tagkopoulos I, Maverakis E, Adamopoulos IE. IL-23 induces CLEC5A+ IL-17A+ neutrophils and elicit skin inflammation associated with psoriatic arthritis. J Autoimmun 2024; 143:103167. [PMID: 38301504 PMCID: PMC10981569 DOI: 10.1016/j.jaut.2024.103167] [Received: 10/04/2023] [Revised: 01/04/2024] [Accepted: 01/09/2024] [Indexed: 02/03/2024]
Abstract
IL-23 activation of IL-17-producing T cells is involved in many rheumatic diseases. Herein, we investigate the role of IL-23 in the activation of myeloid cell subsets that contribute to skin inflammation in mice and man. IL-23 gene transfer in WT and IL-23RGFP reporter mice, followed by analysis with spectral cytometry, shows that IL-23 regulates early innate immune events by inducing the expansion of a myeloid MDL1+CD11b+Ly6G+ population that dictates epidermal hyperplasia, acanthosis, and parakeratosis, which are hallmark pathologic features of psoriasis. Genetic ablation of MDL-1, a major PU.1 transcriptional target during myeloid differentiation that is exclusively expressed in myeloid cells, completely prevents IL-23-pathology. Moreover, we show that IL-23-induced myeloid subsets are also capable of producing IL-17A, that IL-23R+MDL1+ cells are present in the involved skin of psoriasis patients, and that gene expression correlations between IL-23 and MDL-1 have been validated in multiple patient cohorts. Collectively, our data demonstrate a novel role of IL-23 in MDL-1-myelopoiesis that is responsible for skin inflammation and related pathologies. Our data open a new avenue of investigation into the role of IL-23 in the activation of myeloid immunoreceptors and their role in autoimmunity.
Affiliation(s)
- Hiroki Furuya
  - Department of Rheumatology and Clinical Immunology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, USA
- Cuong Thach Nguyen
  - Division of Rheumatology, Allergy and Clinical Immunology, University of California, Davis, USA
- Trevor Chan
  - Department of Computer Science, University of California, Davis, CA, USA
  - Genome Center, University of California, Davis, CA, USA
- Alina I Marusina
  - Department of Dermatology, University of California, Davis, Sacramento, USA
- Shie-Liang Hsieh
  - Genomics Research Center, Academia Sinica, Nankang, Taipei, Taiwan
- George C Tsokos
  - Division of Rheumatology, Allergy and Clinical Immunology, University of California, Davis, USA
- Christopher T Ritchlin
  - Division of Allergy, Immunology & Rheumatology, University of Rochester Medical School, NY, USA
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, CA, USA
  - Process Integration and Predictive Analytics, PIPA LLC, CA, USA
- Emanual Maverakis
  - Department of Dermatology, University of California, Davis, Sacramento, USA
- Iannis E Adamopoulos
  - Department of Rheumatology and Clinical Immunology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, USA
  - Division of Rheumatology, Allergy and Clinical Immunology, University of California, Davis, USA

4
Aboud O, Liu Y, Dahabiyeh L, Abuaisheh A, Li F, Aboubechara JP, Riess J, Bloch O, Hodeify R, Tagkopoulos I, Fiehn O. Profile Characterization of Biogenic Amines in Glioblastoma Patients Undergoing Standard-of-Care Treatment. Biomedicines 2023; 11:2261. [PMID: 37626757 PMCID: PMC10452138 DOI: 10.3390/biomedicines11082261] [Received: 07/07/2023] [Revised: 07/29/2023] [Accepted: 08/07/2023] [Indexed: 08/27/2023]
Abstract
INTRODUCTION: Biogenic amines play important roles throughout cellular metabolism. This study explores a role of biogenic amines in glioblastoma pathogenesis. Here, we characterize the plasma levels of biogenic amines in glioblastoma patients undergoing standard-of-care treatment. METHODS: We examined 138 plasma samples from 36 patients with isocitrate dehydrogenase (IDH) wild-type glioblastoma at multiple stages of treatment. Untargeted gas chromatography-time of flight mass spectrometry (GC-TOF MS) was used to measure metabolite levels. Machine learning approaches were then used to develop a predictive tool based on these datasets. RESULTS: Surgery was associated with increased levels of 12 metabolites and decreased levels of 11 metabolites. Chemoradiation was associated with increased levels of three metabolites and decreased levels of three other metabolites. Ensemble learning models, specifically random forest (RF) and AdaBoost (AB), classified treatment phases with high accuracy (RF: 0.81 ± 0.04, AB: 0.78 ± 0.05). The metabolites sorbitol and N-methylisoleucine were identified as important predictive features and confirmed via SHAP. CONCLUSION: To our knowledge, this is the first study to describe plasma biogenic amine signatures throughout the treatment of patients with glioblastoma. A larger study is needed to confirm these results, with hopes of developing a diagnostic algorithm.
Affiliation(s)
- Orwa Aboud
  - Department of Neurology, University of California, Davis, Sacramento, CA 95817, USA
  - Department of Neurological Surgery, University of California, Davis, Sacramento, CA 95817, USA
  - Comprehensive Cancer Center, University of California Davis, Sacramento, CA 95817, USA
- Yin Liu
  - Department of Neurology, University of California, Davis, Sacramento, CA 95817, USA
  - Department of Neurological Surgery, University of California, Davis, Sacramento, CA 95817, USA
  - Department of Ophthalmology, University of California, Davis, Sacramento, CA 95817, USA
- Lina Dahabiyeh
  - West Coast Metabolomics Center, University of California Davis, Davis, CA 95616, USA
  - Department of Pharmaceutical Sciences, School of Pharmacy, The University of Jordan, Amman 11942, Jordan
- Ahmad Abuaisheh
  - School of Medicine, Al Balqa Applied University, Al-Salt 19117, Jordan
- Fangzhou Li
  - Department of Computer Science, University of California, Davis, Sacramento, CA 95616, USA
  - Genome Center, University of California, Davis, Sacramento, CA 95616, USA
  - USDA/NSF AI Institute for Next Generation Food Systems (AIFS), Davis, CA 95616, USA
- Jonathan Riess
  - Comprehensive Cancer Center, University of California Davis, Sacramento, CA 95817, USA
  - Department of Internal Medicine, Division of Hematology and Oncology, University of California, Davis, Sacramento, CA 95817, USA
- Orin Bloch
  - Department of Neurological Surgery, University of California, Davis, Sacramento, CA 95817, USA
- Rawad Hodeify
  - Department of Biotechnology, School of Arts and Sciences, American University of Ras Al Khaimah, Ras Al-Khaimah 10021, United Arab Emirates
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, Sacramento, CA 95616, USA
  - Genome Center, University of California, Davis, Sacramento, CA 95616, USA
  - USDA/NSF AI Institute for Next Generation Food Systems (AIFS), Davis, CA 95616, USA
- Oliver Fiehn
  - West Coast Metabolomics Center, University of California Davis, Davis, CA 95616, USA

5
Naravane T, Tagkopoulos I. Machine learning models to predict micronutrient profile in food after processing. Curr Res Food Sci 2023; 6:100500. [PMID: 37151381 PMCID: PMC10160345 DOI: 10.1016/j.crfs.2023.100500] [Received: 10/26/2022] [Revised: 02/28/2023] [Accepted: 04/02/2023] [Indexed: 05/09/2023]
Abstract
Information on the nutritional profile of cooked foods is important to both food manufacturers and consumers, and a major challenge to obtaining precise information is the inherent variation in composition across biological samples of any given raw ingredient. The ideal solution would address both precision and generalizability, but current solutions are limited in their capabilities: analytical methods are too costly to scale, retention-factor-based methods are scalable but approximate, and kinetic models are bespoke to a food and nutrient. We provide an alternative solution that predicts the micronutrient profile of cooked food from the raw food composition, and does so for multiple foods. The prediction model is trained on an existing food composition dataset and has a 31% lower error on average (across all foods, processes and nutrients) than predictions obtained using the baseline method of retention factors. Our results argue that data scaling and transformation prior to training the models are important to mitigate any yield bias. This study demonstrates the potential of machine learning methods over current solutions, and additionally provides guidance for the future generation of food composition data, specifically for sampling approach, data quality checks, and data representation standards.
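For context on the baseline, retention-factor methods are commonly applied as a one-line calculation: the raw nutrient content is scaled by the fraction retained and corrected for the weight change during cooking. The sketch below shows that commonly used form. Whether the paper's baseline uses exactly this formula is an assumption, and the function name and example values are hypothetical.

```python
def cooked_content_per_100g(
    raw_per_100g: float,
    retention_factor: float,
    yield_factor: float,
) -> float:
    """Estimate nutrient content per 100 g of cooked food.

    raw_per_100g:     nutrient content per 100 g of the raw food
    retention_factor: fraction of the nutrient retained after cooking (0..1)
    yield_factor:     cooked weight divided by raw weight
    """
    if yield_factor <= 0:
        raise ValueError("yield_factor must be positive")
    return raw_per_100g * retention_factor / yield_factor


# Hypothetical example: 10 mg per 100 g raw, 80% retained,
# and the food loses half of its weight during cooking.
estimate = cooked_content_per_100g(10.0, retention_factor=0.8, yield_factor=0.5)
print(f"Estimated content: {estimate:.1f} mg per 100 g cooked")
```

Because a single retention factor is applied per nutrient and process, this estimate cannot reflect sample-to-sample variation, which is the approximation the abstract refers to.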
Affiliation(s)
- Tarini Naravane
  - Biological Systems Engineering, University of California at Davis, United States
  - Genome Center, University of California at Davis, United States
- Ilias Tagkopoulos
  - Department of Computer Science, University of California at Davis, United States
  - Genome Center, University of California at Davis, United States
  - Corresponding author. Department of Computer Science, University of California at Davis, United States.

6
Eetemadi A, Tagkopoulos I. Algorithmic lifestyle optimization. J Am Med Inform Assoc 2022; 30:38-45. [PMID: 36308771 PMCID: PMC9748593 DOI: 10.1093/jamia/ocac186] [Received: 03/17/2022] [Revised: 05/09/2022] [Accepted: 10/06/2022] [Indexed: 12/15/2022]
Abstract
OBJECTIVE: A hallmark of personalized medicine and nutrition is to identify effective treatment plans at the individual level. Lifestyle interventions (LIs), from diet to exercise, can have a significant effect over time, especially in the case of food intolerances and allergies. The large set of candidate interventions makes it difficult to evaluate which intervention plan would be more favorable for any given individual. In this study, we aimed to develop a method for rapid identification of favorable LIs for a given individual. MATERIALS AND METHODS: We have developed a method, algorithmic lifestyle optimization (ALO), for rapid identification of effective LIs. At its core, a group testing algorithm identifies the effectiveness of each intervention efficiently, within the context of its pertinent group. RESULTS: Evaluations on synthetic and real data show that ALO is robust to noise, data size, and data heterogeneity. Compared to standard-of-practice techniques, such as the standard elimination diet (SED), it identifies the effective LIs 58.9%-68.4% faster when used to discover an individual's food intolerances and allergies to 19-56 foods. DISCUSSION: ALO achieves its superior performance by: (1) grouping multiple LIs together optimally from prior statistics, and (2) adapting the groupings of LIs from the individual's subsequent responses. Future extensions to ALO should enable incorporating nutritional constraints. CONCLUSION: ALO provides a new approach for the discovery of effective interventions in nutrition and medicine, leading to better intervention plans faster and with less inconvenience to the patient compared to SED.
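To illustrate the group testing idea the abstract builds on, the sketch below uses classic adaptive binary splitting: test a whole group at once, and subdivide only the groups that produce a reaction. This is not the ALO algorithm itself, which according to the abstract also forms groups from prior statistics and handles noise. The function names, the noise-free response oracle, and the food labels are all hypothetical.

```python
from typing import Callable, List, Sequence


def find_triggers(
    items: Sequence[str],
    group_reacts: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Identify trigger items by adaptive binary splitting.

    group_reacts(group) is assumed to return True if at least one item
    in the group causes a reaction (a noise-free simplification).
    """
    items = list(items)
    if not items or not group_reacts(items):
        return []  # the whole group is cleared with a single test
    if len(items) == 1:
        return items  # a reacting group of one is a confirmed trigger
    mid = len(items) // 2
    return find_triggers(items[:mid], group_reacts) + find_triggers(
        items[mid:], group_reacts
    )


# Hypothetical usage: 16 candidate foods, two of which are true triggers.
foods = [f"food_{i}" for i in range(16)]
true_triggers = {"food_3", "food_11"}
tests_run = 0


def simulated_response(group: Sequence[str]) -> bool:
    """Stand-in for observing the individual after eating the whole group."""
    global tests_run
    tests_run += 1
    return any(food in true_triggers for food in group)


print(find_triggers(foods, simulated_response), "tests run:", tests_run)
```

When triggers are rare, most groups are cleared with a single test, which is where the savings over one-at-a-time elimination come from. With many triggers, the advantage shrinks.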
Affiliation(s)
- Ameen Eetemadi
  - Department of Computer Science, University of California, Davis, Davis, California, USA
  - Genome Center, University of California, Davis, Davis, California, USA
  - AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, Davis, California, USA
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, Davis, California, USA
  - Genome Center, University of California, Davis, Davis, California, USA
  - AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, Davis, California, USA

7
Holt RR, Barile D, Wang SC, Munafo JP, Arvik T, Li X, Lee F, Keen CL, Tagkopoulos I, Schmitz HH. Chardonnay Marc as a New Model for Upcycled Co-products in the Food Industry: Concentration of Diverse Natural Products Chemistry for Consumer Health and Sensory Benefits. J Agric Food Chem 2022; 70:15007-15027. [PMID: 36409321 PMCID: PMC9732887 DOI: 10.1021/acs.jafc.2c04519] [Received: 06/26/2022] [Revised: 10/27/2022] [Accepted: 10/28/2022] [Indexed: 06/16/2023]
Abstract
Research continues to provide compelling insights into potential health benefits associated with diets rich in plant-based natural products (PBNPs). Coupled with evidence from dietary intervention trials, dietary recommendations increasingly include higher intakes of PBNPs. In addition to health benefits, PBNPs can drive flavor and sensory perceptions in foods and beverages. Chardonnay marc (pomace) is a byproduct of winemaking obtained after fruit pressing that has not undergone fermentation. Recent research has revealed that PBNP diversity within Chardonnay marc has potential relevance to human health and desirable sensory attributes in food and beverage products. This review explores the potential of Chardonnay marc as a valuable new PBNP ingredient in the food system, one that combines health, sensory, and environmental sustainability benefits and serves as a model for the development of future ingredients within a sustainable circular bioeconomy. This includes a discussion of the potential role of computational methods, including artificial intelligence (AI), in accelerating the research and development required to discover and commercialize this new source of PBNPs.
Affiliation(s)
- Roberta R Holt
  - Department of Nutrition, University of California, Davis, Davis, California 95616, United States
- Daniela Barile
  - Department of Food Science and Technology, University of California, Davis, Davis, California 95616, United States
- Selina C Wang
  - Department of Food Science and Technology, University of California, Davis, Davis, California 95616, United States
- John P Munafo
  - Department of Food Science, University of Tennessee, Knoxville, Tennessee 37996, United States
- Torey Arvik
  - Sonomaceuticals, LLC, Santa Rosa, California 95403, United States
- Xueqi Li
  - Department of Food Science and Technology, University of California, Davis, Davis, California 95616, United States
- Fanny Lee
  - Sonomaceuticals, LLC, Santa Rosa, California 95403, United States
- Carl L Keen
  - Department of Nutrition, University of California, Davis, Davis, California 95616, United States
- Ilias Tagkopoulos
  - PIPA, LLC, Davis, California 95616, United States
  - Department of Computer Science and Genome Center, USDA/NSF AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, Davis, California 95616, United States
- Harold H Schmitz
  - March Capital US, LLC, Davis, California 95616, United States
  - T.O.P., LLC, Davis, California 95616, United States
  - Graduate School of Management, University of California, Davis, Davis, California 95616, United States

8
Chan T, Tan CE, Tagkopoulos I. Audit lead selection and yield prediction from historical tax data using artificial neural networks. PLoS One 2022; 17:e0278121. [PMID: 36449508 PMCID: PMC9710839 DOI: 10.1371/journal.pone.0278121] [Received: 11/03/2022] [Accepted: 11/08/2022] [Indexed: 12/03/2022]
Abstract
Tax audits are a crucial process adopted in all tax departments to ensure tax compliance and fairness. Traditionally, tax audit leads have been selected based on empirical rules and randomization methods, which are not adaptive, may miss major cases, and can introduce bias. Here, we present an audit lead tool based on artificial neural networks that have been trained and evaluated on an integrated dataset of 93,413 unique tax records from 8,647 restaurant businesses over 10 years in Northern California, provided by the California Department of Tax and Fee Administration (CDTFA). The tool achieved a 40.1% precision and 58.7% recall (F1-score of 0.42) on classifying positive audit leads, and the corresponding regressor provided estimated audit gains (MAE of $155,490). Finally, we evaluated the statistical significance of various empirical rules for use in lead selection, with two out of five being supported by the data. This work demonstrates how data can be leveraged to create evidence-based models of audit selection and to validate empirical hypotheses, resulting in higher audit yields and fairer audit selection processes.
Affiliation(s)
- Trevor Chan
  - Department of Computer Science, University of California, Davis, California, United States of America
  - Genome Center, University of California, Davis, California, United States of America
- Cheng-En Tan
  - Department of Computer Science, University of California, Davis, California, United States of America
  - Genome Center, University of California, Davis, California, United States of America
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, California, United States of America
  - Genome Center, University of California, Davis, California, United States of America

9
Rollins ZA, Huang J, Tagkopoulos I, Faller R, George SC. A Computational Algorithm to Assess the Physiochemical Determinants of T Cell Receptor Dissociation Kinetics. Comput Struct Biotechnol J 2022; 20:3473-3481. [PMID: 35860406 PMCID: PMC9278023 DOI: 10.1016/j.csbj.2022.06.048] [Received: 04/27/2022] [Revised: 06/20/2022] [Accepted: 06/21/2022] [Indexed: 11/29/2022]
Abstract
The rational design of T Cell Receptors (TCRs) for immunotherapy has stagnated due to a limited understanding of the dynamic physiochemical features of the TCR that elicit an immunogenic response. The physiochemical features of the TCR-peptide major histocompatibility complex (pMHC) bond dictate bond lifetime which, in turn, correlates with immunogenicity. Here, we: i) characterize the force-dependent dissociation kinetics of the bond between a TCR and a set of pMHC ligands using Steered Molecular Dynamics (SMD); and ii) implement a machine learning algorithm to identify which physiochemical features of the TCR govern dissociation kinetics. Our results demonstrate that the total number of hydrogen bonds between the CDR2β-MHCα(β), CDR1α-Peptide, and CDR3β-Peptide are critical features that determine bond lifetime.
Affiliation(s)
- Jun Huang
  - University of California, Davis, Davis, California, Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL
- Steven C. George
  - Department of Biomedical Engineering
  - Corresponding author at: Department of Biomedical Engineering, 451 E. Health Sciences Drive, room 2315, University of California, Davis, Davis, CA 95616.

10
Rai N, Kim M, Tagkopoulos I. Understanding the Formation and Mechanism of Anticipatory Responses in Escherichia coli. Int J Mol Sci 2022; 23:5985. [PMID: 35682665 PMCID: PMC9181292 DOI: 10.3390/ijms23115985] [Received: 05/02/2022] [Revised: 05/23/2022] [Accepted: 05/24/2022] [Indexed: 12/04/2022]
Abstract
Microorganisms often live in complex habitats where changes in the environment are predictable, providing an opportunity for microorganisms to learn, anticipate upcoming environmental changes, and prepare in advance for better survival and growth. One such environment is the mammalian intestine, where the abundance of different carbon sources is spatially distributed. In this study, we identified seven spatially distributed carbon sources in the mammalian intestine and tested whether Escherichia coli exhibits phenotypes that are consistent with an anticipatory response given their spatial order and abundance within the mammalian intestine. Through RNA-Seq and RT-PCR validation measurements, we found that there was a 67% match in the expression patterns between the measured phenotypes and what would otherwise be expected in the case of anticipatory behavior, while 83% and 0% were in agreement with the homeostatic and random response, respectively. To understand the genetic and phenotypic basis of the discrepancies between the expected and measured anticipatory responses, we thoroughly investigated the discrepancy between D-galactose treatment and the expression of the maltose operon in E. coli. Here, the expected anticipatory response, based on the spatial distribution of D-galactose and D-maltose, was that D-galactose should upregulate the maltose operon, but experimental validation showed the opposite. We performed whole genome random mutagenesis and screening and identified E. coli strains with positive expression of the maltose operon in D-galactose. Targeted Sanger sequencing and mutation repair identified mutations in the promoter region of malT and in the coding region of the crp gene as the factors responsible for the reversion in the association. Further, to identify why a positive association between the D-galactose treatment and the expression of the maltose operon did not evolve naturally, fitness measurements were performed. Fitness experiments demonstrated that the fitness of E. coli strains with a positive association between the D-galactose treatment and the expression of the maltose operon was 12% to 20% lower than that of the wild type strain.
Affiliation(s)
- Navneet Rai
  - UC Davis Genome Center, University of California-Davis, Davis, CA 95616, USA
  - Department of Computer Science, University of California-Davis, Davis, CA 95616, USA
- Minseung Kim
  - UC Davis Genome Center, University of California-Davis, Davis, CA 95616, USA
  - Department of Computer Science, University of California-Davis, Davis, CA 95616, USA
- Ilias Tagkopoulos
  - UC Davis Genome Center, University of California-Davis, Davis, CA 95616, USA
  - Department of Computer Science, University of California-Davis, Davis, CA 95616, USA
  - USDA/NSF AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, CA 95616, USA
|
11
|
Youn J, Rai N, Tagkopoulos I. Knowledge integration and decision support for accelerated discovery of antibiotic resistance genes. Nat Commun 2022; 13:2360. [PMID: 35487919 PMCID: PMC9055065 DOI: 10.1038/s41467-022-29993-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2021] [Accepted: 03/04/2022] [Indexed: 11/09/2022] Open
Abstract
We present a machine learning framework to automate knowledge discovery through knowledge graph construction, inconsistency resolution, and iterative link prediction. By incorporating knowledge from 10 publicly available sources, we construct an Escherichia coli antibiotic resistance knowledge graph with 651,758 triples from 23 triple types after resolving 236 sets of inconsistencies. Iteratively applying link prediction to this graph and wet-lab validation of the generated hypotheses reveal 15 antibiotic-resistant E. coli genes, 6 of which were never previously associated with antibiotic resistance in any microbe. Iterative link prediction leads to a performance improvement and more findings. The probability of positive findings highly correlates with experimentally validated findings (R² = 0.94). We also identify 5 homologs in Salmonella enterica that are all validated to confer resistance to antibiotics. This work demonstrates how evidence-driven decisions are a step toward automating knowledge discovery with high confidence and accelerated pace, thereby substituting traditional time-consuming and expensive methods.
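The idea of storing knowledge as triples and flagging inconsistencies between sources can be illustrated with a small sketch. Everything here is an assumption made for illustration: the gene names, predicate strings, source labels, and the majority-vote rule are hypothetical and are not taken from the paper, which does not describe its resolution procedure in this abstract.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object, source) triples.
triples = [
    ("geneA", "confers resistance to", "ampicillin", "db1"),
    ("geneA", "confers no resistance to", "ampicillin", "db2"),
    ("geneA", "confers resistance to", "ampicillin", "db3"),
    ("geneB", "confers resistance to", "tetracycline", "db1"),
]

# Predicates that contradict each other for the same (subject, object) pair.
CONFLICTING = {"confers resistance to", "confers no resistance to"}


def find_inconsistencies(triples):
    """Return {(subject, object): {predicate: set_of_sources}} for pairs
    that carry more than one of the mutually conflicting predicates."""
    by_pair = defaultdict(lambda: defaultdict(set))
    for subj, pred, obj, source in triples:
        if pred in CONFLICTING:
            by_pair[(subj, obj)][pred].add(source)
    return {pair: dict(preds) for pair, preds in by_pair.items() if len(preds) > 1}


def resolve_by_majority(conflict):
    """Keep the predicate backed by the most sources; None on a tie.
    Majority voting is an illustrative stand-in, not the authors' method."""
    ranked = sorted(conflict.items(), key=lambda kv: len(kv[1]), reverse=True)
    if len(ranked) > 1 and len(ranked[0][1]) == len(ranked[1][1]):
        return None
    return ranked[0][0]


conflicts = find_inconsistencies(triples)
winner = resolve_by_majority(conflicts[("geneA", "ampicillin")])
```

In this toy data only the geneA/ampicillin pair is flagged, because two sources assert resistance and one asserts the opposite.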
Affiliation(s)
- Jason Youn
  - Department of Computer Science, University of California, Davis, CA, 95616, USA
  - Genome Center, University of California, Davis, CA, 95616, USA
  - USDA/NSF AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, CA, 95616, USA
- Navneet Rai
  - Department of Computer Science, University of California, Davis, CA, 95616, USA
  - Genome Center, University of California, Davis, CA, 95616, USA
  - USDA/NSF AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, CA, 95616, USA
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, CA, 95616, USA
  - Genome Center, University of California, Davis, CA, 95616, USA
  - USDA/NSF AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, CA, 95616, USA
|
12
|
Matson MM, Cepeda MM, Zhang A, Case AE, Kavvas ES, Wang X, Carroll AL, Tagkopoulos I, Atsumi S. Adaptive laboratory evolution for improved tolerance of isobutyl acetate in Escherichia coli. Metab Eng 2021; 69:50-58. [PMID: 34763090 DOI: 10.1016/j.ymben.2021.11.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/14/2021] [Revised: 10/14/2021] [Accepted: 11/04/2021] [Indexed: 02/08/2023]
Abstract
Previously, Escherichia coli was engineered to produce isobutyl acetate (IBA). Titers greater than the toxicity threshold (3 g/L) were achieved by using layer-assisted production. To avoid this costly and complex method, adaptive laboratory evolution (ALE) was applied to E. coli for improved IBA tolerance. Over 37 rounds of selective pressure, 22 IBA-tolerant mutants were isolated. Remarkably, these mutants not only tolerate high IBA concentrations but also produce higher IBA titers. Using whole-genome sequencing followed by CRISPR/Cas9-mediated genome editing, the mutations (SNPs in metH and rho, and deletion of arcA) that confer improved tolerance and higher titers were elucidated. The improved IBA titers in the evolved mutants were a result of an increased supply of acetyl-CoA and altered transcriptional machinery. Without the use of phase separation, a strain capable of 3.2-fold greater IBA production than the parent strain was constructed by combining select beneficial mutations. These results highlight the impact improved tolerance has on the production capability of a biosynthetic system.
Affiliation(s)
- Morgan M Matson
  - Department of Chemistry, University of California, Davis, CA, 95616, USA
- Mateo M Cepeda
  - Department of Chemistry, University of California, Davis, CA, 95616, USA
- Angela Zhang
  - Department of Chemistry, University of California, Davis, CA, 95616, USA
- Anna E Case
  - Department of Chemistry, University of California, Davis, CA, 95616, USA
- Erol S Kavvas
  - Genome Center, University of California, Davis, CA, 95616, USA
- Xiaokang Wang
  - Genome Center, University of California, Davis, CA, 95616, USA; Department of Biomedical Engineering, University of California, Davis, CA 95616, USA
- Austin L Carroll
  - Department of Chemistry, University of California, Davis, CA, 95616, USA
- Ilias Tagkopoulos
  - Genome Center, University of California, Davis, CA, 95616, USA; Department of Computer Science, University of California, Davis, CA, 95616, USA
- Shota Atsumi
  - Department of Chemistry, University of California, Davis, CA, 95616, USA
|
13
|
de Oliveira EB, Ferreira FC, Galvão KN, Youn J, Tagkopoulos I, Silva-Del-Rio N, Pereira RVV, Machado VS, Lima FS. Integration of statistical inferences and machine learning algorithms for prediction of metritis cure in dairy cows. J Dairy Sci 2021; 104:12887-12899. [PMID: 34538497 DOI: 10.3168/jds.2021-20262] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/05/2021] [Accepted: 07/23/2021] [Indexed: 11/19/2022]
Abstract
The study's objectives were to identify cow-level and environmental factors associated with metritis cure and to predict metritis cure using traditional statistics and machine learning algorithms. The data set used was from a previous study comparing the efficacy of different therapies and self-cure for metritis. Metritis was defined as fetid, watery, reddish-brownish discharge, with or without fever. Cure was defined as an absence of metritis signs 12 d after diagnosis. Cows were randomly allocated to receive either a subcutaneous injection of 6.6 mg/kg of ceftiofur crystalline-free acid (Excede, Zoetis) on the day of diagnosis and 3 d later (n = 275) or no treatment at the time of metritis diagnosis (n = 275). The variables days in milk (DIM) at metritis diagnosis, treatment, season of metritis diagnosis, month of metritis diagnosis, number of lactation, parity, calving score, dystocia, retained fetal membranes, body condition score at d 5 postpartum, vulvovaginal laceration score, rectal temperature at metritis diagnosis, fever at diagnosis, milk production from the day before to metritis diagnosis, and milk production slope up to 5, 7, and 9 DIM were offered to univariate logistic regression. Variables included in the multivariable logistic regression model were selected from the univariate analysis according to P-value. Variables were offered to the model to assess the association between these factors and metritis cure. Additionally, the univariate logistic regression variables were offered to recursive feature elimination to find the optimal subset of features for the machine learning analysis. Cows without vulvovaginal laceration had 1.91 higher odds of metritis cure than cows with vulvovaginal laceration. Cows that developed metritis at >7 DIM had 2.09 higher odds of being cured than cows that developed metritis at ≤7 DIM. For rectal temperature, each degree Celsius above 39.4°C led to lower odds of cure than in cows with rectal temperature ≤39.4°C. Furthermore, milk production slope and milk production difference from the day before to the metritis diagnosis were essential variables to predict metritis cure. Cows that had reduced milk production from the day before to the metritis diagnosis had lower odds of being cured than cows with a moderate increase in milk production. The results from the multivariable logistic regression and receiver operating characteristic analysis indicated that cows developing metritis at >7 DIM, with an increase in milk production, and with a rectal temperature ≤39.4°C had an increased likelihood of metritis cure, with an accuracy of 75%. The machine learning analysis showed that, in addition to these variables, calving-related disorders, season, and month of metritis event were needed to predict whether a cow will be cured of metritis with an accuracy ≥70% and F1 score (harmonic mean between precision and recall) ≥0.78. Although machine learning algorithms are acknowledged as powerful tools for predictive classification, the current study was unable to replicate their potential benefits. More research is needed to optimize predictive models of metritis cure.
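The F1 score mentioned above is the harmonic mean of precision and recall. The following is a minimal sketch of that calculation; the confusion-matrix counts are hypothetical and chosen only for illustration, not taken from the study.

```python
def precision_recall(tp: int, fp: int, fn: int):
    """Precision and recall from true-positive, false-positive, and
    false-negative counts; each is 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 cured cows predicted cured, 20 non-cured cows
# predicted cured, 25 cured cows predicted non-cured.
p, r = precision_recall(tp=80, fp=20, fn=25)
score = f1_score(p, r)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a classifier cannot reach a high F1 score by maximizing precision or recall alone.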
Affiliation(s)
- E B de Oliveira
  - Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis 95616; Veterinary Medicine Teaching and Research Center, 18830 Road 112, Tulare, CA 93274
- F C Ferreira
  - Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis 95616; Veterinary Medicine Teaching and Research Center, 18830 Road 112, Tulare, CA 93274
- K N Galvão
  - Department of Large Animal Clinical Sciences, University of Florida, Gainesville 32610; D. H. Barron Reproductive and Perinatal Biology Research Program, University of Florida, Gainesville 32610
- J Youn
  - Department of Computer Science, University of California, Davis 95616; Computer Science and Genome Center, University of California, Davis 95616; AI Next Generation for Food System (AIFS), University of California, Davis 95616
- I Tagkopoulos
  - Department of Computer Science, University of California, Davis 95616; Computer Science and Genome Center, University of California, Davis 95616; AI Next Generation for Food System (AIFS), University of California, Davis 95616
- N Silva-Del-Rio
  - Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis 95616; Veterinary Medicine Teaching and Research Center, 18830 Road 112, Tulare, CA 93274
- R V V Pereira
  - Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis 95616
- V S Machado
  - Department of Veterinary Sciences, College of Agricultural Sciences and Natural Resources, Texas Tech University, Lubbock 79409
- F S Lima
  - Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis 95616
|
14
|
Merchel Piovesan Pereira B, Adil Salim M, Rai N, Tagkopoulos I. Tolerance to Glutaraldehyde in Escherichia coli Mediated by Overexpression of the Aldehyde Reductase YqhD by YqhC. Front Microbiol 2021; 12:680553. [PMID: 34248896 PMCID: PMC8262776 DOI: 10.3389/fmicb.2021.680553] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/14/2021] [Accepted: 05/27/2021] [Indexed: 11/13/2022] Open
Abstract
Glutaraldehyde is a widely used biocide that has been on the market for about 50 years. Despite its broad application, several reports on the emergence of bacterial resistance, and occasional outbreaks caused by poor disinfection, there is a gap in knowledge on bacterial adaptation, tolerance, and resistance mechanisms to glutaraldehyde. Here, we analyze the effects of the independent selection of mutations in the transcriptional regulator yqhC for biological replicates of Escherichia coli cells subjected to adaptive laboratory evolution (ALE) in the presence of glutaraldehyde. The evolved strains showed improved survival in the biocide (11-26% increase in fitness) as a result of mutations in the activator yqhC, which led to the overexpression of the yqhD aldehyde reductase gene by 8 to over 30-fold (3.1-5.2 log2FC range). The protective effect was exclusive to yqhD, as other aldehyde reductase genes of E. coli, such as yahK, ybbO, yghA, and ahr, did not offer protection against the biocide. We describe a novel mechanism of tolerance to glutaraldehyde based on the activation of the aldehyde reductase YqhD by YqhC and bring attention to the potential for the selection of such a tolerance mechanism outside the laboratory, given the existence of YqhD homologs in various pathogenic and opportunistic bacterial species.
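The relationship between a fold change and its log2 fold change (log2FC) can be checked with a short calculation. This is only a sketch of the standard conversion, with hypothetical function names; it is not code from the study.

```python
import math


def log2_fold_change(treated: float, control: float) -> float:
    """log2 of the ratio of expression in the treated vs. control condition."""
    return math.log2(treated / control)


def fold_from_log2fc(log2fc: float) -> float:
    """Convert a log2 fold change back to a linear fold change."""
    return 2.0 ** log2fc


# Endpoints of the log2FC range quoted in the abstract.
low = fold_from_log2fc(3.1)
high = fold_from_log2fc(5.2)
```

Under this conversion, log2FC values of 3.1 and 5.2 correspond to roughly 8.6-fold and 36.8-fold changes, which agrees with the "8 to over 30-fold" wording.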
Affiliation(s)
- Beatriz Merchel Piovesan Pereira
  - Microbiology Graduate Group, University of California, Davis, Davis, CA, United States
  - Genome Center, University of California, Davis, Davis, CA, United States
- Muhammad Adil Salim
  - Microbiology Graduate Group, University of California, Davis, Davis, CA, United States
  - Genome Center, University of California, Davis, Davis, CA, United States
- Navneet Rai
  - Genome Center, University of California, Davis, Davis, CA, United States
  - Department of Computer Science, University of California, Davis, Davis, CA, United States
- Ilias Tagkopoulos
  - Genome Center, University of California, Davis, Davis, CA, United States
  - Department of Computer Science, University of California, Davis, Davis, CA, United States
|
15
|
Joo YB, Baek IW, Park KS, Tagkopoulos I, Kim KJ. Novel classification of axial spondyloarthritis to predict radiographic progression using machine learning. Clin Exp Rheumatol 2021. [DOI: 10.55563/clinexprheumatol/217pmi] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Affiliation(s)
- Young Bin Joo
  - Division of Rheumatology, Department of Internal Medicine, St. Vincent’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- In-Woon Baek
  - Division of Rheumatology, Department of Internal Medicine, Yeouido St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Kyung-Su Park
  - Division of Rheumatology, Department of Internal Medicine, St. Vincent’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Ilias Tagkopoulos
  - Department of Computer Science & Genome Center, University of California, Davis, USA
- Ki-Jo Kim
  - Division of Rheumatology, Department of Internal Medicine, St. Vincent’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
|
16
|
Kim KJ, Moon SJ, Park KS, Tagkopoulos I. Author Correction: Network-based modeling of drug effects on disease module in systemic sclerosis. Sci Rep 2021; 11:8238. [PMID: 33837228 PMCID: PMC8035395 DOI: 10.1038/s41598-021-87277-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022] Open
Abstract
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Affiliation(s)
- Ki-Jo Kim
  - Division of Rheumatology, Department of Internal Medicine, St. Vincent's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea; St. Vincent's Hospital, 93 Jungbu‑daero, Paldal‑gu, Suwon, Gyeonggi‑do, 16247, Republic of Korea
- Su-Jin Moon
  - Division of Rheumatology, Department of Internal Medicine, Uijeongbu St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Kyung-Su Park
  - Division of Rheumatology, Department of Internal Medicine, St. Vincent's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, CA, USA; Genome Center, University of California, Davis, CA, USA; AI Institute for Next-Generation Food Systems, AIFS, Davis, CA, USA
|
17
|
Abstract
Food ontologies require significant effort to create and maintain as they involve manual and time-consuming tasks, often with limited alignment to the underlying food science knowledge. We propose a semi-supervised framework for automated ontology population from an existing ontology scaffold by using word embeddings. After applying this framework to the food domain and evaluating it against an expert-curated ontology, FoodOn, we observe that the food word embeddings capture the latent relationships and characteristics of foods. The resulting ontology, which utilizes word embeddings trained from the Wikipedia corpus, has an improvement of 89.7% in precision when compared to the expert-curated ontology FoodOn (0.34 vs. 0.18, respectively, p value = 2.6 × 10^-138), and it has a 43.6% shorter path distance (hops) between predicted and actual food instances (2.91 vs. 5.16, respectively, p value = 4.7 × 10^-84) when compared to other methods. This work demonstrates how high-dimensional representations of food can be used to populate ontologies and paves the way for learning ontologies that integrate contextual information from a variety of sources and types.
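The percentage figures above are relative changes between two measurements. A minimal sketch of that arithmetic, using only the rounded values printed in the abstract (the helper name is hypothetical):

```python
def relative_change(new: float, baseline: float) -> float:
    """Signed relative change of `new` with respect to `baseline`."""
    return (new - baseline) / baseline


# Precision: 0.34 vs. 0.18 (rounded values from the abstract).
precision_gain = relative_change(0.34, 0.18)

# Path distance in hops: 2.91 vs. 5.16; negate to express a reduction.
path_reduction = -relative_change(2.91, 5.16)
```

With these rounded inputs the path-distance reduction comes out at about 43.6%, matching the abstract, while the precision gain comes out at about 88.9%. The small gap to the reported 89.7% is presumably due to rounding of the two precision values, though that is an assumption rather than something the abstract states.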
Affiliation(s)
- Jason Youn
  - Department of Computer Science, University of California at Davis, Davis, CA, United States; Genome Center, University of California at Davis, Davis, CA, United States
- Tarini Naravane
  - Genome Center, University of California at Davis, Davis, CA, United States; Biological Systems Engineering, University of California at Davis, Davis, CA, United States
- Ilias Tagkopoulos
  - Department of Computer Science, University of California at Davis, Davis, CA, United States; Genome Center, University of California at Davis, Davis, CA, United States
|
18
|
Merchel Piovesan Pereira B, Wang X, Tagkopoulos I. Biocide-Induced Emergence of Antibiotic Resistance in Escherichia coli. Front Microbiol 2021; 12:640923. [PMID: 33717036 PMCID: PMC7952520 DOI: 10.3389/fmicb.2021.640923] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 12/12/2020] [Accepted: 02/03/2021] [Indexed: 12/26/2022] Open
Abstract
Biocide use is essential and ubiquitous, exposing microbes to sub-inhibitory concentrations of antiseptics, disinfectants, and preservatives. This can lead to the emergence of biocide resistance and, more importantly, potential cross-resistance to antibiotics, although the degree, frequency, and mechanisms that give rise to this phenomenon are still unclear. Here, we systematically performed adaptive laboratory evolution of the gut bacterium Escherichia coli in the presence of sub-inhibitory, constant concentrations of ten widespread biocides. Our results show that 17 out of 40 evolved strains (43%) also had decreased susceptibility to medically relevant antibiotics. Through whole-genome sequencing, we identified mutations related to multidrug efflux proteins (mdfA and acrR), porins (envZ and ompR), and RNA polymerase (rpoA and rpoBC) as mechanisms behind the resulting (cross)resistance. We also report an association of several genes (yeaW, pyrE, yqhC, aes, pgpA, and yeeP-isrC) and specific mutations that induce cross-resistance, verified through mutation repairs. A greater capacity for biofilm formation with respect to the parent strain was also a common feature in 11 out of 17 (65%) cross-resistant strains. Evolution in the biocides chlorophene, benzalkonium chloride, glutaraldehyde, and chlorhexidine had the most impact on antibiotic susceptibility, while hydrogen peroxide and povidone-iodine had the least. No cross-resistance to antibiotics was observed for isopropanol, ethanol, sodium hypochlorite, and peracetic acid. This work reinforces the link between exposure to biocides and the potential for cross-resistance to antibiotics, presents evidence on the underlying mechanisms of action, and provides a prioritized list of biocides that are of greater concern for public safety from the perspective of antibiotic resistance. SIGNIFICANCE STATEMENT Bacterial resistance and decreased susceptibility to antimicrobials are of utmost concern. There is evidence that improper biocide (antiseptic and disinfectant) use and disposal may select for bacteria cross-resistant to antibiotics. Understanding the emergence of cross-resistance and the risks associated with each of those chemicals is relevant for proper applications and recommendations. Our work establishes that not all biocides are equal when it comes to their risk of inducing antibiotic resistance; it provides evidence on the mechanisms of cross-resistance and a risk assessment of the biocides concerning antibiotic resistance under residual sub-inhibitory concentrations.
Affiliation(s)
- Beatriz Merchel Piovesan Pereira
  - Microbiology Graduate Group, University of California, Davis, Davis, CA, United States
  - Genome Center, University of California, Davis, Davis, CA, United States
- Xiaokang Wang
  - Genome Center, University of California, Davis, Davis, CA, United States
  - Department of Computer Science, University of California, Davis, Davis, CA, United States
- Ilias Tagkopoulos
  - Microbiology Graduate Group, University of California, Davis, Davis, CA, United States
  - Genome Center, University of California, Davis, Davis, CA, United States
  - Department of Computer Science, University of California, Davis, Davis, CA, United States
|
19
|
Eetemadi A, Tagkopoulos I. Methane and fatty acid metabolism pathways are predictive of Low-FODMAP diet efficacy for patients with irritable bowel syndrome. Clin Nutr 2021; 40:4414-4421. [PMID: 33504454 DOI: 10.1016/j.clnu.2020.12.041] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/21/2020] [Revised: 12/13/2020] [Accepted: 12/25/2020] [Indexed: 02/07/2023]
Abstract
OBJECTIVE Identification of microbiota-based biomarkers as predictors of low-FODMAP diet response and design of a diet recommendation strategy for IBS patients. DESIGN We created a compendium of gut microbiome and disease severity data before and after a low-FODMAP diet treatment from published studies, followed by unified data processing, statistical analysis, and predictive modeling. We employed data-driven methods that rely solely on the compendium data, as well as hypothesis-driven methods that focus on methane and short chain fatty acid (SCFA) metabolism pathways that were implicated in the disease etiology. RESULTS The patient's response to a low-FODMAP diet was predictable using their pre-diet fecal samples, with F1 accuracy scores of 0.750 and 0.875 achieved through data-driven and hypothesis-driven predictors, respectively. The fecal microbiome of patients with high response had higher abundance of methane and SCFA metabolism pathways compared to patients with no response (p-values < 6 × 10^-3). The genera Ruminococcus 1, Ruminococcaceae UCG-002 and Anaerostipes can be used as predictive biomarkers of diet response. Furthermore, the low-FODMAP diet followers were identifiable given their microbiome data (F1-score of 0.656). CONCLUSION Our integrated data analysis results argue that there are two types of patients: those with high colonic methane and SCFA production, who will respond well to a low-FODMAP diet, and all others, who would benefit from dietary supplementation containing butyrate and propionate, as well as probiotics with SCFA-producing bacteria, such as Lactobacillus. This work demonstrates how data integration can lead to novel discoveries and paves the way towards personalized diet recommendations for IBS.
Affiliation(s)
- Ameen Eetemadi
  - Department of Computer Science, University of California, Davis, CA, USA; Genome Center, University of California, Davis, CA, USA
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, CA, USA; Genome Center, University of California, Davis, CA, USA
|
20
|
Wang X, Rai N, Merchel Piovesan Pereira B, Eetemadi A, Tagkopoulos I. Accelerated knowledge discovery from omics data by optimal experimental design. Nat Commun 2020; 11:5026. [PMID: 33024104 PMCID: PMC7538421 DOI: 10.1038/s41467-020-18785-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/31/2019] [Accepted: 08/27/2020] [Indexed: 12/15/2022] Open
Abstract
How to design experiments that accelerate knowledge discovery on complex biological landscapes remains a tantalizing question. We present an optimal experimental design method (coined OPEX) to identify informative omics experiments using machine learning models for both experimental space exploration and model training. OPEX-guided exploration of Escherichia coli populations exposed to biocide and antibiotic combinations leads to more accurate predictive models of gene expression with 44% less data. Analysis of the proposed experiments shows that broad exploration of the experimental space followed by fine-tuning emerges as the optimal strategy. Additionally, analysis of the experimental data reveals 29 cases of cross-stress protection and 4 cases of cross-stress vulnerability. Further validation reveals the central role of chaperones, stress response proteins and transport pumps in cross-stress exposure. This work demonstrates how active learning can be used to guide omics data collection for training predictive models, making evidence-driven decisions and accelerating knowledge discovery in life sciences.
Affiliation(s)
- Xiaokang Wang
  - Department of Biomedical Engineering, University of California, Davis, CA, 95616, USA; Genome Center, University of California, Davis, CA, 95616, USA
- Navneet Rai
  - Genome Center, University of California, Davis, CA, 95616, USA; Department of Computer Science, University of California, Davis, CA, 95616, USA
- Beatriz Merchel Piovesan Pereira
  - Genome Center, University of California, Davis, CA, 95616, USA; Microbiology Graduate Group, University of California, Davis, CA, 95616, USA
- Ameen Eetemadi
  - Genome Center, University of California, Davis, CA, 95616, USA; Department of Computer Science, University of California, Davis, CA, 95616, USA
- Ilias Tagkopoulos
  - Genome Center, University of California, Davis, CA, 95616, USA; Department of Computer Science, University of California, Davis, CA, 95616, USA
|
21
|
Mak WS, Wang X, Arenas R, Cui Y, Bertolani S, Deng WQ, Tagkopoulos I, Wilson DK, Siegel JB. Discovery, Design, and Structural Characterization of Alkane-Producing Enzymes across the Ferritin-like Superfamily. Biochemistry 2020; 59:3834-3843. [PMID: 32935984 DOI: 10.1021/acs.biochem.0c00665] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/01/2023]
Abstract
To complement established rational and evolutionary protein design approaches, significant efforts are being made to utilize computational modeling and the diversity of naturally occurring protein sequences. Here, we combine structural biology, genomic mining, and computational modeling to identify structural features critical to aldehyde deformylating oxygenases (ADOs), an enzyme family that has significant implications in synthetic biology and chemoenzymatic synthesis. Through these efforts, we discovered latent ADO-like function across the ferritin-like superfamily in various species of Bacteria and Archaea. We created a machine learning model that uses protein structural features to discriminate ADO-like activity. Computational enzyme design tools were then utilized to introduce ADO-like activity into the small subunit of Escherichia coli class I ribonucleotide reductase. The integrated approach of genomic mining, structural biology, molecular modeling, and machine learning has the potential to be utilized for rapid discovery and modulation of functions across enzyme families.
Affiliation(s)
- Wai Shun Mak
  - Department of Chemistry, University of California, Davis, One Shields Avenue, Davis, California 95616, United States
- XiaoKang Wang
  - Department of Biomedical Engineering, University of California, Davis, Davis, California 95616, United States
- Rigoberto Arenas
  - Department of Chemistry, University of California, Davis, One Shields Avenue, Davis, California 95616, United States; Chemistry Graduate Group, University of California, Davis, One Shields Avenue, Davis, California 95616, United States
- Youtian Cui
  - Department of Chemistry, University of California, Davis, One Shields Avenue, Davis, California 95616, United States
- Steve Bertolani
  - Department of Chemistry, University of California, Davis, One Shields Avenue, Davis, California 95616, United States
- Wen Qiao Deng
  - California College of Arts, 1111 Eighth Street, San Francisco, California 94107, United States
- Ilias Tagkopoulos
  - Department of Biomedical Engineering, University of California, Davis, Davis, California 95616, United States; Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California 95616, United States; Department of Computer Science, University of California, Davis, Davis, California 95616, United States
- David K Wilson
  - Department of Molecular and Cellular Biology, University of California, Davis, Davis, California 95616, United States; Chemistry Graduate Group, University of California, Davis, One Shields Avenue, Davis, California 95616, United States
- Justin B Siegel
  - Department of Chemistry, University of California, Davis, One Shields Avenue, Davis, California 95616, United States; Department of Biochemistry and Molecular Medicine, University of California, Davis, 2700 Stockton Boulevard, Suite 2102, Sacramento, California 95817, United States; Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California 95616, United States
|
22
|
Kim KJ, Moon SJ, Park KS, Tagkopoulos I. Network-based modeling of drug effects on disease module in systemic sclerosis. Sci Rep 2020; 10:13393. [PMID: 32770109 PMCID: PMC7414841 DOI: 10.1038/s41598-020-70280-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/04/2020] [Accepted: 07/10/2020] [Indexed: 01/13/2023] Open
Abstract
The network-based proximity between drug targets and disease genes can provide novel insights regarding the repercussions, interplay, and repositioning of drugs in the context of disease. Current understanding and treatment for reversing of the fibrotic process is limited in systemic sclerosis (SSc). We have developed a network-based analysis for drug effects that takes into account the human interactome network, proximity measures between drug targets and disease-associated genes, genome-wide gene expression and disease modules that emerge through pertinent analysis. Currently used and potential drugs showed a wide variation in proximity to SSc-associated genes and distinctive proximity to the SSc-relevant pathways, depending on their class and targets. Tyrosine kinase inhibitors (TyKIs) approach disease gene through multiple pathways, including both inflammatory and fibrosing processes. The SSc disease module includes the emerging molecular targets and is in better accord with the current knowledge of the pathophysiology of the disease. In the disease-module network, the greatest perturbing activity was shown by nintedanib, followed by imatinib, dasatinib, and acetylcysteine. Suppression of the SSc-relevant pathways and alleviation of the skin fibrosis was remarkable in the inflammatory subsets of the SSc patients receiving TyKI therapy. Our results show that network-based drug-disease proximity offers a novel perspective into a drug’s therapeutic effect in the SSc disease module. This could be applied to drug combinations or drug repositioning, and be helpful guiding clinical trial design and subgroup analysis.
Affiliation(s)
- Ki-Jo Kim
  - Division of Rheumatology, Department of Internal Medicine, St. Vincent's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
  - St. Vincent's Hospital, 93 Jungbu-daero, Paldal-gu, Suwon, Gyeonggi-do, 16247, Republic of Korea
- Su-Jin Moon
  - Division of Rheumatology, Department of Internal Medicine, Uijeongbu St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Kyung-Su Park
  - Division of Rheumatology, Department of Internal Medicine, St. Vincent's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, CA, USA
  - Genome Center, University of California, Davis, CA, USA
  - AI Institute for Next-Generation Food Systems, AIFS, Davis, CA, USA
23
Merchel Piovesan Pereira B, Wang X, Tagkopoulos I. Short- and Long-Term Transcriptomic Responses of Escherichia coli to Biocides: a Systems Analysis. Appl Environ Microbiol 2020; 86:e00708-20. [PMID: 32385082 PMCID: PMC7357472 DOI: 10.1128/aem.00708-20] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/23/2020] [Accepted: 05/01/2020] [Indexed: 12/01/2022]
Abstract
The mechanisms of the bacterial response to biocides are poorly understood, despite their broad application. To identify the genetic basis and pathways implicated in the biocide stress response, we exposed Escherichia coli populations to 10 ubiquitous biocides. By comparing the transcriptional responses between a short-term exposure (30 min) and a long-term exposure (8 to 12 h) to biocide stress, we established the common gene and pathway clusters that are implicated in general and biocide-specific stress responses. Our analysis revealed a temporal choreography, starting from the upregulation of chaperones to the subsequent repression of motility and chemotaxis pathways and the induction of an anaerobic pool of enzymes and biofilm regulators. A systematic analysis of the transcriptional data identified a zur-regulated gene cluster to be highly active in the stress response against sodium hypochlorite and peracetic acid, presenting a link between the biocide stress response and zinc homeostasis. Susceptibility assays with knockout mutants further validated our findings and provide clear targets for downstream investigation of the implicated mechanisms of action.
IMPORTANCE
Antiseptics and disinfectant products are of great importance to control and eliminate pathogens, especially in settings such as hospitals and the food industry. Such products are widely distributed and frequently poorly regulated. Occasional outbreaks have been associated with microbes resistant to such compounds, and researchers have indicated potential cross-resistance with antibiotics. Despite that, there are many gaps in knowledge about the bacterial stress response and the mechanisms of microbial resistance to antiseptics and disinfectants. We investigated the stress response of the bacterium Escherichia coli to 10 common disinfectant and antiseptic chemicals to shed light on the potential mechanisms of tolerance to such compounds.
Affiliation(s)
- Beatriz Merchel Piovesan Pereira
  - Microbiology Graduate Group, University of California, Davis, California, USA
  - Genome Center, University of California, Davis, California, USA
- Xiaokang Wang
  - Genome Center, University of California, Davis, California, USA
  - Biomedical Engineering Graduate Group, University of California, Davis, California, USA
- Ilias Tagkopoulos
  - Genome Center, University of California, Davis, California, USA
  - Department of Computer Science, University of California, Davis, California, USA
24
Simmons G, Lee F, Kim M, Holt R, Tagkopoulos I. Identification of Differential, Health-Related Compounds in Chardonnay Marc through Network-Based Meta-Analysis. Curr Dev Nutr 2020. [DOI: 10.1093/cdn/nzaa045_108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Objectives
Food/residue waste streams may be a significant source of bioactive compounds that benefit human health. Dietary intervention trials demonstrate the health benefits of such residues, but they are resource and time intensive. Bioinformatics meta-analyses can elucidate putative pathways, genes and chemicals that are relevant to human health, hence guiding further experimentation and intervention trials. To this end, we integrated publicly available phytochemical datasets related to general grape marc from different varieties (GM) and Chardonnay grape marc (CM) to investigate their differences and potential implications to human health through a network-based meta-analysis.
Methods
To characterize the phytochemical profile of grape marc, compositional data was aggregated from publicly available literature. To identify potential health effects based on this chemical information, associations between disease states and the chemical profiles of GM/CM were extracted from the Comparative Toxicogenomics Database (CTD). Disease associative networks were constructed for a) marc products, b) all marc-related phenolics, c) compounds that are differentially abundant in CM.
Results
The union of available marc composition datasets from 14 articles contained 66 phenolic compounds; 29 of these were associated with at least 1 disease state in the CTD. There were 5 differentially over-abundant compounds in CM versus other grape marcs (red varietals n = 75, white varietals n = 57). These were flavan-3-ols catechin, epicatechin, epigallocatechin, gallocatechin, and proanthocyanidin C1 (P < 0.001); with gallocatechin unique to CM. Studies investigating marc products indicated associations to 15 diseases. CTD evidence from 934 studies associated the phenolic profile of GM to 358 diseases of 34 disease classes. Network-based meta-analysis suggested associations between GM and CM phenolics and several disease targets. This includes confirmatory associations between flavan-3-ols and cardiovascular disease outcomes.
Conclusions
Chardonnay marc is not widely studied; however, the developed framework of network-based meta-analysis utilizing composition information provides a holistic view of the knowledge space for grape marc, and highlights suggested health effects that can guide future research programs.
Funding Sources
Sonomaceuticals, LLC.
Affiliation(s)
- Roberta Holt
  - Department of Nutrition, University of California, Davis
- Ilias Tagkopoulos
  - PIPA, LLC
  - Department of Computer Science, University of California, Davis
  - Genome Center, University of California, Davis
25
Eetemadi A, Rai N, Pereira BMP, Kim M, Schmitz H, Tagkopoulos I. The Computational Diet: A Review of Computational Methods Across Diet, Microbiome, and Health. Front Microbiol 2020; 11:393. [PMID: 32318028 PMCID: PMC7146706 DOI: 10.3389/fmicb.2020.00393] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 10/23/2019] [Accepted: 02/26/2020] [Indexed: 12/12/2022]
Abstract
Food and human health are inextricably linked. As such, revolutionary impacts on health have been derived from advances in the production and distribution of food relating to food safety and fortification with micronutrients. During the past two decades, it has become apparent that the human microbiome has the potential to modulate health, including in ways that may be related to diet and the composition of specific foods. Despite the excitement and potential surrounding this area, the complexity of the gut microbiome, the chemical composition of food, and their interplay in situ remains a daunting task to fully understand. However, recent advances in high-throughput sequencing, metabolomics profiling, compositional analysis of food, and the emergence of electronic health records provide new sources of data that can contribute to addressing this challenge. Computational science will play an essential role in this effort as it will provide the foundation to integrate these data layers and derive insights capable of revealing and understanding the complex interactions between diet, gut microbiome, and health. Here, we review the current knowledge on diet-health-gut microbiota, relevant data sources, bioinformatics tools, machine learning capabilities, as well as the intellectual property and legislative regulatory landscape. We provide guidance on employing machine learning and data analytics, identify gaps in current methods, and describe new scenarios to be unlocked in the next few years in the context of current knowledge.
Affiliation(s)
- Ameen Eetemadi
  - Department of Computer Science, University of California, Davis, Davis, CA, United States
  - Genome Center, University of California, Davis, Davis, CA, United States
- Navneet Rai
  - Genome Center, University of California, Davis, Davis, CA, United States
- Beatriz Merchel Piovesan Pereira
  - Genome Center, University of California, Davis, Davis, CA, United States
  - Department of Microbiology, University of California, Davis, Davis, CA, United States
- Minseung Kim
  - Department of Computer Science, University of California, Davis, Davis, CA, United States
  - Genome Center, University of California, Davis, Davis, CA, United States
  - Process Integration and Predictive Analytics (PIPA LLC), Davis, CA, United States
- Harold Schmitz
  - Graduate School of Management, University of California, Davis, Davis, CA, United States
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, Davis, CA, United States
  - Genome Center, University of California, Davis, Davis, CA, United States
  - Process Integration and Predictive Analytics (PIPA LLC), Davis, CA, United States
26
Chin EL, Simmons G, Bouzid YY, Kan A, Burnett DJ, Tagkopoulos I, Lemay DG. Nutrient Estimation from 24-Hour Food Recalls Using Machine Learning and Database Mapping: A Case Study with Lactose. Nutrients 2019; 11:E3045. [PMID: 31847188 PMCID: PMC6950225 DOI: 10.3390/nu11123045] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 10/30/2019] [Revised: 11/30/2019] [Accepted: 12/06/2019] [Indexed: 01/03/2023]
Abstract
The Automated Self-Administered 24-Hour Dietary Assessment Tool (ASA24) is a free dietary recall system that outputs fewer nutrients than the Nutrition Data System for Research (NDSR). NDSR uses the Nutrition Coordinating Center (NCC) Food and Nutrient Database, both of which require a license. Manual lookup of ASA24 foods into NDSR is time-consuming but currently the only way to acquire NCC-exclusive nutrients. Using lactose as an example, we evaluated machine learning and database matching methods to estimate this NCC-exclusive nutrient from ASA24 reports. ASA24-reported foods were manually looked up into NDSR to obtain lactose estimates and split into training (n = 378) and test (n = 189) datasets. Nine machine learning models were developed to predict lactose from the nutrients common between ASA24 and the NCC database. Database matching algorithms were developed to match NCC foods to an ASA24 food using only nutrients ("Nutrient-Only") or the nutrient and food descriptions ("Nutrient + Text"). For both methods, the lactose values were compared to the manual curation. Among machine learning models, the XGB-Regressor model performed best on held-out test data (R² = 0.33). For the database matching method, Nutrient + Text matching yielded the best lactose estimates (R² = 0.76), a vast improvement over the status quo of no estimate. These results suggest that computational methods can successfully estimate an NCC-exclusive nutrient for foods reported in ASA24.
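The "Nutrient-Only" matching idea in this abstract can be illustrated with a minimal sketch. It assumes a plain Euclidean nearest-neighbour match over shared nutrients, which the abstract does not confirm as the authors' algorithm, and it assumes the reported score is the coefficient of determination. The reference table, nutrient names, and all values are made up for illustration and do not come from ASA24 or the NCC database.

```python
import math

# Hypothetical reference table standing in for a licensed nutrient database.
# All values are invented for illustration (grams per 100 g).
REFERENCE = {
    "milk, whole": {
        "nutrients": {"protein_g": 3.2, "fat_g": 3.3, "carb_g": 4.8},
        "lactose_g": 4.8,
    },
    "cheddar cheese": {
        "nutrients": {"protein_g": 25.0, "fat_g": 33.0, "carb_g": 1.3},
        "lactose_g": 0.2,
    },
    "white bread": {
        "nutrients": {"protein_g": 9.0, "fat_g": 3.2, "carb_g": 49.0},
        "lactose_g": 0.0,
    },
}


def nutrient_distance(a, b):
    """Euclidean distance over the nutrients that both foods report."""
    shared = a.keys() & b.keys()
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in shared))


def match_food(query, reference=REFERENCE):
    """Return the name of the reference food with the closest nutrient profile."""
    return min(
        reference,
        key=lambda name: nutrient_distance(query, reference[name]["nutrients"]),
    )


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# A reported food whose lactose content is unknown (hypothetical values).
query = {"protein_g": 3.4, "fat_g": 1.0, "carb_g": 5.0}
best = match_food(query)
lactose_estimate = REFERENCE[best]["lactose_g"]
```

In practice nutrients on different scales would likely need normalising before distances are compared; the sketch skips that step to stay short.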
Affiliation(s)
- Elizabeth L Chin
  - Western Human Nutrition Research Center, USDA ARS, Davis, CA 95616, USA
  - Genome Center, University of California Davis, Davis, CA 95616, USA
- Gabriel Simmons
  - Department of Mechanical Engineering, University of California Davis, Davis, CA 95616, USA
- Yasmine Y Bouzid
  - Western Human Nutrition Research Center, USDA ARS, Davis, CA 95616, USA
  - Department of Nutrition, University of California Davis, Davis, CA 95616, USA
- Annie Kan
  - Western Human Nutrition Research Center, USDA ARS, Davis, CA 95616, USA
  - Department of Nutrition, University of California Davis, Davis, CA 95616, USA
- Dustin J Burnett
  - Western Human Nutrition Research Center, USDA ARS, Davis, CA 95616, USA
  - Department of Nutrition, University of California Davis, Davis, CA 95616, USA
- Ilias Tagkopoulos
  - Genome Center, University of California Davis, Davis, CA 95616, USA
  - Department of Computer Science, University of California Davis, Davis, CA 95616, USA
- Danielle G Lemay
  - Western Human Nutrition Research Center, USDA ARS, Davis, CA 95616, USA
  - Genome Center, University of California Davis, Davis, CA 95616, USA
  - Department of Nutrition, University of California Davis, Davis, CA 95616, USA
27
Bradley R, Tagkopoulos I, Kim M, Kokkinos Y, Panagiotakos T, Kennedy J, De Meyer G, Watson P, Elliott J. Predicting early risk of chronic kidney disease in cats using routine clinical laboratory tests and machine learning. J Vet Intern Med 2019; 33:2644-2656. [PMID: 31557361 PMCID: PMC6872623 DOI: 10.1111/jvim.15623] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 01/03/2019] [Accepted: 08/29/2019] [Indexed: 02/01/2023]
Abstract
Background
Advanced machine learning methods combined with large sets of health screening data provide opportunities for diagnostic value in human and veterinary medicine.
Hypothesis/Objectives
To derive a model to predict the risk of cats developing chronic kidney disease (CKD) using data from electronic health records (EHRs) collected during routine veterinary practice.
Animals
A total of 106 251 cats that attended Banfield Pet Hospitals between January 1, 1995, and December 31, 2017.
Methods
Longitudinal EHRs from Banfield Pet Hospitals were extracted and randomly split into 2 parts. The first 67% of the data were used to build a prediction model, which included feature selection and identification of the optimal neural network type and architecture. The remaining unseen EHRs were used to evaluate the model performance.
Results
The final model was a recurrent neural network (RNN) with 4 features (creatinine, blood urea nitrogen, urine specific gravity, and age). When predicting CKD near the point of diagnosis, the model displayed a sensitivity of 90.7% and a specificity of 98.9%. Model sensitivity decreased when predicting the risk of CKD with a longer horizon, having 63.0% sensitivity 1 year before diagnosis and 44.2% 2 years before diagnosis, but with specificity remaining around 99%.
Conclusions and clinical importance
The use of models based on machine learning can support veterinary decision making by improving early identification of CKD.
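For readers unfamiliar with the two metrics quoted in this abstract, the following minimal sketch shows how sensitivity and specificity are computed from binary labels. The function names and the toy labels are hypothetical and unrelated to the study's data or model.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = disease)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Fraction of truly diseased cases that the model flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of truly healthy cases that the model clears."""
    return tn / (tn + fp)


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
```

A model can keep specificity high while sensitivity falls, which is the pattern the abstract reports as the prediction horizon lengthens.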
Affiliation(s)
- Richard Bradley
  - WALTHAM® Centre for Pet Nutrition, Freeby Lane, Waltham on the Wolds, Leicestershire, United Kingdom
- Ilias Tagkopoulos
  - Department of Computer Science and Genome Center, University of California, Davis, California
  - Process Integration and Predictive Analytics, PIPA LLC, Davis, California
- Minseung Kim
  - Process Integration and Predictive Analytics, PIPA LLC, Davis, California
- Yiannis Kokkinos
  - Process Integration and Predictive Analytics, PIPA LLC, Davis, California
- Geert De Meyer
  - WALTHAM® Centre for Pet Nutrition, Freeby Lane, Waltham on the Wolds, Leicestershire, United Kingdom
- Phillip Watson
  - WALTHAM® Centre for Pet Nutrition, Freeby Lane, Waltham on the Wolds, Leicestershire, United Kingdom
- Jonathan Elliott
  - Department of Comparative Biomedical Sciences, Royal Veterinary College, London, United Kingdom
28
Abstract
Over the past decade, there has been a paradigm shift in how clinical data are collected, processed and utilized. Machine learning and artificial intelligence, fueled by breakthroughs in high-performance computing, data availability and algorithmic innovations, are paving the way to effective analyses of large, multi-dimensional collections of patient histories, laboratory results, treatments, and outcomes. In the new era of machine learning and predictive analytics, the impact on clinical decision-making in all clinical areas, including rheumatology, will be unprecedented. Here we provide a critical review of the machine-learning methods currently used in the analysis of clinical data, the advantages and limitations of these methods, and how they can be leveraged within the field of rheumatology.
Affiliation(s)
- Ki-Jo Kim
  - Division of Rheumatology, Department of Internal Medicine, College of Medicine, The Catholic University of Korea, Seoul, Korea
  - Correspondence to Ki-Jo Kim, M.D. Division of Rheumatology, Department of Internal Medicine, College of Medicine, St. Vincent's Hospital, The Catholic University of Korea, 93 Jungbu-daero, Paldal-gu, Suwon 16247, Korea Tel: +82-31-249-8805 Fax: +82-31-253-8898 E-mail:
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, CA, USA
  - Genome Center, University of California, Davis, CA, USA
29
Moon SJ, Bae JM, Park KS, Tagkopoulos I, Kim KJ. Compendium of skin molecular signatures identifies key pathological features associated with fibrosis in systemic sclerosis. Ann Rheum Dis 2019; 78:817-825. [DOI: 10.1136/annrheumdis-2018-214778] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 11/16/2018] [Revised: 02/26/2019] [Accepted: 03/23/2019] [Indexed: 01/02/2023]
Abstract
Objectives
Treatment of patients with systemic sclerosis (SSc) can be challenging because of clinical heterogeneity. Integration of genome-scale transcriptomic profiling for patients with SSc can provide insights on patient categorisation and novel drug targets.
Methods
A normalised compendium was created from 344 skin samples of 173 patients with SSc, covering an intersection of 17 424 genes from eight data sets. Differentially expressed genes (DEGs) identified by three independent methods were subjected to functional network analysis, where samples were grouped using non-negative matrix factorisation. Finally, we investigated the pathways and biomarkers associated with skin fibrosis using gene-set enrichment analysis.
Results
We identified 1089 upregulated DEGs, including 14 known genetic risk factors and five potential drug targets. Pathway-based subgrouping revealed four distinct clusters of patients with SSc with distinct activity signatures for SSc-relevant pathways. The inflammatory subtype was related to significant improvement in skin fibrosis at follow-up. The phosphoinositide-3-kinase-protein kinase B (PI3K-Akt) signalling pathway showed both the closest correlation and temporal pattern to skin fibrosis score. COMP, THBS1, THBS4, FN1, and TNC were leading-edge genes of the PI3K-Akt pathway in skin fibrogenesis.
Conclusions
Construction and analysis of normalised skin transcriptomic compendia can provide useful insights on pathway involvement by SSc subsets and discovering viable biomarkers for a skin fibrosis index. Particularly, the PI3K-Akt pathway and its leading players are promising therapeutic targets.
30
Rai N, Huynh L, Kim M, Tagkopoulos I. Population collapse and adaptive rescue during long‐term chemostat fermentation. Biotechnol Bioeng 2019; 116:693-703. [DOI: 10.1002/bit.26898] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/18/2018] [Revised: 11/02/2018] [Accepted: 12/06/2018] [Indexed: 11/09/2022]
Affiliation(s)
- Navneet Rai
  - UC Davis Genome Center, University of California, Davis, California
  - Department of Computer Science, University of California, Davis, California
- Linh Huynh
  - UC Davis Genome Center, University of California, Davis, California
  - Department of Computer Science, University of California, Davis, California
- Minseung Kim
  - UC Davis Genome Center, University of California, Davis, California
  - Department of Computer Science, University of California, Davis, California
- Ilias Tagkopoulos
  - UC Davis Genome Center, University of California, Davis, California
  - Department of Computer Science, University of California, Davis, California
31
Eetemadi A, Tagkopoulos I. Genetic Neural Networks: an artificial neural network architecture for capturing gene expression relationships. Bioinformatics 2018; 35:2226-2234. [DOI: 10.1093/bioinformatics/bty945] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 06/11/2018] [Revised: 10/27/2018] [Accepted: 11/16/2018] [Indexed: 01/16/2023]
Abstract
Motivation
Gene expression prediction is one of the grand challenges in computational biology. The availability of transcriptomics data combined with recent advances in artificial neural networks provide an unprecedented opportunity to create predictive models of gene expression with far reaching applications.
Results
We present the Genetic Neural Network (GNN), an artificial neural network for predicting genome-wide gene expression given gene knockouts and master regulator perturbations. In its core, the GNN maps existing gene regulatory information in its architecture and it uses cell nodes that have been specifically designed to capture the dependencies and non-linear dynamics that exist in gene networks. These two key features make the GNN architecture capable to capture complex relationships without the need of large training datasets. As a result, GNNs were 40% more accurate on average than competing architectures (MLP, RNN, BiRNN) when compared on hundreds of curated and inferred transcription modules. Our results argue that GNNs can become the architecture of choice when building predictors of gene expression from exponentially growing corpus of genome-wide transcriptomics data.
Availability and implementation
https://github.com/IBPA/GNN
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ameen Eetemadi
  - Department of Computer Science, University of California, Davis, CA, USA
  - Genome Center, University of California, Davis, CA, USA
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, CA, USA
  - Genome Center, University of California, Davis, CA, USA
32
Abstract
We provide an overview of opportunities and challenges in multi-omics predictive analytics with particular emphasis on data integration and machine learning methods.
Affiliation(s)
- Minseung Kim
  - Department of Computer Science, University of California, Davis, USA
  - Genome Center
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, USA
  - Genome Center
33
|
Freund GS, O’Brien TE, Vinson L, Carlin DA, Yao A, Mak WS, Tagkopoulos I, Facciotti MT, Tantillo DJ, Siegel JB. Elucidating Substrate Promiscuity within the FabI Enzyme Family. ACS Chem Biol 2017; 12:2465-2473. [PMID: 28820936 DOI: 10.1021/acschembio.7b00400] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 01/24/2023]
Abstract
The rapidly growing appreciation of enzymes' catalytic and substrate promiscuity may lead to their expanded use in the fields of chemical synthesis and industrial biotechnology. Here, we explore the substrate promiscuity of enoyl-acyl carrier protein reductases (commonly known as FabI) and how that promiscuity is a function of inherent reactivity and the geometric demands of the enzyme's active site. We demonstrate that these enzymes catalyze the reduction of a wide range of substrates, particularly α,β-unsaturated aldehydes. In addition, we demonstrate that a combination of quantum mechanical hydride affinity calculations and molecular docking can be used to rapidly categorize compounds that FabI can use as substrates. The results here provide new insight into the determinants of catalysis for FabI and set the stage for the development of a new assay for drug discovery, organic synthesis, and novel biocatalysts.
Affiliation(s)
- Gabriel S. Freund
  - Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, United States
  - Department of Mathematics, University of California Davis, Davis, California, United States
- Terrence E. O’Brien
  - Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, United States
  - Department of Chemistry, University of California Davis, Davis, California, United States
- Logan Vinson
  - Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, United States
- Dylan Alexander Carlin
  - Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, United States
  - Biophysics Graduate Group, University of California Davis, Davis, California, United States
- Andrew Yao
  - Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, United States
- Wai Shun Mak
  - Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, United States
  - Department of Chemistry, University of California Davis, Davis, California, United States
- Ilias Tagkopoulos
  - Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, United States
  - Department of Computer Science, University of California Davis, Davis, California, United States
- Marc T. Facciotti
  - Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, United States
  - Department of Biomedical Engineering, University of California, Davis, California, United States
- Dean J. Tantillo
  - Department of Chemistry, University of California Davis, Davis, California, United States
- Justin B. Siegel
  - Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, United States
  - Department of Chemistry, University of California Davis, Davis, California, United States
  - Department of Biochemistry & Molecular Medicine, University of California Davis, Davis, California, United States
34
Abstract
Protein inference, the identification of the protein set that is the origin of a given peptide profile, is a fundamental challenge in proteomics. We present DeepPep, a deep-convolutional neural network framework that predicts the protein set from a proteomics mixture, given the sequence universe of possible proteins and a target peptide profile. In its core, DeepPep quantifies the change in probabilistic score of peptide-spectrum matches in the presence or absence of a specific protein, hence selecting as candidate proteins with the largest impact to the peptide profile. Application of the method across datasets argues for its competitive predictive ability (AUC of 0.80±0.18, AUPR of 0.84±0.28) in inferring proteins without need of peptide detectability on which the most competitive methods rely. We find that the convolutional neural network architecture outperforms the traditional artificial neural network architectures without convolution layers in protein inference. We expect that similar deep learning architectures that allow learning nonlinear patterns can be further extended to problems in metagenome profiling and cell type inference. The source code of DeepPep and the benchmark datasets used in this study are available at https://deeppep.github.io/DeepPep/.
The accurate identification of proteins in a proteomics sample, called the protein inference problem, is a fundamental challenge in biomedical sciences. Current approaches are based on applications of traditional neural networks, linear optimization and Bayesian techniques. We here present DeepPep, a deep-convolutional neural network framework that predicts the protein set from a standard proteomics mixture, given all protein sequences and a peptide profile. Comparison to leading methods shows that DeepPep has most robust performance with various instruments and datasets. Our results provide evidence that using sequence-level location information of a peptide in the context of proteome sequence can result in more accurate and robust protein inference. We conclude that Deep Learning on protein sequence leads to superior platforms for protein inference that can be further refined with additional features and extended for far reaching applications.
Affiliation(s)
- Minseung Kim
- Department of Computer Science, University of California, Davis, Davis, California, United States of America
- Genome Center, University of California, Davis, Davis, California, United States of America
| | - Ameen Eetemadi
- Department of Computer Science, University of California, Davis, Davis, California, United States of America
- Genome Center, University of California, Davis, Davis, California, United States of America
| | - Ilias Tagkopoulos
- Department of Computer Science, University of California, Davis, Davis, California, United States of America
- Genome Center, University of California, Davis, Davis, California, United States of America
| |
|
35
|
Bjornson M, Balcke GU, Xiao Y, de Souza A, Wang JZ, Zhabinskaya D, Tagkopoulos I, Tissier A, Dehesh K. Integrated omics analyses of retrograde signaling mutant delineate interrelated stress-response strata. Plant J 2017; 91:70-84. [PMID: 28370892 PMCID: PMC5488868 DOI: 10.1111/tpj.13547] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 10/31/2016] [Revised: 03/15/2017] [Accepted: 03/20/2017] [Indexed: 05/19/2023]
Abstract
To maintain homeostasis in the face of intrinsic and extrinsic insults, cells have evolved elaborate quality control networks to resolve damage at multiple levels. Interorganellar communication is a key requirement for this maintenance; however, the underlying mechanisms of this communication have remained an enigma. Here we integrate the outcome of transcriptomic, proteomic, and metabolomic analyses of genotypes including ceh1, a mutant with constitutively elevated levels of both the stress-specific plastidial retrograde signaling metabolite methyl-erythritol cyclodiphosphate (MEcPP) and the defense hormone salicylic acid (SA), as well as the high MEcPP but SA deficient genotype ceh1/eds16, along with corresponding controls. Integration of multi-omic analyses enabled us to delineate the function of MEcPP from SA, and expose the compartmentalized role of this retrograde signaling metabolite in induction of distinct but interdependent signaling cascades instrumental in adaptive responses. Specifically, here we identify strata of MEcPP-sensitive stress-response cascades, among which we focus on selected pathways including organelle-specific regulation of jasmonate biosynthesis; simultaneous induction of synthesis and breakdown of SA; and MEcPP-mediated alteration of cellular redox status, in particular glutathione redox balance. Collectively, these integrated multi-omic analyses provided a vehicle to gain an in-depth knowledge of genome-metabolism interactions, and to further probe the extent of these interactions and delineate their functional contributions. Through this approach we were able to pinpoint stress-mediated transcriptional and metabolic signatures and identify the downstream processes modulated by the independent or overlapping functions of MEcPP and SA in adaptive responses.
Affiliation(s)
- Marta Bjornson
- Dept. of Plant Biology, University of California, Davis, CA 95616
- Dept. of Plant Sciences, University of California, Davis, CA 95616
| | | | - Yanmei Xiao
- Dept. of Plant Biology, University of California, Davis, CA 95616
| | - Amancio de Souza
- Dept. of Plant Biology, University of California, Davis, CA 95616
| | - Jin-Zheng Wang
- Dept. of Plant Biology, University of California, Davis, CA 95616
| | - Dina Zhabinskaya
- Dept. of Computer Science, University of California, Davis, CA 95616
| | - Ilias Tagkopoulos
- Dept. of Cell and Metabolic Biology, Leibniz-Institute of Plant Biochemistry, Halle, Germany
| | - Alain Tissier
- Dept. of Physics, University of California, Davis, CA 95616
| | - Katayoon Dehesh
- Dept. of Plant Biology, University of California, Davis, CA 95616
| |
|
36
|
Zorraquino V, Kim M, Rai N, Tagkopoulos I. The genetic and transcriptional basis of short and long term adaptation across multiple stresses in Escherichia coli. Mol Biol Evol 2016; 34:707-717. [DOI: 10.1093/molbev/msw269] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 11/12/2022]
|
37
|
Abstract
Mathematical modeling and numerical simulation are crucial to support design decisions in synthetic biology. Accurate estimation of parameter values is key, as direct experimental measurements are difficult and time-consuming. Insufficient data, incompatible measurements, and specialized models that lack universal parameters make this task challenging. Here, we have created a database (PAMDB) that integrates data from 135 publications that contain 118 circuits and 165 genetic parts of the bacterium Escherichia coli. We used a succinct, universal model formulation to describe the part behavior in each circuit. We introduce a constrained consensus inference method that was used to infer the value of the model parameters and evaluated its performance through cross-validation in a benchmark of 23 circuits. We discuss these results and summarize the challenges in data integration and parameter inference. This work provides a resource and a methodology that can be used as a point of reference for synthetic circuit modeling.
Affiliation(s)
- Linh Huynh
- Department of Computer Science & UC Davis Genome Center, University of California Davis, Davis, California 95616 United States
| | - Ilias Tagkopoulos
- Department of Computer Science & UC Davis Genome Center, University of California Davis, Davis, California 95616 United States
| |
|
38
|
Kim M, Rai N, Zorraquino V, Tagkopoulos I. Multi-omics integration accurately predicts cellular state in unexplored conditions for Escherichia coli. Nat Commun 2016; 7:13090. [PMID: 27713404 PMCID: PMC5059772 DOI: 10.1038/ncomms13090] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Received: 03/02/2016] [Accepted: 09/01/2016] [Indexed: 12/20/2022]
Abstract
A significant obstacle in training predictive cell models is the lack of integrated data sources. We develop semi-supervised normalization pipelines and perform experimental characterization (growth, transcriptional, proteome) to create Ecomics, a consistent, quality-controlled multi-omics compendium for Escherichia coli with cohesive meta-data information. We then use this resource to train a multi-scale model that integrates four omics layers to predict genome-wide concentrations and growth dynamics. The genetic and environmental ontology reconstructed from the omics data is substantially different and complementary to the genetic and chemical ontologies. The integration of different layers confers an incremental increase in the prediction performance, as does the information about the known gene regulatory and protein-protein interactions. The predictive performance of the model ranges from 0.54 to 0.87 for the various omics layers, which far exceeds various baselines. This work provides an integrative framework of omics-driven predictive modelling that is broadly applicable to guide biological discovery. Multi-omics data integration is a great challenge. Here, the authors compile a database of E. coli proteomics, transcriptomics, metabolomics and fluxomics data to train models of recurrent neural network and constrained regression, enabling prediction of bacterial responses to perturbations.
Affiliation(s)
- Minseung Kim
- Department of Computer Science, University of California, Davis, California 95616, USA; Genome Center, University of California, Davis, California 95616, USA
| | - Navneet Rai
- Genome Center, University of California, Davis, California 95616, USA
| | | | - Ilias Tagkopoulos
- Department of Computer Science, University of California, Davis, California 95616, USA; Genome Center, University of California, Davis, California 95616, USA
| |
|
39
|
Affiliation(s)
- Matthew H. Meisner
- Department of Statistics and Center for Population Biology University of California‐Davis Davis California 95616 USA
| | - Jay A. Rosenheim
- Department of Entomology and Nematology University of California‐Davis Davis California 95616 USA
| | - Ilias Tagkopoulos
- Department of Computer Science University of California‐Davis Davis California 95616 USA
| |
|
40
|
Carlin DA, Caster RW, Wang X, Betzenderfer SA, Chen CX, Duong VM, Ryklansky CV, Alpekin A, Beaumont N, Kapoor H, Kim N, Mohabbot H, Pang B, Teel R, Whithaus L, Tagkopoulos I, Siegel JB. Kinetic Characterization of 100 Glycoside Hydrolase Mutants Enables the Discovery of Structural Features Correlated with Kinetic Constants. PLoS One 2016; 11:e0147596. [PMID: 26815142 PMCID: PMC4729467 DOI: 10.1371/journal.pone.0147596] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 07/31/2015] [Accepted: 01/06/2016] [Indexed: 11/18/2022]
Abstract
The use of computational modeling algorithms to guide the design of novel enzyme catalysts is a rapidly growing field. Force-field based methods have now been used to engineer both enzyme specificity and activity. However, the proportion of designed mutants with the intended function is often less than ten percent. One potential reason for this is that current force-field based approaches are trained on indirect measures of function rather than direct correlation to experimentally-determined functional effects of mutations. We hypothesize that this is partially due to the lack of data sets for which a large panel of enzyme variants has been produced, purified, and kinetically characterized. Here we report the kcat and KM values of 100 purified mutants of a glycoside hydrolase enzyme. We demonstrate the utility of this data set by using machine learning to train a new algorithm that enables prediction of each kinetic parameter based on readily-modeled structural features. The generated dataset and analyses carried out in this study not only provide insight into how this enzyme functions, they also provide a clear path forward for the improvement of computational enzyme redesign algorithms.
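For readers unfamiliar with the two constants reported above, the following minimal sketch shows how kcat and KM enter the standard Michaelis-Menten rate law, v = kcat·[E]·[S] / (KM + [S]). It assumes simple Michaelis-Menten kinetics; the function names and the numeric values are illustrative and are not taken from the study.

```python
def michaelis_menten_rate(kcat, km, enzyme_conc, substrate_conc):
    """Initial reaction rate under the Michaelis-Menten rate law.

    kcat: turnover number (1/time)
    km: Michaelis constant (same concentration units as substrate_conc)
    enzyme_conc, substrate_conc: concentrations (hypothetical inputs)
    """
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)


def catalytic_efficiency(kcat, km):
    """kcat/KM, a common single-number summary of an enzyme variant."""
    return kcat / km


# Illustrative values only: when [S] equals KM, the rate is half of kcat * [E].
rate_at_km = michaelis_menten_rate(kcat=10.0, km=2.0, enzyme_conc=1.0, substrate_conc=2.0)
efficiency = catalytic_efficiency(kcat=10.0, km=2.0)
```

Comparing such values across variants is one simple way to read a kinetic dataset of this kind; the machine learning model described in the abstract is not reproduced here.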
Affiliation(s)
- Dylan Alexander Carlin
- Biophysics Graduate Group, University of California Davis, California, United States of America
| | - Ryan W. Caster
- Genome Center, University of California Davis, Davis, California, United States of America
| | - Xiaokang Wang
- Department of Biomedical Engineering, University of California Davis, Davis, California, United States of America
| | | | - Claire X. Chen
- Genome Center, University of California Davis, Davis, California, United States of America
| | - Veasna M. Duong
- Genome Center, University of California Davis, Davis, California, United States of America
| | - Carolina V. Ryklansky
- Genome Center, University of California Davis, Davis, California, United States of America
| | - Alp Alpekin
- Genome Center, University of California Davis, Davis, California, United States of America
| | - Nathan Beaumont
- Genome Center, University of California Davis, Davis, California, United States of America
| | - Harshul Kapoor
- Genome Center, University of California Davis, Davis, California, United States of America
| | - Nicole Kim
- Genome Center, University of California Davis, Davis, California, United States of America
| | - Hosna Mohabbot
- Genome Center, University of California Davis, Davis, California, United States of America
| | - Boyu Pang
- Genome Center, University of California Davis, Davis, California, United States of America
| | - Rachel Teel
- Genome Center, University of California Davis, Davis, California, United States of America
| | - Lillian Whithaus
- Genome Center, University of California Davis, Davis, California, United States of America
| | - Ilias Tagkopoulos
- Genome Center, University of California Davis, Davis, California, United States of America
- Department of Computer Science, University of California Davis, Davis, California, United States of America
| | - Justin B. Siegel
- Genome Center, University of California Davis, Davis, California, United States of America
- Department of Chemistry, University of California Davis, Davis, California, United States of America
- Department of Biochemistry & Molecular Medicine, University of California Davis, Davis, California, United States of America
| |
|
41
|
Abstract
In computer-aided biological design, the trifecta of characterized part libraries, accurate models and optimal design parameters is crucial for producing reliable designs. As the number of parts and model complexity increase, however, it becomes exponentially more difficult for any optimization method to search the solution space, hence creating a trade-off that hampers efficient design. To address this issue, we present a hierarchical computer-aided design architecture that uses a two-step approach for biological design. First, a simple model of low computational complexity is used to predict circuit behavior and assess candidate circuit branches through branch-and-bound methods. Then, a complex, nonlinear circuit model is used for a fine-grained search of the reduced solution space, thus achieving more accurate results. Evaluation with a benchmark of 11 circuits and a library of 102 experimental designs with known characterization parameters demonstrates a speed-up of 3 orders of magnitude when compared to other design methods that provide optimality guarantees.
Affiliation(s)
- Linh Huynh
- Department of Computer Science & UC Davis Genome Center, University of California Davis, Davis, California 95616, United States
| | - Ilias Tagkopoulos
- Department of Computer Science & UC Davis Genome Center, University of California Davis, Davis, California 95616, United States
| |
|
42
|
Kim M, Zorraquino V, Tagkopoulos I. Microbial forensics: predicting phenotypic characteristics and environmental conditions from large-scale gene expression profiles. PLoS Comput Biol 2015; 11:e1004127. [PMID: 25774498 PMCID: PMC4361189 DOI: 10.1371/journal.pcbi.1004127] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 05/26/2014] [Accepted: 01/14/2015] [Indexed: 01/13/2023]
Abstract
A tantalizing question in cellular physiology is whether the cellular state and environmental conditions can be inferred by the expression signature of an organism. To investigate this relationship, we created an extensive normalized gene expression compendium for the bacterium Escherichia coli that was further enriched with meta-information through an iterative learning procedure. We then constructed an ensemble method to predict environmental and cellular state, including strain, growth phase, medium, oxygen level, antibiotic and carbon source presence. Results show that gene expression is an excellent predictor of environmental structure, with multi-class ensemble models achieving balanced accuracy between 70.0% (±3.5%) and 98.3% (±2.3%) for the various characteristics. Interestingly, this performance can be significantly boosted when environmental and strain characteristics are simultaneously considered, as a composite classifier that captures the inter-dependencies of three characteristics (medium, phase and strain) achieved 10.6% (±1.0%) higher performance than any individual model. Contrary to expectations, only 59% of the top informative genes were also identified as differentially expressed under the respective conditions. Functional analysis of the respective genetic signatures implicates a wide spectrum of Gene Ontology terms and KEGG pathways with condition-specific information content, including iron transport, transferases, and enterobactin synthesis. Further experimental phenotypic-to-genotypic mapping that we conducted for knock-out mutants argues for the information content of top-ranked genes. This work demonstrates the degree to which genome-scale transcriptional information can be predictive of latent, heterogeneous and seemingly disparate phenotypic and environmental characteristics, with far-reaching applications.
The transcriptional profile of an organism contains clues about the environmental context in which it has evolved and currently lives, its behavior and cellular state. It is yet unclear, however, how much information can be efficiently extracted and how it can be used to classify new samples with respect to their environmental and genetic characteristics. Here, we have constructed an extensive transcriptome compendium of Escherichia coli that we have further enriched via an iterative learning approach. We then apply an ensemble of various machine learning algorithms to infer environmental and cellular information such as strain, growth phase, medium, oxygen level, antibiotic and carbon source. Functional analysis of the most informative genes provides mechanistic insights and palpable hypotheses regarding their role in each environmental or genetic context. Our work argues that genome-scale gene expression can be a multi-purpose marker for identifying latent, heterogeneous cellular and environmental states and that optimal classification can be achieved with a feature set of a couple hundred genes that might not necessarily have the most pronounced differential expression in the respective conditions.
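The abstract reports results as balanced accuracy. As a minimal sketch, assuming the standard definition (the mean of per-class recall), the metric can be computed as below. The function name and the toy labels are hypothetical and are not data from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy example with made-up medium labels: the classes are imbalanced (4 vs 2).
y_true = ["LB", "LB", "LB", "LB", "M9", "M9"]
y_pred = ["LB", "LB", "LB", "M9", "M9", "LB"]
score = balanced_accuracy(y_true, y_pred)  # recall is 3/4 for LB and 1/2 for M9
```

Unlike plain accuracy, this metric weights every class equally, which matters when some conditions are much rarer than others in a compendium.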
Affiliation(s)
- Minseung Kim
- Department of Computer Science, University of California, Davis, Davis, California, United States of America
- UC Davis Genome Center, University of California, Davis, Davis, California, United States of America
| | - Violeta Zorraquino
- UC Davis Genome Center, University of California, Davis, Davis, California, United States of America
| | - Ilias Tagkopoulos
- Department of Computer Science, University of California, Davis, Davis, California, United States of America
- UC Davis Genome Center, University of California, Davis, Davis, California, United States of America
| |
|
43
|
Tsoukalas A, Albertson T, Tagkopoulos I. From data to optimal decision making: a data-driven, probabilistic machine learning approach to decision support for patients with sepsis. JMIR Med Inform 2015; 3:e11. [PMID: 25710907 PMCID: PMC4376114 DOI: 10.2196/medinform.3445] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 04/03/2014] [Revised: 08/26/2014] [Accepted: 10/11/2014] [Indexed: 12/01/2022]
Abstract
Background A tantalizing question in medical informatics is how to construct knowledge from heterogeneous datasets, and as an extension, inform clinical decisions. The emergence of large-scale data integration in electronic health records (EHR) presents tremendous opportunities. However, our ability to efficiently extract informed decision support is limited due to the complexity of the clinical states and decision process, missing data and lack of analytical tools to advise based on statistical relationships. Objective Development and assessment of a data-driven method that infers the probability distribution of the current state of patients with sepsis, likely trajectories, optimal actions related to antibiotic administration, prediction of mortality and length-of-stay. Methods We present a data-driven, probabilistic framework for clinical decision support in sepsis-related cases. We first define states, actions, observations and rewards based on clinical practice, expert knowledge and data representations in an EHR dataset of 1492 patients. We then use a Partially Observable Markov Decision Process (POMDP) model to derive the optimal policy based on individual patient trajectories and we evaluate the performance of the model-derived policies in a separate test set. Policy decisions were focused on the type of antibiotic combinations to administer. Multi-class and discriminative classifiers were used to predict mortality and length of stay. Results Data-derived antibiotic administration policies led to a favorable patient outcome in 49% of the cases, versus 37% when the alternative policies were followed (P=1.3e-13). Sensitivity analysis on the model parameters and missing data argues for a highly robust decision support tool that withstands parameter variation and data uncertainty.
When the optimal policy was followed, 387 patients (25.9%) had 90% of their transitions to better states and 503 patients (33.7%) had 90% of their transitions to worse states (P=4.0e-06), while in the non-policy cases, these numbers were 192 (12.9%) and 764 (51.2%) patients (P=4.6e-117), respectively. Furthermore, the percentage of transitions within a trajectory that lead to a better or better/same state is significantly higher when following the policy than for non-policy cases (605 vs 344 patients, P=8.6e-25). Mortality was predicted with an AUC of 0.7 and 0.82 accuracy in the general case and similar performance was obtained for the inference of the length-of-stay (AUC of 0.69 to 0.73 with accuracies from 0.69 to 0.82). Conclusions A data-driven model was able to suggest favorable actions, predict mortality and length of stay with high accuracy. This work provides a solid basis for a scalable probabilistic clinical decision support framework for sepsis treatment that can be expanded to other clinically relevant states and actions, as well as a data-driven model that can be adopted in other clinical areas with sufficient training data.
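A POMDP keeps a probability distribution (a belief) over hidden states and revises it after each action and observation. The sketch below shows the textbook Bayesian belief update, b'(s') ∝ O(o | s', a) · Σ_s T(s' | s, a) · b(s). It is a generic illustration, not the authors' implementation; the state names, action, observation and all probabilities are made-up toy values.

```python
def belief_update(belief, transition, observation_prob, action, observation):
    """One step of the standard POMDP belief update.

    belief: dict state -> probability
    transition: dict action -> state -> next_state -> probability
    observation_prob: dict action -> next_state -> observation -> probability
    """
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        # Predict: probability of landing in s_next before seeing the observation.
        predicted = sum(transition[action][s][s_next] * belief[s] for s in states)
        # Correct: weight the prediction by how likely the observation is in s_next.
        unnormalized[s_next] = observation_prob[action][s_next][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical two-state toy model (values are illustrative only).
belief = {"better": 0.5, "worse": 0.5}
transition = {"antibiotic": {"better": {"better": 0.9, "worse": 0.1},
                             "worse": {"better": 0.4, "worse": 0.6}}}
observation_prob = {"antibiotic": {"better": {"normal": 0.8, "abnormal": 0.2},
                                   "worse": {"normal": 0.2, "abnormal": 0.8}}}
new_belief = belief_update(belief, transition, observation_prob, "antibiotic", "normal")
```

A policy then maps the updated belief to the next action; deriving that policy is the harder optimization step and is not shown here.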
Affiliation(s)
- Athanasios Tsoukalas
- Department of Computer Science and Genome Center, University of California, Davis, Davis, CA, United States
| | | | | |
|
44
|
Taylor-Teeples M, Lin L, de Lucas M, Turco G, Toal TW, Gaudinier A, Young NF, Trabucco GM, Veling MT, Lamothe R, Handakumbura PP, Xiong G, Wang C, Corwin J, Tsoukalas A, Zhang L, Ware D, Pauly M, Kliebenstein DJ, Dehesh K, Tagkopoulos I, Breton G, Pruneda-Paz JL, Ahnert SE, Kay SA, Hazen SP, Brady SM. An Arabidopsis gene regulatory network for secondary cell wall synthesis. Nature 2014; 517:571-5. [PMID: 25533953 PMCID: PMC4333722 DOI: 10.1038/nature14099] [Citation(s) in RCA: 447] [Impact Index Per Article: 44.7] [Received: 11/07/2013] [Accepted: 11/20/2014] [Indexed: 12/15/2022]
Abstract
The plant cell wall is an important factor for determining cell shape, function and response to the environment. Secondary cell walls, such as those found in xylem, are composed of cellulose, hemicelluloses and lignin and account for the bulk of plant biomass. The coordination between transcriptional regulation of synthesis for each polymer is complex and vital to cell function. A regulatory hierarchy of developmental switches has been proposed, although the full complement of regulators remains unknown. Here, we present a protein-DNA network between Arabidopsis transcription factors and secondary cell wall metabolic genes with gene expression regulated by a series of feed-forward loops. This model allowed us to develop and validate new hypotheses about secondary wall gene regulation under abiotic stress. Distinct stresses are able to perturb targeted genes to potentially promote functional adaptation. These interactions will serve as a foundation for understanding the regulation of a complex, integral plant component.
Affiliation(s)
- M Taylor-Teeples
- [1] Department of Plant Biology, University of California Davis, One Shields Avenue, Davis, California 95616, USA [2] Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, USA
| | - L Lin
- Biology Department, University of Massachusetts, Amherst, Massachusetts 01003, USA
| | - M de Lucas
- [1] Department of Plant Biology, University of California Davis, One Shields Avenue, Davis, California 95616, USA [2] Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, USA
| | - G Turco
- [1] Department of Plant Biology, University of California Davis, One Shields Avenue, Davis, California 95616, USA [2] Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, USA
| | - T W Toal
- [1] Department of Plant Biology, University of California Davis, One Shields Avenue, Davis, California 95616, USA [2] Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, USA
| | - A Gaudinier
- [1] Department of Plant Biology, University of California Davis, One Shields Avenue, Davis, California 95616, USA [2] Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, USA
| | - N F Young
- Biology Department, University of Massachusetts, Amherst, Massachusetts 01003, USA
| | - G M Trabucco
- Biology Department, University of Massachusetts, Amherst, Massachusetts 01003, USA
| | - M T Veling
- Biology Department, University of Massachusetts, Amherst, Massachusetts 01003, USA
| | - R Lamothe
- Biology Department, University of Massachusetts, Amherst, Massachusetts 01003, USA
| | - P P Handakumbura
- Biology Department, University of Massachusetts, Amherst, Massachusetts 01003, USA
| | - G Xiong
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, California 94720, USA
| | - C Wang
- Department of Plant Biology, University of California Davis, One Shields Avenue, Davis, California 95616, USA
| | - J Corwin
- Department of Plant Sciences, University of California Davis, One Shields Avenue, Davis, California 95616, USA
| | - A Tsoukalas
- [1] Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, USA [2] Department of Computer Science, University of California Davis, One Shields Avenue, Davis, California 95616, USA
| | - L Zhang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
| | - D Ware
- [1] Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA [2] US Department of Agriculture, Agricultural Research Service, Ithaca, New York 14853, USA
| | - M Pauly
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, California 94720, USA
| | - D J Kliebenstein
- Department of Plant Sciences, University of California Davis, One Shields Avenue, Davis, California 95616, USA
| | - K Dehesh
- Department of Plant Biology, University of California Davis, One Shields Avenue, Davis, California 95616, USA
| | - I Tagkopoulos
- [1] Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, USA [2] Department of Computer Science, University of California Davis, One Shields Avenue, Davis, California 95616, USA
| | - G Breton
- Section of Cell and Developmental Biology, Division of Biological Sciences, University of California San Diego, La Jolla, California 92093, USA
| | - J L Pruneda-Paz
- Section of Cell and Developmental Biology, Division of Biological Sciences, University of California San Diego, La Jolla, California 92093, USA
| | - S E Ahnert
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge CB3 0HE, UK
| | - S A Kay
- Section of Cell and Developmental Biology, Division of Biological Sciences, University of California San Diego, La Jolla, California 92093, USA
| | - S P Hazen
- Biology Department, University of Massachusetts, Amherst, Massachusetts 01003, USA
| | - S M Brady
- [1] Department of Plant Biology, University of California Davis, One Shields Avenue, Davis, California 95616, USA [2] Genome Center, University of California Davis, One Shields Avenue, Davis, California 95616, USA
| |
|
45
|
Abstract
An integral challenge in synthetic circuit design is the selection of optimal parts to populate a given circuit topology, so that the resulting circuit behavior best approximates the desired one. In some cases, it is also possible to reuse multipart constructs or modules that have been already built and experimentally characterized. Efficient part and module selection algorithms are essential to systematically search the solution space, and their significance will only increase in the following years due to the projected explosion in part libraries and circuit complexity. Here, we address this problem by introducing a structured abstraction methodology and a dynamic programming-based algorithm that guarantees optimal part selection. In addition, we provide three extensions that are based on symmetry check, information look-ahead and branch-and-bound techniques, to reduce the running time and space requirements. We have evaluated the proposed methodology with a benchmark of 11 circuits, a database of 73 parts and 304 experimentally constructed modules with encouraging results. This work represents a fundamental departure from traditional heuristic-based methods for part and module selection and is a step toward maximizing efficiency in synthetic circuit design and construction.
Affiliation(s)
- Linh Huynh
- Department of Computer Science and UC Davis Genome Center University of California Davis, California 95616 United States
| | - Ilias Tagkopoulos
- Department of Computer Science and UC Davis Genome Center University of California Davis, California 95616 United States
| |
|
46
|
Aung HH, Tsoukalas A, Rutledge JC, Tagkopoulos I. A systems biology analysis of brain microvascular endothelial cell lipotoxicity. BMC Syst Biol 2014; 8:80. [PMID: 24993133 PMCID: PMC4112729 DOI: 10.1186/1752-0509-8-80] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 03/27/2014] [Accepted: 06/23/2014] [Indexed: 02/08/2023]
Abstract
Background Neurovascular inflammation is associated with a number of neurological diseases including vascular dementia and Alzheimer’s disease, which are increasingly important causes of morbidity and mortality around the world. Lipotoxicity is a metabolic disorder that results from accumulation of lipids, particularly fatty acids, in non-adipose tissue leading to cellular dysfunction, lipid droplet formation, and cell death. Results Our studies indicate for the first time that the neurovascular circulation also can manifest lipotoxicity, which could have major effects on cognitive function. The penetration of integrative systems biology approaches is limited in this area of research, which reduces our capacity to gain an objective insight into the signal transduction and regulation dynamics at a systems level. To address this question, we treated human microvascular endothelial cells with triglyceride-rich lipoprotein (TGRL) lipolysis products and then we used genome-wide transcriptional profiling to obtain transcript abundances over four conditions. We then identified regulatory genes and their targets that have been differentially expressed through analysis of the datasets with various statistical methods. We created a functional gene network by exploiting co-expression observations through a guilt-by-association assumption. Concomitantly, we used various network inference algorithms to identify putative regulatory interactions and we integrated all predictions to construct a consensus gene regulatory network that is TGRL lipolysis product specific. Conclusion Systems biology analysis has led to the validation of putative lipid-related targets and the discovery of several genes that may be implicated in lipotoxic-related brain microvascular endothelial cell responses. Here, we report that activating transcription factor 3 (ATF3) is a principal regulator of TGRL lipolysis products-induced gene expression in human brain microvascular endothelial cells.
Affiliation(s)
- Ilias Tagkopoulos
- UC Davis Genome Center, University of California, Davis, CA 95616, USA.
|
47
|
Carrera J, Estrela R, Luo J, Rai N, Tsoukalas A, Tagkopoulos I. An integrative, multi-scale, genome-wide model reveals the phenotypic landscape of Escherichia coli. Mol Syst Biol 2014; 10:735. [PMID: 24987114 PMCID: PMC4299492 DOI: 10.15252/msb.20145108] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Indexed: 11/10/2022]
Abstract
Given the vast behavioral repertoire and biological complexity of even the simplest organisms,
accurately predicting phenotypes in novel environments and unveiling their biological organization
is a challenging endeavor. Here, we present an integrative modeling methodology that unifies under a
common framework the various biological processes and their interactions across multiple layers. We
trained this methodology on an extensive normalized compendium for the gram-negative bacterium
Escherichia coli, which incorporates gene expression data for genetic and
environmental perturbations, transcriptional regulation, signal transduction, and metabolic
pathways, as well as growth measurements. Comparison with measured growth and high-throughput data
demonstrates the enhanced ability of the integrative model to predict phenotypic outcomes in various
environmental and genetic conditions, even in cases where their underlying functions are
under-represented in the training set. This work paves the way toward integrative techniques that
extract knowledge from a variety of biological data to achieve more than the sum of their parts in
the context of prediction, analysis, and redesign of biological systems.
Affiliation(s)
- Javier Carrera
- UC Davis Genome Center, University of California, Davis, CA, USA
- Raissa Estrela
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
- Jing Luo
- UC Davis Genome Center, University of California, Davis, CA, USA
- Navneet Rai
- UC Davis Genome Center, University of California, Davis, CA, USA
- Athanasios Tsoukalas
- UC Davis Genome Center, University of California, Davis, CA, USA; Department of Computer Science, University of California, Davis, CA, USA
- Ilias Tagkopoulos
- UC Davis Genome Center, University of California, Davis, CA, USA; Department of Computer Science, University of California, Davis, CA, USA
|
48
|
Gultepe E, Green JP, Nguyen H, Adams J, Albertson T, Tagkopoulos I. From vital signs to clinical outcomes for patients with sepsis: a machine learning basis for a clinical decision support system. J Am Med Inform Assoc 2013; 21:315-25. [PMID: 23959843 DOI: 10.1136/amiajnl-2013-001815] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.8] [Indexed: 01/08/2023]
Abstract
OBJECTIVE: To develop a decision support system to identify patients at high risk for hyperlactatemia based upon routinely measured vital signs and laboratory studies. MATERIALS AND METHODS: Electronic health records of 741 adult patients at the University of California Davis Health System who met at least two systemic inflammatory response syndrome criteria were used to associate patients' vital signs and white blood cell count (WBC) with sepsis occurrence and mortality. Generative and discriminative classification methods (naïve Bayes, support vector machines, Gaussian mixture models, hidden Markov models) were used to integrate heterogeneous patient data and form a predictive tool for the inference of lactate level and mortality risk. RESULTS: An accuracy of 0.99 and a discriminability of 1.00 (area under the receiver operating characteristic curve, AUC) were obtained for lactate level prediction when the vital signs and WBC measurements were analysed in a 24 h time bin. An accuracy of 0.73 and a discriminability of 0.73 AUC for mortality prediction in patients with sepsis were achieved with only three features: median of lactate levels, mean arterial pressure, and median absolute deviation of the respiratory rate. DISCUSSION: This study introduces a new scheme for the prediction of lactate levels and mortality risk from patient vital signs and WBC. Accurate prediction of both these variables can drive the appropriate response by clinical staff and thus may have important implications for patient health and treatment outcome. CONCLUSIONS: Effective predictions of lactate levels and mortality risk can be provided with a few clinical variables when the temporal aspect and variability of patient data are considered.
Affiliation(s)
- Eren Gultepe
- Department of Biomedical Engineering, University of California, Davis, California, USA
|
49
|
Abstract
Microbial evolution has been extensively studied in the past fifty years, which has led to seminal discoveries that have shaped our understanding of evolutionary forces and dynamics. Only recently, however, have transformative technologies and computational advances enabled larger-scale and more in-depth investigation of the genetic basis and mechanistic underpinnings of evolutionary adaptation. In this review, we focus on the strengths and limitations of in vivo and in silico techniques for studying microbial evolution in the laboratory, and we discuss how these complementary approaches can be integrated in a unifying framework for elucidating microbial evolution.
Affiliation(s)
- Vadim Mozhayskiy
- Department of Computer Science, UC Davis Genome Center, University of California Davis, Davis, California 95616, USA
|
50
|
Huynh L, Tsoukalas A, Köppe M, Tagkopoulos I. SBROME: a scalable optimization and module matching framework for automated biosystems design. ACS Synth Biol 2013; 2:263-73. [PMID: 23654271 DOI: 10.1021/sb300095m] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 12/25/2022]
Abstract
The development of a scalable framework for biodesign automation is a formidable challenge given the expected increase in part availability and the ever-growing complexity of synthetic circuits. To allow for (a) the use of previously constructed and characterized circuits or modules and (b) the implementation of designs that can scale up to hundreds of nodes, we here propose a divide-and-conquer Synthetic Biology Reusable Optimization Methodology (SBROME). An abstract user-defined circuit is first transformed and matched against a module database that incorporates circuits that have previously been experimentally characterized. Then the resulting circuit is decomposed into subcircuits that are populated with the set of parts that best approximate the desired function. Finally, all subcircuits are characterized and deposited back into the module database for future reuse. We successfully applied SBROME toward two alternative designs of a modular 3-input multiplexer that utilize pre-existing logic gates and characterized biological parts.
Affiliation(s)
- Linh Huynh
- Department of Computer Science and UC Davis Genome Center, and Department of Mathematics, University of California, Davis, California 95616, United States
- Athanasios Tsoukalas
- Department of Computer Science and UC Davis Genome Center, and Department of Mathematics, University of California, Davis, California 95616, United States
- Matthias Köppe
- Department of Computer Science and UC Davis Genome Center, and Department of Mathematics, University of California, Davis, California 95616, United States
- Ilias Tagkopoulos
- Department of Computer Science and UC Davis Genome Center, and Department of Mathematics, University of California, Davis, California 95616, United States
|