1. Costanzo A, Clarke D, Holt M, Sharma S, Nagy K, Tan X, Kain L, Abe B, Luce S, Boitard C, Wyseure T, Mosnier LO, Su AI, Grimes C, Finn MG, Savage PB, Gottschalk M, Pettus J, Teyton L. Repositioning the Early Pathology of Type 1 Diabetes to the Extraislet Vasculature. J Immunol 2024; 212:1094-1104. [PMID: 38426888 PMCID: PMC10944819 DOI: 10.4049/jimmunol.2300769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 01/29/2024] [Indexed: 03/02/2024]
Abstract
Type 1 diabetes (T1D) is a prototypic T cell-mediated autoimmune disease. Because the islets of Langerhans are insulated from blood vessels by a double basement membrane and lack detectable lymphatic drainage, interactions between endocrine and circulating T cells are not permitted. Thus, we hypothesized that initiation and progression of anti-islet immunity required islet neolymphangiogenesis to allow T cell access to the islet. Combining microscopy and single cell approaches, the timing of this phenomenon in mice was situated between 5 and 8 wk of age when activated anti-insulin CD4 T cells became detectable in peripheral blood while peri-islet pathology developed. This "peri-insulitis," dominated by CD4 T cells, respected the islet basement membrane and was limited on the outside by lymphatic endothelial cells that gave it the attributes of a tertiary lymphoid structure. As in most tissues, lymphangiogenesis seemed to be secondary to local segmental endothelial inflammation at the collecting postcapillary venule. In addition to classic markers of inflammation such as CD29, V-CAM, and NOS, MHC class II molecules were expressed by nonhematopoietic cells in the same location both in mouse and human islets. This CD45- MHC class II+ cell population was capable of spontaneously presenting islet Ags to CD4 T cells. Altogether, these observations favor an alternative model for the initiation of T1D, outside of the islet, in which a vascular-associated cell appears to be an important MHC class II-expressing and -presenting cell.
Affiliation(s)
- Anne Costanzo
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA
- Don Clarke
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA
- Marie Holt
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA
- Siddhartha Sharma
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA
- Kenna Nagy
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA
- Xuqian Tan
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA
- Lisa Kain
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA
- Brian Abe
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA
- Tine Wyseure
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA
- Laurent O. Mosnier
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA
- Andrew I. Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA
- Catherine Grimes
  - Department of Chemistry and Biochemistry, University of Delaware, Newark, DE
- M. G. Finn
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA
- Paul B. Savage
  - Department of Chemistry and Biochemistry, Brigham Young University, Provo, UT
- Michael Gottschalk
  - Rady Children’s Hospital, University of California San Diego, San Diego, CA
- Jeremy Pettus
  - UC San Diego School of Medicine, University of California San Diego, San Diego, CA
- Luc Teyton
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA
2. Waagmeester A, Willighagen EL, Su AI, Kutmon M, Gayo JEL, Fernández-Álvarez D, Groom Q, Schaap PJ, Verhagen LM, Koehorst JJ. Author Correction: A protocol for adding knowledge to Wikidata: aligning resources on human coronaviruses. BMC Biol 2023; 21:261. [PMID: 37974169 PMCID: PMC10655412 DOI: 10.1186/s12915-023-01764-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2023] Open
Affiliation(s)
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Martina Kutmon
  - NUTRIM, Maastricht University, Maastricht, The Netherlands
  - Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands
- Peter J Schaap
  - Department of Agrotechnology and Food Sciences, Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, The Netherlands
- Jasper J Koehorst
  - Department of Agrotechnology and Food Sciences, Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, The Netherlands.
3. Gonzalez-Cavazos AC, Tanska A, Mayers M, Carvalho-Silva D, Sridharan B, Rewers PA, Sankarlal U, Jagannathan L, Su AI. DrugMechDB: A Curated Database of Drug Mechanisms. Sci Data 2023; 10:632. [PMID: 37717042 PMCID: PMC10505144 DOI: 10.1038/s41597-023-02534-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Accepted: 09/01/2023] [Indexed: 09/18/2023] Open
Abstract
Computational drug repositioning methods have emerged as an attractive and effective solution to find new candidates for existing therapies, reducing the time and cost of drug development. Repositioning methods based on biomedical knowledge graphs typically offer useful supporting biological evidence. This evidence is based on reasoning chains or subgraphs that connect a drug to a disease prediction. However, there are no databases of drug mechanisms that can be used to train and evaluate such methods. Here, we introduce the Drug Mechanism Database (DrugMechDB), a manually curated database that describes drug mechanisms as paths through a knowledge graph. DrugMechDB integrates a diverse range of authoritative free-text resources to describe 4,583 drug indications with 32,249 relationships, representing 14 major biological scales. DrugMechDB can be employed as a benchmark dataset for assessing computational drug repositioning models or as a valuable resource for training such models.
Affiliation(s)
- Adriana Carolina Gonzalez-Cavazos
  - The Scripps Research Institute, Department of Integrative Structural and Computational Biology, 10550 N Torrey Pines Rd, La Jolla, CA, 92037, USA
- Anna Tanska
  - The Scripps Research Institute, Department of Integrative Structural and Computational Biology, 10550 N Torrey Pines Rd, La Jolla, CA, 92037, USA
- Michael Mayers
  - The Scripps Research Institute, Department of Integrative Structural and Computational Biology, 10550 N Torrey Pines Rd, La Jolla, CA, 92037, USA
- Denise Carvalho-Silva
  - The Scripps Research Institute, Department of Integrative Structural and Computational Biology, 10550 N Torrey Pines Rd, La Jolla, CA, 92037, USA
- Brindha Sridharan
  - The Scripps Research Institute, Department of Integrative Structural and Computational Biology, 10550 N Torrey Pines Rd, La Jolla, CA, 92037, USA
- Patrick A Rewers
  - The Scripps Research Institute, Department of Integrative Structural and Computational Biology, 10550 N Torrey Pines Rd, La Jolla, CA, 92037, USA
- Umasri Sankarlal
  - The Scripps Research Institute, Department of Integrative Structural and Computational Biology, 10550 N Torrey Pines Rd, La Jolla, CA, 92037, USA
- Lakshmanan Jagannathan
  - The Scripps Research Institute, Department of Integrative Structural and Computational Biology, 10550 N Torrey Pines Rd, La Jolla, CA, 92037, USA
- Andrew I Su
  - The Scripps Research Institute, Department of Integrative Structural and Computational Biology, 10550 N Torrey Pines Rd, La Jolla, CA, 92037, USA.
4. Fecho K, Bizon C, Issabekova T, Moxon S, Thessen AE, Abdollahi S, Baranzini SE, Belhu B, Byrd WE, Chung L, Crouse A, Duby MP, Ferguson S, Foksinska A, Forero L, Friedman J, Gardner V, Glusman G, Hadlock J, Hanspers K, Hinderer E, Hobbs C, Hyde G, Huang S, Koslicki D, Mease P, Muller S, Mungall CJ, Ramsey SA, Roach J, Rubin I, Schurman SH, Shalev A, Smith B, Soman K, Stemann S, Su AI, Ta C, Watkins PB, Williams MD, Wu C, Xu CH. An approach for collaborative development of a federated biomedical knowledge graph-based question-answering system: Question-of-the-Month challenges. J Clin Transl Sci 2023; 7:e214. [PMID: 37900350 PMCID: PMC10603356 DOI: 10.1017/cts.2023.619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 08/21/2023] [Indexed: 10/31/2023] Open
Abstract
Knowledge graphs have become a common approach for knowledge representation. Yet, the application of graph methodology is elusive due to the sheer number and complexity of knowledge sources. In addition, semantic incompatibilities hinder efforts to harmonize and integrate across these diverse sources. As part of The Biomedical Translator Consortium, we have developed a knowledge graph-based question-answering system designed to augment human reasoning and accelerate translational scientific discovery: the Translator system. We have applied the Translator system to answer biomedical questions in the context of a broad array of diseases and syndromes, including Fanconi anemia, primary ciliary dyskinesia, multiple sclerosis, and others. A variety of collaborative approaches have been used to research and develop the Translator system. One recent approach involved the establishment of a monthly "Question-of-the-Month (QotM) Challenge" series. Herein, we describe the structure of the QotM Challenge; the six challenges that have been conducted to date on drug-induced liver injury, cannabidiol toxicity, coronavirus infection, diabetes, psoriatic arthritis, and ATP1A3-related phenotypes; the scientific insights that have been gleaned during the challenges; and the technical issues that were identified over the course of the challenges and that can now be addressed to foster further development of the prototype Translator system. We close with a discussion on Large Language Models such as ChatGPT and highlight differences between those models and the Translator system.
Affiliation(s)
- Karamarie Fecho
  - Renaissance Computing Institute (RENCI), University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Copperline Professional Solutions, Pittsboro, NC, USA
- Chris Bizon
  - Renaissance Computing Institute (RENCI), University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Tursynay Issabekova
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Sierra Moxon
  - Biosystems Data Science Department, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Anne E. Thessen
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Shervin Abdollahi
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
- Sergio E. Baranzini
  - Department of Neurology, Weill Institute for Neuroscience, University of California - San Francisco, San Francisco, CA, USA
- William E. Byrd
  - The Hugh Kaul Precision Medicine Institute, University of Alabama at Birmingham, Birmingham, AL, USA
- Lawrence Chung
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Andrew Crouse
  - The Hugh Kaul Precision Medicine Institute, University of Alabama at Birmingham, Birmingham, AL, USA
- Marc P. Duby
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stephen Ferguson
  - National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, USA
- Aleksandra Foksinska
  - The Hugh Kaul Precision Medicine Institute, University of Alabama at Birmingham, Birmingham, AL, USA
- Laura Forero
  - Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA, USA
  - University of California at San Diego, San Diego, CA, USA
- Jennifer Friedman
  - Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA, USA
  - University of California at San Diego, San Diego, CA, USA
- Vicki Gardner
  - Renaissance Computing Institute (RENCI), University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Kristina Hanspers
  - Gladstone Institutes, University of California - San Francisco, San Francisco, CA, USA
- Eugene Hinderer
  - Tufts Clinical and Translational Science Institute, Tufts Medical Center, Boston, MA, USA
- Charlotte Hobbs
  - Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA, USA
- Gregory Hyde
  - Thayer School of Engineering at Dartmouth College, Hanover, NH, USA
- Sui Huang
  - Institute for Systems Biology, Seattle, WA, USA
- David Koslicki
  - Departments of Computer Science and Engineering, Biology, and the Huck Institutes of the Life Sciences, Penn State University, University Park, PA, USA
- Philip Mease
  - Swedish Medical Center, St. Joseph Health, Seattle, WA, USA
  - University of Washington, Seattle, WA, USA
- Christopher J. Mungall
  - Biosystems Data Science Department, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Jared Roach
  - Institute for Systems Biology, Seattle, WA, USA
- Irit Rubin
  - Institute for Systems Biology, Seattle, WA, USA
- Anath Shalev
  - The Hugh Kaul Precision Medicine Institute, University of Alabama at Birmingham, Birmingham, AL, USA
- Brett Smith
  - Institute for Systems Biology, Seattle, WA, USA
- Karthik Soman
  - Department of Neurology, Weill Institute for Neuroscience, University of California - San Francisco, San Francisco, CA, USA
- Sarah Stemann
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
- Andrew I. Su
  - The Scripps Research Institute, La Jolla, CA, USA
- Casey Ta
  - Columbia University Irving Medical Center, New York, NY, USA
- Paul B. Watkins
  - Division of Pharmacotherapy and Experimental Therapeutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Mark D. Williams
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
- Chunlei Wu
  - The Scripps Research Institute, La Jolla, CA, USA
5. Callaghan J, Xu CH, Xin J, Cano MA, Riutta A, Zhou E, Juneja R, Yao Y, Narayan M, Hanspers K, Agrawal A, Pico AR, Wu C, Su AI. BioThings Explorer: a query engine for a federated knowledge graph of biomedical APIs. Bioinformatics 2023; 39:7273783. [PMID: 37707514 PMCID: PMC11015316 DOI: 10.1093/bioinformatics/btad570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 08/18/2023] [Accepted: 09/12/2023] [Indexed: 09/15/2023] Open
Abstract
SUMMARY: Knowledge graphs are an increasingly common data structure for representing biomedical information. These knowledge graphs can easily represent heterogeneous types of information, and many algorithms and tools exist for querying and analyzing graphs. Biomedical knowledge graphs have been used in a variety of applications, including drug repurposing, identification of drug targets, prediction of drug side effects, and clinical decision support. Typically, knowledge graphs are constructed by centralization and integration of data from multiple disparate sources. Here, we describe BioThings Explorer, an application that can query a virtual, federated knowledge graph derived from the aggregated information in a network of biomedical web services. BioThings Explorer leverages semantically precise annotations of the inputs and outputs for each resource, and automates the chaining of web service calls to execute multi-step graph queries. Because there is no large, centralized knowledge graph to maintain, BioThings Explorer is distributed as a lightweight application that dynamically retrieves information at query time. AVAILABILITY AND IMPLEMENTATION: More information can be found at https://explorer.biothings.io and code is available at https://github.com/biothings/biothings_explorer.
Affiliation(s)
- Jackson Callaghan
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Colleen H Xu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Jiwen Xin
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Marco Alvarado Cano
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Anders Riutta
  - Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, CA 94158, United States
- Eric Zhou
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Rohan Juneja
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Yao Yao
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Madhumita Narayan
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Kristina Hanspers
  - Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, CA 94158, United States
- Ayushi Agrawal
  - Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, CA 94158, United States
- Alexander R Pico
  - Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, CA 94158, United States
- Chunlei Wu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
6. Hu E, Li TS, Wineinger NE, Su AI. Association study between drug prescriptions and Alzheimer's disease claims in a commercial insurance database. Alzheimers Res Ther 2023; 15:118. [PMID: 37355615 PMCID: PMC10290352 DOI: 10.1186/s13195-023-01255-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Accepted: 06/01/2023] [Indexed: 06/26/2023]
Abstract
In the ongoing effort to discover treatments for Alzheimer's disease (AD), there has been considerable focus on investigating the use of repurposed drug candidates. Mining of electronic health record data has the potential to identify novel correlated effects between commonly used drugs and AD. In this study, claims from members with commercial health insurance coverage were analyzed to determine the correlation between the use of various drugs on AD incidence and claim frequency. We found that, within the insured population, several medications for psychotic and mental illnesses were associated with higher disease incidence and frequency, while, to a lesser extent, antibiotics and anti-inflammatory drugs were associated with lower AD incidence rates. The observations thus provide a general overview of the prescription and claim relationships between various drug types and Alzheimer's disease, with insights into which drugs have possible implications on resulting AD diagnosis.
Affiliation(s)
- Eric Hu
  - Integrative Structural and Computational Biology, Scripps Research Institute, 10550 North Torrey Pines Rd, La Jolla, CA 92037 USA
- Tong Shu Li
  - Integrative Structural and Computational Biology, Scripps Research Institute, 10550 North Torrey Pines Rd, La Jolla, CA 92037 USA
- Andrew I. Su
  - Integrative Structural and Computational Biology, Scripps Research Institute, 10550 North Torrey Pines Rd, La Jolla, CA 92037 USA
  - Present Address: Scripps Research Translational Institute, La Jolla, CA 92037 USA
7. Gonzalez-Cavazos AC, Tanska A, Mayers MD, Carvalho-Silva D, Sridharan B, Rewers PA, Sankarlal U, Jagannathan L, Su AI. DrugMechDB: A Curated Database of Drug Mechanisms. bioRxiv 2023:2023.05.01.538993. [PMID: 37205439 PMCID: PMC10187194 DOI: 10.1101/2023.05.01.538993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Computational drug repositioning methods have emerged as an attractive and effective solution to find new candidates for existing therapies, reducing the time and cost of drug development. Repositioning methods based on biomedical knowledge graphs typically offer useful supporting biological evidence. This evidence is based on reasoning chains or subgraphs that connect a drug to disease predictions. However, there are no databases of drug mechanisms that can be used to train and evaluate such methods. Here, we introduce the Drug Mechanism Database (DrugMechDB), a manually curated database that describes drug mechanisms as paths through a knowledge graph. DrugMechDB integrates a diverse range of authoritative free-text resources to describe 4,583 drug indications with 32,249 relationships, representing 14 major biological scales. DrugMechDB can be employed as a benchmark dataset for assessing computational drug repurposing models or as a valuable resource for training such models.
Affiliation(s)
- Anna Tanska
  - The Scripps Research Institute, Department of Integrative and Structural Biology, 10550 N Torrey Pines Rd. La Jolla, CA, 92037, USA
- Michael D. Mayers
  - The Scripps Research Institute, Department of Integrative and Structural Biology, 10550 N Torrey Pines Rd. La Jolla, CA, 92037, USA
- Denise Carvalho-Silva
  - The Scripps Research Institute, Department of Integrative and Structural Biology, 10550 N Torrey Pines Rd. La Jolla, CA, 92037, USA
- Brindha Sridharan
  - The Scripps Research Institute, Department of Integrative and Structural Biology, 10550 N Torrey Pines Rd. La Jolla, CA, 92037, USA
- Patrik A. Rewers
  - The Scripps Research Institute, Department of Integrative and Structural Biology, 10550 N Torrey Pines Rd. La Jolla, CA, 92037, USA
- Umasri Sankarlal
  - The Scripps Research Institute, Department of Integrative and Structural Biology, 10550 N Torrey Pines Rd. La Jolla, CA, 92037, USA
- Lakshmanan Jagannathan
  - The Scripps Research Institute, Department of Integrative and Structural Biology, 10550 N Torrey Pines Rd. La Jolla, CA, 92037, USA
- Andrew I. Su
  - The Scripps Research Institute, Department of Integrative and Structural Biology, 10550 N Torrey Pines Rd. La Jolla, CA, 92037, USA
8. Cano MA, Tsueng G, Zhou X, Xin J, Hughes LD, Mullen JL, Su AI, Wu C. Schema Playground: a tool for authoring, extending, and using metadata schemas to improve FAIRness of biomedical data. BMC Bioinformatics 2023; 24:159. [PMID: 37081398 PMCID: PMC10116472 DOI: 10.1186/s12859-023-05258-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 03/27/2023] [Indexed: 04/22/2023] Open
Abstract
BACKGROUND: Biomedical researchers are strongly encouraged to make their research outputs more Findable, Accessible, Interoperable, and Reusable (FAIR). While many biomedical research outputs are more readily accessible through open data efforts, finding relevant outputs remains a significant challenge. Schema.org is a metadata vocabulary standardization project that enables web content creators to make their content more FAIR. Leveraging Schema.org could benefit biomedical research resource providers, but it can be challenging to apply Schema.org standards to biomedical research outputs. We created an online browser-based tool that empowers researchers and repository developers to utilize Schema.org or other biomedical schema projects. RESULTS: Our browser-based tool includes features which can help address many of the barriers towards Schema.org-compliance, such as the ability to easily browse for relevant Schema.org classes, the ability to extend and customize a class to be more suitable for biomedical research outputs, the ability to create data validation to ensure adherence of a research output to a customized class, and the ability to register a custom class to our schema registry enabling others to search and re-use it. We demonstrate the use of our tool with the creation of the Outbreak.info schema, a large multi-class schema for harmonizing various COVID-19 related resources. CONCLUSIONS: We have created a browser-based tool to empower biomedical research resource providers to leverage Schema.org classes to make their research outputs more FAIR.
Affiliation(s)
- Jiwen Xin
  - The Scripps Research Institute, San Diego, USA
- Andrew I Su
  - The Scripps Research Institute, San Diego, USA
- Chunlei Wu
  - The Scripps Research Institute, San Diego, USA
9. Callaghan J, Xu CH, Xin J, Cano MA, Riutta A, Zhou E, Juneja R, Yao Y, Narayan M, Hanspers K, Agrawal A, Pico AR, Wu C, Su AI. BioThings Explorer: a query engine for a federated knowledge graph of biomedical APIs. ArXiv 2023:arXiv:2304.09344v1. [PMID: 37131885 PMCID: PMC10153288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Knowledge graphs are an increasingly common data structure for representing biomedical information. These knowledge graphs can easily represent heterogeneous types of information, and many algorithms and tools exist for querying and analyzing graphs. Biomedical knowledge graphs have been used in a variety of applications, including drug repurposing, identification of drug targets, prediction of drug side effects, and clinical decision support. Typically, knowledge graphs are constructed by centralization and integration of data from multiple disparate sources. Here, we describe BioThings Explorer, an application that can query a virtual, federated knowledge graph derived from the aggregated information in a network of biomedical web services. BioThings Explorer leverages semantically precise annotations of the inputs and outputs for each resource, and automates the chaining of web service calls to execute multi-step graph queries. Because there is no large, centralized knowledge graph to maintain, BioThings Explorer is distributed as a lightweight application that dynamically retrieves information at query time. More information can be found at https://explorer.biothings.io, and code is available at https://github.com/biothings/biothings_explorer.
Affiliation(s)
- Jackson Callaghan
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Colleen H Xu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Jiwen Xin
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Marco Alvarado Cano
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Anders Riutta
  - Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, San Francisco, CA 94158, USA
- Eric Zhou
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Rohan Juneja
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Yao Yao
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Madhumita Narayan
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Kristina Hanspers
  - Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, San Francisco, CA 94158, USA
- Ayushi Agrawal
  - Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, San Francisco, CA 94158, USA
- Alexander R Pico
  - Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, San Francisco, CA 94158, USA
- Chunlei Wu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute
10. Tsueng G, Mullen JL, Alkuzweny M, Cano M, Rush B, Haag E, Lin J, Welzel DJ, Zhou X, Qian Z, Latif AA, Hufbauer E, Zeller M, Andersen KG, Wu C, Su AI, Gangavarapu K, Hughes LD. Outbreak.info Research Library: a standardized, searchable platform to discover and explore COVID-19 resources. Nat Methods 2023; 20:536-540. [PMID: 36823331 PMCID: PMC10393269 DOI: 10.1038/s41592-023-01770-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 06/03/2022] [Accepted: 01/17/2023] [Indexed: 02/25/2023]
Abstract
Outbreak.info Research Library is a standardized, searchable interface of coronavirus disease 2019 (COVID-19) and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) publications, clinical trials, datasets, protocols and other resources, built with a reusable framework. We developed a rigorous schema to enforce consistency across different sources and resource types and linked related resources. Researchers can quickly search the latest research across data repositories, regardless of resource type or repository location, via a search interface, public application programming interface (API) and R package.
Affiliation(s)
- Ginger Tsueng
- Department of Integrative, Structural and Computational Biology, the Scripps Research Institute, La Jolla, CA, USA.
| | - Julia L Mullen
- Department of Integrative, Structural and Computational Biology, the Scripps Research Institute, La Jolla, CA, USA
| | - Manar Alkuzweny
- Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, USA
- Department of Immunology and Microbiology, the Scripps Research Institute, La Jolla, CA, USA
| | - Marco Cano
- Department of Integrative, Structural and Computational Biology, the Scripps Research Institute, La Jolla, CA, USA
| | | | - Emily Haag
- Department of Integrative, Structural and Computational Biology, the Scripps Research Institute, La Jolla, CA, USA
| | - Jason Lin
- Department of Integrative, Structural and Computational Biology, the Scripps Research Institute, La Jolla, CA, USA
| | - Dylan J Welzel
- Department of Integrative, Structural and Computational Biology, the Scripps Research Institute, La Jolla, CA, USA
| | - Xinghua Zhou
- Department of Integrative, Structural and Computational Biology, the Scripps Research Institute, La Jolla, CA, USA
| | - Zhongchao Qian
- Department of Integrative, Structural and Computational Biology, the Scripps Research Institute, La Jolla, CA, USA
| | - Alaa Abdel Latif
- Department of Immunology and Microbiology, the Scripps Research Institute, La Jolla, CA, USA
| | - Emory Hufbauer
- Department of Immunology and Microbiology, the Scripps Research Institute, La Jolla, CA, USA
| | - Mark Zeller
- Department of Immunology and Microbiology, the Scripps Research Institute, La Jolla, CA, USA
| | - Kristian G Andersen
- Department of Immunology and Microbiology, the Scripps Research Institute, La Jolla, CA, USA
- Scripps Research Translational Institute, La Jolla, CA, USA
| | - Chunlei Wu
- Department of Integrative, Structural and Computational Biology, the Scripps Research Institute, La Jolla, CA, USA
- Scripps Research Translational Institute, La Jolla, CA, USA
- Department of Molecular Medicine, the Scripps Research Institute, La Jolla, CA, USA
| | - Andrew I Su
- Department of Integrative, Structural and Computational Biology, the Scripps Research Institute, La Jolla, CA, USA
- Scripps Research Translational Institute, La Jolla, CA, USA
- Department of Molecular Medicine, the Scripps Research Institute, La Jolla, CA, USA
| | - Karthik Gangavarapu
- Department of Immunology and Microbiology, the Scripps Research Institute, La Jolla, CA, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Laura D Hughes
- Department of Integrative, Structural and Computational Biology, the Scripps Research Institute, La Jolla, CA, USA.
| |

11. Gangavarapu K, Latif AA, Mullen JL, Alkuzweny M, Hufbauer E, Tsueng G, Haag E, Zeller M, Aceves CM, Zaiets K, Cano M, Zhou X, Qian Z, Sattler R, Matteson NL, Levy JI, Lee RTC, Freitas L, Maurer-Stroh S, Suchard MA, Wu C, Su AI, Andersen KG, Hughes LD. Outbreak.info genomic reports: scalable and dynamic surveillance of SARS-CoV-2 variants and mutations. Nat Methods 2023; 20:512-522. [PMID: 36823332 PMCID: PMC10399614 DOI: 10.1038/s41592-023-01769-3] [Citation(s) in RCA: 78] [Impact Index Per Article: 78.0] [Received: 06/03/2022] [Accepted: 01/17/2023] [Indexed: 02/25/2023]
Abstract
In response to the emergence of SARS-CoV-2 variants of concern, the global scientific community, through unprecedented effort, has sequenced and shared over 11 million genomes through GISAID, as of May 2022. This extraordinarily high sampling rate provides a unique opportunity to track the evolution of the virus in near real-time. Here, we present outbreak.info, a platform that currently tracks over 40 million combinations of Pango lineages and individual mutations, across over 7,000 locations, to provide insights for researchers, public health officials and the general public. We describe the interpretable visualizations available in our web application, the pipelines that enable the scalable ingestion of heterogeneous sources of SARS-CoV-2 variant data and the server infrastructure that enables widespread data dissemination via a high-performance API that can be accessed using an R package. We show how outbreak.info can be used for genomic surveillance and as a hypothesis-generation tool to understand the ongoing pandemic at varying geographic and temporal scales.
Affiliation(s)
- Karthik Gangavarapu
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
- Alaa Abdel Latif
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
- Julia L Mullen
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Manar Alkuzweny
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, USA
- Emory Hufbauer
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
- Ginger Tsueng
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Emily Haag
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Mark Zeller
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
- Christine M Aceves
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
- Karina Zaiets
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Marco Cano
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Xinghua Zhou
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Zhongchao Qian
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Rachel Sattler
  - Skaggs Graduate School of Biological and Chemical Sciences, The Scripps Research Institute, La Jolla, CA, USA
- Nathaniel L Matteson
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
- Joshua I Levy
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
- Raphael T C Lee
  - GISAID Global Data Science Initiative, Munich, Germany
  - Bioinformatics Institute & ID Labs, Agency for Science Technology and Research, Singapore, Singapore
- Lucas Freitas
  - GISAID Global Data Science Initiative, Munich, Germany
  - Oswaldo Cruz Foundation (FIOCRUZ), Rio de Janeiro, Brazil
- Sebastian Maurer-Stroh
  - GISAID Global Data Science Initiative, Munich, Germany
  - Bioinformatics Institute & ID Labs, Agency for Science Technology and Research, Singapore, Singapore
  - National Centre for Infectious Diseases, Ministry of Health, Singapore, Singapore
  - Department of Biological Sciences, National University of Singapore, Singapore, Singapore
- Marc A Suchard
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Department of Biomathematics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, Los Angeles, CA, USA
- Chunlei Wu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
  - Scripps Research Translational Institute, La Jolla, CA, USA
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, USA
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
  - Scripps Research Translational Institute, La Jolla, CA, USA
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, USA
- Kristian G Andersen
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
  - Scripps Research Translational Institute, La Jolla, CA, USA
- Laura D Hughes
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA

12. Hughes LD, Tsueng G, DiGiovanna J, Horvath TD, Rasmussen LV, Savidge TC, Stoeger T, Turkarslan S, Wu Q, Wu C, Su AI, Pache L. Addressing barriers in FAIR data practices for biomedical data. Sci Data 2023; 10:98. [PMID: 36823198 PMCID: PMC9950056 DOI: 10.1038/s41597-023-01969-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/12/2022] [Accepted: 01/13/2023] [Indexed: 02/25/2023] Open
Affiliation(s)
- Laura D Hughes
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Ginger Tsueng
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Jack DiGiovanna
  - Velsera, 529 Main St, Suite 6610, Charlestown, MA, 02129, USA
- Thomas D Horvath
  - Department of Pathology & Immunology, Baylor College of Medicine, Houston, TX, 77030, USA
  - Texas Children's Microbiome Center, Department of Pathology, Texas Children's Hospital, Houston, TX, 77030, USA
- Luke V Rasmussen
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Tor C Savidge
  - Department of Pathology & Immunology, Baylor College of Medicine, Houston, TX, 77030, USA
  - Texas Children's Microbiome Center, Texas Children's Hospital, Houston, TX, 77030, USA
- Thomas Stoeger
  - Department of Chemical and Biological Engineering, McCormick School of Engineering, Evanston, IL, 60208, USA
- Qinglong Wu
  - Department of Pathology & Immunology, Baylor College of Medicine, Houston, TX, 77030, USA
  - Texas Children's Microbiome Center, Texas Children's Hospital, Houston, TX, 77030, USA
- Chunlei Wu
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
  - Scripps Research Translational Institute, La Jolla, CA, 92037, USA
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Andrew I Su
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
  - Scripps Research Translational Institute, La Jolla, CA, 92037, USA
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Lars Pache
  - Infectious and Inflammatory Disease Center, Immunity and Pathogenesis Program, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, 92037, USA

13. Tsueng G, Cano MAA, Bento J, Czech C, Kang M, Pache L, Rasmussen LV, Savidge TC, Starren J, Wu Q, Xin J, Yeaman MR, Zhou X, Su AI, Wu C, Brown L, Shabman RS, Hughes LD. Developing a standardized but extendable framework to increase the findability of infectious disease datasets. Sci Data 2023; 10:99. [PMID: 36823157 PMCID: PMC9950378 DOI: 10.1038/s41597-023-01968-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/12/2022] [Accepted: 01/13/2023] [Indexed: 02/25/2023] Open
Abstract
Biomedical datasets are increasing in size, stored in many repositories, and face challenges in FAIRness (findability, accessibility, interoperability, reusability). As a Consortium of infectious disease researchers from 15 Centers, we aim to adopt open science practices to promote transparency, encourage reproducibility, and accelerate research advances through data reuse. To improve FAIRness of our datasets and computational tools, we evaluated metadata standards across established biomedical data repositories. The vast majority do not adhere to a single standard, such as Schema.org, which is widely-adopted by generalist repositories. Consequently, datasets in these repositories are not findable in aggregation projects like Google Dataset Search. We alleviated this gap by creating a reusable metadata schema based on Schema.org and catalogued nearly 400 datasets and computational tools we collected. The approach is easily reusable to create schemas interoperable with community standards, but customized to a particular context. Our approach enabled data discovery, increased the reusability of datasets from a large research consortium, and accelerated research. Lastly, we discuss ongoing challenges with FAIRness beyond discoverability.
Affiliation(s)
- Ginger Tsueng
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Marco A Alvarado Cano
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
- José Bento
  - Department of Computer Science, Boston College, 245 Beacon St, Chestnut Hill, MA, 02467, USA
- Candice Czech
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Mengjia Kang
  - Division of Pulmonary and Critical Care, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Lars Pache
  - Infectious and Inflammatory Disease Center, Immunity and Pathogenesis Program, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, 92037, USA
- Luke V Rasmussen
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Tor C Savidge
  - Texas Children's Microbiome Center & Department of Pathology & Immunology, Baylor College of Medicine, Houston, TX, 77030, USA
- Justin Starren
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Qinglong Wu
  - Texas Children's Microbiome Center & Department of Pathology & Immunology, Baylor College of Medicine, Houston, TX, 77030, USA
- Jiwen Xin
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Michael R Yeaman
  - Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Divisions of Molecular Medicine and Infectious Diseases, Harbor-UCLA Medical Center, Torrance, CA, 90502, USA
  - Lundquist Institute for Infection & Immunity at Harbor-UCLA Medical Center, Torrance, CA, 90502, USA
- Xinghua Zhou
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Andrew I Su
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
  - Scripps Research Translational Institute, La Jolla, CA, 92037, USA
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Chunlei Wu
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
  - Scripps Research Translational Institute, La Jolla, CA, 92037, USA
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Liliana Brown
  - Office of Genomics and Advanced Technologies, National Institute of Allergy and Infectious Diseases, Rockville, MD, 20852, USA
- Reed S Shabman
  - Office of Genomics and Advanced Technologies, National Institute of Allergy and Infectious Diseases, Rockville, MD, 20852, USA
- Laura D Hughes
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA

14. Tsueng G, Mullen JL, Alkuzweny M, Cano M, Rush B, Haag E, Curators O, Lin J, Welzel DJ, Zhou X, Qian Z, Latif AA, Hufbauer E, Zeller M, Andersen KG, Wu C, Su AI, Gangavarapu K, Hughes LD. Outbreak.info Research Library: A standardized, searchable platform to discover and explore COVID-19 resources. bioRxiv 2022:2022.01.20.477133. [PMID: 35132411 PMCID: PMC8820656 DOI: 10.1101/2022.01.20.477133] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 05/17/2023]
Abstract
To combat the ongoing COVID-19 pandemic, scientists have been conducting research at breakneck speeds, producing over 52,000 peer-reviewed articles within the first year. To address the challenge in tracking the vast amount of new research located in separate repositories, we developed outbreak.info Research Library, a standardized, searchable interface of COVID-19 and SARS-CoV-2 resources. Unifying metadata from sixteen repositories, we assembled a collection of over 350,000 publications, clinical trials, datasets, protocols, and other resources as of October 2022. We used a rigorous schema to enforce consistency across different sources and resource types and linked related resources. Researchers can quickly search the latest research across data repositories, regardless of resource type or repository location, via a search interface, public API, and R package. Finally, we discuss the challenges inherent in combining metadata from scattered and heterogeneous resources and provide recommendations to streamline this process to aid scientific research.
Affiliation(s)
- Ginger Tsueng
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Julia L. Mullen
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Manar Alkuzweny
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
- Marco Cano
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Emily Haag
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Jason Lin
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Dylan J. Welzel
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Xinghua Zhou
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Zhongchao Qian
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Alaa Abdel Latif
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Emory Hufbauer
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Mark Zeller
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Kristian G. Andersen
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
  - Scripps Research Translational Institute, La Jolla, CA 92037, USA
- Chunlei Wu
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
  - Scripps Research Translational Institute, La Jolla, CA 92037, USA
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Andrew I. Su
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
  - Scripps Research Translational Institute, La Jolla, CA 92037, USA
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Karthik Gangavarapu
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Laura D. Hughes
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA

15. van Wijnen AJ, Golemis E, Hanukoglu I, Tsui SKW, Hu E, Ul-Hasan S, Joy J, Su AI, Tsueng G. A retrospective evaluation of a decade of Gene Wiki Reviews and their impact. Gene X 2022; 830:146534. [PMID: 35525475 DOI: 10.1016/j.gene.2022.146534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022] Open
Affiliation(s)
- Erica Golemis
  - Fox Chase Cancer Center, Philadelphia, PA, USA; Lewis Katz School of Medicine, Temple University, Philadelphia, PA, USA
- Eric Hu
  - The Scripps Research Institute, La Jolla, CA, USA
- Janet Joy
  - The Scripps Research Institute, La Jolla, CA, USA
- Andrew I Su
  - The Scripps Research Institute, La Jolla, CA, USA

16. Gangavarapu K, Latif AA, Mullen JL, Alkuzweny M, Hufbauer E, Tsueng G, Haag E, Zeller M, Aceves CM, Zaiets K, Cano M, Zhou J, Qian Z, Sattler R, Matteson NL, Levy JI, Lee RTC, Freitas L, Maurer-Stroh S, Suchard MA, Wu C, Su AI, Andersen KG, Hughes LD. Outbreak.info genomic reports: scalable and dynamic surveillance of SARS-CoV-2 variants and mutations. Res Sq 2022:rs.3.rs-1723829. [PMID: 35794893 PMCID: PMC9258294 DOI: 10.21203/rs.3.rs-1723829/v1] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 05/17/2023]
Abstract
The emergence of SARS-CoV-2 variants of concern has prompted the need for near real-time genomic surveillance to inform public health interventions. In response to this need, the global scientific community, through unprecedented effort, has sequenced and shared over 11 million genomes through GISAID, as of May 2022. This extraordinarily high sampling rate provides a unique opportunity to track the evolution of the virus in near real-time. Here, we present outbreak.info, a platform that currently tracks over 40 million combinations of PANGO lineages and individual mutations, across over 7,000 locations, to provide insights for researchers, public health officials, and the general public. We describe the interpretable and opinionated visualizations in the variant and location focussed reports available in our web application, the pipelines that enable the scalable ingestion of heterogeneous sources of SARS-CoV-2 variant data, and the server infrastructure that enables widespread data dissemination via a high performance API that can be accessed using an R package. We present a case study that illustrates how outbreak.info can be used for genomic surveillance and as a hypothesis generation tool to understand the ongoing pandemic at varying geographic and temporal scales. With an emphasis on scalability, interactivity, interpretability, and reusability, outbreak.info provides a template to enable genomic surveillance at a global and localized scale.
Affiliation(s)
- Karthik Gangavarapu
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Alaa Abdel Latif
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Julia L. Mullen
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Manar Alkuzweny
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
- Emory Hufbauer
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Ginger Tsueng
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Emily Haag
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Mark Zeller
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Christine M. Aceves
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Karina Zaiets
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Marco Cano
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Jerry Zhou
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Zhongchao Qian
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Rachel Sattler
  - Skaggs Graduate School of Biological and Chemical Sciences, The Scripps Research Institute, La Jolla, CA 92037, USA
- Nathaniel L Matteson
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Joshua I. Levy
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Raphael TC Lee
  - GISAID Global Data Science Initiative (GISAID), Munich, Germany
  - Bioinformatics Institute & ID Labs, Agency for Science Technology and Research, Singapore
- Lucas Freitas
  - GISAID Global Data Science Initiative (GISAID), Munich, Germany
  - Oswaldo Cruz Foundation (FIOCRUZ), Rio de Janeiro, Brazil
- Sebastian Maurer-Stroh
  - GISAID Global Data Science Initiative (GISAID), Munich, Germany
  - Bioinformatics Institute & ID Labs, Agency for Science Technology and Research, Singapore
  - National Centre for Infectious Diseases, Ministry of Health, Singapore
  - Department of Biological Sciences, National University of Singapore, Singapore
- Marc A. Suchard
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
  - Department of Biomathematics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
  - Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, Los Angeles, CA 90095, USA
- Chunlei Wu
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
  - Scripps Research Translational Institute, La Jolla, CA 92037, USA
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Andrew I. Su
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
  - Scripps Research Translational Institute, La Jolla, CA 92037, USA
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Kristian G. Andersen
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
  - Scripps Research Translational Institute, La Jolla, CA 92037, USA
- Laura D. Hughes
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA

17. Unni DR, Moxon SAT, Bada M, Brush M, Bruskiewich R, Caufield JH, Clemons PA, Dancik V, Dumontier M, Fecho K, Glusman G, Hadlock JJ, Harris NL, Joshi A, Putman T, Qin G, Ramsey SA, Shefchek KA, Solbrig H, Soman K, Thessen AE, Haendel MA, Bizon C, Mungall CJ, Acevedo L, Ahalt SC, Alden J, Alkanaq A, Amin N, Avila R, Balhoff J, Baranzini SE, Baumgartner A, Baumgartner W, Belhu B, Brandes M, Brandon N, Burtt N, Byrd W, Callaghan J, Cano MA, Carrell S, Celebi R, Champion J, Chen Z, Chen M, Chung L, Cohen K, Conlin T, Corkill D, Costanzo M, Cox S, Crouse A, Crowder C, Crumbley ME, Dai C, Dančík V, De Miranda Azevedo R, Deutsch E, Dougherty J, Duby MP, Duvvuri V, Edwards S, Emonet V, Fehrmann N, Flannick J, Foksinska AM, Gardner V, Gatica E, Glen A, Goel P, Gormley J, Greyber A, Haaland P, Hanspers K, He K, He K, Henrickson J, Hinderer EW, Hoatlin M, Hoffman A, Huang S, Huang C, Hubal R, Huellas‐Bruskiewicz K, Huls FB, Hunter L, Hyde G, Issabekova T, Jarrell M, Jenkins L, Johs A, Kang J, Kanwar R, Kebede Y, Kim KJ, Kluge A, Knowles M, Koesterer R, Korn D, Koslicki D, Krishnamurthy A, Kvarfordt L, Lee J, Leigh M, Lin J, Liu Z, Liu S, Ma C, Magis A, Mamidi T, Mandal M, Mantilla M, Massung J, Mauldin D, McClelland J, McMurry J, Mease P, Mendoza L, Mersmann M, Mesbah A, Might M, Morton K, Muller S, Muluka AT, Osborne J, Owen P, Patton M, Peden DB, Peene RC, Persaud B, Pfaff E, Pico A, Pollard E, Price G, Raj S, Reilly J, Riutta A, Roach J, Roper RT, Rosenblatt G, Rubin I, Rucka S, Rudavsky‐Brody N, Sakaguchi R, Santos E, Schaper K, Schmitt CP, Schurman S, Scott E, Seitanakis S, Sharma P, Shmulevich I, Shrestha M, Shrivastava S, Sinha M, Smith B, Southall N, Southern N, Stillwell L, Strasser M"M, Su AI, Ta C, Thessen AE, Tinglin J, Tonstad L, Tran‐Nguyen T, Tropsha A, Vaidya G, Veenhuis L, Viola A, Grotthuss M, Wang M, Wang P, Watkins PB, Weber R, Wei Q, Weng C, Whitlock J, Williams MD, Williams A, Womack F, Wood E, Wu C, Xin JK, Xu H, Xu C, Yakaboski C, Yao Y, Yi H, Yilmaz A, Zheng M, Zhou X, Zhou E, Zhu Q, Zisk T. Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science. Clin Transl Sci 2022; 15:1848-1855. [PMID: 36125173 PMCID: PMC9372416 DOI: 10.1111/cts.13302] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 03/04/2022] [Revised: 04/27/2022] [Accepted: 05/02/2022] [Indexed: 12/12/2022] Open
Abstract
Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph‐based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these “knowledge graphs” (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open‐access model for standardization across biomedical KGs has left the task of reconciling data sources to downstream consumers. Biolink Model is an open‐source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object‐oriented classification and graph‐oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates) representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.
Affiliation(s)
- Deepak R. Unni
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Sierra A. T. Moxon
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Michael Bada
  - Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Matthew Brush
  - Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- J. Harry Caufield
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Paul A. Clemons
  - Chemical Biology and Therapeutics Science Program, Broad Institute, Cambridge, Massachusetts, USA
- Vlado Dancik
  - Chemical Biology and Therapeutics Science Program, Broad Institute, Cambridge, Massachusetts, USA
- Michel Dumontier
  - Institute of Data Science, Maastricht University, Maastricht, The Netherlands
- Karamarie Fecho
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Nomi L. Harris
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Arpita Joshi
  - Institute for Systems Biology, Seattle, Washington, USA
- Tim Putman
  - Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Guangrong Qin
  - Institute for Systems Biology, Seattle, Washington, USA
- Stephen A. Ramsey
  - Department of Biomedical Sciences, Oregon State University, Corvallis, Oregon, USA
- Kent A. Shefchek
  - Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Karthik Soman
  - Department of Neurology, University of California San Francisco, San Francisco, California, USA
- Anne E. Thessen
  - Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Melissa A. Haendel
  - Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Chris Bizon
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Christopher J. Mungall
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, USA
|
18
|
Fecho K, Thessen AE, Baranzini SE, Bizon C, Hadlock JJ, Huang S, Roper RT, Southall N, Ta C, Watkins PB, Williams MD, Xu H, Byrd W, Dančík V, Duby MP, Dumontier M, Glusman G, Harris NL, Hinderer EW, Hyde G, Johs A, Su AI, Qin G, Zhu Q. Progress toward a universal biomedical data translator. Clin Transl Sci 2022; 15:1838-1847. [PMID: 35611543 PMCID: PMC9372428 DOI: 10.1111/cts.13301] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 03/04/2022] [Revised: 04/27/2022] [Accepted: 05/02/2022] [Indexed: 11/28/2022]
Abstract
Clinical, biomedical, and translational science has reached an inflection point in the breadth and diversity of available data and the potential impact of such data to improve human health and well-being. However, the data are often siloed, disorganized, and not broadly accessible due to discipline-specific differences in terminology and representation. To address these challenges, the Biomedical Data Translator Consortium has developed and tested a pilot knowledge graph-based "Translator" system capable of integrating existing biomedical data sets and "translating" those data into insights intended to augment human reasoning and accelerate translational science. Having demonstrated feasibility of the Translator system, the Translator program has since moved into development, and the Translator Consortium has made significant progress in the research, design, and implementation of an operational system. Herein, we describe the current system's architecture, performance, and quality of results. We apply Translator to several real-world use cases developed in collaboration with subject-matter experts. Finally, we discuss the scientific and technical features of Translator and compare those features to other state-of-the-art, biomedical graph-based question-answering systems.
Grants
- OT3TR002019 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- ZIA TR000276-05 National Center for Advancing Translational Sciences, Intramural Research Program
- OT2TR003449 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- U01 DK065201 NIDDK NIH HHS
- OT2TR002515 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003443 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR002584 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003434 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2 TR003449 NCATS NIH HHS
- OT2TR003433 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003435 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR002517 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT3TR002027 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003422 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003441 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT3TR002020 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003448 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003428 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003445 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- I75N95021P00636 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR002520 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003427 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003436 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- ZIA TR000276 Intramural NIH HHS
- OT2TR002514 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT3TR002025 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2 TR003428 NCATS NIH HHS
- 5U01DK065201 NIDDK NIH HHS
- OT2TR003437 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003450 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT3TR002026 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003430 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- National Institute of Diabetes and Digestive and Kidney Diseases
- National Center for Advancing Translational Sciences, Biomedical Data Translator Program
Affiliation(s)
- Karamarie Fecho
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Anne E. Thessen
  - Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Sergio E. Baranzini
  - Weill Institute for Neurosciences, Department of Neurology, University of California at San Francisco, San Francisco, California, USA
- Chris Bizon
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Sui Huang
  - Institute for Systems Biology, Seattle, Washington, USA
- Noel Southall
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Casey Ta
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Paul B. Watkins
  - Division of Pharmacotherapy and Experimental Therapeutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Mark D. Williams
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Hao Xu
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- William Byrd
  - Hugh Kaul Precision Medicine Institute, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Vlado Dančík
  - Chemical Biology and Therapeutics Science Program, Broad Institute, Cambridge, Massachusetts, USA
- Marc P. Duby
  - Medical and Population Genetics Program, Broad Institute, Cambridge, Massachusetts, USA
- Michel Dumontier
  - Institute of Data Science, Maastricht University, Maastricht, The Netherlands
- Nomi L. Harris
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Eugene W. Hinderer
  - Tufts Clinical and Translational Science Institute, Tufts Medical Center, Boston, Massachusetts, USA
- Greg Hyde
  - Thayer School of Engineering, Dartmouth College, Hanover, New Hampshire, USA
- Adam Johs
  - Department of Information Science, College of Computing and Informatics, Drexel University, Philadelphia, Pennsylvania, USA
- Andrew I. Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California, USA
- Qian Zhu
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
|
19
|
Mayers M, Tu R, Steinecke D, Li TS, Queralt-Rosinach N, Su AI. Design and application of a knowledge network for automatic prioritization of drug mechanisms. Bioinformatics 2022; 38:2880-2891. [PMID: 35561182 PMCID: PMC9113361 DOI: 10.1093/bioinformatics/btac205] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/14/2021] [Revised: 02/17/2022] [Accepted: 04/04/2022] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Drug repositioning is an attractive alternative to de novo drug discovery due to reduced time and costs to bring drugs to market. Computational repositioning methods, particularly non-black-box methods that can account for and predict a drug's mechanism, may provide great benefit for directing future development. By tuning both data and algorithm to utilize relationships important to drug mechanisms, a computational repositioning algorithm can be trained to both predict and explain mechanistically novel indications. RESULTS: In this work, we examined the 123 curated drug mechanism paths found in the drug mechanism database (DrugMechDB) and, after identifying the most important relationships, we integrated 18 data sources to produce a heterogeneous knowledge graph, MechRepoNet, capable of capturing the information in these paths. We applied the Rephetio repurposing algorithm to MechRepoNet using only a subset of relationships known to be mechanistic in nature and found adequate predictive ability on an evaluation set, with an AUROC value of 0.83. The resulting repurposing model allowed us to prioritize paths in our knowledge graph to produce a predicted treatment mechanism. We found that DrugMechDB paths, when present in the network, were rated highly among predicted mechanisms. We then demonstrated MechRepoNet's ability to use mechanistic insight to identify a drug's mechanistic target, with a mean reciprocal rank of 0.525 on a test set of known drug-target interactions. Finally, we walked through repurposing examples of the anti-cancer drug imatinib for use in the treatment of asthma, and metolazone for use in the treatment of osteoporosis, to demonstrate this method's utility in providing mechanistic insight into the repurposing predictions it provides. AVAILABILITY AND IMPLEMENTATION: The Python code to reproduce the entirety of this analysis is available at: https://github.com/SuLab/MechRepoNet (archived at https://doi.org/10.5281/zenodo.6456335).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
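For readers unfamiliar with the mean reciprocal rank reported above, the following short Python sketch shows how the metric is conventionally computed. It is not taken from the MechRepoNet code; the function names and the toy drug-target rankings (`T1` to `T9`) are made up for illustration, and each query is assumed to have exactly one known target.

```python
from typing import Hashable, Sequence, Tuple


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Hashable) -> float:
    """Return 1/position of the relevant item (1-based), or 0.0 if it is absent."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(queries: Sequence[Tuple[Sequence[Hashable], Hashable]]) -> float:
    """Average the reciprocal rank over (ranked candidates, known answer) pairs."""
    if not queries:
        raise ValueError("at least one query is required")
    return sum(reciprocal_rank(ranked, answer) for ranked, answer in queries) / len(queries)


# Toy data: for each drug, a ranked list of predicted targets and the known target.
toy_queries = [
    (["T1", "T2", "T3"], "T1"),        # known target ranked 1st -> 1.0
    (["T4", "T5"], "T5"),              # ranked 2nd -> 0.5
    (["T6", "T7", "T8", "T9"], "T9"),  # ranked 4th -> 0.25
]

mrr = mean_reciprocal_rank(toy_queries)
print(round(mrr, 3))  # (1.0 + 0.5 + 0.25) / 3
```

Under this definition, a value near 0.5 means the known target tends to appear around the top two positions of the ranked list, although the average can mix many first-place hits with some much lower ranks.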
Affiliation(s)
- Dylan Steinecke
  - Department of Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Tong Shu Li
  - Department of Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Núria Queralt-Rosinach
  - Department of Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
|
20
|
Thuy-Boun PS, Wang AY, Crissien-Martinez A, Xu JH, Chatterjee S, Stupp GS, Su AI, Coyle WJ, Wolan DW. Quantitative metaproteomics and activity-based protein profiling of patient fecal microbiome identifies host and microbial serine-type endopeptidase activity associated with ulcerative colitis. Mol Cell Proteomics 2022; 21:100197. [PMID: 35033677 PMCID: PMC8941213 DOI: 10.1016/j.mcpro.2022.100197] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/04/2021] [Revised: 01/10/2022] [Accepted: 01/11/2022] [Indexed: 12/12/2022]
Abstract
The gut microbiota plays an important yet incompletely understood role in the induction and propagation of ulcerative colitis (UC). Organism-level efforts to identify UC-associated microbes have revealed the importance of community structure, but less is known about the molecular effectors of disease. We performed 16S rRNA gene sequencing in parallel with label-free data-dependent LC-MS/MS proteomics to characterize the stool microbiomes of healthy (n = 8) and UC (n = 10) patients. Comparisons of taxonomic composition between techniques revealed major differences in community structure partially attributable to the additional detection of host, fungal, viral, and food peptides by metaproteomics. Differential expression analysis of metaproteomic data identified 176 significantly enriched protein groups between healthy and UC patients. Gene ontology analysis revealed several enriched functions with serine-type endopeptidase activity overrepresented in UC patients. Using a biotinylated fluorophosphonate probe and streptavidin-based enrichment, we show that serine endopeptidases are active in patient fecal samples and that additional putative serine hydrolases are detectable by this approach compared with unenriched profiling. Finally, as metaproteomic databases expand, they are expected to asymptotically approach completeness. Using ComPIL and de novo peptide sequencing, we estimate the size of the probable peptide space unidentified (“dark peptidome”) by our large database approach to establish a rough benchmark for database sufficiency. Despite high variability inherent in patient samples, our analysis yielded a catalog of differentially enriched proteins between healthy and UC fecal proteomes. This catalog provides a clinically relevant jumping-off point for further molecular-level studies aimed at identifying the microbial underpinnings of UC.
Highlights
- Identified 176 significantly altered protein groups between healthy and UC patients.
- Serine-type endopeptidase activity is overrepresented in UC patients.
- Fluorophosphonate ABPP shows that endopeptidases are active in fecal samples.
- ABPP enrichment helps identify additional putative serine hydrolases in samples.
- De novo sequencing used to estimate number of MS2 spectra unidentified by ComPIL.
Affiliation(s)
- Peter S Thuy-Boun
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037
- Ana Y Wang
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037
- Janice H Xu
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037
- Sandip Chatterjee
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037
- Gregory S Stupp
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037
- Walter J Coyle
  - Scripps Clinic Gastroenterology Division, La Jolla, CA 92037
- Dennis W Wolan
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037.
|
21
|
Lelong S, Zhou X, Afrasiabi C, Qian Z, Cano MA, Tsueng G, Xin J, Mullen J, Yao Y, Avila R, Taylor G, Su AI, Wu C. BioThings SDK: a toolkit for building high-performance data APIs in biomedical research. Bioinformatics 2022; 38:2077-2079. [PMID: 35020801 PMCID: PMC8963279 DOI: 10.1093/bioinformatics/btac017] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/20/2021] [Revised: 12/10/2021] [Accepted: 01/08/2022] [Indexed: 02/04/2023]
Abstract
SUMMARY: To meet the increased need of making biomedical resources more accessible and reusable, Web Application Programming Interfaces (APIs) or web services have become a common way to disseminate knowledge sources. The BioThings APIs are a collection of high-performance, scalable, annotation-as-a-service APIs that automate the integration of biological annotations from disparate data sources. This collection of APIs currently includes MyGene.info, MyVariant.info and MyChem.info for integrating annotations on genes, variants and chemical compounds, respectively. These APIs are used by both individual researchers and application developers to simplify the process of annotation retrieval and identifier mapping. Here, we describe the BioThings Software Development Kit (SDK), a generalizable and reusable toolkit for integrating data from multiple disparate data sources and creating high-performance APIs. This toolkit allows users to easily create their own BioThings APIs for any data type of interest to them, as well as keep APIs up-to-date with their underlying data sources. AVAILABILITY AND IMPLEMENTATION: The BioThings SDK is built in Python and released via PyPI (https://pypi.org/project/biothings/). Its source code is hosted at its GitHub repository (https://github.com/biothings/biothings.api). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sebastien Lelong
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Xinghua Zhou
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Cyrus Afrasiabi
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Zhongchao Qian
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Marco Alvarado Cano
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Ginger Tsueng
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Jiwen Xin
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Julia Mullen
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Yao Yao
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Ricardo Avila
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Greg Taylor
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Chunlei Wu
  - To whom correspondence should be addressed.
|
22
|
Cano M, Tsueng G, Zhou X, Hughes LD, Mullen JL, Xin J, Su AI, Wu C. Schema Playground: A tool for authoring, extending, and using metadata schemas to improve FAIRness of biomedical data. [PMID: 35677074 PMCID: PMC9176648 DOI: 10.1101/2021.09.02.458726] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/25/2022]
Abstract
Background: Biomedical researchers are strongly encouraged to make their research outputs more Findable, Accessible, Interoperable, and Reusable (FAIR). While many biomedical research outputs are more readily accessible through open data efforts, finding relevant outputs remains a significant challenge. Schema.org is a metadata vocabulary standardization project that enables web content creators to make their content more FAIR. Leveraging schema.org could benefit biomedical research resource providers, but it can be challenging to apply schema.org standards to biomedical research outputs. We created an online browser-based tool that empowers researchers and repository developers to utilize schema.org or other biomedical schema projects. Results: Our browser-based tool includes features which can help address many of the barriers towards schema.org-compliance such as: The ability to easily browse for relevant schema.org classes, the ability to extend and customize a class to be more suitable for biomedical research outputs, the ability to create data validation to ensure adherence of a research output to a customized class, and the ability to register a custom class to our schema registry enabling others to search and re-use it. We demonstrate the use of our tool with the creation of the Outbreak.info schema—a large multi-class schema for harmonizing various COVID-19 related resources. Conclusions: We have created a browser-based tool to empower biomedical research resource providers to leverage schema.org classes to make their research outputs more FAIR.
|
23
|
Lee KI, Gamini R, Olmer M, Ikuta Y, Hasei J, Baek J, Alvarez-Garcia O, Grogan SP, D'Lima DD, Asahara H, Su AI, Lotz MK. Mohawk is a transcription factor that promotes meniscus cell phenotype and tissue repair and reduces osteoarthritis severity. Sci Transl Med 2021; 12:12/567/eaan7967. [PMID: 33115953 DOI: 10.1126/scitranslmed.aan7967] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/14/2019] [Revised: 02/06/2020] [Accepted: 09/21/2020] [Indexed: 12/13/2022]
Abstract
Meniscus tears are common knee injuries and a major osteoarthritis (OA) risk factor. Knowledge gaps that limit the development of therapies for meniscus injury and degeneration concern transcription factors that control the meniscus cell phenotype. Analysis of RNA sequencing data from 37 human tissues in the Genotype-Tissue Expression database and RNA sequencing data from meniscus and articular cartilage showed that transcription factor Mohawk (MKX) is highly enriched in meniscus. In human meniscus cells, MKX regulates the expression of meniscus marker genes, OA-related genes, and other transcription factors, including Scleraxis (SCX), SRY Box 5 (SOX5), and Runt domain-related transcription factor 2 (RUNX2). In mesenchymal stem cells (MSCs), the combination of adenoviral MKX (Ad-MKX) and transforming growth factor-β3 (TGF-β3) induced a meniscus cell phenotype. When Ad-MKX-transduced MSCs were seeded on TGF-β3-conjugated decellularized meniscus scaffold (DMS) and inserted into experimental tears in meniscus explants, they increased glycosaminoglycan content, extracellular matrix interconnectivity, cell infiltration into the DMS, and improved biomechanical properties. Ad-MKX injection into mouse knee joints with experimental OA induced by surgical destabilization of the meniscus suppressed meniscus and cartilage damage, reducing OA severity. Ad-MKX injection into human OA meniscus tissue explants corrected pathogenic gene expression. These results identify MKX as a previously unidentified key transcription factor that regulates the meniscus cell phenotype. The combination of Ad-MKX with TGF-β3 is effective for differentiation of MSCs to a meniscus cell phenotype and useful for meniscus repair. MKX is a promising therapeutic target for meniscus tissue engineering, repair, and prevention of OA.
Affiliation(s)
- Kwang Il Lee
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA 92037, USA
- Ramya Gamini
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA 92037, USA
- Merissa Olmer
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA 92037, USA
- Yasunari Ikuta
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA 92037, USA
- Joe Hasei
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA 92037, USA
- Jihye Baek
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA 92037, USA; Shiley Center for Orthopaedic Research and Education at Scripps Clinic, La Jolla, CA 92037, USA
- Shawn P Grogan
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA 92037, USA; Shiley Center for Orthopaedic Research and Education at Scripps Clinic, La Jolla, CA 92037, USA
- Darryl D D'Lima
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA 92037, USA; Shiley Center for Orthopaedic Research and Education at Scripps Clinic, La Jolla, CA 92037, USA
- Hiroshi Asahara
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA 92037, USA
- Andrew I Su
  - Department of Integrative, Structural and Computational Biology, Scripps Research, La Jolla, CA 92037, USA
- Martin K Lotz
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA 92037, USA.
|
24
|
Queralt-Rosinach N, Stupp GS, Li TS, Mayers M, Hoatlin ME, Might M, Good BM, Su AI. Structured reviews for data and knowledge-driven research. Database (Oxford) 2021; 2020:5818923. [PMID: 32283553 PMCID: PMC7153956 DOI: 10.1093/database/baaa015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/12/2019] [Revised: 01/21/2020] [Accepted: 02/07/2020] [Indexed: 12/25/2022]
Abstract
Hypothesis generation is a critical step in research and a cornerstone in the rare disease field. Research is most efficient when those hypotheses are based on the entirety of knowledge known to date. Systematic review articles are commonly used in biomedicine to summarize existing knowledge and contextualize experimental data. But the information contained within review articles is typically only expressed as free-text, which is difficult to use computationally. Researchers struggle to navigate, collect and remix prior knowledge as it is scattered in several silos without seamless integration and access. This lack of a structured information framework hinders research by both experimental and computational scientists. To better organize knowledge and data, we built a structured review article that is specifically focused on NGLY1 Deficiency, an ultra-rare genetic disease first reported in 2012. We represented this structured review as a knowledge graph and then stored this knowledge graph in a Neo4j database to simplify dissemination, querying and visualization of the network. Relative to free-text, this structured review better promotes the principles of findability, accessibility, interoperability and reusability (FAIR). In collaboration with domain experts in NGLY1 Deficiency, we demonstrate how this resource can improve the efficiency and comprehensiveness of hypothesis generation. We also developed a read–write interface that allows domain experts to contribute FAIR structured knowledge to this community resource. In contrast to traditional free-text review articles, this structured review exists as a living knowledge graph that is curated by humans and accessible to computational analyses. Finally, we have generalized this workflow into modular and repurposable components that can be applied to other domain areas. This NGLY1 Deficiency-focused network is publicly available at http://ngly1graph.org/. 
Availability and implementation Database URL: http://ngly1graph.org/. Network data files are at: https://github.com/SuLab/ngly1-graph and source code at: https://github.com/SuLab/bioknowledge-reviewer. Contact asu@scripps.edu
Affiliation(s)
- Núria Queralt-Rosinach
  - Department of Integrative Structural and Computational Biology, Scripps Research, 10550 N Torrey Pines Rd. La Jolla, CA 92037, USA
- Gregory S Stupp
  - Department of Integrative Structural and Computational Biology, Scripps Research, 10550 N Torrey Pines Rd. La Jolla, CA 92037, USA
- Tong Shu Li
  - Department of Integrative Structural and Computational Biology, Scripps Research, 10550 N Torrey Pines Rd. La Jolla, CA 92037, USA
- Michael Mayers
  - Department of Integrative Structural and Computational Biology, Scripps Research, 10550 N Torrey Pines Rd. La Jolla, CA 92037, USA
- Maureen E Hoatlin
  - Department of Biochemistry and Molecular Biology, Oregon Health and Science University, 3181 SW Sam Jackson Parkway, Portland, OR 97239, USA
- Matthew Might
  - Department of Medicine, Hugh Kaul Precision Medicine Institute, University of Alabama at Birmingham, 510 20th St S, Birmingham, AL 35210, USA
- Benjamin M Good
  - Department of Integrative Structural and Computational Biology, Scripps Research, 10550 N Torrey Pines Rd. La Jolla, CA 92037, USA
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, Scripps Research, 10550 N Torrey Pines Rd. La Jolla, CA 92037, USA
|
25
|
Waagmeester A, Willighagen EL, Su AI, Kutmon M, Gayo JEL, Fernández-Álvarez D, Groom Q, Schaap PJ, Verhagen LM, Koehorst JJ. A protocol for adding knowledge to Wikidata: aligning resources on human coronaviruses. BMC Biol 2021; 19:12. [PMID: 33482803 PMCID: PMC7820539 DOI: 10.1186/s12915-020-00940-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/12/2020] [Accepted: 12/13/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Pandemics, even more than other medical problems, require swift integration of knowledge. When a pandemic is caused by a new virus, understanding the underlying biology may help in finding solutions. In a setting where there are a large number of loosely related projects and initiatives, we need common ground, also known as a "commons." Wikidata, a public knowledge graph aligned with Wikipedia, is such a commons and uses unique identifiers to link knowledge in other knowledge bases. However, Wikidata may not always have the right schema for the urgent questions. In this paper, we address this problem by showing how a data schema required for the integration can be modeled with entity schemas represented by Shape Expressions. RESULTS As a telling example, we describe the process of aligning resources on the genomes and proteomes of the SARS-CoV-2 virus and related viruses as well as how Shape Expressions can be defined for Wikidata to model the knowledge, helping others studying the SARS-CoV-2 pandemic. How this model can be used to make data between various resources interoperable is demonstrated by integrating data from NCBI (National Center for Biotechnology Information) Taxonomy, NCBI Genes, UniProt, and WikiPathways. Based on that model, a set of automated applications or bots were written for regular updates of these sources in Wikidata and added to a platform for automatically running these updates. CONCLUSIONS Although this workflow is developed and applied in the context of the COVID-19 pandemic, to demonstrate its broader applicability it was also applied to other human coronaviruses (MERS, SARS, human coronavirus NL63, human coronavirus 229E, human coronavirus HKU1, human coronavirus OC43).
Affiliation(s)
- Egon L Willighagen
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Martina Kutmon
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
  - Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands
- Peter J Schaap
  - Department of Agrotechnology and Food Sciences, Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, The Netherlands
- Jasper J Koehorst
  - Department of Agrotechnology and Food Sciences, Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, The Netherlands
|
26
|
Wang J, Vallee I, Dutta A, Wang Y, Mo Z, Liu Z, Cui H, Su AI, Yang XL. Multi-Omics Database Analysis of Aminoacyl-tRNA Synthetases in Cancer. Genes (Basel) 2020; 11:1384. [PMID: 33266490 PMCID: PMC7700366 DOI: 10.3390/genes11111384] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 10/03/2020] [Revised: 10/24/2020] [Accepted: 11/20/2020] [Indexed: 12/23/2022] Open
Abstract
Aminoacyl-tRNA synthetases (aaRSs) are key enzymes in the mRNA translation machinery, yet they possess numerous non-canonical functions developed during the evolution of complex organisms. The aaRSs and aaRS-interacting multi-functional proteins (AIMPs) are continually being implicated in tumorigenesis, but these connections are often limited in scope, focusing on specific aaRSs in distinct cancer subtypes. Here, we analyze publicly available genomic and transcriptomic data on human cytoplasmic and mitochondrial aaRSs across many cancer types. As high-throughput technologies have improved exponentially, large-scale projects have systematically quantified genetic alteration and expression from thousands of cancer patient samples. One such project is The Cancer Genome Atlas (TCGA), which processed over 20,000 primary cancer and matched normal samples from 33 cancer types. The wealth of knowledge provided by this undertaking has streamlined the identification of cancer drivers and suppressors. We examined aaRS expression data produced by the TCGA project and combined this with patient survival data to recognize trends in aaRSs' impact on cancer both molecularly and prognostically. We further compared these trends to an established tumor suppressor and a proto-oncogene. We observed apparent upregulation of many tRNA synthetase genes with aggressive cancer types, yet, at the individual gene level, some aaRSs resemble a tumor suppressor while others show similarities to an oncogene. This study provides an unbiased, overarching perspective on the relationship of aaRSs with cancers and identifies certain aaRS family members as promising therapeutic targets or potential leads for developing biological therapy for cancer.
Affiliation(s)
- Justin Wang
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Ingrid Vallee
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Aditi Dutta
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Yu Wang
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Zhongying Mo
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Ze Liu
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Haissi Cui
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Andrew I. Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Xiang-Lei Yang
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
  - Correspondence: Tel.: +1-858-784-8976; Fax: +1-858-784-7250
|
27
|
Riva L, Yuan S, Yin X, Martin-Sancho L, Matsunaga N, Pache L, Burgstaller-Muehlbacher S, De Jesus PD, Teriete P, Hull MV, Chang MW, Chan JFW, Cao J, Poon VKM, Herbert KM, Cheng K, Nguyen TTH, Rubanov A, Pu Y, Nguyen C, Choi A, Rathnasinghe R, Schotsaert M, Miorin L, Dejosez M, Zwaka TP, Sit KY, Martinez-Sobrido L, Liu WC, White KM, Chapman ME, Lendy EK, Glynne RJ, Albrecht R, Ruppin E, Mesecar AD, Johnson JR, Benner C, Sun R, Schultz PG, Su AI, García-Sastre A, Chatterjee AK, Yuen KY, Chanda SK. Discovery of SARS-CoV-2 antiviral drugs through large-scale compound repurposing. Nature 2020; 586:113-119. [PMID: 32707573 PMCID: PMC7603405 DOI: 10.1038/s41586-020-2577-1] [Citation(s) in RCA: 559] [Impact Index Per Article: 139.8] [Received: 04/20/2020] [Accepted: 07/17/2020] [Indexed: 02/08/2023]
Abstract
The emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in 2019 has triggered an ongoing global pandemic of the severe pneumonia-like disease coronavirus disease 2019 (COVID-19) [1]. The development of a vaccine is likely to take at least 12-18 months, and the typical timeline for approval of a new antiviral therapeutic agent can exceed 10 years. Thus, repurposing of known drugs could substantially accelerate the deployment of new therapies for COVID-19. Here we profiled a library of drugs encompassing approximately 12,000 clinical-stage or Food and Drug Administration (FDA)-approved small molecules to identify candidate therapeutic drugs for COVID-19. We report the identification of 100 molecules that inhibit viral replication of SARS-CoV-2, including 21 drugs that exhibit dose-response relationships. Of these, thirteen were found to harbour effective concentrations commensurate with probable achievable therapeutic doses in patients, including the PIKfyve kinase inhibitor apilimod [2-4] and the cysteine protease inhibitors MDL-28170, Z LVG CHN2, VBY-825 and ONO 5334. Notably, MDL-28170, ONO 5334 and apilimod were found to antagonize viral replication in human pneumocyte-like cells derived from induced pluripotent stem cells, and apilimod also demonstrated antiviral efficacy in a primary human lung explant model. Since most of the molecules identified in this study have already advanced into the clinic, their known pharmacological and human safety profiles will enable accelerated preclinical and clinical evaluation of these drugs for the treatment of COVID-19.
Affiliation(s)
- Laura Riva
  - Immunity and Pathogenesis Program, Infectious and Inflammatory Disease Center, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Shuofeng Yuan
  - State Key Laboratory of Emerging Infectious Diseases, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
  - Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
  - Carol Yu Centre for Infection, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
- Xin Yin
  - Immunity and Pathogenesis Program, Infectious and Inflammatory Disease Center, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Laura Martin-Sancho
  - Immunity and Pathogenesis Program, Infectious and Inflammatory Disease Center, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Naoko Matsunaga
  - Immunity and Pathogenesis Program, Infectious and Inflammatory Disease Center, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Lars Pache
  - Immunity and Pathogenesis Program, Infectious and Inflammatory Disease Center, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Sebastian Burgstaller-Muehlbacher
  - Center for Integrative Bioinformatics Vienna, Max Perutz Laboratories, University of Vienna and Medical University of Vienna, Vienna, Austria
- Paul D De Jesus
  - Immunity and Pathogenesis Program, Infectious and Inflammatory Disease Center, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Peter Teriete
  - Immunity and Pathogenesis Program, Infectious and Inflammatory Disease Center, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Max W Chang
  - Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Jasper Fuk-Woo Chan
  - State Key Laboratory of Emerging Infectious Diseases, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
  - Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
  - Carol Yu Centre for Infection, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
- Jianli Cao
  - State Key Laboratory of Emerging Infectious Diseases, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
  - Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
  - Carol Yu Centre for Infection, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
- Vincent Kwok-Man Poon
  - State Key Laboratory of Emerging Infectious Diseases, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
  - Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
  - Carol Yu Centre for Infection, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
- Kristina M Herbert
  - Immunity and Pathogenesis Program, Infectious and Inflammatory Disease Center, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Kuoyuan Cheng
  - Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institute of Health, Bethesda, MD, USA
  - Biological Sciences Graduate Program, University of Maryland, College Park, MD, USA
- Andrey Rubanov
  - Immunity and Pathogenesis Program, Infectious and Inflammatory Disease Center, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Yuan Pu
  - Immunity and Pathogenesis Program, Infectious and Inflammatory Disease Center, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Courtney Nguyen
  - Immunity and Pathogenesis Program, Infectious and Inflammatory Disease Center, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Angela Choi
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Raveen Rathnasinghe
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Michael Schotsaert
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Lisa Miorin
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Marion Dejosez
  - Huffington Foundation Center for Cell-based Research in Parkinson's Disease, Department for Cell, Regenerative and Developmental Biology, Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Thomas P Zwaka
  - Huffington Foundation Center for Cell-based Research in Parkinson's Disease, Department for Cell, Regenerative and Developmental Biology, Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ko-Yung Sit
  - Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
- Wen-Chun Liu
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Kris M White
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mackenzie E Chapman
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Emma K Lendy
  - Department of Biochemistry, Purdue University, West Lafayette, IN, USA
- Randy Albrecht
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Eytan Ruppin
  - Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institute of Health, Bethesda, MD, USA
- Andrew D Mesecar
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
  - Department of Biochemistry, Purdue University, West Lafayette, IN, USA
- Jeffrey R Johnson
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Christopher Benner
  - Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Ren Sun
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA, USA
- Andrew I Su
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Adolfo García-Sastre
  - Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Medicine, Division of Infectious Diseases, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - The Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Kwok-Yung Yuen
  - State Key Laboratory of Emerging Infectious Diseases, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
  - Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
  - Carol Yu Centre for Infection, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, Hong Kong, China
- Sumit K Chanda
  - Immunity and Pathogenesis Program, Infectious and Inflammatory Disease Center, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
|
28
|
Waagmeester A, Stupp G, Burgstaller-Muehlbacher S, Good BM, Griffith M, Griffith OL, Hanspers K, Hermjakob H, Hudson TS, Hybiske K, Keating SM, Manske M, Mayers M, Mietchen D, Mitraka E, Pico AR, Putman T, Riutta A, Queralt-Rosinach N, Schriml LM, Shafee T, Slenter D, Stephan R, Thornton K, Tsueng G, Tu R, Ul-Hasan S, Willighagen E, Wu C, Su AI. Wikidata as a knowledge graph for the life sciences. eLife 2020; 9:e52614. [PMID: 32180547 PMCID: PMC7077981 DOI: 10.7554/elife.52614] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 10/09/2019] [Accepted: 02/28/2020] [Indexed: 12/22/2022] Open
Abstract
Wikidata is a community-maintained knowledge base that has been assembled from repositories in the fields of genomics, proteomics, genetic variants, pathways, chemical compounds, and diseases, and that adheres to the FAIR principles of findability, accessibility, interoperability and reusability. Here we describe the breadth and depth of the biomedical knowledge contained within Wikidata, and discuss the open-source tools we have built to add information to Wikidata and to synchronize it with source databases. We also demonstrate several use cases for Wikidata, including the crowdsourced curation of biomedical ontologies, phenotype-based diagnosis of disease, and drug repurposing.
Affiliation(s)
- Gregory Stupp
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Sebastian Burgstaller-Muehlbacher
  - Center for Integrative Bioinformatics Vienna, Max Perutz Laboratories, University of Vienna and Medical University of Vienna, Vienna, Austria
- Benjamin M Good
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Malachi Griffith
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, United States
- Obi L Griffith
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, United States
- Kristina Hanspers
  - Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, United States
- Toby S Hudson
  - School of Chemistry, The University of Sydney, Sydney, Australia
- Kevin Hybiske
  - Division of Allergy and Infectious Diseases, Department of Medicine, University of Washington, Seattle, United States
- Sarah M Keating
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom
- Magnus Manske
  - Wellcome Trust Sanger Institute, Cambridge, United Kingdom
- Michael Mayers
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Daniel Mietchen
  - School of Data Science, University of Virginia, Charlottesville, United States
- Elvira Mitraka
  - University of Maryland School of Medicine, Baltimore, United States
- Alexander R Pico
  - Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, United States
- Timothy Putman
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Anders Riutta
  - Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, United States
- Nuria Queralt-Rosinach
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Lynn M Schriml
  - University of Maryland School of Medicine, Baltimore, United States
- Thomas Shafee
  - Department of Animal Plant and Soil Sciences, La Trobe University, Melbourne, Australia
- Denise Slenter
  - Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
- Ginger Tsueng
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Roger Tu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Sabah Ul-Hasan
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Egon Willighagen
  - Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
- Chunlei Wu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
|
29
|
Summers KM, Bush SJ, Wu C, Su AI, Muriuki C, Clark EL, Finlayson HA, Eory L, Waddell LA, Talbot R, Archibald AL, Hume DA. Functional Annotation of the Transcriptome of the Pig, Sus scrofa, Based Upon Network Analysis of an RNAseq Transcriptional Atlas. Front Genet 2020; 10:1355. [PMID: 32117413 PMCID: PMC7034361 DOI: 10.3389/fgene.2019.01355] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 08/27/2019] [Accepted: 12/11/2019] [Indexed: 12/15/2022] Open
Abstract
The domestic pig (Sus scrofa) is both an economically important livestock species and a model for biomedical research. Two highly contiguous pig reference genomes have recently been released. To support functional annotation of the pig genomes and comparative analysis with large human transcriptomic data sets, we aimed to create a pig gene expression atlas. To achieve this objective, we extended a previous approach developed for the chicken. We downloaded RNAseq data sets from public repositories, down-sampled to a common depth, and quantified expression against a reference transcriptome using the mRNA quantitation tool, Kallisto. We then used the network analysis tool Graphia to identify clusters of transcripts that were coexpressed across the merged data set. Consistent with the principle of guilt-by-association, we identified coexpression clusters that were highly tissue or cell-type restricted and contained transcription factors that have previously been implicated in lineage determination. Other clusters were enriched for transcripts associated with biological processes, such as the cell cycle and oxidative phosphorylation. The same approach was used to identify coexpression clusters within RNAseq data from multiple individual liver and brain samples, highlighting cell type, process, and region-specific gene expression. Evidence of conserved expression can add confidence to assignment of orthology between pig and human genes. Many transcripts currently identified as novel genes with ENSSSCG or LOC IDs were found to be coexpressed with annotated neighbouring transcripts in the same orientation, indicating they may be products of the same transcriptional unit. The meta-analytic approach to utilising public RNAseq data is extendable to include new data sets and new species and provides a framework to support the Functional Annotation of Animal Genomes (FAANG) initiative.
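The coexpression step described in this abstract can be illustrated with a small sketch. It assumes that coexpression means a Pearson correlation between expression profiles at or above a chosen threshold, and it groups transcripts by connected components of the resulting graph. This is a simplified stand-in, not the clustering performed by Graphia in the study; the threshold, the transcript names and the expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no coexpression signal
    return cov / sqrt(vx * vy)


def coexpression_clusters(profiles, threshold=0.9):
    """Group transcripts whose profiles correlate at or above `threshold`.

    `profiles` maps a transcript ID to its expression values across samples.
    Returns a list of sets, one per connected component, largest first.
    """
    # Union-find over transcript IDs: correlated pairs are merged into one set.
    parent = {t: t for t in profiles}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving keeps trees shallow
            t = parent[t]
        return t

    for a, b in combinations(profiles, 2):
        if pearson(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for t in profiles:
        clusters.setdefault(find(t), set()).add(t)
    return sorted(clusters.values(), key=len, reverse=True)


# Hypothetical toy data: four transcripts measured in five samples.
toy = {
    "liver_1": [9.0, 8.5, 0.2, 0.1, 0.3],
    "liver_2": [7.0, 7.2, 0.1, 0.2, 0.2],
    "brain_1": [0.1, 0.3, 6.0, 6.5, 5.8],
    "brain_2": [0.2, 0.1, 8.0, 8.4, 7.9],
}
clusters = coexpression_clusters(toy, threshold=0.9)
```

On this toy input the two liver-like profiles fall into one cluster and the two brain-like profiles into another, which mirrors the tissue-restricted clusters the abstract reports. Real atlases involve many thousands of transcripts, where an all-pairs loop in pure Python would be far too slow.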
Affiliation(s)
- Kim M. Summers
  - Mater Research Institute-University of Queensland, Translational Research Institute, Woolloongabba, QLD, Australia
- Stephen J. Bush
  - Nuffield Department of Clinical Medicine, University of Oxford, Oxford, United Kingdom
- Chunlei Wu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States
- Andrew I. Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States
- Charity Muriuki
  - The Roslin Institute, University of Edinburgh, Midlothian, United Kingdom
- Emily L. Clark
  - The Roslin Institute, University of Edinburgh, Midlothian, United Kingdom
- Lel Eory
  - The Roslin Institute, University of Edinburgh, Midlothian, United Kingdom
- Lindsey A. Waddell
  - The Roslin Institute, University of Edinburgh, Midlothian, United Kingdom
- Richard Talbot
  - The Roslin Institute, University of Edinburgh, Midlothian, United Kingdom
- Alan L. Archibald
  - The Roslin Institute, University of Edinburgh, Midlothian, United Kingdom
- David A. Hume
  - Mater Research Institute-University of Queensland, Translational Research Institute, Woolloongabba, QLD, Australia
|
30
|
Blasco A, Endres MG, Sergeev RA, Jonchhe A, Macaluso NJM, Narayan R, Natoli T, Paik JH, Briney B, Wu C, Su AI, Subramanian A, Lakhani KR. Advancing computational biology and bioinformatics research through open innovation competitions. PLoS One 2019; 14:e0222165. [PMID: 31560691 PMCID: PMC6764653 DOI: 10.1371/journal.pone.0222165] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/08/2019] [Accepted: 08/22/2019] [Indexed: 11/19/2022] Open
Abstract
Open data science and algorithm development competitions offer a unique avenue for rapid discovery of better computational strategies. We highlight three examples in computational biology and bioinformatics research in which the use of competitions has yielded significant performance gains over established algorithms. These include algorithms for antibody clustering, imputing gene expression data, and querying the Connectivity Map (CMap). Performance gains are evaluated quantitatively using realistic, albeit sanitized, data sets. The solutions produced through these competitions are then examined with respect to their utility and the prospects for implementation in the field. We present the decision process and competition design considerations that led to these successful outcomes as a model for researchers who want to use competitions and non-domain crowds as collaborators to further their research.
Affiliation(s)
- Andrea Blasco
  - Laboratory for Innovation Science at Harvard, Harvard University, Cambridge, MA, United States of America
  - Institute for Quantitative Social Science, Harvard University, Cambridge, MA, United States of America
  - The Broad Institute, Cambridge, MA, United States of America
- Michael G. Endres
  - Laboratory for Innovation Science at Harvard, Harvard University, Cambridge, MA, United States of America
  - Institute for Quantitative Social Science, Harvard University, Cambridge, MA, United States of America
- Rinat A. Sergeev
  - Laboratory for Innovation Science at Harvard, Harvard University, Cambridge, MA, United States of America
  - Harvard Business School, Harvard University, Boston, MA, United States of America
- Anup Jonchhe
  - The Broad Institute, Cambridge, MA, United States of America
- Rajiv Narayan
  - The Broad Institute, Cambridge, MA, United States of America
- Ted Natoli
  - The Broad Institute, Cambridge, MA, United States of America
- Jin H. Paik
  - Laboratory for Innovation Science at Harvard, Harvard University, Cambridge, MA, United States of America
  - Harvard Business School, Harvard University, Boston, MA, United States of America
- Bryan Briney
  - Department of Immunology and Microbial Science, The Scripps Research Institute, La Jolla, CA, United States of America
- Chunlei Wu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States of America
- Andrew I. Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States of America
- Karim R. Lakhani
  - Laboratory for Innovation Science at Harvard, Harvard University, Cambridge, MA, United States of America
  - Harvard Business School, Harvard University, Boston, MA, United States of America
  - National Bureau of Economic Research, Cambridge, MA, United States of America
|
31
|
Matsuzaki T, Alvarez-Garcia O, Mokuda S, Nagira K, Olmer M, Gamini R, Miyata K, Akasaki Y, Su AI, Asahara H, Lotz MK. FoxO transcription factors modulate autophagy and proteoglycan 4 in cartilage homeostasis and osteoarthritis. Sci Transl Med 2019; 10:eaan0746. [PMID: 29444976 DOI: 10.1126/scitranslmed.aan0746] [Citation(s) in RCA: 167] [Impact Index Per Article: 33.4] [Received: 05/19/2017] [Accepted: 01/08/2018] [Indexed: 12/14/2022]
Abstract
Aging is a main risk factor for osteoarthritis (OA). FoxO transcription factors protect against cellular and organismal aging, and FoxO expression in cartilage is reduced with aging and in OA. To investigate the role of FoxO in cartilage, Col2Cre-FoxO1, 3, and 4 single knockout (KO) and triple KO mice (Col2Cre-TKO) were analyzed. Articular cartilage in Col2Cre-TKO and Col2Cre-FoxO1 KO mice was thicker than in control mice at 1 or 2 months of age. This was associated with increased proliferation of chondrocytes of Col2Cre-TKO mice in vivo and in vitro. OA-like changes developed in cartilage, synovium, and subchondral bone between 4 and 6 months of age in Col2Cre-TKO and Col2Cre-FoxO1 KO mice. Col2Cre-FoxO3 and FoxO4 KO mice showed no cartilage abnormalities until 18 months of age, when Col2Cre-FoxO3 KO mice had more severe OA than control mice. Autophagy and antioxidant defense genes were reduced in Col2Cre-TKO mice. Deletion of FoxO1/3/4 in mature mice using Aggrecan(Acan)-CreERT2 (AcanCreERT-TKO) also led to spontaneous cartilage degradation and increased OA severity in a surgical model or treadmill running. The superficial zone of knee articular cartilage of Col2Cre-TKO and AcanCreERT-TKO mice exhibited reduced cell density and markedly decreased Prg4. In vitro, ectopic FoxO1 expression increased Prg4 and synergized with transforming growth factor-β stimulation. In OA chondrocytes, overexpression of FoxO1 reduced inflammatory mediators and cartilage-degrading enzymes, increased protective genes, and antagonized interleukin-1β effects. Our observations suggest that FoxO transcription factors play a key role in postnatal cartilage development, maturation, and homeostasis and protect against OA-associated cartilage damage.
Affiliation(s)
- Tokio Matsuzaki
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Oscar Alvarez-Garcia
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Sho Mokuda
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Keita Nagira
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Merissa Olmer
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Ramya Gamini
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Kohei Miyata
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Yukio Akasaki
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Andrew I Su
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Hiroshi Asahara
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Martin K Lotz
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
|
32
|
Tsueng G, Nanis M, Fouquier JT, Mayers M, Good BM, Su AI. Applying citizen science to gene, drug and disease relationship extraction from biomedical abstracts. Bioinformatics 2019; 36:1226-1233. [PMID: 31504205 PMCID: PMC8104067 DOI: 10.1093/bioinformatics/btz678] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/17/2019] [Revised: 08/05/2019] [Accepted: 08/29/2019] [Indexed: 01/31/2023] Open
Abstract
MOTIVATION: Biomedical literature is growing at a rate that outpaces our ability to harness the knowledge contained therein. To mine valuable inferences from the large volume of literature, many researchers use information extraction algorithms to harvest information in biomedical texts. Information extraction is usually accomplished via a combination of manual expert curation and computational methods. Advances in computational methods usually depend on the time-consuming generation of gold standards by a limited number of expert curators. Citizen science is public participation in scientific research. We previously found that citizen scientists are willing and capable of performing named entity recognition of disease mentions in biomedical abstracts, but did not know if this was true with relationship extraction (RE). RESULTS: In this article, we introduce the Relationship Extraction Module of the web-based application Mark2Cure (M2C) and demonstrate that citizen scientists can perform RE. We confirm the importance of accurate named entity recognition on user performance of RE and identify design issues that impacted data quality. We find that the data generated by citizen scientists can be used to identify relationship types not currently available in the M2C Relationship Extraction Module. We compare the citizen science-generated data with algorithm-mined data and identify ways in which the two approaches may complement one another. We also discuss opportunities for future improvement of this system, as well as the potential synergies between citizen science, manual biocuration and natural language processing. AVAILABILITY AND IMPLEMENTATION: Mark2Cure platform: https://mark2cure.org; Mark2Cure source code: https://github.com/sulab/mark2cure; and data and analysis code for this article: https://github.com/gtsueng/M2C_rel_nb. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | - Max Nanis
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Jennifer T Fouquier
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Michael Mayers
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Benjamin M Good
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Andrew I Su
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
|
33
|
Putman T, Hybiske K, Jow D, Afrasiabi C, Lelong S, Cano MA, Stupp GS, Waagmeester A, Good BM, Wu C, Su AI. ChlamBase: a curated model organism database for the Chlamydia research community. Database (Oxford) 2019; 2019:5519651. [PMID: 31211397 PMCID: PMC6580685 DOI: 10.1093/database/baz091] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/04/2022]
Affiliation(s)
- Tim Putman
- Ontology Development Group, Library, Oregon Health and Science University, Portland, OR, USA
| | - Kevin Hybiske
- Division of Allergy and Infectious Diseases, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Derek Jow
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Cyrus Afrasiabi
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Sebastien Lelong
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Marco Alvarado Cano
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Gregory S Stupp
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | | | - Benjamin M Good
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Chunlei Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Andrew I Su
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
|
34
|
Murillo OD, Thistlethwaite W, Rozowsky J, Subramanian SL, Lucero R, Shah N, Jackson AR, Srinivasan S, Chung A, Laurent CD, Kitchen RR, Galeev T, Warrell J, Diao JA, Welsh JA, Hanspers K, Riutta A, Burgstaller-Muehlbacher S, Shah RV, Yeri A, Jenkins LM, Ahsen ME, Cordon-Cardo C, Dogra N, Gifford SM, Smith JT, Stolovitzky G, Tewari AK, Wunsch BH, Yadav KK, Danielson KM, Filant J, Moeller C, Nejad P, Paul A, Simonson B, Wong DK, Zhang X, Balaj L, Gandhi R, Sood AK, Alexander RP, Wang L, Wu C, Wong DTW, Galas DJ, Van Keuren-Jensen K, Patel T, Jones JC, Das S, Cheung KH, Pico AR, Su AI, Raffai RL, Laurent LC, Roth ME, Gerstein MB, Milosavljevic A. exRNA Atlas Analysis Reveals Distinct Extracellular RNA Cargo Types and Their Carriers Present across Human Biofluids. Cell 2019; 177:463-477.e15. [PMID: 30951672 PMCID: PMC6616370 DOI: 10.1016/j.cell.2019.02.018] [Citation(s) in RCA: 187] [Impact Index Per Article: 37.4] [Received: 04/28/2018] [Revised: 11/06/2018] [Accepted: 02/11/2019] [Indexed: 12/11/2022]
Abstract
To develop a map of cell-cell communication mediated by extracellular RNA (exRNA), the NIH Extracellular RNA Communication Consortium created the exRNA Atlas resource (https://exrna-atlas.org). The Atlas version 4P1 hosts 5,309 exRNA-seq and exRNA qPCR profiles from 19 studies and a suite of analysis and visualization tools. To analyze variation between profiles, we apply computational deconvolution. The analysis leads to a model with six exRNA cargo types (CT1, CT2, CT3A, CT3B, CT3C, CT4), each detectable in multiple biofluids (serum, plasma, CSF, saliva, urine). Five of the cargo types associate with known vesicular and non-vesicular (lipoprotein and ribonucleoprotein) exRNA carriers. To validate utility of this model, we re-analyze an exercise response study by deconvolution to identify physiologically relevant response pathways that were not detected previously. To enable wide application of this model, as part of the exRNA Atlas resource, we provide tools for deconvolution and analysis of user-provided case-control studies.
Affiliation(s)
- Oscar D Murillo
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - William Thistlethwaite
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Joel Rozowsky
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Sai Lakshmi Subramanian
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Rocco Lucero
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Neethu Shah
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Andrew R Jackson
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Srimeenakshi Srinivasan
- Department of Obstetrics, Gynecology, and Reproductive Sciences and Sanford Consortium for Regenerative Medicine, University of California, San Diego, La Jolla, CA 92037, USA
| | - Allen Chung
- Department of Surgery, University of California, San Francisco, San Francisco, CA 94143, USA; Surgical Service, San Francisco Veterans Affairs Medical Center, San Francisco, CA 94121, USA
| | - Clara D Laurent
- Department of Obstetrics, Gynecology, and Reproductive Sciences and Sanford Consortium for Regenerative Medicine, University of California, San Diego, La Jolla, CA 92037, USA
| | | | - Timur Galeev
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Jonathan Warrell
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - James A Diao
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA; Harvard-MIT Division of Health Sciences and Technology, Harvard Medical School, Boston, MA 02115, USA
| | - Joshua A Welsh
- Translational Nanobiology Section, Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | | | | | | | - Ravi V Shah
- Cardiovascular Research Center, Cardiology Division, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Ashish Yeri
- Cardiovascular Research Center, Cardiology Division, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Lisa M Jenkins
- Laboratory of Cell Biology, Center for Cancer Research, NIH, Bethesda, MD 20892, USA
| | - Mehmet E Ahsen
- Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Carlos Cordon-Cardo
- Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Navneet Dogra
- Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; IBM T.J. Watson Research Center, IBM Research, Yorktown Heights, NY 10598, USA
| | - Stacey M Gifford
- IBM T.J. Watson Research Center, IBM Research, Yorktown Heights, NY 10598, USA
| | - Joshua T Smith
- IBM T.J. Watson Research Center, IBM Research, Yorktown Heights, NY 10598, USA
| | - Gustavo Stolovitzky
- Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; IBM T.J. Watson Research Center, IBM Research, Yorktown Heights, NY 10598, USA
| | - Ashutosh K Tewari
- Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Benjamin H Wunsch
- IBM T.J. Watson Research Center, IBM Research, Yorktown Heights, NY 10598, USA
| | - Kamlesh K Yadav
- Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Sema4, Stamford, CT 06902, USA
| | - Kirsty M Danielson
- Cardiovascular Research Center, Cardiology Division, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Justyna Filant
- Department of Gynecologic Oncology and Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Courtney Moeller
- Department of Obstetrics, Gynecology, and Reproductive Sciences and Sanford Consortium for Regenerative Medicine, University of California, San Diego, La Jolla, CA 92037, USA
| | - Parham Nejad
- Department of Neurology, Center for Neurologic Diseases, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Anu Paul
- Department of Neurology, Center for Neurologic Diseases, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Bridget Simonson
- Cardiovascular Research Center, Cardiology Division, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - David K Wong
- Department of Surgery, University of California, San Francisco, San Francisco, CA 94143, USA; Surgical Service, San Francisco Veterans Affairs Medical Center, San Francisco, CA 94121, USA
| | - Xuan Zhang
- Exosome Diagnostics, Inc., Waltham, MA 02451, USA
| | - Leonora Balaj
- Department of Neurosurgery, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Roopali Gandhi
- Department of Neurology, Center for Neurologic Diseases, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Anil K Sood
- Department of Gynecologic Oncology and Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Center for RNA Interference and Non-Coding RNAs, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Department of Cancer Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | | | - Liang Wang
- Department of Pathology and MCW Cancer Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Chunlei Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - David T W Wong
- School of Dentistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - David J Galas
- Pacific Northwest Research Institute, Seattle, WA 98122, USA
| | | | - Tushar Patel
- Department of Transplantation, Mayo Clinic, Jacksonville, FL 32224, USA
| | - Jennifer C Jones
- Translational Nanobiology Section, Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Saumya Das
- Cardiovascular Research Center, Cardiology Division, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Kei-Hoi Cheung
- Department of Emergency Medicine, Yale University School of Medicine, New Haven, CT 06520, USA
| | | | - Andrew I Su
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Robert L Raffai
- Department of Surgery, University of California, San Francisco, San Francisco, CA 94143, USA; Surgical Service, San Francisco Veterans Affairs Medical Center, San Francisco, CA 94121, USA
| | - Louise C Laurent
- Department of Obstetrics, Gynecology, and Reproductive Sciences and Sanford Consortium for Regenerative Medicine, University of California, San Diego, La Jolla, CA 92037, USA
| | - Matthew E Roth
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Mark B Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA; Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Computer Science, Yale University, New Haven, CT 06520, USA
|
35
|
Tsueng G, Kumar A, Nanis SM, Su AI. Aligning Needs: Integrating Citizen Science Efforts into Schools Through Service Requirements. Hum Comput (Fairfax) 2019; 6:56-82. [PMID: 31363486 PMCID: PMC6667230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Citizen science is the participation in scientific research by members of the public, and it is an increasingly valuable tool for both scientists and educators. For researchers, citizen science is a means of more quickly investigating questions which would otherwise be time-consuming and costly to study. For educators, citizen science offers a means to engage students in actual research and improve learning outcomes. Since most citizen science projects are usually designed with research goals in mind, many lack the necessary educator materials for successful integration in a formal science education (FSE) setting. In an ideal world, researchers and educators would build the necessary materials together; however, many researchers lack the time, resources, and networks to create these materials early on in the life of a citizen science project. For resource-poor projects, we propose an intermediate entry point for recruiting from the educational setting: community service or service learning requirements (CSSLRs). Many schools require students to participate in community service or service learning activities in order to graduate. When implemented well, CSSLRs provide students with growth and development opportunities outside the classroom while contributing to the community and other worthwhile causes. However, CSSLRs take time, resources, and effort to implement well. Just as citizen science projects need to establish relationships to transition well into formal science education, schools need to cultivate relationships with community service organizations. Students and educators at schools with CSSLRs where implementation is still a work in progress may be left with a burdensome requirement and inadequate support. 
With the help of a volunteer fulfilling a CSSLR, we investigated the number of students impacted by CSSLRs set at different levels of government and explored the qualifications needed for citizen science projects to fulfill CSSLRs by examining the explicitly-stated justifications for having CSSLRs, surveying how CSSLRs are verified, and using these qualifications to demonstrate how an online citizen science project, Mark2Cure, could use this information to meet the needs of students fulfilling CSSLRs.
Affiliation(s)
| | - Arun Kumar
- graduate volunteer from Purdue University pursuing Optional Training
|
36
|
Putman T, Hybiske K, Jow D, Afrasiabi C, Lelong S, Cano MA, Stupp GS, Waagmeester A, Good BM, Wu C, Su AI. ChlamBase: a curated model organism database for the Chlamydia research community. Database (Oxford) 2019; 2019:baz041. [PMID: 30985891 PMCID: PMC6463448 DOI: 10.1093/database/baz041] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/31/2018] [Revised: 02/22/2019] [Accepted: 03/07/2019] [Indexed: 02/06/2023]
Abstract
The accelerating growth of genomic and proteomic information for Chlamydia species, coupled with unique biological aspects of these pathogens, necessitates bioinformatic tools and features that are not provided by major public databases. To meet these growing needs, we developed ChlamBase, a model organism database for Chlamydia that is built upon the WikiGenomes application framework, and Wikidata, a community-curated database. ChlamBase was designed to serve as a central access point for genomic and proteomic information for the Chlamydia research community. ChlamBase integrates information from numerous external databases, as well as important data extracted from the literature that are otherwise not available in structured formats that are easy to use. In addition, a key feature of ChlamBase is that it empowers users in the field to contribute new annotations and data as the field advances with continued discoveries. ChlamBase is freely and publicly available at chlambase.org.
Affiliation(s)
- Tim Putman
- Ontology Development Group, Library, Oregon Health and Science University, Portland, OR, USA
| | - Kevin Hybiske
- Division of Allergy and Infectious Diseases, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Derek Jow
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Cyrus Afrasiabi
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Sebastien Lelong
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Marco Alvarado Cano
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Gregory S Stupp
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | | | - Benjamin M Good
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Chunlei Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Andrew I Su
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
|
37
|
Fisch KM, Gamini R, Alvarez-Garcia O, Akagi R, Saito M, Muramatsu Y, Sasho T, Koziol JA, Su AI, Lotz MK. Identification of transcription factors responsible for dysregulated networks in human osteoarthritis cartilage by global gene expression analysis. Osteoarthritis Cartilage 2018; 26:1531-1538. [PMID: 30081074 PMCID: PMC6245598 DOI: 10.1016/j.joca.2018.07.012] [Citation(s) in RCA: 130] [Impact Index Per Article: 21.7] [Received: 02/05/2018] [Revised: 06/28/2018] [Accepted: 07/13/2018] [Indexed: 02/02/2023]
Abstract
OBJECTIVE: Osteoarthritis (OA) is the most prevalent joint disease. As disease-modifying therapies are not available, novel therapeutic targets need to be discovered and prioritized for their importance in mediating the abnormal phenotype of cells in OA-affected joints. Here, we generated a genome-wide molecular profile of OA to elucidate regulatory mechanisms of OA pathogenesis and to identify possible therapeutic targets using integrative analysis of mRNA-sequencing data obtained from human knee cartilage. DESIGN: RNA-sequencing (RNA-seq) was performed on 18 normal and 20 OA human knee cartilage tissues. RNA-seq datasets were analysed to identify genes, pathways and regulatory networks that were dysregulated in OA. RESULTS: RNA-seq data analysis revealed 1332 differentially expressed (DE) genes between OA and non-OA samples, including known and novel transcription factors (TFs). Pathway analysis identified 15 significantly perturbed pathways in OA with ECM-related, PI3K-Akt, HIF-1, FoxO and circadian rhythm pathways being the most significantly dysregulated. We selected DE TFs that are enriched for regulating DE genes in OA and prioritized these TFs by creating a cartilage-specific interaction subnetwork. This analysis revealed eight TFs, including JUN, Early growth response (EGR)1, JUND, FOSL2, MYC, KLF4, RELA, and FOS that both target large numbers of dysregulated genes in OA and are themselves suppressed in OA. CONCLUSIONS: We identified a novel subnetwork of dysregulated TFs that represent new mediators of abnormal gene expression and promising therapeutic targets in OA.
Affiliation(s)
- K M Fisch
- Center for Computational Biology and Bioinformatics, Department of Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA, USA
| | - R Gamini
- Department of Molecular Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA
| | - O Alvarez-Garcia
- Department of Molecular Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA
| | - R Akagi
- Department of Molecular Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA; Department of Orthopaedic Surgery, Chiba University Hospital 1-8-1 Inohana, Chuo-ku, Chiba, Japan
| | - M Saito
- Department of Molecular Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA; Department of Orthopaedic Surgery, Chiba University Hospital 1-8-1 Inohana, Chuo-ku, Chiba, Japan
| | - Y Muramatsu
- Department of Molecular Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA; Department of Orthopaedic Surgery, Chiba University Hospital 1-8-1 Inohana, Chuo-ku, Chiba, Japan
| | - T Sasho
- Department of Orthopaedic Surgery, Chiba University Hospital 1-8-1 Inohana, Chuo-ku, Chiba, Japan
| | - J A Koziol
- Department of Molecular Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA
| | - A I Su
- Department of Molecular Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA
| | - M K Lotz
- Department of Molecular Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA
|
38
|
Janes J, Young ME, Chen E, Rogers NH, Burgstaller-Muehlbacher S, Hughes LD, Love MS, Hull MV, Kuhen KL, Woods AK, Joseph SB, Petrassi HM, McNamara CW, Tremblay MS, Su AI, Schultz PG, Chatterjee AK. The ReFRAME library as a comprehensive drug repurposing library and its application to the treatment of cryptosporidiosis. Proc Natl Acad Sci U S A 2018; 115:10750-10755. [PMID: 30282735 PMCID: PMC6196526 DOI: 10.1073/pnas.1810137115] [Citation(s) in RCA: 128] [Impact Index Per Article: 21.3] [Indexed: 02/01/2023] Open
Abstract
The chemical diversity and known safety profiles of drugs previously tested in humans make them a valuable set of compounds to explore potential therapeutic utility in indications outside those originally targeted, especially neglected tropical diseases. This practice of "drug repurposing" has become commonplace in academic and other nonprofit drug-discovery efforts, with the appeal that significantly less time and resources are required to advance a candidate into the clinic. Here, we report a comprehensive open-access, drug repositioning screening set of 12,000 compounds (termed ReFRAME; Repurposing, Focused Rescue, and Accelerated Medchem) that was assembled by combining three widely used commercial drug competitive intelligence databases (Clarivate Integrity, GVK Excelra GoStar, and Citeline Pharmaprojects), together with extensive patent mining of small molecules that have been dosed in humans. To date, 12,000 compounds (∼80% of compounds identified from data mining) have been purchased or synthesized and subsequently plated for screening. To exemplify its utility, this collection was screened against Cryptosporidium spp., a major cause of childhood diarrhea in the developing world, and two active compounds previously tested in humans for other therapeutic indications were identified. Both compounds, VB-201 and a structurally related analog of ASP-7962, were subsequently shown to be efficacious in animal models of Cryptosporidium infection at clinically relevant doses, based on available human doses. In addition, an open-access data portal (https://reframedb.org) has been developed to share ReFRAME screen hits to encourage additional follow-up and maximize the impact of the ReFRAME screening collection.
Affiliation(s)
- Jeff Janes
- California Institute for Biomedical Research, La Jolla, CA 92037
| | - Megan E Young
- California Institute for Biomedical Research, La Jolla, CA 92037
| | - Emily Chen
- California Institute for Biomedical Research, La Jolla, CA 92037
| | - Nicole H Rogers
- California Institute for Biomedical Research, La Jolla, CA 92037
| | | | - Laura D Hughes
- Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037
| | - Melissa S Love
- California Institute for Biomedical Research, La Jolla, CA 92037
| | - Mitchell V Hull
- California Institute for Biomedical Research, La Jolla, CA 92037
| | - Kelli L Kuhen
- California Institute for Biomedical Research, La Jolla, CA 92037
| | - Ashley K Woods
- California Institute for Biomedical Research, La Jolla, CA 92037
| | - Sean B Joseph
- California Institute for Biomedical Research, La Jolla, CA 92037
| | | | - Case W McNamara
- California Institute for Biomedical Research, La Jolla, CA 92037
| | | | - Andrew I Su
- Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037
| | - Peter G Schultz
- California Institute for Biomedical Research, La Jolla, CA 92037
|
39
|
Abstract
The lysis and extraction of soluble bacterial proteins from cells is a common practice for proteomics analyses, but insoluble bacterial biomasses are often left behind. Here, we show that with triflic acid treatment, the insoluble bacterial biomass of Gram- and Gram+ bacteria can be rendered soluble. We use LC-MS/MS shotgun proteomics to show that bacterial proteins in the soluble and insoluble postlysis fractions differ significantly. Additionally, in the case of Gram- Pseudomonas aeruginosa, triflic acid treatment enables the enrichment of cell-envelope-associated proteins. Finally, we apply triflic acid to a human microbiome sample to show that this treatment is robust and enables the identification of a new, complementary subset of proteins from a complex microbial mixture.
Affiliation(s)
- Ana Y Wang
- Department of Molecular Medicine and Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - Peter S Thuy-Boun
- Department of Molecular Medicine and Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - Gregory S Stupp
- Department of Molecular Medicine and Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - Andrew I Su
- Department of Molecular Medicine and Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - Dennis W Wolan
- Department of Molecular Medicine and Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| |
Collapse
|
40
|
Hutt DM, Loguercio S, Roth DM, Su AI, Balch WE. Correcting the F508del-CFTR variant by modulating eukaryotic translation initiation factor 3-mediated translation initiation. J Biol Chem 2018; 293:13477-13495. [PMID: 30006345 DOI: 10.1074/jbc.ra118.003192] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 03/28/2018] [Revised: 07/05/2018] [Indexed: 12/31/2022] Open
Abstract
Inherited and somatic rare diseases result from >200,000 genetic variants leading to loss- or gain-of-toxic function, often caused by protein misfolding. Many of these misfolded variants fail to properly interact with other proteins. Understanding the link between factors mediating the transcription, translation, and protein folding of these disease-associated variants remains a major challenge in cell biology. Herein, we utilized the cystic fibrosis transmembrane conductance regulator (CFTR) protein as a model and performed a proteomics-based high-throughput screen (HTS) to identify pathways and components affecting the folding and function of the most common cystic fibrosis-associated mutation, the F508del variant of CFTR. Using a shortest-path algorithm we developed, we mapped HTS hits to the CFTR interactome to provide functional context to the targets and identified the eukaryotic translation initiation factor 3a (eIF3a) as a central hub for the biogenesis of CFTR. Of note, siRNA-mediated silencing of eIF3a reduced the polysome-to-monosome ratio in F508del-expressing cells, which, in turn, decreased the translation of CFTR variants, leading to increased CFTR stability, trafficking, and function at the cell surface. This finding suggested that eIF3a is involved in mediating the impact of genetic variations in CFTR on the folding of this protein. We posit that the number of ribosomes on a CFTR mRNA transcript is inversely correlated with the stability of the translated polypeptide. Polysome-based translation challenges the capacity of the proteostasis environment to balance message fidelity with protein folding, leading to disease. We suggest that this deficit can be corrected through control of translation initiation.
Affiliation(s)
- Andrew I Su
  - Integrative Structural and Computational Biology
- William E Balch
  - From the Departments of Molecular Medicine and the Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, California 92037
|
41
|
Abstract
BACKGROUND The Jurkat cell line has an extensive history as a model of T cell signaling. But at the turn of the 21st century, some expression irregularities were observed, raising doubts about how closely the cell line paralleled normal human T cells. While numerous expression deficiencies have been described in Jurkat, genetic explanations have only been provided for a handful of defects. RESULTS Here, we report a comprehensive catalog of genomic variation in the Jurkat cell line based on whole-genome sequencing. With this list of all detectable, non-reference sequences, we prioritize potentially damaging mutations by mining public databases for functional effects. We confirm documented mutations in Jurkat and propose links from detrimental gene variants to observed expression abnormalities in the cell line. CONCLUSIONS The Jurkat cell line harbors many mutations that are associated with cancer and contribute to Jurkat’s unique characteristics. Genes with damaging mutations in the Jurkat cell line are involved in T-cell receptor signaling (PTEN, INPP5D, CTLA4, and SYK), maintenance of genome stability (TP53, BAX, and MSH2), and O-linked glycosylation (C1GALT1C1). This work ties together decades of molecular experiments and serves as a resource that will streamline both the interpretation of past research and the design of future Jurkat studies. ELECTRONIC SUPPLEMENTARY MATERIAL The online version of this article (10.1186/s12864-018-4718-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Louis Gioia
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, California, 92037, USA.
- Azeem Siddique
  - Next Generation Sequencing Core, The Scripps Research Institute, La Jolla, California, 92037, USA
- Steven R Head
  - Next Generation Sequencing Core, The Scripps Research Institute, La Jolla, California, 92037, USA
- Daniel R Salomon
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, California, 92037, USA
- Andrew I Su
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, California, 92037, USA
|
42
|
Ma S, Cahalan S, LaMonte G, Grubaugh ND, Zeng W, Murthy SE, Paytas E, Gamini R, Lukacs V, Whitwam T, Loud M, Lohia R, Berry L, Khan SM, Janse CJ, Bandell M, Schmedt C, Wengelnik K, Su AI, Honore E, Winzeler EA, Andersen KG, Patapoutian A. Common PIEZO1 Allele in African Populations Causes RBC Dehydration and Attenuates Plasmodium Infection. Cell 2018; 173:443-455.e12. [PMID: 29576450 DOI: 10.1016/j.cell.2018.02.047] [Citation(s) in RCA: 140] [Impact Index Per Article: 23.3] [Received: 10/04/2017] [Revised: 01/06/2018] [Accepted: 02/14/2018] [Indexed: 01/05/2023]
Abstract
Hereditary xerocytosis is thought to be a rare genetic condition characterized by red blood cell (RBC) dehydration with mild hemolysis. RBC dehydration is linked to reduced Plasmodium infection in vitro; however, the role of RBC dehydration in protection against malaria in vivo is unknown. Most cases of hereditary xerocytosis are associated with gain-of-function mutations in PIEZO1, a mechanically activated ion channel. We engineered a mouse model of hereditary xerocytosis and show that Plasmodium infection fails to cause experimental cerebral malaria in these mice due to the action of Piezo1 in RBCs and in T cells. Remarkably, we identified a novel human gain-of-function PIEZO1 allele, E756del, present in a third of the African population. RBCs from individuals carrying this allele are dehydrated and display reduced Plasmodium infection in vitro. The existence of a gain-of-function PIEZO1 at such high frequencies is surprising and suggests an association with malaria resistance.
Affiliation(s)
- Shang Ma
  - Howard Hughes Medical Institute, Department of Neuroscience, Dorris Neuroscience Center, The Scripps Research Institute, La Jolla, CA 92037, USA
- Stuart Cahalan
  - Howard Hughes Medical Institute, Department of Neuroscience, Dorris Neuroscience Center, The Scripps Research Institute, La Jolla, CA 92037, USA
- Gregory LaMonte
  - Division of Host-Microbe Systems & Therapeutics, Department of Pediatrics, University of California, San Diego, San Diego, CA, USA
- Nathan D Grubaugh
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
- Weizheng Zeng
  - Howard Hughes Medical Institute, Department of Neuroscience, Dorris Neuroscience Center, The Scripps Research Institute, La Jolla, CA 92037, USA
- Swetha E Murthy
  - Howard Hughes Medical Institute, Department of Neuroscience, Dorris Neuroscience Center, The Scripps Research Institute, La Jolla, CA 92037, USA
- Emma Paytas
  - Division of Host-Microbe Systems & Therapeutics, Department of Pediatrics, University of California, San Diego, San Diego, CA, USA
- Ramya Gamini
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Viktor Lukacs
  - Howard Hughes Medical Institute, Department of Neuroscience, Dorris Neuroscience Center, The Scripps Research Institute, La Jolla, CA 92037, USA
- Tess Whitwam
  - Howard Hughes Medical Institute, Department of Neuroscience, Dorris Neuroscience Center, The Scripps Research Institute, La Jolla, CA 92037, USA
- Meaghan Loud
  - Howard Hughes Medical Institute, Department of Neuroscience, Dorris Neuroscience Center, The Scripps Research Institute, La Jolla, CA 92037, USA
- Rakhee Lohia
  - DIMNP, CNRS, INSERM, University Montpellier, Montpellier, France
- Laurence Berry
  - DIMNP, CNRS, INSERM, University Montpellier, Montpellier, France
- Shahid M Khan
  - Leiden Malaria Research Group, Department of Parasitology, Leiden University Medical Center (LUMC), 2333ZA Leiden, the Netherlands
- Chris J Janse
  - Leiden Malaria Research Group, Department of Parasitology, Leiden University Medical Center (LUMC), 2333ZA Leiden, the Netherlands
- Michael Bandell
  - Genomics Institute of the Novartis Research Foundation, La Jolla, CA, USA
- Christian Schmedt
  - Genomics Institute of the Novartis Research Foundation, La Jolla, CA, USA
- Kai Wengelnik
  - DIMNP, CNRS, INSERM, University Montpellier, Montpellier, France
- Andrew I Su
  - Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Eric Honore
  - Université Côte d'Azur, Centre National de la Recherche Scientifique, Paris, France; Institut de Pharmacologie Moléculaire et Cellulaire, Labex ICST, Valbonne, France
- Elizabeth A Winzeler
  - Division of Host-Microbe Systems & Therapeutics, Department of Pediatrics, University of California, San Diego, San Diego, CA, USA
- Kristian G Andersen
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA; Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Ardem Patapoutian
  - Howard Hughes Medical Institute, Department of Neuroscience, Dorris Neuroscience Center, The Scripps Research Institute, La Jolla, CA 92037, USA.
|
43
|
Abstract
Extraction of particles from cryo-electron microscopy (cryo-EM) micrographs is a crucial step in processing single-particle datasets. Although algorithms have been developed for automatic particle picking, these algorithms generally rely on two-dimensional templates for particle identification, which may exhibit biases that can propagate artifacts through the reconstruction pipeline. Manual picking is viewed as a gold-standard solution for particle selection, but it is too time-consuming to perform on data sets of thousands of images. In recent years, crowdsourcing has proven effective at leveraging the open web to manually curate datasets. In particular, citizen science projects such as Galaxy Zoo have shown the power of appealing to users’ scientific interests to process enormous amounts of data. To this end, we explored the possible applications of crowdsourcing in cryo-EM particle picking, presenting a variety of novel experiments including the production of a fully annotated particle set from untrained citizen scientists. We show the possibilities and limitations of crowdsourcing particle selection tasks, and explore further options for crowdsourcing cryo-EM data processing.
Affiliation(s)
- Jacob Bruggemann
  - Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037 USA.
- Gabriel C Lander
  - Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037 USA.
- Andrew I Su
  - Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037 USA.
|
44
|
Moon C, Stupp GS, Su AI, Wolan DW. Metaproteomics of Colonic Microbiota Unveils Discrete Protein Functions among Colitic Mice and Control Groups. Proteomics 2018; 18:10.1002/pmic.201700391. [PMID: 29319931 PMCID: PMC5921860 DOI: 10.1002/pmic.201700391] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/26/2017] [Revised: 12/19/2017] [Indexed: 12/14/2022]
Abstract
Metaproteomics can greatly assist established high-throughput sequencing methodologies to provide systems biological insights into the alterations of microbial protein functionalities correlated with disease-associated dysbiosis of the intestinal microbiota. Here, the authors utilize the well-characterized murine T cell transfer model of colitis to find specific changes within the intestinal luminal proteome associated with inflammation. MS proteomic analysis of colonic samples permitted the identification of ≈10 000-12 000 unique peptides that corresponded to 5610 protein clusters identified across three groups, including the colitic Rag1-/- T cell recipients, isogenic Rag1-/- controls, and wild-type mice. The authors demonstrate that the colitic mice exhibited a significant increase in Proteobacteria and Verrucomicrobia and show that such alterations in the microbial communities contributed to the enrichment of specific proteins with transcription and translation gene ontology terms. In combination with 16S sequencing, the authors' metaproteomics-based microbiome studies provide a foundation for assessing alterations in intestinal luminal protein functionalities in a robust and well-characterized mouse model of colitis, and set the stage for future studies to further explore the functional mechanisms of altered protein functionalities associated with dysbiosis and inflammation.
Affiliation(s)
- Clara Moon
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, USA
- Gregory S Stupp
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Andrew I Su
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, USA
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Dennis W Wolan
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, USA
|
45
|
Xin J, Afrasiabi C, Lelong S, Adesara J, Tsueng G, Su AI, Wu C. Cross-linking BioThings APIs through JSON-LD to facilitate knowledge exploration. BMC Bioinformatics 2018; 19:30. [PMID: 29390967 PMCID: PMC5796402 DOI: 10.1186/s12859-018-2041-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/21/2017] [Accepted: 01/24/2018] [Indexed: 01/25/2023] Open
Abstract
BACKGROUND Application Programming Interfaces (APIs) are now widely used to distribute biological data. Many popular biological APIs, developed by different research teams, have adopted JavaScript Object Notation (JSON) as their primary data format. While usage of a common data format offers significant advantages, that alone is not sufficient for rich integrative queries across APIs. RESULTS Here, we have implemented JSON for Linking Data (JSON-LD) technology on the BioThings APIs that we have developed: MyGene.info, MyVariant.info and MyChem.info. JSON-LD provides a standard way to add semantic context to the existing JSON data structure, for the purpose of enhancing the interoperability between APIs. We demonstrated several use cases that were facilitated by semantic annotations using JSON-LD, including simpler and more precise query capabilities as well as API cross-linking. CONCLUSIONS We believe that this pattern offers a generalizable solution for interoperability of APIs in the life sciences.
Affiliation(s)
- Jiwen Xin
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Cyrus Afrasiabi
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Sebastien Lelong
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Julee Adesara
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Ginger Tsueng
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Chunlei Wu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA.
|
46
|
Putman TE, Lelong S, Burgstaller-Muehlbacher S, Waagmeester A, Diesh C, Dunn N, Munoz-Torres M, Stupp GS, Wu C, Su AI, Good BM. WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata. Database (Oxford) 2017; 2017:3084697. [PMID: 28365742 PMCID: PMC5467579 DOI: 10.1093/database/bax025] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 11/04/2016] [Accepted: 03/06/2017] [Indexed: 11/25/2022]
Abstract
With the advancement of genome-sequencing technologies, new genomes are being sequenced daily. Although these sequences are deposited in publicly available data warehouses, their functional and genomic annotations (beyond genes which are predicted automatically) mostly reside in the text of primary publications. Professional curators are hard at work extracting those annotations from the literature for the most studied organisms and depositing them in structured databases. However, the resources don’t exist to fund the comprehensive curation of the thousands of newly sequenced organisms in this manner. Here, we describe WikiGenomes (wikigenomes.org), a web application that facilitates the consumption and curation of genomic data by the entire scientific community. WikiGenomes is based on Wikidata, an openly editable knowledge graph with the goal of aggregating published knowledge into a free and open database. WikiGenomes empowers the individual genomic researcher to contribute their expertise to the curation effort and integrates the knowledge into Wikidata, enabling it to be accessed by anyone without restriction. Database URL: www.wikigenomes.org
Affiliation(s)
- Tim E Putman
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA, 92037 USA
- Sebastien Lelong
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA, 92037 USA
- Colin Diesh
  - Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
- Nathan Dunn
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Monica Munoz-Torres
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Gregory S Stupp
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA, 92037 USA
- Chunlei Wu
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA, 92037 USA
- Andrew I Su
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA, 92037 USA
- Benjamin M Good
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA, 92037 USA
|
47
|
|
48
|
Alvarez-Garcia O, Fisch KM, Wineinger NE, Akagi R, Saito M, Sasho T, Su AI, Lotz MK. Increased DNA Methylation and Reduced Expression of Transcription Factors in Human Osteoarthritis Cartilage. Arthritis Rheumatol 2017; 68:1876-86. [PMID: 26881698 DOI: 10.1002/art.39643] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 03/24/2015] [Accepted: 02/11/2016] [Indexed: 12/13/2022]
Abstract
OBJECTIVE To analyze the methylome of normal and osteoarthritic (OA) knee articular cartilage and to determine the role of DNA methylation in the regulation of gene expression in vitro. METHODS DNA was isolated from human normal (n = 11) and OA (n = 12) knee articular cartilage and analyzed using the Infinium HumanMethylation450 BeadChip array. To integrate methylation and transcription, RNA sequencing was performed on normal and OA cartilage and validated by quantitative polymerase chain reaction. Functional validation was performed in the human TC28 cell line and primary chondrocytes that were treated with the DNA methylation inhibitor 5-aza-2'-deoxycytidine (5-aza-dC). RESULTS DNA methylation profiling revealed 929 differentially methylated sites between normal and OA cartilage, comprising a total of 500 individual genes. Among these, 45 transcription factors that harbored differentially methylated sites were identified. Integrative analysis and subsequent validation showed a subset of 6 transcription factors that were significantly hypermethylated and down-regulated in OA cartilage (ATOH8, MAFF, NCOR2, TBX4, ZBTB16, and ZHX2). Upon 5-aza-dC treatment, TC28 cells showed a significant increase in gene expression for all 6 transcription factors. In primary chondrocytes, ATOH8 and TBX4 were increased after 5-aza-dC treatment. CONCLUSION Our findings reveal that normal and OA knee articular cartilage have significantly different methylomes. The identification of a subset of epigenetically regulated transcription factors with reduced expression in OA may represent an important mechanism to explain changes in the chondrocyte transcriptome and function during OA pathogenesis.
Affiliation(s)
- Ryuichiro Akagi
  - The Scripps Research Institute, La Jolla, California, and Chiba University, Chiba, Japan
- Andrew I Su
  - The Scripps Research Institute, La Jolla, California
- Martin K Lotz
  - The Scripps Research Institute, La Jolla, California
|
49
|
Akagi R, Akatsu Y, Fisch KM, Alvarez-Garcia O, Teramura T, Muramatsu Y, Saito M, Sasho T, Su AI, Lotz MK. Dysregulated circadian rhythm pathway in human osteoarthritis: NR1D1 and BMAL1 suppression alters TGF-β signaling in chondrocytes. Osteoarthritis Cartilage 2017; 25:943-951. [PMID: 27884645 PMCID: PMC5438901 DOI: 10.1016/j.joca.2016.11.007] [Citation(s) in RCA: 57] [Impact Index Per Article: 8.1] [Received: 04/29/2016] [Revised: 11/08/2016] [Accepted: 11/12/2016] [Indexed: 02/02/2023]
Abstract
OBJECTIVES Circadian rhythm (CR) was identified by RNA sequencing as the most dysregulated pathway in human osteoarthritis (OA) in articular cartilage. This study examined circadian rhythmicity in cultured chondrocytes and the role of the CR genes NR1D1 and BMAL1 in regulating chondrocyte functions. METHODS RNA was extracted from normal and OA-affected human knee cartilage (n = 14 each). Expression levels of NR1D1 and BMAL1 mRNA and protein were assessed by quantitative PCR and immunohistochemistry. Human chondrocytes were synchronized and harvested at regular intervals to examine circadian rhythmicity in RNA and protein expression. Chondrocytes were treated with small interfering RNA (siRNA) for NR1D1 or BMAL1, followed by RNA sequencing and analysis of the effects on the transforming growth factor beta (TGF-β) pathway. RESULTS NR1D1 and BMAL1 mRNA and protein levels were significantly reduced in OA compared to normal cartilage. In cultured human chondrocytes, a clear circadian rhythmicity was observed for NR1D1 and BMAL1. Increased BMAL1 expression was observed after knocking down NR1D1, and decreased NR1D1 levels were observed after knocking down BMAL1. Sequencing of RNA from chondrocytes treated with NR1D1 or BMAL1 siRNA identified 330 and 68 significantly different genes, respectively, and this predominantly affected the TGF-β signaling pathway. CONCLUSIONS The CR pathway is dysregulated in OA cartilage. Interference with circadian rhythmicity in cultured chondrocytes affects TGF-β signaling, which is a central pathway in cartilage homeostasis.
Affiliation(s)
- R Akagi
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA; Department of Orthopaedic Surgery, School of Medicine, Chiba University, 1-8-1, Inohana, Chuou, Chiba, 260-8677, Japan
- Y Akatsu
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA; Department of Orthopaedic Surgery, School of Medicine, Chiba University, 1-8-1, Inohana, Chuou, Chiba, 260-8677, Japan
- K M Fisch
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA
- O Alvarez-Garcia
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA
- T Teramura
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA
- Y Muramatsu
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA
- M Saito
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA; Department of Orthopaedic Surgery, Toho University Sakura Medical Center, 564-1 Shimoshizu, Sakura, Chiba, 285-8741, Japan
- T Sasho
  - Department of Orthopaedic Surgery, School of Medicine, Chiba University, 1-8-1, Inohana, Chuou, Chiba, 260-8677, Japan
- A I Su
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA
- M K Lotz
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, USA.
|
50
|
Mayers MD, Moon C, Stupp GS, Su AI, Wolan DW. Quantitative Metaproteomics and Activity-Based Probe Enrichment Reveals Significant Alterations in Protein Expression from a Mouse Model of Inflammatory Bowel Disease. J Proteome Res 2017; 16:1014-1026. [PMID: 28052195 DOI: 10.1021/acs.jproteome.6b00938] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Indexed: 12/30/2022]
Abstract
Tandem mass spectrometry-based shotgun proteomics of distal gut microbiomes is exceedingly difficult due to the inherent complexity and taxonomic diversity of the samples. We introduce two new methodologies to improve metaproteomic studies of microbiome samples. These methods include the stable isotope labeling in mammals to permit protein quantitation across two mouse cohorts as well as the application of activity-based probes to enrich and analyze both host and microbial proteins with specific functionalities. We used these technologies to study the microbiota from the adoptive T cell transfer mouse model of inflammatory bowel disease (IBD) and compare these samples to an isogenic control, thereby limiting genetic and environmental variables that influence microbiome composition. The data generated highlight quantitative alterations in both host and microbial proteins due to intestinal inflammation and corroborate the observed phylogenetic changes in bacteria that accompany IBD in humans and mouse models. The combination of isotope labeling with shotgun proteomics resulted in the total identification of 4434 protein clusters expressed in the microbial proteomic environment, 276 of which demonstrated differential abundance between control and IBD mice. Notably, application of a novel cysteine-reactive probe uncovered several microbial proteases and hydrolases overrepresented in the IBD mice. Implementation of these methods demonstrated that substantial insights into the identity and dysregulation of host and microbial proteins altered in IBD can be accomplished and can be used in the interrogation of other microbiome-related diseases.
Affiliation(s)
- Michael D Mayers
  - Department of Molecular and Experimental Medicine, Department of Integrative Structural and Computational Biology, and Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
- Clara Moon
  - Department of Molecular and Experimental Medicine, Department of Integrative Structural and Computational Biology, and Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
- Gregory S Stupp
  - Department of Molecular and Experimental Medicine, Department of Integrative Structural and Computational Biology, and Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
- Andrew I Su
  - Department of Molecular and Experimental Medicine, Department of Integrative Structural and Computational Biology, and Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
- Dennis W Wolan
  - Department of Molecular and Experimental Medicine, Department of Integrative Structural and Computational Biology, and Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
|