1
Martone ME. The past, present and future of neuroscience data sharing: a perspective on the state of practices and infrastructure for FAIR. Front Neuroinform 2024; 17:1276407. [PMID: 38250019 PMCID: PMC10796549 DOI: 10.3389/fninf.2023.1276407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 10/31/2023] [Indexed: 01/23/2024] Open
Abstract
Neuroscience has made significant strides over the past decade in moving from a largely closed science characterized by anemic data sharing, to a largely open science where the amount of publicly available neuroscience data has increased dramatically. While this increase is driven in significant part by large prospective data sharing studies, we are starting to see increased sharing in the long tail of neuroscience data, driven no doubt by journal requirements and funder mandates. Concomitant with this shift to open is the increasing support of the FAIR data principles by neuroscience practices and infrastructure. FAIR is particularly critical for neuroscience with its multiplicity of data types, scales and model systems and the infrastructure that serves them. As envisioned from the early days of neuroinformatics, neuroscience is currently served by a globally distributed ecosystem of neuroscience-centric data repositories, largely specialized around data types. To make neuroscience data findable, accessible, interoperable, and reusable requires the coordination across different stakeholders, including the researchers who produce the data, data repositories who make it available, the aggregators and indexers who field search engines across the data, and community organizations who help to coordinate efforts and develop the community standards critical to FAIR. The International Neuroinformatics Coordinating Facility has led efforts to move neuroscience toward FAIR, fielding several resources to help researchers and repositories achieve FAIR. In this perspective, I provide an overview of the components and practices required to achieve FAIR in neuroscience and provide thoughts on the past, present and future of FAIR infrastructure for neuroscience, from the laboratory to the search engine.
Affiliation(s)
- Maryann E. Martone
- Department of Neurosciences, University of California, San Diego, CA, United States
- San Francisco Veterans Administration Hospital, San Francisco, CA, United States
2
Mittal D, Mease R, Kuner T, Flor H, Kuner R, Andoh J. Data management strategy for a collaborative research center. Gigascience 2022; 12:giad049. [PMID: 37401720 PMCID: PMC10318494 DOI: 10.1093/gigascience/giad049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 02/20/2023] [Accepted: 06/11/2023] [Indexed: 07/05/2023] Open
Abstract
The importance of effective research data management (RDM) strategies to support the generation of Findable, Accessible, Interoperable, and Reusable (FAIR) neuroscience data grows with each advance in data acquisition techniques and research methods. To maximize the impact of diverse research strategies, multidisciplinary, large-scale neuroscience research consortia face a number of unsolved challenges in RDM. While open science principles are largely accepted, it is practically difficult for researchers to prioritize RDM over other pressing demands. The implementation of a coherent, executable RDM plan for consortia spanning animal, human, and clinical studies is becoming increasingly challenging. Here, we present an RDM strategy implemented for the Heidelberg Collaborative Research Consortium. Our consortium combines basic and clinical research in diverse populations (animals and humans) and produces highly heterogeneous and multimodal research data (e.g., neurophysiology, neuroimaging, genetics, behavior). We present a concrete strategy for initiating early-stage RDM and FAIR data generation for large-scale collaborative research consortia, with a focus on sustainable solutions that incentivize incremental RDM while respecting research-specific requirements.
Affiliation(s)
- Deepti Mittal
- Institute of Pharmacology, Heidelberg University, 69120 Heidelberg, Germany
- Rebecca Mease
- Institute of Physiology and Pathophysiology, Heidelberg University, 69120 Heidelberg, Germany
- Thomas Kuner
- Institute for Anatomy and Cell Biology, Heidelberg University, 69120 Mannheim, Germany
- Herta Flor
- Department of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, 68159 Mannheim, Germany
- Rohini Kuner
- Institute of Pharmacology, Heidelberg University, 69120 Heidelberg, Germany
- Jamila Andoh
- Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, 68159 Mannheim, Germany
3
Murphy F, Bar-Sinai M, Martone ME. A tool for assessing alignment of biomedical data repositories with open, FAIR, citation and trustworthy principles. PLoS One 2021; 16:e0253538. [PMID: 34242248 PMCID: PMC8270168 DOI: 10.1371/journal.pone.0253538] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/16/2020] [Accepted: 06/08/2021] [Indexed: 11/19/2022] Open
Abstract
Increasing attention is being paid to the operation of biomedical data repositories in light of efforts to improve how scientific data is handled and made available for the long term. Multiple groups have produced recommendations for functions that biomedical repositories should support, with many using requirements of the FAIR data principles as guidelines. However, FAIR is but one set of principles that has arisen out of the open science community. They are joined by principles governing open science, data citation and trustworthiness, all of which are important aspects for biomedical data repositories to support. Together, these define a framework for data repositories that we call OFCT: Open, FAIR, Citable and Trustworthy. Here we developed an instrument using the open source PolicyModels toolkit that attempts to operationalize key aspects of OFCT principles and piloted the instrument by evaluating eight biomedical community repositories listed by the NIDDK Information Network (dkNET.org). Repositories included both specialist repositories that focused on a particular data type or domain, in this case diabetes and metabolomics, and generalist repositories that accept all data types and domains. The goal of this work was both to obtain a sense of how much the design of current biomedical data repositories align with these principles and to augment the dkNET listing with additional information that may be important to investigators trying to choose a repository, e.g., does the repository fully support data citation? The evaluation was performed from March to November 2020 through inspection of documentation and interaction with the sites by the authors. Overall, although there was little explicit acknowledgement of any of the OFCT principles in our sample, the majority of repositories provided at least some support for their tenets.
Affiliation(s)
- Fiona Murphy
- MoreBrains Cooperative Ltd, Chichester, United Kingdom
- Michael Bar-Sinai
- Department of Computer Science, Ben-Gurion University of the Negev and The Institute of Quantitative Social Science at Harvard University, Beersheba, Israel
- Maryann E. Martone
- Department of Neurosciences, SciCrunch, Inc., University of California, San Diego, California, United States of America
4
Hsu CN, Bandrowski AE, Gillespie TH, Udell J, Lin KW, Ozyurt IB, Grethe JS, Martone ME. Comparing the Use of Research Resource Identifiers and Natural Language Processing for Citation of Databases, Software, and Other Digital Artifacts. Comput Sci Eng 2020. [DOI: 10.1109/mcse.2019.2952838] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/07/2022]
5
Ozyurt IB, Bandrowski A, Grethe JS. Bio-AnswerFinder: a system to find answers to questions from biomedical texts. Database (Oxford) 2020; 2020:5700339. [PMID: 31925435 PMCID: PMC7053013 DOI: 10.1093/database/baz137] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/25/2019] [Revised: 10/04/2019] [Accepted: 11/07/2019] [Indexed: 11/15/2022]
Abstract
The ever accelerating pace of biomedical research results in corresponding acceleration in the volume of biomedical literature created. Since new research builds upon existing knowledge, the rate of increase in the available knowledge encoded in biomedical literature makes the easy access to that implicit knowledge more vital over time. Toward the goal of making implicit knowledge in the biomedical literature easily accessible to biomedical researchers, we introduce a question answering system called Bio-AnswerFinder. Bio-AnswerFinder uses a weighted-relaxed word mover's distance based similarity on word/phrase embeddings learned from PubMed abstracts to rank answers after question focus entity type filtering. Our approach retrieves relevant documents iteratively via enhanced keyword queries from a traditional search engine. To improve document retrieval performance, we introduced a supervised long short term memory neural network to select keywords from the question to facilitate iterative keyword search. Our unsupervised baseline system achieves a mean reciprocal rank score of 0.46 and Precision@1 of 0.32 on 936 questions from BioASQ. The answer sentences are further ranked by a fine-tuned bidirectional encoder representation from transformers (BERT) classifier trained using 100 answer candidate sentences per question for 492 BioASQ questions. To test ranking performance, we report a blind test on 100 questions that three independent annotators scored. These experts preferred BERT based reranking with 7% improvement on MRR and 13% improvement on Precision@1 scores on average.
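For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how mean reciprocal rank (MRR) and Precision@1 are conventionally computed over ranked answer lists. It is not taken from the Bio-AnswerFinder code base; the function names and the toy relevance judgments are hypothetical and serve only to illustrate the standard definitions.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[bool]) -> float:
    """Return 1/rank of the first correct answer, or 0.0 if none is correct."""
    for rank, is_correct in enumerate(relevance, start=1):
        if is_correct:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[Sequence[bool]]) -> float:
    """Average the reciprocal rank over all questions."""
    if not runs:
        raise ValueError("at least one question is required")
    return sum(reciprocal_rank(r) for r in runs) / len(runs)


def precision_at_1(runs: Sequence[Sequence[bool]]) -> float:
    """Fraction of questions whose top-ranked answer is correct."""
    if not runs:
        raise ValueError("at least one question is required")
    return sum(1 for r in runs if r and r[0]) / len(runs)


# Hypothetical judgments: one list per question, ordered by system rank;
# True marks an answer candidate judged correct.
toy_runs = [
    [True, False, False],         # correct at rank 1 -> RR = 1
    [False, True, False],         # correct at rank 2 -> RR = 1/2
    [False, False, False],        # no correct answer -> RR = 0
    [False, False, False, True],  # correct at rank 4 -> RR = 1/4
]

mrr = mean_reciprocal_rank(toy_runs)  # (1 + 0.5 + 0 + 0.25) / 4 = 0.4375
p_at_1 = precision_at_1(toy_runs)     # 1 of 4 questions -> 0.25
print(f"MRR = {mrr:.4f}, Precision@1 = {p_at_1:.2f}")
```

Under these definitions, MRR rewards systems that place a correct answer near the top even when it is not first, whereas Precision@1 credits only the top-ranked answer.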
Affiliation(s)
- Ibrahim Burak Ozyurt
- Center for Research in Biological Systems, University of California, San Diego, 9500 Gilman Drive, M/C 0608, La Jolla, CA 92093-0608
- Anita Bandrowski
- Center for Research in Biological Systems, University of California, San Diego, 9500 Gilman Drive, M/C 0608, La Jolla, CA 92093-0608
- Jeffrey S Grethe
- Center for Research in Biological Systems, University of California, San Diego, 9500 Gilman Drive, M/C 0608, La Jolla, CA 92093-0608
6
Ozyurt IB, Grethe JS. Foundry: a message-oriented, horizontally scalable ETL system for scientific data integration and enhancement. Database (Oxford) 2018; 2018:5255189. [PMID: 30576493 PMCID: PMC6301337 DOI: 10.1093/database/bay130] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/14/2018] [Revised: 10/18/2018] [Accepted: 11/14/2018] [Indexed: 11/12/2022]
Abstract
Data generated by scientific research enables further advancement in science through reanalyses and pooling of data for novel analyses. With the increasing amounts of scientific data generated by biomedical research providing researchers with more data than they have ever had access to, finding the data matching the researchers' requirements continues to be a major challenge and will only grow more challenging as more data is produced and shared. In this paper, we introduce a horizontally scalable distributed extract-transform-load system to tackle scientific data aggregation, transformation and enhancement for scientific data discovery and retrieval. We also introduce a data transformation language for biomedical curators allowing for the transformation and combination of data/metadata from heterogeneous data sources. Applicability of the system for scientific data is illustrated in biomedical and earth science domains.
Affiliation(s)
- Ibrahim Burak Ozyurt
- Center for Research in Biological Systems, University of California, San Diego, La Jolla, CA, USA
- Jeffrey S Grethe
- Center for Research in Biological Systems, University of California, San Diego, La Jolla, CA, USA
7
Karapiperis C, Kempf SJ, Quintens R, Azimzadeh O, Vidal VL, Pazzaglia S, Bazyka D, Mastroberardino PG, Scouras ZG, Tapio S, Benotmane MA, Ouzounis CA. Brain Radiation Information Data Exchange (BRIDE): integration of experimental data from low-dose ionising radiation research for pathway discovery. BMC Bioinformatics 2016; 17:212. [PMID: 27170263 PMCID: PMC4865096 DOI: 10.1186/s12859-016-1068-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/27/2015] [Accepted: 04/21/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The underlying molecular processes representing stress responses to low-dose ionising radiation (LDIR) in mammals are just beginning to be understood. In particular, LDIR effects on the brain and their possible association with neurodegenerative disease are currently being explored using omics technologies. RESULTS We describe a light-weight approach for the storage, analysis and distribution of relevant LDIR omics datasets. The data integration platform, called BRIDE, contains information from the literature as well as experimental information from transcriptomics and proteomics studies. It deploys a hybrid, distributed solution using both local storage and cloud technology. CONCLUSIONS BRIDE can act as a knowledge broker for LDIR researchers, to facilitate molecular research on the systems biology of LDIR response in mammals. Its flexible design can capture a range of experimental information for genomics, epigenomics, transcriptomics, and proteomics. The data collection is available at: .
Affiliation(s)
- Christos Karapiperis
- Department of Genetics, Development & Molecular Biology, School of Biology, Aristotle University of Thessalonica, 54124, Thessalonica, Greece
- Stefan J Kempf
- Institute of Radiation Biology, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, 85764, Neuherberg, Germany
- Present address: Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
- Roel Quintens
- Radiobiology Unit, Belgian Nuclear Research Centre (SCK•CEN), B-2400, Mol, Belgium
- Omid Azimzadeh
- Institute of Radiation Biology, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, 85764, Neuherberg, Germany
- Victoria Linares Vidal
- School of Medicine, IISPV, "Rovira i Virgili" University, Sant Llorens 21, 43201, Reus, Spain
- Simonetta Pazzaglia
- Laboratory of Radiation Biology & Biomedicine, Agenzia Nazionale per le Nuove Tecnologie, l'Energia e lo Sviluppo Economico Sostenibile (ENEA) Centro Ricerche Casaccia, 00123, Rome, Italy
- Dimitry Bazyka
- National Research Center for Radiation Medicine of the National Academy of Medical Sciences of Ukraine, Melnykov str. 53, Kyiv, 04050, Ukraine
- Zacharias G Scouras
- Department of Genetics, Development & Molecular Biology, School of Biology, Aristotle University of Thessalonica, 54124, Thessalonica, Greece
- Soile Tapio
- Institute of Radiation Biology, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, 85764, Neuherberg, Germany
- Christos A Ouzounis
- Department of Genetics, Development & Molecular Biology, School of Biology, Aristotle University of Thessalonica, 54124, Thessalonica, Greece
- Biological Process & Computation Laboratory (BCPL), Chemical Process & Energy Resources Institute (CPERI), Centre for Research & Technology Hellas (CERTH), Thessalonica, 57001, Greece
8
Ozyurt IB, Grethe JS, Martone ME, Bandrowski AE. Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature. PLoS One 2016; 11:e0146300. [PMID: 26730820 PMCID: PMC5156472 DOI: 10.1371/journal.pone.0146300] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 03/02/2015] [Accepted: 12/15/2015] [Indexed: 11/19/2022] Open
Abstract
The NIF Registry developed and maintained by the Neuroscience Information Framework is a cooperative project aimed at cataloging research resources, e.g., software tools, databases and tissue banks, funded largely by governments and available as tools to research scientists. Although originally conceived for neuroscience, the NIF Registry has over the years broadened in the scope to include research resources of general relevance to biomedical research. The current number of research resources listed by the Registry numbers over 13K. The broadening in scope to biomedical science led us to re-christen the NIF Registry platform as SciCrunch. The NIF/SciCrunch Registry has been cataloging the resource landscape since 2006; as such, it serves as a valuable dataset for tracking the breadth, fate and utilization of these resources. Our experience shows research resources like databases are dynamic objects, that can change location and scope over time. Although each record is entered manually and human-curated, the current size of the registry requires tools that can aid in curation efforts to keep content up to date, including when and where such resources are used. To address this challenge, we have developed an open source tool suite, collectively termed RDW: Resource Disambiguator for the (Web). RDW is designed to help in the upkeep and curation of the registry as well as in enhancing the content of the registry by automated extraction of resource candidates from the literature. The RDW toolkit includes a URL extractor from papers, resource candidate screen, resource URL change tracker, resource content change tracker. Curators access these tools via a web based user interface. Several strategies are used to optimize these tools, including supervised and unsupervised learning algorithms as well as statistical text analysis. 
The complete tool suite is used to enhance and maintain the resource registry as well as track the usage of individual resources through an innovative literature citation index honed for research resources. Here we present an overview of the Registry and show how the RDW tools are used in curation and usage tracking.
9
van Niekerk EA, Tuszynski MH, Lu P, Dulin JN. Molecular and Cellular Mechanisms of Axonal Regeneration After Spinal Cord Injury. Mol Cell Proteomics 2015; 15:394-408. [PMID: 26695766 DOI: 10.1074/mcp.r115.053751] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 07/22/2015] [Indexed: 12/28/2022] Open
Abstract
Following axotomy, a complex temporal and spatial coordination of molecular events enables regeneration of the peripheral nerve. In contrast, multiple intrinsic and extrinsic factors contribute to the general failure of axonal regeneration in the central nervous system. In this review, we examine the current understanding of differences in protein expression and post-translational modifications, activation of signaling networks, and environmental cues that may underlie the divergent regenerative capacity of central and peripheral axons. We also highlight key experimental strategies to enhance axonal regeneration via modulation of intraneuronal signaling networks and the extracellular milieu. Finally, we explore potential applications of proteomics to fill gaps in the current understanding of molecular mechanisms underlying regeneration, and to provide insight into the development of more effective approaches to promote axonal regeneration following injury to the nervous system.
Affiliation(s)
- Erna A van Niekerk
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093
- Mark H Tuszynski
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093; Veterans Administration Medical Center, San Diego, CA 92161
- Paul Lu
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093; Veterans Administration Medical Center, San Diego, CA 92161
- Jennifer N Dulin
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093
10
Whetzel PL, Grethe JS, Banks DE, Martone ME. The NIDDK Information Network: A Community Portal for Finding Data, Materials, and Tools for Researchers Studying Diabetes, Digestive, and Kidney Diseases. PLoS One 2015; 10:e0136206. [PMID: 26393351 PMCID: PMC4578941 DOI: 10.1371/journal.pone.0136206] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/17/2015] [Accepted: 07/30/2015] [Indexed: 11/19/2022] Open
Abstract
The NIDDK Information Network (dkNET; http://dknet.org) was launched to serve the needs of basic and clinical investigators in metabolic, digestive and kidney disease by facilitating access to research resources that advance the mission of the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). By research resources, we mean the multitude of data, software tools, materials, services, projects and organizations available to researchers in the public domain. Most of these are accessed via web-accessible databases or web portals, each developed, designed and maintained by numerous different projects, organizations and individuals. While many of the large government funded databases, maintained by agencies such as European Bioinformatics Institute and the National Center for Biotechnology Information, are well known to researchers, many more that have been developed by and for the biomedical research community are unknown or underutilized. At least part of the problem is the nature of dynamic databases, which are considered part of the "hidden" web, that is, content that is not easily accessed by search engines. dkNET was created specifically to address the challenge of connecting researchers to research resources via these types of community databases and web portals. dkNET functions as a "search engine for data", searching across millions of database records contained in hundreds of biomedical databases developed and maintained by independent projects around the world. A primary focus of dkNET are centers and projects specifically created to provide high quality data and resources to NIDDK researchers. Through the novel data ingest process used in dkNET, additional data sources can easily be incorporated, allowing it to scale with the growth of digital data and the needs of the dkNET community. Here, we provide an overview of the dkNET portal and its functions. 
We show how dkNET can be used to address a variety of use cases that involve searching for research resources.
Affiliation(s)
- Patricia L. Whetzel
- Center for Research in Biological Systems, University of California, San Diego, San Diego, California, United States of America
- Jeffrey S. Grethe
- Center for Research in Biological Systems, University of California, San Diego, San Diego, California, United States of America
- Davis E. Banks
- Center for Research in Biological Systems, University of California, San Diego, San Diego, California, United States of America
- Maryann E. Martone
- Center for Research in Biological Systems, University of California, San Diego, San Diego, California, United States of America
- Dept of Neurosciences, University of California, San Diego, San Diego, California, United States of America
11
Abstract
Medical and scientific advances are predicated on new knowledge that is robust and reliable and that serves as a solid foundation on which further advances can be built. In biomedical research, we are in the midst of a revolution with the generation of new data and scientific publications at a previously unprecedented rate. However, unfortunately, there is compelling evidence that the majority of these discoveries will not stand the test of time. To a large extent, this reproducibility crisis in basic and preclinical research may be as a result of failure to adhere to good scientific practice and the desperation to publish or perish. This is a multifaceted, multistakeholder problem. No single party is solely responsible, and no single solution will suffice. Here we review the reproducibility problems in basic and preclinical biomedical research, highlight some of the complexities, and discuss potential solutions that may help improve research quality and reproducibility.
Affiliation(s)
- C. Glenn Begley
- TetraLogic Pharmaceuticals Corporation, Malvern, PA; and Departments of Medicine and Health Research and Policy, Stanford University School of Medicine, Department of Statistics, Stanford University School of Humanities and Sciences, and Meta-Research Innovation Center at Stanford (METRICS), CA
- John P.A. Ioannidis
- TetraLogic Pharmaceuticals Corporation, Malvern, PA; and Departments of Medicine and Health Research and Policy, Stanford University School of Medicine, Department of Statistics, Stanford University School of Humanities and Sciences, and Meta-Research Innovation Center at Stanford (METRICS), CA
12
Marenco LN, Wang R, Bandrowski AE, Grethe JS, Shepherd GM, Miller PL. Extending the NIF DISCO framework to automate complex workflow: coordinating the harvest and integration of data from diverse neuroscience information resources. Front Neuroinform 2014; 8:58. [PMID: 25018728 PMCID: PMC4071641 DOI: 10.3389/fninf.2014.00058] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 01/22/2014] [Accepted: 05/06/2014] [Indexed: 11/15/2022] Open
Abstract
This paper describes how DISCO, the data aggregator that supports the Neuroscience Information Framework (NIF), has been extended to play a central role in automating the complex workflow required to support and coordinate the NIF’s data integration capabilities. The NIF is an NIH Neuroscience Blueprint initiative designed to help researchers access the wealth of data related to the neurosciences available via the Internet. A central component is the NIF Federation, a searchable database that currently contains data from 231 data and information resources regularly harvested, updated, and warehoused in the DISCO system. In the past several years, DISCO has greatly extended its functionality and has evolved to play a central role in automating the complex, ongoing process of harvesting, validating, integrating, and displaying neuroscience data from a growing set of participating resources. This paper provides an overview of DISCO’s current capabilities and discusses a number of the challenges and future directions related to the process of coordinating the integration of neuroscience data within the NIF Federation.
Affiliation(s)
- Luis N Marenco
- Center for Medical Informatics, Yale University School of Medicine, New Haven, CT, USA; VA Connecticut Healthcare System, US Department of Veterans Affairs, West Haven, CT, USA; Department of Neurobiology, Yale University School of Medicine, New Haven, CT, USA
- Rixin Wang
- Center for Medical Informatics, Yale University School of Medicine, New Haven, CT, USA
- Anita E Bandrowski
- Department of Neurosciences, Center for Research in Biological Systems, University of California at San Diego, La Jolla, CA, USA
- Jeffrey S Grethe
- Department of Neurosciences, Center for Research in Biological Systems, University of California at San Diego, La Jolla, CA, USA
- Gordon M Shepherd
- Department of Neurobiology, Yale University School of Medicine, New Haven, CT, USA
- Perry L Miller
- Center for Medical Informatics, Yale University School of Medicine, New Haven, CT, USA; VA Connecticut Healthcare System, US Department of Veterans Affairs, West Haven, CT, USA; Department of Anesthesiology, Yale University School of Medicine, New Haven, CT, USA; Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT, USA
13
Rane P, Haselgrove C, Hodge SM, Frazier JA, Kennedy DN. Structure-centered portal for child psychiatry research. Front Neuroinform 2014; 8:47. [PMID: 24817850 PMCID: PMC4012203 DOI: 10.3389/fninf.2014.00047] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/19/2013] [Accepted: 04/07/2014] [Indexed: 11/13/2022] Open
Abstract
The real world needs of the clinical community require a domain-specific solution to integrate disparate information available from various web-based resources for data, materials, and tools into routine clinical and clinical research setting. We present a child-psychiatry oriented portal as an effort to deliver a knowledge environment wrapper that provides organization and integration of multiple information and data sources. Organized semantically by resource context, the portal groups information sources by context type, and permits the user to interactively “narrow” or “broaden” the scope of the information resources that are available and relevant to the specific context. The overall objective of the portal is to bring information from multiple complex resources into a simple single uniform framework and present it to the user in a single window format.
Affiliation(s)
- Pallavi Rane
- Child and Adolescent NeuroDevelopment Initiative, Department of Psychiatry, University of Massachusetts Medical School, Worcester, MA, USA
- Christian Haselgrove
- Child and Adolescent NeuroDevelopment Initiative, Department of Psychiatry, University of Massachusetts Medical School, Worcester, MA, USA
- Steven M Hodge
- Child and Adolescent NeuroDevelopment Initiative, Department of Psychiatry, University of Massachusetts Medical School, Worcester, MA, USA
- Jean A Frazier
- Child and Adolescent NeuroDevelopment Initiative, Department of Psychiatry, University of Massachusetts Medical School, Worcester, MA, USA
- David N Kennedy
- Child and Adolescent NeuroDevelopment Initiative, Department of Psychiatry, University of Massachusetts Medical School, Worcester, MA, USA
14
Abstract
We at Brain and Behavior are happy, for one, that data sharing is now here.
15
Abstract
The Mouse Phenome Database (MPD; phenome.jax.org) was launched in 2001 as the data coordination center for the international Mouse Phenome Project. MPD integrates quantitative phenotype, gene expression and genotype data into a common annotated framework to facilitate query and analysis. MPD contains >3500 phenotype measurements or traits relevant to human health, including cancer, aging, cardiovascular disorders, obesity, infectious disease susceptibility, blood disorders, neurosensory disorders, drug addiction and toxicity. Since our 2012 NAR report, we have added >70 new data sets, including data from Collaborative Cross lines and Diversity Outbred mice. During this time we have completely revamped our homepage, improved search and navigational aspects of the MPD application, developed several web-enabled data analysis and visualization tools, annotated phenotype data to public ontologies, developed an ontology browser and released new single nucleotide polymorphism query functionality with much higher density coverage than before. Here, we summarize recent data acquisitions and describe our latest improvements.
Affiliation(s)
- Stephen C Grubb
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
|
16
|
Abstract
The past 25 years have seen great progress in parcellating the cerebral cortex into a mosaic of many distinct areas in mice, monkeys, and humans. Quantitative studies of interareal connectivity have revealed unexpectedly many pathways and a wide range of connection strengths in mouse and macaque cortex. In humans, advances in analyzing "structural" and "functional" connectivity using powerful but indirect noninvasive neuroimaging methods are yielding intriguing insights about brain circuits, their variability across individuals, and their relationship to behavior.
Affiliation(s)
- David C Van Essen
- Anatomy and Neurobiology Department, Washington University in St. Louis, St. Louis, MO 63110, USA.
|
17
|
Maynard SM, Mungall CJ, Lewis SE, Imam FT, Martone ME. A knowledge based approach to matching human neurodegenerative disease and animal models. Front Neuroinform 2013; 7:7. [PMID: 23717278 PMCID: PMC3653101 DOI: 10.3389/fninf.2013.00007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/22/2012] [Accepted: 04/09/2013] [Indexed: 12/19/2022]
Abstract
Neurodegenerative diseases present a wide and complex range of biological and clinical features. Animal models are key to translational research, yet typically only exhibit a subset of disease features rather than being precise replicas of the disease. Consequently, connecting animal to human conditions using direct data-mining strategies has proven challenging, particularly for diseases of the nervous system, with its complicated anatomy and physiology. To address this challenge we have explored the use of ontologies to create formal descriptions of structural phenotypes across scales that are machine processable and amenable to logical inference. As proof of concept, we built a Neurodegenerative Disease Phenotype Ontology (NDPO) and an associated Phenotype Knowledge Base (PKB) using an entity-quality model that incorporates descriptions for both human disease phenotypes and those of animal models. Entities are drawn from community ontologies made available through the Neuroscience Information Framework (NIF) and qualities are drawn from the Phenotype and Trait Ontology (PATO). We generated ~1200 structured phenotype statements describing structural alterations at the subcellular, cellular and gross anatomical levels observed in 11 human neurodegenerative conditions and associated animal models. PhenoSim, an open source tool for comparing phenotypes, was used to issue a series of competency questions to compare individual phenotypes among organisms and to determine which animal models recapitulate phenotypic aspects of the human disease in aggregate. Overall, the system was able to use relationships within the ontology to bridge phenotypes across scales, returning non-trivial matches based on common subsumers that were meaningful to a neuroscientist with an advanced knowledge of neuroanatomy. The system can be used both to compare individual phenotypes and also phenotypes in aggregate. This proof of concept suggests that expressing complex phenotypes using formal ontologies provides considerable benefit for comparing phenotypes across scales and species.
Affiliation(s)
- Sarah M Maynard
- Department of Neurosciences, Center for Research in Biological Systems, University of California San Diego, San Diego, CA, USA
|