1
Katz DS, Chue Hong NP. Special issue on software citation, indexing, and discoverability. PeerJ Comput Sci 2024; 10:e1951. [PMID: 38660149 PMCID: PMC11042024 DOI: 10.7717/peerj-cs.1951] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 02/29/2024] [Indexed: 04/26/2024]
Abstract
Software plays a fundamental role in research as a tool, an output, or even as an object of study. This special issue on software citation, indexing, and discoverability brings together five papers examining different aspects of how the use of software is recorded and made available to others. The papers describe new work on datasets that enable large-scale analysis of the evolution of software usage and citation, present evidence of increased citation rates when software artifacts are released, provide guidance for registries and repositories to support software citation and findability, and show that there are still barriers to improving and formalising software citation and publication practice. As the use of software increases further, driven by modern research methods, addressing the barriers to software citation and discoverability will encourage greater sharing and reuse of software, in turn enabling research progress.
Affiliation(s)
- Daniel S. Katz
- National Center for Supercomputing Applications, Department of Computer Science, Department of Electrical and Computer Engineering, School of Information Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Neil P. Chue Hong
- Edinburgh Parallel Computing Centre, University of Edinburgh, Edinburgh, United Kingdom
- Software Sustainability Institute, University of Edinburgh, Edinburgh, United Kingdom
2
Azzi R, Bordea G, Griffier R, Nikiema JN, Mougin F. Enriching the FIDEO ontology with food-drug interactions from online knowledge sources. J Biomed Semantics 2024; 15:1. [PMID: 38438913 PMCID: PMC10913206 DOI: 10.1186/s13326-024-00302-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 02/05/2024] [Indexed: 03/06/2024] Open
Abstract
The increasing number of articles on adverse interactions that may occur when specific foods are consumed with certain drugs makes it difficult to keep up with the latest findings. Conflicting information is available in the scientific literature and specialized knowledge bases because interactions are described in an unstructured or semi-structured format. The FIDEO ontology aims to integrate and represent information about food-drug interactions in a structured way. This article reports on the new version of this ontology in which more than 1700 interactions are integrated from two online resources: DrugBank and Hedrine. These food-drug interactions have been represented in FIDEO in the form of precompiled concepts, each of which specifies both the food and the drug involved. Additionally, competency questions that can be answered are reviewed, and avenues for further enrichment are discussed.
Affiliation(s)
- Rabia Azzi
- Univ. Bordeaux, Inserm, BPH, U1219, F-33000, Bordeaux, France
- CHU de Bordeaux, Service d'information médicale, F-33000, Bordeaux, France
- Georgeta Bordea
- Univ. Bordeaux, Inserm, BPH, U1219, F-33000, Bordeaux, France
- Univ. La Rochelle, L3i, F-17000, La Rochelle, France
- Romain Griffier
- Univ. Bordeaux, Inserm, BPH, U1219, F-33000, Bordeaux, France
- CHU de Bordeaux, Service d'information médicale, F-33000, Bordeaux, France
- Jean Noël Nikiema
- Department of Management, Evaluation and Health Policy, School of Public Health, Université de Montréal, Québec, Canada
- Fleur Mougin
- Univ. Bordeaux, Inserm, BPH, U1219, F-33000, Bordeaux, France.
3
van Swieten MMH, Haselgrove C. Editorial: Navigating the landscape of FAIR data sharing and reuse: repositories, standards, and resources. Front Neuroinform 2024; 18:1387758. [PMID: 38495843 PMCID: PMC10943951 DOI: 10.3389/fninf.2024.1387758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2024] [Accepted: 02/20/2024] [Indexed: 03/19/2024] Open
4
Stellmach C, Muzoora MR. How to Assess FAIRness of Your Data - A Summary of Testing Two FAIR Validators. Stud Health Technol Inform 2024; 310:154-158. [PMID: 38269784 DOI: 10.3233/shti230946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2024]
Abstract
Decision-making in healthcare is heavily reliant on data that is findable, accessible, interoperable and reusable (FAIR). Evolving advancements in genomics also heavily rely on FAIR data to steer reliable research for the future. For practical purposes, ensuring FAIRness of a clinical data set can be challenging but could be aided by using FAIR validators. The study describes the test of two open-access web tools in their demo versions to determine the FAIR levels of three submitted genomic data files with different formats (JSON, TXT, CSV). The F-UJI and FAIR-Checker tools provided similar FAIR scores for the three submitted files. However, the F-UJI tool assigned a total rating whereas the FAIR-Checker gave scores clustered by FAIR principles. Neither tool was suited to determine FAIR levels of a FHIR® JSON metadata file. Despite their early developmental status, FAIR validator tools have great potential to assist clinicians in the FAIRification of their research data.
5
Pribec I, Hachinger S, Hayek M, Pringle GJ, Brüchle H, Jamitzky F, Mathias G. Efficient and Reliable Data Management for Biomedical Applications. Methods Mol Biol 2024; 2716:383-403. [PMID: 37702950 DOI: 10.1007/978-1-0716-3449-3_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2023]
Abstract
This chapter discusses the challenges and requirements of modern Research Data Management (RDM), particularly for biomedical applications in the context of high-performance computing (HPC). The FAIR data principles (Findable, Accessible, Interoperable, Reusable) are of special importance. Data formats, publication platforms, annotation schemata, automated data management and staging, the data infrastructure in HPC centers, file transfer and staging methods in HPC, and the EUDAT components are discussed. Tools and approaches for automated data movement and replication in cross-center workflows are explained, as well as the development of ontologies for structuring and quality-checking of metadata in computational biomedicine. The CompBioMed project is used as a real-world example of implementing these principles and tools in practice. The LEXIS project has built a workflow-execution and data management platform that follows the paradigm of HPC-Cloud convergence for demanding Big Data applications. It is used for orchestrating workflows with YORC, utilizing the data documentation initiative (DDI) and distributed computing resources (DCI). The platform is accessed by a user-friendly LEXIS portal for workflow and data management, making HPC and Cloud Computing significantly more accessible. Checkpointing, duplicate runs, and spare images of the data are used to create resilient workflows. The CompBioMed project is completing the implementation of such a workflow, using data replication and brokering, which will enable urgent computing on exascale platforms.
Affiliation(s)
- Ivan Pribec
- Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities (LRZ-BAdW), Munich, Germany
- Stephan Hachinger
- Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities (LRZ-BAdW), Munich, Germany
- Mohamad Hayek
- Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities (LRZ-BAdW), Munich, Germany
- Helmut Brüchle
- Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities (LRZ-BAdW), Munich, Germany
- Ferdinand Jamitzky
- Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities (LRZ-BAdW), Munich, Germany
- Gerald Mathias
- Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities (LRZ-BAdW), Munich, Germany.
6
Pedrera-Jiménez M, García-Barrio N, Frid S, Moner D, Boscá-Tomás D, Lozano-Rubí R, Kalra D, Beale T, Muñoz-Carrero A, Serrano-Balazote P. Can OpenEHR, ISO 13606, and HL7 FHIR Work Together? An Agnostic Approach for the Selection and Application of Electronic Health Record Standards to the Next-Generation Health Data Spaces. J Med Internet Res 2023; 25:e48702. [PMID: 38153779 PMCID: PMC10784985 DOI: 10.2196/48702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 09/15/2023] [Accepted: 11/27/2023] [Indexed: 12/29/2023] Open
Abstract
In order to maximize the value of electronic health records (EHRs) for both health care and secondary use, it is necessary for the data to be interoperable and reusable without loss of the original meaning and context, in accordance with the findable, accessible, interoperable, and reusable (FAIR) principles. To achieve this, it is essential for health data platforms to incorporate standards that facilitate addressing needs such as formal modeling of clinical knowledge (health domain concepts) as well as the harmonized persistence, query, and exchange of data across different information systems and organizations. However, the selection of these specifications has not been consistent across the different health data initiatives, often applying standards to address needs for which they were not originally designed. This issue is essential in the current scenario of implementing the European Health Data Space, which advocates harmonization, interoperability, and reuse of data without regulating the specific standards to be applied for this purpose. Therefore, this viewpoint aims to establish a coherent, agnostic, and homogeneous framework for the use of the most impactful EHR standards in the new-generation health data spaces: OpenEHR, International Organization for Standardization (ISO) 13606, and Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR). Thus, a panel of EHR standards experts has discussed several critical points to reach a consensus that will serve decision-making teams in health data platform projects who may not be experts in these EHR standards. It was concluded that these specifications possess different capabilities related to modeling, flexibility, and implementation resources. Because of this, in the design of future data platforms, these standards must be applied based on the specific needs they were designed for, being likewise fully compatible with their combined functional and technical implementation.
Affiliation(s)
- Miguel Pedrera-Jiménez
- Data Science Unit, Hospital Universitario 12 de Octubre, Madrid, Spain
- ETSI Telecomunicación, Universidad Politécnica de Madrid, Madrid, Spain
- Santiago Frid
- Medical Informatics Unit, Hospital Clinic de Barcelona, Barcelona, Spain
- Dipak Kalra
- The European Institute for Innovation through Health Data, Gent, Belgium
- Adolfo Muñoz-Carrero
- Telemedicine and Digital Health Research Unit, Instituto de Salud Carlos III, Madrid, Spain
7
Rahimzadeh V, Jones KM, Majumder MA, Kahana MJ, Rutishauser U, Williams ZM, Cash SS, Paulk AC, Zheng J, Beauchamp MS, Collinger JL, Pouratian N, McGuire AL, Sheth SA. Benefits of sharing neurophysiology data from the BRAIN Initiative Research Opportunities in Humans Consortium. Neuron 2023; 111:3710-3715. [PMID: 37944519 PMCID: PMC10995938 DOI: 10.1016/j.neuron.2023.09.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 09/21/2023] [Accepted: 09/21/2023] [Indexed: 11/12/2023]
Abstract
Sharing human brain data can yield scientific benefits, but because of various disincentives, only a fraction of these data is currently shared. We profile three successful data-sharing experiences from the NIH BRAIN Initiative Research Opportunities in Humans (ROH) Consortium and demonstrate benefits to data producers and to users.
Affiliation(s)
- Vasiliki Rahimzadeh
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, TX 77030, USA
- Kathryn Maxson Jones
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, TX 77030, USA; Department of History, Purdue University, West Lafayette, IN 47907, USA
- Mary A Majumder
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, TX 77030, USA
- Michael J Kahana
- Department of Psychology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Ueli Rutishauser
- Department of Neurosurgery, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Ziv M Williams
- Department of Neurosurgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Sydney S Cash
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Angelique C Paulk
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Jie Zheng
- Department of Ophthalmology, Boston Children's Hospital, Boston, MA 02115, USA
- Michael S Beauchamp
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Jennifer L Collinger
- Rehab Neural Engineering Labs, Department of Physical Medicine and Rehabilitation, University of Pittsburgh, Pittsburgh, PA 15219, USA
- Nader Pouratian
- Department of Neurological Surgery, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Amy L McGuire
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, TX 77030, USA
- Sameer A Sheth
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX 77030, USA.
8
Müller H, Lopes-Dias C, Holub P, Plass M, Jungwirth E, Reihs R, Torke PR, Malatras A, Berger A, Coombs H, Dillner J, Merino-Martinez R. BIBBOX, a FAIR toolbox and App Store for life science research. N Biotechnol 2023; 77:12-19. [PMID: 37295722 DOI: 10.1016/j.nbt.2023.06.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 06/05/2023] [Accepted: 06/06/2023] [Indexed: 06/12/2023]
Abstract
Data quality has recently become a critical topic for the research community. European guidelines recommend that scientific data should be made FAIR: findable, accessible, interoperable and reusable. However, as FAIR guidelines do not specify how the stated principles should be implemented, it might not be straightforward for researchers to know how to actually make their data FAIR. This can prevent life-science researchers from sharing their datasets and pipelines, ultimately hindering the progress of research. To address this difficulty, we developed the BIBBOX, a platform that supports researchers in publishing their datasets and the associated software in a FAIR manner.
Affiliation(s)
- Heimo Müller
- Medical University of Graz, Neue Stiftingtalstraße 6, A-8010 Graz, Austria.
- Petr Holub
- BBMRI-ERIC, Neue Stiftingtalstraße 2/B/6, A-8010 Graz, Austria
- Markus Plass
- Medical University of Graz, Neue Stiftingtalstraße 6, A-8010 Graz, Austria
- Emilian Jungwirth
- Medical University of Graz, Neue Stiftingtalstraße 6, A-8010 Graz, Austria
- Robert Reihs
- Medical University of Graz, Neue Stiftingtalstraße 6, A-8010 Graz, Austria
- Paul R Torke
- Medical University of Graz, Neue Stiftingtalstraße 6, A-8010 Graz, Austria
- Anouk Berger
- International Agency for Research on Cancer (IARC), 25 avenue Tony Garnier, 69366 Lyon, France
- Heather Coombs
- International Agency for Research on Cancer (IARC), 25 avenue Tony Garnier, 69366 Lyon, France
- Joakim Dillner
- Karolinska Institutet, Alfred Nobels Allé 8, 14152 Huddinge, Sweden
9
DuBois JM, Mozersky J, Parsons M, Walsh HA, Friedrich A, Pienta A. Exchanging words: Engaging the challenges of sharing qualitative research data. Proc Natl Acad Sci U S A 2023; 120:e2206981120. [PMID: 37831745 PMCID: PMC10614603 DOI: 10.1073/pnas.2206981120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2023] Open
Abstract
In January 2023, a new NIH policy on data sharing went into effect. The policy applies to both quantitative and qualitative research (QR) data such as data from interviews or focus groups. QR data are often sensitive and difficult to deidentify, and thus have rarely been shared in the United States. Over the past 5 years, our research team has engaged stakeholders on QR data sharing, developed software to support data deidentification, produced guidance, and collaborated with the ICPSR data repository to pilot the deposit of 30 QR datasets. In this perspective article, we share important lessons learned by addressing eight clusters of questions on issues such as where, when, and what to share; how to deidentify data and support high-quality secondary use; budgeting for data sharing; and the permissions needed to share data. We also offer a brief assessment of the state of preparedness of data repositories, QR journals, and QR textbooks to support data sharing. While QR data sharing could yield important benefits to the research community, we quickly need to develop enforceable standards, expertise, and resources to support responsible QR data sharing. Absent these resources, we risk violating participant confidentiality and wasting a significant amount of time and funding on data that are not useful for either secondary use or data transparency and verification.
Affiliation(s)
- James M. DuBois
- Bioethics Research Center, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110
- Jessica Mozersky
- Bioethics Research Center, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110
- Meredith Parsons
- Bioethics Research Center, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110
- Heidi A. Walsh
- Bioethics Research Center, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110
- Annie Friedrich
- Bioethics Research Center, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110
- Amy Pienta
- ICPSR, Institute for Social Research, University of Michigan, Ann Arbor, MI 48106
10
Avila Santos AP, Kabiru Nata'ala M, Kasmanas JC, Bartholomäus A, Keller-Costa T, Jurburg SD, Tal T, Camarinha-Silva A, Saraiva JP, Ponce de Leon Ferreira de Carvalho AC, Stadler PF, Sipoli Sanches D, Rocha U. The AnimalAssociatedMetagenomeDB reveals a bias towards livestock and developed countries and blind spots in functional-potential studies of animal-associated microbiomes. Anim Microbiome 2023; 5:48. [PMID: 37798675 PMCID: PMC10552293 DOI: 10.1186/s42523-023-00267-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/09/2022] [Accepted: 09/18/2023] [Indexed: 10/07/2023] Open
Abstract
BACKGROUND: Metagenomic data can shed light on animal-microbiome relationships and the functional potential of these communities. Over the past years, the generation of metagenomics data has increased exponentially, and so has the availability and reusability of data present in public repositories. However, identifying which datasets and associated metadata are available is not straightforward. We created the Animal-Associated Metagenome Metadata Database (AnimalAssociatedMetagenomeDB - AAMDB) to facilitate the identification and reuse of publicly available non-human, animal-associated metagenomic data, and metadata. Further, we used the AAMDB to (i) annotate common and scientific names of the species; (ii) determine the fraction of vertebrates and invertebrates; (iii) study their biogeography; and (iv) specify whether the animals were wild, pets, livestock or used for medical research. RESULTS: We manually selected metagenomes associated with non-human animals from SRA and MG-RAST. Next, we standardized and curated 51 metadata attributes (e.g., host, compartment, geographic coordinates, and country). The AAMDB version 1.0 contains 10,885 metagenomes associated with 165 different species from 65 different countries. From the collected metagenomes, 51.1% were recovered from animals associated with medical research or grown for human consumption (i.e., mice, rats, cattle, pigs, and poultry). Further, we observed an over-representation of animals collected in temperate regions (89.2%) and a lower representation of samples from the polar zones, with only 11 samples in total. The most common genus among invertebrate animals was Trichocerca (rotifers). CONCLUSION: Our work may guide host species selection in novel animal-associated metagenome research, especially in biodiversity and conservation studies. The data available in our database will allow scientists to perform meta-analyses and test new hypotheses (e.g., host-specificity, strain heterogeneity, and biogeography of animal-associated metagenomes), leveraging existing data. The AAMDB WebApp is a user-friendly interface that is publicly available at https://webapp.ufz.de/aamdb/.
Affiliation(s)
- Anderson Paulo Avila Santos
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research - UFZ GmbH, 04318, Leipzig, Germany
- Institute of Mathematics and Computer Sciences, University of Sao Paulo, Sao Carlos, Brazil
- Muhammad Kabiru Nata'ala
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research - UFZ GmbH, 04318, Leipzig, Germany
- Department of Computer Science and Interdisciplinary Centre of Bioinformatics, University of Leipzig, Härtelstraße 16-18, 04107, Leipzig, Saxony, Germany
- Jonas Coelho Kasmanas
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research - UFZ GmbH, 04318, Leipzig, Germany
- Department of Computer Science and Interdisciplinary Centre of Bioinformatics, University of Leipzig, Härtelstraße 16-18, 04107, Leipzig, Saxony, Germany
- Institute of Mathematics and Computer Sciences, University of Sao Paulo, Sao Carlos, Brazil
- Alexander Bartholomäus
- GFZ German Research Centre for Geosciences, Section 3.7 Geomicrobiology, 14473, Telegrafenberg, Potsdam, Germany
- Tina Keller-Costa
- Institute for Bioengineering and Biosciences (iBB) and Institute for Health and Bioeconomy (i4HB), Instituto Superior Tecnico (IST), Universidade de Lisboa, Lisbon, 1049-001, Portugal
- Stephanie D Jurburg
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research - UFZ GmbH, 04318, Leipzig, Germany
- German Centre of Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Puschstraße 4, Leipzig, 04103, Germany
- Tamara Tal
- Department of Bioanalytical Ecotoxicology, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
- Amélia Camarinha-Silva
- Hohenheim Center for Livestock Microbiome Research (HoLMiR), University of Hohenheim, Stuttgart, Germany
- Institute of Animal Science, University of Hohenheim, Stuttgart, Germany
- João Pedro Saraiva
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research - UFZ GmbH, 04318, Leipzig, Germany
- Peter F Stadler
- Department of Computer Science and Interdisciplinary Centre of Bioinformatics, University of Leipzig, Härtelstraße 16-18, 04107, Leipzig, Saxony, Germany
- Max Planck Institute for Mathematics in the Sciences, Inselstraße, 04103, Leipzig, Germany
- Institute for Theoretical Chemistry, Universität Wien, Währingerstraße 17, Vienna, A-1090, Austria
- Center for Scalable Data Analytics and Artificial Intelligence Dresden-Leipzig, Leipzig University, Leipzig, Germany
- Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá, Bogotá, Colombia
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- The Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM, 87501, USA
- Ulisses Rocha
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research - UFZ GmbH, 04318, Leipzig, Germany.
11
Qin J, Bratt S, Hemsley J, Smith A, Liu Q. A FAIR Data Ecosystem for Science of Science. Proc Assoc Inf Sci Technol 2023; 60:1107-1109. [PMID: 38584609 PMCID: PMC10993287 DOI: 10.1002/pra2.960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/03/2023] [Indexed: 04/09/2024]
Abstract
This poster discusses Automated Research Workflows (ARWs) in the context of a FAIR data ecosystem for the science of science research. We offer a conceptual discussion from the point of view of information science and technology using several cases of "data problems" in the science of science research to illustrate the characteristics and expectations for designers and developers of a FAIR data ecosystem. Drawing from a 10-year data science project developing GenBank metadata workflows, we incorporate the ideas of ARWs into the FAIR data ecosystem discussion to set a broader context and increase generalizability. Researchers can use these as a guide for their data science projects to automate research workflows in the science of science domain and beyond.
12
Kroon-Batenburg LMJ. Making your raw data available to the macromolecular crystallography community. Acta Crystallogr F Struct Biol Commun 2023; 79:267-273. [PMID: 37815476 PMCID: PMC10565795 DOI: 10.1107/s2053230x23007987] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 09/12/2023] [Indexed: 10/11/2023] Open
Abstract
A recent editorial in the IUCr macromolecular crystallography journals [Helliwell et al. (2019), Acta Cryst. D75, 455-457] called for the implementation of the FAIR data principles. This implies that the authors of a paper that describes research on a macromolecular structure should make their raw diffraction data available. Authors are already used to submitting the derived data (coordinates) and the processed data (structure factors, merged or unmerged) to the PDB, but may still be uncomfortable with making the raw diffraction images available. In this paper, some guidelines and instructions on depositing raw data to Zenodo are given.
Affiliation(s)
- Loes M. J. Kroon-Batenburg
- Department of Chemistry, Structural Biochemistry, Bijvoet Center for Biomolecular Research, Faculty of Science, Utrecht University, Utrecht, The Netherlands
13
Aci-Sèche S, Bourg S, Bonnet P, Rebehmed J, de Brevern AG, Diharce J. A perspective on the sharing of docking data. Data Brief 2023; 49:109386. [PMID: 37492229 PMCID: PMC10365938 DOI: 10.1016/j.dib.2023.109386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Revised: 05/17/2023] [Accepted: 07/03/2023] [Indexed: 07/27/2023] Open
Abstract
Computational approaches are now widely applied in drug discovery projects. Among these, molecular docking is the most widely used for hit identification against a drug target protein. However, many scientists in the field have highlighted the lack of availability and reproducibility, for the community as a whole, of the data obtained from such studies. Consequently, sustaining and developing the efforts toward a large and fully transparent sharing of those data could be beneficial for all researchers in drug discovery. The purpose of this article is first to propose guidelines and recommendations on the appropriate way to conduct virtual screening experiments and second to depict the current state of sharing molecular docking data. In conclusion, we have explored and proposed several prospects to enhance data sharing from docking experiments that could be developed in the foreseeable future.
Affiliation(s)
- Samia Aci-Sèche
- Institut de Chimie Organique et Analytique (ICOA), UMR CNRS-Université d'Orléans 7311, Université d'Orléans BP 6759, Orléans Cedex 2, 45067, France
- Stéphane Bourg
- Institut de Chimie Organique et Analytique (ICOA), UMR CNRS-Université d'Orléans 7311, Université d'Orléans BP 6759, Orléans Cedex 2, 45067, France
- Pascal Bonnet
- Institut de Chimie Organique et Analytique (ICOA), UMR CNRS-Université d'Orléans 7311, Université d'Orléans BP 6759, Orléans Cedex 2, 45067, France
- Joseph Rebehmed
- Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon
- Alexandre G. de Brevern
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, Biologie Intégrée du Globule Rouge, UMR_S 1134, DSIMB Bioinformatics team, 75014 Paris, France
- Julien Diharce
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, Biologie Intégrée du Globule Rouge, UMR_S 1134, DSIMB Bioinformatics team, 75014 Paris, France
14
Alvarez-Romero C, Rodríguez-Mejias S, Parra-Calderón CL. Desiderata for the Data Governance and FAIR Principles Adoption in Health Data Hubs. Stud Health Technol Inform 2023; 305:164-167. [PMID: 37386986 DOI: 10.3233/shti230452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023]
Abstract
The objective of this study, conducted as part of the European HealthyCloud project, was to analyse the data management mechanisms of representative data hubs in Europe and determine whether they adopt the FAIR principles adequately enough to enable data discovery. A dedicated consultation survey was performed, and analysis of the results allowed us to generate a set of comprehensive recommendations and best practices so that these data hubs can be integrated into a data sharing ecosystem such as the future European Health Research and Innovation Cloud.
Affiliation(s)
- Celia Alvarez-Romero
  - Computational Health Informatics Group, Institute of Biomedicine of Seville, IBiS / Virgen del Rocío University Hospital / CSIC / University of Seville, Seville, Spain
- Silvia Rodríguez-Mejias
  - Computational Health Informatics Group, Institute of Biomedicine of Seville, IBiS / Virgen del Rocío University Hospital / CSIC / University of Seville, Seville, Spain
- Carlos Luis Parra-Calderón
  - Computational Health Informatics Group, Institute of Biomedicine of Seville, IBiS / Virgen del Rocío University Hospital / CSIC / University of Seville, Seville, Spain

15
Devignes MD, Smaïl-Tabbone M, Dhondge H, Dolcemascolo R, Gavaldá-García J, Higuera-Rodriguez RA, Kravchenko A, Roca Martínez J, Messini N, Pérez-Ràfols A, Pérez Ropero G, Sperotto L, Chauvot de Beauchêne I, Vranken W. Experiences with a training DSW knowledge model for early-stage researchers. Open Res Eur 2023; 3:97. [PMID: 37645489 PMCID: PMC10445825 DOI: 10.12688/openreseurope.15609.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/30/2023] [Indexed: 08/31/2023]
Abstract
Background: Data management is fast becoming an essential part of scientific practice, driven by open science and FAIR (findable, accessible, interoperable, and reusable) data sharing requirements. Whilst data management plans (DMPs) are clear to data management experts and data stewards, their purpose and creation are often obscure to the producers of the data, who in academic environments are often PhD students. Methods: Within the RNAct EU Horizon 2020 ITN project, we engaged the 10 RNAct early-stage researchers (ESRs) in a training project aimed at formulating a DMP. To do so, we used the Data Stewardship Wizard (DSW) framework and modified the existing Life Sciences Knowledge Model into a simplified version aimed at training young scientists, with computational or experimental backgrounds, in core data management principles. We collected feedback from the ESRs during this exercise. Results: Here, we introduce our new life-sciences training DMP template for young scientists. We report and discuss our experiences as principal investigators (PIs) and ESRs during this project and address the typical difficulties that are encountered in developing and understanding a DMP. Conclusions: We found that the DS-Wizard can also be an appropriate tool for DMP training, to get terminology and concepts across to researchers. Full training additionally requires an upstream step to present basic DMP concepts and a downstream step to publish a dataset in a (public) repository. Overall, the DS-Wizard tool was essential for our DMP training and we hope our efforts can be used in other projects.
Affiliation(s)
- Roswitha Dolcemascolo
  - Institute for Integrative Systems Biology (I2SysBio), CSIC - University of Valencia, Paterna, 46980, Spain
  - Department of Biotechnology, Polytechnic University of Valencia, Valencia, 46022, Spain
- Jose Gavaldá-García
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, 1050, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, 1050, Belgium
- R. Anahí Higuera-Rodriguez
  - Dynamic Biosensors GmbH, Munich, 81379, Germany
  - Department of Physics, School of Natural Sciences, Technical University of Munich, Garching, 85748, Germany
- Anna Kravchenko
  - Université de Lorraine, CNRS, Inria, LORIA, Nancy, F-5400, France
- Joel Roca Martínez
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, 1050, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, 1050, Belgium
- Niki Messini
  - Department of Bioscience, School of Natural Sciences, Technical University of Munich, Garching, 85748, Germany
- Anna Pérez-Ràfols
  - Giotto Biotech s.r.l., Florence, 50019, Italy
  - Magnetic Resonance Center (CERM), Department of Chemistry “Ugo Schiff”, University of Florence, Florence, 50019, Italy
- Guillermo Pérez Ropero
  - Department of Chemistry-BMC, Uppsala University, Uppsala, 75123, Sweden
  - Ridgeview Instruments AB, Uppsala, 75237, Sweden
- Luca Sperotto
  - Department of Bioscience, School of Natural Sciences, Technical University of Munich, Garching, 85748, Germany
- Wim Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, 1050, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, 1050, Belgium

16
Xu F, Juty N, Goble C, Jupp S, Parkinson H, Courtot M. Features of a FAIR vocabulary. J Biomed Semantics 2023; 14:6. [PMID: 37264430 DOI: 10.1186/s13326-023-00286-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Accepted: 04/27/2023] [Indexed: 06/03/2023] Open
Abstract
BACKGROUND The Findable, Accessible, Interoperable and Reusable (FAIR) Principles explicitly require the use of FAIR vocabularies, but what precisely constitutes a FAIR vocabulary remains unclear. Being able to define FAIR vocabularies, identify their features, and provide approaches for assessing vocabularies against those features can guide the development of vocabularies. RESULTS We differentiate data, data resources and vocabularies used for FAIR, examine the application of the FAIR Principles to vocabularies, align their requirements with the Open Biomedical Ontologies principles, and propose FAIR Vocabulary Features (FVFs). We also design assessment approaches for FAIR vocabularies by mapping the FVFs to existing FAIR assessment indicators. Finally, we demonstrate how they can be used for evaluating and improving vocabularies using exemplary biomedical vocabularies. CONCLUSIONS Our work proposes features of FAIR vocabularies and corresponding indicators for assessing the FAIR levels of different types of vocabularies, identifies use cases for vocabulary engineers, and guides the evolution of vocabularies.
Affiliation(s)
- Fuqi Xu
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SD, UK
- Nick Juty
  - The University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
- Carole Goble
  - The University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
- Simon Jupp
  - SciBite BioData Innovation Centre, Wellcome Genome Campus, Hinxton, CB10 1DR, UK
- Helen Parkinson
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SD, UK
- Mélanie Courtot
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SD, UK.
  - Ontario Institute for Cancer Research, 661 University Ave Suite 510, Toronto, M5G 0A3, Canada.
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada.

17
Busquet F, Laperrouze J, Jankovic K, Krsmanovic T, Ignasiak T, Leoni B, Apic G, Asole G, Guigó R, Marangio P, Palumbo E, Perez-Lluch S, Wucher V, Vlot AH, Anholt R, Mackay T, Escher BI, Grasse N, Huchthausen J, Massei R, Reemtsma T, Scholz S, Schüürmann G, Bondesson M, Cherbas P, Freedman JH, Glaholt S, Holsopple J, Jacobson SC, Kaufman T, Popodi E, Shaw JJ, Smoot S, Tennessen JM, Churchill G, von Clausbruch CC, Dickmeis T, Hayot G, Pace G, Peravali R, Weiss C, Cistjakova N, Liu X, Slaitas A, Brown JB, Ayerbe R, Cabellos J, Cerro-Gálvez E, Diez-Ortiz M, González V, Martínez R, Vives PS, Barnett R, Lawson T, Lee RG, Sostare E, Viant M, Grafström R, Hongisto V, Kohonen P, Patyra K, Bhaskar PK, Garmendia-Cedillos M, Farooq I, Oliver B, Pohida T, Salem G, Jacobson D, Andrews E, Barnard M, Čavoški A, Chaturvedi A, Colbourne JK, Epps DJT, Holden L, Jones MR, Li X, Müller F, Ormanin-Lewandowska A, Orsini L, Roberts R, Weber RJM, Zhou J, Chung ME, Sanchez JCG, Diwan GD, Singh G, Strähle U, Russell RB, Batista D, Sansone SA, Rocca-Serra P, Du Pasquier D, Lemkine G, Robin-Duchesne B, Tindall A. The Precision Toxicology Initiative. Toxicol Lett 2023:S0378-4274(23)00180-7. [PMID: 37211341 DOI: 10.1016/j.toxlet.2023.05.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/23/2023] [Revised: 05/01/2023] [Accepted: 05/09/2023] [Indexed: 05/23/2023]
Abstract
The goal of PrecisionTox is to overcome conceptual barriers to replacing traditional mammalian chemical safety testing by accelerating the discovery of evolutionarily conserved toxicity pathways that are shared by descent among humans and more distantly related animals. An international consortium is systematically testing the toxicological effects of a diverse set of chemicals on a suite of five model species comprising fruit flies, nematodes, water fleas, and embryos of clawed frogs and zebrafish along with human cell lines. Multiple forms of omics and comparative toxicology data are integrated to map the evolutionary origins of biomolecular interactions, which are predictive of adverse health effects, to major branches of the animal phylogeny. These conserved elements of adverse outcome pathways (AOPs) and their biomarkers are expected to provide mechanistic insight useful for regulating groups of chemicals based on their shared modes of action. PrecisionTox also aims to quantify risk variation within populations by recognizing susceptibility as a heritable trait that varies with genetic diversity. This initiative incorporates legal experts and collaborates with risk managers to address specific needs within European chemicals legislation, including the uptake of new approach methodologies (NAMs) for setting precise regulatory limits on toxic chemicals.
Affiliation(s)
- Nico Grasse
  - Helmholtz Centre for Environmental Research, DE

18
Krefting D, Anton G, Chaplinskaya-Sobol I, Hanss S, Hoffmann W, Hopff SM, Kraus M, Lorbeer R, Lorenz-Depiereux B, Illig T, Schäfer C, Schaller J, Stahl D, Valentin H, Heuschmann P, Vehreschild J. The Importance of Being FAIR and FAST - The Clinical Epidemiology and Study Platform of the German Network University Medicine (NUKLEUS). Stud Health Technol Inform 2023; 302:93-97. [PMID: 37203616 DOI: 10.3233/shti230071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/20/2023]
Abstract
The COVID-19 pandemic has created an urgent need to set up, conduct and analyze high-quality epidemiological studies within a very short time-scale to provide timely evidence on factors influencing the pandemic, e.g. COVID-19 severity and disease course. The comprehensive research infrastructure developed to run the German National Pandemic Cohort Network within the Network University Medicine is now maintained within a generic clinical epidemiology and study platform NUKLEUS. It is operated and subsequently extended to allow efficient joint planning, execution and evaluation of clinical and clinical-epidemiological studies. We aim to provide high-quality biomedical data and biospecimens and make its results widely available to the scientific community by implementing findability, accessibility, interoperability and reusability - i.e. following the FAIR guiding principles. Thus, NUKLEUS might serve as a role model for FAIR and fast implementation of clinical epidemiological studies within the setting of University Medical Centers and beyond.
Affiliation(s)
- Dagmar Krefting
  - Dpt. of Medical Informatics, University Medical Center Göttingen, German Center for Cardiovascular Research (DZHK) partner site Göttingen, Germany
  - Campus Institute Data Science (CIDAS), Georg-August-University Göttingen, Germany
- Gabi Anton
  - Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Irina Chaplinskaya-Sobol
  - Dpt. of Medical Informatics, University Medical Center Göttingen, German Center for Cardiovascular Research (DZHK) partner site Göttingen, Germany
- Sabine Hanss
  - Dpt. of Medical Informatics, University Medical Center Göttingen, German Center for Cardiovascular Research (DZHK) partner site Göttingen, Germany
- Wolfgang Hoffmann
  - Institute for Community Medicine, University Medicine Greifswald, Germany
- Sina M Hopff
  - Faculty of Medicine, University of Cologne, Department I of Internal Medicine, University Hospital Cologne, Germany
  - Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf, Cologne, Germany
- Monika Kraus
  - Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Roberto Lorbeer
  - Medical Heart Center and Institute of Computer-assisted Cardiovascular Medicine, Charité - Universitätsmedizin Berlin, Germany
- Bettina Lorenz-Depiereux
  - Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Thomas Illig
  - Hannover Unified Biobank, Hannover Medical School, Hannover, Germany
- Christian Schäfer
  - Institute of Clinical Chemistry and Laboratory Medicine, University Medicine Greifswald, Germany
- Jens Schaller
  - Medical Heart Center and Institute of Computer-assisted Cardiovascular Medicine, Charité - Universitätsmedizin Berlin, Germany
- Dana Stahl
  - Independent Trusted Third Party of the University Medicine Greifswald, Germany
- Heike Valentin
  - Independent Trusted Third Party of the University Medicine Greifswald, Germany
- Peter Heuschmann
  - Institute of Clinical Epidemiology and Biometry, University of Würzburg; Clinical Trial Center, University Hospital Würzburg, Germany
- Janne Vehreschild
  - Faculty of Medicine, University of Cologne, Department I of Internal Medicine, University Hospital Cologne, Germany
  - German Centre for Infection Research (DZIF), partner site Bonn-Cologne, Cologne, Department II for Internal Medicine, Hematology/Oncology, University Hospital Frankfurt, Frankfurt am Main, Germany

19
Martínez-García A, Alvarez-Romero C, Román-Villarán E, Bernabeu-Wittel M, Luis Parra-Calderón C. FAIR principles to improve the impact on health research management outcomes. Heliyon 2023; 9:e15733. [PMID: 37205991 PMCID: PMC10189186 DOI: 10.1016/j.heliyon.2023.e15733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 04/19/2023] [Accepted: 04/20/2023] [Indexed: 05/21/2023] Open
Abstract
Background The FAIR principles, under the open science paradigm, aim to improve the Findability, Accessibility, Interoperability and Reusability of digital data. In this sense, the FAIR4Health project aimed to apply the FAIR principles in the health research field. For this purpose, a workflow and a set of tools were developed to apply FAIR principles in health research datasets, and validated through the demonstration of the potential impact that this strategy has on health research management outcomes. Objective This paper aims to describe the analysis of the impact on health research management outcomes of the FAIR4Health solution. Methods To analyse the impact on health research management outcomes in terms of time and economic savings, a survey was designed and sent to experts on data management with expertise in the use of the FAIR4Health solution. Then, differences between the time and costs needed to perform the techniques with (i) standalone research, and (ii) using the proposed solution, were analyzed. Results In the context of the health research management outcomes, the survey analysis concluded that 56.57% of the time and 16800 EUR per month could be saved if the FAIR4Health solution is used. Conclusions Adopting principles in health research through the FAIR4Health solution saves time and, consequently, costs in the execution of research involving data management techniques.
Affiliation(s)
- Alicia Martínez-García
  - Computational Health Informatics Group, Institute of Biomedicine of Seville, IBiS/Virgen del Rocío University Hospital/CSIC/University of Seville, Seville, Spain
- Celia Alvarez-Romero
  - Computational Health Informatics Group, Institute of Biomedicine of Seville, IBiS/Virgen del Rocío University Hospital/CSIC/University of Seville, Seville, Spain
  - Corresponding author.
- Esther Román-Villarán
  - Computational Health Informatics Group, Institute of Biomedicine of Seville, IBiS/Virgen del Rocío University Hospital/CSIC/University of Seville, Seville, Spain
- Carlos Luis Parra-Calderón
  - Computational Health Informatics Group, Institute of Biomedicine of Seville, IBiS/Virgen del Rocío University Hospital/CSIC/University of Seville, Seville, Spain

20
Quille RVE, de Almeida FV, Ohara MY, Corrêa PLP, de Freitas LG, Alves-Souza SN, de Almeida JR, Davis M, Prakash G. Architecture of a Data Portal for Publishing and Delivering Open Data for Atmospheric Measurement. Int J Environ Res Public Health 2023; 20:5374. [PMID: 37047988 PMCID: PMC10094644 DOI: 10.3390/ijerph20075374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Revised: 03/12/2023] [Accepted: 03/14/2023] [Indexed: 06/19/2023]
Abstract
Atmospheric data are collected by researchers every day. Campaigns such as GOAmazon 2014/2015 and the Amazon Tall Tower Observatory collect essential data on aerosols, gases, cloud properties, and meteorological parameters in the Brazilian Amazon basin. These data products provide insights and essential information for analyzing and predicting natural processes. However, in Brazil, it is estimated that more than 80% of the scientific data collected are not published due to the lack of web portals that collect and store these data. This makes it difficult, or even impossible, to access and integrate the data, which can result in the loss of significant amounts of information and significantly affect the understanding of the overall data. To address this problem, we propose a data portal architecture and open data deployment that enable Big Data processing, human interaction, and download-oriented approaches with tools that help users catalog, publish and visualize atmospheric data. Thus, we describe the architecture developed, based on the experience of the Atmospheric Radiation Measurement Data Center, which incorporates the principles of FAIR, the infrastructure and content management system for managing scientific data. Partial results from the portal were tested with environmental data from contaminated areas at the University of São Paulo. Overall, this data portal creates more shared knowledge about atmospheric processes by providing users with access to open environmental data.
Affiliation(s)
- Rosa Virginia Encinas Quille
  - School of Arts, Sciences and Humanities, University of São Paulo, Rua Arlindo Béttio, 1000-Ermelino Matarazzo, São Paulo 03828-000, Brazil
  - Residues and Contaminated Areas Laboratory (LARC), Institute for Technological Research (IPT), Av. Prof. Almeida Prado, 532-Butantã, São Paulo 05508-901, Brazil
- Felipe Valencia de Almeida
  - Polytechnic School, University of São Paulo, Av. Prof. Luciano Gualberto, 380-Butantã, São Paulo 05508-010, Brazil
- Mauro Yuji Ohara
  - Polytechnic School, University of São Paulo, Av. Prof. Luciano Gualberto, 380-Butantã, São Paulo 05508-010, Brazil
- Pedro Luiz Pizzigatti Corrêa
  - School of Arts, Sciences and Humanities, University of São Paulo, Rua Arlindo Béttio, 1000-Ermelino Matarazzo, São Paulo 03828-000, Brazil
  - Polytechnic School, University of São Paulo, Av. Prof. Luciano Gualberto, 380-Butantã, São Paulo 05508-010, Brazil
- Leandro Gomes de Freitas
  - Residues and Contaminated Areas Laboratory (LARC), Institute for Technological Research (IPT), Av. Prof. Almeida Prado, 532-Butantã, São Paulo 05508-901, Brazil
- Solange Nice Alves-Souza
  - School of Arts, Sciences and Humanities, University of São Paulo, Rua Arlindo Béttio, 1000-Ermelino Matarazzo, São Paulo 03828-000, Brazil
  - Polytechnic School, University of São Paulo, Av. Prof. Luciano Gualberto, 380-Butantã, São Paulo 05508-010, Brazil
- Jorge Rady de Almeida
  - Polytechnic School, University of São Paulo, Av. Prof. Luciano Gualberto, 380-Butantã, São Paulo 05508-010, Brazil
- Maggie Davis
  - Environmental Sciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN 37831, USA
- Giri Prakash
  - Environmental Sciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN 37831, USA

21
Sinaci AA, Gencturk M, Teoman HA, Laleci Erturkmen GB, Alvarez-Romero C, Martinez-Garcia A, Poblador-Plou B, Carmona-Pírez J, Löbe M, Parra-Calderon CL. A Data Transformation Methodology to Create Findable, Accessible, Interoperable, and Reusable Health Data: Software Design, Development, and Evaluation Study. J Med Internet Res 2023; 25:e42822. [PMID: 36884270 PMCID: PMC10034606 DOI: 10.2196/42822] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/20/2022] [Revised: 01/04/2023] [Accepted: 01/31/2023] [Indexed: 03/09/2023] Open
Abstract
BACKGROUND Sharing health data is challenging because of several technical, ethical, and regulatory issues. The Findable, Accessible, Interoperable, and Reusable (FAIR) guiding principles have been conceptualized to enable data interoperability. Many studies provide implementation guidelines, assessment metrics, and software to achieve FAIR-compliant data, especially for health data sets. Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) is a health data content modeling and exchange standard. OBJECTIVE Our goal was to devise a new methodology to extract, transform, and load existing health data sets into HL7 FHIR repositories in line with FAIR principles, develop a Data Curation Tool to implement the methodology, and evaluate it on health data sets from 2 different but complementary institutions. We aimed to increase the level of compliance with FAIR principles of existing health data sets through standardization and facilitate health data sharing by eliminating the associated technical barriers. METHODS Our approach automatically processes the capabilities of a given FHIR end point and directs the user while configuring mappings according to the rules enforced by FHIR profile definitions. Code system mappings can be configured for terminology translations through automatic use of FHIR resources. The validity of the created FHIR resources can be automatically checked, and the software does not allow invalid resources to be persisted. At each stage of our data transformation methodology, we used particular FHIR-based techniques so that the resulting data set could be evaluated as FAIR. We performed a data-centric evaluation of our methodology on health data sets from 2 different institutions. RESULTS Through an intuitive graphical user interface, users are prompted to configure the mappings into FHIR resource types with respect to the restrictions of selected profiles. 
Once the mappings are developed, our approach can syntactically and semantically transform existing health data sets into HL7 FHIR without loss of data utility according to our privacy-concerned criteria. In addition to the mapped resource types, behind the scenes, we create additional FHIR resources to satisfy several FAIR criteria. According to the data maturity indicators and evaluation methods of the FAIR Data Maturity Model, we achieved the maximum level (level 5) for being Findable, Accessible, and Interoperable and level 3 for being Reusable. CONCLUSIONS We developed and extensively evaluated our data transformation approach to unlock the value of existing health data residing in disparate data silos to make them available for sharing according to the FAIR principles. We showed that our method can successfully transform existing health data sets into HL7 FHIR without loss of data utility, and the result is FAIR in terms of the FAIR Data Maturity Model. We support institutional migration to HL7 FHIR, which not only leads to FAIR data sharing but also eases the integration with different research networks.
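As an illustration of the kind of mapping step this abstract describes, the sketch below turns one row of a tabular data set into a dictionary shaped like an HL7 FHIR Patient resource and rejects rows that would yield an invalid resource. It is not the authors' Data Curation Tool: the column names (`patient_id`, `sex`, `dob`), the `SEX_TO_FHIR_GENDER` table, and the function `row_to_fhir_patient` are hypothetical, and the checks shown are far simpler than validation against a full FHIR profile.

```python
from datetime import date

# Hypothetical mapping from source codes to FHIR's administrative gender
# codes (male, female, other, unknown).
SEX_TO_FHIR_GENDER = {"M": "male", "F": "female", "O": "other", "U": "unknown"}


def row_to_fhir_patient(row: dict) -> dict:
    """Map one source row (hypothetical columns) to a FHIR-style Patient dict."""
    gender = SEX_TO_FHIR_GENDER.get(row["sex"])
    if gender is None:
        raise ValueError(f"unmapped sex code: {row['sex']!r}")
    # date.fromisoformat raises ValueError on malformed dates, so invalid
    # rows are rejected rather than stored.
    birth_date = date.fromisoformat(row["dob"]).isoformat()
    return {
        "resourceType": "Patient",
        "id": str(row["patient_id"]),
        "gender": gender,
        "birthDate": birth_date,
    }


patient = row_to_fhir_patient({"patient_id": 17, "sex": "F", "dob": "1980-04-02"})
print(patient)
```

Raising an error on an unmapped code or malformed date mirrors, in a very reduced form, the idea that invalid resources should not be persisted.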
Affiliation(s)
- A Anil Sinaci
  - Software Research & Development and Consultancy Corporation (SRDC), Cankaya, Turkey
- Mert Gencturk
  - Software Research & Development and Consultancy Corporation (SRDC), Cankaya, Turkey
  - Department of Computer Engineering, Middle East Technical University, Cankaya, Turkey
- Huseyin Alper Teoman
  - Software Research & Development and Consultancy Corporation (SRDC), Cankaya, Turkey
  - Department of Computer Engineering, Middle East Technical University, Cankaya, Turkey
- Celia Alvarez-Romero
  - Group of Computational Health Informatics, Institute of Biomedicine of Seville, Virgen del Rocío University Hospital, Spanish National Research Council, University of Seville, Seville, Spain
- Alicia Martinez-Garcia
  - Group of Computational Health Informatics, Institute of Biomedicine of Seville, Virgen del Rocío University Hospital, Spanish National Research Council, University of Seville, Seville, Spain
- Beatriz Poblador-Plou
  - EpiChron Research Group, Aragon Health Sciences Institute (IACS), Aragon Health Research Institute (IIS Aragon), Zaragoza, Spain
- Jonás Carmona-Pírez
  - EpiChron Research Group, Aragon Health Sciences Institute (IACS), Aragon Health Research Institute (IIS Aragon), Zaragoza, Spain
- Matthias Löbe
  - Institute for Medical Informatics, Statistics and Epidemiology (IMISE), University of Leipzig, Leipzig, Germany
- Carlos Luis Parra-Calderon
  - Group of Computational Health Informatics, Institute of Biomedicine of Seville, Virgen del Rocío University Hospital, Spanish National Research Council, University of Seville, Seville, Spain

22
Du X, Dastmalchi F, Ye H, Garrett TJ, Diller MA, Liu M, Hogan WR, Brochhausen M, Lemas DJ. Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software. Metabolomics 2023; 19:11. [PMID: 36745241 DOI: 10.1007/s11306-023-01974-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/08/2022] [Accepted: 01/20/2023] [Indexed: 02/07/2023]
Abstract
BACKGROUND Liquid chromatography-high resolution mass spectrometry (LC-HRMS) is a popular approach for metabolomics data acquisition and requires many data processing software tools. The FAIR Principles - Findability, Accessibility, Interoperability, and Reusability - were proposed to promote open science and reusable data management, and to maximize the benefit obtained from contemporary and formal scholarly digital publishing. More recently, the FAIR principles were extended to include Research Software (FAIR4RS). AIM OF REVIEW This study facilitates open science in metabolomics by providing an implementation solution for adopting FAIR4RS in LC-HRMS metabolomics data processing software. We believe our evaluation guidelines and results can help improve the FAIRness of research software. KEY SCIENTIFIC CONCEPTS OF REVIEW We evaluated 124 LC-HRMS metabolomics data processing software tools obtained from a systematic review and selected 61 for detailed evaluation using FAIR4RS-related criteria, which were extracted from the literature along with internal discussions. We assigned each criterion one or more FAIR4RS categories through discussion. The minimum, median, and maximum percentages of criteria fulfilled by the software were 21.6%, 47.7%, and 71.8%. Statistical analysis revealed no significant improvement in FAIRness over time. We identified four criteria that cover multiple FAIR4RS categories but had low fulfillment percentages: (1) no software had semantic annotation of key information; (2) only 6.3% of evaluated software were registered to Zenodo and received DOIs; (3) only 14.5% of selected software had official software containerization or a virtual machine; (4) only 16.7% of evaluated software had fully documented functions in code. Based on the results, we discuss improvement strategies and future directions.
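The summary statistics reported above (minimum, median, and maximum percentage of criteria fulfilled) can be computed along the lines of the following sketch. The three tools and their pass/fail results are invented purely for illustration and are not data from the study; `fulfillment_percent` is a hypothetical helper, not something the authors describe.

```python
from statistics import median

# Invented example data: for each tool, one boolean per evaluation criterion.
evaluations = {
    "tool_a": [True, False, False, False],
    "tool_b": [True, True, False, False],
    "tool_c": [True, True, True, False],
}


def fulfillment_percent(results: list) -> float:
    """Percentage of criteria that a single tool fulfils."""
    return 100.0 * sum(results) / len(results)


percents = [fulfillment_percent(r) for r in evaluations.values()]
summary = (min(percents), median(percents), max(percents))
print(summary)
```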
Collapse
Affiliation(s)
- Xinsong Du
- Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
| | - Farhad Dastmalchi
- Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
| | - Hao Ye
- Health Science Center Libraries, University of Florida, Florida, USA
| | - Timothy J Garrett
- Department of Pathology, Immunology and Laboratory Medicine, College of Medicine, University of Florida, Florida, USA
| | - Matthew A Diller
- Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
| | - Mei Liu
- Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
| | - William R Hogan
- Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
| | - Mathias Brochhausen
- Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, USA
| | - Dominick J Lemas
- Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA.
- Department of Obstetrics and Gynecology, University of Florida College of Medicine, Florida, Gainesville, United States.
- Center for Perinatal Outcomes Research, University of Florida College of Medicine, Gainesville, United States.
| |
|
23
|
Bittrich S, Bhikadiya C, Bi C, Chao H, Duarte JM, Dutta S, Fayazi M, Henry J, Khokhriakov I, Lowe R, Piehl DW, Segura J, Vallat B, Voigt M, Westbrook JD, Burley SK, Rose Y. RCSB Protein Data Bank: Efficient Searching and Simultaneous Access to One Million Computed Structure Models Alongside the PDB Structures Enabled by Architectural Advances. J Mol Biol 2023:167994. [PMID: 36738985 DOI: 10.1016/j.jmb.2023.167994] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 11/18/2022] [Revised: 01/27/2023] [Accepted: 01/28/2023] [Indexed: 02/05/2023]
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) provides open access to experimentally-determined three-dimensional (3D) structures of biomolecules. The RCSB PDB RCSB.org research-focused web portal is used annually by many millions of users around the world. They access biostructure information, run complex queries utilizing various search services (e.g., full-text, structural and chemical attribute, chemical, sequence, and structure similarity searches), and visualize macromolecules in 3D, all at no charge and with no limitations on data usage. Notwithstanding more than 24,000-fold growth of the PDB over the past five decades, experimentally-determined structures are only available for a small subset of the millions of proteins of known sequence. Recently developed machine learning software tools can predict 3D structures of proteins at accuracies comparable to lower-resolution experimental methods. The RCSB PDB now provides access to ∼1,000,000 Computed Structure Models (CSMs) of proteins coming from AlphaFold DB and the ModelArchive alongside ∼200,000 experimentally-determined PDB structures. Both CSMs and PDB structures are available on RCSB.org and via well-established RCSB PDB Data, Search, and 1D-Coordinates application programming interfaces (APIs). Simultaneous delivery of PDB data and CSMs provides users with access to complementary structural information across the human proteome and those of model organisms and selected pathogens. API enhancements are backwards-compatible and programmatic users can "opt in" to access CSMs with minimal effort. Herein, we describe modifications to RCSB PDB cyberinfrastructure required to support sixfold scaling of 3D biostructure data delivery and lay the groundwork for scaling to accommodate hundreds of millions of CSMs.
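The "opt in" behaviour described in the abstract can be illustrated with a small sketch that only builds a search request body and does not contact any server. The endpoint URL, the `full_text` service name, and the `results_content_type` option are assumptions based on the publicly documented RCSB PDB Search API and should be checked against the current documentation; `build_query` is a hypothetical helper, not part of any RCSB library.

```python
import json

# Assumed endpoint of the public RCSB PDB Search API; verify before use.
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def build_query(text, include_csms=False):
    """Build a full-text search request body (hypothetical helper).

    By default only experimentally determined PDB structures are requested.
    Passing include_csms=True opts in to Computed Structure Models as well.
    """
    content_types = ["experimental"]
    if include_csms:
        # Assumed option value for Computed Structure Models.
        content_types.append("computational")
    return {
        "query": {
            "type": "terminal",
            "service": "full_text",
            "parameters": {"value": text},
        },
        "return_type": "entry",
        "request_options": {"results_content_type": content_types},
    }


if __name__ == "__main__":
    # Print the JSON body that would be POSTed to SEARCH_URL.
    print(json.dumps(build_query("hemoglobin", include_csms=True), indent=2))
```

Because the default request leaves out the computational content type, existing clients would keep receiving experimental structures only, which is one way to read the backwards-compatibility claim in the abstract.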
Affiliation(s)
- Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA.
| | - Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Henry Chao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jose M Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Maryam Fayazi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jeremy Henry
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Igor Khokhriakov
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dennis W Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| |
|
24
|
Arend D, Scholz U, Lange M. The Plant Phenomics and Genomics Research Data Repository: An On-Premise Approach for FAIR-Compliant Data Acquisition. Methods Mol Biol 2023; 2703:3-22. [PMID: 37646933 DOI: 10.1007/978-1-0716-3389-2_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
Abstract
The FAIR data principle as a commitment to support long-term research data management is widely accepted in the scientific community. However, although many established infrastructures provide comprehensive and long-term stable services and platforms, a large quantity of research data is still hidden. Currently, high-throughput plant genomics and phenomics technologies are producing research data in abundance, the storage of which is not covered by established core databases. This concerns the data volume, for example, time series of images or high-resolution hyperspectral data; the quality of data formatting and annotation, e.g., with regard to structure and annotation specifications of core databases; uncovered data domains; or organizational constraints prohibiting primary data storage outside institutional boundaries. To share these potentially dark data in a FAIR way and master these challenges the ELIXIR Germany/de.NBI service Plant Genomic and Phenomics Research Data Repository (PGP) implements an on-premise approach, which allows research data to be kept in place and wrapped in FAIR-aware software infrastructure. In this chapter, the e!DAL infrastructure software and the PGP repository are presented as best practice on how to easily setup FAIR-compliant and intuitive research data services.
Affiliation(s)
- Daniel Arend
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, OT Gatersleben, Germany.
| | - Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, OT Gatersleben, Germany
| | - Matthias Lange
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, OT Gatersleben, Germany
| |
|
25
|
Röckel F, Schreiber T, Schüler D, Braun U, Krukenberg I, Schwander F, Peil A, Brandt C, Willner E, Gransow D, Scholz U, Kecke S, Maul E, Lange M, Töpfer R. PhenoApp: A mobile tool for plant phenotyping to record field and greenhouse observations. F1000Res 2022; 11:12. [PMID: 36636476 PMCID: PMC9813448 DOI: 10.12688/f1000research.74239.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 12/20/2021] [Indexed: 01/21/2023] Open
Abstract
With the ongoing cost decrease of genotyping and sequencing technologies, accurate and fast phenotyping remains the bottleneck in the utilizing of plant genetic resources for breeding and breeding research. Although cost-efficient high-throughput phenotyping platforms are emerging for specific traits and/or species, manual phenotyping is still widely used and is a time- and money-consuming step. Approaches that improve data recording, processing or handling are pivotal steps towards the efficient use of genetic resources and are demanded by the research community. Therefore, we developed PhenoApp, an open-source Android app for tablets and smartphones to facilitate the digital recording of phenotypical data in the field and in greenhouses. It is a versatile tool that offers the possibility to fully customize the descriptors/scales for any possible scenario, also in accordance with international information standards such as MIAPPE (Minimum Information About a Plant Phenotyping Experiment) and FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. Furthermore, PhenoApp enables the use of pre-integrated ready-to-use BBCH (Biologische Bundesanstalt für Land- und Forstwirtschaft, Bundessortenamt und CHemische Industrie) scales for apple, cereals, grapevine, maize, potato, rapeseed and rice. Additional BBCH scales can easily be added. The simple and adaptable structure of input and output files enables an easy data handling by either spreadsheet software or even the integration in the workflow of laboratory information management systems (LIMS). PhenoApp is therefore a decisive contribution to increase efficiency of digital data acquisition in genebank management but also contributes to breeding and breeding research by accelerating the labour intensive and time-consuming acquisition of phenotyping data.
Affiliation(s)
- Franco Röckel
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
| | - Toni Schreiber
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Data Processing Department, Erwin-Baur-Straße 27, Quedlinburg, 06484, Germany
| | - Danuta Schüler
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, Seeland, 06466, Germany
| | - Ulrike Braun
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
| | - Ina Krukenberg
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Data Processing Department, Königin-Luise-Strasse 19, Berlin, 14195, Germany
| | - Florian Schwander
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
| | - Andreas Peil
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Breeding Research on Fruit Crops, Pillnitzer Platz 3a, Dresden/Pillnitz, 01326, Germany
| | - Christine Brandt
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), The Satellite Collections North, Parkweg 3a, Sanitz, 18190, Germany
| | - Evelin Willner
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), The Satellite Collections North, Inselstraße 9, Malchow/Poel, 23999, Germany
| | - Daniel Gransow
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), The Satellite Collections North, Inselstraße 9, Malchow/Poel, 23999, Germany
| | - Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, Seeland, 06466, Germany
| | - Steffen Kecke
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Data Processing Department, Erwin-Baur-Straße 27, Quedlinburg, 06484, Germany
| | - Erika Maul
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
| | - Matthias Lange
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, Seeland, 06466, Germany
| | - Reinhard Töpfer
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
| |
|
26
|
Röckel F, Schreiber T, Schüler D, Braun U, Krukenberg I, Schwander F, Peil A, Brandt C, Willner E, Gransow D, Scholz U, Kecke S, Maul E, Lange M, Töpfer R. PhenoApp: A mobile tool for plant phenotyping to record field and greenhouse observations. F1000Res 2022; 11:12. [PMID: 36636476 PMCID: PMC9813448 DOI: 10.12688/f1000research.74239.2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 11/25/2022] [Indexed: 11/29/2022] Open
Abstract
With the ongoing cost decrease of genotyping and sequencing technologies, accurate and fast phenotyping remains the bottleneck in the utilizing of plant genetic resources for breeding and breeding research. Although cost-efficient high-throughput phenotyping platforms are emerging for specific traits and/or species, manual phenotyping is still widely used and is a time- and money-consuming step. Approaches that improve data recording, processing or handling are pivotal steps towards the efficient use of genetic resources and are demanded by the research community. Therefore, we developed PhenoApp, an open-source Android app for tablets and smartphones to facilitate the digital recording of phenotypical data in the field and in greenhouses. It is a versatile tool that offers the possibility to fully customize the descriptors/scales for any possible scenario, also in accordance with international information standards such as MIAPPE (Minimum Information About a Plant Phenotyping Experiment) and FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. Furthermore, PhenoApp enables the use of pre-integrated ready-to-use BBCH (Biologische Bundesanstalt für Land- und Forstwirtschaft, Bundessortenamt und CHemische Industrie) scales for apple, cereals, grapevine, maize, potato, rapeseed and rice. Additional BBCH scales can easily be added. The simple and adaptable structure of input and output files enables an easy data handling by either spreadsheet software or even the integration in the workflow of laboratory information management systems (LIMS). PhenoApp is therefore a decisive contribution to increase efficiency of digital data acquisition in genebank management but also contributes to breeding and breeding research by accelerating the labour intensive and time-consuming acquisition of phenotyping data.
Affiliation(s)
- Franco Röckel
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
| | - Toni Schreiber
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Data Processing Department, Erwin-Baur-Straße 27, Quedlinburg, 06484, Germany
| | - Danuta Schüler
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, Seeland, 06466, Germany
| | - Ulrike Braun
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
| | - Ina Krukenberg
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Data Processing Department, Königin-Luise-Strasse 19, Berlin, 14195, Germany
| | - Florian Schwander
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
| | - Andreas Peil
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Breeding Research on Fruit Crops, Pillnitzer Platz 3a, Dresden/Pillnitz, 01326, Germany
| | - Christine Brandt
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), The Satellite Collections North, Parkweg 3a, Sanitz, 18190, Germany
| | - Evelin Willner
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), The Satellite Collections North, Inselstraße 9, Malchow/Poel, 23999, Germany
| | - Daniel Gransow
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), The Satellite Collections North, Inselstraße 9, Malchow/Poel, 23999, Germany
| | - Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, Seeland, 06466, Germany
| | - Steffen Kecke
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Data Processing Department, Erwin-Baur-Straße 27, Quedlinburg, 06484, Germany
| | - Erika Maul
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
| | - Matthias Lange
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, Seeland, 06466, Germany
| | - Reinhard Töpfer
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
| |
|
27
|
Hamilton DG, Page MJ, Finch S, Everitt S, Fidler F. How often do cancer researchers make their data and code available and what factors are associated with sharing? BMC Med 2022; 20:438. [PMID: 36352426 PMCID: PMC9646258 DOI: 10.1186/s12916-022-02644-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/18/2022] [Accepted: 10/31/2022] [Indexed: 11/11/2022] Open
Abstract
BACKGROUND Various stakeholders are calling for increased availability of data and code from cancer research. However, it is unclear how commonly these products are shared, and what factors are associated with sharing. Our objective was to evaluate how frequently oncology researchers make data and code available and explore factors associated with sharing. METHODS A cross-sectional analysis of a random sample of 306 cancer-related articles indexed in PubMed in 2019 which studied research subjects with a cancer diagnosis was performed. All articles were independently screened for eligibility by two authors. Outcomes of interest included the prevalence of affirmative sharing declarations and the rate with which declarations connected to data complying with key FAIR principles (e.g. posted to a recognised repository, assigned an identifier, data license outlined, non-proprietary formatting). We also investigated associations between sharing rates and several journal characteristics (e.g. sharing policies, publication models), study characteristics (e.g. cancer rarity, study design), open science practices (e.g. pre-registration, pre-printing) and subsequent citation rates between 2020 and 2021. RESULTS One in five studies declared data were publicly available (59/306, 19%, 95% CI: 15-24%). However, when data availability was investigated this percentage dropped to 16% (49/306, 95% CI: 12-20%), and then to less than 1% (1/306, 95% CI: 0-2%) when data were checked for compliance with key FAIR principles. While only 4% of articles that used inferential statistics reported code to be available (10/274, 95% CI: 2-6%), the odds of reporting code to be available were 5.6 times higher for researchers who shared data. Compliance with mandatory data and code sharing policies was observed in 48% (14/29) and 0% (0/6) of articles, respectively. However, 88% of articles (45/51) included data availability statements when required. 
Policies that encouraged data sharing did not appear to be any more effective than not having a policy at all. The only factors associated with higher rates of data sharing were studying rare cancers and using publicly available data to complement original research. CONCLUSIONS Data and code sharing in oncology occurs infrequently, and at a lower rate than would be expected given the prevalence of mandatory sharing policies. There is also a large gap between those declaring data to be available, and those archiving data in a way that facilitates its reuse. We encourage journals to actively check compliance with sharing policies, and researchers consult community-accepted guidelines when archiving the products of their research.
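As a worked illustration of how a figure such as "59/306, 19%, 95% CI: 15-24%" can arise, the sketch below computes a Wilson score interval for a proportion. The abstract does not say which interval method the authors used, so the Wilson method is an assumption here, and the other intervals reported above may differ slightly under it. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximate 95% confidence level.
    Returns (lower, upper) as fractions between 0 and 1.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Studies declaring data publicly available: 59 of 306.
    low, high = wilson_interval(59, 306)
    print(f"{59 / 306:.0%} (95% CI: {low:.0%} to {high:.0%})")
```

For 59 of 306, the bounds round to 15% and 24%, in line with the interval quoted in the abstract.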
Affiliation(s)
- Daniel G Hamilton
- MetaMelb Research Group, School of BioSciences, University of Melbourne, Melbourne, Australia; Melbourne Medical School, Faculty of Medicine, Dentistry & Health Sciences, University of Melbourne, Melbourne, Australia.
| | - Matthew J Page
- School of Public Health & Preventive Medicine, Monash University, Melbourne, Australia
| | - Sue Finch
- Melbourne Statistical Consulting Platform, School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia
| | - Sarah Everitt
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Australia
| | - Fiona Fidler
- MetaMelb Research Group, School of BioSciences, University of Melbourne, Melbourne, Australia; School of Historical and Philosophical Studies, University of Melbourne, Melbourne, Australia
| |
|
28
|
Celuchova Bosanska D, Huptych M, Lhotská L. Decentralized EHRs in the Semantic Web for Better Health Data Management. Stud Health Technol Inform 2022; 299:157-162. [PMID: 36325857 DOI: 10.3233/shti220975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Electronic Health Record (EHR) systems currently in use are not designed for widely interoperable longitudinal health data. Therefore, EHR data cannot be properly shared, managed and analyzed. In this article, we propose two approaches to making EHR data more comprehensive and FAIR (Findable, Accessible, Interoperable, and Reusable) and thus more useful for diagnosis and clinical research. Firstly, the data modeling based on the LinkML framework makes the data interoperability more realistic in diverse environments with various experts involved. We show the first results of how diverse health data can be integrated based on an easy-to-understand data model and without loss of available clinical knowledge. Secondly, decentralizing EHRs contributes to the higher availability of comprehensive and consistent EHR data. We propose a technology stack for decentralized EHRs and the reasons behind this proposal. Moreover, the two proposed approaches empower patients because their EHR data can become more available, understandable, and usable for them, and they can share their data according to their needs and preferences. Finally, we explore how the users of the proposed solution could be involved in the process of its validation and adoption.
Affiliation(s)
| | - Michal Huptych
- Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague, Czech Republic
| | - Lenka Lhotská
- Faculty of Biomedical Engineering, Czech Technical University in Prague, Czech Republic
- Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague, Czech Republic
| |
|
29
|
Lange L, Berg G, Cernava T, Champomier-Vergès MC, Charles T, Cocolin L, Cotter P, D’Hondt K, Kostic T, Maguin E, Makhalanyane T, Meisner A, Ryan M, Kiran GS, de Souza RS, Sanz Y, Schloter M, Smidt H, Wakelin S, Sessitsch A. Microbiome ethics, guiding principles for microbiome research, use and knowledge management. Environ Microbiome 2022; 17:50. [PMID: 36180931 PMCID: PMC9526347 DOI: 10.1186/s40793-022-00444-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2022] [Accepted: 09/14/2022] [Indexed: 06/16/2023]
Abstract
The overarching biological impact of microbiomes on their hosts, and more generally their environment, reflects the co-evolution of a mutualistic symbiosis, generating fitness for both. Knowledge of microbiomes, their systemic role, interactions, and impact grows exponentially. When a research field of importance for planetary health evolves so rapidly, it is essential to consider it from an ethical holistic perspective. However, to date, the topic of microbiome ethics has received relatively little attention considering its importance. Here, ethical analysis of microbiome research, innovation, use, and potential impact is structured around the four cornerstone principles of ethics: Do Good; Don't Harm; Respect; Act Justly. This simple, but not simplistic approach allows ethical issues to be communicative and operational. The essence of the paper is captured in a set of eleven microbiome ethics recommendations, e.g., proposing gut microbiome status as common global heritage, similar to the internationally agreed status of major food crops.
Affiliation(s)
- Lene Lange
- LL-BioEconomy, Valby, Copenhagen, Denmark
| | | | | | | | | | | | - Paul Cotter
- Teagasc Food Research Centre, Moorepark, APC Microbiome Ireland and VistaMilk, Cork, Ireland
| | - Kathleen D’Hondt
- Department of Economy, Science and Innovation, Flemish Government, Brussels, Belgium
| | - Tanja Kostic
- AIT Austrian Institute of Technology GmbH, Tulln, Austria
| | - Emmanuelle Maguin
- INRAE, AgroParisTech, Micalis Institute, Université Paris-Saclay, Jouy-en-Josas, France
| | | | - Annelein Meisner
- Wageningen Research, Wageningen University & Research, Wageningen, The Netherlands
| | | | | | | | - Yolanda Sanz
- Institute of Agrochemistry and Food Technology- Spanish National Research Council (IATA-CSIC), Valencia, Spain
| | | | - Hauke Smidt
- Laboratory of Microbiology, Wageningen University & Research, Wageningen, The Netherlands
| | | | | |
|
30
|
Eva G, Liese G, Stephanie B, Petr H, Leslie M, Roel V, Martine V, Sergi B, Mette H, Sarah J, Laura RM, Arnout S, Morris A S, Jan T, Xenia T, Nina V, Koert VE, Sylvie R, Greet S. Position paper on management of personal data in environment and health research in Europe. Environ Int 2022; 165:107334. [PMID: 35696847 DOI: 10.1016/j.envint.2022.107334] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/22/2021] [Revised: 05/30/2022] [Accepted: 06/01/2022] [Indexed: 06/15/2023]
Abstract
Management of datasets that include health information and other sensitive personal information of European study participants has to be compliant with the General Data Protection Regulation (GDPR, Regulation (EU) 2016/679). Within scientific research, the widely subscribed 'FAIR' data principles should apply, meaning that research data should be findable, accessible, interoperable and re-usable. Balancing the aim of open science driven FAIR data management with GDPR compliant personal data protection safeguards is now a common challenge for many research projects dealing with (sensitive) personal data. In December 2020 a workshop was held with representatives of several large EU research consortia and of the European Commission to reflect on how to apply the FAIR data principles for environment and health research (E&H). Several recent data intensive EU funded E&H research projects face this challenge and work intensively towards developing solutions to access, exchange, store, handle, share, process and use such sensitive personal data, with the aim to support European and transnational collaborations. As a result, several recommendations, opportunities and current limitations were formulated. New technical developments such as federated data management and analysis systems, machine learning together with advanced search software, harmonized ontologies and data quality standards should in principle facilitate the FAIRification of data. To address ethical, legal, political and financial obstacles to the wider re-use of data for research purposes, both specific expertise and underpinning infrastructure are needed. There is a need for the E&H research data to find their place in the European Open Science Cloud. Communities using health and population data, environmental data and other publicly available data have to interconnect and synergize.
To maximize the use and re-use of environment and health data, a dedicated supporting European infrastructure effort, such as the EIRENE research infrastructure within the ESFRI roadmap 2021, is needed that would interact with existing infrastructures.
Affiliation(s)
- Govarts Eva
- VITO Health, Flemish Institute for Technological Research (VITO), Mol, Belgium.
| | - Gilles Liese
- VITO Health, Flemish Institute for Technological Research (VITO), Mol, Belgium
| | - Bopp Stephanie
- European Commission, Joint Research Centre (JRC), Ispra, Italy
| | | | - Matalonga Leslie
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Vermeulen Roel
- Institute for Risk Assessment Sciences, Utrecht University, Utrecht, Netherlands
| | - Vrijheid Martine
- ISGlobal, Barcelona, Spain; Spanish Consortium for Research on Epidemiology and Public Health (CIBERESP), Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Beltran Sergi
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona (UB), Barcelona, Spain
| | - Hartlev Mette
- Faculty of Law, University of Copenhagen, Copenhagen, Denmark
| | | | | | - Standaert Arnout
- VITO Health, Flemish Institute for Technological Research (VITO), Mol, Belgium
| | - Swertz Morris A
- Department of Genetics & Genomics Coordination Center, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Theunis Jan
- VITO Health, Flemish Institute for Technological Research (VITO), Mol, Belgium
| | - Trier Xenia
- European Environment Agency (EEA), Copenhagen, Denmark
| | - Vogel Nina
- German Environment Agency (UBA), Berlin, Germany
| | | | - Remy Sylvie
- VITO Health, Flemish Institute for Technological Research (VITO), Mol, Belgium
| | - Schoeters Greet
- VITO Health, Flemish Institute for Technological Research (VITO), Mol, Belgium; Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| |
|
31
|
Alvarez-Romero C, Martínez-García A, Sinaci AA, Gencturk M, Méndez E, Hernández-Pérez T, Liperoti R, Angioletti C, Löbe M, Ganapathy N, Deserno TM, Almada M, Costa E, Chronaki C, Cangioli G, Cornet R, Poblador-Plou B, Carmona-Pírez J, Gimeno-Miguel A, Poncel-Falcó A, Prados-Torres A, Kovacevic T, Zaric B, Bokan D, Hromis S, Djekic Malbasa J, Rapallo Fernández C, Velázquez Fernández T, Rochat J, Gaudet-Blavignac C, Lovis C, Weber P, Quintero M, Perez-Perez MM, Ashley K, Horton L, Parra Calderón CL. FAIR4Health: Findable, Accessible, Interoperable and Reusable data to foster Health Research. Open Res Eur 2022; 2:34. [PMID: 37645268 PMCID: PMC10446092 DOI: 10.12688/openreseurope.14349.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 05/25/2022] [Indexed: 08/31/2023]
Abstract
Due to the nature of health data, its sharing and reuse for research are limited by ethical, legal and technical barriers. The FAIR4Health project facilitated and promoted the application of FAIR principles in health research data, derived from the publicly funded health research initiatives to make them Findable, Accessible, Interoperable, and Reusable (FAIR). To confirm the feasibility of the FAIR4Health solution, we performed two pathfinder case studies to carry out federated machine learning algorithms on FAIRified datasets from five health research organizations. The case studies demonstrated the potential impact of the developed FAIR4Health solution on health outcomes and social care research. Finally, we promoted the FAIRified data to share and reuse in the European Union Health Research community, defining an effective EU-wide strategy for the use of FAIR principles in health research and preparing the ground for a roadmap for health research institutions. This scientific report presents a general overview of the FAIR4Health solution: from the FAIRification workflow design to translate raw data/metadata to FAIR data/metadata in the health research domain to the FAIR4Health demonstrators' performance.
Affiliation(s)
- Celia Alvarez-Romero
- Computational Health Informatics Group, Institute of Biomedicine of Seville, IBiS / Virgen del Rocío University Hospital / CSIC / University of Seville, Seville, 41013, Spain
| | - Alicia Martínez-García
- Computational Health Informatics Group, Institute of Biomedicine of Seville, IBiS / Virgen del Rocío University Hospital / CSIC / University of Seville, Seville, 41013, Spain
| | - A. Anil Sinaci
- SRDC Software Research Development and Consultancy Corporation, Ankara, 06800, Turkey
| | - Mert Gencturk
- SRDC Software Research Development and Consultancy Corporation, Ankara, 06800, Turkey
| | - Eva Méndez
- Dept. of Library & Inf Sci. Universidad Carlos III de Madrid, Getafe, 28903, Spain
| | - Tony Hernández-Pérez
- Dept. of Library & Inf Sci. Universidad Carlos III de Madrid, Getafe, 28903, Spain
| | - Rosa Liperoti
- Department of Geriatric and Orthopedic Sciences, Catholic University of Sacred Heart, Roma, 00168, Italy
| | - Carmen Angioletti
- Department of Geriatric and Orthopedic Sciences, Catholic University of Sacred Heart, Roma, 00168, Italy
| | - Matthias Löbe
- Institute for Medical Informatics (IMISE), University of Leipzig, Leipzig, 04107, Germany
| | - Nagarajan Ganapathy
- PLRI Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Braunschweig, 38106, Germany
| | - Thomas M. Deserno
- PLRI Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Braunschweig, 38106, Germany
| | - Marta Almada
- Ucibio Requimte, Faculty of Pharmacy University of Porto. Porto4Ageing, Porto, 4050-313, Portugal
| | - Elisio Costa
- Ucibio Requimte, Faculty of Pharmacy University of Porto. Porto4Ageing, Porto, 4050-313, Portugal
| | | | | | - Ronald Cornet
- Amsterdam UMC, University of Amsterdam, Medical Informatics, Amsterdam Public Health, Amsterdam, 1105AZ, The Netherlands
| | - Beatriz Poblador-Plou
- EpiChron Research Group, Aragon Health Sciences Institute (IACS), IIS Aragón, Miguel Servet University Hospital, Zaragoza, 50009, Spain
| | - Jonás Carmona-Pírez
- EpiChron Research Group, Aragon Health Sciences Institute (IACS), IIS Aragón, Miguel Servet University Hospital, Zaragoza, 50009, Spain
| | - Antonio Gimeno-Miguel
- EpiChron Research Group, Aragon Health Sciences Institute (IACS), IIS Aragón, Miguel Servet University Hospital, Zaragoza, 50009, Spain
| | - Antonio Poncel-Falcó
- EpiChron Research Group, Aragon Health Sciences Institute (IACS), IIS Aragón, Aragon Health Service, Zaragoza, 50009, Spain
| | - Alexandra Prados-Torres
- EpiChron Research Group, Aragon Health Sciences Institute (IACS), IIS Aragón, Miguel Servet University Hospital, Zaragoza, 50009, Spain
| | - Tomi Kovacevic
- Medical Faculty University of Novi Sad, Novi Sad, 21000, Serbia
- Institute for Pulmonary Diseases of Vojvodina, Sremska Kamenica, 21204, Serbia
| | - Bojan Zaric
- Medical Faculty University of Novi Sad, Novi Sad, 21000, Serbia
- Institute for Pulmonary Diseases of Vojvodina, Sremska Kamenica, 21204, Serbia
| | - Darijo Bokan
- Institute for Pulmonary Diseases of Vojvodina, Sremska Kamenica, 21204, Serbia
| | - Sanja Hromis
- Medical Faculty University of Novi Sad, Novi Sad, 21000, Serbia
- Institute for Pulmonary Diseases of Vojvodina, Sremska Kamenica, 21204, Serbia
| | - Jelena Djekic Malbasa
- Medical Faculty University of Novi Sad, Novi Sad, 21000, Serbia
- Institute for Pulmonary Diseases of Vojvodina, Sremska Kamenica, 21204, Serbia
| | | | | | - Jessica Rochat
- University of Geneva and University hospitals of Geneva, Geneva, 1211, Switzerland
| | | | - Christian Lovis
- University of Geneva and University hospitals of Geneva, Geneva, 1211, Switzerland
| | - Patrick Weber
- Nice Computing SA Le Mont-sur-Lausanne, Le Mont-sur-Lausanne, 1052, Switzerland
| | - Miriam Quintero
- Atos Research and Innovation - ARI. Atos IT., Madrid, 28037, Spain
- Atos Research and Innovation - ARI. Atos Spain., Madrid, 28037, Spain
| | - Manuel M. Perez-Perez
- Atos Research and Innovation - ARI. Atos IT., Madrid, 28037, Spain
- Atos Research and Innovation - ARI. Atos Spain., Madrid, 28037, Spain
| | - Kevin Ashley
- Digital Curation Centre, University of Edinburgh, Argyle House, Edinburgh, EH3 9DR, UK
| | - Laurence Horton
- Digital Curation Centre, University of Glasgow, Glasgow, G12 8QQ, UK
| | - Carlos Luis Parra Calderón
- Computational Health Informatics Group, Institute of Biomedicine of Seville, IBiS / Virgen del Rocío University Hospital / CSIC / University of Seville, Seville, 41013, Spain
| |
|
32
|
Chou A, Torres-Espín A, Huie JR, Krukowski K, Lee S, Nolan A, Guglielmetti C, Hawkins BE, Chaumeil MM, Manley GT, Beattie MS, Bresnahan JC, Martone ME, Grethe JS, Rosi S, Ferguson AR. Empowering Data Sharing and Analytics through the Open Data Commons for Traumatic Brain Injury Research. Neurotrauma Rep 2022; 3:139-157. [PMID: 35403104 PMCID: PMC8985540 DOI: 10.1089/neur.2021.0061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023] Open
Abstract
Traumatic brain injury (TBI) is a major public health problem. Despite considerable research deciphering injury pathophysiology, precision therapies remain elusive. Here, we present large-scale data sharing and machine intelligence approaches to leverage TBI complexity. The Open Data Commons for TBI (ODC-TBI) is a community-centered repository emphasizing Findable, Accessible, Interoperable, and Reusable data sharing and publication with persistent identifiers. Importantly, the ODC-TBI implements data sharing of individual subject data, enabling pooling for high-sample-size, feature-rich data sets for machine learning analytics. We demonstrate pooled ODC-TBI data analyses, starting with descriptive analytics of subject-level data from 11 previously published articles (N = 1250 subjects) representing six distinct pre-clinical TBI models. Second, we perform unsupervised machine learning on multi-cohort data to identify persistent inflammatory patterns across different studies, improving experimental sensitivity for pro- versus anti-inflammation effects. As funders and journals increasingly mandate open data practices, ODC-TBI will create new scientific opportunities for researchers and facilitate multi-data-set, multi-dimensional analytics toward effective translation.
Affiliation(s)
- Austin Chou
- Brain and Spinal Injury Center, University of California San Francisco, San Francisco, California, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, California, USA
| | - Abel Torres-Espín
- Brain and Spinal Injury Center, University of California San Francisco, San Francisco, California, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, California, USA
| | - J Russell Huie
- Brain and Spinal Injury Center, University of California San Francisco, San Francisco, California, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, California, USA
- San Francisco Veterans Affairs Healthcare System, San Francisco, California, USA
| | - Karen Krukowski
- Brain and Spinal Injury Center, University of California San Francisco, San Francisco, California, USA
- Department of Physical Therapy and Rehabilitation Science, University of California San Francisco, San Francisco, California, USA
| | - Sangmi Lee
- Brain and Spinal Injury Center, University of California San Francisco, San Francisco, California, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, California, USA
| | - Amber Nolan
- Brain and Spinal Injury Center, University of California San Francisco, San Francisco, California, USA
- Department of Physical Therapy and Rehabilitation Science, University of California San Francisco, San Francisco, California, USA
| | - Caroline Guglielmetti
- Department of Physical Therapy and Rehabilitation Science, University of California San Francisco, San Francisco, California, USA
- Department of Radiology & Biomedical Imaging, University of California San Francisco, San Francisco, California, USA
| | - Bridget E Hawkins
- Department of Anesthesiology, University of Texas Medical Branch at Galveston, Galveston, Texas, USA
- Moody Project for Traumatic Brain Injury Research, University of Texas Medical Branch at Galveston, Galveston, Texas, USA
| | - Myriam M Chaumeil
- Department of Physical Therapy and Rehabilitation Science, University of California San Francisco, San Francisco, California, USA
- Department of Radiology & Biomedical Imaging, University of California San Francisco, San Francisco, California, USA
| | - Geoffrey T Manley
- Brain and Spinal Injury Center, University of California San Francisco, San Francisco, California, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, California, USA
| | - Michael S Beattie
- Brain and Spinal Injury Center, University of California San Francisco, San Francisco, California, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, California, USA
- San Francisco Veterans Affairs Healthcare System, San Francisco, California, USA
- Weill Institute for Neuroscience, University of California San Francisco, San Francisco, California, USA
| | - Jacqueline C Bresnahan
- Brain and Spinal Injury Center, University of California San Francisco, San Francisco, California, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, California, USA
- Weill Institute for Neuroscience, University of California San Francisco, San Francisco, California, USA
| | - Maryann E Martone
- Department of Neuroscience, University of California San Diego, San Diego, California, USA
| | - Jeffrey S Grethe
- Department of Neuroscience, University of California San Diego, San Diego, California, USA
| | - Susanna Rosi
- Brain and Spinal Injury Center, University of California San Francisco, San Francisco, California, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, California, USA
- Department of Physical Therapy and Rehabilitation Science, University of California San Francisco, San Francisco, California, USA
- Weill Institute for Neuroscience, University of California San Francisco, San Francisco, California, USA
- Kavli Institute of Fundamental Neuroscience, University of California San Francisco, San Francisco, California, USA
| | - Adam R Ferguson
- Brain and Spinal Injury Center, University of California San Francisco, San Francisco, California, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, California, USA
- San Francisco Veterans Affairs Healthcare System, San Francisco, California, USA
- Weill Institute for Neuroscience, University of California San Francisco, San Francisco, California, USA
| |
|
33
|
Maas AIR, Ercole A, De Keyser V, Menon DK, Steyerberg EW. Opportunities and Challenges in High-Quality Contemporary Data Collection in Traumatic Brain Injury: The CENTER-TBI Experience. Neurocrit Care 2022; 37:192-201. [PMID: 35303262 DOI: 10.1007/s12028-022-01471-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/03/2021] [Accepted: 02/11/2022] [Indexed: 11/29/2022]
Abstract
Strong evidence in support of guidelines for traumatic brain injury (TBI) is lacking. Large-scale observational studies may offer a complementary source of evidence to clinical trials to improve the care and outcome for patients with TBI. They are, however, challenging to execute. In this review, we aim to characterize opportunities and challenges of large-scale collaborative research in neurotrauma. We use the setup and conduct of Collaborative European Neurotrauma Effectiveness Research in TBI (CENTER-TBI) as an illustrative example. We highlight the importance of building a team and of developing a network for younger researchers, thus investing toward the future. We involved investigators early in the design phase and recognized their efforts in a group contributor list on all publications. We found, however, that translation to academic credits often failed, and we suggest that the current system of academic credits be critically appraised. We found substantial variability in consent procedures for participant enrollment within and between countries. Overall, obtaining approvals typically required 4-6 months, with outliers up to 18 months. Research costs varied considerably across Europe and should be defined by center. We substantially underestimated costs of data curation, and we suggest that 15-20% of the budget be reserved for this purpose. Streamlining analyses and accommodating external research proposals demanded a structured approach. We implemented a systematic inventory of study plans and found this effective in maintaining oversight and in promoting collaboration between research groups. Ensuring good use of the data was a prominent feature in the review of external proposals. Multiple interactions occurred with industrial partners, mainly related to biomarkers and neuroimaging, and resulted in various formal collaborations, substantially extending the scope of CENTER-TBI. 
Overall, CENTER-TBI has been productive, with over 250 international peer-reviewed publications. We have ensured mechanisms to maintain the infrastructure and continued analyses. We see potential for individual patient data meta-analyses in connection to other large-scale projects. Our collaboration with Transforming Research and Clinical Knowledge in TBI (TRACK-TBI) has taught us that although standardized data collection and coding according to common data elements can facilitate such meta-analyses, further data harmonization is required for meaningful results. Both CENTER-TBI and TRACK-TBI have demonstrated the complexity of the conduct of large-scale collaborative studies that produce high-quality science and new insights.
Affiliation(s)
- Andrew I R Maas
- Department of Neurosurgery, Antwerp University Hospital and University of Antwerp, Drie Eikenstraat 655, 2650, Edegem, Belgium.
| | - Ari Ercole
- Division of Anaesthesia, University of Cambridge and Addenbrooke's Hospital, Box 93, Cambridge, CB2 0QQ, UK
| | - Veronique De Keyser
- Department of Neurosurgery, Antwerp University Hospital and University of Antwerp, Drie Eikenstraat 655, 2650, Edegem, Belgium
| | - David K Menon
- Division of Anaesthesia, University of Cambridge and Addenbrooke's Hospital, Box 93, Cambridge, CB2 0QQ, UK
| | - Ewout W Steyerberg
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| |
|
34
|
Carmona-Pírez J, Poblador-Plou B, Poncel-Falcó A, Rochat J, Alvarez-Romero C, Martínez-García A, Angioletti C, Almada M, Gencturk M, Sinaci AA, Ternero-Vega JE, Gaudet-Blavignac C, Lovis C, Liperoti R, Costa E, Parra-Calderón CL, Moreno-Juste A, Gimeno-Miguel A, Prados-Torres A. Applying the FAIR4Health Solution to Identify Multimorbidity Patterns and Their Association with Mortality through a Frequent Pattern Growth Association Algorithm. Int J Environ Res Public Health 2022; 19:2040. [PMID: 35206230 DOI: 10.3390/ijerph19042040] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Revised: 02/09/2022] [Accepted: 02/10/2022] [Indexed: 12/18/2022]
Abstract
The current availability of electronic health records represents an excellent research opportunity on multimorbidity, one of the most relevant public health problems nowadays. However, it also poses a methodological challenge due to the current lack of tools to access, harmonize and reuse research datasets. In FAIR4Health, a European Horizon 2020 project, a workflow to implement the FAIR (findability, accessibility, interoperability and reusability) principles on health datasets was developed, as well as two tools aimed at facilitating the transformation of raw datasets into FAIR ones and the preservation of data privacy. As part of this project, we conducted a multicentric retrospective observational study to apply the aforementioned FAIR implementation workflow and tools to five European health datasets for research on multimorbidity. We applied a federated frequent pattern growth association algorithm to identify the most frequent combinations of chronic diseases and their association with mortality risk. We identified several multimorbidity patterns clinically plausible and consistent with the bibliography, some of which were strongly associated with mortality. Our results show the usefulness of the solution developed in FAIR4Health to overcome the difficulties in data management and highlight the importance of implementing a FAIR data policy to accelerate responsible health research.
|
35
|
Facile R, Muhlbradt EE, Gong M, Li Q, Popat V, Pétavy F, Cornet R, Ruan Y, Koide D, Saito TI, Hume S, Rockhold F, Bao W, Dubman S, Jauregui Wurst B. Use of Clinical Data Interchange Standards Consortium (CDISC) Standards for Real-world Data: Expert Perspectives From a Qualitative Delphi Survey. JMIR Med Inform 2022; 10:e30363. [PMID: 35084343 PMCID: PMC8832264 DOI: 10.2196/30363] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 06/01/2021] [Revised: 09/17/2021] [Accepted: 10/09/2021] [Indexed: 01/16/2023] Open
Abstract
Background: Real-world data (RWD) and real-world evidence (RWE) are playing increasingly important roles in clinical research and health care decision-making. To leverage RWD and generate reliable RWE, data should be well defined and structured in a way that is semantically interoperable and consistent across stakeholders. The adoption of data standards is one of the cornerstones supporting high-quality evidence for the development of clinical medicine and therapeutics. Clinical Data Interchange Standards Consortium (CDISC) data standards are mature, globally recognized, and heavily used by the pharmaceutical industry for regulatory submissions. The CDISC RWD Connect Initiative aims to better understand the barriers to implementing CDISC standards for RWD and to identify the tools and guidance needed to more easily implement them. Objective: The aim of this study is to understand the barriers to implementing CDISC standards for RWD and to identify the tools and guidance that may be needed to implement CDISC standards more easily for this purpose. Methods: We conducted a qualitative Delphi survey involving an expert advisory board with multiple key stakeholders, with 3 rounds of input and review. Results: Overall, 66 experts participated in round 1, 56 in round 2, and 49 in round 3 of the Delphi survey. Their inputs were collected and analyzed, culminating in group statements. It was widely agreed that the standardization of RWD is highly necessary, and the primary focus should be on its ability to improve data sharing and the quality of RWE. The priorities for RWD standardization included electronic health records, such as data shared using Health Level 7 Fast Healthcare Interoperability Resources (FHIR), and the data stemming from observational studies. 
With different standardization efforts already underway in these areas, a gap analysis should be performed to identify the areas where synergies and efficiencies are possible and then collaborate with stakeholders to create or extend existing mappings between CDISC and other standards, controlled terminologies, and models to represent data originating across different sources. Conclusions: There are many ongoing data standardization efforts around human health data–related activities, each with different definitions, levels of granularity, and purpose. Among these, CDISC has been successful in standardizing clinical trial-based data for regulation worldwide. However, the complexity of the CDISC standards and the fact that they were developed for different purposes, combined with the lack of awareness and incentives to use a new standard and insufficient training and implementation support, are significant barriers to setting up the use of CDISC standards for RWD. The collection and dissemination of use cases, development of tools and support systems for the RWD community, and collaboration with other standards development organizations are potential steps forward. Using CDISC will help link clinical trial data and RWD and promote innovation in health data science.
Affiliation(s)
- Rhonda Facile
- Clinical Data Interchange Standards Consortium, Austin, TX, United States
| | | | - Mengchun Gong
- Digital Health China Technologies, Beijing, China; Institute of Health Management, Southern Medical University, Guangzhou, China
| | - Qingna Li
- Institute of Clinical Pharmacology, Xiyuan Hospital of China Academy of Chinese Medical Sciences, Beijing, China; Key Laboratory for Clinical Research and Evaluation of Traditional Chinese Medicine of National Medical Products Administration, Beijing, China; National Clinical Research Center for Chinese Medicine Cardiology, Beijing, China
| | - Vaishali Popat
- Food and Drug Administration, Center for Drug Evaluation Research, Silver Spring, MD, United States
| | - Frank Pétavy
- European Medicines Agency, Amsterdam, Netherlands
| | - Ronald Cornet
- Department of Medical Informatics, Amsterdam Public Health Research Institute, Amsterdam University Medical Centers - University of Amsterdam, Amsterdam, Netherlands
| | | | - Daisuke Koide
- Department of Biostatistics & Bioinformatics, Graduate School of Medicine, University of Tokyo, Tokyo, Japan
| | - Toshiki I Saito
- National Hospital Organization Nagoya Medical Center, Nagoya, Japan
| | - Sam Hume
- Clinical Data Interchange Standards Consortium, Austin, TX, United States
| | - Frank Rockhold
- Duke Clinical Research Institute, Duke University Medical Center, Durham, NC, United States
| | - Wenjun Bao
- JMP Life Sciences, SAS Institute Inc, Cary, NC, United States
| | - Sue Dubman
- Clinical Data Interchange Standards Consortium, Austin, TX, United States
| | | |
|
36
|
Unim B, Mattei E, Carle F, Tolonen H, Bernal-Delgado E, Achterberg P, Zaletel M, Seeling S, Haneef R, Lorcy AC, Van Oyen H, Palmieri L. Health data collection methods and procedures across EU member states: findings from the InfAct Joint Action on health information. Arch Public Health 2022; 80:17. [PMID: 34986889 PMCID: PMC8728985 DOI: 10.1186/s13690-021-00780-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/31/2021] [Accepted: 12/23/2021] [Indexed: 11/15/2022] Open
Abstract
Background: Health-related data are collected from a variety of sources for different purposes, including secondary use for population health monitoring (HM) and health system performance assessment (HSPA). Most of these data sources are not included in databases of international organizations (e.g., WHO, OECD, Eurostat), limiting their use for research activities and policy making. This study aims at identifying and describing collection methods, quality assessment procedures, availability and accessibility of health data across EU Member States (MS) for HM and HSPA. Methods: A structured questionnaire was developed and administered through an online platform to partners of the InfAct consortium from EU MS to investigate data collections applied in HM and HSPA projects, as well as their methods and procedures. A descriptive analysis of the questionnaire results was performed. Results: Information on 91 projects from 18 EU MS was collected. In these projects, data were mainly collected through administrative sources, population health interview or health examination surveys and from electronic medical records. Tools and methods used for data collection were mostly mandatory reports, self-administered questionnaires, or record linkage of various data sources. One-third of the projects shared data with EU research networks and less than one-third performed quality assessment of their data collection procedures using international standardized criteria. Macrodata were accessible via open access and reusable in 22 projects. Microdata were accessible upon specific request and reusable in 15 projects based on data usage licenses. Metadata was available for the majority of the projects, but followed reporting standards only in 29 projects. Overall, compliance to FAIR Data principles (Findable, Accessible, Interoperable, and Reusable) was not optimal across the EU projects. 
Conclusions: Data collection and exchange procedures differ across EU MS and research data are not always available, accessible, comparable or reusable for further research and evidence-based policy making. There is a need for an EU-level health information infrastructure and governance to promote and facilitate sharing and dissemination of standardized and comparable health data, following FAIR Data principles, across the EU. Supplementary Information: The online version contains supplementary material available at 10.1186/s13690-021-00780-4.
Affiliation(s)
- Brigid Unim
- Department of Cardiovascular, Endocrine-metabolic Diseases and Aging, Istituto Superiore di Sanità, Via Giano della Bella 34, 00162, Rome, Italy.
| | - Eugenio Mattei
- Department of Cardiovascular, Endocrine-metabolic Diseases and Aging, Istituto Superiore di Sanità, Via Giano della Bella 34, 00162, Rome, Italy
| | - Flavia Carle
- Center of Epidemiology, Biostatistics and Medical Information, Marche Polytechnic University, Ancona, Italy
| | - Hanna Tolonen
- Department of Public Health and Welfare, Finnish Institute for Health and Welfare (THL), Helsinki, Finland
| | - Enrique Bernal-Delgado
- Data Sciences for Health Services and Policy Research Group, Institute for Health Sciences in Aragon (IACS), Zaragoza, Spain
| | - Peter Achterberg
- Centre for Health Knowledge Integration, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
| | - Metka Zaletel
- Health Data Centre, National Institute of Public Health, Ljubljana, Slovenia
| | - Stefanie Seeling
- Department of Epidemiology and Health Monitoring, Robert Koch Institute, Berlin, Germany
| | - Romana Haneef
- Department of Non-Communicable Diseases and Injuries, Santé Publique France, 94415, Saint-Maurice, France
| | | | - Herman Van Oyen
- Epidemiology and Public Health, Sciensano, Brussels, Belgium
| | - Luigi Palmieri
- Department of Cardiovascular, Endocrine-metabolic Diseases and Aging, Istituto Superiore di Sanità, Via Giano della Bella 34, 00162, Rome, Italy
| |
|
37
|
Romanchikova M, Thomas SA, Dexter A, Shaw M, Partarrieau I, Smith N, Venton J, Adeogun M, Brettle D, Turpin RJ. The need for measurement science in digital pathology. J Pathol Inform 2022; 13:100157. [PMID: 36405869 PMCID: PMC9646441 DOI: 10.1016/j.jpi.2022.100157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Accepted: 11/07/2022] [Indexed: 11/11/2022] Open
Abstract
Background: Pathology services experienced a surge in demand during the COVID-19 pandemic. Digitalisation of pathology workflows can help to increase throughput, yet many existing digitalisation solutions use non-standardised workflows captured in proprietary data formats and processed by black-box software, yielding data of varying quality. This study presents the views of a UK-led expert group on the barriers to adoption and the required input of measurement science to improve current practices in digital pathology. Methods: With an aim to support the UK's efforts in digitalisation of pathology services, this study comprised: (1) a review of existing evidence, (2) an online survey of domain experts, and (3) a workshop with 42 representatives from healthcare, regulatory bodies, pharmaceutical industry, academia, equipment, and software manufacturers. The discussion topics included sample processing, data interoperability, image analysis, equipment calibration, and use of novel imaging modalities. Findings: The lack of data interoperability within the digital pathology workflows hinders data lookup and navigation, according to 80% of attendees. All participants stressed the importance of integrating imaging and non-imaging data for diagnosis, while 80% saw data integration as a priority challenge. 90% identified the benefits of artificial intelligence and machine learning, but identified the need for training and sound performance metrics. Methods for calibration and providing traceability were seen as essential to establish harmonised, reproducible sample processing, and image acquisition pipelines. Vendor-neutral data standards were seen as a "must-have" for providing meaningful data for downstream analysis. Users and vendors need good practice guidance on evaluation of uncertainty, fitness-for-purpose, and reproducibility of artificial intelligence/machine learning tools. All of the above needs to be accompanied by an upskilling of the pathology workforce. 
Conclusions: Digital pathology requires interoperable data formats, reproducible and comparable laboratory workflows, and trustworthy computer analysis software. Despite high interest in the use of novel imaging techniques and artificial intelligence tools, their adoption is slowed down by the lack of guidance and evaluation tools to assess the suitability of these techniques for specific clinical questions. Measurement science expertise in uncertainty estimation, standardisation, reference materials, and calibration can help establish reproducibility and comparability between laboratory procedures, yielding high quality data and providing higher confidence in diagnosis.
Affiliation(s)
- Marina Romanchikova
- National Physical Laboratory, Hampton Road, Teddington, Middlesex TW11 0LW, United Kingdom (corresponding author)
| | - Spencer Angus Thomas
- National Physical Laboratory, Hampton Road, Teddington, Middlesex TW11 0LW, United Kingdom
| | - Alex Dexter
- National Physical Laboratory, Hampton Road, Teddington, Middlesex TW11 0LW, United Kingdom
| | - Mike Shaw
- National Physical Laboratory, Hampton Road, Teddington, Middlesex TW11 0LW, United Kingdom
| | - Ignacio Partarrieau
- National Physical Laboratory, Hampton Road, Teddington, Middlesex TW11 0LW, United Kingdom
| | - Nadia Smith
- National Physical Laboratory, Hampton Road, Teddington, Middlesex TW11 0LW, United Kingdom
| | - Jenny Venton
- National Physical Laboratory, Hampton Road, Teddington, Middlesex TW11 0LW, United Kingdom
| | - Michael Adeogun
- National Physical Laboratory, Hampton Road, Teddington, Middlesex TW11 0LW, United Kingdom
| | - David Brettle
- Leeds Teaching Hospitals NHS Trust, St. James's University Hospital, Beckett Street, Leeds, West Yorkshire LS9 7TF, United Kingdom
| | - Robert James Turpin
- British Standards Institution, 389 Chiswick High Road, London W4 4AL, United Kingdom
| |
|
38
|
Abrams MB, Bjaalie JG, Das S, Egan GF, Ghosh SS, Goscinski WJ, Grethe JS, Kotaleski JH, Ho ETW, Kennedy DN, Lanyon LJ, Leergaard TB, Mayberg HS, Milanesi L, Mouček R, Poline JB, Roy PK, Strother SC, Tang TB, Tiesinga P, Wachtler T, Wójcik DK, Martone ME. A Standards Organization for Open and FAIR Neuroscience: the International Neuroinformatics Coordinating Facility. Neuroinformatics 2022; 20:25-36. [PMID: 33506383 PMCID: PMC9036053 DOI: 10.1007/s12021-020-09509-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Accepted: 12/28/2020] [Indexed: 01/07/2023]
Abstract
There is great need for coordination around standards and best practices in neuroscience to support efforts to make neuroscience a data-centric discipline. Major brain initiatives launched around the world are poised to generate huge stores of neuroscience data. At the same time, neuroscience, like many domains in biomedicine, is confronting the issues of transparency, rigor, and reproducibility. Widely used, validated standards and best practices are key to addressing the challenges in both big and small data science, as they are essential for integrating diverse data and for developing a robust, effective, and sustainable infrastructure to support open and reproducible neuroscience. However, developing community standards and gaining their adoption is difficult. The current landscape is characterized both by a lack of robust, validated standards and a plethora of overlapping, underdeveloped, untested and underutilized standards and best practices. The International Neuroinformatics Coordinating Facility (INCF), an independent organization dedicated to promoting data sharing through the coordination of infrastructure and standards, has recently implemented a formal procedure for evaluating and endorsing community standards and best practices in support of the FAIR principles. By formally serving as a standards organization dedicated to open and FAIR neuroscience, INCF helps evaluate, promulgate, and coordinate standards and best practices across neuroscience. Here, we provide an overview of the process and discuss how neuroscience can benefit from having a dedicated standards body.
Affiliation(s)
| | - Jan G. Bjaalie
- Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway
| | - Samir Das
- McGill Centre for Integrative Neuroscience, McGill University, Montreal, QC Canada
| | - Gary F. Egan
- Monash Biomedical Imaging, Monash University, Clayton, VIC Australia
| | - Satrajit S. Ghosh
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA USA; Department of Otolaryngology - Head and Neck Surgery, Harvard Medical School, Boston, MA USA
| | | | - Jeffrey S. Grethe
- Department of Neuroscience, School of Medicine, University of California, San Diego, La Jolla, CA USA
| | | | - Eric Tatt Wei Ho
- Centre for Intelligent Signal and Imaging Research, Institute of Health and Analytics, Universiti Teknologi PETRONAS, Perak, Malaysia
| | - David N. Kennedy
- Department of Psychiatry, University of Massachusetts Medical School, Worcester, MA USA
| | | | | | - Helen S. Mayberg
- Nash Family Center for Advanced Circuit Therapeutics, Icahn School of Medicine, New York, NY USA
| | - Luciano Milanesi
- Institute of Biomedical Technologies, National Research Council (CNR), Milan, Italy
| | - Roman Mouček
- Department of Computer Science and Engineering, Faculty of Applied Sciences, University of West Bohemia, Pilsen, Czech Republic
| | - J. B. Poline
- Montreal Neurological Institute, Faculty of Medicine and Health Sciences, McGill University, Montreal, Canada
| | - Prasun K. Roy
- Computational Neuroscience & Neuroimaging Laboratory, School of Bio-Medical Engineering, Indian Institute of Technology (BHU), Varanasi, UP India
| | - Stephen C. Strother
- Rotman Research Institute, Baycrest Centre, Department of Medical Biophysics, University of Toronto, Ontario, ON Canada
| | - Tong Boon Tang
- Centre for Intelligent Signal and Imaging Research, Institute of Health and Analytics, Universiti Teknologi PETRONAS, Bandar Seri Iskandar, Malaysia
| | - Paul Tiesinga
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
| | - Thomas Wachtler
- Department of Biology II, Ludwig-Maximilians-Universität München, Martinsried, Planegg Germany
| | - Daniel K. Wójcik
- Laboratory of Neuroinformatics, Nencki Institute of Experimental Biology of Polish Academy of Sciences, Warsaw, Poland
| | - Maryann E. Martone
- Department of Neuroscience, School of Medicine, University of California, San Diego, La Jolla, CA USA
| |
|
39
|
Alper BS, Flynn A, Bray BE, Conte ML, Eldredge C, Gold S, Greenes RA, Haug P, Jacoby K, Koru G, McClay J, Sainvil ML, Sottara D, Tuttle M, Visweswaran S, Yurk RA. Categorizing metadata to help mobilize computable biomedical knowledge. Learn Health Syst 2022; 6:e10271. [PMID: 35036552 PMCID: PMC8753304 DOI: 10.1002/lrh2.10271] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/28/2020] [Revised: 04/03/2021] [Accepted: 04/24/2021] [Indexed: 12/03/2022] Open
Abstract
INTRODUCTION Computable biomedical knowledge artifacts (CBKs) are digital objects conveying biomedical knowledge in machine-interpretable structures. As more CBKs are produced and their complexity increases, the value obtained from sharing CBKs grows. Mobilizing CBKs and sharing them widely can only be achieved if the CBKs are findable, accessible, interoperable, reusable, and trustable (FAIR+T). To help mobilize CBKs, we describe our efforts to outline metadata categories to make CBKs FAIR+T. METHODS We examined the literature regarding metadata with the potential to make digital artifacts FAIR+T. We also examined metadata available online today for actual CBKs of 12 different types. With iterative refinement, we came to a consensus on key categories of metadata that, when taken together, can make CBKs FAIR+T. We use subject-predicate-object triples to more clearly differentiate metadata categories. RESULTS We defined 13 categories of CBK metadata most relevant to making CBKs FAIR+T. Eleven of these categories (type, domain, purpose, identification, location, CBK-to-CBK relationships, technical, authorization and rights management, provenance, evidential basis, and evidence from use metadata) are evident today where CBKs are stored online. Two additional categories (preservation and integrity metadata) were not evident in our examples. We provide a research agenda to guide further study and development of these and other metadata categories. CONCLUSION A wide variety of metadata elements in various categories is needed to make CBKs FAIR+T. More work is needed to develop a common framework for CBK metadata that can make CBKs FAIR+T for all stakeholders.
Affiliation(s)
| | - Allen Flynn
- Medical School, University of Michigan, Ann Arbor, Michigan, USA
| | - Bruce E. Bray
- Biomedical Informatics and Cardiovascular Medicine, School of Medicine, University of Utah, Salt Lake City, Utah, USA
| | - Marisa L. Conte
- Taubman Health Sciences Library, University of Michigan, Ann Arbor, Michigan, USA
| | | | - Sigfried Gold
- College of Information Studies, University of Maryland, College Park, Maryland, USA
| | | | - Peter Haug
- Intermountain Healthcare, University of Utah, Salt Lake City, Utah, USA
| | | | - Gunes Koru
- Department of Information Systems, University of Maryland, Baltimore, Maryland, USA
| | - James McClay
- Emergency Medicine, University of Nebraska Medical Center, Omaha, Nebraska, USA
| | | | | | | | - Shyam Visweswaran
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | | |
|
40
|
Gillespie TH, Tripathy SJ, Sy MF, Martone ME, Hill SL. The Neuron Phenotype Ontology: A FAIR Approach to Proposing and Classifying Neuronal Types. Neuroinformatics 2022; 20:793-809. [PMID: 35267146 PMCID: PMC9547803 DOI: 10.1007/s12021-022-09566-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Accepted: 01/19/2022] [Indexed: 12/31/2022]
Abstract
The challenge of defining and cataloging the building blocks of the brain requires a standardized approach to naming neurons and organizing knowledge about their properties. The US Brain Initiative Cell Census Network, Human Cell Atlas, Blue Brain Project, and others are generating vast amounts of data and characterizing large numbers of neurons throughout the nervous system. The neuroscientific literature contains many neuron names (e.g. parvalbumin-positive interneuron or layer 5 pyramidal cell) that are commonly used and generally accepted. However, it is often unclear how such common usage types relate to many evidence-based types that are proposed based on the results of new techniques. Further, comparing different types across labs remains a significant challenge. Here, we propose an interoperable knowledge representation, the Neuron Phenotype Ontology (NPO), that provides a standardized and automatable approach for naming cell types and normalizing their constituent phenotypes using identifiers from community ontologies as a common language. The NPO provides a framework for systematically organizing knowledge about cellular properties and enables interoperability with existing neuron naming schemes. We evaluate the NPO by populating a knowledge base with three independent cortical neuron classifications derived from published data sets that describe neurons according to molecular, morphological, electrophysiological, and synaptic properties. Competency queries to this knowledge base demonstrate that the NPO knowledge model enables interoperability between the three test cases and neuron names commonly used in the literature.
Affiliation(s)
| | - Shreejoy J. Tripathy
- Department of Psychiatry, University of Toronto, Toronto, ON Canada; Department of Physiology, University of Toronto, Toronto, ON Canada; Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON Canada
| | - Mohameth François Sy
- Blue Brain Project, École Polytechnique Fédérale de Lausanne (EPFL), Campus Biotech, 1202 Geneva, Switzerland
| | | | - Sean L. Hill
- Department of Psychiatry, University of Toronto, Toronto, ON Canada; Department of Physiology, University of Toronto, Toronto, ON Canada; Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON Canada; Blue Brain Project, École Polytechnique Fédérale de Lausanne (EPFL), Campus Biotech, 1202 Geneva, Switzerland
| |
|
41
|
Pigeot I, Bongaerts B, Eberle A, Katalinic A, Kieschke J, Luttmann S, Meyer M, Nennecke A, Rathmann W, Stabenow R, Wilsdorf-Köhler H, Kollhorst B, Reinders T. [Linkage of claims data with data from epidemiological cancer registries: possibilities and limitations in the German federal states]. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz 2021. [PMID: 34940893 DOI: 10.1007/s00103-021-03475-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/22/2021] [Accepted: 12/03/2021] [Indexed: 11/27/2022]
Abstract
Background In recent years, there have been growing calls to prepare research data for reuse in accordance with the so-called FAIR principles. This would allow future projects to draw on a broader data basis and, by linking different data sources, to investigate new research questions. Objectives The aim is to determine to what extent claims data from statutory health insurance funds can be linked across regions with data from the state cancer registries (Landeskrebsregister, LKR), in order to supplement the information on cancer that is missing from the claims data and to assess the validity of the tumour diagnosis information recorded there. The focus is on describing the state-specific requirements for such a data linkage. Materials and methods The data sources were the German Pharmacoepidemiological Research Database (GePaRD) of the Leibniz Institute for Prevention Research and Epidemiology – BIPS and six cancer registries. For comparison, the logistically demanding direct linkage procedure and a less demanding indirect linkage procedure were applied. For this purpose, approval had to be obtained from the respective competent authority for GePaRD and for each LKR. Results Regarding the linkage of LKR data with GePaRD, there were serious differences in data provision, ranging from complete refusal to straightforward implementation. Discussion In Germany, uniform framework conditions must be created to enable appropriate reuse and linkage of personal health data for research purposes in line with the FAIR principles. With regard to linking LKR data with other data sources, the new law on the consolidation of cancer registry data (Gesetz zur Zusammenführung von Krebsregisterdaten) could provide a remedy.
|
42
|
Bossa C, Andreoli C, Bakker M, Barone F, De Angelis I, Jeliazkova N, Nymark P, Battistelli CL. FAIRification of nanosafety data to improve applicability of (Q)SAR approaches: A case study on in vitro Comet assay genotoxicity data. Comput Toxicol 2021; 20:100190. [PMID: 34820591 PMCID: PMC8591730 DOI: 10.1016/j.comtox.2021.100190] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/07/2021] [Revised: 09/17/2021] [Accepted: 09/20/2021] [Indexed: 12/30/2022]
Abstract
(Quantitative) structure-activity relationship ([Q]SAR) methodologies are widely applied to predict the (eco)toxicological effects of chemicals, and their use is envisaged in different regulatory frameworks for filling data gaps of untested substances. However, their application to the risk assessment of nanomaterials is still limited, partly owing to the scarcity of large and curated experimental datasets. Despite a great amount of nanosafety data having been produced over the last decade in international collaborative initiatives, their interpretation, integration and reuse have been hampered by several obstacles, such as poorly described (meta)data, non-standard terminology, and a lack of harmonized reporting formats and criteria. Recently, the FAIR (Findable, Accessible, Interoperable, and Reusable) principles have been established to guide the scientific community in good data management and stewardship. The EU H2020 Gov4Nano project, together with other international projects and initiatives, is addressing the challenge of improving nanosafety data FAIRness, to maximize their availability, understanding, exchange and ultimately their reuse. These efforts are largely supported by the creation of a common Nanosafety Data Interface, which connects a series of project-specific databases applying the eNanoMapper data model. A wide variety of experimental data relating to characterization and effects of nanomaterials are stored in the database; however, the methods, protocols and parameters driving their generation are not fully mature. This article reports the progress of an ongoing case study in the Gov4Nano project on the reuse of in vitro Comet genotoxicity data, focusing on the issues and challenges encountered in their FAIRification through the eNanoMapper data model. The case study is part of an iterative process in which the FAIRification of data supports the understanding of the phenomena underlying their generation and, ultimately, improves their reusability.
Key Words
- (Q)SAR approaches
- (Q)SAR, (Quantitative) structure-activity relationship
- AOP, Adverse Outcome Pathway
- ECHA, European Chemicals Agency
- FAIR principles
- FAIR, Findable, Accessible, Interoperable, and Reusable
- Fpg, Formamidopyrimidine glycosylase
- Genotoxicity
- IATA, Integrated Approaches to Testing and Assessment
- ISA–Tab, Investigation/Study/Assay Tab-delimited
- JRC, Joint Research Centre
- MIRCA, Minimum Information for Reporting Comet Assay
- NMBP, Horizon 2020 Advisory Group for Nanotechnologies, Advanced Materials, Biotechnology and Advanced Manufacturing and Processing
- NMBP-13-2018 projects, Gov4Nano, NANORIGO and RiskGONE
- NMs, nanomaterials
- Nano-EHS, Nano Environment, Health and Safety
- Nanomaterials
- Nanosafety data
- OECD, Organisation for Economic Co-operation and Development
- OTM, Olive tail moment
- REACH, Registration, Evaluation, Authorisation and Restriction of Chemicals
- SCGE, Single Cell Gel Electrophoresis
- SOPs, Standard Operating Procedures
- in vitro Comet assay
Affiliation(s)
- Cecilia Bossa
- Environment and Health Department, Istituto Superiore di Sanità, Rome, Italy
| | - Cristina Andreoli
- Environment and Health Department, Istituto Superiore di Sanità, Rome, Italy
| | - Martine Bakker
- Centre for Safety of Substances and Products, National Institute of Public Health and the Environment (RIVM), Bilthoven, the Netherlands
| | - Flavia Barone
- Environment and Health Department, Istituto Superiore di Sanità, Rome, Italy
| | - Isabella De Angelis
- Environment and Health Department, Istituto Superiore di Sanità, Rome, Italy
| | | | - Penny Nymark
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
| | | |
|
43
|
Mayer G, Müller W, Schork K, Uszkoreit J, Weidemann A, Wittig U, Rey M, Quast C, Felden J, Glöckner FO, Lange M, Arend D, Beier S, Junker A, Scholz U, Schüler D, Kestler HA, Wibberg D, Pühler A, Twardziok S, Eils J, Eils R, Hoffmann S, Eisenacher M, Turewicz M. Implementing FAIR data management within the German Network for Bioinformatics Infrastructure (de.NBI) exemplified by selected use cases. Brief Bioinform 2021; 22:bbab010. [PMID: 33589928 PMCID: PMC8425304 DOI: 10.1093/bib/bbab010] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/17/2020] [Revised: 12/21/2020] [Accepted: 01/06/2021] [Indexed: 12/21/2022] Open
Abstract
This article describes use case studies and self-assessments of the FAIR status of de.NBI services to illustrate the challenges and requirements of adhering to the FAIR (findable, accessible, interoperable and reusable) data principles in a large distributed bioinformatics infrastructure. We address the challenge of heterogeneity of wet lab technologies, data, metadata, software, computational workflows and the levels of implementation and monitoring of FAIR principles within the different bioinformatics sub-disciplines joined in de.NBI. On the one hand, this broad service landscape and the excellent network of experts are a strong basis for the development of useful research data management plans. On the other hand, the large number of tools and techniques maintained by distributed teams renders FAIR compliance challenging.
Affiliation(s)
- Gerhard Mayer
- Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Ruhr University Bochum, Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Bochum, Germany
- Ulm University, Institute of Medical Systems Biology, Ulm, Germany
| | - Wolfgang Müller
- Heidelberg Institute for Theoretical Studies (HITS gGmbH), Scientific Databases and Visualization Group, Heidelberg, Germany
| | - Karin Schork
- Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Ruhr University Bochum, Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Bochum, Germany
| | - Julian Uszkoreit
- Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Ruhr University Bochum, Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Bochum, Germany
| | - Andreas Weidemann
- Heidelberg Institute for Theoretical Studies (HITS gGmbH), Scientific Databases and Visualization Group, Heidelberg, Germany
| | - Ulrike Wittig
- Heidelberg Institute for Theoretical Studies (HITS gGmbH), Scientific Databases and Visualization Group, Heidelberg, Germany
| | - Maja Rey
- Heidelberg Institute for Theoretical Studies (HITS gGmbH), Scientific Databases and Visualization Group, Heidelberg, Germany
| | | | - Janine Felden
- Jacobs University Bremen gGmbH, Bremen, Germany
- University of Bremen, MARUM - Center for Marine Environmental Sciences, Bremen, Germany
| | - Frank Oliver Glöckner
- Jacobs University Bremen gGmbH, Bremen, Germany
- University of Bremen, MARUM - Center for Marine Environmental Sciences, Bremen, Germany
- Alfred Wegener Institute - Helmholtz Center for Polar- and Marine Research, Bremerhaven, Germany
| | - Matthias Lange
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Daniel Arend
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Sebastian Beier
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Astrid Junker
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Danuta Schüler
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Hans A Kestler
- Ulm University, Institute of Medical Systems Biology, Ulm, Germany
- Leibniz Institute on Ageing - Fritz Lipmann Institute, Jena
| | - Daniel Wibberg
- Bielefeld University, Center for Biotechnology (CeBiTec), Bielefeld, Germany
| | - Alfred Pühler
- Bielefeld University, Center for Biotechnology (CeBiTec), Bielefeld, Germany
| | - Sven Twardziok
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Center for Digital Health, Berlin, Germany
| | - Jürgen Eils
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Center for Digital Health, Berlin, Germany
| | - Roland Eils
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Center for Digital Health, Berlin, Germany
- Heidelberg University Hospital and BioQuant, Health Data Science Unit, Heidelberg, Germany
| | - Steve Hoffmann
- Leibniz Institute on Ageing - Fritz Lipmann Institute, Jena
| | - Martin Eisenacher
- Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Ruhr University Bochum, Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Bochum, Germany
| | - Michael Turewicz
- Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Ruhr University Bochum, Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Bochum, Germany
| |
|
44
|
Mann M, Kumar C, Zeng WF, Strauss MT. Artificial intelligence for proteomics and biomarker discovery. Cell Syst 2021; 12:759-770. [PMID: 34411543 DOI: 10.1016/j.cels.2021.06.006] [Citation(s) in RCA: 71] [Impact Index Per Article: 23.7] [Received: 03/04/2021] [Revised: 05/07/2021] [Accepted: 06/28/2021] [Indexed: 12/14/2022]
Abstract
There is an avalanche of biomedical data generation and a parallel expansion in computational capabilities to analyze and make sense of these data. Starting with genome sequencing and widely employed deep sequencing technologies, these trends have now taken hold in all omics disciplines and increasingly call for multi-omics integration as well as data interpretation by artificial intelligence technologies. Here, we focus on mass spectrometry (MS)-based proteomics and describe how machine learning and, in particular, deep learning now predicts experimental peptide measurements from amino acid sequences alone. This will dramatically improve the quality and reliability of analytical workflows because experimental results should agree with predictions in a multi-dimensional data landscape. Machine learning has also become central to biomarker discovery from proteomics data, which now starts to outperform existing best-in-class assays. Finally, we discuss model transparency and explainability and data privacy that are required to deploy MS-based biomarkers in clinical settings.
Affiliation(s)
- Matthias Mann
- Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany.
| | - Chanchal Kumar
- Translational Science & Experimental Medicine, Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden.
| | - Wen-Feng Zeng
- Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany.
| | | |
|
45
|
Roselli M, Natella F, Zinno P, Guantario B, Canali R, Schifano E, De Angelis M, Nikoloudaki O, Gobbetti M, Perozzi G, Devirgiliis C. Colonization Ability and Impact on Human Gut Microbiota of Foodborne Microbes From Traditional or Probiotic-Added Fermented Foods: A Systematic Review. Front Nutr 2021; 8:689084. [PMID: 34395494 PMCID: PMC8360115 DOI: 10.3389/fnut.2021.689084] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 03/31/2021] [Accepted: 07/02/2021] [Indexed: 12/22/2022] Open
Abstract
A large subset of fermented foods act as vehicles of live environmental microbes, which often contribute food quality assets to the overall diet, such as health-associated microbial metabolites. Foodborne microorganisms also carry the potential to interact with the human gut microbiome via the food chain. However, scientific results describing the microbial flow connecting such different microbiomes, as well as their impact on human health, are still fragmented. The aim of this systematic review is to provide a knowledge base on the scientific literature addressing the connection between foodborne and gut microbiomes, and to identify gaps where more research is needed to clarify and map gut microorganisms originating from fermented foods, either traditional or supplemented with probiotics, their possible impact on human gut microbiota composition, and to what extent foodborne microbes might be able to colonize the gut environment. An additional aim was to highlight experimental approaches and study designs that could be better standardized to improve comparative analysis of published datasets. Overall, the results presented in this systematic review suggest that a complex interplay between food and gut microbiota is indeed occurring, although the possible mechanisms for this interaction, as well as how it can impact human health, remain a puzzling picture. Further research employing standardized and trans-disciplinary approaches aimed at understanding how fermented foods can be tailored to positively influence human gut microbiota and, in turn, host health, is therefore of pivotal importance.
Affiliation(s)
- Marianna Roselli
- Research Centre for Food and Nutrition, CREA (Council for Agricultural Research and Economics), Rome, Italy
| | - Fausta Natella
- Research Centre for Food and Nutrition, CREA (Council for Agricultural Research and Economics), Rome, Italy
| | - Paola Zinno
- Research Centre for Food and Nutrition, CREA (Council for Agricultural Research and Economics), Rome, Italy
| | - Barbara Guantario
- Research Centre for Food and Nutrition, CREA (Council for Agricultural Research and Economics), Rome, Italy
| | - Raffaella Canali
- Research Centre for Food and Nutrition, CREA (Council for Agricultural Research and Economics), Rome, Italy
| | - Emily Schifano
- Research Centre for Food and Nutrition, CREA (Council for Agricultural Research and Economics), Rome, Italy
| | - Maria De Angelis
- Department of Soil, Plant and Food Science, University of Bari Aldo Moro, Bari, Italy
| | - Olga Nikoloudaki
- Faculty of Science and Technology, Free University of Bozen-Bolzano, Bolzano, Italy
| | - Marco Gobbetti
- Faculty of Science and Technology, Free University of Bozen-Bolzano, Bolzano, Italy
| | - Giuditta Perozzi
- Research Centre for Food and Nutrition, CREA (Council for Agricultural Research and Economics), Rome, Italy
| | - Chiara Devirgiliis
- Research Centre for Food and Nutrition, CREA (Council for Agricultural Research and Economics), Rome, Italy
| |
|
46
|
Schmidt CO, Fluck J, Golebiewski M, Grabenhenrich L, Hahn H, Kirsten T, Klammt S, Löbe M, Sax U, Thun S, Pigeot I. [Making COVID-19 research data more accessible-building a nationwide information infrastructure]. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz 2021; 64:1084-1092. [PMID: 34297162 PMCID: PMC8298983 DOI: 10.1007/s00103-021-03386-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2021] [Accepted: 06/28/2021] [Indexed: 11/24/2022]
Abstract
Public health research and epidemiological and clinical studies are needed to better understand the COVID-19 pandemic and to take appropriate action. Numerous research projects have therefore been initiated in Germany as well. At present, however, the sheer volume of information makes it almost impossible to keep an overview of the diverse research activities and their results. Within the initiative "National Research Data Infrastructure for Personal Health Data" (NFDI4Health), the "Task Force COVID-19" is creating easier access to SARS-CoV-2- and COVID-19-related clinical, epidemiological, and public health research data. In doing so, it takes into account the FAIR principles (Findable, Accessible, Interoperable, Reusable), which are intended to promote faster communication of results. The task force's main work includes building a study portal with metadata, survey instruments, study documents, study results, and publications, as well as a search engine for preprint publications. Further work comprises a concept for linking research data with routine data, services for improved handling of image data, and the application of standardized analysis routines for harmonized quality assessments. The infrastructure under construction makes German COVID-19 research easier to find and to work with. The developments begun within the NFDI4Health Task Force COVID-19 can be reused for other research topics, because the challenges addressed are generic to the findability and handling of research data.
Affiliation(s)
- Carsten Oliver Schmidt
- Institut für Community Medicine, Universitätsmedizin Greifswald, Walther-Rathenau-Str. 48, 17475, Greifswald, Germany.
| | - Juliane Fluck
- ZB MED - Informationszentrum Lebenswissenschaften, Bonn, Germany; Institut für Geodäsie und Geoinformation, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany; Abteilung Bioinformatik, Fraunhofer Institut SCAI, Sankt Augustin, Germany
| | - Martin Golebiewski
- Heidelberger Institut für Theoretische Studien (HITS), Heidelberg, Germany
| | | | - Horst Hahn
- Institut für Digitale Medizin, Fraunhofer MEVIS, Bremen, Germany; Jacobs University, Bremen, Germany
| | - Toralf Kirsten
- Fakultät Angewandte Computer- und Biowissenschaften, Hochschule Mittweida, Mittweida, Germany; Institut für Medical Data Science, Universitätsmedizin Leipzig, Leipzig, Germany
| | - Sebastian Klammt
- Netzwerk der Koordinierungszentren für Klinische Studien - KKS-Netzwerk e. V., Berlin, Germany
| | - Matthias Löbe
- Institut für Medizinische Informatik, Statistik und Epidemiologie (IMISE), Universität Leipzig, Leipzig, Germany
| | - Ulrich Sax
- Institut für Medizinische Informatik, Universitätsmedizin Göttingen, Göttingen, Germany
| | - Sylvia Thun
- Berlin Institute of Health at Charité, Universitätsmedizin Berlin, Berlin, Germany
| | - Iris Pigeot
- Leibniz-Institut für Präventionsforschung und Epidemiologie - BIPS, Bremen, Germany; Fachbereich Mathematik und Informatik, Universität Bremen, Bremen, Germany
| | | |
|
47
|
Fursin G. Collective knowledge: organizing research projects as a database of reusable components and portable workflows with common interfaces. Philos Trans A Math Phys Eng Sci 2021; 379:20200211. [PMID: 33775147 DOI: 10.1098/rsta.2020.0211] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 10/14/2020] [Indexed: 06/12/2023]
Abstract
This article provides the motivation and overview of the Collective Knowledge Framework (CK or cKnowledge). The CK concept is to decompose research projects into reusable components that encapsulate research artifacts and provide unified application programming interfaces (APIs), command-line interfaces (CLIs), meta descriptions and common automation actions for related artifacts. The CK framework is used to organize and manage research projects as a database of such components. Inspired by the USB 'plug and play' approach for hardware, CK also helps to assemble portable workflows that can automatically plug in compatible components from different users and vendors (models, datasets, frameworks, compilers, tools). Such workflows can build and run algorithms on different platforms and environments in a unified way using the customizable CK program pipeline with software detection plugins and the automatic installation of missing packages. This article presents a number of industrial projects in which the modular CK approach was successfully validated in order to automate benchmarking, auto-tuning and co-design of efficient software and hardware for machine learning and artificial intelligence in terms of speed, accuracy, energy, size and various costs. The CK framework also helped to automate the artifact evaluation process at several computer science conferences as well as to make it easier to reproduce, compare and reuse research techniques from published papers, deploy them in production, and automatically adapt them to continuously changing datasets, models and systems. The long-term goal is to accelerate innovation by connecting researchers and practitioners to share and reuse all their knowledge, best practices, artifacts, workflows and experimental results in a common, portable and reproducible format at https://cKnowledge.io/. 
This article is part of the theme issue 'Reliability and reproducibility in computational science: implementing verification, validation and uncertainty quantification in silico'.
|
48
|
Dimitrova M, Meyer R, Buttigieg PL, Georgiev T, Zhelezov G, Demirov S, Smith V, Penev L. A streamlined workflow for conversion, peer review, and publication of genomics metadata as omics data papers. Gigascience 2021; 10:6275150. [PMID: 33983435 PMCID: PMC8117446 DOI: 10.1093/gigascience/giab034] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/22/2021] [Revised: 11/30/2020] [Accepted: 04/20/2021] [Indexed: 12/31/2022] Open
Abstract
Background: Data papers have emerged as a powerful instrument for open data publishing, obtaining credit, and establishing priority for datasets generated in scientific experiments. Academic publishing improves data and metadata quality through peer review and increases the impact of datasets by enhancing their visibility, accessibility, and reusability. Objective: We aimed to establish a new type of article structure and template for omics studies: the omics data paper. To improve data interoperability and further incentivize researchers to publish well-described datasets, we created a prototype workflow for streamlined import of genomics metadata from the European Nucleotide Archive directly into a data paper manuscript. Methods: An omics data paper template was designed by defining key article sections that encourage the description of omics datasets and methodologies. A metadata import workflow, based on REpresentational State Transfer services and XPath, was prototyped to extract information from the European Nucleotide Archive, ArrayExpress, and BioSamples databases. Findings: The template and workflow for automatic import of standard-compliant metadata into an omics data paper manuscript provide a mechanism for enhancing existing metadata through publishing. Conclusion: The omics data paper structure and workflow for import of genomics metadata will help to bring genomic and other omics datasets into the spotlight. Promoting enhanced metadata descriptions and enforcing manuscript peer review and data auditing of the underlying datasets brings additional quality to datasets. We hope that streamlined metadata reuse for scholarly publishing encourages authors to create enhanced metadata descriptions in the form of data papers to improve both the quality of their metadata and its findability and accessibility.
Affiliation(s)
- Mariya Dimitrova
- Pensoft Publishers, Prof. Georgi Zlatarski Street 12, 1700 Sofia, Bulgaria; Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Acad. G. Bonchev St., Block 25A, 1113 Sofia, Bulgaria
| | - Raïssa Meyer
- Alfred-Wegener-Institut, Helmholtz-Zentrum für Polar- und Meeresforschung, Am Handelshafen 12, 27570 Bremerhaven, Germany
| | - Pier Luigi Buttigieg
- Alfred-Wegener-Institut, Helmholtz-Zentrum für Polar- und Meeresforschung, Am Handelshafen 12, 27570 Bremerhaven, Germany
| | - Teodor Georgiev
- Pensoft Publishers, Prof. Georgi Zlatarski Street 12, 1700 Sofia, Bulgaria
| | - Georgi Zhelezov
- Pensoft Publishers, Prof. Georgi Zlatarski Street 12, 1700 Sofia, Bulgaria
| | | | - Vincent Smith
- The Natural History Museum, Cromwell Rd, South Kensington, SW7 5BD London, UK
| | - Lyubomir Penev
- Pensoft Publishers, Prof. Georgi Zlatarski Street 12, 1700 Sofia, Bulgaria; Institute of Biodiversity and Ecosystem Research, Bulgarian Academy of Sciences, 2 Gagarin St., 1113 Sofia, Bulgaria
| |
|
49
|
Burley SK, Berman HM. Open-access data: A cornerstone for artificial intelligence approaches to protein structure prediction. Structure 2021; 29:515-520. [PMID: 33984281 DOI: 10.1016/j.str.2021.04.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 03/16/2021] [Revised: 04/08/2021] [Accepted: 04/23/2021] [Indexed: 12/28/2022]
Abstract
The Protein Data Bank (PDB) was established in 1971 to archive three-dimensional (3D) structures of biological macromolecules as a public good. Fifty years later, the PDB is providing millions of data consumers around the world with open access to more than 175,000 experimentally determined structures of proteins and nucleic acids (DNA, RNA) and their complexes with one another and small-molecule ligands. PDB data users are working, teaching, and learning in fundamental biology, biomedicine, bioengineering, biotechnology, and energy sciences. They also represent the fields of agriculture, chemistry, physics and materials science, mathematics, statistics, computer science, and zoology, and even the social sciences. The enormous wealth of 3D structure data stored in the PDB has underpinned significant advances in our understanding of protein architecture, culminating in recent breakthroughs in protein structure prediction accelerated by artificial intelligence approaches and deep or machine learning methods.
Affiliation(s)
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA 92093, USA.
| | - Helen M Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; The Bridge Institute, Michelson Center for Convergent Bioscience, University of Southern California, Los Angeles, CA 90089, USA.
| |
|
50
|
Aime MC, Miller AN, Aoki T, Bensch K, Cai L, Crous PW, Hawksworth DL, Hyde KD, Kirk PM, Lücking R, May TW, Malosso E, Redhead SA, Rossman AY, Stadler M, Thines M, Yurkov AM, Zhang N, Schoch CL. How to publish a new fungal species, or name, version 3.0. IMA Fungus 2021; 12:11. [PMID: 33934723 PMCID: PMC8091500 DOI: 10.1186/s43008-021-00063-1] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 12/13/2020] [Accepted: 04/08/2021] [Indexed: 12/19/2022] Open
Abstract
It is now a decade since The International Commission on the Taxonomy of Fungi (ICTF) produced an overview of requirements and best practices for describing a new fungal species. In the meantime the International Code of Nomenclature for algae, fungi, and plants (ICNafp) has changed from its former name (the International Code of Botanical Nomenclature) and introduced new formal requirements for valid publication of species scientific names, including the separation of provisions specific to Fungi and organisms treated as fungi in a new Chapter F. Equally transformative have been changes in the data collection, data dissemination, and analytical tools available to mycologists. This paper provides an updated and expanded discussion of current publication requirements along with best practices for the description of new fungal species and publication of new names and for improving accessibility of their associated metadata that have developed over the last 10 years. Additionally, we provide: (1) model papers for different fungal groups and circumstances; (2) a checklist to simplify meeting (i) the requirements of the ICNafp to ensure the effective, valid and legitimate publication of names of new taxa, and (ii) minimally accepted standards for description; and, (3) templates for preparing standardized species descriptions.
Affiliation(s)
- M Catherine Aime
- Department of Botany and Plant Pathology, Purdue University, West Lafayette, IN, 47907, USA.
| | - Andrew N Miller
- Illinois Natural History Survey, University of Illinois Urbana-Champaign, Champaign, IL, 61820, USA
| | - Takayuki Aoki
- Genetic Resources Center, National Agriculture and Food Research Organization, 2-1-2 Kannondai, Tsukuba, Ibaraki, 305-8602, Japan
| | - Konstanze Bensch
- Westerdijk Fungal Biodiversity Institute, Uppsalalaan 8, 3584CT, Utrecht, the Netherlands
| | - Lei Cai
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, NO.1 Beichen West Road, Chaoyang District, Beijing, 100101, China
| | - Pedro W Crous
- Westerdijk Fungal Biodiversity Institute, Uppsalalaan 8, 3584CT, Utrecht, the Netherlands
| | - David L Hawksworth
- Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew, Surrey, TW9 3DS, UK; Department of Life Sciences, The Natural History Museum, Cromwell Road, London, SW7 5BD, UK; Jilin Agricultural University, Changchun, 130118, Jilin Province, China
| | - Kevin D Hyde
- Center of Excellence in Fungal Research, Mae Fah Luang University, Chiang Rai, 57100, Thailand
| | - Paul M Kirk
- Biodiversity Informatics & Spatial Analysis, Royal Botanic Garden Kew, Richmond, London, TW9 3AE, UK
| | - Robert Lücking
- Botanischer Garten und Botanisches Museum, Freie Universität Berlin, Königin-Luise-Str. 6-8, 14195, Berlin, Germany
| | - Tom W May
- Royal Botanic Gardens Victoria, Birdwood Avenue, Melbourne, Victoria, 3004, Australia
| | - Elaine Malosso
- Departamento de Micologia, Centro de Biociências, Universidade Federal de Pernambuco, Recife, PE, 50740-600, Brazil
| | - Scott A Redhead
- Ottawa Research and Development Centre, Science and Technology Branch, Agriculture and Agri-Food Canada, Ottawa, Ontario, K1A 0C6, Canada
| | - Amy Y Rossman
- Botany and Plant Pathology Department, Oregon State University, Corvallis, OR, 97333, USA
| | - Marc Stadler
- Department Microbial Drugs, Helmholtz Centre for Infection Research, Inhoffenstrasse 7, 38124, Braunschweig, Germany
| | - Marco Thines
- Department of Biological Sciences, Institute of Ecology, Evolution and Diversity, Goethe University, Max-von-Laue-Str. 13, 60438, Frankfurt am Main, Germany; Senckenberg Biodiversity and Climate Research Centre, Senckenberganlage 25, 60325, Frankfurt am Main, Germany
| | - Andrey M Yurkov
- Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Ning Zhang
- Department of Plant Biology, Rutgers University, New Brunswick, NJ, 08901, USA
| | - Conrad L Schoch
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD, 20892, USA
| |
|