Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Field D, Sansone SA, Collis A, Booth T, Dukes P, Gregurick SK, Kennedy K, Kolar P, Kolker E, Maxon M, Millard S, Mugabushaka AM, Perrin N, Remacle JE, Remington K, Rocca-Serra P, Taylor CF, Thorley M, Tiwari B, Wilbanks J. Megascience. 'Omics data sharing. Science 2009;326:234-6. [PMID: 19815759 PMCID: PMC2770171 DOI: 10.1126/science.1180598] [Citation(s) in RCA: 93] [Impact Index Per Article: 6.2] [Indexed: 11/02/2022]

Cited by Other Article(s)

Emissah H, Ljungquist B, Ascoli GA. Bibliometric analysis of neuroscience publications quantifies the impact of data sharing. Bioinformatics 2023;39:btad746. [PMID: 38070153 PMCID: PMC10733721 DOI: 10.1093/bioinformatics/btad746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 11/01/2023] [Accepted: 12/07/2023] [Indexed: 12/19/2023] Open

Falconnier C, Caparros-Roissard A, Decraene C, Lutz PE. Functional genomic mechanisms of opioid action and opioid use disorder: a systematic review of animal models and human studies. Mol Psychiatry 2023;28:4568-4584. [PMID: 37723284 PMCID: PMC10914629 DOI: 10.1038/s41380-023-02238-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 08/17/2023] [Accepted: 08/24/2023] [Indexed: 09/20/2023]

Abstract

In the past two decades, over-prescription of opioids for pain management has driven a steep increase in opioid use disorder (OUD) and death by overdose, exerting a dramatic toll on western countries. OUD is a chronic relapsing disease associated with a lifetime struggle to control drug consumption, suggesting that opioids trigger long-lasting brain adaptations, notably through functional genomic and epigenomic mechanisms. Current understanding of these processes, however, remains scarce, and has not previously been reviewed systematically. The goal of the present work was therefore to synthesize current knowledge on genome-wide transcriptomic and epigenetic mechanisms of opioid action, in primate and rodent species. Using a prospectively registered methodology, comprehensive literature searches were completed in PubMed, Embase, and Web of Science. Of the 2709 articles identified, 73 met our inclusion criteria and were considered for qualitative analysis. Focusing on the 5 most studied nervous system structures (nucleus accumbens, frontal cortex, whole striatum, dorsal striatum, spinal cord; 44 articles), we also conducted a quantitative analysis of differentially expressed genes, in an effort to identify a putative core transcriptional signature of opioids. Only one gene, Cdkn1a, was consistently identified in eleven studies, and globally, our results unveil surprisingly low consistency across published work, even when considering the most recent single-cell approaches. Analysis of sources of variability detected significant contributions from species, brain structure, duration of opioid exposure, strain, time-point of analysis, and batch effects, but not type of opioid. To go beyond those limitations, we leveraged threshold-free methods to illustrate how genome-wide comparisons may generate new findings and hypotheses. Finally, we discuss current methodological developments in the field, and their implications for future research and, ultimately, better care.

Emissah H, Ljungquist B, Ascoli GA. Bibliometric analysis of neuroscience publications quantifies the impact of data sharing. bioRxiv 2023:2023.09.12.557386. [PMID: 37745378 PMCID: PMC10515804 DOI: 10.1101/2023.09.12.557386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]

Shea MM, Kuppermann J, Rogers MP, Smith DS, Edwards P, Boehm AB. Systematic review of marine environmental DNA metabarcoding studies: toward best practices for data usability and accessibility. PeerJ 2023;11:e14993. [PMID: 36992947 PMCID: PMC10042160 DOI: 10.7717/peerj.14993] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/04/2022] [Accepted: 02/12/2023] [Indexed: 03/31/2023] Open

Assidi M, Buhmeida A, Budowle B. Medicine and health of 21st Century: Not just a high biotech-driven solution. NPJ Genom Med 2022;7:67. [PMID: 36379953 PMCID: PMC9666643 DOI: 10.1038/s41525-022-00336-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Accepted: 10/27/2022] [Indexed: 11/16/2022] Open

Lou RN, Therkildsen NO. Batch effects in population genomic studies with low-coverage whole genome sequencing data: Causes, detection and mitigation. Mol Ecol Resour 2021;22:1678-1692. [PMID: 34825778 DOI: 10.1111/1755-0998.13559] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/22/2021] [Revised: 11/05/2021] [Accepted: 11/11/2021] [Indexed: 01/04/2023]

Arribas P, Andújar C, Bidartondo MI, Bohmann K, Coissac É, Creer S, deWaard JR, Elbrecht V, Ficetola GF, Goberna M, Kennedy S, Krehenwinkel H, Leese F, Novotny V, Ronquist F, Yu DW, Zinger L, Creedy TJ, Meramveliotakis E, Noguerales V, Overcast I, Morlon H, Vogler AP, Papadopoulou A, Emerson BC. Connecting high-throughput biodiversity inventories: Opportunities for a site-based genomic framework for global integration and synthesis. Mol Ecol 2021;30:1120-1135. [PMID: 33432777 PMCID: PMC7986105 DOI: 10.1111/mec.15797] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 07/13/2020] [Revised: 12/21/2020] [Accepted: 01/05/2021] [Indexed: 01/03/2023]

Affiliation(s)

Paula Arribas Island Ecology and Evolution Research Group, Instituto de Productos Naturales y Agrobiología (IPNA‐CSIC), San Cristóbal de la Laguna, Spain
Carmelo Andújar Island Ecology and Evolution Research Group, Instituto de Productos Naturales y Agrobiología (IPNA‐CSIC), San Cristóbal de la Laguna, Spain
Martin I. Bidartondo Department of Life Sciences, Imperial College London, London, UK; Comparative Plant and Fungal Biology, Royal Botanic Gardens, London, UK
Kristine Bohmann Section for Evolutionary Genomics, Faculty of Health and Medical Sciences, Globe Institute, University of Copenhagen, Copenhagen, Denmark
Éric Coissac Université Grenoble Alpes, CNRS, Université Savoie Mont Blanc, LECA, Laboratoire d’Ecologie Alpine, Grenoble, France
Simon Creer School of Natural Sciences, Bangor University, Gwynedd, UK
Jeremy R. deWaard Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada; School of Environmental Sciences, University of Guelph, Guelph, Canada
Vasco Elbrecht Centre for Biodiversity Monitoring (ZBM), Zoological Research Museum Alexander Koenig, Bonn, Germany
Gentile F. Ficetola Université Grenoble Alpes, CNRS, Université Savoie Mont Blanc, LECA, Laboratoire d’Ecologie Alpine, Grenoble, France; Department of Environmental Sciences and Policy, University of Milano, Milano, Italy
Marta Goberna Department of Environment and Agronomy, INIA, Madrid, Spain
Susan Kennedy Biodiversity and Biocomplexity Unit, Okinawa Institute of Science and Technology Graduate University, Onna‐son, Japan; Department of Biogeography, Trier University, Trier, Germany
Henrik Krehenwinkel Department of Biogeography, Trier University, Trier, Germany
Florian Leese Aquatic Ecosystem Research, Faculty of Biology, University of Duisburg‐Essen, Essen, Germany; Centre for Water and Environmental Research (ZWU) Essen, University of Duisburg‐Essen, Essen, Germany
Vojtech Novotny Biology Centre, Institute of Entomology, Czech Academy of Sciences, Ceske Budejovice, Czech Republic; Faculty of Science, University of South Bohemia, Ceske Budejovice, Czech Republic
Fredrik Ronquist Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
Douglas W. Yu State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China; School of Biological Sciences, University of East Anglia, Norwich, UK
Lucie Zinger Institut de Biologie de l’ENS (IBENS), Département de biologie, École normale supérieure, CNRS, INSERM, Université PSL, Paris, France
Thomas J. Creedy Department of Life Sciences, Natural History Museum, London, UK
Emmanouil Meramveliotakis Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
Víctor Noguerales Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
Isaac Overcast Institut de Biologie de l’ENS (IBENS), Département de biologie, École normale supérieure, CNRS, INSERM, Université PSL, Paris, France; Division of Vertebrate Zoology, American Museum of Natural History, New York, USA
Hélène Morlon Institut de Biologie de l’ENS (IBENS), Département de biologie, École normale supérieure, CNRS, INSERM, Université PSL, Paris, France
Alfried P. Vogler Department of Life Sciences, Imperial College London, London, UK; Department of Life Sciences, Natural History Museum, London, UK
Anna Papadopoulou Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
Brent C. Emerson Island Ecology and Evolution Research Group, Instituto de Productos Naturales y Agrobiología (IPNA‐CSIC), San Cristóbal de la Laguna, Spain

Reynolds T, Johnson EC, Huggett SB, Bubier JA, Palmer RHC, Agrawal A, Baker EJ, Chesler EJ. Interpretation of psychiatric genome-wide association studies with multispecies heterogeneous functional genomic data integration. Neuropsychopharmacology 2021;46:86-97. [PMID: 32791514 PMCID: PMC7688940 DOI: 10.1038/s41386-020-00795-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/01/2020] [Revised: 07/27/2020] [Accepted: 07/29/2020] [Indexed: 02/08/2023]

Accuracy and efficiency of germline variant calling pipelines for human genome data. Sci Rep 2020. [PMID: 33214604 DOI: 10.1101/2020.03.27.011767v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022] Open

Zhao S, Agafonov O, Azab A, Stokowy T, Hovig E. Accuracy and efficiency of germline variant calling pipelines for human genome data. Sci Rep 2020;10:20222. [PMID: 33214604 PMCID: PMC7678823 DOI: 10.1038/s41598-020-77218-4] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 03/30/2020] [Accepted: 11/02/2020] [Indexed: 12/30/2022] Open

Translational biomarkers in the era of precision medicine. Adv Clin Chem 2020;102:191-232. [PMID: 34044910 DOI: 10.1016/bs.acc.2020.08.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 02/08/2023]

Dass G, Vu MT, Xu P, Audain E, Hitz MP, Grüning B, Hermjakob H, Perez-Riverol Y. The omics discovery REST interface. Nucleic Acids Res 2020;48:W380-W384. [PMID: 32374843 PMCID: PMC7319562 DOI: 10.1093/nar/gkaa326] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/10/2020] [Revised: 04/11/2020] [Accepted: 04/21/2020] [Indexed: 01/22/2023] Open

Hemphill L, Hedstrom ML, Leonard SH. Saving social media data: Understanding data management practices among social media researchers and their implications for archives. J Assoc Inf Sci Technol 2020. [DOI: 10.1002/asi.24368] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]

Data in Brief: Can a mega-journal for data be useful? Scientometrics 2020. [DOI: 10.1007/s11192-020-03437-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/24/2022]

Brumfield KD, Huq A, Colwell RR, Olds JL, Leddy MB. Microbial resolution of whole genome shotgun and 16S amplicon metagenomic sequencing using publicly available NEON data. PLoS One 2020;15:e0228899. [PMID: 32053657 PMCID: PMC7018008 DOI: 10.1371/journal.pone.0228899] [Citation(s) in RCA: 73] [Impact Index Per Article: 18.3] [Received: 10/04/2019] [Accepted: 01/24/2020] [Indexed: 01/01/2023] Open

Abstract

Microorganisms are ubiquitous in the biosphere, playing a crucial role in both biogeochemistry of the planet and human health. However, identifying these microorganisms and defining their function are challenging. Widely used approaches in comparative metagenomics, 16S amplicon sequencing and whole genome shotgun sequencing (WGS), have provided access to DNA sequencing analysis to identify microorganisms and evaluate diversity and abundance in various environments. However, advances in parallel high-throughput DNA sequencing in the past decade have introduced major hurdles, namely standardization of methods, data storage, reproducible interoperability of results, and data sharing. The National Ecological Observatory Network (NEON), established by the National Science Foundation, enables all researchers to address queries on a regional to continental scale around a variety of environmental challenges and provides high-quality, integrated, and standardized data from field sites across the U.S. As the amount of metagenomic data continues to grow, standardized procedures that allow results across projects to be assessed and compared are becoming increasingly important in the field of metagenomics. We demonstrate the feasibility of using publicly available NEON soil metagenomic sequencing datasets in combination with open access Metagenomics Rapid Annotation using the Subsystem Technology (MG-RAST) server to illustrate advantages of WGS compared to 16S amplicon sequencing. Four WGS and four 16S amplicon sequence datasets, from surface soil samples prepared by NEON investigators, were selected for comparison, using standardized protocols collected at the same locations in Colorado between April and July 2014. The dominant bacterial phyla detected across samples agreed between sequencing methodologies. However, WGS yielded greater microbial resolution, increased accuracy, and allowed identification of more genera of bacteria, archaea, viruses, and eukaryota, and putative functional genes that would have gone undetected using 16S amplicon sequencing. NEON open data will be useful for future studies characterizing and quantifying complex ecological processes associated with changing aquatic and terrestrial ecosystems.

Santiago CRDN, Assis RDAB, Moreira LM, Digiampietri LA. Gene Tags Assessment by Comparative Genomics (GTACG): A User-Friendly Framework for Bacterial Comparative Genomics. Front Genet 2019;10:725. [PMID: 31507629 PMCID: PMC6718126 DOI: 10.3389/fgene.2019.00725] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/29/2019] [Accepted: 07/10/2019] [Indexed: 12/04/2022] Open

Sansone SA, McQuilton P, Rocca-Serra P, Gonzalez-Beltran A, Izzo M, Lister AL, Thurston M. FAIRsharing as a community approach to standards, repositories and policies. Nat Biotechnol 2019;37:358-367. [PMID: 30940948 PMCID: PMC6785156 DOI: 10.1038/s41587-019-0080-8] [Citation(s) in RCA: 149] [Impact Index Per Article: 29.8] [Indexed: 11/18/2022]

Amann RI, Baichoo S, Blencowe BJ, Bork P, Borodovsky M, Brooksbank C, Chain PSG, Colwell RR, Daffonchio DG, Danchin A, de Lorenzo V, Dorrestein PC, Finn RD, Fraser CM, Gilbert JA, Hallam SJ, Hugenholtz P, Ioannidis JPA, Jansson JK, Kim JF, Klenk HP, Klotz MG, Knight R, Konstantinidis KT, Kyrpides NC, Mason CE, McHardy AC, Meyer F, Ouzounis CA, Patrinos AAN, Podar M, Pollard KS, Ravel J, Muñoz AR, Roberts RJ, Rosselló-Móra R, Sansone SA, Schloss PD, Schriml LM, Setubal JC, Sorek R, Stevens RL, Tiedje JM, Turjanski A, Tyson GW, Ussery DW, Weinstock GM, White O, Whitman WB, Xenarios I. Toward unrestricted use of public genomic data. Science 2019;363:350-352. [PMID: 30679363 DOI: 10.1126/science.aaw1280] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Indexed: 12/21/2022]

Affiliation(s)

Rudolf I Amann, Shakuntala Baichoo, Benjamin J Blencowe, Peer Bork, Mark Borodovsky, Cath Brooksbank, Patrick S G Chain, Rita R Colwell, Daniele G Daffonchio, Antoine Danchin, Victor de Lorenzo, Pieter C Dorrestein, Robert D Finn, Claire M Fraser, Jack A Gilbert, Steven J Hallam, Philip Hugenholtz, John P A Ioannidis, Janet K Jansson, Jihyun F Kim, Hans-Peter Klenk, Martin G Klotz, Rob Knight, Konstantinos T Konstantinidis, Nikos C Kyrpides, Christopher E Mason, Alice C McHardy, Folker Meyer, Christos A Ouzounis, Aristides A N Patrinos, Mircea Podar, Katherine S Pollard, Jacques Ravel, Alejandro Reyes Muñoz, Richard J Roberts, Ramon Rosselló-Móra, Susanna-Assunta Sansone, Patrick D Schloss, Lynn M Schriml, João C Setubal, Rotem Sorek, Rick L Stevens, James M Tiedje, Adrian Turjanski, Gene W Tyson, David W Ussery, George M Weinstock, Owen White, William B Whitman, Ioannis Xenarios: The list of author affiliations is available in the supplementary materials.

Maxson Jones K, Ankeny RA, Cook-Deegan R. The Bermuda Triangle: The Pragmatics, Policies, and Principles for Data Sharing in the History of the Human Genome Project. J Hist Biol 2018;51:693-805. [PMID: 30390178 PMCID: PMC7307446 DOI: 10.1007/s10739-018-9538-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 05/24/2023]

Abstract

The Bermuda Principles for DNA sequence data sharing are an enduring legacy of the Human Genome Project (HGP). They were adopted by the HGP at a strategy meeting in Bermuda in February of 1996 and implemented in formal policies by early 1998, mandating daily release of HGP-funded DNA sequences into the public domain. The idea of daily sharing, we argue, emanated directly from strategies for large, goal-directed molecular biology projects first tested within the "community" of C. elegans researchers, and was introduced and defended for the HGP by the nematode biologists John Sulston and Robert Waterston. In the C. elegans community, and subsequently in the HGP, daily sharing served the pragmatic goals of quality control and project coordination. Yet in the HGP human genome, we also argue, the Bermuda Principles addressed concerns about gene patents impeding scientific advancement, and were aspirational and flexible in implementation and justification. They endured as an archetype for how rapid data sharing could be realized and rationalized, and permitted adaptation to the needs of various scientific communities. Yet in addition to the support of Sulston and Waterston, their adoption also depended on the clout of administrators at the US National Institutes of Health (NIH) and the UK nonprofit charity the Wellcome Trust, which together funded 90% of the HGP human sequencing effort. The other nations wishing to remain in the HGP consortium had to accommodate to the Bermuda Principles, requiring exceptions from incompatible existing or pending data access policies for publicly funded research in Germany, Japan, and France. We begin this story in 1963, with the biologist Sydney Brenner's proposal for a nematode research program at the Laboratory of Molecular Biology (LMB) at the University of Cambridge. We continue through 2003, with the completion of the HGP human reference genome, and conclude with observations about policy and the historiography of molecular biology.

Falomir-Lockhart AH, Villegas-Castagnaso EE, Giovambattista G, Rogberg-Muñoz A. Computational prediction of nsSNPs effects on protein function and structure, a prioritization approach for further in vitro studies applied to bovine GSTP1. Free Radic Biol Med 2018;129:486-491. [PMID: 30315934 DOI: 10.1016/j.freeradbiomed.2018.10.403] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/29/2018] [Revised: 09/20/2018] [Accepted: 10/03/2018] [Indexed: 11/20/2022]

Huan T, Palermo A, Ivanisevic J, Rinehart D, Edler D, Phommavongsay T, Benton HP, Guijas C, Domingo-Almenara X, Warth B, Siuzdak G. Autonomous Multimodal Metabolomics Data Integration for Comprehensive Pathway Analysis and Systems Biology. Anal Chem 2018;90:8396-8403. [PMID: 29893550 DOI: 10.1021/acs.analchem.8b00875] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 12/30/2022]

Dickie IA, Boyer S, Buckley HL, Duncan RP, Gardner PP, Hogg ID, Holdaway RJ, Lear G, Makiola A, Morales SE, Powell JR, Weaver L. Towards robust and repeatable sampling methods in eDNA-based studies. Mol Ecol Resour 2018;18:940-952. [PMID: 29802793 DOI: 10.1111/1755-0998.12907] [Citation(s) in RCA: 71] [Impact Index Per Article: 11.8] [Received: 10/04/2017] [Revised: 05/10/2018] [Accepted: 05/14/2018] [Indexed: 01/28/2023]

Kaye J, Terry SF, Juengst E, Coy S, Harris JR, Chalmers D, Dove ES, Budin-Ljøsne I, Adebamowo C, Ogbe E, Bezuidenhout L, Morrison M, Minion JT, Murtagh MJ, Minari J, Teare H, Isasi R, Kato K, Rial-Sebbag E, Marshall P, Koenig B, Cambon-Thomsen A. Including all voices in international data-sharing governance. Hum Genomics 2018. [PMID: 29514717 PMCID: PMC5842530 DOI: 10.1186/s40246-018-0143-9] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Indexed: 01/01/2023] Open

Abstract

Background

Governments, funding bodies, institutions, and publishers have developed a number of strategies to encourage researchers to facilitate access to datasets. The rationale behind this approach is that this will bring a number of benefits and enable advances in healthcare and medicine by allowing the maximum returns from the investment in research, as well as reducing waste and promoting transparency. As this approach gains momentum, these data-sharing practices have implications for many kinds of research as they become standard practice across the world.

Main text

The governance frameworks that have been developed to support biomedical research are not well equipped to deal with the complexities of international data sharing. This system is nationally based and is dependent upon expert committees for oversight and compliance, which has often led to piecemeal decision-making. This system tends to perpetuate inequalities by obscuring the contributions and the important role of different data providers along the data stream, whether they be low- or middle-income country researchers, patients, research participants, groups, or communities. As research and data-sharing activities are largely publicly funded, there is a strong moral argument for including the people who provide the data in decision-making and for developing governance systems for their continued participation.

Conclusions

We recommend that governance of science becomes more transparent, representative, and responsive to the voices of many constituencies by conducting public consultations about data-sharing addressing issues of access and use; including all data providers in decision-making about the use and sharing of data along the whole of the data stream; and using digital technologies to encourage accessibility, transparency, and accountability. We anticipate that this approach could enhance the legitimacy of the research process, generate insights that may otherwise be overlooked or ignored, and help to bring valuable perspectives into the decision-making around international data sharing.

Affiliation(s)

Jane Kaye Centre for Health Law and Emerging Technologies, NDPH, University of Oxford, Ewert House, Ewert Place, Summertown, Oxford, OX2 7DD, UK; Melbourne Law School, University of Melbourne, 185 Pelham Street, Carlton, Victoria, 3053, Australia
Sharon F Terry Genetic Alliance USA, 4301 Connecticut Ave NW, Suite 404, Washington DC, 20008-2369, USA
Eric Juengst Center for Bioethics, University of North Carolina at Chapel Hill, 333 McNider Hall, Chapel Hill, NC, 27599-7240, USA
Sarah Coy Centre for Health Law and Emerging Technologies, NDPH, University of Oxford, Ewert House, Ewert Place, Summertown, Oxford, OX2 7DD, UK
Jennifer R Harris Department of Genetics and Bioinformatics, Norwegian Institute of Public Health, PO Box 4404, Nydalen, 0403, Oslo, Norway
Don Chalmers Faculty of Law, University of Tasmania, Private Bag 89, Hobart, Tasmania, 7001, Australia
Edward S Dove School of Law, University of Edinburgh, Old College, South Bridge, Edinburgh, EH8 9YL, UK
Isabelle Budin-Ljøsne Cohort Studies, Norwegian Institute of Public Health, PO Box 4404, Nydalen, 0403, Oslo, Norway
Clement Adebamowo Center for Bioethics and Research, Ibadan, Nigeria; Institute of Human Virology Nigeria, Abuja, Nigeria; Greenebaum Comprehensive Cancer Center and Institute of Human Virology, University of Maryland School of Medicine, 725 W. Lombard St. Suite 445, Baltimore, MD, 21201, USA
Emilomo Ogbe International Centre for Reproductive Health, University of Gent, De Pintepark II, De Pintelaan 185, 9000, Ghent, Belgium
Louise Bezuidenhout Institute for Science, Innovation and Society, University of Oxford, 64 Banbury Road, Oxford, OX2 6PN, UK
Michael Morrison Centre for Health Law and Emerging Technologies, NDPH, University of Oxford, Ewert House, Ewert Place, Summertown, Oxford, OX2 7DD, UK
Joel T Minion Policy, Ethics and Life Sciences (PEALS) Research Centre, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
Madeleine J Murtagh Policy, Ethics and Life Sciences (PEALS) Research Centre, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
Jusaku Minari Uehiro Research Division for iPS Cell Ethics, Center for iPS Cell Research and Application, Kyoto University, 53 Kawahara-cho, Shogoin, Sakyo-ku, Kyoto, 606-8507, Japan
Harriet Teare Centre for Health Law and Emerging Technologies, NDPH, University of Oxford, Ewert House, Ewert Place, Summertown, Oxford, OX2 7DD, UK; Melbourne Law School, University of Melbourne, 185 Pelham Street, Carlton, Victoria, 3053, Australia
Rosario Isasi Institute for Bioethics and Health Policy, Department of Human Genetics, Leonard M. Miller School of Medicine, University of Miami, 1501 NW 10th Avenue, Biomedical Research Building (BRB) Room 361, Miami, FL, 33136, USA
Kazuto Kato Department of Biomedical Ethics and Public Policy, Graduate School of Medicine, Osaka University, 2-2 Yamadaoka, Suita, Osaka, 565-0871, Japan
Emmanuelle Rial-Sebbag National Institute for Research and Health (Inserm), UMR 1027 Inserm, Toulouse University, 37 allées Jules Guesde, 31000, Toulouse, France
Patricia Marshall Department of Bioethics, School of Medicine, TA200, Case Western Reserve University, 10900 Euclid Avenue, Cleveland, OH, 44106-4976, USA
Barbara Koenig UCSF School of Nursing, Institute for Health and Aging, University of California, San Francisco, 3333 Calif. St, Laurel Heights, San Francisco, CA, 94118, USA
Anne Cambon-Thomsen CNRS, Toulouse, France; Joint research unit on epidemiology and public health, Inserm (National Institute for Health and Medical Research) and University Toulouse III Paul Sabatier, Toulouse, France

Karcher S, Willighagen EL, Rumble J, Ehrhart F, Evelo CT, Fritts M, Gaheen S, Harper SL, Hoover MD, Jeliazkova N, Lewinski N, Marchese Robinson RL, Mills KC, Mustad AP, Thomas DG, Tsiliki G, Ogilvie Hendren C. Integration among databases and data sets to support productive nanotechnology: Challenges and recommendations. NanoImpact 2018;9:85-101. [PMID: 30246165 PMCID: PMC6145474 DOI: 10.1016/j.impact.2017.11.002] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 05/23/2023]

Affiliation(s)

Sandra Karcher Civil and Environmental Engineering, Carnegie Mellon University, Pittsburgh, PA 15213-3890, USA; Center for the Environmental Implications of Nano Technology (CEINT) Duke University, Box 90287, 121 Hudson Hall, Durham, NC 27708-0287, USA
Egon L. Willighagen Department of Bioinformatics - BiGCaT, Maastricht University, P.O. Box 616, UNS50, Box 19, NL-6200, MD, Maastricht, The Netherlands
John Rumble R&R Data Services, 11 Montgomery Avenue, Gaithersburg, MD 20877, USA; CODATA-VAMAS Working Group on Nanomaterials, Paris, France
Friederike Ehrhart Department of Bioinformatics - BiGCaT, Maastricht University, P.O. Box 616, UNS50, Box 19, NL-6200, MD, Maastricht, The Netherlands
Chris T. Evelo Department of Bioinformatics - BiGCaT, Maastricht University, P.O. Box 616, UNS50, Box 19, NL-6200, MD, Maastricht, The Netherlands
Martin Fritts Clinical Research Directorate/Clinical Monitoring Research Program, Leidos Biomedical Research, Inc., NCI Campus at Frederick, Frederick, MD 21702, USA
Sharon Gaheen Clinical Research Directorate/Clinical Monitoring Research Program, Leidos Biomedical Research, Inc., NCI Campus at Frederick, Frederick, MD 21702, USA
Stacey L. Harper Environmental and Molecular Toxicology and School of Chemical, Biological and Environmental Engineering, Oregon State University, Corvallis, OR 97331, USA
Mark D. Hoover National Institute for Occupational Safety and Health, 1095 Willowdale Road, Morgantown, WV 26505-2888, USA
Nina Jeliazkova IdeaConsult Ltd., 4 A. Kanchev str., Sofia 1000, Bulgaria
Nastassja Lewinski Chemical and Life Science Engineering, Virginia Commonwealth University, Richmond, VA 23284, USA
Richard L. Marchese Robinson School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, United Kingdom; School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, James Parsons Building, Byrom Street, Liverpool L3 3AF, United Kingdom
Karmann C. Mills RTI International, 3040 Cornwallis Rd., Research Triangle Park, NC 27709, USA
Axel P. Mustad Nordic Quantum Computing Group AS, Oslo Science Park, P.O. Box 1892, Vika, N-0124 Oslo, Norway
Dennis G. Thomas Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
Georgia Tsiliki School of Chemical Engineering, National Technical University of Athens, 9 Heroon Polytechneiou Street, Zografou, 15780, Athens, Greece; Institute for the management of Information Systems, ATHENA Research and Innovation Centre, Artemidos 6 & Epidavrou, Marousi, 15125 Athens, Greece
Christine Ogilvie Hendren Center for the Environmental Implications of Nano Technology (CEINT) Duke University, Box 90287, 121 Hudson Hall, Durham, NC 27708-0287, USA

Yu XT, Zeng T. Integrative Analysis of Omics Big Data. Methods Mol Biol 2018. [PMID: 29536440 DOI: 10.1007/978-1-4939-7717-8_7] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Indexed: 12/19/2022]

Park J, Gabbard JL. Factors that affect scientists' knowledge sharing behavior in health and life sciences research communities: Differences between explicit and implicit knowledge. Computers in Human Behavior 2018. [DOI: 10.1016/j.chb.2017.09.017] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 10/18/2022]

Garlid AO, Polson JS, Garlid KD, Hermjakob H, Ping P. Equipping Physiologists with an Informatics Tool Chest: Toward an Integerated Mitochondrial Phenome. Handb Exp Pharmacol 2017;240:377-401. [PMID: 27995389 DOI: 10.1007/164_2016_93] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/06/2023]

Pitchers WR, Constantinou SJ, Losilla M, Gallant JR. Electric fish genomics: Progress, prospects, and new tools for neuroethology. J Physiol Paris 2016;110:259-272. [PMID: 27769923 DOI: 10.1016/j.jphysparis.2016.10.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 07/01/2016] [Revised: 09/06/2016] [Accepted: 10/16/2016] [Indexed: 01/01/2023]

Forrest CB, Margolis P, Seid M, Colletti RB. PEDSnet: how a prototype pediatric learning health system is being expanded into a national network. Health Aff (Millwood) 2014;33:1171-7. [PMID: 25006143 DOI: 10.1377/hlthaff.2014.0127] [Citation(s) in RCA: 110] [Impact Index Per Article: 13.8] [Indexed: 01/09/2023]

Rouillard AD, Gundersen GW, Fernandez NF, Wang Z, Monteiro CD, McDermott MG, Ma'ayan A. The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database (Oxford) 2016;2016:baw100. [PMID: 27374120 PMCID: PMC4930834 DOI: 10.1093/database/baw100] [Citation(s) in RCA: 889] [Impact Index Per Article: 111.1] [Received: 02/13/2016] [Revised: 05/15/2016] [Accepted: 05/31/2016] [Indexed: 12/18/2022]

Abstract

Genomics, epigenomics, transcriptomics, proteomics and metabolomics efforts rapidly generate a plethora of data on the activity and levels of biomolecules within mammalian cells. At the same time, curation projects that organize knowledge from the biomedical literature into online databases are expanding. Hence, there is a wealth of information about genes, proteins and their associations, with an urgent need for data integration to achieve better knowledge extraction and data reuse. For this purpose, we developed the Harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins from over 70 major online resources. We extracted, abstracted and organized data into ∼72 million functional associations between genes/proteins and their attributes. Such attributes could be physical relationships with other biomolecules, expression in cell lines and tissues, genetic associations with knockout mouse or human phenotypes, or changes in expression after drug treatment. We stored these associations in a relational database along with rich metadata for the genes/proteins, their attributes and the original resources. The freely available Harmonizome web portal provides a graphical user interface, a web service and a mobile app for querying, browsing and downloading all of the collected data. To demonstrate the utility of the Harmonizome, we computed and visualized gene-gene and attribute-attribute similarity networks, and through unsupervised clustering, identified many unexpected relationships by combining pairs of datasets such as the association between kinase perturbations and disease signatures. We also applied supervised machine learning methods to predict novel substrates for kinases, endogenous ligands for G-protein coupled receptors, mouse phenotypes for knockout genes, and classified unannotated transmembrane proteins for likelihood of being ion channels. 
The Harmonizome is a comprehensive resource of knowledge about genes and proteins, and as such, it enables researchers to discover novel relationships between biological entities, as well as form novel data-driven hypotheses for experimental validation. Database URL: http://amp.pharm.mssm.edu/Harmonizome.

Affiliation(s)

Andrew D Rouillard Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
Gregory W Gundersen Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
Nicolas F Fernandez Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
Zichen Wang Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
Caroline D Monteiro Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
Michael G McDermott Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
Avi Ma'ayan Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA

Marchese Robinson RL, Lynch I, Peijnenburg W, Rumble J, Klaessig F, Marquardt C, Rauscher H, Puzyn T, Purian R, Åberg C, Karcher S, Vriens H, Hoet P, Hoover MD, Hendren CO, Harper SL. How should the completeness and quality of curated nanomaterial data be evaluated? Nanoscale 2016;8:9919-43. [PMID: 27143028 PMCID: PMC4899944 DOI: 10.1039/c5nr08944a] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Indexed: 05/19/2023]

Abstract

Nanotechnology is of increasing significance. Curation of nanomaterial data into electronic databases offers opportunities to better understand and predict nanomaterials' behaviour. This supports innovation in, and regulation of, nanotechnology. It is commonly understood that curated data need to be sufficiently complete and of sufficient quality to serve their intended purpose. However, assessing data completeness and quality is non-trivial in general and is arguably especially difficult in the nanoscience area, given its highly multidisciplinary nature. The current article, part of the Nanomaterial Data Curation Initiative series, addresses how to assess the completeness and quality of (curated) nanomaterial data. In order to address this key challenge, a variety of related issues are discussed: the meaning and importance of data completeness and quality, existing approaches to their assessment and the key challenges associated with evaluating the completeness and quality of curated nanomaterial data. Considerations which are specific to the nanoscience area and lessons which can be learned from other relevant scientific disciplines are considered. Hence, the scope of this discussion ranges from physicochemical characterisation requirements for nanomaterials and interference of nanomaterials with nanotoxicology assays to broader issues such as minimum information checklists, toxicology data quality schemes and computational approaches that facilitate evaluation of the completeness and quality of (curated) data. This discussion is informed by a literature review and a survey of key nanomaterial data curation stakeholders. Finally, drawing upon this discussion, recommendations are presented concerning the central question: how should the completeness and quality of curated nanomaterial data be evaluated?

Affiliation(s)

Richard L. Marchese Robinson School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, James Parsons Building, Byrom Street, Liverpool, L3 3AF, United Kingdom
Iseult Lynch School of Geography, Earth and Environmental Sciences, University of Birmingham, Edgbaston, B15 2TT Birmingham, United Kingdom
Willie Peijnenburg National Institute of Public Health and the Environment (RIVM), Bilthoven, The Netherlands; Institute of Environmental Sciences, Leiden University, Leiden, The Netherlands
John Rumble R&R Data Services, 11 Montgomery Avenue, Gaithersburg MD 20877 USA
Fred Klaessig Pennsylvania Bio Nano Systems LLC, 3805 Old Easton Road, Doylestown, PA 18902
Clarissa Marquardt Institute of Applied Computer Sciences (IAI), Karlsruhe Institute of Technology (KIT), Hermann v. Helmholtz Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
Hubert Rauscher European Commission, Joint Research Centre, Institute for Health and Consumer Protection, Via Fermi 2749, 21027 Ispra (VA), Italy
Tomasz Puzyn Laboratory of Environmental Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
Ronit Purian Faculty of Engineering, Tel Aviv University, Tel Aviv 69978 Israel
Christoffer Åberg Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Nijenborgh 4, 9747 AG Groningen, The Netherlands
Sandra Karcher Civil and Environmental Engineering, Carnegie Mellon University, Pittsburgh, PA 15213-3890
Hanne Vriens Department of Public Health and Primary Care, K.U.Leuven, Faculty of Medicine, Unit Environment & Health – Toxicology, Herestraat 49 (O&N 706), Leuven, Belgium
Peter Hoet Department of Public Health and Primary Care, K.U.Leuven, Faculty of Medicine, Unit Environment & Health – Toxicology, Herestraat 49 (O&N 706), Leuven, Belgium
Mark D. Hoover National Institute for Occupational Safety and Health, 1095 Willowdale Road, Morgantown, WV 26505-2888
Christine Ogilvie Hendren Center for the Environmental Implications of NanoTechnology, Duke University, PO Box 90287 121 Hudson Hall, Durham NC 27708
Stacey L. Harper Department of Environmental and Molecular Toxicology, School of Chemical, Biological and Environmental Engineering, Oregon State University, 1007 ALS, Corvallis, OR 97331

Bandrowski A, Brinkman R, Brochhausen M, Brush MH, Bug B, Chibucos MC, Clancy K, Courtot M, Derom D, Dumontier M, Fan L, Fostel J, Fragoso G, Gibson F, Gonzalez-Beltran A, Haendel MA, He Y, Heiskanen M, Hernandez-Boussard T, Jensen M, Lin Y, Lister AL, Lord P, Malone J, Manduchi E, McGee M, Morrison N, Overton JA, Parkinson H, Peters B, Rocca-Serra P, Ruttenberg A, Sansone SA, Scheuermann RH, Schober D, Smith B, Soldatova LN, Stoeckert CJ, Taylor CF, Torniai C, Turner JA, Vita R, Whetzel PL, Zheng J. The Ontology for Biomedical Investigations. PLoS One 2016;11:e0154556. [PMID: 27128319 PMCID: PMC4851331 DOI: 10.1371/journal.pone.0154556] [Citation(s) in RCA: 142] [Impact Index Per Article: 17.8] [Received: 01/05/2016] [Accepted: 04/17/2016] [Indexed: 12/18/2022]

Abstract

The Ontology for Biomedical Investigations (OBI) is an ontology that provides terms with precisely defined meanings to describe all aspects of how investigations in the biological and medical domains are conducted. OBI re-uses ontologies that provide a representation of biomedical knowledge from the Open Biological and Biomedical Ontologies (OBO) project and adds the ability to describe how this knowledge was derived. We here describe the state of OBI and several applications that are using it, such as adding semantic expressivity to existing databases, building data entry forms, and enabling interoperability between knowledge resources. OBI covers all phases of the investigation process, such as planning, execution and reporting. It represents information and material entities that participate in these processes, as well as roles and functions. Prior to OBI, it was not possible to use a single internally consistent resource that could be applied to multiple types of experiments for these applications. OBI has made this possible by creating terms for entities involved in biological and medical investigations and by importing parts of other biomedical ontologies such as GO, Chemical Entities of Biological Interest (ChEBI) and Phenotype Attribute and Trait Ontology (PATO) without altering their meaning. OBI is being used in a wide range of projects covering genomics, multi-omics, immunology, and catalogs of services. OBI has also spawned other ontologies (Information Artifact Ontology) and methods for importing parts of ontologies (Minimum information to reference an external ontology term (MIREOT)). The OBI project is an open cross-disciplinary collaborative effort, encompassing multiple research communities from around the globe. To date, OBI has created 2366 classes and 40 relations along with textual and formal definitions. 
The OBI Consortium maintains a web resource (http://obi-ontology.org) providing details on the people, policies, and issues being addressed in association with OBI. The current release of OBI is available at http://purl.obolibrary.org/obo/obi.owl.

Affiliation(s)

Anita Bandrowski University of California San Diego, La Jolla, California, United States of America
Ryan Brinkman British Columbia Cancer Research Centre, Vancouver, British Columbia, Canada
Mathias Brochhausen University of Arkansas for Medical Sciences, Little Rock, Arkansas, United States of America
Matthew H. Brush Oregon Health and Science University, Portland, Oregon, United States of America
Bill Bug Drexel University College of Medicine, Philadelphia, Pennsylvania, United States of America
Marcus C. Chibucos University of Maryland School of Medicine, Baltimore, Maryland, United States of America
Kevin Clancy Thermo Fisher Scientific, Carlsbad, California, United States of America
Mélanie Courtot Simon Fraser University, Burnaby, British Columbia, Canada
Dirk Derom The Vrije Universiteit Brussel, Ixelles, Brussels, Belgium
Michel Dumontier Stanford University, Stanford, California, United States of America
Liju Fan Ontology Workshop, LLC, Columbia, Maryland, United States of America
Jennifer Fostel National Toxicology Program, NIEHS, National Institutes of Health, Research Triangle Park, North Carolina, United States of America
Gilberto Fragoso Center for Biomedical Informatics and Information Technology, National Institutes of Health, Rockville, Maryland, United States of America
Frank Gibson Royal Society of Chemistry, Cambridge, Cambridgeshire, United Kingdom
Alejandra Gonzalez-Beltran University of Oxford, Oxford, Oxfordshire, United Kingdom
Melissa A. Haendel Oregon Health and Science University, Portland, Oregon, United States of America
Yongqun He University of Michigan Medical School, Ann Arbor, Michigan, United States of America
Mervi Heiskanen National Cancer Institute, Rockville, Maryland, United States of America
Tina Hernandez-Boussard Stanford University, Stanford, California, United States of America
Mark Jensen University at Buffalo, Buffalo, New York, United States of America
Yu Lin University of Michigan Medical School, Ann Arbor, Michigan, United States of America
Allyson L. Lister University of Oxford, Oxford, Oxfordshire, United Kingdom
Phillip Lord Newcastle University, Newcastle-upon-Tyne, Tyne and Wear, United Kingdom
James Malone European Molecular Biology Laboratory- European Bioinformatics Institute, Hinxton, Cambridgeshire, United Kingdom
Elisabetta Manduchi University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Monnie McGee Southern Methodist University, Dallas, Texas, United States of America
Norman Morrison The University of Manchester, Manchester, Greater Manchester, United Kingdom
James A. Overton La Jolla Institute for Allergy and Immunology, La Jolla, California, United States of America
Helen Parkinson European Molecular Biology Laboratory- European Bioinformatics Institute, Hinxton, Cambridgeshire, United Kingdom
Bjoern Peters La Jolla Institute for Allergy and Immunology, La Jolla, California, United States of America
Philippe Rocca-Serra University of Oxford, Oxford, Oxfordshire, United Kingdom
Alan Ruttenberg University at Buffalo, Buffalo, New York, United States of America
Susanna-Assunta Sansone University of Oxford, Oxford, Oxfordshire, United Kingdom
Richard H. Scheuermann J. Craig Venter Institute, La Jolla, California, United States of America
Daniel Schober Leibniz Institute of Plant Biochemistry, Halle, Saxony-Anhalt, Germany
Barry Smith University at Buffalo, Buffalo, New York, United States of America
Larisa N. Soldatova Brunel University London, Uxbridge, Middlesex, United Kingdom
Christian J. Stoeckert University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Chris F. Taylor European Molecular Biology Laboratory- European Bioinformatics Institute, Hinxton, Cambridgeshire, United Kingdom
Carlo Torniai Oregon Health and Science University, Portland, Oregon, United States of America
Jessica A. Turner Georgia State University, Atlanta, Georgia, United States of America
Randi Vita La Jolla Institute for Allergy and Immunology, La Jolla, California, United States of America
Patricia L. Whetzel University of California San Diego, La Jolla, California, United States of America
Jie Zheng University of Pennsylvania, Philadelphia, Pennsylvania, United States of America

Arend D, Junker A, Scholz U, Schüler D, Wylie J, Lange M. PGP repository: a plant phenomics and genomics data publication infrastructure. Database (Oxford) 2016;2016:baw033. [PMID: 27087305 PMCID: PMC4834206 DOI: 10.1093/database/baw033] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 11/05/2015] [Accepted: 02/26/2016] [Indexed: 11/22/2022]

Abstract

Plant genomics and phenomics represents the most promising tools for accelerating yield gains and overcoming emerging crop productivity bottlenecks. However, accessing this wealth of plant diversity requires the characterization of this material using state-of-the-art genomic, phenomic and molecular technologies and the release of subsequent research data via a long-term stable, open-access portal. Although several international consortia and public resource centres offer services for plant research data management, valuable digital assets remains unpublished and thus inaccessible to the scientific community. Recently, the Leibniz Institute of Plant Genetics and Crop Plant Research and the German Plant Phenotyping Network have jointly initiated the Plant Genomics and Phenomics Research Data Repository (PGP) as infrastructure to comprehensively publish plant research data. This covers in particular cross-domain datasets that are not being published in central repositories because of its volume or unsupported data scope, like image collections from plant phenotyping and microscopy, unfinished genomes, genotyping data, visualizations of morphological plant models, data from mass spectrometry as well as software and documents.

The repository is hosted at Leibniz Institute of Plant Genetics and Crop Plant Research using e!DAL as software infrastructure and a Hierarchical Storage Management System as data archival backend. A novel developed data submission tool was made available for the consortium that features a high level of automation to lower the barriers of data publication. After an internal review process, data are published as citable digital object identifiers and a core set of technical metadata is registered at DataCite. The used e!DAL-embedded Web frontend generates for each dataset a landing page and supports an interactive exploration. PGP is registered as research data repository at BioSharing.org, re3data.org and OpenAIRE as valid EU Horizon 2020 open data archive. Above features, the programmatic interface and the support of standard metadata formats, enable PGP to fulfil the FAIR data principles—findable, accessible, interoperable, reusable.

Database URL: http://edal.ipk-gatersleben.de/repos/pgp/

Higdon R, Earl RK, Stanberry L, Hudac CM, Montague E, Stewart E, Janko I, Choiniere J, Broomall W, Kolker N, Bernier RA, Kolker E. The promise of multi-omics and clinical data integration to identify and target personalized healthcare approaches in autism spectrum disorders. OMICS 2015;19:197-208. [PMID: 25831060 DOI: 10.1089/omi.2015.0020] [Citation(s) in RCA: 67] [Impact Index Per Article: 8.4] [Indexed: 12/19/2022]

Cysique LA. Advancing research in NeuroAIDS using collaboration and public data sharing. BMC Med Genomics 2015;8:76. [PMID: 26560870 PMCID: PMC4642768 DOI: 10.1186/s12920-015-0150-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2015] [Accepted: 11/03/2015] [Indexed: 11/16/2022]

Sustaining large-scale infrastructure to promote pre-competitive biomedical research: lessons from mouse genomics. N Biotechnol 2015;33:280-94. [PMID: 26563511 DOI: 10.1016/j.nbt.2015.10.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 01/08/2015] [Revised: 08/07/2015] [Accepted: 10/12/2015] [Indexed: 01/25/2023]

Antman EM, Benjamin EJ, Harrington RA, Houser SR, Peterson ED, Bauman MA, Brown N, Bufalino V, Califf RM, Creager MA, Daugherty A, Demets DL, Dennis BP, Ebadollahi S, Jessup M, Lauer MS, Lo B, MacRae CA, McConnell MV, McCray AT, Mello MM, Mueller E, Newburger JW, Okun S, Packer M, Philippakis A, Ping P, Prasoon P, Roger VL, Singer S, Temple R, Turner MB, Vigilante K, Warner J, Wayte P. Acquisition, Analysis, and Sharing of Data in 2015 and Beyond: A Survey of the Landscape: A Conference Report From the American Heart Association Data Summit 2015. J Am Heart Assoc 2015;4:e002810. [PMID: 26541391 PMCID: PMC4845234 DOI: 10.1161/jaha.115.002810] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 10/14/2015] [Accepted: 10/14/2015] [Indexed: 01/11/2023]

Abstract

BACKGROUND

A 1.5-day interactive forum was convened to discuss critical issues in the acquisition, analysis, and sharing of data in the field of cardiovascular and stroke science. The discussion will serve as the foundation for the American Heart Association's (AHA's) near-term and future strategies in the Big Data area. The concepts evolving from this forum may also inform other fields of medicine and science.

METHODS AND RESULTS

A total of 47 participants representing stakeholders from 7 domains (patients, basic scientists, clinical investigators, population researchers, clinicians and healthcare system administrators, industry, and regulatory authorities) participated in the conference. Presentation topics included updates on data as viewed from conventional medical and nonmedical sources, building and using Big Data repositories, articulation of the goals of data sharing, and principles of responsible data sharing. Facilitated breakout sessions were conducted to examine what each of the 7 stakeholder domains wants from Big Data under ideal circumstances and the possible roles that the AHA might play in meeting their needs. Important areas that are high priorities for further study regarding Big Data include a description of the methodology of how to acquire and analyze findings, validation of the veracity of discoveries from such research, and integration into investigative and clinical care aspects of future cardiovascular and stroke medicine. Potential roles that the AHA might consider include facilitating a standards discussion (eg, tools, methodology, and appropriate data use), providing education (eg, healthcare providers, patients, investigators), and helping build an interoperable digital ecosystem in cardiovascular and stroke science.

CONCLUSION

There was a consensus across stakeholder domains that Big Data holds great promise for revolutionizing the way cardiovascular and stroke research is conducted and clinical care is delivered; however, there is a clear need for the creation of a vision of how to use it to achieve the desired goals. Potential roles for the AHA center around facilitating a discussion of standards, providing education, and helping establish a cardiovascular digital ecosystem. This ecosystem should be interoperable and needs to interface with the rapidly growing digital object environment of the modern-day healthcare system.

Kaye J, Muddyman D, Smee C, Kennedy K, Bell J. 'Pop-Up' Governance: developing internal governance frameworks for consortia: the example of UK10K. Life Sciences, Society and Policy 2015;11:10. [PMID: 26412243 PMCID: PMC4584211 DOI: 10.1186/s40504-015-0028-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/27/2015] [Accepted: 09/15/2015] [Indexed: 06/05/2023]

Alterovitz G, Warner J, Zhang P, Chen Y, Ullman-Cullere M, Kreda D, Kohane IS. SMART on FHIR Genomics: facilitating standardized clinico-genomic apps. J Am Med Inform Assoc 2015. [PMID: 26198304 DOI: 10.1093/jamia/ocv045] [Citation(s) in RCA: 95] [Impact Index Per Article: 10.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open

Federer LM, Lu YL, Joubert DJ, Welsh J, Brandys B. Biomedical Data Sharing and Reuse: Attitudes and Practices of Clinical and Scientific Research Staff. PLoS One 2015;10:e0129506. [PMID: 26107811 PMCID: PMC4481309 DOI: 10.1371/journal.pone.0129506] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2015] [Accepted: 05/08/2015] [Indexed: 01/01/2023] Open

Abstract

Background

Significant efforts are underway within the biomedical research community to encourage sharing and reuse of research data in order to enhance research reproducibility and enable scientific discovery. While some technological challenges do exist, many of the barriers to sharing and reuse are social in nature, arising from researchers’ concerns about and attitudes toward sharing their data. In addition, clinical and basic science researchers face their own unique sets of challenges to sharing data within their communities. This study investigates these differences in experiences with and perceptions about sharing data, as well as barriers to sharing among clinical and basic science researchers.

Methods

Clinical and basic science researchers in the Intramural Research Program at the National Institutes of Health were surveyed about their attitudes toward and experiences with sharing and reusing research data. Of 190 respondents to the survey, the 135 respondents who identified themselves as clinical or basic science researchers were included in this analysis. Odds ratio and Fisher’s exact tests were the primary methods to examine potential relationships between variables. Worst-case scenario sensitivity tests were conducted when necessary.
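As an illustration of the two statistics named in the Methods, the following is a minimal sketch of an odds ratio and a two-sided Fisher's exact test for a 2x2 table, written with the Python standard library only. The function names and the example counts are hypothetical; they are not taken from the survey, and the study's actual analysis software is not stated here.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    if b * c == 0:
        # Undefined when both products are zero; infinite otherwise.
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows are two researcher groups,
# columns are "has shared data" / "has not shared data".
table = (3, 1, 1, 3)
print(odds_ratio(*table))
print(fisher_exact_two_sided(*table))
```

An exact test of this kind is a common choice when some cell counts are small, as can happen when a survey sample is split into subgroups.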

Results and Discussion

While most respondents considered data sharing and reuse important to their work, they generally rated their expertise as low. Sharing data directly with other researchers was common, but most respondents did not have experience with uploading data to a repository. A number of significant differences exist between the attitudes and practices of clinical and basic science researchers, including their motivations for sharing, their reasons for not sharing, and the amount of work required to prepare their data.

Conclusions

Even within the scope of biomedical research, addressing the unique concerns of diverse research communities is important to encouraging researchers to share and reuse data. Efforts to promote data sharing and reuse should aim not only at solving technological problems but also at addressing researchers’ concerns about sharing their data. Given the varied practices of individual researchers and research communities, standardizing data practices like data citation and repository upload could make sharing and reuse easier.

Intuitive web-based experimental design for high-throughput biomedical data. BioMed Research International 2015;2015:958302. [PMID: 25954760 PMCID: PMC4411450 DOI: 10.1155/2015/958302] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/26/2014] [Accepted: 03/09/2015] [Indexed: 11/25/2022]

Dubé L, Labban A, Moubarac JC, Heslop G, Ma Y, Paquet C. A nutrition/health mindset on commercial Big Data and drivers of food demand in modern and traditional systems. Ann N Y Acad Sci 2015;1331:278-295. [PMID: 25514866 DOI: 10.1111/nyas.12595] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Swamidass SJ, Matlock M, Rozenblit L. Securely measuring the overlap between private datasets with cryptosets. PLoS One 2015;10:e0117898. [PMID: 25714898 PMCID: PMC4340911 DOI: 10.1371/journal.pone.0117898] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2014] [Accepted: 01/04/2015] [Indexed: 11/19/2022] Open

Kelder T, Summer G, Caspers M, van Schothorst EM, Keijer J, Duivenvoorde L, Klaus S, Voigt A, Bohnert L, Pico C, Palou A, Bonet ML, Dembinska-Kiec A, Malczewska-Malec M, Kieć-Wilk B, del Bas JM, Caimari A, Arola L, van Erk M, van Ommen B, Radonjic M. White adipose tissue reference network: a knowledge resource for exploring health-relevant relations. Genes & Nutrition 2015;10:439. [PMID: 25466819 PMCID: PMC4252261 DOI: 10.1007/s12263-014-0439-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/24/2014] [Accepted: 10/24/2014] [Indexed: 12/13/2022]

Abstract

Optimal health is maintained by interaction of multiple intrinsic and environmental factors at different levels of complexity, from molecular to physiological to social. Understanding and quantification of these interactions will aid design of successful health interventions. We introduce the reference network concept as a platform for multi-level exploration of biological relations relevant for metabolic health, by integration and mining of biological interactions derived from public resources and context-specific experimental data. A White Adipose Tissue Health Reference Network (WATRefNet) was constructed as a resource for discovery and prioritization of mechanism-based biomarkers for white adipose tissue (WAT) health status and the effect of food and drug compounds on WAT health status. The WATRefNet (6,797 nodes and 32,171 edges) is based on (1) experimental data obtained from 10 studies addressing different adiposity states, (2) seven public knowledge bases of molecular interactions, (3) expert definitions of five physiologically relevant processes key to WAT health, namely WAT expandability, Oxidative capacity, Metabolic state, Oxidative stress and Tissue inflammation, and (4) a collection of relevant biomarkers of these processes identified by BIOCLAIMS (http://bioclaims.uib.es). The WATRefNet encompasses multiple layers of biological complexity, as it contains various types of nodes and edges that represent different biological levels and interactions. We have validated the reference network by showing overrepresentation of anti-obesity drug targets, pathology-associated genes and differentially expressed genes from an external disease model dataset. The resulting network has been used to extract subnetworks specific to the above-mentioned expert-defined physiological processes. Each of these process-specific signatures represents a mechanistically supported composite biomarker for assessing and quantifying the effect of interventions on a physiological aspect that determines WAT health status. Following this principle, five anti-diabetic drug interventions and one diet intervention were scored for the match of their expression signature to the five biomarker signatures derived from the WATRefNet. This confirmed previous observations of successful intervention by dietary lifestyle and revealed WAT-specific effects of drug interventions. The WATRefNet represents a sustainable knowledge resource for extraction of relevant relationships such as mechanisms of action, nutrient intervention targets and biomarkers and for assessment of health effects for support of health claims made on food products.

Affiliation(s)

Thomas Kelder Microbiology & Systems Biology, TNO, Zeist, The Netherlands Present Address: EdgeLeap B.V., Hooghiemstraplein 15, 3514 AX Utrecht, The Netherlands
Georg Summer Microbiology & Systems Biology, TNO, Zeist, The Netherlands CARIM, Maastricht University, Maastricht, The Netherlands
Martien Caspers Microbiology & Systems Biology, TNO, Zeist, The Netherlands
Evert M. van Schothorst Human and Animal Physiology, Wageningen University, Wageningen, The Netherlands
Jaap Keijer Human and Animal Physiology, Wageningen University, Wageningen, The Netherlands
Loes Duivenvoorde Human and Animal Physiology, Wageningen University, Wageningen, The Netherlands
Susanne Klaus Group of Energy Metabolism, German Institute of Human Nutrition in Potsdam, Nuthetal, Germany
Anja Voigt Group of Energy Metabolism, German Institute of Human Nutrition in Potsdam, Nuthetal, Germany
Laura Bohnert Group of Energy Metabolism, German Institute of Human Nutrition in Potsdam, Nuthetal, Germany
Catalina Pico Molecular Biology, Nutrition and Biotechnology (Nutrigenomics), University of the Balearic Islands (UIB), Palma de Mallorca, Spain CIBER Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Palma de Mallorca, Spain
Andreu Palou Molecular Biology, Nutrition and Biotechnology (Nutrigenomics), University of the Balearic Islands (UIB), Palma de Mallorca, Spain CIBER Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Palma de Mallorca, Spain
M. Luisa Bonet Molecular Biology, Nutrition and Biotechnology (Nutrigenomics), University of the Balearic Islands (UIB), Palma de Mallorca, Spain CIBER Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Palma de Mallorca, Spain
Aldona Dembinska-Kiec Department of Clinical Biochemistry, Jagiellonian University Medical College, Krakow, Poland
Malgorzata Malczewska-Malec Department of Clinical Biochemistry, Jagiellonian University Medical College, Krakow, Poland
Beata Kieć-Wilk Department of Metabolic Disorders, Jagiellonian University Medical College, Krakow, Poland
Josep M. del Bas Centre Tecnològic de Nutrició i Salut (CTNS), TECNIO, Reus, Spain
Antoni Caimari Centre Tecnològic de Nutrició i Salut (CTNS), TECNIO, Reus, Spain
Lluis Arola Centre Tecnològic de Nutrició i Salut (CTNS), TECNIO, Reus, Spain Rovira i Virgili University, Tarragona, Spain
Marjan van Erk Microbiology & Systems Biology, TNO, Zeist, The Netherlands
Ben van Ommen Microbiology & Systems Biology, TNO, Zeist, The Netherlands
Marijana Radonjic Microbiology & Systems Biology, TNO, Zeist, The Netherlands Present Address: EdgeLeap B.V., Hooghiemstraplein 15, 3514 AX Utrecht, The Netherlands

Angrist M, Cook-Deegan R. Distributing the future: The weak justifications for keeping human genomic databases secret and the challenges and opportunities in reverse engineering them. Appl Transl Genom 2014;3:124-127. [PMID: 25642409 PMCID: PMC4307597 DOI: 10.1016/j.atg.2014.09.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]

Heatherly R, Denny JC, Haines JL, Roden DM, Malin BA. Size matters: how population size influences genotype-phenotype association studies in anonymized data. J Biomed Inform 2014;52:243-50. [PMID: 25038554 PMCID: PMC4260994 DOI: 10.1016/j.jbi.2014.07.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2013] [Revised: 05/21/2014] [Accepted: 07/07/2014] [Indexed: 12/29/2022]

Abstract

OBJECTIVE

Electronic medical record (EMR) data are increasingly incorporated into genome-phenome association studies. Investigators hope to share data, but there are concerns it may be "re-identified" through the exploitation of various features, such as combinations of standardized clinical codes. Formal anonymization algorithms (e.g., k-anonymization) can prevent such violations, but prior studies suggest that the size of the population available for anonymization may influence the utility of the resulting data. We systematically investigate this issue using a large-scale biorepository and EMR system through which we evaluate the ability of researchers to learn from anonymized data for genome-phenome association studies under various conditions.

METHODS

We use a k-anonymization strategy to simulate a data protection process (on data sets containing clinical codes) for resources of similar size to those found at nine academic medical institutions within the United States. Following the protection process, we replicate an existing genome-phenome association study and compare the discoveries using the protected data and the original data through the correlation (r²) of the p-values of association significance.
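To make the k-anonymity property concrete, here is a minimal sketch that enforces it by suppression alone: any record whose combination of clinical codes is shared by fewer than k records is dropped. This is a deliberate simplification and an assumption of this example; the abstract does not specify the study's anonymization algorithm, and practical strategies often generalize codes as well as suppress them. The function names and code values below are hypothetical.

```python
from collections import Counter


def k_anonymize_by_suppression(records, k):
    """Keep only records whose code combination occurs at least k times.

    Each record is an iterable of clinical codes, treated as an
    unordered set (the quasi-identifier in this sketch).
    """
    keys = [frozenset(record) for record in records]
    counts = Counter(keys)
    return [record for record, key in zip(records, keys) if counts[key] >= k]


def is_k_anonymous(records, k):
    """True if every code combination present is shared by >= k records."""
    counts = Counter(frozenset(record) for record in records)
    return all(count >= k for count in counts.values())


# Hypothetical records, each a set of placeholder clinical codes.
records = [
    {"A", "B"},
    {"A", "B"},
    {"C"},
    {"B", "A"},
    {"C", "D"},
]

protected = k_anonymize_by_suppression(records, k=2)
print(len(protected), is_k_anonymous(protected, k=2))
```

Under this simplified scheme, a record that is unique within a small cohort may have matching records in a larger population, so fewer records would need to be suppressed when the whole population is anonymized together.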

RESULTS

Our investigation shows that anonymizing an entire dataset with respect to the population from which it is derived yields significantly more utility than small study-specific datasets anonymized unto themselves. When evaluated using the correlation of genome-phenome association strengths on anonymized data versus original data across all nine simulated sites, results from the largest-scale anonymizations (population ∼100,000) retained better utility than those from smaller sizes (population ∼6000-75,000). We observed a general trend of increasing r² for larger data set sizes: r²=0.9481 for small-sized datasets, r²=0.9493 for moderately-sized datasets, r²=0.9934 for large-sized datasets.
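The utility measure reported above is a squared correlation between association p-values computed on the original data and on the anonymized data. The sketch below shows one way such a value could be computed, assuming a plain Pearson correlation on the p-values themselves; whether the study transformed the p-values first is not stated in the abstract. The function name and the p-values are hypothetical illustrations, not study results.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical p-values for the same associations, computed once on the
# original data and once on the anonymized data.
p_original = [0.001, 0.020, 0.300, 0.650, 0.900]
p_anonymized = [0.002, 0.030, 0.280, 0.700, 0.880]

print(r_squared(p_original, p_anonymized))
```

A value close to 1 indicates that the anonymized data reproduce the pattern of association significance seen in the original data.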

CONCLUSIONS

This research implies that regardless of the overall size of an institution's data, there may be significant benefits to anonymization of the entire EMR, even if the institution is planning on releasing only data about a specific cohort of patients.

Kyrpides NC, Hugenholtz P, Eisen JA, Woyke T, Göker M, Parker CT, Amann R, Beck BJ, Chain PSG, Chun J, Colwell RR, Danchin A, Dawyndt P, Dedeurwaerdere T, DeLong EF, Detter JC, De Vos P, Donohue TJ, Dong XZ, Ehrlich DS, Fraser C, Gibbs R, Gilbert J, Gilna P, Glöckner FO, Jansson JK, Keasling JD, Knight R, Labeda D, Lapidus A, Lee JS, Li WJ, MA J, Markowitz V, Moore ERB, Morrison M, Meyer F, Nelson KE, Ohkuma M, Ouzounis CA, Pace N, Parkhill J, Qin N, Rossello-Mora R, Sikorski J, Smith D, Sogin M, Stevens R, Stingl U, Suzuki KI, Taylor D, Tiedje JM, Tindall B, Wagner M, Weinstock G, Weissenbach J, White O, Wang J, Zhang L, Zhou YG, Field D, Whitman WB, Garrity GM, Klenk HP. Genomic encyclopedia of bacteria and archaea: sequencing a myriad of type strains. PLoS Biol 2014;12:e1001920. [PMID: 25093819 PMCID: PMC4122341 DOI: 10.1371/journal.pbio.1001920] [Citation(s) in RCA: 138] [Impact Index Per Article: 13.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open

Affiliation(s)

Nikos C. Kyrpides DOE-Joint Genome Institute, Walnut Creek, California, United States of America Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia * E-mail: (NCK); (HPK)
Philip Hugenholtz Australian Centre for Ecogenomics Research, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
Jonathan A. Eisen University of California, Davis, Davis, California, United States of America
Tanja Woyke DOE-Joint Genome Institute, Walnut Creek, California, United States of America
Markus Göker DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany
Charles T. Parker NamesforLife, LLC, East Lansing, Michigan, United States of America
Rudolf Amann Max Planck Institute for Marine Microbiology, Bremen, Germany
Brian J. Beck American Type Culture Collection (ATCC), Manassas, Virginia, United States of America
Patrick S. G. Chain Los Alamos National Laboratory, Bioscience Division, Los Alamos, New Mexico, United States of America
Jongsik Chun School of Biological Sciences and Chunlab Inc., Seoul National University, Seoul, Korea
Rita R. Colwell University of Maryland, College Park, College Park, Maryland, United States of America Johns Hopkins Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, United States of America
Antoine Danchin AMAbiotics SAS, Genopole, Evry, France
Peter Dawyndt Ghent University, Department of Applied Mathematics and Computer Science, Ghent, Belgium
Tom Dedeurwaerdere Centre for Philosophy of Law, Université catholique de Louvain, Louvain-la-Neuve, Belgium
Edward F. DeLong Department of Civil and Environmental Engineering and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
John C. Detter Los Alamos National Laboratory, Bioscience Division, Los Alamos, New Mexico, United States of America
Paul De Vos Ghent University, Department of Applied Mathematics and Computer Science, Ghent, Belgium Ghent University, BCCM/LMG Bacteria collection, Laboratory of Microbiology, Ghent, Belgium
Timothy J. Donohue University of Wisconsin-Madison, Great Lakes Bioenergy Research Center, Madison, Wisconsin, United States of America
Xiu-Zhu Dong Bioresource Center (BRC) of Institute of Microbiology, Chinese Academy of Sciences, P. R. China
Dusko S. Ehrlich Institut National de la Recherche Agronomique, Jouy en Josas, France
Claire Fraser Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
Richard Gibbs Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
Jack Gilbert Institute for Genomics and Systems Biology, Argonne National Laboratory, Argonne, Illinois, United States of America
Paul Gilna BioEnergy Science Center (BESC), Oak Ridge National Laboratory, Knoxville, Tennessee, United States of America
Frank Oliver Glöckner Max Planck Institute for Marine Microbiology, Bremen, Germany Jacobs University Bremen gGmbH, Bremen, Germany
Janet K. Jansson Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
Jay D. Keasling Lawrence Berkeley National Laboratory, Berkeley, California, United States of America Joint BioEnergy Institute (JBEI), Berkeley, California, United States of America
Rob Knight Howard Hughes Medical Institute and Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado, United States of America
David Labeda ARS, USDA, National Center for Agricultural Utilization Research, Peoria, Illinois, United States of America
Alla Lapidus Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, St. Petersburg, Russia Algorithmic Biology Lab, St. Petersburg Academic University, St. Petersburg, Russia
Jung-Sook Lee Korean Collection for Type Cultures (KCTC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), 111 Gwahangno, Yuseong-gu, Daejeon, Korea
Wen-Jun Li The Key Laboratory for Microbial Resources of the Ministry of Education, Kunming, People's Republic of China
Juncai MA China General Microbiological Culture Collection Center (CGMCC), Institute of Microbiology, Chinese Academy of Sciences, Beijing, P. R. China
Victor Markowitz DOE-Joint Genome Institute, Walnut Creek, California, United States of America Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
Edward R. B. Moore CCUG - Culture Collection University of Gothenburg, Sahlgrenska Academy of the University of Gothenburg, Gothenburg, Sweden
Mark Morrison Diamantina Institute, The University of Queensland, St Lucia, Queensland, Australia
Folker Meyer Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois, United States of America
Karen E. Nelson The J. Craig Venter Institute, Rockville, Maryland, United States of America
Moriya Ohkuma Riken Bioresource Center, Japan Collection of Microorganisms, Hirosawa, Wako, Saitama, Japan
Christos A. Ouzounis Chemical Process & Energy Resources Institute, Centre for Research & Technology, Thessalonica, Greece Donnelly Centre for Cellular & Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
Norman Pace Department of Molecular, Cellular and Developmental Biology, University of Colorado, Boulder, Colorado, United States of America
Julian Parkhill The Pathogen Genomics, The Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom
Nan Qin State Key Laboratory for Diagnosis and Treatment of Infectious Disease, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
Ramon Rossello-Mora Institut Mediterrani d'Estudis Avançats (IMEDEA, CSIC-UIB), Esporles, Illes Balears, Spain
Johannes Sikorski DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany
David Smith CABI, Bakeham Lane, Egham, Surrey, United Kingdom
Mitch Sogin Josephine Bay Paul Center for Comparative Evolution and Molecular Biology, MBL, Woods Hole, Massachusetts, United States of America
Rick Stevens Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois, United States of America
Uli Stingl Red Sea Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
Ken-ichiro Suzuki NITE Biological Resource Center (NBRC), Kisarazu-shi, Chiba, Japan
Dorothea Taylor NamesforLife, LLC, East Lansing, Michigan, United States of America
Jim M. Tiedje Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan, United States of America
Brian Tindall DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany
Michael Wagner Department of Microbial Ecology, University of Vienna, Vienna, Austria
George Weinstock The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut
Jean Weissenbach Commissariat à l'Energie Atomique (CEA), Genoscope, Evry, France
Owen White Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
Jun Wang State Key Laboratory for Diagnosis and Treatment of Infectious Disease, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China Department of Biology, University of Copenhagen, Copenhagen, Denmark
Lixin Zhang Bioresource Center (BRC) of Institute of Microbiology, Chinese Academy of Sciences, P. R. China Chinese Academy of Sciences Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, P. R. China
Yu-Guang Zhou China General Microbiological Culture Collection Center (CGMCC), Institute of Microbiology, Chinese Academy of Sciences, Beijing, P. R. China
Dawn Field U.K. Natural Environment Research Council (NERC), Environmental Bioinformatics Centre, Oxford, United Kingdom
William B. Whitman Department of Microbiology, University of Georgia, Athens, Georgia, United States of America
George M. Garrity NamesforLife, LLC, East Lansing, Michigan, United States of America Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan, United States of America
Hans-Peter Klenk DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany * E-mail: (NCK); (HPK)

Tenenbaum JD, Sansone SA, Haendel M. A sea of standards for omics data: sink or swim? J Am Med Inform Assoc 2014;21:200-3. [PMID: 24076747 PMCID: PMC3932466 DOI: 10.1136/amiajnl-2013-002066] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2013] [Revised: 07/08/2013] [Accepted: 09/10/2013] [Indexed: 11/29/2022] Open

Kaye J, Hawkins N. Data sharing policy design for consortia: challenges for sustainability. Genome Med 2014;6:4. [PMID: 24475754 PMCID: PMC3978924 DOI: 10.1186/gm523] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022] Open

A field guide to genomics research. PLoS Biol 2014;12:e1001744. [PMID: 24409093 PMCID: PMC3883637 DOI: 10.1371/journal.pbio.1001744] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023] Open