1. Yoshizawa G, Shinomiya N, Kawamoto S, Kawahara N, Kiga D, Hanaki KI, Minari J. Limiting open science? Three approaches to bottom-up governance of dual-use research of concern. Pathog Glob Health 2024; 118:285-294. [PMID: 37791645 PMCID: PMC11234915 DOI: 10.1080/20477724.2023.2265626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2023] Open
Abstract
Governing dual-use research of concern (DURC) in the life sciences has become difficult owing to the diversification of scientific domains, digitalization of potential threats, and the proliferation of actors. This paper proposes three approaches to realize bottom-up governance of DURC from laboratory operation to institutional decision-making levels. First, a technological approach can predict and monitor the dual-use nature of the research target pathogens and their information. Second, an interactive approach is proposed in which diverse stakeholders proactively discuss and examine dual-use issues through research practice. Third, a personnel approach can identify the right persons involved in DURC. These approaches suggest that, going beyond self-governance by researchers, collaborative and networked governance involving diverse actors should become essential. This mode of governance can also be seen in light of the management of research use. Therefore, program design by funding agencies and publication screening by journal publishers continuously contribute to governance at the meso-level. Bottom-up governance may be realized by using an appropriately integrated design of these three approaches at the micro-level, such as dual-use prediction and monitoring, stakeholder dialogue, and background checks. Given that the term 'open science' has been promoted to the research community as part of top-down governance, paying due attention on site to research subjects, research practices, and persons involved in research will provide an opportunity to develop a more socially conscious open science.
Affiliation(s)
- Go Yoshizawa
  - Innovation System Research Center, Kwansei Gakuin University, Hyogo, Japan
- Shishin Kawamoto
  - Graduate School of Science, Hokkaido University, Hokkaido, Japan
- Naoto Kawahara
  - Center for Clinical and Translational Research, Kyushu University Hospital, Fukuoka, Japan
- Daisuke Kiga
  - Center for Advanced Biomedical Sciences, School of Advanced Science and Engineering, Waseda University, Tokyo, Japan
- Ken-Ichi Hanaki
  - Management Department of Biosafety, Laboratory Animal, and Pathogen Bank, National Institute of Infectious Diseases, Tokyo, Japan
- Jusaku Minari
  - Uehiro Research Division for iPS Cell Ethics, Center for iPS Cell Research and Application, Kyoto University, Kyoto, Japan
2. Alper P, Dĕd V, Herzinger S, Grouès V, Peter S, Lebioda J, Ebermann L, Popleteeva M, Barry ND, Welter D, Ghosh S, Becker R, Schneider R, Gu W, Trefois C, Satagopam V. DS-PACK: Tool assembly for the end-to-end support of controlled access human data sharing. Sci Data 2024; 11:501. [PMID: 38750048 PMCID: PMC11096168 DOI: 10.1038/s41597-024-03326-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Accepted: 04/29/2024] [Indexed: 05/18/2024] Open
Abstract
The EU General Data Protection Regulation (GDPR) requirements have prompted a shift from centralised controlled access genome-phenome archives to federated models for sharing sensitive human data. In a data-sharing federation, a central node facilitates data discovery; meanwhile, distributed nodes are responsible for handling data access requests, concluding agreements with data users and providing secure access to the data. Research institutions that want to become part of such federations often lack the resources to set up the required controlled access processes. The DS-PACK tool assembly is a reusable, open-source middleware solution that semi-automates controlled access processes end-to-end, from data submission to access. Data protection principles are engraved into all components of the DS-PACK assembly. DS-PACK centralises access control management and distributes access control enforcement with support for data access via cloud-based applications. DS-PACK is in production use at the ELIXIR Luxembourg data hosting platform, combined with an operational model including legal facilitation and data stewardship.
Affiliation(s)
- Pinar Alper
  - Luxembourg National Data Service, PNED GIE, Esch-sur-Alzette, L-4362, Luxembourg
  - ELIXIR Luxembourg, Belvaux, Luxembourg
- Vilém Dĕd
  - ELIXIR Luxembourg, Belvaux, Luxembourg
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, L-4367, Luxembourg
- Sascha Herzinger
  - ELIXIR Luxembourg, Belvaux, Luxembourg
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, L-4367, Luxembourg
- Valentin Grouès
  - ELIXIR Luxembourg, Belvaux, Luxembourg
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, L-4367, Luxembourg
- Sarah Peter
  - ELIXIR Luxembourg, Belvaux, Luxembourg
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, L-4367, Luxembourg
- Jacek Lebioda
  - Luxembourg National Data Service, PNED GIE, Esch-sur-Alzette, L-4362, Luxembourg
  - ELIXIR Luxembourg, Belvaux, Luxembourg
- Linda Ebermann
  - Luxembourg National Data Service, PNED GIE, Esch-sur-Alzette, L-4362, Luxembourg
  - ELIXIR Luxembourg, Belvaux, Luxembourg
- Marina Popleteeva
  - ELIXIR Luxembourg, Belvaux, Luxembourg
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, L-4367, Luxembourg
- Nene Djenaba Barry
  - Luxembourg National Data Service, PNED GIE, Esch-sur-Alzette, L-4362, Luxembourg
  - ELIXIR Luxembourg, Belvaux, Luxembourg
- Danielle Welter
  - Luxembourg National Data Service, PNED GIE, Esch-sur-Alzette, L-4362, Luxembourg
  - ELIXIR Luxembourg, Belvaux, Luxembourg
- Soumyabrata Ghosh
  - ELIXIR Luxembourg, Belvaux, Luxembourg
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, L-4367, Luxembourg
- Regina Becker
  - Luxembourg National Data Service, PNED GIE, Esch-sur-Alzette, L-4362, Luxembourg
  - ELIXIR Luxembourg, Belvaux, Luxembourg
- Reinhard Schneider
  - ELIXIR Luxembourg, Belvaux, Luxembourg
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, L-4367, Luxembourg
- Wei Gu
  - Luxembourg National Data Service, PNED GIE, Esch-sur-Alzette, L-4362, Luxembourg
  - ELIXIR Luxembourg, Belvaux, Luxembourg
- Christophe Trefois
  - Luxembourg National Data Service, PNED GIE, Esch-sur-Alzette, L-4362, Luxembourg
  - ELIXIR Luxembourg, Belvaux, Luxembourg
- Venkata Satagopam
  - ELIXIR Luxembourg, Belvaux, Luxembourg
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, L-4367, Luxembourg
3. Oliva A, Kaphle A, Reguant R, Sng LMF, Twine NA, Malakar Y, Wickramarachchi A, Keller M, Ranbaduge T, Chan EKF, Breen J, Buckberry S, Guennewig B, Haas M, Brown A, Cowley MJ, Thorne N, Jain Y, Bauer DC. Future-proofing genomic data and consent management: a comprehensive review of technology innovations. Gigascience 2024; 13:giae021. [PMID: 38837943 PMCID: PMC11152178 DOI: 10.1093/gigascience/giae021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 01/15/2024] [Accepted: 04/09/2024] [Indexed: 06/07/2024] Open
Abstract
Genomic information is increasingly used to inform medical treatments and manage future disease risks. However, any personal and societal gains must be carefully balanced against the risk to individuals contributing their genomic data. Expanding our understanding of actionable genomic insights requires researchers to access large global datasets to capture the complexity of genomic contribution to diseases. Similarly, clinicians need efficient access to a patient's genome as well as population-representative historical records for evidence-based decisions. Both researchers and clinicians hence rely on participants to consent to the use of their genomic data, which in turn requires trust in the professional and ethical handling of this information. Here, we review existing and emerging solutions for secure and effective genomic information management, including storage, encryption, consent, and authorization that are needed to build participant trust. We discuss recent innovations in cloud computing, quantum-computing-proof encryption, and self-sovereign identity. These innovations can augment key developments from within the genomics community, notably GA4GH Passports and the Crypt4GH file container standard. We also explore how decentralized storage as well as the digital consenting process can offer culturally acceptable processes to encourage data contributions from ethnic minorities. We conclude that the individual and their right for self-determination needs to be put at the center of any genomics framework, because only on an individual level can the received benefits be accurately balanced against the risk of exposing private information.
Affiliation(s)
- Adrien Oliva
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
- Anubhav Kaphle
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
- Roc Reguant
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
- Letitia M F Sng
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
- Natalie A Twine
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
- Yuwan Malakar
  - Responsible Innovation Future Science Platform, Commonwealth Scientific and Industrial Research Organisation, Brisbane, 41 Boggo Rd, Dutton Park QLD 4102, Australia
- Anuradha Wickramarachchi
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
- Marcel Keller
  - Data61, Commonwealth Scientific and Industrial Research Organisation, Level 5/13 Garden St, Eveleigh NSW 2015, Australia
- Thilina Ranbaduge
  - Data61, Commonwealth Scientific and Industrial Research Organisation, Building 101, Clunies Ross St, Black Mountain, Canberra, ACT 2601, Australia
- Eva K F Chan
  - NSW Health Pathology, Sydney, 1 Reserve Road, St Leonards NSW 2065, Australia
- James Breen
  - Telethon Kids Institute, Perth, WA 6009, Australia
  - National Centre for Indigenous Genomics, The John Curtin School of Medical Research, Australian National University, Canberra, ACT 2601, Australia
- Sam Buckberry
  - Telethon Kids Institute, Perth, WA 6009, Australia
  - National Centre for Indigenous Genomics, The John Curtin School of Medical Research, Australian National University, Canberra, ACT 2601, Australia
- Boris Guennewig
  - Sydney Medical School, Brain and Mind Centre, The University of Sydney, Sydney, 94 Mallett St, Camperdown NSW 2050, Australia
- Matilda Haas
  - Australian Genomics, Parkville, VIC 3052, Australia
  - Murdoch Children’s Research Institute, Parkville, Victoria 3052, Australia
- Alex Brown
  - Telethon Kids Institute, Perth, WA 6009, Australia
  - National Centre for Indigenous Genomics, The John Curtin School of Medical Research, Australian National University, Canberra, ACT 2601, Australia
- Mark J Cowley
  - Children’s Cancer Institute, Lowy Cancer Research Centre, Level 4, Lowy Cancer Research Centre Corner Botany & High Streets UNSW Kensington Campus UNSW Sydney, Kensington NSW 2052, Australia
  - School of Clinical Medicine, UNSW Medicine & Health, Wallace Wurth Building (C27), Cnr High St & Botany St, UNSW Sydney, Kensington NSW 2052, Australia
- Natalie Thorne
  - University of Melbourne, Melbourne, Parkville VIC 3052, Australia
  - Melbourne Genomics Health Alliance, Melbourne 1G, Walter and Eliza Hall Institute/1G Royal Parade, Parkville VIC 3052, Australia
  - Walter and Eliza Hall Institute, Melbourne, 1G, Walter and Eliza Hall Institute/1G Royal Parade, Parkville VIC 3052, Australia
- Yatish Jain
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
  - Applied BioSciences, Faculty of Science and Engineering, Macquarie University, Applied BioSciences 205B Culloden Rd Macquarie University, NSW 2109, Australia
- Denis C Bauer
  - Applied BioSciences, Faculty of Science and Engineering, Macquarie University, Applied BioSciences 205B Culloden Rd Macquarie University, NSW 2109, Australia
  - Department of Biomedical Sciences, MQ Health General Practice - Macquarie University, Suite 305, Level 3/2 Technology Pl, Macquarie Park NSW 2109, Australia
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Gate 13, Kintore Avenue University of Adelaide, Adelaide SA 5000, Australia
4. Ouwerkerk J, Rasche H, Spalding JD, Hiltemann S, Stubbs AP. FAIR data retrieval for sensitive clinical research data in Galaxy. Gigascience 2024; 13:giad099. [PMID: 38280189 PMCID: PMC10821763 DOI: 10.1093/gigascience/giad099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 10/16/2023] [Accepted: 11/01/2023] [Indexed: 01/29/2024] Open
Abstract
BACKGROUND In clinical research, data have to be accessible and reproducible, but the generated data are becoming larger and analysis complex. Here we propose a platform for Findable, Accessible, Interoperable, and Reusable (FAIR) data access and creating reproducible findings. Standardized access to a major genomic repository, the European Genome-Phenome Archive (EGA), has been achieved with API services like PyEGA3. We aim to provide a FAIR data analysis service in Galaxy by retrieving genomic data from the EGA and provide a generalized "omics" platform for FAIR data analysis. RESULTS To demonstrate this, we implemented an end-to-end Galaxy workflow to replicate the findings from an RD-Connect synthetic dataset Beyond the 1 Million Genomes (synB1MG) available from the EGA. We developed the PyEGA3 connector within Galaxy to easily download multiple datasets from the EGA. We added the gene.iobio tool, a diagnostic environment for precision genomics, to Galaxy and demonstrate that it provides a more dynamic and interpretable view for trio analysis results. We developed a Galaxy trio analysis workflow to determine the pathogenic variants from the synB1MG trios using the GEMINI and gene.iobio tool. The complete workflow is available at WorkflowHub, and an associated tutorial was created in the Galaxy Training Network, which helps researchers unfamiliar with Galaxy to run the workflow. CONCLUSIONS We showed the feasibility of reusing data from the EGA in Galaxy via PyEGA3 and validated the workflow by rediscovering spiked-in variants in synthetic data. Finally, we improved existing tools in Galaxy and created a workflow for trio analysis to demonstrate the value of FAIR genomics analysis in Galaxy.
Affiliation(s)
- Jasper Ouwerkerk
  - Clinical Bioinformatics Group, Department of Pathology, Erasmus Medical Center, 3015 CN, Rotterdam, the Netherlands
- Helena Rasche
  - Clinical Bioinformatics Group, Department of Pathology, Erasmus Medical Center, 3015 CN, Rotterdam, the Netherlands
- Saskia Hiltemann
  - Clinical Bioinformatics Group, Department of Pathology, Erasmus Medical Center, 3015 CN, Rotterdam, the Netherlands
- Andrew P Stubbs
  - Clinical Bioinformatics Group, Department of Pathology, Erasmus Medical Center, 3015 CN, Rotterdam, the Netherlands
5. Dahlquist JM, Nelson SC, Fullerton SM. Cloud-based biomedical data storage and analysis for genomic research: Landscape analysis of data governance in emerging NIH-supported platforms. HGG ADVANCES 2023; 4:100196. [PMID: 37181330 PMCID: PMC10173774 DOI: 10.1016/j.xhgg.2023.100196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Accepted: 04/07/2023] [Indexed: 05/16/2023] Open
Abstract
The storage, sharing, and analysis of genomic data poses technical and logistical challenges that have precipitated the development of cloud-based computing platforms designed to facilitate collaboration and maximize the scientific utility of data. To understand cloud platforms' policies and procedures and the implications for different stakeholder groups, in summer 2021, we reviewed publicly available documents (N = 94) sourced from platform websites, scientific literature, and lay media for five NIH-funded cloud platforms (the All of Us Research Hub, NHGRI AnVIL, NHLBI BioData Catalyst, NCI Genomic Data Commons, and the Kids First Data Resource Center) and a pre-existing data sharing mechanism, dbGaP. Platform policies were compared across seven categories of data governance: data submission, data ingestion, user authentication and authorization, data security, data access, auditing, and sanctions. Our analysis finds similarities across the platforms, including reliance on a formal data ingestion process, multiple tiers of data access with varying user authentication and/or authorization requirements, platform and user data security measures, and auditing for inappropriate data use. Platforms differ in how data tiers are organized, as well as the specifics of user authentication and authorization across access tiers. Our analysis maps elements of data governance across emerging NIH-funded cloud platforms and as such provides a key resource for stakeholders seeking to understand and utilize data access and analysis options across platforms and to surface aspects of governance that may require harmonization to achieve the desired interoperability.
Affiliation(s)
- Jacklyn M. Dahlquist
  - Department of Bioethics and Humanities, University of Washington School of Medicine, Seattle, WA 98195, USA
- Sarah C. Nelson
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
  - Corresponding author
- Stephanie M. Fullerton
  - Department of Bioethics and Humanities, University of Washington School of Medicine, Seattle, WA 98195, USA
  - Corresponding author
6. Consent Codes: Maintaining Consent in an Ever-expanding Open Science Ecosystem. Neuroinformatics 2023; 21:89-100. [PMID: 36520344 PMCID: PMC9931855 DOI: 10.1007/s12021-022-09577-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/23/2022] [Indexed: 12/23/2022]
Abstract
We previously proposed a structure for recording consent-based data use 'categories' and 'requirements' - Consent Codes - with a view to supporting maximum use and integration of genomic research datasets, and reducing uncertainty about permissible re-use of shared data. Here we discuss clarifications and subsequent updates to the Consent Codes (v4) based on new areas of application (e.g., the neurosciences, biobanking, H3Africa), policy developments (e.g., return of research results), and further practical considerations, including developments in automated approaches to consent management.
7. Verhulst S, Young A. Identifying and addressing data asymmetries so as to enable (better) science. Front Big Data 2022; 5:888384. [PMID: 35923558 PMCID: PMC9339620 DOI: 10.3389/fdata.2022.888384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2022] [Accepted: 06/27/2022] [Indexed: 11/25/2022] Open
Abstract
As a society, we need to become more sophisticated in assessing and addressing data asymmetries—and their resulting political and economic power inequalities—particularly in the realm of open science, research, and development. This article seeks to start filling the analytical gap regarding data asymmetries globally, with a specific focus on the asymmetrical availability of privately-held data for open science, and a look at current efforts to address these data asymmetries. It provides a taxonomy of asymmetries, as well as both their societal and institutional impacts. Moreover, this contribution outlines a set of solutions that could provide a toolbox for open science practitioners and data demand-side actors that stand to benefit from increased access to data. The concept of data liquidity (and portability) is explored at length in connection with efforts to generate an ecosystem of responsible data exchanges. We also examine how data holders and demand-side actors are experimenting with new and emerging operational models and governance frameworks for purpose-driven, cross-sector data collaboratives that connect previously siloed datasets. Key solutions discussed include professionalizing and re-imagining data steward roles and functions (i.e., individuals or groups who are tasked with managing data and their ethical and responsible reuse within organizations). We present these solutions through case studies on notable efforts to address science data asymmetries. We examine these cases using a repurposable analytical framework that could inform future research. We conclude with recommended actions that could support the creation of an evidence base on work to address data asymmetries and unlock the public value of greater science data liquidity and responsible reuse.
8. Rodrigues EDS, Griffith S, Martin R, Antonescu C, Posey JE, Coban‐Akdemir Z, Jhangiani SN, Doheny KF, Lupski JR, Valle D, Bamshad MJ, Hamosh A, Sheffer A, Chong JX, Einhorn Y, Cupak M, Sobreira N. Variant-level matching for diagnosis and discovery: Challenges and opportunities. Hum Mutat 2022; 43:782-790. [PMID: 35191117 PMCID: PMC9133151 DOI: 10.1002/humu.24359] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/07/2021] [Revised: 02/15/2022] [Accepted: 02/18/2022] [Indexed: 11/30/2022]
Abstract
Here we describe MyGene2, Geno2MP, VariantMatcher, and Franklin; databases that provide variant-level information and phenotypic features to researchers, clinicians, healthcare providers and patients. Following the footsteps of the Matchmaker Exchange project that connects exome, genome, and phenotype databases at the gene level, these databases have as one goal to facilitate connection to one another using Data Connect, a standard for discovery and search of biomedical data from the Global Alliance for Genomics and Health (GA4GH).
Affiliation(s)
- Eliete da S. Rodrigues
  - McKusick‐Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Sean Griffith
  - McKusick‐Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Renan Martin
  - McKusick‐Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Corina Antonescu
  - McKusick‐Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Jennifer E. Posey
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
- Zeynep Coban‐Akdemir
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
- Shalini N. Jhangiani
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, USA
- Kimberly F. Doheny
  - McKusick‐Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- James R. Lupski
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, USA
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas, USA
  - Texas Children's Hospital, Houston, Texas, USA
- David Valle
  - McKusick‐Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Michael J. Bamshad
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, Washington, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, Washington, USA
- Ada Hamosh
  - McKusick‐Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Jessica X. Chong
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, Washington, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, Washington, USA
- Nara Sobreira
  - McKusick‐Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
9. Schatz MC, Philippakis AA, Afgan E, Banks E, Carey VJ, Carroll RJ, Culotti A, Ellrott K, Goecks J, Grossman RL, Hall IM, Hansen KD, Lawson J, Leek JT, Luria AO, Mosher S, Morgan M, Nekrutenko A, O’Connor BD, Osborn K, Paten B, Patterson C, Tan FJ, Taylor CO, Vessio J, Waldron L, Wang T, Wuichet K. Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space. CELL GENOMICS 2022; 2:100085. [PMID: 35199087 PMCID: PMC8863334 DOI: 10.1016/j.xgen.2021.100085] [Citation(s) in RCA: 47] [Impact Index Per Article: 23.5] [Indexed: 12/14/2022]
Abstract
The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL; https://anvilproject.org) was developed to address a widespread community need for a unified computing environment for genomics data storage, management, and analysis. In this perspective, we present AnVIL, describe its ecosystem and interoperability with other platforms, and highlight how this platform and associated initiatives contribute to improved genomic data sharing efforts. The AnVIL is a federated cloud platform designed to manage and store genomics and related data, enable population-scale analysis, and facilitate collaboration through the sharing of data, code, and analysis results. By inverting the traditional model of data sharing, the AnVIL eliminates the need for data movement while also adding security measures for active threat detection and monitoring and provides scalable, shared computing resources for any researcher. We describe the core data management and analysis components of the AnVIL, which currently consists of Terra, Gen3, Galaxy, RStudio/Bioconductor, Dockstore, and Jupyter, and describe several flagship genomics datasets available within the AnVIL. We continue to extend and innovate the AnVIL ecosystem by implementing new capabilities, including mechanisms for interoperability and responsible data sharing, while streamlining access management. The AnVIL opens many new opportunities for analysis, collaboration, and data sharing that are needed to drive research and to make discoveries through the joint analysis of hundreds of thousands to millions of genomes along with associated clinical and molecular data types.
Affiliation(s)
- Michael C. Schatz
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
  - Corresponding author
- Enis Afgan
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Eric Banks
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Robert J. Carroll
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Alessandro Culotti
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Translational Data Science, University of Chicago, Chicago, IL, USA
- Kyle Ellrott
  - Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Jeremy Goecks
  - Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Robert L. Grossman
  - Center for Translational Data Science, University of Chicago, Chicago, IL, USA
- Ira M. Hall
  - Yale School of Medicine, Yale University, New Haven, CT, USA
- Kasper D. Hansen
  - Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
- Jeffrey T. Leek
  - Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
- Stephen Mosher
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Martin Morgan
  - Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA
- Anton Nekrutenko
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, State College, PA, USA
- Kevin Osborn
  - UC Santa Cruz Genomics Institute, UC Santa Cruz, Santa Cruz, CA, USA
- Benedict Paten
  - UC Santa Cruz Genomics Institute, UC Santa Cruz, Santa Cruz, CA, USA
- Frederick J. Tan
  - Department of Embryology, Carnegie Institution, Baltimore, MD, USA
- Casey Overby Taylor
  - Departments of Medicine and Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Jennifer Vessio
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Levi Waldron
  - Department of Epidemiology and Biostatistics, City University of New York Graduate School of Public Health and Health Policy, New York, NY, USA
- Ting Wang
  - Department of Genetics, Washington University of St. Louis, St. Louis, MO, USA
- Kristin Wuichet
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
10. Cabili MN, Lawson J, Saltzman A, Rushton G, O’Rourke P, Wilbanks J, Rodriguez LL, Nyronen T, Courtot M, Donnelly S, Philippakis AA. Empirical validation of an automated approach to data use oversight. CELL GENOMICS 2021; 1:100031. [PMID: 36778584 PMCID: PMC9903839 DOI: 10.1016/j.xgen.2021.100031] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/28/2021] [Revised: 06/30/2021] [Accepted: 08/07/2021] [Indexed: 10/19/2022]
Abstract
The current paradigm for data use oversight of biomedical datasets is onerous, extending the timescale and resources needed to obtain access for secondary analyses, thus hindering scientific discovery. For a researcher to utilize a controlled-access dataset, a data access committee must review her research plans to determine whether they are consistent with the data use limitations (DULs) specified by the informed consent form. The newly created GA4GH data use ontology (DUO) holds the potential to streamline this process by making data use oversight computable. Here, we describe an open-source software platform, the Data Use Oversight System (DUOS), that connects with DUO terminology to enable automated data use oversight. We analyze dbGaP data acquired since 2006, finding an exponential increase in data access requests, which will not be sustainable with current manual oversight review. We perform an empirical evaluation of DUOS and DUO on selected datasets from the Broad Institute's data repository. We were able to structure 118/123 of the evaluated DULs (96%) and 52/52 (100%) of research proposals using DUO terminology, and we find that DUOS' automated data access adjudication in all cases agreed with the DAC manual review. This first empirical evaluation of the feasibility of automated data use oversight demonstrates comparable accuracy to human-based data access oversight in real-world data governance.
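The idea of making data use oversight "computable" can be illustrated with a small sketch. This is not the DUOS implementation and does not use the real DUO term hierarchy; it assumes a hypothetical three-code subset (labelled here `GRU` for general research use, `HMB` for health or biomedical research, and `DS` for disease-specific research) and hypothetical class and function names, purely to show how a structured data use limitation and a structured research purpose could be compared automatically.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataUseLimitation:
    """Structured consent restriction on a dataset (simplified, hypothetical)."""
    code: str                      # "GRU", "HMB", or "DS" in this toy model
    disease: Optional[str] = None  # only meaningful when code == "DS"


@dataclass(frozen=True)
class ResearchPurpose:
    """Structured description of a data access request (simplified, hypothetical)."""
    is_health_research: bool
    disease: Optional[str] = None  # disease under study, if any


def is_permitted(dul: DataUseLimitation, purpose: ResearchPurpose) -> bool:
    """Return True if the purpose falls within the limitation in this toy model."""
    if dul.code == "GRU":
        # General research use: any research purpose is acceptable.
        return True
    if dul.code == "HMB":
        # Health/biomedical use: the request must be health research.
        return purpose.is_health_research
    if dul.code == "DS":
        # Disease-specific use: the request must study the same disease.
        return purpose.disease is not None and purpose.disease == dul.disease
    # Unknown codes are never auto-approved; they would go to manual review.
    return False


if __name__ == "__main__":
    dataset = DataUseLimitation(code="DS", disease="asthma")
    request = ResearchPurpose(is_health_research=True, disease="asthma")
    print(is_permitted(dataset, request))
```

In this sketch anything the matcher does not recognise is rejected rather than approved, so unstructured or unusual limitations would still fall back to a data access committee.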
Affiliation(s)
- Moran N. Cabili
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
- Jonathan Lawson
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
- Andrea Saltzman
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
- Greg Rushton
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
- Tommi Nyronen
  - ELIXIR Finland, CSC - IT Center for Science, Espoo, Finland
- Mélanie Courtot
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Stacey Donnelly (corresponding author)
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
- Anthony A. Philippakis (corresponding author)
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
|
11
|
Lawson J, Cabili MN, Kerry G, Boughtwood T, Thorogood A, Alper P, Bowers SR, Boyles RR, Brookes AJ, Brush M, Burdett T, Clissold H, Donnelly S, Dyke SO, Freeberg MA, Haendel MA, Hata C, Holub P, Jeanson F, Jene A, Kawashima M, Kawashima S, Konopko M, Kyomugisha I, Li H, Linden M, Rodriguez LL, Morita M, Mulder N, Muller J, Nagaie S, Nasir J, Ogishima S, Ota Wang V, Paglione LD, Pandya RN, Parkinson H, Philippakis AA, Prasser F, Rambla J, Reinold K, Rushton GA, Saltzman A, Saunders G, Sofia HJ, Spalding JD, Swertz MA, Tulchinsky I, van Enckevort EJ, Varma S, Voisin C, Yamamoto N, Yamasaki C, Zass L, Guidry Auvil JM, Nyrönen TH, Courtot M. The Data Use Ontology to streamline responsible access to human biomedical datasets. CELL GENOMICS 2021; 1:100028. [PMID: 34820659 PMCID: PMC8591903 DOI: 10.1016/j.xgen.2021.100028] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 02/28/2021] [Revised: 07/02/2021] [Accepted: 08/09/2021] [Indexed: 11/25/2022]
Abstract
Human biomedical datasets that are critical for research and clinical studies to benefit human health also often contain sensitive or potentially identifying information of individual participants. Thus, care must be taken when they are processed and made available to comply with ethical and regulatory frameworks and informed consent data conditions. To enable and streamline data access for these biomedical datasets, the Global Alliance for Genomics and Health (GA4GH) Data Use and Researcher Identities (DURI) work stream developed and approved the Data Use Ontology (DUO) standard. DUO is a hierarchical vocabulary of human and machine-readable data use terms that consistently and unambiguously represents a dataset's allowable data uses. DUO has been implemented by major international stakeholders such as the Broad and Sanger Institutes and is currently used in annotation of over 200,000 datasets worldwide. Using DUO in data management and access facilitates researchers' discovery and access of relevant datasets. DUO annotations increase the FAIRness of datasets and support data linkages using common data use profiles when integrating the data for secondary analyses. DUO is implemented in the Web Ontology Language (OWL) and, to increase community awareness and engagement, hosted in an open, centralized GitHub repository. DUO, together with the GA4GH Passport standard, offers a new, efficient, and streamlined data authorization and access framework that has enabled increased sharing of biomedical datasets worldwide.
Affiliation(s)
- Jonathan Lawson
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
- Moran N. Cabili
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
- Giselle Kerry
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Tiffany Boughtwood
  - Australian Genomics, Murdoch Children’s Research Institute, Parkville, VIC, Australia
- Adrian Thorogood
  - Centre of Genomics and Policy, Department of Human Genetics, McGill University, Montreal, QC, Canada
  - ELIXIR-Luxembourg, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
- Pinar Alper
  - ELIXIR-Luxembourg, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
- Matthew Brush
  - University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Tony Burdett
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Hayley Clissold
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Stacey Donnelly
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
- Stephanie O.M. Dyke
  - McGill Centre for Integrative Neuroscience, Montreal Neurological Institute, Department of Neurology & Neurosurgery, Faculty of Medicine, McGill University, Montreal, QC, Canada
- Mallory A. Freeberg
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Chihiro Hata
  - Bioinformation and DDBJ Center, National Institute of Genetics, Mishima, Japan
- Petr Holub
  - BBMRI-ERIC, AT and Masaryk University, Brno, Czech Republic
- Aina Jene
  - Centre de Regulació Genòmica (CRG), Barcelona, Spain
- Minae Kawashima
  - National Bioscience Database Center, Japan Science and Technology Agency, Tokyo, Japan
- Shuichi Kawashima
  - Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa, Japan
- Irene Kyomugisha
  - Division of Human Genetics, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Haoyuan Li
  - Canada’s Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
- Mikael Linden
  - ELIXIR-Finland, CSC - IT Center for Science Ltd, Espoo, Finland
- Nicola Mulder
  - Computational Biology Division, IDM, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Jean Muller
  - Laboratoire de Génétique Médicale, Institut de Génétique Médicale d’Alsace, INSERM U1112, Université de Strasbourg, Strasbourg, France
  - Laboratoire de Diagnostic Génétique, Institut de Génétique Médicale d’Alsace, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
- Satoshi Nagaie
  - Tohoku Medical Megabank Organization (ToMMo), Tohoku University, Sendai, Japan
- Jamal Nasir
  - Department of Life Sciences, University of Northampton, Northampton, UK
- Soichi Ogishima
  - Tohoku Medical Megabank Organization (ToMMo), Tohoku University, Sendai, Japan
- Vivian Ota Wang
  - Office of Data Sharing, National Cancer Institute, NIH, Rockville, MD, USA
- Helen Parkinson
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Anthony A. Philippakis
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
- Fabian Prasser
  - Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Jordi Rambla
  - Centre de Regulació Genòmica (CRG), Barcelona, Spain
- Kathy Reinold
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
- Gregory A. Rushton
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
- Andrea Saltzman
  - Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
- Heidi J. Sofia
  - National Human Genome Research Institute, NIH, Bethesda, MD, USA
- John D. Spalding
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Morris A. Swertz
  - Genomics Coordination Center, Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Esther J. van Enckevort
  - Genomics Coordination Center, Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Susheel Varma
  - Health Data Research UK, Gibbs Building, 215 Euston Road, London NW1 2BE, UK
- Lyndon Zass
  - Computational Biology Division, IDM, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Mélanie Courtot (corresponding author)
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
|
12
|
Rehm HL, Page AJ, Smith L, Adams JB, Alterovitz G, Babb LJ, Barkley MP, Baudis M, Beauvais MJ, Beck T, Beckmann JS, Beltran S, Bernick D, Bernier A, Bonfield JK, Boughtwood TF, Bourque G, Bowers SR, Brookes AJ, Brudno M, Brush MH, Bujold D, Burdett T, Buske OJ, Cabili MN, Cameron DL, Carroll RJ, Casas-Silva E, Chakravarty D, Chaudhari BP, Chen SH, Cherry JM, Chung J, Cline M, Clissold HL, Cook-Deegan RM, Courtot M, Cunningham F, Cupak M, Davies RM, Denisko D, Doerr MJ, Dolman LI, Dove ES, Dursi LJ, Dyke SO, Eddy JA, Eilbeck K, Ellrott KP, Fairley S, Fakhro KA, Firth HV, Fitzsimons MS, Fiume M, Flicek P, Fore IM, Freeberg MA, Freimuth RR, Fromont LA, Fuerth J, Gaff CL, Gan W, Ghanaim EM, Glazer D, Green RC, Griffith M, Griffith OL, Grossman RL, Groza T, Guidry Auvil JM, Guigó R, Gupta D, Haendel MA, Hamosh A, Hansen DP, Hart RK, Hartley DM, Haussler D, Hendricks-Sturrup RM, Ho CW, Hobb AE, Hoffman MM, Hofmann OM, Holub P, Hsu JS, Hubaux JP, Hunt SE, Husami A, Jacobsen JO, Jamuar SS, Janes EL, Jeanson F, Jené A, Johns AL, Joly Y, Jones SJ, Kanitz A, Kato K, Keane TM, Kekesi-Lafrance K, Kelleher J, Kerry G, Khor SS, Knoppers BM, Konopko MA, Kosaki K, Kuba M, Lawson J, Leinonen R, Li S, Lin MF, Linden M, Liu X, Liyanage IU, Lopez J, Lucassen AM, Lukowski M, Mann AL, Marshall J, Mattioni M, Metke-Jimenez A, Middleton A, Milne RJ, Molnár-Gábor F, Mulder N, Munoz-Torres MC, Nag R, Nakagawa H, Nasir J, Navarro A, Nelson TH, Niewielska A, Nisselle A, Niu J, Nyrönen TH, O’Connor BD, Oesterle S, Ogishima S, Ota Wang V, Paglione LA, Palumbo E, Parkinson HE, Philippakis AA, Pizarro AD, Prlic A, Rambla J, Rendon A, Rider RA, Robinson PN, Rodarmer KW, Rodriguez LL, Rubin AF, Rueda M, Rushton GA, Ryan RS, Saunders GI, Schuilenburg H, Schwede T, Scollen S, Senf A, Sheffield NC, Skantharajah N, Smith AV, Sofia HJ, Spalding D, Spurdle AB, Stark Z, Stein LD, Suematsu M, Tan P, Tedds JA, Thomson AA, Thorogood A, Tickle TL, Tokunaga K, Törnroos J, Torrents D, Upchurch S, Valencia A, 
Guimera RV, Vamathevan J, Varma S, Vears DF, Viner C, Voisin C, Wagner AH, Wallace SE, Walsh BP, Williams MS, Winkler EC, Wold BJ, Wood GM, Woolley JP, Yamasaki C, Yates AD, Yung CK, Zass LJ, Zaytseva K, Zhang J, Goodhand P, North K, Birney E. GA4GH: International policies and standards for data sharing across genomic research and healthcare. CELL GENOMICS 2021; 1:100029. [PMID: 35072136 PMCID: PMC8774288 DOI: 10.1016/j.xgen.2021.100029] [Citation(s) in RCA: 75] [Impact Index Per Article: 25.0] [Indexed: 01/08/2023]
Abstract
The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits.
Affiliation(s)
- Heidi L. Rehm
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Massachusetts General Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Angela J.H. Page
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
- Lindsay Smith
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
  - Ontario Institute for Cancer Research, Toronto, ON, Canada
- Jeremy B. Adams
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
  - Ontario Institute for Cancer Research, Toronto, ON, Canada
- Gil Alterovitz
  - Brigham and Women’s Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Michael Baudis
  - University of Zurich, Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Michael J.S. Beauvais
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
  - McGill University, Montreal, QC, Canada
- Tim Beck
  - University of Leicester, Leicester, UK
- Sergi Beltran
  - CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Universitat de Barcelona, Barcelona, Spain
- David Bernick
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Tiffany F. Boughtwood
  - Australian Genomics, Parkville, VIC, Australia
  - Murdoch Children’s Research Institute, Parkville, VIC, Australia
- Guillaume Bourque
  - McGill University, Montreal, QC, Canada
  - Canadian Center for Computational Genomics, Montreal, QC, Canada
- Michael Brudno
  - Canadian Center for Computational Genomics, Montreal, QC, Canada
  - University of Toronto, Toronto, ON, Canada
  - University Health Network, Toronto, ON, Canada
  - Vector Institute, Toronto, ON, Canada
  - Canadian Distributed Infrastructure for Genomics (CanDIG), Toronto, ON, Canada
- David Bujold
  - McGill University, Montreal, QC, Canada
  - Canadian Center for Computational Genomics, Montreal, QC, Canada
  - Canadian Distributed Infrastructure for Genomics (CanDIG), Toronto, ON, Canada
- Tony Burdett
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Daniel L. Cameron
  - Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - University of Melbourne, Melbourne, VIC, Australia
- Bimal P. Chaudhari
  - Nationwide Children’s Hospital, Columbus, OH, USA
  - The Ohio State University, Columbus, OH, USA
- Shu Hui Chen
  - National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Justina Chung
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
  - Ontario Institute for Cancer Research, Toronto, ON, Canada
- Melissa Cline
  - UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
- Mélanie Courtot
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Fiona Cunningham
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- L. Jonathan Dursi
  - University Health Network, Toronto, ON, Canada
  - Canadian Distributed Infrastructure for Genomics (CanDIG), Toronto, ON, Canada
- Susan Fairley
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Khalid A. Fakhro
  - Sidra Medicine, Doha, Qatar
  - Weill Cornell Medicine - Qatar, Doha, Qatar
- Helen V. Firth
  - Wellcome Sanger Institute, Hinxton, UK
  - Addenbrooke’s Hospital, Cambridge, UK
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Ian M. Fore
  - National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Mallory A. Freeberg
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Lauren A. Fromont
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Clara L. Gaff
  - Australian Genomics, Parkville, VIC, Australia
  - Murdoch Children’s Research Institute, Parkville, VIC, Australia
  - Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - University of Melbourne, Melbourne, VIC, Australia
- Weiniu Gan
  - National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Elena M. Ghanaim
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- David Glazer
  - Verily Life Sciences, South San Francisco, CA, USA
- Robert C. Green
  - Brigham and Women’s Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Malachi Griffith
  - Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Obi L. Griffith
  - Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Roderic Guigó
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Dipayan Gupta
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Ada Hamosh
  - Johns Hopkins University, Baltimore, MD, USA
- David P. Hansen
  - Australian Genomics, Parkville, VIC, Australia
  - The Australian e-Health Research Centre, CSIRO, Herston, QLD, Australia
- Reece K. Hart
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Invitae, San Francisco, CA, USA
  - MyOme, Inc, San Bruno, CA, USA
- David Haussler
  - UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
  - Howard Hughes Medical Institute, University of California, Santa Cruz, CA, USA
- Michael M. Hoffman
  - University of Toronto, Toronto, ON, Canada
  - University Health Network, Toronto, ON, Canada
  - Vector Institute, Toronto, ON, Canada
- Oliver M. Hofmann
  - University of Toronto, Toronto, ON, Canada
  - University of Melbourne, Melbourne, VIC, Australia
- Petr Holub
  - BBMRI-ERIC, Graz, Austria
  - Masaryk University, Brno, Czech Republic
- Sarah E. Hunt
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Ammar Husami
  - Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
- Saumya S. Jamuar
  - SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Republic of Singapore
  - SingHealth Duke-NUS Institute of Precision Medicine, Singapore, Republic of Singapore
- Elizabeth L. Janes
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
  - University of Waterloo, Waterloo, ON, Canada
- Aina Jené
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Amber L. Johns
  - Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
- Yann Joly
  - McGill University, Montreal, QC, Canada
- Steven J.M. Jones
  - Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Alexander Kanitz
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - University of Basel, Basel, Switzerland
- Thomas M. Keane
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
  - University of Nottingham, Nottingham, UK
- Kristina Kekesi-Lafrance
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
  - McGill University, Montreal, QC, Canada
- Giselle Kerry
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Seik-Soon Khor
  - National Center for Global Health and Medicine Hospital, Tokyo, Japan
  - University of Tokyo, Tokyo, Japan
- Rasko Leinonen
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Stephanie Li
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
- Mikael Linden
  - CSC–IT Center for Science, Espoo, Finland
  - ELIXIR Finland, Espoo, Finland
- Isuru Udara Liyanage
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Alice L. Mann
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
  - Wellcome Sanger Institute, Hinxton, UK
- Anna Middleton
  - Wellcome Connecting Science, Hinxton, UK
  - University of Cambridge, Cambridge, UK
- Richard J. Milne
  - Wellcome Connecting Science, Hinxton, UK
  - University of Cambridge, Cambridge, UK
- Nicola Mulder
  - H3ABioNet, Computational Biology Division, IDM, Faculty of Health Sciences, Cape Town, South Africa
- Rishi Nag
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Hidewaki Nakagawa
  - Japan Agency for Medical Research & Development (AMED), Tokyo, Japan
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Arcadi Navarro
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
  - Barcelonaβeta Brain Research Center (BBRC), Pasqual Maragall Foundation, Barcelona, Spain
- Ania Niewielska
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Amy Nisselle
  - Murdoch Children’s Research Institute, Parkville, VIC, Australia
  - University of Melbourne, Melbourne, VIC, Australia
  - Human Genetics Society of Australasia Education, Ethics & Social Issues Committee, Alexandria, NSW, Australia
- Jeffrey Niu
  - University Health Network, Toronto, ON, Canada
- Tommi H. Nyrönen
  - CSC–IT Center for Science, Espoo, Finland
  - ELIXIR Finland, Espoo, Finland
- Sabine Oesterle
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Vivian Ota Wang
  - National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Emilio Palumbo
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Helen E. Parkinson
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Jordi Rambla
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Renee A. Rider
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Peter N. Robinson
  - The Jackson Laboratory, Farmington, CT, USA
  - University of Connecticut, Farmington, CT, USA
- Kurt W. Rodarmer
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Alan F. Rubin
  - Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - University of Melbourne, Melbourne, VIC, Australia
- Manuel Rueda
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Helen Schuilenburg
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Torsten Schwede
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - University of Basel, Basel, Switzerland
- Neerjah Skantharajah
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
  - Ontario Institute for Cancer Research, Toronto, ON, Canada
- Heidi J. Sofia
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Dylan Spalding
  - CSC–IT Center for Science, Espoo, Finland
  - ELIXIR Finland, Espoo, Finland
- Zornitza Stark
  - Australian Genomics, Parkville, VIC, Australia
  - Murdoch Children’s Research Institute, Parkville, VIC, Australia
  - University of Melbourne, Melbourne, VIC, Australia
- Lincoln D. Stein
  - Ontario Institute for Cancer Research, Toronto, ON, Canada
  - University of Toronto, Toronto, ON, Canada
- Patrick Tan
  - SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Republic of Singapore
  - Precision Health Research Singapore, Singapore, Republic of Singapore
  - Genome Institute of Singapore, Singapore, Republic of Singapore
- Alastair A. Thomson
  - National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Adrian Thorogood
  - McGill University, Montreal, QC, Canada
  - University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Katsushi Tokunaga
  - University of Tokyo, Tokyo, Japan
  - National Center for Global Health and Medicine, Tokyo, Japan
- Juha Törnroos
  - CSC–IT Center for Science, Espoo, Finland
  - ELIXIR Finland, Espoo, Finland
- David Torrents
  - Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
  - Barcelona Supercomputing Center, Barcelona, Spain
- Sean Upchurch
  - California Institute of Technology, Pasadena, CA, USA
- Alfonso Valencia
  - Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
  - Barcelona Supercomputing Center, Barcelona, Spain
- Jessica Vamathevan
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Susheel Varma
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
  - Health Data Research UK, London, UK
- Danya F. Vears
  - Murdoch Children’s Research Institute, Parkville, VIC, Australia
  - University of Melbourne, Melbourne, VIC, Australia
  - Human Genetics Society of Australasia Education, Ethics & Social Issues Committee, Alexandria, NSW, Australia
  - Melbourne Law School, University of Melbourne, Parkville, VIC, Australia
- Coby Viner
  - University of Toronto, Toronto, ON, Canada
  - University Health Network, Toronto, ON, Canada
- Alex H. Wagner
  - Nationwide Children’s Hospital, Columbus, OH, USA
  - The Ohio State University, Columbus, OH, USA
- Eva C. Winkler
  - Section of Translational Medical Ethics, University Hospital Heidelberg, Heidelberg, Germany
- Andrew D. Yates
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Christina K. Yung
  - Ontario Institute for Cancer Research, Toronto, ON, Canada
  - Indoc Research, Toronto, ON, Canada
- Lyndon J. Zass
  - H3ABioNet, Computational Biology Division, IDM, Faculty of Health Sciences, Cape Town, South Africa
- Ksenia Zaytseva
  - McGill University, Montreal, QC, Canada
  - Canadian Centre for Computational Genomics, Montreal, QC, Canada
- Junjun Zhang
  - Ontario Institute for Cancer Research, Toronto, ON, Canada
- Peter Goodhand
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
  - Ontario Institute for Cancer Research, Toronto, ON, Canada
- Kathryn North
  - Murdoch Children’s Research Institute, Parkville, VIC, Australia
  - University of Toronto, Toronto, ON, Canada
  - University of Melbourne, Melbourne, VIC, Australia
- Ewan Birney
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
  - European Molecular Biology Laboratory, Heidelberg, Germany
|
13
|
I can drive in Iceland: Enabling international joint analyses. CELL GENOMICS 2021; 1:100034. [PMID: 36778587 PMCID: PMC9903678 DOI: 10.1016/j.xgen.2021.100034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
In this issue of Cell Genomics, GA4GH reports key efforts to help share data across enclaves, including a framework for responsible data sharing, a data use ontology, and approaches for data use oversight. While there remains work in establishing reciprocity between data providers, we envision a future where joint analysis across enclaves is as easy as driving in different countries.
|
14
|
Thorogood A, Rehm HL, Goodhand P, Page AJ, Joly Y, Baudis M, Rambla J, Navarro A, Nyronen TH, Linden M, Dove ES, Fiume M, Brudno M, Cline MS, Birney E. International federation of genomic medicine databases using GA4GH standards. CELL GENOMICS 2021; 1:100032. [PMID: 35128509 PMCID: PMC8813094 DOI: 10.1016/j.xgen.2021.100032] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/20/2022]
Abstract
We promote a shared vision and guide for how and when to federate genomic and health-related data sharing, enabling connections and insights across independent, secure databases. The GA4GH encourages a federated approach wherein data providers have the mandate and resources to share, but where data cannot move for legal or technical reasons. We recommend a federated approach to connect national genomics initiatives into a global network and precision medicine resource.
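The federated pattern described above, in which queries travel to the data rather than the data travelling to the analyst, can be sketched as follows. This is an illustrative toy under stated assumptions, not a GA4GH interface: the `Site` class, the `federated_count` function, the record fields, and the site names are all hypothetical, and a real deployment would add authentication, access control, and disclosure safeguards that are omitted here.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


class Site:
    """A hypothetical data provider that answers queries locally."""

    def __init__(self, name: str, records: List[Record]) -> None:
        self.name = name
        self._records = records  # individual-level data stays inside the site

    def count(self, predicate: Callable[[Record], bool]) -> int:
        """Run the query on site and return only an aggregate count."""
        return sum(1 for record in self._records if predicate(record))


def federated_count(sites: List[Site],
                    predicate: Callable[[Record], bool]) -> Dict[str, int]:
    """Send the same query to every site and collect per-site aggregates."""
    return {site.name: site.count(predicate) for site in sites}


# Made-up example data for two made-up sites.
site_a = Site("site_a", [{"gene": "BRCA1"}, {"gene": "TP53"}, {"gene": "BRCA1"}])
site_b = Site("site_b", [{"gene": "BRCA1"}, {"gene": "CFTR"}])

per_site = federated_count([site_a, site_b], lambda r: r["gene"] == "BRCA1")
total = sum(per_site.values())
```

Here `per_site` holds one count per provider and `total` combines them, while the record lists themselves are never copied out of their `Site` objects. That separation is the design choice the abstract motivates: providers keep custody of data that cannot move for legal or technical reasons, yet still contribute to a joint answer.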
Affiliation(s)
- Adrian Thorogood
  - ELIXIR-Luxembourg and Biocore, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
  - Centre of Genomics and Policy, Department of Human Genetics, McGill University, Montreal, QC, Canada
- Heidi L. Rehm
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Peter Goodhand
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
  - Ontario Institute for Cancer Research, Toronto, ON, Canada
- Angela J.H. Page
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Global Alliance for Genomics and Health, Toronto, ON, Canada
- Yann Joly
  - Centre of Genomics and Policy, Department of Human Genetics, McGill University, Montreal, QC, Canada
- Michael Baudis
  - University of Zurich and Swiss Institute of Bioinformatics, Zurich, Switzerland
- Jordi Rambla
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Arcadi Navarro
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Institute of Evolutionary Biology (UPF-CSIC), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
  - Barcelonaβeta Brain Research Center (BBRC), Pasqual Maragall Foundation, Barcelona, Spain
- Tommi H. Nyronen
  - CSC - IT Center for Science, Life Science Center, Espoo, Finland
  - ELIXIR-Europe (Finland), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Mikael Linden
  - CSC - IT Center for Science, Life Science Center, Espoo, Finland
  - ELIXIR-Europe (Finland), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Michael Brudno
  - Department of Computer Science, University of Toronto and University Health Network, Toronto, ON, Canada
- Melissa S. Cline
  - UC Santa Cruz Genomics Institute, Mail Stop: Genomics, 1156 High Street, Santa Cruz, CA 95064, USA
- Ewan Birney
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridgeshire, UK
|
15
|
CanDIG: Federated network across Canada for multi-omic and health data discovery and analysis. CELL GENOMICS 2021; 1:100033. [PMID: 36778585 PMCID: PMC9903648 DOI: 10.1016/j.xgen.2021.100033] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/17/2022]
Abstract
We present the Canadian Distributed Infrastructure for Genomics (CanDIG) platform, which enables federated querying and analysis of human genomics and linked biomedical data. CanDIG leverages the standards and frameworks of the Global Alliance for Genomics and Health (GA4GH) and currently hosts data for five pan-Canadian projects. We describe CanDIG's key design decisions and features as a guide for other federated data systems.
|