Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Naveed M, Ayday E, Clayton EW, Fellay J, Gunter CA, Hubaux JP, Malin BA, Wang X. Privacy in the Genomic Era. ACM Comput Surv 2015;48:6. [PMID: 26640318 PMCID: PMC4666540 DOI: 10.1145/2767007] [Citation(s) in RCA: 78] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/01/2014] [Accepted: 04/01/2015] [Indexed: 05/19/2023]



Cited by Other Article(s)

Brauneck A, Schmalhorst L, Weiss S, Baumbach L, Völker U, Ellinghaus D, Baumbach J, Buchholtz G. Legal aspects of privacy-enhancing technologies in genome-wide association studies and their impact on performance and feasibility. Genome Biol 2024;25:154. [PMID: 38872191 PMCID: PMC11170858 DOI: 10.1186/s13059-024-03296-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2023] [Accepted: 06/03/2024] [Indexed: 06/15/2024] Open

Cavinato T, Rubinacci S, Malaspinas AS, Delaneau O. A resampling-based approach to share reference panels. NATURE COMPUTATIONAL SCIENCE 2024;4:360-366. [PMID: 38745108 PMCID: PMC11136649 DOI: 10.1038/s43588-024-00630-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/02/2023] [Accepted: 04/16/2024] [Indexed: 05/16/2024]

Koh AS, Bos HMW, Rothblum ED, Carone N, Gartrell NK. Donor sibling relations among adult offspring conceived via insemination by lesbian parents. Hum Reprod 2023;38:2166-2174. [PMID: 37697711 DOI: 10.1093/humrep/dead175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2023] [Revised: 08/13/2023] [Indexed: 09/13/2023] Open

Abstract

STUDY QUESTION

How do adult offspring in planned lesbian-parent families feel about and relate to their donor (half) sibling(s) (DS)?

SUMMARY ANSWER

A majority of offspring had found DS and maintained good ongoing relationships, and all offspring (regardless of whether a DS had been identified) were satisfied with their knowledge of and contact level with the DS.

WHAT IS KNOWN ALREADY

The first generation of donor insemination offspring of intended lesbian-parent families is now in their 30s. Coincident with this is an increased use of DNA testing and genetic ancestry websites, facilitating the discovery of donor siblings from a common sperm donor. Few studies of offspring and their DS include sexual minority parent (SMP) families, and only sparse data separately analyze the offspring of SMP families or extend the analyses to established adult offspring.

STUDY DESIGN, SIZE, DURATION

This cohort study included 75 adult offspring, longitudinally followed since conception in lesbian-parent families. Quantitative analyses were performed from online surveys of the offspring in the seventh wave of the 36-year study, with a 90% family retention rate. The data were collected from March 2021 to November 2022.

PARTICIPANTS/MATERIALS, SETTING, METHODS

Participants were 30- to 33-year-old donor insemination offspring whose lesbian parents enrolled in a US prospective longitudinal study when these offspring were conceived. Offspring who knew of a DS were asked about their numbers found, characteristics or motivations for meeting, DS terminology, relationship quality and maintenance, and impact of the DS contact on others. All offspring (with or without known DS) were asked about the importance of knowing if they have DS and their terminology, satisfaction with information about DS, and feelings about future contact.

MAIN RESULTS AND THE ROLE OF CHANCE

Of offspring, 53% (n = 40) had found DS in modest numbers, via a DS or sperm bank registry in 45% of cases, and most of these offspring had made contact. The offspring had their meeting motivations fulfilled, viewed the DS as acquaintances more often than siblings or friends, and maintained good relationships via meetings, social media, and cell phone communication. They disclosed their DS meetings to most relatives with neutral impact. The offspring, whether with known or unknown DS, felt neutral about the importance of knowing if they had DS, were satisfied with what they knew (or did not know) of the DS, and were satisfied with their current level of DS contact. This study is the largest, longest-running longitudinal study of intended lesbian-parent families and their offspring, and due to its prospective nature, is not biased by over-sampling offspring who were already satisfied with their DS.

LIMITATIONS, REASONS FOR CAUTION

The sample was from the USA, and mostly White, highly educated individuals, not representative of the diversity of donor insemination offspring of lesbian-parent families.

WIDER IMPLICATIONS OF THE FINDINGS

While about half of the offspring found out about DS, the other half did not. Regardless of knowing of a DS, these adult offspring of lesbian parents were satisfied with their level of DS contact. Early disclosure and identity formation about being donor-conceived in a lesbian-parent family may distinguish these study participants from donor insemination offspring and adoptees in the general population, who may be more compelled to seek genetic relatives. The study participants who sought DS mostly found a modest number of them, in contrast to reports in studies that have found large numbers of DS. This may be because one-third of study offspring had donors known to the families since conception, who may have been less likely to participate in commercial sperm banking or internet donation sites, where quotas are difficult to enforce or nonexistent. The study results have implications for anyone considering gamete donation, gamete donors, donor-conceived offspring, and/or gamete banks, as well as the medical and public policy professionals who advise them.

STUDY FUNDING/COMPETING INTEREST(S)

No funding was provided for this project. The authors have no competing interests.

TRIAL REGISTRATION NUMBER

N/A.


Ayday E, Vaidya J, Jiang X, Telenti A. Ensuring Trust in Genomics Research. ... IEEE INTERNATIONAL CONFERENCE ON TRUST, PRIVACY AND SECURITY IN INTELLIGENT SYSTEMS AND APPLICATIONS : (TPS-ISA ...). IEEE INTERNATIONAL CONFERENCE ON TRUST, PRIVACY AND SECURITY IN INTELLIGENT SYSTEMS AND APPLICATIONS 2023;2023:1-12. [PMID: 38562180 PMCID: PMC10981793 DOI: 10.1109/tps-isa58951.2023.00011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/04/2024]

Sadhuka S, Fridman D, Berger B, Cho H. Assessing transcriptomic reidentification risks using discriminative sequence models. Genome Res 2023;33:1101-1112. [PMID: 37541758 PMCID: PMC10538488 DOI: 10.1101/gr.277699.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2023] [Accepted: 04/19/2023] [Indexed: 08/06/2023]

Liu W, Zhang Y, Yang H, Meng Q. A Survey on Differential Privacy for Medical Data Analysis. ANNALS OF DATA SCIENCE 2023;11:1-15. [PMID: 38625247 PMCID: PMC10257172 DOI: 10.1007/s40745-023-00475-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/28/2023] [Revised: 05/16/2023] [Accepted: 05/22/2023] [Indexed: 12/01/2023]

Gyngell C, Lynch F, Vears D, Bowman-Smart H, Savulescu J, Christodoulou J. Storing paediatric genomic data for sequential interrogation across the lifespan. JOURNAL OF MEDICAL ETHICS 2023:jme-2022-108471. [PMID: 37263770 DOI: 10.1136/jme-2022-108471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/09/2022] [Accepted: 03/02/2023] [Indexed: 06/03/2023]

Jiang Y, Shang T, Liu J. Secure Counting Query Protocol for Genomic Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023;20:1457-1468. [PMID: 35666798 DOI: 10.1109/tcbb.2022.3178446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]

Zhou J, Lei B, Lang H, Panaousis E, Liang K, Xiang J. Secure genotype imputation using homomorphic encryption. JOURNAL OF INFORMATION SECURITY AND APPLICATIONS 2023. [DOI: 10.1016/j.jisa.2022.103386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/09/2022]

Hwang S, Ozturk E, Tsudik G. Balancing Security and Privacy in Genomic Range Queries*. ACM TRANSACTIONS ON PRIVACY AND SECURITY 2022. [DOI: 10.1145/3575796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

Al Aziz MM, Thulasiraman P, Mohammed N. Parallel and private generalized suffix tree construction and query on genomic data. BMC Genom Data 2022;23:45. [PMID: 35715724 PMCID: PMC9206251 DOI: 10.1186/s12863-022-01053-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2021] [Accepted: 04/25/2022] [Indexed: 11/10/2022] Open

Fierro-Monti I, Wright JC, Choudhary JS, Vizcaíno JA. Identifying individuals using proteomics: are we there yet? Front Mol Biosci 2022;9:1062031. [PMID: 36523653 PMCID: PMC9744771 DOI: 10.3389/fmolb.2022.1062031] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2022] [Accepted: 11/16/2022] [Indexed: 08/31/2023] Open

Thaldar DW, Townsend BA, Donnelly DL, Botes M, Gooden A, van Harmelen J, Shozi B. The multidimensional legal nature of personal genomic sequence data: A South African perspective. Front Genet 2022;13:997595. [PMID: 36437942 PMCID: PMC9681828 DOI: 10.3389/fgene.2022.997595] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2022] [Accepted: 09/28/2022] [Indexed: 10/19/2023] Open

Al Aziz MM, Anjum MM, Mohammed N, Jiang X. Generalized Genomic Data Sharing for Differentially Private Federated Learning. J Biomed Inform 2022;132:104113. [PMID: 35690350 DOI: 10.1016/j.jbi.2022.104113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2021] [Revised: 03/28/2022] [Accepted: 06/05/2022] [Indexed: 10/18/2022]

Hartung M, Anastasi E, Mamdouh ZM, Nogales C, Schmidt HHHW, Baumbach J, Zolotareva O, List M. Cancer driver drug interaction explorer. Nucleic Acids Res 2022;50:W138-W144. [PMID: 35580047 PMCID: PMC9252786 DOI: 10.1093/nar/gkac384] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2022] [Revised: 04/06/2022] [Accepted: 04/29/2022] [Indexed: 12/16/2022] Open

Nakagawa Y, Ohata S, Shimizu K. Efficient privacy-preserving variable-length substring match for genome sequence. Algorithms Mol Biol 2022;17:9. [PMID: 35473587 PMCID: PMC9040336 DOI: 10.1186/s13015-022-00211-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2021] [Accepted: 03/01/2022] [Indexed: 11/28/2022] Open

Abstract

The development of a privacy-preserving technology is important for accelerating genome data sharing. This study proposes an algorithm that securely searches a variable-length substring match between a query and a database sequence. Our concept hinges on a technique that efficiently applies FM-index for a secret-sharing scheme. More precisely, we developed an algorithm that can achieve a secure table lookup in such a way that $V[V[\ldots V[p_0] \ldots ]]$ is computed for a given depth of recursion where $p_0$ is an initial position, and $V$ is a vector. We used the secure table lookup for vectors created based on FM-index. The notable feature of the secure table lookup is that time, communication, and round complexities are not dependent on the table length N, after the query input. Therefore, a substring match by reference to the FM-index-based table can also be conducted independently against the database length, and the entire search time is dramatically improved compared to previous approaches. We conducted an experiment using a human genome sequence with the length of 10 million as the database and a query with the length of 100 and found that the query response time of our protocol was at least three orders of magnitude faster than a non-indexed database search protocol under the realistic computation/network environment.
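As a rough illustration of the quantity named in this abstract, the nested lookup V[V[...V[p_0]...]] can be sketched in the clear as a simple loop. This is only a plaintext sketch: it does not implement the secret-sharing protocol or the FM-index construction from the paper, and the function name `iterated_lookup` and the example vector are hypothetical.

```python
def iterated_lookup(table, start, depth):
    """Return V[V[...V[p0]...]] with `depth` nested lookups (plaintext only).

    `table` plays the role of the vector V and `start` the initial position p0.
    A secure protocol would run these steps on secret-shared values instead.
    """
    position = start
    for _ in range(depth):
        # Each step feeds the previous result back in as the next index.
        position = table[position]
    return position


# Hypothetical example vector: every entry is a valid index into the table.
V = [2, 0, 3, 1]
result = iterated_lookup(V, start=0, depth=3)  # V[V[V[0]]]
```

In this plaintext form each step costs one array access regardless of the table length, which loosely mirrors the abstract's point that the online cost depends on the recursion depth rather than on N.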


Yilmaz E, Ji T, Ayday E, Li P. Genomic Data Sharing under Dependent Local Differential Privacy. CODASPY : PROCEEDINGS OF THE ... ACM CONFERENCE ON DATA AND APPLICATION SECURITY AND PRIVACY. ACM CONFERENCE ON DATA AND APPLICATION SECURITY & PRIVACY 2022;2022:77-88. [PMID: 35531063 PMCID: PMC9073402 DOI: 10.1145/3508398.3511519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]

Personalized workflows in reconstructive dentistry-current possibilities and future opportunities. Clin Oral Investig 2022;26:4283-4290. [PMID: 35352184 PMCID: PMC9203374 DOI: 10.1007/s00784-022-04475-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2022] [Accepted: 03/22/2022] [Indexed: 01/20/2023]

Abstract

Objectives

The increasing collection of health data coupled with continuous IT advances have enabled precision medicine with personalized workflows. Traditionally, dentistry has lagged behind general medicine in the integration of new technologies: So what is the status quo of precision dentistry? The primary focus of this review is to provide a current overview of personalized workflows in the discipline of reconstructive dentistry (prosthodontics) and to highlight the disruptive potential of novel technologies for dentistry; the possible impact on society is also critically discussed.

Material and methods

Narrative literature review.

Results

Narrative literature review.

Conclusions

In the near future, artificial intelligence (AI) will increase diagnostic accuracy, simplify treatment planning, and thus contribute to the development of personalized reconstructive workflows by analyzing e-health data to promote decision-making on an individual patient basis. Dental education will also benefit from AI systems for personalized curricula considering the individual students’ skills. Augmented reality (AR) will facilitate communication with patients and improve clinical workflows through the use of visually guided protocols. Tele-dentistry will enable opportunities for remote contact among dental professionals and facilitate remote patient consultations and post-treatment follow-up using digital devices. Finally, a personalized digital dental passport encoded using blockchain technology could enable prosthetic rehabilitation using 3D-printed dental biomaterials.

Clinical significance

Overall, AI can be seen as the door-opener and driving force for the evolution from evidence-based prosthodontics to personalized reconstructive dentistry encompassing a synoptic approach with prosthetic and implant workflows. Nevertheless, ethical concerns need to be solved and international guidelines for data management and computing power must be established prior to a widespread routine implementation.


Kim YG, Kang G. Secure Collaborative Platform for Healthcare Research in an Open Environment: A Perspective on Accountability in Access Control (Preprint). J Med Internet Res 2022;24:e37978. [DOI: 10.2196/37978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2022] [Revised: 08/02/2022] [Accepted: 08/30/2022] [Indexed: 11/13/2022] Open

Wan Z, Hazel JW, Clayton EW, Vorobeychik Y, Kantarcioglu M, Malin BA. Sociotechnical safeguards for genomic data privacy. Nat Rev Genet 2022;23:429-445. [PMID: 35246669 PMCID: PMC8896074 DOI: 10.1038/s41576-022-00455-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/24/2022] [Indexed: 12/21/2022]

Akgün M, Pfeifer N, Kohlbacher O. Efficient privacy-preserving whole-genome variant queries. Bioinformatics 2022;38:2202-2210. [PMID: 35150254 PMCID: PMC9004657 DOI: 10.1093/bioinformatics/btac070] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2021] [Revised: 01/13/2022] [Accepted: 02/03/2022] [Indexed: 02/03/2023] Open

Abstract

MOTIVATION

Diagnosis and treatment decisions on genomic data have become widespread as the cost of genome sequencing decreases gradually. In this context, disease-gene association studies are of great importance. However, genomic data are very sensitive when compared to other data types and contain information about individuals and their relatives. Many studies have shown that this information can be obtained from the query-response pairs on genomic databases. In this work, we propose a method that uses secure multi-party computation to query genomic databases in a privacy-protected manner. The proposed solution privately outsources genomic data from arbitrarily many sources to the two non-colluding proxies and allows genomic databases to be safely stored in semi-honest cloud environments. It provides data privacy, query privacy and output privacy by using XOR-based sharing and, unlike previous solutions, it allows queries to run efficiently on hundreds of thousands of genomic data.
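The XOR-based sharing mentioned above can be illustrated with a minimal sketch, assuming two non-colluding proxies that each hold one share. This is not the authors' protocol; the function names, the 32-bit word size, and the example bit pattern are assumptions made for illustration only.

```python
import secrets

BITS = 32  # hypothetical word size for this sketch


def xor_share(value, bits=BITS):
    """Split `value` into two XOR shares, one per proxy.

    The first share is a fresh random mask, so either share on its own
    reveals nothing about `value`.
    """
    mask = secrets.randbits(bits)
    return mask, value ^ mask


def xor_reconstruct(share_a, share_b):
    """Recombine the two shares to recover the original value."""
    return share_a ^ share_b


def xor_shares_locally(x_shares, y_shares):
    """Each proxy XORs the shares it holds, without any communication.

    The result is a valid sharing of x ^ y.
    """
    return x_shares[0] ^ y_shares[0], x_shares[1] ^ y_shares[1]


# Hypothetical encoding: one bit per variant, set if the variant is present.
variant_bits = 0b1011
shares = xor_share(variant_bits)
recovered = xor_reconstruct(*shares)
```

XOR is free to evaluate on shares, as `xor_shares_locally` shows; operations such as AND require interaction between the proxies and are outside the scope of this sketch.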

RESULTS

We measure the performance of our solution with parameters similar to real-world applications. It is possible to query a genomic database with 3 000 000 variants with five genomic query predicates under 400 ms. Querying 1 048 576 genomes, each containing 1 000 000 variants, for the presence of five different query variants can be achieved approximately in 6 min with a small amount of dedicated hardware and connectivity. These execution times are in the right range to enable real-world applications in medical research and healthcare. Unlike previous studies, it is possible to query multiple databases with response times fast enough for practical application. To the best of our knowledge, this is the first solution that provides this performance for querying large-scale genomic data.

AVAILABILITY AND IMPLEMENTATION

https://gitlab.com/DIFUTURE/privacy-preserving-variant-queries.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


Alsaffar MM, Hasan M, McStay GP, Sedky M. Digital DNA lifecycle security and privacy: an overview. Brief Bioinform 2022;23:6518049. [PMID: 35106557 DOI: 10.1093/bib/bbab607] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2021] [Revised: 12/29/2021] [Accepted: 12/30/2021] [Indexed: 11/14/2022] Open

Torkzadehmahani R, Nasirigerdeh R, Blumenthal DB, Kacprowski T, List M, Matschinske J, Spaeth J, Wenke NK, Baumbach J. Privacy-Preserving Artificial Intelligence Techniques in Biomedicine. Methods Inf Med 2022;61:e12-e27. [PMID: 35062032 PMCID: PMC9246509 DOI: 10.1055/s-0041-1740630] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]

Jafarbeiki S, Sakzad A, Kasra Kermanshahi S, Gaire R, Steinfeld R, Lai S, Abraham G, Thapa C. PrivGenDB: Efficient and privacy-preserving query executions over encrypted SNP-Phenotype database. INFORMATICS IN MEDICINE UNLOCKED 2022. [DOI: 10.1016/j.imu.2022.100988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022] Open

Ji T, Ayday E, Yilmaz E, Li P. OUP accepted manuscript. Bioinformatics 2022;38:i143-i152. [PMID: 35758787 PMCID: PMC9236581 DOI: 10.1093/bioinformatics/btac243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022] Open

Abstract

Motivation

Database fingerprinting has been widely used to discourage unauthorized redistribution of data by providing means to identify the source of data leakages. However, there is no fingerprinting scheme aiming at achieving liability guarantees when sharing genomic databases. Thus, we are motivated to fill in this gap by devising a vanilla fingerprinting scheme specifically for genomic databases. Moreover, since malicious genomic database recipients may compromise the embedded fingerprint (distort the steganographic marks, i.e. the embedded fingerprint bit-string) by launching effective correlation attacks, which leverage the intrinsic correlations among genomic data (e.g. Mendel’s law and linkage disequilibrium), we also augment the vanilla scheme by developing mitigation techniques to achieve robust fingerprinting of genomic databases against correlation attacks.

Results

Via experiments using a real-world genomic database, we first show that correlation attacks against fingerprinting schemes for genomic databases are very powerful. In particular, the correlation attacks can distort more than half of the fingerprint bits by causing a small utility loss (e.g. database accuracy and consistency of SNP–phenotype associations measured via P-values). Next, we experimentally show that the correlation attacks can be effectively mitigated by our proposed mitigation techniques. We validate that the attacker can hardly compromise a large portion of the fingerprint bits even if it pays a higher cost in terms of degradation of the database utility. For example, with around 24% loss in accuracy and 20% loss in the consistency of SNP–phenotype associations, the attacker can only distort about 30% fingerprint bits, which is insufficient for it to avoid being accused. We also show that the proposed mitigation techniques also preserve the utility of the shared genomic databases, e.g. the mitigation techniques only lead to around 3% loss in accuracy.

Availability and implementation

https://github.com/xiutianxi/robust-genomic-fp-github.


Wan Z, Vorobeychik Y, Xia W, Liu Y, Wooders M, Guo J, Yin Z, Clayton EW, Kantarcioglu M, Malin BA. Using game theory to thwart multistage privacy intrusions when sharing data. SCIENCE ADVANCES 2021;7:eabe9986. [PMID: 34890225 PMCID: PMC8664254 DOI: 10.1126/sciadv.abe9986] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/14/2020] [Accepted: 10/25/2021] [Indexed: 06/13/2023]

Affiliation(s)

Zhiyu Wan Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN 37212, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA
Yevgeniy Vorobeychik Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO 63130, USA
Weiyi Xia Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA
Yongtai Liu Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN 37212, USA
Myrna Wooders Department of Economics, Vanderbilt University, Nashville, TN 37235, USA
Jia Guo Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN 37212, USA
Zhijun Yin Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN 37212, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA
Ellen Wright Clayton Center for Biomedical Ethics and Society, Vanderbilt University Medical Center, Nashville, TN 37203, USA; School of Law, Vanderbilt University, Nashville, TN 37203, USA; Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
Murat Kantarcioglu Department of Computer Science, University of Texas at Dallas, Richardson, TX 75080, USA; Institute for Quantitative Social Science, Harvard University, Cambridge, MA 02138, USA; Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA 94720, USA
Bradley A. Malin Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN 37212, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA; Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37203, USA


Blockchain-Based Privacy-Preserving System for Genomic Data Management Using Local Differential Privacy. ELECTRONICS 2021. [DOI: 10.3390/electronics10233019] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Dupras C, Bunnik EM. Toward a Framework for Assessing Privacy Risks in Multi-Omic Research and Databases. THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2021;21:46-64. [PMID: 33433298 DOI: 10.1080/15265161.2020.1863516] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]

Kim M, Harmanci AO, Bossuat JP, Carpov S, Cheon JH, Chillotti I, Cho W, Froelicher D, Gama N, Georgieva M, Hong S, Hubaux JP, Kim D, Lauter K, Ma Y, Ohno-Machado L, Sofia H, Son Y, Song Y, Troncoso-Pastoriza J, Jiang X. Ultrafast homomorphic encryption models enable secure outsourcing of genotype imputation. Cell Syst 2021;12:1108-1120.e4. [PMID: 34464590 PMCID: PMC9898842 DOI: 10.1016/j.cels.2021.07.010] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2020] [Revised: 04/21/2021] [Accepted: 07/29/2021] [Indexed: 02/06/2023]

Affiliation(s)

Miran Kim Department of Computer Science and Engineering and Graduate School of Artificial Intelligence, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea
Arif Ozgun Harmanci Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA (corresponding author)
Jean-Philippe Bossuat École polytechnique fédérale de Lausanne, Switzerland
Sergiu Carpov Inpher, EPFL Innovation Park Bâtiment A, 3rd Fl, 1015 Lausanne, Switzerland; CEA, LIST, 91191 Gif-sur-Yvette Cedex, France
Jung Hee Cheon Department of Mathematical Sciences, Seoul National University, Seoul, 08826, Republic of Korea; Crypto Lab Inc., Seoul, 08826, Republic of Korea
Ilaria Chillotti Zama, Paris, France and imec-COSIC, KU Leuven, Leuven, Belgium
Wonhee Cho Department of Mathematical Sciences, Seoul National University, Seoul, 08826, Republic of Korea
David Froelicher École polytechnique fédérale de Lausanne, Switzerland
Nicolas Gama Inpher, EPFL Innovation Park Bâtiment A, 3rd Fl, 1015 Lausanne, Switzerland
Mariya Georgieva Inpher, EPFL Innovation Park Bâtiment A, 3rd Fl, 1015 Lausanne, Switzerland
Seungwan Hong Department of Mathematical Sciences, Seoul National University, Seoul, 08826, Republic of Korea
Jean-Pierre Hubaux École polytechnique fédérale de Lausanne, Switzerland
Duhyeong Kim Department of Mathematical Sciences, Seoul National University, Seoul, 08826, Republic of Korea
Kristin Lauter Microsoft Research, Redmond, WA, 98052, USA
Yiping Ma University of Pennsylvania, Philadelphia, PA, 19104, USA
Lucila Ohno-Machado UCSD Health Department of Biomedical Informatics, University of California, San Diego, CA, 92093, USA
Heidi Sofia National Institutes of Health (NIH) - National Human Genome Research Institute, Bethesda, MD, 20892, USA
Yongha Son Samsung SDS, Seoul, Republic of Korea
Yongsoo Song Department of Computer Science and Engineering, Seoul National University, Seoul, 08826, Republic of Korea
Juan Troncoso-Pastoriza École polytechnique fédérale de Lausanne, Switzerland
Xiaoqian Jiang Center for Secure Artificial intelligence For hEalthcare (SAFE), School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA (corresponding author)


Sherman MA. Paving the path toward genomic privacy with secure imputation. Cell Syst 2021;12:950-952. [PMID: 34672957 DOI: 10.1016/j.cels.2021.09.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]

Hekel R, Budis J, Kucharik M, Radvanszky J, Pös Z, Szemes T. Privacy-preserving storage of sequenced genomic data. BMC Genomics 2021;22:712. [PMID: 34600465 PMCID: PMC8487550 DOI: 10.1186/s12864-021-07996-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2021] [Accepted: 09/10/2021] [Indexed: 11/23/2022] Open

Wirth FN, Meurers T, Johns M, Prasser F. Privacy-preserving data sharing infrastructures for medical research: systematization and comparison. BMC Med Inform Decis Mak 2021;21:242. [PMID: 34384406 PMCID: PMC8359765 DOI: 10.1186/s12911-021-01602-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2021] [Accepted: 07/31/2021] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Data sharing is considered a crucial part of modern medical research. Unfortunately, despite its advantages, it often faces obstacles, especially data privacy challenges. As a result, various approaches and infrastructures have been developed that aim to ensure that patients and research participants remain anonymous when data is shared. However, privacy protection typically comes at a cost, e.g. restrictions regarding the types of analyses that can be performed on shared data. What is lacking is a systematization making the trade-offs taken by different approaches transparent. The aim of the work described in this paper was to develop a systematization for the degree of privacy protection provided and the trade-offs taken by different data sharing methods. Based on this contribution, we categorized popular data sharing approaches and identified research gaps by analyzing combinations of promising properties and features that are not yet supported by existing approaches.

METHODS

The systematization consists of different axes. Three axes relate to privacy protection aspects and were adopted from the popular Five Safes Framework: (1) safe data, addressing privacy at the input level, (2) safe settings, addressing privacy during shared processing, and (3) safe outputs, addressing privacy protection of analysis results. Three additional axes address the usefulness of approaches: (4) support for de-duplication, to enable the reconciliation of data belonging to the same individuals, (5) flexibility, to be able to adapt to different data analysis requirements, and (6) scalability, to maintain performance with increasing complexity of shared data or common analysis processes.

RESULTS

Using the systematization, we identified three different categories of approaches: distributed data analyses, which exchange anonymous aggregated data, secure multi-party computation protocols, which exchange encrypted data, and data enclaves, which store pooled individual-level data in secure environments for access for analysis purposes. We identified important research gaps, including a lack of approaches enabling the de-duplication of horizontally distributed data or providing a high degree of flexibility.

CONCLUSIONS

There are fundamental differences between different data sharing approaches and several gaps in their functionality that may be interesting to investigate in future work. Our systematization can make the properties of privacy-preserving data sharing infrastructures more transparent and support decision makers and regulatory authorities with a better understanding of the trade-offs taken.

Sarkar E, Chielle E, Gürsoy G, Mazonka O, Gerstein M, Maniatakos M. Fast and Scalable Private Genotype Imputation Using Machine Learning and Partially Homomorphic Encryption. IEEE ACCESS : PRACTICAL INNOVATIONS, OPEN SOLUTIONS 2021;9:93097-93110. [PMID: 34476144 PMCID: PMC8409799 DOI: 10.1109/access.2021.3093005] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]

Jungkunz M, Köngeter A, Mehlis K, Winkler EC, Schickhardt C. Secondary Use of Clinical Data in Data-Gathering, Non-Interventional Research or Learning Activities: Definition, Types, and a Framework for Risk Assessment. J Med Internet Res 2021;23:e26631. [PMID: 34100760 PMCID: PMC8241435 DOI: 10.2196/26631] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2020] [Revised: 03/10/2021] [Accepted: 05/06/2021] [Indexed: 12/16/2022] Open

Abstract

Background

The secondary use of clinical data in data-gathering, non-interventional research or learning activities (SeConts) has great potential for scientific progress and health care improvement. At the same time, it poses relevant risks for the privacy and informational self-determination of patients whose data are used.

Objective

Since the current literature lacks a tailored framework for risk assessment in SeConts as well as a clarification of the concept and practical scope of SeConts, we aim to fill this gap.

Methods

In this study, we analyze each element of the concept of SeConts to provide a synthetic definition, investigate the practical relevance and scope of SeConts through a literature review, and operationalize the widespread definition of risk (as a harmful event of a certain magnitude that occurs with a certain probability) to conduct a tailored analysis of privacy risk factors typically implied in SeConts.

Results

We offer a conceptual clarification and definition of SeConts and provide a list of types of research and learning activities that can be subsumed under the definition of SeConts. We also offer a proposal for the classification of SeConts types into the categories non-interventional (observational) clinical research, quality control and improvement, or public health research. In addition, we provide a list of risk factors that determine the probability or magnitude of harm implied in SeConts. The risk factors provide a framework for assessing the privacy-related risks for patients implied in SeConts. We illustrate the use of risk assessment by applying it to a concrete example.

Conclusions

In the future, research ethics committees and data use and access committees will be able to rely on and apply the framework offered here when reviewing projects of secondary use of clinical data for learning and research purposes.

Eisenhauer ER, Tait AR, Low LK, Arslanian-Engoren CM. Women's Choices Regarding Use of Their Newborns' Residual Dried Blood Samples in Research. J Obstet Gynecol Neonatal Nurs 2021;50:424-438. [PMID: 34033759 DOI: 10.1016/j.jogn.2021.04.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/01/2021] [Indexed: 09/30/2022] Open

Abstract

OBJECTIVE

To determine the proportion of informed choices women made about donating their newborns' blood samples for research.

DESIGN

A quantitative analysis of informed choice using data on women's knowledge and attitudes from a descriptive, cross-sectional survey.

SETTING

The state of Michigan.

PARTICIPANTS

Women (N = 69, ≥18 years old) who had (a) newborns 0 to 3 months of age, (b) yes or no decisions regarding use of the blood sample for research on file, (c) no evidence of an infant death in the state database, (d) completed the knowledge scale, (e) completed the attitude scale, and (f) recalled the decision (i.e., yes or no) about donating blood samples.

METHODS

We used the multidimensional measure of informed choice to calculate the proportion of informed choices in data on women's knowledge, attitudes, and decisions about biospecimen research.

RESULTS

Fifty-five percent (38/69) of participants made informed choices about donating newborn blood samples for research, and 45% made uninformed choices (31/69). Inadequate knowledge about biospecimen research contributed to 87% of uninformed choices (27/31). Participants who declined to donate their newborns' blood samples struggled with making decisions consistent with their values.

CONCLUSION

Nearly half of the participants made uninformed choices about donating the blood samples of their newborns for research. Women need more information about genetics and the storage and research use of newborns' blood samples to make informed choices. Nurses need to be made aware of the ethical, legal, and social implications of such research because they are primary sources of advocacy, information, and support for childbearing women and may be charged with overseeing or obtaining informed consent. Additional research with larger, more diverse samples is needed.

Lu D, Zhang Y, Zhang L, Wang H, Weng W, Li L, Cai H. Methods of privacy-preserving genomic sequencing data alignments. Brief Bioinform 2021;22:6279828. [PMID: 34021302 DOI: 10.1093/bib/bbab151] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2020] [Revised: 03/10/2021] [Accepted: 03/30/2021] [Indexed: 11/14/2022] Open

Ayoz K, Ayday E, Cicek AE. Genome Reconstruction Attacks Against Genomic Data-Sharing Beacons. PROCEEDINGS ON PRIVACY ENHANCING TECHNOLOGIES. PRIVACY ENHANCING TECHNOLOGIES SYMPOSIUM 2021;2021:28-48. [PMID: 34746296 PMCID: PMC8570374 DOI: 10.2478/popets-2021-0036] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

Fake It Till You Make It: Guidelines for Effective Synthetic Data Generation. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11052158] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]

Dupic T, Bensouda Koraichi M, Minervina AA, Pogorelyy MV, Mora T, Walczak AM. Immune fingerprinting through repertoire similarity. PLoS Genet 2021;17:e1009301. [PMID: 33395405 PMCID: PMC7808657 DOI: 10.1371/journal.pgen.1009301] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2020] [Revised: 01/14/2021] [Accepted: 12/07/2020] [Indexed: 11/18/2022] Open

Oliver KH, Higgs S, Clayton J. The End of Genetic Privacy in the Blade Runner Canon. JOURNAL OF LITERATURE AND SCIENCE 2021;14:108-124. [PMID: 36506249 PMCID: PMC9731365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Grants] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 06/17/2023]

Rahman Mahdi MS, Al Aziz MM, Mohammed N, Jiang X. Privacy-preserving string search on encrypted genomic data using a generalized suffix tree. INFORMATICS IN MEDICINE UNLOCKED 2021. [DOI: 10.1016/j.imu.2021.100525] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022] Open

Schumacher GJ, Sawaya S, Nelson D, Hansen AJ. Genetic Information Insecurity as State of the Art. Front Bioeng Biotechnol 2020;8:591980. [PMID: 33381496 PMCID: PMC7768984 DOI: 10.3389/fbioe.2020.591980] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2020] [Accepted: 11/16/2020] [Indexed: 11/16/2022] Open

Karimi S, Jiang X, Dolin RH, Kim M, Boxwala A. A secure system for genomics clinical decision support. J Biomed Inform 2020;112:103602. [PMID: 33080397 PMCID: PMC8577277 DOI: 10.1016/j.jbi.2020.103602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2020] [Revised: 09/07/2020] [Accepted: 10/12/2020] [Indexed: 11/26/2022]

Yilmaz E, Ji T, Ayday E, Li P. Preserving Genomic Privacy via Selective Sharing. PROCEEDINGS OF THE ACM WORKSHOP ON PRIVACY IN THE ELECTRONIC SOCIETY. ACM WORKSHOP ON PRIVACY IN THE ELECTRONIC SOCIETY 2020;2020:163-179. [PMID: 34485998 PMCID: PMC8411901 DOI: 10.1145/3411497.3420214] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]

Chang C, Deng Y, Jiang X, Long Q. Multiple imputation for analysis of incomplete data in distributed health data networks. Nat Commun 2020;11:5467. [PMID: 33122624 PMCID: PMC7596726 DOI: 10.1038/s41467-020-19270-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2020] [Accepted: 10/02/2020] [Indexed: 11/25/2022] Open

Kweon S, Lee JH, Lee Y, Park YR. Personal Health Information Inference Using Machine Learning on RNA Expression Data from Patients With Cancer: Algorithm Validation Study. J Med Internet Res 2020;22:e18387. [PMID: 32773372 PMCID: PMC7445622 DOI: 10.2196/18387] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2020] [Revised: 03/25/2020] [Accepted: 07/06/2020] [Indexed: 12/21/2022] Open

Abstract

BACKGROUND

As the need for sharing genomic data grows, privacy issues and concerns, such as the ethics surrounding data sharing and disclosure of personal information, are raised.

OBJECTIVE

The main purpose of this study was to verify whether genomic data is sufficient to predict a patient's personal information.

METHODS

RNA expression data and matched patient personal information were collected from 9538 patients in The Cancer Genome Atlas program. Five personal information variables (age, gender, race, cancer type, and cancer stage) were recorded for each patient. Four different machine learning algorithms (support vector machine, decision tree, random forest, and artificial neural network) were used to determine whether a patient's personal information could be accurately predicted from RNA expression data. Performance measurement of the prediction models was based on the accuracy and area under the receiver operating characteristic curve. We selected five cancer types (breast carcinoma, kidney renal clear cell carcinoma, head and neck squamous cell carcinoma, low-grade glioma, and lung adenocarcinoma) with large sample sizes to verify whether predictive accuracy would differ between them. We also validated the efficacy of our four machine learning models in analyzing normal samples from 593 cancer patients.

RESULTS

In most samples, personal information with high genetic relevance, such as gender and cancer type, could be predicted from RNA expression data alone. The prediction accuracies for gender and cancer type, which were the best models, were 0.93-0.99 and 0.78-0.94, respectively. Other aspects of personal information, such as age, race, and cancer stage, were difficult to predict from RNA expression data, with accuracies ranging from 0.0026-0.29, 0.76-0.96, and 0.45-0.79, respectively. Among the tested machine learning methods, the highest predictive accuracy was obtained using the support vector machine algorithm (mean accuracy 0.77), while the lowest accuracy was obtained using the random forest method (mean accuracy 0.65). Gender and race were predicted more accurately than other variables in the samples. On average, the accuracy of cancer stage prediction ranged between 0.71-0.67, while the age prediction accuracy ranged between 0.18-0.23 for the five cancer types.

CONCLUSIONS

We attempted to predict patient information using RNA expression data. We found that some identifiers could be predicted, but most others could not. This study showed that personal information available from RNA expression data is limited and this information cannot be used to identify specific patients.

Ma S, Cao Y, Xiong L. Efficient logging and querying for blockchain-based cross-site genomic dataset access audit. BMC Med Genomics 2020;13:91. [PMID: 32693835 PMCID: PMC7372873 DOI: 10.1186/s12920-020-0725-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open

Abstract

BACKGROUND

Genomic data have been collected by different institutions and companies and need to be shared for broader use. In a cross-site genomic data sharing system, a secure and transparent access control audit module plays an essential role in ensuring accountability. A centralized access log audit system is vulnerable to a single point of attack and also lacks transparency, since the log could be tampered with by a malicious system administrator or internal adversaries. Several studies have proposed blockchain-based access audit to solve this problem, but without considering the efficiency of the audit queries. The first track of the 2018 iDASH competition provided us with an opportunity to design an efficient logging and querying system for cross-site genomic dataset access audit. We designed a blockchain-based log system that can provide a light-weight and widely compatible module for existing blockchain platforms. The submitted solution won third place in the competition. In this paper, we report the technical details of our system.

METHODS

We present two methods: a baseline method and an enhanced method. We started with the baseline method and then adjusted our implementation based on the competition evaluation criteria and the characteristics of the log system. To overcome the obstacles to indexing on the immutable Blockchain system, we designed a hierarchical timestamp structure that supports efficient range queries on the timestamp field.

RESULTS

We implemented our methods in Python3, tested the scalability, and compared the performance using the test data supplied by the competition organizer. We successfully boosted the log retrieval speed for complex AND queries that contain multiple predicates. For the range query, we boosted the speed by at least one order of magnitude. The storage usage was reduced by 25%.

CONCLUSION

We demonstrate that Blockchain can be used to build a time- and space-efficient logging and querying system for genomic dataset audit trails. It therefore provides a promising solution for sharing genomic data across multiple sites under accountability requirements.

Bonomi L, Huang Y, Ohno-Machado L. Privacy challenges and research opportunities for genomic data sharing. Nat Genet 2020;52:646-654. [PMID: 32601475 PMCID: PMC7761157 DOI: 10.1038/s41588-020-0651-0] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2019] [Accepted: 05/22/2020] [Indexed: 12/17/2022]

Almadhoun N, Ayday E, Ulusoy Ö. Inference attacks against differentially private query results from genomic datasets including dependent tuples. Bioinformatics 2020;36:i136-i145. [PMID: 32657411 PMCID: PMC7355303 DOI: 10.1093/bioinformatics/btaa475] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open

Abstract

MOTIVATION

The rapid decrease in sequencing technology costs has led to a revolution in medical research and clinical care. Today, researchers have access to large genomic datasets to study associations between variants and complex traits. However, the availability of such genomic datasets also raises new privacy concerns about the personal information of participants in genomic studies. Differential privacy (DP) is a rigorous privacy concept that has received widespread interest for sharing summary statistics from genomic datasets while protecting the privacy of participants against inference attacks. However, DP has a known drawback: it does not consider the correlation between dataset tuples. Therefore, the privacy guarantees of DP-based mechanisms may degrade if the dataset includes dependent tuples, which is a common situation for genomic datasets due to the inherent correlations between the genomes of family members.

RESULTS

In this article, using two real-life genomic datasets, we show that exploiting the correlation between the dataset participants results in significant information leakage from differentially private results of complex queries. We formulate this as an attribute inference attack and show the privacy loss in minor allele frequency (MAF) and chi-square queries. Our results show that using the results of differentially private MAF queries and utilizing the dependency between tuples, an adversary can reveal up to 50% more sensitive information about the genome of a target (compared to original privacy guarantees of standard DP-based mechanisms), while differentially private chi-square queries can reveal up to 40% more sensitive information. Furthermore, we show that the adversary can use the inferred genomic data obtained from the attribute inference attack to infer the membership of a target in another genomic dataset (e.g. associated with a sensitive trait). Using a log-likelihood-ratio test, our results also show that the inference power of the adversary can be significantly high in such an attack even using inferred (and hence partially incorrect) genomes.

AVAILABILITY AND IMPLEMENTATION

https://github.com/nourmadhoun/Inference-Attacks-Differential-Privacy.
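
For orientation, a differentially private MAF query of the kind this abstract refers to can be sketched with the standard Laplace mechanism. This is an illustrative sketch only and is not taken from the authors' repository: the function name `dp_maf`, the 0/1/2 genotype encoding, and the replace-one-individual neighbouring definition are assumptions made here. The noise scale is calibrated on the assumption that individuals are independent, which is the assumption the abstract says breaks down when relatives are in the dataset.

```python
import random


def dp_maf(genotypes, epsilon, rng=None):
    """Return a Laplace-noised minor allele frequency for one SNP.

    genotypes: minor-allele counts per individual (0, 1 or 2); hypothetical encoding.
    epsilon:   privacy budget (> 0).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not genotypes:
        raise ValueError("genotypes must be non-empty")
    rng = rng or random.Random()

    n = len(genotypes)
    true_maf = sum(genotypes) / (2 * n)

    # Replacing one individual changes the allele count by at most 2,
    # so the MAF changes by at most 2 / (2n) = 1 / n.
    sensitivity = 1.0 / n
    scale = sensitivity / epsilon

    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

    # Clamping to [0, 1] is post-processing and does not weaken the guarantee.
    return min(1.0, max(0.0, true_maf + noise))


if __name__ == "__main__":
    sample = [0, 1, 2, 1, 0, 0, 1, 2]  # toy data, not from the paper
    print(dp_maf(sample, epsilon=1.0, rng=random.Random(0)))
```

Smaller values of `epsilon` add more noise. The sensitivity bound of 1/n treats each record as affecting only itself, so it says nothing about what the noisy answer reveals through a target's family members.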

Jones K, Daniels H, Heys S, Lacey A, Ford DV. Toward a Risk-Utility Data Governance Framework for Research Using Genomic and Phenotypic Data in Safe Havens: Multifaceted Review. J Med Internet Res 2020;22:e16346. [PMID: 32412420 PMCID: PMC7260661 DOI: 10.2196/16346] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2019] [Revised: 01/13/2020] [Accepted: 01/30/2020] [Indexed: 02/06/2023] Open

Abstract

BACKGROUND

Research using genomic data opens up new insights into health and disease. Being able to use the data in association with health and administrative record data held in safe havens can multiply the benefits. However, there is much discussion about the use of genomic data with perceptions of particular challenges in doing so safely and effectively.

OBJECTIVE

This study aimed to work toward a risk-utility data governance framework for research using genomic and phenotypic data in an anonymized form for research in safe havens.

METHODS

We carried out a multifaceted review drawing upon data governance arrangements in published research, case studies of organizations working with genomic and phenotypic data, public views and expectations, and example studies using genomic and phenotypic data in combination. The findings were contextualized against a backdrop of legislative and regulatory requirements and used to create recommendations.

RESULTS

We proposed recommendations toward a risk-utility model with a flexible suite of controls to safeguard privacy and retain data utility for research. These were presented as overarching principles aligned to the core elements in the data sharing framework produced by the Global Alliance for Genomics and Health and as practical control measures distilled from published literature and case studies of operational safe havens to be applied as required at a project-specific level.

CONCLUSIONS

The recommendations presented can be used to contribute toward a proportionate data governance framework to promote the safe, socially acceptable use of genomic and phenotypic data in safe havens. They do not purport to eradicate risk but propose case-by-case assessment with transparency and accountability. If the risks are adequately understood and mitigated, there should be no reason that linked genomic and phenotypic data should not be used in an anonymized form for research in safe havens.
