1. Alam MS, Velayudhan SM, Dey DK, Adilieme C, Malik PK, Bhatta R, König S, Schlecht E. Urbanisation threats to dairy cattle health: Insights from Greater Bengaluru, India. Trop Anim Health Prod 2023; 55:350. [PMID: 37796345 PMCID: PMC10556117 DOI: 10.1007/s11250-023-03737-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Accepted: 09/12/2023] [Indexed: 10/06/2023]
Abstract
Complex urbanisation dynamics, on the one hand, create a high demand for animal products, and on the other hand put enormous pressure on arable land with negative consequences for animal feed production. To explore the impact of accelerated urbanisation on dairy cattle health in urban farming systems, 151 farmers from different parts of the Greater Bengaluru metropolitan area in India were individually interviewed on aspects addressing cattle management and cattle health. In addition, 97 samples of forages from the shores of 10 different lakes, and vegetable leftovers used in cattle feeding were collected for nutritional analysis. Along with the use of cultivated forages, crop residues, and concentrate feed, 47% and 77% of the farmers occasionally or frequently used lake fodder and food leftovers, respectively. Nutritionally, lake fodder corresponded to high-quality pasture vegetation, but 43% of the samples contained toxic heavy metals such as arsenic, cadmium, chromium, and lead above official critical threshold levels. Therefore, lake fodder may affect cows' health if consumed regularly; however, heavy metal concentrations varied between lakes (P < 0.05), but not between fodder types (P > 0.05). Although 60% of the interviewed farmers believed that their cows were in good health, logit model applications revealed that insufficient drinking water supply and the use of lake fodder negatively impacted cattle health (P < 0.05). While it remains unknown if regular feeding of lake fodder results in heavy metal accumulation in animal products, farmers and farm advisors must address this and other urbanization-related challenges to protect cattle health.
Affiliation(s)
- Md Shahin Alam
- Animal Husbandry in the Tropics and Subtropics, University of Kassel and Georg-August-Universität Göttingen, Steinstraße 19, 37213, Witzenhausen, Germany
- Debpriyo Kumar Dey
- ICAR-National Institute of Animal Nutrition and Physiology (NIANP), Hosur Road, Adugodi, Bengaluru, Karnataka, 560030, India
- Chiamaka Adilieme
- Animal Husbandry in the Tropics and Subtropics, University of Kassel and Georg-August-Universität Göttingen, Steinstraße 19, 37213, Witzenhausen, Germany
- Pradeep Kumar Malik
- ICAR-National Institute of Animal Nutrition and Physiology (NIANP), Hosur Road, Adugodi, Bengaluru, Karnataka, 560030, India
- Raghavendra Bhatta
- ICAR-National Institute of Animal Nutrition and Physiology (NIANP), Hosur Road, Adugodi, Bengaluru, Karnataka, 560030, India
- Sven König
- Institute of Animal Breeding and Genetics, University of Gießen, Ludwigstraße 21B, 35390, Gießen, Germany
- Eva Schlecht
- Animal Husbandry in the Tropics and Subtropics, University of Kassel and Georg-August-Universität Göttingen, Steinstraße 19, 37213, Witzenhausen, Germany

2. Park JA, Sung MD, Kim HH, Park YR. Weight-Based Framework for Predictive Modeling of Multiple Databases With Noniterative Communication Without Data Sharing: Privacy-Protecting Analytic Method for Multi-Institutional Studies. JMIR Med Inform 2021; 9:e21043. [PMID: 33818396 PMCID: PMC8056295 DOI: 10.2196/21043] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/07/2020] [Revised: 11/16/2020] [Accepted: 03/03/2021] [Indexed: 01/22/2023] Open
Abstract
Background: Securing the representativeness of study populations is crucial in biomedical research to ensure high generalizability. In this regard, using multi-institutional data has advantages in medicine. However, combining data physically is difficult as the confidential nature of biomedical data causes privacy issues. Therefore, a methodological approach is necessary when using multi-institution medical data for research to develop a model without sharing data between institutions. Objective: This study aims to develop a weight-based integrated predictive model of multi-institutional data, which does not require iterative communication between institutions, to improve average predictive performance by increasing the generalizability of the model under privacy-preserving conditions without sharing patient-level data. Methods: The weight-based integrated model generates a weight for each institutional model and builds an integrated model for multi-institutional data based on these weights. We performed 3 simulations to show the weight characteristics and to determine the number of repetitions of the weight required to obtain stable values. We also conducted an experiment using real multi-institutional data to verify the developed weight-based integrated model. We selected 10 hospitals (2845 intensive care unit [ICU] stays in total) from the electronic intensive care unit Collaborative Research Database to predict ICU mortality with 11 features. To evaluate the validity of our model, compared with a centralized model, which was developed by combining all the data of 10 hospitals, we used proportional overlap (ie, 0.5 or less indicates a significant difference at a level of .05; and 2 indicates that the 2 CIs overlap completely). Standard and Firth logistic regression models were applied for the 2 simulations and the experiment.
Results: The results of these simulations indicate that the weight of each institution is determined by 2 factors (ie, the data size of each institution and how well each institutional model fits into the overall institutional data) and that repeatedly generating 200 weights is necessary per institution. In the experiment, the estimated area under the receiver operating characteristic curve (AUC) and 95% CIs were 81.36% (79.37%-83.36%) and 81.95% (80.03%-83.87%) in the centralized model and weight-based integrated model, respectively. The proportional overlap of the CIs for AUC between the weight-based integrated model and the centralized model was approximately 1.70, and the proportional overlap for the 11 estimated odds ratios was over 1, except for 1 case. Conclusions: In the experiment where real multi-institutional data were used, our model showed similar results to the centralized model without iterative communication between institutions. In addition, our weight-based integrated model provided a weighted average model by integrating 10 models overfitted or underfitted, compared with the centralized model. The proposed weight-based integrated model is expected to provide an efficient distributed research approach as it increases the generalizability of the model and does not require iterative communication.
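The proportional overlap reported above can be checked with a short calculation. The abstract does not state the formula, so the sketch below assumes the common definition (width of the overlapping part of the two CIs divided by the average margin of error); the function name `proportional_overlap` is illustrative only. Under that assumption, the two AUC intervals quoted in the abstract give a value close to the reported 1.70, and two identical intervals give 2.

```python
def proportional_overlap(ci_a, ci_b):
    """Overlap of two confidence intervals divided by their average margin of error.

    Assumed definition (not given in the abstract): a value of 2 means the
    intervals coincide, and values at or below about 0.5 suggest a
    significant difference at the .05 level.
    """
    lo_a, hi_a = ci_a
    lo_b, hi_b = ci_b
    overlap = min(hi_a, hi_b) - max(lo_a, lo_b)  # negative if the CIs are disjoint
    mean_margin = ((hi_a - lo_a) / 2 + (hi_b - lo_b) / 2) / 2
    return overlap / mean_margin


# 95% CIs for AUC (in %) as quoted in the abstract
centralized = (79.37, 83.36)
integrated = (80.03, 83.87)

po = proportional_overlap(centralized, integrated)
print(f"proportional overlap: {po:.2f}")
```

The same function applies to any pair of intervals, for example the CIs of each of the 11 odds ratios, provided both models report them on the same scale.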
Affiliation(s)
- Ji Ae Park
- Department of Biomedical System Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
- Min Dong Sung
- Department of Biomedical System Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
- Ho Heon Kim
- Department of Biomedical System Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
- Yu Rang Park
- Department of Biomedical System Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea

3. Tran TC, Pillonel J, Cazein F, Sommen C, Bonnet C, Blondel B, Lot F. Antenatal HIV screening: results from the National Perinatal Survey, France, 2016. Euro Surveill 2019; 24(40):1800573. [PMID: 31595877 PMCID: PMC6784449 DOI: 10.2807/1560-7917.es.2019.24.40.1800573] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022]
Abstract
Background: Universal antenatal HIV screening programmes are an effective method of preventing mother-to-child transmission. Aims: To assess the coverage and yield of the French programme on a nationally representative sample of pregnant women, and predictive factors for being unscreened or missing information on the performance/result of an HIV test. Methods: Data came from the medical records of women included in the cross-sectional 2016 French National Perinatal Survey. We calculated odds ratios (OR) to identify factors for being unscreened for HIV and for missing information by multivariable analyses. Results: Of 13,210 women, 12,782 (96.8%) were screened for HIV and 134 (1.0%) were not; information was missing for 294 (2.2%). HIV infection was newly diagnosed in 19/12,769 (0.15%) women screened. The OR for being unscreened was significantly higher in women in legally registered partnerships (OR: 1.3; 95% CI: 1.1–1.6), with 1–2 years of post-secondary schooling (OR: 1.6; 95% CI: 1.2–2.1), part-time employment (OR: 1.4; 95% CI: 1.1–1.8), inadequate antenatal care (OR: 1.9; 95% CI: 1.5–2.4) and receiving care from >1 provider (OR: 1.8; 95% CI: 1.1–2.8). The OR of missing information was higher in multiparous women (OR: 1.4; 95% CI: 1.2–1.5) and women cared for by general practitioners (OR: 1.4; 95% CI: 1.1–1.9). Conclusions: The French antenatal HIV screening programme is effective in detecting HIV among pregnant women. However, a few women are still not screened and awareness of the factors that predict this could contribute to improved screening levels.
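The coverage and yield percentages in the Results can be reproduced directly from the counts given in the abstract. The sketch below does that and, as an added illustration that is not part of the abstract, attaches a Wilson 95% interval to the yield; the helper name `wilson_interval` and the choice of the Wilson method are assumptions made for this example.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (illustrative choice)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Counts quoted in the abstract
total = 13210
screened, unscreened, missing = 12782, 134, 294

coverage_pct = round(100 * screened / total, 1)      # share of women screened
unscreened_pct = round(100 * unscreened / total, 1)  # share not screened
missing_pct = round(100 * missing / total, 1)        # share with missing information
yield_pct = round(100 * 19 / 12769, 2)               # new diagnoses among women screened

low, high = wilson_interval(19, 12769)
print(coverage_pct, unscreened_pct, missing_pct, yield_pct)
print(f"yield 95% CI (Wilson, assumed method): {100 * low:.2f}% to {100 * high:.2f}%")
```

The three categories sum to the 13,210 women in the sample, which is a quick consistency check on the reported figures.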
Affiliation(s)
- Thi-Chiên Tran
- Santé publique France, French national public health agency, Saint-Maurice, France
- Josiane Pillonel
- Santé publique France, French national public health agency, Saint-Maurice, France
- Françoise Cazein
- Santé publique France, French national public health agency, Saint-Maurice, France
- Cécile Sommen
- Santé publique France, French national public health agency, Saint-Maurice, France
- Florence Lot
- Santé publique France, French national public health agency, Saint-Maurice, France

4. Liu Y, Huang J, Urbanowicz RJ, Chen K, Manduchi E, Greene CS, Moore JH, Scheet P, Chen Y. Embracing study heterogeneity for finding genetic interactions in large-scale research consortia. Genet Epidemiol 2019; 44:52-66. [PMID: 31583758 DOI: 10.1002/gepi.22262] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/13/2018] [Revised: 08/02/2019] [Accepted: 08/09/2019] [Indexed: 11/12/2022]
Abstract
Genetic interactions have been recognized as a potentially important contributor to the heritability of complex diseases. Nevertheless, due to small effect sizes and stringent multiple-testing correction, identifying genetic interactions in complex diseases is particularly challenging. To address the above challenges, many genomic research initiatives collaborate to form large-scale consortia and develop open access to enable sharing of genome-wide association study (GWAS) data. Despite the perceived benefits of data sharing from large consortia, a number of practical issues have arisen, such as privacy concerns on individual genomic information and heterogeneous data sources from distributed GWAS databases. In the context of large consortia, we demonstrate that the heterogeneously appearing marginal effects over distributed GWAS databases can offer new insights into genetic interactions for which conventional methods have had limited success. In this paper, we develop a novel two-stage testing procedure, named phylogenY-based effect-size tests for interactions using first 2 moments (YETI2), to detect genetic interactions through both pooled marginal effects, in terms of averaging site-specific marginal effects, and heterogeneity in marginal effects across sites, using a meta-analytic framework. YETI2 can not only be applied to large consortia without shared personal information but also can be used to leverage underlying heterogeneity in marginal effects to prioritize potential genetic interactions. We investigate the performance of YETI2 through simulation studies and apply YETI2 to bladder cancer data from dbGaP.
Affiliation(s)
- Yulun Liu
- Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, Texas
- Jing Huang
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
- Ryan J Urbanowicz
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
- Kun Chen
- Department of Statistics, University of Connecticut, Storrs, Connecticut
- Elisabetta Manduchi
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania; Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
- Casey S Greene
- Department of Pharmacology, University of Pennsylvania, Philadelphia, Pennsylvania
- Jason H Moore
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania; Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
- Paul Scheet
- Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Yong Chen
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania; Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania

5. Jiang Y, Hamer J, Wang C, Jiang X, Kim M, Song Y, Xia Y, Mohammed N, Sadat MN, Wang S. SecureLR: Secure Logistic Regression Model via a Hybrid Cryptographic Protocol. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:113-123. [PMID: 29994005 DOI: 10.1109/tcbb.2018.2833463] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 06/08/2023]
Abstract
Machine learning applications are intensively utilized in various science fields, and increasingly in the biomedical and healthcare sector. Applying predictive modeling to biomedical data introduces privacy and security concerns requiring additional protection to prevent accidental disclosure or leakage of sensitive patient information. Significant advancements in secure computing methods have emerged in recent years; however, many of them require substantial computational and/or communication overheads, which might hinder their adoption in biomedical applications. In this work, we propose SecureLR, a novel framework allowing researchers to leverage both the computational and storage capacity of Public Cloud Servers to conduct learning and predictions on biomedical data without compromising data security or efficiency. Our model builds upon homomorphic encryption methodologies with hardware-based security reinforcement through Software Guard Extensions (SGX), and our implementation demonstrates a practical hybrid cryptographic solution to address important concerns in conducting machine learning with public clouds.

6. Arellano AM, Dai W, Wang S, Jiang X, Ohno-Machado L. Privacy Policy and Technology in Biomedical Data Science. Annu Rev Biomed Data Sci 2018; 1:115-129. [PMID: 31058261 PMCID: PMC6497413 DOI: 10.1146/annurev-biodatasci-080917-013416] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 02/04/2023]
Abstract
Privacy is an important consideration when sharing clinical data, which often contain sensitive information. Adequate protection to safeguard patient privacy and to increase public trust in biomedical research is paramount. This review covers topics in policy and technology in the context of clinical data sharing. We review policy articles related to (a) the Common Rule, HIPAA privacy and security rules, and governance; (b) patients' viewpoints and consent practices; and (c) research ethics. We identify key features of the revised Common Rule and the most notable changes since its previous version. We address data governance for research in addition to the increasing emphasis on ethical and social implications. Research ethics topics include data sharing best practices, use of data from populations of low socioeconomic status (SES), recent updates to institutional review board (IRB) processes to protect human subjects' data, and important concerns about the limitations of current policies to address data deidentification. In terms of technology, we focus on articles that have applicability in real-world health care applications: deidentification methods that comply with HIPAA, data anonymization approaches to satisfy well-acknowledged issues in deidentified data, encryption methods to safeguard data analyses, and privacy-preserving predictive modeling. The first two technology topics are mostly relevant to methodologies that attempt to sanitize structured or unstructured data. The third topic includes analysis on encrypted data. The last topic includes various mechanisms to build statistical models without sharing raw data.
Affiliation(s)
- April Moreno Arellano
- Department of Biomedical Informatics, School of Medicine, University of California, San Diego, La Jolla, California 92093, USA
- Wenrui Dai
- Department of Biomedical Informatics, School of Medicine, University of California, San Diego, La Jolla, California 92093, USA
- Shuang Wang
- Department of Biomedical Informatics, School of Medicine, University of California, San Diego, La Jolla, California 92093, USA
- Xiaoqian Jiang
- Department of Biomedical Informatics, School of Medicine, University of California, San Diego, La Jolla, California 92093, USA
- Lucila Ohno-Machado
- Department of Biomedical Informatics, School of Medicine, University of California, San Diego, La Jolla, California 92093, USA

7. Hirose M, Kobayashi Y, Nakamoto S, Ueki R, Kariya N, Tatara T. Development of a Hemodynamic Model Using Routine Monitoring Parameters for Nociceptive Responses Evaluation During Surgery Under General Anesthesia. Med Sci Monit 2018; 24:3324-3331. [PMID: 29779036 PMCID: PMC5990992 DOI: 10.12659/msm.907484] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND: Routine hemodynamic monitoring parameters under general anesthesia, such as heart rate (HR), systolic blood pressure (SBP), and perfusion index (PI), do not solely reflect intraoperative nociceptive levels. We developed a hemodynamic model combining these 3 parameters for nociceptive responses during general anesthesia, and evaluated nociceptive responses to surgical skin incision. MATERIAL AND METHODS: We first retrospectively performed discriminant analysis using 3 values (HR, SBP, and PI) to assess response to skin incision during tympanoplasty, laparoscopic cholecystectomy, and open gastrectomy to determine if combined use of these parameters differentiates nociceptive levels among these 3 surgeries. Secondly, ordinal logistic regression analysis was applied using the 3 parameters to develop an equation representing nociceptive response during general anesthesia, and we then evaluated its utility to discern nociceptive responses to skin incision. RESULTS: We developed the following hemodynamic model: calculated nociceptive response = -1 + 2/(1 + exp(-0.01 HR - 0.02 SBP + 0.17 PI)), and prospectively determined that calculated nociceptive responses to small skin incision for laparoscopic surgery were significantly lower than responses to large skin incision for laparotomy. CONCLUSIONS: Our hemodynamic model using HR, SBP, and PI likely reflects nociceptive levels at skin incision during general anesthesia, and quantitatively discerned the difference in nociceptive responses to skin incision between laparoscopy and laparotomy. This model could be applicable to assess either real-time nociceptive responses or averaged nociceptive responses throughout surgery without using special equipment.
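The model equation in the Results is simple enough to evaluate directly. The sketch below implements it exactly as printed in the abstract; the function name `nociceptive_response` and the example readings are hypothetical, and the units (beats/min for HR, mmHg for SBP, percent for PI) are assumed from common clinical usage rather than stated in the abstract.

```python
import math


def nociceptive_response(hr, sbp, pi):
    """Calculated nociceptive response = -1 + 2 / (1 + exp(-0.01*HR - 0.02*SBP + 0.17*PI)).

    Coefficients are taken from the abstract. The result always lies strictly
    between -1 and 1, rises with HR and SBP, and falls with PI.
    """
    return -1 + 2 / (1 + math.exp(-0.01 * hr - 0.02 * sbp + 0.17 * pi))


# Hypothetical readings, for illustration only
baseline = nociceptive_response(hr=70, sbp=120, pi=3.0)
after_incision = nociceptive_response(hr=85, sbp=140, pi=1.5)
print(f"baseline: {baseline:.3f}, after incision: {after_incision:.3f}")
```

Algebraically the expression equals tanh(x/2) with x = 0.01 HR + 0.02 SBP - 0.17 PI, which is why the output is bounded between -1 and 1.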
Affiliation(s)
- Munetaka Hirose
- Department of Anesthesiology and Pain Medicine, Hyogo College of Medicine, Nishinomiya, Hyogo, Japan
- Yoshiko Kobayashi
- Department of Anesthesiology and Pain Medicine, Hyogo College of Medicine, Nishinomiya, Hyogo, Japan
- Shiro Nakamoto
- Department of Anesthesiology and Pain Medicine, Hyogo College of Medicine, Nishinomiya, Hyogo, Japan
- Ryusuke Ueki
- Department of Anesthesiology and Pain Medicine, Hyogo College of Medicine, Nishinomiya, Hyogo, Japan
- Nobutaka Kariya
- Department of Anesthesiology and Pain Medicine, Hyogo College of Medicine, Nishinomiya, Hyogo, Japan
- Tsuneo Tatara
- Department of Anesthesiology and Pain Medicine, Hyogo College of Medicine, Nishinomiya, Hyogo, Japan

8. Weng C, Kahn MG. Clinical Research Informatics for Big Data and Precision Medicine. Yearb Med Inform 2016:211-218. [PMID: 27830253 DOI: 10.15265/iy-2016-019] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 12/17/2022] Open
Abstract
OBJECTIVES: To reflect on the notable events and significant developments in Clinical Research Informatics (CRI) in 2015 and discuss near-term trends impacting CRI. METHODS: We selected key publications that highlight not only important recent advances in CRI but also notable events likely to have significant impact on CRI activities over the next few years or longer, and consulted the discussions in relevant scientific communities and an online living textbook for modern clinical trials. We also related the new concepts with old problems to improve the continuity of CRI research. RESULTS: The highlights in CRI in 2015 include the growing adoption of electronic health records (EHR) and the rapid development of regional, national, and global clinical data research networks for using EHR data to integrate scalable clinical research with clinical care and generate robust medical evidence. Data quality, integration, and fusion, data access by researchers, study transparency, results reproducibility, and infrastructure sustainability are persistent challenges. CONCLUSION: The advances in Big Data Analytics and Internet technologies together with the engagement of citizens in sciences are shaping the global clinical research enterprise, which is getting more open and increasingly stakeholder-centered, where stakeholders include patients, clinicians, researchers, and sponsors.
Affiliation(s)
- C Weng
- Chunhua Weng, PhD, FACMI, Department of Biomedical Informatics, Columbia University, 622 W 168 Street, PH-20, New York, NY 10032, USA

9. Shi H, Jiang C, Dai W, Jiang X, Tang Y, Ohno-Machado L, Wang S. Secure Multi-pArty Computation Grid LOgistic REgression (SMAC-GLORE). BMC Med Inform Decis Mak 2016; 16 Suppl 3:89. [PMID: 27454168 PMCID: PMC4959358 DOI: 10.1186/s12911-016-0316-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Indexed: 11/10/2022] Open
Abstract
Background: In biomedical research, data sharing and information exchange are very important for improving quality of care, accelerating discovery, and promoting the meaningful secondary use of clinical data. A big concern in biomedical data sharing is the protection of patient privacy because inappropriate information leakage can put patient privacy at risk. Methods: In this study, we deployed a grid logistic regression framework based on Secure Multi-party Computation (SMAC-GLORE). Unlike our previous work in GLORE, SMAC-GLORE protects not only patient-level data, but also all the intermediary information exchanged during the model-learning phase. Results: The experimental results demonstrate the feasibility of secure distributed logistic regression across multiple institutions without sharing patient-level data. Conclusions: In this study, we developed a circuit-based SMAC-GLORE framework. The proposed framework provides a practical solution for secure distributed logistic regression model learning.
Affiliation(s)
- Haoyi Shi
- Department of Biomedical Informatics, University of California, San Diego, CA, 92093, USA; Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY, 13210, USA
- Chao Jiang
- Department of Biomedical Informatics, University of California, San Diego, CA, 92093, USA; School of Electrical and Computer Engineering, University of Oklahoma, Tulsa, OK, 74135, USA
- Wenrui Dai
- Department of Biomedical Informatics, University of California, San Diego, CA, 92093, USA
- Xiaoqian Jiang
- Department of Biomedical Informatics, University of California, San Diego, CA, 92093, USA
- Yuzhe Tang
- Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY, 13210, USA
- Lucila Ohno-Machado
- Department of Biomedical Informatics, University of California, San Diego, CA, 92093, USA
- Shuang Wang
- Department of Biomedical Informatics, University of California, San Diego, CA, 92093, USA

10. Constable SD, Tang Y, Wang S, Jiang X, Chapin S. Privacy-preserving GWAS analysis on federated genomic datasets. BMC Med Inform Decis Mak 2015; 15 Suppl 5:S2. [PMID: 26733045 PMCID: PMC4699163 DOI: 10.1186/1472-6947-15-s5-s2] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Indexed: 01/17/2023] Open
Abstract
BACKGROUND: The biomedical community benefits from the increasing availability of genomic data to support meaningful scientific research, e.g., Genome-Wide Association Studies (GWAS). However, high-quality GWAS usually requires a large number of samples, which can grow beyond the capability of a single institution. Federated genomic data analysis holds the promise of enabling cross-institution collaboration for effective GWAS, but it raises concerns about patient privacy and medical information confidentiality (as data are being exchanged across institutional boundaries), which becomes an inhibiting factor for practical use. METHODS: We present a privacy-preserving GWAS framework on federated genomic datasets. Our method is to layer the GWAS computations on top of secure multi-party computation (MPC) systems. This approach allows two parties in a distributed system to mutually perform secure GWAS computations without exposing their private data to each other. RESULTS: We demonstrate our technique by implementing a framework for minor allele frequency counting and χ² statistics calculation, two typical computations used in GWAS. For efficient prototyping, we use a state-of-the-art MPC framework, i.e., Portable Circuit Format (PCF) [1]. Our experimental results show promise in realizing both efficient and secure cross-institution GWAS computations.
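For orientation, the two GWAS computations named in the Results can be sketched in plaintext. This is not the authors' secure protocol: the code below runs in the clear, whereas the paper evaluates such computations inside an MPC framework so that neither party sees the other's data. The function names, the 0/1/2 genotype coding (copies of the minor allele per individual), and the toy data are all assumptions for illustration, and the χ² shown is the standard 1-degree-of-freedom allelic test on a 2x2 table of allele counts.

```python
def allele_counts(genotypes):
    """Return (minor, major) allele counts from genotypes coded 0, 1, or 2."""
    minor = sum(genotypes)
    return minor, 2 * len(genotypes) - minor


def minor_allele_frequency(genotypes):
    """Fraction of all alleles in the sample that are the minor allele."""
    minor, major = allele_counts(genotypes)
    return minor / (minor + major)


def allelic_chi_square(cases, controls):
    """Pearson chi-square statistic for the 2x2 case/control allele table."""
    a, b = allele_counts(cases)     # case minor, case major
    c, d = allele_counts(controls)  # control minor, control major
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the allele table is empty")
    return n * (a * d - b * c) ** 2 / denom


# Toy genotypes for one SNP (hypothetical data)
cases = [0, 1, 2, 1]
controls = [0, 0, 1, 0]

maf = minor_allele_frequency(cases + controls)
chi2 = allelic_chi_square(cases, controls)
print(f"MAF = {maf:.4f}, chi-square = {chi2:.3f}")
```

In a federated setting each party would hold only its own genotypes; the point of the MPC layer is to compute the same pooled counts and statistic without either side revealing its inputs.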
Affiliation(s)
- Scott D Constable
- Department of EECS, Syracuse University, South Crouse Avenue, 13244 Syracuse, NY USA
- Yuzhe Tang
- Department of EECS, Syracuse University, South Crouse Avenue, 13244 Syracuse, NY USA
- Shuang Wang
- Department of Biomedical Informatics, University of California, San Diego, 9500 Gilman Drive, MC 0728, 92093 La Jolla, CA USA
- Xiaoqian Jiang
- Department of Biomedical Informatics, University of California, San Diego, 9500 Gilman Drive, MC 0728, 92093 La Jolla, CA USA
- Steve Chapin
- Department of EECS, Syracuse University, South Crouse Avenue, 13244 Syracuse, NY USA