Kuo TT, Pham A, Edelson ME, Kim J, Chan J, Gupta Y, Ohno-Machado L. Blockchain-enabled immutable, distributed, and highly available clinical research activity logging system for federated COVID-19 data analysis from multiple institutions.
J Am Med Inform Assoc 2023;30:1167-1178. [PMID: 36916740 PMCID: PMC10198529 DOI: 10.1093/jamia/ocad049]
[Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/26/2022] [Revised: 03/07/2023] [Accepted: 03/11/2023] [Indexed: 03/15/2023]
Abstract
OBJECTIVE
We aimed to develop a distributed, immutable, and highly available cross-cloud blockchain system to facilitate federated data analysis activities among multiple institutions.
MATERIALS AND METHODS
We preprocessed 9166 COVID-19 Structured Query Language (SQL) code, summary statistics, and user activity logs from the GitHub repository of the Reliable Response Data Discovery for COVID-19 (R2D2) Consortium. The repository collected local summary statistics from participating institutions and aggregated them into a global result for a COVID-19-related clinical query previously posted by clinicians on a website. We developed both on-chain and off-chain components to store and query these activity logs and their associated queries and results on a blockchain, for immutability, transparency, and high availability of research communication. We measured the run-time efficiency of contract deployment and network transactions, and confirmed the accuracy of the recorded logs, compared to a centralized baseline solution.
RESULTS
Smart contract deployment took 4.5 s on average. Recording an activity log on the blockchain took slightly over 2 s, versus 5-9 s for the baseline. Each query took less than 0.4 s on average on the blockchain, versus around 2.1 s for the baseline.
DISCUSSION
The low deployment, recording, and querying times confirm the feasibility of our cross-cloud, blockchain-based federated data analysis system. We have yet to evaluate the system on a larger network with multiple nodes per cloud, to consider how to accommodate a surge in activities, and to investigate methods to lower querying time as the blockchain grows.
CONCLUSION
Blockchain technology can be used to support federated data analysis among multiple institutions.