1
Abstract
High-dimensional clustering analysis is a challenging problem in statistics and machine learning, with broad applications such as the analysis of microarray data and RNA-seq data. In this paper, we propose a new clustering procedure called spectral clustering with feature selection (SC-FS), where we first obtain an initial estimate of labels via spectral clustering, then select a small fraction of features with the largest R-squared with these labels, that is, the proportion of variation explained by group labels, and conduct clustering again using selected features. Under mild conditions, we prove that the proposed method identifies all informative features with high probability and achieves the minimax optimal clustering error rate for the sparse Gaussian mixture model. Applications of SC-FS to four real-world datasets demonstrate its usefulness in clustering high-dimensional data.
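The three steps named in the abstract (initial spectral clustering, R-squared screening, re-clustering) can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes two roughly balanced groups and, as a simplification, takes the sign of the leading left singular vector as the spectral clustering step. The function names, the simulated data, and the parameter `n_keep` are hypothetical.

```python
import numpy as np

def spectral_labels(X):
    """Two-group labels from the sign of the leading left singular vector
    of the column-centered data (a simplified spectral clustering step)."""
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return (U[:, 0] > 0).astype(int)

def r_squared(X, labels):
    """Per-feature proportion of variation explained by the group labels."""
    total = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
    within = np.zeros(X.shape[1])
    for g in np.unique(labels):
        Xg = X[labels == g]
        within += ((Xg - Xg.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - within / total

def sc_fs(X, n_keep):
    """Cluster, keep the n_keep features with the largest R-squared, cluster again."""
    initial = spectral_labels(X)
    keep = np.argsort(r_squared(X, initial))[-n_keep:]
    return spectral_labels(X[:, keep]), np.sort(keep)

# Simulated sparse two-component Gaussian mixture (hypothetical sizes).
rng = np.random.default_rng(0)
n, p, s = 100, 200, 5
truth = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[truth == 1, :s] += 4.0  # only the first s features carry signal

labels, selected = sc_fs(X, n_keep=s)
# Cluster labels are only defined up to a swap of 0 and 1.
accuracy = max(np.mean(labels == truth), np.mean(labels != truth))
```

In practice the number of retained features would have to be chosen from the data; here it is fixed to the true sparsity purely for illustration.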
Affiliation(s)
- Tianqi Liu
  - Google Research, New York, New York, USA
- Yu Lu
  - Two Sigma Investments, New York, New York, USA
- Biqing Zhu
  - Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
- Hongyu Zhao
  - Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
  - Department of Biostatistics, School of Public Health, Yale University, New Haven, Connecticut, USA
2
Zheng M, Zhang X, Ma X. Unsupervised Domain Adaptation with Differentially Private Gradient Projection. Int J Intell Syst 2023. [DOI: 10.1155/2023/8426839] [Indexed: 03/30/2023]
Abstract
Domain adaptation is a viable solution for deep learning with small data. However, domain adaptation models trained on data with sensitive information may be a violation of personal privacy. In this article, we proposed a solution for unsupervised domain adaptation, called DP-CUDA, which is based on differentially private gradient projection and contradistinguisher. Compared with the traditional domain adaptation process, DP-CUDA involves searching for domain-invariant features between the source domain and target domain first and then transferring knowledge. Specifically, the model is trained in the source domain by supervised learning from labeled data. During the training of the target model, feature learning is used to solve the classification task in an end-to-end manner using unlabeled data directly, and the differentially private noise is injected into the gradient. We conducted extensive experiments on a variety of benchmark datasets, including MNIST, USPS, SVHN, VisDA-2017, Office-31, and Amazon Review, to demonstrate our proposed method’s utility and privacy-preserving properties.
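As a rough illustration of what injecting differentially private noise into a gradient can look like, the sketch below uses the generic clip-then-add-Gaussian-noise recipe familiar from DP-SGD. It is not the DP-CUDA procedure and does not implement gradient projection or the contradistinguisher; the function name and the parameters `clip_norm` and `noise_multiplier` are assumptions made for this example.

```python
import numpy as np

def privatize_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-example gradient to L2 norm <= clip_norm, sum them,
    add Gaussian noise scaled to the clipping bound, and average."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=clipped.shape[1])
    return (clipped.sum(axis=0) + noise) / len(per_example_grads)

rng = np.random.default_rng(0)
grads = np.array([[3.0, 4.0],    # norm 5.0, clipped down to norm 1.0
                  [0.3, 0.4]])   # norm 0.5, left unchanged
noiseless = privatize_gradient(grads, clip_norm=1.0, noise_multiplier=0.0, rng=rng)
noisy = privatize_gradient(grads, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
```

Clipping bounds the influence of any single example, which is what lets the noise scale be tied to `clip_norm`; the privacy guarantee actually obtained depends on an accounting step that this sketch omits.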
3
Konkin A, Zapechnikov S. Zero knowledge proof and ZK-SNARK for private blockchains. J Comput Virol Hack Tech 2023. [DOI: 10.1007/s11416-023-00466-1] [Indexed: 03/28/2023]
4
Rasha AH, Li T, Huang W, Gu J, Li C. Federated Learning in Smart Cities: Privacy and Security Survey. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.03.033] [Indexed: 03/11/2023]
5
Chen S, Chen J. Lattice-based Group Signatures with Forward Security for Anonymous Authentication. Heliyon 2023; 9:e14917. [PMID: 37101632 PMCID: PMC10123160 DOI: 10.1016/j.heliyon.2023.e14917] [Received: 11/30/2022] [Revised: 01/19/2023] [Accepted: 03/22/2023] [Indexed: 03/30/2023] Open
Abstract
Group signatures allow users to sign messages on behalf of a group without revealing their identity, while a designated authority is capable of identifying the user who generated a given signature. However, the exposure of a user's signing key severely damages the group signature scheme. To reduce the loss caused by signing key leakage, Song proposed the first forward-secure group signature. If a group signing key is revealed in the current time period, the signing keys of previous periods are not affected. This means that the attacker cannot forge group signatures on messages pertaining to past time periods. To resist quantum attacks, many lattice-based forward-secure group signatures have been proposed. However, their key-update algorithms are expensive, since they require costly computations such as Hermite normal form (HNF) operations and conversion from a full-rank set of lattice vectors into a basis. In this paper, we propose a group signature scheme with forward security from lattices. Compared with previous works, our scheme has several advantages. First, it is more efficient, since the key-update algorithm only needs to sample some vectors independently from a discrete Gaussian. Second, the derived secret key size is linear rather than quadratic in the lattice dimension, which is friendlier to lightweight applications. Anonymous authentication plays an increasingly critical role in protecting privacy and security in environments where private information could be collected for intelligent analysis. Our work contributes to anonymous authentication in the post-quantum setting, which has wide potential applications in the IoT environment.
6
Şahin MS, Akleylek S. A survey of quantum secure group signature schemes: Lattice-based approach. Journal of Information Security and Applications 2023. [DOI: 10.1016/j.jisa.2023.103432] [Indexed: 01/30/2023]
7
Pandit AA, Kumar A, Mishra A. LWR-based Quantum-Safe Pseudo-Random Number Generator. Journal of Information Security and Applications 2023. [DOI: 10.1016/j.jisa.2023.103431] [Indexed: 01/19/2023]
8
Salez J. Universality of cutoff for exclusion with reservoirs. Ann Probab 2023. [DOI: 10.1214/22-aop1600] [Indexed: 02/12/2023]
9
Li S, Zhang L, Cai TT, Li H. Estimation and Inference for High-Dimensional Generalized Linear Models with Knowledge Transfer. J Am Stat Assoc 2023. [DOI: 10.1080/01621459.2023.2184373] [Indexed: 03/06/2023]
Affiliation(s)
- Sai Li
  - Institute of Statistics and Big Data, Renmin University of China, China
- Linjun Zhang
  - Department of Statistics, Rutgers University, New Brunswick, NJ 08854
- T. Tony Cai
  - Department of Statistics, the Wharton School, University of Pennsylvania, Philadelphia, PA 19104
- Hongzhe Li
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
10
Ma W, Xu P, Xu Y. Fairness Maximization among Offline Agents in Online-Matching Markets. ACM Trans Econ Comput 2023. [DOI: 10.1145/3569705] [Indexed: 02/12/2023]
Abstract
Online matching markets (OMMs) are commonly used in today’s world to pair agents from two parties (whom we will call offline and online agents) for mutual benefit. However, studies have shown that the algorithms making decisions in these OMMs often leave disparities in matching rates, especially for offline agents. In this paper, we propose online matching algorithms that optimize for either individual or group-level fairness among offline agents in OMMs. We present two linear-programming (LP) based sampling algorithms, which achieve competitive ratios of at least 0.725 for individual fairness maximization and 0.719 for group fairness maximization. We derive further bounds based on fairness parameters, demonstrating conditions under which the competitive ratio can increase to 100%. There are two key ideas helping us break the barrier of \(1-1/e \approx 63.2\%\) for competitive ratio in online matching. One is boosting, which is to adaptively re-distribute all sampling probabilities among only the available neighbors for every arriving online agent. The other is attenuation, which aims to balance the matching probabilities among offline agents with different mass allocated by the benchmark LP. We conduct extensive numerical experiments, and the results show that the boosted versions of our sampling algorithms are not only conceptually easy to implement but also highly effective in practical instances of OMMs where fairness is a concern.
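The boosting idea described in the abstract can be illustrated with a small sketch. The abstract does not specify how the probabilities are redistributed, so the proportional rescaling below is an assumption, as are the function name and the toy data. The last line simply evaluates the \(1-1/e\) barrier numerically.

```python
import math

def boosted_probabilities(lp_probs, available):
    """Rescale LP sampling probabilities onto the neighbors still available.

    lp_probs maps each offline neighbor of the arriving online agent to the
    sampling probability derived from the benchmark LP; available is the set
    of neighbors not yet matched. Proportional rescaling is an assumption.
    """
    total_mass = sum(lp_probs.values())
    available_mass = sum(p for u, p in lp_probs.items() if u in available)
    if available_mass == 0:
        return {}  # no available neighbor: the arriving agent goes unmatched
    factor = total_mass / available_mass
    return {u: p * factor for u, p in lp_probs.items() if u in available}

# Hypothetical arrival: neighbor "b" has already been matched.
lp_probs = {"a": 0.5, "b": 0.3, "c": 0.2}
boosted = boosted_probabilities(lp_probs, available={"a", "c"})

barrier = 1 - 1 / math.e  # the classical competitive-ratio barrier
```

Without boosting, the 0.3 of probability mass assigned to the unavailable neighbor would be wasted on this arrival; with it, the remaining neighbors keep their relative weights while absorbing that mass.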
Affiliation(s)
- Pan Xu
  - New Jersey Institute of Technology, USA
11
Guingona V, Kolesnikov A, Nierwinski J, Schweitzer A. Comparing approximate and probabilistic differential privacy parameters. Inform Process Lett 2023. [DOI: 10.1016/j.ipl.2023.106380] [Indexed: 02/19/2023]
12
Nitiéma P. Artificial Intelligence in Medicine: Text Mining of Health Care Workers' Opinions. J Med Internet Res 2023; 25:e41138. [PMID: 36584303 PMCID: PMC9919460 DOI: 10.2196/41138] [Received: 07/16/2022] [Revised: 11/11/2022] [Accepted: 12/19/2022] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Artificial intelligence (AI) is being increasingly adopted in the health care industry for administrative tasks, patient care operations, and medical research. OBJECTIVE We aimed to examine health care workers' opinions about the adoption and implementation of AI-powered technology in the health care industry. METHODS Data were comments about AI posted on a web-based forum by 905 health care professionals from at least 77 countries, from May 2013 to October 2021. Structural topic modeling was used to identify the topics of discussion, and hierarchical clustering was performed to determine how these topics cluster into different groups. RESULTS Overall, 12 topics were identified from the collected comments. These comments clustered into 2 groups: impact of AI on health care system and practice and AI as a tool for disease screening, diagnosis, and treatment. Topics associated with negative sentiments included concerns about AI replacing human workers, impact of AI on traditional medical diagnostic procedures (ie, patient history and physical examination), accuracy of the algorithm, and entry of IT companies into the health care industry. Concerns about the legal liability for using AI in treating patients were also discussed. Positive topics about AI included the opportunity offered by the technology for improving the accuracy of image-based diagnosis and for enhancing personalized medicine. CONCLUSIONS The adoption and implementation of AI applications in the health care industry are eliciting both enthusiasm and concerns about patient care quality and the future of health care professions. The successful implementation of AI-powered technologies requires the involvement of all stakeholders, including patients, health care organization workers, health insurance companies, and government regulatory agencies.
Affiliation(s)
- Pascal Nitiéma
  - Department of Information Systems, Arizona State University, Tempe, AZ, United States
13
Fritzsche MC, Akyüz K, Cano Abadía M, McLennan S, Marttinen P, Mayrhofer MT, Buyx AM. Ethical layering in AI-driven polygenic risk scores-New complexities, new challenges. Front Genet 2023; 14:1098439. [PMID: 36816027 PMCID: PMC9933509 DOI: 10.3389/fgene.2023.1098439] [Received: 11/14/2022] [Accepted: 01/04/2023] [Indexed: 01/27/2023] Open
Abstract
Researchers aim to develop polygenic risk scores as a tool to prevent and more effectively treat serious diseases, disorders and conditions such as breast cancer, type 2 diabetes mellitus and coronary heart disease. Recently, machine learning techniques, in particular deep neural networks, have been increasingly developed to create polygenic risk scores using electronic health records as well as genomic and other health data. While the use of artificial intelligence for polygenic risk scores may enable greater accuracy, performance and prediction, it also presents a range of increasingly complex ethical challenges. The ethical and social issues of many polygenic risk score applications in medicine have been widely discussed. However, in the literature and in practice, the ethical implications of their confluence with the use of artificial intelligence have not yet been sufficiently considered. Based on a comprehensive review of the existing literature, we argue that this stands in need of urgent consideration for research and subsequent translation into the clinical setting. Considering the many ethical layers involved, we will first give a brief overview of the development of artificial intelligence-driven polygenic risk scores, associated ethical and social implications, challenges in artificial intelligence ethics, and finally, explore potential complexities of polygenic risk scores driven by artificial intelligence. We point out emerging complexity regarding fairness, challenges in building trust, explaining and understanding artificial intelligence and polygenic risk scores as well as regulatory uncertainties and further challenges. We strongly advocate taking a proactive approach to embedding ethics in research and implementation processes for polygenic risk scores driven by artificial intelligence.
Affiliation(s)
- Marie-Christine Fritzsche
  - Institute of History and Ethics in Medicine, TUM School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Science, Technology and Society (STS), School of Social Sciences and Technology, Technical University of Munich, Munich, Germany
  - Correspondence: Marie-Christine Fritzsche
- Kaya Akyüz
  - Biobanking and Biomolecular Resources Research Infrastructure Consortium - European Research Infrastructure Consortium (BBMRI-ERIC), Graz, Austria
  - Department of Science and Technology Studies, University of Vienna, Vienna, Austria
- Mónica Cano Abadía
  - Biobanking and Biomolecular Resources Research Infrastructure Consortium - European Research Infrastructure Consortium (BBMRI-ERIC), Graz, Austria
- Stuart McLennan
  - Institute of History and Ethics in Medicine, TUM School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Science, Technology and Society (STS), School of Social Sciences and Technology, Technical University of Munich, Munich, Germany
- Pekka Marttinen
  - Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
- Michaela Th. Mayrhofer
  - Biobanking and Biomolecular Resources Research Infrastructure Consortium - European Research Infrastructure Consortium (BBMRI-ERIC), Graz, Austria
- Alena M. Buyx
  - Institute of History and Ethics in Medicine, TUM School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Science, Technology and Society (STS), School of Social Sciences and Technology, Technical University of Munich, Munich, Germany
14
Martyn JM, Liu Y, Chin ZE, Chuang IL. Efficient fully-coherent quantum signal processing algorithms for real-time dynamics simulation. J Chem Phys 2023; 158:024106. [PMID: 36641381 DOI: 10.1063/5.0124385] [Indexed: 01/12/2023] Open
Abstract
Simulating the unitary dynamics of a quantum system is a fundamental problem of quantum mechanics, in which quantum computers are believed to have a significant advantage over their classical counterparts. One prominent such instance is the simulation of electronic dynamics, which plays an essential role in chemical reactions, non-equilibrium dynamics, and material design. These systems are time-dependent, which requires that the corresponding simulation algorithm can be successfully concatenated with itself over different time intervals to reproduce the overall coherent quantum dynamics of the system. In this paper, we quantify such simulation algorithms by the property of being fully-coherent: the algorithm succeeds with arbitrarily high success probability \(1-\delta\) while only requiring a single copy of the initial state. We subsequently develop fully-coherent simulation algorithms based on quantum signal processing (QSP), including a novel algorithm that circumvents the use of amplitude amplification while also achieving a query complexity additive in time \(t\), \(\ln(1/\delta)\), and \(\ln(1/\epsilon)\) for error tolerance \(\epsilon\): \(\Theta\big(\|H\||t|+\ln(1/\epsilon)+\ln(1/\delta)\big)\). Furthermore, we numerically analyze these algorithms by applying them to the simulation of the spin dynamics of the Heisenberg model and the correlated electronic dynamics of an H₂ molecule. Since any electronic Hamiltonian can be mapped to a spin Hamiltonian, our algorithm can efficiently simulate time-dependent ab initio electronic dynamics in the circuit model of quantum computation. Accordingly, it is also our hope that the present work serves as a bridge between QSP-based quantum algorithms and chemical dynamics, stimulating a cross-fertilization between these exciting fields.
Affiliation(s)
- John M Martyn
  - Department of Physics, Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Yuan Liu
  - Department of Physics, Co-Design Center for Quantum Advantage, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Zachary E Chin
  - Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Isaac L Chuang
  - Department of Physics, Co-Design Center for Quantum Advantage, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
15
Drechsler J. Differential Privacy for Government Agencies—Are We There Yet? J Am Stat Assoc 2023. [DOI: 10.1080/01621459.2022.2161385] [Indexed: 01/07/2023]
Affiliation(s)
- Jörg Drechsler
  - Institute for Employment Research and University of Maryland
16
Andersen MS, Marsicano C, Pineda Torres M, Slusky D. Texas Senate Bill 8 significantly reduced travel to abortion clinics in Texas. Front Glob Womens Health 2023; 4:1117724. [PMID: 37020904 PMCID: PMC10067718 DOI: 10.3389/fgwh.2023.1117724] [Received: 12/06/2022] [Accepted: 02/28/2023] [Indexed: 04/07/2023] Open
Abstract
The Dobbs v. Jackson decision by the United States Supreme Court rescinded the constitutional guarantee of abortion across the United States. As a result, at least 13 states have banned abortion access with unknown effects. Using Texas Senate Bill 8 (SB 8), a law that similarly restricted abortions in Texas, we provide insight into how individuals respond to these restrictions, based on aggregated and anonymized human mobility data. We find that SB 8 reduced mobility near abortion clinics in Texas by people who live in Texas and by those who live outside the state. We also find that mobility from Texas to abortion clinics in other states increased, with notable increases in Missouri and Arkansas, two states that subsequently enacted post-Dobbs bans. These results highlight the importance of out-of-state abortion services for women living in highly restrictive states.
Affiliation(s)
- Martin S. Andersen
  - Department of Economics, University of North Carolina at Greensboro, Greensboro, NC, United States
  - Correspondence: Martin S. Andersen
- Mayra Pineda Torres
  - School of Economics, Georgia Institute of Technology, Atlanta, GA, United States
- David Slusky
  - Department of Economics, University of Kansas, Lawrence, KS, United States
17
Narayan A, Liu Z, Bergquist JA, Charlebois C, Rampersad S, Rupp L, Brooks D, White D, Tate J, MacLeod RS. UncertainSCI: Uncertainty quantification for computational models in biomedicine and bioengineering. Comput Biol Med 2023; 152:106407. [PMID: 36521358 PMCID: PMC9812870 DOI: 10.1016/j.compbiomed.2022.106407] [Received: 08/16/2022] [Revised: 11/07/2022] [Accepted: 12/03/2022] [Indexed: 12/12/2022]
Abstract
BACKGROUND Computational biomedical simulations frequently contain parameters that model physical features, material coefficients, and physiological effects, whose values are typically assumed known a priori. Understanding the effect of variability in those assumed values is currently a topic of great interest. A general-purpose software tool that quantifies how variation in these parameters affects model outputs is not broadly available in biomedicine. For this reason, we developed the 'UncertainSCI' uncertainty quantification software suite to facilitate analysis of uncertainty due to parametric variability. METHODS We developed and distributed a new open-source Python-based software tool, UncertainSCI, which employs advanced parameter sampling techniques to build polynomial chaos (PC) emulators that can be used to predict model outputs for general parameter values. Uncertainty of model outputs is studied by modeling parameters as random variables, and model output statistics and sensitivities are then easily computed from the emulator. Our approaches utilize modern, near-optimal techniques for sampling and PC construction based on weighted Fekete points constructed by subsampling from a suitably randomized candidate set. RESULTS Concentrating on two test cases (modeling bioelectric potentials in the heart and electric stimulation in the brain), we illustrate the use of UncertainSCI to estimate variability, statistics, and sensitivities associated with multiple parameters in these models. CONCLUSION UncertainSCI is a powerful yet lightweight tool enabling sophisticated probing of parametric variability and uncertainty in biomedical simulations. Its non-intrusive pipeline allows users to leverage existing software libraries and suites to accurately ascertain parametric uncertainty in a variety of applications.
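To make the polynomial chaos idea concrete, the sketch below fits a one-dimensional Legendre expansion to a toy model with a single parameter assumed uniform on [-1, 1], then reads the output mean and variance off the coefficients. It uses plain NumPy and does not call the UncertainSCI API; it also substitutes simple random sampling with least squares for the weighted Fekete point construction mentioned in the abstract. The function names and the toy model are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def fit_pc(model, degree, n_samples, rng):
    """Least-squares fit of Legendre coefficients from random parameter samples."""
    x = rng.uniform(-1.0, 1.0, size=n_samples)
    V = legendre.legvander(x, degree)  # columns are P_0(x), ..., P_degree(x)
    coeffs, *_ = np.linalg.lstsq(V, model(x), rcond=None)
    return coeffs

def pc_mean_variance(coeffs):
    """Output statistics for a uniform parameter, using E[P_k^2] = 1 / (2k + 1)."""
    k = np.arange(1, len(coeffs))
    mean = coeffs[0]
    variance = np.sum(coeffs[1:] ** 2 / (2 * k + 1))
    return mean, variance

rng = np.random.default_rng(0)
coeffs = fit_pc(lambda x: x ** 2, degree=3, n_samples=50, rng=rng)
mean, variance = pc_mean_variance(coeffs)

# The fitted expansion doubles as a cheap emulator of the model.
emulated = legendre.legval(0.5, coeffs)
```

For this toy model the exact values are a mean of 1/3 and a variance of 4/45, which the expansion recovers because the model lies in the span of the basis. The approach is non-intrusive in the sense that the model is only ever evaluated at sampled parameter values.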
Affiliation(s)
- Akil Narayan
  - Scientific Computing and Imaging Institute, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
  - Department of Mathematics, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
- Zexin Liu
  - Scientific Computing and Imaging Institute, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
  - Department of Mathematics, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
- Jake A Bergquist
  - Scientific Computing and Imaging Institute, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
  - Department of Biomedical Engineering, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
- Chantel Charlebois
  - Scientific Computing and Imaging Institute, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
  - Department of Biomedical Engineering, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
- Sumientra Rampersad
  - Department of Physics, University of Massachusetts Boston, Boston, MA, USA
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA
- Lindsay Rupp
  - Scientific Computing and Imaging Institute, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
  - Department of Biomedical Engineering, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
- Dana Brooks
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA
- Dan White
  - Scientific Computing and Imaging Institute, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
  - Department of Biomedical Engineering, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
- Jess Tate
  - Scientific Computing and Imaging Institute, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
  - Department of Biomedical Engineering, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
- Rob S MacLeod
  - Scientific Computing and Imaging Institute, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
  - Department of Biomedical Engineering, University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, United States
18
Al-Saggaf AA, Sheltami T, Alkhzaimi H, Ahmed G. Lightweight Two-Factor-Based User Authentication Protocol for IoT-Enabled Healthcare Ecosystem in Quantum Computing. Arab J Sci Eng 2023; 48:2347-57. [PMID: 36164325 DOI: 10.1007/s13369-022-07235-0] [Received: 10/27/2021] [Accepted: 01/17/2022] [Indexed: 11/16/2022]
Abstract
The healthcare ecosystem is migrating from legacy systems to the Internet of Things (IoT), resulting in a digital environment. This transformation has increased the importance of user authentication methods that are both secure and usable. Recently, a post-quantum fuzzy commitment scheme (PQFC) has been constructed as a reliable and efficient method of biometric template protection. This paper presents a new two-factor-based user authentication protocol for the IoT-enabled healthcare ecosystem in post-quantum computing environments using the PQFC scheme. The proposed protocol is proved to be secure in the random oracle model. Furthermore, the functionality and security of the proposed protocol are analyzed, showing that it provides the memoryless-effortless property, user anonymity, and mutual authentication, and that it resists biometric template tampering and theft, stolen smart card attacks, and privileged insider attacks. The costs of storage, computation, and communication are estimated. The results demonstrate that the proposed protocol is more efficient than the protocols of Mukherjee et al., Chaudhary et al., and Gupta et al.
19
Li J, Jiang M, Qin Y, Zhang R, Ling SH. Intelligent depression detection with asynchronous federated optimization. Complex Intell Syst 2023; 9:115-131. [PMID: 35761865 PMCID: PMC9217731 DOI: 10.1007/s40747-022-00729-2] [Received: 07/06/2021] [Accepted: 03/27/2022] [Indexed: 10/26/2022]
Abstract
Population growth and the intense pressures of everyday life deepen competition among people. Tens of millions of people suffer from depression each year, and only a fraction receive adequate treatment. The development of social networks such as Facebook, Twitter, Weibo, and QQ provides more convenient communication and a new outlet for emotions. People communicate with their friends, share their opinions, and shoot videos to reflect their feelings, which provides an opportunity to detect depression in social networks. Although depression detection using social networks has reflected the established connectivity across users, few researchers consider data security and privacy-preserving schemes. Therefore, we advocate the federated learning technique as an efficient and scalable method that enables the handling of a massive number of edge devices in parallel. In this study, we conduct the depression analysis on the basis of an online microblog called Weibo. A novel algorithm termed CNN Asynchronous Federated optimization (CAFed) is proposed based on federated learning to improve the communication cost and convergence rate. It is shown that our proposed method can effectively protect users' privacy while maintaining prediction accuracy. The proposed method converges faster than Federated Averaging (FedAvg) for non-convex problems. Federated learning techniques can identify quality solutions of mental health problems among Weibo users.
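For orientation, the sketch below contrasts the synchronous FedAvg aggregation named in the abstract with one common form of asynchronous update, in which the server blends in each client's parameters as they arrive. The abstract does not state CAFed's update rule, so the mixing rule and the parameter `mixing` are assumptions for illustration only, and the flat parameter vectors are toy data.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Synchronous FedAvg: average client parameters, weighted by local data size."""
    return np.average(np.stack(client_weights), axis=0, weights=client_sizes)

def async_update(global_w, client_w, mixing=0.5):
    """Asynchronous step: blend one client's parameters into the global model
    as soon as they arrive (assumed rule, not taken from the paper)."""
    return (1.0 - mixing) * global_w + mixing * client_w

# Two hypothetical clients holding 1 and 3 local examples respectively.
clients = [np.array([1.0, 1.0]), np.array([3.0, 5.0])]
averaged = fedavg(clients, client_sizes=[1, 3])

global_w = np.zeros(2)
global_w = async_update(global_w, np.array([2.0, 4.0]), mixing=0.5)
```

The synchronous version waits for every participating client before aggregating, whereas the asynchronous version never blocks on slow devices, which is the usual motivation for such schemes when many edge devices are involved. In both cases only model parameters, not raw posts, leave the device.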
Affiliation(s)
- Jinli Li
  - College of Electronic Engineering, Guangxi Normal University, Guilin, China (GRID grid.459584.1; ISNI 0000 0001 2196 0260)
- Ming Jiang
  - College of Electronic Engineering, Guangxi Normal University, Guilin, China (GRID grid.459584.1; ISNI 0000 0001 2196 0260)
- Yunbai Qin
  - College of Electronic Engineering, Guangxi Normal University, Guilin, China (GRID grid.459584.1; ISNI 0000 0001 2196 0260)
- Ran Zhang
  - Business and Law School, Deakin University, Geelong, Australia (GRID grid.1021.2; ISNI 0000 0001 0526 7079)
- Sai Ho Ling
  - Faculty of Engineering and IT, University of Technology, Sydney, Australia (GRID grid.117476.2; ISNI 0000 0004 1936 7611)
20
Dinsdale NK, Bluemke E, Sundaresan V, Jenkinson M, Smith SM, Namburete AIL. Challenges for machine learning in clinical translation of big data imaging studies. Neuron 2022; 110:3866-3881. [PMID: 36220099 DOI: 10.1016/j.neuron.2022.09.012] [Received: 07/02/2021] [Revised: 08/27/2021] [Accepted: 09/08/2022] [Indexed: 12/15/2022]
Abstract
Combining deep learning image analysis methods and large-scale imaging datasets offers many opportunities to neuroscience imaging and epidemiology. However, despite these opportunities and the success of deep learning when applied to a range of neuroimaging tasks and domains, significant barriers continue to limit the impact of large-scale datasets and analysis tools. Here, we examine the main challenges and the approaches that have been explored to overcome them. We focus on issues relating to data availability, interpretability, evaluation, and logistical challenges and discuss the problems that still need to be tackled to enable the success of "big data" deep learning approaches beyond research.
Affiliation(s)
- Nicola K Dinsdale
- Wellcome Centre for Integrative Neuroimaging, FMRIB, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK; Oxford Machine Learning in NeuroImaging Lab, OMNI, Department of Computer Science, University of Oxford, Oxford, UK.
| | - Emma Bluemke
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
| | - Vaanathi Sundaresan
- Wellcome Centre for Integrative Neuroimaging, FMRIB, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK; Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Charlestown, MA, USA
| | - Mark Jenkinson
- Wellcome Centre for Integrative Neuroimaging, FMRIB, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK; Australian Institute for Machine Learning (AIML), School of Computer Science, University of Adelaide, Adelaide, SA, Australia; South Australian Health and Medical Research Institute (SAHMRI), North Terrace, Adelaide, SA, Australia
| | - Stephen M Smith
- Wellcome Centre for Integrative Neuroimaging, FMRIB, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
| | - Ana I L Namburete
- Wellcome Centre for Integrative Neuroimaging, FMRIB, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK; Oxford Machine Learning in NeuroImaging Lab, OMNI, Department of Computer Science, University of Oxford, Oxford, UK
| |
|
21
|
Fried S. On the $\alpha$-lazy version of Markov chains in estimation and testing problems. Stat Inference Stoch Process 2022. [DOI: 10.1007/s11203-022-09283-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
|
22
|
Elkoumy G, Pankova A, Dumas M. Differentially private release of event logs for process mining. INFORM SYST 2022. [DOI: 10.1016/j.is.2022.102161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/31/2022]
|
23
|
Wu X, Du Y, Fan T, Guo J, Ren J, Wu R, Zheng T. Threat analysis for space information network based on network security attributes: a review. COMPLEX INTELL SYST 2022. [DOI: 10.1007/s40747-022-00899-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Space Information Network (SIN) is a multi-purpose heterogeneous network. Due to the large scale of SIN, its secure and stable operation is vulnerable to various threats. Much of the current threat analysis for SIN is based on the network function or architecture. However, this approach cannot clearly divide the relation between threats and secure communication measures for a highly integrated network. Furthermore, it leads to overlap in the segregation of security duties. This paper presents a comprehensive review of threats and corresponding solutions in SIN from the perspective of network security attributes. In order to make the analysis applicable to more scenarios, the three most essential attributes, confidentiality, integrity and availability, are selected as the threatened objectives. At the same time, for cross-reference with the analysis based on network function or architecture, this paper relates network layers to network security attributes through secure communication mechanisms. Specifically, confidentiality includes confidential information exchange and Authentication and Key Agreement (AKA), integrity includes information identification and information restoration, and availability includes link establishment, routing mechanism, and mobility management. According to the above framework, this paper provides a cross-layer perspective for analyzing threats and enhancing the security and stability of SIN. Finally, this paper concludes with a summary of challenges and future work in SIN.
|
24
|
Huang W, Li P, Li B, Nie L, Bao H. Towards Stable Task Assignment with Preference Lists and Ties in Spatial Crowdsourcing. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.11.048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
|
25
|
Desai N, Lal Das M, Chaudhari P, Kumar N. Background knowledge attacks in privacy-preserving data publishing models. Comput Secur 2022. [DOI: 10.1016/j.cose.2022.102874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
|
26
|
Hou L, Ni W, Zhang S, Fu N, Zhang D. Wdt-SCAN: Clustering Decentralized Social Graphs with Local Differential Privacy. Comput Secur 2022. [DOI: 10.1016/j.cose.2022.103036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
27
|
Manchini C, Ospina R, Leiva V, Martin-barreiro C. A new approach to data differential privacy based on regression models under heteroscedasticity with numerical applications. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.10.076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
|
28
|
Bhattacharjee K, Islam A, Vaidya J, Dasgupta A. PRIVEE: A Visual Analytic Workflow for Proactive Privacy Risk Inspection of Open Data. IEEE Symp Visual Cyber Sec (VIZSEC) 2022; 2022:10.1109/vizsec56996.2022.9941431. [PMID: 36655144 PMCID: PMC9841578 DOI: 10.1109/vizsec56996.2022.9941431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Open data sets that contain personal information are susceptible to adversarial attacks even when anonymized. By performing low-cost joins on multiple datasets with shared attributes, malicious users of open data portals might get access to information that violates individuals' privacy. However, open data sets are primarily published using a release-and-forget model, whereby data owners and custodians have little to no cognizance of these privacy risks. We address this critical gap by developing a visual analytic solution that enables data defenders to gain awareness about the disclosure risks in local, joinable data neighborhoods. The solution is derived through a design study with data privacy researchers, where we initially play the role of a red team and engage in an ethical data hacking exercise based on privacy attack scenarios. We use this problem and domain characterization to develop a set of visual analytic interventions as a defense mechanism and realize them in PRIVEE, a visual risk inspection workflow that acts as a proactive monitor for data defenders. PRIVEE uses a combination of risk scores and associated interactive visualizations to let data defenders explore vulnerable joins and interpret risks at multiple levels of data granularity. We demonstrate how PRIVEE can help emulate the attack strategies and diagnose disclosure risks through two case studies with data privacy experts.
|
29
|
Xu Z, Hu Z, Zheng X, Zhang H, Luo Y. A matrix factorization recommendation model for tourism points of interest based on interest shift and differential privacy. IFS 2022. [DOI: 10.3233/jifs-211542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Adding noise to user history data helps to protect user privacy in recommendation systems but degrades recommendation performance. To solve this problem, a matrix factorization tourism point of interest recommendation model based on interest offset and differential privacy is proposed in this paper. The recommendation performance of the model is improved by analyzing user interest preferences. Specifically, user interest offsets are extracted from user tags and user ratings under time-series factors to calculate user interest drift. Then, similar neighbors are found to train user feature preferences, which are integrated into the matrix model in the form of regularization terms. Meanwhile, based on differential privacy theory, a privacy neighbor selection algorithm combining the K-Medoids clustering algorithm and index mechanism is designed to effectively protect the identity of neighbors and prevent KNN attacks. In addition, the Laplace mechanism is used to implement differential privacy protection for the model's gradient descent process. Finally, the feasibility of the proposed recommendation model is verified through experiments, and the experimental results indicate that this model has advantages in recommendation accuracy and privacy protection.
Affiliation(s)
- Zhiyun Xu
- School of Geography and Tourism, Anhui Normal University, Wuhu, Anhui, China
| | - Zhaoyan Hu
- School of Computer and Information, Anhui Normal University, Wuhu, Anhui, China
- Anhui Provincial Key Laboratory of Network and Information Security, Wuhu, Anhui, China
| | - Xiaoyao Zheng
- School of Computer and Information, Anhui Normal University, Wuhu, Anhui, China
- Anhui Provincial Key Laboratory of Network and Information Security, Wuhu, Anhui, China
| | - Haiyan Zhang
- School of Computer and Information, Anhui Normal University, Wuhu, Anhui, China
- Anhui Provincial Key Laboratory of Network and Information Security, Wuhu, Anhui, China
| | - Yonglong Luo
- School of Geography and Tourism, Anhui Normal University, Wuhu, Anhui, China
- School of Computer and Information, Anhui Normal University, Wuhu, Anhui, China
- Anhui Provincial Key Laboratory of Network and Information Security, Wuhu, Anhui, China
| |
|
30
|
Abstract
The availability of large-scale datasets on which to train, benchmark and test algorithms has been central to the rapid development of machine learning as a discipline. Despite considerable advancements, the field of quantum machine learning has thus far lacked a set of comprehensive large-scale datasets upon which to benchmark the development of algorithms for use in applied and theoretical quantum settings. In this paper, we introduce such a dataset, the QDataSet, a quantum dataset designed specifically to facilitate the training and development of quantum machine learning algorithms. The QDataSet comprises 52 high-quality publicly available datasets derived from simulations of one- and two-qubit systems evolving in the presence and/or absence of noise. The datasets are structured to provide a wealth of information to enable machine learning practitioners to use the QDataSet to solve problems in applied quantum computation, such as quantum control, quantum spectroscopy and tomography. Accompanying the datasets on the associated GitHub repository are a set of workbooks demonstrating the use of the QDataSet in a range of optimisation contexts.
Measurement(s): Simulations of one- and two-qubit quantum systems evolving in the presence and absence of noise and distortion
Technology Type(s): Simulated measurement using Python packages
Sample Characteristic - Organism: Simulated quantum systems
Sample Characteristic - Environment: Quantum systems in noisy and noiseless environments
|
31
|
Yang M, Zhang C, Wang X, Liu X, Li S, Huang J, Feng Z, Sun X, Chen F, Yang S, Ni M, Li L, Cao Y, Mu F. TrustGWAS: A full-process workflow for encrypted GWAS using multi-key homomorphic encryption and pseudorandom number perturbation. Cell Syst 2022; 13:752-767.e6. [PMID: 36041458 DOI: 10.1016/j.cels.2022.08.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/19/2021] [Revised: 04/21/2022] [Accepted: 08/04/2022] [Indexed: 01/26/2023]
Abstract
The statistical power of genome-wide association studies (GWASs) is affected by the effective sample size. However, the privacy and security concerns associated with individual-level genotype data pose great challenges for cross-institutional cooperation. The full-process cryptographic solutions are in demand but have not been covered, especially the essential principal-component analysis (PCA). Here, we present TrustGWAS, a complete solution for secure, large-scale GWAS, recapitulating gold standard results against PLINK without compromising privacy and supporting basic PLINK steps including quality control, linkage disequilibrium pruning, PCA, chi-square test, Cochran-Armitage trend test, covariate-supported logistic regression and linear regression, and their sequential combinations. TrustGWAS leverages pseudorandom number perturbations for PCA and multiparty scheme of multi-key homomorphic encryption for all other modules. TrustGWAS can evaluate 100,000 individuals with 1 million variants and complete QC-LD-PCA-regression workflow within 50 h. We further successfully discover gene loci associated with fasting blood glucose, consistent with the findings of the ChinaMAP project.
|
32
|
Rauthan J. Fully homomorphic encryption: A case study. IFS 2022. [DOI: 10.3233/jifs-221454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Fully Homomorphic Encryption (FHE) is the holy grail of encrypted communications. It opens the door to several advanced functionalities that overcome the security and trust issues of the IT world. After 2009, once Craig Gentry had shown that FHE could be achieved, research in this field boomed, and significant progress was made in identifying more efficient and realistic schemes. FHE is a cryptographic primitive that enables arbitrary functions to be computed on encrypted data. These systems are applicable in different ways since they permit users to encrypt their private information securely while still outsourcing the processing of protected data without fear of disclosing the real data. In 2012, LTV12 presented the first multi-key FHE system and demonstrated the possibility of using multi-key systems in somewhat homomorphic encryption (SHE). As in the single-key context, there have been many advances in the field, but no effort has been made to develop the multi-key methods. This paper presents a discussion of FHE and MKFHE with a specific focus on the current techniques and three implementations, comprising the first in the multi-key setups, to the extent of our understanding.
|
33
|
Chen D, Li Y, Chen J, Bi H, Ding X, Putzu L. Differential Privacy via Haar Wavelet Transform and Gaussian Mechanism for Range Query. Computational Intelligence and Neuroscience 2022; 2022:1-17. [PMID: 36131905 PMCID: PMC9484951 DOI: 10.1155/2022/8139813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2022] [Revised: 07/03/2022] [Accepted: 08/22/2022] [Indexed: 11/23/2022]
Abstract
Range queries are a hot topic in privacy-preserving data publishing. To preserve privacy, a larger range query means that more accumulated noise is injected into the input data. This study presents research on differential privacy for range queries via the Haar wavelet transform and the Gaussian mechanism. First, the noise injected into the input data via the Laplace mechanism is analyzed, and we conclude that it is difficult to judge the level of privacy protection based on the Haar wavelet transform and Laplace mechanism for range queries because the sum of independent Laplace random variables does not follow a Laplace distribution. Second, a method of injecting noise into Haar wavelet coefficients via the Gaussian mechanism is proposed in this study. Finally, the maximum variance for any range query under the framework of the Haar wavelet transform and Gaussian mechanism is given. The analysis shows that using the Haar wavelet transform and Gaussian mechanism, we can preserve differential privacy for each input datum and any range query, and the variance of the noise is far less than that obtained using the Gaussian mechanism alone. In an experimental study on the age dataset extracted from IPUMS census data of the United States, we confirm that the proposed mechanism has a much smaller maximum noise variance than the Gaussian mechanism for range-count queries.
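The general idea in this abstract, perturbing Haar wavelet coefficients rather than raw counts and then answering range queries from the reconstruction, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names are hypothetical, the transform is a plain unnormalized Haar decomposition, and `sigma` is a placeholder noise level that is not calibrated to any privacy budget, so this is not the authors' mechanism and gives no privacy guarantee as written.

```python
import random

def haar_forward(values):
    """Unnormalized Haar decomposition: [overall average, coarse-to-fine details]."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    current = list(values)
    details = []
    while len(current) > 1:
        averages = [(current[i] + current[i + 1]) / 2 for i in range(0, len(current), 2)]
        diffs = [(current[i] - current[i + 1]) / 2 for i in range(0, len(current), 2)]
        details = diffs + details  # coarser levels end up in front
        current = averages
    return current + details

def haar_inverse(coeffs):
    """Invert haar_forward by expanding one level at a time."""
    current = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(current)]
        pos += len(current)
        expanded = []
        for avg, diff in zip(current, diffs):
            expanded.extend([avg + diff, avg - diff])
        current = expanded
    return current

def noisy_range_count(counts, lo, hi, sigma, rng):
    """Add Gaussian noise to each coefficient, rebuild, and sum counts[lo..hi].

    sigma is a hypothetical, uncalibrated noise level (assumption).
    """
    noisy = [c + rng.gauss(0.0, sigma) for c in haar_forward(counts)]
    released = haar_inverse(noisy)
    return sum(released[lo:hi + 1])
```

With `sigma = 0` the reconstruction is exact, which is a convenient sanity check; in a real mechanism the noise scale per coefficient would have to be derived from the sensitivity of the transform and the chosen privacy parameters.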
|
34
|
Sundaram A, Abdel-Khalik HS, Abdo MG. Preventing Reverse Engineering of Critical Industrial Data with DIOD. NUCL TECHNOL 2022. [DOI: 10.1080/00295450.2022.2102848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Affiliation(s)
- Arvind Sundaram
- Purdue University, 205 Gates Road, West Lafayette, Indiana 47906
| | | | - Mohammad G. Abdo
- Idaho National Laboratory, 1955 N. Fremont Road, Idaho Falls, Idaho 83415
| |
|
35
|
Halfpenny W, Baxter SL. Towards effective data sharing in ophthalmology: data standardization and data privacy. Curr Opin Ophthalmol 2022; 33:418-424. [PMID: 35819893 PMCID: PMC9357189 DOI: 10.1097/icu.0000000000000878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
PURPOSE OF REVIEW: The purpose of this review is to provide an overview of updates in data standardization and data privacy in ophthalmology. These topics represent two key aspects of medical information sharing and are important knowledge areas given trends in data-driven healthcare.
RECENT FINDINGS: Standardization and privacy can be seen as complementary aspects that pertain to data sharing. Standardization promotes the ease and efficacy through which data is shared. Privacy considerations ensure that data sharing is appropriate and sufficiently controlled. There is active development in both areas, including government regulations and common data models to advance standardization, and application of technologies such as blockchain and synthetic data to help tackle privacy issues. These advancements have seen use in ophthalmology, but there are areas where further work is required.
SUMMARY: Information sharing is fundamental to both research and care delivery, and standardization/privacy are key constituent considerations. Therefore, widespread engagement with, and development of, data standardization and privacy ecosystems stand to offer great benefit to ophthalmology.
Affiliation(s)
| | - Sally L. Baxter
- Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, CA, USA
- Health Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
| |
|
36
|
Packhäuser K, Gündel S, Münster N, Syben C, Christlein V, Maier A. Deep learning-based patient re-identification is able to exploit the biometric nature of medical chest X-ray data. Sci Rep 2022; 12:14851. [PMID: 36050406 DOI: 10.1038/s41598-022-19045-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/06/2021] [Accepted: 08/23/2022] [Indexed: 11/08/2022] Open
Abstract
With the rise and ever-increasing potential of deep learning techniques in recent years, publicly available medical datasets became a key factor to enable reproducible development of diagnostic algorithms in the medical domain. Medical data contains sensitive patient-related information and is therefore usually anonymized by removing patient identifiers, e.g., patient names before publication. To the best of our knowledge, we are the first to show that a well-trained deep learning system is able to recover the patient identity from chest X-ray data. We demonstrate this using the publicly available large-scale ChestX-ray14 dataset, a collection of 112,120 frontal-view chest X-ray images from 30,805 unique patients. Our verification system is able to identify whether two frontal chest X-ray images are from the same person with an AUC of 0.9940 and a classification accuracy of 95.55%. We further highlight that the proposed system is able to reveal the same person even ten and more years after the initial scan. When pursuing a retrieval approach, we observe an mAP@R of 0.9748 and a precision@1 of 0.9963. Furthermore, we achieve an AUC of up to 0.9870 and a precision@1 of up to 0.9444 when evaluating our trained networks on external datasets such as CheXpert and the COVID-19 Image Data Collection. Based on this high identification rate, a potential attacker may leak patient-related information and additionally cross-reference images to obtain more information. Thus, there is a great risk of sensitive content falling into unauthorized hands or being disseminated against the will of the concerned patients. Especially during the COVID-19 pandemic, numerous chest X-ray datasets have been published to advance research. Therefore, such data may be vulnerable to potential attacks by deep learning-based re-identification algorithms.
|
37
|
Benyahya M, Collen A, Kechagia S, Nijdam NA. Automated City Shuttles: Mapping the Key Challenges in Cybersecurity, Privacy and Standards to Future Developments. Comput Secur 2022. [DOI: 10.1016/j.cose.2022.102904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
38
|
Packhäuser K, Gündel S, Münster N, Syben C, Christlein V, Maier A. Deep learning-based patient re-identification is able to exploit the biometric nature of medical chest X-ray data. Sci Rep 2022. [PMID: 36050406 DOI: 10.48550/arxiv.2103.08562] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 04/09/2023] Open
Affiliation(s)
- Kai Packhäuser
- Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058, Erlangen, Germany.
| | - Sebastian Gündel
- Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058, Erlangen, Germany
| | - Nicolas Münster
- Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058, Erlangen, Germany
| | - Christopher Syben
- Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058, Erlangen, Germany
| | - Vincent Christlein
- Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058, Erlangen, Germany
| | - Andreas Maier
- Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058, Erlangen, Germany
| |
|
39
|
Hu S, Li Y, Liu X, Li Q, Wu Z, He B. The OARF Benchmark Suite: Characterization and Implications for Federated Learning Systems. ACM T INTEL SYST TEC 2022. [DOI: 10.1145/3510540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Abstract
This article presents and characterizes an Open Application Repository for Federated Learning (OARF), a benchmark suite for federated machine learning systems. Previously available benchmarks for federated learning (FL) have focused mainly on synthetic datasets and use a limited number of applications. OARF mimics more realistic application scenarios with publicly available datasets as different data silos in image, text, and structured data. Our characterization shows that the benchmark suite is diverse in data size, distribution, feature distribution, and learning task complexity. The extensive evaluations with reference implementations show the future research opportunities for important aspects of FL systems. We have developed reference implementations, and evaluated the important aspects of FL, including model accuracy, communication cost, throughput, and convergence time. Through these evaluations, we discovered some interesting findings such as FL can effectively increase end-to-end throughput. The code of OARF is publicly available on GitHub.
Affiliation(s)
- Sixu Hu
- National University of Singapore, Singapore
| | - Yuan Li
- National University of Singapore, Singapore
| | - Xu Liu
- National University of Singapore, Singapore
| | - Qinbin Li
- National University of Singapore, Singapore
| | - Zhaomin Wu
- National University of Singapore, Singapore
| | | |
|
40
|
Alvim MS, Chatzikokolakis K, Kawamoto Y, Palamidessi C. Information Leakage Games: Exploring Information as a Utility Function. ACM Trans Priv Secur 2022. [DOI: 10.1145/3517330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
A common goal in the areas of secure information flow and privacy is to build effective defenses against unwanted leakage of information. To this end, one must be able to reason about potential attacks and their interplay with possible defenses. In this article, we propose a game-theoretic framework to formalize strategies of attacker and defender in the context of information leakage, and provide a basis for developing optimal defense methods. A novelty of our games is that their utility is given by information leakage, which in some cases may behave in a non-linear way. This causes a significant deviation from classic game theory, in which utility functions are linear with respect to players’ strategies. Hence, a key contribution of this work is the establishment of the foundations of information leakage games. We consider two kinds of games, depending on the notion of leakage considered. The first kind, the QIF-games, is tailored for the theory of quantitative information flow. The second one, the DP-games, corresponds to differential privacy.
Affiliation(s)
| | | | | | - Catuscia Palamidessi
- Inria Saclay, France and École Polytechnique, École Polytechnique, Palaiseau, France
| |
|
41
|
Li W, Yan A, Wu D, Zhu T, Huang T, Luo X, Wang S. DPCL: Contrastive representation learning with differential privacy. INT J INTELL SYST. [DOI: 10.1002/int.23002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
|
42
|
Li F, Cui Y, Wang J, Zhou H, Wang X, Yang Q. Lattice‐based batch authentication scheme with dynamic identity revocation in VANET. INT J INTELL SYST 2022. [DOI: 10.1002/int.23004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Affiliation(s)
- Fengyin Li
- School of Computer Science, Qufu Normal University, Rizhao, China
- Yang Cui
- School of Computer Science, Qufu Normal University, Rizhao, China
- Junhui Wang
- School of Computer Science, Qufu Normal University, Rizhao, China
- Huiyu Zhou
- School of Computing and Mathematical Sciences, University of Leicester, Leicester, UK
- Xiaoying Wang
- Information Center, Third Affiliated Hospital of Sun Yat‐sen University, Guangzhou, China
- Qintai Yang
- Information Center, Third Affiliated Hospital of Sun Yat‐sen University, Guangzhou, China
|
43
|
Li F, Yu S, Li G, Yang M, Zhou H. Intelligent federated learning on lattice‐based efficient heterogeneous signcryption. INT J INTELL SYST 2022. [DOI: 10.1002/int.23007] [Indexed: 11/12/2022]
Affiliation(s)
- Fengyin Li
- School of Computer Science, Qufu Normal University, Rizhao, China
- Siqi Yu
- School of Computer Science, Qufu Normal University, Rizhao, China
- Guangshun Li
- School of Computer Science, Qufu Normal University, Rizhao, China
- Mengjiao Yang
- School of Computer Science, Qufu Normal University, Rizhao, China
- Huiyu Zhou
- School of Computing and Mathematical Sciences, University of Leicester, Leicester, UK
|
44
|
Akallouch M, Akallouch O, Fardousse K, Bouhoute A, Berrada I. Prediction and Privacy Scheme for Traffic Flow Estimation on the Highway Road Network. Information 2022; 13:381. [DOI: 10.3390/info13080381] [Indexed: 11/16/2022] Open
Abstract
Accurate and timely traffic information is a vital element of intelligent transportation systems and urban management, and it matters to both road users and government agencies. However, existing traffic prediction approaches are primarily based on standard machine learning, which requires sending raw data to a global server for model training. Because user data may contain sensitive personal information, sharing raw data risks leaking and exposing private user data. To address these challenges, we introduce a new hybrid framework that leverages Federated Learning with Local Differential Privacy to share model updates rather than raw data among users. Our FL-LDP approach is designed to coordinate users to train the model collaboratively without compromising data privacy. We evaluate our scheme on a real-world public dataset, implement several deep neural networks, and compare our approach comprehensively with state-of-the-art models. The experimental results confirm that the proposed scheme is capable of producing accurate traffic predictions, improving privacy preservation, and preventing data recovery attacks.
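The idea of sharing locally privatized model updates instead of raw data can be illustrated with a minimal sketch. It is not the authors' FL-LDP implementation: the function names, the L1 clipping step, and the use of the Laplace mechanism are assumptions chosen for illustration. Each client clips its update to an L1 norm of at most `clip_norm`, so any two clipped updates differ by at most `2 * clip_norm` in L1 distance, and adds Laplace noise of scale `2 * clip_norm / epsilon` to every coordinate before sending it to the server.

```python
import random

def clip_l1(update, clip_norm):
    """Scale the update down so that its L1 norm is at most clip_norm."""
    norm = sum(abs(x) for x in update)
    if norm <= clip_norm or norm == 0.0:
        return list(update)
    factor = clip_norm / norm
    return [x * factor for x in update]

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def privatize_update(update, clip_norm, epsilon, rng):
    """Client side: clip, then add per-coordinate Laplace noise (assumed mechanism)."""
    clipped = clip_l1(update, clip_norm)
    scale = 2.0 * clip_norm / epsilon  # L1 sensitivity of a clipped update is 2 * clip_norm
    return [x + laplace_noise(scale, rng) for x in clipped]

def aggregate(noisy_updates):
    """Server side: coordinate-wise average of the noisy client updates."""
    n = len(noisy_updates)
    return [sum(column) / n for column in zip(*noisy_updates)]

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical client updates; real updates would come from local training.
    client_updates = [[0.2, -0.1, 0.4], [0.3, 0.0, 0.1], [-0.1, 0.2, 0.2]]
    noisy = [privatize_update(u, clip_norm=1.0, epsilon=1.0, rng=rng) for u in client_updates]
    print(aggregate(noisy))
```

The server only ever sees the noisy vectors. Averaging over many clients reduces the effect of the noise on the aggregate, which is why this kind of scheme trades per-user accuracy for privacy.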
|
45
|
Meng X, Yang Y, Liu X, Jiang N. Active forgetting via influence estimation for neural networks. INT J INTELL SYST 2022. [DOI: 10.1002/int.22981] [Indexed: 11/08/2022]
Affiliation(s)
- Xianjia Meng
- School of Information Science and Technology, Northwest University, Xi'an, People's Republic of China
- Yong Yang
- School of Information Science and Technology, Northwest University, Xi'an, People's Republic of China
- Ximeng Liu
- Fujian Province Key Laboratory of Information Security of Network System, and College of Computer and Data Science, Fuzhou University, Fuzhou, People's Republic of China
- Cyberspace Security Research Center, Peng Cheng Laboratory, Shenzhen, People's Republic of China
- Nan Jiang
- Department of Internet of Things, School of Information Engineering, East China Jiao Tong University, Nanchang, People's Republic of China
|
46
|
Krzyzanowski B, Manson SM. Twenty Years of the Health Insurance Portability and Accountability Act Safe Harbor Provision: Unsolved Challenges and Ways Forward. JMIR Med Inform 2022; 10:e37756. [PMID: 35921140 PMCID: PMC9386597 DOI: 10.2196/37756] [Received: 03/05/2022] [Revised: 06/23/2022] [Accepted: 06/27/2022] [Indexed: 11/13/2022] Open
Abstract
The Health Insurance Portability and Accountability Act (HIPAA) was an important milestone in protecting the privacy of patient data; however, the HIPAA provisions specific to geographic data remain vague and hinder the ways in which epidemiologists and geographers use and share spatial health data. The literature on spatial health and select legal and official guidance documents present scholars with ambiguous guidelines that have led to the use and propagation of multiple interpretations of a single HIPAA safe harbor provision specific to geographic data. Misinterpretation of this standard has resulted in many entities sharing data at overly conservative levels, whereas others offer definitions of safe harbors that potentially put patient data at risk. To promote understanding of, and adherence to, the safe harbor rule, this paper reviews the HIPAA law from its creation to the present day, elucidating common misconceptions and presenting straightforward guidance to scholars. We focus on the 20,000-person population threshold and the 3-digit zip code stipulation of safe harbors, which are central to the confusion surrounding how patient location data can be shared. A comprehensive examination of these 2 stipulations, which integrates various expert perspectives and relevant studies, reveals how alternative methods for safe harbors can offer researchers better data and better data protection. Much has changed in the 20 years since the introduction of the safe harbor provision; however, it continues to be the primary source of guidance (and frustration) for researchers trying to share maps, leaving many waiting for these rules to be revised in accordance with the times.
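The two stipulations the abstract focuses on can be made concrete with a small sketch of the commonly cited reading of the safe harbor rule: a ZIP code is reduced to its first three digits, and those digits are kept only if the area formed by all ZIP codes sharing them contains more than 20,000 people; otherwise they are replaced with `000`. The function name and the population table below are hypothetical and the figures are made up for illustration; this is not the paper's method or an official tool.

```python
def safe_harbor_zip(zip_code, zip3_population, threshold=20000):
    """Return the 3-digit ZIP prefix, or "000" if its area is not above the threshold."""
    zip3 = zip_code[:3]
    # Unknown prefixes are treated as population 0, i.e. suppressed.
    if zip3_population.get(zip3, 0) > threshold:
        return zip3
    return "000"

if __name__ == "__main__":
    # Hypothetical population counts per 3-digit ZIP area (not real census data).
    populations = {"554": 500000, "036": 15000}
    for z in ["55455", "03601", "99999"]:
        print(z, "->", safe_harbor_zip(z, populations))
```

Treating unknown prefixes as suppressed is a conservative design choice in this sketch; it errs toward hiding a location rather than releasing it.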
|
47
|
Hotz VJ, Bollinger CR, Komarova T, Manski CF, Moffitt RA, Nekipelov D, Sojourner A, Spencer BD. Balancing data privacy and usability in the federal statistical system. Proc Natl Acad Sci U S A 2022; 119:e2104906119. [PMID: 35878030 DOI: 10.1073/pnas.2104906119] [Indexed: 01/27/2023] Open
Abstract
The federal statistical system is experiencing competing pressures for change. On the one hand, for confidentiality reasons, much socially valuable data currently held by federal agencies is either not made available to researchers at all or only made available under onerous conditions. On the other hand, agencies which release public databases face new challenges in protecting the privacy of the subjects in those databases, which leads them to consider releasing fewer data or masking the data in ways that will reduce their accuracy. In this essay, we argue that the discussion has not given proper consideration to the reduced social benefits of data availability and their usability relative to the value of increased levels of privacy protection. A more balanced benefit-cost framework should be used to assess these trade-offs. We express concerns both with synthetic data methods for disclosure limitation, which will reduce the types of research that can be reliably conducted in unknown ways, and with differential privacy criteria that use what we argue is an inappropriate measure of disclosure risk. We recommend that the measure of disclosure risk used to assess all disclosure protection methods focus on what we believe is the risk that individuals should care about, that more study of the impact of differential privacy criteria and synthetic data methods on data usability for research be conducted before either is put into widespread use, and that more research be conducted on alternative methods of disclosure risk reduction that better balance benefits and costs.
|
48
|
Fang L, Du B, Wu C. Differentially private recommender system with variational autoencoders. Knowl Based Syst 2022; 250:109044. [DOI: 10.1016/j.knosys.2022.109044] [Indexed: 11/20/2022]
|
49
|
Vu DH, Vu TS, Luong TD. An efficient and practical approach for privacy-preserving Naive Bayes classification. Journal of Information Security and Applications 2022. [DOI: 10.1016/j.jisa.2022.103215] [Indexed: 11/16/2022]
|
50
|
Abstract
Building performant and robust artificial intelligence (AI)-based applications for dentistry requires large and high-quality data sets, which usually reside in distributed data silos from multiple sources (e.g., different clinical institutes). Collaborative efforts are limited because privacy constraints forbid direct sharing across the borders of these data silos. Federated learning is a scalable and privacy-preserving framework for collaborative training of AI models without data sharing; instead, knowledge is exchanged in the form of wisdom learned from the data. This article introduces the established concept of federated learning, together with its chances and challenges, to foster collaboration on AI-based applications within the dental research community.
Affiliation(s)
- R Rischke
- Department of Artificial Intelligence, Fraunhofer Heinrich Hertz Institute, Berlin, Germany
- L Schneider
- Department of Oral Diagnostics, Digital Health and Health Services Research, Charité-Universitätsmedizin, Berlin, Germany; ITU/WHO Focus Group on AI for Health, Topic Group Dental Diagnostics and Digital Dentistry, Geneva, Switzerland
- K Müller
- Department of Artificial Intelligence, Fraunhofer Heinrich Hertz Institute, Berlin, Germany
- W Samek
- Department of Artificial Intelligence, Fraunhofer Heinrich Hertz Institute, Berlin, Germany
- F Schwendicke
- Department of Oral Diagnostics, Digital Health and Health Services Research, Charité-Universitätsmedizin, Berlin, Germany; ITU/WHO Focus Group on AI for Health, Topic Group Dental Diagnostics and Digital Dentistry, Geneva, Switzerland
- J Krois
- Department of Oral Diagnostics, Digital Health and Health Services Research, Charité-Universitätsmedizin, Berlin, Germany; ITU/WHO Focus Group on AI for Health, Topic Group Dental Diagnostics and Digital Dentistry, Geneva, Switzerland
|