1
Rocher L, Hendrickx JM, de Montjoye YA. A scaling law to model the effectiveness of identification techniques. Nat Commun 2025; 16:347. [PMID: 39788959 PMCID: PMC11718298 DOI: 10.1038/s41467-024-55296-6] [Received: 01/29/2023] [Accepted: 12/06/2024] [Indexed: 01/12/2025]
Abstract
AI techniques are increasingly being used to identify individuals both offline and online. However, quantifying their effectiveness at scale and, by extension, the risks they pose remains a significant challenge. Here, we propose a two-parameter Bayesian model for exact matching techniques and derive an analytical expression for correctness (κ), the fraction of people accurately identified in a population. We then generalize the model to forecast how κ scales from small-scale experiments to the real world, for exact, sparse, and machine learning-based robust identification techniques. Despite having only two degrees of freedom, our method closely fits 476 correctness curves and strongly outperforms curve-fitting methods and entropy-based rules of thumb. Our work provides a principled framework for forecasting the privacy risks posed by identification techniques, while also supporting independent accountability efforts for AI-based biometric systems.
Affiliation(s)
- Luc Rocher
  - Oxford Internet Institute, University of Oxford, Oxford, UK.
  - Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), Université catholique de Louvain, Louvain-la-Neuve, Belgium.
  - Data Science Institute, Imperial College London, London, UK.
- Julien M Hendrickx
  - Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), Université catholique de Louvain, Louvain-la-Neuve, Belgium.
- Yves-Alexandre de Montjoye
  - Data Science Institute, Imperial College London, London, UK.
  - Department of Computing, Imperial College London, London, UK.
2
Cotler JH, Nogueira L, McCabe R, Nelson H, Brajcich BC, Boffa DJ, Lum SS, Harris JB, Hawhee V, Mullett TW, Bilimoria KY, Palis BE. Evaluating information loss in the National Cancer Database from cases lost to follow-up. J Surg Oncol 2022; 126:1123-1132. [PMID: 36029288 DOI: 10.1002/jso.26977] [Received: 05/12/2022] [Revised: 05/17/2022] [Accepted: 05/21/2022] [Indexed: 11/09/2022]
Abstract
BACKGROUND AND OBJECTIVES: Cancer registries must focus on data capture that returns value while reducing resource burden with minimal loss of data. Identifying the optimum length of follow-up data collection for patients with cancer achieves this goal. METHODS: A two-step analysis was performed, using entropy calculations to assess information gain for each follow-up year and second-order differences to compare survival outcomes between the defined follow-up periods and lifetime follow-up. The analysis included a total of 391 567 adult cases, deidentified in the National Cancer Database and diagnosed in 1989. Comparisons examined a subset of 61 908 lung cancer cases, 48 387 colon and rectal cancer cases, and 64 134 breast cancer cases in adults. A total of 4133 pediatric cases diagnosed in 1989 were also included, of which 1065 leukemia cases and 494 lymphoma cases were examined. RESULTS: Annual increases in information gain fell below 1% after 16 years of follow-up for adult cases and 9 years for pediatric cases. Comparison of second-order differences showed that 62% of the comparisons were similar between 15 years and lifetime follow-up when examining restricted mean survival time. In addition, 90% of the comparisons were statistically similar when comparing hazard ratios. CONCLUSIONS: Survival analysis using 15 years of postdiagnosis follow-up showed minimal differences in information gain compared to lifetime follow-up.
Affiliation(s)
- Joseph H Cotler
  - Division of Research and Optimal Patient Care, American College of Surgeons, Chicago, Illinois, USA
- Leticia Nogueira
  - Health Services Research, American Cancer Society, Atlanta, Georgia, USA
- Ryan McCabe
  - Division of Research and Optimal Patient Care, American College of Surgeons, Chicago, Illinois, USA
- Heidi Nelson
  - Division of Research and Optimal Patient Care, American College of Surgeons, Chicago, Illinois, USA
- Brian C Brajcich
  - Division of Research and Optimal Patient Care, American College of Surgeons, Chicago, Illinois, USA
  - Surgical Outcomes and Quality Improvement Center (SOQIC), Department of Surgery, Northwestern Medicine, Chicago, Illinois, USA
- Daniel J Boffa
  - Department of Thoracic Surgery, Yale School of Medicine, New Haven, Connecticut, USA
- Sharon S Lum
  - Department of Surgery, Loma Linda University School of Medicine, Loma Linda, California, USA
- James B Harris
  - Department of Surgery, University of Nevada School of Medicine, Reno, Nevada, USA
- Vicki Hawhee
  - H. Lee Moffitt Cancer Center & Research Institute, Tampa, Florida, USA
- Timothy W Mullett
  - Department of Surgery, University of Kentucky College of Medicine, Lexington, Kentucky, USA
- Karl Y Bilimoria
  - Division of Research and Optimal Patient Care, American College of Surgeons, Chicago, Illinois, USA
  - Surgical Outcomes and Quality Improvement Center (SOQIC), Department of Surgery, Northwestern Medicine, Chicago, Illinois, USA
- Bryan E Palis
  - Division of Research and Optimal Patient Care, American College of Surgeons, Chicago, Illinois, USA
3
Aggarwal E, Mohanty B. An algorithmic-based multi-attribute decision making model under intuitionistic fuzzy environment. Journal of Intelligent & Fuzzy Systems 2022. [DOI: 10.3233/jifs-212026] [Indexed: 01/15/2023]
Abstract
We introduce an outranking procedure for Multi-Attribute Decision-Making (MADM) problems that acts as a decision aid in recommending products to buyers. The buyer’s assessment of each product is represented as an Interval-Valued Intuitionistic Fuzzy Set (IVIFS) in each attribute. The confidence level implicit in the buyer’s product rating is made explicit using fuzzy entropy. Because this confidence applies to both satisfaction and reluctance, it is distributed accordingly between the membership and non-membership parts of the IVIFS. After scoring the products in a way that incorporates the buyer’s confidence, the procedure generates a dominance matrix that represents the partial or full dominance of one product over another. The product ranking is then suggested after ascertaining the buyer’s flexibility. An algorithm is provided to validate the procedure, a numerical example illustrates it, and a comparison with similar works highlights its benefits.
Affiliation(s)
- Eshika Aggarwal
  - Department of Decision Sciences, Indian Institute of Management, Lucknow, India
- B.K. Mohanty
  - Department of Decision Sciences, Indian Institute of Management, Lucknow, India