1
Wen D, Soltan A, Trucco E, Matin RN. From data to diagnosis: skin cancer image datasets for artificial intelligence. Clin Exp Dermatol 2024; 49:675-685. [PMID: 38549552 DOI: 10.1093/ced/llae112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 02/11/2024] [Accepted: 03/25/2024] [Indexed: 06/26/2024]
Abstract
Artificial intelligence (AI) solutions for skin cancer diagnosis continue to gain momentum, edging closer towards broad clinical use. These AI models, particularly deep-learning architectures, require large digital image datasets for development. This review provides an overview of the datasets used to develop AI algorithms and highlights the importance of dataset transparency for the evaluation of algorithm generalizability across varying populations and settings. Current challenges for curation of clinically valuable datasets are detailed, which include dataset shifts arising from demographic variations and differences in data collection methodologies, along with inconsistencies in labelling. These shifts can lead to differential algorithm performance, compromise of clinical utility, and the propagation of discriminatory biases when developed algorithms are implemented in mismatched populations. Limited representation of rare skin cancers and minoritized groups in existing datasets are highlighted, which can further skew algorithm performance. Strategies to address these challenges are presented, which include improving transparency, representation and interoperability. Federated learning and generative methods, which may improve dataset size and diversity without compromising privacy, are also examined. Lastly, we discuss model-level techniques that may address biases entrained through the use of datasets derived from routine clinical care. As the role of AI in skin cancer diagnosis becomes more prominent, ensuring the robustness of underlying datasets is increasingly important.
Affiliation(s)
- David Wen
  - Department of Dermatology, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
  - Oxford University Clinical Academic Graduate School, University of Oxford, Oxford, UK
- Andrew Soltan
  - Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
  - Oxford Cancer and Haematology Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
  - Department of Oncology, University of Oxford, Oxford, UK
- Emanuele Trucco
  - VAMPIRE Project, Computing, School of Science and Engineering, University of Dundee, Dundee, UK
- Rubeta N Matin
  - Department of Dermatology, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
  - Artificial Intelligence Working Party Group, British Association of Dermatologists, London, UK
2
Franklin G, Stephens R, Piracha M, Tiosano S, Lehouillier F, Koppel R, Elkin PL. The Sociodemographic Biases in Machine Learning Algorithms: A Biomedical Informatics Perspective. Life (Basel) 2024; 14:652. [PMID: 38929638 PMCID: PMC11204917 DOI: 10.3390/life14060652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 04/24/2024] [Accepted: 04/26/2024] [Indexed: 06/28/2024] [Open Access]
Abstract
Artificial intelligence models represented in machine learning algorithms are promising tools for risk assessment used to guide clinical and other health care decisions. Machine learning algorithms, however, may house biases that propagate stereotypes, inequities, and discrimination that contribute to socioeconomic health care disparities. The biases include those related to some sociodemographic characteristics such as race, ethnicity, gender, age, insurance, and socioeconomic status from the use of erroneous electronic health record data. Additionally, there is concern that training data and algorithmic biases in large language models pose potential drawbacks. These biases affect the lives and livelihoods of a significant percentage of the population in the United States and globally. The social and economic consequences of the associated backlash cannot be underestimated. Here, we outline some of the sociodemographic, training data, and algorithmic biases that undermine sound health care risk assessment and medical decision-making that should be addressed in the health care system. We present a perspective and overview of these biases by gender, race, ethnicity, age, historically marginalized communities, algorithmic bias, biased evaluations, implicit bias, selection/sampling bias, socioeconomic status biases, biased data distributions, cultural biases and insurance status bias, conformation bias, information bias and anchoring biases and make recommendations to improve large language model training data, including de-biasing techniques such as counterfactual role-reversed sentences during knowledge distillation, fine-tuning, prefix attachment at training time, the use of toxicity classifiers, retrieval augmented generation and algorithmic modification to mitigate the biases moving forward.
Affiliation(s)
- Gillian Franklin
  - Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
  - Department of Veterans Affairs, Knowledge Based Systems and Western New York, Veterans Affairs, Buffalo, NY 14215, USA
- Rachel Stephens
  - Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
- Muhammad Piracha
  - Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
- Shmuel Tiosano
  - Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
- Frank Lehouillier
  - Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
  - Department of Veterans Affairs, Knowledge Based Systems and Western New York, Veterans Affairs, Buffalo, NY 14215, USA
- Ross Koppel
  - Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
  - Institute for Biomedical Informatics, Perelman School of Medicine, and Sociology Department, University of Pennsylvania, Philadelphia, PA 19104, USA
- Peter L. Elkin
  - Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
  - Department of Veterans Affairs, Knowledge Based Systems and Western New York, Veterans Affairs, Buffalo, NY 14215, USA
3
Oloruntoba A, Ingvar Å, Sashindranath M, Anthony O, Abbott L, Guitera P, Caccetta T, Janda M, Soyer HP, Mar V. Examining labelling guidelines for AI-based software as a medical device: A review and analysis of dermatology mobile applications in Australia. Australas J Dermatol 2024. [PMID: 38693690 DOI: 10.1111/ajd.14269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 02/26/2024] [Accepted: 04/01/2024] [Indexed: 05/03/2024]
Abstract
In recent years, there has been a surge in the development of AI-based Software as a Medical Device (SaMD), particularly in visual specialties such as dermatology. In Australia, the Therapeutic Goods Administration (TGA) regulates AI-based SaMD to ensure its safe use. Proper labelling of these devices is crucial to ensure that healthcare professionals and the general public understand how to use them and interpret results accurately. However, guidelines for labelling AI-based SaMD in dermatology are lacking, which may result in products failing to provide essential information about algorithm development and performance metrics. This review examines existing labelling guidelines for AI-based SaMD across visual medical specialties, with a specific focus on dermatology. Common recommendations for labelling are identified and applied to currently available dermatology AI-based SaMD mobile applications to determine usage of these labels. Of the 21 AI-based SaMD mobile applications identified, none fully comply with common labelling recommendations. Results highlight the need for standardized labelling guidelines. Ensuring transparency and accessibility of information is essential for the safe integration of AI into health care and preventing potential risks associated with inaccurate clinical decisions.
Affiliation(s)
- Åsa Ingvar
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
  - Victorian Melanoma Service, Alfred Health, Melbourne, Victoria, Australia
  - Department of Dermatology, Skåne University Hospital, Lund University, Lund, Sweden
  - Department of Clinical Sciences, Skåne University Hospital, Lund University, Lund, Sweden
- Maithili Sashindranath
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
- Ojochonu Anthony
  - Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
- Lisa Abbott
  - Melanoma Institute Australia, The University of Sydney, Sydney, New South Wales, Australia
- Pascale Guitera
  - Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia
  - Sydney Melanoma Diagnostic Centre, Royal Prince Alfred Hospital, Camperdown, New South Wales, Australia
  - Perth Dermatology Clinic, Perth, Western Australia, Australia
- Tony Caccetta
  - Perth Dermatology Clinic, Perth, Western Australia, Australia
- Monika Janda
  - Dermatology Research Centre, Frazer Institute, The University of Queensland, Brisbane, Queensland, Australia
- H Peter Soyer
  - Dermatology Research Centre, Frazer Institute, The University of Queensland, Brisbane, Queensland, Australia
- Victoria Mar
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
  - Victorian Melanoma Service, Alfred Health, Melbourne, Victoria, Australia
4
Joly-Chevrier M, Nguyen AXL, Liang L, Lesko-Krleza M, Lefrançois P. The State of Artificial Intelligence in Skin Cancer Publications. J Cutan Med Surg 2024; 28:146-152. [PMID: 38323537 PMCID: PMC11015717 DOI: 10.1177/12034754241229361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2024]
Abstract
BACKGROUND: Artificial intelligence (AI) in skin cancer is a promising research field to assist physicians and to provide support to patients remotely. Physicians' awareness to new developments in AI research is important to define the best practices and scope of integrating AI-enabled technologies within a clinical setting.
OBJECTIVES: To analyze the characteristics and trends of AI skin cancer publications from dermatology journals.
METHODS: AI skin cancer publications were retrieved in June 2022 from the Web of Science. Publications were screened by title, abstract, and keywords to assess eligibility. Publications were fully reviewed. Publications were divided between nonmelanoma skin cancer (NMSC), melanoma, and skin cancer studies. The primary measured outcome was the number of citations. The secondary measured outcomes were articles' general characteristics and features related to AI.
RESULTS: A total of 168 articles were included: 25 on NMSC, 77 on melanoma, and 66 on skin cancer. The most common types of skin cancers were melanoma (134, 79.8%), basal cell carcinoma (61, 36.3%), and squamous cell carcinoma (45, 26.9%). All articles were published between 2000 and 2022, with 49 (29.2%) of them being published in 2021. Original studies that developed or assessed an algorithm predominantly used supervised learning (66, 97.0%) and deep neural networks (42, 67.7%). The most used imaging modalities were standard dermoscopy (76, 45.2%) and clinical images (39, 23.2%).
CONCLUSIONS: Most publications focused on developing or assessing screening technologies with mainly deep neural network algorithms. This indicates the eminent need for dermatologists to label or annotate images used by novel AI systems.
Affiliation(s)
- Laurence Liang
  - Faculty of Engineering, McGill University, Montreal, QC, Canada
- Michael Lesko-Krleza
  - Division of Computer Engineering, Department of Electrical and Computer Engineering, Concordia University, Montreal, QC, Canada
- Philippe Lefrançois
  - Division of Dermatology, Department of Medicine, McGill University, Montreal, QC, Canada
  - Division of Dermatology, Department of Medicine, Jewish General Hospital, Montreal, QC, Canada
  - Lady Davis Institute for Medical Research, Montreal, QC, Canada
5
Affiliation(s)
- Aaron E Carroll
  - Center for Pediatric and Adolescent Comparative Effectiveness Research, Indiana University School of Medicine, Indianapolis
  - Web and Social Media Editor, JAMA Pediatrics
- Dimitri A Christakis
  - Center for Child Health, Behavior, and Development, Seattle Children's Research Institute, Seattle, Washington
  - Editor, JAMA Pediatrics
6
O’Hern K, Yang E, Vidal NY. ChatGPT underperforms in triaging appropriate use of Mohs surgery for cutaneous neoplasms. JAAD Int 2023; 12:168-170. [PMID: 37404248 PMCID: PMC10316650 DOI: 10.1016/j.jdin.2023.06.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 07/06/2023] [Open Access]
Affiliation(s)
- Keegan O’Hern
  - Department of Dermatology, Mayo Clinic, Rochester, Minnesota
- Eilene Yang
  - Mayo Clinic Alix School of Medicine, Rochester, Minnesota
- Nahid Y. Vidal
  - Division of Dermatologic Surgery, Mayo Clinic, Rochester, Minnesota
7
Steele L, Tan XL, Olabi B, Gao JM, Tanaka RJ, Williams HC. Determining the clinical applicability of machine learning models through assessment of reporting across skin phototypes and rarer skin cancer types: A systematic review. J Eur Acad Dermatol Venereol 2023; 37:657-665. [PMID: 36514990 DOI: 10.1111/jdv.18814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2022] [Accepted: 11/09/2022] [Indexed: 12/15/2022]
Abstract
Machine learning (ML) models for skin cancer recognition may have variable performance across different skin phototypes and skin cancer types. Overall performance metrics alone are insufficient to detect poor subgroup performance. We aimed (1) to assess whether studies of ML models reported results separately for different skin phototypes and rarer skin cancers, and (2) to graphically represent the skin cancer training datasets used by current ML models. In this systematic review, we searched PubMed, Embase and CENTRAL. We included all studies in medical journals assessing an ML technique for skin cancer diagnosis that used clinical or dermoscopic images from 1 January 2012 to 22 September 2021. No language restrictions were applied. We considered rarer skin cancers to be skin cancers other than pigmented melanoma, basal cell carcinoma and squamous cell carcinoma. We identified 114 studies for inclusion. Rarer skin cancers were included by 8/114 studies (7.0%), and results for a rarer skin cancer were reported separately in 1/114 studies (0.9%). Performance was reported across all skin phototypes in 1/114 studies (0.9%), but performance was uncertain in skin phototypes I and VI from minimal representation of the skin phototypes in the test dataset (9/3756 and 1/3756, respectively). For training datasets, although public datasets were most frequently used, with the most widely used being the International Skin Imaging Collaboration (ISIC) archive (65/114 studies, 57.0%), the largest datasets were private. Our review identified that most ML models did not report performance separately for rarer skin cancers and different skin phototypes. A degree of variability in ML model performance across subgroups is expected, but the current lack of transparency is not justifiable and risks models being used inappropriately in populations in whom accuracy is low.
Affiliation(s)
- Lloyd Steele
  - Department of Dermatology, The Royal London Hospital, London, UK
  - Centre for Cell Biology and Cutaneous Research, Blizard Institute, Queen Mary University of London, London, UK
- Xiang Li Tan
  - St George's University Hospitals NHS Foundation Trust, London, UK
- Bayanne Olabi
  - Biosciences Institute, Newcastle University, Newcastle, UK
- Jing Mia Gao
  - Department of Dermatology, The Royal London Hospital, London, UK
- Reiko J Tanaka
  - Department of Bioengineering, Imperial College London, London, UK
- Hywel C Williams
  - Centre of Evidence-Based Dermatology, School of Medicine, University of Nottingham, Nottingham, UK
8
Morrow E, Zidaru T, Ross F, Mason C, Patel KD, Ream M, Stockley R. Artificial intelligence technologies and compassion in healthcare: A systematic scoping review. Front Psychol 2023; 13:971044. [PMID: 36733854 PMCID: PMC9887144 DOI: 10.3389/fpsyg.2022.971044] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 06/16/2022] [Accepted: 12/05/2022] [Indexed: 01/18/2023] [Open Access]
Abstract
Background: Advances in artificial intelligence (AI) technologies, together with the availability of big data in society, creates uncertainties about how these developments will affect healthcare systems worldwide. Compassion is essential for high-quality healthcare and research shows how prosocial caring behaviors benefit human health and societies. However, the possible association between AI technologies and compassion is under conceptualized and underexplored.
Objectives: The aim of this scoping review is to provide a comprehensive depth and a balanced perspective of the emerging topic of AI technologies and compassion, to inform future research and practice. The review questions were: How is compassion discussed in relation to AI technologies in healthcare? How are AI technologies being used to enhance compassion in healthcare? What are the gaps in current knowledge and unexplored potential? What are the key areas where AI technologies could support compassion in healthcare?
Materials and methods: A systematic scoping review following five steps of Joanna Briggs Institute methodology. Presentation of the scoping review conforms with PRISMA-ScR (Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews). Eligibility criteria were defined according to 3 concept constructs (AI technologies, compassion, healthcare) developed from the literature and informed by medical subject headings (MeSH) and key words for the electronic searches. Sources of evidence were Web of Science and PubMed databases, articles published in English language 2011-2022. Articles were screened by title/abstract using inclusion/exclusion criteria. Data extracted (author, date of publication, type of article, aim/context of healthcare, key relevant findings, country) was charted using data tables. Thematic analysis used an inductive-deductive approach to generate code categories from the review questions and the data. A multidisciplinary team assessed themes for resonance and relevance to research and practice.
Results: Searches identified 3,124 articles. A total of 197 were included after screening. The number of articles has increased over 10 years (2011, n = 1 to 2021, n = 47 and from Jan-Aug 2022 n = 35 articles). Overarching themes related to the review questions were: (1) Developments and debates (7 themes) Concerns about AI ethics, healthcare jobs, and loss of empathy; Human-centered design of AI technologies for healthcare; Optimistic speculation AI technologies will address care gaps; Interrogation of what it means to be human and to care; Recognition of future potential for patient monitoring, virtual proximity, and access to healthcare; Calls for curricula development and healthcare professional education; Implementation of AI applications to enhance health and wellbeing of the healthcare workforce. (2) How AI technologies enhance compassion (10 themes) Empathetic awareness; Empathetic response and relational behavior; Communication skills; Health coaching; Therapeutic interventions; Moral development learning; Clinical knowledge and clinical assessment; Healthcare quality assessment; Therapeutic bond and therapeutic alliance; Providing health information and advice. (3) Gaps in knowledge (4 themes) Educational effectiveness of AI-assisted learning; Patient diversity and AI technologies; Implementation of AI technologies in education and practice settings; Safety and clinical effectiveness of AI technologies. (4) Key areas for development (3 themes) Enriching education, learning and clinical practice; Extending healing spaces; Enhancing healing relationships.
Conclusion: There is an association between AI technologies and compassion in healthcare and interest in this association has grown internationally over the last decade. In a range of healthcare contexts, AI technologies are being used to enhance empathetic awareness; empathetic response and relational behavior; communication skills; health coaching; therapeutic interventions; moral development learning; clinical knowledge and clinical assessment; healthcare quality assessment; therapeutic bond and therapeutic alliance; and to provide health information and advice. The findings inform a reconceptualization of compassion as a human-AI system of intelligent caring comprising six elements: (1) Awareness of suffering (e.g., pain, distress, risk, disadvantage); (2) Understanding the suffering (significance, context, rights, responsibilities etc.); (3) Connecting with the suffering (e.g., verbal, physical, signs and symbols); (4) Making a judgment about the suffering (the need to act); (5) Responding with an intention to alleviate the suffering; (6) Attention to the effect and outcomes of the response. These elements can operate at an individual (human or machine) and collective systems level (healthcare organizations or systems) as a cyclical system to alleviate different types of suffering. New and novel approaches to human-AI intelligent caring could enrich education, learning, and clinical practice; extend healing spaces; and enhance healing relationships.
Implications: In a complex adaptive system such as healthcare, human-AI intelligent caring will need to be implemented, not as an ideology, but through strategic choices, incentives, regulation, professional education, and training, as well as through joined up thinking about human-AI intelligent caring. Research funders can encourage research and development into the topic of AI technologies and compassion as a system of human-AI intelligent caring. Educators, technologists, and health professionals can inform themselves about the system of human-AI intelligent caring.
Affiliation(s)
- Teodor Zidaru
  - Department of Anthropology, London School of Economics and Political Sciences, London, United Kingdom
- Fiona Ross
  - Faculty of Health, Science, Social Care and Education, Kingston University London, London, United Kingdom
- Cindy Mason
  - Artificial Intelligence Researcher (Independent), Palo Alto, CA, United States
- Melissa Ream
  - Kent Surrey Sussex Academic Health Science Network (AHSN) and the National AHSN Network Artificial Intelligence (AI) Initiative, Surrey, United Kingdom
- Rich Stockley
  - Head of Research and Engagement, Surrey Heartlands Health and Care Partnership, Surrey, United Kingdom
9
Kovarik C. Development of High-Quality AI in Dermatology: Guidelines, Pitfalls, and Potential. JID Innovations 2022; 2:100157. [PMID: 36267807 PMCID: PMC9576984 DOI: 10.1016/j.xjidi.2022.100157] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/01/2022] [Open Access]
Affiliation(s)
- Carrie Kovarik
  - Department of Dermatology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Correspondence: Carrie Kovarik, Department of Dermatology, University of Pennsylvania, 2 Maloney Building, 3600 Spruce Street, Philadelphia, Pennsylvania 19104, USA.