Krammer S, Li Y, Jakob N, Boehm AS, Wolff H, Tang P, Lasser T, French LE, Hartmann D. Deep learning-based classification of dermatological lesions given a limited amount of labeled data. J Eur Acad Dermatol Venereol 2022; 36:2516-2524. [PMID: 35876737 DOI: 10.1111/jdv.18460]
[Received: 02/08/2022] [Accepted: 06/10/2022] [Indexed: 11/28/2022]
Abstract
BACKGROUND
Artificial intelligence (AI) techniques are promising for the early diagnosis of skin diseases. However, a precondition for their success is access to large-scale annotated data. Until now, obtaining such data has been feasible only with very high personnel and financial resources.
OBJECTIVES
The aim of this study was to overcome the obstacle caused by the scarcity of labeled data.
METHODS
To simulate a label shortage, we discarded a proportion of the labels in the training set, so that it consisted of both labeled and unlabeled images. We then used a self-supervised learning technique to pre-train the AI model on the unlabeled images. Next, we fine-tuned the pre-trained model on the labeled images.
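The label-discarding step can be illustrated with a minimal sketch. It assumes that labels are dropped uniformly at random, which the abstract does not specify, and the names `discard_labels` and `split_by_label` are hypothetical helpers, not the authors' code.

```python
import random


def discard_labels(labels, labeled_fraction, seed=0):
    """Return a copy of `labels` in which only `labeled_fraction` of the
    entries keep their label; all others are replaced by None.

    Assumption: labels are discarded uniformly at random.
    """
    if not 0.0 <= labeled_fraction <= 1.0:
        raise ValueError("labeled_fraction must lie in [0, 1]")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    n_keep = round(len(labels) * labeled_fraction)
    keep = set(rng.sample(range(len(labels)), n_keep))
    return [lab if i in keep else None for i, lab in enumerate(labels)]


def split_by_label(images, masked_labels):
    """Split images into a labeled subset (for fine-tuning) and an
    unlabeled subset (for self-supervised pre-training)."""
    labeled = [(img, lab) for img, lab in zip(images, masked_labels)
               if lab is not None]
    unlabeled = [img for img, lab in zip(images, masked_labels)
                 if lab is None]
    return labeled, unlabeled
```

With `labeled_fraction=0.1`, one tenth of the training images keep their labels, which corresponds to the 10% setting reported in the results.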
RESULTS
When the images in the training dataset were fully labeled, the self-supervised pre-trained model achieved an accuracy of 95.7%, a precision of 91.7% and a sensitivity of 90.7%. When only 10% of the data was labeled, the model still achieved an accuracy of 87.7%, a precision of 81.7% and a sensitivity of 68.6%. In addition, we empirically verified that the AI model and dermatologists are consistent in visually inspecting the skin images.
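For reference, the three reported metrics can be computed from predictions as in the sketch below. It assumes a binary setting with a single positive class; the abstract does not state how many classes were used or how per-class scores were averaged, and `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision and sensitivity (recall) for a binary
    classification task, treating `positive` as the positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Precision: share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Sensitivity: share of true positives that the model detects.
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
    }
```

Sensitivity depends only on the positive cases, so it can fall much further than accuracy when few labeled examples are available, as in the 10% setting.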
CONCLUSIONS
The experimental results demonstrate the great potential of self-supervised learning for alleviating the scarcity of annotated data.