Giraldo-Roldan D, Ribeiro ECC, Araújo ALD, Penafort PVM, Silva VMD, Câmara J, Pontes HAR, Martins MD, Oliveira MC, Santos-Silva AR, Lopes MA, Kowalski LP, Moraes MC, Vargas PA. Deep learning applied to the histopathological diagnosis of ameloblastomas and ameloblastic carcinomas. J Oral Pathol Med 2023;52:988-995. [PMID: 37712132 DOI: 10.1111/jop.13481]
Abstract
BACKGROUND
Odontogenic tumors (OT) comprise a heterogeneous group of lesions, benign or malignant, with differing behavior and histology. Within this classification, ameloblastoma and ameloblastic carcinoma (AC) represent a diagnostic challenge in daily histopathological practice because of their similar characteristics and the limitations of incisional biopsies. From these premises, we aimed to test the usefulness of artificial intelligence (AI)-based models for differential diagnosis in oral and maxillofacial pathology. The main advantages of integrating machine learning (ML) with microscopic and radiographic imaging are the ability to significantly reduce intra- and inter-observer variability and to improve diagnostic objectivity and reproducibility.
METHODS
Thirty digitized slides were collected from different diagnostic centers of oral pathology in Brazil. After manual annotation of the regions of interest, the images were segmented and fragmented into small patches. Within a supervised learning methodology for image classification, three models (ResNet50, DenseNet, and VGG16) were investigated to provide the probability of an image being classified as class 0 (ameloblastoma) or class 1 (ameloblastic carcinoma).
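A minimal sketch of this patch-classification setup, written in PyTorch, is shown below. The directory layout, image size, and training hyperparameters are assumptions not stated in the abstract; only the model families (ResNet50, DenseNet, VGG16) and the two-class task come from the paper.

# Sketch only: fine-tune an ImageNet-pretrained ResNet50 on histology patches.
# Assumes patches are stored as patches/<split>/<class>/*.png, with
# class 0 = ameloblastoma and class 1 = ameloblastic carcinoma,
# and a recent torchvision that provides the `weights=` API.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
train_ds = datasets.ImageFolder("patches/train", transform=tfm)
val_ds = datasets.ImageFolder("patches/val", transform=tfm)
train_dl = DataLoader(train_ds, batch_size=32, shuffle=True)
val_dl = DataLoader(val_ds, batch_size=32)

# Replace the 1000-class ImageNet head with a 2-class head (supervised transfer learning).
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(10):  # number of epochs is an assumption
    model.train()
    for x, y in train_dl:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

The same loop applies to DenseNet and VGG16 by swapping the backbone and its classification head; softmax over the two logits yields the per-patch class probabilities.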
RESULTS
The training and validation metrics did not converge, indicating overfitting. However, the test results were satisfactory, with ResNet50 averaging 0.75, 0.71, 0.84, 0.65, and 0.77 for accuracy, precision, sensitivity, specificity, and F1-score, respectively.
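For reference, the reported metrics follow the standard definitions from a binary confusion matrix; the sketch below is illustrative only, since the abstract does not report the raw counts.

# Hypothetical helper: compute the five reported metrics from confusion-matrix counts.
def binary_metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # recall for the positive class
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, precision, sensitivity, specificity, f1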
CONCLUSIONS
The models demonstrated strong learning potential but a lack of generalization ability. They learned fast, reaching a training accuracy of 98%. The evaluation showed instability during validation but acceptable performance during testing, which may be due to the small data set. This first investigation opens an opportunity to expand collaboration to incorporate more complementary data, as well as to develop and evaluate new alternative models.