1
Schwabe D, Becker K, Seyferth M, Klaß A, Schaeffter T. The METRIC-framework for assessing data quality for trustworthy AI in medicine: a systematic review. NPJ Digit Med 2024; 7:203. [PMID: 39097662 PMCID: PMC11297942 DOI: 10.1038/s41746-024-01196-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Accepted: 07/12/2024] [Indexed: 08/05/2024]
Abstract
The adoption of machine learning (ML) and, more specifically, deep learning (DL) applications into all major areas of our lives is underway. The development of trustworthy AI is especially important in medicine due to the large implications for patients' lives. While trustworthiness concerns various aspects including ethical, transparency and safety requirements, we focus on the importance of data quality (training/test) in DL. Since data quality dictates the behaviour of ML products, evaluating data quality will play a key part in the regulatory approval of medical ML products. We perform a systematic review following PRISMA guidelines using the databases Web of Science, PubMed and ACM Digital Library. We identify 5408 studies, out of which 120 records fulfil our eligibility criteria. From this literature, we synthesise the existing knowledge on data quality frameworks and combine it with the perspective of ML applications in medicine. As a result, we propose the METRIC-framework, a specialised data quality framework for medical training data comprising 15 awareness dimensions, along which developers of medical ML applications should investigate the content of a dataset. This knowledge helps to reduce biases as a major source of unfairness, increase robustness, facilitate interpretability and thus lays the foundation for trustworthy AI in medicine. The METRIC-framework may serve as a base for systematically assessing training datasets, establishing reference datasets, and designing test datasets which has the potential to accelerate the approval of medical ML products.
Affiliation(s)
- Daniel Schwabe
  - Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Katinka Becker
  - Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Martin Seyferth
  - Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Andreas Klaß
  - Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Tobias Schaeffter
  - Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
  - Department of Medical Engineering, Technical University Berlin, Berlin, Germany
  - Einstein Centre for Digital Future, Berlin, Germany
2
Keypoint Detection for Injury Identification during Turkey Husbandry Using Neural Networks. SENSORS 2022; 22:s22145188. [PMID: 35890870 PMCID: PMC9319281 DOI: 10.3390/s22145188] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/08/2022] [Revised: 06/24/2022] [Accepted: 07/09/2022] [Indexed: 02/05/2023]
Abstract
Injurious pecking against conspecifics is a serious problem in turkey husbandry. Bloody injuries act as a trigger mechanism to induce further pecking, and timely detection and intervention can prevent massive animal welfare impairments and costly losses. Thus, the overarching aim is to develop a camera-based system to monitor the flock and detect injuries using neural networks. In a preliminary study, images of turkeys were annotated by labelling potential injuries. These were used to train a network for injury detection. Here, we applied a keypoint detection model to provide more information on animal position and indicate injury location. Therefore, seven turkey keypoints were defined, and 244 images (showing 7660 birds) were manually annotated. Two state-of-the-art approaches for pose estimation were adjusted, and their results were compared. Subsequently, a better keypoint detection model (HRNet-W48) was combined with the segmentation model for injury detection. For example, individual injuries were classified using “near tail” or “near head” labels. Summarizing, the keypoint detection showed good results and could clearly differentiate between individual animals even in crowded situations.
3
Stracke J, Andersson R, Volkmann N, Spindler B, Schulte-Landwehr J, Günther R, Kemper N. Footpad Monitoring: Reliability of an Automated System to Assess Footpad Dermatitis in Turkeys (Meleagris gallopavo) During Slaughter. Front Vet Sci 2022; 9:888503. [PMID: 35664852 PMCID: PMC9157434 DOI: 10.3389/fvets.2022.888503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2022] [Accepted: 03/31/2022] [Indexed: 11/28/2022]
Abstract
Footpad dermatitis (FPD) is an indicator of animal welfare in turkeys, giving evidence of the animals' physical integrity and providing information on husbandry management. Automated systems for assessing FPD at slaughter can present a useful tool for objective data collection. However, using automated systems requires that they reliably assess the incidence. In this study, the feet of turkeys were scored for FPD by both an automated camera system and a human observer, using a five-scale score. The observer reliability between both was calculated (Krippendorff's alpha). The results were not acceptable, with an agreement coefficient of 0.44 in the initial situation. Therefore, pictures of 3,000 feet scored by the automated system were evaluated systematically to detect deficiencies. The reference area (metatarsal footpad) was not detected correctly in 55.0% of the feet, and false detections of the alteration on the footpad (FPD) were found in 32.9% of the feet. In 41.3% of the feet, the foot was not presented straight to the camera. According to these results, the algorithm of the automated system was modified, aiming to improve color detection and the distinction of the metatarsal footpad from the background. Pictures of the feet, now scored by the modified algorithm, were evaluated again. Observer reliability could be improved (Krippendorff's alpha = 0.61). However, detection of the metatarsal footpad (50.9% incorrect detections) and alterations (27.0% incorrect detections) remained a problem. We found that the performance of the camera system was affected by the angle at which the foot was presented to the camera (skew/straight; p < 0.05). Furthermore, the laterality of the foot (left/right) was found to have a significant effect (p < 0.001). We propose that the latter depends on the slaughter process. This study also highlights a high variability in observer reliability of human observers. Depending on the respective target parameter, the reliability coefficient (Krippendorff's alpha) ranged from 0.21 to 0.82. This stresses the importance of finding an objective alternative. Therefore, it was concluded that the automated detection system could be appropriate to reliably assess FPD at the slaughterhouse. However, there is still room to improve the existing method, especially when using FPD as a welfare indicator.
Affiliation(s)
- Jenny Stracke
  - Institute of Animal Science, Ethology, University of Bonn, Bonn, Germany
  - Institute for Animal Hygiene, Animal Welfare and Farm Animal Behavior, University of Veterinary Medicine Hannover, Foundation, Hannover, Germany
- Robby Andersson
  - Faculty of Agricultural Sciences and Landscape Architecture, University of Applied Sciences Osnabrück, Osnabrück, Germany
- Nina Volkmann
  - Institute for Animal Hygiene, Animal Welfare and Farm Animal Behavior, University of Veterinary Medicine Hannover, Foundation, Hannover, Germany
  - Science and Innovation for Sustainable Poultry Production (WING), University of Veterinary Medicine Hannover, Foundation, Vechta, Germany
- Birgit Spindler
  - Institute for Animal Hygiene, Animal Welfare and Farm Animal Behavior, University of Veterinary Medicine Hannover, Foundation, Hannover, Germany
  - *Correspondence: Birgit Spindler
- Ronald Günther
  - Heidemark Mästerkreis GmbH u. Co. KG, Haldensleben, Germany
- Nicole Kemper
  - Institute for Animal Hygiene, Animal Welfare and Farm Animal Behavior, University of Veterinary Medicine Hannover, Foundation, Hannover, Germany