1. Ye S, Xu Y, Chen D, Han S, Liao J. Learning a Single Network for Robust Medical Image Segmentation With Noisy Labels. IEEE Transactions on Medical Imaging 2024; 43:3188-3199. [PMID: 38635382 DOI: 10.1109/tmi.2024.3389776]
Abstract
Robust segmentation with noisy labels is an important problem in medical imaging due to the difficulty of acquiring high-quality annotations. Despite their enormous success, recent methods still require multiple networks to construct their frameworks and focus on limited application scenarios, which leads to inflexibility in practical applications. They also do not explicitly consider the coarse boundary label problem, which results in sub-optimal results. To overcome these challenges, we propose a novel Simultaneous Edge Alignment and Memory-Assisted Learning (SEAMAL) framework for noisy-label robust segmentation. It achieves single-network robust learning that is applicable to both 2D and 3D segmentation, in both Set-HQ-knowable and Set-HQ-agnostic scenarios. Specifically, to achieve single-model noise robustness, we design a Memory-assisted Selection and Correction module (MSC) that utilizes predictive history consistency from the Prediction Memory Bank to distinguish between reliable and non-reliable labels pixel-wise, and that updates the reliable ones at the superpixel level. To overcome the coarse boundary label problem, which is common in practice, and to better utilize shape-relevant information at the boundary, we propose an Edge Detection Branch (EDB) that explicitly learns the boundary via an edge detection layer with only a slight additional computational cost, and we improve the sharpness and precision of the boundary with a thinning loss. Extensive experiments verify that SEAMAL significantly outperforms previous works.
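As a concrete illustration of the memory-assisted selection idea, the sketch below flags pixels whose (possibly noisy) label agrees with the network's recent predictions stored in a prediction memory bank. It is a minimal sketch under assumed array shapes; the function name, the 5-epoch window, and the 0.8 agreement threshold are hypothetical and do not reproduce the authors' MSC module.

```python
import numpy as np

def reliable_pixel_mask(memory_bank, labels, k=5, agree_ratio=0.8):
    """Flag pixels whose given label matches the stored predictions
    in at least `agree_ratio` of the last `k` epochs.

    memory_bank : (epochs, H, W) integer array of per-epoch argmax predictions
    labels      : (H, W) integer array of possibly noisy annotations
    returns     : (H, W) boolean array, True where the label is deemed reliable
    """
    recent = memory_bank[-k:]                         # last k stored epochs
    agreement = (recent == labels[None]).mean(axis=0)  # per-pixel agreement rate
    return agreement >= agree_ratio

# toy usage with random prediction history and labels
rng = np.random.default_rng(0)
bank = rng.integers(0, 2, size=(10, 64, 64))
noisy_labels = rng.integers(0, 2, size=(64, 64))
mask = reliable_pixel_mask(bank, noisy_labels)
print("fraction of pixels kept as reliable:", float(mask.mean()))
```

In a full pipeline, the retained pixels would supervise the loss directly while the remaining labels are down-weighted or corrected, for example at the superpixel level as the abstract describes.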
2. Lu X, Cui Z, Sun Y, Guan Khor H, Sun A, Ma L, Chen F, Gao S, Tian Y, Zhou F, Lv Y, Liao H. Better Rough Than Scarce: Proximal Femur Fracture Segmentation With Rough Annotations. IEEE Transactions on Medical Imaging 2024; 43:3240-3252. [PMID: 38652607 DOI: 10.1109/tmi.2024.3392854]
Abstract
Proximal femoral fracture segmentation in computed tomography (CT) is essential for orthopedic surgeons' preoperative planning. Recently, numerous deep learning-based approaches have been proposed for segmenting various structures within CT scans. Nevertheless, distinguishing the attributes of fracture fragments from those of soft-tissue regions in CT scans frequently poses challenges, which have received comparatively limited research attention. Moreover, the cornerstone of contemporary deep learning methodologies is the availability of annotated data, while detailed CT annotations remain scarce. To address this challenge, we propose a novel weakly supervised framework, Rough Turbo Net (RT-Net), for the segmentation of proximal femoral fractures. We emphasize the use of human resources to produce rough annotations on a substantial scale, as opposed to relying on limited fine-grained annotations that demand substantial time to create. In RT-Net, rough annotations impose fractured-region constraints, which have demonstrated significant efficacy in enhancing the accuracy of the network, while fine annotations provide more detail for recognizing edges and soft tissues. In addition, we design a spatial adaptive attention module (SAAM) that adapts to the spatial distribution of the fracture regions and aligns features in each decoder. Moreover, we propose a fine-edge loss, applied through an edge discrimination network, to penalize absent or imprecise edge features. Extensive quantitative and qualitative experiments demonstrate the superiority of RT-Net over state-of-the-art approaches. Furthermore, additional experiments show that RT-Net can produce pseudo-labels for raw CT images that further improve fracture segmentation performance, and it has the potential to improve segmentation performance on public datasets. The code is available at: https://github.com/zyairelu/RT-Net.
3. Pauly R, Alexander Feltus F. Simplified detection of genetic background admixture using artificial intelligence. Clin Genet 2024; 106:247-257. [PMID: 38561851 DOI: 10.1111/cge.14527]
Abstract
Admixture refers to the mixing of genetic ancestry from different populations. Admixture is important for genomic medicine because it can affect how an individual responds to certain medications, how they metabolize drugs, and their susceptibility to certain diseases. For example, some genetic variants associated with drug metabolism and response may be more common in certain populations, and individuals with admixed ancestry may carry these variants at different frequencies than individuals from the ancestral populations. Understanding the patterns of admixture in a population can also help researchers identify new genetic variants associated with diseases or traits and develop more personalized and targeted treatments. In this study, we compared and classified the known and self-reported genetic backgrounds of samples from the 1000 Genomes Project and admixed samples from the GTEx project using supervised, unsupervised, and statistical classification methodologies. We developed a novel tool called Admix-AI that uses a one-dimensional convolutional neural network to understand and classify admixed genetic backgrounds using 213 DNA-marker-based genetic background labels. Admix-AI can be used to discover admixed proportions in samples and ultimately aid personalized genomic medicine by identifying specific biomarker systems. We compared Admix-AI to existing admixture categorization software and found our tool to be computationally faster, with a 2× speedup, and more streamlined to use. Admix-AI is available as open-source code under the GPL version 3.0 license at https://github.com/rpauly/Admix-AI.
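For readers unfamiliar with applying a one-dimensional CNN to genotype data, the sketch below shows a generic Conv1d classifier over a vector of per-marker allele counts. The layer sizes, marker count, and class count are placeholders and do not reflect the actual Admix-AI architecture.

```python
import torch
import torch.nn as nn

class GenotypeCNN(nn.Module):
    """Minimal 1D CNN over a genotype vector (0/1/2 allele counts per marker)."""
    def __init__(self, n_markers=10000, n_classes=26):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):                  # x: (batch, n_markers)
        h = self.features(x.unsqueeze(1))  # add a channel dimension
        return self.classifier(h.squeeze(-1))

model = GenotypeCNN()
dummy = torch.randint(0, 3, (4, 10000)).float()  # 4 samples, 10,000 markers
print(model(dummy).shape)                        # torch.Size([4, 26])
```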
Affiliation(s)
- Rini Pauly
- Biomedical Data Science & Informatics Program, Clemson University, Clemson, South Carolina, USA
- Frank Alexander Feltus
- Biomedical Data Science & Informatics Program, Clemson University, Clemson, South Carolina, USA
- Genetics and Biochemistry Department, Clemson University, Clemson, South Carolina, USA
- Center for Human Genetics, Clemson University, Greenwood, South Carolina, USA
4. Nair A, Alagha MA, Cobb J, Jones G. Assessing the Value of Imaging Data in Machine Learning Models to Predict Patient-Reported Outcome Measures in Knee Osteoarthritis Patients. Bioengineering (Basel) 2024; 11:824. [PMID: 39199782 PMCID: PMC11351307 DOI: 10.3390/bioengineering11080824]
Abstract
Knee osteoarthritis (OA) affects over 650 million patients worldwide. Total knee replacement is offered for end-stage OA to relieve symptoms of pain, stiffness, and reduced mobility. However, the role of imaging modalities in monitoring symptomatic disease progression remains unclear. This study aimed to compare machine learning (ML) models, with and without imaging features, in predicting the two-year Western Ontario and McMaster Universities Arthritis Index (WOMAC) score for knee OA patients. We included 2408 patients from the Osteoarthritis Initiative (OAI) database and 629 patients from the Multicenter Osteoarthritis Study (MOST) database. The clinical dataset included 18 clinical features, while the imaging dataset contained an additional 10 imaging features. The Minimal Clinically Important Difference (MCID) was set to 24, reflecting meaningful physical impairment. Clinical and imaging dataset models produced similar area under the curve (AUC) scores, with differences in performance of less than 0.025 AUC. For both the clinical and imaging datasets, Gradient Boosting Machine (GBM) models performed best in the external validation, with clinically acceptable AUCs of 0.734 (95% CI 0.687-0.781) and 0.747 (95% CI 0.701-0.792), respectively. The five features identified included educational background, family history of osteoarthritis, co-morbidities, use of osteoporosis medications, and previous knee procedures. This is the first study to demonstrate that ML models achieve comparable performance with and without imaging features.
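The kind of evaluation described here can be illustrated with a small, self-contained sketch: dichotomise the outcome at the stated MCID of 24 and score a gradient-boosting classifier by AUC. The feature matrix and outcomes below are synthetic placeholders rather than OAI or MOST data, and the pipeline is not the authors' exact setup.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2408, 18))          # placeholder for 18 clinical features
womac = rng.uniform(0, 96, size=2408)    # placeholder two-year WOMAC scores
y = (womac >= 24).astype(int)            # dichotomise at the assumed MCID of 24

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
gbm = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, gbm.predict_proba(X_te)[:, 1])
print(f"AUC: {auc:.3f}")  # close to 0.5 here because the features are random
```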
Affiliation(s)
- Abhinav Nair
- MSk Lab, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, UK
- M. Abdulhadi Alagha
- MSk Lab, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, UK
- Data Science Institute, London School of Economics and Political Science, London, UK
- Justin Cobb
- MSk Lab, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, UK
- Gareth Jones
- MSk Lab, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, UK
5. Katsumata E, Ranjan AK, Tashima Y, Takahata T, Sato T, Kobayashi M, Ishii M, Takahashi T, Oda A, Hirano M, Hakamata Y, Sugai K, Kobayashi E. A neural cell automated analysis system based on pathological specimens in a gerbil brain ischemia model. Acta Cir Bras 2024; 39:e394224. [PMID: 39140525 PMCID: PMC11321503 DOI: 10.1590/acb394224]
Abstract
PURPOSE Amid rising health awareness, natural products, which have milder effects than medical drugs, are becoming popular. However, few systems can quantitatively assess their impact on living organisms. Therefore, we developed a deep-learning system to automate the counting of cells in a gerbil model, aiming to assess a natural product's effectiveness against ischemia. METHODS Images acquired from paraffin blocks containing gerbil brains were analyzed by a deep-learning model (fine-tuned Detectron2). RESULTS The counting system achieved a positive predictive value of 79% and a sensitivity of 85% when visual judgment by an expert was used as ground truth. CONCLUSIONS Our system evaluated hydrogen water's potential against ischemia and found it potentially useful, which is consistent with expert assessment. Because natural products have milder effects, large datasets are needed for evaluation, making manual measurement labor-intensive. Hence, our system offers a promising new approach for evaluating natural products.
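The reported positive predictive value and sensitivity follow directly from matched detection counts. The sketch below shows the computation; the counts are illustrative values chosen only to land near 79% and 85%, not the study's actual tallies.

```python
def detection_metrics(true_positives, false_positives, false_negatives):
    """Positive predictive value and sensitivity from matched detection counts."""
    ppv = true_positives / (true_positives + false_positives)
    sensitivity = true_positives / (true_positives + false_negatives)
    return ppv, sensitivity

ppv, sens = detection_metrics(true_positives=850, false_positives=226, false_negatives=150)
print(f"PPV {ppv:.2f}, sensitivity {sens:.2f}")  # PPV 0.79, sensitivity 0.85
```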
Affiliation(s)
- Asahi Oda
- Nippon Veterinary and Life Science University – School of Veterinary Nursing and Technology – Department of Basic Science – Tokyo, Japan
- Momoko Hirano
- Nippon Veterinary and Life Science University – School of Veterinary Nursing and Technology – Department of Basic Science – Tokyo, Japan
- Yoji Hakamata
- Nippon Veterinary and Life Science University – School of Veterinary Nursing and Technology – Department of Basic Science – Tokyo, Japan
- Kazuhisa Sugai
- Nippon Veterinary and Life Science University – School of Veterinary Nursing and Technology – Department of Basic Science – Tokyo, Japan
- Eiji Kobayashi
- Nippon Veterinary and Life Science University – School of Veterinary Nursing and Technology – Department of Basic Science – Tokyo, Japan
- Kobayashi Regenerative Research Institute – Wakayama, Japan
6. Berisha V, Liss JM. Responsible development of clinical speech AI: Bridging the gap between clinical research and technology. NPJ Digit Med 2024; 7:208. [PMID: 39122889 PMCID: PMC11316053 DOI: 10.1038/s41746-024-01199-1]
Abstract
This perspective article explores the challenges and potential of using speech as a biomarker in clinical settings, particularly when constrained by the small clinical datasets typically available in such contexts. We contend that by integrating insights from speech science and clinical research, we can reduce sample complexity in clinical speech AI models with the potential to decrease timelines to translation. Most existing models are based on high-dimensional feature representations trained with limited sample sizes and often do not leverage insights from speech science and clinical research. This approach can lead to overfitting, where the models perform exceptionally well on training data but fail to generalize to new, unseen data. Additionally, without incorporating theoretical knowledge, these models may lack interpretability and robustness, making them challenging to troubleshoot or improve post-deployment. We propose a framework for organizing health conditions based on their impact on speech and promote the use of speech analytics in diverse clinical contexts beyond cross-sectional classification. For high-stakes clinical use cases, we advocate for a focus on explainable and individually-validated measures and stress the importance of rigorous validation frameworks and ethical considerations for responsible deployment. Bridging the gap between AI research and clinical speech research presents new opportunities for more efficient translation of speech-based AI tools and advancement of scientific discoveries in this interdisciplinary space, particularly if limited to small or retrospective datasets.
Affiliation(s)
- Visar Berisha
- School of Electrical Computer and Energy Engineering and College of Health Solutions, Arizona State University, Tempe, AZ, USA.
- Julie M Liss
- College of Health Solutions, Arizona State University, Tempe, AZ, USA
7. Ahmadi A, Courtney M, Ren C, Ingalls B. A benchmarked comparison of software packages for time-lapse image processing of monolayer bacterial population dynamics. Microbiol Spectr 2024; 12:e0003224. [PMID: 38980028 PMCID: PMC11302142 DOI: 10.1128/spectrum.00032-24]
Abstract
Time-lapse microscopy offers a powerful approach for analyzing cellular activity. In particular, this technique is valuable for assessing the behavior of bacterial populations, which can exhibit growth and intercellular interactions in a monolayer. Such time-lapse imaging typically generates large quantities of data, limiting the options for manual investigation. Several image-processing software packages have been developed to facilitate analysis. It can thus be a challenge to identify the software package best suited to a particular research goal. Here, we compare four software packages that support the analysis of 2D time-lapse images of cellular populations: CellProfiler, SuperSegger-Omnipose, DeLTA, and FAST. We compare their performance against benchmarked results on time-lapse observations of Escherichia coli populations. Performance varies across the packages, with each of the four outperforming the others in at least one aspect of the analysis. Not surprisingly, the packages that have been in development for longer showed the strongest performance. We found that deep learning-based approaches to object segmentation outperformed traditional approaches, but the opposite was true for frame-to-frame object tracking. We offer these comparisons, together with insight into usability, computational efficiency, and feature availability, as a guide to researchers seeking image-processing solutions. IMPORTANCE Time-lapse microscopy provides a detailed window into the world of bacterial behavior. However, the vast amount of data produced by these techniques is difficult to analyze manually. We have analyzed four software tools designed to process such data and compared their performance, using populations of commonly studied bacterial species as our test subjects. Our findings offer a roadmap to scientists, helping them choose the right tool for their research. This comparison bridges a gap between microbiology and computational analysis, streamlining research efforts.
Affiliation(s)
- Atiyeh Ahmadi
- Department of Biology, University of Waterloo, Waterloo, Ontario, Canada
- Matthew Courtney
- Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, Ontario, Canada
- Carolyn Ren
- Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, Ontario, Canada
- Brian Ingalls
- Department of Biology, University of Waterloo, Waterloo, Ontario, Canada
- Department of Applied Mathematics, University of Waterloo, Waterloo, Ontario, Canada
8. Schwabe D, Becker K, Seyferth M, Klaß A, Schaeffter T. The METRIC-framework for assessing data quality for trustworthy AI in medicine: a systematic review. NPJ Digit Med 2024; 7:203. [PMID: 39097662 PMCID: PMC11297942 DOI: 10.1038/s41746-024-01196-4]
Abstract
The adoption of machine learning (ML) and, more specifically, deep learning (DL) applications into all major areas of our lives is underway. The development of trustworthy AI is especially important in medicine due to the large implications for patients' lives. While trustworthiness concerns various aspects including ethical, transparency and safety requirements, we focus on the importance of data quality (training/test) in DL. Since data quality dictates the behaviour of ML products, evaluating data quality will play a key part in the regulatory approval of medical ML products. We perform a systematic review following PRISMA guidelines using the databases Web of Science, PubMed and ACM Digital Library. We identify 5408 studies, out of which 120 records fulfil our eligibility criteria. From this literature, we synthesise the existing knowledge on data quality frameworks and combine it with the perspective of ML applications in medicine. As a result, we propose the METRIC-framework, a specialised data quality framework for medical training data comprising 15 awareness dimensions, along which developers of medical ML applications should investigate the content of a dataset. This knowledge helps to reduce biases as a major source of unfairness, increase robustness, facilitate interpretability and thus lays the foundation for trustworthy AI in medicine. The METRIC-framework may serve as a base for systematically assessing training datasets, establishing reference datasets, and designing test datasets which has the potential to accelerate the approval of medical ML products.
Affiliation(s)
- Daniel Schwabe
- Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany.
- Katinka Becker
- Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Martin Seyferth
- Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Andreas Klaß
- Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Tobias Schaeffter
- Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Department of Medical Engineering, Technical University Berlin, Berlin, Germany
- Einstein Centre for Digital Future, Berlin, Germany
9. Wagstyl K, Kobow K, Casillas-Espinosa PM, Cole AJ, Jiménez-Jiménez D, Nariai H, Baulac S, O'Brien T, Henshall DC, Akman O, Sankar R, Galanopoulou AS, Auvin S. WONOEP 2022: Neurotechnology for the diagnosis of epilepsy. Epilepsia 2024; 65:2238-2247. [PMID: 38829313 DOI: 10.1111/epi.18028]
Abstract
Epilepsy's myriad causes and clinical presentations ensure that accurate diagnoses and targeted treatments remain a challenge. Advanced neurotechnologies are needed to better characterize individual patients across multiple modalities and analytical techniques. At the XVIth Workshop on Neurobiology of Epilepsy: Early Onset Epilepsies: Neurobiology and Novel Therapeutic Strategies (WONOEP 2022), the session on "advanced tools" highlighted a range of approaches, from molecular phenotyping of genetic epilepsy models and resected tissue samples to imaging-guided localization of epileptogenic tissue for surgical resection of focal malformations. These tools integrate cutting edge research, clinical data acquisition, and advanced computational methods to leverage the rich information contained within increasingly large datasets. A number of common challenges and opportunities emerged, including the need for multidisciplinary collaboration, multimodal integration, potential ethical challenges, and the multistage path to clinical translation. Despite these challenges, advanced epilepsy neurotechnologies offer the potential to improve our understanding of the underlying causes of epilepsy and our capacity to provide patient-specific treatment.
Affiliation(s)
- Konrad Wagstyl
- School of Biomedical Engineering & Imaging Science, King's College London, London, UK
- Developmental Neurosciences, UCL Great Ormond Street Institute of Child Health, UCL, London, UK
- Katja Kobow
- Institute of Neuropathology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Pablo M Casillas-Espinosa
- Department of Medicine, Royal Melbourne Hospital, University of Melbourne, Parkville, Victoria, Australia
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
- Department of Neurology, Alfred Hospital, Melbourne, Victoria, Australia
- Andrew J Cole
- MGH Epilepsy Service, Division of Clinical Neurophysiology, Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Diego Jiménez-Jiménez
- Department of Basic and Clinical Neuroscience, Institute of Psychiatry, Psychology, and Neuroscience, King's College London, London, UK
- Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London, UK
- Hiroki Nariai
- Division of Pediatric Neurology, Department of Pediatrics, UCLA Medical Center, Los Angeles, California, USA
- Stéphanie Baulac
- Institut du Cerveau-Paris Brain Institute-ICM, INSERM, CNRS, Sorbonne Université, Paris, France
- Terence O'Brien
- Department of Medicine, Royal Melbourne Hospital, University of Melbourne, Parkville, Victoria, Australia
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
- Department of Neurology, Alfred Hospital, Melbourne, Victoria, Australia
- David C Henshall
- FutureNeuro SFI Research Centre, RCSI University of Medicine and Health Sciences, Dublin, Ireland
- Department of Physiology and Medical Physics, RCSI University of Medicine and Health Sciences, Dublin, Ireland
- Ozlem Akman
- Department of Physiology, Faculty of Medicine, Demiroglu Bilim University, Istanbul, Turkey
- Raman Sankar
- Division of Pediatric Neurology, Department of Pediatrics, UCLA Mattel Children's Hospital, David Geffen School of Medicine, Los Angeles, California, USA
- UCLA Children's Discovery and Innovation Institute, Los Angeles, California, USA
- Aristea S Galanopoulou
- Saul R. Korey Department of Neurology, Isabelle Rapin Division of Child Neurology, Laboratory of Developmental Epilepsy, Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, USA
- Stéphane Auvin
- Université Paris-Cité, INSERM NeuroDiderot, Paris, France
- Pediatric Neurology Department, APHP, Robert Debré University Hospital, CRMR Epilepsies Rares, EpiCARE member, Paris, France
- Institut Universitaire de France, Paris, France
10. Kim H, Lim S, Park M, Kim K, Kang SH, Lee Y. Optimization of Fast Non-Local Means Noise Reduction Algorithm Parameter in Computed Tomographic Phantom Images Using 3D Printing Technology. Diagnostics (Basel) 2024; 14:1589. [PMID: 39125465 PMCID: PMC11312005 DOI: 10.3390/diagnostics14151589]
Abstract
Noise is inevitably generated in computed tomography (CT), which lowers the accuracy of disease diagnosis. The non-local means approach, a software technique for reducing noise, is widely used in medical imaging. In this study, we propose a noise reduction algorithm based on fast non-local means (FNLMs) and apply it to CT images of a phantom created using 3D printing technology. The self-produced phantom was manufactured using filaments with a density similar to human brain tissue. To quantitatively evaluate image quality, the contrast-to-noise ratio (CNR), coefficient of variation (COV), and normalized noise power spectrum (NNPS) were calculated. The results demonstrate that the optimized smoothing factors of FNLMs are 0.08, 0.16, 0.22, 0.25, and 0.32 at noise intensities of 0.001, 0.005, 0.01, 0.05, and 0.1, respectively. In addition, we compared the optimized FNLMs with the noisy images, local filters, and total variation algorithms. FNLMs showed superior performance compared to these denoising techniques. In particular, compared to the noisy images, the optimized FNLMs improved the CNR by 6.53 to 16.34 times, the COV by 6.55 to 18.28 times, and the NNPS by 10⁻² mm² on average. In conclusion, our approach shows significant potential for enhancing CT image quality with anthropomorphic phantoms, thus addressing the noise issue and improving diagnostic accuracy.
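A rough sense of the workflow can be had with scikit-image's fast non-local means, whose h argument plays the role of the smoothing factor, combined with simple CNR and COV measurements on regions of interest. The synthetic image, noise level, and ROI coordinates below are placeholders; this is not the authors' implementation or phantom data.

```python
import numpy as np
from skimage.restoration import denoise_nl_means, estimate_sigma

def cnr(signal_roi, background_roi):
    """Contrast-to-noise ratio between two regions of interest."""
    return abs(signal_roi.mean() - background_roi.mean()) / background_roi.std()

def cov(roi):
    """Coefficient of variation within a region of interest."""
    return roi.std() / roi.mean()

rng = np.random.default_rng(0)
image = np.clip(rng.normal(0.5, 0.05, (256, 256)), 0, 1)  # stand-in for a CT slice
image[40:110, 40:110] += 0.2                              # brighter "signal" region
sigma = float(np.mean(estimate_sigma(image)))
denoised = denoise_nl_means(image, h=0.08, sigma=sigma,
                            patch_size=7, patch_distance=11, fast_mode=True)

roi_a, roi_b = denoised[50:100, 50:100], denoised[150:200, 150:200]
print(f"CNR {cnr(roi_a, roi_b):.2f}, COV {cov(roi_a):.4f}")
```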
Affiliation(s)
- Hajin Kim
- Department of Health Science, General Graduate School of Gachon University, 191, Hambakmoe-ro, Yeonsu-gu, Incheon 21936, Republic of Korea; (H.K.); (S.L.); (M.P.)
- Sewon Lim
- Department of Health Science, General Graduate School of Gachon University, 191, Hambakmoe-ro, Yeonsu-gu, Incheon 21936, Republic of Korea; (H.K.); (S.L.); (M.P.)
- Minji Park
- Department of Health Science, General Graduate School of Gachon University, 191, Hambakmoe-ro, Yeonsu-gu, Incheon 21936, Republic of Korea; (H.K.); (S.L.); (M.P.)
- Kyuseok Kim
- Department of Biomedical Engineering, Eulji University, 553, Sanseong-daero, Sujeong-gu, Seongnam-si 13135, Republic of Korea;
- Seong-Hyeon Kang
- Department of Biomedical Engineering, Eulji University, 553, Sanseong-daero, Sujeong-gu, Seongnam-si 13135, Republic of Korea;
- Youngjin Lee
- Department of Radiological Science, Gachon University, 191, Hambakmoe-ro, Yeonsu-gu, Incheon 21936, Republic of Korea
11. Gao Y, Fu J, Guo Y, Wang Y. G-T correcting: an improved training of image segmentation under noisy labels. Med Biol Eng Comput 2024. [PMID: 39031327 DOI: 10.1007/s11517-024-03170-4]
Abstract
Data-driven medical image segmentation networks require expert annotations, which are hard to obtain. Non-expert annotations are often used instead, but these can be inaccurate (referred to as "noisy labels"), misleading the network's training and causing a decline in segmentation performance. In this study, we focus on improving the segmentation performance of neural networks when trained with noisy annotations. Specifically, we propose a two-stage framework named "G-T correcting," consisting of a "G" stage for recognizing noisy labels and a "T" stage for correcting them. In the "G" stage, a positive feedback method is proposed to automatically recognize noisy samples, using a Gaussian mixture model to classify clean and noisy labels through the per-sample loss histogram. In the "T" stage, a confident correcting strategy and an early learning strategy are adopted to allow the segmentation network to receive productive guidance from noisy labels. Experiments on simulated and real-world noisy labels show that this method can achieve over 90% accuracy in recognizing noisy labels and improve the network's Dice coefficient to 91%. The results demonstrate that the proposed method can enhance the segmentation performance of networks trained with noisy labels, indicating good prospects for clinical application.
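The "G" stage's use of a Gaussian mixture model over per-sample losses can be sketched with scikit-learn: fit two components to the loss values and keep the samples assigned to the low-loss component as clean. This is a generic sketch with toy losses and an assumed 0.5 posterior threshold, not the authors' code.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def split_clean_noisy(per_sample_loss, threshold=0.5):
    """Return a boolean mask of samples deemed clean, based on the posterior
    probability of the low-mean component of a 2-component Gaussian mixture."""
    losses = np.asarray(per_sample_loss).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(losses)
    clean_comp = int(np.argmin(gmm.means_.ravel()))   # low-mean = clean component
    p_clean = gmm.predict_proba(losses)[:, clean_comp]
    return p_clean > threshold

# toy losses: 80 clean (small loss) and 20 noisy (large loss) samples
rng = np.random.default_rng(0)
losses = np.concatenate([rng.normal(0.2, 0.05, 80), rng.normal(1.0, 0.2, 20)])
mask = split_clean_noisy(losses)
print("samples flagged clean:", int(mask.sum()), "of", len(losses))
```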
Affiliation(s)
- Yun Gao
- School of Information Science and Technology of Fudan University, 220 Handan Rd, Shanghai, 200433, China
- Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China
- Junhu Fu
- School of Information Science and Technology of Fudan University, 220 Handan Rd, Shanghai, 200433, China
- Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China
- Yi Guo
- School of Information Science and Technology of Fudan University, 220 Handan Rd, Shanghai, 200433, China.
- Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China.
- Yuanyuan Wang
- School of Information Science and Technology of Fudan University, 220 Handan Rd, Shanghai, 200433, China.
- Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI) of Shanghai, Shanghai, 200032, China.
12. Kim JY, Ryu WS, Kim D, Kim EY. Better performance of deep learning pulmonary nodule detection using chest radiography with pixel level labels in reference to computed tomography: data quality matters. Sci Rep 2024; 14:15967. [PMID: 38987309 PMCID: PMC11237128 DOI: 10.1038/s41598-024-66530-y]
Abstract
Labeling errors can significantly impact the performance of deep learning models used for screening chest radiographs. Deep learning models for detecting pulmonary nodules are particularly vulnerable to such errors, mainly because normal chest radiographs and those with nodules obscured by ribs appear similar. Thus, high-quality datasets referenced to chest computed tomography (CT) are required to prevent the misclassification of nodular chest radiographs as normal. From this perspective, a deep learning strategy employing chest radiography data with pixel-level annotations referencing chest CT scans may improve nodule detection and localization compared to image-level labels. We trained models using a National Institutes of Health (NIH) chest radiograph-based labeling dataset and an AI-HUB CT-based labeling dataset, employing a DenseNet architecture with squeeze-and-excitation blocks. We developed four models to assess whether CT versus chest radiography and pixel-level versus image-level labeling would improve the deep learning model's ability to detect nodules. The models' performance was evaluated using two external validation datasets. The AI-HUB dataset with image-level labeling outperformed the NIH dataset (AUC 0.88 vs. 0.71 and 0.78 vs. 0.73 in the two external datasets, respectively; both p < 0.001). However, the AI-HUB data annotated at the pixel level produced the best model (AUC 0.91 and 0.86 in the external datasets), and in terms of nodule localization, it significantly outperformed models trained with image-level annotation data, with Dice coefficients ranging from 0.36 to 0.58. Our findings underscore the importance of accurately labeled data in developing reliable deep learning algorithms for nodule detection in chest radiography.
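For reference, the Dice coefficient used here to score nodule localization is a simple overlap measure over binary masks; the toy masks below are arbitrary.

```python
import numpy as np

def dice_coefficient(pred_mask, true_mask, eps=1e-7):
    """Dice overlap between two binary masks."""
    pred, true = pred_mask.astype(bool), true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)

a = np.zeros((128, 128), dtype=bool); a[40:80, 40:80] = True
b = np.zeros((128, 128), dtype=bool); b[50:90, 50:90] = True
print(f"Dice: {dice_coefficient(a, b):.2f}")
```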
Affiliation(s)
- Jae Yong Kim
- Artificial Intelligence Research Center, JLK Inc., 5 Teheran-ro 33-gil, Seoul, Republic of Korea
- Wi-Sun Ryu
- Artificial Intelligence Research Center, JLK Inc., 5 Teheran-ro 33-gil, Seoul, Republic of Korea.
- Dongmin Kim
- Artificial Intelligence Research Center, JLK Inc., 5 Teheran-ro 33-gil, Seoul, Republic of Korea
- Eun Young Kim
- Department of Radiology, Incheon Sejong Hospital, 20, Gyeyangmunhwa-ro, Gyeyang-gu, Incheon, 21080, Republic of Korea.
13. Butt MA, Kaleem MF, Bilal M, Hanif MS. Using multi-label ensemble CNN classifiers to mitigate labelling inconsistencies in patch-level Gleason grading. PLoS One 2024; 19:e0304847. [PMID: 38968206 PMCID: PMC11226137 DOI: 10.1371/journal.pone.0304847]
Abstract
This paper presents a novel approach to enhance the accuracy of patch-level Gleason grading in prostate histopathology images, a critical task in the diagnosis and prognosis of prostate cancer. This study shows that the Gleason grading accuracy can be improved by addressing the prevalent issue of label inconsistencies in the SICAPv2 prostate dataset, which employs a majority voting scheme for patch-level labels. We propose a multi-label ensemble deep-learning classifier that effectively mitigates these inconsistencies and yields more accurate results than the state-of-the-art works. Specifically, our approach leverages the strengths of three different one-vs-all deep learning models in an ensemble to learn diverse features from the histopathology images to individually indicate the presence of one or more Gleason grades (G3, G4, and G5) in each patch. These deep learning models have been trained using transfer learning to fine-tune a variant of the ResNet18 CNN classifier chosen after an extensive ablation study. Experimental results demonstrate that our multi-label ensemble classifier significantly outperforms traditional single-label classifiers reported in the literature by at least 14% and 4% on accuracy and f1-score metrics respectively. These results underscore the potential of our proposed machine learning approach to improve the accuracy and consistency of prostate cancer grading.
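The ensemble's decision rule can be summarized as thresholding each one-vs-all model's sigmoid output independently to obtain a per-patch multi-label prediction over G3, G4, and G5. The sketch below assumes three pre-computed probability vectors and a 0.5 threshold; both are illustrative and not necessarily the authors' exact rule.

```python
import numpy as np

def ensemble_multilabel(probs_g3, probs_g4, probs_g5, threshold=0.5):
    """Combine the sigmoid outputs of three one-vs-all models into a
    per-patch multi-label prediction over Gleason grades G3, G4 and G5."""
    probs = np.stack([probs_g3, probs_g4, probs_g5], axis=1)  # (n_patches, 3)
    return probs >= threshold                                 # boolean label matrix

# toy outputs for four patches from three hypothetical binary classifiers
p3 = np.array([0.90, 0.20, 0.10, 0.70])
p4 = np.array([0.40, 0.80, 0.10, 0.60])
p5 = np.array([0.05, 0.30, 0.02, 0.55])
print(ensemble_multilabel(p3, p4, p5).astype(int))
```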
Affiliation(s)
- Muhammad Asim Butt
- Department of Electrical Engineering, University of Management and Technology, Lahore, Pakistan
- Muhammad Bilal
- Center of Excellence in Intelligent Engineering Systems, King Abdulaziz University, Jeddah, Saudi Arabia
- Department of Electrical and Computer Engineering, King Abdulaziz University, Jeddah, Saudi Arabia
- Muhammad Shehzad Hanif
- Center of Excellence in Intelligent Engineering Systems, King Abdulaziz University, Jeddah, Saudi Arabia
- Department of Electrical and Computer Engineering, King Abdulaziz University, Jeddah, Saudi Arabia
14. Misera L, Müller-Franzes G, Truhn D, Kather JN. Weakly Supervised Deep Learning in Radiology. Radiology 2024; 312:e232085. [PMID: 39041937 DOI: 10.1148/radiol.232085]
Abstract
Deep learning (DL) is currently the standard artificial intelligence tool for computer-based image analysis in radiology. Traditionally, DL models have been trained with strongly supervised learning methods. These methods depend on reference standard labels, typically applied manually by experts. In contrast, weakly supervised learning is more scalable. Weak supervision comprises situations in which only a portion of the data are labeled (incomplete supervision), labels refer to a whole region or case as opposed to a precisely delineated image region (inexact supervision), or labels contain errors (inaccurate supervision). In many applications, weak labels are sufficient to train useful models. Thus, weakly supervised learning can unlock a large amount of otherwise unusable data for training DL models. One example of this is using large language models to automatically extract weak labels from free-text radiology reports. Here, we outline the key concepts in weakly supervised learning and provide an overview of applications in radiologic image analysis. With more fundamental and clinical translational work, weakly supervised learning could facilitate the uptake of DL in radiology and research workflows by enabling large-scale image analysis and advancing the development of new DL-based biomarkers.
Affiliation(s)
- Leo Misera
- From the Institute and Polyclinic for Diagnostic and Interventional Radiology (L.M.), Else Kröner Fresenius Center for Digital Health (L.M., J.N.K.), and Department of Medicine I (J.N.K.), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Fetscherstrasse 74, 01307 Dresden, Germany; Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany (G.M.F., D.T.); and Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany (J.N.K.)
- Gustav Müller-Franzes
- From the Institute and Polyclinic for Diagnostic and Interventional Radiology (L.M.), Else Kröner Fresenius Center for Digital Health (L.M., J.N.K.), and Department of Medicine I (J.N.K.), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Fetscherstrasse 74, 01307 Dresden, Germany; Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany (G.M.F., D.T.); and Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany (J.N.K.)
- Daniel Truhn
- From the Institute and Polyclinic for Diagnostic and Interventional Radiology (L.M.), Else Kröner Fresenius Center for Digital Health (L.M., J.N.K.), and Department of Medicine I (J.N.K.), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Fetscherstrasse 74, 01307 Dresden, Germany; Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany (G.M.F., D.T.); and Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany (J.N.K.)
- Jakob Nikolas Kather
- From the Institute and Polyclinic for Diagnostic and Interventional Radiology (L.M.), Else Kröner Fresenius Center for Digital Health (L.M., J.N.K.), and Department of Medicine I (J.N.K.), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Fetscherstrasse 74, 01307 Dresden, Germany; Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany (G.M.F., D.T.); and Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany (J.N.K.)
15. Guan H, Yap PT, Bozoki A, Liu M. Federated learning for medical image analysis: A survey. Pattern Recognition 2024; 151:110424. [PMID: 38559674 PMCID: PMC10976951 DOI: 10.1016/j.patcog.2024.110424]
Abstract
Machine learning in medical imaging often faces a fundamental dilemma, namely, the small sample size problem. Many recent studies suggest using multi-domain data pooled from different acquisition sites/centers to improve statistical power. However, medical images from different sites cannot be easily shared to build large datasets for model training due to privacy protection reasons. As a promising solution, federated learning, which enables collaborative training of machine learning models based on data from different sites without cross-site data sharing, has attracted considerable attention recently. In this paper, we conduct a comprehensive survey of the recent development of federated learning methods in medical image analysis. We have systematically gathered research papers on federated learning and its applications in medical image analysis published between 2017 and 2023. Our search and compilation were conducted using databases from IEEE Xplore, ACM Digital Library, Science Direct, Springer Link, Web of Science, Google Scholar, and PubMed. In this survey, we first introduce the background of federated learning for dealing with privacy protection and collaborative learning issues. We then present a comprehensive review of recent advances in federated learning methods for medical image analysis. Specifically, existing methods are categorized based on three critical aspects of a federated learning system, including client end, server end, and communication techniques. In each category, we summarize the existing federated learning methods according to specific research problems in medical image analysis and also provide insights into the motivations of different approaches. In addition, we provide a review of existing benchmark medical imaging datasets and software platforms for current federated learning research. We also conduct an experimental study to empirically evaluate typical federated learning methods for medical image analysis. This survey can help to better understand the current research status, challenges, and potential research opportunities in this promising research field.
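Many of the server-side methods covered by such surveys build on federated averaging (FedAvg), in which the server forms a dataset-size-weighted average of the clients' model weights without ever seeing their images. The following is a minimal sketch of that aggregation step, assuming PyTorch state_dicts with identical keys; client selection, communication, and privacy mechanisms are omitted.

```python
from collections import OrderedDict
import torch

def fedavg(client_states, client_sizes):
    """Weighted average of client state_dicts, weighted by local dataset size."""
    total = float(sum(client_sizes))
    averaged = OrderedDict()
    for key in client_states[0]:
        averaged[key] = sum(state[key].float() * (n / total)
                            for state, n in zip(client_states, client_sizes))
    return averaged

# toy round with two "clients" sharing a tiny linear model
client_a, client_b = torch.nn.Linear(4, 2), torch.nn.Linear(4, 2)
global_state = fedavg([client_a.state_dict(), client_b.state_dict()],
                      client_sizes=[120, 80])
server_model = torch.nn.Linear(4, 2)
server_model.load_state_dict(global_state)
```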
Affiliation(s)
- Hao Guan
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Pew-Thian Yap
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Andrea Bozoki
- Department of Neurology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Mingxia Liu
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
16. Shi J, Zhang K, Guo C, Yang Y, Xu Y, Wu J. A survey of label-noise deep learning for medical image analysis. Med Image Anal 2024; 95:103166. [PMID: 38613918 DOI: 10.1016/j.media.2024.103166]
Abstract
Several factors are associated with the success of deep learning. One of the most important is the availability of large-scale datasets with clean annotations. However, obtaining datasets with accurate labels in the medical imaging domain is challenging: the reliability and consistency of medical labeling are difficult to guarantee, and low-quality annotations with label noise usually exist. Because noisy labels reduce the generalization performance of deep neural networks, learning with noisy labels is becoming an essential task in medical image analysis. Literature on this topic has expanded in terms of volume and scope. However, no recent surveys have collected and organized this knowledge, impeding the ability of researchers and practitioners to utilize it. In this work, we presented an up-to-date survey of label-noise learning for the medical imaging domain. We reviewed extensive literature, illustrated some typical methods, and showed unified taxonomies in terms of methodological differences. Subsequently, we conducted a methodological comparison and demonstrated the corresponding advantages and disadvantages. Finally, we discussed new research directions based on the characteristics of medical images. Our survey aims to provide researchers and practitioners with a solid understanding of existing medical label-noise learning, such as the main algorithms developed over the past few years, which could help them investigate new methods to combat the negative effects of label noise.
Affiliation(s)
- Jialin Shi
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China.
- Kailai Zhang
- Department of Networks, China Mobile Communications Group Co., Ltd., Beijing, China
- Chenyi Guo
- Department of Electronic Engineering, Tsinghua University, Beijing, China
- Yali Xu
- Department of Breast Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Ji Wu
- Department of Electronic Engineering, Tsinghua University, Beijing, China
17. Wang H, Jin Q, Li S, Liu S, Wang M, Song Z. A comprehensive survey on deep active learning in medical image analysis. Med Image Anal 2024; 95:103201. [PMID: 38776841 DOI: 10.1016/j.media.2024.103201]
Abstract
Deep learning has achieved widespread success in medical image analysis, leading to an increasing demand for large-scale expert-annotated medical image datasets. Yet, the high cost of annotating medical images severely hampers the development of deep learning in this field. To reduce annotation costs, active learning aims to select the most informative samples for annotation and train high-performance models with as few labeled samples as possible. In this survey, we review the core methods of active learning, including the evaluation of informativeness and the sampling strategy. For the first time, we provide a detailed summary of the integration of active learning with other label-efficient techniques, such as semi-supervised and self-supervised learning. We also summarize active learning works that are specifically tailored to medical image analysis. Additionally, we conduct a thorough experimental comparison of the performance of different active learning methods in medical image analysis. Finally, we offer our perspectives on the future trends and challenges of active learning and its applications in medical image analysis. An accompanying paper list and code for the comparative analysis are available at https://github.com/LightersWang/Awesome-Active-Learning-for-Medical-Image-Analysis.
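A common informativeness criterion reviewed in such surveys is predictive entropy: the unlabeled samples on which the model is most uncertain are sent for annotation first. The sketch below shows that selection step on a toy probability matrix; it is not tied to any specific method compared in the survey.

```python
import numpy as np

def select_most_uncertain(softmax_probs, budget):
    """Indices of the `budget` samples with the highest predictive entropy."""
    eps = 1e-12
    entropy = -(softmax_probs * np.log(softmax_probs + eps)).sum(axis=1)
    return np.argsort(entropy)[::-1][:budget]

probs = np.array([[0.98, 0.01, 0.01],   # confident -> low entropy
                  [0.34, 0.33, 0.33],   # uncertain -> high entropy
                  [0.70, 0.20, 0.10]])
print(select_most_uncertain(probs, budget=2))  # [1 2]
```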
Affiliation(s)
- Haoran Wang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China
- Qiuye Jin
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Shiman Li
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China
- Siyu Liu
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China
- Manning Wang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China.
- Zhijian Song
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China.
18. Hochreuter KM, Ren J, Nijkamp J, Korreman SS, Lukacova S, Kallehauge JF, Trip AK. The effect of editing clinical contours on deep-learning segmentation accuracy of the gross tumor volume in glioblastoma. Phys Imaging Radiat Oncol 2024; 31:100620. [PMID: 39220114 PMCID: PMC11364127 DOI: 10.1016/j.phro.2024.100620]
Abstract
BACKGROUND AND PURPOSE Deep-learning (DL) models for segmentation of the gross tumor volume (GTV) in radiotherapy are generally based on clinical delineations, which suffer from inter-observer variability. The aim of this study was to compare the performance of a DL-model based on clinical glioblastoma GTVs to a model based on a single-observer edited version of the same GTVs. MATERIALS AND METHODS The dataset included imaging data (computed tomography (CT), T1, contrast-T1 (T1C), and fluid-attenuated inversion recovery (FLAIR)) of 259 glioblastoma patients treated with post-operative radiotherapy between 2012 and 2019 at a single institute. The clinical GTVs were edited using all imaging data. The dataset was split into 207 cases for training/validation and 52 for testing. GTV segmentation models (nnUNet) were trained on clinical and edited GTVs separately and compared using Surface Dice with 1 mm tolerance (sDSC1mm). We also evaluated model performance with respect to extent of resection (EOR) and different imaging combinations (T1C/T1/FLAIR/CT, T1C/FLAIR/CT, T1C/FLAIR, T1C/CT, T1C/T1, T1C). A Wilcoxon test was used for significance testing. RESULTS The median (range) sDSC1mm of the clinical-GTV model and the edited-GTV model, both evaluated against the edited contours, was 0.76 (0.43-0.94) vs. 0.92 (0.60-0.98), respectively (p < 0.001). sDSC1mm was not significantly different between patients with a biopsy, partial resection, and complete resection. T1C as a single input performed as well as the imaging combinations. CONCLUSIONS High segmentation accuracy was obtained by the DL-models. Editing of the clinical GTVs significantly increased DL performance with a relevant effect size. DL performance was robust to EOR and highly accurate using only T1C.
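The surface Dice at 1 mm tolerance (sDSC1mm) used above rewards boundary agreement rather than volume overlap: it counts the fraction of each contour's surface lying within the tolerance of the other contour's surface. The following is a simplified distance-transform sketch of the metric, with voxel spacing supplied by the caller; it is not the exact implementation used in the study.

```python
import numpy as np
from scipy import ndimage

def surface_dice(pred, gt, spacing, tol=1.0):
    """Surface Dice at tolerance `tol` (mm) between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    surf_p = pred ^ ndimage.binary_erosion(pred)          # boundary voxels
    surf_g = gt ^ ndimage.binary_erosion(gt)
    dist_to_g = ndimage.distance_transform_edt(~surf_g, sampling=spacing)
    dist_to_p = ndimage.distance_transform_edt(~surf_p, sampling=spacing)
    close_p = (dist_to_g[surf_p] <= tol).sum()            # pred surface near gt
    close_g = (dist_to_p[surf_g] <= tol).sum()            # gt surface near pred
    return (close_p + close_g) / (surf_p.sum() + surf_g.sum())

pred = np.zeros((40, 40, 40), dtype=bool); pred[10:30, 10:30, 10:30] = True
gt = np.zeros((40, 40, 40), dtype=bool); gt[11:31, 10:30, 10:30] = True
print(f"sDSC(1 mm): {surface_dice(pred, gt, spacing=(1.0, 1.0, 1.0)):.2f}")
```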
Affiliation(s)
- Kim M. Hochreuter
- Danish Centre for Particle Therapy, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Jintao Ren
- Danish Centre for Particle Therapy, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Department of Oncology, Aarhus University Hospital, Aarhus, Denmark
- Jasper Nijkamp
- Danish Centre for Particle Therapy, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Stine S. Korreman
- Danish Centre for Particle Therapy, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Department of Oncology, Aarhus University Hospital, Aarhus, Denmark
- Slávka Lukacova
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Department of Oncology, Aarhus University Hospital, Aarhus, Denmark
- Jesper F. Kallehauge
- Danish Centre for Particle Therapy, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Anouk K. Trip
- Danish Centre for Particle Therapy, Aarhus University Hospital, Aarhus, Denmark
19. Jiménez LG, Decaestecker C. Impact of imperfect annotations on CNN training and performance for instance segmentation and classification in digital pathology. Comput Biol Med 2024; 177:108586. [PMID: 38796882 DOI: 10.1016/j.compbiomed.2024.108586]
Abstract
Segmentation and classification of large numbers of instances, such as cell nuclei, are crucial tasks in digital pathology for accurate diagnosis. However, the availability of high-quality datasets for deep learning methods is often limited due to the complexity of the annotation process. In this work, we investigate the impact of noisy annotations on the training and performance of a state-of-the-art CNN model for the combined task of detecting, segmenting and classifying nuclei in histopathology images. In this context, we investigate the conditions for determining an appropriate number of training epochs to prevent overfitting to annotation noise during training. Our results indicate that the utilisation of a small, correctly annotated validation set is instrumental in avoiding overfitting and maintaining model performance to a large extent. Additionally, our findings underscore the beneficial role of pre-training.
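The role of the small, correctly annotated validation set described here amounts to an early-stopping rule monitored on clean data: training continues while the clean-validation score improves and stops before the model starts fitting the annotation noise. The sketch below shows such a loop with user-supplied training and evaluation callables; the patience value and toy scores are illustrative.

```python
def train_with_clean_validation(train_one_epoch, evaluate_clean_val,
                                max_epochs=100, patience=10):
    """Early stopping driven by a small, correctly annotated validation set."""
    best_score, best_epoch, wait = -float("inf"), 0, 0
    for epoch in range(max_epochs):
        train_one_epoch()                # may gradually fit the noisy labels
        score = evaluate_clean_val()     # e.g. Dice on the clean validation set
        if score > best_score:
            best_score, best_epoch, wait = score, epoch, 0
            # a real pipeline would checkpoint the model weights here
        else:
            wait += 1
            if wait >= patience:         # stop before the noise is memorised
                break
    return best_epoch, best_score

# toy run: the clean-validation score peaks at epoch 2 and then declines
scores = iter([0.60, 0.70, 0.72, 0.71, 0.70, 0.69, 0.68, 0.67])
print(train_with_clean_validation(lambda: None, lambda: next(scores), patience=3))
```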
Affiliation(s)
- Laura Gálvez Jiménez
- Laboratory of Image Synthesis and Analysis, Université Libre de Bruxelles, Brussels, Belgium.
- Christine Decaestecker
- Laboratory of Image Synthesis and Analysis, Université Libre de Bruxelles, Brussels, Belgium; DIAPath, CMMI, Université Libre de Bruxelles, Gosselies, Belgium.
20. Bannone E, Collins T, Esposito A, Cinelli L, De Pastena M, Pessaux P, Felli E, Andreotti E, Okamoto N, Barberio M, Felli E, Montorsi RM, Ingaglio N, Rodríguez-Luna MR, Nkusi R, Marescaux J, Hostettler A, Salvia R, Diana M. Surgical optomics: hyperspectral imaging and deep learning towards precision intraoperative automatic tissue recognition-results from the EX-MACHYNA trial. Surg Endosc 2024; 38:3758-3772. [PMID: 38789623 DOI: 10.1007/s00464-024-10880-1]
Abstract
BACKGROUND Hyperspectral imaging (HSI), combined with machine learning, can help to identify characteristic tissue signatures enabling automatic tissue recognition during surgery. This study aims to develop the first HSI-based automatic abdominal tissue recognition with human data in a prospective bi-center setting. METHODS Data were collected from patients undergoing elective open abdominal surgery at two international tertiary referral hospitals from September 2020 to June 2021. HS images were captured at various time points throughout the surgical procedure. Resulting RGB images were annotated with 13 distinct organ labels. Convolutional Neural Networks (CNNs) were employed for the analysis, with both external and internal validation settings utilized. RESULTS A total of 169 patients were included, 73 (43.2%) from Strasbourg and 96 (56.8%) from Verona. The internal validation within centers combined patients from both centers into a single cohort, randomly allocated to the training (127 patients, 75.1%, 585 images) and test sets (42 patients, 24.9%, 181 images). This validation setting showed the best performance. The highest true positive rate was achieved for the skin (100%) and the liver (97%). Misclassifications included tissues with a similar embryological origin (omentum and mesentery: 32%) or with overlaying boundaries (liver and hepatic ligament: 22%). The median DICE score for ten tissue classes exceeded 80%. CONCLUSION To improve automatic surgical scene segmentation and to drive clinical translation, multicenter accurate HSI datasets are essential, but further work is needed to quantify the clinical value of HSI. HSI might be included in a new omics science, namely surgical optomics, which uses light to extract quantifiable tissue features during surgery.
Collapse
Affiliation(s)
- Elisa Bannone
- Research Institute Against Digestive Cancer (IRCAD), 67000, Strasbourg, France.
- Department of General and Pancreatic Surgery, The Pancreas Institute, University of Verona Hospital Trust, P.Le Scuro 10, 37134, Verona, Italy.
| | - Toby Collins
- Research Institute Against Digestive Cancer (IRCAD), 67000, Strasbourg, France
| | - Alessandro Esposito
- Department of General and Pancreatic Surgery, The Pancreas Institute, University of Verona Hospital Trust, P.Le Scuro 10, 37134, Verona, Italy
| | - Lorenzo Cinelli
- Research Institute Against Digestive Cancer (IRCAD), 67000, Strasbourg, France
- Department of Gastrointestinal Surgery, San Raffaele Hospital IRCCS, Milan, Italy
| | - Matteo De Pastena
- Department of General and Pancreatic Surgery, The Pancreas Institute, University of Verona Hospital Trust, P.Le Scuro 10, 37134, Verona, Italy
| | - Patrick Pessaux
- Research Institute Against Digestive Cancer (IRCAD), 67000, Strasbourg, France
- Department of General, Digestive, and Endocrine Surgery, University Hospital of Strasbourg, Strasbourg, France
- Institut of Viral and Liver Disease, Inserm U1110, University of Strasbourg, Strasbourg, France
| | - Emanuele Felli
- Research Institute Against Digestive Cancer (IRCAD), 67000, Strasbourg, France
- Department of General, Digestive, and Endocrine Surgery, University Hospital of Strasbourg, Strasbourg, France
- Institut of Viral and Liver Disease, Inserm U1110, University of Strasbourg, Strasbourg, France
| | - Elena Andreotti
- Department of General and Pancreatic Surgery, The Pancreas Institute, University of Verona Hospital Trust, P.Le Scuro 10, 37134, Verona, Italy
| | - Nariaki Okamoto
- Research Institute Against Digestive Cancer (IRCAD), 67000, Strasbourg, France
- Photonics Instrumentation for Health, iCube Laboratory, University of Strasbourg, Strasbourg, France
| | - Manuel Barberio
- Research Institute Against Digestive Cancer (IRCAD), 67000, Strasbourg, France
- General Surgery Department, Ospedale Cardinale G. Panico, Tricase, Italy
| | - Eric Felli
- Research Institute Against Digestive Cancer (IRCAD), 67000, Strasbourg, France
- Department of Visceral Surgery and Medicine, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
| | - Roberto Maria Montorsi
- Department of General and Pancreatic Surgery, The Pancreas Institute, University of Verona Hospital Trust, P.Le Scuro 10, 37134, Verona, Italy
| | - Naomi Ingaglio
- Department of General and Pancreatic Surgery, The Pancreas Institute, University of Verona Hospital Trust, P.Le Scuro 10, 37134, Verona, Italy
| | - María Rita Rodríguez-Luna
- Research Institute Against Digestive Cancer (IRCAD), 67000, Strasbourg, France
- Photonics Instrumentation for Health, iCube Laboratory, University of Strasbourg, Strasbourg, France
| | - Richard Nkusi
- Research Institute Against Digestive Cancer (IRCAD), 67000, Strasbourg, France
| | - Jacque Marescaux
- Research Institute Against Digestive Cancer (IRCAD), 67000, Strasbourg, France
| | | | - Roberto Salvia
- Department of General and Pancreatic Surgery, The Pancreas Institute, University of Verona Hospital Trust, P.Le Scuro 10, 37134, Verona, Italy
| | - Michele Diana
- Photonics Instrumentation for Health, iCube Laboratory, University of Strasbourg, Strasbourg, France
- Department of Surgery, University Hospital of Geneva, Geneva, Switzerland
| |
Collapse
|
21
|
Snoussi H, Karimi D, Afacan O, Utkur M, Gholipour A. HAITCH: A Framework for Distortion and Motion Correction in Fetal Multi-Shell Diffusion-Weighted MRI. ARXIV 2024:arXiv:2406.20042v1. [PMID: 38979484 PMCID: PMC11230346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 07/10/2024]
Abstract
Diffusion magnetic resonance imaging (dMRI) is pivotal for probing the microstructure of the rapidly-developing fetal brain. However, fetal motion during scans and its interaction with magnetic field inhomogeneities result in artifacts and data scattering across spatial and angular domains. The effects of those artifacts are more pronounced in high-angular resolution fetal dMRI, where signal-to-noise ratio is very low. Those effects lead to biased estimates and compromise the consistency and reliability of dMRI analysis. This work presents HAITCH, the first and the only publicly available tool to correct and reconstruct multi-shell high-angular resolution fetal dMRI data. HAITCH offers several technical advances that include a blip-reversed dual-echo acquisition for dynamic distortion correction, advanced motion correction for model-free and robust reconstruction, optimized multi-shell design for enhanced information capture and increased tolerance to motion, and outlier detection for improved reconstruction fidelity. The framework is open-source, flexible, and can be used to process any type of fetal dMRI data including single-echo or single-shell acquisitions, but is most effective when used with multi-shell multi-echo fetal dMRI data that cannot be processed with any of the existing tools. Validation experiments on real fetal dMRI scans demonstrate significant improvements and accurate correction across diverse fetal ages and motion levels. HAITCH successfully removes artifacts and reconstructs high-fidelity fetal dMRI data suitable for advanced diffusion modeling, including fiber orientation distribution function estimation. These advancements pave the way for more reliable analysis of the fetal brain microstructure and tractography under challenging imaging conditions.
Collapse
Affiliation(s)
- Haykel Snoussi
- Boston Children's Hospital, and Harvard Medical School, Boston, MA 02115 USA
| | - Davood Karimi
- Boston Children's Hospital, and Harvard Medical School, Boston, MA 02115 USA
| | - Onur Afacan
- Boston Children's Hospital, and Harvard Medical School, Boston, MA 02115 USA
| | - Mustafa Utkur
- Boston Children's Hospital, and Harvard Medical School, Boston, MA 02115 USA
| | - Ali Gholipour
- Boston Children's Hospital, and Harvard Medical School, Boston, MA 02115 USA
| |
Collapse
|
22
|
Wei Y, Deng Y, Sun C, Lin M, Jiang H, Peng Y. Deep learning with noisy labels in medical prediction problems: a scoping review. J Am Med Inform Assoc 2024; 31:1596-1607. [PMID: 38814164 PMCID: PMC11187424 DOI: 10.1093/jamia/ocae108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2024] [Revised: 04/27/2024] [Accepted: 05/03/2024] [Indexed: 05/31/2024] Open
Abstract
OBJECTIVES Medical research faces substantial challenges from noisy labels attributed to factors like inter-expert variability and machine-extracted labels. Despite this, the adoption of label noise management remains limited, and label noise is largely ignored. To this end, there is a critical need to conduct a scoping review focusing on the problem space. This scoping review aims to comprehensively review label noise management in deep learning-based medical prediction problems, which includes label noise detection, label noise handling, and evaluation. Research involving label uncertainty is also included. METHODS Our scoping review follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We searched 4 databases, including PubMed, IEEE Xplore, Google Scholar, and Semantic Scholar. Our search terms include "noisy label AND medical/healthcare/clinical," "uncertainty AND medical/healthcare/clinical," and "noise AND medical/healthcare/clinical." RESULTS A total of 60 papers met inclusion criteria between 2016 and 2023. A series of practical questions in medical research is investigated. These include the sources of label noise, the impact of label noise, the detection of label noise, label noise handling techniques, and their evaluation. Categorizations of both label noise detection methods and handling techniques are provided. DISCUSSION From a methodological perspective, we observe that the medical community has kept pace with the broader deep-learning community, given that most techniques have been evaluated on medical data. We recommend considering label noise as a standard element in medical research, even in studies that are not dedicated to handling noisy labels. Initial experiments can start with easy-to-implement methods, such as noise-robust loss functions, weighting, and curriculum learning.
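As one of the "easy-to-implement" starting points recommended above, a noise-robust loss function can be added in a few lines. The sketch below implements the generalized cross-entropy loss of Zhang and Sabuncu (2018), which interpolates between cross-entropy (q → 0) and mean absolute error (q = 1); the tensor shapes are illustrative.

```python
# Sketch: generalized cross-entropy (GCE) loss, more robust to noisy labels
# than standard cross-entropy.
import torch
import torch.nn.functional as F

def generalized_cross_entropy(logits, targets, q=0.7):
    probs = F.softmax(logits, dim=1)
    p_true = probs.gather(1, targets.unsqueeze(1)).squeeze(1).clamp_min(1e-7)
    return ((1.0 - p_true.pow(q)) / q).mean()

logits = torch.randn(4, 3, requires_grad=True)    # dummy predictions, 3 classes
targets = torch.tensor([0, 2, 1, 2])              # possibly noisy labels
loss = generalized_cross_entropy(logits, targets)
loss.backward()
```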
Collapse
Affiliation(s)
- Yishu Wei
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, United States
- Reddit Inc., San Francisco, CA 16093, United States
| | - Yu Deng
- Center for Health Information Partnerships, Northwestern University, Chicago, IL 10611, United States
| | - Cong Sun
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, United States
| | - Mingquan Lin
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, United States
- Department of Surgery, University of Minnesota, Minneapolis, MN 55455, United States
| | - Hongmei Jiang
- Department of Statistics and Data Science, Northwestern University, Evanston, IL 60208, United States
| | - Yifan Peng
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, United States
| |
Collapse
|
23
|
Zhang J, Song B, Wang H, Han B, Liu T, Liu L, Sugiyama M. BadLabel: A Robust Perspective on Evaluating and Enhancing Label-Noise Learning. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:4398-4409. [PMID: 38236681 DOI: 10.1109/tpami.2024.3355425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/08/2024]
Abstract
Label-noise learning (LNL) aims to increase the model's generalization given training data with noisy labels. To facilitate practical LNL algorithms, researchers have proposed different label noise types, ranging from class-conditional to instance-dependent noises. In this paper, we introduce a novel label noise type called BadLabel, which degrades the performance of existing LNL algorithms by a large margin. BadLabel is crafted based on the label-flipping attack against standard classification, where specific samples are selected and their labels are flipped to other labels so that the loss values of clean and noisy labels become indistinguishable. To address the challenge posed by BadLabel, we further propose a robust LNL method that perturbs the labels in an adversarial manner at each epoch to make the loss values of clean and noisy labels again distinguishable. Once we select a small set of (mostly) clean labeled data, we can apply the techniques of semi-supervised learning to train the model accurately. Our experimental results demonstrate that existing LNL algorithms are vulnerable to the newly introduced BadLabel noise type, while our proposed robust LNL method can effectively improve the generalization performance of the model under various types of label noise. The new dataset of noisy labels and the source codes of robust LNL algorithms are available at https://github.com/zjfheart/BadLabels.
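A minimal sketch of the generic "small-loss" selection step referred to above, i.e., picking a (mostly) clean labeled subset before applying semi-supervised learning. This is a common heuristic, not the paper's adversarial label-perturbation algorithm, and `model`, `images`, and `labels` are assumed placeholders.

```python
# Sketch: rank samples by per-sample loss and keep the low-loss fraction
# as the (mostly) clean subset; the rest can be treated as unlabelled data.
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_small_loss_subset(model, images, labels, keep_ratio=0.2):
    model.eval()
    losses = F.cross_entropy(model(images), labels, reduction="none")
    k = max(1, int(keep_ratio * len(labels)))
    order = torch.argsort(losses)
    clean_idx = order[:k]          # lowest-loss samples, likely correctly labelled
    noisy_idx = order[k:]          # candidates to relabel or use without labels
    return clean_idx, noisy_idx
```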
Collapse
|
24
|
Gao M, Jiang H, Hu Y, Ren Q, Xie Z, Liu J. Suppressing label noise in medical image classification using mixup attention and self-supervised learning. Phys Med Biol 2024; 69:105026. [PMID: 38636495 DOI: 10.1088/1361-6560/ad4083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2023] [Accepted: 04/18/2024] [Indexed: 04/20/2024]
Abstract
Deep neural networks (DNNs) have been widely applied in medical image classification and achieve remarkable classification performance. These achievements heavily depend on large-scale, accurately annotated training data. However, label noise is inevitably introduced in medical image annotation, as the labeling process heavily relies on the expertise and experience of annotators. Meanwhile, DNNs tend to overfit noisy labels, which degrades model performance. Therefore, in this work, we devise a noise-robust training approach to mitigate the adverse effects of noisy labels in medical image classification. Specifically, we incorporate contrastive learning and intra-group mixup attention strategies into vanilla supervised learning. Contrastive learning of the feature extractor helps enhance the visual representations learned by the DNN. The intra-group mixup attention module constructs groups and assigns self-attention weights to group-wise samples, and subsequently interpolates a large number of noise-suppressed samples through a weighted mixup operation. We conduct comparative experiments on both synthetic and real-world noisy medical datasets under various noise levels. Rigorous experiments validate that our noise-robust method with contrastive learning and mixup attention can effectively handle label noise and is superior to state-of-the-art methods. An ablation study also shows that both components contribute to boosting model performance. The proposed method demonstrates its ability to curb label noise and shows potential for real-world clinical applications.
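An illustrative, heavily simplified sketch of a weighted mixup over a group of samples, where the mixing weights play the role of the self-attention weights mentioned above; it is not the authors' module, and all shapes are assumptions.

```python
# Sketch: interpolate one noise-suppressed sample from a group by a
# softmax-weighted mixup of features and (possibly noisy) soft labels.
import torch
import torch.nn.functional as F

def weighted_group_mixup(features, labels_onehot):
    """features: (G, D) group features; labels_onehot: (G, C) possibly noisy labels."""
    attn = F.softmax(features @ features.mean(0), dim=0)       # (G,) mixing weights
    mixed_feature = (attn.unsqueeze(1) * features).sum(0)      # (D,)
    mixed_label = (attn.unsqueeze(1) * labels_onehot).sum(0)   # (C,) soft label
    return mixed_feature, mixed_label

feats = torch.randn(5, 128)                                    # a group of 5 samples
labels = F.one_hot(torch.tensor([0, 0, 1, 0, 0]), num_classes=3).float()
mixed_x, mixed_y = weighted_group_mixup(feats, labels)
```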
Collapse
Affiliation(s)
- Mengdi Gao
- College of Chemistry and Life Science, Beijing University of Technology, Beijing, People's Republic of China
- Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing, People's Republic of China
| | - Hongyang Jiang
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
- Research Institute of Trustworthy Autonomous Systems, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR, People's Republic of China
| | - Yan Hu
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
- Research Institute of Trustworthy Autonomous Systems, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
| | - Qiushi Ren
- Department of Biomedical Engineering, College of Future Technology, Peking University, Beijing 100871, People's Republic of China
| | - Zhaoheng Xie
- Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing 100191, People's Republic of China
| | - Jiang Liu
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
- Research Institute of Trustworthy Autonomous Systems, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
| |
Collapse
|
25
|
Huang S, Jafari R, Mortazavi BJ. Pulse2AI: An Adaptive Framework to Standardize and Process Pulsatile Wearable Sensor Data for Clinical Applications. IEEE OPEN JOURNAL OF ENGINEERING IN MEDICINE AND BIOLOGY 2024; 5:330-338. [PMID: 38899025 PMCID: PMC11186651 DOI: 10.1109/ojemb.2024.3398444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2023] [Revised: 02/09/2024] [Accepted: 04/19/2024] [Indexed: 06/21/2024] Open
Abstract
Goal: To establish Pulse2AI as a reproducible data preprocessing framework for pulsatile signals that generate high-quality machine-learning-ready datasets from raw wearable recordings. Methods: We proposed an end-to-end data preprocessing framework that adapts multiple pulsatile signal modalities and generates machine-learning-ready datasets agnostic to downstream medical tasks. Results: a dataset preprocessed by Pulse2AI improved systolic blood pressure estimation by 29.58%, from 11.41 to 8.03 mmHg in root-mean-square-error (RMSE) and its diastolic counterpart by 26.01%, from 7.93 to 5.87 mmHg in RMSE. For respiration rate (RR) estimation, Pulse2AI boosted performance by 19.69%, from 1.47 to 1.18 breaths per minute (BrPM) in mean-absolute-error (MAE). Conclusion: Pulse2AI turns pulsatile signals into machine learning (ML) ready datasets for arbitrary remote health monitoring tasks. We tested Pulse2AI on multiple pulsatile modalities and demonstrated its efficacy in two medical applications. This work bridges valuable assets in remote sensing and internet of medical things to ML-ready datasets for medical modeling.
Collapse
Affiliation(s)
- Sicong Huang
- Department of Computer Science and EngineeringTexas A&M UniversityCollege StationTX77840USA
| | - Roozbeh Jafari
- Lincoln LaboratoryMassachusetts Institute of TechnologyLexingtonMA02139USA
- Laboratory for Information and Decision Systems (LIDS)Massachusetts Institute of TechnologyCambridgeMA02139USA
- Department of Electrical and Computer EngineeringTexas A&M UniversityCollege StationTX77843USA
- School of Engineering MedicineTexas A&M UniversityHoustonTX77843USA
| | - Bobak J. Mortazavi
- Department of Computer Science and EngineeringTexas A&M UniversityCollege StationTX77840USA
| |
Collapse
|
26
|
Gao Y, Fu J, Wang Y, Guo Y. Typicality- and instance-dependent label noise-combating: a novel framework for simulating and combating real-world noisy labels for endoscopic polyp classification. Vis Comput Ind Biomed Art 2024; 7:10. [PMID: 38709353 DOI: 10.1186/s42492-024-00162-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2023] [Accepted: 04/14/2024] [Indexed: 05/07/2024] Open
Abstract
Learning with noisy labels aims to train accurate neural networks despite corrupted training labels. Current models handle instance-independent label noise (IIN) well; however, they fall short with real-world noise. In medical image classification, atypical samples frequently receive incorrect labels, rendering instance-dependent label noise (IDN) an accurate representation of real-world scenarios. However, current IDN approaches fail to consider the typicality of samples, which hampers their ability to address real-world label noise effectively. To alleviate these issues, we introduce typicality- and instance-dependent label noise (TIDN) to simulate real-world noise and establish a TIDN-combating framework to combat label noise. Specifically, we use the sample's distance to decision boundaries in the feature space to represent typicality. The TIDN is then generated according to typicality. We establish a TIDN-attention module to combat label noise and learn the transition matrix from latent ground truth to the observed noisy labels. We further propose a recursive algorithm that enables the network to make correct predictions by applying corrections from the learned transition matrix. Our experiments demonstrate that the TIDN simulates real-world noise more closely than the existing IIN and IDN. Furthermore, the TIDN-combating framework demonstrates superior classification performance when training with simulated TIDN and actual real-world noise.
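An illustrative sketch (not the paper's exact procedure) of how typicality-dependent noise can be simulated: labels are flipped with a probability that grows as a sample's classification margin shrinks, i.e., as it lies closer to the decision boundary.

```python
# Sketch: simulate instance-dependent noise where atypical samples
# (small classification margin) are more likely to be mislabelled.
import torch
import torch.nn.functional as F

def simulate_boundary_dependent_noise(logits, labels, max_flip=0.4):
    probs = F.softmax(logits, dim=1)
    values, indices = probs.topk(2, dim=1)
    margin = values[:, 0] - values[:, 1]          # small margin = close to the boundary
    flip_prob = max_flip * (1.0 - margin)         # atypical samples flip more often
    flip = torch.rand_like(flip_prob) < flip_prob
    noisy = labels.clone()
    # flip to the runner-up class, mimicking a plausible human mistake
    noisy[flip] = indices[flip, 1]
    return noisy

logits = torch.randn(6, 3)                        # dummy classifier outputs, 3 classes
labels = torch.randint(0, 3, (6,))                # latent ground-truth labels
noisy_labels = simulate_boundary_dependent_noise(logits, labels)
```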
Collapse
Affiliation(s)
- Yun Gao
- School of Information Science and Technology, Fudan University, Shanghai, 200433, China
- Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai, 200433, China
| | - Junhu Fu
- School of Information Science and Technology, Fudan University, Shanghai, 200433, China
- Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai, 200433, China
| | - Yuanyuan Wang
- School of Information Science and Technology, Fudan University, Shanghai, 200433, China
- Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai, 200433, China
| | - Yi Guo
- School of Information Science and Technology, Fudan University, Shanghai, 200433, China.
- Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai, 200433, China.
| |
Collapse
|
27
|
Liu G, Brooks L, Canty J, Lu D, Jin JY, Lu J. Deep-NCA: A deep learning methodology for performing noncompartmental analysis of pharmacokinetic data. CPT Pharmacometrics Syst Pharmacol 2024; 13:870-879. [PMID: 38465417 PMCID: PMC11098158 DOI: 10.1002/psp4.13124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2023] [Revised: 01/24/2024] [Accepted: 02/22/2024] [Indexed: 03/12/2024] Open
Abstract
Noncompartmental analysis (NCA) is a model-independent approach for assessing pharmacokinetics (PKs). Although the existing NCA algorithms are very well-established and widely utilized, they suffer from low accuracies in the setting of sparse PK samples. In response, we developed Deep-NCA, a deep learning (DL) model to improve the prediction of key noncompartmental PK parameters. Our methodology utilizes synthetic PK data for model training and uses an innovative patient-specific normalization method for data preprocessing. Deep-NCA demonstrated adequate performance across six previously unseen simulated drugs under multiple dosing, showcasing effective generalization. Compared to traditional NCA, Deep-NCA exhibited superior performance for sparse PK data. This study advances the application of DL to PK studies and introduces an effective method for handling sparse PK data. With further validation and refinement, Deep-NCA could significantly enhance the efficiency of drug development by providing more accurate NCA estimates while requiring fewer PK samples.
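For reference, the conventional NCA quantity that Deep-NCA aims to estimate more accurately from sparse sampling is the area under the concentration-time curve; below is a minimal sketch of the standard linear trapezoidal rule, with an illustrative (made-up) sparse profile.

```python
# Sketch: conventional NCA exposure estimate via the linear trapezoidal rule,
# which becomes unreliable when PK sampling is sparse.
def trapezoidal_auc(times, concentrations):
    """times and concentrations are equal-length lists of observed PK samples."""
    auc = 0.0
    for (t0, c0), (t1, c1) in zip(zip(times, concentrations),
                                  zip(times[1:], concentrations[1:])):
        auc += (t1 - t0) * (c0 + c1) / 2.0
    return auc

# Example: an assumed sparse 4-point profile (hours, ng/mL)
print(trapezoidal_auc([0, 1, 4, 12], [0.0, 8.2, 5.1, 1.3]))
```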
Collapse
Affiliation(s)
- Gengbo Liu
- Modeling and Simulation/Clinical PharmacologyGenentech Inc.South San FranciscoCaliforniaUSA
| | - Logan Brooks
- Modeling and Simulation/Clinical PharmacologyGenentech Inc.South San FranciscoCaliforniaUSA
| | - John Canty
- Cancer ImmunologyGenentech Inc.South San FranciscoCaliforniaUSA
| | - Dan Lu
- Modeling and Simulation/Clinical PharmacologyGenentech Inc.South San FranciscoCaliforniaUSA
| | - Jin Y. Jin
- Modeling and Simulation/Clinical PharmacologyGenentech Inc.South San FranciscoCaliforniaUSA
| | - James Lu
- Modeling and Simulation/Clinical PharmacologyGenentech Inc.South San FranciscoCaliforniaUSA
| |
Collapse
|
28
|
Li T, Guo Y, Zhao Z, Chen M, Lin Q, Hu X, Yao Z, Hu B. Automated Diagnosis of Major Depressive Disorder With Multi-Modal MRIs Based on Contrastive Learning: A Few-Shot Study. IEEE Trans Neural Syst Rehabil Eng 2024; 32:1566-1576. [PMID: 38512734 DOI: 10.1109/tnsre.2024.3380357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/23/2024]
Abstract
Depression ranks among the most prevalent mood-related psychiatric disorders. Existing clinical diagnostic approaches relying on scale interviews are susceptible to individual and environmental variations. In contrast, the integration of neuroimaging techniques and computer science has provided compelling evidence for the quantitative assessment of major depressive disorder (MDD). However, one of the major challenges in computer-aided diagnosis of MDD is to automatically and effectively mine the complementary cross-modal information from limited datasets. In this study, we proposed a few-shot learning framework that integrates multi-modal MRI data based on contrastive learning. In the upstream task, it is designed to extract knowledge from heterogeneous data. Subsequently, the downstream task is dedicated to transferring the acquired knowledge to the target dataset, where a hierarchical fusion paradigm is also designed to integrate features across inter- and intra-modalities. Lastly, the proposed model was evaluated on a set of multi-modal clinical data, achieving average scores of 73.52% and 73.09% for accuracy and AUC, respectively. Our findings also reveal that the brain regions within the default mode network and cerebellum play a crucial role in the diagnosis, which provides further direction in exploring reproducible biomarkers for MDD diagnosis.
Collapse
|
29
|
Yang H, Wu H, Kong L, Luo W, Xie Q, Pan J, Quan W, Hu L, Li D, Wu X, Liang H, Qin P. Precise detection of awareness in disorders of consciousness using deep learning framework. Neuroimage 2024; 290:120580. [PMID: 38508294 DOI: 10.1016/j.neuroimage.2024.120580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2024] [Revised: 03/14/2024] [Accepted: 03/16/2024] [Indexed: 03/22/2024] Open
Abstract
Diagnosis of disorders of consciousness (DOC) remains a formidable challenge. Deep learning methods have been widely applied to general neurological and psychiatric disorders, but remain underexplored in the DOC domain. Considering the successful use of resting-state functional MRI (rs-fMRI) for evaluating patients with DOC, this study seeks to explore the combination of deep learning techniques and rs-fMRI for precisely detecting awareness in DOC. We initiated our research with a benchmark dataset comprising 140 participants, including 76 with unresponsive wakefulness syndrome (UWS), 25 in a minimally conscious state (MCS), and 39 controls, from three independent sites. We developed a cascade 3D EfficientNet-B3-based deep learning framework tailored for discriminating MCS from UWS patients, referred to as "DeepDOC", and compared its performance against five state-of-the-art machine learning models. We also included an independent dataset consisting of 11 DOC patients to test whether our model could identify patients with cognitive motor dissociation (CMD), who are behaviorally diagnosed as unconscious but can be detected as conscious by a brain-computer interface (BCI) method. Our results demonstrate that DeepDOC outperforms the five machine learning models, achieving an area under the curve (AUC) of 0.927 and an accuracy of 0.861 for distinguishing MCS from UWS patients. More importantly, DeepDOC excels in CMD identification, achieving an AUC of 1 and an accuracy of 0.909. Using the gradient-weighted class activation mapping algorithm, we found that the posterior cortex, encompassing the visual cortex, posterior middle temporal gyrus, posterior cingulate cortex, precuneus, and cerebellum, made a more substantial contribution to classification than other brain regions. This research offers a convenient and accurate method for detecting covert awareness in patients with MCS and CMD using rs-fMRI data.
Collapse
Affiliation(s)
- Huan Yang
- Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510080, China; Medical Big Data Center, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Guangzhou 510080, China
| | - Hang Wu
- Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education; Institute for Brain Research and Rehabilitation, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou 510631, China
| | - Lingcong Kong
- Medical Big Data Center, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Guangzhou 510080, China
| | - Wen Luo
- The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou 528199, China
| | - Qiuyou Xie
- Joint Research Center for disorders of consciousness, Department of Rehabilitation, Zhujiang Hospital, School of Rehabilitation Sciences, Southern Medical University, Guangzhou 510220, China
| | - Jiahui Pan
- School of Software, South China Normal University, Foshan 528225, China; Pazhou Lab, Guangzhou 510330, China
| | - Wuxiu Quan
- Medical Big Data Center, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Guangzhou 510080, China
| | - Lianting Hu
- Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510080, China; Medical Big Data Center, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Guangzhou 510080, China
| | - Dantong Li
- Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510080, China; Medical Big Data Center, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Guangzhou 510080, China
| | - Xuehai Wu
- Pazhou Lab, Guangzhou 510330, China; Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200433, China; Shanghai Clinical Medical Center of Neurosurgery, Shanghai Key laboratory of Brain Function Restoration and Neural Regeneration, Neurosurgical Institute of Fudan University, Shanghai 200433, China; State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, School of Basic Medical Sciences and Institutes of Brain Science, Fudan University, Shanghai 200433, China
| | - Huiying Liang
- Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510080, China; Medical Big Data Center, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Guangzhou 510080, China.
| | - Pengmin Qin
- Pazhou Lab, Guangzhou 510330, China; Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education; School of Psychology, Center for Studies of Psychological Application, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou 510631, China.
| |
Collapse
|
30
|
Li Y, Imami MR, Zhao L, Amindarolzarbi A, Mena E, Leal J, Chen J, Gafita A, Voter AF, Li X, Du Y, Zhu C, Choyke PL, Zou B, Jiao Z, Rowe SP, Pomper MG, Bai HX. An Automated Deep Learning-Based Framework for Uptake Segmentation and Classification on PSMA PET/CT Imaging of Patients with Prostate Cancer. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2024:10.1007/s10278-024-01104-y. [PMID: 38587770 DOI: 10.1007/s10278-024-01104-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/22/2024] [Revised: 01/22/2024] [Accepted: 03/26/2024] [Indexed: 04/09/2024]
Abstract
Uptake segmentation and classification on PSMA PET/CT are important for automating whole-body tumor burden determinations. We developed and evaluated an automated deep learning (DL)-based framework that segments and classifies uptake on PSMA PET/CT. We identified 193 [18F] DCFPyL PET/CT scans of patients with biochemically recurrent prostate cancer from two institutions, including 137 [18F] DCFPyL PET/CT scans for training and internal testing, and 56 scans from another institution for external testing. Two radiologists segmented and labelled foci as suspicious or non-suspicious for malignancy. A DL-based segmentation model was developed with two independent CNNs. Anatomical prior guidance was applied to make the DL framework focus on PSMA-avid lesions. Segmentation performance was evaluated by Dice, IoU, precision, and recall. The classification model was constructed with a multi-modal decision fusion framework and evaluated by accuracy, AUC, F1 score, precision, and recall. Automatic segmentation of suspicious lesions was improved under prior guidance, with mean Dice, IoU, precision, and recall of 0.700, 0.566, 0.809, and 0.660 on the internal test set and 0.680, 0.548, 0.749, and 0.740 on the external test set. Our multi-modal decision fusion framework outperformed single-modal and multi-modal CNNs with accuracy, AUC, F1 score, precision, and recall of 0.764, 0.863, 0.844, 0.841, and 0.847 in distinguishing suspicious and non-suspicious foci on the internal test set and 0.796, 0.851, 0.865, 0.814, and 0.923 on the external test set. DL-based lesion segmentation on PSMA PET is facilitated through our anatomical prior guidance strategy. Our classification framework differentiates suspicious foci from those not suspicious for cancer with good accuracy.
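A minimal sketch of the overlap metrics reported above (Dice and IoU) for binary segmentation masks; inputs are assumed to be 0/1 tensors.

```python
# Sketch: Dice coefficient and intersection-over-union for binary masks.
import torch

def dice_and_iou(pred, target, eps=1e-7):
    pred, target = pred.float().flatten(), target.float().flatten()
    intersection = (pred * target).sum()
    dice = (2 * intersection + eps) / (pred.sum() + target.sum() + eps)
    iou = (intersection + eps) / (pred.sum() + target.sum() - intersection + eps)
    return dice.item(), iou.item()

pred = torch.tensor([[0, 1, 1], [0, 1, 0]])
gt = torch.tensor([[0, 1, 0], [0, 1, 0]])
print(dice_and_iou(pred, gt))   # (0.8, ~0.667)
```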
Collapse
Affiliation(s)
- Yang Li
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA
- School of Informatics, Hunan University of Chinese Medicine, Changsha, 410208, China
| | - Maliha R Imami
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA
| | - Linmei Zhao
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA
| | - Alireza Amindarolzarbi
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA
| | - Esther Mena
- National Institutes of Health, Bethesda, 20892, USA
| | - Jeffrey Leal
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA
| | - Junyu Chen
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA
| | - Andrei Gafita
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA
| | - Andrew F Voter
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA
| | - Xin Li
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA
| | - Yong Du
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA
| | - Chengzhang Zhu
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
| | | | - Beiji Zou
- School of Informatics, Hunan University of Chinese Medicine, Changsha, 410208, China
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
| | - Zhicheng Jiao
- Warren Alpert Medical School of Brown University, Providence, 02903, USA
| | - Steven P Rowe
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA
| | - Martin G Pomper
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA
| | - Harrison X Bai
- Russell H. Morgan Department of Radiology and Radiological Sciences, Johns Hopkins University School of Medicine, 601 N. Caroline St., Baltimore, MD 21287, USA.
| |
Collapse
|
31
|
Ming Z, Chen D, Gao T, Tang Y, Tu W, Chen J. V2IED: Dual-view learning framework for detecting events of interictal epileptiform discharges. Neural Netw 2024; 172:106136. [PMID: 38266472 DOI: 10.1016/j.neunet.2024.106136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2023] [Revised: 11/20/2023] [Accepted: 01/16/2024] [Indexed: 01/26/2024]
Abstract
Interictal epileptiform discharges (IEDs) are large intermittent electrophysiological events associated with various severe brain disorders. Automated IED detection has long been a challenging task, and mainstream methods largely focus on singling out IEDs from backgrounds from the perspective of waveform, leaving normal sharp transients/artifacts with similar waveforms almost unattended. An open issue remains: accurately detecting IED events that directly reflect abnormalities in brain electrophysiological activity while minimizing interference from irrelevant sharp transients with similar waveforms. This study therefore proposes a dual-view learning framework (namely V2IED) to detect IED events from multi-channel EEG via aggregating features from the two phases: (1) Morphological Feature Learning: directly treating the EEG as a sequence with multiple channels, a 1D-CNN (Convolutional Neural Network) is applied to explicitly learn deep morphological features; and (2) Spatial Feature Learning: viewing the EEG as a 3D tensor embedding channel topology, a CNN captures the spatial features at each sampling point, followed by an LSTM (Long Short-Term Memory) to learn the evolution of these features. Experimental results on a public EEG dataset against the state-of-the-art counterparts indicate that: (1) compared with the existing optimal models, V2IED achieves a larger area under the receiver operating characteristic (ROC) curve in detecting IEDs from normal sharp transients, with a 5.25% improvement in accuracy; (2) the introduction of spatial features improves performance by 2.4% in accuracy; and (3) V2IED also performs excellently in distinguishing IEDs from background signals, especially benign variants.
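An assumption-heavy sketch of the dual-view idea described above: one branch applies a 1D-CNN to the raw multi-channel sequence (morphology), the other extracts per-timestep channel features and feeds them to an LSTM (spatio-temporal evolution). Channel counts, layer sizes, and window length are illustrative, not the paper's settings.

```python
# Sketch: dual-view IED detector combining a morphological 1D-CNN branch
# with a per-timestep CNN + LSTM branch.
import torch
import torch.nn as nn

class DualViewIEDDetector(nn.Module):
    def __init__(self, n_channels=19, n_classes=2):
        super().__init__()
        # View 1: morphology, EEG as (batch, channels, time)
        self.morph = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        # View 2: channel-wise features per time step, then temporal evolution
        self.spatial = nn.Conv1d(n_channels, 16, kernel_size=1)
        self.lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32 + 32, n_classes)

    def forward(self, x):                             # x: (batch, channels, time)
        morph_feat = self.morph(x).flatten(1)         # (batch, 32)
        spatial = self.spatial(x).transpose(1, 2)     # (batch, time, 16)
        _, (h, _) = self.lstm(spatial)
        return self.head(torch.cat([morph_feat, h[-1]], dim=1))

eeg = torch.randn(4, 19, 256)                         # 4 windows, 19 channels, 256 samples
logits = DualViewIEDDetector()(eeg)                   # (4, 2)
```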
Collapse
Affiliation(s)
- Zhekai Ming
- School of Computer Science, the Hubei Key Laboratory of Multimedia and Network Communication Engineering, the National Engineering Research Center for Multimedia Software, Wuhan University, Wuhan, 430072, China
| | - Dan Chen
- School of Computer Science, the Hubei Key Laboratory of Multimedia and Network Communication Engineering, the National Engineering Research Center for Multimedia Software, Wuhan University, Wuhan, 430072, China.
| | - Tengfei Gao
- School of Computer Science, the Hubei Key Laboratory of Multimedia and Network Communication Engineering, the National Engineering Research Center for Multimedia Software, Wuhan University, Wuhan, 430072, China
| | - Yunbo Tang
- College of Computer and Data Science, Fuzhou University, Fuzhou, 350108, China
| | - Weiping Tu
- School of Computer Science, the Hubei Key Laboratory of Multimedia and Network Communication Engineering, the National Engineering Research Center for Multimedia Software, Wuhan University, Wuhan, 430072, China
| | - Jingying Chen
- National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China
| |
Collapse
|
32
|
O'Connor K, Golder S, Weissenbacher D, Klein AZ, Magge A, Gonzalez-Hernandez G. Methods and Annotated Data Sets Used to Predict the Gender and Age of Twitter Users: Scoping Review. J Med Internet Res 2024; 26:e47923. [PMID: 38488839 PMCID: PMC10980991 DOI: 10.2196/47923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2023] [Revised: 07/28/2023] [Accepted: 08/01/2023] [Indexed: 03/19/2024] Open
Abstract
BACKGROUND Patient health data collected from a variety of nontraditional resources, commonly referred to as real-world data, can be a key information source for health and social science research. Social media platforms, such as Twitter (Twitter, Inc), offer vast amounts of real-world data. An important aspect of incorporating social media data in scientific research is identifying the demographic characteristics of the users who posted those data. Age and gender are considered key demographics for assessing the representativeness of the sample and enable researchers to study subgroups and disparities effectively. However, deciphering the age and gender of social media users poses challenges. OBJECTIVE This scoping review aims to summarize the existing literature on the prediction of the age and gender of Twitter users and provide an overview of the methods used. METHODS We searched 15 electronic databases and carried out reference checking to identify relevant studies that met our inclusion criteria: studies that predicted the age or gender of Twitter users using computational methods. The screening process was performed independently by 2 researchers to ensure the accuracy and reliability of the included studies. RESULTS Of the initial 684 studies retrieved, 74 (10.8%) studies met our inclusion criteria. Among these 74 studies, 42 (57%) focused on predicting gender, 8 (11%) focused on predicting age, and 24 (32%) predicted a combination of both age and gender. Gender prediction was predominantly approached as a binary classification task, with the reported performance of the methods ranging from 0.58 to 0.96 F1-score or 0.51 to 0.97 accuracy. Age prediction approaches varied in terms of classification groups, with a higher range of reported performance, ranging from 0.31 to 0.94 F1-score or 0.43 to 0.86 accuracy. The heterogeneous nature of the studies and the reporting of dissimilar performance metrics made it challenging to quantitatively synthesize results and draw definitive conclusions. CONCLUSIONS Our review found that although automated methods for predicting the age and gender of Twitter users have evolved to incorporate techniques such as deep neural networks, a significant proportion of the attempts rely on traditional machine learning methods, suggesting that there is potential to improve the performance of these tasks by using more advanced methods. Gender prediction has generally achieved a higher reported performance than age prediction. However, the lack of standardized reporting of performance metrics or standard annotated corpora to evaluate the methods used hinders any meaningful comparison of the approaches. Potential biases stemming from the collection and labeling of data used in the studies were identified as a problem, emphasizing the need for careful consideration and mitigation of biases in future studies. This scoping review provides valuable insights into the methods used for predicting the age and gender of Twitter users, along with the challenges and considerations associated with these methods.
Collapse
Affiliation(s)
- Karen O'Connor
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Su Golder
- Department of Health Sciences, University of York, York, United Kingdom
| | - Davy Weissenbacher
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
| | - Ari Z Klein
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Arjun Magge
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | | |
Collapse
|
33
|
Wang Y, Lin W, Zhuang X, Wang X, He Y, Li L, Lyu G. Advances in artificial intelligence for the diagnosis and treatment of ovarian cancer (Review). Oncol Rep 2024; 51:46. [PMID: 38240090 PMCID: PMC10828921 DOI: 10.3892/or.2024.8705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2023] [Accepted: 01/05/2024] [Indexed: 01/23/2024] Open
Abstract
Artificial intelligence (AI) has emerged as a crucial technique for extracting high‑throughput information from various sources, including medical images, pathological images, and genomics, transcriptomics, proteomics and metabolomics data. AI has been widely used in the field of diagnosis, for the differentiation of benign and malignant ovarian cancer (OC), and for prognostic assessment, with favorable results. Notably, AI‑based radiomics has proven to be a non‑invasive, convenient and economical approach, making it an essential asset in a gynecological setting. The present study reviews the application of AI in the diagnosis, differentiation and prognostic assessment of OC. It is suggested that AI‑based multi‑omics studies have the potential to improve the diagnostic and prognostic predictive ability in patients with OC, thereby facilitating the realization of precision medicine.
Collapse
Affiliation(s)
- Yanli Wang
- Department of Ultrasound, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, Fujian 362000, P.R. China
| | - Weihong Lin
- Department of Obstetrics and Gynecology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, Fujian 362000, P.R. China
| | - Xiaoling Zhuang
- Department of Pathology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, Fujian 362000, P.R. China
| | - Xiali Wang
- Department of Clinical Medicine, Quanzhou Medical College, Quanzhou, Fujian 362000, P.R. China
| | - Yifang He
- Department of Ultrasound, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, Fujian 362000, P.R. China
| | - Luhong Li
- Department of Obstetrics and Gynecology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, Fujian 362000, P.R. China
| | - Guorong Lyu
- Department of Ultrasound, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, Fujian 362000, P.R. China
- Department of Clinical Medicine, Quanzhou Medical College, Quanzhou, Fujian 362000, P.R. China
| |
Collapse
|
34
|
Wu DY, Fang YV, Vo DT, Spangler A, Seiler SJ. Detailed Image Data Quality and Cleaning Practices for Artificial Intelligence Tools for Breast Cancer. JCO Clin Cancer Inform 2024; 8:e2300074. [PMID: 38552191 PMCID: PMC10994436 DOI: 10.1200/cci.23.00074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2023] [Revised: 11/30/2023] [Accepted: 02/13/2024] [Indexed: 04/02/2024] Open
Abstract
Standardizing image-data preparation practices to improve accuracy/consistency of AI diagnostic tools.
Collapse
Affiliation(s)
- Dolly Y. Wu
- Volunteer Services, UT Southwestern Medical Center, Dallas, TX
| | - Yisheng V. Fang
- Department of Pathology, UT Southwestern Medical Center, Dallas, TX
| | - Dat T. Vo
- Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, TX
| | - Ann Spangler
- Retired, Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, TX
| | | |
Collapse
|
35
|
López-Pérez M, Morales-Álvarez P, Cooper LAD, Felicelli C, Goldstein J, Vadasz B, Molina R, Katsaggelos AK. Learning from crowds for automated histopathological image segmentation. Comput Med Imaging Graph 2024; 112:102327. [PMID: 38194768 DOI: 10.1016/j.compmedimag.2024.102327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2023] [Revised: 10/20/2023] [Accepted: 12/12/2023] [Indexed: 01/11/2024]
Abstract
Automated semantic segmentation of histopathological images is an essential task in Computational Pathology (CPATH). The main limitation of Deep Learning (DL) to address this task is the scarcity of expert annotations. Crowdsourcing (CR) has emerged as a promising solution to reduce the individual (expert) annotation cost by distributing the labeling effort among a group of (non-expert) annotators. Extracting knowledge in this scenario is challenging, as it involves noisy annotations. Jointly learning the underlying (expert) segmentation and the annotators' expertise is currently a commonly used approach. Unfortunately, this approach is frequently carried out by learning a different neural network for each annotator, which scales poorly when the number of annotators grows. For this reason, this strategy cannot be easily applied to real-world CPATH segmentation. This paper proposes a new family of methods for CR segmentation of histopathological images. Our approach consists of two coupled networks: a segmentation network (for learning the expert segmentation) and an annotator network (for learning the annotators' expertise). We propose to estimate the annotators' behavior with only one network that receives the annotator ID as input, achieving scalability on the number of annotators. Our family is composed of three different models for the annotator network. Within this family, we propose a novel modeling of the annotator network in the CR segmentation literature, which considers the global features of the image. We validate our methods on a real-world dataset of Triple Negative Breast Cancer images labeled by several medical students. Our new CR modeling achieves a Dice coefficient of 0.7827, outperforming the well-known STAPLE (0.7039) and being competitive with the supervised method with expert labels (0.7723). The code is available at https://github.com/wizmik12/CRowd_Seg.
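An illustrative simplification (not the paper's models) of the key scalability idea above: a single annotator network conditioned on the annotator ID through an embedding, producing a per-annotator confusion matrix that maps the segmentation network's class probabilities to expected noisy labels. Shapes and sizes are assumptions.

```python
# Sketch: one annotator network for all annotators, conditioned on annotator ID.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnnotatorNetwork(nn.Module):
    def __init__(self, n_annotators, n_classes):
        super().__init__()
        self.n_classes = n_classes
        self.embed = nn.Embedding(n_annotators, 16)
        self.to_confusion = nn.Linear(16, n_classes * n_classes)

    def forward(self, annotator_id):
        w = self.to_confusion(self.embed(annotator_id))           # (B, C*C)
        cm = w.view(-1, self.n_classes, self.n_classes)
        return F.softmax(cm, dim=2)                                # rows sum to 1

def noisy_prediction(true_probs, confusion):
    """Pass the segmentation probabilities through the annotator's confusion matrix.
    true_probs: (B, C, H, W); confusion: (B, C, C) -> (B, C, H, W)."""
    return torch.einsum("bchw,bcd->bdhw", true_probs, confusion)

seg_probs = torch.softmax(torch.randn(2, 3, 64, 64), dim=1)       # segmentation net output
ann = AnnotatorNetwork(n_annotators=20, n_classes=3)
noisy = noisy_prediction(seg_probs, ann(torch.tensor([4, 11])))   # expected noisy labels
```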
Collapse
Affiliation(s)
- Miguel López-Pérez
- Department of Computer Science and Artificial Intelligence, University of Granada, Spain.
| | | | - Lee A D Cooper
- Department of Pathology at Northwestern University, Chicago, USA; Center for Computational Imaging and Signal Analytics, Northwestern University, Chicago, USA.
| | | | | | - Brian Vadasz
- Department of Pathology at Northwestern University, Chicago, USA
| | - Rafael Molina
- Department of Computer Science and Artificial Intelligence, University of Granada, Spain.
| | - Aggelos K Katsaggelos
- Center for Computational Imaging and Signal Analytics, Northwestern University, Chicago, USA; Department of Electrical and Computer Engineering at Northwestern University, Chicago, USA.
| |
Collapse
|
36
|
Song A, Lusk JB, Roh KM, Hsu ST, Valikodath NG, Lad EM, Muir KW, Engelhard MM, Limkakeng AT, Izatt JA, McNabb RP, Kuo AN. RobOCTNet: Robotics and Deep Learning for Referable Posterior Segment Pathology Detection in an Emergency Department Population. Transl Vis Sci Technol 2024; 13:12. [PMID: 38488431 PMCID: PMC10946693 DOI: 10.1167/tvst.13.3.12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2023] [Accepted: 01/31/2024] [Indexed: 03/19/2024] Open
Abstract
Purpose To evaluate the diagnostic performance of a robotically aligned optical coherence tomography (RAOCT) system coupled with a deep learning model in detecting referable posterior segment pathology in OCT images of emergency department patients. Methods A deep learning model, RobOCTNet, was trained and internally tested to classify OCT images as referable versus non-referable for ophthalmology consultation. For external testing, emergency department patients with signs or symptoms warranting evaluation of the posterior segment were imaged with RAOCT. RobOCTNet was used to classify the images. Model performance was evaluated against a reference standard based on clinical diagnosis and retina specialist OCT review. Results We included 90,250 OCT images for training and 1489 images for internal testing. RobOCTNet achieved an area under the curve (AUC) of 1.00 (95% confidence interval [CI], 0.99-1.00) for detection of referable posterior segment pathology in the internal test set. For external testing, RAOCT was used to image 72 eyes of 38 emergency department patients. In this set, RobOCTNet had an AUC of 0.91 (95% CI, 0.82-0.97), a sensitivity of 95% (95% CI, 87%-100%), and a specificity of 76% (95% CI, 62%-91%). The model's performance was comparable to two human experts' performance. Conclusions A robotically aligned OCT coupled with a deep learning model demonstrated high diagnostic performance in detecting referable posterior segment pathology in a cohort of emergency department patients. Translational Relevance Robotically aligned OCT coupled with a deep learning model may have the potential to improve emergency department patient triage for ophthalmology referral.
Collapse
Affiliation(s)
- Ailin Song
- Duke University School of Medicine, Durham, NC, USA
- Department of Ophthalmology, Duke University, Durham, NC, USA
| | - Jay B. Lusk
- Duke University School of Medicine, Durham, NC, USA
| | - Kyung-Min Roh
- Department of Ophthalmology, Duke University, Durham, NC, USA
| | - S. Tammy Hsu
- Department of Ophthalmology, Duke University, Durham, NC, USA
| | | | - Eleonora M. Lad
- Department of Ophthalmology, Duke University, Durham, NC, USA
| | - Kelly W. Muir
- Department of Ophthalmology, Duke University, Durham, NC, USA
| | - Matthew M. Engelhard
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
| | | | - Joseph A. Izatt
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
| | - Ryan P. McNabb
- Department of Ophthalmology, Duke University, Durham, NC, USA
| | - Anthony N. Kuo
- Department of Ophthalmology, Duke University, Durham, NC, USA
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
| |
Collapse
|
37
|
He Y, Ge R, Qi X, Chen Y, Wu J, Coatrieux JL, Yang G, Li S. Learning Better Registration to Learn Better Few-Shot Medical Image Segmentation: Authenticity, Diversity, and Robustness. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:2588-2601. [PMID: 35895657 DOI: 10.1109/tnnls.2022.3190452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
In this work, we address the task of few-shot medical image segmentation (MIS) with a novel proposed framework based on the learning registration to learn segmentation (LRLS) paradigm. To cope with the lack of authenticity, diversity, and robustness in existing LRLS frameworks, we propose the better registration better segmentation (BRBS) framework with three main contributions that are experimentally shown to have substantial practical merit. First, we improve the authenticity of the registration-based generation program and propose the knowledge consistency constraint strategy, which constrains the registration network to learn according to domain knowledge. It yields semantically aligned and topology-preserving registration, thus allowing the generation program to output new data with high spatial and style authenticity. Second, we study the diversity of the generation process in depth and propose the space-style sampling program, which introduces into the generation program the modeling of the transformation paths of style and space change between a few atlases and numerous unlabeled images. Sampling on these transformation paths therefore provides much more diverse space and style features to the generated data, effectively improving diversity. Third, we highlight, for the first time, the robustness of segmentation learning in the LRLS paradigm and propose the mix misalignment regularization, which simulates misalignment distortion and constrains the network to reduce its degree of fitting to misaligned regions. It therefore builds regularization for these regions, improving the robustness of segmentation learning. Without any bells and whistles, our approach achieves a new state-of-the-art performance in few-shot MIS on two challenging tasks, outperforming existing LRLS-based few-shot methods. We believe that this novel and effective framework will provide a powerful few-shot benchmark for the field of medical imaging and efficiently reduce the costs of medical image research. All of our code will be made publicly available online.
Collapse
|
38
|
Gao Z, Wittrup E, Najarian K. Leveraging Multi-Annotator Label Uncertainties as Privileged Information for Acute Respiratory Distress Syndrome Detection in Chest X-ray Images. Bioengineering (Basel) 2024; 11:133. [PMID: 38391619 PMCID: PMC10885868 DOI: 10.3390/bioengineering11020133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2024] [Accepted: 01/24/2024] [Indexed: 02/24/2024] Open
Abstract
Acute Respiratory Distress Syndrome (ARDS) is a life-threatening lung injury for which early diagnosis and evidence-based treatment can improve patient outcomes. Chest X-rays (CXRs) play a crucial role in the identification of ARDS; however, their interpretation can be difficult due to non-specific radiological features, uncertainty in disease staging, and inter-rater variability among clinical experts, thus leading to prominent label noise issues. To address these challenges, this study proposes a novel approach that leverages label uncertainty from multiple annotators to enhance ARDS detection in CXR images. Label uncertainty information is encoded and supplied to the model as privileged information, a form of information exclusively available during the training stage and not during inference. By incorporating the Transfer and Marginalized (TRAM) network and effective knowledge transfer mechanisms, the detection model achieved a mean testing AUROC of 0.850, an AUPRC of 0.868, and an F1 score of 0.797. After removing equivocal testing cases, the model attained an AUROC of 0.973, an AUPRC of 0.971, and an F1 score of 0.921. As a new approach to addressing label noise in medical image analysis, the proposed model has shown superiority compared to the original TRAM, Confusion Estimation, and mean-aggregated label training. The overall findings highlight the effectiveness of the proposed methods in addressing label noise in CXRs for ARDS detection, with potential for use in other medical imaging domains that encounter similar challenges.
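Editor's note: the abstract describes supplying annotator uncertainty as privileged information that is available only at training time. The sketch below shows one generic way such a setup can be wired (a privileged head plus a deployable head trained to mimic it); the layer sizes, loss weighting, and architecture are assumptions and this is not the authors' TRAM network.

```python
# Generic "learning using privileged information" sketch: a privileged head sees
# annotator-uncertainty features during training; a plain head, which is the only
# one used at inference, is trained to match the privileged head's soft output.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrivilegedClassifier(nn.Module):
    def __init__(self, feat_dim=128, priv_dim=4, n_classes=2):
        super().__init__()
        self.priv_head = nn.Linear(feat_dim + priv_dim, n_classes)   # training only
        self.plain_head = nn.Linear(feat_dim, n_classes)             # used at inference

    def forward(self, feats, priv=None):
        if priv is None:                      # inference: privileged info unavailable
            return self.plain_head(feats)
        return self.priv_head(torch.cat([feats, priv], dim=1)), self.plain_head(feats)

def training_step(model, feats, priv, labels, alpha=0.5):
    priv_logits, plain_logits = model(feats, priv)
    ce = F.cross_entropy(priv_logits, labels) + F.cross_entropy(plain_logits, labels)
    # Knowledge transfer: the plain head mimics the privileged head's soft predictions.
    kt = F.kl_div(F.log_softmax(plain_logits, 1),
                  F.softmax(priv_logits.detach(), 1), reduction="batchmean")
    return ce + alpha * kt
```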
Collapse
Affiliation(s)
- Zijun Gao
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Emily Wittrup
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Kayvan Najarian
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Michigan Institute for Data Science (MIDAS), University of Michigan, Ann Arbor, MI 48109, USA
- Department of Emergency Medicine, University of Michigan, Ann Arbor, MI 48109, USA
- Max Harry Weil Institute for Critical Care Research and Innovation, University of Michigan, Ann Arbor, MI 48109, USA
| |
Collapse
|
39
|
Tavolara TE, Niazi MKK, Feldman AL, Jaye DL, Flowers C, Cooper LAD, Gurcan MN. Translating prognostic quantification of c-MYC and BCL2 from tissue microarrays to whole slide images in diffuse large B-cell lymphoma using deep learning. Diagn Pathol 2024; 19:17. [PMID: 38243330 PMCID: PMC10797911 DOI: 10.1186/s13000-023-01425-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2023] [Accepted: 12/04/2023] [Indexed: 01/21/2024] Open
Abstract
BACKGROUND c-MYC and BCL2 positivity are important prognostic factors for diffuse large B-cell lymphoma. However, manual quantification is subject to significant intra- and inter-observer variability. We developed an automated method for quantification in whole-slide images of tissue sections where manual quantification requires evaluating large areas of tissue with possibly heterogeneous staining. We train this method using annotations of tumor positivity in smaller tissue microarray cores where expression and staining are more homogeneous and then translate this model to whole-slide images. METHODS Our method applies a technique called attention-based multiple instance learning to regress the proportion of c-MYC-positive and BCL2-positive tumor cells from pathologist-scored tissue microarray cores. This technique does not require annotation of individual cell nuclei and is trained instead on core-level annotations of percent tumor positivity. We translate this model to scoring of whole-slide images by tessellating the slide into smaller core-sized tissue regions and calculating an aggregate score. Our method was trained on a public tissue microarray dataset from Stanford and applied to whole-slide images from a geographically diverse multi-center cohort produced by the Lymphoma Epidemiology of Outcomes study. RESULTS In tissue microarrays, the automated method had Pearson correlations of 0.843 and 0.919 with pathologist scores for c-MYC and BCL2, respectively. When utilizing standard clinical thresholds, the sensitivity/specificity of our method was 0.743 / 0.963 for c-MYC and 0.938 / 0.951 for BCL2. For double-expressors, sensitivity and specificity were 0.720 and 0.974. When translated to the external WSI dataset scored by two pathologists, Pearson correlation was 0.753 & 0.883 for c-MYC and 0.749 & 0.765 for BCL2, and sensitivity/specificity was 0.857/0.991 & 0.706/0.930 for c-MYC, 0.856/0.719 & 0.855/0.690 for BCL2, and 0.890/1.00 & 0.598/0.952 for double-expressors. Survival analysis demonstrates that for progression-free survival, model-predicted TMA scores significantly stratify double-expressors and non double-expressors (p = 0.0345), whereas pathologist scores do not (p = 0.128). CONCLUSIONS We conclude that proportion of positive stains can be regressed using attention-based multiple instance learning, that these models generalize well to whole slide images, and that our models can provide non-inferior stratification of progression-free survival outcomes.
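Editor's note: the method regresses a core-level proportion of positive tumor cells from tile-level features with attention-based multiple instance learning, then scores whole slides by tessellating them into core-sized regions and aggregating. The sketch below illustrates that pooling-and-aggregation pattern with illustrative dimensions and a mean aggregate; it is a hedged stand-in, not the authors' released model.

```python
# Attention-based MIL pooling that regresses a percent positivity per bag of tiles.
import torch
import torch.nn as nn

class AttentionMILRegressor(nn.Module):
    def __init__(self, feat_dim=512, attn_dim=128):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Linear(feat_dim, attn_dim), nn.Tanh(), nn.Linear(attn_dim, 1))
        self.head = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())  # proportion in [0, 1]

    def forward(self, instances):                               # instances: (num_tiles, feat_dim)
        weights = torch.softmax(self.attn(instances), dim=0)    # attention over tiles
        bag = (weights * instances).sum(dim=0)                  # weighted bag embedding
        return self.head(bag).squeeze(-1)

# Whole-slide scoring: tessellate the WSI into core-sized regions, score each region
# with the model trained on TMA cores, then aggregate (a simple mean is assumed here).
model = AttentionMILRegressor()
regions = [torch.randn(40, 512) for _ in range(8)]              # placeholder tile features
slide_score = torch.stack([model(r) for r in regions]).mean()
```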
Collapse
Affiliation(s)
- Thomas E Tavolara
- Center for Artificial Intelligence Research, Wake Forest University School of Medicine, Winston-Salem, NC, USA.
| | - M Khalid Khan Niazi
- Center for Artificial Intelligence Research, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Andrew L Feldman
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
| | - David L Jaye
- Department of Pathology and Laboratory Medicine, Emory University School of Medicine, Atlanta, GA, USA
| | - Christopher Flowers
- Department of Lymphoma/Myeloma, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Lee A D Cooper
- Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - Metin N Gurcan
- Center for Artificial Intelligence Research, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| |
Collapse
|
40
|
Arora M, Davis CM, Mondal A, Gowda NR, Foster DG, Kamaleswaran R. Optimizing the Synergistic Potential of Pseudo-Labels from Radiology Notes and Annotated Ground Truth in Identifying Pulmonary Opacities on Chest Radiographs for Early Detection of Acute Respiratory Distress Syndrome. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2024; 2023:270-279. [PMID: 38222424 PMCID: PMC10785907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 01/16/2024]
Abstract
Acute Respiratory Distress Syndrome (ARDS) is a life-threatening lung injury, the hallmarks of which are bilateral radiographic opacities. Studies have shown that early recognition of ARDS could reduce severity and lethal clinical sequelae. A Convolutional Neural Network (CNN) model that can identify bilateral pulmonary opacities on chest x-ray (CXR) images can aid early ARDS recognition. Obtaining large datasets with ground-truth labels to train CNNs is challenging, as medical image annotation requires clinical expertise and meticulous consideration. In this work, we implement a natural language processing pipeline that extracts pseudo-labels for CXR images by parsing radiology notes for abnormal findings. We obtain ground-truth annotations from clinicians for the presence of pulmonary opacities for a subset of these images. A knowledge distillation-based teacher-student training framework is implemented to leverage the larger dataset with noisy pseudo-labels. Our results show an AUC of 0.93 (95% CI 0.92-0.94) for the prediction of bilateral opacities on chest radiographs.
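Editor's note: the abstract names a teacher-student knowledge-distillation setup over noisy pseudo-labels. The loss below is a minimal, generic binary distillation objective; the temperature, weighting, and use of sigmoid soft targets are assumptions rather than the authors' exact recipe.

```python
# Teacher-student distillation: the student is fit on clinician-labeled data with an
# added soft-target term from a teacher pretrained on the larger pseudo-labeled set.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.3):
    hard = F.binary_cross_entropy_with_logits(student_logits, labels.float())
    soft = F.binary_cross_entropy_with_logits(
        student_logits / T, torch.sigmoid(teacher_logits / T))   # soft targets from teacher
    return (1 - alpha) * hard + alpha * (T * T) * soft

# Usage: loss = distillation_loss(student(x), teacher(x).detach(), y)
```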
Collapse
Affiliation(s)
- Mehak Arora
- Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA
| | - Carolyn M Davis
- Department of Surgery, Emory University School of Medicine, Atlanta, GA
- Emory Critical Care Center, Emory University School of Medicine, Atlanta, GA
| | - Angana Mondal
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA
| | - Niraj R Gowda
- Division of Pulmonary, Critical Care, Allergy and Sleep Medicine, Emory University School of Medicine, Atlanta, GA
| | | | - Rishikesan Kamaleswaran
- Emory Critical Care Center, Emory University School of Medicine, Atlanta, GA
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA
| |
Collapse
|
41
|
Park S, Yoo HJ, Jang JS, Lee SH. Automated non-contact measurement of the spine curvature at the sagittal plane using a deep neural network. Clin Biomech (Bristol, Avon) 2024; 111:106146. [PMID: 37976690 DOI: 10.1016/j.clinbiomech.2023.106146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/24/2023] [Revised: 10/28/2023] [Accepted: 11/08/2023] [Indexed: 11/19/2023]
Abstract
BACKGROUND Non-radiographical techniques have been suggested to measure the spine curvature at the sagittal plane. However, a neural network has not been used to measure the curvature. METHODS A single video camera captured images of a standing posture at the sagittal plane from twenty healthy males. Six marker positions along the spine's contour in each image were identified for measuring inclination, thoracic kyphosis, and lumbar lordosis angles. We estimated three inflection points around the neck, hip, and between the neck and hip, followed by identifying two adjacent marker positions per inflection point to compute its tangent. The angular deviation of each tangent line from the horizontal was computed to measure inclination angles. Thoracic kyphosis and lumbar lordosis angles were computed by the angular difference between the two adjacent tangents. A deep neural network was trained with 500,000 iterations using the labeled images from 18 participants (388 and 44 images for training and test set) and then evaluated using the unseen images (2 participants, 48 images; evaluation set). FINDINGS The mean total training and test errors were <2 pixels (∼ 0.6 cm). The total error in the evaluation set was qualitatively comparable (∼ 3 pixels = ∼ 0.9 cm), suggesting the model performance was maintained in the unseen data. The angle values between labeled and network-predicted marker positions were similar in the evaluation set. INTERPRETATION The network training with a relatively small number of images was successful based on the small error values observed in the evaluation set. The model may be an affordable, automated, and non-contact measurement tool for the human spine curvature.
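Editor's note: the angle definitions above (inclination as the angular deviation of a tangent from the horizontal; kyphosis and lordosis as differences between adjacent tangents) reduce to simple trigonometry on marker coordinates. The worked example below uses made-up pixel coordinates purely to illustrate the computation.

```python
# Inclination of a tangent line through two adjacent markers, and a curvature angle
# as the difference between two adjacent tangents' inclinations.
import numpy as np

def inclination_deg(p1, p2):
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    return np.degrees(np.arctan2(dy, dx))

neck_tangent = inclination_deg((100, 420), (112, 360))   # hypothetical pixel coordinates
mid_tangent = inclination_deg((118, 300), (122, 240))
thoracic_kyphosis = abs(neck_tangent - mid_tangent)      # angle between adjacent tangents
```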
Collapse
Affiliation(s)
- Sangsoo Park
- School of Global Sport Studies, Korea University Sejong Campus, Sejong City 30019, South Korea.
| | - Hyun-Joon Yoo
- Korea University Research Institute for Medical Bigdata Science, Korea University, Goryeodae-ro 73, Seongbuk-gu, Seoul 02841, South Korea
| | - Jin Su Jang
- Human Behavior & Genetic Institute, Associate Research Center, Korea University, Goryeodae-ro 73, Seongbuk-gu, Seoul 02841, South Korea
| | - Sang-Heon Lee
- Department of Physical Medicine and Rehabilitation, Korea University Anam Hospital, Korea University College of Medicine, Goryeodae-ro 73, Seongbuk-gu, Seoul 02841, South Korea
| |
Collapse
|
42
|
Li X, Wu Q, Wang M, Wu K. Uncertainty-aware network for fine-grained and imbalanced reflux esophagitis grading. Comput Biol Med 2024; 168:107751. [PMID: 38016373 DOI: 10.1016/j.compbiomed.2023.107751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2023] [Revised: 10/22/2023] [Accepted: 11/20/2023] [Indexed: 11/30/2023]
Abstract
Computer-aided diagnosis (CAD) assists endoscopists in analyzing endoscopic images, reducing misdiagnosis rates and enabling timely treatment. A few studies have focused on CAD for gastroesophageal reflux disease, but CAD studies on reflux esophagitis (RE) are still inadequate. This paper presents a CAD study on RE using a dataset collected from hospital, comprising over 3000 images. We propose an uncertainty-aware network with handcrafted features, utilizing representation and classifier decoupling with metric learning to address class imbalance and achieve fine-grained RE classification. To enhance interpretability, the network estimates uncertainty through test time augmentation. The experimental results demonstrate that the proposed network surpasses previous methods, achieving an accuracy of 90.2% and an F1 score of 90.1%.
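Editor's note: the network estimates uncertainty through test-time augmentation. The sketch below shows the general pattern of averaging predictions over augmented copies and summarizing their spread; the specific augmentations and the predictive-entropy summary are assumptions, not the paper's exact recipe.

```python
# Test-time augmentation (TTA): classify an image under several augmentations,
# average the class probabilities, and report the predictive entropy as uncertainty.
import torch

@torch.no_grad()
def tta_predict(model, image, augmentations):
    probs = torch.stack([torch.softmax(model(aug(image)), dim=1) for aug in augmentations])
    mean_prob = probs.mean(dim=0)
    entropy = -(mean_prob * mean_prob.clamp_min(1e-8).log()).sum(dim=1)
    return mean_prob, entropy

augs = [lambda x: x, lambda x: torch.flip(x, dims=[-1])]   # identity and horizontal flip
# Usage: mean_prob, uncertainty = tta_predict(classifier, batch, augs)
```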
Collapse
Affiliation(s)
- Xingcun Li
- School of Management, Huazhong University of Science and Technology, Wuhan, 430074, China.
| | - Qinghua Wu
- School of Management, Huazhong University of Science and Technology, Wuhan, 430074, China.
| | - Mi Wang
- Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China.
| | - Kun Wu
- Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
| |
Collapse
|
43
|
Zhan J, Chen C, Zhang N, Zhong S, Wang J, Hu J, Liu J. An artificial intelligence model for embryo selection in preimplantation DNA methylation screening in assisted reproductive technology. BIOPHYSICS REPORTS 2023; 9:352-361. [PMID: 38524697 PMCID: PMC10960573 DOI: 10.52601/bpr.2023.230035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2023] [Accepted: 11/28/2023] [Indexed: 03/26/2024] Open
Abstract
Embryo quality is a critical determinant of clinical outcomes in assisted reproductive technology (ART). A recent clinical trial investigating preimplantation DNA methylation screening (PIMS) revealed that whole genome DNA methylation level is a novel biomarker for assessing ART embryo quality. Here, we reinforced and estimated the clinical efficacy of PIMS. We introduce PIMS-AI, an innovative artificial intelligence (AI) based model, to predict the probability of an embryo producing live birth and subsequently assist ART embryo selection. Our model demonstrated robust performance, achieving an area under the curve (AUC) of 0.90 in cross-validation and 0.80 in independent testing. In simulated embryo selection, PIMS-AI attained an accuracy of 81% in identifying viable embryos for patients. Notably, PIMS-AI offers significant advantages over conventional preimplantation genetic testing for aneuploidy (PGT-A), including enhanced embryo discriminability and the potential to benefit a broader patient population. In conclusion, our approach holds substantial promise for clinical application and has the potential to significantly improve the ART success rate.
Collapse
Affiliation(s)
- Jianhong Zhan
- Institute of Biophysics, Chinese Academy of Science, Beijing 100101, China
| | - Chuangqi Chen
- Guangdong Women's and Children's Hospital, Guangzhou 511400, China
| | - Na Zhang
- Beijing Obstetrics and Gynecology Hospital, Capital Medical University, Beijing Maternal and Child Health Care Hospital, Beijing 100026, China
| | | | - Jiaming Wang
- Institute of Biophysics, Chinese Academy of Science, Beijing 100101, China
- University of the Chinese Academy of Science, Beijing 101408, China
- School of Future Technology, University of the Chinese Academy of Science, Beijing 100049, China
| | - Jinzhou Hu
- Institute of Biophysics, Chinese Academy of Science, Beijing 100101, China
- University of the Chinese Academy of Science, Beijing 101408, China
| | - Jiang Liu
- Institute of Biophysics, Chinese Academy of Science, Beijing 100101, China
- University of the Chinese Academy of Science, Beijing 101408, China
- School of Future Technology, University of the Chinese Academy of Science, Beijing 100049, China
| |
Collapse
|
44
|
Kang DW, Park GH, Ryu WS, Schellingerhout D, Kim M, Kim YS, Park CY, Lee KJ, Han MK, Jeong HG, Kim DE. Strengthening deep-learning models for intracranial hemorrhage detection: strongly annotated computed tomography images and model ensembles. Front Neurol 2023; 14:1321964. [PMID: 38221995 PMCID: PMC10784380 DOI: 10.3389/fneur.2023.1321964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2023] [Accepted: 12/11/2023] [Indexed: 01/16/2024] Open
Abstract
Background and purpose Multiple attempts at intracranial hemorrhage (ICH) detection using deep-learning techniques have been plagued by clinical failures. We aimed to compare the performance of a deep-learning algorithm for ICH detection trained on strongly versus weakly annotated datasets, and to assess whether a weighted ensemble model that integrates separate models trained on datasets with different ICH subtypes improves performance. Methods We used brain CT scans from the Radiological Society of North America (27,861 CT scans, 3,528 ICHs) and AI-Hub (53,045 CT scans, 7,013 ICHs) for training. DenseNet121, InceptionResNetV2, MobileNetV2, and VGG19 were trained on strongly and weakly annotated datasets and compared using independent external test datasets. We then developed a weighted ensemble model combining separate models trained on all ICH, subdural hemorrhage (SDH), subarachnoid hemorrhage (SAH), and small-lesion ICH cases. The final weighted ensemble model was compared to four well-known deep-learning models. After external testing, six neurologists reviewed 91 ICH cases that were difficult for both AI and humans. Results The InceptionResNetV2, MobileNetV2, and VGG19 models performed better when trained on strongly annotated datasets. A weighted ensemble model combining models trained on SDH, SAH, and small-lesion ICH had a higher AUC than a model trained on all ICH cases only. This model outperformed four deep-learning models (AUC [95% C.I.]: ensemble model, 0.953 [0.938-0.965]; InceptionResNetV2, 0.852 [0.828-0.873]; DenseNet121, 0.875 [0.852-0.895]; VGG19, 0.796 [0.770-0.821]; MobileNetV2, 0.650 [0.620-0.680]; p < 0.0001). In addition, the case review showed that a better understanding and management of difficult cases may facilitate the clinical use of ICH detection algorithms. Conclusion We propose a weighted ensemble model for ICH detection, trained on large-scale, strongly annotated CT scans, as no single model can capture all aspects of complex tasks.
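Editor's note: the weighted-ensemble idea combines per-subtype detectors into one case-level score. The sketch below uses a simple weighted mean with made-up probabilities and weights; the paper's exact combination rule and weight selection may differ.

```python
# Weighted ensemble of per-subtype ICH detectors: each model outputs a probability
# per case, and the outputs are combined with normalized weights.
import numpy as np

def weighted_ensemble(prob_matrix: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """prob_matrix: (n_models, n_cases) probabilities; weights are normalized to sum to 1."""
    weights = weights / weights.sum()
    return weights @ prob_matrix

probs = np.array([[0.91, 0.12, 0.55],     # all-ICH model
                  [0.80, 0.05, 0.72],     # SDH-specialised model
                  [0.88, 0.20, 0.40],     # SAH-specialised model
                  [0.75, 0.10, 0.66]])    # small-lesion model
ensemble_scores = weighted_ensemble(probs, np.array([0.4, 0.2, 0.2, 0.2]))
```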
Collapse
Affiliation(s)
- Dong-Wan Kang
- Department of Public Health, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
- Department of Neurology, Gyeonggi Provincial Medical Center, Icheon Hospital, Icheon, Republic of Korea
- Department of Neurology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
| | - Gi-Hun Park
- JLK Inc., Artificial Intelligence Research Center, Seoul, Republic of Korea
| | - Wi-Sun Ryu
- JLK Inc., Artificial Intelligence Research Center, Seoul, Republic of Korea
| | - Dawid Schellingerhout
- Department of Neuroradiology and Imaging Physics, The University of Texas M.D. Anderson Cancer Center, Houston, TX, United States
| | - Museong Kim
- Department of Neurosurgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Hospital Medicine Center, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
| | - Yong Soo Kim
- Department of Neurology, Nowon Eulji Medical Center, Eulji University School of Medicine, Seoul, Republic of Korea
| | - Chan-Young Park
- Department of Neurology, Chung-Ang University Hospital, Seoul, Republic of Korea
| | - Keon-Joo Lee
- Department of Neurology, Korea University Guro Hospital, Seoul, Republic of Korea
| | - Moon-Ku Han
- Department of Neurology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
| | - Han-Gil Jeong
- Department of Neurology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Department of Neurosurgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
| | - Dong-Eog Kim
- Department of Neurology, Dongguk University Ilsan Hospital, Goyang, Republic of Korea
- National Priority Research Center for Stroke, Goyang, Republic of Korea
| |
Collapse
|
45
|
Hasanah U, Avian C, Darmawan JT, Bachroin N, Faisal M, Prakosa SW, Leu JS, Tsai CT. CheXNet and feature pyramid network: a fusion deep learning architecture for multilabel chest X-Ray clinical diagnoses classification. THE INTERNATIONAL JOURNAL OF CARDIOVASCULAR IMAGING 2023:10.1007/s10554-023-03039-x. [PMID: 38150139 DOI: 10.1007/s10554-023-03039-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/29/2023] [Accepted: 12/18/2023] [Indexed: 12/28/2023]
Abstract
Multilabel chest X-ray learning tasks generally contain rich information on pathology co-occurrence and interdependency, which is very important for clinical diagnosis. The challenge, however, is to accurately diagnose multiple diseases appearing in a single X-ray image, since features arise at multiple levels in the image and differ from those encountered in single-label detection. Various deep learning architectures have been proposed to address this challenge, improving classification performance and enriching diagnosis with multi-probability disease detection. The objective is an accurate and fast inference system that supports rapid diagnosis in clinical settings. To contribute to this state of the art, we designed a fusion architecture combining CheXNet and a Feature Pyramid Network (FPN) to classify and discriminate multiple thoracic diseases in chest X-rays. This design lets the model build a pyramid of feature maps at different spatial resolutions, capturing both low-level and high-level semantic information to handle multiple co-occurring features. The model's effectiveness is evaluated on the NIH ChestXray14 dataset, with the Area Under the Curve (AUC) and accuracy used to compare the results against other cutting-edge approaches. The overall results demonstrate that our method outperforms other approaches and is promising for multilabel disease classification in chest X-rays, with potential applications in clinical practice: we achieved an average AUC of 0.846 and an accuracy of 0.914. Further, the proposed architecture diagnoses an image in 0.013 s, faster than the latest approaches.
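Editor's note: the fusion idea is to take multi-scale feature maps from a CNN backbone, merge them top-down in a feature pyramid, and feed the pooled result to a multilabel (sigmoid) head. The self-contained sketch below uses a toy list of feature maps, assumed channel sizes, and a 14-class head as stand-ins for the CheXNet/DenseNet121 backbone; it is not the authors' architecture.

```python
# Small FPN-style fusion over multi-scale feature maps followed by a multilabel head
# whose logits are meant for BCEWithLogitsLoss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFPNClassifier(nn.Module):
    def __init__(self, in_channels=(64, 128, 256), fpn_dim=128, n_labels=14):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, fpn_dim, 1) for c in in_channels)
        self.smooth = nn.ModuleList(nn.Conv2d(fpn_dim, fpn_dim, 3, padding=1)
                                    for _ in in_channels)
        self.head = nn.Linear(fpn_dim * len(in_channels), n_labels)

    def forward(self, feats):                           # feats: list of maps, coarsest last
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        for i in range(len(laterals) - 1, 0, -1):       # top-down pathway
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], size=laterals[i - 1].shape[-2:], mode="nearest")
        pooled = [s(p).mean(dim=(-2, -1)) for s, p in zip(self.smooth, laterals)]
        return self.head(torch.cat(pooled, dim=1))

feats = [torch.randn(2, 64, 56, 56), torch.randn(2, 128, 28, 28), torch.randn(2, 256, 14, 14)]
logits = TinyFPNClassifier()(feats)                     # (2, 14) multilabel logits
```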
Collapse
Affiliation(s)
- Uswatun Hasanah
- Department of Electronic and Computer Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
| | - Cries Avian
- Department of Electronic and Computer Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
| | | | - Nabil Bachroin
- Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
| | - Muhamad Faisal
- Department of Electronic and Computer Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
| | - Setya Widyawan Prakosa
- Department of Electronic and Computer Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
| | - Jenq-Shiou Leu
- Department of Electronic and Computer Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan.
| | - Chia-Ti Tsai
- Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan
| |
Collapse
|
46
|
YOUSEF M, ALLMER J. Deep learning in bioinformatics. Turk J Biol 2023; 47:366-382. [PMID: 38681776 PMCID: PMC11045206 DOI: 10.55730/1300-0152.2671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2023] [Revised: 12/28/2023] [Accepted: 12/18/2023] [Indexed: 05/01/2024] Open
Abstract
Deep learning is a powerful machine learning technique that can learn from large amounts of data using multiple layers of artificial neural networks. This paper reviews some applications of deep learning in bioinformatics, a field that deals with analyzing and interpreting biological data. We first introduce the basic concepts of deep learning and then survey the recent advances and challenges of applying deep learning to various bioinformatics problems, such as genome sequencing, gene expression analysis, protein structure prediction, drug discovery, and disease diagnosis. We also discuss future directions and opportunities for deep learning in bioinformatics. We aim to provide an overview of deep learning so that bioinformaticians applying deep learning models can consider all critical technical and ethical aspects. Thus, our target audience is biomedical informatics researchers who use deep learning models for inference. This review will inspire more bioinformatics researchers to adopt deep-learning methods for their research questions while considering fairness, potential biases, explainability, and accountability.
Collapse
Affiliation(s)
- Malik YOUSEF
- Department of Information Systems, Zefat Academic College, Zefat, Israel

| | - Jens ALLMER
- Medical Informatics and Bioinformatics, Institute for Measurement Engineering and Sensor Technology, Hochschule Ruhr West, University of Applied Sciences, Mülheim an der Ruhr, Germany
| |
Collapse
|
47
|
Ahmed SR, Befano B, Lemay A, Egemen D, Rodriguez AC, Angara S, Desai K, Jeronimo J, Antani S, Campos N, Inturrisi F, Perkins R, Kreimer A, Wentzensen N, Herrero R, Del Pino M, Quint W, de Sanjose S, Schiffman M, Kalpathy-Cramer J. Reproducible and clinically translatable deep neural networks for cervical screening. Sci Rep 2023; 13:21772. [PMID: 38066031 PMCID: PMC10709439 DOI: 10.1038/s41598-023-48721-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2023] [Accepted: 11/29/2023] [Indexed: 12/18/2023] Open
Abstract
Cervical cancer is a leading cause of cancer mortality, with approximately 90% of the 250,000 deaths per year occurring in low- and middle-income countries (LMIC). Secondary prevention with cervical screening involves detecting and treating precursor lesions; however, scaling screening efforts in LMIC has been hampered by infrastructure and cost constraints. Recent work has supported the development of an artificial intelligence (AI) pipeline on digital images of the cervix to achieve an accurate and reliable diagnosis of treatable precancerous lesions. In particular, WHO guidelines emphasize visual triage of women testing positive for human papillomavirus (HPV) as the primary screen, and AI could assist in this triage task. In this work, we implemented a comprehensive deep-learning model selection and optimization study on a large, collated, multi-geography, multi-institution, and multi-device dataset of 9462 women (17,013 images). We evaluated relative portability, repeatability, and classification performance. The top-performing model, when combined with HPV type, achieved an area under the Receiver Operating Characteristic (ROC) curve (AUC) of 0.89 within our study population of interest, and a limited total extreme misclassification rate of 3.4%, on held-aside test sets. Our model also produced reliable and consistent predictions, achieving a strong quadratic weighted kappa (QWK) of 0.86 and a minimal 2-class disagreement (% 2-Cl. D.) of 0.69% between image pairs across women. Our work is among the first efforts at designing a robust, repeatable, accurate, and clinically translatable deep-learning model for cervical screening.
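Editor's note: the repeatability statistic reported above, the quadratic weighted kappa (QWK), measures agreement between predictions on paired images of the same women. The snippet below computes it with scikit-learn on made-up paired predictions, purely to illustrate the metric.

```python
# Quadratic weighted kappa between predictions on two images of the same women.
from sklearn.metrics import cohen_kappa_score

pred_image_a = [0, 1, 2, 1, 0, 2, 1, 1]   # placeholder class predictions
pred_image_b = [0, 1, 2, 2, 0, 2, 1, 1]
qwk = cohen_kappa_score(pred_image_a, pred_image_b, weights="quadratic")
print(f"QWK = {qwk:.2f}")
```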
Collapse
Affiliation(s)
- Syed Rakin Ahmed
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Boston, MA, 02129, USA.
- Harvard Graduate Program in Biophysics, Harvard Medical School, Harvard University, Cambridge, MA, 02115, USA.
- Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
- Geisel School of Medicine at Dartmouth, Dartmouth College, Hanover, NH, 03755, USA.
| | - Brian Befano
- Information Management Services, Calverton, MD, 20705, USA
- University of Washington, Seattle, WA, 98195, USA
| | - Andreanne Lemay
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Boston, MA, 02129, USA
- NeuroPoly, Polytechnique Montreal, Montreal, QC, H3T 1N8, Canada
| | - Didem Egemen
- Clinical Epidemiology Unit, Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Ana Cecilia Rodriguez
- Clinical Epidemiology Unit, Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Sandeep Angara
- Computational Health Research Branch, National Library of Medicine, Lister Hill Center, Bethesda, MD, 20894, USA
| | - Kanan Desai
- Clinical Epidemiology Unit, Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Jose Jeronimo
- Clinical Epidemiology Unit, Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Sameer Antani
- Computational Health Research Branch, National Library of Medicine, Lister Hill Center, Bethesda, MD, 20894, USA
| | - Nicole Campos
- Department of Health Policy and Management, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
| | - Federica Inturrisi
- Clinical Epidemiology Unit, Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Rebecca Perkins
- Department of Obstetrics & Gynecology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, 02118, USA
| | - Aimee Kreimer
- Clinical Epidemiology Unit, Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Nicolas Wentzensen
- Clinical Epidemiology Unit, Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Rolando Herrero
- Agencia Costarricense de Investigaciones Biomedicas (ACIB), Fundacion INCIENSA, San Jose, Costa Rica
| | | | - Wim Quint
- DDL Diagnostic Laboratory, Rijswijk, The Netherlands
| | - Silvia de Sanjose
- Clinical Epidemiology Unit, Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- ISGlobal, Barcelona, Spain
| | - Mark Schiffman
- Clinical Epidemiology Unit, Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Jayashree Kalpathy-Cramer
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Boston, MA, 02129, USA
- Department of Ophthalmology, University of Colorado Anschutz, Denver, CO, 80045, USA
| |
Collapse
|
48
|
Chen Z, Li W, Xing X, Yuan Y. Medical federated learning with joint graph purification for noisy label learning. Med Image Anal 2023; 90:102976. [PMID: 37806019 DOI: 10.1016/j.media.2023.102976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2022] [Revised: 02/08/2023] [Accepted: 09/18/2023] [Indexed: 10/10/2023]
Abstract
With growing privacy concerns, Federated Learning (FL) has received extensive attention in medical imaging. Through collaborative training, FL can produce superior diagnostic models with global knowledge while preserving private data locally. In practice, medical diagnosis suffers from intra-/inter-observer variability, so label noise is inevitable in dataset preparation. Unlike existing studies on centralized datasets, the label noise problem in FL scenarios faces additional challenges due to data inaccessibility and even noise heterogeneity. In this work, we propose a federated framework with joint Graph Purification (FedGP) to address label noise in FL through server-client collaboration. Specifically, to overcome the impact of label noise on local training, we first devise a noisy-graph purification on the client side that generates reliable pseudo labels by progressively expanding the purified graph with topological knowledge. We then propose a graph-guided negative ensemble loss to exploit the topology of the client-side purified graph with robust complementary supervision against label noise. Moreover, to address FL label noise across data silos, we propose global centroid aggregation on the server side to produce a robust classifier with global knowledge, which can be optimized collaboratively in the FL framework. Extensive experiments are conducted on endoscopic and pathological images, comparing methods under homogeneous, heterogeneous, and real-world label noise for medical FL. Across these diverse noisy FL settings, our FedGP framework outperforms state-of-the-art denoising and noisy-label FL methods by a large margin. The source code is available at https://github.com/CUHK-AIM-Group/FedGP.
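Editor's note: the server-side step pools per-class feature centroids reported by clients into global centroids. The sketch below shows one simple way such an aggregation can be weighted by client sample counts; the weighting scheme and shapes are assumptions, not the FedGP code.

```python
# Server-side aggregation of per-class centroids across federated clients.
import numpy as np

def aggregate_centroids(client_centroids, client_counts):
    """client_centroids: list of (n_classes, dim) arrays; client_counts: list of (n_classes,) arrays."""
    centroids = np.stack(client_centroids)           # (n_clients, n_classes, dim)
    counts = np.stack(client_counts).astype(float)   # (n_clients, n_classes)
    weights = counts / counts.sum(axis=0, keepdims=True)
    return (weights[..., None] * centroids).sum(axis=0)   # (n_classes, dim) global centroids

global_centroids = aggregate_centroids(
    [np.random.randn(3, 16), np.random.randn(3, 16)],      # two clients, 3 classes, dim 16
    [np.ones(3) * 50, np.ones(3) * 20])                     # per-class sample counts
```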
Collapse
Affiliation(s)
- Zhen Chen
- Centre for Artificial Intelligence and Robotics (CAIR), Hong Kong Institute of Science & Innovation, Chinese Academy of Sciences, Hong Kong Special Administrative Region of China
| | - Wuyang Li
- Department of Electrical Engineering, City University of Hong Kong, Hong Kong Special Administrative Region of China
| | - Xiaohan Xing
- Department of Electrical Engineering, City University of Hong Kong, Hong Kong Special Administrative Region of China; Department of Radiation Oncology, Stanford University, CA, USA
| | - Yixuan Yuan
- Department of Electronic Engineering, Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
| |
Collapse
|
49
|
Ostmeier S, Axelrod B, Isensee F, Bertels J, Mlynash M, Christensen S, Lansberg MG, Albers GW, Sheth R, Verhaaren BFJ, Mahammedi A, Li LJ, Zaharchuk G, Heit JJ. USE-Evaluator: Performance metrics for medical image segmentation models supervised by uncertain, small or empty reference annotations in neuroimaging. Med Image Anal 2023; 90:102927. [PMID: 37672900 DOI: 10.1016/j.media.2023.102927] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2022] [Revised: 07/08/2023] [Accepted: 08/03/2023] [Indexed: 09/08/2023]
Abstract
Performance metrics for medical image segmentation models are used to measure the agreement between the reference annotation and the predicted segmentation. Usually, overlap metrics such as the Dice score are used to evaluate these models so that results are comparable. However, there is a mismatch between the distribution of cases and the difficulty of segmentation tasks in public datasets compared to clinical practice. Common metrics used to assess performance fail to capture the impact of this mismatch, particularly for clinical datasets that involve challenging segmentation tasks, pathologies with low signal, and reference annotations that are uncertain, small, or empty. These limitations of common metrics may result in ineffective machine learning research on designing and optimizing models. To effectively evaluate the clinical value of such models, it is essential to consider factors such as the uncertainty associated with reference annotations, the ability to measure performance accurately regardless of the reference annotation volume, and the classification of cases where reference annotations are empty. We study how uncertain, small, and empty reference annotations influence metric values on an in-house stroke dataset, regardless of the model. We examine metric behavior on the predictions of a standard deep learning framework to identify suitable metrics for such a setting. We compare our results to the BRATS 2019 and Spinal Cord public datasets. We show how uncertain, small, or empty reference annotations require a rethinking of the evaluation. The evaluation code was released to encourage further analysis of this topic: https://github.com/SophieOstmeier/UncertainSmallEmpty.git.
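Editor's note: a short worked example of why overlap metrics struggle with small or empty reference annotations. The Dice score is undefined when both reference and prediction are empty, and it collapses to 0 for a one-voxel offset on a one-voxel lesion; the empty-case convention chosen below is one common choice, not necessarily the one used in the paper's evaluator.

```python
# Dice score with an explicit convention for the "both empty" case.
import numpy as np

def dice(reference: np.ndarray, prediction: np.ndarray, empty_value: float = 1.0) -> float:
    ref, pred = reference.astype(bool), prediction.astype(bool)
    if not ref.any() and not pred.any():
        return empty_value            # both empty: define as perfect agreement (a convention)
    return 2.0 * np.logical_and(ref, pred).sum() / (ref.sum() + pred.sum())

empty = np.zeros((4, 4), dtype=int)
print(dice(empty, empty))             # 1.0 by the chosen convention
tiny_ref = np.zeros((4, 4), dtype=int); tiny_ref[0, 0] = 1
tiny_pred = np.zeros((4, 4), dtype=int); tiny_pred[0, 1] = 1
print(dice(tiny_ref, tiny_pred))      # 0.0 despite only a one-voxel offset
```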
Collapse
Affiliation(s)
- Sophie Ostmeier
- Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America.
| | - Brian Axelrod
- Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
| | - Fabian Isensee
- Division of Medical Image Computing, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
| | | | - Michael Mlynash
- Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
| | | | - Maarten G Lansberg
- Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
| | - Gregory W Albers
- Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
| | | | | | - Abdelkader Mahammedi
- Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
| | - Li-Jia Li
- Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
| | - Greg Zaharchuk
- Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
| | - Jeremy J Heit
- Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
| |
Collapse
|
50
|
Nguyen TV, Diakiw SM, VerMilyea MD, Dinsmore AW, Perugini M, Perugini D, Hall JMM. Efficient automated error detection in medical data using deep-learning and label-clustering. Sci Rep 2023; 13:19587. [PMID: 37949906 PMCID: PMC10638377 DOI: 10.1038/s41598-023-45946-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2023] [Accepted: 10/26/2023] [Indexed: 11/12/2023] Open
Abstract
Medical datasets inherently contain errors from subjective or inaccurate test results, or from confounding biological complexities. It is difficult for medical experts to detect these elusive errors manually, due to the lack of contextual information, restrictive data privacy regulations, and the sheer scale of data to be reviewed. Current methods for training robust artificial intelligence (AI) models on data containing mislabeled examples generally fall into one of several categories: improving the robustness of the model architecture, the regularization techniques used, or the loss function used during training, or selecting a subset of data that contains cleaner labels. This last category requires the ability to efficiently detect errors either prior to or during training, and either relabel them or remove them completely. More recent progress in error detection has focused on multi-network learning to minimize the deleterious effects of errors on training; however, using many neural networks to reach a consensus on which data should be removed can be computationally intensive and inefficient. In this work, a deep-learning-based algorithm was used in conjunction with a label-clustering approach to automate error detection. For datasets with synthetic label flips added, these errors were identified with an accuracy of up to 85%, while requiring up to 93% fewer computing resources than a previously developed model-consensus approach. The resulting trained AI models exhibited greater training stability and up to a 45% improvement in accuracy (from 69% to over 99%) compared to the consensus approach, at least a 10% improvement over using noise-robust loss functions in a binary classification problem, and a 51% improvement for multi-class classification. These results indicate that practical, automated, a priori detection of errors in medical data is possible without human oversight.
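Editor's note: the general idea of pairing a trained model with label clustering can be illustrated by grouping samples by the model's predicted probabilities and flagging examples whose given label disagrees with their cluster's majority label. The sketch below is a simplified stand-in under those assumptions, not the authors' algorithm.

```python
# Flag suspect annotations by clustering model outputs and checking each sample's
# given label against its cluster's majority label.
import numpy as np
from sklearn.cluster import KMeans

def flag_suspect_labels(pred_probs: np.ndarray, labels: np.ndarray, n_clusters: int) -> np.ndarray:
    clusters = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(pred_probs)
    flags = np.zeros(len(labels), dtype=bool)
    for c in np.unique(clusters):
        members = clusters == c
        majority = np.bincount(labels[members]).argmax()
        flags[members] = labels[members] != majority   # disagree with cluster majority
    return flags

probs = np.array([[0.9, 0.1], [0.85, 0.15], [0.2, 0.8], [0.1, 0.9], [0.88, 0.12]])
given = np.array([0, 0, 1, 1, 1])                      # the last label looks flipped
print(flag_suspect_labels(probs, given, n_clusters=2))
```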
Collapse
Affiliation(s)
- T V Nguyen
- Presagen, Adelaide, SA, 5000, Australia.
- School of Computing and Information Technology, University of Wollongong, Wollongong, NSW, 2522, Australia.
| | | | - M D VerMilyea
- Ovation Fertility, Austin, TX, 78731, USA
- Texas Fertility Center, Austin, TX, 78731, USA
| | - A W Dinsmore
- California Fertility Partners, Los Angeles, CA, 90025, USA
| | - M Perugini
- Presagen, Adelaide, SA, 5000, Australia
- Adelaide Medical School, The University of Adelaide, Adelaide, SA, 5000, Australia
| | | | - J M M Hall
- Presagen, Adelaide, SA, 5000, Australia
- Australian Research Council Centre of Excellence for Nanoscale BioPhotonics, Adelaide, SA, 5005, Australia
- School of Physical Sciences, The University of Adelaide, Adelaide, SA, 5005, Australia
| |
Collapse
|