1. Zhang R, Wang L, Cheng S, Song S. MLP-based classification of COVID-19 and skin diseases. Expert Syst Appl 2023; 228:120389. [PMID: 37193247] [DOI: 10.1016/j.eswa.2023.120389]
Abstract
Recent years have witnessed growing interest in neural network-based medical image classification methods, which have demonstrated remarkable performance in this field. Convolutional neural network (CNN) architectures have typically been employed to extract local features, while the transformer, a newly emerged architecture, has gained popularity for its ability to model the relevance of remote elements in an image through a self-attention mechanism. However, improving image classification accuracy requires establishing not only local connectivity but also remote relationships between lesion features, and capturing the overall image structure. To tackle these issues, this paper proposes a network based on multilayer perceptrons (MLPs) that learns the local features of medical images while also capturing overall feature information in both the spatial and channel dimensions, thus utilizing image features effectively. The method was extensively validated on the COVID19-CT and ISIC 2018 datasets, and the results show that it is more competitive and achieves higher performance in medical image classification than existing methods. This suggests that using MLPs to capture image features and establish connections between lesions can provide novel ideas for future medical image classification tasks.
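For readers unfamiliar with spatial- and channel-wise MLP mixing, the sketch below illustrates the general idea in PyTorch: one MLP mixes information across patch positions, another across channels. This is a minimal MLP-Mixer-style block, not the authors' architecture; all layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    """Mixes information across spatial patches, then across channels."""
    def __init__(self, num_patches: int, dim: int, hidden: int = 256):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.spatial_mlp = nn.Sequential(   # acts on the patch axis
            nn.Linear(num_patches, hidden), nn.GELU(), nn.Linear(hidden, num_patches)
        )
        self.norm2 = nn.LayerNorm(dim)
        self.channel_mlp = nn.Sequential(   # acts on the channel axis
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, x):                   # x: (batch, patches, dim)
        y = self.norm1(x).transpose(1, 2)   # (batch, dim, patches)
        x = x + self.spatial_mlp(y).transpose(1, 2)
        return x + self.channel_mlp(self.norm2(x))

x = torch.randn(2, 196, 128)               # e.g. 14x14 patches, 128 channels
print(MixerBlock(196, 128)(x).shape)       # torch.Size([2, 196, 128])
```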
Affiliation(s)
- Ruize Zhang
- College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, Xinjiang, China
- Liejun Wang
- College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, Xinjiang, China
- Shuli Cheng
- College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, Xinjiang, China
- Shiji Song
- Department of Automation, Tsinghua University, Beijing, 100084, China
2. Du Y, Hu L, Wu G, Tang Y, Cai X, Yin L. Diagnoses in multiple types of cancer based on serum Raman spectroscopy combined with a convolutional neural network: Gastric cancer, colon cancer, rectal cancer, lung cancer. Spectrochim Acta A Mol Biomol Spectrosc 2023; 298:122743. [PMID: 37119637] [DOI: 10.1016/j.saa.2023.122743]
Abstract
Cancer is one of the major diseases that seriously threaten human health. Timely screening is beneficial to curing cancer, and current diagnosis methods have shortcomings, so finding a low-cost, fast, and nondestructive cancer screening technology is very important. In this study, we demonstrate that serum Raman spectroscopy combined with a convolutional neural network model can be used for the diagnosis of four types of cancer: gastric cancer, colon cancer, rectal cancer, and lung cancer. A Raman spectra database containing the four types of cancer and healthy controls was established, and a one-dimensional convolutional neural network (1D-CNN) was constructed. The classification accuracy of the Raman spectra combined with the 1D-CNN model was 94.5%. A convolutional neural network (CNN) is often regarded as a black box whose learning mechanism is unclear, so we also visualized the CNN features of each convolutional layer in the diagnosis of rectal cancer. Overall, Raman spectroscopy combined with a CNN model is an effective tool for distinguishing different cancers from healthy controls.
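The abstract does not give the 1D-CNN configuration, but a minimal spectral classifier of this kind might look as follows in PyTorch; the layer sizes and the five-class output (four cancers plus healthy controls) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class Spectrum1DCNN(nn.Module):
    """Hypothetical 1D-CNN for classifying serum Raman spectra; layer
    sizes are assumptions, not the authors' published configuration."""
    def __init__(self, n_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),       # works for any spectrum length
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):                  # x: (batch, 1, n_points)
        return self.classifier(self.features(x).squeeze(-1))

spectra = torch.randn(8, 1, 1000)          # 8 spectra, 1000 Raman-shift points
print(Spectrum1DCNN()(spectra).shape)      # torch.Size([8, 5])
```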
Affiliation(s)
- Yu Du
- School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
- Lin Hu
- Department of Laboratory Medicine, The First Affiliated Hospital of Chongqing Medical University, 400016 Chongqing, China
- Guohua Wu
- School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
- Yishu Tang
- Department of Laboratory Medicine, The First Affiliated Hospital of Chongqing Medical University, 400016 Chongqing, China
- Xiongwei Cai
- Department of Gynecology, Chongqing Health Center for Women and Children, Women and Children's Hospital of Chongqing Medical University, 400016 Chongqing, China
- Longfei Yin
- School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
3. Zhang H, Pan Y, Liu X, Chen Y, Gong X, Zhu J, Yan J, Zhang H. Recognition of the rhizome of red ginseng based on spectral-image dual-scale digital information combined with intelligent algorithms. Spectrochim Acta A Mol Biomol Spectrosc 2023; 297:122742. [PMID: 37098315] [DOI: 10.1016/j.saa.2023.122742]
Abstract
Red ginseng, derived from steamed fresh ginseng, is a widely used and extensively researched food and medicinal product with high nutritional value. The components in various parts of red ginseng differ significantly, resulting in distinct pharmacological activities and efficacies. This study established a hyperspectral imaging technology combined with intelligent algorithms for recognizing different parts of red ginseng based on dual-scale spectral and image information. First, the spectral information was processed with the best-performing combination: the first derivative as the pre-processing method and partial least squares discriminant analysis (PLS-DA) as the classification model. The recognition accuracies for the rhizome and the main root of red ginseng were 96.79% and 95.94%, respectively. Then, the image information was processed with the You Only Look Once version 5 small (YOLO v5s) model. The best parameter combination was epoch = 30, learning rate = 0.01, and leaky ReLU as the activation function. On the red ginseng dataset, the highest accuracy, recall, and mean Average Precision at an IoU (Intersection over Union) threshold of 0.5 (mAP@0.5) were 99.01%, 98.51%, and 99.07%, respectively. The application of spectrum-image dual-scale digital information combined with intelligent algorithms to the recognition of red ginseng is successful, which is of positive significance for the online and on-site quality control and authenticity identification of crude drugs and fruits.
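A hedged sketch of the spectral branch (first-derivative pre-processing followed by PLS-DA) using scikit-learn; the window length, polynomial order, and component count are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.cross_decomposition import PLSRegression

# Synthetic stand-in data: 60 spectra with 200 wavelength points each.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 200))
y = rng.integers(0, 2, size=60)           # 0 = main root, 1 = rhizome

# First-derivative pre-processing (Savitzky-Golay derivative filter).
X_d1 = savgol_filter(X, window_length=7, polyorder=2, deriv=1, axis=1)

# PLS-DA = PLS regression on one-hot labels, argmax to classify.
Y = np.eye(2)[y]
pls = PLSRegression(n_components=5).fit(X_d1, Y)
pred = pls.predict(X_d1).argmax(axis=1)
print("training accuracy:", (pred == y).mean())
```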
Affiliation(s)
- HongXu Zhang
- College of Pharmaceutical Science, Zhejiang University of Technology, No. 18, Chaowang Road, Hangzhou 310014, China
- YiXia Pan
- College of Pharmaceutical Science, Zhejiang University of Technology, No. 18, Chaowang Road, Hangzhou 310014, China
- XiaoYi Liu
- College of Pharmaceutical Science, Zhejiang University of Technology, No. 18, Chaowang Road, Hangzhou 310014, China
- Yuan Chen
- College of Pharmaceutical Science, Zhejiang University of Technology, No. 18, Chaowang Road, Hangzhou 310014, China
- XingChu Gong
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- JieQiang Zhu
- College of Pharmaceutical Science, Zhejiang University of Technology, No. 18, Chaowang Road, Hangzhou 310014, China
- JiZhong Yan
- College of Pharmaceutical Science, Zhejiang University of Technology, No. 18, Chaowang Road, Hangzhou 310014, China
- Hui Zhang
- College of Pharmaceutical Science, Zhejiang University of Technology, No. 18, Chaowang Road, Hangzhou 310014, China
4. Cunha F, dos Santos EM, Colonna JG. Bag of tricks for long-tail visual recognition of animal species in camera-trap images. Ecol Inform 2023; 76:102060. [DOI: 10.1016/j.ecoinf.2023.102060]
5. Zheng Q, Li Z, Zhang J, Mei C, Li G, Wang L. Automated segmentation of palpebral fissures from eye videography using a texture fusion neural network. Biomed Signal Process Control 2023; 85:104820. [DOI: 10.1016/j.bspc.2023.104820]
6. Liu S, Zhao Y, An Y, Zhao J, Wang S, Yan J. GLFANet: A global to local feature aggregation network for EEG emotion recognition. Biomed Signal Process Control 2023; 85:104799. [DOI: 10.1016/j.bspc.2023.104799]
7. Naderi M, Karimi N, Emami A, Shirani S, Samavi S. Dynamic-Pix2Pix: Medical image segmentation by injecting noise to cGAN for modeling input and target domain joint distributions with limited training data. Biomed Signal Process Control 2023; 85:104877. [DOI: 10.1016/j.bspc.2023.104877]
8. Maurya A, Akashdeep, Mittal P, Kumar R. A modified U-net-based architecture for segmentation of satellite images on a novel dataset. Ecol Inform 2023; 75:102078. [DOI: 10.1016/j.ecoinf.2023.102078]
9. Zeng C, Lai W, Lin H, Liu G, Qin B, Kang Q, Feng X, Yu Y, Gu R, Wu J, Mao L. Weak information extraction of gamma spectrum based on a two-dimensional wavelet transform. Radiat Phys Chem 2023; 208:110914. [DOI: 10.1016/j.radphyschem.2023.110914]
10. Qiu S, Fan T, Jiang J, Wang Z, Wang Y, Xu J, Sun T, Jiang N. A novel two-level interactive action recognition model based on inertial data fusion. Inf Sci (N Y) 2023; 633:264-279. [DOI: 10.1016/j.ins.2023.03.058]
11. El-ghany SA, Elmogy M, El-aziz AAA. A fully automatic fine tuned deep learning model for knee osteoarthritis detection and progression analysis. Egyptian Informatics Journal 2023; 24:229-240. [DOI: 10.1016/j.eij.2023.03.005]
12. Ansah FA, Amo-boateng M, Siabi EK, Bordoh PK. Location of seed spoilage in mango fruit using X-ray imaging and convolutional neural networks. Scientific African 2023; 20:e01649. [DOI: 10.1016/j.sciaf.2023.e01649]
13. Wang R, Duan Y, Hu M, Liu X, Li Y, Gao Q, Tong T, Tan T. LightR-YOLOv5: A compact rotating detector for SARS-CoV-2 antigen-detection rapid diagnostic test results. Displays 2023; 78:102403. [PMID: 36937555] [DOI: 10.1016/j.displa.2023.102403]
Abstract
Nucleic acid testing is currently the gold standard for SARS-CoV-2 detection, while the SARS-CoV-2 antigen-detection rapid diagnostic test (RDT) is an important adjunct. RDTs can be widely used as self-test tools in community or regional screening management and may need to be verified by healthcare authorities. However, manual verification of RDT results is time-consuming, and existing object detection algorithms usually suffer from high model complexity and computational cost, making them difficult to deploy. We propose LightR-YOLOv5, a compact rotating detector for SARS-CoV-2 antigen-detection RDT results. First, we employ an extremely lightweight L-ShuffleNetV2 network as the feature extraction network, at a slight cost in recognition accuracy. Second, we combine semantic and texture features in different layers by judiciously combining GSConv, depth-wise convolution, and other modules, and further employ NAM attention to locate the RDT result detection region. Furthermore, we propose a new data augmentation approach, Single-Copy-Paste, which increases data samples for the specific task of RDT result detection while achieving a small improvement in model accuracy. Compared with some mainstream rotating object detection networks, our LightR-YOLOv5 has a model size of only 2.03 MB, and its mAP@.5:.95 is 12.6%, 6.4%, and 7.3% higher than that of RetinaNet, FCOS, and R3Det, respectively.
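The abstract does not detail Single-Copy-Paste; the sketch below shows a generic single-object copy-paste augmentation in that spirit. The coordinates and layout are hypothetical, and the published method may differ.

```python
import numpy as np

def single_copy_paste(src_img, src_box, dst_img, seed=0):
    """Hedged sketch of a copy-paste style augmentation: copy ONE annotated
    RDT result region from a source image and paste it at a random location
    in a destination image, returning the new image and the pasted box.
    (The paper's Single-Copy-Paste method may differ in detail.)"""
    x1, y1, x2, y2 = src_box
    patch = src_img[y1:y2, x1:x2].copy()
    h, w = patch.shape[:2]
    H, W = dst_img.shape[:2]
    rng = np.random.default_rng(seed)
    px = int(rng.integers(0, W - w))     # random paste position
    py = int(rng.integers(0, H - h))
    out = dst_img.copy()
    out[py:py + h, px:px + w] = patch
    return out, (px, py, px + w, py + h)

src = np.zeros((480, 640, 3), dtype=np.uint8)
dst = np.zeros((480, 640, 3), dtype=np.uint8)
aug, box = single_copy_paste(src, (100, 120, 220, 180), dst)
print(aug.shape, box)
```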
Affiliation(s)
- Rongsheng Wang
- Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, 999078, Macao Special Administrative Region of China
- Yaofei Duan
- Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, 999078, Macao Special Administrative Region of China
- Menghan Hu
- Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, Shanghai 200240, China
- Xiaohong Liu
- John Hopcroft Center, Shanghai Jiao Tong University, Shanghai 200240, China
- Yukun Li
- Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, 999078, Macao Special Administrative Region of China
- Qinquan Gao
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China
- Tong Tong
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China
- Tao Tan
- Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, 999078, Macao Special Administrative Region of China
14. Xiao H, Liu Q, Li L. MFMANet: Multi-feature Multi-attention Network for efficient subtype classification on non-small cell lung cancer CT images. Biomed Signal Process Control 2023; 84:104768. [DOI: 10.1016/j.bspc.2023.104768]
15. Moshaei-nezhad Y, Tetzlaff R, Kirsch M. Respiration and heartbeat motion correction of intraoperative thermographic images in neurosurgery. Biomed Signal Process Control 2023; 84:104770. [DOI: 10.1016/j.bspc.2023.104770]
16. Diao S, Su J, Yang C, Zhu W, Xiang D, Chen X, Peng Q, Shi F. Classification and segmentation of OCT images for age-related macular degeneration based on dual guidance networks. Biomed Signal Process Control 2023; 84:104810. [DOI: 10.1016/j.bspc.2023.104810]
17. Rakshit A, Pramanick S, Bagchi A, Bhattacharyya S. Autonomous grasping of 3-D objects by a vision-actuated robot arm using Brain–Computer Interface. Biomed Signal Process Control 2023; 84:104765. [DOI: 10.1016/j.bspc.2023.104765]
18. Wang X, Su R, Xie W, Wang W, Xu Y, Mann R, Han J, Tan T. 2.75D: Boosting learning by representing 3D medical imaging to 2D features for small data. Biomed Signal Process Control 2023; 84:104858. [DOI: 10.1016/j.bspc.2023.104858]
19. Xue M, Wu Y, Wu Z, Zhang Y, Wang J, Liu W. Detecting backdoor in deep neural networks via intentional adversarial perturbations. Inf Sci (N Y) 2023; 634:564-577. [DOI: 10.1016/j.ins.2023.03.112]
20. Song L, Ma M, Liu G. TS-Net: Two-stage deformable medical image registration network based on new smooth constraints. Magn Reson Imaging 2023; 99:26-33. [PMID: 36709011] [DOI: 10.1016/j.mri.2023.01.013]
Abstract
Medical image registration establishes the spatial correspondence of anatomical structures between different medical images, which is important in medical image analysis. In recent years, with the rapid development of deep learning, deep learning-based registration methods have greatly improved the speed, accuracy, and robustness of registration. Regrettably, these methods typically do not work well for large or complex deformations and neglect to preserve the topological properties of the image during deformation. To address these problems, we propose a new network, TS-Net, which learns deformation from coarse to fine and transmits information at different scales between its two stages. Learning deformation from coarse to fine allows the network to gradually capture the large and complex deformations in images. In the second stage, the feature maps downsampled in the first stage are used for skip connections, which expands the local receptive field and provides more local information. Previous smoothness constraints impose the same restriction globally, which is not targeted. In this paper, we propose a new smoothness constraint applied to each voxel's deformation, which better ensures the smoothness of the transformation and maintains the topological properties of the image. Experiments on brain datasets with complex deformations and heart datasets with large deformations show that, compared to existing deep learning-based registration methods, our proposed method achieves better results while maintaining the topological properties of the deformations.
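For context, a common baseline smoothness term penalizes the spatial gradients of the predicted displacement field uniformly; the paper's contribution is to constrain each voxel's deformation individually. A minimal PyTorch sketch of the conventional global regularizer:

```python
import torch

def smoothness_loss(flow):
    """Standard diffusion regularizer on a 3D displacement field,
    flow: (batch, 3, D, H, W). Penalizes spatial gradients of the
    deformation equally everywhere -- the global baseline; the paper's
    proposed constraint instead weights each voxel individually."""
    dz = (flow[:, :, 1:, :, :] - flow[:, :, :-1, :, :]) ** 2
    dy = (flow[:, :, :, 1:, :] - flow[:, :, :, :-1, :]) ** 2
    dx = (flow[:, :, :, :, 1:] - flow[:, :, :, :, :-1]) ** 2
    return dz.mean() + dy.mean() + dx.mean()

flow = torch.randn(1, 3, 32, 64, 64, requires_grad=True)
print(smoothness_loss(flow))  # scalar, differentiable
```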
Affiliation(s)
- Lei Song
- College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, PR China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, Jilin, PR China.
- Mingrui Ma
- College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, PR China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, Jilin, PR China
- Guixia Liu
- College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, PR China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, Jilin, PR China
21. Yan A, Huang T, Ke L, Liu X, Chen Q, Dong C. Explanation leaks: Explanation-guided model extraction attacks. Inf Sci (N Y) 2023; 632:269-284. [DOI: 10.1016/j.ins.2023.03.020]
22. Wang X, Nie G, Li B, Zhao Y, Kang M, Liu B. Hierarchical memory-guided long-term tracking with meta transformer inquiry network. Knowl Based Syst 2023; 269:110504. [DOI: 10.1016/j.knosys.2023.110504]
23. Shi X, Zhang S, Cheng M, He L, Tang X, Cui Z. Few-shot semantic segmentation for industrial defect recognition. Comput Ind 2023; 148:103901. [DOI: 10.1016/j.compind.2023.103901]
24. Easley T, Chen R, Hannon K, Dutt R, Bijsterbosch J. Population modeling with machine learning can enhance measures of mental health - Open-data replication. Neuroimage: Reports 2023; 3:100163. [DOI: 10.1016/j.ynirp.2023.100163]
25. Salido J, Vallez N, González-López L, Deniz O, Bueno G. Comparison of deep learning models for digital H&E staining from unpaired label-free multispectral microscopy images. Comput Methods Programs Biomed 2023; 235:107528. [PMID: 37040684] [DOI: 10.1016/j.cmpb.2023.107528]
Abstract
BACKGROUND AND OBJECTIVE This paper presents a quantitative comparison of three generative models for digital staining, also known as virtual staining, in the H&E modality (i.e., hematoxylin and eosin), applied to five types of breast tissue. Moreover, a qualitative evaluation of the results achieved with the best model was carried out. The process is based on images of unstained samples captured by a multispectral microscope, with prior dimensional reduction to three channels in the RGB range. METHODS The models compared are based on a conditional GAN (pix2pix), which requires aligned stained/unstained image pairs, and two models that do not require image alignment: Cycle GAN (cycleGAN) and a contrastive learning-based model (CUT). The models are compared based on the structural similarity and chromatic discrepancy between chemically stained samples and their digitally stained counterparts. The correspondence between images is achieved by subjecting the chemically stained images to digital unstaining using a model obtained to guarantee the cyclic consistency of the generative models. RESULTS The comparison of the three models corroborates the visual evaluation of the results, showing the superiority of cycleGAN both in its larger structural similarity with respect to chemical staining (mean SSIM ∼ 0.95) and in its lower chromatic discrepancy (10%). To this end, quantization and calculation of the EMD (Earth Mover's Distance) between clusters is used. In addition, a quality evaluation through subjective psychophysical tests with three experts was carried out for the best model (cycleGAN). CONCLUSIONS The results can be satisfactorily evaluated by metrics that use as reference a chemically stained sample and the digitally stained images of the reference sample after prior digital unstaining. These metrics demonstrate that generative staining models that guarantee cyclic consistency provide the results closest to chemical H&E staining, which is also consistent with the qualitative evaluation by experts.
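The two quantitative ideas, structural similarity against the chemically stained reference and an Earth Mover's Distance between color distributions, can be sketched as follows. A per-channel EMD on raw pixel values stands in for the paper's cluster-based EMD, and the data are synthetic stand-ins.

```python
import numpy as np
from skimage.metrics import structural_similarity
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)
ref = rng.random((256, 256, 3))                  # stand-in for chemical H&E image
gen = np.clip(ref + 0.05 * rng.standard_normal(ref.shape), 0, 1)  # "digital" stain

# Structural similarity between reference and digitally stained image.
ssim = structural_similarity(ref, gen, channel_axis=2, data_range=1.0)

# 1D Earth Mover's Distance per color channel, averaged (simplified:
# the paper computes EMD between quantized color clusters).
emd = np.mean([wasserstein_distance(ref[..., c].ravel(), gen[..., c].ravel())
               for c in range(3)])
print(f"SSIM={ssim:.3f}  per-channel EMD={emd:.4f}")
```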
Affiliation(s)
- Jesus Salido
- IEEAC Dept. (ESI-UCLM), P de la Universidad 4, Ciudad Real, 13071, Spain.
- Noelia Vallez
- IEEAC Dept. (ETSII-UCLM), Avda. Camilo José Cela s/n, Ciudad Real, 13071, Spain
- Lucía González-López
- Hospital Gral. Universitario de C.Real (HGUCR), C. Obispo Rafael Torija s/n, Ciudad Real, 13005, Spain
- Oscar Deniz
- IEEAC Dept. (ETSII-UCLM), Avda. Camilo José Cela s/n, Ciudad Real, 13071, Spain
- Gloria Bueno
- IEEAC Dept. (ETSII-UCLM), Avda. Camilo José Cela s/n, Ciudad Real, 13071, Spain
26. Zhao S, Chen W, Zhang F, Liu X. Disentangle irrelevant and critical representations for face anti-spoofing. Neurocomputing 2023; 536:175-190. [DOI: 10.1016/j.neucom.2023.03.018]
27. Huaulmé A, Harada K, Nguyen QM, Park B, Hong S, Choi MK, Peven M, Li Y, Long Y, Dou Q, Kumar S, Lalithkumar S, Hongliang R, Matsuzaki H, Ishikawa Y, Harai Y, Kondo S, Mitsuishi M, Jannin P. PEg TRAnsfer Workflow recognition challenge report: Do multimodal data improve recognition? Comput Methods Programs Biomed 2023; 236:107561. [PMID: 37119774] [DOI: 10.1016/j.cmpb.2023.107561]
Abstract
BACKGROUND AND OBJECTIVE To be context-aware, computer-assisted surgical systems require accurate, real-time automatic surgical workflow recognition. In the past several years, surgical video has been the most commonly used modality for surgical workflow recognition. But with the democratization of robot-assisted surgery, new modalities, such as kinematics, are now accessible. Some previous methods use these new modalities as input for their models, but their added value has rarely been studied. This paper presents the design and results of the "PEg TRAnsfer Workflow recognition" (PETRAW) challenge, whose objective was to develop surgical workflow recognition methods based on one or more modalities and to study their added value. METHODS The PETRAW challenge included a data set of 150 peg transfer sequences performed on a virtual simulator. This data set included videos, kinematic data, semantic segmentation data, and annotations describing the workflow at three levels of granularity: phase, step, and activity. Five tasks were proposed to the participants: three were related to recognition at all granularities simultaneously using a single modality, and two addressed recognition using multiple modalities. The mean application-dependent balanced accuracy (AD-Accuracy) was used as the evaluation metric because it takes class imbalance into account and is more clinically relevant than a frame-by-frame score. RESULTS Seven teams participated in at least one task, with four participating in every task. The best results were obtained by combining video and kinematic data (AD-Accuracy between 90% and 93% for the four teams that participated in all tasks). CONCLUSION The improvement of surgical workflow recognition methods using multiple modalities compared with unimodal methods was significant for all teams. However, the longer execution time required by video/kinematic-based methods (compared to kinematic-only methods) must be considered. Indeed, one must ask whether it is wise to increase computing time by 2,000 to 20,000% only to increase accuracy by 3%. The PETRAW data set is publicly available at www.synapse.org/PETRAW to encourage further research in surgical workflow recognition.
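As a rough illustration of the metric's core idea, without the challenge's application-dependent weighting, a balanced accuracy can be computed per granularity level and averaged:

```python
from sklearn.metrics import balanced_accuracy_score

# Hedged sketch: balanced accuracy over frame-wise workflow labels,
# averaged across granularity levels. The challenge's AD-Accuracy adds
# application-dependent weighting, so this shows only the core idea.
# Toy frame label sequences (hypothetical):
phase_true = [0, 0, 1, 1, 1, 2, 2]
phase_pred = [0, 0, 1, 2, 1, 2, 2]
step_true  = [0, 1, 1, 2, 2, 3, 3]
step_pred  = [0, 1, 1, 2, 2, 3, 2]

scores = [balanced_accuracy_score(phase_true, phase_pred),
          balanced_accuracy_score(step_true, step_pred)]
print("mean balanced accuracy: %.3f" % (sum(scores) / len(scores)))
```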
Affiliation(s)
- Arnaud Huaulmé
- Univ Rennes, INSERM, LTSI - UMR 1099, Rennes, F35000, France.
- Kanako Harada
- Department of Mechanical Engineering, the University of Tokyo, Tokyo 113-8656, Japan
- Bogyu Park
- VisionAI hutom, Seoul, Republic of Korea
- Yonghao Long
- Department of Computer Science & Engineering, The Chinese University of Hong Kong, Hong Kong
- Qi Dou
- Department of Computer Science & Engineering, The Chinese University of Hong Kong, Hong Kong
- Ren Hongliang
- National University of Singapore, Singapore, Singapore; The Chinese University of Hong Kong, Hong Kong, Hong Kong
- Hiroki Matsuzaki
- National Cancer Center Japan East Hospital, Tokyo 104-0045, Japan
- Yuto Ishikawa
- National Cancer Center Japan East Hospital, Tokyo 104-0045, Japan
- Yuriko Harai
- National Cancer Center Japan East Hospital, Tokyo 104-0045, Japan
- Manoru Mitsuishi
- Department of Mechanical Engineering, the University of Tokyo, Tokyo 113-8656, Japan
- Pierre Jannin
- Univ Rennes, INSERM, LTSI - UMR 1099, Rennes, F35000, France
28. Zhou Y, Wang H, Huo S, Wang B. Hierarchical full-attention neural architecture search based on search space compression. Knowl Based Syst 2023; 269:110507. [DOI: 10.1016/j.knosys.2023.110507]
29. Dagan A, Guy I, Novgorodov S. Shop by image: characterizing visual search in e-commerce. Inform Retrieval J 2023; 26:2. [DOI: 10.1007/s10791-023-09418-1]
30. Wang Q, Zhang W, Yang W, Xu C, Cui Z. Prototype-guided instance matching for multiple pedestrian tracking. Neurocomputing 2023; 538:126207. [DOI: 10.1016/j.neucom.2023.03.068]
31. Sáenz-Gamboa JJ, Domenech J, Alonso-Manjarrés A, Gómez JA, de la Iglesia-Vayá M. Automatic semantic segmentation of the lumbar spine: Clinical applicability in a multi-parametric and multi-center study on magnetic resonance images. Artif Intell Med 2023; 140:102559. [PMID: 37210154] [DOI: 10.1016/j.artmed.2023.102559]
Abstract
Significant difficulties in medical image segmentation include the high variability of images caused by their origin (multi-center), the acquisition protocols (multi-parametric), the variability of human anatomy, illness severity, the effects of age and gender, and other notable factors. This work addresses problems associated with the automatic semantic segmentation of lumbar spine magnetic resonance images using convolutional neural networks. We aimed to assign a class label to each pixel of an image, with classes defined by radiologists and corresponding to structural elements such as vertebrae, intervertebral discs, nerves, blood vessels, and other tissues. The proposed network topologies are variants of the U-Net architecture, defined using several complementary blocks: three types of convolutional blocks, spatial attention models, deep supervision, and a multilevel feature extractor. Here, we describe the topologies and analyze the results of the neural network designs that obtained the most accurate segmentations. Several of the proposed designs outperform the standard U-Net used as a baseline, primarily when used in ensembles, where the outputs of multiple neural networks are combined according to different strategies.
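One of the ensembling strategies alluded to, soft voting over per-class probability maps, can be sketched as follows; the class count and map size are illustrative assumptions.

```python
import numpy as np

# Hedged sketch: average the per-class softmax maps of several trained
# networks, then take the argmax. (Majority voting over per-model argmax
# labels is another combination strategy.)
rng = np.random.default_rng(2)
n_models, n_classes, H, W = 3, 11, 64, 64     # e.g. 11 lumbar-spine classes
probs = rng.dirichlet(np.ones(n_classes), size=(n_models, H, W))  # (M,H,W,C)

mean_probs = probs.mean(axis=0)               # soft voting across models
segmentation = mean_probs.argmax(axis=-1)     # (H, W) label map
print(segmentation.shape, segmentation.max())
```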
Affiliation(s)
- Jhon Jairo Sáenz-Gamboa
- FISABIO-CIPF Joint Research Unit in Biomedical Imaging, Fundaciò per al Foment de la Investigaciò Sanitària i Biomèdica (FISABIO), Av. de Catalunya 21, 46020 València, Spain.
- Julio Domenech
- Orthopedic Surgery Department, Hospital Arnau de Vilanova, Carrer de San Clemente s/n, 46015, València, Spain
- Antonio Alonso-Manjarrés
- Radiology Department, Hospital Arnau de Vilanova, Carrer de San Clemente s/n, 46015, València, Spain
- Jon A Gómez
- Pattern Recognition and Human Language Technology research center, Universitat Politècnica de València, Camí de Vera, s/n, 46022, València, Spain
- Maria de la Iglesia-Vayá
- FISABIO-CIPF Joint Research Unit in Biomedical Imaging, Fundaciò per al Foment de la Investigaciò Sanitària i Biomèdica (FISABIO), Av. de Catalunya 21, 46020 València, Spain; Regional Ministry of Universal Health and Public Health in Valencia, Carrer de Misser Mascó 31, 46010 València, Spain
32. Liu L, Li C. Comparative study of deep learning models on the images of biopsy specimens for diagnosis of lung cancer treatment. Journal of Radiation Research and Applied Sciences 2023; 16:100555. [DOI: 10.1016/j.jrras.2023.100555]
33. Chen A, Tang X, Cheng B, He J. Multi-source monitoring information fusion method for dam health diagnosis based on Wasserstein distance. Inf Sci (N Y) 2023; 632:378-389. [DOI: 10.1016/j.ins.2023.03.053]
34. Petäinen L, Väyrynen JP, Ruusuvuori P, Pölönen I, Äyrämö S, Kuopio T. Domain-specific transfer learning in the automated scoring of tumor-stroma ratio from histopathological images of colorectal cancer. PLoS One 2023; 18:e0286270. [PMID: 37235626] [DOI: 10.1371/journal.pone.0286270]
Abstract
Tumor-stroma ratio (TSR) is a prognostic factor for many types of solid tumors. In this study, we propose a method for the automated estimation of TSR from histopathological images of colorectal cancer. The method is based on convolutional neural networks trained to classify colorectal cancer tissue in hematoxylin-eosin-stained samples into three classes: stroma, tumor, and other. The models were trained using a data set of 1343 whole-slide images. Three training setups were applied with a transfer learning approach using domain-specific data, i.e., an external colorectal cancer histopathological data set. The three most accurate models were chosen as classifiers, TSR values were predicted, and the results were compared to a visual TSR estimation made by a pathologist. The results suggest that classification accuracy does not improve when domain-specific data are used to pre-train the convolutional neural network models for the task at hand. Classification accuracy for stroma, tumor, and other reached 96.1% on an independent test set. Among the three classes, the best model achieved the highest accuracy (99.3%) for the tumor class. When TSR was predicted with the best model, the correlation between the predicted values and those estimated by an experienced pathologist was 0.57. Further research is needed to study associations between computationally predicted TSR values, other clinicopathological factors of colorectal cancer, and the overall survival of patients.
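Once tiles are classified, a TSR estimate can be summarized from tile counts. A simplified sketch follows; the study's exact TSR protocol may differ.

```python
import numpy as np

# Hedged sketch: each tile of a whole-slide image has been classified as
# stroma / tumor / other; TSR is summarized from the tile counts.
STROMA, TUMOR, OTHER = 0, 1, 2
tile_labels = np.array([TUMOR] * 620 + [STROMA] * 310 + [OTHER] * 70)  # toy counts

n_stroma = np.sum(tile_labels == STROMA)
n_tumor = np.sum(tile_labels == TUMOR)
tsr = n_stroma / (n_stroma + n_tumor)   # stromal fraction of the tumor area
print(f"TSR = {tsr:.2f}")                # 0.33
```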
Affiliation(s)
- Liisa Petäinen
- Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland
- Juha P Väyrynen
- Cancer and Translational Medicine Research Unit, Medical Research Center, Oulu University Hospital, and University of Oulu, Oulu, Finland
- Pekka Ruusuvuori
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Cancer Research Unit, Institute of Biomedicine, University of Turku, Turku, Finland
- FICAN West Cancer Centre, Turku University Hospital, Turku, Finland
- Ilkka Pölönen
- Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland
- Sami Äyrämö
- Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland
- Teijo Kuopio
- Department of Education and Research, Hospital Nova of Central Finland, Jyväskylä, Finland
- Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
- Department of Pathology, Hospital Nova of Central Finland, Jyväskylä, Finland
35. Zhuang M, Chen Z, Yang Y, Kettunen L, Wang H. Annotation-efficient training of medical image segmentation network based on scribble guidance in difficult areas. Int J Comput Assist Radiol Surg 2023. [PMID: 37233894] [DOI: 10.1007/s11548-023-02931-0]
Abstract
PURPOSE Training deep medical image segmentation networks usually requires a large amount of human-annotated data. To alleviate the burden of human labor, many semi- or non-supervised methods have been developed. However, due to the complexity of clinical scenarios, insufficient training labels still cause inaccurate segmentation in difficult local areas such as heterogeneous tumors and fuzzy boundaries. METHODS We propose an annotation-efficient training approach that only requires scribble guidance in the difficult areas. A segmentation network is initially trained with a small amount of fully annotated data and then used to produce pseudo labels for additional training data. Human supervisors draw scribbles in the areas where the pseudo labels are incorrect (i.e., the difficult areas), and the scribbles are converted into pseudo label maps using a probability-modulated geodesic transform. To reduce the influence of potential errors in the pseudo labels, a confidence map of the pseudo labels is generated by jointly considering the pixel-to-scribble geodesic distance and the network output probability. The pseudo labels and confidence maps are iteratively optimized as the network is updated, and the network training is in turn promoted by the pseudo labels and the confidence maps. RESULTS Cross-validation on two data sets (brain tumor MRI and liver tumor CT) showed that our method significantly reduces annotation time while maintaining segmentation accuracy in difficult areas (e.g., tumors). Using 90 scribble-annotated training images (annotation time: ~9 h), our method achieved the same performance as using 45 fully annotated images (annotation time: >100 h) at a fraction of the annotation cost. CONCLUSION Compared to conventional full-annotation approaches, the proposed method significantly reduces annotation effort by focusing human supervision on the most difficult regions. It provides an annotation-efficient way of training medical image segmentation networks in complex clinical scenarios.
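A hedged sketch of the confidence idea follows, with a Euclidean distance transform standing in for the paper's probability-modulated geodesic transform and an assumed exponential decay:

```python
import numpy as np
from scipy import ndimage

def confidence_map(scribble_mask, net_prob, alpha=0.1):
    """Hedged sketch: confidence decays with the distance to the nearest
    scribble and grows with the network's output probability. A Euclidean
    distance transform stands in for the geodesic transform used in the
    paper, and the exponential decay (alpha) is an assumption."""
    dist = ndimage.distance_transform_edt(~scribble_mask)  # 0 on scribbles
    return np.exp(-alpha * dist) * net_prob

scribbles = np.zeros((64, 64), dtype=bool)
scribbles[30:34, 10:50] = True                 # a human-drawn scribble
prob = np.full((64, 64), 0.8)                  # network foreground probability
conf = confidence_map(scribbles, prob)
print(conf.min(), conf.max())                  # low far away, ~0.8 on scribble
```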
Affiliation(s)
- Mingrui Zhuang
- School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, 116024, China
- Zhonghua Chen
- School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, 116024, China
- Faculty of Information Technology, University of Jyväskylä, 40100, Jyvaskyla, Finland
- Yuxin Yang
- School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, 116024, China
- Lauri Kettunen
- Faculty of Information Technology, University of Jyväskylä, 40100, Jyvaskyla, Finland
- Hongkai Wang
- School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, 116024, China
- Liaoning Key Laboratory of Integrated Circuit and Biomedical Electronic System, Dalian, China
36. Peng T, Gu Y, Zhang J, Dong Y, DI G, Wang W, Zhao J, Cai J. A Robust and Explainable Structure-Based Algorithm for Detecting the Organ Boundary From Ultrasound Multi-Datasets. J Digit Imaging 2023. [PMID: 37231289] [DOI: 10.1007/s10278-023-00839-4]
Abstract
Detecting the organ boundary in an ultrasound image is challenging because of the poor contrast of ultrasound images and the existence of imaging artifacts. In this study, we developed a coarse-to-refinement architecture for multi-organ ultrasound segmentation. First, we integrated the principal curve-based projection stage into an improved neutrosophic mean shift-based algorithm to acquire the data sequence, for which we utilized a limited amount of prior seed point information as the approximate initialization. Second, a distribution-based evolution technique was designed to aid in the identification of a suitable learning network. Then, utilizing the data sequence as the input of the learning network, we achieved the optimal learning network after learning network training. Finally, a scaled exponential linear unit-based interpretable mathematical model of the organ boundary was expressed via the parameters of a fraction-based learning network. The experimental outcomes indicated that our algorithm 1) achieved more satisfactory segmentation outcomes than state-of-the-art algorithms, with a Dice score coefficient value of 96.68 ± 2.2%, a Jaccard index value of 95.65 ± 2.16%, and an accuracy of 96.54 ± 1.82% and 2) discovered missing or blurry areas.
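The two overlap metrics reported, the Dice score coefficient and the Jaccard index, are computed from binary masks as follows:

```python
import numpy as np

def dice_and_jaccard(pred, gt):
    """Dice score coefficient and Jaccard index for binary masks,
    the two overlap metrics reported above."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2 * inter / (pred.sum() + gt.sum())
    jaccard = inter / union
    return dice, jaccard

a = np.zeros((100, 100)); a[20:80, 20:80] = 1   # toy prediction mask
b = np.zeros((100, 100)); b[25:85, 25:85] = 1   # toy ground-truth mask
print(dice_and_jaccard(a, b))                   # (0.840..., 0.724...)
```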
Affiliation(s)
- Tao Peng
- School of Future Science and Engineering, Soochow University, Suzhou, China.
- Department of Health Technology and Informatics, Hong Kong Polytechnic University, Hong Kong, China.
- Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, TX, USA.
- Yidong Gu
- School of Future Science and Engineering, Soochow University, Suzhou, China
- Department of Medical Ultrasound, the Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Suzhou, Jiangsu, China
- Ji Zhang
- Department of Radiology, The Affiliated Taizhou People's Hospital of Nanjing Medical University, Taizhou, Jiangsu Province, China
- Yan Dong
- Department of Ultrasonography, The First Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China
- Gongye DI
- Department of Ultrasonic, The Affiliated Taizhou People's Hospital of Nanjing Medical University, Taizhou, Jiangsu Province, China
- Wenjie Wang
- Department of Radio-Oncology, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Suzhou, Jiangsu, China
- Jing Zhao
- Department of Ultrasound, Tsinghua University Affiliated Beijing Tsinghua Changgung Hospital, Beijing, China
- Jing Cai
- Department of Health Technology and Informatics, Hong Kong Polytechnic University, Hong Kong, China
37. Sahoo M, Mitra M, Pal S. Improved Detection of Dry Age-Related Macular Degeneration from Optical Coherence Tomography Images using Adaptive Window Based Feature Extraction and Weighted Ensemble Based Classification Approach. Photodiagnosis Photodyn Ther 2023:103629. [PMID: 37244451] [DOI: 10.1016/j.pdpdt.2023.103629]
Abstract
BACKGROUND Dry age-related macular degeneration (AMD), which affects the older population, can lead to blindness when left untreated. Preventing vision loss in the elderly requires early identification. Dry-AMD diagnosis is still time-consuming and highly subjective, depending on the ophthalmologist. Setting up a thorough eye-screening system to detect Dry-AMD is a very difficult task. METHODOLOGY This study aims to develop a weighted majority voting (WMV) ensemble-based prediction model to diagnose Dry-AMD. The WMV approach combines the predictions of the base classifiers and chooses the class with the greatest vote, based on the weights assigned to each classifier. A novel feature extraction method is used along the retinal pigment epithelium (RPE) layer, with the number of windows calculated for each image playing an important part in identifying Dry-AMD/normal images with the WMV methodology. Pre-processing using a hybrid median filter, followed by scale-invariant feature transform-based segmentation of the RPE layer and curvature flattening of the retina, is employed to measure the exact thickness of the RPE layer. RESULT The proposed model is trained on 70% of the OCT image database (OCTID) and evaluated on the remaining OCTID and the SD-OCT Noor dataset. The model achieved accuracies of 96.15% and 96.94%, respectively. The effectiveness of the suggested algorithm in Dry-AMD identification is demonstrated by comparison with alternative approaches. Even though the suggested model was trained only on the OCTID, it performed well when tested on an additional dataset. CONCLUSION The suggested architecture can be used for quick eye screening for the early identification of Dry-AMD. The recommended method may be applied in real time since it requires lower complexity and fewer learning variables.
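The WMV combination rule itself is compact; a minimal sketch with illustrative weights follows (the paper derives its own weighting):

```python
import numpy as np

def weighted_majority_vote(predictions, weights, n_classes=2):
    """WMV: each base classifier votes for a class; votes are summed with
    classifier-specific weights and the class with the largest total wins.
    (The weight values below are illustrative assumptions.)"""
    votes = np.zeros(n_classes)
    for cls, w in zip(predictions, weights):
        votes[cls] += w
    return int(np.argmax(votes))

# Three base classifiers: two vote Dry-AMD (1), one votes normal (0).
preds = [1, 0, 1]
weights = [0.5, 0.9, 0.7]      # e.g. proportional to validation accuracy
print(weighted_majority_vote(preds, weights))   # 1 (Dry-AMD)
```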
Affiliation(s)
- Moumita Sahoo
- Department of Applied Electronics and Instrumentation Engineering, Haldia Institute of Technology, Haldia, West Bengal, India.
- Madhuchhanda Mitra
- Department of Applied Physics, University of Calcutta, Kolkata, West Bengal, India
- Saurabh Pal
- Department of Applied Physics, University of Calcutta, Kolkata, West Bengal, India
38. Iio R, Ueda D, Matsumoto T, Manaka T, Nakazawa K, Ito Y, Hirakawa Y, Yamamoto A, Shiba M, Nakamura H. Deep learning-based screening tool for rotator cuff tears on shoulder radiography. J Orthop Sci 2023:S0949-2658(23)00132-X. [PMID: 37236873] [DOI: 10.1016/j.jos.2023.05.004]
Abstract
BACKGROUND Early diagnosis of rotator cuff tears is essential for appropriate and timely treatment. Although radiography is the most commonly used technique in clinical practice, it is difficult to accurately rule out rotator cuff tears with radiography as the initial imaging modality. Deep learning-based artificial intelligence has recently been applied in medicine, especially in diagnostic imaging. This study aimed to develop a deep learning algorithm as a screening tool for rotator cuff tears based on radiography. METHODS We used 2803 shoulder radiographs of the true anteroposterior view to develop the deep learning algorithm. Radiographs were labeled 0 or 1: intact or low-grade partial-thickness rotator cuff tears, and high-grade partial or full-thickness rotator cuff tears, respectively. The diagnosis of rotator cuff tears was determined based on arthroscopic findings. The diagnostic performance of the deep learning algorithm was assessed by calculating the area under the curve (AUC), sensitivity, negative predictive value (NPV), and negative likelihood ratio (LR-) on test datasets, with a cutoff value chosen for expected high sensitivity on validation datasets. Furthermore, the diagnostic performance for each rotator cuff tear size was evaluated. RESULTS The AUC, sensitivity, NPV, and LR- at the high-sensitivity cutoff were 0.82, 84/92 (91.3%), 102/110 (92.7%), and 0.16, respectively. The sensitivity, NPV, and LR- for full-thickness rotator cuff tears were 69/73 (94.5%), 102/106 (96.2%), and 0.10, respectively, while the diagnostic performance for partial-thickness rotator cuff tears was lower: sensitivity 15/19 (78.9%), NPV 102/106 (96.2%), and LR- 0.39. CONCLUSIONS Our algorithm had high diagnostic performance for full-thickness rotator cuff tears. The deep learning algorithm based on shoulder radiography helps screen for rotator cuff tears by setting an appropriate cutoff value. LEVEL OF EVIDENCE Level III: Diagnostic Study.
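The reported screening statistics follow from a confusion table. The sketch below reproduces the abstract's sensitivity and NPV; the false-positive count is back-solved from the reported LR- and is an illustrative assumption, since the full table is not given.

```python
def screening_metrics(tp, fn, tn, fp):
    """Sensitivity, negative predictive value, and negative likelihood
    ratio -- the screening statistics reported above."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    npv = tn / (tn + fn)
    lr_minus = (1 - sens) / spec
    return sens, npv, lr_minus

# tp=84, fn=8 and tn=102, fn=8 match the abstract's 84/92 and 102/110;
# fp=86 is back-solved from LR- = 0.16 (an assumption, not reported).
sens, npv, lr_minus = screening_metrics(tp=84, fn=8, tn=102, fp=86)
print(f"sensitivity={sens:.3f}  NPV={npv:.3f}  LR-={lr_minus:.2f}")
```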
Affiliation(s)
- Ryosuke Iio
- Department of Orthopaedic Surgery, Graduate School of Medicine, Osaka City University, Osaka, Japan; Department of Orthopaedic Surgery, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Daiju Ueda
- Smart Life Science Lab, Center for Health Science Innovation, Osaka Metropolitan University, Osaka, Japan; Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Toshimasa Matsumoto
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Tomoya Manaka
- Department of Orthopaedic Surgery, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Katsumasa Nakazawa
- Department of Orthopaedic Surgery, Graduate School of Medicine, Osaka City University, Osaka, Japan; Department of Orthopaedic Surgery, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Yoichi Ito
- Ito Clinic, Osaka Shoulder Center, Osaka, Japan
- Yoshihiro Hirakawa
- Department of Orthopaedic Surgery, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Akira Yamamoto
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Masatsugu Shiba
- Smart Life Science Lab, Center for Health Science Innovation, Osaka Metropolitan University, Osaka, Japan; Department of Medical Statistics, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Hiroaki Nakamura
- Department of Orthopaedic Surgery, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
39. Magana-Salgado U, Namburi P, Feigin-Almon M, Pallares-Lopez R, Anthony B. A comparison of point-tracking algorithms in ultrasound videos from the upper limb. Biomed Eng Online 2023; 22:52. [PMID: 37226240] [DOI: 10.1186/s12938-023-01105-y]
Abstract
Tracking points in ultrasound (US) videos can be especially useful to characterize tissues in motion. Tracking algorithms that analyze successive video frames, such as variations of Optical Flow and Lucas-Kanade (LK), exploit frame-to-frame temporal information to track regions of interest. In contrast, convolutional neural-network (CNN) models process each video frame independently of neighboring frames. In this paper, we show that frame-to-frame trackers accumulate error over time. We propose three interpolation-like methods to combat error accumulation and show that all three methods reduce tracking errors in frame-to-frame trackers. On the neural-network end, we show that a CNN-based tracker, DeepLabCut (DLC), outperforms all four frame-to-frame trackers when tracking tissues in motion. DLC is more accurate than the frame-to-frame trackers and less sensitive to variations in types of tissue movement. The only caveat found with DLC comes from its non-temporal tracking strategy, leading to jitter between consecutive frames. Overall, when tracking points in videos of moving tissue, we recommend using DLC when prioritizing accuracy and robustness across movements in videos, and using LK with the proposed error-correction methods for small movements when tracking jitter is unacceptable.
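A frame-to-frame LK tracker of the kind compared here can be run with OpenCV's pyramidal Lucas-Kanade; in this sketch, synthetic frames stand in for ultrasound video and the window/pyramid settings are illustrative.

```python
import cv2
import numpy as np

# Two consecutive "frames": random texture shifted by 2 pixels.
prev = np.random.randint(0, 255, (240, 320), dtype=np.uint8)
curr = np.roll(prev, shift=2, axis=1)

# Points of interest to track, shape (N, 1, 2), float32 as LK requires.
points = np.array([[[100.0, 120.0]], [[200.0, 60.0]]], dtype=np.float32)

next_pts, status, err = cv2.calcOpticalFlowPyrLK(
    prev, curr, points, None, winSize=(21, 21), maxLevel=3)
print(next_pts[status.ravel() == 1])   # tracked point locations
```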
Affiliation(s)
- Uriel Magana-Salgado
- Department of Mechanical Engineering, MIT, Cambridge, MA, 02139, USA
- Mechanical Engineering Graduate Program, MIT, Cambridge, MA, 02139, USA
- Praneeth Namburi
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, 12-3211, Cambridge, MA, 02139, USA
- MIT.Nano Immersion Lab, MIT, Cambridge, MA, 02139, USA
- Roger Pallares-Lopez
- Department of Mechanical Engineering, MIT, Cambridge, MA, 02139, USA
- Mechanical Engineering Graduate Program, MIT, Cambridge, MA, 02139, USA
- Brian Anthony
- Department of Mechanical Engineering, MIT, Cambridge, MA, 02139, USA
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, 12-3211, Cambridge, MA, 02139, USA
- MIT.Nano Immersion Lab, MIT, Cambridge, MA, 02139, USA
40. Jialeng G, Suárez de la Fuente S, Smith T. BoatNet: automated small boat composition detection using deep learning on satellite imagery. UCL Open Environ 2023; 5:e058. [PMID: 37229348] [DOI: 10.14324/111.444/ucloe.000058]
Abstract
Tracking and measuring national carbon footprints is key to achieving the ambitious goals set by the Paris Agreement on carbon emissions. According to statistics, more than 10% of global transportation carbon emissions result from shipping. However, accurate tracking of the emissions of the small boat segment is not well established. Past research has looked into the role played by small boat fleets in terms of greenhouse gases, but it has relied either on high-level technological and operational assumptions or on the installation of global navigation satellite system sensors to understand how this vessel class behaves, and it has been undertaken mainly in relation to fishing and recreational boats. With the advent of open-access satellite imagery and its ever-increasing resolution, innovative methodologies become possible that could eventually lead to the quantification of greenhouse gas emissions. Our work used deep learning algorithms to detect small boats in three cities in the Gulf of California in Mexico. The work produced a methodology named BoatNet that can detect, measure, and classify small boats as leisure boats or fishing boats, even in low-resolution and blurry satellite images, achieving an accuracy of 93.9% with a precision of 74.0%. Future work should focus on relating boat activity to fuel consumption and operational profile to estimate small boat greenhouse gas emissions in any given region.
Affiliation(s)
Guo Jialeng
- UCL Energy Institute, The Bartlett School of Environment, Energy and Resources, University College London, London, UK
Santiago Suárez de la Fuente
- UCL Energy Institute, The Bartlett School of Environment, Energy and Resources, University College London, London, UK
Tristan Smith
- UCL Energy Institute, The Bartlett School of Environment, Energy and Resources, University College London, London, UK

41
Mbani B, Buck V, Greinert J. An automated image-based workflow for detecting megabenthic fauna in optical images with examples from the Clarion-Clipperton Zone. Sci Rep 2023; 13:8350. [PMID: 37221273 DOI: 10.1038/s41598-023-35518-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2022] [Accepted: 05/19/2023] [Indexed: 05/25/2023] Open
Abstract
Recent advances in optical underwater imaging technologies enable the acquisition of huge numbers of high-resolution seafloor images during scientific expeditions. While these images contain valuable information for non-invasive monitoring of megabenthic fauna, flora, and the marine ecosystem, traditional labor-intensive manual approaches to analyzing them are neither feasible nor scalable. Machine learning has therefore been proposed as a solution, but training the respective models still requires substantial manual annotation. Here, we present an automated image-based workflow for Megabenthic Fauna Detection with Faster R-CNN (FaunD-Fast). The workflow significantly reduces the required annotation effort by automating the detection of anomalous superpixels, i.e., regions in underwater images with unusual properties relative to the background seafloor. The bounding-box coordinates of the detected anomalous superpixels are proposed as a set of weak annotations, which are then assigned semantic morphotype labels and used to train a Faster R-CNN object detection model. We applied this workflow to underwater images recorded during cruise SO268 in the German and Belgian contract areas for manganese-nodule exploration within the Clarion-Clipperton Zone (CCZ). A performance assessment of our FaunD-Fast model showed a mean average precision of 78.1% at an intersection-over-union threshold of 0.5, which is on a par with competing models that use costly-to-acquire annotations. Analysis of the megafauna detection results revealed that ophiuroids and xenophyophores were among the most abundant morphotypes, accounting for 62% of all detections within the surveyed area. Investigating regional differences between the two contract areas further revealed that both megafaunal abundance and diversity were higher in the shallower German area, which might be explained by higher food availability in the form of sinking organic material, which decreases from east to west across the CCZ. Since these findings are consistent with studies based on conventional image-based methods, we conclude that our automated workflow significantly reduces the required human effort while still providing accurate estimates of megafaunal abundance and spatial distribution. The workflow is thus useful for the quick but objective generation of baseline information to enable monitoring of remote benthic ecosystems.
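The weak-annotation idea can be sketched as follows: segment a seafloor image into superpixels and flag those whose mean colour deviates strongly from the global background, whose bounding boxes would then seed the detector. The scoring rule and the 3-MAD threshold are illustrative assumptions, not the paper's exact anomaly detector.

```python
import numpy as np
from skimage import io
from skimage.segmentation import slic

img = io.imread("seafloor.png")[..., :3] / 255.0   # hypothetical image
labels = slic(img, n_segments=400, compactness=10, start_label=0)

n = labels.max() + 1
means = np.array([img[labels == i].mean(axis=0) for i in range(n)])
background = np.median(means, axis=0)              # typical seafloor colour
scores = np.linalg.norm(means - background, axis=1)

# Superpixels more than 3 MADs above the median score are "anomalous";
# their bounding boxes would become weak annotations for Faster R-CNN.
mad = np.median(np.abs(scores - np.median(scores)))
for i in np.where(scores > np.median(scores) + 3 * mad)[0]:
    ys, xs = np.where(labels == i)
    print("weak box:", xs.min(), ys.min(), xs.max(), ys.max())
```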
Affiliation(s)
Benson Mbani
- DeepSea Monitoring Group, GEOMAR Helmholtz Center for Ocean Research Kiel, Wischhofstraße 1-3, 24148, Kiel, Germany
Valentin Buck
- DeepSea Monitoring Group, GEOMAR Helmholtz Center for Ocean Research Kiel, Wischhofstraße 1-3, 24148, Kiel, Germany
Jens Greinert
- DeepSea Monitoring Group, GEOMAR Helmholtz Center for Ocean Research Kiel, Wischhofstraße 1-3, 24148, Kiel, Germany
- Institute of Geosciences, Kiel University, Ludewig-Meyn-Str. 10-12, 24118, Kiel, Germany

42
Van Den Berghe T, Babin D, Chen M, Callens M, Brack D, Maes H, Lievens J, Lammens M, Van Sumere M, Morbée L, Hautekeete S, Schatteman S, Jacobs T, Thooft WJ, Herregods N, Huysse W, Jaremko JL, Lambert R, Maksymowych W, Laloo F, Baraliakos X, De Craemer AS, Carron P, Van den Bosch F, Elewaut D, Jans L. Neural network algorithm for detection of erosions and ankylosis on CT of the sacroiliac joints: multicentre development and validation of diagnostic accuracy. Eur Radiol 2023. [PMID: 37219619 DOI: 10.1007/s00330-023-09704-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2022] [Revised: 03/03/2023] [Accepted: 03/25/2023] [Indexed: 05/24/2023]
Abstract
OBJECTIVES To evaluate the feasibility and diagnostic accuracy of a deep learning network for the detection of structural lesions of sacroiliitis on multicentre pelvic CT scans. METHODS Pelvic CT scans of 145 patients (81 female, 121 Ghent University/24 Alberta University, 18-87 years old, mean 40 ± 13 years, 2005-2021) with a clinical suspicion of sacroiliitis were retrospectively included. After manual sacroiliac joint (SIJ) segmentation and structural lesion annotation, a U-Net for SIJ segmentation and two separate convolutional neural networks (CNN) for erosion and ankylosis detection were trained. In-training validation and tenfold validation testing (U-Net: n = 10 × 58; CNN: n = 10 × 29) on a test dataset were performed to assess performance on a slice-by-slice and patient level (Dice coefficient/accuracy/sensitivity/specificity/positive and negative predictive value/ROC AUC). Patient-level optimisation was applied to increase performance on predefined statistical metrics. Gradient-weighted class activation mapping (Grad-CAM++) heatmap explainability analysis highlighted the image regions that were statistically important for the algorithm's decisions. RESULTS For SIJ segmentation, a Dice coefficient of 0.75 was obtained in the test dataset. For slice-by-slice structural lesion detection, sensitivity/specificity/ROC AUC of 95%/89%/0.92 and 93%/91%/0.91 were obtained in the test dataset for erosion and ankylosis detection, respectively. For patient-level lesion detection after pipeline optimisation for the predefined statistical metrics, sensitivity/specificity of 95%/85% and 82%/97% were obtained for erosion and ankylosis detection, respectively. Grad-CAM++ explainability analysis highlighted cortical edges as the focus of pipeline decisions. CONCLUSIONS An optimised deep learning pipeline, including an explainability analysis, detects structural lesions of sacroiliitis on pelvic CT scans with excellent statistical performance on a slice-by-slice and patient level. CLINICAL RELEVANCE STATEMENT An optimised deep learning pipeline, including a robust explainability analysis, detects structural lesions of sacroiliitis on pelvic CT scans with excellent statistical metrics on a slice-by-slice and patient level. KEY POINTS • Structural lesions of sacroiliitis can be detected automatically in pelvic CT scans. • Both automatic segmentation and disease detection yield excellent statistical outcome metrics. • The algorithm takes decisions based on cortical edges, rendering an explainable solution.
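Two evaluation pieces of this kind of pipeline are easy to show in code: the Dice coefficient for segmentation and a patient-level decision that aggregates slice-by-slice CNN outputs. The aggregation rule (fraction of positive slices) and both thresholds are illustrative assumptions, stand-ins for the paper's tuned patient-level optimisation.

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient between two binary masks."""
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + 1e-8)

def patient_positive(slice_probs: np.ndarray,
                     slice_thr: float = 0.5,
                     frac_thr: float = 0.2) -> bool:
    """Call a patient positive if enough slices exceed the slice threshold.
    Both thresholds would be tuned for predefined sensitivity/specificity."""
    positive_fraction = (slice_probs >= slice_thr).mean()
    return positive_fraction >= frac_thr

probs = np.array([0.1, 0.7, 0.9, 0.2, 0.8])   # hypothetical CNN outputs
print(patient_positive(probs))                 # True: 3/5 slices positive
```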
Affiliation(s)
Thomas Van Den Berghe
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Danilo Babin
- Department of Telecommunication and Information Processing - Image Processing and Interpretation (TELIN-IPI), Faculty of Engineering and Architecture, Ghent University - IMEC, Sint-Pietersnieuwstraat 41, 9000, Ghent, Belgium
Min Chen
- Department of Radiology, Peking University Shenzhen Hospital, Shenzhen, 518036, China
Martijn Callens
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Denim Brack
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Helena Maes
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Jan Lievens
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Marie Lammens
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Maxime Van Sumere
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Lieve Morbée
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Simon Hautekeete
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Stijn Schatteman
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Tom Jacobs
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Willem-Jan Thooft
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Nele Herregods
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Wouter Huysse
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Jacob L Jaremko
- Department of Radiology and Diagnostic Imaging and Rheumatology, University of Alberta, 8440 122 Street NW, Edmonton, Alberta, T6G 2B7, Canada
Robert Lambert
- Department of Radiology and Diagnostic Imaging and Rheumatology, University of Alberta, 8440 122 Street NW, Edmonton, Alberta, T6G 2B7, Canada
Walter Maksymowych
- Department of Radiology and Diagnostic Imaging and Rheumatology, University of Alberta, 8440 122 Street NW, Edmonton, Alberta, T6G 2B7, Canada
Frederiek Laloo
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
Xenofon Baraliakos
- Rheumazentrum Ruhrgebiet Herne, Ruhr-University Bochum, Claudiusstraße 45, 44649, Herne, Germany
Ann-Sophie De Craemer
- Department of Rheumatology, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
- Vlaams Instituut voor Biotechnologie (VIB) Centre for Inflammation Research (IRC), Ghent University, Technologiepark 927, 9052, Ghent, Belgium
Philippe Carron
- Department of Rheumatology, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
- Vlaams Instituut voor Biotechnologie (VIB) Centre for Inflammation Research (IRC), Ghent University, Technologiepark 927, 9052, Ghent, Belgium
Filip Van den Bosch
- Department of Rheumatology, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
- Vlaams Instituut voor Biotechnologie (VIB) Centre for Inflammation Research (IRC), Ghent University, Technologiepark 927, 9052, Ghent, Belgium
Dirk Elewaut
- Department of Rheumatology, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium
- Vlaams Instituut voor Biotechnologie (VIB) Centre for Inflammation Research (IRC), Ghent University, Technologiepark 927, 9052, Ghent, Belgium
Lennart Jans
- Department of Radiology and Medical Imaging, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium

43
Lee JS, Shin K, Ryu SM, Jegal SG, Lee W, Yoon MA, Hong GS, Paik S, Kim N. Screening of adolescent idiopathic scoliosis using generative adversarial network (GAN) inversion method in chest radiographs. PLoS One 2023; 18:e0285489. [PMID: 37216382 DOI: 10.1371/journal.pone.0285489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2023] [Accepted: 04/25/2023] [Indexed: 05/24/2023] Open
Abstract
OBJECTIVE Conventional computer-aided diagnosis using convolutional neural networks (CNN) has limitations in detecting subtle changes and determining accurate decision boundaries in structural diseases with a spectrum of severity, such as scoliosis. We devised a new method to screen for adolescent idiopathic scoliosis (AIS) in chest X-rays (CXRs) that exploits the discriminative ability of the latent space of a generative adversarial network (GAN) together with a simple multi-layer perceptron (MLP). MATERIALS AND METHODS Our model was trained and validated in two steps. First, we trained a GAN on CXRs with various scoliosis severities and used the trained network as a feature extractor via the GAN inversion method. Second, we classified each vector from the latent space with a simple MLP. RESULTS The 2-layer MLP exhibited the best classification performance in the ablation study. With this model, the area under the receiver operating characteristic curve (AUROC) was 0.850 in the internal and 0.847 in the external dataset. Furthermore, when sensitivity was fixed at 0.9, the model's specificity was 0.697 in the internal and 0.646 in the external dataset. CONCLUSION We developed a classifier for AIS through generative representation learning. Our model achieves good AUROC on screening chest radiographs in both the internal and external datasets. It has learned the severity spectrum of AIS, enabling it to generate normal images even when trained solely on scoliosis radiographs.
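The second stage is simple enough to sketch: a 2-layer MLP classifying latent vectors recovered by GAN inversion, evaluated with AUROC. The latent dimensionality and the random stand-in data are illustrative assumptions; the GAN encoder itself is taken as given.

```python
import torch
import torch.nn as nn
from sklearn.metrics import roc_auc_score

LATENT_DIM = 512                      # assumed StyleGAN-like latent size

mlp = nn.Sequential(
    nn.Linear(LATENT_DIM, 256), nn.ReLU(),
    nn.Linear(256, 1),
)
opt = torch.optim.Adam(mlp.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

z = torch.randn(128, LATENT_DIM)      # stand-ins for inverted latents
y = torch.randint(0, 2, (128, 1)).float()

for _ in range(100):                  # toy training loop
    opt.zero_grad()
    loss = loss_fn(mlp(z), y)
    loss.backward()
    opt.step()

with torch.no_grad():
    probs = torch.sigmoid(mlp(z)).numpy().ravel()
print("training AUROC:", roc_auc_score(y.numpy().ravel(), probs))
```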
Affiliation(s)
Jun Soo Lee
- Department of Industrial Engineering, Seoul National University, Seoul, Korea
Keewon Shin
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
Seung Min Ryu
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Department of Orthopedic Surgery, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea
Seong Gyu Jegal
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
Woojin Lee
- Department of Radiology, Hanyang University Hospital, Seoul, Korea
Min A Yoon
- Department of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
Gil-Sun Hong
- Department of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
Sanghyun Paik
- Department of Radiology, Hanyang University Hospital, Seoul, Korea
Namkug Kim
- Department of Convergence Medicine, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea

44
Ullmann D, Taran O, Voloshynovskiy S. Multivariate Time Series Information Bottleneck. Entropy (Basel) 2023; 25:831. [PMID: 37238586 DOI: 10.3390/e25050831] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/22/2023] [Revised: 05/10/2023] [Accepted: 05/17/2023] [Indexed: 05/28/2023]
Abstract
Time series (TS) and multiple time series (MTS) predictions have historically paved the way for distinct families of deep learning models. The temporal dimension, distinguished by its evolutionary sequential aspect, is usually modeled by decomposition into the trio of "trend, seasonality, noise", by attempts to copy the functioning of human synapses, and more recently, by transformer models with self-attention on the temporal dimension. Such models find applications in finance and e-commerce, where even a performance gain of less than 1% has large monetary repercussions, as well as potential applications in natural language processing (NLP), medicine, and physics. To the best of our knowledge, the information bottleneck (IB) framework has not received significant attention in the context of TS or MTS analyses. One can demonstrate that compression of the temporal dimension is key in the context of MTS. We propose a new approach with partial convolution, in which a time sequence is encoded into a two-dimensional representation resembling an image. Accordingly, we use recent advances in image extension to predict an unseen part of an image from a given one. We show that our model compares well with traditional TS models, has information-theoretical foundations, and can easily be extended to more dimensions than time and space alone. An evaluation of our multiple time series-information bottleneck (MTS-IB) model demonstrates its efficiency on electricity production, road traffic, and astronomical data representing solar activity, as recorded by NASA's Interface Region Imaging Spectrograph (IRIS) satellite.
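The encoding idea can be sketched in a few lines: stack a multivariate series into a 2-D, image-like array (variables by time), normalise it, and mask the right-hand band that an image-extension model would be trained to reconstruct. The series, horizon, and normalisation below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
series = rng.standard_normal((8, 256)).cumsum(axis=1)  # 8 variables, 256 steps
horizon = 32                                           # steps to "extend"

lo, hi = series.min(), series.max()
image = (series - lo) / (hi - lo)       # scale to [0, 1], image-like

masked = image.copy()
masked[:, -horizon:] = 0.0              # hide the future band
target = image[:, -horizon:]            # what the model should reconstruct
print(masked.shape, target.shape)       # (8, 256) (8, 32)
```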
Affiliation(s)
Denis Ullmann
- Faculty of Science, University of Geneva, CUI, 1227 Carouge, Switzerland
Olga Taran
- Faculty of Science, University of Geneva, CUI, 1227 Carouge, Switzerland

45
Gdoura A, Degünther M, Lorenz B, Effland A. Combining CNNs and Markov-like Models for Facial Landmark Detection with Spatial Consistency Estimates. J Imaging 2023; 9:104. [PMID: 37233323 DOI: 10.3390/jimaging9050104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2023] [Revised: 05/09/2023] [Accepted: 05/11/2023] [Indexed: 05/27/2023] Open
Abstract
The accurate localization of facial landmarks is essential for several tasks, including face recognition, head pose estimation, facial region extraction, and emotion detection. Although the number of required landmarks is task-specific, models are typically trained on all landmarks available in the datasets, which limits efficiency. Furthermore, model performance is strongly influenced by scale-dependent local appearance information around the landmarks and by the global shape information they generate. To account for this, we propose a lightweight hybrid model for facial landmark detection designed specifically for pupil region extraction. Our design combines a convolutional neural network (CNN) with a Markov random field (MRF)-like process trained on only 17 carefully selected landmarks. The advantage of our model is its ability to run different image scales through the same convolutional layers, resulting in a significant reduction in model size. In addition, we employ an approximation of the MRF, run on a subset of landmarks, to validate the spatial consistency of the generated shape. This validation is performed against a learned conditional distribution expressing the location of one landmark relative to its neighbor. Experimental results on popular facial landmark localization datasets such as 300-W, WFLW, and HELEN demonstrate the accuracy of our proposed model, which also achieves state-of-the-art performance on a well-defined robustness metric. In conclusion, the results demonstrate the ability of our lightweight model to filter out spatially inconsistent predictions, even with significantly fewer training landmarks.
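A minimal sketch of the spatial-consistency idea: model the offset of one landmark relative to a neighbour as a 2-D Gaussian learned from training shapes, then flag predictions whose offset is improbable under that prior. The synthetic data and the 3-sigma cut-off are illustrative assumptions, not the paper's learned conditional distribution.

```python
import numpy as np

rng = np.random.default_rng(1)
# Training offsets between landmark j and its neighbour i (N x 2).
train_offsets = rng.normal(loc=[10.0, 2.0], scale=[1.5, 1.0], size=(500, 2))
mu = train_offsets.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(train_offsets, rowvar=False))

def consistent(p_i: np.ndarray, p_j: np.ndarray, thr: float = 3.0) -> bool:
    """Mahalanobis test of the predicted offset against the learned prior."""
    d = (p_j - p_i) - mu
    return float(np.sqrt(d @ cov_inv @ d)) < thr

print(consistent(np.array([50.0, 60.0]), np.array([60.2, 61.8])))  # True
print(consistent(np.array([50.0, 60.0]), np.array([80.0, 90.0])))  # False
```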
Affiliation(s)
Ahmed Gdoura
- Department of Ophthalmology, Justus-Liebig-University Gießen, 35392 Gießen, Germany
- Department of Mathematics, Natural Sciences and Data Processing, Technische Hochschule Mittelhessen, 61169 Friedberg, Germany
Markus Degünther
- Department of Mathematics, Natural Sciences and Data Processing, Technische Hochschule Mittelhessen, 61169 Friedberg, Germany
Birgit Lorenz
- Department of Ophthalmology, Justus-Liebig-University Gießen, 35392 Gießen, Germany
- Department of Ophthalmology, University Hospital Bonn, 53127 Bonn, Germany
Alexander Effland
- Institute of Applied Mathematics, University of Bonn, 53115 Bonn, Germany

46
Segal Y, Hadar O, Lhotska L. Using EfficientNet-B7 (CNN), Variational Auto Encoder (VAE) and Siamese Twins' Networks to Evaluate Human Exercises as Super Objects in a TSSCI Images. J Pers Med 2023; 13:874. [PMID: 37241044 DOI: 10.3390/jpm13050874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2023] [Revised: 05/16/2023] [Accepted: 05/18/2023] [Indexed: 05/28/2023] Open
Abstract
In this article, we introduce a new approach to human movement analysis that defines a movement as a static super object represented by a single two-dimensional image. The method is applicable to remote healthcare applications, such as physiotherapeutic exercises, and allows researchers to label and describe an entire exercise as a standalone object, isolated from the reference video. This approach supports various tasks, including detecting similar movements in a video, measuring and comparing movements, generating new similar movements, and defining choreography by controlling specific parameters in the human body skeleton. As a result, we can eliminate manual image labeling, disregard the problem of finding the start and end of an exercise, overcome synchronization issues between movements, and apply any deep learning operation that processes super objects in images in general. We demonstrate two application use cases: the first shows how to verify and score a fitness exercise, while the second shows how to generate similar movements in the human skeleton space, addressing the challenge of supplying sufficient training data for deep learning (DL) applications. A variational autoencoder (VAE) simulator and an EfficientNet-B7 classifier embedded within a Siamese twin neural network are presented to demonstrate the two use cases. Together they illustrate the versatility of the concept for measuring, categorizing, and inferring human behavior, and for generating gestures for other researchers.
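A rough sketch of the "super object" encoding: map a skeleton sequence (frames x joints x xy) onto a single 2-D image whose rows are joint coordinates and whose columns are time. The dimensions and normalisation are illustrative assumptions about a TSSCI-style layout, not the authors' exact encoding.

```python
import numpy as np

rng = np.random.default_rng(2)
frames, joints = 120, 17
seq = rng.uniform(0, 1, size=(frames, joints, 2))   # hypothetical pose track

# Flatten (joint, coordinate) into rows -> image of shape (2*joints, frames).
image = seq.reshape(frames, joints * 2).T
image8 = (255 * (image - image.min())
          / (np.ptp(image) + 1e-8)).astype(np.uint8)
print(image8.shape)   # (34, 120): one whole exercise as one static image
```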
Affiliation(s)
Yoram Segal
- School of Electrical and Computer Engineering, Ben Gurion University of the Negev, Be'er-Sheva 84105001, Israel
Ofer Hadar
- School of Electrical and Computer Engineering, Ben Gurion University of the Negev, Be'er-Sheva 84105001, Israel
Lenka Lhotska
- Czech Institute of Informatics, Robotics and Cybernetics, Faculty of Biomedical Engineering, Czech Technical University in Prague, 160 00 Prague, Czech Republic

47
Chen X, Pu H, He Y, Lai M, Zhang D, Chen J, Pu H. An Efficient Method for Monitoring Birds Based on Object Detection and Multi-Object Tracking Networks. Animals (Basel) 2023; 13:1713. [PMID: 37238144 DOI: 10.3390/ani13101713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2023] [Revised: 05/14/2023] [Accepted: 05/16/2023] [Indexed: 05/28/2023] Open
Abstract
To protect birds, it is crucial to identify their species and determine their population across different regions. However, current bird monitoring mainly relies on manual techniques, such as point counts conducted by researchers and ornithologists in the field, which can be inefficient, prone to error, and limited in ways that do not always serve bird conservation efforts. In this paper, we propose an efficient method for wetland bird monitoring based on object detection and multi-object tracking networks. First, we construct a manually annotated dataset for bird species detection, annotating the entire body and the head of each bird separately, comprising 3737 bird images; we also build a new dataset containing 11,139 complete individual bird images for the multi-object tracking task. Second, we perform comparative experiments with a batch of state-of-the-art object detection networks, and the results demonstrate that the YOLOv7 network, trained with the dataset labeling the entire body of the bird, is the most effective. To enhance YOLOv7, we add three GAM modules on the head side of the network to minimize information diffusion and amplify global interaction representations, and we use an Alpha-IoU loss for more accurate bounding-box regression. With these improvements, mAP@0.5 rises to 0.951 and mAP@0.5:0.95 to 0.815. We then pass the detections to DeepSORT for bird tracking and per-class counting. Finally, we apply an area-based counting method per species to obtain information about flock distribution. The method described in this paper effectively addresses the monitoring challenges in bird conservation.
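The Alpha-IoU idea used for the box regression is compact enough to sketch: raise the IoU to a power alpha (alpha = 3 is the default in the original Alpha-IoU paper) so low-overlap boxes contribute a larger loss. The corner-format boxes below are an illustrative assumption; this is a sketch of the basic form, not YOLOv7's full loss.

```python
import torch

def alpha_iou_loss(pred: torch.Tensor, target: torch.Tensor,
                   alpha: float = 3.0) -> torch.Tensor:
    """Basic Alpha-IoU loss for boxes in (x1, y1, x2, y2) format."""
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + 1e-7)
    return (1.0 - iou.pow(alpha)).mean()

pred = torch.tensor([[0., 0., 10., 10.]])
tgt = torch.tensor([[2., 2., 12., 12.]])
# IoU ~ 0.47, so the alpha-powered loss (~0.90) exceeds plain 1 - IoU (~0.53).
print(alpha_iou_loss(pred, tgt))
```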
Affiliation(s)
Xian Chen
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
Hongli Pu
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
Yihui He
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
Mengzhen Lai
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
Daike Zhang
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
Junyang Chen
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
Haibo Pu
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Ya'an Digital Agricultural Engineering Technology Research Center, Ya'an 625000, China

48
Hsu LY, Ali Z, Bagheri H, Huda F, Redd BA, Jones EC. Comparison of CT and Dixon MR Abdominal Adipose Tissue Quantification Using a Unified Computer-Assisted Software Framework. Tomography 2023; 9:1041-51. [PMID: 37218945 DOI: 10.3390/tomography9030085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2023] [Revised: 05/17/2023] [Accepted: 05/18/2023] [Indexed: 05/24/2023] Open
Abstract
PURPOSE Reliable and objective measures of abdominal fat distribution across imaging modalities are essential for various clinical and research scenarios, such as assessing cardiometabolic disease risk due to obesity. We aimed to compare quantitative measures of subcutaneous (SAT) and visceral (VAT) adipose tissue in the abdomen between computed tomography (CT) and Dixon-based magnetic resonance (MR) images using a unified computer-assisted software framework. MATERIALS AND METHODS This study included 21 subjects who underwent abdominal CT and Dixon MR imaging on the same day. For each subject, two matched axial CT and fat-only MR images at the L2-L3 and L4-L5 intervertebral levels were selected for fat quantification. For each image, outer and inner abdominal wall regions, as well as SAT and VAT pixel masks, were automatically generated by our software; the computer-generated results were then inspected and corrected by an expert reader. RESULTS There was excellent agreement in both abdominal wall segmentation and adipose tissue quantification between matched CT and MR images. Pearson coefficients were 0.97 for both outer and inner region segmentation, 0.99 for SAT, and 0.97 for VAT quantification. Bland-Altman analyses indicated minimal bias in all comparisons. CONCLUSION We showed that abdominal adipose tissue can be reliably quantified from both CT and Dixon MR images using a unified computer-assisted software framework. This flexible framework has a simple-to-use workflow for measuring SAT and VAT from both modalities to support various clinical research applications.
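The agreement analysis reported here is standard and easy to reproduce: Pearson correlation plus Bland-Altman bias and 95% limits of agreement between paired CT and MR measurements. The numbers below are random stand-ins, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(3)
ct = rng.uniform(50, 300, size=21)        # hypothetical SAT areas (cm^2)
mr = ct + rng.normal(0, 8, size=21)       # MR values with small noise

r = np.corrcoef(ct, mr)[0, 1]             # Pearson correlation
diff = mr - ct
bias = diff.mean()                        # Bland-Altman bias
loa = 1.96 * diff.std(ddof=1)             # half-width of 95% limits
print(f"r = {r:.3f}, bias = {bias:.2f}, "
      f"LoA = [{bias - loa:.2f}, {bias + loa:.2f}]")
```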
Affiliation(s)
Li-Yueh Hsu
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Building 10, Room 1C370, 10 Center Drive, Bethesda, MD 20892, USA
Zara Ali
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Building 10, Room 1C370, 10 Center Drive, Bethesda, MD 20892, USA
Hadi Bagheri
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Building 10, Room 1C370, 10 Center Drive, Bethesda, MD 20892, USA
Fahimul Huda
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Building 10, Room 1C370, 10 Center Drive, Bethesda, MD 20892, USA
Bernadette A Redd
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Building 10, Room 1C370, 10 Center Drive, Bethesda, MD 20892, USA
Elizabeth C Jones
- Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Building 10, Room 1C370, 10 Center Drive, Bethesda, MD 20892, USA

49
Gustavo Pereira de Andrade A. Letter to the editor regarding "Three-dimensional videography using omnidirectional cameras: An approach inspired by the direct linear transformation method". J Biomech 2023; 155:111641. [PMID: 37245384 DOI: 10.1016/j.jbiomech.2023.111641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2023] [Revised: 05/09/2023] [Accepted: 05/10/2023] [Indexed: 05/30/2023]
Affiliation(s)
André Gustavo Pereira de Andrade
- School of Physical Education, Physiotherapy and Occupational Therapy, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Brazilian Paralympic Reference Center, Sports Training Center, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil

50
Liu Y, Zuo S. Self-supervised monocular depth estimation for gastrointestinal endoscopy. Comput Methods Programs Biomed 2023; 238:107619. [PMID: 37235969 DOI: 10.1016/j.cmpb.2023.107619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/07/2022] [Revised: 04/26/2023] [Accepted: 05/18/2023] [Indexed: 05/28/2023]
Abstract
BACKGROUND AND OBJECTIVE Gastrointestinal (GI) endoscopy is a promising tool for GI cancer screening. However, the limited field of view and the uneven skills of endoscopists make it difficult to accurately identify polyps and follow up on precancerous lesions under endoscopy. Estimating depth from GI endoscopic sequences is essential for a series of AI-assisted surgical techniques, yet depth estimation for GI endoscopy remains a challenging task due to the peculiarities of the environment and the scarcity of datasets. In this paper, we propose a self-supervised monocular depth estimation method for GI endoscopy. METHODS A depth estimation network and a camera ego-motion estimation network are first constructed to obtain the depth and pose information of the sequence, respectively; the model is then trained in a self-supervised manner, using the multi-scale structural similarity with L1 norm (MS-SSIM+L1) loss between the target frame and the reconstructed image as part of the training loss. The MS-SSIM+L1 loss is good at preserving high-frequency information and maintains invariance to brightness and color. Our model is a U-shaped convolutional network with a dual-attention mechanism, which is beneficial for capturing multi-scale contextual information and greatly improves the accuracy of depth estimation. We evaluated our method qualitatively and quantitatively against different state-of-the-art methods. RESULTS AND CONCLUSIONS The experimental results show that our method generalizes well, achieving lower error metrics and higher accuracy metrics on both the UCL and EndoSLAM datasets. The proposed method has also been validated on clinical GI endoscopy, demonstrating its potential clinical value.
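A minimal sketch of such a photometric objective: a weighted mix of MS-SSIM and L1 between the target frame and the frame reconstructed by warping. It uses the third-party pytorch_msssim package, which is an assumption (not the authors' code), and alpha = 0.84 follows the common choice from Zhao et al.'s loss-functions study rather than this paper.

```python
import torch
from pytorch_msssim import ms_ssim

def ms_ssim_l1_loss(recon: torch.Tensor, target: torch.Tensor,
                    alpha: float = 0.84) -> torch.Tensor:
    """Images are N x C x H x W tensors scaled to [0, 1]."""
    ssim_term = 1.0 - ms_ssim(recon, target, data_range=1.0)
    l1_term = torch.abs(recon - target).mean()
    return alpha * ssim_term + (1.0 - alpha) * l1_term

recon = torch.rand(2, 3, 192, 192)    # stand-in for the warped frame
target = torch.rand(2, 3, 192, 192)   # stand-in for the target frame
print(ms_ssim_l1_loss(recon, target))
```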
Affiliation(s)
Yuying Liu
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, Tianjin, China
Siyang Zuo
- Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, Tianjin, China