151
Zhao R, Xie M, Feng X, Su X, Zhang H, Yang W. Content-illumination coupling guided low-light image enhancement network. Sci Rep 2024; 14:8456. [PMID: 38605053 PMCID: PMC11009353 DOI: 10.1038/s41598-024-58965-0]
Abstract
Current low-light enhancement algorithms fail to suppress noise while enhancing brightness and may introduce structural and color distortion caused by halos or artifacts. This paper proposes a content-illumination coupling guided low-light image enhancement network (CICGNet), which develops a truss topology based on Retinex as its backbone to decompose the low-light image components in an end-to-end way. The preservation of content features and the enhancement of illumination features are carried out along the depth and width directions of the truss topology. Each submodule uses the same resolution for input and output to avoid introducing noise. The illumination component avoids misestimation of global and local illumination by using pre- and post-activation features at different depth levels, which prevents possible halos and artifacts. The network progressively enhances the illumination component and maintains the content component stage by stage. The proposed algorithm demonstrates better performance than advanced attention-based low-light enhancement algorithms and state-of-the-art image restoration algorithms. We also perform extensive ablation studies and demonstrate the impact of the low-light enhancement algorithm on downstream computer vision tasks. Code is available at: https://github.com/Ruini94/CICGNet .
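The Retinex model that the abstract names as the basis of CICGNet's backbone treats an image S as reflectance R times illumination L (S = R · L). A minimal toy sketch of that classical decomposition, assuming a crude box-blur illumination estimate (our own illustrative code, not the paper's learned network):

```python
# Toy Retinex decomposition: S = R * L, with L estimated by a box blur.
# The box-blur illumination estimate is an assumption for illustration;
# CICGNet learns the decomposition end-to-end instead.

def box_blur(img, radius=1):
    """Estimate smooth illumination by averaging over a local window."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx]
                    for yy in range(max(0, y - radius), min(h, y + radius + 1))
                    for xx in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out

def retinex_decompose(img, eps=1e-6):
    """Split a grayscale image into (reflectance, illumination)."""
    L = box_blur(img)
    R = [[img[y][x] / (L[y][x] + eps) for x in range(len(img[0]))]
         for y in range(len(img))]
    return R, L
```

Enhancement under this model amounts to brightening L while keeping R fixed, which is why halo-free illumination estimation matters.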
Affiliation(s)
- Ruini Zhao
  - Key Laboratory of Space Precision Measurement Technology, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xian, 710119, China
- Meilin Xie
  - Key Laboratory of Space Precision Measurement Technology, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xian, 710119, China
  - Pilot National Laboratory for Marine Science and Technology, Qingdao, 266200, China
- Xubin Feng
  - Key Laboratory of Space Precision Measurement Technology, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xian, 710119, China
- Xiuqin Su
  - Key Laboratory of Space Precision Measurement Technology, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xian, 710119, China
  - Pilot National Laboratory for Marine Science and Technology, Qingdao, 266200, China
- Huiming Zhang
  - Institute of Intelligent Transportation, Shandong Provincial Communications Planning and Design Inst Group Co., Ltd., Jinan, 250101, China
- Wei Yang
  - Chang'an University, Xian, 710064, China
152
Müller R. Bioinspiration from bats and new paradigms for autonomy in natural environments. Bioinspir Biomim 2024; 19:033001. [PMID: 38452384 DOI: 10.1088/1748-3190/ad311e]
Abstract
Achieving autonomous operation in complex natural environments remains an unsolved challenge. Conventional engineering approaches to this problem have focused on collecting large amounts of sensory data that are used to create detailed digital models of the environment. However, this merely shifts the challenge of identifying the relevant sensory information and linking it to action control into the domain of the digital world model. Furthermore, it imposes high demands on computing power and introduces large processing latencies that hamper autonomous real-time performance. Certain species of bats that are able to navigate and hunt their prey in dense vegetation could serve as a biological model system for an alternative approach to addressing the fundamental issues associated with autonomy in complex natural environments. Bats navigating in dense vegetation rely on clutter echoes, i.e. signals that consist of unresolved contributions from many scatterers. Yet the animals are able to extract the relevant information from these input signals with brains that are often less than 1 g in mass. Pilot results indicate that information relevant to location identification and passageway finding can be obtained directly from clutter echoes, opening up the possibility that the bats' skill can be replicated in man-made autonomous systems.
Affiliation(s)
- Rolf Müller
  - Department of Mechanical Engineering, Virginia Tech, Blacksburg, VA 24061, United States of America
153
Washington P. A Perspective on Crowdsourcing and Human-in-the-Loop Workflows in Precision Health. J Med Internet Res 2024; 26:e51138. [PMID: 38602750 DOI: 10.2196/51138]
Abstract
Modern machine learning approaches have led to performant diagnostic models for a variety of health conditions. Several machine learning approaches, such as decision trees and deep neural networks, can, in principle, approximate any function. However, this power is both a gift and a curse, as the propensity toward overfitting is magnified when the input data are heterogeneous and high dimensional and the output class is highly nonlinear. This issue can especially plague diagnostic systems that predict behavioral and psychiatric conditions diagnosed with subjective criteria. An emerging solution is crowdsourcing, where crowd workers annotate complex behavioral features in return for monetary compensation or a gamified experience. These labels can then be used to derive a diagnosis, either directly or by using the labels as inputs to a diagnostic machine learning model. This viewpoint describes existing work in this nascent field and discusses ongoing challenges and opportunities with crowd-powered diagnostic systems. With the correct considerations, the addition of crowdsourcing to human-in-the-loop machine learning workflows for the prediction of complex and nuanced health conditions can accelerate screening, diagnostics, and ultimately access to care.
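The aggregation step described above, turning multiple crowd annotations of one behavioral feature into a single label or score, can be sketched in a few lines. The function names and the majority-vote/mean strategies are our own illustrative assumptions, not a method prescribed by the viewpoint:

```python
# Toy crowd-label aggregation: majority vote for categorical features,
# mean for ordinal ratings. Real systems often weight workers by
# estimated reliability; that refinement is omitted here.
from collections import Counter

def aggregate_majority(labels):
    """Return the most common crowd label (ties break by first seen)."""
    return Counter(labels).most_common(1)[0][0]

def aggregate_mean(scores):
    """Average ordinal ratings (e.g., 1-5 severity) across workers."""
    return sum(scores) / len(scores)
```

The aggregated values can then be fed to a downstream diagnostic classifier, as the abstract describes.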
Affiliation(s)
- Peter Washington
  - Information and Computer Sciences, University of Hawaii at Manoa, Honolulu, HI, United States
154
Ngan KH, Mansouri-Benssassi E, Phelan J, Townsend J, Garcez AD. From explanation to intervention: Interactive knowledge extraction from Convolutional Neural Networks used in radiology. PLoS One 2024; 19:e0293967. [PMID: 38598468 PMCID: PMC11006149 DOI: 10.1371/journal.pone.0293967]
Abstract
Deep learning models such as convolutional neural networks (CNNs) are very effective at extracting complex image features from medical X-rays. However, the limited interpretability of CNNs has hampered their deployment in medical settings because they have failed to gain trust among clinicians. In this work, we propose an interactive framework that allows clinicians to ask what-if questions and intervene in the decisions of a CNN, with the aim of increasing trust in the system. The framework translates a layer of a trained CNN into a measurable and compact set of symbolic rules. Expert interaction with visualizations of the rules promotes the use of clinically relevant CNN kernels and attaches meaning to the rules. The definition and relevance of the kernels are supported by radiomics analyses and permutation evaluations, respectively. CNN kernels that do not have a clinically meaningful interpretation are removed without affecting model performance. By allowing clinicians to evaluate the impact of adding or removing kernels from the rule set, our approach produces an interpretable refinement of the data-driven CNN in alignment with medical best practice.
Affiliation(s)
- Kwun Ho Ngan
  - Data Science Institute, City, University of London, London, United Kingdom
  - Fujitsu Research of Europe Ltd, Slough, United Kingdom
- James Phelan
  - Data Science Institute, City, University of London, London, United Kingdom
155
Domanskyi S, Srivastava A, Kaster J, Li H, Herlyn M, Rubinstein JC, Chuang JH. Nextflow pipeline for Visium and H&E data from patient-derived xenograft samples. Cell Rep Methods 2024:100759. [PMID: 38626768 DOI: 10.1016/j.crmeth.2024.100759]
Abstract
We designed a Nextflow DSL2-based pipeline, Spatial Transcriptomics Quantification (STQ), for simultaneous processing of 10x Genomics Visium spatial transcriptomics data and a matched hematoxylin and eosin (H&E)-stained whole-slide image (WSI), optimized for patient-derived xenograft (PDX) cancer specimens. Our pipeline enables the classification of sequenced transcripts for deconvolving the mouse and human species and mapping the transcripts to reference transcriptomes. We align the H&E WSI with the spatial layout of the Visium slide and generate imaging and quantitative morphology features for each Visium spot. The pipeline design enables multiple analysis workflows, including single or dual reference genome input and stand-alone image analysis. We show the utility of our pipeline on a dataset from Visium profiling of four melanoma PDX samples. The clustering of Visium spots and clustering of H&E imaging features reveal similar patterns arising from the two data modalities.
Affiliation(s)
- Sergii Domanskyi
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Anuj Srivastava
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Haiyin Li
  - The Wistar Institute, Philadelphia, PA 19104, USA
- Jill C Rubinstein
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
  - Hartford HealthCare Cancer Institute at St. Vincent's Medical Center, Bridgeport, CT 06606, USA
- Jeffrey H Chuang
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
  - UCONN Health Department of Genetics and Genome Sciences, Farmington, CT 06032, USA
156
Zhang X, Wang C, Sun Q, Wu J, Dai Y, Li E, Wu J, Chen H, Duan S, Hu W. Inorganic Halide Perovskite Nanowires/Conjugated Polymer Heterojunction-Based Optoelectronic Synaptic Transistors for Dynamic Machine Vision. Nano Lett 2024; 24:4132-4140. [PMID: 38534013 DOI: 10.1021/acs.nanolett.3c05092]
Abstract
Inspired by the retina, artificial optoelectronic synapses have groundbreaking potential for machine vision. The field-effect transistor is a crucial platform for optoelectronic synapses because it is highly sensitive to external stimuli and can modulate conductivity. Owing to their decent optical absorption, perovskite materials have been widely employed for constructing optoelectronic synaptic transistors. However, the reported optoelectronic synaptic transistors focus on the static processing of independent stimuli at different moments, while natural visual information consists of temporal signals. Here, we report CsPbBrI2 nanowire-based optoelectronic synaptic transistors to study, for the first time, the dynamic responses of artificial synaptic transistors to time-varying visual information. Moreover, on the basis of the dynamic synaptic behavior, a hardware system is built that recognizes the trajectory of moving objects with an accuracy of 85%. This work offers a new way to develop artificial optoelectronic synapses for the construction of dynamic machine vision systems.
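The temporal behavior the abstract relies on is commonly summarized, across the optoelectronic-synapse literature, as a photocurrent that jumps on each light pulse and decays between pulses, so closely spaced pulses ride on residual current. A hedged toy model of that idea (our own sketch with assumed gain and decay constants, not the paper's device physics):

```python
# Toy optoelectronic-synapse response: each light pulse adds a response
# that decays exponentially with time constant tau (ms). Residual
# current from an earlier pulse boosts the response to a later one
# (paired-pulse facilitation), giving the time sensitivity that static
# processing lacks. gain and tau are illustrative assumptions.
import math

def synaptic_current(pulse_times, t, gain=1.0, tau=50.0):
    """Sum of exponentially decaying responses to past light pulses."""
    return sum(gain * math.exp(-(t - tp) / tau)
               for tp in pulse_times if tp <= t)
```

With two pulses 10 ms apart, the current just after the second pulse exceeds the response to a single pulse, which is the kind of history dependence a trajectory-recognition system can exploit.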
Affiliation(s)
- Xianghong Zhang
  - Institute of Optoelectronic Display, National & Local United Engineering Lab of Flat Panel Display Technology, Fuzhou University, Fuzhou 350002, China
  - Shanghai Frontiers Science Research Base of Intelligent Optoelectronics and Perception, Institute of Optoelectronics, Department of Materials Science, Fudan University, Shanghai 200433, China
- Congyong Wang
  - Joint School of National University of Singapore and Tianjin University, International Campus of Tianjin University, Binhai New City, Fuzhou 350207, China
  - Department of Chemistry, National University of Singapore, 3 Science Drive, Singapore 117543
- Qisheng Sun
  - China Electronics Technology Group Corp 46th Research Institute, 26 Dongting Road, Tianjin 300220, P. R. China
- Jianxin Wu
  - Institute of Optoelectronic Display, National & Local United Engineering Lab of Flat Panel Display Technology, Fuzhou University, Fuzhou 350002, China
  - Fujian Science & Technology Innovation Laboratory for Optoelectronic Information of China, Fuzhou 350100, China
- Yan Dai
  - Institute of Optoelectronic Display, National & Local United Engineering Lab of Flat Panel Display Technology, Fuzhou University, Fuzhou 350002, China
  - Fujian Science & Technology Innovation Laboratory for Optoelectronic Information of China, Fuzhou 350100, China
- Enlong Li
  - Shanghai Frontiers Science Research Base of Intelligent Optoelectronics and Perception, Institute of Optoelectronics, Department of Materials Science, Fudan University, Shanghai 200433, China
- Jishan Wu
  - Department of Chemistry, National University of Singapore, 3 Science Drive, Singapore 117543
- Huipeng Chen
  - Institute of Optoelectronic Display, National & Local United Engineering Lab of Flat Panel Display Technology, Fuzhou University, Fuzhou 350002, China
  - Fujian Science & Technology Innovation Laboratory for Optoelectronic Information of China, Fuzhou 350100, China
- Shuming Duan
  - Joint School of National University of Singapore and Tianjin University, International Campus of Tianjin University, Binhai New City, Fuzhou 350207, China
- Wenping Hu
  - Joint School of National University of Singapore and Tianjin University, International Campus of Tianjin University, Binhai New City, Fuzhou 350207, China
  - Key Laboratory of Organic Integrated Circuits, Ministry of Education & Tianjin Key Laboratory of Molecular Optoelectronic Sciences, Department of Chemistry, School of Science, Tianjin University, Tianjin 300072, China
  - Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin 300072, China
157
Gu Y, Pope A, Smith C, Carmona C, Johnstone A, Shi L, Chen X, Santos S, Bacon-Brenes CC, Shoff T, Kleczko KM, Frydman J, Thompson LM, Mobley WC, Wu C. BDNF and TRiC-inspired reagent rescue cortical synaptic deficits in a mouse model of Huntington's disease. Neurobiol Dis 2024; 195:106502. [PMID: 38608784 DOI: 10.1016/j.nbd.2024.106502]
Abstract
Synaptic changes are early manifestations of neuronal dysfunction in Huntington's disease (HD). However, the mechanisms by which mutant HTT protein impacts synaptogenesis and function are not well understood. Herein we explored HD pathogenesis in the BACHD mouse model by examining synaptogenesis and function in long-term primary cortical cultures. At DIV14 (days in vitro), BACHD cortical neurons showed no difference from WT neurons in synaptogenesis, as revealed by colocalization of a pre-synaptic (Synapsin I) and a post-synaptic (PSD95) marker. From DIV21 to DIV35, BACHD neurons showed progressively reduced colocalization of Synapsin I and PSD95 relative to WT neurons. The deficits were effectively rescued by treatment of BACHD neurons with BDNF. The recombinant apical domain of CCT1 (ApiCCT1) yielded a partial rescuing effect. BACHD neurons also showed significant culture-age-related functional deficits, as revealed by multielectrode arrays (MEAs). These deficits were prevented by BDNF, whereas ApiCCT1 showed a less potent effect. These findings are evidence that BACHD deficits in synapse formation and function can be replicated in vitro and that BDNF or a TRiC-inspired reagent can potentially protect against these changes in BACHD neurons. Our findings support the use of cellular models to further explicate HD pathogenesis and potential treatments.
Affiliation(s)
- Yingli Gu
  - Department of Neurology, The Fourth Hospital of Harbin Medical University, 150001, China
  - Department of Neurosciences, University of California San Diego, La Jolla, CA 92093, United States of America
- Alexander Pope
  - Department of Neurosciences, University of California San Diego, La Jolla, CA 92093, United States of America
- Charlene Smith
  - Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92697, United States of America
- Christopher Carmona
  - Department of Neurosciences, University of California San Diego, La Jolla, CA 92093, United States of America
  - Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92697, United States of America
  - Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, United States of America
  - Beckman Laser Institute & Medical Clinic, University of California, Irvine, Irvine, CA, United States
  - Department of Biomedical Engineering, University of California, Irvine, Irvine, CA, United States
- Aaron Johnstone
  - Department of Neurosciences, University of California San Diego, La Jolla, CA 92093, United States of America
- Linda Shi
  - Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92697, United States of America
  - Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, United States of America
  - Beckman Laser Institute & Medical Clinic, University of California, Irvine, Irvine, CA, United States
  - Department of Biomedical Engineering, University of California, Irvine, Irvine, CA, United States
- Xuqiao Chen
  - Department of Neurosciences, University of California San Diego, La Jolla, CA 92093, United States of America
- Sarai Santos
  - Department of Neurosciences, University of California San Diego, La Jolla, CA 92093, United States of America
- Thomas Shoff
  - Department of Neurosciences, University of California San Diego, La Jolla, CA 92093, United States of America
- Korbin M Kleczko
  - Department of Biology and Genetics, Stanford University, Stanford, CA 94305-5430, United States of America
- Judith Frydman
  - Department of Biology and Genetics, Stanford University, Stanford, CA 94305-5430, United States of America
- Leslie M Thompson
  - Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92697, United States of America
  - Institute of Memory Impairments and Neurological Disorders, University of California, Irvine, CA 92697, United States of America
  - Department of Neurobiology and Behavior, University of California, Irvine, CA 92697, United States of America
  - Sue and Bill Gross Stem Cell Center, University of California, Irvine, CA 92697, United States of America
- William C Mobley
  - Department of Neurosciences, University of California San Diego, La Jolla, CA 92093, United States of America
- Chengbiao Wu
  - Department of Neurosciences, University of California San Diego, La Jolla, CA 92093, United States of America
158
Kim D. Numerical subgrid Bi-cubic methods of partial differential equations in image segmentation. Sci Rep 2024; 14:8387. [PMID: 38600152 PMCID: PMC11006860 DOI: 10.1038/s41598-024-54855-7]
Abstract
Image segmentation is a core research area in image processing and computer vision. In this paper, we suggest a bi-cubic spline phase transition potential and elaborate its development. For image segmentation, we develop a new approach that applies novel computational fluid dynamics on the boundary with a subgrid. The numerical subgrid bi-cubic method with a bi-cubic spline for minimizing the piecewise constant energy functional is efficient, robust, and fast for image segmentation with multispecies multiphase segmentation models. The subgrid bi-cubic spline is applied on the boundary with the subgrid, while a regular grid is applied off the boundary. The model generates a multispecies multiphase distribution with the bi-cubic spline, from which we can extract the multispecies multiphase image segments. Finally, we analyze the models and present numerical results on OCR (optical character recognition) and medical images.
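Bi-cubic evaluation of the kind the abstract builds on interpolates a value inside a 4x4 neighborhood of grid nodes by applying a cubic rule along rows and then down the resulting column. A generic sketch using the Catmull-Rom cubic (our own choice of basis for illustration; the paper's subgrid scheme and spline coefficients are not reproduced here):

```python
# Separable bicubic interpolation on a 4x4 patch via Catmull-Rom
# cubics. Catmull-Rom passes through the inner nodes and reproduces
# linear functions exactly, which makes it easy to sanity-check.

def catmull_rom(p0, p1, p2, p3, t):
    """Cubic interpolation between p1 and p2 for t in [0, 1]."""
    return 0.5 * (2 * p1
                  + (-p0 + p2) * t
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                  + (-p0 + 3 * p1 - 3 * p2 + p3) * t ** 3)

def bicubic(patch, x, y):
    """Evaluate a 4x4 node patch at (x, y), with x, y in [1, 2]."""
    tx, ty = x - 1.0, y - 1.0
    rows = [catmull_rom(*patch[j], tx) for j in range(4)]  # along x
    return catmull_rom(*rows, ty)                          # along y
```

On a patch sampled from the linear field f(x, y) = 2x + 3y, the interpolant recovers f exactly at any interior point, e.g. 7.5 at (1.5, 1.5).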
Affiliation(s)
- Dongyung Kim
  - Department of Mathematics, Phoenix College, Phoenix, AZ, 85013, USA
  - Department of Mathematical Sciences, Seoul National University, Seoul, Republic of Korea
159
Wang Z, Tu H, Qian Y, Zhao Y. DON6D: a decoupled one-stage network for 6D pose estimation. Sci Rep 2024; 14:8410. [PMID: 38600244 DOI: 10.1038/s41598-024-59152-x]
Abstract
Six-dimensional (6D) object pose estimation is a key task in robotic manipulation and grasping scenes. Many existing two-stage solutions have slow inference speeds and require extra refinement to handle the challenges of variations in lighting, sensor noise, object occlusion, and truncation. To address these challenges, this work proposes a decoupled one-stage network (DON6D) model for 6D pose estimation that improves inference speed while maintaining accuracy. In particular, since the RGB images are aligned with the RGB-D images, the proposed DON6D first uses a two-dimensional detection network to locate the objects of interest in the RGB-D images. Then, a feature extraction and fusion module is used to fully extract color and geometric features. Further, dual data augmentation is performed to enhance the generalization ability of the proposed model. Finally, the features are fused, and an attention residual encoder-decoder is introduced that improves pose estimation performance to obtain an accurate 6D pose. The proposed DON6D model is evaluated on the LINEMOD and YCB-Video datasets. The results demonstrate that the proposed DON6D is superior to several state-of-the-art methods regarding the ADD(-S) and ADD(-S) AUC metrics.
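The ADD metric cited at the end of the abstract is the standard average-distance measure for 6D pose: transform the object's model points by the estimated pose and by the ground-truth pose, then average the point-to-point distances. A minimal sketch of that definition (generic metric code, not taken from DON6D):

```python
# ADD metric: mean 3D distance between model points under the
# estimated rigid transform (R_est, t_est) and the ground-truth
# transform (R_gt, t_gt). Rotations are 3x3 nested lists.
import math

def transform(R, t, p):
    """Apply rigid transform (R, t) to a 3D point p."""
    return [sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3)]

def add_metric(points, R_est, t_est, R_gt, t_gt):
    """Average distance between correspondingly transformed points."""
    total = 0.0
    for p in points:
        total += math.dist(transform(R_est, t_est, p),
                           transform(R_gt, t_gt, p))
    return total / len(points)
```

A pose is typically counted correct when the ADD value falls below a threshold tied to the object's diameter (often 10%); ADD-S uses the closest-point distance instead for symmetric objects.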
Affiliation(s)
- Zheng Wang
  - School of Computer and Computational Sciences, Hangzhou City University, Hangzhou, 310015, China
- Hangyao Tu
  - School of Computer and Computational Sciences, Hangzhou City University, Hangzhou, 310015, China
- Yutong Qian
  - School of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China
- Yanwei Zhao
  - School of Engineering, Hangzhou City University, Hangzhou, 310015, China
160
Bülow RD, Lan YC, Amann K, Boor P. [Artificial intelligence in kidney transplant pathology]. Pathologie (Heidelb) 2024. [PMID: 38598097 DOI: 10.1007/s00292-024-01324-7]
Abstract
BACKGROUND Artificial intelligence (AI) systems have shown promising results in digital pathology, including digital nephropathology and, specifically, kidney transplant pathology. AIM To summarize the current state of research and limitations in the field of AI in kidney transplant pathology diagnostics and provide a future outlook. MATERIALS AND METHODS Literature search in PubMed and Web of Science using the search terms "deep learning", "transplant", and "kidney". Based on these results and studies cited in the identified literature, a selection was made of studies that have a histopathological focus and use AI to improve kidney transplant diagnostics. RESULTS AND CONCLUSION Many studies have already made important contributions, particularly to the automation of the quantification of some histopathological lesions in nephropathology. This can likely be extended to automatically quantify all lesions relevant to a kidney transplant, such as Banff lesions. Important limitations and challenges exist in the collection of representative data sets and in updates of the Banff classification, making large-scale studies challenging. The already positive study results make future AI support in kidney transplant pathology appear likely.
Affiliation(s)
- Roman David Bülow
  - Institut für Pathologie, Sektion Nephropathologie, Universitätsklinikum RWTH Aachen, Pauwelsstraße 30, 52074, Aachen, Germany
- Yu-Chia Lan
  - Institut für Pathologie, Sektion Nephropathologie, Universitätsklinikum RWTH Aachen, Pauwelsstraße 30, 52074, Aachen, Germany
- Kerstin Amann
  - Abteilung Nephropathologie, Institut für Pathologie, Universitätsklinikum Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, Erlangen, Germany
- Peter Boor
  - Institut für Pathologie, Sektion Nephropathologie, Universitätsklinikum RWTH Aachen, Pauwelsstraße 30, 52074, Aachen, Germany
  - Medizinische Klinik II, Universitätsklinikum RWTH Aachen, Aachen, Germany
161
Khor W, Chen YK, Roberts M, Ciampa F. Automated detection and classification of concealed objects using infrared thermography and convolutional neural networks. Sci Rep 2024; 14:8353. [PMID: 38594274 PMCID: PMC11004154 DOI: 10.1038/s41598-024-56636-8]
Abstract
This paper presents a study on the effectiveness of a convolutional neural network (CNN) in classifying infrared images for security scanning. Infrared thermography was explored as a non-invasive security scanner for stand-off and walk-through concealed object detection. Heat generated by human subjects radiates off the clothing surface, allowing detection by an infrared camera. However, infrared lacks penetration capability compared with longer electromagnetic waves, leading to less obvious visuals on the clothing surface. ResNet-50 was used as the CNN model to automate the classification of thermal images. The ImageNet database was used to pre-train the model, which was further fine-tuned using infrared images obtained from experiments. Four image pre-processing approaches were explored: the raw infrared image, a subject-cropped region-of-interest (ROI) image, and K-means and Fuzzy-c clustered images. All these approaches were evaluated using the receiver operating characteristic curve on an internal holdout set, with areas under the curve of 0.8923, 0.9256, 0.9485, and 0.9669 for the raw image, ROI-cropped, K-means, and Fuzzy-c models, respectively. The CNN models trained using the various image pre-processing approaches suggest that prediction performance can be improved by removing non-decision-relevant information and visually highlighting features.
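The K-means pre-processing named above clusters pixel intensities so that warm (potentially concealing) regions are visually separated from the background before the CNN sees the image. A toy sketch of the clustering step on a 1-D list of intensities (plain Lloyd's algorithm with assumed initial centers, not the authors' code):

```python
# Toy 1-D K-means (Lloyd's algorithm) on pixel intensities: assign
# each value to the nearest center, then move each center to the mean
# of its group. Real thermal images would cluster 2-D pixel arrays,
# but the update rule is identical.

def kmeans_1d(values, centers, iters=20):
    """Cluster scalar intensities around the given initial centers."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return centers
```

Replacing each pixel by its cluster center produces the posterized image fed to the CNN; Fuzzy-c differs by giving each pixel soft memberships in all clusters.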
Affiliation(s)
- WeeLiam Khor
  - Department of Mechanical Engineering Sciences, University of Surrey, Guildford, GU2 7XH, UK
  - Department of Technology, Design and Environment, Oxford Brookes University, Wheatley, OX33 1HX, UK
- Yichen Kelly Chen
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Wilberforce Road, Cambridge, CB3 0WA, UK
- Michael Roberts
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Wilberforce Road, Cambridge, CB3 0WA, UK
  - Department of Medicine, University of Cambridge, Hills Road, Cambridge, CB2 2QQ, UK
- Francesco Ciampa
  - Department of Mechanical Engineering Sciences, University of Surrey, Guildford, GU2 7XH, UK
162
Eun NL, Lee E, Park AY, Son EJ, Kim JA, Youk JH. Artificial intelligence for ultrasound microflow imaging in breast cancer diagnosis. Ultraschall Med 2024. [PMID: 38593859 DOI: 10.1055/a-2230-2455]
Abstract
PURPOSE To develop and evaluate artificial intelligence (AI) algorithms for ultrasound (US) microflow imaging (MFI) in breast cancer diagnosis. MATERIALS AND METHODS We retrospectively collected a dataset consisting of 516 breast lesions (364 benign and 152 malignant) in 471 women who underwent B-mode US and MFI. The internal dataset was split into training (n = 410) and test datasets (n = 106) for developing AI algorithms from deep convolutional neural networks from MFI. AI algorithms were trained to provide malignancy risk (0-100%). The developed AI algorithms were further validated with an independent external dataset of 264 lesions (229 benign and 35 malignant). The diagnostic performance of B-mode US, AI algorithms, or their combinations was evaluated by calculating the area under the receiver operating characteristic curve (AUROC). RESULTS The AUROC of the developed three AI algorithms (0.955-0.966) was higher than that of B-mode US (0.842, P < 0.0001). The AUROC of the AI algorithms on the external validation dataset (0.892-0.920) was similar to that of the test dataset. Among the AI algorithms, no significant difference was found in all performance metrics combined with or without B-mode US. Combined B-mode US and AI algorithms had a higher AUROC (0.963-0.972) than that of B-mode US (P < 0.0001). Combining B-mode US and AI algorithms significantly decreased the false-positive rate of BI-RADS category 4A lesions from 87% to 13% (P < 0.0001). CONCLUSION AI-based MFI diagnosed breast cancers with better performance than B-mode US, eliminating 74% of false-positive diagnoses in BI-RADS category 4A lesions.
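The AUROC values reported throughout this abstract have a simple probabilistic reading: the chance that a randomly chosen malignant lesion receives a higher model score than a randomly chosen benign one. A minimal sketch of that pairwise (Mann-Whitney) computation, as generic illustration rather than the study's evaluation code:

```python
# AUROC via the Mann-Whitney formulation: fraction of
# (positive, negative) score pairs ranked correctly, ties counting
# one half. Labels are 0 (benign) / 1 (malignant).

def auroc(scores, labels):
    """Pairwise AUROC for binary labels and continuous scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUROC of 0.96 thus means the model ranks a malignant case above a benign one in about 96% of such pairs, independent of any single decision threshold.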
Affiliation(s)
- Na Lae Eun: Radiology, Gangnam Severance Hospital, Seoul, Republic of Korea
- Eunjung Lee: Computational Science and Engineering, Yonsei University, Seoul, Republic of Korea
- Ah Young Park: Radiology, Bundang CHA Medical Center, Seongnam, Republic of Korea
- Eun Ju Son: Radiology, Gangnam Severance Hospital, Seoul, Republic of Korea
- Jeong-Ah Kim: Radiology, Gangnam Severance Hospital, Seoul, Republic of Korea
- Ji Hyun Youk: Department of Radiology, Yonsei University College of Medicine, Seoul, Republic of Korea; Radiology, Gangnam Severance Hospital, Seoul, Republic of Korea
163
Fraiponts M, Maes W, Champagne B. Earth Mover's Charge Transfer Distance: A General and Robust Approach for Describing Excited State Locality. J Chem Theory Comput 2024; 20:2751-2760. [PMID: 38407044] [DOI: 10.1021/acs.jctc.3c01148]
Abstract
A novel approach for assessing the extent of electron displacement in optical transitions is proposed by implementing the Earth Mover's Distance (EMD) method, which quantifies the spatial dissimilarity between ground and excited state electron density distributions. In contrast to previous descriptors, this index provides a representative and intuitively understandable distance under a robust and computationally efficient scheme for all possible forms of locality, even in the most difficult to dissect topological cases. The theoretical differences among the existing indices and our method are first illustrated with the help of a simplified model system, followed by a benchmarking of several partial atomic charge models using experimentally relevant push-pull compounds with diverse symmetries. These same molecules are finally employed to further demonstrate the principal advantages of the EMD index and its capabilities in rationalizing charge transfer phenomena.
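The paper's descriptor applies the Earth Mover's Distance to three-dimensional ground- and excited-state electron densities; as a much-reduced illustration of the underlying idea, the 1D EMD between two normalized distributions on a common unit-spaced grid reduces to the L1 distance between their cumulative sums (a hedged sketch with invented names, not the authors' implementation):

```python
def emd_1d(p, q):
    """Earth Mover's Distance between two normalized 1D distributions on the
    same grid with unit bin spacing: sum of |CDF_p - CDF_q| over the bins."""
    assert len(p) == len(q)
    cp, cq, dist = 0.0, 0.0, 0.0
    for pi, qi in zip(p, q):
        cp += pi          # running cumulative mass of p
        cq += qi          # running cumulative mass of q
        dist += abs(cp - cq)
    return dist

# Moving all the mass one bin to the right costs exactly 1 unit of "work"
print(emd_1d([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 1.0
```

Intuitively, the metric charges for how far mass must travel, which is what makes it robust for the topologically awkward charge-transfer cases the abstract mentions.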
Affiliation(s)
- Mathias Fraiponts: Laboratory of Theoretical Chemistry (LCT), Theoretical and Structural Physical Chemistry Unit, Namur Institute of Structured Matter, University of Namur, Rue de Bruxelles 61, 5000 Namur, Belgium; Design & Synthesis of Organic Semiconductors (DSOS), Hasselt University, Agoralaan 1, 3590 Diepenbeek, Belgium; IMEC, Institute for Materials Research (IMO-IMOMEC), Wetenschapspark 1, 3590 Diepenbeek, Belgium
- Wouter Maes: Design & Synthesis of Organic Semiconductors (DSOS), Hasselt University, Agoralaan 1, 3590 Diepenbeek, Belgium; IMEC, Institute for Materials Research (IMO-IMOMEC), Wetenschapspark 1, 3590 Diepenbeek, Belgium
- Benoît Champagne: Laboratory of Theoretical Chemistry (LCT), Theoretical and Structural Physical Chemistry Unit, Namur Institute of Structured Matter, University of Namur, Rue de Bruxelles 61, 5000 Namur, Belgium
164
Condrea F, Rapaka S, Itu L, Sharma P, Sperl J, Ali AM, Leordeanu M. Anatomically aware dual-hop learning for pulmonary embolism detection in CT pulmonary angiograms. Comput Biol Med 2024; 174:108464. [PMID: 38613894] [DOI: 10.1016/j.compbiomed.2024.108464]
Abstract
Pulmonary Embolisms (PE) represent a leading cause of cardiovascular death. While medical imaging, through computed tomographic pulmonary angiography (CTPA), represents the gold standard for PE diagnosis, it is still susceptible to misdiagnosis or significant diagnostic delays, which may be fatal for critical cases. Despite the recently demonstrated power of deep learning to bring a significant boost in performance across a wide range of medical imaging tasks, there is still very little published research on automatic pulmonary embolism detection. Herein we introduce a deep-learning-based approach that efficiently combines computer vision and deep neural networks for pulmonary embolism detection in CTPA. Our method brings novel contributions along three orthogonal axes: (1) automatic detection of anatomical structures; (2) anatomy-aware pretraining; and (3) a dual-hop deep neural network for PE detection. We obtain state-of-the-art results on the publicly available multicenter large-scale RSNA dataset.
Affiliation(s)
- Florin Condrea: Institute of Mathematics of the Romanian Academy "Simion Stoilow", Bucharest, Romania; Advanta, Siemens, 15 Noiembrie Bvd, Brasov, 500097, Romania
- Lucian Itu: Advanta, Siemens, 15 Noiembrie Bvd, Brasov, 500097, Romania
- A Mohamed Ali: Siemens Healthcare Private Limited, Mumbai, 400079, India
- Marius Leordeanu: Institute of Mathematics of the Romanian Academy "Simion Stoilow", Bucharest, Romania; Advanta, Siemens, 15 Noiembrie Bvd, Brasov, 500097, Romania; Polytechnic University of Bucharest, Bucharest, Romania
165
Zhang Y, Feng X, Dong Y, Chen Y, Zhao Z, Yang B, Chang Y, Bai Y. SM-GRSNet: sparse mapping-based graph representation segmentation network for honeycomb lung lesion. Phys Med Biol 2024; 69:085020. [PMID: 38417177] [DOI: 10.1088/1361-6560/ad2e6b]
Abstract
Objective. Honeycomb lung is a rare but severe disease characterized by honeycomb-like imaging features and distinct radiological characteristics. This study aims to develop a deep-learning model capable of segmenting honeycomb lung lesions from computed tomography (CT) scans, addressing the efficacy issue of honeycomb lung segmentation. Methods. This study proposes a sparse mapping-based graph representation segmentation network (SM-GRSNet). SM-GRSNet integrates an attention affinity mechanism to effectively filter redundant features at a coarse-grained region level; the attention encoder generated by this mechanism focuses specifically on the lesion area. Additionally, we introduce a graph representation module based on sparse links, on which graph representation operations are performed to yield detailed lesion segmentation results. Finally, we construct a pyramid-structured cascaded decoder that combines features from the sparse-link graph representation modules and the attention encoders to generate the final segmentation mask. Results. Experimental results demonstrate that the proposed SM-GRSNet achieves state-of-the-art performance on a dataset comprising 7170 honeycomb lung CT images, attaining the highest IoU (87.62%) and Dice (93.41%), as well as the lowest HD95 (6.95) and ASD (2.47). Significance. SM-GRSNet can be used for automatic segmentation of honeycomb lung CT images and improves segmentation of honeycomb lung lesions on small-sample datasets, helping doctors with early screening, accurate diagnosis, and customized treatment. The method maintains high correlation and consistency between the automatic segmentation results and expert manual segmentation; accurate automatic segmentation of the lesion area is clinically important.
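IoU and Dice, the headline metrics in this entry, can be computed from binary segmentation masks in a few lines of NumPy (an illustrative reader's sketch; the function name and toy masks are invented):

```python
import numpy as np

def dice_iou(pred, gt):
    """Dice and IoU for binary segmentation masks (any array of 0/1)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum())  # 2|A∩B| / (|A|+|B|)
    iou = inter / union                           # |A∩B| / |A∪B|
    return dice, iou

pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
print(dice_iou(pred, gt))  # inter=2, union=4 -> dice≈0.667, iou=0.5
```

Dice is always at least as large as IoU for the same masks (Dice = 2·IoU/(1+IoU)), which is why the paper's Dice (93.41%) exceeds its IoU (87.62%).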
Affiliation(s)
- Yuanrong Zhang: School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
- Xiufang Feng: School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
- Yunyun Dong: School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
- Ying Chen: School of International Education, Beijing University of Chemical Technology, Beijing 100029, People's Republic of China
- Zian Zhao: School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
- Bingqian Yang: School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
- Yunqing Chang: School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
- Yujie Bai: School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
166
Wei M, Chen L, Ji W, Yue X, Zimmermann R. In Defense of Clip-Based Video Relation Detection. IEEE Trans Image Process 2024; 33:2759-2769. [PMID: 38530734] [DOI: 10.1109/tip.2024.3379935]
Abstract
Video Visual Relation Detection (VidVRD) aims to detect visual relationship triplets in videos using spatial bounding boxes and temporal boundaries. Existing VidVRD methods can be broadly categorized into bottom-up and top-down paradigms, depending on their approach to classifying relations. Bottom-up methods follow a clip-based approach where they classify relations of short clip tubelet pairs and then merge them into long video relations. On the other hand, top-down methods directly classify long video tubelet pairs. While recent video-based methods utilizing video tubelets have shown promising results, we argue that the effective modeling of spatial and temporal context plays a more significant role than the choice between clip tubelets and video tubelets. This motivates us to revisit the clip-based paradigm and explore the key success factors in VidVRD. In this paper, we propose a Hierarchical Context Model (HCM) that enriches the object-based spatial context and relation-based temporal context based on clips. We demonstrate that using clip tubelets can achieve superior performance compared to most video-based methods. Additionally, using clip tubelets offers more flexibility in model designs and helps alleviate the limitations associated with video tubelets, such as the challenging long-term object tracking problem and the loss of temporal information in long-term tubelet feature compression. Extensive experiments conducted on two challenging VidVRD benchmarks validate that our HCM achieves a new state-of-the-art performance, highlighting the effectiveness of incorporating advanced spatial and temporal context modeling within the clip-based paradigm.
167
Liu C, Li X, Xiao W, Xie S. CCDet: Confidence-Consistent Learning for Dense Object Detection. IEEE Trans Image Process 2024; 33:2746-2758. [PMID: 38517714] [DOI: 10.1109/tip.2024.3378457]
Abstract
Modern detectors commonly employ classification scores to reflect the localization quality of detection results. However, there exists an inconsistency between the two, misguiding the selection of high-quality predictions and providing unreliable results for downstream applications. In this paper, we find that the root of this confidence inconsistency lies in inaccurate IoU estimation and in the spatial misalignment of the learned features between the classification and localization tasks. Therefore, a Confidence-Consistent Detector (CCDet) is proposed, comprising Distribution-based IoU Prediction (DIP) and Consistency-aware Label Assignment (CLA). DIP provides more stable and accurate IoU estimation by learning a probability distribution over the IoU range and taking its expectation as the predicted IoU. CLA adopts both the prediction performance and the consistency degree of samples as assignment metrics to select positives, guiding the classification and localization tasks toward similar feature distributions. Comprehensive experiments demonstrate that CCDet effectively mitigates the confidence inconsistency between classification and localization and achieves stable improvements across different baselines. On the test-dev set of MS COCO, CCDet acquires a single-model single-scale AP of 50.1%, surpassing most existing object detectors.
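The DIP idea, predicting a distribution over discretized IoU values and reporting its expectation, can be sketched in a few lines (a toy NumPy illustration under an assumed bin discretization; names are invented, this is not the paper's implementation):

```python
import numpy as np

def expected_iou(logits, bins):
    """Distribution-based IoU estimate: softmax over discretized IoU bins,
    then the expectation of the bin centres."""
    z = np.exp(logits - logits.max())   # numerically stable softmax
    probs = z / z.sum()
    return float(np.dot(probs, bins))   # E[IoU] under the predicted distribution

bins = np.linspace(0.0, 1.0, 5)             # assumed bin centres 0.0, 0.25, ..., 1.0
logits = np.zeros(5)                        # uniform distribution over bins
print(expected_iou(logits, bins))           # 0.5
```

Regressing a distribution rather than a point estimate lets the head express uncertainty; the expectation then acts as a smoothed, more stable IoU prediction.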
168
Lu S, Zhang W, Zhao H, Liu H, Wang N, Li H. Anomaly Detection for Medical Images Using Heterogeneous Auto-Encoder. IEEE Trans Image Process 2024; 33:2770-2782. [PMID: 38551828] [DOI: 10.1109/tip.2024.3381435]
Abstract
Anomaly detection is an important task for medical image analysis, as it can alleviate the reliance of supervised methods on large labelled datasets. Most existing methods use a pixel-wise self-reconstruction framework for anomaly detection. However, these studies face two challenges: 1) they tend to overfit to learning an identity mapping between the input and output, which leads to failure in detecting abnormal samples; and 2) the reconstruction considers pixel-wise differences, which may lead to undesirable results. To mitigate these problems, we propose a novel heterogeneous Auto-Encoder (Hetero-AE) for medical anomaly detection. Our model utilizes a convolutional neural network (CNN) as the encoder and a hybrid CNN-Transformer network as the decoder. The heterogeneous structure enables the model to learn the intrinsic information of normal data and enlarges the reconstruction difference for abnormal samples. To fully exploit the effectiveness of the Transformer in the hybrid network, a multi-scale sparse Transformer block is proposed to trade off long-range feature dependency modelling against high computational cost. Moreover, multi-stage feature comparison is introduced to reduce the noise of pixel-wise comparison. Extensive experiments on four public datasets (retinal OCT, chest X-ray, brain MRI, and COVID-19) verify the effectiveness of our method across imaging modalities. Additionally, our method can accurately detect tumors in brain MRI and lesions in retinal OCT, with interpretable heatmaps locating lesion areas, assisting clinicians in diagnosing abnormalities efficiently.
169
Walter K, Freeman M, Bex P. Quantifying task-related gaze. Atten Percept Psychophys 2024. [PMID: 38594445] [DOI: 10.3758/s13414-024-02883-w]
Abstract
Competing theories attempt to explain what guides eye movements when exploring natural scenes: bottom-up image salience and top-down semantic salience. In one study, we apply language-based analyses to quantify the well-known observation that task influences gaze in natural scenes. Subjects viewed ten scenes as if they were performing one of two tasks. We found that the semantic similarity between the task and the labels of objects in the scenes captured the task-dependence of gaze (t(39) = 13.083; p < .001). In another study, we examined whether image salience or semantic salience better predicts gaze during a search task, and whether viewing strategies are affected by searching for targets of high or low semantic relevance to the scene. Subjects searched 100 scenes for a high- or low-relevance object. We found that image salience becomes a worse predictor of gaze across successive fixations, while semantic salience remains a consistent predictor (χ²(1, N = 40) = 75.148, p < .001). Furthermore, we found that semantic salience decreased as object relevance decreased (t(39) = 2.304; p = .027). These results suggest that semantic salience is a useful predictor of gaze during task-related scene viewing, and that even in target-absent trials, gaze is modulated by the relevance of a search target to the scene in which it might be located.
Affiliation(s)
- Kerri Walter: Department of Psychology, Northeastern University, Boston, MA, USA
- Michelle Freeman: Department of Psychology, Northeastern University, Boston, MA, USA
- Peter Bex: Department of Psychology, Northeastern University, Boston, MA, USA
170
Fang K, Wang J, Chen Q, Feng X, Qu Y, Shi J, Xu Z. Swin-cryoEM: Multi-class cryo-electron micrographs single particle mixed detection method. PLoS One 2024; 19:e0298287. [PMID: 38593135] [PMCID: PMC11003668] [DOI: 10.1371/journal.pone.0298287]
Abstract
Cryo-electron micrographs vary widely in the size, shape, and distribution density of individual particles, and suffer from severe background noise, high levels of impurities, irregular particle shapes, blurred edges, and particles similar in colour to the background. Picking single particles from multiple types of cryo-electron micrographs with good adaptability therefore remains a challenge in the field. This paper combines the MixUp hybrid augmentation algorithm to enrich image feature information in the pre-processing stage; builds a feature perception network based on a channel self-attention mechanism in the feed-forward network of the Swin Transformer, achieving adaptive adjustment of self-attention across different single particles and increasing the network's tolerance to noise; incorporates the PReLU activation function to enhance information exchange between pixel blocks of different single particles; and combines the cross-entropy loss with the softmax function to construct a Swin Transformer-based classification network suited to single-particle detection in cryo-electron micrographs (Swin-cryoEM), achieving mixed detection of multiple types of single particles. Swin-cryoEM improves the accuracy and generalization ability of single-particle picking across many types of micrographs and provides high-quality data support for single-particle three-dimensional reconstruction. Detailed and comprehensive ablation and comparison experiments were designed to evaluate the algorithm on multiple datasets. Average Precision is a key evaluation index: Swin-cryoEM reached an optimal Average Precision of 95.5% in the training stage, and its single-particle picking performance was also superior in the prediction stage. The model inherits the advantages of the Swin Transformer detection model and outperforms mainstream models such as Faster R-CNN and YOLOv5 in single-particle detection on cryo-electron micrographs.
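MixUp, the augmentation the pipeline starts from, is itself simple: each training sample becomes a convex combination of two inputs and their (one-hot) labels. A hedged NumPy sketch (in practice the mixing weight lam is drawn from a Beta distribution; all names and toy data here are illustrative):

```python
import numpy as np

def mixup(x1, y1, x2, y2, lam):
    """MixUp: convex combination of two samples and their one-hot labels."""
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y

x1, y1 = np.ones((2, 2)), np.array([1.0, 0.0])
x2, y2 = np.zeros((2, 2)), np.array([0.0, 1.0])
x, y = mixup(x1, y1, x2, y2, lam=0.7)  # lam ~ Beta(alpha, alpha) in practice
print(x[0, 0], y)  # 0.7 [0.7 0.3]
```

The soft labels discourage the classifier from memorising individual noisy micrograph patches, which is why the abstract credits MixUp with enriching feature information in pre-processing.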
Affiliation(s)
- Kun Fang: Hunan Meteorological Information Center, Hunan Meteorological Bureau, Changsha, Hunan, China; Hunan Key Laboratory of Meteorological Disaster Prevention and Reduction, Hunan Meteorological Bureau, Changsha, Hunan, China
- JinLing Wang: Xiangtan University & China Unicom (Hunan) Industrial Internet Co., Ltd, China Unicom (Hunan), Changsha, Hunan, China
- QingFeng Chen: Hunan Meteorological Information Center, Hunan Meteorological Bureau, Changsha, Hunan, China; Hunan Key Laboratory of Meteorological Disaster Prevention and Reduction, Hunan Meteorological Bureau, Changsha, Hunan, China
- Xian Feng: Hunan Meteorological Information Center, Hunan Meteorological Bureau, Changsha, Hunan, China; Hunan Key Laboratory of Meteorological Disaster Prevention and Reduction, Hunan Meteorological Bureau, Changsha, Hunan, China
- YouMing Qu: Hunan Meteorological Information Center, Hunan Meteorological Bureau, Changsha, Hunan, China; Hunan Key Laboratory of Meteorological Disaster Prevention and Reduction, Hunan Meteorological Bureau, Changsha, Hunan, China
- Jiachi Shi: Hunan Meteorological Information Center, Hunan Meteorological Bureau, Changsha, Hunan, China; Hunan Key Laboratory of Meteorological Disaster Prevention and Reduction, Hunan Meteorological Bureau, Changsha, Hunan, China
- Zhuomin Xu: School of Geography and Information Engineering, China University of Geosciences (Wuhan), Wuhan, Hubei, China
171
Sheng S, Jing J, Wang Z, Zhang H. Cosine similarity knowledge distillation for surface anomaly detection. Sci Rep 2024; 14:8150. [PMID: 38589492] [PMCID: PMC11001943] [DOI: 10.1038/s41598-024-58409-9]
Abstract
Current state-of-the-art anomaly detection methods based on knowledge distillation (KD) typically depend on smaller student networks or reverse distillation to address the vanishing of the representation discrepancy on anomalies. These methods often struggle to achieve precise detection on complex textured backgrounds, where anomalous and non-anomalous regions are similar. We therefore propose a new paradigm, Cosine Similarity Knowledge Distillation (CSKD), for surface anomaly detection and localization, which obtains strong performance from identical, deeper teacher and student encoders through the distillation loss. Specifically, we introduce an Attention One-Class Embedding (AOCE) in the student network to enhance its learning capability and reduce the response similarity of the teacher-student (T-S) model in anomalous regions. Furthermore, instead of selecting optimal models through hard-coded epochs for different classes, we design an adaptive optimal-model selection method. Extensive experiments on the MVTec dataset, with 99.2% image-level AUROC and 98.2%/94.7% pixel-level AUROC/PRO, demonstrate that our method outperforms existing unsupervised anomaly detection algorithms. Additional experiments on the DAGM dataset and one-class anomaly detection benchmarks further show the superiority of the proposed method.
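The core of a cosine-similarity distillation loss is 1 minus the cosine similarity between teacher and student features, which is large exactly where their responses diverge, i.e. on candidate anomalies. A minimal NumPy sketch (a reader's illustration of the general loss form, not the CSKD codebase):

```python
import numpy as np

def cosine_kd_loss(f_t, f_s):
    """Cosine-similarity distillation loss: mean of 1 - cos(teacher, student)
    over feature vectors. Near 0 on normal regions the student has learned to
    mimic; large where teacher and student responses diverge (anomalies)."""
    cos = np.sum(f_t * f_s, axis=-1) / (
        np.linalg.norm(f_t, axis=-1) * np.linalg.norm(f_s, axis=-1))
    return float(np.mean(1.0 - cos))

f_t = np.array([[1.0, 0.0], [0.0, 1.0]])
print(cosine_kd_loss(f_t, f_t))   # identical features -> 0.0
print(cosine_kd_loss(f_t, -f_t))  # opposite features  -> 2.0
```

At test time the same per-location quantity can double as the anomaly map: thresholding it localizes regions where the student failed to reproduce the teacher.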
Affiliation(s)
- Siyu Sheng: College of Electrical and Information, Xi'an Polytechnic University, Xi'an, 710048, China
- Junfeng Jing: College of Electrical and Information, Xi'an Polytechnic University, Xi'an, 710048, China; Xi'an Polytechnic University Branch of Shaanxi Artificial Intelligence Joint Laboratory, Xi'an, 710048, China
- Zhen Wang: Defense Innovation Institute, Beijing, 100071, China
- Huanhuan Zhang: Xi'an Polytechnic University Branch of Shaanxi Artificial Intelligence Joint Laboratory, Xi'an, 710048, China
172
Nikalayevich E, Letort G, de Labbey G, Todisco E, Shihabi A, Turlier H, Voituriez R, Yahiatene M, Pollet-Villard X, Innocenti M, Schuh M, Terret ME, Verlhac MH. Aberrant cortex contractions impact mammalian oocyte quality. Dev Cell 2024; 59:841-852.e7. [PMID: 38387459] [DOI: 10.1016/j.devcel.2024.01.027]
Abstract
The cortex controls cell shape. In mouse oocytes, the cortex thickens in an Arp2/3-complex-dependent manner, ensuring chromosome positioning and segregation. Surprisingly, we identify that mouse oocytes lacking the Arp2/3 complex undergo cortical actin remodeling upon division, followed by cortical contractions that are unprecedented in mammalian oocytes. Using genetics, imaging, and machine learning, we show that these contractions stir the cytoplasm, resulting in impaired organelle organization and activity. Oocyte capacity to avoid polyspermy is impacted, leading to reduced female fertility. We could diminish the contractions and rescue the cytoplasmic anomalies. Similar contractions were observed in human oocytes collected as byproducts during in vitro fertilization (IVF) procedures. These contractions correlate with increased cytoplasmic motion, but not with defects in spindle assembly or aneuploidy in mice or humans. Our study highlights a multiscale effect connecting cortical F-actin, contractions, and cytoplasmic organization, affecting oocyte quality, with implications for female fertility.
Affiliation(s)
- Elvira Nikalayevich: Center for Interdisciplinary Research in Biology (CIRB), Collège de France, Université PSL, CNRS, INSERM, 75005 Paris, France
- Gaëlle Letort: Department of Developmental and Stem Cell Biology, Institut Pasteur, CNRS UMR 3738, Université Paris Cité, 25 rue du Dr. Roux, 75015 Paris, France
- Ghislain de Labbey: Center for Interdisciplinary Research in Biology (CIRB), Collège de France, Université PSL, CNRS, INSERM, 75005 Paris, France
- Elena Todisco: Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Anastasia Shihabi: Center for Interdisciplinary Research in Biology (CIRB), Collège de France, Université PSL, CNRS, INSERM, 75005 Paris, France
- Hervé Turlier: Center for Interdisciplinary Research in Biology (CIRB), Collège de France, Université PSL, CNRS, INSERM, 75005 Paris, France
- Raphaël Voituriez: Laboratoire de Physique Théorique de la Matière Condensée (LPTMC), Laboratoire Jean Perrin, CNRS, Sorbonne Université, Paris, France
- Mohamed Yahiatene: Centre Assistance Médicale à la Procréation Nataliance, Groupe Mlab, Pôle Santé Oréliance, Saran, France
- Xavier Pollet-Villard: Centre Assistance Médicale à la Procréation Nataliance, Groupe Mlab, Pôle Santé Oréliance, Saran, France
- Metello Innocenti: Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milan, Italy
- Melina Schuh: Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Marie-Emilie Terret: Center for Interdisciplinary Research in Biology (CIRB), Collège de France, Université PSL, CNRS, INSERM, 75005 Paris, France
- Marie-Hélène Verlhac: Center for Interdisciplinary Research in Biology (CIRB), Collège de France, Université PSL, CNRS, INSERM, 75005 Paris, France
173
Sung C, Oh JS, Park BS, Kim SS, Song SY, Lee JJ. Diagnostic performance of a deep-learning model using 18F-FDG PET/CT for evaluating recurrence after radiation therapy in patients with lung cancer. Ann Nucl Med 2024. [PMID: 38589677] [DOI: 10.1007/s12149-024-01925-5]
Abstract
OBJECTIVE We developed a deep-learning model for distinguishing radiation therapy (RT)-related changes from tumour recurrence in patients with lung cancer who underwent RT, and evaluated its performance. METHODS We retrospectively recruited 308 patients with lung cancer with RT-related changes observed on 18F-fluorodeoxyglucose positron emission tomography-computed tomography (18F-FDG PET/CT) performed after RT. Patients were labelled as positive or negative for tumour recurrence through histologic diagnosis or clinical follow-up after 18F-FDG PET/CT. A two-dimensional (2D) slice-based convolutional neural network (CNN) model was created with a total of 3329 slices as input, and performance was evaluated on five independent test sets. RESULTS For the five independent test sets, the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity were in the ranges of 0.98-0.99, 95-98%, and 87-95%, respectively. The region determined by the model was confirmed as the actual recurrent tumour through explainable artificial intelligence (AI) using gradient-weighted class activation mapping (Grad-CAM). CONCLUSION The 2D slice-based CNN model using 18F-FDG PET imaging was able to distinguish well between RT-related changes and tumour recurrence in patients with lung cancer.
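Grad-CAM, used here to verify that the model attends to the recurrent tumour, weights each convolutional feature map by its spatially averaged gradient and keeps the positive part of the weighted sum. A toy NumPy sketch of that weighting (shapes and names are assumptions for illustration, not the study's code):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM: weight each feature map by its spatially averaged gradient,
    sum over channels, then ReLU. Both inputs have shape (C, H, W)."""
    weights = gradients.mean(axis=(1, 2))             # one weight per channel
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum -> (H, W)
    return np.maximum(cam, 0.0)                       # keep positive evidence only

acts = np.ones((2, 3, 3))
grads = np.stack([np.ones((3, 3)), -np.ones((3, 3))])  # channel 0 helps, channel 1 hurts
print(grad_cam(acts, grads))  # equal and opposite weights cancel -> all zeros
```

In practice the gradients are taken from the target class score with respect to the last convolutional layer, and the resulting map is upsampled onto the input slice as a heatmap.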
Affiliation(s)
- Changhwan Sung: Department of Nuclear Medicine, Asan Medical Center, University of Ulsan College of Medicine, 88 Olympic-Ro 43-Gil, Songpa-Gu, Seoul, 05505, Korea
- Jungsu S Oh: Department of Nuclear Medicine, Asan Medical Center, University of Ulsan College of Medicine, 88 Olympic-Ro 43-Gil, Songpa-Gu, Seoul, 05505, Korea
- Byung Soo Park: Department of Nuclear Medicine, Asan Medical Center, University of Ulsan College of Medicine, 88 Olympic-Ro 43-Gil, Songpa-Gu, Seoul, 05505, Korea
- Su Ssan Kim: Department of Radiation Oncology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Si Yeol Song: Department of Radiation Oncology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Jong Jin Lee: Department of Nuclear Medicine, Asan Medical Center, University of Ulsan College of Medicine, 88 Olympic-Ro 43-Gil, Songpa-Gu, Seoul, 05505, Korea
174
Xu X, Chen Y, Yin H, Wang X, Zhang X. Nondestructive detection of SSC in multiple pear (Pyrus pyrifolia Nakai) cultivars using Vis-NIR spectroscopy coupled with the Grad-CAM method. Food Chem 2024; 450:139283. [PMID: 38615528] [DOI: 10.1016/j.foodchem.2024.139283]
Abstract
Vis-NIR spectroscopy coupled with chemometric models is frequently used for pear soluble solid content (SSC) prediction. However, model robustness is challenged by variation across pear cultivars. This study explored the feasibility of developing universal models for predicting the SSC of multiple pear varieties to improve generalizability. The mature fruits of 6 pear cultivars with green skin (Pyrus pyrifolia Nakai cv. 'Cuiyu', 'Sucui No.1' and 'Cuiguan') and brown skin (Pyrus pyrifolia Nakai cv. 'Hosui', 'Syusui' and 'Wakahikari') were used to establish single-cultivar models and multi-cultivar universal models using convolutional neural network (CNN), partial least square (PLS), and support vector regression (SVR) approaches. Multi-cultivar universal models were built using full spectra and important variables extracted by gradient-weighted class activation mapping (Grad-CAM), respectively. The universal models based on important variables obtained satisfactory performance, with RMSEPs of 0.76, 0.59, 0.80, 1.64, 0.98, and 1.03 °Brix on the 6 cultivars, respectively.
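RMSEP, the figure of merit quoted for the universal models, is simply the root-mean-square prediction error on the external prediction set, in °Brix here. A trivial pure-Python sketch with invented toy values:

```python
def rmsep(y_true, y_pred):
    """Root-mean-square error of prediction: sqrt(mean((true - pred)^2)).
    For SSC models the units are degrees Brix."""
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n) ** 0.5

print(rmsep([12.0, 13.0, 14.0], [12.5, 12.5, 14.5]))  # sqrt(0.25) = 0.5
```

Because RMSEP is in the same units as the measurement, the reported 0.59-1.64 °Brix range can be read directly against the natural SSC spread of the fruit.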
Affiliation(s)
- Xin Xu
- College of Engineering, Nanjing Agricultural University, Nanjing 210031, China
- Yanyu Chen
- College of Engineering, Nanjing Agricultural University, Nanjing 210031, China
- Hao Yin
- College of Horticulture, Nanjing Agricultural University, Nanjing 210031, China
- Xiaochan Wang
- College of Engineering, Nanjing Agricultural University, Nanjing 210031, China
- Xiaolei Zhang
- College of Engineering, Nanjing Agricultural University, Nanjing 210031, China.
175
Tao W, Wang X, Yan T, Liu Z, Wan S. ESF-YOLO: an accurate and universal object detector based on neural networks. Front Neurosci 2024; 18:1371418. [PMID: 38650621 PMCID: PMC11033406 DOI: 10.3389/fnins.2024.1371418]
Abstract
As an excellent single-stage object detector based on neural networks, YOLOv5 has found extensive applications in the industrial domain; however, it still exhibits certain design limitations. To address these issues, this paper proposes Efficient Scale Fusion YOLO (ESF-YOLO). First, the Multi-Sampling Conv Module (MSCM) is designed, enhancing the backbone network's ability to learn low-level features through multi-scale receptive fields and cross-scale feature fusion. Second, to tackle occlusion, a new Block-wise Channel Attention Module (BCAM) is designed that assigns greater weights to channels carrying critical information. Next, a lightweight Decoupled Head (LD-Head) is devised. Additionally, the loss function is redesigned to address the asynchrony between labels and confidences, alleviating the imbalance between positive and negative samples during training. Finally, an adaptive scale factor for Intersection over Union (IoU) calculation is proposed, adjusting bounding-box sizes adaptively to accommodate targets of different sizes in the dataset. Experimental results on the SODA10M and CBIA8K datasets demonstrate that ESF-YOLO increases Average Precision at 0.50 IoU (AP50) by 3.93% and 2.24%, Average Precision at 0.75 IoU (AP75) by 4.77% and 4.85%, and mean Average Precision (mAP) by 4% and 5.39%, respectively, validating the model's broad applicability.
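The adaptive scale factor is specific to this paper, but the standard IoU computation it modifies can be shown directly. A minimal sketch, assuming boxes in corner format (x1, y1, x2, y2):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)  # overlap area, 0 if disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7 = 0.14285714285714285
```

A scale-adaptive variant would rescale `inter`/`union` terms by a function of the target size before thresholding, which is where the paper's contribution sits.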
Affiliation(s)
- Wenguang Tao
- Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China
- Xiaotian Wang
- Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China
- Tian Yan
- Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China
- Zhengzhuo Liu
- Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China
- Shizheng Wan
- Shanghai Electro-Mechanical Engineering Institute, Shanghai, China
176
Zhang X, Gao H, Wang H, Chen Z, Zhang Z, Chen X, Li Y, Qi Y, Wang R. PLANET: A Multi-objective Graph Neural Network Model for Protein-Ligand Binding Affinity Prediction. J Chem Inf Model 2024; 64:2205-2220. [PMID: 37319418 DOI: 10.1021/acs.jcim.3c00253]
Abstract
Predicting protein-ligand binding affinity is a central issue in drug design. Various deep learning models have been published in recent years; many rely on 3D protein-ligand complex structures as input and focus on the single task of reproducing binding affinity. In this study, we developed a graph neural network model called PLANET (Protein-Ligand Affinity prediction NETwork). The model takes as input the graph-represented 3D structure of the binding pocket on the target protein and the 2D chemical structure of the ligand molecule. It was trained through a multi-objective process with three related tasks: deriving the protein-ligand binding affinity, the protein-ligand contact map, and the ligand distance matrix. Besides protein-ligand complexes with known binding affinity data retrieved from the PDBbind database, a large number of non-binder decoys were added to the training data to derive the final model. When tested on the CASF-2016 benchmark, PLANET exhibited scoring power comparable to the best results of other deep learning models, as well as reasonable ranking power and docking power. In virtual screening trials on the DUD-E benchmark, PLANET performed notably better than several deep learning and machine learning models. On the LIT-PCBA benchmark, PLANET achieved accuracy comparable to the conventional docking program Glide while spending less than 1% of Glide's computation time on the same job, because PLANET does not require exhaustive conformational sampling. Given its decent accuracy and efficiency in binding affinity prediction, PLANET may become a useful tool for large-scale virtual screening.
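The multi-objective training signal described here can be sketched with toy arrays standing in for the three task outputs. The shapes, loss forms, and equal weighting below are our assumptions for illustration, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy predictions/targets for three PLANET-style objectives
# (batch of 8; tensor sizes are illustrative only).
aff_pred, aff_true = rng.normal(size=8), rng.normal(size=8)          # binding affinity
cmap_pred = rng.uniform(0.01, 0.99, size=(8, 10, 12))                # contact-map probabilities
cmap_true = rng.integers(0, 2, size=(8, 10, 12)).astype(float)       # binary contacts
dist_pred = rng.normal(size=(8, 12, 12))                             # ligand distance matrix
dist_true = rng.normal(size=(8, 12, 12))

# One loss per task: MSE for the two regression targets,
# binary cross-entropy for the contact map.
mse_aff = np.mean((aff_pred - aff_true) ** 2)
bce_cmap = -np.mean(cmap_true * np.log(cmap_pred)
                    + (1 - cmap_true) * np.log(1 - cmap_pred))
mse_dist = np.mean((dist_pred - dist_true) ** 2)

# Equal weighting here; real multi-task weights are a training choice.
total_loss = mse_aff + bce_cmap + mse_dist
print(total_loss > 0)  # True
```

The auxiliary contact-map and distance tasks act as structural regularizers: the shared encoder cannot minimize the combined loss by fitting affinity labels alone.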
Affiliation(s)
- Xiangying Zhang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
- Haotian Gao
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
- Haojie Wang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
- Zhihang Chen
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
- Zhe Zhang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
- Xinchong Chen
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
- Yan Li
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
- Yifei Qi
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
- Renxiao Wang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
177
Tian T, Li S, Fang M, Zhao D, Zeng J. MolSHAP: Interpreting Quantitative Structure-Activity Relationships Using Shapley Values of R-Groups. J Chem Inf Model 2024; 64:2236-2249. [PMID: 37584270 DOI: 10.1021/acs.jcim.3c00465]
Abstract
Optimizing the activities and properties of lead compounds is an essential step in the drug discovery process. Despite recent advances in machine learning-aided drug discovery, most of the existing methods focus on making predictions for the desired objectives directly while ignoring the explanations for predictions. Although several techniques can provide interpretations for machine learning-based methods such as feature attribution, there are still gaps between these interpretations and the principles commonly adopted by medicinal chemists when designing and optimizing molecules. Here, we propose an interpretation framework, named MolSHAP, for quantitative structure-activity relationship analysis by estimating the contributions of R-groups. Instead of attributing the activities to individual input features, MolSHAP regards the R-group fragments as the basic units of interpretation, which is in accordance with the fragment-based modifications in molecule optimization. MolSHAP is a model-agnostic method that can interpret activity regression models with arbitrary input formats and model architectures. Based on the evaluations of numerous representative activity regression models on a specially designed R-group ranking task, MolSHAP achieved significantly better interpretation power compared with other methods. In addition, we developed a compound optimization algorithm based on MolSHAP and illustrated the reliability of the optimized compounds using an independent case study. These results demonstrated that MolSHAP can provide a useful tool for accurately interpreting the quantitative structure-activity relationships and rationally optimizing the compound activities in drug discovery.
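The central idea, attributing activity to R-groups via Shapley values, can be illustrated with an exact Shapley computation over a toy activity function. The R-group names and contribution values below are invented for illustration; MolSHAP itself estimates these contributions from a trained regression model, not a closed-form function:

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values; value() maps a frozenset of players to a score."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # weight = |S|! (n - |S| - 1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi

# Toy additive "activity" contributions of three hypothetical R-groups;
# R1 and R2 also interact synergistically.
base = {"R1": 1.0, "R2": 0.5, "R3": -0.3}
def activity(groups):
    v = sum(base[g] for g in groups)
    if {"R1", "R2"} <= groups:
        v += 0.4  # interaction term, split evenly by Shapley
    return v

phi = shapley_values(["R1", "R2", "R3"], activity)
print({g: round(v, 2) for g, v in phi.items()})  # {'R1': 1.2, 'R2': 0.7, 'R3': -0.3}
```

Note the efficiency property: the attributions sum to the full-molecule activity (1.6), and the 0.4 interaction is split evenly between R1 and R2, which is exactly the behavior that makes Shapley values attractive for fragment-level interpretation.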
Affiliation(s)
- Tingzhong Tian
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Shuya Li
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Meng Fang
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Dan Zhao
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Jianyang Zeng
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
178
Cui S, Hui B. Dual-Dependency Attention Transformer for Fine-Grained Visual Classification. Sensors (Basel) 2024; 24:2337. [PMID: 38610547 PMCID: PMC11014298 DOI: 10.3390/s24072337]
Abstract
Vision transformers (ViTs) are widely used in visual tasks such as fine-grained visual classification (FGVC). However, the self-attention mechanism at the core of these models incurs quadratic computational and memory complexity. The sparse-attention and local-attention approaches currently used by most researchers are unsuitable for FGVC, which requires dense feature extraction and global dependency modeling. To address this challenge, we propose a dual-dependency attention transformer that decouples global token interactions into two paths: a position-dependency attention pathway based on the intersection of two types of grouped attention, and a semantic-dependency attention pathway based on dynamic central aggregation. This approach enhances high-quality semantic modeling of discriminative cues while reducing the computational cost to linear complexity. In addition, we develop discriminative enhancement strategies that increase the sensitivity of high-confidence discriminative cue tracking through a knowledge-based representation approach. Experiments on three datasets (NABirds, CUB, and DOGS) show that the method is well suited to fine-grained image classification, striking a balance between computational cost and performance.
Affiliation(s)
- Shiyan Cui
- Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang 110016, China
- Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
- Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Bin Hui
- Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang 110016, China
- Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
- Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China
179
Freitas M, Pinho F, Pinho L, Silva S, Figueira V, Vilas-Boas JP, Silva A. Biomechanical Assessment Methods Used in Chronic Stroke: A Scoping Review of Non-Linear Approaches. Sensors (Basel) 2024; 24:2338. [PMID: 38610549 PMCID: PMC11014015 DOI: 10.3390/s24072338]
Abstract
Non-linear and dynamic systems analysis of human movement has recently become increasingly widespread with the intention of better reflecting how complexity affects the adaptability of motor systems, especially after a stroke. The main objective of this scoping review was to summarize the non-linear measures used in the analysis of kinetic, kinematic, and EMG data of human movement after stroke. PRISMA-ScR guidelines were followed, establishing the eligibility criteria, the population, the concept, and the contextual framework. The examined studies were published between 1 January 2013 and 12 April 2023, in English or Portuguese, and were indexed in the databases selected for this research: PubMed®, Web of Science®, Institute of Electrical and Electronics Engineers®, Science Direct® and Google Scholar®. In total, 14 of the 763 articles met the inclusion criteria. The non-linear measures identified included entropy (n = 11), fractal analysis (n = 1), the short-term local divergence exponent (n = 1), the maximum Floquet multiplier (n = 1), and the Lyapunov exponent (n = 1). These studies focused on different motor tasks: reaching to grasp (n = 2), reaching to point (n = 1), arm tracking (n = 2), elbow flexion (n = 5), elbow extension (n = 1), wrist and finger extension upward (lifting) (n = 1), knee extension (n = 1), and walking (n = 4). When studying the complexity of human movement in chronic post-stroke adults, entropy measures, particularly sample entropy, were preferred. Kinematic assessment was mainly performed using motion capture systems, with a focus on joint angles of the upper limbs.
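Sample entropy, the measure most often preferred in the reviewed studies, can be sketched in a few lines. This is a minimal pure-Python implementation with an absolute tolerance r; in practice r is usually normalized by the series' standard deviation (commonly 0.2 SD), which this sketch omits:

```python
import math

def sample_entropy(series, m=2, r=0.2):
    """Sample entropy SampEn(m, r) of a 1-D series (r as absolute tolerance)."""
    n = len(series)

    def count_matches(length):
        # Use n - m templates for both lengths, per the usual SampEn convention.
        templates = [series[i:i + length] for i in range(n - m)]
        hits = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between templates
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    hits += 1
        return hits

    b, a = count_matches(m), count_matches(m + 1)   # B: length-m, A: length-(m+1)
    if a == 0 or b == 0:
        return float("inf")     # undefined for too-short or too-irregular series
    return math.log(b / a)      # SampEn = -ln(A / B)

# A perfectly regular signal has zero sample entropy:
# every length-m match extends to a length-(m+1) match.
regular = [0.0, 1.0] * 30
print(sample_entropy(regular, m=2, r=0.2))  # 0.0
```

Lower values indicate more regular (self-similar) movement; post-stroke studies typically interpret changes in this regularity as changes in motor-control complexity.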
Affiliation(s)
- Marta Freitas
- Escola Superior de Saúde do Vale do Ave, Cooperativa de Ensino Superior Politécnico e Universitário, Rua José António Vidal, 81, 4760-409 Vila Nova de Famalicão, Portugal
- HM—Health and Human Movement Unit, Polytechnic University of Health, Cooperativa de Ensino Superior Politécnico e Universitário, CRL, 4760-409 Vila Nova de Famalicão, Portugal
- Center for Rehabilitation Research (CIR), R. Dr. António Bernardino de Almeida 400, 4200-072 Porto, Portugal
- Porto Biomechanics Laboratory (LABIOMEP), 4200-450 Porto, Portugal
- Francisco Pinho
- Escola Superior de Saúde do Vale do Ave, Cooperativa de Ensino Superior Politécnico e Universitário, Rua José António Vidal, 81, 4760-409 Vila Nova de Famalicão, Portugal
- HM—Health and Human Movement Unit, Polytechnic University of Health, Cooperativa de Ensino Superior Politécnico e Universitário, CRL, 4760-409 Vila Nova de Famalicão, Portugal
- Liliana Pinho
- Escola Superior de Saúde do Vale do Ave, Cooperativa de Ensino Superior Politécnico e Universitário, Rua José António Vidal, 81, 4760-409 Vila Nova de Famalicão, Portugal
- HM—Health and Human Movement Unit, Polytechnic University of Health, Cooperativa de Ensino Superior Politécnico e Universitário, CRL, 4760-409 Vila Nova de Famalicão, Portugal
- Center for Rehabilitation Research (CIR), R. Dr. António Bernardino de Almeida 400, 4200-072 Porto, Portugal
- Porto Biomechanics Laboratory (LABIOMEP), 4200-450 Porto, Portugal
- Sandra Silva
- Escola Superior de Saúde do Vale do Ave, Cooperativa de Ensino Superior Politécnico e Universitário, Rua José António Vidal, 81, 4760-409 Vila Nova de Famalicão, Portugal
- HM—Health and Human Movement Unit, Polytechnic University of Health, Cooperativa de Ensino Superior Politécnico e Universitário, CRL, 4760-409 Vila Nova de Famalicão, Portugal
- Department of Medical Sciences, University of Aveiro, 3810-193 Aveiro, Portugal
- School of Health Sciences, University of Aveiro, 3810-193 Aveiro, Portugal
- Vânia Figueira
- Escola Superior de Saúde do Vale do Ave, Cooperativa de Ensino Superior Politécnico e Universitário, Rua José António Vidal, 81, 4760-409 Vila Nova de Famalicão, Portugal
- HM—Health and Human Movement Unit, Polytechnic University of Health, Cooperativa de Ensino Superior Politécnico e Universitário, CRL, 4760-409 Vila Nova de Famalicão, Portugal
- Porto Biomechanics Laboratory (LABIOMEP), 4200-450 Porto, Portugal
- João Paulo Vilas-Boas
- School of Health Sciences, University of Aveiro, 3810-193 Aveiro, Portugal
- Centre for Research, Training, Innovation and Intervention in Sport (CIFI2D), Faculty of Sport, University of Porto, 4200-450 Porto, Portugal
- Augusta Silva
- Center for Rehabilitation Research (CIR), R. Dr. António Bernardino de Almeida 400, 4200-072 Porto, Portugal
- Department of Physiotherapy, School of Health, Polytechnic of Porto, 4200-072 Porto, Portugal
180
Ibragimov E, Kim Y, Lee JH, Cho J, Lee JJ. Automated Pavement Condition Index Assessment with Deep Learning and Image Analysis: An End-to-End Approach. Sensors (Basel) 2024; 24:2333. [PMID: 38610545 PMCID: PMC11014408 DOI: 10.3390/s24072333]
Abstract
The degradation of road pavements due to environmental factors is a pressing issue in infrastructure maintenance, necessitating precise identification of pavement distresses. The pavement condition index (PCI) is a critical metric for evaluating pavement conditions, essential for effective budget allocation and performance tracking. Traditional manual PCI assessment is limited by labor intensity, subjectivity, and susceptibility to human error. Addressing these challenges, this paper presents a novel end-to-end automated method for PCI calculation, integrating deep learning and image processing. The first stage employs a deep learning algorithm to detect pavement cracks accurately; a segmentation-based skeleton algorithm is then applied to estimate crack width precisely. This integrated approach provides a more comprehensive evaluation of pavement integrity. Validation results demonstrate 95% accuracy in crack detection and 90% accuracy in crack width estimation. Building on these results, automated PCI rating aligned with established standards is achieved, significantly improving the efficiency and reliability of PCI evaluations. The method advances pavement maintenance strategies and has potential applications in broader road infrastructure management.
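The final PCI arithmetic can be illustrated in simplified form. The real ASTM D6433 procedure looks deducts up from distress-specific curves and applies a corrected-deduct-value step, which this toy sketch omits; the `rating` thresholds are hypothetical:

```python
def simple_pci(deduct_values):
    """Toy PCI: 100 minus summed distress deducts, clamped to [0, 100].

    Real PCI computation derives each deduct from distress type, severity,
    and extent (e.g. detected crack width) via standard curves.
    """
    return max(0.0, min(100.0, 100.0 - sum(deduct_values)))

def rating(pci):
    """Map a PCI score to a hypothetical condition label."""
    if pci >= 85:
        return "good"
    if pci >= 55:
        return "fair"
    return "poor"

# Deducts standing in for detected cracking distresses.
deducts = [12.0, 8.5, 4.0]
pci = simple_pci(deducts)
print(pci, rating(pci))  # 75.5 fair
```

In the paper's pipeline, these deduct inputs would come from the detected crack type and the skeleton-based width estimate rather than being supplied by hand.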
Affiliation(s)
- Eldor Ibragimov
- SISTech Co., Ltd., Seoul 05006, Republic of Korea
- Yongsoo Kim
- SISTech Co., Ltd., Seoul 05006, Republic of Korea
- Jung Hee Lee
- Department of Artificial Intelligence, Ajou University, Suwon-si 16499, Republic of Korea
- Junsang Cho
- Korea Expressway Corporation Research Institute, Hwaseong-si 13550, Republic of Korea
- Jong-Jae Lee
- Department of Civil & Environmental Engineering, Sejong University, Seoul 05006, Republic of Korea
181
Reigle J, Lopez-Nunez O, Drysdale E, Abuquteish D, Liu X, Putra J, Erdman L, Griffiths AM, Prasath S, Siddiqui I, Dhaliwal J. Using Deep Learning to Automate Eosinophil Counting in Pediatric Ulcerative Colitis Histopathological Images. medRxiv 2024:2024.04.03.24305251. [PMID: 38633803 PMCID: PMC11023647 DOI: 10.1101/2024.04.03.24305251]
Abstract
Background Accurate identification of inflammatory cells in mucosal histopathology images is important for diagnosing ulcerative colitis, and the presence of eosinophils in the colonic mucosa has been associated with disease course. Manual cell counting is not only time-consuming but also subject to human bias. In this study we developed an automatic eosinophil counting tool for mucosal histopathology images using deep learning. Method Four pediatric IBD pathologists from two North American pediatric hospitals annotated 530 crops from 143 standard-of-care hematoxylin and eosin (H&E) rectal mucosal biopsies. A 305/75 training/validation split was used to develop and optimize a U-Net-based deep learning model, and 150 crops were used as a test set. The U-Net model was then compared to SAU-Net, a state-of-the-art U-Net variant. We tuned three post-processing parameters: (1) the pixel-level probability threshold, (2) the minimum number of clustered pixels required to designate a cell, and (3) the pixel connectivity. Experiments were run to optimize model parameters using AUROC and cross-entropy loss as performance metrics. Results The F1-score for identifying eosinophils was 0.86 (95% CI: 0.79-0.91) (precision: 0.77 (95% CI: 0.70-0.83), recall: 0.96 (95% CI: 0.93-0.99)), compared with an F1-score of 0.2 (95% CI: 0.13-0.26) for SAU-Net (precision: 0.38 (95% CI: 0.31-0.46), recall: 0.13 (95% CI: 0.08-0.19)). The inter-rater reliability was 0.96 (95% CI: 0.93-0.97). The correlations between the algorithm and two pathologists were 0.89 (95% CI: 0.82-0.94) and 0.88 (95% CI: 0.80-0.94), respectively. Conclusion Deep learning-based automated eosinophil counting can achieve a robust level of accuracy with a high degree of concordance with manual expert annotations.
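The three post-processing knobs described in the abstract (probability threshold, minimum cluster size, pixel connectivity) amount to a standard connected-component counting step on the model's probability map. A minimal sketch on a toy map, with the function name and example values invented:

```python
from collections import deque

def count_cells(prob_map, threshold=0.5, min_pixels=3, connectivity=4):
    """Count cells in a pixel-probability map via threshold, minimum
    cluster size, and pixel connectivity (4 or 8)."""
    h, w = len(prob_map), len(prob_map[0])
    mask = [[p >= threshold for p in row] for row in prob_map]
    if connectivity == 4:
        nbrs = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        nbrs = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0)]
    seen = [[False] * w for _ in range(h)]
    cells = 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # BFS flood fill of one candidate cluster
            size, queue = 0, deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                size += 1
                for dy, dx in nbrs:
                    ny, nx = cy + dy, cx + dx
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size >= min_pixels:
                cells += 1   # discard clusters smaller than min_pixels
    return cells

probs = [
    [0.9, 0.8, 0.1, 0.0, 0.0],
    [0.7, 0.9, 0.0, 0.6, 0.0],
    [0.0, 0.0, 0.0, 0.7, 0.9],
    [0.2, 0.0, 0.0, 0.0, 0.0],
]
print(count_cells(probs, threshold=0.5, min_pixels=3))  # 2
```

Raising `min_pixels` or `threshold` trades recall for precision, which is exactly the parameter search the study describes.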
182
Wang X, Mi Y, Zhang X. 3D human pose data augmentation using Generative Adversarial Networks for robotic-assisted movement quality assessment. Front Neurorobot 2024; 18:1371385. [PMID: 38644903 PMCID: PMC11032046 DOI: 10.3389/fnbot.2024.1371385]
Abstract
In the realm of human motion recognition systems, the augmentation of 3D human pose data plays a pivotal role in enriching and enhancing the quality of original datasets through the generation of synthetic data. This augmentation is vital for addressing the current research gaps in diversity and complexity, particularly when dealing with rare or complex human movements. Our study introduces a groundbreaking approach employing Generative Adversarial Networks (GANs), coupled with Support Vector Machine (SVM) and DenseNet, further enhanced by robot-assisted technology to improve the precision and efficiency of data collection. The GANs in our model are responsible for generating highly realistic and diverse 3D human motion data, while SVM aids in the effective classification of this data. DenseNet is utilized for the extraction of key features, facilitating a comprehensive and integrated approach that significantly elevates both the data augmentation process and the model's ability to process and analyze complex human movements. The experimental outcomes underscore our model's exceptional performance in motion quality assessment, showcasing a substantial improvement over traditional methods in terms of classification accuracy and data processing efficiency. These results validate the effectiveness of our integrated network model, setting a solid foundation for future advancements in the field. Our research not only introduces innovative methodologies for 3D human pose data enhancement but also provides substantial technical support for practical applications across various domains, including sports science, rehabilitation medicine, and virtual reality. By combining advanced algorithmic strategies with robotic technologies, our work addresses key challenges in data augmentation and motion quality assessment, paving the way for new research and development opportunities in these critical areas.
Affiliation(s)
- Xuefeng Wang
- College of Sports, Woosuk University, Jeonju, Republic of Korea
- Yang Mi
- College of Sports and Health, Linyi University, Linyi, China
- Xiang Zhang
- Department of Information Engineering, Linyi Technician Institute, Linyi, China
183
Arif M, Fang G, Ghulam A, Musleh S, Alam T. DPI_CDF: druggable protein identifier using cascade deep forest. BMC Bioinformatics 2024; 25:145. [PMID: 38580921 DOI: 10.1186/s12859-024-05744-3]
Abstract
BACKGROUND Drug targets in living beings play pivotal roles in the discovery of potential drugs. Conventional wet-lab characterization of drug targets, although accurate, is generally expensive, slow, and resource-intensive. Computational methods are therefore highly desirable as an alternative to expedite the large-scale identification of druggable proteins (DPs); however, the performance of existing in silico predictors is still not satisfactory. METHODS In this study, we developed a novel deep learning-based model, DPI_CDF, for predicting DPs based on protein sequence alone. DPI_CDF utilizes evolutionary (histograms of oriented gradients computed on the position-specific scoring matrix), physiochemical (component protein sequence representation), and compositional (normalized qualitative characteristic) properties of the protein sequence to generate features. A hierarchical deep forest model then fuses these three encoding schemes to build the proposed model. RESULTS Empirical outcomes on 10-fold cross-validation demonstrate that the proposed model achieved 99.13% accuracy and a Matthews correlation coefficient (MCC) of 0.982 on the training dataset. The generalization power of the trained model was further examined on an independent dataset, achieving a maximum accuracy of 95.01% and an MCC of 0.900. Compared to current state-of-the-art methods, DPI_CDF improves accuracy by 4.27% and 4.31% on the training and testing datasets, respectively. We believe DPI_CDF will help the research community identify druggable proteins and accelerate the drug discovery process. AVAILABILITY The benchmark datasets and source code are available on GitHub: http://github.com/Muhammad-Arif-NUST/DPI_CDF .
Affiliation(s)
- Muhammad Arif
- College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Ge Fang
- State Key Laboratory for Organic Electronics and Information Displays, Institute of Advanced Materials (IAM), Nanjing 210023, P. R. China
- Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Ali Ghulam
- Information Technology Centre, Sindh Agriculture University, Sindh, Pakistan
- Saleh Musleh
- College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Tanvir Alam
- College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar.
184
Ebrahim M, Alsmirat M, Al-Ayyoub M. Advanced disk herniation computer aided diagnosis system. Sci Rep 2024; 14:8071. [PMID: 38580700 PMCID: PMC10997754 DOI: 10.1038/s41598-024-58283-5]
Abstract
Over recent years, researchers and practitioners have seen massive, continuous improvements in the computational resources available to them. This has made resource-hungry machine learning (ML) algorithms feasible and practical. Moreover, several advanced techniques are used to boost the performance of such algorithms even further, including transfer learning, data augmentation, and feature concatenation. The usefulness of these techniques depends heavily on the size and nature of the dataset. For fine-grained medical image sets, which contain subcategories within their main categories, the combination of techniques that works best must be identified empirically. In this work, we utilize these advanced techniques to find the best combinations for building a state-of-the-art lumbar disc herniation computer-aided diagnosis system. We evaluated the system extensively, and the results show that it achieves an accuracy of 98% when compared against human diagnosis.
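Of the techniques listed, feature concatenation is the easiest to show concretely: embeddings from two pretrained backbones are joined before a shallow classifier head. A minimal sketch with hypothetical embedding dimensions (the shapes below are invented):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical per-image embeddings from two pretrained backbones
# (transfer learning): e.g. each row is one image's feature vector.
feats_a = rng.normal(size=(16, 512))   # e.g. one CNN's penultimate layer
feats_b = rng.normal(size=(16, 256))   # a second backbone's features

# Feature concatenation: join along the feature axis, so a downstream
# classifier sees both representations at once.
fused = np.concatenate([feats_a, feats_b], axis=1)
print(fused.shape)  # (16, 768)
```

The fused matrix then feeds a small dense classifier; which backbone pairings help is precisely the combination search the paper describes.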
Affiliation(s)
- Maad Ebrahim
- Department of Computer Science and Operations Research (DIRO), University of Montreal, Montreal, QC, H3T1J4, Canada
- Department of Computer Science, Jordan University of Science and Technology, Ar-Ramtha, Jordan
- Mohammad Alsmirat
- Department of Computer Science, University of Sharjah, Sharjah, United Arab Emirates
- Department of Computer Science, Jordan University of Science and Technology, Ar-Ramtha, Jordan
- Mahmoud Al-Ayyoub
- Artificial Intelligence Research Center (AIRC), College of Engineering and Information Technology, Ajman University, Ajman, United Arab Emirates
- Department of Computer Science, Jordan University of Science and Technology, Ar-Ramtha, Jordan
185
Tang X, Zhao S. The application prospects of robot pose estimation technology: exploring new directions based on YOLOv8-ApexNet. Front Neurorobot 2024; 18:1374385. [PMID: 38644904 PMCID: PMC11026676 DOI: 10.3389/fnbot.2024.1374385]
Abstract
Introduction Service robot technology is increasingly gaining prominence in the field of artificial intelligence. However, persistent limitations continue to impede its widespread implementation. In this regard, human motion pose estimation emerges as a crucial challenge necessary for enhancing the perceptual and decision-making capacities of service robots. Method This paper introduces a groundbreaking model, YOLOv8-ApexNet, which integrates advanced technologies, including Bidirectional Routing Attention (BRA) and Generalized Feature Pyramid Network (GFPN). BRA facilitates the capture of inter-keypoint correlations within dynamic environments by introducing a bidirectional information propagation mechanism. Furthermore, GFPN adeptly extracts and integrates feature information across different scales, enabling the model to make more precise predictions for targets of various sizes and shapes. Results Empirical research findings reveal significant performance enhancements of the YOLOv8-ApexNet model across the COCO and MPII datasets. Compared to existing methodologies, the model demonstrates pronounced advantages in keypoint localization accuracy and robustness. Discussion The significance of this research lies in providing an efficient and accurate solution tailored for the realm of service robotics, effectively mitigating the deficiencies inherent in current approaches. By bolstering the accuracy of perception and decision-making, our endeavors unequivocally endorse the widespread integration of service robots within practical applications.
Affiliation(s)
- XianFeng Tang
- Physical Education Department, Zhejiang Wanli University, Ningbo, China
- Shuwei Zhao
- Physical Education Department, Hebei University of Technology, Tianjin, China

186
Zhou L, Wu G, Zuo Y, Chen X, Hu H. A Comprehensive Review of Vision-Based 3D Reconstruction Methods. Sensors (Basel) 2024; 24:2314. [PMID: 38610525 PMCID: PMC11014007 DOI: 10.3390/s24072314] [Received: 03/10/2024] [Revised: 03/28/2024] [Accepted: 04/03/2024] [Indexed: 04/14/2024]
Abstract
With the emergence of algorithms such as NeRF and 3DGS, 3D reconstruction has become a popular research topic in recent years. 3D reconstruction technology provides crucial support for training extensive computer vision models and advancing the development of general artificial intelligence. With the development of deep learning and GPU technology, the demand for high-precision and high-efficiency 3D reconstruction is increasing, especially in the fields of unmanned systems, human-computer interaction, virtual reality, and medicine. This survey categorizes the methods and technologies used in 3D reconstruction, exploring and classifying them from three perspectives: traditional static methods, dynamic methods, and machine learning-based methods, and then compares and discusses them. The survey closes with a detailed analysis of trends and challenges in 3D reconstruction, aiming to give readers who are conducting or planning research on 3D reconstruction a comprehensive understanding of the relevant knowledge.
Affiliation(s)
- Guoxin Wu
- Key Laboratory of Modern Measurement and Control Technology Ministry of Education, Beijing Information Science and Technology University, Beijing 100080, China; (L.Z.); (Y.Z.); (X.C.); (H.H.)

187
Li B, Chen H, Duan H. Artificial intelligence-driven prognostic system for conception prediction and management in intrauterine adhesions following hysteroscopic adhesiolysis: a diagnostic study using hysteroscopic images. Front Bioeng Biotechnol 2024; 12:1327207. [PMID: 38638324 PMCID: PMC11024240 DOI: 10.3389/fbioe.2024.1327207] [Received: 10/24/2023] [Accepted: 03/04/2024] [Indexed: 04/20/2024]
Abstract
Introduction: Intrauterine adhesions (IUAs) caused by endometrial injury, commonly occurring in developing countries, can lead to subfertility. This study aimed to develop and evaluate a DeepSurv architecture-based artificial intelligence (AI) system for predicting fertility outcomes after hysteroscopic adhesiolysis. Methods: This diagnostic study included 555 patients with IUAs treated with hysteroscopic adhesiolysis, with 4,922 second-look hysteroscopic images from a prospective clinical database (IUADB, NCT05381376) and a minimum of 2 years of follow-up. Patients were randomly divided into training, validation, and test groups for model development, tuning, and external validation. Four transfer learning models were built using the DeepSurv architecture, and a code-free AI application for pregnancy prediction was also developed. The primary outcome was the model's ability to predict pregnancy within a year after adhesiolysis. Secondary outcomes were model performance, evaluated using time-dependent areas under the curve (AUCs) and the C-index, and ART benefit, evaluated by hazard ratio (HR) among different risk groups. Results: On external validation, InceptionV3+DeepSurv, InceptionResNetV2+DeepSurv, and ResNet50+DeepSurv achieved AUCs of 0.94, 0.95, and 0.93, respectively, for one-year pregnancy prediction, outperforming the other models and clinical scoring systems. A code-free AI application was developed to identify candidates for ART: patients with lower natural conception probability as indicated by the application had a higher ART benefit, with an HR of 3.13 (95% CI: 1.22-8.02, p = 0.017). Conclusion: InceptionV3+DeepSurv, InceptionResNetV2+DeepSurv, and ResNet50+DeepSurv show potential for predicting the fertility outcomes of IUAs after hysteroscopic adhesiolysis, and the code-free AI application based on the DeepSurv architecture facilitates personalized therapy following hysteroscopic adhesiolysis.
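As an illustration of the C-index used above as a secondary performance metric, here is a minimal pure-Python sketch of Harrell's concordance index on toy, hypothetical survival data; this is the generic pairwise-comparison definition, not code from the study:

```python
import itertools

def concordance_index(times, events, risk_scores):
    """Harrell's C-index: among comparable pairs, the fraction where the
    subject with the higher predicted risk experiences the event earlier.
    times: observed times; events: 1 if the event was observed, 0 if censored;
    risk_scores: higher score means higher predicted risk."""
    concordant, comparable = 0.0, 0
    for i, j in itertools.combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times skipped in this simplified version
        lo, hi = (i, j) if times[i] < times[j] else (j, i)
        if not events[lo]:
            continue  # earlier subject censored: ordering unknowable
        comparable += 1
        if risk_scores[lo] > risk_scores[hi]:
            concordant += 1
        elif risk_scores[lo] == risk_scores[hi]:
            concordant += 0.5  # tied risk counts as half
    return concordant / comparable

# perfectly ranked toy cohort: shorter time-to-event <=> higher risk score
print(concordance_index([2, 4, 6, 8], [1, 1, 1, 0], [0.9, 0.7, 0.5, 0.1]))  # 1.0
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why values like those reported above indicate strong discrimination.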
Affiliation(s)
- Bohan Li
- Department of Minimally Invasive Gynecologic Center, Beijing Obstetrics and Gynecology Hospital, Capital Medical University, Beijing Maternal and Child Healthcare Hospital, Beijing, China
- Hui Chen
- School of Biomedical Engineering, Capital Medical University, Beijing, China
- Beijing Advanced Innovation Center for Big Data-based Precision Medicine, Capital Medical University, Beijing, China
- Hua Duan
- Department of Minimally Invasive Gynecologic Center, Beijing Obstetrics and Gynecology Hospital, Capital Medical University, Beijing Maternal and Child Healthcare Hospital, Beijing, China

188
Schulte L, Faul C, Oswald P, Preißler K, Steinfartz S, Veith M, Caspers BA. Performance of different automatic photographic identification software for larvae and adults of the European fire salamander. PLoS One 2024; 19:e0298285. [PMID: 38573887 PMCID: PMC10994360 DOI: 10.1371/journal.pone.0298285] [Received: 10/01/2023] [Accepted: 01/22/2024] [Indexed: 04/06/2024]
Abstract
For many species, population sizes are unknown despite their importance for conservation. Population sizes are often estimated with capture-mark-recapture (CMR) studies, which require identifying each individual, mostly through individual markings or genetic characters. Invasive marking techniques, however, can negatively affect individual fitness. Low-impact alternatives, such as photographic identification, exist for species with stable, distinctive phenotypic traits. A variety of software packages, with different requirements, is available for identifying individuals from photos. The European fire salamander (Salamandra salamandra) is a species in which individuals, both at the larval stage and as adults, have specific patterns that allow individual identification. In this study, we compared the performance of five software packages for photographic identification of the European fire salamander: Amphibian & Reptile Wildbook (ARW), AmphIdent, I3S pattern+, ManderMatcher and Wild-ID. While adults can be identified with all five packages, European fire salamander larvae can currently only be identified with two of them (ARW and Wild-ID). We tested one dataset of European fire salamander larval pictures taken in the laboratory in these two packages (ARW and Wild-ID), and another dataset of adult pictures taken in the field in all five. We compared the requirements each package places on the pictures used and calculated the False Rejection Rate (FRR) and the Recognition Rate (RR). For the larval dataset (421 pictures), the ARW and Wild-ID performed equally well for individual identification (99.6% and 100% Recognition Rate, respectively). For the adult dataset (377 pictures), ManderMatcher achieved the best False Rejection Rate and the ARW the highest Recognition Rate. Additionally, the ARW is the only program that requires no image pre-processing. In times of amphibian declines, non-invasive photo identification software enabling capture-mark-recapture studies helps to gain knowledge on the population sizes, distribution, movement and demography of a population and can thus support species conservation.
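The two metrics compared above can be sketched as follows; note that the study's exact operational definitions of FRR and RR may differ from this simplified version, and the salamander IDs are hypothetical:

```python
def recognition_rate(results):
    """Share of query photos whose true individual is the top-ranked match.
    results: list of (true_id, ranked_candidate_ids) pairs, one per photo."""
    hits = sum(1 for true_id, ranked in results if ranked and ranked[0] == true_id)
    return hits / len(results)

def false_rejection_rate(results):
    """Share of genuine recaptures for which the software returns no match
    with the true individual at all (a false rejection)."""
    misses = sum(1 for true_id, ranked in results if true_id not in ranked)
    return misses / len(results)

queries = [
    ("sal_01", ["sal_01", "sal_07"]),  # correct top match
    ("sal_02", ["sal_05", "sal_02"]),  # found, but ranked second
    ("sal_03", []),                    # falsely rejected: no candidates returned
]
print(round(recognition_rate(queries), 3))      # 0.333
print(round(false_rejection_rate(queries), 3))  # 0.333
```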
Affiliation(s)
- Laura Schulte
- Department of Behavioural Ecology, Bielefeld University, Konsequenz, Bielefeld, Germany
- Charlotte Faul
- Biogeography, Trier University, Universitätsring, Trier, Germany
- Pia Oswald
- Department of Behavioural Ecology, Bielefeld University, Konsequenz, Bielefeld, Germany
- Kathleen Preißler
- Molecular Evolution and Systematics of Animals, Leipzig University, Talstraße, Leipzig, Germany
- Sebastian Steinfartz
- Molecular Evolution and Systematics of Animals, Leipzig University, Talstraße, Leipzig, Germany
- Michael Veith
- Biogeography, Trier University, Universitätsring, Trier, Germany
- Barbara A. Caspers
- Department of Behavioural Ecology, Bielefeld University, Konsequenz, Bielefeld, Germany
- JICE, Joint Institute for Individualisation in a Changing Environment, University of Münster and Bielefeld University, Bielefeld, Germany

189
Nakatani T, Utsumi Y, Fujimoto K, Iwamura M, Kise K. Image recognition-based petal arrangement estimation. Front Plant Sci 2024; 15:1334362. [PMID: 38638358 PMCID: PMC11024381 DOI: 10.3389/fpls.2024.1334362] [Received: 11/07/2023] [Accepted: 02/21/2024] [Indexed: 04/20/2024]
Abstract
Flowers exhibit morphological diversity in the number and positional arrangement of their floral organs, such as petals. The petal arrangement of a blooming flower is represented by the overlap relations between neighboring petals, an indicator of the floral developmental process; however, only specialists are capable of identifying petal arrangements. We therefore propose a method to support estimation of the arrangement of perianth organs, including petals and tepals, using image recognition techniques. The main obstacle is that large image datasets cannot be prepared, so the latest machine-learning-based image processing methods, which require large numbers of images, are not applicable. Instead, we describe the tepal arrangement as a sequence of interior-exterior patterns of tepal overlap in the image and estimate the arrangement by matching this sequence against known patterns. We implement the method with techniques that require little or no training data: a fine-tuned YOLO v5 model for flower detection, GrabCut for flower segmentation, the Harris corner detector for tepal overlap detection, MAML-based interior-exterior estimation, and circular permutation matching for tepal arrangement estimation. Experimental results showed good accuracy when flower detection, segmentation, overlap location estimation, interior-exterior estimation, and circular-permutation-matching-based tepal arrangement estimation were evaluated independently; however, accuracy decreased when they were integrated. We therefore developed a user interface for manual correction of the overlap position and interior-exterior pattern estimates, which ensures the quality of the final tepal arrangement estimation.
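The final matching step, comparing an observed cyclic interior/exterior sequence against known arrangement patterns up to rotation, can be sketched as follows (the pattern names and label coding below are hypothetical, not taken from the paper):

```python
def matches_cyclically(observed, pattern):
    """True if `observed` equals some rotation of `pattern`.
    Both are sequences of overlap labels (e.g. 'i' interior / 'e' exterior)."""
    if len(observed) != len(pattern):
        return False
    doubled = list(pattern) + list(pattern)  # all rotations are windows of this
    n = len(pattern)
    return any(doubled[k:k + n] == list(observed) for k in range(n))

def classify_arrangement(observed, known_patterns):
    """Return the names of known tepal arrangements consistent with the
    observed interior-exterior sequence, up to cyclic rotation."""
    return [name for name, pat in known_patterns.items()
            if matches_cyclically(observed, pat)]

# hypothetical 5-tepal arrangements and label coding
known = {
    "contort":     ["i", "i", "i", "i", "i"],  # every tepal overlaps the same way
    "quincuncial": ["e", "e", "i", "i", "x"],  # mixed overlaps (toy coding)
}
print(classify_arrangement(["i", "x", "e", "e", "i"], known))  # ['quincuncial']
```

Doubling the pattern makes every rotation a contiguous window, so the cyclic match costs O(n²) comparisons, which is negligible for the handful of tepals per flower.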
Affiliation(s)
- Tomoya Nakatani
- Graduate School of Informatics, Osaka Metropolitan University, Sakai, Japan
- Yuzuko Utsumi
- Graduate School of Informatics, Osaka Metropolitan University, Sakai, Japan
- Koichi Fujimoto
- Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan
- Masakazu Iwamura
- Graduate School of Informatics, Osaka Metropolitan University, Sakai, Japan
- Koichi Kise
- Graduate School of Informatics, Osaka Metropolitan University, Sakai, Japan

190
Ahmadkhani S, Moghaddam ME. A social image recommendation system based on deep reinforcement learning. PLoS One 2024; 19:e0300059. [PMID: 38574062 PMCID: PMC10994284 DOI: 10.1371/journal.pone.0300059] [Received: 08/17/2023] [Accepted: 02/21/2024] [Indexed: 04/06/2024]
Abstract
Today, due to the expansion of the Internet and social networks, people are faced with a vast amount of dynamic information. To mitigate the issue of information overload, recommender systems have become pivotal by analyzing users' activity histories to discern their interests and preferences. However, most available social image recommender systems utilize a static strategy, meaning they do not adapt to changes in user preferences. To overcome this challenge, our paper introduces a dynamic image recommender system that leverages a deep reinforcement learning (DRL) framework, enriched with a novel set of features including emotion, style, and personality. These features, uncommon in existing systems, are instrumental in crafting a user's characteristic vector, offering a personalized recommendation experience. Additionally, we overcome the challenge of state representation definition in reinforcement learning by introducing a new state representation. The experimental results show that our proposed method, compared to some related works, significantly improves Recall@k and Precision@k by approximately 7%-10% (for the top 100 images recommended) for personalized image recommendation.
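The Recall@k and Precision@k metrics reported above are standard ranking metrics; a minimal sketch with hypothetical image IDs (not the paper's data):

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k

def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / len(relevant)

recommended = ["img3", "img1", "img7", "img5", "img2"]  # ranked system output
relevant = {"img1", "img2", "img9"}                     # ground-truth interests
print(precision_at_k(recommended, relevant, 5))  # 0.4   (2 of the top 5 are relevant)
print(round(recall_at_k(recommended, relevant, 5), 3))  # 0.667 (2 of 3 relevant found)
```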
Affiliation(s)
- Somaye Ahmadkhani
- Shahid Beheshti University, Faculty of Computer Science and Engineering, Tehran, Iran

191
Nagle MF, Yuan J, Kaur D, Ma C, Peremyslova E, Jiang Y, Niño de Rivera A, Jawdy S, Chen JG, Feng K, Yates TB, Tuskan GA, Muchero W, Fuxin L, Strauss SH. GWAS supported by computer vision identifies large numbers of candidate regulators of in planta regeneration in Populus trichocarpa. G3 (Bethesda) 2024; 14:jkae026. [PMID: 38325329 PMCID: PMC10989874 DOI: 10.1093/g3journal/jkae026] [Received: 11/14/2023] [Revised: 01/18/2024] [Accepted: 01/20/2024] [Indexed: 02/09/2024]
Abstract
Plant regeneration is an important dimension of plant propagation and a key step in the production of transgenic plants. However, regeneration capacity varies widely among genotypes and species, the molecular basis of which is largely unknown. Association mapping methods such as genome-wide association studies (GWAS) have long demonstrated abilities to help uncover the genetic basis of trait variation in plants; however, the performance of these methods depends on the accuracy and scale of phenotyping. To enable a large-scale GWAS of in planta callus and shoot regeneration in the model tree Populus, we developed a phenomics workflow involving semantic segmentation to quantify regenerating plant tissues over time. We found that the resulting statistics were of highly non-normal distributions, and thus employed transformations or permutations to avoid violating assumptions of linear models used in GWAS. We report over 200 statistically supported quantitative trait loci (QTLs), with genes encompassing or near to top QTLs including regulators of cell adhesion, stress signaling, and hormone signaling pathways, as well as other diverse functions. Our results encourage models of hormonal signaling during plant regeneration to consider keystone roles of stress-related signaling (e.g. involving jasmonates and salicylic acid), in addition to the auxin and cytokinin pathways commonly considered. The putative regulatory genes and biological processes we identified provide new insights into the biological complexity of plant regeneration, and may serve as new reagents for improving regeneration and transformation of recalcitrant genotypes and species.
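One standard transformation for such highly non-normal trait statistics, before fitting the linear models used in GWAS, is the rank-based inverse normal transform; a sketch using a Blom offset (the paper's exact transformations are not specified here, so treat this as illustrative):

```python
import numpy as np
from scipy.stats import rankdata, norm

def rank_inverse_normal(x, c=0.375):
    """Rank-based inverse normal transform with Blom offset c = 0.375:
    maps arbitrarily distributed phenotype values onto an approximately
    standard-normal scale via their ranks."""
    x = np.asarray(x, dtype=float)
    ranks = rankdata(x)                          # ties get average ranks
    quantiles = (ranks - c) / (len(x) - 2 * c + 1)
    return norm.ppf(quantiles)                   # standard-normal quantiles

# heavily right-skewed toy phenotype (e.g. regenerating-tissue area statistics)
pheno = np.array([0.0, 0.1, 0.1, 0.4, 2.5, 9.0])
z = rank_inverse_normal(pheno)
print(np.round(z, 2))  # symmetric, roughly standard-normal scores
```

Because the transform depends only on ranks, it is invariant to any monotone rescaling of the raw statistic, which makes it robust for phenotypes derived from segmentation pixel counts.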
Affiliation(s)
- Michael F Nagle
- Department of Forest Ecosystems and Society, Oregon State University, 321 Richardson Hall, Corvallis, OR 97311, USA
- Jialin Yuan
- Department of Electrical Engineering and Computer Science, Oregon State University, 1148 Kelley Engineering Center, Corvallis, OR 97331, USA
- Damanpreet Kaur
- Department of Electrical Engineering and Computer Science, Oregon State University, 1148 Kelley Engineering Center, Corvallis, OR 97331, USA
- Cathleen Ma
- Department of Forest Ecosystems and Society, Oregon State University, 321 Richardson Hall, Corvallis, OR 97311, USA
- Ekaterina Peremyslova
- Department of Forest Ecosystems and Society, Oregon State University, 321 Richardson Hall, Corvallis, OR 97311, USA
- Yuan Jiang
- Statistics Department, Oregon State University, 239 Weniger Hall, Corvallis, OR 97331, USA
- Alexa Niño de Rivera
- Department of Forest Ecosystems and Society, Oregon State University, 321 Richardson Hall, Corvallis, OR 97311, USA
- Sara Jawdy
- Biosciences Division, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831, USA
- Center for Bioenergy Innovation, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831, USA
- Jin-Gui Chen
- Biosciences Division, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831, USA
- Center for Bioenergy Innovation, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831, USA
- Bredesen Center for Interdisciplinary Research, University of Tennessee-Knoxville, 310 Ferris Hall 1508 Middle Dr, Knoxville, TN 37996, USA
- Kai Feng
- Biosciences Division, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831, USA
- Center for Bioenergy Innovation, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831, USA
- Timothy B Yates
- Biosciences Division, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831, USA
- Center for Bioenergy Innovation, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831, USA
- Bredesen Center for Interdisciplinary Research, University of Tennessee-Knoxville, 310 Ferris Hall 1508 Middle Dr, Knoxville, TN 37996, USA
- Gerald A Tuskan
- Biosciences Division, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831, USA
- Center for Bioenergy Innovation, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831, USA
- Wellington Muchero
- Biosciences Division, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831, USA
- Center for Bioenergy Innovation, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831, USA
- Bredesen Center for Interdisciplinary Research, University of Tennessee-Knoxville, 310 Ferris Hall 1508 Middle Dr, Knoxville, TN 37996, USA
- Li Fuxin
- Department of Electrical Engineering and Computer Science, Oregon State University, 1148 Kelley Engineering Center, Corvallis, OR 97331, USA
- Steven H Strauss
- Department of Forest Ecosystems and Society, Oregon State University, 321 Richardson Hall, Corvallis, OR 97311, USA

192
Xu K, Zhang F, Huang Y, Huang X. 2.5D UNet with context-aware feature sequence fusion for accurate esophageal tumor semantic segmentation. Phys Med Biol 2024; 69:085002. [PMID: 38484399 DOI: 10.1088/1361-6560/ad3419] [Received: 11/17/2023] [Accepted: 03/14/2024] [Indexed: 04/04/2024]
Abstract
Segmenting esophageal tumors from computed tomography (CT) sequence images can assist doctors in diagnosing and treating patients with this malignancy. However, accurately extracting esophageal tumor features from CT images is challenging due to the tumors' small area, variable position and shape, and low contrast with surrounding tissues; as a result, current methods do not achieve the accuracy required for practical applications. To address this problem, we propose a 2.5D context-aware feature sequence fusion UNet (2.5D CFSF-UNet) model for esophageal tumor segmentation in CT sequence images. Specifically, we embed intra-slice multiscale attention feature fusion (Intra-slice MAFF) in each skip connection of UNet to improve feature learning, better expressing the differences between anatomical structures within CT sequence images. Additionally, an inter-slice context fusion block (Inter-slice CFB) in the center bridge of UNet enhances context features between CT slices, preventing the loss of structural information between slices. Experiments on a dataset of 430 esophageal tumor patients yield an 87.13% dice similarity coefficient, a 79.71% intersection over union, and a 2.4758 mm Hausdorff distance, demonstrating that our approach can improve contouring consistency and can be applied in clinical settings.
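The reported dice similarity coefficient and intersection over union are standard mask-overlap metrics; a minimal NumPy sketch on toy binary masks (not the paper's data):

```python
import numpy as np

def dice_coefficient(pred, gt):
    """Dice similarity coefficient between two binary masks:
    2|A ∩ B| / (|A| + |B|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def iou(pred, gt):
    """Intersection over union of two binary masks: |A ∩ B| / |A ∪ B|."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union

pred = np.array([[0, 1, 1], [0, 1, 0]])  # predicted tumor pixels
gt   = np.array([[0, 1, 1], [0, 0, 0]])  # ground-truth annotation
print(dice_coefficient(pred, gt))        # 0.8   (2*2 / (3 + 2))
print(round(iou(pred, gt), 3))           # 0.667 (2 / 3)
```

Dice and IoU measure region overlap only; the Hausdorff distance also reported above complements them by penalizing boundary outliers.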
Affiliation(s)
- Kai Xu
- School of the Internet, Anhui University, Anhui, 230039, People's Republic of China
- Feixiang Zhang
- School of the Internet, Anhui University, Anhui, 230039, People's Republic of China
- Yong Huang
- Department of Medical Oncology, The Second People's Hospital of Hefei, Hefei, 230011, People's Republic of China
- Xiaoyu Huang
- Department of Chinese Integrative Medicine Oncology, The First Affiliated Hospital of Anhui Medical University, Hefei, 230022, People's Republic of China

193
Sittinger M, Uhler J, Pink M, Herz A. Insect detect: An open-source DIY camera trap for automated insect monitoring. PLoS One 2024; 19:e0295474. [PMID: 38568922 PMCID: PMC10990185 DOI: 10.1371/journal.pone.0295474] [Received: 11/21/2023] [Accepted: 02/28/2024] [Indexed: 04/05/2024]
Abstract
Insect monitoring is essential to design effective conservation strategies, which are indispensable to mitigate worldwide declines and biodiversity loss. For this purpose, traditional monitoring methods are widely established and can provide data with a high taxonomic resolution. However, processing of captured insect samples is often time-consuming and expensive, which limits the number of potential replicates. Automated monitoring methods can facilitate data collection at a higher spatiotemporal resolution with a comparatively lower effort and cost. Here, we present the Insect Detect DIY (do-it-yourself) camera trap for non-invasive automated monitoring of flower-visiting insects, which is based on low-cost off-the-shelf hardware components combined with open-source software. Custom trained deep learning models detect and track insects landing on an artificial flower platform in real time on-device and subsequently classify the cropped detections on a local computer. Field deployment of the solar-powered camera trap confirmed its resistance to high temperatures and humidity, which enables autonomous deployment during a whole season. On-device detection and tracking can estimate insect activity/abundance after metadata post-processing. Our insect classification model achieved a high top-1 accuracy on the test dataset and generalized well on a real-world dataset with captured insect images. The camera trap design and open-source software are highly customizable and can be adapted to different use cases. With custom trained detection and classification models, as well as accessible software programming, many possible applications surpassing our proposed deployment method can be realized.
Affiliation(s)
- Maximilian Sittinger
- Julius Kühn Institute (JKI)—Federal Research Centre for Cultivated Plants, Institute for Biological Control, Dossenheim, Germany
- Johannes Uhler
- Julius Kühn Institute (JKI)—Federal Research Centre for Cultivated Plants, Institute for Biological Control, Dossenheim, Germany
- Maximilian Pink
- Julius Kühn Institute (JKI)—Federal Research Centre for Cultivated Plants, Institute for Biological Control, Dossenheim, Germany
- Annette Herz
- Julius Kühn Institute (JKI)—Federal Research Centre for Cultivated Plants, Institute for Biological Control, Dossenheim, Germany

194
Ali IE, Sumita Y, Wakabayashi N. Advancing maxillofacial prosthodontics by using pre-trained convolutional neural networks: Image-based classification of the maxilla. J Prosthodont 2024. [PMID: 38566564 DOI: 10.1111/jopr.13853] [Received: 11/21/2023] [Accepted: 03/15/2024] [Indexed: 04/04/2024]
Abstract
PURPOSE: The study aimed to compare the performance of four pre-trained convolutional neural networks in recognizing seven distinct prosthodontic scenarios involving the maxilla, as a preliminary step in developing an artificial intelligence (AI)-powered prosthesis design system. MATERIALS AND METHODS: Seven classes were considered for recognition: cleft palate, dentulous maxillectomy, edentulous maxillectomy, reconstructed maxillectomy, completely dentulous, partially edentulous, and completely edentulous. Four AI models (VGG16, Inception-ResNet-V2, DenseNet-201, and Xception) were employed using transfer learning and fine-tuned hyperparameters. The dataset, consisting of 3541 preprocessed intraoral occlusal images, was divided into training, validation, and test sets. Performance metrics encompassed accuracy, precision, recall, F1 score, area under the receiver operating characteristic curve (AUC), and the confusion matrix. RESULTS: VGG16, Inception-ResNet-V2, DenseNet-201, and Xception demonstrated comparable performance, with maximum test accuracies of 0.92, 0.90, 0.94, and 0.95, respectively. Xception and DenseNet-201 slightly outperformed the other models, particularly Inception-ResNet-V2. Precision, recall, and F1 scores exceeded 90% for most classes in Xception and DenseNet-201, and the average AUC values for all models ranged between 0.98 and 1.00. CONCLUSIONS: While DenseNet-201 and Xception demonstrated superior performance, all models consistently achieved diagnostic accuracy exceeding 90%, highlighting their potential in dental image analysis. This AI application could support work assignment by difficulty level and enable an automated diagnosis system at patient admission. It also facilitates prosthesis design by integrating the necessary prosthesis morphology, oral function, and treatment difficulty, and it addresses dataset-size challenges in model optimization, providing valuable insights for future research.
Affiliation(s)
- Islam E Ali
- Department of Advanced Prosthodontics, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
- Department of Prosthodontics, Faculty of Dentistry, Mansoura University, Mansoura, Egypt
- Yuka Sumita
- Division of General Dentistry 4, The Nippon Dental University Hospital, Tokyo, Japan
- Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
- Noriyuki Wakabayashi
- Department of Advanced Prosthodontics, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan

195
Alegria AD, Joshi AS, Mendana JB, Khosla K, Smith KT, Auch B, Donovan M, Bischof J, Gohl DM, Kodandaramaiah SB. High-throughput genetic manipulation of multicellular organisms using a machine-vision guided embryonic microinjection robot. Genetics 2024; 226:iyae025. [PMID: 38373262 PMCID: PMC10990426 DOI: 10.1093/genetics/iyae025] [Received: 10/09/2023] [Revised: 01/02/2024] [Accepted: 01/08/2024] [Indexed: 02/21/2024]
Abstract
Microinjection is a technique used for transgenesis, mutagenesis, cell labeling, cryopreservation, and in vitro fertilization in multiple single and multicellular organisms. Microinjection requires specialized skills and involves rate-limiting and labor-intensive preparatory steps. Here, we constructed a machine-vision guided generalized robot that fully automates the process of microinjection in fruit fly (Drosophila melanogaster) and zebrafish (Danio rerio) embryos. The robot uses machine learning models trained to detect embryos in images of agar plates and identify specific anatomical locations within each embryo in 3D space using dual view microscopes. The robot then serially performs a microinjection in each detected embryo. We constructed and used three such robots to automatically microinject tens of thousands of Drosophila and zebrafish embryos. We systematically optimized robotic microinjection for each species and performed routine transgenesis with proficiency comparable to highly skilled human practitioners while achieving up to 4× increases in microinjection throughput in Drosophila. The robot was utilized to microinject pools of over 20,000 uniquely barcoded plasmids into 1,713 embryos in 2 days to rapidly generate more than 400 unique transgenic Drosophila lines. This experiment enabled a novel measurement of the number of independent germline integration events per successfully injected embryo. Finally, we showed that robotic microinjection of cryoprotective agents in zebrafish embryos significantly improves vitrification rates and survival of cryopreserved embryos post-thaw as compared to manual microinjection. We anticipate that the robot can be used to carry out microinjection for genome-wide manipulation and cryopreservation at scale in a wide range of organisms.
Collapse
Affiliation(s)
- Andrew D Alegria
- Department of Mechanical Engineering, University of Minnesota, Minneapolis, MN 55455, USA
- Amey S Joshi
- Department of Mechanical Engineering, University of Minnesota, Minneapolis, MN 55455, USA
- Jorge Blanco Mendana
- University of Minnesota Genomics Center, University of Minnesota, Minneapolis, MN 55455, USA
- Kanav Khosla
- Department of Mechanical Engineering, University of Minnesota, Minneapolis, MN 55455, USA
- Kieran T Smith
- Department of Fisheries, Wildlife and Conservation Biology, University of Minnesota, St. Paul, MN 55108, USA
- Benjamin Auch
- University of Minnesota Genomics Center, University of Minnesota, Minneapolis, MN 55455, USA
- Margaret Donovan
- University of Minnesota Genomics Center, University of Minnesota, Minneapolis, MN 55455, USA
- John Bischof
- Department of Mechanical Engineering, University of Minnesota, Minneapolis, MN 55455, USA
- Department of Biomedical Engineering, University of Minnesota, Minneapolis, MN 55455, USA
- Daryl M Gohl
- University of Minnesota Genomics Center, University of Minnesota, Minneapolis, MN 55455, USA
- Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, MN 55455, USA
- Suhasa B Kodandaramaiah
- Department of Mechanical Engineering, University of Minnesota, Minneapolis, MN 55455, USA
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
196
Lv Y, Zhang J, Barnes N, Dai Y. Weakly-Supervised Contrastive Learning for Unsupervised Object Discovery. IEEE Trans Image Process 2024; 33:2689-2702. [PMID: 38536682 DOI: 10.1109/tip.2024.3380243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/04/2024]
Abstract
Unsupervised object discovery (UOD) refers to the task of discriminating the whole region of objects from the background within a scene without relying on labeled datasets, which benefits the task of bounding-box-level localization and pixel-level segmentation. This task is promising due to its ability to discover objects in a generic manner. We roughly categorize existing techniques into two main directions, namely the generative solutions based on image resynthesis, and the clustering methods based on self-supervised models. We have observed that the former heavily relies on the quality of image reconstruction, while the latter shows limitations in effectively modeling semantic correlations. To directly target object discovery, we focus on the latter approach and propose a novel solution by incorporating weakly-supervised contrastive learning (WCL) to enhance semantic information exploration. We design a semantic-guided self-supervised learning model to extract high-level semantic features from images, which is achieved by fine-tuning the feature encoder of a self-supervised model, namely DINO, via WCL. Subsequently, we introduce Principal Component Analysis (PCA) to localize object regions. The principal projection direction, corresponding to the maximal eigenvalue, serves as an indicator of the object region(s). Extensive experiments on benchmark unsupervised object discovery datasets demonstrate the effectiveness of our proposed solution. The source code and experimental results are publicly available via our project page at https://github.com/npucvr/WSCUOD.git.
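The PCA localization step described in the abstract can be sketched in a few lines: project per-patch features from a self-supervised encoder onto the eigenvector with the maximal eigenvalue, then threshold the projections to obtain a coarse object map. This is an illustrative sketch only; the function name and the sign-flip heuristic are assumptions, not the authors' implementation.

```python
import numpy as np

def pca_object_map(features):
    """Score each patch by its projection onto the principal direction.

    features: (N, C) array of patch embeddings from a self-supervised
    encoder such as DINO (N = H*W patches). Thresholding the returned
    scores at zero gives a coarse foreground map.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    # Eigen-decompose the feature covariance; the eigenvector paired
    # with the maximal eigenvalue is the principal projection direction.
    cov = centered.T @ centered / max(len(features) - 1, 1)
    _, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    scores = centered @ eigvecs[:, -1]
    # An eigenvector's sign is arbitrary: flip so that the minority of
    # patches (assumed to be the object) scores positive.
    if (scores > 0).sum() > len(scores) / 2:
        scores = -scores
    return scores
```

Reshaping the scores back to the H × W patch grid and upsampling would yield the pixel-level object indicator described above.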
197
Buatik A, Thansirichaisree P, Kalpiyapun P, Khademi N, Pasityothin I, Poovarodom N. Mosaic crack mapping of footings by convolutional neural networks. Sci Rep 2024; 14:7851. [PMID: 38570570 PMCID: PMC10991403 DOI: 10.1038/s41598-024-58432-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2023] [Accepted: 03/29/2024] [Indexed: 04/05/2024] Open
Abstract
Cracks are the primary indicator of the structural health of concrete structures. Frequent inspection is essential for maintenance, and automatic crack inspection offers a significant advantage, given its efficiency and accuracy. Previously, image-based crack detection systems have been utilized for individual images, yet these systems are not effective for large inspection areas. This paper therefore proposes an image-based crack detection system using a Deep Convolutional Neural Network (DCNN) to identify cracks in mosaic images composed from UAV photos of concrete footings. UAV images are transformed into 3D footing models, from which the composite images are created. The CNN model is trained on 224 × 224 pixel patches, and training samples are augmented by various image transformation techniques. The proposed method is applied to localize cracks on composite images through the sliding window technique. The proposed VGG16-based CNN detection system achieves 95% detection accuracy, outperforming feature-based detection systems.
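The sliding-window localization described above is straightforward to sketch: slide a fixed-size window across the composite image and record windows the patch classifier scores as cracked. Here `classify` stands in for the trained 224 × 224 patch classifier (the paper's VGG16 model); the function name, stride, and 0.5 threshold are illustrative assumptions.

```python
import numpy as np

def sliding_window_detect(image, classify, patch=224, stride=112):
    """Localize cracks on a large composite image with a sliding window.

    `classify` maps a (patch, patch, ...) array to a crack probability
    in [0, 1]. Returns (row, col, prob) for windows scoring above 0.5;
    overlapping strides trade extra compute for finer localization.
    """
    h, w = image.shape[:2]
    hits = []
    for r in range(0, h - patch + 1, stride):
        for c in range(0, w - patch + 1, stride):
            p = float(classify(image[r:r + patch, c:c + patch]))
            if p > 0.5:
                hits.append((r, c, p))
    return hits
```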
Affiliation(s)
- Apichat Buatik
- Research Unit of Infrastructure Inspection, Monitoring, Repair and Strengthening, Faculty of Engineering, Thammasat School of Engineering, Thammasat University, Pathumthani, Thailand
- Phromphat Thansirichaisree
- Research Unit of Infrastructure Inspection, Monitoring, Repair and Strengthening, Faculty of Engineering, Thammasat School of Engineering, Thammasat University, Pathumthani, Thailand
- Phisutwat Kalpiyapun
- Research Unit of Infrastructure Inspection, Monitoring, Repair and Strengthening, Faculty of Engineering, Thammasat School of Engineering, Thammasat University, Pathumthani, Thailand
- Navid Khademi
- Research Unit of Infrastructure Inspection, Monitoring, Repair and Strengthening, Faculty of Engineering, Thammasat School of Engineering, Thammasat University, Pathumthani, Thailand
- School of Civil Engineering, College of Engineering, University of Tehran, Tehran, Iran
- Ittipon Pasityothin
- Research Unit of Infrastructure Inspection, Monitoring, Repair and Strengthening, Faculty of Engineering, Thammasat School of Engineering, Thammasat University, Pathumthani, Thailand
- Nakhorn Poovarodom
- Research Unit of Infrastructure Inspection, Monitoring, Repair and Strengthening, Faculty of Engineering, Thammasat School of Engineering, Thammasat University, Pathumthani, Thailand
198
Jo S, Jang O, Bhattacharyya C, Kim M, Lee T, Jang Y, Song H, Kwon H, Do S, Kim S. S-LIGHT: Synthetic Dataset for the Separation of Diffuse and Specular Reflection Images. Sensors (Basel) 2024; 24:2286. [PMID: 38610497 PMCID: PMC11014017 DOI: 10.3390/s24072286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/10/2024] [Revised: 03/19/2024] [Accepted: 04/01/2024] [Indexed: 04/14/2024]
Abstract
Several studies in computer vision have examined specular removal, which is crucial for object detection and recognition. This research has traditionally been divided into two tasks: specular highlight removal, which focuses on removing specular highlights on object surfaces, and reflection removal, which deals with specular reflections occurring on glass surfaces. In reality, however, both types of specular effects often coexist, making it a fundamental challenge that has not been adequately addressed. Recognizing the necessity of integrating specular components handled in both tasks, we constructed a specular-light (S-Light) DB for training single-image-based deep learning models. Moreover, considering the absence of benchmark datasets for quantitative evaluation, the multi-scale normalized cross correlation (MS-NCC) metric, which considers the correlation between specular and diffuse components, was introduced to assess the learning outcomes.
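As a rough illustration of how a correlation-based score between separated components could be computed, the sketch below averages zero-mean normalized cross-correlation over a simple factor-of-2 image pyramid. The pyramid depth, downsampling scheme, and aggregation here are assumptions; the paper's MS-NCC metric may differ in these details.

```python
import numpy as np

def ncc(a, b, eps=1e-8):
    """Zero-mean normalized cross-correlation of two same-size images."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def ms_ncc(diffuse, specular, scales=3):
    """Average NCC between diffuse and specular components over a
    factor-of-2 pyramid. Low |MS-NCC| suggests the two components are
    decorrelated, i.e. a cleaner separation."""
    d, s = diffuse.astype(float), specular.astype(float)
    total = 0.0
    for _ in range(scales):
        total += ncc(d, s)
        d, s = d[::2, ::2], s[::2, ::2]   # naive factor-2 downsampling
    return total / scales
```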
Affiliation(s)
- Sungho Kim
- Advanced Visual Intelligence Lab (AVILAB), Yeungnam University, Gyeongsan-si 38541, Republic of Korea; (S.J.); (O.J.); (C.B.); (M.K.); (T.L.); (Y.J.); (H.S.); (H.K.); (S.D.)
199
Shi Q, Ye M, Huang W, Ruan W, Du B. Label-Aware Calibration and Relation-Preserving in Visual Intention Understanding. IEEE Trans Image Process 2024; 33:2627-2638. [PMID: 38536683 DOI: 10.1109/tip.2024.3380250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/04/2024]
Abstract
Visual intention understanding is a challenging task that explores the hidden intention behind the images of publishers in social media. Visual intention represents implicit semantics, whose ambiguous definition inevitably leads to label shifting and label blemish. The former indicates that the same image delivers intention discrepancies under different data augmentations, while the latter represents that the label of intention data is susceptible to errors or omissions during the annotation process. This paper proposes a novel method, called Label-aware Calibration and Relation-preserving (LabCR) to alleviate the above two problems from both intra-sample and inter-sample views. First, we disentangle the multiple intentions into a single intention for explicit distribution calibration in terms of the overall and the individual. Calibrating the class probability distributions in augmented instance pairs provides consistent inferred intention to address label shifting. Second, we utilize the intention similarity to establish correlations among samples, which offers additional supervision signals to form correlation alignments in instance pairs. This strategy alleviates the effect of label blemish. Extensive experiments have validated the superiority of the proposed method LabCR in visual intention understanding and pedestrian attribute recognition. Code is available at https://github.com/ShiQingHongYa/LabCR.
200
Miao B, Bennamoun M, Gao Y, Mian A. Region Aware Video Object Segmentation With Deep Motion Modeling. IEEE Trans Image Process 2024; 33:2639-2651. [PMID: 38551827 DOI: 10.1109/tip.2024.3381445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/04/2024]
Abstract
Current semi-supervised video object segmentation (VOS) methods often employ the entire features of one frame to predict object masks and update memory. This introduces significant redundant computations. To reduce redundancy, we introduce a Region Aware Video Object Segmentation (RAVOS) approach, which predicts regions of interest (ROIs) for efficient object segmentation and memory storage. RAVOS includes a fast object motion tracker to predict object ROIs in the next frame. For efficient segmentation, object features are extracted based on the ROIs, and an object decoder is designed for object-level segmentation. For efficient memory storage, we propose motion path memory to filter out redundant context by memorizing the features within the motion path of objects. In addition to RAVOS, we also propose a large-scale occluded VOS dataset, dubbed OVOS, to benchmark the performance of VOS models under occlusions. Evaluation on DAVIS and YouTube-VOS benchmarks and our new OVOS dataset show that our method achieves state-of-the-art performance with significantly faster inference time, e.g., 86.1 J & F at 42 FPS on DAVIS and 84.4 J & F at 23 FPS on YouTube-VOS. Project page: ravos.netlify.app.
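As an illustration of predicting an object ROI from motion, the sketch below extrapolates a bounding box under a constant-velocity assumption and pads it by a margin so the object stays inside the predicted region. The function name, box parameterization, and margin are hypothetical, not the actual RAVOS tracker.

```python
def predict_roi(prev_box, curr_box, margin=0.1):
    """Predict the next-frame region of interest for a tracked object.

    Boxes are (x1, y1, x2, y2). Each coordinate is extrapolated with a
    constant-velocity model, then the box is padded by `margin` of its
    width/height to absorb small motion-prediction errors.
    """
    nxt = [c + (c - p) for p, c in zip(prev_box, curr_box)]
    x1, y1, x2, y2 = nxt
    pad_w, pad_h = margin * (x2 - x1), margin * (y2 - y1)
    return (x1 - pad_w, y1 - pad_h, x2 + pad_w, y2 + pad_h)
```

Restricting feature extraction and memory updates to such ROIs, rather than the full frame, is what removes the redundant computation the abstract describes.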