1
Sletta Ø, Cheema A, Marthinsen AJ, Andreassen IM, Sletten CM, Galtung IT, Soler A, Molinas M. Newly identified Phonocardiography frequency bands for psychological stress detection with Deep Wavelet Scattering Network. Comput Biol Med 2024; 178:108722. [PMID: 38889628 DOI: 10.1016/j.compbiomed.2024.108722] [Received: 11/13/2023] [Revised: 04/10/2024] [Accepted: 06/06/2024] [Indexed: 06/20/2024]
Abstract
Timely detection of psychological stress can improve quality of life by preventing stress-induced behavioral and pathological consequences. This paper presents a novel framework that eliminates the need for Electrocardiography (ECG)-based referencing of Phonocardiography (PCG) signals in psychological stress detection. This stand-alone PCG-based methodology applies a wavelet scattering approach to data acquired from twenty-eight healthy adult male and female subjects. The acquired PCG signals are asynchronously segmented for analysis with the wavelet scattering transform. After removal of the noise bands, the optimized segmentation length (L) and the scattering network parameters, namely the invariance scale (J) and quality factor (Q), are used to compute scattering features. The resulting scattering coefficients are fed to K-nearest neighbor (KNN) and Extreme Gradient Boosting (XGBoost) classifiers. Under ten-fold cross-validation, the XGBoost classifier detects psychological stress with an accuracy of 94.30%, sensitivity of 97.96%, specificity of 88.01%, and area under the curve (AUC) of 0.9298. Most importantly, the framework also identified two frequency bands in PCG signals with high discriminatory power for psychological stress detection: 270-290 Hz and 380-390 Hz. Eliminating multi-modal data acquisition and analysis makes this approach cost-efficient and reduces computational complexity.
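To make the feature-extraction step concrete, here is a minimal Python sketch of first-order scattering-style features: band-pass filtering, pointwise modulus, then time averaging. It is not the authors' implementation. The Gaussian filter shape, the `rel_bandwidth` value, the band centers, and the synthetic 280 Hz tone are all illustrative assumptions, and the parameters L, J, and Q from the abstract are not modeled. The classification step (KNN or XGBoost) is left out.

```python
import numpy as np


def gaussian_filter_bank(n_samples, centers, rel_bandwidth=0.1):
    """Build Gaussian band-pass filters in the frequency domain.

    `centers` are normalized frequencies (cycles per sample, 0 < xi < 0.5).
    The Gaussian shape and `rel_bandwidth` are illustrative assumptions,
    standing in for the Morlet-type wavelets a scattering network uses.
    """
    freqs = np.fft.fftfreq(n_samples)
    centers = np.asarray(centers, dtype=float)
    sigmas = rel_bandwidth * centers
    return np.exp(-0.5 * ((freqs[None, :] - centers[:, None]) / sigmas[:, None]) ** 2)


def first_order_scattering(segment, centers):
    """Return one time-averaged modulus coefficient per frequency band."""
    bank = gaussian_filter_bank(len(segment), centers)
    spectrum = np.fft.fft(segment)
    # Band-pass filtering followed by a pointwise modulus.
    modulus = np.abs(np.fft.ifft(spectrum[None, :] * bank, axis=1))
    # A global time average plays the role of the low-pass (invariance) step.
    return modulus.mean(axis=1)


# Synthetic 2-second tone at 280 Hz sampled at 1 kHz (not PCG data).
fs = 1000
t = np.arange(2 * fs) / fs
segment = np.sin(2 * np.pi * 280 * t)
band_centers_hz = np.array([100.0, 200.0, 280.0, 385.0])
features = first_order_scattering(segment, band_centers_hz / fs)
strongest_band_hz = band_centers_hz[int(np.argmax(features))]
```

On this toy input the largest coefficient falls in the band centered at 280 Hz, which is the sense in which such features can single out discriminative frequency bands. A feature vector like `features`, computed per segment, is what a classifier would consume.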
Affiliation(s)
- Øystein Sletta
  - Department of Engineering Cybernetics, Faculty of Information Technology and Electrical Engineering, Norwegian University of Science and Technology, Trondheim, Norway
- Amandeep Cheema
  - Department of Engineering Cybernetics, Faculty of Information Technology and Electrical Engineering, Norwegian University of Science and Technology, Trondheim, Norway
- Anne Joo Marthinsen
  - Department of Engineering Cybernetics, Faculty of Information Technology and Electrical Engineering, Norwegian University of Science and Technology, Trondheim, Norway
- Ida Marie Andreassen
  - Department of Engineering Cybernetics, Faculty of Information Technology and Electrical Engineering, Norwegian University of Science and Technology, Trondheim, Norway
- Christian Moe Sletten
  - Department of Engineering Cybernetics, Faculty of Information Technology and Electrical Engineering, Norwegian University of Science and Technology, Trondheim, Norway
- Ivar Tesdal Galtung
  - Department of Engineering Cybernetics, Faculty of Information Technology and Electrical Engineering, Norwegian University of Science and Technology, Trondheim, Norway
- Andres Soler
  - Department of Engineering Cybernetics, Faculty of Information Technology and Electrical Engineering, Norwegian University of Science and Technology, Trondheim, Norway
- Marta Molinas
  - Department of Engineering Cybernetics, Faculty of Information Technology and Electrical Engineering, Norwegian University of Science and Technology, Trondheim, Norway
2
Gao J, Jiao L, Liu X, Li L, Chen P, Liu F, Yang S. Multiscale Dynamic Curvelet Scattering Network. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:7999-8012. [PMID: 36427283 DOI: 10.1109/tnnls.2022.3223212] [Indexed: 06/16/2023]
Abstract
The feature representation learning process greatly determines the performance of networks in classification tasks. By combining multiscale geometric tools and networks, better representation and learning can be achieved. However, relatively fixed geometric features and multiscale structures are always used. In this article, we propose a more flexible framework called the multiscale dynamic curvelet scattering network (MSDCCN). This data-driven dynamic network is based on multiscale geometric prior knowledge. First, multiresolution scattering and multiscale curvelet features are efficiently aggregated in different levels. Then, these features can be reused in networks flexibly and dynamically, depending on the multiscale intervention flag. The initial value of this flag is based on the complexity assessment, and it is updated according to feature sparsity statistics on the pretrained model. With the multiscale dynamic reuse structure, the feature representation learning process can be improved in the following training process. Also, multistage fine-tuning can be performed to further improve the classification accuracy. Furthermore, a novel multiscale dynamic curvelet scattering module, which is more flexible, is developed to be further embedded into other networks. Extensive experimental results show that better classification accuracies can be achieved by MSDCCN. In addition, necessary evaluation experiments have been performed, including convergence analysis, insight analysis, and adaptability analysis.
3
Filippi J, Casti P, Antonelli G, Murdocca M, Mencattini A, Corsi F, D'Orazio M, Pecora A, De Luca M, Curci G, Ghibelli L, Sangiuolo F, Neale SL, Martinelli E. Cell Electrokinetic Fingerprint: A Novel Approach Based on Optically Induced Dielectrophoresis (ODEP) for In-Flow Identification of Single Cells. SMALL METHODS 2024:e2300923. [PMID: 38693090 DOI: 10.1002/smtd.202300923] [Received: 07/22/2023] [Revised: 04/04/2024] [Indexed: 05/03/2024]
Abstract
A novel optically induced dielectrophoresis (ODEP) system that can operate under flow conditions is designed for automatic trapping of cells and subsequent induction of 2D multi-frequency cell trajectories. As in a "ping-pong" match, two virtual electrode barriers operate in alternation with varying frequencies of the input voltage. The resulting cell motions are characterized via time-lapse microscopy, cell tracking, and state-of-the-art machine learning algorithms such as the wavelet scattering transform (WST). As a cell-electrokinetic fingerprint, the variation of the cell displacements over time is quantified in response to different frequency values of the induced electric field. When tested on two biological scenarios in the cancer domain, the proposed approach discriminates cellular dielectric phenotypes obtained, respectively, at different early phases of drug-induced apoptosis in prostate cancer (PC3) cells and for differential expression of the lectin-like oxidized low-density lipoprotein receptor-1 (LOX-1) transcript levels in human colorectal adenocarcinoma (DLD-1) cells. The results demonstrate increased discrimination by the proposed system and provide an additional basis for ODEP-based assays that address cancer heterogeneity in precision medicine and pharmacological research.
Affiliation(s)
- Joanna Filippi
  - Department of Electronic Engineering, University of Rome Tor Vergata, Via del Politecnico 1, Rome, 00133, Italy
  - Interdisciplinary Center for Advanced Studies on Lab-on-Chip and Organ-on-Chip Applications (ICLOC), Via del Politecnico 1, Rome, 00133, Italy
- Paola Casti
  - Department of Electronic Engineering, University of Rome Tor Vergata, Via del Politecnico 1, Rome, 00133, Italy
  - Interdisciplinary Center for Advanced Studies on Lab-on-Chip and Organ-on-Chip Applications (ICLOC), Via del Politecnico 1, Rome, 00133, Italy
- Gianni Antonelli
  - Department of Electronic Engineering, University of Rome Tor Vergata, Via del Politecnico 1, Rome, 00133, Italy
  - Interdisciplinary Center for Advanced Studies on Lab-on-Chip and Organ-on-Chip Applications (ICLOC), Via del Politecnico 1, Rome, 00133, Italy
- Michela Murdocca
  - Department of Biomedicine and Prevention, University of Rome Tor Vergata, Via Montpellier 1, Rome, 00133, Italy
- Arianna Mencattini
  - Department of Electronic Engineering, University of Rome Tor Vergata, Via del Politecnico 1, Rome, 00133, Italy
  - Interdisciplinary Center for Advanced Studies on Lab-on-Chip and Organ-on-Chip Applications (ICLOC), Via del Politecnico 1, Rome, 00133, Italy
- Francesca Corsi
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica 1, Rome, 00133, Italy
  - Department of Chemical Science and Technologies, University of Rome Tor Vergata, Via della Ricerca Scientifica 1, Rome, 00133, Italy
- Michele D'Orazio
  - Department of Electronic Engineering, University of Rome Tor Vergata, Via del Politecnico 1, Rome, 00133, Italy
  - Interdisciplinary Center for Advanced Studies on Lab-on-Chip and Organ-on-Chip Applications (ICLOC), Via del Politecnico 1, Rome, 00133, Italy
- Alessandro Pecora
  - Italian National Research Council (CNR), Via del Fosso del Cavaliere 100, Rome, 00133, Italy
- Massimiliano De Luca
  - Italian National Research Council (CNR), Via del Fosso del Cavaliere 100, Rome, 00133, Italy
- Giorgia Curci
  - Department of Electronic Engineering, University of Rome Tor Vergata, Via del Politecnico 1, Rome, 00133, Italy
  - Interdisciplinary Center for Advanced Studies on Lab-on-Chip and Organ-on-Chip Applications (ICLOC), Via del Politecnico 1, Rome, 00133, Italy
- Lina Ghibelli
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica 1, Rome, 00133, Italy
- Federica Sangiuolo
  - Department of Biomedicine and Prevention, University of Rome Tor Vergata, Via Montpellier 1, Rome, 00133, Italy
- Steven L Neale
  - James Watt School of Engineering, University of Glasgow, Glasgow, G12 8QQ, UK
- Eugenio Martinelli
  - Department of Electronic Engineering, University of Rome Tor Vergata, Via del Politecnico 1, Rome, 00133, Italy
  - Interdisciplinary Center for Advanced Studies on Lab-on-Chip and Organ-on-Chip Applications (ICLOC), Via del Politecnico 1, Rome, 00133, Italy
4
Deo BS, Nayak S, Pal M, Panigrahi PK, Pradhan A. Wavelet scattering transform and entropy features in fluorescence spectral signal analysis for cervical cancer diagnosis. Biomed Phys Eng Express 2024; 10:045002. [PMID: 38636479 DOI: 10.1088/2057-1976/ad403a] [Received: 11/10/2023] [Accepted: 04/18/2024] [Indexed: 04/20/2024]
Abstract
Cervical cancer is a prevalent malignant tumor of the female reproductive system and a prominent cause of female mortality worldwide. Timely and precise detection of the various phases of cervical cancer has the potential to substantially improve both treatment success rates and patient survival. Fluorescence spectroscopy is a highly sensitive method for detecting the biochemical changes that arise during cancer progression. In our study, fluorescence spectral data are collected from a diverse group of 110 subjects, and the potential of the scattering transform technique for cancer detection is explored. The processed signal is first decomposed into scattering coefficients using the wavelet scattering transform (WST). Fuzzy entropy, dispersion entropy, phase entropy, and spectral entropy are then computed from the scattering coefficients to characterize the fluorescence spectral signals. These combined features are fed to a 1D convolutional neural network (CNN) classifier, which assigns them to normal, pre-cancerous, and cancerous categories, thereby evaluating the effectiveness of the proposed methodology. We obtained a mean classification accuracy of 97% using 5-fold cross-validation. This demonstrates the potential of combining WST and entropic features with a 1D CNN classifier for analyzing fluorescence spectroscopy signals, enabling early cancer detection in contrast to prevailing diagnostic methods.
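Of the four entropy features named in the abstract, spectral entropy is the simplest to illustrate. The Python sketch below uses one common definition, the Shannon entropy of the normalized power spectrum, and applies it to raw toy signals. The paper computes its entropies on scattering coefficients and may use a different normalization, so the definition, function name, and test signals here are assumptions for illustration only.

```python
import numpy as np


def spectral_entropy(signal):
    """Normalized Shannon entropy of the power spectrum, in [0, 1]."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    p = power / power.sum()
    nonzero = p > 0  # skip empty bins so 0 * log(0) is treated as 0
    entropy = -np.sum(p[nonzero] * np.log2(p[nonzero]))
    return entropy / np.log2(len(p))


n = 256
# A pure tone concentrates its energy in a single frequency bin.
tone = np.sin(2 * np.pi * 10 * np.arange(n) / n)
# A unit impulse has a flat spectrum.
impulse = np.zeros(n)
impulse[0] = 1.0

tone_entropy = spectral_entropy(tone)
impulse_entropy = spectral_entropy(impulse)
```

The tone gives an entropy close to 0 and the impulse gives 1, the two extremes of the scale. Values in between summarize how spread out a signal's spectral energy is, which is what makes the quantity usable as a compact feature.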
Affiliation(s)
- Bhaswati Singha Deo
  - Center for Lasers and Photonics, Indian Institute of Technology, Kanpur, 208016, India
- Sidharthenee Nayak
  - ABB Ability Innovation Center, Asea Brown Boveri Company, Hyderabad, 500084, Telangana, India
  - School of Electrical Sciences, Indian Institute of Technology, Bhubaneswar, 751013, India
- Mayukha Pal
  - ABB Ability Innovation Center, Asea Brown Boveri Company, Hyderabad, 500084, Telangana, India
- Prasanta K Panigrahi
  - Department of Physical Sciences, Indian Institute of Science Education and Research Kolkata, Mohanpur, Nadia, 741246, India
  - Center for Quantum Science and Technology, Siksha 'O' Anusandhan University, Bhubaneswar, 751030, Odisha, India
- Asima Pradhan
  - Center for Lasers and Photonics, Indian Institute of Technology, Kanpur, 208016, India
  - Department of Physics, Indian Institute of Technology, Kanpur, 208016, India
5
Xin J, Khishe M, Zeebaree DQ, Abualigah L, Ghazal TM. Adaptive habitat biogeography-based optimizer for optimizing deep CNN hyperparameters in image classification. Heliyon 2024; 10:e28147. [PMID: 38689992 PMCID: PMC11059399 DOI: 10.1016/j.heliyon.2024.e28147] [Received: 07/30/2023] [Revised: 03/12/2024] [Accepted: 03/12/2024] [Indexed: 05/02/2024]
Abstract
Deep Convolutional Neural Networks (DCNNs) have shown remarkable success in image classification tasks, but optimizing their hyperparameters can be challenging due to their complex structure. This paper develops the Adaptive Habitat Biogeography-Based Optimizer (AHBBO) for tuning the hyperparameters of DCNNs in image classification tasks. In complicated optimization problems, the BBO suffers from premature convergence and insufficient exploration. In this regard, an adaptable habitat is presented as a solution to these problems; it would permit variable habitat sizes and regulated mutation. Better optimization performance and a greater chance of finding high-quality solutions across a wide range of problem domains are the results of this modification's increased exploration and population diversity. AHBBO is tested on 53 benchmark optimization functions and demonstrates its effectiveness in improving initial stochastic solutions and converging faster to the optimum. Furthermore, DCNN-AHBBO is compared to 23 well-known image classifiers on nine challenging image classification problems and shows superior performance in reducing the error rate by up to 5.14%. Our proposed algorithm outperforms 13 benchmark classifiers in 87 out of 95 evaluations, providing a high-performance and reliable solution for optimizing DNNs in image classification tasks. This research contributes to the field of deep learning by proposing a new optimization algorithm that can improve the efficiency of deep neural networks in image classification.
Affiliation(s)
- Jiayun Xin
  - School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, 264209, Shandong, China
- Mohammad Khishe
  - Department of Electrical Engineering, Imam Khomeini Marine Science University, Nowshahr, Iran
  - Center for Artificial Intelligence Applications, Yuan Ze University, Taiwan
- Diyar Qader Zeebaree
  - Information Technology Department, Technical College of Duhok, Duhok Polytechnic University, Duhok, Iraq
- Laith Abualigah
  - Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, 19328, Jordan
  - Computer Science Department, Al al-Bayt University, Mafraq, 25113, Jordan
  - Artificial Intelligence and Sensing Technologies (AIST) Research Center, University of Tabuk, Tabuk, 71491, Saudi Arabia
  - MEU Research Unit, Middle East University, Amman, 11831, Jordan
  - Department of Electrical and Computer Engineering, Lebanese American University, Byblos, 13-5053, Lebanon
  - School of Engineering and Technology, Sunway University Malaysia, Petaling Jaya, 27500, Malaysia
- Taher M. Ghazal
  - Centre for Cyber Physical Systems, Computer Science Department, Khalifa University, United Arab Emirates
  - Center for Cyber Security, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia (UKM), 43600, Bangi, Selangor, Malaysia
  - Applied Science Research Center, Applied Science Private University, Amman, 11937, Jordan
6
Li S, Xu H, Wang J, Xu R, Liu A, He F, Liu X, Tao D. Hierarchical Perceptual Noise Injection for Social Media Fingerprint Privacy Protection. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2024; 33:2714-2729. [PMID: 38557629 DOI: 10.1109/tip.2024.3381771] [Indexed: 04/04/2024]
Abstract
Billions of people share images from their daily lives on social media every day. However, their biometric information (e.g., fingerprints) could be easily stolen from these images. The threat of fingerprint leakage from social media has created a strong desire to anonymize shared images while maintaining image quality, since fingerprints act as a lifelong individual biometric password. To guard against fingerprint leakage, adversarial attacks that add imperceptible perturbations to fingerprint images have emerged as a feasible solution. However, existing works of this kind are either weak in black-box transferability or cause the images to have an unnatural appearance. Motivated by the visual perception hierarchy (i.e., high-level perception exploits model-shared semantics that transfer well across models, while low-level perception extracts primitive stimuli that result in high visual sensitivity when a suspicious stimulus is provided), we propose FingerSafe, a hierarchical perceptual protective noise injection framework that addresses the above-mentioned problems. For black-box transferability, we inject protective noise into the fingerprint orientation field to perturb the model-shared high-level semantics (i.e., fingerprint ridges). For visual naturalness, we suppress the low-level local contrast stimulus by regularizing the response of the Lateral Geniculate Nucleus. Our proposed FingerSafe is the first to provide feasible fingerprint protection in both digital (up to 94.12%) and realistic scenarios (Twitter and Facebook, up to 68.75%). Our code can be found at https://github.com/nlsde-safety-team/FingerSafe.
7
Tam KH, Soares MF, Kers J, Sharples EJ, Ploeg RJ, Kaisar M, Rittscher J. Predicting clinical endpoints and visual changes with quality-weighted tissue-based renal histological features. FRONTIERS IN TRANSPLANTATION 2024; 3:1305468. [PMID: 38993786 PMCID: PMC11235227 DOI: 10.3389/frtra.2024.1305468] [Received: 10/01/2023] [Accepted: 03/15/2024] [Indexed: 07/13/2024]
Abstract
Two common obstacles limiting the performance of data-driven algorithms in digital histopathology classification tasks are the lack of expert annotations and the narrow diversity of datasets. Multi-instance learning (MIL) can address the former challenge for the analysis of whole slide images (WSI), but performance is often inferior to full supervision. We show that the inclusion of weak annotations can significantly enhance the effectiveness of MIL while keeping the approach scalable. An analysis framework was developed to process periodic acid-Schiff (PAS) and Sirius Red (SR) slides of renal biopsies. The workflow segments tissues into coarse tissue classes. Handcrafted and deep features were extracted from these tissues and combined using a soft attention model to predict several slide-level labels: delayed graft function (DGF), acute tubular injury (ATI), and Remuzzi grade components. A tissue segmentation quality metric was also developed to reduce the adverse impact of poorly segmented instances. The soft attention model was trained using 5-fold cross-validation on a mixed dataset and tested on the QUOD dataset containing n = 373 PAS and n = 195 SR biopsies. The average ROC-AUC over different prediction tasks was found to be 0.598 ± 0.011, significantly higher than using only ResNet50 (0.545 ± 0.012), only handcrafted features (0.542 ± 0.011), and the baseline (0.532 ± 0.012) of state-of-the-art performance. In conjunction with soft attention, weighting tissues by segmentation quality has led to further improvement (AUC = 0.618 ± 0.010). Using an intuitive visualisation scheme, we show that our approach may also be used to support clinical decision making as it allows pinpointing individual tissues relevant to the predictions.
Affiliation(s)
- Ka Ho Tam
  - Institute of Biomedical Engineering, University of Oxford, Oxford, United Kingdom
- Maria F. Soares
  - Department of Cellular Pathology, Oxford University Hospitals NHS Foundation Trust, John Radcliffe Hospital, Oxford, United Kingdom
- Jesper Kers
  - Department of Pathology, Amsterdam UMC, University of Amsterdam, Amsterdam, Netherlands
  - Department of Pathology, Leiden Transplant Center, Leiden University Medical Center, Leiden, Netherlands
  - Van’t Hoff Institute for Molecular Sciences, University of Amsterdam, Amsterdam, Netherlands
- Edward J. Sharples
  - Nuffield Department of Surgical Sciences, University of Oxford, Oxford, United Kingdom
- Rutger J. Ploeg
  - Nuffield Department of Surgical Sciences, University of Oxford, Oxford, United Kingdom
  - Research and Development, NHS Blood and Transplant Filton and Oxford, Oxford, United Kingdom
- Maria Kaisar
  - Nuffield Department of Surgical Sciences, University of Oxford, Oxford, United Kingdom
  - Research and Development, NHS Blood and Transplant Filton and Oxford, Oxford, United Kingdom
- Jens Rittscher
  - Institute of Biomedical Engineering, University of Oxford, Oxford, United Kingdom
8
Cheng S, Morel R, Allys E, Ménard B, Mallat S. Scattering spectra models for physics. PNAS NEXUS 2024; 3:pgae103. [PMID: 38560525 PMCID: PMC10978061 DOI: 10.1093/pnasnexus/pgae103] [Received: 09/17/2023] [Accepted: 02/16/2024] [Indexed: 04/04/2024]
Abstract
Physicists routinely need probabilistic models for a number of tasks such as parameter inference or the generation of new realizations of a field. Establishing such models for highly non-Gaussian fields is a challenge, especially when the number of samples is limited. In this paper, we introduce scattering spectra models for stationary fields and we show that they provide accurate and robust statistical descriptions of a wide range of fields encountered in physics. These models are based on covariances of scattering coefficients, i.e. wavelet decomposition of a field coupled with a pointwise modulus. After introducing useful dimension reductions taking advantage of the regularity of a field under rotation and scaling, we validate these models on various multiscale physical fields and demonstrate that they reproduce standard statistics, including spatial moments up to fourth order. The scattering spectra provide us with a low-dimensional structured representation that captures key properties encountered in a wide range of physical fields. These generic models can be used for data exploration, classification, parameter inference, symmetry detection, and component separation.
Affiliation(s)
- Sihao Cheng
  - School of Natural Sciences, Institute for Advanced Study, Princeton, NJ 08540, USA
- Rudy Morel
  - Departement d'informatique de l'ENS, ENS, CNRS, PSL University, 75014 Paris, France
- Erwan Allys
  - Laboratoire de Physique de l'Ecole normale supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris Cité, 75014 Paris, France
- Brice Ménard
  - Department of Physics and Astronomy, Johns Hopkins University, Baltimore, MD 21218, USA
- Stéphane Mallat
  - Departement d'informatique de l'ENS, ENS, CNRS, PSL University, 75014 Paris, France
  - Collège de France, 75231 Paris, France
  - Center for Computational Mathematics, Flatiron Institute, New York, NY 10010, USA
9
Radhakrishnan A, Beaglehole D, Pandit P, Belkin M. Mechanism for feature learning in neural networks and backpropagation-free machine learning models. Science 2024; 383:1461-1467. [PMID: 38452048 DOI: 10.1126/science.adi5639] [Received: 05/04/2023] [Accepted: 02/22/2024] [Indexed: 03/09/2024]
Abstract
Understanding how neural networks learn features, or relevant patterns in data, for prediction is necessary for their reliable use in technological and scientific applications. In this work, we presented a unifying mathematical mechanism, known as average gradient outer product (AGOP), that characterized feature learning in neural networks. We provided empirical evidence that AGOP captured features learned by various neural network architectures, including transformer-based language models, convolutional networks, multilayer perceptrons, and recurrent neural networks. Moreover, we demonstrated that AGOP, which is backpropagation-free, enabled feature learning in machine learning models, such as kernel machines, that a priori could not identify task-specific features. Overall, we established a fundamental mechanism that captured feature learning in neural networks and enabled feature learning in general machine learning models.
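As its name indicates, the average gradient outer product of a scalar-valued predictor f over inputs x_1, ..., x_n is the matrix (1/n) Σ ∇f(x_i) ∇f(x_i)^T. The Python sketch below computes it for a toy predictor with a hand-written gradient. The predictor, the vector `w`, and the random inputs are hypothetical and chosen only for illustration; this is not the authors' code.

```python
import numpy as np


def agop(grad_fn, inputs):
    """Average gradient outer product: mean over samples of grad f(x) grad f(x)^T."""
    grads = np.stack([grad_fn(x) for x in inputs])  # shape (n, d)
    return grads.T @ grads / len(inputs)


# Hypothetical predictor f(x) = (w . x)^2, which depends on x only through w . x.
w = np.array([1.0, 0.0, 0.0])


def grad_f(x):
    """Analytic gradient of f(x) = (w . x)^2."""
    return 2.0 * (w @ x) * w


rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 3))
M = agop(grad_f, samples)
```

Because this toy predictor varies only along the first coordinate, the only nonzero entry of `M` is `M[0, 0]`. The matrix therefore highlights the input direction the predictor actually uses, which is the sense in which AGOP captures learned features.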
Affiliation(s)
- Adityanarayanan Radhakrishnan
  - Harvard School of Engineering and Applied Sciences, Cambridge, MA 02138, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Daniel Beaglehole
  - Computer Science and Engineering, UC San Diego, La Jolla, CA 92093, USA
- Parthe Pandit
  - Center for Machine Intelligence and Data Science, IIT Bombay, Mumbai 400076, India
  - Halıcıoğlu Data Science Institute, UC San Diego, La Jolla, CA 92093, USA
- Mikhail Belkin
  - Computer Science and Engineering, UC San Diego, La Jolla, CA 92093, USA
  - Halıcıoğlu Data Science Institute, UC San Diego, La Jolla, CA 92093, USA
10
Schuler I, Schuler M, Frick T, Jimenez D, Maghnouj A, Hahn S, Zewail R, Gerwert K, El-Mashtoly SF. Efficacy of tyrosine kinase inhibitors examined by a combination of Raman micro-spectroscopy and a deep wavelet scattering-based multivariate analysis framework. Analyst 2024; 149:2004-2015. [PMID: 38426854 DOI: 10.1039/d3an02235h] [Indexed: 03/02/2024]
Abstract
HER2 is a crucial therapeutic target in breast cancer, and the survival rate of breast cancer patients has increased because of this receptor's inhibition. However, tumors have shown resistance to this therapeutic strategy due to oncogenic mutations that decrease the binding of several HER2-targeted drugs, including lapatinib, and confer resistance to this drug. Neratinib can overcome this drug resistance and effectively inhibit HER2 signaling and tumor growth. In the present study, we examined the efficacy of lapatinib and neratinib using breast cancer cells by Raman microscopy combined with a deep wavelet scattering-based multivariate analysis framework. This approach discriminated between control cells and drug-treated cells with high accuracy, compared to classical principal component analysis. Both lapatinib and neratinib induced changes in the cellular biochemical composition. Furthermore, the Raman results were compared with the results of several in vitro assays. For instance, drug-treated cells exhibited (i) inhibition of ERK and AKT phosphorylation, (ii) inhibition of cellular proliferation, (iii) cell-cycle arrest, and (iv) apoptosis as indicated by western blotting, real-time cell analysis (RTCA), cell-cycle analysis, and apoptosis assays. Thus, the observed Raman spectral changes are attributed to cell-cycle arrest and apoptosis. The results also indicated that neratinib is more potent than lapatinib. Moreover, the uptake and distribution of lapatinib in cells were visualized through its label-free marker bands in the fingerprint region using Raman spectral imaging. These results show the prospects of Raman microscopy in drug evaluation and presumably in drug discovery.
Affiliation(s)
- Irina Schuler
  - Center for Protein Diagnostics, Ruhr-University Bochum, Bochum, Germany
  - Department of Biophysics, Ruhr-University Bochum, Bochum, Germany
- Martin Schuler
  - Center for Protein Diagnostics, Ruhr-University Bochum, Bochum, Germany
  - Department of Biophysics, Ruhr-University Bochum, Bochum, Germany
- Tatjana Frick
  - Center for Protein Diagnostics, Ruhr-University Bochum, Bochum, Germany
  - Department of Biophysics, Ruhr-University Bochum, Bochum, Germany
- Dairovys Jimenez
  - Center for Protein Diagnostics, Ruhr-University Bochum, Bochum, Germany
  - Department of Biophysics, Ruhr-University Bochum, Bochum, Germany
- Abdelouahid Maghnouj
  - Department of Molecular GI-Oncology, Clinical Research Center, Ruhr-University Bochum, Bochum, Germany
- Stephan Hahn
  - Department of Molecular GI-Oncology, Clinical Research Center, Ruhr-University Bochum, Bochum, Germany
- Rami Zewail
  - Department of Computer Science & Engineering, Egypt-Japan University of Science and Technology, New Borg El-Arab, Egypt
- Klaus Gerwert
  - Center for Protein Diagnostics, Ruhr-University Bochum, Bochum, Germany
  - Department of Biophysics, Ruhr-University Bochum, Bochum, Germany
- Samir F El-Mashtoly
  - Center for Protein Diagnostics, Ruhr-University Bochum, Bochum, Germany
  - Department of Biophysics, Ruhr-University Bochum, Bochum, Germany
  - Biotechnology Program, Institute of Basic and Applied Science, Egypt-Japan University of Science and Technology, New Borg El-Arab, Egypt
11
Gerace F, Krzakala F, Loureiro B, Stephan L, Zdeborová L. Gaussian universality of perceptrons with random labels. Phys Rev E 2024; 109:034305. [PMID: 38632742 DOI: 10.1103/physreve.109.034305] [Received: 03/03/2023] [Accepted: 12/08/2023] [Indexed: 04/19/2024]
Abstract
While classical in many theoretical settings, and in particular in works inspired by statistical physics, the assumption of Gaussian i.i.d. input data is often perceived as a strong limitation in the context of statistics and machine learning. In this study, we redeem this line of work in the case of generalized linear classification, also known as the perceptron model, with random labels. We argue that there is a large universality class of high-dimensional input data for which we obtain the same minimum training loss as for Gaussian data with corresponding data covariance. In the limit of vanishing regularization, we further demonstrate that the training loss is independent of the data covariance. On the theoretical side, we prove this universality for an arbitrary mixture of homogeneous Gaussian clouds. Empirically, we show that the universality also holds for a broad range of real data sets.
Affiliation(s)
- Federica Gerace
  - International School of Advanced Studies (SISSA), Trieste, Via Bonomea, 265, 34136 Trieste, Italy
  - EPFL Statistical Physics of Computation (SPOC) Laboratory, Rte Cantonale, 1015 Lausanne, Switzerland
- Florent Krzakala
  - EPFL, Information, Learning and Physics (IdePHICS) Laboratory, Rte Cantonale, 1015 Lausanne, Switzerland
- Bruno Loureiro
  - EPFL, Information, Learning and Physics (IdePHICS) Laboratory, Rte Cantonale, 1015 Lausanne, Switzerland
  - Département d'Informatique, École Normale Supérieure (ENS)-PSL & CNRS, F-75230 Paris Cedex 05, France
- Ludovic Stephan
  - EPFL, Information, Learning and Physics (IdePHICS) Laboratory, Rte Cantonale, 1015 Lausanne, Switzerland
- Lenka Zdeborová
  - EPFL Statistical Physics of Computation (SPOC) Laboratory, Rte Cantonale, 1015 Lausanne, Switzerland
12
Pham TD, Holmes SB, Zou L, Patel M, Coulthard P. Diagnosis of pathological speech with streamlined features for long short-term memory learning. Comput Biol Med 2024; 170:107976. [PMID: 38219647 DOI: 10.1016/j.compbiomed.2024.107976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Revised: 11/14/2023] [Accepted: 01/04/2024] [Indexed: 01/16/2024]
Abstract
BACKGROUND: Pathological speech diagnosis is crucial for identifying and treating various speech disorders. Accurate diagnosis aids in developing targeted intervention strategies, improving patients' communication abilities, and enhancing their overall quality of life. With the rising incidence of speech-related conditions globally, including oral health, the need for efficient and reliable diagnostic tools has become paramount, emphasizing the significance of advanced research in this field.
METHODS: This paper introduces novel features for deep learning in the analysis of short voice signals. It proposes the incorporation of time-space and time-frequency features to accurately discern between two distinct groups: Individuals exhibiting normal vocal patterns and those manifesting pathological voice conditions. These advancements aim to enhance the precision and reliability of diagnostic procedures, paving the way for more targeted treatment approaches.
RESULTS: Utilizing a publicly available voice database, this study carried out training and validation using long short-term memory (LSTM) networks learning on the combined features, along with a data balancing strategy. The proposed approach yielded promising performance metrics: 90% accuracy, 93% sensitivity, 87% specificity, 88% precision, an F1 score of 0.90, and an area under the receiver operating characteristic curve of 0.96. The results surpassed those obtained by the networks trained using wavelet-time scattering coefficients, as well as several algorithms trained with alternative feature types.
CONCLUSIONS: The incorporation of time-frequency and time-space features extracted from short segments of voice signals for LSTM learning demonstrates significant promise as an AI tool for the diagnosis of speech pathology. The proposed approach has the potential to enhance the accuracy and allow for real-time pathological speech assessment, thereby facilitating more targeted and effective therapeutic interventions.
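The reported F1 score can be cross-checked against the reported precision and sensitivity, since F1 is the harmonic mean of precision and recall and sensitivity is another name for recall. The short sketch below only performs that arithmetic on the rounded figures quoted in the abstract; the function name `f1_from_precision_recall` is an illustrative choice, not something taken from the paper.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Rounded values quoted in the abstract: 88% precision, 93% sensitivity (recall).
f1 = f1_from_precision_recall(precision=0.88, recall=0.93)
print(f"{f1:.2f}")  # prints 0.90
```

The result agrees with the F1 score of 0.90 stated in the abstract, so the three figures are mutually consistent at the reported precision.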
Affiliation(s)
- Tuan D Pham
  - Barts and The London Faculty of Medicine and Dentistry, Queen Mary University of London, Turner Street, E1 2AD, London, UK
- Simon B Holmes
  - Barts and The London Faculty of Medicine and Dentistry, Queen Mary University of London, Turner Street, E1 2AD, London, UK
- Lifong Zou
  - Barts and The London Faculty of Medicine and Dentistry, Queen Mary University of London, Turner Street, E1 2AD, London, UK
- Mangala Patel
  - Barts and The London Faculty of Medicine and Dentistry, Queen Mary University of London, Turner Street, E1 2AD, London, UK
- Paul Coulthard
  - Barts and The London Faculty of Medicine and Dentistry, Queen Mary University of London, Turner Street, E1 2AD, London, UK
13
Sharma N, Sharma M, Tailor J, Chaudhari A, Joshi D, Acharya UR. Automated detection of depression using wavelet scattering networks. Med Eng Phys 2024; 124:104107. [PMID: 38418014 DOI: 10.1016/j.medengphy.2024.104107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 12/16/2023] [Accepted: 01/09/2024] [Indexed: 03/01/2024]
Abstract
Today, depression is a common problem that affects many people all over the world. It can impact a person's mood and quality of life unless identified and treated immediately. Because modern life is hectic and stressful, depression has become a leading cause of mental health illnesses. Signals from electroencephalograms (EEG) are frequently used to detect depression. Manually detecting depression through EEG data analysis is difficult and time-consuming and requires a high level of skill. Hence, this study proposes an automated depression detection system using EEG signals. The study uses a clinically available dataset provided by the Department of Psychiatry at the Government Medical College (GMC) in Kozhikode, Kerala, India, which consisted of 15 depressed patients and 15 healthy subjects, and the publicly available Multi-modal Open Dataset for Mental-disorder Analysis (MODMA), available at the UK Data Service ReShare, which consisted of 24 depressed patients and 29 healthy subjects. In this study, we have developed a novel Deep Wavelet Scattering Network (DWSN) for the automated detection of depression from EEG signals. The best-performing classifier is then chosen by feeding the features into several machine-learning algorithms. For the clinically available GMC dataset, Medium Neural Network (MNN) achieved the highest accuracy of 99.95% with a Kappa value of 0.999. Using the suggested methods, the precision, recall, and F1-score are all 1. For the MODMA dataset, Wide Neural Network (WNN) achieved the highest accuracy of 99.3% with a Kappa value of 0.987. Using the suggested methods, the precision, recall, and F1-score are all 0.99. In comparison to all current methodologies, the performance of the suggested research is superior. The proposed method can be used to automatically diagnose depression both at home and in clinical settings.
Affiliation(s)
- Nishant Sharma
  - Department of Electrical Engineering, Institute of Infrastructure, Technology, Research and Management (IITRAM), Ahmedabad, India
- Manish Sharma
  - Department of Electrical Engineering, Institute of Infrastructure, Technology, Research and Management (IITRAM), Ahmedabad, India
- Jimit Tailor
  - Department of Electrical Engineering, Institute of Infrastructure, Technology, Research and Management (IITRAM), Ahmedabad, India
- Arth Chaudhari
  - Department of Electrical Engineering, Institute of Infrastructure, Technology, Research and Management (IITRAM), Ahmedabad, India
- Deepak Joshi
  - Centre for Biomedical Engineering, Indian Institute of Technology Delhi (IITD), Delhi, India
- U Rajendra Acharya
  - School of Mathematics, Physics, and Computing, University of Southern Queensland, Toowoomba 4350, Queensland, Australia
14
Oberst S, Martin R. Feature-preserving synthesis of termite-mimetic spinodal nest morphology. iScience 2024; 27:108674. [PMID: 38292166 PMCID: PMC10825051 DOI: 10.1016/j.isci.2023.108674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Revised: 07/09/2023] [Accepted: 12/05/2023] [Indexed: 02/01/2024] Open
Abstract
Termite-built topology is complex due to group interactions and environmental feedback. Being interlinked with material characteristics and related to functionality, an accurate synthesis of termite mound topology has never been achieved. We scanned inner termite mound pieces via high-resolution micro-computed tomography. A wavelet scattering transform followed by optimization extracts features that are fed into a Gaussian Random Fields (GRFs) approach to synthesize termite-mimetic spinodal topology. Compared to natural structures, the GRF topology is more regular. Irregularity is related to anisotropy, indicative of directionality caused by porous network connectivity of chambers and corridors. Since GRFs are related to diffusion, we assume that deterministic behavioral traits play a significant role in the development of these local differences. We pioneer a framework to reliably mimic termite mound spinodal features. Engineering termite-inspired structures will allow researchers to inspect aspects of termite architectures and their behavior to manufacture novel material concepts with imprinted multi-functionality.
Affiliation(s)
- Sebastian Oberst
  - Centre for Audio, Acoustics and Vibration, University of Technology Sydney, Sydney, NSW 2007, Australia
  - School of Engineering and IT, University of New South Wales, Canberra, ACT 2612, Australia
- Richard Martin
  - Centre for Audio, Acoustics and Vibration, University of Technology Sydney, Sydney, NSW 2007, Australia
15
Petersen PC, Sepliarskaia A. VC dimensions of group convolutional neural networks. Neural Netw 2024; 169:462-474. [PMID: 37939535 DOI: 10.1016/j.neunet.2023.10.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Revised: 07/17/2023] [Accepted: 10/10/2023] [Indexed: 11/10/2023]
Abstract
We study the generalization capacity of group convolutional neural networks. We identify precise estimates for the VC dimensions of simple sets of group convolutional neural networks. In particular, we find that for infinite groups and appropriately chosen convolutional kernels, already two-parameter families of convolutional neural networks have an infinite VC dimension, despite being invariant to the action of an infinite group.
Affiliation(s)
- Philipp Christian Petersen
  - University of Vienna, Faculty of Mathematics and Research Network Data Science@ Uni Vienna, Kolingasse 14-16, 1090 Wien, Austria
- Anna Sepliarskaia
  - University of Vienna, Faculty of Mathematics and Research Network Data Science@ Uni Vienna, Kolingasse 14-16, 1090 Wien, Austria
16
Ju C, Guan C. Tensor-CSPNet: A Novel Geometric Deep Learning Framework for Motor Imagery Classification. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:10955-10969. [PMID: 35749326 DOI: 10.1109/tnnls.2022.3172108] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/15/2023]
Abstract
Deep learning (DL) has been widely investigated in a vast majority of applications in electroencephalography (EEG)-based brain-computer interfaces (BCIs), especially for motor imagery (MI) classification in the past five years. The mainstream DL methodology for the MI-EEG classification exploits the temporospatial patterns of EEG signals using convolutional neural networks (CNNs), which have been particularly successful in visual images. However, since the statistical characteristics of visual images depart radically from EEG signals, a natural question arises whether an alternative network architecture exists apart from CNNs. To address this question, we propose a novel geometric DL (GDL) framework called Tensor-CSPNet, which characterizes spatial covariance matrices derived from EEG signals on symmetric positive definite (SPD) manifolds and fully captures the temporospatiofrequency patterns using existing deep neural networks on SPD manifolds, integrating with experiences from many successful MI-EEG classifiers to optimize the framework. In the experiments, Tensor-CSPNet attains or slightly outperforms the current state-of-the-art performance on the cross-validation and holdout scenarios in two commonly used MI-EEG datasets. Moreover, the visualization and interpretability analyses also exhibit the validity of Tensor-CSPNet for the MI-EEG classification. To conclude, in this study, we provide a feasible answer to the question by generalizing the DL methodologies on SPD manifolds, which indicates the start of a specific GDL methodology for the MI-EEG classification.
17
Wang F, Chen D, Yao W, Fu R. Real driving environment EEG-based detection of driving fatigue using the wavelet scattering network. J Neurosci Methods 2023; 400:109983. [PMID: 37838152 DOI: 10.1016/j.jneumeth.2023.109983] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/22/2023] [Revised: 09/29/2023] [Accepted: 10/11/2023] [Indexed: 10/16/2023]
Abstract
BACKGROUND: Driving fatigue is one of the main factors leading to traffic accidents. So, it is necessary to detect driver fatigue accurately and quickly.
NEW METHOD: To precisely detect driving fatigue in a real driving environment, this paper adopts a classification method for driving fatigue based on the wavelet scattering network (WSN). Firstly, electroencephalogram (EEG) signals of 12 subjects in the real driving environment are collected and categorized into two states: fatigue and awake. Secondly, the WSN algorithm extracts wavelet scattering coefficients of EEG signals, and these coefficients are used as input in support vector machine (SVM) as feature vectors for classification.
RESULTS: The results showed that the average classification accuracy of 12 subjects reached 99.33%; the average precision rate reached 99.28%; the average recall rate reached 98.27%; the average F1 score reached 98.74%; and the average classification accuracy of the public data set SEED-VIG reached 99.39%. The average precision, recall rate and F1 score reached 99.27%, 98.41% and 98.83% respectively.
COMPARISON WITH EXISTING METHODS: In addition, the WSN algorithm is compared with traditional convolutional neural network (CNN), Sparse-deep belief networks (SDBN), Spatio-temporal convolutional neural networks (STCNN), Long short-term memory (LSTM), and other methods, and it is found that WSN has higher classification accuracy.
CONCLUSION: Furthermore, this method has good versatility, providing excellent recognition effect on small sample data sets, and fast running time, making it convenient for real-time online monitoring of driver fatigue. Therefore, the WSN algorithm is promising in efficiently detecting driving fatigue state of drivers in real environments, contributing to improved traffic safety.
Affiliation(s)
- Fuwang Wang
  - Northeast Electric Power University, School of Mechanic Engineering, Jilin 132012, China
- Daping Chen
  - Northeast Electric Power University, School of Mechanic Engineering, Jilin 132012, China
- Wanchao Yao
  - Northeast Electric Power University, School of Mechanic Engineering, Jilin 132012, China
- Rongrong Fu
  - Yanshan University, College of Electrical Engineering, Qinhuangdao 066004, China
18
Pham TD, Sun X. Wavelet scattering networks in deep learning for discovering protein markers in a cohort of Swedish rectal cancer patients. Cancer Med 2023; 12:21502-21518. [PMID: 38014709 PMCID: PMC10726782 DOI: 10.1002/cam4.6672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 09/25/2023] [Accepted: 10/20/2023] [Indexed: 11/29/2023] Open
Abstract
BACKGROUND: Cancer biomarkers play a pivotal role in the diagnosis, prognosis, and treatment response prediction of the disease. In this study, we analyzed the expression levels of RhoB and DNp73 proteins in rectal cancer, as captured in immunohistochemical images, to predict the 5-year survival time of two patient groups: one with preoperative radiotherapy and one without.
METHODS: The utilization of deep convolutional neural networks in medical research, particularly in clinical cancer studies, has been gaining substantial attention. This success primarily stems from their ability to extract intricate image features that prove invaluable in machine learning. Another innovative method for extracting features at multiple levels is the wavelet-scattering network. Our study combines the strengths of these two convolution-based approaches to robustly extract image features related to protein expression.
RESULTS: The efficacy of our approach was evaluated across various tissue types, including tumor, biopsy, metastasis, and adjacent normal tissue. Statistical assessments demonstrated exceptional performance across a range of metrics, including prediction accuracy, classification accuracy, precision, and the area under the receiver operating characteristic curve.
CONCLUSION: These results underscore the potential of dual convolutional learning to assist clinical researchers in the timely validation and discovery of cancer biomarkers.
Affiliation(s)
- Tuan D. Pham
  - Barts and The London School of Medicine and Dentistry, Queen Mary University of London, Turner Street, London, UK
- Xiao-Feng Sun
  - Division of Oncology, Department of Biomedical and Clinical Sciences, Linkoping University, Linkoping, Sweden
19
Yang GY, Li XL, Xiao ZK, Mu TJ, Martin RR, Hu SM. Sampling Equivariant Self-Attention Networks for Object Detection in Aerial Images. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society 2023; 32:6413-6425. [PMID: 37906473 DOI: 10.1109/tip.2023.3327586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Objects in aerial images show greater variations in scale and orientation than in other images, making them harder to detect using vanilla deep convolutional neural networks. Networks with sampling equivariance can adapt sampling from input feature maps to object transformation, allowing a convolutional kernel to extract effective object features under different transformations. However, methods such as deformable convolutional networks can only provide sampling equivariance under certain circumstances, as they sample by location. We propose sampling equivariant self-attention networks, which treat self-attention restricted to a local image patch as convolution sampling by masks instead of locations, and a transformation embedding module to improve the equivariant sampling further. We further propose a novel randomized normalization module to enhance network generalization and a quantitative evaluation metric to fairly evaluate the ability of sampling equivariance of different models. Experiments show that our model provides significantly better sampling equivariance than existing methods without additional supervision and can thus extract more effective image features. Our model achieves state-of-the-art results on the DOTA-v1.0, DOTA-v1.5, and HRSC2016 datasets without additional computations or parameters.
20
Ang KM, Lim WH, Tiang SS, Sharma A, Eid MM, Tawfeek SM, Khafaga DS, Alharbi AH, Abdelhamid AA. Optimizing Image Classification: Automated Deep Learning Architecture Crafting with Network and Learning Hyperparameter Tuning. Biomimetics (Basel) 2023; 8:525. [PMID: 37999166 PMCID: PMC10669013 DOI: 10.3390/biomimetics8070525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 11/01/2023] [Accepted: 11/02/2023] [Indexed: 11/25/2023] Open
Abstract
This study introduces ETLBOCBL-CNN, an automated approach for optimizing convolutional neural network (CNN) architectures to address classification tasks of varying complexities. ETLBOCBL-CNN employs an effective encoding scheme to optimize network and learning hyperparameters, enabling the discovery of innovative CNN structures. To enhance the search process, it incorporates a competency-based learning concept inspired by mixed-ability classrooms during the teacher phase. This categorizes learners into competency-based groups, guiding each learner's search process by utilizing the knowledge of the predominant peers, the teacher solution, and the population mean. This approach fosters diversity within the population and promotes the discovery of innovative network architectures. During the learner phase, ETLBOCBL-CNN integrates a stochastic peer interaction scheme that encourages collaborative learning among learners, enhancing the optimization of CNN architectures. To preserve valuable network information and promote long-term population quality improvement, ETLBOCBL-CNN introduces a tri-criterion selection scheme that considers fitness, diversity, and learners' improvement rates. The performance of ETLBOCBL-CNN is evaluated on nine different image datasets and compared to state-of-the-art methods. Notably, ETLBOCBL-CNN achieves outstanding accuracies on various datasets, including MNIST (99.72%), MNIST-RD (96.67%), MNIST-RB (98.28%), MNIST-BI (97.22%), MNIST-RD + BI (83.45%), Rectangles (99.99%), Rectangles-I (97.41%), Convex (98.35%), and MNIST-Fashion (93.70%). These results highlight the remarkable classification accuracy of ETLBOCBL-CNN, underscoring its potential for advancing smart device infrastructure development.
Affiliation(s)
- Koon Meng Ang
  - Faculty of Engineering, Technology and Built Environment, UCSI University, Kuala Lumpur 56000, Malaysia
- Wei Hong Lim
  - Faculty of Engineering, Technology and Built Environment, UCSI University, Kuala Lumpur 56000, Malaysia
- Sew Sun Tiang
  - Faculty of Engineering, Technology and Built Environment, UCSI University, Kuala Lumpur 56000, Malaysia
- Abhishek Sharma
  - Department of Computer Science and Engineering, Graphic Era Deemed to be University, Dehradun 248002, India
- Marwa M. Eid
  - Delta Higher Institute for Engineering and Technology, Mansoura 35511, Egypt
  - Faculty of Artificial Intelligence, Delta University for Science and Technology, Mansoura 35111, Egypt
- Sayed M. Tawfeek
  - Delta Higher Institute for Engineering and Technology, Mansoura 35511, Egypt
- Doaa Sami Khafaga
  - Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Amal H. Alharbi
  - Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Abdelaziz A. Abdelhamid
  - Department of Computer Science, Faculty of Computer and Information Sciences, Ain Shams University, Cairo 11566, Egypt
  - Department of Computer Science, College of Computing and Information Technology, Shaqra University, Shaqra 11961, Saudi Arabia
21
Baharlouei Z, Rabbani H, Plonka G. Wavelet scattering transform application in classification of retinal abnormalities using OCT images. Sci Rep 2023; 13:19013. [PMID: 37923770 PMCID: PMC10624695 DOI: 10.1038/s41598-023-46200-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/10/2022] [Accepted: 10/29/2023] [Indexed: 11/06/2023] Open
Abstract
To assist ophthalmologists in diagnosing retinal abnormalities, Computer Aided Diagnosis has played a significant role. In this paper, a particular Convolutional Neural Network based on Wavelet Scattering Transform (WST) is used to detect one to four retinal abnormalities from Optical Coherence Tomography (OCT) images. Predefined wavelet filters in this network decrease the computation complexity and processing time compared to deep learning methods. We use two layers of the WST network to obtain a direct and efficient model. WST generates a sparse representation of the images which is translation-invariant and stable concerning local deformations. Next, a Principal Component Analysis classifies the extracted features. We evaluate the model using four publicly available datasets to have a comprehensive comparison with the literature. The accuracies of classifying the OCT images of the OCTID dataset into two and five classes were [Formula: see text] and [Formula: see text], respectively. We achieved an accuracy of [Formula: see text] in detecting Diabetic Macular Edema from Normal ones using the TOPCON device-based dataset. Heidelberg and Duke datasets contain DME, Age-related Macular Degeneration, and Normal classes, in which we achieved accuracy of [Formula: see text] and [Formula: see text], respectively. A comparison of our results with the state-of-the-art models shows that our model outperforms these models for some assessments or achieves nearly the best results reported so far while having a much smaller computational complexity.
Affiliation(s)
- Zahra Baharlouei
  - Medical Image and Signal Processing Research Center, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Hossein Rabbani
  - Medical Image and Signal Processing Research Center, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Gerlind Plonka
  - Institute for Numerical and Applied Mathematics, Georg-August-University of Goettingen, Göttingen, Germany
22
Olivier A, Hoffmann C, Jousse-Joulin S, Mansour A, Bressollette L, Clement B. Machine and Deep Learning Approaches Applied to Classify Gougerot-Sjögren Syndrome and Jointly Segment Salivary Glands. Bioengineering (Basel) 2023; 10:1283. [PMID: 38002406 PMCID: PMC10668981 DOI: 10.3390/bioengineering10111283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 10/10/2023] [Accepted: 10/16/2023] [Indexed: 11/26/2023] Open
Abstract
To diagnose Gougerot-Sjögren syndrome (GSS), ultrasound imaging (US) is a promising tool for helping physicians and experts. Our project focuses on the automatic detection of the presence of GSS using US. Ultrasound imaging suffers from a weak signal-to-noise ratio. Therefore, any classification or segmentation task based on these images becomes a difficult challenge. To address these two tasks, we evaluate different approaches: a classification using a machine learning method along with feature extraction based on a set of measurements following the radiomics guidance and a deep-learning-based classification. We propose, therefore, an innovative method to enhance the training of a deep neural network in two phases: multiple supervision using joint classification and a segmentation implemented as pretraining. We highlight the fact that our learning methods provide segmentation results similar to those performed by human experts. We obtain proficient segmentation results for salivary glands and promising detection results for Gougerot-Sjögren syndrome; we observe maximal accuracy with the model trained in two phases. Our experimental results corroborate the fact that deep learning and radiomics combined with ultrasound imaging can be a promising tool for the above-mentioned problems.
Affiliation(s)
- Aurélien Olivier
  - ENSTA Bretagne, Lab-STICC UMR CNRS 6285, 29200 Brest, France
  - GETBO UMR 13-04 CHRU Cavale Blanche, 29200 Brest, France
- Ali Mansour
  - ENSTA Bretagne, Lab-STICC UMR CNRS 6285, 29200 Brest, France
- Benoit Clement
  - ENSTA Bretagne, Lab-STICC UMR CNRS 6285, 29200 Brest, France
  - CROSSING IRL CNRS 2010, Adelaide 5005, Australia
23
Chen Q, Wang L, Xing Z, Wang L, Hu X, Wang R, Zhu YM. Deep wavelet scattering orthogonal fusion network for glioma IDH mutation status prediction. Comput Biol Med 2023; 166:107493. [PMID: 37774558 DOI: 10.1016/j.compbiomed.2023.107493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Revised: 06/26/2023] [Accepted: 09/15/2023] [Indexed: 10/01/2023]
Abstract
Accurately predicting the isocitrate dehydrogenase (IDH) mutation status of gliomas is greatly significant for formulating appropriate treatment plans and evaluating the prognoses of gliomas. Although existing studies can accurately predict the IDH mutation status of gliomas based on multimodal magnetic resonance (MR) images and machine learning methods, most of these methods cannot fully explore multimodal information and effectively predict IDH status for datasets acquired from multiple centers. To address this issue, a novel wavelet scattering (WS)-based orthogonal fusion network (WSOFNet) was proposed in this work to predict the IDH mutation status of gliomas from multiple centers. First, transformation-invariant features were extracted from multimodal MR images with a WS network, and then the multimodal WS features were used instead of the original images as the inputs of WSOFNet and were fully fused through an adaptive multimodal feature fusion module (AMF2M) and an orthogonal projection module (OPM). Finally, the fused features were input into a fully connected classifier to predict IDH mutation status. In addition, to achieve improved prediction accuracy, four auxiliary losses were also used in the feature extraction modules. The comparison results showed that the prediction area under the curve (AUC) of WSOFNet on a single-center dataset was 0.9966 and that on a multicenter dataset was approximately 0.9655, which was at least 3.9% higher than that of state-of-the-art methods. Moreover, the ablation experimental results also proved that the adaptive multimodal feature fusion strategy based on orthogonal projection could effectively improve the prediction performance of the model, especially for an external validation dataset.
Affiliation(s)
- Qijian Chen
  - Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
- Lihui Wang
  - Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
- Zhiyang Xing
  - Department of Radiology, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, NHC Key Laboratory of Pulmonary Immune-related Diseases, Guizhou Provincial People's Hospital, Guiyang, 550002, China
- Li Wang
  - Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
- Xubin Hu
  - Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
- Rongpin Wang
  - Department of Radiology, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, NHC Key Laboratory of Pulmonary Immune-related Diseases, Guizhou Provincial People's Hospital, Guiyang, 550002, China
- Yue-Min Zhu
  - University Lyon, INSA Lyon, CNRS, Inserm, IRP Metislab CREATIS UMR5220, U1206, Lyon 69621, France
24
Gunler Pirim MA, Tora H, Oztoprak K, Butun İ. Two-Stage Feature Generator for Handwritten Digit Classification. Sensors (Basel) 2023; 23:8477. [PMID: 37896570 PMCID: PMC10610940 DOI: 10.3390/s23208477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 09/12/2023] [Accepted: 10/09/2023] [Indexed: 10/29/2023]
Abstract
In this paper, a novel feature generator framework is proposed for handwritten digit classification. The proposed framework includes a two-stage cascaded feature generator. The first stage is based on principal component analysis (PCA), which generates projected data on principal components as features. The second one is constructed by a partially trained neural network (PTNN), which uses projected data as inputs and generates hidden layer outputs as features. The features obtained from the PCA and PTNN-based feature generator are tested on the MNIST and USPS datasets designed for handwritten digit sets. Minimum distance classifier (MDC) and support vector machine (SVM) methods are exploited as classifiers for the obtained features in association with this framework. The performance evaluation results show that the proposed framework outperforms the state-of-the-art techniques and achieves accuracies of 99.9815% and 99.9863% on the MNIST and USPS datasets, respectively. The results also show that the proposed framework achieves almost perfect accuracies, even with significantly small training data sizes.
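As a rough illustration of the first stage described in this abstract, the sketch below projects data onto principal components and classifies the projections with a minimum distance (nearest-centroid) classifier. It is not the authors' implementation: the PTNN stage is omitted, the toy data stand in for MNIST/USPS, and all function names (`fit_pca`, `project`, `fit_mdc`, `predict_mdc`) are hypothetical.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the data mean and the top principal directions (rows)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Project samples onto the principal components (the PCA features)."""
    return (X - mean) @ components.T

def fit_mdc(features, labels):
    """Minimum distance classifier: store one centroid per class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_mdc(features, classes, centroids):
    """Assign each sample to the class with the nearest centroid."""
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Toy data (an assumption for illustration): two well-separated clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 5)), rng.normal(3.0, 0.1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)

mean, comps = fit_pca(X, n_components=2)
Z = project(X, mean, comps)
classes, centroids = fit_mdc(Z, y)
pred = predict_mdc(Z, classes, centroids)
```

In the framework the abstract describes, the projected features `Z` would be passed on to the second-stage network rather than classified directly; the direct classification here only shows how PCA features and an MDC fit together.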
Affiliation(s)
- Hakan Tora
  - Department of Avionics, Atilim University, 06830 Ankara, Turkey
- Kasim Oztoprak
  - Department of Computer Engineering, Konya Food and Agriculture University, 42080 Konya, Turkey
- İsmail Butun
  - Department of Computer Engineering, KTH Royal Institute of Technology, SE-114 28 Stockholm, Sweden
  - Department of Computer Engineering, OSTIM Technical University, 06370 Ankara, Turkey

25
Barmpas K, Panagakis Y, Adamos DA, Laskaris N, Zafeiriou S. BrainWave-Scattering Net: a lightweight network for EEG-based motor imagery recognition. J Neural Eng 2023; 20:056014. [PMID: 37678229 DOI: 10.1088/1741-2552/acf78a] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/31/2023] [Accepted: 09/07/2023] [Indexed: 09/09/2023]
Abstract
Objective. Brain-computer interfaces (BCIs) enable a direct communication of the brain with the external world, using one's neural activity, measured by electroencephalography (EEG) signals. In recent years, convolutional neural networks (CNNs) have been widely used to perform automatic feature extraction and classification in various EEG-based tasks. However, their undeniable benefits are counterbalanced by the lack of interpretability properties as well as the inability to perform sufficiently when only a limited amount of training data is available. Approach. In this work, we introduce a novel, lightweight, fully-learnable neural network architecture that relies on Gabor filters to delocalize EEG signal information into scattering decomposition paths along frequency and slow-varying temporal modulations. Main results. We utilize our network in two distinct modeling settings, for building either a generic (training across subjects) or a personalized (training within a subject) classifier. Significance. In both cases, using two different publicly available datasets and one in-house collected dataset, we demonstrate high performance for our model with considerably fewer trainable parameters as well as shorter training time compared to other state-of-the-art deep architectures. Moreover, our network demonstrates enhanced interpretability properties emerging at the level of the temporal filtering operation and enables us to train efficient personalized BCI models with a limited amount of training data.
Affiliation(s)
- Konstantinos Barmpas
  - Department of Computing, Imperial College London, London SW7 2RH, United Kingdom
  - Cogitat Ltd, London, United Kingdom
- Yannis Panagakis
  - Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens 15784, Greece
  - Cogitat Ltd, London, United Kingdom
- Dimitrios A Adamos
  - Department of Computing, Imperial College London, London SW7 2RH, United Kingdom
  - Cogitat Ltd, London, United Kingdom
- Nikolaos Laskaris
  - School of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece
  - Cogitat Ltd, London, United Kingdom
- Stefanos Zafeiriou
  - Department of Computing, Imperial College London, London SW7 2RH, United Kingdom
  - Cogitat Ltd, London, United Kingdom

26
Yu F, Li H, Shi Y, Tang G, Chen Z, Jiang M. FFENet: frequency-spatial feature enhancement network for clothing classification. PeerJ Comput Sci 2023; 9:e1555. [PMID: 37810358 PMCID: PMC10557477 DOI: 10.7717/peerj-cs.1555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Accepted: 08/08/2023] [Indexed: 10/10/2023]
Abstract
Clothing analysis has garnered significant attention, and within this field, clothing classification plays a vital role as one of the fundamental technologies. Due to the inherent complexity of clothing scenes in real-world environments, the learning of clothing features in such complex scenes often encounters interference. Because clothing classification relies on the contour and texture information of clothing, clothing classification in real scenes may lead to poor classification results. Therefore, this paper proposes a clothing classification network based on frequency-spatial domain conversion. The proposed network combines frequency domain information with spatial information and does not compress channels. It aims to enhance the extraction of clothing features and improve the accuracy of clothing classification. In our work, (1) we combine the frequency domain information and spatial information to establish a clothing feature extraction clothing classification network without compressed feature map channels, (2) we use the frequency domain feature enhancement module to realize the preliminary extraction of clothing features, and (3) we introduce a clothing dataset in complex scenes (Clothing-8). Our network achieves a top-1 model accuracy of 93.4% on the Clothing-8 dataset and 94.62% on the Fashion-MNIST dataset. Additionally, it also achieves the best results in terms of top-3 and top-5 metrics on the DeepFashion dataset.
Affiliation(s)
- Feng Yu
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, Jiangxia District, China
  - Engineering Research Center of Hubei Province for Clothing Information, Wuhan, Jiangxia District, China
- Huiyin Li
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, Jiangxia District, China
- Yankang Shi
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, Jiangxia District, China
- Guangyu Tang
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, Jiangxia District, China
- Zhaoxiang Chen
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, Jiangxia District, China
- Minghua Jiang
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, Jiangxia District, China
  - Engineering Research Center of Hubei Province for Clothing Information, Wuhan, Jiangxia District, China

27
Sharma M, Verma S, Anand D, Gadre VM, Acharya UR. CAPSCNet: A novel scattering network for automated identification of phasic cyclic alternating patterns of human sleep using multivariate EEG signals. Comput Biol Med 2023; 164:107259. [PMID: 37544251 DOI: 10.1016/j.compbiomed.2023.107259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2023] [Revised: 07/03/2023] [Accepted: 07/04/2023] [Indexed: 08/08/2023]
Abstract
The Cyclic Alternating Pattern (CAP) can be considered a physiological marker of sleep instability. The CAP can be used to examine various sleep-related disorders. Certain short events (A and B phases) manifest related to a specific physiological process or pathology during non-rapid eye movement (NREM) sleep. These phases unexpectedly modify EEG oscillations; hence, manual detection is challenging. Therefore, it is highly desirable to have an automated system for detecting the A-phases (AP). Deep convolutional neural networks (CNNs) have shown high performance in various healthcare applications. A variant of the deep neural network called the Wavelet Scattering Network (WSN) has been used to overcome specific limitations of CNNs, such as the need for a large amount of data to train the model. WSN is an optimized network that can learn features that help discriminate patterns hidden inside signals. Also, WSNs are invariant to local perturbations, making the network significantly more reliable and effective. It can also help improve performance on tasks where data is minimal. In this study, we proposed a novel WSN-based CAPSCNet to automatically detect AP using EEG signals. Seven dataset variants of the cyclic alternating pattern (CAP) sleep cohort are employed for this study. Two electroencephalogram (EEG) derivations, namely C4-A1 and F4-C4, are used to develop the CAPSCNet. The model is examined using healthy subjects and patients with six different sleep disorders, namely sleep-disordered breathing (SDB), insomnia, nocturnal frontal lobe epilepsy (NFLE), narcolepsy, periodic leg movement disorder (PLM) and rapid eye movement behavior disorder (RBD). Several different machine-learning algorithms were used to classify the features obtained from the WSN. The proposed CAPSCNet has achieved the highest average classification accuracy of 83.4% using a trilayered neural network classifier for the healthy data variant. The proposed CAPSCNet is efficient and computationally faster.
Affiliation(s)
- Manish Sharma
  - Department of Electrical and Computer Science Engineering, Institute of Infrastructure, Technology, Research and Management (IITRAM), Ahmedabad, India
- Sarv Verma
  - Department of Electrical and Computer Science Engineering, Institute of Infrastructure, Technology, Research and Management (IITRAM), Ahmedabad, India
- Divyansh Anand
  - Department of Electrical and Computer Science Engineering, Institute of Infrastructure, Technology, Research and Management (IITRAM), Ahmedabad, India
- Vikram M Gadre
  - Department of Electrical Engineering, Indian Institute of Technology, Bombay, Mumbai, India
- U Rajendra Acharya
  - School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield 4300, Australia

28
Siviero I, Menegaz G, Storti SF. Functional Connectivity and Feature Fusion Enhance Multiclass Motor-Imagery Brain-Computer Interface Performance. Sensors (Basel) 2023; 23:7520. [PMID: 37687976 PMCID: PMC10490741 DOI: 10.3390/s23177520] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/31/2023] [Revised: 08/24/2023] [Accepted: 08/27/2023] [Indexed: 09/10/2023]
Abstract
(1) Background: in the field of motor-imagery brain-computer interfaces (MI-BCIs), obtaining discriminative features among multiple MI tasks poses a significant challenge. Typically, features are extracted from single electroencephalography (EEG) channels, neglecting their interconnections, which leads to limited results. To address this limitation, there has been growing interest in leveraging functional brain connectivity (FC) as a feature in MI-BCIs. However, the high inter- and intra-subject variability has so far limited its effectiveness in this domain. (2) Methods: we propose a novel signal processing framework that addresses this challenge. We extracted translation-invariant features (TIFs) obtained from a scattering convolution network (SCN) and brain connectivity features (BCFs). Through a feature fusion approach, we combined features extracted from selected channels and functional connectivity features, capitalizing on the strength of each component. Moreover, we employed a multiclass support vector machine (SVM) model to classify the extracted features. (3) Results: using a public dataset (IIa of the BCI Competition IV), we demonstrated that the feature fusion approach outperformed existing state-of-the-art methods. Notably, we found that the best results were achieved by merging TIFs with BCFs, rather than considering TIFs alone. (4) Conclusions: our proposed framework could be the key for improving the performance of a multiclass MI-BCI system.
Affiliation(s)
- Ilaria Siviero
  - Department of Computer Science, University of Verona, Strada Le Grazie 15, 37134 Verona, Italy
- Gloria Menegaz
  - Department of Engineering for Innovation Medicine, University of Verona, Strada Le Grazie 15, 37134 Verona, Italy
- Silvia Francesca Storti
  - Department of Engineering for Innovation Medicine, University of Verona, Strada Le Grazie 15, 37134 Verona, Italy

29
Kuo W, Rossinelli D, Schulz G, Wenger RH, Hieber S, Müller B, Kurtcuoglu V. Terabyte-scale supervised 3D training and benchmarking dataset of the mouse kidney. Sci Data 2023; 10:510. [PMID: 37537174 PMCID: PMC10400611 DOI: 10.1038/s41597-023-02407-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Accepted: 07/24/2023] [Indexed: 08/05/2023] Open
Abstract
The performance of machine learning algorithms, when used for segmenting 3D biomedical images, does not reach the level expected based on results achieved with 2D photos. This may be explained by the comparative lack of high-volume, high-quality training datasets, which require state-of-the-art imaging facilities, domain experts for annotation and large computational and personal resources. The HR-Kidney dataset presented in this work bridges this gap by providing 1.7 TB of artefact-corrected synchrotron radiation-based X-ray phase-contrast microtomography images of whole mouse kidneys and validated segmentations of 33 729 glomeruli, which corresponds to a one to two orders of magnitude increase over currently available biomedical datasets. The image sets also contain the underlying raw data, threshold- and morphology-based semi-automatic segmentations of renal vasculature and uriniferous tubules, as well as true 3D manual annotations. We therewith provide a broad basis for the scientific community to build upon and expand in the fields of image processing, data augmentation and machine learning, in particular unsupervised and semi-supervised learning investigations, as well as transfer learning and generative adversarial networks.
Affiliation(s)
- Willy Kuo
  - Institute of Physiology, University of Zurich, Zurich, Switzerland
  - National Centre of Competence in Research, Kidney.CH, Zurich, Switzerland
- Diego Rossinelli
  - Institute of Physiology, University of Zurich, Zurich, Switzerland
  - National Centre of Competence in Research, Kidney.CH, Zurich, Switzerland
- Georg Schulz
  - Biomaterials Science Center, Department of Biomedical Engineering, University of Basel, Allschwil, Switzerland
- Roland H Wenger
  - Institute of Physiology, University of Zurich, Zurich, Switzerland
  - National Centre of Competence in Research, Kidney.CH, Zurich, Switzerland
- Simone Hieber
  - Biomaterials Science Center, Department of Biomedical Engineering, University of Basel, Allschwil, Switzerland
- Bert Müller
  - Biomaterials Science Center, Department of Biomedical Engineering, University of Basel, Allschwil, Switzerland
- Vartan Kurtcuoglu
  - Institute of Physiology, University of Zurich, Zurich, Switzerland
  - National Centre of Competence in Research, Kidney.CH, Zurich, Switzerland

30
Najaran MHT. A genetic programming-based convolutional deep learning algorithm for identifying COVID-19 cases via X-ray images. Artif Intell Med 2023; 142:102571. [PMID: 37316095 DOI: 10.1016/j.artmed.2023.102571] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/15/2022] [Revised: 03/07/2023] [Accepted: 04/27/2023] [Indexed: 06/16/2023]
Abstract
Evolutionary algorithms have been successfully employed to find the best structure for many learning algorithms including neural networks. Due to their flexibility and promising results, Convolutional Neural Networks (CNNs) have been applied to many image processing tasks. The structure of CNNs greatly affects the performance of these algorithms both in terms of accuracy and computational cost; thus, finding the best architecture for these networks is a crucial task before they are employed. In this paper, we develop a genetic programming approach for the optimization of CNN structure in diagnosing COVID-19 cases via X-ray images. A graph representation for CNN architecture is proposed and evolutionary operators including crossover and mutation are specifically designed for the proposed representation. The proposed architecture of CNNs is defined by two sets of parameters: one is the skeleton, which determines the arrangement of the convolutional and pooling operators and their connections, and the other is the numerical parameters of the operators, which determine properties of these operators such as filter size and kernel size. The proposed algorithm in this paper optimizes the skeleton and the numerical parameters of the CNN architectures in a co-evolutionary scheme. The proposed algorithm is used to identify COVID-19 cases via X-ray images.

31
Yang KB, Lee J, Yang J. Multi-class semantic segmentation of breast tissues from MRI images using U-Net based on Haar wavelet pooling. Sci Rep 2023; 13:11704. [PMID: 37474633 PMCID: PMC10359288 DOI: 10.1038/s41598-023-38557-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Accepted: 07/11/2023] [Indexed: 07/22/2023] Open
Abstract
MRI images used in breast cancer diagnosis are taken in a lying position and therefore are inappropriate for reconstructing the natural breast shape in a standing position. Some studies have proposed methods to present the breast shape in a standing position using an ordinary differential equation of the finite element method. However, it is difficult to obtain meaningful results because breast tissues have different elastic moduli. This study proposed a multi-class semantic segmentation method for breast tissues to reconstruct breast shapes using U-Net based on Haar wavelet pooling. First, a dataset was constructed by labeling the skin, fat, and fibro-glandular tissues and the background from MRI images taken in a lying position. Next, multi-class semantic segmentation was performed using U-Net based on Haar wavelet pooling to improve the segmentation accuracy for breast tissues. The U-Net effectively extracted breast tissue features while reducing image information loss in a subsampling stage using multiple sub-bands. In addition, the proposed network is robust to overfitting. The proposed network showed a mIOU of 87.48 for segmenting breast tissues. The proposed networks demonstrated high-accuracy segmentation for breast tissue with different elastic moduli to reconstruct the natural breast shape.
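To make the idea of pooling "using multiple sub-bands" concrete, here is a minimal sketch of a one-level 2D Haar decomposition used as a downsampling step. It is an illustration under assumptions, not the network from the paper: the function names `haar_pool` and `haar_unpool` are hypothetical, the input is a single 2D array with even height and width, and the sub-band naming (which detail band is called LH versus HL) follows one common convention that may differ from the authors'.

```python
import numpy as np

def haar_pool(x):
    """Split a 2D array (even height and width) into four half-resolution
    orthonormal Haar sub-bands: one approximation and three detail bands."""
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # approximation (scaled block average)
    lh = (a + b - c - d) / 2.0  # detail: difference between rows
    hl = (a - b + c - d) / 2.0  # detail: difference between columns
    hh = (a - b - c + d) / 2.0  # detail: diagonal difference
    return ll, lh, hl, hh

def haar_unpool(ll, lh, hl, hh):
    """Invert haar_pool exactly, showing that no information was discarded."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

image = np.arange(16.0).reshape(4, 4)
ll, lh, hl, hh = haar_pool(image)
restored = haar_unpool(ll, lh, hl, hh)
```

Unlike max or average pooling, which keep one value per 2x2 block, the four sub-bands together retain every value of the input, which is the sense in which such pooling can reduce information loss during subsampling.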
Affiliation(s)
- Kwang Bin Yang
  - Division of Memory - Memory FAB Team 1, Samsung Electronics, 1 Samsungjeonja-ro, Hwaseong, Gyeonggi, 18448, Republic of Korea
- Jinwon Lee
  - Department of Industrial and Management Engineering, Gangneung-Wonju National University, 150 Namwon-ro, Wonju, Gangwon, 26403, Republic of Korea
- Jeongsam Yang
  - Department of Industrial Engineering, Ajou University, 206 Worldcup-ro, Suwon, Gyeonggi, 16499, Republic of Korea

32
Odinaev I, Wong KL, Chin JW, Goyal R, Chan TT, So RHY. Robust Heart Rate Variability Measurement from Facial Videos. Bioengineering (Basel) 2023; 10:851. [PMID: 37508878 PMCID: PMC10376629 DOI: 10.3390/bioengineering10070851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 06/30/2023] [Accepted: 07/14/2023] [Indexed: 07/30/2023] Open
Abstract
Remote Photoplethysmography (rPPG) is a contactless method that enables the detection of various physiological signals from facial videos. rPPG utilizes a digital camera to detect subtle changes in skin color to measure vital signs such as heart rate variability (HRV), an important biomarker related to the autonomic nervous system. This paper presents a novel contactless HRV extraction algorithm, WaveHRV, based on the Wavelet Scattering Transform technique, followed by adaptive bandpass filtering and inter-beat-interval (IBI) analysis. Furthermore, a novel method is introduced to preprocess noisy contact-based PPG signals. WaveHRV is benchmarked against existing algorithms and public datasets. Our results show that WaveHRV is promising and achieves the lowest mean absolute error (MAE) of 10.5 ms and 6.15 ms for RMSSD and SDNN on the UBFC-rPPG dataset.
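For readers unfamiliar with the two HRV metrics reported in this abstract, the sketch below computes them from a series of inter-beat intervals using their standard definitions: SDNN is the standard deviation of the intervals and RMSSD is the root mean square of successive differences. It does not reproduce the WaveHRV pipeline; the interval values are invented for illustration, and the use of the sample standard deviation (`ddof=1`) is an assumption, since conventions vary.

```python
import numpy as np

def sdnn(ibi_ms):
    """Standard deviation of inter-beat intervals, in milliseconds.
    Uses the sample standard deviation (ddof=1); this is an assumption."""
    return float(np.std(ibi_ms, ddof=1))

def rmssd(ibi_ms):
    """Root mean square of successive interval differences, in milliseconds."""
    diffs = np.diff(ibi_ms)
    return float(np.sqrt(np.mean(diffs ** 2)))

# Hypothetical inter-beat intervals in milliseconds (not data from the paper).
ibi = np.array([800.0, 810.0, 790.0, 805.0, 795.0])
sdnn_value = sdnn(ibi)
rmssd_value = rmssd(ibi)
```

A mean absolute error such as the one quoted above would then be the average of the absolute differences between these values computed from the video-derived intervals and from a reference sensor, taken over recordings.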
Affiliation(s)
- Kwan Long Wong
  - PanopticAI Ltd., Hong Kong, China
  - Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China
- Raghav Goyal
  - PanopticAI Ltd., Hong Kong, China
  - Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China
- Richard H Y So
  - PanopticAI Ltd., Hong Kong, China
  - Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China

33
Salehi A, Roberts A, Phinyomark A, Scheme E. Feature Learning Networks for Floor Sensor-based Gait Recognition. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-5. [PMID: 38083158 DOI: 10.1109/embc40787.2023.10340596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023]
Abstract
Deep learning (DL) has become a powerful tool in many image classification applications but often requires large training sets to achieve high accuracy. For applications where the available data are limited, this can become a severely limiting factor in model performance. To address this limitation, feature learning network approaches that integrate traditional feature extraction methods with DL frameworks have been proposed. In this study, the performances of traditional methods: discrete wavelet transform (DWT), discrete cosine transform (DCT), independent component analysis (ICA), and principal component analysis (PCA); and their corresponding feature networks based on a convolutional neural network (CNN) framework: ScatNet (wavelet scattering network), DCTNet, ICANet, and PCANet, were investigated for use in pressure-based footstep recognition for person authentication when only a limited sample size is available. The results show that the feature learning networks (90.6% accuracy) achieved significantly better performance on average than the conventional feature extraction methods (79.7% accuracy) (p < 0.05). Among the different feature networks, PCANet provided the best verification performance, with an accuracy of 92.2%. Feature learning networks are simple and effective approaches that can be a promising solution for applications like floor-based gait recognition in a security access scenario (such as workspace environment and border control) when small amounts of data are available for training models to differentiate between a larger group of users.

34
Manta O, Sarafidis M, Schlee W, Mazurek B, Matsopoulos GK, Koutsouris DD. Development of Machine-Learning Models for Tinnitus-Related Distress Classification Using Wavelet-Transformed Auditory Evoked Potential Signals and Clinical Data. J Clin Med 2023; 12:jcm12113843. [PMID: 37298037 DOI: 10.3390/jcm12113843] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/05/2023] [Revised: 05/30/2023] [Accepted: 06/02/2023] [Indexed: 06/12/2023] Open
Abstract
Tinnitus is a highly prevalent condition, affecting more than 1 in 7 adults in the EU and causing negative effects on sufferers' quality of life. In this study, we utilised data collected within the "UNITI" project, the largest EU tinnitus-related research programme. Initially, we extracted characteristics from both auditory brainstem response (ABR) and auditory middle latency response (AMLR) signals, which were derived from tinnitus patients. We then combined these features with the patients' clinical data, and integrated them to build machine learning models for the classification of individuals and their ears according to their level of tinnitus-related distress. Several models were developed and tested on different datasets to determine the most relevant features and achieve high performance. Specifically, seven widely used classifiers were utilised on all generated datasets: random forest (RF), linear, radial, and polynomial support vector machines (SVM), naive Bayes (NB), neural networks (NN), and linear discriminant analysis (LDA). Results showed that features extracted from the wavelet-scattering transformed AMLR signals were the most informative data. In combination with the 15 LASSO-selected clinical features, the SVM classifier achieved optimal performance with an AUC value, sensitivity, and specificity of 92.53%, 84.84%, and 83.04%, respectively, indicating high discrimination performance between the two groups.
Affiliation(s)
- Ourania Manta
  - Biomedical Engineering Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Athens, Greece
- Michail Sarafidis
  - Biomedical Engineering Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Athens, Greece
- Winfried Schlee
  - Department of Psychiatry and Psychotherapy, University of Regensburg, 93053 Regensburg, Germany
  - Institute for Information and Process Management, Eastern Switzerland University of Applied Sciences, 9001 St. Gallen, Switzerland
- Birgit Mazurek
  - Tinnitus Center, Charité-Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, 10117 Berlin, Germany
- George K Matsopoulos
  - Biomedical Engineering Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Athens, Greece
- Dimitrios D Koutsouris
  - Biomedical Engineering Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Athens, Greece

35
Knapp PF, Lewis WE. Advanced data analysis in inertial confinement fusion and high energy density physics. Rev Sci Instrum 2023; 94:061103. [PMID: 37862494 DOI: 10.1063/5.0128661] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/29/2022] [Accepted: 05/17/2023] [Indexed: 10/22/2023]
Abstract
Bayesian analysis enables flexible and rigorous definition of statistical model assumptions with well-characterized propagation of uncertainties and resulting inferences for single-shot, repeated, or even cross-platform data. This approach has a strong history of application to a variety of problems in physical sciences ranging from inference of particle mass from multi-source high-energy particle data to analysis of black-hole characteristics from gravitational wave observations. The recent adoption of Bayesian statistics for analysis and design of high-energy density physics (HEDP) and inertial confinement fusion (ICF) experiments has provided invaluable gains in expert understanding and experiment performance. In this Review, we discuss the basic theory and practical application of the Bayesian statistics framework. We highlight a variety of studies from the HEDP and ICF literature, demonstrating the power of this technique. Due to the computational complexity of multi-physics models needed to analyze HEDP and ICF experiments, Bayesian inference is often not computationally tractable. Two sections are devoted to a review of statistical approximations, efficient inference algorithms, and data-driven methods, such as deep-learning and dimensionality reduction, which play a significant role in enabling use of the Bayesian framework. We provide additional discussion of various applications of Bayesian and machine learning methods that appear to be sparse in the HEDP and ICF literature constituting possible next steps for the community. We conclude by highlighting community needs, the resolution of which will improve trust in data-driven methods that have proven critical for accelerating the design and discovery cycle in many application areas.
Affiliation(s)
- P F Knapp
  - Sandia National Laboratories, Albuquerque, New Mexico 87185, USA
- W E Lewis
  - Sandia National Laboratories, Albuquerque, New Mexico 87185, USA

36
Ram S, Tang W, Bell AJ, Pal R, Spencer C, Buschhaus A, Hatt CR, diMagliano MP, Rehemtulla A, Rodríguez JJ, Galban S, Galban CJ. Lung cancer lesion detection in histopathology images using graph-based sparse PCA network. Neoplasia 2023; 42:100911. [PMID: 37269818 DOI: 10.1016/j.neo.2023.100911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 05/17/2023] [Indexed: 06/05/2023]
Abstract
Early detection of lung cancer is critical for improvement of patient survival. To address the clinical need for efficacious treatments, genetically engineered mouse models (GEMM) have become integral in identifying and evaluating the molecular underpinnings of this complex disease that may be exploited as therapeutic targets. Assessment of GEMM tumor burden on histopathological sections performed by manual inspection is both time consuming and prone to subjective bias. Therefore, an interplay of needs and challenges exists for computer-aided diagnostic tools, for accurate and efficient analysis of these histopathology images. In this paper, we propose a simple machine learning approach called the graph-based sparse principal component analysis (GS-PCA) network, for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E). Our method comprises four steps: 1) cascaded graph-based sparse PCA, 2) PCA binary hashing, 3) block-wise histograms, and 4) support vector machine (SVM) classification. In our proposed architecture, graph-based sparse PCA is employed to learn the filter banks of the multiple stages of a convolutional network. This is followed by PCA hashing and block histograms for indexing and pooling. The meaningful features extracted from this GS-PCA are then fed to an SVM classifier. We evaluate the performance of the proposed algorithm on H&E slides obtained from an inducible K-rasG12D lung cancer mouse model using precision/recall rates, Fβ-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC) and show that our algorithm is efficient and provides improved detection accuracy compared to existing algorithms.
Affiliation(s)
- Sundaresh Ram: Departments of Radiology and Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
- Wenfei Tang: Department of Computer Science and Engineering, University of Michigan, Ann Arbor, MI 48109, USA
- Alexander J Bell: Departments of Radiology and Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
- Ravi Pal: Department of Radiology, University of Michigan, Ann Arbor, MI 48109, USA
- Cara Spencer: Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Charles R Hatt: Department of Radiology, University of Michigan, Ann Arbor, MI 48109, USA; Imbio LLC, Minneapolis, MN 55405, USA
- Marina Pasca diMagliano: Departments of Surgery and Cell and Developmental Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Alnawaz Rehemtulla: Departments of Radiology and Radiation Oncology, University of Michigan, Ann Arbor, MI 48109, USA
- Jeffrey J Rodríguez: Departments of Electrical and Computer Engineering and Biomedical Engineering, The University of Arizona, Tucson, AZ 85721, USA
- Stefanie Galban: Department of Radiology, University of Michigan, Ann Arbor, MI 48109, USA
- Craig J Galban: Departments of Radiology and Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
|
37
|
Guo Z, Liu Z, Barbastathis G, Zhang Q, Glinsky ME, Alpert BK, Levine ZH. Noise-resilient deep learning for integrated circuit tomography. Optics Express 2023; 31:15355-15371. [PMID: 37157639 DOI: 10.1364/oe.486213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
X-ray tomography is a non-destructive imaging technique that reveals the interior of an object from its projections at different angles. Under sparse-view and low-photon sampling, regularization priors are required to retrieve a high-fidelity reconstruction. Recently, deep learning has been used in X-ray tomography. The prior learned from training data replaces the general-purpose priors in iterative algorithms, achieving high-quality reconstructions with a neural network. Previous studies typically assume the noise statistics of test data are acquired a priori from training data, leaving the network susceptible to a change in the noise characteristics under practical imaging conditions. In this work, we propose a noise-resilient deep-reconstruction algorithm and apply it to integrated circuit tomography. By training the network with regularized reconstructions from a conventional algorithm, the learned prior shows strong noise resilience without the need for additional training with noisy examples, and allows us to obtain acceptable reconstructions with fewer photons in test data. The advantages of our framework may further enable low-photon tomographic imaging where long acquisition times limit the ability to acquire a large training set.
|
38
|
Park CF, Allys E, Villaescusa-Navarro F, Finkbeiner D. Quantification of High-dimensional Non-Gaussianities and Its Implication to Fisher Analysis in Cosmology. The Astrophysical Journal 2023; 946:107. [PMID: 37681217 PMCID: PMC10482003 DOI: 10.3847/1538-4357/acbe3b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 02/04/2023] [Accepted: 02/21/2023] [Indexed: 09/09/2023]
Abstract
It is well known that the power spectrum is not able to fully characterize the statistical properties of non-Gaussian density fields. Recently, many different statistics have been proposed to extract information from non-Gaussian cosmological fields that perform better than the power spectrum. The Fisher matrix formalism is commonly used to quantify the accuracy with which a given statistic can constrain the value of the cosmological parameters. However, these calculations typically rely on the assumption that the sampling distribution of the considered statistic follows a multivariate Gaussian distribution. In this work, we follow Sellentin & Heavens and use two different statistical tests to identify non-Gaussianities in different statistics such as the power spectrum, bispectrum, marked power spectrum, and wavelet scattering transform (WST). We remove the non-Gaussian components of the different statistics and perform Fisher matrix calculations with the Gaussianized statistics using Quijote simulations. We show that constraints on the parameters can change by a factor of ∼2 in some cases. We show with simple examples how statistics that do not follow a multivariate Gaussian distribution can achieve artificially tight bounds on the cosmological parameters when using the Fisher matrix formalism. We think that the non-Gaussian tests used in this work represent a powerful tool to quantify the robustness of Fisher matrix calculations and their underlying assumptions. We release the code used to compute the power spectra, bispectra, and WST that can be run on both CPUs and GPUs.
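For orientation, the Gaussian assumption mentioned above enters through the standard form of the Fisher matrix. The expression below is the textbook version for a summary statistic with mean vector μ(θ) and a parameter-independent covariance C; the notation is ours and is not taken from the paper.

```latex
% Fisher matrix for a Gaussian-distributed statistic with mean \mu(\theta)
% and covariance C assumed independent of the parameters \theta.
F_{ij} = \frac{\partial \boldsymbol{\mu}^{\mathsf{T}}}{\partial \theta_i}\,
         C^{-1}\,
         \frac{\partial \boldsymbol{\mu}}{\partial \theta_j},
\qquad
\sigma(\theta_i) \ge \sqrt{\left(F^{-1}\right)_{ii}}
```

The second relation is the Cramér-Rao bound on the marginalized error of each parameter. If the sampling distribution of the statistic is not Gaussian, this expression no longer describes the true information content, which is how artificially tight bounds can arise.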
Affiliation(s)
- Erwan Allys: Laboratoire de Physique de l'École Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris Cité, F-75005 Paris, France
- Francisco Villaescusa-Navarro: Center for Computational Astrophysics, Flatiron Institute, 162 5th Avenue, New York, NY 10010, USA; Department of Astrophysical Sciences, Princeton University, Peyton Hall, Princeton, NJ 08544, USA
|
39
|
Closed-set automatic speaker identification using multi-scale recurrent networks in non-native children. International Journal of Information Technology 2023; 15:1375-1385. [PMID: 37056796 PMCID: PMC10023307 DOI: 10.1007/s41870-023-01224-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Accepted: 03/04/2023] [Indexed: 03/20/2023]
Abstract
Children may benefit from automatic speaker identification in a variety of applications, including child security, safety, and education. The key focus of this study is to develop a closed-set child speaker identification system for non-native speakers of English in both text-dependent and text-independent speech tasks, in order to track how the speaker's fluency affects the system. The multi-scale wavelet scattering transform is used to compensate for concerns such as the loss of high-frequency information caused by the most widely used feature extractor, mel frequency cepstral coefficients. The proposed large-scale speaker identification system performs well by employing a wavelet-scattering Bi-LSTM. The system identifies non-native children across multiple classes, and its performance in the text-independent and text-dependent tasks is assessed using average accuracy, precision, recall, and F-measure; it outperforms the existing models.
|
40
|
Agboola HA, Zaccheus JE. Wavelet image scattering based glaucoma detection. BMC Biomed Eng 2023; 5:1. [PMID: 36864533 PMCID: PMC9979468 DOI: 10.1186/s42490-023-00067-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Accepted: 02/06/2023] [Indexed: 03/04/2023] Open
Abstract
BACKGROUND The ever-growing need for cheap, simple, fast, and accurate healthcare solutions has spurred a lot of research aimed at the reliable deployment of artificial intelligence in the medical fields. However, this has proved to be a daunting task, especially when looking to make automated diagnoses using biomedical image data. Biomedical image data have complex patterns which human experts find very hard to comprehend. Against this backdrop, we applied a representation or feature learning algorithm, the Invariant Scattering Convolution Network or Wavelet Scattering Network, to retinal fundus images and studied the efficacy of the automatically extracted features for glaucoma diagnosis/detection. The influence of wavelet scattering network parameter settings as well as 2-D channel image type on the detection correctness is also examined. Our work is a distinct departure from the usual method where a wavelet transform is applied to pre-processed retinal fundus images and handcrafted features are extracted from the decomposition results. Here, the RIM-ONE DL image dataset was fed into a wavelet scattering network developed in the Matlab environment to achieve a stage-wise decomposition process called wavelet scattering of the retinal fundus images, thereby automatically learning features from the images. These features were then used to build simple and computationally cheap classification algorithms. RESULTS Maximum detection correctness of 98% was achieved on the held-out test set. Detection correctness is highly sensitive to scattering network parameter setting and 2-D channel image type. CONCLUSION A superficial comparison of the classification results obtained from our work and those obtained using a convolutional neural network underscores the potential of the proposed method for glaucoma detection.
|
41
|
Maruyama H, Okada K, Motoyoshi I. A two-stage spectral model for sound texture perception: Synthesis and psychophysics. Iperception 2023; 14:20416695231157349. [PMID: 36845027 PMCID: PMC9950610 DOI: 10.1177/20416695231157349] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/08/2022] [Accepted: 01/30/2023] [Indexed: 02/25/2023] Open
Abstract
The natural environment is filled with a variety of auditory events such as wind blowing, water flowing, and fire crackling. It has been suggested that the perception of such textural sounds is based on the statistics of the natural auditory events. Inspired by a recent spectral model for visual texture perception, we propose a model that can describe the perceived sound texture with only the linear spectrum and the energy spectrum. We tested the validity of the model by using synthetic noise sounds that preserve the two-stage amplitude spectra of the original sound. A psychophysical experiment showed that our synthetic noises were perceived as similar to the original sounds for 120 real-world auditory events. The performance was comparable with that of synthetic sounds produced by McDermott-Simoncelli's model, which considers various classes of auditory statistics. The results support the notion that the perception of natural sound textures is predictable from the two-stage spectral signals.
Affiliation(s)
- Isamu Motoyoshi: Department of Life Sciences, The University of Tokyo, Japan
|
42
|
Sharaf AI. Sleep Apnea Detection Using Wavelet Scattering Transformation and Random Forest Classifier. Entropy (Basel, Switzerland) 2023; 25:399. [PMID: 36981288 PMCID: PMC10047098 DOI: 10.3390/e25030399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Revised: 02/08/2023] [Accepted: 02/17/2023] [Indexed: 06/18/2023]
Abstract
Obstructive Sleep Apnea (OSA) is a common sleep-breathing disorder that highly reduces the quality of human life. The most powerful method for the detection and classification of sleep apnea is the Polysomnogram. However, this method is time-consuming and cost-inefficient. Therefore, several methods focus on using electrocardiogram (ECG) signals to detect sleep apnea. This paper proposed a novel automated approach to detect and classify apneic events from single-lead ECG signals. Wavelet Scattering Transformation (WST) was applied to the ECG signals to decompose the signal into smaller segments. Then, a set of features, including higher-order statistics and entropy-based features, was extracted from the WST coefficients to formulate a search space. The obtained features were fed to a random forest classifier to classify the ECG segments. The experiment was validated using the 10-fold and hold-out cross-validation methods, which resulted in an accuracy of 91.65% and 90.35%, respectively. The findings were compared with different classifiers to show the significance of the proposed approach. The proposed approach achieved better performance measures than most of the existing methodologies.
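The feature-extraction step described above (higher-order statistics and entropy computed from transform coefficients) can be sketched as follows. The abstract does not list the exact features, so skewness, kurtosis, and a histogram-based Shannon entropy are assumed here as representative choices; the function name and the bin count are hypothetical.

```python
import math
from typing import Dict, Sequence


def hos_entropy_features(coeffs: Sequence[float], n_bins: int = 8) -> Dict[str, float]:
    """Skewness, kurtosis, and histogram Shannon entropy of a coefficient vector."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    # Central moments of order 2, 3, and 4 (population form, divisor n).
    m2 = sum((x - mean) ** 2 for x in coeffs) / n
    m3 = sum((x - mean) ** 3 for x in coeffs) / n
    m4 = sum((x - mean) ** 4 for x in coeffs) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0

    # Shannon entropy (in bits) of an equal-width histogram of the values.
    lo, hi = min(coeffs), max(coeffs)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant input
    counts = [0] * n_bins
    for x in coeffs:
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c)

    return {"skewness": skewness, "kurtosis": kurtosis, "entropy_bits": entropy}
```

In a pipeline like the one described, one such feature dictionary would be computed per ECG segment and the resulting vectors passed to the classifier.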
Affiliation(s)
- Ahmed I Sharaf: Deanship of Scientific Research, Umm Al-Qura University, Mecca 24382, Saudi Arabia
|
43
|
Im2Graph: A Weakly Supervised Approach for Generating Holistic Scene Graphs from Regional Dependencies. Future Internet 2023. [DOI: 10.3390/fi15020070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/12/2023] Open
Abstract
Conceptual representations of images involving descriptions of entities and their relations are often represented using scene graphs. Such scene graphs can express relational concepts by using sets of triplets ⟨subject—predicate—object⟩. Instead of building dedicated models for scene graph generation, our model tends to extract the latent relational information implicitly encoded in image captioning models. We explored dependency parsing to build grammatically sound parse trees from captions. We used detection algorithms for the region propositions to generate dense region-based concept graphs. These were optimally combined using the approximate sub-graph isomorphism to create holistic concept graphs for images. The major advantages of this approach are threefold. Firstly, the proposed graph generation module is completely rule-based and, hence, adheres to the principles of explainable artificial intelligence. Secondly, graph generation can be used as plug-and-play along with any region proposition and caption generation framework. Finally, our results showed that we could generate rich concept graphs without explicit graph-based supervision.
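As a small illustration of the triplet representation mentioned above, a scene graph can be held as a list of subject, predicate, object records and turned into an adjacency map. The class name, field names, and example entities below are hypothetical and only show the data structure, not the paper's generation method.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Triplet(NamedTuple):
    """One relational fact of a scene graph."""
    subject: str
    predicate: str
    obj: str


def build_graph(triplets: Iterable[Triplet]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each entity to its outgoing (predicate, object) edges."""
    graph: Dict[str, List[Tuple[str, str]]] = {}
    for t in triplets:
        graph.setdefault(t.subject, []).append((t.predicate, t.obj))
        graph.setdefault(t.obj, [])  # make sure leaf entities appear as nodes
    return graph
```

For example, `build_graph([Triplet("girl", "holds", "kite"), Triplet("kite", "in", "sky")])` yields a graph with three nodes, where `"sky"` has no outgoing edges.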
|
44
|
Bi Y, Xue B, Zhang M. Instance Selection-Based Surrogate-Assisted Genetic Programming for Feature Learning in Image Classification. IEEE Transactions on Cybernetics 2023; 53:1118-1132. [PMID: 34464287 DOI: 10.1109/tcyb.2021.3105696] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/13/2023]
Abstract
Genetic programming (GP) has been applied to feature learning for image classification and achieved promising results. However, many GP-based feature learning algorithms are computationally expensive due to a large number of expensive fitness evaluations, especially when using a large number of training instances/images. Instance selection aims to select a small subset of training instances, which can reduce the computational cost. Surrogate-assisted evolutionary algorithms often replace expensive fitness evaluations by building surrogate models. This article proposes an instance selection-based surrogate-assisted GP for fast feature learning in image classification. The instance selection method selects multiple small subsets of images from the original training set to form surrogate training sets of different sizes. The proposed approach gradually uses these surrogate training sets to reduce the overall computational cost using a static or dynamic strategy. At each generation, the proposed approach evaluates the entire population on the small surrogate training sets and only evaluates ten current best individuals on the entire training set. The features learned by the proposed approach are fed into linear support vector machines for classification. Extensive experiments show that the proposed approach can not only significantly reduce the computational cost but also improve the generalisation performance over the baseline method, which uses the entire training set for fitness evaluations, on 11 different image datasets. The comparisons with other state-of-the-art GP and non-GP methods further demonstrate the effectiveness of the proposed approach. Further analysis shows that using multiple surrogate training sets in the proposed approach achieves better performance than using a single surrogate training set and using a random instance selection method.
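The per-generation evaluation scheme described above (score everyone on a small surrogate set, then re-score only the current best on the full set) can be sketched generically. The function name, the `fitness` callable, and the toy threshold classifier in the usage note are assumptions for illustration; the paper's instance selection and GP operators are not reproduced here.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Ind = TypeVar("Ind")
Data = TypeVar("Data")


def surrogate_generation(
    population: Sequence[Ind],
    fitness: Callable[[Ind, Data], float],
    surrogate_set: Data,
    full_set: Data,
    top_k: int = 10,
) -> List[Tuple[Ind, float]]:
    """Rank all individuals cheaply, then re-evaluate the top_k exactly.

    Every individual is scored on the small surrogate set; only the
    top_k by surrogate fitness incur the expensive full-set evaluation.
    """
    ranked = sorted(
        population,
        key=lambda ind: fitness(ind, surrogate_set),
        reverse=True,
    )
    elite = ranked[:top_k]
    return [(ind, fitness(ind, full_set)) for ind in elite]
```

With a population of size P, this costs P cheap evaluations plus `top_k` expensive ones per generation, instead of P expensive ones. A toy use would treat each individual as a decision threshold and `fitness` as classification accuracy on a list of `(value, label)` pairs.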
|
45
|
Najaran MHT. An evolutionary ensemble learning for diagnosing COVID-19 via cough signals. Intelligent Medicine 2023; 3:S2667-1026(23)00002-5. [PMID: 36743333 PMCID: PMC9882956 DOI: 10.1016/j.imed.2023.01.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Revised: 01/10/2023] [Accepted: 01/11/2023] [Indexed: 01/30/2023]
Abstract
Objective The spread of the COVID-19 disease has caused great concern around the world, and detecting positive cases is crucial in curbing the pandemic. One of the symptoms of the disease is a dry cough. It has previously been shown that cough signals can be used to identify a variety of diseases, including tuberculosis and asthma. In this paper, we propose an algorithm to diagnose COVID-19 via cough signals. Methods The proposed algorithm is an ensemble scheme that consists of a number of base learners, where each base learner uses a different feature extraction method, including statistical approaches and convolutional neural networks (CNN) for automatic feature extraction. Features are extracted from the raw signal and from several transforms performed on it, including Fourier, wavelet, Hilbert-Huang, and short-time Fourier transforms. The outputs of these base learners are aggregated via a weighted voting scheme, with the weights optimised via an evolutionary paradigm. This paper also proposes a memetic algorithm for training the CNNs in the base learners, which combines the speed of gradient descent (GD) algorithms with the global search space coverage of evolutionary algorithms. Results Experiments were performed on the proposed algorithm and on rival algorithms, which included a number of CNN architectures from the literature and generic machine learning algorithms. The results suggested that the proposed algorithm achieves better performance than the existing algorithms in diagnosing COVID-19 via cough signals. Conclusion This research showed that COVID-19 could be diagnosed via cough signals, that CNNs could be employed to process these signals, and that performance may be further improved by optimising the CNN architecture.
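The aggregation step described above can be illustrated with a minimal weighted soft-voting sketch. The weights are passed in as fixed numbers here, whereas the paper optimises them with an evolutionary procedure that is not reproduced; the function name, the 0.5 decision threshold, and the use of per-learner probabilities are assumptions.

```python
from typing import Sequence, Tuple


def weighted_vote(
    probabilities: Sequence[float],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[float, bool]:
    """Combine base-learner positive-class probabilities by weighted average.

    Returns the aggregated score and the resulting binary decision.
    """
    if len(probabilities) != len(weights):
        raise ValueError("one weight is required per base learner")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(w * p for w, p in zip(weights, probabilities)) / total
    return score, score >= threshold
```

For instance, three learners reporting 0.9, 0.2, and 0.6 with weights 2, 1, and 1 give an aggregated score of about 0.65, so the ensemble decision is positive even though one learner disagrees.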
|
46
|
Chang X, Ren P, Xu P, Li Z, Chen X, Hauptmann A. A Comprehensive Survey of Scene Graphs: Generation and Application. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:1-26. [PMID: 34941499 DOI: 10.1109/tpami.2021.3137605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
A scene graph is a structured representation of a scene that can clearly express the objects, attributes, and relationships between objects in the scene. As computer vision technology continues to develop, people are no longer satisfied with simply detecting and recognizing objects in images; instead, people look forward to a higher level of understanding and reasoning about visual scenes. For example, given an image, we want to not only detect and recognize objects in the image, but also understand the relationship between objects (visual relationship detection), and generate a text description (image captioning) based on the image content. Alternatively, we might want the machine to tell us what the little girl in the image is doing (Visual Question Answering (VQA)), or even remove the dog from the image and find similar images (image editing and retrieval), etc. These tasks require a higher level of understanding and reasoning for image vision tasks. The scene graph is just such a powerful tool for scene understanding. Therefore, scene graphs have attracted the attention of a large number of researchers, and related research is often cross-modal, complex, and rapidly developing. However, no relatively systematic survey of scene graphs exists at present. To this end, this survey conducts a comprehensive investigation of the current scene graph research. More specifically, we first summarize the general definition of the scene graph, then conduct a comprehensive and systematic discussion of scene graph generation (SGG) methods and of SGG with the aid of prior knowledge. We then investigate the main applications of scene graphs and summarize the most commonly used datasets. Finally, we provide some insights into the future development of scene graphs.
|
47
|
Zhang T, Chen W, Chen X. Identifying epileptic EEGs and congestive heart failure ECGs under unified framework of wavelet scattering transform, bidirectional weighted (2D)²PCA and KELM. Biocybern Biomed Eng 2023. [DOI: 10.1016/j.bbe.2023.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]
|
48
|
Stouffer KM, Witter MP, Tward DJ, Miller MI. Projective Diffeomorphic Mapping of Molecular Digital Pathology with Tissue MRI. Communications Engineering 2022; 1:44. [PMID: 37284027 PMCID: PMC10243734 DOI: 10.1038/s44172-022-00044-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Accepted: 11/28/2022] [Indexed: 06/08/2023]
Abstract
Reconstructing dense 3D anatomical coordinates from 2D projective measurements has become a central problem in digital pathology for both animal models and human studies. Here we describe Projective Large Deformation Diffeomorphic Metric Mapping (LDDMM), a technique which projects diffeomorphic mappings of dense human magnetic resonance imaging (MRI) atlases at tissue scales onto sparse measurements at micrometre scales associated with histological and more general optical imaging modalities. We solve the problem of dense mapping surjectively onto histological sections by incorporating technologies for crossing modalities that use nonlinear scattering transforms to represent multiple radiomic-like textures at micron scales, together with a Gaussian mixture-model framework for modelling tears and distortions associated with each section. We highlight the significance of our method through the incorporation of neuropathological measures and MRI, which is relevant to the development of biomarkers for Alzheimer's disease and is one instance of the integration of imaging data across the scales of clinical imaging and digital pathology.
Affiliation(s)
- Kaitlin M. Stouffer: Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD USA
- Menno P. Witter: Kavli Institute for Systems Neuroscience, Norwegian University of Science and Technology, Trondheim, Torgarden Norway
- Daniel J. Tward: Departments of Computational Medicine and Neurology, University of California, Los Angeles, CA USA
- Michael I. Miller: Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD USA
|
49
|
Abdolali M, Gillis N. Revisiting data augmentation for subspace clustering. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
|
50
|
Convolution-layer parameters optimization in Convolutional Neural Networks. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.110210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
|