1
Ben Nasr Barber F, Elloumi Oueslati A. Human exons and introns classification using pre-trained Resnet-50 and GoogleNet models and 13-layers CNN model. J Genet Eng Biotechnol 2024; 22:100359. [PMID: 38494268] [PMCID: PMC10903757] [DOI: 10.1016/j.jgeb.2024.100359]
Abstract
BACKGROUND Examining the functions and characteristics of DNA sequences is a highly challenging task, and it is even more challenging for the human genome, which is made up of exons and introns. Human exons and introns contain millions to billions of nucleotides, which contributes to the complexity observed in these sequences. Given how complex genomics is, signal processing techniques and deep learning tools can be very helpful for building strong predictive models in human genome research. RESULTS After representing human exons and introns as color images using Frequency Chaos Game Representation, two pre-trained convolutional neural network models (Resnet-50 and GoogleNet) and a proposed CNN model with 13 hidden layers were used to classify the obtained images. The Resnet-50 model reached an accuracy of 92% in about 7 h of execution time, the GoogleNet model reached 91.5% in 2.5 h, and the proposed CNN model reached 91.6% in 2 h and 37 min. CONCLUSIONS The proposed CNN model is faster than the Resnet-50 model in terms of execution time and slightly exceeds the GoogleNet model in accuracy.
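The frequency chaos game encoding mentioned in this abstract can be sketched as follows: each k-mer of the sequence is mapped to one cell of a 2^k × 2^k grid, and the cell counts form the image. This is a minimal illustration of the general FCGR technique under our own corner assignment and function names, not the authors' exact pipeline.

```python
import numpy as np

def fcgr(seq, k=4):
    """Frequency Chaos Game Representation: count every k-mer of a DNA
    sequence into a 2^k x 2^k grid. Each base contributes one bit to the
    x and y cell coordinates (a simplified but equivalent-up-to-cell-
    ordering version of iterating the chaos game toward corner points)."""
    corners = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}
    n = 2 ** k
    grid = np.zeros((n, n), dtype=int)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(b not in corners for b in kmer):
            continue  # skip ambiguous bases such as N
        x = y = 0
        for j, b in enumerate(kmer):
            cx, cy = corners[b]
            x += cx * 2 ** j  # bit j of the x coordinate
            y += cy * 2 ** j  # bit j of the y coordinate
        grid[y, x] += 1
    return grid
```

The resulting count matrix can be normalized and color-mapped to produce the kind of image fed to a CNN.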
Affiliation(s)
- Feriel Ben Nasr Barber
- Electrical Engineering Department, SITI Laboratory, National School of Engineers of Tunis (ENIT), BP37, Le Belvedere, 1002 Tunis, Tunisia; Electrical Engineering Department, National School of Engineers of Carthage (ENICarthage), Tunis, Tunisia.
- Afef Elloumi Oueslati
- Electrical Engineering Department, SITI Laboratory, National School of Engineers of Tunis (ENIT), BP37, Le Belvedere, 1002 Tunis, Tunisia; Electrical Engineering Department, National School of Engineers of Carthage (ENICarthage), Tunis, Tunisia.
2
Pulcinelli M, Pinnelli M, Massaroni C, Lo Presti D, Fortino G, Schena E. Wearable Systems for Unveiling Collective Intelligence in Clinical Settings. Sensors (Basel) 2023; 23:9777. [PMID: 38139623] [PMCID: PMC10747409] [DOI: 10.3390/s23249777]
Abstract
Nowadays, there is ever-growing interest in assessing the collective intelligence (CI) of a team in a wide range of scenarios, thanks to its potential for enhancing teamwork and group performance. Recently, special attention has been devoted to the clinical setting, where breakdowns in teamwork, leadership, and communication can lead to adverse events that compromise patient safety. So far, researchers have mostly relied on surveys to study human behavior and group dynamics; however, this method is ineffective. In contrast, wearable technologies represent a promising solution for monitoring the behavioral and individual features that are reflective of CI. To date, the field of CI assessment still appears unstructured; therefore, the aim of this narrative review is to provide a detailed overview of the main group and individual parameters that can be monitored to evaluate CI in clinical settings, together with the wearables either already used to assess them or with the potential to be applied in this scenario. The working principles, advantages, and disadvantages of each device are introduced in order to bring order to this field and provide a guide for future CI investigations in medical contexts.
Affiliation(s)
- Martina Pulcinelli
- Research Unit of Measurements and Biomedical Instrumentation, Department of Engineering, Università Campus Bio-Medico di Roma, Via Alvaro del Portillo 21, 00128 Roma, Italy
- Mariangela Pinnelli
- Research Unit of Measurements and Biomedical Instrumentation, Department of Engineering, Università Campus Bio-Medico di Roma, Via Alvaro del Portillo 21, 00128 Roma, Italy
- Carlo Massaroni
- Research Unit of Measurements and Biomedical Instrumentation, Department of Engineering, Università Campus Bio-Medico di Roma, Via Alvaro del Portillo 21, 00128 Roma, Italy
- Fondazione Policlinico Universitario Campus Bio-Medico, Via Alvaro del Portillo, 200, 00128 Roma, Italy
- Daniela Lo Presti
- Research Unit of Measurements and Biomedical Instrumentation, Department of Engineering, Università Campus Bio-Medico di Roma, Via Alvaro del Portillo 21, 00128 Roma, Italy
- Fondazione Policlinico Universitario Campus Bio-Medico, Via Alvaro del Portillo, 200, 00128 Roma, Italy
- Giancarlo Fortino
- DIMES, University of Calabria, Via P. Bucci 41C, 87036 Rende, Italy
- Emiliano Schena
- Research Unit of Measurements and Biomedical Instrumentation, Department of Engineering, Università Campus Bio-Medico di Roma, Via Alvaro del Portillo 21, 00128 Roma, Italy
- Fondazione Policlinico Universitario Campus Bio-Medico, Via Alvaro del Portillo, 200, 00128 Roma, Italy
3
Kim S, Yumuşak Ç, Irimia CV, Bednorz M, Yenel E, Kuş M, Sarıçiftçi NS, Shim BS, Irimia-Vladu M. Amplifying the dielectric constant of shellac by incorporating natural clays for organic field effect transistors (OFETs). Turk J Chem 2023; 47:1169-1182. [PMID: 38173751] [PMCID: PMC10762868] [DOI: 10.55730/1300-0527.3603]
Abstract
We demonstrate in this work the practical use of uniform mixtures of the bioresin shellac and four natural clays, i.e., montmorillonite, sepiolite, halloysite, and vermiculite, as dielectrics in organic field effect transistors (OFETs). We present a thorough characterization of their processability and film-forming characteristics, surface characterization, an elaborate dielectric investigation, and the fabrication of field effect transistors with two classic organic semiconductors, i.e., pentacene and fullerene C60. We show that a low operating voltage of approximately 4 V is possible for all the OFETs using several combinations of clays and shellac. The capacitance measurements show an improvement of the dielectric constant of shellac by a factor of 2, to values in excess of 7, in the uniform mixtures of sepiolite and montmorillonite with this bioresin.
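Dielectric constants like those quoted above are typically extracted from parallel-plate capacitance measurements via C = ε_r·ε_0·A/d. The sketch below illustrates that extraction with hypothetical film dimensions and a hypothetical measured capacitance (none of these numbers come from the paper).

```python
EPS0 = 8.854e-12  # vacuum permittivity, F/m

def dielectric_constant(capacitance, area, thickness):
    """Relative permittivity from the parallel-plate relation
    C = eps_r * eps0 * A / d, solved for eps_r."""
    return capacitance * thickness / (EPS0 * area)

# Hypothetical film: 500 nm thick, 1 mm^2 electrode pads, 124 pF measured.
eps_r = dielectric_constant(124e-12, 1e-6, 500e-9)  # ≈ 7.0
```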
Affiliation(s)
- Sunwoo Kim
- Department of Chemical Engineering, Inha University, South Korea
- Program in Biomedical Science & Engineering, Inha University, South Korea
- Linz Institute for Organic Solar Cells (LIOS), Institute of Physical Chemistry, Johannes Kepler University Linz, Linz, Austria
- Çiğdem Yumuşak
- Linz Institute for Organic Solar Cells (LIOS), Institute of Physical Chemistry, Johannes Kepler University Linz, Linz, Austria
- Cristian Vlad Irimia
- Linz Institute for Organic Solar Cells (LIOS), Institute of Physical Chemistry, Johannes Kepler University Linz, Linz, Austria
- Mateusz Bednorz
- Linz Institute for Organic Solar Cells (LIOS), Institute of Physical Chemistry, Johannes Kepler University Linz, Linz, Austria
- Esma Yenel
- Department of Chemical Engineering, Konya Technical University, Konya, Turkiye
- Mahmut Kuş
- Department of Chemical Engineering, Konya Technical University, Konya, Turkiye
- Niyazi Serdar Sarıçiftçi
- Linz Institute for Organic Solar Cells (LIOS), Institute of Physical Chemistry, Johannes Kepler University Linz, Linz, Austria
- Bong Sup Shim
- Department of Chemical Engineering, Inha University, South Korea
- Program in Biomedical Science & Engineering, Inha University, South Korea
- Mihai Irimia-Vladu
- Linz Institute for Organic Solar Cells (LIOS), Institute of Physical Chemistry, Johannes Kepler University Linz, Linz, Austria
4
Chin CL, Lin CC, Wang JW, Chin WC, Chen YH, Chang SW, Huang PC, Zhu X, Hsu YL, Liu SH. A Wearable Assistant Device for the Hearing Impaired to Recognize Emergency Vehicle Sirens with Edge Computing. Sensors (Basel) 2023; 23:7454. [PMID: 37687910] [PMCID: PMC10490602] [DOI: 10.3390/s23177454]
Abstract
Wearable assistant devices play an important role in daily life for people with disabilities. Those who have hearing impairments may face danger while walking or driving on the road; the major danger is their inability to hear warning sounds from cars or ambulances. Thus, the aim of this study is to develop a wearable assistant device with edge computing that allows the hearing impaired to recognize warning sounds from vehicles on the road. An EfficientNet-based, fuzzy rank-based ensemble model was proposed to classify seven audio sounds, and it was embedded in an Arduino Nano 33 BLE Sense development board. The audio files were obtained from the CREMA-D dataset and the Large-Scale Audio dataset of emergency vehicle sirens on the road, with a total of 8756 files. The seven audio sounds included four vocalizations and three sirens. The audio signal was converted into a spectrogram using the short-time Fourier transform for feature extraction. When one of the three sirens was detected, the wearable assistant device raised alarms by vibrating and displaying messages on the OLED panel. The EfficientNet-based, fuzzy rank-based ensemble model achieved an accuracy of 97.1%, precision of 97.79%, sensitivity of 96.8%, and specificity of 97.04% in offline computing. In edge computing, the results comprised an accuracy of 95.2%, precision of 93.2%, sensitivity of 95.3%, and specificity of 95.1%. Thus, the proposed wearable assistant device has the potential benefit of helping the hearing impaired avoid traffic accidents.
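The spectrogram front end described in this abstract, i.e. converting an audio signal to time-frequency features via the short-time Fourier transform, can be sketched in a few lines. This is a generic STFT sketch with parameters of our own choosing, not the authors' exact configuration.

```python
import numpy as np

def stft_spectrogram(x, nperseg=256, hop=128):
    """Log-magnitude spectrogram: Hann-windowed overlapping frames,
    real FFT per frame, converted to a dB scale. Such spectrograms
    are a common image-like input for audio classifiers."""
    win = np.hanning(nperseg)
    frames = np.stack([x[i:i + nperseg] * win
                       for i in range(0, len(x) - nperseg + 1, hop)])
    mag = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, nperseg//2 + 1)
    return 20 * np.log10(mag + 1e-10)          # dB scale

# A 1 kHz test tone sampled at 8 kHz concentrates energy in one bin
# (bin width = fs / nperseg = 31.25 Hz, so 1 kHz lands in bin 32).
fs = 8000
t = np.arange(fs) / fs
spec = stft_spectrogram(np.sin(2 * np.pi * 1000 * t))
peak_bin = spec.mean(axis=0).argmax()
```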
Affiliation(s)
- Chiun-Li Chin
- Department of Medical Informatics, Chung Shan Medical University, Taichung 40201, Taiwan
- Chia-Chun Lin
- Department of Medical Informatics, Chung Shan Medical University, Taichung 40201, Taiwan
- Jing-Wen Wang
- Department of Medical Informatics, Chung Shan Medical University, Taichung 40201, Taiwan
- Wei-Cheng Chin
- Department of Medical Informatics, Chung Shan Medical University, Taichung 40201, Taiwan
- Yu-Hsiang Chen
- Department of Medical Informatics, Chung Shan Medical University, Taichung 40201, Taiwan
- Sheng-Wen Chang
- Department of Medical Informatics, Chung Shan Medical University, Taichung 40201, Taiwan
- Pei-Chen Huang
- Department of Medical Informatics, Chung Shan Medical University, Taichung 40201, Taiwan
- Xin Zhu
- Division of Information Systems, School of Computer Science and Engineering, The University of Aizu, Aizu-Wakamatsu 965-8580, Fukushima, Japan
- Yu-Lun Hsu
- Bachelor’s Program of Sports and Health Promotion, Fo Guang University, Yilan 26247, Taiwan
- Shing-Hong Liu
- Department of Computer Science and Information Engineering, Chaoyang University of Technology, Taichung 41349, Taiwan
5
Shougat MREU, Li X, Shao S, McGarvey K, Perkins E. Hopf physical reservoir computer for reconfigurable sound recognition. Sci Rep 2023; 13:8719. [PMID: 37253968] [DOI: 10.1038/s41598-023-35760-x]
Abstract
The Hopf oscillator is a nonlinear oscillator that exhibits limit cycle motion. The reservoir computer utilizes the vibratory nature of the oscillator, which makes it an ideal candidate for reconfigurable sound recognition tasks. In this paper, the capabilities of the Hopf reservoir computer for sound recognition are systematically demonstrated. This work shows that the Hopf reservoir computer can offer superior sound recognition accuracy compared to legacy approaches (e.g., a Mel spectrum + machine learning approach). More importantly, the Hopf reservoir computer operating as a sound recognition system does not require audio preprocessing and has a very simple setup while still offering a high degree of reconfigurability. These features pave the way for applying physical reservoir computing to sound recognition in low-power edge devices.
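For readers unfamiliar with the dynamics involved: a Hopf oscillator in normal form settles onto a limit cycle of radius √μ, and a reservoir computer reads features off such forced state trajectories. The forward-Euler simulation below is our own minimal illustration of the limit-cycle behavior, not the paper's reservoir setup.

```python
import math

def hopf_step(x, y, mu=1.0, omega=2.0, u=0.0, dt=0.01):
    """One forward-Euler step of the forced Hopf normal form:
    dx/dt = (mu - r^2) x - omega y + u
    dy/dt = (mu - r^2) y + omega x
    where r^2 = x^2 + y^2 and u is an external input."""
    r2 = x * x + y * y
    dx = (mu - r2) * x - omega * y + u
    dy = (mu - r2) * y + omega * x
    return x + dt * dx, y + dt * dy

# From a small perturbation, the state spirals out to the limit cycle
# of radius sqrt(mu); an input u would modulate this trajectory.
x, y = 0.1, 0.0
for _ in range(5000):
    x, y = hopf_step(x, y)
radius = math.hypot(x, y)  # ≈ 1.0 for mu = 1
```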
Affiliation(s)
- Md Raf E Ul Shougat
- Mechanical & Aerospace Engineering Department, North Carolina State University, 1840 Entrepreneur Drive, Raleigh, NC, 27695, USA
- Siyao Shao
- TandemLaunch, 780 Av. Brewster, Montreal, H4C2K1, Canada
- echosonic, 780 Av. Brewster, Montreal, H4C2K1, Canada
6
Dong P, Song Y, Yu S, Zhang Z, Mallipattu SK, Djurić PM, Yao S. Electromyogram-Based Lip-Reading via Unobtrusive Dry Electrodes and Machine Learning Methods. Small 2023; 19:e2205058. [PMID: 36703524] [DOI: 10.1002/smll.202205058]
Abstract
Lip-reading provides an effective speech communication interface for people with voice disorders and for intuitive human-machine interactions. Existing systems are generally challenged by bulkiness, obtrusiveness, and poor robustness against environmental interferences. The lack of a truly natural and unobtrusive system for converting lip movements to speech precludes the continuous use and wide-scale deployment of such devices. Here, the design of a hardware-software architecture to capture, analyze, and interpret lip movements associated with either normal or silent speech is presented. The system can recognize different and similar visemes. It is robust in a noisy or dark environment. Self-adhesive, skin-conformable, and semi-transparent dry electrodes are developed to track high-fidelity speech-relevant electromyogram signals without impeding daily activities. The resulting skin-like sensors can form seamless contact with the curvilinear and dynamic surfaces of the skin, which is crucial for a high signal-to-noise ratio and minimal interference. Machine learning algorithms are employed to decode electromyogram signals and convert them to spoken words. Finally, the applications of the developed lip-reading system in augmented reality and medical service are demonstrated, which illustrate the great potential in immersive interaction and healthcare applications.
Affiliation(s)
- Penghao Dong
- Department of Mechanical Engineering, Stony Brook University, Stony Brook, NY, 11794, USA
- Yuanqing Song
- Department of Electrical and Computer Engineering, Stony Brook University, Stony Brook, NY, 11794, USA
- Shangyouqiao Yu
- Department of Mechanical Engineering, Stony Brook University, Stony Brook, NY, 11794, USA
- Zimeng Zhang
- Department of Mechanical Engineering, Stony Brook University, Stony Brook, NY, 11794, USA
- Sandeep K Mallipattu
- Department of Medicine, Stony Brook University, Stony Brook, NY, 11794, USA
- Renal Section, Northport VA Medical Center, Northport, NY, 11768, USA
- Petar M Djurić
- Department of Electrical and Computer Engineering, Stony Brook University, Stony Brook, NY, 11794, USA
- Shanshan Yao
- Department of Mechanical Engineering, Stony Brook University, Stony Brook, NY, 11794, USA
7
Force-induced ion generation in zwitterionic hydrogels for a sensitive silent-speech sensor. Nat Commun 2023; 14:219. [PMID: 36639704] [PMCID: PMC9839672] [DOI: 10.1038/s41467-023-35893-7]
Abstract
Human-sensitive mechanosensation depends on ionic currents controlled by skin mechanoreceptors. Inspired by the sensory behavior of skin, we investigate zwitterionic hydrogels that generate ions under an applied force in a mobile-ion-free system. Within this system, water dissociates as the distance between zwitterions reduces under an applied pressure. Meanwhile, zwitterionic segments can provide migration channels for the generated ions, significantly facilitating ion transport. These combined effects endow a mobile-ion-free zwitterionic skin sensor with sensitive transduction of pressure into ionic currents, achieving a sensitivity up to five times that of nonionic hydrogels. The signal response time, which relies on the crosslinking degree of the zwitterionic hydrogel, was ~38 ms, comparable to that of natural skin. The skin sensor was incorporated into a universal throat-worn silent-speech recognition system that transforms the tiny signals of laryngeal mechanical vibrations into silent speech.
8
Peracha FK, Khattak MI, Salem N, Saleem N. Causal speech enhancement using dynamical-weighted loss and attention encoder-decoder recurrent neural network. PLoS One 2023; 18:e0285629. [PMID: 37167227] [PMCID: PMC10174555] [DOI: 10.1371/journal.pone.0285629]
Abstract
Speech enhancement (SE) reduces background noise in target speech and is applied at the front end in various real-world applications, including robust ASRs and real-time processing in mobile phone communications. SE systems are commonly integrated into mobile phones to increase speech quality and intelligibility, so a low-latency system is required to operate in real-world applications; at the same time, these systems need efficient optimization. This research focuses on single-microphone SE operating in real-time systems with better optimization. We propose a causal data-driven model that uses an attention encoder-decoder long short-term memory (LSTM) network to estimate the time-frequency mask from noisy speech and produce clean speech for real-time applications that need low-latency causal processing. The encoder-decoder LSTM and a causal attention mechanism are used in the proposed model. Furthermore, a dynamical-weighted (DW) loss function is proposed to improve model learning by varying the weight loss values. Experiments demonstrated that the proposed model consistently improves voice quality, intelligibility, and noise suppression. In the causal processing mode, the LSTM-based estimated suppression time-frequency mask outperforms the baseline model for unseen noise types. The proposed SE improved the STOI by 2.64% (baseline LSTM-IRM), 6.6% (LSTM-KF), 4.18% (DeepXi-KF), and 3.58% (DeepResGRU-KF). In addition, we examine word error rates (WERs) using Google's Automatic Speech Recognition (ASR). The ASR results show that error rates decreased from 46.33% (noisy signals) to 13.11% (proposed), 15.73% (LSTM), and 14.97% (LSTM-KF).
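The time-frequency masking idea underlying such SE systems can be illustrated with an ideal ratio mask (IRM), a standard training target in mask-based enhancement: a value in [0, 1] per time-frequency cell, applied multiplicatively to the noisy spectrum. The sketch below uses toy random magnitude spectra and is a generic illustration, not the authors' DW-loss model.

```python
import numpy as np

def ideal_ratio_mask(clean_mag, noise_mag):
    """IRM from clean and noise magnitude spectra: each time-frequency
    cell gets a gain in [0, 1], larger where speech dominates noise."""
    return clean_mag**2 / (clean_mag**2 + noise_mag**2 + 1e-10)

rng = np.random.default_rng(0)
clean = rng.uniform(0.5, 1.0, size=(100, 129))  # toy magnitude spectra
noise = rng.uniform(0.0, 0.3, size=(100, 129))
noisy = clean + noise                           # crude additive mixing
mask = ideal_ratio_mask(clean, noise)
enhanced = mask * noisy                         # masked (enhanced) spectrum
```

In a real system, a network such as the paper's attention encoder-decoder LSTM predicts the mask from the noisy spectrum alone; the oracle mask above is only the training target.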
Affiliation(s)
- Fahad Khalil Peracha
- Department of Electrical Engineering, University of Engineering and Technology, Peshawar, KPK, Pakistan
- Muhammad Irfan Khattak
- Department of Electrical Engineering, University of Engineering and Technology, Peshawar, KPK, Pakistan
- Nema Salem
- Electrical and Computer Engineering Department, Effat College of Engineering, Effat University, Jeddah, KSA
- Nasir Saleem
- Department of Electrical Engineering, University of Engineering and Technology, Peshawar, KPK, Pakistan
9
Al-hammuri K, Gebali F, Thirumarai Chelvan I, Kanan A. Tongue Contour Tracking and Segmentation in Lingual Ultrasound for Speech Recognition: A Review. Diagnostics (Basel) 2022; 12:2811. [PMID: 36428870] [PMCID: PMC9689563] [DOI: 10.3390/diagnostics12112811]
Abstract
Lingual ultrasound imaging is essential in linguistic research and speech recognition. It has been widely used in different applications: as visual feedback to enhance language learning for non-native speakers, and in the study of speech-related disorders and their remediation, articulation research and analysis, swallowing studies, 3D tongue modelling, and silent speech interfaces. This article provides a comparative analysis and review, based on quantitative and qualitative criteria, of the two main streams of tongue contour segmentation from ultrasound images. The first stream utilizes traditional computer vision and image processing algorithms for tongue segmentation; the second uses machine and deep learning algorithms. The results show that tongue tracking using machine learning-based techniques is superior to traditional techniques, considering performance and algorithm generalization ability. Meanwhile, traditional techniques are helpful for implementing interactive image segmentation to extract valuable features during training and postprocessing. We recommend using a hybrid approach that combines machine learning and traditional techniques to implement a real-time tongue segmentation tool.
Affiliation(s)
- Khalid Al-hammuri
- Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC V8W 2Y2, Canada
- Fayez Gebali
- Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC V8W 2Y2, Canada
- Awos Kanan
- Department of Computer Engineering, Princess Sumaya University for Technology, Amman 11941, Jordan
10
Zdravkova K, Krasniqi V, Dalipi F, Ferati M. Cutting-edge communication and learning assistive technologies for disabled children: An artificial intelligence perspective. Front Artif Intell 2022; 5:970430. [PMID: 36388402] [PMCID: PMC9650429] [DOI: 10.3389/frai.2022.970430]
Abstract
In this study, we provide an in-depth review and analysis of the impact of artificial intelligence (AI) components and solutions that support the development of cutting-edge assistive technologies for children with special needs. Various disabilities are addressed, and the most recent assistive technologies that enhance communication and education of disabled children, as well as the AI technologies that have enabled their development, are presented. The paper concludes with an AI perspective on future assistive technologies and the ethical concerns arising from the use of such cutting-edge communication and learning technologies for children with disabilities.
Affiliation(s)
- Katerina Zdravkova
- Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University in Skopje, Skopje, North Macedonia
- Venera Krasniqi
- Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University in Skopje, Skopje, North Macedonia
- Fisnik Dalipi
- Department of Informatics, Faculty of Technology, Linnaeus University, Växjö, Sweden
- Mexhid Ferati
- Department of Informatics, Faculty of Technology, Linnaeus University, Växjö, Sweden
11
Using AAEHS-Net as an Attention-Based Auxiliary Extraction and Hybrid Subsampled Network for Semantic Segmentation. Comput Intell Neurosci 2022; 2022:1536976. [PMID: 36275973] [PMCID: PMC9586756] [DOI: 10.1155/2022/1536976]
Abstract
Semantic segmentation based on deep learning has undergone remarkable advancements in recent years. However, due to the neglect of shallow features, problems of inaccurate segmentation have persisted. To address this issue, a semantic segmentation network, the attention-based auxiliary extraction and hybrid subsampled network (AAEHS-Net), is proposed in this study. To extract more deep information along with the shallow features, the network uses a complementary and enhanced extraction module (CEEM), which improves the model's edge segmentation. Moreover, to reduce the loss of features, a hybrid subsampled module (HSM) is introduced. Meanwhile, a global max pool and global avg pool module (GAGM) is designed as an attention module to enhance the features with global and important information and maintain feature continuity. The proposed AAEHS-Net is evaluated on three datasets: the aerial drone image dataset, the Massachusetts roads dataset, and the Massachusetts buildings dataset. On the three datasets, AAEHS-Net achieves 1.15%, 0.88%, and 2.1% higher accuracy than U-Net, reaching 90.12%, 96.23%, and 95.15%, respectively. At the same time, the proposed network obtains the best values for all evaluation metrics on the three datasets compared to currently popular algorithms.
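The GAGM described above combines global max pooling and global average pooling as an attention module. A generic channel-attention sketch in that spirit is shown below; the bottleneck weights are random stand-ins for learned parameters, and this is our own illustration rather than the authors' exact module.

```python
import numpy as np

def channel_attention(feat, reduction=4):
    """Generic channel attention combining global average and global max
    pooling: per-channel descriptors pass through a tiny shared bottleneck
    MLP, are summed, and squashed to (0, 1) gates that reweight channels."""
    c, h, w = feat.shape
    avg = feat.mean(axis=(1, 2))        # (C,) global average descriptor
    mx = feat.max(axis=(1, 2))          # (C,) global max descriptor
    rng = np.random.default_rng(0)      # stand-in for learned weights
    w1 = rng.standard_normal((c // reduction, c)) * 0.1
    w2 = rng.standard_normal((c, c // reduction)) * 0.1
    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # shared bottleneck MLP
    gates = 1.0 / (1.0 + np.exp(-(mlp(avg) + mlp(mx))))  # sigmoid gates
    return feat * gates[:, None, None]  # reweight each channel

out = channel_attention(np.ones((8, 16, 16)))
```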
12
Deng J, Zhang X, Li M, Jiang H, Chen Q. Feasibility study on Raman spectra-based deep learning models for monitoring the contamination degree and level of aflatoxin B1 in edible oil. Microchem J 2022. [DOI: 10.1016/j.microc.2022.107613]
13
Real-Time Object Tracking Algorithm Based on Siamese Network. Appl Sci (Basel) 2022. [DOI: 10.3390/app12147338]
Abstract
Object tracking aims to track a given target that is specified only in the first frame. Due to rapid movement and the interference of cluttered backgrounds, object tracking is a significant challenge in computer vision. This research puts forward an innovative feature pyramid and optical flow estimation based on the Siamese network for object tracking, called SiamFP. SiamFP jointly trains the optical flow and the tracking task under the Siamese network framework. We employ an optical flow network based on pyramid correlation mapping to estimate the movement of the target across two contiguous frames, increasing the accuracy of the feature representation. Simultaneously, we adopt spatial attention as well as channel attention to effectively suppress ambient noise, emphasize the target area, and better extract the features of the given object, so that the tracking algorithm has a higher success rate. The proposed SiamFP obtains state-of-the-art performance on the OTB50, OTB2015, and VOT2016 benchmarks while exhibiting better real-time performance and robustness.
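At the core of Siamese trackers is a cross-correlation between a template embedding and a search-region embedding: the template is slid over the search feature map, and the argmax of the resulting score map localizes the target. The numpy sketch below illustrates this matching step on synthetic features; it is our own illustration, not SiamFP's learned pipeline.

```python
import numpy as np

def cross_correlate(search, template):
    """Dense cross-correlation score map: a dot product between the
    template and every same-sized window of the search feature map,
    as used for localization in Siamese trackers."""
    th, tw = template.shape
    sh, sw = search.shape
    scores = np.zeros((sh - th + 1, sw - tw + 1))
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            scores[i, j] = np.sum(search[i:i + th, j:j + tw] * template)
    return scores

rng = np.random.default_rng(1)
template = rng.uniform(1.0, 2.0, size=(8, 8))  # distinctive positive patch
search = np.zeros((32, 32))
search[10:18, 12:20] = template                # target placed at (10, 12)
scores = cross_correlate(search, template)
top = np.unravel_index(scores.argmax(), scores.shape)  # → (10, 12)
```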
14
Supervised Learning Models for the Preliminary Detection of COVID-19 in Patients Using Demographic and Epidemiological Parameters. Information 2022. [DOI: 10.3390/info13070330]
Abstract
The World Health Organization declared the new COVID-19 outbreak a public health emergency of international concern on 30 January 2020 and named it a global pandemic in March 2020. It has had catastrophic consequences for the world economy and the well-being of people and has put a tremendous strain on already-scarce healthcare systems globally, particularly in underdeveloped countries. Over 11 billion vaccine doses have already been administered worldwide, and the benefits of these vaccinations will take some time to appear. Today, the only practical approach to diagnosing COVID-19 is through the RT-PCR and RAT tests, which have sometimes been known to give unreliable results. Timely diagnosis and implementation of precautionary measures are likely to improve survival outcomes and decrease fatality rates. In this study, we propose an innovative way to predict COVID-19 using alternative non-clinical methods, namely supervised machine learning models that identify patients at risk based on their characteristic parameters and underlying comorbidities. Medical records of patients from Mexico admitted between 23 January 2020 and 26 March 2022 were chosen for this purpose. Among the several supervised machine learning approaches tested, the XGBoost model achieved the best results, with an accuracy of 92%. It is an easy, non-invasive, inexpensive, instant, and accurate way of forecasting who is at risk of contracting the virus. However, it is too early to conclude that this method can be used as an alternative to the clinical diagnosis of coronavirus cases.
15
Toward Smart Communication Components: Recent Advances in Human and AI Speaker Interaction. Electronics 2022. [DOI: 10.3390/electronics11101533]
Abstract
This study aims to investigate how humans and artificial intelligence (AI) speakers interact and to examine the interactions based on three types of communication failure: system, semantic, and effectiveness. We categorized service failures using AI speaker user data provided by the top telecommunication service providers in South Korea and investigated means to increase the continuity of product use for each type. We confirmed the occurrence of failure due to system error (H1) and the negative effect of failures to understand meaning on sustained use of the AI speaker (H2). It was observed that the number of users increases as the effectiveness failure rate increases. For single-person households of persons in their 30s and 70s or older, the continued use of AI speakers was significant. We found that it alleviated loneliness and that human-machine interaction using AI speakers can reach a high level through a high degree of meaning transfer. We also expect AI speakers to play a positive role in single-person households, especially those of the elderly, which have become a tough challenge in recent times.
16
Ehrmann G, Blachowicz T, Homburg SV, Ehrmann A. Measuring Biosignals with Single Circuit Boards. Bioengineering (Basel) 2022; 9:bioengineering9020084. [PMID: 35200437 PMCID: PMC8869486 DOI: 10.3390/bioengineering9020084] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2022] [Accepted: 02/14/2022] [Indexed: 12/23/2022] Open
Abstract
Measuring biosignals continuously with textile-integrated or even textile-based electrodes and miniaturized electronics is ideal for providing maximum comfort to patients or athletes during monitoring. While this was formerly solved by integrating specialized electronics into garments, either connected to a handheld computer or including a wireless data transfer option, increasingly small single circuit boards are available today in various shapes and dimensions, e.g., single-board computers such as the Raspberry Pi or microcontrollers such as the Arduino. This review gives an overview of recent scientific literature reporting measurements of biosignals such as ECG, EMG, sweat and other health-related parameters with single circuit boards, showing the new possibilities offered by the Arduino, Raspberry Pi etc. for mobile long-term acquisition of biosignals. The review concentrates on the electronics, not on textile electrodes, about which several review papers are already available.
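Boards such as the Arduino or Raspberry Pi typically sample such signals through an ADC, and a common first processing step on the raw readings is a simple smoothing filter. The sketch below, with synthetic data (the sampling rate, window size, and noise level are illustrative assumptions), shows a moving-average filter of the kind often applied to biosignal samples before further analysis:

```python
import math
import random

def moving_average(samples, window=5):
    """Simple noise-suppression filter often applied to ADC readings
    of biosignals (ECG, EMG) before further analysis."""
    out = []
    for i in range(len(samples)):
        lo = max(0, i - window + 1)
        chunk = samples[lo:i + 1]          # trailing window of recent samples
        out.append(sum(chunk) / len(chunk))
    return out

# Synthetic 1 Hz "biosignal" sampled at 100 Hz with additive noise.
random.seed(0)
fs = 100
signal = [math.sin(2 * math.pi * t / fs) + random.uniform(-0.2, 0.2)
          for t in range(fs)]
smoothed = moving_average(signal, window=5)
```

On a real board the `signal` list would be replaced by ADC reads; the filtering logic is unchanged.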
Affiliation(s)
- Guido Ehrmann
- Virtual Institute of Applied Research on Advanced Materials (VIARAM)
- Tomasz Blachowicz
- Institute of Physics—Center for Science and Education, Silesian University of Technology, 44-100 Gliwice, Poland
- Sarah Vanessa Homburg
- Faculty of Engineering and Mathematics, Bielefeld University of Applied Sciences, 33619 Bielefeld, Germany
- Andrea Ehrmann
- Faculty of Engineering and Mathematics, Bielefeld University of Applied Sciences, 33619 Bielefeld, Germany
17
Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12031091] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Abstract
Automatic speech recognition (ASR) is an effective technique for converting human speech into text or computer actions. ASR systems are widely used in smart appliances, smart homes, and biometric systems, and they incorporate signal processing and machine learning techniques to recognize speech. However, traditional systems perform poorly in noisy environments; in addition, accents and local dialect differences degrade an ASR system's performance when analyzing speech signals. To overcome these issues, a precise speech recognition system was developed to improve performance. This paper uses speech data from the jim-schwoebel voice datasets, processed with Mel-frequency cepstral coefficients (MFCCs). The MFCC algorithm extracts the valuable features used to recognize speech. A sparse auto-encoder (SAE) neural network classifies the speech, and a hidden Markov model (HMM) makes the recognition decision. Network performance is optimized by applying the Harris Hawks optimization (HHO) algorithm to fine-tune the network parameters. The fine-tuned network can effectively recognize speech in a noisy environment.
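MFCC extraction maps frequencies onto the perceptual mel scale before computing cepstral coefficients. The standard Hz-to-mel conversion used to space the triangular filterbank can be sketched as follows (the filter count and frequency range here are common illustrative choices, not parameters from the paper):

```python
import math

def hz_to_mel(f):
    # Standard formula used when spacing MFCC filterbanks.
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def filterbank_centers(f_min, f_max, n_filters):
    """Center frequencies (Hz) of n_filters triangular filters spaced
    evenly on the mel scale between f_min and f_max."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_filters + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_filters)]

centers = filterbank_centers(0.0, 8000.0, 26)
```

The mel spacing places filters densely at low frequencies and sparsely at high ones, mirroring human pitch perception; the cepstral step then applies a DCT to the log filterbank energies.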
18
Steering a Robotic Wheelchair Based on Voice Recognition System Using Convolutional Neural Networks. ELECTRONICS 2022. [DOI: 10.3390/electronics11010168] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
Abstract
Many wheelchair users depend on others to control the movement of their wheelchairs, which significantly affects their independence and quality of life. Smart wheelchairs offer a degree of self-dependence and the freedom to drive one's own vehicle. In this work, we designed and implemented a low-cost software and hardware method to steer a robotic wheelchair, and we developed our own Android mobile app based on the Flutter framework. A convolutional neural network (CNN) with a network-in-network (NIN) structure, integrated with a voice recognition model, was developed and configured to build the mobile app. The technique was implemented using an offline Wi-Fi hotspot between the software and hardware components. Five voice commands (yes, no, left, right, and stop) guided and controlled the wheelchair through the Raspberry Pi and DC motor drives. The overall system was evaluated on an English isolated-word speech corpus, trained and validated with native Arabic speakers, to assess the performance of the Android application. Maneuverability for indoor and outdoor navigation was also evaluated in terms of accuracy. The results indicated an accuracy of approximately 87.2% in correctly predicting the five voice commands. Additionally, in the real-time performance test, the root-mean-square deviation (RMSD) between planned and actual nodes for indoor and outdoor maneuvering was 1.721 × 10⁻⁵ and 1.743 × 10⁻⁵, respectively.
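The RMSD figures quoted above compare planned and actual trajectory nodes. Its computation over paired 2-D nodes can be sketched as follows (the sample coordinates are made up for illustration):

```python
import math

def rmsd(planned, actual):
    """Root-mean-square deviation between paired 2-D path nodes,
    the metric used for indoor/outdoor maneuvering accuracy."""
    assert len(planned) == len(actual)
    total = sum((px - ax) ** 2 + (py - ay) ** 2
                for (px, py), (ax, ay) in zip(planned, actual))
    return math.sqrt(total / len(planned))

# Hypothetical planned path along the x-axis vs. slightly deviating run.
planned = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
actual = [(0.0, 0.1), (1.0, -0.1), (2.0, 0.1)]
deviation = rmsd(planned, actual)
```

Smaller RMSD means the wheelchair tracked the planned route more closely; values on the order of 10⁻⁵, as reported, indicate near-perfect tracking at the nodes.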
19
Aquila-eagle-based Deep Convolutional neural network for speech recognition using EEG signals. INTERNATIONAL JOURNAL OF SWARM INTELLIGENCE RESEARCH 2022. [DOI: 10.4018/ijsir.302608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Conventional BCI systems suffer from several issues, such as background noise interference, a low precision rate, and high cost. Hence, this research article proposes a novel speech recognition model based on an optimized Deep-CNN to overcome the issues of conventional speech recognition methods. The significance of the research lies in the proposed Aquila-eagle optimization algorithm, which effectively tunes the parameters of the Deep-CNN. The most significant features are extracted in the feature selection process, which enhances the precision of the speech recognition model. Furthermore, unwanted noise in the EEG signals is removed in the pre-processing stage to boost the accuracy of the Deep-CNN classifier. The experimental outcomes demonstrate that the proposed Aquila-eagle-based Deep-CNN outperformed other state-of-the-art techniques in terms of accuracy, precision, and recall, with values of 93.11%, 90.89%, and 93.11%, respectively.
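The accuracy, precision, and recall figures above are standard classification metrics. Computed from binary confusion-matrix counts, they can be sketched as follows (the counts are illustrative, not the study's data):

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from binary confusion counts:
    true/false positives (tp, fp) and false/true negatives (fn, tn)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # overall correctness
    precision = tp / (tp + fp)                   # correctness of positives
    recall = tp / (tp + fn)                      # coverage of positives
    return accuracy, precision, recall

acc, prec, rec = metrics(tp=90, fp=10, fn=5, tn=95)
```

In multi-class settings such as EEG speech recognition, these are usually computed per class and then averaged.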
20
Sarmiento LC, Villamizar S, López O, Collazos AC, Sarmiento J, Rodríguez JB. Recognition of EEG Signals from Imagined Vowels Using Deep Learning Methods. SENSORS (BASEL, SWITZERLAND) 2021; 21:6503. [PMID: 34640824 PMCID: PMC8512781 DOI: 10.3390/s21196503] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/24/2021] [Revised: 09/17/2021] [Accepted: 09/24/2021] [Indexed: 01/27/2023]
Abstract
The use of imagined speech with electroencephalographic (EEG) signals is a promising field of brain-computer interfaces (BCI) that seeks communication between language-related areas of the cerebral cortex and devices or machines. However, the complexity of this brain process makes the analysis and classification of this type of signal a relevant research topic. The goals of this study were: to develop a new Deep Learning (DL) algorithm, referred to as CNNeeg1-1, to recognize EEG signals in imagined vowel tasks; to create an imagined speech database of 50 subjects specialized in imagined vowels of the Spanish language (/a/, /e/, /i/, /o/, /u/); and to compare the performance of the CNNeeg1-1 algorithm with the Shallow CNN and EEGNet DL benchmark algorithms using an open-access database (BD1) and the newly developed database (BD2). A mixed analysis of variance was conducted to assess intra-subject and inter-subject training of the proposed algorithms. The results show that, in the intra-subject training analysis, CNNeeg1-1 exhibited the best performance among the Shallow CNN, EEGNet, and CNNeeg1-1 methods in classifying imagined vowels, with an accuracy of 65.62% on the BD1 database and 85.66% on the BD2 database.
Affiliation(s)
- Luis Carlos Sarmiento
- Departamento de Tecnología, Universidad Pedagógica Nacional, Bogotá 111321, Colombia
- Sergio Villamizar
- Department of Electrical and Electronics Engineering, School of Engineering, Universidad Nacional de Colombia, Bogotá 111321, Colombia
- Omar López
- Departamento de Tecnología, Universidad Pedagógica Nacional, Bogotá 111321, Colombia
- Ana Claros Collazos
- Departamento de Tecnología, Universidad Pedagógica Nacional, Bogotá 111321, Colombia
- Jhon Sarmiento
- Departamento de Tecnología, Universidad Pedagógica Nacional, Bogotá 111321, Colombia
- Jan Bacca Rodríguez
- Department of Electrical and Electronics Engineering, School of Engineering, Universidad Nacional de Colombia, Bogotá 111321, Colombia