1
Gris VN, Crespo TR, Kaneko A, Okamoto M, Suzuki J, Teramae JN, Miyabe-Nishiwaki T. Deep Learning for Face Detection and Pain Assessment in Japanese macaques (Macaca fuscata). JOURNAL OF THE AMERICAN ASSOCIATION FOR LABORATORY ANIMAL SCIENCE : JAALAS 2024; 63:403-411. [PMID: 38428929 PMCID: PMC11270042 DOI: 10.30802/aalas-jaalas-23-000056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 07/31/2023] [Accepted: 01/04/2024] [Indexed: 03/03/2024]
Abstract
Facial expressions have increasingly been used to assess emotional states in mammals. The recognition of pain in research animals is essential for their well-being and leads to more reliable research outcomes. Automating this process could contribute to early pain diagnosis and treatment. Artificial neural networks have become a popular option for image classification tasks in recent years due to the development of deep learning. In this study, we investigated the ability of a deep learning model to detect pain in Japanese macaques based on their facial expression. Thirty to 60 min of video footage from Japanese macaques undergoing laparotomy was used in the study. Macaques were recorded undisturbed in their cages before surgery (No Pain) and one day after the surgery before scheduled analgesia (Pain). Videos were processed for facial detection and image extraction with the algorithms RetinaFace (adding a bounding box around the face for image extraction) or Mask R-CNN (contouring the face for extraction). ResNet50 used 75% of the images to train systems; the other 25% were used for testing. Test accuracy varied from 48 to 54% after box extraction. The low accuracy of classification after box extraction was likely due to the incorporation of features that were not relevant for pain (for example, background, illumination, skin color, or objects in the enclosure). However, using contour extraction, preprocessing the images, and fine-tuning, the network resulted in 64% appropriate generalization. These results suggest that Mask R-CNN can be used for facial feature extractions and that the performance of the classifying model is relatively accurate for nonannotated single-frame images.
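The 75%/25% division of images described in the abstract can be illustrated with a minimal sketch. The frame names, the `split_train_test` helper, and the fixed seed are hypothetical; the abstract does not say whether the split was made per frame, per video, or per animal, so a plain random split over frames is only an assumption.

```python
import random

def split_train_test(items, train_fraction=0.75, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names standing in for extracted face images.
frames = [f"frame_{i:04d}.png" for i in range(200)]
train, test = split_train_test(frames)
print(len(train), len(test))
```

When frames come from a small number of animals, splitting by animal rather than by frame is a common way to keep near-duplicate frames out of the test set.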
Affiliation(s)
- Vanessa N Gris
  - Primate Research Institute and Center for the Evolutionary Origins of Human Behavior, Kyoto University, Inuyama, Japan
- Thomás R Crespo
  - Department of Advanced Mathematical Sciences, Graduate School of Informatics, Kyoto University, Kyoto, Japan
- Akihisa Kaneko
  - Primate Research Institute and Center for the Evolutionary Origins of Human Behavior, Kyoto University, Inuyama, Japan
- Munehiro Okamoto
  - Primate Research Institute and Center for the Evolutionary Origins of Human Behavior, Kyoto University, Inuyama, Japan
- Jun-nosuke Teramae
  - Department of Advanced Mathematical Sciences, Graduate School of Informatics, Kyoto University, Kyoto, Japan
- Takako Miyabe-Nishiwaki
  - Primate Research Institute and Center for the Evolutionary Origins of Human Behavior, Kyoto University, Inuyama, Japan
2
Wu T, Xia Z, Zhou M, Kong LB, Chen Z. AMENet is a monocular depth estimation network designed for automatic stereoscopic display. Sci Rep 2024; 14:5868. [PMID: 38467677 PMCID: PMC10928105 DOI: 10.1038/s41598-024-56095-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Accepted: 03/01/2024] [Indexed: 03/13/2024] Open
Abstract
Monocular depth estimation has a wide range of applications in the field of autostereoscopic displays, while accuracy and robustness in complex scenes are still a challenge. In this paper, we propose a depth estimation network for autostereoscopic displays, which aims at improving the accuracy of monocular depth estimation by fusing Vision Transformer (ViT) and Convolutional Neural Network (CNN). Our approach feeds the input image as a sequence of visual features into the ViT module and utilizes its global perception capability to extract high-level semantic features of the image. The relationship between the losses is quantified by adding a weight correction module to improve robustness of the model. Experimental evaluation results on several public datasets show that AMENet exhibits higher accuracy and robustness than existing methods in different scenarios and complex conditions. In addition, a detailed experimental analysis was conducted to verify the effectiveness and stability of our method. The accuracy improvement on the KITTI dataset compared to the baseline method is 4.4%. In summary, AMENet is a promising depth estimation method with sufficient high robustness and accuracy for monocular depth estimation tasks.
Affiliation(s)
- Tianzhao Wu
  - College of New Materials and New Energies, Shenzhen University of Technology, Shenzhen, 518118, Guangdong, China
  - College of Applied Technology, Shenzhen University, Shenzhen, 518060, Guangdong, China
- Zhongyi Xia
  - College of New Materials and New Energies, Shenzhen University of Technology, Shenzhen, 518118, Guangdong, China
  - College of Applied Technology, Shenzhen University, Shenzhen, 518060, Guangdong, China
- Man Zhou
  - College of New Materials and New Energies, Shenzhen University of Technology, Shenzhen, 518118, Guangdong, China
  - College of Applied Technology, Shenzhen University, Shenzhen, 518060, Guangdong, China
- Ling Bing Kong
  - College of New Materials and New Energies, Shenzhen University of Technology, Shenzhen, 518118, Guangdong, China
- Zengyuan Chen
  - College of New Materials and New Energies, Shenzhen University of Technology, Shenzhen, 518118, Guangdong, China
3
Liang J, Feng J, Lin Z, Wei J, Luo X, Wang QM, He B, Chen H, Ye Y. Research on prognostic risk assessment model for acute ischemic stroke based on imaging and multidimensional data. Front Neurol 2023; 14:1294723. [PMID: 38192576 PMCID: PMC10773779 DOI: 10.3389/fneur.2023.1294723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Accepted: 11/30/2023] [Indexed: 01/10/2024] Open
Abstract
Accurately assessing the prognostic outcomes of patients with acute ischemic stroke and adjusting treatment plans in a timely manner for those with poor prognosis is crucial for intervening in modifiable risk factors. However, there is still controversy regarding the correlation between imaging-based predictions of complications in acute ischemic stroke. To address this, we developed a cross-modal attention module for integrating multidimensional data, including clinical information, imaging features, treatment plans, prognosis, and complications, to achieve complementary advantages. The fused features preserve magnetic resonance imaging (MRI) characteristics while supplementing clinical relevant information, providing a more comprehensive and informative basis for clinical diagnosis and treatment. The proposed framework based on multidimensional data for activity of daily living (ADL) scoring in patients with acute ischemic stroke demonstrates higher accuracy compared to other state-of-the-art network models, and ablation experiments confirm the effectiveness of each module in the framework.
Affiliation(s)
- Jiabin Liang
  - Postgraduate Cultivation Base of Guangzhou University of Chinese Medicine, Panyu Central Hospital, Guangzhou, China
  - Graduate School, Guangzhou University of Chinese Medicine, Guangzhou, China
  - Medical Imaging Institute of Panyu, Guangzhou, China
- Jie Feng
  - Radiology Department of Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China
- Zhijie Lin
  - Laboratory for Intelligent Information Processing, Guangdong University of Technology, Guangzhou, China
- Jinbo Wei
  - Postgraduate Cultivation Base of Guangzhou University of Chinese Medicine, Panyu Central Hospital, Guangzhou, China
- Xun Luo
  - Kerry Rehabilitation Medicine Research Institute, Shenzhen, China
- Qing Mei Wang
  - Stroke Biological Recovery Laboratory, Spaulding Rehabilitation Hospital, Teaching Affiliate of Harvard Medical School, Charlestown, MA, United States
- Bingjie He
  - Panyu Health Management Center, Guangzhou, China
- Hanwei Chen
  - Postgraduate Cultivation Base of Guangzhou University of Chinese Medicine, Panyu Central Hospital, Guangzhou, China
  - Medical Imaging Institute of Panyu, Guangzhou, China
  - Panyu Health Management Center, Guangzhou, China
- Yufeng Ye
  - Postgraduate Cultivation Base of Guangzhou University of Chinese Medicine, Panyu Central Hospital, Guangzhou, China
  - Medical Imaging Institute of Panyu, Guangzhou, China
4
Kowsar I, Rabbani SB, Akhter KFB, Samad MD. Deep Clustering of Electronic Health Records Tabular Data for Clinical Interpretation. ... IEEE INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND PHOTONICS 2023; 2023:10.1109/ictp60248.2023.10490723. [PMID: 39027675 PMCID: PMC11255553 DOI: 10.1109/ictp60248.2023.10490723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/20/2024]
Abstract
Machine learning applications are widespread due to straightforward supervised learning of known data labels. Many data samples in real-world scenarios, including medicine, are unlabeled because data annotation can be time-consuming and error-prone. The application and evaluation of unsupervised clustering methods are not trivial and are limited to traditional methods (e.g., k-means) when clinicians demand deeper insights into patient data beyond classification accuracy. The contribution of this paper is three-fold: 1) to introduce a patient stratification strategy based on a clinical variable instead of a diagnostic label, 2) to evaluate clustering performance using within-cluster homogeneity and between-cluster statistical difference, and 3) to compare widely used traditional clustering algorithms (e.g., k-means) with a state-of-the-art deep learning solution for clustering tabular data. The deep clustering method achieves superior within-cluster homogeneity and between-cluster separation compared to k-means and identifies three statistically distinct and clinically interpretable high blood pressure patient clusters. The proposed clustering strategy and evaluation metrics will facilitate the stratification of large patient cohorts in health science research without requiring explicit diagnostic labels.
Affiliation(s)
- Ibna Kowsar
  - Department of Computer Science, Tennessee State University, Nashville, TN, United States
- Shourav B Rabbani
  - Department of Computer Science, Tennessee State University, Nashville, TN, United States
- Kazi Fuad B Akhter
  - Department of Computer Science and Engineering, Ahsanullah University of Science and Technology, Dhaka, Bangladesh
- Manar D Samad
  - Department of Computer Science, Tennessee State University, Nashville, TN, United States
5
Sun L, Yin C, Xu Q, Zhao W. Artificial intelligence for healthcare and medical education: a systematic review. Am J Transl Res 2023; 15:4820-4828. [PMID: 37560249 PMCID: PMC10408516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 07/03/2023] [Indexed: 08/11/2023]
Abstract
BACKGROUND: Human society has entered the age of artificial intelligence, and medical practice and medical education are undergoing profound changes. Artificial intelligence (AI) is now applied in many industries, particularly in healthcare and medical education, where it is deeply integrated. The purpose of this paper is to review the current situation and problems of "AI+medicine/medical education" and to provide our own perspective on the current predicament. METHODS: We searched the PubMed, Embase, Cochrane and CNKI databases to assess the literature on "AI+medicine/medical education" from 2017 to July 2022. The main inclusion criterion was literature describing the current situation or predicament of "AI+medicine/medical education". RESULTS: Studies have shown that the current application of AI in medical education is focused on clinical specialty training and continuing education, with the main application areas being radiology, diagnostics, surgery, cardiology, and dentistry. The main role is to assist physicians in improving their efficiency and accuracy. In addition, the field of combining AI with medicine/medical education is steadily expanding, and the most urgent need is for policy makers, experts in the medical field, AI and education, and experts in other fields to come together to reach consensus on ethical issues and develop regulatory standards. Our study also found that most medical students are positive about adding AI-related courses to the existing medical curriculum. Finally, the quality of research on "AI+medicine/medical education" is poor. CONCLUSION: In the context of the COVID-19 pandemic, our study provides an innovative systematic review of the latest "AI+medicine/medical curriculum". Since the AI+medicine curriculum is not yet regulated, we have made some suggestions.
Affiliation(s)
- Li Sun
  - Department of Neurology, Hongqi Hospital Affiliated to Mudanjiang Medical University, Mudanjiang 157011, Heilongjiang, China
  - Heilongjiang Key Laboratory of Ischemic Stroke Prevention and Treatment, Mudanjiang 157011, Heilongjiang, China
- Changhao Yin
  - Department of Neurology, Hongqi Hospital Affiliated to Mudanjiang Medical University, Mudanjiang 157011, Heilongjiang, China
  - Heilongjiang Key Laboratory of Ischemic Stroke Prevention and Treatment, Mudanjiang 157011, Heilongjiang, China
- Qiuling Xu
  - Department of Physiology, Mudanjiang Medical University, Mudanjiang 157011, Heilongjiang, China
- Weina Zhao
  - Department of Neurology, Hongqi Hospital Affiliated to Mudanjiang Medical University, Mudanjiang 157011, Heilongjiang, China
  - Heilongjiang Key Laboratory of Ischemic Stroke Prevention and Treatment, Mudanjiang 157011, Heilongjiang, China
6
Khalooei M, Mehdi Homayounpour M, Amirmazlaghani M. Layer-wise Regularized Adversarial Training using Layers Sustainability Analysis framework. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.03.043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
7
Kamath V, Renuka A. Deep Learning Based Object Detection for Resource Constrained Devices- Systematic Review, Future Trends and Challenges Ahead. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.02.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/12/2023]
8
Guo L, Mu S, Deng Y, Shi C, Yan B, Xiao Z. Efficient Binary Weight Convolutional Network Accelerator for Speech Recognition. SENSORS (BASEL, SWITZERLAND) 2023; 23:1530. [PMID: 36772567 PMCID: PMC9920974 DOI: 10.3390/s23031530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 01/26/2023] [Accepted: 01/27/2023] [Indexed: 06/18/2023]
Abstract
Speech recognition has progressed tremendously in the area of artificial intelligence (AI). However, the performance of the real-time offline Chinese speech recognition neural network accelerator for edge AI needs to be improved. This paper proposes a configurable convolutional neural network accelerator based on a lightweight speech recognition model, which can dramatically reduce hardware resource consumption while guaranteeing an acceptable error rate. For convolutional layers, the weights are binarized to reduce the number of model parameters and improve computational and storage efficiency. A multichannel shared computation (MCSC) architecture is proposed to maximize the reuse of weight and feature map data. The binary weight-sharing processing engine (PE) is designed to avoid limiting the number of multipliers. A custom instruction set is established according to the variable length of voice input to configure parameters for adapting to different network structures. Finally, the ping-pong storage method is used when the feature map is an input. We implemented this accelerator on Xilinx ZYNQ XC7Z035 under the working frequency of 150 MHz. The processing time for 2.24 s and 8 s of speech was 69.8 ms and 189.51 ms, respectively, and the convolution performance reached 35.66 GOPS/W. Compared with other computing platforms, accelerators perform better in terms of energy efficiency, power consumption and hardware resource consumption.
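Why binarized weights reduce multiplier use can be seen in a small sketch. It assumes one common binarization scheme (the sign of each weight plus a single scale equal to the mean absolute value); the abstract does not state which scheme the accelerator uses, and the function names and numbers here are hypothetical.

```python
def binarize_weights(weights):
    """Return (signs, scale): signs in {-1, +1}, scale = mean absolute weight."""
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, scale

def binary_dot(inputs, signs, scale):
    """Dot product using only additions and subtractions, then one multiply."""
    acc = 0.0
    for x, s in zip(inputs, signs):
        acc = acc + x if s > 0 else acc - x
    return acc * scale

# Illustrative values only.
weights = [0.5, -0.25, 0.75, -1.0]
inputs = [1.0, 2.0, 3.0, 4.0]
signs, scale = binarize_weights(weights)
print(signs, scale, binary_dot(inputs, signs, scale))
```

Each weight is stored as a single bit, and the per-element multiplications of an ordinary dot product are replaced by sign-controlled accumulation.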
9
Ragi S, Rahman MH, Duckworth J, Jawaharraj K, Chundi P, Gadhamshetty V. Artificial Intelligence-Driven Image Analysis of Bacterial Cells and Biofilms. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:174-184. [PMID: 34951852 DOI: 10.1109/tcbb.2021.3138304] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
The current study explores an artificial intelligence framework for measuring structural features from microscopy images of bacterial biofilms. Desulfovibrio alaskensis G20 (DA-G20) grown on mild steel surfaces is used as a model for sulfate-reducing bacteria that are implicated in microbiologically influenced corrosion problems. Our goal is to automate the process of extracting the geometrical properties of the DA-G20 cells from scanning electron microscopy (SEM) images, which is otherwise a laborious and costly process. These geometric properties are a biofilm phenotype that allows us to understand how the biofilm structurally adapts to the surface properties of the underlying metals, which can lead to better corrosion prevention solutions. We adapt two deep learning models: (a) a deep convolutional neural network (DCNN) model to achieve semantic segmentation of the cells, and (b) a mask region-convolutional neural network (Mask R-CNN) model to achieve instance segmentation of the cells. These models are then integrated with a moment invariants approach to measure the geometric characteristics of the segmented cells. Our numerical studies confirm that the Mask R-CNN and DCNN methods are 227x and 70x faster, respectively, than the traditional method of manual identification and measurement of the cell geometric properties by domain experts.
10
Abrar S, Samad MD. Perturbation of deep autoencoder weights for model compression and classification of tabular data. Neural Netw 2022; 156:160-169. [PMID: 36270199 PMCID: PMC9669225 DOI: 10.1016/j.neunet.2022.09.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2022] [Revised: 07/18/2022] [Accepted: 09/19/2022] [Indexed: 11/16/2022]
Abstract
Fully connected deep neural networks (DNN) often include redundant weights leading to overfitting and high memory requirements. Additionally, in tabular data classification, DNNs are challenged by the often superior performance of traditional machine learning models. This paper proposes periodic perturbations (prune and regrow) of DNN weights, especially at the self-supervised pre-training stage of deep autoencoders. The proposed weight perturbation strategy outperforms dropout learning or weight regularization (L1 or L2) for four out of six tabular data sets in downstream classification tasks. Unlike dropout learning, the proposed weight perturbation routine additionally achieves 15% to 40% sparsity across six tabular data sets, resulting in compressed pretrained models. The proposed pretrained model compression improves the accuracy of downstream classification, unlike traditional weight pruning methods that trade off performance for model compression. Our experiments reveal that a pretrained deep autoencoder with weight perturbation can outperform traditional machine learning in tabular data classification, whereas baseline fully-connected DNNs yield the worst classification accuracy. However, traditional machine learning models are superior to any deep model when a tabular data set contains uncorrelated variables. Therefore, the performance of deep models with tabular data is contingent on the types and statistics of constituent variables.
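The prune half of a prune-and-regrow cycle, and the sparsity figure it produces, can be sketched as follows. This assumes magnitude-based pruning, which the abstract does not confirm; the regrow rule is not described there and is left out. The function names and weight values are hypothetical.

```python
def prune_smallest(weights, fraction):
    """Return a copy with the smallest-magnitude `fraction` of weights set to 0.0."""
    k = int(len(weights) * fraction)
    # Indices of the k weights with the smallest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)

# Illustrative weights only.
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2, -0.03, 0.6, 0.1, -0.5]
pruned = prune_smallest(weights, 0.3)
print(pruned, sparsity(pruned))
```

In a periodic scheme, a step like this would be repeated during training, with some pruned connections allowed to become non-zero again between pruning steps.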
Affiliation(s)
- Sakib Abrar
  - Department of Computer Science, Tennessee State University, Nashville, TN 37209, United States
- Manar D Samad
  - Department of Computer Science, Tennessee State University, Nashville, TN 37209, United States
11
Otero-González I, Caeiro-Rodríguez M, Rodriguez-D’Jesus A. Methods for Gastrointestinal Endoscopy Quantification: A Focus on Hands and Fingers Kinematics. SENSORS (BASEL, SWITZERLAND) 2022; 22:9253. [PMID: 36501954 PMCID: PMC9741269 DOI: 10.3390/s22239253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Revised: 11/19/2022] [Accepted: 11/23/2022] [Indexed: 06/17/2023]
Abstract
Gastrointestinal endoscopy is a complex procedure requiring the mastery of several competencies and skills. This procedure is in increasing demand, but there exist important management and ethical issues regarding the training of new endoscopists. Nowadays, this requires the direct involvement of real patients and a high chance of the endoscopists themselves suffering from musculoskeletal conditions. Colonoscopy quantification can be useful for improving these two issues. This paper reviews the literature regarding efforts to quantify gastrointestinal procedures and focuses on the capture of hand and finger kinematics. Current technologies to support the capture of data from hand and finger movements are analyzed and tested, considering smart gloves and vision-based solutions. Manus VR Prime II and Stretch Sense MoCap reveal the main problems with smart gloves related to the adaptation of the gloves to different hand sizes and comfortability. Regarding vision-based solutions, Vero Vicon cameras show the main problem in gastrointestinal procedure scenarios: occlusion. In both cases, calibration and data interoperability are also key issues that limit possible applications. In conclusion, new advances are needed to quantify hand and finger kinematics in an appropriate way to support further developments.
Affiliation(s)
- Iván Otero-González
  - atlanTTic Research Center for Telecommunication Technologies, Universidade de Vigo, Campus-Universitario S/N, 36312 Vigo, Spain
- Manuel Caeiro-Rodríguez
  - atlanTTic Research Center for Telecommunication Technologies, Universidade de Vigo, Campus-Universitario S/N, 36312 Vigo, Spain
12
A Block-Based and Highly Parallel CNN Accelerator for Seed Sorting. JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING 2022. [DOI: 10.1155/2022/5608573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Abstract
Seed sorting is critical for the breeding industry to improve the agricultural yield. The seed sorting methods based on convolutional neural networks (CNNs) have achieved excellent recognition accuracy on large-scale pretrained network models. However, CNN inference is a computationally intensive process that often requires hardware acceleration to operate in real time. For embedded devices, the high-power consumption of graphics processing units (GPUs) is generally prohibitive, and the field programmable gate array (FPGA) becomes a solution to perform high-speed inference by providing a customized accelerator for a particular user. To date, the recognition speeds of the FPGA-based universal accelerators for high-throughput seed sorting tasks are slow, which cannot guarantee real-time seed sorting. Therefore, a block-based and highly parallel MobileNetV2 accelerator is proposed in this paper. First, a hardware-friendly quantization method that uses only fixed-point operation is designed to reduce resource consumption. Then, the block convolution strategy is proposed to avoid latency and energy consumption increase caused by large-scale intermediate result off-chip data transfers. Finally, two scalable computing engines are explicitly designed for depth-wise convolution (DWC) and point-wise convolution (PWC) to develop the high parallelism of block convolution computation. Moreover, an efficient memory system with a double buffering mechanism and new data reordering mode is designed to address the imbalance between memory access and parallel computing. Our proposed FPGA-based MobileNetV2 accelerator for real-time seed sorting is implemented and evaluated on the platform of Xilinx XC7020. Experimental results demonstrate that our implementation can achieve about 29.4 frames per second (FPS) and 10.86 Giga operations per second (GOPS), and 0.92× to 5.70 × DSP-efficiency compared with previous FPGA-based accelerators.
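The depth-wise (DWC) and point-wise (PWC) convolutions mentioned in the abstract are the two halves of a depth-wise separable convolution, and a short count of multiply-accumulate operations shows why that split is cheaper than a standard convolution. The sketch assumes stride 1 and an output feature map of the same height and width as the input; the layer dimensions are illustrative and not taken from the paper.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates for a k x k standard convolution on an h x w output."""
    return h * w * c_in * c_out * k * k

def depthwise_separable_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates for a k x k depth-wise conv followed by a 1 x 1 point-wise conv."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise

# Illustrative layer shape only.
standard = standard_conv_macs(56, 56, 32, 64, 3)
separable = depthwise_separable_macs(56, 56, 32, 64, 3)
ratio = separable / standard  # equals 1/c_out + 1/k**2
print(standard, separable, ratio)
```

The ratio depends only on the number of output channels and the kernel size, which is why separate compute engines for the two stages can each be sized to a much smaller workload.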
13
Multichannel KHMF for speech separation with enthalpy based DOA and score based CNN (SCNN). EVOLVING SYSTEMS 2022. [DOI: 10.1007/s12530-022-09473-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
14
Yang Y, Lv H, Chen N. A Survey on ensemble learning under the era of deep learning. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10283-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
15
Wall C, Powell D, Young F, Zynda AJ, Stuart S, Covassin T, Godfrey A. A deep learning-based approach to diagnose mild traumatic brain injury using audio classification. PLoS One 2022; 17:e0274395. [PMID: 36170287 PMCID: PMC9518857 DOI: 10.1371/journal.pone.0274395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2022] [Accepted: 08/26/2022] [Indexed: 11/19/2022] Open
Abstract
Mild traumatic brain injury (mTBI or concussion) is receiving increased attention due to the incidence in contact sports and limitations with subjective (pen and paper) diagnostic approaches. If an mTBI is undiagnosed and the athlete prematurely returns to play, it can result in serious short-term and/or long-term health complications. This demonstrates the importance of providing more reliable mTBI diagnostic tools to mitigate misdiagnosis. Accordingly, there is a need to develop reliable and efficient objective approaches with computationally robust diagnostic methods. Here in this pilot study, we propose the extraction of Mel Frequency Cepstral Coefficient (MFCC) features from audio recordings of speech that were collected from athletes engaging in rugby union who were diagnosed with an mTBI or not. These features were trained on our novel particle swarm optimised (PSO) bidirectional long short-term memory attention (Bi-LSTM-A) deep learning model. Little-to-no overfitting occurred during the training process, indicating strong reliability of the approach regarding the current test dataset classification results and future test data. Sensitivity and specificity to distinguish those with an mTBI were 94.7% and 86.2%, respectively, with an AUROC score of 0.904. This indicates a strong potential for the deep learning approach, with future improvements in classification results relying on more participant data and further innovations to the Bi-LSTM-A model to fully establish this approach as a pragmatic mTBI diagnostic tool.
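The sensitivity and specificity figures quoted above follow from a confusion matrix in the usual way. The sketch below shows the calculation with hypothetical counts; the abstract does not report the underlying counts, so the numbers here are not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true mTBI cases that were detected
    specificity = tn / (tn + fp)  # share of non-mTBI cases correctly cleared
    return sensitivity, specificity

# Hypothetical counts for illustration only.
sens, spec = sensitivity_specificity(tp=18, fn=2, tn=25, fp=5)
print(sens, spec)
```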
Affiliation(s)
- Conor Wall
  - Department of Computer and Information Sciences, Northumbria University, Newcastle upon Tyne, United Kingdom
- Dylan Powell
  - Department of Computer and Information Sciences, Northumbria University, Newcastle upon Tyne, United Kingdom
- Fraser Young
  - Department of Computer and Information Sciences, Northumbria University, Newcastle upon Tyne, United Kingdom
- Aaron J. Zynda
  - Department of Kinesiology, Michigan State University, East Lansing, Michigan, United States of America
- Sam Stuart
  - Department of Sport, Exercise and Rehabilitation, Northumbria University, Newcastle upon Tyne, United Kingdom
- Tracey Covassin
  - Department of Kinesiology, Michigan State University, East Lansing, Michigan, United States of America
- Alan Godfrey
  - Department of Computer and Information Sciences, Northumbria University, Newcastle upon Tyne, United Kingdom
16
Yang G. Research on Mental Health Monitoring Scheme of Migrant Children Based on Convolutional Neural Network Based on Deep Learning. Occup Ther Int 2022; 2022:2210820. [PMID: 36081739 PMCID: PMC9427310 DOI: 10.1155/2022/2210820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Revised: 06/10/2022] [Accepted: 06/14/2022] [Indexed: 11/18/2022] Open
Abstract
In recent years, with the acceleration of urbanization and the implementation of compulsory education, the pressure on students' study and life has increased, and the phenomenon of psychological and behavioral problems has become increasingly prominent. Therefore, the school has regarded students' mental health education as the top priority in teaching work. Effective expression classification can assist psychology researchers to study psychology and other disciplines and analyze children's psychological activities and mental states by classifying expressions, thereby reducing the occurrence of psychological behavior problems. Most of the current mainstream methods focus on the exploration of text explicit features and the optimization of representation models, and few works pay attention to deeper language expressions. Metaphors, as language expressions often used in daily life, are closely related to an individual's emotion, cognition, and psychological state. This paper studies children's smiling face recognition based on deep neural network. In order to obtain a better identification effect of mental health problems of children, this paper attempts to use multisource data, including consumption data, access control data, network logs, and grade data, and proposes a multisource data-based mental health problem identification algorithm. The main research focus is feature extraction, trying to use one-dimensional convolutional neural network (1D-CNN) to mine students' online patterns from online behavior sequences, calculate abnormal scores based on students' consumption data in the cafeteria, and describe the dietary differences among students. At the same time, this paper uses the students' psychological state data provided by the psychological center as a label to improve the deficiencies caused by the questionnaire. 
This paper uses the training set to train five common classification algorithms, evaluates them through the validation set, and selects the best classifier as our algorithm and uses it to identify students with mental health problems in the test set. The experimental results show that precision reaches 0.68, recall reaches 0.56, and F1-measure reaches 0.67.
Affiliation(s)
- Guangyan Yang
- School of Education, Xi'an University, Xi'an, Shaanxi 710065, China
|
17
|
Götz G, Falcón Pérez R, Schlecht SJ, Pulkki V. Neural network for multi-exponential sound energy decay analysis. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022; 152:942. [PMID: 36050155 DOI: 10.1121/10.0013416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2021] [Accepted: 07/18/2022] [Indexed: 06/15/2023]
Abstract
An established model for sound energy decay functions (EDFs) is the superposition of multiple exponentials and a noise term. This work proposes a neural-network-based approach for estimating the model parameters from EDFs. The network is trained on synthetic EDFs and evaluated on two large datasets of over 20 000 EDF measurements conducted in various acoustic environments. The evaluation shows that the proposed neural network architecture robustly estimates the model parameters from large datasets of measured EDFs while being lightweight and computationally efficient. An implementation of the proposed neural network is publicly available.
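The abstract does not spell out the decay model, so the following is only a minimal sketch of one common parameterization, assumed here for illustration: each exponential term falls by 60 dB over its decay time, and the noise term decreases linearly toward the end of the measurement. The function name `edf_model` and all parameter names are hypothetical and are not taken from the authors' published implementation.

```python
import math


def edf_model(t, amplitudes, decay_times, noise_level, duration):
    """Evaluate an assumed multi-exponential energy decay model at time t.

    amplitudes  -- linear amplitude of each exponential term (assumed name)
    decay_times -- decay time T_i in seconds of each term (assumed name)
    noise_level -- scale of the noise term (assumed name)
    duration    -- total length of the measurement in seconds (assumed name)
    """
    # ln(10^6): an energy drop of 60 dB over one decay time T_i.
    rate = math.log(1e6)
    decay = sum(
        a * math.exp(-rate * t / T) for a, T in zip(amplitudes, decay_times)
    )
    # Assumed noise contribution that shrinks linearly toward the end.
    noise = noise_level * (duration - t)
    return decay + noise


# Example: two decay slopes plus a small noise floor, sampled at t = 0.1 s.
value = edf_model(0.1, [1.0, 0.1], [0.5, 2.0], 1e-6, 3.0)
```

A parameter-estimation network of the kind described would take a curve like this as input and return the amplitudes, decay times, and noise level.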
Affiliation(s)
- Georg Götz
- Aalto Acoustics Lab, Department of Signal Processing and Acoustics, Aalto University, P.O. Box 13100, 00076 Aalto, Finland
- Ricardo Falcón Pérez
- Aalto Acoustics Lab, Department of Signal Processing and Acoustics, Aalto University, P.O. Box 13100, 00076 Aalto, Finland
- Sebastian J Schlecht
- Aalto Acoustics Lab, Department of Signal Processing and Acoustics, Aalto University, P.O. Box 13100, 00076 Aalto, Finland
- Ville Pulkki
- Aalto Acoustics Lab, Department of Signal Processing and Acoustics, Aalto University, P.O. Box 13100, 00076 Aalto, Finland
|
18
|
Sharma M, Joshi S, Chatterjee T, Hamid R. A comprehensive empirical review of modern voice activity detection approaches for movies and TV shows. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.084] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/18/2022]
|
19
|
Zou J, Zhang Y, Liu H, Ma L. Monogenic features based single sample face recognition by kernel sparse representation on multiple Riemannian manifolds. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.06.113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
20
|
Image classification based on quaternion-valued capsule network. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03849-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
21
|
Wang J, Mo W, Wu Y, Xu X, Li Y, Ye J, Lai X. Combined Channel Attention and Spatial Attention Module Network for Chinese Herbal Slices Automated Recognition. Front Neurosci 2022; 16:920820. [PMID: 35769703 PMCID: PMC9234258 DOI: 10.3389/fnins.2022.920820] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/15/2022] [Accepted: 05/16/2022] [Indexed: 11/13/2022] Open
Abstract
Chinese Herbal Slices (CHS) are critical components of Traditional Chinese Medicine (TCM), and accurate recognition of CHS is crucial for their use in medicine, production, and education. However, CHS recognition is currently performed mainly by experienced professionals, which may not meet the vast CHS market demand because the process is time-consuming and the number of professionals is limited. Although some automated CHS recognition approaches have been proposed, their performance still needs improvement because they are primarily based on traditional machine learning with hand-crafted features, resulting in relatively low accuracy. Additionally, few CHS datasets are available for research aimed at practical application. To address these problems, we propose a combined channel attention and spatial attention module network (CCSM-Net) for efficiently recognizing CHS from 2-D images. The CCSM-Net integrates channel and spatial attention, focusing on the most important information in a CHS image as well as the position of that information. In particular, pairs of max-pooling and average-pooling operations are used in the channel attention (CA) and spatial attention (SA) modules to aggregate the channel information of the feature map. We then constructed a dataset of 14,196 images covering 182 categories of commonly used CHS and evaluated our framework on it. Experimental results show that the proposed CCSM-Net performs promisingly and outperforms other typical deep learning algorithms, achieving a recognition rate of 99.27%, a precision of 99.33%, a recall of 99.27%, and an F1-score of 99.26% with different numbers of CHS categories.
Affiliation(s)
- Jianqing Wang
- School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Hangzhou, China
- Weitao Mo
- School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Hangzhou, China
- Yan Wu
- School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Hangzhou, China
- Xiaomei Xu
- School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Hangzhou, China
- Yi Li
- School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Hangzhou, China
- Jianming Ye
- First Affiliated Hospital, Gannan Medical University, Ganzhou, China
- Xiaobo Lai
- School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Hangzhou, China
- First Affiliated Hospital, Gannan Medical University, Ganzhou, China
|
22
|
Li G, Zhang J, Zhang M, Wu R, Cao X, Liu W. Efficient depthwise separable convolution accelerator for classification and UAV object detection. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.02.071] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/24/2022]
|
23
|
Attention-Based RU-BiLSTM Sentiment Analysis Model for Roman Urdu. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12073641] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/10/2022]
Abstract
Deep neural networks have emerged as a leading approach to many natural language processing (NLP) tasks. Deep networks initially conquered the problems of computer vision. However, sequential data such as text and sound proved difficult for such networks, because traditional deep networks do not reliably preserve contextual information. This may not harm the results in image processing, where sequence does not matter, but for text data such networks may produce very poor results. Moreover, establishing sentence semantics in colloquial text such as Roman Urdu is a challenge, and the sparsity and high dimensionality of the data in such informal text pose a further significant challenge. To overcome this problem, we propose a deep recurrent architecture, RU-BiLSTM, based on bidirectional LSTM (BiLSTM) coupled with word embedding and an attention mechanism for sentiment analysis of Roman Urdu. Our proposed model uses the bidirectional LSTM to preserve context in both directions and the attention mechanism to concentrate on the more important features. Finally, a dense softmax output layer produces the binary and ternary classification results. We empirically evaluated our model on two available Roman Urdu datasets, RUECD and RUSA-19. Our proposed model outperformed the baseline models on many grounds, achieving a significant improvement of 6% to 8% over them.
|
24
|
Liu J, Chen Y, Dong Z, Wang S, Calinon S, Li M, Chen F. Robot Cooking With Stir-Fry: Bimanual Non-Prehensile Manipulation of Semi-Fluid Objects. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3153728] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/06/2022]
|
25
|
Cheng Y, Wang C, Wu J, Zhu H, Lee C. Multi-dimensional recurrent neural network for remaining useful life prediction under variable operating conditions and multiple fault modes. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.108507] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/16/2022]
|
26
|
Ruan Q, Wu Q, Yao J, Wang Y, Tseng HW, Zhang Z. An Efficient Tongue Segmentation Model Based on U-Net Framework. INT J PATTERN RECOGN 2021. [DOI: 10.1142/s0218001421540355] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
Abstract
In intelligent tongue image processing, one of the most important tasks is to accurately segment the tongue body from a whole tongue image, and good-quality processing of the tongue body edge is of great significance for subsequent tongue feature extraction. To improve the performance of the segmentation model for tongue images, we propose an efficient tongue segmentation model based on U-Net. The work has three parts: optimizing the model's main network, designing a new network that specifically handles the tongue edge, and proposing a weighted binary cross-entropy loss function. The main segmentation network is optimized so that the model recognizes the foreground and background features of the tongue image as well as possible. A novel tongue edge segmentation network focuses on the tongue edge, because the edge contains much important information. Furthermore, the proposed loss function is adopted to enhance the pixel supervision corresponding to tongue images. Moreover, owing to a lack of tongue image resources in Traditional Chinese Medicine (TCM), some special measures are adopted to augment the training samples. Various comparative experiments on two datasets were conducted to verify the performance of the segmentation model. The experimental results indicate that the loss of our model converges faster than that of the others, and that our model segments tongue images from poor environments with better stability and robustness. The results also indicate that our model outperforms state-of-the-art models on the two most important tongue image segmentation indexes, IoU and Dice. Moreover, experimental results on the augmented samples demonstrate that our model performs better.
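The abstract names a weighted binary cross-entropy loss without giving its form. As a rough illustration only, the sketch below uses the generic variant in which foreground pixels are scaled by a single weight; the function name `weighted_bce`, the `pos_weight` parameter, and its default value are assumptions and may differ from the weighting the authors actually propose.

```python
import math


def weighted_bce(y_true, y_pred, pos_weight=2.0, eps=1e-7):
    """Mean weighted binary cross-entropy over a flat list of pixels.

    y_true     -- ground-truth labels, 1 for tongue pixels and 0 for background
    y_pred     -- predicted foreground probabilities in [0, 1]
    pos_weight -- assumed scalar weight applied to foreground pixels
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clamp the probability so log() never receives 0.
        p = min(max(p, eps), 1.0 - eps)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Example: a missed foreground pixel costs more than a missed background pixel.
loss_fg = weighted_bce([1], [0.1])
loss_bg = weighted_bce([0], [0.9])
```

With `pos_weight` above 1, errors on tongue pixels are penalized more heavily than errors on background pixels, which is one simple way to strengthen pixel-level supervision for the foreground class.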
Affiliation(s)
- Qunsheng Ruan
- School of Informatics, Xiamen University, Xiamen, Fujian, P. R. China
- Qingfeng Wu
- School of Informatics, Xiamen University, Xiamen, Fujian, P. R. China
- Junfeng Yao
- School of Informatics, Xiamen University, Xiamen, Fujian, P. R. China
- Yingdong Wang
- School of Informatics, Xiamen University, Xiamen, Fujian, P. R. China
- Hsien-Wei Tseng
- College of Mathematics and Information Engineering, Longyan University, Longyan, Fujian, P. R. China
- Zhiling Zhang
- School of Information, Mechanical and Electrical Engineering, Ningde Normal University, Ningde, Fujian, P. R. China
|
27
|
Technologies for Multimodal Interaction in Extended Reality—A Scoping Review. MULTIMODAL TECHNOLOGIES AND INTERACTION 2021. [DOI: 10.3390/mti5120081] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022] Open
Abstract
When designing extended reality (XR) applications, it is important to consider multimodal interaction techniques, which employ several human senses simultaneously. Multimodal interaction can transform how people communicate remotely, practice for tasks, entertain themselves, process information visualizations, and make decisions based on the provided information. This scoping review summarized recent advances in multimodal interaction technologies for head-mounted display-based (HMD) XR systems. Our purpose was to provide a succinct, yet clear, insightful, and structured overview of emerging, underused multimodal technologies beyond standard video and audio for XR interaction, and to find research gaps. The review aimed to help XR practitioners to apply multimodal interaction techniques and interaction researchers to direct future efforts towards relevant issues on multimodal XR. We conclude with our perspective on promising research avenues for multimodal interaction technologies.
|
28
|
Krumm D, Kuske N, Neubert M, Buder J, Hamker F, Odenwald S. Determining push-off forces in speed skating imitation drills. SPORTS ENGINEERING 2021. [DOI: 10.1007/s12283-021-00362-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
Speed skating takes place on ice rinks and is, therefore, dependent on seasonal conditions. To allow year-round training, the summer months, when no ice rinks are available, consist mainly of athletics and endurance training as well as imitation drills. Imitation drills are exercises, e.g. on a slide board, which imitate the actual skating movement. To objectively evaluate the quality of the execution of these exercises, key performance indicators such as push-off forces need to be quantified. The aim of this work was to determine the push-off forces during speed skating imitation drills using pressure insoles in combination with machine-learning methods. A slide board is usually not instrumented. Here, the slide board was equipped with force plates to record the target variables, i.e. the push-off forces. The input variables to determine the push-off forces were recorded using plantar pressure insoles and triaxial accelerometers. Seven participants took part in the study. Two different machine-learning algorithms were compared: a non-linear deep neural network model and a linear multiple variable regression model. The models were trained using the obtained force–time curves. The linear regression model proved sufficient to predict the push-off forces. The relative difference between the measured and modelled maximum push-off force remained below 5%. This approach, based on a mobile and low-cost measurement system, allows a quantitative analysis of the athlete’s technique/performance. Therefore, we expect the instrument to be a helpful tool for the training of speed skaters.
|
29
|
Kozak J, Kania K, Juszczuk P, Mitręga M. Swarm intelligence goal-oriented approach to data-driven innovation in customer churn management. INTERNATIONAL JOURNAL OF INFORMATION MANAGEMENT 2021. [DOI: 10.1016/j.ijinfomgt.2021.102357] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022]
|
30
|
Arefin R, Samad MD, Akyelken FA, Davanian A. Non-transfer Deep Learning of Optical Coherence Tomography for Post-hoc Explanation of Macular Disease Classification. IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS 2021; 2021:48-52. [PMID: 36168324 PMCID: PMC9511893 DOI: 10.1109/ichi52183.2021.00020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/14/2023]
Abstract
Deep transfer learning is a popular choice for classifying monochromatic medical images using models that are pretrained by natural images with color channels. This choice may introduce unnecessarily redundant model complexity that can limit explanations of such model behavior and outcomes in the context of medical imaging. To investigate this hypothesis, we develop a configurable deep convolutional neural network (CNN) to classify four macular disease conditions using retinal optical coherence tomography (OCT) images. Our proposed non-transfer deep CNN model (acc: 97.9%) outperforms existing transfer learning models such as ResNet-50 (acc: 89.0%), ResNet-101 (acc: 96.7%), VGG-19 (acc: 93.3%), Inception-V3 (acc: 95.8%) in the same retinal OCT image classification task. We perform post-hoc analysis of the trained model and model extracted image features, which reveals that only eight out of 256 filter kernels are active at our final convolutional layer. The convolutional responses of these selective eight filters yield image features that efficiently separate four macular disease classes even when projected onto two-dimensional principal component space. Our findings suggest that many deep learning parameters and their computations are redundant and expensive for retinal OCT image classification, which are expected to be more intense when using transfer learning. Additionally, we provide clinical interpretations of our misclassified test images identifying manifest artifacts, shadowing of useful texture, false texture representing fluids, and other confounding factors. These clinical explanations along with model optimization via kernel selection can improve the classification accuracy, computational costs, and explainability of model outcomes.
Affiliation(s)
- Raisul Arefin
- Dept. of Computer Science Auburn University Auburn, AB USA
- Manar D Samad
- Dept. of Computer Science Tennessee State University Nashville, TN USA
- Furkan A Akyelken
- Dept. of Computer Science Tennessee State University Nashville, TN, USA
- Arash Davanian
- Vanderbilt Eye Institute Vanderbilt University Medical Center Nashville, TN, USA
|
31
|
Designing convolutional neural networks with constrained evolutionary piecemeal training. APPL INTELL 2021. [DOI: 10.1007/s10489-021-02679-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Abstract
The automated architecture search methodology for neural networks is known as Neural Architecture Search (NAS). In recent times, Convolutional Neural Networks (CNNs) designed through NAS methodologies have achieved very high performance in several fields, for instance image classification and natural language processing. Our work is in the same domain of NAS, where we traverse the search space of neural network architectures with the help of an evolutionary algorithm which has been augmented with a novel approach of piecemeal-training. In contrast to the previously published NAS techniques, wherein the training with given data is considered an isolated task to estimate the performance of neural networks, our work demonstrates that a neural network architecture and the related weights can be jointly learned by combining concepts of the traditional training process and evolutionary architecture search in a single algorithm. The consolidation has been realised by breaking down the conventional training technique into smaller slices and collating them together with an integrated evolutionary architecture search algorithm. The constraints on architecture search space are placed by limiting its various parameters within a specified range of values, consequently regulating the neural network’s size and memory requirements. We validate this concept on two vastly different datasets, namely, the CIFAR-10 dataset in the domain of image classification, and the PAMAP2 dataset in the Human Activity Recognition (HAR) domain. Starting from randomly initialized and untrained CNNs, the algorithm discovers models with competent architectures, which after complete training, reach an accuracy of 92.5% for CIFAR-10 and 94.36% for PAMAP2. We further extend the algorithm to include an additional conflicting search objective: the number of parameters of the neural network. Our multi-objective algorithm produces a Pareto optimal set of neural networks, by optimizing the search for both the accuracy and the parameter count, thus emphasizing the versatility of our approach.
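To make the notion of a Pareto optimal set over accuracy and parameter count concrete, here is a small sketch of a generic non-dominated filter. It is not the authors' evolutionary algorithm; the function name `pareto_front` and the candidate models with their numbers are invented purely for illustration.

```python
def pareto_front(models):
    """Return the models that no other model dominates.

    Each model is a (name, accuracy, n_params) tuple. A model dominates
    another if it is at least as accurate with no more parameters, and is
    strictly better on at least one of the two objectives.
    """
    front = []
    for name, acc, params in models:
        dominated = any(
            (a >= acc and p <= params) and (a > acc or p < params)
            for _, a, p in models
        )
        if not dominated:
            front.append((name, acc, params))
    return front


# Hypothetical candidates: (name, accuracy, parameter count).
candidates = [
    ("A", 0.920, 5_000_000),
    ("B", 0.900, 1_000_000),
    ("C", 0.890, 2_000_000),  # less accurate and larger than B
    ("D", 0.925, 8_000_000),
]
front = pareto_front(candidates)
```

In this made-up example, model C drops out because B is both more accurate and smaller, while A, B, and D each represent a different trade-off between accuracy and size.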
|
32
|
|
33
|
A Deep Neural Network for Accurate and Robust Prediction of the Glass Transition Temperature of Polyhydroxyalkanoate Homo- and Copolymers. MATERIALS 2020; 13:ma13245701. [PMID: 33327598 PMCID: PMC7765086 DOI: 10.3390/ma13245701] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/15/2020] [Revised: 12/06/2020] [Accepted: 12/09/2020] [Indexed: 12/20/2022]
Abstract
The purpose of this study was to develop a data-driven machine learning model to predict the performance properties of polyhydroxyalkanoates (PHAs), a group of biosourced polyesters featuring excellent performance, to guide future design and synthesis experiments. A deep neural network (DNN) machine learning model was built for predicting the glass transition temperature, Tg, of PHA homo- and copolymers. Molecular fingerprints were used to capture the structural and atomic information of PHA monomers. The other input variables included the molecular weight, the polydispersity index, and the percentage of each monomer in the homo- and copolymers. The results indicate that the DNN model achieves high accuracy in estimating the glass transition temperature of PHAs. In addition, the symmetry of the DNN model is ensured by incorporating symmetry data in the training process. The DNN model achieved better performance than the support vector machine (SVM), a nonlinear ML model, and the least absolute shrinkage and selection operator (LASSO), a sparse linear regression model. The relative importance of the factors affecting the DNN model prediction was analyzed. The sensitivity of the DNN model, including strategies to deal with missing data, was also investigated. Compared with commonly used machine learning models incorporating quantitative structure-property relationships (QSPR), the DNN model does not require an explicit descriptor selection step but shows comparable performance. The machine learning model framework can be readily extended to predict other properties.
|