1. Temporal prediction of suicidal ideation in an ecological momentary assessment study with recurrent neural networks. J Affect Disord 2024:S0165-0327(24)00829-2. [PMID: 38795778] [DOI: 10.1016/j.jad.2024.05.093]
Abstract
INTRODUCTION Ecological Momentary Assessment (EMA) holds promise for providing insights into daily life experiences when studying mental health phenomena. However, commonly used mixed-effects linear statistical models do not fully utilize the richness of the multidimensional time-varying data that EMA yields. Recurrent Neural Networks (RNNs) provide an alternative data analytic method to leverage more information and potentially improve prediction, particularly for non-normally distributed outcomes.
METHODS As part of a broader research study of suicidal thoughts and behavior in people with borderline personality disorder (BPD), eighty-four participants engaged in EMA data collection over one week, answering questions multiple times each day about suicidal ideation (SI), stressful events, coping strategy use, and affect. RNNs and mixed-effects linear regression models (MEMs) were trained and used to predict SI. Root mean squared error (RMSE), mean absolute percent error (MAPE), and a pseudo-R2 accuracy metric were used to compare SI prediction accuracy between the two modeling methods.
RESULTS RNNs had superior accuracy metrics (full model: RMSE = 3.41, MAPE = 42 %, pseudo-R2 = 26 %) compared with MEMs (full model: RMSE = 3.84, MAPE = 56 %, pseudo-R2 = 16 %). Importantly, RNNs showed significantly more accurate prediction at higher values of SI. Additionally, RNNs predicted, with significantly higher accuracy, the SI scores of participants with depression diagnoses and of participants with higher depression scores at baseline.
CONCLUSION In this EMA study with a moderately sized sample, RNNs were better able to learn and predict daily SI compared with mixed-effects models. RNNs should be considered as an option for EMA analysis.
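The three accuracy metrics used to compare the RNNs and MEMs above can be sketched in a few lines. The pseudo-R2 is implemented here as 1 - SSE/SST, which is one common definition and an assumption about the paper's exact metric:

```python
import math

def rmse(y_true, y_pred):
    # Root mean squared error
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    # Mean absolute percent error (undefined when a true value is 0)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def pseudo_r2(y_true, y_pred):
    # Assumed definition: 1 - SSE / SST (proportional reduction in error)
    mean_y = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - sse / sst
```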
2. Assessing the quality of experience in wireless networks for multimedia applications: A comprehensive analysis utilizing deep learning-based techniques. Heliyon 2024; 10:e30351. [PMID: 38726158] [PMCID: PMC11079109] [DOI: 10.1016/j.heliyon.2024.e30351]
Abstract
With the rapid progress of wireless network technology and the escalating demand for mobile Internet multimedia transmission services, preserving and improving user satisfaction has become an imperative concern. This requires a sophisticated and accurate evaluation of multimedia service quality in wireless networks. To address the issue of user experience quality systematically, the present study introduces a novel method for evaluating multimedia Quality of Experience (QoE) in wireless networks, employing an advanced deep learning model as the underlying analytical framework. The research first models the video session process, considering the status of each temporal interval within the session. The QoE prediction problem is then investigated through the lens of recurrent neural networks (RNNs), culminating in a comprehensive QoE prediction model that integrates video information, Quality of Service (QoS) data, user behavior analytics, and facial expression analysis. The empirical part of this research validates the proposed video QoE evaluation method with quantitative and qualitative comparisons against contemporary state-of-the-art QoE models on the RTVCQoE dataset. The experimental findings show that the proposed QoE model surpasses competing models on performance metrics such as PLCC, SRCC, and KRCC. This investigation thus furnishes an exacting and dependable QoE evaluation methodology that improves the user experience in multimedia services within wireless networks and encourages further scholarly exploration and technological innovation in the mobile Internet domain.
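The three correlation metrics used for evaluation (PLCC, SRCC, KRCC) can be sketched with plain-Python implementations. This version ignores tied ranks, so it is a simplification of the usual definitions:

```python
def plcc(x, y):
    # Pearson linear correlation coefficient
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def _ranks(v):
    # Rank positions 1..n (no tie handling in this sketch)
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    for rank, i in enumerate(order):
        r[i] = float(rank + 1)
    return r

def srcc(x, y):
    # Spearman rank correlation: Pearson correlation of the ranks
    return plcc(_ranks(x), _ranks(y))

def krcc(x, y):
    # Kendall rank correlation: (concordant - discordant) / total pairs
    n = len(x)
    c = d = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                c += 1
            elif s < 0:
                d += 1
    return (c - d) / (n * (n - 1) / 2)
```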
3. A systematic review on diabetic retinopathy detection and classification based on deep learning techniques using fundus images. PeerJ Comput Sci 2024; 10:e1947. [PMID: 38699206] [PMCID: PMC11065411] [DOI: 10.7717/peerj-cs.1947]
Abstract
Diabetic retinopathy (DR) is the leading cause of visual impairment globally. It occurs due to long-term diabetes with fluctuating blood glucose levels. It has become a significant concern for people in the working age group as it can lead to vision loss in the future. Manual examination of fundus images is time-consuming and requires much effort and expertise to determine the severity of the retinopathy. To diagnose and evaluate the disease, deep learning-based technologies have been used; these analyze blood vessels, microaneurysms, exudates, the macula, optic discs, and hemorrhages, and are also used for initial detection and grading of DR. This study examines the fundamentals of diabetes, its prevalence, complications, and treatment strategies that use artificial intelligence methods such as machine learning (ML), deep learning (DL), and federated learning (FL). The research covers future studies, performance assessments, biomarkers, screening methods, and current datasets. Various neural network designs, including recurrent neural networks (RNNs), generative adversarial networks (GANs), and applications of ML, DL, and FL in the processing of fundus images, such as convolutional neural networks (CNNs) and their variations, are thoroughly examined. Potential research directions, such as developing DL models and incorporating heterogeneous data sources, are also outlined. Finally, the challenges and future directions of this research are discussed.
4. RM-GPT: Enhance the comprehensive generative ability of molecular GPT model via LocalRNN and RealFormer. Artif Intell Med 2024; 150:102827. [PMID: 38553166] [DOI: 10.1016/j.artmed.2024.102827]
Abstract
Due to surging costs, artificial intelligence-assisted de novo drug design has supplanted conventional methods and become an emerging option for drug discovery. Although there are many successful examples of applying generative models to the molecular field, these methods struggle with conditional generation that meets chemists' practical requirements, which call for a controllable process to generate new molecules, or to optimize base molecules, with appointed conditions. To address this problem, a Recurrent Molecular Generative Pretrained Transformer model, supplemented by LocalRNN and a Residual Attention Layer Transformer and referred to as RM-GPT, is proposed. RM-GPT rebuilds the GPT architecture by incorporating LocalRNN and the Residual Attention Layer Transformer so that it can extract local information and build connectivity between attention blocks. Incorporating the Transformer in these two modules leverages the parallel computing advantages of multi-head attention mechanisms while extracting local structural information effectively. Through exploring and learning in a large chemical space, RM-GPT acquires the ability to generate drug-like molecules with in-demand conditions, such as desired properties and scaffolds, precisely and stably. RM-GPT achieved better results than SOTA methods on conditional generation.
5. Predicting drug activity against cancer through genomic profiles and SMILES. Artif Intell Med 2024; 150:102820. [PMID: 38553160] [DOI: 10.1016/j.artmed.2024.102820]
Abstract
Due to the constant increase in cancer rates, the disease has become a leading cause of death worldwide, enhancing the need for its detection and treatment. In the era of personalized medicine, the main goal is to incorporate individual variability in order to choose more precisely which therapy and prevention strategies suit each person. However, predicting the sensitivity of tumors to anticancer treatments remains a challenge. In this work, we propose two deep neural network models to predict the impact of anticancer drugs in tumors through the half-maximal inhibitory concentration (IC50). These models join biological and chemical data to apprehend relevant features of the genetic profile and the drug compounds, respectively. In order to predict the drug response in cancer cell lines, this study employed different DL methods, resorting to Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs). In the first stage, two autoencoders were pre-trained with high-dimensional gene expression and mutation data of tumors. Afterward, this genetic background is transferred to the prediction models that return the IC50 value, which portrays the potency of a substance in inhibiting a cancer cell line. When comparing RSEM expected counts and TPM as methods for representing gene expression data, RSEM was shown to perform better in deep models, and the CNN model obtained better insight into these types of data. Moreover, the obtained results reflect the effectiveness of the extracted deep representations in predicting the IC50 value, achieving a mean squared error of 1.06 and surpassing previous state-of-the-art models.
6. Robust diagnosis recommendation system for Primary Care Telemedicine using long short-term memory multi-class sequence classification. Heliyon 2024; 10:e26770. [PMID: 38510056] [PMCID: PMC10950495] [DOI: 10.1016/j.heliyon.2024.e26770]
Abstract
Background Telemedicine offers an opportunity for robust diagnosis recommendations that support healthcare providers intra-consultation in a way that does not limit providers' ability to explore diagnostic codes and make the most appropriate selection for each consultation.
Objective The objective of this work was to develop a recommendation system for ICD-10 coding using multi-class sequence classification and deep learning. The recommendations are intended to support telemedicine clinicians in making timely and appropriate diagnosis selections, allowing clinicians to find and select the best diagnosis code much more quickly and without leaving the telemedicine platform to search codes and code descriptions.
Methods We developed an LSTM model for multi-class text sequence classification to make diagnosis recommendations. The LSTM recommender used text-based symptoms, complaints, and consultation request reasons as model inputs. Data were extracted from a live telemedicine platform which spans the general medicine, dermatology, and mental health clinical specialties. A popularity-based model was used for baseline comparison.
Results Using over 2.8 million telemedicine consultations during 2021 and 2022, our LSTM recommender's average accuracy was 31.7%. LSTM recommender average coverage in the top 20 recommended diagnoses was 85.8%, with an average personalization score of 0.87.
Conclusions LSTM multi-class sequence classification recommends diagnoses specific to individual consultations, is retrainable at regular intervals, and could improve diagnosis recommendations such that providers require less time and fewer resources searching for diagnosis codes. In addition, the LSTM recommender is robust enough to make recommendations across clinical specialties such as general medicine, dermatology, and mental health.
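The coverage and personalization metrics reported above can be sketched as follows. The personalization score is implemented here as one minus the mean pairwise overlap of top-k lists, a common convention that may differ from the paper's exact formula:

```python
def top_k_coverage(true_codes, recommendations, k=20):
    # Fraction of consultations whose true ICD-10 code appears
    # in the top-k recommended codes.
    hits = sum(1 for t, recs in zip(true_codes, recommendations) if t in recs[:k])
    return hits / len(true_codes)

def personalization(recommendations, k=20):
    # 1 - mean pairwise overlap between consultations' top-k lists
    # (an assumed definition; the paper's exact score may differ).
    lists = [set(r[:k]) for r in recommendations]
    n = len(lists)
    overlaps = []
    for i in range(n):
        for j in range(i + 1, n):
            overlaps.append(len(lists[i] & lists[j]) / k)
    return 1.0 - sum(overlaps) / len(overlaps)
```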
7. Reinvent 4: Modern AI-driven generative molecule design. J Cheminform 2024; 16:20. [PMID: 38383444] [PMCID: PMC10882833] [DOI: 10.1186/s13321-024-00812-5]
Abstract
REINVENT 4 is a modern open-source generative AI framework for the design of small molecules. The software utilizes recurrent neural networks and transformer architectures to drive molecule generation. These generators are seamlessly embedded within general machine learning optimization algorithms: transfer learning, reinforcement learning and curriculum learning. REINVENT 4 enables and facilitates de novo design, R-group replacement, library design, linker design, scaffold hopping and molecule optimization. This contribution gives an overview of the software and describes its design. Algorithms and their applications are discussed in detail. REINVENT 4 is a command line tool which reads a user configuration in either TOML or JSON format. The aim of this release is to provide reference implementations for some of the most common algorithms in AI-based molecule generation. An additional goal of the release is to create a framework for education and future innovation in AI-based molecular design. The software is available from https://github.com/MolecularAI/REINVENT4 and released under the permissive Apache 2.0 license. Scientific contribution: The software provides an open-source reference implementation for generative molecular design, and it is also being used in production to support in-house drug discovery projects. The publication of the most common machine learning algorithms in one code base, with full documentation thereof, will increase the transparency of AI and foster innovation, collaboration and education.
8. A physics-informed long short-term memory (LSTM) model for estimating ammonia emissions from dairy manure during storage. Sci Total Environ 2024; 912:168885. [PMID: 38036129] [DOI: 10.1016/j.scitotenv.2023.168885]
Abstract
Manure management on dairy farms impacts how farmers maximize its value as fertilizer, reduce operating costs, and minimize environmental pollution potential. A persistent challenge on many farms is minimizing ammonia losses through volatilization during storage to maintain manure nitrogen content. Knowing the quantities of emitted pollutants is at the core of designing and improving mitigation strategies for livestock operations. Although process-based models have improved the accuracy of estimating ammonia emissions, complex systems such as manure storage remain difficult to model because some of the underlying science is still unresolved. This study presents a novel physics-informed long short-term memory (PI-LSTM) modeling approach combining traditional process-based models with recurrent neural networks to estimate ammonia loss from dairy manure during storage. The method entails inverse modeling to optimize hyperparameters to improve the accuracy of estimating physicochemical properties pertinent to ammonia's transport and surface emissions. The study used open data sets from two on-farm studies on liquid dairy manure storage in Switzerland and Indiana, U.S.A. The root mean square errors were 1.51 g m-2 h-1 for the PI-LSTM model, 3.01 g m-2 h-1 for the base compartmental process-based (Base-CPBM) model, and 2.17 g m-2 h-1 for the hyperparameter-tuned compartmental process-based (HT-CPBM) model. In addition, the PI-LSTM model outperformed the Base-CPBM and the HT-CPBM models by 20 to 80 % during summer and spring, when most annual ammonia emissions occur. The study demonstrated that incorporating physical knowledge into machine learning models improves generalization accuracy. The outcomes of this study provide the scientific basis to improve policymaking decisions and the design of suitable on-farm strategies to minimize manure nutrient losses on dairy farms during storage periods.
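The general idea of a physics-informed loss, a data-fit term plus a penalty on residuals of the governing process equations, can be sketched as below. This is a generic PINN-style formulation, not the paper's exact PI-LSTM objective:

```python
def physics_informed_loss(y_obs, y_pred, physics_residuals, lam=0.1):
    # Data-fit term: mean squared error against observed emissions.
    data_loss = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / len(y_obs)
    # Physics term: penalize violations of the process-based governing
    # equations, evaluated as residuals at the model predictions.
    phys_loss = sum(r ** 2 for r in physics_residuals) / len(physics_residuals)
    # lam balances fitting the data against honoring the physics.
    return data_loss + lam * phys_loss
```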
9. A recurrent Hopfield network for estimating meso-scale effective connectivity in MEG. Neural Netw 2024; 170:72-93. [PMID: 37977091] [DOI: 10.1016/j.neunet.2023.11.027]
Abstract
The architecture of communication within the brain, represented by the human connectome, has gained a paramount role in the neuroscience community. Several features of this communication, e.g., the frequency content, spatial topology, and temporal dynamics are currently well established. However, identifying generative models providing the underlying patterns of inhibition/excitation is very challenging. To address this issue, we present a novel generative model to estimate large-scale effective connectivity from MEG. The dynamic evolution of this model is determined by a recurrent Hopfield neural network with asymmetric connections, and thus denoted Recurrent Hopfield Mass Model (RHoMM). Since RHoMM must be applied to binary neurons, it is suitable for analyzing Band Limited Power (BLP) dynamics following a binarization process. We trained RHoMM to predict the MEG dynamics through a gradient descent minimization and we validated it in two steps. First, we showed a significant agreement between the similarity of the effective connectivity patterns and that of the interregional BLP correlation, demonstrating RHoMM's ability to capture individual variability of BLP dynamics. Second, we showed that the simulated BLP correlation connectomes, obtained from RHoMM evolutions of BLP, preserved some important topological features, e.g., the centrality of the real data, assuring the reliability of RHoMM. Compared to other biophysical models, RHoMM is based on recurrent Hopfield neural networks, thus, it has the advantage of being data-driven, less demanding in terms of hyperparameters and scalable to encompass large-scale system interactions. These features are promising for investigating the dynamics of inhibition/excitation at different spatial scales.
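A synchronous update of a binary Hopfield network with asymmetric weights can be sketched in a few lines. Unlike the classical symmetric case, asymmetric connections (as in RHoMM) can produce limit cycles rather than only fixed points, which is what makes them useful for modeling directed effective connectivity:

```python
def hopfield_step(W, s):
    # One synchronous update of binary (+1/-1) neurons.
    # W need not be symmetric: W[i][j] is the directed influence of
    # neuron j on neuron i.
    n = len(s)
    return [1 if sum(W[i][j] * s[j] for j in range(n)) >= 0 else -1
            for i in range(n)]

def run(W, s, steps):
    # Iterate the map for a fixed number of synchronous steps.
    for _ in range(steps):
        s = hopfield_step(W, s)
    return s

# A 2-neuron network with purely asymmetric coupling: neuron 0 excites
# neuron 1's input while neuron 1 inhibits neuron 0's input, producing
# a period-4 limit cycle [1,1] -> [1,-1] -> [-1,-1] -> [-1,1] -> [1,1].
W = [[0, 1], [-1, 0]]
```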
10. Decomposition aided attention-based recurrent neural networks for multistep ahead time-series forecasting of renewable power generation. PeerJ Comput Sci 2024; 10:e1795. [PMID: 38259888] [PMCID: PMC10803097] [DOI: 10.7717/peerj-cs.1795]
Abstract
Renewable energy plays an increasingly important role in our future. As fossil fuels become more difficult to extract and effectively process, renewables offer a solution to the ever-increasing energy demands of the world. However, the shift toward renewable energy is not without challenges. While fossil fuels offer a more reliable means of energy storage that can be converted into usable energy, renewables are more dependent on external factors used for generation. Efficient storage of renewables is more difficult, often relying on batteries that have a limited number of charge cycles. A robust and efficient system for forecasting power generation from renewable sources can help alleviate some of the difficulties associated with the transition toward renewable energy. Therefore, this study proposes an attention-based recurrent neural network approach for forecasting power generated from renewable sources. To help the networks make more accurate forecasts, decomposition techniques are applied to the time series, and a modified metaheuristic is introduced to optimize the hyperparameter values of the utilized networks. This approach has been tested on two real-world renewable energy datasets covering both solar and wind farms. The models generated by the introduced metaheuristics were compared with those produced by other state-of-the-art optimizers in terms of standard regression metrics and statistical analysis. Finally, the best-performing model was interpreted using SHapley Additive exPlanations.
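One simple decomposition of the kind referenced, splitting a series into a smooth trend and a residual before forecasting each part separately, can be sketched as below. This centered moving-average split is illustrative only; the study's actual decomposition technique may be more elaborate:

```python
def decompose(series, window=3):
    # Split a series into a smooth trend (centered moving average,
    # shrinking the window at the edges) and a residual component.
    half = window // 2
    trend = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    residual = [y - t for y, t in zip(series, trend)]
    # trend[i] + residual[i] reconstructs series[i] exactly.
    return trend, residual
```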
11. Contextually enhanced ES-dRNN with dynamic attention for short-term load forecasting. Neural Netw 2024; 169:660-672. [PMID: 37972510] [DOI: 10.1016/j.neunet.2023.11.017]
Abstract
In this paper, we propose a new short-term load forecasting (STLF) model based on a contextually enhanced hybrid and hierarchical architecture combining exponential smoothing (ES) and a recurrent neural network (RNN). The model is composed of two simultaneously trained tracks: the context track and the main track. The context track introduces additional information to the main track. It is extracted from representative series and dynamically modulated to adjust to the individual series forecasted by the main track. The RNN architecture consists of multiple recurrent layers stacked with hierarchical dilations and equipped with recently proposed attentive dilated recurrent cells. These cells enable the model to capture short-term, long-term and seasonal dependencies across time series as well as to weight the input information dynamically. The model produces both point forecasts and predictive intervals. The experimental part of the work, performed on 35 forecasting problems, shows that the proposed model outperforms its predecessor as well as standard statistical models and state-of-the-art machine learning models in terms of accuracy.
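The exponential-smoothing half of such a hybrid can be sketched as a running level that the series is normalized by before being fed to the RNN. This is a minimal single-parameter illustration, not the paper's full ES formulation:

```python
def exponential_smoothing_level(y, alpha=0.3):
    # Running level l_t = alpha * y_t + (1 - alpha) * l_{t-1}.
    # In ES-RNN-style hybrids the RNN is trained on the series
    # normalized by this level (y_t / l_t), so the network sees
    # level-free dynamics and the ES part carries the scale.
    levels = [y[0]]
    for t in range(1, len(y)):
        levels.append(alpha * y[t] + (1 - alpha) * levels[-1])
    return levels

def normalize_by_level(y, alpha=0.3):
    levels = exponential_smoothing_level(y, alpha)
    return [yt / lt for yt, lt in zip(y, levels)]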
12. Reservoir computing models based on spiking neural P systems for time series classification. Neural Netw 2024; 169:274-281. [PMID: 37918270] [DOI: 10.1016/j.neunet.2023.10.041]
Abstract
Nonlinear spiking neural P (NSNP) systems are neural-like membrane computing models with nonlinear spiking mechanisms. Because of this nonlinear spiking mechanism, NSNP systems can show rich nonlinear dynamics. Reservoir computing (RC) is a recurrent neural network (RNN) paradigm that can overcome some shortcomings of traditional RNNs. Based on NSNP systems, we developed two RC variants for time series classification, RC-SNP and RC-RMS-SNP, which are without and integrated with reservoir model space (RMS), respectively. The two RC variants use NSNP systems as the reservoirs and can be easily implemented in the RC framework. The proposed RC variants were evaluated on 17 benchmark time series classification datasets and compared with 16 state-of-the-art or baseline classification models. The comparison results demonstrate the effectiveness of the two proposed RC variants for time series classification tasks.
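A generic reservoir state update, a fixed random recurrent network whose states feed a separately trained readout, can be sketched as below. This is an echo-state-style reservoir for illustration only, not an NSNP system:

```python
import math
import random

def make_reservoir(n, density=0.2, scale=0.9, seed=0):
    # Fixed sparse random recurrent weight matrix; it is never trained.
    rng = random.Random(seed)
    return [[rng.uniform(-scale, scale) if rng.random() < density else 0.0
             for _ in range(n)] for _ in range(n)]

def reservoir_states(W, w_in, u):
    # Drive the reservoir with input sequence u and collect its states;
    # only a linear readout fitted on these states would be trained.
    n = len(W)
    x = [0.0] * n
    states = []
    for u_t in u:
        x = [math.tanh(sum(W[i][j] * x[j] for j in range(n)) + w_in[i] * u_t)
             for i in range(n)]
        states.append(x)
    return states
```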
13. An event-triggered collaborative neurodynamic approach to distributed global optimization. Neural Netw 2024; 169:181-190. [PMID: 37890367] [DOI: 10.1016/j.neunet.2023.10.022]
Abstract
In this paper, we propose an event-triggered collaborative neurodynamic approach to distributed global optimization in the presence of nonconvexity. We design a projection neural network group consisting of multiple projection neural networks coupled via a communication network. We prove the convergence of the projection neural network group to Karush-Kuhn-Tucker points of a given global optimization problem. To reduce communication bandwidth consumption, we adopt an event-triggered mechanism to liaise with other neural networks in the group with the Zeno behavior being precluded. We employ multiple projection neural network groups for scattered searches and re-initialize their states using a meta-heuristic rule in the collaborative neurodynamic optimization framework. In addition, we apply the collaborative neurodynamic approach for distributed optimal chiller loading in a heating, ventilation, and air conditioning system.
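The core projection-neural-network dynamics, whose equilibria coincide with KKT points of the constrained problem, can be sketched with an Euler discretization. The box constraint, objective, and step sizes below are illustrative choices, not the paper's setup:

```python
def project_box(x, lo, hi):
    # Projection onto the box [lo, hi]^n (the feasible set Omega here).
    return [min(max(v, lo), hi) for v in x]

def pnn_step(x, grad, lo=-1.0, hi=1.0, alpha=0.5, dt=0.2):
    # Euler step of the projection neural network
    #   dx/dt = P_Omega(x - alpha * grad f(x)) - x,
    # whose equilibria are KKT points of min f(x) s.t. x in [lo, hi]^n.
    g = grad(x)
    target = project_box([xi - alpha * gi for xi, gi in zip(x, g)], lo, hi)
    return [xi + dt * (ti - xi) for xi, ti in zip(x, target)]

# Example: f(x) = (x0 - 2)^2 + x1^2 on the box [-1, 1]^2;
# the constrained minimizer is (1, 0), on the boundary.
grad = lambda x: [2 * (x[0] - 2), 2 * x[1]]
x = [0.0, 0.9]
for _ in range(200):
    x = pnn_step(x, grad)
```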
14. Alternative states in microbial communities during artificial aeration: Proof of incubation experiment and development of recurrent neural network models. Water Res 2023; 247:120828. [PMID: 37948904] [DOI: 10.1016/j.watres.2023.120828]
Abstract
Artificial aeration, a widely used method of restoring the aquatic ecological environment by enhancing re-oxygenation capacity, typically relies upon empirical models to predict ecological dynamics and determine the operating scheme of the aeration equipment. Restoration through artificial aeration involves oxic-anoxic transitions, and whether these transitions occur in the form of a regime shift is unclear, making the development of predictive models challenging. Here, we confirmed the existence of alternative states in microbial communities during artificial aeration through an aeration incubation experiment for the first time and accounted for their existence in neural network modeling in order to improve model performance. In the aeration incubation experiment, the existence of alternative states was confirmed by two independent approaches: potential analysis and an "enterotyping" approach. Comparing neural network models with and without considering the existence of alternative states, it was found that considering alternative states in modeling could improve the performance of the neural network model. Our study provides a reference for the prediction of systems containing time series data where the current state will have an impact on later states. The developed model could be used for optimizing the operating scheme of artificial aeration.
15. Automated model discovery for muscle using constitutive recurrent neural networks. J Mech Behav Biomed Mater 2023; 145:106021. [PMID: 37473576] [DOI: 10.1016/j.jmbbm.2023.106021]
Abstract
The stiffness of soft biological tissues not only depends on the applied deformation, but also on the deformation rate. To model this type of behavior, traditional approaches select a specific time-dependent constitutive model and fit its parameters to experimental data. Instead, a new trend now suggests a machine-learning based approach that simultaneously discovers both the best model and best parameters to explain given data. Recent studies have shown that feed-forward constitutive neural networks can robustly discover constitutive models and parameters for hyperelastic materials. However, feed-forward architectures fail to capture the history dependence of viscoelastic soft tissues. Here we combine a feed-forward constitutive neural network for the hyperelastic response and a recurrent neural network for the viscous response inspired by the theory of quasi-linear viscoelasticity. Our novel rheologically-informed network architecture discovers the time-independent initial stress using the feed-forward network and the time-dependent relaxation using the recurrent network. We train and test our combined network using unconfined compression relaxation experiments of passive skeletal muscle and compare our discovered model to a neo-Hookean standard linear solid, to an advanced mechanics-based model, and to a vanilla recurrent neural network with no mechanics knowledge. We demonstrate that, for limited experimental data, our new constitutive recurrent neural network discovers models and parameters that satisfy basic physical principles and generalize well to unseen data. We discover a Mooney-Rivlin type two-term initial stored energy function that is linear in the first invariant I1 and quadratic in the second invariant I2, with stiffness parameters of 0.60 kPa and 0.55 kPa. We also discover a Prony-series type relaxation function with time constants of 0.362 s, 2.54 s, and 52.0 s with coefficients of 0.89, 0.05, and 0.03. Our newly discovered model outperforms both the neo-Hookean standard linear solid and the vanilla recurrent neural network in terms of prediction accuracy on unseen data. Our results suggest that constitutive recurrent neural networks can autonomously discover both the model and parameters that best explain experimental data of soft viscoelastic tissues. Our source code, data, and examples are available at https://github.com/LivingMatterLab.
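The discovered Prony-series relaxation function can be sketched directly from the reported constants. The long-term weight g_inf = 1 - sum(g_i), used here so that G(0) = 1, is a common normalization convention assumed for illustration:

```python
import math

# Time constants (s) and coefficients discovered in the study.
TAUS = (0.362, 2.54, 52.0)
WEIGHTS = (0.89, 0.05, 0.03)

def relaxation(t):
    # Prony series G(t) = g_inf + sum_i g_i * exp(-t / tau_i),
    # with g_inf = 1 - sum(g_i) (assumed convention) so that G(0) = 1
    # and G decays toward the long-term modulus fraction g_inf.
    g_inf = 1.0 - sum(WEIGHTS)
    return g_inf + sum(g * math.exp(-t / tau) for g, tau in zip(WEIGHTS, TAUS))
```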
16. Two-timescale recurrent neural networks for distributed minimax optimization. Neural Netw 2023; 165:527-539. [PMID: 37348433] [DOI: 10.1016/j.neunet.2023.06.003]
Abstract
In this paper, we present two-timescale neurodynamic optimization approaches to distributed minimax optimization. We propose four multilayer recurrent neural networks for solving four different types of generally nonlinear convex-concave minimax problems subject to linear equality and nonlinear inequality constraints. We derive sufficient conditions to guarantee the stability and optimality of the neural networks. We demonstrate the viability and efficiency of the proposed neural networks in two specific paradigms for Nash-equilibrium seeking in a zero-sum game and distributed constrained nonlinear optimization.
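A discrete-time caricature of the two-timescale idea, descent on one variable with a slower rate than ascent on the other, can be sketched on a simple convex-concave saddle problem. This is plain gradient descent-ascent for illustration, not the paper's neurodynamic model:

```python
def gda(grad_x, grad_y, x, y, lr_x=0.05, lr_y=0.5, steps=2000):
    # Two-timescale gradient descent-ascent: the descent variable x
    # uses a slower learning rate than the ascent variable y, loosely
    # mirroring the two-timescale structure of neurodynamic approaches.
    for _ in range(steps):
        x = x - lr_x * grad_x(x, y)
        y = y + lr_y * grad_y(x, y)
    return x, y

# Saddle problem min_x max_y f(x, y) = x^2 - y^2 + x*y, saddle at (0, 0):
# f is strongly convex in x and strongly concave in y.
gx = lambda x, y: 2 * x + y    # df/dx
gy = lambda x, y: -2 * y + x   # df/dy
x, y = gda(gx, gy, 1.0, 1.0)
```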
|
17
|
Machine learning in Huntington's disease: exploring the Enroll-HD dataset for prognosis and driving capability prediction. Orphanet J Rare Dis 2023; 18:218. [PMID: 37501188 PMCID: PMC10375780 DOI: 10.1186/s13023-023-02785-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2023] [Accepted: 06/18/2023] [Indexed: 07/29/2023] Open
Abstract
BACKGROUND In biomedicine, machine learning (ML) has proven beneficial for the prognosis and diagnosis of different diseases, including cancer and neurodegenerative disorders. For rare diseases, however, the requirement for large datasets often prevents this approach. Huntington's disease (HD) is a rare neurodegenerative disorder caused by a CAG repeat expansion in the coding region of the huntingtin gene. The world's largest observational study for HD, Enroll-HD, describes over 21,000 participants. As such, Enroll-HD is amenable to ML methods. In this study, we pre-processed and imputed Enroll-HD with ML methods to maximise the inclusion of participants and variables. With this dataset we developed models to improve the prediction of the age at onset (AAO) and compared them to the well-established Langbehn formula. In addition, we used recurrent neural networks (RNNs) to demonstrate the utility of ML methods for longitudinal datasets, assessing driving capabilities by learning from previous participant assessments. RESULTS Simple pre-processing imputed around 42% of missing values in Enroll-HD. Also, 167 variables were retained as a result of imputing with ML. We found that multiple ML models were able to outperform the Langbehn formula. The best ML model (light gradient boosting machine) improved the prognosis of AAO compared to the Langbehn formula by 9.2%, based on root mean squared error in the test set. In addition, our ML model provides more accurate prognosis for a wider CAG repeat range compared to the Langbehn formula. Driving capability was predicted with an accuracy of 85.2%. The resulting pre-processing workflow and code to train the ML models are available to be used for related HD predictions at https://github.com/JasperO98/hdml/tree/main. CONCLUSIONS Our pre-processing workflow made it possible to resolve the missing values and include most participants and variables in Enroll-HD.
We show the added value of an ML approach, which improved AAO predictions and allowed for the development of an advisory model that can assist clinicians and participants in estimating future driving capability.
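The reported 9.2% gain is a relative reduction in test-set RMSE. A sketch of both metrics (the numbers in the test are illustrative, not the paper's):

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def relative_improvement(rmse_model, rmse_baseline):
    """Percent reduction of the model's RMSE relative to a baseline
    (e.g. the Langbehn formula)."""
    return 100.0 * (rmse_baseline - rmse_model) / rmse_baseline
```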
|
18
|
Individual modelling of haematotoxicity with NARX neural networks: A knowledge transfer approach. Heliyon 2023; 9:e17890. [PMID: 37483774 PMCID: PMC10362198 DOI: 10.1016/j.heliyon.2023.e17890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2023] [Revised: 06/22/2023] [Accepted: 06/30/2023] [Indexed: 07/25/2023] Open
Abstract
Cytotoxic cancer therapy often results in dose-limiting haematotoxic side effects. Predicting an individual's risk is a major objective in precision medicine of cancer treatment. In this regard, patient heterogeneity presents a significant challenge. In this paper, we explore the use of hypothesis-free machine learning models based on recurrent nonlinear auto-regressive networks with exogenous inputs (NARX) as an approach to achieve this goal. Also, we propose a knowledge transfer approach to ameliorate the issue of sparse individual data, which typically hampers learning of individual networks. We demonstrate the feasibility of our approach based on a virtual patient population generated using a semi-mechanistic model of haematopoiesis and imposing different cytotoxic therapy scenarios on it. Employing different techniques of model optimisation, we derive robust and parsimonious individual networks with good generalisation performances. Moreover, we analyse in detail possible factors influencing the generalisation performance. Results suggest that our transfer learning approach using NARX networks can provide robust predictions of individual patient's response to treatment. As a practical perspective, we apply our approach to individual time series data of two patients with non-Hodgkin's lymphoma.
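A NARX network predicts the next output from lagged outputs (the autoregressive part) and lagged exogenous inputs (here, e.g., administered drug doses). A minimal one-step sketch; the tanh readout and the toy weights are illustrative assumptions, not the paper's trained individual networks:

```python
import math

def narx_step(y_hist, u_hist, w_y, w_u, bias=0.0):
    """One NARX step: y(t) = tanh(b + sum_k w_y[k]*y(t-1-k) + w_u[k]*u(t-1-k)).
    w_y[0]/w_u[0] weight the most recent lag."""
    s = bias
    s += sum(w * y for w, y in zip(w_y, reversed(y_hist)))
    s += sum(w * u for w, u in zip(w_u, reversed(u_hist)))
    return math.tanh(s)

def simulate(u_series, lags=2, w_y=(0.5, 0.2), w_u=(0.3, 0.1)):
    """Roll the NARX recursion forward over an exogenous input series."""
    y = [0.0] * lags  # assumed zero initial history
    for t in range(lags, len(u_series)):
        y.append(narx_step(y[t - lags:t], u_series[t - lags:t], w_y, w_u))
    return y
```

In the paper's knowledge-transfer setting, weights learned on the population would initialize each sparse individual model; here they are fixed placeholders.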
|
19
|
Using a Recurrent Neural Network To Inform the Use of Prostate-specific Antigen (PSA) and PSA Density for Dynamic Monitoring of the Risk of Prostate Cancer Progression on Active Surveillance. EUR UROL SUPPL 2023; 52:36-39. [PMID: 37182116 PMCID: PMC10172696 DOI: 10.1016/j.euros.2023.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/12/2023] [Indexed: 05/16/2023] Open
Abstract
The global uptake of prostate cancer (PCa) active surveillance (AS) is steadily increasing. While prostate-specific antigen density (PSAD) is an important baseline predictor of PCa progression on AS, there is a scarcity of recommendations on its use in follow-up. In particular, the best way of measuring PSAD is unclear. One approach would be to use the baseline gland volume (BGV) as a denominator in all calculations throughout AS (nonadaptive PSAD, PSADNA), while another would be to remeasure gland volume at each new magnetic resonance imaging scan (adaptive PSAD, PSADA). In addition, little is known about the predictive value of serial PSAD in comparison to PSA. We applied a long short-term memory recurrent neural network to an AS cohort of 332 patients and found that serial PSADNA significantly outperformed both PSADA and PSA for follow-up prediction of PCa progression because of its high sensitivity. Importantly, while PSADNA was superior in patients with smaller glands (BGV ≤55 ml), serial PSA was better in men with larger prostates of >55 ml. PATIENT SUMMARY Repeat measurements of prostate-specific antigen (PSA) and PSA density (PSAD) are the mainstay of active surveillance in prostate cancer. Our study suggests that in patients with a prostate gland of 55 ml or smaller, PSAD measurements are a better predictor of tumour progression, whereas men with a larger gland may benefit more from PSA monitoring.
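Both density variants are simple ratios that differ only in which volume goes in the denominator; a sketch (units assumed: PSA in ng/ml, gland volume in ml):

```python
def psad_nonadaptive(psa, baseline_gland_volume):
    """PSADNA: the baseline gland volume is the denominator throughout AS."""
    return psa / baseline_gland_volume

def psad_adaptive(psa, current_gland_volume):
    """PSADA: the gland volume is remeasured at each new MRI scan."""
    return psa / current_gland_volume
```

If the gland grows during surveillance, PSADA shrinks relative to PSADNA for the same PSA, which is one reason the two series can diverge in predictive value.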
|
20
|
Prediction and detection of emotional tone in online social media mental disorder groups using regression and recurrent neural networks. MULTIMEDIA TOOLS AND APPLICATIONS 2023:1-21. [PMID: 37362737 PMCID: PMC10126575 DOI: 10.1007/s11042-023-15316-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/10/2022] [Revised: 01/17/2023] [Accepted: 04/06/2023] [Indexed: 06/28/2023]
Abstract
In recent years, online social media networks have become a significant platform for persons with mental illnesses to discuss their struggles and obtain emotional and informational assistance. One such platform is Reddit, where sub-groups called 'subreddits' exist, based on a variety of topics including mental illnesses such as anxiety or depression. We analyse users' interactions to estimate mental health status by formulating and using a parameter called 'emotional tone' that represents the user's emotional state. VADER sentiment analysis and TextBlob are used to categorise emotional tone and to find the distribution of emotional polarity and subjectivity of comments. For final tone prediction, an RNN and state-of-the-art word embedding techniques are used to develop a predictive model. The resultant model provides end-to-end categorization and prediction of emotional tone. We report results with respect to a weighted L1 loss that accounts for extreme responses. The model outperforms all baselines by at least 12.1%, and the final emotional status of the authors is positive.
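The categorisation step on top of a VADER compound score is commonly done with the conventional ±0.05 cut-offs; a sketch of that thresholding (the thresholds are VADER's usual convention, which this study may tune differently):

```python
def tone_label(compound):
    """Map a VADER compound score in [-1, 1] to a coarse tone label,
    using the conventional +/-0.05 thresholds."""
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"
```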
|
21
|
Neural correlates of face perception modeled with a convolutional recurrent neural network. J Neural Eng 2023; 20. [PMID: 36898147 DOI: 10.1088/1741-2552/acc35b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2023] [Accepted: 03/10/2023] [Indexed: 03/12/2023]
Abstract
OBJECTIVE Event-related potential (ERP) sensitivity to faces is predominantly characterized by an N170 peak that has greater amplitude and shorter latency when elicited by human faces than by images of other objects. To study this phenomenon, we aimed to develop a computational model of visual ERP generation consisting of a three-dimensional convolutional neural network (CNN) connected to a recurrent neural network (RNN).
Approach: The CNN provided image representation learning, complementing the sequence learning of the RNN for modeling visually evoked potentials. We used open-access data from ERP CORE (40 subjects) to develop the model, used a generative adversarial network to generate synthetic images for simulating experiments, and then collected additional data (16 subjects) to validate the predictions of these simulations. For modeling, visual stimuli presented during ERP experiments were represented as sequences of images (time × pixels). These were provided as inputs to the model. By filtering and pooling over spatial dimensions, the CNN transformed these inputs into sequences of vectors that were passed to the RNN. The ERP waveforms evoked by visual stimuli were provided to the RNN as labels for supervised learning. The whole model was trained end-to-end on the open-access dataset to reproduce ERP waveforms evoked by visual events.
Main results: Cross-validation model outputs strongly correlated with open-access (r = 0.98) and validation study data (r = 0.78). Open-access and validation study data correlated similarly (r = 0.81). Some aspects of model behavior were consistent with neural recordings while others were not, suggesting promising albeit limited capacity for modeling the neurophysiology of face-sensitive ERP generation.
Significance: The approach developed in this work is potentially of significant value for visual neuroscience research, where it may be adapted for multiple contexts to study computational relationships between visual stimuli and evoked neural activity.
|
22
|
Different eigenvalue distributions encode the same temporal tasks in recurrent neural networks. Cogn Neurodyn 2023; 17:257-275. [PMID: 35469119 PMCID: PMC9020562 DOI: 10.1007/s11571-022-09802-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2021] [Revised: 01/28/2022] [Accepted: 03/21/2022] [Indexed: 01/26/2023] Open
Abstract
Different brain areas, such as the cortex and, more specifically, the prefrontal cortex, show great recurrence in their connections, even in early sensory areas. Several approaches and methods based on trained networks have been proposed to model and describe these regions. It is essential to understand the dynamics behind the models because they are used to build different hypotheses about the functioning of brain areas and to explain experimental results. The main contribution here is the description of the dynamics through the classification and interpretation carried out with a set of numerical simulations. This study sheds light on the multiplicity of solutions obtained for the same tasks and shows the link between the spectra of linearized trained networks and the dynamics of their nonlinear counterparts. The patterns in the distribution of the eigenvalues of the recurrent weight matrix were studied and properly related to the dynamics in each task. Supplementary Information The online version contains supplementary material available at 10.1007/s11571-022-09802-5.
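The quantity being classified, the eigenvalues of the recurrent weight matrix, has a closed form in the 2×2 case; a toy sketch of the spectral-radius check (the stability reading is the usual linearized-dynamics heuristic, not this paper's full analysis):

```python
import cmath

def eig2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via the trace/determinant formula."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0

def spectral_radius(a, b, c, d):
    """Largest eigenvalue magnitude; > 1 suggests unstable linearized dynamics."""
    return max(abs(lam) for lam in eig2x2(a, b, c, d))
```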
|
23
|
Deep neural network architecture for automated soft surgical skills evaluation using objective structured assessment of technical skills criteria. Int J Comput Assist Radiol Surg 2023; 18:929-937. [PMID: 36694051 DOI: 10.1007/s11548-022-02827-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2022] [Accepted: 12/22/2022] [Indexed: 01/26/2023]
Abstract
PURPOSE Classic methods of surgical skills evaluation tend to classify surgeon performance into discrete multi-categorical classes. While this classification scheme has proven effective, it does not provide in-between evaluation levels; if such intermediate scoring levels were available, they would allow more accurate evaluation of surgical trainees. METHODS We propose a novel approach to assess surgical skills on a continuous scale ranging from 1 to 5. We show that the proposed approach is flexible enough to be used either for scores of global performance or for several sub-scores based on a surgical criteria set called Objective Structured Assessment of Technical Skills (OSATS). We established a combined CNN+BiLSTM architecture to take advantage of both temporal and spatial features of kinematic data. Our experimental validation relies on real-world data obtained from the JIGSAWS database. The surgeons are evaluated on three tasks: Knot-Tying, Needle-Passing and Suturing. The proposed framework of neural networks takes as input a sequence of 76 kinematic variables and produces a continuous output score ranging from 1 to 5, reflecting the quality of the performed surgical task. RESULTS Our proposed model achieves high-quality OSATS score predictions with means of Spearman correlation coefficients between the predicted outputs and the ground-truth outputs of 0.82, 0.60 and 0.65 for Knot-Tying, Needle-Passing and Suturing, respectively. To our knowledge, we are the first to achieve this regression performance using the OSATS criteria and the JIGSAWS kinematic data. CONCLUSION An effective deep learning tool was created for the purpose of surgical skills assessment. It was shown that our method could be a promising surgical skills evaluation tool for surgical training programs.
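The evaluation metric, Spearman's rank correlation between predicted and ground-truth OSATS scores, can be sketched as follows (no tie handling, which suffices when the scores are distinct):

```python
def rank(xs):
    """Rank values 1..n in ascending order (no tie correction; assumption
    for this sketch)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    for pos, i in enumerate(order):
        ranks[i] = pos + 1.0
    return ranks

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```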
|
24
|
Continual learning with attentive recurrent neural networks for temporal data classification. Neural Netw 2023; 158:171-187. [PMID: 36459884 DOI: 10.1016/j.neunet.2022.10.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2022] [Revised: 10/01/2022] [Accepted: 10/31/2022] [Indexed: 11/13/2022]
Abstract
Continual learning is an emerging research branch of deep learning, which aims to learn a model for a series of tasks continually without forgetting knowledge obtained from previous tasks. Despite receiving a lot of attention in the research community, temporal-based continual learning techniques are still underutilized. In this paper, we address the problem of temporal-based continual learning by allowing a model to continuously learn on temporal data. To solve the catastrophic forgetting problem of learning temporal data in task-incremental scenarios, we propose a novel method based on attentive recurrent neural networks, called Temporal Teacher Distillation (TTD). TTD solves the catastrophic forgetting problem in an attentive recurrent neural network based on three hypotheses, namely the Rotation Hypothesis, the Redundant Hypothesis, and the Recover Hypothesis. The Rotation and Redundant Hypotheses can cause the attention-shift phenomenon, which degrades model performance on learned tasks. Moreover, not considering the Recover Hypothesis incurs extra memory usage when continuously training on different tasks. Therefore, the proposed TTD based on the above hypotheses complements the inadequacy of the existing methods for temporal-based continual learning. For evaluating the performance of our proposed method in the task-incremental setting, we use a public dataset, WIreless Sensor Data Mining (WISDM), and a synthetic dataset, Split-QuickDraw-100. According to experimental results, the proposed TTD significantly outperforms state-of-the-art methods by up to 14.6% and 45.1% in terms of accuracy and forgetting measures, respectively. To the best of our knowledge, this is the first work that studies continual learning in real-world incremental categories for temporal data classification with attentive recurrent neural networks and provides the proper application-oriented scenario.
|
25
|
The single-channel dry electrode SSVEP-based biometric approach: data augmentation techniques against overfitting for RNN-based deep models. Phys Eng Sci Med 2022; 45:1219-1240. [PMID: 36318386 DOI: 10.1007/s13246-022-01189-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2022] [Accepted: 10/17/2022] [Indexed: 12/14/2022]
Abstract
Biometric studies based on electroencephalography (EEG) have received increasing attention because each individual has a dynamic and unique pattern. However, classic EEG-based biometrics have significant deficiencies, including noise-prone signals, gel-based electrodes, and the need for multi-training/multi-channel acquisition and high mental effort. In contrast, steady-state visually evoked potential (SSVEP)-based biometrics have the important advantages of high signal-to-noise ratio and untrained usage. Dynamic brain potential responses are a natural subconscious activity and can be elicited by flickering lights having distinct frequencies, such as cell phone flashes, without extra physical or mental effort. Few studies involving multi-channel/multi-trial SSVEP-based biometric research are available in the current literature. Moreover, there is a lack of research comparing them to the single-channel single-trial dry electrode-implemented SSVEP-based biometric approach using Recurrent Neural Networks (RNNs). Furthermore, to the best of our knowledge, no prior work has proposed an SSVEP-based biometric comparison of RNNs using data augmentation strategies against overfitting. The biometric recognition results were promising, achieving up to 100% accuracy and > 97% sensitivity and specificity scores for 11 subjects. F-scores also exceeded 97%. This single-channel SSVEP-based biometric approach using RNN deep models may offer low-cost, user-friendly, and reliable individual identification and authentication, leading to significant application domains.
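Typical time-series augmentation strategies against overfitting include jittering with Gaussian noise and random cropping; a generic sketch (these two transforms are common illustrative choices, not necessarily the exact strategies compared in the paper):

```python
import random

def add_noise(signal, sigma=0.05, seed=None):
    """Jitter: add zero-mean Gaussian noise to each sample."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, sigma) for s in signal]

def random_crop(signal, crop_len, seed=None):
    """Random crop: keep a contiguous window of crop_len samples."""
    rng = random.Random(seed)
    start = rng.randrange(0, len(signal) - crop_len + 1)
    return signal[start:start + crop_len]
```

Each transform yields extra training examples that preserve the SSVEP frequency content while varying the exact waveform the RNN sees.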
|
26
|
Systematic review of content analysis algorithms based on deep neural networks. MULTIMEDIA TOOLS AND APPLICATIONS 2022; 82:17879-17903. [PMID: 36313481 PMCID: PMC9589819 DOI: 10.1007/s11042-022-14043-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/02/2021] [Revised: 07/12/2022] [Accepted: 10/06/2022] [Indexed: 06/16/2023]
Abstract
Today, social media, the internet, and related sources produce data rapidly, occupying large amounts of space in systems and resulting in enormous data warehouses; progress in information technology has significantly increased the speed and ease of data flow. Text mining is one of the most important methods for extracting a useful model by extracting and adapting knowledge from data sets, and many studies have applied deep learning to text processing and text mining problems. Text mining seeks to extract useful information from unstructured textual data and is widely used today. Deep learning and machine learning techniques for classification and text mining, and their types, are discussed in this paper as well. Neural networks of various kinds, namely ANN, RNN, CNN, and LSTM, are studied to select the best technique. In this study, we conducted a Systematic Literature Review to extract and associate the algorithms and features that have been used in this area. Based on our search criteria, we retrieved 130 relevant studies from electronic databases between 1997 and 2021, and selected 43 studies for further analysis using the inclusion and exclusion criteria in Section 3.2. According to this study, hybrid LSTM is the most widely used deep learning algorithm in these studies, and among machine learning methods SVM showed high accuracy.
|
27
|
Comparison of machine learning classifiers for differentiating level and sport using movement data. J Sports Sci 2022; 40:2166-2172. [PMID: 36415053 DOI: 10.1080/02640414.2022.2145430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
The purposes of this study were to determine if 1) recurrent neural networks designed for multivariate, time-series analyses outperform traditional linear and non-linear machine learning classifiers when classifying athletes based on competition level and sport played, and 2) athletes of different sports move differently during non-sport-specific movement screens. Optical-based kinematic data from 542 athletes were used as input data for nine different machine learning algorithms to classify athletes based on competition level and sport played. For the traditional machine learning classifiers, principal component analysis and feature selection were used to reduce the data dimensionality and to determine the best principal components to retain. Across tasks, recurrent neural networks and linear machine learning classifiers tended to outperform the non-linear machine learning classifiers. Across tasks, reservoir computing had one of the highest classification rates and took the least amount of time to train; however, its results are more difficult to interpret than those of linear classifiers. In addition, athletes were successfully classified based on sport, suggesting that athletes competing in different sports move differently during non-sport-specific movements. Therefore, movement assessment screens should incorporate sport-specific scoring criteria.
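Reservoir computing trains only a linear readout on top of a fixed random recurrent "reservoir", which is why it is so fast to train. A toy sketch of the reservoir state update (the sizes, weight scales, and seed are arbitrary assumptions; the trainable readout is omitted):

```python
import math
import random

random.seed(0)  # fixed random reservoir, never trained

N_RES, N_IN = 20, 3
W_IN = [[random.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(N_RES)]
W_RES = [[random.uniform(-0.1, 0.1) for _ in range(N_RES)] for _ in range(N_RES)]

def reservoir_step(state, u):
    """x(t+1) = tanh(W_in u + W_res x(t)) -- the echo-state update."""
    return [math.tanh(sum(wi * ui for wi, ui in zip(W_IN[i], u)) +
                      sum(wr * xj for wr, xj in zip(W_RES[i], state)))
            for i in range(N_RES)]

def run(inputs):
    """Drive the reservoir with an input sequence; only the final state
    would be fed to a trained linear readout (not shown)."""
    x = [0.0] * N_RES
    for u in inputs:
        x = reservoir_step(x, u)
    return x
```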
|
28
|
Early detection of Alzheimer's disease using neuropsychological tests: a predict-diagnose approach using neural networks. Brain Inform 2022; 9:23. [PMID: 36166157 PMCID: PMC9515292 DOI: 10.1186/s40708-022-00169-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2021] [Accepted: 08/23/2022] [Indexed: 11/10/2022] Open
Abstract
Alzheimer’s disease (AD) is a slowly progressing disease for which there is no known therapeutic cure at present. Ongoing research around the world is actively engaged in the quest for identifying markers that can help predict the future cognitive state of individuals so that measures can be taken to prevent the onset or arrest the progression of the disease. Researchers are interested in both biological and neuropsychological markers that can serve as good predictors of the future cognitive state of individuals. The goal of this study is to identify non-invasive, inexpensive markers and develop neural network models that learn the relationship between those markers and the future cognitive state. To that end, we use the renowned Alzheimer’s Disease Neuroimaging Initiative (ADNI) data for a handful of neuropsychological tests to train Recurrent Neural Network (RNN) models to predict future neuropsychological test results and Multilayer Perceptron (MLP) models to diagnose the future cognitive states of trial participants based on those predicted results. The results demonstrate that the predicted cognitive states match the actual cognitive states of ADNI test subjects with a high level of accuracy. Therefore, this novel two-step technique can serve as an effective tool for the prediction of Alzheimer’s disease progression. The reliance of the results on inexpensive, non-invasive tests implies that this technique can be used in countries around the world including those with limited financial resources.
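The two-step predict-diagnose idea composes a forecaster with a classifier. A schematic sketch in which the stand-in models are toys (a naive trend extrapolation instead of the RNN, a single threshold instead of the MLP, and an illustrative cutoff, none of which are the paper's):

```python
def forecast_next_score(history):
    """Stand-in for the RNN: naive linear extrapolation of the last trend."""
    if len(history) < 2:
        return history[-1]
    return history[-1] + (history[-1] - history[-2])

def diagnose(predicted_score, cutoff=24.0):
    """Stand-in for the MLP: threshold a predicted test score
    (cutoff is illustrative only)."""
    return "impaired" if predicted_score < cutoff else "normal"

def predict_diagnose(history):
    """Step 1: forecast the next test result. Step 2: diagnose from it."""
    return diagnose(forecast_next_score(history))
```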
|
29
|
Predicting Age-related Macular Degeneration Progression with Longitudinal Fundus Images Using Deep Learning. MACHINE LEARNING IN MEDICAL IMAGING. MLMI (WORKSHOP) 2022; 13583:11-20. [PMID: 36656604 PMCID: PMC9842432 DOI: 10.1007/978-3-031-21014-3_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
Abstract
Accurately predicting a patient's risk of progressing to late age-related macular degeneration (AMD) is difficult but crucial for personalized medicine. While existing risk prediction models for progression to late AMD are useful for triaging patients, none utilizes longitudinal color fundus photographs (CFPs) in a patient's history to estimate the risk of late AMD in a given subsequent time interval. In this work, we seek to evaluate how deep neural networks capture the sequential information in longitudinal CFPs and improve the prediction of 2-year and 5-year risk of progression to late AMD. Specifically, we proposed two deep learning models, CNN-LSTM and CNN-Transformer, which use a Long-Short Term Memory (LSTM) and a Transformer, respectively with convolutional neural networks (CNN), to capture the sequential information in longitudinal CFPs. We evaluated our models in comparison to baselines on the Age-Related Eye Disease Study, one of the largest longitudinal AMD cohorts with CFPs. The proposed models outperformed the baseline models that utilized only single-visit CFPs to predict the risk of late AMD (0.879 vs 0.868 in AUC for 2-year prediction, and 0.879 vs 0.862 for 5-year prediction). Further experiments showed that utilizing longitudinal CFPs over a longer time period was helpful for deep learning models to predict the risk of late AMD. We made the source code available at https://github.com/bionlplab/AMD_prognosis_mlmi2022 to catalyze future works that seek to develop deep learning models for late AMD prediction.
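The AUC figures compare ranking quality: the probability that a randomly chosen progressing eye is scored higher than a randomly chosen non-progressing one. A pairwise sketch of the metric (ties counted as half):

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise-comparison definition."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```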
|
30
|
Recurrent neural network to predict hyperelastic constitutive behaviors of the skeletal muscle. Med Biol Eng Comput 2022; 60:1177-1185. [PMID: 35244859 DOI: 10.1007/s11517-022-02541-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2021] [Accepted: 02/23/2022] [Indexed: 10/18/2022]
Abstract
Hyperelastic constitutive laws have been commonly used to model the passive behavior of the human skeletal muscle. Despite many efforts, the use of accurate finite element formulations of hyperelastic constitutive laws is still time-consuming for a real-time medical simulation system. The objective of the present study was to develop a deep learning model to predict the hyperelastic constitutive behaviors of the skeletal muscle toward a fast estimation of the muscle tissue stress. A finite element (FE) model of the right psoas muscle was developed. Neo-Hookean and Mooney-Rivlin laws were used. A tensile test was performed with an applied body force. A learning database was built from this model using an automatic probabilistic generation process. A long short-term memory (LSTM) neural network was implemented to predict the stress evolution of the skeletal muscle tissue. A hyperparameter tuning process was conducted. Root mean square error (RMSE) and associated relative error were quantified to evaluate the precision of the predictive capacity of the developed deep learning model. Pearson correlation coefficients (R) were also computed. The nodal displacements and the maximal stresses range from 70 to 227 mm and from 2.79 to 5.61 MPa for Neo-Hookean and Mooney-Rivlin laws, respectively. Regarding the LSTM predictions, the RMSE ranges from 224.3 ± 3.9 Pa (8%) to 227.5 ± 5.7 Pa (4%) for Neo-Hookean and Mooney-Rivlin laws, respectively. Pearson correlation coefficients (R) of 0.78 ± 0.02 and 0.77 ± 0.02 were obtained for Neo-Hookean and Mooney-Rivlin laws, respectively. The present study showed that, for the first time, the use of a deep learning model can reproduce the time-series behaviors of the complex FE formulations for skeletal muscle modeling.
In particular, the use of an LSTM neural network leads to a fast and accurate surrogate model for the in silico prediction of the hyperelastic constitutive behaviors of the skeletal muscle. As a perspective, the developed deep learning model will be integrated into a real-time medical simulation of the skeletal muscle for prosthetic socket design and a childbirth simulator.
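For reference, the incompressible neo-Hookean law that the surrogate learns from has a closed-form stress under uniaxial tension; a sketch (μ is the shear modulus; the values in the test are illustrative, not the paper's fitted parameters):

```python
def neo_hookean_uniaxial_stress(stretch, mu):
    """Incompressible neo-Hookean Cauchy stress under uniaxial tension:
    sigma = mu * (lambda^2 - 1/lambda), with lambda the axial stretch."""
    return mu * (stretch ** 2 - 1.0 / stretch)
```

The LSTM surrogate replaces the expensive FE evaluation of laws like this one with a learned mapping from loading history to stress.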
|
31
|
Weather forecasting based on data-driven and physics-informed reservoir computing models. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:24131-24144. [PMID: 34825327 DOI: 10.1007/s11356-021-17668-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/08/2021] [Accepted: 11/17/2021] [Indexed: 06/13/2023]
Abstract
In response to the growing demand for the global energy supply chain, wind power has become an important research subject among studies in the advancement of renewable energy sources. The major concern is the stochastic volatility of weather conditions, which hinders the development of wind power forecasting approaches. To address this issue, the current study proposes a weather prediction method divided into two models for wind speed and atmospheric system forecasting. First, a data-driven model incorporating the wavelet transform and recurrent neural networks is employed to predict the wind speed. Second, a physics-informed echo state network is used to learn the chaotic behavior of the atmospheric system. The findings were validated with a case study conducted on wind speed data from Turkmenistan. The results suggest that the physics-informed model delivers more accurate and reliable forecasts, indicating its potential for implementation in wind energy analysis.
|
32
|
A contextual detector of surgical tools in laparoscopic videos using deep learning. Surg Endosc 2022; 36:679-688. [PMID: 33559057 PMCID: PMC8349373 DOI: 10.1007/s00464-021-08336-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2020] [Accepted: 01/13/2021] [Indexed: 01/03/2023]
Abstract
BACKGROUND The complexity of laparoscopy requires special training and assessment. Analyzing the streaming videos during the surgery can potentially improve surgical education. The tedium and cost of such an analysis can be dramatically reduced using an automated tool detection system, among other things. We propose a new multilabel classifier, called LapTool-Net, to detect the presence of surgical tools in each frame of a laparoscopic video. METHODS The novelty of LapTool-Net is the exploitation of the correlations among the usage of different tools, and between the tools and the tasks, i.e., the context of the tools' usage. Towards this goal, the pattern in the co-occurrence of the tools is utilized for designing a decision policy for the multilabel classifier based on a Recurrent Convolutional Neural Network (RCNN), which is trained in an end-to-end manner. In the post-processing step, the predictions are corrected by modeling the long-term tasks' order with an RNN. RESULTS LapTool-Net was trained using publicly available datasets of laparoscopic cholecystectomy, viz., M2CAI16 and Cholec80. For M2CAI16, our exact match accuracies (when all the tools in one frame are predicted correctly) in online and offline modes were 80.95% and 81.84% with per-class F1-scores of 88.29% and 90.53%. For Cholec80, the accuracies were 85.77% and 91.92% with F1-scores of 93.10% and 96.11% for online and offline, respectively. CONCLUSIONS The results show LapTool-Net outperformed state-of-the-art methods significantly, even while using fewer training samples and a shallower architecture. Our context-aware model does not require experts' domain-specific knowledge, and the simple architecture can potentially improve all existing methods.
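The "exact match" metric counts a frame as correct only when the full tool-presence vector is right, which is stricter than per-class accuracy; a sketch:

```python
def exact_match_accuracy(y_true, y_pred):
    """Fraction of frames whose entire multilabel tool vector is correct."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)
```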
|
33
|
Artificial Intelligence-Enabled De Novo Design of Novel Compounds that Are Synthesizable. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2021; 2390:409-419. [PMID: 34731479 DOI: 10.1007/978-1-0716-1787-8_17] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Abstract
Development of computer-aided de novo design methods to rapidly discover novel compounds for treating human diseases has been of interest to drug discovery scientists for the past three decades. Initially, efforts were mostly concentrated on generating molecules that fit the active site of the target protein by building a molecule sequentially, atom by atom and/or group by group, while exploring all possible conformations to optimize binding interactions with the target protein. In recent years, deep learning approaches have been applied to generate molecules that are iteratively optimized against a binding hypothesis (to optimize potency) and predictive models of drug-likeness (to optimize properties). The synthesizability of molecules generated by these de novo methods remains a challenge. This review focuses on recent developments in synthetic planning methods suitable for enhancing the synthesizability of molecules designed by de novo methods.
|
34
|
Extent of detection of hidden relationships among different hydrological variables during floods using data-driven models. ENVIRONMENTAL MONITORING AND ASSESSMENT 2021; 193:692. [PMID: 34609643 DOI: 10.1007/s10661-021-09499-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/29/2021] [Accepted: 09/30/2021] [Indexed: 06/13/2023]
Abstract
Understanding flood dynamics forms the basis for leading water resource management and flood risk mitigation practices. In particular, accurately predicting river flow during massive flood events and capturing the hysteretic behavior of the river stage-discharge relation are among the key interests in hydrological research. The literature demonstrates that data-driven models are significant in identifying complex and hidden relationships among dependent variables without considering explicit physical schemes. In this regard, we aim to discover the extent to which data-driven models can recognize the hidden relationships among different hydrological variables in order to generate accurate predictions of river flow. A secondary aim is to determine whether data-driven models can digest the internal features of training inputs to extrapolate severe flood records beyond the training domain. To achieve these aims, we developed a recurrent neural network (RNN) model with two hidden layers to capture the hidden relationships among the inputs, and investigated the model's predictive capability using quantitative and qualitative analyses. The quantitative analysis comprised a comparison between the model predictions and a set of precise independent records obtained through an advanced hydroacoustic system for reference. The qualitative approach visualized the hysteretic behavior of the stage-discharge relations of the model records against the high-resolution records of the hydroacoustic system. The findings display the potential of data-driven models to accurately predict river flow. The qualitative analysis, however, revealed only moderate correlations of the stage-discharge loops compared with the reference records. Additionally, the model was tested against severe destructive flood records generated by the East Asian monsoon and tropical cyclones. The findings suggest that data-driven models cannot extrapolate new features beyond their training dataset. Overall, this study discusses the competence of RNNs in providing reliable and accurate river flow predictions during floods.
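The abstract's two-hidden-layer RNN is not published as code; the sketch below is a minimal NumPy illustration of such a recurrence, mapping a multivariate hydrological sequence to a scalar flow estimate. All sizes, weights, and variable names are illustrative assumptions, not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_forecast(x_seq, params):
    """Forward pass of a two-hidden-layer Elman-style RNN mapping a sequence of
    hydrological inputs (e.g., stage, rainfall, upstream flow) to one flow value."""
    W1, U1, W2, U2, w_out = params
    h1 = np.zeros(W1.shape[0])
    h2 = np.zeros(W2.shape[0])
    for x_t in x_seq:                        # iterate over time steps
        h1 = np.tanh(W1 @ x_t + U1 @ h1)     # first recurrent layer
        h2 = np.tanh(W2 @ h1 + U2 @ h2)      # second recurrent layer
    return float(w_out @ h2)                 # linear readout -> predicted flow

n_in, n_h = 3, 8                             # 3 input variables, 8 hidden units
params = (rng.normal(0, 0.3, (n_h, n_in)),   # W1: input -> layer 1
          rng.normal(0, 0.3, (n_h, n_h)),    # U1: layer-1 recurrence
          rng.normal(0, 0.3, (n_h, n_h)),    # W2: layer 1 -> layer 2
          rng.normal(0, 0.3, (n_h, n_h)),    # U2: layer-2 recurrence
          rng.normal(0, 0.3, n_h))           # w_out: linear readout
x_seq = rng.normal(size=(24, n_in))          # 24 (e.g., hourly) observations
print(rnn_forecast(x_seq, params))
```

In practice the weights would be fitted to gauged flow records; the random weights here only demonstrate the shapes and data flow of the recurrence.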
|
35
|
Visual analytics tool for the interpretation of hidden states in recurrent neural networks. Vis Comput Ind Biomed Art 2021; 4:24. [PMID: 34585277 PMCID: PMC8479019 DOI: 10.1186/s42492-021-00090-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2021] [Accepted: 08/12/2021] [Indexed: 11/29/2022] Open
Abstract
In this paper, we introduce a visual analytics approach aimed at helping machine learning experts analyze the hidden states of layers in recurrent neural networks. Our technique allows the user to interactively inspect how hidden states store and process information throughout the feeding of an input sequence into the network. The technique can help answer questions, such as which parts of the input data have a higher impact on the prediction and how the model correlates each hidden state configuration with a certain output. Our visual analytics approach comprises several components: First, our input visualization shows the input sequence and how it relates to the output (using color coding). In addition, hidden states are visualized through a nonlinear projection into a 2-D visualization space using t-distributed stochastic neighbor embedding to understand the shape of the space of the hidden states. Trajectories are also employed to show the details of the evolution of the hidden state configurations. Finally, a time-multi-class heatmap matrix visualizes the evolution of the expected predictions for multi-class classifiers, and a histogram indicates the distances between the hidden states within the original space. The different visualizations are shown simultaneously in multiple views and support brushing-and-linking to facilitate the analysis of the classifications and debugging for misclassified input sequences. To demonstrate the capability of our approach, we discuss two typical use cases for long short-term memory models applied to two widely used natural language processing datasets.
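A minimal sketch of the core data flow such a tool visualizes: collect the hidden-state trajectory of an RNN over one input sequence and project it to a 2-D space. The paper uses t-SNE; PCA via SVD is substituted here to keep the sketch dependency-free, and a small random RNN stands in for a trained model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Collect the hidden-state trajectory of a small random RNN over one sequence.
n_h, n_in, T = 16, 4, 30
W = rng.normal(0, 0.4, (n_h, n_in))
U = rng.normal(0, 0.4, (n_h, n_h))
h, states = np.zeros(n_h), []
for x_t in rng.normal(size=(T, n_in)):
    h = np.tanh(W @ x_t + U @ h)            # one recurrent update per time step
    states.append(h)
H = np.vstack(states)                       # (T, n_h) matrix of hidden states

# Project to 2-D. The paper uses t-SNE; plain PCA is used here instead --
# the downstream trajectory/brushing views consume the same (T, 2) array.
Hc = H - H.mean(axis=0)                     # center before PCA
_, _, Vt = np.linalg.svd(Hc, full_matrices=False)
trajectory_2d = Hc @ Vt[:2].T               # one 2-D point per time step
print(trajectory_2d.shape)
```

Plotting `trajectory_2d` as a connected polyline reproduces the "trajectory" view described in the abstract: each vertex is one hidden-state configuration.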
|
36
|
Physics-incorporated convolutional recurrent neural networks for source identification and forecasting of dynamical systems. Neural Netw 2021; 144:359-371. [PMID: 34547672 DOI: 10.1016/j.neunet.2021.08.033] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2020] [Revised: 08/25/2021] [Accepted: 08/30/2021] [Indexed: 11/20/2022]
Abstract
Spatio-temporal dynamics of physical processes are generally modeled using partial differential equations (PDEs). Though the core dynamics follows some principles of physics, real-world physical processes are often driven by unknown external sources. In such cases, developing a purely analytical model becomes very difficult and data-driven modeling can be of assistance. In this paper, we present a hybrid framework combining physics-based numerical models with deep learning for source identification and forecasting of spatio-temporal dynamical systems with unobservable time-varying external sources. We formulate our model PhICNet as a convolutional recurrent neural network (RNN) which is end-to-end trainable for spatio-temporal evolution prediction of dynamical systems and learns the source behavior as an internal state of the RNN. Experimental results show that the proposed model can forecast the dynamics for a relatively long time and identify the sources as well.
|
37
|
Erythropoiesis stimulating agent recommendation model using recurrent neural networks for patient with kidney failure with replacement therapy. Comput Biol Med 2021; 137:104718. [PMID: 34481182 DOI: 10.1016/j.compbiomed.2021.104718] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2021] [Revised: 07/27/2021] [Accepted: 07/28/2021] [Indexed: 12/17/2022]
Abstract
Optimizing anemia management in patients with kidney failure with replacement therapy (KFRT) is a challenging problem because of the complexities of the underlying diseases and heterogeneous responses to erythropoiesis-stimulating agents (ESAs). We therefore propose an ESA dose recommendation model based on sequentially aware neural networks. Data from 466 KFRT patients (12,907 dialysis sessions) in seven tertiary-care general hospitals were included in the experiment. First, a hemoglobin (Hb) prediction model was developed to simulate longitudinal heterogeneous ESA and Hb interactions. Using the prediction model as a prospective study simulator, we built an ESA dose recommendation model to predict the ESA dose required to reach a target hemoglobin level after 30 days. Each model's performance was evaluated using the mean absolute error (MAE). The best MAEs of the prediction and recommendation models were 0.59 (95% confidence interval: 0.56-0.62) g/dL and 43.2 μg (ESA dose), respectively. Compared with real-world clinical data, the recommendation model achieved a reduction in ESA dose (Algorithm: 140 vs. Human: 150 μg/month, P < 0.001), a more stable monthly Hb difference (Algorithm: 0.6 vs. Human: 0.8 g/dL, P < 0.001), and an improved target Hb success rate (Algorithm: 79.5% vs. Human: 62.9% for previous month's Hb < 10.0 g/dL; Algorithm: 95.7% vs. Human: 73.0% for previous month's Hb 10.0-12.0 g/dL). We developed an ESA dose recommendation model for optimizing anemia management in patients with KFRT and showed its potential effectiveness in a simulated prospective study.
|
38
|
Longitudinal machine learning modeling of MS patient trajectories improves predictions of disability progression. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2021; 208:106180. [PMID: 34146771 DOI: 10.1016/j.cmpb.2021.106180] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/05/2021] [Accepted: 05/08/2021] [Indexed: 05/23/2023]
Abstract
BACKGROUND AND OBJECTIVES Research in Multiple Sclerosis (MS) has recently focused on extracting knowledge from real-world clinical data sources. This type of data is more abundant than data produced during clinical trials and potentially more informative about real-world clinical practice. However, this comes at the cost of less curated and controlled data sets. In this work we aim to predict disability progression by optimally extracting information from longitudinal patient data in the real-world setting, with a special focus on the sporadic sampling problem. METHODS We use machine learning methods suited to patient trajectory modeling, such as recurrent neural networks and tensor factorization. A subset of 6682 patients from the MSBase registry is used. RESULTS We can predict disability progression of patients over a two-year horizon with an ROC-AUC of 0.85, which represents a 32% decrease in the ranking pair error (1-AUC) compared with reference methods using static clinical features. CONCLUSIONS Compared with the models available in the literature, this work uses the most complete patient history for MS disease progression prediction and represents a step towards AI-assisted precision medicine in MS.
|
39
|
Generic and specific recurrent neural network models: Applications for large and small scale biopharmaceutical upstream processes. BIOTECHNOLOGY REPORTS (AMSTERDAM, NETHERLANDS) 2021; 31:e00640. [PMID: 34159058 PMCID: PMC8193373 DOI: 10.1016/j.btre.2021.e00640] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/04/2021] [Revised: 04/24/2021] [Accepted: 05/27/2021] [Indexed: 01/02/2023]
Abstract
The calculation of temporally varying upstream process outcomes is a challenging task. Over recent years, several parametric, semi-parametric and non-parametric approaches have been developed to provide reliable estimates for key process parameters. We present generic and product-specific recurrent neural network (RNN) models for the computation and study of growth- and metabolite-related upstream process parameters and their temporal evolution. Our approach can be used for the control and study of single product-specific large-scale manufacturing runs as well as for generic small-scale evaluations of combined processes and products at the development stage. The computational results for the product titer, as well as various major upstream outcomes and relevant process parameters, show a high degree of accuracy when compared with experimental data and, accordingly, a reasonable predictive capability of the RNN models. The calculated root-mean-square errors of prediction are significantly smaller than the experimental standard deviation for the considered process run ensembles, which highlights the broad applicability of our approach. As a specific benefit for platform processes, the generic RNN model is also used to simulate process outcomes for different temperatures, in good agreement with experimental results. The high level of accuracy and the straightforward usage of the approach, without sophisticated parameterization and recalibration procedures, highlight the benefits of the RNN models, which can be regarded as promising alternatives to existing parametric and semi-parametric methods.
|
40
|
An adaptive backpropagation algorithm for long-term electricity load forecasting. Neural Comput Appl 2021; 34:477-491. [PMID: 34393381 PMCID: PMC8356219 DOI: 10.1007/s00521-021-06384-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2020] [Accepted: 07/26/2021] [Indexed: 11/02/2022]
Abstract
Artificial Neural Networks (ANNs) have been widely used to determine future power demand in the short, medium, and long terms. However, research has identified that ANNs can produce inaccurate load predictions when used for long-term forecasting. This inaccuracy is attributed to insufficient training data and accumulated errors, especially in long-term estimations. This study develops an improved ANN model with an Adaptive Backpropagation Algorithm (ABPA) for forecasting the long-term load demand of electricity. The ABPA introduces new forecasting formulations that adapt forecast values to account for the different behaviours of the trained and future input datasets. The architecture of the Multi-Layer Perceptron (MLP) model, along with its traditional Backpropagation Algorithm (BPA), is used as a baseline for the proposed development. The forecasting formula is further improved by introducing adjustment factors to smooth out behavioural differences between the trained and new/future datasets. A computational study based on actual monthly electricity consumption inputs from 2011 to 2020, provided by the Iraqi Ministry of Electricity, is conducted to verify the proposed adaptive algorithm's performance. Different types of energy consumption and the electricity cut period (unsatisfied demand) are also considered as vital factors. The developed ANN model, including its proposed ABPA, is then compared with traditional and popular prediction techniques, such as regression and other advanced machine learning approaches including Recurrent Neural Networks (RNNs), to demonstrate its superiority. The results reveal that the most accurate long-term forecasts, with minimum Mean Squared Error (MSE) and Mean Absolute Percentage Error (MAPE) values of 1.195.650 and 0.045, respectively, are achieved by applying the proposed ABPA. It can be concluded that the proposed ABPA, including the adjustment factor, enables traditional ANN techniques to be efficiently used for long-term forecasting of electricity load demand.
|
41
|
Predicting the epidemic curve of the coronavirus (SARS-CoV-2) disease (COVID-19) using artificial intelligence: An application on the first and second waves. INFORMATICS IN MEDICINE UNLOCKED 2021; 25:100691. [PMID: 34395821 PMCID: PMC8349399 DOI: 10.1016/j.imu.2021.100691] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2021] [Revised: 07/21/2021] [Accepted: 08/01/2021] [Indexed: 12/15/2022] Open
Abstract
Objectives The COVID-19 pandemic is considered a major threat to global public health. The aim of our study was to use official epidemiological data to forecast the epidemic curves (daily new cases) of COVID-19 using Artificial Intelligence (AI)-based Recurrent Neural Networks (RNNs), and then to compare and validate the predicted models against the observed data. Methods We used publicly available datasets from the World Health Organization and Johns Hopkins University to create a training dataset, then employed RNNs with gated recurrent units (Long Short-Term Memory, LSTM, units) to create two prediction models. Our proposed approach is an ensemble-based system, realized by interconnecting several neural networks. To achieve the appropriate diversity, we froze some network layers that control the way the model parameters are updated. In addition, we provided country-specific predictions by transfer learning, and with extra feature injection from governmental constraints, better longer-term predictions were achieved. We calculated the Root Mean Squared Logarithmic Error (RMSLE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE) to thoroughly compare our model predictions with the observed data. Results We report the predicted curves for France, Germany, Hungary, Italy, Spain, the United Kingdom, and the United States of America. The results of our study underscore that the COVID-19 pandemic is a propagated-source epidemic; therefore, repeated peaks on the epidemic curve are to be anticipated. Moreover, the errors between the predicted and validated data and trends are low. Conclusion Our proposed model showed satisfactory accuracy in predicting new cases of COVID-19 in certain contexts. The influence of this pandemic is significant worldwide and has already impacted most life domains. Decision-makers must be aware that, even if strict public health measures are executed and sustained, future peaks of infections are possible. AI-based models are useful tools for forecasting epidemics, as they can be recalculated on newly observed data to obtain more precise forecasts.
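A minimal NumPy sketch of the gated (LSTM) unit the abstract refers to, stepped over a short sequence of daily counts. Shapes, weights, and the input series are illustrative assumptions, not the authors' ensemble model.

```python
import numpy as np

rng = np.random.default_rng(2)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: the gates (i, f, o) and candidate g that let the network
    carry epidemic-curve context across days. Names/shapes are illustrative."""
    z = W @ x + U @ h + b                   # all four gate pre-activations at once
    i, f, o, g = np.split(z, 4)
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    c = sig(f) * c + sig(i) * np.tanh(g)    # cell state: forget old + write new
    h = sig(o) * np.tanh(c)                 # hidden state exposed to the readout
    return h, c

n_in, n_h = 1, 8                            # daily new-case count -> 8 hidden units
W = rng.normal(0, 0.3, (4 * n_h, n_in))
U = rng.normal(0, 0.3, (4 * n_h, n_h))
b = np.zeros(4 * n_h)
h, c = np.zeros(n_h), np.zeros(n_h)
for x_t in rng.normal(size=(14, n_in)):     # two weeks of (normalized) counts
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape)
```

A trained model would pass the final `h` through a readout layer to emit the next day's case count; frameworks such as those named in the abstract wrap exactly this recurrence.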
|
43
|
Continual learning for recurrent neural networks: An empirical evaluation. Neural Netw 2021; 143:607-627. [PMID: 34343775 DOI: 10.1016/j.neunet.2021.07.021] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2021] [Revised: 07/15/2021] [Accepted: 07/16/2021] [Indexed: 10/20/2022]
Abstract
Learning continuously throughout a model's lifetime is fundamental to deploying machine learning solutions that are robust to drifts in the data distribution. Advances in Continual Learning (CL) with recurrent neural networks could pave the way to a large number of applications where incoming data are non-stationary, such as natural language processing and robotics. However, the existing body of work on the topic is still fragmented, with approaches that are application-specific and whose assessment is based on heterogeneous learning protocols and datasets. In this paper, we organize the literature on CL for sequential data processing by providing a categorization of the contributions and a review of the benchmarks. We propose two new benchmarks for CL with sequential data based on existing datasets, whose characteristics resemble real-world applications. We also provide a broad empirical evaluation of CL and Recurrent Neural Networks in the class-incremental scenario, testing their ability to mitigate forgetting with a number of strategies that are not specific to sequential data processing. Our results highlight the key role played by sequence length and the importance of a clear specification of the CL scenario.
|
44
|
Recurrent neural network pruning using dynamical systems and iterative fine-tuning. Neural Netw 2021; 143:475-488. [PMID: 34280607 DOI: 10.1016/j.neunet.2021.07.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2021] [Revised: 06/24/2021] [Accepted: 07/02/2021] [Indexed: 11/17/2022]
Abstract
Network pruning techniques are widely employed to reduce the memory requirements and increase the inference speed of neural networks. This work proposes a novel RNN pruning method that treats the RNN weight matrices as collections of time-evolving signals. Such signals, which represent weight vectors, can be modelled using Linear Dynamical Systems (LDSs). In this way, weight vectors with similar temporal dynamics can be pruned, as they have limited effect on the performance of the model. Additionally, during the fine-tuning of the pruned model, a novel discrimination-aware variation of L2 regularization is introduced to penalize (i.e., reduce the magnitude of) network weights whose impact on the output of the RNN is minimal. Finally, an iterative fine-tuning approach is proposed that employs a bigger model to guide an increasingly smaller pruned one, as a steep decrease in the number of network parameters can irreversibly harm the performance of the pruned model. Extensive experimentation with different network architectures demonstrates the potential of the proposed method to create pruned models that improve perplexity by at least 0.62% on the PTB dataset and F1-score by 1.39% on the SQuAD dataset, whereas other state-of-the-art approaches only slightly improve or even deteriorate model performance.
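A simplified stand-in for the pruning idea: drop weight vectors that are too similar to ones already kept. The paper measures similarity through fitted Linear Dynamical Systems; plain cosine similarity is used below purely for illustration, with a toy weight matrix.

```python
import numpy as np

rng = np.random.default_rng(3)

def prune_similar_rows(W, threshold=0.98):
    """Keep a weight row only if its |cosine similarity| to every row kept so
    far stays below the threshold; near-duplicate rows are pruned away."""
    keep = []
    for i, row in enumerate(W):
        r = row / np.linalg.norm(row)
        if all(abs(r @ (W[j] / np.linalg.norm(W[j]))) < threshold for j in keep):
            keep.append(i)
    return W[keep], keep

W = rng.normal(size=(10, 6))                # 10 weight vectors of an RNN layer
W[7] = 1.001 * W[2]                         # make row 7 a near-copy of row 2
W_pruned, kept = prune_similar_rows(W)
print(kept)                                 # row 7 is dropped as redundant
```

After such a structural prune, the paper's pipeline fine-tunes the smaller model (with the discrimination-aware L2 penalty) to recover accuracy; that stage is omitted here.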
|
45
|
Transfer-RLS method and transfer-FORCE learning for simple and fast training of reservoir computing models. Neural Netw 2021; 143:550-563. [PMID: 34304003 DOI: 10.1016/j.neunet.2021.06.031] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2020] [Revised: 05/13/2021] [Accepted: 06/29/2021] [Indexed: 11/22/2022]
Abstract
Reservoir computing is a machine learning framework derived from a special type of recurrent neural network. Following recent advances in physical reservoir computing, some reservoir computing devices are thought to be promising as energy-efficient machine learning hardware for real-time information processing. To realize efficient online learning with low-power reservoir computing devices, it is beneficial to develop fast-convergence learning methods with simpler operations. This study proposes a training method that lies between the recursive least squares (RLS) method and the least mean squares (LMS) method, the standard online learning methods for reservoir computing models. The RLS method converges fast but requires updates of a huge matrix called the gain matrix, whereas the LMS method does not use a gain matrix but converges very slowly. The proposed method, called the transfer-RLS method, does not require updates of the gain matrix in the main-training phase because the matrix is updated in advance (i.e., in a pre-training phase). As a result, the transfer-RLS method works with simpler operations than the original RLS method without sacrificing much convergence speed. We show numerically and analytically that the transfer-RLS method converges much faster than the LMS method. Furthermore, we show that a modified version of the transfer-RLS method (called transfer-FORCE learning) can be applied to first-order reduced and controlled error (FORCE) learning for a reservoir computing model with a closed loop, which is challenging to train.
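The gain-matrix update that separates RLS from LMS can be shown concretely. Below is a standard RLS readout update for a small echo state network, i.e., the baseline whose gain-matrix bookkeeping the transfer-RLS method moves into a pre-training phase; reservoir size, signals, and initialization are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# Echo state network with a readout trained online by standard RLS.
n_res, T, lam = 20, 200, 1.0                 # reservoir size, steps, forgetting
W_res = rng.normal(0, 1, (n_res, n_res))
W_res *= 0.9 / max(abs(np.linalg.eigvals(W_res)))  # spectral radius < 1
w_in = rng.normal(0, 1, n_res)

w_out = np.zeros(n_res)                      # readout weights (only trained part)
P = 100.0 * np.eye(n_res)                    # gain matrix (inverse correlation)
r = np.zeros(n_res)
for t in range(T):
    u = np.sin(2 * np.pi * t / 25)           # input: a sinusoid
    d = np.sin(2 * np.pi * (t + 1) / 25)     # target: one-step-ahead value
    r = np.tanh(W_res @ r + w_in * u)        # reservoir state update (fixed weights)
    k = P @ r / (lam + r @ P @ r)            # RLS gain vector
    e = d - w_out @ r                        # a-priori prediction error
    w_out = w_out + k * e                    # readout update
    P = (P - np.outer(k, r @ P)) / lam       # the costly update LMS avoids
print(abs(d - w_out @ r))                    # small a-posteriori error
```

The `P` update is the O(n²) per-step cost the abstract refers to; LMS replaces `k` with a fixed learning rate (no `P` at all), which is why it converges so slowly.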
|
46
|
Two-timescale neurodynamic approaches to supervised feature selection based on alternative problem formulations. Neural Netw 2021; 142:180-191. [PMID: 34020085 DOI: 10.1016/j.neunet.2021.04.038] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2021] [Revised: 04/21/2021] [Accepted: 04/29/2021] [Indexed: 10/21/2022]
Abstract
Feature selection is a crucial step in data processing and machine learning. While many greedy and sequential feature selection approaches are available, a holistic neurodynamic approach to supervised feature selection was recently developed via fractional programming, minimizing feature redundancy and maximizing relevance simultaneously. Because the gradient of the fractional objective function is also fractional, alternative problem formulations are desirable to obviate the fractional complexity. In this paper, the fractional programming formulation is equivalently reformulated as bilevel and bilinear programming problems without using any fractional function. Two two-timescale projection neural networks are adapted for solving the reformulated problems. Experimental results on six benchmark datasets demonstrate the global convergence and high classification performance of the proposed neurodynamic approaches in comparison with six mainstream feature selection approaches.
|
47
|
Mechanisms for handling nested dependencies in neural-network language models and humans. Cognition 2021; 213:104699. [PMID: 33941375 DOI: 10.1016/j.cognition.2021.104699] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2020] [Revised: 03/17/2021] [Accepted: 03/22/2021] [Indexed: 11/25/2022]
Abstract
Recursive processing in sentence comprehension is considered a hallmark of human linguistic abilities. However, its underlying neural mechanisms remain largely unknown. We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing, namely the storing of grammatical number and gender information in working memory and its use in long-distance agreement (e.g., capturing the correct number agreement between subject and verb when they are separated by other phrases). Although the network, a recurrent architecture with Long Short-Term Memory units, was solely trained to predict the next word in a large corpus, analysis showed the emergence of a very sparse set of specialized units that successfully handled local and long-distance syntactic agreement for grammatical number. However, the simulations also showed that this mechanism does not support full recursion and fails with some long-range embedded dependencies. We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns, with or without embedding. Human and model error patterns were remarkably similar, showing that the model echoes various effects observed in human data. However, a key difference was that, with embedded long-range dependencies, humans remained above chance level, while the model's systematic errors brought it below chance. Overall, our study shows that exploring the ways in which modern artificial neural networks process sentences leads to precise and testable hypotheses about human linguistic performance.
|
48
|
A Deep Learning-based approach for forecasting off-gas production and consumption in the blast furnace. Neural Comput Appl 2021; 34:911-923. [PMID: 33879977 PMCID: PMC8051551 DOI: 10.1007/s00521-021-05984-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2020] [Accepted: 03/25/2021] [Indexed: 10/29/2022]
Abstract
This article presents the application of a recent neural network topology known as the deep echo state network to the prediction and modeling of strongly nonlinear systems typical of the process industry. The article analyzes the results through a comparison with one of the most common and efficient topologies, the long short-term memory network, in order to highlight the strengths and weaknesses of a reservoir computing approach compared with what is currently considered a standard recurrent neural network. As a benchmark application, two specific processes common in integrated steelworks are selected, with the purpose of forecasting future energy exchanges and transformations. The training, validation and test procedures are based on data analysis, outlier detection and reconciliation, and variable selection, starting from real industrial field data. The analysis of the results shows the effectiveness of deep echo state networks and their strong forecasting capabilities with respect to standard recurrent methodologies, both in terms of training procedures and accuracy. Supplementary Information The online version contains supplementary material available at 10.1007/s00521-021-05984-x.
|
49
|
A novel Encoder-Decoder model based on read-first LSTM for air pollutant prediction. Sci Total Environ 2021; 765:144507. [PMID: 33418334 DOI: 10.1016/j.scitotenv.2020.144507] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/29/2020] [Revised: 11/26/2020] [Accepted: 12/11/2020] [Indexed: 06/12/2023]
Abstract
Accurate air pollutant prediction allows effective environment management to reduce the impact of pollution and prevent pollution incidents. Existing studies of air pollutant prediction are mostly interdisciplinary, involving environmental science and computer science, and formulate the problem as time series prediction. A prevalent recent approach to time series prediction is the Encoder-Decoder model, which is based on recurrent neural networks (RNN) such as long short-term memory (LSTM) and has demonstrated great potential. An LSTM network relies on various gate units, but most existing studies ignore the correlation between these gate units. This correlation is important for establishing the relationship between the random variables in a time series: the stronger the correlation, the stronger the relationship between the random variables. In this paper we propose an improved LSTM, named Read-first LSTM (RLSTM for short), which is a more powerful temporal feature extractor than the RNN, LSTM, and Gated Recurrent Unit (GRU). RLSTM has two useful properties: (1) it stores and remembers information better over longer time series, and (2) it overcomes the problem of dependency between gate units. Since RLSTM is good at long-term feature extraction, it is expected to perform well in time series prediction. We therefore use RLSTM as the encoder and LSTM as the decoder to build an Encoder-Decoder model (EDSModel) for pollutant prediction. Our experimental results show that, for 1- to 24-hour prediction, the proposed model performed well, with a root mean square error of 30.218. The effectiveness and superiority of RLSTM and the prediction model have been demonstrated.
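The Encoder-Decoder arrangement the paper builds on can be sketched as a forward pass: an encoder RNN compresses a window of past features into its hidden state, and a decoder then unrolls one prediction per future hour, feeding each prediction back in. The sketch below uses a standard LSTM cell with random weights purely to show the data flow; the RLSTM gate rewiring, the feature set, and all sizes are assumptions not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def lstm_step(x, h, c, p):
    """One standard LSTM step (the paper's RLSTM instead lets later
    gates 'read' earlier ones, which is not reproduced here)."""
    z = p["W"] @ np.concatenate([x, h]) + p["b"]
    i, f, o, g = np.split(z, 4)
    i, f, o = 1/(1+np.exp(-i)), 1/(1+np.exp(-f)), 1/(1+np.exp(-o))
    c = f * c + i * np.tanh(g)
    return o * np.tanh(c), c

def make_params(n_in, n_hid):
    return {"W": rng.standard_normal((4 * n_hid, n_in + n_hid)) * 0.1,
            "b": np.zeros(4 * n_hid)}

n_feat, n_hid, horizon = 6, 32, 24          # assumed sizes; 24 hourly steps out
enc, dec = make_params(n_feat, n_hid), make_params(1, n_hid)
W_out = rng.standard_normal((1, n_hid)) * 0.1

def forecast(history):
    h = c = np.zeros(n_hid)
    for x_t in history:                     # encoder: compress the past window
        h, c = lstm_step(x_t, h, c, enc)
    preds, y = [], np.zeros(1)
    for _ in range(horizon):                # decoder: one step per future hour
        h, c = lstm_step(y, h, c, dec)
        y = W_out @ h                       # feed the prediction back in
        preds.append(float(y[0]))
    return preds

history = rng.standard_normal((24, n_feat))  # past pollutant/weather features
preds = forecast(history)
print(len(preds))                            # one value per forecast hour
```

In a real system the weights would be trained end to end on the squared error of all 24 outputs; the sketch only fixes the wiring.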
|
50
|
GA-based implicit stochastic optimization and RNN-based simulation for deriving multi-objective reservoir hedging rules. Environ Sci Pollut Res Int 2021; 28:19107-19120. [PMID: 33394424 DOI: 10.1007/s11356-020-12291-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2020] [Accepted: 12/29/2020] [Indexed: 06/12/2023]
Abstract
Management of reservoir systems is a complicated process involving many uncertainties about future events and the diversity of purposes these reservoirs serve; effective management of these systems could therefore improve resource utilization and avoid stakeholder disputes. The aim of this paper was to build an optimization-simulation framework based on implicit stochastic optimization (ISO), genetic algorithms (GA), and recurrent neural networks (RNN) to address reservoir operation. Inflow scenarios were generated synthetically on a monthly scale and used as input to a multi-objective genetic programming model to construct a database of optimal operating rules. This database was then used, together with the output of the inflow forecasting model, to simulate monthly reservoir hedging rules with an RNN. Our results demonstrate the effectiveness of the GA-ISO-RNN model for simulating and predicting optimal reservoir release with consistent accuracy. Results from both the training and testing phases clearly showed the usefulness of the RNN in predicting optimal reservoir release, with relatively high values of the Nash-Sutcliffe model efficiency coefficient and the correlation coefficient, and low values of root mean squared error and mean absolute deviation. Furthermore, a comparison of historical releases with the output of the proposed model shows that the proposed model was less vulnerable than standard operating rules. The methodology was applied to the Bigge reservoir in Germany, which features an extensive management infrastructure, but it can easily be adopted in other, similar cases.
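The ISO-plus-GA part of such a framework can be illustrated with a toy: a mass-balance reservoir simulation under a one-point hedging rule, with a crude genetic algorithm searching over the rationing trigger on a single synthetic inflow trace. All units, the rule form, the cost function, and the GA settings below are illustrative assumptions, not the paper's configuration:

```python
import random

random.seed(42)
DEMAND, CAPACITY = 8.0, 100.0  # illustrative units

def simulate(trigger, inflows, s0=50.0):
    """Run the reservoir under a one-point hedging rule: once available
    water drops below `trigger`, releases are rationed proportionally.
    Squared deficits penalize large shortfalls more than many small ones,
    which is the usual motivation for hedging."""
    s, cost = s0, 0.0
    for q in inflows:
        avail = min(s + q, CAPACITY)          # storage + inflow, spill at capacity
        if avail >= trigger:
            r = min(DEMAND, avail)            # meet full demand
        else:
            r = DEMAND * avail / trigger      # proportional rationing
        cost += (DEMAND - r) ** 2
        s = avail - r
    return cost

# One synthetic monthly inflow trace (ISO would use many such traces)
inflows = [max(0.0, random.gauss(8.0, 4.0)) for _ in range(120)]

# Toy GA over the single trigger parameter: keep elites, mutate around them
pop = [random.uniform(DEMAND, 60.0) for _ in range(20)]
for _ in range(30):
    pop.sort(key=lambda trig: simulate(trig, inflows))
    elite = pop[:5]
    pop = elite + [max(DEMAND, random.gauss(random.choice(elite), 2.0))
                   for _ in range(15)]
best = min(pop, key=lambda trig: simulate(trig, inflows))
print(f"best trigger: {best:.1f}, cost: {simulate(best, inflows):.2f}")
```

In the full framework this optimization runs per scenario to build the operating-rule database, and the RNN is then trained to reproduce those optimal releases from forecasted inflows.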
|