1. Lu F, Zlobina K, Rondoni NA, Teymoori S, Gomez M. Enhancing wound healing through deep reinforcement learning for optimal therapeutics. R Soc Open Sci 2024; 11:240228. [PMID: 39086835] [PMCID: PMC11289634] [DOI: 10.1098/rsos.240228]
Abstract
Finding the optimal treatment strategy to accelerate wound healing is of utmost importance, but it presents a formidable challenge owing to the intrinsic nonlinear nature of the process. We propose an adaptive closed-loop control framework that combines deep learning, optimal control and reinforcement learning to accelerate wound healing. The framework adaptively learns a linear representation of the nonlinear wound healing dynamics using deep learning and interactively trains a deep reinforcement learning agent to track the optimal signal derived from this representation, without the need for intricate mathematical modelling. Our approach not only reduced wound healing time by 45.56% compared with no treatment, but also offers a safer and more economical treatment strategy. The proposed methodology shows significant potential for expediting wound healing by effectively integrating perception, predictive modelling and optimal adaptive control.
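The pipeline this abstract describes (learn a linear surrogate of nonlinear healing dynamics from data, then choose treatments that track a reference closure trajectory) can be sketched in miniature. Everything below, including the logistic wound model, the lifted observables, and the three candidate treatment intensities, is a hypothetical stand-in for the paper's deep-learned representation and RL tracking agent, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy nonlinear wound model (hypothetical): wound radius r in (0, 1]
# shrinks logistically; treatment intensity u in [0, 1] speeds closure.
def heal_step(r, u):
    return r - (0.04 + 0.08 * u) * r * (1.0 - r)

def lift(r):
    # Lifted observables in which the dynamics are approximately linear
    return np.array([r, r * r])

# 1) Collect transitions under random treatments and fit a linear operator
#    K on the lifted state by least squares: lift(r') ~= [lift(r), u] @ K.
X, Y = [], []
for _ in range(200):
    r = rng.uniform(0.2, 1.0)
    u = rng.uniform(0.0, 1.0)
    X.append(np.concatenate([lift(r), [u]]))
    Y.append(lift(heal_step(r, u)))
K, *_ = np.linalg.lstsq(np.array(X), np.array(Y), rcond=None)

def predict(r, u):
    # One-step prediction of the wound radius from the linear surrogate
    return (np.concatenate([lift(r), [u]]) @ K)[0]

# 2) Use the surrogate for one-step-lookahead control that tracks an
#    aggressive reference closure trajectory (a stand-in for the paper's
#    deep RL tracking agent).
def controlled_rollout(r0, steps):
    r, ref = r0, r0
    for _ in range(steps):
        ref *= 0.93  # reference trajectory: 7% shrinkage per step
        u = min([0.0, 0.5, 1.0], key=lambda u: abs(predict(r, u) - ref))
        r = heal_step(r, u)
    return r

untreated = 0.8
for _ in range(60):
    untreated = heal_step(untreated, 0.0)
treated = controlled_rollout(0.8, 60)
print(treated, untreated)
```

Under these assumptions the tracked rollout closes the toy wound faster than the untreated rollout, which is the qualitative behaviour the abstract reports.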
Affiliation(s)
- Fan Lu
- Applied Mathematics, Baskin School of Engineering, University of California, Santa Cruz, CA, USA
- Ksenia Zlobina
- Applied Mathematics, Baskin School of Engineering, University of California, Santa Cruz, CA, USA
- Nicholas A. Rondoni
- Applied Mathematics, Baskin School of Engineering, University of California, Santa Cruz, CA, USA
- Sam Teymoori
- Applied Mathematics, Baskin School of Engineering, University of California, Santa Cruz, CA, USA
- Marcella Gomez
- Applied Mathematics, Baskin School of Engineering, University of California, Santa Cruz, CA, USA
2. Al-Hamadani MNA, Fadhel MA, Alzubaidi L, Balazs H. Reinforcement Learning Algorithms and Applications in Healthcare and Robotics: A Comprehensive and Systematic Review. Sensors (Basel) 2024; 24:2461. [PMID: 38676080] [PMCID: PMC11053800] [DOI: 10.3390/s24082461]
Abstract
Reinforcement learning (RL) has emerged as a dynamic and transformative paradigm in artificial intelligence, offering the promise of intelligent decision-making in complex and dynamic environments. This unique feature enables RL to address sequential decision-making problems with simultaneous sampling, evaluation, and feedback. As a result, RL techniques have become suitable candidates for developing powerful solutions in various domains. In this study, we present a comprehensive and systematic review of RL algorithms and applications. The review begins with the foundations of RL, examines each algorithm in detail, and concludes with a comparative analysis of RL algorithms against several criteria. It then turns to two key applications of RL: robotics and healthcare. In robotic manipulation, RL enhances precision and adaptability in tasks such as object grasping and autonomous learning. In healthcare, the review focuses on cell growth problems, clarifying how RL has provided a data-driven approach to optimizing the growth of cell cultures and the development of therapeutic solutions. Overall, this review offers a comprehensive overview, shedding light on the evolving landscape of RL and its potential in two diverse yet interconnected fields.
Affiliation(s)
- Mokhaled N. A. Al-Hamadani
- Department of Data Science and Visualization, Faculty of Informatics, University of Debrecen, H-4032 Debrecen, Hungary
- Doctoral School of Informatics, University of Debrecen, H-4032 Debrecen, Hungary
- Department of Electronic Techniques, Technical Institute/Alhawija, Northern Technical University, 36001 Kirkuk, Iraq
- Mohammed A. Fadhel
- Research and Development Department, Akunah Company, Brisbane, QLD 4120, Australia
- Laith Alzubaidi
- Research and Development Department, Akunah Company, Brisbane, QLD 4120, Australia
- School of Mechanical, Medical, and Process Engineering, Queensland University of Technology, Brisbane, QLD 4000, Australia
- Centre for Data Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
- Harangi Balazs
- Department of Data Science and Visualization, Faculty of Informatics, University of Debrecen, H-4032 Debrecen, Hungary
3. Mashayekhi H, Nazari M, Jafarinejad F, Meskin N. Deep reinforcement learning-based control of chemo-drug dose in cancer treatment. Comput Methods Programs Biomed 2024; 243:107884. [PMID: 37948911] [DOI: 10.1016/j.cmpb.2023.107884]
Abstract
BACKGROUND AND OBJECTIVE: Advances in the treatment of cancer, a leading cause of death worldwide, have promoted research activity in various related fields. The development of effective treatment regimens with optimal drug dose administration using a mathematical modeling framework has received extensive research attention during the last decades. However, most control techniques presented for cancer chemotherapy are model-based approaches. The available model-free techniques based on Reinforcement Learning (RL) commonly discretize the problem states and variables, which, besides demanding expert supervision, cannot model real-world conditions accurately. The more recent Deep Reinforcement Learning (DRL) methods, which enable modeling the problem in its original continuous space, are rarely applied in cancer chemotherapy. METHODS: We propose an effective and robust DRL-based, model-free method for closed-loop control of cancer chemotherapy drug dosing. A nonlinear pharmacological cancer model is used to simulate the patient and capture the cancer dynamics. In contrast to previous work, the state variables and control action are modeled in their original continuous spaces to avoid expert-guided discretization and provide a more realistic solution. The DRL network is trained to automatically adjust the drug dose based on the monitored states of the patient, providing an adaptive control technique that responds to the particular conditions and diagnostic measurements of different categories of patients. RESULTS AND CONCLUSIONS: The performance of the proposed DRL-based controller is evaluated by numerical analysis of diverse simulated patients. Comparison with a state-of-the-art RL-based method that uses discretized state and action spaces shows the superiority of the approach in the process and duration of chemotherapy treatment. In the majority of the studied cases, the proposed model decreases the medication period and the total amount of administered drug while increasing the rate of reduction in tumor cells.
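As a much-simplified sketch of model-free control over a continuous dose, the snippet below searches directly over the parameters of a continuous dosing policy on a toy Gompertz tumor model, using the cross-entropy method as a lightweight stand-in for the paper's DRL training loop. The model constants, reward weights, and linear policy form are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy tumor model (hypothetical): Gompertz growth with a log-kill drug
# effect; state is tumor size n, action is a continuous dose in [0, 1].
def tumor_step(n, dose):
    growth = 0.06 * n * np.log(10.0 / max(n, 1e-6))
    kill = 0.25 * dose * n
    return max(n + growth - kill, 0.0)

def rollout(theta, steps=50):
    # Continuous linear dose policy in the monitored state:
    # dose = clip(a*n + b, 0, 1); no discretization of state or action.
    a, b = theta
    n, total_dose, ret = 1.0, 0.0, 0.0
    for _ in range(steps):
        dose = float(np.clip(a * n + b, 0.0, 1.0))
        n = tumor_step(n, dose)
        total_dose += dose
        ret -= n + 0.1 * dose  # penalize tumor burden and drug use
    return ret, n, total_dose

# Model-free search over the continuous policy parameters.
mean, std = np.zeros(2), np.ones(2)
for _ in range(30):
    cand = mean + std * rng.standard_normal((64, 2))
    scores = np.array([rollout(c)[0] for c in cand])
    elite = cand[np.argsort(scores)[-8:]]          # keep the best 8
    mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-3

_, final_size, used = rollout(mean)
_, untreated_size, _ = rollout(np.array([0.0, -1.0]))  # dose clips to 0
print(final_size, untreated_size, used)
```

Under these toy assumptions the learned continuous policy drives the tumor well below its untreated trajectory, mirroring the qualitative outcome the abstract reports.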
Affiliation(s)
- Hoda Mashayekhi
- Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran.
- Mostafa Nazari
- Faculty of Mechanical Engineering, Shahrood University of Technology, Shahrood, Iran.
- Fatemeh Jafarinejad
- Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran.
- Nader Meskin
- Faculty of Electrical Engineering, Qatar University, Doha, Qatar.
4. Lai M, Yang H, Gu J, Chen X, Jiang Z. Digital-twin-based Online Parameter Personalization for Implantable Cardiac Defibrillators. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:3007-3010. [PMID: 36086607] [DOI: 10.1109/embc48229.2022.9871142]
Abstract
Implantable cardioverter defibrillators (ICDs) are developed to provide timely therapies when adverse patient conditions are detected. Device therapies need to be adjusted for individual patients and evolving patient conditions, which can be achieved by adjusting device parameter settings. However, there are no validated clinical guidelines for parameter personalization, especially for patients with complex and rare conditions. In this paper, we propose a reinforcement learning framework for online parameter personalization of ICDs. Heart states inferred from ECG patch signals are used to create a digital twin of the patient, and reinforcement learning then uses the digital twin as its environment to explore parameter settings with fewer misdiagnoses. Experiments were performed on three virtual patients with specific and evolving heart conditions, and the results show that our proposed approach can identify ICD parameter settings that achieve better performance than the default settings. Clinical relevance: patients with an ICD and an ECG patch can receive periodic ICD parameter adjustments appropriate for their current heart conditions.
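The core loop (explore candidate device settings safely against a simulated twin rather than the patient) can be illustrated with a minimal epsilon-greedy bandit. The twin here is a made-up Bernoulli model with hypothetical per-setting diagnosis accuracies, not the paper's cardiac digital twin:

```python
import random

random.seed(0)

# Toy digital twin (hypothetical): for a candidate ICD parameter setting,
# an episode is diagnosed correctly with the given probability.
ACCURACY = [0.70, 0.80, 0.95, 0.85, 0.60]

def twin_episode(setting):
    return 1 if random.random() < ACCURACY[setting] else 0

# Epsilon-greedy bandit over candidate settings: a minimal stand-in for
# the paper's RL loop, run entirely against the twin so that exploration
# never risks misdiagnosing the real patient.
counts = [0] * len(ACCURACY)
values = [0.0] * len(ACCURACY)
for _ in range(5000):
    if random.random() < 0.1:
        s = random.randrange(len(ACCURACY))        # explore
    else:
        s = max(range(len(ACCURACY)), key=lambda i: values[i])  # exploit
    r = twin_episode(s)
    counts[s] += 1
    values[s] += (r - values[s]) / counts[s]       # incremental mean

best = max(range(len(ACCURACY)), key=lambda i: values[i])
print(best, values[best])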
5. Schamberg G, Badgeley M, Meschede-Krasa B, Kwon O, Brown EN. Continuous action deep reinforcement learning for propofol dosing during general anesthesia. Artif Intell Med 2022; 123:102227. [PMID: 34998516] [DOI: 10.1016/j.artmed.2021.102227]
Abstract
PURPOSE: Anesthesiologists simultaneously manage several aspects of patient care during general anesthesia. Automating the administration of hypnotic agents could enable more precise control of a patient's level of unconsciousness and free anesthesiologists to focus on the most critical aspects of patient care. Reinforcement learning (RL) algorithms can be used to fit a mapping from patient state to a medication regimen. These algorithms can learn complex control policies that, when paired with modern techniques for promoting model interpretability, offer a promising approach to developing a clinically viable system for automated anesthetic drug delivery. METHODS: We expand on our prior work applying deep RL to automated anesthetic dosing by using a continuous-action model based on the actor-critic RL paradigm. The proposed RL agent is composed of a policy network that maps observed anesthetic states to a continuous probability density over propofol-infusion rates, and a value network that estimates the favorability of observed states. We train and test three versions of the RL agent using varied reward functions. The agent is trained on simulated pharmacokinetic/pharmacodynamic models with randomized parameters to ensure robustness to patient variability, and tested on simulations and retrospectively on nine general anesthesia cases collected in the operating room. We use Shapley additive explanations to understand the factors with the greatest influence on the agent's decision-making. RESULTS: The deep RL agent significantly outperformed a proportional-integral-derivative controller (median episode median absolute performance error 1.9% ± 1.8 vs. 3.1% ± 1.1). The model rewarded for minimizing total doses performed best across simulated patient demographics (median episode median performance error 1.1% ± 0.5). When run on real-world clinical datasets, the agent recommended doses consistent with those administered by the anesthesiologist. CONCLUSIONS: The proposed approach marks the first fully continuous deep RL algorithm for automating anesthetic drug dosing. The reward function used in RL training can be flexibly designed to encourage desirable practices (e.g. using less anesthetic) and improve performance. Through careful analysis of the learned policies, techniques for interpreting dosing decisions, and testing on clinical data, we confirm that the agent's anesthetic dosing is consistent with best practices in anesthesia care.
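The key ingredient above, a policy network that outputs a continuous probability density over infusion rates, can be sketched numerically. The state features, weights, and network form below are hypothetical; the snippet shows a Gaussian policy head, samples a continuous action from it, and verifies the score function used by policy-gradient updates against a finite-difference check:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical policy head: maps an observed anesthetic state to the mean
# of a Gaussian density over (scaled) propofol-infusion rates.
def policy(state, w):
    mu = np.tanh(w[:3] @ state)   # mean infusion rate (scaled to [-1, 1])
    sigma = np.exp(w[3])          # state-independent std, for brevity
    return mu, sigma

def log_prob(a, mu, sigma):
    # Log-density of a Gaussian policy at action a
    return (-0.5 * ((a - mu) / sigma) ** 2
            - np.log(sigma) - 0.5 * np.log(2 * np.pi))

state = np.array([0.6, 0.1, 1.0])   # e.g. depth error, its trend, a bias
w = np.array([0.4, -0.2, 0.1, -1.0])
mu, sigma = policy(state, w)
a = mu + sigma * rng.standard_normal()  # sample a continuous action

# Policy gradients use the score grad_mu log pi(a|s) = (a - mu) / sigma^2;
# check the analytic form against a central finite difference.
eps = 1e-6
fd = (log_prob(a, mu + eps, sigma) - log_prob(a, mu - eps, sigma)) / (2 * eps)
analytic = (a - mu) / sigma ** 2
print(abs(fd - analytic))
```

In the paper's actor-critic setup this score term would be weighted by the critic's advantage estimate to update the policy weights; the sketch only shows the continuous-action density itself.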
Affiliation(s)
- Gabriel Schamberg
- Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
- Benyamin Meschede-Krasa
- Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Ohyoon Kwon
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Emery N Brown
- Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
6. Shah FA, Meyer NJ, Angus DC, Awdish R, Azoulay É, Calfee CS, Clermont G, Gordon AC, Kwizera A, Leligdowicz A, Marshall JC, Mikacenic C, Sinha P, Venkatesh B, Wong HR, Zampieri FG, Yende S. A Research Agenda for Precision Medicine in Sepsis and Acute Respiratory Distress Syndrome: An Official American Thoracic Society Research Statement. Am J Respir Crit Care Med 2021; 204:891-901. [PMID: 34652268] [PMCID: PMC8534611] [DOI: 10.1164/rccm.202108-1908st]
Abstract
Background: Precision medicine focuses on the identification of therapeutic strategies that are effective for a group of patients based on similar unifying characteristics. The recent success of precision medicine in non-critical care settings has resulted from the confluence of large clinical and biospecimen repositories, innovative bioinformatics, and novel trial designs. Similar advances for precision medicine in sepsis and in the acute respiratory distress syndrome (ARDS) are possible but will require further investigation and significant investment in infrastructure. Methods: This project was funded by the American Thoracic Society Board of Directors. A multidisciplinary and diverse working group reviewed the available literature, established a conceptual framework, and iteratively developed recommendations for the Precision Medicine Research Agenda for Sepsis and ARDS. Results: The following six priority recommendations were developed by the working group: 1) the creation of large richly phenotyped and harmonized knowledge networks of clinical, imaging, and multianalyte molecular data for sepsis and ARDS; 2) the implementation of novel trial designs, including adaptive designs, and embedding trial procedures in the electronic health record; 3) continued innovation in the data science and engineering methods required to identify heterogeneity of treatment effect; 4) further development of the tools necessary for the real-time application of precision medicine approaches; 5) work to ensure that precision medicine strategies are applicable and available to a broad range of patients varying across differing racial, ethnic, socioeconomic, and demographic groups; and 6) the securement and maintenance of adequate and sustainable funding for precision medicine efforts. Conclusions: Precision medicine approaches that incorporate variability in genomic, biologic, and environmental factors may provide a path forward for better individualizing the delivery of therapies and improving care for patients with sepsis and ARDS.
7. Eghbali N, Alhanai T, Ghassemi MM. Patient-Specific Sedation Management via Deep Reinforcement Learning. Front Digit Health 2021; 3:608893. [PMID: 34713090] [PMCID: PMC8521809] [DOI: 10.3389/fdgth.2021.608893]
Abstract
Introduction: Developing reliable medication dosing guidelines is challenging because individual dose-response relationships are mediated by both static (e.g., demographic) and dynamic factors (e.g., kidney function). In recent years, several data-driven medication dosing models have been proposed for sedatives, but these approaches have been limited in their ability to assess interindividual differences and compute individualized doses. Objective: The primary objective of this study is to develop an individualized framework for sedative-hypnotic dosing. Method: Using publicly available data (1,757 patients) from the MIMIC IV intensive care unit database, we developed a sedation management agent using deep reinforcement learning. More specifically, we modeled the sedative dosing problem as a Markov decision process and developed an RL agent based on a deep deterministic policy gradient approach with a prioritized experience replay buffer to find the optimal policy. We assessed our method's ability to jointly learn an optimal personalized policy for propofol and fentanyl, which are among the most commonly prescribed sedative-hypnotics for intensive care unit sedation, and compared our model's dosing performance against the recorded behavior of clinicians on unseen data. Results: Experimental results demonstrate that our proposed model can assist clinicians in making the right decision based on a patient's evolving clinical phenotype. The RL agent was 8% better at managing sedation and 26% better at managing mean arterial pressure than the clinicians' policy; a two-sample t-test confirmed that these performance improvements were statistically significant (p < 0.05). Conclusion: The results validate that our model was better at maintaining control variables within their target range, thereby jointly maintaining patients' health conditions and managing their sedation.
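The prioritized experience replay buffer mentioned above can be sketched in a few lines: transitions with larger temporal-difference error are sampled for replay more often than routine ones. The exponent, priority floor, and toy dosing transitions below are illustrative assumptions, not the paper's configuration:

```python
import random

random.seed(3)

# Sketch of prioritized experience replay for a DDPG-style dosing agent:
# transitions with larger TD error are replayed more often.
class PrioritizedReplay:
    def __init__(self, alpha=0.6):
        self.alpha = alpha            # how strongly priority shapes sampling
        self.data, self.prio = [], []

    def add(self, transition, td_error):
        self.data.append(transition)
        self.prio.append((abs(td_error) + 1e-3) ** self.alpha)

    def sample(self):
        # Sample one transition with probability proportional to priority
        return random.choices(self.data, weights=self.prio, k=1)[0]

buf = PrioritizedReplay()
buf.add(("state_a", "dose_a", -1.0, "state_a2"), td_error=0.1)  # routine
buf.add(("state_b", "dose_b", +1.0, "state_b2"), td_error=5.0)  # surprising

counts = {"state_a": 0, "state_b": 0}
for _ in range(10000):
    counts[buf.sample()[0]] += 1
print(counts)
```

A full implementation would also apply importance-sampling weights to correct the bias that prioritized sampling introduces into the critic's updates; the sketch omits that for brevity.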
Affiliation(s)
- Niloufar Eghbali
- Human Augmentation and Artificial Intelligence Laboratory, Department of Computer Science, Michigan State University, East Lansing, MI, United States
- Tuka Alhanai
- Laboratory for Computer-Human Intelligence, Division of Engineering, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Mohammad M. Ghassemi
- Human Augmentation and Artificial Intelligence Laboratory, Department of Computer Science, Michigan State University, East Lansing, MI, United States
8. Liu S, See KC, Ngiam KY, Celi LA, Sun X, Feng M. Reinforcement Learning for Clinical Decision Support in Critical Care: Comprehensive Review. J Med Internet Res 2020; 22:e18477. [PMID: 32706670] [PMCID: PMC7400046] [DOI: 10.2196/18477]
Abstract
Background: Decision support systems based on reinforcement learning (RL) have been implemented to facilitate the delivery of personalized care. This paper aimed to provide a comprehensive review of RL applications in the critical care setting. Objective: This review aimed to survey the literature on RL applications for clinical decision support in critical care and to provide insight into the challenges of applying various RL models. Methods: We performed an extensive search of the following databases: PubMed, Google Scholar, Institute of Electrical and Electronics Engineers (IEEE), ScienceDirect, Web of Science, Medical Literature Analysis and Retrieval System Online (MEDLINE), and Excerpta Medica Database (EMBASE). Studies published over the past 10 years (2010-2019) that applied RL in critical care were included. Results: We included 21 papers and found that RL has been used to optimize the choice of medications, drug dosing, and timing of interventions and to target personalized laboratory values. We further compared and contrasted the design of the RL models and the evaluation metrics for each application. Conclusions: RL has great potential for enhancing decision making in critical care. Challenges remain regarding RL system design, evaluation metrics, and model choice. More importantly, further work is required to validate RL in authentic clinical environments.
Affiliation(s)
- Siqi Liu
- NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore, Singapore
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
- Kay Choong See
- Division of Respiratory & Critical Care Medicine, National University Hospital, Singapore, Singapore
- Kee Yuan Ngiam
- Group Chief Technology Office, National University Health System, Singapore, Singapore
- Leo Anthony Celi
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, United States
- Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, United States
- Mengling Feng
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
9. Yu C, Ren G, Dong Y. Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units. BMC Med Inform Decis Mak 2020; 20:124. [PMID: 32646412] [PMCID: PMC7344039] [DOI: 10.1186/s12911-020-1120-5]
Abstract
Background: Reinforcement learning (RL) provides a promising technique for solving complex sequential decision-making problems in healthcare domains. Recent years have seen great progress in applying RL to decision-making problems in intensive care units (ICUs). However, because the goal of traditional RL algorithms is to maximize a long-term reward function, exploration during learning may have a fatal impact on the patient. A short-term goal should therefore also be considered to keep the patient stable during treatment. Methods: We use a Supervised-Actor-Critic (SAC) RL algorithm to address this problem by combining the long-term, goal-oriented characteristics of RL with the short-term goal of supervised learning. We evaluate the differences between SAC and traditional actor-critic (AC) algorithms on the decision-making problems of ventilation and sedative dosing in ICUs. Results: SAC is much more efficient than the traditional AC algorithm in terms of convergence rate and data utilization. Conclusions: The SAC algorithm not only aims to cure patients in the long term but also reduces deviation from the strategy applied by clinical doctors, thus improving the therapeutic effect.
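The supervised-actor-critic idea, blending a long-term RL objective with a supervised penalty for deviating from the clinician's action, can be reduced to a one-parameter sketch. The quadratic objectives and the blending weight below are toy stand-ins for the actual actor loss, chosen so the blended optimum is easy to see:

```python
# Toy supervised-actor-critic objective: the actor's update blends a
# long-term RL objective with a supervised penalty for deviating from
# the clinician's recorded action. Both objectives here are hypothetical
# quadratics; w controls how strongly the clinician's action constrains
# the policy.
def combined_loss(theta, w=0.4):
    rl_loss = (theta - 2.0) ** 2    # stand-in: RL alone prefers action 2.0
    sup_loss = (theta - 1.0) ** 2   # stand-in: the clinician gave action 1.0
    return (1.0 - w) * rl_loss + w * sup_loss

def grad(theta, w=0.4):
    return 2.0 * (1.0 - w) * (theta - 2.0) + 2.0 * w * (theta - 1.0)

theta = 0.0
for _ in range(500):
    theta -= 0.1 * grad(theta)      # gradient descent on the blended loss
print(theta)
```

Gradient descent settles between the RL-optimal and clinician actions (at the weighted blend, 1.6 for w = 0.4), illustrating how the supervised term keeps exploration from straying far from clinical practice while the RL term still pulls toward the long-term goal.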
Affiliation(s)
- Chao Yu
- School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, 510015, China.
- Guoqi Ren
- School of Computer Science and Technology, Dalian University of Technology, Dalian, 110621, China
- Yinzhao Dong
- School of Computer Science and Technology, Dalian University of Technology, Dalian, 110621, China
10.
Affiliation(s)
- Laleh Jalilian
- Department of Anesthesiology and Perioperative Medicine, UCLA David Geffen School of Medicine, Los Angeles, California
11. Schamberg G, Badgeley M, Brown EN. Controlling Level of Unconsciousness by Titrating Propofol with Deep Reinforcement Learning. Artif Intell Med 2020. [DOI: 10.1007/978-3-030-59137-3_3]
12. Zou Q, Ma Q. The application of machine learning to disease diagnosis and treatment. Math Biosci 2019; 320:108305. [PMID: 31857093] [DOI: 10.1016/j.mbs.2019.108305]
Affiliation(s)
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, China.
- Qin Ma
- Department of Biomedical Informatics, The Ohio State University, United States.