1
Suriyaamporn P, Pamornpathomkul B, Patrojanasophon P, Ngawhirunpat T, Rojanarata T, Opanasopit P. The Artificial Intelligence-Powered New Era in Pharmaceutical Research and Development: A Review. AAPS PharmSciTech 2024; 25:188. [PMID: 39147952 DOI: 10.1208/s12249-024-02901-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2024] [Accepted: 07/22/2024] [Indexed: 08/17/2024] Open
Abstract
Currently, artificial intelligence (AI), machine learning (ML), and deep learning (DL) are gaining increased interest in many fields, particularly in pharmaceutical research and development, where they assist in decision-making in complex situations. Numerous research studies and advancements have demonstrated how these computational technologies are used in various pharmaceutical research and development aspects, including drug discovery, personalized medicine, drug formulation, optimization, predictions, drug interactions, pharmacokinetics/pharmacodynamics, quality control/quality assurance, and manufacturing processes. Using advanced modeling techniques, these computational technologies can enhance efficiency and accuracy, handle complex data, and facilitate novel discoveries within minutes. Furthermore, these technologies offer several advantages over conventional statistics. They allow for pattern recognition from complex datasets, and the models, typically developed from data-driven algorithms, can predict a given outcome (model output) from a set of features (model inputs). Additionally, this review discusses emerging trends and provides perspectives on the application of AI with quality by design (QbD) and the future role of AI in this field. Ethical and regulatory considerations associated with integrating AI into pharmaceutical technology are also examined. This review aims to offer insights to researchers, professionals, and others on the current state of AI applications in pharmaceutical research and development and their potential role in the future of research and the era of pharmaceutical Industry 4.0 and 5.0.
Affiliation(s)
- Phuvamin Suriyaamporn
  - Pharmaceutical Development of Green Innovations Group (PDGIG), Department of Industrial Pharmacy, Faculty of Pharmacy, Silpakorn University, Nakhon Pathom, Thailand
- Boonnada Pamornpathomkul
  - Pharmaceutical Development of Green Innovations Group (PDGIG), Department of Industrial Pharmacy, Faculty of Pharmacy, Silpakorn University, Nakhon Pathom, Thailand
- Prasopchai Patrojanasophon
  - Pharmaceutical Development of Green Innovations Group (PDGIG), Department of Industrial Pharmacy, Faculty of Pharmacy, Silpakorn University, Nakhon Pathom, Thailand
- Tanasait Ngawhirunpat
  - Pharmaceutical Development of Green Innovations Group (PDGIG), Department of Industrial Pharmacy, Faculty of Pharmacy, Silpakorn University, Nakhon Pathom, Thailand
- Theerasak Rojanarata
  - Pharmaceutical Development of Green Innovations Group (PDGIG), Department of Industrial Pharmacy, Faculty of Pharmacy, Silpakorn University, Nakhon Pathom, Thailand
- Praneet Opanasopit
  - Pharmaceutical Development of Green Innovations Group (PDGIG), Department of Industrial Pharmacy, Faculty of Pharmacy, Silpakorn University, Nakhon Pathom, Thailand
2
Bomrah S, Uddin M, Upadhyay U, Komorowski M, Priya J, Dhar E, Hsu SC, Syed-Abdul S. A scoping review of machine learning for sepsis prediction- feature engineering strategies and model performance: a step towards explainability. Crit Care 2024; 28:180. [PMID: 38802973 PMCID: PMC11131234 DOI: 10.1186/s13054-024-04948-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Accepted: 05/10/2024] [Indexed: 05/29/2024] Open
Abstract
BACKGROUND Sepsis, an acute and potentially fatal systemic response to infection, significantly impacts global health by affecting millions annually. Prompt identification of sepsis is vital, as treatment delays lead to increased fatalities through progressive organ dysfunction. While recent studies have delved into leveraging Machine Learning (ML) for predicting sepsis, focusing on aspects such as prognosis, diagnosis, and clinical application, there remains a notable deficiency in the discourse regarding feature engineering. Specifically, the role of feature selection and extraction in enhancing model accuracy has been underexplored. OBJECTIVES This scoping review aims to fulfill two primary objectives: to identify pivotal features for predicting sepsis across a variety of ML models, providing valuable insights for future model development, and to assess model efficacy through performance metrics including AUROC, sensitivity, and specificity. RESULTS The analysis included 29 studies across diverse clinical settings such as Intensive Care Units (ICU), Emergency Departments, and others, encompassing 1,147,202 patients. The review highlighted the diversity in prediction strategies and timeframes. It was found that feature extraction techniques notably outperformed others in terms of sensitivity and AUROC values, thus indicating their critical role in improving sepsis prediction models. CONCLUSION Key dynamic indicators, including vital signs and critical laboratory values, are instrumental in the early detection of sepsis. Applying feature selection methods significantly boosts model precision, with models like Random Forest and XGBoost showing promising results. Furthermore, Deep Learning (DL) models reveal unique insights, spotlighting the pivotal role of feature engineering in sepsis prediction, which could greatly benefit clinical practice.
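The three performance metrics named in this abstract (AUROC, sensitivity, specificity) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the labels, scores, 0.5 threshold, and function names are hypothetical and are not taken from the review or from any of the 29 included studies.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 marks a positive case."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 positive and 5 negative patients with model risk scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auroc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUROC={area:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is one reason reviews of this kind tend to report them together.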
Affiliation(s)
- Sherali Bomrah
  - Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, No. 291, Zhongzheng Rd, Zhonghe District, New Taipei City, 235, Taiwan
  - International Center for Health Information Technology, College of Medical Science and Technology, Taipei Medical University, Taipei, 235, Taiwan
  - College of Medicine, Taipei Medical University, Taipei, 110, Taiwan
- Mohy Uddin
  - Research Quality Management Section, King Abdullah International Medical Research Center, King Saud Bin Abdulaziz University for Health Sciences, Ministry of National Guard-Health Affairs, 11426, Riyadh, Saudi Arabia
- Umashankar Upadhyay
  - Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, No. 291, Zhongzheng Rd, Zhonghe District, New Taipei City, 235, Taiwan
  - International Center for Health Information Technology, College of Medical Science and Technology, Taipei Medical University, Taipei, 235, Taiwan
  - School of Biotechnology and Applied Sciences, Shoolini University of Biotechnology and Management Sciences, Solan, 173229, India
- Matthieu Komorowski
  - Faculty of Medicine, Department of Surgery and Cancer, Imperial College of London, South Kensington Campus, London, UK
- Jyoti Priya
  - Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, No. 291, Zhongzheng Rd, Zhonghe District, New Taipei City, 235, Taiwan
  - International Center for Health Information Technology, College of Medical Science and Technology, Taipei Medical University, Taipei, 235, Taiwan
- Eshita Dhar
  - Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, No. 291, Zhongzheng Rd, Zhonghe District, New Taipei City, 235, Taiwan
  - International Center for Health Information Technology, College of Medical Science and Technology, Taipei Medical University, Taipei, 235, Taiwan
- Shih-Chang Hsu
  - Department of Emergency, School of Medicine, College of Medicine, Taipei Medical University, Taipei, 106, Taiwan
  - Emergency Department, Wan Fang Hospital, Taipei Medical University, Taipei, 116, Taiwan
- Shabbir Syed-Abdul
  - Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, No. 291, Zhongzheng Rd, Zhonghe District, New Taipei City, 235, Taiwan
  - International Center for Health Information Technology, College of Medical Science and Technology, Taipei Medical University, Taipei, 235, Taiwan
  - School of Gerontology and Long-Term Care, College of Nursing, Taipei Medical University, Taipei, Taiwan
3
Aliper A, Kudrin R, Polykovskiy D, Kamya P, Tutubalina E, Chen S, Ren F, Zhavoronkov A. Prediction of Clinical Trials Outcomes Based on Target Choice and Clinical Trial Design with Multi-Modal Artificial Intelligence. Clin Pharmacol Ther 2023; 114:972-980. [PMID: 37483175 DOI: 10.1002/cpt.3008] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 02/27/2023] [Accepted: 07/10/2023] [Indexed: 07/25/2023]
Abstract
Drug discovery and development is a notoriously risky process with high failure rates at every stage, including disease modeling, target discovery, hit discovery, lead optimization, preclinical development, human safety, and efficacy studies. Accurate prediction of clinical trial outcomes may help significantly improve the efficiency of this process by prioritizing therapeutic programs that are more likely to succeed in clinical trials and ultimately benefit patients. Here, we describe inClinico, a transformer-based artificial intelligence software platform designed to predict the outcome of phase II clinical trials. The platform combines an ensemble of clinical trial outcome prediction engines that leverage generative artificial intelligence and multimodal data, including omics, text, clinical trial design, and small molecule properties. inClinico was validated in retrospective, quasi-prospective, and prospective validation studies internally and with pharmaceutical companies and financial institutions. The platform achieved 0.88 receiver operating characteristic area under the curve in predicting the phase II to phase III transition on a quasi-prospective validation dataset. The first prospective predictions were made and placed on date-stamped preprint servers in 2016. To validate our model in a real-world setting, we published forecasted outcomes for several phase II clinical trials achieving 79% accuracy for the trials that have read out. We also present an investment application of inClinico using date stamped virtual trading portfolio demonstrating 35% 9-month return on investment.
Affiliation(s)
- Alex Aliper
  - Insilico Medicine AI Ltd, Masdar City, Abu Dhabi, United Arab Emirates
- Roman Kudrin
  - Insilico Medicine AI Ltd, Masdar City, Abu Dhabi, United Arab Emirates
- Petrina Kamya
  - Insilico Medicine Canada Inc., Quebec, Montreal, Canada
- Elena Tutubalina
  - Insilico Medicine Hong Kong Ltd, New Territories, Pak Shek Kok, Hong Kong
- Shan Chen
  - Insilico Medicine Shanghai Ltd, Pudong New District, Shanghai, China
- Feng Ren
  - Insilico Medicine Shanghai Ltd, Pudong New District, Shanghai, China
- Alex Zhavoronkov
  - Insilico Medicine AI Ltd, Masdar City, Abu Dhabi, United Arab Emirates
  - Insilico Medicine Hong Kong Ltd, New Territories, Pak Shek Kok, Hong Kong
4
Niazi SK. The Coming of Age of AI/ML in Drug Discovery, Development, Clinical Testing, and Manufacturing: The FDA Perspectives. Drug Des Devel Ther 2023; 17:2691-2725. [PMID: 37701048 PMCID: PMC10493153 DOI: 10.2147/dddt.s424991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 08/24/2023] [Indexed: 09/14/2023] Open
Abstract
Artificial intelligence (AI) and machine learning (ML) represent significant advancements in computing, building on technologies that humanity has developed over millions of years, from the abacus to quantum computers. These tools have reached a pivotal moment in their development. In 2021 alone, the U.S. Food and Drug Administration (FDA) received over 100 product registration submissions that heavily relied on AI/ML for applications such as monitoring and improving human performance in compiling dossiers. To ensure the safe and effective use of AI/ML in drug discovery and manufacturing, the FDA and numerous other U.S. federal agencies have issued continuously updated, stringent guidelines. Intriguingly, these guidelines are often generated or updated with the aid of AI/ML tools themselves. The overarching goal is to expedite drug discovery, enhance the safety profiles of existing drugs, introduce novel treatment modalities, and improve manufacturing compliance and robustness. Recent FDA publications offer an encouraging outlook on the potential of these tools, emphasizing the need for their careful deployment. This has expanded market opportunities for retraining personnel handling these technologies and enabled innovative applications in emerging therapies such as gene editing, CRISPR-Cas9, CAR-T cells, mRNA-based treatments, and personalized medicine. In summary, the maturation of AI/ML technologies is a testament to human ingenuity. Far from being autonomous entities, these are tools created by and for humans designed to solve complex problems now and in the future. This paper aims to present the status of these technologies, along with examples of their present and future applications.
5
Binko MA, Reitz KM, Chaer RA, Haga LM, Go C, Alie-Cusson FS, Tzeng E, Eslami MH, Sridharan ND. Selective Publication within Vascular Surgery: Characteristics of Discontinued and Unpublished Randomized Clinical Trials. Ann Vasc Surg 2023; 95:251-261. [PMID: 37311508 DOI: 10.1016/j.avsg.2023.05.035] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/13/2023] [Revised: 05/25/2023] [Accepted: 05/31/2023] [Indexed: 06/15/2023]
Abstract
BACKGROUND Discontinued and unpublished randomized clinical trials (RCTs) are common, resulting in biased publication and loss of potential knowledge. The magnitude of selective publication within vascular surgery remains unknown. METHODS RCTs relevant to vascular surgery registered (01/01/2010-10/31/2019) on ClinicalTrials.gov were included. Trials ending normally with conclusion of participant treatment and examination were considered completed, whereas discontinued trials stopped early. Publications were identified through automatically indexed PubMed citations on ClinicalTrials.gov or manually identified on PubMed or Google Scholar >30 months after the completion date, the date the final participant was examined, allowing time for publication. RESULTS Of 108 RCTs (n = 37,837), 22.2% (24/108) were discontinued, including 16.7% (4/24) stopped prior to and 83.3% (20/24) after starting enrollment. Only 28.4% of estimated enrollment was achieved for all discontinued RCTs. Nineteen (79.2%) investigators provided a reason for discontinuation, which most commonly included poor enrollment (45.8%), inadequate supplies or funding (12.5%), and trial design concerns (8.3%). Of the 20 trials terminated following enrollment, 20.0% (4/20) were published in peer-reviewed journals and 80.0% (16/20) failed to reach publication. Of the 77.8% of trials completed, 75.0% (63/84) were published and 25.0% (21/84) remain unpublished. In a multivariate regression of completed trials, industry funding was significantly associated with decreased likelihood of peer-reviewed publication (OR = 0.18, 95% CI 0.05-0.71, P = 0.01). Of the discontinued and completed trials remaining unpublished, 62.5% and 61.9% failed to report results on ClinicalTrials.gov, respectively, encompassing a total of 4,788 enrollees without publicly available results. CONCLUSIONS Nearly 25% of registered vascular RCTs were discontinued. Of completed RCTs, 25% remain unpublished, with industry funding associated with decreased likelihood of publication. This study identifies opportunities to report all findings for completed and discontinued vascular surgery RCTs, whether industry-sponsored or investigator-initiated.
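The proportions reported in this abstract can be recomputed from the counts it states. The short Python sketch below does only that arithmetic; the helper name `pct` and the dictionary keys are illustrative choices, and no counts beyond those printed in the abstract are assumed.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * numerator / denominator, 1)


registered = 108                        # RCTs included
discontinued = 24                       # stopped early
completed = registered - discontinued   # 84 trials ended normally

reported = {
    "discontinued / registered": pct(discontinued, registered),
    "stopped before enrollment / discontinued": pct(4, discontinued),
    "stopped after enrollment / discontinued": pct(20, discontinued),
    "gave a reason / discontinued": pct(19, discontinued),
    "published / terminated after enrollment": pct(4, 20),
    "completed / registered": pct(completed, registered),
    "published / completed": pct(63, completed),
    "unpublished / completed": pct(21, completed),
}

for label, value in reported.items():
    print(f"{label}: {value}%")
```

Each value agrees with the corresponding figure in the RESULTS section (22.2%, 16.7%, 83.3%, 79.2%, 20.0%, 77.8%, 75.0%, and 25.0%).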
Affiliation(s)
- Mary A Binko
  - School of Medicine, University of Pittsburgh, Pittsburgh, PA
- Katherine M Reitz
  - Division of Vascular Surgery, Department of Surgery, UPMC, Pittsburgh, PA
- Rabih A Chaer
  - Division of Vascular Surgery, Department of Surgery, UPMC, Pittsburgh, PA
- Lindsey M Haga
  - Division of Vascular Surgery, Department of Surgery, UPMC, Pittsburgh, PA
- Catherine Go
  - Division of Vascular Surgery, Department of Surgery, UPMC, Pittsburgh, PA
- Edith Tzeng
  - Division of Vascular Surgery, Department of Surgery, UPMC, Pittsburgh, PA
- Mohammad H Eslami
  - Division of Vascular Surgery, Department of Surgery, UPMC, Pittsburgh, PA
6
Budennyy S, Kazakov A, Kovtun E, Zhukov L. New drugs and stock market: a machine learning framework for predicting pharma market reaction to clinical trial announcements. Sci Rep 2023; 13:12817. [PMID: 37550410 PMCID: PMC10406841 DOI: 10.1038/s41598-023-39301-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Accepted: 07/23/2023] [Indexed: 08/09/2023] Open
Abstract
Pharmaceutical companies operate in a strictly regulated and highly risky environment in which a single slip can lead to serious financial implications. Accordingly, the announcements of clinical trial results tend to determine the future course of events, hence being closely monitored by the public. Most works focus on retrospective analysis of announcement impact on company stock prices, bypassing the consideration of the problem in the predictive paradigm. In this work, we aim to close this gap by proposing a framework that allows predicting the numerical values of announcement-induced changes in stock prices. In fact, it is a problem of the impact prediction of the specific event on the corresponding time series. Our framework includes a BERT model for extracting the sentiment polarity of announcements, a Temporal Fusion Transformer for forecasting the expected return, a graph convolution network for capturing event relationships, and gradient boosting for predicting the price change. We operate with one of the biggest FDA (the Food and Drug Administration) datasets, consisting of 5436 clinical trial announcements from 681 companies for the years 2018-2022. During the study, we get several significant outcomes and domain-specific insights. Firstly, we obtain statistical evidence for the clinical result promulgation influence on the public pharma market value. Secondly, we witness inherently different patterns of responses to positive and negative announcements, reflected in a stronger and more pronounced reaction to negative clinical news. Thirdly, we discover two factors that play a crucial role in a predictive framework: (1) the drug portfolio size of the company, indicating the greater susceptibility to an announcement in the case of low diversification among drug products and (2) the announcement network effect, manifesting through an increase in predictive power when exploiting interdependencies of events belonging to the same company or nosology. Finally, we prove the viability of the forecast setting by getting ROC AUC scores predominantly greater than 0.7 for the classification of price change on historical data. We emphasize the transferability and generalizability of the developed framework on other datasets and domains but on the condition of the presence of two key entities: events and the associated time series.
Affiliation(s)
- Semen Budennyy
  - Sber AI Lab, Moscow, Russia
  - Artificial Intelligence Research Institute (AIRI), Moscow, Russia
- Leonid Zhukov
  - Higher School of Economics University, Moscow, Russia
7
Vora LK, Gholap AD, Jetha K, Thakur RRS, Solanki HK, Chavda VP. Artificial Intelligence in Pharmaceutical Technology and Drug Delivery Design. Pharmaceutics 2023; 15:1916. [PMID: 37514102 PMCID: PMC10385763 DOI: 10.3390/pharmaceutics15071916] [Citation(s) in RCA: 61] [Impact Index Per Article: 61.0] [Received: 05/06/2023] [Revised: 06/28/2023] [Accepted: 07/04/2023] [Indexed: 07/30/2023] Open
Abstract
Artificial intelligence (AI) has emerged as a powerful tool that harnesses anthropomorphic knowledge and provides expedited solutions to complex challenges. Remarkable advancements in AI technology and machine learning present a transformative opportunity in the drug discovery, formulation, and testing of pharmaceutical dosage forms. By utilizing AI algorithms that analyze extensive biological data, including genomics and proteomics, researchers can identify disease-associated targets and predict their interactions with potential drug candidates. This enables a more efficient and targeted approach to drug discovery, thereby increasing the likelihood of successful drug approvals. Furthermore, AI can contribute to reducing development costs by optimizing research and development processes. Machine learning algorithms assist in experimental design and can predict the pharmacokinetics and toxicity of drug candidates. This capability enables the prioritization and optimization of lead compounds, reducing the need for extensive and costly animal testing. Personalized medicine approaches can be facilitated through AI algorithms that analyze real-world patient data, leading to more effective treatment outcomes and improved patient adherence. This comprehensive review explores the wide-ranging applications of AI in drug discovery, drug delivery dosage form designs, process optimization, testing, and pharmacokinetics/pharmacodynamics (PK/PD) studies. This review provides an overview of various AI-based approaches utilized in pharmaceutical technology, highlighting their benefits and drawbacks. Nevertheless, the continued investment in and exploration of AI in the pharmaceutical industry offer exciting prospects for enhancing drug development processes and patient care.
Affiliation(s)
- Lalitkumar K Vora
  - School of Pharmacy, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, UK
- Amol D Gholap
  - Department of Pharmaceutics, St. John Institute of Pharmacy and Research, Palghar 401404, Maharashtra, India
- Keshava Jetha
  - Department of Pharmaceutics and Pharmaceutical Technology, L. M. College of Pharmacy, Ahmedabad 380009, Gujarat, India
  - Ph.D. Section, Gujarat Technological University, Ahmedabad 382424, Gujarat, India
- Hetvi K Solanki
  - Pharmacy Section, L. M. College of Pharmacy, Ahmedabad 380009, Gujarat, India
- Vivek P Chavda
  - Department of Pharmaceutics and Pharmaceutical Technology, L. M. College of Pharmacy, Ahmedabad 380009, Gujarat, India
8
Ferdowsi S, Knafou J, Borissov N, Vicente Alvarez D, Mishra R, Amini P, Teodoro D. Deep learning-based risk prediction for interventional clinical trials based on protocol design: A retrospective study. Patterns (New York, N.Y.) 2023; 4:100689. [PMID: 36960445 PMCID: PMC10028430 DOI: 10.1016/j.patter.2023.100689] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/29/2022] [Revised: 11/07/2022] [Accepted: 01/16/2023] [Indexed: 02/12/2023]
Abstract
The success rate of clinical trials (CTs) is low, with the protocol design itself being considered a major risk factor. We aimed to investigate the use of deep learning methods to predict the risk of CTs based on their protocols. Considering protocol changes and their final status, a retrospective risk assignment method was proposed to label CTs according to low, medium, and high risk levels. Then, transformer and graph neural networks were designed and combined in an ensemble model to learn to infer the ternary risk categories. The ensemble model achieved robust performance (area under the receiver operating characteristic curve [AUROC] of 0.8453 [95% confidence interval: 0.8409-0.8495]), similar to the individual architectures but significantly outperforming a baseline based on bag-of-words features (0.7548 [0.7493-0.7603] AUROC). We demonstrate the potential of deep learning in predicting the risk of CTs from their protocols, paving the way for customized risk mitigation strategies during protocol design.
Affiliation(s)
- Sohrab Ferdowsi
  - Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
  - Geneva School of Business Administration, HES-SO University of Applied Sciences and Arts of Western Switzerland, Geneva, Switzerland
- Julien Knafou
  - Geneva School of Business Administration, HES-SO University of Applied Sciences and Arts of Western Switzerland, Geneva, Switzerland
- Nikolay Borissov
  - Clinical Trials Unit, University of Bern, Bern, Switzerland
  - Risklick AG, Bern, Switzerland
- David Vicente Alvarez
  - Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
  - Geneva School of Business Administration, HES-SO University of Applied Sciences and Arts of Western Switzerland, Geneva, Switzerland
- Rahul Mishra
  - Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
- Poorya Amini
  - Clinical Trials Unit, University of Bern, Bern, Switzerland
  - Risklick AG, Bern, Switzerland
- Douglas Teodoro
  - Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
  - Geneva School of Business Administration, HES-SO University of Applied Sciences and Arts of Western Switzerland, Geneva, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Corresponding author
9
Zhang E, DuBois SG. Early Termination of Oncology Clinical Trials in the United States. Cancer Med 2023; 12:5517-5525. [PMID: 36305832 PMCID: PMC10028157 DOI: 10.1002/cam4.5385] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/01/2022] [Revised: 10/07/2022] [Accepted: 10/16/2022] [Indexed: 11/08/2022] Open
Abstract
PURPOSE The aim of this study was to evaluate the rate of early trial discontinuation of oncology trials and reasons for early termination, to assess potential trends in rates of oncology trial termination, and to perform a comprehensive analysis of predictors of early termination. This study intends to inform efforts in improving efficiency of the oncology clinical trial enterprise. METHODS We conducted a cross-sectional study of interventional cancer clinical trials registered in ClinicalTrials.gov database from September 27, 2007 to June 30, 2015, with at least one site listed in the United States. We evaluated predictors of early trial termination using Fisher exact or χ2 tests and logistic regression. RESULTS Of 8687 trials, 22.74% (n = 1975) were terminated trials. Rates of early trial termination appeared stable over the study. Statistically significant univariate predictors of early termination for any reason include cancer category, phase, funding source, location, and age. In multivariable analysis, trials spanning multiple cancer categories and international trials were less likely to terminate early whereas phase 2 trials and trials funded by academia/foundation were more likely to terminate early. The most common reason for early termination was "Other, Multiple Reasons, or Unknown" (36.9%), followed by accrual issues (34.5%). In multivariate analysis among all terminated trials, supportive care trials, phase 2 trials, and non-industry funded trials had significantly higher odds of trial discontinuation specifically due to poor accrual. CONCLUSION In this national sample of cancer clinical trials, early trial discontinuation was common. Many factors influenced early trial termination with poor accrual being a common reason. Specific trial features are associated with differential likelihood of early trial termination for any reason and for early trial termination due to poor accrual.
Affiliation(s)
- Ellen Zhang
  - Harvard Medical School, Boston, Massachusetts, USA
- Steven G DuBois
  - Dana-Farber/Boston Children's Cancer and Blood Disorders Center, Harvard Medical School, Boston, Massachusetts, USA
10
Improving clinical trial design using interpretable machine learning based prediction of early trial termination. Sci Rep 2023; 13:121. [PMID: 36599880 PMCID: PMC9813129 DOI: 10.1038/s41598-023-27416-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/26/2022] [Accepted: 01/02/2023] [Indexed: 01/06/2023] Open
Abstract
This study proposes using a machine learning pipeline to optimise clinical trial design. The goal is to predict early termination probability of clinical trials using machine learning modelling, and to understand feature contributions driving early termination. This will inform further suggestions to the study protocol to reduce the risk of wasted resources. A dataset containing 420,268 clinical trial records and 24 fields was extracted from the ct.gov registry. In addition to study characteristics features, 12,864 eligibility criteria search features are used, generated using a public annotated eligibility criteria dataset, CHIA. Furthermore, disease categorization features are used, allowing a study to belong to more than one category specified by clinicaltrials.gov. Ensemble models including random forest and extreme gradient boosting classifiers were used to train and evaluate predictive performance. We achieved a receiver operating characteristic area under the curve score of 0.80, and balanced accuracy of 0.70 on the test set using gradient boosting classification. We used Shapley Additive Explanations to interpret the termination predictions to flag feature contributions. The proposed pipeline will lead to an optimised clinical trial design and consequently help potentially life-saving treatments reach patients faster.
11
Eysenbach G, Šuster S, Baldwin T, Verspoor K. Predicting Publication of Clinical Trials Using Structured and Unstructured Data: Model Development and Validation Study. J Med Internet Res 2022; 24:e38859. [PMID: 36563029 PMCID: PMC9823568 DOI: 10.2196/38859] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/19/2022] [Revised: 10/14/2022] [Accepted: 11/16/2022] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND: Publication of registered clinical trials is a critical step in the timely dissemination of trial findings. However, a significant proportion of completed clinical trials are never published, motivating the need to analyze the factors behind success or failure to publish. This could inform study design, help regulatory decision-making, and improve resource allocation. It could also enhance our understanding of bias in the publication of trials and publication trends based on the research direction or strength of the findings. Although the publication of clinical trials has been addressed in several descriptive studies at an aggregate level, there is a lack of research on the predictive analysis of a trial's publishability given an individual (planned) clinical trial description.
OBJECTIVE: We aimed to conduct a study that combined structured and unstructured features relevant to publication status in a single predictive approach. Established natural language processing techniques as well as recent pretrained language models enabled us to incorporate information from the textual descriptions of clinical trials into a machine learning approach. We were particularly interested in whether and which textual features could improve the classification accuracy for publication outcomes.
METHODS: In this study, we used metadata from ClinicalTrials.gov (a registry of clinical trials) and MEDLINE (a database of academic journal articles) to build a data set of clinical trials (N=76,950) that contained the description of a registered trial and its publication outcome (27,702/76,950, 36% published and 49,248/76,950, 64% unpublished). This is the largest data set of its kind, which we released as part of this work. The publication outcome in the data set was identified from MEDLINE based on clinical trial identifiers. We carried out a descriptive analysis and predicted the publication outcome using 2 approaches: a neural network with a large domain-specific language model and a random forest classifier using a weighted bag-of-words representation of text.
RESULTS: First, our analysis of the newly created data set corroborates several findings from the existing literature regarding attributes associated with a higher publication rate. Second, a crucial observation from our predictive modeling was that the addition of textual features (eg, eligibility criteria) offers consistent improvements over using only structured data (F1-score=0.62-0.64 vs F1-score=0.61 without textual features). Both pretrained language models and more basic word-based representations provide high-utility text representations, with no significant empirical difference between the two.
CONCLUSIONS: Different factors affect the publication of a registered clinical trial. Our approach to predictive modeling combines heterogeneous features, both structured and unstructured. We show that methods from natural language processing can provide effective textual features to enable more accurate prediction of publication success, which has not been explored for this task previously.
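The F1-scores reported above combine precision and recall for one class into a single number, which is useful when classes are imbalanced (here, 36% published vs 64% unpublished). The sketch below shows the standard definition only; the function name and the toy labels are illustrative assumptions and are not taken from the study's code or data.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = published, 0 = unpublished.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
f1 = f1_score(y_true, y_pred)  # tp=2, fp=1, fn=1
```

In this toy case precision and recall are both 2/3, so the F1-score is also 2/3.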
Affiliation(s)
- Simon Šuster
- School of Computing and Information Systems, University of Melbourne, Melbourne, Australia
- Timothy Baldwin
- School of Computing and Information Systems, University of Melbourne, Melbourne, Australia; Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Karin Verspoor
- School of Computing Technologies, RMIT University, Melbourne, Australia
12
Chen Z, Peng B, Ioannidis VN, Li M, Karypis G, Ning X. A knowledge graph of clinical trials ([Formula: see text]). Sci Rep 2022; 12:4724. [PMID: 35304504 PMCID: PMC8933553 DOI: 10.1038/s41598-022-08454-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/05/2021] [Accepted: 02/28/2022] [Indexed: 02/05/2023]
Abstract
Effective and successful clinical trials are essential in developing new drugs and advancing new treatments. However, clinical trials are very expensive and prone to failure. The high cost and low success rate of clinical trials motivate research on inferring knowledge from existing clinical trials in innovative ways to inform the design of future clinical trials. In this manuscript, we present our efforts to construct the first publicly available Clinical Trials Knowledge Graph, denoted as [Formula: see text]. [Formula: see text] includes nodes representing medical entities in clinical trials (e.g., studies, drugs and conditions), and edges representing the relations among these entities (e.g., drugs used in studies). Our embedding analysis demonstrates the potential utility of [Formula: see text] in various applications such as drug repurposing and similarity search, among others.
Affiliation(s)
- Ziqi Chen
- The Ohio State University, Columbus, USA
- Bo Peng
- The Ohio State University, Columbus, USA
- Mufei Li
- Amazon Web Services Shanghai AI Lab, Shanghai, China
- Xia Ning
- The Ohio State University, Columbus, USA
13
On Graph Construction for Classification of Clinical Trials Protocols Using Graph Neural Networks. Artif Intell Med 2022. [DOI: 10.1007/978-3-031-09342-5_24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
14
Kim B, Jang YJ, Cho HR, Kim SY, Jeong JE, Shim MK, Kim MG. Predicting completion of clinical trials in pregnant women: Cox proportional hazard and neural network models. Clin Transl Sci 2021; 15:691-699. [PMID: 34735737 PMCID: PMC8932703 DOI: 10.1111/cts.13187] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/12/2021] [Revised: 09/25/2021] [Accepted: 10/21/2021] [Indexed: 12/01/2022]
Abstract
This study aimed to develop a model for predicting the completion of clinical trials involving pregnant women using the Cox proportional hazard model and a neural network model (DeepSurv) and to compare the predictive performance of the two methods. From ClinicalTrials.gov, we collected data on 819 interventional clinical trials conducted in pregnant women from 2009 to 2018 that used at least one drug as an intervention. The Cox proportional hazard model and DeepSurv were used to develop models that predict clinical trial completion. The concordance index (C-index) was used to evaluate predictive performance. The Cox proportional hazard model revealed that a sample size of n ≥ 329 (hazard ratio [HR] = 0.53), very high human development index (HDI) country (HR = 0.28), abortion (HR = 3.30), labor (HR = 2.16), and iron deficiency anemia (HR = 2.29) were significantly related to the probability of clinical trial completion (all p values < 0.01). The C-indices for the model development dataset and the test dataset were 0.72 and 0.73, respectively. The DeepSurv model consisted of one hidden layer with 16 nodes. DeepSurv showed a C-index comparable to that of the Cox proportional hazard model: its C-indices for the training dataset and the test dataset were 0.76 and 0.72, respectively. Furthermore, a nomogram that calculates the probability of clinical trial completion at 1 year, 3 years, and 5 years was developed. Both the Cox proportional hazard model and DeepSurv yielded sufficient predictive performance. We hope that this study will contribute to the execution of future clinical trials in pregnant women.
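The concordance index used above can be read as the fraction of comparable pairs that a model ranks in the right order. The following is a minimal sketch of Harrell's C-index, assuming that higher risk scores should correspond to earlier events and that ties in the score count as half-concordant; the function name and the toy data are illustrative assumptions, not the study's implementation.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored time-to-event data.

    A pair (i, j) is comparable when subject i had an observed event
    strictly earlier than subject j's event or censoring time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: follow-up time, event flag (0 = censored), predicted risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risk)  # 4 of 5 comparable pairs
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.72 to 0.76 should be read.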
Affiliation(s)
- Bomee Kim
- Graduate School of Clinical Biohealth, Ewha Womans University, Seoul, Korea
- Yun Ji Jang
- College of Pharmacy, CHA University, Pocheon, Korea
- Hae Ram Cho
- College of Pharmacy, CHA University, Pocheon, Korea
- So Yeon Kim
- College of Pharmacy, CHA University, Pocheon, Korea
- Ji Eun Jeong
- College of Pharmacy, CHA University, Pocheon, Korea
- Myeong Gyu Kim
- College of Pharmacy, Ewha Womans University, Seoul, Korea; Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul, Korea
15
Understanding and predicting COVID-19 clinical trial completion vs. cessation. PLoS One 2021; 16:e0253789. [PMID: 34252108 PMCID: PMC8274906 DOI: 10.1371/journal.pone.0253789] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/06/2021] [Accepted: 06/12/2021] [Indexed: 11/19/2022]
Abstract
As of March 30, 2021, over 5,193 COVID-19 clinical trials have been registered through ClinicalTrials.gov. Among them, 191 trials were terminated, suspended, or withdrawn (indicating the cessation of the study). On the other hand, 909 trials have been completed (indicating the completion of the study). In this study, we propose to study the underlying factors of COVID-19 trial completion vs. cessation, and design predictive models to accurately predict whether a COVID-19 trial may complete or cease in the future. We collect 4,441 COVID-19 trials from ClinicalTrials.gov to build a testbed, and design four types of features to characterize clinical trial administration, eligibility, study information, criteria, drug types, study keywords, as well as embedding features commonly used in state-of-the-art machine learning. Our study shows that drug features and study keywords are the most informative features, but all four types of features are essential for accurate trial prediction. By using predictive models, our approach achieves an AUC (Area Under the Curve) score of more than 0.87 and a balanced accuracy of 0.81 in predicting COVID-19 clinical trial completion vs. cessation. Our research shows that computational methods can deliver effective features to understand the differences between completed and ceased COVID-19 trials. In addition, such models can also predict COVID-19 trial status with satisfactory accuracy, and help stakeholders better plan trials and minimize costs.
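Because completed trials far outnumber ceased ones in this setting (909 vs 191 among the registered trials cited above), plain accuracy can look good for a model that mostly predicts "completed". Balanced accuracy, the mean of per-class recall, corrects for this. The sketch below illustrates the standard definition; the function names and the toy labels are assumptions for illustration and do not come from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)


# Hypothetical labels: 1 = completed, 0 = ceased (imbalanced 8 to 2).
y_true = [1] * 8 + [0] * 2
y_pred = [1] * 8 + [1, 0]  # one ceased trial is misclassified as completed

acc = accuracy(y_true, y_pred)               # 9 of 10 correct
bal_acc = balanced_accuracy(y_true, y_pred)  # mean of recall 1.0 and 0.5
```

Here plain accuracy is 0.9 while balanced accuracy is 0.75, since half of the minority class is missed.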
16
Elkin ME, Zhu X. Community and topic modeling for infectious disease clinical trial recommendation. Netw Model Anal Health Inform Bioinform 2021; 10:47. [PMID: 34254037 PMCID: PMC8262767 DOI: 10.1007/s13721-021-00321-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/12/2021] [Revised: 05/24/2021] [Accepted: 05/25/2021] [Indexed: 11/30/2022]
Abstract
Clinical trials are crucial for the advancement of treatment and knowledge within the medical community. Although the ClinicalTrials.gov initiative has resulted in a rich source of information for clinical trial research, only a handful of analytic studies have been carried out to understand this valuable data source. Analysis of this database provides insight into emerging trends in clinical research. In this study, we propose to use network analysis to understand infectious disease clinical trial research. Our goal is to understand two important issues related to clinical trials: (1) the concentrations and characteristics of infectious disease clinical trial research, and (2) the recommendation of clinical trials to a sponsor (or an investigator). The first issue helps summarize clinical trial research related to one or more particular diseases, and the second issue helps match clinical trial sponsors and investigators for information recommendation. Using 4228 clinical trials as the test bed, our study investigates 4864 sponsors and 1879 research areas characterized by Medical Subject Heading (MeSH) keywords. We use a network to characterize infectious disease clinical trials, and design a new community-topic-based link prediction approach to predict sponsors’ interests. Our design relies on network modeling of both clinical trial sponsors and keywords. For sponsors, we extract communities, with each community consisting of sponsors with coherent interests. For keywords, we extract topics, with each topic containing semantically consistent keywords. The communities and topics are combined for accurate clinical trial recommendation. This study concludes that network analysis can substantially aid the understanding of clinical trial research for effective summarization, characterization, and prediction.
Affiliation(s)
- Magdalyn E Elkin
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431 USA
- Xingquan Zhu
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431 USA