1
Wossnig L, Furtmann N, Buchanan A, Kumar S, Greiff V. Best practices for machine learning in antibody discovery and development. Drug Discov Today 2024; 29:104025. [PMID: 38762089 DOI: 10.1016/j.drudis.2024.104025] [Received: 12/14/2023] [Revised: 04/25/2024] [Accepted: 05/13/2024] [Indexed: 05/20/2024]
Abstract
In the past 40 years, therapeutic antibody discovery and development have advanced considerably, with machine learning (ML) offering a promising way to speed up the process by reducing costs and the number of experiments required. Recent progress in ML-guided antibody design and development (D&D) has been hindered by the diversity of data sets and evaluation methods, which makes it difficult to conduct comparisons and assess utility. Establishing standards and guidelines will be crucial for the wider adoption of ML and the advancement of the field. This perspective critically reviews current practices, highlights common pitfalls and proposes method development and evaluation guidelines for various ML-based techniques in therapeutic antibody D&D. Addressing challenges across the ML process, best practices are recommended for each stage to enhance reproducibility and progress.
Affiliation(s)
- Leonard Wossnig
- LabGenius Ltd, The Biscuit Factory, 100 Drummond Road, London SE16 4DG, UK; Department of Computer Science, University College London, 66-72 Gower St, London WC1E 6EA, UK.
- Norbert Furtmann
- R&D Large Molecules Research Platform, Sanofi Deutschland GmbH, Industriepark Höchst, Frankfurt Am Main, Germany
- Andrew Buchanan
- Biologics Engineering, R&D, AstraZeneca, Cambridge CB2 0AA, UK
- Sandeep Kumar
- Computational Protein Design and Modeling Group, Computational Science, Moderna Therapeutics, 200 Technology Square, Cambridge, MA 02139, USA
- Victor Greiff
- Department of Immunology and Oslo University Hospital, University of Oslo, Oslo, Norway
2
Hasan F, Muhtar MS, Wu D, Chen PY, Hsu MH, Nguyen PA, Chen TJ, Chiu HY. Web-based artificial intelligence to predict cognitive impairment following stroke: A multicenter study. J Stroke Cerebrovasc Dis 2024; 33:107826. [PMID: 38908612 DOI: 10.1016/j.jstrokecerebrovasdis.2024.107826] [Received: 01/24/2024] [Revised: 06/05/2024] [Accepted: 06/18/2024] [Indexed: 06/24/2024] Open
Abstract
BACKGROUND AND PURPOSE Post-stroke cognitive impairment (PSCI) is highly prevalent in modern society. However, few studies have applied an accurate and explainable machine learning model to predict PSCI. The aim of this study is to develop and validate a web-based artificial intelligence (AI) tool for predicting PSCI. METHODS A retrospective cohort study was conducted to develop and validate a web-based prediction model. Adults who experienced a stroke between January 1, 2004, and September 30, 2017, were enrolled, and patients with PSCI were followed up from the stroke index date until their last follow-up. The model's performance metrics, including accuracy, area under the curve (AUC), recall, precision, and F1 score, were compared. RESULTS A total of 3209 stroke patients were included in the study. The model demonstrated an accuracy of 0.8793, AUC of 0.9200, recall of 0.6332, precision of 0.9664, and F1 score of 0.7651. In the external validation phase, the accuracy improved to 0.9039, recall to 0.7457, and F1 score to 0.8224, while AUC (0.9094) and precision (0.9168) were slightly lower than in development. The final model can be accessed at https://psci-calculator.my.id/. CONCLUSION This work produced a user-friendly interface that health practitioners can use for early prediction of PSCI. These findings also suggest that the provided AI model is reliable and can serve as a roadmap for future studies using AI models in a clinical setting.
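The reported F1 scores can be cross-checked against the reported precision and recall, since F1 is the harmonic mean of the two. The snippet below is only an illustrative consistency check, not the authors' code; the function name `f1_score` is an assumption made for this sketch.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs as reported in the abstract above.
internal_f1 = f1_score(precision=0.9664, recall=0.6332)
external_f1 = f1_score(precision=0.9168, recall=0.7457)

print(round(internal_f1, 4))  # 0.7651, matching the reported development F1
print(round(external_f1, 4))  # 0.8224, matching the reported external F1
```

Both values agree with the abstract to four decimal places, so the reported precision, recall, and F1 figures are mutually consistent.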
Affiliation(s)
- Faizul Hasan
- Faculty of Nursing, Chulalongkorn University, Boromarajonani Srisataphat Building, 12th Floor, Rama1 Road, Wang Mai, Pathum Wan, Bangkok 10330, Thailand; School of Nursing, College of Nursing, Taipei Medical University, No. 250, Wuxing St., Xinyi Dist., Taipei City 110, Taiwan
- Dean Wu
- Research Center of Sleep Medicine, College of Medicine, Taipei Medical University 110, Taipei City, Taiwan; Department of Neurology, School of Medicine, College of Medicine, Taipei Medical University, Taipei 110, Taiwan; Department of Neurology, Shuang-Ho Hospital, New Taipei City 23561, Taiwan
- Pin-Yuan Chen
- Department of Neurosurgery, Chang Gung Memorial Hospital, Keelung City 204, Taiwan; School of Medicine, College of Medicine, Chang Gung University, Taoyuan City 333, Taiwan
- Min-Huei Hsu
- Graduate Institute of Data Science, Taipei Medical University, Taipei City 110, Taiwan
- Phung Anh Nguyen
- Graduate Institute of Data Science, Taipei Medical University, Taipei City 110, Taiwan
- Ting-Jhen Chen
- Faculty of Nursing, Chulalongkorn University, Boromarajonani Srisataphat Building, 12th Floor, Rama1 Road, Wang Mai, Pathum Wan, Bangkok 10330, Thailand; School of Nursing, Faculty of Science, Medicine and Health, University of Wollongong, Northfields Ave, Wollongong, NSW 2522, Australia
- Hsiao-Yean Chiu
- School of Nursing, College of Nursing, Taipei Medical University, No. 250, Wuxing St., Xinyi Dist., Taipei City 110, Taiwan; Research Center of Sleep Medicine, College of Medicine, Taipei Medical University 110, Taipei City, Taiwan; Department of Nursing, Taipei Medical University Hospital, Taipei City 110, Taiwan.
3
Shon S, Lim K, Chae M, Lee H, Choi J. Predicting Sudden Sensorineural Hearing Loss Recovery with Patient-Personalized Seigel's Criteria Using Machine Learning. Diagnostics (Basel) 2024; 14:1296. [PMID: 38928711 PMCID: PMC11202901 DOI: 10.3390/diagnostics14121296] [Received: 04/13/2024] [Revised: 06/04/2024] [Accepted: 06/15/2024] [Indexed: 06/28/2024] Open
Abstract
BACKGROUND Accurate prognostic prediction is crucial for managing Idiopathic Sudden Sensorineural Hearing Loss (ISSHL). Previous studies developing ISSHL prognosis models often overlooked individual variability in hearing damage by relying on fixed frequency domains. This study aims to develop models predicting ISSHL prognosis one month after treatment, focusing on patient-specific hearing impairments. METHODS Patient-Personalized Seigel's Criteria (PPSC) were developed considering patient-specific hearing impairment related to ISSHL criteria. We performed a statistical test to assess the shift in the recovery assessment when applying PPSC. The utilized dataset of 581 patients comprised demographic information, health records, laboratory testing, onset and treatment, and hearing levels. To reduce the model's reliance on hearing level features, we used only the averages of hearing levels of the impaired frequencies. Then, model development, evaluation, and interpretation proceeded. RESULTS The chi-square test (p-value: 0.106) indicated that the shift in recovery assessment is not statistically significant. The soft-voting ensemble model was most effective, achieving an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.864 (95% CI: 0.801-0.927), with model interpretation based on the SHapley Additive exPlanations value. CONCLUSIONS With PPSC, providing a hearing assessment comparable to traditional Seigel's criteria, the developed models successfully predicted ISSHL recovery one month post-treatment by considering patient-specific impairments.
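As a rough illustration of the soft-voting idea named in the abstract: each base model outputs a probability for the positive class, the probabilities are averaged, and the average is thresholded. This is a minimal sketch under stated assumptions. The function name `soft_vote`, the 0.5 threshold, the equal model weights, and the probabilities are all hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def soft_vote(model_probs: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """Average positive-class probabilities across models, then threshold."""
    n_models = len(model_probs)
    n_samples = len(model_probs[0])
    averaged = [
        sum(model[i] for model in model_probs) / n_models
        for i in range(n_samples)
    ]
    return [1 if p >= threshold else 0 for p in averaged]


# Hypothetical recovery probabilities from three base models for three patients.
model_a = [0.9, 0.4, 0.2]
model_b = [0.7, 0.6, 0.1]
model_c = [0.8, 0.3, 0.4]

labels = soft_vote([model_a, model_b, model_c])
print(labels)  # [1, 0, 0]
```

Unlike hard (majority) voting, soft voting lets a confident model outweigh less confident ones, which is one reason it is often preferred when the base models are reasonably calibrated.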
Affiliation(s)
- Sanghyun Shon
- Department of Biomedical Informatics, Korea University College of Medicine, Seoul 02708, Republic of Korea
- Kanghyeon Lim
- Department of Otorhinolaryngology-Head and Neck Surgery, Korea University Ansan Hospital, Ansan-si 15355, Republic of Korea
- Minsu Chae
- Department of Biomedical Informatics, Korea University College of Medicine, Seoul 02708, Republic of Korea
- Hwamin Lee
- Department of Biomedical Informatics, Korea University College of Medicine, Seoul 02708, Republic of Korea
- June Choi
- Department of Biomedical Informatics, Korea University College of Medicine, Seoul 02708, Republic of Korea
- Department of Otorhinolaryngology-Head and Neck Surgery, Korea University Ansan Hospital, Ansan-si 15355, Republic of Korea
4
Zhang G, Xie Q, Wang C, Xu J, Liu G, Su C. Intelligent alert system for predicting invasive mechanical ventilation needs via noninvasive parameters: employing an integrated machine learning method with integration of multicenter databases. Med Biol Eng Comput 2024:10.1007/s11517-024-03143-7. [PMID: 38861056 DOI: 10.1007/s11517-024-03143-7] [Received: 11/14/2023] [Accepted: 05/27/2024] [Indexed: 06/12/2024]
Abstract
The use of invasive mechanical ventilation (IMV) is crucial in rescuing patients with respiratory dysfunction. Accurately predicting the demand for IMV is vital for clinical decision-making. However, current techniques are invasive and challenging to implement in pre-hospital and emergency rescue settings. To address this issue, this study developed a real-time method that forecasts IMV demand using only non-invasive parameters. The model introduced the concept of real-time warning and leveraged the advantages of machine learning and integrated methods, achieving an AUC value of 0.935 (95% CI 0.933-0.937). The AUC value for the multi-center validation using the AmsterdamUMCdb database was 0.727, surpassing the performance of traditional risk adjustment algorithms (OSI (oxygenation saturation index): 0.608; P/F (oxygenation index): 0.558). Feature weight analysis demonstrated that BMI, Gcsverbal, and age significantly contributed to the model's decision-making. These findings highlight the substantial potential of a machine learning real-time dynamic warning model that relies solely on non-invasive parameters to predict IMV demand. Such a model can provide technical support for predicting the need for IMV in pre-hospital and disaster scenarios.
Affiliation(s)
- Guang Zhang
- Systems Engineering Institute, People's Liberation Army, Academy of Military Sciences, Tianjin, 300161, China
- Qingyan Xie
- School of Life Sciences, Tiangong University, Tianjin, 300387, China
- Chengyi Wang
- School of Life Sciences, Tiangong University, Tianjin, 300387, China
- Jiameng Xu
- School of Life Sciences, Tiangong University, Tianjin, 300387, China
- Guanjun Liu
- Systems Engineering Institute, People's Liberation Army, Academy of Military Sciences, Tianjin, 300161, China
- Chen Su
- Systems Engineering Institute, People's Liberation Army, Academy of Military Sciences, Tianjin, 300161, China.
5
Hein K, Conkey-Morrison C, Burleigh TL, Poulus D, Stavropoulos V. Examining how gamers connect with their avatars to assess their anxiety: A novel artificial intelligence approach. Acta Psychol (Amst) 2024; 246:104298. [PMID: 38701623 DOI: 10.1016/j.actpsy.2024.104298] [Received: 01/29/2024] [Revised: 03/29/2024] [Accepted: 04/29/2024] [Indexed: 05/05/2024] Open
Abstract
Research has supported that a gamer's attachment to their avatar can offer significant insights about their mental health, including anxiety. To assess this hypothesis, longitudinal data from 565 adult and adolescent participants (mean age = 29.3 years, SD = 10.6) were analyzed at two points, six months apart. Respondents were assessed using the User-Avatar Bond (UAB) scale and the Depression Anxiety Stress Scale (DASS) to measure their connection with their avatar and their risk for anxiety. The records were processed using both untuned and tuned artificial intelligence (AI) classifiers to evaluate present and future anxiety. The findings indicated that AI models are capable of accurately and autonomously discerning cases of anxiety risk based on the gamers' self-reported UAB, age, and duration of gaming, both at present and after six months. Notably, random forest algorithms surpassed other AI models in effectiveness, with avatar compensation emerging as the most significant factor in model training for prospective anxiety. The implications for assessment, prevention, and clinical practice are discussed.
Affiliation(s)
- Kaiden Hein
- School of Health and Biomedical Sciences, RMIT University, Melbourne, Australia
- Connor Conkey-Morrison
- School of Health and Biomedical Sciences, RMIT University, Melbourne, Australia; College of Health and Biomedicine, Victoria University, Melbourne, Victoria, Australia
- Tyrone L Burleigh
- School of Health and Biomedical Sciences, RMIT University, Melbourne, Australia.
- Dylan Poulus
- Faculty of Health, Southern Cross University, Queensland, Australia
6
Çiftçi B, Tekin R. Prediction of viral families and hosts of single-stranded RNA viruses based on K-Mer coding from phylogenetic gene sequences. Comput Biol Chem 2024; 112:108114. [PMID: 38852362 DOI: 10.1016/j.compbiolchem.2024.108114] [Received: 12/30/2023] [Revised: 05/06/2024] [Accepted: 05/25/2024] [Indexed: 06/11/2024]
Abstract
There are billions of virus species worldwide, and viruses, the smallest parasitic entities, pose a serious threat. Therefore, fighting associated disorders requires an understanding of the genetic structure of viruses. Considering the wide diversity and rapid evolution of viruses, there is a critical need to quickly and accurately classify viral species and their potential hosts to better understand transmission dynamics, facilitating the development of targeted therapies. Recognizing this, this study investigated the classes of RNA viruses based on their genomic sequences using Machine Learning (ML) and Deep Learning (DL) models. The PhyVirus dataset, consisting of pathogenic single-stranded RNA viruses of Baltimore groups four (+ssRNA) and five (-ssRNA) with different hosts and species, was analyzed. The viral gene sequences in the dataset were encoded using the K-Mer coding technique, which splits each sequence into base words of various lengths. The study used classical ML algorithms (Random Forest, Gradient Boosting and Extra Trees) and the Fully Connected Deep Neural Network, a Deep Learning algorithm, to predict viral families and hosts. Detailed analyses were performed on the classifier performance in scenarios with different train-test ratios and different word lengths (k-values) for K-Mer. The observed results show that the Fully Connected Deep Neural Network has a high success rate of 99.60% in predicting virus families. In predicting virus hosts, the Extra Trees classifier achieved the highest success rate of 81.53%. This study is considered to be the first classification study in the literature on this dataset, which has a very large family and host diversity consisting of gene sequences of single-stranded RNA viruses. Our detailed investigations on how varying word lengths based on K-Mer coding in gene sequences affect the classification into viral families and hosts make this study particularly valuable. This study shows that ML and DL methods have the potential to produce valuable results in phylogenetic studies. In addition, the results and high-performance values show that these methods can be successfully used in regenerative applications of gene sequences or in studies such as the elimination of losses in gene sequences.
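To make the K-Mer coding step concrete, the sketch below slides a window of length k over a sequence, counts each overlapping word, and lays the counts out as a fixed-length vector that a classifier could consume. It is a generic illustration, not the authors' pipeline. The function names, the toy sequence, and the `ACGU` alphabet are assumptions (sequences stored as cDNA would use `ACGT` instead).

```python
from collections import Counter
from itertools import product
from typing import List


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping word of length k in the sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_vector(sequence: str, k: int, alphabet: str = "ACGU") -> List[int]:
    """Return counts for all len(alphabet)**k possible words in a fixed order."""
    counts = kmer_counts(sequence, k)
    vocabulary = ("".join(word) for word in product(alphabet, repeat=k))
    return [counts[word] for word in vocabulary]


toy_sequence = "AUGGCAUG"  # hypothetical fragment, for illustration only
three_mers = kmer_counts(toy_sequence, 3)
two_mer_vector = kmer_vector(toy_sequence, 2)

print(three_mers["AUG"])    # 2
print(len(two_mer_vector))  # 16
```

The vector length grows as 4^k, which is why the choice of word length trades off sequence detail against feature dimensionality, the trade-off the abstract reports exploring.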
Affiliation(s)
- Bahar Çiftçi
- Batman University, Institute of Graduate Studies, Department of Electrical and Electronic Engineering, Turkey; Siirt University, Distance Education Application and Research Center, Turkey.
- Ramazan Tekin
- Batman University, Faculty of Engineering and Architecture, Department of Computer Engineering, Turkey.
7
Shariati MM, Eslami S, Shoeibi N, Eslampoor A, Sedaghat M, Gharaei H, Zarei-Ghanavati S, Derakhshan A, Abrishami M, Abrishami M, Hosseini SM, Rad SS, Astaneh MA, Farimani RM. Development, comparison, and internal validation of prediction models to determine the visual prognosis of patients with open globe injuries using machine learning approaches. BMC Med Inform Decis Mak 2024; 24:131. [PMID: 38773484 PMCID: PMC11106970 DOI: 10.1186/s12911-024-02520-4] [Received: 12/06/2023] [Accepted: 04/24/2024] [Indexed: 05/23/2024] Open
Abstract
INTRODUCTION Open globe injuries (OGI) represent a major preventable cause of blindness and visual impairment, particularly in developing countries. The goal of this study is to evaluate key variables affecting the prognosis of open globe injuries and to internally validate and compare different machine learning models for estimating final visual acuity. MATERIALS AND METHODS We reviewed three hundred patients with open globe injuries receiving treatment at Khatam-Al-Anbia Hospital in Iran from 2020 to 2022. Age, sex, type of trauma, initial VA grade, relative afferent pupillary defect (RAPD), zone of trauma, traumatic cataract, traumatic optic neuropathy (TON), intraocular foreign body (IOFB), retinal detachment (RD), endophthalmitis, and ocular trauma score (OTS) grade were the input features. We calculated univariate and multivariate regression models to assess the association of different features with visual acuity (VA) outcomes. We predicted visual acuity using ten supervised machine learning algorithms including multinomial logistic regression (MLR), support vector machines (SVM), K-nearest neighbors (KNN), naïve Bayes (NB), decision tree (DT), random forest (RF), bagging (BG), adaptive boosting (ADA), artificial neural networks (ANN), and extreme gradient boosting (XGB). Accuracy, positive predictive value (PPV), recall, F-score, Brier score (BS), Matthews correlation coefficient (MCC), area under the receiver operating characteristic curve (AUC-ROC), and calibration plot were used to assess how well machine learning algorithms performed in predicting the final VA. RESULTS The artificial neural network (ANN) model had the best accuracy to predict the final VA. The sensitivity, F1 score, PPV, accuracy, and MCC of the ANN model were 0.81, 0.85, 0.89, 0.93, and 0.81, respectively. In addition, the estimated AUC-ROC and AUC-PRC of the ANN model for OGI patients were 0.96 and 0.91, respectively. The Brier score and calibration log-loss for the ANN model were 0.201 and 0.232, respectively. CONCLUSION When classic and ensemble ML models were compared, the results show that the ANN model was the best. As a result, the framework that has been presented may be regarded as a good substitute for predicting the final VA in OGI patients. Excellent predictive accuracy was shown by the open globe injury model developed in this study, which should help in providing clinical advice to patients and in making clinical decisions concerning the management of open globe injuries.
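Two of the less familiar metrics listed above, the Matthews correlation coefficient and the Brier score, have short closed forms. The sketch below shows the binary versions only, as an aid to reading the reported numbers. The study predicts VA grades and may well have used multiclass variants, and the confusion-matrix counts and probabilities here are hypothetical.

```python
from math import sqrt
from typing import Sequence


def matthews_corrcoef(tp: int, fp: int, fn: int, tn: int) -> float:
    """Binary MCC from confusion-matrix counts; ranges from -1 to +1."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical counts and predictions, for illustration only.
mcc = matthews_corrcoef(tp=40, fp=5, fn=10, tn=45)
brier = brier_score([0.9, 0.2, 0.6], [1, 0, 1])

print(round(mcc, 3))    # 0.704
print(round(brier, 3))  # 0.07
```

MCC uses all four cells of the confusion matrix, so it stays informative under class imbalance, while the Brier score rewards well-calibrated probabilities (lower is better).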
Affiliation(s)
- Saeid Eslami
- Department of Medical Informatics, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Nasser Shoeibi
- Eye Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Alireza Eslampoor
- Eye Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Hamid Gharaei
- Eye Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Akbar Derakhshan
- Eye Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Majid Abrishami
- Eye Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Mojtaba Abrishami
- Eye Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Saeed Shokuhi Rad
- Eye Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
8
Ravichandran A, Araque JC, Lawson JW. Predicting the functional state of protein kinases using interpretable graph neural networks from sequence and structural data. Proteins 2024; 92:623-636. [PMID: 38083830 DOI: 10.1002/prot.26641] [Received: 02/26/2023] [Revised: 10/13/2023] [Accepted: 11/09/2023] [Indexed: 04/13/2024]
Abstract
Protein kinases are central to cellular activities and are actively pursued as drug targets for several conditions including cancer and autoimmune diseases. Despite the availability of a large structural database for kinases, methodologies to elucidate the structure-function relationship of these proteins (without manual intervention) are lacking. Such techniques are essential in structural biology and to accelerate drug discovery efforts. Here, we implement an interpretable graph neural network (GNN) framework for classifying the functionally active and inactive states of a large set of protein kinases by only using their tertiary structure and amino acid sequence. We show that the GNN models can classify kinase structures with high accuracy (>97%). We implement the Gradient-weighted Class Activation Mapping for graphs (Graph Grad-CAM) to automatically identify structurally important residues and residue-residue contacts of the kinases without any a priori input. We show that the motifs identified through the Graph Grad-CAM methodology are functionally critical, consistent with the existing kinase literature. Notably, the highly conserved DFG and HRD motifs of the well-known hydrophobic spine are identified by the interpretable framework, in addition to some of the lesser-known motifs. Further, using Grad-CAM maps as the vector embedding of the protein structures, we identify the subtle differences in the crystal structures among different sub-classes of kinases in the Protein Data Bank (PDB). Frameworks such as the one implemented here for high-throughput identification of protein structure-function relationships are essential in designing targeted small-molecule therapies as well as in engineering new proteins for novel applications.
Affiliation(s)
- Ashwin Ravichandran
- KBR Inc., Intelligent Systems Division, NASA Ames Research Center, Moffett Field, California, USA
- Juan C Araque
- KBR Inc., Intelligent Systems Division, NASA Ames Research Center, Moffett Field, California, USA
- John W Lawson
- Intelligent Systems Division, NASA Ames Research Center, Moffett Field, California, USA
9
Ajmal A, Danial M, Zulfat M, Numan M, Zakir S, Hayat C, Alabbosh KF, Zaki MEA, Ali A, Wei D. In Silico Prediction of New Inhibitors for Kirsten Rat Sarcoma G12D Cancer Drug Target Using Machine Learning-Based Virtual Screening, Molecular Docking, and Molecular Dynamic Simulation Approaches. Pharmaceuticals (Basel) 2024; 17:551. [PMID: 38794122 PMCID: PMC11124053 DOI: 10.3390/ph17050551] [Received: 02/28/2024] [Revised: 03/24/2024] [Accepted: 03/27/2024] [Indexed: 05/26/2024] Open
Abstract
Single-point mutations in the Kirsten rat sarcoma (KRAS) viral proto-oncogene are the most common cause of human cancer. In humans, oncogenic KRAS mutations are responsible for about 30% of lung, pancreatic, and colon cancers. One of the predominant mutant KRAS G12D variants is responsible for pancreatic cancer and is an attractive drug target. At the time of writing, no Food and Drug Administration (FDA) approved drugs are available for the KRAS G12D mutant. So, there is a need to develop an effective drug for KRAS G12D. The process of finding new drugs is expensive and time-consuming. On the other hand, in silico drug designing methodologies are cost-effective and less time-consuming. Herein, we employed machine learning algorithms such as K-nearest neighbor (KNN), support vector machine (SVM), and random forest (RF) for the identification of new inhibitors against the KRAS G12D mutant. A total of 82 hits were predicted as active against the KRAS G12D mutant. The active hits were docked into the active site of the KRAS G12D mutant. Furthermore, to evaluate the stability of the compounds with a good docking score, the top two complexes and the standard complex (MRTX-1133) were subjected to 200 ns MD simulation. The top two hits revealed high stability as compared to the standard compound. The binding energy of the top two hits was good as compared to the standard compound. Our identified hits have the potential to inhibit the KRAS G12D mutation and can help combat cancer. To the best of our knowledge, this is the first study in which machine-learning-based virtual screening, molecular docking, and molecular dynamics simulation were carried out for the identification of new promising inhibitors for the KRAS G12D mutant.
Affiliation(s)
- Amar Ajmal
- Department of Biochemistry, Abdul Wali Khan University Mardan, Mardan 23200, Pakistan
- Muhammad Danial
- Department of Biochemistry, Abdul Wali Khan University Mardan, Mardan 23200, Pakistan
- Maryam Zulfat
- Department of Biochemistry, Abdul Wali Khan University Mardan, Mardan 23200, Pakistan
- Muhammad Numan
- Department of Biochemistry, Abdul Wali Khan University Mardan, Mardan 23200, Pakistan
- Sidra Zakir
- Department of Chemistry, Abdul Wali Khan University Mardan, Mardan 23200, Pakistan
- Chandni Hayat
- Department of Biochemistry, Abdul Wali Khan University Mardan, Mardan 23200, Pakistan
- Magdi E. A. Zaki
- Department of Chemistry, College of Science, Imam Mohammad Ibn Saud Islamic University, Riyadh 11623, Saudi Arabia
- Arif Ali
- Department of Bioinformatics and Biological Statistics, Shanghai Jiao Tong University, Shanghai 200240, China
- Dongqing Wei
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, China
- Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang 473006, China
- Henan Biological Industry Group, 41 Nongye East Rd., Jinshui, Zhengzhou 450008, China
- Peng Cheng National Laboratory, Vanke Cloud City Phase I Building 8, Xili Street, Nanshan District, Shenzhen 518055, China
10
Hannon FP, Green MJ, O'Grady L, Hudson C, Gouw A, Randall LV. Predictive modelling of deviation from expected milk yield in transition cows on automatic milking systems. Prev Vet Med 2024; 225:106160. [PMID: 38452602 DOI: 10.1016/j.prevetmed.2024.106160] [Received: 07/21/2023] [Revised: 01/25/2024] [Accepted: 02/19/2024] [Indexed: 03/09/2024]
Abstract
The transition period is a pivotal time in the production cycle of the dairy cow. It is estimated that between 30% and 50% of all cows experience metabolic or infectious disease during this time. One of the most common and economically consequential effects of disease during the transition period is a reduction in early lactation milk production. This has led to the utilisation of deviation from expected milk yield in early lactation as a proxy measure for transition health. However, to date, this analysis has been used exclusively for the retrospective assessment of transition cow health. Statistical models capable of predicting deviations from expected milk yield may allow producers to proactively manage animals predicted to suffer negative deviations in early lactation milk production. The objective of this retrospective cohort study was, first, to explore the accuracy with which cow-level production and behaviour data collected on automatic milking systems (AMS) from 1-3 days in milk (DIM) can predict deviation from expected 30-day cumulative milk yield in multiparous cows and, second, to assess the accuracy with which predicted yield deviations can classify cows into groups which may facilitate improved transition management. Production, rumination, and physical activity data from 31 commercial AMS were accessed. A 3-step analytical procedure was then conducted. In Step 1, expected cumulative yield for 1-30 DIM for each individual cow-lactation was calculated using a mixed effect linear model. In Step 2, 30-Day Yield Deviation (YD) was calculated as the difference between observed and expected cumulative yield. Lactations were then assigned to one of three groups based on their YD: RED Group (≤ -15% YD), AMBER Group (-14% to 0% YD), GREEN Group (>0% YD). In Step 3, yield, rumination, and physical activity data from days 1-3 in lactation were used to predict YD using machine learning models. Following external validation, YD was predicted across the test data set with a mean absolute error of 9%. Categorisation of animals suffering large negative deviations (RED group) was achieved with a specificity of 99%, sensitivity of 35%, and balanced accuracy of 67%. Our results suggest that milk yield, rumination and physical activity patterns expressed by dairy cows from 1-3 DIM have utility in the prediction of deviation from expected 30-day cumulative yield. However, these predictions currently lack the sensitivity required to classify cows reliably and completely into groups which may facilitate improved transition cow management.
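Step 2 and the reported balanced accuracy can be sketched in a few lines. The code below is an illustration under explicit assumptions, not the authors' implementation: YD is assumed to be expressed as a percentage of expected yield, the RED group is assumed to mean a deviation at or below -15%, values between -15% and 0% are all treated as AMBER, and the function names and yields are hypothetical.

```python
def yield_deviation_pct(observed: float, expected: float) -> float:
    """Deviation of observed from expected cumulative yield, as % of expected."""
    return 100.0 * (observed - expected) / expected


def yd_group(yd_pct: float) -> str:
    """Assign a lactation to RED, AMBER, or GREEN (assumed cut-offs, see lead-in)."""
    if yd_pct <= -15.0:
        return "RED"
    if yd_pct <= 0.0:
        return "AMBER"
    return "GREEN"


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


# Hypothetical 30-day cumulative yields in kg (observed, expected).
groups = [yd_group(yield_deviation_pct(obs, 1000.0)) for obs in (850.0, 990.0, 1050.0)]
print(groups)  # ['RED', 'AMBER', 'GREEN']

# Sensitivity and specificity for the RED group as reported in the abstract.
print(round(balanced_accuracy(0.35, 0.99), 2))  # 0.67
```

The second check reproduces the reported 67% balanced accuracy and shows why that figure can look respectable even though only about a third of RED-group cows are detected.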
Affiliation(s)
- Fergus P Hannon
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, Leicestershire LE12 5RD, United Kingdom.
- Martin J Green
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, Leicestershire LE12 5RD, United Kingdom
- Luke O'Grady
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, Leicestershire LE12 5RD, United Kingdom; School of Veterinary Medicine, University College, Belfield, Dublin 4, Ireland
- Chris Hudson
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, Leicestershire LE12 5RD, United Kingdom
- Anneke Gouw
- Lely International N.V., Cornelis van der Lelylaan 1, Maassluis 3147 PB, the Netherlands
- Laura V Randall
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, Leicestershire LE12 5RD, United Kingdom
11
Murmu S, Sinha D, Chaurasia H, Sharma S, Das R, Jha GK, Archak S. A review of artificial intelligence-assisted omics techniques in plant defense: current trends and future directions. Front Plant Sci 2024; 15:1292054. [PMID: 38504888 PMCID: PMC10948452 DOI: 10.3389/fpls.2024.1292054] [Received: 09/10/2023] [Accepted: 01/24/2024] [Indexed: 03/21/2024]
Abstract
Plants intricately deploy defense systems to counter diverse biotic and abiotic stresses. Omics technologies, spanning genomics, transcriptomics, proteomics, and metabolomics, have revolutionized the exploration of plant defense mechanisms, unraveling molecular intricacies in response to various stressors. However, the complexity and scale of omics data necessitate sophisticated analytical tools for meaningful insights. This review delves into the application of artificial intelligence algorithms, particularly machine learning and deep learning, as promising approaches for deciphering complex omics data in plant defense research. The overview encompasses key omics techniques and addresses the challenges and limitations inherent in current AI-assisted omics approaches. Moreover, it contemplates potential future directions in this dynamic field. In summary, AI-assisted omics techniques present a robust toolkit, enabling a profound understanding of the molecular foundations of plant defense and paving the way for more effective crop protection strategies amidst climate change and emerging diseases.
Affiliation(s)
- Sneha Murmu
- Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
| | - Dipro Sinha
- Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
| | - Himanshushekhar Chaurasia
- Central Institute for Research on Cotton Technology, Indian Council of Agricultural Research (ICAR), Mumbai, India
| | - Soumya Sharma
- Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
| | - Ritwika Das
- Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
| | - Girish Kumar Jha
- Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
| | - Sunil Archak
- National Bureau of Plant Genetic Resources, Indian Council of Agricultural Research (ICAR), New Delhi, India
|
12
|
Yaschenko AE, Alonso JM, Stepanova AN. Arabidopsis as a model for translational research. The Plant Cell 2024:koae065. [PMID: 38411602 DOI: 10.1093/plcell/koae065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Revised: 01/26/2024] [Accepted: 01/26/2024] [Indexed: 02/28/2024]
Abstract
Arabidopsis thaliana is currently the most-studied plant species on earth, with an unprecedented number of genetic, genomic, and molecular resources having been generated in this plant model. In the era of translating foundational discoveries to crops and beyond, we aimed to highlight the utility and challenges of using Arabidopsis as a reference for applied plant biology research, agricultural innovation, biotechnology, and medicine. We hope that this review will inspire the next generation of plant biologists to continue leveraging Arabidopsis as a robust and convenient experimental system to address fundamental and applied questions in biology. We aim to encourage lab and field scientists alike to take advantage of the vast Arabidopsis datasets, annotations, germplasm, constructs, methods, molecular and computational tools in our pursuit to advance understanding of plant biology and help feed the world's growing population. We envision that the power of Arabidopsis-inspired biotechnologies and foundational discoveries will continue to fuel the development of resilient, high-yielding, nutritious plants for the betterment of plant and animal health and greater environmental sustainability.
Affiliation(s)
- Anna E Yaschenko
- Department of Plant and Microbial Biology, Genetics and Genomics Academy, North Carolina State University, Raleigh, NC 27695, USA
| | - Jose M Alonso
- Department of Plant and Microbial Biology, Genetics and Genomics Academy, North Carolina State University, Raleigh, NC 27695, USA
| | - Anna N Stepanova
- Department of Plant and Microbial Biology, Genetics and Genomics Academy, North Carolina State University, Raleigh, NC 27695, USA
|
13
|
Rahmatinejad Z, Dehghani T, Hoseini B, Rahmatinejad F, Lotfata A, Reihani H, Eslami S. A comparative study of explainable ensemble learning and logistic regression for predicting in-hospital mortality in the emergency department. Sci Rep 2024; 14:3406. [PMID: 38337000 PMCID: PMC10858239 DOI: 10.1038/s41598-024-54038-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 02/07/2024] [Indexed: 02/12/2024] Open
Abstract
This study addresses the challenges associated with emergency department (ED) overcrowding and emphasizes the need for efficient risk stratification tools to identify high-risk patients for early intervention. While several scoring systems, often based on logistic regression (LR) models, have been proposed to indicate patient illness severity, this study aims to compare the predictive performance of ensemble learning (EL) models with LR for in-hospital mortality in the ED. A cross-sectional single-center study was conducted at the ED of Imam Reza Hospital in northeast Iran from March 2016 to March 2017. The study included adult patients with emergency severity index levels one to three. EL models using Bagging, AdaBoost, random forests (RF), Stacking and extreme gradient boosting (XGB) algorithms, along with an LR model, were constructed. The training and validation visits from the ED were randomly divided into 80% and 20%, respectively. After training the proposed models using tenfold cross-validation, their predictive performance was evaluated. Model performance was compared using the Brier score (BS), the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUCPR), the Hosmer-Lemeshow (H-L) goodness-of-fit test, precision, sensitivity, accuracy, F1-score, and Matthews correlation coefficient (MCC). The study included 2025 unique patients admitted to the hospital's ED, with a total percentage of hospital deaths at approximately 19%. In the training group and the validation group, 274 of 1476 (18.6%) and 152 of 728 (20.8%) patients died during hospitalization, respectively. According to the evaluation of the presented framework, EL models, particularly Bagging, predicted in-hospital mortality with the highest AUROC (0.839, CI (0.802-0.875)) and AUCPR = 0.64, comparable in terms of discrimination power with LR (AUROC (0.826, CI (0.787-0.864)) and AUCPR = 0.61). XGB achieved the highest precision (0.83), sensitivity (0.831), accuracy (0.842), F1-score (0.833), and the highest MCC (0.48). Additionally, in the unbalanced dataset, RF was the most accurate model, with the lowest BS (0.128). Although all studied models overestimate mortality risk and have insufficient calibration (P > 0.05), stacking demonstrated relatively good agreement between predicted and actual mortality. EL models are not superior to LR in predicting in-hospital mortality in the ED. Both EL and LR models can be considered as screening tools to identify patients at risk of mortality.
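Two of the metrics named in this abstract, the Brier score and the Matthews correlation coefficient, are easy to compute by hand. The sketch below uses a made-up five-patient toy set and a 0.5 decision threshold, both of which are assumptions for illustration and not data or settings from the study:

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (tp, fp, tn, fn) after thresholding the predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy data: 1 = died in hospital, 0 = survived.
y_true = [1, 0, 0, 1, 0]
y_prob = [0.8, 0.2, 0.6, 0.4, 0.1]

print(f"Brier score = {brier_score(y_true, y_prob):.3f}")
print(f"MCC = {mcc(*confusion_counts(y_true, y_prob)):.3f}")
```

A lower Brier score indicates better-calibrated probabilities, whereas MCC ranges from -1 to 1 and stays informative when the classes are unbalanced, as they are here with roughly 19% mortality.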
Affiliation(s)
- Zahra Rahmatinejad
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Toktam Dehghani
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Toos Institute of Higher Education, Mashhad, Iran
| | - Benyamin Hoseini
- Pharmaceutical Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Fatemeh Rahmatinejad
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Aynaz Lotfata
- Department of Pathology, Microbiology, and Immunology, School of Veterinary Medicine, University of California, Davis, CA, USA
| | - Hamidreza Reihani
- Department of Emergency Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
| | - Saeid Eslami
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
- Pharmaceutical Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran.
- Department of Medical Informatics, Amsterdam UMC - Location AMC, University of Amsterdam, Amsterdam, The Netherlands.
|
14
|
Brown T, Burleigh TL, Schivinski B, Bennett S, Gorman-Alesi A, Blinka L, Stavropoulos V. Translating the user-avatar bond into depression risk: A preliminary machine learning study. J Psychiatr Res 2024; 170:328-339. [PMID: 38194850 DOI: 10.1016/j.jpsychires.2023.12.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 12/21/2023] [Accepted: 12/27/2023] [Indexed: 01/11/2024]
Abstract
Research has shown a link between depression risk and how gamers form relationships with their in-game figure of representation, called an avatar. This is reinforced by literature supporting that a gamer's connection to their avatar may provide broader insight into their mental health. Therefore, it has been argued that if properly examined, the bond between a person and their avatar may reveal information about their current or potential struggles with depression offline. To examine whether individuals' connection with their avatars may reveal their risk for depression, longitudinal data from 565 adults/adolescents (Mage = 29.3 years, SD = 10.6) were evaluated twice (six months apart). Participants completed the User-Avatar-Bond [UAB] scale and Depression Anxiety Stress Scale to measure avatar bond and depression risk. A series of tuned and untuned artificial intelligence [AI] classifiers analyzed their responses concurrently and prospectively. This allowed the examination of whether user-avatar bond can provide cross-sectional and predictive information about depression risk. Findings revealed that AI models can learn to accurately and automatically identify depression risk cases, based on gamers' reported UAB, age, and length of gaming involvement, both at present and six months later. In particular, random forests outperformed all other AIs, while avatar immersion was shown to be the strongest training predictor. Study outcomes demonstrate that UAB can be translated into accurate, concurrent, and future depression risk predictions via trained AI classifiers. Assessment, prevention, and practice implications are discussed in the light of these results.
Affiliation(s)
- Taylor Brown
- Applied Health, School of Health and Biomedical Sciences, RMIT University, Australia.
| | - Tyrone L Burleigh
- Centre of Excellence in Responsible Gaming, University of Gibraltar, Gibraltar.
| | | | | | - Angela Gorman-Alesi
- School Counselling Unit, Child & Family Counsellor, Catholic Care Victoria, Australia.
| | - Lukas Blinka
- Faculty of Social Studies, Masaryk University, Czech Republic.
|
15
|
Ailawadhi S, Romanus D, Shah S, Fraeman K, Saragoussi D, Buus RM, Nguyen B, Cherepanov D, Lamerato L, Berger A. Development and validation of algorithms for identifying lines of therapy in multiple myeloma using real-world data. Future Oncol 2024. [PMID: 38231002 DOI: 10.2217/fon-2023-0696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2024] Open
Abstract
Aim: To validate algorithms based on electronic health data to identify composition of lines of therapy (LOT) in multiple myeloma (MM). Materials & methods: This study used available electronic health data for selected adults within Henry Ford Health (Michigan, USA) newly diagnosed with MM in 2006-2017. Algorithm performance in this population was verified via chart review. As with prior oncology studies, good performance was defined as positive predictive value (PPV) ≥75%. Results: Accuracy for identifying LOT1 (N = 133) was 85.0%. For the most frequent regimens, accuracy was 92.5-97.7%, PPV 80.6-93.8%, sensitivity 88.2-89.3% and specificity 94.3-99.1%. Algorithm performance decreased in subsequent LOTs, with decreasing sample sizes. Only 19.5% of patients received maintenance therapy during LOT1. Accuracy for identifying maintenance therapy was 85.7%; PPV for the most common maintenance therapy was 73.3%. Conclusion: Algorithms performed well in identifying LOT1 - especially more commonly used regimens - and slightly less well in identifying maintenance therapy therein.
Affiliation(s)
- Sikander Ailawadhi
- Division of Hematology/Oncology, Department of Medicine, Mayo Clinic, Jacksonville, FL 32224, USA
| | - Dorothy Romanus
- Global Evidence & Outcomes, Takeda Development Center Americas, Inc. (TDCA), Lexington, MA 02421, USA
| | - Surbhi Shah
- Real-World Evidence, Evidera/PPD (part of Thermo Fisher Scientific), Bethesda, MD 20814, USA
| | - Kathy Fraeman
- Real-World Evidence, Evidera/PPD (part of Thermo Fisher Scientific), Bethesda, MD 20814, USA
| | - Delphine Saragoussi
- Real-World Evidence, Evidera/PPD (part of Thermo Fisher Scientific), London, W6 8BJ, UK
| | - Rebecca Morris Buus
- Epidemiology and Scientific Affairs, Clinical Development Services Division, Evidera/PPD (part of Thermo Fisher Scientific), Bethesda, MD 20814, USA
| | - Binh Nguyen
- Medical Science and Strategy, Oncology, PPD (part of Thermo Fisher Scientific), Bethesda, MD 20814, USA
| | - Dasha Cherepanov
- Global Evidence & Outcomes, Takeda Development Center Americas, Inc. (TDCA), Lexington, MA 02421, USA
| | | | - Ariel Berger
- Real-World Evidence, Evidera/PPD (part of Thermo Fisher Scientific), Bethesda, MD 20814, USA
|
16
|
Chmielowski L, Kucharzak M, Burduk R. Novel method of building train and test sets for evaluation of machine learning models related to software bugs assignment. Sci Rep 2023; 13:21512. [PMID: 38057324 DOI: 10.1038/s41598-023-48617-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Accepted: 11/28/2023] [Indexed: 12/08/2023] Open
Abstract
Many tools are now used to handle bug reports, feature requests, support questions, and similar issues that arise during software development or maintenance, and some of them use machine learning techniques. The introduction reviews the fundamental methods used to evaluate machine learning models. This paper points out weaknesses of the metrics currently used for evaluation in the specific context of software development, especially bug reports. The shortcomings of the state of the art stem from disregarding time dependencies, which should be taken into account when creating train and test sets because they may affect the results. An extensive review of the literature found no article that uses time dependencies to evaluate machine learning models in software development applications, such as machine learning solutions that support bug tracking systems. This paper introduces a novel solution that is free of these drawbacks. Experimental research showed the effectiveness of the introduced method and yielded results significantly different from those obtained with state-of-the-art methods.
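The abstract does not describe the authors' procedure, so the following is only a generic sketch of what respecting time dependencies can look like when building train and test sets: every training report is older than every test report. The record fields, team names and cutoff date are hypothetical.

```python
from datetime import date


def temporal_split(records, cutoff):
    """Train on reports created before `cutoff`; test on those created on or after it."""
    ordered = sorted(records, key=lambda r: r["created"])
    train = [r for r in ordered if r["created"] < cutoff]
    test = [r for r in ordered if r["created"] >= cutoff]
    return train, test


# Hypothetical bug reports; the order in the list is deliberately not chronological.
reports = [
    {"id": 3, "created": date(2022, 6, 20), "assignee": "team-b"},
    {"id": 1, "created": date(2022, 1, 10), "assignee": "team-a"},
    {"id": 4, "created": date(2022, 9, 1), "assignee": "team-a"},
    {"id": 2, "created": date(2022, 3, 5), "assignee": "team-b"},
]

train, test = temporal_split(reports, cutoff=date(2022, 6, 1))
print([r["id"] for r in train], [r["id"] for r in test])
```

Unlike a random split, this arrangement never lets the model see reports filed after the ones it is evaluated on, which is the kind of leakage a time-aware evaluation is meant to avoid.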
Affiliation(s)
- Lukasz Chmielowski
- Nokia Solutions and Networks sp. z o.o., 02-685, Warsaw, Poland.
- Wroclaw University of Science and Technology, 50-370, Wroclaw, Poland.
| | - Michal Kucharzak
- Nokia Solutions and Networks sp. z o.o., 02-685, Warsaw, Poland
- Wroclaw University of Science and Technology, 50-370, Wroclaw, Poland
| | - Robert Burduk
- Wroclaw University of Science and Technology, 50-370, Wroclaw, Poland
|
17
|
Blampey Q, Bercovici N, Dutertre CA, Pic I, Ribeiro JM, André F, Cournède PH. A biology-driven deep generative model for cell-type annotation in cytometry. Brief Bioinform 2023; 24:bbad260. [PMID: 37497716 DOI: 10.1093/bib/bbad260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 06/20/2023] [Accepted: 06/27/2023] [Indexed: 07/28/2023] Open
Abstract
Cytometry enables precise single-cell phenotyping within heterogeneous populations. These cell types are traditionally annotated via manual gating, but this method lacks reproducibility and sensitivity to batch effect. Also, the most recent cytometers (spectral flow or mass cytometers) create rich and high-dimensional data whose analysis via manual gating becomes challenging and time-consuming. To tackle these limitations, we introduce Scyan (https://github.com/MICS-Lab/scyan), a Single-cell Cytometry Annotation Network that automatically annotates cell types using only prior expert knowledge about the cytometry panel. For this, it uses a normalizing flow (a type of deep generative model) that maps protein expressions into a biologically relevant latent space. We demonstrate that Scyan significantly outperforms the related state-of-the-art models on multiple public datasets while being faster and interpretable. In addition, Scyan overcomes several complementary tasks, such as batch-effect correction, debarcoding and population discovery. Overall, this model accelerates and eases cell population characterization, quantification and discovery in cytometry.
Affiliation(s)
- Quentin Blampey
- Université Paris-Saclay, CentraleSupélec, Laboratory of Mathematics and Computer Science (MICS), 3 rue Joliot Curie, 91190, Gif-sur-Yvette, France
| | - Nadège Bercovici
- Université Paris-Saclay, Gustave Roussy, Inserm U981, 114 Rue Edouard Vaillant, 94805, Villejuif, France
- Université Paris Cité, Institut Cochin, CNRS, Inserm, 22 Rue Méchain, 75014, Paris, France
| | - Charles-Antoine Dutertre
- Université Paris-Saclay, Gustave Roussy, Inserm U1015, 114 Rue Edouard Vaillant, 94805, Villejuif, France
| | - Isabelle Pic
- Université Paris-Saclay, Gustave Roussy, Inserm U981, 114 Rue Edouard Vaillant, 94805, Villejuif, France
| | - Joana Mourato Ribeiro
- Université Paris-Saclay, Gustave Roussy, Inserm U981, 114 Rue Edouard Vaillant, 94805, Villejuif, France
- Gustave Roussy, Département de Médecine Oncologique, 114 Rue Edouard Vaillant, 94805, Villejuif, France
| | - Fabrice André
- Université Paris-Saclay, Gustave Roussy, Inserm U981, 114 Rue Edouard Vaillant, 94805, Villejuif, France
- Gustave Roussy, Département de Médecine Oncologique, 114 Rue Edouard Vaillant, 94805, Villejuif, France
| | - Paul-Henry Cournède
- Université Paris-Saclay, CentraleSupélec, Laboratory of Mathematics and Computer Science (MICS), 3 rue Joliot Curie, 91190, Gif-sur-Yvette, France
|
18
|
Ju H, Bai J, Jiang J, Che Y, Chen X. Comparative evaluation and analysis of DNA N4-methylcytosine methylation sites using deep learning. Front Genet 2023; 14:1254827. [PMID: 37671040 PMCID: PMC10476523 DOI: 10.3389/fgene.2023.1254827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 07/31/2023] [Indexed: 09/07/2023] Open
Abstract
DNA N4-methylcytosine (4mC) is significantly involved in biological processes, such as DNA expression, repair, and replication. Therefore, accurate prediction methods are urgently needed. Deep learning methods have transformed applications that previously required sequencing expertise into engineering challenges that do not require expertise to solve. Here, we compare a variety of state-of-the-art deep learning models on six benchmark datasets to evaluate their performance in 4mC methylation site detection. We visualize the statistical analysis of the datasets and the performance of different deep-learning models. We conclude that deep learning can greatly expand the potential of methylation site prediction.
Affiliation(s)
- Hong Ju
- Heilongjiang Agricultural Engineering Vocational College, Harbin, China
| | - Jie Bai
- Engineering Research Center of Integration and Application of Digital Learning Technology, Ministry of Education, Hangzhou, China
| | - Jing Jiang
- Beidahuang Industry Group General Hospital, Harbin, China
| | - Yusheng Che
- Heilongjiang Agricultural Engineering Vocational College, Harbin, China
| | - Xin Chen
- Department of Neurosurgical Laboratory, The First Affiliated Hospital of Harbin Medical University, Harbin, China
|
19
|
Zou K, Wang S, Wang Z, Zhang Z, Yang F. HAR_Locator: a novel protein subcellular location prediction model of immunohistochemistry images based on hybrid attention modules and residual units. Front Mol Biosci 2023; 10:1171429. [PMID: 37664182 PMCID: PMC10470064 DOI: 10.3389/fmolb.2023.1171429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 08/04/2023] [Indexed: 09/05/2023] Open
Abstract
Introduction: Proteins located in subcellular compartments have played an indispensable role in the physiological function of eukaryotic organisms. The pattern of protein subcellular localization is conducive to understanding the mechanism and function of proteins, contributing to investigating pathological changes of cells, and providing technical support for targeted drug research on human diseases. Automated systems based on featurization or representation learning and classifier design have attracted interest in predicting the subcellular location of proteins due to a considerable rise in proteins. However, large-scale, fine-grained protein microscopic images are prone to trapping and losing feature information in the general deep learning models, and the shallow features derived from statistical methods have weak supervision abilities. Methods: In this work, a novel model called HAR_Locator was developed to predict the subcellular location of proteins by concatenating multi-view abstract features and shallow features, whose advanced advantages are summarized in the following three protocols. Firstly, to get discriminative abstract feature information on protein subcellular location, an abstract feature extractor called HARnet based on Hybrid Attention modules and Residual units was proposed to relieve gradient dispersion and focus on protein-target regions. Secondly, it not only improves the supervision ability of image information but also enhances the generalization ability of the HAR_Locator through concatenating abstract features and shallow features. Finally, a multi-category multi-classifier decision system based on an Artificial Neural Network (ANN) was introduced to obtain the final output results of samples by fitting the most representative result from five subset predictors. 
Results: To evaluate the model, a collection of 6,778 immunohistochemistry (IHC) images from the Human Protein Atlas (HPA) database was used to present experimental results, and the accuracy, precision, and recall evaluation indicators were significantly increased to 84.73%, 84.77%, and 84.70%, respectively, compared with baseline predictors.
Affiliation(s)
- Kai Zou
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
| | - Simeng Wang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
| | - Ziqian Wang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
| | - Zhihai Zhang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
| | - Fan Yang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
- Artificial Intelligence and Bioinformation Cognition Laboratory, Jiangxi Science and Technology Normal University, Nanchang, China
|
20
|
Mastour H, Dehghani T, Moradi E, Eslami S. Early prediction of medical students' performance in high-stakes examinations using machine learning approaches. Heliyon 2023; 9:e18248. [PMID: 37519702 PMCID: PMC10372649 DOI: 10.1016/j.heliyon.2023.e18248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 06/03/2023] [Accepted: 07/12/2023] [Indexed: 08/01/2023] Open
Abstract
Introduction: Since the advent of medical education systems, managing high-stakes exams has been a top priority and challenge for all policymakers. However, considering machine learning (ML) techniques as a replacement for medical licensing examinations, particularly during crises such as the COVID-19 outbreak, could be an effective solution. This study uses ML models to develop a framework for predicting medical students' performance on high-stakes exams, such as the Comprehensive Medical Basic Sciences Examination (CMBSE). Material and methods: Prediction of students' status and score on high-stakes examinations faces several challenges, including an imbalanced number of failing and passing students, a large number of heterogeneous and complex features, and the need to identify at-risk and top-performing students. In this study, two major categories of ML approaches are compared: first, classic models (logistic regression (LR), support vector machine (SVM), and k-nearest neighbors (KNN)), and second, ensemble models (voting, bagging (BG), random forests (RF), adaptive boosting (ADA), extreme gradient boosting (XGB), and stacking). Results: To evaluate the models' discrimination ability, they are assessed using a real dataset containing information on medical students over a five-year period (n = 1005). The findings indicate that ensemble ML models demonstrate optimal performance in predicting CMBSE status (RF and stacking). Similarly, among the classic regressors, LR exhibited the highest root-mean-square deviation (RMSD) (0.134) and coefficient of determination (R2) (0.62), whereas the RF model had the highest RMSD (0.077) and R2 (0.80) overall. Furthermore, Anatomical Sciences, Biochemistry, Parasitology, and Entomology grade point average (GPA) and grades demonstrated the strongest positive correlation with the outcomes. Conclusion: Comparing classic and ensemble ML models revealed that ensemble models are superior to classic models. Therefore, the presented framework could be considered a suitable alternative for the CMBSE and other comparable medical licensing examinations.
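For readers comparing the regression figures in this abstract, RMSD is an error measure (smaller values mean predictions sit closer to the observed scores), while R2 is the share of variance explained (larger is better). A minimal sketch of both formulas on made-up normalised scores, which are assumptions for illustration and not data from the study:

```python
import math


def rmsd(y_true, y_pred):
    """Root-mean-square deviation between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical normalised exam scores and model predictions.
y_true = [0.5, 0.6, 0.7, 0.8]
y_pred = [0.55, 0.6, 0.65, 0.85]

print(f"RMSD = {rmsd(y_true, y_pred):.4f}")
print(f"R2   = {r_squared(y_true, y_pred):.2f}")
```

On this reading, the RF figures quoted in the abstract (RMSD 0.077, R2 0.80) describe a closer fit than the LR figures (RMSD 0.134, R2 0.62).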
Affiliation(s)
- Haniye Mastour
- Department of Medical Education, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Toktam Dehghani
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Ehsan Moradi
- Mashhad University of Medical Sciences, Mashhad, Iran
| | - Saeid Eslami
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Pharmaceutical Sciences Research Center, Institute of Pharmaceutical Technology, Mashhad University of Medical Sciences, Mashhad, Iran
|
21
|
Greenberg ZF, Graim KS, He M. Towards artificial intelligence-enabled extracellular vesicle precision drug delivery. Adv Drug Deliv Rev 2023:114974. [PMID: 37356623 DOI: 10.1016/j.addr.2023.114974] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/06/2023] [Revised: 06/21/2023] [Accepted: 06/22/2023] [Indexed: 06/27/2023]
Abstract
Extracellular Vesicles (EVs), particularly exosomes, recently exploded into nanomedicine as an emerging drug delivery approach due to their superior biocompatibility, circulating stability, and bioavailability in vivo. However, EV heterogeneity makes molecular targeting precision a critical challenge. Deciphering key molecular drivers for controlling EV tissue targeting specificity is in great need. Artificial intelligence (AI) brings powerful prediction ability for guiding the rational design of engineered EVs in precision control for drug delivery. This review focuses on cutting-edge nano-delivery via integrating large-scale EV data with AI to develop AI-directed EV therapies and illuminate the clinical translation potential. We briefly review the current status of EVs in drug delivery, including the current frontier, limitations, and considerations to advance the field. Subsequently, we detail the future of AI in drug delivery and its impact on precision EV delivery. Our review discusses the current universal challenge of standardization and critical considerations when using AI combined with EVs for precision drug delivery. Finally, we will conclude this review with a perspective on future clinical translation led by a combined effort of AI and EV research.
Affiliation(s)
- Zachary F Greenberg
- Department of Pharmaceutics, College of Pharmacy, University of Florida, Gainesville, Florida, 32610, USA
| | - Kiley S Graim
- Department of Computer & Information Science & Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, 32610, USA
| | - Mei He
- Department of Pharmaceutics, College of Pharmacy, University of Florida, Gainesville, Florida, 32610, USA.
|
22
|
Shaw A, Newman P, Witchalls J, Hedger T. Externally validated machine learning algorithm accurately predicts medial tibial stress syndrome in military trainees: a multicohort study. BMJ Open Sport Exerc Med 2023; 9:e001566. [PMID: 37497020 PMCID: PMC10367080 DOI: 10.1136/bmjsem-2023-001566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/17/2023] [Indexed: 07/28/2023] Open
Abstract
Objectives: Medial tibial stress syndrome (MTSS) is a common musculoskeletal injury in both sporting and military settings. No reliable treatments exist, and reoccurrence rates are high. Prevention of MTSS is critical to reducing operational burden. Therefore, this study aimed to build a decision-making model to predict the individual risk of MTSS within officer cadets and test the external validity of the model on a separate military population. Design: Prospective cohort study. Methods: This study collected a suite of key variables previously established for predicting MTSS. Data were obtained from 107 cadets (34 women and 73 men). A follow-up survey was conducted at 3 months to determine MTSS diagnoses. Six ensemble learning algorithms were deployed and trained five times on random stratified samples of 75% of the dataset. The resultant algorithms were tested on the remaining 25% of the dataset, with models then compared for accuracy. The most accurate new algorithm was tested on an unrelated data sample of 123 Australian Navy recruits to establish external validity of the model. Results: Calibrated random forest modelling was the most accurate in identifying a diagnosis of MTSS (area under curve (AUC)=98%, classification accuracy (CA)=96%). External validation on a sample of Navy recruits resulted in comparable accuracy (AUC=95%, CA=94%). When the model was tested on the combined datasets, similar accuracy was achieved (AUC=92%, CA=91%). Conclusion: This model is highly accurate in predicting those who will develop MTSS. The model provides important preventive capacity which should be trialled as a risk management intervention.
Affiliation(s)
- Angus Shaw
- Faculty of Health (Physiotherapy), University of Canberra, Canberra, Australian Capital Territory, Australia
- Physiotherapy, Matrix Physiotherapy & Sports Clinic, Queanbeyan, New South Wales, Australia
- Phil Newman
- Faculty of Health (Physiotherapy), Research Institute for Sport and Exercise (UCRISE), University of Canberra, Canberra, Australian Capital Territory, Australia
- Jeremy Witchalls
- Faculty of Health (Physiotherapy), Research Institute for Sport and Exercise (UCRISE), University of Canberra, Canberra, Australian Capital Territory, Australia
- Tristan Hedger
- Physiotherapy, Australian Defence Force Academy, Campbell, Australian Capital Territory, Australia
|
23
|
Nayak T, Chadaga K, Sampathila N, Mayrose H, Gokulkrishnan N, Bairy G M, Prabhu S, S SK, Umakanth S. Deep learning based detection of monkeypox virus using skin lesion images. Medicine in Novel Technology and Devices 2023; 18:100243. [PMID: 37293134 PMCID: PMC10236906 DOI: 10.1016/j.medntd.2023.100243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Revised: 05/25/2023] [Accepted: 05/26/2023] [Indexed: 06/10/2023]
Abstract
As we enter the second half of 2022, the world is still recovering from the two-year COVID-19 pandemic. However, over the past three months, the outbreak of the Monkeypox Virus (MPV) has led to fifty-two thousand confirmed cases and over one hundred deaths. This caused the World Health Organisation to declare the outbreak a Public Health Emergency of International Concern (PHEIC). If this outbreak worsens, we could be looking at the Monkeypox virus causing the next global pandemic. As Monkeypox affects the human skin, the symptoms can be captured with regular imaging. Large samples of these images can be used as a training dataset for machine learning-based detection tools. Using a regular camera to capture the skin image of the infected person and running it against computer vision models is beneficial. In this research, we use deep learning to diagnose monkeypox from skin lesion images. Using a publicly available dataset, we evaluated five pre-trained deep neural networks: GoogLeNet, Places365-GoogLeNet, SqueezeNet, AlexNet and ResNet-18. Hyperparameter tuning was done to choose the best parameters. Performance metrics such as accuracy, precision, recall, F1-score and AUC were considered. Among the above models, ResNet-18 obtained the highest accuracy of 99.49%. The modified models obtained validation accuracies above 95%. The results prove that deep learning models such as the proposed model based on ResNet-18 can be deployed and can be crucial in battling the monkeypox virus. Since the networks used are optimized for efficiency, they can be used on performance-limited devices such as smartphones with cameras. The addition of the explainable artificial intelligence techniques LIME and GradCAM enables visual interpretation of the predictions made, helping health professionals using the model.
Affiliation(s)
- Tushar Nayak
- Department of Biomedical Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
- Krishnaraj Chadaga
- Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
- Niranjana Sampathila
- Department of Biomedical Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
- Hilda Mayrose
- Department of Biomedical Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
- Nitila Gokulkrishnan
- Department of Biomedical Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
- Muralidhar Bairy G
- Department of Biomedical Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
- Srikanth Prabhu
- Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
- Swathi K S
- Prasanna School of Public Health, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
- Shashikiran Umakanth
- Department of Medicine, Dr. T.M.A. Pai Hospital, Manipal Academy of Higher Education, Udupi, Karnataka, 576101, India
|
24
|
Yang S, Yang Z, Ni X. AMPFinder: A computational model to identify antimicrobial peptides and their functions based on sequence-derived information. Anal Biochem 2023; 673:115196. [PMID: 37236434 DOI: 10.1016/j.ab.2023.115196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 05/22/2023] [Accepted: 05/23/2023] [Indexed: 05/28/2023]
Abstract
Antimicrobial peptides (AMPs), also called host defense peptides, exist in all classes of life, generally comprise 5-100 amino acids, and can kill mycobacteria, enveloped viruses, bacteria, fungi, cancerous cells and so on. Because AMPs are not prone to drug resistance, they are promising agents for novel therapies. There is therefore an urgent need to identify AMPs and predict their functions in a high-throughput way. In this paper, we propose a cascaded computational model, called AMPFinder, to identify AMPs and their functional types based on sequence-derived information and life language embeddings. Compared with other state-of-the-art methods, AMPFinder obtains higher performance on both AMP identification and AMP function prediction. On an independent test dataset, AMPFinder improves F1-score (1.45%-6.13%), MCC (2.92%-12.86%), AUC (5.13%-8.56%) and AP (9.20%-21.07%). AMPFinder also achieves a lower bias in R2 on a public dataset under 10-fold cross-validation, an improvement of 18.82%-19.46%. The comparison with other state-of-the-art methods shows that AMPFinder can accurately identify AMPs and their functional types. The datasets, source code and user-friendly application are available at https://github.com/abcair/AMPFinder.
Affiliation(s)
- Sen Yang
- The Affiliated Changzhou No 2 People's Hospital of Nanjing Medical University, Changzhou, 213164, China; School of Computer Science and Artificial Intelligence Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, 213164, China
- Zexi Yang
- School of Computer Science and Artificial Intelligence Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, 213164, China
- Xinye Ni
- The Affiliated Changzhou No 2 People's Hospital of Nanjing Medical University, Changzhou, 213164, China.
|
25
|
Malek AA, Alias MA, Razak FA, Noorani MSM, Mahmud R, Zulkepli NFS. Persistent Homology-Based Machine Learning Method for Filtering and Classifying Mammographic Microcalcification Images in Early Cancer Detection. Cancers (Basel) 2023; 15:2606. [PMID: 37174071 PMCID: PMC10177619 DOI: 10.3390/cancers15092606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 03/23/2023] [Accepted: 03/30/2023] [Indexed: 05/15/2023]
Abstract
Microcalcifications in mammogram images are primary indicators for detecting the early stages of breast cancer. However, dense tissues and noise in the images make it challenging to classify the microcalcifications. Currently, preprocessing procedures such as noise removal techniques are applied directly on the images, which may produce a blurry effect and loss of image details. Further, most of the features used in classification models focus on local information of the images and are often burdened with details, resulting in data complexity. This research proposed a filtering and feature extraction technique using persistent homology (PH), a powerful mathematical tool used to study the structure of complex datasets and patterns. The filtering process is not performed directly on the image matrix but through the diagrams arising from PH. These diagrams will enable us to distinguish prominent characteristics of the image from noise. The filtered diagrams are then vectorised using PH features. Supervised machine learning models are trained on the MIAS and DDSM datasets to evaluate the extracted features' efficacy in discriminating between benign and malignant classes and to obtain the optimal filtering level. This study reveals that appropriate PH filtering levels and features can improve classification accuracy in early cancer detection.
Affiliation(s)
- Aminah Abdul Malek
- Department of Mathematical Sciences, Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
- Mathematical Sciences Studies, College of Computing, Informatics and Media, Universiti Teknologi MARA (UiTM) Negeri Sembilan Branch, Seremban Campus, Seremban 70300, Negeri Sembilan, Malaysia
- Mohd Almie Alias
- Department of Mathematical Sciences, Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
- Centre for Modelling and Data Analysis (DELTA), Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
- Fatimah Abdul Razak
- Department of Mathematical Sciences, Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
- Centre for Modelling and Data Analysis (DELTA), Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
- Mohd Salmi Md Noorani
- Department of Mathematical Sciences, Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
- Rozi Mahmud
- Department of Radiology and Imaging, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia (UPM), Serdang 43400, Selangor, Malaysia
|
26
|
Al-Tashi Q, Saad MB, Muneer A, Qureshi R, Mirjalili S, Sheshadri A, Le X, Vokes NI, Zhang J, Wu J. Machine Learning Models for the Identification of Prognostic and Predictive Cancer Biomarkers: A Systematic Review. Int J Mol Sci 2023; 24:7781. [PMID: 37175487 PMCID: PMC10178491 DOI: 10.3390/ijms24097781] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 02/28/2023] [Revised: 04/10/2023] [Accepted: 04/19/2023] [Indexed: 05/15/2023]
Abstract
The identification of biomarkers plays a crucial role in personalized medicine, both in the clinical and research settings. However, the contrast between predictive and prognostic biomarkers can be challenging due to the overlap between the two. A prognostic biomarker predicts the future outcome of cancer, regardless of treatment, and a predictive biomarker predicts the effectiveness of a therapeutic intervention. Misclassifying a prognostic biomarker as predictive (or vice versa) can have serious financial and personal consequences for patients. To address this issue, various statistical and machine learning approaches have been developed. The aim of this study is to present an in-depth analysis of recent advancements, trends, challenges, and future prospects in biomarker identification. A systematic search was conducted using PubMed to identify relevant studies published between 2017 and 2023. The selected studies were analyzed to better understand the concept of biomarker identification, evaluate machine learning methods, assess the level of research activity, and highlight the application of these methods in cancer research and treatment. Furthermore, existing obstacles and concerns are discussed to identify prospective research areas. We believe that this review will serve as a valuable resource for researchers, providing insights into the methods and approaches used in biomarker discovery and identifying future research opportunities.
Affiliation(s)
- Qasem Al-Tashi
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Maliazurina B. Saad
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Amgad Muneer
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Rizwan Qureshi
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Seyedali Mirjalili
- Centre for Artificial Intelligence Research and Optimization, Torrens University Australia, Fortitude Valley, Brisbane, QLD 4006, Australia
- Yonsei Frontier Lab, Yonsei University, Seoul 03722, Republic of Korea
- University Research and Innovation Center, Obuda University, 1034 Budapest, Hungary
- Ajay Sheshadri
- Department of Pulmonary Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Xiuning Le
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Natalie I. Vokes
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Jianjun Zhang
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Jia Wu
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
|
27
|
Wang RH, Luo T, Zhang HL, Du PF. PLA-GNN: Computational inference of protein subcellular location alterations under drug treatments with deep graph neural networks. Comput Biol Med 2023; 157:106775. [PMID: 36921458 DOI: 10.1016/j.compbiomed.2023.106775] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/24/2023] [Revised: 02/21/2023] [Accepted: 03/09/2023] [Indexed: 03/12/2023]
Abstract
Aberrant protein sorting has been observed in many conditions, including complex diseases, drug treatments, and environmental stresses. It is important to systematically identify protein mis-localization events in a given condition. Experimental methods for finding mis-localized proteins are costly and time consuming. Predicting protein subcellular localizations has been studied for many years. However, only a handful of existing works have considered protein subcellular location alterations. We proposed a computational method for identifying alterations of protein subcellular locations under drug treatments. We took three drugs, TSA (trichostatin A), bortezomib and tacrolimus, as examples for this study. By introducing dynamic protein-protein interaction networks, graph neural network algorithms were applied to aggregate topological information under different conditions. We systematically reported potential protein mis-localization events under drug treatments. As far as we know, this is the first attempt to find protein mis-localization events computationally in drug treatment conditions. The literature confirms that a number of proteins, which are highly related to the pharmacological mechanisms of these drugs, may undergo protein localization alterations. We name our method PLA-GNN (Protein Localization Alteration by Graph Neural Networks). It can be extended to other drugs and other conditions. All datasets and code of this study have been deposited in a GitHub repository (https://github.com/quinlanW/PLA-GNN).
Affiliation(s)
- Ren-Hua Wang
- College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China.
- Tao Luo
- College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China.
- Han-Lin Zhang
- College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China.
- Pu-Feng Du
- College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China.
|
28
|
Wu W, Wang Y, Tang J, Yu M, Yuan J, Zhang G. Developing and evaluating a machine-learning-based algorithm to predict the incidence and severity of ARDS with continuous non-invasive parameters from ordinary monitors and ventilators. Computer Methods and Programs in Biomedicine 2023; 230:107328. [PMID: 36640602 DOI: 10.1016/j.cmpb.2022.107328] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/07/2022] [Revised: 12/11/2022] [Accepted: 12/27/2022] [Indexed: 06/17/2023]
Abstract
OBJECTIVES Major observational studies report that the mortality rate of acute respiratory distress syndrome (ARDS) is close to 40%. Different treatment strategies are required for each patient, according to the degree of ARDS. Early prediction of ARDS helps to implement targeted drug therapy and mechanical ventilation strategies for patients with different degrees of potential ARDS. In this paper, a new machine learning model for dynamic prediction of ARDS incidence and severity is established and evaluated based on 28 parameters from ordinary monitors and ventilators. This new method is expected to meet the clinical practice requirements of user-friendliness and timeliness for wider application. METHODS A total of 4738 hospitalized patients who required ICU care from 159 hospitals are included in this study. The models are trained on standardized data from electronic medical records. There are 28 structured, continuous non-invasive parameters that are recorded every hour. Seven machine learning models using only continuous, non-invasive parameters are developed for dynamic prediction and compared with methods trained on complete parameters and the traditional risk adjustment method (i.e., oxygenation saturation index method). RESULTS The optimal prediction performance (area under the curve) of the ARDS incidence and severity prediction models built using continuous non-invasive parameters reached 0.8691 and 0.7765, respectively. In terms of mild and severe ARDS prediction, the AUC values are both above 0.85. The model using only continuous non-invasive parameters has an AUC 0.0133 lower than that of the model employing the complete feature set, which includes continuous non-invasive parameters, demographic information, laboratory parameters and clinical natural language text. CONCLUSIONS A machine learning method was developed in this study using only continuous non-invasive parameters for ARDS incidence and severity prediction. Because the continuous non-invasive parameters can be easily obtained from ordinary monitors and ventilators, the method presented in this study is user-friendly and convenient. It is expected to be applied in the pre-hospital setting for early ARDS warning.
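Since the results above are reported as area under the curve, a short sketch of how an AUC can be computed from labels and model scores may help. It uses the pairwise (Mann-Whitney) formulation; the function name `roc_auc` and the toy labels and scores are hypothetical and unrelated to the study data.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: two negatives and two positives with hypothetical risk scores.
y = [0, 0, 1, 1]
s = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y, s))  # 0.75: three of the four positive/negative pairs are ordered correctly
```

The quadratic loop is fine for illustration; for large cohorts a rank-based implementation from a statistics library would be preferable.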
Affiliation(s)
- Wenzhu Wu
- Chongqing Medical and Pharmaceutical College, Chongqing, China
- Yalin Wang
- Department of Medical Engineering, Medical Supplies Center of PLA General Hospital, Beijing, China
- Junquan Tang
- Chongqing Medical and Pharmaceutical College, Chongqing, China
- Ming Yu
- Institute of Medical Support Technology, Tianjin, China
- Jing Yuan
- Institute of Medical Support Technology, Tianjin, China
- Guang Zhang
- Institute of Medical Support Technology, Tianjin, China
|
29
|
Hobensack M, Song J, Scharp D, Bowles KH, Topaz M. Machine learning applied to electronic health record data in home healthcare: A scoping review. Int J Med Inform 2023; 170:104978. [PMID: 36592572 PMCID: PMC9869861 DOI: 10.1016/j.ijmedinf.2022.104978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 12/13/2022] [Accepted: 12/23/2022] [Indexed: 12/31/2022]
Abstract
OBJECTIVE Despite recent calls for home healthcare (HHC) to integrate informatics, the application of machine learning in HHC is relatively unknown. Thus, this study aimed to synthesize and appraise the literature describing the application of machine learning to predict adverse outcomes (e.g., hospitalization, mortality) using electronic health record (EHR) data in the HHC setting. Our secondary aim was to evaluate the comprehensiveness of predictors used in the machine learning algorithms guided by the Biopsychosocial Model. METHODS During March 2022 we conducted a literature search in four databases: PubMed, Embase, CINAHL, and Scopus. Inclusion criteria were 1) describing services provided in the HHC setting, 2) applying machine learning algorithms to predict adverse outcomes, defined as outcomes related to patient deterioration, 3) using EHR data and 4) focusing on the adult population. Predictors were mapped to the Biopsychosocial Model. A risk of bias analysis was conducted using the Prediction Model Risk Of Bias Assessment Tool. RESULTS The final sample included 20 studies. Eighteen studies used predictors from standardized assessments integrated in the EHR. The most common outcome of interest was hospitalization (55%), followed by mortality (25%). Psychological predictors were frequently excluded (35%). Tree based algorithms were most frequently applied (75%). Most studies demonstrated high or unclear risk of bias (75%). CONCLUSION Future studies in HHC should consider incorporating machine learning algorithms into clinical decision support systems to identify patients at risk. Based on the Biopsychosocial model, psychological and interpersonal characteristics should be used along with biological characteristics to enhance risk prediction. To facilitate the widespread adoption of machine learning, stakeholders should encourage standardization in the HHC setting.
Affiliation(s)
- Jiyoun Song
- Columbia University School of Nursing, New York, NY, USA.
- Kathryn H Bowles
- Department of Biobehavioral Health Sciences, University of Pennsylvania School of Nursing, Philadelphia, PA, USA; Center for Home Care Policy & Research, VNS Health, New York, NY, USA.
- Maxim Topaz
- Columbia University School of Nursing, New York, NY, USA; Center for Home Care Policy & Research, VNS Health, New York, NY, USA; Data Science Institute, Columbia University, New York, NY, USA.
|
30
|
Cia G, Pucci F, Rooman M. Critical review of conformational B-cell epitope prediction methods. Brief Bioinform 2023; 24:6972295. [PMID: 36611255 DOI: 10.1093/bib/bbac567] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 08/29/2022] [Revised: 11/17/2022] [Accepted: 11/19/2022] [Indexed: 01/09/2023]
Abstract
Accurate in silico prediction of conformational B-cell epitopes would lead to major improvements in disease diagnostics, drug design and vaccine development. A variety of computational methods, mainly based on machine learning approaches, have been developed in the last decades to tackle this challenging problem. Here, we rigorously benchmarked nine state-of-the-art conformational B-cell epitope prediction webservers, including generic and antibody-specific methods, on a dataset of over 250 antibody-antigen structures. The results of our assessment and statistical analyses show that all the methods achieve very low performances, and some do not perform better than randomly generated patches of surface residues. In addition, we also found that commonly used consensus strategies that combine the results from multiple webservers are at best only marginally better than random. Finally, we applied all the predictors to the SARS-CoV-2 spike protein as an independent case study, and showed that they perform poorly in general, which largely recapitulates our benchmarking conclusions. We hope that these results will lead to greater caution when using these tools until the biases and issues that limit current methods have been addressed, promote the use of state-of-the-art evaluation methodologies in future publications and suggest new strategies to improve the performance of conformational B-cell epitope prediction methods.
Affiliation(s)
- Gabriel Cia
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, F. Roosevelt Avenue, 1050, Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Triumph Boulevard, 1050, Brussels, Belgium
- Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, F. Roosevelt Avenue, 1050, Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Triumph Boulevard, 1050, Brussels, Belgium
- Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, F. Roosevelt Avenue, 1050, Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Triumph Boulevard, 1050, Brussels, Belgium
|
31
|
Thanathornwong B, Suebnukarn S, Ouivirach K. Clinical Decision Support System for Geriatric Dental Treatment Using a Bayesian Network and a Convolutional Neural Network. Healthc Inform Res 2023; 29:23-30. [PMID: 36792098 PMCID: PMC9932303 DOI: 10.4258/hir.2023.29.1.23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 10/30/2022] [Indexed: 02/10/2023]
Abstract
OBJECTIVES The aim of this study was to evaluate the performance of a clinical decision support system (CDSS) for therapeutic plans in geriatric dentistry. The information that needs to be considered in a therapeutic plan includes not only the patient's oral health status obtained from an oral examination, but also other related factors such as underlying diseases, socioeconomic characteristics, and functional dependency. METHODS A Bayesian network (BN) was used as a framework to construct a model of contributing factors and their causal relationships based on clinical knowledge and data. The faster R-CNN (regional convolutional neural network) algorithm was used to detect oral health status, which was part of the BN structure. The study was conducted using retrospective data from 400 patients receiving geriatric dental care at a university hospital between January 2020 and June 2021. RESULTS The model showed an F1-score of 89.31%, precision of 86.69%, and recall of 82.14% for the detection of periodontally compromised teeth. A receiver operating characteristic curve analysis showed that the BN model was highly accurate for recommending therapeutic plans (area under the curve = 0.902). The model performance was compared to that of experts in geriatric dentistry, and the experts and the system strongly agreed on the recommended therapeutic plans (kappa value = 0.905). CONCLUSIONS This research was the first phase of the development of a CDSS to recommend geriatric dental treatment. The proposed system, when integrated into the clinical workflow, is expected to provide general practitioners with expert-level decision support in geriatric dental care.
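The agreement between experts and the system is summarized above as a kappa value. As a hedged illustration of what that statistic measures, the sketch below computes Cohen's kappa for two sets of categorical recommendations. The function name `cohen_kappa` and the treatment-plan labels are hypothetical; the abstract does not state which kappa variant the authors used, so plain unweighted Cohen's kappa is an assumption.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: agreement between two raters beyond chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)

    # Observed proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance from each rater's category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical recommendations for six patients (labels invented for illustration).
experts = ["extract", "restore", "restore", "monitor", "extract", "restore"]
system = ["extract", "restore", "monitor", "monitor", "extract", "restore"]
kappa = cohen_kappa(experts, system)
```

In this toy case the raters agree on five of six patients (observed agreement 5/6) while chance agreement is 1/3, giving a kappa of about 0.75.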
|
32
|
Kanyongo W, Ezugwu AE. Machine learning approaches to medication adherence amongst NCD patients: A systematic literature review. Informatics in Medicine Unlocked 2023. [DOI: 10.1016/j.imu.2023.101210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/11/2023]
|
33
|
Kari H, Bandi SMS, Kumar A, Yella VR. DeePromClass: Delineator for Eukaryotic Core Promoters Employing Deep Neural Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:802-807. [PMID: 35353704 DOI: 10.1109/tcbb.2022.3163418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Computational promoter identification in eukaryotes is a classical biological problem that should be revisited given the avalanche of experimental data now available and emerging deep learning technologies. Current knowledge indicates that eukaryotic core promoters display multifarious signals such as the TATA-Box, Inr element, TCT, and Pause-button, and structural motifs such as G-quadruplexes. In the present study, we combined the power of deep learning with a plethora of promoter motifs to delineate promoters and non-promoters, gleaned from the statistical properties of DNA sequence arrangement. To this end, we implemented a convolutional neural network (CNN) and long short-term memory (LSTM) recurrent neural network architecture for five model systems, with the [-100 to +50] segment relative to the transcription start site taken as the core promoter. Unlike previous state-of-the-art tools, which furnish a binary decision of promoter or non-promoter, we classify a 151mer sequence chunk as either a promoter, along with its consensus signal type, or a non-promoter. The combined CNN-LSTM model, which we call "DeePromClass", achieved testing accuracy of 90.6%, 93.6%, 91.8%, 86.5%, and 84.0% for S. cerevisiae, C. elegans, D. melanogaster, Mus musculus, and Homo sapiens, respectively. In total, our tool provides an insightful update on next-generation promoter prediction tools for promoter biologists.
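The input described above, a 151-nucleotide segment spanning -100 to +50 around the transcription start site, is typically turned into a numeric matrix before it reaches a CNN or LSTM. The sketch below shows one common way to do this with one-hot encoding. It is an illustration under assumptions, not the DeePromClass code: the function names are hypothetical, the TSS is treated as offset 0 so that the window has 151 positions, and the abstract does not say how the authors indexed or encoded their sequences.

```python
BASES = "ACGT"


def core_promoter_window(sequence, tss_index, upstream=100, downstream=50):
    """Slice the segment from -upstream to +downstream around the TSS.

    Assumes tss_index is the 0-based position of the TSS and counts it as
    offset 0, so the default window has 100 + 1 + 50 = 151 nucleotides.
    """
    start = tss_index - upstream
    end = tss_index + downstream + 1
    if start < 0 or end > len(sequence):
        raise ValueError("window extends beyond the sequence")
    return sequence[start:end]


def one_hot(window):
    """Encode each base as a 4-element row; unknown bases (e.g. N) become all zeros."""
    return [[1 if base == b else 0 for b in BASES] for base in window.upper()]


# Hypothetical 200-nt sequence with an annotated TSS at index 120.
sequence = "ACGT" * 50
window = core_promoter_window(sequence, tss_index=120)
encoded = one_hot(window)  # 151 rows x 4 columns
```

The resulting 151 x 4 matrix is the kind of array a sequence model would consume; batching and the class labels (promoter signal type or non-promoter) are left out of this sketch.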
|
34
|
Abstract
As the global burden of antibiotic resistance continues to grow, creative approaches to antibiotic discovery are needed to accelerate the development of novel medicines. A rapidly progressing computational revolution, artificial intelligence, offers an optimistic path forward due to its ability to alleviate bottlenecks in the antibiotic discovery pipeline. In this review, we discuss how advancements in artificial intelligence are reinvigorating the adoption of past antibiotic discovery models, namely natural product exploration and small molecule screening. We then explore the application of contemporary machine learning approaches to emerging areas of antibiotic discovery, including antibacterial systems biology, drug combination development, antimicrobial peptide discovery, and mechanism of action prediction. Lastly, we propose a call to action for open access of high-quality screening datasets and interdisciplinary collaboration to accelerate the rate at which machine learning models can be trained and new antibiotic drugs can be developed.
Affiliation(s)
- Telmah Lluka
- Department of Biochemistry and Biomedical Sciences, Michael G. DeGroote Institute for Infectious Disease Research, David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Jonathan M Stokes
- Department of Biochemistry and Biomedical Sciences, Michael G. DeGroote Institute for Infectious Disease Research, David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
|
35
|
Lyu Y, Xu Q, Yang Z, Liu J. Prediction of patient choice tendency in medical decision-making based on machine learning algorithm. Front Public Health 2023; 11:1087358. [PMID: 36908484 PMCID: PMC9998498 DOI: 10.3389/fpubh.2023.1087358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Accepted: 02/07/2023] [Indexed: 03/14/2023]
Abstract
Objective: Machine learning (ML) algorithms, an early branch of artificial intelligence, can simulate human behavior by learning from training data. This study used ML algorithms to predict patient choice tendencies in medical decision-making, with the goal of helping physicians understand patient preferences and informing the development of decision-making schemes in clinical treatment. With such predictions, physicians and patients can have better conversations at lower cost, leading to better medical decisions. Method: Patient medical decision-making tendencies were predicted from primary survey data obtained from 248 participants at third-level grade-A hospitals in China. Twelve predictor variables were set according to a literature review, and four outcome variables were set based on the optimization principle of clinical diagnosis and treatment: the patient's medical decision-making tendency was classified as treatment effect, treatment cost, treatment side effect, or treatment experience. Given the characteristics of the data, three ML classification algorithms, decision tree (DT), k-nearest neighbor (KNN), and support vector machine (SVM), were used to predict patients' medical decision-making tendency, and their performance was compared. Results: The accuracy of the DT algorithm for predicting patients' choice tendency in medical decision-making was 80% for treatment effect, 60% for treatment cost, 56% for treatment side effects, and 60% for treatment experience; the KNN algorithm achieved 78%, 66%, 74%, and 84%, and the SVM algorithm 82%, 76%, 80%, and 94%, respectively. The corresponding F1-scores were 0.80, 0.61, 0.58, and 0.60 for DT; 0.75, 0.65, 0.71, and 0.84 for KNN; and 0.81, 0.74, 0.73, and 0.94 for SVM.
Conclusion: Among the three ML classification algorithms, SVM had the highest accuracy and the best overall performance. The prediction results therefore have reference value for physicians formulating clinical treatment plans. The findings can support the development and application of patient-centered medical decision assistance systems, help resolve conflicts of interest between physicians and patients, and assist them in making scientific decisions.
Affiliation(s)
- Yuwen Lyu
- Institute of Humanities and Social Sciences, Guangzhou Medical University, Guangzhou, China
| | - Qian Xu
- School of Health Management, Guangzhou Medical University, Guangzhou, China
| | - Zhenchao Yang
- The Eighth Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Junrong Liu
- Institute of Humanities and Social Sciences, Guangzhou Medical University, Guangzhou, China
| |
|
36
|
Chest X-Ray Images to Differentiate COVID-19 from Pneumonia with Artificial Intelligence Techniques. Int J Biomed Imaging 2022; 2022:5318447. [PMID: 36588667 PMCID: PMC9800093 DOI: 10.1155/2022/5318447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Revised: 11/05/2022] [Accepted: 11/29/2022] [Indexed: 12/24/2022] Open
Abstract
This paper presents an automated and noninvasive technique to discriminate COVID-19 patients from pneumonia patients using chest X-ray images and artificial intelligence. The reverse transcription-polymerase chain reaction (RT-PCR) test is commonly administered to detect COVID-19. However, the RT-PCR test requires person-to-person contact to administer, takes a variable amount of time to produce results, and is expensive. Moreover, the test remains inaccessible to a significant share of the global population. Chest X-ray images can play an important role here because X-ray machines are commonly available at healthcare facilities. However, the chest X-ray images of COVID-19 and viral pneumonia patients are very similar, and subjective reading often leads to misdiagnosis. This investigation employs two algorithms to address the problem objectively. One algorithm extracts lower-dimensional encoded features from the X-ray images and passes them to machine learning algorithms for final classification. The other relies on the built-in feature extractor of a pretrained deep neural network, VGG16, to extract features from the X-ray images and classify them. The simulation results show that the two proposed algorithms can distinguish COVID-19 patients from pneumonia patients with best accuracies of 100% and 98.1%, employing VGG16 and the machine learning algorithm, respectively. The performance of the two algorithms is also compared with that of existing state-of-the-art methods.
|
37
|
Shalit N, Fire M, Ben-Elia E. A supervised machine learning model for imputing missing boarding stops in smart card data. PUBLIC TRANSPORT (HEIDELBERG, GERMANY) 2022; 15:287-319. [PMID: 38625321 PMCID: PMC9734418 DOI: 10.1007/s12469-022-00309-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/04/2022] [Indexed: 04/17/2024]
Abstract
Public transport has become an essential part of urban existence with increased population densities and environmental awareness. Large quantities of data are currently generated, allowing for more robust methods to understand travel behavior by harvesting smart card usage. However, public transport datasets suffer from data integrity problems; boarding stop information may be missing due to imperfect acquisition processes or inadequate reporting. This study introduces a supervised machine learning method to impute missing boarding stops based on ordinal classification using GTFS timetable, smart card, and geospatial datasets. A new metric, Pareto Accuracy, is suggested to evaluate algorithms where classes have an ordinal nature. The results are based on a case study in the city of Beer Sheva, Israel, consisting of one month of smart card data. We show that our proposed method is robust to irregular travelers and significantly outperforms well-known imputation methods without the need to mine any additional datasets. Validation on data from another Israeli city using transfer learning shows that the presented model is general and context-free. The implications for transportation planning and travel behavior research are further discussed.
Affiliation(s)
- Nadav Shalit
- Data4Good Lab, Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Michael Fire
- Data4Good Lab, Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Eran Ben-Elia
- GAMESLab, Department of Geography and Environmental Development, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| |
|
38
|
Liu G, Xu J, Wang C, Yu M, Yuan J, Tian F, Zhang G. A machine learning method for predicting the probability of MODS using only non-invasive parameters. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022; 227:107236. [PMID: 36384060 DOI: 10.1016/j.cmpb.2022.107236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2022] [Revised: 10/30/2022] [Accepted: 11/07/2022] [Indexed: 06/16/2023]
Abstract
OBJECTIVES: Timely and accurate prediction of multiple organ dysfunction syndrome (MODS) is essential for the rescue and treatment of trauma patients. However, existing methods are invasive, easily affected by artifacts, and can be difficult to perform in a pre-hospital setting. We aim to develop prediction models for patients with MODS using only non-invasive parameters. METHOD: Records from 2319 patients were extracted from the Multiparameter Intelligent Monitoring in Intensive Care Ⅲ database (MIMIC Ⅲ), based on the sequential organ failure assessment (SOFA) score. Seven commonly used machine learning (ML) methods were selected and applied to develop a real-time prediction method for MODS based on the full parameter set (laboratory, drug, and non-invasive parameters; 57 parameters in total) and on non-invasive parameters only (17 parameters), and were compared with four traditional scoring systems. RESULTS: With full-parameter modeling, LightGBM (LGBM) and Adaboost achieved an area under the receiver operating characteristic curve (AUC) of 0.959, outperforming the four traditional scoring systems. Removing 40 parameters and retaining the 17 non-invasive parameters decreased the AUC of LGBM by 0.015, which still outperformed all traditional scoring systems. CONCLUSIONS: A real-time and accurate MODS prediction method based on non-invasive parameters was developed by comparing the performance of four ML methods, and proved superior to the traditional scoring systems. This method can help medical staff diagnose MODS as early as possible and can improve the survival rate of patients in a pre-hospital setting.
Affiliation(s)
- Guanjun Liu
- Institute of Medical Support Technology, Academy of Systems Engineering, Academy of Military Sciences, 106 Wandong Road, Tianjin 300161, China
| | - Jiameng Xu
- School of Life Sciences, Tiangong University, 399 Binshui West Road, Tianjin 300387, China
| | - Chengyi Wang
- School of Life Sciences, Tiangong University, 399 Binshui West Road, Tianjin 300387, China
| | - Ming Yu
- Institute of Medical Support Technology, Academy of Systems Engineering, Academy of Military Sciences, 106 Wandong Road, Tianjin 300161, China
| | - Jing Yuan
- Institute of Medical Support Technology, Academy of Systems Engineering, Academy of Military Sciences, 106 Wandong Road, Tianjin 300161, China
| | - Feng Tian
- Institute of Medical Support Technology, Academy of Systems Engineering, Academy of Military Sciences, 106 Wandong Road, Tianjin 300161, China
| | - Guang Zhang
- Institute of Medical Support Technology, Academy of Systems Engineering, Academy of Military Sciences, 106 Wandong Road, Tianjin 300161, China.
| |
|
39
|
Lan W, Dong Y, Chen Q, Liu J, Wang J, Chen YPP, Pan S. IGNSCDA: Predicting CircRNA-Disease Associations Based on Improved Graph Convolutional Network and Negative Sampling. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3530-3538. [PMID: 34506289 DOI: 10.1109/tcbb.2021.3111607] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 06/13/2023]
Abstract
Accumulating evidence has shown that circRNA plays an important role in human diseases and can be used as a potential biomarker for the diagnosis and treatment of disease. Although some computational methods have been proposed to predict circRNA-disease associations, their performance still needs to be improved. In this paper, we propose a new computational model based on an Improved Graph convolutional network and Negative Sampling to predict CircRNA-Disease Associations (IGNSCDA). The method first constructs a heterogeneous network from known circRNA-disease associations. An improved graph convolutional network is then designed to obtain the feature vectors of circRNAs and diseases, and a multi-layer perceptron is employed to predict circRNA-disease associations from these feature vectors. In addition, a negative sampling method is employed to reduce the effect of noisy samples; it selects negative samples based on circRNA expression profile similarity and Gaussian Interaction Profile kernel similarity. Five-fold cross validation is used to evaluate the performance of the method. The results show that IGNSCDA outperforms other state-of-the-art methods in prediction performance. Moreover, a case study shows that IGNSCDA is an effective tool for predicting potential circRNA-disease associations.
|
40
|
An approach to multi-class imbalanced problem in ecology using machine learning. ECOL INFORM 2022. [DOI: 10.1016/j.ecoinf.2022.101822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
|
41
|
Tarakci F, Ozkan IA, Yilmaz S, Tezcan D. Diagnosing rheumatoid arthritis disease using fuzzy expert system and machine learning techniques. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2022. [DOI: 10.3233/jifs-221582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Rheumatoid Arthritis (RA) is a very common autoimmune disease that causes significant morbidity and mortality, and therefore early diagnosis and treatment are important. Early diagnosis of RA and knowing the severity of the disease are very important for the treatment to be applied. The diagnosis of RA usually requires a physical examination, laboratory tests, and a review of the patient’s medical history. In this study, the diagnosis of RA was made with two different methods using a fuzzy expert system (FES) and machine learning (ML) techniques, which were designed and implemented with the help of a specialist in the field, and the results were compared. For this purpose, blood counts were taken from 286 people, including 91 men and 195 women from various age groups. In the first method, an FES structure that determines the severity of RA disease has been established from blood count using the laboratory test results of CRP, ESR, RF, and ANA. The FES result that determines RA disease severity, the Anti-CCP level that is used to distinguish RA disease, and the patient’s medical history were used to design the Decision Support System (DSS) that diagnoses RA disease. The DSS is web-based and publicly accessible. In the second method, RA disease was diagnosed using kNN, SVM, LR, DT, NB, and MLP algorithms, which are widely used in machine learning. To examine the effect of the patient’s history on RA disease diagnosis, two different models were used in machine learning techniques, one with and one without the patient’s history. The results of the fuzzy-based DSS were also compared with the diagnoses made by the specialist and the diagnoses made according to the 2010 ACR / EULAR RA classification criteria. The performed DSS has achieved a diagnostic success rate of 94.05% on 286 patients. In the study of machine learning techniques, the highest success rate was achieved with the LR model. 
While the success rate of the model was 91.25% with only blood count data, it rose to 97.90% with the addition of the patient’s history. In addition to the high success rate, the results show that the patient’s history is important in diagnosing RA disease.
Affiliation(s)
- Fatih Tarakci
- Department of Computer Engineering, Faculty of Technology, Selcuk University, Konya, Turkey
| | - Ilker Ali Ozkan
- Department of Computer Engineering, Faculty of Technology, Selcuk University, Konya, Turkey
| | - Sema Yilmaz
- Division of Rheumatology, Selcuk University School of Medicine, Konya, Turkey
| | - Dilek Tezcan
- Division of Rheumatology, Selcuk University School of Medicine, Konya, Turkey
| |
|
42
|
Zahedi S, Carvalho AS, Ejtehadifar M, Beck HC, Rei N, Luis A, Borralho P, Bugalho A, Matthiesen R. Assessment of a Large-Scale Unbiased Malignant Pleural Effusion Proteomics Study of a Real-Life Cohort. Cancers (Basel) 2022; 14:cancers14184366. [PMID: 36139528 PMCID: PMC9496668 DOI: 10.3390/cancers14184366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Revised: 08/29/2022] [Accepted: 09/01/2022] [Indexed: 11/29/2022] Open
Abstract
Simple Summary Pleural effusion (PE) occurs as a consequence of various pathologies. Malignant effusion due to lung cancer is one of the most frequent causes. A method for accurate differentiation of malignant from benign PE is an unmet clinical need. Proteomics profiling of PE has shown promising results. However, mass spectrometry (MS) analysis typically involves the tedious elimination of abundant proteins before analysis, and clinical annotation of proteomics profiled cohorts is limited. This study compares the proteomes of malignant PE and nonmalignant PE, identifies lung cancer malignant markers in agreement with other studies, and identifies markers strongly associated with patient survival. Abstract Background: Pleural effusion (PE) is common in advanced-stage lung cancer patients and is related to poor prognosis. Identification of cancer cells is the standard method for the diagnosis of a malignant PE (MPE). However, it only has moderate sensitivity. Thus, more sensitive diagnostic tools are urgently needed. Methods: The present study aimed to discover potential protein targets to distinguish malignant pleural effusion (MPE) from other non-malignant pathologies. We have collected PE from 97 patients to explore PE proteomes by applying state-of-the-art liquid chromatography-mass spectrometry (LC-MS) to identify potential biomarkers that correlate with immunohistochemistry assessment of tumor biopsy or with survival data. Functional analyses were performed to elucidate functional differences in PE proteins in malignant and benign samples. Results were integrated into a clinical risk prediction model to identify likely malignant cases. Sensitivity, specificity, and negative predictive value were calculated. Results: In total, 1689 individual proteins were identified by MS-based proteomics analysis of the 97 PE samples, of which 35 were diagnosed as malignant. 
A comparison between MPE and benign PE (BPE) identified 58 differentially regulated proteins after correction of the p-values for multiple testing. Furthermore, functional analysis revealed an up-regulation of matrix intermediate filaments and cellular movement-related proteins. Additionally, gene ontology analysis identified the involvement of metabolic pathways such as glycolysis/gluconeogenesis, pyruvate metabolism, and cysteine and methionine metabolism. Conclusion: This study demonstrated a partial least squares regression model with an area under the curve of 98 and an accuracy of 0.92 when evaluated on the holdout test data set. Furthermore, highly significant survival markers were identified (e.g., PSME1 with a log-rank of 1.68 × 10⁻⁶).
Affiliation(s)
- Sara Zahedi
- iNOVA4Health, NOVA Medical School (NMS), Faculdade de Ciências Médicas (FCM), Universidade Nova de Lisboa, 1150-082 Lisbon, Portugal
| | - Ana Sofia Carvalho
- iNOVA4Health, NOVA Medical School (NMS), Faculdade de Ciências Médicas (FCM), Universidade Nova de Lisboa, 1150-082 Lisbon, Portugal
| | - Mostafa Ejtehadifar
- iNOVA4Health, NOVA Medical School (NMS), Faculdade de Ciências Médicas (FCM), Universidade Nova de Lisboa, 1150-082 Lisbon, Portugal
| | - Hans C. Beck
- Department of Clinical Biochemistry, Odense University Hospital, 5000 Odense, Denmark
| | - Nádia Rei
- iNOVA4Health, NOVA Medical School (NMS), Faculdade de Ciências Médicas (FCM), Universidade Nova de Lisboa, 1150-082 Lisbon, Portugal
| | - Ana Luis
- Hospital CUF Descobertas, CUF Oncologia, 1998-018 Lisbon, Portugal
| | - Paula Borralho
- Hospital CUF Descobertas, CUF Oncologia, 1998-018 Lisbon, Portugal
| | - António Bugalho
- iNOVA4Health, NOVA Medical School (NMS), Faculdade de Ciências Médicas (FCM), Universidade Nova de Lisboa, 1150-082 Lisbon, Portugal
- Hospital CUF Descobertas, CUF Oncologia, 1998-018 Lisbon, Portugal
- Correspondence: (A.B.); (R.M.)
| | - Rune Matthiesen
- iNOVA4Health, NOVA Medical School (NMS), Faculdade de Ciências Médicas (FCM), Universidade Nova de Lisboa, 1150-082 Lisbon, Portugal
- Correspondence: (A.B.); (R.M.)
| |
|
43
|
Adams J, Agyenkwa-Mawuli K, Agyapong O, Wilson MD, Kwofie SK. EBOLApred: A machine learning-based web application for predicting cell entry inhibitors of the Ebola virus. Comput Biol Chem 2022; 101:107766. [DOI: 10.1016/j.compbiolchem.2022.107766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Revised: 08/10/2022] [Accepted: 08/29/2022] [Indexed: 11/03/2022]
|
44
|
Warin K, Limprasert W, Suebnukarn S, Jinaporntham S, Jantana P, Vicharueang S. AI-based analysis of oral lesions using novel deep convolutional neural networks for early detection of oral cancer. PLoS One 2022; 17:e0273508. [PMID: 36001628 PMCID: PMC9401150 DOI: 10.1371/journal.pone.0273508] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 03/02/2022] [Accepted: 08/09/2022] [Indexed: 11/18/2022] Open
Abstract
Artificial intelligence (AI) applications in oncology have developed rapidly, with reported successes in recent years. This work aims to evaluate the performance of deep convolutional neural network (CNN) algorithms for the classification and detection of oral potentially malignant disorders (OPMDs) and oral squamous cell carcinoma (OSCC) in oral photographic images. A dataset of 980 oral photographic images comprised 365 images of OSCC, 315 images of OPMDs and 300 non-pathological images. Multiclass image classification models were created using DenseNet-169, ResNet-101, SqueezeNet and Swin-S. Multiclass object detection models were built using Faster R-CNN, YOLOv5, RetinaNet and CenterNet2. The AUC of multiclass image classification for the best CNN model, DenseNet-169, was 1.00 and 0.98 on OSCC and OPMDs, respectively. The AUC of the best multiclass CNN-based object detection model, Faster R-CNN, was 0.88 and 0.64 on OSCC and OPMDs, respectively. In comparison, DenseNet-169 yielded the best multiclass image classification performance with AUC of 1.00 and 0.98 on OSCC and OPMDs, respectively. These values were in line with the performance of experts and superior to those of general practitioners (GPs). In conclusion, CNN-based models have potential for the identification of OSCC and OPMDs in oral photographic images and are expected to serve as a diagnostic tool to assist GPs in the early detection of oral cancer.
Affiliation(s)
- Kritsasith Warin
- Faculty of Dentistry, Thammasat University, Khlong Luang, Pathum Thani, Thailand
| | - Wasit Limprasert
- College of Interdisciplinary Studies, Thammasat University, Khlong Luang, Pathum Thani, Thailand
| | - Siriwan Suebnukarn
- Faculty of Dentistry, Thammasat University, Khlong Luang, Pathum Thani, Thailand
|
45
|
Wang Y, Shi D, Zhou W. Convolutional Neural Network Approach Based on Multimodal Biometric System with Fusion of Face and Finger Vein Features. SENSORS (BASEL, SWITZERLAND) 2022; 22:6039. [PMID: 36015799 PMCID: PMC9412820 DOI: 10.3390/s22166039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 08/05/2022] [Accepted: 08/10/2022] [Indexed: 06/15/2023]
Abstract
In today's information age, accurately verifying a person's identity and protecting information security have become topics of broad interest. At present, biometric identification is a more convenient and secure solution for identity verification, but a single biometric cannot support increasingly complex and diversified authentication scenarios. Multimodal biometric technology can improve the accuracy and security of identification. This paper proposes a bimodal biometric method based on feature-layer fusion of finger vein and face features using a convolutional neural network (CNN). A self-attention mechanism is used to obtain the weights of the two biometrics and, combined with the ResNet residual structure, the self-attention weighted feature is concatenated along the channel dimension (Concat) with the bimodal fusion feature. To demonstrate the high efficiency of bimodal feature-layer fusion, the AlexNet and VGG-19 network models were selected in the experiments to extract finger vein and face image features as inputs to the feature fusion module. Extensive experiments show that the recognition accuracy of both models exceeds 98.4%, demonstrating the high efficiency of the bimodal feature fusion.
|
46
|
Hao S, Hu X, Feng Z, Sun K, You X, Wang Z, Yang C. Prediction of metal ion ligand binding residues by adding disorder value and propensity factors based on deep learning algorithm. Front Genet 2022; 13:969412. [PMID: 36035120 PMCID: PMC9402973 DOI: 10.3389/fgene.2022.969412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 07/04/2022] [Indexed: 11/13/2022] Open
Abstract
Proteins need to interact with different ligands to perform their functions, and metal ions are a major class of ligand. At present, predicting the metal ion ligand binding residues of proteins remains a challenge. In this study, we selected the Zn2+, Cu2+, Fe2+, Fe3+, Co2+, Mn2+, Ca2+ and Mg2+ metal ion ligands from the BioLip database as the research objects. Based on the amino acids, the physicochemical properties and predicted structural information, we introduced the disorder value as a feature parameter. In addition, based on the component information, position weight matrix and information entropy, we introduced the propensity factor as a prediction parameter. We then used a deep neural network algorithm for the prediction. Finally, we optimized the hyper-parameters of the deep learning algorithm and obtained better results than the previous IonSeq method.
Affiliation(s)
- Sixi Hao
- College of Sciences, Inner Mongolia University of Technology, Hohhot, China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, China
| | - Xiuzhen Hu
- College of Sciences, Inner Mongolia University of Technology, Hohhot, China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, China
- *Correspondence: Xiuzhen Hu; Zhenxing Feng
| | - Zhenxing Feng
- College of Sciences, Inner Mongolia University of Technology, Hohhot, China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, China
- *Correspondence: Xiuzhen Hu; Zhenxing Feng
| | - Kai Sun
- College of Sciences, Inner Mongolia University of Technology, Hohhot, China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, China
| | - Xiaoxiao You
- College of Sciences, Inner Mongolia University of Technology, Hohhot, China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, China
| | - Ziyang Wang
- College of Sciences, Inner Mongolia University of Technology, Hohhot, China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, China
| | - Caiyun Yang
- College of Sciences, Inner Mongolia University of Technology, Hohhot, China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, China
| |
|
47
|
Identify Bitter Peptides by Using Deep Representation Learning Features. Int J Mol Sci 2022; 23:ijms23147877. [PMID: 35887225 PMCID: PMC9315524 DOI: 10.3390/ijms23147877] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/31/2022] [Revised: 07/01/2022] [Accepted: 07/14/2022] [Indexed: 02/04/2023] Open
Abstract
A bitter taste often identifies hazardous compounds and it is generally avoided by most animals and humans. Bitterness of hydrolyzed proteins is caused by the presence of bitter peptides. To improve palatability, bitter peptides need to be identified experimentally in a time-consuming and expensive process, before they can be removed or degraded. Here, we report the development of a machine learning prediction method, iBitter-DRLF, which is based on a deep learning pre-trained neural network feature extraction method. It uses three sequence embedding techniques, soft symmetric alignment (SSA), unified representation (UniRep), and bidirectional long short-term memory (BiLSTM). These were initially combined into various machine learning algorithms to build several models. After optimization, the combined features of UniRep and BiLSTM were finally selected, and the model was built in combination with a light gradient boosting machine (LGBM). The results showed that the use of deep representation learning greatly improves the ability of the model to identify bitter peptides, achieving accurate prediction based on peptide sequence data alone. By helping to identify bitter peptides, iBitter-DRLF can help research into improving the palatability of peptide therapeutics and dietary supplements in the future. A webserver is available, too.
|
48
|
An Efficient Heap Based Optimizer Algorithm for Feature Selection. MATHEMATICS 2022. [DOI: 10.3390/math10142396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The heap-based optimizer (HBO) is an innovative meta-heuristic inspired by human social behavior. In this research, binary adaptations of the heap-based optimizer B_HBO are presented and used to determine the optimal features for classifications in wrapping form. In addition, HBO balances exploration and exploitation by employing self-adaptive parameters that can adaptively search the solution domain for the optimal solution. In the feature selection domain, the presented algorithms for the binary Heap-based optimizer B_HBO are used to find feature subsets that maximize classification performance while lowering the number of selected features. The k-nearest neighbor (k-NN) classifier ensures that the selected features are significant. The new binary methods are compared to eight common optimization methods recently employed in this field, including Ant Lion Optimization (ALO), Archimedes Optimization Algorithm (AOA), Backtracking Search Algorithm (BSA), Crow Search Algorithm (CSA), Levy flight distribution (LFD), Particle Swarm Optimization (PSO), Slime Mold Algorithm (SMA), and Tree Seed Algorithm (TSA) in terms of fitness, accuracy, precision, sensitivity, F-score, the number of selected features, and statistical tests. Twenty datasets from the UCI repository are evaluated and compared using a set of evaluation indicators. The non-parametric Wilcoxon rank-sum test was used to determine whether the proposed algorithms’ results varied statistically significantly from those of the other compared methods. The comparison analysis demonstrates that B_HBO is superior or equivalent to the other algorithms used in the literature.
Collapse
|
49
|
Jeon YJ, Hasan MM, Park HW, Lee KW, Manavalan B. TACOS: a novel approach for accurate prediction of cell-specific long noncoding RNAs subcellular localization. Brief Bioinform 2022; 23:6618237. [PMID: 35753698 PMCID: PMC9294414 DOI: 10.1093/bib/bbac243] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 03/01/2022] [Revised: 05/23/2022] [Accepted: 05/24/2022] [Indexed: 11/14/2022] Open
Abstract
Long noncoding RNAs (lncRNAs) are primarily regulated by their cellular localization, which is responsible for their molecular functions, including cell cycle regulation and genome rearrangements. Accurately identifying the subcellular location of lncRNAs from sequence information is crucial for a better understanding of their biological functions and mechanisms. In contrast to traditional experimental methods, bioinformatics or computational methods can be applied for the annotation of lncRNA subcellular locations in humans more effectively. In the past, several machine learning-based methods have been developed to identify lncRNA subcellular localization, but relevant work for identifying cell-specific localization of human lncRNA remains limited. In this study, we present the first application of the tree-based stacking approach, TACOS, which allows users to identify the subcellular localization of human lncRNA in 10 different cell types. Specifically, we conducted comprehensive evaluations of six tree-based classifiers with 10 different feature descriptors, using a newly constructed balanced training dataset for each cell type. Subsequently, the strengths of the AdaBoost baseline models were integrated via a stacking approach, with an appropriate tree-based classifier for the final prediction. TACOS displayed consistent performance in both the cross-validation and independent assessments compared with the other two approaches employed in this study. The user-friendly online TACOS web server can be accessed at https://balalab-skku.org/TACOS.
Affiliation(s)
- Young-Jun Jeon
- Department of Integrative Biotechnology, College of Bioengineering and Biotechnology, Sungkyunkwan University, Suwon 16419, Korea
| | - Md Mehedi Hasan
- Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, USA
| | - Hyun Woo Park
- Department of Integrative Biotechnology, College of Bioengineering and Biotechnology, Sungkyunkwan University, Suwon 16419, Korea
| | - Ki Wook Lee
- Department of Integrative Biotechnology, College of Bioengineering and Biotechnology, Sungkyunkwan University, Suwon 16419, Korea
| | - Balachandran Manavalan
- Computational Biology and Bioinformatics laboratory, Department of Integrative Biotechnology, College of Bioengineering and Biotechnology, Sungkyunkwan University, Suwon 16419, Korea
| |
|
50
|
Mapping Fire Susceptibility in the Brazilian Amazon Forests Using Multitemporal Remote Sensing and Time-Varying Unsupervised Anomaly Detection. REMOTE SENSING 2022. [DOI: 10.3390/rs14102429] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 02/04/2023]
Abstract
The economic and environmental impacts of wildfires have driven the development of new technologies to prevent and reduce the occurrence of these devastating events. Indeed, identifying and mapping fire-susceptible areas are critical tasks, not only to pave the way for rapid responses that slow the spread of fire, but also to support emergency evacuation plans for the families affected by fire-related tragedies. Aiming at simultaneously mapping and measuring the risk of fires in the forest areas of Brazil’s Amazon, in this paper we combine multitemporal remote sensing, derivative spectral indices, and anomaly detection into a fully unsupervised methodology. We focus our analysis on recent forest fire events that occurred in the Brazilian Amazon by exploring multitemporal images acquired by both Landsat-8 Operational Land Imager and MODIS sensors. We experimentally confirm that the current methodology is capable of predicting fire outbreaks at immediately subsequent time instants, which attests to the operational performance and applicability of our approach to preventing and mitigating the impact of fires in Brazilian forest regions.
|