1. Kumar S, Balaya RDA, Kanekar S, Raju R, Prasad TSK, Kandasamy RK. Computational tools for exploring peptide-membrane interactions in gram-positive bacteria. Comput Struct Biotechnol J 2023; 21:1995-2008. [PMID: 36950221 PMCID: PMC10025024 DOI: 10.1016/j.csbj.2023.02.051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 02/27/2023] [Accepted: 02/27/2023] [Indexed: 03/05/2023] Open
Abstract
Vital cellular functions in Gram-positive bacteria are controlled by signaling molecules known as quorum sensing peptides (QSPs), which are considered promising therapeutic interventions for bacterial infections. In the bacterial system, QSPs bind to membrane-coupled receptors, which then auto-phosphorylate and activate intracellular response regulators. These response regulators induce target gene expression in bacteria. One of the most reliable trends in drug discovery research for virulence-associated molecular targets is the use of peptide drugs or new functionalities. From this perspective, computational methods act as auxiliary aids for biologists, with methodologies based on machine learning and in silico analysis developed as tools for target peptide identification. Therefore, the development of quick and reliable computational resources to identify or predict these QSPs, along with their receptors and inhibitors, is receiving considerable attention. Databases such as Quorumpeps and Quorum Sensing of Human Gut Microbes (QSHGM) provide a detailed overview of the structures and functions of QSPs. Tools and algorithms such as QSPpred, QSPred-FL, iQSP, EnsembleQS and PEPred-Suite have been used for the generic prediction of QSPs and for feature representation. The availability of compiled key resources for utilizing peptide features based on amino acid composition, positional preferences, and motifs, as well as structural and physicochemical properties, including those of biofilm inhibitory peptides, can aid in elucidating QSP and membrane receptor interactions in infectious Gram-positive pathogens. Herein, we present a comprehensive survey of diverse computational approaches that are suitable for detecting QSPs and QS interference molecules. This review highlights the utility of these methods for developing potential biomarkers against infectious Gram-positive pathogens.
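The abstract lists amino acid composition (AAC) among the peptide features used by QSP predictors. As a minimal sketch of that feature only, the Python function below computes the fraction of each of the 20 standard residues in a sequence. The function name and the toy sequence are hypothetical illustrations, and this is not the implementation of QSPpred, iQSP, or any other tool named above.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide: str) -> dict:
    """Return the fraction of each standard amino acid in a peptide."""
    sequence = peptide.strip().upper()
    if not sequence:
        raise ValueError("peptide sequence is empty")
    unknown = set(sequence) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    counts = Counter(sequence)
    # Counter returns 0 for residues that do not occur in the sequence.
    return {aa: counts[aa] / len(sequence) for aa in AMINO_ACIDS}


# Hypothetical toy sequence, not a real quorum sensing peptide.
features = amino_acid_composition("ACDA")
print(features["A"], features["C"], features["W"])
```

The result is a fixed-length vector of 20 fractions regardless of peptide length, which is the property that makes AAC convenient as input to classifiers such as SVM or Random Forest.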
Key Words
- 3-HBA, 3–Hydroxybenzoic Acid
- AAC, Amino Acid Composition
- ABC, ATP-binding cassette
- ACD, Available Chemicals Database
- AIP, Autoinducing Peptide
- AMP, Anti-Microbial Peptide
- ATP, Adenosine Triphosphate
- Agr, Accessory gene regulator
- BFE, Binding Free Energy
- BIP Inhibitors
- BIP, Biofilm Inhibitory Peptides
- BLAST, Basic Local Alignment Search Tool
- BNB, Bernoulli Naïve-Bayes
- CADD, Computer-Aided Drug Design
- CSP, Competence Stimulating Peptide
- CTD, Composition-Transition-Distribution
- D, Aspartate
- DCH, 3,3′-(3,4-dichlorobenzylidene)-bis-(4-hydroxycoumarin)
- DT, Decision Tree
- FDA, Food and Drug Administration
- GBM, Gradient Boosting Machine
- GDC, g-gap Dipeptide
- GNB, Gaussian Naïve-Bayes
- Gram-positive bacteria
- H, Histidine
- H-Kinase, Histidine Kinase
- H-phosphotransferase, Histidine Phosphotransferase
- HAM, Hamamelitannin
- HGM, Human Gut Microbiota
- HNP, Human Neutrophil Peptide
- IT, Information Theory Features
- In silico approaches
- KNN, K-Nearest Neighbors
- MCC, Matthews Correlation Coefficient
- MD, Molecular Dynamics
- MDR, Multiple Drug Resistance
- ML, Machine Learning
- MRSA, Methicillin Resistant S. aureus
- MSL, Multiple Sequence Alignment
- OMR, Omarigliptin
- OVP, Overlapping Property Features
- PCP, Physicochemical Properties
- PDB, Protein Data Bank
- PPIs, Protein-Protein Interactions
- PSM, Phenol-Soluble Modulin
- PTM, Post Translational Modification
- QS, Quorum Sensing
- QSCN, QS communication network
- QSHGM, Quorum Sensing of Human Gut Microbes
- QSI, QS Inhibitors
- QSIM, QS Interference Molecules
- QSP inhibitors
- QSP predictors
- QSP, QS Peptides
- QSPR, Quantitative Structure Property Relationship
- Quorum sensing peptides
- RAP, RNAIII-activating protein
- RF, Random Forest
- RIP, RNAIII-inhibiting peptide
- ROC, Receiver Operating Characteristic
- SAR, Structure-Activity Relationship
- SFS, Sequential Forward Search
- SIT, Sitagliptin
- SVM, Support Vector Machine
- TCS, Two-Component Sensory
- TRAP, Target of RAP
- TRG, Trelagliptin
- WHO, World Health Organization
- mRMR, minimum Redundancy and Maximum Relevance
Affiliation(s)
- Shreya Kumar
  - Centre for Integrative Omics Data Science, Yenepoya (Deemed to be University), Mangalore 575018, India
  - Centre for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Saptami Kanekar
  - Centre for Integrative Omics Data Science, Yenepoya (Deemed to be University), Mangalore 575018, India
- Rajesh Raju
  - Centre for Integrative Omics Data Science, Yenepoya (Deemed to be University), Mangalore 575018, India
  - Centre for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Richard K. Kandasamy
  - Centre of Molecular Inflammation Research (CEMIR), and Department of Clinical and Molecular Medicine (IKOM), Norwegian University of Science and Technology, 7491 Trondheim, Norway
  - Department of Laboratory Medicine and Pathology, Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, USA
2. Samant P, Ruysscher DD, Hoebers F, Canters R, Hall E, Nutting C, Maughan T, Van den Heuvel F. Machine learning for normal tissue complication probability prediction: Predictive power with versatility and easy implementation. Clin Transl Radiat Oncol 2023; 39:100595. [PMID: 36880063 PMCID: PMC9984444 DOI: 10.1016/j.ctro.2023.100595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Accepted: 02/05/2023] [Indexed: 02/11/2023] Open
Abstract
Background and purpose: A popular normal tissue complication probability (NTCP) model deployed to predict radiotherapy (RT) toxicity is the Lyman-Kutcher-Burman (LKB) model of tissue complication. Despite the LKB model's popularity, it can suffer from numerical instability and considers only the generalized mean dose (GMD) to an organ. Machine learning (ML) algorithms can potentially offer predictive power superior to that of the LKB model, with fewer drawbacks. Here we examine the numerical characteristics and predictive power of the LKB model and compare these with those of ML. Materials and methods: Both an LKB model and ML models were used to predict G2 xerostomia in patients following RT for head and neck cancer, using the dose volume histogram of the parotid glands as the input feature. Model speed, convergence characteristics and predictive power were evaluated on an independent training set. Results: We found that only global optimization algorithms could guarantee a convergent and predictive LKB model. At the same time, our results showed that ML models remained unconditionally convergent and predictive, while staying robust to gradient descent optimization. ML models outperform LKB in Brier score and accuracy but are comparable to LKB in ROC-AUC. Conclusion: We have demonstrated that ML models can quantify NTCP better than or as well as LKB models, even for a toxicity that the LKB model is particularly well suited to predict. ML models can offer this performance while offering fundamental advantages in model convergence, speed, and flexibility, and so could offer an alternative to the LKB model that could potentially be used in clinical RT planning decisions.
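For readers unfamiliar with the LKB model, the sketch below shows its commonly cited form: the dose volume histogram is reduced to a generalized mean dose, GMD = (Σ v_i · D_i^(1/n))^n, and NTCP is the standard normal CDF of (GMD − TD50) / (m · TD50). It also includes the Brier score mentioned in the results. The function names, the toy DVH and the parameter values (TD50 = 40, m = 0.4, n = 1) are illustrative assumptions, not values fitted or reported in the paper.

```python
import math


def generalized_mean_dose(doses, volumes, n):
    """Reduce a differential DVH to a single dose: (sum v_i * D_i**(1/n))**n."""
    total = sum(volumes)  # normalise so the fractional volumes sum to 1
    return sum((v / total) * d ** (1.0 / n) for d, v in zip(doses, volumes)) ** n


def lkb_ntcp(doses, volumes, td50, m, n):
    """LKB complication probability: standard normal CDF of the scaled GMD."""
    gmd = generalized_mean_dose(doses, volumes, n)
    t = (gmd - td50) / (m * td50)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


# Toy DVH (dose bins in Gy, fractional volumes); all numbers are hypothetical.
doses = [20.0, 40.0, 60.0]
volumes = [0.25, 0.5, 0.25]
ntcp = lkb_ntcp(doses, volumes, td50=40.0, m=0.4, n=1.0)
print(ntcp)
```

With n = 1 the GMD is simply the mean dose, so a mean dose equal to TD50 yields a 50% complication probability by construction. Only three parameters (TD50, m, n) are fitted, which is one way to read the abstract's remark that the model considers only the GMD.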
Key Words
- AB, AdaBoost (aka Adaptive Boosting)
- Clinical radiobiology
- DA, Dual Annealing
- DE, Differential Evolution
- DT, Decision Tree
- DVH, Dose Volume Histogram
- GB, Gradient Boost
- GD, Gradient Descent
- GMD, Generalized Mean Dose
- Head and Neck Cancer
- LKB, Lyman Kutcher Burman
- LR, Logistic Regression
- ML, Machine Learning
- Machine Learning
- NTCP, Normal Tissue Complication Probability
- Normal Tissue Complication Probability
- OAR, Organ(s) at Risk
- RT, Radiotherapy
- Radiotherapy
- Treatment Planning
- Xerostomia
Affiliation(s)
- Pratik Samant
  - Oxford University Hospitals NHS Foundation Trust, Radiotherapy Physics, Oxford, United Kingdom
  - University of Oxford, Department of Oncology, Oxford, United Kingdom
- Dirk de Ruysscher
  - Maastricht University Medical Centre, Department of Radiation Oncology (Maastro), Maastricht, The Netherlands
- Frank Hoebers
  - Maastricht University Medical Centre, Department of Radiation Oncology (Maastro), Maastricht, The Netherlands
- Richard Canters
  - Maastricht University Medical Centre, Department of Radiation Oncology (Maastro), Maastricht, The Netherlands
- Emma Hall
  - Institute of Cancer Research, Division of Clinical Studies, Sutton, United Kingdom
- Chris Nutting
  - Institute of Cancer Research, Division of Radiotherapy and Imaging, Sutton, United Kingdom
- Tim Maughan
  - University of Oxford, Department of Oncology, Oxford, United Kingdom
- Frank Van den Heuvel
  - University of Oxford, Department of Oncology, Oxford, United Kingdom
  - Zuidwest Radiotherapeutisch Instituut, Physics, Vlissingen (Flushing), The Netherlands
3. Denysyuk HV, Pinto RJ, Silva PM, Duarte RP, Marinho FA, Pimenta L, Gouveia AJ, Gonçalves NJ, Coelho PJ, Zdravevski E, Lameski P, Leithardt V, Garcia NM, Pires IM. Algorithms for automated diagnosis of cardiovascular diseases based on ECG data: A comprehensive systematic review. Heliyon 2023; 9:e13601. [PMID: 36852052 PMCID: PMC9958295 DOI: 10.1016/j.heliyon.2023.e13601] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/10/2022] [Revised: 01/31/2023] [Accepted: 02/05/2023] [Indexed: 02/12/2023] Open
Abstract
The prevalence of cardiovascular diseases is increasing around the world. However, technology is evolving, and these diseases can be monitored with low-cost sensors anywhere at any time. This subject is being researched, and different methods can automatically identify these diseases, helping patients and healthcare professionals with treatment. This paper presents a systematic review of disease identification, classification, and recognition with ECG sensors. The review focused on studies published between 2017 and 2022 in different scientific databases, including PubMed Central, Springer, Elsevier, Multidisciplinary Digital Publishing Institute (MDPI), IEEE Xplore, and Frontiers. It resulted in a quantitative and qualitative analysis of 103 scientific papers. The study demonstrated that different datasets are available online with data related to various diseases. Several ML/DL-based models were identified in the research, with Convolutional Neural Network and Support Vector Machine being the most applied algorithms. This review can allow us to identify the techniques that can be used in a system that promotes the patient's autonomy.
Key Words
- AI, Artificial Intelligence
- BNN, Binarized Neural Network
- CNN, Convolutional Neural Networks
- Cardiovascular diseases
- DL, Deep Learning
- DNN, Deep Neural Networks
- Diagnosis
- ECG sensors
- ECG, Electrocardiography
- GAN, Generative Adversarial Networks
- GMM, Gaussian Mixture Model
- GNB, Gaussian Naive Bayes
- GRU, Gated Recurrent Unit
- LASSO, Least Absolute Shrinkage and Selection Operator
- LDA, Linear Discriminant Analysis
- LR, Linear Regression
- LSTM, Long Short-Term Memory
- ML, Machine Learning
- MLP, Multilayer Perceptron
- MLR, Multiple Linear Regression
- NLP, Natural Language Processing
- POAF, Postoperative Atrial Fibrillation
- RF, Random Forest
- RNN, Recurrent Neural Network
- SHAP, SHapley Additive exPlanations
- SVM, Support Vector Machine
- Systematic review
- WHO, World Health Organization
- kNN, k-nearest neighbors
Affiliation(s)
- Rui João Pinto
  - Escola de Ciências e Tecnologia, University of Trás-os-Montes e Alto Douro, Quinta de Prados, 5001-801 Vila Real, Portugal
- Pedro Miguel Silva
  - Escola de Ciências e Tecnologia, University of Trás-os-Montes e Alto Douro, Quinta de Prados, 5001-801 Vila Real, Portugal
- Rui Pedro Duarte
  - Escola de Ciências e Tecnologia, University of Trás-os-Montes e Alto Douro, Quinta de Prados, 5001-801 Vila Real, Portugal
- Francisco Alexandre Marinho
  - Escola de Ciências e Tecnologia, University of Trás-os-Montes e Alto Douro, Quinta de Prados, 5001-801 Vila Real, Portugal
- Luís Pimenta
  - Escola de Ciências e Tecnologia, University of Trás-os-Montes e Alto Douro, Quinta de Prados, 5001-801 Vila Real, Portugal
- António Jorge Gouveia
  - Escola de Ciências e Tecnologia, University of Trás-os-Montes e Alto Douro, Quinta de Prados, 5001-801 Vila Real, Portugal
- Norberto Jorge Gonçalves
  - Escola de Ciências e Tecnologia, University of Trás-os-Montes e Alto Douro, Quinta de Prados, 5001-801 Vila Real, Portugal
- Paulo Jorge Coelho
  - Polytechnic of Leiria, Leiria, Portugal
  - Institute for Systems Engineering and Computers at Coimbra (INESC Coimbra), Coimbra, Portugal
- Eftim Zdravevski
  - Faculty of Computer Science and Engineering, University Ss Cyril and Methodius, 1000 Skopje, Macedonia
- Petre Lameski
  - Faculty of Computer Science and Engineering, University Ss Cyril and Methodius, 1000 Skopje, Macedonia
- Valderi Leithardt
  - VALORIZA, Research Center for Endogenous Resources Valorization, Instituto Politécnico de Portalegre, 7300-555 Portalegre, Portugal
  - COPELABS, Universidade Lusófona de Humanidades e Tecnologias, Lisboa, Portugal
- Nuno M. Garcia
  - Instituto de Telecomunicações, Universidade da Beira Interior, 6200-001 Covilhã, Portugal
- Ivan Miguel Pires
  - Instituto de Telecomunicações, Universidade da Beira Interior, 6200-001 Covilhã, Portugal
4. Abdulsalam Hamwi W, Almustafa MM. Development and integration of VGG and dense transfer-learning systems supported with diverse lung images for discovery of the Coronavirus identity. Inform Med Unlocked 2022; 32:101004. [PMID: 35822170 PMCID: PMC9263684 DOI: 10.1016/j.imu.2022.101004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/10/2022] [Revised: 06/24/2022] [Accepted: 06/25/2022] [Indexed: 12/18/2022] Open
Abstract
The contagious SARS-CoV-2 has had a tremendous impact on the life and health of many communities. It was first rampant in early 2019 and, so far, 539 million cases of COVID-19 have been reported worldwide. This is reminiscent of the 1918 influenza pandemic. However, we can detect infected cases of COVID-19 by analysing either X-rays or CT scans, which are presumably considered the least expensive methods. Given state-of-the-art convolutional neural networks (CNNs), which integrate image pre-processing techniques with fully connected layers, we can develop a sophisticated AI system based on various pre-trained models. Each pre-trained model involved in our study extracted specific features from different chest image datasets drawn from several verified sources, such as Mendeley, Kaggle, and GitHub. First, for the CXR datasets, we trained a CNN model from scratch. It comprises four layers, beginning with a Conv2D layer with 32 filters followed by a MaxPooling layer, and this pattern is then repeated. We used two techniques to avoid overfitting: early stopping and Dropout. The output was a single neuron followed by a sigmoid function to classify the two cases (0 or 1). In addition, we used the Adam optimizer because it gave better outcomes than the other optimizers. Finally, we reported our findings using a confusion matrix, a classification report (recall and precision), sensitivity and specificity; with this approach, we achieved a classification accuracy of 96%. Our three integrated pre-trained models (VGG16, DenseNet201, and DenseNet121) yielded a test accuracy of 98.81%. In addition, our merged models (VGG16, DenseNet201) were trained on CT images; this model achieved a test accuracy of 99.73% for binary classification in the (Normal/Covid-19) scenario.
Comparing our results with related studies shows that our proposed models were superior to previous CNN machine learning models in terms of various performance metrics. Our pre-trained model associated with the CT dataset achieved an F1 score of 100%, and the loss value was approximately 0.00268.
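The evaluation quantities named in this abstract (sensitivity, specificity, precision, recall, F1 score, accuracy) all derive from the four cells of a binary confusion matrix. The sketch below shows those definitions; the function name and the counts are hypothetical and are not the study's data, and it assumes every denominator is non-zero.

```python
def binary_classification_report(tp, fp, tn, fn):
    """Derive common metrics from confusion-matrix counts (assumes no zero denominators)."""
    sensitivity = tp / (tp + fn)  # also called recall: share of positives found
    specificity = tn / (tn + fp)  # share of negatives correctly rejected
    precision = tp / (tp + fp)    # share of positive calls that are correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for a Normal/COVID-19 classifier, for illustration only.
report = binary_classification_report(tp=90, fp=5, tn=95, fn=10)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, since accuracy alone can look high while one class is largely misclassified.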
Key Words
- AI, Artificial Intelligence
- ANNs, Artificial Neural Networks
- Artificial intelligence
- CNNs, Convolutional Neural Networks
- CT, Computed Tomography
- CXR&CT chest COVID-19 images
- integration of three pre-trained CNN models
- Fine-tuning
- Conv2D, 2D Convolutional Layer
- Covid-19, Coronavirus disease of 2019
- DL, Deep Learning
- Image processing
- ML, Machine Learning
- Performance evaluation
- RT-PCR, Reverse Transcription Polymerase Chain Reaction
- ReLU, Rectified Linear Unit
- SARS-CoV-2, Severe acute respiratory syndrome coronavirus 2
- X-ray, CXR, energetic high-frequency electromagnetic radiation
5. Lee ES, Durant TJ. Supervised machine learning in the mass spectrometry laboratory: A tutorial. J Mass Spectrom Adv Clin Lab 2022; 23:1-6. [PMID: 34984411 PMCID: PMC8692990 DOI: 10.1016/j.jmsacl.2021.12.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/14/2021] [Revised: 12/02/2021] [Accepted: 12/06/2021] [Indexed: 11/19/2022] Open
Abstract
As the demand for laboratory testing by mass spectrometry increases, so does the need for automated methods for data analysis. Clinical mass spectrometry (MS) data is particularly well-suited for machine learning (ML) methods, which deal nicely with structured and discrete data elements. The alignment of these two fields offers a promising synergy that can be used to optimize workflows, improve result quality, and enhance our understanding of high-dimensional datasets and their inherent relationship with disease. In recent years, there has been an increasing number of publications that examine the capabilities of ML-based software in the context of chromatography and MS. However, given the historically distant nature between the fields of clinical chemistry and computer science, there is an opportunity to improve technological literacy of ML-based software within the clinical laboratory scientist community. To this end, we present a basic overview of ML and a tutorial of an ML-based experiment using a previously published MS dataset. The purpose of this paper is to describe the fundamental principles of supervised ML, outline the steps that are classically involved in an ML-based experiment, and discuss the purpose of good ML practice in the context of a binary MS classification problem.
Key Words
- Amino acid
- Artificial intelligence
- CART, Classification and Regression Trees
- ML, Machine Learning
- MS, Mass Spectrometry
- Mass spectrometry
- NLL, Negative Log Loss
- PAA, Plasma Amino Acid
- PR, Precision-Recall
- PRAUC, Area Under the Precision-Recall Curve
- RL, Reinforcement Learning
- ROC, Receiver Operating Characteristic Curve
- SCF, Supplemental Code File
- Supervised machine learning
- XGBT, Extreme Gradient Boosted Trees
- Xgboost
Affiliation(s)
- Edward S. Lee
  - Department of Laboratory Medicine, at Yale School of Medicine, New Haven, CT, USA
  - Department of Laboratory Medicine, at Yale New Haven Hospital, New Haven, CT, USA
- Thomas J.S. Durant
  - Department of Laboratory Medicine, at Yale School of Medicine, New Haven, CT, USA
  - Department of Laboratory Medicine, at Yale New Haven Hospital, New Haven, CT, USA
  - Corresponding author at: Department of Laboratory Medicine, 55 Park Street PS345D, New Haven, CT 06511, USA.
6. Carracedo-Reboredo P, Liñares-Blanco J, Rodríguez-Fernández N, Cedrón F, Novoa FJ, Carballal A, Maojo V, Pazos A, Fernandez-Lozano C. A review on machine learning approaches and trends in drug discovery. Comput Struct Biotechnol J 2021; 19:4538-4558. [PMID: 34471498 PMCID: PMC8387781 DOI: 10.1016/j.csbj.2021.08.011] [Citation(s) in RCA: 95] [Impact Index Per Article: 31.7] [Received: 03/01/2021] [Revised: 08/06/2021] [Accepted: 08/06/2021] [Indexed: 12/30/2022] Open
Abstract
Drug discovery aims at finding new compounds with specific chemical properties for the treatment of diseases. In recent years, the approach used in this search has gained an important computer science component, with machine learning techniques skyrocketing thanks to their democratization. With the objectives set by the Precision Medicine initiative and the new challenges generated, it is necessary to establish robust, standard and reproducible computational methodologies to achieve those objectives. Currently, predictive models based on Machine Learning have gained great importance in the step prior to preclinical studies. This stage drastically reduces costs and research times in the discovery of new drugs. This review article focuses on how these new methodologies have been used in recent years of research. Analyzing the state of the art in this field will give us an idea of where cheminformatics will develop in the short term, the limitations it presents and the positive results it has achieved. This review will focus mainly on the methods used to model the molecular data, as well as the biological problems addressed and the Machine Learning algorithms used for drug discovery in recent years.
Key Words
- ADMET, Absorption, distribution, metabolism, elimination and toxicity
- ADR, Adverse Drug Reaction
- AI, Artificial Intelligence
- ANN, Artificial Neural Networks
- APFP, Atom Pairs 2d FingerPrint
- AUC, Area under the Curve
- BBB, Blood–Brain barrier
- CDK, Chemistry Development Kit
- CNN, Convolutional Neural Networks
- CNS, Central Nervous System
- CPI, Compound-protein interaction
- CV, Cross Validation
- Cheminformatics
- DL, Deep Learning
- DNA, Deoxyribonucleic acid
- Deep Learning
- Drug Discovery
- ECFP, Extended Connectivity Fingerprints
- FDA, Food and Drug Administration
- FNN, Fully Connected Neural Networks
- FP, Fingerprints
- FS, Feature Selection
- GCN, Graph Convolutional Networks
- GEO, Gene Expression Omnibus
- GNN, Graph Neural Networks
- GO, Gene Ontology
- KEGG, Kyoto Encyclopedia of Genes and Genomes
- MACCS, Molecular ACCess System
- MCC, Matthews correlation coefficient
- MD, Molecular Descriptors
- MKL, Multiple Kernel Learning
- ML, Machine Learning
- Machine Learning
- Molecular Descriptors
- NB, Naive Bayes
- OOB, Out of Bag
- PCA, Principal Component Analysis
- QSAR
- QSAR, Quantitative structure–activity relationship
- RF, Random Forest
- RNA, Ribonucleic Acid
- SMILES, simplified molecular-input line-entry system
- SVM, Support Vector Machines
- TCGA, The Cancer Genome Atlas
- WHO, World Health Organization
- t-SNE, t-Distributed Stochastic Neighbor Embedding
Affiliation(s)
- Paula Carracedo-Reboredo
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Jose Liñares-Blanco
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
  - CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Nereida Rodríguez-Fernández
  - CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
  - Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Francisco Cedrón
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Francisco J. Novoa
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Adrian Carballal
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
  - CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
  - Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Victor Maojo
  - Biomedical Informatics Group, Artificial Intelligence Department, Polytechnic University of Madrid, Calle de los Ciruelos, Boadilla del Monte, Madrid 28660, Spain
- Alejandro Pazos
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
  - CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
  - Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
- Carlos Fernandez-Lozano
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
  - CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
  - Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
7. Ayoobi N, Sharifrazi D, Alizadehsani R, Shoeibi A, Gorriz JM, Moosaei H, Khosravi A, Nahavandi S, Gholamzadeh Chofreh A, Goni FA, Klemeš JJ, Mosavi A. Time series forecasting of new cases and new deaths rate for COVID-19 using deep learning methods. Results Phys 2021; 27:104495. [PMID: 34221854 PMCID: PMC8233414 DOI: 10.1016/j.rinp.2021.104495] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 03/22/2021] [Revised: 06/19/2021] [Accepted: 06/22/2021] [Indexed: 05/17/2023]
Abstract
The first known case of Coronavirus disease 2019 (COVID-19) was identified in December 2019. It has spread worldwide, leading to an ongoing pandemic and imposing restrictions and costs on many countries. Predicting the number of new cases and deaths during this period can be a useful step in predicting the costs and facilities required in the future. The purpose of this study is to predict the rates of new cases and new deaths one, three and seven days ahead during the next 100 days. The motivation for predicting every n days (instead of just every day) is to investigate whether computational cost can be reduced while still achieving reasonable performance. Such a scenario may be encountered in real-time forecasting of time series. Six different deep learning methods are examined on data obtained from the WHO website. Three methods are LSTM, Convolutional LSTM, and GRU. The bidirectional extension is then considered for each method to forecast the rate of new cases and new deaths in Australia and Iran. This study is novel as it carries out a comprehensive evaluation of the aforementioned three deep learning methods and their bidirectional extensions for prediction on COVID-19 new cases and new death rate time series. To the best of our knowledge, this is the first time that Bi-GRU and Bi-Conv-LSTM models have been used for prediction on COVID-19 new cases and new deaths time series. The evaluation of the methods is presented in the form of graphs and the Friedman statistical test. The results show that the bidirectional models have lower errors than the other models. Several error evaluation metrics are presented to compare all models, and finally, the superiority of bidirectional methods is determined. This research could be useful for organisations working against COVID-19 and determining their long-term plans.
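Forecasting one, three or seven days ahead is usually set up by slicing the series into (input window, target) pairs, where the target lies a fixed horizon beyond the end of the window. The sketch below illustrates that framing together with three of the error metrics listed in the key words (MAE, RMSE, MAPE). The window length, the toy series and all function names are assumptions for illustration; the paper's actual preprocessing is not described in the abstract.

```python
import math


def make_windows(series, window, horizon):
    """Pair each run of `window` values with the value `horizon` steps after it."""
    pairs = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((inputs, target))
    return pairs


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Toy daily series (hypothetical values), 3-day input window, 3-day-ahead target.
series = list(range(1, 11))
pairs = make_windows(series, window=3, horizon=3)
print(pairs[0], len(pairs))
print(mae([1, 2, 3], [2, 2, 5]), rmse([1, 2, 3], [2, 2, 5]), mape([2, 4], [1, 5]))
```

A longer horizon leaves fewer usable pairs from the same series, which is one practical cost of forecasting further ahead.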
Key Words
- ANFIS, Adaptive Network-based Fuzzy Inference System
- ANN, Artificial Neural Network
- AU, Australia
- Bi-Conv-LSTM, Bidirectional Convolutional Long Short Term Memory
- Bi-GRU, Bidirectional Gated Recurrent Unit
- Bi-LSTM, Bidirectional Long Short-Term Memory
- Bidirectional
- COVID-19 Prediction
- COVID-19, Coronavirus Disease 2019
- Conv-LSTM, Convolutional Long Short Term Memory
- Convolutional Long Short Term Memory (Conv-LSTM)
- DL, Deep Learning
- DLSTM, Delayed Long Short-Term Memory
- Deep learning
- EMRO, Eastern Mediterranean Regional Office
- ES, Exponential Smoothing
- EV, Explained Variance
- GRU, Gated Recurrent Unit
- Gated Recurrent Unit (GRU)
- IR, Iran
- LR, Linear Regression
- LSTM, Long Short-Term Memory
- Lasso, Least Absolute Shrinkage and Selection Operator
- Long Short Term Memory (LSTM)
- MAE, Mean Absolute Error
- MAPE, Mean Absolute Percentage Error
- MERS, Middle East Respiratory Syndrome
- ML, Machine Learning
- MLP-ICA, Multi-layered Perceptron-Imperialist Competitive Algorithm
- MSE, Mean Square Error
- MSLE, Mean Squared Log Error
- Machine learning
- New Cases of COVID-19
- New Deaths of COVID-19
- PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses
- RMSE, Root Mean Square Error
- RMSLE, Root Mean Squared Log Error
- RNN, Recurrent Neural Network
- ReLU, Rectified Linear Unit
- SARS, Severe Acute Respiratory Syndrome
- SARS-COV, SARS coronavirus
- SARS-COV-2, Severe Acute Respiratory Syndrome Coronavirus 2
- SVM, Support Vector Machine
- VAE, Variational Auto Encoder
- WHO, World Health Organization
- WPRO, Western Pacific Regional Office
Affiliation(s)
- Nooshin Ayoobi
- Department of Mathematics, Savitribai Phule Pune University, Pune 411007, India
| | - Danial Sharifrazi
- Department of Computer Engineering, School of Technical and Engineering, Shiraz Branch, Islamic Azad University, Shiraz, Iran
| | - Roohallah Alizadehsani
- Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Waurn Ponds, VIC 3217, Australia
| | - Afshin Shoeibi
- Computer Engineering Department, Ferdowsi University of Mashhad, Mashhad, Iran
- Faculty of Electrical and Computer Engineering, Biomedical Data Acquisition Lab, K. N. Toosi University of Technology, Tehran, Iran
- Juan M Gorriz
- Department of Signal Theory, Networking and Communications, Universidad de Granada, Spain
- Hossein Moosaei
- Department of Mathematics, Faculty of Science, University of Bojnord, Iran
- Abbas Khosravi
- Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Waurn Ponds, VIC 3217, Australia
- Saeid Nahavandi
- Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Waurn Ponds, VIC 3217, Australia
- Abdoulmohammad Gholamzadeh Chofreh
- Sustainable Process Integration Laboratory - SPIL, NETME Centre, Faculty of Mechanical Engineering, Brno University of Technology - VUT Brno, Technická 2896/2, 616 69 Brno, Czech Republic
- Feybi Ariani Goni
- Department of Management, Faculty of Business and Management, Brno University of Technology - VUT Brno, Kolejní 2906/4, 612 00 Brno, Czech Republic
- Jiří Jaromír Klemeš
- Sustainable Process Integration Laboratory - SPIL, NETME Centre, Faculty of Mechanical Engineering, Brno University of Technology - VUT Brno, Technická 2896/2, 616 69 Brno, Czech Republic
- Amir Mosavi
- John von Neumann Faculty of Informatics, Obuda University, 1034 Budapest, Hungary
- School of Economics and Business, Norwegian University of Life Sciences, 1430 Ås, Norway
|
8
|
Adamidi ES, Mitsis K, Nikita KS. Artificial intelligence in clinical care amidst COVID-19 pandemic: A systematic review. Comput Struct Biotechnol J 2021; 19:2833-2850. [PMID: 34025952 PMCID: PMC8123783 DOI: 10.1016/j.csbj.2021.05.010] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 02/17/2021] [Revised: 05/01/2021] [Accepted: 05/02/2021] [Indexed: 12/23/2022] Open
Abstract
The worldwide health crisis caused by the SARS-CoV-2 virus has resulted in >3 million deaths so far. Improving early screening, diagnosis and prognosis of the disease are critical steps in assisting healthcare professionals to save lives during this pandemic. Since WHO declared the COVID-19 outbreak a pandemic, several studies have been conducted using Artificial Intelligence techniques to optimize these steps in clinical settings in terms of quality, accuracy and, most importantly, time. The objective of this study is to conduct a systematic literature review on published and preprint reports of Artificial Intelligence models developed and validated for screening, diagnosis and prognosis of the coronavirus disease 2019. We included 101 studies, published from January 1st, 2020 to December 30th, 2020, that developed AI prediction models which can be applied in the clinical setting. We identified in total 14 models for screening, 38 diagnostic models for detecting COVID-19 and 50 prognostic models for predicting ICU need, ventilator need, mortality risk, severity assessment or length of hospital stay. Moreover, 43 studies were based on medical imaging and 58 studies on the use of clinical parameters, laboratory results or demographic features. Several heterogeneous predictors derived from multimodal data were identified. Analysis of these multimodal data, captured from various sources, in terms of prominence for each category of the included studies, was performed. Finally, Risk of Bias (RoB) analysis was also conducted to examine the applicability of the included studies in the clinical setting and assist healthcare providers, guideline developers, and policymakers.
Key Words
- ABG, Arterial Blood Gas
- ADA, Adenosine Deaminase
- AI, Artificial Intelligence
- ANN, Artificial Neural Networks
- APTT, Activated Partial Thromboplastin Time
- ARMED, Attribute Reduction with Multi-objective Decomposition Ensemble optimizer
- AUC, Area Under the Curve
- Acc, Accuracy
- Adaboost, Adaptive Boosting
- Apol AI, Apolipoprotein AI
- Apol B, Apolipoprotein B
- Artificial intelligence
- BNB, Bernoulli Naïve Bayes
- BUN, Blood Urea Nitrogen
- CI, Confidence Interval
- CK-MB, Creatine Kinase isoenzyme
- CNN, Convolutional Neural Networks
- COVID-19
- CPP, COVID-19 Positive Patients
- CRP, C-Reactive Protein
- CRT, Classification and Regression Decision Tree
- CoxPH, Cox Proportional Hazards
- DCNN, Deep Convolutional Neural Networks
- DL, Deep Learning
- DLC, Density Lipoprotein Cholesterol
- DNN, Deep Neural Networks
- DT, Decision Tree
- Diagnosis
- ED, Emergency Department
- ESR, Erythrocyte Sedimentation Rate
- ET, Extra Trees
- FCV, Fold Cross Validation
- FL, Federated Learning
- FiO2, Fraction of Inspiration O2
- GBDT, Gradient Boost Decision Tree
- GBM light, Gradient Boosting Machine light
- GDCNN, Genetic Deep Learning Convolutional Neural Network
- GFR, Glomerular Filtration Rate
- GFS, Gradient boosted feature selection
- GGT, Glutamyl Transpeptidase
- GNB, Gaussian Naïve Bayes
- HDLC, High Density Lipoprotein Cholesterol
- INR, International Normalized Ratio
- Inception Resnet, Inception Residual Neural Network
- L1LR, L1 Regularized Logistic Regression
- LASSO, Least Absolute Shrinkage and Selection Operator
- LDA, Linear Discriminant Analysis
- LDH, Lactate Dehydrogenase
- LDLC, Low Density Lipoprotein Cholesterol
- LR, Logistic Regression
- LSTM, Long-Short Term Memory
- MCHC, Mean Corpuscular Hemoglobin Concentration
- MCV, Mean corpuscular volume
- ML, Machine Learning
- MLP, MultiLayer Perceptron
- MPV, Mean Platelet Volume
- MRMR, Maximum Relevance Minimum Redundancy
- Multimodal data
- NB, Naïve Bayes
- NLP, Natural Language Processing
- NPV, Negative Predictive Values
- Nadam optimizer, Nesterov Accelerated Adaptive Moment optimizer
- OB, Occult Blood test
- PCT, Thrombocytocrit
- PPV, Positive Predictive Values
- PWD, Platelet Distribution Width
- PaO2, Arterial Oxygen Tension
- Paco2, Arterial Carbondioxide Tension
- Prognosis
- RBC, Red Blood Cell
- RBF, Radial Basis Function
- RBP, Retinol Binding Protein
- RDW, Red blood cell Distribution Width
- RF, Random Forest
- RFE, Recursive Feature Elimination
- RSV, Respiratory Syncytial Virus
- SEN, Sensitivity
- SG, Specific Gravity
- SMOTE, Synthetic Minority Oversampling Technique
- SPE, Specificity
- SRLSR, Sparse Rescaled Linear Square Regression
- SVM, Support Vector Machine
- SaO2, Arterial Oxygen saturation
- Screening
- TBA, Total Bile Acid
- TTS, Training Test Split
- WBC, White Blood Cell count
- XGB, eXtreme Gradient Boost
- k-NN, K-Nearest Neighbor
Affiliation(s)
- Eleni S. Adamidi
- Biomedical Simulations and Imaging Lab, School of Electrical and Computer Engineering, National Technical University of Athens, Greece
- Konstantinos Mitsis
- Biomedical Simulations and Imaging Lab, School of Electrical and Computer Engineering, National Technical University of Athens, Greece
- Konstantina S. Nikita
- Biomedical Simulations and Imaging Lab, School of Electrical and Computer Engineering, National Technical University of Athens, Greece
|
9
|
Ghannam RB, Techtmann SM. Machine learning applications in microbial ecology, human microbiome studies, and environmental monitoring. Comput Struct Biotechnol J 2021; 19:1092-1107. [PMID: 33680353 PMCID: PMC7892807 DOI: 10.1016/j.csbj.2021.01.028] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Received: 10/02/2020] [Revised: 01/16/2021] [Accepted: 01/18/2021] [Indexed: 01/04/2023] Open
Abstract
Advances in nucleic acid sequencing technology have expanded our ability to profile microbial diversity. These large datasets of taxonomic and functional diversity are key to better understanding microbial ecology. Machine learning has proven to be a useful approach for analyzing microbial community data and making predictions about outcomes including human and environmental health. Machine learning applied to microbial community profiles has been used to predict disease states in human health, environmental quality and presence of contamination in the environment, and as trace evidence in forensics. Machine learning has appeal as a powerful tool that can provide deep insights into microbial communities and identify patterns in microbial community data. However, machine learning models are often used as black boxes to predict a specific outcome, with little understanding of how the models arrived at their predictions. Complex machine learning algorithms often favor higher accuracy and performance at the expense of interpretability. In order to leverage machine learning in more translational research related to the microbiome and strengthen our ability to extract meaningful biological information, it is important for models to be interpretable. Here we review current trends in machine learning applications in microbial ecology as well as some of the important challenges and opportunities for broader application of machine learning to understanding microbial communities.
Key Words
- 16S rRNA
- ANN, Artificial Neural Networks
- ASV, Amplicon Sequence Variant
- AUC, Area Under the Curve
- Forensics
- GB, Gradient Boosting
- ML, Machine Learning
- Machine learning
- Marker genes
- Metagenomics
- PCoA, Principal Coordinate Analysis
- RF, Random Forests
- ROC, Receiver Operating Characteristic
- SML, Supervised Machine Learning
- SVM, Support Vector Machines
- USML, Unsupervised Machine Learning
- tSNE, t-distributed Stochastic Neighbor Embedding
Affiliation(s)
- Ryan B. Ghannam
- Department of Biological Sciences, Michigan Technological University, Houghton MI, United States
- Stephen M. Techtmann
- Department of Biological Sciences, Michigan Technological University, Houghton MI, United States
|
10
|
Magazzino C, Mele M, Schneider N. The relationship between air pollution and COVID-19-related deaths: An application to three French cities. Appl Energy 2020; 279:115835. [PMID: 32952266 PMCID: PMC7486865 DOI: 10.1016/j.apenergy.2020.115835] [Citation(s) in RCA: 106] [Impact Index Per Article: 26.5] [Received: 05/19/2020] [Revised: 08/22/2020] [Accepted: 08/27/2020] [Indexed: 05/18/2023]
Abstract
Being heavily dependent on oil products (mainly gasoline and diesel), the French transport sector is the main emitter of Particulate Matter (PM), whose critical levels induce harmful health effects for urban inhabitants. We selected three major French cities (Paris, Lyon, and Marseille) to investigate the relationship between the Coronavirus Disease 19 (COVID-19) outbreak and air pollution. Using Artificial Neural Networks (ANNs) experiments, we have determined the concentration of PM2.5 and PM10 linked to COVID-19-related deaths. Our focus is on the potential effects of PM in spreading the epidemic. The underlying hypothesis is that a pre-determined particulate concentration can foster COVID-19 and make the respiratory system more susceptible to this infection. The empirical strategy used an innovative Machine Learning (ML) methodology. In particular, through the so-called cutting technique in ANNs, we found new threshold levels of PM2.5 and PM10 connected to COVID-19: 17.4 µg/m3 (PM2.5) and 29.6 µg/m3 (PM10) for Paris; 15.6 µg/m3 (PM2.5) and 20.6 µg/m3 (PM10) for Lyon; 14.3 µg/m3 (PM2.5) and 22.04 µg/m3 (PM10) for Marseille. Interestingly, all the threshold values identified by the ANNs are higher than the limits imposed by the European Parliament. Finally, a Causal Direction from Dependency (D2C) algorithm is applied to check the consistency of our findings.
Key Words
- ANNs, Artificial Neural Networks
- Air pollution
- Artificial neural networks
- CH4, Methane
- CMAQ, Community Multiscale Air Quality
- CO, Carbon Monoxide
- COVID-19
- COVID-19, Coronavirus Disease 19
- D2C, Causal Direction from Dependency
- GAM, Generalized Additive Model
- GHG, Greenhouse Gas
- ML, Machine Learning
- Machine learning
- NO2, Nitrogen Dioxide
- NOx, Nitrogen Oxides
- O3, Ozone
- PM10, Particulate Matter with an aerodynamic diameter < 10.0 µm
- PM2.5, Particulate Matter with an aerodynamic diameter < 2.5 µm
- Particulate matter
- SO2, Sulfur Dioxide
- SO3, Sulphur Trioxide
- SOx, Sulphur Oxides
- VOC, Volatile Organic Compounds
|
11
|
Pomyen Y, Wanichthanarak K, Poungsombat P, Fahrmann J, Grapov D, Khoomrung S. Deep metabolome: Applications of deep learning in metabolomics. Comput Struct Biotechnol J 2020; 18:2818-2825. [PMID: 33133423 PMCID: PMC7575644 DOI: 10.1016/j.csbj.2020.09.033] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Received: 07/24/2020] [Revised: 09/21/2020] [Accepted: 09/21/2020] [Indexed: 01/11/2023] Open
Abstract
In the past few years, deep learning has been successfully applied to various omics data. However, applications of deep learning in metabolomics are still relatively few compared to other omics fields. Currently, data pre-processing using convolutional neural network architecture appears to benefit the most from deep learning. Compound/structure identification and quantification using artificial neural network/deep learning performed relatively better than traditional machine learning techniques, whereas only marginally better results are observed in biological interpretations. Before deep learning can be effectively applied to metabolomics, several challenges should be addressed, including metabolome-specific deep learning architectures, dimensionality problems, and model evaluation regimes.
Key Words
- AI, Artificial Intelligence
- ANN, Artificial Neural Network
- AUC, Area Under the receiver-operating characteristic Curve
- Artificial neural network
- CCS value, Collision Cross Section value
- CFM-EI, Competitive Fragmentation Modeling-Electron Ionization
- CNN, Convolutional Neural Network
- DL, Deep Learning
- DNN, Deep Neural Network
- Deep learning
- ECFP, Extended Circular Fingerprint
- ER, Estrogen Receptor
- FID, Free Induction Decay
- FP score, Fingerprint correlation score
- FTIR, Fourier Transform Infrared
- GC–MS, Gas Chromatography-Mass Spectrometry
- HDLSS data, High Dimensional Low Sample Size data
- IST, Iterative Soft Thresholding
- LC-MS, Liquid Chromatography-Mass Spectrometry
- LSTM, Long Short-Term Memory
- ML, Machine Learning
- MLP, Multi-layered Perceptron
- MS, Mass Spectrometry
- Mass spectrometry
- Metabolomics
- NEIMS, Neural Electron-Ionization Mass Spectrometry
- NMR
- NMR, Nuclear Magnetic Resonance
- NUS, Non-Uniformly Sampling
- PARAFAC2, Parallel Factor Analysis 2
- RF, Random Forest
- RNN, Recurrent Neural Network
- ReLU, Rectified Linear Unit
- SMARTS, SMILES arbitrary target specification
- SMILE, Sparse Multidimensional Iterative Lineshape-enhanced
- SMILES, Simplified Molecular-Input Line-Entry System
- SRA, Sequence Read Archive
- VAE, Variational Autoencoder
- istHMS, Implementation of IST at Harvard Medical School
- m/z, mass/charge ratio
Affiliation(s)
- Yotsawat Pomyen
- Translational Research Unit, Chulabhorn Research Institute, Bangkok, Thailand
- Kwanjeera Wanichthanarak
- Metabolomics and Systems Biology, Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Patcha Poungsombat
- Metabolomics and Systems Biology, Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Center for Innovation in Chemistry (PERCH-CIC), Faculty of Science, Mahidol University, Rama 6 Road, Bangkok 10400, Thailand
- Johannes Fahrmann
- Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, TX 77030, USA
- Dmitry Grapov
- CDS- Creative Data Solutions LLC, https://creative-data.solutions, USA
- Sakda Khoomrung
- Metabolomics and Systems Biology, Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Center for Innovation in Chemistry (PERCH-CIC), Faculty of Science, Mahidol University, Rama 6 Road, Bangkok 10400, Thailand
|
12
|
Dlamini Z, Francies FZ, Hull R, Marima R. Artificial intelligence (AI) and big data in cancer and precision oncology. Comput Struct Biotechnol J 2020; 18:2300-2311. [PMID: 32994889 PMCID: PMC7490765 DOI: 10.1016/j.csbj.2020.08.019] [Citation(s) in RCA: 77] [Impact Index Per Article: 19.3] [Received: 07/03/2020] [Revised: 08/21/2020] [Accepted: 08/21/2020] [Indexed: 02/07/2023] Open
Abstract
Artificial intelligence (AI) and machine learning have significantly influenced many facets of the healthcare sector. Advancement in technology has paved the way for analysis of big datasets in a cost- and time-effective manner. Clinical oncology and research are reaping the benefits of AI. The burden of cancer is a global phenomenon. Efforts to reduce mortality rates require early diagnosis for effective therapeutic interventions. However, metastatic and recurrent cancers evolve and acquire drug resistance. It is imperative to detect novel biomarkers that induce drug resistance and identify therapeutic targets to enhance treatment regimes. The introduction of next generation sequencing (NGS) platforms addresses these demands and has revolutionised the future of precision oncology. NGS offers several clinical applications that are important for risk prediction, early detection of disease, diagnosis by sequencing and medical imaging, accurate prognosis, biomarker identification and identification of therapeutic targets for novel drug discovery. NGS generates large datasets that demand specialised bioinformatics resources to analyse the data that is relevant and clinically significant. Through these applications of AI, cancer diagnostics and prognostic prediction are enhanced with NGS and medical imaging that delivers high resolution images. Regardless of the improvements in technology, AI has some challenges and limitations, and the clinical application of NGS remains to be validated. By continuing to enhance the progression of innovation and technology, the future of AI and precision oncology shows great promise.
Affiliation(s)
- Zodwa Dlamini
- SAMRC/UP Precision Prevention & Novel Drug Targets for HIV-Associated Cancers (PPNDTHAC) Extramural Unit, Pan African Cancer Research Institute (PACRI), University of Pretoria, Faculty of Health Sciences, Hatfield 0028, South Africa
- Flavia Zita Francies
- SAMRC/UP Precision Prevention & Novel Drug Targets for HIV-Associated Cancers (PPNDTHAC) Extramural Unit, Pan African Cancer Research Institute (PACRI), University of Pretoria, Faculty of Health Sciences, Hatfield 0028, South Africa
- Rodney Hull
- SAMRC/UP Precision Prevention & Novel Drug Targets for HIV-Associated Cancers (PPNDTHAC) Extramural Unit, Pan African Cancer Research Institute (PACRI), University of Pretoria, Faculty of Health Sciences, Hatfield 0028, South Africa
- Rahaba Marima
- SAMRC/UP Precision Prevention & Novel Drug Targets for HIV-Associated Cancers (PPNDTHAC) Extramural Unit, Pan African Cancer Research Institute (PACRI), University of Pretoria, Faculty of Health Sciences, Hatfield 0028, South Africa
|
13
|
Ferroni P, Zanzotto FM, Scarpato N, Spila A, Fofi L, Egeo G, Rullo A, Palmirotta R, Barbanti P, Guadagni F. Machine learning approach to predict medication overuse in migraine patients. Comput Struct Biotechnol J 2020; 18:1487-1496. [PMID: 32637046 PMCID: PMC7327028 DOI: 10.1016/j.csbj.2020.06.006] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 10/28/2019] [Revised: 05/19/2020] [Accepted: 06/05/2020] [Indexed: 11/23/2022] Open
Abstract
Machine learning (ML) is largely used to develop automatic predictors in migraine classification but automatic predictors for medication overuse (MO) in migraine are still in their infancy. Thus, to understand the benefits of ML in MO prediction, we explored an automated predictor to estimate MO risk in migraine. To achieve this objective, a study was designed to analyze the performance of a customized ML-based decision support system that combines support vector machines and Random Optimization (RO-MO). We used RO-MO to extract prognostic information from demographic, clinical and biochemical data. Using a dataset of 777 consecutive migraine patients we derived a set of predictors with discriminatory power for MO higher than that observed for baseline SVM. The best four were incorporated into the final RO-MO decision support system and risk evaluation on a five-level stratification was performed. ROC analysis resulted in a c-statistic of 0.83 with a sensitivity and specificity of 0.69 and 0.87, respectively, and an accuracy of 0.87 when MO was predicted by at least three RO-MO models. Logistic regression analysis confirmed that the derived RO-MO system could effectively predict MO with ORs of 5.7 and 21.0 for patients classified as probably (3 predictors positive), or definitely at risk of MO (4 predictors positive), respectively. In conclusion, a combination of ML and RO - taking into consideration clinical/biochemical features, drug exposure and lifestyle - might represent a valuable approach to MO prediction in migraine and holds the potential for improving model precision through weighting the relative importance of attributes.
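The stratification rule described in this abstract can be illustrated with a minimal sketch. It assumes four binary predictor outputs per patient and maps the number of positive predictors to a label. Only the labels for 3 and 4 positives ("probably" and "definitely at risk") come from the abstract; the function names, the fallback wording for 0 to 2 positives, and the confusion-matrix counts are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def count_positive(predictions: Sequence[bool]) -> int:
    """Number of predictors that flag the patient as at risk of MO."""
    return sum(1 for p in predictions if p)


def risk_label(predictions: Sequence[bool]) -> str:
    """Map four binary predictor outputs to a risk label.

    Only the labels for 3 and 4 positives follow the abstract; the
    fallback wording for 0-2 positives is a hypothetical placeholder.
    """
    if len(predictions) != 4:
        raise ValueError("expected exactly four predictor outputs")
    positives = count_positive(predictions)
    if positives == 4:
        return "definitely at risk"
    if positives == 3:
        return "probably at risk"
    return f"{positives} of 4 predictors positive"


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    print(risk_label([True, True, False, True]))
    # Illustrative counts only, not data from the study.
    print(sensitivity_specificity(tp=8, fn=2, tn=9, fp=1))
```

Counting positives keeps the rule transparent; a weighted vote would be a different design that the abstract only hints at as future potential.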
Key Words
- AI, Artificial Intelligence
- AUC, Area Under the Curve
- Artificial intelligence
- BMI, body mass index
- CI, Confidence Interval
- DBH 19-bp I/D polymorphism, Dopamine-Beta-Hydroxylase 19 bp insertion/deletion polymorphism
- DSS, Decision Support System
- Decision support systems
- ICT, Information and Communications Technology
- KELP, Kernel-based Learning Platform
- LRs, likelihood ratios
- MKL, Multiple Kernel Learning
- ML, Machine Learning
- MO, Medication Overuse
- Machine learning
- Medication overuse
- Migraine
- NSAID, nonsteroidal anti-inflammatory drugs
- PVI, Predictive Value Imputation
- RO, Random Optimization
- ROC, Receiver operating characteristic
- SE, Standard Error
- SVM, Support Vector Machine
Affiliation(s)
- Patrizia Ferroni
- BioBIM (InterInstitutional Multidisciplinary Biobank), IRCCS San Raffaele Pisana, Via di Val Cannuta 247, 00166 Rome, Italy
- Dept. of Human Sciences & Quality of Life Promotion, San Raffaele Roma Open University, Via di Val Cannuta 247, 00166 Rome, Italy
- Fabio M. Zanzotto
- Department of Enterprise Engineering, University of Rome “Tor Vergata”, Viale Oxford 81, 00133 Rome, Italy
- Noemi Scarpato
- Dept. of Human Sciences & Quality of Life Promotion, San Raffaele Roma Open University, Via di Val Cannuta 247, 00166 Rome, Italy
- Antonella Spila
- BioBIM (InterInstitutional Multidisciplinary Biobank), IRCCS San Raffaele Pisana, Via di Val Cannuta 247, 00166 Rome, Italy
- Luisa Fofi
- Headache and Pain Unit, Dept. of Neurological, Motor and Sensorial Sciences, IRCCS San Raffaele Pisana, Via di Val Cannuta 247, 00166 Rome, Italy
- Gabriella Egeo
- Headache and Pain Unit, Dept. of Neurological, Motor and Sensorial Sciences, IRCCS San Raffaele Pisana, Via di Val Cannuta 247, 00166 Rome, Italy
- Alessandro Rullo
- Neatec S.p.A., Via Campi Flegrei, 34, 80078 Pozzuoli, Naples, Italy
- Raffaele Palmirotta
- Department of Biomedical Sciences & Human Oncology, University of Bari ‘Aldo Moro’, Bari, Italy
- Piero Barbanti
- Dept. of Human Sciences & Quality of Life Promotion, San Raffaele Roma Open University, Via di Val Cannuta 247, 00166 Rome, Italy
- Headache and Pain Unit, Dept. of Neurological, Motor and Sensorial Sciences, IRCCS San Raffaele Pisana, Via di Val Cannuta 247, 00166 Rome, Italy
- Fiorella Guadagni
- BioBIM (InterInstitutional Multidisciplinary Biobank), IRCCS San Raffaele Pisana, Via di Val Cannuta 247, 00166 Rome, Italy
- Dept. of Human Sciences & Quality of Life Promotion, San Raffaele Roma Open University, Via di Val Cannuta 247, 00166 Rome, Italy
|
14
|
Rondina JM, Ferreira LK, de Souza Duran FL, Kubo R, Ono CR, Leite CC, Smid J, Nitrini R, Buchpiguel CA, Busatto GF. Selecting the most relevant brain regions to discriminate Alzheimer's disease patients from healthy controls using multiple kernel learning: A comparison across functional and structural imaging modalities and atlases. Neuroimage Clin 2017; 17:628-641. [PMID: 29234599 PMCID: PMC5716956 DOI: 10.1016/j.nicl.2017.10.026] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 12/31/2016] [Revised: 10/12/2017] [Accepted: 10/24/2017] [Indexed: 12/11/2022]
Abstract
BACKGROUND Machine learning techniques such as support vector machine (SVM) have been applied recently in order to accurately classify individuals with neuropsychiatric disorders such as Alzheimer's disease (AD) based on neuroimaging data. However, the multivariate nature of the SVM approach often precludes the identification of the brain regions that contribute most to classification accuracy. Multiple kernel learning (MKL) is a sparse machine learning method that allows the identification of the most relevant sources for the classification. By parcellating the brain into regions of interest (ROI) it is possible to use each ROI as a source to MKL (ROI-MKL). METHODS We applied MKL to multimodal neuroimaging data in order to: 1) compare the diagnostic performance of ROI-MKL and whole-brain SVM in discriminating patients with AD from demographically matched healthy controls and 2) identify the most relevant brain regions to the classification. We used two atlases (AAL and Brodmann's) to parcellate the brain into ROIs and applied ROI-MKL to structural (T1) MRI, 18F-FDG-PET and regional cerebral blood flow SPECT (rCBF-SPECT) data acquired from the same subjects (20 patients with early AD and 18 controls). In ROI-MKL, each ROI received a weight (ROI-weight) that indicated the region's relevance to the classification. For each ROI, we also calculated whether there was a predominance of voxels indicating decreased or increased regional activity (for 18F-FDG-PET and rCBF-SPECT) or volume (for T1-MRI) in AD patients. RESULTS Compared to whole-brain SVM, the ROI-MKL approach resulted in better accuracies (with either atlas) for classification using 18F-FDG-PET (92.5% accuracy for ROI-MKL versus 84% for whole-brain), but not when using rCBF-SPECT or T1-MRI. Although several cortical and subcortical regions contributed to discrimination, high ROI-weights and predominance of hypometabolism and atrophy were identified especially in medial parietal and temporo-limbic cortical regions. Also, the weight of discrimination due to a pattern of increased voxel-weight values in AD individuals was surprisingly high (ranging from approximately 20% to 40% depending on the imaging modality), located mainly in primary sensorimotor and visual cortices and subcortical nuclei. CONCLUSION The MKL-ROI approach highlights the high discriminative weight of a subset of brain regions of known relevance to AD, the selection of which contributes to increased classification accuracy when applied to 18F-FDG-PET data. Moreover, the MKL-ROI approach demonstrates that brain regions typically spared in mild stages of AD also contribute substantially in the individual discrimination of AD patients from controls.
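The idea of giving each ROI its own weight can be sketched as a weighted sum of per-ROI kernels. The example below is an illustration under stated assumptions, not the authors' procedure: it uses a linear kernel per ROI, and the weights are fixed by hand, whereas MKL learns them jointly with the classifier. All function names and the toy feature values are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def linear_kernel(features: Sequence[Sequence[float]]) -> Matrix:
    """Gram matrix of dot products between subjects for one ROI."""
    return [[sum(a * b for a, b in zip(x, y)) for y in features] for x in features]


def combine_kernels(kernels: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Weighted sum of per-ROI kernels, with weights normalised to sum to 1.

    In real MKL the weights are learned; here they are supplied by the caller.
    """
    if len(kernels) != len(weights) or not kernels:
        raise ValueError("need one weight per kernel")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    total = sum(weights)
    norm = [w / total for w in weights]
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(norm, kernels)) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Two subjects, two ROIs with toy voxel features (hypothetical values).
    roi_a = [[1.0, 0.0], [0.0, 1.0]]
    roi_b = [[2.0], [2.0]]
    kernels = [linear_kernel(roi_a), linear_kernel(roi_b)]
    # A zero weight removes an ROI from the combined kernel entirely,
    # which is how a sparse weighting selects regions.
    print(combine_kernels(kernels, [1.0, 0.0]))
    print(combine_kernels(kernels, [0.5, 0.5]))
```

The combined matrix could then be passed to any kernel classifier that accepts a precomputed kernel.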
Key Words
- 18F-FDG-PET, 18F-Fluorodeoxyglucose-Positron Emission Tomography
- AAL, Automated Anatomical Labeling (atlas)
- AD, Alzheimer's Disease
- Alzheimer's Disease
- BA, Brodmann's Area
- Brain atlas
- GM, Gray Matter
- MKL, Multiple Kernel Learning
- MKL-ROI, MKL based on regions of interest
- ML, Machine Learning
- MRI
- Multiple kernel learning
- NF, number of features
- NSR, Number of Selected Regions
- PET
- PVE, Partial Volume Effects
- ROI, Region of Interest
- SPECT
- SVM, Support Vector Machine
- T1-MRI, T1-weighted Magnetic Resonance Imaging
- TN, True Negative (specificity - proportion of healthy controls correctly classified)
- TP, True Positive (sensitivity - proportion of patients correctly classified)
- rAUC, Ratio between negative and positive Area Under Curve
- rCBF-SPECT, Regional Cerebral Blood Flow Single-Photon Emission Computed Tomography
Affiliation(s)
- Jane Maryam Rondina
- Laboratory of Psychiatric Neuroimaging (LIM 21), Department of Psychiatry, Faculty of Medicine, University of São Paulo, São Paulo, Brazil; Sobell Department of Motor Neuroscience and Movement Disorders, Institute of Neurology, University College London, London, UK.
- Luiz Kobuti Ferreira
- Laboratory of Psychiatric Neuroimaging (LIM 21), Department of Psychiatry, Faculty of Medicine, University of São Paulo, São Paulo, Brazil; Núcleo de Apoio à Pesquisa em Neurociência Aplicada (NAPNA), University of São Paulo, São Paulo, Brazil
- Fabio Luis de Souza Duran
- Laboratory of Psychiatric Neuroimaging (LIM 21), Department of Psychiatry, Faculty of Medicine, University of São Paulo, São Paulo, Brazil
- Rodrigo Kubo
- Department of Radiology and Oncology, University of São Paulo Medical School, São Paulo, Brazil
- Carla Rachel Ono
- Department of Radiology and Oncology, University of São Paulo Medical School, São Paulo, Brazil
- Claudia Costa Leite
- Department of Radiology and Oncology, University of São Paulo Medical School, São Paulo, Brazil; Department of Radiology, University of North Carolina at Chapel Hill, NC, USA
- Jerusa Smid
- Department of Neurology and Cognitive Disorders Reference Center (CEREDIC), University of São Paulo, São Paulo, Brazil
- Ricardo Nitrini
- Department of Neurology and Cognitive Disorders Reference Center (CEREDIC), University of São Paulo, São Paulo, Brazil
- Geraldo F Busatto
- Laboratory of Psychiatric Neuroimaging (LIM 21), Department of Psychiatry, Faculty of Medicine, University of São Paulo, São Paulo, Brazil; Núcleo de Apoio à Pesquisa em Neurociência Aplicada (NAPNA), University of São Paulo, São Paulo, Brazil; Department and Institute of Psychiatry, University of São Paulo, São Paulo, Brazil
|
15
|
Abstract
Cancer has been characterized as a heterogeneous disease consisting of many different subtypes. The early diagnosis and prognosis of a cancer type have become a necessity in cancer research, as they can facilitate the subsequent clinical management of patients. The importance of classifying cancer patients into high or low risk groups has led many research teams, from the biomedical and the bioinformatics fields, to study the application of machine learning (ML) methods. Therefore, these techniques have been utilized with the aim of modeling the progression and treatment of cancerous conditions. In addition, the ability of ML tools to detect key features from complex datasets reveals their importance. A variety of these techniques, including Artificial Neural Networks (ANNs), Bayesian Networks (BNs), Support Vector Machines (SVMs) and Decision Trees (DTs), have been widely applied in cancer research for the development of predictive models, resulting in effective and accurate decision making. Even though it is evident that the use of ML methods can improve our understanding of cancer progression, an appropriate level of validation is needed in order for these methods to be considered in everyday clinical practice. In this work, we present a review of recent ML approaches employed in the modeling of cancer progression. The predictive models discussed here are based on various supervised ML techniques as well as on different input features and data samples. Given the growing trend in the application of ML methods in cancer research, we present here the most recent publications that employ these techniques with the aim of modeling cancer risk or patient outcomes.
Key Words
- ANN, Artificial Neural Network
- AUC, Area Under Curve
- BCRSVM, Breast Cancer Support Vector Machine
- BN, Bayesian Network
- CFS, Correlation based Feature Selection
- Cancer recurrence
- Cancer survival
- Cancer susceptibility
- DT, Decision Tree
- ES, Early Stopping algorithm
- GEO, Gene Expression Omnibus
- HTT, High-throughput Technologies
- LCS, Learning Classifying Systems
- ML, Machine Learning
- Machine learning
- NCI caArray, National Cancer Institute Array Data Management System
- NSCLC, Non-small Cell Lung Cancer
- OSCC, Oral Squamous Cell Carcinoma
- PPI, Protein–Protein Interaction
- Predictive models
- ROC, Receiver Operating Characteristic
- SEER, Surveillance, Epidemiology and End results Database
- SSL, Semi-supervised Learning
- SVM, Support Vector Machine
- TCGA, The Cancer Genome Atlas Research Network
Affiliation(s)
- Konstantina Kourou
- Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece
- Themis P Exarchos
- Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece; IMBB - FORTH, Dept. of Biomedical Research, Ioannina, Greece
- Konstantinos P Exarchos
- Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece
- Michalis V Karamouzis
- Molecular Oncology Unit, Department of Biological Chemistry, Medical School, University of Athens, Athens, Greece
- Dimitrios I Fotiadis
- Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece; IMBB - FORTH, Dept. of Biomedical Research, Ioannina, Greece
|