1
|
Kourou K, Exarchos TP, Exarchos KP, Karamouzis MV, Fotiadis DI. Machine learning applications in cancer prognosis and prediction. Comput Struct Biotechnol J 2014; 13:8-17. [PMID: 25750696 PMCID: PMC4348437 DOI: 10.1016/j.csbj.2014.11.005] [Citation(s) in RCA: 1262] [Impact Index Per Article: 114.7] [Indexed: 02/06/2023] Open
Abstract
Cancer has been characterized as a heterogeneous disease consisting of many different subtypes. The early diagnosis and prognosis of a cancer type have become a necessity in cancer research, as it can facilitate the subsequent clinical management of patients. The importance of classifying cancer patients into high or low risk groups has led many research teams, from the biomedical and the bioinformatics field, to study the application of machine learning (ML) methods. Therefore, these techniques have been utilized as an aim to model the progression and treatment of cancerous conditions. In addition, the ability of ML tools to detect key features from complex datasets reveals their importance. A variety of these techniques, including Artificial Neural Networks (ANNs), Bayesian Networks (BNs), Support Vector Machines (SVMs) and Decision Trees (DTs) have been widely applied in cancer research for the development of predictive models, resulting in effective and accurate decision making. Even though it is evident that the use of ML methods can improve our understanding of cancer progression, an appropriate level of validation is needed in order for these methods to be considered in the everyday clinical practice. In this work, we present a review of recent ML approaches employed in the modeling of cancer progression. The predictive models discussed here are based on various supervised ML techniques as well as on different input features and data samples. Given the growing trend on the application of ML methods in cancer research, we present here the most recent publications that employ these techniques as an aim to model cancer risk or patient outcomes.
Key Words
- ANN, Artificial Neural Network
- AUC, Area Under Curve
- BCRSVM, Breast Cancer Support Vector Machine
- BN, Bayesian Network
- CFS, Correlation based Feature Selection
- Cancer recurrence
- Cancer survival
- Cancer susceptibility
- DT, Decision Tree
- ES, Early Stopping algorithm
- GEO, Gene Expression Omnibus
- HTT, High-throughput Technologies
- LCS, Learning Classifying Systems
- ML, Machine Learning
- Machine learning
- NCI caArray, National Cancer Institute Array Data Management System
- NSCLC, Non-small Cell Lung Cancer
- OSCC, Oral Squamous Cell Carcinoma
- PPI, Protein–Protein Interaction
- Predictive models
- ROC, Receiver Operating Characteristic
- SEER, Surveillance, Epidemiology and End results Database
- SSL, Semi-supervised Learning
- SVM, Support Vector Machine
- TCGA, The Cancer Genome Atlas Research Network
|
Review |
11 |
1262 |
2
|
Albaradei S, Thafar M, Alsaedi A, Van Neste C, Gojobori T, Essack M, Gao X. Machine learning and deep learning methods that use omics data for metastasis prediction. Comput Struct Biotechnol J 2021; 19:5008-5018. [PMID: 34589181 PMCID: PMC8450182 DOI: 10.1016/j.csbj.2021.09.001] [Citation(s) in RCA: 83] [Impact Index Per Article: 20.8] [Received: 01/18/2021] [Revised: 08/16/2021] [Accepted: 09/02/2021] [Indexed: 12/14/2022] Open
Abstract
The knowledge that metastasis is the primary cause of cancer-related deaths has incentivized research directed towards unraveling the complex cellular processes that drive metastasis. Advancement in technology, and specifically the advent of high-throughput sequencing, provides knowledge of such processes. This knowledge led to the development of therapeutic and clinical applications, and is now being used to predict the onset of metastasis to improve diagnostics and disease therapies. In this regard, predicting metastasis onset has also been explored using artificial intelligence approaches that are machine learning-based and, more recently, deep learning-based. This review summarizes the different machine learning and deep learning-based metastasis prediction methods developed to date. We also detail the different types of molecular data used to build the models and the critical signatures derived from the different methods. We further highlight the challenges associated with using machine learning and deep learning methods, and provide suggestions to improve the predictive performance of such methods.
Key Words
- AE, autoencoder
- ANN, Artificial Neural Network
- AUC, area under the curve
- Acc, Accuracy
- Artificial intelligence
- BC, Betweenness centrality
- BH, Benjamini-Hochberg
- BioGRID, Biological General Repository for Interaction Datasets
- CCP, compound covariate predictor
- CEA, Carcinoembryonic antigen
- CNN, convolution neural networks
- CV, cross-validation
- Cancer
- DBN, deep belief network
- DDBN, discriminative deep belief network
- DEGs, differentially expressed genes
- DIP, Database of Interacting Proteins
- DNN, Deep neural network
- DT, Decision Tree
- Deep learning
- EMT, epithelial-mesenchymal transition
- FC, fully connected
- GA, Genetic Algorithm
- GANs, generative adversarial networks
- GEO, Gene Expression Omnibus
- HCC, hepatocellular carcinoma
- HPRD, Human Protein Reference Database
- KNN, K-nearest neighbor
- L-SVM, linear SVM
- LIMMA, linear models for microarray data
- LOOCV, Leave-one-out cross-validation
- LR, Logistic Regression
- MCCV, Monte Carlo cross-validation
- MLP, multilayer perceptron
- Machine learning
- Metastasis
- NPV, negative predictive value
- PCA, Principal component analysis
- PPI, protein-protein interaction
- PPV, positive predictive value
- RC, ridge classifier
- RF, Random Forest
- RFE, recursive feature elimination
- RMA, robust multi‐array average
- RNN, recurrent neural networks
- SGD, stochastic gradient descent
- SMOTE, synthetic minority over-sampling technique
- SVM, Support Vector Machine
- Se, sensitivity
- Sp, specificity
- TCGA, The Cancer Genome Atlas
- k-CV, k-fold cross validation
- mRMR, minimum redundancy maximum relevance
|
Review |
4 |
83 |
3
|
Yang X, Yang S, Li Q, Wuchty S, Zhang Z. Prediction of human-virus protein-protein interactions through a sequence embedding-based machine learning method. Comput Struct Biotechnol J 2019; 18:153-161. [PMID: 31969974 PMCID: PMC6961065 DOI: 10.1016/j.csbj.2019.12.005] [Citation(s) in RCA: 78] [Impact Index Per Article: 13.0] [Received: 10/10/2019] [Revised: 11/29/2019] [Accepted: 12/10/2019] [Indexed: 12/11/2022] Open
Abstract
The identification of human-virus protein-protein interactions (PPIs) is an essential and challenging research topic, potentially providing a mechanistic understanding of viral infection. Given that the experimental determination of human-virus PPIs is time-consuming and labor-intensive, computational methods are playing an important role in providing testable hypotheses, complementing the determination of large-scale interactomes between species. In this work, we applied an unsupervised sequence embedding technique (doc2vec) to represent protein sequences as rich feature vectors of low dimensionality. Training a Random Forest (RF) classifier on a training dataset that covers known PPIs between human and all viruses, we obtained excellent predictive accuracy, outperforming various combinations of machine learning algorithms and commonly-used sequence encoding schemes. In a rigorous comparison with three existing human-virus PPI prediction methods, our proposed computational framework further provided very competitive and promising performance, suggesting that the doc2vec encoding scheme effectively captures context information of protein sequences pertaining to the corresponding protein-protein interactions. Our approach is freely accessible through our web server as part of our host-pathogen PPI prediction platform (http://zzdlab.com/InterSPPI/). Taken together, we hope the current work not only contributes a useful predictor to accelerate the exploration of human-virus PPIs, but also provides some meaningful insights into human-virus relationships.
Key Words
- AC, Auto Covariance
- ACC, Accuracy
- AUC, area under the ROC curve
- AUPRC, area under the PR curve
- Adaboost, Adaptive Boosting
- CT, Conjoint Triad
- Doc2vec
- Embedding
- Human-virus interaction
- LD, Local Descriptor
- MCC, Matthews correlation coefficient
- ML, machine learning
- MLP, Multiple Layer Perceptron
- MS, mass spectroscopy
- Machine learning
- PPIs, protein-protein interactions
- PR, Precision-Recall
- Prediction
- Protein-protein interaction
- RBF, radial basis function
- RF, Random Forest
- ROC, Receiver Operating Characteristic
- SGD, stochastic gradient descent
- SVM, Support Vector Machine
- Y2H, yeast two-hybrid
|
research-article |
6 |
78 |
4
|
Adamidi ES, Mitsis K, Nikita KS. Artificial intelligence in clinical care amidst COVID-19 pandemic: A systematic review. Comput Struct Biotechnol J 2021; 19:2833-2850. [PMID: 34025952 PMCID: PMC8123783 DOI: 10.1016/j.csbj.2021.05.010] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 02/17/2021] [Revised: 05/01/2021] [Accepted: 05/02/2021] [Indexed: 12/23/2022] Open
Abstract
The worldwide health crisis caused by the SARS-CoV-2 virus has resulted in >3 million deaths so far. Improving early screening, diagnosis and prognosis of the disease are critical steps in assisting healthcare professionals to save lives during this pandemic. Since WHO declared the COVID-19 outbreak a pandemic, several studies have been conducted using Artificial Intelligence techniques to optimize these steps in clinical settings in terms of quality, accuracy and, most importantly, time. The objective of this study is to conduct a systematic literature review of published and preprint reports of Artificial Intelligence models developed and validated for screening, diagnosis and prognosis of the coronavirus disease 2019. We included 101 studies, published from January 1st, 2020 to December 30th, 2020, that developed AI prediction models which can be applied in the clinical setting. We identified in total 14 models for screening, 38 diagnostic models for detecting COVID-19 and 50 prognostic models for predicting ICU need, ventilator need, mortality risk, severity assessment or hospital length of stay. Moreover, 43 studies were based on medical imaging and 58 studies on the use of clinical parameters, laboratory results or demographic features. Several heterogeneous predictors derived from multimodal data were identified. Analysis of these multimodal data, captured from various sources, in terms of prominence for each category of the included studies, was performed. Finally, Risk of Bias (RoB) analysis was also conducted to examine the applicability of the included studies in the clinical setting and assist healthcare providers, guideline developers, and policymakers.
Key Words
- ABG, Arterial Blood Gas
- ADA, Adenosine Deaminase
- AI, Artificial Intelligence
- ANN, Artificial Neural Networks
- APTT, Activated Partial Thromboplastin Time
- ARMED, Attribute Reduction with Multi-objective Decomposition Ensemble optimizer
- AUC, Area Under the Curve
- Acc, Accuracy
- Adaboost, Adaptive Boosting
- Apol AI, Apolipoprotein AI
- Apol B, Apolipoprotein B
- Artificial intelligence
- BNB, Bernoulli Naïve Bayes
- BUN, Blood Urea Nitrogen
- CI, Confidence Interval
- CK-MB, Creatine Kinase isoenzyme
- CNN, Convolutional Neural Networks
- COVID-19
- CPP, COVID-19 Positive Patients
- CRP, C-Reactive Protein
- CRT, Classification and Regression Decision Tree
- CoxPH, Cox Proportional Hazards
- DCNN, Deep Convolutional Neural Networks
- DL, Deep Learning
- DLC, Density Lipoprotein Cholesterol
- DNN, Deep Neural Networks
- DT, Decision Tree
- Diagnosis
- ED, Emergency Department
- ESR, Erythrocyte Sedimentation Rate
- ET, Extra Trees
- FCV, Fold Cross Validation
- FL, Federated Learning
- FiO2, Fraction of Inspiration O2
- GBDT, Gradient Boost Decision Tree
- GBM light, Gradient Boosting Machine light
- GDCNN, Genetic Deep Learning Convolutional Neural Network
- GFR, Glomerular Filtration Rate
- GFS, Gradient boosted feature selection
- GGT, Glutamyl Transpeptidase
- GNB, Gaussian Naïve Bayes
- HDLC, High Density Lipoprotein Cholesterol
- INR, International Normalized Ratio
- Inception Resnet, Inception Residual Neural Network
- L1LR, L1 Regularized Logistic Regression
- LASSO, Least Absolute Shrinkage and Selection Operator
- LDA, Linear Discriminant Analysis
- LDH, Lactate Dehydrogenase
- LDLC, Low Density Lipoprotein Cholesterol
- LR, Logistic Regression
- LSTM, Long-Short Term Memory
- MCHC, Mean Corpuscular Hemoglobin Concentration
- MCV, Mean corpuscular volume
- ML, Machine Learning
- MLP, MultiLayer Perceptron
- MPV, Mean Platelet Volume
- MRMR, Maximum Relevance Minimum Redundancy
- Multimodal data
- NB, Naïve Bayes
- NLP, Natural Language Processing
- NPV, Negative Predictive Values
- Nadam optimizer, Nesterov Accelerated Adaptive Moment optimizer
- OB, Occult Blood test
- PCT, Thrombocytocrit
- PPV, Positive Predictive Values
- PWD, Platelet Distribution Width
- PaO2, Arterial Oxygen Tension
- Paco2, Arterial Carbondioxide Tension
- Prognosis
- RBC, Red Blood Cell
- RBF, Radial Basis Function
- RBP, Retinol Binding Protein
- RDW, Red blood cell Distribution Width
- RF, Random Forest
- RFE, Recursive Feature Elimination
- RSV, Respiratory Syncytial Virus
- SEN, Sensitivity
- SG, Specific Gravity
- SMOTE, Synthetic Minority Oversampling Technique
- SPE, Specificity
- SRLSR, Sparse Rescaled Linear Square Regression
- SVM, Support Vector Machine
- SaO2, Arterial Oxygen saturation
- Screening
- TBA, Total Bile Acid
- TTS, Training Test Split
- WBC, White Blood Cell count
- XGB, eXtreme Gradient Boost
- k-NN, K-Nearest Neighbor
|
Review |
4 |
52 |
5
|
Ayoobi N, Sharifrazi D, Alizadehsani R, Shoeibi A, Gorriz JM, Moosaei H, Khosravi A, Nahavandi S, Gholamzadeh Chofreh A, Goni FA, Klemeš JJ, Mosavi A. Time series forecasting of new cases and new deaths rate for COVID-19 using deep learning methods. Results in Physics 2021; 27:104495. [PMID: 34221854 PMCID: PMC8233414 DOI: 10.1016/j.rinp.2021.104495] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 03/22/2021] [Revised: 06/19/2021] [Accepted: 06/22/2021] [Indexed: 05/17/2023]
Abstract
The first known case of Coronavirus disease 2019 (COVID-19) was identified in December 2019. It has spread worldwide, leading to an ongoing pandemic and imposing restrictions and costs on many countries. Predicting the number of new cases and deaths during this period can be a useful step in predicting the costs and facilities required in the future. The purpose of this study is to predict the rates of new cases and new deaths one, three and seven days ahead during the next 100 days. The motivation for predicting every n days (instead of just every day) is to investigate the possibility of reducing computational cost while still achieving reasonable performance. Such a scenario may be encountered in real-time forecasting of time series. Six different deep learning methods are examined on data adopted from the WHO website. Three methods are LSTM, Convolutional LSTM, and GRU. The bidirectional extension is then considered for each method to forecast the rate of new cases and new deaths in Australia and Iran. This study is novel as it carries out a comprehensive evaluation of the aforementioned three deep learning methods and their bidirectional extensions to perform prediction on COVID-19 new cases and new death rate time series. To the best of our knowledge, this is the first time that Bi-GRU and Bi-Conv-LSTM models are used for prediction on COVID-19 new cases and new deaths time series. The evaluation of the methods is presented in the form of graphs and the Friedman statistical test. The results show that the bidirectional models have lower errors than other models. Several error evaluation metrics are presented to compare all models, and finally, the superiority of bidirectional methods is determined. This research could be useful for organisations working against COVID-19 and determining their long-term plans.
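The one-, three- and seven-day-ahead setup described in this abstract can be pictured as a windowing step that pairs a run of past observations with the value a fixed number of days later. The following is a minimal sketch of that idea only, not the authors' pipeline: the function name `make_windows`, the window length of 4 and the `daily_cases` values are all hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], window: int, horizon: int
) -> List[Tuple[List[float], float]]:
    """Pair each length-`window` slice with the value `horizon` steps after it.

    horizon=1 is the next day, horizon=3 is three days ahead, and so on.
    Returns an empty list when the series is too short.
    """
    pairs = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        inputs = list(series[start:start + window])
        target = series[start + window + horizon - 1]
        pairs.append((inputs, target))
    return pairs


# Hypothetical daily counts, not data from the study.
daily_cases = [10.0, 12.0, 15.0, 11.0, 9.0, 14.0, 18.0, 20.0, 17.0, 16.0, 19.0, 22.0]

for horizon in (1, 3, 7):
    pairs = make_windows(daily_cases, window=4, horizon=horizon)
    print(f"horizon={horizon}: {len(pairs)} training pairs")
```

Each (inputs, target) pair could then be fed to any sequence model; a longer horizon leaves fewer usable pairs from the same series, which is one cost of forecasting further ahead.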
Key Words
- ANFIS, Adaptive Network-based Fuzzy Inference System
- ANN, Artificial Neural Network
- AU, Australia
- Bi-Conv-LSTM, Bidirectional Convolutional Long Short Term Memory
- Bi-GRU, Bidirectional Gated Recurrent Unit
- Bi-LSTM, Bidirectional Long Short-Term Memory
- Bidirectional
- COVID-19 Prediction
- COVID-19, Coronavirus Disease 2019
- Conv-LSTM, Convolutional Long Short Term Memory
- Convolutional Long Short Term Memory (Conv-LSTM)
- DL, Deep Learning
- DLSTM, Delayed Long Short-Term Memory
- Deep learning
- EMRO, Eastern Mediterranean Regional Office
- ES, Exponential Smoothing
- EV, Explained Variance
- GRU, Gated Recurrent Unit
- Gated Recurrent Unit (GRU)
- IR, Iran
- LR, Linear Regression
- LSTM, Long Short-Term Memory
- Lasso, Least Absolute Shrinkage and Selection Operator
- Long Short Term Memory (LSTM)
- MAE, Mean Absolute Error
- MAPE, Mean Absolute Percentage Error
- MERS, Middle East Respiratory Syndrome
- ML, Machine Learning
- MLP-ICA, Multi-layered Perceptron-Imperialist Competitive Calculation
- MSE, Mean Square Error
- MSLE, Mean Squared Log Error
- Machine learning
- New Cases of COVID-19
- New Deaths of COVID-19
- PRISMA, Preferred Reporting Items for Precise Surveys and Meta-Analyses
- RMSE, Root Mean Square Error
- RMSLE, Root Mean Squared Log Error
- RNN, Repetitive Neural Network
- ReLU, Rectified Linear Unit
- SARS, Serious Intense Respiratory Disorder
- SARS-COV, SARS coronavirus
- SARS-COV-2, Serious Intense Respiratory Disorder Coronavirus 2
- SVM, Support Vector Machine
- VAE, Variational Auto Encoder
- WHO, World Health Organization
- WPRO, Western Pacific Regional Office
|
research-article |
4 |
46 |
6
|
Rondina JM, Ferreira LK, de Souza Duran FL, Kubo R, Ono CR, Leite CC, Smid J, Nitrini R, Buchpiguel CA, Busatto GF. Selecting the most relevant brain regions to discriminate Alzheimer's disease patients from healthy controls using multiple kernel learning: A comparison across functional and structural imaging modalities and atlases. Neuroimage Clin 2017; 17:628-641. [PMID: 29234599 PMCID: PMC5716956 DOI: 10.1016/j.nicl.2017.10.026] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 12/31/2016] [Revised: 10/12/2017] [Accepted: 10/24/2017] [Indexed: 12/11/2022]
Abstract
BACKGROUND Machine learning techniques such as support vector machine (SVM) have been applied recently in order to accurately classify individuals with neuropsychiatric disorders such as Alzheimer's disease (AD) based on neuroimaging data. However, the multivariate nature of the SVM approach often precludes the identification of the brain regions that contribute most to classification accuracy. Multiple kernel learning (MKL) is a sparse machine learning method that allows the identification of the most relevant sources for the classification. By parcellating the brain into regions of interest (ROI) it is possible to use each ROI as a source to MKL (ROI-MKL). METHODS We applied MKL to multimodal neuroimaging data in order to: 1) compare the diagnostic performance of ROI-MKL and whole-brain SVM in discriminating patients with AD from demographically matched healthy controls and 2) identify the most relevant brain regions to the classification. We used two atlases (AAL and Brodmann's) to parcellate the brain into ROIs and applied ROI-MKL to structural (T1) MRI, 18F-FDG-PET and regional cerebral blood flow SPECT (rCBF-SPECT) data acquired from the same subjects (20 patients with early AD and 18 controls). In ROI-MKL, each ROI received a weight (ROI-weight) that indicated the region's relevance to the classification. For each ROI, we also calculated whether there was a predominance of voxels indicating decreased or increased regional activity (for 18F-FDG-PET and rCBF-SPECT) or volume (for T1-MRI) in AD patients. RESULTS Compared to whole-brain SVM, the ROI-MKL approach resulted in better accuracies (with either atlas) for classification using 18F-FDG-PET (92.5% accuracy for ROI-MKL versus 84% for whole-brain), but not when using rCBF-SPECT or T1-MRI. Although several cortical and subcortical regions contributed to discrimination, high ROI-weights and predominance of hypometabolism and atrophy were identified especially in medial parietal and temporo-limbic cortical regions. Also, the weight of discrimination due to a pattern of increased voxel-weight values in AD individuals was surprisingly high (ranging from approximately 20% to 40% depending on the imaging modality), located mainly in primary sensorimotor and visual cortices and subcortical nuclei. CONCLUSION The MKL-ROI approach highlights the high discriminative weight of a subset of brain regions of known relevance to AD, the selection of which contributes to increased classification accuracy when applied to 18F-FDG-PET data. Moreover, the MKL-ROI approach demonstrates that brain regions typically spared in mild stages of AD also contribute substantially in the individual discrimination of AD patients from controls.
Key Words
- 18F-FDG-PET, 18F-Fluorodeoxyglucose-Positron Emission Tomography
- AAL, Automated Anatomical Labeling (atlas)
- AD, Alzheimer's Disease
- Alzheimer's Disease
- BA, Brodmann's Area
- Brain atlas
- GM, Gray Matter
- MKL, Multiple Kernel Learning
- MKL-ROI, MKL based on regions of interest
- ML, Machine Learning
- MRI
- Multiple kernel learning
- NF, number of features
- NSR, Number of Selected Regions
- PET
- PVE, Partial Volume Effects
- ROI, Region of Interest
- SPECT
- SVM, Support Vector Machine
- T1-MRI, T1-weighted Magnetic Resonance Imaging
- TN, True Negative (specificity - proportion of healthy controls correctly classified)
- TP, True Positive (sensitivity - proportion of patients correctly classified)
- rAUC, Ratio between negative and positive Area Under Curve
- rCBF-SPECT, Regional Cerebral Blood Flow
|
Comparative Study |
8 |
33 |
7
|
Martín J, Tena N, Asuero AG. Current state of diagnostic, screening and surveillance testing methods for COVID-19 from an analytical chemistry point of view. Microchem J 2021; 167:106305. [PMID: 33897053 PMCID: PMC8054532 DOI: 10.1016/j.microc.2021.106305] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 02/10/2021] [Revised: 04/12/2021] [Accepted: 04/14/2021] [Indexed: 12/18/2022]
Abstract
Since December 2019, we have been on the battlefield with a new threat to humanity known as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In this review, we describe the four main methods used for diagnosis, screening and/or surveillance of SARS-CoV-2: real-time reverse transcription polymerase chain reaction (RT-PCR); chest computed tomography (CT); and different complementary alternatives developed in order to obtain rapid results, antigen and antibody detection. All of them are compared, highlighting advantages and disadvantages from an analytical point of view. The gold standard method in terms of sensitivity and specificity is RT-PCR. The different modifications proposed to make it more rapid and applicable at point of care (POC) are also presented and discussed. CT images are limited to central hospitals. However, combining CT with RT-PCR is the most robust and accurate way to confirm COVID-19 infection. Antibody tests, although unable to provide reliable results on the status of the infection, are suitable for carrying out maximum screening of the population in order to know the immune capacity. More recently, antigen tests, less sensitive than RT-PCR, have been authorized to determine more quickly whether the patient is infected at the time of analysis, without the need for specific instruments.
Key Words
- 2019-nCoV, 2019 novel coronavirus
- ACE2, Angiotensin-Converting Enzyme 2
- AI, Artificial Intelligence
- ALP, Alkaline Phosphatase
- ASOs, Antisense Oligonucleotides
- Antigen and antibody tests
- AuNIs, Gold Nanoislands
- AuNPs, Gold Nanoparticles
- BSL, Biosecurity Level
- CAP, College of American Pathologists
- CCD, Charge-Coupled Device
- CG, Colloidal Gold
- CGIA, Colloidal Gold Immunochromatographic Assay
- CLIA, Chemiluminescence Enzyme Immunoassay
- CLIA, Clinical Laboratory Improvement Amendments
- COVID-19
- COVID-19, Coronavirus disease-19
- CRISPR, Clustered Regularly Interspaced Short Palindromic Repeats
- CT, Chest Computed Tomography
- Cas, CRISPR Associate Protein
- China CDC, Chinese Center for Disease Control and Prevention
- Ct, Cycle Threshold
- DETECTR, SARS-CoV-2 DNA Endonuclease-Targeted CRISPR Trans Reporter
- DNA, Deoxyribonucleic Acid
- E, Envelope protein
- ELISA, Enzyme Linked Immunosorbent Assay
- EMA, European Medicines Agency
- EUA, Emergency Use Authorization
- FDA, Food and Drug Administration
- FET, Field-Effect Transistor
- GISAID, Global Initiative on Sharing All Influenza Data
- GeneBank, Genetic sequence data base of the National Institute of Health
- ICTV, International Committee on Taxonomy of Viruses
- IgA, Immunoglobulins A
- IgG, Immunoglobulins G
- IgM, Immunoglobulins M
- IoMT, Internet of Medical Things
- IoT, Internet of Things
- LFIA, Lateral Flow Immunochromatographic Assays
- LOC, Lab-on-a-Chip
- LOD, Limit of detection
- LSPR, Localized Surface Plasmon Resonance
- M, Membrane protein
- MERS-CoV, Middle East Respiratory Syndrome Coronavirus
- MNP, Magnetic Nanoparticle
- MS, Mass spectrometry
- N, Nucleocapsid protein
- NER, Naked Eye Readout
- NGM, Next Generation Molecular
- NGS, Next Generation Sequencing
- NIH, National Institute of Health
- NSPs, Nonstructural Proteins
- Net, Neural Network
- ORF, Open Reading Frame
- OSN, One Step Single-tube Nested
- PDMS, Polydimethylsiloxane
- POC, Point of Care
- PPT, Plasmonic Photothermal
- QD, Quantum Dot
- R0, Basic reproductive number
- RBD, Receptor-binding domain
- RNA, Ribonucleic Acid
- RNaseH, Ribonuclease H
- RT, Reverse Transcriptase
- RT-LAMP, Reverse Transcription Loop-Mediated Isothermal Amplification
- RT-PCR, Real-Time Reverse Transcription Polymerase Chain Reaction
- RT-PCR, chest computerized tomography
- RdRp, RNA-Dependent RNA Polymerase
- S, Spike protein
- SARS-CoV-2
- SARS-CoV-2, Severe Acute Respiratory Syndrome Coronavirus 2
- SERS, Surface Enhanced Raman Spectroscopy
- SHERLOCK, Specific High Sensitivity Enzymatic Reporter UnLOCKing
- STOPCovid, SHERLOCK Testing on One Pot
- SVM, Support Vector Machine
- SiO2@Ag, Complete silver nanoparticle shell coated on silica core
- US CDC, US Centers for Disease Control and Prevention
- VOC, Variant of Concern
- VTM, Viral Transport Medium
- WGS, Whole Genome Sequencing
- WHO, World Health Organization
- aM, Attomolar
- dNTPs, Nucleotides
- dPCR, Digital PCR
- ddPCR, Droplet digital PCR
- fM, Femtomolar
- m-RNA, Messenger Ribonucleic Acid
- nM, Nanomolar
- pM, Picomolar
- pfu, Plaque-forming unit
- rN, Recombinant nucleocapsid protein antigen
- rS, Recombinant Spike protein antigen
- ssRNA, Single-Stranded Positive-Sense RNA
|
research-article |
4 |
27 |
8
|
Ferroni P, Zanzotto FM, Scarpato N, Spila A, Fofi L, Egeo G, Rullo A, Palmirotta R, Barbanti P, Guadagni F. Machine learning approach to predict medication overuse in migraine patients. Comput Struct Biotechnol J 2020; 18:1487-1496. [PMID: 32637046 PMCID: PMC7327028 DOI: 10.1016/j.csbj.2020.06.006] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 10/28/2019] [Revised: 05/19/2020] [Accepted: 06/05/2020] [Indexed: 11/23/2022] Open
Abstract
Machine learning (ML) is largely used to develop automatic predictors in migraine classification but automatic predictors for medication overuse (MO) in migraine are still in their infancy. Thus, to understand the benefits of ML in MO prediction, we explored an automated predictor to estimate MO risk in migraine. To achieve this objective, a study was designed to analyze the performance of a customized ML-based decision support system that combines support vector machines and Random Optimization (RO-MO). We used RO-MO to extract prognostic information from demographic, clinical and biochemical data. Using a dataset of 777 consecutive migraine patients we derived a set of predictors with discriminatory power for MO higher than that observed for baseline SVM. The best four were incorporated into the final RO-MO decision support system and risk evaluation on a five-level stratification was performed. ROC analysis resulted in a c-statistic of 0.83 with a sensitivity and specificity of 0.69 and 0.87, respectively, and an accuracy of 0.87 when MO was predicted by at least three RO-MO models. Logistic regression analysis confirmed that the derived RO-MO system could effectively predict MO with ORs of 5.7 and 21.0 for patients classified as probably (3 predictors positive), or definitely at risk of MO (4 predictors positive), respectively. In conclusion, a combination of ML and RO - taking into consideration clinical/biochemical features, drug exposure and lifestyle - might represent a valuable approach to MO prediction in migraine and holds the potential for improving model precision through weighting the relative importance of attributes.
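The decision rule mentioned in this abstract (MO is predicted when at least three of the four RO-MO models are positive) and the reported sensitivity and specificity can be illustrated with a short sketch. This is an assumption-laden illustration rather than the RO-MO system itself: the function names, the `votes` layout and the six example patients are hypothetical, and only the "at least three of four" threshold comes from the abstract.

```python
from typing import List, Sequence, Tuple


def predicts_mo(votes: Sequence[bool], min_positive: int = 3) -> bool:
    """Flag a patient when at least `min_positive` model votes are positive."""
    return sum(1 for vote in votes if vote) >= min_positive


def sensitivity_specificity(
    y_true: Sequence[bool], y_pred: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity); both classes must be present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("y_true must contain both positive and negative cases")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical votes from four models for six patients (not study data).
votes_per_patient: List[List[bool]] = [
    [True, True, True, True],
    [True, True, True, False],
    [True, False, False, True],
    [False, False, False, False],
    [True, True, False, True],
    [False, True, False, False],
]
observed_mo = [True, True, True, False, False, False]

predicted_mo = [predicts_mo(votes) for votes in votes_per_patient]
sens, spec = sensitivity_specificity(observed_mo, predicted_mo)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Raising `min_positive` from 3 to 4 would make the rule stricter, which typically trades sensitivity for specificity.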
Key Words
- AI, Artificial Intelligence
- AUC, Area Under the Curve
- Artificial intelligence
- BMI, body mass index
- CI, Confidence Interval
- DBH 19-bp I/D polymorphism, Dopamine-Beta-Hydroxylase 19 bp insertion/deletion polymorphism
- DSS, Decision Support System
- Decision support systems
- ICT, Information and Communications Technology
- KELP, Kernel-based Learning Platform
- LRs, likelihood ratios
- MKL, Multiple Kernel Learning
- ML, Machine Learning
- MO, Medication Overuse
- Machine learning
- Medication overuse
- Migraine
- NSAID, nonsteroidal anti-inflammatory drugs
- PVI, Predictive Value Imputation
- RO, Random Optimization
- ROC, Receiver operating characteristic
- SE, Standard Error
- SVM, Support Vector Machine
|
research-article |
5 |
23 |
9
|
Wang Q, Zhang Y, Zhang E, Xing X, Chen Y, Su MY, Lang N. Prediction of the early recurrence in spinal giant cell tumor of bone using radiomics of preoperative CT: Long-term outcome of 62 consecutive patients. J Bone Oncol 2021; 27:100354. [PMID: 33850701 PMCID: PMC8039834 DOI: 10.1016/j.jbo.2021.100354] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 11/05/2020] [Revised: 02/26/2021] [Accepted: 02/28/2021] [Indexed: 12/27/2022] Open
Abstract
Characteristics of 62 patients with spinal GCTB who underwent surgery. A prognostic classification model was built based on features selected by SVM. The combined histogram and texture features could predict recurrence of GCTB. Objectives To determine if radiomics analysis based on preoperative computed tomography (CT) can predict early postoperative recurrence of giant cell tumor of bone (GCTB) in the spine. Methods In a retrospective review, 62 patients with pathologically confirmed spinal GCTB from March 2008 to February 2018, with a minimum follow-up of 24 months, were identified. The mean follow-up was 73.7 months (range, 28.7–152.1 months). The clinical information including age, gender, lesion location, multi-vertebral involvement, and surgical methods, were obtained. CT images acquired before the operation were retrieved for radiomics analysis. For each case, the tumor regions of interest (ROI) was manually outlined, and a total of 107 radiomics features were extracted. The features were selected via the sequential selection process by using the support vector machine (SVM), then used to construct classification models with Gaussian kernels. The differentiation between recurrence and non-recurrence groups was evaluated by ROC analysis, using 10-fold cross-validation. Results Of the 62 patients, 17 had recurrence with a recurrence rate of 27.4%. None of the clinical information was significantly different between the two groups. Patients receiving curettage had a higher recurrence rate (6/16 = 37.5%) compared to patients receiving TES (6/26 = 23.1%) or intralesional spondylectomy (5/20 = 25%). The final radiomics model was built using 10 selected features, which achieved an accuracy of 89% with AUC of 0.78. Conclusions The radiomics model developed based on pre-operative CT can achieve a high accuracy to predict the recurrence of spinal GCTB. Patients who have a high risk of early recurrence should be treated more aggressively to minimize recurrence.
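The 10-fold cross-validation named in this abstract can be sketched as a partition of the 62 patient indices into ten held-out groups. This is a generic illustration of k-fold splitting under stated assumptions, not the authors' code: the function name `k_fold_indices`, the shuffling seed and the plain (non-stratified) split are all choices made here for the example.

```python
import random
from typing import List, Tuple


def k_fold_indices(
    n_samples: int, k: int, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs covering every sample once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k spreads the remainder so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = k_fold_indices(n_samples=62, k=10)
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} train, {len(test)} test")
```

With 62 samples each held-out fold contains only six or seven patients, so per-fold estimates are noisy; any feature selection would normally be repeated inside each training split to avoid leaking information from the held-out patients.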
Key Words
- CT texture analysis
- CT, Computed Tomography
- DICOM, Digital Imaging and Communications in Medicine
- GCTB, Giant Cell Tumor of Bone
- GLCM, Gray Level Co-occurrence Matrix
- GLDM, Gray Level Dependence Matrix
- GLRLM, Gray Level Run Length Matrix
- GLSZM, Gray Level Size Zone Matrix
- Giant cell tumor of bone
- MRI, Magnetic Resonance Imaging
- NGTDM, Neighborhood Gray Tone Difference Matrix
- OPG, Osteoprotegerin
- PACS, Picture Archiving and Communication System
- Prognosis
- RANK, Receptor Activator of Nuclear factor Kappa-Β
- RANKL, Receptor Activator of Nuclear factor Kappa-Β Ligand
- ROC, Receiver Operating Characteristic
- ROI, Regions of Interest
- Radiomics
- SVM, Support Vector Machine
- Spine
|
Journal Article |
4 |
18 |
10
|
Tang J, Wang Y, Luo Y, Fu J, Zhang Y, Li Y, Xiao Z, Lou Y, Qiu Y, Zhu F. Computational advances of tumor marker selection and sample classification in cancer proteomics. Comput Struct Biotechnol J 2020; 18:2012-2025. [PMID: 32802273 PMCID: PMC7403885 DOI: 10.1016/j.csbj.2020.07.009] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 04/07/2020] [Revised: 07/06/2020] [Accepted: 07/08/2020] [Indexed: 12/11/2022] Open
Abstract
Cancer proteomics has become a powerful technique for characterizing the protein markers driving malignant transformation, tracing proteome variation triggered by therapeutics, and discovering novel targets and drugs for the treatment of oncologic diseases. To facilitate cancer diagnosis/prognosis and accelerate drug target discovery, a variety of methods for tumor marker identification and sample classification have been developed and successfully applied to cancer proteomic studies. This review article describes the most recent advances in these approaches together with their current applications in cancer-related studies. Firstly, a number of popular feature selection methods are reviewed, with an objective evaluation of their advantages and disadvantages. Secondly, these methods are grouped into three major classes based on their underlying algorithms. Finally, a variety of sample separation algorithms are discussed. This review provides a comprehensive overview of the advances in tumor marker identification and patient/sample/tissue separation, which could serve as guidance for research in cancer proteomics.
Key Words
- ANN, Artificial Neural Network
- ANOVA, Analysis of Variance
- CFS, Correlation-based Feature Selection
- Cancer proteomics
- Computational methods
- DAPC, Discriminant Analysis of Principal Component
- DT, Decision Trees
- EDA, Estimation of Distribution Algorithm
- FC, Fold Change
- GA, Genetic Algorithms
- GR, Gain Ratio
- HC, Hill Climbing
- HCA, Hierarchical Cluster Analysis
- IG, Information Gain
- LDA, Linear Discriminant Analysis
- LIMMA, Linear Models for Microarray Data
- MBF, Markov Blanket Filter
- MWW, Mann–Whitney–Wilcoxon test
- OPLS-DA, Orthogonal Partial Least Squares Discriminant Analysis
- PCA, Principal Component Analysis
- PLS-DA, Partial Least Square Discriminant Analysis
- RF, Random Forest
- RF-RFE, Random Forest with Recursive Feature Elimination
- SA, Simulated Annealing
- SAM, Significance Analysis of Microarrays
- SBE, Sequential Backward Elimination
- SFS, Sequential Forward Selection
- SOM, Self-organizing Map
- SU, Symmetrical Uncertainty
- SVM, Support Vector Machine
- SVM-RFE, Support Vector Machine with Recursive Feature Elimination
- Sample classification
- Tumor marker selection
- sPLSDA, Sparse Partial Least Squares Discriminant Analysis
- t-SNE, t-distributed Stochastic Neighbor Embedding
- χ2, Chi-square
|
Review |
5 |
15 |
11
|
Amaratunga D, Cabrera J, Sargsyan D, Kostis JB, Zinonos S, Kostis WJ. Uses and opportunities for machine learning in hypertension research. INTERNATIONAL JOURNAL CARDIOLOGY HYPERTENSION 2020; 5:100027. [PMID: 33447756 PMCID: PMC7803038 DOI: 10.1016/j.ijchy.2020.100027] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 02/07/2020] [Revised: 03/09/2020] [Accepted: 03/12/2020] [Indexed: 01/23/2023]
Abstract
Background Artificial intelligence (AI) promises to provide useful information to clinicians specializing in hypertension. Already, there are some significant AI applications on large validated data sets. Methods and results This review presents the use of AI to predict clinical outcomes in big data, i.e., data with high volume, variety, veracity, velocity, and value. Four examples are included in this review. In the first example, deep learning and support vector machine (SVM) predicted the occurrence of cardiovascular events with 56%–57% accuracy. In the second example, in a database of 378,256 patients, a neural network algorithm predicted the occurrence of cardiovascular events during 10-year follow-up with 68% sensitivity and 71% specificity. In the third example, a machine learning algorithm classified 1,504,437 patients on the presence or absence of hypertension with 51% sensitivity, 99% specificity, and an area under the curve of 87%. In the fourth example, wearable biosensors and portable devices using photoplethysmography were used to identify persons at risk of developing hypertension, with sensitivity higher than 80% and positive predictive value higher than 90%. The results of the above studies were adjusted for demographics and the traditional risk factors for atherosclerotic disease. Conclusion These examples describe the use of artificial intelligence methods in the field of hypertension.
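The sensitivity, specificity, and positive predictive value figures reported in this abstract all derive from the four cells of a confusion matrix. The sketch below, in Python, shows the standard definitions; the function name and the example counts are hypothetical and are not drawn from any of the studies summarized here.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic metrics from confusion-matrix counts.

    tp/fn: diseased cases correctly/incorrectly classified.
    tn/fp: non-diseased cases correctly/incorrectly classified.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
        "ppv": tp / (tp + fp),          # share of positive calls that are right
        "npv": tn / (tn + fn),          # share of negative calls that are right
    }


# Hypothetical counts chosen for illustration only.
m = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

A classifier with low sensitivity and high specificity, as in the third example of the abstract, misses many true cases while raising few false alarms, which is why both figures are usually reported together.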
Key Words
- AMI, Acute Myocardial Infarction
- CART, Classification and Regression Trees
- CNN, Convolution Neural Net
- CS/E, Computer Sciences/Engineering
- DBP, Diastolic Blood Pressure
- Deep neural networks
- Disease management
- EHR, Electronic Health Record
- HF, Heart Failure
- Hypertension
- ICD, International Classification of Diseases
- MIDAS, Myocardial Infarction Data Acquisition System
- Machine learning
- NPV, Negative Predictive Value
- PDN, Personalized Disease Network
- PPG, photoplethysmography
- PPV, Positive Predictive Value
- Personalized disease network
- SBP, Systolic Blood Pressure
- SVM, Support Vector Machine
|
Journal Article |
5 |
13 |
12
|
Denysyuk HV, Pinto RJ, Silva PM, Duarte RP, Marinho FA, Pimenta L, Gouveia AJ, Gonçalves NJ, Coelho PJ, Zdravevski E, Lameski P, Leithardt V, Garcia NM, Pires IM. Algorithms for automated diagnosis of cardiovascular diseases based on ECG data: A comprehensive systematic review. Heliyon 2023; 9:e13601. [PMID: 36852052 PMCID: PMC9958295 DOI: 10.1016/j.heliyon.2023.e13601] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 07/10/2022] [Revised: 01/31/2023] [Accepted: 02/05/2023] [Indexed: 02/12/2023] Open
Abstract
The prevalence of cardiovascular diseases is increasing around the world. However, technology is evolving, and these diseases can now be monitored with low-cost sensors anywhere and at any time. This subject is being actively researched, and different methods can automatically identify these diseases, helping patients and healthcare professionals with treatment. This paper presents a systematic review of disease identification, classification, and recognition with ECG sensors. The review focused on studies published between 2017 and 2022 in different scientific databases, including PubMed Central, Springer, Elsevier, Multidisciplinary Digital Publishing Institute (MDPI), IEEE Xplore, and Frontiers. It resulted in the quantitative and qualitative analysis of 103 scientific papers. The study demonstrated that different datasets are available online with data related to various diseases. Several ML/DL-based models were identified in the research, of which Convolutional Neural Network and Support Vector Machine were the most frequently applied algorithms. This review helps identify the techniques that can be used in a system that promotes the patient's autonomy.
Key Words
- AI, Artificial Intelligence
- BNN, Binarized Neural Network
- CNN, Convolutional Neural Networks
- Cardiovascular diseases
- DL, Deep Learning
- DNN, Deep Neural Networks
- Diagnosis
- ECG sensors
- ECG, Electrocardiography
- GAN, Generative Adversarial Networks
- GMM, Gaussian Mixture Model
- GNB, Gaussian Naive Bayes
- GRU, Gated Recurrent Unit
- LASSO, Least Absolute Shrinkage and Selection Operator
- LDA, Linear Discriminant Analysis
- LR, Linear Regression
- LSTM, Long Short-Term Memory
- ML, Machine Learning
- MLP, Multilayer Perceptron
- MLR, Multiple Linear Regression
- NLP, Natural Language Processing
- POAF, Postoperative Atrial Fibrillation
- RF, Random Forest
- RNN, Recurrent Neural Network
- SHAP, SHapley Additive exPlanations
- SVM, Support Vector Machine
- Systematic review
- WHO, World Health Organization
- kNN, k-nearest neighbors
|
Review |
2 |
11 |
13
|
Kishimoto T, Takamiya A, Liang KC, Funaki K, Fujita T, Kitazawa M, Yoshimura M, Tazawa Y, Horigome T, Eguchi Y, Kikuchi T, Tomita M, Bun S, Murakami J, Sumali B, Warnita T, Kishi A, Yotsui M, Toyoshiba H, Mitsukura Y, Shinoda K, Sakakibara Y, Mimura M, PROMPT collaborators. The project for objective measures using computational psychiatry technology (PROMPT): Rationale, design, and methodology. Contemp Clin Trials Commun 2020; 19:100649. [PMID: 32913919 PMCID: PMC7473877 DOI: 10.1016/j.conctc.2020.100649] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/27/2020] [Revised: 08/06/2020] [Accepted: 08/16/2020] [Indexed: 01/08/2023] Open
Abstract
INTRODUCTION Depressive and neurocognitive disorders are debilitating conditions that account for the leading causes of years lived with disability worldwide. However, there are no biomarkers that are objective or easy-to-obtain in daily clinical practice, which leads to difficulties in assessing treatment response and developing new drugs. New technology allows quantification of features that clinicians perceive as reflective of disorder severity, such as facial expressions, phonic/speech information, body motion, daily activity, and sleep. METHODS Major depressive disorder, bipolar disorder, and major and minor neurocognitive disorders as well as healthy controls are recruited for the study. A psychiatrist/psychologist conducts conversational 10-min interviews with participants ≤10 times within up to five years of follow-up. Interviews are recorded using RGB and infrared cameras, and an array microphone. As an option, participants are asked to wear wrist-band type devices during the observational period. Various software is used to process the raw video, voice, infrared, and wearable device data. A machine learning approach is used to predict the presence of symptoms, severity, and the improvement/deterioration of symptoms. DISCUSSION The overall goal of this proposed study, the Project for Objective Measures Using Computational Psychiatry Technology (PROMPT), is to develop objective, noninvasive, and easy-to-use biomarkers for assessing the severity of depressive and neurocognitive disorders in the hopes of guiding decision-making in clinical settings as well as reducing the risk of clinical trial failure. Challenges may include the large variability of samples, which makes it difficult to extract the features that commonly reflect disorder severity. TRIAL REGISTRATION UMIN000021396, University Hospital Medical Information Network (UMIN).
Key Words
- AMED, Japan Agency for Medical Research and Development
- Adabag, Adaptive Bagging
- Adaboost, Adaptive Boosting
- BD, Bipolar disorder
- BDI-II, Beck Depression Inventory, Second Edition
- BNN, Bayesian Neural Networks
- CDR, Clinical Dementia Rating
- CDT, Clock Drawing Test
- CNN, Convolutional Neural Networks
- CPP, cepstral peak prominence
- DSM-5, Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition
- Depression
- F0, fundamental frequency
- F1, F2, F3, first, second, and third formant frequencies
- FedRAMP, Federal Risk and Authorization Management Program
- GCNN, Gated Convolutional Neural Networks
- GDS, Geriatric Depression Scale
- HAM-D, Hamilton Depression Rating Scale
- IEC, International Electrotechnical Commission
- ISO, International Organization for Standardization
- LM, Wechsler Memory Scale-Revised Logical Memory
- LSTM, Long Short-Term Memory Networks
- M.I.N.I., Mini-International Neuropsychiatric Interview
- MADRS, Montgomery-Asberg Depression Rating Scale
- MARS, Motor Agitation and Retardation Scale
- MCI, mild cognitive impairment
- MDD, Major depressive disorder
- MFCC, mel-frequency cepstrum coefficients
- MMSE, Mini-Mental State Examination
- MRI, magnetic resonance imaging
- Machine learning
- MoCA, Montreal Cognitive Assessment
- NPI, Neuropsychiatric Inventory
- Natural language processing
- Neurocognitive disorder
- PET, positron emission tomography
- PROMPT, Project for Objective Measures Using Computational Psychiatry Technology
- PSQI, Pittsburgh Sleep Quality Index
- RF, Random Forest
- RGB, red, green, blue
- SCID, Structured Clinical Interview for DSM-5
- SVM, Support Vector Machine
- SVR, Support Vector Regression
- Screening
- UI, uncertainty interval
- UMIN, University Hospital Medical Information Network
- UV, ultraviolet
- YLDs, years lived with disability
- YMRS, Young Mania Rating Scale
|
research-article |
5 |
6 |
14
|
Tariq A, Jiango Y, Li Q, Gao J, Lu L, Soufan W, Almutairi KF, Habib-ur-Rahman M. Modelling, mapping and monitoring of forest cover changes, using support vector machine, kernel logistic regression and naive bayes tree models with optical remote sensing data. Heliyon 2023; 9:e13212. [PMID: 36785833 PMCID: PMC9918775 DOI: 10.1016/j.heliyon.2023.e13212] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/09/2022] [Revised: 01/16/2023] [Accepted: 01/19/2023] [Indexed: 01/27/2023] Open
Abstract
The present study is designed to monitor the spatio-temporal changes in forest cover using Remote Sensing (RS) and Geographic Information System (GIS) techniques from 1990 to 2017. Landsat data from 1990 (Thematic Mapper [TM]), 2000 and 2010 (Enhanced Thematic Mapper [ETM+]), and 2013 to 2017 (Operational Land Imager/Thermal Infrared Sensor [OLI/TIRS]) were classified into six classes: snow, water, barren land, built-up area, forest, and vegetation. The method was built using multitemporal Landsat images and the machine learning techniques Support Vector Machine (SVM), Naive Bayes Tree (NBT), and Kernel Logistic Regression (KLR). According to the results, forest area decreased from 19,360 km² (26.0%) to 18,784 km² (25.2%) between 1990 and 2010, whereas it increased from 18,640 km² (25.0%) to 26,765 km² (35.9%) between 2013 and 2017 owing to the "One billion tree Project". According to our findings, SVM performed better than KLR and NBT on all three accuracy metrics (recall, precision, and accuracy), and the F1 score was >0.89. The study demonstrated that concurrent reforestation of barren land improved methods of sustaining the forest and supports integrating RS and GIS into everyday forestry organization practices in Khyber Pakhtunkhwa (KPK), Pakistan. The study results are beneficial, especially at the decision-making level for the local or provincial government of KPK and for understanding the global scenario for regional planning.
|
research-article |
2 |
6 |
15
|
Sharifi F, Sharifi I, Babaei Z, Alahdin S, Afgar A. Bioinformatics evaluation of anticancer properties of GP63 protein-derived peptides on MMP2 protein of melanoma cancer. J Pathol Inform 2023; 14:100190. [PMID: 36700237 PMCID: PMC9867975 DOI: 10.1016/j.jpi.2023.100190] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/07/2022] [Revised: 01/09/2023] [Accepted: 01/09/2023] [Indexed: 01/13/2023] Open
Abstract
Background GP63, also known as Leishmanolysin, is a multifunctional virulence factor abundant on the surface of Leishmania spp. Small peptides that are selectively toxic to cancer cells are known as anticancer peptides. We aimed to demonstrate the activity of GP63 and its anticancer properties on melanoma using a range of in silico tools and screening methods to identify predicted and designed anticancer peptides. Methods Various in silico modeling methodologies were used to establish the three-dimensional (3D) structure of GP63. The modeled structures were refined and re-evaluated, the quality of the built models was assessed, and docking was used to find the interacting amino acids between MMP2 and GP63 and its anticancer peptides. AntiCP 2.0 was used to screen anticancer peptides. 2D interaction plots of the protein-ligand complexes were evaluated with the Protein-Ligand Interaction Profiler server. This is the first study to use anticancer peptides of GP63 together with the predicted and designed peptides. Results We used 3 peptides of GP63 based on the AntiCP 2.0 server, with scores of 0.63, 0.53, and 0.49, and common peptides of GP63/MMP2 (a continuous peptide, meaning the completely selected peptide after docking, with a non-anticancer effect; a predicted peptide with a score of 0.58; and designed peptides with scores of 0.47 and 0.45 by the AntiCP 2.0 server). Conclusions The antileishmanial and anticancer peptide research topics exemplify the multidisciplinary nature of peptide research. The advancement of therapeutics targeting cancer and/or Leishmania requires an interconnected research strategy, as shown in this work.
Key Words
- ACPs, anticancer peptides
- Anticancer
- CASTp, Computed Atlas of Surface Topography of proteins
- CL, cutaneous leishmaniasis
- GP63, Glycoprotein 63
- In silico
- Leishmania
- Leishmanolysin
- MD, molecular dynamics
- MMPs, matrix metalloproteases
- MSP, major surface protease
- Matrix metalloproteases
- PDB, Protein Data Bank
- PLIP, Protein–Ligand Interaction Profiler
- Peptide
- Protein–Ligand Interaction Profiler
- ROS, reactive oxygen species formation
- SVM, Support Vector Machine
- VL, visceral leishmaniasis
- kNN, k-Nearest Neighbors
|
research-article |
2 |
3 |
16
|
Afrash MR, Kazemi-Arpanahi H, Shanbehzadeh M, Nopour R, Mirbagheri E. Predicting hospital readmission risk in patients with COVID-19: A machine learning approach. INFORMATICS IN MEDICINE UNLOCKED 2022; 30:100908. [PMID: 35280933 PMCID: PMC8901230 DOI: 10.1016/j.imu.2022.100908] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/18/2022] [Revised: 02/18/2022] [Accepted: 03/06/2022] [Indexed: 01/20/2023] Open
Abstract
Introduction The coronavirus disease 2019 (COVID-19) epidemic strained health systems with severe shortages of hospital resources. In this critical situation, decreasing COVID-19 readmissions could potentially sustain hospital capacity. This study aimed to select the most influential features of COVID-19 readmission and compare the capability of Machine Learning (ML) algorithms to predict COVID-19 readmission based on the selected features. Material and methods Data on 5791 hospitalized patients with COVID-19 were retrospectively collected from a hospital registry system. The LASSO feature selection algorithm was used to select the most important features related to COVID-19 readmission. HistGradientBoosting classifier (HGB), Bagging classifier, Multi-Layered Perceptron (MLP), Support Vector Machine (SVM; kernel = linear), SVM (kernel = RBF), and Extreme Gradient Boosting (XGBoost) classifiers were used for prediction. We evaluated the performance of the ML algorithms with a 10-fold cross-validation method using six performance evaluation metrics. Results Out of the 42 features, 14 were identified as the most relevant predictors. The XGBoost classifier outperformed the other ML models, with an average accuracy of 91.7%, specificity of 91.3%, sensitivity of 91.6%, F-measure of 91.8%, and AUC of 0.91. Conclusion The experimental results show that ML models can satisfactorily predict COVID-19 readmission. Besides considering the risk factors prioritized in this work, categorizing cases with a high risk of reinfection can make the patient triaging procedure and hospital resource utilization more effective.
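The 10-fold cross-validation mentioned in this abstract partitions the cohort into ten disjoint test folds, each used once while the remaining nine serve as training data. The following Python sketch illustrates the index bookkeeping only, using the cohort size from the abstract; the function name, the shuffling seed, and the round-robin fold assignment are assumptions for illustration, as the abstract does not describe how the authors formed their folds.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffle once so folds are not tied to the original record order.
    random.Random(seed).shuffle(indices)
    # Round-robin assignment: fold sizes differ by at most one sample.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# 5791 is the number of hospitalized patients reported in the abstract.
splits = list(k_fold_indices(5791, k=10))
for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_no}: train={len(train_idx)} test={len(test_idx)}")
```

Each patient appears in exactly one test fold, so averaging a metric over the ten folds uses every record for evaluation once without ever testing on data seen in training for that fold.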
Key Words
- AUC, Area under the curve
- Artificial intelligence
- CDSS, Clinical Decision Support Systems
- COVID-19
- COVID-19, Coronavirus disease 2019
- CRISP, Cross-Industry Standard Process
- Coronavirus
- HGB, Hist Gradient Boosting
- LASSO, Least Absolute Shrinkage and Selection Operator
- ML, Machine learning
- MLP, Multi-Layered Perceptron
- Machine learning
- Readmission
- SVM, Support Vector Machine
- XGBoost, Extreme Gradient Boosting
|
research-article |
3 |
2 |
17
|
Passarelli-Araujo H, Passarelli-Araujo H, Urbano MR, Pescim RR. Machine learning and comorbidity network analysis for hospitalized patients with COVID-19 in a city in Southern Brazil. SMART HEALTH (AMSTERDAM, NETHERLANDS) 2022; 26:100323. [PMID: 36159078 PMCID: PMC9485420 DOI: 10.1016/j.smhl.2022.100323] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2022] [Revised: 07/17/2022] [Accepted: 09/13/2022] [Indexed: 12/18/2022]
Abstract
The large amount of data generated during the COVID-19 pandemic requires advanced tools for the long-term prediction of risk factors associated with COVID-19 mortality with higher accuracy. Machine learning (ML) methods directly address this topic and are essential tools to guide public health interventions. Here, we used ML to investigate the importance of demographic and clinical variables on COVID-19 mortality. We also analyzed how comorbidity networks are structured according to age groups. We conducted a retrospective study of COVID-19 mortality with hospitalized patients from Londrina, Parana, Brazil, registered in the database for severe acute respiratory infections (SIVEP-Gripe), from January 2021 to February 2022. We tested four ML models to predict the COVID-19 outcome: Logistic Regression, Support Vector Machine, Random Forest, and XGBoost. We also constructed a comorbidity network to investigate the impact of co-occurring comorbidities on COVID-19 mortality. Our study comprised 8358 hospitalized patients, of whom 2792 (33.40%) died. The XGBoost model achieved excellent performance (ROC-AUC = 0.90). Both permutation method and SHAP values highlighted the importance of age, ventilatory support status, and intensive care unit admission as key features in predicting COVID-19 outcomes. The comorbidity networks for old deceased patients are denser than those for young patients. In addition, the co-occurrence of heart disease and diabetes may be the most important combination to predict COVID-19 mortality, regardless of age and sex. This work presents a valuable combination of machine learning and comorbidity network analysis to predict COVID-19 outcomes. Reliable evidence on this topic is crucial for guiding the post-pandemic response and assisting in COVID-19 care planning and provision.
Key Words
- AUC-ROC, Area under the Receiver-Operating Characteristic curve
- COVID-19, Coronavirus disease 2019
- Co-occurrence analysis
- Epidemiology
- ICU, Intensive Care Unit
- MCC, Matthews Correlation Coefficient
- ML, Machine learning
- Network density
- OR, Odds ratio
- PCA, Principal Component Analysis
- Risk-factors
- SARS-CoV-2
- SARS-CoV-2, Severe acute respiratory syndrome coronavirus 2
- SHAP, Shapley Additive exPlanations
- SIVEP-Gripe, Sistema de Informação de Vigilância Epidemiológica da Gripe
- SVM, Support Vector Machine
- XGBoost, Extreme Gradient Boosting
|
research-article |
3 |
1 |
18
|
Teferra DM, Ngoo LM, Nyakoe GN. Fuzzy-based prediction of solar PV and wind power generation for microgrid modeling using particle swarm optimization. Heliyon 2023; 9:e12802. [PMID: 36704286 PMCID: PMC9871071 DOI: 10.1016/j.heliyon.2023.e12802] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/25/2022] [Revised: 12/30/2022] [Accepted: 01/02/2023] [Indexed: 01/06/2023] Open
Abstract
Despite their stochastic and uncertain nature, wind and solar resources are the most abundant energy resources used in the development of microgrid systems. In microgrid systems and distribution networks, the uncertain nature of both solar and wind resources results in power quality and system stability issues. The random behavior of solar and wind energy resources can be managed by developing an accurate power prediction model. Fuzzy-based solar PV and wind prediction models may manage this randomness and uncertainty more efficiently. However, this method has several drawbacks: its performance is limited when the volume of historical wind and solar resource data is large, and it involves many membership functions for the fuzzy input and output variables as well as multiple fuzzy rules. The hybrid Fuzzy-PSO intelligent prediction approach mitigates the fuzzy system's limitations and hence increases the prediction model's performance. The Fuzzy-PSO hybrid forecast model is developed using MATLAB programming of the particle swarm optimization (PSO) algorithm with the help of the global optimization toolbox. In this paper, an error correction factor (ECF) is introduced as a new fuzzy input variable. It depends on the validation and forecasted data values of both wind and solar prediction models and is used to improve the accuracy of the prediction model. The impact of ECF is observed in fuzzy, Fuzzy-PSO, and Fuzzy-GA wind and solar PV power forecasting models. The hybrid Fuzzy-PSO prediction model of wind and solar power generation has a high degree of accuracy compared to the Fuzzy and Fuzzy-GA forecasting models. The rest of this paper is organized as follows: Section II covers the analysis of solar and wind resource raw data. The Fuzzy-PSO prediction model problem formulation is covered in Section III. Section IV presents the results and discussion of the study. Section V contains the conclusion. The references and abbreviations are presented at the end of the paper.
Key Words
- ANFIS, Adaptive Neuro-Fuzzy Inference System
- ANN, Artificial Neural Network
- ARIMA, Autoregressive Integrated Moving Average
- ARMA, Auto-Regressive Moving Average
- BPNN, Back Propagation Neural Network
- CA, Cultural Algorithm
- CNN, Convolutional Neural Network
- DNI, Direct Normal Insolation
- DSI, Diffused Solar Insolation
- ECF, Error Correction Factor
- FF, Firefly Algorithm
- FOA, Fruit Fly Optimization Algorithm
- FR, Fuzzy Regression
- Fuzzy system
- Fuzzy-GA Hybrid algorithm
- Fuzzy-PSO Algorithm
- GA, Genetic Algorithm
- GHI, Global Horizontal Irradiance
- LSSVM, Least-Square Support Vector Machine
- MAPE, Mean Absolute Percentage Error
- NRMSE, Normalized Root-Mean-Square Error
- PSO, Particle Swarm Optimization
- PV, Photovoltaic
- Particle swarm optimization
- SVM, Support Vector Machine
- SVR, Support Vector Regression
- Solar power prediction model
- Wind power prediction model
|
research-article |
2 |
1 |
19
|
Abdulmohsin HA, Al-Khateeb B, Hasan SS, Dwivedi R. Automatic illness prediction system through speech. COMPUTERS & ELECTRICAL ENGINEERING : AN INTERNATIONAL JOURNAL 2022; 102:108224. [PMID: 35880184 PMCID: PMC9302036 DOI: 10.1016/j.compeleceng.2022.108224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2022] [Revised: 06/29/2022] [Accepted: 07/08/2022] [Indexed: 06/15/2023]
Abstract
Due to the COVID-19 epidemic and the curfew it caused, many people have sought an automated disease prediction system (ADPS) on the internet in the last few years. This hints at a new age of medical treatment, all the more so if the number of internet users continues to expand. As a result, online automatic illness prediction applications have attracted the interest of a large number of researchers worldwide. This work aims to develop and implement an automated illness prediction system based on speech. The system was designed to predict the type of ailment a patient is suffering from based on their voice, but this was not feasible during the trial; therefore, the diseases were divided into three categories (painful, light pain, and psychological pain), and the diagnosis process was implemented accordingly. The medical dataset named "speech, transcription, and intent" served as the baseline for this study. The smoothness, MFCC, and SCV features were used in this work and proved highly representative of human medical conditions. A forward-backward noise reduction filter was used to eliminate noise from wave files captured online, in order to account for the high level of noise seen in the deployed dataset. For this study, a hybrid feature selection method was developed that combined the output of a genetic algorithm (GA) with the inputs of a NN algorithm. Classification was performed using SVM, neural network, and GMM. The best result obtained was 94.55% illness classification accuracy, achieved with SVM. The results showed that diagnosing illness through speech is a difficult process, especially when diagnosing each type of illness separately; when the different illness types were grouped according to the amount of pain and the psychological situation of the patient, the results were much higher.
Key Words
- ADPS, Automated Disease Prediction System
- Automatic disease prediction
- CPU, Central Processing Unit
- Forward-backward filter
- GA, Genetic Algorithm
- GB, Gigabyte
- GMM, Gaussian Mixture Model
- MFCC, Mel Frequency Cepstral Coefficient
- Medical speech transcription and intent dataset
- Mel frequency Cepstral coefficient
- NN, Neural Network
- Neural network
- RAM, Random Access Memory
- RSM, Response Service Methodology
- SCV, Spectral Centroid Variability
- SVM, Support Vector Machine
- Spectral centroid variability
|
research-article |
3 |
|
20
|
Kumar S, Balaya RDA, Kanekar S, Raju R, Prasad TSK, Kandasamy RK. Computational tools for exploring peptide-membrane interactions in gram-positive bacteria. Comput Struct Biotechnol J 2023; 21:1995-2008. [PMID: 36950221 PMCID: PMC10025024 DOI: 10.1016/j.csbj.2023.02.051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 02/27/2023] [Accepted: 02/27/2023] [Indexed: 03/05/2023] Open
Abstract
The vital cellular functions in Gram-positive bacteria are controlled by signaling molecules known as quorum sensing peptides (QSPs), considered promising therapeutic interventions for bacterial infections. In the bacterial system QSPs bind to membrane-coupled receptors, which then auto-phosphorylate and activate intracellular response regulators. These response regulators induce target gene expression in bacteria. One of the most reliable trends in drug discovery research for virulence-associated molecular targets is the use of peptide drugs or new functionalities. In this perspective, computational methods act as auxiliary aids for biologists, where methodologies based on machine learning and in silico analysis are developed as suitable tools for target peptide identification. Therefore, the development of quick and reliable computational resources to identify or predict these QSPs along with their receptors and inhibitors is receiving considerable attention. The databases such as Quorumpeps and Quorum Sensing of Human Gut Microbes (QSHGM) provide a detailed overview of the structures and functions of QSPs. The tools and algorithms such as QSPpred, QSPred-FL, iQSP, EnsembleQS and PEPred-Suite have been used for the generic prediction of QSPs and feature representation. The availability of compiled key resources for utilizing peptide features based on amino acid composition, positional preferences, and motifs as well as structural and physicochemical properties, including biofilm inhibitory peptides, can aid in elucidating the QSP and membrane receptor interactions in infectious Gram-positive pathogens. Herein, we present a comprehensive survey of diverse computational approaches that are suitable for detecting QSPs and QS interference molecules. This review highlights the utility of these methods for developing potential biomarkers against infectious Gram-positive pathogens.
Key Words
- 3-HBA, 3-Hydroxybenzoic Acid
- AAC, Amino Acid Composition
- ABC, ATP-binding cassette
- ACD, Available Chemicals Database
- AIP, Autoinducing Peptide
- AMP, Anti-Microbial Peptide
- ATP, Adenosine Triphosphate
- Agr, Accessory gene regulator
- BFE, Binding Free Energy
- BIP Inhibitors
- BIP, Biofilm Inhibitory Peptides
- BLAST, Basic Local Alignment Search Tool
- BNB, Bernoulli Naïve-Bayes
- CADD, Computer-Aided Drug Design
- CSP, Competence Stimulating Peptide
- CTD, Composition-Transition-Distribution
- D, Aspartate
- DCH, 3,3′-(3,4-dichlorobenzylidene)-bis-(4-hydroxycoumarin)
- DT, Decision Tree
- FDA, Food and Drug Administration
- GBM, Gradient Boosting Machine
- GDC, g-gap Dipeptide
- GNB, Gaussian NB
- Gram-positive bacteria
- H, Histidine
- H-Kinase, Histidine Kinase
- H-phosphotransferase, Histidine Phosphotransferase
- HAM, Hamamelitannin
- HGM, Human Gut Microbiota
- HNP, Human Neutrophil Peptide
- IT, Information Theory Features
- In silico approaches
- KNN, K-Nearest Neighbors
- MCC, Matthews Correlation Coefficient
- MD, Molecular Dynamics
- MDR, Multiple Drug Resistance
- ML, Machine Learning
- MRSA, Methicillin Resistant S. aureus
- MSL, Multiple Sequence Alignment
- OMR, Omarigliptin
- OVP, Overlapping Property Features
- PCP, Physicochemical Properties
- PDB, Protein Data Bank
- PPIs, Protein-Protein Interactions
- PSM, Phenol-Soluble Modulin
- PTM, Post Translational Modification
- QS, Quorum Sensing
- QSCN, QS communication network
- QSHGM, Quorum Sensing of Human Gut Microbes
- QSI, QS Inhibitors
- QSIM, QS Interference Molecules
- QSP inhibitors
- QSP predictors
- QSP, QS Peptides
- QSPR, Quantitative Structure Property Relationship
- Quorum sensing peptides
- RAP, RNAIII-activating protein
- RF, Random Forest
- RIP, RNAIII-inhibiting peptide
- ROC, Receiver Operating Characteristic
- SAR, Structure-Activity Relationship
- SFS, Sequential Forward Search
- SIT, Sitagliptin
- SVM, Support Vector Machine
- TCS, Two-Component Sensory
- TRAP, Target of RAP
- TRG, Trelagliptin
- WHO, World Health Organization
- mRMR, minimum Redundancy and Maximum Relevance
|
Review |
2 |
|