1
Gugulothu P, Bhukya R. Exploring coronavirus sequence motifs through convolutional neural network for accurate identification of COVID-19. Comput Methods Biomech Biomed Engin 2024:1-15. [PMID: 39508163 DOI: 10.1080/10255842.2024.2404149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 04/22/2024] [Accepted: 09/05/2024] [Indexed: 11/08/2024]
Abstract
The SARS-CoV-2 virus reportedly originated in Wuhan in 2019, causing the coronavirus disease (COVID-19) outbreak, which was formally designated a global epidemic. Numerous studies have been carried out to diagnose and treat COVID-19 during the disease's spread. However, the genetic similarity between COVID-19 and other types of coronaviruses makes it challenging to differentiate between them. Therefore, it is essential to swiftly identify whether an epidemic is caused by a brand-new virus or a well-known disease. In the present article, the DeepCoV deep-learning (DL) approach uses layered convolutional neural networks (CNNs) to distinguish severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) from other viral diseases. Additionally, various motifs linked with SARS-CoV-2 can be located by examining the computational filter processes. In identifying these important motifs, DeepCoV makes the behavior of CNNs more transparent. Experiments were conducted using the 2019nCoVR datasets, and the results indicate that DeepCoV performed more accurately than several benchmark ML models. DeepCoV achieved a maximum area under the precision-recall curve (AUCPR) of 98.62% and a maximum area under the receiver operating characteristic curve (AUC-ROC) of 98.58%. Overall, these investigations support the use of DL algorithms as an important alternative for identifying SARS-CoV-2 and for finding disease-related patterns in SARS-CoV-2 genes.
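As a rough illustration of how a convolutional filter can act as a motif detector, the sketch below one-hot encodes a nucleotide sequence and slides a single hand-set filter across it. The filter weights, the motif `TATA`, the example sequence, and all function names are illustrative assumptions; none of them are taken from DeepCoV, whose filters are learned during training.

```python
# Illustrative sketch only: one hand-set convolutional filter scanning a
# one-hot encoded nucleotide sequence. This is not the DeepCoV architecture.

ALPHABET = "ACGT"


def one_hot(seq):
    """Encode each nucleotide as a 4-element 0/1 vector in A, C, G, T order."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq]


def motif_filter(motif):
    """Hypothetical filter: its weights are simply the one-hot encoding of `motif`."""
    return one_hot(motif)


def convolve(encoded, filt):
    """Valid 1D cross-correlation, giving one activation per window position."""
    width = len(filt)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = 0.0
        for offset in range(width):
            row = encoded[start + offset]
            score += sum(x * w for x, w in zip(row, filt[offset]))
        scores.append(score)
    return scores


sequence = "GGCTATAAGC"  # made-up example sequence
scores = convolve(one_hot(sequence), motif_filter("TATA"))
# The window with the highest activation is where the filter's motif fits best.
best_position = max(range(len(scores)), key=scores.__getitem__)
```

In a trained network the filter weights are learned rather than set by hand, and inspecting which subsequences activate a filter most strongly is one common way to read motifs back out of it.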
Affiliation(s)
- Praveen Gugulothu
  - Computer Science and Engineering, National Institute of Technology, Warangal, India
- Raju Bhukya
  - Computer Science and Engineering, National Institute of Technology, Warangal, India
2
Fakieh B, Saleem F. COVID-19 from symptoms to prediction: A statistical and machine learning approach. Comput Biol Med 2024; 182:109211. [PMID: 39342677 DOI: 10.1016/j.compbiomed.2024.109211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Revised: 09/02/2024] [Accepted: 09/23/2024] [Indexed: 10/01/2024]
Abstract
During the COVID-19 pandemic, the analysis of patient data has become a cornerstone for developing effective public health strategies. This study leverages a dataset comprising over 10,000 anonymized patient records from various leading medical institutions to predict COVID-19 patient age groups using a suite of statistical and machine learning techniques. Initially, extensive statistical tests including ANOVA and t-tests were used to assess relationships among demographic and symptomatic variables. The study then employed machine learning models such as Decision Tree, Naïve Bayes, KNN, Gradient Boosted Trees, Support Vector Machine, and Random Forest, with rigorous data preprocessing to enhance model accuracy. Further improvements were sought through ensemble methods: bagging, boosting, and stacking. Our findings indicate strong associations between key symptoms and patient age groups, with ensemble methods significantly enhancing model accuracy. Specifically, stacking with random forest as the meta-learner exhibited the highest accuracy (0.7054). In addition, stacking notably improved the performance of K-Nearest Neighbors (from 0.529 to 0.63) and Naïve Bayes (from 0.554 to 0.622) and proved the most successful prediction method. The study aimed to understand the number of symptoms identified in COVID-19 patients and their association with different age groups. The results can assist doctors and higher authorities in improving treatment strategies. Additionally, several decision-making techniques tailored to specific age groups can be applied during a pandemic, such as resource allocation, medicine availability, vaccine development, and treatment strategies. The integration of these predictive models into clinical settings could support real-time public health responses and targeted intervention strategies.
Affiliation(s)
- Bahjat Fakieh
  - Department of Information System, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Farrukh Saleem
  - School of Built Environment, Engineering, and Computing, Leeds Beckett University, Leeds, LS6 3QR, UK.
3
Gugulothu P, Bhukya R. Coot-Lion optimized deep learning algorithm for COVID-19 point mutation rate prediction using genome sequences. Comput Methods Biomech Biomed Engin 2024; 27:1410-1429. [PMID: 37668061 DOI: 10.1080/10255842.2023.2244109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Revised: 07/08/2023] [Accepted: 07/28/2023] [Indexed: 09/06/2023]
Abstract
In this study, a deep quantum neural network (Deep QNN) based on the Lion-based Coot algorithm (LBCA-based Deep QNN) is employed to predict COVID-19. First, features are extracted from the genome sequences. The features are then fused using the Bray-Curtis distance and a deep belief network (DBN). Lastly, the Deep QNN is used to predict COVID-19. The LBCA is obtained by integrating the Coot algorithm and LOA. The COVID-19 predictions are made from mutation points. The LBCA-based Deep QNN performed best, with a testing accuracy of 0.941, a true positive rate of 0.931, and a false positive rate of 0.869.
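For readers unfamiliar with the Bray-Curtis distance mentioned in the abstract, a minimal sketch of the standard definition follows. How the paper applies it during feature fusion is not described here, so the function name and the example vectors are assumptions for illustration only.

```python
# Minimal sketch of the standard Bray-Curtis distance between two feature
# vectors. The example vectors are made up; the paper's fusion step is not shown.

def bray_curtis(u, v):
    """Return sum(|u_i - v_i|) / sum(|u_i + v_i|) for equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(abs(a + b) for a, b in zip(u, v))
    if denominator == 0:
        # Both vectors are all zeros: treat them as identical.
        return 0.0
    return numerator / denominator


feature_a = [1.0, 2.0, 3.0]
feature_b = [2.0, 2.0, 4.0]
distance = bray_curtis(feature_a, feature_b)  # 2 / 14
```

For non-negative vectors the value lies between 0 (identical) and 1 (no shared mass), which makes it convenient for ranking how similar two features are.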
Affiliation(s)
- Praveen Gugulothu
  - Department of Computer Science and Engineering, National Institute of Technology Warangal, Hanamkonda, Telangana 506004, India
- Raju Bhukya
  - Department of Computer Science and Engineering, National Institute of Technology Warangal, Hanamkonda, Telangana 506004, India
4
Dubey S, Verma DK, Kumar M. Real-time infectious disease endurance indicator system for scientific decisions using machine learning and rapid data processing. PeerJ Comput Sci 2024; 10:e2062. [PMID: 39145255 PMCID: PMC11323025 DOI: 10.7717/peerj-cs.2062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 04/25/2024] [Indexed: 08/16/2024]
Abstract
The SARS-CoV-2 virus, which induces an acute respiratory illness commonly referred to as COVID-19, was designated a pandemic by the World Health Organization due to its highly infectious nature and the public health risks it poses globally. Identifying the critical factors for predicting mortality is essential for improving patient therapy. Unlike other data types, such as computed tomography scans, X-rays, and ultrasounds, basic blood test results are widely accessible and can aid in predicting mortality. The present research advocates the use of machine learning (ML) methodologies for predicting the likelihood of mortality from infectious diseases such as COVID-19 by leveraging blood test data. Age, LDH (lactate dehydrogenase), lymphocytes, neutrophils, and hs-CRP (high-sensitivity C-reactive protein) are five extremely potent characteristics that, when combined, can accurately predict mortality in 96% of cases. By combining XGBoost feature importance with neural network classification, the optimal approach can predict mortality from infectious disease with exceptional accuracy, achieving a precision of 90% up to 16 days before the event. The suggested model's excellent predictive performance and practicality were confirmed through testing with three instances that depended on the days to the outcome. By carefully analyzing and identifying patterns in these significant biomarkers, insightful information has been obtained for simple application. This study offers potential remedies that could accelerate decision-making for targeted medical treatments within healthcare systems, using a timely, accurate, and reliable method.
Affiliation(s)
- Shivendra Dubey
  - Computer Science and Engineering, Jaypee University of Engineering and Technology, Guna, Madhya Pradesh, India
- Dinesh Kumar Verma
  - Computer Science and Engineering, Jaypee University of Engineering and Technology, Guna, Madhya Pradesh, India
- Mahesh Kumar
  - Computer Science and Engineering, Jaypee University of Engineering and Technology, Guna, Madhya Pradesh, India
5
Andrioaia DA, Gaitan VG. Finding fault types of BLDC motors within UAVs using machine learning techniques. Heliyon 2024; 10:e30251. [PMID: 38711625 PMCID: PMC11070806 DOI: 10.1016/j.heliyon.2024.e30251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 04/22/2024] [Accepted: 04/23/2024] [Indexed: 05/08/2024]
Abstract
Owing to their potential, Unmanned Aerial Vehicles (UAVs) are increasingly used in various fields, such as environment, leisure, health, military, and transport. As battery storage capacity increased, UAVs began to be propelled by Brushless DC (BLDC) motors. Failure of BLDC motors can lead to loss of control, which can cause accidents. Under these conditions, methods are needed that can find the defects of the BLDC motors in UAVs. In this article, the authors propose a novel method to predict BLDC motor defects using machine learning. To maximize the method's results, the performance of three machine learning models, K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Bayesian Network (BN), in predicting the flaws of BLDC motors has been compared.
Affiliation(s)
- Dragos Alexandru Andrioaia
  - "Vasile Alecsandri" University of Bacau, Bacau, 600115, Romania
  - "Stefan cel Mare" University of Suceava, Suceava, 720229, Romania
6
Nawaz MS, Fournier-Viger P, Nawaz S, Zhu H, Yun U. SPM4GAC: SPM based approach for genome analysis and classification of macromolecules. Int J Biol Macromol 2024; 266:130984. [PMID: 38513910 DOI: 10.1016/j.ijbiomac.2024.130984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Accepted: 03/16/2024] [Indexed: 03/23/2024]
Abstract
Genome sequence analysis and classification play critical roles in properly understanding an organism's main characteristics, functionalities, and changing (evolving) nature. However, the rapid expansion of genomic data makes genome sequence analysis and classification a challenging task due to the high computational requirements, proper management, and understanding of genomic data. Recently proposed models yielded promising results for the task of genome sequence classification. Nevertheless, these models often ignore the sequential nature of nucleotides, which is crucial for revealing their underlying structure and function. To address this limitation, we present SPM4GAC, a sequential pattern mining (SPM)-based framework to analyze and classify the macromolecule genome sequences of viruses. First, a large dataset containing the genome sequences of various RNA viruses is developed and transformed into a suitable format. On the transformed dataset, algorithms for SPM are used to identify frequent sequential patterns of nucleotide bases. The obtained frequent sequential patterns of bases are then used as features to classify different viruses. Ten classifiers are employed, and their performance is assessed by using several evaluation measures. Finally, a performance comparison of SPM4GAC with state-of-the-art methods for genome sequence classification/detection reveals that SPM4GAC performs better than those methods.
Affiliation(s)
- M Saqib Nawaz
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China.
- Shoaib Nawaz
  - Department of Pharmacy, The University of Lahore, Sargodha Campus, Pakistan.
- Haowei Zhu
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China.
- Unil Yun
  - Sejong University, Seoul, Republic of Korea.
7
HaLiMaiMaiTi N, Hong Y, Li M, Li H, Wang Y, Chen C, Lv X, Chen C. Classification of benign and malignant parotid tumors based on CT images combined with stack generalization model. Med Biol Eng Comput 2023; 61:3123-3135. [PMID: 37656333 DOI: 10.1007/s11517-023-02898-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Accepted: 07/09/2023] [Indexed: 09/02/2023]
Abstract
Parotid tumors are among the most prevalent tumors in otolaryngology, and malignant parotid tumors are one of the main causes of facial paralysis in patients. Currently, the main diagnostic modality for parotid tumors is computed tomography, which relies mainly on the subjective judgment of clinicians and leads to practical problems such as high workloads. Therefore, to assist physicians in solving the preoperative classification problem, a stacked generalization model is proposed for the automated classification of parotid tumor images. A ResNet50 pretrained model is used for feature extraction. The first layer of the adopted stacked generalization model consists of multiple weak learners, and the results of the weak learners are integrated as input data in a meta-classifier in the second layer. The output results of the meta-classifier are the final classification results. The classification accuracy of the stacked generalization model reaches 91%. Comparing the classification results under different classifiers, the stacked generalization model used in this study can identify benign and malignant tumors in the parotid gland effectively, thus relieving physicians of tedious work pressure.
Affiliation(s)
- Yue Hong
  - People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, 830001, China
- Min Li
  - College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, China
- Hongtao Li
  - The Affiliated Cancer Hospital of Xinjiang Medical University, Urumqi, 830011, China
- Yunling Wang
  - The First Affiliated Hospital of Xinjiang Medical University, Urumqi, 830000, China
- Chen Chen
  - College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, China
- Xiaoyi Lv
  - College of Software, Xinjiang University, Urumqi, 830046, China.
  - Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830046, China.
  - Key Laboratory of Signal Detection and Processing, Xinjiang University, Urumqi, 830046, China.
- Cheng Chen
  - College of Software, Xinjiang University, Urumqi, 830046, China.
8
Akbari Rokn Abadi S, Mohammadi A, Koohi S. A new profiling approach for DNA sequences based on the nucleotides' physicochemical features for accurate analysis of SARS-CoV-2 genomes. BMC Genomics 2023; 24:266. [PMID: 37202721 DOI: 10.1186/s12864-023-09373-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Accepted: 05/11/2023] [Indexed: 05/20/2023]
Abstract
BACKGROUND: The prevalence of the COVID-19 disease in recent years and its widespread impact on mortality, as well as on various aspects of life around the world, has made it important to study this disease and its viral cause. However, the very long sequences of this virus increase the processing time, complexity of calculation, and memory consumption required by the available tools to compare and analyze the sequences. RESULTS: We present a new encoding method, named PC-mer, based on k-mers and the physicochemical properties of nucleotides. This method reduces the size of encoded data by around 2^k times compared to the classical k-mer based profiling method. Moreover, using PC-mer, we designed two tools: 1) a machine-learning-based classification tool for coronavirus family members with the ability to receive input sequences from the NCBI database, and 2) an alignment-free computational comparison tool for calculating dissimilarity scores between coronaviruses at the genus and species levels. CONCLUSIONS: PC-mer achieves 100% accuracy despite the use of very simple classification algorithms based on Machine Learning. Assuming dynamic programming-based pairwise alignment as the ground truth approach, we achieved a degree of convergence of more than 98% for coronavirus genus-level sequences and 93% for SARS-CoV-2 sequences using PC-mer in the alignment-free classification method. This outperformance of PC-mer suggests that it can serve as a replacement for alignment-based approaches in certain sequence analysis applications that rely on similarity/dissimilarity scores, such as searching sequences, comparing sequences, and certain types of phylogenetic analysis methods that are based on sequence comparison.
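To make the baseline concrete, here is a minimal sketch of the classical k-mer profile that PC-mer is compared against: a count vector with one entry per possible k-mer, so its length is 4^k. This is not the PC-mer encoding itself, which the abstract does not specify; the function name and the example sequence are assumptions.

```python
# Minimal sketch of a classical k-mer count profile (the baseline, not PC-mer).
from itertools import product


def kmer_profile(seq, k):
    """Count every k-mer over the alphabet ACGT; the profile has 4**k entries."""
    kmers = ["".join(letters) for letters in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
    return [counts[kmer] for kmer in kmers]


profile = kmer_profile("ACGTAC", 2)  # made-up sequence; 16 entries for k = 2
```

Because the profile length grows as 4^k, memory use rises quickly with k, which is the cost that more compact encodings aim to reduce.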
Affiliation(s)
- Somayyeh Koohi
  - Department of Computer Engineering, Sharif University of Technology, Tehran, Iran.
9
Khodaei A, Shams P, Sharifi H, Mozaffari-Tazehkand B. Identification and classification of coronavirus genomic signals based on linear predictive coding and machine learning methods. Biomed Signal Process Control 2023; 80:104192. [PMID: 36168586 PMCID: PMC9500098 DOI: 10.1016/j.bspc.2022.104192] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/19/2022] [Revised: 08/12/2022] [Accepted: 09/10/2022] [Indexed: 11/30/2022]
Abstract
Corona disease has become one of the problems and challenges of humankind over the past two years. One of the problems that existed from the first days of this epidemic was that its clinical symptoms resemble those of other viral infections such as colds and influenza. Therefore, diagnosing this disease and developing coping and treatment approaches have also been difficult. In this study, attempts have been made to investigate the origin of this disease and the genetic structure of the virus leading to it. For this purpose, signal processing and linear predictive coding approaches, which are widely used in data compression, were applied. A pattern recognition model was presented for the detection and separation of COVID samples from influenza virus case studies. This model, which was based on a support vector machine classifier, was tested successfully on several datasets collected from different countries. The model achieved more than 98% accuracy on all datasets. In addition to its good accuracy, the proposed model can be a step forward in quantifying and digitizing medical big data.
Affiliation(s)
- Amin Khodaei
  - Faculty of Electrical & Computer Engineering of University of Tabriz, 29 Bahman Blvd, Tabriz, Iran
- Parvaneh Shams
  - Computer Engineering Department, Istanbul Aydin University, Turkey
- Hadi Sharifi
  - Faculty of Electrical & Computer Engineering of University of Tabriz, 29 Bahman Blvd, Tabriz, Iran
10
Das B. An implementation of a hybrid method based on machine learning to identify biomarkers in the Covid-19 diagnosis using DNA sequences. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS : AN INTERNATIONAL JOURNAL SPONSORED BY THE CHEMOMETRICS SOCIETY 2022; 230:104680. [PMID: 36213553 PMCID: PMC9528020 DOI: 10.1016/j.chemolab.2022.104680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Revised: 09/20/2022] [Accepted: 09/27/2022] [Indexed: 06/16/2023]
Abstract
Although some people do not have any chronic disease and are not in the risky age group for Covid-19, they are more vulnerable to the coronavirus. To explain this, some experts focus on the person's immune system, while others think that the genetic history of patients may play a role. It is critical to detect corona from DNA signals as early as possible to determine the relationship between Covid-19 and genes. Doing so will reveal how variations in the genes associated with the disease affect its severe course. In this study, a novel intelligent computer approach is proposed to identify coronavirus from nucleotide signals for the first time. The proposed method presents a multilayered feature extraction structure that extracts the most effective features using an entropy-based mapping technique, Discrete Wavelet Transform (DWT), a statistical feature extractor, and Singular Value Decomposition (SVD) together. Then 94 distinctive features are selected by the ReliefF technique. Support vector machine (SVM) and k-nearest neighbor (k-NN) are chosen as classifiers. The method achieved its highest classification accuracy, 98.84%, with an SVM classifier to detect Covid-19 from DNA signals. The proposed method is ready to be tested with a different database in the diagnosis of Covid-19 using RNA or other signals.
Affiliation(s)
- Bihter Das
  - Department of Software Engineering, Technology Faculty, Firat University, 23119, Elazig, Turkey
11
Zazzaro G, Pavone L. Machine Learning Characterization of Ictal and Interictal States in EEG Aimed at Automated Seizure Detection. Biomedicines 2022; 10:biomedicines10071491. [PMID: 35884796 PMCID: PMC9312966 DOI: 10.3390/biomedicines10071491] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/26/2022] [Revised: 06/15/2022] [Accepted: 06/22/2022] [Indexed: 11/16/2022]
Abstract
Background: The development of automated seizure detection methods using EEG signals could be of great importance for the diagnosis and the monitoring of patients with epilepsy. These methods are often patient-specific and require high accuracy in detecting seizures but also very low false-positive rates. The aim of this study is to evaluate the performance of a seizure detection method using EEG signals by investigating its performance in correctly identifying seizures and in minimizing false alarms and to determine if it is generalizable to different patients. Methods: We tested the method on about two hours of preictal/ictal and about ten hours of interictal EEG recordings of one patient from the Freiburg Seizure Prediction EEG database using machine learning techniques for data mining. Then, we tested the obtained model on six other patients of the same database. Results: The method achieved very high performance in detecting seizures (close to 100% of correctly classified positive elements) with a very low false-positive rate when tested on one patient. Furthermore, the model portability or transfer analysis revealed that the method achieved good performance in one out of six patients from the same dataset. Conclusions: This result suggests a strategy to discover clusters of similar patients, for which it would be possible to train a general-purpose model for seizure detection.
Affiliation(s)
- Gaetano Zazzaro
  - C.I.R.A., Italian Aerospace Research Centre, Via Maiorise s.n.c., 81043 Capua, Italy
- Luigi Pavone
  - I.R.C.C.S. Neuromed, Via Atinense, 18, 86077 Pozzilli, Italy
12
Ogunjo ST, Fuwape IA, Rabiu AB. Predicting COVID-19 Cases From Atmospheric Parameters Using Machine Learning Approach. GEOHEALTH 2022; 6:e2021GH000509. [PMID: 35415381 PMCID: PMC8983058 DOI: 10.1029/2021gh000509] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/03/2021] [Revised: 02/06/2022] [Accepted: 02/28/2022] [Indexed: 06/14/2023]
Abstract
The dynamical nature of COVID-19 cases in different parts of the world requires robust mathematical approaches for prediction and forecasting. In this study, we aim to (a) forecast future COVID-19 cases based on past infections and (b) predict current COVID-19 cases using PM2.5, temperature, and humidity data, using four different machine learning classifiers (Decision Tree, K-nearest neighbor, Support Vector Machine, and Random Forest). Based on RMSE values, the k-nearest neighbor and support vector machine algorithms were found to be the best for predicting future incidences of COVID-19 based on past histories. From the RMSE values obtained, temperature was found to be the best predictor of the number of COVID-19 cases, followed by relative humidity. Decision tree models were found to perform poorly in the prediction of COVID-19 cases when particulate matter and atmospheric parameters were used as predictors. Our results suggest the possibility of predicting virus infection using machine learning. This will guide policy makers in proactive monitoring and control.
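Since the comparison above rests on RMSE, the following minimal sketch shows how models can be ranked by that metric. The case counts and the model names `model_a` and `model_b` are made-up placeholders, not data or models from the study.

```python
# Minimal sketch: ranking hypothetical models by root-mean-square error (RMSE).
import math


def rmse(actual, predicted):
    """Square root of the mean squared difference between two equal-length series."""
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Placeholder daily case counts and predictions (not from the paper).
observed = [10, 20, 30]
predictions = {
    "model_a": [12, 18, 33],
    "model_b": [10, 25, 40],
}
errors = {name: rmse(observed, pred) for name, pred in predictions.items()}
best_model = min(errors, key=errors.get)  # lower RMSE means a closer fit
```

RMSE is expressed in the same units as the target (here, cases), and it penalizes large misses more heavily than small ones.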
Affiliation(s)
- S. T. Ogunjo
  - Department of Physics, Federal University of Technology Akure, Akure, Nigeria
- I. A. Fuwape
  - Department of Physics, Federal University of Technology Akure, Akure, Nigeria
  - Office of the Vice Chancellor, Michael and Cecilia Ibru University, Ughelli, Nigeria
- A. B. Rabiu
  - Centre for Atmospheric Research, National Space and Research Development Agency, Anyigba, Nigeria
13
Wang SY, Bi WH, Gan WY, Li XY, Zhang BJ, Fu GW, Jiang TJ. Identification of ichthyotoxic red tide algae based on three-dimensional fluorescence spectra and particle swarm optimization support vector machine. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2022; 268:120711. [PMID: 34902694 DOI: 10.1016/j.saa.2021.120711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2021] [Revised: 11/11/2021] [Accepted: 12/01/2021] [Indexed: 06/14/2023]
Abstract
Accurately identifying whether a red tide is ichthyotoxic is very important for microalgae monitoring. To realize rapid and non-destructive detection of ichthyotoxic red tide algae, a detection method combining three-dimensional (3D) fluorescence spectra and a particle swarm optimization support vector machine (PSO-SVM) was developed to monitor ichthyotoxic red tide algae at cell concentrations from 10^4 cells/mL to 10^6 cells/mL. Contour maps constructed from the three-dimensional fluorescence spectra of six common species of ichthyotoxic algae and eight common species of non-ichthyotoxic algae are analyzed to select the optimal emission and excitation wavelength span. The new feature data are acquired from the emission spectrum data at the 480 nm and 510 nm excitation wavelengths. These feature data are used as the input of the PSO-SVM to establish the optimal classification model of ichthyotoxic algae, which achieves a classification accuracy of 100% on the test set. The optimal classification model is successfully applied to identify the ichthyotoxicity of different algae including Heterosigma akashiwo, Chattonella marina, Phaeocystis globosa, Prorocentrum donghaiense, Karenia dunnii, Isoscelina galbana, Isosceles globosa and Skeletonema costatum.
Affiliation(s)
- Si-Yuan Wang
  - School of Information Science and Engineering, Yanshan University, The Key Laboratory for Special Fiber and Fiber Sensor of Hebei Province, Qinhuangdao 066004, China
- Wei-Hong Bi
  - School of Information Science and Engineering, Yanshan University, The Key Laboratory for Special Fiber and Fiber Sensor of Hebei Province, Qinhuangdao 066004, China.
- Wen-Yu Gan
  - Research Center for Harmful Algae and marine biology, Jinan University, Guangzhou 510632, China
- Xin-Yu Li
  - School of Information Science and Engineering, Yanshan University, The Key Laboratory for Special Fiber and Fiber Sensor of Hebei Province, Qinhuangdao 066004, China
- Bao-Jun Zhang
  - School of Information Science and Engineering, Yanshan University, The Key Laboratory for Special Fiber and Fiber Sensor of Hebei Province, Qinhuangdao 066004, China
- Guang-Wei Fu
  - School of Information Science and Engineering, Yanshan University, The Key Laboratory for Special Fiber and Fiber Sensor of Hebei Province, Qinhuangdao 066004, China
- Tian-Jiu Jiang
  - Research Center for Harmful Algae and marine biology, Jinan University, Guangzhou 510632, China
14
Tayara H, Abdelbaky I, To Chong K. Recent omics-based computational methods for COVID-19 drug discovery and repurposing. Brief Bioinform 2021; 22:6355836. [PMID: 34423353 DOI: 10.1093/bib/bbab339] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/22/2021] [Revised: 07/09/2021] [Indexed: 12/22/2022]
Abstract
The coronavirus disease 2019 (COVID-19) pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is the main reason for the increasing number of deaths worldwide. Although strict quarantine measures were followed in many countries, the disease situation is still intractable. Thus, all possible means must be used to confront this pandemic, and researchers are in a race against time to produce potential treatments to cure COVID-19 or reduce its increasing infections. Computational methods are rapidly proving successful in biology-related problems, including the diagnosis and treatment of diseases. Many efforts in recent months have used Artificial Intelligence (AI) techniques to fight the spread of COVID-19. Providing periodic reviews and discussions of recent efforts saves researchers time and helps to link their endeavors for a faster and more efficient confrontation of the pandemic. In this review, we discuss recent promising studies that used omics-based data together with AI algorithms and other computational tools to achieve this goal. We review the established datasets and the developed methods, which were mainly directed at new or repurposed drugs, vaccinations, and diagnosis. The tools and methods varied depending on the level of detail in the available information, such as structures, sequences, or metabolic data.
Affiliation(s)
- Hilal Tayara
  - School of international Engineering and Science, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Ibrahim Abdelbaky
  - Artificial Intelligence Department, Faculty of Computers and Artificial Intelligence, Benha University, Banha 13518, Egypt
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, Jeollabukdo 54896, Republic of Korea
  - Advances Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, Republic of Korea
15
Arslan H. COVID-19 prediction based on genome similarity of human SARS-CoV-2 and bat SARS-CoV-like coronavirus. COMPUTERS & INDUSTRIAL ENGINEERING 2021; 161:107666. [PMID: 34511707 PMCID: PMC8423779 DOI: 10.1016/j.cie.2021.107666] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/16/2021] [Revised: 07/13/2021] [Accepted: 09/05/2021] [Indexed: 05/03/2023]
Abstract
This paper proposes an efficient and accurate method to predict coronavirus disease 2019 (COVID-19) based on the genome similarity of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and a bat SARS-CoV-like coronavirus. We introduce similarity features to distinguish COVID-19 from other human coronaviruses by comparing human coronaviruses with a bat SARS-CoV-like coronavirus. In the proposed method, each human coronavirus sequence is assigned three similarity scores that consider nucleotide similarities and mutations that lead to the strong absence of cytosine and guanine nucleotides. Next, the proposed features are integrated with CpG island features of the genome sequences to improve COVID-19 prediction. Thus, each genome sequence is represented by five real numbers. We demonstrate the effectiveness of the proposed features using six machine learning classifiers on a dataset that includes the genome sequences of human coronaviruses similar to SARS-CoV-2. The performances of the machine learning classifiers are close to each other, and the k-nearest neighbor classifier with similarity features achieves the best results, with an accuracy of 99.2%. Moreover, the k-nearest neighbor classifier with the integration of CpG-based and similarity features achieves an accuracy of 99.8%. Experimental results demonstrate that similarity features remarkably decrease the number of false negatives and significantly improve the overall performance. The superiority of the proposed method is also highlighted by comparison with state-of-the-art studies detecting COVID-19 from genome sequences.
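The abstract does not say which CpG island features are used. As one plausible reading, the sketch below computes two quantities commonly used to characterize CpG islands: GC content and the observed/expected CpG ratio. Treat the choice of these two features, the function names, and the example sequence as assumptions rather than the paper's method.

```python
# Minimal sketch of two commonly used CpG-related features for a DNA sequence.
# Whether the paper uses exactly these features is an assumption.

def gc_content(seq):
    """Fraction of bases that are G or C."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_observed_expected(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


sequence = "ACGCGTCG"  # made-up example sequence
features = [gc_content(sequence), cpg_observed_expected(sequence)]
```

Two such numbers plus three similarity scores would match the five real numbers per genome mentioned in the abstract, although that mapping is a guess.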
Affiliation(s)
- Hilal Arslan
  - Department of Software Engineering, Ankara Yıldırım Beyazıt University, Turkey