1
|
Daneshvar NHN, Masoudi-Sobhanzadeh Y, Omidi Y. A voting-based machine learning approach for classifying biological and clinical datasets. BMC Bioinformatics 2023; 24:140. [PMID: 37041456 PMCID: PMC10088226 DOI: 10.1186/s12859-023-05274-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2022] [Accepted: 04/05/2023] [Indexed: 04/13/2023] Open
Abstract
BACKGROUND Different machine learning techniques have been proposed to classify a wide range of biological/clinical data. Given the practicality of these approaches, various software packages have also been designed and developed. However, the existing methods suffer from several limitations, such as overfitting on a specific dataset, ignoring feature selection in the preprocessing step, and losing performance on large datasets. To tackle these limitations, in this study we introduced a machine learning framework consisting of two main steps. First, our previously suggested optimization algorithm (Trader) was extended to select a near-optimal subset of features/genes. Second, a voting-based framework was proposed to classify the biological/clinical data with high accuracy. To evaluate the efficiency of the proposed method, it was applied to 13 biological/clinical datasets, and the outcomes were comprehensively compared with those of prior methods. RESULTS The results demonstrated that the Trader algorithm could select a near-optimal subset of features at a significance level of p < 0.01 relative to the compared algorithms. Additionally, on the large-size datasets, the proposed machine learning framework improved on prior studies by ~10% in terms of the mean values of accuracy, precision, recall, specificity, and F-measure obtained with fivefold cross-validation. CONCLUSION Based on the obtained results, it can be concluded that a proper configuration of efficient algorithms and methods can increase the prediction power of machine learning approaches and help researchers design practical diagnostic health care systems and offer effective treatment plans.
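As a rough illustration of the voting step and the five reported measures, the following minimal Python sketch combines the label predictions of several classifiers by simple majority (hard) voting and then computes accuracy, precision, recall, specificity, and F-measure for a binary problem. The abstract does not specify the voting rule or the base classifiers, so the hard-voting rule and the function names `majority_vote` and `binary_metrics` are assumptions made for illustration only, not the authors' implementation.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample (hard voting).

    `predictions` is a list with one list of labels per classifier.
    Assumption: ties go to the label seen first among the classifiers.
    """
    combined = []
    for sample_votes in zip(*predictions):
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Return the five measures named in the abstract for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Report 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f_measure": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical predictions from three classifiers on four samples.
votes = [[1, 0, 1, 1], [1, 1, 0, 1], [0, 0, 1, 1]]
labels = majority_vote(votes)
scores = binary_metrics([1, 0, 0, 1], labels)
```

In a fivefold cross-validation setting, these measures would be computed once per held-out fold and then averaged.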
Affiliation(s)
- Yosef Masoudi-Sobhanzadeh
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran.
- Faculty of Advanced Medical Sciences, Tabriz University of Medical Sciences, Tabriz, Iran.
- Yadollah Omidi
- Department of Pharmaceutical Sciences, College of Pharmacy, Nova Southeastern University, Florida, 33328, USA.
|
2
|
Discovering driver nodes in chronic kidney disease-related networks using Trader as a newly developed algorithm. Comput Biol Med 2022; 148:105892. [DOI: 10.1016/j.compbiomed.2022.105892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Revised: 07/04/2022] [Accepted: 07/16/2022] [Indexed: 11/18/2022]
|
3
|
Masoudi-Sobhanzadeh Y, Esmaeili H, Masoudi-Nejad A. A fuzzy logic-based computational method for the repurposing of drugs against COVID-19. BIOIMPACTS : BI 2022; 12:315-324. [PMID: 35975205 PMCID: PMC9376160 DOI: 10.34172/bi.2021.40] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/14/2020] [Revised: 03/27/2021] [Accepted: 04/03/2021] [Indexed: 01/09/2023]
Abstract
Introduction: COVID-19 has spread all around the world and seriously interrupted human activities. Because it is a newly emerged disease, many of its aspects are unknown and there is no effective medication to cure it. Besides, designing a drug is a time-consuming process that needs large investment. Hence, drug repurposing techniques, employed to discover the hidden benefits of existing drugs, may be a useful option for treating COVID-19. Methods: The present study exploits drug repositioning concepts and introduces some candidate drugs which may be effective in controlling COVID-19. The suggested method consists of three main steps. First, the required data, such as the amino acid sequences of targets and drug-target interactions, are extracted from public databases. Second, the similarity score between the targets (proteins/enzymes) and the genome of SARS-CoV-2 is computed using the proposed fuzzy logic-based method. Since the classical approaches yield outcomes which may not be useful for real-world applications, the fuzzy technique is used to address this issue. Third, after ranking the targets based on the obtained scores, the usefulness of the drugs affecting them is examined for managing COVID-19. Results: The results indicate that antiviral medicines designed for curing hepatitis C may also cure COVID-19. According to the findings, ribavirin, simeprevir, danoprevir, and XTL-6865 may be helpful in controlling the disease. Conclusion: It can be concluded that similarity-based drug repurposing techniques may be the most suitable option for managing emerging diseases such as COVID-19 and can be applied to a wide range of data. Also, fuzzy logic-based scoring methods can produce outcomes which are more consistent with real-world biological applications than others.
Affiliation(s)
- Yosef Masoudi-Sobhanzadeh
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Hosein Esmaeili
- Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Ali Masoudi-Nejad
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
Corresponding authors: Ali Masoudi-Nejad; Yosef Masoudi-Sobhanzadeh
|
4
|
Masoudi-Sobhanzadeh Y, Salemi A, Pourseif MM, Jafari B, Omidi Y, Masoudi-Nejad A. Structure-based drug repurposing against COVID-19 and emerging infectious diseases: methods, resources and discoveries. Brief Bioinform 2021; 22:bbab113. [PMID: 33993214 PMCID: PMC8194848 DOI: 10.1093/bib/bbab113] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/10/2020] [Revised: 02/15/2021] [Accepted: 03/13/2021] [Indexed: 01/09/2023] Open
Abstract
To attain promising pharmacotherapies, researchers have applied drug repurposing (DR) techniques to discover the candidate medicines to combat the coronavirus disease 2019 (COVID-19) outbreak. Although many DR approaches have been introduced for treating different diseases, only structure-based DR (SBDR) methods can be employed as the first therapeutic option against the COVID-19 pandemic because they rely on the rudimentary information about the diseases such as the sequence of the severe acute respiratory syndrome coronavirus 2 genome. Hence, to try out new treatments for the disease, the first attempts have been made based on the SBDR methods which seem to be among the proper choices for discovering the potential medications against the emerging and re-emerging infectious diseases. Given the importance of SBDR approaches, in the present review, well-known SBDR methods are summarized, and their merits are investigated. Then, the databases and software applications, utilized for repurposing the drugs against COVID-19, are introduced. Besides, the identified drugs are categorized based on their targets. Finally, a comparison is made between the SBDR approaches and other DR methods, and some possible future directions are proposed.
Affiliation(s)
- Yosef Masoudi-Sobhanzadeh
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Aysan Salemi
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Mohammad M Pourseif
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Behzad Jafari
- Department of Medicinal Chemistry, Faculty of Pharmacy, Urmia University of Medical Sciences, Urmia, Iran
- Yadollah Omidi
- Nova Southeastern University College of Pharmacy, Florida, USA
- Ali Masoudi-Nejad
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
|
5
|
Masoudi-Sobhanzadeh Y, Jafari B, Parvizpour S, Pourseif MM, Omidi Y. A novel multi-objective metaheuristic algorithm for protein-peptide docking and benchmarking on the LEADS-PEP dataset. Comput Biol Med 2021; 138:104896. [PMID: 34601392 DOI: 10.1016/j.compbiomed.2021.104896] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/26/2021] [Revised: 09/22/2021] [Accepted: 09/22/2021] [Indexed: 01/03/2023]
Abstract
Protein-peptide interactions have attracted the attention of many drug discovery scientists due to their possible druggability features on most key biological activities, such as regulating disease-related signaling pathways and enhancing the immune system's responses. Different studies have utilized protein-peptide-specific docking algorithms/methods to predict protein-peptide interactions. However, the existing algorithms/methods suffer from two serious limitations which make them unsuitable for protein-peptide docking problems. First, the prevalent approaches seem to need modification and remodeling for weighting the unbounded forces between a protein and a peptide. Second, they do not employ state-of-the-art search algorithms for detecting the 3D pose of a peptide relative to a protein. To address these restrictions, the present study introduces a novel multi-objective algorithm, which first generates some potential 3D poses of a peptide and then improves them through its operators. The candidate solutions are further evaluated using Multi-Objective Pareto Front (MOPF) optimization concepts. To this end, van der Waals, electrostatic, solvation, and hydrogen bond energies between the atoms of a protein and the designated peptide are computed. To evaluate the algorithm, it is first applied to the LEADS-PEP dataset containing 53 protein-peptide complexes with up to 53 rotatable branches/bonds and then compared with three popular/efficient algorithms. The obtained results indicate that the MOPF-based approaches, which reduce the backbone RMSD between the original and predicted states, achieve significantly better results in terms of the success rate in predicting the near-native conditions. Besides, a comparison between the different types of search algorithms reveals that efficient ones like the multi-objective Trader/differential evolution algorithm can predict protein-peptide interactions better than popular algorithms such as the multi-objective genetic/particle swarm optimization algorithms.
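The Pareto-front evaluation mentioned in the abstract can be pictured with a minimal Python sketch. It assumes each candidate pose is summarized by a tuple of four energies (van der Waals, electrostatic, solvation, hydrogen bond) and that lower is better for every term; the tuple layout, the minimization convention, the example numbers, and the function names `dominates` and `pareto_front` are illustrative assumptions, not details taken from the paper.

```python
def dominates(a, b):
    """True if pose `a` is no worse than `b` on every energy and better on one.

    Assumption: every objective is minimized (lower energy is better).
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(poses):
    """Keep the poses that no other pose dominates (the non-dominated set)."""
    return [p for p in poses if not any(dominates(q, p) for q in poses)]


# Hypothetical (vdW, electrostatic, solvation, H-bond) energies for three poses.
candidates = [
    (-5.0, -2.0, 1.0, -3.0),
    (-4.0, -1.0, 2.0, -2.0),  # worse than the first pose on every term
    (-6.0, -1.5, 1.5, -1.0),  # best vdW term, so it cannot be dominated
]
front = pareto_front(candidates)
```

This pairwise filter costs O(n^2) comparisons, which is adequate for small populations of poses; a multi-objective search algorithm would typically apply such a filter to rank candidates in each iteration.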
Affiliation(s)
- Yosef Masoudi-Sobhanzadeh
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Behzad Jafari
- Department of Medicinal Chemistry, Faculty of Pharmacy, Urmia University of Medical Sciences, Urmia, Iran
- Sepideh Parvizpour
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Mohammad M Pourseif
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Yadollah Omidi
- Department of Pharmaceutical Sciences, College of Pharmacy, Nova Southeastern University, Florida, 33328, USA.
|
6
|
Masoudi-Sobhanzadeh Y, Motieghader H, Omidi Y, Masoudi-Nejad A. A machine learning method based on the genetic and world competitive contests algorithms for selecting genes or features in biological applications. Sci Rep 2021; 11:3349. [PMID: 33558580 PMCID: PMC7870651 DOI: 10.1038/s41598-021-82796-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/09/2020] [Accepted: 01/25/2021] [Indexed: 01/30/2023] Open
Abstract
Gene/feature selection is an essential preprocessing step for creating models using machine learning techniques. It also plays a critical role in different biological applications such as the identification of biomarkers. Although many feature/gene selection algorithms and methods have been introduced, they may suffer from problems such as parameter tuning or low level of performance. To tackle such limitations, in this study, a universal wrapper approach is introduced based on our introduced optimization algorithm and the genetic algorithm (GA). In the proposed approach, candidate solutions have variable lengths, and a support vector machine scores them. To show the usefulness of the method, thirteen classification and regression-based datasets with different properties were chosen from various biological scopes, including drug discovery, cancer diagnostics, clinical applications, etc. Our findings confirmed that the proposed method outperforms most of the other currently used approaches and can also free the users from difficulties related to the tuning of various parameters. As a result, users may optimize their biological applications such as obtaining a biomarker diagnostic kit with the minimum number of genes and maximum separability power.
Affiliation(s)
- Yosef Masoudi-Sobhanzadeh
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran (GRID: grid.412888.f; ISNI: 0000 0001 2174 8913)
- Habib Motieghader
- Department of Bioinformatics, Biotechnology Research Center, Tabriz Branch, Islamic Azad University, Tabriz, Iran (GRID: grid.459617.8; ISNI: 0000 0004 0494 2783)
- Department of Basic Sciences, Gowgan Educational Center, Tabriz Branch, Islamic Azad University, Tabriz, Iran (GRID: grid.459617.8; ISNI: 0000 0004 0494 2783)
- Yadollah Omidi
- Department of Pharmaceutical Sciences, College of Pharmacy, Nova Southeastern University, Fort Lauderdale, Florida, 33328 USA (GRID: grid.261241.2; ISNI: 0000 0001 2168 8324)
- Ali Masoudi-Nejad
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran (GRID: grid.46072.37; ISNI: 0000 0004 0612 7950)
|
7
|
A multimodal deep learning-based drug repurposing approach for treatment of COVID-19. Mol Divers 2020; 25:1717-1730. [PMID: 32997257 PMCID: PMC7525234 DOI: 10.1007/s11030-020-10144-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 06/14/2020] [Accepted: 09/12/2020] [Indexed: 12/12/2022]
Abstract
Recently, various computational methods have been proposed to find new therapeutic applications of the existing drugs. The Multimodal Restricted Boltzmann Machine approach (MM-RBM), which has the capability to connect the information about the multiple modalities, can be applied to the problem of drug repurposing. The present study utilized MM-RBM to combine two types of data, including the chemical structures data of small molecules and differentially expressed genes as well as small molecules perturbations. In the proposed method, two separate RBMs were applied to find out the features and the specific probability distribution of each datum (modality). Besides, RBM was used to integrate the discovered features, resulting in the identification of the probability distribution of the combined data. The results demonstrated the significance of the clusters acquired by our model. These clusters were used to discover the medicines which were remarkably similar to the proposed medications to treat COVID-19. Moreover, the chemical structures of some small molecules as well as dysregulated genes’ effect led us to suggest using these molecules to treat COVID-19. The results also showed that the proposed method might prove useful in detecting the highly promising remedies for COVID-19 with minimum side effects. All the source codes are accessible using https://github.com/LBBSoft/Multimodal-Drug-Repurposing.git
Electronic supplementary material The online version of this article (10.1007/s11030-020-10144-9) contains supplementary material, which is available to authorized users.
|
8
|
World competitive contest-based artificial neural network: A new class-specific method for classification of clinical and biological datasets. Genomics 2020; 113:541-552. [PMID: 32991962 PMCID: PMC7521912 DOI: 10.1016/j.ygeno.2020.09.047] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/15/2020] [Revised: 09/05/2020] [Accepted: 09/22/2020] [Indexed: 12/26/2022]
Abstract
Many data mining methods have been proposed to generate computer-aided diagnostic systems, which may detect diseases in their early stages by categorizing the data into proper classes. Considering the importance of a suitable classifier, the present study introduces an efficient approach based on the World Competitive Contests (WCC) algorithm and a multi-layer perceptron artificial neural network (ANN). Unlike previously introduced methods, each of which develops a universal model for all the different data classes, our proposed approach generates a single specific model for each individual class of data. The experimental results show that the proposed method (ANNWCC), which can be applied to both balanced and unbalanced datasets, yields an average five-fold cross-validation accuracy of more than 76% (without feature selection methods) and 90% (with feature selection methods) on the 13 clinical and biological datasets. The findings also indicate that, under different conditions, our proposed method can produce better results than some state-of-the-art meta-heuristic algorithms and methods in terms of various statistical and classification measurements. To classify the clinical and biological data, a multi-layer ANN and the WCC algorithm were combined. It was shown that developing a specific model for each individual class of data may yield better results compared with creating a universal model for all of the existing data classes. Besides, some efficient algorithms proved to be essential to generate acceptable biological results, and the methods' performance was found to be enhanced by fuzzifying or normalizing the biological data.
- We combined multi-layer artificial neural networks and world competitive contests algorithms to classify biological datasets.
- The proposed method has been investigated on 13 clinical datasets with different properties.
- Efficient models may yield better classification models and health diagnostic systems.
- Feature selection methods can improve the performance of a model in separating case and control samples.
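To make the "one model per class" idea and the five-fold cross-validation accuracy concrete, here is a minimal Python sketch. It is not the ANNWCC method: each class-specific model is replaced by a plain class centroid as a stand-in for the trained neural network, and the folds are formed deterministically by sample index. The function names `fit_class_models`, `predict`, and `k_fold_accuracy`, as well as the toy data, are hypothetical.

```python
def fit_class_models(X, y):
    """Build one model per class; here each 'model' is just that class's centroid.

    Assumption: the centroid stands in for a class-specific trained network.
    """
    models = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        models[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return models


def predict(models, x):
    """Pick the class whose own model fits `x` best (smallest squared distance)."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(x, center))

    return min(models, key=lambda label: squared_distance(models[label]))


def k_fold_accuracy(X, y, k=5):
    """Mean accuracy over k folds; sample i is held out in fold i % k.

    Assumption: len(X) >= k, and no shuffling or stratification is applied.
    """
    scores = []
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        models = fit_class_models([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(models, X[i]) == y[i] for i in test)
        scores.append(hits / len(test))
    return sum(scores) / k


# Toy two-class data: class 0 near the origin, class 1 near (10, 10).
X = [[0, 0], [10, 10], [1, 0], [9, 10], [0, 1],
     [10, 9], [1, 1], [9, 9], [0.5, 0.5], [9.5, 9.5]]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
mean_accuracy = k_fold_accuracy(X, y, k=5)
```

Because every class gets its own model, a new class can be added by fitting one more model without retraining the others, which is one practical appeal of a class-specific design.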
|