1
Ahammad I, Lamisa AB, Bhattacharjee A, Jamal TB, Arefin MS, Chowdhury ZM, Hossain MU, Das KC, Keya CA, Salimullah M. AITeQ: a machine learning framework for Alzheimer's prediction using a distinctive five-gene signature. Brief Bioinform 2024; 25:bbae291. [PMID: 38877887 PMCID: PMC11179120 DOI: 10.1093/bib/bbae291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 05/23/2024] [Accepted: 06/04/2024] [Indexed: 06/18/2024]
Abstract
Neurodegenerative diseases, such as Alzheimer's disease, pose a significant global health challenge with their complex etiology and elusive biomarkers. In this study, we developed the Alzheimer's Identification Tool (AITeQ), a machine learning (ML) model based on an optimized ensemble algorithm for the identification of Alzheimer's disease from ribonucleic acid-sequencing (RNA-seq) data. Analysis of RNA-seq data from several studies identified 87 differentially expressed genes. This was followed by an ML protocol involving feature selection, model training, performance evaluation, and hyperparameter tuning. Feature selection, which combined four different methodologies, yielded a compact yet impactful set of five genes (CNKSR1, EPHA2, CLSPN, OLFML3, and TARBP1). Twelve diverse ML models were trained and tested using these five genes. Performance metrics, including precision, recall, F1 score, accuracy, Matthews correlation coefficient, and area under the receiver operating characteristic curve, were assessed for the finally selected model. Overall, the ensemble model consisting of logistic regression, naive Bayes classifier, and support vector machine with optimized hyperparameters was identified as the best and was used to develop AITeQ. AITeQ is available at: https://github.com/ishtiaque-ahammad/AITeQ.
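Apart from the ROC area under the curve, the metrics listed in this abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal Python sketch of those formulas, not the AITeQ implementation; the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Undefined ratios (zero denominators) are reported as 0.0.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient: ranges from -1 to +1.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, not results from the paper.
example = binary_metrics(tp=8, fp=2, fn=2, tn=8)
```

ROC AUC is left out because it needs the classifier's continuous scores rather than counts at a single decision threshold.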
Affiliation(s)
- Ishtiaque Ahammad
  - Bioinformatics Division, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka 1349, Bangladesh
- Anika Bushra Lamisa
  - Bioinformatics Division, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka 1349, Bangladesh
- Arittra Bhattacharjee
  - Bioinformatics Division, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka 1349, Bangladesh
- Tabassum Binte Jamal
  - Bioinformatics Division, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka 1349, Bangladesh
- Md Shamsul Arefin
  - Department of Biochemistry and Microbiology, North South University, Bashundhara, Dhaka 1229, Bangladesh
- Zeshan Mahmud Chowdhury
  - Bioinformatics Division, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka 1349, Bangladesh
- Mohammad Uzzal Hossain
  - Bioinformatics Division, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka 1349, Bangladesh
- Keshob Chandra Das
  - Molecular Biotechnology Division, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka 1349, Bangladesh
- Chaman Ara Keya
  - Department of Biochemistry and Microbiology, North South University, Bashundhara, Dhaka 1229, Bangladesh
- Md Salimullah
  - Molecular Biotechnology Division, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka 1349, Bangladesh
2
Prieto-Garrido FL, Hernández Verdejo JL, Villa-Collar C, Ruiz-Pomeda A. Predicting factors for progression of the myopia in the MiSight assessment study Spain (MASS). Journal of Optometry 2022; 15:78-87. [PMID: 33750678 PMCID: PMC8712588 DOI: 10.1016/j.optom.2020.11.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/16/2020] [Revised: 11/12/2020] [Accepted: 11/23/2020] [Indexed: 05/04/2023]
Abstract
PURPOSE To investigate which baseline factors predict success in controlling myopia progression in a group of children wearing MiSight contact lenses (CLs). METHODS Myopic patients (n=41) fitted with MiSight CLs and followed up for two years were included in this study. Bivariate analysis, logistic regression (LR) analysis and a decision tree (DT) approach were used to screen for the factors influencing the success of the treatment. To assess the response, the change in axial length (AL) was taken as the main variable. Patients were classified by their AL change at the end of each year of treatment as "responders" (R) (AL change <0.11 mm per year) or "non-responders" (NR) (AL change ≥0.11 mm per year). RESULTS Of the forty-one Caucasian patients treated with MiSight CLs, 21 and 16 were considered responders in the first and the second year of follow-up, respectively. LR analysis showed that the only factor associated with smaller axial length growth was more time spent outdoors (p=0.0079) in the first year of treatment. The DT analysis showed that, in the responding group, spending more than 3 and 4 h outdoors per week was associated with the best response in the first and the second year of treatment, respectively. CONCLUSIONS The LR and DT approaches of this pilot study identify time spent outdoors as a main factor in controlling axial eye growth in children treated with MiSight CLs.
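The responder rule stated in the methods is a single threshold on yearly axial length change. It can be sketched in Python as below; only the 0.11 mm per year cut-off comes from the abstract, while the function names and the sample values are hypothetical.

```python
# Cut-off reported in the abstract: AL change below this value marks a responder.
RESPONDER_THRESHOLD_MM_PER_YEAR = 0.11


def classify_response(al_change_mm_per_year):
    """Label one patient-year as responder ("R") or non-responder ("NR")."""
    if al_change_mm_per_year < RESPONDER_THRESHOLD_MM_PER_YEAR:
        return "R"
    return "NR"


def count_responders(al_changes):
    """Return (number of responders, number of non-responders) for a list of AL changes."""
    labels = [classify_response(change) for change in al_changes]
    return labels.count("R"), labels.count("NR")


# Hypothetical axial length changes in mm per year, not data from the study.
sample_changes = [0.05, 0.11, 0.20, 0.10]
responders, non_responders = count_responders(sample_changes)
```

A change of exactly 0.11 mm per year falls in the non-responder group, because the responder condition is a strict inequality.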
Affiliation(s)
- César Villa-Collar
  - European University of Madrid, Doctoral and Research School, Madrid, Spain
3
Shukla R, Singh TR. High-throughput screening of natural compounds and inhibition of a major therapeutic target HsGSK-3β for Alzheimer's disease using computational approaches. J Genet Eng Biotechnol 2021; 19:61. [PMID: 33945025 PMCID: PMC8096881 DOI: 10.1186/s43141-021-00163-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/12/2021] [Accepted: 04/15/2021] [Indexed: 12/31/2022]
Abstract
BACKGROUND Alzheimer's disease (AD) is a leading neurodegenerative disease worldwide and the 6th leading cause of death in the USA. AD is a very complex disease, and the drugs available on the market cannot fully cure it. Glycogen synthase kinase 3 beta (GSK-3β) plays a major role in the hyperphosphorylation of tau protein, which forms the neurofibrillary tangles that are a major hallmark of AD. In this study, we used a series of computational approaches to find novel inhibitors of GSK-3β that reduce tau hyperphosphorylation. RESULTS We retrieved a set of compounds (n=167,741) and screened them against GSK-3β in four sequential steps. The virtual screening suggested that 404 compounds show good binding affinity and can be carried forward to pharmacokinetic analysis. From these, we selected 20 compounds that performed well on pharmacokinetic parameters. All of these compounds were re-docked using Autodock Vina followed by Autodock. The four best compounds were subjected to molecular dynamics simulation (MDS), from which RMSD, RMSF, Rg, hydrogen bonds, SASA, PCA, and binding free energy were computed. From all these analyses, we concluded that, out of 167,741 compounds, ZINC15968620, ZINC15968622, and ZINC70707119 can act as lead compounds against HsGSK-3β to reduce the hyperphosphorylation. CONCLUSION The study suggests that three compounds (ZINC15968620, ZINC15968622, and ZINC70707119) have great potential as drug candidates and can be tested in in vitro and in vivo experiments for further characterization and applications.
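The screening described in this abstract is a funnel in which each step keeps only a small share of the previous stage. The short Python sketch below works out those shares from the counts given in the abstract; the stage labels and the helper name `funnel_retention` are illustrative assumptions, not terms used by the authors.

```python
def funnel_retention(stages):
    """For each step, return (stage name, count, fraction kept from the previous stage)."""
    return [
        (name, count, count / prev_count)
        for (_, prev_count), (name, count) in zip(stages, stages[1:])
    ]


# Counts as reported in the abstract; the stage labels are illustrative.
STAGES = [
    ("compound library", 167_741),
    ("virtual screening", 404),
    ("pharmacokinetic filter", 20),
    ("re-docking", 4),
    ("lead compounds after MDS", 3),
]

for name, count, kept in funnel_retention(STAGES):
    print(f"{name}: {count} compounds ({kept:.2%} of previous stage)")
```

Reading the output stage by stage shows that virtual screening is by far the most selective step, discarding well over 99% of the library.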
Affiliation(s)
- Rohit Shukla
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology (JUIT), Waknaghat, Solan, H.P., 173234, India
- Tiratha Raj Singh
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology (JUIT), Waknaghat, Solan, H.P., 173234, India
4
Kumar A, Bansal A, Singh TR. ABCD: Alzheimer's disease Biomarkers Comprehensive Database. 3 Biotech 2019; 9:351. [PMID: 31501752 DOI: 10.1007/s13205-019-1888-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/13/2019] [Accepted: 08/27/2019] [Indexed: 11/24/2022]
Abstract
Alzheimer's disease (AD) is an age-related, irreversible, and progressive brain disorder. Memory loss, confusion, and personality changes are its major symptoms, and AD ultimately leads to a severe loss of mental function. Owing to the lack of effective biomarkers, no medication is available that treats AD completely. There is a need to provide all essential AD-related information to the scientific community. Our resource, the Alzheimer's disease Biomarkers Comprehensive Database (ABCD), is designed to accomplish this objective. ABCD is a large collection of AD-related molecular marker data. The web interface contains information on the proteins, genes, transcription factors, SNPs, miRNAs, mitochondrial genes, and expressed genes implicated in AD pathogenesis. In addition to the molecular-level data, the database holds information on animal models, medicinal candidates and pathways involved in AD, as well as some image data from AD patients. ABCD is linked to several major external resources from which the user can retrieve additional general information about the disease. The database was designed so that users can extract meaningful information through gene-, protein-, pathway-, and regulatory element-based search options. It is unique in being completely dedicated to one specific neurological disorder, AD. Further advanced options, such as AD-affected brain image data from patients and structural compound-level information, add value to the database. Its features enable users to extract, analyze and display information related to the disease in many different ways. The database is available for academic purposes and is accessible at http://www.bioinfoindia.org/abcd.
Affiliation(s)
- Ashwani Kumar
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh 173234 India
- Ankush Bansal
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh 173234 India
- Tiratha Raj Singh
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh 173234 India
5
Kaushik AC, Gautam D, Nangraj AS, Wei DQ, Sahi S. Protection of Primary Dopaminergic Midbrain Neurons Through Impact of Small Molecules Using Virtual Screening of GPR139 Supported by Molecular Dynamic Simulation and Systems Biology. Interdiscip Sci 2019; 11:247-257. [DOI: 10.1007/s12539-019-00334-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/29/2018] [Revised: 04/14/2019] [Accepted: 05/06/2019] [Indexed: 12/31/2022]
6
Shukla R, Munjal NS, Singh TR. Identification of novel small molecules against GSK3β for Alzheimer's disease using chemoinformatics approach. J Mol Graph Model 2019; 91:91-104. [PMID: 31202091 DOI: 10.1016/j.jmgm.2019.06.008] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 04/20/2019] [Revised: 06/06/2019] [Accepted: 06/06/2019] [Indexed: 12/13/2022]
Abstract
Alzheimer's disease is a neurodegenerative disease of rapidly increasing prevalence. It is multifactorial and a global threat. Several enzymes are implicated in the disease, among which glycogen synthase kinase 3 beta (GSK3β) is a key enzyme that drives disease progression through hyperphosphorylation of the tau protein. We used an integrative chemoinformatics and pharmacokinetics approach to identify novel small molecules. We retrieved a subset of the ZINC database (n = 536,709) and screened it against GSK3β in four steps. The top 298 potent compounds were selected and subjected to pharmacokinetic analysis. We found that 29 compounds showed the key characteristics of a novel drug candidate; therefore, all of these compounds were used in redocking studies with Autodock Vina and Autodock. This analysis revealed that four compounds showed good binding affinity. All four compounds were subjected to 100 ns MDS analysis. Using a set of MD analyses, we found that, of the four compounds, GSK3β-ZINC21011059 and GSK3β-ZINC21011066 form stable protein-ligand complexes. We therefore propose that ZINC21011059 and ZINC21011066 can serve as novel compounds against GSK3β and that the predicted scaffold can be used for further optimization towards improved isoform selectivity, warranting further investigation and in vitro and in vivo validation of their bioactivity.
Affiliation(s)
- Rohit Shukla
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology (JUIT), Waknaghat, Solan, H.P., 173234, India
- Nupur S Munjal
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology (JUIT), Waknaghat, Solan, H.P., 173234, India
- Tiratha Raj Singh
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology (JUIT), Waknaghat, Solan, H.P., 173234, India
7
Shukla R, Singh TR. Virtual screening, pharmacokinetics, molecular dynamics and binding free energy analysis for small natural molecules against cyclin-dependent kinase 5 for Alzheimer's disease. J Biomol Struct Dyn 2019; 38:248-262. [PMID: 30688165 DOI: 10.1080/07391102.2019.1571947] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 01/08/2023]
Abstract
Alzheimer's disease (AD) is a progressive neurodegenerative disorder characterized by brain cell death and memory loss, and it is the most common form of dementia. Although AD has devastating effects, drugs that can treat it remain limited. Cyclin-dependent kinase 5 (CDK5) has been recognized as being involved in the pathological hyperphosphorylation of tau protein, which leads to the formation of neurofibrillary tangles (NFTs). We used a structure-based virtual screening (SBVS) approach to find potential inhibitors of HsCDK5. The natural compound subset of the ZINC database (n = 167,741) was retrieved and screened using the SBVS method, which predicted 297 potent inhibitors. These 297 compounds were evaluated for their pharmacokinetic properties using ADMET (absorption, distribution, metabolism, elimination/excretion and toxicity) descriptors. Finally, 17 compounds were selected and used for re-docking. After refinement by molecular docking and drug-likeness analysis, we identified four potential inhibitors (ZINC85877721, ZINC96114862, ZINC96115616 and ZINC96116231). All four ligands were subjected to a 100 ns MDS study. From the root mean square deviation (RMSD), root mean square fluctuation (RMSF), Rg, number of hydrogen bonds, solvent accessible surface area (SASA), principal component analysis (PCA) and binding free energy analysis, we found that, of the four inhibitors, ZINC85877721 and ZINC96116231 showed good binding free energies of -198.84 and -159.32 kJ mol⁻¹, respectively, and also performed well in the other structural analyses. Both compounds displayed excellent pharmacological and structural properties for drug candidates. Collectively, these findings suggest that the two compounds have great potential as agents against AD that reduce CDK5-induced hyperphosphorylation and could be considered as therapeutic agents for AD. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Rohit Shukla
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology (JUIT), Waknaghat, India
- Tiratha Raj Singh
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology (JUIT), Waknaghat, India
8
Porter T, Villemagne VL, Savage G, Milicic L, Ying Lim Y, Maruff P, Masters CL, Ames D, Bush AI, Martins RN, Rainey-Smith S, Rowe CC, Taddei K, Groth D, Verdile G, Burnham SC, Laws SM. Cognitive gene risk profile for the prediction of cognitive decline in presymptomatic Alzheimer’s disease. Personalized Medicine in Psychiatry 2018. [DOI: 10.1016/j.pmip.2018.03.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/17/2022]
9
A New Intelligent Medical Decision Support System Based on Enhanced Hierarchical Clustering and Random Decision Forest for the Classification of Alcoholic Liver Damage, Primary Hepatoma, Liver Cirrhosis, and Cholelithiasis. Journal of Healthcare Engineering 2018; 2018:1469043. [PMID: 29599940 PMCID: PMC5823417 DOI: 10.1155/2018/1469043] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 08/31/2017] [Revised: 12/25/2017] [Accepted: 01/02/2018] [Indexed: 12/20/2022]
Abstract
Diagnosis of liver disease depends principally on the physician's subjective knowledge. Automatic prediction of the disease is a critical real-world medical problem. This work presents an EHC-ERF-based intelligence-integrated model intended to predict different types of liver disease, including alcoholic liver damage, primary hepatoma, liver cirrhosis, and cholelithiasis. These diseases cause many clinical complications, and their accurate assessment is the only way to provide efficient treatment to patients. Enhanced hierarchical clustering (EHC) is used to divide the data into a hierarchical structure that is more informative for the disease predictions carried out by ERF. The ERF error rate depends on the correlation between the individual trees and on the strength of each tree: the forest error rate increases with correlation and decreases with strength. In total, two individual and three integrated classification models were developed to achieve enhanced predictions for the liver disease types. Analysis of the results showed that the proposed framework achieved better outcomes in terms of accuracy, true positive rate, precision, F-measure, kappa statistic, mean absolute error, and root mean squared error. Furthermore, it achieved the highest accuracy rates when compared with state-of-the-art techniques. The results also indicated that the weighted distance function employed in EHC improved the efficiency of the proposed system, which has shown the capability to be used by physicians for diagnostic advice.
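The dependence of forest error on tree correlation and tree strength described here matches the classical upper bound on random-forest generalization error given by Breiman (2001). Whether ERF relies on exactly this quantity is an assumption; the abstract does not say. In the bound, $PE^{*}$ is the generalization error of the forest, $\bar{\rho}$ is the mean correlation between trees, and $s$ is the strength of the individual trees:

```latex
PE^{*} \le \frac{\bar{\rho}\,\left(1 - s^{2}\right)}{s^{2}}
```

The right-hand side grows with $\bar{\rho}$ and shrinks as $s$ increases, which is the qualitative behavior the abstract reports. It is an upper bound rather than a strict proportionality.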