1
Mohtasham F, Pourhoseingholi M, Hashemi Nazari SS, Kavousi K, Zali MR. Comparative analysis of feature selection techniques for COVID-19 dataset. Sci Rep 2024; 14:18627. [PMID: 39128991 DOI: 10.1038/s41598-024-69209-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 08/01/2024] [Indexed: 08/13/2024]
Abstract
In the context of early disease detection, machine learning (ML) has emerged as a vital tool. Feature selection (FS) algorithms play a crucial role in ensuring the accuracy of predictive models by identifying the most influential variables. This study, focusing on a retrospective cohort of 4778 COVID-19 patients from Iran, explores the performance of various FS methods, including filter, embedded, and hybrid approaches, in predicting mortality outcomes. The researchers leveraged 115 routine clinical, laboratory, and demographic features and employed 13 ML models to assess the effectiveness of these FS methods based on classification accuracy, predictive accuracy, and statistical tests. The results indicate that a Hybrid Boruta-VI model combined with the Random Forest algorithm demonstrated superior performance, achieving an accuracy of 0.89, an F1 score of 0.76, and an AUC value of 0.95 on test data. Key variables identified as important predictors of adverse outcomes include age, oxygen saturation levels, albumin levels, neutrophil counts, platelet levels, and markers of kidney function. These findings highlight the potential of advanced FS techniques and ML models in enhancing early disease detection and informing clinical decision-making.
Affiliation(s)
- Farideh Mohtasham
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
- MohamadAmin Pourhoseingholi
- Hearing Sciences, Mental Health and Clinical Neurosciences, School of Medicine, National Institute for Health and Care Research (NIHR) Nottingham Biomedical Research Center, University of Nottingham, Nottingham, UK
- Seyed Saeed Hashemi Nazari
- Department of Epidemiology, School of Public Health & Safety, Shahid Beheshti University of Medical Sciences (SBMU), Tehran, Iran
- Kaveh Kavousi
- Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran.
- Mohammad Reza Zali
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
2
Jiang Y, Jin Y, Shan Y, Zhong Q, Wang H, Shen C, Feng S. Advances in Physalis molecular research: applications in authentication, genetic diversity, phylogenetics, functional genes, and omics. Frontiers in Plant Science 2024; 15:1407625. [PMID: 38993935 PMCID: PMC11236614 DOI: 10.3389/fpls.2024.1407625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 06/07/2024] [Indexed: 07/13/2024]
Abstract
The plants of the genus Physalis L. have been extensively utilized in traditional and indigenous Chinese medicinal practices for treating a variety of ailments, including dermatitis, malaria, asthma, hepatitis, and liver disorders. The present review aims to achieve a comprehensive and up-to-date investigation of the genus Physalis, a new model crop, to understand plant diversity and fruit development. Several chloroplast DNA-, nuclear ribosomal DNA-, and genomic DNA-based markers, such as psbA-trnH, internal-transcribed spacer (ITS), simple sequence repeat (SSR), random amplified microsatellites (RAMS), sequence-characterized amplified region (SCAR), and single nucleotide polymorphism (SNP), were developed for molecular identification, genetic diversity, and phylogenetic studies of Physalis species. A large number of functional genes involved in inflated calyx syndrome development (AP2-L, MPF2, MPF3, and MAGO), organ growth (AG1, AG2, POS1, and CNR1), and active ingredient metabolism (24ISO, DHCRT, P450-CPL, SR, DUF538, TAS14, and 3β-HSB) were identified contributing to the breeding of novel Physalis varieties. Various omic studies revealed and functionally identified a series of reproductive organ development-related factors, environmental stress-responsive genes, and active component biosynthesis-related enzymes. The chromosome-level genomes of Physalis floridana Rydb., Physalis grisea (Waterf.) M. Martínez, and Physalis pruinosa L. have been recently published providing a valuable resource for genome editing in Physalis crops. Our review summarizes the recent progress in genetic diversity, molecular identification, phylogenetics, functional genes, and the application of omics in the genus Physalis and accelerates efficient utilization of this traditional herb.
Affiliation(s)
- Yan Jiang
- Hangzhou Normal University, Hangzhou, China
- Yanyun Jin
- Hangzhou Normal University, Hangzhou, China
- Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou, China
- Yiyi Shan
- Hangzhou Normal University, Hangzhou, China
- Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou, China
- Quanzhou Zhong
- Hangzhou Normal University, Hangzhou, China
- Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou, China
- Huizhong Wang
- Hangzhou Normal University, Hangzhou, China
- Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou, China
- Chenjia Shen
- Hangzhou Normal University, Hangzhou, China
- Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou, China
- Shangguo Feng
- Hangzhou Normal University, Hangzhou, China
- Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, Hangzhou Normal University, Hangzhou, China
3
Tang DY, Mao YJ, Zhao J, Yang J, Li SY, Ren FX, Zheng J. SEEI: spherical evolution with feedback mechanism for identifying epistatic interactions. BMC Genomics 2024; 25:462. [PMID: 38735952 DOI: 10.1186/s12864-024-10373-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Accepted: 05/03/2024] [Indexed: 05/14/2024]
Abstract
BACKGROUND Detecting epistatic interactions (EIs) involves the exploration of associations among single nucleotide polymorphisms (SNPs) and complex diseases, which is an important task in genome-wide association studies. The EI detection problem is dependent on epistasis models and corresponding optimization methods. Although various models and methods have been proposed to detect EIs, identifying EIs efficiently and accurately is still a challenge. RESULTS Here, we propose a linear mixed statistical epistasis model (LMSE) and a spherical evolution approach with a feedback mechanism (named SEEI). The LMSE model expands the existing single epistasis models such as LR-Score, K2-Score, Mutual information, and Gini index. The SEEI includes an adaptive spherical search strategy and population updating strategy, which ensures that the algorithm is not easily trapped in local optima. We analyzed the performances of 8 random disease models, 12 disease models with marginal effects, 30 disease models without marginal effects, and 10 high-order disease models. The 60 simulated disease models and a real breast cancer dataset were used to evaluate eight algorithms (SEEI, EACO, EpiACO, FDHEIW, MP-HS-DHSI, NHSA-DHSC, SNPHarvester, CSE). Three evaluation criteria (pow1, pow2, pow3), a T-test, and a Friedman test were used to compare the performances of these algorithms. The results show that the SEEI algorithm (order 1, average rank = 13.125) outperformed the other algorithms in detecting EIs. CONCLUSIONS Here, we propose an LMSE model and an evolutionary computing method (SEEI) to solve the optimization problem of the LMSE model. The proposed method performed better than the other seven algorithms tested in its ability to identify EIs in genome-wide association datasets. We identified new SNP-SNP combinations in the real breast cancer dataset and verified the results. Our findings provide new insights for the diagnosis and treatment of breast cancer.
AVAILABILITY AND IMPLEMENTATION https://github.com/scutdy/SSO/blob/master/SEEI.zip.
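The abstract above names mutual information as one of the single epistasis measures that the LMSE model builds on. As a rough illustration only (this is not the authors' LMSE or SEEI code), the sketch below scores one SNP combination by the mutual information between its joint genotype and case/control status. The function name `mutual_information` and the toy genotype data are assumptions made up for this example.

```python
from collections import Counter
from math import log2


def mutual_information(genotypes, labels):
    """Mutual information I(G; Y) in bits between joint genotypes G and labels Y.

    `genotypes` holds one hashable value per sample (for a SNP pair, a tuple
    of the two genotype codes); `labels` holds the case/control status.
    """
    n = len(labels)
    joint = Counter(zip(genotypes, labels))
    g_counts = Counter(genotypes)
    y_counts = Counter(labels)
    mi = 0.0
    for (g, y), c in joint.items():
        # p(g, y) / (p(g) * p(y)) simplifies to c * n / (count(g) * count(y))
        mi += (c / n) * log2(c * n / (g_counts[g] * y_counts[y]))
    return mi


# Hypothetical toy data: two SNPs coded 0/1/2 and case (1) / control (0) labels.
snp_a = [0, 0, 1, 1, 2, 2, 0, 1]
snp_b = [0, 1, 0, 1, 0, 1, 1, 0]
status = [0, 1, 1, 0, 0, 1, 1, 1]

pair_genotypes = list(zip(snp_a, snp_b))
mi_pair = mutual_information(pair_genotypes, status)
print(f"MI of SNP pair with phenotype: {mi_pair:.3f} bits")
```

A higher value means the joint genotype carries more information about disease status. A search method such as the one described would evaluate many candidate combinations with scores of this kind rather than a single pair.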
Affiliation(s)
- De-Yu Tang
- Department of Computer Science, School of Mathematics and Informatics, School of Software Engineering, South China Agricultural University, Guangzhou, 510642, PR China.
- School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China.
- Yi-Jun Mao
- Department of Computer Science, School of Mathematics and Informatics, School of Software Engineering, South China Agricultural University, Guangzhou, 510642, PR China.
- Jie Zhao
- School of Management, Guangdong University of Technology, Guangzhou, 510006, PR China
- Jin Yang
- School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China.
- Shi-Yin Li
- School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China
- Fu-Xiang Ren
- School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China
- Junxi Zheng
- School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China.
4
Ren F, Li S, Wen Z, Liu Y, Tang D. The Spherical Evolutionary Multi-Objective (SEMO) Algorithm for Identifying Disease Multi-Locus SNP Interactions. Genes (Basel) 2023; 15:11. [PMID: 38275593 PMCID: PMC10815643 DOI: 10.3390/genes15010011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 11/21/2023] [Accepted: 12/18/2023] [Indexed: 01/27/2024]
Abstract
Single-nucleotide polymorphisms (SNPs), as disease-related biogenetic markers, are crucial in elucidating complex disease susceptibility and pathogenesis. Due to computational inefficiency, it is difficult to identify high-dimensional SNP interactions efficiently using combinatorial search methods, so the spherical evolutionary multi-objective (SEMO) algorithm for detecting multi-locus SNP interactions was proposed. The algorithm uses a spherical search factor and a feedback mechanism of excellent individual history memory to enhance the balance between search and acquisition. Moreover, a multi-objective fitness function based on the decomposition idea was used to evaluate the associations by combining two functions, K2-Score and LR-Score, as an objective function for the algorithm's evolutionary iterations. The performance evaluation of SEMO was compared with six state-of-the-art algorithms on a simulated dataset. The results showed that SEMO outperforms the comparative methods by detecting SNP interactions quickly and accurately with a shorter average run time. The SEMO algorithm was applied to the Wellcome Trust Case Control Consortium (WTCCC) breast cancer dataset and detected two- and three-point SNP interactions that were significantly associated with breast cancer, confirming the effectiveness of the algorithm. New combinations of SNPs associated with breast cancer were also identified, which will provide a new way to detect SNP interactions quickly and accurately.
Affiliation(s)
- Fuxiang Ren
- College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China; (F.R.); (S.L.); (Y.L.)
- Shiyin Li
- College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China; (F.R.); (S.L.); (Y.L.)
- Zihao Wen
- College of Mathematics and Informatics, College of Software Engineering, South China Agricultural University, Guangzhou 510642, China
- Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Yidi Liu
- College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China; (F.R.); (S.L.); (Y.L.)
- Deyu Tang
- College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China; (F.R.); (S.L.); (Y.L.)
- College of Mathematics and Informatics, College of Software Engineering, South China Agricultural University, Guangzhou 510642, China
5
Choudhary A, Anand A, Singh A, Roy P, Singh N, Kumar V, Sharma S, Baranwal M. Machine learning-based ensemble approach in prediction of lung cancer predisposition using XRCC1 gene polymorphism. J Biomol Struct Dyn 2023:1-10. [PMID: 37545160 DOI: 10.1080/07391102.2023.2242492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2022] [Accepted: 07/23/2023] [Indexed: 08/08/2023]
Abstract
The employment of machine learning approaches has shown promising results in predicting cancer. In the current study, polymorphism data for five single nucleotide polymorphisms (SNPs) of the DNA repair gene XRCC1 (XRCC1 399, XRCC1 194, XRCC1 206, XRCC1 632, XRCC1 280) from the north Indian population, along with four smoking status data, are used as input to the proposed ensemble model to predict individual susceptibility to lung cancer. The prediction accuracy of the proposed ensemble model for cancer predisposition was found to be 85%. The model performance is also evaluated using sensitivity, specificity, precision and the Gini index, which fall in the range of 0.83-0.87. The proposed model also outperformed the individual models (LM, SVM, RF, KNN and baseline neural net) on all evaluation parameters. Collectively, current results suggest the potential of the proposed ensemble model in predicting the risk of cancer based on XRCC1 SNP data. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Abhishek Choudhary
- Department of Computer Science, Thapar Institute of Engineering & Technology, India
- Adarsh Anand
- Department of Electronics & Communication Engineering, Thapar Institute of Engineering & Technology, India
- Amrita Singh
- Department of Biotechnology, Thapar Institute of Engineering & Technology, Patiala, Punjab, India
- Pratima Roy
- Department of Biotechnology, Thapar Institute of Engineering & Technology, Patiala, Punjab, India
- Navneet Singh
- Department of Pulmonary Medicine, Post Graduate Institute of Education and Medical Research (PGIMER), Chandigarh, India
- Vinay Kumar
- Department of Electronics & Communication Engineering, Thapar Institute of Engineering & Technology, India
- Siddharth Sharma
- Department of Biotechnology, Thapar Institute of Engineering & Technology, Patiala, Punjab, India
- Manoj Baranwal
- Department of Biotechnology, Thapar Institute of Engineering & Technology, Patiala, Punjab, India
6
Alzoubi H, Alzubi R, Ramzan N. Deep Learning Framework for Complex Disease Risk Prediction Using Genomic Variations. Sensors (Basel, Switzerland) 2023; 23:s23094439. [PMID: 37177642 PMCID: PMC10181706 DOI: 10.3390/s23094439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 04/05/2023] [Accepted: 04/26/2023] [Indexed: 05/15/2023]
Abstract
Genome-wide association studies have proven their ability to improve human health outcomes by identifying genotypes associated with phenotypes. Various works have attempted to predict the risk of diseases for individuals based on genotype data. This prediction can either be considered as an analysis model that can lead to a better understanding of gene functions that underlie human disease or as a black box in order to be used in decision support systems and in early disease detection. Deep learning techniques have gained more popularity recently. In this work, we propose a deep-learning framework for disease risk prediction. The proposed framework employs a multilayer perceptron (MLP) in order to predict individuals' disease status. The proposed framework was applied to the Wellcome Trust Case-Control Consortium (WTCCC), the UK National Blood Service (NBS) Control Group, and the 1958 British Birth Cohort (58C) datasets. The performance comparison of the proposed framework showed that the proposed approach outperformed the other methods in predicting disease risk, achieving an area under the curve (AUC) up to 0.94.
Affiliation(s)
- Hadeel Alzoubi
- Department of Computer Science, College of Computer Science and Information Technology, King Faisal University, Al-Ahsa 31982, Saudi Arabia
- Raid Alzubi
- Department of Computer Science, College of Computer Science and Information Technology, King Faisal University, Al-Ahsa 31982, Saudi Arabia
- Naeem Ramzan
- School of Computing, Engineering and Physical Sciences, University of the West of Scotland, High Street, Paisley PA1 2BE, UK
7
Yang CH, Hou MF, Chuang LY, Yang CS, Lin YD. Dimensionality reduction approach for many-objective epistasis analysis. Brief Bioinform 2023; 24:6858949. [PMID: 36458451 DOI: 10.1093/bib/bbac512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Revised: 10/07/2022] [Accepted: 10/26/2022] [Indexed: 12/04/2022]
Abstract
In epistasis analysis, single-nucleotide polymorphism-single-nucleotide polymorphism interactions (SSIs) among genes may, alongside other environmental factors, influence the risk of multifactorial diseases. To identify SSI between cases and controls (i.e. binary traits), the score for model quality is affected by different objective functions (i.e. measurements) because of potential disease model preferences and disease complexities. Our previous study proposed a multiobjective approach-based multifactor dimensionality reduction (MOMDR), with the results indicating that two objective functions could enhance SSI identification with weak marginal effects. However, SSI identification using MOMDR remains a challenge because the optimal measure combination of objective functions has yet to be investigated. This study extended MOMDR to the many-objective version (i.e. many-objective MDR, MaODR) by integrating various disease probability measures based on a two-way contingency table to improve the identification of SSI between cases and controls. We introduced an objective function selection approach to determine the optimal measure combination in MaODR among 10 well-known measures. In total, 6 disease models with and 40 disease models without marginal effects were used to evaluate the general algorithms, namely those based on multifactor dimensionality reduction, MOMDR and MaODR. Our results revealed that, through the application of an objective function selection approach, the MaODR-based three-objective-function model combining correct classification rate, likelihood ratio and normalized mutual information (MaODR-CLN) achieved detection success rates (accuracy) 6.47% higher than MOMDR and 17.23% higher than MDR. In a Wellcome Trust Case Control Consortium dataset, MaODR-CLN successfully identified the significant SSIs (P < 0.001) associated with coronary artery disease.
We performed a systematic analysis to identify the optimal measure combination in MaODR among 10 objective functions. Our combination detected SSIs-based binary traits with weak marginal effects and thus reduced spurious variables in the score model. MOAI is freely available at https://sites.google.com/view/maodr/home.
Affiliation(s)
- Cheng-Hong Yang
- Department of Information Management at the Tainan University of Technology, and at the Department of Electronic Engineering at National Kaohsiung of Science and Technology, Taiwan; Biomedical Engineering, Kaohsiung Medical University, Taiwan
- Ming-Feng Hou
- Kaohsiung Medical University Hospital, and Professor at the Department of Surgery, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
- Li-Yeh Chuang
- Department of Chemical Engineering & Institute of Biotechnology and Chemical Engineering at I-Shou University, Taiwan
- Cheng-San Yang
- Department of Plastic Surgery, and serves as the Medical Matters Secretary of Chia-Yi Christian Hospital, Taiwan
- Yu-Da Lin
- Department of Computer Science and Information Engineering, and at the National Penghu University of Science and Technology, Taiwan
8
Ponte-Fernandez C, Gonzalez-Dominguez J, Carvajal-Rodriguez A, Martin MJ. Evaluation of Existing Methods for High-Order Epistasis Detection. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:912-926. [PMID: 33055017 DOI: 10.1109/tcbb.2020.3030312] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/11/2023]
Abstract
Finding epistatic interactions among loci when expressing a phenotype is a widely employed strategy to understand the genetic architecture of complex traits in GWAS. The abundance of methods dedicated to the same purpose, however, makes it increasingly difficult for scientists to decide which method is more suitable for their studies. This work compares the different epistasis detection methods published during the last decade in terms of runtime, detection power and type I error rate, with a special emphasis on high-order interactions. Results show that in terms of detection power, the only methods that perform well across all experiments are the exhaustive methods, although their computational cost may be prohibitive in large-scale studies. Regarding non-exhaustive methods, not one could consistently find epistasis interactions when marginal effects are absent. If marginal effects are present, there are methods that perform well for high-order interactions, such as BADTrees, FDHE-IW, SingleMI or SNPHarvester. As for false-positive control, only SNPHarvester, FDHE-IW and DCHE show good results. The study concludes that there is no single epistasis detection method to recommend in all scenarios. Authors should prioritize exhaustive methods when sufficient computational resources are available considering the data set size, and resort to non-exhaustive methods when the analysis time is prohibitive.
9
Wang J, Zhang H, Ren W, Guo M, Yu G. EpiMC: Detecting Epistatic Interactions Using Multiple Clusterings. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:243-254. [PMID: 33989157 DOI: 10.1109/tcbb.2021.3080462] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Detecting single nucleotide polymorphism (SNP) interactions is crucial to identify susceptibility genes associated with complex human diseases in genome-wide association studies. Clustering-based approaches are widely used in reducing search space and exploring potential relationships between SNPs in epistasis analysis. However, these approaches all use only a single measure to filter out nonsignificant SNP combinations, which may be significant ones from another perspective. In this paper, we propose a two-stage approach named EpiMC (Epistatic Interactions detection based on Multiple Clusterings) that employs multiple clusterings to obtain more precise candidate sets and more comprehensively detect high-order interactions based on these sets. In the first stage, EpiMC proposes a matrix factorization based multiple clusterings algorithm to generate multiple diverse clusterings, each of which divides all SNPs into different clusters. This stage aims to reduce the chance of filtering out potential candidates overlooked by a single clustering and groups associated SNPs together from different clustering perspectives. In the next stage, EpiMC considers both the single-locus effects and interaction effects to select high-quality disease associated SNPs, and then uses Jaccard similarity to get candidate sets. Finally, EpiMC uses exhaustive search on the obtained small candidate sets to precisely detect epistatic interactions. Extensive simulation experiments show that EpiMC has a better performance in detecting high-order interactions than state-of-the-art solutions. On the Wellcome Trust Case Control Consortium (WTCCC) dataset, EpiMC detects several significant epistatic interactions associated with breast cancer (BC) and age-related macular degeneration (AMD), which again corroborates the effectiveness of EpiMC.
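The abstract above mentions Jaccard similarity as the measure used to form candidate sets, without giving details. As a minimal sketch of the measure itself (not of how EpiMC applies it), the snippet below compares two sets of SNP identifiers. The function name `jaccard` and the rs identifiers are hypothetical placeholders.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets have similarity 0.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical SNP clusters from two different clusterings.
cluster_1 = {"rs1", "rs2", "rs3"}
cluster_2 = {"rs2", "rs3", "rs4"}

similarity = jaccard(cluster_1, cluster_2)
print(similarity)  # 2 shared SNPs out of 4 distinct SNPs -> 0.5
```

Clusters from different clusterings that share most of their SNPs score close to 1, which is one plausible way overlapping groups could be merged or ranked before an exhaustive search.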
10
Abd El Hamid MM, Shaheen M, Mabrouk MS, Omar YMK. Machine learning for detecting epistasis interactions and its relevance to personalized medicine in Alzheimer’s disease: systematic review. Biomedical Engineering: Applications, Basis and Communications 2021; 33. [DOI: 10.4015/s1016237221500472] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 09/02/2023]
Abstract
Alzheimer’s disease (AD) is a progressive disease that attacks the brain’s neurons and causes problems in memory, thinking, and reasoning skills. Personalized Medicine (PM) needs a better and more accurate understanding of the relationship between human genetic data and complex diseases like AD. The goal of PM is to tailor treatment to each patient’s individual characteristics. PM requires the prediction of a person’s disease from genetic data, and its success depends on the accurate detection of genetic biomarkers. Single nucleotide polymorphisms (SNPs) are considered the most prevalent type of variation in the human genome. Epistasis has a biological relevance to complex diseases and has an important impact on PM. Detection of the most significant epistasis interactions associated with complex diseases is a big challenge. This paper reviews several machine learning techniques and algorithms to detect the most significant epistasis interactions in Alzheimer’s disease. We discuss many machine learning techniques that can be used for detecting SNP combinations, such as Random Forests, Support Vector Machines, Multifactor Dimensionality Reduction, Neural Networks, and Deep Learning. This review paper highlights the pros and cons of these techniques and explains how they can be applied in an efficient framework for knowledge discovery and data mining in AD.
Affiliation(s)
- Marwa M. Abd El Hamid
- The Higher Institute of Computer Science & Information Technology, El-Shorouk Academy, El Shorouk City, Cairo, Egypt
- College of Computing and Information Technology AASTMT, Egypt
- Mohamed Shaheen
- College of Computing and Information Technology AASTMT, Egypt
- Mai S. Mabrouk
- Biomedical Engineering Department Misr University for Science and Technology 6th of October City, Egypt
11
Lee CY, Zeng JH, Lee SY, Lu RB, Kuo PH. SNP Data Science for Classification of Bipolar Disorder I and Bipolar Disorder II. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:2862-2869. [PMID: 32324560 DOI: 10.1109/tcbb.2020.2988024] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/11/2023]
Abstract
Bipolar disorder I (BD-I) and bipolar disorder II (BD-II) have specific characteristics and clear diagnostic criteria, but quite different treatment guidelines. In clinical practice, BD-II is commonly mistaken as a mild form of BD-I. This study uses data science techniques to identify the important single nucleotide polymorphisms (SNPs) significantly affecting the classifications of BD-I and BD-II, and develops a set of complementary diagnostic classifiers to enhance the diagnostic process. Screening assessments and SNP genotyping of 316 Han Chinese participants were performed with the Affymetrix Axiom Genome-Wide TWB Array Plate. The results show that the classifier constructed from 23 SNPs reached an area under the ROC curve (AUC) of 0.939, while the classifier constructed from 42 SNPs reached an AUC of 0.9574, an increase of only 1.84 percentage points. The accuracy rate of classification increased by 3.46 percent. This study also uses Gene Ontology (GO) and pathway analysis to conduct a functional analysis and identify significant items, including calcium ion binding, GABA-A receptor activity, Rap1 signaling pathway, ECM proteoglycans, IL12-mediated signaling events, nicotine addiction, and PI3K-Akt signaling pathway. The study can address time-consuming SNP identification and also quantify the effect of SNP-SNP interactions.
12
Ahmed H, Alarabi L, El-Sappagh S, Soliman H, Elmogy M. Genetic variations analysis for complex brain disease diagnosis using machine learning techniques: opportunities and hurdles. PeerJ Comput Sci 2021; 7:e697. [PMID: 34616886 PMCID: PMC8459785 DOI: 10.7717/peerj-cs.697] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/12/2021] [Accepted: 08/05/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND AND OBJECTIVES This paper presents an in-depth review of the state-of-the-art genetic variations analysis to discover complex genes associated with the brain's genetic disorders. We first introduce the genetic analysis of complex brain diseases, genetic variation, and DNA microarrays. Then, the review focuses on available machine learning methods used for complex brain disease classification. Therein, we discuss the various datasets, preprocessing, feature selection and extraction, and classification strategies. In particular, we concentrate on studying single nucleotide polymorphisms (SNP) that support the highest resolution for genomic fingerprinting for tracking disease genes. Subsequently, the study provides an overview of the applications for some specific diseases, including autism spectrum disorder, brain cancer, and Alzheimer's disease (AD). The study argues that despite the significant recent developments in the analysis and treatment of genetic disorders, there are considerable challenges to elucidate causative mutations, especially from the viewpoint of implementing genetic analysis in clinical practice. The review finally provides a critical discussion on the applicability of genetic variations analysis for complex brain disease identification highlighting the future challenges. METHODS We used a methodology for literature surveys to obtain data from academic databases. Criteria were defined for inclusion and exclusion. The selection of articles was followed by three stages. In addition, the principal methods for machine learning to classify the disease were presented in each stage in more detail. RESULTS It was revealed that machine learning based on SNP was widely utilized to solve problems of genetic variation for complex diseases related to genes. 
CONCLUSIONS Despite significant developments in the diagnosis and treatment of genetic diseases over the past two decades, there is still a large percentage of cases in which the causative mutation cannot be determined, and a final genetic diagnosis remains elusive. We therefore need to detect variations in the genes related to brain disorders in the early disease stages.
Affiliation(s)
- Hala Ahmed
- Information Technology Department, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
| | - Louai Alarabi
- Department of Computer Science, Umm Al-Qura University, Makkah, Saudi Arabia
| | - Shaker El-Sappagh
- Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- Information Systems Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt
| | - Hassan Soliman
- Information Technology Department, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
| | - Mohammed Elmogy
- Information Technology Department, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
| |
|
13
|
Briggs FBS, Sept C. Mining Complex Genetic Patterns Conferring Multiple Sclerosis Risk. Int J Environ Res Public Health 2021; 18:ijerph18052518. [PMID: 33802599 PMCID: PMC7967327 DOI: 10.3390/ijerph18052518] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/30/2021] [Revised: 02/23/2021] [Accepted: 03/02/2021] [Indexed: 01/21/2023]
Abstract
(1) Background: Complex genetic relationships, including gene-gene (G × G; epistasis), gene(n), and gene-environment (G × E) interactions, explain a substantial portion of the heritability in multiple sclerosis (MS). Machine learning and data mining methods are promising approaches for uncovering higher order genetic relationships, but their use in MS has been limited. (2) Methods: Association rule mining (ARM), a combinatorial rule-based machine learning algorithm, was applied to genetic data for non-Latinx MS cases (n = 207) and controls (n = 179). The objective was to identify patterns (rules) amongst the known MS risk variants, including HLA-DRB1*15:01 presence, HLA-A*02:01 absence, and 194 of the 200 common autosomal variants. Probabilistic measures (confidence and support) were used to mine rules. (3) Results: 114 rules met minimum requirements of 80% confidence and 5% support. The top ranking rule by confidence consisted of HLA-DRB1*15:01, SLC30A7-rs56678847 and AC093277.1-rs6880809; carriers of these variants had a significantly greater risk for MS (odds ratio = 20.2, 95% CI: 8.5, 37.5; p = 4 × 10⁻⁹). Several variants were shared across rules; the most common was INTS8-rs78727559, which was in 32.5% of rules. (4) Conclusions: In summary, we demonstrate evidence that specific combinations of MS risk variants disproportionately confer elevated risk by applying a robust analytical framework to a modestly sized study population.
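The support and confidence thresholds mentioned in this abstract can be illustrated with a small sketch. This is not the authors' pipeline: the function names (`support`, `confidence`, `mine_rules`), the brute-force enumeration, the variant labels, and the toy data are all assumptions made for illustration. Each individual is modeled as the set of risk variants carried, plus an `"MS"` label for cases.

```python
from itertools import combinations


def support(transactions, itemset):
    """Fraction of individuals whose item set contains every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the observed item sets."""
    ante = frozenset(antecedent)
    both = ante | frozenset(consequent)
    n_ante = sum(ante <= t for t in transactions)
    if n_ante == 0:
        return 0.0
    return sum(both <= t for t in transactions) / n_ante


def mine_rules(transactions, outcome, min_support=0.05, min_confidence=0.80, max_size=3):
    """Enumerate rules `variants -> outcome` that meet both thresholds (brute force)."""
    items = sorted(set().union(*transactions) - {outcome})
    rules = []
    for size in range(1, max_size + 1):
        for ante in combinations(items, size):
            s = support(transactions, set(ante) | {outcome})
            c = confidence(transactions, ante, {outcome})
            if s >= min_support and c >= min_confidence:
                rules.append((ante, s, c))
    # Rank by confidence, then by support, highest first.
    return sorted(rules, key=lambda r: (-r[2], -r[1]))


# Hypothetical toy cohort: variant labels are placeholders, not real loci.
toy = [
    frozenset({"variant_A", "variant_B", "MS"}),
    frozenset({"variant_A", "variant_B", "MS"}),
    frozenset({"variant_A", "MS"}),
    frozenset({"variant_A"}),
    frozenset({"variant_B"}),
    frozenset({"variant_C"}),
    frozenset({"variant_C", "MS"}),
    frozenset({"variant_A", "variant_B", "MS"}),
    frozenset(),
    frozenset({"variant_B", "variant_C"}),
]

rules = mine_rules(toy, "MS")
for antecedent, s, c in rules:
    print(antecedent, f"support={s:.2f}", f"confidence={c:.2f}")
```

Exhaustive enumeration is only practical for small antecedent sizes; with roughly 200 variants, a dedicated algorithm such as Apriori would normally be used to prune the search.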
Affiliation(s)
- Farren B. S. Briggs
- Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, 2103 Cornell Rd, Cleveland, OH 44106, USA
- Correspondence: ; Tel.: +1-216-368-5636
| | - Corriene Sept
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| |
|
14
|
Araghi S, Nguyen T. A Hybrid Supervised Approach to Human Population Identification Using Genomics Data. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:443-454. [PMID: 31150342 DOI: 10.1109/tcbb.2019.2919501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Single nucleotide polymorphisms (SNPs) are one type of genetic variation, and each SNP represents a difference in a single DNA building block, namely a nucleotide. Previous research demonstrated that SNPs can be used to identify the correct source population of an individual. In addition, variations in DNA sequences have an influence on human diseases. In this regard, SNP studies are helpful for personalized medicine and treatment. In the literature, unsupervised clustering methods, especially principal component analysis (PCA), have been popular for studying population structure. In this study, we investigate supervised approaches, particularly the LASSO multinomial regression classification method, for recognizing an individual's genetic population of origin. Then, we introduce PCA-LASSO as an extension of the LASSO method that benefits from the advantageous characteristics of both PCA and LASSO regression. The experimental results obtained on the 1000 Genomes Project dataset show that PCA-LASSO achieves significantly high accuracy in predicting an individual's population of origin.
|
15
|
Single nucleotide polymorphisms in oil palm HOMOGENTISATE GERANYL-GERANYL TRANSFERASE promoter for species differentiation and TOCOTRIENOL improvement. Meta Gene 2021. [DOI: 10.1016/j.mgene.2020.100818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022] Open
|
16
|
Hofer E, Roshchupkin GV, Adams HHH, Knol MJ, Lin H, Li S, Zare H, Ahmad S, Armstrong NJ, Satizabal CL, Bernard M, Bis JC, Gillespie NA, Luciano M, Mishra A, Scholz M, Teumer A, Xia R, Jian X, Mosley TH, Saba Y, Pirpamer L, Seiler S, Becker JT, Carmichael O, Rotter JI, Psaty BM, Lopez OL, Amin N, van der Lee SJ, Yang Q, Himali JJ, Maillard P, Beiser AS, DeCarli C, Karama S, Lewis L, Harris M, Bastin ME, Deary IJ, Veronica Witte A, Beyer F, Loeffler M, Mather KA, Schofield PR, Thalamuthu A, Kwok JB, Wright MJ, Ames D, Trollor J, Jiang J, Brodaty H, Wen W, Vernooij MW, Hofman A, Uitterlinden AG, Niessen WJ, Wittfeld K, Bülow R, Völker U, Pausova Z, Bruce Pike G, Maingault S, Crivello F, Tzourio C, Amouyel P, Mazoyer B, Neale MC, Franz CE, Lyons MJ, Panizzon MS, Andreassen OA, Dale AM, Logue M, Grasby KL, Jahanshad N, Painter JN, Colodro-Conde L, Bralten J, Hibar DP, Lind PA, Pizzagalli F, Stein JL, Thompson PM, Medland SE, Sachdev PS, Kremen WS, Wardlaw JM, Villringer A, van Duijn CM, Grabe HJ, Longstreth WT, Fornage M, Paus T, Debette S, Ikram MA, Schmidt H, Schmidt R, Seshadri S. Genetic correlations and genome-wide associations of cortical structure in general population samples of 22,824 adults. Nat Commun 2020; 11:4796. [PMID: 32963231 PMCID: PMC7508833 DOI: 10.1038/s41467-020-18367-y] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 03/04/2020] [Accepted: 08/20/2020] [Indexed: 12/22/2022] Open
Abstract
Cortical thickness, surface area and volumes vary with age and cognitive function, and in neurological and psychiatric diseases. Here we report heritability, genetic correlations and genome-wide associations of these cortical measures across the whole cortex, and in 34 anatomically predefined regions. Our discovery sample comprises 22,824 individuals from 20 cohorts within the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium and the UK Biobank. We identify genetic heterogeneity between cortical measures and brain regions, and 160 genome-wide significant associations pointing to wnt/β-catenin, TGF-β and sonic hedgehog pathways. There is enrichment for genes involved in anthropometric traits, hindbrain development, vascular and neurodegenerative disease and psychiatric conditions. These data are a rich resource for studies of the biological mechanisms behind cortical development and aging.
Affiliation(s)
- Edith Hofer
- Clinical Division of Neurogeriatrics, Department of Neurology, Medical University of Graz, Graz, Austria
- Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria
| | - Gennady V Roshchupkin
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
- Department of Medical Informatics, Erasmus MC, Rotterdam, The Netherlands
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
| | - Hieab H H Adams
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
| | - Maria J Knol
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
| | - Honghuang Lin
- Section of Computational Biomedicine, Department of Medicine, Boston University School of Medicine, Boston, MA, USA
| | - Shuo Li
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Habil Zare
- Glenn Biggs Institute for Alzheimer's and Neurodegenerative Diseases, UT Health San Antonio, San Antonio, USA
- Department of Cell Systems & Anatomy, The University of Texas Health Science Center, San Antonio, TX, USA
| | - Shahzad Ahmad
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
| | | | - Claudia L Satizabal
- Glenn Biggs Institute for Alzheimer's and Neurodegenerative Diseases, UT Health San Antonio, San Antonio, USA
- Department of Neurology, Boston University School of Medicine, Boston, MA, USA
| | | | - Joshua C Bis
- Cardiovascular Health Research Unit, Department of Medicine, Epidemiology and Health Services, University of Washington, Seattle, WA, USA
| | - Nathan A Gillespie
- Virginia Institute for Psychiatric and Behavior Genetics, Virginia Commonwealth University, Richmond, VA, USA
- QIMR Berghofer Medical Research Institute, Herston, QLD, Australia
| | - Michelle Luciano
- Centre for Cognitive Epidemiology and Cognitive Ageing, University of Edinburgh, Edinburgh, UK
- Department of Psychology, University of Edinburgh, Edinburgh, UK
| | - Aniket Mishra
- University of Bordeaux, Bordeaux Population Health Research Center, INSERM UMR 1219, Bordeaux, France
| | - Markus Scholz
- Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Leipzig, Germany
- LIFE Research Center for Civilization Diseases, University of Leipzig, Leipzig, Germany
| | - Alexander Teumer
- Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany
| | - Rui Xia
- Institute of Molecular Medicine and Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Xueqiu Jian
- Institute of Molecular Medicine and Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Thomas H Mosley
- Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
| | - Yasaman Saba
- Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz, Austria
| | - Lukas Pirpamer
- Clinical Division of Neurogeriatrics, Department of Neurology, Medical University of Graz, Graz, Austria
| | - Stephan Seiler
- Imaging of Dementia and Aging (IDeA) Laboratory, Department of Neurology, University of California-Davis, Davis, CA, USA
- Department of Neurology and Center for Neuroscience, University of California at Davis, Sacramento, CA, USA
| | - James T Becker
- Departments of Psychiatry, Neurology, and Psychology, University of Pittsburgh, Pittsburgh, PA, USA
| | | | - Jerome I Rotter
- Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute and Pediatrics at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, Epidemiology and Health Services, University of Washington, Seattle, WA, USA
| | - Oscar L Lopez
- Departments of Psychiatry, Neurology, and Psychology, University of Pittsburgh, Pittsburgh, PA, USA
| | - Najaf Amin
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
| | | | - Qiong Yang
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Jayandra J Himali
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Pauline Maillard
- Imaging of Dementia and Aging (IDeA) Laboratory, Department of Neurology, University of California-Davis, Davis, CA, USA
- Department of Neurology and Center for Neuroscience, University of California at Davis, Sacramento, CA, USA
| | - Alexa S Beiser
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Department of Neurology, Boston University School of Medicine, Boston, MA, USA
| | - Charles DeCarli
- Imaging of Dementia and Aging (IDeA) Laboratory, Department of Neurology, University of California-Davis, Davis, CA, USA
- Department of Neurology and Center for Neuroscience, University of California at Davis, Sacramento, CA, USA
| | - Sherif Karama
- McGill University, Montreal Neurological Institute, Montreal, QC, Canada
| | - Lindsay Lewis
- McGill University, Montreal Neurological Institute, Montreal, QC, Canada
| | - Mat Harris
- Centre for Cognitive Epidemiology and Cognitive Ageing, University of Edinburgh, Edinburgh, UK
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK
- Brain Research Imaging Centre, University of Edinburgh, Edinburgh, UK
- Scottish Imaging Network, A Platform for Scientific Excellence (SINAPSE) Collaboration, Department of Neuroimaging Sciences, The University of Edinburgh, Edinburgh, UK
| | - Mark E Bastin
- Centre for Cognitive Epidemiology and Cognitive Ageing, University of Edinburgh, Edinburgh, UK
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK
- Brain Research Imaging Centre, University of Edinburgh, Edinburgh, UK
- Scottish Imaging Network, A Platform for Scientific Excellence (SINAPSE) Collaboration, Department of Neuroimaging Sciences, The University of Edinburgh, Edinburgh, UK
| | - Ian J Deary
- Centre for Cognitive Epidemiology and Cognitive Ageing, University of Edinburgh, Edinburgh, UK
- Department of Psychology, University of Edinburgh, Edinburgh, UK
| | - A Veronica Witte
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Faculty of Medicine, CRC 1052 Obesity Mechanisms, University of Leipzig, Leipzig, Germany
| | - Frauke Beyer
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Faculty of Medicine, CRC 1052 Obesity Mechanisms, University of Leipzig, Leipzig, Germany
| | - Markus Loeffler
- Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Leipzig, Germany
- LIFE Research Center for Civilization Diseases, University of Leipzig, Leipzig, Germany
| | - Karen A Mather
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, Australia
- Neuroscience Research Australia, Sydney, Australia
| | - Peter R Schofield
- Neuroscience Research Australia, Sydney, Australia
- School of Medical Sciences, University of New South Wales, Sydney, Australia
| | - Anbupalam Thalamuthu
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, Australia
| | - John B Kwok
- School of Medical Sciences, University of New South Wales, Sydney, Australia
- Brain and Mind Centre - The University of Sydney, Camperdown, NSW, Australia
| | - Margaret J Wright
- Queensland Brain Institute, The University of Queensland, St Lucia, QLD, Australia
- Centre for Advanced Imaging, The University of Queensland, St Lucia, QLD, Australia
| | - David Ames
- National Ageing Research Institute, Royal Melbourne Hospital, Parkville, VIC, Australia
- Academic Unit for Psychiatry of Old Age, University of Melbourne, St George's Hospital, Kew, VIC, Australia
| | - Julian Trollor
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, Australia
- Department of Developmental Disability Neuropsychiatry, School of Psychiatry, University of New South Wales, Sydney, NSW, Australia
| | - Jiyang Jiang
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, Australia
| | - Henry Brodaty
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, Australia
- Dementia Centre for Research Collaboration, University of New South Wales, Sydney, NSW, Australia
| | - Wei Wen
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, Australia
| | - Meike W Vernooij
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
| | - Albert Hofman
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | | | - Wiro J Niessen
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
- Imaging Physics, Faculty of Applied Sciences, Delft University of Technology, Delft, The Netherlands
| | - Katharina Wittfeld
- German Center for Neurodegenerative Diseases (DZNE), Site Rostock/Greifswald, Greifswald, Germany
- Department of Psychiatry and Psychotherapy, University Medicine Greifswald, Greifswald, Germany
| | - Robin Bülow
- Institute for Diagnostic Radiology and Neuroradiology, University Medicine Greifswald, Greifswald, Germany
| | - Uwe Völker
- Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
| | - Zdenka Pausova
- Hospital for Sick Children, Toronto, ON, Canada
- Departments of Physiology and Nutritional Sciences, University of Toronto, Toronto, ON, Canada
| | - G Bruce Pike
- Departments of Radiology and Clinical Neurosciences, University of Calgary, Calgary, AB, Canada
| | - Sophie Maingault
- Institut des Maladies Neurodégénératives UMR5293, CEA, CNRS, University of Bordeaux, Bordeaux, France
| | - Fabrice Crivello
- Institut des Maladies Neurodégénératives UMR5293, CEA, CNRS, University of Bordeaux, Bordeaux, France
| | - Christophe Tzourio
- University of Bordeaux, Bordeaux Population Health Research Center, INSERM UMR 1219, Bordeaux, France
- Pole de santé publique, Centre Hospitalier Universitaire de Bordeaux, Bordeaux, France
| | - Philippe Amouyel
- Centre Hospitalier Universitaire de Bordeaux, France; Inserm U1167, Lille, France
- Department of Epidemiology and Public Health, Pasteur Institute of Lille, Lille, France
- Department of Public Health, Lille University Hospital, Lille, France
| | - Bernard Mazoyer
- Institut des Maladies Neurodégénératives UMR5293, CEA, CNRS, University of Bordeaux, Bordeaux, France
| | - Michael C Neale
- Virginia Institute for Psychiatric and Behavior Genetics, Virginia Commonwealth University, Richmond, VA, USA
| | - Carol E Franz
- Department of Psychiatry, University of California San Diego, San Diego, CA, USA
| | - Michael J Lyons
- Department of Psychological and Brain Sciences, Boston University, Boston, MA, USA
| | - Matthew S Panizzon
- Department of Psychiatry, University of California San Diego, San Diego, CA, USA
| | - Ole A Andreassen
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo and Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
| | - Anders M Dale
- Departments of Radiology and Neurosciences, University of California, San Diego, La Jolla, CA, USA
| | - Mark Logue
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- National Center for PTSD at Boston VA Healthcare System, Boston, MA, USA
- Department of Psychiatry and Department of Medicine-Biomedical Genetics Section, Boston University School of Medicine, Boston, MA, USA
| | - Katrina L Grasby
- Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
| | - Neda Jahanshad
- Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA, USA
| | - Jodie N Painter
- Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
| | - Lucía Colodro-Conde
- Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
| | - Janita Bralten
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| | - Derrek P Hibar
- Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA, USA
- Neuroscience Biomarkers, Janssen Research and Development, LLC, San Diego, CA, USA
| | - Penelope A Lind
- Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
| | - Fabrizio Pizzagalli
- Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA, USA
| | - Jason L Stein
- Department of Genetics & UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Paul M Thompson
- Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA, USA
| | - Sarah E Medland
- Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
| | - Perminder S Sachdev
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, Australia
- Neuropsychiatric Institute, Prince of Wales Hospital, Sydney, NSW, Australia
| | - William S Kremen
- Department of Psychiatry, University of California San Diego, San Diego, CA, USA
| | - Joanna M Wardlaw
- Centre for Cognitive Epidemiology and Cognitive Ageing, University of Edinburgh, Edinburgh, UK
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK
- Brain Research Imaging Centre, University of Edinburgh, Edinburgh, UK
- Scottish Imaging Network, A Platform for Scientific Excellence (SINAPSE) Collaboration, Department of Neuroimaging Sciences, The University of Edinburgh, Edinburgh, UK
| | - Arno Villringer
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Day Clinic for Cognitive Neurology, University Hospital Leipzig, Leipzig, Germany
| | - Cornelia M van Duijn
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
| | - Hans J Grabe
- German Center for Neurodegenerative Diseases (DZNE), Site Rostock/Greifswald, Greifswald, Germany
- Department of Psychiatry and Psychotherapy, University Medicine Greifswald, Greifswald, Germany
| | - William T Longstreth
- Departments of Neurology and Epidemiology, University of Washington, Seattle, WA, USA
| | - Myriam Fornage
- Institute of Molecular Medicine and Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Tomas Paus
- Bloorview Research Institute, Holland Bloorview Kids Rehabilitation Hospital, Toronto, ON, Canada
- Departments of Psychology and Psychiatry, University of Toronto, Toronto, ON, Canada
| | - Stephanie Debette
- Department of Neurology, Boston University School of Medicine, Boston, MA, USA
- University of Bordeaux, Bordeaux Population Health Research Center, INSERM UMR 1219, Bordeaux, France
- CHU de Bordeaux, Department of Neurology, F-33000, Bordeaux, France
| | - M Arfan Ikram
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
- Department of Neurology, Erasmus MC, Rotterdam, The Netherlands
| | - Helena Schmidt
- Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz, Austria
| | - Reinhold Schmidt
- Clinical Division of Neurogeriatrics, Department of Neurology, Medical University of Graz, Graz, Austria.
| | - Sudha Seshadri
- Glenn Biggs Institute for Alzheimer's and Neurodegenerative Diseases, UT Health San Antonio, San Antonio, USA.
- Department of Neurology, Boston University School of Medicine, Boston, MA, USA.
| |
|
17
|
Kang J, Coates JT, Strawderman RL, Rosenstein BS, Kerns SL. Genomics models in radiotherapy: From mechanistic to machine learning. Med Phys 2020; 47:e203-e217. [PMID: 32418335 PMCID: PMC8725063 DOI: 10.1002/mp.13751] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/17/2019] [Revised: 06/28/2019] [Accepted: 07/17/2019] [Indexed: 12/28/2022] Open
Abstract
Machine learning (ML) provides a broad framework for addressing high-dimensional prediction problems in classification and regression. While ML is often applied for imaging problems in medical physics, there are many efforts to apply these principles to biological data toward questions of radiation biology. Here, we provide a review of radiogenomics modeling frameworks and efforts toward genomically guided radiotherapy. We first discuss medical oncology efforts to develop precision biomarkers. We next discuss similar efforts to create clinical assays for normal tissue or tumor radiosensitivity. We then discuss modeling frameworks for radiosensitivity and the evolution of ML to create predictive models for radiogenomics.
Affiliation(s)
- John Kang
- Department of Radiation Oncology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - James T. Coates
- CRUK/MRC Oxford Institute for Radiation Oncology, University of Oxford, Oxford OX3 7DQ, UK
| | - Robert L. Strawderman
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY 14642, USA
| | - Barry S. Rosenstein
- Department of Radiation Oncology and the Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Sarah L. Kerns
- Department of Radiation Oncology, University of Rochester Medical Center, Rochester, NY 14642, USA
| |
|
18
|
Tang C, He Z, Liu H, Xu Y, Huang H, Yang G, Xiao Z, Li S, Liu H, Deng Y, Chen Z, Chen H, He N. Application of magnetic nanoparticles in nucleic acid detection. J Nanobiotechnology 2020; 18:62. [PMID: 32316985 PMCID: PMC7171821 DOI: 10.1186/s12951-020-00613-6] [Citation(s) in RCA: 92] [Impact Index Per Article: 23.0] [Received: 12/03/2019] [Accepted: 03/25/2020] [Indexed: 12/16/2022] Open
Abstract
Nucleic acid is the main material for storing, copying, and transmitting genetic information. Gene sequencing is of great significance in DNA damage research, gene therapy, mutation analysis, bacterial infection, drug development, and clinical diagnosis. Gene detection has a wide range of applications in fields such as environmental science, biomedicine, pharmaceuticals, agriculture, and forensic medicine. Compared with Sanger sequencing, high-throughput sequencing technology has the advantages of larger output, high resolution, and low cost, which greatly promote the application of sequencing technology in life science research. Magnetic nanoparticles, as an important class of nanomaterials, have been widely used in various applications because of their good dispersion, high surface area, low cost, and easy separation in buffer systems and signal detection. On this basis, the application of magnetic nanoparticles in nucleic acid detection is reviewed.
Affiliation(s)
- Congli Tang
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou, 412007 China
| | - Ziyu He
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou, 412007 China
| | - Hongmei Liu
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou, 412007 China
| | - Yuyue Xu
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou, 412007 China
| | - Hao Huang
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou, 412007 China
| | - Gaojian Yang
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou, 412007 China
| | - Ziqi Xiao
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou, 412007 China
| | - Song Li
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou, 412007 China
| | - Hongna Liu
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou, 412007 China
| | - Yan Deng
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou, 412007 China
- State Key Laboratory of Bioelectronics, Southeast University, Nanjing, 210096 China
| | - Zhu Chen
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou, 412007 China
| | - Hui Chen
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou, 412007 China
| | - Nongyue He
- State Key Laboratory of Bioelectronics, Southeast University, Nanjing, 210096 China
| |
|
19
|
El-Rashidy MA. An efficient methodology for discovering both of gene-environment interactions and gene-gene interactions causing genetic diseases. Egyptian Informatics Journal 2020. [DOI: 10.1016/j.eij.2019.10.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
20
|
Rahit KMTH, Tarailo-Graovac M. Genetic Modifiers and Rare Mendelian Disease. Genes (Basel) 2020; 11:E239. [PMID: 32106447 PMCID: PMC7140819 DOI: 10.3390/genes11030239] [Citation(s) in RCA: 83] [Impact Index Per Article: 20.8] [Received: 01/24/2020] [Accepted: 02/21/2020] [Indexed: 12/11/2022] Open
Abstract
Despite advances in high-throughput sequencing that have revolutionized the discovery of gene defects in rare Mendelian diseases, there are still gaps in translating individual genome variation to observed phenotypic outcomes. While we continue to improve genomics approaches to identify primary disease-causing variants, it is evident that no genetic variant acts alone. In other words, some other variants in the genome (genetic modifiers) may alleviate (suppress) or exacerbate (enhance) the severity of the disease, resulting in the variability of phenotypic outcomes. Thus, to truly understand the disease, we need to consider how the disease-causing variants interact with the rest of the genome in an individual. Here, we review the current state-of-the-field in the identification of genetic modifiers in rare Mendelian diseases and discuss the potential for future approaches that could bridge the existing gap.
Affiliation(s)
- K. M. Tahsin Hassan Rahit
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada
- Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - Maja Tarailo-Graovac
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada
- Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
| |
|
21
|
Tahir M, Sardaraz M. A Fast and Scalable Workflow for SNPs Detection in Genome Sequences Using Hadoop Map-Reduce. Genes (Basel) 2020; 11:E166. [PMID: 32033366 PMCID: PMC7074349 DOI: 10.3390/genes11020166] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/20/2020] [Revised: 01/31/2020] [Accepted: 02/01/2020] [Indexed: 11/16/2022]
Abstract
Next generation sequencing (NGS) technologies produce a huge amount of biological data, which poses challenges such as long processing times and large memory requirements. This research focuses on the detection of single nucleotide polymorphisms (SNPs) in genome sequences. Current SNP detection algorithms face several issues, e.g., computational overhead, accuracy, and memory requirements. In this research, we propose a fast and scalable workflow that integrates the Bowtie aligner with the Hadoop-based Heap SNP caller to improve SNP detection in genome sequences. The proposed workflow is validated on benchmark datasets obtained from publicly available web portals, e.g., NCBI and DDBJ DRA. Extensive experiments have been performed, and the results are compared with the Bowtie and BWA aligners in the alignment phase and with GATK, FaSD, SparkGA, Halvade, and Heap in the SNP calling phase. Analysis of the experimental results shows that the proposed workflow outperforms existing frameworks, e.g., GATK, FaSD, Heap integrated with BWA and Bowtie aligners, SparkGA, and Halvade. The proposed framework achieved a 22.46% improvement in F-score and 99.80% consistent accuracy on average, along with a 0.21% higher mean accuracy. Moreover, SNP mining has also been performed to identify specific regions in genome sequences. All the frameworks are implemented with the default memory management configuration. The observations show that all workflows have approximately the same memory requirement. In the future, we intend to display the mined SNPs graphically for user-friendly interaction and to analyze and optimize the memory requirements.
Affiliation(s)
| | - Muhammad Sardaraz
- Department of Computer Science, COMSATS University Islamabad, Attock Campus 43600, Pakistan
| |
|
22
|
Uppu S, Krishna A. A deep hybrid model to detect multi-locus interacting SNPs in the presence of noise. Int J Med Inform 2018; 119:134-151. [DOI: 10.1016/j.ijmedinf.2018.09.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 10/26/2017] [Revised: 04/13/2018] [Accepted: 09/03/2018] [Indexed: 01/17/2023]
|
23
|
Hall RG, Pasipanodya JG, Swancutt MA, Meek C, Leff R, Gumbo T. Supervised Machine-Learning Reveals That Old and Obese People Achieve Low Dapsone Concentrations. CPT Pharmacometrics Syst Pharmacol 2017; 6:552-559. [PMID: 28575552 PMCID: PMC5572360 DOI: 10.1002/psp4.12208] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 01/24/2017] [Revised: 04/24/2017] [Accepted: 05/18/2017] [Indexed: 12/04/2022]
Abstract
The human species is becoming increasingly obese. Dapsone, which is extensively used across the globe for dermatological disorders, arachnid bites, and for treatment of several bacterial, fungal, and parasitic diseases, could be affected by obesity. We performed a clinical experiment, using optimal design, in volunteers weighing 44-150 kg, to identify the effect of obesity on dapsone pharmacokinetic parameters based on maximum-likelihood solution via the expectation-maximization algorithm. Artificial intelligence-based multivariate adaptive regression splines were used for covariate selection, and identified weight and/or age as predictors of absorption, systemic clearance, and volume of distribution. These relationships occurred only between certain patient weight and age ranges, delimited by multiple hinges and regions of discontinuity, not identified by standard pharmacometric approaches. Older and obese people have lower drug concentrations after standard dosing, but with complex patterns. Given that efficacy is concentration-dependent, optimal dapsone doses need to be personalized for obese patients.
Affiliation(s)
- RG Hall
- Dose Optimization and Outcomes Research (DOOR) Program, School of Pharmacy, Texas Tech University Health Sciences Center, Dallas, Texas, USA
| | - JG Pasipanodya
- Center for Infectious Diseases Research and Experimental Therapeutics, Baylor Research Institute, Baylor University Medical Center, Dallas, Texas, USA
| | - MA Swancutt
- Department of Medicine, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - C Meek
- Dose Optimization and Outcomes Research (DOOR) Program, School of Pharmacy, Texas Tech University Health Sciences Center, Dallas, Texas, USA
| | - R Leff
- Dose Optimization and Outcomes Research (DOOR) Program, School of Pharmacy, Texas Tech University Health Sciences Center, Dallas, Texas, USA
| | - T Gumbo
- Center for Infectious Diseases Research and Experimental Therapeutics, Baylor Research Institute, Baylor University Medical Center, Dallas, Texas, USA
- Department of Medicine, University of Cape Town, Observatory, Cape Town, South Africa
| |
|