1
Tajerian A. Longitudinal study investigating the influence of COMT gene polymorphism on cortical thickness changes in Parkinson's disease over four years. Sci Rep 2024; 14:9920. [PMID: 38689006 PMCID: PMC11061119 DOI: 10.1038/s41598-024-60828-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 04/27/2024] [Indexed: 05/02/2024] Open
Abstract
Parkinson's disease (PD) is a progressive neurodegenerative disorder affecting over 3% of those over 65. It is caused by the loss of dopaminergic neurons and the accumulation of Lewy bodies, leading to motor and non-motor symptoms. The relationship between COMT gene polymorphisms and PD is complex and not fully elucidated. Some studies have reported associations between certain COMT gene variants and PD risk, while others have not found significant associations. This study investigates how COMT gene variations affect cortical thickness changes in PD patients over time, aiming to link genetic factors with PD progression. It analyzed data from 44 PD patients with complete 4-year imaging follow-up from the Parkinson's Progression Markers Initiative (PPMI) database. Magnetic resonance imaging (MRI) scans were acquired using consistent methods across 9 different MRI scanners. COMT single-nucleotide polymorphisms (SNPs) were assessed based on whole genome sequencing data. Longitudinal image analysis was conducted using FreeSurfer's processing pipeline. Linear mixed-effects models were employed to examine the interaction effect of genetic variations and time on cortical thickness, while controlling for covariates and subject-specific variations. The rs165599 SNP stands out as a potential contributor to alterations in cortical thickness, showing a significant reduction in overall mean cortical thickness in both hemispheres in homozygotes (Left: P = 0.023, Right: P = 0.028). The supramarginal, precentral, and superior frontal regions demonstrated significant bilateral alterations linked to rs165599. Our findings suggest that the rs165599 variant leads to earlier manifestation of cortical thinning during the course of the disease but does not result in more severe cortical thinning over time. Larger cohorts and control groups are needed to validate these findings, and future work should consider genetic variant interactions and clinical features to elucidate the specific mechanisms underlying COMT-related neurodegenerative processes in PD.
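For orientation, a genotype-by-time mixed model of the kind the abstract describes can be written in a generic form. This is only an illustrative sketch: the abstract does not give the exact specification, so every symbol below and the choice of a random intercept alone are assumptions.

```latex
% Assumed notation (not from the paper):
%   y_{ij}  cortical thickness of subject i at visit j
%   t_{ij}  time since baseline, g_i  genotype indicator for the SNP
%   x_{ij}  covariate vector, b_i  subject-specific random intercept
\[
y_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 g_i + \beta_3\,(g_i \times t_{ij})
       + \boldsymbol{\gamma}^{\top}\mathbf{x}_{ij} + b_i + \varepsilon_{ij},
\qquad b_i \sim \mathcal{N}(0, \sigma_b^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\]
```

Under this form, the interaction coefficient \(\beta_3\) is the term that captures whether the rate of thickness change over time differs by genotype.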
Affiliation(s)
- Amin Tajerian
- School of Medicine, Arak University of Medical Sciences, Arak, Iran.
2
Araghi S, Nguyen T. A Hybrid Supervised Approach to Human Population Identification Using Genomics Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:443-454. [PMID: 31150342 DOI: 10.1109/tcbb.2019.2919501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Single nucleotide polymorphisms (SNPs) are one type of genetic variation, and each SNP represents a difference in a single DNA building block, namely a nucleotide. Previous research demonstrated that SNPs can be used to identify the correct source population of an individual. In addition, variations in DNA sequences influence human diseases. In this regard, SNP studies are helpful for personalized medicine and treatment. In the literature, unsupervised clustering methods, especially principal component analysis (PCA), have been popular for studying population structure. In this study, we investigate supervised approaches, particularly the LASSO multinomial regression classification method, for identifying an individual's population of origin. Then, we introduce PCA-LASSO as an extension of the LASSO method that benefits from the advantageous characteristics of both PCA and LASSO regression. The experimental results obtained on the 1000 Genomes Project dataset show that PCA-LASSO achieves significantly high accuracy in predicting an individual's population of origin.
3
Wang P, Zhu W, Liao B, Cai L, Peng L, Yang J. Predicting Influenza Antigenicity by Matrix Completion With Antigen and Antiserum Similarity. Front Microbiol 2018; 9:2500. [PMID: 30405563 PMCID: PMC6206390 DOI: 10.3389/fmicb.2018.02500] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 05/30/2018] [Accepted: 10/01/2018] [Indexed: 12/20/2022] Open
Abstract
The rapid mutation of influenza viruses, especially in the two surface proteins hemagglutinin (HA) and neuraminidase (NA), has made them capable of escaping population immunity, which has become a key challenge for influenza vaccine design. Thus, it is crucial to predict influenza antigenic evolution and identify new antigenic variants in a timely manner. However, traditional experimental methods for selecting vaccine strains, such as the hemagglutination inhibition (HI) assay, are time- and labor-intensive, while popular computational methods are less sensitive, which presents the need for more accurate algorithms. In this study, we propose a novel low-rank matrix completion model, MCAAS, to infer antigenic distances between antigens and antisera based on partially revealed antigenic distances, virus similarity based on HA protein sequences, and vaccine similarity based on vaccine strains. The model exploits the correlations of viruses and vaccines in serological tests as well as the ability of HAs from viruses and vaccine strains to infer influenza antigenicity. We also compared the effects of a comprehensive set of 65 amino acid substitution matrices in predicting influenza antigenicity. We applied MCAAS to H3N2 seasonal influenza virus data. Our model achieved a 10-fold cross-validation root-mean-squared error (RMSE) of 0.5982, significantly outperforming existing computational methods such as antigenic cartography, AntigenMap and BMCSI. We also constructed the antigenic map and studied the association between genetic and antigenic evolution of H3N2 influenza viruses. Finally, our analyses showed that the homologous structure derived amino acid substitution matrix (HSDM) is the most powerful in predicting influenza antigenicity, which is consistent with previous studies.
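The core idea of low-rank matrix completion, inferring hidden entries of a distance table from the revealed ones and scoring them by RMSE, can be illustrated with a toy example. The sketch below is not MCAAS: it fits a plain rank-1 factorization by alternating least squares, uses no virus or vaccine similarity terms, and runs on made-up numbers. The names `complete_rank1` and `rmse` are hypothetical.

```python
from math import sqrt


def complete_rank1(observed, n_rows, n_cols, iters=200):
    """Fit observed[(r, c)] ~ u[r] * v[c] by alternating least squares.

    Toy rank-1 model only (an assumption for illustration, not MCAAS).
    Every row and column must have at least one observed entry.
    """
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(iters):
        # Hold v fixed and solve the 1-D least-squares problem for each u[i].
        for i in range(n_rows):
            num = sum(val * v[c] for (r, c), val in observed.items() if r == i)
            den = sum(v[c] ** 2 for (r, c) in observed if r == i)
            u[i] = num / den
        # Hold u fixed and solve for each v[j] in the same way.
        for j in range(n_cols):
            num = sum(val * u[r] for (r, c), val in observed.items() if c == j)
            den = sum(u[r] ** 2 for (r, c) in observed if c == j)
            v[j] = num / den
    return u, v


def rmse(pairs):
    """Root-mean-squared error over (predicted, true) pairs."""
    return sqrt(sum((p - t) ** 2 for p, t in pairs) / len(pairs))


# Made-up rank-1 "distance" table (3 antigens x 4 antisera) with 3 hidden cells.
truth = [[2, 1, 4, 3], [4, 2, 8, 6], [6, 3, 12, 9]]
hidden = {(0, 3), (1, 2), (2, 0)}
observed = {(i, j): float(truth[i][j])
            for i in range(3) for j in range(4) if (i, j) not in hidden}

u, v = complete_rank1(observed, 3, 4)
error = rmse([(u[i] * v[j], truth[i][j]) for (i, j) in hidden])
print(f"RMSE on hidden entries: {error:.6f}")
```

In a cross-validation setting, the hidden set would be rotated over folds of the revealed entries and the fold errors combined, which is how a figure such as a 10-fold RMSE is typically obtained.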
Affiliation(s)
- Peng Wang
- College of Information Science and Engineering, Hunan University, Changsha, China
- Wen Zhu
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Bo Liao
- College of Information Science and Engineering, Hunan University, Changsha, China; School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Lijun Cai
- College of Information Science and Engineering, Hunan University, Changsha, China
- Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Jialiang Yang
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China; Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
4
Li Z, Liao B, Li Y, Liu W, Chen M, Cai L. Gene function prediction based on combining gene ontology hierarchy with multi-instance multi-label learning. RSC Adv 2018; 8:28503-28509. [PMID: 35542493 PMCID: PMC9083914 DOI: 10.1039/c8ra05122d] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/14/2018] [Accepted: 07/12/2018] [Indexed: 12/04/2022] Open
Abstract
Gene function annotation, an important part of genome annotation, is the main challenge in the post-genome era. The sequencing carried out by the Human Genome Project produced whole-genome data, providing abundant biological information for the study of gene function annotation. To obtain useful knowledge from such a large amount of data, a potential strategy is to apply machine learning methods to mine these data and predict gene function. In this study, we improved multi-instance hierarchical clustering by using the gene ontology hierarchy to annotate gene function, combining the gene ontology hierarchy with a multi-instance multi-label learning framework. Then, we used the multi-label support vector machine (MLSVM) and multi-label k-nearest neighbor (MLKNN) algorithms to predict gene function. Finally, we verified our method on four yeast expression datasets. The results of the simulated experiments showed that our method is efficient.
Affiliation(s)
- Zejun Li
- College of Information Science and Engineering, Hunan University, Changsha, Hunan 410082, China
- School of Computer and Information Science, Hunan Institute of Technology, Hengyang 412002, China
- Bo Liao
- College of Information Science and Engineering, Hunan University, Changsha, Hunan 410082, China
- Yun Li
- College of Information Science and Engineering, Hunan University, Changsha, Hunan 410082, China
- Wenhua Liu
- School of Computer and Information Science, Hunan Institute of Technology, Hengyang 412002, China
- Min Chen
- College of Information Science and Engineering, Hunan University, Changsha, Hunan 410082, China
- School of Computer and Information Science, Hunan Institute of Technology, Hengyang 412002, China
- Lijun Cai
- College of Information Science and Engineering, Hunan University, Changsha, Hunan 410082, China
5
Li Z, Liao B, Cai L, Chen M, Liu W. Semi-Supervised Maximum Discriminative Local Margin for Gene Selection. Sci Rep 2018; 8:8619. [PMID: 29872069 PMCID: PMC5988834 DOI: 10.1038/s41598-018-26806-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 11/10/2017] [Accepted: 05/14/2018] [Indexed: 11/09/2022] Open
Abstract
In the present study, we introduce a novel semi-supervised method called the semi-supervised maximum discriminative local margin (semiMM) for gene selection in expression data. The semiMM is a "filter" approach that exploits local structure, variance, and mutual information. We first constructed a local nearest neighbour graph and divided this information into within-class and between-class local nearest neighbour graphs by weighing the edge between the two data points. The semiMM aims to discover the most discriminative features for classification via maximizing the local margin between the within-class and between-class data, the variance of all data, and the mutual information of features with class labels. Experiments on five publicly available gene expression datasets revealed the effectiveness of the proposed method compared to three state-of-the-art feature selection algorithms.
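The graph-construction step can be pictured with a small sketch. It builds a k-nearest-neighbour graph and splits its edges into within-class and between-class sets. This is a simplification under stated assumptions: all points are labelled here (the paper's method is semi-supervised), edges are unweighted, and the function name `split_knn_graph` is hypothetical rather than taken from semiMM.

```python
def split_knn_graph(points, labels, k):
    """Split k-nearest-neighbour edges into within-class and between-class sets.

    Simplified illustration: fully labelled data, squared Euclidean distance,
    unweighted undirected edges stored as (smaller_index, larger_index).
    """
    within, between = set(), set()
    for i, p in enumerate(points):
        # Squared distances from point i to every other point, nearest first.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points) if j != i
        )
        for _, j in dists[:k]:
            edge = (min(i, j), max(i, j))
            if labels[i] == labels[j]:
                within.add(edge)
            else:
                between.add(edge)
    return within, between


# Four samples in a 2-feature space, two per class (made-up values).
points = [(0.0, 0.0), (0.0, 1.0), (3.0, 0.0), (3.0, 1.0)]
labels = [0, 0, 1, 1]
within, between = split_knn_graph(points, labels, k=2)
print("within-class edges:", sorted(within))
print("between-class edges:", sorted(between))
```

A margin-style criterion would then favour features that keep the within-class neighbours close while pushing the between-class neighbours apart.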
Affiliation(s)
- Zejun Li
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, 410082, China; School of Computer and Information Science, Hunan Institute of Technology, Hengyang, 412002, China
- Bo Liao
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, 410082, China
- Lijun Cai
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, 410082, China
- Min Chen
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, 410082, China; School of Computer and Information Science, Hunan Institute of Technology, Hengyang, 412002, China
- Wenhua Liu
- School of Computer and Information Science, Hunan Institute of Technology, Hengyang, 412002, China
6
Discovering Genome-Wide Tag SNPs Based on the Mutual Information of the Variants. PLoS One 2016; 11:e0167994. [PMID: 27992465 PMCID: PMC5161470 DOI: 10.1371/journal.pone.0167994] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/12/2016] [Accepted: 11/23/2016] [Indexed: 01/01/2023] Open
Abstract
Exploring linkage disequilibrium (LD) patterns among single nucleotide polymorphism (SNP) sites can improve the accuracy and cost-effectiveness of genomic association studies, whereby representative (tag) SNPs are identified to sufficiently represent the genomic diversity in populations. There has been a considerable amount of effort in developing efficient algorithms to select tag SNPs from the growing large-scale data sets. Methods using the classical pairwise-LD and multi-locus LD measures have been proposed that aim to reduce the computational complexity and to increase the accuracy, respectively. The present work solves the tag SNP selection problem by efficiently balancing computational complexity and accuracy, and improves the coverage of genomic diversity in a cost-effective manner. The employed algorithm makes use of mutual information to explore the multi-locus association between SNPs and can handle different data types and conditions. Experiments with benchmark HapMap data sets show comparable or better performance against the state-of-the-art algorithms. In particular, as a novel application, genome-wide SNP tagging was performed on the 1000 Genomes Project data sets, producing a well-annotated database of tagging variants that capture the common genotype diversity in 2,504 samples from 26 human populations. Compared to conventional methods, the algorithm requires as input only the genotype (or haplotype) sequences, can scale up to genome-wide analyses, and produces accurate solutions with more information-rich output, providing an improved platform for researchers towards the subsequent association studies.
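As a rough illustration of how mutual information can measure association between two SNPs, the sketch below estimates it from genotype counts and applies a simple pairwise tagging rule. The rule (mutual information divided by the target's entropy, compared with a threshold of 0.8) and the names `mutual_information` and `tags` are assumptions made for this example; the paper's algorithm is multi-locus and is not reproduced here.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Empirical mutual information (in bits) between two genotype vectors."""
    n = len(x)
    count_x, count_y = Counter(x), Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        p_ab = c / n
        mi += p_ab * log2(p_ab / ((count_x[a] / n) * (count_y[b] / n)))
    return mi


def tags(tag, target, threshold=0.8):
    """Hypothetical pairwise rule: tag covers target if MI / H(target) >= threshold."""
    entropy = mutual_information(target, target)  # I(Y; Y) equals H(Y)
    return entropy > 0 and mutual_information(tag, target) / entropy >= threshold


# Made-up genotypes coded as 0/1/2 copies of the minor allele.
snp_a = [0, 1, 2, 0, 1, 2]
snp_b = [2, 1, 0, 2, 1, 0]   # perfectly determined by snp_a
snp_c = [0, 0, 1, 1]
snp_d = [0, 1, 0, 1]         # statistically independent of snp_c

print(tags(snp_a, snp_b))    # snp_a carries all the information in snp_b
print(tags(snp_c, snp_d))    # snp_c carries none of the information in snp_d
```

One reason to prefer mutual information over a correlation-based LD measure is that it only needs the joint genotype counts, so it works the same way for genotype and haplotype codings.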