1
|
Zhang J, Shen B, Zhou Z, Cai M, Wu X, Han L, Wen Y. An Extended Application of the Fast Multi-Locus Ridge Regression Algorithm in Genome-Wide Association Studies of Categorical Phenotypes. Plants (Basel) 2024; 13:2520. [PMID: 39274004 PMCID: PMC11397509 DOI: 10.3390/plants13172520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2024] [Revised: 09/02/2024] [Accepted: 09/05/2024] [Indexed: 09/16/2024]
Abstract
Categorical (binary or ordinal) traits are widely used to record counts and resistance in plants. Compared with continuous traits, categorical traits often carry less detailed information about genetic variation and have a more complex underlying genetic architecture, which presents additional challenges for genome-wide association studies. Meanwhile, methods designed for binary or continuous phenotypes are commonly and inappropriately used to analyze ordinal traits, which loses original phenotype information and reduces the power to detect quantitative trait nucleotides (QTN). To address these issues, fast multi-locus ridge regression (FastRR), which was originally designed for continuous traits, is used in this study to analyze binary or ordinal traits directly. FastRR comprises three stages: continuous transformation, variable reduction, and parameter estimation. It handles categorical phenotype data computationally, without introducing link functions or misapplying methods designed for other trait types. A series of simulation studies demonstrates that, compared with four other approaches for continuous, binary, or ordinal traits (logistic regression, FarmCPU, FaST-LMM, and POLMM), FastRR performs better in the detection of small-effect QTN, the accuracy of estimated effects, and computation speed. We applied FastRR to 14 binary or ordinal phenotypes in a real Arabidopsis dataset and identified 479 significant loci and 76 known genes, at least seven times as many as detected by the other algorithms. These findings underscore the potential of FastRR as a useful tool for genome-wide association studies and novel gene mining of binary and ordinal traits.
Affiliation(s)
- Jin Zhang
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Bolin Shen
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Ziyang Zhou
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Mingzhi Cai
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Xinyi Wu
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Le Han
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Yangjun Wen
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
|
2
|
Silva PP, Gaudillo JD, Vilela JA, Roxas-Villanueva RML, Tiangco BJ, Domingo MR, Albia JR. A machine learning-based SNP-set analysis approach for identifying disease-associated susceptibility loci. Sci Rep 2022; 12:15817. [PMID: 36138111 PMCID: PMC9499949 DOI: 10.1038/s41598-022-19708-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/23/2022] [Accepted: 09/02/2022] [Indexed: 11/17/2022] Open
Abstract
Identifying disease-associated susceptibility loci is one of the most pressing and crucial challenges in modeling complex diseases. Existing approaches to biomarker discovery are subject to several limitations, including underpowered detection, neglect of variant interactions, and restrictive dependence on prior biological knowledge. Addressing these challenges necessitates more ingenious ways of approaching the "missing heritability" problem. This study aims to discover disease-associated susceptibility loci by augmenting a previous genome-wide association study (GWAS) through the integration of random forest and cluster analysis. The proposed integrated framework is applied to GWAS data on hepatitis B virus surface antigen (HBsAg) seroclearance. Multiple cluster analyses were performed on (1) single nucleotide polymorphisms (SNPs) considered significant by GWAS and (2) SNPs with the highest feature importance scores obtained using random forest. The resulting SNP-sets from the cluster analyses were subsequently tested for trait association. Three susceptibility loci possibly associated with HBsAg seroclearance were identified: (1) SNP rs2399971, (2) gene LINC00578, and (3) locus 11p15. SNP rs2399971 is a biomarker reported in the literature to be significantly associated with HBsAg seroclearance in patients who had received antiviral treatment. The latter two loci are linked with diseases influenced by the presence of hepatitis B virus infection. These findings demonstrate the potential of the proposed integrated framework in identifying disease-associated susceptibility loci. With further validation, the results herein could aid in better understanding complex disease etiologies and provide inputs for more advanced disease risk assessment for patients.
Affiliation(s)
- Princess P Silva
  - Data-Driven Research Laboratory (DARELab), Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños, 4031, Los Baños, Laguna, Philippines
  - Computational Interdisciplinary Research Laboratory (CINTERLabs), University of the Philippines Los Baños, 4031, Los Baños, Laguna, Philippines
- Joverlyn D Gaudillo
  - Data-Driven Research Laboratory (DARELab), Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños, 4031, Los Baños, Laguna, Philippines
  - Computational Interdisciplinary Research Laboratory (CINTERLabs), University of the Philippines Los Baños, 4031, Los Baños, Laguna, Philippines
  - Domingo AI Research Center (DARC Labs), 1606, Pasig City, Philippines
- Julianne A Vilela
  - Philippine Genome Center Program for Agriculture, Office of the Vice Chancellor for Research and Extension, University of the Philippines Los Baños, 4031, Los Baños, Laguna, Philippines
- Ranzivelle Marianne L Roxas-Villanueva
  - Data-Driven Research Laboratory (DARELab), Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños, 4031, Los Baños, Laguna, Philippines
  - Computational Interdisciplinary Research Laboratory (CINTERLabs), University of the Philippines Los Baños, 4031, Los Baños, Laguna, Philippines
- Beatrice J Tiangco
  - National Institute of Health, UP College of Medicine, Taft Avenue, 1000, Manila, Philippines
  - Division of Medicine, The Medical City, 1605, Pasig, Philippines
- Mario R Domingo
  - Domingo AI Research Center (DARC Labs), 1606, Pasig City, Philippines
- Jason R Albia
  - Data-Driven Research Laboratory (DARELab), Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños, 4031, Los Baños, Laguna, Philippines
  - Domingo AI Research Center (DARC Labs), 1606, Pasig City, Philippines
  - Venn Biosciences Corporation Dba InterVenn Biosciences, Metro Manila, Philippines
|
3
|
Radiomics Models Based on Magnetic Resonance Imaging for Prediction of the Response to Bortezomib-Based Therapy in Patients with Multiple Myeloma. Biomed Res Int 2022; 2022:6911246. [PMID: 36105939 PMCID: PMC9467708 DOI: 10.1155/2022/6911246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2022] [Revised: 08/04/2022] [Accepted: 08/20/2022] [Indexed: 11/17/2022]
Abstract
Purpose. To identify significant radiomics features based on MRI and establish effective models for predicting the response to bortezomib-based regimens. Materials and Methods. In total, 95 MM patients treated with bortezomib-based therapy were enrolled, including 77 with bortezomib, cyclophosphamide, and dexamethasone (BCD) and 18 with bortezomib, lenalidomide, and dexamethasone (VRD). Based on T1-weighted imaging (T1WI) and T2-weighted imaging with fat suppression (T2WI-fs), radiomics features were extracted and then selected. The random forest (RF), k-nearest neighbor, support vector machine, logistic regression, decision tree, and Bayes models were built using the selected features. The predictive power of the six models for response to the BCD and VRD regimens was evaluated. The correlation between the selected features and progression-free survival (PFS) was also analyzed. Results. Four wavelet features were correlated with BCD treatment response. The six models all showed predictive power for the BCD regimen (AUC: 0.84-0.896 in the training set, 0.801-0.885 in the validation set), and RF performed relatively better than the others. Nevertheless, all the BCD-based models were incapable of predicting the VRD treatment response. The wavelet-HLH_firstorder_kurtosis feature was also associated with PFS (log-rank test). Conclusion. The four wavelet features were valuable biomarkers for predicting the response to the BCD regimen. The six models based on these features showed predictive power, and RF was the best. One wavelet feature was also a survival-related biomarker. MRI-based radiomics had the potential to guide clinicians in MM management.
|
4
|
Sun R, Liu J, Yu M, Xia M, Zhang Y, Sun X, Xu Y, Cui X. Paeoniflorin Ameliorates BiPN by Reducing IL6 Levels and Regulating PARKIN-Mediated Mitochondrial Autophagy. Drug Des Devel Ther 2022; 16:2241-2259. [PMID: 35860525 PMCID: PMC9289176 DOI: 10.2147/dddt.s369111] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/07/2022] [Accepted: 07/02/2022] [Indexed: 11/23/2022] Open
Abstract
Background: Bortezomib-induced peripheral neuropathy (BiPN) is a common complication of multiple myeloma (MM) treatment that seriously affects the quality of life of patients. The purpose of the present study was to explore the therapeutic effect of paeoniflorin on BiPN and its possible mechanism. Methods: ELISA was used to measure the level of interleukin-6 (IL6) in the plasma of MM patients, and bioinformatics analysis was used to predict the mechanism underlying the effect of paeoniflorin on peripheral neuropathy (PN). Cell and animal models of BiPN were constructed to evaluate mitochondrial function by measuring cell viability and mitochondrial quality and labeling mitochondria with MitoTracker Green. Nerve injury in mice with BiPN was assessed by behavioral tests, evaluation of motor nerve conduction velocity, hematoxylin-eosin (HE) staining, electron microscopy and analysis of the levels of reactive oxygen species (ROS). Western blotting and immunohistochemistry (IHC) were used to assess the expression of autophagy-related proteins. Results: In MM patients, IL6 levels were positively correlated with the degree of PN. The results of bioinformatics analysis suggested that paeoniflorin ameliorated PN by altering inflammation levels and mitochondrial autophagy. Paeoniflorin increased PC12 cell viability and mitochondrial autophagy levels, alleviated mitochondrial damage, and reduced IL6 levels. In addition, paeoniflorin effectively improved the behavior of mice with BiPN, relieved sciatic nerve injury in mice, increased the expression of LC3II/I, beclin-1, and Parkin in sciatic nerve cells, and increased the expression of LC3B and Parkin in the nerve tissue. Conclusion: The present study confirmed that paeoniflorin significantly ameliorated PN caused by bortezomib, possibly by reducing IL6 levels to regulate PARKIN-mediated mitochondrial autophagy and mitochondrial damage.
Affiliation(s)
- Runjie Sun
  - College of Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Jinan, 250014, People’s Republic of China
- Jiang Liu
  - Department of Foreign Affairs Office, Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, 250014, People’s Republic of China
- Manya Yu
  - College of Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Jinan, 250014, People’s Republic of China
- Mengting Xia
  - First School of Clinical Medicine, Shandong University of Traditional Chinese Medicine, Jinan, 250014, People’s Republic of China
- Yanyu Zhang
  - College of Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Jinan, 250014, People’s Republic of China
- Xiaoqi Sun
  - College of Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Jinan, 250014, People’s Republic of China
- Yunsheng Xu
  - Second School of Clinical Medicine, the Second Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, 250001, People’s Republic of China
  - Correspondence: Yunsheng Xu; Xing Cui, Second School of Clinical Medicine, the Second Affiliated Hospital of Shandong University of Traditional Chinese Medicine, 1 Jingba Road, Jinan, 250001, People’s Republic of China
- Xing Cui
  - Second School of Clinical Medicine, the Second Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, 250001, People’s Republic of China
|
5
|
Deng Z, Zhang J, Li J, Zhang X. Application of Deep Learning in Plant-Microbiota Association Analysis. Front Genet 2021; 12:697090. [PMID: 34691142 PMCID: PMC8531731 DOI: 10.3389/fgene.2021.697090] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/19/2021] [Accepted: 08/31/2021] [Indexed: 01/04/2023] Open
Abstract
Unraveling the association between the microbiome and plant phenotype can illustrate the effect of the microbiome on the host and in turn guide agricultural management. Adequate identification of species and appropriate choice of models are two challenges in microbiome data analysis. Computational models of microbiome data could help in association analysis between the microbiome and the plant host. Deep learning methods have been widely used to learn from microbiome data because of their ability to handle complex, sparse, noisy, and high-dimensional data. Here, we review the analytic strategies in microbiome data analysis and describe the applications of deep learning models for plant–microbiome correlation studies. We also introduce application cases of different models in plant–microbiome correlation analysis and discuss how to adapt the models to the critical steps in data processing. In terms of data processing manner, model structure, and operating principle, most deep learning models are suitable for plant microbiome data analysis. The ability to represent features and recognize patterns is the advantage of deep learning methods in modeling and interpretation for association analysis. Based on published computational experiments, convolutional neural networks and graph neural networks could be recommended for plant microbiome analysis.
Affiliation(s)
- Zhiyu Deng
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
  - Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Jinming Zhang
  - Department of Infectious Diseases, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Junya Li
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
  - Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Xiujun Zhang
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
  - Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
|
6
|
Ren W, Liang Z, He S, Xiao J. Hybrid of Restricted and Penalized Maximum Likelihood Method for Efficient Genome-Wide Association Study. Genes (Basel) 2020; 11:1286. [PMID: 33138126 PMCID: PMC7692801 DOI: 10.3390/genes11111286] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/08/2020] [Revised: 10/26/2020] [Accepted: 10/27/2020] [Indexed: 11/16/2022] Open
Abstract
In genome-wide association studies, linear mixed models (LMMs) have been widely used to explore the molecular mechanism of complex traits. However, typical association approaches suffer from several important drawbacks: estimating variance components in LMMs for large numbers of individuals is computationally slow, and single-locus models handle complex confounding poorly and lose statistical power. To address these issues, we propose an efficient two-stage method based on a hybrid of restricted and penalized maximum likelihood, named HRePML. First, we performed restricted maximum likelihood (REML) on a single-locus LMM to remove unrelated markers, using spectral decomposition of the covariance matrix to estimate variance components quickly. Second, we carried out penalized maximum likelihood (PML) on a multi-locus LMM for markers with reasonably large effects. To validate the effectiveness of HRePML, we conducted a series of simulation studies and real data analyses. In these analyses, our method always had the highest average statistical power compared with multi-locus mixed-model (MLMM), fixed and random model circulating probability unification (FarmCPU), and genome-wide efficient mixed model association (GEMMA). More importantly, HRePML can provide more accurate estimates of marker effects. HRePML also identifies 41 previously reported genes associated with developmental traits in Arabidopsis, more than were detected by the other methods.
Affiliation(s)
- Wenlong Ren
  - Department of Epidemiology and Medical Statistics, School of Public Health, Nantong University, Nantong 226019, China
- Zhikai Liang
  - Plant and Microbial Biology Department, University of Minnesota, Saint Paul, MN 55108, USA
- Shu He
  - Department of Epidemiology and Medical Statistics, School of Public Health, Nantong University, Nantong 226019, China
- Jing Xiao
  - Department of Epidemiology and Medical Statistics, School of Public Health, Nantong University, Nantong 226019, China
  - Corresponding author
|
7
|
Cao X, Xing L, He H, Zhang X. Views on GWAS statistical analysis. Bioinformation 2020; 16:393-397. [PMID: 32831520 PMCID: PMC7434950 DOI: 10.6026/97320630016393] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/02/2020] [Revised: 04/15/2020] [Accepted: 04/17/2020] [Indexed: 11/23/2022] Open
Abstract
Genome-wide association study (GWAS) is a popular approach to investigate relationships between genetic information and diseases. A large number of associations are tested in a single study, and the results are often corrected using multiple-testing adjustment methods. It is observed that GWAS often suffer from inadequate statistical power, which limits their reliability. Hence, we document known models for reliability assessment using improved statistical power in GWAS analysis.
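As a rough illustration of the multiple-testing adjustment mentioned in this abstract, the sketch below applies two standard corrections, Bonferroni and Benjamini-Hochberg, to a small set of made-up p-values. The abstract does not name a specific method, so the choice of these two procedures, the function names, and the example values are all assumptions made for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 for each test whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate.

    Sort the p-values, find the largest rank k with p_(k) <= (k / m) * alpha,
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten markers (illustrative only, not real data).
example_p = [0.039, 0.001, 0.041, 0.008, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bonf = bonferroni(example_p)
bh = benjamini_hochberg(example_p)
print("Bonferroni rejections:", sum(bonf))
print("Benjamini-Hochberg rejections:", sum(bh))
```

On these illustrative values the Bonferroni threshold (0.05 / 10) is stricter than the Benjamini-Hochberg cutoff, which is one way the choice of adjustment affects the statistical power the abstract is concerned with.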
Affiliation(s)
- Xiaowen Cao
  - Department of Mathematics, Hebei University of Technology, Tianjin, China
  - Department of Mathematics and Statistics, University of Victoria, BC, Canada
- Li Xing
  - Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, SK, Canada
- Hua He
  - Department of Mathematics, Hebei University of Technology, Tianjin, China
- Xuekui Zhang
  - Department of Mathematics and Statistics, University of Victoria, BC, Canada
|