1
Wang J, Zhou F, Li C, Yin N, Liu H, Zhuang B, Huang Q, Wen Y. Gene Association Analysis of Quantitative Trait Based on Functional Linear Regression Model with Local Sparse Estimator. Genes (Basel) 2023; 14:genes14040834. [PMID: 37107592 PMCID: PMC10137544 DOI: 10.3390/genes14040834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Revised: 03/27/2023] [Accepted: 03/28/2023] [Indexed: 04/03/2023]
Abstract
Functional linear regression models have been widely used in the gene association analysis of complex traits. These models retain all the genetic information in the data and take full advantage of spatial information in genetic variation data, which gives them high detection power. However, not all significant association signals identified by these high-power methods are real causal SNPs: noise is easily mistaken for a significant association signal, which leads to false associations. In this paper, a gene region association analysis method, the sparse functional data association test (SFDAT), is developed based on a functional linear regression model with local sparse estimation. The evaluation indicators CSR and DL are defined and used, together with other indicators, to evaluate the feasibility and performance of the proposed method. Simulation studies show that: (1) SFDAT performs well under both linkage equilibrium and linkage disequilibrium simulation; (2) SFDAT performs successfully for gene regions (including common variants, low-frequency variants, rare variants and mixed variants); (3) with power and type I error rates comparable to OLS and Smooth, SFDAT has a better ability to handle the zero regions. The Oryza sativa data set is analyzed by SFDAT, showing that SFDAT performs gene association analysis better and eliminates false positives in gene localization. This study showed that SFDAT can reduce the interference caused by noise while maintaining high power. SFDAT provides a new method for the association analysis between gene regions and phenotypic quantitative traits.
Affiliation(s)
- Jingyu Wang
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Fujie Zhou
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Cheng Li
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Ning Yin
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Huiming Liu
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Binxian Zhuang
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Qingyu Huang
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Yongxian Wen
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
2
Li S, Li S, Su S, Zhang H, Shen J, Wen Y. Gene Region Association Analysis of Longitudinal Quantitative Traits Based on a Function-On-Function Regression Model. Front Genet 2022; 13:781740. [PMID: 35265102 PMCID: PMC8899465 DOI: 10.3389/fgene.2022.781740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2021] [Accepted: 01/04/2022] [Indexed: 11/13/2022]
Abstract
During growth and development, the expression of genes that control quantitative traits turns on or off over time. Studies of longitudinal traits are of great significance in revealing the genetic mechanism of biological development. With the development of ultra-high-density sequencing technology, association analysis poses tremendous challenges for statistical methods. In this paper, a longitudinal functional data association test (LFDAT) method is proposed based on the function-on-function regression model. LFDAT can simultaneously treat phenotypic traits and marker information as continuum variables and analyze the association of longitudinal quantitative traits and gene regions. Simulation studies showed that: 1) LFDAT performs well for both linkage equilibrium simulation and linkage disequilibrium simulation, 2) LFDAT has better performance for gene regions (including common variants, low-frequency variants, rare variants and mixtures), and 3) LFDAT can accurately identify gene switching in the growth and development stage. Longitudinal data on the Oryza sativa projected shoot area are analyzed by LFDAT, which shows the advantage of fast computation. Further, an association analysis was conducted between longitudinal traits and gene regions by integrating the micro effects of multiple related variants and using the information of the entire gene region. LFDAT provides a feasible method for studying the formation and expression of longitudinal traits.
Affiliation(s)
- Shijing Li
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou, China
- Shiqin Li
  - School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Shaoqiang Su
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, China
- Hui Zhang
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou, China
- Jiayu Shen
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou, China
- Yongxian Wen
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou, China
3
Belonogova NM, Svishcheva GR, Wilson JF, Campbell H, Axenovich TI. Weighted functional linear regression models for gene-based association analysis. PLoS One 2018; 13:e0190486. [PMID: 29309409 PMCID: PMC5757938 DOI: 10.1371/journal.pone.0190486] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 04/19/2017] [Accepted: 12/17/2017] [Indexed: 11/19/2022]
Abstract
Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P < 0.1 in at least one analysis had lower P values with weighted models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18×10⁻⁶), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.
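As a rough illustration of "weights defined by allele frequencies via the beta distribution", the sketch below evaluates a beta density at each variant's minor allele frequency (MAF). The shape parameters (1, 25), the helper names, and the example MAF values are assumptions made for illustration; the abstract does not state them, and this is not the FREGAT interface.

```python
import math


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    # Work on the log scale so the normalising constant stays numerically stable.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def beta_weights(mafs, a=1.0, b=25.0):
    """Hypothetical helper: one weight per variant, computed from its MAF.

    The default shape parameters (1, 25) are an assumed convention that
    gives rarer variants larger weights; they are not taken from the abstract.
    """
    return [beta_pdf(maf, a, b) for maf in mafs]


# Made-up minor allele frequencies, ordered from rare to common.
mafs = [0.001, 0.01, 0.05, 0.2]
weights = beta_weights(mafs)

for maf, w in zip(mafs, weights):
    print(f"MAF={maf:<6} weight={w:.3f}")
```

With these assumed parameters the weight decreases monotonically as the MAF grows, so rare variants contribute more to the weighted model than common ones.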
Affiliation(s)
- Nadezhda M. Belonogova
  - Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Gulnara R. Svishcheva
  - Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
  - Vavilov Institute of General Genetics, the Russian Academy of Sciences, Moscow, Russia
- James F. Wilson
  - Centre for Global Health Research, Usher Institute for Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, Scotland
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, Scotland
- Harry Campbell
  - Centre for Global Health Research, Usher Institute for Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, Scotland
- Tatiana I. Axenovich
  - Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
  - Novosibirsk State University, Novosibirsk, Russia