1
Seo H, Brand L, Wang H. Learning semi-supervised enrichment of longitudinal imaging-genetic data for improved prediction of cognitive decline. BMC Med Inform Decis Mak 2024; 24:61. [PMID: 38807132 PMCID: PMC11134626 DOI: 10.1186/s12911-024-02455-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2020] [Accepted: 02/05/2024] [Indexed: 05/30/2024]
Abstract
BACKGROUND Alzheimer's Disease (AD) is a progressive memory disorder that causes irreversible cognitive decline. Given that there is currently no cure, it is critical to detect AD in its early stage during the disease progression. Recently, many statistical learning methods have been presented to identify cognitive decline with temporal data, but few of these methods integrate heterogeneous phenotype and genetic information together to improve the accuracy of prediction. In addition, many of these models are often unable to handle incomplete temporal data; this often manifests itself in the removal of records to ensure consistency in the number of records across participants. RESULTS To address these issues, in this work we propose a novel approach to integrate the genetic data and the longitudinal phenotype data to learn a fixed-length "enriched" biomarker representation derived from the temporal heterogeneous neuroimaging records. Armed with this enriched representation, as a fixed-length vector per participant, conventional machine learning models can be used to predict clinical outcomes associated with AD. CONCLUSION The proposed method shows improved prediction performance when applied to data derived from the Alzheimer's Disease Neuroimaging Initiative cohort. In addition, our approach can be easily interpreted to allow for the identification and validation of biomarkers associated with cognitive decline.
Affiliation(s)
- Hoon Seo
- Department of Computer Science, Colorado School of Mines, Golden, Colorado, 80401, USA
- Lodewijk Brand
- Department of Computer Science, Colorado School of Mines, Golden, Colorado, 80401, USA
- Hua Wang
- Department of Computer Science, Colorado School of Mines, Golden, Colorado, 80401, USA.
2
Wu K, Wang W, Cheng Q, Xiao D, Li Y, Chen M, Zheng X. Rare MED12L Variants Are Associated with Susceptibility to Guttate Psoriasis in the Han Chinese Population. Dermatology 2024:1-9. [PMID: 38735287 DOI: 10.1159/000538805] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2022] [Accepted: 04/08/2024] [Indexed: 05/14/2024]
Abstract
INTRODUCTION According to the common disease/rare variant hypothesis, it is important to study the role of rare variants in complex diseases. The association of rare variants with psoriasis has been demonstrated, but the association between rare variants and specific clinical subtypes of psoriasis has not been investigated. METHODS Gene-based and gene-level meta-analyses were performed on data extracted from our previous study data sets (2,483 patients with guttate psoriasis and 8,292 patients with non-guttate psoriasis) for genotyping. Then, haplotype analysis was performed for rare loss-of-function variants located in MED12L, and protein function prediction was performed for MED12L. Gene-based analysis at each stage had a moderate significance threshold (p < 0.05). A χ² test was then conducted on the three potential genes, and the merged gene-based analysis was used to confirm the results. We also conducted association analysis and meta-analysis for functional variants located on the identified gene. RESULTS Through these gene-level analyses, we determined that MED12L is a guttate psoriasis susceptibility gene (p = 9.99 × 10⁻⁵), and the single-nucleotide polymorphism with the strongest association was rs199780529 (p_combine = 1 × 10⁻³, p_meta = 2 × 10⁻³). CONCLUSIONS In our study, a guttate psoriasis-specific subtype-associated susceptibility gene was confirmed in a Chinese Han population. These findings contribute to a better genetic understanding of different subtypes of psoriasis.
Affiliation(s)
- Kejia Wu
- Department of Dermatology, The First Affiliated Hospital of Anhui Medical University, Hefei, China
- Key Laboratory of Dermatology (Anhui Medical University), Ministry of Education, Hefei, China
- Anhui Province Laboratory of Inflammation and Immune Mediated Diseases, Hefei, China
- Anhui Provincial Institute of Translational Medicine, Hefei, China
- First Clinical Medical College, Anhui Medical University, Hefei, China
- Wanrong Wang
- Department of Dermatology, The First Affiliated Hospital of Anhui Medical University, Hefei, China
- Key Laboratory of Dermatology (Anhui Medical University), Ministry of Education, Hefei, China
- Anhui Province Laboratory of Inflammation and Immune Mediated Diseases, Hefei, China
- Anhui Provincial Institute of Translational Medicine, Hefei, China
- First Clinical Medical College, Anhui Medical University, Hefei, China
- Qianhui Cheng
- Department of Dermatology, The First Affiliated Hospital of Anhui Medical University, Hefei, China
- Key Laboratory of Dermatology (Anhui Medical University), Ministry of Education, Hefei, China
- Anhui Province Laboratory of Inflammation and Immune Mediated Diseases, Hefei, China
- Anhui Provincial Institute of Translational Medicine, Hefei, China
- First Clinical Medical College, Anhui Medical University, Hefei, China
- Duncheng Xiao
- Department of Dermatology, The First Affiliated Hospital of Anhui Medical University, Hefei, China
- Key Laboratory of Dermatology (Anhui Medical University), Ministry of Education, Hefei, China
- Anhui Province Laboratory of Inflammation and Immune Mediated Diseases, Hefei, China
- Anhui Provincial Institute of Translational Medicine, Hefei, China
- Second Clinical Medical College, Anhui Medical University, Hefei, China
- Yunxiao Li
- School of Life Science, Shandong University, Qingdao, China
- Mengyun Chen
- Department of Dermatology, The First Affiliated Hospital of Anhui Medical University, Hefei, China
- Key Laboratory of Dermatology (Anhui Medical University), Ministry of Education, Hefei, China
- Anhui Province Laboratory of Inflammation and Immune Mediated Diseases, Hefei, China
- Anhui Provincial Institute of Translational Medicine, Hefei, China
- Xiaodong Zheng
- Department of Dermatology, The First Affiliated Hospital of Anhui Medical University, Hefei, China
- Key Laboratory of Dermatology (Anhui Medical University), Ministry of Education, Hefei, China
- Anhui Province Laboratory of Inflammation and Immune Mediated Diseases, Hefei, China
- Anhui Provincial Institute of Translational Medicine, Hefei, China
3
Bass AJ, Bian S, Wingo AP, Wingo TS, Cutler DJ, Epstein MP. Identifying latent genetic interactions in genome-wide association studies using multiple traits. Genome Med 2024; 16:62. [PMID: 38664839 PMCID: PMC11044415 DOI: 10.1186/s13073-024-01329-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 04/02/2024] [Indexed: 04/28/2024]
Abstract
The "missing" heritability of complex traits may be partly explained by genetic variants interacting with other genes or environments that are difficult to specify, observe, and detect. We propose a new kernel-based method called Latent Interaction Testing (LIT) to screen for genetic interactions that leverages pleiotropy from multiple related traits without requiring the interacting variable to be specified or observed. Using simulated data, we demonstrate that LIT increases power to detect latent genetic interactions compared to univariate methods. We then apply LIT to obesity-related traits in the UK Biobank and detect variants with interactive effects near known obesity-related genes (URL: https://CRAN.R-project.org/package=lit ).
Affiliation(s)
- Andrew J Bass
- Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA.
- Shijia Bian
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, 30322, USA
- Aliza P Wingo
- Department of Psychiatry, Emory University, Atlanta, GA, 30322, USA
- Thomas S Wingo
- Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA
- Department of Neurology, Emory University, Atlanta, GA, 30322, USA
- David J Cutler
- Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA
- Michael P Epstein
- Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA.
4
Wang P, Xu X, Li M, Lou XY, Xu S, Wu B, Gao G, Yin P, Liu N. Gene-based association tests in family samples using GWAS summary statistics. Genet Epidemiol 2024; 48:103-113. [PMID: 38317324 DOI: 10.1002/gepi.22548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 11/18/2023] [Accepted: 01/08/2024] [Indexed: 02/07/2024]
Abstract
Genome-wide association studies (GWAS) have led to rapid growth in detecting genetic variants associated with various phenotypes. Owing to a great number of publicly accessible GWAS summary statistics, and the difficulty in obtaining individual-level genotype data, many existing gene-based association tests have been adapted to require only GWAS summary statistics rather than individual-level data. However, these association tests are restricted to unrelated individuals and thus do not apply to family samples directly. Moreover, due to its flexibility and effectiveness, the linear mixed model has been increasingly utilized in GWAS to handle correlated data, such as family samples. However, it remains unknown how to perform gene-based association tests in family samples using the GWAS summary statistics estimated from the linear mixed model. In this study, we show that, when family size is negligible compared to the total sample size, the diagonal block structure of the kinship matrix makes it possible to approximate the correlation matrix of marginal Z scores by the linkage disequilibrium matrix. Based on this result, current methods utilizing summary statistics for unrelated individuals can be directly applied to family data without any modifications. Our simulation results demonstrate that this proposed strategy controls the type 1 error rate well in various situations. Finally, we exemplify the usefulness of the proposed approach with a dental caries GWAS data set.
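Illustrative note: the abstract's central approximation, using the linkage disequilibrium (LD) matrix as the correlation matrix of marginal Z scores, can be made concrete with a minimal sketch. The code below assumes a standard burden-type summary-statistic test, which is only one example of the class of existing methods the abstract refers to and is not taken from the paper. The function name `burden_test`, the equal default weights, and the toy inputs are hypothetical.

```python
import math


def burden_test(z, ld, weights=None):
    """Burden-type gene-based test from marginal Z scores and an LD matrix.

    Assumption (not from the paper): under the null, z ~ N(0, R), where R is
    approximated by the LD matrix `ld`, so (w'z)^2 / (w'Rw) follows a
    chi-square distribution with 1 degree of freedom.
    """
    m = len(z)
    w = weights if weights is not None else [1.0] * m
    # Weighted sum of the marginal Z scores.
    num = sum(wi * zi for wi, zi in zip(w, z))
    # Variance of the weighted sum under the null: w' R w.
    var = sum(w[i] * ld[i][j] * w[j] for i in range(m) for j in range(m))
    stat = num * num / var
    # Upper tail of a chi-square(1) variable: P(X > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy usage: two variants in a hypothetical gene with no LD between them.
stat, p = burden_test([2.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
print(stat, p)
```

In this sketch the LD matrix is the only place where relatedness could enter, which is why an approximation of the Z-score correlation by LD would let such a test run unchanged on family data.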
Affiliation(s)
- Peng Wang
- Department of Epidemiology and Biostatistics, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Hubei, People's Republic of China
- Xiao Xu
- Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, Indiana, USA
- Ming Li
- Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, Indiana, USA
- Xiang-Yang Lou
- Department of Biostatistics, College of Public Health and Health Professions and College of Medicine, University of Florida, Gainesville, Florida, USA
- Siqi Xu
- Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, Hong Kong
- Baolin Wu
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
- Guimin Gao
- Department of Public Health Sciences, University of Chicago, Chicago, Illinois, USA
- Ping Yin
- Department of Epidemiology and Biostatistics, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Hubei, People's Republic of China
- Nianjun Liu
- Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, Indiana, USA
5
Das Adhikari S, Cui Y, Wang J. BayesKAT: Bayesian optimal kernel-based test for genetic association studies reveals joint genetic effects in complex diseases. Brief Bioinform 2024; 25:bbae182. [PMID: 38653490 PMCID: PMC11036342 DOI: 10.1093/bib/bbae182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 03/10/2024] [Accepted: 04/05/2024] [Indexed: 04/25/2024]
Abstract
Genome-wide Association Studies (GWAS) methods have identified individual single-nucleotide polymorphisms (SNPs) significantly associated with specific phenotypes. Nonetheless, many complex diseases are polygenic and are controlled by multiple genetic variants that are usually non-linearly dependent. These genetic variants are marginally less effective and remain undetected in GWAS analysis. Kernel-based tests (KBT), which evaluate the joint effect of a group of genetic variants, are therefore critical for complex disease analysis. However, choosing different kernel functions in KBT can significantly influence the type I error control and power, and selecting the optimal kernel remains a statistically challenging task. A few existing methods suffer from inflated type 1 errors, limited scalability, inferior power or issues of ambiguous conclusions. Here, we present a new Bayesian framework, BayesKAT (https://github.com/wangjr03/BayesKAT), which overcomes these kernel specification issues by selecting the optimal composite kernel adaptively from the data while testing genetic associations simultaneously. Furthermore, BayesKAT implements a scalable computational strategy to boost its applicability, especially for high-dimensional cases where other methods become less effective. Based on a series of performance comparisons using both simulated and real large-scale genetics data, BayesKAT outperforms the available methods in detecting complex group-level associations and controlling type I errors simultaneously. Applied on a variety of groups of functionally related genetic variants based on biological pathways, co-expression gene modules and protein complexes, BayesKAT deciphers the complex genetic basis and provides mechanistic insights into human diseases.
Affiliation(s)
- Sikta Das Adhikari
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Jianrong Wang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
6
He M, Zhao N. A Mixed Effect Similarity Matrix Regression Model (SMRmix) for Integrating Multiple Microbiome Datasets at Community Level and its Application in HIV. bioRxiv 2024:2024.03.10.584315. [PMID: 38559012 PMCID: PMC10979838 DOI: 10.1101/2024.03.10.584315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Recent studies have highlighted the importance of human microbiota in our health and diseases. However, in many areas of research, individual microbiome studies often offer inconsistent results due to the limited sample sizes and the heterogeneity in study populations and experimental procedures. Integrative analysis of multiple microbiome datasets is necessary. However, statistical methods that incorporate multiple microbiome datasets and account for the study heterogeneity are not available in the literature. In this paper, we develop a mixed effect similarity matrix regression (SMRmix) approach for identifying community level microbiome shifts between outcomes. SMRmix has a close connection with the microbiome kernel association test, which is one of the most popular approaches for such a task but is applicable only to a single study. Via extensive simulations, we show that SMRmix has well-controlled type I error and higher power than some potential competitors. We also applied SMRmix to data from the HIV-reanalysis consortium, a collective effort that obtained all publicly available data on gut microbiome and HIV as of December 2017, and obtained consistent associations of gut microbiome with HIV infection, and with MSM status (i.e. men who have sex with men).
7
Zhang S, Jiang Z, Zeng P. Incorporating genetic similarity of auxiliary samples into eGene identification under the transfer learning framework. J Transl Med 2024; 22:258. [PMID: 38461317 PMCID: PMC10924384 DOI: 10.1186/s12967-024-05053-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Accepted: 03/01/2024] [Indexed: 03/11/2024]
Abstract
BACKGROUND The term eGene has been applied to define a gene whose expression level is affected by at least one independent expression quantitative trait locus (eQTL). It is both theoretically and empirically important to identify eQTLs and eGenes in genomic studies. However, standard eGene detection methods generally focus on individual cis-variants and cannot efficiently leverage useful knowledge acquired from auxiliary samples into target studies. METHODS We propose a multilocus-based eGene identification method called TLegene by integrating shared genetic similarity information available from auxiliary studies under the statistical framework of transfer learning. We apply TLegene to eGene identification in ten TCGA cancers which have an explicit relevant tissue in the GTEx project, and learn the genetic effects of variants in TCGA from GTEx. We also apply TLegene to the Geuvadis project to evaluate its usefulness in non-cancer studies. RESULTS We observed substantial genetic effect correlation of cis-variants between TCGA and GTEx for a large number of genes. Furthermore, consistent with the results of our simulations, we found that TLegene was more powerful than existing methods and thus identified 169 distinct candidate eGenes, which was much larger than the number identified by the approach that did not consider knowledge transfer across target and auxiliary studies. Previous studies and functional enrichment analyses provided empirical evidence supporting the associations of discovered eGenes, and it also showed evidence of allelic heterogeneity of gene expression. Furthermore, TLegene identified more eGenes in Geuvadis and revealed that these eGenes were mainly enriched in the "Cells - EBV-transformed lymphocytes" tissue. CONCLUSION Overall, TLegene represents a flexible and powerful statistical method for eGene identification through transfer learning of genetic similarity shared across auxiliary and target studies.
Affiliation(s)
- Shuo Zhang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Zhou Jiang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Ping Zeng
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Key Laboratory of Environment and Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Xuzhou Engineering Research Innovation Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Jiangsu Engineering Research Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
8
Zhang H, Li H, Yao J, Zhao M, Zhang C. The mutation of NSUN5 R295C promotes preeclampsia by impairing decidualization through downregulating IL-11Rα. iScience 2024; 27:108899. [PMID: 38559585 PMCID: PMC10978358 DOI: 10.1016/j.isci.2024.108899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Revised: 11/24/2023] [Accepted: 01/09/2024] [Indexed: 04/04/2024]
Abstract
Preeclampsia (PE) is a pregnancy-specific hypertensive disorder that severely impairs maternal and fetal health. However, its pathogenesis remains elusive. NOP2/Sun5 (NSUN5) is an RNA methyltransferase. This study discovered a significant correlation between rs77133388 of NSUN5 and PE in a cohort of 868 severe PE patients and 982 healthy controls. To further explore this association, the researchers generated single-base mutant mice (NSUN5 R295C) at rs77133388. The pregnant NSUN5 R295C mice exhibited PE symptoms. Additionally, compared to the controls, the decidual area of the placenta was significantly reduced in NSUN5 R295C mice, and their decidualization was impaired with a significant decrease in polyploid cell numbers after artificially induced decidualization. The study also found a decrease in phosphorylated JAK2, STAT3, and IL-11Rα, Cyclin D3 expression in NSUN5 R295C mice. Overall, these findings suggest that NSUN5 mutation potentially alters decidualization through the IL-11Rα/JAK2/STAT3/Cyclin D3 pathway, ultimately impairing placental development and contributing to PE occurrence.
Affiliation(s)
- Hongya Zhang
- Shandong Provincial Key Laboratory of Animal Resistance Biology, College of Life Sciences, Shandong Normal University, Jinan, Shandong 250014, China
- Huihui Li
- Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Qilu Hospital of Shandong University, Jinan, Shandong 250012, China
- Jiatong Yao
- Shandong Provincial Key Laboratory of Animal Resistance Biology, College of Life Sciences, Shandong Normal University, Jinan, Shandong 250014, China
- Miaomiao Zhao
- Shandong Provincial Key Laboratory of Animal Resistance Biology, College of Life Sciences, Shandong Normal University, Jinan, Shandong 250014, China
- Cong Zhang
- Shandong Provincial Key Laboratory of Animal Resistance Biology, College of Life Sciences, Shandong Normal University, Jinan, Shandong 250014, China
- Center for Reproductive Medicine, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200135, China
- Shanghai Key Laboratory for Assisted Reproduction and Reproductive Genetics, Shanghai 200135, China
- Shandong Provincial Key Laboratory of Reproductive Medicine, Jinan, Shandong 250001, China
9
Wang S, Li T, Zhao B, Dai W, Yao Y, Li C, Li T, Zhu H, Zhang H. Identification and validation of supervariants reveal novel loci associated with human white matter microstructure. Genome Res 2024; 34:20-33. [PMID: 38190638 PMCID: PMC10904010 DOI: 10.1101/gr.277905.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2023] [Accepted: 12/05/2023] [Indexed: 01/10/2024]
Abstract
As an essential part of the central nervous system, white matter coordinates communications between different brain regions and is related to a wide range of neurodegenerative and neuropsychiatric disorders. Previous genome-wide association studies (GWASs) have uncovered loci associated with white matter microstructure. However, GWASs suffer from limited reproducibility and difficulties in detecting multi-single-nucleotide polymorphism (multi-SNP) and epistatic effects. In this study, we adopt the concept of supervariants, a combination of alleles in multiple loci, to account for potential multi-SNP effects. We perform supervariant identification and validation to identify loci associated with 22 white matter fractional anisotropy phenotypes derived from diffusion tensor imaging. To increase reproducibility, we use United Kingdom (UK) Biobank White British (n = 30,842) data for discovery and internal validation, and UK Biobank White but non-British (n = 1927) data, Europeans from the Adolescent Brain Cognitive Development study (n = 4399) data, and Europeans from the Human Connectome Project (n = 319) data for external validation. We identify 23 novel loci on the discovery set that have not been reported in the previous GWASs on white matter microstructure. Among them, three supervariants on genomic regions 5q35.1, 8p21.2, and 19q13.32 have P-values lower than 0.05 in the meta-analysis of the three independent validation data sets. These supervariants contain genetic variants located in genes that have been related to brain structures, cognitive functions, and neuropsychiatric diseases. Our findings provide a better understanding of the genetic architecture underlying white matter microstructure.
Affiliation(s)
- Shiying Wang
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06510, USA
- Ting Li
- Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China
- Bingxin Zhao
- Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104-1686, USA
- Wei Dai
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06510, USA
- Yisha Yao
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06510, USA
- Cai Li
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
- Tengfei Li
- Department of Radiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Biomedical Research Imaging Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27514, USA
- Hongtu Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Heping Zhang
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06510, USA
10
He J, Antonyan L, Zhu H, Ardila K, Li Q, Enoma D, Zhang W, Liu A, Chekouo T, Cao B, MacDonald ME, Arnold PD, Long Q. A statistical method for image-mediated association studies discovers genes and pathways associated with four brain disorders. Am J Hum Genet 2024; 111:48-69. [PMID: 38118447 PMCID: PMC10806749 DOI: 10.1016/j.ajhg.2023.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 11/04/2023] [Accepted: 11/16/2023] [Indexed: 12/22/2023]
Abstract
Brain imaging and genomics are critical tools enabling characterization of the genetic basis of brain disorders. However, imaging large cohorts is expensive and may be unavailable for legacy datasets used for genome-wide association studies (GWASs). Using an integrated feature selection/aggregation model, we developed an image-mediated association study (IMAS), which utilizes borrowed imaging/genomics data to conduct association mapping in legacy GWAS cohorts. By leveraging the UK Biobank image-derived phenotypes (IDPs), the IMAS discovered genetic bases underlying four neuropsychiatric disorders and verified them by analyzing annotations, pathways, and expression quantitative trait loci (eQTLs). A cerebellar-mediated mechanism was identified to be common to the four disorders. Simulations show that, if the goal is identifying genetic risk, our IMAS is more powerful than a hypothetical protocol in which the imaging results were available in the GWAS dataset. This implies the feasibility of reanalyzing legacy GWAS datasets without conducting additional imaging, yielding cost savings for integrated analysis of genetics and imaging.
Affiliation(s)
- Jingni He
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Lilit Antonyan
- Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Harold Zhu
- Department of Biological Sciences, Faculty of Science, University of Calgary, Calgary, AB, Canada
- Karen Ardila
- Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada
- Qing Li
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- David Enoma
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Andy Liu
- Sir Winston Churchill High School, Calgary, AB, Canada; College of Letters and Science, University of California, Los Angeles, Los Angeles, CA, USA
- Thierry Chekouo
- Department of Mathematics and Statistics, Faculty of Science, University of Calgary, Calgary, AB, Canada; Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Bo Cao
- Department of Psychiatry, Faculty of Medicine & Dentistry, University of Alberta, Edmonton, AB, Canada
- M Ethan MacDonald
- The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; Department of Electrical and Software Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; Department of Radiology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Paul D Arnold
- Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Psychiatry, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
- Quan Long
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Mathematics and Statistics, Faculty of Science, University of Calgary, Calgary, AB, Canada.
11
Mou SI, Sultana T, Chatterjee D, Faruk MO, Hosen MI. Comprehensive characterization of coding and non-coding single nucleotide polymorphisms of the Myoneurin (MYNN) gene using molecular dynamics simulation and docking approaches. PLoS One 2024; 19:e0296361. [PMID: 38165846 PMCID: PMC10760682 DOI: 10.1371/journal.pone.0296361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Accepted: 12/11/2023] [Indexed: 01/04/2024]
Abstract
Genome-wide association studies (GWAS) identified a coding single nucleotide polymorphism, MYNN rs10936599, at chromosome 3q. The MYNN gene encodes the myoneurin protein, which has been associated with the pathogenesis of several cancers and with disease development processes. However, the structural, functional, and molecular impact of this polymorphism (and of other coding and non-coding polymorphisms) had not been characterized in detail. The current study addressed this gap and analyzed different properties of rs10936599 and non-coding SNPs of MYNN via a thorough computational method. The variant, rs10936599, was predicted to be functionally deleterious by nine functionality prediction approaches, including SIFT, PolyPhen-2, and REVEL. Following that, structural modifications were estimated through the HOPE server and Mutation3D. Moreover, the mutation was found in a conserved and active residue, according to ConSurf and CPORT. Further, the secondary structures were predicted, followed by tertiary structures, and there was a significant deviation between the native and variant models. Similarly, molecular simulation also showed considerable differences in the dynamic pattern of the wildtype and mutant structures. Molecular docking revealed that the variant binds with better docking scores with ligand NOTCH2. In addition, non-coding SNPs located at the MYNN locus were retrieved from the ENSEMBL database. These were found to disrupt the transcription factor binding regulatory regions; nonetheless, only two affect miRNA target sites. Again, eight non-coding variants were detected in the testes with normalized expression, whereas HaploReg v4.1 unveiled annotations for non-coding variants. In summary, this comprehensive in silico characterization of coding and non-coding single nucleotide polymorphisms of the MYNN gene will assist researchers working on MYNN to establish their association with certain types of cancers.
Affiliation(s)
- Sadia Islam Mou
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh
| | - Tamanna Sultana
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh
| | - Dipankor Chatterjee
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh
| | - Md. Omar Faruk
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh
| | - Md. Ismail Hosen
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh
|
12
|
Liu Z, Xu J, Tan J, Li X, Zhang F, Ouyang W, Wang S, Huang Y, Li S, Pan X. Genetic overlap for ten cardiovascular diseases: A comprehensive gene-centric pleiotropic association analysis and Mendelian randomization study. iScience 2023; 26:108150. [PMID: 37908310 PMCID: PMC10613921 DOI: 10.1016/j.isci.2023.108150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 08/13/2023] [Accepted: 10/02/2023] [Indexed: 11/02/2023] Open
Abstract
Recent studies suggest that pleiotropic effects may explain the genetic architecture of cardiovascular diseases (CVDs). We conducted a comprehensive gene-centric pleiotropic association analysis for ten CVDs using genome-wide association study (GWAS) summary statistics to identify pleiotropic genes and pathways that may underlie multiple CVDs. We found shared genetic mechanisms underlying the pathophysiology of CVDs, with over two-thirds of the diseases exhibiting common genes and single-nucleotide polymorphisms (SNPs). Significant positive genetic correlations were observed in more than half of paired CVDs. Additionally, we investigated the pleiotropic genes shared between different CVDs, as well as their functional pathways and distribution in different tissues. Moreover, six hub genes, including ALDH2, XPO1, HSPA1L, ESR2, WDR12, and RAB1A, as well as 26 targeted potential drugs, were identified. Our study provides further evidence for the pleiotropic effects of genetic variants on CVDs and highlights the importance of considering pleiotropy in genetic association studies.
Affiliation(s)
- Zeye Liu
- Department of Structural Heart Disease, National Center for Cardiovascular Disease, China & Fuwai Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100037, China
- National Health Commission Key Laboratory of Cardiovascular Regeneration Medicine, Beijing 100037, China
- Key Laboratory of Innovative Cardiovascular Devices, Chinese Academy of Medical Sciences, Beijing 100037, China
- National Clinical Research Center for Cardiovascular Diseases, Fuwai Hospital, Chinese Academy of Medical Sciences, Beijing 100037, China
| | - Jing Xu
- State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Fuwai Hospital, Chinese Academy of Medical Sciences, and Peking Union Medical College, Beijing, China
| | - Jiangshan Tan
- Key Laboratory of Pulmonary Vascular Medicine, National Clinical Research Center of Cardiovascular Diseases, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100037, China
| | - Xiaofei Li
- Department of Cardiology, Fuwai Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
| | - Fengwen Zhang
- Department of Structural Heart Disease, National Center for Cardiovascular Disease, China & Fuwai Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100037, China
- National Health Commission Key Laboratory of Cardiovascular Regeneration Medicine, Beijing 100037, China
- Key Laboratory of Innovative Cardiovascular Devices, Chinese Academy of Medical Sciences, Beijing 100037, China
- National Clinical Research Center for Cardiovascular Diseases, Fuwai Hospital, Chinese Academy of Medical Sciences, Beijing 100037, China
| | - Wenbin Ouyang
- Department of Structural Heart Disease, National Center for Cardiovascular Disease, China & Fuwai Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100037, China
- National Health Commission Key Laboratory of Cardiovascular Regeneration Medicine, Beijing 100037, China
- Key Laboratory of Innovative Cardiovascular Devices, Chinese Academy of Medical Sciences, Beijing 100037, China
- National Clinical Research Center for Cardiovascular Diseases, Fuwai Hospital, Chinese Academy of Medical Sciences, Beijing 100037, China
| | - Shouzheng Wang
- Department of Structural Heart Disease, National Center for Cardiovascular Disease, China & Fuwai Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100037, China
- National Health Commission Key Laboratory of Cardiovascular Regeneration Medicine, Beijing 100037, China
- Key Laboratory of Innovative Cardiovascular Devices, Chinese Academy of Medical Sciences, Beijing 100037, China
- National Clinical Research Center for Cardiovascular Diseases, Fuwai Hospital, Chinese Academy of Medical Sciences, Beijing 100037, China
| | - Yuan Huang
- State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Pediatric Cardiac Surgery Center, Fuwai Hospital, Chinese Academy of Medical Sciences, and Peking Union Medical College, Beijing, China
| | - Shoujun Li
- State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Pediatric Cardiac Surgery Center, Fuwai Hospital, Chinese Academy of Medical Sciences, and Peking Union Medical College, Beijing, China
| | - Xiangbin Pan
- Department of Structural Heart Disease, National Center for Cardiovascular Disease, China & Fuwai Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100037, China
- National Health Commission Key Laboratory of Cardiovascular Regeneration Medicine, Beijing 100037, China
- Key Laboratory of Innovative Cardiovascular Devices, Chinese Academy of Medical Sciences, Beijing 100037, China
- National Clinical Research Center for Cardiovascular Diseases, Fuwai Hospital, Chinese Academy of Medical Sciences, Beijing 100037, China
|
13
|
John M, Lencz T. Potential application of elastic nets for shared polygenicity detection with adapted threshold selection. Int J Biostat 2023; 19:417-438. [PMID: 36327464 PMCID: PMC10154439 DOI: 10.1515/ijb-2020-0108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2020] [Accepted: 10/05/2022] [Indexed: 11/06/2022]
Abstract
Current research suggests that hundreds to thousands of single nucleotide polymorphisms (SNPs) with small to modest effect sizes contribute to the genetic basis of many disorders, a phenomenon labeled as polygenicity. Additionally, many such disorders demonstrate polygenic overlap, in which risk alleles are shared at associated genetic loci. A simple strategy to detect polygenic overlap between two phenotypes is based on rank-ordering the univariate p-values from two genome-wide association studies (GWASs). Although high-dimensional variable selection strategies such as Lasso and elastic nets have been utilized in other GWAS analysis settings, they are yet to be utilized for detecting shared polygenicity. In this paper, we illustrate how elastic nets, with polygenic scores as the dependent variable and with appropriate adaptation in selecting the penalty parameter, may be utilized for detecting a subset of SNPs involved in shared polygenicity. We provide theory to better understand our approaches, and illustrate their utility using synthetic datasets. Results from extensive simulations are presented comparing the elastic net approaches with the rank ordering approach, in various scenarios. Results from simulation studies show one of the elastic net approaches to be superior when the correlations among the SNPs are high. Finally, we apply the methods to two real datasets to illustrate further the capabilities, limitations and differences among the methods.
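The rank-ordering baseline mentioned in the abstract above can be pictured roughly as follows. This is only a minimal sketch under stated assumptions: the abstract does not specify the exact procedure, so the top-k cut-off, the function names `top_k_snps` and `shared_top_snps`, and the toy p-values are all hypothetical and are not taken from the paper.

```python
def top_k_snps(pvalues, k):
    """Return the k SNP identifiers with the smallest univariate p-values."""
    return set(sorted(pvalues, key=pvalues.get)[:k])


def shared_top_snps(gwas_a, gwas_b, k):
    """SNPs ranked in the top k of both GWASs (an assumed overlap criterion)."""
    return top_k_snps(gwas_a, k) & top_k_snps(gwas_b, k)


# Hypothetical per-SNP p-values from two GWASs on the same SNP set.
gwas_a = {"rs1": 1e-8, "rs2": 0.3, "rs3": 1e-4, "rs4": 0.04}
gwas_b = {"rs1": 2e-6, "rs2": 0.01, "rs3": 0.5, "rs4": 1e-3}

shared = shared_top_snps(gwas_a, gwas_b, k=2)
print(shared)
```

A rank-based overlap of this kind treats each SNP independently, which is one reason a joint penalized model such as an elastic net might behave differently when SNPs are strongly correlated.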
Affiliation(s)
- Majnu John
- Institute of Behavioral Science, Feinstein Institutes of Medical Research, Manhasset, NY
- Division of Psychiatry Research, The Zucker Hillside Hospital, Northwell Health System, Glen Oaks, NY
- Departments of Psychiatry and of Mathematics, Hofstra University, Hempstead, NY
| | - Todd Lencz
- Institute of Behavioral Science, Feinstein Institutes of Medical Research, Manhasset, NY
- Division of Psychiatry Research, The Zucker Hillside Hospital, Northwell Health System, Glen Oaks, NY
- Departments of Psychiatry and of Molecular Medicine, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY
|
14
|
Das Adhikari S, Cui Y, Wang J. BayesKAT: Bayesian Optimal Kernel-based Test for genetic association studies reveals joint genetic effects in complex diseases. bioRxiv 2023:2023.10.18.562824. [PMID: 37905124 PMCID: PMC10614916 DOI: 10.1101/2023.10.18.562824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
GWAS methods have identified individual SNPs significantly associated with specific phenotypes. Nonetheless, many complex diseases are polygenic and are controlled by multiple genetic variants that are usually non-linearly dependent. These genetic variants are marginally less effective and remain undetected in GWAS analysis. Kernel-based tests (KBT), which evaluate the joint effect of a group of genetic variants, are therefore critical for complex disease analysis. However, choosing different kernel functions in KBT can significantly influence the type I error control and power, and selecting the optimal kernel remains a statistically challenging task. A few existing methods suffer from inflated type I errors, limited scalability, inferior power, or issues of ambiguous conclusions. Here, we present a new Bayesian framework, BayesKAT (https://github.com/wangjr03/BayesKAT), which overcomes these kernel specification issues by selecting the optimal composite kernel adaptively from the data while testing genetic associations simultaneously. Furthermore, BayesKAT implements a scalable computational strategy to boost its applicability, especially for high-dimensional cases where other methods become less effective. Based on a series of performance comparisons using both simulated and real large-scale genetics data, BayesKAT outperforms the available methods in detecting complex group-level associations and controlling type I errors simultaneously. Applied to a variety of groups of functionally related genetic variants based on biological pathways, co-expression gene modules, and protein complexes, BayesKAT deciphers the complex genetic basis and provides mechanistic insights into human diseases.
|
15
|
Li T, Ferraro N, Strober BJ, Aguet F, Kasela S, Arvanitis M, Ni B, Wiel L, Hershberg E, Ardlie K, Arking DE, Beer RL, Brody J, Blackwell TW, Clish C, Gabriel S, Gerszten R, Guo X, Gupta N, Johnson WC, Lappalainen T, Lin HJ, Liu Y, Nickerson DA, Papanicolaou G, Pritchard JK, Qasba P, Shojaie A, Smith J, Sotoodehnia N, Taylor KD, Tracy RP, Van Den Berg D, Wheeler MT, Rich SS, Rotter JI, Battle A, Montgomery SB. The functional impact of rare variation across the regulatory cascade. Cell Genomics 2023; 3:100401. [PMID: 37868038 PMCID: PMC10589633 DOI: 10.1016/j.xgen.2023.100401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 03/08/2023] [Accepted: 08/10/2023] [Indexed: 10/24/2023]
Abstract
Each human genome has tens of thousands of rare genetic variants; however, identifying impactful rare variants remains a major challenge. We demonstrate how use of personal multi-omics can enable identification of impactful rare variants by using the Multi-Ethnic Study of Atherosclerosis, which included several hundred individuals, with whole-genome sequencing, transcriptomes, methylomes, and proteomes collected across two time points, 10 years apart. We evaluated each multi-omics phenotype's ability to separately and jointly inform functional rare variation. By combining expression and protein data, we observed rare stop variants 62 times and rare frameshift variants 216 times as frequently as controls, compared to 13-27 times as frequently for expression or protein effects alone. We extended a Bayesian hierarchical model, "Watershed," to prioritize specific rare variants underlying multi-omics signals across the regulatory cascade. With this approach, we identified rare variants that exhibited large effect sizes on multiple complex traits including height, schizophrenia, and Alzheimer's disease.
Affiliation(s)
- Taibo Li
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Nicole Ferraro
- Biomedical Informatics Training Program, Stanford University, Stanford, CA, USA
| | - Benjamin J. Strober
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Harvard School of Public Health, Epidemiology Department, Boston, MA, USA
| | | | - Silva Kasela
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Marios Arvanitis
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Medicine, Division of Cardiology, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Bohan Ni
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Laurens Wiel
- Division of Cardiovascular Medicine, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | | | | | - Dan E. Arking
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Rebecca L. Beer
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Jennifer Brody
- Cardiovascular Health Research Unit, Departments of Medicine and Epidemiology, University of Washington, Seattle, WA, USA
| | - Thomas W. Blackwell
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Clary Clish
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Robert Gerszten
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Cardiovascular Institute, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Namrata Gupta
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - W. Craig Johnson
- Collaborative Health Studies Coordinating Center, University of Washington, Seattle, WA, USA
| | - Tuuli Lappalainen
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Henry J. Lin
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Yongmei Liu
- Department of Medicine, Duke University School of Medicine, Durham, NC, USA
| | | | - George Papanicolaou
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | | | - Pankaj Qasba
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington School of Public Health, Seattle, WA, USA
| | - Josh Smith
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Nona Sotoodehnia
- Cardiovascular Health Research Unit, Departments of Medicine and Epidemiology, University of Washington, Seattle, WA, USA
| | - Kent D. Taylor
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Russell P. Tracy
- Laboratory for Clinical Biochemistry Research, University of Vermont, Burlington, VT, USA
| | - David Van Den Berg
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
| | - Matthew T. Wheeler
- Division of Cardiovascular Medicine, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Stephen S. Rich
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Jerome I. Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Malone Center for Engineering of Healthcare, Johns Hopkins University, Baltimore, MD, USA
| | - Stephen B. Montgomery
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Pathology, Stanford University, Stanford, CA, USA
|
16
|
Boutry S, Helaers R, Lenaerts T, Vikkula M. Rare variant association on unrelated individuals in case-control studies using aggregation tests: existing methods and current limitations. Brief Bioinform 2023; 24:bbad412. [PMID: 37974506 DOI: 10.1093/bib/bbad412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 10/14/2023] [Accepted: 10/28/2023] [Indexed: 11/19/2023] Open
Abstract
Over the past years, progress in next-generation sequencing technologies and bioinformatics has sparked a surge in association studies. In particular, genome-wide association studies (GWASs) have demonstrated their effectiveness in identifying disease associations with common genetic variants. Yet, rare variants can contribute to additional disease risk or trait heterogeneity. Because GWASs are underpowered for detecting association with such variants, numerous statistical methods have been recently proposed. Aggregation tests collapse multiple rare variants within a genetic region (e.g. gene, gene set, genomic locus) to test for association. An increasing number of studies using such methods have successfully identified trait-associated rare variants and led to a better understanding of the underlying disease mechanism. In this review, we compare existing aggregation tests, their statistical features and scope of application, splitting them into the five classical classes: burden, adaptive burden, variance-component, omnibus and other. Finally, we describe some limitations of current aggregation tests, highlighting potential directions for further investigation.
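To make the idea of collapsing rare variants concrete, here is a minimal sketch of one of the simplest burden-style tests: each individual is reduced to a carrier indicator (any rare allele in the region), and carrier status is compared between cases and controls with a one-sided Fisher exact test. The MAF threshold, the function names, and the toy genotypes are illustrative assumptions, not material from the review, and real analyses would normally rely on an established package.

```python
from math import comb


def carrier_burden(genotypes, mafs, maf_threshold=0.01):
    """Collapse a region: 1 if the individual carries any rare allele, else 0."""
    rare = [j for j, maf in enumerate(mafs) if maf < maf_threshold]
    return [int(any(row[j] > 0 for j in rare)) for row in genotypes]


def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] under the hypergeometric null.

    Rows are cases/controls; columns are carriers/non-carriers.
    """
    n_cases, n_carriers, n = a + b, a + c, a + b + c + d
    upper = min(n_cases, n_carriers)
    tail = sum(
        comb(n_cases, k) * comb(n - n_cases, n_carriers - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, n_carriers)


# Toy data (hypothetical): 4 cases then 4 controls, 3 variants in one gene.
mafs = [0.005, 0.2, 0.008]  # the second variant is common and is ignored
genotypes = [
    [1, 0, 0], [0, 1, 1], [0, 2, 0], [1, 0, 0],  # cases
    [0, 1, 0], [0, 0, 0], [0, 0, 0], [0, 2, 0],  # controls
]
burden = carrier_burden(genotypes, mafs)
cases, controls = burden[:4], burden[4:]
a, b = sum(cases), len(cases) - sum(cases)
c, d = sum(controls), len(controls) - sum(controls)
p_value = fisher_one_sided(a, b, c, d)
print(burden, p_value)
```

A collapsing test like this assumes that rare variants in the region act in the same direction; variance-component and omnibus tests are the classes usually described as relaxing that assumption.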
Affiliation(s)
- Simon Boutry
- Human Molecular Genetics, de Duve Institute, University of Louvain, Avenue Hippocrate 74 (+5) bte B1.74.06, 1200 Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussels, 1050 Brussels, Belgium
| | - Raphaël Helaers
- Human Molecular Genetics, de Duve Institute, University of Louvain, Avenue Hippocrate 74 (+5) bte B1.74.06, 1200 Brussels, Belgium
| | - Tom Lenaerts
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussels, 1050 Brussels, Belgium
- Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium
- Artificial Intelligence laboratory, Vrije Universiteit Brussel, 1050 Brussels, Belgium
| | - Miikka Vikkula
- Human Molecular Genetics, de Duve Institute, University of Louvain, Avenue Hippocrate 74 (+5) bte B1.74.06, 1200 Brussels, Belgium
- WELBIO department, WEL Research Institute, avenue Pasteur, 6, 1300 Wavre, Belgium
|
17
|
Bass AJ, Bian S, Wingo AP, Wingo TS, Cutler DJ, Epstein MP. Identifying latent genetic interactions in genome-wide association studies using multiple traits. bioRxiv 2023:2023.09.11.557155. [PMID: 37745553 PMCID: PMC10515795 DOI: 10.1101/2023.09.11.557155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]
Abstract
Genome-wide association studies of complex traits frequently find that SNP-based estimates of heritability are considerably smaller than estimates from classic family-based studies. This 'missing' heritability may be partly explained by genetic variants interacting with other genes or environments that are difficult to specify, observe, and detect. To circumvent these challenges, we propose a new method to detect genetic interactions that leverages pleiotropy from multiple related traits without requiring the interacting variable to be specified or observed. Our approach, Latent Interaction Testing (LIT), uses the observation that correlated traits with shared latent genetic interactions have trait variance and covariance patterns that differ by genotype. LIT examines the relationship between trait variance/covariance patterns and genotype using a flexible kernel-based framework that is computationally scalable for biobank-sized datasets with a large number of traits. We first use simulated data to demonstrate that LIT substantially increases power to detect latent genetic interactions compared to a trait-by-trait univariate method. We then apply LIT to four obesity-related traits in the UK Biobank and detect genetic variants with interactive effects near known obesity-related genes. Overall, we show that LIT, implemented in the R package lit, uses shared information across traits to improve detection of latent genetic interactions compared to standard approaches.
Affiliation(s)
- Andrew J. Bass
- Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
| | - Shijia Bian
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
| | - Aliza P. Wingo
- Department of Psychiatry, Emory University, Atlanta, GA 30322, USA
| | - Thomas S. Wingo
- Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
- Department of Neurology, Emory University, Atlanta, GA 30322, USA
| | - David J. Cutler
- Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
|
18
|
Boutry S, Helaers R, Lenaerts T, Vikkula M. Excalibur: A new ensemble method based on an optimal combination of aggregation tests for rare-variant association testing for sequencing data. PLoS Comput Biol 2023; 19:e1011488. [PMID: 37708232 PMCID: PMC10522036 DOI: 10.1371/journal.pcbi.1011488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 09/26/2023] [Accepted: 09/04/2023] [Indexed: 09/16/2023] Open
Abstract
The development of high-throughput next-generation sequencing technologies and large-scale genetic association studies produced numerous advances in the biostatistics field. Various aggregation tests, i.e. statistical methods that analyze associations of a trait with multiple markers within a genomic region, have produced a variety of novel discoveries. Notwithstanding their usefulness, there is no single test that fits all needs, each suffering from specific drawbacks. Selecting the right aggregation test, while considering an unknown underlying genetic model of the disease, remains an important challenge. Here we propose a new ensemble method, called Excalibur, based on an optimal combination of 36 aggregation tests created after an in-depth study of the limitations of each test and their impact on the quality of the results. Our findings demonstrate the ability of our method to control type I error and illustrate that it offers the best average power across all scenarios. The proposed method allows for novel advances in whole exome/genome sequencing association studies and is able to handle a wide range of association models, providing researchers with an optimal aggregation analysis for the genetic regions of interest.
Affiliation(s)
- Simon Boutry
- Human Molecular Genetics, de Duve Institute, University of Louvain, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussels, Brussels, Belgium
| | - Raphaël Helaers
- Human Molecular Genetics, de Duve Institute, University of Louvain, Brussels, Belgium
| | - Tom Lenaerts
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussels, Brussels, Belgium
- Machine Learning Group, Université Libre de Bruxelles, Brussels, Belgium
- Artificial Intelligence laboratory, Vrije Universiteit Brussel, Brussels, Belgium
| | - Miikka Vikkula
- Human Molecular Genetics, de Duve Institute, University of Louvain, Brussels, Belgium
- WELBIO department, WEL Research Institute, Wavre, Belgium
|
19
|
Zheng J, Wang X, Li J, Wu Y, Chang J, Xin J, Wang M, Wang T, Wei Q, Wang M, Zhang R. Rare variants confer shared susceptibility to gastrointestinal tract cancer risk. Front Oncol 2023; 13:1161639. [PMID: 37483484 PMCID: PMC10358854 DOI: 10.3389/fonc.2023.1161639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Accepted: 06/12/2023] [Indexed: 07/25/2023] Open
Abstract
Background: Cancers arising within the gastrointestinal tract are complex disorders involving genetic events that cause the conversion of normal tissue to premalignant lesions and malignancy. Shared genetic features are reported in epithelial-based gastrointestinal cancers, which indicates common susceptibility among this group of malignancies. In addition, rare variants may account for part of this genetic susceptibility. Methods: A cross-cancer analysis of 38,171 shared rare genetic variants from genome-wide association assays was conducted, which included data from 3,194 cases and 1,455 controls across three cancer sites (esophageal, gastric and colorectal). The SNP-level association was performed by multivariate logistic regression analyses for single cancer, followed by association analysis for SubSETs (ASSET) to adjust for the bias of overlapping controls. Gene-level analyses were conducted by SKAT-O, with multiple comparison adjustments by false discovery rate (FDR). Based on the significant genes indicated by SKAT-O analysis, pathway analysis was conducted using the Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome databases. Results: Meta-analysis in three gastrointestinal (GI) cancers identified 13 novel susceptibility loci that reached genome-wide significance (P_ASSET < 5×10^-8). SKAT-O analysis revealed EXOC6, LRP5L and MIR1263/LINC01324 to be significant genes shared by GI cancers (P_adj < 0.05, P_FDR < 0.05). Furthermore, GO pathway analysis identified significant enrichment of synaptic transmission and neuron development pathways shared by all three cancer types. Conclusion: Rare variants and the corresponding genes potentially contribute to shared susceptibility in different GI cancer types. The discovery of these novel variants and genes offers new insights into the carcinogenic mechanisms and missing heritability of GI cancers.
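The FDR adjustment applied to the gene-level p-values can be illustrated with the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR method was used, so treating it as Benjamini-Hochberg is an assumption here; the gene labels and p-values below are hypothetical placeholders rather than results from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical gene-level p-values (e.g. from a gene-based test).
gene_pvalues = {"GENE_A": 0.001, "GENE_B": 0.02, "GENE_C": 0.03, "GENE_D": 0.5}
names = list(gene_pvalues)
adj = benjamini_hochberg([gene_pvalues[g] for g in names])
significant = [g for g, q in zip(names, adj) if q < 0.05]
print(dict(zip(names, adj)), significant)
```

With these toy inputs, the three smallest p-values remain below the 0.05 threshold after adjustment, while the fourth does not.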
Affiliation(s)
- Ji Zheng
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety, Ministry of Education, Fudan University, Shanghai, China
| | - Xin Wang
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety, Ministry of Education, Fudan University, Shanghai, China
- Office of Cancer Screening, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Jingrao Li
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety, Ministry of Education, Fudan University, Shanghai, China
| | - Yuanna Wu
- Department of Biological Sciences, Dedman College of Humanities and Sciences, Southern Methodist University, Dallas, TX, United States
| | - Jiang Chang
- Department of Health Toxicology, Key Laboratory for Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Junyi Xin
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, Nanjing, China
- Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Meilin Wang
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, Nanjing, China
- Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
- The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, China
| | - Tianpei Wang
- Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Qingyi Wei
- Duke Cancer Institute, Duke University Medical Center, Durham, NC, United States
- Department of Population Health Sciences, Duke University School of Medicine, Durham, NC, United States
| | - Mengyun Wang
- Yiwu Research Institute of Fudan University, Yiwu, Zhejiang, China
- Cancer Institute, Fudan University Shanghai Cancer Center, Shanghai Medical College, Shanghai, China
| | - Ruoxin Zhang
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety, Ministry of Education, Fudan University, Shanghai, China
- Yiwu Research Institute of Fudan University, Yiwu, Zhejiang, China
- Cancer Institute, Fudan University Shanghai Cancer Center, Shanghai Medical College, Shanghai, China
| |
Collapse
|
20
|
Gao XR, Chiariglione M, Choquet H, Arch AJ. 10 Years of GWAS in intraocular pressure. Front Genet 2023; 14:1130106. [PMID: 37124618 PMCID: PMC10130654 DOI: 10.3389/fgene.2023.1130106] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/22/2022] [Accepted: 04/05/2023] [Indexed: 05/02/2023] Open
Abstract
Intraocular pressure (IOP) is the only modifiable risk factor for glaucoma, the leading cause of irreversible blindness worldwide. In this review, we summarize the findings of genome-wide association studies (GWASs) of IOP published in the past 10 years and prior to December 2022. Over 190 genetic loci and candidate genes associated with IOP have been uncovered through GWASs, although most of these studies were conducted in subjects of European and Asian ancestries. We also discuss how these common variants have been used to derive polygenic risk scores for predicting IOP and glaucoma, and to infer causal relationship with other traits and conditions through Mendelian randomization. Additionally, we summarize the findings from a recent large-scale exome-wide association study (ExWAS) that identified rare variants associated with IOP in 40 novel genes, six of which are drug targets for clinical treatment or are being evaluated in clinical trials. Finally, we discuss the need for future genetic studies of IOP to include individuals from understudied populations, including Latinos and Africans, in order to fully characterize the genetic architecture of IOP.
Affiliation(s)
- Xiaoyi Raymond Gao
  - Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, United States
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
  - Division of Human Genetics, The Ohio State University, Columbus, OH, United States
- Marion Chiariglione
  - Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, United States
- Hélène Choquet
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA, United States
- Alexander J. Arch
  - Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, United States
21
Knutson KA, Pan W. MATS: a novel multi-ancestry transcriptome-wide association study to account for heterogeneity in the effects of cis-regulated gene expression on complex traits. Hum Mol Genet 2023; 32:1237-1251. [PMID: 36179104 PMCID: PMC10077507 DOI: 10.1093/hmg/ddac247] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/21/2022] [Revised: 09/16/2022] [Accepted: 09/28/2022] [Indexed: 01/16/2023] Open
Abstract
The Transcriptome-Wide Association Study (TWAS) is a widely used approach that integrates gene expression and Genome-Wide Association Study (GWAS) data to study the role of cis-regulated gene expression (GEx) in complex traits. However, the genetic architecture of GEx varies across populations, and recent findings point to possible ancestral heterogeneity in the effects of GEx on complex traits, which may be amplified in TWAS by modeling GEx as a function of cis-eQTLs. Here, we present a novel extension to TWAS to account for heterogeneity in the effects of cis-regulated GEx that are correlated with ancestry. Our proposed Multi-Ancestry TwaS (MATS) framework jointly analyzes samples from multiple populations and distinguishes between shared, ancestry-specific and/or subject-specific expression-trait associations. As such, MATS amplifies power to detect shared GEx associations over ancestry-stratified TWAS through increased sample sizes, and facilitates the detection of genes with subgroup-specific associations that may be masked by standard TWAS. Our simulations highlight the improved Type-I error conservation and power of MATS compared with competing approaches. Our real data applications to Alzheimer's disease (AD) case-control genotypes from the Alzheimer's Disease Sequencing Project (ADSP) and continuous phenotypes from the UK Biobank (UKBB) identify a number of unique gene-trait associations that were not discovered through standard and/or ancestry-stratified TWAS. Ultimately, these findings promote MATS as a powerful method for detecting and estimating significant gene expression effects on complex traits within multi-ancestry cohorts and corroborate the mounting evidence for inter-population heterogeneity in gene-trait associations.
Affiliation(s)
- Wei Pan
  - Division of Biostatistics, University of Minnesota, Minneapolis, MN, USA
22
Genetic correlation and gene-based pleiotropy analysis for four major neurodegenerative diseases with summary statistics. Neurobiol Aging 2023; 124:117-128. [PMID: 36740554 DOI: 10.1016/j.neurobiolaging.2022.12.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2021] [Revised: 03/25/2022] [Accepted: 12/27/2022] [Indexed: 01/02/2023]
Abstract
Recent genome-wide association studies suggested shared genetic components between neurodegenerative diseases. However, pleiotropic association patterns among them remain poorly understood. Here we analyzed 4 major neurodegenerative diseases, namely Alzheimer's disease (AD), Parkinson's disease (PD), frontotemporal dementia (FTD) and amyotrophic lateral sclerosis (ALS), and found suggestively positive genetic correlation. We next implemented a gene-centric pleiotropy analysis with a powerful method called PLACO and detected 280 pleiotropic associations (226 unique genes) with these diseases. Functional analyses demonstrated that these genes were enriched in the pancreas, liver, heart, blood, brain, and muscle tissues, and that 42 pleiotropic genes exhibited drug-gene interactions with 341 drugs. Using Mendelian randomization, we discovered that AD and PD can increase the risk of developing ALS, and that AD and ALS can also increase the risk of developing FTD. Overall, this study provides in-depth insights into shared genetic components and causal relationships among the 4 major neurodegenerative diseases, indicating that genetic overlap and causality commonly drive their co-occurrence. It also has important implications for understanding the etiology of neurodegenerative diseases, for drug development, and for the identification of therapeutic targets.
23
Wang J, Zhou F, Li C, Yin N, Liu H, Zhuang B, Huang Q, Wen Y. Gene Association Analysis of Quantitative Trait Based on Functional Linear Regression Model with Local Sparse Estimator. Genes (Basel) 2023; 14:genes14040834. [PMID: 37107592 PMCID: PMC10137544 DOI: 10.3390/genes14040834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Revised: 03/27/2023] [Accepted: 03/28/2023] [Indexed: 04/03/2023] Open
Abstract
Functional linear regression models have been widely used in the gene association analysis of complex traits. These models retain all the genetic information in the data and take full advantage of spatial information in genetic variation data, which yields high detection power. However, the significant association signals identified by these high-power methods are not all true causal SNPs, because noise is easily mistaken for significant association signals, leading to false associations. In this paper, a sparse functional data association test (SFDAT) for gene region association analysis is developed, based on a functional linear regression model with local sparse estimation. The evaluation indicators CSR and DL are defined to evaluate, together with other indicators, the feasibility and performance of the proposed method. Simulation studies show that: (1) SFDAT performs well under both linkage equilibrium and linkage disequilibrium simulation; (2) SFDAT performs successfully for gene regions (including common variants, low-frequency variants, rare variants and mixed variants); (3) with power and type I error rates comparable to OLS and Smooth, SFDAT has a better ability to handle the zero regions. The Oryza sativa data set is analyzed by SFDAT. It is shown that SFDAT can better perform gene association analysis and eliminate false positives in gene localization. This study showed that SFDAT can lower the interference caused by noise while maintaining high power. SFDAT provides a new method for the association analysis between gene regions and phenotypic quantitative traits.
Affiliation(s)
- Jingyu Wang
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Fujie Zhou
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Cheng Li
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Ning Yin
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Huiming Liu
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Binxian Zhuang
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Qingyu Huang
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Yongxian Wen
  - College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou 350002, China
24
Li Q, Perera D, Cao C, He J, Bian J, Chen X, Azeem F, Howe A, Au B, Wu J, Yan J, Long Q. Interaction-integrated linear mixed model reveals 3D-genetic basis underlying Autism. Genomics 2023; 115:110575. [PMID: 36758877 DOI: 10.1016/j.ygeno.2023.110575] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/22/2022] [Revised: 01/16/2023] [Accepted: 02/03/2023] [Indexed: 02/10/2023]
Abstract
Genetic interactions play critical roles in genotype-phenotype associations. We developed a novel interaction-integrated linear mixed model (ILMM) that integrates a priori knowledge into linear mixed models. ILMM enables statistical integration of genetic interactions upfront and overcomes the problems of searching for combinations. To demonstrate its utility, with 3D genomic interactions (assessed by Hi-C experiments) as a priori, we applied ILMM to whole-genome sequencing data for Autism Spectrum Disorders (ASD) and brain transcriptome data, revealing the 3D-genetic basis of ASD and 3D-expression quantitative loci (3D-eQTLs) for brain tissues. Notably, we reported a potential mechanism involving distal regulation between FOXP2 and DNMT3A, conferring the risk of ASD.
Affiliation(s)
- Qing Li
  - Department of Biochemistry and Molecular Biology, University of Calgary, Alberta T2N 1N4, Canada
- Deshan Perera
  - Department of Biochemistry and Molecular Biology, University of Calgary, Alberta T2N 1N4, Canada
- Chen Cao
  - Department of Biochemistry and Molecular Biology, University of Calgary, Alberta T2N 1N4, Canada
- Jingni He
  - Department of Biochemistry and Molecular Biology, University of Calgary, Alberta T2N 1N4, Canada
- Jiayi Bian
  - Department of Mathematics and Statistics, University of Calgary, Alberta T2N 1N4, Canada
- Xingyu Chen
  - Department of Biochemistry and Molecular Biology, University of Calgary, Alberta T2N 1N4, Canada
- Feeha Azeem
  - Department of Biochemistry and Molecular Biology, University of Calgary, Alberta T2N 1N4, Canada
- Aaron Howe
  - Heritage Youth Researcher Summer Program, University of Calgary, Alberta T2N 1N4, Canada
- Billie Au
  - Department of Medical Genetics, University of Calgary, Alberta T2N 1N4, Canada
  - Alberta Children's Hospital Research Institute, University of Calgary, Alberta T2N 1N4, Canada
- Jingjing Wu
  - Department of Mathematics and Statistics, University of Calgary, Alberta T2N 1N4, Canada
- Jun Yan
  - Department of Physiology and Pharmacology, University of Calgary, Alberta T2N 1N4, Canada
  - Hotchkiss Brain Institute, University of Calgary, Alberta T2N 1N4, Canada
- Quan Long
  - Department of Biochemistry and Molecular Biology, University of Calgary, Alberta T2N 1N4, Canada
  - Department of Medical Genetics, University of Calgary, Alberta T2N 1N4, Canada
  - Department of Mathematics and Statistics, University of Calgary, Alberta T2N 1N4, Canada
  - Alberta Children's Hospital Research Institute, University of Calgary, Alberta T2N 1N4, Canada
  - Hotchkiss Brain Institute, University of Calgary, Alberta T2N 1N4, Canada
25
Zhao Y, Chang C, Zhang J, Zhang Z. Genetic underpinnings of brain structural connectome for young adults. J Am Stat Assoc 2023; 118:1473-1487. [PMID: 37982009 PMCID: PMC10655950 DOI: 10.1080/01621459.2022.2156349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2021] [Accepted: 11/29/2022] [Indexed: 12/13/2022]
Abstract
With distinct advantages in power over behavioral phenotypes, brain imaging traits have become emerging endophenotypes to dissect molecular contributions to behaviors and neuropsychiatric illnesses. Among different imaging features, brain structural connectivity (i.e., structural connectome) which summarizes the anatomical connections between different brain regions is one of the most cutting edge while under-investigated traits; and the genetic influence on the structural connectome variation remains highly elusive. Relying on a landmark imaging genetics study for young adults, we develop a biologically plausible brain network response shrinkage model to comprehensively characterize the relationship between high dimensional genetic variants and the structural connectome phenotype. Under a unified Bayesian framework, we accommodate the topology of brain network and biological architecture within the genome; and eventually establish a mechanistic mapping between genetic biomarkers and the associated brain sub-network units. An efficient expectation-maximization algorithm is developed to estimate the model and ensure computing feasibility. In the application to the Human Connectome Project Young Adult (HCP-YA) data, we establish the genetic underpinnings which are highly interpretable under functional annotation and brain tissue eQTL analysis, for the brain white matter tracts connecting the hippocampus and two cerebral hemispheres. We also show the superiority of our method in extensive simulations.
Affiliation(s)
- Yize Zhao
  - Department of Biostatistics, Yale University
- Changgee Chang
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania
- Jingwen Zhang
  - Department of Biostatistics, Boston University, Boston, MA
- Zhengwu Zhang
  - Department of Statistics and Operations Research, University of North Carolina at Chapel Hill
26
Zhang J, Liang X, Gonzales S, Liu J, Gao XR, Wang X. A gene based combination test using GWAS summary data. BMC Bioinformatics 2023; 24:2. [PMID: 36597047 PMCID: PMC9811798 DOI: 10.1186/s12859-022-05114-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2022] [Accepted: 12/13/2022] [Indexed: 01/05/2023] Open
Abstract
BACKGROUND Gene-based association tests provide a useful alternative and complement to the usual single marker association tests, especially in genome-wide association studies (GWAS). The way variants in a gene are weighted plays an important role in boosting the power of a gene-based association test. Appropriate weights can boost statistical power, especially when detecting genetic variants with weak effects on a trait. One major limitation of existing gene-based association tests lies in using weights that are predetermined biologically or empirically. This limitation often attenuates the power of a test. On the other hand, effect sizes or directions of causal genetic variants in real data are usually unknown, driving a need for a flexible yet robust methodology for gene-based association tests. Furthermore, access to individual-level data is often limited, while thousands of GWAS summary datasets are publicly and freely available. RESULTS To resolve these limitations, we propose a combination test named OWC, which is based on summary statistics from GWAS data. Several traditional methods including the burden test, weighted sum of squared score test [SSU], weighted sum statistic [WSS], SNP-set Kernel Association Test [SKAT], and the score test are special cases of OWC. To evaluate the performance of OWC, we perform extensive simulation studies. Results of simulation studies demonstrate that OWC outperforms several existing popular methods. We further show that OWC outperforms comparison methods in real-world data analyses using schizophrenia GWAS summary data and fasting glucose GWAS meta-analysis data. The proposed method is implemented in an R package available at https://github.com/Xuexia-Wang/OWC-R-package. CONCLUSIONS We propose a novel gene-based association test that incorporates four different weighting schemes (two constant weights and two weights proportional to normal statistic Z) and includes several popular methods as its special cases. Results of the simulation studies and real data analyses illustrate that the proposed test, OWC, outperforms comparable methods in most scenarios. These results demonstrate that OWC is a useful tool that adapts to the underlying biological model for a disease by appropriately weighting genetic variants and combining well-known gene-based tests.
Affiliation(s)
- Jianjun Zhang
  - Department of Mathematics, University of North Texas, 225 Avenue E, Denton, TX 76201 USA
- Xiaoyu Liang
  - Department of Epidemiology and Biostatistics, Michigan State University, 909 Wilson Rd Room B601, East Lansing, MI 48824 USA
- Samantha Gonzales
  - Department of Mathematics, University of North Texas, 225 Avenue E, Denton, TX 76201 USA
- Jianguo Liu
  - Department of Mathematics, University of North Texas, 225 Avenue E, Denton, TX 76201 USA
- Xiaoyi Raymond Gao
  - Department of Ophthalmology and Visual Science, Department of Biomedical Informatics, Division of Human Genetics, Ohio State University, 915 Olentangy River Road, Columbus, OH 43212 USA
- Xuexia Wang
  - Department of Biostatistics, Robert Stempel College of Public Health and Social Work, Florida International University, 11200 SW 8th street, Miami, FL 33174 USA
27
Kim Y, Chi YY, Shen J, Zou F. Robust genetic model-based SNP-set association test using CauchyGM. Bioinformatics 2023; 39:6831090. [PMID: 36383169 DOI: 10.1093/bioinformatics/btac728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 10/26/2022] [Accepted: 11/15/2022] [Indexed: 11/17/2022]
Abstract
MOTIVATION Association testing on genome-wide association studies (GWAS) data is commonly performed under a single (mostly additive) genetic model framework. However, the underlying true genetic mechanisms are often unknown in practice for most complex traits. When the employed inheritance model deviates from the underlying model, statistical power may be reduced. To overcome this challenge, an integrative association test that directly infers the underlying genetic model from GWAS data has previously been proposed for single-SNP analysis. RESULTS In this article, we propose a Cauchy combination Genetic Model-based association test (CauchyGM) under a generalized linear model framework for SNP-set level analysis. CauchyGM does not require prior knowledge of the underlying inheritance pattern of each SNP. It performs a score test that first estimates an individual P-value for each SNP in an SNP-set with both minor allele frequency (MAF) > 1% and three genotypes, and further aggregates the remaining SNPs using SKAT. CauchyGM then combines the correlated P-values across multiple SNPs and different genetic models within the set using the Cauchy Combination Test. To further accommodate both sparse and dense signal patterns, we also propose an omnibus association test (CauchyGM-O) by combining CauchyGM with SKAT and the burden test. Our extensive simulations show that both CauchyGM and CauchyGM-O maintain the type I error well at the genome-wide significance level and provide substantial power improvement compared to existing methods. We apply our methods to pharmacogenomic GWAS data from a large cardiovascular randomized clinical trial. Both CauchyGM and CauchyGM-O identify several novel genome-wide significant genes. AVAILABILITY AND IMPLEMENTATION The R package CauchyGM is publicly available on GitHub: https://github.com/ykim03517/CauchyGM. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
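The Cauchy Combination Test named in the abstract above can be illustrated with a short sketch. It shows only the generic combination step (a weighted sum of tangent-transformed P-values, converted back to a P-value through the standard Cauchy distribution), not the CauchyGM package itself; the function name, the equal default weights, and the example P-values are assumptions made for illustration.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine P-values with the Cauchy combination statistic.

    T = sum_i w_i * tan((0.5 - p_i) * pi), with weights summing to 1;
    the combined P-value is the upper tail of a standard Cauchy at T.
    Equal weights are an assumption of this sketch.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one P-value")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("P-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * n
    total = sum(weights)
    # Normalise so the weights sum to 1.
    weights = [w / total for w in weights]
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, p_values))
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(t) / math.pi


# Hypothetical P-values for one SNP under different genetic models.
print(cauchy_combination([0.01, 0.9]))
```

Because the tail of the Cauchy distribution is heavy, the combined value is driven mainly by the smallest P-values, which is why the approach tolerates correlation among the inputs.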
Affiliation(s)
- Yeonil Kim
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA
- Yueh-Yun Chi
  - Department of Pediatrics, Keck School of Medicine, University of Southern California, Los Angeles, CA 90089, USA
- Judong Shen
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA
- Fei Zou
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
28
Xiao H, Ma Y, Zhou Z, Li X, Ding K, Wu Y, Wu T, Chen D. Disease patterns of coronary heart disease and type 2 diabetes harbored distinct and shared genetic architecture. Cardiovasc Diabetol 2022; 21:276. [PMID: 36494812 PMCID: PMC9738029 DOI: 10.1186/s12933-022-01715-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/27/2022] [Accepted: 12/02/2022] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Coronary heart disease (CHD) and type 2 diabetes (T2D) are two complex diseases with complex interrelationships. However, the genetic architecture of the two diseases is often studied independently by the individual single-nucleotide polymorphism (SNP) approach. Here, we presented a genotypic-phenotypic framework for deciphering the genetic architecture underlying the disease patterns of CHD and T2D. METHOD A data-driven SNP-set approach was performed in a genome-wide association study consisting of subpopulations with different disease patterns of CHD and T2D (comorbidity, CHD without T2D, T2D without CHD and all none). We applied nonsmooth nonnegative matrix factorization (nsNMF) clustering to generate SNP sets interacting the information of SNP and subject. Relationships between SNP sets and phenotype sets harboring different disease patterns were then assessed, and we further co-clustered the SNP sets into a genetic network to topologically elucidate the genetic architecture composed of SNP sets. RESULTS We identified 23 non-identical SNP sets with significant association with CHD or T2D (SNP-set based association test, P < 3.70 × [Formula: see text]). Among them, disease patterns involving CHD and T2D were related to distinct SNP sets (Hypergeometric test, P < 2.17 × [Formula: see text]). Accordingly, numerous genes (e.g., KLKs, GRM8, SHANK2) and pathways (e.g., fatty acid metabolism) were diversely implicated in different subtypes and related pathophysiological processes. Finally, we showed that the genetic architecture for disease patterns of CHD and T2D was composed of disjoint genetic networks (heterogeneity), with common genes contributing to it (pleiotropy). CONCLUSION The SNP-set approach deciphered the complexity of both genotype and phenotype as well as their complex relationships. 
Different disease patterns of CHD and T2D share distinct genetic architectures, for which lipid metabolism related to fibrosis may be an atherogenic pathway that is specifically activated by diabetes. Our findings provide new insights for exploring new biological pathways.
Affiliation(s)
- Han Xiao
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, 100191 China
- Yujia Ma
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, 100191 China
- Zechen Zhou
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, 100191 China
- Xiaoyi Li
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, 100191 China
- Kexin Ding
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, 100191 China
- Yiqun Wu
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, 100191 China
- Tao Wu
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, 100191 China
- Dafang Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, 100191 China
29
Wendel B, Heidenreich M, Budde M, Heilbronner M, Oraki Kohshour M, Papiol S, Falkai P, Schulze TG, Heilbronner U, Bickeböller H. Kalpra: A kernel approach for longitudinal pathway regression analysis integrating network information with an application to the longitudinal PsyCourse Study. Front Genet 2022; 13:1015885. [PMID: 36561312 PMCID: PMC9767414 DOI: 10.3389/fgene.2022.1015885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Accepted: 11/24/2022] [Indexed: 12/12/2022] Open
Abstract
A popular approach to reduce the high dimensionality resulting from genome-wide association studies is to analyze a whole pathway in a single test for association with a phenotype. Kernel machine regression (KMR) is a highly flexible pathway analysis approach. Initially, KMR was developed to analyze a simple phenotype with just one measurement per individual. Recently, however, the investigation into the influence of genomic factors in the development of disease-related phenotypes across time (trajectories) has gained in importance. Thus, novel statistical approaches for KMR analyzing longitudinal data, i.e. several measurements at specific time points per individual are required. For longitudinal pathway analysis, we extend KMR to long-KMR using the estimation equivalence of KMR and linear mixed models. We include additional random effects to correct for the dependence structure. Moreover, within long-KMR we created a topology-based pathway analysis by combining this approach with a kernel including network information of the pathway. Most importantly, long-KMR not only allows for the investigation of the main genetic effect adjusting for time dependencies within an individual, but it also allows to test for the association of the pathway with the longitudinal course of the phenotype in the form of testing the genetic time-interaction effect. The approach is implemented as an R package, kalpra. Our simulation study demonstrates that the power of long-KMR exceeded that of another KMR method previously developed to analyze longitudinal data, while maintaining (slightly conservatively) the type I error. The network kernel improved the performance of long-KMR compared to the linear kernel. Considering different pathway densities, the power of the network kernel decreased with increasing pathway density. We applied long-KMR to cognitive data on executive function (Trail Making Test, part B) from the PsyCourse Study and 17 candidate pathways selected from Reactome. 
We identified seven nominally significant pathways.
Collapse
Affiliation(s)
- Bernadette Wendel
- Department of Genetic Epidemiology, University Medical Center Göttingen, Georg-August-University Göttingen, Göttingen, Germany,*Correspondence: Bernadette Wendel,
| | - Markus Heidenreich
- Department of Genetic Epidemiology, University Medical Center Göttingen, Georg-August-University Göttingen, Göttingen, Germany
| | - Monika Budde
- Institute of Psychiatric Phenomics and Genomics (IPPG), University Hospital, LMU Munich, Munich, Germany
| | - Maria Heilbronner
- Institute of Psychiatric Phenomics and Genomics (IPPG), University Hospital, LMU Munich, Munich, Germany
| | - Mojtaba Oraki Kohshour
- Institute of Psychiatric Phenomics and Genomics (IPPG), University Hospital, LMU Munich, Munich, Germany
| | - Sergi Papiol
- Institute of Psychiatric Phenomics and Genomics (IPPG), University Hospital, LMU Munich, Munich, Germany; Department of Psychiatry and Psychotherapy, University Hospital, LMU Munich, Munich, Germany
| | - Peter Falkai
- Department of Psychiatry and Psychotherapy, University Hospital, LMU Munich, Munich, Germany
| | - Thomas G. Schulze
- Institute of Psychiatric Phenomics and Genomics (IPPG), University Hospital, LMU Munich, Munich, Germany; Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY, United States; Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, United States
| | - Urs Heilbronner
- Institute of Psychiatric Phenomics and Genomics (IPPG), University Hospital, LMU Munich, Munich, Germany
| | - Heike Bickeböller
- Department of Genetic Epidemiology, University Medical Center Göttingen, Georg-August-University Göttingen, Göttingen, Germany
|
30
|
Gao XR, Chiariglione M, Arch AJ. Whole-exome sequencing study identifies rare variants and genes associated with intraocular pressure and glaucoma. Nat Commun 2022; 13:7376. [PMID: 36450729 PMCID: PMC9712679 DOI: 10.1038/s41467-022-35188-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/02/2022] [Accepted: 11/22/2022] [Indexed: 12/02/2022] Open
Abstract
Elevated intraocular pressure (IOP) is a major risk factor for glaucoma, the leading cause of irreversible blindness worldwide. IOP is also the only modifiable risk factor for glaucoma. Previous genome-wide association studies have established the contribution of common genetic variants to IOP. The role of rare variants for IOP was unknown. Using whole exome sequencing data from 110,260 participants in the UK Biobank (UKB), we conducted the largest exome-wide association study of IOP to date. In addition to confirming known IOP genes, we identified 40 novel rare-variant genes for IOP, such as BOD1L1, ACAD10 and HLA-B, demonstrating the power of including and aggregating rare variants in gene discovery. About half of these IOP genes are also associated with glaucoma phenotypes in UKB and the FinnGen cohort. Six of these genes, i.e. ADRB1, PTPRB, RPL26, RPL10A, EGLN2, and MTOR, are drug targets that are either established for clinical treatment or in clinical trials. Furthermore, we constructed a rare-variant polygenic risk score and showed its significant association with glaucoma in independent participants (n = 312,825). We demonstrated the value of rare variants to enhance our understanding of the biological mechanisms regulating IOP and uncovered potential therapeutic targets for glaucoma.
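A polygenic risk score of the kind mentioned above is, at its simplest, a weighted sum of an individual's allele counts. The sketch below shows that arithmetic only; the variant identifiers, weights, and function name are hypothetical placeholders and are not taken from the study, which does not describe its scoring procedure in this abstract.

```python
def polygenic_score(allele_counts, weights):
    """Weighted sum of allele counts over the variants that have a weight.

    allele_counts: dict mapping variant id -> count of the effect allele (0/1/2)
    weights:       dict mapping variant id -> per-allele effect size
    Variants without a weight are ignored.
    """
    return sum(
        weights[variant] * count
        for variant, count in allele_counts.items()
        if variant in weights
    )


# Hypothetical effect sizes and one individual's genotypes (illustrative only).
weights = {"var1": 0.5, "var2": -0.2}
individual = {"var1": 1, "var2": 2, "var3": 1}
score = polygenic_score(individual, weights)
```

In practice the weights would come from an association analysis in a discovery sample, and the score would be evaluated in independent participants, as the abstract describes.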
Affiliation(s)
- Xiaoyi Raymond Gao
- Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, 43210, USA; Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA; Division of Human Genetics, The Ohio State University, Columbus, OH, 43210, USA; Ohio State University Physicians Inc., Columbus, OH, USA
| | - Marion Chiariglione
- Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, 43210, USA
| | - Alexander J Arch
- Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, 43210, USA
|
31
|
Alamin M, Sultana MH, Lou X, Jin W, Xu H. Dissecting Complex Traits Using Omics Data: A Review on the Linear Mixed Models and Their Application in GWAS. Plants (Basel, Switzerland) 2022; 11:3277. [PMID: 36501317 PMCID: PMC9739826 DOI: 10.3390/plants11233277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Revised: 11/23/2022] [Accepted: 11/25/2022] [Indexed: 06/17/2023]
Abstract
The genome-wide association study (GWAS) is the most popular approach to dissecting complex traits in plants, humans, and animals. Numerous methods and tools have been proposed to discover causal variants in GWAS data analysis. Among them, linear mixed models (LMMs) are widely used statistical methods for controlling confounding factors, including population structure, resulting in increased computational efficiency and statistical power in GWAS. Recently, more attention has been paid to pleiotropy, multi-trait, gene-gene interaction, gene-environment interaction, and multi-locus methods with the growing availability of large-scale GWAS data and relevant phenotype samples. In this review, we survey the LMM-based methods available in the literature for GWAS. We briefly discuss the different LMM methods, software packages, and available open-source applications in GWAS. Then, we cover the advantages and weaknesses of LMMs in GWAS. Finally, we discuss future perspectives and conclude. This review should help researchers quickly select appropriate LMM models and methods for GWAS data analysis.
Affiliation(s)
- Md. Alamin
- Institute of Bioinformatics, Zhejiang University, Hangzhou 310058, China
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China
| | | | - Xiangyang Lou
- Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL 35294, USA
| | - Wenfei Jin
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China
| | - Haiming Xu
- Institute of Bioinformatics, Zhejiang University, Hangzhou 310058, China
|
32
|
Chen X, Zhang H, Liu M, Deng HW, Wu Z. Simultaneous detection of novel genes and SNPs by adaptive p-value combination. Front Genet 2022; 13:1009428. [DOI: 10.3389/fgene.2022.1009428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Accepted: 11/03/2022] [Indexed: 11/18/2022] Open
Abstract
Combining SNP p-values from GWAS summary data is a promising strategy for detecting novel genetic factors. Existing statistical methods for the p-value-based SNP-set testing confront two challenges. First, the statistical power of different methods depends on unknown patterns of genetic effects that could drastically vary over different SNP sets. Second, they do not identify which SNPs primarily contribute to the global association of the whole set. We propose a new signal-adaptive analysis pipeline to address these challenges using the omnibus thresholding Fisher’s method (oTFisher). The oTFisher remains robustly powerful over various patterns of genetic effects. Its adaptive thresholding can be applied to estimate important SNPs contributing to the overall significance of the given SNP set. We develop efficient calculation algorithms to control the type I error rate, which accounts for the linkage disequilibrium among SNPs. Extensive simulations show that the oTFisher has robustly high power and provides a higher balanced accuracy in screening SNPs than the traditional Bonferroni and FDR procedures. We applied the oTFisher to study the genetic association of genes and haplotype blocks of the bone density-related traits using the summary data of the Genetic Factors for Osteoporosis Consortium. The oTFisher identified more novel and literature-reported genetic factors than existing p-value combination methods. Relevant computation has been implemented into the R package TFisher to support similar data analysis.
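To make the thresholding idea concrete, the sketch below computes a thresholded Fisher-type statistic, assuming the form W = Σ −2·log(p_i/τ2) over the p-values with p_i ≤ τ1 (τ1 = τ2 = 1 gives the classical Fisher statistic). The Monte Carlo p-value assumes independent p-values, so unlike the method in the abstract it does not account for linkage disequilibrium; it also does not search over thresholds as the omnibus test does. Function names are hypothetical and are not the API of the R package TFisher.

```python
import math
import random


def tfisher_stat(pvals, tau1=0.05, tau2=0.05):
    """Thresholded Fisher statistic: sum of -2*log(p/tau2) over p <= tau1."""
    return sum(-2.0 * math.log(p / tau2) for p in pvals if p <= tau1)


def mc_pvalue(pvals, tau1=0.05, tau2=0.05, n_sim=2000, seed=0):
    """Monte Carlo p-value under the (strong) assumption of independent SNPs."""
    rng = random.Random(seed)
    observed = tfisher_stat(pvals, tau1, tau2)
    n = len(pvals)
    hits = 0
    for _ in range(n_sim):
        # 1 - U lies in (0, 1], which avoids log(0).
        null = [1.0 - rng.random() for _ in range(n)]
        if tfisher_stat(null, tau1, tau2) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)


# Illustrative SNP-set: one strong signal among otherwise null p-values.
example_p = mc_pvalue([1e-8, 0.3, 0.7])
```

The p-values that pass the threshold (here only the first) are the ones contributing to the statistic, which is the sense in which thresholding points to the SNPs driving a set-level signal.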
|
33
|
Aborageh M, Krawitz P, Fröhlich H. Genetics in Parkinson's disease: From better disease understanding to machine learning based precision medicine. Frontiers in Molecular Medicine 2022; 2:933383. [PMID: 39086979 PMCID: PMC11285583 DOI: 10.3389/fmmed.2022.933383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Accepted: 08/30/2022] [Indexed: 08/02/2024]
Abstract
Parkinson's Disease (PD) is a neurodegenerative disorder with highly heterogeneous phenotypes. Accordingly, it has been challenging to robustly identify genetic factors associated with disease risk, prognosis and therapy response via genome-wide association studies (GWAS). In this review we first provide an overview of existing statistical methods to detect associations between genetic variants and the disease phenotypes in existing PD GWAS. Secondly, we discuss the potential of machine learning approaches to better quantify disease phenotypes and to move beyond disease understanding towards a better-personalized treatment of the disease.
Affiliation(s)
- Mohamed Aborageh
- Bonn-Aachen International Center for Information Technology (B-IT), Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
| | - Peter Krawitz
- Institute for Genomic Statistics and Bioinformatics, University Hospital Bonn, Bonn, Germany
| | - Holger Fröhlich
- Bonn-Aachen International Center for Information Technology (B-IT), Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
|
34
|
A machine learning-based SNP-set analysis approach for identifying disease-associated susceptibility loci. Sci Rep 2022; 12:15817. [PMID: 36138111 PMCID: PMC9499949 DOI: 10.1038/s41598-022-19708-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/23/2022] [Accepted: 09/02/2022] [Indexed: 11/17/2022] Open
Abstract
Identifying disease-associated susceptibility loci is one of the most pressing and crucial challenges in modeling complex diseases. Existing approaches to biomarker discovery are subject to several limitations including underpowered detection, neglect of variant interactions, and restrictive dependence on prior biological knowledge. Addressing these challenges necessitates more ingenious ways of approaching the “missing heritability” problem. This study aims to discover disease-associated susceptibility loci by augmenting a previous genome-wide association study (GWAS) through the integration of random forest and cluster analysis. The proposed integrated framework is applied to hepatitis B virus surface antigen (HBsAg) seroclearance GWAS data. Multiple cluster analyses were performed on (1) single nucleotide polymorphisms (SNPs) considered significant by GWAS and (2) SNPs with the highest feature importance scores obtained using random forest. The resulting SNP-sets from the cluster analyses were subsequently tested for trait association. Three susceptibility loci possibly associated with HBsAg seroclearance were identified: (1) SNP rs2399971, (2) gene LINC00578, and (3) locus 11p15. SNP rs2399971 is a biomarker reported in the literature to be significantly associated with HBsAg seroclearance in patients who had received antiviral treatment. The latter two loci are linked with diseases influenced by the presence of hepatitis B virus infection. These findings demonstrate the potential of the proposed integrated framework in identifying disease-associated susceptibility loci. With further validation, the results herein could aid in better understanding complex disease etiologies and provide inputs for a more advanced disease risk assessment for patients.
|
35
|
Irigoien I, Cormand B, Soler-Artigas M, Sanchez-Mora C, Ramos-Quiroga JA, Arenas C. New Distance-Based approach for Genome-Wide Association Studies. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2938-2949. [PMID: 34181548 DOI: 10.1109/tcbb.2021.3092812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
With the rise of genome-wide association studies (GWAS), the analysis of typical GWAS data sets with thousands of single-nucleotide polymorphisms (SNPs) has become crucial in biomedical research. Here, we propose a new method to identify SNPs related to disease in case-control studies. The method, based on genetic distances between individuals, takes into account possible population substructure and avoids the issues of multiple testing. The method provides two ordered lists of SNPs: one with SNPs whose minor alleles can be considered risk alleles for the disease, and another with SNPs whose minor alleles can be considered protective. These two lists provide a useful tool to help the researcher decide where to focus attention in a first stage.
|
36
|
Shao Z, Wang T, Qiao J, Zhang Y, Huang S, Zeng P. A comprehensive comparison of multilocus association methods with summary statistics in genome-wide association studies. BMC Bioinformatics 2022; 23:359. [PMID: 36042399 PMCID: PMC9429742 DOI: 10.1186/s12859-022-04897-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/27/2022] [Accepted: 08/22/2022] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Multilocus analysis on a set of single nucleotide polymorphisms (SNPs) pre-assigned within a gene constitutes a valuable complement to single-marker analysis by aggregating data on complex traits in a biologically meaningful way. However, despite the existence of a wide variety of SNP-set methods, few comprehensive comparison studies have been previously performed to evaluate the effectiveness of these methods. RESULTS We herein sought to fill this knowledge gap by conducting a comprehensive empirical comparison of 22 commonly used summary-statistics-based SNP-set methods. We showed that only seven methods could effectively control the type I error, and that these well-calibrated approaches had varying power performance under the simulation scenarios. Overall, we confirmed that the burden test was generally underpowered and that score-based variance component tests (e.g., sequence kernel association test) were much more powerful under the polygenic genetic architecture in both common and rare variant association analyses. We further revealed that two linkage-disequilibrium-free P value combination methods (the harmonic mean P value method and the aggregated Cauchy association test) behaved very well under the sparse genetic architecture in simulations and real-data applications to common and rare variant association analyses as well as in expression quantitative trait loci weighted integrative analysis. We also assessed the scalability of these approaches by recording computational time and found that all these methods can scale to biobank-scale data, although some might be relatively slow. CONCLUSION In conclusion, we hope that our findings can offer important guidance on how to choose appropriate multilocus association analysis methods in the post-GWAS era. All the SNP-set methods are implemented in the R package called MCA, which is freely available at https://github.com/biostatpzeng/.
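The two linkage-disequilibrium-free combination rules named above can be written in a few lines. The sketch below is a minimal version under stated assumptions: the Cauchy combination uses the standard form T = Σ w_i·tan((0.5 − p_i)·π) with p = 0.5 − arctan(T)/π, and the harmonic mean is returned uncalibrated (the raw value is only approximately a p-value and is usually adjusted further). Function names are hypothetical and this is not the API of the MCA package.

```python
import math


def _normalise(weights, n):
    """Return weights that sum to 1 (equal weights if none are given)."""
    if weights is None:
        return [1.0 / n] * n
    total = sum(weights)
    return [w / total for w in weights]


def acat(pvals, weights=None):
    """Aggregated Cauchy association test p-value; assumes 0 < p < 1."""
    w = _normalise(weights, len(pvals))
    t = sum(wi * math.tan((0.5 - p) * math.pi) for wi, p in zip(w, pvals))
    return 0.5 - math.atan(t) / math.pi


def harmonic_mean_p(pvals, weights=None):
    """Weighted harmonic mean of p-values (raw, without further calibration)."""
    w = _normalise(weights, len(pvals))
    return 1.0 / sum(wi / p for wi, p in zip(w, pvals))


# Illustrative SNP-level p-values for one gene.
gene_pvals = [0.01, 0.4, 0.8]
combined_acat = acat(gene_pvals)
combined_hmp = harmonic_mean_p(gene_pvals)
```

Both rules are dominated by the smallest p-values in the set, which is consistent with the abstract's observation that they do well when only a few SNPs carry signal.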
Affiliation(s)
- Zhonghe Shao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Ting Wang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Jiahao Qiao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Yuchen Zhang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Shuiping Huang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Key Laboratory of Environment and Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Engineering Research Innovation Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Ping Zeng
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Key Laboratory of Environment and Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Engineering Research Innovation Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
|
37
|
Liu P, Fang M, Luo Y, Zheng F, Jin Y, Cheng F, Zhu H, Jin X. Rare Variants in Inborn Errors of Immunity Genes Associated With Covid-19 Severity. Front Cell Infect Microbiol 2022; 12:888582. [PMID: 35694544 PMCID: PMC9184678 DOI: 10.3389/fcimb.2022.888582] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/03/2022] [Accepted: 04/21/2022] [Indexed: 01/08/2023] Open
Abstract
Host genetic factors have been shown to play an important role in SARS-CoV-2 infection and the course of Covid-19 disease. The genetic contributions of common variants influencing Covid-19 susceptibility and severity have been extensively studied in diverse populations. However, the studies of rare genetic defects arising from inborn errors of immunity (IEI) are relatively few, especially in the Chinese population. To fill this gap, we used a deeply sequenced dataset of nearly 500 patients, all of Chinese descent, to investigate putative functional rare variants. Specifically, we annotated rare variants in our call set and selected likely deleterious missense (LDM) and high-confidence predicted loss-of-function (HC-pLoF) variants. Further, we analyzed LDM and HC-pLoF variants between non-severe and severe Covid-19 patients by (a) performing gene- and pathway-level association analyses, (b) testing the number of mutations in previously reported genes mapped from LDM and HC-pLoF variants, and (c) uncovering candidate genes via protein-protein interaction (PPI) network analysis of Covid-19-related genes and genes defined from LDM and HC-pLoF variants. From our analyses, we found that (a) pathways Tuberculosis (hsa:05152), Primary Immunodeficiency (hsa:05340), and Influenza A (hsa:05164) showed significant enrichment in severe patients compared to the non-severe ones, (b) HC-pLoF mutations were enriched in Covid-19-related genes in severe patients, and (c) several candidate genes, such as IL12RB1, TBK1, TLR3, and IFNGR2, are uncovered by PPI network analysis and worth further investigation. These regions generally play an essential role in regulating antiviral innate immunity responses to foreign pathogens and in responding to many inflammatory diseases. We believe that our identified candidate genes/pathways can be potentially used as Covid-19 diagnostic markers and help distinguish patients at higher risk.
Affiliation(s)
- Panhong Liu
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Beijing Genomics Institute at Shenzhen, BGI-Shenzhen, Shenzhen, China
| | - Mingyan Fang
- Beijing Genomics Institute at Shenzhen, BGI-Shenzhen, Shenzhen, China
- Beijing Genomics Institute in Singapore, BGI-Singapore, Singapore, Singapore
| | - Yuxue Luo
- Beijing Genomics Institute at Shenzhen, BGI-Shenzhen, Shenzhen, China
| | - Fang Zheng
- Department of Pediatrics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Yan Jin
- Department of Pediatrics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Fanjun Cheng
- Department of Hematology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Huanhuan Zhu
- Beijing Genomics Institute at Shenzhen, BGI-Shenzhen, Shenzhen, China
- *Correspondence: Xin Jin; Huanhuan Zhu
| | - Xin Jin
- Beijing Genomics Institute at Shenzhen, BGI-Shenzhen, Shenzhen, China
- Beijing Genomics Institute in Singapore, BGI-Singapore, Singapore, Singapore
- School of Medicine, South China University of Technology, Guangzhou, China
- *Correspondence: Xin Jin; Huanhuan Zhu
| |
|
38
|
Alzahrani YM, Alamoudi AA, Nahar NK, Albar RF. Early-Age Manifestation of Singleton Merten Syndrome With Systemic Lupus Erythematosus Features: A Case Report. Cureus 2022; 14:e25244. [PMID: 35755559 PMCID: PMC9217668 DOI: 10.7759/cureus.25244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/23/2022] [Indexed: 11/26/2022] Open
Abstract
Singleton Merten syndrome (SMS) is one of the rarest multisystem genetic disorders and has been recognized in only a few cases. Patients who have this syndrome often present with calcification of the aorta and heart valves, dental dysplasia, joint calcification, distinct facial features, and growth and developmental delay. Other physical findings usually associated with SMS may include glaucoma, skeletal abnormalities including tendon rupture, muscle weakness, and arthropathy. In individuals with SMS, autoimmune diseases like psoriasis and systemic lupus erythematosus (SLE) can occur. Here, we report a pre-term baby girl who developed congenital aortic calcification, renal hypertension, dental anomalies, multiple joint calcifications, atypical facial features, mild mental retardation, and developmental delay. At 17 years, the patient developed SLE based on positive antinuclear antibody (ANA) with clinical and immunological features like fever, malar rash, pericardial effusion, proteinuria, high ANA concentration, high anti-double-stranded DNA, low C4 complement, and presence of anti-Smith antibodies.
|
39
|
Li B, Wang F, Wang N, Hou K, Du J. Identification of Implications of Angiogenesis and m6A Modification on Immunosuppression and Therapeutic Sensitivity in Low-Grade Glioma by Network Computational Analysis of Subtypes and Signatures. Front Immunol 2022; 13:871564. [PMID: 35572524 PMCID: PMC9094412 DOI: 10.3389/fimmu.2022.871564] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/08/2022] [Accepted: 03/28/2022] [Indexed: 02/06/2023] Open
Abstract
Angiogenesis is a complex process in the immunosuppressed low-grade glioma (LGG) microenvironment and is regulated by multiple factors. N6-methyladenosine (m6A), modified by the m6A modification regulators (“writers”, “readers”, and “erasers”), can drive LGG formation. In the hypoxic environment of the intracranial tumor immune microenvironment (TIME), m6A modifications in glioma stem cells are predominantly distributed around neovascularization and synergize with complex perivascular pathological ecology to mediate the immunosuppressive phenotype of TIME. The exact mechanism of this phenomenon remains unknown. Herein, we elucidated the relevance of the angiogenesis-related genes (ARGs) and m6A regulators (MAGs) and their influencing mechanism from a macro perspective. Based on the expression pattern of MAGs, we divided patients with LGG into two robust categories via consensus clustering, and further annotated the malignancy-related mechanisms and corresponding targeted agents. The two subgroups (CL1, CL2) demonstrated a significant correlation with prognosis and clinicopathological features. Moreover, WGCNA also uncovered the hub genes and related mechanisms of MAGs affecting clinical characteristics. Clustering analysis revealed a synergistic promoting effect of m6A and angiogenesis on immunosuppression. Based on the expression patterns of MAGs, we established a high-performance gene signature (MASig). MASig revealed somatic mutational mechanisms by which MAGs affect the sensitivity to treatment in LGG patients. In conclusion, the MAGs were critical participants in the malignant process of LGG, with vital potential in the prognosis stratification, prediction of outcome, and therapeutic sensitivity of LGG. Findings based on these strategies may facilitate the development of objective diagnosis and treatment systems to quantify patient survival and other outcomes, and in some cases, to identify potential unexplored targeted therapies.
Affiliation(s)
- Bo Li
- Department of Neurosurgery, Huangyan Hospital, Wenzhou Medical University, Taizhou, China; Department of Neurosurgery, Taizhou First People's Hospital, Taizhou, China
| | - Fang Wang
- Department of Neurosurgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Nan Wang
- Department of Neurosurgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Kuiyuan Hou
- Department of Neurosurgery, The First Hospital of Qiqihar City, Qiqihar, China
| | - Jianyang Du
- Department of Neurosurgery, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
|
40
|
Smith SP, Shahamatdar S, Cheng W, Zhang S, Paik J, Graff M, Haiman C, Matise TC, North KE, Peters U, Kenny E, Gignoux C, Wojcik G, Crawford L, Ramachandran S. Enrichment analyses identify shared associations for 25 quantitative traits in over 600,000 individuals from seven diverse ancestries. Am J Hum Genet 2022; 109:871-884. [PMID: 35349783 PMCID: PMC9118115 DOI: 10.1016/j.ajhg.2022.03.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/01/2021] [Accepted: 03/02/2022] [Indexed: 12/12/2022] Open
Abstract
Since 2005, genome-wide association (GWA) datasets have been largely biased toward sampling European ancestry individuals, and recent studies have shown that GWA results estimated from self-identified European individuals are not transferable to non-European individuals because of various confounding challenges. Here, we demonstrate that enrichment analyses that aggregate SNP-level association statistics at multiple genomic scales (from genes to genomic regions and pathways) have been underutilized in the GWA era and can generate biologically interpretable hypotheses regarding the genetic basis of complex trait architecture. We illustrate examples of the robust associations generated by enrichment analyses while studying 25 continuous traits assayed in 566,786 individuals from seven diverse self-identified human ancestries in the UK Biobank and the Biobank Japan as well as 44,348 admixed individuals from the PAGE consortium including cohorts of African American, Hispanic and Latin American, Native Hawaiian, and American Indian/Alaska Native individuals. We identify 1,000 gene-level associations that are genome-wide significant in at least two ancestry cohorts across these 25 traits as well as highly conserved pathway associations with triglyceride levels in European, East Asian, and Native Hawaiian cohorts.
Affiliation(s)
- Samuel Pattillo Smith
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA; Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, RI 02912, USA
| | - Sahar Shahamatdar
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA; Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, RI 02912, USA
| | - Wei Cheng
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA; Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, RI 02912, USA
| | - Selena Zhang
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
| | - Joseph Paik
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
| | - Misa Graff
- Department of Epidemiology, University of North Carolina, Chapel Hill, Chapel Hill, NC 27599, USA
| | - Christopher Haiman
- Department of Preventative Medicine, University of Southern California, Los Angeles, CA 90089, USA
| | - T C Matise
- Department of Genetics, Rutgers University, Piscataway, NJ 08854, USA
| | - Kari E North
- Department of Epidemiology, University of North Carolina, Chapel Hill, Chapel Hill, NC 27599, USA
| | - Ulrike Peters
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Eimear Kenny
- The Center for Genomic Health, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA; The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA; Department of Medicine, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
| | - Chris Gignoux
- Division of Biomedical Informatics and Personalized Medicine, University of Colorado, Denver, CO 80204, USA
| | - Genevieve Wojcik
- Department of Epidemiology, Johns Hopkins University, Baltimore, MD 21287, USA
| | - Lorin Crawford
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA; Department of Biostatistics, Brown University, Providence, RI 02906, USA; Microsoft Research New England, Cambridge, MA 02142, USA
| | - Sohini Ramachandran
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA; Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, RI 02912, USA; Data Science Initiative, Brown University, Providence, RI 02912, USA.
|
41
|
Hébert F, Causeur D, Emily M. Omnibus testing approach for gene-based gene-gene interaction. Stat Med 2022; 41:2854-2878. [PMID: 35338506 DOI: 10.1002/sim.9389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2020] [Revised: 03/03/2022] [Accepted: 03/04/2022] [Indexed: 11/07/2022]
Abstract
Genetic interaction is considered one of the main heritable components of complex traits. With the emergence of genome-wide association studies (GWAS), a collection of statistical methods dedicated to the identification of interaction at the SNP level has been proposed. More recently, gene-based gene-gene interaction testing has emerged as an attractive alternative, as it confers advantages in both statistical power and biological interpretation. Most of the gene-based interaction methods rely on a multidimensional modeling of the interaction, and thus lack robustness against the huge space of interaction patterns. In this paper, we study global testing approaches to address the issue of gene-based gene-gene interaction. Based on a logistic regression modeling framework, all SNP-SNP interaction tests are combined to produce a gene-level test for interaction. We propose an omnibus test that takes advantage of (1) the heterogeneity between existing global tests and (2) the complementarity between allele-based and genotype-based coding of SNPs. Through an extensive simulation study, it is demonstrated that the proposed omnibus test can detect with high power the most common interaction genetic models with one causal pair as well as more complex genetic models where more than one causal pair is involved. In addition, the proposed approach is shown to be robust and to improve power compared to single global tests in replication studies. Furthermore, the application of our procedure to real datasets confirms the adaptability of our approach to replicate various gene-gene interactions.
Affiliation(s)
- Florian Hébert
- Department of Statistics and Computer Science, Institut Agro, CNRS, IRMAR, Univ Rennes, F-35000, Rennes, France
- David Causeur
- Department of Statistics and Computer Science, Institut Agro, CNRS, IRMAR, Univ Rennes, F-35000, Rennes, France
- Mathieu Emily
- Department of Statistics and Computer Science, Institut Agro, CNRS, IRMAR, Univ Rennes, F-35000, Rennes, France
|
42
|
Li S, Li S, Su S, Zhang H, Shen J, Wen Y. Gene Region Association Analysis of Longitudinal Quantitative Traits Based on a Function-On-Function Regression Model. Front Genet 2022; 13:781740. [PMID: 35265102 PMCID: PMC8899465 DOI: 10.3389/fgene.2022.781740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2021] [Accepted: 01/04/2022] [Indexed: 11/13/2022] Open
Abstract
During growth and development, the gene expressions that control quantitative traits turn on or off over time. Studies of longitudinal traits are of great significance in revealing the genetic mechanism of biological development. With the development of ultra-high-density sequencing technology, association analysis poses tremendous challenges for statistical methods. In this paper, a longitudinal functional data association test (LFDAT) method is proposed based on the function-on-function regression model. LFDAT can simultaneously treat phenotypic traits and marker information as continuum variables and analyze the association of longitudinal quantitative traits and gene regions. Simulation studies showed that: 1) LFDAT performs well for both linkage equilibrium simulation and linkage disequilibrium simulation, 2) LFDAT has better performance for gene regions (including common variants, low-frequency variants, rare variants and mixtures), and 3) LFDAT can accurately identify gene switching in the growth and development stage. Longitudinal data on the projected shoot area of Oryza sativa were analyzed with LFDAT, which showed the advantage of fast computation. Further, an association analysis was conducted between longitudinal traits and gene regions by integrating the micro effects of multiple related variants and using the information of the entire gene region. LFDAT provides a feasible method for studying the formation and expression of longitudinal traits.
Affiliation(s)
- Shijing Li
- College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, China; Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou, China
- Shiqin Li
- School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Shaoqiang Su
- College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, China
- Hui Zhang
- College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, China; Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou, China
- Jiayu Shen
- College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, China; Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou, China
- Yongxian Wen
- College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, China; Institute of Statistics and Application, Fujian Agriculture and Forestry University, Fuzhou, China
|
43
|
Simulation Research on the Methods of Multi-Gene Region Association Analysis Based on a Functional Linear Model. Genes (Basel) 2022; 13:genes13030455. [PMID: 35328009 PMCID: PMC8954869 DOI: 10.3390/genes13030455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2022] [Revised: 02/26/2022] [Accepted: 02/27/2022] [Indexed: 11/16/2022] Open
Abstract
Genome-wide association analysis is an important approach to identify genetic variants associated with complex traits. Complex traits are affected not only by single gene loci, but also by the interaction of multiple gene loci. Studies of association between gene regions and quantitative traits are of great significance in revealing the genetic mechanism of biological development. There have been many studies on single-gene region association analysis, but the application of functional linear models in multi-gene region association analysis remains limited. In this paper, a functional multi-gene region association analysis test method is proposed based on the functional linear model. The test method is studied through computer simulation from three directions: the common multi-gene region method, the multi-gene region weighted method and the multi-gene region loci weighted method. The following conclusions are obtained through computer simulation: (a) the functional multi-gene region association analysis test method has higher power than the functional single gene region association analysis test method; (b) the functional multi-gene region weighted method performs better than the common functional multi-gene region method; (c) the functional multi-gene region loci weighted method performs best among the three directions; (d) the Step method and the Multi-gene region loci weighted Step perform best in general for multi-gene regions. The functional multi-gene region association analysis test method can theoretically provide a feasible method for the study of complex traits affected by multiple genes.
|
44
|
Bielak LF, Peyser PA, Smith JA, Zhao W, Ruiz‐Narvaez EA, Kardia SLR, Harlow SD. Multivariate, region-based genetic analyses of facets of reproductive aging in White and Black women. Mol Genet Genomic Med 2022; 10:e1896. [PMID: 35179313 PMCID: PMC9000932 DOI: 10.1002/mgg3.1896] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/23/2021] [Revised: 01/14/2022] [Accepted: 01/31/2022] [Indexed: 01/28/2023] Open
Abstract
BACKGROUND Age at final menstrual period (FMP) and the accompanying hormone trajectories across the menopause transition do not occur in isolation, but likely share molecular pathways. Understanding the genetics underlying the endocrinology of the menopause transition may be enhanced by jointly analyzing multiple interrelated traits. METHODS In a sample of 347 White and 164 Black women from the Study of Women's Health Across the Nation (SWAN), we investigated pleiotropic effects of 54 candidate genetic regions of interest (ROI) on 5 menopausal traits (age at FMP and premenopausal and postmenopausal levels of follicle-stimulating hormone and estradiol) using multivariate kernel regression (Multi-SKAT). A backward elimination procedure was used to identify which subset of traits was most strongly associated with a specific ROI. RESULTS In White women, the 20 kb ROI around rs10734411 was significantly associated with the multivariate distribution of age at FMP, premenopausal estradiol, and postmenopausal estradiol (omnibus p-value = .00004). This association did not replicate in the smaller sample of Black women. CONCLUSION This study using a region-based, multiple-trait approach suggests a shared genetic basis among multiple facets of reproductive aging.
Affiliation(s)
- Lawrence F. Bielak
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Patricia A. Peyser
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Jennifer A. Smith
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA; Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, Michigan, USA
- Wei Zhao
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Edward A. Ruiz‐Narvaez
- Department of Nutritional Sciences, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Sharon L. R. Kardia
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Sioban D. Harlow
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
|
45
|
Zhang Z, Zhao Y. Progress on the roles of MEF2C in neuropsychiatric diseases. Mol Brain 2022; 15:8. [PMID: 34991657 PMCID: PMC8740500 DOI: 10.1186/s13041-021-00892-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 05/20/2021] [Accepted: 12/23/2021] [Indexed: 12/15/2022] Open
Abstract
Myocyte Enhancer Factor 2C (MEF2C), a transcription factor of the MADS-box family, is involved in embryonic brain development, neuronal formation and differentiation, as well as in the growth and pruning of axons and dendrites. MEF2C is also involved in the development of various neuropsychiatric disorders, such as autism spectrum disorders (ASD), epilepsy, schizophrenia and Alzheimer’s disease (AD). Here, we review the relationship between MEF2C and neuropsychiatric disorders, and provide further insights into the mechanisms of these diseases.
Affiliation(s)
- Zhikun Zhang
- National Center for International Research of Bio-Targeting Theranostics, Guangxi Key Laboratory of Bio-Targeting Theranostics, Collaborative Innovation Center for Targeting Tumor Diagnosis and Therapy, Guangxi Medical University, Nanning, 530021, Guangxi, China; Department of Mental Health, The Second Affiliated Hospital of Guangxi Medical University, Nanning, 530007, Guangxi, China
- Yongxiang Zhao
- National Center for International Research of Bio-Targeting Theranostics, Guangxi Key Laboratory of Bio-Targeting Theranostics, Collaborative Innovation Center for Targeting Tumor Diagnosis and Therapy, Guangxi Medical University, Nanning, 530021, Guangxi, China.
|
46
|
Qu J, Cui Y. Gene set analysis with graph-embedded kernel association test. Bioinformatics 2021; 38:1560-1567. [PMID: 34935928 PMCID: PMC8896609 DOI: 10.1093/bioinformatics/btab851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2021] [Revised: 11/20/2021] [Accepted: 12/16/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Kernel-based association tests (KATs) have been a popular approach to evaluate the association of expressions of a gene set (e.g. pathway) with a phenotypic trait. KATs rely on kernel functions, which capture the sample similarity across multiple features, to capture potential linear or non-linear relationships among features in a gene set. When calculating the kernel functions, no network graphical information about the features is considered. Because genes in a functional group (e.g. a pathway) are generally not independent due to regulatory interactions, incorporating regulatory network (or graph) information can potentially increase the power of KATs. In this work, we propose a graph-embedded kernel association test, termed gKAT. gKAT incorporates prior pathway knowledge into hypothesis testing when constructing the kernel function. RESULTS We apply a diffusion kernel to capture any graph structures in a gene set, then incorporate such information to build a kernel function for subsequent association testing. We illustrate the geometric meaning of the approach. Through extensive simulation studies, we show that the proposed gKAT algorithm can improve testing power compared to the one without considering graph structures. Application to a real dataset further demonstrates the utility of the method. AVAILABILITY AND IMPLEMENTATION The R code used for the analysis can be accessed at https://github.com/JialinQu/gKAT. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jialin Qu
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Yuehua Cui
- To whom correspondence should be addressed.
|
47
|
Lu H, Qiao J, Shao Z, Wang T, Huang S, Zeng P. A comprehensive gene-centric pleiotropic association analysis for 14 psychiatric disorders with GWAS summary statistics. BMC Med 2021; 19:314. [PMID: 34895209 PMCID: PMC8667366 DOI: 10.1186/s12916-021-02186-z] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 08/16/2021] [Accepted: 11/10/2021] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Recent genome-wide association studies (GWASs) have revealed the polygenic nature of psychiatric disorders and discovered a few single-nucleotide polymorphisms (SNPs) associated with multiple psychiatric disorders. However, the extent and pattern of pleiotropy among distinct psychiatric disorders remain incompletely understood. METHODS We analyzed 14 psychiatric disorders using summary statistics available from the largest GWASs to date. We first applied cross-trait linkage disequilibrium score regression (LDSC) to estimate genetic correlation between disorders. Then, we performed a gene-based pleiotropy analysis by first aggregating a set of SNP-level associations into a single gene-level association signal using MAGMA. From a methodological perspective, we viewed the identification of pleiotropic associations across the entire genome as a high-dimensional problem of composite null hypothesis testing and utilized a novel method called PLACO for pleiotropy mapping. We ultimately implemented functional analysis for identified pleiotropic genes and used Mendelian randomization for detecting causal association between these disorders. RESULTS We confirmed extensive genetic correlation among psychiatric disorders, based on which these disorders can be grouped into three diverse categories. We detected a large number of pleiotropic genes, including 5884 associations and 2424 unique genes, and found that differentially expressed pleiotropic genes were significantly enriched in pancreas, liver, heart, and brain, and that the biological process of these genes was remarkably enriched in regulating neurodevelopment, neurogenesis, and neuron differentiation, offering substantial evidence supporting the validity of identified pleiotropic loci. We further demonstrated that among all the identified pleiotropic genes there were 342 unique ones linked with 6353 drugs through drug-gene interactions, which can be classified into distinct types including inhibitor, agonist, blocker, antagonist, and modulator. We also revealed causal associations among psychiatric disorders, indicating that genetic overlap and causality commonly drove the observed co-existence of these disorders. CONCLUSIONS Our study is among the first large-scale efforts to characterize gene-level pleiotropy among a greatly expanded set of psychiatric disorders and provides important insight into shared genetic etiology underlying these disorders. The findings would inform psychiatric nosology, identify potential neurobiological mechanisms predisposing to specific clinical presentations, and pave the way to effective drug targets for clinical treatment.
Affiliation(s)
- Haojie Lu
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Jiahao Qiao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Zhonghe Shao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Ting Wang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Shuiping Huang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Ping Zeng
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
|
48
|
Maus Esfahani N, Catchpoole D, Khan J, Kennedy PJ. MCKAT: a multi-dimensional copy number variant kernel association test. BMC Bioinformatics 2021; 22:588. [PMID: 34895138 PMCID: PMC8666084 DOI: 10.1186/s12859-021-04494-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/26/2021] [Accepted: 11/25/2021] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND Copy number variants (CNVs) are the gain or loss of DNA segments in the genome. Studies have shown that CNVs are linked to various disorders, including autism, intellectual disability, and schizophrenia. Consequently, the interest in studying a possible association of CNVs with specific disease traits is growing. However, due to the specific multi-dimensional characteristics of CNVs, methods for testing the association between CNVs and disease-related traits are still underdeveloped. We propose a novel multi-dimensional CNV kernel association test (MCKAT) in this paper. We aim to find significant associations between CNVs and disease-related traits using kernel-based methods. RESULTS We address the multi-dimensionality in CNV characteristics. We first design a single-pair CNV kernel, which contains three sub-kernels to summarize the similarity between two CNVs considering all CNV characteristics. We then aggregate the single-pair CNV kernels into a whole-chromosome CNV kernel, which summarizes the similarity between CNVs in two or more chromosomes. Finally, the association between the CNVs and disease-related traits is evaluated by comparing the similarity in the trait with the kernel-based similarity using a score test in a random effect model. We apply MCKAT to genome-wide CNV datasets to examine the association between CNVs and disease-related traits, which demonstrates the potential usefulness of the proposed method for CNV association tests. We compare the performance of MCKAT with CKAT, a uni-dimensional kernel method. Based on the results, MCKAT provides stronger evidence (smaller p-values) in detecting significant associations between CNVs and disease-related traits in both rare and common CNV datasets. CONCLUSION A multi-dimensional copy number variant kernel association test can detect CNV regions that are statistically significantly associated with any disease-related trait. MCKAT can provide biologists with CNV hot spots at the cytogenetic band level, where CNVs may have a significant association with disease-related traits. Using MCKAT, biologists can narrow their investigation from the whole genome, including many genes and CNVs, to the more specific cytogenetic bands that MCKAT identifies. Furthermore, MCKAT can help biologists detect CNVs significantly associated with disease-related traits across a patient group instead of examining each subject’s CNVs case by case.
Affiliation(s)
- Nastaran Maus Esfahani
- Australian Artificial Intelligence Institute, University of Technology Sydney, Sydney, Australia.
- Daniel Catchpoole
- Australian Artificial Intelligence Institute, University of Technology Sydney, Sydney, Australia; The Tumour Bank, The Children's Hospital at Westmead, Sydney, Australia
- Javed Khan
- Center for Cancer Research, National Cancer Institute, Bethesda, USA
- Paul J Kennedy
- Australian Artificial Intelligence Institute, University of Technology Sydney, Sydney, Australia
|
49
|
Nothaft H, Perez-Muñoz ME, Yang T, Murugan AVM, Miller M, Kolarich D, Plastow GS, Walter J, Szymanski CM. Improving Chicken Responses to Glycoconjugate Vaccination Against Campylobacter jejuni. Front Microbiol 2021; 12:734526. [PMID: 34867850 PMCID: PMC8637857 DOI: 10.3389/fmicb.2021.734526] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/01/2021] [Accepted: 10/04/2021] [Indexed: 01/03/2023] Open
Abstract
Campylobacter jejuni is a common cause of diarrheal disease worldwide. Human infection typically occurs through the ingestion of contaminated poultry products. We previously demonstrated that an attenuated Escherichia coli live vaccine strain expressing the C. jejuni N-glycan on its surface reduced the Campylobacter load in more than 50% of vaccinated leghorn and broiler birds to undetectable levels (responder birds), whereas the remainder of the animals were still colonized (non-responders). To understand the underlying mechanism, we conducted three vaccination and challenge studies using 135 broiler birds and found a similar responder/non-responder effect. Subsequent genome-wide association studies (GWAS) and analyses of bird sex and vaccine-induced IgY response levels revealed no correlation with the responder versus non-responder phenotype. In contrast, antibodies isolated from responder birds displayed a higher Campylobacter-opsonophagocytic activity when compared to antisera from non-responder birds. No differences in the N-glycome of the sera could be detected, although minor changes in IgY glycosylation warrant further investigation. As reported before, the composition of the microbiota, particularly levels of OTUs classified as Clostridium spp., Ruminococcaceae and Lachnospiraceae, is associated with the response. Transplantation of the cecal microbiota of responder birds into new birds in combination with vaccination resulted in further increases in vaccine-induced antigen-specific IgY responses when compared to birds that did not receive microbiota transplants. Our work suggests that the IgY effector function and microbiota contribute to the efficacy of the E. coli live vaccine, information that could form the basis for the development of improved vaccines targeted at the elimination of C. jejuni from poultry.
Affiliation(s)
- Harald Nothaft
- Department of Medical Microbiology and Immunology, University of Alberta, Edmonton, AB, Canada
- Maria Elisa Perez-Muñoz
- Department of Agricultural, Food & Nutritional Science, University of Alberta, Edmonton, AB, Canada
- Tianfu Yang
- Department of Agricultural, Food & Nutritional Science, University of Alberta, Edmonton, AB, Canada
- Abarna V M Murugan
- Institute for Glycomics, Griffith University, Gold Coast Campus, Southport, QLD, Australia
- Daniel Kolarich
- Institute for Glycomics, Griffith University, Gold Coast Campus, Southport, QLD, Australia; ARC Centre of Excellence for Nanoscale BioPhotonics, Griffith University, Southport, QLD, Australia
- Graham S Plastow
- Department of Agricultural, Food & Nutritional Science, University of Alberta, Edmonton, AB, Canada; Livestock Gentec, Edmonton, AB, Canada
- Jens Walter
- Department of Agricultural, Food & Nutritional Science, University of Alberta, Edmonton, AB, Canada
- Christine M Szymanski
- Department of Medical Microbiology and Immunology, University of Alberta, Edmonton, AB, Canada; Department of Microbiology and Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
|
50
|
Cao C, Kossinna P, Kwok D, Li Q, He J, Su L, Guo X, Zhang Q, Long Q. Disentangling genetic feature selection and aggregation in transcriptome-wide association studies. Genetics 2021; 220:6444993. [PMID: 34849857 DOI: 10.1093/genetics/iyab216] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 07/01/2021] [Accepted: 11/04/2021] [Indexed: 12/14/2022] Open
Abstract
The success of transcriptome-wide association studies (TWAS) has led to substantial research towards improving the predictive accuracy of its core component of Genetically Regulated eXpression (GReX). GReX links expression information with genotype and phenotype by playing two roles simultaneously: it acts as both the outcome of the genotype-based predictive models (for predicting expressions) and the linear combination of genotypes (as the predicted expressions) for association tests. From the perspective of machine learning (considering SNPs as features), these are actually two separable steps, feature selection and feature aggregation, which can be independently conducted. In this work, we show that the single approach of GReX limits the adaptability of TWAS methodology and practice. By conducting simulations and real data analysis, we demonstrate that disentangled protocols adapting straightforward approaches for feature selection (e.g., simple marker test) and aggregation (e.g., kernel machines) outperform the standard TWAS protocols that rely on GReX. Our development provides more powerful novel tools for conducting TWAS. More importantly, our characterization of the exact nature of TWAS suggests that, instead of questionably binding two distinct steps into the same statistical form (GReX), methodological research focusing on optimal combinations of feature selection and aggregation approaches will bring higher power to TWAS protocols.
Affiliation(s)
- Chen Cao
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Pathum Kossinna
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Devin Kwok
- Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
- Qing Li
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Jingni He
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Liya Su
- Department of Pathology, Anatomy and Cell Biology, Thomas Jefferson University, Philadelphia, PA 19107, USA
- Xingyi Guo
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN 37203, USA
- Qingrun Zhang
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada; Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
- Quan Long
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada; Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada; Department of Medical Genetics, University of Calgary, Calgary, AB T2N 4N1, Canada; Hotchkiss Brain Institute, O'Brien Institute for Public Health, University of Calgary, Calgary, AB T2N 4N1, Canada
|