51
Bielak LF, Peyser PA, Smith JA, Zhao W, Ruiz‐Narvaez EA, Kardia SLR, Harlow SD. Multivariate, region-based genetic analyses of facets of reproductive aging in White and Black women. Mol Genet Genomic Med 2022; 10:e1896. [PMID: 35179313 PMCID: PMC9000932 DOI: 10.1002/mgg3.1896] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/23/2021] [Revised: 01/14/2022] [Accepted: 01/31/2022] [Indexed: 01/28/2023]
Abstract
BACKGROUND Age at final menstrual period (FMP) and the accompanying hormone trajectories across the menopause transition do not occur in isolation, but likely share molecular pathways. Understanding the genetics underlying the endocrinology of the menopause transition may be enhanced by jointly analyzing multiple interrelated traits. METHODS In a sample of 347 White and 164 Black women from the Study of Women's Health Across the Nation (SWAN), we investigated pleiotropic effects of 54 candidate genetic regions of interest (ROI) on 5 menopausal traits (age at FMP and premenopausal and postmenopausal levels of follicle-stimulating hormone and estradiol) using multivariate kernel regression (Multi-SKAT). A backward elimination procedure was used to identify which subset of traits was most strongly associated with a specific ROI. RESULTS In White women, the 20 kb ROI around rs10734411 was significantly associated with the multivariate distribution of age at FMP, premenopausal estradiol, and postmenopausal estradiol (omnibus p-value = .00004). This association did not replicate in the smaller sample of Black women. CONCLUSION This study using a region-based, multiple-trait approach suggests a shared genetic basis among multiple facets of reproductive aging.
Affiliation(s)
- Lawrence F. Bielak
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Patricia A. Peyser
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Jennifer A. Smith
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, Michigan, USA
- Wei Zhao
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Edward A. Ruiz‐Narvaez
- Department of Nutritional Sciences, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Sharon L. R. Kardia
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Sioban D. Harlow
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
52
Zhang Z, Zhao Y. Progress on the roles of MEF2C in neuropsychiatric diseases. Mol Brain 2022; 15:8. [PMID: 34991657 PMCID: PMC8740500 DOI: 10.1186/s13041-021-00892-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 05/20/2021] [Accepted: 12/23/2021] [Indexed: 12/15/2022]
Abstract
Myocyte Enhancer Factor 2 C (MEF2C), one of the transcription factors of the MADS-BOX family, is involved in embryonic brain development, neuronal formation and differentiation, as well as in the growth and pruning of axons and dendrites. MEF2C is also involved in the development of various neuropsychiatric disorders, such as autism spectrum disorders (ASD), epilepsy, schizophrenia and Alzheimer’s disease (AD). Here, we review the relationship between MEF2C and neuropsychiatric disorders, and provide further insights into the mechanism of these diseases.
Affiliation(s)
- Zhikun Zhang
- National Center for International Research of Bio-Targeting Theranostics, Guangxi Key Laboratory of Bio-Targeting Theranostics, Collaborative Innovation Center for Targeting Tumor Diagnosis and Therapy, Guangxi Medical University, Nanning, 530021, Guangxi, China
- Department of Mental Health, The Second Affiliated Hospital of Guangxi Medical University, Nanning, 530007, Guangxi, China
- Yongxiang Zhao
- National Center for International Research of Bio-Targeting Theranostics, Guangxi Key Laboratory of Bio-Targeting Theranostics, Collaborative Innovation Center for Targeting Tumor Diagnosis and Therapy, Guangxi Medical University, Nanning, 530021, Guangxi, China
53
Qu J, Cui Y. Gene set analysis with graph-embedded kernel association test. Bioinformatics 2021; 38:1560-1567. [PMID: 34935928 PMCID: PMC8896609 DOI: 10.1093/bioinformatics/btab851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2021] [Revised: 11/20/2021] [Accepted: 12/16/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION The kernel-based association test (KAT) has been a popular approach to evaluate the association of the expression of a gene set (e.g. a pathway) with a phenotypic trait. KATs rely on kernel functions that measure sample similarity across multiple features in order to capture potential linear or non-linear relationships among features in a gene set. When calculating the kernel functions, no network graphical information about the features is considered. Because genes in a functional group (e.g. a pathway) are generally not independent owing to regulatory interactions, incorporating regulatory network (or graph) information can potentially increase the power of KAT. In this work, we propose a graph-embedded kernel association test, termed gKAT. gKAT incorporates prior pathway knowledge into hypothesis testing when constructing the kernel function. RESULTS We apply a diffusion kernel to capture any graph structures in a gene set, then incorporate such information to build a kernel function for subsequent association testing. We illustrate the geometric meaning of the approach. Through extensive simulation studies, we show that the proposed gKAT algorithm can improve testing power compared to the one without considering graph structures. Application to a real dataset further demonstrates the utility of the method. AVAILABILITY AND IMPLEMENTATION The R code used for the analysis can be accessed at https://github.com/JialinQu/gKAT. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
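As a rough illustration of the diffusion-kernel idea mentioned in this abstract, the Python sketch below builds a diffusion kernel exp(-beta * L) from the graph Laplacian L of a small, made-up gene network and uses it to weight a linear sample-similarity kernel. The toy adjacency matrix, the value of `beta`, the function names, and the way the graph kernel enters the sample kernel (Z K_graph Z^T) are all assumptions made for illustration; none of them is taken from the gKAT paper or its R code.

```python
import numpy as np


def diffusion_kernel(adjacency, beta=0.5):
    """Diffusion kernel exp(-beta * L) of an undirected graph (illustrative)."""
    A = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(A.sum(axis=1)) - A       # L = D - A
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # L is symmetric
    # Matrix exponential via the eigendecomposition: U diag(exp(-beta*lambda)) U^T
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T


def graph_embedded_kernel(Z, adjacency, beta=0.5):
    """Sample-by-sample kernel Z K_graph Z^T (an assumed, simplified form)."""
    return Z @ diffusion_kernel(adjacency, beta) @ Z.T


# Hypothetical pathway with three genes connected in a chain: g1 - g2 - g3.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]])

# Hypothetical expression matrix: 5 samples x 3 genes.
rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 3))

K_graph = diffusion_kernel(adjacency)
K_samples = graph_embedded_kernel(Z, adjacency)
```

The resulting `K_samples` could then be passed to any kernel association test in place of a plain linear kernel; with `beta = 0` the graph kernel is the identity and the construction reduces to the ordinary linear kernel Z Z^T.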
Affiliation(s)
- Jialin Qu
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Yuehua Cui
- To whom correspondence should be addressed.
54
Lu H, Qiao J, Shao Z, Wang T, Huang S, Zeng P. A comprehensive gene-centric pleiotropic association analysis for 14 psychiatric disorders with GWAS summary statistics. BMC Med 2021; 19:314. [PMID: 34895209 PMCID: PMC8667366 DOI: 10.1186/s12916-021-02186-z] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 08/16/2021] [Accepted: 11/10/2021] [Indexed: 12/15/2022]
Abstract
BACKGROUND Recent genome-wide association studies (GWASs) have revealed the polygenic nature of psychiatric disorders and discovered a few single-nucleotide polymorphisms (SNPs) associated with multiple psychiatric disorders. However, the extent and pattern of pleiotropy among distinct psychiatric disorders remain incompletely understood. METHODS We analyzed 14 psychiatric disorders using summary statistics from the largest GWASs available to date. We first applied cross-trait linkage disequilibrium score regression (LDSC) to estimate genetic correlation between disorders. Then, we performed a gene-based pleiotropy analysis by first aggregating a set of SNP-level associations into a single gene-level association signal using MAGMA. From a methodological perspective, we viewed the identification of pleiotropic associations across the entire genome as a high-dimensional problem of composite null hypothesis testing and utilized a novel method called PLACO for pleiotropy mapping. Finally, we performed functional analysis of the identified pleiotropic genes and used Mendelian randomization to detect causal associations between these disorders. RESULTS We confirmed extensive genetic correlation among psychiatric disorders, based on which these disorders can be grouped into three diverse categories. We detected a large number of pleiotropic genes, comprising 5884 associations and 2424 unique genes, and found that differentially expressed pleiotropic genes were significantly enriched in pancreas, liver, heart, and brain, and that the biological process of these genes was remarkably enriched in regulating neurodevelopment, neurogenesis, and neuron differentiation, offering substantial evidence supporting the validity of identified pleiotropic loci. We further demonstrated that, among all the identified pleiotropic genes, 342 unique genes were linked with 6353 drugs through drug-gene interactions, which can be classified into distinct types including inhibitor, agonist, blocker, antagonist, and modulator. We also revealed causal associations among psychiatric disorders, indicating that genetic overlap and causality commonly drove the observed co-existence of these disorders. CONCLUSIONS Our study is among the first large-scale efforts to characterize gene-level pleiotropy among a greatly expanded set of psychiatric disorders and provides important insight into shared genetic etiology underlying these disorders. The findings would inform psychiatric nosology, identify potential neurobiological mechanisms predisposing to specific clinical presentations, and pave the way to effective drug targets for clinical treatment.
Affiliation(s)
- Haojie Lu
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Jiahao Qiao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Zhonghe Shao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Ting Wang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Shuiping Huang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Ping Zeng
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
55
Maus Esfahani N, Catchpoole D, Khan J, Kennedy PJ. MCKAT: a multi-dimensional copy number variant kernel association test. BMC Bioinformatics 2021; 22:588. [PMID: 34895138 PMCID: PMC8666084 DOI: 10.1186/s12859-021-04494-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/26/2021] [Accepted: 11/25/2021] [Indexed: 11/25/2022]
Abstract
Background Copy number variants (CNVs) are gains or losses of DNA segments in the genome. Studies have shown that CNVs are linked to various disorders, including autism, intellectual disability, and schizophrenia. Consequently, the interest in studying a possible association of CNVs with specific disease traits is growing. However, due to the specific multi-dimensional characteristics of CNVs, methods for testing the association between CNVs and disease-related traits are still underdeveloped. We propose a novel multi-dimensional CNV kernel association test (MCKAT) in this paper. We aim to find significant associations between CNVs and disease-related traits using kernel-based methods. Results We address the multi-dimensionality in CNV characteristics. We first design a single-pair CNV kernel, which contains three sub-kernels to summarize the similarity between two CNVs considering all CNV characteristics. We then aggregate the single-pair CNV kernels into a whole-chromosome CNV kernel, which summarizes the similarity between CNVs in two or more chromosomes. Finally, the association between the CNVs and disease-related traits is evaluated by comparing the similarity in the trait with kernel-based similarity using a score test in a random effect model. We apply MCKAT to genome-wide CNV datasets to examine the association between CNVs and disease-related traits, which demonstrates the potential usefulness of the proposed method for CNV association tests. We compare the performance of MCKAT with CKAT, a uni-dimensional kernel method. Based on the results, MCKAT yields stronger evidence (smaller p-values) when detecting significant associations between CNVs and disease-related traits in both rare and common CNV datasets. Conclusion A multi-dimensional copy number variant kernel association test can detect CNV regions that are statistically significantly associated with any disease-related trait. MCKAT can provide biologists with CNV hot spots at the cytogenetic band level, that is, bands whose CNVs may be significantly associated with disease-related traits. Using MCKAT, biologists can narrow their investigation from the whole genome, including many genes and CNVs, to more specific cytogenetic bands that MCKAT identifies. Furthermore, MCKAT can help biologists detect CNVs significantly associated with disease-related traits across a patient group instead of examining each subject’s CNVs case by case.
Affiliation(s)
- Nastaran Maus Esfahani
- Australian Artificial Intelligence Institute, University of Technology Sydney, Sydney, Australia
- Daniel Catchpoole
- Australian Artificial Intelligence Institute, University of Technology Sydney, Sydney, Australia
- The Tumour Bank, The Children's Hospital at Westmead, Sydney, Australia
- Javed Khan
- Center for Cancer Research, National Cancer Institute, Bethesda, USA
- Paul J Kennedy
- Australian Artificial Intelligence Institute, University of Technology Sydney, Sydney, Australia
56
Nothaft H, Perez-Muñoz ME, Yang T, Murugan AVM, Miller M, Kolarich D, Plastow GS, Walter J, Szymanski CM. Improving Chicken Responses to Glycoconjugate Vaccination Against Campylobacter jejuni. Front Microbiol 2021; 12:734526. [PMID: 34867850 PMCID: PMC8637857 DOI: 10.3389/fmicb.2021.734526] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/01/2021] [Accepted: 10/04/2021] [Indexed: 01/03/2023]
Abstract
Campylobacter jejuni is a common cause of diarrheal disease worldwide. Human infection typically occurs through the ingestion of contaminated poultry products. We previously demonstrated that an attenuated Escherichia coli live vaccine strain expressing the C. jejuni N-glycan on its surface reduced the Campylobacter load in more than 50% of vaccinated leghorn and broiler birds to undetectable levels (responder birds), whereas the remainder of the animals were still colonized (non-responders). To understand the underlying mechanism, we conducted three vaccination and challenge studies using 135 broiler birds and found a similar responder/non-responder effect. Subsequent genome-wide association studies (GWAS) and analyses of bird sex and of vaccine-induced IgY response levels showed no correlation with the responder versus non-responder phenotype. In contrast, antibodies isolated from responder birds displayed a higher Campylobacter-opsonophagocytic activity when compared to antisera from non-responder birds. No differences in the N-glycome of the sera could be detected, although minor changes in IgY glycosylation warrant further investigation. As reported before, the composition of the microbiota, particularly the levels of OTUs classified as Clostridium spp., Ruminococcaceae and Lachnospiraceae, is associated with the response. Transplantation of the cecal microbiota of responder birds into new birds in combination with vaccination resulted in further increases in vaccine-induced antigen-specific IgY responses when compared to birds that did not receive microbiota transplants. Our work suggests that the IgY effector function and microbiota contribute to the efficacy of the E. coli live vaccine, information that could form the basis for the development of improved vaccines targeted at the elimination of C. jejuni from poultry.
Affiliation(s)
- Harald Nothaft
- Department of Medical Microbiology and Immunology, University of Alberta, Edmonton, AB, Canada
- Maria Elisa Perez-Muñoz
- Department of Agricultural, Food & Nutritional Science, University of Alberta, Edmonton, AB, Canada
- Tianfu Yang
- Department of Agricultural, Food & Nutritional Science, University of Alberta, Edmonton, AB, Canada
- Abarna V M Murugan
- Institute for Glycomics, Griffith University, Gold Coast Campus, Southport, QLD, Australia
- Daniel Kolarich
- Institute for Glycomics, Griffith University, Gold Coast Campus, Southport, QLD, Australia
- ARC Centre of Excellence for Nanoscale BioPhotonics, Griffith University, Southport, QLD, Australia
- Graham S Plastow
- Department of Agricultural, Food & Nutritional Science, University of Alberta, Edmonton, AB, Canada
- Livestock Gentec, Edmonton, AB, Canada
- Jens Walter
- Department of Agricultural, Food & Nutritional Science, University of Alberta, Edmonton, AB, Canada
- Christine M Szymanski
- Department of Medical Microbiology and Immunology, University of Alberta, Edmonton, AB, Canada
- Department of Microbiology and Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
57
Cao C, Kossinna P, Kwok D, Li Q, He J, Su L, Guo X, Zhang Q, Long Q. Disentangling genetic feature selection and aggregation in transcriptome-wide association studies. Genetics 2021; 220:6444993. [PMID: 34849857 DOI: 10.1093/genetics/iyab216] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/01/2021] [Accepted: 11/04/2021] [Indexed: 12/14/2022]
Abstract
The success of transcriptome-wide association studies (TWAS) has led to substantial research towards improving the predictive accuracy of its core component of Genetically Regulated eXpression (GReX). GReX links expression information with genotype and phenotype by playing two roles simultaneously: it acts as both the outcome of the genotype-based predictive models (for predicting expressions) and the linear combination of genotypes (as the predicted expressions) for association tests. From the perspective of machine learning (considering SNPs as features), these are actually two separable steps, feature selection and feature aggregation, which can be independently conducted. In this work, we show that the single approach of GReX limits the adaptability of TWAS methodology and practice. By conducting simulations and real data analysis, we demonstrate that disentangled protocols adapting straightforward approaches for feature selection (e.g., simple marker test) and aggregation (e.g., kernel machines) outperform the standard TWAS protocols that rely on GReX. Our development provides more powerful novel tools for conducting TWAS. More importantly, our characterization of the exact nature of TWAS suggests that, instead of questionably binding two distinct steps into the same statistical form (GReX), methodological research focusing on optimal combinations of feature selection and aggregation approaches will bring higher power to TWAS protocols.
Affiliation(s)
- Chen Cao
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Pathum Kossinna
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Devin Kwok
- Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
- Qing Li
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Jingni He
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Liya Su
- Department of Pathology, Anatomy and Cell Biology, Thomas Jefferson University, Philadelphia, PA 19107, USA
- Xingyi Guo
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN 37203, USA
- Qingrun Zhang
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
- Quan Long
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
- Department of Medical Genetics, University of Calgary, Calgary, AB T2N 4N1, Canada
- Hotchkiss Brain Institute, O'Brien Institute for Public Health, University of Calgary, Calgary, AB T2N 4N1, Canada
58
SMCKAT, a Sequential Multi-Dimensional CNV Kernel-Based Association Test. Life (Basel) 2021; 11:life11121302. [PMID: 34947833 PMCID: PMC8709152 DOI: 10.3390/life11121302] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2021] [Revised: 10/27/2021] [Accepted: 11/23/2021] [Indexed: 11/17/2022]
Abstract
Copy number variants (CNVs) are the most common form of structural genetic variation, reflecting the gain or loss of DNA segments compared with a reference genome. Studies have identified CNV associations with different diseases. However, the association between the sequential order of CNVs and disease-related traits has not been studied, to our knowledge, and it is still unclear whether CNVs function individually or work in coordination with other CNVs to manifest a disease or trait. Consequently, we propose the first such method to test the association between the sequential order of CNVs and diseases. Our sequential multi-dimensional CNV kernel-based association test (SMCKAT) consists of three parts: (1) a single CNV group kernel measuring the similarity between two groups of CNVs; (2) a whole genome group kernel that aggregates several single group kernels to summarize the similarity between CNV groups in a single chromosome or the whole genome; and (3) an association test between the CNV sequential order and disease-related traits using a random effect model. We evaluate SMCKAT on CNV data sets exhibiting rare or common CNVs, demonstrating that it can detect specific biologically relevant chromosomal regions supported by the biomedical literature. We compare the performance of SMCKAT with MCKAT, a multi-dimensional kernel association test. Based on the results, SMCKAT can detect more specific chromosomal regions than MCKAT: regions in which not only the CNV characteristics but also the order of the CNVs is significantly associated with the disease-related trait.
59
Huang Y, Li L, Jiang J. Radiogenomics of Alzheimer's disease: exploring gene related metabolic imaging markers. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2021; 2021:5772-5775. [PMID: 34892431 DOI: 10.1109/embc46164.2021.9630690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Alzheimer's disease (AD) is the most prevalent neurodegenerative disorder and is considerably determined by genetic factors. Fluorodeoxyglucose positron emission tomography (FDG-PET) can reflect the functional state of glucose metabolism in the brain, and radiomic features of FDG-PET have been considered important imaging markers in AD. However, radiomic features are not highly interpretable and, in particular, lack an explanation in terms of underlying biological and molecular mechanisms. Therefore, this study used radiogenomics analysis to explore prognostic metabolic imaging markers by associating radiomic features and genetic data. In the study, we used the FDG-PET images and genotype data of 389 subjects (Cohort B) enrolled in the ADNI, including 109 AD, 134 healthy controls (HCs), 72 MCI non-converters (MCI-nc) and 74 MCI converters (MCI-c). Firstly, we performed a genome-wide association study (GWAS) on the genotype data of 998 subjects (Cohort A), including 632 AD and 366 HCs after quality control (QC) steps, to identify susceptibility loci as the gene features. Secondly, radiomic features were extracted from the preprocessed PET images. Thirdly, a two-sample t-test, a rank-sum test and the F-score were used as the feature selection step to select effective radiomic features. Fourthly, a support vector machine (SVM) was used to test the ability of the radiomic features to classify HCs, MCI and AD patients. Finally, we performed Spearman correlation analysis on the genetic data and radiomic features. As a result, we identified rs429358 and rs2075650 as genome-wide significant signals. The radiomic approach achieved good classification abilities. Two prognostic FDG-PET radiomic features in the amygdala were proven to be correlated with the genetic data.
60
Arthur VL, Li Z, Cao R, Oetting WS, Israni AK, Jacobson PA, Ritchie MD, Guan W, Chen J. A Multi-Marker Test for Analyzing Paired Genetic Data in Transplantation. Front Genet 2021; 12:745773. [PMID: 34721531 PMCID: PMC8548646 DOI: 10.3389/fgene.2021.745773] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/22/2021] [Accepted: 09/23/2021] [Indexed: 12/02/2022]
Abstract
Emerging evidence suggests that donor/recipient matching in non-HLA (human leukocyte antigen) regions of the genome may impact transplant outcomes and recognizing these matching effects may increase the power of transplant genetics studies. Most available matching scores account for either single-nucleotide polymorphism (SNP) matching only or sum these SNP matching scores across multiple gene-coding regions, which makes it challenging to interpret the association findings. We propose a multi-marker Joint Score Test (JST) to jointly test for association between recipient genotype SNP effects and a gene-based matching score with transplant outcomes. This method utilizes Eigen decomposition as a dimension reduction technique to potentially increase statistical power by decreasing the degrees of freedom for the test. In addition, JST allows for the matching effect and the recipient genotype effect to follow different biological mechanisms, which is not the case for other multi-marker methods. Extensive simulation studies show that JST is competitive when compared with existing methods, such as the sequence kernel association test (SKAT), especially under scenarios where associated SNPs are in low linkage disequilibrium with non-associated SNPs or in gene regions containing a large number of SNPs. Applying the method to paired donor/recipient genetic data from kidney transplant studies yields various gene regions that are potentially associated with incidence of acute rejection after transplant.
Affiliation(s)
- Victoria L. Arthur
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
- Zhengbang Li
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
- Departments of Statistics, Central China Normal University, Wuhan, China
- Rui Cao
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, United States
- William S. Oetting
- Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN, United States
- Ajay K. Israni
- Minneapolis Medical Research Foundation, Minneapolis, MN, United States
- Department of Medicine, Hennepin County Medical Center, Minneapolis, MN, United States
- Department of Epidemiology and Community Health, University of Minnesota, Minneapolis, MN, United States
- Pamala A. Jacobson
- Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN, United States
- Marylyn D. Ritchie
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Weihua Guan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, United States
- Jinbo Chen
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
61
Lu H, Wei Y, Jiang Z, Zhang J, Wang T, Huang S, Zeng P. Integrative eQTL-weighted hierarchical Cox models for SNP-set based time-to-event association studies. J Transl Med 2021; 19:418. [PMID: 34627275 PMCID: PMC8502405 DOI: 10.1186/s12967-021-03090-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/09/2021] [Accepted: 09/26/2021] [Indexed: 01/23/2023]
Abstract
BACKGROUND Integrating functional annotations into SNP-set association studies has proven to be a powerful analysis strategy. Statistical methods for such integration have been developed for continuous and binary phenotypes; however, SNP-set integrative approaches for time-to-event or survival outcomes are lacking. METHODS We here propose IEHC, an integrative eQTL (expression quantitative trait loci) hierarchical Cox regression, for SNP-set based survival association analysis by modeling effect sizes of genetic variants as a function of eQTL in a hierarchical manner. Three p-value combination tests are developed to examine the joint effects of eQTL and genetic variants after a novel decorrelated modification of statistics for the two components. An omnibus test (IEHC-ACAT) is further adapted to aggregate the strengths of all available tests. RESULTS Simulations demonstrated that the IEHC joint tests were more powerful if both eQTL and genetic variants contributed to the association signal, while IEHC-ACAT was robust and often outperformed other approaches across various simulation scenarios. When applying IEHC to ten TCGA cancers by incorporating eQTL from relevant tissues of GTEx, we revealed that substantial correlations existed between the two types of effect sizes of genetic variants from TCGA and GTEx, and identified 21 (9 unique) cancer-associated genes which would otherwise be missed by approaches not incorporating eQTL. CONCLUSION IEHC represents a flexible, robust, and powerful approach to integrate functional omics information to enhance the power of identifying association signals for the survival risk of complex human cancers.
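The abstract does not spell out how IEHC-ACAT aggregates the individual tests. As a hedged sketch only, the Python code below implements the generic Cauchy combination rule on which ACAT-type omnibus tests are commonly built: each p-value is mapped to tan((0.5 - p) * pi), the transformed values are averaged with weights, and the average is mapped back to a p-value through the standard Cauchy distribution. The function name, the default equal weights, and the example p-values are assumptions for illustration and are not taken from the authors' implementation.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule (illustrative sketch)."""
    if weights is None:
        weights = [1.0] * len(p_values)  # assumed default: equal weights
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")

    total_weight = sum(weights)
    # Weighted average of Cauchy-transformed p-values.
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values)
    ) / total_weight
    # Upper-tail probability of a standard Cauchy variable at `statistic`.
    return 0.5 - math.atan(statistic) / math.pi


# Hypothetical p-values from three component tests of the same SNP set.
combined = cauchy_combination([0.01, 0.20, 0.03])
```

One design property worth noting: the combined value is dominated by the smallest input p-values, which is why this kind of rule is often used as an omnibus test when it is unknown in advance which component test will be most powerful.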
Affiliation(s)
- Haojie Lu
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Yongyue Wei
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, 211166, Jiangsu, China
- Zhou Jiang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Jinhui Zhang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Ting Wang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Shuiping Huang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
  - Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
  - Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Ping Zeng
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
  - Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
  - Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
62
Petrykey K, Rezgui AM, Guern ML, Beaulieu P, St-Onge P, Drouin S, Bertout L, Wang F, Baedke JL, Yasui Y, Hudson MM, Raboisson MJ, Laverdière C, Sinnett D, Andelfinger GU, Krajinovic M. Genetic factors in treatment-related cardiovascular complications in survivors of childhood acute lymphoblastic leukemia. Pharmacogenomics 2021; 22:885-901. [PMID: 34505544 PMCID: PMC9043873 DOI: 10.2217/pgs-2021-0067] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/01/2021] [Accepted: 08/12/2021] [Indexed: 11/21/2022] Open
Abstract
Aim: Cardiovascular disease represents one of the main causes of secondary morbidity and mortality in patients with childhood cancer. Patients & methods: To further address this issue, we analyzed cardiovascular complications in relation to common and rare genetic variants derived through whole-exome sequencing from childhood acute lymphoblastic leukemia survivors (PETALE cohort). Results: Significant associations were detected between common variants in the TTN gene and both left ventricular ejection fraction (p ≤ 0.0005) and fractional shortening (p ≤ 0.001). Rare-variant enrichment in the NOS1, ABCG2 and NOD2 genes was observed in relation to left ventricular ejection fraction, and in the NOD2 and ZNF267 genes in relation to fractional shortening. Following stratification according to risk groups, a modulatory effect of rare variants was additionally found in the CBR1, ABCC5 and AKR1C3 genes. None of the associations was replicated in the St. Jude Lifetime Cohort Study. Conclusion: Further studies are needed to confirm whether the described genetic markers may be useful in identifying patients at increased risk of these complications.
Affiliation(s)
- Kateryna Petrykey
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
  - Department of Pharmacology & Physiology, Université de Montréal, QC, H3T 1J4, Canada
- Aziz M Rezgui
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
- Mathilde Le Guern
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
- Patrick Beaulieu
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
- Pascal St-Onge
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
- Simon Drouin
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
- Laurence Bertout
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
- Fan Wang
  - Department of Epidemiology & Cancer Control, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Jessica L Baedke
  - Department of Epidemiology & Cancer Control, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Yutaka Yasui
  - Department of Epidemiology & Cancer Control, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Melissa M Hudson
  - Department of Epidemiology & Cancer Control, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
  - Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Marie-Josée Raboisson
  - Department of Pediatrics, Université de Montréal, QC, H3T 1C5, Canada
  - Cardiology Unit, Sainte-Justine University Health Center (SJUHC), Montreal, QC, H3T 1C5, Canada
- Caroline Laverdière
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
  - Department of Pediatrics, Université de Montréal, QC, H3T 1C5, Canada
- Daniel Sinnett
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
  - Department of Pediatrics, Université de Montréal, QC, H3T 1C5, Canada
- Gregor U Andelfinger
  - Department of Pediatrics, Université de Montréal, QC, H3T 1C5, Canada
  - Fetomaternal and Neonatal Pathologies Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC, H3T 1C5, Canada
- Maja Krajinovic
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
  - Department of Pharmacology & Physiology, Université de Montréal, QC, H3T 1J4, Canada
  - Department of Pediatrics, Université de Montréal, QC, H3T 1C5, Canada
63
Zwir I, Del-Val C, Arnedo J, Pulkki-Råback L, Konte B, Yang SS, Romero-Zaliz R, Hintsanen M, Cloninger KM, Garcia D, Svrakic DM, Lester N, Rozsa S, Mesa A, Lyytikäinen LP, Giegling I, Kähönen M, Martinez M, Seppälä I, Raitoharju E, de Erausquin GA, Mamah D, Raitakari O, Rujescu D, Postolache TT, Gu CC, Sung J, Lehtimäki T, Keltikangas-Järvinen L, Cloninger CR. Three genetic-environmental networks for human personality. Mol Psychiatry 2021; 26:3858-3875. [PMID: 31748689 PMCID: PMC8550959 DOI: 10.1038/s41380-019-0579-x] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 03/04/2019] [Revised: 09/26/2019] [Accepted: 10/24/2019] [Indexed: 02/07/2023]
Abstract
Phylogenetic, developmental, and brain-imaging studies suggest that human personality is the integrated expression of three major systems of learning and memory that regulate (1) associative conditioning, (2) intentionality, and (3) self-awareness. We have uncovered largely disjoint sets of genes regulating these dissociable learning processes in different clusters of people with (1) unregulated temperament profiles (i.e., associatively conditioned habits and emotional reactivity), (2) organized character profiles (i.e., intentional self-control of emotional conflicts and goals), and (3) creative character profiles (i.e., self-aware appraisal of values and theories), respectively. However, little is known about how these temperament and character components of personality are jointly organized and develop in an integrated manner. In three large independent genome-wide association studies from Finland, Germany, and Korea, we used a data-driven machine learning method to uncover joint phenotypic networks of temperament and character and also the genetic networks with which they are associated. We found three clusters of similar numbers of people with distinct combinations of temperament and character profiles. Their associated genetic and environmental networks were largely disjoint, and differentially related to distinct forms of learning and memory. Of the 972 genes that mapped to the three phenotypic networks, 72% were unique to a single network. The findings in the Finnish discovery sample were blindly and independently replicated in samples of Germans and Koreans. We conclude that temperament and character are integrated within three disjoint networks that regulate healthy longevity and dissociable systems of learning and memory by nearly disjoint sets of genetic and environmental influences.
Grants
- Spanish Ministry of Science and Technology TIN2012-38805 and DPI2015-69585-R
- The Young Finns Study has been financially supported by the Academy of Finland: grants 286284, 134309 (Eye), 126925, 121584, 124282, 129378 (Salve), 117787 (Gendi), 41071 (Skidi), and 308676; the Social Insurance Institution of Finland; Competitive State Research Financing of the Expert Responsibility area of Kuopio, Tampere and Turku University Hospitals (grant X51001); Juho Vainio Foundation; Paavo Nurmi Foundation; Finnish Foundation for Cardiovascular Research ; Finnish Cultural Foundation; Tampere Tuberculosis Foundation; Emil Aaltonen Foundation; Yrjö Jahnsson Foundation; Signe and Ane Gyllenberg Foundation; Diabetes Research Foundation of Finnish Diabetes Association: and EU Horizon 2020 (grant 755320 for TAXINOMISIS).
- American Federation for Suicide Prevention
- Healthy Twin Family Register of Korea
- Anthropedia Foundation
- The Young Finns Study has been financially supported by the Academy of Finland: grants 286284, 322098, 134309 (Eye), 126925, 121584, 124282, 129378 (Salve), 117787 (Gendi), 41071 (Skidi), and 308676; the Social Insurance Institution of Finland; Competitive State Research Financing of the Expert Responsibility area of Kuopio, Tampere and Turku University Hospitals (grant X51001); Juho Vainio Foundation; Paavo Nurmi Foundation; Finnish Foundation for Cardiovascular Research ; Finnish Cultural Foundation; Tampere Tuberculosis Foundation; Emil Aaltonen Foundation; Yrjö Jahnsson Foundation; Signe and Ane Gyllenberg Foundation; Diabetes Research Foundation of Finnish Diabetes Association: and EU Horizon 2020 (grant 755320 for TAXINOMISIS); and Tampere University Hospital Supporting Foundation.
- American Society for Suicide Prevention
- American Foundation for Suicide Prevention
Affiliation(s)
- Igor Zwir
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Computer Science, University of Granada, Granada, Spain
- Coral Del-Val
  - Department of Computer Science, University of Granada, Granada, Spain
- Javier Arnedo
  - Department of Computer Science, University of Granada, Granada, Spain
- Laura Pulkki-Råback
  - Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Bettina Konte
  - Department of Psychiatry, Martin-Luther-University Halle-Wittenberg, Halle, Germany
- Sarah S Yang
  - Department of Epidemiology, and Institute of Health and Environment, School of Public Health, Seoul National University, Seoul, Korea
- Mirka Hintsanen
  - Unit of Psychology, Faculty of Education, University of Oulu, Oulu, Finland
- Danilo Garcia
  - Department of Psychology, University of Gothenburg, Gothenburg, Sweden
  - Blekinge Centre of Competence, Blekinge County Council, Karlskrona, Sweden
- Dragan M Svrakic
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Nigel Lester
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Sandor Rozsa
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Alberto Mesa
  - Department of Computer Science, University of Granada, Granada, Spain
- Leo-Pekka Lyytikäinen
  - Department of Clinical Chemistry, Fimlab Laboratories, and Finnish Cardiovascular Research Center-Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Ina Giegling
  - Department of Psychiatry, Martin-Luther-University Halle-Wittenberg, Halle, Germany
  - University Clinic, Ludwig-Maximilian University, Munich, Germany
- Mika Kähönen
  - Department of Clinical Physiology, Tampere University Hospital, and Finnish Cardiovascular Research Center-Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Maribel Martinez
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Ilkka Seppälä
  - Department of Clinical Chemistry, Fimlab Laboratories, and Finnish Cardiovascular Research Center-Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Emma Raitoharju
  - Department of Clinical Chemistry, Fimlab Laboratories, and Finnish Cardiovascular Research Center-Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Gabriel A de Erausquin
  - The Glenn Biggs Institute of Alzheimer's and Neurodegenerative Disorders, Long School of Medicine, University of Texas Health San Antonio, San Antonio, TX, USA
- Daniel Mamah
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Olli Raitakari
  - Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, Turku, Finland
  - Centre for Population Health Research, Turku University Hospital, University of Turku Hospital, Turku, Finland
  - Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, Turku, Finland
- Dan Rujescu
  - Department of Psychiatry, Martin-Luther-University Halle-Wittenberg, Halle, Germany
- Teodor T Postolache
  - Department of Psychiatry, School of Medicine, University of Maryland, Baltimore, MD, USA
  - Rocky Mountain Mental Illness, Research, Education, and Clinical Center for Veteran Suicide Prevention, Denver, CO, USA
- C Charles Gu
  - Division of Biostatistics, School of Medicine, Washington University, St. Louis, MO, USA
- Joohon Sung
  - Department of Epidemiology, and Institute of Health and Environment, School of Public Health, Seoul National University, Seoul, Korea
- Terho Lehtimäki
  - Department of Clinical Chemistry, Fimlab Laboratories, and Finnish Cardiovascular Research Center-Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- C Robert Cloninger
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Psychological and Brain Sciences, and School of Medicine, Department of Genetics, School of Arts and Sciences, Washington University, St. Louis, MO, USA
64
Shao Z, Wang T, Zhang M, Jiang Z, Huang S, Zeng P. IUSMMT: Survival mediation analysis of gene expression with multiple DNA methylation exposures and its application to cancers of TCGA. PLoS Comput Biol 2021; 17:e1009250. [PMID: 34464378 PMCID: PMC8437300 DOI: 10.1371/journal.pcbi.1009250] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/13/2021] [Revised: 09/13/2021] [Accepted: 07/06/2021] [Indexed: 02/07/2023] Open
Abstract
Effective and powerful survival mediation models are currently lacking. To partly fill this knowledge gap, we focus on mediation analysis that includes multiple DNA methylation sites acting as exposures, one gene expression as the mediator and one survival time as the outcome. We proposed IUSMMT (intersection-union survival mixture-adjusted mediation test) to effectively examine the existence of a mediation effect by fitting an empirical three-component mixture null distribution. With extensive simulation studies, we demonstrated the advantage of IUSMMT over existing methods. We applied IUSMMT to ten TCGA cancers and identified multiple genes that exhibited mediating effects. We further revealed that most of the identified regions, in which genes behaved as active mediators, were cancer type-specific and exhibited a full mediation from DNA methylation CpG sites to the survival risk of various types of cancers. Overall, IUSMMT represents an effective and powerful alternative for survival mediation analysis; our results also provide new insights into the functional role of DNA methylation and gene expression in cancer progression/prognosis and demonstrate potential therapeutic targets for future clinical practice.
Affiliation(s)
- Zhonghe Shao
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Ting Wang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Meng Zhang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Zhou Jiang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Shuiping Huang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
  - Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, Jiangsu, China
  - Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Ping Zeng
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
  - Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, Jiangsu, China
  - Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, Jiangsu, China
65
Demetci P, Cheng W, Darnell G, Zhou X, Ramachandran S, Crawford L. Multi-scale inference of genetic trait architecture using biologically annotated neural networks. PLoS Genet 2021; 17:e1009754. [PMID: 34411094 PMCID: PMC8407593 DOI: 10.1371/journal.pgen.1009754] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/23/2020] [Revised: 08/31/2021] [Accepted: 07/31/2021] [Indexed: 01/01/2023] Open
Abstract
In this article, we present Biologically Annotated Neural Networks (BANNs), a nonlinear probabilistic framework for association mapping in genome-wide association (GWA) studies. BANNs are feedforward models with partially connected architectures that are based on biological annotations. This setup yields a fully interpretable neural network where the input layer encodes SNP-level effects, and the hidden layer models the aggregated effects among SNP-sets. We treat the weights and connections of the network as random variables with prior distributions that reflect how genetic effects manifest at different genomic scales. The BANNs software uses variational inference to provide posterior summaries which allow researchers to simultaneously perform (i) mapping with SNPs and (ii) enrichment analyses with SNP-sets on complex traits. Through simulations, we show that our method improves upon state-of-the-art association mapping and enrichment approaches across a wide range of genetic architectures. We then further illustrate the benefits of BANNs by analyzing real GWA data assayed in approximately 2,000 heterogeneous stock mice from the Wellcome Trust Centre for Human Genetics and approximately 7,000 individuals from the Framingham Heart Study. Lastly, using a random subset of individuals of European ancestry from the UK Biobank, we show that BANNs is able to replicate known associations in high- and low-density lipoprotein cholesterol content.
Affiliation(s)
- Pinar Demetci
  - Department of Computer Science, Brown University, Providence, Rhode Island, United States of America
  - Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Wei Cheng
  - Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
  - Department of Ecology and Evolutionary Biology, Brown University, Providence, Rhode Island, United States of America
- Gregory Darnell
  - Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
- Sohini Ramachandran
  - Department of Computer Science, Brown University, Providence, Rhode Island, United States of America
  - Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
  - Department of Ecology and Evolutionary Biology, Brown University, Providence, Rhode Island, United States of America
- Lorin Crawford
  - Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
  - Microsoft Research New England, Cambridge, Massachusetts, United States of America
  - Department of Biostatistics, Brown University, Providence, Rhode Island, United States of America
66
|
Xie X, Kendzior MC, Ge X, Mainzer LS, Sinha S. VarSAn: associating pathways with a set of genomic variants using network analysis. Nucleic Acids Res 2021; 49:8471-8487. [PMID: 34313777 PMCID: PMC8421213 DOI: 10.1093/nar/gkab624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2020] [Revised: 05/18/2021] [Accepted: 07/20/2021] [Indexed: 02/01/2023] Open
Abstract
There is a pressing need today to mechanistically interpret sets of genomic variants associated with diseases. Here we present a tool called ‘VarSAn’ that uses a network analysis algorithm to identify pathways relevant to a given set of variants. VarSAn analyzes a configurable network whose nodes represent variants, genes and pathways, using a Random Walk with Restarts algorithm to rank pathways for relevance to the given variants, and reports P-values for pathway relevance. It treats non-coding and coding variants differently, properly accounts for the number of pathways impacted by each variant and identifies relevant pathways even if many variants do not directly impact genes of the pathway. We use VarSAn to identify pathways relevant to variants related to cancer and several other diseases, as well as drug response variation. We find VarSAn's pathway ranking to be complementary to the standard approach of enrichment tests on genes related to the query set. We adopt a novel benchmarking strategy to quantify its advantage over this baseline approach. Finally, we use VarSAn to discover key pathways, including the VEGFA-VEGFR2 pathway, related to de novo variants in patients with Hypoplastic Left Heart Syndrome, a rare and severe congenital heart defect.
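The ranking step named in this abstract, Random Walk with Restarts, can be illustrated on a toy network. The sketch below is a generic power-iteration version under stated assumptions: the function name, the restart probability of 0.5, and the five-node variant/gene/pathway graph are all hypothetical, every node is assumed to have at least one edge, and the P-value computation and the separate handling of coding and non-coding variants mentioned in the abstract are not reproduced.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk restarting at seeds.

    adj is a symmetric adjacency matrix (list of lists); seeds are node indices.
    """
    n = len(adj)
    # Column-normalise so each column holds transition probabilities.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0
              for j in range(n)] for i in range(n)]
    start = [0.0] * n
    for s in seeds:
        start[s] = 1.0 / len(seeds)
    prob = start[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(trans[i][j] * prob[j] for j in range(n))
               + restart * start[i] for i in range(n)]
        converged = sum(abs(a - b) for a, b in zip(nxt, prob)) < tol
        prob = nxt
        if converged:
            break
    return prob


if __name__ == "__main__":
    # Toy graph: 0 = variant, 1 = gene A, 2 = pathway A, 3 = gene B, 4 = pathway B.
    # Edges: variant-geneA, geneA-pathwayA, geneA-geneB, geneB-pathwayB.
    adj = [
        [0, 1, 0, 0, 0],
        [1, 0, 1, 1, 0],
        [0, 1, 0, 0, 0],
        [0, 1, 0, 0, 1],
        [0, 0, 0, 1, 0],
    ]
    scores = random_walk_with_restart(adj, seeds=[0])
    print({"pathway A": scores[2], "pathway B": scores[4]})
```

In this toy graph the pathway reached through the gene directly hit by the variant scores higher than the pathway that is one gene further away, which is the kind of proximity ranking the abstract describes.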
Affiliation(s)
- Xiaoman Xie
  - Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Matthew C Kendzior
  - National Center for Supercomputing Applications, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Xiyu Ge
  - Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Liudmila S Mainzer
  - National Center for Supercomputing Applications, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Saurabh Sinha
  - Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
  - Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
  - Cancer Center of Illinois, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
67
Cao C, Kwok D, Edie S, Li Q, Ding B, Kossinna P, Campbell S, Wu J, Greenberg M, Long Q. kTWAS: integrating kernel machine with transcriptome-wide association studies improves statistical power and reveals novel genes. Brief Bioinform 2021; 22:5985285. [PMID: 33200776 DOI: 10.1093/bib/bbaa270] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 07/01/2020] [Revised: 09/17/2020] [Accepted: 09/18/2020] [Indexed: 12/31/2022] Open
Abstract
The power of genotype-phenotype association mapping studies increases greatly when contributions from multiple variants in a focal region are meaningfully aggregated. Currently, there are two popular categories of variant aggregation methods. Transcriptome-wide association studies (TWAS) represent a set of emerging methods that select variants based on their effect on gene expression, providing pretrained linear combinations of variants for downstream association mapping. In contrast, kernel methods such as the sequence kernel association test (SKAT) model genotypic and phenotypic variance using various kernel functions that capture genetic similarity between subjects, allowing nonlinear effects to be included. From the perspective of machine learning, these two methods cover two complementary aspects of feature engineering: feature selection/pruning and feature aggregation. Thus far, no thorough comparison has been made between these categories, and no methods exist that incorporate the advantages of both TWAS- and kernel-based methods. In this work, we developed a novel method called kernel-based TWAS (kTWAS) that applies TWAS-like feature selection to a SKAT-like kernel association test, combining the strengths of both approaches. Through extensive simulations, we demonstrate that kTWAS has higher power than TWAS and multiple SKAT-based protocols, and we identify novel disease-associated genes in Wellcome Trust Case Control Consortium genotyping array data and MSSNG (Autism) sequence data. The source code for kTWAS and our simulations is available in our GitHub repository (https://github.com/theLongLab/kTWAS).
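To make the "SKAT-like kernel association test" concrete, the sketch below computes a weighted linear-kernel score statistic, Q = Σ_j w_j (g_j · r)², where r is the vector of phenotype residuals around the mean, and attaches a permutation p-value. It is an illustration under assumptions, not the kTWAS code: the function names, the toy genotypes, and the per-variant weights (which in a kTWAS-style analysis would come from an expression-prediction model) are hypothetical, covariates are ignored, and a permutation test stands in for the analytic mixture-of-chi-squares p-value used by SKAT software.

```python
import random


def skat_q(genotypes, phenotype, weights):
    """Weighted linear-kernel score statistic for one variant set.

    genotypes: n x p list of allele counts; phenotype: length-n list;
    weights: length-p list of non-negative variant weights (assumed given).
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    resid = [y - mean_y for y in phenotype]  # null model: intercept only
    q = 0.0
    for j, w in enumerate(weights):
        score = sum(genotypes[i][j] * resid[i] for i in range(n))
        q += w * score * score
    return q


def permutation_pvalue(genotypes, phenotype, weights, n_perm=999, seed=0):
    """Empirical p-value obtained by shuffling phenotypes across subjects."""
    rng = random.Random(seed)
    observed = skat_q(genotypes, phenotype, weights)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if skat_q(genotypes, shuffled, weights) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    genotypes = [[0, 1], [1, 0], [2, 1], [0, 0]]   # made-up allele counts
    phenotype = [1.0, 2.0, 3.0, 0.0]               # made-up trait values
    weights = [1.0, 0.0]                           # hypothetical selection weights
    print(skat_q(genotypes, phenotype, weights))
    print(permutation_pvalue(genotypes, phenotype, weights, n_perm=99))
```

Setting a weight to zero removes that variant from the statistic, which is how feature selection and kernel aggregation combine in this simplified picture.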
Affiliation(s)
- Chen Cao
  - Department of Biochemistry & Molecular Biology, University of Calgary
- Devin Kwok
  - Department of Mathematics & Statistics, University of Calgary
- Qing Li
  - Department of Biochemistry & Molecular Biology, University of Calgary
- Bowei Ding
  - Department of Mathematics & Statistics, University of Calgary
- Pathum Kossinna
  - Department of Biochemistry & Molecular Biology, University of Calgary
- Jingjing Wu
  - Department of Mathematics & Statistics, University of Calgary
- Quan Long
  - Departments of Biochemistry & Molecular Biology, Medical Genetics and Mathematics & Statistics
68
Zeng P, Shao Z, Zhou X. Statistical methods for mediation analysis in the era of high-throughput genomics: Current successes and future challenges. Comput Struct Biotechnol J 2021; 19:3209-3224. [PMID: 34141140 PMCID: PMC8187160 DOI: 10.1016/j.csbj.2021.05.042] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 02/10/2021] [Revised: 05/21/2021] [Accepted: 05/21/2021] [Indexed: 12/12/2022] Open
Abstract
Mediation analysis investigates the intermediate mechanism through which an exposure exerts its influence on the outcome of interest. Mediation analysis is becoming increasingly popular in high-throughput genomics studies where a common goal is to identify molecular-level traits, such as gene expression or methylation, which actively mediate the genetic or environmental effects on the outcome. Mediation analysis in genomics studies is particularly challenging, however, owing to the large number of potential mediators measured in these studies as well as the composite null nature of the mediation effect hypothesis. Indeed, while the standard univariate and multivariate mediation methods have been well established for analyzing one or multiple mediators, they are not well suited for genomics studies with a large number of mediators and often yield conservative p-values and limited power. Consequently, over the past few years many new high-dimensional mediation methods have been developed for analyzing the large number of potential mediators collected in high-throughput genomics studies. In this work, we present a thorough review of these important recent methodological advances in high-dimensional mediation analysis. Specifically, we describe in detail more than ten high-dimensional mediation methods, focusing on their motivations, basic modeling ideas, specific modeling assumptions, practical successes, methodological limitations, as well as future directions. We hope our review will serve as a useful guide for statisticians and computational biologists who develop methods of high-dimensional mediation analysis as well as for analysts who apply mediation methods to high-throughput genomics studies.
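The "composite null" mentioned in this abstract can be written out for the simplest single-mediator linear case. The notation below (exposure X, mediator M, outcome Y, coefficients α, β, γ) is an assumption chosen for illustration and is not taken from the review itself:

```latex
\begin{align}
M &= \alpha X + \varepsilon_M, \\
Y &= \beta M + \gamma X + \varepsilon_Y, \\
H_0 &: \alpha\beta = 0 \quad \text{versus} \quad H_1 : \alpha\beta \neq 0, \\
H_0 &= \{\alpha = 0,\ \beta \neq 0\} \,\cup\, \{\alpha \neq 0,\ \beta = 0\} \,\cup\, \{\alpha = 0,\ \beta = 0\}.
\end{align}
```

Because the null is a union of three cases, a test calibrated for only one of them can be conservative under the others, which is consistent with the conservative p-values and limited power the abstract attributes to standard methods.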
Affiliation(s)
- Ping Zeng
  - Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
  - Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
- Zhonghe Shao
  - Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor 48109, MI, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor 48109, MI, USA
69
Trevino CE, Rounds JC, Charen K, Shubeck L, Hipp HS, Spencer JB, Johnston HR, Cutler DJ, Zwick ME, Epstein MP, Murray A, Macpherson JN, Mila M, Rodriguez-Revenga L, Berry-Kravis E, Hall DA, Leehey MA, Liu Y, Welt C, Warren ST, Sherman SL, Jin P, Allen EG. Identifying susceptibility genes for primary ovarian insufficiency on the high-risk genetic background of a fragile X premutation. Fertil Steril 2021; 116:843-854. [PMID: 34016428 PMCID: PMC8494118 DOI: 10.1016/j.fertnstert.2021.04.021] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/16/2020] [Revised: 04/21/2021] [Accepted: 04/21/2021] [Indexed: 01/07/2023]
Abstract
OBJECTIVE To identify modifying genes that explain the risk of fragile X-associated primary ovarian insufficiency (FXPOI). DESIGN Gene-based, case/control association study, followed by a functional screen of highly ranked genes using a Drosophila model. SETTING Participants were recruited from academic and clinical settings. PATIENT(S) Women with a premutation (PM) who experienced FXPOI at the age of 35 years or younger (n = 63) and women with a PM who experienced menopause at the age of 50 years or older (n = 51) provided clinical information and a deoxyribonucleic acid (DNA) sample for whole genome sequencing. The functional screen was based on Drosophila TRiP lines. INTERVENTION(S) Clinical information and a DNA sample were collected for whole genome sequencing. MAIN OUTCOME MEASURES A polygenic risk score derived from common variants associated with natural age at menopause was calculated and associated with the risk of FXPOI. Genes associated with the risk of FXPOI were identified on the basis of the P-value from a gene-based association test and an altered level of fecundity when knocked down in the Drosophila PM model. RESULTS The polygenic risk score based on common variants associated with natural age at menopause explained approximately 8% of the variance in the risk of FXPOI. Further, SUMO1 and KRR1 were identified as possible modifying genes associated with the risk of FXPOI on the basis of an untargeted gene analysis of rare variants. CONCLUSIONS In addition to the large genetic effect of a PM on ovarian function, the additive effects of common variants associated with natural age at menopause and the effect of rare modifying variants appear to play a role in FXPOI risk.
Affiliation(s)
- Krista Charen
- Department of Human Genetics, Emory University, Atlanta, Georgia
- Lisa Shubeck
- Department of Human Genetics, Emory University, Atlanta, Georgia
- Heather S Hipp
- Department of Gynecology and Obstetrics, Emory University, Atlanta, Georgia
- Jessica B Spencer
- Department of Gynecology and Obstetrics, Emory University, Atlanta, Georgia
- Dave J Cutler
- Department of Human Genetics, Emory University, Atlanta, Georgia
- Michael E Zwick
- Department of Human Genetics, Emory University, Atlanta, Georgia; Department of Pediatrics, Emory University, Atlanta, Georgia
- Anna Murray
- University of Exeter Medical School, University of Exeter, Exeter, United Kingdom
- James N Macpherson
- Wessex Regional Genetics Laboratory, Salisbury District Hospital, Salisbury, United Kingdom
- Montserrat Mila
- Biochemistry and Molecular Genetics Department, Hospital Clinic of Barcelona and Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain
- Laia Rodriguez-Revenga
- Biochemistry and Molecular Genetics Department, Hospital Clinic of Barcelona and Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain; CIBER of Rare Diseases (CIBERER), Instituto de Salud Carlos III, Spain
- Elizabeth Berry-Kravis
- Departments of Pediatrics, Neurological Sciences, Biochemistry, Rush University Medical Center, Chicago, Illinois
- Deborah A Hall
- Department of Neurological Sciences, Rush University, Chicago, Illinois
- Maureen A Leehey
- Department of Neurology, University of Colorado School of Medicine, Aurora, Colorado
- Ying Liu
- Department of Neurology, University of Colorado School of Medicine, Aurora, Colorado
- Corrine Welt
- Division of Endocrinology, Metabolism and Diabetes, University of Utah School of Medicine, Salt Lake City, Utah
- Stephen T Warren
- Department of Human Genetics, Emory University, Atlanta, Georgia; Department of Pediatrics, Emory University, Atlanta, Georgia; Department of Biochemistry, Emory University, Atlanta, Georgia
- Stephanie L Sherman
- Department of Human Genetics, Emory University, Atlanta, Georgia; Department of Pediatrics, Emory University, Atlanta, Georgia
- Peng Jin
- Department of Human Genetics, Emory University, Atlanta, Georgia
- Emily G Allen
- Department of Human Genetics, Emory University, Atlanta, Georgia.
|
70
|
Huang M, Lyu C, Li X, Qureshi AA, Han J, Li M. Identifying Susceptibility Loci for Cutaneous Squamous Cell Carcinoma Using a Fast Sequence Kernel Association Test. Front Genet 2021; 12:657499. [PMID: 34040636 PMCID: PMC8141858 DOI: 10.3389/fgene.2021.657499] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/23/2021] [Accepted: 04/09/2021] [Indexed: 11/13/2022] Open
Abstract
Cutaneous squamous cell carcinoma (cSCC) accounts for about 20% of all skin cancers, the most common type of malignancy in the United States. Genome-wide association studies (GWAS) have successfully identified multiple genetic variants associated with the risk of cSCC. Most of these studies were single-locus-based, testing genetic variants one at a time. In this article, we performed gene-based association tests to evaluate the joint effect of multiple variants, especially rare variants, on the risk of cSCC by using a fast sequence kernel association test (fastSKAT). The study included 1,710 cSCC cases and 24,304 cancer-free controls from the Nurses' Health Study, the Nurses' Health Study II and the Health Professionals Follow-up Study. We used the UCSC Genome Browser to define gene units as candidate loci, and further evaluated the association between all variants within each gene unit and disease outcome. Four genes (HP1BP3, DAG1, SEPT7P2, and SLFN12) were identified at a Bonferroni-adjusted significance level. Our study is complementary to the existing GWASs, and our findings may provide additional insights into the etiology of cSCC. Further studies are needed to validate these findings.
Affiliation(s)
- Manyan Huang
- Department of Epidemiology and Biostatistics, School of Public Health, Indiana University at Bloomington, Bloomington, IN, United States
- Chen Lyu
- Department of Epidemiology and Biostatistics, School of Public Health, Indiana University at Bloomington, Bloomington, IN, United States
- Xin Li
- Department of Epidemiology, Richard M. Fairbanks School of Public Health, Indiana University - Purdue University Indianapolis, Indianapolis, IN, United States; Melvin and Bren Simon Cancer Center, Indianapolis, IN, United States
- Abrar A Qureshi
- Department of Dermatology, Alpert Medical School, Brown University, Providence, RI, United States
- Jiali Han
- Department of Epidemiology, Richard M. Fairbanks School of Public Health, Indiana University - Purdue University Indianapolis, Indianapolis, IN, United States; Melvin and Bren Simon Cancer Center, Indianapolis, IN, United States
- Ming Li
- Department of Epidemiology and Biostatistics, School of Public Health, Indiana University at Bloomington, Bloomington, IN, United States
|
71
|
Family-based gene-environment interaction using sequence kernel association test (FGE-SKAT) for complex quantitative traits. Sci Rep 2021; 11:7431. [PMID: 33795796 PMCID: PMC8016937 DOI: 10.1038/s41598-021-86871-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2020] [Accepted: 03/22/2021] [Indexed: 11/30/2022] Open
Abstract
After the genome-wide association studies (GWAS) era, whole-genome sequencing is widely used to identify associations between complex traits and rare variants. A score-based variance-component test has been proposed to identify common and rare genetic variants associated with complex traits while quickly adjusting for covariates. Such a kernel score statistic allows for familial dependencies and adjusts for random confounding effects. However, the etiology of complex traits may involve the effects of genetic and environmental factors and the complex interactions between genes and the environment. Therefore, in this research, a novel method is proposed to detect gene and gene-environment interactions in a complex family-based association study with various correlated structures. We also developed an R function for the Fast Gene-Environment Sequence Kernel Association Test (FGE-SKAT), which is freely available as supplementary material for easy GWAS implementation to unveil such family-based joint effects. Simulation studies confirmed the validity of the new strategy and the superior statistical power. The FGE-SKAT was applied to the whole genome sequence data provided by Genetic Analysis Workshop 18 (GAW18) and discovered concordant and discordant regions compared to the methods without considering gene by environment interactions.
|
72
|
Tang S, Buchman AS, De Jager PL, Bennett DA, Epstein MP, Yang J. Novel Variance-Component TWAS method for studying complex human diseases with applications to Alzheimer's dementia. PLoS Genet 2021; 17:e1009482. [PMID: 33798195 PMCID: PMC8046351 DOI: 10.1371/journal.pgen.1009482] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 06/26/2020] [Revised: 04/14/2021] [Accepted: 03/15/2021] [Indexed: 02/07/2023] Open
Abstract
Transcriptome-wide association studies (TWAS) have been widely used to integrate transcriptomic and genetic data to study complex human diseases. Within a test dataset lacking transcriptomic data, traditional two-stage TWAS methods first impute gene expression by creating a weighted sum that aggregates SNPs with their corresponding cis-eQTL effects on reference transcriptome. Traditional TWAS methods then employ a linear regression model to assess the association between imputed gene expression and test phenotype, thereby assuming the effect of a cis-eQTL SNP on test phenotype is a linear function of the eQTL's estimated effect on reference transcriptome. To increase TWAS robustness to this assumption, we propose a novel Variance-Component TWAS procedure (VC-TWAS) that assumes the effects of cis-eQTL SNPs on phenotype are random (with variance proportional to corresponding reference cis-eQTL effects) rather than fixed. VC-TWAS is applicable to both continuous and dichotomous phenotypes, as well as individual-level and summary-level GWAS data. Using simulated data, we show VC-TWAS is more powerful than traditional TWAS methods based on a two-stage Burden test, especially when eQTL genetic effects on test phenotype are no longer a linear function of their eQTL genetic effects on reference transcriptome. We further applied VC-TWAS to both individual-level (N = ~3.4K) and summary-level (N = ~54K) GWAS data to study Alzheimer's dementia (AD). With the individual-level data, we detected 13 significant risk genes including 6 known GWAS risk genes such as TOMM40 that were missed by traditional TWAS methods. With the summary-level data, we detected 57 significant risk genes considering only cis-SNPs and 71 significant genes considering both cis- and trans- SNPs, which also validated our findings with the individual-level GWAS data. Our VC-TWAS method is implemented in the TIGAR tool for public use.
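The contrast between the two-stage model and the variance-component model described in this abstract can be written out as follows. The notation is an assumption introduced here for illustration (G_ij for the genotype of individual i at cis-eQTL SNP j, ŵ_j for the reference eQTL effect, τ for the variance component), a continuous phenotype is assumed, and the squared-weight variance is one plausible reading of "variance proportional to corresponding reference cis-eQTL effects", not a form stated in the abstract.

```latex
% Two-stage (burden-type) TWAS: imputed expression is a fixed weighted sum of genotypes
\widehat{E}_i = \sum_{j=1}^{m} \hat{w}_j \, G_{ij}, \qquad
Y_i = \alpha + \beta \, \widehat{E}_i + \varepsilon_i, \qquad
H_0 : \beta = 0

% Variance-component TWAS: per-SNP effects are random, with variance scaled by the eQTL weight
Y_i = \alpha + \sum_{j=1}^{m} \beta_j \, G_{ij} + \varepsilon_i, \qquad
\beta_j \sim N\!\left(0, \; \hat{w}_j^{2} \, \tau \right), \qquad
H_0 : \tau = 0
```

Under the first model every SNP effect on the phenotype is forced to equal β ŵ_j, which is the linearity assumption the abstract refers to; the second model relaxes it by testing a single variance parameter instead.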
Affiliation(s)
- Shizhen Tang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia, United States of America
- Department of Biostatistics and Bioinformatics, Emory University School of Public Health, Atlanta, Georgia, United States of America
- Aron S. Buchman
- Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, Illinois, United States of America
- Philip L. De Jager
- Center for Translational and Computational Neuroimmunology, Department of Neurology and Taub Institute for Research on Alzheimer’s Disease and the Aging Brain, Columbia University Irving Medical Center, New York, New York, United States of America
- David A. Bennett
- Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, Illinois, United States of America
- Michael P. Epstein
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia, United States of America
- Jingjing Yang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia, United States of America
|
73
|
Lu H, Zhang J, Jiang Z, Zhang M, Wang T, Zhao H, Zeng P. Detection of Genetic Overlap Between Rheumatoid Arthritis and Systemic Lupus Erythematosus Using GWAS Summary Statistics. Front Genet 2021; 12:656545. [PMID: 33815486 PMCID: PMC8012913 DOI: 10.3389/fgene.2021.656545] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/22/2021] [Accepted: 03/01/2021] [Indexed: 01/04/2023] Open
Abstract
Background Clinical and epidemiological studies have suggested systemic lupus erythematosus (SLE) and rheumatoid arthritis (RA) are comorbidities and common genetic etiologies can partly explain such coexistence. However, shared genetic determinants underlying the two diseases remain largely unknown. Methods Our analysis relied on summary statistics available from genome-wide association studies of SLE (N = 23,210) and RA (N = 58,284). We first evaluated the genetic correlation between RA and SLE through linkage disequilibrium score regression (LDSC). Then, we performed a multiple-tissue eQTL (expression quantitative trait loci) weighted integrative analysis for each of the two diseases and aggregated association evidence across these tissues via the recently proposed harmonic mean P-value (HMP) combination strategy, which can produce a single well-calibrated P-value for correlated test statistics. Afterwards, we conducted the pleiotropy-informed association using conjunction conditional FDR (ccFDR) to identify potential pleiotropic genes associated with both RA and SLE. Results We found a significant positive genetic correlation (rg = 0.404, P = 6.01E-10) between RA and SLE via LDSC. Based on the multiple-tissue eQTL weighted integrative analysis and the HMP combination across various tissues, we discovered 14 potential pleiotropic genes by ccFDR, among which four were likely novel genes (i.e., INPP5B, OR5K2, RP11-2C24.5, and CTD-3105H18.4). The SNP effect sizes of these pleiotropic genes were typically positively dependent, with an average correlation of 0.579. Functionally, these genes were implicated in multiple auto-immune relevant pathways such as inositol phosphate metabolic process, membrane and glucagon signaling pathway. Conclusion This study reveals common genetic components between RA and SLE and provides candidate associated loci for understanding the molecular mechanisms underlying the comorbidity of the two diseases.
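The harmonic mean P-value combination mentioned in the Methods can be illustrated with a short sketch. It computes only the raw weighted harmonic mean of per-tissue P-values; the published HMP procedure additionally calibrates this statistic before it is interpreted as a P-value, and that step is not shown. The function name, the equal default weights, and the example P-values are assumptions made here for illustration.

```python
def harmonic_mean_p(p_values, weights=None):
    """Raw weighted harmonic mean of p-values (no calibration applied)."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    if weights is None:
        # Equal weights are an assumption of this sketch.
        weights = [1.0 / len(p_values)] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    return sum(weights) / sum(w / p for w, p in zip(weights, p_values))


# Hypothetical per-tissue p-values for a single gene.
tissue_p = [0.01, 0.20, 0.50]
hmp = harmonic_mean_p(tissue_p)
```

Because small P-values dominate the sum of reciprocals, a gene with strong evidence in one tissue keeps a small combined value even when the other tissues are uninformative, which is why this kind of combination suits multi-tissue analyses.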
Affiliation(s)
- Haojie Lu
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Jinhui Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Zhou Jiang
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Meng Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Ting Wang
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China; Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Huashuo Zhao
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China; Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Ping Zeng
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China; Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China
|
74
|
Zhan X, Banerjee K, Chen J. Variant-set association test for generalized linear mixed model. Genet Epidemiol 2021; 45:402-412. [PMID: 33604919 DOI: 10.1002/gepi.22378] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/20/2020] [Revised: 01/18/2021] [Accepted: 01/25/2021] [Indexed: 12/22/2022]
Abstract
Advances in high-throughput biotechnologies have culminated in a wide range of omics (such as genomics, epigenomics, transcriptomics, metabolomics, and metagenomics) studies, and increasing evidence in these studies indicates that the biological architecture of complex traits involves a large number of omics variants each with minor effects but collectively accounting for the full phenotypic variability. Thus, a major challenge in many "ome-wide" association analyses is to achieve adequate statistical power to identify multiple variants of small effect sizes, which is notoriously difficult for studies with relatively small sample sizes. A small-sample adjustment incorporated in the kernel machine regression framework was proposed to solve this for association studies under various settings. However, such an adjustment in the generalized linear mixed model (GLMM) framework, which accounts for both sample relatedness and non-Gaussian outcomes, has not yet been attempted. In this study, we fill this gap by extending the small-sample adjustment in the kernel machine association test to the GLMM. We propose a new Variant-Set Association Test (VSAT), a powerful and efficient analysis tool in GLMM, to examine the association between a set of omics variants and correlated phenotypes. The usefulness of VSAT is demonstrated using both numerical simulation studies and applications to data collected from multiple association studies. The software for implementing the proposed method in R is available at https://www.github.com/jchen1981/SSKAT.
Affiliation(s)
- Xiang Zhan
- Department of Public Health Sciences, Pennsylvania State University, Hershey, Pennsylvania, USA
- Kalins Banerjee
- Department of Public Health Sciences, Pennsylvania State University, Hershey, Pennsylvania, USA
- Jun Chen
- Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota, USA
|
75
|
Cao C, Ding B, Li Q, Kwok D, Wu J, Long Q. Power analysis of transcriptome-wide association study: Implications for practical protocol choice. PLoS Genet 2021; 17:e1009405. [PMID: 33635859 PMCID: PMC7946362 DOI: 10.1371/journal.pgen.1009405] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 08/11/2020] [Revised: 03/10/2021] [Accepted: 02/06/2021] [Indexed: 12/12/2022] Open
Abstract
The transcriptome-wide association study (TWAS) has emerged as one of several promising techniques for integrating multi-scale 'omics' data into traditional genome-wide association studies (GWAS). Unlike GWAS, which associates phenotypic variance directly with genetic variants, TWAS uses a reference dataset to train a predictive model for gene expressions, which allows it to associate phenotype with variants through the mediating effect of expressions. Although effective, this core innovation of TWAS is poorly understood, since the predictive accuracy of the genotype-expression model is generally low and further bounded by expression heritability. This raises the question: to what degree does the accuracy of the expression model affect the power of TWAS? Furthermore, would replacing predictions with actual, experimentally determined expressions improve power? To answer these questions, we compared the power of GWAS, TWAS, and a hypothetical protocol utilizing real expression data. We derived non-centrality parameters (NCPs) for linear mixed models (LMMs) to enable closed-form calculations of statistical power that do not rely on specific protocol implementations. We examined two representative scenarios: causality (genotype contributes to phenotype through expression) and pleiotropy (genotype contributes directly to both phenotype and expression), and also tested the effects of various properties including expression heritability. Our analysis reveals two main outcomes: (1) Under pleiotropy, the use of predicted expressions in TWAS is superior to actual expressions. This explains why TWAS can function with weak expression models, and shows that TWAS remains relevant even when real expressions are available. (2) GWAS outperforms TWAS when expression heritability is below a threshold of 0.04 under causality, or 0.06 under pleiotropy. Analysis of existing publications suggests that TWAS has been misapplied in place of GWAS, in situations where expression heritability is low.
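To make the link between a non-centrality parameter and power concrete, the sketch below converts an NCP into power for a generic one-degree-of-freedom chi-square test at significance level alpha. It does not reproduce the LMM-specific NCP formulas derived in the paper, which the abstract does not state; the one-degree-of-freedom setting and the function name are assumptions made here for illustration.

```python
from statistics import NormalDist


def power_one_df(ncp, alpha=0.05):
    """Power of a 1-df chi-square test given its non-centrality parameter.

    A non-central chi-square with 1 df is the square of a normal variable
    with mean sqrt(ncp) and unit variance, so power is a two-sided normal
    tail probability.
    """
    if ncp < 0:
        raise ValueError("ncp must be non-negative")
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = ncp ** 0.5
    return (1 - std.cdf(z_crit - shift)) + std.cdf(-z_crit - shift)


# With ncp = 0 the power equals the type I error rate, alpha.
null_power = power_one_df(0.0)
```

Once an NCP is available in closed form for each protocol, comparing protocols reduces to comparing their NCPs at a fixed alpha, since power is monotone in the NCP.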
Affiliation(s)
- Chen Cao
- Department of Biochemistry & Molecular Biology, Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada
- Bowei Ding
- Department of Mathematics & Statistics, University of Calgary, Calgary, Canada
- Qing Li
- Department of Biochemistry & Molecular Biology, Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada
- Devin Kwok
- Department of Mathematics & Statistics, University of Calgary, Calgary, Canada
- Jingjing Wu
- Department of Mathematics & Statistics, University of Calgary, Calgary, Canada
- Quan Long
- Department of Biochemistry & Molecular Biology, Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada
- Department of Mathematics & Statistics, University of Calgary, Calgary, Canada
- Department of Medical Genetics, University of Calgary, Calgary, Canada
- Hotchkiss Brain Institute, O’Brien Institute for Public Health, University of Calgary, Calgary, Canada
|
76
|
Whole-exome sequencing identifies susceptibility genes and pathways for idiopathic pulmonary fibrosis in the Chinese population. Sci Rep 2021; 11:1443. [PMID: 33446833 PMCID: PMC7809470 DOI: 10.1038/s41598-020-80944-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/30/2019] [Accepted: 12/14/2020] [Indexed: 02/07/2023] Open
Abstract
Genetic factors play a role in the risk of idiopathic pulmonary fibrosis (IPF). Specifically, MUC5B rs35705950 non-risk alleles and immunologic aberrations were associated with IPF progression. However, rare genetic variants have not been systematically investigated in Chinese IPF patients. In this study, we aimed to improve understanding of the genetic architecture of IPF in the Chinese population and to assess whether rare protein-coding variants in the immunity pathway genes are enriched in IPF patients with non-risk alleles at rs35705950. A case–control exome-wide study including 110 IPF patients and 60 matched healthy controls was conducted. rs35705950 was genotyped by Sanger sequencing. To identify genes enriched in IPF, gene-based association analyses were performed. Identified genes were included for further pathway analyses using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). Associations between rs35705950 and genes enriched in the immunity pathway were also tested. A total of 226 genes enriched with deleterious variants were identified in IPF patients. Of these, 36 genes were significantly enriched in GO and KEGG pathways in IPF. Pathway analyses indicated that these genes were involved in the immune response and cell adhesion. Rare protein-altering variants in genes related to the immunity pathway did not significantly differ between patients with a MUC5B risk allele and individuals without a risk allele. We drafted a comprehensive mutational landscape of rare protein-coding variants in Chinese IPF patients and identified genes related to immune response and cell adhesion. These results partially explain changes in gene expression involved in the immunity/inflammatory pathways in IPF patients.
|
77
|
Lonjou C, Eon-Marchais S, Truong T, Dondon MG, Karimi M, Jiao Y, Damiola F, Barjhoux L, Le Gal D, Beauvallet J, Mebirouk N, Cavaciuti E, Chiesa J, Floquet A, Audebert-Bellanger S, Giraud S, Frebourg T, Limacher JM, Gladieff L, Mortemousque I, Dreyfus H, Lejeune-Dumoulin S, Lasset C, Venat-Bouvet L, Bignon YJ, Pujol P, Maugard CM, Luporsi E, Bonadona V, Noguès C, Berthet P, Delnatte C, Gesta P, Lortholary A, Faivre L, Buecher B, Caron O, Gauthier-Villars M, Coupier I, Mazoyer S, Monraz LC, Kondratova M, Kuperstein I, Guénel P, Barillot E, Stoppa-Lyonnet D, Andrieu N, Lesueur F. Gene- and pathway-level analyses of iCOGS variants highlight novel signaling pathways underlying familial breast cancer susceptibility. Int J Cancer 2021; 148:1895-1909. [PMID: 33368296 PMCID: PMC9290690 DOI: 10.1002/ijc.33457] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/17/2020] [Revised: 11/20/2020] [Accepted: 12/07/2020] [Indexed: 12/17/2022]
Abstract
Single‐nucleotide polymorphisms (SNPs) in over 180 loci have been associated with breast cancer (BC) through genome‐wide association studies involving mostly unselected population‐based case‐control series. Some of them modify BC risk of women carrying a BRCA1 or BRCA2 (BRCA1/2) mutation and may also explain BC risk variability in BC‐prone families with no BRCA1/2 mutation. Here, we assessed the contribution of SNPs of the iCOGS array in GENESIS consisting of BC cases with no BRCA1/2 mutation and a sister with BC, and population controls. Genotyping data were available for 1281 index cases, 731 sisters with BC, 457 unaffected sisters and 1272 controls. In addition to the standard SNP‐level analysis using index cases and controls, we performed pedigree‐based association tests to capture transmission information in the sibships. We also performed gene‐ and pathway‐level analyses to maximize the power to detect associations with lower‐frequency SNPs or those with modest effect sizes. While SNP‐level analyses identified 18 loci, gene‐level analyses identified 112 genes. Furthermore, 31 Kyoto Encyclopedia of Genes and Genomes and 7 Atlas of Cancer Signaling Network pathways were highlighted (false discovery rate of 5%). Using results from the “index case‐control” analysis, we built pathway‐derived polygenic risk scores (PRS) and assessed their performance in the population‐based CECILE study and in a data set composed of GENESIS‐affected sisters and CECILE controls. Although these PRS had poor predictive value in the general population, they performed better than a PRS built using our SNP‐level findings, and we found that the joint effect of family history and PRS needs to be considered in risk prediction models.
What's new?
Genetic studies have identified more than 180 single‐nucleotide polymorphisms (SNPs) associated with breast cancer susceptibility, but these studies are reaching their limits. Here, the authors evaluated SNPs in the iCOGS genotyping array using a multilevel approach, including single variant, gene, and pathway analyses. They measured the contribution of the SNPs to breast cancer in patients who have a sister with breast cancer but do not carry a BRCA1/2 mutation. They showed that a pathway‐derived polygenic risk score performed poorly in the general population, and that the best predictive model must include family history.
Affiliation(s)
- Christine Lonjou
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Séverine Eon-Marchais
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Thérèse Truong
- Université Paris-Saclay, UVSQ, Inserm, CESP, Villejuif, France; Inserm U1018, CESP, Team Exposome and Heredity, Villejuif, France
- Marie-Gabrielle Dondon
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Mojgan Karimi
- Université Paris-Saclay, UVSQ, Inserm, CESP, Villejuif, France; Inserm U1018, CESP, Team Exposome and Heredity, Villejuif, France
- Yue Jiao
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Laure Barjhoux
- Department of BioPathology, Centre Léon Bérard, Lyon, France
- Dorothée Le Gal
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Juana Beauvallet
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Noura Mebirouk
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Eve Cavaciuti
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Sophie Giraud
- Service de Génétique, Hospices Civils de Lyon, Groupement Hospitalier Est, Bron, France
- Thierry Frebourg
- Département de Génétique, Hôpital Universitaire de Rouen, Rouen, France
- Laurence Gladieff
- Service d'Oncologie Médicale, Institut Claudius Regaud-IUCT-Oncopole, Toulouse, France
- Hélène Dreyfus
- Clinique Sainte Catherine, Avignon, France; Département de Génétique, CHU de Grenoble, Hôpital Couple-Enfant, Grenoble, France
- Christine Lasset
- Université Claude Bernard Lyon 1, Villeurbanne, France; CNRS UMR 5558, Lyon, France; Centre Léon Bérard, Unité de Prévention et Epidémiologie Génétique, Lyon, France
- Yves-Jean Bignon
- Département d'Oncogénétique, Université Clermont Auvergne, UMR INSERM, U1240, Centre Jean Perrin, Clermont Ferrand, France
- Pascal Pujol
- Hôpital Arnaud de Villeneuve, CHU Montpellier, Service de Génétique Médicale et Oncogénétique, Montpellier, France; INSERM 896, CRCM Val d'Aurelle, Montpellier, France
- Christine M Maugard
- Département d'Oncobiologie, LBBM, Hôpitaux Universitaires de Strasbourg, Génétique Oncologique Moléculaire, UF1422, Strasbourg, France; Hôpitaux Universitaires de Strasbourg, UF6948 Génétique Oncologique Clinique, Évaluation Familiale et Suivi, Strasbourg, France
- Elisabeth Luporsi
- ICL Alexis Vautrin, Unité d'Oncogénétique, Vandœuvre-lès-Nancy, France
- Valérie Bonadona
- Université Claude Bernard Lyon 1, Villeurbanne, France; CNRS UMR 5558, Lyon, France; Centre Léon Bérard, Unité de Prévention et Epidémiologie Génétique, Lyon, France
- Catherine Noguès
- Département d'Anticipation et de Suivi des Cancers, Oncogénétique Clinique, Institut Paoli-Calmettes, Marseille, France; Aix Marseille University, INSERM, IRD, SESSTIM, Marseille, France
- Pascaline Berthet
- Département de Biopathologie, Centre François Baclesse, Oncogénétique, Caen, France
- Capucine Delnatte
- Institut de Cancérologie de l'Ouest, Unité d'Oncogénétique, Saint Herblain, France
- Paul Gesta
- CH Georges Renon, Service d'Oncogénétique Régional Poitou-Charentes, Niort, France
- Alain Lortholary
- Centre Catherine de Sienne, Service d'Oncologie Médicale, Nantes, France
- Laurence Faivre
- Institut GIMI, CHU de Dijon, Hôpital d'Enfants, Dijon, France; Oncogénétique, Centre de Lutte contre le Cancer Georges François Leclerc, Dijon, France
- Olivier Caron
- Département de Médecine Oncologique, Gustave Roussy, Villejuif, France
- Isabelle Coupier
- Hôpital Arnaud de Villeneuve, CHU Montpellier, Service de Génétique Médicale et Oncogénétique, Montpellier, France; INSERM 896, CRCM Val d'Aurelle, Montpellier, France
- Sylvie Mazoyer
- Equipe GENDEV, Centre de Recherche en Neurosciences de Lyon, Inserm U1028, CNRS UMR5292, Université Lyon 1, Université St Etienne, Lyon, France
- Luis-Cristobal Monraz
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Maria Kondratova
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Inna Kuperstein
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Pascal Guénel
- Université Paris-Saclay, UVSQ, Inserm, CESP, Villejuif, France; Inserm U1018, CESP, Team Exposome and Heredity, Villejuif, France
- Emmanuel Barillot
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Dominique Stoppa-Lyonnet
- Institut Curie, Service de Génétique, Paris, France; Inserm, U830, Université Paris-Descartes, Paris, France
- Nadine Andrieu
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
- Fabienne Lesueur
- Inserm, U900, Institut Curie, Paris, France; Mines ParisTech, Fontainebleau, France; PSL Research University, Paris, France
| |
|
78
|
Yang S, Wen J, Eckert ST, Wang Y, Liu DJ, Wu R, Li R, Zhan X. Prioritizing genetic variants in GWAS with lasso using permutation-assisted tuning. Bioinformatics 2020; 36:3811-3817. [PMID: 32246825 DOI: 10.1093/bioinformatics/btaa229] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 09/30/2019] [Revised: 02/19/2020] [Accepted: 03/31/2020] [Indexed: 01/13/2023] Open
Abstract
MOTIVATION Large-scale genome-wide association studies (GWAS) have identified a wide range of genetic variants related to a host of complex traits and disorders. Despite their success, the individual single-nucleotide polymorphism (SNP) analysis approach adopted in most current GWAS can be limited in that it is usually biologically too simple to elucidate a comprehensive genetic architecture of phenotypes and statistically underpowered due to the heavy multiple-testing correction burden. On the other hand, multiple-SNP analyses (e.g. gene-based or region-based SNP-set analysis) are usually more powerful for examining the joint effects of a set of SNPs on the phenotype of interest. However, current multiple-SNP approaches can only draw an overall conclusion at the SNP-set level and do not directly inform which SNPs in the SNP-set are driving the overall genotype-phenotype association. RESULTS In this article, we propose a new permutation-assisted tuning procedure in lasso (plasso) to identify phenotype-associated SNPs in a joint multiple-SNP regression model in GWAS. The tuning parameter of lasso determines the amount of shrinkage and is essential to the performance of variable selection. In the proposed plasso procedure, we first generate permutations as pseudo-SNPs that are not associated with the phenotype. Then, the lasso tuning parameter is carefully chosen to separate true signal SNPs from non-informative pseudo-SNPs. We illustrate plasso using simulations that demonstrate its superior performance over existing methods, and an application of plasso to a real GWAS dataset yields additional insights into the genetic control of complex traits. AVAILABILITY AND IMPLEMENTATION R code to implement the proposed methodology is available at https://github.com/xyz5074/plasso. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Songshan Yang
- Department of Statistics, Pennsylvania State University, University Park, PA 16802
| | - Jiawei Wen
- Department of Statistics, Pennsylvania State University, University Park, PA 16802
| | - Scott T Eckert
- Department of Public Health Sciences, Pennsylvania State University, Hershey, PA 17033
| | - Yaqun Wang
- Department of Biostatistics, Rutgers University, New Brunswick, NJ 08901, USA
| | - Dajiang J Liu
- Department of Public Health Sciences, Pennsylvania State University, Hershey, PA 17033
| | - Rongling Wu
- Department of Public Health Sciences, Pennsylvania State University, Hershey, PA 17033
| | - Runze Li
- Department of Statistics, Pennsylvania State University, University Park, PA 16802
| | - Xiang Zhan
- Department of Public Health Sciences, Pennsylvania State University, Hershey, PA 17033
| |
|
79
|
Xue Y, Ding J, Wang J, Zhang S, Pan D. Two-phase SSU and SKAT in genetic association studies. J Genet 2020. [DOI: 10.1007/s12041-019-1166-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
80
|
Zheng XD, Cheng J, Qin WJ, Balsai N, Shang XJ, Zhang MT, Chen HQ. Whole Transcriptome Analysis Identifies the Taxonomic Status of a New Chinese Native Cattle Breed and Reveals Genes Related to Body Size. Front Genet 2020; 11:562855. [PMID: 33240316 PMCID: PMC7670488 DOI: 10.3389/fgene.2020.562855] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/16/2020] [Accepted: 09/11/2020] [Indexed: 11/15/2022] Open
Abstract
Wandong (WD) cattle has recently been identified as a new Chinese native cattle breed by the National Commission for Livestock and Poultry Genetic Resources. The population size of this breed is less than 10,000. WD cattle and Dabieshan (DB) cattle are sympatric but are raised in different ecological environments, on mountains and plains, respectively, and the body sizes of these two breeds are markedly different. Blood samples were obtained from 8 adult female WD cattle and 7 adult female DB cattle (24 months old). Total RNA was extracted from leukocytes, and sequencing was conducted on the Illumina HiSeq™ 4000 platform. After the removal of one outlier sample from the WD cattle breed as determined by principal component analysis (PCA), phylogenetic and population structure analyses indicated that WD and DB cattle formed a distinct Central China cattle group and showed evidence of hybridization between Bos taurus and Bos indicus. The immune-regulator CD48 (P = 1.3 × 10⁻⁶) was associated with breed-specific traits according to loss-of-function variant enrichment analysis. In addition, 113 differentially expressed genes were identified between the two breeds, many of which are associated with the regulation of body growth, which is the major difference between the two breeds. This study showed that WD cattle belong to the group of hybrids between Bos taurus and Bos indicus, and one novel gene associated with breed traits and multiple differentially expressed genes between these two closely related breeds were identified. The results provide insights into the genetic mechanisms that underlie economically important traits, such as body size, in cattle.
Affiliation(s)
- Xiao-Dong Zheng
- School of Animal Science and Technology, Anhui Agricultural University, Hefei, China.,Key Laboratory of Anhui Local Livestock and Poultry Genetic Resources Conservation and Biobreeding, Hefei, China.,Department of Dermatology, The First Affiliated Hospital of Anhui Medical University, Hefei, China.,Key Laboratory of Dermatology (Anhui Medical University), Ministry of Education, Hefei, China.,Key Laboratory of Major Autoimmune Diseases, Hefei, China
| | - Jin Cheng
- School of Animal Science and Technology, Anhui Agricultural University, Hefei, China.,Key Laboratory of Anhui Local Livestock and Poultry Genetic Resources Conservation and Biobreeding, Hefei, China
| | - Wen-Juan Qin
- School of Animal Science and Technology, Anhui Agricultural University, Hefei, China.,Key Laboratory of Anhui Local Livestock and Poultry Genetic Resources Conservation and Biobreeding, Hefei, China.,International Immunization Center, Anhui Agricultural University, Hefei, China
| | - Nyamsuren Balsai
- School of Animal Science and Technology, Anhui Agricultural University, Hefei, China.,Key Laboratory of Anhui Local Livestock and Poultry Genetic Resources Conservation and Biobreeding, Hefei, China
| | - Xuan-Jian Shang
- School of Animal Science and Technology, Anhui Agricultural University, Hefei, China.,Key Laboratory of Anhui Local Livestock and Poultry Genetic Resources Conservation and Biobreeding, Hefei, China
| | - Meng-Ting Zhang
- School of Animal Science and Technology, Anhui Agricultural University, Hefei, China.,Key Laboratory of Anhui Local Livestock and Poultry Genetic Resources Conservation and Biobreeding, Hefei, China
| | - Hong-Quan Chen
- School of Animal Science and Technology, Anhui Agricultural University, Hefei, China.,Key Laboratory of Anhui Local Livestock and Poultry Genetic Resources Conservation and Biobreeding, Hefei, China.,International Immunization Center, Anhui Agricultural University, Hefei, China
| |
|
81
|
Xiang Y, Xiang X, Li Y. Identifying rare variants for quantitative traits in extreme samples of population via Kullback-Leibler distance. BMC Genet 2020; 21:130. [PMID: 33234108 PMCID: PMC7687851 DOI: 10.1186/s12863-020-00951-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/10/2020] [Accepted: 11/10/2020] [Indexed: 11/23/2022] Open
Abstract
Background The rapid development of sequencing technology and the simultaneous availability of large quantities of sequence data have facilitated the identification of rare variants associated with quantitative traits. However, existing statistical methods depend on certain assumptions and thus lack uniform power. The present study focuses on mapping rare variants associated with quantitative traits. Results In the present study, we proposed a two-stage strategy to identify rare variants of quantitative traits using a phenotype extreme selection design and Kullback-Leibler distance, where the first stage was association analysis and the second stage was fine mapping. We presented a statistic and a linkage disequilibrium measure for the first stage and the second stage, respectively. Theoretical analysis and a simulation study showed that (1) the power of the proposed statistic for association analysis increased with the stringency of the sample selection and was only slightly affected by non-causal variants and opposite-effect variants, (2) the statistic achieved higher power than three commonly used methods, and (3) the linkage disequilibrium measure for fine mapping was independent of the frequencies of non-causal variants and dependent only on the frequencies of causal variants. Conclusions We conclude that the two-stage strategy can be used effectively to map rare variants associated with quantitative traits.
Affiliation(s)
- Yang Xiang
- School of Mathematics and Computational Science, Huaihua University, Huaihua, Hunan, 418008, People's Republic of China.,Key Laboratory of Research and Utilization of Ethnomedicinal Plant Resources of Hunan Province, Huaihua University, Huaihua, 418008, China.,Key Laboratory of Hunan Higher Education for Western Hunan Medicinal Plant and Ethnobotany, Huaihua University, Huaihua, 418008, China
| | - Xinrong Xiang
- School of Mathematics and Statistics, Hunan Normal University, Changsha, Hunan, 410081, People's Republic of China
| | - Yumei Li
- School of Mathematics and Computational Science, Huaihua University, Huaihua, Hunan, 418008, People's Republic of China.,Key Laboratory of Research and Utilization of Ethnomedicinal Plant Resources of Hunan Province, Huaihua University, Huaihua, 418008, China.,Key Laboratory of Hunan Higher Education for Western Hunan Medicinal Plant and Ethnobotany, Huaihua University, Huaihua, 418008, China
| |
|
82
|
Li Z, Liu Y, Lin X. Simultaneous Detection of Signal Regions Using Quadratic Scan Statistics With Applications to Whole Genome Association Studies. J Am Stat Assoc 2020; 117:823-834. [PMID: 35845434 PMCID: PMC9285665 DOI: 10.1080/01621459.2020.1822849] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/25/2019] [Revised: 06/18/2020] [Accepted: 08/25/2020] [Indexed: 01/03/2023]
Abstract
We consider in this paper detection of signal regions associated with disease outcomes in whole genome association studies. Gene- or region-based methods have become increasingly popular in whole genome association analysis as a complementary approach to traditional individual variant analysis. However, these methods test for the association between an outcome and the genetic variants in a pre-specified region, e.g., a gene. In view of massive intergenic regions in whole genome sequencing (WGS) studies, we propose a computationally efficient quadratic scan (Q-SCAN) statistic based method to detect the existence and the locations of signal regions by scanning the genome continuously. The proposed method accounts for the correlation (linkage disequilibrium) among genetic variants, and allows for signal regions to have both causal and neutral variants, and the effects of signal variants to be in different directions. We study the asymptotic properties of the proposed Q-SCAN statistics. We derive an empirical threshold that controls for the family-wise error rate, and show that under regularity conditions the proposed method consistently selects the true signal regions. We perform simulation studies to evaluate the finite sample performance of the proposed method. Our simulation results show that the proposed procedure outperforms the existing methods, especially when signal regions have causal variants whose effects are in different directions, or are contaminated with neutral variants. We illustrate Q-SCAN by analyzing the WGS data from the Atherosclerosis Risk in Communities study.
Affiliation(s)
- Zilin Li
- Harvard University T H Chan School of Public Health, Biostatistics, 655 Huntington Avenue, Boston, 02115 United States
| | - Yaowu Liu
- Southwestern University of Finance and Economics School of Statistics, Chengdu, 610074 China
| | - Xihong Lin
- Harvard University T H Chan School of Public Health, Biostatistics, 655 Huntington Avenue, Boston, 02115 United States
| |
|
83
|
Fore R, Boehme J, Li K, Westra J, Tintle N. Multi-Set Testing Strategies Show Good Behavior When Applied to Very Large Sets of Rare Variants. Front Genet 2020; 11:591606. [PMID: 33240333 PMCID: PMC7680887 DOI: 10.3389/fgene.2020.591606] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/05/2020] [Accepted: 10/05/2020] [Indexed: 12/22/2022] Open
Abstract
Gene-based tests of association (e.g., variance components and burden tests) are now common practice for analyses attempting to elucidate the contribution of rare genetic variants to common disease. As sequencing datasets continue to grow in size, the number of variants within each set (e.g., gene) being tested is also continuing to grow. Pathway-based methods have been used to allow for the initial aggregation of gene-based statistical evidence and then the subsequent aggregation of evidence across the pathway. This “multi-set” approach (first gene-based test, followed by pathway-based) lacks thorough exploration in regard to evaluating genotype–phenotype associations in the age of large, sequenced datasets. In particular, we ask whether there are statistical and biological characteristics that make the multi-set approach optimal vs. simply doing all gene-based tests. In this paper, we provide an intuitive framework for evaluating these questions and use simulated data to affirm this intuition. A real data application is provided demonstrating how our insights manifest themselves in practice. Ultimately, we find that when initial subsets are biologically informative (e.g., tending to aggregate causal genetic variants within one or more subsets, often genes), multi-set strategies can improve statistical power, with particular gains in cases where causal variants are aggregated in subsets with fewer variants overall (high proportion of causal variants in the subset). However, we find that there is little advantage when the sets are non-informative (similar proportion of causal variants in the subsets). Our application to real data further demonstrates this intuition. In practice, we recommend wider use of pathway-based methods and further exploration of optimal ways of aggregating variants into subsets based on emerging biological evidence of the genetic architecture of complex disease.
Affiliation(s)
- Ruby Fore
- Department of Biostatistics, Brown University, Providence, RI, United States
| | - Jaden Boehme
- Department of Mathematics, Oregon State University, Corvallis, OR, United States
| | - Kevin Li
- Department of Mathematics, School of Arts and Sciences, Columbia University, New York, NY, United States
| | - Jason Westra
- Department of Mathematics and Statistics, Dordt University, Sioux Center, IA, United States
| | - Nathan Tintle
- Department of Mathematics and Statistics, Dordt University, Sioux Center, IA, United States
| |
|
84
|
Trevino CE, Holleman AM, Corbitt H, Maslen CL, Rosser TC, Cutler DJ, Johnston HR, Rambo-Martin BL, Oberoi J, Dooley KJ, Capone GT, Reeves RH, Cordell HJ, Keavney BD, Agopian AJ, Goldmuntz E, Gruber PJ, O'Brien JE, Bittel DC, Wadhwa L, Cua CL, Moskowitz IP, Mulle JG, Epstein MP, Sherman SL, Zwick ME. Identifying genetic factors that contribute to the increased risk of congenital heart defects in infants with Down syndrome. Sci Rep 2020; 10:18051. [PMID: 33093519 PMCID: PMC7582922 DOI: 10.1038/s41598-020-74650-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 08/13/2020] [Accepted: 10/05/2020] [Indexed: 01/16/2023] Open
Abstract
Atrioventricular septal defects (AVSD) are a severe congenital heart defect present in individuals with Down syndrome (DS) at a > 2000-fold increased prevalence compared to the general population. This study aimed to identify risk-associated genes and pathways and to examine a potential polygenic contribution to AVSD in DS. We analyzed a total cohort of 702 individuals with DS with or without AVSD, with genomic data from whole exome sequencing, whole genome sequencing, and/or array-based imputation. We utilized sequence kernel association testing and polygenic risk score (PRS) methods to examine rare and common variants. Our findings suggest that the Notch pathway, particularly NOTCH4, as well as genes involved in the ciliome including CEP290 may play a role in AVSD in DS. These pathways have also been implicated in DS-associated AVSD in prior studies. A polygenic component for AVSD in DS has not been examined previously. Using weights based on the largest genome-wide association study of congenital heart defects available (2594 cases and 5159 controls; all general population samples), we found PRS to be associated with AVSD with odds ratios ranging from 1.2 to 1.3 per standard deviation increase in PRS and corresponding liability r² values of approximately 1%, suggesting at least a small polygenic contribution to DS-associated AVSD. Future studies with larger sample sizes will improve identification and quantification of genetic contributions to AVSD in DS.
Affiliation(s)
- Cristina E Trevino
- Department of Human Genetics, Emory University School of Medicine, 300 Whitehead Biomedical Research Building, 615 Michael St., Atlanta, GA, 30322, USA
| | - Aaron M Holleman
- Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA
| | - Holly Corbitt
- Division of Cardiovascular Medicine and the Heart Research Center, Oregon Health and Science University, Portland, OR, USA
| | - Cheryl L Maslen
- Division of Cardiovascular Medicine and the Heart Research Center, Oregon Health and Science University, Portland, OR, USA
| | - Tracie C Rosser
- Department of Human Genetics, Emory University School of Medicine, 300 Whitehead Biomedical Research Building, 615 Michael St., Atlanta, GA, 30322, USA
| | - David J Cutler
- Department of Human Genetics, Emory University School of Medicine, 300 Whitehead Biomedical Research Building, 615 Michael St., Atlanta, GA, 30322, USA
| | - H Richard Johnston
- Department of Human Genetics, Emory University School of Medicine, 300 Whitehead Biomedical Research Building, 615 Michael St., Atlanta, GA, 30322, USA
| | - Benjamin L Rambo-Martin
- Department of Human Genetics, Emory University School of Medicine, 300 Whitehead Biomedical Research Building, 615 Michael St., Atlanta, GA, 30322, USA
| | - Jai Oberoi
- Department of Human Genetics, Emory University School of Medicine, 300 Whitehead Biomedical Research Building, 615 Michael St., Atlanta, GA, 30322, USA
| | - Kenneth J Dooley
- Sibley Heart Center Cardiology, Department of Pediatrics, Children's Healthcare of Atlanta, Emory University, Atlanta, GA, USA
| | | | - Roger H Reeves
- Department of Physiology and the Institute for Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Heather J Cordell
- Population Health Sciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
| | - Bernard D Keavney
- Division of Cardiovascular Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
| | - A J Agopian
- Human Genetics Center; Department of Epidemiology, Human Genetics, and Environmental Sciences, UTHealth School of Public Health, Houston, TX, USA
| | - Elizabeth Goldmuntz
- Division of Cardiology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Peter J Gruber
- Department of Surgery, Yale School of Medicine, New Haven, CT, USA
| | - James E O'Brien
- The Ward Family Heart Center, Section of Cardiac Surgery, Children's Mercy Hospital, Kansas City, MO, USA
| | - Douglas C Bittel
- College of Biosciences, Kansas City University of Medicine and Biosciences, Kansas City, MO, USA
| | | | - Clifford L Cua
- Heart Center, Nationwide Children's Hospital, Columbus, OH, USA
| | - Ivan P Moskowitz
- Departments of Pediatrics, Pathology, and Human Genetics, The University of Chicago, Chicago, IL, USA
| | - Jennifer G Mulle
- Department of Human Genetics, Emory University School of Medicine, 300 Whitehead Biomedical Research Building, 615 Michael St., Atlanta, GA, 30322, USA
| | - Michael P Epstein
- Department of Human Genetics, Emory University School of Medicine, 300 Whitehead Biomedical Research Building, 615 Michael St., Atlanta, GA, 30322, USA
| | - Stephanie L Sherman
- Department of Human Genetics, Emory University School of Medicine, 300 Whitehead Biomedical Research Building, 615 Michael St., Atlanta, GA, 30322, USA
- Department of Pediatrics, Emory University School of Medicine, Atlanta, GA, USA
| | - Michael E Zwick
- Department of Human Genetics, Emory University School of Medicine, 300 Whitehead Biomedical Research Building, 615 Michael St., Atlanta, GA, 30322, USA.
- Department of Pediatrics, Emory University School of Medicine, Atlanta, GA, USA.
| |
|
85
|
Gao C, Sha Q, Zhang S, Zhang K. MF-TOWmuT: Testing an optimally weighted combination of common and rare variants with multiple traits using family data. Genet Epidemiol 2020; 45:64-81. [PMID: 33047835 DOI: 10.1002/gepi.22355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2020] [Revised: 08/03/2020] [Accepted: 08/18/2020] [Indexed: 11/11/2022]
Abstract
With rapid advancements in sequencing technologies and the accumulation of electronic health records, a large number of genetic variants and multiple correlated human complex traits have become available in many genetic association studies. Thus, it becomes necessary and important to develop new methods that can jointly analyze the association between multiple genetic variants and multiple traits. Compared with methods that only use a single marker or trait, the joint analysis of multiple genetic variants and multiple traits is more powerful since such an analysis can fully incorporate the correlation structure of genetic variants and/or traits and their mutual dependence patterns. However, most existing methods that simultaneously analyze multiple genetic variants and multiple traits are only applicable to unrelated samples. We develop a new method called MF-TOWmuT to detect association of multiple phenotypes and multiple genetic variants in a genomic region with family samples. MF-TOWmuT is based on an optimally weighted combination of variants. Our method can be applied to both rare and common variants and both qualitative and quantitative traits. Our simulation results show that (1) the type I error of MF-TOWmuT is preserved; (2) MF-TOWmuT outperforms two existing methods, the Multiple Family-based Quasi-Likelihood Score Test and the Multivariate Family-based Rare Variant Association Test, in terms of power. We also illustrate the usefulness of MF-TOWmuT by analyzing genotypic and phenotypic data from the Genetics of Kidneys in Diabetes study. An R program is available at https://github.com/gaochengPRC/MF-TOWmuT.
Affiliation(s)
- Cheng Gao
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA
| | - Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA
| | - Shuanglin Zhang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA
| | - Kui Zhang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA
| |
|
86
|
Dauber A, Meng Y, Audi L, Vedantam S, Weaver B, Carrascosa A, Albertsson-Wikland K, Ranke MB, Jorge AAL, Cara J, Wajnrajch MP, Lindberg A, Camacho-Hübner C, Hirschhorn JN. A Genome-Wide Pharmacogenetic Study of Growth Hormone Responsiveness. J Clin Endocrinol Metab 2020; 105:5870346. [PMID: 32652002 PMCID: PMC7446971 DOI: 10.1210/clinem/dgaa443] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/20/2020] [Accepted: 07/09/2020] [Indexed: 02/06/2023]
Abstract
CONTEXT Individual patients vary in their response to growth hormone (GH). No large-scale genome-wide studies have looked for genetic predictors of GH responsiveness. OBJECTIVE To identify genetic variants associated with GH responsiveness. DESIGN Genome-wide association study (GWAS). SETTING Cohorts from multiple academic centers and a clinical trial. PATIENTS A total of 614 individuals from 5 short stature cohorts receiving GH: 297 with idiopathic short stature, 276 with isolated GH deficiency, and 65 born small for gestational age. INTERVENTION Association of more than 2 million variants was tested. MAIN OUTCOME MEASURES Primary analysis: individual single nucleotide polymorphism (SNP) association with first-year change in height standard deviation scores. Secondary analyses: SNP associations in clinical subgroups adjusted for clinical variables; association of polygenic score calculated from 697 genome-wide significant height SNPs with GH responsiveness. RESULTS No common variant associations reached genome-wide significance in the primary analysis. The strongest suggestive signals were found near the B4GALT4 and TBCE genes. After meta-analysis including replication data, signals at several loci reached or retained genome-wide significance in secondary analyses, including variants near ST3GAL6. There was no significant association with variants previously reported to be associated with GH response nor with a polygenic predicted height score. CONCLUSIONS We performed the largest GWAS of GH responsiveness to date. We identified 2 loci with a suggestive effect on GH responsiveness in our primary analysis and several genome-wide significant associations in secondary analyses that require further replication. Our results are consistent with a polygenic component to GH responsiveness, likely distinct from the genetic regulators of adult height.
Affiliation(s)
- Andrew Dauber
- Division of Endocrinology, Children’s National Hospital, Washington, DC
| | - Yan Meng
- Division of Endocrinology, Boston Children’s Hospital, and Program in Medical and Population Genetics, Broad Institute, Harvard Medical School, Boston, Massachusetts
| | - Laura Audi
- Department of Pediatrics, Institut de Recerca (VHIR), Hospital Vall d’Hebron, Centre for Biomedical Research on Rare Diseases (CIBERER), Autonomous University, Barcelona, Spain
| | - Sailaja Vedantam
- Division of Endocrinology, Boston Children’s Hospital, and Program in Medical and Population Genetics, Broad Institute, Harvard Medical School, Boston, Massachusetts
| | - Benjamin Weaver
- Division of Endocrinology, Boston Children’s Hospital, and Program in Medical and Population Genetics, Broad Institute, Harvard Medical School, Boston, Massachusetts
| | - Antonio Carrascosa
- Department of Pediatrics, Institut de Recerca (VHIR), Hospital Vall d’Hebron, Centre for Biomedical Research on Rare Diseases (CIBERER), Autonomous University, Barcelona, Spain
| | - Kerstin Albertsson-Wikland
- Department of Physiology/Endocrinology, Institute of Neuroscience and Physiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Michael B Ranke
- University Children’s Hospital, Paediatric Endocrinology, Tübingen, Germany
| | - Alexander A L Jorge
- Unidade de Endocrinologia do Desenvolvimento (LIM42), Hospital das Clinicas da Faculdade de Medicina da Universidade de Sao Paulo, Sao Paulo, Brazil
| | | | - Michael P Wajnrajch
- Pfizer Inc, Rare Disease, New York
- Correspondence and Reprint Requests: Michael Wajnrajch, MD MPA, Endocrine Care & Inborn Errors of Metabolism, Pfizer Inc, 235 East 42nd Street, MS 235-10-01, New York, NY 10017, USA. E-mail:
| | | | | | - Joel N Hirschhorn
- Division of Endocrinology, Boston Children’s Hospital, and Program in Medical and Population Genetics, Broad Institute, Harvard Medical School, Boston, Massachusetts
| |
|
87
|
Sarmah P, Bharali R, Khatonier R, Khan A. Polymorphism in Toll interacting protein (TOLLIP) gene and its association with Visceral Leishmaniasis. GENE REPORTS 2020. [DOI: 10.1016/j.genrep.2020.100705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
|
88
|
Bakhtiari S, Sulaimany S, Talebi M, Kalhor K. Computational Prediction of Probable Single Nucleotide Polymorphism-Cancer Relationships. Cancer Inform 2020; 19:1176935120942216. [PMID: 32728337 PMCID: PMC7364831 DOI: 10.1177/1176935120942216] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/15/2020] [Accepted: 06/22/2020] [Indexed: 12/18/2022] Open
Abstract
Genetic variations such as single nucleotide polymorphisms (SNPs) can cause susceptibility to cancer. Although thousands of genetic variants have been identified as associated with different cancers, the molecular mechanisms of cancer remain unknown. No dedicated dataset of relationships between cancers and SNPs, structured as a bipartite network, exists for computational analysis and prediction. Link prediction, a computational graph analysis method, can help us gain new insight into the network. In this article, after creating a network between cancers and SNPs using the SNPedia and Cancer Research UK databases, we evaluated computational link prediction methods to foresee new SNP-Cancer relationships. Results show that among the popular scoring methods based on network topology, the preferential attachment (PA) algorithm is the most robust method for relation prediction according to computational and experimental evidence, and some of its computational predictions are corroborated in recent publications. According to the PA predictions, rs1801394-Non-small cell lung cancer, rs4880-Non-small cell lung cancer, and rs1805794-Colorectal cancer are among the most probable SNP-Cancer associations that have not yet been mentioned in any published article, and they are the strongest candidates for additional laboratory and validation studies. It is also feasible to improve the prediction algorithms to produce new predictions in the future.
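As a rough illustration of the preferential attachment score named in this abstract, the sketch below ranks unobserved SNP-cancer pairs in a bipartite network by the product of the two node degrees, which is the standard definition of the PA index. It is not the authors' code: the function name, the tie-breaking rule, and the toy edge list with placeholder node names are all assumptions made for the example, and the edges do not represent real associations.

```python
from collections import defaultdict
from itertools import product


def preferential_attachment_scores(edges):
    """Rank unobserved (snp, cancer) pairs by deg(snp) * deg(cancer)."""
    observed = set(edges)
    degree = defaultdict(int)
    for snp, cancer in observed:
        # Key by node type so a SNP and a cancer can never collide by name.
        degree[("snp", snp)] += 1
        degree[("cancer", cancer)] += 1
    snps = sorted({s for s, _ in observed})
    cancers = sorted({c for _, c in observed})
    scores = {
        (s, c): degree[("snp", s)] * degree[("cancer", c)]
        for s, c in product(snps, cancers)
        if (s, c) not in observed
    }
    # Highest score first; ties broken alphabetically (an arbitrary choice).
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy network: placeholder names, not real SNP-cancer links.
toy_edges = [
    ("snp_a", "cancer_x"),
    ("snp_a", "cancer_y"),
    ("snp_b", "cancer_x"),
    ("snp_c", "cancer_x"),
    ("snp_c", "cancer_z"),
]
ranked = preferential_attachment_scores(toy_edges)
for (snp, cancer), score in ranked:
    print(snp, cancer, score)
```

Because the score depends only on node degrees, it favours pairs whose endpoints are already well connected; it uses no biological information beyond the network topology.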
Affiliation(s)
- Shahab Bakhtiari
  - Department of Biological Sciences, University of Kurdistan, Sanandaj, Iran
- Sadegh Sulaimany
  - Department of Computer Engineering, University of Kurdistan, Sanandaj, Iran
- Mehrdad Talebi
  - Department of Medical Genetics, Shahid Sadoughi University of Medical Sciences, Yazd, Iran
- Kabmiz Kalhor
  - Department of Biological Sciences, University of Kurdistan, Sanandaj, Iran
89
Loss-of-function variants in FSIP1 identified by targeted sequencing are associated with one particular subtype of mucosal melanoma. Gene 2020; 759:144964. [PMID: 32717308 DOI: 10.1016/j.gene.2020.144964] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/12/2019] [Revised: 07/04/2020] [Accepted: 07/13/2020] [Indexed: 12/30/2022]
Abstract
BACKGROUND Mucosal melanoma is a tumor caused by the malignant transformation of pigment-producing cells and can arise from any mucosal tissue where melanocytes are present. Due to its rarity, the mucosal melanoma subtype is poorly described, and its genetic characteristics are infrequently studied. The discovery or confirmation of new mucosal melanoma susceptibility genes will provide important insights for the study of its pathogenesis. MATERIALS AND METHODS We performed deep targeted sequencing of 100 previously reported melanoma-related genes in 39 mucosal melanoma samples and a gene-level loss-of-function (LOF) variant enrichment analysis for mucosal melanoma from different incidence sites. RESULTS We detected 7,589 variants in these samples, and 484 were LOF variants (gain or loss of a stop codon, missense, and splice site). Four different gene-level enrichment analyses revealed that FSIP1 (fibrous sheath interacting protein 1) is a susceptibility gene for oral mucosal melanoma (OR = 0.33, P_Chi = 4.05 × 10⁻², P_burden = 3.06 × 10⁻², P_skat = 3.01 × 10⁻², P_skato = 3.01 × 10⁻²), whereas the different methods did not detect a significant susceptibility gene for the other subtypes. CONCLUSIONS In our study, a susceptibility gene for oral mucosal melanoma was confirmed in a Chinese Han population, and these findings contribute to a better genetic understanding of mucosal melanoma of different subtypes.
90
Zhong W, Spracklen CN, Mohlke KL, Zheng X, Fine J, Li Y. Multi-SNP mediation intersection-union test. Bioinformatics 2020; 35:4724-4729. [PMID: 31099385 DOI: 10.1093/bioinformatics/btz285] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 12/19/2018] [Revised: 03/19/2019] [Accepted: 04/16/2019] [Indexed: 12/27/2022] Open
Abstract
SUMMARY Tens of thousands of reproducibly identified GWAS (Genome-Wide Association Studies) variants, with the vast majority falling in non-coding regions resulting in no eventual protein products, call urgently for mechanistic interpretations. Although numerous methods exist, there are few, if any, methods for simultaneously testing the mediation effects of multiple correlated SNPs via some mediator (e.g. the expression of a gene in the neighborhood) on phenotypic outcome. We propose the multi-SNP mediation intersection-union test (SMUT) to fill this methodological gap. Our extensive simulations demonstrate the validity of SMUT as well as substantial, up to 92%, power gains over alternative methods. In addition, SMUT confirmed known mediators in a real dataset of Finns for plasma adiponectin level, which were missed by many alternative methods. We believe SMUT will become a useful tool to generate mechanistic hypotheses underlying GWAS variants, facilitating functional follow-up. AVAILABILITY AND IMPLEMENTATION The R package SMUT is publicly available from CRAN at https://CRAN.R-project.org/package=SMUT. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wujuan Zhong
  - Department of Biostatistics, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Cassandra N Spracklen
  - Department of Genetics, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Karen L Mohlke
  - Department of Genetics, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Xiaojing Zheng
  - Department of Pediatrics, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Jason Fine
  - Department of Biostatistics, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Yun Li
  - Department of Biostatistics, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Genetics, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Computer Science, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
91
Cheng W, Ramachandran S, Crawford L. Estimation of non-null SNP effect size distributions enables the detection of enriched genes underlying complex traits. PLoS Genet 2020; 16:e1008855. [PMID: 32542026 PMCID: PMC7316356 DOI: 10.1371/journal.pgen.1008855] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/13/2019] [Revised: 06/25/2020] [Accepted: 05/13/2020] [Indexed: 12/22/2022] Open
Abstract
Traditional univariate genome-wide association studies generate false positives and negatives due to difficulties distinguishing associated variants from variants with spurious nonzero effects that do not directly influence the trait. Recent efforts have been directed at identifying genes or signaling pathways enriched for mutations in quantitative traits or case-control studies, but these can be computationally costly and hampered by strict model assumptions. Here, we present gene-ε, a new approach for identifying statistical associations between sets of variants and quantitative traits. Our key insight is that enrichment studies on the gene-level are improved when we reformulate the genome-wide SNP-level null hypothesis to identify spurious small-to-intermediate SNP effects and classify them as non-causal. gene-ε efficiently identifies enriched genes under a variety of simulated genetic architectures, achieving greater than a 90% true positive rate at 1% false positive rate for polygenic traits. Lastly, we apply gene-ε to summary statistics derived from six quantitative traits using European-ancestry individuals in the UK Biobank, and identify enriched genes that are in biologically relevant pathways.
Affiliation(s)
- Wei Cheng
  - Department of Ecology and Evolutionary Biology, Brown University, Providence, Rhode Island, United States of America
  - Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Sohini Ramachandran
  - Department of Ecology and Evolutionary Biology, Brown University, Providence, Rhode Island, United States of America
  - Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Lorin Crawford
  - Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
  - Department of Biostatistics, Brown University, Providence, Rhode Island, United States of America
  - Center for Statistical Sciences, Brown University, Providence, Rhode Island, United States of America
92
Cao X, Xing L, He H, Zhang X. Views on GWAS statistical analysis. Bioinformation 2020; 16:393-397. [PMID: 32831520 PMCID: PMC7434950 DOI: 10.6026/97320630016393] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/02/2020] [Revised: 04/15/2020] [Accepted: 04/17/2020] [Indexed: 11/23/2022] Open
Abstract
The genome-wide association study (GWAS) is a popular approach to investigating relationships between genetic information and diseases. A number of associations are tested in a study, and the results are often corrected using multiple-testing adjustment methods. It is observed that GWAS suffer from inadequate statistical power for reliability. Hence, we document known models for reliability assessment using improved statistical power in GWAS analysis.
Affiliation(s)
- Xiaowen Cao
  - Department of Mathematics, Hebei University of Technology, Tianjin, China
  - Department of Mathematics and Statistics, University of Victoria, BC, Canada
- Li Xing
  - Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, SK, Canada
- Hua He
  - Department of Mathematics, Hebei University of Technology, Tianjin, China
- Xuekui Zhang
  - Department of Mathematics and Statistics, University of Victoria, BC, Canada
93
Ding X, Tang R, Zhu J, He M, Huang H, Lin Z, Zhu J. An Appraisal of the Role of Previously Reported Risk Factors in the Age at Menopause Using Mendelian Randomization. Front Genet 2020; 11:507. [PMID: 32547598 PMCID: PMC7274172 DOI: 10.3389/fgene.2020.00507] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 01/26/2020] [Accepted: 04/24/2020] [Indexed: 12/26/2022] Open
Abstract
Objective Menopause at a young age is associated with many health problems in women, including osteoporosis, depressive symptoms, coronary disease, and stroke. Many traditional observational studies have reported some potential risk factors for early menopause but have drawn different conclusions. This inconsistency can be attributed mainly to unmodified confounding factors. Identifying the factors causally associated with age at menopause is important for early intervention in women with abnormal menopause timing, and for improving the quality of life for postmenopausal women. This study aims to appraise whether the previously reported risk factors are causally associated with early age at natural menopause (ANM) susceptibility. Methods We used Mendelian randomization, a statistical method wherein genetic variants are used to determine whether an observational association between a risk factor and an outcome is consistent with a causal effect. Results Women with earlier age at menarche (β = 0.34, se = 0.16, p = 0.035), lower education level (β = 1.19, se = 0.41, p = 0.004) and higher body mass index (β = −0.05, se = 0.02, p = 0.027) had greater risk for early ANM. The causal link between early age at menarche and early ANM was replicated using ReproGen consortium data (β = 0.23, se = 0.07, p = 0.001). However, a current smoking habit, one of the previously reported risk factors, was less likely to be correlated causally with early ANM, suggesting that previous observational studies may not have sufficiently adjusted for confounders. Conclusion Our results help to identify the risk factors of ANM via a genetics approach, and future research into the biological mechanism could further help with targeted prevention for early menopause.
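As a rough illustration of how Mendelian randomization turns variant-level summary statistics into a causal estimate, the sketch below computes a fixed-effect inverse-variance-weighted (IVW) estimate from per-variant Wald ratios. IVW is one widely used estimator; the abstract does not state which estimator the authors used, so this choice, the function name `ivw_estimate`, and the toy numbers are all assumptions.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted (IVW) causal estimate.

    Each variant j gives a Wald ratio beta_outcome[j] / beta_exposure[j];
    the ratios are averaged with weights beta_exposure[j]**2 / se_outcome[j]**2.
    Assumes independent variants and valid instruments.
    """
    triples = list(zip(beta_exposure, beta_outcome, se_outcome))
    numerator = sum(bx * by / se ** 2 for bx, by, se in triples)
    denominator = sum(bx ** 2 / se ** 2 for bx, _, se in triples)
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


# Hypothetical summary statistics for three independent variants,
# constructed so that every Wald ratio equals 0.5.
bx = [0.10, 0.20, 0.15]
by = [0.05, 0.10, 0.075]
se = [0.02, 0.02, 0.02]
estimate, std_error = ivw_estimate(bx, by, se)
print(estimate, std_error)
```

Variants with stronger exposure effects or more precise outcome estimates receive more weight, which is why weak instruments contribute little to the pooled estimate.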
Affiliation(s)
- Xiaohong Ding
  - Department of Pediatrics, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
  - The First Clinical Medical School, Wenzhou Medical University, Wenzhou, China
- Rong Tang
  - Department of Pediatrics, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
  - The Second Clinical Medical School, Wenzhou Medical University, Wenzhou, China
- Jinjin Zhu
  - Department of Pediatrics, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
  - The Second Clinical Medical School, Wenzhou Medical University, Wenzhou, China
- Minzhi He
  - Department of Pediatrics, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
  - The Second Clinical Medical School, Wenzhou Medical University, Wenzhou, China
- Huasong Huang
  - Department of Pediatrics, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
  - The Second Clinical Medical School, Wenzhou Medical University, Wenzhou, China
- Zhenlang Lin
  - Department of Pediatrics, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Jianghu Zhu
  - Department of Pediatrics, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
  - The Second Clinical Medical School, Wenzhou Medical University, Wenzhou, China
94
Statistical Method Based on Bayes-Type Empirical Score Test for Assessing Genetic Association with Multilocus Genotype Data. Int J Genomics 2020; 2020:4708152. [PMID: 32455126 PMCID: PMC7229558 DOI: 10.1155/2020/4708152] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/11/2019] [Accepted: 04/21/2020] [Indexed: 12/20/2022] Open
Abstract
Simultaneous testing of multiple genetic variants for association is widely recognized as a valuable complementary approach to single-marker tests. As such, principal component regression (PCR) has been found to have competitive power. We focus on exploring a robust test for an unknown genetic mode of all SNPs, an unknown Hardy-Weinberg equilibrium (HWE) in a population, and a large number of all SNPs. First, we propose a new global test by means of the use of codominant codes for all markers and PCR. The new global test is built on an empirical Bayes-type score statistic for testing marginal associations with each single marker. The new global test gains power by robustly exploiting the Hardy-Weinberg equilibrium in the control population and effectively using linkage disequilibrium among test markers. The new global test reduces to PCR when the genotype for each marker is coded as the number of minor alleles. This connection lends insight into the power of the new global test relative to PCR and some other popular multimarker test methods. Second, we propose a robust test method based on the new global test and the ordinary PCR test built on a prospective score statistic for testing marginal associations with each single marker when the genotype for each marker is coded as the number of minor alleles by taking the minimum p value of these two tests. Finally, through extensive simulation studies and analysis of the association between pancreatic cancer and some genes of interest, we show that the proposed robust test method has desirable power and can often identify association signals that may be missed by existing methods.
95
Deng Y, He T, Fang R, Li S, Cao H, Cui Y. Genome-Wide Gene-Based Multi-Trait Analysis. Front Genet 2020; 11:437. [PMID: 32508874 PMCID: PMC7248273 DOI: 10.3389/fgene.2020.00437] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/24/2020] [Accepted: 04/08/2020] [Indexed: 11/29/2022] Open
Abstract
Genome-wide association studies focusing on a single phenotype have been broadly conducted to identify genetic variants associated with a complex disease. The commonly applied single variant analysis is limited by failing to consider the complex interactions between variants, which motivated the development of association analyses focusing on genes or gene sets. Moreover, when multiple correlated phenotypes are available, methods based on a multi-trait analysis can improve the association power. However, most currently available multi-trait analyses are single variant-based analyses; thus have limited power when disease variants function as a group in a gene or a gene set. In this work, we propose a genome-wide gene-based multi-trait analysis method by considering genes as testing units. For a given phenotype, we adopt a rapid and powerful kernel-based testing method which can evaluate the joint effect of multiple variants within a gene. The joint effect, either linear or nonlinear, is captured through kernel functions. Given a series of candidate kernel functions, we propose an omnibus test strategy to integrate the test results based on different candidate kernels. A p-value combination method is then applied to integrate dependent p-values to assess the association between a gene and multiple correlated phenotypes. Simulation studies show a reasonable type I error control and an excellent power of the proposed method compared to its counterparts. We further show the utility of the method by applying it to two data sets: the Human Liver Cohort and the Alzheimer Disease Neuroimaging Initiative data set, and novel genes are identified. Our method has broad applications in other fields in which the interest is to evaluate the joint effect (linear or nonlinear) of a set of variants.
Affiliation(s)
- Yamin Deng
  - Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Tao He
  - Department of Mathematics, San Francisco State University, San Francisco, CA, United States
- Ruiling Fang
  - Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Shaoyu Li
  - Department of Mathematics and Statistics, University of North Carolina at Charlotte, Charlotte, NC, United States
- Hongyan Cao
  - Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Yuehua Cui
  - Department of Statistics and Probability, Michigan State University, East Lansing, MI, United States
96
The exhaustive genomic scan approach, with an application to rare-variant association analysis. Eur J Hum Genet 2020; 28:1283-1291. [PMID: 32415273 PMCID: PMC7608423 DOI: 10.1038/s41431-020-0639-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/02/2019] [Revised: 02/28/2020] [Accepted: 04/07/2020] [Indexed: 12/12/2022] Open
Abstract
Region-based genome-wide scans are usually performed by use of a priori chosen analysis regions. Such an approach will likely miss the region comprising the strongest signal and, thus, may result in increased type II error rates and decreased power. Here, we propose a genomic exhaustive scan approach that analyzes all possible subsequences and does not rely on a prior definition of the analysis regions. As a prime instance, we present a computationally ultraefficient implementation using the rare-variant collapsing test for phenotypic association, the genomic exhaustive collapsing scan (GECS). Our implementation allows for the identification of regions comprising the strongest signals in large, genome-wide rare-variant association studies while controlling the family-wise error rate via permutation. Application of GECS to two genomic data sets revealed several novel significantly associated regions for age-related macular degeneration and for schizophrenia. Our approach also offers a high potential to improve genome-wide scans for selection, methylation, and other analyses.
97
Zhang J, Guo X, Gonzales S, Yang J, Wang X. TS: a powerful truncated test to detect novel disease associated genes using publicly available gWAS summary data. BMC Bioinformatics 2020; 21:172. [PMID: 32366212 PMCID: PMC7199321 DOI: 10.1186/s12859-020-3511-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2019] [Accepted: 04/23/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In the last decade, a large number of common variants underlying complex diseases have been identified through genome-wide association studies (GWASs). Summary data of the GWASs are freely and publicly available. The summary data are usually obtained through single marker analysis. Gene-based analysis offers a useful alternative and complement to single marker analysis. Results from gene-level association tests can be more readily integrated with downstream functional and pathogenic investigations. Most existing gene-based methods fall into two categories: burden tests and quadratic tests. Burden tests are usually powerful when the directions of effects of causal variants are the same. However, they may suffer loss of statistical power when different directions of effects exist at the causal variants. Quadratic tests are not affected by the directions of effects but can be less powerful due to issues such as the large number of degrees of freedom. These drawbacks of existing gene-based methods motivated us to develop a new powerful method to identify disease-associated genes using existing GWAS summary data. METHODS AND RESULTS In this paper, we propose a new truncated statistic method (TS) by utilizing a truncated method to find the genes that have a true contribution to the genetic association. Extensive simulation studies demonstrate that our proposed test outperforms other comparable tests. We applied TS and other comparable methods to the schizophrenia GWAS data and type 2 diabetes (T2D) GWAS meta-analysis summary data. TS identified more disease-associated genes than comparable methods. Many of the significant genes identified by TS may have important mechanisms relevant to the associated traits. TS is implemented in C program TS, which is freely and publicly available online. CONCLUSIONS The proposed truncated statistic outperforms existing methods. It can be employed to detect novel trait-associated genes using GWAS summary data.
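The contrast drawn above between burden and quadratic tests can be seen on a toy set of per-variant z-scores. The sketch below is a simplified textbook form of the two statistics, assuming independent variants (no linkage disequilibrium) and equal weights; it is not the TS method itself, and the function names and z-scores are hypothetical.

```python
def burden_stat(z_scores):
    """Burden-type statistic: squared sum of z-scores divided by their count.

    With independent variants it is chi-square with 1 degree of freedom
    under the null hypothesis.
    """
    return sum(z_scores) ** 2 / len(z_scores)


def quadratic_stat(z_scores):
    """Quadratic (sum-of-squares) statistic.

    With independent variants it is chi-square with len(z_scores) degrees
    of freedom under the null hypothesis.
    """
    return sum(z * z for z in z_scores)


# Hypothetical z-scores for four variants in one gene.
same_direction = [2.0, 2.0, 2.0, 2.0]
mixed_direction = [2.0, -2.0, 2.0, -2.0]

for label, z in (("same", same_direction), ("mixed", mixed_direction)):
    print(label, burden_stat(z), quadratic_stat(z))
```

When the effects point in opposite directions, the burden statistic cancels to zero while the quadratic statistic is unchanged; the price of the quadratic form is that the same value must be judged against a reference distribution with more degrees of freedom.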
Affiliation(s)
- Jianjun Zhang
  - Department of Mathematics, University of North Texas, 1155 Union Circle #311430, Denton, 76203 TX USA
- Xuan Guo
  - Department of Computer Science and Engineering, University of North Texas, Discovery Park 3940 N. Elm, Denton, 76203 TX USA
- Samantha Gonzales
  - Department of Computer Science and Engineering, University of North Texas, Discovery Park 3940 N. Elm, Denton, 76203 TX USA
- Jingjing Yang
  - Center for Computational and Quantitative Genetics, Department of Human Genetics School of Medicine, Emory University, Whitehead Biomedical Research Building, Suite 305K, Atlanta, 30322 GA USA
- Xuexia Wang
  - Department of Mathematics, University of North Texas, 1155 Union Circle #311430, Denton, 76203 TX USA
98
Wang Y, Bandyopadhyay D, Shaffer JR, Wu X. Gene-Based Association Mapping for Dental Caries in The GENEVA Consortium. JOURNAL OF DENTISTRY AND DENTAL MEDICINE 2020; 3:156. [PMID: 34622142 PMCID: PMC8494074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
OBJECTIVE Dental caries is a multifactorial disease with high prevalence in both children and adults. Recent genome-wide association studies (GWASs) have revealed that genetic factors play an important role in caries incidence. However, existing methods are not sufficient to identify caries-associated genes, due to the complex correlation structure of caries GWAS data and the lack of appropriate summarization at the gene level. This paper attempts to address that by analyzing data from the Gene Environment Association Studies (GENEVA) consortium. METHODS We investigated gene-based genetic associations for dental caries based on genome-wide data derived from the GENEVA database, with adjustment for covariates, linkage disequilibrium among single-nucleotide polymorphisms, and family relations in sampled individuals. RESULTS Several suggestive genes were identified, some of which have previously been found to have potential biological functions in cariogenesis. CONCLUSIONS By comparing the gene sets identified from gene-based and SNP-based association testing methods, we found a non-negligible overlap, which indicates that our gene-based analysis can provide a substantial supplement to the traditional GWAS analysis.
Affiliation(s)
- Yueyao Wang
  - Department of Statistics, Virginia Polytechnic Institute & State University, Blacksburg, VA
- John R. Shaffer
  - Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA
- Xiaowei Wu
  - Department of Statistics, Virginia Polytechnic Institute & State University, Blacksburg, VA
99
Zhang J, Xie S, Gonzales S, Liu J, Wang X. A fast and powerful eQTL weighted method to detect genes associated with complex trait using GWAS summary data. Genet Epidemiol 2020; 44:550-563. [PMID: 32350919 DOI: 10.1002/gepi.22297] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/14/2019] [Revised: 04/13/2020] [Accepted: 04/14/2020] [Indexed: 02/06/2023]
Abstract
Although genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits, a large fraction of heritability still remains unexplained. Integrative analysis that incorporates additional information, such as expression quantitative trait locus (eQTL) data, into sequencing studies (denoted as transcriptome-wide association study [TWAS]), can aid the discovery of trait-associated genetic variants. However, general TWAS methods only incorporate one eQTL-derived weight (e.g., cis-effect), and thus can suffer a substantial loss of power when the single estimated cis-effect is not predictive for the effect size of a genetic variant or when there are estimation errors in the estimated cis-effect, or if the data are not consistent with the model assumption. In this study, we propose an omnibus test (OT) which utilizes a Cauchy association test to integrate association evidence demonstrated by three different traditional tests (burden test, quadratic test, and adaptive test) using GWAS summary data with multiple eQTL-derived weights. The p value of the proposed test can be calculated analytically, and thus it is fast and efficient. We applied our proposed test to two schizophrenia (SCZ) GWAS summary data sets and two lipids trait (HDL) GWAS summary data sets. Compared with the three traditional tests, our proposed OT can identify more trait-associated genes.
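The reason a Cauchy-based combination yields an analytic p value can be sketched in a few lines: each p value is mapped to a standard Cauchy variate, the variates are averaged, and the average is mapped back through the Cauchy tail. The sketch below shows this commonly used form with equal weights by default. The function name `cauchy_combination` and the three input p values are hypothetical, and the authors' omnibus test may differ in its weights and inputs.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p values with the Cauchy combination rule.

    T = sum_i w_i * tan((0.5 - p_i) * pi), with non-negative weights summing to 1.
    The combined p value is 0.5 - atan(T) / pi.
    Each p value must lie strictly between 0 and 1.
    """
    if weights is None:
        weights = [1.0 / len(p_values)] * len(p_values)
    statistic = sum(w * math.tan((0.5 - p) * math.pi)
                    for w, p in zip(weights, p_values))
    return 0.5 - math.atan(statistic) / math.pi


# Hypothetical p values from a burden, a quadratic, and an adaptive test.
p_burden, p_quadratic, p_adaptive = 0.01, 0.04, 0.20
combined = cauchy_combination([p_burden, p_quadratic, p_adaptive])
print(combined)
```

No permutation or numerical integration is needed, which is what makes this kind of omnibus test fast when applied gene by gene across the genome.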
Affiliation(s)
- Jianjun Zhang
  - Department of Mathematics, University of North Texas, Denton, Texas
- Sicong Xie
  - Beijing National Day School, Beijing, China
- Samantha Gonzales
  - Department of Computer Science and Engineering, University of North Texas, Denton, Texas
- Jianguo Liu
  - Department of Mathematics, University of North Texas, Denton, Texas
- Xuexia Wang
  - Department of Mathematics, University of North Texas, Denton, Texas
100
Wang X, Wen Y. A U-statistics for integrative analysis of multilayer omics data. Bioinformatics 2020; 36:2365-2374. [PMID: 31913435 DOI: 10.1093/bioinformatics/btaa004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2019] [Revised: 12/09/2019] [Accepted: 01/02/2020] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION The emerging multilayer omics data provide unprecedented opportunities for detecting biomarkers that are associated with complex diseases at various molecular levels. However, the high dimensionality of multiomics data and the complex disease etiologies have brought tremendous analytical challenges. RESULTS We developed a U-statistics-based non-parametric framework for the association analysis of multilayer omics data, where consensus and permutation-based weighting schemes are developed to account for various types of disease models. Our proposed method is flexible for analyzing different types of outcomes as it makes no assumptions about their distributions. Moreover, it explicitly accounts for various types of underlying disease models through weighting schemes and thus provides robust performance against them. Through extensive simulations and the application to a dataset obtained from the Alzheimer's Disease Neuroimaging Initiative, we demonstrated that our method outperformed the commonly used kernel regression-based methods. AVAILABILITY AND IMPLEMENTATION The R-package is available at https://github.com/YaluWen/Uomic. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xiaqiong Wang
  - Department of Statistics, University of Auckland, Auckland, New Zealand
- Yalu Wen
  - Department of Statistics, University of Auckland, Auckland, New Zealand