1. Chen M, Zhang Y, Shi W, Song X, Yang Y, Hou G, Ding H, Chen S, Yang W, Shen N, Cui Y, Zuo X, Tang Y. SPATS2L is a positive feedback regulator of the type I interferon signaling pathway and plays a vital role in lupus. Acta Biochim Biophys Sin (Shanghai) 2024. [PMID: 39099414 DOI: 10.3724/abbs.2024132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/06/2024] Open
Abstract
Through genome-wide association studies (GWAS) and integrated expression quantitative trait locus (eQTL) analyses, numerous susceptibility genes ("eGenes", whose expressions are significantly associated with common variants) associated with systemic lupus erythematosus (SLE) have been identified. Notably, a subset of these eGenes is correlated with disease activity. However, the precise mechanisms through which these genes contribute to the initiation and progression of the disease remain to be fully elucidated. In this investigation, we initially identify SPATS2L as an SLE eGene correlated with disease activity. eSignaling and transcriptomic analyses suggest its involvement in the type I interferon (IFN) pathway. We observe a significant increase in SPATS2L expression following type I IFN stimulation, and the expression levels are dependent on both the concentration and duration of stimulation. Furthermore, through dual-luciferase reporter assays, western blot analysis, and imaging flow cytometry, we confirm that SPATS2L positively modulates the type I IFN pathway, acting as a positive feedback regulator. Notably, siRNA-mediated intervention targeting SPATS2L, an interferon-inducible gene, in peripheral blood mononuclear cells (PBMCs) from patients with SLE reverses the activation of the interferon pathway. In conclusion, our research highlights the pivotal role of SPATS2L as a positive-feedback regulatory molecule within the type I IFN pathway. Our findings suggest that SPATS2L plays a critical role in the onset and progression of SLE and may serve as a promising target for disease activity assessment and intervention strategies.
Affiliation(s)
- Mengke Chen
  - Shanghai Institute of Rheumatology, Renji Hospital, Shanghai Jiao Tong University School of Medicine (SJTUSM), Shanghai 200001, China
- Yutong Zhang
  - Shanghai Institute of Rheumatology, Renji Hospital, Shanghai Jiao Tong University School of Medicine (SJTUSM), Shanghai 200001, China
- Weiwen Shi
  - Shanghai Institute of Rheumatology, Renji Hospital, Shanghai Jiao Tong University School of Medicine (SJTUSM), Shanghai 200001, China
- Xuejiao Song
  - Department of Dermatology, China-Japan Friendship Hospital, Beijing 100029, China
- Yue Yang
  - Department of Dermatology, China-Japan Friendship Hospital, Beijing 100029, China
  - Department of Pharmacy, China-Japan Friendship Hospital, Beijing 100029, China
- Guojun Hou
  - Shanghai Institute of Rheumatology, Renji Hospital, Shanghai Jiao Tong University School of Medicine (SJTUSM), Shanghai 200001, China
- Huihua Ding
  - Shanghai Institute of Rheumatology, Renji Hospital, Shanghai Jiao Tong University School of Medicine (SJTUSM), Shanghai 200001, China
- Sheng Chen
  - Shanghai Institute of Rheumatology, Renji Hospital, Shanghai Jiao Tong University School of Medicine (SJTUSM), Shanghai 200001, China
- Wanling Yang
  - Department of Paediatrics and Adolescent Medicine, The University of Hong Kong, Hong Kong 999077, China
- Nan Shen
  - Shanghai Institute of Rheumatology, Renji Hospital, Shanghai Jiao Tong University School of Medicine (SJTUSM), Shanghai 200001, China
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine (SJTUSM), Shanghai 200032, China
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati OH 45229, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati OH 45229, USA
- Yong Cui
  - Department of Dermatology, China-Japan Friendship Hospital, Beijing 100029, China
- Xianbo Zuo
  - Department of Dermatology, China-Japan Friendship Hospital, Beijing 100029, China
  - Department of Pharmacy, China-Japan Friendship Hospital, Beijing 100029, China
- Yuanjia Tang
  - Shanghai Institute of Rheumatology, Renji Hospital, Shanghai Jiao Tong University School of Medicine (SJTUSM), Shanghai 200001, China
2. Farhangi S, Gòdia M, Derks MFL, Harlizius B, Dibbits B, González-Prendes R, Crooijmans RPMA, Madsen O, Groenen MAM. Expression genome-wide association study identifies key regulatory variants enriched with metabolic and immune functions in four porcine tissues. BMC Genomics 2024; 25:684. [PMID: 38992576 PMCID: PMC11238464 DOI: 10.1186/s12864-024-10583-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Accepted: 07/01/2024] [Indexed: 07/13/2024] Open
Abstract
BACKGROUND Integration of high throughput DNA genotyping and RNA-sequencing data enables the discovery of genomic regions that regulate gene expression, known as expression quantitative trait loci (eQTL). In pigs, efforts to date have been mainly focused on purebred lines for traits with commercial relevance such as growth and meat quality. However, little is known about genetic variants and mechanisms associated with the robustness of an animal, thus its overall health status. Here, the liver, lung, spleen, and muscle transcriptomes of 100 three-way crossbred female finishers were studied, with the aim of identifying novel eQTL regulatory regions and transcription factors (TFs) associated with regulation of porcine metabolism and health-related traits. RESULTS An expression genome-wide association study with 535,896 genotypes and the expression of 12,680 genes in liver, 13,310 genes in lung, 12,650 genes in spleen, and 12,595 genes in muscle resulted in 4,293, 10,630, 4,533, and 6,871 eQTL regions for each of these tissues, respectively. Although only a small fraction of the eQTLs were annotated as cis-eQTLs, these presented a higher number of polymorphisms per region and significantly stronger associations with their target gene compared to trans-eQTLs. Between 20 and 115 eQTL hotspots were identified across the four tissues. Interestingly, these were all enriched for immune-related biological processes. In spleen, two TFs were identified: ERF and ZNF45, with key roles in regulation of gene expression. CONCLUSIONS This study provides a comprehensive analysis with more than 26,000 eQTL regions identified that are now publicly available. The genomic regions and their variants were mostly associated with tissue-specific regulatory roles. However, some shared regions provide new insights into the complex regulation of genes and their interactions that are involved with important traits related to metabolism and immunity.
Affiliation(s)
- Samin Farhangi
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
- Marta Gòdia
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
- Martijn F L Derks
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
  - Topigs Norsvin Research Center, 's-Hertogenbosch, The Netherlands
- Bert Dibbits
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
- Rayner González-Prendes
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
  - Ausnutria BV, Zwolle, The Netherlands
- Ole Madsen
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
- Martien A M Groenen
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
3. Head ST, Dezem F, Todor A, Yang J, Plummer J, Gayther S, Kar S, Schildkraut J, Epstein MP. Cis- and trans-eQTL TWASs of breast and ovarian cancer identify more than 100 susceptibility genes in the BCAC and OCAC consortia. Am J Hum Genet 2024; 111:1084-1099. [PMID: 38723630 PMCID: PMC11179407 DOI: 10.1016/j.ajhg.2024.04.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 04/11/2024] [Accepted: 04/16/2024] [Indexed: 05/21/2024] Open
Abstract
Transcriptome-wide association studies (TWASs) have investigated the role of genetically regulated transcriptional activity in the etiologies of breast and ovarian cancer. However, methods performed to date have focused on the regulatory effects of risk-associated SNPs thought to act in cis on a nearby target gene. With growing evidence for distal (trans) regulatory effects of variants on gene expression, we performed TWASs of breast and ovarian cancer using a Bayesian genome-wide TWAS method (BGW-TWAS) that considers effects of both cis- and trans-expression quantitative trait loci (eQTLs). We applied BGW-TWAS to whole-genome and RNA sequencing data in breast and ovarian tissues from the Genotype-Tissue Expression project to train expression imputation models. We applied these models to large-scale GWAS summary statistic data from the Breast Cancer and Ovarian Cancer Association Consortia to identify genes associated with risk of overall breast cancer, non-mucinous epithelial ovarian cancer, and 10 cancer subtypes. We identified 101 genes significantly associated with risk with breast cancer phenotypes and 8 with ovarian phenotypes. These loci include established risk genes and several novel candidate risk loci, such as ACAP3, whose associations are predominantly driven by trans-eQTLs. We replicated several associations using summary statistics from an independent GWAS of these cancer phenotypes. We further used genotype and expression data in normal and tumor breast tissue from the Cancer Genome Atlas to examine the performance of our trained expression imputation models. This work represents an in-depth look into the role of trans eQTLs in the complex molecular mechanisms underlying these diseases.
Affiliation(s)
- S Taylor Head
  - Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA
- Felipe Dezem
  - Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Andrei Todor
  - Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA 30322, USA
- Jingjing Yang
  - Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA 30322, USA
- Jasmine Plummer
  - Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Simon Gayther
  - Department of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
- Siddhartha Kar
  - Early Cancer Institute, Department of Oncology, University of Cambridge, Cambridge CB2 0XZ, UK
- Joellen Schildkraut
  - Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA
- Michael P Epstein
  - Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA 30322, USA
4. Ashayeri H, Sobhi N, Pławiak P, Pedrammehr S, Alizadehsani R, Jafarizadeh A. Transfer Learning in Cancer Genetics, Mutation Detection, Gene Expression Analysis, and Syndrome Recognition. Cancers (Basel) 2024; 16:2138. [PMID: 38893257 PMCID: PMC11171544 DOI: 10.3390/cancers16112138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2024] [Revised: 05/30/2024] [Accepted: 06/01/2024] [Indexed: 06/21/2024] Open
Abstract
Artificial intelligence (AI), encompassing machine learning (ML) and deep learning (DL), has revolutionized medical research, facilitating advancements in drug discovery and cancer diagnosis. ML identifies patterns in data, while DL employs neural networks for intricate processing. Predictive modeling challenges, such as data labeling, are addressed by transfer learning (TL), leveraging pre-existing models for faster training. TL shows potential in genetic research, improving tasks like gene expression analysis, mutation detection, genetic syndrome recognition, and genotype-phenotype association. This review explores the role of TL in overcoming challenges in mutation detection, genetic syndrome detection, gene expression, or phenotype-genotype association. TL has shown effectiveness in various aspects of genetic research. TL enhances the accuracy and efficiency of mutation detection, aiding in the identification of genetic abnormalities. TL can improve the diagnostic accuracy of syndrome-related genetic patterns. Moreover, TL plays a crucial role in gene expression analysis in order to accurately predict gene expression levels and their interactions. Additionally, TL enhances phenotype-genotype association studies by leveraging pre-trained models. In conclusion, TL enhances AI efficiency by improving mutation prediction, gene expression analysis, and genetic syndrome detection. Future studies should focus on increasing domain similarities, expanding databases, and incorporating clinical data for better predictions.
Affiliation(s)
- Hamidreza Ashayeri
  - Student Research Committee, Tabriz University of Medical Sciences, Tabriz 5165665811, Iran
- Navid Sobhi
  - Nikookari Eye Center, Tabriz University of Medical Sciences, Tabriz 5165665811, Iran
- Paweł Pławiak
  - Department of Computer Science, Faculty of Computer Science and Telecommunications, Cracow University of Technology, Warszawska 24, 31-155 Krakow, Poland
  - Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, Bałtycka 5, 44-100 Gliwice, Poland
- Siamak Pedrammehr
  - Faculty of Design, Tabriz Islamic Art University, Tabriz 5164736931, Iran
  - Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Burwood, VIC 3216, Australia
- Roohallah Alizadehsani
  - Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Burwood, VIC 3216, Australia
- Ali Jafarizadeh
  - Nikookari Eye Center, Tabriz University of Medical Sciences, Tabriz 5165665811, Iran
  - Immunology Research Center, Tabriz University of Medical Sciences, Tabriz 5165665811, Iran
5. Head ST, Dezem F, Todor A, Yang J, Plummer J, Gayther S, Kar S, Schildkraut J, Epstein MP. Cis- and trans-eQTL TWAS of breast and ovarian cancer identify more than 100 risk associated genes in the BCAC and OCAC consortia. bioRxiv 2023:2023.11.09.566218. [PMID: 38014246 PMCID: PMC10680675 DOI: 10.1101/2023.11.09.566218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Transcriptome-wide association studies (TWAS) have investigated the role of genetically regulated transcriptional activity in the etiologies of breast and ovarian cancer. However, methods performed to date have only considered regulatory effects of risk associated SNPs thought to act in cis on a nearby target gene. With growing evidence for distal (trans) regulatory effects of variants on gene expression, we performed TWAS of breast and ovarian cancer using a Bayesian genome-wide TWAS method (BGW-TWAS) that considers effects of both cis- and trans-expression quantitative trait loci (eQTLs). We applied BGW-TWAS to whole genome and RNA sequencing data in breast and ovarian tissues from the Genotype-Tissue Expression project to train expression imputation models. We applied these models to large-scale GWAS summary statistic data from the Breast Cancer and Ovarian Cancer Association Consortia to identify genes associated with risk of overall breast cancer, non-mucinous epithelial ovarian cancer, and 10 cancer subtypes. We identified 101 genes significantly associated with risk with breast cancer phenotypes and 8 with ovarian phenotypes. These loci include established risk genes and several novel candidate risk loci, such as ACAP3, whose associations are predominantly driven by trans-eQTLs. We replicated several associations using summary statistics from an independent GWAS of these cancer phenotypes. We further used genotype and expression data in normal and tumor breast tissue from the Cancer Genome Atlas to examine the performance of our trained expression imputation models. This work represents a first look into the role of trans-eQTLs in the complex molecular mechanisms underlying these diseases.
Affiliation(s)
- S. Taylor Head
  - Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA
- Felipe Dezem
  - Department of Developmental Neurobiology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Andrei Todor
  - Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA 30322, USA
- Jingjing Yang
  - Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA 30322, USA
- Jasmine Plummer
  - Department of Developmental Neurobiology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Simon Gayther
  - Department of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
- Siddhartha Kar
  - Early Cancer Institute, Department of Oncology, University of Cambridge, Cambridge CB2 0XZ, UK
- Joellen Schildkraut
  - Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA
- Michael P. Epstein
  - Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA 30322, USA
6. Swart PC, Du Plessis M, Rust C, Womersley JS, van den Heuvel LL, Seedat S, Hemmings SMJ. Identifying genetic loci that are associated with changes in gene expression in PTSD in a South African cohort. J Neurochem 2023; 166:705-719. [PMID: 37522158 PMCID: PMC10953375 DOI: 10.1111/jnc.15919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 06/30/2023] [Accepted: 07/05/2023] [Indexed: 08/01/2023]
Abstract
The molecular mechanisms underlying posttraumatic stress disorder (PTSD) are yet to be fully elucidated, especially in underrepresented population groups. Expression quantitative trait loci (eQTLs) are DNA sequence variants that influence gene expression, in a local (cis-) or distal (trans-) manner, and subsequently impact cellular, tissue, and system physiology. This study aims to identify genetic loci associated with gene expression changes in a South African PTSD cohort. Genome-wide genotype and RNA-sequencing data were obtained from 32 trauma-exposed controls and 35 PTSD cases of mixed-ancestry, as part of the SHARED ROOTS project. The first approach utilised 108 937 single-nucleotide polymorphisms (SNPs) (MAF > 10%) and 11 312 genes with Matrix eQTL to map potential eQTLs, while controlling for covariates as appropriate. The second analysis was focused on 5638 SNPs related to a previously calculated PTSD polygenic risk score for this cohort. SNP-gene pairs were considered eQTLs if they surpassed Bonferroni correction and had a false discovery rate <0.05. We did not identify eQTLs that significantly influenced gene expression in a PTSD-dependent manner. However, several known cis-eQTLs, independent of PTSD diagnosis, were observed. rs8521 (C > T) was associated with TAGLN and SIDT2 expression, and rs11085906 (C > T) was associated with ZNF333 expression. This exploratory study provides insight into the molecular mechanisms associated with PTSD in a non-European, admixed sample population. This study was limited by the cross-sectional design and insufficient statistical power. Overall, this study should encourage further multi-omics approaches towards investigating PTSD in diverse populations.
Affiliation(s)
- Patricia C. Swart
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
- Morne Du Plessis
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
- Carlien Rust
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
- Jacqueline S. Womersley
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
- Leigh L. van den Heuvel
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
- Soraya Seedat
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
- Sian M. J. Hemmings
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
7. Aygün N, Liang D, Crouse WL, Keele GR, Love MI, Stein JL. Inferring cell-type-specific causal gene regulatory networks during human neurogenesis. Genome Biol 2023; 24:130. [PMID: 37254169 PMCID: PMC10230710 DOI: 10.1186/s13059-023-02959-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/26/2022] [Accepted: 05/05/2023] [Indexed: 06/01/2023] Open
Abstract
BACKGROUND Genetic variation influences both chromatin accessibility, assessed in chromatin accessibility quantitative trait loci (caQTL) studies, and gene expression, assessed in expression QTL (eQTL) studies. Genetic variants can impact either nearby genes (cis-eQTLs) or distal genes (trans-eQTLs). Colocalization between caQTL and eQTL, or cis- and trans-eQTLs suggests that they share causal variants. However, pairwise colocalization between these molecular QTLs does not guarantee a causal relationship. Mediation analysis can be applied to assess the evidence supporting causality versus independence between molecular QTLs. Given that the function of QTLs can be cell-type-specific, we performed mediation analyses to find epigenetic and distal regulatory causal pathways for genes within two major cell types of the developing human cortex, progenitors and neurons. RESULTS We find that the expression of 168 and 38 genes is mediated by chromatin accessibility in progenitors and neurons, respectively. We also find that the expression of 11 and 12 downstream genes is mediated by upstream genes in progenitors and neurons. Moreover, we discover that a genetic locus associated with inter-individual differences in brain structure shows evidence for mediation of SLC26A7 through chromatin accessibility, identifying molecular mechanisms of a common variant association to a brain trait. CONCLUSIONS In this study, we identify cell-type-specific causal gene regulatory networks whereby the impacts of variants on gene expression were mediated by chromatin accessibility or distal gene expression. Identification of these causal paths will enable identifying and prioritizing actionable regulatory targets perturbing these key processes during neurodevelopment.
Affiliation(s)
- Nil Aygün
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Dan Liang
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Wesley L Crouse
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Gregory R Keele
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
- Michael I Love
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Jason L Stein
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
8. Kidd J, Raulerson CK, Mohlke KL, Lin DY. Mediation analysis of multiple mediators with incomplete omics data. Genet Epidemiol 2023; 47:61-77. [PMID: 36125445 PMCID: PMC10423053 DOI: 10.1002/gepi.22504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 06/29/2022] [Accepted: 08/16/2022] [Indexed: 02/01/2023]
Abstract
There is an increasing interest in using multiple types of omics features (e.g., DNA sequences, RNA expressions, methylation, protein expressions, and metabolic profiles) to study how the relationships between phenotypes and genotypes may be mediated by other omics markers. Genotypes and phenotypes are typically available for all subjects in genetic studies, but typically, some omics data will be missing for some subjects, due to limitations such as cost and sample quality. In this article, we propose a powerful approach for mediation analysis that accommodates missing data among multiple mediators and allows for various interaction effects. We formulate the relationships among genetic variants, other omics measurements, and phenotypes through linear regression models. We derive the joint likelihood for models with two mediators, accounting for arbitrary patterns of missing values. Utilizing computationally efficient and stable algorithms, we conduct maximum likelihood estimation. Our methods produce unbiased and statistically efficient estimators. We demonstrate the usefulness of our methods through simulation studies and an application to the Metabolic Syndrome in Men study.
Affiliation(s)
- John Kidd
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Chelsea K. Raulerson
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, U.S.A
- Karen L. Mohlke
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, U.S.A
- Dan-Yu Lin
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
9. Han YJ, Zhang J, Hardeman A, Liu M, Karginova O, Romero R, Khramtsova GF, Zheng Y, Huo D, Olopade OI. An enhancer variant associated with breast cancer susceptibility in Black women regulates TNFSF10 expression and antitumor immunity in triple-negative breast cancer. Hum Mol Genet 2023; 32:139-150. [PMID: 35930348 PMCID: PMC9837834 DOI: 10.1093/hmg/ddac168] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/28/2022] [Revised: 07/13/2022] [Accepted: 07/15/2022] [Indexed: 01/25/2023] Open
Abstract
Women of African ancestry have the highest mortality from triple-negative breast cancer (TNBC) of all racial groups. To understand the genomic basis of breast cancer in the populations, we previously conducted genome-wide association studies and identified single nucleotide polymorphisms (SNPs) associated with breast cancer in Black women. In this study, we investigated the functional significance of the top associated SNP rs13074711. We found the SNP served as an enhancer variant and regulated TNFSF10 (TRAIL) expression in TNBC cells, with a significant association between the SNP genotype and TNFSF10 expression in breast tumors. Mechanistically, rs13074711 modulated the binding activity of c-MYB at the motif and thereby controlled TNFSF10 expression. Interestingly, TNFSF10 expression in many cancers was consistently lower in African Americans compared with European Americans. Furthermore, TNFSF10 expression in TNBC was significantly correlated with the expression of antiviral immune genes and was regulated by type I interferons (IFNs). Accordingly, loss of TNFSF10 resulted in a profound decrease in apoptosis of TNBC cells in response to type I IFNs and poly(I:C), a synthetic analogue of double stranded virus. Lastly, in a syngeneic mouse model of breast cancer, TNFSF10-deficiency in breast tumors decreased tumor-infiltrated CD4+ and CD8+ T cell quantities. Collectively, our results suggested that TNFSF10 plays an important role in the regulation of antiviral immune responses in TNBC, and the expression is in part regulated by a genetic variant associated with breast cancer in Black women. Our results underscore the important contributions of genetic variants to immune defense mechanisms.
Affiliation(s)
- Yoo Jane Han
  - Section of Hematology/Oncology & Center for Clinical Cancer Genetics, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Jing Zhang
  - Section of Hematology/Oncology & Center for Clinical Cancer Genetics, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Ashley Hardeman
  - Section of Hematology/Oncology & Center for Clinical Cancer Genetics, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Margaret Liu
  - Section of Hematology/Oncology & Center for Clinical Cancer Genetics, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Olga Karginova
  - Section of Hematology/Oncology & Center for Clinical Cancer Genetics, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Roger Romero
  - Section of Hematology/Oncology & Center for Clinical Cancer Genetics, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Galina F Khramtsova
  - Section of Hematology/Oncology & Center for Clinical Cancer Genetics, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Yonglan Zheng
  - Section of Hematology/Oncology & Center for Clinical Cancer Genetics, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Dezheng Huo
  - Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
- Olufunmilayo I Olopade
  - Section of Hematology/Oncology & Center for Clinical Cancer Genetics, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
| |
|
10
|
Pudjihartono M, Perry JK, Print C, O'Sullivan JM, Schierding W. Interpretation of the role of germline and somatic non-coding mutations in cancer: expression and chromatin conformation informed analysis. Clin Epigenetics 2022; 14:120. [PMID: 36171609 PMCID: PMC9520844 DOI: 10.1186/s13148-022-01342-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 09/21/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND There has been extensive scrutiny of cancer driving mutations within the exome (especially amino acid altering mutations) as these are more likely to have a clear impact on protein functions, and thus on cell biology. However, this has come at the neglect of systematic identification of regulatory (non-coding) variants, which have recently been identified as putative somatic drivers and key germline risk factors for cancer development. Comprehensive understanding of non-coding mutations requires understanding their role in the disruption of regulatory elements, which then disrupt key biological functions such as gene expression. MAIN BODY We describe how advancements in sequencing technologies have led to the identification of a large number of non-coding mutations with uncharacterized biological significance. We summarize the strategies that have been developed to interpret and prioritize the biological mechanisms impacted by non-coding mutations, focusing on recent annotation of cancer non-coding variants utilizing chromatin states, eQTLs, and chromatin conformation data. CONCLUSION We believe that a better understanding of how to apply different regulatory data types into the study of non-coding mutations will enhance the discovery of novel mechanisms driving cancer.
Affiliation(s)
| | - Jo K Perry
- Liggins Institute, The University of Auckland, Auckland, New Zealand
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
| | - Cris Print
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
- Department of Molecular Medicine and Pathology, School of Medical Sciences, University of Auckland, Auckland, 1142, New Zealand
| | - Justin M O'Sullivan
- Liggins Institute, The University of Auckland, Auckland, New Zealand
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
- Australian Parkinson's Mission, Garvan Institute of Medical Research, Sydney, NSW, Australia
- MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, UK
| | - William Schierding
- Liggins Institute, The University of Auckland, Auckland, New Zealand.
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand.
| |
|
11
|
Guo X, Han J, Song Y, Yin Z, Liu S, Shang X. Using expression quantitative trait loci data and graph-embedded neural networks to uncover genotype–phenotype interactions. Front Genet 2022; 13:921775. [PMID: 36046233 PMCID: PMC9421127 DOI: 10.3389/fgene.2022.921775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2022] [Accepted: 07/04/2022] [Indexed: 11/13/2022]
Abstract
Motivation: A central goal of current biology is to establish a complete functional link between the genotype and phenotype, the so-called genotype–phenotype map. With the continuous development of high-throughput technology and the decline in sequencing costs, multi-omics analysis has become more widely employed. While this gives us new opportunities to uncover the correlation mechanisms between single-nucleotide polymorphisms (SNPs), genes, and phenotypes, multi-omics still faces certain challenges, specifically: 1) when the sample size is large enough, the number of omics types is often not large enough to meet the requirements of multi-omics analysis; 2) each omics’ internal correlations are often unclear, such as the correlation between genes in genomics; 3) when analyzing a large number of traits (p), the sample size (n) is often smaller than p, n << p, hindering the application of machine learning methods in the classification of disease outcomes. Results: To solve these issues with multi-omics and build a robust classification model, we propose a graph-embedded deep neural network (G-EDNN) based on expression quantitative trait loci (eQTL) data, which achieves sparse connectivity between network layers to prevent overfitting. The correlation within each omics is also considered such that the model more closely resembles biological reality. To verify the capabilities of this method, we conducted experimental analysis using the GSE28127 and GSE95496 data sets from the Gene Expression Omnibus (GEO) database, tested various neural network architectures, and used prior data for feature selection and graph embedding. Results show that the proposed method could achieve a high classification accuracy and easy-to-interpret feature selection. This method represents an extended application of genotype–phenotype association analysis in deep learning networks.
Affiliation(s)
- Xinpeng Guo
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China
- School of Air and Missile Defense, Air Force Engineering University, Xi’an, China
| | - Jinyu Han
- School of Economics and Management, Chang’an University, Xi’an, China
| | - Yafei Song
- School of Air and Missile Defense, Air Force Engineering University, Xi’an, China
| | - Zhilei Yin
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China
| | - Shuaichen Liu
- School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an, China
| | - Xuequn Shang
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China
- *Correspondence: Xuequn Shang
| |
|
12
|
Macias-Velasco JF, St Pierre CL, Wayhart JP, Yin L, Spears L, Miranda MA, Carson C, Funai K, Cheverud JM, Semenkovich CF, Lawson HA. Parent-of-origin effects propagate through networks to shape metabolic traits. eLife 2022; 11:e72989. [PMID: 35356864 PMCID: PMC9075957 DOI: 10.7554/elife.72989] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/12/2021] [Accepted: 03/25/2022] [Indexed: 11/16/2022]
Abstract
Parent-of-origin effects are unexpectedly common in complex traits, including metabolic and neurological traits. Parent-of-origin effects can be modified by the environment, but the architecture of these gene-by-environmental effects on phenotypes remains to be unraveled. Previously, quantitative trait loci (QTL) showing context-specific parent-of-origin effects on metabolic traits were mapped in the F16 generation of an advanced intercross between LG/J and SM/J inbred mice. However, these QTL were not enriched for known imprinted genes, suggesting another mechanism is needed to explain these parent-of-origin effects phenomena. We propose that non-imprinted genes can generate complex parent-of-origin effects on metabolic traits through interactions with imprinted genes. Here, we employ data from mouse populations at different levels of intercrossing (F0, F1, F2, F16) of the LG/J and SM/J inbred mouse lines to test this hypothesis. Using multiple populations and incorporating genetic, genomic, and physiological data, we leverage orthogonal evidence to identify networks of genes through which parent-of-origin effects propagate. We identify a network comprised of three imprinted and six non-imprinted genes that show parent-of-origin effects. This epistatic network forms a nutritional responsive pathway and the genes comprising it jointly serve cellular functions associated with growth. We focus on two genes, Nnat and F2r, whose interaction associates with serum glucose levels across generations in high-fat-fed females. Single-cell RNAseq reveals that Nnat expression increases and F2r expression decreases in pre-adipocytes along an adipogenic trajectory, a result that is consistent with our observations in bulk white adipose tissue.
Affiliation(s)
- Juan F Macias-Velasco
- Department of Genetics, Washington University School of Medicine, Saint Louis, United States
| | - Celine L St Pierre
- Department of Genetics, Washington University School of Medicine, Saint Louis, United States
| | - Jessica P Wayhart
- Department of Genetics, Washington University School of Medicine, Saint Louis, United States
| | - Li Yin
- Department of Medicine, Washington University School of Medicine, Saint Louis, United States
| | - Larry Spears
- Department of Medicine, Washington University School of Medicine, Saint Louis, United States
| | - Mario A Miranda
- Department of Genetics, Washington University School of Medicine, Saint Louis, United States
| | - Caryn Carson
- Department of Genetics, Washington University School of Medicine, Saint Louis, United States
| | - Katsuhiko Funai
- Diabetes and Metabolism Research Center, University of Utah, Salt Lake City, United States
| | | | - Clay F Semenkovich
- Department of Medicine, Washington University School of Medicine, Saint Louis, United States
| | - Heather A Lawson
- Department of Genetics, Washington University School of Medicine, Saint Louis, United States
| |
|
13
|
Bhattacharya A, Freedman AN, Avula V, Harris R, Liu W, Pan C, Lusis AJ, Joseph RM, Smeester L, Hartwell HJ, Kuban KCK, Marsit CJ, Li Y, O'Shea TM, Fry RC, Santos HP. Placental genomics mediates genetic associations with complex health traits and disease. Nat Commun 2022; 13:706. [PMID: 35121757 PMCID: PMC8817049 DOI: 10.1038/s41467-022-28365-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 07/29/2021] [Accepted: 12/15/2021] [Indexed: 01/09/2023]
Abstract
As the master regulator in utero, the placenta is core to the Developmental Origins of Health and Disease (DOHaD) hypothesis but is historically understudied. To identify placental gene-trait associations (GTAs) across the life course, we perform distal mediator-enriched transcriptome-wide association studies (TWAS) for 40 traits, integrating placental multi-omics from the Extremely Low Gestational Age Newborn Study. At [Formula: see text], we detect 248 GTAs, mostly for neonatal and metabolic traits, across 176 genes, enriched for cell growth and immunological pathways. In aggregate, genetic effects mediated by placental expression significantly explain 4 early-life traits but no later-in-life traits. 89 GTAs show significant mediation through distal genetic variants, identifying hypotheses for distal regulation of GTAs. Investigation of one hypothesis in human placenta-derived choriocarcinoma cells reveals that knockdown of mediator gene EPS15 upregulates predicted targets SPATA13 and FAM214A, both associated with waist-hip ratio in TWAS, and multiple genes involved in metabolic pathways. These results suggest profound health impacts of placental genomic regulation in developmental programming across the life course.
Affiliation(s)
- Arjun Bhattacharya
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA.
- Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA.
| | - Anastasia N Freedman
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, 27514, USA
| | - Vennela Avula
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, 27514, USA
| | - Rebeca Harris
- Biobehavioral Laboratory, School of Nursing, University of North Carolina, Chapel Hill, NC, 27514, USA
| | - Weifang Liu
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, 27514, USA
| | - Calvin Pan
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
| | - Aldons J Lusis
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
- Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
- Department of Microbiology, Immunology and Molecular Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
| | - Robert M Joseph
- Department of Anatomy and Neurobiology, Boston University School of Medicine, Boston, MA, 02118, USA
| | - Lisa Smeester
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, 27514, USA
- Institute for Environmental Health Solutions, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, 27514, USA
- Curriculum in Toxicology and Environmental Medicine, University of North Carolina, Chapel Hill, NC, 27514, USA
| | - Hadley J Hartwell
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, 27514, USA
| | - Karl C K Kuban
- Department of Pediatrics, Division of Pediatric Neurology, Boston University Medical Center, Boston, MA, 02118, USA
| | - Carmen J Marsit
- Gangarosa Department of Environmental Health, Rollins School of Public Health Emory University, Atlanta, GA, 30322, USA
| | - Yun Li
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, 27514, USA
- Department of Genetics, University of North Carolina, Chapel Hill, NC, 27514, USA
- Department of Computer Science, University of North Carolina, Chapel Hill, NC, 27514, USA
| | - T Michael O'Shea
- Department of Pediatrics, School of Medicine, University of North Carolina, Chapel Hill, NC, 27514, USA
| | - Rebecca C Fry
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, 27514, USA.
- Institute for Environmental Health Solutions, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, 27514, USA.
- Curriculum in Toxicology and Environmental Medicine, University of North Carolina, Chapel Hill, NC, 27514, USA.
| | - Hudson P Santos
- Biobehavioral Laboratory, School of Nursing, University of North Carolina, Chapel Hill, NC, 27514, USA.
- Institute for Environmental Health Solutions, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, 27514, USA.
| |
|
14
|
Rajabi F, Jabalameli N, Rezaei N. The Concept of Immunogenetics. Advances in Experimental Medicine and Biology 2022; 1367:1-17. [DOI: 10.1007/978-3-030-92616-8_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
15
|
Lu H, Jiang H, Yang S, Li C, Li C, Shao R, Zhang P, Wang D, Liu Z, Qi H, Cai Y, Xu W, Bao X, Wang H, Li L. Trans-eQTLs of the CYP3A4 and CYP3A5 associated with tacrolimus trough blood concentration in Chinese renal transplant patients. Biomed Pharmacother 2021; 145:112407. [PMID: 34781138 DOI: 10.1016/j.biopha.2021.112407] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/20/2021] [Revised: 09/23/2021] [Accepted: 11/03/2021] [Indexed: 12/15/2022]
Abstract
This study aimed to systematically investigate trans-eQTLs of CYP3A4 and CYP3A5 affecting tacrolimus trough blood concentrations in Chinese renal transplant patients. We used Plink v1.90 to perform data quality control and linear regression analysis on GTEx v8 data. SNPs with p-value < 0.05 were selected and the GTEx eQTL Calculator was used to further prioritize the eQTLs of CYP3A4 and CYP3A5 in the liver and small intestine. The eQTLs with a p-value < 5 × 10⁻⁵ and MAF ≥ 0.05 in the CHB population were selected as candidate eQTLs. The genotyping of candidate eQTLs was performed using high-resolution melting (HRM) assays and Sanger DNA sequencing. This study included 845 Chinese renal transplant patients who received tacrolimus as an immunosuppressive agent. Association between 103 candidate eQTLs and log-transformed tacrolimus concentration/dose ratio (log(C0/D)) in this cohort was tested using the SNPassoc package of R software. In the end, a total of 75,632 liver eQTLs of CYP3A4, 69,558 liver eQTLs of CYP3A5, 48,596 small intestine eQTLs of CYP3A4 and 28,616 small intestine eQTLs of CYP3A5 were obtained using the GTEx v8 eQTL Calculator. Of the 103 candidate eQTLs, rs75727207, rs181294422 and rs28522676 were significantly associated with tacrolimus log(C0/D) in different genetic models. We discovered a substantial number of novel eQTLs of CYP3A4 and CYP3A5 in liver and small intestine, and also found that rs75727207, rs181294422 and rs28522676 may affect tacrolimus trough blood concentrations in Chinese renal transplant patients.
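The candidate-selection step described in this abstract (p-value below 5e-5 and MAF of at least 0.05 in the CHB population) can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the record fields, the function name, and the example values are hypothetical, and only the two thresholds come from the abstract.

```python
from dataclasses import dataclass

# Thresholds quoted in the abstract.
P_VALUE_CUTOFF = 5e-5
MAF_CUTOFF = 0.05


@dataclass(frozen=True)
class EqtlRecord:
    """One SNP-gene association; the field names are illustrative assumptions."""
    snp_id: str
    gene: str
    p_value: float
    maf_chb: float  # minor allele frequency in the CHB population


def select_candidates(records):
    """Keep eQTLs with p-value < 5e-5 and MAF >= 0.05."""
    return [
        r for r in records
        if r.p_value < P_VALUE_CUTOFF and r.maf_chb >= MAF_CUTOFF
    ]


# Made-up example records (not data from the study).
example = [
    EqtlRecord("snp_a", "CYP3A4", 1e-6, 0.12),  # passes both filters
    EqtlRecord("snp_b", "CYP3A5", 1e-3, 0.30),  # p-value too large
    EqtlRecord("snp_c", "CYP3A4", 2e-7, 0.01),  # allele too rare
]
candidates = select_candidates(example)
print([r.snp_id for r in candidates])
```

Applying the frequency filter in the target population before genotyping keeps the candidate list to variants common enough to be informative in the cohort.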
Affiliation(s)
- Huijie Lu
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Haixia Jiang
- Department of Clinical Laboratory, Nanfang Hospital, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Siyao Yang
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Chengcheng Li
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Chuanjiang Li
- Division of Hepatobiliopancreatic Surgery, Department of General Surgery, Nanfang Hospital, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Ruifan Shao
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Pai Zhang
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Daoyi Wang
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Zhiwei Liu
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Huana Qi
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Yinuan Cai
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Wenbin Xu
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Xiaojie Bao
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Hailan Wang
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China
| | - Liang Li
- Department of Medical Genetics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China; Experimental Education and Administration Center, School of Basic Medical Science, Southern Medical University, Guangzhou 510515, China.
| |
|
16
|
Wang J, Ning J, Shete S. Mediation model with a categorical exposure and a censored mediator with application to a genetic study. PLoS One 2021; 16:e0257628. [PMID: 34637449 PMCID: PMC8509986 DOI: 10.1371/journal.pone.0257628] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/23/2021] [Accepted: 09/06/2021] [Indexed: 12/12/2022]
Abstract
Mediation analysis is a statistical method for evaluating the direct and indirect effects of an exposure on an outcome in the presence of a mediator. Mediation models have been widely used to determine direct and indirect contributions of genetic variants in clinical phenotypes. In genetic studies, the additive genetic model is the most commonly used model because it can detect effects from either recessive or dominant models (or any model in between). However, the existing approaches for mediation models cannot be directly applied when the genetic model is additive (e.g. the most commonly used model for SNPs) or categorical (e.g. polymorphic loci), and thus modification to measures of indirect and direct effects is warranted. In this study, we proposed overall measures of indirect, direct, and total effects for a mediation model with a categorical exposure and a censored mediator, which account for the frequency of different values of the categorical exposure. The proposed approach provides the overall contribution of the categorical exposure to the outcome variable. We assessed the empirical performance of the proposed overall measures via simulation studies and applied the measures to evaluate the mediating effect of a woman’s age at menopause on the association between genetic variants and type 2 diabetes.
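For orientation, the classical product-of-coefficients decomposition that mediation analysis builds on can be sketched for the simplest case: a numeric exposure, an uncensored continuous mediator, and linear models. This is not the overall measure proposed in the paper, which handles categorical exposures and censored mediators; the function names and the toy data are illustrative assumptions.

```python
def _centered(values):
    """Subtract the mean so that intercepts drop out of the regressions."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def linear_mediation(x, m, y):
    """Return (indirect, direct, total) effects of x on y through m.

    Fits M ~ X (slope a), Y ~ X + M (slopes c_prime and b), and Y ~ X
    (total slope) by ordinary least squares; the indirect effect is a * b.
    """
    xc, mc, yc = _centered(x), _centered(m), _centered(y)
    sxx, smm, sxm = _dot(xc, xc), _dot(mc, mc), _dot(xc, mc)
    sxy, smy = _dot(xc, yc), _dot(mc, yc)

    a = sxm / sxx
    det = sxx * smm - sxm ** 2  # zero if X and M are perfectly collinear
    c_prime = (smm * sxy - sxm * smy) / det
    b = (sxx * smy - sxm * sxy) / det
    total = sxy / sxx
    return a * b, c_prime, total


# Toy data: x is an additively coded genotype (0, 1, or 2 copies of an allele).
x = [0, 0, 1, 1, 2, 2]
m = [1, -1, 3, 1, 5, 3]            # m = 2*x plus a residual of +1 or -1
y = [0.5 * xi + 3 * mi for xi, mi in zip(x, m)]

indirect, direct, total = linear_mediation(x, m, y)
print(indirect, direct, total)
```

In this linear setting the total effect equals the direct plus the indirect effect, so the proportion mediated is simply `indirect / total`. That identity does not carry over automatically to censored or categorical settings, which is the gap the paper addresses.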
Affiliation(s)
- Jian Wang
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
| | - Jing Ning
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
| | - Sanjay Shete
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
- Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
| |
|
17
|
Zhao J, Han X, Zheng Z, Nogueira L, Lu AD, Nathan PC, Yabroff KR. Racial/Ethnic Disparities in Childhood Cancer Survival in the United States. Cancer Epidemiol Biomarkers Prev 2021; 30:2010-2017. [PMID: 34593561 DOI: 10.1158/1055-9965.epi-21-0117] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 01/26/2021] [Revised: 03/17/2021] [Accepted: 09/01/2021] [Indexed: 11/16/2022]
Abstract
BACKGROUND Non-white patients with childhood cancer have worse survival than Non-Hispanic (NH) White patients for many childhood cancers in the United States. We examined the contribution of socioeconomic status (SES) and health insurance on racial/ethnic disparities in childhood cancer survival. METHODS We used the National Cancer Database to identify NH White, NH Black, Hispanic, and children of other race/ethnicities (<18 years) diagnosed with cancer between 2004 and 2015. SES was measured by the area-level social deprivation index (SDI) at patient residence and categorized into tertiles. Health insurance coverage at diagnosis was categorized as private, Medicaid, and uninsured. Cox proportional hazard models were used to compare survival by race/ethnicity. We examined the contribution of health insurance and SES by sequentially adjusting for demographic and clinical characteristics (age group, sex, region, metropolitan statistical area, year of diagnosis, and number of conditions other than cancer), health insurance, and SDI. RESULTS Compared with NH Whites, NH Blacks and Hispanics had worse survival for all cancers combined, leukemias and lymphomas, brain tumors, and solid tumors (all P < 0.05). Survival differences were attenuated after adjusting for health insurance and SDI separately; and further attenuated after adjusting for insurance and SDI together. CONCLUSIONS Both SES and health insurance contributed to racial/ethnic disparities in childhood cancer survival. IMPACT Improving health insurance coverage and access to care for children, especially those with low SES, may mitigate racial/ethnic survival disparities.
Affiliation(s)
- Jingxuan Zhao
- Surveillance and Health Equity Science, American Cancer Society, Atlanta, Georgia.
| | - Xuesong Han
- Surveillance and Health Equity Science, American Cancer Society, Atlanta, Georgia
| | - Zhiyuan Zheng
- Surveillance and Health Equity Science, American Cancer Society, Atlanta, Georgia
| | - Leticia Nogueira
- Surveillance and Health Equity Science, American Cancer Society, Atlanta, Georgia
| | - Amy D Lu
- The University of Toronto, Toronto, Ontario, Canada
| | - Paul C Nathan
- The University of Toronto, Toronto, Ontario, Canada; The Hospital for Sick Children, Division of Hematology/Oncology, Toronto, Ontario, Canada
| | - K Robin Yabroff
- Surveillance and Health Equity Science, American Cancer Society, Atlanta, Georgia
| |
|
18
|
Kim Y, Lu S, Ho JE, Hwang SJ, Yao C, Huan T, Levy D, Ma J. Proteins as Mediators of the Association Between Diet Quality and Incident Cardiovascular Disease and All-Cause Mortality: The Framingham Heart Study. J Am Heart Assoc 2021; 10:e021245. [PMID: 34482708 PMCID: PMC8649513 DOI: 10.1161/jaha.121.021245] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/16/2022]
Abstract
Background Biological mechanisms underlying the association of a healthy diet with chronic diseases remain unclear. Targeted proteomics may facilitate the understanding of mechanisms linking diet to chronic diseases. Methods and Results We examined 6360 participants (mean age 50 years; 54% women) in the Framingham Heart Study. The associations between diet and 71 cardiovascular disease (CVD)‐related proteins were examined using 3 diet quality scores: the Alternate Healthy Eating Index, the modified Mediterranean‐style Diet Score, and the modified Dietary Approaches to Stop Hypertension diet score. A mediation analysis was conducted to examine which proteins mediated the associations of diet with incident CVD and all‐cause mortality. Thirty of the 71 proteins were associated with at least 1 diet quality score (P<0.0007) after adjustment for multiple covariates in all study participants and confirmed by an internal validation analysis. Gene ontology analysis identified inflammation‐related pathways such as regulation of cell killing and neuroinflammatory response (Bonferroni corrected P<0.05). During a median follow‐up of 13 years, we documented 512 deaths and 488 incident CVD events. Higher diet quality scores were associated with lower risk of CVD (P≤0.03) and mortality (P≤0.004). After adjusting for multiple potential confounders, 4 proteins (B2M [beta‐2‐microglobulin], GDF15 [growth differentiation factor 15], sICAM1 [soluble intercellular adhesion molecule 1], and UCMGP [uncarboxylated matrix Gla‐protein]) mediated the association between at least 1 diet quality score and all‐cause mortality (median proportion of mediation ranged from 8.6% to 25.9%). We also observed that GDF15 mediated the association of the Alternate Healthy Eating Index with CVD (median proportion of mediation: 8.6%). Conclusions Diet quality is associated with new‐onset CVD and mortality and with circulating CVD‐related proteins. Several proteins appear to mediate the association of diet with these outcomes.
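The threshold P<0.0007 quoted in this abstract is numerically consistent with a Bonferroni correction of a 0.05 significance level across the 71 proteins tested. The abstract does not state how the threshold was derived, so this reading is an assumption; the sketch below only checks the arithmetic, and the function name is illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumed inputs: a 0.05 family-wise level and the 71 proteins from the abstract.
threshold = bonferroni_threshold(0.05, 71)
print(f"{threshold:.4f}")  # prints 0.0007
```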
Affiliation(s)
- Youjin Kim
- Nutrition Epidemiology and Data Science, Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA
| | - Sophia Lu
- Health Sciences, Sargent College, Boston University, Boston, MA
| | - Jennifer E Ho
- Division of Cardiology, Department of Medicine, and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
| | - Shih-Jen Hwang
- Population Sciences Branch, National Heart, Lung, and Blood Institute, NIH, Bethesda, MD; Framingham Heart Study, Framingham, MA
| | - Chen Yao
- Population Sciences Branch, National Heart, Lung, and Blood Institute, NIH, Bethesda, MD; Framingham Heart Study, Framingham, MA
| | - Tianxiao Huan
- Department of Ophthalmology and Visual Sciences, University of Massachusetts Medical School, Worcester, MA
| | - Daniel Levy
- Population Sciences Branch, National Heart, Lung, and Blood Institute, NIH, Bethesda, MD; Framingham Heart Study, Framingham, MA
| | - Jiantao Ma
- Nutrition Epidemiology and Data Science, Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA
| |
|
19
|
Zhou X, Cai X. Joint eQTL mapping and inference of gene regulatory network improves power of detecting both cis- and trans-eQTLs. Bioinformatics 2021; 38:149-156. [PMID: 34487140 PMCID: PMC8696109 DOI: 10.1093/bioinformatics/btab609] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/23/2020] [Revised: 07/19/2021] [Accepted: 08/25/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION Genetic variations of expression quantitative trait loci (eQTLs) play a critical role in influencing complex traits and disease development. Two main factors that affect the statistical power of detecting eQTLs are: (i) the relatively small sample sizes available, and (ii) the heavy burden of multiple testing due to a very large number of variants to be tested. The latter issue is particularly severe when one tries to identify trans-eQTLs that are far away from the genes they influence. If one can exploit co-expressed genes jointly in eQTL mapping, effective sample size can be increased. Furthermore, using the structure of the gene regulatory network (GRN) may help to identify trans-eQTLs without increasing multiple testing burden. RESULTS In this article, we use the structural equation model (SEM) to model both GRN and effect of eQTLs on gene expression, and then develop a novel algorithm, named sparse SEM for eQTL mapping (SSEMQ), to conduct joint eQTL mapping and GRN inference. The SEM can exploit co-expressed genes jointly in eQTL mapping and also use GRN to determine trans-eQTLs. Computer simulations demonstrate that our SSEMQ significantly outperforms nine existing eQTL mapping methods. SSEMQ is further used to analyze two real datasets of human breast and whole blood tissues, yielding a number of cis- and trans-eQTLs. AVAILABILITY AND IMPLEMENTATION R package ssemQr is available at https://github.com/Ivis4ml/ssemQr.git. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xin Zhou
- Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL 33146, USA
| | | |
|
20
|
Guo X, Song Y, Liu S, Gao M, Qi Y, Shang X. Linking genotype to phenotype in multi-omics data of small sample. BMC Genomics 2021; 22:537. [PMID: 34256701 PMCID: PMC8278664 DOI: 10.1186/s12864-021-07867-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/24/2021] [Accepted: 06/30/2021] [Indexed: 12/03/2022]
Abstract
BACKGROUND Genome-wide association studies (GWAS) that link genotype to phenotype represent an effective means to associate an individual genetic background with a disease or trait. However, single-omics data provide only limited information on biological mechanisms, and it is necessary to improve the accuracy of predicting the biological association between genotype and phenotype by integrating multi-omics data. Typically, gene expression data are integrated to analyze the effect of single nucleotide polymorphisms (SNPs) on phenotype. Such multi-omics data integration mainly follows two approaches: multi-staged analysis and meta-dimensional analysis, which respectively ignore intra-omics and inter-omics associations. Moreover, both approaches require omics data from a single sample set, and the large feature set of SNPs necessitates a large sample size for model establishment, but it is difficult to obtain multi-omics data from a single, large sample set. RESULTS To address this problem, we propose a method of genotype-phenotype association based on multi-omics data from small samples. The workflow of this method includes clustering genes using a protein-protein interaction network and gene expression data, screening gene clusters with group lasso, obtaining SNP clusters corresponding to the selected gene clusters through expression quantitative trait locus data, integrating SNP clusters and corresponding gene clusters and phenotypes into three-layer network blocks, analyzing and predicting based on each block, and obtaining the final prediction by averaging the per-block predictions. CONCLUSIONS We compare this method to others using two datasets and find that our method shows better results in both cases. Our method can effectively address the prediction problem for small-sample multi-omics data and provides a valuable resource for further studies on the fusion of more omics data.
Affiliation(s)
- Xinpeng Guo
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, People's Republic of China
  - School of Air and Missile Defense, Air Force Engineering University, Xi'an, 710051, People's Republic of China
- Yafei Song
  - School of Air and Missile Defense, Air Force Engineering University, Xi'an, 710051, People's Republic of China
- Shuhui Liu
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, People's Republic of China
- Meihong Gao
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, People's Republic of China
- Yang Qi
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, People's Republic of China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, People's Republic of China
21
Zeng P, Shao Z, Zhou X. Statistical methods for mediation analysis in the era of high-throughput genomics: Current successes and future challenges. Comput Struct Biotechnol J 2021; 19:3209-3224. [PMID: 34141140 PMCID: PMC8187160 DOI: 10.1016/j.csbj.2021.05.042] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 02/10/2021] [Revised: 05/21/2021] [Accepted: 05/21/2021] [Indexed: 12/12/2022]
Abstract
Mediation analysis investigates the intermediate mechanism through which an exposure exerts its influence on the outcome of interest. Mediation analysis is becoming increasingly popular in high-throughput genomics studies where a common goal is to identify molecular-level traits, such as gene expression or methylation, which actively mediate the genetic or environmental effects on the outcome. Mediation analysis in genomics studies is particularly challenging, however, owing to the large number of potential mediators measured in these studies as well as the composite null nature of the mediation effect hypothesis. Indeed, while the standard univariate and multivariate mediation methods have been well-established for analyzing one or multiple mediators, they are not well-suited for genomics studies with a large number of mediators and often yield conservative p-values and limited power. Consequently, over the past few years many new high-dimensional mediation methods have been developed for analyzing the large number of potential mediators collected in high-throughput genomics studies. In this work, we present a thorough review of these important recent methodological advances in high-dimensional mediation analysis. Specifically, we describe in detail more than ten high-dimensional mediation methods, focusing on their motivations, basic modeling ideas, specific modeling assumptions, practical successes, methodological limitations, as well as future directions. We hope our review will serve as useful guidance for statisticians and computational biologists who develop methods of high-dimensional mediation analysis as well as for analysts who apply mediation methods to high-throughput genomics studies.
Affiliation(s)
- Ping Zeng
  - Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
  - Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
- Zhonghe Shao
  - Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
22
Bhattacharya A, Hamilton AM, Troester MA, Love MI. DeCompress: tissue compartment deconvolution of targeted mRNA expression panels using compressed sensing. Nucleic Acids Res 2021; 49:e48. [PMID: 33524140 PMCID: PMC8096278 DOI: 10.1093/nar/gkab031] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/14/2020] [Revised: 12/21/2020] [Accepted: 01/12/2021] [Indexed: 12/13/2022]
Abstract
Targeted mRNA expression panels, measuring up to 800 genes, are used in academic and clinical settings due to low cost and high sensitivity for archived samples. Most samples assayed on targeted panels originate from bulk tissue comprised of many cell types, and cell-type heterogeneity confounds biological signals. Reference-free methods are used when cell-type-specific expression references are unavailable, but limited feature spaces render implementation challenging in targeted panels. Here, we present DeCompress, a semi-reference-free deconvolution method for targeted panels. DeCompress leverages a reference RNA-seq or microarray dataset from similar tissue to expand the feature space of targeted panels using compressed sensing. Ensemble reference-free deconvolution is performed on this artificially expanded dataset to estimate cell-type proportions and gene signatures. In simulated mixtures, four public cell line mixtures, and a targeted panel (1199 samples; 406 genes) from the Carolina Breast Cancer Study, DeCompress recapitulates cell-type proportions with less error than reference-free methods and finds biologically relevant compartments. We integrate compartment estimates into cis-eQTL mapping in breast cancer, identifying a tumor-specific cis-eQTL for CCR3 (C-C Motif Chemokine Receptor 3) at a risk locus. DeCompress improves upon reference-free methods without requiring expression profiles from pure cell populations, with applications in genomic analyses and clinical settings.
Affiliation(s)
- Arjun Bhattacharya
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California-Los Angeles, Los Angeles, CA 90095, USA
- Alina M Hamilton
  - Department of Pathology and Laboratory Medicine, University of North Carolina-Chapel Hill, Chapel Hill, NC 27516, USA
- Melissa A Troester
  - Department of Pathology and Laboratory Medicine, University of North Carolina-Chapel Hill, Chapel Hill, NC 27516, USA
  - Department of Epidemiology, University of North Carolina-Chapel Hill, Chapel Hill, NC 27516, USA
- Michael I Love
  - Department of Biostatistics, University of North Carolina-Chapel Hill, Chapel Hill, NC 27516, USA
  - Department of Genetics, University of North Carolina-Chapel Hill, Chapel Hill, NC 27516, USA
23
Banerjee S, Simonetti FL, Detrois KE, Kaphle A, Mitra R, Nagial R, Söding J. Tejaas: reverse regression increases power for detecting trans-eQTLs. Genome Biol 2021; 22:142. [PMID: 33957961 PMCID: PMC8101255 DOI: 10.1186/s13059-021-02361-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/09/2020] [Accepted: 04/22/2021] [Indexed: 12/18/2022]
Abstract
Trans-acting expression quantitative trait loci (trans-eQTLs) account for ≥70% expression heritability and could therefore facilitate uncovering mechanisms underlying the origination of complex diseases. Identifying trans-eQTLs is challenging because of small effect sizes, tissue specificity, and a severe multiple-testing burden. Tejaas predicts trans-eQTLs by performing L2-regularized “reverse” multiple regression of each SNP on all genes, aggregating evidence from many small trans-effects while being unaffected by the strong expression correlations. Combined with a novel unsupervised k-nearest neighbor method to remove confounders, Tejaas predicts 18851 unique trans-eQTLs across 49 tissues from GTEx. They are enriched in open chromatin, enhancers, and other regulatory regions. Many overlap with disease-associated SNPs, pointing to tissue-specific transcriptional regulation mechanisms.
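The "reverse" regression idea in this abstract can be sketched as follows. This is an illustrative toy, not the Tejaas implementation: the function name `reverse_ridge_score`, the score definition (fraction of genotype variance captured by the ridge fit), the permutation null, and all simulated values are assumptions made here. It regresses one SNP's genotype on all gene expression levels at once with an L2 penalty, so many small trans effects can add up to a single test per SNP.

```python
import numpy as np

def reverse_ridge_score(x, Y, lam):
    """Fraction of genotype variance captured by ridge-regressing x on Y.

    x   : (n,) centered genotype vector for one SNP
    Y   : (n, g) centered expression matrix (samples by genes)
    lam : L2 penalty strength (hypothetical tuning value)
    """
    g = Y.shape[1]
    # Ridge solution: beta = (Y^T Y + lam * I)^{-1} Y^T x
    beta = np.linalg.solve(Y.T @ Y + lam * np.eye(g), Y.T @ x)
    fitted = Y @ beta
    return float(fitted @ fitted / (x @ x))

rng = np.random.default_rng(1)
n, g = 300, 100

# Simulated SNP (0/1/2 coding) with small effects on 30 of 100 genes.
x = rng.binomial(2, 0.3, size=n).astype(float)
x -= x.mean()
effects = np.zeros(g)
effects[:30] = 0.3
Y = np.outer(x, effects) + rng.standard_normal((n, g))
Y -= Y.mean(axis=0)

lam = 10.0
observed = reverse_ridge_score(x, Y, lam)

# Permuting the genotype breaks any SNP-expression link and gives a null.
null_scores = np.array([
    reverse_ridge_score(rng.permutation(x), Y, lam) for _ in range(200)
])
p_value = (1 + np.sum(null_scores >= observed)) / (1 + len(null_scores))
print(observed, null_scores.mean(), p_value)
```

Each individual effect in this simulation is small, but the joint fit aggregates them into one statistic per SNP, which is the property the abstract attributes to reverse regression.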
Affiliation(s)
- Saikat Banerjee
  - Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany
- Franco L Simonetti
  - Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany
- Kira E Detrois
  - Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany
  - Georg-August University, Göttingen, 37075, Germany
- Anubhav Kaphle
  - Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany
  - Georg-August University, Göttingen, 37075, Germany
- Johannes Söding
  - Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany
  - Campus-Institut Data Science (CIDAS), University of Göttingen, Göttingen, 37073, Germany
  - Cluster of Excellence "Multiscale Bioimaging" (MBExC), University of Göttingen, Göttingen, 37075, Germany
24
Bhattacharya A, Li Y, Love MI. MOSTWAS: Multi-Omic Strategies for Transcriptome-Wide Association Studies. PLoS Genet 2021; 17:e1009398. [PMID: 33684137 PMCID: PMC7971899 DOI: 10.1371/journal.pgen.1009398] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 06/27/2020] [Revised: 03/18/2021] [Accepted: 02/04/2021] [Indexed: 02/06/2023]
Abstract
Traditional predictive models for transcriptome-wide association studies (TWAS) consider only single nucleotide polymorphisms (SNPs) local to genes of interest and perform parameter shrinkage with a regularization process. These approaches ignore the effect of distal-SNPs or other molecular effects underlying the SNP-gene association. Here, we outline multi-omics strategies for transcriptome imputation from germline genetics to allow more powerful testing of gene-trait associations by prioritizing distal-SNPs to the gene of interest. In one extension, we identify mediating biomarkers (CpG sites, microRNAs, and transcription factors) highly associated with gene expression and train predictive models for these mediators using their local SNPs. Imputed values for mediators are then incorporated into the final predictive model of gene expression, along with local SNPs. In the second extension, we assess distal-eQTLs (SNPs associated with genes not in a local window around it) for their mediation effect through mediating biomarkers local to these distal-eSNPs. Distal-eSNPs with large indirect mediation effects are then included in the transcriptomic prediction model with the local SNPs around the gene of interest. Using simulations and real data from ROS/MAP brain tissue and TCGA breast tumors, we show considerable gains of percent variance explained (1-2% additive increase) of gene expression and TWAS power to detect gene-trait associations. This integrative approach to transcriptome-wide imputation and association studies aids in identifying the complex interactions underlying genetic regulation within a tissue and important risk genes for various traits and disorders.
Affiliation(s)
- Arjun Bhattacharya
  - Department of Pathology and Laboratory Medicine, University of California-Los Angeles, Los Angeles, California, United States of America
- Yun Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Michael I. Love
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
25
Xie Y, Shan N, Zhao H, Hou L. Transcriptome wide association studies: general framework and methods. QUANTITATIVE BIOLOGY 2021. [DOI: 10.15302/j-qb-020-0228] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
26
Bhattacharya A, García-Closas M, Olshan AF, Perou CM, Troester MA, Love MI. A framework for transcriptome-wide association studies in breast cancer in diverse study populations. Genome Biol 2020; 21:42. [PMID: 32079541 PMCID: PMC7033948 DOI: 10.1186/s13059-020-1942-6] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 10/04/2019] [Accepted: 01/21/2020] [Indexed: 12/18/2022]
Abstract
BACKGROUND The relationship between germline genetic variation and breast cancer survival is largely unknown, especially in understudied minority populations who often have poorer survival. Genome-wide association studies (GWAS) have interrogated breast cancer survival but often are underpowered due to subtype heterogeneity and clinical covariates and detect loci in non-coding regions that are difficult to interpret. Transcriptome-wide association studies (TWAS) show increased power in detecting functionally relevant loci by leveraging expression quantitative trait loci (eQTLs) from external reference panels in relevant tissues. However, ancestry- or race-specific reference panels may be needed to draw correct inference in ancestrally diverse cohorts. Such panels for breast cancer are lacking. RESULTS We provide a framework for TWAS for breast cancer in diverse populations, using data from the Carolina Breast Cancer Study (CBCS), a population-based cohort that oversampled black women. We perform eQTL analysis for 406 breast cancer-related genes to train race-stratified predictive models of tumor expression from germline genotypes. Using these models, we impute expression in independent data from CBCS and TCGA, accounting for sampling variability in assessing performance. These models are not applicable across race, and their predictive performance varies across tumor subtype. Within CBCS (N = 3,828), at a false discovery-adjusted significance of 0.10 and stratifying for race, we identify associations in black women near AURKA, CAPN13, PIK3CA, and SERPINB5 via TWAS that are underpowered in GWAS. CONCLUSIONS We show that carefully implemented and thoroughly validated TWAS is an efficient approach for understanding the genetics underpinning breast cancer outcomes in diverse populations.
Affiliation(s)
- Arjun Bhattacharya
  - Department of Biostatistics, University of North Carolina-Chapel Hill, Chapel Hill, USA
- Montserrat García-Closas
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, USA
  - Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
- Andrew F. Olshan
  - Department of Epidemiology, University of North Carolina-Chapel Hill, Chapel Hill, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina-Chapel Hill, Chapel Hill, USA
- Charles M. Perou
  - Lineberger Comprehensive Cancer Center, University of North Carolina-Chapel Hill, Chapel Hill, USA
  - Department of Genetics, University of North Carolina-Chapel Hill, Chapel Hill, USA
  - Department of Pathology and Laboratory Medicine, University of North Carolina-Chapel Hill, Chapel Hill, USA
- Melissa A. Troester
  - Department of Epidemiology, University of North Carolina-Chapel Hill, Chapel Hill, USA
  - Department of Pathology and Laboratory Medicine, University of North Carolina-Chapel Hill, Chapel Hill, USA
- Michael I. Love
  - Department of Biostatistics, University of North Carolina-Chapel Hill, Chapel Hill, USA
  - Department of Genetics, University of North Carolina-Chapel Hill, Chapel Hill, USA
27
DNA variants affecting the expression of numerous genes in trans have diverse mechanisms of action and evolutionary histories. PLoS Genet 2019; 15:e1008375. [PMID: 31738765 PMCID: PMC6886874 DOI: 10.1371/journal.pgen.1008375] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 08/12/2019] [Revised: 12/02/2019] [Accepted: 10/28/2019] [Indexed: 12/13/2022]
Abstract
DNA variants that alter gene expression contribute to variation in many phenotypic traits. In particular, trans-acting variants, which are often located on different chromosomes from the genes they affect, are an important source of heritable gene expression variation. However, our knowledge about the identity and mechanism of causal trans-acting variants remains limited. Here, we developed a fine-mapping strategy called CRISPR-Swap and dissected three expression quantitative trait locus (eQTL) hotspots known to alter the expression of numerous genes in trans in the yeast Saccharomyces cerevisiae. Causal variants were identified by engineering recombinant alleles and quantifying the effects of these alleles on the expression of a green fluorescent protein-tagged gene affected by the given locus in trans. We validated the effect of each variant on the expression of multiple genes by RNA-sequencing. The three variants differed in their molecular mechanism, the type of genes they reside in, and their distribution in natural populations. While a missense leucine-to-serine variant at position 63 in the transcription factor Oaf1 (L63S) was almost exclusively present in the reference laboratory strain, the two other variants were frequent among S. cerevisiae isolates. A causal missense variant in the glucose receptor Rgt2 (V539I) occurred at a poorly conserved amino acid residue and its effect was strongly dependent on the concentration of glucose in the culture medium. A noncoding variant in the conserved fatty acid regulated (FAR) element of the OLE1 promoter influenced the expression of the fatty acid desaturase Ole1 in cis and, by modulating the level of this essential enzyme, other genes in trans. The OAF1 and OLE1 variants showed a non-additive genetic interaction, and affected cellular lipid metabolism. 
These results demonstrate that the molecular basis of trans-regulatory variation is diverse, highlighting the challenges in predicting which natural genetic variants affect gene expression.
Author summary
Differences in the DNA sequence of individual genomes contribute to differences in many traits, such as appearance, physiology, and the risk for common diseases. An important group of these DNA variants influences how individual genes across the genome are turned on or off. In this paper, we describe a strategy for identifying such “trans-acting” variants in different strains of baker’s yeast. We used this strategy to reveal three single DNA base changes that each influences the expression of dozens of genes. These three DNA variants were very different from each other. Two of them changed the protein sequence, one in a transcription factor and the other in a sugar sensor. The third changed the expression of an enzyme, a change that in turn caused other genes to alter their expression. One variant existed in only a few yeast isolates, while the other two existed in many isolates collected from around the world. This diversity of DNA variants that influence the expression of many other genes illustrates how difficult it is to predict which DNA variants in an individual’s genome will have effects on the organism.