1
|
Wang G, Zhang H, Shao M, Tian M, Feng H, Li Q, Cao C. Optimal variable identification for accurate detection of causal expression Quantitative Trait Loci with applications in heart-related diseases. Comput Struct Biotechnol J 2024; 23:2478-2486. [PMID: 38952424 PMCID: PMC11215961 DOI: 10.1016/j.csbj.2024.05.050] [Received: 02/01/2024] [Revised: 05/31/2024] [Accepted: 05/31/2024] [Indexed: 07/03/2024] Open
Abstract
Gene expression plays a pivotal role in various diseases, contributing significantly to their mechanisms. Most GWAS risk loci are in non-coding regions, potentially affecting disease risk by altering gene expression in specific tissues. This expression is notably tissue-specific, with genetic variants substantially influencing it. However, accurately detecting expression Quantitative Trait Loci (eQTL) is challenging due to limited heritability in gene expression, extensive linkage disequilibrium (LD), and multiple causal variants. The single-variant association approach in eQTL analysis is limited by its inability to capture the combined effects of multiple variants and by a bias towards common variants, underscoring the need for a more robust method to accurately identify causal eQTL variants. To address this, we developed an algorithm, CausalEQTL, which integrates L0+L1 penalized regression with an ensemble approach to localize eQTL precisely, thereby enhancing prediction performance. Our results demonstrate that CausalEQTL outperforms traditional models, including LASSO, Elastic Net, and Ridge, in terms of power and overall performance. Furthermore, analysis of heart tissue data from the GTEx project revealed that eQTL sites identified by our algorithm provide deeper insights into heart-related tissue eQTL detection. This advancement in eQTL mapping promises to improve our understanding of the genetic basis of tissue-specific gene expression and its implications in disease. The source code and identified causal eQTLs for CausalEQTL are available on GitHub: https://github.com/zhc-moushang/CausalEQTL.
Affiliation(s)
- Guishen Wang
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China
| | - Hangchen Zhang
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China
| | - Mengting Shao
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
| | - Min Tian
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
| | - Hui Feng
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China
| | - Qiaoling Li
- Department of Cardiology, Affiliated Drum Tower Hospital, Medical School of Nanjing University, Nanjing 210008, China
| | - Chen Cao
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
| |
|
2
|
He J, Perera D, Wen W, Ping J, Li Q, Lyu L, Chen Z, Shu X, Long J, Cai Q, Shu XO, Zheng W, Long Q, Guo X. Enhancing Disease Risk Gene Discovery by Integrating Transcription Factor-Linked Trans-located Variants into Transcriptome-Wide Association Analyses. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2023.10.10.23295443. [PMID: 37873299 PMCID: PMC10593059 DOI: 10.1101/2023.10.10.23295443] [Indexed: 10/25/2023]
Abstract
Transcriptome-wide association studies (TWAS) have been successful in identifying disease susceptibility genes by integrating gene expression predicted from cis-variants with genome-wide association studies (GWAS) data. However, trans-located variants for predicting gene expression remain largely unexplored. Here, we introduce transTF-TWAS, which incorporates transcription factor (TF)-linked trans-located variants to enhance model building. Using data from the Genotype-Tissue Expression project, we predict gene expression and alternative splicing and apply these models to large GWAS datasets for breast, prostate, and lung cancers. We demonstrate that transTF-TWAS outperforms other existing TWAS approaches in both constructing gene prediction models and identifying disease-associated genes, as evidenced by simulations and real data analysis. Our transTF-TWAS approach significantly contributes to the discovery of disease risk genes. Findings from this study shed new light on several genetically driven key regulators and their associated regulatory networks underlying disease susceptibility.
|
3
|
Li JL, McClellan JC, Zhang H, Gao G, Huo D. Multi-tissue transcriptome-wide association studies identified 235 genes for intrinsic subtypes of breast cancer. J Natl Cancer Inst 2024; 116:1105-1115. [PMID: 38400758 PMCID: PMC11223833 DOI: 10.1093/jnci/djae041] [Received: 09/22/2023] [Revised: 01/25/2024] [Accepted: 02/20/2024] [Indexed: 02/26/2024] Open
Abstract
BACKGROUND Although genome-wide association studies (GWAS) of breast cancer (BC) identified common variants which differ between intrinsic subtypes, genes through which these variants act to impact BC risk have not been fully established. Transcriptome-wide association studies (TWAS) have identified genes associated with overall BC risk, but subtype-specific differences are largely unknown. METHODS We performed two multi-tissue TWAS for each BC intrinsic subtype, including an expression-based approach that collated TWAS signals from expression quantitative trait loci (eQTLs) across multiple tissues and a novel splicing-based approach that collated signals from splicing QTLs (sQTLs) across intron clusters and subsequently across tissues. We used summary statistics for five intrinsic subtypes including Luminal A-like, Luminal B-like, Luminal B/HER2-negative-like, HER2-enriched-like, and triple-negative BC, generated from 106 278 BC cases and 91 477 controls in the Breast Cancer Association Consortium. RESULTS Overall, we identified 235 genes in 88 loci that were associated with at least one of the five intrinsic subtypes. Most genes were subtype-specific, and many have not been reported in previous TWAS. We discovered that common variants that modulate expression of CHEK2 confer increased risk of Luminal A-like BC, in contrast to the viewpoint that CHEK2 primarily harbors rare, penetrant mutations. Additionally, our splicing-based TWAS provided population-level support for MDM4 splice variants that increased the risk of triple-negative BC. CONCLUSION Our comprehensive, multi-tissue TWAS corroborated previous GWAS loci for overall BC risk and intrinsic subtypes, while underscoring how common variation that impacts expression and splicing of genes in multiple tissue types can be used to further elucidate the etiology of BC.
Affiliation(s)
- James L Li
- Department of Public Health Sciences, University of Chicago, Chicago, IL, USA
| | - Julian C McClellan
- Department of Public Health Sciences, University of Chicago, Chicago, IL, USA
| | - Haoyu Zhang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
| | - Guimin Gao
- Department of Public Health Sciences, University of Chicago, Chicago, IL, USA
| | - Dezheng Huo
- Department of Public Health Sciences, University of Chicago, Chicago, IL, USA
- Department of Medicine, Section of Hematology and Oncology, University of Chicago, IL, USA
| |
|
4
|
Gao G, McClellan J, Barbeira AN, Fiorica PN, Li JL, Mu Z, Olopade OI, Huo D, Im HK. A multi-tissue, splicing-based joint transcriptome-wide association study identifies susceptibility genes for breast cancer. Am J Hum Genet 2024; 111:1100-1113. [PMID: 38733992 PMCID: PMC11179262 DOI: 10.1016/j.ajhg.2024.04.010] [Received: 10/09/2023] [Revised: 04/13/2024] [Accepted: 04/15/2024] [Indexed: 05/13/2024] Open
Abstract
Splicing-based transcriptome-wide association studies (splicing-TWASs) of breast cancer have the potential to identify susceptibility genes. However, existing splicing-TWASs test the association of individual excised introns in breast tissue only and thus have limited power to detect susceptibility genes. In this study, we performed a multi-tissue joint splicing-TWAS that integrated splicing-TWAS signals of multiple excised introns in each gene across 11 tissues that are potentially relevant to breast cancer risk. We utilized summary statistics from a meta-analysis that combined genome-wide association study (GWAS) results of 424,650 women of European ancestry. Splicing-level prediction models were trained in GTEx (v.8) data. We identified 240 genes by the multi-tissue joint splicing-TWAS at the Bonferroni-corrected significance level; in the tissue-specific splicing-TWAS that combined TWAS signals of excised introns in genes in breast tissue only, we identified nine additional significant genes. Of these 249 genes, 88 genes in 62 loci have not been reported by previous TWASs, and 17 genes in seven loci are at least 1 Mb away from published GWAS index variants. By comparing the results of our splicing-TWASs with previous gene-expression-based TWASs that used the same summary statistics and expression prediction models trained in the same reference panel, we found that 110 genes in 70 loci were identified only by the splicing-TWASs. Our results showed that for many genes, expression quantitative trait loci (eQTL) did not show a significant impact on breast cancer risk, whereas splicing quantitative trait loci (sQTL) showed a strong impact through intron excision events.
Affiliation(s)
- Guimin Gao
- Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
| | - Julian McClellan
- Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
| | - Alvaro N Barbeira
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
| | - Peter N Fiorica
- Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
| | - James L Li
- Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
| | - Zepeng Mu
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
| | - Olufunmilayo I Olopade
- Section of Hematology and Oncology, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
| | - Dezheng Huo
- Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA; Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL 60637, USA.
| | - Hae Kyung Im
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL 60637, USA.
| |
|
5
|
Li Q, Song Q, Chen Z, Choi J, Moreno V, Ping J, Wen W, Li C, Shu X, Yan J, Shu XO, Cai Q, Long J, Huyghe JR, Pai R, Gruber SB, Casey G, Wang X, Toriola AT, Li L, Singh B, Lau KS, Zhou L, Wu C, Peters U, Zheng W, Long Q, Yin Z, Guo X. Large-scale integration of omics and electronic health records to identify potential risk protein biomarkers and therapeutic drugs for cancer prevention and intervention. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.05.29.24308170. [PMID: 38853880 PMCID: PMC11160851 DOI: 10.1101/2024.05.29.24308170] [Indexed: 06/11/2024]
Abstract
Identifying risk protein targets and their therapeutic drugs is crucial for effective cancer prevention. Here, we conduct integrative and fine-mapping analyses of large genome-wide association studies data for breast, colorectal, lung, ovarian, pancreatic, and prostate cancers, and characterize 710 lead variants independently associated with cancer risk. Through mapping protein quantitative trait loci (pQTL) for these variants using plasma proteomics data from over 75,000 participants, we identify 365 proteins associated with cancer risk. Subsequent colocalization analysis identifies 101 proteins, including 74 not reported in previous studies. We further characterize 36 potential druggable proteins for cancers or other disease indications. Analyzing >3.5 million electronic health records, we uncover five drugs (Haloperidol, Trazodone, Tranexamic Acid, Haloperidol, and Captopril) associated with increased cancer risk and two drugs (Caffeine and Acetazolamide) linked to reduced colorectal cancer risk. This study offers novel insights into therapeutic drugs targeting risk proteins for cancer prevention and intervention.
|
6
|
He J, Li Q, Zhang Q. rvTWAS: identifying gene-trait association using sequences by utilizing transcriptome-directed feature selection. Genetics 2024; 226:iyad204. [PMID: 38001381 DOI: 10.1093/genetics/iyad204] [Received: 10/20/2023] [Revised: 11/14/2023] [Accepted: 11/16/2023] [Indexed: 11/26/2023] Open
Abstract
Toward the identification of the genetic basis of complex traits, transcriptome-wide association study (TWAS) has been successful in integrating transcriptome data. However, TWAS is only applicable to common variants, excluding rare variants in exome or whole-genome sequences. This is partly because of the inherent limitation of TWAS protocols that rely on predicting gene expression. Our previous research revealed an insight into TWAS: the 2 steps in TWAS, building and applying the expression prediction models, are essentially genetic feature selection and aggregation that do not have to involve predictions. Based on this insight disentangling TWAS, the inability of rare variants to predict expression traits is no longer an obstacle. Herein, we developed "rare variant TWAS," or rvTWAS, that first uses a Bayesian model to conduct expression-directed feature selection and then uses a kernel machine to carry out feature aggregation, forming a model leveraging expression for association mapping including rare variants. We demonstrated the performance of rvTWAS by thorough simulations and real data analysis in 3 psychiatric disorders, namely schizophrenia, bipolar disorder, and autism spectrum disorder. We confirmed that rvTWAS outperforms existing TWAS protocols and revealed additional genes underlying psychiatric disorders. In particular, we formed a hypothetical mechanism in which zinc finger genes impact all 3 disorders through transcriptional regulation. rvTWAS will open a door for sequence-based association mapping that integrates gene expression.
Affiliation(s)
- Jingni He
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
| | - Qing Li
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
| | - Qingrun Zhang
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary T2N 1N4, Canada
- Arnie Charbonneau Cancer Institute, University of Calgary, Calgary T2N 1N4, Canada
| |
|
7
|
Chen Z, Song W, Shu XO, Wen W, Devall M, Dampier C, Moratalla-Navarro F, Cai Q, Long J, Van Kaer L, Wu L, Huyghe JR, Thomas M, Hsu L, Woods MO, Albanes D, Buchanan DD, Gsur A, Hoffmeister M, Vodicka P, Wolk A, Marchand LL, Wu AH, Phipps AI, Moreno V, Ulrike P, Zheng W, Casey G, Guo X. Novel insights into genetic susceptibility for colorectal cancer from transcriptome-wide association and functional investigation. J Natl Cancer Inst 2024; 116:127-137. [PMID: 37632791 PMCID: PMC10777674 DOI: 10.1093/jnci/djad178] [Received: 03/29/2023] [Revised: 07/10/2023] [Accepted: 08/19/2023] [Indexed: 08/28/2023] Open
Abstract
BACKGROUND Transcriptome-wide association studies have been successful in identifying candidate susceptibility genes for colorectal cancer (CRC). To strengthen susceptibility gene discovery, we conducted a large transcriptome-wide association study and an alternative splicing transcriptome-wide association study in CRC using improved genetic prediction models and performed in-depth functional investigations. METHODS We analyzed RNA-sequencing data from normal colon tissues and genotype data from 423 European descendants to build genetic prediction models of gene expression and alternative splicing and evaluated model performance using independent RNA-sequencing data from normal colon tissues of the Genotype-Tissue Expression Project. We applied the verified models to genome-wide association studies (GWAS) summary statistics among 58 131 CRC cases and 67 347 controls of European ancestry to evaluate associations of genetically predicted gene expression and alternative splicing with CRC risk. We performed in vitro functional assays for 3 selected genes in multiple CRC cell lines. RESULTS We identified 57 putative CRC susceptibility genes, which included 48 genes from transcriptome-wide association studies and 15 genes from splicing transcriptome-wide association studies, at a Bonferroni-corrected P value less than .05. Of these, 16 genes were not previously implicated in CRC susceptibility, including the gene PDE7B (6q23.3) at a locus not previously reported by CRC GWAS. Gene knockdown experiments confirmed the oncogenic roles of 2 unreported genes, TRPS1 and METRNL, and a recently reported gene, C14orf166. CONCLUSION This study discovered new putative susceptibility genes of CRC and provided novel insights into the biological mechanisms underlying CRC development.
Affiliation(s)
- Zhishan Chen
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Wenqiang Song
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
- Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Xiao-Ou Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Wanqing Wen
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Matthew Devall
- Department of Public Health Sciences, Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Christopher Dampier
- Department of Public Health Sciences, Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Ferran Moratalla-Navarro
- Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), L’Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Institut de Recerca Biomedica de Bellvitge (IDIBELL), L’Hospitalet de Llobregat, Barcelona, Spain
- Department of Clinical Sciences, Faculty of Medicine and Health Sciences and Universitat de Barcelona Institute of Complex Systems (UBICS), University of Barcelona (UB), L’Hospitalet de Llobregat, Barcelona, Spain
- Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
| | - Qiuyin Cai
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Jirong Long
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Luc Van Kaer
- Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Lan Wu
- Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Jeroen R Huyghe
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Minta Thomas
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Li Hsu
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Michael O Woods
- Memorial University of Newfoundland, Discipline of Genetics, St. John’s, ON, Canada
| | - Demetrius Albanes
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Daniel D Buchanan
- Colorectal Oncogenomics Group, Department of Clinical Pathology, The University of Melbourne, Parkville, VIC, Australia
- University of Melbourne Centre for Cancer Research, Victorian Comprehensive Cancer Centre, Parkville, VIC, Australia
- Genetic Medicine and Family Cancer Clinic, The Royal Melbourne Hospital, Parkville, VIC, Australia
| | - Andrea Gsur
- Center for Cancer Research, Medical University of Vienna, Vienna, Austria
| | - Michael Hoffmeister
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Pavel Vodicka
- Department of Molecular Biology of Cancer, Institute of Experimental Medicine of the Czech Academy of Sciences, Prague, Czech Republic
- Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Faculty of Medicine and Biomedical Center in Pilsen, Charles University, Pilsen, Czech Republic
| | - Alicja Wolk
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
| | | | - Anna H Wu
- Preventative Medicine, University of Southern California, Los Angeles, CA, USA
| | - Amanda I Phipps
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Victor Moreno
- Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), L’Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Institut de Recerca Biomedica de Bellvitge (IDIBELL), L’Hospitalet de Llobregat, Barcelona, Spain
- Department of Clinical Sciences, Faculty of Medicine and Health Sciences and Universitat de Barcelona Institute of Complex Systems (UBICS), University of Barcelona (UB), L’Hospitalet de Llobregat, Barcelona, Spain
- Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
| | - Peters Ulrike
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Graham Casey
- Department of Public Health Sciences, Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Xingyi Guo
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
| |
|
8
|
He J, Antonyan L, Zhu H, Ardila K, Li Q, Enoma D, Zhang W, Liu A, Chekouo T, Cao B, MacDonald ME, Arnold PD, Long Q. A statistical method for image-mediated association studies discovers genes and pathways associated with four brain disorders. Am J Hum Genet 2024; 111:48-69. [PMID: 38118447 PMCID: PMC10806749 DOI: 10.1016/j.ajhg.2023.11.006] [Received: 07/03/2023] [Revised: 11/04/2023] [Accepted: 11/16/2023] [Indexed: 12/22/2023] Open
Abstract
Brain imaging and genomics are critical tools enabling characterization of the genetic basis of brain disorders. However, imaging large cohorts is expensive and may be unavailable for legacy datasets used for genome-wide association studies (GWASs). Using an integrated feature selection/aggregation model, we developed an image-mediated association study (IMAS), which utilizes borrowed imaging/genomics data to conduct association mapping in legacy GWAS cohorts. By leveraging the UK Biobank image-derived phenotypes (IDPs), the IMAS discovered genetic bases underlying four neuropsychiatric disorders and verified them by analyzing annotations, pathways, and expression quantitative trait loci (eQTLs). A cerebellar-mediated mechanism was identified to be common to the four disorders. Simulations show that, if the goal is identifying genetic risk, our IMAS is more powerful than a hypothetical protocol in which the imaging results were available in the GWAS dataset. This implies the feasibility of reanalyzing legacy GWAS datasets without conducting additional imaging, yielding cost savings for integrated analysis of genetics and imaging.
Affiliation(s)
- Jingni He
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Lilit Antonyan
- Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Harold Zhu
- Department of Biological Sciences, Faculty of Science, University of Calgary, Calgary, AB, Canada
| | - Karen Ardila
- Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada
| | - Qing Li
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - David Enoma
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | | | - Andy Liu
- Sir Winston Churchill High School, Calgary, AB, Canada; College of Letters and Science, University of California, Los Angeles, Los Angeles, CA, USA
| | - Thierry Chekouo
- Department of Mathematics and Statistics, Faculty of Science, University of Calgary, Calgary, AB, Canada; Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| | - Bo Cao
- Department of Psychiatry, Faculty of Medicine & Dentistry, University of Alberta, Edmonton, AB, Canada
| | - M Ethan MacDonald
- The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; Department of Electrical and Software Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; Department of Radiology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Paul D Arnold
- Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Psychiatry, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
| | - Quan Long
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Mathematics and Statistics, Faculty of Science, University of Calgary, Calgary, AB, Canada.
| |
|
9
|
Guo X, Ping J, Yang Y, Su X, Shu XO, Wen W, Chen Z, Zhang Y, Tao R, Jia G, He J, Cai Q, Zhang Q, Giles GG, Pearlman R, Rennert G, Vodicka P, Phipps A, Gruber SB, Casey G, Peters U, Long J, Lin W, Zheng W. Large-scale alternative polyadenylation (APA)-wide association studies to identify putative susceptibility genes in human common cancers. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.11.05.23298125. [PMID: 37986797 PMCID: PMC10659493 DOI: 10.1101/2023.11.05.23298125] [Indexed: 11/22/2023]
Abstract
Alternative polyadenylation (APA) modulates mRNA processing in the 3' untranslated regions (3'UTR), which affect mRNA stability and translation efficiency. Here, we build genetic models to predict APA levels in multiple tissues using sequencing data of 1,337 samples from the Genotype-Tissue Expression, and apply these models to assess associations between genetically predicted APA levels and cancer risk with data from large genome-wide association studies of six common cancers, including breast, ovary, prostate, colorectum, lung, and pancreas among European-ancestry populations. At a Bonferroni-corrected P < 0.05, we identify 58 risk genes, including seven in newly identified loci. Using luciferase reporter assays, we demonstrate that risk alleles of 3'UTR variants, rs324015 (STAT6), rs2280503 (DIP2B), rs1128450 (FBXO38) and rs145220637 (LDAH), could significantly increase post-transcriptional activities of their target genes compared to reference alleles. Further gene knockdown experiments confirm their oncogenic roles. Our study provides additional insight into the genetic susceptibility of these common cancers.
|
10
|
Shao M, Zhang Z, Sun H, He J, Wang J, Zhang Q, Cao C. Editorial: Statistical methods for genome-wide association studies (GWAS) and transcriptome-wide association studies (TWAS) and their applications. Front Genet 2023; 14:1287673. [PMID: 37766879 PMCID: PMC10520498 DOI: 10.3389/fgene.2023.1287673] [Received: 09/02/2023] [Accepted: 09/05/2023] [Indexed: 09/29/2023] Open
Affiliation(s)
- Mengting Shao
  - School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
- Zilong Zhang
  - School of Computer Science and Technology, Hainan University, Haikou, China
- Huiyan Sun
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Jingni He
  - Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, AB, Canada
- Juexin Wang
  - Department of Biohealth Informatics, Indiana University Purdue University Indianapolis, Indianapolis, IN, United States
- Qingrun Zhang
  - Department of Mathematics and Statistics, University of Calgary, Calgary, AB, Canada
- Chen Cao
  - School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
|
11
|
Sun Y, Bae YE, Zhu J, Zhang Z, Zhong H, Cheng C, Deng Y, Wu C, Wu L. A Splicing Transcriptome-Wide Association Study Identifies Candidate Altered Splicing for Prostate Cancer Risk. OMICS: A Journal of Integrative Biology 2023; 27:372-380. [PMID: 37486714 DOI: 10.1089/omi.2023.0065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/25/2023]
Abstract
Prostate cancer (PCa) represents a huge public health burden among men. Many genetic susceptibility factors for PCa remain unknown. In this study, we performed a large splicing transcriptome-wide association study (spTWAS), using three modeling strategies to develop genetic prediction models of alternative splicing, to identify novel susceptibility loci and splicing introns for PCa risk by assessing 79,194 cases and 61,112 controls of European ancestry in the PRACTICAL, CRUK, CAPS, BPC3, and PEGASUS consortia. We identified 120 splicing introns of 97 genes showing an association with PCa risk at a false discovery rate (FDR)-corrected threshold (FDR < 0.05). Of them, 33 genes were enriched in PCa-related disease and function categories. Fine-mapping analysis suggested that 21 splicing introns of 19 genes were likely causally associated with PCa risk. Thirty-five splicing introns of 34 novel genes were identified as related to PCa susceptibility for the first time, and 11 of the genes were enriched in a cancer-related network. Our study identified novel loci and splicing introns associated with PCa risk, which can improve our understanding of the etiology of this common malignancy.
Affiliation(s)
- Yanfa Sun
  - College of Life Science, Longyan University, Longyan, P.R. China
  - Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, Hawaii, USA
  - Fujian Provincial Key Laboratory for the Prevention and Control of Animal Infectious Diseases and Biotechnology, Longyan, P.R. China
  - Fujian Provincial Universities Key Laboratory of Preventive Veterinary Medicine and Biotechnology (Longyan University), Longyan, P.R. China
- Ye Eun Bae
  - Department of Statistics, Florida State University, Tallahassee, Florida, USA
- Jingjing Zhu
  - Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, Hawaii, USA
- Zichen Zhang
  - Department of Statistics, Florida State University, Tallahassee, Florida, USA
- Hua Zhong
  - Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, Hawaii, USA
- Chunmei Cheng
  - College of Life Science, Longyan University, Longyan, P.R. China
- Youping Deng
  - Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, Hawaii, USA
- Chong Wu
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Lang Wu
  - Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, Hawaii, USA
|
12
|
Gao G, Fiorica PN, McClellan J, Barbeira AN, Li JL, Olopade OI, Im HK, Huo D. A joint transcriptome-wide association study across multiple tissues identifies candidate breast cancer susceptibility genes. Am J Hum Genet 2023; 110:950-962. [PMID: 37164006 PMCID: PMC10257003 DOI: 10.1016/j.ajhg.2023.04.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/28/2023] [Accepted: 04/14/2023] [Indexed: 05/12/2023] Open
Abstract
Genome-wide association studies (GWASs) have identified more than 200 genomic loci for breast cancer risk, but specific causal genes in most of these loci have not been identified. In fact, transcriptome-wide association studies (TWASs) of breast cancer performed using gene expression prediction models trained in breast tissue have yet to clearly identify most target genes. To identify candidate genes, we performed a GWAS analysis in a breast cancer dataset from UK Biobank (UKB) and combined the results with the GWAS results of the Breast Cancer Association Consortium (BCAC) by a meta-analysis. Using the summary statistics from the meta-analysis, we performed a joint TWAS analysis that combined TWAS signals from multiple tissues. We used expression prediction models trained in 11 tissues that are potentially relevant to breast cancer from the Genotype-Tissue Expression (GTEx) data. In the GWAS analysis, we identified eight loci distinct from those reported previously. In the TWAS analysis, we identified 309 genes at 108 genomic loci to be significantly associated with breast cancer at the Bonferroni threshold. Of these, 17 genes were located in eight regions that were at least 1 Mb away from published GWAS hits. The remaining TWAS-significant genes were located in 100 known genomic loci from previous GWASs of breast cancer. We found that 21 genes located in known GWAS loci remained statistically significant after conditioning on previous GWAS index variants. Our study provides insights into breast cancer genetics through mapping candidate target genes in a large proportion of known GWAS loci and discovering multiple new loci.
Affiliation(s)
- Guimin Gao
  - Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
- Peter N Fiorica
  - Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
- Julian McClellan
  - Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
- Alvaro N Barbeira
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- James L Li
  - Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
- Olufunmilayo I Olopade
  - Section of Hematology & Oncology, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Hae Kyung Im
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Dezheng Huo
  - Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
|