1
Long Q, Yuan Y, Li M. RNA-SSNV: A Reliable Somatic Single Nucleotide Variant Identification Framework for Bulk RNA-Seq Data. Front Genet 2022; 13:865313. [PMID: 35846154 PMCID: PMC9279659 DOI: 10.3389/fgene.2022.865313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2022] [Accepted: 05/17/2022] [Indexed: 11/13/2022]
Abstract
Using expressed somatic mutations may offer a unique advantage in identifying active cancer driver mutations. However, accurately calling mutations from RNA-seq data is difficult due to confounding factors such as RNA editing, reverse transcription, and gap alignment. In the present study, we proposed a framework (named RNA-SSNV, https://github.com/pmglab/RNA-SSNV) to call somatic single nucleotide variants (SSNVs) from tumor bulk RNA-seq data. Based on a comprehensive multi-filtering strategy and a machine-learning classification model trained with comprehensively curated features, RNA-SSNV achieved the best precision–recall rate (0.880–0.884) in a testing dataset and robustly retained an AUC of 0.94 for the precision–recall curve in three adult-based TCGA (The Cancer Genome Atlas) validation datasets. We further showed that the somatic mutations called by RNA-SSNV tended to have a higher functional impact and therapeutic power in known driver genes. Furthermore, VAF (variant allele fraction) analysis revealed that subclones harboring expressed mutations had an evolutionary selective advantage and that RNA had higher detection power to rescue DNA-omitted mutations. In sum, RNA-SSNV will be a useful approach to accurately call expressed somatic mutations for a more insightful analysis of cancer driver genes and carcinogenic mechanisms.
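The precision and recall figures quoted in this abstract are the usual call-set metrics. As a minimal sketch of how such values are computed for a variant caller (not the RNA-SSNV code itself; the function name `precision_recall`, the variant key layout, and the toy data are all assumptions made for illustration):

```python
from typing import Set, Tuple

# A variant is keyed as (chromosome, position, reference allele, alternate allele).
# This key layout is an assumption for the example, not taken from RNA-SSNV.
Variant = Tuple[str, int, str, str]


def precision_recall(called: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Return (precision, recall) of a call set against a truth set."""
    true_positives = len(called & truth)
    # Precision: fraction of called variants that are in the truth set.
    precision = true_positives / len(called) if called else 0.0
    # Recall: fraction of truth variants that were called.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy data, invented for illustration only.
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "G", "A"),
    ("chr3", 10, "T", "C"),
}
called = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "G", "A"),
    ("chr5", 999, "A", "T"),
}
precision, recall = precision_recall(called, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In practice the truth set would typically come from matched DNA-based somatic calls, and a precision–recall curve is traced by sweeping the classifier's score threshold.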
Affiliation(s)
- Qihan Long
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Disease Genome Research, Sun Yat-Sen University, Guangzhou, China
- Yangyang Yuan
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Disease Genome Research, Sun Yat-Sen University, Guangzhou, China
- Miaoxin Li
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Disease Genome Research, Sun Yat-Sen University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China
- Key Laboratory of Tropical Disease Control (SYSU), Ministry of Education, Guangzhou, China
- *Correspondence: Miaoxin Li,
2
Abstract
Diploidy has profound implications for population genetics and susceptibility to genetic diseases. Although two copies are present for most genes in the human genome, they are not necessarily both active or active at the same level in a given individual. Genomic imprinting, resulting in exclusive or biased expression in favor of the allele of paternal or maternal origin, is now believed to affect hundreds of human genes. A far greater number of genes display unequal expression of gene copies due to cis-acting genetic variants that perturb gene expression. The availability of data generated by RNA sequencing applied to large numbers of individuals and tissue types has generated unprecedented opportunities to assess the contribution of genetic variation to allelic imbalance in gene expression. Here we review the insights gained through the analysis of these data about the extent of the genetic contribution to allelic expression imbalance, the tools and statistical models for gene expression imbalance, and what the results obtained reveal about the contribution of genetic variants that alter gene expression to complex human diseases and phenotypes.
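One of the simplest statistical models for allelic expression imbalance at a heterozygous site is a binomial test of reference versus alternate read counts against an expected 50:50 ratio. The sketch below illustrates that idea only. It is not a method endorsed by the review, the function name `allelic_imbalance_p` is hypothetical, and real analyses usually also account for overdispersion and reference mapping bias.

```python
from math import exp, lgamma, log


def allelic_imbalance_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance.

    Sums the probabilities of all outcomes that are no more likely than the
    observed reference-read count under the null reference fraction p_null.
    """
    if not 0.0 < p_null < 1.0:
        raise ValueError("p_null must lie strictly between 0 and 1")
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance

    def pmf(k: int) -> float:
        # Binomial probability computed in log space to avoid overflow at high depth.
        log_p = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                 + k * log(p_null) + (n - k) * log(1.0 - p_null))
        return exp(log_p)

    observed = pmf(ref_reads)
    # A small relative tolerance treats floating-point ties as equal.
    total = sum(p for p in (pmf(k) for k in range(n + 1)) if p <= observed * (1 + 1e-9))
    return min(1.0, total)


# Toy read counts, invented for illustration only.
print(allelic_imbalance_p(5, 5))   # balanced expression
print(allelic_imbalance_p(10, 0))  # fully monoallelic at depth 10
```

Setting `p_null` to a value other than 0.5 is one crude way to allow for a known bias toward the reference allele.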
Affiliation(s)
- Siobhan Cleary
- School of Mathematics, Statistics and Applied Mathematics, National University of Ireland, Galway H91 H3CY, Ireland
- Cathal Seoighe
- School of Mathematics, Statistics and Applied Mathematics, National University of Ireland, Galway H91 H3CY, Ireland
3
Clayton EA, Khalid S, Ban D, Wang L, Jordan IK, McDonald JF. Tumor suppressor genes and allele-specific expression: mechanisms and significance. Oncotarget 2020; 11:462-479. [PMID: 32064050 PMCID: PMC6996918 DOI: 10.18632/oncotarget.27468] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/21/2019] [Accepted: 01/13/2020] [Indexed: 12/12/2022]
Abstract
Recent findings indicate that allele-specific expression (ASE) at specific cancer driver gene loci may be of importance in onset/progression of the disease. Of particular interest are loss-of-function (LOF) tumor suppressor gene (TSG) alleles. While LOF tumor suppressor mutations are typically considered to be recessive, if these mutant alleles can be significantly differentially expressed relative to wild-type alleles in heterozygotes, the clinical consequences could be significant. LOF TSG alleles are shown to be segregating at high frequencies in world-wide populations of normal/healthy individuals. Matched sets of normal and tumor tissues isolated from 233 cancer patients representing four diverse tumor types demonstrate functionally important changes in patterns of ASE in individuals heterozygous for LOF TSG alleles associated with cancer onset/progression. While a variety of molecular mechanisms were identified as potentially contributing to changes in ASE patterns in cancer, changes in DNA copy number and allele-specific alternative splicing, possibly mediated by antisense RNA, emerged as predominant factors. In conclusion, LOF TSG alleles are segregating in human populations at significant frequencies, indicating that many otherwise healthy individuals are at elevated risk of developing cancer. Changes in ASE between normal and cancer tissues indicate that LOF TSG alleles may contribute to cancer onset/progression even when heterozygous with wild-type functional alleles.
Affiliation(s)
- Evan A. Clayton
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- Shareef Khalid
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- Dongjo Ban
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- Lu Wang
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- PanAmerican Bioinformatics Institute, Cali, Colombia
- I. King Jordan
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- PanAmerican Bioinformatics Institute, Cali, Colombia
- Applied Bioinformatics Laboratory, Atlanta, GA, USA
- John F. McDonald
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
4
Zhao C, Xie S, Wu H, Luan Y, Hu S, Ni J, Lin R, Zhao S, Zhang D, Li X. Quantification of allelic differential expression using a simple Fluorescence primer PCR-RFLP-based method. Sci Rep 2019; 9:6334. [PMID: 31004110 PMCID: PMC6474871 DOI: 10.1038/s41598-019-42815-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2018] [Accepted: 03/29/2019] [Indexed: 12/04/2022]
Abstract
Allelic differential expression (ADE) is common in diploid organisms, and is often the key reason for specific phenotype variations. Thus, ADE detection is important for identification of major genes and causal mutations. To date, sensitive and simple methods to detect ADE are still lacking. In this study, we have developed an accurate, simple, and sensitive method, named fluorescence primer PCR-RFLP quantitative method (fPCR-RFLP), for ADE analysis. This method involves two rounds of PCR amplification using a pair of primers, one of which is double-labeled with an overhang 6-FAM. The two alleles are then separated by RFLP and quantified by fluorescence density. fPCR-RFLP could precisely distinguish ADE across a range of 1- to 32-fold differences. Using this method, we verified that PLAG1 and KIT, two candidate genes related to growth rate and immune response traits of pigs, show ADE both at different developmental stages and in different tissues. Our data demonstrate that fPCR-RFLP is an accurate and sensitive method for detecting ADE at both the DNA and RNA levels. Therefore, this powerful tool provides a way to analyze mutations that cause ADE.
Affiliation(s)
- Changzhi Zhao
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- Shengsong Xie
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- Hui Wu
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- Yu Luan
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- Suqin Hu
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- Juan Ni
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- Ruiyi Lin
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- Shuhong Zhao
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- Dingxiao Zhang
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- Xinyun Li
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
- The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, P.R. China
5
Liu Z, Dong X, Li Y. A Genome-Wide Study of Allele-Specific Expression in Colorectal Cancer. Front Genet 2018; 9:570. [PMID: 30538721 PMCID: PMC6277598 DOI: 10.3389/fgene.2018.00570] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 07/19/2018] [Accepted: 11/06/2018] [Indexed: 12/30/2022]
Abstract
Accumulating evidence from small-scale studies has suggested that allele-specific expression (ASE) plays an important role in tumor initiation and progression. However, little is known about genome-wide ASE in tumors. In this study, we conducted a comprehensive analysis of ASE in individuals with colorectal cancer (CRC) on a genome-wide scale. We identified 5.4 thousand genome-wide ASEs of single nucleotide variations (SNVs) from tumor and normal tissues of 59 individuals with CRC. We observed an increased ASE level in tumor samples, and the ASEs were enriched as hotspots on the genome. Around 63% of the genes located there were previously reported to contain complex regulatory elements, e.g., human leukocyte antigen (HLA), or were implicated in tumor progression. Focussing on the allelic expression of somatic mutations, we found that 37.5% of them exhibited ASE, and genes harboring such somatic mutations were enriched in important pathways implicated in cancers. In addition, by comparing the expected and observed ASE events in tumor samples, we identified 50 tumor-specific ASEs which possibly contributed to the somatic events in the regulatory regions of the genes and significantly enriched known cancer driver genes. By analyzing CRC ASEs from several perspectives, we provided a systematic understanding of how ASE is implicated in both tumor and normal tissues, which will be of critical value in guiding ASE studies in cancer.
Affiliation(s)
- Zhi Liu
- Department of Epidemiology and Biostatistics, Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, China
- Xiao Dong
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, United States
- Yixue Li
- Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Shanghai Center for Bioinformation Technology, Shanghai Industrial Technology Institute, Shanghai, China
- Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai, China
6
de Santiago I, Liu W, Yuan K, O'Reilly M, Chilamakuri CSR, Ponder BAJ, Meyer KB, Markowetz F. BaalChIP: Bayesian analysis of allele-specific transcription factor binding in cancer genomes. Genome Biol 2017; 18:39. [PMID: 28235418 PMCID: PMC5326502 DOI: 10.1186/s13059-017-1165-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 12/13/2016] [Accepted: 02/01/2017] [Indexed: 02/07/2023]
Abstract
Allele-specific measurements of transcription factor binding from ChIP-seq data are key to dissecting the allelic effects of non-coding variants and their contribution to phenotypic diversity. However, most methods of detecting an allelic imbalance assume diploid genomes. This assumption severely limits their applicability to cancer samples with frequent DNA copy-number changes. Here we present a Bayesian statistical approach called BaalChIP to correct for the effect of background allele frequency on the observed ChIP-seq read counts. BaalChIP allows the joint analysis of multiple ChIP-seq samples across a single variant and outperforms competing approaches in simulations. Using 548 ENCODE ChIP-seq and six targeted FAIRE-seq samples, we show that BaalChIP effectively corrects allele-specific analysis for copy-number variation and increases the power to detect putative cis-acting regulatory variants in cancer genomes.
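The diploid assumption mentioned in this abstract amounts to testing read counts against a 0.5 reference fraction, which misfires when copy-number changes shift the background allele frequency. The sketch below illustrates that point with a plain normal-approximation test. It is a deliberately simplified stand-in and not BaalChIP's Bayesian model, and the function names `expected_ref_fraction` and `imbalance_z_test` as well as the toy counts are assumptions made for illustration.

```python
from math import erfc, sqrt
from typing import Tuple


def expected_ref_fraction(ref_copies: int, alt_copies: int) -> float:
    """Background reference-allele fraction implied by allele-specific copy numbers."""
    total = ref_copies + alt_copies
    if total <= 0:
        raise ValueError("at least one allele copy is required")
    return ref_copies / total


def imbalance_z_test(ref_reads: int, alt_reads: int,
                     background_ref_fraction: float) -> Tuple[float, float]:
    """Return (z, two-sided p) for the observed reference fraction vs. the background.

    Uses a normal approximation to the binomial, which is only reasonable at
    moderate to high read depth.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads at this site")
    if not 0.0 < background_ref_fraction < 1.0:
        raise ValueError("background fraction must lie strictly between 0 and 1")
    observed = ref_reads / n
    std_err = sqrt(background_ref_fraction * (1.0 - background_ref_fraction) / n)
    z = (observed - background_ref_fraction) / std_err
    p_value = erfc(abs(z) / sqrt(2.0))  # two-sided tail of the standard normal
    return z, p_value


# Toy site, invented for illustration: 2 reference copies and 1 alternate copy,
# with 60 reference reads and 30 alternate reads.
background = expected_ref_fraction(2, 1)
print(imbalance_z_test(60, 30, 0.5))         # diploid assumption: looks imbalanced
print(imbalance_z_test(60, 30, background))  # copy-number-aware null: no imbalance
```

Under the diploid null the toy site appears strongly imbalanced, while the copy-number-aware null attributes the same counts entirely to the amplified reference allele.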
Affiliation(s)
- Ines de Santiago
- Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK
- Present Address: Seven Bridges Genomics LTD, UK. 101 Euston Road NW1 2RA, London, UK
- Wei Liu
- Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK
- Present Address: Institute of Biodiversity Animal Health and Comparative Medicine, University of Glasgow, Glasgow, UK
- Ke Yuan
- Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK
- Present Address: School of Computing Science, University of Glasgow, Glasgow, UK
- Martin O'Reilly
- Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK
- Bruce A J Ponder
- Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK
- Kerstin B Meyer
- Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK.
- Florian Markowetz
- Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK.