1. Chen S, Wang P, Guo H, Zhang Y. Deciphering gene expression patterns using large-scale transcriptomic data and its applications. Brief Bioinform 2024; 25:bbae590. [PMID: 39541191 PMCID: PMC11562847 DOI: 10.1093/bib/bbae590] [Received: 08/02/2024] [Revised: 10/07/2024] [Accepted: 10/31/2024] [Indexed: 11/16/2024]
Abstract
Gene expression varies stochastically across genders, racial groups, and health statuses. Deciphering these patterns is crucial for identifying informative genes, classifying samples, and understanding diseases like cancer. This study analyzes 11,252 bulk RNA-seq samples to explore expression patterns of 19,156 genes, including 10,512 cancer tissue samples and 740 normal samples. Additionally, 4,884 single-cell RNA-seq samples are examined. Statistical analysis using 16 probability distributions shows that normal samples display a wider range of distributions compared to cancer samples. Cancer samples tend to favor asymmetric distributions such as generalized extreme value, logarithmic normal, and Gaussian mixture distributions. In contrast, certain genes in normal samples exhibit symmetric distributions. Remarkably, more than 95.5% of genes exhibit non-normal distributions, which challenges traditional assumptions. Furthermore, distributions differ significantly between bulk and single-cell RNA-seq data. Many cancer driver genes exhibit distinct distribution patterns across sample types, suggesting potential for gene selection and classification based on distribution characteristics. A novel skewness-based metric is proposed to quantify distribution variation across datasets, showing genes with significant skewness differences have biological relevance. Finally, an improved naïve Bayes method incorporating gene-specific distributions demonstrates superior performance in simulations over traditional methods. This work enhances understanding of gene expression and its application in omics-based gene selection and sample classification.
Affiliation(s)
- Shunjie Chen
- School of Mathematics and Statistics, Henan University, Jinming Avenue, 475004, Kaifeng, China
- Pei Wang
- School of Mathematics and Statistics, Henan University, Jinming Avenue, 475004, Kaifeng, China
- Henan Engineering Research Center for Industrial Internet of Things, Henan University, Mingli Road, 450046, Zhengzhou, China
- Haiping Guo
- School of Mathematics and Statistics, Henan University, Jinming Avenue, 475004, Kaifeng, China
- Yujie Zhang
- School of Mathematics and Statistics, Henan University, Jinming Avenue, 475004, Kaifeng, China

2. Deng C, Li HD, Zhang LS, Liu Y, Li Y, Wang J. Identifying new cancer genes based on the integration of annotated gene sets via hypergraph neural networks. Bioinformatics 2024; 40:i511-i520. [PMID: 38940121 PMCID: PMC11211849 DOI: 10.1093/bioinformatics/btae257] [Indexed: 06/29/2024]
Abstract
MOTIVATION: Identifying cancer genes remains a significant challenge in cancer genomics research. Annotated gene sets encode functional associations among multiple genes, and cancer genes have been shown to cluster in hallmark signaling pathways and biological processes. The knowledge of annotated gene sets is critical for discovering cancer genes but remains to be fully exploited.
RESULTS: Here, we present the DIsease-Specific Hypergraph neural network (DISHyper), a hypergraph-based computational method that integrates the knowledge from multiple types of annotated gene sets to predict cancer genes. First, our benchmark results demonstrate that DISHyper outperforms the existing state-of-the-art methods and highlight the advantages of employing hypergraphs for representing annotated gene sets. Second, we validate the accuracy of DISHyper-predicted cancer genes using functional validation results and multiple independent functional genomics data. Third, our model predicts 44 novel cancer genes, and subsequent analysis shows their significant associations with multiple types of cancers. Overall, our study provides a new perspective for discovering cancer genes and reveals previously undiscovered cancer genes.
AVAILABILITY AND IMPLEMENTATION: DISHyper is freely available for download at https://github.com/genemine/DISHyper.
Affiliation(s)
- Chao Deng
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Hong-Dong Li
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Li-Shen Zhang
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Yiwei Liu
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Yaohang Li
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529-0001, United States
- Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China

3. Yan Z, Qin G, Shi X, Jiang X, Cheng Z, Zhang Y, Nan N, Cao F, Qiu X, Sang N. Multilevel Screening Strategy to Identify the Hydrophobic Organic Components of Ambient PM2.5 Associated with Hepatocellular Steatosis. Environmental Science & Technology 2024; 58:10458-10469. [PMID: 38836430 DOI: 10.1021/acs.est.3c10012] [Indexed: 06/06/2024]
Abstract
Hepatic steatosis is the first step in a series of events that drives hepatic disease and has been strongly associated with exposure to fine particulate matter (PM2.5). Although the chemical constituents of the particles matter for their adverse health effects, the specific components of PM2.5 that trigger hepatic steatosis remain unclear. New strategies are needed that prioritize identifying, among the numerous components of PM2.5, the key components with the highest potential to cause adverse effects. Herein, we established a high-resolution mass spectrometry (MS) data set comprising the hydrophobic organic components corresponding to 67 PM2.5 samples in total from Taiyuan and Guangzhou, two representative cities in North and South China, respectively. The lipid accumulation bioeffect profiles of the above samples were also obtained. Considerable hepatocyte lipid accumulation was observed in most PM2.5 extracts. Subsequently, 40 of 695 components were initially screened through machine learning-assisted data filtering based on an integrated bioassay with MS data. Next, nine compounds were further selected as candidates contributing to hepatocellular steatosis based on absorption, distribution, metabolism, and excretion evaluation and molecular docking in silico. Finally, seven components were confirmed in vitro. This study provided a multilevel screening strategy for key active components in PM2.5 and provided insight into the hydrophobic PM2.5 components that induce hepatocellular steatosis.
Affiliation(s)
- Zhipeng Yan
- College of Environment and Resource, Research Center of Environment and Health, Shanxi University, Shanxi 030006, PR China
- Guohua Qin
- College of Environment and Resource, Research Center of Environment and Health, Shanxi University, Shanxi 030006, PR China
- Xiaodi Shi
- State Key Joint Laboratory for Environmental Simulation and Pollution Control, College of Environmental Sciences and Engineering, and Center for Environment and Health, Peking University, Beijing 100871, PR China
- Xing Jiang
- State Key Joint Laboratory for Environmental Simulation and Pollution Control, College of Environmental Sciences and Engineering, and Center for Environment and Health, Peking University, Beijing 100871, PR China
- Zhen Cheng
- State Key Joint Laboratory for Environmental Simulation and Pollution Control, College of Environmental Sciences and Engineering, and Center for Environment and Health, Peking University, Beijing 100871, PR China
- Yaru Zhang
- College of Environment and Resource, Research Center of Environment and Health, Shanxi University, Shanxi 030006, PR China
- Nan Nan
- College of Environment and Resource, Research Center of Environment and Health, Shanxi University, Shanxi 030006, PR China
- Fuyuan Cao
- Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Shanxi 030006, PR China
- Xinghua Qiu
- State Key Joint Laboratory for Environmental Simulation and Pollution Control, College of Environmental Sciences and Engineering, and Center for Environment and Health, Peking University, Beijing 100871, PR China
- Nan Sang
- College of Environment and Resource, Research Center of Environment and Health, Shanxi University, Shanxi 030006, PR China

4. Dakal TC, Dhabhai B, Pant A, Moar K, Chaudhary K, Yadav V, Ranga V, Sharma NK, Kumar A, Maurya PK, Maciaczyk J, Schmidt‐Wolf IGH, Sharma A. Oncogenes and tumor suppressor genes: functions and roles in cancers. MedComm (Beijing) 2024; 5:e582. [PMID: 38827026 PMCID: PMC11141506 DOI: 10.1002/mco2.582] [Received: 09/18/2023] [Revised: 04/21/2024] [Accepted: 04/26/2024] [Indexed: 06/04/2024]
Abstract
Cancer, one of the most formidable diseases, has had a profound impact on human health. The disease is primarily associated with genetic mutations that affect oncogenes and tumor suppressor genes (TSGs). Recently, growing evidence has shown that X-linked TSGs also have specific roles in cancer progression and metastasis. Interestingly, our genome harbors a substantial portion of genes that function as tumor suppressors, and the X chromosome alone harbors a considerable number of TSGs. The scenario becomes even more compelling as X-linked TSGs are adaptive to key epigenetic processes such as X chromosome inactivation. Therefore, delineating the new paradigm related to X-linked TSGs, for instance, their crosstalk with autosomes and their involvement in cancer initiation, progression, and metastasis, is of utmost importance. Considering this, herein, we present a comprehensive discussion of X-linked TSG dysregulation in various cancers as a consequence of genetic variations and epigenetic alterations. In addition, the dynamic role of X-linked TSGs in sex chromosome-autosome crosstalk in cancer genome remodeling is explored thoroughly. The functional roles of ncRNAs and the role of X-linked TSGs in immunomodulation and in gender-based cancer disparities are also highlighted. Overall, the focal idea of the present article is to recapitulate the findings on X-linked TSG regulation in the cancer landscape and to redefine their role toward improving cancer treatment strategies.
Affiliation(s)
- Tikam Chand Dakal
- Department of Biotechnology, Genome and Computational Biology Lab, Mohanlal Sukhadia University, Udaipur, Rajasthan, India
- Bhanupriya Dhabhai
- Department of Biotechnology, Genome and Computational Biology Lab, Mohanlal Sukhadia University, Udaipur, Rajasthan, India
- Anuja Pant
- Department of Biochemistry, Central University of Haryana, Mahendergarh, Haryana, India
- Kareena Moar
- Department of Biochemistry, Central University of Haryana, Mahendergarh, Haryana, India
- Kanika Chaudhary
- School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Vikas Yadav
- School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Vipin Ranga
- Department of Agricultural Biotechnology, DBT‐NECAB, Assam Agricultural University, Jorhat, Assam, India
- Abhishek Kumar
- Manipal Academy of Higher Education, Manipal, Karnataka, India
- Institute of Bioinformatics, International Technology Park, Bangalore, India
- Pawan Kumar Maurya
- Department of Biochemistry, Central University of Haryana, Mahendergarh, Haryana, India
- Jarek Maciaczyk
- Department of Stereotactic and Functional Neurosurgery, University Hospital of Bonn, Bonn, Germany
- Ingo G. H. Schmidt‐Wolf
- Department of Integrated Oncology, Center for Integrated Oncology (CIO), University Hospital Bonn, Bonn, Germany
- Amit Sharma
- Department of Stereotactic and Functional Neurosurgery, University Hospital of Bonn, Bonn, Germany
- Department of Integrated Oncology, Center for Integrated Oncology (CIO), University Hospital Bonn, Bonn, Germany

5. Bártová E. Epigenetic and gene therapy in human and veterinary medicine. Environmental Epigenetics 2024; 10:dvae006. [PMID: 38751572 PMCID: PMC11095531 DOI: 10.1093/eep/dvae006] [Received: 11/28/2023] [Revised: 04/12/2024] [Accepted: 05/08/2024] [Indexed: 05/18/2024]
Abstract
Gene therapy is a focus of interest in both human and veterinary medicine, especially in recent years due to the potential applications of CRISPR/Cas9 technology. Another relatively new approach is that of epigenetic therapy, which involves an intervention based on epigenetic marks, including DNA methylation, histone post-translational modifications, and post-transcriptional modifications of distinct RNAs. The epigenome results from enzymatic reactions, which regulate gene expression without altering DNA sequences. In contrast to conventional CRISPR/Cas9 techniques, the recently established methodology of epigenetic editing mediated by the CRISPR/dCas9 system is designed to target specific genes without causing DNA breaks. Both natural epigenetic processes and epigenetic editing regulate gene expression and thereby contribute to maintaining the balance between physiological functions and pathophysiological states. From this perspective, knowledge of specific epigenetic marks has immense potential in both human and veterinary medicine. For instance, the use of epigenetic drugs (chemical compounds with therapeutic potential affecting the epigenome) seems to be promising for the treatment of cancer, metabolic, and infectious diseases. Also, there is evidence that an epigenetic diet (nutrition-like factors affecting epigenome) should be considered as part of a healthy lifestyle and could contribute to the prevention of pathophysiological processes. In summary, epigenetic-based approaches in human and veterinary medicine have increasing significance in targeting aberrant gene expression associated with various diseases. In this case, CRISPR/dCas9, epigenetic targeting, and some epigenetic nutrition factors could contribute to reversing an abnormal epigenetic landscape to a healthy physiological state.
Affiliation(s)
- Eva Bártová
- Department of Cell Biology and Epigenetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, Brno, 612 00, the Czech Republic

6. Chen H, Wang Z, Gong L, Wang Q, Chen W, Wang J, Ma X, Ding R, Li X, Zou X, Plass M, Lian C, Ni T, Wei GH, Li W, Deng L, Li L. A distinct class of pan-cancer susceptibility genes revealed by an alternative polyadenylation transcriptome-wide association study. Nat Commun 2024; 15:1729. [PMID: 38409266 PMCID: PMC10897204 DOI: 10.1038/s41467-024-46064-7] [Received: 08/09/2023] [Accepted: 02/12/2024] [Indexed: 02/28/2024]
Abstract
Alternative polyadenylation plays an important role in cancer initiation and progression; however, current transcriptome-wide association studies mostly ignore alternative polyadenylation when identifying putative cancer susceptibility genes. Here, we perform a pan-cancer 3' untranslated region alternative polyadenylation transcriptome-wide association analysis by integrating 55 well-powered (n > 50,000) genome-wide association studies datasets across 22 major cancer types with alternative polyadenylation quantification from 23,955 RNA sequencing samples across 7,574 individuals. We find that genetic variants associated with alternative polyadenylation are co-localized with 28.57% of cancer loci and contribute a significant portion of cancer heritability. We further identify 642 significant cancer susceptibility genes predicted to modulate cancer risk via alternative polyadenylation, 62.46% of which have been overlooked by traditional expression- and splicing- studies. As proof of principle validation, we show that alternative alleles facilitate 3' untranslated region lengthening of CRLS1 gene leading to increased protein abundance and promoted proliferation of breast cancer cells. Together, our study highlights the significant role of alternative polyadenylation in discovering new cancer susceptibility genes and provides a strong foundational framework for enhancing our understanding of the etiology underlying human cancers.
Affiliation(s)
- Hui Chen
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Zeyang Wang
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Lihai Gong
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Qixuan Wang
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Wenyan Chen
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Jia Wang
- Institute of Molecular Physiology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Xuelian Ma
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Ruofan Ding
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Xing Li
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Xudong Zou
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Mireya Plass
- Gene Regulation of Cell Identity Group, Regenerative Medicine Program, Bellvitge Institute for Biomedical Research (IDIBELL), L'Hospitalet de Llobregat, Barcelona, 08908, Spain
- Program for Advancing Clinical Translation of Regenerative Medicine of Catalonia, P-CMR[C], L'Hospitalet de Llobregat, Barcelona, 08908, Spain
- Center for Networked Biomedical Research on Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), Madrid, 28029, Spain
- Cheng Lian
- Department of Biochemistry and Molecular Biology of School of Basic Medical Sciences, Shanghai Medical College of Fudan University, Shanghai, 200032, China
- Ting Ni
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai, 200438, China
- Gong-Hong Wei
- Department of Biochemistry and Molecular Biology of School of Basic Medical Sciences, Shanghai Medical College of Fudan University, Shanghai, 200032, China
- Disease Networks Research Unit, Faculty of Biochemistry and Molecular Medicine & Biocenter Oulu, University of Oulu, Oulu, 90410, Finland
- Wei Li
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, The University of California, Irvine, CA, 92697, USA
- Lin Deng
- Institute of Molecular Physiology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Lei Li
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China

7. Nourbakhsh M, Degn K, Saksager A, Tiberti M, Papaleo E. Prediction of cancer driver genes and mutations: the potential of integrative computational frameworks. Brief Bioinform 2024; 25:bbad519. [PMID: 38261338 PMCID: PMC10805075 DOI: 10.1093/bib/bbad519] [Received: 06/09/2023] [Revised: 11/27/2023] [Accepted: 12/11/2023] [Indexed: 01/24/2024]
Abstract
The vast amount of available sequencing data allows the scientific community to explore different genetic alterations that may drive cancer or favor cancer progression. Software developers have proposed a myriad of predictive tools, allowing researchers and clinicians to compare and prioritize driver genes and mutations and their relative pathogenicity. However, there is little consensus on the computational approach or a gold standard for comparison. Hence, benchmarking the different tools depends highly on the input data, indicating that overfitting is still a massive problem. One of the solutions is to limit the scope and usage of specific tools. However, such limitations force researchers to walk a tightrope between creating and using high-quality tools for a specific purpose and describing the complex alterations driving cancer. While the knowledge of cancer development increases daily, many bioinformatic pipelines rely on single nucleotide variants or alterations in a vacuum without accounting for cellular compartments, mutational burden or disease progression. Even within bioinformatics and computational cancer biology, the research fields work in silos, risking overlooking potential synergies or breakthroughs. Here, we provide an overview of databases and datasets for building or testing predictive cancer driver tools. Furthermore, we introduce predictive tools for driver genes, driver mutations, and the impact of these based on structural analysis. Additionally, we suggest and recommend directions in the field to avoid silo-research, moving towards integrative frameworks.
Affiliation(s)
- Mona Nourbakhsh
- Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Kristine Degn
- Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Astrid Saksager
- Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Matteo Tiberti
- Cancer Structural Biology, Danish Cancer Institute, 2100 Copenhagen, Denmark
- Elena Papaleo
- Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Cancer Structural Biology, Danish Cancer Institute, 2100 Copenhagen, Denmark

8. Wang Y, Zhou B, Ru J, Meng X, Wang Y, Liu W. Advances in computational methods for identifying cancer driver genes. Mathematical Biosciences and Engineering 2023; 20:21643-21669. [PMID: 38124614 DOI: 10.3934/mbe.2023958] [Indexed: 12/23/2023]
Abstract
Cancer driver genes (CDGs) are crucial in cancer prevention, diagnosis and treatment. This study employed computational methods for identifying CDGs, categorizing them into four groups. The major frameworks for each of these four categories were summarized. Additionally, we systematically gathered data from public databases and biological networks, and we elaborated on computational methods for identifying CDGs using the aforementioned databases. Further, we summarized the algorithms, mainly involving statistics and machine learning, used for identifying CDGs. Notably, the performances of nine typical identification methods for eight types of cancer were compared to analyze the applicability areas of these methods. Finally, we discussed the challenges and prospects associated with methods for identifying CDGs. The present study revealed that the network-based algorithms and machine learning-based methods demonstrated superior performance.
Affiliation(s)
- Ying Wang
- School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Bohao Zhou
- School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Jidong Ru
- School of Textile Garment and Design, Changshu Institute of Technology, Changshu 215500, China
- Xianglian Meng
- School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou 213032, China
- Yundong Wang
- School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Wenjie Liu
- School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou 213032, China

9. Wang J, Shi A, Lyu J. A comprehensive atlas of epigenetic regulators reveals tissue-specific epigenetic regulation patterns. Epigenetics 2023; 18:2139067. [PMID: 36305095 PMCID: PMC9980636 DOI: 10.1080/15592294.2022.2139067] [Indexed: 11/03/2022]
Abstract
Epigenetic machinery contributes to gene regulation in eukaryotic species. However, the machinery including more than 600 epigenetic regulator (ER) genes responsible for reading, writing, and erasing histone modifications and DNA modifications remains largely uncharacterized across species. We compile a comprehensive list of ERs based on an evolutionary analysis across 23 species, which is the most comprehensive ER list in various species until recently. We further perform comparative transcriptomic analyses across different tissues in humans, mice, as well as other amniote species. We observe a consistent tissue-of-origin expression specificity pattern of duplicated ER genes across species and suggest links between expression specificity and ER gene evolution as well as ER function. Additional analyses further suggest that ER duplication can generate tissue-specific ER genes with the same epigenetic substrates, which may be closely related to their regulatory specificity in tissue development. Our work can serve as a foundation to better comprehend the tissue-specific expression patterns of ER genes from an evolutionary perspective and also the functional implications of ERs in tissue-specific epigenetic regulation.
Affiliation(s)
- Jilu Wang
- Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, People's Republic of China
- Aiai Shi
- Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, People's Republic of China
- Jie Lyu
- Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, People's Republic of China
- Joint Centre of Translational Medicine, the First Affiliated Hospital of Wenzhou Medical University, Wenzhou, People's Republic of China
- Joint Centre of Translational Medicine, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, People's Republic of China
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou, Zhejiang, People's Republic of China

10. Li H, Lei Y, Li G, Huang Y. Identification of tumor-suppressor genes in lung squamous cell carcinoma through integrated bioinformatics analyses. Oncol Res 2023; 32:187-197. [PMID: 38188687 PMCID: PMC10767242 DOI: 10.32604/or.2023.030656] [Received: 04/17/2023] [Accepted: 06/20/2023] [Indexed: 01/09/2024]
Abstract
Lung cancer is a prevalent malignancy, and fatalities of the disease exceed 400,000 cases worldwide. Lung squamous cell carcinoma (LUSC) has been recognized as the most common pathological form of lung cancer. A comprehensive understanding of the molecular features related to LUSC progression has great significance for LUSC prognosis assessment and clinical management. In this study, we aim to identify a panel of signature genes closely associated with LUSC, which can provide novel insights into the progression of LUSC. Gene expression profiles were retrieved from public resources including the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) database. Differentially expressed genes (DEGs) between LUSC specimens and normal lung tissues were identified by bioinformatics analyses. A total of 66 DEGs were identified based on two cohorts of data. The CytoHubba plugin of Cytoscape software was utilized for further analyses of the top 10 candidate hub genes, OGN, ABI3BP, MAMDC2, FGF7, FAM107A, SPARCL1, DCN, COL14A1, MFAP4, and CHRDL1, which showed significant downregulation in LUSC. Two LUSC cell lines were used to validate the functions of CHRDL1 and FAM107A through overexpression experiments. Together, our data revealed novel candidate tumor-suppressor genes in LUSC, suggesting previously unappreciated mechanisms in the progression of LUSC.
Affiliation(s)
- Heng Li
- The 2nd Department of Thoracic Surgery, The Third Affiliated Hospital of Kunming Medical University, Yunnan Tumor Hospital, Kunming, 650118, China
- Youming Lei
- Department of Geriatric Thoracic Surgery, The First Affiliated Hospital of Kunming Medical University, Kunming, 650032, China
- Gaofeng Li
- The 2nd Department of Thoracic Surgery, The Third Affiliated Hospital of Kunming Medical University, Yunnan Tumor Hospital, Kunming, 650118, China
- Yunchao Huang
- The 1st Department of Thoracic Surgery, The Third Affiliated Hospital of Kunming Medical University, Yunnan Tumor Hospital, Kunming, 650118, China

11. Chen H, Shu J, Maley CC, Liu L. A Mouse-Specific Model to Detect Genes under Selection in Tumors. Cancers (Basel) 2023; 15:5156. [PMID: 37958330 PMCID: PMC10647215 DOI: 10.3390/cancers15215156] [Received: 09/14/2023] [Revised: 10/16/2023] [Accepted: 10/18/2023] [Indexed: 11/15/2023]
Abstract
The mouse is a widely used model organism in cancer research. However, no computational methods exist to identify cancer driver genes in mice due to a lack of labeled training data. To address this knowledge gap, we adapted the GUST (Genes Under Selection in Tumors) model, originally trained on human exomes, to mouse exomes via transfer learning. The resulting tool, called GUST-mouse, can estimate long-term and short-term evolutionary selection in mouse tumors, and distinguish between oncogenes, tumor suppressor genes, and passenger genes using high-throughput sequencing data. We applied GUST-mouse to analyze 65 exomes of mouse primary breast cancer models and 17 exomes of mouse leukemia models. Comparing the predictions between cancer types and between human and mouse tumors revealed common and unique driver genes. The GUST-mouse method is available as an open-source R package on GitHub.
Collapse
Affiliation(s)
- Hai Chen
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
- Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
| | - Jingmin Shu
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
- Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
| | - Carlo C. Maley
- Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Arizona Cancer Evolution Center, Arizona State University, Tempe, AZ 85281, USA
| | - Li Liu
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
- Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Arizona Cancer Evolution Center, Arizona State University, Tempe, AZ 85281, USA
|
12
|
Pacelli C, Rossi A, Milella M, Colombo T, Le Pera L. RNA-Based Strategies for Cancer Therapy: In Silico Design and Evaluation of ASOs for Targeted Exon Skipping. Int J Mol Sci 2023; 24:14862. [PMID: 37834310 PMCID: PMC10573945 DOI: 10.3390/ijms241914862] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/25/2023] [Revised: 09/26/2023] [Accepted: 09/27/2023] [Indexed: 10/15/2023] Open
Abstract
Precision medicine in oncology has made significant progress in recent years by approving drugs that target specific genetic mutations. However, many cancer driver genes remain challenging to pharmacologically target ("undruggable"). To tackle this issue, RNA-based methods like antisense oligonucleotides (ASOs) that induce targeted exon skipping (ES) could provide a promising alternative. In this work, a comprehensive computational procedure is presented, focused on the development of ES-based cancer treatments. The procedure aims to produce specific protein variants, including inactive oncogenes and partially restored tumor suppressors. This novel computational procedure encompasses target-exon selection, in silico prediction of ES products, and identification of the best candidate ASOs for further experimental validation. The method was effectively employed on extensively mutated cancer genes, prioritized according to their suitability for ES-based interventions. Notable genes, such as NRAS and VHL, exhibited potential for this therapeutic approach, as specific target exons were identified and optimal ASO sequences were devised to induce their skipping. To the best of our knowledge, this is the first computational procedure that encompasses all necessary steps for designing ASO sequences tailored for targeted ES, contributing with a versatile and innovative approach to addressing the challenges posed by undruggable cancer driver genes and beyond.
Affiliation(s)
- Chiara Pacelli
- Department of Biochemical Sciences “A. Rossi Fanelli”, Sapienza University of Rome, 00185 Rome, Italy
| | - Alice Rossi
- Section of Oncology, Department of Medicine, University of Verona-School of Medicine and Verona University Hospital Trust, 37134 Verona, Italy
| | - Michele Milella
- Section of Oncology, Department of Medicine, University of Verona-School of Medicine and Verona University Hospital Trust, 37134 Verona, Italy
| | - Teresa Colombo
- Institute of Molecular Biology and Pathology (IBPM), National Research Council (CNR), 00185 Rome, Italy
| | - Loredana Le Pera
- Core Facilities, Italian National Institute of Health (ISS), 00161 Rome, Italy
|
13
|
Farooqi AA, Rakhmetova V, Kapanova G, Tanbayeva G, Mussakhanova A, Abdykulova A, Ryskulova AG. Role of Ubiquitination and Epigenetics in the Regulation of AhR Signaling in Carcinogenesis and Metastasis: "Albatross around the Neck" or "Blessing in Disguise". Cells 2023; 12:2382. [PMID: 37830596 PMCID: PMC10571945 DOI: 10.3390/cells12192382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Revised: 09/25/2023] [Accepted: 09/27/2023] [Indexed: 10/14/2023] Open
Abstract
The molecular mechanisms and signal transduction cascades evoked by the activation of aryl hydrocarbon receptor (AhR) are becoming increasingly understandable. AhR is a ligand-activated transcriptional factor that integrates environmental, dietary and metabolic cues for the pleiotropic regulation of a wide variety of mechanisms. AhR mediates transcriptional programming in a ligand-specific, context-specific and cell-type-specific manner. Pioneering cutting-edge research works have provided fascinating new insights into the mechanistic role of AhR-driven downstream signaling in a wide variety of cancers. AhR ligands derived from food, environmental contaminants and intestinal microbiota strategically activated AhR signaling and regulated multiple stages of cancer. Although AhR has classically been viewed and characterized as a ligand-regulated transcriptional factor, its role as a ubiquitin ligase is fascinating. Accordingly, recent evidence has paradigmatically shifted our understanding and urged researchers to drill down deep into these novel and clinically valuable facets of AhR biology. Our rapidly increasing realization related to AhR-mediated regulation of the ubiquitination and proteasomal degradation of different proteins has started to scratch the surface of intriguing mechanisms. Furthermore, AhR and epigenome dynamics have shown unprecedented complexity during multiple stages of cancer progression. AhR not only transcriptionally regulated epigenetic-associated molecules, but also worked with epigenetic-modifying enzymes during cancer progression. In this review, we have summarized the findings obtained not only from cell-culture studies, but also from animal models. Different clinical trials are currently being conducted using AhR inhibitors and PD-1 inhibitors (pembrolizumab and nivolumab), which confirm the linchpin role of AhR-related mechanistic details in cancer progression. Therefore, further studies are required to develop a better comprehension of the many-sided and "diametrically opposed" roles of AhR in the regulation of carcinogenesis and metastatic spread of cancer cells to the secondary organs.
Affiliation(s)
- Ammad Ahmad Farooqi
- Institute of Biomedical and Genetic Engineering (IBGE), Islamabad 54000, Pakistan
| | - Venera Rakhmetova
- Department of Internal Diseases, Medical University of Astana, Astana 010000, Kazakhstan
| | - Gulnara Kapanova
- Faculty of Medicine and healthcare, Al-Farabi Kazakh National University, 71 Al-Farabi Ave, Almaty 050040, Kazakhstan
- Scientific Center of Anti-Infectious Drugs, 75 Al-Farabi Ave, Almaty 050040, Kazakhstan
| | - Gulnur Tanbayeva
- Faculty of Medicine and healthcare, Al-Farabi Kazakh National University, 71 Al-Farabi Ave, Almaty 050040, Kazakhstan
| | - Akmaral Mussakhanova
- Department of Public Health and Management, Astana Medical University, Astana 010000, Kazakhstan
| | - Akmaral Abdykulova
- Department of General Medical Practice, General Medicine Faculty, Asfendiyarov Kazakh National Medical University, Almaty 050000, Kazakhstan
| | - Alma-Gul Ryskulova
- Department of Public Health and Social Sciences, Kazakhstan Medical University “KSPH”, Utenos Str. 19A, Almaty 050060, Kazakhstan
|
14
|
Yang H, Liu Y, Yang Y, Li D, Wang Z. InDEP: an interpretable machine learning approach to predict cancer driver genes from multi-omics data. Brief Bioinform 2023; 24:bbad318. [PMID: 37649392 DOI: 10.1093/bib/bbad318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Revised: 06/14/2023] [Accepted: 08/16/2023] [Indexed: 09/01/2023] Open
Abstract
Cancer driver genes are critical in driving tumor cell growth, and precisely identifying these genes is crucial in advancing our understanding of cancer pathogenesis and developing targeted cancer drugs. Although current methods for discovering cancer driver genes mainly rely on integrating multi-omics data, many existing models are overly complex, and their results are difficult to interpret accurately. This study aims to address this issue by introducing InDEP, an interpretable machine learning framework based on cascade forests. InDEP is designed with easy-to-interpret features, cascade forests based on decision trees and a KernelSHAP module that enables fine-grained post-hoc interpretation. Integrating multi-omics data, InDEP can identify essential features of classified driver genes at both the gene and cancer-type levels. The framework accurately identifies driver genes, discovers new patterns that mark genes as driver genes and refines the cancer driver gene catalog. In comparison with state-of-the-art methods, InDEP proved to be more accurate on the test set and identified reliable candidate driver genes. Mutational features were the primary drivers of InDEP's driver gene identification, with other omics features also contributing. At the gene level, the framework concluded that substitution-type mutations were the main reason most genes were identified as driver genes. InDEP's ability to identify reliable candidate driver genes opens up new avenues for precision oncology and discovering new biomedical knowledge. This framework can help advance cancer research by providing an interpretable method for identifying cancer driver genes and their contribution to cancer pathogenesis, facilitating the development of targeted cancer drugs.
Affiliation(s)
- Hai Yang
- Department of Computer Science and Engineering, East China University of Science and Technology, 200237, Shanghai, PR China
| | - Yawen Liu
- Department of Computer Science and Engineering, East China University of Science and Technology, 200237, Shanghai, PR China
| | - Yijing Yang
- Department of Computer Science, University of Illinois Urbana-Champaign, Champaign, Illinois, United States of America
| | - Dongdong Li
- Department of Computer Science and Engineering, East China University of Science and Technology, 200237, Shanghai, PR China
| | - Zhe Wang
- Department of Computer Science and Engineering, East China University of Science and Technology, 200237, Shanghai, PR China
|
15
|
Chalise JP, Ehsani A, Lemecha M, Hung YW, Zhang G, Larson GP, Itakura K. ARID5B regulates fatty acid metabolism and proliferation at the Pre-B cell stage during B cell development. Front Immunol 2023; 14:1170475. [PMID: 37483604 PMCID: PMC10360657 DOI: 10.3389/fimmu.2023.1170475] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/21/2023] [Accepted: 06/15/2023] [Indexed: 07/25/2023] Open
Abstract
During B cell development in bone marrow, large precursor B cells (large Pre-B cells) proliferate rapidly, exit the cell cycle, and differentiate into non-proliferative (quiescent) small Pre-B cells. Dysregulation of this process may result in the failure to produce functional B cells and pose a risk of leukemic transformation. Here, we report that AT rich interacting domain 5B (ARID5B), a B cell acute lymphoblastic leukemia (B-ALL) risk gene, regulates B cell development at the Pre-B stage. In both mice and humans, we observed a significant upregulation of ARID5B expression that initiates at the Pre-B stage and is maintained throughout later stages of B cell development. In mice, deletion of Arid5b in vivo and ex vivo exhibited a significant reduction in the proportion of immature B cells but an increase in large and small Pre-B cells. Arid5b inhibition ex vivo also led to an increase in proliferation of both Pre-B cell populations. Metabolic studies in mouse and human bone marrow revealed that fatty acid uptake peaked in proliferative B cells then decreased during non-proliferative stages. We showed that Arid5b ablation enhanced fatty acid uptake and oxidation in Pre-B cells. Furthermore, decreased ARID5B expression was observed in tumor cells from B-ALL patients when compared to B cells from non-leukemic individuals. In B-ALL patients, ARID5B expression below the median was associated with decreased survival particularly in subtypes originating from Pre-B cells. Collectively, our data indicated that Arid5b regulates fatty acid metabolism and proliferation of Pre-B cells in mice, and reduced expression of ARID5B in humans is a risk factor for B cell leukemia.
Affiliation(s)
- Jaya Prakash Chalise
- Center for RNA Biology and Therapeutics, Beckman Research Institute, City of Hope, Duarte, CA, United States
| | - Ali Ehsani
- Center for RNA Biology and Therapeutics, Beckman Research Institute, City of Hope, Duarte, CA, United States
| | - Mengistu Lemecha
- Center for RNA Biology and Therapeutics, Beckman Research Institute, City of Hope, Duarte, CA, United States
| | - Yu-Wen Hung
- Immunology and Theranostics, Beckman Research Institute, City of Hope, Duarte, CA, United States
| | - Guoxiang Zhang
- Center for RNA Biology and Therapeutics, Beckman Research Institute, City of Hope, Duarte, CA, United States
| | - Garrett P. Larson
- Center for RNA Biology and Therapeutics, Beckman Research Institute, City of Hope, Duarte, CA, United States
| | - Keiichi Itakura
- Center for RNA Biology and Therapeutics, Beckman Research Institute, City of Hope, Duarte, CA, United States
|
16
|
Goyal J, Ng DQ, Zhang K, Chan A, Lee J, Zheng K, Hurley-Kim K, Nguyen L, He L, Nguyen M, McBane S, Li W, Cadiz CL. Using machine learning to develop a clinical prediction model for SSRI-associated bleeding: a feasibility study. BMC Med Inform Decis Mak 2023; 23:105. [PMID: 37301967 PMCID: PMC10257821 DOI: 10.1186/s12911-023-02206-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 05/31/2023] [Indexed: 06/12/2023] Open
Abstract
INTRODUCTION Adverse drug events (ADEs) are associated with poor outcomes and increased costs but may be prevented with prediction tools. Using the National Institutes of Health All of Us (AoU) database, we employed machine learning (ML) to predict selective serotonin reuptake inhibitor (SSRI)-associated bleeding. METHODS The AoU program, which began in 05/2018, continues to recruit individuals ≥ 18 years old across the United States. Participants completed surveys and consented to contribute electronic health records (EHRs) for research. Using the EHR, we determined participants who were exposed to SSRIs (citalopram, escitalopram, fluoxetine, fluvoxamine, paroxetine, sertraline, vortioxetine). Features (n = 88) were selected with clinicians' input and comprised sociodemographic, lifestyle, comorbidity, and medication use information. We identified bleeding events with validated EHR algorithms and applied logistic regression, decision tree, random forest, and extreme gradient boost to predict bleeding during SSRI exposure. We assessed model performance with the area under the receiver operating characteristic curve (AUC) and defined clinically significant features as those resulting in a > 0.01 decline in AUC after removal from the model, in three of four ML models. RESULTS There were 10,362 participants exposed to SSRIs, with 9.6% experiencing a bleeding event during SSRI exposure. For each SSRI, performance across all four ML models was relatively consistent. AUCs from the best models ranged from 0.632 to 0.698. Clinically significant features included health literacy for escitalopram, and bleeding history and socioeconomic status for all SSRIs. CONCLUSIONS We demonstrated the feasibility of predicting ADEs using ML. Incorporating genomic features and drug interactions with deep learning models may improve ADE prediction.
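The feature-significance rule in the abstract above (a decline in AUC of more than 0.01 after removing a feature, in three of four models) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `auc` and `significant_features`, the model keys, and every AUC value in the usage note are hypothetical, and "three of four" is read here as "at least three".

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def significant_features(baseline_auc, ablated_auc, min_drop=0.01, min_models=3):
    """Flag features whose removal lowers AUC by more than min_drop
    in at least min_models models (assumed reading of the abstract).

    baseline_auc: {model_name: AUC with all features}
    ablated_auc:  {feature_name: {model_name: AUC without that feature}}
    """
    flagged = []
    for feature, per_model in ablated_auc.items():
        n_models = sum(
            1 for model, value in per_model.items()
            if baseline_auc[model] - value > min_drop
        )
        if n_models >= min_models:
            flagged.append(feature)
    return flagged
```

For example, with made-up baseline AUCs `{"lr": 0.68, "tree": 0.65, "rf": 0.69, "xgb": 0.70}` and a feature whose removal yields `{"lr": 0.65, "tree": 0.63, "rf": 0.66, "xgb": 0.695}`, three of the four drops exceed 0.01, so the feature is flagged.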
Affiliation(s)
- Jatin Goyal
- Donald Bren School of Information and Computer Sciences, University of California Irvine, Irvine, CA, USA
| | - Ding Quan Ng
- Department of Clinical Pharmacy Practice, School of Pharmacy and Pharmaceutical Sciences, University of California Irvine, 802 W Peltason Dr, Irvine, CA, 92697-4625, USA
| | - Kevin Zhang
- Donald Bren School of Information and Computer Sciences, University of California Irvine, Irvine, CA, USA
| | - Alexandre Chan
- Department of Clinical Pharmacy Practice, School of Pharmacy and Pharmaceutical Sciences, University of California Irvine, 802 W Peltason Dr, Irvine, CA, 92697-4625, USA
| | - Joyce Lee
- Department of Clinical Pharmacy Practice, School of Pharmacy and Pharmaceutical Sciences, University of California Irvine, 802 W Peltason Dr, Irvine, CA, 92697-4625, USA
| | - Kai Zheng
- Donald Bren School of Information and Computer Sciences, University of California Irvine, Irvine, CA, USA
| | - Keri Hurley-Kim
- Department of Clinical Pharmacy Practice, School of Pharmacy and Pharmaceutical Sciences, University of California Irvine, 802 W Peltason Dr, Irvine, CA, 92697-4625, USA
| | - Lee Nguyen
- Department of Clinical Pharmacy Practice, School of Pharmacy and Pharmaceutical Sciences, University of California Irvine, 802 W Peltason Dr, Irvine, CA, 92697-4625, USA
| | - Lu He
- Donald Bren School of Information and Computer Sciences, University of California Irvine, Irvine, CA, USA
| | - Megan Nguyen
- Department of Clinical Pharmacy Practice, School of Pharmacy and Pharmaceutical Sciences, University of California Irvine, 802 W Peltason Dr, Irvine, CA, 92697-4625, USA
| | - Sarah McBane
- Department of Clinical Pharmacy Practice, School of Pharmacy and Pharmaceutical Sciences, University of California Irvine, 802 W Peltason Dr, Irvine, CA, 92697-4625, USA
| | - Wei Li
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California Irvine, Irvine, CA, USA
| | - Christine Luu Cadiz
- Department of Clinical Pharmacy Practice, School of Pharmacy and Pharmaceutical Sciences, University of California Irvine, 802 W Peltason Dr, Irvine, CA, 92697-4625, USA.
|
17
|
Quraish RU, Hirahata T, Quraish AU, ul Quraish S. An Overview: Genetic Tumor Markers for Early Detection and Current Gene Therapy Strategies. Cancer Inform 2023; 22:11769351221150772. [PMID: 36762284 PMCID: PMC9903029 DOI: 10.1177/11769351221150772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Accepted: 12/24/2022] [Indexed: 02/04/2023] Open
Abstract
Genomic instability is considered a fundamental factor involved in any neoplastic disease. Consequently, genetically unstable cells contribute to the intratumoral genetic heterogeneity and phenotypic diversity of cancer. These genetic alterations can be detected by several diagnostic techniques of molecular biology, and the detection of alterations in genomic integrity may serve as a reliable genetic molecular marker for the early detection of cancer or cancer-related abnormal changes in the body's cells. These genetic molecular markers can detect cancer earlier than any other method of cancer diagnosis. Once a tumor is diagnosed, replacement or therapeutic manipulation of these cancer-related abnormal genetic changes becomes possible, which leads toward effective and target-specific cancer treatment; in many cases, personalized treatment of cancer could be performed without the adverse effects of chemotherapy and radiotherapy. In this review, we describe how these genetic molecular markers can be detected and the possible ways of applying this gene diagnosis to gene therapy that can attack cancerous cells, directly or indirectly, leading to overall improved management and quality of life for a cancer patient.
Affiliation(s)
| | - Tetsuyuki Hirahata
- Tetsuyuki Hirahata, Hirahata Gene Therapy Laboratory, HIC Clinic #1105, Itocia Office Tower 11F, 2-7-1, Yurakucho, Chiyoda-ku, Tokyo 100-0006, Japan.
|
18
|
Nirgude S, Desai S, Choudhary B. Genome-wide differential DNA methylation analysis of MDA-MB-231 breast cancer cells treated with curcumin derivatives, ST08 and ST09. BMC Genomics 2022; 23:807. [PMID: 36474139 PMCID: PMC9727864 DOI: 10.1186/s12864-022-09041-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/18/2022] [Accepted: 10/17/2022] [Indexed: 12/12/2022] Open
Abstract
ST08 and ST09 are potent curcumin derivatives with antiproliferative, apoptotic, and migrastatic properties. Both ST08 and ST09 exhibit in vitro and in vivo anticancer properties. As reported earlier, these derivatives were highly cytotoxic towards MDA-MB-231 triple-negative breast cancer cells with IC50 values in the nanomolar (40-80 nM) range. In this study, we performed whole-genome bisulfite sequencing (WGBS) of untreated (control), ST08 and ST09 (treated) triple-negative breast cancer cell line MDA-MB-231 to unravel epigenetic changes induced by the drug. We identified differentially methylated sites (DMSs) enriched in promoter regions across the genome. Analysis of the CpG island promoter methylation identified 12 genes common to both drugs, and 50% of them are known to be methylated in patient samples that were hypomethylated by drugs belonging to the homeobox family transcription factors. Methylation analysis of the gene body revealed 910 and 952 genes to be hypermethylated in ST08 and ST09 treated MDA-MB-231 cells, respectively. Correlation of the gene body hypermethylation with expression revealed CACNAH1 to be upregulated in ST08 treatment and CDH23 to be upregulated in ST09 treatment. Further, integrated analysis of the WGBS with RNA-seq identified uniquely altered pathways: ST08 altered the ECM pathway and ST09 the cell cycle, indicating drug-specific signatures.
Affiliation(s)
- Snehal Nirgude
- Institute of Bioinformatics and Applied Biotechnology, Electronic city phase 1, 560100 Bangalore, India; working at Division of Human Genetics, Children’s Hospital of Philadelphia, 19104 Philadelphia, PA USA
| | - Sagar Desai
- Institute of Bioinformatics and Applied Biotechnology, Electronic city phase 1, 560100 Bangalore, India
| | - Bibha Choudhary
- Institute of Bioinformatics and Applied Biotechnology, Electronic city phase 1, 560100 Bangalore, India
|
19
|
Lan Y, Liu W, Hou X, Wang S, Wang H, Deng M, Wang G, Ping Y, Zhang X. Revealing the functions of clonal driver gene mutations in patients based on evolutionary dependencies. Hum Mutat 2022; 43:2187-2204. [PMID: 36218010 DOI: 10.1002/humu.24484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2022] [Revised: 09/19/2022] [Accepted: 10/06/2022] [Indexed: 01/25/2023]
Abstract
The clonal mutations in driver genes enable cells to gradually acquire growth advantage in tumor development. Therefore, revealing the functions of clonal driver gene mutations is important. Here, we proposed the method FCMP that considered evolutionary dependencies to analyze the functions of clonal driver gene mutations in a single patient. Applying our method to five cancer types from The Cancer Genome Atlas, we identified specific functions and common functions of clonal driver gene mutations. We found that the clonal driver gene mutations in the same patient played multiple functions. We also found that clonal mutations in the same driver gene performed different functions in different patients. These findings suggested that the clonal driver gene mutations showed strong tumor heterogeneity. In the pan-cancer analysis, the immune-related functions for clonal driver gene mutations were shared by multiple cancer types. In addition, clonal mutations in some driver genes predicted the survival of patients in cancers.
Affiliation(s)
- Yujia Lan
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Wei Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Xiaobo Hou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Shuai Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Hao Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Menglan Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Guiyu Wang
- Department of Colorectal Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang, China
| | - Yanyan Ping
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Xinxin Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
|
20
|
Akinlalu AO, Njoku PC, Nzekwe CV, Oni RO, Fojude T, Faniyi AJ, Olagunju AS. Recent developments in the significant effect of mRNA modification (M6A) in glioblastoma and esophageal cancer. Scientific African 2022. [DOI: 10.1016/j.sciaf.2022.e01347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023] Open
|
21
|
Uncovering Oncogenic Mechanisms of Tumor Suppressor Genes in Breast Cancer Multi-Omics Data. Int J Mol Sci 2022; 23:ijms23179624. [PMID: 36077026 PMCID: PMC9455665 DOI: 10.3390/ijms23179624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Revised: 08/16/2022] [Accepted: 08/19/2022] [Indexed: 11/17/2022] Open
Abstract
Tumor suppressor genes (TSGs) are essential genes in the development of cancer. While they have many roles in normal cells, mutation and dysregulation of the TSGs result in aberrant molecular processes in cancer cells. Therefore, understanding TSGs and their roles in the oncogenic process is crucial for prevention and treatment of cancer. In this research, multi-omics breast cancer data were used to identify molecular mechanisms of TSGs in breast cancer. Differentially expressed genes and differentially coexpressed genes were identified in four large-scale transcriptomics data from public repositories and multi-omics data analyses of copy number, methylation and gene expression were performed. The results of the analyses were integrated using enrichment analysis and meta-analysis of a p-value summation method. The integrative analysis revealed that TSGs have a significant relationship with genes of gene ontology terms that are related to cell cycle, genome stability, RNA processing and metastasis, indicating the regulatory mechanisms of TSGs on cancer cells. The analysis frame and research results will provide valuable information for the further identification of TSGs in different types of cancers.
|
22
|
Lin X, Liu Y, Liu S, Zhu X, Wu L, Zhu Y, Zhao D, Xu X, Chemparathy A, Wang H, Cao Y, Nakamura M, Noordermeer JN, La Russa M, Wong WH, Zhao K, Qi LS. Nested epistasis enhancer networks for robust genome regulation. Science 2022; 377:1077-1085. [PMID: 35951677 DOI: 10.1126/science.abk3512] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Indexed: 12/11/2022]
Abstract
Mammalian genomes possess multiple enhancers spanning an ultralong distance (>megabases) to modulate important genes, yet it is unclear how these enhancers coordinate to achieve this task. Here, we combine multiplexed CRISPRi screening with machine learning to define quantitative enhancer-enhancer interactions. We find that the ultralong distance enhancer network possesses a nested multi-layer architecture that confers functional robustness of gene expression. Experimental characterization reveals that enhancer epistasis is maintained by three-dimensional chromosomal interactions and BRD4 condensation. Machine learning prediction of synergistic enhancers provides an effective strategy to identify non-coding variant pairs associated with pathogenic genes in diseases beyond Genome-Wide Association Studies (GWAS) analysis. Our work unveils nested epistasis enhancer networks, which can better explain enhancer functions within cells and in diseases.
Affiliation(s)
- Xueqiu Lin
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Yanxia Liu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Shuai Liu
- Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute NIH, Bethesda, MD 20892, USA
| | - Xiang Zhu
- Department of Statistics, Stanford University, Stanford, CA 94305, USA; Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA; Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
| | - Lingling Wu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Yanyu Zhu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Dehua Zhao
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Xiaoshu Xu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | | | - Haifeng Wang
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Yaqiang Cao
- Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute NIH, Bethesda, MD 20892, USA
| | - Muneaki Nakamura
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | | | - Marie La Russa
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Wing Hung Wong
- Department of Statistics, Stanford University, Stanford, CA 94305, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
| | - Keji Zhao
- Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute NIH, Bethesda, MD 20892, USA
| | - Lei S Qi
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; ChEM-H, Stanford University, Stanford, CA 94305, USA; Chan Zuckerberg BioHub, San Francisco, CA 94158, USA
|
23
|
A multiplexed electrochemical quantitative polymerase chain reaction platform for single-base mutation analysis. Biosens Bioelectron 2022; 214:114496. [DOI: 10.1016/j.bios.2022.114496] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/16/2021] [Revised: 06/19/2022] [Accepted: 06/20/2022] [Indexed: 11/17/2022]
|
24
|
Effects of Multi-Omics Characteristics on Identification of Driver Genes Using Machine Learning Algorithms. Genes (Basel) 2022; 13:genes13050716. [PMID: 35627101 PMCID: PMC9141966 DOI: 10.3390/genes13050716] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/16/2022] [Revised: 04/16/2022] [Accepted: 04/18/2022] [Indexed: 12/19/2022] Open
Abstract
Cancer is a complex disease caused by genomic and epigenetic alterations; hence, identifying meaningful cancer drivers is an important and challenging task. Most studies have detected cancer drivers with mutated traits, while few studies consider multiple omics characteristics as important factors. In this study, we present a framework to analyze the effects of multi-omics characteristics on the identification of driver genes. We utilize four machine learning algorithms within this framework to detect cancer driver genes in pan-cancer data, including 75 characteristics among 19,636 genes. The 75 features are divided into four types and analyzed using Kullback–Leibler divergence based on CGC genes and non-CGC genes. We detect cancer driver genes in two different ways. One is to detect driver genes from a single feature type, while the other is from the top N features. The first analysis denotes that the mutational features are the best characteristics. The second analysis reveals that the top 45 features are the most effective feature combinations and superior to the mutational features. The top 45 features not only contain mutational features but also three other types of features. Therefore, our study extends the detection of cancer driver genes and provides a more comprehensive understanding of cancer mechanisms.
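The feature comparison described in this abstract can be illustrated with a small sketch. It assumes one numeric feature, histogram binning with a pseudocount, and invented toy values; the bin edges, the pseudocount, and the function names are illustrative choices, not details taken from the paper.

```python
import math
from typing import List, Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Discrete Kullback-Leibler divergence D(P || Q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def histogram(values: Sequence[float], edges: Sequence[float],
              pseudocount: float = 1.0) -> List[float]:
    """Bin values into len(edges) - 1 bins and return smoothed probabilities.

    The pseudocount keeps every bin non-empty so the divergence stays finite.
    """
    n_bins = len(edges) - 1
    counts = [pseudocount] * n_bins
    for v in values:
        for i in range(n_bins):
            is_last = i == n_bins - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical toy values of one feature for CGC and non-CGC genes.
cgc_values = [0.8, 0.9, 0.7, 0.85, 0.6]
non_cgc_values = [0.1, 0.2, 0.15, 0.3, 0.25]
edges = [0.0, 0.25, 0.5, 0.75, 1.0]

p = histogram(cgc_values, edges)
q = histogram(non_cgc_values, edges)
score = kl_divergence(p, q)
print(f"KL(CGC || non-CGC) = {score:.3f}")
```

A larger value means the feature is distributed more differently between the two gene sets. Note that the divergence is asymmetric, so swapping the arguments generally gives a different number.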
25
Andrades R, Recamonde-Mendoza M. Machine learning methods for prediction of cancer driver genes: a survey paper. Brief Bioinform 2022; 23:6551145. [PMID: 35323900 DOI: 10.1093/bib/bbac062] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/06/2021] [Revised: 02/06/2022] [Accepted: 02/08/2022] [Indexed: 12/21/2022] Open
Abstract
Identifying the genes and mutations that drive the emergence of tumors is a critical step to improving our understanding of cancer and identifying new directions for disease diagnosis and treatment. Despite the large volume of genomics data, the precise detection of driver mutations and their carrying genes, known as cancer driver genes, from the millions of possible somatic mutations remains a challenge. Computational methods play an increasingly important role in discovering genomic patterns associated with cancer drivers and developing predictive models to identify these elements. Machine learning (ML), including deep learning, has been the engine behind many of these efforts and provides excellent opportunities for tackling remaining gaps in the field. Thus, this survey aims to perform a comprehensive analysis of ML-based computational approaches to identify cancer driver mutations and genes, providing an integrated, panoramic view of the broad data and algorithmic landscape within this scientific problem. We discuss how the interactions among data types and ML algorithms have been explored in previous solutions and outline current analytical limitations that deserve further attention from the scientific community. We hope that by helping readers become more familiar with significant developments in the field brought by ML, we may inspire new researchers to address open problems and advance our knowledge towards cancer driver discovery.
Affiliation(s)
- Renan Andrades
  - Institute of Informatics, Universidade Federal do Rio Grande do Sul, Porto Alegre/RS, Brazil; Bioinformatics Core, Hospital de Clínicas de Porto Alegre, Porto Alegre/RS, Brazil
- Mariana Recamonde-Mendoza
  - Institute of Informatics, Universidade Federal do Rio Grande do Sul, Porto Alegre/RS, Brazil; Bioinformatics Core, Hospital de Clínicas de Porto Alegre, Porto Alegre/RS, Brazil
26
Comprehensive patient-level classification and quantification of driver events in TCGA PanCanAtlas cohorts. PLoS Genet 2022; 18:e1009996. [PMID: 35030162 PMCID: PMC8759692 DOI: 10.1371/journal.pgen.1009996] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/24/2021] [Accepted: 12/14/2021] [Indexed: 12/14/2022] Open
Abstract
There is a growing need to develop novel therapeutics for targeted treatment of cancer. The prerequisite to success is the knowledge about which types of molecular alterations are predominantly driving tumorigenesis. To shed light onto this subject, we have utilized the largest database of human cancer mutations–TCGA PanCanAtlas, multiple established algorithms for cancer driver prediction (2020plus, CHASMplus, CompositeDriver, dNdScv, DriverNet, HotMAPS, OncodriveCLUSTL, OncodriveFML) and developed four novel computational pipelines: SNADRIF (Single Nucleotide Alteration DRIver Finder), GECNAV (Gene Expression-based Copy Number Alteration Validator), ANDRIF (ANeuploidy DRIver Finder) and PALDRIC (PAtient-Level DRIver Classifier). A unified workflow integrating all these pipelines, algorithms and datasets at cohort and patient levels was created. We have found that there are on average 12 driver events per tumour, of which 0.6 are single nucleotide alterations (SNAs) in oncogenes, 1.5 are amplifications of oncogenes, 1.2 are SNAs in tumour suppressors, 2.1 are deletions of tumour suppressors, 1.5 are driver chromosome losses, 1 is a driver chromosome gain, 2 are driver chromosome arm losses, and 1.5 are driver chromosome arm gains. The average number of driver events per tumour increases with age (from 7 to 15) and cancer stage (from 10 to 15) and varies strongly between cancer types (from 1 to 24). Patients with 1 and 7 driver events per tumour are the most frequent, and there are very few patients with more than 40 events. In tumours having only one driver event, this event is most often an SNA in an oncogene. However, with increasing number of driver events per tumour, the contribution of SNAs decreases, whereas the contribution of copy-number alterations and aneuploidy events increases. 
By analysing genomic and transcriptomic data from 10000 cancer patients through our custom-built computational pipelines and previously established third-party algorithms, we have found that half of all driver events in a patient’s tumour appear to be gains and losses of chromosomal arms and whole chromosomes. We therefore suggest that future therapeutics development efforts should be focused on targeting aneuploidy. We have also found that approximately a third of driver events in a patient are whole gene amplifications and deletions. Thus, therapies aimed at copy-number alterations also appear very promising. On the other hand, drugs aiming at point mutations are predicted to be less successful, as these alterations are responsible for just a couple of drivers per tumour. One notable exception are patients having only one driver event in their tumours, as this event is almost always a single nucleotide alteration in an oncogene.
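The proportions quoted in this abstract can be checked with a short calculation. The sketch below uses only the per-class averages stated above; the grouping labels are descriptive names chosen here, and computing shares against the itemised sum (11.4, which the abstract reports as "on average 12", presumably after rounding) is an assumption.

```python
# Average driver events per tumour, by class, as quoted in the abstract.
driver_events = {
    "SNA in oncogene": 0.6,
    "oncogene amplification": 1.5,
    "SNA in tumour suppressor": 1.2,
    "tumour suppressor deletion": 2.1,
    "chromosome loss": 1.5,
    "chromosome gain": 1.0,
    "chromosome arm loss": 2.0,
    "chromosome arm gain": 1.5,
}

# Grouping into three broad alteration types (labels are illustrative).
groups = {
    "point mutations (SNAs)": ["SNA in oncogene", "SNA in tumour suppressor"],
    "copy-number alterations": ["oncogene amplification",
                                "tumour suppressor deletion"],
    "aneuploidy": ["chromosome loss", "chromosome gain",
                   "chromosome arm loss", "chromosome arm gain"],
}

total = sum(driver_events.values())
group_counts = {name: sum(driver_events[k] for k in keys)
                for name, keys in groups.items()}
shares = {name: count / total for name, count in group_counts.items()}

print(f"itemised total: {total:.1f} events per tumour")
for name in groups:
    print(f"{name}: {group_counts[name]:.1f} events, {shares[name]:.0%} of total")
```

Under these assumptions the aneuploidy group comes to about half of the itemised total, copy-number alterations to roughly a third, and SNAs to fewer than two events per tumour, which matches the wording of the author summary.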
27
Feizi N, Nair SK, Smirnov P, Beri G, Eeles C, Esfahani PN, Nakano M, Tkachuk D, Mammoliti A, Gorobets E, Mer AS, Lin E, Yu Y, Martin S, Hafner M, Haibe-Kains B. PharmacoDB 2.0: improving scalability and transparency of in vitro pharmacogenomics analysis. Nucleic Acids Res 2022; 50:D1348-D1357. [PMID: 34850112 PMCID: PMC8728279 DOI: 10.1093/nar/gkab1084] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 09/17/2021] [Revised: 10/15/2021] [Accepted: 10/20/2021] [Indexed: 11/14/2022] Open
Abstract
Cancer pharmacogenomics studies provide valuable insights into disease progression and associations between genomic features and drug response. PharmacoDB integrates multiple cancer pharmacogenomics datasets profiling approved and investigational drugs across cell lines from diverse tissue types. The web-application enables users to efficiently navigate across datasets, view and compare drug dose-response data for a specific drug-cell line pair. In the new version of PharmacoDB (version 2.0, https://pharmacodb.ca/), we present (i) new datasets such as NCI-60, the Profiling Relative Inhibition Simultaneously in Mixtures (PRISM) dataset, as well as updated data from the Genomics of Drug Sensitivity in Cancer (GDSC) and the Genentech Cell Line Screening Initiative (gCSI); (ii) implementation of FAIR data pipelines using ORCESTRA and PharmacoDI; (iii) enhancements to drug-response analysis such as tissue distribution of dose-response metrics and biomarker analysis; and (iv) improved connectivity to drug and cell line databases in the community. The web interface has been rewritten using a modern technology stack to ensure scalability and standardization to accommodate growing pharmacogenomics datasets. PharmacoDB 2.0 is a valuable tool for mining pharmacogenomics datasets, comparing and assessing drug-response phenotypes of cancer models.
Affiliation(s)
- Nikta Feizi
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Sisira Kadambat Nair
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Petr Smirnov
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
- Gangesh Beri
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Christopher Eeles
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Parinaz Nasr Esfahani
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Minoru Nakano
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Denis Tkachuk
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Anthony Mammoliti
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
- Evgeniya Gorobets
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 3G5, Canada
- Arvind Singh Mer
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
- Eva Lin
  - Department of Discovery Oncology, Genentech Inc, South San Francisco, CA 94080, USA
- Yihong Yu
  - Department of Discovery Oncology, Genentech Inc, South San Francisco, CA 94080, USA
- Scott Martin
  - Department of Discovery Oncology, Genentech Inc, South San Francisco, CA 94080, USA
- Marc Hafner
  - Department of Oncology Bioinformatics, Genentech Inc, South San Francisco, CA 94080, USA
- Benjamin Haibe-Kains
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON M5T 3A1, Canada
  - Ontario Institute for Cancer Research, Toronto, ON M5G 0A3, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON M5G 1M1, Canada
28
Arslan E, Schulz J, Rai K. Machine Learning in Epigenomics: Insights into Cancer Biology and Medicine. Biochim Biophys Acta Rev Cancer 2021; 1876:188588. [PMID: 34245839 PMCID: PMC8595561 DOI: 10.1016/j.bbcan.2021.188588] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/09/2021] [Revised: 05/29/2021] [Accepted: 07/02/2021] [Indexed: 02/01/2023]
Abstract
The recent deluge of genome-wide technologies for the mapping of the epigenome and resulting data in cancer samples has provided the opportunity for gaining insights into and understanding the roles of epigenetic processes in cancer. However, the complexity, high-dimensionality, sparsity, and noise associated with these data pose challenges for extensive integrative analyses. Machine Learning (ML) algorithms are particularly suited for epigenomic data analyses due to their flexibility and ability to learn underlying hidden structures. We will discuss four overlapping but distinct major categories under ML: dimensionality reduction, unsupervised methods, supervised methods, and deep learning (DL). We review the preferred use cases of these algorithms in analyses of cancer epigenomics data with the hope to provide an overview of how ML approaches can be used to explore fundamental questions on the roles of epigenome in cancer biology and medicine.
Affiliation(s)
- Emre Arslan
  - Department of Genomic Medicine, MD Anderson Cancer Center, Houston, TX 77030, United States of America
- Jonathan Schulz
  - Department of Genomic Medicine, MD Anderson Cancer Center, Houston, TX 77030, United States of America
- Kunal Rai
  - Department of Genomic Medicine, MD Anderson Cancer Center, Houston, TX 77030, United States of America
29
Tran V, Kim R, Maertens M, Hartung T, Maertens A. Similarities and Differences in Gene Expression Networks Between the Breast Cancer Cell Line Michigan Cancer Foundation-7 and Invasive Human Breast Cancer Tissues. Front Artif Intell 2021; 4:674370. [PMID: 34056582 PMCID: PMC8155268 DOI: 10.3389/frai.2021.674370] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/01/2021] [Accepted: 04/23/2021] [Indexed: 12/31/2022] Open
Abstract
Failure to adequately characterize cell lines, and understand the differences between in vitro and in vivo biology, can have serious consequences on the translatability of in vitro scientific studies to human clinical trials. This project focuses on the Michigan Cancer Foundation-7 (MCF-7) cells, a human breast adenocarcinoma cell line that is commonly used for in vitro cancer research, with over 42,000 publications in PubMed. In this study, we explore the key similarities and differences in gene expression networks of MCF-7 cell lines compared to human breast cancer tissues. We used two MCF-7 data sets, one data set collected by ARCHS4 including 1032 samples and one data set from Gene Expression Omnibus GSE50705 with 88 estradiol-treated MCF-7 samples. The human breast invasive ductal carcinoma (BRCA) data set came from The Cancer Genome Atlas, including 1212 breast tissue samples. Weighted Gene Correlation Network Analysis (WGCNA) and functional annotations of the data showed that MCF-7 cells and human breast tissues have only minimal similarity in biological processes, although some fundamental functions, such as cell cycle, are conserved. Scaled connectivity—a network topology metric—also showed drastic differences in the behavior of genes between MCF-7 and BRCA data sets. Finally, we used canSAR to compute ligand-based druggability scores of genes in the data sets, and our results suggested that using MCF-7 to study breast cancer may lead to missing important gene targets. Our comparison of the networks of MCF-7 and human breast cancer highlights the nuances of using MCF-7 to study human breast cancer and can contribute to better experimental design and result interpretation of study involving this cell line.
Affiliation(s)
- Vy Tran
  - Department of Environmental Health and Engineering, Center for Alternatives to Animal Testing, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, United States
- Robert Kim
  - Department of Environmental Health and Engineering, Center for Alternatives to Animal Testing, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, United States
- Mikhail Maertens
  - Department of Environmental Health and Engineering, Center for Alternatives to Animal Testing, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, United States
- Thomas Hartung
  - Department of Environmental Health and Engineering, Center for Alternatives to Animal Testing, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, United States; Department of Biology, Center for Alternatives to Animal Testing-Europe, University of Konstanz, Konstanz, Germany; Department of Environmental Health and Engineering, Doerenkamp-Zbinden Professor and Chair for Evidence-Based Toxicology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, United States
- Alexandra Maertens
  - Department of Environmental Health and Engineering, Center for Alternatives to Animal Testing, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, United States
30
Li JJ, Chen YE, Tong X. A flexible model-free prediction-based framework for feature ranking. Journal of Machine Learning Research 2021; 22:124. [PMID: 35321091 PMCID: PMC8939838] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Despite the availability of numerous statistical and machine learning tools for joint feature modeling, many scientists investigate features marginally, i.e., one feature at a time. This is partly due to training and convention but also roots in scientists' strong interests in simple visualization and interpretability. As such, marginal feature ranking for some predictive tasks, e.g., prediction of cancer driver genes, is widely practiced in the process of scientific discoveries. In this work, we focus on marginal ranking for binary classification, one of the most common predictive tasks. We argue that the most widely used marginal ranking criteria, including the Pearson correlation, the two-sample t test, and two-sample Wilcoxon rank-sum test, do not fully take feature distributions and prediction objectives into account. To address this gap in practice, we propose two ranking criteria corresponding to two prediction objectives: the classical criterion (CC) and the Neyman-Pearson criterion (NPC), both of which use model-free nonparametric implementations to accommodate diverse feature distributions. Theoretically, we show that under regularity conditions, both criteria achieve sample-level ranking that is consistent with their population-level counterpart with high probability. Moreover, NPC is robust to sampling bias when the two class proportions in a sample deviate from those in the population. This property endows NPC good potential in biomedical research where sampling biases are ubiquitous. We demonstrate the use and relative advantages of CC and NPC in simulation and real data studies. Our model-free objective-based ranking idea is extendable to ranking feature subsets and generalizable to other prediction tasks and learning objectives.
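The idea of ranking features marginally by a prediction objective can be sketched as follows. This is not the authors' CC or NPC implementation: as a simple stand-in it scores each feature by the held-out classification error of a one-feature threshold rule (a decision stump). The function names, the 50/50 split, and the simulated data are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def predict(x: float, threshold: float, sign: int) -> int:
    """Predict class 1 when sign * (x - threshold) is positive, else class 0."""
    return 1 if sign * (x - threshold) > 0 else 0


def fit_stump(x: Sequence[float], y: Sequence[int]) -> Tuple[float, int]:
    """Pick the threshold and direction with the fewest training errors."""
    best_threshold, best_sign, best_errors = 0.0, 1, len(y) + 1
    for threshold in sorted(set(x)):
        for sign in (1, -1):
            errors = sum(predict(xi, threshold, sign) != yi
                         for xi, yi in zip(x, y))
            if errors < best_errors:
                best_threshold, best_sign, best_errors = threshold, sign, errors
    return best_threshold, best_sign


def rank_features(columns: Sequence[Sequence[float]], y: Sequence[int],
                  train_fraction: float = 0.5, seed: int = 0) -> List[int]:
    """Return feature indices ordered from lowest to highest held-out error."""
    rng = random.Random(seed)
    indices = list(range(len(y)))
    rng.shuffle(indices)
    cut = int(len(indices) * train_fraction)
    train, test = indices[:cut], indices[cut:]
    scored = []
    for j, column in enumerate(columns):
        threshold, sign = fit_stump([column[i] for i in train],
                                    [y[i] for i in train])
        wrong = sum(predict(column[i], threshold, sign) != y[i] for i in test)
        scored.append((wrong / len(test), j))
    return [j for _, j in sorted(scored)]


# Simulated data: feature 0 is pure noise, feature 1 separates the classes.
data_rng = random.Random(1)
labels = [0] * 100 + [1] * 100
noise = [data_rng.gauss(0, 1) for _ in range(200)]
informative = ([data_rng.gauss(0, 1) for _ in range(100)]
               + [data_rng.gauss(3, 1) for _ in range(100)])

ranking = rank_features([noise, informative], labels)
print("features ranked best to worst:", ranking)
```

Scoring on held-out samples rather than the training split keeps the ranking tied to predictive performance, which is the spirit of the objective-based criteria described in the abstract.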
Affiliation(s)
- Xin Tong
  - Department of Data Sciences and Operations, Marshall Business School, University of Southern California