1
Jiang J, van Ertvelde J, Ertaylan G, Peeters R, Jennen D, de Kok TM, Vinken M. Unraveling the mechanisms underlying drug-induced cholestatic liver injury: identifying key genes using machine learning techniques on human in vitro data sets. Arch Toxicol 2023; 97:2969-2981. [PMID: 37603094 PMCID: PMC10504391 DOI: 10.1007/s00204-023-03583-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/30/2023] [Accepted: 08/10/2023] [Indexed: 08/22/2023]
Abstract
Drug-induced intrahepatic cholestasis (DIC) is a main type of hepatic toxicity that is challenging to predict in early drug development stages. Preclinical animal studies often fail to detect DIC in humans. In vitro toxicogenomics assays using human liver cells have become a practical approach to predict human-relevant DIC. The present study was set up to identify transcriptomic signatures of DIC by applying machine learning algorithms to the Open TG-GATEs database. A total of nine DIC compounds and nine non-DIC compounds were selected, and supervised classification algorithms were applied to develop prediction models using differentially expressed features. Feature selection techniques identified 13 genes that achieved optimal prediction performance using logistic regression combined with a sequential backward selection method. The internal validation of the best-performing model showed accuracy of 0.958, sensitivity of 0.941, specificity of 0.978, and F1-score of 0.956. Applying the model to an external validation set resulted in an average prediction accuracy of 0.71. The identified genes were mechanistically linked to the adverse outcome pathway network of DIC, providing insights into cellular and molecular processes during response to chemical toxicity. Our findings provide valuable insights into toxicological responses and enhance the predictive accuracy of DIC prediction, thereby advancing the application of transcriptome profiling in designing new approach methodologies for hazard identification.
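The internal-validation figures quoted in this abstract (accuracy, sensitivity, specificity, F1-score) are standard confusion-matrix metrics. The sketch below shows how such values are typically derived. The function name and the counts are hypothetical placeholders chosen for illustration; they are not data from the study and do not reproduce its reported values.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive common binary-classification metrics from confusion-matrix counts.

    tp/fn: positive (e.g. DIC) samples predicted correctly/incorrectly.
    tn/fp: negative (e.g. non-DIC) samples predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        # Fraction of all samples classified correctly.
        "accuracy": (tp + tn) / total,
        # True-positive rate: share of actual positives that were detected.
        "sensitivity": tp / (tp + fn),
        # True-negative rate: share of actual negatives that were rejected.
        "specificity": tn / (tn + fp),
        # Harmonic mean of precision and sensitivity.
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in classification_metrics(tp=45, fp=10, tn=40, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters for a balanced design like the nine-versus-nine compound set described here, because accuracy alone does not show which class the errors fall in.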
Affiliation(s)
- Jian Jiang
  - Entity of In Vitro Toxicology and Dermato‑Cosmetology, Department of Pharmaceutical and Pharmacological Sciences, Vrije Universiteit Brussel, Laarbeeklaan 103, 1090, Brussels, Belgium
- Jonas van Ertvelde
  - Entity of In Vitro Toxicology and Dermato‑Cosmetology, Department of Pharmaceutical and Pharmacological Sciences, Vrije Universiteit Brussel, Laarbeeklaan 103, 1090, Brussels, Belgium
- Gökhan Ertaylan
  - Vlaamse Instelling voor Technologisch Onderzoek (VITO) NV, Health, Boeretang 200, 2400, Mol, Belgium
- Ralf Peeters
  - Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands
  - Department of Advanced Computing Sciences, Maastricht University, Maastricht, The Netherlands
- Danyel Jennen
  - Department of Toxicogenomics, GROW School for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
- Theo M de Kok
  - Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands
  - Department of Toxicogenomics, GROW School for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
- Mathieu Vinken
  - Entity of In Vitro Toxicology and Dermato‑Cosmetology, Department of Pharmaceutical and Pharmacological Sciences, Vrije Universiteit Brussel, Laarbeeklaan 103, 1090, Brussels, Belgium
2
Albert Jesuwaram AM, Maria Sebastin GP. Implementation analysis of pixel‐level image processing based on multiscale transforms. Comput Intell 2021. [DOI: 10.1111/coin.12384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
3
Identification of early liver toxicity gene biomarkers using comparative supervised machine learning. Sci Rep 2020; 10:19128. [PMID: 33154507 PMCID: PMC7645727 DOI: 10.1038/s41598-020-76129-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/12/2020] [Accepted: 10/12/2020] [Indexed: 02/08/2023] [Open access]
Abstract
Screening agrochemicals and pharmaceuticals for potential liver toxicity is required for regulatory approval and is an expensive and time-consuming process. The identification and utilization of early exposure gene signatures and robust predictive models in regulatory toxicity testing has the potential to reduce time and costs substantially. In this study, comparative supervised machine learning approaches were applied to the rat liver TG-GATEs dataset to develop feature selection and predictive testing. We identified ten gene biomarkers using three different feature selection methods that predicted liver necrosis with high specificity and selectivity in an independent validation dataset from the Microarray Quality Control (MAQC)-II study. Nine of the ten genes that were selected with the supervised methods are involved in metabolism and detoxification (Car3, Crat, Cyp39a1, Dcd, Lbp, Scly, Slc23a1, and Tkfc) and transcriptional regulation (Ablim3). Several of these genes are also implicated in liver carcinogenesis, including Crat, Car3 and Slc23a1. Our biomarker gene signature provides high statistical accuracy and a manageable number of genes to study as indicators to potentially accelerate toxicity testing based on their ability to induce liver necrosis and, eventually, liver cancer.
4
Lopez-Rincon A, Mendoza-Maldonado L, Martinez-Archundia M, Schönhuth A, Kraneveld AD, Garssen J, Tonda A. Machine Learning-Based Ensemble Recursive Feature Selection of Circulating miRNAs for Cancer Tumor Classification. Cancers (Basel) 2020; 12:1785. [PMID: 32635415 PMCID: PMC7407482 DOI: 10.3390/cancers12071785] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 06/03/2020] [Revised: 06/25/2020] [Accepted: 06/29/2020] [Indexed: 02/07/2023] [Open access]
Abstract
Circulating microRNAs (miRNA) are small noncoding RNA molecules that can be detected in bodily fluids without the need for major invasive procedures on patients. miRNAs have shown great promise as biomarkers for tumors to both assess their presence and to predict their type and subtype. Recently, thanks to the availability of miRNAs datasets, machine learning techniques have been successfully applied to tumor classification. The results, however, are difficult to assess and interpret by medical experts because the algorithms exploit information from thousands of miRNAs. In this work, we propose a novel technique that aims at reducing the necessary information to the smallest possible set of circulating miRNAs. The dimensionality reduction achieved reflects a very important first step in a potential, clinically actionable, circulating miRNA-based precision medicine pipeline. While it is currently under discussion whether this first step can be taken, we demonstrate here that it is possible to perform classification tasks by exploiting a recursive feature elimination procedure that integrates a heterogeneous ensemble of high-quality, state-of-the-art classifiers on circulating miRNAs. Heterogeneous ensembles can compensate inherent biases of classifiers by using different classification algorithms. Selecting features then further eliminates biases emerging from using data from different studies or batches, yielding more robust and reliable outcomes. The proposed approach is first tested on a tumor classification problem in order to separate 10 different types of cancer, with samples collected over 10 different clinical trials, and later is assessed on a cancer subtype classification task, with the aim to distinguish triple negative breast cancer from other subtypes of breast cancer. Overall, the presented methodology proves to be effective and compares favorably to other state-of-the-art feature selection methods.
Affiliation(s)
- Alejandro Lopez-Rincon (corresponding author)
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Faculty of Science, Utrecht University, Universiteitsweg 99, 3584 CG Utrecht, The Netherlands
- Lucero Mendoza-Maldonado
  - Nuevo Hospital Civil de Guadalajara “Dr. Juan I. Menchaca”, Salvador Quevedo y Zubieta 750, Independencia Oriente, Guadalajara C.P. 44340, Jalisco, Mexico
- Marlet Martinez-Archundia
  - Laboratorio de Modelado Molecular, Bioinformática y Diseno de farmacos, Seccion de Estudios de Posgrado e Investigación, Escuela Superior de Medicina, Instituto Politécnico Nacional, Mexico City 11340, Mexico
- Alexander Schönhuth
  - Life Sciences and Health, Centrum Wiskunde & Informatica, Science Park 123, 1098 XG Amsterdam, The Netherlands
  - Genome Data Science, Faculty of Technology, Bielefeld University, Universitätsstraße 25, 33615 Bielefeld, Germany
- Aletta D. Kraneveld
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Faculty of Science, Utrecht University, Universiteitsweg 99, 3584 CG Utrecht, The Netherlands
- Johan Garssen
  - Division of Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Faculty of Science, Utrecht University, Universiteitsweg 99, 3584 CG Utrecht, The Netherlands
  - Global Centre of Excellence Immunology Danone Nutricia Research, Uppsalaan 12, 3584 CT Utrecht, The Netherlands
- Alberto Tonda
  - UMR 518 MIA-Paris, INRAE, Université Paris-Saclay, 75013 Paris, France
5
Hu Y, Dingerdissen H, Gupta S, Kahsay R, Shanker V, Wan Q, Yan C, Mazumder R. Identification of key differentially expressed MicroRNAs in cancer patients through pan-cancer analysis. Comput Biol Med 2018; 103:183-197. [PMID: 30384176 DOI: 10.1016/j.compbiomed.2018.10.021] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 08/14/2018] [Revised: 10/01/2018] [Accepted: 10/17/2018] [Indexed: 12/16/2022]
Abstract
microRNAs (miRNAs) functioning in gene silencing have been associated with cancer progression. However, common abnormal miRNA expression patterns and their potential roles in cancer have not yet been evaluated. To account for individual differences between patients, we retrieved miRNA sequencing data for 575 patients with both tumor and adjacent non-tumorous tissues from 14 cancer types from The Cancer Genome Atlas (TCGA). We then performed differential expression analysis using DESeq2 and edgeR. Results showed that cancer types can be grouped based on the distribution of miRNAs with different expression patterns between tumor and non-tumor samples. We found 81 significantly differentially expressed miRNAs (SDEmiRNAs) in a single cancer. We also found 21 key SDEmiRNAs (nine over-expressed and 12 under-expressed) associated with at least eight cancers each and enriched in more than 60% of patients per cancer, including four newly identified SDEmiRNAs (hsa-mir-4746, hsa-mir-3648, hsa-mir-3687, and hsa-mir-1269a). The downstream effects of these 21 SDEmiRNAs on cellular function were evaluated through enrichment and pathway analysis of 7186 protein-coding gene targets mined from literature reports of differential expression of miRNAs in cancer. This analysis enables identification of SDEmiRNA functional similarity in cell proliferation control across a wide range of cancers, and assembly of common regulatory networks over cancer-related pathways. These findings were validated by construction of a regulatory network in the PI3K pathway. This study provides evidence for the value of further analysis of SDEmiRNAs as potential biomarkers and therapeutic targets for cancer diagnosis and treatment.
Affiliation(s)
- Yu Hu
  - The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC, 20037, USA
- Hayley Dingerdissen
  - The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC, 20037, USA
- Samir Gupta
  - Department of Computer and Information Science, University of Delaware, Newark, DE, 19716, USA
- Robel Kahsay
  - The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC, 20037, USA
- Vijay Shanker
  - Department of Computer and Information Science, University of Delaware, Newark, DE, 19716, USA
- Quan Wan
  - The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC, 20037, USA
- Cheng Yan
  - The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC, 20037, USA
- Raja Mazumder
  - The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC, 20037, USA
  - The McCormick Genomic and Proteomic Center, The George Washington University, Washington, DC, 20037, USA
6
Yang S, Shao F, Duan W, Zhao Y, Chen F. Variance component testing for identifying differentially expressed genes in RNA-seq data. PeerJ 2017; 5:e3797. [PMID: 28929020 PMCID: PMC5592911 DOI: 10.7717/peerj.3797] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/13/2017] [Accepted: 08/21/2017] [Indexed: 01/28/2023] [Open access]
Abstract
RNA sequencing (RNA-Seq) enables the measurement and comparison of gene expression with isoform-level quantification. Differences in the effect of each isoform may make traditional methods, which aggregate isoforms, ineffective. Here, we introduce a variance component-based test that can jointly test multiple isoforms of one gene to identify differentially expressed (DE) genes, especially those with isoforms that have differential effects. We model isoform-level expression data from RNA-Seq using a negative binomial distribution and consider the baseline abundance of isoforms and their effects as two random terms. Our approach tests the global null hypothesis of no difference in any of the isoforms. The null distribution of the derived score statistic is investigated using empirical and theoretical methods. The results of simulations suggest that the performance of the proposed set test is superior to that of traditional algorithms and almost reaches optimal power when the variance of covariates is large. This method is also applied to analyze real data. Our algorithm, as a supplement to traditional algorithms, is superior at selecting DE genes with sparse or opposite effects for isoforms.
Affiliation(s)
- Sheng Yang
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, China
- Fang Shao
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, China
- Weiwei Duan
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, China
- Yang Zhao
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, China
- Feng Chen
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, China
7
Zeng JH, Liang L, He RQ, Tang RX, Cai XY, Chen JQ, Luo DZ, Chen G. Comprehensive investigation of a novel differentially expressed lncRNA expression profile signature to assess the survival of patients with colorectal adenocarcinoma. Oncotarget 2017; 8:16811-16828. [PMID: 28187432 PMCID: PMC5370003 DOI: 10.18632/oncotarget.15161] [Citation(s) in RCA: 78] [Impact Index Per Article: 11.1] [Received: 11/22/2016] [Accepted: 01/24/2017] [Indexed: 02/06/2023] [Open access]
Abstract
Growing evidence has shown that long non-coding RNAs (lncRNAs) can serve as prospective markers for survival in patients with colorectal adenocarcinoma. However, most studies have explored a limited number of lncRNAs in a small number of cases. The objective of this study is to identify a panel of lncRNA signature that could evaluate the prognosis in colorectal adenocarcinoma based on the data from The Cancer Genome Atlas (TCGA). Altogether, 371 colon adenocarcinoma (COAD) patients with complete clinical data were included in our study as the test cohort. A total of 578 differentially expressed lncRNAs (DELs) were observed, among which 20 lncRNAs closely related to overall survival (OS) in COAD patients were identified using a Cox proportional regression model. A risk score formula was developed to assess the prognostic value of the lncRNA signature in COAD with four lncRNAs (LINC01555, RP11-610P16.1, RP11-108K3.1 and LINC01207), which were identified to possess the most remarkable correlation with OS in COAD patients. COAD patients with a high-risk score had poorer OS than those with a low-risk score. The multivariate Cox regression analyses confirmed that the four-lncRNA signature could function as an independent prognostic indicator for COAD patients, which was largely mirrored in the validating cohort with rectal adenocarcinoma (READ) containing 158 cases. In addition, the correlative genes of LINC01555 and LINC01207 were enriched in the cAMP signaling and mucin type O-Glycan biosynthesis pathways. With further validation in the future, our study indicates that the four-lncRNA signature could serve as an independent biomarker for survival of colorectal adenocarcinoma.
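The abstract mentions a risk score formula without stating it. Signatures of this kind are commonly built as a linear combination of expression levels weighted by Cox regression coefficients; the form below is that general convention, given here as an assumption and not as the formula used in the paper. The symbols $\beta_i$ and $x_i$ are illustrative names.

```latex
\[
  \text{Risk score} \;=\; \sum_{i=1}^{4} \beta_i \, x_i
\]
% x_i     : expression level of the i-th lncRNA in the signature
%           (LINC01555, RP11-610P16.1, RP11-108K3.1, LINC01207)
% \beta_i : coefficient of that lncRNA from the Cox regression model
```

Under this convention, patients are assigned to the high-risk or low-risk group by comparing their score with a cutoff; the abstract does not say which cutoff was used.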
Affiliation(s)
- Jiang-Hui Zeng
  - Department of Pathology, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, P. R. China
- Liang Liang
  - Department of General Surgery, First Affiliated Hospital of Guangxi Medical University (West Branch), Nanning, Guangxi Zhuang Autonomous Region, P. R. China
- Rong-Quan He
  - Department of Medical Oncology, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, P. R. China
- Rui-Xue Tang
  - Department of Pathology, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, P. R. China
- Xiao-Yong Cai
  - Department of General Surgery, First Affiliated Hospital of Guangxi Medical University (West Branch), Nanning, Guangxi Zhuang Autonomous Region, P. R. China
- Jun-Qiang Chen
  - Department of Gastrointestinal Surgery, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, P. R. China
- Dian-Zhong Luo
  - Department of Pathology, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, P. R. China
- Gang Chen
  - Department of Pathology, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, P. R. China