1
M S K, Rajaguru H, Nair AR. Evaluation and Exploration of Machine Learning and Convolutional Neural Network Classifiers in Detection of Lung Cancer from Microarray Gene-A Paradigm Shift. Bioengineering (Basel) 2023; 10:933. [PMID: 37627818 PMCID: PMC10451477 DOI: 10.3390/bioengineering10080933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 08/03/2023] [Accepted: 08/04/2023] [Indexed: 08/27/2023] Open
Abstract
Microarray gene expression-based detection and classification of medical conditions have been prominent in research studies over the past few decades. However, extracting relevant data from the high-volume microarray gene expression with inherent nonlinearity and inseparable noise components raises significant challenges during data classification and disease detection. The dataset used for the research is the Lung Harvard 2 Dataset (LH2) which consists of 150 Adenocarcinoma subjects and 31 Mesothelioma subjects. The paper proposes a two-level strategy involving feature extraction and selection methods before the classification step. The feature extraction step utilizes Short Term Fourier Transform (STFT), and the feature selection step employs Particle Swarm Optimization (PSO) and Harmonic Search (HS) metaheuristic methods. The classifiers employed are Nonlinear Regression, Gaussian Mixture Model, Softmax Discriminant, Naive Bayes, SVM (Linear), SVM (Polynomial), and SVM (RBF). The two-level extracted relevant features are compared with raw data classification results, including Convolutional Neural Network (CNN) methodology. Among the methods, STFT with PSO feature selection and SVM (RBF) classifier produced the highest accuracy of 94.47%.
Affiliation(s)
- Karthika M S
  - Department of Information Technology, Bannari Amman Institute of Technology, Sathyamangalam 638401, India
- Harikumar Rajaguru
  - Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathyamangalam 638401, India
- Ajin R. Nair
  - Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathyamangalam 638401, India
2
Wang J, Liu J, Hou Q, Xu M. LINC02126 is a potential diagnostic, prognostic and immunotherapeutic target for lung adenocarcinoma. BMC Pulm Med 2022; 22:412. [DOI: 10.1186/s12890-022-02215-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Accepted: 11/02/2022] [Indexed: 11/12/2022] Open
Abstract
Background
Adenocarcinoma has long been an independent histological class of lung cancer, which leads to high morbidity and mortality. We aimed to investigate the contribution of LINC02126 in lung adenocarcinoma.
Methods
RNA sequencing data and clinical information were downloaded. Diagnostic efficiency and survival analysis of LINC02126 were performed, followed by functional analysis of genes co-expressed with LINC02126 and differentially expressed genes (DEGs) in different LINC02126 expression groups. Tumor immune microenvironment (TIME) cell infiltration and correlation analysis of tumor mutation burden were performed in different LINC02126 expression groups.
Results
In lung adenocarcinoma, the expression level of LINC02126 was significantly decreased. Significant expression differences of LINC02126 were found in some clinical variables, including T staging, M staging, sex, stage, and EGFR mutation. LINC02126 had potential diagnostic and prognostic value for patients. In the low LINC02126 expression group, the infiltration degree of most immune cells was significantly lower than that in the high LINC02126 expression group. Tumor mutation burden level and frequency of somatic mutation in patients with low LINC02126 expression group were significantly higher than in patients with high LINC02126 expression group.
Conclusions
LINC02126 could be considered as a diagnostic, prognostic and immunotherapeutic target for lung adenocarcinoma.
3
Wang B, Law A, Regan T, Parkinson N, Cole J, Russell CD, Dockrell DH, Gutmann MU, Baillie JK. Systematic comparison of ranking aggregation methods for gene lists in experimental results. Bioinformatics 2022; 38:4927-4933. [PMID: 36094347 PMCID: PMC9620830 DOI: 10.1093/bioinformatics/btac621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Revised: 06/24/2022] [Accepted: 09/09/2022] [Indexed: 11/17/2022] Open
Abstract
MOTIVATION A common experimental output in biomedical science is a list of genes implicated in a given biological process or disease. The gene lists resulting from a group of studies answering the same, or similar, questions can be combined by ranking aggregation methods to find a consensus or a more reliable answer. Because the properties of a dataset can influence the performance of an algorithm, a ranking aggregation method should be evaluated on the specific type of data before it is used. Such evaluation on gene lists is usually based on a simulated database because of the lack of a known truth for real data. However, simulated datasets tend to be too small compared to experimental data and neglect key features, including heterogeneity of quality, relevance and the inclusion of unranked lists.
RESULTS In this study, a group of existing methods and their variations that are suitable for meta-analysis of gene lists are compared using simulated and real data. Simulated data were used to explore the performance of the aggregation methods under conditions emulating common scenarios of real genomic data, with varying heterogeneity of quality, noise level and a mix of unranked and ranked data, using 20 000 possible entities. In addition to the evaluation with simulated data, a comparison using real genomic data on the SARS-CoV-2 virus, cancer (non-small cell lung cancer) and bacteria (macrophage apoptosis) was performed. We summarize the results of our evaluation in a simple flowchart to select a ranking aggregation method, and in an automated implementation using the meta-analysis by information content (MAIC) algorithm to infer heterogeneity of data quality across input datasets.
AVAILABILITY AND IMPLEMENTATION The code for generating simulated data and running the edited versions of the algorithms is available at https://github.com/baillielab/comparison_of_RA_methods. Code to perform an optimal selection of methods based on the results of this review, using the MAIC algorithm to infer the characteristics of an input dataset, can be downloaded here: https://github.com/baillielab/maic. An online service for running MAIC: https://baillielab.net/maic.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bo Wang
  - Roslin Institute, University of Edinburgh, Edinburgh EH25 9RG, UK
- Andy Law
  - Roslin Institute, University of Edinburgh, Edinburgh EH25 9RG, UK
- Tim Regan
  - Roslin Institute, University of Edinburgh, Edinburgh EH25 9RG, UK
- Joby Cole
  - University of Sheffield, Sheffield S10 2NT, UK
- Clark D Russell
  - Centre for Inflammation Research, The Queen’s Medical Research Institute, University of Edinburgh, Edinburgh EH16 4TJ, UK
- David H Dockrell
  - Centre for Inflammation Research, The Queen’s Medical Research Institute, University of Edinburgh, Edinburgh EH16 4TJ, UK
- Michael U Gutmann
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK
4
Khaleel A, Alkhawaja B, Al-Qaisi TS, Alshalabi L, Tarkhan AH. Pathway analysis of smoking-induced changes in buccal mucosal gene expression. Egyptian Journal of Medical Human Genetics 2022; 23:69. [PMID: 37521848 PMCID: PMC8929449 DOI: 10.1186/s43042-022-00268-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2021] [Accepted: 02/16/2022] [Indexed: 11/10/2022] Open
Abstract
Background Cigarette smoking is the leading preventable cause of death worldwide, and it is the most common cause of oral cancers. This study aims to provide a deeper understanding of the molecular pathways in the oral cavity that are altered by exposure to cigarette smoke. Methods The gene expression dataset (accession number GSE8987, GPL96) of buccal mucosa samples from smokers (n = 5) and never smokers (n = 5) was downloaded from The National Center for Biotechnology Information's (NCBI) Gene Expression Omnibus (GEO) repository. Differential expression was ascertained via NCBI's GEO2R software, and Ingenuity Pathway Analysis (IPA) software was used to perform a pathway analysis. Results A total of 459 genes were found to be significantly differentially expressed in smoker buccal mucosa (p < 0.05). A total of 261 genes were over-expressed while 198 genes were under-expressed. The top canonical pathways predicted by IPA were nitric oxide and reactive oxygen production at macrophages, macrophages/fibroblasts and endothelial cells in rheumatoid arthritis, and thyroid cancer pathways. The IPA upstream analysis predicted that the TP53, APP, SMAD3, and TNF proteins as well as dexamethasone drug would be top transcriptional regulators. Conclusions IPA highlighted critical pathways of carcinogenesis, mainly nitric oxide and reactive oxygen production at macrophages, and confirmed widespread injury in the buccal mucosa due to exposure to cigarette smoke. Our findings suggest that cigarette smoking significantly impacts gene pathways in the buccal mucosa and may highlight potential targets for treating the effects of cigarette smoking. Supplementary Information The online version contains supplementary material available at 10.1186/s43042-022-00268-y.
Affiliation(s)
- Anas Khaleel
  - Department of Pharmacology and Biomedical Sciences, Faculty of Pharmacy and Medical Sciences, University of Petra, Amman, Jordan
- Bayan Alkhawaja
  - Department of Pharmaceutical Medicinal Chemistry and Pharmacognosy, Faculty of Pharmacy and Medical Sciences, University of Petra, Amman, Jordan
- Talal Salem Al-Qaisi
  - Department of Medical Laboratory Sciences, Pharmacological and Diagnostic Research Centre, Faculty of Allied Medical Sciences, Al-Ahliyya Amman University, Amman, Jordan
- Lubna Alshalabi
  - Department of Pharmaceutical Medicinal Chemistry and Pharmacognosy, Faculty of Pharmacy and Medical Sciences, University of Petra, Amman, Jordan
5
Jézéquel P, Gouraud W, Azzouz FB, Basseville A, Juin PP, Lasla H, Campone M. [Interest of the bc-GenExMiner web tool in oncology]. Bull Cancer 2021; 108:1057-1064. [PMID: 34561023 DOI: 10.1016/j.bulcan.2021.05.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2021] [Revised: 05/07/2021] [Accepted: 05/17/2021] [Indexed: 11/18/2022]
Abstract
We are taking advantage of the launch of the latest version (v4.6) of our web-based data mining tool "breast cancer gene-expression miner" (bc-GenExMiner) to take stock of its position within the oncology research landscape and to present an activity report ten years after its establishment (http://bcgenex.ico.unicancer.fr). bc-GenExMiner is an open-access, user-friendly tool for statistical mining on breast tumor transcriptomes, annotated with more than 20 clinicopathologic and molecular characteristics. The database comprises more than 16,000 patients from 64 cohorts - including TCGA, METABRIC and SCAN-B - for whom several thousands of genes have been quantified by microarrays or RNA-seq. Correlation, expression and prognostic analyses are available for targeted, exhaustive or customized explorations of queried genes. bc-GenExMiner facilitates the validation, investigation, and prioritization of discoveries and hypotheses on genes of interest. It allows users to analyse large databases, create data visualizations, and obtain robust statistical analysis, thereby accelerating biomarker discovery. Ten years after its launch, judging by the number of visits, analyses, and scientific citations of bc-GenExMiner, we conclude that this web resource serves its purpose in the international scientific community working in breast cancer research, with a never-ending rise in its use.
Affiliation(s)
- Pascal Jézéquel
  - Institut de cancérologie de l'Ouest, unité de bioinfomique, boulevard Jacques-Monod, 44805 Saint-Herblain cedex, France; Université de Nantes, université d'Angers, institut de recherche en santé-Université de Nantes, CRCINA, UMR 1232 Inserm, 8, quai Moncousu - BP 70721, 44007 Nantes cedex 1, France; SIRIC ILIAD, Nantes, Angers, France
- Wilfried Gouraud
  - Institut de cancérologie de l'Ouest, unité de bioinfomique, boulevard Jacques-Monod, 44805 Saint-Herblain cedex, France; SIRIC ILIAD, Nantes, Angers, France
- Fadoua Ben Azzouz
  - Institut de cancérologie de l'Ouest, unité de bioinfomique, boulevard Jacques-Monod, 44805 Saint-Herblain cedex, France; SIRIC ILIAD, Nantes, Angers, France
- Agnès Basseville
  - Institut de cancérologie de l'Ouest, unité de bioinfomique, boulevard Jacques-Monod, 44805 Saint-Herblain cedex, France
- Philippe P Juin
  - Université de Nantes, université d'Angers, institut de recherche en santé-Université de Nantes, CRCINA, UMR 1232 Inserm, 8, quai Moncousu - BP 70721, 44007 Nantes cedex 1, France; SIRIC ILIAD, Nantes, Angers, France
- Hamza Lasla
  - Institut de cancérologie de l'Ouest, unité de bioinfomique, boulevard Jacques-Monod, 44805 Saint-Herblain cedex, France; SIRIC ILIAD, Nantes, Angers, France
- Mario Campone
  - Université de Nantes, université d'Angers, institut de recherche en santé-Université de Nantes, CRCINA, UMR 1232 Inserm, 8, quai Moncousu - BP 70721, 44007 Nantes cedex 1, France; SIRIC ILIAD, Nantes, Angers, France; Institut de cancérologie de l'Ouest - René-Gauducheau, oncologie médicale, boulevard Jacques-Monod, 44805 Saint-Herblain cedex, France
6
Karaglani M, Gourlia K, Tsamardinos I, Chatzaki E. Accurate Blood-Based Diagnostic Biosignatures for Alzheimer's Disease via Automated Machine Learning. J Clin Med 2020; 9:E3016. [PMID: 32962113 PMCID: PMC7563988 DOI: 10.3390/jcm9093016] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 08/03/2020] [Revised: 09/04/2020] [Accepted: 09/14/2020] [Indexed: 12/17/2022] Open
Abstract
Alzheimer's disease (AD) is the most common form of neurodegenerative dementia and its timely diagnosis remains a major challenge in biomarker discovery. In the present study, we analyzed publicly available high-throughput low-sample -omics datasets from studies in AD blood, by the AutoML technology Just Add Data Bio (JADBIO), to construct accurate predictive models for use as diagnostic biosignatures. Considering data from AD patients and age-sex matched cognitively healthy individuals, we produced three best performing diagnostic biosignatures specific for the presence of AD: A. A 506-feature transcriptomic dataset from 48 AD and 22 controls led to a miRNA-based biosignature via Support Vector Machines with three miRNA predictors (AUC 0.975 (0.906, 1.000)), B. A 38,327-feature transcriptomic dataset from 134 AD and 100 controls led to six mRNA-based statistically equivalent signatures via Classification Random Forests with 25 mRNA predictors (AUC 0.846 (0.778, 0.905)) and C. A 9483-feature proteomic dataset from 25 AD and 37 controls led to a protein-based biosignature via Ridge Logistic Regression with seven protein predictors (AUC 0.921 (0.849, 0.972)). These performance metrics were also validated through the JADBIO pipeline confirming stability. In conclusion, using the automated machine learning tool JADBIO, we produced accurate predictive biosignatures extrapolating available low sample -omics data. These results offer options for minimally invasive blood-based diagnostic tests for AD, awaiting clinical validation based on respective laboratory assays. They also highlight the value of AutoML in biomarker discovery.
Affiliation(s)
- Makrina Karaglani
  - Laboratory of Pharmacology, Medical School, Democritus University of Thrace, 68100 Alexandroupolis, Greece
  - Gnosis Data Analysis PC, Science and Technology Park of Crete, N. Plastira 100, GR-700 13 Vassilika Vouton, Greece
- Krystallia Gourlia
  - Department of Computer Science, University of Crete, GR-700 13 Vassilika Vouton, Greece
- Ioannis Tsamardinos
  - Gnosis Data Analysis PC, Science and Technology Park of Crete, N. Plastira 100, GR-700 13 Vassilika Vouton, Greece
  - Department of Computer Science, University of Crete, GR-700 13 Vassilika Vouton, Greece
  - Institute of Applied and Computational Mathematics, Foundation for Research and Technology Hellas, GR-700 13 Vassilika Vouton, Greece
- Ekaterini Chatzaki
  - Laboratory of Pharmacology, Medical School, Democritus University of Thrace, 68100 Alexandroupolis, Greece
  - Institute of Agri-Food and Life Sciences, University Research Centre, Hellenic Mediterranean University, GR-71410 Heraklion, Greece
7
Abstract
BACKGROUND Lung adenocarcinoma (LUAD) is one of the most common malignancies, and is a serious threat to human health. The aim of the present study was to assess potential biomarkers for the prognosis of LUAD through the analysis of gene expression microarrays. METHODS The gene expression data for GSE118370 were downloaded from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) between normal lung and LUAD samples were screened using the R language. The DAVID database was used to analyze the functions and pathways of DEGs. The STRING database was used to map the protein-protein interaction (PPI) networks, and these were visualized with the Cytoscape software. Finally, the prognostic analysis of the hub genes in the PPI network was performed using the Kaplan-Meier tool. RESULTS A total of 406 downregulated and 203 upregulated DEGs were identified. The GO analysis results revealed that downregulated DEGs were significantly enriched in angiogenesis, calcium ion binding and cell adhesion. The upregulated DEGs were significantly enriched in the extracellular matrix disassembly, collagen catabolic process, chemokine-mediated signaling pathway and endopeptidase inhibitor activity. The KEGG pathway analysis revealed that downregulated DEGs were enriched in neuroactive ligand-receptor interaction, hematopoietic cell lineage and vascular smooth muscle contraction, while upregulated DEGs were enriched in phototransduction. In addition, the top 10 hub genes and the most closely interacting modules of the top 3 proteins in the PPI network were screened. Finally, the independent prognostic value of each hub gene in LUAD patients was analyzed through the Kaplan-Meier plotter. Seven hub genes (ADCY4, S1PR1, FPR2, PPBP, NMU, PF4, and GCG) were closely correlated to overall survival time. CONCLUSION The discovery of these candidate genes and pathways reveals the etiology and molecular mechanisms of LUAD, providing ideas and guidance for the development of new therapeutic approaches to LUAD.
8
Li X, Choudhary PK, Biswas S, Wang X. A Bayesian latent variable approach to aggregation of partial and top-ranked lists in genomic studies. Stat Med 2018; 37:4266-4278. [PMID: 30094911 DOI: 10.1002/sim.7920] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/02/2018] [Revised: 06/13/2018] [Accepted: 07/03/2018] [Indexed: 12/30/2022]
Abstract
In genomic research, it is becoming increasingly popular to perform meta-analysis, the practice of combining results from multiple studies that target a common essential biological problem. Rank aggregation, a robust meta-analytic approach, consolidates such studies at the rank level. There exists extensive research on this topic, and various methods have been developed in the past. However, these methods have two major limitations when they are applied in the genomic context. First, they are mainly designed to work with full lists, whereas partial and/or top-ranked lists prevail in genomic studies. Second, the component studies are often clustered, and the existing methods fail to utilize such information. To address the above concerns, a Bayesian latent variable approach, called BiG, is proposed to formally deal with partial and top-ranked lists and incorporate the effect of clustering. Various reasonable prior specifications for variance parameters in hierarchical models are carefully studied and compared. Simulation results demonstrate the superior performance of BiG compared with other popular rank aggregation methods under various practical settings. A non-small-cell lung cancer data example is analyzed for illustration.
Affiliation(s)
- Xue Li
  - Department of Statistical Science, Southern Methodist University, Dallas, Texas
- Swati Biswas
  - Department of Mathematical Sciences, University of Texas at Dallas, Richardson, Texas
- Xinlei Wang
  - Department of Statistical Science, Southern Methodist University, Dallas, Texas
9
Frost HR, Amos CI. Gene set selection via LASSO penalized regression (SLPR). Nucleic Acids Res 2017; 45:e114. [PMID: 28472344 PMCID: PMC5499546 DOI: 10.1093/nar/gkx291] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 09/26/2016] [Accepted: 04/12/2017] [Indexed: 01/23/2023] Open
Abstract
Gene set testing is an important bioinformatics technique that addresses the challenges of power, interpretation and replication. To better support the analysis of large and highly overlapping gene set collections, researchers have recently developed a number of multiset methods that jointly evaluate all gene sets in a collection to identify a parsimonious group of functionally independent sets. Unfortunately, current multiset methods all use binary indicators for gene and gene set activity and assume that a gene is active if any containing gene set is active. This simplistic model limits performance on many types of genomic data. To address this limitation, we developed gene set Selection via LASSO Penalized Regression (SLPR), a novel mapping of multiset gene set testing to penalized multiple linear regression. The SLPR method assumes a linear relationship between continuous measures of gene activity and the activity of all gene sets in the collection. As we demonstrate via simulation studies and the analysis of TCGA data using MSigDB gene sets, the SLPR method outperforms existing multiset methods when the true biological process is well approximated by continuous activity measures and a linear association between genes and gene sets.
Affiliation(s)
- H Robert Frost
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
- Christopher I Amos
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
10
Li X, Wang X, Xiao G. A comparative study of rank aggregation methods for partial and top ranked lists in genomic applications. Brief Bioinform 2017; 20:178-189. [PMID: 28968705 PMCID: PMC6357556 DOI: 10.1093/bib/bbx101] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 05/13/2017] [Indexed: 02/05/2023] Open
Abstract
Rank aggregation (RA), the process of combining multiple ranked lists into a single ranking, has played an important role in integrating information from individual genomic studies that address the same biological question. In previous research, attention has been focused on aggregating full lists. However, partial and/or top ranked lists are prevalent because of the great heterogeneity of genomic studies and limited resources for follow-up investigation. To be able to handle such lists, some ad hoc adjustments have been suggested in the past, but how RA methods perform on them (after the adjustments) has never been fully evaluated. In this article, a systematic framework is proposed to define different situations that may occur based on the nature of individually ranked lists. A comprehensive simulation study is conducted to examine the performance characteristics of a collection of existing RA methods that are suitable for genomic applications under various settings simulated to mimic practical situations. A non-small cell lung cancer data example is provided for further comparison. Based on our numerical results, general guidelines about which methods perform the best/worst, and under what conditions, are provided. Also, we discuss key factors that substantially affect the performance of the different methods.
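To make the idea of aggregating top-ranked lists concrete, here is a minimal sketch of one simple and widely known approach, a Borda-style mean-rank score. It is offered only as an illustration and is not claimed to be one of the methods evaluated in the article. The function name `borda_aggregate`, the gene labels, and the treatment of unreported items (tied just below each list's last reported item) are all assumptions made for this example.

```python
from collections import defaultdict


def borda_aggregate(ranked_lists, universe_size=None):
    """Aggregate top-ranked lists with a Borda-style mean-rank score.

    Each input list ranks only the items it reports, best first.
    Assumption for this sketch: items missing from a list are tied
    just below its last reported item and share the average of the
    remaining rank positions.
    """
    items = set()
    for lst in ranked_lists:
        items.update(lst)
    # Size of the item universe; defaults to the union of all lists.
    n = universe_size if universe_size is not None else len(items)

    total_rank = defaultdict(float)
    for lst in ranked_lists:
        k = len(lst)
        # Reported items receive ranks 1..k.
        for pos, item in enumerate(lst, start=1):
            total_rank[item] += pos
        # Unreported items tie for positions k+1..n, i.e. rank (k+1+n)/2.
        if k < n:
            tied_rank = (k + 1 + n) / 2
            for item in items - set(lst):
                total_rank[item] += tied_rank

    mean_rank = {g: total_rank[g] / len(ranked_lists) for g in items}
    # Lower mean rank is better; ties are broken alphabetically.
    return sorted(items, key=lambda g: (mean_rank[g], g))


# Hypothetical gene labels, three studies reporting top lists of different lengths.
studies = [["G1", "G2", "G3"], ["G2", "G1"], ["G2", "G4", "G1"]]
consensus = borda_aggregate(studies)
print(consensus)
```

How missing items are scored is exactly the kind of ad hoc adjustment the abstract refers to: changing the tied rank assigned to unreported items can change the consensus order, which is why the behaviour of such adjustments needs systematic evaluation.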
Affiliation(s)
- Xue Li
  - Department of Statistical Science at Southern Methodist University, Dallas, TX
- Xinlei Wang
  - Department of Statistical Science at Southern Methodist University, Dallas, TX. Corresponding author: Xinlei Wang, Department of Statistical Science, Southern Methodist University, 3225 Daniel Avenue, P O Box 750332, Dallas, Texas 75275, USA. Tel: 214-768-2459; Fax: (214) 768-4035
- Guanghua Xiao
  - Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, TX
11
Gundersen GW, Jagodnik KM, Woodland H, Fernandez NF, Sani K, Dohlman AB, Ung PMU, Monteiro CD, Schlessinger A, Ma'ayan A. GEN3VA: aggregation and analysis of gene expression signatures from related studies. BMC Bioinformatics 2016; 17:461. [PMID: 27846806 PMCID: PMC5111283 DOI: 10.1186/s12859-016-1321-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 07/15/2016] [Accepted: 11/04/2016] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Genome-wide gene expression profiling of mammalian cells is becoming a staple of many published biomedical and biological research studies. Such data is deposited into data repositories such as the Gene Expression Omnibus (GEO) for potential reuse. However, these repositories currently do not provide simple interfaces to systematically analyze collections of related studies. RESULTS Here we present GENE Expression and Enrichment Vector Analyzer (GEN3VA), a web-based system that enables the integrative analysis of aggregated collections of tagged gene expression signatures identified and extracted from GEO. Each tagged collection of signatures is presented in a report that consists of heatmaps of the differentially expressed genes; principal component analysis of all signatures; enrichment analysis with several gene set libraries across all signatures, which we term enrichment vector analysis; and global mapping of small molecules that are predicted to reverse or mimic each signature in the aggregate. We demonstrate how GEN3VA can be used to identify common molecular mechanisms of aging by analyzing tagged signatures from 244 studies that compared young vs. old tissues in mammalian systems. In a second case study, we collected 86 signatures from treatment of human cells with dexamethasone, a glucocorticoid receptor (GR) agonist. Our analysis confirms consensus GR target genes and predicts potential drug mimickers. CONCLUSIONS GEN3VA can be used to identify, aggregate, and analyze themed collections of gene expression signatures from diverse but related studies. Such integrative analyses can be used to address concerns about data reproducibility, confirm results across labs, and discover new collective knowledge by data reuse. GEN3VA is an open-source web-based system that is freely available at: http://amp.pharm.mssm.edu/gen3va .
Affiliation(s)
- Gregory W Gundersen
  - Department of Pharmacological Sciences, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA; Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Kathleen M Jagodnik
  - Fluid Physics and Transport Processes Branch, NASA Glenn Research Center, 21000 Brookpark Rd, Cleveland, OH, 44135, USA; Center for Space Medicine, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
- Holly Woodland
  - Daylesford, The Fairway, Weybridge, Surrey, KT13 0RZ, UK
- Nicholas F Fernandez
  - Department of Pharmacological Sciences, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA; Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Kevin Sani
  - Department of Pharmacological Sciences, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA; Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Anders B Dohlman
  - Department of Pharmacological Sciences, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA; Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Peter Man-Un Ung
  - Department of Pharmacological Sciences, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Caroline D Monteiro
  - Department of Pharmacological Sciences, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA; Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Avner Schlessinger
  - Department of Pharmacological Sciences, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Avi Ma'ayan
  - Department of Pharmacological Sciences, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA; Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
12
Wang J, Wan X, Gao Y, Zhong M, Sha L, Liu B, Zhang W, Tian L, Ruan W, Cao S, Huang M. Latcripin-13 domain induces apoptosis and cell cycle arrest at the G1 phase in human lung carcinoma A549 cells. Oncol Rep 2016; 36:441-7. [PMID: 27221765 DOI: 10.3892/or.2016.4830] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 01/05/2016] [Accepted: 02/15/2016] [Indexed: 11/05/2022] Open
Abstract
Latcripin-13 domain, isolated from the transcriptome of Lentinula edodes C91-3, contains a regulator of chromosome condensation (RCC1) domain/β-lactamase-inhibitor protein II (BLIP-II) and a plant homeodomain (PHD). Latcripin-13 domain has been shown to have antitumor effects. However, the underlying molecular pharmacology is largely unknown. We report here that Latcripin-13 domain induced cell cycle arrest in the G1 phase and caused the apoptosis of human lung carcinoma A549 cells via the GSK3β-cyclin D1 and caspase-8/NF-κB signaling pathways. Western blot analysis showed that Latcripin-13 domain decreased cyclin D1 and cyclin-dependent kinase 4 (CDK4), while it increased the ratio of GSK3β/phosphorylated GSK3β. Importantly, Latcripin-13 domain induced nuclear fragmentation and chromatin condensation in the A549 cells. In addition, treatment of the A549 cells with Latcripin-13 domain resulted in the loss of mitochondrial membrane potential, accompanied by an increase in the Bax/Bcl-2 ratio and activation of caspase-3, -8, and -9. Intriguingly, western blot analysis revealed that NF-κB was significantly downregulated by Latcripin-13 domain. These results demonstrated that Latcripin-13 domain induced apoptosis and cell cycle arrest at G1 phase in the A549 cells, providing a mechanism for the antitumor effects of Latcripin-13 domain.
Affiliation(s)
- Jia Wang
- Department of Critical Care Medicine, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning 116021, P.R. China
| | - Xianyao Wan
- Department of Critical Care Medicine, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning 116021, P.R. China
| | - Yifan Gao
- Department of Microbiology, Dalian Medical University, Dalian, Liaoning 116044, P.R. China
| | - Mintao Zhong
- Department of Microbiology, Dalian Medical University, Dalian, Liaoning 116044, P.R. China
| | - Li Sha
- Department of Microbiology, Dalian Medical University, Dalian, Liaoning 116044, P.R. China
| | - Ben Liu
- Department of Microbiology, Dalian Medical University, Dalian, Liaoning 116044, P.R. China
| | - Wei Zhang
- Department of Microbiology, Dalian Medical University, Dalian, Liaoning 116044, P.R. China
| | - Li Tian
- Department of Microbiology, Dalian Medical University, Dalian, Liaoning 116044, P.R. China
| | - Wenjing Ruan
- Department of Microbiology, Dalian Medical University, Dalian, Liaoning 116044, P.R. China
| | - Shuyun Cao
- Department of Microbiology, Dalian Medical University, Dalian, Liaoning 116044, P.R. China
| | - Min Huang
- Department of Microbiology, Dalian Medical University, Dalian, Liaoning 116044, P.R. China
| |
|
13
|
Zhang P, Zhang Y, Yang H, Li W, Chen X, Long F. Association between EPHX1 rs1051740 and lung cancer susceptibility: a meta-analysis. Int J Clin Exp Med 2015; 8:17941-17949. [PMID: 26770388 PMCID: PMC4694288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2015] [Accepted: 09/06/2015] [Indexed: 06/05/2023]
Abstract
BACKGROUND Microsomal epoxide hydrolase 1 (EPHX1) may play an important role in epigenetic changes and DNA repair related to lung cancer. Several studies have investigated the association between EPHX1 rs1051740 and lung cancer risk, but no consensus has been reached. We therefore performed a meta-analysis to clarify the relationship. METHODS The PubMed and Embase databases were searched for eligible studies. Odds ratios (ORs) with 95% confidence intervals (CIs) were used to assess the association between the EPHX1 rs1051740 polymorphism and lung cancer risk. RESULTS Overall, no significant association was found between EPHX1 rs1051740 and lung cancer risk (CC vs. TT: OR=1.10, 95% CI=0.88-1.36; CC+CT vs. TT: OR=1.02, 95% CI=0.88-1.18; CC vs. TT+CT: OR=1.08, 95% CI=0.91-1.27; C vs. T: OR=1.04, 95% CI=0.93-1.17; CT vs. TT: OR=0.98, 95% CI=0.85-1.13). However, subgroup analysis by ethnicity showed that the CC genotype or C allele of EPHX1 rs1051740 was associated with an increased risk of lung cancer in Asians (CC vs. TT: OR=1.54, 95% CI=1.23-1.94; CC vs. TT+CT: OR=1.43, 95% CI=1.20-1.71; C vs. T: OR=1.26, 95% CI=1.08-1.47). CONCLUSIONS This meta-analysis indicates that the CC genotype or C allele of EPHX1 rs1051740 may be a risk factor for lung cancer in Asians.
Affiliation(s)
- Peng Zhang
- Department of Respiratory Medicine, Huashan Hospital North, Fudan University, Shanghai 201907, China
| | - Youzhi Zhang
- Department of Respiratory Medicine, Huashan Hospital North, Fudan University, Shanghai 201907, China
| | - Haihua Yang
- Department of Respiratory Medicine, Huashan Hospital North, Fudan University, Shanghai 201907, China
| | - Wenjing Li
- Department of Respiratory Medicine, Huashan Hospital North, Fudan University, Shanghai 201907, China
| | - Xiaodong Chen
- Department of Respiratory Medicine, Huashan Hospital North, Fudan University, Shanghai 201907, China
| | - Feng Long
- Department of Respiratory Medicine, Huashan Hospital North, Fudan University, Shanghai 201907, China
| |
|