1
Able H, Wolf-Ringwall A, Rendahl A, Ober CP, Seelig DM, Wilke CT, Lawrence J. Computed tomography radiomic features hold prognostic utility for canine lung tumors: An analytical study. PLoS One 2021; 16:e0256139. [PMID: 34403435 PMCID: PMC8370631 DOI: 10.1371/journal.pone.0256139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2021] [Accepted: 07/29/2021] [Indexed: 12/02/2022]
Abstract
Quantitative analysis of computed tomography (CT) radiomic features is an indirect measure of tumor heterogeneity, which has been associated with prognosis in human lung carcinoma. Canine lung tumors share similar features to human lung tumors and serve as a model in which to investigate the utility of radiomic features in differentiating tumor type and prognostication. The purpose of this study was to correlate first-order radiomic features from canine pulmonary tumors to histopathologic characteristics and outcome. Disease-free survival, overall survival time and tumor-specific survival were calculated as days from the date of CT scan. Sixty-seven tumors from 65 dogs were evaluated. Fifty-six tumors were classified as primary pulmonary adenocarcinomas and 11 were non-adenocarcinomas. All dogs were treated with surgical resection; 14 dogs received adjuvant chemotherapy. Second opinion histopathology in 63 tumors confirmed the histologic diagnosis in all dogs and further characterized 53 adenocarcinomas. The median overall survival time was longer (p = 0.004) for adenocarcinomas (339d) compared to non-adenocarcinomas (55d). There was wide variation in first-order radiomic statistics across tumors. Mean Hounsfield units (HU) ratio (p = 0.042) and median mean HU ratio (p = 0.042) were higher in adenocarcinomas than in non-adenocarcinomas. For dogs with adenocarcinoma, completeness of excision was associated with overall survival (p<0.001) while higher mitotic index (p = 0.007) and histologic score (p = 0.037) were associated with shorter disease-free survival. CT-derived tumor variables prognostic for outcome included volume, maximum axial diameter, and four radiomic features: integral total, integral total mean ratio, total HU, and max mean HU ratio. Tumor volume was also significantly associated with tumor invasion (p = 0.044). 
Further study of radiomic features in canine lung tumors is warranted as a method to non-invasively interrogate CT images for potential predictive and prognostic utility.
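As a rough illustration of what first-order radiomic statistics look like in practice, the sketch below summarizes the Hounsfield unit (HU) values of the voxels inside one segmented tumor. The function name, the feature names, and the reading of "total HU" as a plain sum are assumptions made for this example; the ratio features named in the abstract are not defined there and are not reproduced.

```python
from statistics import mean, median


def first_order_features(hu_values):
    """Summarize the HU values of the voxels inside one tumor region.

    Hypothetical helper: the feature names are illustrative and are not
    the study's definitions.
    """
    values = list(hu_values)
    if not values:
        raise ValueError("tumor region contains no voxels")
    return {
        "mean_hu": mean(values),
        "median_hu": median(values),
        "max_hu": max(values),
        "min_hu": min(values),
        "total_hu": sum(values),  # assumption: "total" read as a plain sum
    }


# Made-up HU values for a tiny region of interest.
features = first_order_features([-20, 35, 40, 55, 60])
```

First-order features like these ignore where voxels sit in space; they describe only the distribution of intensities within the region.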
Affiliation(s)
- Hannah Able
  - Department of Veterinary Clinical Sciences, College of Veterinary Medicine, University of Minnesota, Saint Paul, Minnesota, United States of America
  - * E-mail: (HA); (JL)
- Amber Wolf-Ringwall
  - Department of Veterinary Clinical Sciences, College of Veterinary Medicine, University of Minnesota, Saint Paul, Minnesota, United States of America
  - Masonic Cancer Center, University of Minnesota, Minneapolis, Minnesota, United States of America
- Aaron Rendahl
  - Department of Veterinary and Biomedical Sciences, College of Veterinary Medicine, University of Minnesota, Minneapolis, Minnesota, United States of America
- Christopher P. Ober
  - Department of Veterinary Clinical Sciences, College of Veterinary Medicine, University of Minnesota, Saint Paul, Minnesota, United States of America
- Davis M. Seelig
  - Department of Veterinary Clinical Sciences, College of Veterinary Medicine, University of Minnesota, Saint Paul, Minnesota, United States of America
  - Masonic Cancer Center, University of Minnesota, Minneapolis, Minnesota, United States of America
- Chris T. Wilke
  - Masonic Cancer Center, University of Minnesota, Minneapolis, Minnesota, United States of America
  - Department of Radiation Oncology, Medical School, University of Minnesota, Minneapolis, Minnesota, United States of America
- Jessica Lawrence
  - Department of Veterinary Clinical Sciences, College of Veterinary Medicine, University of Minnesota, Saint Paul, Minnesota, United States of America
  - Masonic Cancer Center, University of Minnesota, Minneapolis, Minnesota, United States of America
  - * E-mail: (HA); (JL)
2
Lung adenocarcinoma and lung squamous cell carcinoma cancer classification, biomarker identification, and gene expression analysis using overlapping feature selection methods. Sci Rep 2021; 11:13323. [PMID: 34172784 PMCID: PMC8233431 DOI: 10.1038/s41598-021-92725-8] [Citation(s) in RCA: 54] [Impact Index Per Article: 18.0] [Received: 01/19/2021] [Accepted: 06/14/2021] [Indexed: 02/06/2023]
Abstract
Lung cancer is one of the deadliest cancers in the world. Two of the most common subtypes, lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), have drastically different biological signatures, yet they are often treated similarly and classified together as non-small cell lung cancer (NSCLC). LUAD and LUSC biomarkers are scarce, and their distinct biological mechanisms have yet to be elucidated. To detect biologically relevant markers, many studies have attempted to improve traditional machine learning algorithms or develop novel algorithms for biomarker discovery. However, few have used overlapping machine learning or feature selection methods for cancer classification, biomarker identification, or gene expression analysis. This study proposes to use overlapping traditional feature selection or feature reduction techniques for cancer classification and biomarker discovery. The genes selected by the overlapping method were then verified using random forest. The classification statistics of the overlapping method were compared to those of the traditional feature selection methods. The identified biomarkers were validated in an external dataset using AUC and ROC analysis. Gene expression analysis was then performed to further investigate biological differences between LUAD and LUSC. Overall, our method achieved classification results comparable to, if not better than, the traditional algorithms. It also identified multiple known biomarkers, and five potentially novel biomarkers with high discriminating values between LUAD and LUSC. Many of the biomarkers also exhibit significant prognostic potential, particularly in LUAD. Our study also unraveled distinct biological pathways between LUAD and LUSC.
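The idea of overlapping several feature selection methods can be sketched as keeping only the genes that enough methods agree on. This is a minimal illustration, not the study's pipeline: the function name, the `min_methods` parameter, and the placeholder gene identifiers are all hypothetical.

```python
def overlapping_features(selections, min_methods=None):
    """Return features chosen by at least `min_methods` selection methods.

    `selections` maps a method name to the list of features it selected.
    With the default, a feature must be chosen by every method.
    """
    if min_methods is None:
        min_methods = len(selections)  # strict intersection
    counts = {}
    for selected in selections.values():
        for gene in set(selected):  # count each method at most once per gene
            counts[gene] = counts.get(gene, 0) + 1
    return sorted(g for g, c in counts.items() if c >= min_methods)


# Placeholder outputs of three hypothetical selection methods.
selections = {
    "method_a": ["g1", "g2", "g3", "g4"],
    "method_b": ["g2", "g3", "g5"],
    "method_c": ["g3", "g4", "g6"],
}
strict = overlapping_features(selections)
relaxed = overlapping_features(selections, min_methods=2)
```

The strict intersection favors a short, high-agreement list; relaxing the threshold trades agreement for coverage.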
3
Tian S, Wang C, Suarez-Farinas M. GEE-TGDR: A Longitudinal Feature Selection Algorithm and Its Application to lncRNA Expression Profiles for Psoriasis Patients Treated with Immune Therapies. Biomed Res Int 2021; 2021:8862895. [PMID: 33928163 PMCID: PMC8053058 DOI: 10.1155/2021/8862895] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/23/2020] [Revised: 03/06/2021] [Accepted: 03/29/2021] [Indexed: 01/06/2023]
Abstract
With the fast evolution of high-throughput technology, longitudinal gene expression experiments have become affordable and increasingly common in biomedical fields. The generalized estimating equation (GEE) approach is a widely used statistical method for the analysis of longitudinal data. Feature selection is imperative in longitudinal omics data analysis. Among a variety of existing feature selection methods, an embedded method, threshold gradient descent regularization (TGDR), stands out due to its excellent characteristics. Combining GEE with TGDR is a promising approach for identifying relevant markers that can explain the dynamic changes of outcomes across time. We propose a novel feature selection algorithm for longitudinal outcomes, GEE-TGDR. In the GEE-TGDR method, the quasilikelihood function of a GEE model is the objective function to be optimized, and the optimization and feature selection are accomplished by the TGDR method. Long noncoding RNAs (lncRNAs) are posttranscriptional and epigenetic regulators; they have lower expression levels and are more tissue-specific than protein-coding genes. So far, the role of lncRNAs in psoriasis remains largely unexplored and poorly understood, even though some evidence in the literature supports that lncRNAs and psoriasis are highly associated. In this study, we applied the GEE-TGDR method to a lncRNA expression dataset that examined the response of psoriasis patients to immune treatments. As a result, a list of 10 relevant lncRNAs was identified with a predictive accuracy of 70%, which is superior to the accuracies achieved by two competing methods, and with meaningful biological interpretation. A widespread application of the GEE-TGDR method in omics longitudinal data analysis is anticipated.
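To make the thresholding idea concrete, here is a minimal sketch of a threshold gradient descent update. It deliberately swaps in a plain squared-error loss instead of the GEE quasilikelihood used by GEE-TGDR, so it shows only the selection mechanism; the function name, parameter names, and toy data are assumptions for illustration.

```python
def tgdr(X, y, tau=0.9, step=0.01, n_iter=200):
    """Threshold gradient descent on a squared-error loss (illustrative only).

    At each iteration, only coefficients whose gradient magnitude is at
    least `tau` times the largest gradient magnitude are updated, so
    weakly supported features tend to stay at exactly zero.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        residuals = [
            y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)
        ]
        # Negative gradient of the (halved) mean squared error per coefficient.
        grad = [sum(X[i][j] * residuals[i] for i in range(n)) / n for j in range(p)]
        largest = max(abs(g) for g in grad)
        if largest == 0.0:
            break  # already at a stationary point
        for j in range(p):
            if abs(grad[j]) >= tau * largest:
                beta[j] += step * grad[j]
    return beta


# Toy data: the outcome depends only on the first of two orthogonal features.
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [2, -2, 2, -2]
beta = tgdr(X, y, tau=0.9, step=0.1, n_iter=200)
```

A threshold `tau` near 1 updates only the steepest directions and yields sparse solutions, while `tau = 0` updates every coefficient and reduces to ordinary gradient descent.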
Affiliation(s)
- Suyan Tian
  - Division of Clinical Research, First Hospital of Jilin University, Changchun, Jilin, China 130021
- Chi Wang
  - Department of Internal Medicine, College of Medicine, University of Kentucky, 800 Rose St., Lexington, KY 40536, USA
  - Markey Cancer Center, University of Kentucky, 800 Rose St., Lexington, KY 40536, USA
- Mayte Suarez-Farinas
  - Department of Population Health Science & Policy, The Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
  - Department of Genetics and Genomics, The Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
4
Whole transcriptome signature for prognostic prediction (WTSPP): application of whole transcriptome signature for prognostic prediction in cancer. Lab Invest 2020; 100:1356-1366. [PMID: 32144347 PMCID: PMC7483260 DOI: 10.1038/s41374-020-0413-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/24/2019] [Revised: 02/20/2020] [Accepted: 02/21/2020] [Indexed: 11/08/2022]
Abstract
Developing prognostic biomarkers for specific cancer types that accurately predict patient survival is increasingly important in clinical research and practice. Despite the enormous potential of prognostic signatures, proposed models have found limited implementation in routine clinical practice. Herein, we propose a generic, RNA sequencing platform-independent statistical framework, named whole transcriptome signature for prognostic prediction, to generate prognostic gene signatures. Using ovarian cancer and lung adenocarcinoma as examples, we provide evidence that our prognostic signatures outperform previously reported signatures, capture prognostic features not explained by clinical variables, and expose biologically relevant prognostic pathways, including those involved in the immune system and cell cycle. Our approach demonstrates a robust method for developing prognostic gene expression signatures. In conclusion, our statistical framework can be generally applied to all cancer types for prognostic prediction and might be extended to other human diseases. The proposed method is implemented as an R package (PanCancerSig) and is freely available on GitHub (https://github.com/Cheng-Lab-GitHub/PanCancer_Signature).
5
Pak K, Oh SO, Goh TS, Heo HJ, Han ME, Jeong DC, Lee CS, Sun H, Kang J, Choi S, Lee S, Kwon EJ, Kang JW, Kim YH. A User-Friendly, Web-Based Integrative Tool (ESurv) for Survival Analysis: Development and Validation Study. J Med Internet Res 2020; 22:e16084. [PMID: 32369034 PMCID: PMC7238095 DOI: 10.2196/16084] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 09/02/2019] [Revised: 11/13/2019] [Accepted: 03/25/2020] [Indexed: 12/20/2022]
Abstract
BACKGROUND Prognostic genes or gene signatures have been widely used to predict patient survival and aid in making decisions pertaining to therapeutic actions. Although some web-based survival analysis tools have been developed, they have several limitations. OBJECTIVE Taking these limitations into account, we developed ESurv (Easy, Effective, and Excellent Survival analysis tool), a web-based tool that can perform advanced survival analyses using user-derived data or data from The Cancer Genome Atlas (TCGA). Users can conduct univariate analyses and grouped variable selections using multiomics data from TCGA. METHODS We used R to code survival analyses based on multiomics data from TCGA. To perform these analyses, we excluded patients and genes that had insufficient information. Clinical variables were classified as 0 and 1 when there were two categories (for example, chemotherapy: no or yes), and dummy variables were used where features had 3 or more outcomes (for example, with respect to laterality: right, left, or bilateral). RESULTS Through univariate analyses, ESurv can identify the prognostic significance for single genes using the survival curve (median or optimal cutoff), area under the curve (AUC) with C statistics, and receiver operating characteristics (ROC). Users can obtain prognostic variable signatures based on multiomics data from clinical variables or grouped variable selections (lasso, elastic net regularization, and network-regularized high-dimensional Cox-regression) and select the same outputs as above. In addition, users can create custom gene signatures for specific cancers using various genes of interest. One of the most important functions of ESurv is that users can perform all survival analyses using their own data. 
CONCLUSIONS Using advanced statistical techniques suitable for high-dimensional data, including genetic data, and integrated survival analysis, ESurv overcomes the limitations of previous web-based tools and will help biomedical researchers easily perform complex survival analyses.
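The coding of clinical variables described in the METHODS can be illustrated with a small sketch. ESurv itself is written in R; this Python version is only an illustration, and the function name, the choice to drop a reference category, and the example values are assumptions not stated in the abstract.

```python
def encode_clinical_variable(values, reference=None):
    """Encode a categorical clinical variable as 0/1 indicator columns.

    A two-category variable becomes a single 0/1 column; a variable with
    three or more categories becomes one dummy column per non-reference
    category. Dropping a reference level is an assumption of this sketch.
    """
    categories = sorted(set(values))
    if len(categories) < 2:
        raise ValueError("need at least two categories")
    if reference is None:
        reference = categories[0]
    if reference not in categories:
        raise ValueError("reference category not present in values")
    return {
        c: [int(v == c) for v in values] for c in categories if c != reference
    }


# Two categories (chemotherapy: no or yes) give one 0/1 column.
chemo = encode_clinical_variable(["no", "yes", "yes", "no"])
# Three categories (laterality) give two dummy columns.
laterality = encode_clinical_variable(
    ["right", "left", "bilateral", "left"], reference="right"
)
```

Leaving out one reference category avoids perfectly collinear columns when the encoded variable enters a regression model such as a Cox model.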
Affiliation(s)
- Kyoungjune Pak
  - Department of Nuclear Medicine, Pusan National University Hospital, Busan, Republic of Korea
- Sae-Ock Oh
  - Department of Anatomy, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Tae Sik Goh
  - Department of Orthopaedic Surgery, Pusan National University Hospital, Busan, Republic of Korea
- Hye Jin Heo
  - Department of Anatomy, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Myoung-Eun Han
  - Department of Anatomy, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Dae Cheon Jeong
  - Deloitte Analytics Group, Deloitte Consulting LLC, Seoul, Republic of Korea
- Chi-Seung Lee
  - Biomedical Research Institute, Pusan National University Hospital, Busan, Republic of Korea
  - Department of Biomedical Engineering, School of Medicine, Pusan National University, Busan, Republic of Korea
- Hokeun Sun
  - Department of Statistics, Pusan National University, Busan, Republic of Korea
- Junho Kang
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Suji Choi
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Soohwan Lee
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Eun Jung Kwon
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Ji Wan Kang
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Yun Hak Kim
  - Department of Anatomy, School of Medicine, Pusan National University, Yangsan, Republic of Korea
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
6
Dong YM, Qin LD, Tong YF, He QE, Wang L, Song K. Multiple genome pattern analysis and signature gene identification for the Caucasian lung adenocarcinoma patients with different tobacco exposure patterns. PeerJ 2020; 8:e8349. [PMID: 32030321 PMCID: PMC6995662 DOI: 10.7717/peerj.8349] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/05/2019] [Accepted: 12/04/2019] [Indexed: 11/20/2022]
Abstract
Background When considering therapies for lung adenocarcinoma (LUAD) patients, the carcinogenic mechanisms of smokers are believed to differ from those who have never smoked. The rising trend in the proportion of nonsmokers in LUAD urgently requires the understanding of such differences at a molecular level for the development of precision medicine. Methods Three independent LUAD tumor sample sets—TCGA, SPORE and EDRN—were used. Genome patterns of expression (GE), copy number variation (CNV) and methylation (ME) were reviewed to discover the differences between them for both smokers and nonsmokers. Tobacco-related signature genes distinguishing these two groups of LUAD were identified using the GE, ME and CNV values of the whole genome. To do this, a novel iterative multi-step selection method based on the partial least squares (PLS) algorithm was proposed to overcome the high variable dimension and high noise inherent in the data. This method can thoroughly evaluate the importance of genes according to their statistical differences, biological functions and contributions to the tobacco exposure classification model. The kernel partial least squares (KPLS) method was used to further optimize the accuracies of the classification models. Results Forty-three, forty-eight and seventy-five genes were identified as GE, ME and CNV signatures, respectively, to distinguish smokers from nonsmokers. Using only the gene expression values of these 43 GE signature genes, ME values of the 48 ME signature genes or copy numbers of the 75 CNV signature genes, the accuracies of TCGA training and SPORE/EDRN independent validation datasets all exceed 76%. More importantly, the focal amplicon in Telomerase Reverse Transcriptase in nonsmokers, the broad deletion in ChrY in male nonsmokers and the greater amplification of MDM2 in female nonsmokers may explain why nonsmokers of both genders tend to suffer LUAD. 
These pattern analysis results may have clear biological interpretation in the molecular mechanism of tumorigenesis. Meanwhile, the identified signature genes may serve as potential drug targets for the precision medicine of LUAD.
Affiliation(s)
- Yan-mei Dong
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- Li-da Qin
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- Yi-fan Tong
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- Qi-en He
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- Ling Wang
  - The First Affiliated Hospital Oncology, Dalian Medical University, Dalian, Liaoning, China
- Kai Song
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
7
E L, Lu L, Li L, Yang H, Schwartz LH, Zhao B. Radiomics for Classification of Lung Cancer Histological Subtypes Based on Nonenhanced Computed Tomography. Acad Radiol 2019; 26:1245-1252. [PMID: 30502076 DOI: 10.1016/j.acra.2018.10.013] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 08/16/2018] [Revised: 09/27/2018] [Accepted: 10/04/2018] [Indexed: 12/23/2022]
Abstract
OBJECTIVES To evaluate the performance of a radiomics method for classifying lung cancer histological subtypes based on nonenhanced computed tomography images. MATERIALS AND METHODS A total of 278 patients with pathologically confirmed lung cancer were included: 181 with non-small cell lung cancer (NSCLC) and 97 with small cell lung cancer (SCLC). Among the NSCLC patients, 88 had adenocarcinoma (AD) and 93 had squamous cell carcinoma (SCC). In total, 1695 quantitative radiomic features (QRFs) were calculated from the primary lung tumor in each patient. To build a radiomic classification model based on the extracted QRFs, several machine-learning algorithms were applied sequentially. First, unsupervised hierarchical clustering was used to exclude highly correlated QRFs; second, the minimum Redundancy Maximum Relevance feature selection algorithm was employed to select informative and nonredundant QRFs; finally, the Incremental Forward Search and Support Vector Machine classification algorithms were used to combine the selected QRFs and build the model. To study the phenotypic differences among lung cancer histological subtypes, four classification models were built: SCLC vs NSCLC, SCLC vs AD, SCLC vs SCC, and AD vs SCC. The performance of the classification models was evaluated by the area under the receiver operating characteristic curve (AUC), estimated by three-fold cross-validation. RESULTS The AUC (95% confidence interval) for the model of SCLC vs NSCLC was 0.741 (0.678, 0.795). For the models of SCLC vs AD and SCLC vs SCC, the AUCs were 0.822 (0.755, 0.875) and 0.665 (0.583, 0.738), respectively. The AUC for the model of AD vs SCC was 0.655 (0.570, 0.731). Several QRFs ("Law_15," "LoG_Uniformity," "GLCM_Contrast," and "Compactness Factor") that characterize tumor heterogeneity and shape were selected as the significant features to build the models.
CONCLUSION Our results show that phenotypic differences exist among different lung cancer histological subtypes on nonenhanced computed tomography images.
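The AUC used to score each classification model can be computed directly from predicted scores and true labels. The sketch below uses the pairwise (Mann-Whitney) formulation and is only illustrative: the function name and the example scores are made up, and in a cross-validation setting it would be applied to scores from held-out folds.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up classifier scores for three positive and three negative cases.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported values between roughly 0.66 and 0.82.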
Affiliation(s)
- Linning E
  - Department of Radiology, Shanxi DAYI Hospital, Taiyuan, Shanxi, China
- Lin Lu
  - Department of Radiology, Columbia University Medical Center, 630 West 168th Street, New York, NY 10032, USA
- Li Li
  - Department of Pathology, Shanxi DAYI Hospital, Taiyuan, Shanxi, China
- Hao Yang
  - Department of Radiology, Columbia University Medical Center, 630 West 168th Street, New York, NY 10032, USA
- Lawrence H Schwartz
  - Department of Radiology, Columbia University Medical Center, 630 West 168th Street, New York, NY 10032, USA
- Binsheng Zhao
  - Department of Radiology, Columbia University Medical Center, 630 West 168th Street, New York, NY 10032, USA
8
Liu C, Wang L, Wang T, Tian S. Construction of subtype-specific prognostic gene signatures for early-stage non-small cell lung cancer using meta feature selection methods. Oncol Lett 2019; 18:2366-2375. [PMID: 31402939 PMCID: PMC6676737 DOI: 10.3892/ol.2019.10563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2018] [Accepted: 06/05/2019] [Indexed: 11/06/2022]
Abstract
Feature selection in the framework of meta-analyses (meta feature selection) combines meta-analysis with a feature selection process and thus allows feature selection across multiple datasets. In the present study, a meta feature selection procedure was proposed that fits a multiple Cox regression model to estimate the effect size of a gene in individual studies and then identifies the overall effect of the gene using a meta-analysis model. The method was used to identify prognostic gene signatures for lung adenocarcinoma and lung squamous cell carcinoma. Furthermore, redundant gene elimination (RGE) is of crucial importance during feature selection, and is also essential for a meta feature selection process. The current study demonstrated that the proposed meta feature selection procedure with RGE outperforms that without RGE in terms of predictive ability, model parsimony and biological interpretation.
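One simple way to combine per-study effect sizes into an overall effect is fixed-effect inverse-variance pooling, sketched below. The abstract does not say which meta-analysis model was used, so the pooling rule, the function name, and the example numbers (imagined per-study log hazard ratios and standard errors for one gene) are all assumptions for illustration.

```python
import math


def pooled_effect(estimates, std_errors):
    """Fixed-effect inverse-variance pooling of per-study estimates.

    Returns the pooled estimate, its standard error, and the z statistic.
    """
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se, pooled / pooled_se


# Imagined log hazard ratios for one gene estimated in two studies.
pooled, pooled_se, z = pooled_effect([0.5, 0.3], [0.1, 0.2])
```

Studies with smaller standard errors receive larger weights, so the pooled estimate sits closer to the more precise study.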
Affiliation(s)
- Chunshui Liu
  - Department of Hematology, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Linlin Wang
  - Department of Ultrasound, China-Japan Union Hospital of Jilin University, Changchun, Jilin 130033, P.R. China
- Tianjiao Wang
  - The State Key Laboratory of Special Economic Animal Molecular Biology, Institute of Special Wild Economic Animal and Plant Science, Chinese Academy Agricultural Science, Changchun, Jilin 130133, P.R. China
- Suyan Tian
  - Division of Clinical Research, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
9
E L, Lu L, Li L, Yang H, Schwartz LH, Zhao B. Radiomics for Classifying Histological Subtypes of Lung Cancer Based on Multiphasic Contrast-Enhanced Computed Tomography. J Comput Assist Tomogr 2019; 43:300-306. [PMID: 30664116 DOI: 10.1097/rct.0000000000000836] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Indexed: 01/08/2023]
Abstract
OBJECTIVES The aim of this study was to evaluate the performance of the radiomics method in classifying lung cancer histological subtypes based on multiphasic contrast-enhanced computed tomography (CT) images. METHODS A total of 229 patients with pathologically confirmed lung cancer were retrospectively recruited. All recruited patients underwent nonenhanced and dual-phase chest contrast-enhanced CT; 1160 quantitative radiomics features were calculated to build a radiomics classification model. The performance of the classification models was evaluated by the receiver operating characteristic curve. RESULTS The areas under the curve of radiomics models in classifying adenocarcinoma and squamous cell carcinoma, adenocarcinoma and small cell lung cancer, and squamous cell carcinoma and small cell lung cancer were 0.801, 0.857, and 0.657 (nonenhanced); 0.834, 0.855, and 0.619 (arterial phase); and 0.864, 0.864, and 0.664 (venous phase), respectively. Moreover, the application of contrast-enhanced CT may affect the selection of radiomics features. CONCLUSIONS Our study indicates that radiomics may be a promising tool for noninvasively predicting histological subtypes of lung cancer based on multiphasic contrast-enhanced CT images.
Affiliation(s)
- Lin Lu
  - Department of Radiology, Columbia University Medical Center, New York, NY
- Li Li
  - Department of Pathology, Shanxi DAYI Hospital, Taiyuan, Shanxi, China
- Hao Yang
  - Department of Radiology, Columbia University Medical Center, New York, NY
- Binsheng Zhao
  - Department of Radiology, Columbia University Medical Center, New York, NY
10
Tian S. Identification of subtype-specific prognostic signatures using Cox models with redundant gene elimination. Oncol Lett 2018; 15:8545-8555. [PMID: 29805591 PMCID: PMC5950526 DOI: 10.3892/ol.2018.8418] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/29/2017] [Accepted: 03/02/2018] [Indexed: 12/14/2022]
Abstract
Lung cancer (LC) is a leading cause of cancer-associated mortalities worldwide. Adenocarcinoma (AC) and squamous cell carcinoma (SCC) account for ~70% of all cases of LC. Since AC and SCC are two distinct diseases, their corresponding prognostic genes associated with patient survival time are expected to be different. To date, only a few studies have distinguished patients with good prognosis from those with poor prognosis for each specific subtype. In the present study, the Cox filter model, a feature selection algorithm that identifies subtype-specific prognostic genes to incorporate pathway information and eliminate redundant genes, was adopted. By applying the proposed model to data on non-small cell lung cancer (NSCLC), it was demonstrated that both redundant gene elimination and search space restriction can improve the predictive capacity and the model stability of resulting prognostic gene signatures. To conclude, a pre-filtering procedure that incorporates pathway information for screening likely irrelevant genes prior to complex downstream analysis is recommended. Furthermore, a feature selection algorithm that considers redundant gene elimination may be preferable to one without such a consideration.
Affiliation(s)
- Suyan Tian
  - Division of Clinical Research, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
11
Tian S. Classification and survival prediction for early-stage lung adenocarcinoma and squamous cell carcinoma patients. Oncol Lett 2017; 14:5464-5470. [PMID: 29098036 PMCID: PMC5652232 DOI: 10.3892/ol.2017.6835] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 04/21/2017] [Accepted: 08/04/2017] [Indexed: 01/08/2023]
Abstract
Non-small cell lung cancer (NSCLC) is a leading cause of cancer-associated mortality worldwide. Adenocarcinoma (AC) and squamous cell carcinoma (SCC) are two primary histological subtypes of NSCLC, accounting for ~70% of lung cancer cases. Increasing evidence suggests that AC and SCC differ in the composition of genes and molecular characteristics. Previous research has focused on distinguishing AC from SCC or predicting NSCLC patient survival rates using gene expression profiles, usually with the aid of a feature selection method. The present study conducted a pre-filtering step to identify the genes that have significant expression values and a high connection with other genes in the gene network, and then used the radial coordinate visualization method to identify relevant genes. By applying the proposed procedure to NSCLC data, it was demonstrated that there is a clear segmentation between AC and SCC, but not between patients with good and bad prognoses. The focus of discriminating AC and SCC differs from survival prediction and there are almost no overlaps between the two gene signatures. Overall, a supervised learning method is preferred and future studies aiming to identify prognostic gene signatures with an increased prediction efficiency are required.
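For readers unfamiliar with radial coordinate visualization, the sketch below shows the standard projection it is built on: each feature is assigned an anchor on the unit circle, and a sample is placed at the weighted average of the anchors. It illustrates the generic technique rather than the study's specific procedure; the function name is hypothetical, and the inputs are assumed to be already scaled to the range 0 to 1.

```python
import math


def radviz_point(weights):
    """Project one sample onto the plane using radial coordinates.

    `weights` holds the sample's feature values, assumed pre-scaled to
    [0, 1]. Feature j is anchored at angle 2*pi*j/p on the unit circle,
    and the sample lands at the weighted mean of the anchors.
    """
    p = len(weights)
    total = sum(weights)
    if total == 0:
        return (0.0, 0.0)  # no feature pulls the sample anywhere
    x = sum(w * math.cos(2 * math.pi * j / p) for j, w in enumerate(weights)) / total
    y = sum(w * math.sin(2 * math.pi * j / p) for j, w in enumerate(weights)) / total
    return (x, y)


# A sample dominated by the first of four features sits on that feature's anchor.
dominated = radviz_point([1.0, 0.0, 0.0, 0.0])
# A sample with equal values on all features sits at the center.
balanced = radviz_point([0.5, 0.5, 0.5, 0.5])
```

Samples from classes that differ on a few features drift toward different anchors, which is what makes visual separation between groups possible.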
Affiliation(s)
- Suyan Tian
  - Division of Clinical Research, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
  - Center for Applied Statistical Research, School of Mathematics, Jilin University, Changchun, Jilin 130012, P.R. China
12
Xiao J, Lu X, Chen X, Zou Y, Liu A, Li W, He B, He S, Chen Q. Eight potential biomarkers for distinguishing between lung adenocarcinoma and squamous cell carcinoma. Oncotarget 2017; 8:71759-71771. [PMID: 29069744 PMCID: PMC5641087 DOI: 10.18632/oncotarget.17606] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 01/02/2017] [Accepted: 03/29/2017] [Indexed: 11/25/2022]
Abstract
Lung adenocarcinoma (LADC) and squamous cell carcinoma (LSCC) are the most common non-small cell lung cancer histological phenotypes. Accurate diagnosis distinguishing between these two lung cancer types has clinical significance. For this study, we analyzed four Gene Expression Omnibus (GEO) datasets (GSE28571, GSE37745, GSE43580, and GSE50081). We then imported the datasets into the Gene-Cloud of Biotechnology Information online platform to identify genes differentially expressed in LADC and LSCC. We identified DSG3 (desmoglein 3), KRT5 (keratin 5), KRT6A (keratin 6A), KRT6B (keratin 6B), NKX2-1 (NK2 homeobox 1), SFTA2 (surfactant associated 2), SFTA3 (surfactant associated 3), and TMC5 (transmembrane channel-like 5) as potential biomarkers for distinguishing between LADC and LSCC. Receiver operating characteristic curve analysis suggested that KRT5 had the highest diagnostic value for discriminating between these two cancer types. Using the PrognoScan online survival analysis tool and the Kaplan-Meier Plotter, we found that high KRT6A or KRT6B levels, or low NKX2-1, SFTA3, or TMC5 levels correlated with unfavorable prognoses in LADC patients. Further studies will be needed to verify our findings in additional patient samples, and to elucidate the mechanisms of action of these potential biomarkers in non-small cell lung cancer.
Affiliation(s)
- Jian Xiao
- Department of Geriatrics, Respiratory Medicine, Xiangya Hospital of Central South University, Changsha 410008, China
- Xiaoxiao Lu
- Department of Geriatrics, Respiratory Medicine, Xiangya Hospital of Central South University, Changsha 410008, China
- Xi Chen
- Department of Respiratory Medicine, Xiangya Hospital of Central South University, Changsha 410008, China
- Yong Zou
- Department of Geriatrics, Respiratory Medicine, Xiangya Hospital of Central South University, Changsha 410008, China
- Aibin Liu
- Department of Geriatrics, Xiangya Hospital of Central South University, Changsha 410008, China
- Wei Li
- Department of Geriatrics, Clinical Laboratory, Xiangya Hospital of Central South University, Changsha 410008, China
- Bixiu He
- Department of Geriatrics, Respiratory Medicine, Xiangya Hospital of Central South University, Changsha 410008, China
- Shuya He
- Department of Biochemistry & Biology, University of South China, Hengyang 421001, China
- Qiong Chen
- Department of Geriatrics, Respiratory Medicine, Xiangya Hospital of Central South University, Changsha 410008, China
|
13
|
Identification of prognostic genes and gene sets for early-stage non-small cell lung cancer using bi-level selection methods. Sci Rep 2017; 7:46164. [PMID: 28387364 PMCID: PMC5384004 DOI: 10.1038/srep46164] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 11/18/2016] [Accepted: 03/09/2017] [Indexed: 12/18/2022] Open
Abstract
In contrast to feature selection and gene set analysis, bi-level selection is a process of selecting not only important gene sets but also important genes within those gene sets. Depending on the order of selection, a bi-level selection method falls into one of three categories: forward selection, which first selects relevant gene sets and then selects relevant individual genes; backward selection, which takes the reverse order; and simultaneous selection, which performs the two tasks at once, usually with the aid of a penalized regression model. To test for the existence of subtype-specific prognostic genes in non-small cell lung cancer (NSCLC), we had previously proposed the Cox-filter method, which examines the association of patients' survival time after diagnosis with one specific gene, the disease subtypes, and their interaction terms. In this study, we extend it further to carry out forward and backward bi-level selection. Using simulations and an NSCLC application, we demonstrate that forward selection outperforms backward selection and other relevant algorithms in our setting. Both proposed methods are readily understandable and interpretable. Therefore, they represent useful tools for researchers who are interested in exploring the prognostic value of gene expression data for specific subtypes or stages of a disease.
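The forward order described above (gene sets first, then genes within the selected sets) can be sketched as follows. This is a minimal toy under stated assumptions, not the authors' Cox-filter extension: the per-gene scores stand in for whatever gene-level statistic a real analysis would produce, the set-level score is simply the mean of its members' scores, and the function name `forward_bilevel`, the parameters `n_sets` and `gene_cutoff`, and all values are hypothetical.

```python
def forward_bilevel(gene_scores, gene_sets, n_sets, gene_cutoff):
    """Forward bi-level selection: pick gene sets first, then genes within them."""
    # Level 1: score each gene set by the mean score of its member genes
    # (an assumed aggregation rule, chosen only for simplicity).
    set_scores = {
        name: sum(gene_scores[g] for g in genes) / len(genes)
        for name, genes in gene_sets.items()
    }
    top_sets = sorted(set_scores, key=set_scores.get, reverse=True)[:n_sets]

    # Level 2: within each selected set, keep genes at or above the cutoff.
    return {
        name: [g for g in gene_sets[name] if gene_scores[g] >= gene_cutoff]
        for name in top_sets
    }


# Made-up gene-level scores (larger = stronger association with outcome).
gene_scores = {"g1": 3.0, "g2": 0.5, "g3": 2.5, "g4": 0.2, "g5": 0.1, "g6": 2.8}

# Made-up gene sets; a gene may belong to more than one set.
gene_sets = {
    "set_A": ["g1", "g2", "g3"],
    "set_B": ["g4", "g5"],
    "set_C": ["g3", "g6"],
}

selected = forward_bilevel(gene_scores, gene_sets, n_sets=2, gene_cutoff=2.0)
print(selected)
```

A backward variant would apply the gene-level cutoff first and then rank gene sets using only the surviving genes, which is the reversed order the abstract describes.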
|
14
|
Classification of Non-Small Cell Lung Cancer Using Significance Analysis of Microarray-Gene Set Reduction Algorithm. Biomed Res Int 2016; 2016:2491671. [PMID: 27446945 PMCID: PMC4944087 DOI: 10.1155/2016/2491671] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/03/2015] [Revised: 05/09/2016] [Accepted: 06/05/2016] [Indexed: 01/15/2023]
Abstract
Among non-small cell lung cancers (NSCLC), adenocarcinoma (AC) and squamous cell carcinoma (SCC) are the two major histologic subtypes, accounting for roughly 40% and 30% of all lung cancer cases, respectively. Because AC and SCC differ in their cell of origin, location within the lung, and growth pattern, they are considered distinct diseases. Gene expression signatures have been demonstrated to be an effective tool for distinguishing AC from SCC. Gene set analysis is regarded as irrelevant to the identification of gene expression signatures. Nevertheless, we found that one specific gene set analysis method, significance analysis of microarray-gene set reduction (SAMGSR), can be adopted directly to select relevant features and to construct gene expression signatures. In this study, we applied SAMGSR to an NSCLC gene expression dataset. Compared with several novel feature selection algorithms, such as LASSO, SAMGSR has equivalent or better performance in terms of predictive ability and model parsimony. SAMGSR can therefore indeed serve as a feature selection algorithm. Additionally, we applied SAMGSR to the AC and SCC subtypes separately to discriminate their respective stages, that is, stage II versus stage I. The small overlap between the two resulting gene signatures illustrates that AC and SCC are technically distinct diseases. Therefore, stratified analyses by subtype are recommended when diagnostic or prognostic signatures for these two NSCLC subtypes are constructed.
|