Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Chen X, Wang L. Integrating biological knowledge with gene expression profiles for survival prediction of cancer. J Comput Biol 2009;16:265-78. [PMID: 19183004 DOI: 10.1089/cmb.2008.12tt] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023] Open

For:	Chen X, Wang L. Integrating biological knowledge with gene expression profiles for survival prediction of cancer. J Comput Biol 2009;16:265-78. [PMID: 19183004 DOI: 10.1089/cmb.2008.12tt] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023] Open

Number

Cited by Other Article(s)

Shen J, Wang S, Sun H, Huang J, Bai L, Wang X, Dong Y, Tang Z. A novel non-negative Bayesian stacking modeling method for Cancer survival prediction using high-dimensional omics data. BMC Med Res Methodol 2024;24:105. [PMID: 38702624 PMCID: PMC11067084 DOI: 10.1186/s12874-024-02232-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2023] [Accepted: 04/23/2024] [Indexed: 05/06/2024] Open

Abstract

BACKGROUND

Survival prediction using high-dimensional molecular data is a hot topic in the field of genomics and precision medicine, especially for cancer studies. Considering that carcinogenesis has a pathway-based pathogenesis, developing models using such group structures is a closer mimic of disease progression and prognosis. Many approaches can be used to integrate group information; however, most of them are single-model methods, which may account for unstable prediction.

METHODS

We introduced a novel survival stacking method that modeled using group structure information to improve the robustness of cancer survival prediction in the context of high-dimensional omics data. With a super learner, survival stacking combines the prediction from multiple sub-models that are independently trained using the features in pre-grouped biological pathways. In addition to a non-negative linear combination of sub-models, we extended the super learner to non-negative Bayesian hierarchical generalized linear model and artificial neural network. We compared the proposed modeling strategy with the widely used survival penalized method Lasso Cox and several group penalized methods, e.g., group Lasso Cox, via simulation study and real-world data application.

RESULTS

The proposed survival stacking method showed superior and robust performance in terms of discrimination compared with single-model methods in case of high-noise simulated data and real-world data. The non-negative Bayesian stacking method can identify important biological signal pathways and genes that are associated with the prognosis of cancer.

CONCLUSIONS

This study proposed a novel survival stacking strategy incorporating biological group information into the cancer prognosis models. Additionally, this study extended the super learner to non-negative Bayesian model and ANN, enriching the combination of sub-models. The proposed Bayesian stacking strategy exhibited favorable properties in the prediction and interpretation of complex survival data, which may aid in discovering cancer targets.

Collapse

Affiliation(s)

Junjie Shen Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China
Shuo Wang Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center-University of Freiburg, 79085, Freiburg, Germany
Hao Sun Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China
Jie Huang Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China
Lu Bai Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China
Xichao Wang Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China
Yongfei Dong Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China
Zaixiang Tang Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China.

Collapse

Shen J, Wang S, Dong Y, Sun H, Wang X, Tang Z. A non-negative spike-and-slab lasso generalized linear stacking prediction modeling method for high-dimensional omics data. BMC Bioinformatics 2024;25:119. [PMID: 38509499 PMCID: PMC10953151 DOI: 10.1186/s12859-024-05741-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2023] [Accepted: 03/11/2024] [Indexed: 03/22/2024] Open

Chen Y, Liu S, Papageorgiou LG, Theofilatos K, Tsoka S. Optimisation Models for Pathway Activity Inference in Cancer. Cancers (Basel) 2023;15:1787. [PMID: 36980673 PMCID: PMC10046797 DOI: 10.3390/cancers15061787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2023] [Revised: 02/24/2023] [Accepted: 03/08/2023] [Indexed: 03/18/2023] Open

Abstract

BACKGROUND

With advances in high-throughput technologies, there has been an enormous increase in data related to profiling the activity of molecules in disease. While such data provide more comprehensive information on cellular actions, their large volume and complexity pose difficulty in accurate classification of disease phenotypes. Therefore, novel modelling methods that can improve accuracy while offering interpretable means of analysis are required. Biological pathways can be used to incorporate a priori knowledge of biological interactions to decrease data dimensionality and increase the biological interpretability of machine learning models.

METHODOLOGY

A mathematical optimisation model is proposed for pathway activity inference towards precise disease phenotype prediction and is applied to RNA-Seq datasets. The model is based on mixed-integer linear programming (MILP) mathematical optimisation principles and infers pathway activity as the linear combination of pathway member gene expression, multiplying expression values with model-determined gene weights that are optimised to maximise discrimination of phenotype classes and minimise incorrect sample allocation.

RESULTS

The model is evaluated on the transcriptome of breast and colorectal cancer, and exhibits solution results of good optimality as well as good prediction performance on related cancer subtypes. Two baseline pathway activity inference methods and three advanced methods are used for comparison. Sample prediction accuracy, robustness against noise expression data, and survival analysis suggest competitive prediction performance of our model while providing interpretability and insight on key pathways and genes. Overall, our work demonstrates that the flexible nature of mathematical programming lends itself well to developing efficient computational strategies for pathway activity inference and disease subtype prediction.

Collapse

Wang W, Liu W. PCLasso: a protein complex-based, group lasso-Cox model for accurate prognosis and risk protein complex discovery. Brief Bioinform 2021;22:6291946. [PMID: 34086850 DOI: 10.1093/bib/bbab212] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2021] [Revised: 05/08/2021] [Accepted: 05/15/2021] [Indexed: 12/12/2022] Open

Raghu VK, Ge X, Balajiee A, Shirer DJ, Das I, Benos PV, Chrysanthis PK. A Pipeline for Integrated Theory and Data-Driven Modeling of Biomedical Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:811-822. [PMID: 32841121 PMCID: PMC8237279 DOI: 10.1109/tcbb.2020.3019237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]

Wang W, Liu W. Integration of gene interaction information into a reweighted Lasso-Cox model for accurate survival prediction. Bioinformatics 2020;36:5405-5414. [PMID: 33325490 DOI: 10.1093/bioinformatics/btaa1046] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2020] [Revised: 11/13/2020] [Accepted: 12/07/2020] [Indexed: 11/14/2022] Open

Abstract Abstract Motivation Accurately predicting the risk of cancer patients is a central challenge for clinical cancer research. For high-dimensional gene expression data, Cox proportional hazard model with the least absolute shrinkage and selection operator for variable selection (Lasso-Cox) is one of the most popular feature selection and risk prediction algorithms. However, the Lasso-Cox model treats all genes equally, ignoring the biological characteristics of the genes themselves. This often encounters the problem of poor prognostic performance on independent datasets. Results Here, we propose a Reweighted Lasso-Cox (RLasso-Cox) model to ameliorate this problem by integrating gene interaction information. It is based on the hypothesis that topologically important genes in the gene interaction network tend to have stable expression changes. We used random walk to evaluate the topological weight of genes, and then highlighted topologically important genes to improve the generalization ability of the RLasso-Cox model. Experiments on datasets of three cancer types showed that the RLasso-Cox model improves the prognostic accuracy and robustness compared with the Lasso-Cox model and several existing network-based methods. More importantly, the RLasso-Cox model has the advantage of identifying small gene sets with high prognostic performance on independent datasets, which may play an important role in identifying robust survival biomarkers for various cancer types. Availability and implementation http://bioconductor.org/packages/devel/bioc/html/RLassoCox.html Supplementary information Supplementary data are available at Bioinformatics online. Collapse

Perscheid C. Integrative biomarker detection on high-dimensional gene expression data sets: a survey on prior knowledge approaches. Brief Bioinform 2020;22:5881664. [PMID: 32761115 DOI: 10.1093/bib/bbaa151] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2020] [Revised: 06/15/2020] [Accepted: 06/16/2020] [Indexed: 02/06/2023] Open

Guan X, Runger G, Liu L. Dynamic incorporation of prior knowledge from multiple domains in biomarker discovery. BMC Bioinformatics 2020;21:77. [PMID: 32164534 PMCID: PMC7068914 DOI: 10.1186/s12859-020-3344-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open

Abstract

Background

In biomarker discovery, applying domain knowledge is an effective approach to eliminating false positive features, prioritizing functionally impactful markers and facilitating the interpretation of predictive signatures. Several computational methods have been developed that formulate the knowledge-based biomarker discovery as a feature selection problem guided by prior information. These methods often require that prior information is encoded as a single score and the algorithms are optimized for biological knowledge of a specific type. However, in practice, domain knowledge from diverse resources can provide complementary information. But no current methods can integrate heterogeneous prior information for biomarker discovery. To address this problem, we developed the Know-GRRF (know-guided regularized random forest) method that enables dynamic incorporation of domain knowledge from multiple disciplines to guide feature selection.

Results

Know-GRRF embeds domain knowledge in a regularized random forest framework. It combines prior information from multiple domains in a linear model to derive a composite score, which, together with other tuning parameters, controls the regularization of the random forests model. Know-GRRF concurrently optimizes the weight given to each type of domain knowledge and other tuning parameters to minimize the AIC of out-of-bag predictions. The objective is to select a compact feature subset that has a high discriminative power and strong functional relevance to the biological phenotype.

Via rigorous simulations, we show that Know-GRRF guided by multiple-domain prior information outperforms feature selection methods guided by single-domain prior information or no prior information. We then applied Known-GRRF to a real-world study to identify prognostic biomarkers of prostate cancers. We evaluated the combination of cancer-related gene annotations, evolutionary conservation and pre-computed statistical scores as the prior knowledge to assemble a panel of biomarkers. We discovered a compact set of biomarkers with significant improvements on prediction accuracies.

Conclusions

Know-GRRF is a powerful novel method to incorporate knowledge from multiple domains for feature selection. It has a broad range of applications in biomarker discoveries. We implemented this method and released a KnowGRRF package in the R/CRAN archive.

Collapse

Liu W, Wang W, Tian G, Xie W, Lei L, Liu J, Huang W, Xu L, Li E. Topologically inferring pathway activity for precise survival outcome prediction: breast cancer as a case. MOLECULAR BIOSYSTEMS 2017;13:537-548. [PMID: 28098303 DOI: 10.1039/c6mb00757k] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]

Matsui S, Crowley J. An Evaluation of Gene Set Analysis for Biomarker Discovery with Applications to Myeloma Research. FRONTIERS OF BIOSTATISTICAL METHODS AND APPLICATIONS IN CLINICAL ONCOLOGY 2017. [PMCID: PMC7120714 DOI: 10.1007/978-981-10-0126-0_25] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Hira ZM, Gillies DF. Identifying Significant Features in Cancer Methylation Data Using Gene Pathway Segmentation. Cancer Inform 2016;15:189-98. [PMID: 27688706 PMCID: PMC5030825 DOI: 10.4137/cin.s39859] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2016] [Revised: 06/19/2016] [Accepted: 07/03/2016] [Indexed: 12/19/2022] Open

Abstract

In order to provide the most effective therapy for cancer, it is important to be able to diagnose whether a patient's cancer will respond to a proposed treatment. Methylation profiling could contain information from which such predictions could be made. Currently, hypothesis testing is used to determine whether possible biomarkers for cancer progression produce statistically significant results. However, this approach requires the identification of individual genes, or sets of genes, as candidate hypotheses, and with the increasing size of modern microarrays, this task is becoming progressively harder. Exhaustive testing of small sets of genes is computationally infeasible, and so hypothesis generation depends either on the use of established biological knowledge or on heuristic methods. As an alternative machine learning, methods can be used to identify groups of genes that are acting together within sets of cancer data and associate their behaviors with cancer progression. These methods have the advantage of being multivariate and unbiased but unfortunately also rapidly become computationally infeasible as the number of gene probes and datasets increases. To address this problem, we have investigated a way of utilizing prior knowledge to segment microarray datasets in such a way that machine learning can be used to identify candidate sets of genes for hypothesis testing. A methylation dataset is divided into subsets, where each subset contains only the probes that relate to a known gene pathway. Each of these pathway subsets is used independently for classification. The classification method is AdaBoost with decision trees as weak classifiers. Since each pathway subset contains a relatively small number of gene probes, it is possible to train and test its classification accuracy quickly and determine whether it has valuable diagnostic information. Finally, genes from successful pathway subsets can be combined to create a classifier of high accuracy.

Collapse

Plant miRNA function prediction based on functional similarity network and transductive multi-label classification algorithm. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.12.011] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023]

Kamkar I, Gupta SK, Phung D, Venkatesh S. Stabilizing l1-norm prediction models by supervised feature grouping. J Biomed Inform 2015;59:149-68. [PMID: 26689771 DOI: 10.1016/j.jbi.2015.11.012] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2015] [Revised: 11/18/2015] [Accepted: 11/23/2015] [Indexed: 01/05/2023]

Meng J, Li R, Luan Y. Classification by integrating plant stress response gene expression data with biological knowledge. Math Biosci 2015;266:65-72. [PMID: 26092610 DOI: 10.1016/j.mbs.2015.06.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2015] [Revised: 05/03/2015] [Accepted: 06/05/2015] [Indexed: 12/01/2022]

Hira ZM, Gillies DF. A Review of Feature Selection and Feature Extraction Methods Applied on Microarray Data. Adv Bioinformatics 2015;2015:198363. [PMID: 26170834 PMCID: PMC4480804 DOI: 10.1155/2015/198363] [Citation(s) in RCA: 282] [Impact Index Per Article: 31.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2015] [Accepted: 05/18/2015] [Indexed: 02/07/2023] Open

Yang Y, Li D, Yang Y, Jiang G. An integrated analysis of the effects of microRNA and mRNA on esophageal squamous cell carcinoma. Mol Med Rep 2015;12:945-52. [PMID: 25823933 PMCID: PMC4438920 DOI: 10.3892/mmr.2015.3557] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2014] [Accepted: 03/16/2015] [Indexed: 01/29/2023] Open

Zhou XL, Wu JH, Wang XJ, Guo FJ. Integrated microRNA-mRNA analysis revealing the potential roles of microRNAs in tongue squamous cell cancer. Mol Med Rep 2015;12:885-94. [PMID: 25760063 PMCID: PMC4438953 DOI: 10.3892/mmr.2015.3467] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2014] [Accepted: 02/02/2015] [Indexed: 01/21/2023] Open

Meng J, Zhang J, Luan Y. Gene Selection Integrated with Biological Knowledge for Plant Stress Response Using Neighborhood System and Rough Set Theory. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015;12:433-444. [PMID: 26357229 DOI: 10.1109/tcbb.2014.2361329] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]

Ye S, Dawson JA, Kendziorski C. Extending information retrieval methods to personalized genomic-based studies of disease. Cancer Inform 2015;13:85-95. [PMID: 25733795 PMCID: PMC4332045 DOI: 10.4137/cin.s16354] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2014] [Revised: 10/22/2014] [Accepted: 10/23/2014] [Indexed: 01/30/2023] Open

Liu W, Wang Q, Zhao J, Zhang C, Liu Y, Zhang J, Bai X, Li X, Feng H, Liao M, Wang W, Li C. Integration of pathway structure information into a reweighted partial Cox regression approach for survival analysis on high-dimensional gene expression data. MOLECULAR BIOSYSTEMS 2015;11:1876-86. [DOI: 10.1039/c5mb00044k] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

Hassanzadeh HR, Phan JH, Wang MD. A semi-supervised method for predicting cancer survival using incomplete clinical data. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2015;2015:210-213. [PMID: 26736237 DOI: 10.1109/embc.2015.7318337] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]

Yang L, Ainali C, Tsoka S, Papageorgiou LG. Pathway activity inference for multiclass disease classification through a mathematical programming optimisation framework. BMC Bioinformatics 2014;15:390. [PMID: 25475756 PMCID: PMC4269079 DOI: 10.1186/s12859-014-0390-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2014] [Accepted: 11/19/2014] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Applying machine learning methods on microarray gene expression profiles for disease classification problems is a popular method to derive biomarkers, i.e. sets of genes that can predict disease state or outcome. Traditional approaches where expression of genes were treated independently suffer from low prediction accuracy and difficulty of biological interpretation. Current research efforts focus on integrating information on protein interactions through biochemical pathway datasets with expression profiles to propose pathway-based classifiers that can enhance disease diagnosis and prognosis. As most of the pathway activity inference methods in literature are either unsupervised or applied on two-class datasets, there is good scope to address such limitations by proposing novel methodologies.

RESULTS

A supervised multiclass pathway activity inference method using optimisation techniques is reported. For each pathway expression dataset, patterns of its constituent genes are summarised into one composite feature, termed pathway activity, and a novel mathematical programming model is proposed to infer this feature as a weighted linear summation of expression of its constituent genes. Gene weights are determined by the optimisation model, in a way that the resulting pathway activity has the optimal discriminative power with regards to disease phenotypes. Classification is then performed on the resulting low-dimensional pathway activity profile.

CONCLUSIONS

The model was evaluated through a variety of published gene expression profiles that cover different types of disease. We show that not only does it improve classification accuracy, but it can also perform well in multiclass disease datasets, a limitation of other approaches from the literature. Desirable features of the model include the ability to control the maximum number of genes that may participate in determining pathway activity, which may be pre-specified by the user. Overall, this work highlights the potential of building pathway-based multi-phenotype classifiers for accurate disease diagnosis and prognosis problems.

Collapse

Hira ZM, Trigeorgis G, Gillies DF. An algorithm for finding biologically significant features in microarray data based on a priori manifold learning. PLoS One 2014;9:e90562. [PMID: 24595155 PMCID: PMC3940899 DOI: 10.1371/journal.pone.0090562] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2013] [Accepted: 02/02/2014] [Indexed: 11/19/2022] Open

Gu S, Su P, Yan J, Zhang X, An X, Gao J, Xin R, Liu Y. Comparison of gene expression profiles and related pathways in chronic thromboembolic pulmonary hypertension. Int J Mol Med 2013;33:277-300. [PMID: 24337368 PMCID: PMC3896458 DOI: 10.3892/ijmm.2013.1582] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2013] [Accepted: 12/03/2013] [Indexed: 01/08/2023] Open

Yang Y, Li H, Hou S, Hu B, Liu J, Wang J. Differences in gene expression profiles and carcinogenesis pathways involved in cisplatin resistance of four types of cancer. Oncol Rep 2013;30:596-614. [PMID: 23733047 DOI: 10.3892/or.2013.2514] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2012] [Accepted: 03/04/2013] [Indexed: 11/06/2022] Open

Zycinski G, Barla A, Squillario M, Sanavia T, Camillo BD, Verri A. Knowledge Driven Variable Selection (KDVS) - a new approach to enrichment analysis of gene signatures obtained from high-throughput data. SOURCE CODE FOR BIOLOGY AND MEDICINE 2013;8:2. [PMID: 23302187 PMCID: PMC3605163 DOI: 10.1186/1751-0473-8-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/28/2012] [Accepted: 12/13/2012] [Indexed: 11/10/2022]

Abstract

Background

High–throughput (HT) technologies provide huge amount of gene expression data that can be used to identify biomarkers useful in the clinical practice. The most frequently used approaches first select a set of genes (i.e. gene signature) able to characterize differences between two or more phenotypical conditions, and then provide a functional assessment of the selected genes with an a posteriori enrichment analysis, based on biological knowledge. However, this approach comes with some drawbacks. First, gene selection procedure often requires tunable parameters that affect the outcome, typically producing many false hits. Second, a posteriori enrichment analysis is based on mapping between biological concepts and gene expression measurements, which is hard to compute because of constant changes in biological knowledge and genome analysis. Third, such mapping is typically used in the assessment of the coverage of gene signature by biological concepts, that is either score–based or requires tunable parameters as well, limiting its power.

Results

We present Knowledge Driven Variable Selection (KDVS), a framework that uses a priori biological knowledge in HT data analysis. The expression data matrix is transformed, according to prior knowledge, into smaller matrices, easier to analyze and to interpret from both computational and biological viewpoints. Therefore KDVS, unlike most approaches, does not exclude a priori any function or process potentially relevant for the biological question under investigation. Differently from the standard approach where gene selection and functional assessment are applied independently, KDVS embeds these two steps into a unified statistical framework, decreasing the variability derived from the threshold–dependent selection, the mapping to the biological concepts, and the signature coverage. We present three case studies to assess the usefulness of the method.

Conclusions

We showed that KDVS not only enables the selection of known biological functionalities with accuracy, but also identification of new ones. An efficient implementation of KDVS was devised to obtain results in a fast and robust way. Computing time is drastically reduced by the effective use of distributed resources. Finally, integrated visualization techniques immediately increase the interpretability of results. Overall, KDVS approach can be considered as a viable alternative to enrichment–based approaches.

Collapse

Perez-Rathke A, Li H, Lussier YA. Interpreting personal transcriptomes: personalized mechanism-scale profiling of RNA-seq data. PACIFIC SYMPOSIUM ON BIOCOMPUTING. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2013:159-170. [PMID: 23424121 PMCID: PMC3595401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]

Kim J, Sohn I, Jung SH, Kim S, Park C. Analysis of Survival Data with Group Lasso. COMMUN STAT-SIMUL C 2012. [DOI: 10.1080/03610918.2011.611311] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]

Sanavia T, Aiolli F, Da San Martino G, Bisognin A, Di Camillo B. Improving biomarker list stability by integration of biological knowledge in the learning process. BMC Bioinformatics 2012;13 Suppl 4:S22. [PMID: 22536969 PMCID: PMC3314566 DOI: 10.1186/1471-2105-13-s4-s22] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023] Open

Abstract

BACKGROUND

The identification of robust lists of molecular biomarkers related to a disease is a fundamental step for early diagnosis and treatment. However, methodologies for biomarker discovery using microarray data often provide results with limited overlap. It has been suggested that one reason for these inconsistencies may be that in complex diseases, such as cancer, multiple genes belonging to one or more physiological pathways are associated with the outcomes. Thus, a possible approach to improve list stability is to integrate biological information from genomic databases in the learning process; however, a comprehensive assessment based on different types of biological information is still lacking in the literature. In this work we have compared the effect of using different biological information in the learning process like functional annotations, protein-protein interactions and expression correlation among genes.

RESULTS

Biological knowledge has been codified by means of gene similarity matrices and expression data linearly transformed in such a way that the more similar two features are, the more closely they are mapped. Two semantic similarity matrices, based on Biological Process and Molecular Function Gene Ontology annotation, and geodesic distance applied on protein-protein interaction networks, are the best performers in improving list stability maintaining almost equal prediction accuracy.

CONCLUSIONS

The performed analysis supports the idea that when some features are strongly correlated to each other, for example because are close in the protein-protein interaction network, then they might have similar importance and are equally relevant for the task at hand. Obtained results can be a starting point for additional experiments on combining similarity matrices in order to obtain even more stable lists of biomarkers. The implementation of the classification algorithm is available at the link: http://www.math.unipd.it/~dasan/biomarkers.html.

Collapse

Yang X, Regan K, Huang Y, Zhang Q, Li J, Seiwert TY, Cohen EEW, Xing HR, Lussier YA. Single sample expression-anchored mechanisms predict survival in head and neck cancer. PLoS Comput Biol 2012;8:e1002350. [PMID: 22291585 PMCID: PMC3266878 DOI: 10.1371/journal.pcbi.1002350] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2011] [Accepted: 11/28/2011] [Indexed: 12/11/2022] Open

Abstract

Gene expression signatures that are predictive of therapeutic response or prognosis are increasingly useful in clinical care; however, mechanistic (and intuitive) interpretation of expression arrays remains an unmet challenge. Additionally, there is surprisingly little gene overlap among distinct clinically validated expression signatures. These “causality challenges” hinder the adoption of signatures as compared to functionally well-characterized single gene biomarkers. To increase the utility of multi-gene signatures in survival studies, we developed a novel approach to generate “personal mechanism signatures” of molecular pathways and functions from gene expression arrays. FAIME, the Functional Analysis of Individual Microarray Expression, computes mechanism scores using rank-weighted gene expression of an individual sample. By comparing head and neck squamous cell carcinoma (HNSCC) samples with non-tumor control tissues, the precision and recall of deregulated FAIME-derived mechanisms of pathways and molecular functions are comparable to those produced by conventional cohort-wide methods (e.g. GSEA). The overlap of “Oncogenic FAIME Features of HNSCC” (statistically significant and differentially regulated FAIME-derived genesets representing GO functions or KEGG pathways derived from HNSCC tissue) among three distinct HNSCC datasets (pathways:46%, p<0.001) is more significant than the gene overlap (genes:4%). These Oncogenic FAIME Features of HNSCC can accurately discriminate tumors from control tissues in two additional HNSCC datasets (n = 35 and 91, F-accuracy = 100% and 97%, empirical p<0.001, area under the receiver operating characteristic curves = 99% and 92%), and stratify recurrence-free survival in patients from two independent studies (p = 0.0018 and p = 0.032, log-rank). Previous approaches depending on group assignment of individual samples before selecting features or learning a classifier are limited by design to discrete-class prediction. In contrast, FAIME calculates mechanism profiles for individual patients without requiring group assignment in validation sets. FAIME is more amenable for clinical deployment since it translates the gene-level measurements of each given sample into pathways and molecular function profiles that can be applied to analyze continuous phenotypes in clinical outcome studies (e.g. survival time, tumor volume).

Clinical utilization of multi-gene expression signatures that are predictive of therapeutic response has been steadily increasing, however, interpretation of such results remains challenging because multi-gene signatures, generated from analyzing different patient cohorts, tend to be equally predictive but contain minimal overlap. Whereas pathway-level analyses of expression arrays show promise for generating clinically meaningful mechanistic signatures, current approaches do not permit single-patient based analyses that are independent of cross-group calculations. To bridge the gap between deterministic biological mechanisms of single-gene biomarkers and the statistical predictive power of multi-gene signatures that are disconnected from mechanisms, we developed FAIME, a novel method that transforms microarray gene expression data into individualized patient profiles of molecular mechanisms. We have validated its capability for predicting clinical outcomes, including cancer patient samples derived from six different clinical trial cohorts of head and neck cancers. This method provides opportunities to harness an untapped resource for personal genomics: clinical evaluation and testing of individually interpretable mechanistic profiles derived from gene expression arrays.

Collapse

Affiliation(s)

Xinan Yang Center for Biomedical Informatics, The University of Chicago, Chicago, Illinois, United States of America Section of Genetic Medicine, The University of Chicago, Chicago, Illinois, United States of America
Kelly Regan Center for Biomedical Informatics, The University of Chicago, Chicago, Illinois, United States of America
Yong Huang Center for Biomedical Informatics, The University of Chicago, Chicago, Illinois, United States of America Section of Genetic Medicine, The University of Chicago, Chicago, Illinois, United States of America
Qingbei Zhang Center for Biomedical Informatics, The University of Chicago, Chicago, Illinois, United States of America Section of Genetic Medicine, The University of Chicago, Chicago, Illinois, United States of America
Jianrong Li Center for Biomedical Informatics, The University of Chicago, Chicago, Illinois, United States of America Section of Genetic Medicine, The University of Chicago, Chicago, Illinois, United States of America
Tanguy Y. Seiwert Section of Hematology/Oncology of the Department of Medicine, The University of Chicago, Chicago, Illinois, United States of America Comprehensive Cancer Center, The University of Chicago, Chicago, Illinois, United States of America
Ezra E. W. Cohen Section of Hematology/Oncology of the Department of Medicine, The University of Chicago, Chicago, Illinois, United States of America Comprehensive Cancer Center, The University of Chicago, Chicago, Illinois, United States of America
H. Rosie Xing Comprehensive Cancer Center, The University of Chicago, Chicago, Illinois, United States of America Departments of Pathology and of Cellular and Radiation Oncology, The University of Chicago, Chicago, Illinois, United States of America Ludwig Center for Metastasis Research, The University of Chicago, Chicago, Illinois, United States of America
Yves A. Lussier Center for Biomedical Informatics, The University of Chicago, Chicago, Illinois, United States of America Section of Genetic Medicine, The University of Chicago, Chicago, Illinois, United States of America Comprehensive Cancer Center, The University of Chicago, Chicago, Illinois, United States of America Departments of Pathology and of Cellular and Radiation Oncology, The University of Chicago, Chicago, Illinois, United States of America Ludwig Center for Metastasis Research, The University of Chicago, Chicago, Illinois, United States of America Computation Institute, Institute for Translational Medicine, and Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois, United States of America * E-mail:

Collapse

Lee S, Kim J, Lee S. A comparative study on gene-set analysis methods for assessing differential expression associated with the survival phenotype. BMC Bioinformatics 2011;12:377. [PMID: 21943316 PMCID: PMC3196970 DOI: 10.1186/1471-2105-12-377] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2011] [Accepted: 09/26/2011] [Indexed: 11/10/2022] Open

Abstract

Background

Many gene-set analysis methods have been previously proposed and compared through simulation studies and analysis of real datasets for binary phenotypes. We focused on the survival phenotype and compared the performances of Gene Set Enrichment Analysis (GSEA), Global Test (GT), Wald-type Test (WT) and Global Boost Test (GBST) methods in a simulation study and on two ovarian cancer data sets. We considered two versions of GSEA by allowing different weights: GSEA1 uses equal weights, yielding results similar to the Kolmogorov-Smirnov test; while GSEA2's weights are based on the correlation between genes and the phenotype.

Results

We compared GSEA1, GSEA2, GT, WT and GBST in a simulation study with various settings for the correlation structure of the genes and the association parameter between the survival outcome and the genes. Simulation results indicated that GT, WT and GBST consistently have higher power than GSEA1 and GSEA2 across all scenarios. However, the power of the five tests depends on the combination of correlation structure and association parameter. For the ovarian cancer data set, using the FDR threshold of q < 0.1, the GT, WT and GBST detected 12, 6 and 8 significant pathways, respectively, whereas neither GSEA1 nor GSEA2 detected any significant pathways. In addition, among the pathways found significant by GT, WT, and GBST, three pathways - Purine metabolism, Leukocyte transendothelial migration and Jak-STAT signaling pathway - overlapped with those reported in previous ovarian cancer microarray studies.

Conclusion

Simulation studies and a real data example indicate that GT, WT and GBST tend to have high power, whereas GSEA1 and GSEA2 have lower power. We also found that the power of the five tests is much higher when genes are correlated than when genes are independent, when survival is positively associated with genes. It seems that there is a synergistic effect in detecting significant gene sets when significant genes have within-class correlation and the association between survival and genes is positive or negative (i.e., one-direction correlation).

Collapse

Zhao Z, Liu Y, He H, Chen X, Chen J, Lu YC. Candidate genes influencing sensitivity and resistance of human glioblastoma to Semustine. Brain Res Bull 2011;86:189-94. [PMID: 21807073 DOI: 10.1016/j.brainresbull.2011.07.010] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2011] [Revised: 07/07/2011] [Accepted: 07/13/2011] [Indexed: 12/28/2022]

Abstract

OBJECTIVE

The prognosis of glioblastoma (GBM) is poor. The therapeutic outcome of conventional surgical and adjuvant treatments remains unsatisfactory, and therefore individualized adjuvant chemotherapy has aroused more attention. Microarrays have been applied to study mechanism of GBM development and progression but it has difficulty in determining responsible genes from the plethora of genes on microarrays unrelated to outcome. The present study was attempted to use bioinformatics method to investigate candidate genes that may influence chemosensitivity of GBM to Semustine (Me-CCNU).

METHODS

Clinical data of 4 GBM patients in Affymetrix microarray were perfected through long-term follow-up study. Differential expression genes between the long- and short-survival groups were picked out, GO-analysis and pathway-analysis of the differential expression genes were performed. Me-CCNU-related signal transduction networks were constructed. The methods combined three steps before were used to screen core genes that influenced Me-CCNU chemosensitivity in GBM.

RESULTS

In Affymetrix microarray there were altogether 2018 differential expression genes that influenced survival duration of GBM. Of them, 934 genes were up-regulated and 1084 down-regulated. They mainly participated in 94 pathways. Me-CCNU-related signal transduction networks were constructed. The total number of genes in the networks was 466, of which 66 were also found in survival duration-related differential expression genes. Studied key genes through GO-analysis, pathway-analysis and in the Me-CCNU-related signal transduction networks, 25 core genes that influenced chemosensitivity of GBM to Me-CCNU were obtained, including TP53, MAP2K2, EP300, PRKCA, TNF, CCND1, AKT2, RBL1, CDC2, ID2, RAF1, CDKN2C, FGFR1, SP1, CDK6, IGFBP3, MDM4, PDGFD, SOCS2, CCNG2, CDK2, SDC2, STMN1, TCF7L1, TUBB.

CONCLUSION

Bioinformatics may help excavate and analyze large amounts of data in microarrays by means of rigorous experimental planning, scientific statistical analysis and collection of complete data about survival of GBM patients. In the present study, a novel differential gene expression pattern was constructed and advanced study will provide new targets for chemosensitivity of GBM.

Collapse

Chen X, Wang L, Ishwaran H. An Integrative Pathway-based Clinical-genomic Model for Cancer Survival Prediction. Stat Probab Lett 2010;80:1313-1319. [PMID: 21731150 DOI: 10.1016/j.spl.2010.04.011] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/19/2023]

Lee KH, Lee SH. Predicting Survival of DLBCL Patients in Pathway-Based Microarray Analysis. KOREAN JOURNAL OF APPLIED STATISTICS 2010. [DOI: 10.5351/kjas.2010.23.4.705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

He Z, Yu W. Stable feature selection for biomarker discovery. Comput Biol Chem 2010;34:215-25. [PMID: 20702140 DOI: 10.1016/j.compbiolchem.2010.07.002] [Citation(s) in RCA: 131] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2010] [Revised: 06/27/2010] [Accepted: 07/10/2010] [Indexed: 12/27/2022]

Banerjee D. Reinventing diagnostics for personalized therapy in oncology. Cancers (Basel) 2010;2:1066-91. [PMID: 24281107 PMCID: PMC3835119 DOI: 10.3390/cancers2021066] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2010] [Revised: 05/15/2010] [Accepted: 05/28/2010] [Indexed: 11/16/2022] Open

Podo F, Buydens LMC, Degani H, Hilhorst R, Klipp E, Gribbestad IS, Van Huffel S, van Laarhoven HWM, Luts J, Monleon D, Postma GJ, Schneiderhan-Marra N, Santoro F, Wouters H, Russnes HG, Sørlie T, Tagliabue E, Børresen-Dale AL. Triple-negative breast cancer: present challenges and new perspectives. Mol Oncol 2010;4:209-29. [PMID: 20537966 PMCID: PMC5527939 DOI: 10.1016/j.molonc.2010.04.006] [Citation(s) in RCA: 225] [Impact Index Per Article: 16.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2010] [Accepted: 04/16/2010] [Indexed: 12/28/2022] Open

De Haan JR, Piek E, van Schaik RC, de Vlieg J, Bauerschmidt S, Buydens LMC, Wehrens R. Integrating gene expression and GO classification for PCA by preclustering. BMC Bioinformatics 2010;11:158. [PMID: 20346140 PMCID: PMC2860362 DOI: 10.1186/1471-2105-11-158] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2009] [Accepted: 03/26/2010] [Indexed: 12/03/2022] Open

Bright LA, Burgess SC, Chowdhary B, Swiderski CE, McCarthy FM. Structural and functional-annotation of an equine whole genome oligoarray. BMC Bioinformatics 2009;10 Suppl 11:S8. [PMID: 19811692 PMCID: PMC3226197 DOI: 10.1186/1471-2105-10-s11-s8] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open

Abstract

Background

The horse genome is sequenced, allowing equine researchers to use high-throughput functional genomics platforms such as microarrays; next-generation sequencing for gene expression and proteomics. However, for researchers to derive value from these functional genomics datasets, they must be able to model this data in biologically relevant ways; to do so requires that the equine genome be more fully annotated. There are two interrelated types of genomic annotation: structural and functional. Structural annotation is delineating and demarcating the genomic elements (such as genes, promoters, and regulatory elements). Functional annotation is assigning function to structural elements. The Gene Ontology (GO) is the de facto standard for functional annotation, and is routinely used as a basis for modelling and hypothesis testing, large functional genomics datasets.

Results

An Equine Whole Genome Oligonucleotide (EWGO) array with 21,351 elements was developed at Texas A&M University. This 70-mer oligoarray was designed using the approximately 7× assembled and annotated sequence of the equine genome to be one of the most comprehensive arrays available for expressed equine sequences. To assist researchers in determining the biological meaning of data derived from this array, we have structurally annotated it by mapping the elements to multiple database accessions, including UniProtKB, Entrez Gene, NRPD (Non-Redundant Protein Database) and UniGene. We next provided GO functional annotations for the gene transcripts represented on this array. Overall, we GO annotated 14,531 gene products (68.1% of the gene products represented on the EWGO array) with 57,912 annotations. GAQ (GO Annotation Quality) scores were calculated for this array both before and after we added GO annotation. The additional annotations improved the meanGAQ score 16-fold. This data is publicly available at AgBase http://www.agbase.msstate.edu/.

Conclusion

Providing additional information about the public databases which link to the gene products represented on the array allows users more flexibility when using gene expression modelling and hypothesis-testing computational tools. Moreover, since different databases provide different types of information, users have access to multiple data sources. In addition, our GO annotation underpins functional modelling for most gene expression analysis tools and enables equine researchers to model large lists of differentially expressed transcripts in biologically relevant ways.

Collapse

Guan P, Huang D, He M, Zhou B. Lung cancer gene expression database analysis incorporating prior knowledge with support vector machine-based classification method. JOURNAL OF EXPERIMENTAL & CLINICAL CANCER RESEARCH : CR 2009;28:103. [PMID: 19615083 PMCID: PMC2719616 DOI: 10.1186/1756-9966-28-103] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/03/2009] [Accepted: 07/18/2009] [Indexed: 01/13/2023]