1
Ding Q, Sun Y, Shang J, Li F, Zhang Y, Liu JX. NMFNA: A Non-negative Matrix Factorization Network Analysis Method for Identifying Modules and Characteristic Genes of Pancreatic Cancer. Front Genet 2021; 12:678642. [PMID: 34367241 PMCID: PMC8340025 DOI: 10.3389/fgene.2021.678642] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/10/2021] [Accepted: 06/03/2021] [Indexed: 01/15/2023]
Abstract
Pancreatic cancer (PC) is a highly fatal disease, yet its causes remain unclear. Comprehensive analysis of different types of PC genetic data plays a crucial role in understanding its pathogenic mechanisms. Currently, non-negative matrix factorization (NMF)-based methods are widely used for genetic data analysis. Nevertheless, it is a challenge for them to integrate and decompose different types of genetic data simultaneously. In this paper, an NMF network analysis method, NMFNA, is proposed, which introduces a graph-regularized constraint to the NMF, for identifying modules and characteristic genes from two types of PC data: methylation (ME) and copy number variation (CNV). First, three PC networks, i.e., the ME network, the CNV network, and the ME-CNV network, are constructed using the Pearson correlation coefficient (PCC). Then, modules are detected from these three PC networks effectively owing to the introduced graph-regularized constraint, which is the highlight of the NMFNA. Finally, both gene ontology (GO) and pathway enrichment analyses are performed, and characteristic genes are detected by the multimeasure score, to better understand the biological functions of PC core modules. Experimental results demonstrated that the NMFNA facilitates the integration and decomposition of two types of PC data simultaneously and can further serve as an alternative method for detecting modules and characteristic genes from multiple genetic data of complex diseases.
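The network-construction step mentioned in the abstract (linking genes by their Pearson correlation coefficient) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy gene labels, and the absolute-correlation cutoff `threshold=0.8` are assumptions made only for illustration, and the abstract does not say how edges are weighted or thresholded.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sd_x * sd_y)


def correlation_network(profiles, threshold=0.8):
    """Return {(gene_a, gene_b): r} for gene pairs with |r| >= threshold.

    `profiles` maps a gene label to its measurements across samples.
    The threshold is a hypothetical choice, not a value from the paper.
    """
    genes = sorted(profiles)
    edges = {}
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            r = pearson(profiles[gene_a], profiles[gene_b])
            if abs(r) >= threshold:
                edges[(gene_a, gene_b)] = r
    return edges


# Toy data: three hypothetical genes measured in four samples.
toy_profiles = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 1.0, 3.0, 2.0],
}
toy_edges = correlation_network(toy_profiles)
```

With the toy data only the pair ("A", "B") is kept, because the other two pairs have a correlation of -0.4, below the assumed cutoff.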
Affiliation(s)
- Qian Ding
- School of Computer Science, Qufu Normal University, Rizhao, China
- Yan Sun
- School of Computer Science, Qufu Normal University, Rizhao, China
- Junliang Shang
- School of Computer Science, Qufu Normal University, Rizhao, China
- Feng Li
- School of Computer Science, Qufu Normal University, Rizhao, China
- Yuanyuan Zhang
- School of Information and Control Engineering, Qingdao University of Technology, Qingdao, China
- Jin-Xing Liu
- School of Computer Science, Qufu Normal University, Rizhao, China
2
Gabdrakhmanov IT, Gorshkov MV, Tarasova IA. Proteomics of Cellular Response to Stress: Taking Control of False Positive Results. Biochemistry (Moscow) 2021; 86:338-349. [PMID: 33838633 DOI: 10.1134/s0006297921030093] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/23/2022]
Abstract
One of the main goals of quantitative proteomics is molecular profiling of cellular response to stress at the protein level. To perform this profiling, statistical analysis of experimental data involves multiple testing of a hypothesis about the equality of protein concentrations between the cells under normal and stress conditions. This analysis is therefore subject to the multiple testing problem, that is, the increased chance of obtaining false positive results. A number of solutions to this problem are known, yet they may lead to the loss of potentially important biological information when applied with commonly accepted thresholds of statistical significance. Using the proteomic data obtained earlier for the yeast samples containing proteins at known concentrations and the biological models of early and late cellular responses to stress, we analyzed dependences of distributions of false positive and false negative rates on the protein fold changes and thresholds of statistical significance. Based on the analysis of the density of data points in the volcano plots, the Benjamini-Hochberg method, and gene ontology analysis, a visual approach for optimization of the statistical threshold and selection of the differentially regulated proteins has been suggested, which could be useful for researchers working in the field of quantitative proteomics.
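For readers unfamiliar with the Benjamini-Hochberg method named in the abstract, the standard step-up procedure can be sketched in a few lines of Python. This is a generic textbook version, not the authors' pipeline; the function name, the `alpha=0.05` default, and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Standard step-up rule: sort the m p-values, find the largest rank k
    with p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_rank = rank
    rejected = [False] * m
    for idx in order[:largest_rank]:
        rejected[idx] = True
    return rejected


# Made-up p-values for ten hypothetical proteins.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_rejected = benjamini_hochberg(example_p, alpha=0.05)
```

In this example only the two smallest p-values pass. The third (0.039) would pass an uncorrected 0.05 threshold but not the rank-scaled one (3/10 × 0.05 = 0.015), which is the kind of information loss at conventional thresholds that the abstract discusses.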
Affiliation(s)
- Mikhail V Gorshkov
- Moscow Institute of Physics and Technology (State University), Dolgoprudny, Moscow Region, 141701, Russia; Talrose Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Irina A Tarasova
- Talrose Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
3
Abstract
Ion channels and transporters (ICT) play important roles in almost all basic cellular processes. Over the last decades, abundant evidence has been provided that ICT (e.g., Ca2+ and K+ channels) regulate physiological pancreatic duct cellular function and that deregulation of ICT is closely associated with the widely accepted hallmarks of pancreatic ductal adenocarcinoma (PDAC), such as proliferation, apoptosis resistance, invasion, and metastasis. Hence this review focuses on the role of ICT malfunctions in the context of the hallmarks of PDAC. After briefly introducing the epidemiology and the history of molecular oncology of PDAC and summarizing recent studies on molecular classification systems, we then focus on the exocrine pancreas as a very active secretory gland, which considerably impacts the changes in the ion transport system (the transportome) upon malignant transformation. We highlight the multiplicity of ICT members (H+ transporters, Ca2+, K+, Na+ and Cl- channels) and their functional impact in PDAC. We also present some selective therapeutic options to interfere with transportome functions and thereby with key mechanisms of malignant progression. This will hopefully contribute to a better clinical outcome based on improved therapeutic strategies for this still extremely deadly disease.
4
Vinga S. Structured sparsity regularization for analyzing high-dimensional omics data. Brief Bioinform 2020; 22:77-87. [PMID: 32597465 DOI: 10.1093/bib/bbaa122] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 12/15/2019] [Revised: 05/15/2020] [Accepted: 05/18/2020] [Indexed: 12/18/2022]
Abstract
The development of new molecular and cell technologies is having a significant impact on the quantity of data generated nowadays. The growth of omics databases is creating a considerable potential for knowledge discovery and, concomitantly, is bringing new challenges to statistical learning and computational biology for health applications. Indeed, the high dimensionality of these data may hamper the use of traditional regression methods and parameter estimation algorithms due to the intrinsic non-identifiability of the inherent optimization problem. Regularized optimization has been rising as a promising and useful strategy to solve these ill-posed problems by imposing additional constraints in the solution parameter space. In particular, the field of statistical learning with sparsity has been significantly contributing to building accurate models that also bring interpretability to biological observations and phenomena. Beyond the now-classic elastic net, one of the best-known methods that combine lasso with ridge penalizations, we briefly overview recent literature on structured regularizers and penalty functions that have been applied in biomedical data to build parsimonious models in a variety of underlying contexts, from survival to generalized linear models. These methods include functions of $\ell _k$-norms and network-based penalties that take into account the inherent relationships between the features. The successful application to omics data illustrates the potential of sparse structured regularization for identifying disease's molecular signatures and for creating high-performance clinical decision support systems towards more personalized healthcare. Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
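As a reminder of what such penalties look like, one common parametrization of the elastic net and of a graph-Laplacian network penalty is sketched below. The notation (loss $\mathcal{L}$, mixing weight $\alpha$, strengths $\lambda$, $\lambda_1$, $\lambda_2$, edge set $E$ with weights $w_{ij}$, Laplacian $L = D - W$) is chosen for this illustration and is not taken from the review itself.

```latex
% Elastic net: a convex combination of the lasso (l1) and ridge (squared l2) penalties.
\hat{\beta}_{\mathrm{EN}}
  = \arg\min_{\beta}\; \mathcal{L}(\beta)
  + \lambda \left( \alpha \lVert \beta \rVert_1
  + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right),
  \qquad 0 \le \alpha \le 1 .

% Network-based penalty: sparsity plus smoothness of coefficients over a feature graph,
% where L = D - W is the (unnormalized) Laplacian of the undirected weighted graph.
\hat{\beta}_{\mathrm{net}}
  = \arg\min_{\beta}\; \mathcal{L}(\beta)
  + \lambda_1 \lVert \beta \rVert_1
  + \lambda_2\, \beta^{\top} L \beta ,
  \qquad
  \beta^{\top} L \beta = \sum_{(i,j) \in E} w_{ij}\, (\beta_i - \beta_j)^2 .
```

Setting $\alpha = 1$ recovers the lasso and $\alpha = 0$ recovers ridge regression. The Laplacian term penalizes differences between coefficients of connected features, which is how relationships between features enter the model.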
Affiliation(s)
- Susana Vinga
- INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
5
Iuliano A, Occhipinti A, Angelini C, De Feis I, Liò P. Combining Pathway Identification and Breast Cancer Survival Prediction via Screening-Network Methods. Front Genet 2018; 9:206. [PMID: 29963073 PMCID: PMC6011013 DOI: 10.3389/fgene.2018.00206] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 03/05/2018] [Accepted: 05/24/2018] [Indexed: 12/30/2022]
Abstract
Breast cancer is one of the most common invasive tumors causing high mortality among women. It is characterized by high heterogeneity regarding its biological and clinical characteristics. Several high-throughput assays have been used to collect genome-wide information for many patients in large collaborative studies. This knowledge has improved our understanding of its biology and led to new methods of diagnosing and treating the disease. In particular, systems biology has become a valid approach to obtain better insights into breast cancer biological mechanisms. A crucial component of current research lies in identifying novel biomarkers that can be predictive for breast cancer patient prognosis on the basis of the molecular signature of the tumor sample. However, the high dimension and low sample size of data greatly increase the difficulty of cancer survival analysis, demanding the development of ad hoc statistical methods. In this work, we propose novel screening-network methods that predict patient survival outcome by screening key survival-related genes, and we assess the capability of the proposed approaches using the METABRIC dataset. In particular, we first identify a subset of genes by using variable screening techniques on gene expression data. Then, we perform Cox regression analysis by incorporating network information associated with the selected subset of genes. The novelty of this work consists in the improved prediction of survival responses due to the different types of screenings (i.e., a biomedical-driven, a data-driven, and a combination of the two) before building the network-penalized model. Indeed, the combination of the two screening approaches allows us to use the available biological knowledge on breast cancer and complement it with additional information emerging from the data used for the analysis. Moreover, we also illustrate how to extend the proposed approaches to integrate an additional omic layer, such as copy number aberrations, and we show that such strategies can further improve our prediction capabilities. In conclusion, our approaches allow patients to be discriminated into high- and low-risk groups using few potential biomarkers and can therefore help clinicians to provide more precise prognoses and to facilitate the subsequent clinical management of patients at risk of disease.
Affiliation(s)
- Antonella Iuliano
- Istituto per le Applicazioni del Calcolo "Mauro Picone", Consiglio Nazionale delle Ricerche, Naples, Italy; Telethon Institute of Genetics and Medicine, Pozzuoli, Italy
- Claudia Angelini
- Istituto per le Applicazioni del Calcolo "Mauro Picone", Consiglio Nazionale delle Ricerche, Naples, Italy
- Italia De Feis
- Istituto per le Applicazioni del Calcolo "Mauro Picone", Consiglio Nazionale delle Ricerche, Naples, Italy
- Pietro Liò
- Computer Laboratory, University of Cambridge, Cambridge, United Kingdom
6
Nalbantoglu S, Abu-Asab M, Tan M, Zhang X, Cai L, Amri H. Study of Clinical Survival and Gene Expression in a Sample of Pancreatic Ductal Adenocarcinoma by Parsimony Phylogenetic Analysis. OMICS: A Journal of Integrative Biology 2016; 20:442-7. [PMID: 27428255 PMCID: PMC4968342 DOI: 10.1089/omi.2016.0059] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/20/2022]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is one of the rapidly growing forms of pancreatic cancer with a poor prognosis and less than 5% 5-year survival rate. In this study, we characterized the genetic signatures and signaling pathways related to survival from PDAC, using a parsimony phylogenetic algorithm. We applied the parsimony phylogenetic algorithm to analyze the publicly available whole-genome in silico array analysis of a gene expression data set in 25 early-stage human PDAC specimens. We explain here that the parsimony phylogenetics is an evolutionary analytical method that offers important promise to uncover clonal (driver) and nonclonal (passenger) aberrations in complex diseases. In our analysis, parsimony and statistical analyses did not identify significant correlations between survival times and gene expression values. Thus, the survival rankings did not appear to be significantly different between patients for any specific gene (p > 0.05). Also, we did not find correlation between gene expression data and tumor stage in the present data set. While the present analysis was unable to identify in this relatively small sample of patients a molecular signature associated with pancreatic cancer prognosis, we suggest that future research and analyses with the parsimony phylogenetic algorithm in larger patient samples are worthwhile, given the devastating nature of pancreatic cancer and its early diagnosis, and the need for novel data analytic approaches. The future research practices might want to place greater emphasis on phylogenetics as one of the analytical paradigms, as our findings presented here are on the cusp of this shift, especially in the current era of Big Data and innovation policies advocating for greater data sharing and reanalysis.
Affiliation(s)
- Sinem Nalbantoglu
- Department of Biochemistry, Cellular and Molecular Biology, School of Medicine, Georgetown University, Washington, DC
- Mones Abu-Asab
- Laboratory of Immunology, Section of Immunopathology, National Eye Institute, Bethesda, Maryland
- Ming Tan
- Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, DC
- Xuemin Zhang
- Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, DC
- Ling Cai
- Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, DC
- Hakima Amri
- Department of Biochemistry, Cellular and Molecular Biology, School of Medicine, Georgetown University, Washington, DC
7
Iuliano A, Occhipinti A, Angelini C, De Feis I, Lió P. Cancer Markers Selection Using Network-Based Cox Regression: A Methodological and Computational Practice. Front Physiol 2016; 7:208. [PMID: 27378931 PMCID: PMC4911360 DOI: 10.3389/fphys.2016.00208] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 04/04/2016] [Accepted: 05/22/2016] [Indexed: 12/15/2022]
Abstract
International initiatives such as the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) are collecting multiple datasets at different genome-scales with the aim of identifying novel cancer biomarkers and predicting survival of patients. To analyze such data, several statistical methods have been applied, among them Cox regression models. Although these models provide a good statistical framework to analyze omic data, there is still a lack of studies that illustrate advantages and drawbacks in integrating biological information and selecting groups of biomarkers. In fact, classical Cox regression algorithms focus on the selection of a single biomarker, without taking into account the strong correlation between genes. Even though network-based Cox regression algorithms overcome such drawbacks, such network-based approaches are less widely used within the life science community. In this article, we aim to provide a clear methodological framework on the use of such approaches in order to turn cancer research results into clinical applications. Therefore, we first discuss the rationale and the practical usage of three recently proposed network-based Cox regression algorithms (i.e., Net-Cox, AdaLnet, and fastcox). Then, we show how to combine existing biological knowledge and available data with such algorithms to identify networks of cancer biomarkers and to estimate survival of patients. Finally, we describe in detail a new permutation-based approach to better validate the significance of the selection in terms of cancer gene signatures and pathway/network identification. We illustrate the proposed methodology by means of both simulations and real case studies. Overall, the aim of our work is two-fold: first, to show how network-based Cox regression models can be used to integrate biological knowledge (e.g., multi-omics data) for the analysis of survival data; second, to provide a clear methodological and computational approach for investigating cancer regulatory networks.
Affiliation(s)
- Antonella Iuliano
- Istituto per le Applicazioni del Calcolo "Mauro Picone", Consiglio Nazionale delle Ricerche, Naples, Italy
- Claudia Angelini
- Istituto per le Applicazioni del Calcolo "Mauro Picone", Consiglio Nazionale delle Ricerche, Naples, Italy
- Italia De Feis
- Istituto per le Applicazioni del Calcolo "Mauro Picone", Consiglio Nazionale delle Ricerche, Naples, Italy
- Pietro Lió
- Computer Laboratory, University of Cambridge, Cambridge, UK
8
Laimighofer M, Krumsiek J, Buettner F, Theis FJ. Unbiased Prediction and Feature Selection in High-Dimensional Survival Regression. J Comput Biol 2016; 23:279-90. [PMID: 26894327 PMCID: PMC4827277 DOI: 10.1089/cmb.2015.0192] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 12/31/2022]
Abstract
With widespread availability of omics profiling techniques, the analysis and interpretation of high-dimensional omics data, for example, for biomarkers, is becoming an increasingly important part of clinical medicine because such datasets constitute a promising resource for predicting survival outcomes. However, early experience has shown that biomarkers often generalize poorly. Thus, it is crucial that models are not overfitted and give accurate results with new data. In addition, reliable detection of multivariate biomarkers with high predictive power (feature selection) is of particular interest in clinical settings. We present an approach that addresses both aspects in high-dimensional survival models. Within a nested cross-validation (CV), we fit a survival model, evaluate a dataset in an unbiased fashion, and select features with the best predictive power by applying a weighted combination of CV runs. We evaluate our approach using simulated toy data, as well as three breast cancer datasets, to predict the survival of breast cancer patients after treatment. In all datasets, we achieve more reliable estimation of predictive power for unseen cases and better predictive performance compared to the standard CoxLasso model. Taken together, we present a comprehensive and flexible framework for survival models, including performance estimation, final feature selection, and final model construction. The proposed algorithm is implemented in an open source R package (SurvRank) available on CRAN.
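The nested cross-validation idea in the abstract (tune on inner folds, estimate performance on outer folds the tuning never saw) can be sketched generically in Python. This is not the SurvRank algorithm, which additionally combines CV runs with weights to select features. The helper names and the `fit`/`score` callables are hypothetical placeholders that a reader would replace with a real survival model and metric.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def nested_cv(n, outer_k, inner_k, fit, score, candidates):
    """Return one held-out score per outer fold.

    fit(train_indices, candidate) -> model
    score(model, test_indices)    -> float, higher is better
    Each candidate is tuned on inner folds drawn only from the outer
    training set, so the outer test fold never influences tuning.
    """
    outer_scores = []
    for test_idx in k_fold_indices(n, outer_k):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]

        best_candidate, best_mean = None, float("-inf")
        for candidate in candidates:
            inner_scores = []
            for fold in k_fold_indices(len(train_idx), inner_k):
                val_idx = [train_idx[j] for j in fold]
                val_set = set(val_idx)
                inner_train = [i for i in train_idx if i not in val_set]
                model = fit(inner_train, candidate)
                inner_scores.append(score(model, val_idx))
            mean_score = sum(inner_scores) / len(inner_scores)
            if mean_score > best_mean:
                best_candidate, best_mean = candidate, mean_score

        # Refit on the full outer training set with the tuned candidate.
        final_model = fit(train_idx, best_candidate)
        outer_scores.append(score(final_model, test_idx))
    return outer_scores
```

Because the outer test fold is excluded before any tuning happens, the outer scores estimate performance on unseen cases without the optimistic bias of reusing the tuning data.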
Affiliation(s)
- Michael Laimighofer
- Institute of Computational Biology, Helmholtz-Zentrum München, Neuherberg, Germany; Department of Mathematics, TU München, Garching, Germany
- Jan Krumsiek
- Institute of Computational Biology, Helmholtz-Zentrum München, Neuherberg, Germany; German Center for Diabetes Research (DZD), München-Neuherberg, Germany
- Florian Buettner
- Institute of Computational Biology, Helmholtz-Zentrum München, Neuherberg, Germany; European Bioinformatics Institute, European Molecular Biology Laboratory, Hinxton, Cambridge, United Kingdom
- Fabian J Theis
- Institute of Computational Biology, Helmholtz-Zentrum München, Neuherberg, Germany; Department of Mathematics, TU München, Garching, Germany
9
Sarkar FH. Novel Holistic Approaches for Overcoming Therapy Resistance in Pancreatic and Colon Cancers. Med Princ Pract 2016; 25 Suppl 2:3-10. [PMID: 26228733 PMCID: PMC5588517 DOI: 10.1159/000435814] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/27/2015] [Accepted: 06/08/2015] [Indexed: 12/12/2022]
Abstract
Gastrointestinal (GI) cancers, such as of the colon and pancreas, are highly resistant to both standard and targeted therapeutics. Therapy-resistant and heterogeneous GI cancers harbor highly complex signaling networks (the resistome) that resist apoptotic programming. Commonly used gemcitabine or platinum-based regimens fail to induce meaningful (i.e. disease-reversing) perturbations in the resistome, resulting in high rates of treatment failure. The GI cancer resistance networks are, in part, due to interactions between parallel signaling and aberrantly expressed microRNAs (miRNAs) that collectively promote the development and survival of drug-resistant cancer stem cells with epithelial-to-mesenchymal transition (EMT) characteristics. The lack of understanding of the resistance networks associated with this subpopulation of cells as well as reductionist, single protein-/pathway-targeted approaches have made 'effective drug design' a difficult task. We propose that the successful design of novel therapeutic regimens to target drug-resistant GI tumors is only possible if network-based drug avenues and agents, in particular 'natural agents' with no known toxicity, are correctly identified. Natural agents (dietary agents or their synthetic derivatives) can individually alter miRNA profiles, suppress EMT pathways and eliminate cancer stem-like cells that derive from pancreatic cancer and colon cancer, by partially targeting multiple yet meaningful networks within the GI cancer resistome. However, the efficacy of these agents as combinations (e.g. consumed in the diet) against this resistome has never been studied. This short review article provides an overview of the different challenges involved in the understanding of the GI resistome, and how novel computational biology can help in the design of effective therapies to overcome resistance.
Affiliation(s)
- Fazlul H. Sarkar
- Departments of Pathology and Oncology, Karmanos Cancer Institute, Wayne State University School of Medicine, 4100 John R, 740 HWCRC, Detroit, MI 48201, USA
10
Barry V, Klein M, Winquist A, Darrow LA, Steenland K. Disease fatality and bias in survival cohorts. Environmental Research 2015; 140:275-281. [PMID: 25880887 DOI: 10.1016/j.envres.2015.03.039] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/23/2014] [Revised: 03/19/2015] [Accepted: 03/31/2015] [Indexed: 06/04/2023]
Abstract
OBJECTIVES: Simulate how the effect of exposure on disease occurrence and fatality influences the presence and magnitude of bias in survivor cohorts, motivated by an actual survivor cohort under study. METHODS: We simulated a cohort of 50,000 subjects exposed to a disease-causing exposure over time and followed forty years, where disease incidence was the outcome of interest. We simulated this 'inception' cohort under different assumptions about the effect of exposure on disease occurrence and fatality after disease occurrence. We then created a corresponding 'survivor' (or 'cross-sectional') cohort, where cohort enrollment took place at a specific date after exposure began in the inception cohort; subjects dying prior to that enrollment date were excluded. The disease of interest caused all deaths in our simulations, but was not always fatal. In the survivor cohort, person-time at risk began before enrollment for all subjects who did not die prior to enrollment. We compared exposure-disease associations in each inception cohort to those in corresponding survivor cohorts to determine how different assumptions impacted bias in the survivor cohorts. All subjects in both inception and survivor cohorts were considered equally susceptible to the effect of exposure in causing disease. We used Cox proportional hazards regression to calculate effect measures. RESULTS: There was no bias in survivor cohort estimates when case fatality among diseased subjects was independent of exposure. This was true even when the disease was highly fatal and more highly exposed subjects were more likely to develop disease and die. Assuming a positive exposure-response in the inception cohort, survivor cohort rate ratios were biased downwards when case fatality was greater with higher exposure. CONCLUSIONS: Survivor cohort effect estimates for fatal outcomes are not always biased, although precision can decrease.
Affiliation(s)
- Vaughn Barry
- Departments of Environmental Health and Epidemiology, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA 30322 USA
- Mitchel Klein
- Departments of Environmental Health and Epidemiology, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA 30322 USA
- Andrea Winquist
- Departments of Environmental Health and Epidemiology, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA 30322 USA
- Lyndsey A Darrow
- Departments of Environmental Health and Epidemiology, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA 30322 USA
- Kyle Steenland
- Departments of Environmental Health and Epidemiology, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA 30322 USA
11
Gong H, Klinger J, Damazyn K, Li X, Huang S. A novel procedure for statistical inference and verification of gene regulatory subnetwork. BMC Bioinformatics 2015; 16 Suppl 7:S7. [PMID: 25952938 PMCID: PMC4423581 DOI: 10.1186/1471-2105-16-s7-s7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022]
Abstract
Background: The reconstruction of gene regulatory networks from time course microarray data can help us comprehensively understand the biological system and discover the pathogenesis of cancer and other diseases. However, correctly and efficiently deciphering the gene regulatory network from high-throughput gene expression data is a big challenge due to the relatively small number of observations and the curse of dimensionality. Computational biologists have developed many statistical inference and machine learning algorithms to analyze microarray data. In previous studies, the correctness of an inferred regulatory network was checked manually by comparing it with a public database or an existing model. Results: In this work, we present a novel procedure to automatically infer and verify gene regulatory networks from time series expression data. The dynamic Bayesian network, a statistical inference algorithm, is first implemented to infer an optimal network from time series microarray data of S. cerevisiae; then, a weighted symbolic model checker is applied to automatically verify or falsify the inferred network by checking desired temporal logic formulas abstracted from experiments or public databases. Conclusions: Our studies show that the marriage of a statistical inference algorithm with a model checking technique provides a more efficient way to automatically infer and verify the gene regulatory network from time series expression data than previous studies.
12
Cell signaling events differentiate ER-negative subtypes from ER-positive breast cancer. Med Oncol 2015; 32:142. [DOI: 10.1007/s12032-015-0565-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/20/2015] [Accepted: 03/10/2015] [Indexed: 10/23/2022]
13
Liu W, Wang Q, Zhao J, Zhang C, Liu Y, Zhang J, Bai X, Li X, Feng H, Liao M, Wang W, Li C. Integration of pathway structure information into a reweighted partial Cox regression approach for survival analysis on high-dimensional gene expression data. Molecular BioSystems 2015; 11:1876-86. [DOI: 10.1039/c5mb00044k] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
Abstract
Accurately predicting the risk of cancer relapse or death is important for clinical utility.