Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Daub CO, Steuer R, Selbig J, Kloska S. Estimating mutual information using B-spline functions--an improved similarity measure for analysing gene expression data. BMC Bioinformatics 2004;5:118. [PMID: 15339346 PMCID: PMC516800 DOI: 10.1186/1471-2105-5-118] [Citation(s) in RCA: 194] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2003] [Accepted: 08/31/2004] [Indexed: 11/10/2022] Open

For:	Daub CO, Steuer R, Selbig J, Kloska S. Estimating mutual information using B-spline functions--an improved similarity measure for analysing gene expression data. BMC Bioinformatics 2004;5:118. [PMID: 15339346 PMCID: PMC516800 DOI: 10.1186/1471-2105-5-118] [Citation(s) in RCA: 194] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2003] [Accepted: 08/31/2004] [Indexed: 11/10/2022] Open

Number

Cited by Other Article(s)

McCain JSP, Britten GL, Hackett SR, Follows MJ, Li GW. Microbial reaction rate estimation using proteins and proteomes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.08.13.607198. [PMID: 39185172 PMCID: PMC11343155 DOI: 10.1101/2024.08.13.607198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/27/2024]

Chee FT, Harun S, Mohd Daud K, Sulaiman S, Nor Muhammad NA. Exploring gene regulation and biological processes in insects: Insights from omics data using gene regulatory network models. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2024;189:1-12. [PMID: 38604435 DOI: 10.1016/j.pbiomolbio.2024.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/02/2023] [Revised: 12/18/2023] [Accepted: 04/03/2024] [Indexed: 04/13/2024]

Pinchas A, Ben-Gal I, Painsky A. A Comparative Analysis of Discrete Entropy Estimators for Large-Alphabet Problems. ENTROPY (BASEL, SWITZERLAND) 2024;26:369. [PMID: 38785618 PMCID: PMC11120205 DOI: 10.3390/e26050369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/15/2024] [Revised: 04/25/2024] [Accepted: 04/25/2024] [Indexed: 05/25/2024]

Tian J, Lei J, Roeder K. From local to global gene co-expression estimation using single-cell RNA-seq data. Biometrics 2024;80:ujae001. [PMID: 38465983 PMCID: PMC10926266 DOI: 10.1093/biomtc/ujae001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2023] [Revised: 10/01/2023] [Accepted: 01/15/2024] [Indexed: 03/12/2024]

Alanis-Lobato G, Bartlett TE, Huang Q, Simon CS, McCarthy A, Elder K, Snell P, Christie L, Niakan KK. MICA: a multi-omics method to predict gene regulatory networks in early human embryos. Life Sci Alliance 2024;7:e202302415. [PMID: 37879938 PMCID: PMC10599980 DOI: 10.26508/lsa.202302415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2023] [Revised: 10/12/2023] [Accepted: 10/13/2023] [Indexed: 10/27/2023] Open

Karell-Albo JA, Legón-Pérez CM, Socorro-Llanes R, Rojas O, Sosa-Gómez G. Complexity Reduction in Analyzing Independence between Statistical Randomness Tests Using Mutual Information. ENTROPY (BASEL, SWITZERLAND) 2023;25:1545. [PMID: 37998237 PMCID: PMC10670732 DOI: 10.3390/e25111545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/11/2023] [Revised: 11/09/2023] [Accepted: 11/13/2023] [Indexed: 11/25/2023]

Schiffthaler B, van Zalen E, Serrano AR, Street NR, Delhomme N. Seiðr: Efficient calculation of robust ensemble gene networks. Heliyon 2023;9:e16811. [PMID: 37313140 PMCID: PMC10258422 DOI: 10.1016/j.heliyon.2023.e16811] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2022] [Revised: 05/22/2023] [Accepted: 05/29/2023] [Indexed: 06/15/2023] Open

Ito Y, Uda S, Kokaji T, Hirayama A, Soga T, Suzuki Y, Kuroda S, Kubota H. Comparison of hepatic responses to glucose perturbation between healthy and obese mice based on the edge type of network structures. Sci Rep 2023;13:4758. [PMID: 36959243 PMCID: PMC10036622 DOI: 10.1038/s41598-023-31547-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2022] [Accepted: 03/14/2023] [Indexed: 03/25/2023] Open

Affiliation(s)

Yuki Ito Division of Integrated Omics, Medical Research Center for High Depth Omics, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka, 812-8582, Japan Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8562, Japan
Shinsuke Uda Division of Integrated Omics, Medical Research Center for High Depth Omics, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka, 812-8582, Japan.
Toshiya Kokaji Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8562, Japan Data Science Center, Nara Institute of Science and Technology, 8916-5, Takayamacho, Ikoma, Nara, 630-0192, Japan
Akiyoshi Hirayama Institute for Advanced Biosciences, Keio University, 246-2 Mizukami, Kakuganji, Tsuruoka, Yamagata, 997-0052, Japan
Tomoyoshi Soga Institute for Advanced Biosciences, Keio University, 246-2 Mizukami, Kakuganji, Tsuruoka, Yamagata, 997-0052, Japan
Yutaka Suzuki Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8562, Japan
Shinya Kuroda Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8562, Japan Department of Biological Sciences, Graduate School of Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan Core Research for Evolutional Science and Technology (CREST), Japan Science and Technology Agency, Bunkyo-ku, Tokyo, 113-0033, Japan
Hiroyuki Kubota Division of Integrated Omics, Medical Research Center for High Depth Omics, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka, 812-8582, Japan

Collapse

Shachaf LI, Roberts E, Cahan P, Xiao J. Gene regulation network inference using k-nearest neighbor-based mutual information estimation: revisiting an old DREAM. BMC Bioinformatics 2023;24:84. [PMID: 36879188 PMCID: PMC9990267 DOI: 10.1186/s12859-022-05047-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2021] [Accepted: 11/08/2022] [Indexed: 03/08/2023] Open

Abstract

BACKGROUND

A cell exhibits a variety of responses to internal and external cues. These responses are possible, in part, due to the presence of an elaborate gene regulatory network (GRN) in every single cell. In the past 20 years, many groups worked on reconstructing the topological structure of GRNs from large-scale gene expression data using a variety of inference algorithms. Insights gained about participating players in GRNs may ultimately lead to therapeutic benefits. Mutual information (MI) is a widely used metric within this inference/reconstruction pipeline as it can detect any correlation (linear and non-linear) between any number of variables (n-dimensions). However, the use of MI with continuous data (for example, normalized fluorescence intensity measurement of gene expression levels) is sensitive to data size, correlation strength and underlying distributions, and often requires laborious and, at times, ad hoc optimization.

RESULTS

In this work, we first show that estimating MI of a bi- and tri-variate Gaussian distribution using k-nearest neighbor (kNN) MI estimation results in significant error reduction as compared to commonly used methods based on fixed binning. Second, we demonstrate that implementing the MI-based kNN Kraskov-Stoögbauer-Grassberger (KSG) algorithm leads to a significant improvement in GRN reconstruction for popular inference algorithms, such as Context Likelihood of Relatedness (CLR). Finally, through extensive in-silico benchmarking we show that a new inference algorithm CMIA (Conditional Mutual Information Augmentation), inspired by CLR, in combination with the KSG-MI estimator, outperforms commonly used methods.

CONCLUSIONS

Using three canonical datasets containing 15 synthetic networks, the newly developed method for GRN reconstruction-which combines CMIA, and the KSG-MI estimator-achieves an improvement of 20-35% in precision-recall measures over the current gold standard in the field. This new method will enable researchers to discover new gene interactions or better choose gene candidates for experimental validations.

Collapse

Afshar S, Braun PR, Han S, Lin Y. A multimodal deep learning model to infer cell-type-specific functional gene networks. BMC Bioinformatics 2023;24:47. [PMID: 36788477 PMCID: PMC9926713 DOI: 10.1186/s12859-023-05146-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2022] [Accepted: 01/11/2023] [Indexed: 02/16/2023] Open

Abstract

BACKGROUND

Functional gene networks (FGNs) capture functional relationships among genes that vary across tissues and cell types. Construction of cell-type-specific FGNs enables the understanding of cell-type-specific functional gene relationships and insights into genetic mechanisms of human diseases in disease-relevant cell types. However, most existing FGNs were developed without consideration of specific cell types within tissues.

RESULTS

In this study, we created a multimodal deep learning model (MDLCN) to predict cell-type-specific FGNs in the human brain by integrating single-nuclei gene expression data with global protein interaction networks. We systematically evaluated the prediction performance of the MDLCN and showed its superior performance compared to two baseline models (boosting tree and convolutional neural network). Based on the predicted cell-type-specific FGNs, we observed that cell-type marker genes had a higher level of hubness than non-marker genes in their corresponding cell type. Furthermore, we showed that risk genes underlying autism and Alzheimer's disease were more strongly connected in disease-relevant cell types, supporting the cellular context of predicted cell-type-specific FGNs.

CONCLUSIONS

Our study proposes a powerful deep learning approach (MDLCN) to predict FGNs underlying a diverse set of cell types in human brain. The MDLCN model enhances prediction accuracy of cell-type-specific FGNs compared to single modality convolutional neural network (CNN) and boosting tree models, as shown by higher areas under both receiver operating characteristic (ROC) and precision-recall curves for different levels of independent test datasets. The predicted FGNs also show evidence for the cellular context and distinct topological features (i.e. higher hubness and topological score) of cell-type marker genes. Moreover, we observed stronger modularity among disease-associated risk genes in FGNs of disease-relevant cell types. For example, the strength of connectivity among autism risk genes was stronger in neurons, but risk genes underlying Alzheimer's disease were more connected in microglia.

Collapse

Altay G, Zapardiel-Gonzalo J, Peters B. RNA-seq preprocessing and sample size considerations for gene network inference. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.02.522518. [PMID: 36711979 PMCID: PMC9881880 DOI: 10.1101/2023.01.02.522518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/18/2023]

Abstract

Background

Gene network inference (GNI) methods have the potential to reveal functional relationships between different genes and their products. Most GNI algorithms have been developed for microarray gene expression datasets and their application to RNA-seq data is relatively recent. As the characteristics of RNA-seq data are different from microarray data, it is an unanswered question what preprocessing methods for RNA-seq data should be applied prior to GNI to attain optimal performance, or what the required sample size for RNA-seq data is to obtain reliable GNI estimates.

Results

We ran 9144 analysis of 7 different RNA-seq datasets to evaluate 300 different preprocessing combinations that include data transformations, normalizations and association estimators. We found that there was no single best performing preprocessing combination but that there were several good ones. The performance varied widely over various datasets, which emphasized the importance of choosing an appropriate preprocessing configuration before GNI. Two preprocessing combinations appeared promising in general: First, Log-2 TPM (transcript per million) with Variance-stabilizing transformation (VST) and Pearson Correlation Coefficient (PCC) association estimator. Second, raw RNA-seq count data with PCC. Along with these two, we also identified 18 other good preprocessing combinations. Any of these algorithms might perform best in different datasets. Therefore, the GNI performances of these approaches should be measured on any new dataset to select the best performing one for it. In terms of the required biological sample size of RNA-seq data, we found that between 30 to 85 samples were required to generate reliable GNI estimates.

Conclusions

This study provides practical recommendations on default choices for data preprocessing prior to GNI analysis of RNA-seq data to obtain optimal performance results.

Collapse

Tu M, Zeng J, Zhang J, Fan G, Song G. Unleashing the power within short-read RNA-seq for plant research: Beyond differential expression analysis and toward regulomics. FRONTIERS IN PLANT SCIENCE 2022;13:1038109. [PMID: 36570898 PMCID: PMC9773216 DOI: 10.3389/fpls.2022.1038109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/06/2022] [Accepted: 11/21/2022] [Indexed: 06/17/2023]

Lei J, Cai Z, He X, Zheng W, Liu J. An approach of gene regulatory network construction using mixed entropy optimizing context-related likelihood mutual information. Bioinformatics 2022;39:6808612. [PMID: 36342190 PMCID: PMC9805593 DOI: 10.1093/bioinformatics/btac717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2022] [Revised: 09/18/2022] [Accepted: 11/04/2022] [Indexed: 11/09/2022] Open

Abstract

MOTIVATION

The question of how to construct gene regulatory networks has long been a focus of biological research. Mutual information can be used to measure nonlinear relationships, and it has been widely used in the construction of gene regulatory networks. However, this method cannot measure indirect regulatory relationships under the influence of multiple genes, which reduces the accuracy of inferring gene regulatory networks.

APPROACH

This work proposes a method for constructing gene regulatory networks based on mixed entropy optimizing context-related likelihood mutual information (MEOMI). First, two entropy estimators were combined to calculate the mutual information between genes. Then, distribution optimization was performed using a context-related likelihood algorithm to eliminate some indirect regulatory relationships and obtain the initial gene regulatory network. To obtain the complex interaction between genes and eliminate redundant edges in the network, the initial gene regulatory network was further optimized by calculating the conditional mutual inclusive information (CMI2) between gene pairs under the influence of multiple genes. The network was iteratively updated to reduce the impact of mutual information on the overestimation of the direct regulatory intensity.

RESULTS

The experimental results show that the MEOMI method performed better than several other kinds of gene network construction methods on DREAM challenge simulated datasets (DREAM3 and DREAM5), three real Escherichia coli datasets (E.coli SOS pathway network, E.coli SOS DNA repair network and E.coli community network) and two human datasets.

AVAILABILITY AND IMPLEMENTATION

Source code and dataset are available at https://github.com/Dalei-Dalei/MEOMI/ and http://122.205.95.139/MEOMI/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Collapse

Functional Network: A Novel Framework for Interpretability of Deep Neural Networks. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.11.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]

Feng H, Zheng R, Wang J, Wu FX, Li M. NIMCE: A Gene Regulatory Network Inference Approach Based on Multi Time Delays Causal Entropy. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:1042-1049. [PMID: 33035155 DOI: 10.1109/tcbb.2020.3029846] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]

Karkowska R, Urjasz S. Linear and Nonlinear Effects in Connectedness Structure: Comparison between European Stock Markets. ENTROPY (BASEL, SWITZERLAND) 2022;24:303. [PMID: 35205597 PMCID: PMC8870905 DOI: 10.3390/e24020303] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/10/2022] [Revised: 02/17/2022] [Accepted: 02/18/2022] [Indexed: 12/04/2022]

Identifying large scale interaction atlases using probabilistic graphs and external knowledge. J Clin Transl Sci 2022;6:e27. [PMID: 35321220 PMCID: PMC8922291 DOI: 10.1017/cts.2022.18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2021] [Revised: 12/29/2021] [Accepted: 02/07/2022] [Indexed: 11/17/2022] Open

Du Q, Campbell MT, Yu H, Liu K, Walia H, Zhang Q, Zhang C. Gene Co-expression Network Analysis and Linking Modules to Phenotyping Response in Plants. Methods Mol Biol 2022;2539:261-268. [PMID: 35895209 DOI: 10.1007/978-1-0716-2537-8_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]

Shang J, Wang J, Sun Y, Li F, Liu JX, Zhang H. Multiscale part mutual information for quantifying nonlinear direct associations in networks. Bioinformatics 2021;37:2920-2929. [PMID: 33730153 DOI: 10.1093/bioinformatics/btab182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2020] [Revised: 02/15/2021] [Accepted: 03/15/2021] [Indexed: 02/02/2023] Open

Single-cell analysis reveals the pan-cancer invasiveness-associated transition of adipose-derived stromal cells into COL11A1-expressing cancer-associated fibroblasts. PLoS Comput Biol 2021;17:e1009228. [PMID: 34283835 PMCID: PMC8323949 DOI: 10.1371/journal.pcbi.1009228] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2021] [Revised: 07/30/2021] [Accepted: 06/30/2021] [Indexed: 01/01/2023] Open

Singh U, Li J, Seetharam A, Wurtele ES. pyrpipe: a Python package for RNA-Seq workflows. NAR Genom Bioinform 2021;3:lqab049. [PMID: 34085037 PMCID: PMC8168212 DOI: 10.1093/nargab/lqab049] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2021] [Revised: 05/06/2021] [Accepted: 05/18/2021] [Indexed: 02/06/2023] Open

African Americans and European Americans exhibit distinct gene expression patterns across tissues and tumors associated with immunologic functions and environmental exposures. Sci Rep 2021;11:9905. [PMID: 33972602 PMCID: PMC8110974 DOI: 10.1038/s41598-021-89224-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2020] [Accepted: 04/21/2021] [Indexed: 12/20/2022] Open

Contreras Rodríguez L, Madarro-Capó EJ, Legón-Pérez CM, Rojas O, Sosa-Gómez G. Selecting an Effective Entropy Estimator for Short Sequences of Bits and Bytes with Maximum Entropy. ENTROPY (BASEL, SWITZERLAND) 2021;23:561. [PMID: 33946438 PMCID: PMC8147137 DOI: 10.3390/e23050561] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/05/2021] [Revised: 04/26/2021] [Accepted: 04/28/2021] [Indexed: 11/22/2022]

Wang YXR, Li L, Li JJ, Huang H. Network Modeling in Biology: Statistical Methods for Gene and Brain Networks. Stat Sci 2021;36:89-108. [PMID: 34305304 PMCID: PMC8296984 DOI: 10.1214/20-sts792] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]

Pan-cancer driver copy number alterations identified by joint expression/CNA data analysis. Sci Rep 2020;10:17199. [PMID: 33057153 PMCID: PMC7566486 DOI: 10.1038/s41598-020-74276-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2020] [Accepted: 09/29/2020] [Indexed: 02/07/2023] Open

Xia Y. Correlation and association analyses in microbiome study integrating multiomics in health and disease. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2020;171:309-491. [PMID: 32475527 DOI: 10.1016/bs.pmbts.2020.04.003] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]

Abstract

Correlation and association analyses are one of the most widely used statistical methods in research fields, including microbiome and integrative multiomics studies. Correlation and association have two implications: dependence and co-occurrence. Microbiome data are structured as phylogenetic tree and have several unique characteristics, including high dimensionality, compositionality, sparsity with excess zeros, and heterogeneity. These unique characteristics cause several statistical issues when analyzing microbiome data and integrating multiomics data, such as large p and small n, dependency, overdispersion, and zero-inflation. In microbiome research, on the one hand, classic correlation and association methods are still applied in real studies and used for the development of new methods; on the other hand, new methods have been developed to target statistical issues arising from unique characteristics of microbiome data. Here, we first provide a comprehensive view of classic and newly developed univariate correlation and association-based methods. We discuss the appropriateness and limitations of using classic methods and demonstrate how the newly developed methods mitigate the issues of microbiome data. Second, we emphasize that concepts of correlation and association analyses have been shifted by introducing network analysis, microbe-metabolite interactions, functional analysis, etc. Third, we introduce multivariate correlation and association-based methods, which are organized by the categories of exploratory, interpretive, and discriminatory analyses and classification methods. Fourth, we focus on the hypothesis testing of univariate and multivariate regression-based association methods, including alpha and beta diversities-based, count-based, and relative abundance (or compositional)-based association analyses. We demonstrate the characteristics and limitations of each approaches. Fifth, we introduce two specific microbiome-based methods: phylogenetic tree-based association analysis and testing for survival outcomes. Sixth, we provide an overall view of longitudinal methods in analysis of microbiome and omics data, which cover standard, static, regression-based time series methods, principal trend analysis, and newly developed univariate overdispersed and zero-inflated as well as multivariate distance/kernel-based longitudinal models. Finally, we comment on current association analysis and future direction of association analysis in microbiome and multiomics studies.

Collapse

Uda S. Application of information theory in systems biology. Biophys Rev 2020;12:377-384. [PMID: 32144740 PMCID: PMC7242537 DOI: 10.1007/s12551-020-00665-w] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2020] [Accepted: 02/25/2020] [Indexed: 12/12/2022] Open

Singh U, Hur M, Dorman K, Wurtele ES. MetaOmGraph: a workbench for interactive exploratory data analysis of large expression datasets. Nucleic Acids Res 2020;48:e23. [PMID: 31956905 PMCID: PMC7039010 DOI: 10.1093/nar/gkz1209] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2019] [Revised: 12/05/2019] [Accepted: 12/17/2019] [Indexed: 12/17/2022] Open

Li J, Lai Y, Zhang C, Zhang Q. TGCnA: temporal gene coexpression network analysis using a low-rank plus sparse framework. J Appl Stat 2019;47:1064-1083. [PMID: 35706920 PMCID: PMC9041782 DOI: 10.1080/02664763.2019.1667311] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2018] [Accepted: 09/09/2019] [Indexed: 10/26/2022]

Masnadi-Shirazi M, Maurya MR, Pao G, Ke E, Verma IM, Subramaniam S. Time varying causal network reconstruction of a mouse cell cycle. BMC Bioinformatics 2019;20:294. [PMID: 31142274 PMCID: PMC6542064 DOI: 10.1186/s12859-019-2895-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2019] [Accepted: 05/13/2019] [Indexed: 12/21/2022] Open

Abstract

Background

Biochemical networks are often described through static or time-averaged measurements of the component macromolecules. Temporal variation in these components plays an important role in both describing the dynamical nature of the network as well as providing insights into causal mechanisms. Few methods exist, specifically for systems with many variables, for analyzing time series data to identify distinct temporal regimes and the corresponding time-varying causal networks and mechanisms.

Results

In this study, we use well-constructed temporal transcriptional measurements in a mammalian cell during a cell cycle, to identify dynamical networks and mechanisms describing the cell cycle. The methods we have used and developed in part deal with Granger causality, Vector Autoregression, Estimation Stability with Cross Validation and a nonparametric change point detection algorithm that enable estimating temporally evolving directed networks that provide a comprehensive picture of the crosstalk among different molecular components. We applied our approach to RNA-seq time-course data spanning nearly two cell cycles from Mouse Embryonic Fibroblast (MEF) primary cells. The change-point detection algorithm is able to extract precise information on the duration and timing of cell cycle phases. Using Least Absolute Shrinkage and Selection Operator (LASSO) and Estimation Stability with Cross Validation (ES-CV), we were able to, without any prior biological knowledge, extract information on the phase-specific causal interaction of cell cycle genes, as well as temporal interdependencies of biological mechanisms through a complete cell cycle.

Conclusions

The temporal dependence of cellular components we provide in our model goes beyond what is known in the literature. Furthermore, our inference of dynamic interplay of multiple intracellular mechanisms and their temporal dependence on one another can be used to predict time-varying cellular responses, and provide insight on the design of precise experiments for modulating the regulation of the cell cycle.

Electronic supplementary material

The online version of this article (10.1186/s12859-019-2895-1) contains supplementary material, which is available to authorized users.

Collapse

Larmuseau M, Verbeke LPC, Marchal K. Associating expression and genomic data using co-occurrence measures. Biol Direct 2019;14:10. [PMID: 31072345 PMCID: PMC6507230 DOI: 10.1186/s13062-019-0240-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2018] [Accepted: 04/10/2019] [Indexed: 12/11/2022] Open

Wang Y, Yang S, Zhao J, Du W, Liang Y, Wang C, Zhou F, Tian Y, Ma Q. Using Machine Learning to Measure Relatedness Between Genes: A Multi-Features Model. Sci Rep 2019;9:4192. [PMID: 30862804 PMCID: PMC6414665 DOI: 10.1038/s41598-019-40780-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2018] [Accepted: 02/19/2019] [Indexed: 12/20/2022] Open

Affiliation(s)

Yan Wang Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
Sen Yang Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
Jing Zhao Population Health Group, Sanford Research, Sioux Falls, SD, 57104, USA.,Department of Internal Medicine, Sanford School of Medicine, University of South Dakota, Sioux Falls, SD, 57105, USA
Wei Du Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
Yanchun Liang Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China.,Zhuhai Laboratory of Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Department of Computer Science and Technology, Zhuhai College of Jilin University, Zhuhai, 519041, China
Cankun Wang Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture, and Plant Science, Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, 57006, USA
Fengfeng Zhou Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
Yuan Tian Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China. .,School of Artificial Intelligence, Jilin University, Changchun, 130012, China.
Qin Ma Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture, and Plant Science, Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, 57006, USA. .,Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, 43210, USA.

Collapse

Wang YXR, Liu K, Theusch E, Rotter JI, Medina MW, Waterman MS, Huang H, Stegle O. Generalized correlation measure using count statistics for gene expression data with ordered samples. Bioinformatics 2018;34:617-624. [PMID: 29040382 DOI: 10.1093/bioinformatics/btx641] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2017] [Accepted: 10/11/2017] [Indexed: 12/22/2022] Open

Abstract

Motivation

Capturing association patterns in gene expression levels under different conditions or time points is important for inferring gene regulatory interactions. In practice, temporal changes in gene expression may result in complex association patterns that require more sophisticated detection methods than simple correlation measures. For instance, the effect of regulation may lead to time-lagged associations and interactions local to a subset of samples. Furthermore, expression profiles of interest may not be aligned or directly comparable (e.g. gene expression profiles from two species).

Results

We propose a count statistic for measuring association between pairs of gene expression profiles consisting of ordered samples (e.g. time-course), where correlation may only exist locally in subsequences separated by a position shift. The statistic is simple and fast to compute, and we illustrate its use in two applications. In a cross-species comparison of developmental gene expression levels, we show our method not only measures association of gene expressions between the two species, but also provides alignment between different developmental stages. In the second application, we applied our statistic to expression profiles from two distinct phenotypic conditions, where the samples in each profile are ordered by the associated phenotypic values. The detected associations can be useful in building correspondence between gene association networks under different phenotypes. On the theoretical side, we provide asymptotic distributions of the statistic for different regions of the parameter space and test its power on simulated data.

Availability and implementation

The code used to perform the analysis is available as part of the Supplementary Material.

Contact

msw@usc.edu or hhuang@stat.berkeley.edu.

Supplementary information

Supplementary data are available at Bioinformatics online.

Collapse

Evaluation of metabolite-microbe correlation detection methods. Anal Biochem 2018;567:106-111. [PMID: 30557528 DOI: 10.1016/j.ab.2018.12.008] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2018] [Revised: 12/07/2018] [Accepted: 12/10/2018] [Indexed: 12/13/2022]

Abbaszadeh O, Khanteymoori AR, Azarpeyvand A. Parallel Algorithms for Inferring Gene Regulatory Networks: A Review. Curr Genomics 2018;19:603-614. [PMID: 30386172 PMCID: PMC6194435 DOI: 10.2174/1389202919666180601081718] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2017] [Revised: 02/20/2018] [Accepted: 05/22/2018] [Indexed: 11/22/2022] Open

Jackknife approach to the estimation of mutual information. Proc Natl Acad Sci U S A 2018;115:9956-9961. [PMID: 30224466 DOI: 10.1073/pnas.1715593115] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Ellwanger DC, Scheibinger M, Dumont RA, Barr-Gillespie PG, Heller S. Transcriptional Dynamics of Hair-Bundle Morphogenesis Revealed with CellTrails. Cell Rep 2018;23:2901-2914.e13. [PMID: 29874578 PMCID: PMC6089258 DOI: 10.1016/j.celrep.2018.05.002] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2017] [Revised: 03/19/2018] [Accepted: 05/01/2018] [Indexed: 11/30/2022] Open

Development of stock correlation networks using mutual information and financial big data. PLoS One 2018;13:e0195941. [PMID: 29668715 PMCID: PMC5905993 DOI: 10.1371/journal.pone.0195941] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2016] [Accepted: 03/18/2018] [Indexed: 11/19/2022] Open

Xiong W, Wang C, Zhang X, Yang Q, Shao R, Lai J, Du C. Highly interwoven communities of a gene regulatory network unveil topologically important genes for maize seed development. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2017;92:1143-1156. [PMID: 29072883 DOI: 10.1111/tpj.13750] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/02/2017] [Revised: 10/10/2017] [Accepted: 10/17/2017] [Indexed: 06/07/2023]

Erdoğan C, Kurt Z, Diri B. Estimation of the proteomic cancer co-expression sub networks by using association estimators. PLoS One 2017;12:e0188016. [PMID: 29145449 PMCID: PMC5690670 DOI: 10.1371/journal.pone.0188016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2017] [Accepted: 10/29/2017] [Indexed: 01/02/2023] Open

Jokipii‐Lukkari S, Sundell D, Nilsson O, Hvidsten TR, Street NR, Tuominen H. NorWood: a gene expression resource for evo-devo studies of conifer wood development. THE NEW PHYTOLOGIST 2017;216:482-494. [PMID: 28186632 PMCID: PMC6079643 DOI: 10.1111/nph.14458] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/21/2016] [Accepted: 12/22/2016] [Indexed: 05/04/2023]

Omranian N, Nikoloski Z. Computational Approaches to Study Gene Regulatory Networks. Methods Mol Biol 2017. [PMID: 28623592 DOI: 10.1007/978-1-4939-7125-1_18] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023]

Ray SS, Misra S. A supervised weighted similarity measure for gene expressions using biological knowledge. Gene 2016;595:150-160. [PMID: 27688070 DOI: 10.1016/j.gene.2016.09.033] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2016] [Revised: 08/18/2016] [Accepted: 09/22/2016] [Indexed: 11/17/2022]

Discovering Genome-Wide Tag SNPs Based on the Mutual Information of the Variants. PLoS One 2016;11:e0167994. [PMID: 27992465 PMCID: PMC5161470 DOI: 10.1371/journal.pone.0167994] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2016] [Accepted: 11/23/2016] [Indexed: 01/01/2023] Open

Chockalingam S, Aluru M, Aluru S. Microarray Data Processing Techniques for Genome-Scale Network Inference from Large Public Repositories. MICROARRAYS 2016;5:microarrays5030023. [PMID: 27657141 PMCID: PMC5040970 DOI: 10.3390/microarrays5030023] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/26/2016] [Revised: 09/06/2016] [Accepted: 09/13/2016] [Indexed: 11/16/2022]

Rasanen OJ, Saarinen JP. Sequence Prediction With Sparse Distributed Hyperdimensional Coding Applied to the Analysis of Mobile Phone Use Patterns. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2016;27:1878-1889. [PMID: 26285224 DOI: 10.1109/tnnls.2015.2462721] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]

Uncovering Driver DNA Methylation Events in Nonsmoking Early Stage Lung Adenocarcinoma. BIOMED RESEARCH INTERNATIONAL 2016;2016:2090286. [PMID: 27610367 PMCID: PMC5005773 DOI: 10.1155/2016/2090286] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/28/2016] [Revised: 06/28/2016] [Accepted: 07/05/2016] [Indexed: 01/04/2023]

Chen Y, Cao D, Gao J, Yuan Z. Discovering Pair-wise Synergies in Microarray Data. Sci Rep 2016;6:30672. [PMID: 27470995 PMCID: PMC4965793 DOI: 10.1038/srep30672] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2016] [Accepted: 07/07/2016] [Indexed: 01/01/2023] Open

Hu Y, Zhao H. CCor: A whole genome network-based similarity measure between two genes. Biometrics 2016;72:1216-1225. [PMID: 26953524 DOI: 10.1111/biom.12508] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2015] [Revised: 01/01/2016] [Accepted: 02/01/2016] [Indexed: 12/29/2022]

Discovering gene re-ranking efficiency and conserved gene-gene relationships derived from gene co-expression network analysis on breast cancer data. Sci Rep 2016;6:20518. [PMID: 26892392 PMCID: PMC4759568 DOI: 10.1038/srep20518] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2015] [Accepted: 01/05/2016] [Indexed: 12/18/2022] Open