Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhang Y, Phillips CA, Rogers GL, Baker EJ, Chesler EJ, Langston MA. On finding bicliques in bipartite graphs: a novel algorithm and its application to the integration of diverse biological data types. BMC Bioinformatics 2014;15:110. [PMID: 24731198 PMCID: PMC4038116 DOI: 10.1186/1471-2105-15-110] [Citation(s) in RCA: 81] [Impact Index Per Article: 8.1] [Received: 07/27/2013] [Accepted: 03/29/2014] [Indexed: 11/10/2022]

Cited by Other Article(s)

Yang J, Peng Y, Ouyang D, Zhang W, Lin X, Zhao X. (p,q)-biclique counting and enumeration for large sparse bipartite graphs. THE VLDB JOURNAL : VERY LARGE DATA BASES : A PUBLICATION OF THE VLDB ENDOWMENT 2023;32:1-25. [PMID: 37362202 PMCID: PMC10008723 DOI: 10.1007/s00778-023-00786-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Revised: 11/02/2022] [Accepted: 12/15/2022] [Indexed: 06/28/2023]

Abstract

In this paper, we study the problem of (p, q)-biclique counting and enumeration for large sparse bipartite graphs. Given a bipartite graph G = (U, V, E) and two integer parameters p and q, we aim to efficiently count and enumerate all (p, q)-bicliques in G, where a (p, q)-biclique B(L, R) is a complete subgraph of G with L ⊆ U, R ⊆ V, |L| = p, and |R| = q. The problem of (p, q)-biclique counting and enumeration has many applications, such as graph neural network information aggregation, densest subgraph detection, and cohesive subgroup analysis. Despite the wide range of applications, to the best of our knowledge, there is no efficient and scalable solution to this problem in the literature. This problem is computationally challenging, due to the worst-case exponential number of (p, q)-bicliques. In this paper, we propose a competitive branch-and-bound baseline method, namely BCList, which explores the search space in a depth-first manner, together with a variety of pruning techniques. Although BCList offers a useful computation framework for our problem, its worst-case time complexity is exponential in p + q. To alleviate this, we propose an advanced approach, called BCList++. In particular, BCList++ applies a layer-based exploring strategy to enumerate (p, q)-bicliques by anchoring the search on either U or V only, which has a worst-case time complexity exponential in either p or q only. Consequently, a vital task is to choose a layer with the least computation cost. To this end, we develop a cost model, which is built upon an unbiased estimator for the density of the 2-hop graph induced by U or V. To improve computation efficiency, BCList++ exploits pre-allocated arrays and vertex labeling techniques such that the frequent subgraph creating operations can be substituted by array element switching operations.
We conduct extensive experiments on 16 real-life datasets, and the experimental results demonstrate that BCList++ significantly outperforms the baseline methods by up to 3 orders of magnitude. We show via a case study that (p, q)-bicliques optimize the efficiency of graph neural networks. In this paper, we extend our techniques to count and enumerate (p, q)-bicliques on uncertain bipartite graphs. An efficient method IUBCList is developed on top of BCList++, together with a couple of pruning techniques, including common neighbor refinement and search branch early termination, to discard unpromising uncertain (p, q)-bicliques early. The experimental results demonstrate that IUBCList significantly outperforms the baseline method by up to 2 orders of magnitude.
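
To make the (p, q)-biclique definition above concrete, the following is a minimal brute-force sketch in Python. It is not BCList or BCList++ and uses none of their pruning, cost-model, or array techniques. It only illustrates the idea of anchoring on one layer: every p-subset L of U is enumerated, and the number of q-subsets of the common neighborhood of L in V is added to the count. The function name `count_pq_bicliques`, the adjacency-dictionary input format, and the toy graph are assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def count_pq_bicliques(adj, p, q):
    """Count (p, q)-bicliques by brute force, anchored on the U layer.

    adj maps each vertex of U to the set of its neighbors in V
    (a hypothetical input format chosen for this sketch).
    """
    if p < 1 or q < 1:
        raise ValueError("p and q must be positive integers")
    total = 0
    # Enumerate every candidate left side L with |L| = p.
    for left in combinations(sorted(adj), p):
        # Vertices of V adjacent to every vertex in L.
        common = set.intersection(*(adj[u] for u in left))
        # Any q-subset of the common neighborhood completes a biclique.
        total += comb(len(common), q)
    return total


# Toy bipartite graph: U = {u1, u2, u3}, V = {v1, v2, v3}.
adj = {
    "u1": {"v1", "v2", "v3"},
    "u2": {"v1", "v2"},
    "u3": {"v2", "v3"},
}

print(count_pq_bicliques(adj, 2, 2))  # 2: ({u1,u2},{v1,v2}) and ({u1,u3},{v2,v3})
```

The loop visits C(|U|, p) subsets, so the cost of this sketch grows exponentially in p but not in q, which mirrors the one-sided dependence the abstract attributes to layer-based exploration.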

Zhao J, Sun M, Chen F, Chiu P. Understanding Missing Links in Bipartite Networks With MissBiN. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022;28:2457-2469. [PMID: 33090955 DOI: 10.1109/tvcg.2020.3032984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]

Aldewereld ZT, Zhang LA, Urbano A, Parker RS, Swigon D, Banerjee I, Gómez H, Clermont G. Identification of Clinical Phenotypes in Septic Patients Presenting With Hypotension or Elevated Lactate. Front Med (Lausanne) 2022;9:794423. [PMID: 35665340 PMCID: PMC9160971 DOI: 10.3389/fmed.2022.794423] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/13/2021] [Accepted: 04/28/2022] [Indexed: 01/13/2023]

Thieme S, Walther D. Biclique extension as an effective approach to identify missing links in metabolic compound-protein interaction networks. BIOINFORMATICS ADVANCES 2022;2:vbac001. [PMID: 36699348 PMCID: PMC9710583 DOI: 10.1093/bioadv/vbac001] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/17/2021] [Revised: 11/26/2021] [Accepted: 01/10/2022] [Indexed: 01/28/2023]

Zhang J, Liu L, Xu T, Zhang W, Zhao C, Li S, Li J, Rao N, Le TD. Exploring cell-specific miRNA regulation with single-cell miRNA-mRNA co-sequencing data. BMC Bioinformatics 2021;22:578. [PMID: 34856921 PMCID: PMC8641245 DOI: 10.1186/s12859-021-04498-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2021] [Accepted: 11/19/2021] [Indexed: 11/13/2022]

Abstract

Background

Existing computational methods for studying miRNA regulation are mostly based on bulk miRNA and mRNA expression data. However, bulk data only allows the analysis of miRNA regulation regarding a group of cells, rather than the miRNA regulation unique to individual cells. Recent advances in single-cell miRNA-mRNA co-sequencing technology have opened a way for investigating miRNA regulation at the single-cell level. However, as single-cell miRNA-mRNA co-sequencing data is currently just emerging and only available at small scale, there is a strong need for novel methods to exploit existing single-cell data for the study of cell-specific miRNA regulation.

Results

In this work, we propose a new method, CSmiR (Cell-Specific miRNA regulation) to combine single-cell miRNA-mRNA co-sequencing data and putative miRNA-mRNA binding information to identify miRNA regulatory networks at the resolution of individual cells. We apply CSmiR to the miRNA-mRNA co-sequencing data in 19 K562 single-cells to identify cell-specific miRNA-mRNA regulatory networks for understanding miRNA regulation in each K562 single-cell. By analyzing the obtained cell-specific miRNA-mRNA regulatory networks, we observe that the miRNA regulation in each K562 single-cell is unique. Moreover, we conduct detailed analysis on the cell-specific miRNA regulation associated with the miR-17/92 family as a case study. The comparison results indicate that CSmiR is effective in predicting cell-specific miRNA targets. Finally, through exploring cell–cell similarity matrix characterized by cell-specific miRNA regulation, CSmiR provides a novel strategy for clustering single-cells and helps to understand cell–cell crosstalk.

Conclusions

To the best of our knowledge, CSmiR is the first method to explore miRNA regulation at a single-cell resolution level, and we believe that it can be a useful method to enhance the understanding of cell-specific miRNA regulation.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12859-021-04498-6.

Puelz D, Basse G, Feller A, Toulis P. A graph‐theoretic approach to randomization tests of causal effects under general interference. J R Stat Soc Series B Stat Methodol 2021. [DOI: 10.1111/rssb.12478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]

Zhao X, Ji J, Wang S, Wang R, Yu Q, Li D. The regulatory pattern of target gene expression by aberrant enhancer methylation in glioblastoma. BMC Bioinformatics 2021;22:420. [PMID: 34482818 PMCID: PMC8420065 DOI: 10.1186/s12859-021-04345-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/14/2021] [Accepted: 08/23/2021] [Indexed: 12/21/2022]

Jha K, Xun G, Zhang A. Continual Representation Learning For Evolving Biomedical Bipartite Networks. Bioinformatics 2021;37:2190-2197. [PMID: 33532833 DOI: 10.1093/bioinformatics/btab067] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/16/2020] [Revised: 12/14/2020] [Accepted: 01/27/2021] [Indexed: 11/12/2022]

Abstract

MOTIVATION

Many real-world biomedical interactions such as 'gene-disease', 'disease-symptom', and 'drug-target' are modeled as a bipartite network structure. Learning meaningful representations for such networks is a fundamental problem in the research area of Network Representation Learning (NRL). NRL approaches aim to translate the network structure into low-dimensional vector representations that are useful to a variety of biomedical applications. Despite significant advances, the existing approaches still have certain limitations. First, a majority of these approaches do not model the unique topological properties of bipartite networks. Consequently, their straightforward application to the bipartite graphs yields unsatisfactory results. Second, the existing approaches typically learn representations from static networks. This is limiting for the biomedical bipartite networks that evolve at a rapid pace, and thus necessitate the development of approaches that can update the representations in an online fashion.

RESULTS

In this research, we propose a novel representation learning approach that accurately preserves the intricate bipartite structure, and efficiently updates the node representations. Specifically, we design a customized autoencoder that captures the proximity relationship between nodes participating in the bipartite bicliques (2 × 2 sub-graph), while preserving both the global and local structures. Moreover, the proposed structure-preserving technique is carefully interleaved with the central tenets of continual machine learning to design an incremental learning strategy that updates the node representations in an online manner. Taken together, the proposed approach produces meaningful representations with high fidelity and computational efficiency. Extensive experiments conducted on several biomedical bipartite networks validate the effectiveness and rationality of the proposed approach.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Xiong C, Sun S, Jiang W, Ma L, Zhang J. ASDmiR: A Stepwise Method to Uncover miRNA Regulation Related to Autism Spectrum Disorder. Front Genet 2020;11:562971. [PMID: 33173536 PMCID: PMC7591752 DOI: 10.3389/fgene.2020.562971] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/17/2020] [Accepted: 08/31/2020] [Indexed: 12/14/2022]

Bubier JA, Philip VM, Dickson PE, Mittleman G, Chesler EJ. Discovery of a Role for Rab3b in Habituation and Cocaine Induced Locomotor Activation in Mice Using Heterogeneous Functional Genomic Analysis. Front Neurosci 2020;14:721. [PMID: 32742255 PMCID: PMC7364128 DOI: 10.3389/fnins.2020.00721] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/21/2020] [Accepted: 06/16/2020] [Indexed: 12/21/2022]

Abstract

Substance use disorders are prevalent and present a tremendous societal cost but the mechanisms underlying addiction behavior are poorly understood and few biological treatments exist. One strategy to identify novel molecular mechanisms of addiction is through functional genomic experimentation. However, results from individual experiments are often noisy. To address this problem, the convergent analysis of multiple genomic experiments can discern signal from these studies. In the present study, we examine genetic loci that modulate the locomotor response to cocaine identified in the recombinant inbred (BXD RI) genetic reference population. We then applied the GeneWeaver software system for heterogeneous functional genomic analysis to integrate and aggregate multiple studies of addiction genomics, resulting in the identification of Rab3b as a functional correlate of the locomotor response to cocaine in rodents. This gene encodes a member of the RAB family of Ras-like GTPases known to be involved in trafficking of secretory and endocytic vesicles in eukaryotic cells. The convergent evidence for a role of Rab3b includes co-occurrence in previously published genetic mapping studies of cocaine related behaviors; methamphetamine response and cocaine- and amphetamine-regulated transcript prepropeptide (Cartpt) transcript abundance; evidence related to other addictive substances; density of polymorphisms; and its expression pattern in reward pathways. To evaluate this finding, we examined the effect of RAB3 complex perturbation in cocaine response. B6;129-Rab3btm1Sud Rab3ctm1sud Rab3dtm1sud triple null mice (Rab3bcd -/-) exhibited significant deficits in habituation, and increased acute and repeated cocaine responses. This previously unidentified mechanism of the behavioral predisposition and response to cocaine is an example of many that can be identified and validated using aggregate genomic studies.

Guo Q, Wang J, Gao Y, Li X, Hao Y, Ning S, Wang P. Dynamic TF-lncRNA Regulatory Networks Revealed Prognostic Signatures in the Development of Ovarian Cancer. Front Bioeng Biotechnol 2020;8:460. [PMID: 32478062 PMCID: PMC7237576 DOI: 10.3389/fbioe.2020.00460] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 02/14/2020] [Accepted: 04/21/2020] [Indexed: 12/15/2022]

Lu Y, Phillips CA, Langston MA. Biclique: an R package for maximal biclique enumeration in bipartite graphs. BMC Res Notes 2020;13:88. [PMID: 32085812 PMCID: PMC7035696 DOI: 10.1186/s13104-020-04955-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/28/2019] [Accepted: 02/12/2020] [Indexed: 12/02/2022]

Barabási DL, Barabási AL. A Genetic Model of the Connectome. Neuron 2020;105:435-445.e5. [PMID: 31806491 PMCID: PMC7007360 DOI: 10.1016/j.neuron.2019.10.031] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 07/16/2019] [Revised: 10/07/2019] [Accepted: 10/24/2019] [Indexed: 11/18/2022]

Zhang J, Pham VVH, Liu L, Xu T, Truong B, Li J, Rao N, Le TD. Identifying miRNA synergism using multiple-intervention causal inference. BMC Bioinformatics 2019;20:613. [PMID: 31881825 PMCID: PMC6933624 DOI: 10.1186/s12859-019-3215-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 11/08/2019] [Accepted: 11/12/2019] [Indexed: 01/15/2023]

Abstract

BACKGROUND

Studying multiple microRNAs (miRNAs) synergism in gene regulation could help to understand the regulatory mechanisms of complicated human diseases caused by miRNAs. Several existing methods have been presented to infer miRNA synergism. Most of the current methods assume that miRNAs with shared targets at the sequence level are working synergistically. However, it is unclear if miRNAs with shared targets are working in concert to regulate the targets or they individually regulate the targets at different time points or different biological processes. A standard method to test the synergistic activities is to knock-down multiple miRNAs at the same time and measure the changes in the target genes. However, this approach may not be practical as we would have too many sets of miRNAs to test.

RESULTS

In this paper, we present a novel framework called miRsyn for inferring miRNA synergism by using a causal inference method that mimics the multiple-intervention experiments, e.g. knocking-down multiple miRNAs, with observational data. Our results show that several miRNA-miRNA pairs that have shared targets at the sequence level are not working synergistically at the expression level. Moreover, the identified miRNA synergistic network is small-world and biologically meaningful, and a number of miRNA synergistic modules are significantly enriched in breast cancer. Our further analyses also reveal that most synergistic miRNA-miRNA pairs show the same expression patterns. The comparison results indicate that the proposed multiple-intervention causal inference method performs better than the single-intervention causal inference method in identifying miRNA synergistic network.

CONCLUSIONS

Taken together, the results imply that miRsyn is a promising framework for identifying miRNA synergism, and it could enhance the understanding of miRNA synergism in breast cancer.

Dey L, Mukhopadhyay A. A Graph-Based Approach for Finding the Dengue Infection Pathways in Humans Using Protein-Protein Interactions. J Comput Biol 2019;27:755-768. [PMID: 31486690 DOI: 10.1089/cmb.2019.0171] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 01/22/2023]

Li Q, Yu Q, Ji J, Wang P, Li D. Comparison and analysis of lncRNA-mediated ceRNA regulation in different molecular subtypes of glioblastoma. Mol Omics 2019;15:406-419. [DOI: 10.1039/c9mo00126c] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022]

Phillips CA, Wang K, Baker EJ, Bubier JA, Chesler EJ, Langston MA. On Finding and Enumerating Maximal and Maximum k-Partite Cliques in k-Partite Graphs. ALGORITHMS 2019;12:23. [PMID: 31448059 PMCID: PMC6707360 DOI: 10.3390/a12010023] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/10/2023]

Sreeja A, Vinayan KP. Multidimensional knowledge-based framework is an essential step in the categorization of gene sets in complex disorders. J Bioinform Comput Biol 2017;15:1750022. [DOI: 10.1142/s0219720017500226] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]

Kang M, Park J, Kim DC, Biswas AK, Liu C, Gao J. Multi-Block Bipartite Graph for Integrative Genomic Analysis. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017;14:1350-1358. [PMID: 27429442 DOI: 10.1109/tcbb.2016.2591521] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/06/2023]

Walsh CJ, Hu P, Batt J, Dos Santos CC. Discovering MicroRNA-Regulatory Modules in Multi-Dimensional Cancer Genomic Data: A Survey of Computational Methods. Cancer Inform 2016;15:25-42. [PMID: 27721651 PMCID: PMC5051584 DOI: 10.4137/cin.s39369] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/18/2016] [Revised: 08/14/2016] [Accepted: 08/16/2016] [Indexed: 12/20/2022]

Platig J, Castaldi PJ, DeMeo D, Quackenbush J. Bipartite Community Structure of eQTLs. PLoS Comput Biol 2016;12:e1005033. [PMID: 27618581 PMCID: PMC5019382 DOI: 10.1371/journal.pcbi.1005033] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 01/08/2016] [Accepted: 06/23/2016] [Indexed: 11/18/2022]

Abstract

Genome Wide Association Studies (GWAS) and expression quantitative trait locus (eQTL) analyses have identified genetic associations with a wide range of human phenotypes. However, many of these variants have weak effects and understanding their combined effect remains a challenge. One hypothesis is that multiple SNPs interact in complex networks to influence functional processes that ultimately lead to complex phenotypes, including disease states. Here we present CONDOR, a method that represents both cis- and trans-acting SNPs and the genes with which they are associated as a bipartite graph and then uses the modular structure of that graph to place SNPs into a functional context. In applying CONDOR to eQTLs in chronic obstructive pulmonary disease (COPD), we found the global network “hub” SNPs were devoid of disease associations through GWAS. However, the network was organized into 52 communities of SNPs and genes, many of which were enriched for genes in specific functional classes. We identified local hubs within each community (“core SNPs”) and these were enriched for GWAS SNPs for COPD and many other diseases. These results speak to our intuition: rather than single SNPs influencing single genes, we see groups of SNPs associated with the expression of families of functionally related genes and that disease SNPs are associated with the perturbation of those functions. These methods are not limited in their application to COPD and can be used in the analysis of a wide variety of disease processes and other phenotypic traits.

Large-scale studies have identified thousands of genetic variants associated with different phenotypes without explaining their function. Expression quantitative trait locus analysis associates the compendium of genetic variants with expression levels of individual genes, providing the opportunity to link those variants to functions. But the complexity of those associations has caused most analyses to focus solely on genetic variants immediately adjacent to the genes they may influence. We describe a method that embraces the complexity, representing all variant-gene associations as a bipartite graph. The graph contains highly modular, functional communities in which disease-associated variants emerge as those likely to perturb the structure of the network and the function of the genes in these communities.

Alonso R, Monroy R, Trejo LA. Mining IP to Domain Name Interactions to Detect DNS Flood Attacks on Recursive DNS Servers. SENSORS 2016;16:s16081311. [PMID: 27548169 PMCID: PMC5017476 DOI: 10.3390/s16081311] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 06/01/2016] [Revised: 08/09/2016] [Accepted: 08/13/2016] [Indexed: 11/16/2022]

Saracco F, Di Clemente R, Gabrielli A, Squartini T. Detecting early signs of the 2007-2008 crisis in the world trade. Sci Rep 2016;6:30286. [PMID: 27461469 PMCID: PMC4962096 DOI: 10.1038/srep30286] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 03/23/2016] [Accepted: 06/24/2016] [Indexed: 11/09/2022]

Bubier JA, Wilcox TD, Jay JJ, Langston MA, Baker EJ, Chesler EJ. Cross-Species Integrative Functional Genomics in GeneWeaver Reveals a Role for Pafah1b1 in Altered Response to Alcohol. Front Behav Neurosci 2016;10:1. [PMID: 26834590 PMCID: PMC4720795 DOI: 10.3389/fnbeh.2016.00001] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 11/13/2015] [Accepted: 01/04/2016] [Indexed: 12/12/2022]

Kumari A, Kanchan S, Sinha RP, Kesheri M. Applications of Bio-molecular Databases in Bioinformatics. MEDICAL IMAGING IN CLINICAL APPLICATIONS 2016. [DOI: 10.1007/978-3-319-33793-7_15] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/24/2022]

Baker E, Bubier JA, Reynolds T, Langston MA, Chesler EJ. GeneWeaver: data driven alignment of cross-species genomics in biology and disease. Nucleic Acids Res 2015;44:D555-9. [PMID: 26656951 PMCID: PMC4702926 DOI: 10.1093/nar/gkv1329] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 09/15/2015] [Accepted: 11/13/2015] [Indexed: 11/17/2022]

Bubier JA, Phillips CA, Langston MA, Baker EJ, Chesler EJ. GeneWeaver: finding consilience in heterogeneous cross-species functional genomics data. Mamm Genome 2015;26:556-66. [PMID: 26092690 PMCID: PMC4602068 DOI: 10.1007/s00335-015-9575-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 04/21/2015] [Accepted: 06/03/2015] [Indexed: 01/20/2023]

Abstract

A persistent challenge lies in the interpretation of consensus and discord from functional genomics experimentation. Harmonizing and analyzing this data will enable investigators to discover relations of many genes to many diseases, and from many phenotypes and experimental paradigms to many diseases through their genomic substrates. The GeneWeaver.org system provides a platform for cross-species integration and interrogation of heterogeneous curated and experimentally derived functional genomics data. GeneWeaver enables researchers to store, share, analyze, and compare results of their own genome-wide functional genomics experiments in an environment containing rich companion data obtained from major curated repositories, including the Mouse Genome Database and other model organism databases, along with derived data from highly specialized resources, publications, and user submissions. The data, largely consisting of gene sets and putative biological networks, are mapped onto one another through gene identifiers and homology across species. A versatile suite of interactive tools enables investigators to perform a variety of set analysis operations to find consilience among these often noisy experimental results. Fast algorithms enable real-time analysis of large queries. Specific applications include prioritizing candidate genes for quantitative trait loci, identifying biologically valid mouse models and phenotypic assays for human disease, finding the common biological substrates of related diseases, classifying experiments and the biological concepts they represent from empirical data, and applying patterns of genomic evidence to implicate novel genes in disease. These results illustrate an alternative to strict emphasis on replicability, whereby researchers classify experimental results to identify the conditions that lead to their similarity.

Chen HC, Zou W, Lu TP, Chen JJ. A composite model for subgroup identification and prediction via bicluster analysis. PLoS One 2014;9:e111318. [PMID: 25347824 PMCID: PMC4210136 DOI: 10.1371/journal.pone.0111318] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/17/2014] [Accepted: 09/30/2014] [Indexed: 11/18/2022]

Abstract

Background

A major challenge in the analysis of large and complex biomedical data is to develop an approach for 1) identifying distinct subgroups in the sampled populations, 2) characterizing the relationships among subgroups, and 3) developing a prediction model to classify subgroup memberships of new samples by finding a set of predictors. Each subgroup can represent different pathogen serotypes of microorganisms, different tumor subtypes in cancer patients, or different genetic makeups of patients related to treatment response.

Methods

This paper proposes a composite model for subgroup identification and prediction using biclusters. A biclustering technique is first used to identify a set of biclusters from the sampled data. For each bicluster, a subgroup-specific binary classifier is built to determine if a particular sample is either inside or outside the bicluster. A composite model, which consists of all binary classifiers, is constructed to classify samples into several disjoint subgroups. The proposed composite model neither depends on any specific biclustering algorithm or patterns of biclusters, nor on any classification algorithms.

Results

The composite model was shown to have an overall accuracy of 97.4% for a synthetic dataset consisting of four subgroups. The model was applied to two datasets where the sample’s subgroup memberships were known. The procedure showed 83.7% accuracy in discriminating lung cancer adenocarcinoma and squamous carcinoma subtypes, and was able to identify 5 serotypes and several subtypes with about 94% accuracy in a pathogen dataset.

Conclusion

The composite model presents a novel approach to developing a biclustering-based classification model from unlabeled sampled data. The proposed approach combines unsupervised biclustering and supervised classification techniques to classify samples into disjoint subgroups based on their associated attributes, such as genotypic factors, phenotypic outcomes, efficacy/safety measures, or responses to treatments. The procedure is useful for identification of unknown species or new biomarkers for targeted therapy.

Baker E, Culpepper C, Philips C, Bubier J, Langston M, Chesler E. Identifying common components across biological network graphs using a bipartite data model. BMC Proc 2014;8:S4. [PMID: 25374613 PMCID: PMC4202189 DOI: 10.1186/1753-6561-8-s6-s4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/16/2022]

Identification of a QTL in Mus musculus for alcohol preference, withdrawal, and Ap3m2 expression using integrative functional genomics and precision genetics. Genetics 2014;197:1377-93. [PMID: 24923803 DOI: 10.1534/genetics.114.166165] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 12/27/2022]

Baker EJ, Jay JJ, Bubier JA, Langston MA, Chesler EJ. GeneWeaver: a web-based system for integrative functional genomics. Nucleic Acids Res 2011;40:D1067-76. [PMID: 22080549 PMCID: PMC3245070 DOI: 10.1093/nar/gkr968] [Citation(s) in RCA: 97] [Impact Index Per Article: 7.5] [Indexed: 11/19/2022]

Baker EJ, Jay JJ, Philip VM, Zhang Y, Li Z, Kirova R, Langston MA, Chesler EJ. Ontological Discovery Environment: a system for integrating gene-phenotype associations. Genomics 2009;94:377-87. [PMID: 19733230 DOI: 10.1016/j.ygeno.2009.08.016] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 06/03/2009] [Revised: 08/19/2009] [Accepted: 08/27/2009] [Indexed: 10/20/2022]

Abstract

The wealth of genomic technologies has enabled biologists to rapidly ascribe phenotypic characters to biological substrates. Central to effective biological investigation is the operational definition of the process under investigation. We propose an elucidation of categories of biological characters, including disease relevant traits, based on natural endogenous processes and experimentally observed biological networks, pathways and systems rather than on externally manifested constructs and current semantics such as disease names and processes. The Ontological Discovery Environment (ODE) is an Internet accessible resource for the storage, sharing, retrieval and analysis of phenotype-centered genomic data sets across species and experimental model systems. Any type of data set representing gene-phenotype relationships, such as quantitative trait loci (QTL) positional candidates, literature reviews, microarray experiments, ontological or even meta-data, may serve as inputs. To demonstrate a use case leveraging the homology capabilities of ODE and its ability to synthesize diverse data sets, we conducted an analysis of genomic studies related to alcoholism. The core of ODE's gene set similarity, distance and hierarchical analysis is the creation of a bipartite network of gene-phenotype relations, a unique discrete graph approach to analysis that enables set-set matching of non-referential data. Gene sets are annotated with several levels of metadata, including community ontologies, while gene set translations compare models across species. Computationally derived gene sets are integrated into hierarchical trees based on gene-derived phenotype interdependencies. Automated set identifications are augmented by statistical tools which enable users to interpret the confidence of modeled results.
This approach allows data integration and hypothesis discovery across multiple experimental contexts, regardless of the face similarity and semantic annotation of the experimental systems or species domain.
