Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles: 54 (from Reference Citation Analysis)

Article PDFs: 27

Cited by > 0: 46

Searched Name: T M Murali

Citation Analysis

Gandhi N, Wills L, Akers K, Su Y, Niccum P, Murali TM, Rajagopalan P. Comparative transcriptomic and phenotypic analysis of induced pluripotent stem cell hepatocyte-like cells and primary human hepatocytes. Cell Tissue Res 2024;396:119-139. [PMID: 38369646 DOI: 10.1007/s00441-024-03868-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2023] [Accepted: 01/22/2024] [Indexed: 02/20/2024]

Antony B, Blau H, Casiraghi E, Loomba JJ, Callahan TJ, Laraway BJ, Wilkins KJ, Antonescu CC, Valentini G, Williams AE, Robinson PN, Reese JT, Murali TM. Predictive models of long COVID. EBioMedicine 2023;96:104777. [PMID: 37672869 PMCID: PMC10494314 DOI: 10.1016/j.ebiom.2023.104777] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2023] [Revised: 07/24/2023] [Accepted: 08/15/2023] [Indexed: 09/08/2023] Open

Abstract

BACKGROUND

The cause and symptoms of long COVID are poorly understood. It is challenging to predict whether a given COVID-19 patient will develop long COVID in the future.

METHODS

We used electronic health record (EHR) data from the National COVID Cohort Collaborative to predict the incidence of long COVID. We trained two machine learning (ML) models: logistic regression (LR) and random forest (RF). Features used to train predictors included symptoms and drugs ordered during acute infection, measures of COVID-19 treatment, pre-COVID comorbidities, and demographic information. We assigned the 'long COVID' label to patients diagnosed with the U09.9 ICD10-CM code. The cohorts included patients with (a) EHRs reported from data partners using the U09.9 ICD10-CM code and (b) at least one EHR in each feature category. We analysed three cohorts: all patients (n = 2,190,579; diagnosed with long COVID = 17,036), inpatients (149,319; 3,295), and outpatients (2,041,260; 13,741).

FINDINGS

LR and RF models yielded median AUROCs of 0.76 and 0.75, respectively. An ablation study revealed that drugs had the highest influence on the prediction task. The SHAP method identified age, gender, cough, fatigue, albuterol, obesity, diabetes, and chronic lung disease as explanatory features. Models trained on data from one N3C partner and tested on data from the other partners had an average AUROC of 0.75.
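For readers unfamiliar with the AUROC metric reported above, the following is a minimal sketch of how it can be computed from predicted scores: it is the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The function name and the toy data are illustrative assumptions and do not come from the study.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = long COVID label, 0 = no label); not from the paper.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of about 0.75 should be read.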

INTERPRETATION

ML-based classification using EHR information from the acute infection period is effective in predicting long COVID. SHAP methods identified important features for prediction. Cross-site analysis demonstrated the generalizability of the proposed methodology.

FUNDING

NCATS U24 TR002306, NCATS UL1 TR003015, Axle Informatics Subcontract: NCATS-P00438-B, NIH/NIDDK/OD, PSR2015-1720GVALE_01, G43C22001320007, and Director, Office of Science, Office of Basic Energy Sciences of the U.S. Department of Energy Contract No. DE-AC02-05CH11231.

Affiliation(s)

Blessy Antony Department of Computer Science, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA, 24061, USA
Hannah Blau The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
Elena Casiraghi AnacletoLab, Computer Science Department, Dipartimento di Informatica, Università degli Studi di Milano, Milan, 20133, Italy; Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; ELLIS - European Laboratory for Learning and Intelligent Systems, Milan Unit, Milan, 20133, Italy
Johanna J Loomba Integrated Translational Health Research Institute of Virginia, University of Virginia, Charlottesville, VA, 22904, USA
Tiffany J Callahan Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
Bryan J Laraway Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
Kenneth J Wilkins Biostatistics Program, Office of the Director, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, 20814, USA
Corneliu C Antonescu Banner Health, University of Arizona, Phoenix, AZ, 85006, USA
Giorgio Valentini AnacletoLab, Computer Science Department, Dipartimento di Informatica, Università degli Studi di Milano, Milan, 20133, Italy; ELLIS - European Laboratory for Learning and Intelligent Systems, Milan Unit, Milan, 20133, Italy
Andrew E Williams Institute for Clinical Research and Health Policy Studies, Tufts University School of Medicine, Boston, MA, 02111, USA
Peter N Robinson The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA; Institute for Systems Genomics, University of Connecticut, Farmington, CT, 06269, USA
Justin T Reese Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
T M Murali Department of Computer Science, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA, 24061, USA.

Law J, Orbach SM, Weston BR, Steele PA, Rajagopalan P, Murali TM. Computational Construction of Toxicant Signaling Networks. Chem Res Toxicol 2023;36:1267-1277. [PMID: 37471124 PMCID: PMC10445288 DOI: 10.1021/acs.chemrestox.2c00422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2022] [Indexed: 07/21/2023]

Reese JT, Blau H, Casiraghi E, Bergquist T, Loomba JJ, Callahan TJ, Laraway B, Antonescu C, Coleman B, Gargano M, Wilkins KJ, Cappelletti L, Fontana T, Ammar N, Antony B, Murali TM, Caufield JH, Karlebach G, McMurry JA, Williams A, Moffitt R, Banerjee J, Solomonides AE, Davis H, Kostka K, Valentini G, Sahner D, Chute CG, Madlock-Brown C, Haendel MA, Robinson PN. Generalisable long COVID subtypes: findings from the NIH N3C and RECOVER programmes. EBioMedicine 2023;87:104413. [PMID: 36563487 PMCID: PMC9769411 DOI: 10.1016/j.ebiom.2022.104413] [Citation(s) in RCA: 32] [Impact Index Per Article: 32.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2022] [Revised: 11/23/2022] [Accepted: 11/29/2022] [Indexed: 12/24/2022] Open

Abstract

BACKGROUND

Stratification of patients with post-acute sequelae of SARS-CoV-2 infection (PASC, or long COVID) would allow precision clinical management strategies. However, long COVID is incompletely understood and characterised by a wide range of manifestations that are difficult to analyse computationally. Additionally, the generalisability of machine learning classification of COVID-19 clinical outcomes has rarely been tested.

METHODS

We present a method for computationally modelling PASC phenotype data based on electronic healthcare records (EHRs) and for assessing pairwise phenotypic similarity between patients using semantic similarity. Our approach defines a nonlinear similarity function that maps from a feature space of phenotypic abnormalities to a matrix of pairwise patient similarity that can be clustered using unsupervised machine learning.

FINDINGS

We found six clusters of PASC patients, each with distinct profiles of phenotypic abnormalities, including clusters with distinct pulmonary, neuropsychiatric, and cardiovascular abnormalities, and a cluster associated with broad, severe manifestations and increased mortality. There was significant association of cluster membership with a range of pre-existing conditions and measures of severity during acute COVID-19. We assigned new patients from other healthcare centres to clusters by maximum semantic similarity to the original patients, and showed that the clusters were generalisable across different hospital systems. The increased mortality rate originally identified in one cluster was consistently observed in patients assigned to that cluster in other hospital systems.

INTERPRETATION

Semantic phenotypic clustering provides a foundation for assigning patients to stratified subgroups for natural history or therapy studies on PASC.

FUNDING

NIH (TR002306/OT2HL161847-01/OD011883/HG010860), U.S.D.O.E. (DE-AC02-05CH11231), Donald A. Roux Family Fund at Jackson Laboratory, Marsico Family at CU Anschutz.

Affiliation(s)

Justin T Reese Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Hannah Blau The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
Elena Casiraghi Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA; AnacletoLab, Dipartimento di Informatica, Università Degli Studi di Milano, Milan, Italy
Timothy Bergquist Sage Bionetworks, Seattle, WA, USA
Johanna J Loomba The Integrated Translational Health Research Institute of Virginia (iTHRIV), University of Virginia, Charlottesville, VA, USA
Tiffany J Callahan Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
Bryan Laraway Departments of Biomedical Informatics and Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
Corneliu Antonescu University of Arizona - Banner Health, Phoenix, AZ, USA
Ben Coleman The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
Michael Gargano The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
Kenneth J Wilkins Biostatistics Program, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
Luca Cappelletti AnacletoLab, Dipartimento di Informatica, Università Degli Studi di Milano, Milan, Italy
Tommaso Fontana AnacletoLab, Dipartimento di Informatica, Università Degli Studi di Milano, Milan, Italy
Nariman Ammar Health Science Center, University of Tennessee, Memphis, TN, USA
Blessy Antony Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
T M Murali Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
J Harry Caufield Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Guy Karlebach The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
Julie A McMurry Departments of Biomedical Informatics and Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
Andrew Williams Tufts Medical Center Clinical and Translational Science Institute, Tufts Medical Center, Boston, MA, USA; Tufts University School of Medicine, Institute for Clinical Research and Health Policy Studies, Boston, MA, USA; Northeastern University, OHDSI Center at the Roux Institute, Boston, MA, USA
Richard Moffitt Department of Biomedical Informatics and Stony Brook Cancer Center, Stony Brook University, Stony Brook, NY, USA
Jineta Banerjee Sage Bionetworks, Seattle, WA, USA
Anthony E Solomonides HealthSystem Research Institute, NorthShore University, Evanston, IL, USA
Hannah Davis Patient-Led Research Collaborative, NY, USA
Kristin Kostka Northeastern University, OHDSI Center at the Roux Institute, Boston, MA, USA
Giorgio Valentini AnacletoLab, Dipartimento di Informatica, Università Degli Studi di Milano, Milan, Italy
David Sahner Axle Informatics, Rockville, MD, USA
Christopher G Chute Schools of Medicine, Public Health and Nursing, Johns Hopkins University, Baltimore, MD, USA
Charisse Madlock-Brown Health Science Center, University of Tennessee, Memphis, TN, USA
Melissa A Haendel Departments of Biomedical Informatics and Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
Peter N Robinson The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA; Institute for Systems Genomics, University of Connecticut, Farmington, CT, USA.

Li YC, Wang L, Law JN, Murali TM, Pandey G. Integrating multimodal data through interpretable heterogeneous ensembles. Bioinform Adv 2022;2:vbac065. [PMID: 36158455 PMCID: PMC9495448 DOI: 10.1093/bioadv/vbac065] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/26/2022] [Revised: 09/01/2022] [Accepted: 09/10/2022] [Indexed: 01/27/2023]

Abstract

Motivation

Integrating multimodal data represents an effective approach to predicting biomedical characteristics, such as protein functions and disease outcomes. However, existing data integration approaches do not sufficiently address the heterogeneous semantics of multimodal data. In particular, early and intermediate approaches that rely on a uniform integrated representation reinforce the consensus among the modalities but may lose exclusive local information. The alternative late integration approach that can address this challenge has not been systematically studied for biomedical problems.

Results

We propose Ensemble Integration (EI) as a novel systematic implementation of the late integration approach. EI infers local predictive models from the individual data modalities using appropriate algorithms and uses heterogeneous ensemble algorithms to integrate these local models into a global predictive model. We also propose a novel interpretation method for EI models. We tested EI on the problems of predicting protein function from multimodal STRING data and mortality due to coronavirus disease 2019 (COVID-19) from multimodal data in electronic health records. We found that EI accomplished its goal of producing significantly more accurate predictions than each individual modality. It also performed better than several established early integration methods for each of these problems. The interpretation of a representative EI model for COVID-19 mortality prediction identified several disease-relevant features, such as laboratory test (blood urea nitrogen and calcium) and vital sign measurements (minimum oxygen saturation) and demographics (age). These results demonstrated the effectiveness of the EI framework for biomedical data integration and predictive modeling.
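The late integration idea described above can be sketched as follows: one local model is fitted per data modality, and only the local predictions are combined. This is a toy illustration under stated assumptions, not the EI implementation: the nearest-centroid local model, the simple averaging step (standing in for the heterogeneous ensemble algorithms the abstract mentions), and the modality names and values are all hypothetical.

```python
import math


def fit_centroids(rows, labels):
    """Local model for one modality: the mean feature vector of each class."""
    centroids = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def local_score(centroids, row):
    """Score in [0, 1]; higher means the row lies closer to the positive centroid."""
    d_pos = math.dist(row, centroids[1])
    d_neg = math.dist(row, centroids[0])
    if d_pos + d_neg == 0:
        return 0.5
    return d_neg / (d_pos + d_neg)


def late_integration_predict(models, sample):
    """Average the per-modality scores; each model sees only its own modality."""
    scores = [local_score(models[name], sample[name]) for name in models]
    return sum(scores) / len(scores)


# Hypothetical two-modality training data (illustrative values only).
train = {
    "labs": [[1.0, 2.0], [1.2, 1.8], [3.0, 4.0], [3.2, 4.1]],
    "vitals": [[0.1], [0.2], [0.9], [0.8]],
}
train_labels = [0, 0, 1, 1]
models = {name: fit_centroids(rows, train_labels) for name, rows in train.items()}

new_patient = {"labs": [3.1, 3.9], "vitals": [0.85]}
risk = late_integration_predict(models, new_patient)
```

Because each modality keeps its own model, information that is specific to one modality is not averaged away in a shared representation, which is the motivation the abstract gives for late integration.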

Availability and implementation

Code and data are available at https://github.com/GauravPandeyLab/ensemble_integration.

Supplementary information

Supplementary data are available at Bioinformatics Advances online.

Li YC, Wang L, Law JN, Murali TM, Pandey G. Integrating multimodal data through interpretable heterogeneous ensembles. bioRxiv 2022:2020.05.29.123497. [PMID: 35923321 PMCID: PMC9347276 DOI: 10.1101/2020.05.29.123497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Abstract

Motivation

Integrating multimodal data represents an effective approach to predicting biomedical characteristics, such as protein functions and disease outcomes. However, existing data integration approaches do not sufficiently address the heterogeneous semantics of multimodal data. In particular, early and intermediate approaches that rely on a uniform integrated representation reinforce the consensus among the modalities, but may lose exclusive local information. The alternative late integration approach that can address this challenge has not been systematically studied for biomedical problems.

Results

We propose Ensemble Integration (EI) as a novel systematic implementation of the late integration approach. EI infers local predictive models from the individual data modalities using appropriate algorithms, and uses effective heterogeneous ensemble algorithms to integrate these local models into a global predictive model. We also propose a novel interpretation method for EI models. We tested EI on the problems of predicting protein function from multimodal STRING data, and mortality due to COVID-19 from multimodal data in electronic health records. We found that EI accomplished its goal of producing significantly more accurate predictions than each individual modality. It also performed better than several established early integration methods for each of these problems. The interpretation of a representative EI model for COVID-19 mortality prediction identified several disease-relevant features, such as laboratory test (blood urea nitrogen (BUN) and calcium) and vital sign measurements (minimum oxygen saturation) and demographics (age). These results demonstrated the effectiveness of the EI framework for biomedical data integration and predictive modeling.

Availability

Code and data are available at https://github.com/GauravPandeyLab/ensemble_integration.

Contact

gaurav.pandey@mssm.edu.

Reese JT, Blau H, Bergquist T, Loomba JJ, Callahan T, Laraway B, Antonescu C, Casiraghi E, Coleman B, Gargano M, Wilkins KJ, Cappelletti L, Fontana T, Ammar N, Antony B, Murali TM, Karlebach G, McMurry JA, Williams A, Moffitt R, Banerjee J, Solomonides AE, Davis H, Kostka K, Valentini G, Sahner D, Chute CG, Madlock-Brown C, Haendel MA, Robinson PN. Generalizable Long COVID Subtypes: Findings from the NIH N3C and RECOVER Programs. medRxiv 2022:2022.05.24.22275398. [PMID: 35665012 PMCID: PMC9164456 DOI: 10.1101/2022.05.24.22275398] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/11/2023]

Law JN, Akers K, Tasnina N, Santina CMD, Deutsch S, Kshirsagar M, Klein-Seetharaman J, Crovella M, Rajagopalan P, Kasif S, Murali TM. Interpretable network propagation with application to expanding the repertoire of human proteins that interact with SARS-CoV-2. Gigascience 2021;10:giab082. [PMID: 34966926 PMCID: PMC8716363 DOI: 10.1093/gigascience/giab082] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2021] [Revised: 09/21/2021] [Accepted: 11/28/2021] [Indexed: 01/02/2023] Open

Jalihal AP, Kraikivski P, Murali TM, Tyson JJ. Modeling and analysis of the macronutrient signaling network in budding yeast. Mol Biol Cell 2021;32:ar20. [PMID: 34495680 PMCID: PMC8693975 DOI: 10.1091/mbc.e20-02-0117] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022] Open

Law JN, Kale SD, Murali TM. Accurate and efficient gene function prediction using a multi-bacterial network. Bioinformatics 2021;37:800-806. [PMID: 33063084 DOI: 10.1093/bioinformatics/btaa885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2020] [Revised: 09/23/2020] [Accepted: 09/30/2020] [Indexed: 11/12/2022] Open

Mandoiu I, Murali TM, Narasimhan G, Rajasekaran S, Skums P, Zelikovsky A. Special Issue: 9th International Computational Advances in Bio and Medical Sciences (ICCABS 2019). J Comput Biol 2021;28:115-116. [PMID: 33539275 DOI: 10.1089/cmb.2021.29034.im] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Kshirsagar M, Tasnina N, Ward MD, Law JN, Murali TM, Lavista Ferres JM, Bowman GR, Klein-Seetharaman J. Protein sequence models for prediction and comparative analysis of the SARS-CoV-2 -human interactome. Pac Symp Biocomput 2021;26:154-165. [PMID: 33691013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]

Wagner MJ, Pratapa A, Murali TM. Reconstructing signaling pathways using regular language constrained paths. Bioinformatics 2020;35:i624-i633. [PMID: 31510694 PMCID: PMC6612893 DOI: 10.1093/bioinformatics/btz360] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open

Gallegos JE, Adames NR, Rogers MF, Kraikivski P, Ibele A, Nurzynski-Loth K, Kudlow E, Murali TM, Tyson JJ, Peccoud J. Genetic interactions derived from high-throughput phenotyping of 6589 yeast cell cycle mutants. NPJ Syst Biol Appl 2020;6:11. [PMID: 32376972 PMCID: PMC7203125 DOI: 10.1038/s41540-020-0134-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2019] [Accepted: 04/06/2020] [Indexed: 11/09/2022] Open

Pratapa A, Jalihal AP, Law JN, Bharadwaj A, Murali TM. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat Methods 2020;17:147-154. [PMID: 31907445 PMCID: PMC7098173 DOI: 10.1038/s41592-019-0690-6] [Citation(s) in RCA: 285] [Impact Index Per Article: 71.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2019] [Accepted: 11/22/2019] [Indexed: 01/10/2023]

Franzese N, Groce A, Murali TM, Ritz A. Hypergraph-based connectivity measures for signaling pathway topologies. PLoS Comput Biol 2019;15:e1007384. [PMID: 31652258 PMCID: PMC6834280 DOI: 10.1371/journal.pcbi.1007384] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2019] [Revised: 11/06/2019] [Accepted: 09/09/2019] [Indexed: 12/12/2022] Open

Abstract

Characterizing cellular responses to different extrinsic signals is an active area of research, and curated pathway databases describe these complex signaling reactions. Here, we revisit a fundamental question in signaling pathway analysis: are two molecules “connected” in a network? This question is the first step towards understanding the potential influence of molecules in a pathway, and the answer depends on the choice of modeling framework. We examined the connectivity of Reactome signaling pathways using four different pathway representations. We find that Reactome is very well connected as a graph, moderately well connected as a compound graph or bipartite graph, and poorly connected as a hypergraph (which captures many-to-many relationships in reaction networks). We present a novel relaxation of hypergraph connectivity that iteratively increases connectivity from a node while preserving the hypergraph topology. This measure, B-relaxation distance, provides a parameterized transition between hypergraph connectivity and graph connectivity. B-relaxation distance is sensitive to the presence of small molecules that participate in many functionally unrelated reactions in the network. We also define a score that quantifies one pathway’s downstream influence on another, which can be calculated as B-relaxation distance gradually relaxes the connectivity constraint in hypergraphs. Computing this score across all pairs of 34 Reactome pathways reveals pairs of pathways with statistically significant influence. We present two such case studies, and we describe the specific reactions that contribute to the large influence score. Finally, we investigate the ability of connectivity measures to capture functional relationships among proteins, and use the evidence channels in the STRING database as a benchmark dataset. STRING interactions whose proteins are B-connected in Reactome have statistically significantly higher scores than interactions connected in the bipartite graph representation. Our method lays the groundwork for other generalizations of graph-theoretic concepts to hypergraphs in order to facilitate signaling pathway analysis.
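The contrast between strict hypergraph connectivity and graph connectivity can be illustrated with a small sketch. It uses the standard notion of B-connectivity in directed hypergraphs, in which a hyperedge can be crossed only after every node in its tail has been reached, and compares it with a permissive graph-style rule in which any one tail node suffices. The function names and the three toy reactions are assumptions for illustration; the sketch does not implement the paper's B-relaxation distance or its exact graph representations.

```python
def b_connected(hyperedges, sources):
    """Nodes B-connected to `sources`: a hyperedge (tail, head) is traversed
    only once every node in its tail has been reached."""
    reached = set(sources)
    changed = True
    while changed:
        changed = False
        for tail, head in hyperedges:
            if set(tail) <= reached and not set(head) <= reached:
                reached |= set(head)
                changed = True
    return reached


def graph_connected(hyperedges, sources):
    """Graph-style reachability: reaching any single tail node is enough."""
    reached = set(sources)
    changed = True
    while changed:
        changed = False
        for tail, head in hyperedges:
            if set(tail) & reached and not set(head) <= reached:
                reached |= set(head)
                changed = True
    return reached


# Hypothetical reactions: A + B -> C, C -> D, D + E -> F
reactions = [({"A", "B"}, {"C"}), ({"C"}, {"D"}), ({"D", "E"}, {"F"})]

from_a_strict = b_connected(reactions, {"A"})        # A alone cannot fire A + B -> C
from_ab_strict = b_connected(reactions, {"A", "B"})  # stops at D because E is missing
from_a_graph = graph_connected(reactions, {"A"})     # permissive rule reaches F
```

In this toy example the strict rule reaches far fewer nodes than the graph rule, which mirrors the abstract's observation that the same database is well connected as a graph and poorly connected as a hypergraph.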

Signaling pathways describe how cells respond to external signals through molecular interactions. As we gain a deeper understanding of these signaling reactions, it is important to understand how molecules may influence downstream responses and how pathways may affect each other. As the amount of information in signaling pathway databases continues to grow, we have the opportunity to analyze properties of pathway structure. We pose an intuitive question about signaling pathways: when are two molecules “connected” in a pathway? The answer varies dramatically based on the assumptions we make about how reactions link molecules. Here, we examine four approaches for modeling the structural topology of signaling pathways, and present methods to quantify whether two molecules are “connected” in a pathway database. We find that existing approaches are either too permissive (molecules are connected to many others) or too restrictive (molecules are connected to a handful of others), and we present a new measure that offers a continuum between these two extremes. We then expand our question to ask when an entire signaling pathway is “downstream” of another pathway, and show two case studies from the Reactome pathway database that uncover pathway influence. Finally, we show that the strict notion of connectivity can capture functional relationships among proteins using an independent benchmark dataset. Our approach to quantifying connectivity in pathways considers a biologically motivated definition of connectivity, laying the foundation for more sophisticated analyses that leverage the detailed information in pathway databases.

Pratapa A, Adames N, Kraikivski P, Franzese N, Tyson JJ, Peccoud J, Murali TM. CrossPlan: systematic planning of genetic crosses to validate mathematical models. Bioinformatics 2019;34:2237-2244. [PMID: 29432533 DOI: 10.1093/bioinformatics/bty072] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2017] [Accepted: 02/07/2018] [Indexed: 12/27/2022] Open

Wang L, Law J, Kale SD, Murali TM, Pandey G. Large-scale protein function prediction using heterogeneous ensembles. F1000Res 2018;7. [PMID: 30450194 PMCID: PMC6221071 DOI: 10.12688/f1000research.16415.1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 09/26/2018] [Indexed: 12/24/2022] Open

Tegge AN, Rodrigues RR, Larkin AL, Vu L, Murali TM, Rajagopalan P. Transcriptomic Analysis of Hepatic Cells in Multicellular Organotypic Liver Models. Sci Rep 2018;8:11306. [PMID: 30054499 PMCID: PMC6063915 DOI: 10.1038/s41598-018-29455-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2017] [Accepted: 07/11/2018] [Indexed: 02/08/2023] Open

Bharadwaj A, Singh DP, Ritz A, Tegge AN, Poirel CL, Kraikivski P, Adames N, Luther K, Kale SD, Peccoud J, Tyson JJ, Murali TM. GraphSpace: stimulating interdisciplinary collaborations in network biology. Bioinformatics 2018;33:3134-3136. [PMID: 28957495 DOI: 10.1093/bioinformatics/btx382] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2016] [Accepted: 06/09/2017] [Indexed: 01/23/2023] Open

Huang LJ, Law JN, Murali TM. Automating the PathLinker app for Cytoscape. F1000Res 2018;7:727. [PMID: 30057757 PMCID: PMC6051191 DOI: 10.12688/f1000research.14616.1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 05/30/2018] [Indexed: 11/20/2022] Open

Ritz A, Avent B, Murali TM. Pathway Analysis with Signaling Hypergraphs. IEEE/ACM Trans Comput Biol Bioinform 2017;14:1042-1055. [PMID: 28991726 PMCID: PMC5810418 DOI: 10.1109/tcbb.2015.2459681] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]

Gil DP, Law JN, Murali TM. The PathLinker app: Connect the dots in protein interaction networks. F1000Res 2017;6:58. [PMID: 28413614 PMCID: PMC5365231 DOI: 10.12688/f1000research.9909.1] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 11/14/2016] [Indexed: 11/20/2022] Open

Sam SA, Teel J, Tegge AN, Bharadwaj A, Murali TM. XTalkDB: a database of signaling pathway crosstalk. Nucleic Acids Res 2016;45:D432-D439. [PMID: 27899583 PMCID: PMC5210533 DOI: 10.1093/nar/gkw1037] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2016] [Revised: 09/28/2016] [Accepted: 10/20/2016] [Indexed: 01/01/2023] Open

Tegge AN, Sharp N, Murali TM. Xtalk: a path-based approach for identifying crosstalk between signaling pathways. Bioinformatics 2015;32:242-51. [PMID: 26400040 DOI: 10.1093/bioinformatics/btv549] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2014] [Accepted: 09/04/2015] [Indexed: 12/26/2022] Open

Abstract

MOTIVATION

Cells communicate with their environment via signal transduction pathways. On occasion, the activation of one pathway can produce an effect downstream of another pathway, a phenomenon known as crosstalk. Existing computational methods to discover such pathway pairs rely on simple overlap statistics.

RESULTS

We present Xtalk, a path-based approach for identifying pairs of pathways that may crosstalk. Xtalk computes the statistical significance of the average length of multiple short paths that connect receptors in one pathway to the transcription factors in another. By design, Xtalk reports the precise interactions and mechanisms that support the identified crosstalk. We applied Xtalk to signaling pathways in the KEGG and NCI-PID databases. We manually curated a gold standard set of 132 crosstalking pathway pairs and a set of 140 pairs that did not crosstalk, for which Xtalk achieved an area under the receiver operator characteristic curve of 0.65, a 12% improvement over the closest competing approach. The area under the receiver operator characteristic curve varied with the pathway, suggesting that crosstalk should be evaluated on a pathway-by-pathway level. We also analyzed an extended set of 658 pathway pairs in KEGG and a set of more than 7000 pathway pairs in NCI-PID. For the top-ranking pairs, we found substantial support in the literature (81% for KEGG and 78% for NCI-PID). We provide examples of networks computed by Xtalk that accurately recovered known mechanisms of crosstalk.
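The evaluation above is summarized as an area under the receiver operator characteristic curve (AUC) over a set of crosstalking pairs and a set of non-crosstalking pairs. As a minimal sketch of how such a number can be computed, the snippet below uses the rank interpretation of the AUC: the fraction of (positive, negative) pairs in which the positive receives the higher score, with ties counted as one half. The function name and the scores are invented for illustration, and the convention that a higher score means stronger evidence of crosstalk is an assumption, not something stated in the abstract.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outscores a negative.

    Assumes higher score = stronger predicted crosstalk (an assumption
    made for this sketch). Ties contribute 0.5.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores for two crosstalking and two non-crosstalking pairs.
example_auc = roc_auc([0.9, 0.3], [0.5, 0.1])
```

The quadratic loop is adequate for a gold standard of a few hundred pairs; a rank-based formulation would scale better on the larger pair sets.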

AVAILABILITY AND IMPLEMENTATION

The XTALK software is available at http://bioinformatics.cs.vt.edu/~murali/software. Crosstalk networks are available at http://graphspace.org/graphs?tags=2015-bioinformatics-xtalk.

CONTACT

ategge@vt.edu, murali@cs.vt.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Adames NR, Schuck PL, Chen KC, Murali TM, Tyson JJ, Peccoud J. Experimental testing of a new integrated model of the budding yeast Start transition. Mol Biol Cell 2015;26:3966-84. [PMID: 26310445 PMCID: PMC4710230 DOI: 10.1091/mbc.e15-06-0358] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2015] [Accepted: 08/19/2015] [Indexed: 01/29/2023] Open

Ritz A, Tegge AN, Kim H, Poirel CL, Murali TM. Signaling hypergraphs. Trends Biotechnol 2014;32:356-62. [PMID: 24857424 PMCID: PMC4299695 DOI: 10.1016/j.tibtech.2014.04.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2013] [Revised: 04/01/2014] [Accepted: 04/04/2014] [Indexed: 01/10/2023]

Poirel CL, Rodrigues RR, Chen KC, Tyson JJ, Murali TM. Top-down network analysis to drive bottom-up modeling of physiological processes. J Comput Biol 2013;20:409-18. [PMID: 23641868 DOI: 10.1089/cmb.2012.0274] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Kidane YH, Lawrence C, Murali TM. Computational approaches for discovery of common immunomodulators in fungal infections: towards broad-spectrum immunotherapeutic interventions. BMC Microbiol 2013;13:224. [PMID: 24099000 PMCID: PMC3853472 DOI: 10.1186/1471-2180-13-224] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2013] [Accepted: 09/17/2013] [Indexed: 01/16/2023] Open

Abstract

Background

Fungi are the second most abundant type of human pathogens. Invasive fungal pathogens are leading causes of life-threatening infections in clinical settings. Toxicity to the host and drug-resistance are two major deleterious issues associated with existing antifungal agents. Increasing a host’s tolerance and/or immunity to fungal pathogens has potential to alleviate these problems. A host’s tolerance may be improved by modulating the immune system such that it responds more rapidly and robustly in all facets, ranging from the recognition of pathogens to their clearance from the host. An understanding of biological processes and genes that are perturbed during attempted fungal exposure, colonization, and/or invasion will help guide the identification of endogenous immunomodulators and/or small molecules that activate host-immune responses such as specialized adjuvants.

Results

In this study, we present computational techniques and approaches using publicly available transcriptional data sets, to predict immunomodulators that may act against multiple fungal pathogens. Our study analyzed data sets derived from host cells exposed to five fungal pathogens, namely, Alternaria alternata, Aspergillus fumigatus, Candida albicans, Pneumocystis jirovecii, and Stachybotrys chartarum. We observed statistically significant associations between host responses to A. fumigatus and C. albicans. Our analysis identified biological processes that were consistently perturbed by these two pathogens. These processes contained both immune response-inducing genes such as MALT1, SERPINE1, ICAM1, and IL8, and immune response-repressing genes such as DUSP8, DUSP6, and SPRED2. We hypothesize that these genes belong to a pool of common immunomodulators that can potentially be activated or suppressed (agonized or antagonized) in order to render the host more tolerant to infections caused by A. fumigatus and C. albicans.

Conclusions

Our computational approaches and methodologies described here can now be applied to newly generated or expanded data sets for further elucidation of additional drug targets. Moreover, identified immunomodulators may be used to generate experimentally testable hypotheses that could help in the discovery of broad-spectrum immunotherapeutic interventions. All of our results are available at the following supplementary website: http://bioinformatics.cs.vt.edu/~murali/supplements/2013-kidane-bmc

Lasher CD, Rajagopalan P, Murali TM. Summarizing cellular responses as biological process networks. BMC Syst Biol 2013;7:68. [PMID: 23895181 PMCID: PMC3751784 DOI: 10.1186/1752-0509-7-68] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/05/2012] [Accepted: 06/26/2013] [Indexed: 12/02/2022]

Larkin AL, Rodrigues RR, Murali TM, Rajagopalan P. Designing a multicellular organotypic 3D liver model with a detachable, nanoscale polymeric Space of Disse. Tissue Eng Part C Methods 2013;19:875-84. [PMID: 23556413 DOI: 10.1089/ten.tec.2012.0700] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023] Open

Kidane YH, Lawrence C, Murali TM. The landscape of host transcriptional response programs commonly perturbed by bacterial pathogens: towards host-oriented broad-spectrum drug targets. PLoS One 2013;8:e58553. [PMID: 23516507 PMCID: PMC3596304 DOI: 10.1371/journal.pone.0058553] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2012] [Accepted: 02/07/2013] [Indexed: 12/19/2022] Open

Abstract

BACKGROUND

The emergence of drug-resistant pathogen strains and new infectious agents poses major challenges to public health. A promising approach to combat these problems is to target the host's genes or proteins, especially to discover targets that are effective against multiple pathogens, i.e., host-oriented broad-spectrum (HOBS) drug targets. An important first step in the discovery of such drug targets is the identification of host responses that are commonly perturbed by multiple pathogens.

RESULTS

In this paper, we present a methodology to identify common host responses elicited by multiple pathogens. First, we identified host responses perturbed by each pathogen using a gene set enrichment analysis of publicly available genome-wide transcriptional datasets. Then, we used biclustering to identify groups of host pathways and biological processes that were perturbed only by a subset of the analyzed pathogens. Finally, we tested the enrichment of each bicluster in human genes that are known drug targets, on the basis of which we elicited putative HOBS targets for specific groups of bacterial pathogens. We identified 84 up-regulated and three down-regulated statistically significant biclusters. Each bicluster contained a group of pathogens that commonly dysregulated a group of biological processes. We validated our approach by checking whether these biclusters correspond to known hallmarks of bacterial infection. Indeed, these biclusters contained biological processes such as inflammation, activation of dendritic cells, pro- and anti-apoptotic responses and other innate immune responses. Next, we identified biclusters containing pathogens that infected the same tissue. After a literature-based analysis of the drug targets contained in these biclusters, we suggested new uses of the drugs Anakinra, Etanercept, and Infliximab for gastrointestinal pathogens Yersinia enterocolitica, Helicobacter pylori kx2 strain, and enterohemorrhagic Escherichia coli and the drug Simvastatin for hematopoietic pathogen Ehrlichia chaffeensis.

CONCLUSIONS

Using a combination of automated analysis of host-response gene expression data and manual study of the literature, we have been able to suggest host-oriented treatments for specific bacterial infections. The analyses and suggestions made in this study may be utilized to generate concrete hypotheses on which gene sets to probe further in the quest for HOBS drug targets for bacterial infections. All our results are available at the following supplementary website: http://bioinformatics.cs.vt.edu/~murali/supplements/2013-kidane-plos-one.

Poirel CL, Rahman A, Rodrigues RR, Krishnan A, Addesa JR, Murali TM. Reconciling differential gene expression data with molecular interaction networks. Bioinformatics 2013;29:622-9. [PMID: 23314326 DOI: 10.1093/bioinformatics/btt007] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Rivera CG, Tyler BM, Murali TM. Sensitive detection of pathway perturbations in cancers. BMC Bioinformatics 2012;13 Suppl 3:S9. [PMID: 22536907 PMCID: PMC3471354 DOI: 10.1186/1471-2105-13-s3-s9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open

Abstract

Background

The normal functioning of a living cell is characterized by complex interaction networks involving many different types of molecules. Associations detected between diseases and perturbations in well-defined pathways within such interaction networks have the potential to illuminate the molecular mechanisms underlying disease progression and response to treatment.

Results

In this paper, we present a computational method that compares expression profiles of genes in cancer samples to samples from normal tissues in order to detect perturbations of pre-defined pathways in the cancer. In contrast to many previous methods, our scoring function approach explicitly takes into account the interactions between the gene products in a pathway. Moreover, we compute the sub-pathway that has the highest score, as opposed to merely computing the score for the entire pathway. We use a permutation test to assess the statistical significance of the most perturbed sub-pathway. We apply our method to 20 pathways in the Netpath database and to the Global Cancer Map of gene expression in 18 cancers. We demonstrate that our method yields more sensitive results than alternatives that do not consider interactions or measure the perturbation of a pathway as a whole. We perform a sensitivity analysis to show that our approach is robust to modest changes in the input data. Our method confirms numerous well-known connections between pathways and cancers.

Conclusions

Our results indicate that integrating differential gene expression with the interaction structure in a pathway is a powerful approach for detecting links between a cancer and the pathways perturbed in it. Our results also suggest that even well-studied pathways may be perturbed only partially in any given cancer. Further analysis of cancer-specific sub-pathways may shed new light on the similarities and differences between cancers.

Murali TM. Computationally Driven Experimental Biology. Computer (Long Beach Calif) 2012;45:22-23. [PMID: 24976642 PMCID: PMC4071611 DOI: 10.1109/mc.2012.93] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]

Poirel CL, Owens CC, Murali TM. Network-based functional enrichment. BMC Bioinformatics 2011;12 Suppl 13:S14. [PMID: 22479706 PMCID: PMC3278830 DOI: 10.1186/1471-2105-12-s13-s14] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

Abstract

Background

Many methods have been developed to infer and reason about molecular interaction networks. These approaches often yield networks with hundreds or thousands of nodes and up to an order of magnitude more edges. It is often desirable to summarize the biological information in such networks. A very common approach is to use gene function enrichment analysis for this task. A major drawback of this method is that it ignores information about the edges in the network being analyzed, i.e., it treats the network simply as a set of genes. In this paper, we introduce a novel method for functional enrichment that explicitly takes network interactions into account.

Results

Our approach naturally generalizes Fisher’s exact test, a gene set-based technique. Given a function of interest, we compute the subgraph of the network induced by genes annotated to this function. We use the sequence of sizes of the connected components of this sub-network to estimate its connectivity. We estimate the statistical significance of the connectivity empirically by a permutation test. We present three applications of our method: i) determine which functions are enriched in a given network, ii) given a network and an interesting sub-network of genes within that network, determine which functions are enriched in the sub-network, and iii) given two networks, determine the functions for which the connectivity improves when we merge the second network into the first. Through these applications, we show that our approach is a natural alternative to network clustering algorithms.
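The procedure described above (induce a subgraph on the annotated genes, measure its connected components, then assess significance by permutation) can be sketched roughly as follows. This is an illustrative simplification and not the authors' C++ implementation: the abstract says connectivity is estimated from the sequence of component sizes, whereas this sketch uses only the size of the largest component as a stand-in statistic. The function names, the `trials` and `seed` parameters, and the toy network are all assumptions made for the example.

```python
import random
from collections import defaultdict


def component_sizes(edges, nodes):
    """Connected-component sizes (largest first) of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    adj = defaultdict(set)
    for u, v in edges:
        if u in nodes and v in nodes:  # keep only edges inside the induced subgraph
            adj[u].add(v)
            adj[v].add(u)
    seen, sizes = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # iterative depth-first search
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        sizes.append(size)
    return sorted(sizes, reverse=True)


def connectivity_p_value(edges, all_genes, annotated, trials=1000, seed=0):
    """Empirical p-value for the largest induced component (stand-in statistic).

    Compares the annotated gene set (assumed non-empty) against random gene
    sets of the same size drawn from the network.
    """
    rng = random.Random(seed)
    pool = sorted(all_genes)
    observed = component_sizes(edges, annotated)[0]
    hits = 0
    for _ in range(trials):
        sample = rng.sample(pool, len(annotated))
        if component_sizes(edges, sample)[0] >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)  # add-one correction avoids p = 0


# Toy network: a path a-b-c plus a separate edge d-e.
toy_edges = [("a", "b"), ("b", "c"), ("d", "e")]
toy_genes = {"a", "b", "c", "d", "e"}
toy_p = connectivity_p_value(toy_edges, toy_genes, {"a", "b", "c"}, trials=200)
```

Sampling gene sets uniformly at random is the simplest null model; a degree-preserving permutation would be a stricter alternative.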

Conclusions

We presented a novel approach to functional enrichment that takes into account the pairwise relationships among genes annotated by a particular function. Each of the three applications discovers highly relevant functions. We used our methods to study biological data from three different organisms. Our results demonstrate the wide applicability of our methods. Our algorithms are implemented in C++ and are freely available under the GNU General Public License at our supplementary website. Additionally, all our input data and results are available at http://bioinformatics.cs.vt.edu/~murali/supplements/2011-incob-nbe/.

Murali TM, Dyer MD, Badger D, Tyler BM, Katze MG. Network-based prediction and analysis of HIV dependency factors. PLoS Comput Biol 2011;7:e1002164. [PMID: 21966263 PMCID: PMC3178628 DOI: 10.1371/journal.pcbi.1002164] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2010] [Accepted: 06/30/2011] [Indexed: 01/27/2023] Open

Abstract

HIV Dependency Factors (HDFs) are a class of human proteins that are essential for HIV replication, but are not lethal to the host cell when silenced. Three previous genome-wide RNAi experiments identified HDF sets with little overlap. We combine data from these three studies with a human protein interaction network to predict new HDFs, using an intuitive algorithm called SinkSource and four other algorithms published in the literature. Our algorithm achieves high precision and recall upon cross validation, as do the other methods. A number of HDFs that we predict are known to interact with HIV proteins. They belong to multiple protein complexes and biological processes that are known to be manipulated by HIV. We also demonstrate that many predicted HDF genes show significantly different programs of expression in early response to SIV infection in two non-human primate species that differ in AIDS progression. Our results suggest that many HDFs are yet to be discovered and that they have potential value as prognostic markers to determine pathological outcome and the likelihood of AIDS development. More generally, if multiple genome-wide gene-level studies have been performed at independent labs to study the same biological system or phenomenon, our methodology is applicable to interpret these studies simultaneously in the context of molecular interaction networks and to ask if they reinforce or contradict each other.

Medicines to cure infectious diseases usually target proteins in the pathogens. Since pathogens have short life cycles, the targeted proteins can rapidly evolve and make the medicines ineffective, especially in viruses such as HIV. However, since viruses have very small genomes, they must exploit the cellular machinery of the host to propagate. Therefore, disrupting the activity of selected host proteins may impede viruses. Three recent experiments have discovered hundreds of such proteins in human cells that HIV depends upon. Surprisingly, these three sets have very little overlap. In this work, we demonstrate that this discrepancy can be explained by considering physical interactions between the human proteins in these studies. Moreover, we exploit these interactions to predict new dependency factors for HIV. Our predictions show very significant overlaps with human proteins that are known to interact with HIV proteins and with human cellular processes that are known to be subverted by the virus. Most importantly, we show that proteins predicted by us may play a prominent role in affecting HIV-related disease progression in lymph nodes. Therefore, our predictions constitute a powerful resource for experimentalists who desire to discover new human proteins that can control the spread of HIV.

Dyer MD, Murali TM, Sobral BW. Supervised learning and prediction of physical interactions between human and HIV proteins. Infect Genet Evol 2011;11:917-23. [PMID: 21382517 DOI: 10.1016/j.meegid.2011.02.022] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/23/2010] [Revised: 02/22/2011] [Accepted: 02/24/2011] [Indexed: 02/08/2023]

Lasher CD, Rajagopalan P, Murali TM. Discovering networks of perturbed biological processes in hepatocyte cultures. PLoS One 2011;6:e15247. [PMID: 21245926 PMCID: PMC3016309 DOI: 10.1371/journal.pone.0015247] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2010] [Accepted: 11/02/2010] [Indexed: 12/20/2022] Open

Dyer MD, Neff C, Dufford M, Rivera CG, Shattuck D, Bassaganya-Riera J, Murali TM, Sobral BW. The human-bacterial pathogen protein interaction networks of Bacillus anthracis, Francisella tularensis, and Yersinia pestis. PLoS One 2010;5:e12089. [PMID: 20711500 PMCID: PMC2918508 DOI: 10.1371/journal.pone.0012089] [Citation(s) in RCA: 107] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2010] [Accepted: 07/17/2010] [Indexed: 01/01/2023] Open

Kim Y, Lasher CD, Milford LM, Murali TM, Rajagopalan P. A comparative study of genome-wide transcriptional profiles of primary hepatocytes in collagen sandwich and monolayer cultures. Tissue Eng Part C Methods 2010;16:1449-60. [PMID: 20412007 DOI: 10.1089/ten.tec.2010.0012] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Murali TM, Kandasamy S. Mix Proportioning of High Performance Self-Compacting Concrete using Response Surface Methodology. Open Civ Eng J 2009. [DOI: 10.2174/1874149500903010093] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Driscoll T, Dyer MD, Murali TM, Sobral BW. PIG--the pathogen interaction gateway. Nucleic Acids Res 2008;37:D647-50. [PMID: 18984614 PMCID: PMC2686532 DOI: 10.1093/nar/gkn799] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Murali TM, Rivera CG. Network Legos: Building Blocks of Cellular Wiring Diagrams. J Comput Biol 2008;15:829-44. [DOI: 10.1089/cmb.2007.0139] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023] Open

Dyer MD, Murali TM, Sobral BW. Computational prediction of host-pathogen protein-protein interactions. Bioinformatics 2007;23:i159-66. [PMID: 17646292 DOI: 10.1093/bioinformatics/btm208] [Citation(s) in RCA: 124] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Murali TM, Wu CJ, Kasif S. The art of gene function prediction. Nat Biotechnol 2007;24:1474-5; author reply 1475-6. [PMID: 17160037 DOI: 10.1038/nbt1206-1474] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [What about the content of this article? (0)] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Li P, Sioson A, Mane SP, Ulanov A, Grothaus G, Heath LS, Murali TM, Bohnert HJ, Grene R. Response diversity of Arabidopsis thaliana ecotypes in elevated [CO2] in the field. Plant Mol Biol 2006;62:593-609. [PMID: 16941220 DOI: 10.1007/s11103-006-9041-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/05/2006] [Accepted: 06/27/2006] [Indexed: 05/02/2023]

Grothaus GA, Mufti A, Murali TM. Automatic layout and visualization of biclusters. Algorithms Mol Biol 2006;1:15. [PMID: 16952321 PMCID: PMC1624833 DOI: 10.1186/1748-7188-1-15] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2006] [Accepted: 09/04/2006] [Indexed: 11/17/2022] Open

Massjouni N, Rivera CG, Murali TM. VIRGO: computational prediction of gene functions. Nucleic Acids Res 2006;34:W340-4. [PMID: 16845022 PMCID: PMC1538839 DOI: 10.1093/nar/gkl225] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open

Pati A, Vasquez-Robinet C, Heath LS, Grene R, Murali TM. XcisClique: analysis of regulatory bicliques. BMC Bioinformatics 2006;7:218. [PMID: 16630346 PMCID: PMC1513260 DOI: 10.1186/1471-2105-7-218] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2005] [Accepted: 04/21/2006] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Modeling of cis-elements or regulatory motifs in promoter (upstream) regions of genes is a challenging computational problem. In this work, a set of regulatory motifs simultaneously present in the promoters of a set of genes is modeled as a biclique in a suitably defined bipartite graph. A biologically meaningful co-occurrence of multiple cis-elements in a gene promoter is assessed by the combined analysis of genomic and gene expression data. Greater statistical significance is associated with a set of genes that shares a common set of regulatory motifs, while simultaneously exhibiting highly correlated gene expression under given experimental conditions.

METHODS

XcisClique, the system developed in this work, is a comprehensive infrastructure that associates annotated genome and gene expression data, models known cis-elements as regular expressions, identifies maximal bicliques in a bipartite gene-motif graph, and ranks bicliques based on their computed statistical significance. Significance is a function of the probability of occurrence of those motifs in a biclique (a hypergeometric distribution) and of the new sum of absolute values (SAV) statistic, which uses Spearman correlations of gene expression vectors. SAV is a statistic well suited for this purpose, as described in the discussion.
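The hypergeometric component of the significance score can be illustrated with a generic upper-tail calculation. The abstract does not give XcisClique's exact parameterization, so the setup below is an assumption for illustration only: out of N genes in the genome, K carry a given motif in their promoters, and a gene set of size n contains k motif-bearing genes. The function name and the numbers are hypothetical, and the SAV statistic is not reproduced here.

```python
from math import comb


def hypergeom_tail(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: population size, K: successes in the population,
    n: sample size, k: observed successes in the sample.
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k):
        raise ValueError("invalid hypergeometric parameters")
    total = comb(N, n)
    upper = min(K, n)
    # comb() returns 0 when the lower argument exceeds the upper one,
    # so infeasible terms drop out of the sum automatically.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical numbers: 4 genes in total, 2 carry the motif,
# and both genes of a 2-gene set carry it.
example_tail = hypergeom_tail(4, 2, 2, 2)
```

A small tail probability indicates that the motif occurs in the gene set more often than expected by chance under this simple model.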

RESULTS

XcisClique identifies new motif and gene combinations that might indicate as yet unidentified involvement of sets of genes in biological functions and processes. It currently supports Arabidopsis thaliana and can be adapted to other organisms, assuming the existence of annotated genomic sequences, suitable gene expression data, and identified regulatory motifs. A subset of XcisClique functionalities, including the motif visualization component MotifSee, source code, and supplementary material are available at https://bioinformatics.cs.vt.edu/xcisclique/.
