1
State of the Interactomes: an evaluation of molecular networks for generating biological insights. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.26.587073. [PMID: 38746239 PMCID: PMC11092493 DOI: 10.1101/2024.04.26.587073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Advancements in genomic and proteomic technologies have powered the use of gene and protein networks ("interactomes") for understanding genotype-phenotype translation. However, the proliferation of interactomes complicates the selection of networks for specific applications. Here, we present a comprehensive evaluation of 46 current human interactomes, encompassing protein-protein interactions as well as gene regulatory, signaling, colocalization, and genetic interaction networks. Our analysis shows that large composite networks such as HumanNet, STRING, and FunCoup are most effective for identifying disease genes, while smaller networks such as DIP and SIGNOR demonstrate strong interaction prediction performance. These findings provide a benchmark for interactomes across diverse network biology applications and clarify factors that influence network performance. Furthermore, our evaluation pipeline paves the way for continued assessment of emerging and updated interaction networks in the future.
2
A multi-scale map of protein assemblies in the DNA damage response. Cell Syst 2023; 14:447-463.e8. [PMID: 37220749 PMCID: PMC10330685 DOI: 10.1016/j.cels.2023.04.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/15/2021] [Revised: 01/30/2023] [Accepted: 04/25/2023] [Indexed: 05/25/2023]
Abstract
The DNA damage response (DDR) ensures error-free DNA replication and transcription and is disrupted in numerous diseases. An ongoing challenge is to determine the proteins orchestrating DDR and their organization into complexes, including constitutive interactions and those responding to genomic insult. Here, we use multi-conditional network analysis to systematically map DDR assemblies at multiple scales. Affinity purifications of 21 DDR proteins, with/without genotoxin exposure, are combined with multi-omics data to reveal a hierarchical organization of 605 proteins into 109 assemblies. The map captures canonical repair mechanisms and proposes new DDR-associated proteins extending to stress, transport, and chromatin functions. We find that protein assemblies closely align with genetic dependencies in processing specific genotoxins and that proteins in multiple assemblies typically act in multiple genotoxin responses. Follow-up by DDR functional readouts newly implicates 12 assembly members in double-strand-break repair. The DNA damage response assemblies map is available for interactive visualization and query (ccmi.org/ddram/).
3
NDEx IQuery: a multi-method network gene set analysis leveraging the Network Data Exchange. Bioinformatics 2023; 39:btad118. [PMID: 36882166 PMCID: PMC10023220 DOI: 10.1093/bioinformatics/btad118] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/21/2022] [Revised: 01/16/2023] [Accepted: 02/17/2023] [Indexed: 03/09/2023] Open
Abstract
MOTIVATION: The investigation of sets of genes using biological pathways is a common task for researchers and is supported by a wide variety of software tools. This type of analysis generates hypotheses about the biological processes that are active or modulated in a specific experimental context. RESULTS: The Network Data Exchange Integrated Query (NDEx IQuery) is a new tool for network and pathway-based gene set interpretation that complements or extends existing resources. It combines novel sources of pathways, integration with Cytoscape, and the ability to store and share analysis results. The NDEx IQuery web application performs multiple gene set analyses based on diverse pathways and networks stored in NDEx. These include curated pathways from WikiPathways and SIGNOR, published pathway figures from the last 27 years, machine-assembled networks using the INDRA system, and the new NCI-PID v2.0, an updated version of the popular NCI Pathway Interaction Database. NDEx IQuery's integration with MSigDB and cBioPortal now provides pathway analysis in the context of these two resources. AVAILABILITY AND IMPLEMENTATION: NDEx IQuery is available at https://www.ndexbio.org/iquery and is implemented in JavaScript and Java.
4
Abstract
A major goal of cancer research is to understand how mutations distributed across diverse genes affect common cellular systems, including multiprotein complexes and assemblies. Two challenges—how to comprehensively map such systems and how to identify which are under mutational selection—have hindered this understanding. Accordingly, we created a comprehensive map of cancer protein systems integrating both new and published multi-omic interaction data at multiple scales of analysis. We then developed a unified statistical model that pinpoints 395 specific systems under mutational selection across 13 cancer types. This map, called NeST (Nested Systems in Tumors), incorporates canonical processes and notable discoveries, including a PIK3CA-actomyosin complex that inhibits phosphatidylinositol 3-kinase signaling and recurrent mutations in collagen complexes that promote tumor proliferation. These systems can be used as clinical biomarkers and implicate a total of 548 genes in cancer evolution and progression. This work shows how disparate tumor mutations converge on protein assemblies at different scales.
5
Restriction factor compendium for influenza A virus reveals a mechanism for evasion of autophagy. Nat Microbiol 2021; 6:1319-1333. [PMID: 34556855 PMCID: PMC9683089 DOI: 10.1038/s41564-021-00964-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/09/2020] [Accepted: 08/18/2021] [Indexed: 02/07/2023]
Abstract
The fate of influenza A virus (IAV) infection in the host cell depends on the balance between cellular defence mechanisms and viral evasion strategies. To illuminate the landscape of IAV cellular restriction, we generated and integrated global genetic loss-of-function screens with transcriptomics and proteomics data. Our multi-omics analysis revealed a subset of both IFN-dependent and independent cellular defence mechanisms that inhibit IAV replication. Amongst these, the autophagy regulator TBC1 domain family member 5 (TBC1D5), which binds Rab7 to enable fusion of autophagosomes and lysosomes, was found to control IAV replication in vitro and in vivo and to promote lysosomal targeting of IAV M2 protein. Notably, IAV M2 was observed to abrogate TBC1D5-Rab7 binding through a physical interaction with TBC1D5 via its cytoplasmic tail. Our results provide evidence for the molecular mechanism utilised by IAV M2 protein to escape lysosomal degradation and traffic to the cell membrane, where it supports IAV budding and growth.
6
Abstract
NDEx, the Network Data Exchange (https://www.ndexbio.org), is a web-based resource where users can find, store, share and publish network models of any type and size. NDEx is integrated with Cytoscape, the widely used desktop application for network analysis and visualization. NDEx and Cytoscape are the pillars of the Cytoscape Ecosystem, a diverse environment of resources, tools, applications and services for network biology workflows. In this article, we introduce researchers to NDEx and highlight how it can simplify common tasks in network biology workflows as well as streamline publication and access to). Finally, we show how NDEx can be used programmatically via Python with the 'ndex2' client library, and point readers to additional examples for other popular programming languages such as JavaScript and R. © 2021 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol 1: Getting started with NDEx. Basic Protocol 2: Using NDEx and Cytoscape in a publication-oriented workflow. Basic Protocol 3: Manipulating networks in NDEx via Python.
7
Abstract
A deficient interferon (IFN) response to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection has been implicated as a determinant of severe coronavirus disease 2019 (COVID-19). To identify the molecular effectors that govern IFN control of SARS-CoV-2 infection, we conducted a large-scale gain-of-function analysis that evaluated the impact of human IFN-stimulated genes (ISGs) on viral replication. A limited subset of ISGs were found to control viral infection, including endosomal factors inhibiting viral entry, RNA binding proteins suppressing viral RNA synthesis, and a highly enriched cluster of endoplasmic reticulum (ER)/Golgi-resident ISGs inhibiting viral assembly/egress. These included broad-acting antiviral ISGs and eight ISGs that specifically inhibited SARS-CoV-2 and SARS-CoV-1 replication. Among the broad-acting ISGs was BST2/tetherin, which impeded viral release and is antagonized by SARS-CoV-2 Orf7a protein. Overall, these data illuminate a set of ISGs that underlie innate immune control of SARS-CoV-2/SARS-CoV-1 infection, which will facilitate the understanding of host determinants that impact disease severity and offer potential therapeutic strategies for COVID-19.
8
Abstract
In any 'omics study, the scale of analysis can dramatically affect the outcome. For instance, when clustering single-cell transcriptomes, is the analysis tuned to discover broad or specific cell types? Likewise, protein communities revealed from protein networks can vary widely in size depending on the method. Here, we use the concept of persistent homology, drawn from mathematical topology, to identify robust structures in data at all scales simultaneously. Application to mouse single-cell transcriptomes significantly expands the catalog of identified cell types, while analysis of SARS-CoV-2 protein interactions suggests hijacking of WNT. The method, HiDeF, is available via Python and Cytoscape.
9
Abstract
Detection of community structure has become a fundamental step in the analysis of biological networks with application to protein function annotation, disease gene prediction, and drug discovery. This recent impact creates a need to make these techniques and their accompanying visualization schemes available to a broad range of biologists. Here we present a service-oriented, end-to-end software framework, CDAPS (Community Detection APplication and Service), that integrates the identification, annotation, visualization, and interrogation of multiscale network communities, accessible within the popular Cytoscape network analysis platform. With novel design principles, CDAPS addresses unmet new challenges, such as identifying hierarchical community structures, comparison of outputs generated from diverse network resources, and easy deployment of new algorithms, to facilitate community-sourced science. We demonstrate that the CDAPS framework can be applied to high-throughput protein-protein interaction networks to gain novel insights, such as the identification of putative new members of known protein complexes.
10
Identifying persistent structures in multiscale 'omics data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2020:2020.06.16.151555. [PMID: 32587977 PMCID: PMC7310637 DOI: 10.1101/2020.06.16.151555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
In any 'omics study, the scale of analysis can dramatically affect the outcome. For instance, when clustering single-cell transcriptomes, is the analysis tuned to discover broad or specific cell types? Likewise, protein communities revealed from protein networks can vary widely in size depending on the method. Here we use the concept of "persistent homology", drawn from mathematical topology, to identify robust structures in data at all scales simultaneously. Application to mouse single-cell transcriptomes significantly expands the catalog of identified cell types, while analysis of SARS-CoV-2 protein interactions suggests hijacking of WNT. The method, HiDeF, is available via Python and Cytoscape.
11
Functional Landscape of SARS-CoV-2 Cellular Restriction. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2020:2020.09.29.319566. [PMID: 33024967 PMCID: PMC7536870 DOI: 10.1101/2020.09.29.319566] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 05/09/2023]
Abstract
A deficient interferon response to SARS-CoV-2 infection has been implicated as a determinant of severe COVID-19. To identify the molecular effectors that govern interferon control of SARS-CoV-2 infection, we conducted a large-scale gain-of-function analysis that evaluated the impact of human interferon stimulated genes (ISGs) on viral replication. A limited subset of ISGs were found to control viral infection, including endosomal factors that inhibited viral entry, nucleic acid binding proteins that suppressed viral RNA synthesis, and a highly enriched cluster of ER and Golgi-resident ISGs that inhibited viral translation and egress. These included the type II integral membrane protein BST2/tetherin, which was found to impede viral release, and is targeted for immune evasion by SARS-CoV-2 Orf7a protein. Overall, these data define the molecular basis of early innate immune control of viral infection, which will facilitate the understanding of host determinants that impact disease severity and offer potential therapeutic strategies for COVID-19.
12
Abstract 3206: IQuery: A tool aggregating multiple gene set analysis methods, based on networks and pathways in a community-driven public data repository. Cancer Res 2020. [DOI: 10.1158/1538-7445.am2020-3206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The investigation of gene sets identified in experiments and analyses is a common task for biologists. This includes identification of relevant pathway networks, neighborhoods of protein interactions, and associations with drugs and diseases. However, services that provide these analyses face the following challenges: (1) Curated pathway content tends to lack biological context and lag behind recent findings, as it focuses on consensus biology and is expensive to maintain; (2) User interfaces for network analyses need to be transparent and intuitive, without requiring complex parameter choices; and (3) Users have diverse, changing network analysis needs, requiring an evolving portfolio of methods and data.
To address these challenges, we present NDEx Integrated Query (IQuery), a novel and flexible resource for gene set analysis providing enrichment, protein-protein interaction, and protein association analyses based on pathways and networks from NDEx, the Network Data Exchange.
IQuery provides researchers with a simple “one click” query interface which runs multiple simultaneous queries without the need to specify the analyses to perform, query parameters, or network data sources. The results are aggregated and presented in a rich, intuitive interface for the user to browse, preview, save, or analyze. For each network, the user can view descriptions, citations, author information and other metadata, and inspect the data associated with each node and edge.
The IQuery database is NDEx, a public data commons which is constantly growing via the addition of new public network resources and networks submitted by the authors of publications. The incorporation of published networks in IQuery enables timely coverage of new research as well as clear biological context for each pathway, and fosters cross-discipline insights and collaboration between teams. IQuery is also integrated with Cytoscape, the widely used network visualization and analysis application, enabling query result networks to be seamlessly downloaded to Cytoscape for editing, merging with other networks, annotation with user datasets, or use with any of hundreds of community developed apps.
IQuery is built as a modular, web service-oriented platform. The primary IQuery service distributes each query to all component services using a standard API. This enables easy addition of new analyses, including those written by collaborators. This allows IQuery to grow in two ways: Through the growth of the NDEx database, and through the addition of modules providing new gene set analysis tools.
In sum, IQuery is a powerful and versatile new network analysis service with collaboration and extensibility at its core, which will allow users to find, access, and analyze networks relevant to their research, and provide opportunities for cross-discipline communication and insight.
Citation Format: Dexter R. Pratt, Keiichiro Ono, Sophie Liu, Jing Chen, Christopher Churas. IQuery: A tool aggregating multiple gene set analysis methods, based on networks and pathways in a community-driven public data repository [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 3206.
13
CDeep3M-Plug-and-Play cloud-based deep learning for image segmentation. Nat Methods 2018; 15:677-680. [PMID: 30171236 DOI: 10.1038/s41592-018-0106-z] [Citation(s) in RCA: 103] [Impact Index Per Article: 17.2] [Received: 02/15/2018] [Accepted: 07/19/2018] [Indexed: 11/09/2022]
Abstract
As biomedical imaging datasets expand, deep neural networks are considered vital for image processing, yet community access is still limited by setting up complex computational environments and availability of high-performance computing resources. We address these bottlenecks with CDeep3M, a ready-to-use image segmentation solution employing a cloud-based deep convolutional neural network. We benchmark CDeep3M on large and complex two-dimensional and three-dimensional imaging datasets from light, X-ray, and electron microscopy.
14
Probability Map Viewer: near real-time probability map generator of serial block electron microscopy collections. Bioinformatics 2018; 33:3145-3147. [PMID: 28957496 DOI: 10.1093/bioinformatics/btx376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2016] [Accepted: 06/06/2017] [Indexed: 11/13/2022] Open
Abstract
Summary: To expedite the review of semi-automated probability maps of organelles and other features from 3D electron microscopy data, we have developed Probability Map Viewer, a Java-based web application that enables the computation and visualization of probability map generation results in near real-time as the data are being collected from the microscope. Probability Map Viewer allows the user to select one or more voxel classifiers, apply them on a sub-region of an active collection, and visualize the results as overlays on the raw data via any web browser using a personal computer or mobile device. Thus, Probability Map Viewer accelerates and informs the image analysis workflow by providing a tool for experimenting with and optimizing dataset-specific segmentation strategies during imaging. Availability and implementation: https://github.com/crbs/probabilitymapviewer. Contact: mellisman@ucsd.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
15
Electron tomography simulator with realistic 3D phantom for evaluation of acquisition, alignment and reconstruction methods. J Struct Biol 2017; 198:103-115. [PMID: 28392451 DOI: 10.1016/j.jsb.2017.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2016] [Revised: 12/22/2016] [Accepted: 04/03/2017] [Indexed: 11/16/2022]
Abstract
Because of the significance of electron microscope tomography in the investigation of biological structure at nanometer scales, improvement efforts have been continuous over recent years. This is particularly true in the case of software developments. Nevertheless, verification of improvements delivered by new algorithms and software remains difficult. Current analysis tools do not provide adaptable and consistent methods for quality assessment. This is particularly true with images of biological samples, due to image complexity, variability, low contrast and noise. We report an electron tomography (ET) simulator with accurate ray optics modeling of image formation that includes curvilinear trajectories through the sample, warping of the sample and noise. As a demonstration of the utility of our approach, we have concentrated on providing verification of the class of reconstruction methods applicable to wide field images of stained plastic-embedded samples. Accordingly, we have also constructed digital phantoms derived from serial block face scanning electron microscope images. These phantoms are also easily modified to include alignment features to test alignment algorithms. The combination of more realistic phantoms with more faithful simulations facilitates objective comparison of acquisition parameters, alignment and reconstruction algorithms and their range of applicability. With proper phantoms, this approach can also be modified to include more complex optical models, including distance-dependent blurring and phase contrast functions, such as may occur in cryotomography.
16
Abstract
Variation in genome structure is an important source of human genetic polymorphism: It affects a large proportion of the genome and has a variety of phenotypic consequences relevant to health and disease. In spite of this, human genome structure variation is incompletely characterized due to a lack of approaches for discovering a broad range of structural variants in a global, comprehensive fashion. We addressed this gap with Optical Mapping, a high-throughput, high-resolution single-molecule system for studying genome structure. We used Optical Mapping to create genome-wide restriction maps of a complete hydatidiform mole and three lymphoblast-derived cell lines, and we validated the approach by demonstrating a strong concordance with existing methods. We also describe thousands of new variants with sizes ranging from kb to Mb.
17
Abstract
Modern comparative genomics has been established, in part, by the sequencing and annotation of a broad range of microbial species. To gain further insights, new sequencing efforts are now dealing with the variety of strains or isolates that gives a species definition and range; however, this number vastly outstrips our ability to sequence them. Given the availability of a large number of microbial species, new whole genome approaches must be developed to fully leverage this information at the level of strain diversity that maximizes discovery. Here, we describe how optical mapping, a single-molecule system, was used to identify and annotate chromosomal alterations between bacterial strains represented by several species. Since whole-genome optical maps are ordered restriction maps, sequenced strains of Shigella flexneri serotype 2a (2457T and 301), Yersinia pestis (CO 92 and KIM), and Escherichia coli were aligned as maps to identify regions of homology and to further characterize them as possible insertions, deletions, inversions, or translocations. Importantly, an unsequenced Shigella flexneri strain (serotype Y strain AMC[328Y]) was optically mapped and aligned with two sequenced ones to reveal one novel locus implicated in serotype conversion and several other loci containing insertion sequence elements or phage-related gene insertions. Our results suggest that genomic rearrangements and chromosomal breakpoints are readily identified and annotated against a prototypic sequenced strain by using the tools of optical mapping.