Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Mesiti M, Re M, Valentini G. Think globally and solve locally: secondary memory-based network learning for automated multi-species function prediction. Gigascience 2014;3:5. [PMID: 24843788 PMCID: PMC4006453 DOI: 10.1186/2047-217x-3-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2013] [Accepted: 04/01/2014] [Indexed: 01/08/2023] Open

For:	Mesiti M, Re M, Valentini G. Think globally and solve locally: secondary memory-based network learning for automated multi-species function prediction. Gigascience 2014;3:5. [PMID: 24843788 PMCID: PMC4006453 DOI: 10.1186/2047-217x-3-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2013] [Accepted: 04/01/2014] [Indexed: 01/08/2023] Open

Number

Cited by Other Article(s)

Perlasca P, Frasca M, Ba CT, Gliozzo J, Notaro M, Pennacchioni M, Valentini G, Mesiti M. Multi-resolution visualization and analysis of biomolecular networks through hierarchical community detection and web-based graphical tools. PLoS One 2020;15:e0244241. [PMID: 33351828 PMCID: PMC7755227 DOI: 10.1371/journal.pone.0244241] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2020] [Accepted: 12/04/2020] [Indexed: 11/19/2022] Open

Abstract

The visual exploration and analysis of biomolecular networks is of paramount importance for identifying hidden and complex interaction patterns among proteins. Although many tools have been proposed for this task, they are mainly focused on the query and visualization of a single protein with its neighborhood. The global exploration of the entire network and the interpretation of its underlying structure still remains difficult, mainly due to the excessively large size of the biomolecular networks. In this paper we propose a novel multi-resolution representation and exploration approach that exploits hierarchical community detection algorithms for the identification of communities occurring in biomolecular networks. The proposed graphical rendering combines two types of nodes (protein and communities) and three types of edges (protein-protein, community-community, protein-community), and displays communities at different resolutions, allowing the user to interactively zoom in and out from different levels of the hierarchy. Links among communities are shown in terms of relationships and functional correlations among the biomolecules they contain. This form of navigation can be also combined by the user with a vertex centric visualization for identifying the communities holding a target biomolecule. Since communities gather limited-size groups of correlated proteins, the visualization and exploration of complex and large networks becomes feasible on off-the-shelf computer machines. The proposed graphical exploration strategies have been implemented and integrated in UNIPred-Web, a web application that we recently introduced for combining the UNIPred algorithm, able to address both integration and protein function prediction in an imbalance-aware fashion, with an easy to use vertex-centric exploration of the integrated network. The tool has been deeply amended from different standpoints, including the prediction core algorithm. Several tests on networks of different size and connectivity have been conducted to show off the vast potential of our methodology; moreover, enrichment analyses have been performed to assess the biological meaningfulness of detected communities. Finally, a CoV-human network has been embedded in the system, and a corresponding case study presented, including the visualization and the prediction of human host proteins that potentially interact with SARS-CoV2 proteins.

Collapse

Frasca M, Grossi G, Gliozzo J, Mesiti M, Notaro M, Perlasca P, Petrini A, Valentini G. A GPU-based algorithm for fast node label learning in large and unbalanced biomolecular networks. BMC Bioinformatics 2018;19:353. [PMID: 30367594 PMCID: PMC6191976 DOI: 10.1186/s12859-018-2301-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open

Abstract

BACKGROUND

Several problems in network biology and medicine can be cast into a framework where entities are represented through partially labeled networks, and the aim is inferring the labels (usually binary) of the unlabeled part. Connections represent functional or genetic similarity between entities, while the labellings often are highly unbalanced, that is one class is largely under-represented: for instance in the automated protein function prediction (AFP) for most Gene Ontology terms only few proteins are annotated, or in the disease-gene prioritization problem only few genes are actually known to be involved in the etiology of a given disease. Imbalance-aware approaches to accurately predict node labels in biological networks are thereby required. Furthermore, such methods must be scalable, since input data can be large-sized as, for instance, in the context of multi-species protein networks.

RESULTS

We propose a novel semi-supervised parallel enhancement of COSNET, an imbalance-aware algorithm build on Hopfield neural model recently suggested to solve the AFP problem. By adopting an efficient representation of the graph and assuming a sparse network topology, we empirically show that it can be efficiently applied to networks with millions of nodes. The key strategy to speed up the computations is to partition nodes into independent sets so as to process each set in parallel by exploiting the power of GPU accelerators. This parallel technique ensures the convergence to asymptotically stable attractors, while preserving the asynchronous dynamics of the original model. Detailed experiments on real data and artificial big instances of the problem highlight scalability and efficiency of the proposed method.

CONCLUSIONS

By parallelizing COSNET we achieved on average a speed-up of 180x in solving the AFP problem in the S. cerevisiae, Mus musculus and Homo sapiens organisms, while lowering memory requirements. In addition, to show the potential applicability of the method to huge biomolecular networks, we predicted node labels in artificially generated sparse networks involving hundreds of thousands to millions of nodes.

Collapse

Ehsani Ardakani MJ, Safaei A, Arefi Oskouie A, Haghparast H, Haghazali M, Mohaghegh Shalmani H, Peyvandi H, Naderi N, Zali MR. Evaluation of liver cirrhosis and hepatocellular carcinoma using Protein-Protein Interaction Networks. GASTROENTEROLOGY AND HEPATOLOGY FROM BED TO BENCH 2016;9:S14-S22. [PMID: 28224023 PMCID: PMC5310795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

Abstract

AIM

In the current study, we analysised only the articles that investigate serum proteome profile of cirrhosis patients or HCC patients versus healthy controls.

BACKGROUND

Increased understanding of cancer biology has enabled identification of molecular events that lead to the discovery of numerous potential biomarkers in diseases. Protein-protein interaction networks is one of aspect that could elevate the understanding level of molecular events and protein connections that lead to the identification of genes and proteins associated with diseases.

METHODS

Gene expression data, including 63 gene or protein names for hepatocellular carcinoma and 29 gene or protein names for cirrhosis, were extracted from a number of previous investigations. The networks of related differentially expressed genes were explored using Cytoscape and the PPI analysis methods such as MCODE and ClueGO. Centrality and cluster screening identified hub genes, including APOE, TTR, CLU, and APOA1 in cirrhosis.

RESULTS

CLU and APOE belong to the regulation of positive regulation of neurofibrillary tangle assembly. HP and APOE involved in cellular oxidant detoxification. C4B and C4BP belong to the complement activation, classical pathway and acute inflammation response pathway. Also, it was reported TTR, TFRC, VWF, CLU, A2M, APOA1, CKAP5, ZNF648, CASP8, and HSP27 as hubs in HCC. In HCC, these include A2M that are corresponding to platelet degranulation, humoral immune response, and negative regulation of immune effector process. CLU belong to the reverse cholesterol transport, platelet degranulation and human immune response. APOA1 corresponds to the reverse cholesterol transport, platelet degranulation and humoral immune response, as well as negative regulation of immune effector process pathway.

CONCLUSION

In conclusion, this study suggests that there is a common molecular relationship between cirrhosis and hepatocellular cancer that may help with identification of target molecules for early treatment that is essential in cancer therapy.

Collapse

Yu G, Fu G, Wang J, Zhu H. Predicting Protein Function via Semantic Integration of Multiple Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2016;13:220-232. [PMID: 26800544 DOI: 10.1109/tcbb.2015.2459713] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]

Frasca M, Bertoni A, Valentini G. UNIPred: Unbalance-Aware Network Integration and Prediction of Protein Functions. J Comput Biol 2015;22:1057-74. [PMID: 26402488 DOI: 10.1089/cmb.2014.0110] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022] Open

Frasca M, Bassis S, Valentini G. Learning node labels with multi-category Hopfield networks. Neural Comput Appl 2015. [DOI: 10.1007/s00521-015-1965-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]

Sevimoglu T, Arga KY. The role of protein interaction networks in systems biomedicine. Comput Struct Biotechnol J 2014;11:22-7. [PMID: 25379140 PMCID: PMC4212283 DOI: 10.1016/j.csbj.2014.08.008] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022] Open

Mesiti M, Re M, Valentini G. Think globally and solve locally: secondary memory-based network learning for automated multi-species function prediction. Gigascience 2014;3:5. [PMID: 24843788 PMCID: PMC4006453 DOI: 10.1186/2047-217x-3-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2013] [Accepted: 04/01/2014] [Indexed: 01/08/2023] Open

Abstract

Background

Network-based learning algorithms for automated function prediction (AFP) are negatively affected by the limited coverage of experimental data and limited a priori known functional annotations. As a consequence their application to model organisms is often restricted to well characterized biological processes and pathways, and their effectiveness with poorly annotated species is relatively limited. A possible solution to this problem might consist in the construction of big networks including multiple species, but this in turn poses challenging computational problems, due to the scalability limitations of existing algorithms and the main memory requirements induced by the construction of big networks. Distributed computation or the usage of big computers could in principle respond to these issues, but raises further algorithmic problems and require resources not satisfiable with simple off-the-shelf computers.

Results

We propose a novel framework for scalable network-based learning of multi-species protein functions based on both a local implementation of existing algorithms and the adoption of innovative technologies: we solve “locally” the AFP problem, by designing “vertex-centric” implementations of network-based algorithms, but we do not give up thinking “globally” by exploiting the overall topology of the network. This is made possible by the adoption of secondary memory-based technologies that allow the efficient use of the large memory available on disks, thus overcoming the main memory limitations of modern off-the-shelf computers. This approach has been applied to the analysis of a large multi-species network including more than 300 species of bacteria and to a network with more than 200,000 proteins belonging to 13 Eukaryotic species. To our knowledge this is the first work where secondary-memory based network analysis has been applied to multi-species function prediction using biological networks with hundreds of thousands of proteins.

Conclusions

The combination of these algorithmic and technological approaches makes feasible the analysis of large multi-species networks using ordinary computers with limited speed and primary memory, and in perspective could enable the analysis of huge networks (e.g. the whole proteomes available in SwissProt), using well-equipped stand-alone machines.

Collapse