1. Zitnik M, Li MM, Wells A, Glass K, Morselli Gysi D, Krishnan A, Murali TM, Radivojac P, Roy S, Baudot A, Bozdag S, Chen DZ, Cowen L, Devkota K, Gitter A, Gosline SJC, Gu P, Guzzi PH, Huang H, Jiang M, Kesimoglu ZN, Koyuturk M, Ma J, Pico AR, Pržulj N, Przytycka TM, Raphael BJ, Ritz A, Sharan R, Shen Y, Singh M, Slonim DK, Tong H, Yang XH, Yoon BJ, Yu H, Milenković T. Current and future directions in network biology. Bioinformatics Advances 2024; 4:vbae099. [PMID: 39143982 PMCID: PMC11321866 DOI: 10.1093/bioadv/vbae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 05/31/2024] [Accepted: 07/08/2024] [Indexed: 08/16/2024]
Abstract
Summary: Network biology is an interdisciplinary field bridging computational and biological sciences that has proved pivotal in advancing the understanding of cellular functions and diseases across biological systems and scales. Although the field has been around for two decades, it remains nascent. It has witnessed rapid evolution, accompanied by emerging challenges. These stem from various factors, notably the growing complexity and volume of data together with the increased diversity of data types describing different tiers of biological organization. We discuss prevailing research directions in network biology, focusing on molecular/cellular networks but also on other biological network types such as biomedical knowledge graphs, patient similarity networks, brain networks, and social/contact networks relevant to disease spread. In more detail, we highlight areas of inference and comparison of biological networks, multimodal data integration and heterogeneous networks, higher-order network analysis, machine learning on networks, and network-based personalized medicine. Following the overview of recent breakthroughs across these five areas, we offer a perspective on future directions of network biology. Additionally, we discuss scientific communities, educational initiatives, and the importance of fostering diversity within the field. This article establishes a roadmap for an immediate and long-term vision for network biology.
Availability and implementation: Not applicable.
Affiliation(s)
- Marinka Zitnik
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
- Michelle M Li
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
- Aydin Wells
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
  - Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
- Kimberly Glass
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Deisy Morselli Gysi
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
  - Department of Statistics, Federal University of Paraná, Curitiba, Paraná 81530-015, Brazil
  - Department of Physics, Northeastern University, Boston, MA 02115, United States
- Arjun Krishnan
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, United States
- T M Murali
  - Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States
- Predrag Radivojac
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, United States
- Sushmita Roy
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
  - Wisconsin Institute for Discovery, Madison, WI 53715, United States
- Anaïs Baudot
  - Aix Marseille Université, INSERM, MMG, Marseille, France
- Serdar Bozdag
  - Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
  - Department of Mathematics, University of North Texas, Denton, TX 76203, United States
- Danny Z Chen
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Lenore Cowen
  - Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Kapil Devkota
  - Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Anthony Gitter
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
  - Morgridge Institute for Research, Madison, WI 53715, United States
- Sara J C Gosline
  - Biological Sciences Division, Pacific Northwest National Laboratory, Seattle, WA 98109, United States
- Pengfei Gu
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Pietro H Guzzi
  - Department of Medical and Surgical Sciences, University Magna Graecia of Catanzaro, Catanzaro, 88100, Italy
- Heng Huang
  - Department of Computer Science, University of Maryland College Park, College Park, MD 20742, United States
- Meng Jiang
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Ziynet Nesibe Kesimoglu
  - Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
  - National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
- Mehmet Koyuturk
  - Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH 44106, United States
- Jian Ma
  - Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, United States
- Alexander R Pico
  - Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, United States
- Nataša Pržulj
  - Department of Computer Science, University College London, London, WC1E 6BT, England
  - ICREA, Catalan Institution for Research and Advanced Studies, Barcelona, 08010, Spain
  - Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain
- Teresa M Przytycka
  - National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
- Anna Ritz
  - Department of Biology, Reed College, Portland, OR 97202, United States
- Roded Sharan
  - School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
- Yang Shen
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
- Mona Singh
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, United States
- Donna K Slonim
  - Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Hanghang Tong
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
- Xinan Holly Yang
  - Department of Pediatrics, University of Chicago, Chicago, IL 60637, United States
- Byung-Jun Yoon
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
  - Computational Science Initiative, Brookhaven National Laboratory, Upton, NY 11973, United States
- Haiyuan Yu
  - Department of Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, United States
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
  - Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
2. Khoo LS, Lim MK, Chong CY, McNaney R. Machine Learning for Multimodal Mental Health Detection: A Systematic Review of Passive Sensing Approaches. Sensors (Basel, Switzerland) 2024; 24:348. [PMID: 38257440 PMCID: PMC10820860 DOI: 10.3390/s24020348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 12/14/2023] [Accepted: 12/18/2023] [Indexed: 01/24/2024]
Abstract
As mental health (MH) disorders become increasingly prevalent, their multifaceted symptoms and comorbidities with other conditions introduce complexity to diagnosis, posing a risk of underdiagnosis. While machine learning (ML) has been explored to mitigate these challenges, we hypothesized that multiple data modalities support more comprehensive detection and that non-intrusive collection approaches better capture natural behaviors. To understand the current trends, we systematically reviewed 184 studies to assess feature extraction, feature fusion, and ML methodologies applied to detect MH disorders from passively sensed multimodal data, including audio and video recordings, social media, smartphones, and wearable devices. Our findings revealed varying correlations of modality-specific features in individualized contexts, potentially influenced by demographics and personalities. We also observed the growing adoption of neural network architectures for model-level fusion and as ML algorithms, which have demonstrated promising efficacy in handling high-dimensional features while modeling within and cross-modality relationships. This work provides future researchers with a clear taxonomy of methodological approaches to multimodal detection of MH disorders to inspire future methodological advancements. The comprehensive analysis also guides and supports future researchers in making informed decisions to select an optimal data source that aligns with specific use cases based on the MH disorder of interest.
Affiliation(s)
- Lin Sze Khoo
  - Department of Human-Centered Computing, Faculty of Information Technology, Monash University, Clayton, VIC 3800, Australia
- Mei Kuan Lim
  - School of Information Technology, Monash University Malaysia, Subang Jaya 46150, Malaysia
- Chun Yong Chong
  - School of Information Technology, Monash University Malaysia, Subang Jaya 46150, Malaysia
- Roisin McNaney
  - Department of Human-Centered Computing, Faculty of Information Technology, Monash University, Clayton, VIC 3800, Australia
3. Ding K, Wang S, Luo Y. Supervised biological network alignment with graph neural networks. Bioinformatics 2023; 39:i465-i474. [PMID: 37387160 PMCID: PMC10311300 DOI: 10.1093/bioinformatics/btad241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023]
Abstract
MOTIVATION: Despite the advances in sequencing technology, massive proteins with known sequences remain functionally unannotated. Biological network alignment (NA), which aims to find the node correspondence between species' protein-protein interaction (PPI) networks, has been a popular strategy to uncover missing annotations by transferring functional knowledge across species. Traditional NA methods assumed that topologically similar proteins in PPIs are functionally similar. However, it was recently reported that functionally unrelated proteins can be as topologically similar as functionally related pairs, and a new data-driven or supervised NA paradigm has been proposed, which uses protein function data to discern which topological features correspond to functional relatedness.
RESULTS: Here, we propose GraNA, a deep learning framework for the supervised NA paradigm for the pairwise NA problem. Employing graph neural networks, GraNA utilizes within-network interactions and across-network anchor links for learning protein representations and predicting functional correspondence between across-species proteins. A major strength of GraNA is its flexibility to integrate multi-faceted non-functional relationship data, such as sequence similarity and ortholog relationships, as anchor links to guide the mapping of functionally related proteins across species. Evaluating GraNA on a benchmark dataset composed of several NA tasks between different pairs of species, we observed that GraNA accurately predicted the functional relatedness of proteins and robustly transferred functional annotations across species, outperforming a number of existing NA methods. When applied to a case study on a humanized yeast network, GraNA also successfully discovered functionally replaceable human-yeast protein pairs that were documented in previous studies.
AVAILABILITY AND IMPLEMENTATION: The code of GraNA is available at https://github.com/luo-group/GraNA.
Affiliation(s)
- Kerr Ding
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332, United States
- Sheng Wang
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195, United States
- Yunan Luo
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332, United States
4. Tran TD, Nguyen MT. C-Biomarker.net: A Cytoscape app for the identification of cancer biomarker genes from cores of large biomolecular networks. Biosystems 2023; 226:104887. [PMID: 36990379 DOI: 10.1016/j.biosystems.2023.104887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2022] [Revised: 03/22/2023] [Accepted: 03/24/2023] [Indexed: 03/30/2023]
Abstract
Although there have been many studies revealing that biomarker genes for early cancer detection can be found in biomolecular networks, no proper tool exists to discover the cancer biomarker genes from various biomolecular networks. Accordingly, we developed a novel Cytoscape app called C-Biomarker.net, which can identify cancer biomarker genes from cores of various biomolecular networks. Derived from recent research, we designed and implemented the software based on parallel algorithms proposed in this study for working on high-performance computing devices. We tested our software on various network sizes and found the suitable size for each running mode on CPU or GPU. Interestingly, using the software for 17 cancer signaling pathways, we found that on average 70.59% of the top three nodes residing at the innermost core of each pathway are biomarker genes of the cancer respectively to the pathway. Similarly, by the software, we also found 100% of the top ten nodes at both cores of Human Gene Regulatory (HGR) network and Human Protein-Protein Interaction (HPPI) network are multi-cancer biomarkers. These case studies are reliable evidence for performance of cancer biomarker prediction function in the software. Through the case studies, we also suggest that true cores of directed complex networks should be identified by the algorithm of R-core rather than K-core as usual. Finally, we compared the prediction result of our software with those of other researchers and confirmed that our prediction method outperforms the other methods. Taken together, C-Biomarker.net is a reliable tool that efficiently detects biomarker nodes from cores of various large biomolecular networks. The software is available at https://github.com/trantd/C-Biomarker.net.
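For readers unfamiliar with the K-core notion mentioned in this abstract, the following is a minimal sketch of standard K-core peeling on an undirected graph. It is not C-Biomarker.net's parallel implementation and it does not implement the R-core algorithm that the authors recommend for directed networks; the function names `core_numbers` and `innermost_core` and the toy edge list are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected simple graph.

    Standard peeling: repeatedly remove a node of minimum remaining degree.
    A node's core number is the largest minimum degree seen up to its removal.
    This simple version is quadratic in the number of nodes; it is meant to
    be readable, not fast.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Pick a node of minimum remaining degree (ties broken by name).
        n = min(remaining, key=lambda x: (degree[x], str(x)))
        k = max(k, degree[n])
        core[n] = k
        remaining.remove(n)
        for m in adj[n]:
            if m in remaining:
                degree[m] -= 1
    return core


def innermost_core(edges):
    """Return the set of nodes with the maximum core number."""
    core = core_numbers(edges)
    if not core:
        return set()
    k_max = max(core.values())
    return {n for n, k in core.items() if k == k_max}


# Hypothetical toy network: a 4-clique (A-D) with a short tail (E, F).
toy_edges = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"),
]
```

On the toy network, `innermost_core(toy_edges)` returns the four clique nodes, which is the kind of "innermost core" from which the abstract reports ranking candidate biomarker nodes.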
5. Cordier BA, Sawaya NPD, Guerreschi GG, McWeeney SK. Biology and medicine in the landscape of quantum advantages. J R Soc Interface 2022; 19:20220541. [PMID: 36448288 PMCID: PMC9709576 DOI: 10.1098/rsif.2022.0541] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/26/2022] [Accepted: 11/04/2022] [Indexed: 12/03/2022]
Abstract
Quantum computing holds substantial potential for applications in biology and medicine, spanning from the simulation of biomolecules to machine learning methods for subtyping cancers on the basis of clinical features. This potential is encapsulated by the concept of a quantum advantage, which is contingent on a reduction in the consumption of a computational resource, such as time, space or data. Here, we distill the concept of a quantum advantage into a simple framework to aid researchers in biology and medicine pursuing the development of quantum applications. We then apply this framework to a wide variety of computational problems relevant to these domains in an effort to (i) assess the potential of practical advantages in specific application areas and (ii) identify gaps that may be addressed with novel quantum approaches. In doing so, we provide an extensive survey of the intersection of biology and medicine with the current landscape of quantum algorithms and their potential advantages. While we endeavour to identify specific computational problems that may admit practical advantages throughout this work, the rapid pace of change in the fields of quantum computing, classical algorithms and biological research implies that this intersection will remain highly dynamic for the foreseeable future.
Affiliation(s)
- Benjamin A. Cordier
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR 97202, USA
- Shannon K. McWeeney
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR 97202, USA
  - Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97202, USA
  - Oregon Clinical and Translational Research Institute, Oregon Health and Science University, Portland, OR 97202, USA
6. Mao R, O’Leary J, Mesbah A, Mittal J. A Deep Learning Framework Discovers Compositional Order and Self-Assembly Pathways in Binary Colloidal Mixtures. JACS Au 2022; 2:1818-1828. [PMID: 36032540 PMCID: PMC9400045 DOI: 10.1021/jacsau.2c00111] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/20/2023]
Abstract
Binary colloidal superlattices (BSLs) have demonstrated enormous potential for the design of advanced multifunctional materials that can be synthesized via colloidal self-assembly. However, mechanistic understanding of the three-dimensional self-assembly of BSLs is largely limited due to a lack of tractable strategies for characterizing the many two-component structures that can appear during the self-assembly process. To address this gap, we present a framework for colloidal crystal structure characterization that uses branched graphlet decomposition with deep learning to systematically and quantitatively describe the self-assembly of BSLs at the single-particle level. Branched graphlet decomposition is used to evaluate local structure via high-dimensional neighborhood graphs that quantify both structural order (e.g., body-centered-cubic vs face-centered-cubic) and compositional order (e.g., substitutional defects) of each individual particle. Deep autoencoders are then used to efficiently translate these neighborhood graphs into low-dimensional manifolds from which relationships among neighborhood graphs can be more easily inferred. We demonstrate the framework on in silico systems of DNA-functionalized particles, in which two well-recognized design parameters, particle size ratio and interparticle potential well depth can be adjusted independently. The framework reveals that binary colloidal mixtures with small interparticle size disparities (i.e., A- and B-type particle radius ratios of r_A/r_B = 0.8 to r_A/r_B = 0.95) can promote the self-assembly of defect-free BSLs much more effectively than systems of identically sized particles, as nearly defect-free BCC-CsCl, FCC-CuAu, and IrV crystals are observed in the former case. The framework additionally reveals that size-disparate colloidal mixtures can undergo nonclassical nucleation pathways where BSLs evolve from dense amorphous precursors, instead of directly nucleating from dilute solution. These findings illustrate that the presented characterization framework can assist in enhancing mechanistic understanding of the self-assembly of binary colloidal mixtures, which in turn can pave the way for engineering the growth of defect-free BSLs.
Affiliation(s)
- Runfang Mao
  - Department of Chemical and Biomolecular Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Jared O’Leary
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, California 94720, United States
- Ali Mesbah
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, California 94720, United States
- Jeetain Mittal
  - Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843, United States
7. Li Q, Milenkovic T. Supervised Prediction of Aging-Related Genes From a Context-Specific Protein Interaction Subnetwork. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2484-2498. [PMID: 33929964 DOI: 10.1109/tcbb.2021.3076961] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Human aging is linked to many prevalent diseases. The aging process is highly influenced by genetic factors. Hence, it is important to identify human aging-related genes. We focus on supervised prediction of such genes. Gene expression-based methods for this purpose study genes in isolation from each other. While protein-protein interaction (PPI) network-based methods for this purpose account for interactions between genes' protein products, current PPI network data are context-unspecific, spanning different biological conditions. Instead, here, we focus on an aging-specific subnetwork of the entire PPI network, obtained by integrating aging-specific gene expression data and PPI network data. The potential of such data integration has been recognized but mostly in the context of cancer. So, we are the first to propose a supervised learning framework for predicting aging-related genes from an aging-specific PPI subnetwork. In a systematic and comprehensive evaluation, we find that in many of the evaluation tests: (i) using an aging-specific subnetwork indeed yields more accurate aging-related gene predictions than using the entire network, and (ii) predictive methods from our framework that have not previously been used for supervised prediction of aging-related genes outperform existing prominent methods for the same purpose. These results justify the need for our framework.
8. Ma L, Shao Z, Li L, Huang J, Wang S, Lin Q, Li J, Gong M, Nandi AK. Heuristics and metaheuristics for biological network alignment: A review. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.08.156] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/26/2022]
9. Guzzi PH, Tradigo G, Veltri P. Using dual-network-analyser for communities detecting in dual networks. BMC Bioinformatics 2022; 22:614. [PMID: 35012460 PMCID: PMC8750846 DOI: 10.1186/s12859-022-04564-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/09/2021] [Accepted: 01/03/2022] [Indexed: 12/25/2022]
Abstract
BACKGROUND: Representations of the relationships among data using networks are widely used in several research fields such as computational biology, medical informatics and social network mining. Recently, complex networks have been introduced to better capture the insights of the modelled scenarios. Among others, dual networks (DNs) consist of mapping information as pairs of networks containing the same set of nodes but with different edges: one, called physical network, has unweighted edges, while the other, called conceptual network, has weighted edges.
RESULTS: We focus on DNs and we propose a tool to find common subgraphs (aka communities) in DNs with particular properties. The tool, called Dual-Network-Analyser, is based on the identification of communities that induce optimal modular subgraphs in the conceptual network and connected subgraphs in the physical one. It includes the Louvain algorithm applied to the considered case. The Dual-Network-Analyser can be used to study DNs, to find common modular communities. We report results on using the tool to identify communities on synthetic DNs as well as real cases in social networks and biological data.
CONCLUSION: The proposed method has been tested by using synthetic and biological networks. Results demonstrate that it is well able to detect meaningful information from DNs.
Affiliation(s)
- Pietro Hiram Guzzi
  - Department of Surgical and Medical Sciences, Magna Graecia University, 88100 Catanzaro, Italy
- Pierangelo Veltri
  - Department of Surgical and Medical Sciences, Magna Graecia University, 88100 Catanzaro, Italy
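The dual-network abstract for this entry describes candidate communities that must be connected in the unweighted physical network while being strongly tied in the weighted conceptual network. The sketch below only scores one candidate node set under that constraint. It does not implement the Louvain algorithm or the search used by Dual-Network-Analyser, and the score (total conceptual weight divided by the number of nodes) is an assumed stand-in for the tool's modularity objective. All function names and the toy data are hypothetical.

```python
from collections import deque


def is_connected(nodes, physical_edges):
    """True if `nodes` induce a connected subgraph of the physical network."""
    nodes = set(nodes)
    if not nodes:
        return False
    adj = {n: set() for n in nodes}
    for u, v in physical_edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    # Breadth-first search from an arbitrary node of the candidate set.
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == nodes


def conceptual_weight(nodes, conceptual_edges):
    """Sum of conceptual edge weights with both endpoints in `nodes`."""
    nodes = set(nodes)
    return sum(w for u, v, w in conceptual_edges if u in nodes and v in nodes)


def score_candidate(nodes, physical_edges, conceptual_edges):
    """Average conceptual weight per node, or None if the candidate is
    not connected in the physical network (assumed scoring rule)."""
    if not is_connected(nodes, physical_edges):
        return None
    return conceptual_weight(nodes, conceptual_edges) / len(set(nodes))


# Hypothetical toy dual network over nodes a-e.
physical = [("a", "b"), ("b", "c"), ("d", "e")]
conceptual = [
    ("a", "b", 1.0), ("b", "c", 0.5), ("a", "c", 0.75),
    ("c", "d", 0.25), ("d", "e", 0.5),
]
```

With this toy data, `score_candidate({"a", "b", "c"}, physical, conceptual)` gives 0.75, while `{"c", "d"}` is rejected because the two nodes share a conceptual edge but no physical path.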
10. Gupta R, Kumar P. CREB1 K292 and HINFP K330 as Putative Common Therapeutic Targets in Alzheimer's and Parkinson's Disease. ACS Omega 2021; 6:35780-35798. [PMID: 34984308 PMCID: PMC8717564 DOI: 10.1021/acsomega.1c05827] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/18/2021] [Accepted: 12/07/2021] [Indexed: 05/16/2023]
Abstract
Integration of omics data and deciphering the mechanism of a biological regulatory network could be a promising approach to reveal the molecular mechanism involved in the progression of complex diseases, including Alzheimer's and Parkinson's. Despite having an overlapping mechanism in the etiology of Alzheimer's disease (AD) and Parkinson's disease (PD), the exact mechanism and signaling molecules behind them are still unknown. Further, the acetylation mechanism and histone deacetylase (HDAC) enzymes provide a positive direction toward studying the shared phenomenon between AD and PD pathogenesis. For instance, increased expression of HDACs causes a decrease in protein acetylation status, resulting in decreased cognitive and memory function. Herein, we employed an integrative approach to analyze the transcriptomics data that established a potential relationship between AD and PD. Data preprocessing and analysis of four publicly available microarray datasets revealed 10 HUB proteins, namely, CDC42, CD44, FGFR1, MYO5A, NUMA1, TUBB4B, ARHGEF9, USP5, INPP5D, and NUP93, that may be involved in the shared mechanism of AD and PD pathogenesis. Further, we identified the relationship between the HUB proteins and transcription factors that could be involved in the overlapping mechanism of AD and PD. CREB1 and HINFP were the crucial regulatory transcription factors that were involved in the AD and PD crosstalk. Further, lysine acetylation sites and HDAC enzyme prediction revealed the involvement of 15 and 27 potential lysine residues of CREB1 and HINFP, respectively. Our results highlighted the importance of HDAC1(K292) and HDAC6(K330) association with CREB1 and HINFP, respectively, in the AD and PD crosstalk. However, different datasets with a large number of samples and wet lab experimentation are required to validate and pinpoint the exact role of CREB1 and HINFP in the AD and PD crosstalk. 
It is also possible that the different datasets may or may not affect the results due to analysis parameters. In conclusion, our study potentially highlighted the crucial proteins, transcription factors, biological pathways, lysine residues, and HDAC enzymes shared between AD and PD at the molecular level. The findings can be used to study molecular studies to identify the possible relationship in the AD-PD crosstalk.
Affiliation(s)
- Rohan Gupta
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Delhi 110042, India
- Pravir Kumar
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Delhi 110042, India
11. Gu S, Milenković T. Data-driven biological network alignment that uses topological, sequence, and functional information. BMC Bioinformatics 2021; 22:34. [PMID: 33514304 PMCID: PMC7847157 DOI: 10.1186/s12859-021-03971-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2020] [Accepted: 01/15/2021] [Indexed: 11/15/2022]
Abstract
BACKGROUND: Network alignment (NA) can transfer functional knowledge between species' conserved biological network regions. Traditional NA assumes that it is topological similarity (isomorphic-like matching) between network regions that corresponds to the regions' functional relatedness. However, we recently found that functionally unrelated proteins are as topologically similar as functionally related proteins. So, we redefined NA as a data-driven method called TARA, which learns from network and protein functional data what kind of topological relatedness (rather than similarity) between proteins corresponds to their functional relatedness. TARA used topological information (within each network) but not sequence information (between proteins across networks). Yet, TARA yielded higher protein functional prediction accuracy than existing NA methods, even those that used both topological and sequence information.
RESULTS: Here, we propose TARA++ that is also data-driven, like TARA and unlike other existing methods, but that uses across-network sequence information on top of within-network topological information, unlike TARA. To deal with the within-and-across-network analysis, we adapt social network embedding to the problem of biological NA. TARA++ outperforms protein functional prediction accuracy of existing methods.
CONCLUSIONS: As such, combining research knowledge from different domains is promising. Overall, improvements in protein functional prediction have biomedical implications, for example allowing researchers to better understand how cancer progresses or how humans age.
Affiliation(s)
- Shawn Gu
  - Department of Computer Science and Engineering, Eck Institute for Global Health, Center for Network and Data Science, University of Notre Dame, Notre Dame, IN, 46556, USA
- Tijana Milenković
  - Department of Computer Science and Engineering, Eck Institute for Global Health, Center for Network and Data Science, University of Notre Dame, Notre Dame, IN, 46556, USA
12. Shi H, Yan KK, Ding L, Qian C, Chi H, Yu J. Network Approaches for Dissecting the Immune System. iScience 2020; 23:101354. [PMID: 32717640 PMCID: PMC7390880 DOI: 10.1016/j.isci.2020.101354] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 05/09/2020] [Revised: 06/21/2020] [Accepted: 07/08/2020] [Indexed: 02/06/2023]
Abstract
The immune system is a complex biological network composed of hierarchically organized genes, proteins, and cellular components that combat external pathogens and monitor the onset of internal disease. To meet and ultimately defeat these challenges, the immune system orchestrates an exquisitely complex interplay of numerous cells, often with highly specialized functions, in a tissue-specific manner. One of the major methodologies of systems immunology is to measure quantitatively the components and interaction levels in the immunologic networks to construct a computational network and predict the response of the components to perturbations. The recent advances in high-throughput sequencing techniques have provided us with a powerful approach to dissecting the complexity of the immune system. Here we summarize the latest progress in integrating omics data and network approaches to construct networks and to infer the underlying signaling and transcriptional landscape, as well as cell-cell communication, in the immune system, with a focus on hematopoiesis, adaptive immunity, and tumor immunology. Understanding the network regulation of immune cells has provided new insights into immune homeostasis and disease, with important therapeutic implications for inflammation, cancer, and other immune-mediated disorders.
Affiliation(s)
- Hao Shi
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Koon-Kiu Yan
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Liang Ding
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Chenxi Qian
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Hongbo Chi
  - Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Jiyang Yu
  - Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
|
13
|
Abstract
In this study, we deal with the problem of biological network alignment (NA), which aims to find a node mapping between species' molecular networks that uncovers similar network regions, thus allowing for the transfer of functional knowledge between the aligned nodes. We provide evidence that current NA methods, which assume that topologically similar nodes (i.e., nodes whose network neighborhoods are isomorphic-like) have high functional relatedness, do not actually end up aligning functionally related nodes. That is, we show that the current topological similarity assumption does not hold well. Consequently, we argue that a paradigm shift is needed in how the NA problem is approached. So, we redefine NA as a data-driven framework, called TARA (data-driven NA), which attempts to learn the relationship between topological relatedness and functional relatedness without assuming that topological relatedness corresponds to topological similarity. TARA makes no assumptions about what nodes should be aligned, distinguishing it from existing NA methods. Specifically, TARA trains a classifier to predict whether two nodes from different networks are functionally related based on their network topological patterns (features). We find that TARA is able to make accurate predictions. TARA then takes each pair of nodes that are predicted as related to be part of an alignment. Like traditional NA methods, TARA uses this alignment for the across-species transfer of functional knowledge. TARA as currently implemented uses topological but not protein sequence information for functional knowledge transfer. In this context, we find that TARA outperforms existing state-of-the-art NA methods that also use topological information, WAVE and SANA, and even outperforms or complements a state-of-the-art NA method that uses both topological and sequence information, PrimAlign. Hence, adding sequence information to TARA, which is our future work, is likely to further improve its performance. The software and data are available at http://www.nd.edu/~cone/TARA/.
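The data-driven idea described in the abstract above (train a classifier on cross-network node pairs, then treat every pair predicted as related as part of the alignment) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not TARA's implementation: the features (degree and mean neighbor degree), the nearest-centroid classifier, the toy networks, the labels, and all function names are hypothetical stand-ins for the topological features and classifier the authors actually use.

```python
from itertools import product


def topo_features(adj, node):
    """Toy topological features of a node: degree and mean neighbor degree.
    (Hypothetical stand-in for richer topological patterns.)"""
    nbrs = adj[node]
    deg = len(nbrs)
    mean_nbr_deg = sum(len(adj[n]) for n in nbrs) / deg if deg else 0.0
    return (float(deg), mean_nbr_deg)


def pair_features(adj_a, u, adj_b, v):
    """Concatenate both nodes' features, so the classifier is free to learn
    any relation between them rather than being told to reward similarity."""
    return topo_features(adj_a, u) + topo_features(adj_b, v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(adj_a, adj_b, labeled_pairs):
    """Fit a nearest-centroid classifier from {(u, v): is_related} labels.
    Assumes at least one related and one unrelated example."""
    pos = [pair_features(adj_a, u, adj_b, v)
           for (u, v), related in labeled_pairs.items() if related]
    neg = [pair_features(adj_a, u, adj_b, v)
           for (u, v), related in labeled_pairs.items() if not related]
    return centroid(pos), centroid(neg)


def align(adj_a, adj_b, model):
    """Alignment = every cross-network pair predicted as related
    (the result may be many-to-many)."""
    pos_c, neg_c = model
    result = set()
    for u, v in product(adj_a, adj_b):
        f = pair_features(adj_a, u, adj_b, v)
        if sq_dist(f, pos_c) < sq_dist(f, neg_c):
            result.add((u, v))
    return result


# Two toy networks as adjacency sets, plus a few hypothetical functional labels.
adj_a = {"a1": {"a2", "a3", "a4"}, "a2": {"a1"}, "a3": {"a1"}, "a4": {"a1"}}
adj_b = {"b1": {"b2", "b3"}, "b2": {"b1"}, "b3": {"b1"}}
labels = {("a1", "b1"): True, ("a2", "b2"): False, ("a3", "b1"): False}

model = train(adj_a, adj_b, labels)
alignment = align(adj_a, adj_b, model)
print(sorted(alignment))
```

In this sketch the alignment is just a set of predicted-related pairs, so one node may be paired with several nodes in the other network; functional knowledge would then be transferred along those pairs.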
Affiliation(s)
- Shawn Gu
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, United States of America
  - Center for Network and Data Science, University of Notre Dame, Notre Dame, IN, United States of America
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, United States of America
  - Center for Network and Data Science, University of Notre Dame, Notre Dame, IN, United States of America
|
14
|
Newaz K, Ghalehnovi M, Rahnama A, Antsaklis PJ, Milenković T. Network-based protein structural classification. ROYAL SOCIETY OPEN SCIENCE 2020; 7:191461. [PMID: 32742675 PMCID: PMC7353965 DOI: 10.1098/rsos.191461] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/23/2019] [Accepted: 05/05/2020] [Indexed: 06/11/2023]
Abstract
Experimental determination of protein function is resource-consuming. As an alternative, computational prediction of protein function has received attention. In this context, protein structural classification (PSC) can help, by allowing for determining structural classes of currently unclassified proteins based on their features, and then relying on the fact that proteins with similar structures have similar functions. Existing PSC approaches rely on sequence-based or direct three-dimensional (3D) structure-based protein features. By contrast, we first model 3D structures of proteins as protein structure networks (PSNs). Then, we use network-based features for PSC. We propose the use of graphlets, state-of-the-art features in many research areas of network science, in the task of PSC. Moreover, because graphlets can deal only with unweighted PSNs, and because accounting for edge weights when constructing PSNs could improve PSC accuracy, we also propose a deep learning framework that automatically learns network features from weighted PSNs. When evaluated on a large set of approximately 9400 CATH and approximately 12 800 SCOP protein domains (spanning 36 PSN sets), the best of our proposed approaches are superior to existing PSC approaches in terms of accuracy, with comparable running times. Our data and code are available at https://doi.org/10.5281/zenodo.3787922.
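As a rough illustration of the pipeline sketched in the abstract above (3D structure, then protein structure network, then graphlet-based features), the following toy example builds an unweighted PSN and counts the two 3-node graphlets. The abstract does not say how PSNs are constructed, so the distance-cutoff rule, the cutoff value, the coordinates, and the function names are all assumptions made for illustration; the paper's own construction and feature set may differ.

```python
from itertools import combinations
from math import dist


def build_psn(coords, cutoff):
    """Unweighted PSN under an assumed rule: one node per residue, and an
    edge whenever two residues lie closer than `cutoff`."""
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) < cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def count_3node_graphlets(adj):
    """Count induced 3-node paths and triangles (the two 3-node graphlets).
    Brute force over all node triples; fine for toy inputs only."""
    counts = {"path": 0, "triangle": 0}
    for a, b, c in combinations(adj, 3):
        n_edges = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if n_edges == 3:
            counts["triangle"] += 1
        elif n_edges == 2:
            counts["path"] += 1
    return counts


# Hypothetical residue coordinates (arbitrary units) and cutoff.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (2.0, 0.0, 0.0)]
psn = build_psn(coords, cutoff=1.5)
graphlet_counts = count_3node_graphlets(psn)
print(graphlet_counts)
```

A count vector like this could serve as a simple network-based feature for a structural classifier. It discards edge weights, which is the limitation the abstract cites as motivation for learning features from weighted PSNs instead.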
Affiliation(s)
- Khalique Newaz
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
  - Center for Network and Data Science, University of Notre Dame, Notre Dame, IN 46556, USA
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
- Mahboobeh Ghalehnovi
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Arash Rahnama
  - Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Panos J. Antsaklis
  - Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
  - Center for Network and Data Science, University of Notre Dame, Notre Dame, IN 46556, USA
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
|
15
|
Milano M, Milenković T, Cannataro M, Guzzi PH. L-HetNetAligner: A novel algorithm for Local Alignment of Heterogeneous Biological Networks. Sci Rep 2020; 10:3901. [PMID: 32127586 PMCID: PMC7054427 DOI: 10.1038/s41598-020-60737-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/08/2018] [Accepted: 02/11/2020] [Indexed: 11/10/2022]
Abstract
Networks are widely used for modelling and analysing a wide range of biological data. As a consequence, many different research efforts have resulted in the introduction of a large number of algorithms for analysis and comparison of networks. Many of these algorithms can deal with networks with a single class of nodes and edges, also referred to as homogeneous networks. Recently, many different approaches have tried to integrate the interplay of different molecules into a single model. A possible formalism to model such a scenario comes from node/edge-coloured networks (also known as heterogeneous networks), implemented as node/edge-coloured graphs. Hence, algorithms able to compare heterogeneous networks are needed. We here focus on the local comparison of heterogeneous networks, and we formulate it as a network alignment problem. To the best of our knowledge, the local alignment of heterogeneous networks has not been explored in the past. We here propose L-HetNetAligner, a novel algorithm that receives as input two heterogeneous networks (node-coloured graphs) and builds a local alignment of them. We also implemented and tested our algorithm. Our results confirm that our method builds high-quality alignments. The following website *contains Supplementary File 1 material and the code.
Affiliation(s)
- Marianna Milano
  - Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, 88040, Italy
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana, USA
- Mario Cannataro
  - Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, 88040, Italy
  - Data Analytics Research Center, University of Catanzaro, Catanzaro, Italy
- Pietro Hiram Guzzi
  - Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, 88040, Italy
  - Data Analytics Research Center, University of Catanzaro, Catanzaro, Italy
|
16
|
Vijayan V, Gu S, Krebs ET, Meng L, Milenković T. Pairwise Versus Multiple Global Network Alignment. IEEE ACCESS : PRACTICAL INNOVATIONS, OPEN SOLUTIONS 2020; 8:41961-41974. [PMID: 33747670 PMCID: PMC7971151 DOI: 10.1109/access.2020.2976487] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/12/2023]
Abstract
Biological network alignment (NA) aims to identify similar regions between molecular networks of different species. NA can be local or global. In line with the recent trend in the NA field, we focus on global NA, which can be pairwise (PNA) or multiple (MNA). PNA produces aligned node pairs between two networks. MNA produces aligned node clusters between more than two networks. Recently, the focus has shifted from PNA to MNA, because MNA captures conserved regions between more networks than PNA (and MNA is thus hypothesized to yield higher-quality alignments), though at higher computational complexity. The issue is that, due to the different outputs of PNA and MNA, a PNA method is only compared to other PNA methods, and an MNA method is only compared to other MNA methods. Comparison of PNA against MNA must be done to evaluate whether MNA indeed yields higher-quality alignments, as only this would justify MNA's higher computational complexity. We introduce a framework that allows for this. We evaluate eight prominent PNA and MNA methods, on synthetic and real-world biological networks, using topological and functional alignment quality measures. We compare PNA against MNA in both a pairwise (native to PNA) and multiple (native to MNA) manner. PNA is expected to perform better under the pairwise evaluation framework. Indeed, this is what we find. MNA is expected to perform better under the multiple evaluation framework. Shockingly, we find that this does not always hold; PNA is often better than MNA in this framework, depending on the choice of evaluation test.
Affiliation(s)
- Vipin Vijayan
  - Center for Network and Data Science, Department of Computer Science and Engineering, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
- Shawn Gu
  - Center for Network and Data Science, Department of Computer Science and Engineering, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
- Eric T Krebs
  - Center for Network and Data Science, Department of Computer Science and Engineering, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
- Lei Meng
  - Center for Network and Data Science, Department of Computer Science and Engineering, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
- Tijana Milenković
  - Center for Network and Data Science, Department of Computer Science and Engineering, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
|
17
|
Koutrouli M, Karatzas E, Paez-Espino D, Pavlopoulos GA. A Guide to Conquer the Biological Network Era Using Graph Theory. Front Bioeng Biotechnol 2020; 8:34. [PMID: 32083072 PMCID: PMC7004966 DOI: 10.3389/fbioe.2020.00034] [Citation(s) in RCA: 99] [Impact Index Per Article: 24.8] [Received: 10/11/2019] [Accepted: 01/15/2020] [Indexed: 12/24/2022]
Abstract
Networks are one of the most common ways to represent biological systems as complex sets of binary interactions or relations between different bioentities. In this article, we discuss the basic graph theory concepts and the various graph types, as well as the available data structures for storing and reading graphs. In addition, we describe several network properties and we highlight some of the widely used network topological features. We briefly mention the network patterns, motifs and models, and we further comment on the types of biological and biomedical networks along with their corresponding computer- and human-readable file formats. Finally, we discuss a variety of algorithms and metrics for network analyses regarding graph drawing, clustering, visualization, link prediction, perturbation, and network alignment as well as the current state-of-the-art tools. We expect this review to reach a very broad spectrum of readers varying from experts to beginners while encouraging them to enhance the field further.
Affiliation(s)
- Mikaela Koutrouli
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Evangelos Karatzas
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
  - Department of Informatics and Telecommunications, University of Athens, Athens, Greece
- David Paez-Espino
  - Lawrence Berkeley National Laboratory, Department of Energy, Joint Genome Institute, Walnut Creek, CA, United States
|
18
|
Caufield JH, Ping P. New advances in extracting and learning from protein-protein interactions within unstructured biomedical text data. Emerg Top Life Sci 2019; 3:357-369. [PMID: 33523203 DOI: 10.1042/etls20190003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/05/2019] [Revised: 07/11/2019] [Accepted: 07/16/2019] [Indexed: 12/14/2022]
Abstract
Protein-protein interactions, or PPIs, constitute a basic unit of our understanding of protein function. Though substantial effort has been made to organize PPI knowledge into structured databases, maintenance of these resources requires careful manual curation. Even then, many PPIs remain uncurated within unstructured text data. Extracting PPIs from experimental research supports assembly of PPI networks and highlights relationships crucial to elucidating protein functions. Isolating specific protein-protein relationships from numerous documents is technically demanding by both manual and automated means. Recent advances in the design of these methods have leveraged emerging computational developments and have demonstrated impressive results on test datasets. In this review, we discuss recent developments in PPI extraction from unstructured biomedical text. We explore the historical context of these developments, recent strategies for integrating and comparing PPI data, and their application to advancing the understanding of protein function. Finally, we describe the challenges facing the application of PPI mining to the text concerning protein families, using the multifunctional 14-3-3 protein family as an example.
Affiliation(s)
- J Harry Caufield
  - The NIH BD2K Center of Excellence in Biomedical Computing, University of California at Los Angeles, Los Angeles, CA 90095, U.S.A
  - Department of Physiology, University of California at Los Angeles, Los Angeles, CA 90095, U.S.A
- Peipei Ping
  - The NIH BD2K Center of Excellence in Biomedical Computing, University of California at Los Angeles, Los Angeles, CA 90095, U.S.A
  - Department of Physiology, University of California at Los Angeles, Los Angeles, CA 90095, U.S.A
  - Department of Medicine/Cardiology, University of California at Los Angeles, Los Angeles, CA 90095, U.S.A
  - Department of Bioinformatics, University of California at Los Angeles, Los Angeles, CA 90095, U.S.A
  - Scalable Analytics Institute (ScAi), University of California at Los Angeles, Los Angeles, CA 90095, U.S.A
|