1
Brattig-Correia R, Almeida JM, Wyrwoll MJ, Julca I, Sobral D, Misra CS, Di Persio S, Guilgur LG, Schuppe HC, Silva N, Prudêncio P, Nóvoa A, Leocádio AS, Bom J, Laurentino S, Mallo M, Kliesch S, Mutwil M, Rocha LM, Tüttelmann F, Becker JD, Navarro-Costa P. The conserved genetic program of male germ cells uncovers ancient regulators of human spermatogenesis. eLife 2024; 13:RP95774. [PMID: 39388236 PMCID: PMC11466473 DOI: 10.7554/elife.95774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024]
Abstract
Male germ cells share a common origin across animal species, therefore they likely retain a conserved genetic program that defines their cellular identity. However, the unique evolutionary dynamics of male germ cells coupled with their widespread leaky transcription pose significant obstacles to the identification of the core spermatogenic program. Through network analysis of the spermatocyte transcriptome of vertebrate and invertebrate species, we describe the conserved evolutionary origin of metazoan male germ cells at the molecular level. We estimate the average functional requirement of a metazoan male germ cell to correspond to the expression of approximately 10,000 protein-coding genes, a third of which defines a genetic scaffold of deeply conserved genes that has been retained throughout evolution. Such scaffold contains a set of 79 functional associations between 104 gene expression regulators that represent a core component of the conserved genetic program of metazoan spermatogenesis. By genetically interfering with the acquisition and maintenance of male germ cell identity, we uncover 161 previously unknown spermatogenesis genes and three new potential genetic causes of human infertility. These findings emphasize the importance of evolutionary history on human reproductive disease and establish a cross-species analytical pipeline that can be repurposed to other cell types and pathologies.
Affiliation(s)
- Rion Brattig-Correia
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - Department of Systems Science and Industrial Engineering, Binghamton University, New York, United States
- Joana M Almeida
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - EvoReproMed Lab, Environmental Health Institute (ISAMB), Associate Laboratory TERRA, Faculty of Medicine, University of Lisbon, Lisbon, Portugal
- Margot Julia Wyrwoll
  - Centre of Medical Genetics, Institute of Reproductive Genetics, University and University Hospital of Münster, Münster, Germany
- Irene Julca
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Daniel Sobral
  - Associate Laboratory i4HB - Institute for Health and Bioeconomy, NOVA School of Science and Technology, NOVA University Lisbon, Lisbon, Portugal
  - UCIBIO - Applied Molecular Biosciences Unit, Department of Life Sciences, NOVA School of Science and Technology, NOVA University Lisbon, Caparica, Portugal
- Chandra Shekhar Misra
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Oeiras, Portugal
- Sara Di Persio
  - Centre of Reproductive Medicine and Andrology, University Hospital Münster, Münster, Germany
- Hans-Christian Schuppe
  - Clinic of Urology, Pediatric Urology and Andrology, Justus-Liebig-University, Giessen, Germany
- Neide Silva
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Pedro Prudêncio
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal
- Ana Nóvoa
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Joana Bom
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Sandra Laurentino
  - Centre of Reproductive Medicine and Andrology, University Hospital Münster, Münster, Germany
- Sabine Kliesch
  - Centre of Reproductive Medicine and Andrology, University Hospital Münster, Münster, Germany
- Marek Mutwil
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Luis M Rocha
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - Department of Systems Science and Industrial Engineering, Binghamton University, New York, United States
- Frank Tüttelmann
  - Centre of Medical Genetics, Institute of Reproductive Genetics, University and University Hospital of Münster, Münster, Germany
- Jörg D Becker
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Oeiras, Portugal
- Paulo Navarro-Costa
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - EvoReproMed Lab, Environmental Health Institute (ISAMB), Associate Laboratory TERRA, Faculty of Medicine, University of Lisbon, Lisbon, Portugal
2
Dang TC, Fields L, Li L. MotifQuest: An Automated Pipeline for Motif Database Creation to Improve Peptidomics Database Searching Programs. Journal of the American Society for Mass Spectrometry 2024; 35:1902-1912. [PMID: 39058243 DOI: 10.1021/jasms.4c00192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/28/2024]
Abstract
Endogenous peptides are an abundant and versatile class of biomolecules with vital roles pertinent to the functionality of the nervous, endocrine, and immune systems and others. Mass spectrometry stands as a premier technique for identifying endogenous peptides, yet the field still faces challenges due to the lack of optimized computational resources for reliable raw mass spectra analysis and interpretation. Current database searching programs can exhibit discrepancies due to the unique properties of endogenous peptides, which typically require specialized search considerations. Herein, we present a high throughput, novel scoring algorithm for the extraction and ranking of conserved amino acid sequence motifs within any endogenous peptide database. Motifs are conserved patterns across organisms, representing sequence moieties crucial for biological functions, including maintenance of homeostasis. MotifQuest, our novel motif database generation algorithm, is designed to work in partnership with EndoGenius, a program optimized for database searching of endogenous peptides and that is powered by a motif database to capitalize on biological context to produce identifications. MotifQuest aims to quickly develop motif databases without any prior knowledge, a laborious task not possible with traditional sequence alignment resources. In this work we illustrate the utility of MotifQuest to expand EndoGenius' identification utility to other endogenous peptides by showcasing its ability to identify antimicrobial peptides. Additionally, we discuss the potential utility of MotifQuest to parse out motifs from a FASTA database file that can be further validated as new peptide drug candidates.
Affiliation(s)
- Tina C Dang
  - School of Pharmacy, University of Wisconsin-Madison, 777 Highland Avenue, Madison, Wisconsin 53705, United States
- Lauren Fields
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Lingjun Li
  - School of Pharmacy, University of Wisconsin-Madison, 777 Highland Avenue, Madison, Wisconsin 53705, United States
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, Wisconsin 53706, United States
3
Fernando PC, Mabee PM, Zeng E. Protein-protein interaction network module changes associated with the vertebrate fin-to-limb transition. Sci Rep 2023; 13:22594. [PMID: 38114646 PMCID: PMC10730527 DOI: 10.1038/s41598-023-50050-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 12/14/2023] [Indexed: 12/21/2023]
Abstract
Evolutionary phenotypic transitions, such as the fin-to-limb transition in vertebrates, result from modifications in related proteins and their interactions, often in response to changing environment. Identifying these alterations in protein networks is crucial for a more comprehensive understanding of these transitions. However, previous research has not attempted to compare protein-protein interaction (PPI) networks associated with evolutionary transitions, and most experimental studies concentrate on a limited set of proteins. Therefore, the goal of this work was to develop a network-based platform for investigating the fin-to-limb transition using PPI networks. Quality-enhanced protein networks, constructed by integrating PPI networks with anatomy ontology data, were leveraged to compare protein modules for paired fins (pectoral fin and pelvic fin) of fishes (zebrafish) to those of the paired limbs (forelimb and hindlimb) of mammals (mouse). This also included prediction of novel protein candidates and their validation by enrichment and homology analyses. Hub proteins such as shh and bmp4, which are crucial for module stability, were identified, and their changing roles throughout the transition were examined. Proteins with preserved roles during the fin-to-limb transition were more likely to be hub proteins. This study also addressed hypotheses regarding the role of non-preserved proteins associated with the transition.
Affiliation(s)
- Pasan C Fernando
  - Department of Plant Sciences, University of Colombo, Colombo, Sri Lanka
- Paula M Mabee
  - Department of Biology, University of South Dakota, Vermillion, SD, USA
  - National Ecological Observatory Network, Battelle, 1625 38th St. #100, Boulder, CO, 80301, USA
- Erliang Zeng
  - Departments of Preventive & Community Dentistry, College of Dentistry, University of Iowa, Iowa City, IA, USA
  - Division of Biostatistics and Computational Biology, College of Dentistry, University of Iowa, Iowa City, IA, USA
  - Departments of Biostatistics, College of Public Health, University of Iowa, Iowa City, IA, USA
  - Departments of Biomedical Engineering, College of Engineering, University of Iowa, Iowa City, IA, USA
4
Kondratyeva L, Alekseenko I, Chernov I, Sverdlov E. Data Incompleteness May form a Hard-to-Overcome Barrier to Decoding Life's Mechanism. Biology 2022; 11:1208. [PMID: 36009835 PMCID: PMC9404739 DOI: 10.3390/biology11081208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2022] [Revised: 08/03/2022] [Accepted: 08/10/2022] [Indexed: 11/23/2022]
Abstract
In this brief review, we attempt to demonstrate that the incompleteness of data, as well as the intrinsic heterogeneity of biological systems, may form very strong and possibly insurmountable barriers for researchers trying to decipher the mechanisms of the functioning of live systems. We illustrate this challenge using the two most studied organisms: E. coli, with 34.6% genes lacking experimental evidence of function, and C. elegans, with identified proteins for approximately 50% of its genes. Another striking example is an artificial unicellular entity named JCVI-syn3.0, with a minimal set of genes. A total of 31.5% of the genes of JCVI-syn3.0 cannot be ascribed a specific biological function. The human interactome mapping project identified only 5-10% of all protein interactions in humans. In addition, most of the available data are static snapshots, and it is barely possible to generate realistic models of the dynamic processes within cells. Moreover, the existing interactomes reflect the de facto interaction but not its functional result, which is an unpredictable emerging property. Perhaps the completeness of molecular data on any living organism is beyond our reach and represents an unsolvable problem in biology.
Affiliation(s)
- Liya Kondratyeva
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russia
- Irina Alekseenko
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russia
  - Institute of Molecular Genetics of National Research Centre “Kurchatov Institute”, Moscow 123182, Russia
- Igor Chernov
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russia
- Eugene Sverdlov
  - Institute of Molecular Genetics of National Research Centre “Kurchatov Institute”, Moscow 123182, Russia
  - Kurchatov Center for Genome Research, National Research Center “Kurchatov Institute”, Moscow 123182, Russia
5
OUP accepted manuscript. Brief Funct Genomics 2022; 21:243-269. [DOI: 10.1093/bfgp/elac007] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2021] [Revised: 03/17/2022] [Accepted: 03/18/2022] [Indexed: 11/14/2022]
6
Evaluating the role of community detection in improving influence maximization heuristics. Social Network Analysis and Mining 2021. [DOI: 10.1007/s13278-021-00804-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/01/2022]
Abstract
Both community detection and influence maximization are well-researched fields of network science. Here, we investigate how several popular community detection algorithms can be used as part of a heuristic approach to influence maximization. The heuristic is based on the community value, a node-based metric defined on the outputs of overlapping community detection algorithms. This metric is used to select nodes as high influence candidates for expanding the set of influential nodes. Our aim in this paper is twofold. First, we evaluate the performance of eight frequently used overlapping community detection algorithms on this specific task to show how much improvement can be gained compared to the originally proposed method of Kempe et al. Second, selecting the community detection algorithm(s) with the best performance, we propose a variant of the influence maximization heuristic with significantly reduced runtime, at the cost of slightly reduced quality of the output. We use both artificial benchmarks and real-life networks to evaluate the performance of our approach.
7
Ezoe A, Shirai K, Hanada K. Degree of Functional Divergence in Duplicates Is Associated with Distinct Roles in Plant Evolution. Mol Biol Evol 2021; 38:1447-1459. [PMID: 33290522 PMCID: PMC8042753 DOI: 10.1093/molbev/msaa302] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 01/23/2023]
Abstract
Gene duplication is a major mechanism to create new genes. After gene duplication, some duplicated genes undergo functionalization, whereas others largely maintain redundant functions. Duplicated genes comprise various degrees of functional diversification in plants. However, the evolutionary fate of high and low diversified duplicates is unclear at genomic scale. To infer high and low diversified duplicates in Arabidopsis thaliana genome, we generated a prediction method for predicting whether a pair of duplicate genes was subjected to high or low diversification based on the phenotypes of knock-out mutants. Among 4,017 pairs of recently duplicated A. thaliana genes, 1,052 and 600 are high and low diversified duplicate pairs, respectively. The predictions were validated based on the phenotypes of generated knock-down transgenic plants. We determined that the high diversified duplicates resulting from tandem duplications tend to have lineage-specific functions, whereas the low diversified duplicates produced by whole-genome duplications are related to essential signaling pathways. To assess the evolutionary impact of high and low diversified duplicates in closely related species, we compared the retention rates and selection pressures on the orthologs of A. thaliana duplicates in two closely related species. Interestingly, high diversified duplicates resulting from tandem duplications tend to be retained in multiple lineages under positive selection. Low diversified duplicates by whole-genome duplications tend to be retained in multiple lineages under purifying selection. Taken together, the functional diversities determined by different duplication mechanisms had distinct effects on plant evolution.
Affiliation(s)
- Akihiro Ezoe
  - Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan
- Kazumasa Shirai
  - Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan
- Kousuke Hanada
  - Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan
8
Putnins M, Androulakis IP. Self-selection of evolutionary strategies: adaptive versus non-adaptive forces. Heliyon 2021; 7:e06997. [PMID: 34041384 PMCID: PMC8141468 DOI: 10.1016/j.heliyon.2021.e06997] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/07/2020] [Revised: 04/03/2021] [Accepted: 04/30/2021] [Indexed: 12/18/2022]
Abstract
The evolution of complex genetic networks is shaped over the course of many generations through multiple mechanisms. These mechanisms can be broken into two predominant categories: adaptive forces, such as natural selection, and non-adaptive forces, such as recombination, genetic drift, and random mutation. Adaptive forces are influenced by the environment, where individuals better suited for their ecological niche are more likely to reproduce. This adaptive force results in a selective pressure which creates a bias in the reproduction of individuals with beneficial traits. Non-adaptive forces, in contrast, are not influenced by the environment: Random mutations occur in offspring regardless of whether they improve the fitness of the offspring. Both adaptive and non-adaptive forces play critical roles in the development of a species over time, and both forces are intrinsically linked to one another. We hypothesize that even under a simple sexual reproduction model, selective pressure will result in changes in the mutation rate and genome size. We tested this hypothesis by evolving Boolean networks using a modified genetic algorithm. Our results demonstrate that changes in environmental signals can result in selective pressure which affects mutation rate.
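As a rough illustration of the kind of object the abstract above refers to, the following is a minimal sketch of a synchronous Boolean network update together with a bit-flip mutation operator of the sort a genetic algorithm might apply. It is not the authors' modified genetic algorithm: the function names `step` and `mutate`, the truth-table encoding, and the fixed `rate` parameter are all assumptions made for illustration (in the study, the mutation rate is itself subject to selection, which this sketch does not model).

```python
import random

def step(state, inputs, tables):
    """One synchronous update of a Boolean network (hypothetical encoding).

    state  : list of 0/1 values, one per node
    inputs : inputs[i] lists the node indices that node i reads
    tables : tables[i] is the truth table of node i, indexed by the
             binary number formed from its input values (first input
             is the most significant bit)
    """
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for j in node_inputs:
            index = (index << 1) | state[j]
        new_state.append(table[index])
    return new_state

def mutate(tables, rate, rng):
    """Return a copy of the truth tables with each bit flipped with probability `rate`."""
    return [[(bit ^ 1) if rng.random() < rate else bit for bit in table]
            for table in tables]

# Toy network: node 0 = NOT node 1, node 1 = node 0 AND node 1.
inputs = [[1], [0, 1]]
tables = [[1, 0], [0, 0, 0, 1]]
state = step([1, 1], inputs, tables)
mutant = mutate(tables, rate=0.1, rng=random.Random(0))
```

A selection loop would score each mutated network against an environmental signal and keep the fitter ones; that part is omitted here because the abstract does not specify the fitness function.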
Affiliation(s)
- Matthew Putnins
  - Biomedical Engineering Department, Rutgers University, Piscataway, NJ, USA
- Ioannis P Androulakis
  - Biomedical Engineering Department, Rutgers University, Piscataway, NJ, USA
  - Chemical & Biochemical Engineering Department, Rutgers University, Piscataway, NJ, USA
  - Department of Surgery, Rutgers Robert Wood Johnson Medical School, New Brunswick, NJ, USA
9
van der Hofstad R, van Leeuwaarden JSH, Stegehuis C. Optimal subgraph structures in scale-free configuration models. Ann Appl Probab 2021. [DOI: 10.1214/20-aap1580] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/19/2022]
Affiliation(s)
- Clara Stegehuis
  - Department of Electrical Engineering, Mathematics and Computer Science, University of Twente
10
Bar H, Bang S. A mixture model to detect edges in sparse co-expression graphs with an application for comparing breast cancer subtypes. PLoS One 2021; 16:e0246945. [PMID: 33571253 PMCID: PMC7877669 DOI: 10.1371/journal.pone.0246945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2020] [Accepted: 01/28/2021] [Indexed: 11/19/2022]
Abstract
We develop a method to recover a gene network's structure from co-expression data, measured in terms of normalized Pearson's correlation coefficients between gene pairs. We treat these co-expression measurements as weights in the complete graph in which nodes correspond to genes. To decide which edges exist in the gene network, we fit a three-component mixture model such that the observed weights of 'null edges' follow a normal distribution with mean 0, and the non-null edges follow a mixture of two lognormal distributions, one for positively- and one for negatively-correlated pairs. We show that this so-called L2 N mixture model outperforms other methods in terms of power to detect edges, and it allows to control the false discovery rate. Importantly, our method makes no assumptions about the true network structure. We demonstrate our method, which is implemented in an R package called edgefinder, using a large dataset consisting of expression values of 12,750 genes obtained from 1,616 women. We infer the gene network structure by cancer subtype, and find insightful subtype characteristics. For example, we find thirteen pathways which are enriched in each of the cancer groups but not in the Normal group, with two of the pathways associated with autoimmune diseases and two other with graft rejection. We also find specific characteristics of different breast cancer subtypes. For example, the Luminal A network includes a single, highly connected cluster of genes, which is enriched in the human diseases category, and in the Her2 subtype network we find a distinct, and highly interconnected cluster which is uniquely enriched in drug metabolism pathways.
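The three-component mixture described in the abstract above can be sketched as follows. This is only an illustration of the model's structure (a zero-mean normal for null edges plus one lognormal component each for positively and negatively correlated pairs). It is not the `edgefinder` package's API: the function names, the parameter names, and the choice to apply the negative lognormal component to `-w` are assumptions, and the parameters are taken as given rather than fitted.

```python
import math

def normal_pdf(x, sigma):
    """Density of a zero-mean normal distribution with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def lognormal_pdf(x, mu, s):
    """Density of a lognormal distribution; zero for non-positive x."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / s
    return math.exp(-0.5 * z * z) / (x * s * math.sqrt(2 * math.pi))

def null_posterior(w, p_pos, p_neg, sigma, mu_pos, s_pos, mu_neg, s_neg):
    """Posterior probability that an edge with weight w is a null edge.

    All parameters are hypothetical, assumed to have been estimated elsewhere:
    p_pos and p_neg are the mixing proportions of the two non-null components,
    and the null proportion is 1 - p_pos - p_neg.
    """
    f_null = (1.0 - p_pos - p_neg) * normal_pdf(w, sigma)
    f_pos = p_pos * lognormal_pdf(w, mu_pos, s_pos)    # positively correlated pairs
    f_neg = p_neg * lognormal_pdf(-w, mu_neg, s_neg)   # negatively correlated pairs
    return f_null / (f_null + f_pos + f_neg)

# Illustrative parameter values only; they are not taken from the paper.
params = dict(p_pos=0.05, p_neg=0.05, sigma=1.0,
              mu_pos=1.5, s_pos=0.5, mu_neg=1.5, s_neg=0.5)
posterior_near_zero = null_posterior(0.2, **params)
posterior_far = null_posterior(5.0, **params)
```

Under such a model, an edge would be declared present when its null posterior falls below a threshold chosen to control the false discovery rate; the thresholding rule itself is not specified in the abstract and is left out here.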
Affiliation(s)
- Haim Bar
  - Department of Statistics, University of Connecticut, Storrs, CT, United States of America
- Seojin Bang
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, United States of America
11
Choudhari JK, Chatterjee T, Gupta S, Garcia-Garcia JG, Vera-González J. Network Biology Approaches in Ophthalmological Diseases: A Case Study of Glaucoma. Systems Medicine 2021. [DOI: 10.1016/b978-0-12-801238-3.11586-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
12
Britto de Assis Prado CH, de Brito Melo Trovão DM, Souza JP. A network model for determining decomposition, topology, and properties of the woody crown. J Theor Biol 2020; 499:110318. [DOI: 10.1016/j.jtbi.2020.110318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2019] [Revised: 03/17/2020] [Accepted: 05/04/2020] [Indexed: 11/24/2022]
13
Zambra M, Maritan A, Testolin A. Emergence of Network Motifs in Deep Neural Networks. Entropy 2020; 22:e22020204. [PMID: 33285979 PMCID: PMC7516634 DOI: 10.3390/e22020204] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/27/2019] [Revised: 02/03/2020] [Accepted: 02/07/2020] [Indexed: 12/04/2022]
Abstract
Network science can offer fundamental insights into the structural and functional properties of complex systems. For example, it is widely known that neuronal circuits tend to organize into basic functional topological modules, called network motifs. In this article, we show that network science tools can be successfully applied also to the study of artificial neural networks operating according to self-organizing (learning) principles. In particular, we study the emergence of network motifs in multi-layer perceptrons, whose initial connectivity is defined as a stack of fully-connected, bipartite graphs. Simulations show that the final network topology is shaped by learning dynamics, but can be strongly biased by choosing appropriate weight initialization schemes. Overall, our results suggest that non-trivial initialization strategies can make learning more effective by promoting the development of useful network motifs, which are often surprisingly consistent with those observed in general transduction networks.
Affiliation(s)
- Matteo Zambra
  - Department of Civil, Environmental and Architectural Engineering, University of Padova, Via Marzolo 9, 35131 Padova, Italy
  - Correspondence: (M.Z.); (A.T.)
- Amos Maritan
  - Department of Physics and Astronomy, University of Padova; Istituto Nazionale di Fisica Nucleare, Sezione di Padova, Via Marzolo 8, 35131 Padova, Italy
- Alberto Testolin
  - Department of General Psychology, University of Padova, Via Venezia 8, 35131 Padova, Italy
  - Department of Information Engineering, University of Padova, Via Gradenigo 6/b, 35131 Padova, Italy
  - Correspondence: (M.Z.); (A.T.)
14
Ma J, Bai X, Luo W, Feng Y, Shao X, Bai Q, Sun S, Long Q, Wan D. Genome-Wide Identification of Long Noncoding RNAs and Their Responses to Salt Stress in Two Closely Related Poplars. Front Genet 2019; 10:777. [PMID: 31543901 PMCID: PMC6739720 DOI: 10.3389/fgene.2019.00777] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 11/21/2018] [Accepted: 07/23/2019] [Indexed: 12/23/2022]
Abstract
Long noncoding RNAs (lncRNAs) are involved in various biological regulatory processes, but their roles in plant resistance to salt stress remain largely unknown. To systematically explore the characteristics of lncRNAs and their roles in plant salt responses, we conducted strand-specific RNA-sequencing of four tissue types with salt treatments in two closely related poplars (Populus euphratica and Populus alba var. pyramidalis), and a total of 10,646 and 10,531 lncRNAs were identified, respectively. These lncRNAs showed significantly lower values in terms of length, expression, and expression correction than with mRNA. We further found that about 40% and 60% of these identified lncRNAs responded to salt stress with tissue-specific expression patterns across the two poplars. Furthermore, lncRNAs showed weak evolutionary conservation in sequences and exhibited diverse regulatory styles; in particular, tissue- and species-specific responses to salt stress varied greatly between the two poplars. For example, 322 lncRNAs were found to be highly expressed in P. euphratica but not in P. alba var. pyramidalis, and 3,425 lncRNAs were identified as species-specific in P. euphratica in response to salt stress. Moreover, tissue-specific expression of lncRNAs in the two poplars was identified, with predicted target genes including Aux/IAA, NAC, and MYB, which are involved in regulating plant growth and the plant stress response. Taken together, the systematic analysis of lncRNAs between sister species enhances our understanding of the characteristics of lncRNAs and their roles in plant growth and salt response.
Affiliation(s)
- Jianchao Ma
  - State Key Laboratory of Grassland Agro-Ecosystem, School of Life Sciences, Lanzhou University, Lanzhou, China
  - Key Laboratory of Plant Stress Biology, State Key Laboratory of Cotton Biology, School of Life Sciences, Henan University, Kaifeng, China
- Xiaotao Bai
  - State Key Laboratory of Grassland Agro-Ecosystem, School of Life Sciences, Lanzhou University, Lanzhou, China
- Wenchun Luo
  - State Key Laboratory of Grassland Agro-Ecosystem, School of Life Sciences, Lanzhou University, Lanzhou, China
- Yannan Feng
  - State Key Laboratory of Grassland Agro-Ecosystem, School of Life Sciences, Lanzhou University, Lanzhou, China
- Xuemin Shao
  - State Key Laboratory of Grassland Agro-Ecosystem, School of Life Sciences, Lanzhou University, Lanzhou, China
- Qiuxian Bai
  - State Key Laboratory of Grassland Agro-Ecosystem, School of Life Sciences, Lanzhou University, Lanzhou, China
- Shujiao Sun
  - State Key Laboratory of Grassland Agro-Ecosystem, School of Life Sciences, Lanzhou University, Lanzhou, China
- Qiming Long
  - State Key Laboratory of Grassland Agro-Ecosystem, School of Life Sciences, Lanzhou University, Lanzhou, China
- Dongshi Wan
  - State Key Laboratory of Grassland Agro-Ecosystem, School of Life Sciences, Lanzhou University, Lanzhou, China
15
Abstract
We consider subgraph counts in general preferential attachment models with power-law degree exponent $\tau > 2$. For all subgraphs H, we find the scaling of the expected number of subgraphs as a power of the number of vertices. We prove our results on the expected number of subgraphs by defining an optimization problem that finds the optimal subgraph structure in terms of the indices of the vertices that together span it and by using the representation of the preferential attachment model as a Pólya urn model.
16
Stegehuis C, van der Hofstad R, van Leeuwaarden JSH. Variational principle for scale-free network motifs. Sci Rep 2019; 9:6762. [PMID: 31043621 PMCID: PMC6494877 DOI: 10.1038/s41598-019-43050-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/20/2018] [Accepted: 04/15/2019] [Indexed: 11/30/2022]
Abstract
For scale-free networks with degrees following a power law with an exponent τ ∈ (2, 3), the structures of motifs (small subgraphs) are not yet well understood. We introduce a method designed to identify the dominant structure of any given motif as the solution of an optimization problem. The unique optimizer describes the degrees of the vertices that together span the most likely motif, resulting in explicit asymptotic formulas for the motif count and its fluctuations. We then classify all motifs into two categories: motifs with small and large fluctuations.
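To make the notion of a motif count in the abstract above concrete, here is a minimal sketch that counts one specific motif, the triangle, in an undirected graph given as an edge list. It illustrates only what is being counted; it does not implement the paper's variational (optimization) method, and the function name `count_triangles` and the edge-list input format are assumptions.

```python
from itertools import combinations

def count_triangles(edges):
    """Count triangles in a simple undirected graph given as (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    closed_pairs = 0
    for u, neighbours in adjacency.items():
        # Every pair of neighbours of u that is itself connected closes a triangle.
        for v, w in combinations(neighbours, 2):
            if w in adjacency[v]:
                closed_pairs += 1
    # Each triangle is found once from each of its three vertices.
    return closed_pairs // 3

# Complete graph on four vertices: every 3-subset of vertices forms a triangle.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
k4_triangles = count_triangles(k4_edges)
```

In a scale-free network, the vertices that contribute most to such a count tend to have characteristic degrees, which is the structure the optimization problem in the abstract is designed to identify.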
Affiliation(s)
- Clara Stegehuis
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, Netherlands
- Remco van der Hofstad
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, Netherlands
- Johan S H van Leeuwaarden
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, Netherlands
17
Evolutionary transitions in controls reconcile adaptation with continuity of evolution. Semin Cell Dev Biol 2019; 88:36-45. [DOI: 10.1016/j.semcdb.2018.05.014] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 11/09/2017] [Revised: 02/19/2018] [Accepted: 05/15/2018] [Indexed: 12/14/2022]
18
In silico predicted transcriptional regulatory control of steroidogenesis in spawning female fathead minnows (Pimephales promelas). J Theor Biol 2018; 455:179-190. [PMID: 30036528 DOI: 10.1016/j.jtbi.2018.07.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2018] [Revised: 07/16/2018] [Accepted: 07/18/2018] [Indexed: 11/21/2022]
Abstract
Oocyte development and maturation (oogenesis) in spawning female fish is mediated by interrelated transcriptional regulatory and steroidogenesis networks. This study integrates a transcriptional regulatory network (TRN) model of steroidogenic enzyme gene expression with a flux balance analysis (FBA) model of steroidogenesis. The two models were functionally linked: output from the TRN model (the magnitude of gene expression simulated using extreme pathway (ExPa) analysis) was used to re-constrain the linear inequality bounds for reactions in the FBA model, allowing TRN model predictions to affect the steroidogenesis FBA model. The two interrelated models were tested as follows. First, in silico targeted steroidogenic enzyme gene activations in the TRN model showed high co-regulation (67-83%) for genes involved in oocyte growth and development (cyp11a1, cyp17-17,20-lyase, 3β-HSD and cyp19a1a), whereas no or low co-regulation was found for genes concertedly involved in final oocyte maturation prior to spawning (cyp17-17α-hydroxylase (0%) and 20β-HSD (33%)). Analysis (using FBA) of accompanying steroidogenesis fluxes showed high overlap for enzymes involved with oocyte growth and development versus those involved with final maturation and spawning. Second, the TRN model was parameterized with in vivo changes in the presence/absence of transcription factors (TFs) during oogenesis in female fathead minnows (Pimephales promelas). The oogenesis stages studied were the PreVitellogenic-Vitellogenic, Vitellogenic-Mature, Mature-Ovulated and Ovulated-Atretic stages. Predictions of TRN genes active during oogenesis showed overall elevated expression for most genes during early oocyte development (PreVitellogenic-Vitellogenic, Vitellogenic-Mature) and post-ovulation (Ovulated-Atretic), whereas ovulation (Mature-Ovulated) showed highest expression for cyp17-17α-hydroxylase only. FBA showed that steroid hormone production also followed trends concomitant with steroidogenic enzyme gene expression. General trends predicted by in silico modeling were similar to those observed in vivo. The integrated computational framework presented was capable of mechanistically representing aspects of reproductive function in fish. This approach can be extended to study reproductive effects under exposure to adverse environmental or anthropogenic stressors.
|
19
|
Peng Y, Michonova E. Long-range effect of a single mutation in spermine synthase. JOURNAL OF THEORETICAL & COMPUTATIONAL CHEMISTRY 2018. [DOI: 10.1142/s021963361850030x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Spermine synthase (SpmSyn) is an enzyme critical for maintaining the balance of spermine/spermidine in the cell. The amino acid sequence of SpmSyn is highly conserved among species. Most of the mutations found in the human population have been shown to cause Snyder–Robinson syndrome, a severe mental disorder, whereas few are neutral. This is intriguing since SpmSyn is a relatively large protein and less than 10% of its amino acids are directly involved in catalysis. Here, we demonstrate that a mutation (G191S) at a site far from the active pocket affects the active site dynamics and thus the functionality of SpmSyn. This suggests that SpmSyn functionality is regulated by networks of interacting residues, which extends functional and structural importance beyond the amino acids directly involved in catalysis. Comparing the calculated effects of G191S and a nine-residue deletion shown to decrease SpmSyn activity [Wu H, Min J, Zeng H, McCloskey DE, Ikeguchi Y, Loppnau P, Michael AJ, Pegg AE, Plotnikov AN, Crystal structure of human spermine synthase: Implications of substrate binding and catalytic mechanism, J Biol Chem 283:16135–16146, 2008], we predict that the G191S mutation also decreases SpmSyn activity and may cause disease.
Affiliation(s)
- Yunhui Peng
- Department of Physics and Astronomy, Clemson University, Clemson SC 29634, USA
- Ekaterina Michonova
- Department of Chemistry and Physics, Erskine College, Due West SC 29639, USA
|
20
|
Choi J, Lee D. Topological motifs populate complex networks through grouped attachment. Sci Rep 2018; 8:12670. [PMID: 30140017 PMCID: PMC6107624 DOI: 10.1038/s41598-018-30845-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 12/05/2017] [Accepted: 08/07/2018] [Indexed: 11/29/2022] Open
Abstract
Network motifs are topological subgraph patterns that recur with statistical significance in a network. Network motifs have been widely utilized to represent important topological features for analyzing the functional properties of complex networks. While recent studies have shown the importance of network motifs, existing network models are not capable of reproducing real-world topological properties of network motifs, such as the frequency of network motifs and relative graphlet frequency distances. Here, we propose a new network measure and a new network model to reconstruct real-world network topologies, by incorporating our Grouped Attachment algorithm to generate networks in which closely related nodes have similar edge connections. We applied the proposed model to real-world complex networks, and the resulting constructed networks more closely reflected real-world network motif properties than did the existing models that we tested: the Erdös–Rényi, small-world, scale-free, popularity-similarity-optimization, and nonuniform popularity-similarity-optimization models. Furthermore, we adapted the preferential attachment algorithm to our model to gain scale-free properties while preserving motif properties. Our findings show that grouped attachment is one possible mechanism to reproduce network motif recurrence in real-world complex networks.
Affiliation(s)
- Jaejoon Choi
- Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea; Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts, United States of America
- Doheon Lee
- Bio-Synergy Research Center, 291 Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea; Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea.
|
21
|
Grandchamp A, Monget P. Synchronous birth is a dominant pattern in receptor-ligand evolution. BMC Genomics 2018; 19:611. [PMID: 30107779 PMCID: PMC6092800 DOI: 10.1186/s12864-018-4977-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 03/15/2018] [Accepted: 07/31/2018] [Indexed: 12/11/2022] Open
Abstract
Background: Interactions between proteins are key components in the chemical and physical processes of living organisms. Among these interactions, membrane receptors and their ligands are particularly important because they are at the interface between extracellular and intracellular environments. Many studies have investigated how binding partners have co-evolved in genomes during evolution. However, little is known about the establishment of the interaction on a phylogenetic scale. In this study, we systematically studied the time of birth of genes encoding human membrane receptors and their ligands in the animal tree of life. We examined a total of 553 pairs of ligands/receptors, representing non-redundant interactions. Results: We found that 41% of the receptors and their respective first ligands appeared in the same branch, 2.5-fold more than expected by chance, thus suggesting an evolutionary dynamic of interdependence and conservation between these partners. In contrast, 21% of the receptors appeared after their ligand, i.e. three-fold less often than expected by chance. Most surprisingly, 38% of the receptors appeared before their first ligand, as often as expected by chance. Conclusions: According to these results, we propose that a selective pressure is exerted on ligands and receptors once they appear, which would remove molecules whose partner does not appear quickly.
Affiliation(s)
- Anna Grandchamp
- PRC, UMR85, INRA, CNRS, IFCE, Université de Tours, F-37380, Nouzilly, France.
- Philippe Monget
- PRC, UMR85, INRA, CNRS, IFCE, Université de Tours, F-37380, Nouzilly, France.
|
22
|
Morrison ES, Badyaev AV. Structure versus time in the evolutionary diversification of avian carotenoid metabolic networks. J Evol Biol 2018; 31:764-772. [PMID: 29485222 DOI: 10.1111/jeb.13257] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 09/20/2017] [Revised: 02/14/2018] [Accepted: 02/20/2018] [Indexed: 01/07/2023]
Abstract
Historical associations of genes and proteins are thought to delineate pathways available to subsequent evolution; however, the effects of past functional involvements on contemporary evolution are rarely quantified. Here, we examined the extent to which the structure of a carotenoid enzymatic network persists in avian evolution. Specifically, we tested whether the evolution of carotenoid networks was most concordant with phylogenetically structured expansion from core reactions of common ancestors or with subsampling of biochemical pathway modules from an ancestral network. We compared structural and historical associations in 467 carotenoid networks of extant and ancestral species and uncovered the overwhelming effect of pre-existing metabolic network structure on carotenoid diversification over the last 50 million years of avian evolution. Over evolutionary time, birds repeatedly subsampled and recombined conserved biochemical modules, which likely maintained the overall structure of the carotenoid metabolic network during avian evolution. These findings explain the recurrent convergence of evolutionary distant species in carotenoid metabolism and weak phylogenetic signal in avian carotenoid evolution. Remarkable retention of an ancient metabolic structure throughout extensive and prolonged ecological diversification in avian carotenoid metabolism illustrates a fundamental requirement of organismal evolution - historical continuity of a deterministic network that links past and present functional associations of its components.
Affiliation(s)
- Erin S Morrison
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- Alexander V Badyaev
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
|
23
|
Gumi AM, Guha PK, Mazumder A, Jayaswal P, Mondal TK. Characterization of OglDREB2A gene from African rice (Oryza glaberrima), comparative analysis and its transcriptional regulation under salinity stress. 3 Biotech 2018; 8:91. [PMID: 29430353 PMCID: PMC5796934 DOI: 10.1007/s13205-018-1098-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 07/29/2017] [Accepted: 01/05/2018] [Indexed: 01/17/2023] Open
Abstract
In this study, an AP2 DNA-binding domain-containing transcription factor, OglDREB2A, was cloned from African rice (Oryza glaberrima) and compared with 3000 rice genotypes. Phylogenetic and various structural analyses were performed using in silico approaches. To understand its allelic variation in rice, SNPs and indels were detected among the 3000 rice genotypes, which indicated that while the coding region is highly conserved, noncoding regions such as the UTR and intron contained most of the variation. Phylogenetic analysis of the OglDREB2A sequence in different Oryza as well as in diverse eudicot species revealed that DREB from various Oryza species diverged much earlier than other genes. Structural features and in silico analyses provided insights into different properties of the OglDREB2A protein. The neutrality test on the coding region of OglDREB2A from different genotypes of O. glaberrima showed a lack of selection in this gene. Among the different developmental stages, it was upregulated at tillering and flag leaf under salinity treatment, indicating its positive role in seedling and reproductive stage tolerance. Real-time PCR analysis also indicated a conserved expression pattern of this gene under salinity stress across the three Oryza species, which have different degrees of salinity tolerance.
Affiliation(s)
- Abubakar Mohammad Gumi
- ICAR-National Bureau of Plant Genetic Resources, IARI Campus, Pusa, New Delhi, 110012 India
- Present Address: Department of Biological Sciences, Usmanu Danfodiyo University, Sokoto, Nigeria
- Pritam Kanti Guha
- ICAR-National Bureau of Plant Genetic Resources, IARI Campus, Pusa, New Delhi, 110012 India
- ICAR-National Research Centre on Plant Biotechnology, LBS Building, IARI, New Delhi, 110012 India
- Abhishek Mazumder
- ICAR-National Research Centre on Plant Biotechnology, LBS Building, IARI, New Delhi, 110012 India
- Pawan Jayaswal
- ICAR-National Research Centre on Plant Biotechnology, LBS Building, IARI, New Delhi, 110012 India
- Tapan Kumar Mondal
- ICAR-National Bureau of Plant Genetic Resources, IARI Campus, Pusa, New Delhi, 110012 India
- ICAR-National Research Centre on Plant Biotechnology, LBS Building, IARI, New Delhi, 110012 India
- Present Address: Department of Biological Sciences, Usmanu Danfodiyo University, Sokoto, Nigeria
|
24
|
Mohammadi S, Gleich DF, Kolda TG, Grama A. Triangular Alignment (TAME): A Tensor-Based Approach for Higher-Order Network Alignment. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:1446-1458. [PMID: 27483461 DOI: 10.1109/tcbb.2016.2595583] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 06/06/2023]
Abstract
Network alignment has extensive applications in comparative interactomics. Traditional approaches aim to simultaneously maximize the number of conserved edges and the underlying similarity of aligned entities. We propose a novel formulation of the network alignment problem that extends topological similarity to higher-order structures and provides a new objective function that maximizes the number of aligned substructures. This objective function corresponds to an integer programming problem, which is NP-hard. Consequently, we identify a closely related surrogate function whose maximization results in a tensor eigenvector problem. Based on this formulation, we present an algorithm called Triangular AlignMEnt (TAME), which attempts to maximize the number of aligned triangles across networks. Using a case study on the NAPAbench dataset, we show that triangular alignment is capable of producing mappings with high node correctness. We further evaluate our method by aligning yeast and human interactomes. Our results indicate that TAME outperforms the state-of-art alignment methods in terms of conserved triangles. In addition, we show that the number of conserved triangles is more significantly correlated, compared to the conserved edge, with node correctness and co-expression of edges. Our formulation and resulting algorithms can be easily extended to arbitrary motifs.
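The quantity that TAME tries to maximize, the number of aligned triangles, can be illustrated with a short Python sketch. This is not the TAME algorithm (the tensor eigenvector formulation is not reproduced here); it only evaluates a given node mapping. The function names `triangles` and `conserved_triangles` and the toy graphs are hypothetical, chosen for illustration.

```python
from itertools import combinations


def triangles(edges):
    """Return all triangles of an undirected graph as frozensets of three nodes."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    found = set()
    for u, nbrs in adj.items():
        # Any two neighbours of u that are themselves linked close a triangle.
        for v, w in combinations(nbrs, 2):
            if w in adj[v]:
                found.add(frozenset((u, v, w)))
    return found


def conserved_triangles(edges_g, edges_h, mapping):
    """Count triangles of G whose mapped nodes also form a triangle in H."""
    tri_h = triangles(edges_h)
    count = 0
    for tri in triangles(edges_g):
        if all(node in mapping for node in tri):
            image = frozenset(mapping[node] for node in tri)
            if len(image) == 3 and image in tri_h:
                count += 1
    return count


# Toy example (assumed data): G has two triangles, H preserves only one of them.
g = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e"), ("c", "e")]
h = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
alignment = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
print(conserved_triangles(g, h, alignment))  # 1
```

Scoring candidate mappings this way makes the contrast with edge-based objectives concrete: the mapping above conserves five of the six edges of G but only one of its two triangles.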
|
25
|
Kim W, Haukap L. NemoProfile as an efficient approach to network motif analysis with instance collection. BMC Bioinformatics 2017; 18:423. [PMID: 29072139 PMCID: PMC5657038 DOI: 10.1186/s12859-017-1822-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022] Open
Abstract
Background: A network motif is defined as a statistically significant and recurring subgraph pattern within a network. Most existing instance collection methods are not feasible due to high memory usage issues and provision of limited network motif information. They require a two-step process that requires network motif identification prior to instance collection. Due to the impracticality in obtaining motif instances, the significance of their contribution to problem solving is debated within the field of biology. Results: This paper presents NemoProfile, an efficient new network motif data model. NemoProfile simplifies instance collection by resolving memory overhead issues and is seamlessly generated, thus eliminating the need for costly two-step processing. Additionally, a case study was conducted to demonstrate the application of network motifs to existing problems in the field of biology. Conclusion: NemoProfile comprises network motifs and their instances, thereby facilitating network motifs usage in real biological problems.
|
26
|
Stability of Control Networks in Autonomous Homeostatic Regulation of Stem Cell Lineages. Bull Math Biol 2017; 80:1345-1365. [PMID: 28508298 DOI: 10.1007/s11538-017-0283-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 07/31/2016] [Accepted: 04/07/2017] [Indexed: 01/02/2023]
Abstract
Design principles of biological networks have been studied extensively in the context of protein-protein interaction networks, metabolic networks, and regulatory (transcriptional) networks. Here we consider regulation networks that occur on larger scales, namely the cell-to-cell signaling networks that connect groups of cells in multicellular organisms. These are the feedback loops that orchestrate the complex dynamics of cell fate decisions and are necessary for the maintenance of homeostasis in stem cell lineages. We focus on "minimal" networks that are those that have the smallest possible numbers of controls. For such minimal networks, the number of controls must be equal to the number of compartments, and the reducibility/irreducibility of the network (whether or not it can be split into smaller independent sub-networks) is defined by a matrix comprised of the cell number increments induced by each of the controlled processes in each of the compartments. Using the formalism of digraphs, we show that in two-compartment lineages, reducible systems must contain two 1-cycles, and irreducible systems one 1-cycle and one 2-cycle; stability follows from the signs of the controls and does not require magnitude restrictions. In three-compartment systems, irreducible digraphs have a tree structure or have one 3-cycle and at least two more shorter cycles, at least one of which is a 1-cycle. With further work and proper biological validation, our results may serve as a first step toward an understanding of ways in which these networks become dysregulated in cancer.
|
27
|
Mallik S, Kundu S. Modular Organization of Residue-Level Contacts Shapes the Selection Pressure on Individual Amino Acid Sites of Ribosomal Proteins. Genome Biol Evol 2017; 9:916-931. [PMID: 28338825 PMCID: PMC5388290 DOI: 10.1093/gbe/evx036] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Accepted: 02/21/2017] [Indexed: 12/26/2022] Open
Abstract
Understanding the molecular evolution of macromolecular complexes in the light of their structure, assembly, and stability is of central importance. Here, we address how the modular organization of native molecular contacts shapes the selection pressure on individual residue sites of ribosomal complexes. The bacterial ribosomal complex is represented as a residue contact network where nodes represent amino acid/nucleotide residues and edges represent their van der Waals interactions. We find statistically overrepresented native amino acid-nucleotide contacts (OaantC, one amino acid contacts one or multiple nucleotides, internucleotide contacts are disregarded). Contact number is defined as the number of nucleotides contacted. Involvement of individual amino acids in OaantCs with smaller contact numbers is more random, whereas only a few amino acids significantly contribute to OaantCs with higher contact numbers. An investigation of structure, stability, and assembly of bacterial ribosome depicts the involvement of these OaantCs in diverse biophysical interactions stabilizing the complex, including high-affinity protein-RNA contacts, interprotein cooperativity, intersubunit bridge, packing of multiple ribosomal RNA domains, etc. Amino acid-nucleotide constituents of OaantCs with higher contact numbers are generally associated with significantly slower substitution rates compared with that of OaantCs with smaller contact numbers. This evolutionary rate heterogeneity emerges from the strong purifying selection pressure that conserves the respective amino acid physicochemical properties relevant to the stabilizing interaction with OaantC nucleotides. An analysis of relative molecular orientations of OaantC residues and their interaction energetics provides the biophysical ground of purifying selection conserving OaantC amino acid physicochemical properties.
Affiliation(s)
- Saurav Mallik
- Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, Kolkata, India
- Center of Excellence in Systems Biology and Biomedical Engineering (TEQIP Phase-II), University of Calcutta, Kolkata, India
- Sudip Kundu
- Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, Kolkata, India
- Center of Excellence in Systems Biology and Biomedical Engineering (TEQIP Phase-II), University of Calcutta, Kolkata, India
|
28
|
Martin AJ, Contreras-Riquelme S, Dominguez C, Perez-Acle T. LoTo: a graphlet based method for the comparison of local topology between gene regulatory networks. PeerJ 2017; 5:e3052. [PMID: 28265516 PMCID: PMC5333545 DOI: 10.7717/peerj.3052] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 09/08/2016] [Accepted: 01/31/2017] [Indexed: 11/24/2022] Open
Abstract
One of the main challenges of the post-genomic era is the understanding of how gene expression is controlled. Changes in gene expression lie behind diverse biological phenomena such as development, disease and the adaptation to different environmental conditions. Despite the availability of well-established methods to identify these changes, tools to discern how gene regulation is orchestrated are still required. The regulation of gene expression is usually depicted as a Gene Regulatory Network (GRN) where changes in the network structure (i.e., network topology) represent adjustments of gene regulation. Like other networks, GRNs are composed of basic building blocks: small induced subgraphs called graphlets. Here we present LoTo, a novel method that uses Graphlet Based Metrics (GBMs) to identify topological variations between different states of a GRN. Under our approach, different states of a GRN are analyzed to determine the types of graphlet formed by all triplets of nodes in the network. Subsequently, graphlets occurring in a state of the network are compared to those formed by the same three nodes in another version of the network. Once the comparisons are performed, LoTo applies metrics from binary classification problems calculated on the existence and absence of graphlets to assess the topological similarity between both network states. Experiments performed on randomized networks demonstrate that GBMs are more sensitive to topological variation than the same metrics calculated on single edges. Additional comparisons with other common metrics demonstrate that our GBMs are capable of identifying nodes whose local topology changes between different states of the network. Notably, due to the explicit use of graphlets, LoTo captures topological variations that are disregarded by other approaches. LoTo is freely available as an online web server at http://dlab.cl/loto.
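The triplet-by-triplet comparison described in this abstract can be sketched in Python as follows. This is a simplified illustration, not LoTo's implementation: it assumes that a graphlet is any weakly connected induced subgraph on three nodes, that two graphlets match only when their directed edge sets are identical, and that a mismatch between two existing graphlets counts as both a false negative and a false positive. The function names and the toy networks are hypothetical.

```python
from itertools import combinations


def induced_graphlet(edges, triplet):
    """Return the directed edges induced by three nodes, or None if they are not connected."""
    nodes = set(triplet)
    induced = frozenset(
        (u, v) for u, v in edges if u in nodes and v in nodes and u != v
    )
    # Three nodes are weakly connected when at least two distinct node pairs are linked.
    linked_pairs = {frozenset(edge) for edge in induced}
    return induced if len(linked_pairs) >= 2 else None


def compare_states(nodes, edges_ref, edges_other):
    """Classify every node triplet by graphlet presence in two network states."""
    tp = fp = fn = tn = 0
    for triplet in combinations(sorted(nodes), 3):
        ref = induced_graphlet(edges_ref, triplet)
        other = induced_graphlet(edges_other, triplet)
        if ref is None and other is None:
            tn += 1  # no graphlet in either state
        elif ref == other:
            tp += 1  # same graphlet in both states
        else:
            if ref is not None:
                fn += 1  # reference graphlet lost or rewired
            if other is not None:
                fp += 1  # graphlet present only (or differently) in the other state
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall}


# Toy example (assumed data): one regulatory edge is reversed between states.
state_a = {("a", "b"), ("b", "c"), ("c", "d")}
state_b = {("a", "b"), ("b", "c"), ("d", "c")}
print(compare_states("abcd", state_a, state_b))
```

Enumerating all triplets costs cubic time in the number of nodes, which is acceptable for a sketch but would need a neighbour-based enumeration for genome-scale GRNs.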
Affiliation(s)
- Alberto J Martin
- Computational Biology Laboratory (DLab), Fundacion Ciencia y Vida, Santiago, Chile; Centro Interdisciplinario de Neurociencia de Valparaíso, Valparaiso, Chile
- Sebastián Contreras-Riquelme
- Computational Biology Laboratory (DLab), Fundacion Ciencia y Vida, Santiago, Chile; Facultad de Ciencias Biologicas, Universidad Andres Bello, Santiago, Chile
- Calixto Dominguez
- Computational Biology Laboratory (DLab), Fundacion Ciencia y Vida, Santiago, Chile
- Tomas Perez-Acle
- Computational Biology Laboratory (DLab), Fundacion Ciencia y Vida, Santiago, Chile; Centro Interdisciplinario de Neurociencia de Valparaíso, Valparaiso, Chile
|
29
|
Ma C, Luciani T, Terebus A, Liang J, Marai GE. PRODIGEN: visualizing the probability landscape of stochastic gene regulatory networks in state and time space. BMC Bioinformatics 2017; 18:24. [PMID: 28251874 PMCID: PMC5333168 DOI: 10.1186/s12859-016-1447-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022] Open
Abstract
Background: Visualizing the complex probability landscape of stochastic gene regulatory networks can further biologists’ understanding of phenotypic behavior associated with specific genes. Results: We present PRODIGEN (PRObability DIstribution of GEne Networks), a web-based visual analysis tool for the systematic exploration of probability distributions over simulation time and state space in such networks. PRODIGEN was designed in collaboration with bioinformaticians who research stochastic gene networks. The analysis tool combines in a novel way existing, expanded, and new visual encodings to capture the time-varying characteristics of probability distributions: spaghetti plots over one dimensional projection, heatmaps of distributions over 2D projections, enhanced with overlaid time curves to display temporal changes, and novel individual glyphs of state information corresponding to particular peaks. Conclusions: We demonstrate the effectiveness of the tool through two case studies on the computed probabilistic landscape of a gene regulatory network and of a toggle-switch network. Domain expert feedback indicates that our visual approach can help biologists: 1) visualize probabilities of stable states, 2) explore the temporal probability distributions, and 3) discover small peaks in the probability landscape that have potential relation to specific diseases.
Affiliation(s)
- Chihua Ma
- Electronic Visualization Laboratory, Department of Computer Science, University of Illinois at Chicago, 851 S. Morgan St (M/C 152), Room 1120 SEO, Chicago, 60607, IL, US.
- Timothy Luciani
- Electronic Visualization Laboratory, Department of Computer Science, University of Illinois at Chicago, 851 S. Morgan St (M/C 152), Room 1120 SEO, Chicago, 60607, IL, US
- Anna Terebus
- Department of Bioengineering, University of Illinois at Chicago, 851 S. Morgan St (M/C 063), Room 218 SEO, Chicago, 60607, IL, USA
- Jie Liang
- Department of Bioengineering, University of Illinois at Chicago, 851 S. Morgan St (M/C 063), Room 218 SEO, Chicago, 60607, IL, USA
- G Elisabeta Marai
- Electronic Visualization Laboratory, Department of Computer Science, University of Illinois at Chicago, 851 S. Morgan St (M/C 152), Room 1120 SEO, Chicago, 60607, IL, US
|
30
|
Singh KV, Vig L. Improved prediction of missing protein interactome links via anomaly detection. APPLIED NETWORK SCIENCE 2017; 2:2. [PMID: 30533510 PMCID: PMC6245231 DOI: 10.1007/s41109-017-0022-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/24/2016] [Accepted: 01/14/2017] [Indexed: 06/09/2023]
Abstract
Interactomes such as Protein interaction networks have many undiscovered links between entities. Experimental verification of every link in these networks is prohibitively expensive, and therefore computational methods to direct the search for possible links are of great value. The problem of finding undiscovered links in a network is also referred to as the link prediction problem. A popular approach for link prediction has been to formulate it as a binary classification problem in which class labels indicate the existence or absence of a link (we refer to these as positive links or negative links respectively) between a pair of nodes in the network. Researchers have successfully applied such supervised classification techniques to determine the presence of links in protein interaction networks. However, it is quite common for protein-protein interaction (PPI) networks to have a large proportion of undiscovered links. Thus, a link prediction approach could incorrectly treat undiscovered positive links as negative links, thereby introducing a bias in the learning. In this paper, we propose to denoise the class of negative links in the training data via a Gaussian process anomaly detector. We show that this significantly reduces the noise due to mislabelled negative links and improves the resulting link prediction accuracy. We evaluate the approach by introducing synthetic noise into the PPI networks and measuring how accurately we can reconstruct the original PPI networks using classifiers trained on both noisy and denoised data. Experiments were performed with five different PPI network datasets and the results indicate a significant reduction in bias due to label noise, and more importantly, a significant improvement in the accuracy of detecting missing links via classification.
Affiliation(s)
- Kushal Veer Singh
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, Delhi, India
- Lovekesh Vig
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, Delhi, India
|
31
|
Elhesha R, Kahveci T. Identification of large disjoint motifs in biological networks. BMC Bioinformatics 2016; 17:408. [PMID: 27716036 PMCID: PMC5053092 DOI: 10.1186/s12859-016-1271-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 04/06/2016] [Accepted: 09/21/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Biological networks provide great potential to understand how cells function. Network motifs, frequent topological patterns, are key structures through which biological networks operate. Finding motifs in biological networks remains a computationally challenging task as the size of the motif and the underlying network grow. Often, different copies of a given motif topology in a network share nodes or edges. Counting such overlapping copies introduces significant problems in motif identification. RESULTS In this paper, we develop a scalable algorithm for finding network motifs. Unlike most of the existing studies, our algorithm counts independent copies of each motif topology. We introduce a set of small patterns and prove that we can construct any larger pattern by joining those patterns iteratively. By iteratively joining already identified motifs with those patterns, our algorithm avoids (i) constructing topologies which do not exist in the target network and (ii) repeatedly counting the frequency of the motifs generated in subsequent iterations. Our experiments on real and synthetic networks demonstrate that our method is significantly faster and more accurate than the existing methods including SUBDUE and FSG. CONCLUSIONS We conclude that our method for finding network motifs is scalable and computationally feasible for large motif sizes and a broad range of networks with different sizes and densities. We proved that any motif with four or more edges can be constructed as a join of the small patterns.
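The difference between counting overlapping copies and counting independent copies of a motif can be made concrete with a small Python sketch. It is not the authors' algorithm: it uses the triangle as the motif, assumes that "independent" means edge-disjoint, and picks copies greedily, which yields a lower bound rather than the maximum number of disjoint copies. The function names are hypothetical.

```python
from itertools import combinations


def triangle_copies(edges):
    """List every triangle in an undirected graph as a sorted node tuple."""
    adj = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    copies = set()
    for u in adj:
        for v, w in combinations(sorted(adj[u]), 2):
            if w in adj[v]:
                copies.add(tuple(sorted((u, v, w))))
    return sorted(copies)


def greedy_edge_disjoint(copies):
    """Greedily keep copies that share no edge with any copy already kept."""
    used_edges = set()
    kept = []
    for copy in copies:
        copy_edges = {frozenset(pair) for pair in combinations(copy, 2)}
        if used_edges.isdisjoint(copy_edges):
            kept.append(copy)
            used_edges |= copy_edges
    return kept


# Toy example (assumed data): the complete graph on four nodes.
k4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
all_copies = triangle_copies(k4)
print(len(all_copies))                        # 4 overlapping copies
print(len(greedy_edge_disjoint(all_copies)))  # 1 edge-disjoint copy
```

In the complete graph on four nodes every pair of triangles shares an edge, so the overlapping count of four collapses to a single independent copy.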
Affiliation(s)
- Rasha Elhesha
- CISE Department, University of Florida, 432 Newell Dr, Gainesville, Florida, 32611, USA.
- Tamer Kahveci
- CISE Department, University of Florida, 432 Newell Dr, Gainesville, Florida, 32611, USA
|
32
|
Martin AJM, Dominguez C, Contreras-Riquelme S, Holmes DS, Perez-Acle T. Graphlet Based Metrics for the Comparison of Gene Regulatory Networks. PLoS One 2016; 11:e0163497. [PMID: 27695050 PMCID: PMC5047442 DOI: 10.1371/journal.pone.0163497] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 03/29/2016] [Accepted: 09/10/2016] [Indexed: 11/18/2022] Open
Abstract
Understanding the control of gene expression remains one of the main challenges in the post-genomic era. Accordingly, a plethora of methods exists to identify variations in gene expression levels. These variations underlie almost all relevant biological phenomena, including disease and adaptation to environmental conditions. However, computational tools to identify how regulation changes are scarce. Regulation of gene expression is usually depicted in the form of a gene regulatory network (GRN). Structural changes in a GRN over time and conditions represent variations in the regulation of gene expression. Like other biological networks, GRNs are composed of basic building blocks called graphlets. As a consequence, two new metrics based on graphlets are proposed in this work: REConstruction Rate (REC) and REC Graphlet Degree (RGD). REC determines the rate of graphlet similarity between different states of a network and RGD identifies the subset of nodes with the highest topological variation. In other words, RGD discerns how the GRN was rewired. REC and RGD were used to compare the local structure of nodes in condition-specific GRNs obtained from gene expression data of Escherichia coli, forming biofilms and cultured in suspension. According to our results, most of the network local structure remains unaltered in the two compared conditions. Nevertheless, changes reported by RGD necessarily imply that a different cohort of regulators (i.e. transcription factors (TFs)) appear on the scene, shedding light on how the regulation of gene expression occurs when E. coli transits from suspension to biofilm. Consequently, we propose that both metrics REC and RGD should be adopted as a quantitative approach to conduct differential analyses of GRNs. A tool that implements both metrics is available as an on-line web server (http://dlab.cl/loto).
Affiliation(s)
- Alberto J. M. Martin
- Computational Biology Lab, Fundación Ciencia & Vida, Santiago, Chile
- Centro Interdisciplinario de Neurociencia de Valparaíso, Universidad de Valparaíso, Chile
- * E-mail: (AJMM); (TPA)
- Calixto Dominguez
- Computational Biology Lab, Fundación Ciencia & Vida, Santiago, Chile
- Center for Bioinformatics and Genome Biology, Fundación Ciencia & Vida and Facultad de Ciencias Biologicas, Universidad Andres Bello, Santiago, Chile
- David S. Holmes
- Center for Bioinformatics and Genome Biology, Fundación Ciencia & Vida and Facultad de Ciencias Biologicas, Universidad Andres Bello, Santiago, Chile
- Tomas Perez-Acle
- Computational Biology Lab, Fundación Ciencia & Vida, Santiago, Chile
- Centro Interdisciplinario de Neurociencia de Valparaíso, Universidad de Valparaíso, Chile
- * E-mail: (AJMM); (TPA)
|
33
|
Abstract
Throughout biology, function is intimately linked with form. Across scales ranging from subcellular to multiorganismal, the identity and organization of a biological structure's subunits dictate its properties. The field of molecular morphogenesis has traditionally been concerned with describing these links, decoding the molecular mechanisms that give rise to the shape and structure of cells, tissues, organs, and organisms. Recent advances in synthetic biology promise unprecedented control over these molecular mechanisms; this opens the path to not just probing morphogenesis but directing it. This review explores several frontiers in the nascent field of synthetic morphogenesis, including programmable tissues and organs, synthetic biomaterials and programmable matter, and engineering complex morphogenic systems de novo. We will discuss each frontier's objectives, current approaches, constraints and challenges, and future potential.
Affiliation(s)
- Brian P Teague
- Synthetic Biology Center, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
- Patrick Guye
- Synthetic Biology Center, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
- Ron Weiss
- Synthetic Biology Center, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
|
34
|
Sendiña-Nadal I, Danziger MM, Wang Z, Havlin S, Boccaletti S. Assortativity and leadership emerge from anti-preferential attachment in heterogeneous networks. Sci Rep 2016; 6:21297. [PMID: 26887684 PMCID: PMC4758035 DOI: 10.1038/srep21297] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 10/16/2015] [Accepted: 01/18/2016] [Indexed: 01/08/2023]
Abstract
Real-world networks have distinct topologies, with marked deviations from purely random networks. Many of them exhibit degree-assortativity, with nodes of similar degree more likely to link to one another. Though microscopic mechanisms have been suggested for the emergence of other topological features, assortativity has proven elusive. Assortativity can be artificially implanted in a network via degree-preserving link permutations; however, this destroys the graph's hierarchical clustering and does not correspond to any microscopic mechanism. Here, we propose the first generative model which creates heterogeneous networks with scale-free-like properties in degree and clustering distributions and tunable realistic assortativity. Two distinct populations of nodes are incrementally added to an initial network by selecting a subgraph to connect to at random. One population (the followers) follows preferential attachment, while the other population (the potential leaders) connects via anti-preferential attachment: they link to lower degree nodes when added to the network. By selecting the lower degree nodes, the potential leader nodes maintain high visibility during the growth process, eventually growing into hubs. The evolution of links in Facebook empirically validates the connection between the initial anti-preferential attachment and long term high degree. In this way, our work sheds new light on the structure and evolution of social networks.
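The degree-assortativity mentioned in this abstract (nodes of similar degree linking to one another) is commonly quantified as the Pearson correlation between the degrees found at the two ends of each edge. The Python sketch below computes that standard coefficient only; it is not the authors' generative model, and the function name `degree_assortativity` and the edge-list input format are assumptions made for illustration.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at both ends of every edge.

    `edges` is assumed to be a list of (u, v) pairs describing a simple
    undirected graph (no self-loops, no duplicate edges).
    Returns NaN when the coefficient is undefined (no edges, or all
    nodes have the same degree).
    """
    if not edges:
        return float("nan")

    # Node degrees from the edge list.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Each undirected edge contributes both orderings, so the two
    # variables share the same mean and variance.
    xs, ys = [], []
    for u, v in edges:
        xs.extend((degree[u], degree[v]))
        ys.extend((degree[v], degree[u]))

    n = len(xs)
    mean = sum(xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        return float("nan")
    return cov / var
```

Values near +1 indicate that high-degree nodes tend to attach to other high-degree nodes. A star graph, in which one hub links only to degree-one leaves, gives -1.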
Affiliation(s)
- I. Sendiña-Nadal
- Complex Systems Group & GISC, Universidad Rey Juan Carlos, 28933 Móstoles, Madrid, Spain
- Center for Biomedical Technology, Universidad Politécnica de Madrid, 28223 Pozuelo de Alarcón, Madrid, Spain
- M. M. Danziger
- Department of Physics, Bar Ilan University, Ramat Gan 52900, Israel
- Z. Wang
- School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Interdisciplinary Graduate School of Engineering Sciences, Kyushu University, Fukuoka, 816-8580, Japan
- S. Havlin
- Department of Physics, Bar Ilan University, Ramat Gan 52900, Israel
- S. Boccaletti
- CNR- Institute of Complex Systems, Via Madonna del Piano, 10, 50019 Sesto Fiorentino, Florence, Italy
- The Italian Embassy in Israel, 25 Hamered st., 68125 Tel Aviv, Israel
|
35
|
Liang C, Luo J, Song D. Network simulation reveals significant contribution of network motifs to the age-dependency of yeast protein-protein interaction networks. MOLECULAR BIOSYSTEMS 2015; 10:2277-88. [PMID: 24964354 DOI: 10.1039/c4mb00230j] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 11/21/2022]
Abstract
Advances in proteomic technologies combined with sophisticated computing and modeling methods have generated an unprecedented amount of high-throughput data for system-scale analysis. As a result, the study of protein-protein interaction (PPI) networks has garnered much attention in recent years. One of the most fundamental problems in studying PPI networks is to understand how their architecture originated and evolved to their current state. By investigating how proteins of different ages are connected in the yeast PPI networks, one can deduce their expansion procedure in evolution and how the ancient primitive network expanded and evolved. Studies have shown that proteins are often connected to other proteins of a similar age, suggesting a high degree of age preference between interacting proteins. Though several theories have been proposed to explain this phenomenon, none of them considered protein-clusters as a contributing factor. Here we first investigate the age-dependency of the proteins from the perspective of network motifs. Our analysis confirms that proteins of the same age groups tend to form interacting network motifs; furthermore, those proteins within motifs tend to be within protein complexes and the interactions among them largely contribute to the observed age preference in the yeast PPI networks. In light of these results, we describe a new modeling approach, based on "network motifs", whereby topologically connected protein clusters in the network are treated as single evolutionary units. Instead of modeling single proteins, our approach models the connections and evolutionary relationships of multiple related protein clusters or "network motifs" that are collectively integrated into an existing PPI network. Through simulation studies, we found that the "network motif" modeling approach can capture yeast PPI network properties better than if individual proteins were considered to be the simplest evolutionary units. 
Our approach provides a fresh perspective on modeling the evolution of yeast PPI networks, specifically that PPI networks may have a much higher age-dependency of interaction density than had been previously envisioned.
Affiliation(s)
- Cheng Liang
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, China.
|
36
|
Computational Identification of Post Translational Modification Regulated RNA Binding Protein Motifs. PLoS One 2015; 10:e0137696. [PMID: 26368004 PMCID: PMC4569568 DOI: 10.1371/journal.pone.0137696] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 05/25/2015] [Accepted: 08/19/2015] [Indexed: 11/19/2022]
Abstract
RNA and its associated RNA binding proteins (RBPs) mediate a diverse array of cellular functions and phenotypes. The interactions between RNA and RBPs are implicated in many roles of biochemical processing by the cell such as localization, protein translation, and RNA stability. Recent discoveries of novel mechanisms that are of significant evolutionary advantage between RBPs and RNA include the interaction of the RBP with the 3’ and 5’ untranslated region (UTR) of target mRNA. These mechanisms are shown to function through interaction of a trans-factor (RBP) and a cis-regulatory element (3’ or 5’ UTR) by the binding of an RBP to a regulatory-consensus nucleic acid motif region that is conserved throughout evolution. Through signal transduction, regulatory RBPs are able to temporarily dissociate from their target sites on mRNAs and induce translation, typically through a post-translational modification (PTM). These small, regulatory motifs located in the UTR of mRNAs are subject to a loss-of-function due to single polymorphisms or other mutations that disrupt the motif and inhibit the ability to associate into the complex with RBPs. The identification of a consensus motif for a given RBP is difficult, time consuming, and requires a significant degree of experimentation to identify each motif-containing gene on a genomic scale. We have developed a computational algorithm to analyze high-throughput genomic arrays that contain differential binding induced by a PTM for an RBP of interest: RBP-PTM Target Scan (RPTS). We demonstrate the ability of this application to accurately predict a PTM-specific binding motif to an RBP that has no antibody capable of distinguishing the PTM of interest, negating the use of in-vitro exonuclease digestion techniques.
|
37
|
Lyon KF, Strong CL, Schooler SG, Young RJ, Roy N, Ozar B, Bachmeier M, Rajasekaran S, Schiller MR. Natural variability of minimotifs in 1092 people indicates that minimotifs are targets of evolution. Nucleic Acids Res 2015; 43:6399-412. [PMID: 26068475 PMCID: PMC4513861 DOI: 10.1093/nar/gkv580] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 04/10/2014] [Revised: 04/17/2015] [Accepted: 05/21/2015] [Indexed: 01/05/2023]
Abstract
Since the function of a short contiguous peptide minimotif can be introduced or eliminated by a single point mutation, these functional elements may be a source of human variation and a target of selection. We analyzed the variability of ∼300 000 minimotifs in 1092 human genomes from the 1000 Genomes Project. Most minimotifs have been purified by selection, with a 94% invariance, which supports important functional roles for minimotifs. Minimotifs are generally under negative selection, possessing high genomic evolutionary rate profiling (GERP) and sitewise likelihood-ratio (SLR) scores. Some are subject to neutral drift or positive selection, similar to coding regions. Most SNPs in minimotifs were common variants, but with minor allele frequencies generally <10%. This was supported by low substitution rates and few newly derived minimotifs. Several minimotif alleles showed different intercontinental and regional geographic distributions, strongly suggesting a role for minimotifs in adaptive evolution. We also note that 4% of PTM minimotif sites in histone tails were common variants, which has the potential to differentially affect DNA packaging among individuals. In conclusion, minimotifs are a source of functional genetic variation in the human population; thus, they are likely to be an important target of selection and evolution.
Affiliation(s)
- Kenneth F Lyon
- Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Christy L Strong
- Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Steve G Schooler
- Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Richard J Young
- Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-2155, USA
- Nervik Roy
- Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Brittany Ozar
- Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Mark Bachmeier
- Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Sanguthevar Rajasekaran
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-2155, USA
- Martin R Schiller
- Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
|
38
|
Kim WY, Kurmar S. Sensible method for updating motif instances in an increased biological network. Methods 2015; 83:71-9. [PMID: 25869675 DOI: 10.1016/j.ymeth.2015.04.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/14/2015] [Revised: 04/04/2015] [Accepted: 04/06/2015] [Indexed: 11/20/2022]
Abstract
A network motif is defined as an over-represented subgraph pattern in a network. Network motif based techniques have been widely applied in analyses of biological networks such as transcription regulation networks (TRNs), protein-protein interaction networks (PPIs), and metabolic networks. The detection of network motifs involves the computationally expensive enumeration of subgraphs, NP-complete graph isomorphism testing, and significance testing through the generation of many random graphs to determine the statistical uniqueness of a given subgraph. These computational obstacles make network motif analysis unfeasible for many real-world applications. We observe that the fast growth of biotechnology has led to the rapid accretion of molecules (vertices) and interactions (edges) to existing biological network databases. Even with a small percentage of additions, revised networks can have a large number of differing motif instances. Currently, no existing algorithms recalculate motif instances in 'updated' networks in a practical manner. In this paper, we introduce a sensible method for efficiently recalculating motif instances by performing motif enumeration from only updated vertices and edges. Preliminary experimental results indicate that our method greatly reduces computational time by eliminating the repeated enumeration of overlapped subgraph instances detected in earlier versions of the network. The software program implementing this algorithm, named SUNMI (Sensible Update of Network Motif Instances), is currently a stand-alone Java program, and we plan to upgrade it to a web-interactive program that will be available through http://faculty.washington.edu/kimw6/research.htm in the near future. Meanwhile, readers are encouraged to contact the authors to obtain the stand-alone SUNMI program.
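The pipeline this abstract describes (count a subgraph pattern, compare the count against many random graphs, report significance) can be illustrated for the simplest case, the triangle in an undirected network, where isomorphism testing is trivial. The Python sketch below is not SUNMI and does not implement its incremental update. It assumes a simple graph given as a list of unique edges, all function names are hypothetical, and the degree-preserving double-edge swap used as the null model is one common choice rather than something the abstract specifies.

```python
import random
from statistics import mean, pstdev


def count_triangles(edges):
    """Count triangles in a simple undirected graph given as unique edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Common neighbours of both endpoints close a triangle over that edge;
    # every triangle is found once per each of its three edges.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomize a graph with double-edge swaps that keep every node degree."""
    edges = [tuple(e) for e in edges]
    if len(edges) < 2:
        return edges
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # try either of the two possible rewirings
        if len({a, b, c, d}) < 4:
            continue  # a shared endpoint would create a self-loop or duplicate
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # keep the graph simple
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def motif_z_score(edges, n_random=100, seed=0):
    """Return (observed count, mean null count, z-score) for triangles."""
    rng = random.Random(seed)
    observed = count_triangles(edges)
    null = [
        count_triangles(degree_preserving_shuffle(edges, 10 * len(edges), rng))
        for _ in range(n_random)
    ]
    mu, sigma = mean(null), pstdev(null)
    z = (observed - mu) / sigma if sigma > 0 else float("nan")
    return observed, mu, z
```

A large positive z-score suggests the triangle is over-represented relative to random graphs with the same degree sequence. For larger patterns, the counting step would need real subgraph enumeration and isomorphism checks, which is where the cost discussed in the abstract arises.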
Affiliation(s)
- W Y Kim
- Computing and Software Systems, School of Science, Technology, Engineering, and Mathematics, University of Washington Bothell, Bothell, WA 98011-8246, United States.
- S Kurmar
- Computing and Software Systems, School of Science, Technology, Engineering, and Mathematics, University of Washington Bothell, Bothell, WA 98011-8246, United States.
|
39
|
Peng W, Wang J, Wu F, Yi P. Detecting conserved protein complexes using a dividing-and-matching algorithm and unequally lenient criteria for network comparison. Algorithms Mol Biol 2015; 10:21. [PMID: 26136815 PMCID: PMC4487215 DOI: 10.1186/s13015-015-0053-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 09/30/2014] [Accepted: 05/26/2015] [Indexed: 01/09/2023]
Abstract
The increase of protein–protein interaction (PPI) data of different species makes it possible to identify common subnetworks (conserved protein complexes) across species via local alignment of their PPI networks, which helps us study biological evolution. Local alignment algorithms compare PPI networks of different species at both protein sequence and network structure levels. For computational and biological reasons, it is hard to find common subnetworks with strictly similar topology from two input PPI networks. Consequently, some methods introduce less strict criteria for topological similarity. However, those methods fail to consider the differences between the two input networks and adopt equally lenient criteria on them. In this work, a new dividing-and-matching-based method, namely UEDAMAlign, is proposed to detect conserved protein complexes. This method first uses known protein complexes or computational methods to divide one of the two input PPI networks into subnetworks and then maps the proteins in these subnetworks to the other PPI network to get their homologous proteins. After that, UEDAMAlign applies unequally lenient criteria to the two input networks to find common connected components from the proteins in the subnetworks and their homologous proteins in the other network. We carry out network alignments between S. cerevisiae and D. melanogaster, and between H. sapiens and D. melanogaster. Comparisons are made between six other existing methods and UEDAMAlign. The experimental results show that UEDAMAlign outperforms other existing methods in recovering conserved protein complexes that both match well with known protein complexes and have similar functions.
|
40
|
Wang P, Lu J, Yu X, Liu Z. Duplication and Divergence Effect on Network Motifs in Undirected Bio-Molecular Networks. IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS 2015; 9:312-20. [PMID: 25203993 DOI: 10.1109/tbcas.2014.2343620] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 05/10/2023]
Abstract
Duplication and divergence are two basic evolutionary mechanisms of bio-molecular networks. Real-world bio-molecular networks and their statistical characteristics can be well mimicked by artificial algorithms based on the two mechanisms. Bio-molecular networks consist of network motifs, which act as building blocks of large-scale networks. A fundamental question is how network motifs evolve through long-term evolution and natural selection. By considering the effect of various duplication and divergence strategies, we find that the underlying duplication scheme of the real-world undirected bio-molecular networks is more likely to follow the anti-preference strategy than the random one. The anti-preference duplication mechanism and the dimerization processes can lead to the formation of various motifs, and robustly conserve quantities of motifs in the artificial networks similar to those in the real-world ones. Furthermore, the anti-preference mechanism and edge deletion divergence can robustly preserve the sparsity of the networks. The investigations reveal the possible evolutionary mechanisms of network motifs in real-world bio-molecular networks, and have potential implications in the design, synthesis and reengineering of biological networks for biomedical purposes.
|
41
|
Speegle G. P-Finder: Reconstruction of Signaling Networks from Protein-Protein Interactions and GO Annotations. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:309-321. [PMID: 26357219 DOI: 10.1109/tcbb.2014.2355216] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
Because most complex genetic diseases are caused by defects of cell signaling, illuminating a signaling cascade is essential for understanding their mechanisms. We present three novel computational algorithms to reconstruct signaling networks between a starting protein and an ending protein using genome-wide protein-protein interaction (PPI) networks and gene ontology (GO) annotation data. A signaling network is represented as a directed acyclic graph in a merged form of multiple linear pathways. An advanced semantic similarity metric is applied for weighting PPIs as a preprocessing step for all three methods. The first algorithm repeatedly extends the list of nodes based on path frequency towards an ending protein. The second algorithm repeatedly appends edges based on the occurrence of network motifs, which indicate the link patterns appearing more frequently in a PPI network than in a random graph. The last algorithm uses the information propagation technique, which iteratively updates edge orientations based on the path strength and merges the selected directed edges. Our experimental results demonstrate that the proposed algorithms achieve higher accuracy than previous methods when they are tested on well-studied pathways of S. cerevisiae. Furthermore, we introduce an interactive web application tool, called P-Finder, to visualize reconstructed signaling networks.
|
42
|
Abstract
Years of meticulous curation of scientific literature and increasingly reliable computational predictions have resulted in the creation of vast databases of protein interaction data. Over the years, these repositories have become a basic framework in which experiments are analyzed and new directions of research are explored. Here we present an overview of the most widely used protein-protein interaction databases and the methods they employ to gather, combine, and predict interactions. We also point out the trade-off between comprehensiveness and accuracy and the main pitfall scientists have to be aware of before adopting protein interaction databases in any single-gene or genome-wide analysis.
|
43
|
Pearcy N, Crofts JJ, Chuzhanova N. Network motif frequency vectors reveal evolving metabolic network organisation. MOLECULAR BIOSYSTEMS 2014; 11:77-85. [PMID: 25325903 DOI: 10.1039/c4mb00430b] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 12/25/2022]
Abstract
At the systems level many organisms of interest may be described by their patterns of interaction, and as such, are perhaps best characterised via network or graph models. Metabolic networks, in particular, are fundamental to the proper functioning of many important biological processes, and thus, have been widely studied over the past decade or so. Such investigations have revealed a number of shared topological features, such as a short characteristic path-length, large clustering coefficient and hierarchical modular structure. However, the extent to which evolutionary and functional properties of metabolism manifest via this underlying network architecture remains unclear. In this paper, we employ a novel graph embedding technique, based upon low-order network motifs, to compare metabolic network structure for 383 bacterial species categorised according to a number of biological features. In particular, we introduce a new global significance score which enables us to quantify important evolutionary relationships that exist between organisms and their physical environments. Using this new approach, we demonstrate a number of significant correlations between environmental factors, such as growth conditions and habitat variability, and network motif structure, providing evidence that organism adaptability leads to increased complexities in the resultant metabolic networks.
Affiliation(s)
- Nicole Pearcy
- School of Science and Technology, Nottingham Trent University, Nottingham, NG11 8NS, UK.
|
44
|
Guo NL, Wan YW. Network-based identification of biomarkers coexpressed with multiple pathways. Cancer Inform 2014; 13:37-47. [PMID: 25392692 PMCID: PMC4218687 DOI: 10.4137/cin.s14054] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 04/23/2014] [Revised: 06/25/2014] [Accepted: 06/29/2014] [Indexed: 02/07/2023]
Abstract
Unraveling complex molecular interactions and networks and incorporating clinical information in modeling will present a paradigm shift in molecular medicine. Embedding biological relevance via modeling molecular networks and pathways has become increasingly important for biomarker identification in cancer susceptibility and metastasis studies. Here, we give a comprehensive overview of computational methods used for biomarker identification, and provide a performance comparison of several network models used in studies of cancer susceptibility, disease progression, and prognostication. Specifically, we evaluated implication networks, Boolean networks, Bayesian networks, and Pearson’s correlation networks in constructing gene coexpression networks for identifying lung cancer diagnostic and prognostic biomarkers. The results show that implication networks, implemented in the Genet package, identified sets of biomarkers that generated an accurate prediction of lung cancer risk and metastases; moreover, implication networks revealed more biologically relevant molecular interactions than Boolean networks, Bayesian networks, and Pearson’s correlation networks when evaluated with the MSigDB database.
Affiliation(s)
- Nancy Lan Guo
- Mary Babb Randolph Cancer Center/School of Public Health, West Virginia University, Morgantown, WV, USA
- Ying-Wooi Wan
- Mary Babb Randolph Cancer Center/School of Public Health, West Virginia University, Morgantown, WV, USA
|
45
|
Tamm MV, Shkarin AB, Avetisov VA, Valba OV, Nechaev SK. Islands of stability in motif distributions of random networks. PHYSICAL REVIEW LETTERS 2014; 113:095701. [PMID: 25215992 DOI: 10.1103/physrevlett.113.095701] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/05/2013] [Indexed: 06/03/2023]
Abstract
We consider random nondirected networks subject to dynamics conserving vertex degrees and study, analytically and numerically, equilibrium three-vertex motif distributions in the presence of an external field h coupled to one of the motifs. For small h, the numerics is well described by the "chemical kinetics" for the concentrations of motifs based on the law of mass action. For larger h, a transition into some trapped motif state occurs in Erdős-Rényi networks. We explain the existence of the transition by employing the notion of the entropy of the motif distribution and describe it in terms of a phenomenological Landau-type theory with a nonzero cubic term. A localization transition should always occur if the entropy function is nonconvex. We conjecture that this phenomenon is the origin of the motifs' pattern formation in real evolutionary networks.
Affiliation(s)
- M V Tamm
- Physics Department, Moscow State University, 119992 Moscow, Russia and Department of Applied Mathematics, National Research University Higher School of Economics, 101000 Moscow, Russia
- A B Shkarin
- Department of Physics, Yale University, 217 Prospect Street, New Haven, Connecticut 06511, USA
- V A Avetisov
- N.N. Semenov Institute of Chemical Physics of the Russian Academy of Sciences, 119991 Moscow, Russia and Department of Applied Mathematics, National Research University Higher School of Economics, 101000 Moscow, Russia
- O V Valba
- Department of Applied Mathematics, National Research University Higher School of Economics, 101000 Moscow, Russia and Université Paris-Sud/CNRS, LPTMS, UMR8626, Bâtiment 100, 91405 Orsay, France and Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- S K Nechaev
- Department of Applied Mathematics, National Research University Higher School of Economics, 101000 Moscow, Russia and Université Paris-Sud/CNRS, LPTMS, UMR8626, Bâtiment 100, 91405 Orsay, France and P.N. Lebedev Physical Institute of the Russian Academy of Sciences, 119991 Moscow, Russia
|
46
|
Wang P, Lü J, Yu X. Identification of important nodes in directed biological networks: a network motif approach. PLoS One 2014; 9:e106132. [PMID: 25170616 PMCID: PMC4149525 DOI: 10.1371/journal.pone.0106132] [Citation(s) in RCA: 72] [Impact Index Per Article: 7.2] [Received: 09/04/2013] [Accepted: 08/03/2014] [Indexed: 11/18/2022]
Abstract
Identification of important nodes in complex networks has attracted increasing attention over the last decade. Various measures have been proposed to characterize the importance of nodes in complex networks, such as the degree, betweenness and PageRank. Different measures consider different aspects of complex networks. Although there are numerous results reported on undirected complex networks, few results have been reported on directed biological networks. Based on network motifs and principal component analysis (PCA), this paper aims at introducing a new measure to characterize node importance in directed biological networks. Investigations on five real-world biological networks indicate that the proposed method can robustly identify important nodes in different networks, such as command interneurons, global regulators, and nodes that are not hubs but are evolutionarily conserved. Receiver Operating Characteristic (ROC) curves for the five networks indicate remarkable prediction accuracy of the proposed measure. The proposed index provides an alternative complex network metric. Potential implications of the related investigations include identifying network control and regulation targets, biological networks modeling and analysis, as well as networked medicine.
Affiliation(s)
- Pei Wang
- School of Mathematics and Information Sciences, Henan University, Kaifeng, China
- School of Electrical and Computer Engineering, RMIT University, Melbourne, Victoria, Australia
- Jinhu Lü
- Institute of Systems Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Xinghuo Yu
- School of Electrical and Computer Engineering, RMIT University, Melbourne, Victoria, Australia
47
Zhang Y, Tao C, Jiang G, Nair AA, Su J, Chute CG, Liu H. Network-based analysis reveals distinct association patterns in a semantic MEDLINE-based drug-disease-gene network. J Biomed Semantics 2014; 5:33. [PMID: 25170419 PMCID: PMC4137727 DOI: 10.1186/2041-1480-5-33] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 02/02/2013] [Accepted: 07/02/2014] [Indexed: 12/21/2022]
Abstract
BACKGROUND: A huge number of associations among different biological entities (e.g., disease, drug, and gene) are scattered across millions of biomedical articles. Systematic analysis of such heterogeneous data can infer novel associations among different biological entities in the context of personalized medicine and translational research. Recently, network-based computational approaches have gained popularity in investigating such heterogeneous data, proposing novel therapeutic targets and deciphering disease mechanisms. However, little effort has been devoted to investigating associations among drugs, diseases, and genes in an integrative manner.
RESULTS: We propose a novel network-based computational framework to identify statistically over-represented subnetwork patterns, called network motifs, in an integrated disease-drug-gene network extracted from Semantic MEDLINE. The framework consists of two steps. The first step is to construct an association network by extracting pair-wise associations between diseases, drugs and genes in Semantic MEDLINE using a domain pattern driven strategy. A Resource Description Framework (RDF)-linked data approach is used to re-organize the data to increase the flexibility of data integration, the interoperability within domain ontologies, and the efficiency of data storage. Unique associations among drugs, diseases, and genes are extracted for downstream network-based analysis. The second step is to apply a network-based approach to mine the local network structure of this heterogeneous network. Significant network motifs are then identified as the backbone of the network. A simplified network based on those significant motifs is then constructed to facilitate discovery. We implemented our computational framework and identified five network motifs, each of which corresponds to specific biological meanings. Three case studies demonstrate that novel associations are derived from the network topology analysis of reconstructed networks of significant network motifs, further validated by expert knowledge and functional enrichment analyses.
CONCLUSIONS: We have developed a novel network-based computational approach to investigate the heterogeneous drug-gene-disease network extracted from Semantic MEDLINE. We demonstrate the power of this approach by prioritizing candidate disease genes, inferring potential disease relationships, and proposing novel drug targets, within the context of the entire knowledge. The results indicate that such an approach will facilitate the formulation of novel research hypotheses, which is critical for translational medicine research and personalized medicine.
Affiliation(s)
- Yuji Zhang
- Division of Biostatistics and Bioinformatics, University of Maryland Greenebaum Cancer Center and Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore, MD, USA
- Cui Tao
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Guoqian Jiang
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Asha A Nair
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Jian Su
- Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, Rochester, MN, USA
- Christopher G Chute
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Hongfang Liu
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
48
Haldane A, Manhart M, Morozov AV. Biophysical fitness landscapes for transcription factor binding sites. PLoS Comput Biol 2014; 10:e1003683. [PMID: 25010228 PMCID: PMC4091707 DOI: 10.1371/journal.pcbi.1003683] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 12/03/2013] [Accepted: 05/11/2014] [Indexed: 11/18/2022]
Abstract
Phenotypic states and evolutionary trajectories available to cell populations are ultimately dictated by complex interactions among DNA, RNA, proteins, and other molecular species. Here we study how evolution of gene regulation in a single-cell eukaryote S. cerevisiae is affected by interactions between transcription factors (TFs) and their cognate DNA sites. Our study is informed by a comprehensive collection of genomic binding sites and high-throughput in vitro measurements of TF-DNA binding interactions. Using an evolutionary model for monomorphic populations evolving on a fitness landscape, we infer fitness as a function of TF-DNA binding to show that the shape of the inferred fitness functions is in broad agreement with a simple functional form inspired by a thermodynamic model of two-state TF-DNA binding. However, the effective parameters of the model are not always consistent with physical values, indicating selection pressures beyond the biophysical constraints imposed by TF-DNA interactions. We find little statistical support for the fitness landscape in which each position in the binding site evolves independently, indicating that epistasis is common in the evolution of gene regulation. Finally, by correlating TF-DNA binding energies with biological properties of the sites or the genes they regulate, we are able to rule out several scenarios of site-specific selection, under which binding sites of the same TF would experience different selection pressures depending on their position in the genome. These findings support the existence of universal fitness landscapes which shape evolution of all sites for a given TF, and whose properties are determined in part by the physics of protein-DNA interactions.
Specialized proteins called transcription factors turn genes on and off by binding to short stretches of DNA in their regulatory regions. Precise gene regulation is essential for cellular survival and proliferation, and its evolution and maintenance under mutational pressure are central issues in biology. Here we discuss how evolution of gene regulation is shaped by the need to maintain favorable binding energies between transcription factors and their genomic binding sites. We show that, surprisingly, transcription factor binding is not affected by many biological properties, such as the essentiality of the gene it regulates. Rather, all sites for a given factor appear to evolve under a universal set of constraints, which can be rationalized in terms of a simple model inspired by transcription factor-DNA binding thermodynamics.
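As a point of reference, the two-state thermodynamic picture of TF-DNA binding mentioned in the abstract is commonly written as a Fermi-Dirac occupancy. The form below is the standard textbook expression and is not necessarily the exact fitness function fitted in the paper; the symbols $E$ (binding energy of a site), $\mu$ (chemical potential set by the TF concentration) and $\beta$ (inverse temperature) are notation assumed here.

```latex
p_{\text{bound}}(E) = \frac{1}{1 + e^{\beta (E - \mu)}}
```

Sites with $E \ll \mu$ are almost always occupied and sites with $E \gg \mu$ are almost always free, which gives a sigmoidal dependence of occupancy on binding energy.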
Affiliation(s)
- Allan Haldane
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- Michael Manhart
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- Alexandre V. Morozov
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, New Jersey, United States of America
49
Abstract
As functional units, network motifs have been detected to reveal evolutionary mechanisms of complex systems, such as biological networks, food webs, engineering networks and social networks. However, detecting the emergence of motifs in growing networks may be problematic because subgraph frequencies fluctuate strongly in the initial stage. This paper presents a method that can identify the emergence of motifs in growing networks. Based on the Erdős-Rényi (E-R) random null model, the variation rate of the expected subgraph frequency at adjacent time points was used to define a suitable detection range for motif identification. Upper and lower boundaries of the range were obtained in analytical form according to a chosen risk level. The Z-score statistic was then extended to a new metric that effectively reveals the statistical significance of a subgraph over a continuous period of time. The paper thus proposes a novel research framework for motif identification, defining critical boundaries for the evolutionary process of networks and a significance metric on the time scale. Finally, an industrial ecosystem at Kalundborg was adopted as a case study to illustrate the effectiveness and convenience of the proposed methodology.
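The ordinary Z-score that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration, assuming a triangle as the subgraph of interest and a Monte Carlo estimate of the Erdős-Rényi null distribution; it does not implement the paper's detection range or its time-extended metric, and the function names (`count_triangles`, `er_sample`, `triangle_z_score`) are hypothetical.

```python
import itertools
import random
import statistics


def count_triangles(n, edges):
    """Count triangles in a simple undirected graph on nodes 0..n-1.

    Assumes `edges` has no duplicates and no self-loops.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total = 0
    for u, v in edges:
        # Every common neighbour of u and v closes a triangle on edge (u, v).
        total += len(adj[u] & adj[v])
    return total // 3  # each triangle is seen once per edge


def er_sample(n, p, rng):
    """Draw one Erdos-Renyi G(n, p) graph as an edge list."""
    return [pair for pair in itertools.combinations(range(n), 2)
            if rng.random() < p]


def triangle_z_score(n, edges, samples=200, seed=0):
    """Z-score of the observed triangle count against a G(n, p) null.

    p is set to the observed edge density (an assumption of this sketch).
    Returns NaN if the null distribution has zero spread.
    """
    rng = random.Random(seed)
    p = len(edges) / (n * (n - 1) / 2)
    observed = count_triangles(n, edges)
    null = [count_triangles(n, er_sample(n, p, rng)) for _ in range(samples)]
    mu = statistics.mean(null)
    sigma = statistics.pstdev(null)
    if sigma == 0:
        return float("nan")
    return (observed - mu) / sigma
```

A graph built from several disjoint triangles, for example, contains more triangles than a random graph of the same density would be expected to, so `triangle_z_score` returns a positive value for it. In a growing network the same computation would be repeated at each time point, which is where the early-stage fluctuations described in the abstract become an issue.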
Affiliation(s)
- Haijia Shi
- State Key Joint-Laboratory of Environmental Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing, China
- Lei Shi
- State Key Joint-Laboratory of Environmental Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing, China
50
Yang L, Zhao X, Tang X. Predicting disease-related proteins based on clique backbone in protein-protein interaction network. Int J Biol Sci 2014; 10:677-88. [PMID: 25013377 PMCID: PMC4081603 DOI: 10.7150/ijbs.8430] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 12/25/2013] [Accepted: 05/21/2014] [Indexed: 12/19/2022]
Abstract
Network biology integrates different kinds of data, including physical or functional networks and disease gene sets, to interpret human disease. A clique (maximal complete subgraph) in a protein-protein interaction network is a topological module and carries inherent biological significance. A disease-related clique may be associated with complex diseases. Fully identifying the disease components in a clique is conducive to uncovering disease mechanisms. This paper proposes an approach for predicting disease proteins based on cliques in a protein-protein interaction network. To tolerate false positive and false negative interactions in protein networks, clique extension and scoring of predicted disease proteins with Gene Ontology terms are added to the clique-based method. The precision of the predicted disease proteins is verified against disease phenotypes and remains consistently above 95%. The predicted disease proteins associated with cliques can partly complement the mapping between genotype and phenotype and provide clues for understanding the pathogenesis of serious diseases.
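The clique-based idea in this abstract can be illustrated with a small sketch. It assumes the Bron-Kerbosch algorithm with pivoting for enumerating maximal cliques (the abstract does not say which enumeration method the authors used) and a deliberately simple score, the fraction of clique members already known to be disease proteins, in place of the paper's clique extension and Gene Ontology scoring. The function names and the toy protein labels are hypothetical.

```python
def maximal_cliques(adj):
    """Yield maximal cliques of an undirected graph.

    `adj` maps each node to the set of its neighbours (no self-loops).
    Uses Bron-Kerbosch with pivoting.
    """
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the node with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


def candidate_scores(adj, known_disease):
    """Score non-disease proteins that share a clique with disease proteins.

    The score is the largest fraction of known disease proteins in any
    maximal clique containing the candidate (an assumption of this sketch).
    """
    scores = {}
    for clique in maximal_cliques(adj):
        hits = clique & known_disease
        if not hits:
            continue
        fraction = len(hits) / len(clique)
        for protein in clique - known_disease:
            scores[protein] = max(scores.get(protein, 0.0), fraction)
    return scores


# Toy interaction network with hypothetical protein labels.
toy_network = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2", "P4"},
    "P4": {"P3"},
}
toy_scores = candidate_scores(toy_network, {"P1", "P2"})
```

In the toy network, P3 shares a three-member clique with the two known disease proteins and is therefore the only candidate; P4 sits in a clique with no known disease protein and receives no score.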
Affiliation(s)
- Lei Yang
- 1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China; 2. Information and Network Management Centre, Heilongjiang University, Harbin, China
- Xudong Zhao
- 1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Xianglong Tang
- 1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China