1
Hansen L, Mariño-Ramírez L, Landsman D. Differences in local genomic context of bound and unbound motifs. Gene 2012; 506:125-34. [PMID: 22692006 PMCID: PMC3412921 DOI: 10.1016/j.gene.2012.06.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/27/2012] [Accepted: 06/04/2012] [Indexed: 11/25/2022]
Abstract
Understanding gene regulation is a major objective in molecular biology research. Frequently, transcription is driven by transcription factors (TFs) that bind to specific DNA sequences. These motifs are usually short and degenerate, so multiple copies are likely to occur throughout the genome by chance alone. Despite this, TFs bind only a small subset of sites, which prompted our investigation into the differences between motifs that are bound by TFs and those that remain unbound. Here we constructed vectors representing various chromatin- and sequence-based features for a published set of bound and unbound motifs representing nine TFs in the budding yeast Saccharomyces cerevisiae. Using a machine learning approach, we identified a set of features that can be used to discriminate between bound and unbound motifs. We also discovered that some TFs bind most or all of their strong motifs in intergenic regions. Our data demonstrate that local sequence context can be strikingly different around motifs that are bound compared to motifs that are unbound. We concluded that there are multiple combinations of genomic features that characterize bound or unbound motifs.
Affiliation(s)
- Loren Hansen
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8900 Rockville Pike, Bethesda, MD 20894
- Bioinformatics Program, Boston University, Boston, MA 02215, USA
- Leonardo Mariño-Ramírez
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8900 Rockville Pike, Bethesda, MD 20894
- PanAmerican Bioinformatics Institute, Santa Marta, Magdalena, Colombia
- David Landsman
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8900 Rockville Pike, Bethesda, MD 20894
2
Cheng Y, Zhang F, Chen Q, Gao J, Cui W, Ji M, Tung CH. Structural basis of specific binding between Aurora A and TPX2 by molecular dynamics simulations. J Chem Inf Model 2011; 51:2626-35. [PMID: 21919471 DOI: 10.1021/ci2002439] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/25/2023]
Abstract
In the present study, the impacts of G198N and W128F mutations on the recognition between Aurora A and targeting protein of Xenopus kinesin-like protein 2 (TPX2) were investigated using molecular dynamics (MD) simulations, free energy calculations, and free energy decomposition analysis. The predicted binding free energy of the wild-type complex is more favorable than those of the three mutants, indicating that both single and double mutations are unfavorable for Aurora A and TPX2 binding. It is also observed that the mutations alter the binding pattern between Aurora A and TPX2, especially in the downstream part of TPX2. An intramolecular hydrogen bond between the atom OD of Asp11(TPX2) and the atom HE1 of Trp34(TPX2) disappears in the three mutants and thus leads to instability of the secondary structure of TPX2. The combination of different molecular modeling techniques is an efficient way to understand how mutations affect protein-protein binding, and our work gives valuable information for the future design of specific peptide inhibitors for Aurora A.
Affiliation(s)
- Yuanhua Cheng
- Key Laboratory of Organic Optoelectronics and Molecular Engineering of Ministry of Education, Department of Chemistry, Tsinghua University, Beijing 100084, PR China
3
Hempel S, Koseska A, Nikoloski Z, Kurths J. Unraveling gene regulatory networks from time-resolved gene expression data - a measures comparison study. BMC Bioinformatics 2011; 12:292. [PMID: 21771321 PMCID: PMC3161045 DOI: 10.1186/1471-2105-12-292] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Received: 03/11/2011] [Accepted: 07/19/2011] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND Inferring regulatory interactions between genes from transcriptomics time-resolved data, yielding reverse engineered gene regulatory networks, is of paramount importance to systems biology and bioinformatics studies. Accurate methods to address this problem can ultimately provide a deeper insight into the complexity, behavior, and functions of the underlying biological systems. However, the large number of interacting genes coupled with short and often noisy time-resolved read-outs of the system renders the reverse engineering a challenging task. Therefore, the development and assessment of methods which are computationally efficient, robust against noise, applicable to short time series data, and preferably capable of reconstructing the directionality of the regulatory interactions remains a pressing research problem with valuable applications. RESULTS Here we perform the largest systematic analysis of a set of similarity measures and scoring schemes within the scope of the relevance network approach which are commonly used for gene regulatory network reconstruction from time series data. In addition, we define and analyze several novel measures and schemes which are particularly suitable for short transcriptomics time series. We also compare the considered 21 measures and 6 scoring schemes according to their ability to correctly reconstruct such networks from short time series data by calculating summary statistics based on the corresponding specificity and sensitivity. Our results demonstrate that rank and symbol based measures have the highest performance in inferring regulatory interactions. In addition, the proposed scoring scheme by asymmetric weighting has been shown to be valuable in reducing the number of false positive interactions. On the other hand, Granger causality as well as information-theoretic measures, frequently used in inference of regulatory networks, show low performance on the short time series analyzed in this study.
CONCLUSIONS Our study is intended to serve as a guide for choosing a particular combination of similarity measures and scoring schemes suitable for reconstruction of gene regulatory networks from short time series data. We show that further improvement of algorithms for reverse engineering can be obtained if one considers measures that are rooted in the study of symbolic dynamics or ranks, in contrast to the application of common similarity measures which do not consider the temporal character of the employed data. Moreover, we establish that the asymmetric weighting scoring scheme together with symbol based measures (for low noise level) and rank based measures (for high noise level) are the most suitable choices.
Affiliation(s)
- Sabrina Hempel
- Interdisciplinary Center for Dynamics of Complex Systems, University of Potsdam, Campus Golm, Karl-Liebknecht-Str. 24, D-14476 Potsdam, Germany
- Potsdam Institute for Climate Impact Research (PIK), Telegraphenberg A 31, D-14473 Potsdam, Germany
- Department of Physics, Humboldt University of Berlin, Campus Adlershof, Newtonstr. 15, D-12489 Berlin, Germany
- Aneta Koseska
- Interdisciplinary Center for Dynamics of Complex Systems, University of Potsdam, Campus Golm, Karl-Liebknecht-Str. 24, D-14476 Potsdam, Germany
- Zoran Nikoloski
- Systems Biology and Mathematical Modeling Group, Max Planck Institute for Molecular Plant Physiology, Am Mühlenberg 1, D-14476 Potsdam, Germany
- Institute of Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Str. 25, D-14476 Potsdam, Germany
- Jürgen Kurths
- Potsdam Institute for Climate Impact Research (PIK), Telegraphenberg A 31, D-14473 Potsdam, Germany
- Department of Physics, Humboldt University of Berlin, Campus Adlershof, Newtonstr. 15, D-12489 Berlin, Germany
- Institute for Complex Systems and Mathematical Biology, University of Aberdeen, Aberdeen AB243UE, UK
4
Cheng Y, Cui W, Chen Q, Tung CH, Ji M, Zhang F. The molecular mechanism studies of chirality effect of PHA-739358 on Aurora kinase A by molecular dynamics simulation and free energy calculations. J Comput Aided Mol Des 2011; 25:171-80. [PMID: 21222017 DOI: 10.1007/s10822-010-9408-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 09/28/2010] [Accepted: 12/20/2010] [Indexed: 12/25/2022]
Abstract
The Aurora kinase family is one of the emerging targets in oncology drug discovery, and several small molecules targeting aurora kinases have been discovered and evaluated in early phase I/II trials. Among them, PHA-739358 (compound 1r) is a 3-aminopyrazole derivative with strong activity against Aurora A that is in an early phase II trial. The inhibitory potency of compound 1r (the benzylic substituent at the pro-R position) is 30 times that of compound 1s (the benzylic substituent at the pro-S position). In the present study, the mechanism by which different configurations influence the binding affinity was investigated using molecular dynamics (MD) simulations, free energy calculations and free energy decomposition analysis. The predicted binding free energies of these two complexes are consistent with the experimental data. The analysis of the individual energy terms indicates that although the van der Waals contribution is important for distinguishing the binding affinities of these two inhibitors, the electrostatic contribution plays a more crucial role. Moreover, it is observed that different configurations of the benzylic substituent could form different binding patterns with the protein, thus leading to the different inhibitory potencies of compounds 1r and 1s. The combination of different molecular modeling techniques is an efficient way to interpret the chirality effects of inhibitors, and our work gives valuable information for chiral drug design in the near future.
Affiliation(s)
- Yuanhua Cheng
- Key Laboratory of Organic Optoelectronics and Molecular Engineering of Ministry of Education, Department of Chemistry, Tsinghua University, 100084 Beijing, People's Republic of China
5
Nikoloski Z, May P, Selbig J. Algebraic connectivity may explain the evolution of gene regulatory networks. J Theor Biol 2010; 267:7-14. [PMID: 20682325 DOI: 10.1016/j.jtbi.2010.07.028] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/18/2009] [Revised: 07/21/2010] [Accepted: 07/21/2010] [Indexed: 11/26/2022]
Abstract
Gene expression is a result of the interplay between the structure, type, kinetics, and specificity of gene regulatory interactions, whose diversity gives rise to the variety of life forms. As the dynamic behavior of gene regulatory networks depends on their structure, here we attempt to determine structural reasons which, despite the similarities in global network properties, may explain the large differences in organismal complexity. We demonstrate that the algebraic connectivity, the smallest non-trivial eigenvalue of the Laplacian, of the directed gene regulatory networks decreases with the increase of organismal complexity, and may therefore explain the difference between the variety of analyzed regulatory networks. In addition, our results point out that, for the species considered in this study, evolution favours decreasing concentration of strategically positioned feed forward loops, so that the network as a whole can increase the specificity towards changing environments. Moreover, contrary to the existing results, we show that the average degree, the length of the longest cascade, and the average cascade length of gene regulatory networks cannot recover the evolutionary relationships between organisms. Whereas the dynamical properties of special subnetworks are relatively well understood, there is still limited knowledge about the evolutionary reasons for the already identified design principles pertaining to these special subnetworks, underlying the global quantitative features of gene regulatory networks of different organisms. The behavior of the algebraic connectivity, which we show valid on gene regulatory networks extracted from curated databases, can serve as an additional evolutionary principle of organism-specific regulatory networks.
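As a reading aid for the term "algebraic connectivity" used in this abstract, the standard definition can be written out as follows. This is a sketch under an assumption: it uses the usual undirected graph Laplacian, whereas the abstract concerns directed networks and does not state which directed Laplacian the authors use. The symbols A, D, L and \lambda_i are introduced here for illustration only.

```latex
% Assumption: undirected graph G on n nodes with adjacency matrix A
% and degree matrix D = diag(d_1, ..., d_n).
\[
  L = D - A, \qquad
  0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n ,
\]
% The algebraic connectivity is the smallest non-trivial eigenvalue:
\[
  a(G) = \lambda_2 , \qquad \lambda_2 > 0 \iff G \text{ is connected.}
\]
% Worked example: the path graph on three nodes.
\[
  L = \begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix},
  \qquad \{\lambda_i\} = \{0, 1, 3\}, \qquad a(G) = 1 .
\]
```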
Affiliation(s)
- Zoran Nikoloski
- Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Brandenburg, Germany.
6
Zhang M, Lu LJ. Investigating the validity of current network analysis on static conglomerate networks by protein network stratification. BMC Bioinformatics 2010; 11:466. [PMID: 20846443 PMCID: PMC2949894 DOI: 10.1186/1471-2105-11-466] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 02/02/2010] [Accepted: 09/16/2010] [Indexed: 01/25/2023] Open
Abstract
Background A molecular network perspective forms the foundation of systems biology. A common practice in analyzing protein-protein interaction (PPI) networks is to perform network analysis on a conglomerate network that is an assembly of all available binary interactions in a given organism from diverse data sources. Recent studies on network dynamics suggested that this approach might have ignored the dynamic nature of context-dependent molecular systems. Results In this study, we employed a network stratification strategy to investigate the validity of the current network analysis on conglomerate PPI networks. Using the genome-scale tissue- and condition-specific proteomics data in Arabidopsis thaliana, we present here the first systematic investigation into this question. We stratified a conglomerate A. thaliana PPI network into three levels of context-dependent subnetworks. We then focused on three types of most commonly conducted network analyses, i.e., topological, functional and modular analyses, and compared the results from these network analyses on the conglomerate network and five stratified context-dependent subnetworks corresponding to specific tissues. Conclusions We found that the results based on the conglomerate PPI network are often significantly different from those of context-dependent subnetworks corresponding to specific tissues or conditions. This conclusion depends neither on relatively arbitrary cutoffs (such as those defining network hubs or bottlenecks), nor on specific network clustering algorithms for module extraction, nor on the possible high false positive rates of binary interactions in PPI networks. We also found that our conclusions are likely to be valid in human PPI networks. Furthermore, network stratification may help resolve many controversies in current research of systems biology.
Affiliation(s)
- Minlu Zhang
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH 45229, USA
7
8
Zhang Z, Zhang J. Accuracy and application of the motif expression decomposition method in dissecting transcriptional regulation. Nucleic Acids Res 2008; 36:3185-93. [PMID: 18411204 PMCID: PMC2425491 DOI: 10.1093/nar/gkn127] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022] Open
Abstract
Understanding transcriptional regulation is a major goal of molecular biology. Motif expression decomposition (MED) was recently introduced to describe the expression level of a gene as the sum of the products of the binding strengths of its cis-regulatory motifs and the activities of the corresponding trans-acting transcription factors (TFs). Here, we use computer simulation to examine the accuracy of MED. We found that although MED accurately rebuilds gene expression levels from decomposed motif binding strengths and TF activities, estimates of motif binding strengths and TF activities are unreliable. Nonetheless, MED provides accurate estimates of relative binding strengths of the same motif in different genes and relative activities of the same TF under different conditions. We found that reasonably accurate results are achievable with genome-wide expression data from only 30 conditions and that MED results are robust to the existence of unknown occurrences of known motifs, although they are less robust to the presence of unknown motifs. With these understandings, judicious use of MED will likely provide useful information about eukaryotic transcriptional regulation. As an example, MED results are used to demonstrate that motifs generally have higher binding strengths when appearing in multiple copies than appearing in one copy per promoter.
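The model sentence in this abstract (expression as the sum of products of motif binding strengths and TF activities) can be illustrated with a minimal sketch. It shows only the forward reconstruction of expression from given strengths and activities, not the decomposition procedure itself, and it is not the authors' code. The function name `med_expression`, the variable names, and the toy numbers are all hypothetical.

```python
def med_expression(binding, activity):
    """Rebuild expression levels from the MED-style forward model.

    binding[g][m]  : assumed binding strength of motif m in gene g
    activity[m][c] : assumed activity of the TF for motif m in condition c
    Returns expression[g][c] = sum over m of binding[g][m] * activity[m][c].
    """
    n_motifs = len(activity)
    n_conditions = len(activity[0])
    return [
        [
            sum(gene_row[m] * activity[m][c] for m in range(n_motifs))
            for c in range(n_conditions)
        ]
        for gene_row in binding
    ]


# Toy data (hypothetical): 2 genes, 2 motifs, 2 conditions.
binding = [[1.0, 0.0],
           [0.5, 2.0]]
activity = [[2.0, 1.0],
            [0.0, 3.0]]

expression = med_expression(binding, activity)
print(expression)
```

In matrix terms this is the product of a genes-by-motifs matrix and a motifs-by-conditions matrix. Rescaling a motif's binding column by a constant and its activity row by the inverse leaves the product unchanged, which may be one way to read the abstract's point that relative values are estimated more reliably than absolute ones.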
Affiliation(s)
- Zhihua Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor MI 48109, USA
9
Zhang S, Jin G, Zhang XS, Chen L. Discovering functions and revealing mechanisms at molecular level from biological networks. Proteomics 2007; 7:2856-69. [PMID: 17703505 DOI: 10.1002/pmic.200700095] [Citation(s) in RCA: 96] [Impact Index Per Article: 5.6] [Indexed: 01/28/2023]
Abstract
With data from high-throughput technologies accumulating rapidly, the study of biomolecular networks has become one of the key focuses in systems biology and bioinformatics. In particular, various types of molecular networks (e.g., protein-protein interaction (PPI) network; gene regulatory network (GRN); metabolic network (MN); gene coexpression network (GCEN)) have been extensively investigated, and those studies demonstrate great potential to discover basic functions and to reveal essential mechanisms for various biological phenomena, by understanding biological systems not at the individual component level but at a system-wide level. Recent studies on networks have produced prolific research on many aspects of living organisms. In this paper, we aim to review the recent developments on topics related to molecular networks in a comprehensive manner, with special emphasis on computational aspects. The contents of the survey cover global topological properties and local structural characteristics, network motifs, network comparison and query, detection of functional modules and network motifs, function prediction from network analysis, inferring molecular networks from biological data as well as representative databases and software tools.
Affiliation(s)
- Shihua Zhang
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
10
Yarragudi A, Parfrey LW, Morse RH. Genome-wide analysis of transcriptional dependence and probable target sites for Abf1 and Rap1 in Saccharomyces cerevisiae. Nucleic Acids Res 2006; 35:193-202. [PMID: 17158163 PMCID: PMC1802568 DOI: 10.1093/nar/gkl1059] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.2] [Indexed: 01/24/2023] Open
Abstract
Abf1 and Rap1 are general regulatory factors (GRFs) that contribute to transcriptional activation of a large number of genes, as well as to replication, silencing and telomere structure in yeast. In spite of their widespread roles in transcription, the scope of their functional targets genome-wide has not been previously determined. Here, we use microarrays to examine the contribution of these essential GRFs to transcription genome-wide, by using ts mutants that dissociate from their binding sites at 37°C. We then combine this data with published ChIP-chip studies and motif analysis to identify probable direct targets for Abf1 and Rap1. We also identify a substantial number of genes likely to bind Rap1 or Abf1, but not affected by loss of GRF binding. Interestingly, the results strongly suggest that Rap1 can contribute to gene activation from farther upstream than can Abf1. Also, consistent with previous work, more genes that bind Abf1 are unaffected by loss of binding than those that bind Rap1. Finally, we show for several such genes that the Abf1 C-terminal region, which contains the putative activation domain, is not needed to confer this peculiar ‘memory effect’ that allows continued transcription after loss of Abf1 binding.
Affiliation(s)
- Arunadevi Yarragudi
- Laboratory of Developmental Genetics, Wadsworth Center, New York State Department of Health, Albany, NY 12201-2002, USA
- Laura Wegener Parfrey
- Laboratory of Developmental Genetics, Wadsworth Center, New York State Department of Health, Albany, NY 12201-2002, USA
- Randall H. Morse
- Laboratory of Developmental Genetics, Wadsworth Center, New York State Department of Health, Albany, NY 12201-2002, USA
- Department of Biomedical Sciences, State University of New York at Albany School of Public Health, Albany, NY 12201-2002, USA
- To whom correspondence should be addressed. Tel: +1 518 486 3116; Fax: +1 518 474 3181;