1
Fukunaga T, Iwasaki W. Logicome Profiler: Exhaustive detection of statistically significant logic relationships from comparative omics data. PLoS One 2020; 15:e0232106. [PMID: 32357172 PMCID: PMC7194410 DOI: 10.1371/journal.pone.0232106] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/07/2019] [Accepted: 04/07/2020] [Indexed: 02/01/2023] Open
Abstract
Logic relationship analysis is a data mining method that comprehensively detects item triplets that satisfy logic relationships from a binary matrix dataset, such as an ortholog table in comparative genomics. Thanks to recent technological advancements, many binary matrix datasets are now being produced in genomics, transcriptomics, epigenomics, metagenomics, and many other fields for comparative purposes. However, despite the presumed interpretability and importance of logic relationships, existing data mining methods are not based on the framework of statistical hypothesis testing. That is, the type-1 and type-2 error rates are neither controlled nor estimated. Here, we developed Logicome Profiler, which exhaustively detects statistically significant triplet logic relationships from a binary matrix dataset (Logicome means ome of logics). To test all item triplets in a dataset while avoiding false positives, Logicome Profiler adjusts the significance level using the Bonferroni or Benjamini-Yekutieli method for multiple testing correction. Its application to an ocean metagenomic dataset showed that Logicome Profiler can effectively detect statistically significant triplet logic relationships among environmental microbes and genes, including those among urea transporter, urease, and photosynthesis-related genes. Beyond omics data analysis, Logicome Profiler is applicable to binary matrix datasets in general for finding significant triplet logic relationships. The source code is available at https://github.com/fukunagatsu/LogicomeProfiler.
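The two multiple testing corrections named in the abstract can be illustrated with a minimal sketch. This is not the Logicome Profiler code: the function names and the example p-values are hypothetical, and the sketch assumes the p-values for the tested triplets have already been computed by some test that is not shown here.

```python
from typing import List


def bonferroni_reject(pvalues: List[float], alpha: float = 0.05) -> List[bool]:
    """Reject H0 for every p-value at or below alpha / m (family-wise error control)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_yekutieli_reject(pvalues: List[float], alpha: float = 0.05) -> List[bool]:
    """Benjamini-Yekutieli step-up procedure (FDR control under arbitrary dependence)."""
    m = len(pvalues)
    # Harmonic-sum correction factor c(m) = 1 + 1/2 + ... + 1/m.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Largest rank k whose sorted p-value is at or below k * alpha / (m * c(m)).
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / (m * c_m):
            k_max = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject


# Hypothetical p-values for three tested triplets.
example_pvalues = [0.001, 0.017, 0.6]
bonferroni_result = bonferroni_reject(example_pvalues)
by_result = benjamini_yekutieli_reject(example_pvalues)
```

On these made-up values the Bonferroni threshold (0.05 / 3) excludes the second p-value, while the Benjamini-Yekutieli step-up rule accepts it, which shows the usual trade-off between family-wise error control and false discovery rate control.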
Affiliation(s)
- Tsukasa Fukunaga
  - Department of Computer Science, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
- Wataru Iwasaki
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
  - Department of Computational Biology and Medical Science, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
  - Atmosphere and Ocean Research Institute, The University of Tokyo, Chiba, Japan
  - Institute for Quantitative Biosciences, The University of Tokyo, Tokyo, Japan
  - Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Chiba, Japan
2
Zhao Q, Zhang Y. Ensemble Method of Feature Selection and Reverse Construction of Gene Logical Network Based on Information Entropy. INT J PATTERN RECOGN 2019. [DOI: 10.1142/s0218001420590041] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/18/2022]
Abstract
In this paper, we propose a novel ensemble gene selection method to obtain a gene subset. We then provide a reverse construction method for a gene network derived from the expression profile data of the gene subset. The uncertainty coefficient based on information entropy is used to define the existence of logical relations among these genes. If the uncertainty coefficient between genes exceeds predefined thresholds, the gene nodes are connected by directed edges. The network generated in this way is what we define as a gene logical network. The method is applied to breast cancer data comprising a control group and an experimental group, and the networks are compared by their 2nd-order logic type distribution, average degree, and average path length. The structures of the different networks are found to be quite distinct. Key genes are selected by comparing the degree difference between the control group and the experimental group. By defining dynamic evolution rules of state transition based on the logical regulation among the key genes in the network, the dynamic behaviors of normal breast cells and of cancer cells at different stages are simulated numerically. A literature search indicates that some of the key genes are highly related to the development of breast cancer. The study may provide useful insight into the biological mechanisms underlying the formation and development of cancer.
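As a rough illustration of the entropy-based quantity the abstract relies on, the sketch below computes the standard uncertainty coefficient U(X|Y) = (H(X) + H(Y) - H(X,Y)) / H(X) for two discretized expression profiles. The function names, the example profiles, and the edge threshold of 0.5 are assumptions made for illustration; the paper's actual thresholds and discretization are not given in the abstract.

```python
from collections import Counter
from math import log2
from typing import Hashable, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def uncertainty_coefficient(x: Sequence[int], y: Sequence[int]) -> float:
    """U(X|Y): fraction of the uncertainty in X that is removed by knowing Y."""
    h_x = entropy(x)
    if h_x == 0:
        return 0.0  # X is constant, so there is nothing to explain
    mutual_information = h_x + entropy(y) - entropy(list(zip(x, y)))
    return mutual_information / h_x


# Hypothetical discretized profiles (1 = upregulated, 0 = not) over four samples.
gene_a = [0, 0, 1, 1]
gene_b = [0, 0, 1, 1]   # identical to gene_a
gene_c = [0, 1, 0, 1]   # statistically independent of gene_a

EDGE_THRESHOLD = 0.5    # assumed value, for illustration only
u_ab = uncertainty_coefficient(gene_a, gene_b)
u_ac = uncertainty_coefficient(gene_a, gene_c)
has_edge_b_to_a = u_ab > EDGE_THRESHOLD
```

The coefficient is asymmetric in general, which is one plausible reason the resulting edges are described as directed.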
Affiliation(s)
- Qingfeng Zhao
  - College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266590, P. R. China
  - Shandong Province Key Laboratory of Wisdom Mine Information Technology, Shandong University of Science and Technology, Qingdao 266590, P. R. China
- Yulin Zhang
  - College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao, Shandong 266590, P. R. China
3
Zhang W, Wang SL. An Integrated Framework for Identifying Mutated Driver Pathway and Cancer Progression. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:455-464. [PMID: 29990286 DOI: 10.1109/tcbb.2017.2788016] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 06/08/2023]
Abstract
Next-generation sequencing (NGS) technologies provide large amounts of somatic mutation data from a large number of patients. Identifying mutated driver pathways and cancer progression from these data is a challenging task because of interpatient heterogeneity. In addition, modeling cancer progression at the pathway level has been shown to be more reasonable than at the gene level. In this paper, we introduce an integrated framework to identify mutated driver pathways and cancer progression (iMDPCP) at the pathway level from somatic mutation data. First, we use the uncertainty coefficient to quantify mutual exclusivity in driver pathways and develop a computational framework to identify mutated driver pathways based on the adaptive discrete differential evolution algorithm. Then, we construct a cancer progression model for driver pathways based on a Bayesian network. Finally, we evaluate the performance of iMDPCP on real cancer somatic mutation datasets. The experimental results indicate that iMDPCP is more accurate than state-of-the-art methods according to the enrichment of KEGG pathways, and it also provides new insights into identifying cancer progression at the pathway level.
4
Novel Model for Cascading Failure Based on Degree Strength and Its Application in Directed Gene Logic Networks. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2018; 2018:8950794. [PMID: 29670664 PMCID: PMC5836303 DOI: 10.1155/2018/8950794] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/12/2017] [Revised: 01/15/2018] [Accepted: 01/18/2018] [Indexed: 11/23/2022]
Abstract
A novel model for cascading failures in a directed logic network based on the degree strength at a node was proposed. The definitions of in-degree and out-degree strength of a node were initially reconsidered, and the load at a nonisolated node was proposed as the ratio of in-degree strength to out-degree strength of the node. The cascading failure model based on degree strength was applied to the logic networks for three types of cancer (adenocarcinoma of the lung, prostate cancer, and colon cancer) built from their gene expression profiles. To highlight the differences between the three networks under the cascading failure mechanism, we used the largest-scale cascades and the cumulative cascade probability to depict the damage. It was found that the cascading failures caused by hubs are usually larger. Furthermore, the results show that propagation through the networks was correlated with the structural motifs of connected logical doublets. Finally, some genes were selected based on the cascading failure mechanism. We believe that these genes may be involved in the occurrence and development of the three types of cancer.
5
Li F, Gao L, Ma X, Yang X. Detection of driver pathways using mutated gene network in cancer. MOLECULAR BIOSYSTEMS 2016; 12:2135-41. [PMID: 27118146 DOI: 10.1039/c6mb00084c] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 02/02/2023]
Abstract
A mutated gene network is constructed based on a new mutual exclusivity index and coverage for detecting driver pathways.
Affiliation(s)
- Feng Li
  - School of Computer Science and Technology, Xidian University, Xi'an, P. R. China
- Lin Gao
  - School of Computer Science and Technology, Xidian University, Xi'an, P. R. China
- Xiaoke Ma
  - School of Computer Science and Technology, Xidian University, Xi'an, P. R. China
- Xiaofei Yang
  - School of Computer Science and Technology, Xidian University, Xi'an, P. R. China
6
Modeling Gene Networks in Saccharomyces cerevisiae Based on Gene Expression Profiles. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2015; 2015:621264. [PMID: 26839582 PMCID: PMC4709922 DOI: 10.1155/2015/621264] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 07/14/2015] [Revised: 10/14/2015] [Accepted: 11/16/2015] [Indexed: 11/30/2022]
Abstract
Detailed and innovative analysis of gene regulatory network structures may reveal novel insights into biological mechanisms. Here we study how the gene regulatory network in Saccharomyces cerevisiae can differ under aerobic and anaerobic conditions. To achieve this, we discretized the gene expression profiles and calculated the self-entropy of down- and upregulation of gene expression as well as the joint entropy. Based on these quantities, the uncertainty coefficient was calculated for each gene triplet, after which separate gene logic networks were constructed for the aerobic and anaerobic conditions. Four structural parameters (average degree, average clustering coefficient, average shortest path, and average betweenness) were used to compare the structure of the corresponding aerobic and anaerobic logic networks. Five genes were identified as putative key components of the two energy metabolisms. Furthermore, community analysis using the Newman fast algorithm revealed two significant communities for the aerobic network but only one for the anaerobic network. DAVID Gene Functional Classification suggests that, under aerobic conditions, one such community reflects the cell cycle and cell replication, while the other is linked to mitochondrial respiratory chain function.
7
Wang H, Huang H, Ding C, Nie F. Predicting Protein–Protein Interactions from Multimodal Biological Data Sources via Nonnegative Matrix Tri-Factorization. J Comput Biol 2013; 20:344-58. [DOI: 10.1089/cmb.2012.0273] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Indexed: 12/23/2022] Open
Affiliation(s)
- Hua Wang
  - Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas
- Heng Huang
  - Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas
- Chris Ding
  - Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas
- Feiping Nie
  - Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas
8
Cui J, DeLuca TF, Jung JY, Wall DP. Phylogenetically informed logic relationships improve detection of biological network organization. BMC Bioinformatics 2011; 12:476. [PMID: 22172058 PMCID: PMC3402364 DOI: 10.1186/1471-2105-12-476] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 06/06/2011] [Accepted: 12/15/2011] [Indexed: 12/04/2022] Open
Abstract
Background: A "phylogenetic profile" refers to the presence or absence of a gene across a set of organisms, and it has proven valuable for understanding gene functional relationships and network organization. Despite this success, few studies have attempted to search beyond pairwise relationships among genes. Here we search for logic relationships involving three genes and explore their potential application in gene network analyses.
Results: Taking advantage of a phylogenetic matrix constructed from the large ortholog database Roundup, we invented a method to create balanced profiles for individual triplets of genes that guarantee equal weight on the different phylogenetic scenarios of coevolution between genes. When we applied this idea to LAPP, a method for searching for logic triplets of genes, the balanced profiles resulted in significant performance improvement and the discovery of hundreds of thousands more putative triplets than unadjusted profiles. We found that logic triplets detected biological network organization and identified key proteins and their functions, ranging from neighbouring proteins in local pathways, to well separated proteins in the whole pathway, to the interactions among different pathways at the system level. Finally, our case study suggested that the directionality in a logic relationship and the profile of a triplet could disclose the connectivity between the triplet and surrounding networks.
Conclusion: Balanced profiles are superior to the raw profiles employed by traditional methods of phylogenetic profiling in searching for high order gene sets. Gene triplets can provide valuable information for detecting biological network organization and identifying key genes at different levels of cellular interaction.
Affiliation(s)
- Jike Cui
- Center for Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
9
Sprinzak E, Cokus SJ, Yeates TO, Eisenberg D, Pellegrini M. Detecting coordinated regulation of multi-protein complexes using logic analysis of gene expression. BMC SYSTEMS BIOLOGY 2009; 3:115. [PMID: 20003439 PMCID: PMC2804736 DOI: 10.1186/1752-0509-3-115] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 06/15/2009] [Accepted: 12/14/2009] [Indexed: 11/25/2022]
Abstract
Background: Many of the functional units in cells are multi-protein complexes such as RNA polymerase, the ribosome, and the proteasome. For such units to work together, one might expect a high level of regulation to enable co-appearance or repression of sets of complexes at the required time. However, this type of coordinated regulation between whole complexes is difficult to detect by existing methods for analyzing mRNA co-expression. We propose a new methodology that is able to detect such higher order relationships.
Results: We detect coordinated regulation of multiple protein complexes using logic analysis of gene expression data. Specifically, we identify gene triplets composed of genes whose expression profiles are found to be related by various types of logic functions. In order to focus on complexes, we associate the members of a gene triplet with the distinct protein complexes to which they belong. In this way, we identify complexes related by specific kinds of regulatory relationships. For example, we may find that the transcription of complex C is increased only if the transcription of both complex A AND complex B is repressed. We identify hundreds of examples of coordinated regulation among complexes under various stress conditions. Many of these examples involve the ribosome. Some of our examples have been previously identified in the literature, while others are novel. One notable example is the relationship between the transcription of the ribosome, RNA polymerase and mannosyltransferase II, which is involved in N-linked glycan processing in the Golgi.
Conclusions: The analysis proposed here focuses on relationships among triplets of genes that are not evident when genes are examined in a pairwise fashion as in typical clustering methods. By grouping gene triplets, we are able to decipher coordinated regulation among sets of three complexes. Moreover, using all triplets that involve coordinated regulation with the ribosome, we derive a large network involving this essential cellular complex. In this network we find that all multi-protein complexes that belong to the same functional class are regulated in the same direction as a group (either induced or repressed).
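The idea of relating a gene triplet by a logic function can be sketched as follows. This is a toy illustration under stated assumptions, not the scoring used in the paper: the simple agreement fraction, the function names, and the binarized profiles are all hypothetical. The data are constructed to mirror the abstract's example, where C is induced only when both A and B are repressed.

```python
from typing import Callable, Dict, Sequence, Tuple

# A few two-input logic functions that can relate profiles A and B to C.
LOGIC_FUNCTIONS: Dict[str, Callable[[bool, bool], bool]] = {
    "A AND B": lambda a, b: a and b,
    "A OR B": lambda a, b: a or b,
    "A XOR B": lambda a, b: a != b,
    "NOT A AND NOT B": lambda a, b: (not a) and (not b),
}


def agreement(a: Sequence[int], b: Sequence[int], c: Sequence[int],
              f: Callable[[bool, bool], bool]) -> float:
    """Fraction of conditions in which f(A, B) reproduces the observed state of C."""
    hits = sum(bool(f(bool(x), bool(y))) == bool(z) for x, y, z in zip(a, b, c))
    return hits / len(c)


def best_logic_function(a: Sequence[int], b: Sequence[int],
                        c: Sequence[int]) -> Tuple[str, float]:
    """Return the name and agreement score of the best-fitting logic function."""
    scores = {name: agreement(a, b, c, f) for name, f in LOGIC_FUNCTIONS.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical binarized expression (1 = induced) across six conditions.
complex_a = [0, 0, 1, 1, 0, 1]
complex_b = [0, 1, 0, 1, 0, 0]
complex_c = [1, 0, 0, 0, 1, 0]   # induced only when A and B are both off

best_name, best_score = best_logic_function(complex_a, complex_b, complex_c)
```

A real analysis would also need to show that the triplet relationship explains C better than either A or B alone, and to assess significance across the very large number of triplets tested.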
Affiliation(s)
- Einat Sprinzak
- UCLA-DOE Institute for Genomics and Proteomics, University of California Los Angeles, Los Angeles, CA, USA.
10
Babu M, Musso G, Díaz-Mejía JJ, Butland G, Greenblatt JF, Emili A. Systems-level approaches for identifying and analyzing genetic interaction networks in Escherichia coli and extensions to other prokaryotes. MOLECULAR BIOSYSTEMS 2009; 5:1439-55. [PMID: 19763343 DOI: 10.1039/b907407d] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 12/24/2022]
Abstract
Molecular interactions define the functional organization of the cell. Epistatic (genetic, or gene-gene) interactions, one of the most informative and commonly encountered forms of functional relationships, are increasingly being used to map process architecture in model eukaryotic organisms. In particular, 'systems-level' screens in yeast and worm aimed at elucidating genetic interaction networks have led to the generation of models describing the global modular organization of gene products and protein complexes within a cell. However, comparable data for prokaryotic organisms have not been available. Given its ease of growth and genetic manipulation, the Gram-negative bacterium Escherichia coli appears to be an ideal model system for performing comprehensive genome-scale examinations of genetic redundancy in bacteria. In this review, we highlight emerging experimental and computational techniques that have been developed recently to examine functional relationships and redundancy in E. coli at a systems-level, and their potential application to prokaryotes in general. Additionally, we have scanned PubMed abstracts and full-text published articles to manually curate a list of approximately 200 previously reported synthetic sick or lethal genetic interactions in E. coli derived from small-scale experimental studies.
Affiliation(s)
- Mohan Babu
- Banting and Best Department of Medical Research, Terrence Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada M5S 3E1
11
Antonov AV, Mewes HW. Complex phylogenetic profiling reveals fundamental genotype–phenotype associations. Comput Biol Chem 2008; 32:412-6. [PMID: 18753010 DOI: 10.1016/j.compbiolchem.2008.07.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 09/18/2007] [Revised: 03/28/2008] [Accepted: 07/02/2008] [Indexed: 01/19/2023]
12
The use of logic relationships to model colon cancer gene expression networks with mRNA microarray data. J Biomed Inform 2008; 41:530-43. [DOI: 10.1016/j.jbi.2007.11.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 04/04/2007] [Revised: 11/04/2007] [Accepted: 11/24/2007] [Indexed: 11/17/2022]
13
Computational prediction of protein-protein interactions. Mol Biotechnol 2007; 38:1-17. [PMID: 18095187 DOI: 10.1007/s12033-007-0069-2] [Citation(s) in RCA: 126] [Impact Index Per Article: 7.4] [Received: 07/04/2007] [Accepted: 07/16/2007] [Indexed: 01/19/2023]
Abstract
Recently a number of computational approaches have been developed for the prediction of protein-protein interactions. Complete genome sequencing projects have provided the vast amount of information needed for these analyses. These methods utilize the structural, genomic, and biological context of proteins and genes in complete genomes to predict protein interaction networks and functional linkages between proteins. Given that experimental techniques remain expensive, time-consuming, and labor-intensive, these methods represent an important advance in proteomics. Some of these approaches utilize sequence data alone to predict interactions, while others combine multiple computational and experimental datasets to accurately build protein interaction maps for complete genomes. These methods represent a complementary approach to current high-throughput projects whose aim is to delineate protein interaction maps in complete genomes. We will describe a number of computational protocols for protein interaction prediction based on the structural, genomic, and biological context of proteins in complete genomes, and detail methods for protein interaction network visualization and analysis.
14
Jothi R, Przytycka TM, Aravind L. Discovering functional linkages and uncharacterized cellular pathways using phylogenetic profile comparisons: a comprehensive assessment. BMC Bioinformatics 2007; 8:173. [PMID: 17521444 PMCID: PMC1904249 DOI: 10.1186/1471-2105-8-173] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.1] [Received: 02/28/2007] [Accepted: 05/23/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: A widely used approach for discovering functional and physical interactions among proteins involves phylogenetic profile comparisons (PPCs). Here, proteins with similar profiles are inferred to be functionally related under the assumption that proteins involved in the same metabolic pathway or cellular system are likely to have been co-inherited during evolution.
RESULTS: Our experimentation with E. coli and yeast proteins with 16 different carefully composed reference sets of genomes revealed that the phyletic patterns of proteins in prokaryotes alone could be adequate to make reasonably accurate functional linkage predictions. A slight improvement in performance is observed on adding a few eukaryotes to the reference set, but a noticeable drop-off in performance is observed with an increased number of eukaryotes. Inclusion of most parasitic, pathogenic or vertebrate genomes and multiple strains of the same species in the reference set does not necessarily contribute to improved sensitivity or accuracy. Interestingly, we also found that the evolutionary histories of individual pathways have a significant effect on the performance of the PPC approach with respect to a particular reference set. For example, to accurately predict functional links in carbohydrate or lipid metabolism, a reference set solely composed of prokaryotic (or bacterial) genomes performed among the best compared to one composed of genomes from all three super-kingdoms; this is in contrast to predicting functional links in translation, for which a reference set composed of prokaryotic (or bacterial) genomes performed the worst. We also demonstrate that the widely used random null model to quantify the statistical significance of profile similarity is incomplete, which could result in an increased number of false positives.
CONCLUSION: Contrary to previous proposals, it is not merely the number of genomes but a careful selection of informative genomes in the reference set that influences the prediction accuracy of the PPC approach. We note that the predictive power of the PPC approach, especially in eukaryotes, is heavily influenced by the primary endosymbiosis and subsequent bacterial contributions. The over-representation of parasitic unicellular eukaryotes and vertebrates additionally makes eukaryotes less useful in the reference sets. Reference sets composed of a highly non-redundant set of genomes from all three super-kingdoms fare better with pathways showing considerable vertical inheritance and strong conservation (e.g. translation apparatus), while reference sets solely composed of prokaryotic genomes fare better for more variable pathways like carbohydrate metabolism. Differential performance of the PPC approach on various pathways, and a weak positive correlation between functional and profile similarities, suggest that caution should be exercised when interpreting functional linkages inferred from genome-wide large-scale profile comparisons using a single reference set.
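To make the dependence on the reference set concrete, here is a minimal sketch of a phylogenetic profile comparison. The Jaccard index is used as an assumed similarity measure (the abstract does not say which measures the study evaluated), and the genome labels, profiles, and function names are hypothetical.

```python
from typing import Sequence, Set


def jaccard_similarity(p: Sequence[int], q: Sequence[int]) -> float:
    """Jaccard index of two binary presence/absence profiles."""
    both = sum(1 for x, y in zip(p, q) if x and y)
    either = sum(1 for x, y in zip(p, q) if x or y)
    return both / either if either else 0.0


def restrict_profile(profile: Sequence[int], genomes: Sequence[str],
                     keep: Set[str]) -> list:
    """Keep only the profile entries whose genome is in the chosen reference set."""
    return [value for value, genome in zip(profile, genomes) if genome in keep]


# Hypothetical reference genomes and presence (1) / absence (0) profiles.
genomes = ["bact1", "bact2", "bact3", "euk1", "euk2"]
protein_x = [1, 1, 0, 1, 0]
protein_y = [1, 1, 0, 0, 1]

prokaryotes = {"bact1", "bact2", "bact3"}
similarity_all = jaccard_similarity(protein_x, protein_y)
similarity_prokaryotes = jaccard_similarity(
    restrict_profile(protein_x, genomes, prokaryotes),
    restrict_profile(protein_y, genomes, prokaryotes),
)
```

In this made-up case the same protein pair looks perfectly co-inherited on the prokaryote-only reference set but only moderately similar on the full set, which is the kind of reference-set sensitivity the abstract cautions about.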
Affiliation(s)
- Raja Jothi
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Teresa M Przytycka
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- L Aravind
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
15
Shoemaker BA, Panchenko AR. Deciphering protein-protein interactions. Part II. Computational methods to predict protein and domain interaction partners. PLoS Comput Biol 2007; 3:e43. [PMID: 17465672 PMCID: PMC1857810 DOI: 10.1371/journal.pcbi.0030043] [Citation(s) in RCA: 212] [Impact Index Per Article: 12.5] [Indexed: 12/27/2022] Open
Abstract
Recent advances in high-throughput experimental methods for the identification of protein interactions have resulted in a large amount of diverse data that are somewhat incomplete and contradictory. As valuable as they are, such experimental approaches studying protein interactomes have certain limitations that can be complemented by the computational methods for predicting protein interactions. In this review we describe different approaches to predict protein interaction partners as well as highlight recent achievements in the prediction of specific domains mediating protein-protein interactions. We discuss the applicability of computational methods to different types of prediction problems and point out limitations common to all of them.
16
Weber APM, Fischer K. Making the connections--the crucial role of metabolite transporters at the interface between chloroplast and cytosol. FEBS Lett 2007; 581:2215-22. [PMID: 17316618 DOI: 10.1016/j.febslet.2007.02.010] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Received: 01/22/2007] [Revised: 02/06/2007] [Accepted: 02/07/2007] [Indexed: 10/23/2022]
Abstract
Eukaryotic cells are most fascinating because of their high degree of compartmentation. This is particularly true for plant cells, due to the presence of chloroplasts, photosynthetic organelles of endosymbiotic origin that can be traced back to a single cyanobacterial ancestor. Plastids are major hubs in the metabolic network of plant cells, their metabolism being heavily intertwined with that of the cytosol and of other organelles. Solute transport across the plastid envelope by metabolite transporters is key to integrating plastid metabolism with that of other cellular compartments. Here, we review the advances in understanding metabolite transport across the plastid envelope membrane.
Affiliation(s)
- Andreas P M Weber
- Department of Plant Biology, Michigan State University, East Lansing, MI 48824, USA.
17
Antonov AV, Mewes HW. Complex functionality of gene groups identified from high-throughput data. J Mol Biol 2006; 363:289-96. [PMID: 16959266 DOI: 10.1016/j.jmb.2006.07.062] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 04/12/2006] [Revised: 07/24/2006] [Accepted: 07/25/2006] [Indexed: 12/19/2022]
Abstract
Relating experimental data to biological knowledge is necessary to cope with the avalanches of new data emerging from recent developments in high-throughput technologies. Automatic functional profiling becomes the de facto standard approach for the secondary analysis of high-throughput data. A number of tools employing available gene functional annotations have been developed for this purpose. However, current annotations are derived mostly from traditional analysis of the individual gene function. The complex biological phenomena carried out by the concerted activity of many genes often requires the definition of new complex functionality (related to a group of genes), which is, in many cases, not available in current annotation vocabularies. Functional profiling with annotation terms related to the description of individual biological functions of a gene may fail to provide reasonable interpretation of biological relationships in a set of genes involved in complex biological phenomena. We introduce a novel procedure to profile a complex functionality of a gene set. Complex functionality is constructed as a combination of available annotation terms. By profiling ChIP-chip data from Saccharomyces cerevisiae we demonstrate that this technique produces deeper insights into the results of high-throughput experiments that are beyond the known facts described in the functional classifications.
Affiliation(s)
- Alexey V Antonov
- GSF National Research Center for Environment and Health, Institute for Bioinformatics, Ingolstädter Landstrasse 1, D-85764 Neuherberg, Germany.
18
Spirin V, Gelfand MS, Mironov AA, Mirny LA. A metabolic network in the evolutionary context: multiscale structure and modularity. Proc Natl Acad Sci U S A 2006; 103:8774-9. [PMID: 16731630 PMCID: PMC1482654 DOI: 10.1073/pnas.0510258103] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.2] [Received: 11/28/2005] [Indexed: 01/09/2023] Open
Abstract
The enormous complexity of biological networks has led to the suggestion that networks are built of modules that perform particular functions and are "reused" in evolution in a manner similar to reusable domains in protein structures or modules of electronic circuits. Analysis of known biological networks has revealed several modules, many of which have transparent biological functions. However, it remains to be shown that identified structural modules constitute evolutionary building blocks, independent and easily interchangeable units. An alternative possibility is that evolutionary modules do not match structural modules. To investigate the structure of evolutionary modules and their relationship to functional ones, we integrated a metabolic network with evolutionary associations between genes inferred from comparative genomics. The resulting metabolic-genomic network places metabolic pathways into evolutionary and genomic context, thereby revealing previously unknown components and modules. We analyzed the integrated metabolic-genomic network on three levels: macro-, meso-, and microscale. The macroscale level demonstrates strong associations between neighboring enzymes and between enzymes that are distant on the network but belong to the same linear pathway. At the mesoscale level, we identified evolutionary metabolic modules and compared them with traditional metabolic pathways. Although, in some cases, there is almost exact correspondence, some pathways are split into independent modules. On the microscale level, we observed high association of enzyme subunits and weak association of isoenzymes independently catalyzing the same reaction. This study shows that evolutionary modules, rather than pathways, may be thought of as regulatory and functional units in bacterial genomes.
Affiliation(s)
- Victor Spirin
  - Harvard–MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139
- Mikhail S. Gelfand
  - Institute for Information Transmission Problems, Russian Academy of Sciences, Bolshoi Karetnu Pereulok 19, Moscow 127994, Russia
  - State Scientific Center GosNIIGenetika, 1-j Dorozhny Proezd 1, Moscow 117545, Russia
- Andrey A. Mironov
  - State Scientific Center GosNIIGenetika, 1-j Dorozhny Proezd 1, Moscow 117545, Russia
  - Department of Bioengineering and Bioinformatics, Moscow State University, Vorobjevy Gory 1-73, Moscow 119992, Russia
- Leonid A. Mirny
  - Harvard–MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139
19
Glazko G, Coleman M, Mushegian A. Similarity searches in genome-wide numerical data sets. Biol Direct 2006; 1:13. [PMID: 16734895 PMCID: PMC1489924 DOI: 10.1186/1745-6150-1-13] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 05/23/2006] [Accepted: 05/30/2006] [Indexed: 11/24/2022] Open
Abstract
We present psi-square, a program for searching the space of gene vectors. The program starts with a gene vector, i.e., the set of measurements associated with a gene, and finds similar vectors, derives a probabilistic model of these vectors, then repeats search using this model as a query, and continues to update the model and search again, until convergence. When applied to three different pathway-discovery problems, psi-square was generally more sensitive and sometimes more specific than the ad hoc methods developed for solving each of these problems before.
Affiliation(s)
- Galina Glazko
  - Stowers Institute for Medical Research, 1000 E 50St., Kansas City MO 64110, USA
  - University of Rochester Medical Center, Rochester, NY 14642, USA
- Michael Coleman
  - Stowers Institute for Medical Research, 1000 E 50St., Kansas City MO 64110, USA
- Arcady Mushegian
  - Stowers Institute for Medical Research, 1000 E 50St., Kansas City MO 64110, USA
  - Department of Microbiology, Molecular Genetics, and Immunology, University of Kansas Medical Center, Kansas City, KS 66160, USA
20
Appella E, Anderson CW. Identifying protein interactions. Experimental approaches. FEBS J 2005. [DOI: 10.1111/j.1742-4658.2005.04969.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]