1
Zhang YC, Lin K. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions. Evol Bioinform Online 2015; 11:1-9. [PMID: 26715828 PMCID: PMC4686347 DOI: 10.4137/ebo.s33491] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/23/2015] [Revised: 11/10/2015] [Accepted: 11/16/2015] [Indexed: 11/25/2022] Open
Abstract
Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms.
Affiliation(s)
- Yan-Cong Zhang
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University, Beijing, China; MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, China
- Kui Lin
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University, Beijing, China; MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, China
2
Wiwanitkit V. Utilization of multiple "omics" studies in microbial pathogeny for microbiology insights. Asian Pac J Trop Biomed 2013; 3:330-3. [PMID: 23620861 DOI: 10.1016/s2221-1691(13)60073-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 01/13/2013] [Accepted: 02/20/2013] [Indexed: 11/28/2022] Open
Abstract
Bioinformatics has become a modern science with several advantages. Several new "omics" sciences have been introduced in recent years, and they can be applied in biomedical work. Here, the author summarizes and discusses important applications of omics studies in microbiology, focusing on microbial pathogeny. Genomics and proteomics can be put to good use in this area of biomedical research.
3
Luo Y, Battistuzzi F, Lin K. Evolutionary dynamics of overlapped genes in Salmonella. PLoS One 2013; 8:e81016. [PMID: 24312259 PMCID: PMC3843671 DOI: 10.1371/journal.pone.0081016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/14/2013] [Accepted: 10/16/2013] [Indexed: 11/19/2022] Open
Abstract
Presence of overlapping genes (OGs) is a common phenomenon in bacterial genomes. Most frequently, overlapping genes share coding regions with as few as one nucleotide to as many as thousands of nucleotides. Overlapping genes are often co-regulated, transcriptionally and translationally. Overlapping genes are also subject to the whims of evolution, as the gene overlap is known to be disrupted in some species/strains and participating genes are sometimes lost in independent lineages. Therefore, a better understanding of evolutionary patterns and rates of the disruption of overlapping genes is an important component of genome structure and evolution of gene function. In this study, we investigate the fate of ancestrally overlapping genes in complete genomes from 15 contemporary strains of Salmonella species. We find that the fates of overlapping genes inside and outside operons are distinctly different. A larger fraction of overlapping genes inside operons conserves their overlap as compared to gene pairs outside of the operons (average 0.89 vs. 0.83 per genome). However, when overlapping genes in the operons separate, one partner is lost more frequently than in those separated genes outside of operons (average 0.02 vs. 0.01 per genome). We also investigate the fate of a pan set of overlapping genes at the present and ancestral nodes over a phylogenetic tree based on genome sequence data, respectively. We propose that co-regulation plays important roles on the fates of genes. Furthermore, a vast majority of disruptions occurred prior to the common ancestor of all 15 Salmonella strains, which enables us to obtain an estimate of disruptions between Salmonella and E. coli.
Affiliation(s)
- Yingqin Luo
- Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, Arizona, United States of America
- Center for Infectious Diseases and Vaccinology, The Biodesign Institute, Arizona State University, Tempe, Arizona, United States of America
- Fabia Battistuzzi
- Department of Biological Sciences, Oakland University, Rochester, Michigan, United States of America
- Kui Lin
- College of Life Sciences, Beijing Normal University, Beijing, China
4
Prokaryotic phylogenies inferred from whole-genome sequence and annotation data. BioMed Research International 2013; 2013:409062. [PMID: 24073404 PMCID: PMC3773407 DOI: 10.1155/2013/409062] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/15/2013] [Revised: 06/26/2013] [Accepted: 07/22/2013] [Indexed: 11/25/2022]
Abstract
Phylogenetic trees are used to represent the evolutionary relationship among various groups of species. In this paper, a novel method for inferring prokaryotic phylogenies using multiple genomic information is proposed. The method is called CGCPhy and based on the distance matrix of orthologous gene clusters between whole-genome pairs. CGCPhy comprises four main steps. First, orthologous genes are determined by sequence similarity, genomic function, and genomic structure information. Second, genes involving potential HGT events are eliminated, since such genes are considered to be the highly conserved genes across different species and the genes located on fragments with abnormal genome barcode. Third, we calculate the distance of the orthologous gene clusters between each genome pair in terms of the number of orthologous genes in conserved clusters. Finally, the neighbor-joining method is employed to construct phylogenetic trees across different species. CGCPhy has been examined on different datasets from 617 complete single-chromosome prokaryotic genomes and achieved applicative accuracies on different species sets in agreement with Bergey's taxonomy in quartet topologies. Simulation results show that CGCPhy achieves high average accuracy and has a low standard deviation on different datasets, so it has an applicative potential for phylogenetic analysis.
5
Zhu S, Wang HL, Wang C, Tang L, Wang X, Yu KJ, Liu SL. Non-contiguous finished genome sequence and description of Salmonella enterica subsp. houtenae str. RKS3027. Stand Genomic Sci 2013; 8:198-205. [PMID: 23991252 PMCID: PMC3746422 DOI: 10.4056/sigs.3767427] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/20/2022] Open
Abstract
Salmonella enterica subsp. houtenae serovar 16:z4, z32:-- str. RKS3027 was isolated from a human in Illinois, USA. S. enterica subsp. houtenae is a facultative aerobic rod-shaped Gram-negative bacterium. Here we describe the features of this organism, together with the draft genome sequence and annotation. The 4,404,136 bp long genome (97 contigs) contains 4,335 protein-coding genes and 28 RNA genes.
Affiliation(s)
- Songling Zhu
- Genomics Research Center of Harbin Medical University, Harbin, China; Genetic Detection Center of First Affiliated Hospital, Harbin Medical University, Harbin, China
6
Rosenfeld JA, DeSalle R. E value cutoff and eukaryotic genome content phylogenetics. Mol Phylogenet Evol 2012; 63:342-50. [PMID: 22306824 DOI: 10.1016/j.ympev.2012.01.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 05/22/2011] [Revised: 01/02/2012] [Accepted: 01/03/2012] [Indexed: 10/14/2022]
Abstract
Genome content analysis has been used as a source of phylogenetic information in large prokaryotic tree of life studies. Recently the sequencing of many eukaryotic genomes has allowed for the similar use of genome content analysis for these organisms too. In this communication we examine the utility of genome content analysis for recovering phylogenetic patterns in several eukaryotic groups. By constructing multiple matrices using different e value cutoffs we examine the dynamics of altering the e value cutoff on five eukaryotic genome data sets. Our analysis indicates that the e value cutoff that is used as a criterion in the construction of the genome content matrix is a critical factor in both the accuracy and information content of the analysis. Strikingly, genome content by itself is not a reliable or accurate source of characters for phylogenetic analysis of the taxa in the five data sets we analyzed. We discuss two problems--small genome attraction and genome duplications as being involved in the rather poor performance of genome content data in recovering eukaryotic phylogeny.
Affiliation(s)
- Jeffrey A Rosenfeld
- IST/High Performance and Research Computing, University of Medicine and Dentistry of New Jersey, Newark, NJ 07103, United States.
7
Wang Z, Zhang XC, Le MH, Xu D, Stacey G, Cheng J. A protein domain co-occurrence network approach for predicting protein function and inferring species phylogeny. PLoS One 2011; 6:e17906. [PMID: 21455299 PMCID: PMC3063783 DOI: 10.1371/journal.pone.0017906] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 10/23/2010] [Accepted: 02/16/2011] [Indexed: 11/18/2022] Open
Abstract
Protein Domain Co-occurrence Network (DCN) is a biological network that has not been fully studied. We analyzed the properties of the DCNs of H. sapiens, S. cerevisiae, C. elegans, D. melanogaster, and 15 plant genomes. These DCNs have the hallmark features of scale-free networks. We investigated the possibility of using DCNs to predict protein and domain functions. Based on our experiment conducted on 66 randomly selected proteins, the best of top 3 predictions made by our DCN-based aggregated neighbor-counting method achieved a semantic similarity score of 0.81 to the actual Gene Ontology terms of the proteins. Moreover, the top 3 predictions using neighbor-counting, χ², and a SVM-based method achieved an accuracy of 66%, 59%, and 61%, respectively, when used to predict specific Gene Ontology terms of human target domains. These predictions on average had a semantic similarity score of 0.82, 0.80, and 0.79 to the actual Gene Ontology terms, respectively. We also used DCNs to predict whether a domain is an enzyme domain, and our SVM-based and neighbor-inference method correctly classified 79% and 77% of the target domains, respectively. When using DCNs to classify a target domain into one of the six enzyme classes, we found that, as long as there is one EC number available in the neighboring domains, our SVM-based and neighboring-counting method correctly classified 92.4% and 91.9% of the target domains, respectively. Furthermore, we benchmarked the performance of using DCNs to infer species phylogenies on six different combinations of 398 single-chromosome prokaryotic genomes. The phylogenetic tree of 54 prokaryotic taxa generated by our DCNs-alignment-based method achieved a 93.45% similarity score compared to the Bergey's taxonomy. In summary, our studies show that genome-wide DCNs contain rich information that can be effectively used to decipher protein function and reveal the evolutionary relationship among species.
Affiliation(s)
- Zheng Wang
- Department of Computer Science, University of Missouri, Columbia, Missouri, United States of America
- Xue-Cheng Zhang
- Christopher S. Bond Life Science Center, University of Missouri, Columbia, Missouri, United States of America
- Division of Plant Science, University of Missouri, Columbia, Missouri, United States of America
- Mi Ha Le
- Division of Plant Science, University of Missouri, Columbia, Missouri, United States of America
- Dong Xu
- Department of Computer Science, University of Missouri, Columbia, Missouri, United States of America
- Christopher S. Bond Life Science Center, University of Missouri, Columbia, Missouri, United States of America
- Informatics Institute, University of Missouri, Columbia, Missouri, United States of America
- Gary Stacey
- Christopher S. Bond Life Science Center, University of Missouri, Columbia, Missouri, United States of America
- Division of Plant Science, University of Missouri, Columbia, Missouri, United States of America
- Jianlin Cheng
- Department of Computer Science, University of Missouri, Columbia, Missouri, United States of America
- Christopher S. Bond Life Science Center, University of Missouri, Columbia, Missouri, United States of America
- Informatics Institute, University of Missouri, Columbia, Missouri, United States of America
8
Cheng CH, Yang CH, Chiu HT, Lu CL. Reconstructing genome trees of prokaryotes using overlapping genes. BMC Bioinformatics 2010; 11:102. [PMID: 20181237 PMCID: PMC2845580 DOI: 10.1186/1471-2105-11-102] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/23/2009] [Accepted: 02/24/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Overlapping genes (OGs) are defined as adjacent genes whose coding sequences overlap partially or entirely. In fact, they are ubiquitous in microbial genomes and more conserved between species than non-overlapping genes. Based on this property, we have previously implemented a web server, named OGtree, that allows the user to reconstruct genome trees of some prokaryotes according to their pairwise OG distances. By analogy to the analyses of gene content and gene order, the OG distance between two genomes we defined was based on a measure of combining OG content (i.e., the normalized number of shared orthologous OG pairs) and OG order (i.e., the normalized OG breakpoint distance) in their whole genomes. A shortcoming of using the concept of breakpoints to define the OG distance is its inability to analyze the OG distance of multi-chromosomal genomes. In addition, the amount of overlapping coding sequences between some distantly related prokaryotic genomes may be limited so that it is hard to find enough OGs to properly evaluate their pairwise OG distances. RESULTS In this study, we therefore define a new OG order distance that is based on more biologically accurate rearrangements (e.g., reversals, transpositions and translocations) rather than breakpoints and that is applicable to both uni-chromosomal and multi-chromosomal genomes. In addition, we expand the term "gene" to include both its coding sequence and regulatory regions so that two adjacent genes whose coding sequences or regulatory regions overlap with each other are considered as a pair of overlapping genes. This is because overlapping of regulatory regions of distinct genes suggests that the regulation of expression for these genes should be more or less interrelated. Based on these modifications, we have reimplemented our OGtree as a new web server, named OGtree2, and have also evaluated its accuracy of genome tree reconstruction on a testing dataset consisting of 21 Proteobacteria genomes. 
Our experimental results have finally shown that our current OGtree2 indeed outperforms its previous version OGtree, as well as another similar server, called BPhyOG, significantly in the quality of genome tree reconstruction, because the phylogenetic tree obtained by OGtree2 is greatly congruent with the reference tree that coincides with the taxonomy accepted by biologists for these Proteobacteria. CONCLUSIONS In this study, we have introduced a new web server OGtree2 at http://bioalgorithm.life.nctu.edu.tw/OGtree2.0/ that can serve as a useful tool for reconstructing more precise and robust genome trees of prokaryotes according to their overlapping genes.
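The OG content component described in this abstract (the normalized number of shared orthologous OG pairs) can be illustrated with a minimal sketch. The abstract does not state the exact normalization, so dividing by the size of the smaller OG set is an assumption made here for illustration, and the function name `og_content_distance` and the identifiers in the example are hypothetical. It is not the OGtree2 implementation.

```python
from typing import Set


def og_content_distance(ogs_a: Set[str], ogs_b: Set[str]) -> float:
    """Distance in [0, 1] from the share of orthologous OG pairs two genomes have in common.

    ASSUMPTION: normalization by the smaller set; the source abstract only says
    "normalized number of shared orthologous OG pairs".
    """
    if not ogs_a or not ogs_b:
        # With no OG pairs in one genome, nothing can be shared.
        return 1.0
    shared = len(ogs_a & ogs_b)
    return 1.0 - shared / min(len(ogs_a), len(ogs_b))


# Hypothetical orthologous OG-pair identifiers for two genomes.
genome_a = {"og1", "og2", "og3", "og4"}
genome_b = {"og2", "og3", "og5"}
print(round(og_content_distance(genome_a, genome_b), 3))  # 0.333
```

Normalizing by the smaller set keeps a small genome from looking distant merely because it has fewer OG pairs; other normalizations (for example, by the union) would be equally valid sketches.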
Affiliation(s)
- Chih-Hsien Cheng
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Chung-Han Yang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Hsien-Tai Chiu
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
- Chin Lung Lu
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
9
Pallejà A, Reverter T, Garcia-Vallvé S, Romeu A. PairWise Neighbours database: overlaps and spacers among prokaryote genomes. BMC Genomics 2009; 10:281. [PMID: 19555467 PMCID: PMC2716372 DOI: 10.1186/1471-2164-10-281] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 01/14/2009] [Accepted: 06/25/2009] [Indexed: 05/25/2023] Open
Abstract
Background Although prokaryotes live in a variety of habitats and possess different metabolic and genomic complexity, they have several genomic architectural features in common. Overlapping genes are a common feature of prokaryote genomes. The overlapping lengths tend to be short because as the overlaps become longer they have more risk of deleterious mutations. The spacers between genes tend to be short too because of the tendency to reduce the noncoding DNA among prokaryotes. However, they must be long enough to maintain essential regulatory signals such as the Shine-Dalgarno (SD) sequence, which is responsible for efficient translation. Description PairWise Neighbours is an interactive and intuitive database used for retrieving information about the spacers and overlapping genes among bacterial and archaeal genomes. It contains 1,956,294 gene pairs from 678 fully sequenced prokaryote genomes and is freely available at the URL . This database provides information about the overlaps and their conservation across species. Furthermore, it allows the wide analysis of the intergenic regions providing useful information such as the location and strength of the SD sequence. Conclusion There are experiments and bioinformatic analyses that rely on correct annotations of the initiation site. Therefore, a database that studies the overlaps and spacers among prokaryotes appears to be desirable. PairWise Neighbours database permits the reliability analysis of the overlapping structures and the study of the SD presence and location among the adjacent genes, which may help to check the annotation of the initiation sites.
Affiliation(s)
- Albert Pallejà
- Department of Biochemistry and Biotechnology, Rovira i Virgili University, Tarragona, Catalunya, Spain.
10
Lin GN, Cai Z, Lin G, Chakraborty S, Xu D. ComPhy: prokaryotic composite distance phylogenies inferred from whole-genome gene sets. BMC Bioinformatics 2009; 10 Suppl 1:S5. [PMID: 19208152 PMCID: PMC2648732 DOI: 10.1186/1471-2105-10-s1-s5] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022] Open
Abstract
Background With the increasing availability of whole genome sequences, it is becoming more and more important to use complete genome sequences for inferring species phylogenies. We developed a new tool ComPhy, 'Composite Distance Phylogeny', based on a composite distance matrix calculated from the comparison of complete gene sets between genome pairs to produce a prokaryotic phylogeny. Results The composite distance between two genomes is defined by three components: Gene Dispersion Distance (GDD), Genome Breakpoint Distance (GBD) and Gene Content Distance (GCD). GDD quantifies the dispersion of orthologous genes along the genomic coordinates from one genome to another; GBD measures the shared breakpoints between two genomes; GCD measures the level of shared orthologs between two genomes. The phylogenetic tree is constructed from the composite distance matrix using a neighbor joining method. We tested our method on 9 datasets from 398 completely sequenced prokaryotic genomes. We have achieved above 90% agreement in quartet topologies between the tree created by our method and the tree from the Bergey's taxonomy. In comparison to several other phylogenetic analysis methods, our method showed consistently better performance. Conclusion ComPhy is a fast and robust tool for genome-wide inference of evolutionary relationship among genomes. It can be downloaded from .
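As a rough illustration of the composite distance idea in this abstract, the sketch below combines three precomputed pairwise matrices (GDD, GBD, GCD) into one matrix that a neighbor-joining routine could then consume. The abstract does not say how the three components are combined, so the equal-weight average used here is an assumption, and `composite_distance` is a hypothetical name rather than part of ComPhy.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def composite_distance(
    gdd: Matrix,
    gbd: Matrix,
    gcd: Matrix,
    weights: Sequence[float] = (1 / 3, 1 / 3, 1 / 3),
) -> Matrix:
    """Combine three pairwise distance matrices into one weighted matrix.

    ASSUMPTION: equal weights by default; the source abstract does not
    specify the combination rule.
    """
    n = len(gdd)
    if len(gbd) != n or len(gcd) != n:
        raise ValueError("all three matrices must have the same size")
    w_gdd, w_gbd, w_gcd = weights
    return [
        [w_gdd * gdd[i][j] + w_gbd * gbd[i][j] + w_gcd * gcd[i][j] for j in range(n)]
        for i in range(n)
    ]


# Made-up component distances for two genomes.
gdd = [[0.0, 0.3], [0.3, 0.0]]
gbd = [[0.0, 0.6], [0.6, 0.0]]
gcd = [[0.0, 0.9], [0.9, 0.0]]
combined = composite_distance(gdd, gbd, gcd)
print(round(combined[0][1], 3))  # 0.6
```

Keeping the weights as a parameter makes it easy to test how sensitive a resulting tree is to each component.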
Affiliation(s)
- Guan Ning Lin
- Digital Biology Laboratory, Informatics Institute, Computer Science Department and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.
11
Sabath N, Graur D, Landan G. Same-strand overlapping genes in bacteria: compositional determinants of phase bias. Biol Direct 2008; 3:36. [PMID: 18717987 PMCID: PMC2542354 DOI: 10.1186/1745-6150-3-36] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Received: 08/19/2008] [Accepted: 08/21/2008] [Indexed: 11/24/2022] Open
Abstract
Background Same-strand overlapping genes may occur in frameshifts of one (phase 1) or two nucleotides (phase 2). In previous studies of bacterial genomes, long phase-1 overlaps were found to be more numerous than long phase-2 overlaps. This bias was explained by either genomic location or an unspecified selection advantage. Models that focused on the ability of the two genes to evolve independently did not predict this phase bias. Here, we propose that a purely compositional model explains the phase bias in a more parsimonious manner. Same-strand overlapping genes may arise through either a mutation at the termination codon of the upstream gene or a mutation at the initiation codon of the downstream gene. We hypothesized that given these two scenarios, the frequencies of initiation and termination codons in the two phases may determine the number for overlapping genes. Results We examined the frequencies of initiation- and termination-codons in the two phases, and found that termination codons do not significantly differ between the two phases, whereas initiation codons are more abundant in phase 1. We found that the primary factors explaining the phase inequality are the frequencies of amino acids whose codons may combine to form start codons in the two phases. We show that the frequencies of start codons in each of the two phases, and, hence, the potential for the creation of overlapping genes, are determined by a universal amino-acid frequency and species-specific codon usage, leading to a correlation between long phase-1 overlaps and genomic GC content. Conclusion Our model explains the phase bias in same-strand overlapping genes by compositional factors without invoking selection. Therefore, it can be used as a null model of neutral evolution to test selection hypotheses concerning the evolution of overlapping genes. Reviewers This article was reviewed by Bill Martin, Itai Yanai, and Mikhail Gelfand.
Affiliation(s)
- Niv Sabath
- Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA.
12
Jiang LW, Lin KL, Lu CL. OGtree: a tool for creating genome trees of prokaryotes based on overlapping genes. Nucleic Acids Res 2008; 36:W475-80. [PMID: 18456706 PMCID: PMC2447762 DOI: 10.1093/nar/gkn240] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022] Open
Abstract
OGtree is a web-based tool for constructing genome trees of prokaryotic species based on a measure of combining overlapping-gene content and overlapping-gene order in their whole genomes. The overlapping genes (OGs) are defined as adjacent genes whose coding sequences overlap partially or entirely. In fact, OGs are ubiquitous in microbial genomes and more conserved between species than non-OGs. Based on these properties, it has been suggested that OGs can serve as better phylogenetic characters than non-OGs for reconstructing the evolutionary relationships among microbial genomes. OGtree takes the accession numbers of prokaryotic genomes as its input. It then downloads their complete genomes from the National Centre for Biotechnology Information and identifies OGs in each genome and their orthologous OGs in other genomes. Next, OGtree computes an overlapping-gene distance between each pair of input genomes based on a combination of their OG content and orthologous OG order. Finally, it utilizes distance-based methods of building tree to reconstruct the genome trees of input prokaryotic genomes according to their pairwise OG distance. OGtree is available online at http://bioalgorithm.life.nctu.edu.tw/OGtree/.
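The definition used in this abstract (adjacent genes whose coding sequences overlap partially or entirely) can be made concrete with a short sketch. It assumes genes are given as (name, start, end) tuples with 1-based inclusive coordinates on a linear sequence; strand and circular chromosomes are ignored for simplicity. The function name `find_overlapping_pairs` and the gene names are hypothetical, and this is not the OGtree code.

```python
from typing import List, Tuple

# (name, start, end) with 1-based inclusive coordinates; an assumed input format.
Gene = Tuple[str, int, int]


def find_overlapping_pairs(genes: List[Gene]) -> List[Tuple[str, str, int]]:
    """Return (gene_a, gene_b, overlap_length) for adjacent genes whose coordinates overlap."""
    # Sort by start (then end) so that neighbors in the list are adjacent on the genome.
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    pairs = []
    for (name_a, _start_a, end_a), (name_b, start_b, end_b) in zip(ordered, ordered[1:]):
        # Overlap runs from the downstream start to the earlier of the two ends.
        overlap = min(end_a, end_b) - start_b + 1
        if overlap > 0:
            pairs.append((name_a, name_b, overlap))
    return pairs


genes = [
    ("geneA", 100, 400),
    ("geneB", 397, 900),    # shares 4 nt with geneA (partial overlap)
    ("geneC", 950, 1200),
    ("geneD", 1000, 1100),  # lies entirely inside geneC
]
print(find_overlapping_pairs(genes))
# [('geneA', 'geneB', 4), ('geneC', 'geneD', 101)]
```

Only immediate neighbors are compared, which matches the "adjacent genes" wording; a gene nested inside a long gene could in principle also overlap a non-adjacent one, which this sketch does not report.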
Affiliation(s)
- Li-Wei Jiang
- Institute of Bioinformatics and Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
13