Reference Citation Analysis

For: Bogusz M, Whelan S. Phylogenetic Tree Estimation With and Without Alignment: New Distance Methods and Benchmarking. Syst Biol 2018;66:218-231. [PMID: 27633353 DOI: 10.1093/sysbio/syw074] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 02/03/2016] [Accepted: 08/23/2016] [Indexed: 12/20/2022]

Cited by Other Article(s)

Julian AT, Pombert JF. SYNY: a pipeline to investigate and visualize collinearity between genomes. bioRxiv 2024:2024.05.09.593317. [PMID: 38798446 PMCID: PMC11118330 DOI: 10.1101/2024.05.09.593317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]

Wilson D, Rogers JD. Evaluating Compression-Based Phylogeny Estimation in the Presence of Incomplete Lineage Sorting. J Comput Biol 2023;30:250-260. [PMID: 36848254 DOI: 10.1089/cmb.2022.0197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023]

Ray M, Sarkar S, Rath SN. Druggability for COVID-19: in silico discovery of potential drug compounds against nucleocapsid (N) protein of SARS-CoV-2. Genomics Inform 2020;18:e43. [PMID: 33412759 PMCID: PMC7808868 DOI: 10.5808/gi.2020.18.4.e43] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/07/2020] [Revised: 10/19/2020] [Accepted: 10/20/2020] [Indexed: 12/13/2022]

Bhattacharjee A, Bayzid MS. Machine learning based imputation techniques for estimating phylogenetic trees from incomplete distance matrices. BMC Genomics 2020;21:497. [PMID: 32689946 PMCID: PMC7370488 DOI: 10.1186/s12864-020-06892-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/27/2020] [Accepted: 07/07/2020] [Indexed: 02/08/2023]

Nwaiwu O, Aduba CC. An in silico analysis of acquired antimicrobial resistance genes in Aeromonas plasmids. AIMS Microbiol 2020;6:75-91. [PMID: 32226916 PMCID: PMC7099201 DOI: 10.3934/microbiol.2020005] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/12/2020] [Accepted: 03/13/2020] [Indexed: 12/17/2022]

Ali RH, Bogusz M, Whelan S. Identifying Clusters of High Confidence Homologies in Multiple Sequence Alignments. Mol Biol Evol 2019;36:2340-2351. [PMID: 31209473 PMCID: PMC6933875 DOI: 10.1093/molbev/msz142] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Indexed: 12/16/2022]

Vizueta J, Rozas J, Sánchez-Gracia A. Comparative Genomics Reveals Thousands of Novel Chemosensory Genes and Massive Changes in Chemoreceptor Repertories across Chelicerates. Genome Biol Evol 2018;10:1221-1236. [PMID: 29788250 PMCID: PMC5952958 DOI: 10.1093/gbe/evy081] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Accepted: 04/17/2018] [Indexed: 12/15/2022]

Levy Karin E, Shkedy D, Ashkenazy H, Cartwright RA, Pupko T. Inferring Rates and Length-Distributions of Indels Using Approximate Bayesian Computation. Genome Biol Evol 2018;9:1280-1294. [PMID: 28453624 PMCID: PMC5438127 DOI: 10.1093/gbe/evx084] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Accepted: 04/25/2017] [Indexed: 02/07/2023]

Mavrodiev EV, Dell C, Schroder L. A laid-back trip through the Hennigian Forests. PeerJ 2017;5:e3578. [PMID: 28740753 PMCID: PMC5522724 DOI: 10.7717/peerj.3578] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 04/13/2017] [Accepted: 06/23/2017] [Indexed: 11/20/2022]

Abstract

BACKGROUND

This paper is a comment on the idea of matrix-free Cladistics. Demonstrating the efficiency of this idea is a major goal of the study. Within the proposed framework, the ordinary (phenetic) matrix is necessary only as a "source" of Hennigian trees, not as a primary subject of the analysis. Switching from matrix-based thinking to the matrix-free Cladistic approach clearly reveals that optimizations of character-state changes are related not to real processes, but to the form of the data representation.

METHODS

We focused our study on binary data. We wrote a simple Ruby-based script, FORESTER version 1.0, that helps represent a binary matrix as an array of rooted trees (a "Hennigian forest"). The binary representations of the genomic (DNA) data were made with the script 1001. The Average Consensus method and the standard Maximum Parsimony (MP) approach were used to analyze the data.

PRINCIPAL FINDINGS

The binary matrix may be easily rewritten as a set of rooted trees (maximal relationships). The latter can be analyzed by the Average Consensus method. Paradoxically, this method, if applied to Hennigian forests, can in principle help to identify clades despite the absence of direct evidence from the primary data. Our approach can handle clock-like or non-clock-like matrices, as well as hypothetical, molecular, or morphological data.
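The rewriting of a binary matrix as a set of rooted trees can be illustrated with a minimal Python sketch. This is not the FORESTER script, whose internals are not described here. It assumes that each binary character groups the taxa scored "1" into one clade and attaches the taxa scored "0" directly to the root, and that characters with fewer than two "1" entries or no "0" entries are skipped as uninformative. The function name `matrix_to_forest` and the Newick output format are choices made for illustration only.

```python
def matrix_to_forest(taxa, rows):
    """Rewrite a binary matrix as a list of rooted trees in Newick form.

    taxa: list of taxon names.
    rows: one string of '0'/'1' states per taxon, all of equal length.
    Assumption: state '1' marks membership in the group defined by a character.
    """
    forest = []
    n_chars = len(rows[0])
    for c in range(n_chars):
        # Taxa sharing state '1' form the clade; taxa with '0' attach to the root.
        grouped = [t for t, row in zip(taxa, rows) if row[c] == "1"]
        rest = [t for t, row in zip(taxa, rows) if row[c] == "0"]
        # Assumption: a character is informative only if it groups at least
        # two taxa and leaves at least one taxon outside the group.
        if len(grouped) < 2 or not rest:
            continue
        clade = "(" + ",".join(grouped) + ")"
        forest.append("(" + ",".join(rest + [clade]) + ");")
    return forest


# Hypothetical four-taxon matrix with three characters.
example_taxa = ["A", "B", "C", "D"]
example_rows = ["000", "100", "110", "111"]
example_forest = matrix_to_forest(example_taxa, example_rows)
for tree in example_forest:
    print(tree)
```

In this toy matrix the third character scores only taxon D as "1", so it yields no tree under the stated assumption. The resulting list of trees is the kind of input a consensus method would take, but the Average Consensus step itself is not sketched here.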

DISCUSSION

Our proposal clearly differs from the numerous phenetic alignment-free techniques for constructing phylogenetic trees. Dealing with relations rather than with the actual "data" also distinguishes our approach from all optimization-based methods, whether standard alignment-based techniques or the "direct" alignment-free procedure, if optimization is defined as a way to reconstruct the sequences of character-state changes on a tree. We do not view our framework as an alternative to three-taxon statement analysis (3TA), but there are two major differences between our proposal and the 3TA as originally designed and implemented: (1) the 3TA deals with three-taxon statements, or minimal relationships. According to the logic of 3TA, the set of minimal trees must be established as a binary matrix and used as input for the parsimony program. In this paper, we operate directly with maximal relationships written as trees, not as binary matrices, and use the Average Consensus method instead of MP analysis. Groups based solely on 'reversals' can always be found by our method without separate scoring of the putative reversals before analysis.


Nojoomi S, Koehl P. String kernels for protein sequence comparisons: improved fold recognition. BMC Bioinformatics 2017;18:137. [PMID: 28245816 PMCID: PMC5331664 DOI: 10.1186/s12859-017-1560-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/21/2016] [Accepted: 02/23/2017] [Indexed: 11/28/2022]

Abstract

BACKGROUND

The amino acid sequence of a protein is the blueprint from which its structure and ultimately its function can be derived. Sequence comparison methods therefore remain essential for determining similarity between proteins. Traditional approaches for comparing two protein sequences begin with strings of letters (amino acids) that represent the sequences, then generate textual alignments between these strings and provide a score for each alignment. When the similarity between the two protein sequences is low, however, the quality of the corresponding sequence alignment is usually poor, which leads to poor performance in recognizing similarity.

RESULTS

In this study, we develop an alignment-free alternative to these methods that is based on the concept of string kernels. Starting from recently proposed kernels on the discrete space of protein sequences (Shen et al., Found. Comput. Math., 2013, 14:951-984), we introduce our own version, SeqKernel. Its implementation depends on two parameters: a coefficient that tunes the substitution matrix and the maximum length of the k-mers that it includes. We provide an exhaustive analysis of the impact of these two parameters on the performance of SeqKernel for fold recognition. We show that, with the right choice of parameters, the SeqKernel similarity measure improves fold recognition compared to traditional alignment-based methods. We illustrate the application of SeqKernel to inferring phylogeny on RNA polymerases and show that it performs as well as methods based on multiple sequence alignments.
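The general idea of a k-mer string kernel with two tunable parameters can be sketched in Python as follows. This is an illustrative toy, not SeqKernel. The residue similarity used here (1 for identical residues, a hypothetical mismatch score raised to the power `beta` otherwise) stands in for a real substitution matrix, and the names `residue_similarity`, `kmer_kernel`, `normalized_kernel`, `beta`, and `k_max` are assumptions made for this example.

```python
from math import sqrt


def residue_similarity(a, b, beta, mismatch=0.2):
    """Toy residue-level similarity (assumption, not a real substitution matrix).

    Identical residues score 1; any mismatch scores mismatch ** beta, so a
    larger beta penalizes substitutions more strongly.
    """
    return 1.0 if a == b else mismatch ** beta


def kmer_kernel(s, t, k_max, beta):
    """Sum, over all pairs of k-mers of equal length k <= k_max, of the
    product of residue-level similarities at matching positions."""
    total = 0.0
    for k in range(1, k_max + 1):
        for i in range(len(s) - k + 1):
            for j in range(len(t) - k + 1):
                product = 1.0
                for a, b in zip(s[i:i + k], t[j:j + k]):
                    product *= residue_similarity(a, b, beta)
                total += product
    return total


def normalized_kernel(s, t, k_max=3, beta=1.0):
    """Scale the kernel so that every sequence has similarity 1 with itself."""
    k_st = kmer_kernel(s, t, k_max, beta)
    k_ss = kmer_kernel(s, s, k_max, beta)
    k_tt = kmer_kernel(t, t, k_max, beta)
    return k_st / sqrt(k_ss * k_tt)


# Hypothetical short peptides used only to exercise the functions.
close_pair = normalized_kernel("MKVL", "MKIL")
distant_pair = normalized_kernel("MKVL", "GGGG")
print(close_pair, distant_pair)
```

No alignment is computed at any point: the score depends only on how similar the k-mers of the two sequences are, which is the property the abstract describes as alignment-free. The cost of this naive version grows with the product of the two sequence lengths and with `k_max`.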

CONCLUSION

We have presented and characterized a new alignment-free method based on a mathematical kernel for scoring the similarity of protein sequences. We discuss possible improvements of this method, as well as an extension of its applications to other modeling methods that rely on sequence comparison.
