1
NOJAH: NOt Just Another Heatmap for genome-wide cluster analysis. PLoS One 2019; 14:e0204542. [PMID: 30921318 PMCID: PMC6438523 DOI: 10.1371/journal.pone.0204542] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 09/05/2018] [Accepted: 02/25/2019] [Indexed: 02/05/2023]
Abstract
Since their inception, several tools have been developed for cluster analysis and heatmap construction. The application of such tools to the number and types of genome-wide data available from next generation sequencing (NGS) technologies requires the adaptation of statistical concepts, such as in defining a most variable gene set, and more intricate cluster analysis methods to address multiple omic data types. Additionally, the growing number of publicly available datasets has created the desire to estimate the statistical significance of a gene signature derived from one dataset to similarly group samples based on another dataset. The currently available number of tools and their combined use for generating heatmaps, along with the several adaptations of statistical concepts for addressing the higher dimensionality of genome-wide NGS-derived data, has created a further challenge in the ability to replicate heatmap results. We introduce NOJAH (NOt Just Another Heatmap), an interactive tool that defines and implements a workflow for genome-wide cluster analysis and heatmap construction by creating and combining several tools into a single user interface. NOJAH includes several newly developed scripts for techniques that, though frequently applied, are not sufficiently documented to allow for replicability of results. These techniques include: defining a most variable gene set (a.k.a. ‘core genes’), estimating the statistical significance of a gene signature to separate samples into clusters, and performing a result-merging integrated cluster analysis. With only a user-uploaded dataset, NOJAH provides as output, among other things, the minimum documentation required for replicating heatmap results. Additionally, NOJAH contains five different existing R packages that are connected in the interface by their functionality as part of a defined workflow for genome-wide cluster analysis.
The NOJAH application tool is available at http://bbisr.shinyapps.winship.emory.edu/NOJAH/ with corresponding source code available at https://github.com/bbisr-shinyapps/NOJAH/.
2
Manjunath M, Zhang Y, Kim Y, Yeo SH, Sobh O, Russell N, Followell C, Bushell C, Ravaioli U, Song JS. ClusterEnG: an interactive educational web resource for clustering and visualizing high-dimensional data. PeerJ Comput Sci 2018; 4:e155. [PMID: 30906871 PMCID: PMC6429934 DOI: 10.7717/peerj-cs.155] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/05/2018] [Accepted: 05/01/2018] [Indexed: 06/09/2023]
Abstract
SUMMARY: Clustering is one of the most common techniques used in data analysis to discover hidden structures by grouping together data points that are similar in some measure into clusters. Although there are many programs available for performing clustering, a single web resource that provides both state-of-the-art clustering methods and interactive visualizations is lacking. ClusterEnG (acronym for Clustering Engine for Genomics) provides an interface for clustering big data and interactive visualizations including 3D views, cluster selection and zoom features. ClusterEnG also aims at educating the user about the similarities and differences between various clustering algorithms and provides clustering tutorials that demonstrate potential pitfalls of each algorithm. The web resource will be particularly useful to scientists who are not conversant with computing but want to understand the structure of their data in an intuitive manner. AVAILABILITY: ClusterEnG is part of a bigger project called KnowEnG (Knowledge Engine for Genomics) and is available at http://education.knoweng.org/clustereng. CONTACT: songi@illinois.edu.
Affiliation(s)
- Mohith Manjunath
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Yi Zhang
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Yeonsung Kim
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Steve H. Yeo
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Omar Sobh
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Nathan Russell
- Illinois Applied Research Institute, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Christian Followell
- Illinois Applied Research Institute, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Colleen Bushell
- Illinois Applied Research Institute, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Umberto Ravaioli
- Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Jun S. Song
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
- Department of Physics, University of Illinois at Urbana-Champaign, Champaign, IL, United States of America
3
Méric G, Mageiros L, Pascoe B, Woodcock DJ, Mourkas E, Lamble S, Bowden R, Jolley KA, Raymond B, Sheppard SK. Lineage-specific plasmid acquisition and the evolution of specialized pathogens in Bacillus thuringiensis and the Bacillus cereus group. Mol Ecol 2018; 27:1524-1540. [PMID: 29509989 PMCID: PMC5947300 DOI: 10.1111/mec.14546] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 08/09/2017] [Revised: 02/06/2018] [Accepted: 02/20/2018] [Indexed: 12/20/2022]
Abstract
Bacterial plasmids can vary from small selfish genetic elements to large autonomous replicons that constitute a significant proportion of total cellular DNA. By conferring novel function to the cell, plasmids may facilitate evolution but their mobility may be opposed by co-evolutionary relationships with chromosomes or encouraged via the infectious sharing of genes encoding public goods. Here, we explore these hypotheses through large-scale examination of the association between plasmids and chromosomal DNA in the phenotypically diverse Bacillus cereus group. This complex group is rich in plasmids, many of which encode essential virulence factors (Cry toxins) that are known public goods. We characterized population genomic structure, gene content and plasmid distribution to investigate the role of mobile elements in diversification. We analysed coding sequence within the core and accessory genome of 190 B. cereus group isolates, including 23 novel sequences and genes from 410 reference plasmid genomes. While cry genes were widely distributed, those with invertebrate toxicity were predominantly associated with one sequence cluster (clade 2) and phenotypically defined Bacillus thuringiensis. Cry toxin plasmids in clade 2 showed evidence of recent horizontal transfer and variable gene content, a pattern of plasmid segregation consistent with transfer during infectious cooperation. Nevertheless, comparison between clades suggests that co-evolutionary interactions may drive association between plasmids and chromosomes and limit wider transfer of key virulence traits. Proliferation of successful plasmid and chromosome combinations is a feature of specialized pathogens with characteristic niches (Bacillus anthracis, B. thuringiensis) and has occurred multiple times in the B. cereus group.
Affiliation(s)
- Guillaume Méric
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- Ben Pascoe
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- MRC CLIMB Consortium, University of Bath, Bath, UK
- Dan J. Woodcock
- Mathematics Institute and Zeeman Institute for Systems Biology and Infectious Epidemiology Research, University of Warwick, Coventry, UK
- Evangelos Mourkas
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- Sarah Lamble
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
- Rory Bowden
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
- Ben Raymond
- Department of Life Sciences, Faculty of Natural Sciences, Imperial College London, Ascot, UK
- Department of Biosciences, University of Exeter, Exeter, UK
- Samuel K. Sheppard
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- MRC CLIMB Consortium, University of Bath, Bath, UK
- Department of Zoology, University of Oxford, Oxford, UK