1
Li G, Zeng X, Li Y, Li J, Huang X, Zhao D. BRITTLE CULM17, a Novel Allele of TAC4, Affects the Mechanical Properties of Rice Plants. Int J Mol Sci 2022; 23:ijms23105305. [PMID: 35628116 PMCID: PMC9140386 DOI: 10.3390/ijms23105305] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/02/2022] [Revised: 04/21/2022] [Accepted: 04/22/2022] [Indexed: 01/27/2023]
Abstract
Lodging resistance of rice (Oryza sativa L.) has long been a major concern in agricultural production. A brittle stem mutant, osbc17, was identified by screening an EMS (ethyl methanesulfonate) mutant library established in our laboratory. The stem segments and leaves of the mutant were visibly brittle and fragile, with low mechanical strength. Examination of paraffin sections of flag leaf and internode samples indicated that the number of cell layers in mechanical tissue of the mutant was decreased compared with the wild type, Pingtangheinuo, and scanning electron microscopy revealed that the mechanical tissue cell walls of the mutant were thinner. Lignin contents of the internodes of mature-stage rice were significantly lower in the mutant than in the wild type. By the MutMap method, we identified the candidate gene OsBC17, which is located on rice chromosome 2 and has a 2433 bp coding sequence encoding a protein of 810 amino acid residues with unknown function. According to LC-MS/MS analysis of intermediate products of the lignin synthesis pathway, the accumulation of caffeyl alcohol in the osbc17 mutant was significantly higher than in Pingtangheinuo. Caffeyl alcohol can be polymerized to the catechyl lignin monomer by laccase ChLAC8; however, ChLAC8 and OsBC17 are not homologous proteins, which suggests that the osbc17 gene is involved in this process by regulating laccase expression.
Affiliation(s)
- Guangzheng Li
  - The State Key Laboratory of Green Pesticide and Agricultural Biological Engineering, Center for Research and Development of Fine Chemicals, Guizhou University, Guiyang 550025, China
  - The Key Laboratory of Plant Resources Conservation and Germplasm Innovation in Mountainous Region, Ministry of Education, Institute of Agro-Bioengineering, College of Life Sciences, Guizhou University, Guiyang 550025, China
- Xiaofang Zeng
  - The Key Laboratory of Plant Resources Conservation and Germplasm Innovation in Mountainous Region, Ministry of Education, Institute of Agro-Bioengineering, College of Life Sciences, Guizhou University, Guiyang 550025, China
- Yan Li
  - The Key Laboratory of Plant Resources Conservation and Germplasm Innovation in Mountainous Region, Ministry of Education, Institute of Agro-Bioengineering, College of Life Sciences, Guizhou University, Guiyang 550025, China
- Jianrong Li
  - The Key Laboratory of Plant Resources Conservation and Germplasm Innovation in Mountainous Region, Ministry of Education, Institute of Agro-Bioengineering, College of Life Sciences, Guizhou University, Guiyang 550025, China
- Xiaozhen Huang
  - The Key Laboratory of Plant Resources Conservation and Germplasm Innovation in Mountainous Region, Ministry of Education, Institute of Agro-Bioengineering, College of Life Sciences, Guizhou University, Guiyang 550025, China
- Degang Zhao
  - The State Key Laboratory of Green Pesticide and Agricultural Biological Engineering, Center for Research and Development of Fine Chemicals, Guizhou University, Guiyang 550025, China
  - The Key Laboratory of Plant Resources Conservation and Germplasm Innovation in Mountainous Region, Ministry of Education, Institute of Agro-Bioengineering, College of Life Sciences, Guizhou University, Guiyang 550025, China
  - Guizhou Plant Conservation Center, Guizhou Academy of Agricultural Sciences, Guiyang 550006, China
2
Nusrat S, Harbig T, Gehlenborg N. Tasks, Techniques, and Tools for Genomic Data Visualization. COMPUTER GRAPHICS FORUM : JOURNAL OF THE EUROPEAN ASSOCIATION FOR COMPUTER GRAPHICS 2019; 38:781-805. [PMID: 31768085 PMCID: PMC6876635 DOI: 10.1111/cgf.13727] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 05/06/2023]
Abstract
Genomic data visualization is essential for interpretation and hypothesis generation as well as a valuable aid in communicating discoveries. Visual tools bridge the gap between algorithmic approaches and the cognitive skills of investigators. Addressing this need has become crucial in genomics, as biomedical research is increasingly data-driven and many studies lack well-defined hypotheses. A key challenge in data-driven research is to discover unexpected patterns and to formulate hypotheses in an unbiased manner in vast amounts of genomic and other associated data. Over the past two decades, this has driven the development of numerous data visualization techniques and tools for visualizing genomic data. Based on a comprehensive literature survey, we propose taxonomies for data, visualization, and tasks involved in genomic data visualization. Furthermore, we provide a comprehensive review of published genomic visualization tools in the context of the proposed taxonomies.
Affiliation(s)
- S Nusrat
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- T Harbig
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- N Gehlenborg
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
3
Yan S, Wang L, Zhao L, Wang H, Wang D. Evaluation of Genetic Variation among Sorghum Varieties from Southwest China via Genome Resequencing. THE PLANT GENOME 2018; 11:170098. [PMID: 30512039 DOI: 10.3835/plantgenome2017.11.0098] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 05/22/2023]
Abstract
Little is known regarding genomic variation among glutinous sorghum [Sorghum bicolor (L.) Moench] varieties grown in southwest China, which are primarily used to brew the popular Jiang-flavor liquor. This study evaluated genomic variation among six representative sorghum accessions via whole-genome resequencing. The evaluation revealed 2,365,363 single-nucleotide polymorphisms (SNPs), 394,365 insertions and deletions, and 47,567 copy number variations among the six genomes. Chromosomes 5 and 10 showed relatively high SNP densities, whereas whole-genome diversity in this population was low. In addition, some chromosomal loci exhibited obvious selection during the breeding process. Sorghum accessions from southwest China formed an elite germplasm population compared with other geographic populations, and the elite variety 'Hongyingzi' contained 79 unique genes primarily involved in basic metabolism. The six sorghum lines contained a large number of high-confidence genes, with Hongyingzi in particular possessing 104 unique genes. These findings advance our understanding of domestication of the sorghum genome, and Chinese sorghum accessions will be valuable resources for further research and breeding improvements.
4
Discovery of potential causative mutations in human coding and noncoding genome with the interactive software BasePlayer. Nat Protoc 2018; 13:2580-2600. [DOI: 10.1038/s41596-018-0052-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 12/18/2022]
5
Steuernagel B, Witek K, Jones JDG, Wulff BBH. MutRenSeq: A Method for Rapid Cloning of Plant Disease Resistance Genes. Methods Mol Biol 2017; 1659:215-229. [PMID: 28856654 DOI: 10.1007/978-1-4939-7249-4_19] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 05/29/2023]
Abstract
MutRenSeq is a method to clone disease resistance (R) genes in plants. Tips and detailed experimental protocols for the pipeline, including the complexity reduction by R gene targeted enrichment sequencing, and computational analysis based on comparative genomics are provided in this chapter.
Affiliation(s)
- Kamil Witek
  - The Sainsbury Laboratory, Norwich Research Park, Norwich, UK
6
Glueck M, Gvozdik A, Chevalier F, Khan A, Brudno M, Wigdor D. PhenoStacks: Cross-Sectional Cohort Phenotype Comparison Visualizations. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2017; 23:191-200. [PMID: 27514055 DOI: 10.1109/tvcg.2016.2598469] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 06/06/2023]
Abstract
Cross-sectional phenotype studies are used by genetics researchers to better understand how phenotypes vary across patients with genetic diseases, both within and between cohorts. Analyses within cohorts identify patterns between phenotypes and patients (e.g., co-occurrence) and isolate special cases (e.g., potential outliers). Comparing the variation of phenotypes between two cohorts can help distinguish how different factors affect disease manifestation (e.g., causal genes, age of onset, etc.). PhenoStacks is a novel visual analytics tool that supports the exploration of phenotype variation within and between cross-sectional patient cohorts. By leveraging the semantic hierarchy of the Human Phenotype Ontology, phenotypes are presented in context, can be grouped and clustered, and are summarized via overviews to support the exploration of phenotype distributions. The design of PhenoStacks was motivated by formative interviews with genetics researchers: we distil high-level tasks, present an algorithm for simplifying ontology topologies for visualization, and report the results of a deployment evaluation with four expert genetics researchers. The results suggest that PhenoStacks can help identify phenotype patterns, investigate data quality issues, and inform data collection design.
7
Hilker R, Stadermann KB, Schwengers O, Anisiforov E, Jaenicke S, Weisshaar B, Zimmermann T, Goesmann A. ReadXplorer 2-detailed read mapping analysis and visualization from one single source. Bioinformatics 2016; 32:3702-3708. [PMID: 27540267 PMCID: PMC5167064 DOI: 10.1093/bioinformatics/btw541] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Received: 03/25/2016] [Revised: 08/02/2016] [Accepted: 08/15/2016] [Indexed: 01/29/2023]
Abstract
MOTIVATION: The vast amount of already available and currently generated read mapping data requires comprehensive visualization, and should benefit from bioinformatics tools offering a wide spectrum of analysis functionality from just one source. Appropriate handling of multiple mapped reads during mapping analyses remains an issue that demands improvement. RESULTS: The capabilities of the read mapping analysis and visualization tool ReadXplorer were vastly enhanced. Here, we present an even finer granulated read mapping classification, improving the level of detail for analyses and visualizations. The spectrum of automatic analysis functions has been broadened to include genome rearrangement detection as well as correlation analysis between two mapping data sets. Existing functions were refined and enhanced, namely the computation of differentially expressed genes, the read count and normalization analysis and the transcription start site detection. Additionally, ReadXplorer 2 features a highly improved support for large eukaryotic data sets and a command line version, enabling its integration into workflows. Finally, the new version is now able to display any kind of tabular results from other bioinformatics tools. AVAILABILITY AND IMPLEMENTATION: http://www.readxplorer.org CONTACT: readxplorer@computational.bio.uni-giessen.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rolf Hilker
  - Bioinformatics and Systems Biology, Faculty of Biology and Chemistry, Justus-Liebig-University, Giessen 35392, Germany
- Kai Bernd Stadermann
  - Faculty of Biology, Chair of Genome Research, Bielefeld University, Bielefeld 33615, Germany
- Oliver Schwengers
  - Bioinformatics and Systems Biology, Faculty of Biology and Chemistry, Justus-Liebig-University, Giessen 35392, Germany
- Evgeny Anisiforov
  - Bioinformatics and Systems Biology, Faculty of Biology and Chemistry, Justus-Liebig-University, Giessen 35392, Germany
- Sebastian Jaenicke
  - Bioinformatics and Systems Biology, Faculty of Biology and Chemistry, Justus-Liebig-University, Giessen 35392, Germany
- Bernd Weisshaar
  - Faculty of Biology, Chair of Genome Research, Bielefeld University, Bielefeld 33615, Germany
- Tobias Zimmermann
  - Bioinformatics and Systems Biology, Faculty of Biology and Chemistry, Justus-Liebig-University, Giessen 35392, Germany
- Alexander Goesmann
  - Bioinformatics and Systems Biology, Faculty of Biology and Chemistry, Justus-Liebig-University, Giessen 35392, Germany
8
Steuernagel B, Periyannan SK, Hernández-Pinzón I, Witek K, Rouse MN, Yu G, Hatta A, Ayliffe M, Bariana H, Jones JDG, Lagudah ES, Wulff BBH. Rapid cloning of disease-resistance genes in plants using mutagenesis and sequence capture. Nat Biotechnol 2016; 34:652-5. [PMID: 27111722 DOI: 10.1038/nbt.3543] [Citation(s) in RCA: 226] [Impact Index Per Article: 28.3] [Received: 08/19/2015] [Accepted: 03/16/2016] [Indexed: 01/18/2023]
Abstract
Wild relatives of domesticated crop species harbor multiple, diverse, disease resistance (R) genes that could be used to engineer sustainable disease control. However, breeding R genes into crop lines often requires long breeding timelines of 5-15 years to break linkage between R genes and deleterious alleles (linkage drag). Further, when R genes are bred one at a time into crop lines, the protection that they confer is often overcome within a few seasons by pathogen evolution. If several cloned R genes were available, it would be possible to pyramid R genes in a crop, which might provide more durable resistance. We describe a three-step method (MutRenSeq) that combines chemical mutagenesis with exome capture and sequencing for rapid R gene cloning. We applied MutRenSeq to clone stem rust resistance genes Sr22 and Sr45 from hexaploid bread wheat. MutRenSeq can be applied to other commercially relevant crops and their relatives, including, for example, pea, bean, barley, oat, rye, rice and maize.
Affiliation(s)
- Sambasivam K Periyannan
  - Commonwealth Scientific and Industrial Research Organization (CSIRO), Agriculture Flagship, Canberra, NSW, Australia
- Matthew N Rouse
  - USDA-ARS Cereal Disease Laboratory and Department of Plant Pathology, University of Minnesota, St. Paul, Minnesota, USA
- Asyraf Hatta
  - John Innes Centre, Norwich, UK
  - Department of Agriculture Technology, Universiti Putra Malaysia, Serdang, Malaysia
- Mick Ayliffe
  - Commonwealth Scientific and Industrial Research Organization (CSIRO), Agriculture Flagship, Canberra, NSW, Australia
- Harbans Bariana
  - University of Sydney, Plant Breeding Institute, Cobbitty, NSW, Australia
- Evans S Lagudah
  - Commonwealth Scientific and Industrial Research Organization (CSIRO), Agriculture Flagship, Canberra, NSW, Australia
- Brande B H Wulff
  - The Sainsbury Laboratory, Norwich, UK
  - John Innes Centre, Norwich, UK
9
Glueck M, Hamilton P, Chevalier F, Breslav S, Khan A, Wigdor D, Brudno M. PhenoBlocks: Phenotype Comparison Visualizations. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2016; 22:101-110. [PMID: 26529691 DOI: 10.1109/tvcg.2015.2467733] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 06/05/2023]
Abstract
The differential diagnosis of hereditary disorders is a challenging task for clinicians due to the heterogeneity of phenotypes that can be observed in patients. Existing clinical tools are often text-based and do not emphasize consistency, completeness, or granularity of phenotype reporting. This can impede clinical diagnosis and limit their utility to genetics researchers. Herein, we present PhenoBlocks, a novel visual analytics tool that supports the comparison of phenotypes between patients, or between a patient and the hallmark features of a disorder. An informal evaluation of PhenoBlocks with expert clinicians suggested that the visualization effectively guides the process of differential diagnosis and could reinforce the importance of complete, granular phenotypic reporting.
10
Kannan L, Ramos M, Re A, El-Hachem N, Safikhani Z, Gendoo DM, Davis S, Gomez-Cabrero D, Castelo R, Hansen KD, Carey VJ, Morgan M, Culhane AC, Haibe-Kains B, Waldron L. Public data and open source tools for multi-assay genomic investigation of disease. Brief Bioinform 2015; 17:603-15. [PMID: 26463000 PMCID: PMC4945830 DOI: 10.1093/bib/bbv080] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Received: 05/12/2015] [Indexed: 01/07/2023]
Abstract
Molecular interrogation of a biological sample through DNA sequencing, RNA and microRNA profiling, proteomics and other assays, has the potential to provide a systems level approach to predicting treatment response and disease progression, and to developing precision therapies. Large publicly funded projects have generated extensive and freely available multi-assay data resources; however, bioinformatic and statistical methods for the analysis of such experiments are still nascent. We review multi-assay genomic data resources in the areas of clinical oncology, pharmacogenomics and other perturbation experiments, population genomics and regulatory genomics and other areas, and tools for data acquisition. Finally, we review bioinformatic tools that are explicitly geared toward integrative genomic data visualization and analysis. This review provides starting points for accessing publicly available data and tools to support development of needed integrative methods.
11
Griffith M, Walker JR, Spies NC, Ainscough BJ, Griffith OL. Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud. PLoS Comput Biol 2015; 11:e1004393. [PMID: 26248053 PMCID: PMC4527835 DOI: 10.1371/journal.pcbi.1004393] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Indexed: 11/19/2022]
Abstract
Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki.
Affiliation(s)
- Malachi Griffith
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Siteman Cancer Center, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Jason R. Walker
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Nicholas C. Spies
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Benjamin J. Ainscough
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Siteman Cancer Center, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Obi L. Griffith
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Siteman Cancer Center, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Department of Medicine, Washington University School of Medicine, St. Louis, Missouri, United States of America
12
Shi X, Peng J, Yu X, Zhang X, Li D, Liu B, Kong F, Yuan X. PopGeV: a web-based large-scale population genome browser. Bioinformatics 2015; 31:3048-50. [PMID: 26002882 DOI: 10.1093/bioinformatics/btv324] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/16/2015] [Accepted: 05/15/2015] [Indexed: 11/14/2022]
Abstract
MOTIVATION: The development of high-throughput sequencing technology has made it possible for more and more researchers to use population sequencing data to mine genes associated with specific traits. However, the massive amounts of sequencing data have also brought new challenges to researchers. The question of how to browse population genomic data in an easy and intuitive manner must be addressed. Web-based genome browsers allow users to conveniently view the results of genomic analyses, but heavy usage can reduce the response speed of the webpage, which limits their usefulness for displaying large-scale genome data. IndexedDB technology is a good solution to this problem; it is supported by web browsers and allows them to create local databases. In this way, data can be read from local storage, achieving a smooth display of population genomic data. RESULTS: PopGeV has the following characteristics. First, it uses a new encoding method for compression of population SNP and INDEL data. IndexedDB technology is used to download the results to local storage so that users can browse the results smoothly even when network traffic is heavy. Second, PopGeV identifies similar genomic regions between two individuals based on SNP data. Population diversity indexes are calculated when comparing two populations. Third, user-defined annotation information can be integrated for user-friendly mining of gene functions. Simulation shows that PopGeV can smoothly display analysis results for a population genome data set containing over 500 individuals and 2 million SNPs. AVAILABILITY AND IMPLEMENTATION: PopGeV is available at www.soyomics.com/popgev/ CONTACT: yuanxh@iga.ac.cn.
Affiliation(s)
- Xinyi Shi
  - The Key Lab of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China, University of Chinese Academy of Sciences, Beijing, China, School of Computer Science and Technology, Heilongjiang University, Harbin, China, College of Electronic and Information, Northeast Agricultural University, Harbin, China and School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Jing Peng
  - The Key Lab of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China, University of Chinese Academy of Sciences, Beijing, China, School of Computer Science and Technology, Heilongjiang University, Harbin, China, College of Electronic and Information, Northeast Agricultural University, Harbin, China and School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Xiaohan Yu
  - The Key Lab of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China, University of Chinese Academy of Sciences, Beijing, China, School of Computer Science and Technology, Heilongjiang University, Harbin, China, College of Electronic and Information, Northeast Agricultural University, Harbin, China and School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Xiaohong Zhang
  - The Key Lab of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China, University of Chinese Academy of Sciences, Beijing, China, School of Computer Science and Technology, Heilongjiang University, Harbin, China, College of Electronic and Information, Northeast Agricultural University, Harbin, China and School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Dongye Li
  - The Key Lab of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China, University of Chinese Academy of Sciences, Beijing, China, School of Computer Science and Technology, Heilongjiang University, Harbin, China, College of Electronic and Information, Northeast Agricultural University, Harbin, China and School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Baohui Liu
  - The Key Lab of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China, University of Chinese Academy of Sciences, Beijing, China, School of Computer Science and Technology, Heilongjiang University, Harbin, China, College of Electronic and Information, Northeast Agricultural University, Harbin, China and School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Fanjiang Kong
  - The Key Lab of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China, University of Chinese Academy of Sciences, Beijing, China, School of Computer Science and Technology, Heilongjiang University, Harbin, China, College of Electronic and Information, Northeast Agricultural University, Harbin, China and School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Xiaohui Yuan
  - The Key Lab of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China, University of Chinese Academy of Sciences, Beijing, China, School of Computer Science and Technology, Heilongjiang University, Harbin, China, College of Electronic and Information, Northeast Agricultural University, Harbin, China and School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
13
Bowdin SC, Hayeems RZ, Monfared N, Cohn RD, Meyn MS. The SickKids Genome Clinic: developing and evaluating a pediatric model for individualized genomic medicine. Clin Genet 2015; 89:10-9. [PMID: 25813238 DOI: 10.1111/cge.12579] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 11/29/2014] [Revised: 02/01/2015] [Accepted: 02/23/2015] [Indexed: 01/16/2023]
Abstract
Our increasing knowledge of how genomic variants affect human health and the falling costs of whole-genome sequencing are driving the development of individualized genomic medicine. This new clinical paradigm uses knowledge of an individual's genomic variants to anticipate, diagnose and manage disease. While individualized genetic medicine offers the promise of transformative change in health care, it forces us to reconsider existing ethical, scientific and clinical paradigms. The potential benefits of pre-symptomatic identification of at-risk individuals, improved diagnostics, individualized therapy, accurate prognosis and avoidance of adverse drug reactions coexist with the potential risks of uninterpretable results, psychological harm, outmoded counseling models and increased health care costs. Here we review the challenges, opportunities and limits of integrating genomic analysis into pediatric clinical practice and describe a model for implementing individualized genomic medicine. Our multidisciplinary team of bioinformaticians, health economists, health services and policy researchers, ethicists, geneticists, genetic counselors and clinicians has designed a 'Genome Clinic' research project that addresses multiple challenges in pediatric genomic medicine--ranging from development of bioinformatics tools for the clinical assessment of genomic variants and the discovery of disease genes to health policy inquiries, assessment of clinical care models, patient preference and the ethics of consent.
Affiliation(s)
- S C Bowdin
- Division of Clinical and Metabolic Genetics, Department of Paediatrics, The Hospital for Sick Children, Toronto, Canada.,Centre for Genetic Medicine, The Hospital for Sick Children, Toronto, Canada.,Department of Paediatrics, University of Toronto, Toronto, Canada
| | - R Z Hayeems
- Centre for Genetic Medicine, The Hospital for Sick Children, Toronto, Canada.,Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada.,Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Canada
| | - N Monfared
- Centre for Genetic Medicine, The Hospital for Sick Children, Toronto, Canada
| | - R D Cohn
- Division of Clinical and Metabolic Genetics, Department of Paediatrics, The Hospital for Sick Children, Toronto, Canada; Centre for Genetic Medicine, The Hospital for Sick Children, Toronto, Canada; Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Canada; Department of Paediatrics, University of Toronto, Toronto, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Canada
| | - M S Meyn
- Division of Clinical and Metabolic Genetics, Department of Paediatrics, The Hospital for Sick Children, Toronto, Canada; Centre for Genetic Medicine, The Hospital for Sick Children, Toronto, Canada; Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Canada; Department of Paediatrics, University of Toronto, Toronto, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Canada
| |
|
14
|
Juan L, Liu Y, Wang Y, Teng M, Zang T, Wang Y. Family genome browser: visualizing genomes with pedigree information. Bioinformatics 2015; 31:2262-8. [PMID: 25788626 DOI: 10.1093/bioinformatics/btv151] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/10/2014] [Accepted: 03/11/2015] [Indexed: 02/06/2023] Open
Abstract
MOTIVATION Families with inherited diseases are widely used in Mendelian/complex disease studies. Owing to advances in high-throughput sequencing technologies, family genome sequencing is becoming increasingly prevalent. Visualizing family genomes can greatly facilitate human genetics studies and personalized medicine. However, due to the complex genetic relationships and high similarities among genomes of consanguineous family members, family genomes are difficult to visualize in traditional genome visualization frameworks. How to visualize family genome variants and their functions with integrated pedigree information remains a critical challenge. RESULTS We developed the Family Genome Browser (FGB) to provide comprehensive analysis and visualization for family genomes. The FGB can visualize family genomes effectively at both the individual level and the variant level by integrating genome data with pedigree information. Family genome analysis, including determination of the parental origin of variants, detection of de novo mutations, identification of potential recombination events and identical-by-descent segments, etc., can be performed flexibly. Diverse annotations for the family genome variants, such as dbSNP memberships, linkage disequilibrium, genes, variant effects, potential phenotypes, etc., are illustrated as well. Moreover, the FGB can automatically search for de novo mutations and compound heterozygous variants for a selected individual, and guide investigators to high-risk genes with flexible navigation options. These features enable users to investigate and understand family genomes intuitively and systematically. AVAILABILITY AND IMPLEMENTATION The FGB is available at http://mlg.hit.edu.cn/FGB/.
Affiliation(s)
- Liran Juan
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Yongzhuang Liu
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Yongtian Wang
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Mingxiang Teng
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Tianyi Zang
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Yadong Wang
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| |
|
15
|
An J, Lai J, Wood DLA, Sajjanhar A, Wang C, Tevz G, Lehman ML, Nelson CC. RNASeqBrowser: a genome browser for simultaneous visualization of raw strand specific RNAseq reads and UCSC genome browser custom tracks. BMC Genomics 2015; 16:145. [PMID: 25766521 PMCID: PMC4355470 DOI: 10.1186/s12864-015-1346-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Accepted: 02/16/2015] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Strand specific RNAseq data is now more common in RNAseq projects. Visualizing RNAseq data has become an important matter in the analysis of sequencing data. The most widely used visualization tool is the UCSC genome browser, which introduced the custom track concept that enables researchers to simultaneously visualize gene expression at a particular locus from multiple experiments. The objective of our software tool is to provide a friendly interface for visualization of RNAseq datasets. RESULTS This paper introduces a visualization tool (RNASeqBrowser) that incorporates and extends the functionality of the UCSC genome browser. For example, RNASeqBrowser simultaneously displays read coverage, SNPs, InDels and raw read tracks with other BED and wiggle tracks, all being dynamically built from the BAM file. Paired reads are also connected in the browser to enable easier identification of novel exon/intron borders and chimaeric transcripts. Strand specific RNAseq data is also supported by RNASeqBrowser, which displays reads above (positive strand transcripts) or below (negative strand transcripts) a central line. Finally, RNASeqBrowser was designed for ease of use for users with few bioinformatic skills, and incorporates the features of many genome browsers into one platform. CONCLUSIONS The features of RNASeqBrowser: (1) RNASeqBrowser integrates UCSC genome browser and NGS visualization tools such as IGV. It extends the functionality of the UCSC genome browser by adding several new types of tracks to show NGS data such as individual raw reads, SNPs and InDels. (2) RNASeqBrowser can dynamically generate RNA secondary structure. It is useful for identifying non-coding RNA such as miRNA. (3) Overlaying NGS wiggle data is helpful in displaying differential expression and is simple to implement in RNASeqBrowser. (4) NGS data accumulates a lot of raw reads. Thus, RNASeqBrowser collapses exact duplicate reads to reduce visualization space. Normal PCs can show many windows of NGS individual raw reads without much delay. (5) Multiple popup windows of individual raw reads provide users with more viewing space. This avoids existing approaches (such as IGV) which squeeze all raw reads into one window. This will be helpful for visualizing multiple datasets simultaneously. RNASeqBrowser and its manual are freely available at http://www.australianprostatecentre.org/research/software/rnaseqbrowser or http://sourceforge.net/projects/rnaseqbrowser/.
Affiliation(s)
- Jiyuan An
- Australian Prostate Cancer Research Centre - Queensland, Institute of Health and Biomedical Innovation, Queensland University of Technology, Princess Alexandra Hospital, Translational Research Institute, Brisbane, 4102, Australia.
| | - John Lai
- Australian Prostate Cancer Research Centre - Queensland, Institute of Health and Biomedical Innovation, Queensland University of Technology, Princess Alexandra Hospital, Translational Research Institute, Brisbane, 4102, Australia.
| | - David L A Wood
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St. Lucia, 4072, Australia.
| | - Atul Sajjanhar
- School of Information Technology, Deakin University, 221 Burwood Highway, Burwood, VIC, 3125, Australia.
| | - Chenwei Wang
- Australian Prostate Cancer Research Centre - Queensland, Institute of Health and Biomedical Innovation, Queensland University of Technology, Princess Alexandra Hospital, Translational Research Institute, Brisbane, 4102, Australia.
| | - Gregor Tevz
- Australian Prostate Cancer Research Centre - Queensland, Institute of Health and Biomedical Innovation, Queensland University of Technology, Princess Alexandra Hospital, Translational Research Institute, Brisbane, 4102, Australia.
| | - Melanie L Lehman
- Australian Prostate Cancer Research Centre - Queensland, Institute of Health and Biomedical Innovation, Queensland University of Technology, Princess Alexandra Hospital, Translational Research Institute, Brisbane, 4102, Australia.
| | - Colleen C Nelson
- Australian Prostate Cancer Research Centre - Queensland, Institute of Health and Biomedical Innovation, Queensland University of Technology, Princess Alexandra Hospital, Translational Research Institute, Brisbane, 4102, Australia.
| |
|
16
|
Kildegaard KR, Hallström BM, Blicher TH, Sonnenschein N, Jensen NB, Sherstyk S, Harrison SJ, Maury J, Herrgård MJ, Juncker AS, Forster J, Nielsen J, Borodina I. Evolution reveals a glutathione-dependent mechanism of 3-hydroxypropionic acid tolerance. Metab Eng 2014; 26:57-66. [PMID: 25263954 DOI: 10.1016/j.ymben.2014.09.004] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Received: 05/05/2014] [Revised: 08/15/2014] [Accepted: 09/15/2014] [Indexed: 12/19/2022]
Abstract
Biologically produced 3-hydroxypropionic acid (3HP) is a potential source for sustainable acrylates and can also find direct use as a monomer in the production of biodegradable polymers. For industrial-scale production there is a need for robust cell factories tolerant to high concentrations of 3HP, preferably at low pH. Through adaptive laboratory evolution we selected S. cerevisiae strains with improved tolerance to 3HP at pH 3.5. Genome sequencing followed by functional analysis identified the causal mutation in the SFA1 gene encoding S-(hydroxymethyl)glutathione dehydrogenase. Based on our findings, we propose that 3HP toxicity is mediated by 3-hydroxypropionic aldehyde (reuterin) and that glutathione-dependent reactions are used for reuterin detoxification. The identified molecular response to 3HP and reuterin may well be a general mechanism for handling resistance to organic acids and aldehydes by living cells.
Affiliation(s)
- Kanchana R Kildegaard
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kogle Allé 6, DK-2970 Hørsholm, Denmark
| | - Björn M Hallström
- Science for Life Laboratory, KTH Royal Institute of Technology, Box 1031, SE-171 21 Solna, Sweden
| | - Thomas H Blicher
- The Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3b, DK-2200 Copenhagen, Denmark
| | - Nikolaus Sonnenschein
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kogle Allé 6, DK-2970 Hørsholm, Denmark
| | - Niels B Jensen
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kogle Allé 6, DK-2970 Hørsholm, Denmark
| | - Svetlana Sherstyk
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kogle Allé 6, DK-2970 Hørsholm, Denmark
| | - Scott J Harrison
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kogle Allé 6, DK-2970 Hørsholm, Denmark
| | - Jérôme Maury
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kogle Allé 6, DK-2970 Hørsholm, Denmark
| | - Markus J Herrgård
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kogle Allé 6, DK-2970 Hørsholm, Denmark
| | - Agnieszka S Juncker
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kogle Allé 6, DK-2970 Hørsholm, Denmark
| | - Jochen Forster
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kogle Allé 6, DK-2970 Hørsholm, Denmark
| | - Jens Nielsen
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kogle Allé 6, DK-2970 Hørsholm, Denmark; Department of Chemical and Biological Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96 Göteborg, Sweden
| | - Irina Borodina
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kogle Allé 6, DK-2970 Hørsholm, Denmark.
| |
|
17
|
First Complete Genome Sequence of Staphylococcus xylosus, a Meat Starter Culture and a Host to Propagate Staphylococcus aureus Phages. GENOME ANNOUNCEMENTS 2014; 2:2/4/e00671-14. [PMID: 25013142 PMCID: PMC4110768 DOI: 10.1128/genomea.00671-14] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 01/31/2023]
Abstract
Staphylococcus xylosus is a bacterial species used in meat fermentation and a commensal microorganism found on animals. We present the first complete circular genome from this species. The genome is composed of 2,757,557 bp, with a G+C content of 32.9%, and contains 2,514 genes and 79 structural RNAs.
|
18
|
Celiku O, Johnson S, Zhao S, Camphausen K, Shankavaram U. Visualizing molecular profiles of glioblastoma with GBM-BioDP. PLoS One 2014; 9:e101239. [PMID: 25010047 PMCID: PMC4091869 DOI: 10.1371/journal.pone.0101239] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 03/05/2014] [Accepted: 06/04/2014] [Indexed: 12/11/2022] Open
Abstract
Validation of clinical biomarkers and response to therapy is a challenging topic in cancer research. An important source of information for virtual validation is the datasets generated from multi-center cancer research projects such as The Cancer Genome Atlas project (TCGA). These data enable investigation of genetic and epigenetic changes responsible for cancer onset and progression, response to cancer therapies, and discovery of the molecular profiles of various cancers. However, these analyses often require bulk download of data and substantial bioinformatics expertise, which can be intimidating for investigators. Here, we report on the development of a new resource available to scientists: a data base called Glioblastoma Bio Discovery Portal (GBM-BioDP). GBM-BioDP is a free web-accessible resource that hosts a subset of the glioblastoma TCGA data and enables an intuitive query and interactive display of the resultant data. This resource provides visualization tools for the exploration of gene, miRNA, and protein expression, differential expression within the subtypes of GBM, and potential associations with clinical outcome, which are useful for virtual biological validation. The tool may also enable generation of hypotheses on how therapies impact GBM molecular profiles, which can help in personalization of treatment for optimal outcome. The resource can be accessed freely at http://gbm-biodp.nci.nih.gov (a tutorial is included).
Affiliation(s)
- Orieta Celiku
- Radiation Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Seth Johnson
- Radiation Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Shuping Zhao
- Radiation Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Kevin Camphausen
- Radiation Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Uma Shankavaram
- Radiation Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| |
|
19
|
Jäger G, Peltzer A, Nieselt K. inPHAP: interactive visualization of genotype and phased haplotype data. BMC Bioinformatics 2014; 15:200. [PMID: 25002076 PMCID: PMC4083868 DOI: 10.1186/1471-2105-15-200] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 04/22/2014] [Accepted: 06/10/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND To understand individual genomes it is necessary to look at the variations that lead to changes in phenotype and possibly to disease. However, genotype information alone is often not sufficient, and additional knowledge regarding the phase of the variation is needed to make correct interpretations. Interactive visualizations that allow the user to explore the data in various ways can be of great assistance in the process of making well-informed decisions. Currently, however, there is a lack of visualizations that are able to deal with phased haplotype data. RESULTS We present inPHAP, an interactive visualization tool for genotype and phased haplotype data. inPHAP features a variety of interaction possibilities such as zooming, sorting, filtering and aggregation of rows in order to explore patterns hidden in large genetic data sets. As a proof of concept, we apply inPHAP to the phased haplotype data set of Phase 1 of the 1000 Genomes Project. Thereby, inPHAP's ability to show genetic variations at the population level as well as at the individual level is demonstrated for several disease-related loci. CONCLUSIONS As of today, inPHAP is the only visual analytical tool that allows the user to explore unphased and phased haplotype data interactively. Due to its highly scalable design, inPHAP can be applied to large datasets with up to 100 GB of data, enabling users to visualize even large-scale input data. inPHAP closes the gap between common visualization tools for unphased genotype data and introduces several new features, such as the visualization of phased data. inPHAP is available for download at http://bit.ly/1iJgKmX.
Affiliation(s)
- Günter Jäger
- Integrative Transcriptomics, Center for Bioinformatics, University of Tübingen, Sand 14, 72076 Tübingen, Germany
| | - Alexander Peltzer
- Integrative Transcriptomics, Center for Bioinformatics, University of Tübingen, Sand 14, 72076 Tübingen, Germany
| | - Kay Nieselt
- Integrative Transcriptomics, Center for Bioinformatics, University of Tübingen, Sand 14, 72076 Tübingen, Germany
| |
|
20
|
Juan L, Teng M, Zang T, Hao Y, Wang Z, Yan C, Liu Y, Li J, Zhang T, Wang Y. The personal genome browser: visualizing functions of genetic variants. Nucleic Acids Res 2014; 42:W192-7. [PMID: 24799434 PMCID: PMC4086072 DOI: 10.1093/nar/gku361] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022] Open
Abstract
Advances in high-throughput sequencing technologies have brought us into the individual genome era. Projects such as the 1000 Genomes Project have made individual genome sequencing increasingly popular. How to visualize, analyse and annotate individual genomes with knowledge bases to support genome studies and personalized healthcare is still a big challenge. The Personal Genome Browser (PGB) was developed to provide comprehensive functional annotation and visualization for individual genomes based on the genetic-molecular-phenotypic model. Investigators can easily view individual genetic variants, such as single nucleotide variants (SNVs), INDELs and structural variations (SVs), as well as genomic features and phenotypes associated with the individual genetic variants. The PGB especially highlights potential functional variants using the PGB built-in method or SIFT/PolyPhen2 scores. Moreover, the functional risks of genes can be evaluated by scanning individual genetic variants on the whole genome, a chromosome, or a cytoband based on functional implications of the variants. Investigators can then navigate to high-risk genes on the scanned individual genome. The PGB accepts Variant Call Format (VCF) and Genetic Variation Format (GVF) files as input. The functional annotation of input individual genome variants can be visualized in real time by well-defined symbols and shapes. The PGB is available at http://www.pgbrowser.org/.
Affiliation(s)
- Liran Juan
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Mingxiang Teng
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Tianyi Zang
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Yafeng Hao
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Zhenxing Wang
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Chengwu Yan
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Yongzhuang Liu
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Jie Li
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Tianjiao Zhang
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Yadong Wang
- Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| |
|
21
|
Bowdin S, Ray PN, Cohn RD, Meyn MS. The Genome Clinic: A Multidisciplinary Approach to Assessing the Opportunities and Challenges of Integrating Genomic Analysis into Clinical Care. Hum Mutat 2014; 35:513-9. [DOI: 10.1002/humu.22536] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 09/27/2013] [Accepted: 02/21/2014] [Indexed: 11/09/2022]
Affiliation(s)
- Sarah Bowdin
- Division of Clinical and Metabolic Genetics; Department of Paediatrics; The Hospital for Sick Children; Toronto Ontario Canada
- The Centre for Genetic Medicine; The Hospital for Sick Children; Toronto Ontario Canada
- Department of Paediatrics; University of Toronto; Toronto Ontario Canada
| | - Peter N. Ray
- The Centre for Genetic Medicine; The Hospital for Sick Children; Toronto Ontario Canada
- Department of Paediatric Laboratory Medicine; The Hospital for Sick Children; Toronto Ontario Canada
- Program in Genetics and Genome Biology; The Hospital for Sick Children; Toronto Ontario Canada
- Department of Molecular Genetics; University of Toronto; Toronto Ontario Canada
| | - Ronald D. Cohn
- Division of Clinical and Metabolic Genetics; Department of Paediatrics; The Hospital for Sick Children; Toronto Ontario Canada
- The Centre for Genetic Medicine; The Hospital for Sick Children; Toronto Ontario Canada
- Department of Paediatrics; University of Toronto; Toronto Ontario Canada
- Program in Genetics and Genome Biology; The Hospital for Sick Children; Toronto Ontario Canada
- Department of Molecular Genetics; University of Toronto; Toronto Ontario Canada
| | - M. Stephen Meyn
- Division of Clinical and Metabolic Genetics; Department of Paediatrics; The Hospital for Sick Children; Toronto Ontario Canada
- The Centre for Genetic Medicine; The Hospital for Sick Children; Toronto Ontario Canada
- Department of Paediatrics; University of Toronto; Toronto Ontario Canada
- Program in Genetics and Genome Biology; The Hospital for Sick Children; Toronto Ontario Canada
- Department of Molecular Genetics; University of Toronto; Toronto Ontario Canada
| |
|
22
|
Genomic analysis of diffuse intrinsic pontine gliomas identifies three molecular subgroups and recurrent activating ACVR1 mutations. Nat Genet 2014; 46:451-6. [PMID: 24705254 PMCID: PMC3997489 DOI: 10.1038/ng.2936] [Citation(s) in RCA: 464] [Impact Index Per Article: 46.4] [Received: 04/24/2013] [Accepted: 03/05/2014] [Indexed: 12/19/2022]
Abstract
Diffuse Intrinsic Pontine Glioma (DIPG) is a fatal brain cancer that arises in the brainstem of children with no effective treatment and near 100% fatality. The failure of most therapies can be attributed to the delicate location of these tumors and choosing therapies based on assumptions that DIPGs are molecularly similar to adult disease. Recent studies have unraveled the unique genetic make-up of this brain cancer with nearly 80% harboring a K27M-H3.3 or K27M-H3.1 mutation. However, DIPGs are still thought of as one disease with limited understanding of the genetic drivers of these tumors. To understand what drives DIPGs we integrated whole-genome-sequencing with methylation, expression and copy-number profiling, discovering that DIPGs are three molecularly distinct subgroups (H3-K27M, Silent, MYCN) and uncovering a novel recurrent activating mutation in the activin receptor ACVR1, in 20% of DIPGs. Mutations in ACVR1 were constitutively activating, leading to SMAD phosphorylation and increased expression of downstream activin signaling targets ID1 and ID2. Our results highlight distinct molecular subgroups and novel therapeutic targets for this incurable pediatric cancer.
|
23
|
Nguyen QV, Nelmes G, Huang ML, Simoff S, Catchpoole D. Interactive Visualization for Patient-to-Patient Comparison. Genomics Inform 2014; 12:21-34. [PMID: 24748858 PMCID: PMC3990763 DOI: 10.5808/gi.2014.12.1.21] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 12/20/2013] [Revised: 02/19/2014] [Accepted: 02/20/2014] [Indexed: 12/20/2022] Open
Abstract
A visual analysis approach and the developed supporting technology provide a comprehensive solution for analyzing large and complex integrated genomic and biomedical data. This paper presents a methodology that is implemented as an interactive visual analysis technology for extracting knowledge from complex genetic and clinical data and then visualizing it in a meaningful and interpretable way. By synergizing the domain knowledge into development and analysis processes, we have developed a comprehensive tool that supports a seamless patient-to-patient analysis, from an overview of the patient population in the similarity space to the detailed views of genes. The system consists of multiple components enabling the complete analysis process, including data mining, interactive visualization, analytical views, and gene comparison. We demonstrate our approach with medical scientists on a case study of childhood cancer patients on how they use the tool to confirm existing hypotheses and to discover new scientific insights.
Affiliation(s)
- Quang Vinh Nguyen
- MARCS Institute & School of Computing, Engineering and Mathematics, University of Western Sydney, South Penrith DC, NSW 1979, Australia
| | - Guy Nelmes
- The Kids Research Institute, The Children's Hospital at Westmead, Westmead, NSW 2145, Australia
| | - Mao Lin Huang
- School of Software, Faculty of Engineering & IT, University of Technology, Sydney, NSW 2007, Australia
| | - Simeon Simoff
- MARCS Institute & School of Computing, Engineering and Mathematics, University of Western Sydney, South Penrith DC, NSW 1979, Australia
| | - Daniel Catchpoole
- The Kids Research Institute, The Children's Hospital at Westmead, Westmead, NSW 2145, Australia
| |
|
24
|
Mader M, Simon R, Kurtz S. FISH Oracle 2: a web server for integrative visualization of genomic data in cancer research. J Clin Bioinforma 2014; 4:5. [PMID: 24684958 PMCID: PMC4230720 DOI: 10.1186/2043-9113-4-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 12/23/2013] [Accepted: 03/26/2014] [Indexed: 01/21/2023] Open
Abstract
Background A comprehensive view of all relevant genomic data is instrumental for understanding the complex patterns of molecular alterations typically found in cancer cells. One of the most effective ways to rapidly obtain an overview of genomic alterations in large amounts of genomic data is the integrative visualization of genomic events. Results We developed FISH Oracle 2, a web server for the interactive visualization of different kinds of downstream processed genomics data typically available in cancer research. A powerful search interface and a fast visualization engine provide a highly interactive visualization for such data. High quality image export enables life scientists to easily communicate their results. Comprehensive data administration allows users to keep track of the available data sets. We applied FISH Oracle 2 to published data and found evidence that, in colorectal cancer cells, the gene TTC28 may be inactivated in two different ways, a fact that has not been published before. Conclusions The interactive nature of FISH Oracle 2 and the possibility to store, select and visualize large amounts of downstream processed data support life scientists in generating hypotheses. The export of high quality images supports explanatory data visualization, simplifying the communication of new biological findings. A FISH Oracle 2 demo server and the software are available at http://www.zbh.uni-hamburg.de/fishoracle.
Affiliation(s)
| | | | - Stefan Kurtz
- Center for Bioinformatics, University of Hamburg, Bundesstrasse 43, 20146 Hamburg, Germany.
| |
|
25
|
Nikolayeva O, Robinson MD. edgeR for differential RNA-seq and ChIP-seq analysis: an application to stem cell biology. Methods Mol Biol 2014; 1150:45-79. [PMID: 24743990 DOI: 10.1007/978-1-4939-0512-6_3] [Citation(s) in RCA: 160] [Impact Index Per Article: 16.0] [Indexed: 12/02/2022]
Abstract
The edgeR package, an R-based tool within the Bioconductor project, offers a flexible statistical framework for detection of changes in abundance based on counts. In this chapter, we illustrate the use of edgeR on a human embryonic stem cell dataset, in particular for RNA-seq and ChIP-seq data. We focus on a step-by-step statistical analysis of differential expression, going from raw data to a list of putative differentially expressed genes and give examples of integrative analysis using the ChIP-seq data. We emphasize data quality spot checks and the use of positive controls throughout the process and give practical recommendations for reproducible research.
Affiliation(s)
- Olga Nikolayeva
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, CH-8057, Zurich, Switzerland
| | | |
|
26
|
Jupe F, Witek K, Verweij W, Śliwka J, Pritchard L, Etherington GJ, Maclean D, Cock PJ, Leggett RM, Bryan GJ, Cardle L, Hein I, Jones JDG. Resistance gene enrichment sequencing (RenSeq) enables reannotation of the NB-LRR gene family from sequenced plant genomes and rapid mapping of resistance loci in segregating populations. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2013; 76:530-44. [PMID: 23937694 PMCID: PMC3935411 DOI: 10.1111/tpj.12307] [Citation(s) in RCA: 219] [Impact Index Per Article: 19.9] [Received: 05/22/2013] [Revised: 08/02/2013] [Accepted: 08/06/2013] [Indexed: 05/02/2023]
Abstract
RenSeq is an NB-LRR (nucleotide binding-site leucine-rich repeat) gene-targeted, Resistance gene enrichment and sequencing method that enables discovery and annotation of pathogen resistance gene family members in plant genome sequences. We successfully applied RenSeq to the sequenced potato Solanum tuberosum clone DM, and increased the number of identified NB-LRRs from 438 to 755. The majority of these identified R gene loci reside in poorly or previously unannotated regions of the genome. Sequence and positional details on the 12 chromosomes have been established for 704 NB-LRRs and can be accessed through a genome browser that we provide. We compared these NB-LRR genes and the corresponding oligonucleotide baits with the highest sequence similarity and demonstrated that ~80% sequence identity is sufficient for enrichment. Analysis of the sequenced tomato S. lycopersicum 'Heinz 1706' extended the NB-LRR complement to 394 loci. We further describe a methodology that applies RenSeq to rapidly identify molecular markers that co-segregate with a pathogen resistance trait of interest. In two independent segregating populations involving the wild Solanum species S. berthaultii (Rpi-ber2) and S. ruiz-ceballosii (Rpi-rzc1), we were able to apply RenSeq successfully to identify markers that co-segregate with resistance towards the late blight pathogen Phytophthora infestans. These SNP identification workflows were designed as easy-to-adapt Galaxy pipelines.
Affiliation(s)
- Florian Jupe
- The Sainsbury Laboratory, Norwich Research Park, NR4 7UH, Norwich, UK
- Kamil Witek
- The Sainsbury Laboratory, Norwich Research Park, NR4 7UH, Norwich, UK
- Walter Verweij
- The Sainsbury Laboratory, Norwich Research Park, NR4 7UH, Norwich, UK
- The Genome Analysis Centre, Norwich Research Park, NR4 7UH, Norwich, UK
- Jadwiga Śliwka
- The Plant Breeding and Acclimatization Institute, Research Center Młochów, Platanowa 19, 05-831, Młochów, Poland
- Leighton Pritchard
- Information and Computational Sciences, James Hutton Institute, DD2 5DA, Dundee, UK
- Dan Maclean
- The Sainsbury Laboratory, Norwich Research Park, NR4 7UH, Norwich, UK
- Peter J Cock
- Information and Computational Sciences, James Hutton Institute, DD2 5DA, Dundee, UK
- Richard M Leggett
- The Genome Analysis Centre, Norwich Research Park, NR4 7UH, Norwich, UK
- Glenn J Bryan
- Cell and Molecular Sciences, James Hutton Institute, DD2 5DA, Dundee, UK
- Linda Cardle
- Information and Computational Sciences, James Hutton Institute, DD2 5DA, Dundee, UK
- Ingo Hein
- Cell and Molecular Sciences, James Hutton Institute, DD2 5DA, Dundee, UK
- *For correspondence
- Jonathan DG Jones
- The Sainsbury Laboratory, Norwich Research Park, NR4 7UH, Norwich, UK
- *For correspondence
|
27
|
Lechat P, Souche E, Moszer I. SynTView - an interactive multi-view genome browser for next-generation comparative microorganism genomics. BMC Bioinformatics 2013; 14:277. [PMID: 24053737 PMCID: PMC3849071 DOI: 10.1186/1471-2105-14-277] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 03/25/2013] [Accepted: 09/16/2013] [Indexed: 12/31/2022] Open
Abstract
Background Dynamic visualisation interfaces are required to explore the multiple microbial genome data now available, especially those obtained by high-throughput sequencing — a.k.a. “Next-Generation Sequencing” (NGS) — technologies; they would also be useful for “standard” annotated genomes whose chromosome organizations may be compared. Although various software systems are available, few offer an optimal combination of feature-rich capabilities, non-static user interfaces and multi-genome data handling. Results We developed SynTView, a comparative and interactive viewer for microbial genomes, designed to run as either a web-based tool (Flash technology) or a desktop application (AIR environment). The basis of the program is a generic genome browser with sub-maps holding information about genomic objects (annotations). The software is characterised by the presentation of syntenic organisations of microbial genomes and the visualisation of polymorphism data (typically Single Nucleotide Polymorphisms — SNPs) along these genomes; these features are accessible to the user in an integrated way. A variety of specialised views are available and are all dynamically inter-connected (including linear and circular multi-genome representations, dot plots, phylogenetic profiles, SNP density maps, and more). SynTView is not linked to any particular database, allowing the user to plug his own data into the system seamlessly, and use external web services for added functionalities. SynTView has now been used in several genome sequencing projects to help biologists make sense out of huge data sets. Conclusions The most important assets of SynTView are: (i) the interactivity due to the Flash technology; (ii) the capabilities for dynamic interaction between many specialised views; and (iii) the flexibility allowing various user data sets to be integrated. It can thus be used to investigate massive amounts of information efficiently at the chromosome level. 
This innovative approach to data exploration could not be achieved with most existing genome browsers, which are more static and/or do not offer multiple views of multiple genomes. Documentation, tutorials and demonstration sites are available at the URL: http://genopole.pasteur.fr/SynTView.
Affiliation(s)
- Pierre Lechat
- Institut Pasteur, Plate-forme Bioanalyse Génomique, 28 rue du Docteur Roux, Paris, Cedex 15 75724, France.
|
28
|
Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nat Protoc 2013; 8:1765-86. [PMID: 23975260 DOI: 10.1038/nprot.2013.099] [Citation(s) in RCA: 819] [Impact Index Per Article: 74.5] [Indexed: 01/16/2023]
Abstract
RNA sequencing (RNA-seq) has been rapidly adopted for the profiling of transcriptomes in many areas of biology, including studies into gene regulation, development and disease. Of particular interest is the discovery of differentially expressed genes across different conditions (e.g., tissues, perturbations) while optionally adjusting for other systematic factors that affect the data-collection process. There are a number of subtle yet crucial aspects of these analyses, such as read counting, appropriate treatment of biological variability, quality control checks and appropriate setup of statistical modeling. Several variations have been presented in the literature, and there is a need for guidance on current best practices. This protocol presents a state-of-the-art computational and statistical RNA-seq differential expression analysis workflow largely based on the free open-source R language and Bioconductor software and, in particular, on two widely used tools, DESeq and edgeR. Hands-on time for typical small experiments (e.g., 4-10 samples) can be <1 h, with computation time <1 d using a standard desktop PC.
|
29
|
Goecks J, Eberhard C, Too T, Nekrutenko A, Taylor J. Web-based visual analysis for high-throughput genomics. BMC Genomics 2013; 14:397. [PMID: 23758618 PMCID: PMC3691752 DOI: 10.1186/1471-2164-14-397] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 01/21/2013] [Accepted: 05/31/2013] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND Visualization plays an essential role in genomics research by making it possible to observe correlations and trends in large datasets as well as communicate findings to others. Visual analysis, which combines visualization with analysis tools to enable seamless use of both approaches for scientific investigation, offers a powerful method for performing complex genomic analyses. However, there are numerous challenges that arise when creating rich, interactive Web-based visualizations/visual analysis applications for high-throughput genomics. These challenges include managing data flow from Web server to Web browser, integrating analysis tools and visualizations, and sharing visualizations with colleagues. RESULTS We have created a platform that simplifies the creation of Web-based visualization/visual analysis applications for high-throughput genomics. This platform provides components that make it simple to efficiently query very large datasets, draw common representations of genomic data, integrate with analysis tools, and share or publish fully interactive visualizations. Using this platform, we have created a Circos-style genome-wide viewer, a generic scatter plot for correlation analysis, an interactive phylogenetic tree, a scalable genome browser for next-generation sequencing data, and an application for systematically exploring tool parameter spaces to find good parameter values. All visualizations are interactive and fully customizable. The platform is integrated with the Galaxy (http://galaxyproject.org) genomics workbench, making it easy to integrate new visual applications into Galaxy. CONCLUSIONS Visualization and visual analysis play an important role in high-throughput genomics experiments, and approaches are needed to make it easier to create applications for these activities. Our framework provides a foundation for creating Web-based visualizations and integrating them into Galaxy. Finally, the visualizations we have created using the framework are useful tools for high-throughput genomics experiments.
Affiliation(s)
- Jeremy Goecks
- Department of Biology, Emory University, 1510 Clifton Road NE, Atlanta, GA 30322, USA
|
30
|
Valsesia A, Macé A, Jacquemont S, Beckmann JS, Kutalik Z. The Growing Importance of CNVs: New Insights for Detection and Clinical Interpretation. Front Genet 2013; 4:92. [PMID: 23750167 PMCID: PMC3667386 DOI: 10.3389/fgene.2013.00092] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Received: 02/10/2013] [Accepted: 05/04/2013] [Indexed: 02/03/2023] Open
Abstract
Differences between genomes can be due to single nucleotide variants, translocations, inversions, and copy number variants (CNVs, gain or loss of DNA). The latter can range from sub-microscopic events to complete chromosomal aneuploidies. Small CNVs are often benign but those larger than 500 kb are strongly associated with morbid consequences such as developmental disorders and cancer. Detecting CNVs within and between populations is essential to better understand the plasticity of our genome and to elucidate its possible contribution to disease. Hence there is a need for better-tailored and more robust tools for the detection and genome-wide analyses of CNVs. While a link between a given CNV and a disease may have often been established, the relative CNV contribution to disease progression and impact on drug response is not necessarily understood. In this review we discuss the progress, challenges, and limitations that occur at different stages of CNV analysis from the detection (using DNA microarrays and next-generation sequencing) and identification of recurrent CNVs to the association with phenotypes. We emphasize the importance of germline CNVs and propose strategies to aid clinicians to better interpret structural variations and assess their clinical implications.
Affiliation(s)
- Armand Valsesia
- Genetics Core, Nestlé Institute of Health Sciences Lausanne, Switzerland
|
31
|
Oberlin AT, Jurkovic DA, Balish MF, Friedberg I. Biological database of images and genomes: tools for community annotations linking image and genomic information. Database (Oxford) 2013; 2013:bat016. [PMID: 23550062 PMCID: PMC3708683 DOI: 10.1093/database/bat016] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/24/2023]
Abstract
Genomic data and biomedical imaging data are undergoing exponential growth. However, our understanding of the phenotype-genotype connection linking the two types of data is lagging behind. While there are many types of software that enable the manipulation and analysis of image data and genomic data as separate entities, there is no framework established for linking the two. We present a generic set of software tools, BioDIG, that allows linking of image data to genomic data. BioDIG tools can be applied to a wide range of research problems that require linking images to genomes. BioDIG features the following: rapid construction of web-based workbenches, community-based annotation, user management and web services. By using BioDIG to create websites, researchers and curators can rapidly annotate a large number of images with genomic information. Here we present the BioDIG software tools that include an image module, a genome module and a user management module. We also introduce a BioDIG-based website, MyDIG, which is being used to annotate images of mycoplasmas.
Affiliation(s)
- Andrew T Oberlin
- Department of Computer Science and Software Engineering, Miami University, Oxford, OH 45056, USA
|
32
|
Schroeder MP, Gonzalez-Perez A, Lopez-Bigas N. Visualizing multidimensional cancer genomics data. Genome Med 2013; 5:9. [PMID: 23363777 PMCID: PMC3706894 DOI: 10.1186/gm413] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Indexed: 12/19/2022] Open
Abstract
Cancer genomics projects employ high-throughput technologies to identify the complete catalog of somatic alterations that characterize the genome, transcriptome and epigenome of cohorts of tumor samples. Examples include projects carried out by the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). A crucial step in the extraction of knowledge from the data is the exploration by experts of the different alterations, as well as the multiple relationships between them. To that end, the use of intuitive visualization tools that can integrate different types of alterations with clinical data is essential to the field of cancer genomics. Here, we review effective and common visualization techniques for exploring oncogenomics data and discuss a selection of tools that allow researchers to effectively visualize multidimensional oncogenomics datasets. The review covers visualization methods employed by tools such as Circos, Gitools, the Integrative Genomics Viewer, Cytoscape, Savant Genome Browser, StratomeX and platforms such as cBio Cancer Genomics Portal, IntOGen, the UCSC Cancer Genomics Browser, the Regulome Explorer and the Cancer Genome Workbench.
Affiliation(s)
- Michael P Schroeder
- Research Program on Biomedical Informatics - GRIB, Universitat Pompeu Fabra (UPF), Parc de Recerca Biomèdica de Barcelona (PRBB), Dr. Aiguader 88, E-08003 Barcelona, Spain
- Abel Gonzalez-Perez
- Research Program on Biomedical Informatics - GRIB, Universitat Pompeu Fabra (UPF), Parc de Recerca Biomèdica de Barcelona (PRBB), Dr. Aiguader 88, E-08003 Barcelona, Spain
- Nuria Lopez-Bigas
- Research Program on Biomedical Informatics - GRIB, Universitat Pompeu Fabra (UPF), Parc de Recerca Biomèdica de Barcelona (PRBB), Dr. Aiguader 88, E-08003 Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
|
33
|
Mezlini AM, Smith EJM, Fiume M, Buske O, Savich GL, Shah S, Aparicio S, Chiang DY, Goldenberg A, Brudno M. iReckon: simultaneous isoform discovery and abundance estimation from RNA-seq data. Genome Res 2012. [PMID: 23204306 PMCID: PMC3589540 DOI: 10.1101/gr.142232.112] [Citation(s) in RCA: 98] [Impact Index Per Article: 8.2] [Indexed: 11/24/2022]
Abstract
High-throughput RNA sequencing (RNA-seq) promises to revolutionize our understanding of genes and their role in human disease by characterizing the RNA content of tissues and cells. The realization of this promise, however, is conditional on the development of effective computational methods for the identification and quantification of transcripts from incomplete and noisy data. In this article, we introduce iReckon, a method for simultaneous determination of the isoforms and estimation of their abundances. Our probabilistic approach incorporates multiple biological and technical phenomena, including novel isoforms, intron retention, unspliced pre-mRNA, PCR amplification biases, and multimapped reads. iReckon utilizes regularized expectation-maximization to accurately estimate the abundances of known and novel isoforms. Our results on simulated and real data demonstrate a superior ability to discover novel isoforms with a significantly reduced number of false-positive predictions, and our abundance accuracy prediction outmatches that of other state-of-the-art tools. Furthermore, we have applied iReckon to two cancer transcriptome data sets, a triple-negative breast cancer patient sample and the MCF7 breast cancer cell line, and show that iReckon is able to reconstruct the complex splicing changes that were not previously identified. QT-PCR validations of the isoforms detected in the MCF7 cell line confirmed all of iReckon's predictions and also showed strong agreement (r2 = 0.94) with the predicted abundances.
Affiliation(s)
- Aziz M Mezlini
- Department of Computer Science, University of Toronto, Ontario M5S 2E4, Canada
|
34
|
Jiang Y, Wang Y, Brudno M. PRISM: pair-read informed split-read mapping for base-pair level detection of insertion, deletion and structural variants. Bioinformatics 2012; 28:2576-83. [PMID: 22851530 DOI: 10.1093/bioinformatics/bts484] [Citation(s) in RCA: 84] [Impact Index Per Article: 7.0] [Indexed: 11/14/2022]
Abstract
MOTIVATION The development of high-throughput sequencing technologies has enabled novel methods for detecting structural variants (SVs). Current methods are typically based on depth of coverage or pair-end mapping clusters. However, most of these only report an approximate location for each SV, rather than exact breakpoints. RESULTS We have developed pair-read informed split mapping (PRISM), a method that identifies SVs and their precise breakpoints from whole-genome resequencing data. PRISM uses a split-alignment approach informed by the mapping of paired-end reads, hence enabling breakpoint identification of multiple SV types, including arbitrary-sized inversions, deletions and tandem duplications. Comparisons to previous datasets and simulation experiments illustrate PRISM's high sensitivity, while PCR validations of PRISM results, including previously uncharacterized variants, indicate an overall precision of ~90%. AVAILABILITY PRISM is freely available at http://compbio.cs.toronto.edu/prism.
Affiliation(s)
- Yue Jiang
- Center for Biomedical Informatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China.
|