1. Fatima N, Saif Ur Rahman M, Qasim M, Ali Ashfaq U, Ahmed U, Masoud MS. Transcriptional Factors Mediated Reprogramming to Pluripotency. Curr Stem Cell Res Ther 2024; 19:367-388. [PMID: 37073151 DOI: 10.2174/1574888x18666230417084518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2022] [Revised: 02/01/2023] [Accepted: 02/06/2023] [Indexed: 04/20/2023]
Abstract
Induced pluripotent stem cells (iPSCs), a unique kind of pluripotent cell, are produced by reprogramming differentiated animal and human cells (with no change in genetic makeup, for the sake of high-efficacy iPSC formation). The conversion of specific cells to iPSCs has revolutionized stem cell research by making pluripotent cells more controllable for regenerative therapy. For the past 15 years, somatic cell reprogramming to pluripotency with forced expression of specified factors has been a fascinating field of biomedical study. The primary reprogramming method requires host cells and a cocktail of four transcription factors (TFs): Kruppel-like factor 4 (KLF4), octamer-binding protein 3/4 (OCT3/4), MYC and SOX2 (together referred to as OSKM). iPSCs have great potential for future tissue replacement treatments because of their ability to self-renew and differentiate into all adult cell types, although factor-mediated reprogramming mechanisms are still poorly understood medically. This technique has dramatically improved in performance and efficiency, making it more useful in drug discovery, disease modeling, and regenerative medicine. Moreover, more than 30 reprogramming combinations have been proposed for these four-TF cocktails, but only a few have been demonstrated to be effective in reprogramming human and mouse somatic cells. Stoichiometry, a combination of reprogramming agents and chromatin remodeling compounds, impacts kinetics, quality, and efficiency in stem cell research.
Affiliation(s)
- Nazira Fatima
  - Laboratory Animal Center, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi, 710061, China
- Muhammad Saif Ur Rahman
  - Institute of Advanced Studies, Shenzhen University, Shenzhen, 518060, China
  - Key Laboratory of Optoelectronic Devices and Systems of Ministry of Education and Guangdong Province, College of Physics and Optoelectronic Engineering, Shenzhen University, Shenzhen, 518060, China
- Muhammad Qasim
  - Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Faisalabad, 38000, Pakistan
- Usman Ali Ashfaq
  - Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Faisalabad, 38000, Pakistan
- Uzair Ahmed
  - EMBL Partnership Institute for Genome Editing Technologies, Vilnius University, Vilnius, 10257, Lithuania
- Muhammad Shareef Masoud
  - Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Faisalabad, 38000, Pakistan
2. Saarimäki LA, del Giudice G, Greco D. Expanding adverse outcome pathways towards one health models for nanosafety. Frontiers in Toxicology 2023; 5:1176745. [PMID: 37692900 PMCID: PMC10485555 DOI: 10.3389/ftox.2023.1176745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Accepted: 08/15/2023] [Indexed: 09/12/2023] Open
Abstract
The ever-growing production of nano-enabled products has generated the need for dedicated risk assessment strategies that ensure safety for humans and the environment. Transdisciplinary approaches are needed to support the development of new technologies while respecting environmental limits, as also highlighted by the EU Green Deal Chemicals Strategy for Sustainability and its safe and sustainable by design (SSbD) framework. The One Health concept offers a holistic multiscale approach for the assessment of nanosafety. However, toxicology is not yet capable of explaining the interaction between chemicals and biological systems at the multiscale level and in the context of the One Health framework. Furthermore, there is a disconnect between chemical safety assessment, epidemiology, and other fields of biology that, if unified, would enable the adoption of the One Health model. The development of mechanistic toxicology and the generation of omics data has provided important biological knowledge of the response of individual biological systems to nanomaterials (NMs). On the other hand, epigenetic data have the potential to inform on interspecies mechanisms of adaptation. These data types, however, need to be linked to concepts that support their intuitive interpretation. Adverse Outcome Pathways (AOPs) represent an evolving framework to anchor existing knowledge to chemical risk assessment. In this perspective, we discuss the possibility of integrating multi-level toxicogenomics data, including toxicoepigenetic insights, into the AOP framework. We anticipate that this new direction of toxicogenomics can support the development of One Health models applicable to groups of chemicals and to multiple species in the tree of life.
Affiliation(s)
- Laura Aliisa Saarimäki
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, Helsinki, Finland
- Giusy del Giudice
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, Helsinki, Finland
- Dario Greco
  - Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
  - Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, Helsinki, Finland
  - Institute of Biotechnology, University of Helsinki, Helsinki, Finland
3. Smith DA, Sadler MC, Altman RB. Promises and challenges in pharmacoepigenetics. Cambridge Prisms: Precision Medicine 2023; 1:e18. [PMID: 37560024 PMCID: PMC10406571 DOI: 10.1017/pcm.2023.6] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/16/2022] [Revised: 01/27/2023] [Accepted: 01/31/2023] [Indexed: 08/11/2023]
Abstract
Pharmacogenetics, the study of how interindividual genetic differences affect drug response, does not explain all observed heritable variance in drug response. Epigenetic mechanisms, such as DNA methylation and histone acetylation, may account for some of the unexplained variance. Epigenetic mechanisms modulate gene expression, can be suitable drug targets, and can impact the action of nonepigenetic drugs. Pharmacoepigenetics is the field that studies the relationship between epigenetic variability and drug response. Much of this research focuses on compounds targeting epigenetic mechanisms, called epigenetic drugs, which are used to treat cancers, immune disorders, and other diseases. Several studies also suggest an epigenetic role in classical drug response; however, we know little about this area. The amount of information correlating epigenetic biomarkers to molecular datasets has recently expanded due to technological advances, and novel computational approaches have emerged to better identify and predict epigenetic interactions. We propose that the relationship between epigenetics and classical drug response may be examined using data already available by (1) finding regions of epigenetic variance, (2) pinpointing key epigenetic biomarkers within these regions, and (3) mapping these biomarkers to a drug-response phenotype. This approach expands on existing knowledge to generate putative pharmacoepigenetic relationships, which can be tested experimentally. Epigenetic modifications are involved in disease and drug response. Therefore, understanding how epigenetic drivers impact the response to classical drugs is important for improving drug design and administration to better treat disease.
Affiliation(s)
- Delaney A Smith
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
- Marie C Sadler
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
  - University Center for Primary Care and Public Health, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Russ B Altman
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
4. Chenarani N, Emamjomeh A, Allahverdi A, Mirmostafa S, Afsharinia MH, Zahiri J. Bioinformatic tools for DNA methylation and histone modification: A survey. Genomics 2021; 113:1098-1113. [PMID: 33677056 DOI: 10.1016/j.ygeno.2021.03.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/22/2020] [Revised: 10/10/2020] [Accepted: 03/02/2021] [Indexed: 01/19/2023]
Abstract
Epigenetic inheritance occurs due to different mechanisms such as chromatin and histone modifications, DNA methylation and processes mediated by non-coding RNAs. It leads to changes in gene expression and the emergence of new traits in different organisms and in many diseases such as cancer. Recent advances in experimental methods have led to the identification of epigenetic target sites in various organisms. Computational approaches have enabled us to analyze the mass data produced by these methods. Next-generation sequencing (NGS) methods have been broadly used to identify these target sites and their patterns. By using these patterns, the emergence of diseases could be prognosticated. In this study, target site prediction tools for two major epigenetic mechanisms, histone modification and DNA methylation, are reviewed. Publicly accessible databases are reviewed as well. Some suggestions regarding the state-of-the-art methods and databases have been made, including examining patterns of epigenetic changes that are important in epigenotype detection.
Affiliation(s)
- Nasibeh Chenarani
  - Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
- Abbasali Emamjomeh
  - Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
  - Laboratory of Computational Biotechnology and Bioinformatics (CBB), Department of Bioinformatics, Faculty of Basic Sciences, University of Zabol, Zabol, Iran
- Abdollah Allahverdi
  - Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- SeyedAli Mirmostafa
  - Bioinformatics and Computational Omics Lab (BioCOOL), Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Mohammad Hossein Afsharinia
  - Bioinformatics and Computational Omics Lab (BioCOOL), Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Javad Zahiri
  - Bioinformatics and Computational Omics Lab (BioCOOL), Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
  - Department of Neuroscience, University of California, San Diego, USA
5. Liu ZP. Towards precise reconstruction of gene regulatory networks by data integration. Quantitative Biology 2018. [DOI: 10.1007/s40484-018-0139-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
6. Leonard D, Svenungsson E, Dahlqvist J, Alexsson A, Ärlestig L, Taylor KE, Sandling JK, Bengtsson C, Frodlund M, Jönsen A, Eketjäll S, Jensen-Urstad K, Gunnarsson I, Sjöwall C, Bengtsson AA, Eloranta ML, Syvänen AC, Rantapää-Dahlqvist S, Criswell LA, Rönnblom L. Novel gene variants associated with cardiovascular disease in systemic lupus erythematosus and rheumatoid arthritis. Ann Rheum Dis 2018. [PMID: 29514802 PMCID: PMC6029634 DOI: 10.1136/annrheumdis-2017-212614] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Indexed: 11/26/2022]
Abstract
Objectives: Patients with systemic lupus erythematosus (SLE) and rheumatoid arthritis (RA) have increased risk of cardiovascular disease (CVD). We investigated whether single nucleotide polymorphisms (SNPs) at autoimmunity risk loci were associated with CVD in SLE and RA.
Methods: Patients with SLE (n=1045) were genotyped using the 200K Immunochip SNP array (Illumina). The allele frequency was compared between patients with and without different manifestations of CVD. Results were replicated in a second SLE cohort (n=1043) and in an RA cohort (n=824). We analysed publicly available genetic data from general population, performed electrophoretic mobility shift assays and measured cytokine levels and occurrence of antiphospholipid antibodies (aPLs).
Results: We identified two new putative risk loci associated with increased risk for CVD in two SLE populations, which remained after adjustment for traditional CVD risk factors. An IL19 risk allele, rs17581834(T) was associated with stroke/myocardial infarction (MI) in SLE (OR 2.3 (1.5 to 3.4), P=8.5×10⁻⁵) and RA (OR 2.8 (1.4 to 5.6), P=3.8×10⁻³), meta-analysis (OR 2.5 (2.0 to 2.9), P=3.5×10⁻⁷), but not in population controls. The IL19 risk allele affected protein binding, and SLE patients with the risk allele had increased levels of plasma-IL10 (P=0.004) and aPL (P=0.01). An SRP54-AS1 risk allele, rs799454(G) was associated with stroke/transient ischaemic attack in SLE (OR 1.7 (1.3 to 2.2), P=2.5×10⁻⁵) but not in RA. The SRP54-AS1 risk allele is an expression quantitative trait locus for four genes.
Conclusions: The IL19 risk allele was associated with stroke/MI in SLE and RA, but not in the general population, indicating that shared immune pathways may be involved in the CVD pathogenesis in inflammatory rheumatic diseases.
Affiliation(s)
- Dag Leonard
  - Department of Medical Sciences, Science for Life Laboratory, Rheumatology, Uppsala University, Uppsala, Sweden
- Elisabet Svenungsson
  - Department of Medicine, Rheumatology Unit, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden
- Johanna Dahlqvist
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Andrei Alexsson
  - Department of Medical Sciences, Science for Life Laboratory, Rheumatology, Uppsala University, Uppsala, Sweden
- Lisbeth Ärlestig
  - Department of Public Health and Clinical Medicine/Rheumatology, Umeå University, Umeå, Sweden
- Kimberly E Taylor
  - University of California, San Francisco, Rosalind Russell/Ephraim P. Engleman Rheumatology Research Center, San Francisco, California, USA
- Johanna K Sandling
  - Department of Medical Sciences, Science for Life Laboratory, Rheumatology, Uppsala University, Uppsala, Sweden
- Christine Bengtsson
  - Department of Public Health and Clinical Medicine/Rheumatology, Umeå University, Umeå, Sweden
- Martina Frodlund
  - Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
- Andreas Jönsen
  - Department of Rheumatology, Skåne University Hospital, Lund, Sweden
- Susanna Eketjäll
  - Cardiovascular and Metabolic Diseases, Innovative Medicines and Early Development Biotech Unit, AstraZeneca, Integrated Cardio Metabolic Centre, Karolinska Institutet, Stockholm, Sweden
- Kerstin Jensen-Urstad
  - Department of Clinical Physiology, Södersjukhuset, Karolinska Institutet, Stockholm, Sweden
- Iva Gunnarsson
  - Department of Medicine, Rheumatology Unit, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden
- Christopher Sjöwall
  - Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
- Maija-Leena Eloranta
  - Department of Medical Sciences, Science for Life Laboratory, Rheumatology, Uppsala University, Uppsala, Sweden
- Ann-Christine Syvänen
  - Department of Medical Sciences, Science for Life Laboratory, Molecular Medicine, Uppsala University, Uppsala, Sweden
- Lindsey A Criswell
  - University of California, San Francisco, Rosalind Russell/Ephraim P. Engleman Rheumatology Research Center, San Francisco, California, USA
- Lars Rönnblom
  - Department of Medical Sciences, Science for Life Laboratory, Rheumatology, Uppsala University, Uppsala, Sweden
7. Meng F, Yuan G, Zhu X, Zhou Y, Wang D, Guo Y. Functional Variants Identified Efficiently through an Integrated Transcriptome and Epigenome Analysis. Sci Rep 2018; 8:2959. [PMID: 29440655 PMCID: PMC5811556 DOI: 10.1038/s41598-018-21024-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/06/2017] [Accepted: 01/29/2018] [Indexed: 12/25/2022] Open
Abstract
Although genome-wide association studies (GWAS) have identified numerous genetic loci associated with complex diseases, the underlying molecular mechanisms of how these loci contribute to disease pathogenesis remain largely unknown, due to the lack of an efficient strategy to identify these risk variants. Here, we proposed a new strategy termed integrated transcriptome and epigenome analysis (iTEA) to identify functional genetic variants in non-coding elements. We considered type 2 diabetes mellitus as a model and identified a well-known diabetic risk variant rs35767 using iTEA. Furthermore, we discovered a new functional SNP, rs815815, involved in glucose metabolism. Our study provides an approach to directly and quickly identify functional genetic variants in type 2 diabetes mellitus, and this approach can be extended to study other complex diseases.
Affiliation(s)
- Fanlin Meng
  - School of Medicine, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Tsinghua University, Beijing, 100084, China
- Guohong Yuan
  - Human Genetic Resource Center, National Research Institute for Health and Family Planning, Beijing, 100081, China
- Xiurui Zhu
  - School of Medicine, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Tsinghua University, Beijing, 100084, China
- Yiming Zhou
  - National Engineering Research Center for Beijing Biochip Technology, Beijing, 102206, China
- Dong Wang
  - Department of Basic Medicine, School of Medicine, Tsinghua University, Beijing, 100084, China
- Yong Guo
  - School of Medicine, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Tsinghua University, Beijing, 100084, China
8. Banerjee A, Roychoudhury A. The gymnastics of epigenomics in rice. Plant Cell Reports 2018; 37:25-49. [PMID: 28866772 DOI: 10.1007/s00299-017-2192-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 06/12/2017] [Accepted: 08/01/2017] [Indexed: 05/21/2023]
Abstract
Epigenomics is represented by the high-throughput investigations of genome-wide epigenetic alterations, which ultimately dictate genomic, transcriptomic, proteomic and metabolomic dynamism. Rice has been accepted as the global staple crop. As a result, this model crop deserves significant attention in the rapidly emerging field of plant epigenomics. A large amount of recently available data reveals the immense flexibility and potential of variable epigenomic landscapes. Such epigenomic impacts and variability are determined by a number of epigenetic regulators and several crucial inheritable epialleles, respectively. This article highlights the correlation of the epigenomic landscape with growth, flowering, reproduction, non-coding RNA-mediated post-transcriptional regulation, transposon mobility and even heterosis in rice. We have also discussed the drastic epigenetic alterations which are reported in rice plants grown from seeds exposed to the extraterrestrial environment. Such abiotic conditions impose stress on the plants, leading to epigenomic modifications in a genotype-specific manner. Some significant bioinformatic databases and in silico approaches have also been explained in this article. These software tools provide important interfaces for comparative epigenomics. The discussion concludes with a unified goal of developing epigenome editing to promote biological hacking of the rice epigenome. Such a cutting-edge technology, if properly standardized, can integrate genomics and epigenomics together with the generation of high-yielding traits in several cultivars of rice.
Affiliation(s)
- Aditya Banerjee
  - Department of Biotechnology, St. Xavier's College (Autonomous), 30, Mother Teresa Sarani, Kolkata, 700016, West Bengal, India
- Aryadeep Roychoudhury
  - Department of Biotechnology, St. Xavier's College (Autonomous), 30, Mother Teresa Sarani, Kolkata, 700016, West Bengal, India
9. Dhiman VK, Bolt MJ, White KP. Nuclear receptors in cancer – uncovering new and evolving roles through genomic analysis. Nat Rev Genet 2017; 19:160-174. [DOI: 10.1038/nrg.2017.102] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Indexed: 01/01/2023]
10. Lee PJ, Choudhary MNK, Wang T. Online resources for studies of genome biology and epigenetics. Current Opinion in Toxicology 2017. [DOI: 10.1016/j.cotox.2017.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
11. Gan Y, Tao H, Guan J, Zhou S. iHMS: a database integrating human histone modification data across developmental stages and tissues. BMC Bioinformatics 2017; 18:103. [PMID: 28187703 PMCID: PMC5303264 DOI: 10.1186/s12859-017-1461-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/09/2016] [Accepted: 01/03/2017] [Indexed: 11/17/2022] Open
Abstract
Background: Differences in chromatin states are critical to the multiplicity of cell states. Recently, genome-wide histone modification maps of diverse human developmental stages and tissues have been charted.
Description: To facilitate the investigation of epigenetic dynamics and regulatory mechanisms in cellular differentiation processes, we developed iHMS, an integrated human histone modification database that incorporates massive histone modification maps spanning different developmental stages, lineages and tissues (http://www.tongjidmb.com/human/index.html). It also includes genome-wide expression data of different conditions, reference gene annotations, GC content and CpG island information. By providing an intuitive and user-friendly query interface, iHMS enables comprehensive query and comparative analysis based on gene names, genomic region locations, histone modification marks and cell types. Moreover, it offers an efficient browser that allows users to visualize and compare multiple genome-wide histone modification maps and related expression profiles across different developmental stages and tissues.
Conclusion: iHMS helps in understanding how global histone modification state transitions impact cellular phenotypes across different developmental stages and tissues in the human genome. This extensive catalog of histone modification states thus presents an important resource for epigenetic and developmental studies.
Affiliation(s)
- Yanglan Gan
  - School of Computer Science and Technology, Donghua University, Shanghai, China
- Han Tao
  - Department of Computer Science and Technology, Tongji University, Shanghai, China
- Jihong Guan
  - Department of Computer Science and Technology, Tongji University, Shanghai, China
- Shuigeng Zhou
  - Shanghai Key Lab of Intelligent Information Processing and School of Computer Science, Fudan University, Shanghai, China
12. Silva TC, Colaprico A, Olsen C, D'Angelo F, Bontempi G, Ceccarelli M, Noushmehr H. TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages. F1000Res 2016; 5:1542. [PMID: 28232861 PMCID: PMC5302158 DOI: 10.12688/f1000research.8923.2] [Citation(s) in RCA: 92] [Impact Index Per Article: 11.5] [Accepted: 11/24/2016] [Indexed: 01/09/2023] Open
Abstract
Biotechnological advances in sequencing have led to an explosion of publicly available data via large international consortia such as The Cancer Genome Atlas (TCGA), The Encyclopedia of DNA Elements (ENCODE), and The NIH Roadmap Epigenomics Mapping Consortium (Roadmap). These projects have provided unprecedented opportunities to interrogate the epigenome of cultured cancer cell lines as well as normal and tumor tissues with high genomic resolution. The Bioconductor project offers more than 1,000 open-source software and statistical packages to analyze high-throughput genomic data. However, most packages are designed for specific data types (e.g. expression, epigenetics, genomics) and there is no one comprehensive tool that provides a complete integrative analysis of the resources and data provided by all three public projects. A need to create an integration of these different analyses was recently proposed. In this workflow, we provide a series of biologically focused integrative analyses of different molecular data. We describe how to download, process and prepare TCGA data and by harnessing several key Bioconductor packages, we describe how to extract biologically meaningful genomic and epigenomic data. Using Roadmap and ENCODE data, we provide a work plan to identify biologically relevant functional epigenomic elements associated with cancer. To illustrate our workflow, we analyzed two types of brain tumors: low-grade glioma (LGG) versus high-grade glioma (glioblastoma multiform or GBM). This workflow introduces the following Bioconductor packages: AnnotationHub, ChIPSeeker, ComplexHeatmap, pathview, ELMER, GAIA, MINET, RTCGAToolbox, TCGAbiolinks.
Affiliation(s)
- Tiago C Silva
  - Department of Genetics, Ribeirao Preto Medical School, University of Sao Paulo, Ribeirao Preto, Brazil
  - Department of Biomedical Sciences, Cedars-Sinai, Los Angeles, CA, USA
- Antonio Colaprico
  - Interuniversity Institute of Bioinformatics in Brussels, Brussels, Belgium
  - Machine Learning Group, ULB, Brussels, Belgium
- Catharina Olsen
  - Interuniversity Institute of Bioinformatics in Brussels, Brussels, Belgium
  - Machine Learning Group, ULB, Brussels, Belgium
- Fulvio D'Angelo
  - Department of Science and Technology, University of Sannio, Benevento, Italy
  - Biogem, Istituto di Ricerche Genetiche Gaetano Salvatore, Avellino, Italy
- Gianluca Bontempi
  - Interuniversity Institute of Bioinformatics in Brussels, Brussels, Belgium
  - Machine Learning Group, ULB, Brussels, Belgium
  - Department of Science and Technology, University of Sannio, Benevento, Italy
- Houtan Noushmehr
  - Department of Genetics, Ribeirao Preto Medical School, University of Sao Paulo, Ribeirao Preto, Brazil
  - Department of Neurosurgery, Henry Ford Hospital, Detroit, MI, USA
13. Silva TC, Colaprico A, Olsen C, D'Angelo F, Bontempi G, Ceccarelli M, Noushmehr H. TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages. F1000Res 2016. [PMID: 28232861 DOI: 10.12688/f1000research.8923.1] [Citation(s) in RCA: 130] [Impact Index Per Article: 16.3] [Indexed: 01/22/2023] Open
14. Liu CH, Ho BC, Chen CL, Chang YH, Hsu YC, Li YC, Yuan SS, Huang YH, Chang CS, Li KC, Chen HY. ePIANNO: ePIgenomics ANNOtation tool. PLoS One 2016; 11:e0148321. [PMID: 26859295 PMCID: PMC4747527 DOI: 10.1371/journal.pone.0148321] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 07/13/2015] [Accepted: 01/15/2016] [Indexed: 12/04/2022] Open
Abstract
Recently, with the development of next generation sequencing (NGS), the combination of chromatin immunoprecipitation (ChIP) and NGS, namely ChIP-seq, has become a powerful technique to capture potential genomic binding sites of regulatory factors, histone modifications and chromatin accessible regions. For most researchers, additional information including genomic variations on the TF binding site, allele frequency of variation between different populations, variation associated disease, and other neighbour TF binding sites is essential to generate a proper hypothesis or a meaningful conclusion. Many ChIP-seq datasets have been deposited in the public domain to help researchers make new discoveries. However, researchers are often intimidated by the complexity of the data structure and the sheer volume of data. Such information would be more useful if it could be combined or downloaded with ChIP-seq data. To meet such demands, we built a webtool: ePIgenomic ANNOtation tool (ePIANNO, http://epianno.stat.sinica.edu.tw/index.html). ePIANNO is a web server that combines SNP information of populations (1000 Genomes Project) and gene-disease association information of GWAS (NHGRI) with ChIP-seq (hmChIP, ENCODE, and ROADMAP epigenomics) data. ePIANNO has a user-friendly website interface allowing researchers to explore, navigate, and extract data quickly. We use two examples to demonstrate how users could use functions of the ePIANNO webserver to explore useful information about TF related genomic variants. Users could use our query functions to search target regions, transcription factors, or annotations. ePIANNO may help users to generate hypotheses or explore potential biological functions for their studies.
Affiliation(s)
- Chia-Hsin Liu
  - Institute of Statistical Science, Academia Sinica, Nangang, Taipei, Taiwan
  - Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Nangang, Taipei, Taiwan
  - Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan
- Bing-Ching Ho
  - Department of Clinical Laboratory Sciences and Medical Biotechnology, College of Medicine, National Taiwan University, Taipei, Taiwan
  - NTU Center for Genomic Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
- Chun-Ling Chen
  - Institute of Statistical Science, Academia Sinica, Nangang, Taipei, Taiwan
- Ya-Hsuan Chang
  - Institute of Statistical Science, Academia Sinica, Nangang, Taipei, Taiwan
- Yi-Chiung Hsu
  - Institute of Statistical Science, Academia Sinica, Nangang, Taipei, Taiwan
- Yu-Cheng Li
  - Institute of Statistical Science, Academia Sinica, Nangang, Taipei, Taiwan
- Shin-Sheng Yuan
  - Institute of Statistical Science, Academia Sinica, Nangang, Taipei, Taiwan
- Yi-Huan Huang
  - Institute of Statistical Science, Academia Sinica, Nangang, Taipei, Taiwan
- Chi-Sheng Chang
  - NTU Center for Genomic Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
- Ker-Chau Li
  - Institute of Statistical Science, Academia Sinica, Nangang, Taipei, Taiwan
- Hsuan-Yu Chen
  - Institute of Statistical Science, Academia Sinica, Nangang, Taipei, Taiwan
|
15
|
Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2016; 44:D7-19. [PMID: 26615191 PMCID: PMC4702911 DOI: 10.1093/nar/gkv1290] [Citation(s) in RCA: 1013] [Impact Index Per Article: 126.6] [Received: 09/16/2015] [Revised: 11/04/2015] [Accepted: 11/05/2015] [Indexed: 11/25/2022] Open
Abstract
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
|
16
|
Chen HH, Tsai LJ, Lee KR, Chen YM, Hung WT, Chen DY. Genetic association of complement component 2 polymorphism with systemic lupus erythematosus. Tissue Antigens 2015; 86:122-33. [DOI: 10.1111/tan.12602] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 11/16/2014] [Revised: 05/29/2015] [Accepted: 06/09/2015] [Indexed: 12/24/2022]
Affiliation(s)
- H.-H. Chen
  - Institute of Molecular Medicine; National Tsing Hua University; Hsinchu Taiwan
- L.-J. Tsai
  - Graduate Institute of Clinical Medicine; Taipei Medical University; Taipei Taiwan
- K.-R. Lee
  - Institute of Molecular Medicine; National Tsing Hua University; Hsinchu Taiwan
- Y.-M. Chen
  - Division of Allergy, Immunology and Rheumatology; Taichung Veterans General Hospital; Taichung Taiwan
  - Institute of Microbiology and Immunology; Chung Shan Medical University; Taichung Taiwan
  - Institute of Biomedical Science; National Chung Hsing University; Taichung Taiwan
  - Rong Hsing Research Center for Translational Medicine; National Chung Hsing University; Taichung Taiwan
- W.-T. Hung
  - Division of Allergy, Immunology and Rheumatology; Taichung Veterans General Hospital; Taichung Taiwan
  - Institute of Microbiology and Immunology; Chung Shan Medical University; Taichung Taiwan
- D.-Y. Chen
  - Institute of Molecular Medicine; National Tsing Hua University; Hsinchu Taiwan
  - Division of Allergy, Immunology and Rheumatology; Taichung Veterans General Hospital; Taichung Taiwan
  - Institute of Microbiology and Immunology; Chung Shan Medical University; Taichung Taiwan
  - Institute of Biomedical Science; National Chung Hsing University; Taichung Taiwan
  - Rong Hsing Research Center for Translational Medicine; National Chung Hsing University; Taichung Taiwan
  - Faculty of Medicine; National Yang Ming University; Taipei Taiwan
|
17
|
Medvedeva YA, Lennartsson A, Ehsani R, Kulakovskiy IV, Vorontsov IE, Panahandeh P, Khimulya G, Kasukawa T, Drabløs F. EpiFactors: a comprehensive database of human epigenetic factors and complexes. Database: The Journal of Biological Databases and Curation 2015; 2015:bav067. [PMID: 26153137 PMCID: PMC4494013 DOI: 10.1093/database/bav067] [Citation(s) in RCA: 164] [Impact Index Per Article: 18.2] [Received: 03/24/2015] [Accepted: 06/15/2015] [Indexed: 12/22/2022]
Abstract
Epigenetics refers to stable and long-term alterations of cellular traits that are not caused by changes in the DNA sequence per se. Rather, covalent modifications of DNA and histones affect gene expression and genome stability via proteins that recognize and act upon such modifications. Many enzymes that catalyse epigenetic modifications or are critical for enzymatic complexes have been discovered, and this is encouraging investigators to study the role of these proteins in diverse normal and pathological processes. Rapidly growing knowledge in the area has resulted in the need for a resource that compiles, organizes and presents curated information to the researchers in an easily accessible and user-friendly form. Here we present EpiFactors, a manually curated database providing information about epigenetic regulators, their complexes, targets and products. EpiFactors contains information on 815 proteins, including 95 histones and protamines. For 789 of these genes, we include expression values across several samples, in particular a collection of 458 human primary cell samples (for approximately 200 cell types, in many cases from three individual donors), covering most mammalian cell steady states, 255 different cancer cell lines (representing approximately 150 cancer subtypes) and 134 human postmortem tissues. Expression values were obtained by the FANTOM5 consortium using the Cap Analysis of Gene Expression technique. EpiFactors also contains information on 69 protein complexes that are involved in epigenetic regulation. The resource is practical for a wide range of users, including biologists, pharmacologists and clinicians. Database URL: http://epifactors.autosome.ru
Affiliation(s)
- Yulia A Medvedeva
  - Institute of Personal and Predictive Medicine of Cancer, 08916 Badalona, Spain
  - Department of Computational Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
- Andreas Lennartsson
  - Department of Biosciences and Nutrition, Karolinska Institutet, 14183 Huddinge, Sweden
- Rezvan Ehsani
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NO-7489 Trondheim, Norway
- Ivan V Kulakovskiy
  - Department of Computational Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
- Ilya E Vorontsov
  - Department of Computational Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
- Pouda Panahandeh
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NO-7489 Trondheim, Norway
- Grigory Khimulya
  - Department of Computational Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
- Takeya Kasukawa
  - Division of Genomic Technologies (DGT), RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-Cho, Tsurumi-Ku, Yokohama 230-0045, Kanagawa, Japan
- Finn Drabløs
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NO-7489 Trondheim, Norway
|
18
|
Cui H, Dhroso A, Johnson N, Korkin D. The variation game: Cracking complex genetic disorders with NGS and omics data. Methods 2015; 79-80:18-31. [PMID: 25944472 DOI: 10.1016/j.ymeth.2015.04.018] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 12/13/2014] [Revised: 03/27/2015] [Accepted: 04/17/2015] [Indexed: 12/14/2022] Open
Abstract
Tremendous advances in Next Generation Sequencing (NGS) and high-throughput omics methods have brought us one step closer towards mechanistic understanding of the complex disease at the molecular level. In this review, we discuss four basic regulatory mechanisms implicated in complex genetic diseases, such as cancer, neurological disorders, heart disease, diabetes, and many others. The mechanisms, including genetic variations, copy-number variations, posttranscriptional variations, and epigenetic variations, can be detected using a variety of NGS methods. We propose that malfunctions detected in these mechanisms are not necessarily independent, since these malfunctions are often found associated with the same disease and targeting the same gene, group of genes, or functional pathway. As an example, we discuss possible rewiring effects of the cancer-associated genetic, structural, and posttranscriptional variations on the protein-protein interaction (PPI) network centered around P53 protein. The review highlights multi-layered complexity of common genetic disorders and suggests that integration of NGS and omics data is a critical step in developing new computational methods capable of deciphering this complexity.
Affiliation(s)
- Hongzhu Cui
  - Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States
- Andi Dhroso
  - Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States
- Nathan Johnson
  - Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States
- Dmitry Korkin
  - Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States; Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States
|
19
|
Loharch S, Bhutani I, Jain K, Gupta P, Sahoo DK, Parkesh R. EpiDBase: a manually curated database for small molecule modulators of epigenetic landscape. Database: The Journal of Biological Databases and Curation 2015; 2015:bav013. [PMID: 25776023 PMCID: PMC4360624 DOI: 10.1093/database/bav013] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 12/17/2022]
Abstract
We have developed EpiDBase (www.epidbase.org), an interactive database of small molecule ligands of epigenetic protein families by bringing together experimental, structural and chemoinformatic data in one place. Currently, EpiDBase encompasses 5784 unique ligands (11 422 entries) of various epigenetic markers such as writers, erasers and readers. EpiDBase includes experimental IC50 values, ligand molecular weight, hydrogen bond donor and acceptor count, XlogP, number of rotatable bonds, number of aromatic rings, InChIKey, and two-dimensional and three-dimensional (3D) chemical structures. A catalog of all EpiDBase ligands based on the molecular weight is also provided. A structure editor is provided for 3D visualization of ligands. EpiDBase is integrated with tools like text search, disease-specific search, advanced search, substructure, and similarity analysis. Advanced analysis can be performed using substructure and OpenBabel-based chemical similarity fingerprints. EpiDBase is curated to identify unique molecular scaffolds. Initially, molecules were selected by removing peptides, macrocycles and other complex structures and then processed for conformational sampling by generating 3D conformers. Subsequent filtering through Zinc Is Not Commercial (ZINC: a free database of commercially available compounds for virtual screening) and Lilly MedChem regular rules retained many distinctive drug-like molecules. These molecules were then analyzed for physicochemical properties using OpenBabel descriptors and clustered using various methods such as hierarchical clustering, binning partition and multidimensional scaling. EpiDBase provides comprehensive resources for further design, development and refinement of small molecule modulators of epigenetic markers. Database URL: www.epidbase.org
Affiliation(s)
- Saurabh Loharch
  - Department of Advanced Protein Science, Institute of Microbial Technology, Chandigarh 160036, India
- Isha Bhutani
  - Department of Advanced Protein Science, Institute of Microbial Technology, Chandigarh 160036, India
- Kamal Jain
  - Department of Advanced Protein Science, Institute of Microbial Technology, Chandigarh 160036, India
- Pawan Gupta
  - Department of Advanced Protein Science, Institute of Microbial Technology, Chandigarh 160036, India
- Debendra K Sahoo
  - Department of Advanced Protein Science, Institute of Microbial Technology, Chandigarh 160036, India
- Raman Parkesh
  - Department of Advanced Protein Science, Institute of Microbial Technology, Chandigarh 160036, India
|
20
|
Abstract
INTRODUCTION 1α,25-Dihydroxyvitamin D3 (1,25-D3) is antiproliferative in preclinical models of lung cancer, but in tumor tissues, its efficacy may be limited by CYP24A1 expression. CYP24A1 is the rate limiting catabolic enzyme for 1,25-D3 and is overexpressed in human lung adenocarcinoma (AC) by unknown mechanisms. METHODS The DNA methylation status of CYP24A1 was determined by bisulfite DNA pyrosequencing in a panel of 30 lung cell lines and 90 surgically resected lung AC. The level of CYP24A1 methylation was correlated with CYP24A1 expression in lung AC cell lines and tumors. In addition, histone modifications were assessed by quantitative chromatin immunoprecipitation-polymerase chain reaction (ChIP-qPCR) in A549, NCI-H460, and SK-LU-1. RESULTS Bisulfite DNA pyrosequencing analysis revealed that CYP24A1 gene was heterogeneously methylated in lung AC. Expression of CYP24A1 was inversely correlated with promoter DNA methylation in lung AC cell lines and tumors. Treatment with 5-aza-2'-deoxycytidine (5-Aza) and trichostatin A (TSA) increased CYP24A1 expression in lung AC. We observed that CYP24A1 promoter hypermethylation decreased CYP24A1 enzyme activity in vitro, whereas treatment with 5-Aza and/or TSA increased CYP24A1 enzyme affinity for its substrate 1,25-D3. In addition, ChIP-qPCR analysis revealed specific histone modifications within the CYP24A1 promoter region. Treatment with TSA increased H3K4me2 and H3K9ac and simultaneously decreased H3K9me2 at the CYP24A1 promoter and treatment with 5-Aza and/or TSA increased the recruitment of vitamin D receptor (VDR) to vitamin D response elements (VDRE) of the CYP24A1 promoter. CONCLUSIONS The expression of CYP24A1 gene in human lung AC is in part epigenetically regulated by promoter DNA methylation and repressive histone modifications. These findings should be taken into consideration when targeting CYP24A1 to optimize antiproliferative effects of 1,25-D3 in lung AC.
|
21
|
Abstract
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov.
|
22
|
Wei Y, Su J, Liu H, Lv J, Wang F, Yan H, Wen Y, Liu H, Wu Q, Zhang Y. MetaImprint: an information repository of mammalian imprinted genes. Development 2014; 141:2516-23. [DOI: 10.1242/dev.105320] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Indexed: 12/13/2022]
Abstract
Genomic imprinting is a complex genetic and epigenetic phenomenon that plays important roles in mammalian development and diseases. Mammalian imprinted genes have been identified widely by experimental strategies or predicted using computational methods. Systematic information for these genes would be necessary for the identification of novel imprinted genes and the analysis of their regulatory mechanisms and functions. Here, a well-designed information repository, MetaImprint (http://bioinfo.hrbmu.edu.cn/MetaImprint), is presented, which focuses on the collection of information concerning mammalian imprinted genes. The current version of MetaImprint incorporates 539 imprinted genes, including 255 experimentally confirmed genes, and their detailed research courses from eight mammalian species. MetaImprint also hosts genome-wide genetic and epigenetic information of imprinted genes, including imprinting control regions, single nucleotide polymorphisms, non-coding RNAs, DNA methylation and histone modifications. Information related to human diseases and functional annotation was also integrated into MetaImprint. To facilitate data extraction, MetaImprint supports multiple search options, such as by gene ID and disease name. Moreover, a configurable Imprinted Gene Browser was developed to visualize the information on imprinted genes in a genomic context. In addition, an Epigenetic Changes Analysis Tool is provided for online analysis of DNA methylation and histone modification differences of imprinted genes among multiple tissues and cell types. MetaImprint provides a comprehensive information repository of imprinted genes, allowing researchers to investigate systematically the genetic and epigenetic regulatory mechanisms of imprinted genes and their functions in development and diseases.
Affiliation(s)
- Yanjun Wei
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Jianzhong Su
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Hongbo Liu
  - School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China
- Jie Lv
  - School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China
- Fang Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Haidan Yan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yanhua Wen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Hui Liu
  - School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China
- Qiong Wu
  - School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China
- Yan Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
|
23
|
Pathak RR, Davé V. Integrating omics technologies to study pulmonary physiology and pathology at the systems level. Cell Physiol Biochem 2014; 33:1239-60. [PMID: 24802001 PMCID: PMC4396816 DOI: 10.1159/000358693] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Accepted: 03/11/2014] [Indexed: 12/13/2022] Open
Abstract
Assimilation and integration of "omics" technologies, including genomics, epigenomics, proteomics, and metabolomics, has readily altered the landscape of medical research in the last decade. The vast and complex nature of omics data can only be interpreted by linking molecular information at the organismic level, forming the foundation of systems biology. Research in pulmonary biology/medicine has necessitated integration of omics, network, systems and computational biology data to differentially diagnose, interpret, and prognosticate pulmonary diseases, facilitating improvement in therapy and treatment modalities. This review describes how to leverage this emerging technology in understanding pulmonary diseases at the systems level, called a "systomic" approach. Considering the operational wholeness of cellular and organ systems, the diseased genome, proteome, and metabolome need to be conceptualized at the systems level to understand disease pathogenesis and progression. Currently available omics technology and resources require a certain degree of training and proficiency in addition to dedicated hardware and applications, making them relatively less user friendly for pulmonary biologists and clinicians. Herein, we discuss the various strategies, computational tools and approaches required to study pulmonary diseases at the systems level for biomedical scientists and clinical researchers.
Affiliation(s)
- Ravi Ramesh Pathak
  - Morsani College of Medicine, Department of Pathology and Cell Biology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL USA
- Vrushank Davé
  - Morsani College of Medicine, Department of Pathology and Cell Biology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL USA
  - Department of Molecular Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL USA
|
24
|
Abstract
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, PubReader, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Primer-BLAST, COBALT, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, ClinVar, MedGen, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page.
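As a rough illustration of the Entrez Programming Utilities named in this abstract, the following sketch builds ESearch and EFetch request URLs with the Python standard library only. It assumes the publicly documented E-utilities base URL and the `db`, `term`, `id`, `retmax` and `retmode` parameters; the helper names `esearch_url` and `efetch_url` are hypothetical and not part of any NCBI library, and no network request is made.

```python
from urllib.parse import urlencode

# Assumed public base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL that searches one Entrez database for a term.

    Hypothetical helper: it only assembles the query string.
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(db, ids, retmode="xml"):
    """Build an EFetch URL that retrieves records by their identifiers.

    Hypothetical helper: identifiers are joined with commas and URL-encoded.
    """
    params = {"db": db, "id": ",".join(str(i) for i in ids), "retmode": retmode}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Search PubMed for a term, then fetch one record by its PMID.
    print(esearch_url("pubmed", "epigenomics"))
    print(efetch_url("pubmed", [23193265]))
```

The PMID used in the usage lines is taken from another entry in this listing; any real client would also need to respect NCBI's usage policies, which are outside the scope of this sketch.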
Affiliation(s)
- NCBI Resource Coordinators
  - *To whom correspondence should be addressed. Eric W. Sayers. Tel: +1 301 496 2475; Fax: +1 301 480 9241
|
25
|
Cho SY, Chai JC, Park SJ, Seo H, Sohn CB, Lee YS. EPITRANS: a database that integrates epigenome and transcriptome data. Mol Cells 2013; 36:472-5. [PMID: 24213601 PMCID: PMC3887936 DOI: 10.1007/s10059-013-0249-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/05/2013] [Accepted: 09/10/2013] [Indexed: 11/28/2022] Open
Abstract
Epigenetic modifications affect gene expression and thereby govern a wide range of biological processes such as differentiation, development and tumorigenesis. Recent initiatives to define genome-wide DNA methylation and histone modification profiles by microarray and sequencing methods have led to the construction of databases. These databases are repositories for international epigenetic consortiums or provide mining results from PubMed, but do not integrate the epigenetic information with gene expression changes. In order to overcome this limitation, we constructed EPITRANS, a novel database that visualizes the relationships between gene expression and epigenetic modifications. EPITRANS uses combined analysis of epigenetic modification and gene expression to search for cell function-related epigenetic and transcriptomic alterations (freely available on the web at http://epitrans.org).
Affiliation(s)
- Soo Young Cho
  - Laboratory of Developmental Biology and Genomics, College of Veterinary Medicine, Research Institute for Veterinary Science, Brain Korea 21 Program for Veterinary Science
  - Interdisciplinary Program for Bioinformatics, Program for Cancer Biology and BIO-MAX Institute, Seoul National University, Seoul 151-742, Korea
  - MRC Harwell, Mammalian Genetics Unit, Harwell Science and Innovation Campus, Oxfordshire, United Kingdom
- Jin Choul Chai
  - Department of Molecular and Life Sciences, Hanyang University, Ansan 425-791, Korea
- Soo Jun Park
  - Bio-Medical IT Convergence Research Department, ETRI, Daejeon 305-700, Korea
- Hyemyung Seo
  - Department of Molecular and Life Sciences, Hanyang University, Ansan 425-791, Korea
- Chae-Bong Sohn
  - Department of Electronics and Communications Engineering, Kwangwoon University, Seoul 139-701, Korea
- Young Seek Lee
  - Department of Molecular and Life Sciences, Hanyang University, Ansan 425-791, Korea
|
26
|
Genomic structure and variation of nuclear factor (erythroid-derived 2)-like 2. Oxidative Medicine and Cellular Longevity 2013; 2013:286524. [PMID: 23936606 PMCID: PMC3723247 DOI: 10.1155/2013/286524] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Received: 02/11/2013] [Accepted: 04/22/2013] [Indexed: 12/15/2022]
Abstract
High-density mapping of mammalian genomes has enabled a wide range of genetic investigations including the mapping of polygenic traits, determination of quantitative trait loci, and phylogenetic comparison. Genome sequencing analysis of inbred mouse strains has identified high-density single nucleotide polymorphisms (SNPs) for investigation of complex traits, which has become a useful tool for biomedical research of human disease to alleviate ethical and practical problems of experimentation in humans. Nuclear factor (erythroid-derived 2)-like 2 (NRF2) encodes a key host defense transcription factor. This review describes genetic characteristics of human NRF2 and its homologs in other vertebrate species. NRF2 is evolutionarily conserved and shares sequence homology among species. Compilation of publicly available SNPs and other genetic mutations shows that human NRF2 is highly polymorphic with a mutagenic frequency of 1 per every 72 bp. Functional at-risk alleles and haplotypes have been demonstrated in various human disorders. In addition, other pathogenic alterations including somatic mutations and misregulated epigenetic processes in NRF2 have led to oncogenic cell survival. Comprehensive information from the current review addresses the association of NRF2 variation and disease phenotypes and supports new insights into therapeutic strategies.
|
27
|
|
28
|
Fingerman IM, Zhang X, Ratzat W, Husain N, Cohen RF, Schuler GD. NCBI Epigenomics: what's new for 2013. Nucleic Acids Res 2012. [PMID: 23193265 PMCID: PMC3531100 DOI: 10.1093/nar/gks1171] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 01/15/2023] Open
Abstract
The Epigenomics resource at the National Center for Biotechnology Information (NCBI) has been created to serve as a comprehensive public repository for whole-genome epigenetic data sets (www.ncbi.nlm.nih.gov/epigenomics). We have constructed this resource by selecting the subset of epigenetics-specific data from the Gene Expression Omnibus (GEO) database and then subjecting them to further review and annotation. Associated data tracks can be viewed using popular genome browsers or downloaded for local analysis. We have performed extensive user testing throughout the development of this resource, and new features and improvements are continuously being implemented based on the results. We have made substantial usability improvements to user interfaces, enhanced functionality, made identification of data tracks of interest easier and created new tools for preliminary data analyses. Additionally, we have made efforts to enhance the integration between the Epigenomics resource and other NCBI databases, including the Gene database and PubMed. Data holdings have also increased dramatically since the initial publication describing the NCBI Epigenomics resource and currently consist of >3700 viewable and downloadable data tracks from 955 biological sources encompassing five well-studied species. This updated manuscript highlights these changes and improvements.
Affiliation(s)
- Ian M Fingerman
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892, USA.
|
29
|
Abstract
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page.
Affiliation(s)
- NCBI Resource Coordinators
- *To whom correspondence should be addressed. Eric W. Sayers. Tel: +30 1 49 62 475; Fax: +30 1 48 09 241;
|
30
|
Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Holko M, Yefanov A, Lee H, Zhang N, Robertson CL, Serova N, Davis S, Soboleva A. NCBI GEO: archive for functional genomics data sets--update. Nucleic Acids Res 2012. [PMID: 23193258 PMCID: PMC3531084 DOI: 10.1093/nar/gks1193] [Citation(s) in RCA: 5874] [Impact Index Per Article: 489.5] [Indexed: 02/07/2023]
Abstract
The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.
Affiliation(s)
- Tanya Barrett
- National Center for Biotechnology Information, National Library of Medicine and Molecular Genetics Section, Genetics Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA.
|
31
|
HEMD: an integrated tool of human epigenetic enzymes and chemical modulators for therapeutics. PLoS One 2012; 7:e39917. [PMID: 22761927 PMCID: PMC3382562 DOI: 10.1371/journal.pone.0039917] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 12/27/2011] [Accepted: 05/29/2012] [Indexed: 12/23/2022]
Abstract
BACKGROUND: Epigenetic mechanisms mainly include DNA methylation, post-translational modifications of histones, chromatin remodeling and non-coding RNAs. All of these processes are mediated and controlled by enzymes. Abnormalities of the enzymes are involved in a variety of complex human diseases. Recently, potent natural or synthetic chemicals are utilized to establish the quantitative contributions of epigenetic regulation through the enzymes and provide novel insight for developing new therapeutics. However, the development of more specific and effective epigenetic therapeutics requires a more complete understanding of the chemical epigenomic landscape. DESCRIPTION: Here, we present a human epigenetic enzyme and modulator database (HEMD), the database which provides a central resource for the display, search, and analysis of the structure, function, and related annotation for human epigenetic enzymes and chemical modulators focused on epigenetic therapeutics. Currently, HEMD contains 269 epigenetic enzymes and 4377 modulators in three categories (activators, inhibitors, and regulators). Enzymes are annotated with detailed description of epigenetic mechanisms, catalytic processes, and related diseases, and chemical modulators with binding sites, pharmacological effect, and therapeutic uses. Integrating the information of epigenetic enzymes in HEMD should allow for the prediction of conserved features for proteins and could potentially classify them as ideal targets for experimental validation. In addition, modulators curated in HEMD can be used to investigate potent epigenetic targets for the query compound and also help chemists to implement structural modifications for the design of novel epigenetic drugs. CONCLUSIONS: HEMD could be a platform and a starting point for biologists and medicinal chemists for furthering research on epigenetic therapeutics. HEMD is freely available at http://mdl.shsmu.edu.cn/HEMD/.
|
32
|
Abstract
The NIH Roadmap Reference Epigenome Mapping Consortium is developing a community resource of genome-wide epigenetic maps in a broad range of human primary cells and tissues. There are large amounts of data already available, and a number of different options for viewing and analyzing the data. This report will describe key features of the websites where users will find data, protocols and analysis tools developed by the consortium, and provide a perspective on how this unique resource will facilitate and inform human disease research, both immediately and in the future.
Affiliation(s)
- Lisa Helbling Chadwick
- Division of Extramural Research & Training, National Institute of Environmental Health Sciences, 530 Davis Drive, Morrisville, NC 27709, USA.
|
33
|
Qin B, Zhou M, Ge Y, Taing L, Liu T, Wang Q, Wang S, Chen J, Shen L, Duan X, Hu S, Li W, Long H, Zhang Y, Liu XS. CistromeMap: a knowledgebase and web server for ChIP-Seq and DNase-Seq studies in mouse and human. Bioinformatics 2012; 28:1411-2. [PMID: 22495751 DOI: 10.1093/bioinformatics/bts157] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 01/15/2023]
Abstract
SUMMARY: Transcription and chromatin regulators and histone modifications play essential roles in gene expression regulation. We have created CistromeMap as a web server to provide a comprehensive knowledgebase of all of the publicly available ChIP-Seq and DNase-Seq data in mouse and human. We have also manually curated metadata to ensure annotation consistency, and developed a user-friendly display matrix for quick navigation and retrieval of data for specific factors, cells and papers. Finally, we provide users with summary statistics of ChIP-Seq and DNase-Seq studies.
Affiliation(s)
- Bo Qin
- Department of Bioinformatics, School of Life Science and Technology, Tongji University, Shanghai, 200092, China
|
34
|
Egger G, Wielscher M, Pulverer W, Kriegner A, Weinhäusel A. DNA methylation testing and marker validation using PCR: diagnostic applications. Expert Rev Mol Diagn 2012; 12:75-92. [PMID: 22133121 DOI: 10.1586/erm.11.90] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 02/06/2023]
Abstract
DNA methylation provides a fundamental epigenetic mechanism to establish and promote cell-specific gene-expression patterns, which are inherited by subsequent cell generations. Thus, the epigenome determines the differentiation into a cell lineage but can also program cells to become abnormal or malignant. In humans, different germline and somatic diseases have been linked to faulty DNA methylation. In this article, we will discuss the available PCR-based technologies to assess differences in DNA methylation levels mainly affecting 5-methylcytosine in the CpG dinucleotide context in hereditary syndromal and somatic pathological conditions. We will discuss some of the current diagnostic applications and provide an outlook on how DNA methylation-based biomarkers might provide novel tools for diagnosis, prognosis or patient stratification for diseases such as cancer.
Affiliation(s)
- Gerda Egger
- Clinical Institute of Pathology, Medical University of Vienna, Austria
|
35
|
Abstract
The Gene Expression Omnibus (GEO) database is a major repository that stores high-throughput functional genomics data sets that are generated using both microarray-based and sequence-based technologies. Data sets are submitted to GEO primarily by researchers who are publishing their results in journals that require original data to be made freely available for review and analysis. In addition to serving as a public archive for these data, GEO has a suite of tools that allow users to identify, analyze, and visualize data relevant to their specific interests. These tools include sample comparison applications, gene expression profile charts, data set clusters, genome browser tracks, and a powerful search engine that enables users to construct complex queries.
Affiliation(s)
- Stephen E. Wilhite
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD, USA
- Tanya Barrett
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD, USA
|
36
|
Sayers EW, Barrett T, Benson DA, Bolton E, Bryant SH, Canese K, Chetvernin V, Church DM, Dicuccio M, Federhen S, Feolo M, Fingerman IM, Geer LY, Helmberg W, Kapustin Y, Krasnov S, Landsman D, Lipman DJ, Lu Z, Madden TL, Madej T, Maglott DR, Marchler-Bauer A, Miller V, Karsch-Mizrachi I, Ostell J, Panchenko A, Phan L, Pruitt KD, Schuler GD, Sequeira E, Sherry ST, Shumway M, Sirotkin K, Slotta D, Souvorov A, Starchenko G, Tatusova TA, Wagner L, Wang Y, Wilbur WJ, Yaschenko E, Ye J. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2011; 40:D13-25. [PMID: 22140104 PMCID: PMC3245031 DOI: 10.1093/nar/gkr1184] [Citation(s) in RCA: 466] [Impact Index Per Article: 35.8] [Indexed: 01/30/2023]
Abstract
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Affiliation(s)
- Eric W Sayers
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA.
|
37
|
Barrett T, Clark K, Gevorgyan R, Gorelenkov V, Gribov E, Karsch-Mizrachi I, Kimelman M, Pruitt KD, Resenchuk S, Tatusova T, Yaschenko E, Ostell J. BioProject and BioSample databases at NCBI: facilitating capture and organization of metadata. Nucleic Acids Res 2011; 40:D57-63. [PMID: 22139929 PMCID: PMC3245069 DOI: 10.1093/nar/gkr1163] [Citation(s) in RCA: 214] [Impact Index Per Article: 16.5] [Indexed: 01/09/2023]
Abstract
As the volume and complexity of data sets archived at NCBI grow rapidly, so does the need to gather and organize the associated metadata. Although metadata has been collected for some archival databases, previously, there was no centralized approach at NCBI for collecting this information and using it across databases. The BioProject database was recently established to facilitate organization and classification of project data submitted to NCBI, EBI and DDBJ databases. It captures descriptive information about research projects that result in high volume submissions to archival databases, ties together related data across multiple archives and serves as a central portal by which to inform users of data availability. Concomitantly, the BioSample database is being developed to capture descriptive information about the biological samples investigated in projects. BioProject and BioSample records link to corresponding data stored in archival repositories. Submissions are supported by a web-based Submission Portal that guides users through a series of forms for input of rich metadata describing their projects and samples. Together, these databases offer improved ways for users to query, locate, integrate and interpret the masses of data held in NCBI's archival repositories. The BioProject and BioSample databases are available at http://www.ncbi.nlm.nih.gov/bioproject and http://www.ncbi.nlm.nih.gov/biosample, respectively.
Affiliation(s)
- Tanya Barrett
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892, USA
|
38
|
Milosavljevic A. Emerging patterns of epigenomic variation. Trends Genet 2011; 27:242-50. [PMID: 21507501 DOI: 10.1016/j.tig.2011.03.001] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 09/17/2010] [Revised: 03/09/2011] [Accepted: 03/14/2011] [Indexed: 12/15/2022]
Abstract
Fuelled by new sequencing technologies, epigenome mapping projects are revealing epigenomic variation at all levels of biological complexity, from species to cells. Comparisons of methylation profiles among species reveal evolutionary conservation of gene body methylation patterns, pointing to the fundamental role of epigenomes in gene regulation. At the human population level, epigenomic changes provide footprints of the effects of genomic variants within the vast nonprotein-coding fraction of the genome, and comparisons of the epigenomes of parents and their offspring point to quantitative epigenomic parent-of-origin effects confounding classical Mendelian genetics. At the organismal level, comparisons of epigenomes from diverse cell types provide insights into cellular differentiation. Finally, comparisons of epigenomes from monozygotic twins help dissect genetic and environmental influences on human phenotypes and longitudinal comparisons reveal aging-associated epigenomic drift. The development of new bioinformatic frameworks for comparative epigenome analysis is putting epigenome maps within the reach of researchers across a wide spectrum of biological disciplines.
|
39
|
Barrett T, Troup DB, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Muertter RN, Holko M, Ayanbule O, Yefanov A, Soboleva A. NCBI GEO: archive for functional genomics data sets--10 years on. Nucleic Acids Res 2010; 39:D1005-10. [PMID: 21097893 PMCID: PMC3013736 DOI: 10.1093/nar/gkq1184] [Citation(s) in RCA: 798] [Impact Index Per Article: 57.0] [Indexed: 12/19/2022]
Abstract
A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20,000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.
Affiliation(s)
- Tanya Barrett
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892, USA.
|