1
Maderazo D, Flegg JA, Algama M, Ramialison M, Keith J. Detection and identification of cis-regulatory elements using change-point and classification algorithms. BMC Genomics 2022; 23:78. [PMID: 35078412 PMCID: PMC8790847 DOI: 10.1186/s12864-021-08190-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2021] [Accepted: 11/19/2021] [Indexed: 11/10/2022] Open
Abstract
Background: Transcriptional regulation is primarily mediated by the binding of factors to non-coding regions in DNA. Identification of these binding regions enhances understanding of tissue formation and potentially facilitates the development of gene therapies. However, successful identification of binding regions is made difficult by the lack of a universal biological code for their characterisation. Results: We extend an alignment-based method, changept, and identify clusters of biological significance, through ontology and de novo motif analysis. Further, we apply a Bayesian method to estimate and combine binary classifiers on the clusters we identify to produce a better performing composite. Conclusions: The analysis we describe provides a computational method for identification of conserved binding sites in the human genome and facilitates an alternative interrogation of combinations of existing data sets with alignment data.
Affiliation(s)
- Dominic Maderazo
  - School of Mathematics and Statistics, The University of Melbourne, Melbourne, 3010, VIC, Australia
- Jennifer A Flegg
  - School of Mathematics and Statistics, The University of Melbourne, Melbourne, 3010, VIC, Australia
- Manjula Algama
  - School of Mathematics, Monash University, Melbourne, 3800, VIC, Australia
- Mirana Ramialison
  - Australian Regenerative Medicine Institute, Monash University, Melbourne, 3800, VIC, Australia
- Jonathan Keith
  - School of Mathematics, Monash University, Melbourne, 3800, VIC, Australia
2
Xie J, Yin Y, Yang F, Sun J, Wang J. Differential Network Analysis Reveals Regulatory Patterns in Neural Stem Cell Fate Decision. Interdiscip Sci 2021; 13:91-102. [PMID: 33439459 DOI: 10.1007/s12539-020-00415-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2020] [Revised: 12/11/2020] [Accepted: 12/22/2020] [Indexed: 11/30/2022]
Abstract
Deciphering the regulatory patterns of multi-stage neural stem cell (NSC) differentiation is essential to understanding NSC differentiation mechanisms. Single-cell transcriptome datasets have recently become available for individual differentiation stages. However, a systematic and integrative analysis of multiple datasets at multiple temporal stages of NSC differentiation is lacking. In this study, we propose a new method integrating prior information to construct three gene regulatory networks at pair-wise stages of transcriptome and apply this method to investigate five NSC differentiation paths on four different single-cell transcriptome datasets. By constructing gene regulatory networks for each path, we delineate their regulatory patterns via differential topology and network diffusion analyses. We find 12 common differentially expressed genes among the five NSC differentiation paths, with one common regulatory pattern (Gsk3b_App_Cdk5) shared by all paths. The identified regulatory pattern, partly supported by previous experimental evidence, is essential to all differentiation paths, but it plays a different role in each path when regulating other genes. Together, our integrative analysis provides both common and specific regulatory mechanisms for each of the five NSC differentiation paths.
Affiliation(s)
- Jiang Xie
  - School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Yiting Yin
  - School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Fuzhang Yang
  - School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Jiamin Sun
  - School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Jiao Wang
  - School of Life Sciences, Shanghai University, Shanghai, China
3
In-silico analysis of eukaryotic translation initiation factors (eIFs) in response to environmental stresses in rice (Oryza sativa). Biologia (Bratisl) 2020. [DOI: 10.2478/s11756-020-00467-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/24/2023]
4
Khamis AM, Motwalli O, Oliva R, Jankovic BR, Medvedeva YA, Ashoor H, Essack M, Gao X, Bajic VB. A novel method for improved accuracy of transcription factor binding site prediction. Nucleic Acids Res 2018; 46:e72. [PMID: 29617876 PMCID: PMC6037060 DOI: 10.1093/nar/gky237] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 06/19/2017] [Revised: 03/01/2018] [Accepted: 03/20/2018] [Indexed: 12/12/2022] Open
Abstract
Identifying transcription factor (TF) binding sites (TFBSs) is important in the computational inference of gene regulation. Widely used computational methods of TFBS prediction based on position weight matrices (PWMs) usually have high false positive rates. Moreover, computational studies of transcription regulation in eukaryotes frequently require numerous PWM models of TFBSs due to the large number of TFs involved. To overcome these problems we developed DRAF, a novel method for TFBS prediction that requires only 14 prediction models for 232 human TFs, while at the same time significantly improving prediction accuracy. DRAF models use more features than PWM models, as they combine information from TFBS sequences and physicochemical properties of TF DNA-binding domains into machine learning models. Evaluation of DRAF on 98 human ChIP-seq datasets shows on average 1.54-, 1.96- and 5.19-fold reduction of false positives at the same sensitivities compared to models from HOCOMOCO, TRANSFAC and DeepBind, respectively. This observation suggests that one can efficiently replace the PWM models for TFBS prediction by a small number of DRAF models that significantly improve prediction accuracy. The DRAF method is implemented in a web tool and in stand-alone software freely available at http://cbrc.kaust.edu.sa/DRAF.
Affiliation(s)
- Abdullah M Khamis
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
- Olaa Motwalli
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
- Romina Oliva
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
  - Department of Sciences and Technologies, University ‘Parthenope’ of Naples, Centro Direzionale Isola C4 80143, Naples, Italy
- Boris R Jankovic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
- Yulia A Medvedeva
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
  - Institute of Bioengineering, Research Centre of Biotechnology, Russian Academy of Science, 117312 Moscow, Russia
  - Department of Computational Biology, Vavilov Institute of General Genetics, Russian Academy of Science, 119991 Moscow, Russia
  - Department of Biological and Medical Physics, Moscow Institute of Physics and Technology, 141701, Dolgoprudny, Moscow Region, Russia
- Haitham Ashoor
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
- Magbubah Essack
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
- Xin Gao
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
- Vladimir B Bajic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
5
Abroun S, Saki N, Fakher R, Asghari F. Biology and bioinformatics of myeloma cell. ACTA ACUST UNITED AC 2013; 18:30-41. [PMID: 23253865 DOI: 10.1532/lh96.11003] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/20/2022]
Abstract
Multiple myeloma (MM) is a plasma cell disorder that accounts for about 10% of all hematologic cancers. The majority of patients (99%) are over 50 years of age when diagnosed. In the bone marrow (BM), stromal and hematopoietic stem cells (HSCs) are responsible for the production of blood cells. Therefore, any destruction of or changes within the BM adversely affect a wide range of hematopoiesis, causing diseases and influencing patient survival. In order to establish an effective therapeutic strategy, recognition of the biology and evaluation of bioinformatics models for myeloma cells are necessary to assist in determining suitable methods to cure or prevent disease complications in patients. This review presents the evaluation of molecular and cellular aspects of MM such as genetic translocation, genetic analysis, cell surface markers, transcription factors, and chemokine signaling pathways. It also briefly reviews some of the mechanisms involved in MM in order to develop a better understanding for use in future studies.
Affiliation(s)
- Saeid Abroun
  - Department of Hematology and Blood Banking, School of Medical Sciences, Tarbiat Modares University, Tehran, Iran
6
Struckmann S, Esch D, Schöler H, Fuellen G. Visualization and exploration of conserved regulatory modules using ReXSpecies 2. BMC Evol Biol 2011; 11:267. [PMID: 21942985 PMCID: PMC3203875 DOI: 10.1186/1471-2148-11-267] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 03/15/2011] [Accepted: 09/24/2011] [Indexed: 11/10/2022] Open
Abstract
Background: The prediction of transcription factor binding sites is difficult for many reasons. Thus, filtering methods are needed to enrich for biologically relevant (true positive) matches in the large amount of computational predictions that are frequently generated from promoter sequences. Results: ReXSpecies 2 filters predictions of transcription factor binding sites and generates a set of figures displaying them in evolutionary context. More specifically, it uses position specific scoring matrices to search for motifs that specify transcription factor binding sites. It removes redundant matches and filters the remaining matches by the phylogenetic group that the matrices belong to. It then identifies potential transcriptional modules, and generates figures that highlight such modules, taking evolution into consideration. Module formation, scoring by evolutionary criteria and visual clues reduce the amount of predictions to a manageable scale. Identification of transcription factor binding sites of particular functional importance is left to expert filtering. ReXSpecies 2 interacts with genome browsers to enable scientists to filter predictions together with other sequence-related data. Conclusions: Based on ReXSpecies 2, we derive plausible hypotheses about the regulation of pluripotency. Our tool is designed to analyze transcription factor binding site predictions considering their common pattern of occurrence, highlighting their evolutionary history.
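The matrix-based motif search mentioned in this abstract can be illustrated with a minimal sketch. It is not the ReXSpecies 2 implementation: the count matrix, the pseudocount, the uniform background of 0.25 and the score threshold are all hypothetical values chosen for illustration, and the function names `build_pssm` and `scan` are assumptions. The sketch converts base counts into log-odds scores and slides the matrix along a sequence, reporting windows whose summed score reaches the threshold.

```python
import math

BASES = "ACGT"

# Hypothetical base counts for a 4-position motif (one dict per position).
# These numbers are illustrative only and do not come from any database.
COUNTS = [
    {"A": 0, "C": 0, "G": 1, "T": 9},
    {"A": 9, "C": 0, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 0, "T": 9},
    {"A": 9, "C": 1, "G": 0, "T": 0},
]


def build_pssm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds scores."""
    pssm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + len(BASES) * pseudocount
        pssm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pssm


def scan(sequence, pssm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pssm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


pssm = build_pssm(COUNTS)
hits = scan("GGTATAGG", pssm, threshold=4.0)
```

Raising the threshold trades sensitivity for fewer false positives; the additional filtering steps the abstract describes (redundancy removal, phylogenetic grouping, module formation) would operate on the resulting hit list and are not sketched here.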
Affiliation(s)
- Stephan Struckmann
  - University of Rostock, Institute for Biostatistics and Informatics in Medicine and Ageing Research, Heydemannstrasse 8, 18057 Rostock, Germany
7
Kirkilionis M, Janus U, Sbano L. Multi-scale genetic dynamic modelling I: an algorithm to compute generators. Theory Biosci 2011; 130:165-82. [DOI: 10.1007/s12064-011-0125-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2009] [Accepted: 02/14/2011] [Indexed: 10/18/2022]
8
Fuellen G, Struckmann S. Evolution of gene regulation of pluripotency--the case for wiki tracks at genome browsers. Biol Direct 2010; 5:67. [PMID: 21190561 PMCID: PMC3024949 DOI: 10.1186/1745-6150-5-67] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/06/2010] [Accepted: 12/29/2010] [Indexed: 12/23/2022] Open
Abstract
Background: Experimentally validated data on gene regulation are hard to obtain. In particular, information about transcription factor binding sites in regulatory regions is scattered around in the literature. This impedes their systematic in-context analysis, e.g. the inference of their conservation in evolutionary history. Results: We demonstrate the power of integrative bioinformatics by including curated transcription factor binding site information into the UCSC genome browser, using wiki and custom tracks, which enable easy publication of annotation data. Data integration allows us to investigate the evolution of gene regulation of the pluripotency-associated genes Oct4, Sox2 and Nanog. For the first time, experimentally validated transcription factor binding sites in the regulatory regions of all three genes were assembled together based on manual curation of data from 39 publications. Using the UCSC genome browser, these data were then visualized in the context of multi-species conservation based on genomic alignment. We confirm previous hypotheses regarding the evolutionary age of specific regulatory patterns, establishing their "deep homology". We also confirm some other principles of Carroll's "Genetic theory of Morphological Evolution", such as "mosaic pleiotropy", exemplified by the dual role of Sox2 reflected in its regulatory region. Conclusions: We were able to elucidate some aspects of the evolution of gene regulation for three genes associated with pluripotency. Based on the expected return on investment for the community, we encourage other scientists to contribute experimental data on gene regulation (original work as well as data collected for reviews) to the UCSC system, to enable studies of the evolution of gene regulation on a large scale, and to report their findings. Reviewers: This article was reviewed by Dr. Gustavo Glusman and Dr. Juan Caballero, Institute for Systems Biology, Seattle, USA (nominated by Dr. Doron Lancet, Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel), Dr. Niels Grabe, TIGA Center (BIOQUANT) and Medical Systems Biology Group, Institute of Medical Biometry and Informatics, University Hospital Heidelberg, Germany (nominated by Dr. Mikhail Gelfand, Department of Bioinformatics, Institute of Information Transfer Problems, Russian Academy of Science, Moscow, Russian Federation) and Dr. Franz-Josef Müller, Center for Regenerative Medicine, The Scripps Research Institute, La Jolla, CA, USA and University Hospital for Psychiatry and Psychotherapy (part of ZIP gGmbH), University of Kiel, Germany (nominated by Dr. Trey Ideker, University of California, San Diego, La Jolla CA, United States).
Affiliation(s)
- Georg Fuellen
  - Institute for Biostatistics and Informatics in Medicine and Ageing Research - IBIMA, University of Rostock, Medical Faculty, Ernst-Heydemann-Str. 8, 18057 Rostock, Germany
9
Warsow G, Greber B, Falk SSI, Harder C, Siatkowski M, Schordan S, Som A, Endlich N, Schöler H, Repsilber D, Endlich K, Fuellen G. ExprEssence--revealing the essence of differential experimental data in the context of an interaction/regulation net-work. BMC Systems Biology 2010; 4:164. [PMID: 21118483 PMCID: PMC3012047 DOI: 10.1186/1752-0509-4-164] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.4] [Received: 07/05/2010] [Accepted: 11/30/2010] [Indexed: 12/15/2022]
Abstract
Background: Experimentalists are overwhelmed by high-throughput data and there is an urgent need to condense information into simple hypotheses. For example, large amounts of microarray and deep sequencing data are becoming available, describing a variety of experimental conditions such as gene knockout and knockdown, the effect of interventions, and the differences between tissues and cell lines. Results: To address this challenge, we developed a method, implemented as a Cytoscape plugin called ExprEssence. As input we take a network of interaction, stimulation and/or inhibition links between genes/proteins, and differential data, such as gene expression data, tracking an intervention or development in time. We condense the network, highlighting those links across which the largest changes can be observed. Highlighting is based on a simple formula inspired by the law of mass action. We can interactively modify the threshold for highlighting and instantaneously visualize results. We applied ExprEssence to three scenarios describing kidney podocyte biology, pluripotency and ageing: 1) We identify putative processes involved in podocyte (de-)differentiation and validate one prediction experimentally. 2) We predict and validate the expression level of a transcription factor involved in pluripotency. 3) Finally, we generate plausible hypotheses on the role of apoptosis, cell cycle deregulation and DNA repair in ageing data obtained from the hippocampus. Conclusion: Reducing the size of gene/protein networks to the few links affected by large changes allows screening for putative mechanistic relationships among the genes/proteins that are involved in adaptation to different experimental conditions, yielding important hypotheses, insights and suggestions for new experiments. We note that we do not focus on the identification of 'active subnetworks'. Instead we focus on the identification of single links (which may or may not form subnetworks), and these single links are much easier to validate experimentally than submodules. ExprEssence is available at http://sourceforge.net/projects/expressence/.
Affiliation(s)
- Gregor Warsow
  - Institute for Biostatistics and Informatics in Medicine and Ageing Research, University of Rostock, Ernst-Heydemann-Strasse 8, Rostock, Germany