Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Bednarz P, Wilczyński B. Supervised learning method for predicting chromatin boundary associated insulator elements. J Bioinform Comput Biol 2015;12:1442006. [PMID: 25385081 DOI: 10.1142/s0219720014420062] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

For:	Bednarz P, Wilczyński B. Supervised learning method for predicting chromatin boundary associated insulator elements. J Bioinform Comput Biol 2015;12:1442006. [PMID: 25385081 DOI: 10.1142/s0219720014420062] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Number

Cited by Other Article(s)

Wall BPG, Nguyen M, Harrell JC, Dozmorov MG. Machine and Deep Learning Methods for Predicting 3D Genome Organization. Methods Mol Biol 2025;2856:357-400. [PMID: 39283464 DOI: 10.1007/978-1-0716-4136-1_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/25/2024]

Gnocis: An integrated system for interactive and reproducible analysis and modelling of cis-regulatory elements in Python 3. PLoS One 2022;17:e0274338. [PMID: 36084008 PMCID: PMC9462789 DOI: 10.1371/journal.pone.0274338] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2021] [Accepted: 08/25/2022] [Indexed: 11/23/2022] Open

Abstract

Gene expression is regulated through cis-regulatory elements (CREs), among which are promoters, enhancers, Polycomb/Trithorax Response Elements (PREs), silencers and insulators. Computational prediction of CREs can be achieved using a variety of statistical and machine learning methods combined with different feature space formulations. Although Python packages for DNA sequence feature sets and for machine learning are available, no existing package facilitates the combination of DNA sequence feature sets with machine learning methods for the genome-wide prediction of candidate CREs. We here present Gnocis, a Python package that streamlines the analysis and the modelling of CRE sequences by providing extensible APIs and implementing the glue required for combining feature sets and models for genome-wide prediction. Gnocis implements a variety of base feature sets, including motif pair occurrence frequencies and the k-spectrum mismatch kernel. It integrates with Scikit-learn and TensorFlow for state-of-the-art machine learning. Gnocis additionally implements a broad suite of tools for the handling and preparation of sequence, region and curve data, which can be useful for general DNA bioinformatics in Python. We also present Deep-MOCCA, a neural network architecture inspired by SVM-MOCCA that achieves moderate to high generalization without prior motif knowledge. To demonstrate the use of Gnocis, we applied multiple machine learning methods to the modelling of D. melanogaster PREs, including a Convolutional Neural Network (CNN), making this the first study to model PREs with CNNs. The models are readily adapted to new CRE modelling problems and to other organisms. In order to produce a high-performance, compiled package for Python 3, we implemented Gnocis in Cython. Gnocis can be installed using the PyPI package manager by running ‘pip install gnocis’. The source code is available on GitHub, at https://github.com/bjornbredesen/gnocis.

Collapse

Kurbidaeva A, Purugganan M. Insulators in Plants: Progress and Open Questions. Genes (Basel) 2021;12:genes12091422. [PMID: 34573404 PMCID: PMC8470105 DOI: 10.3390/genes12091422] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2021] [Revised: 09/08/2021] [Accepted: 09/14/2021] [Indexed: 11/16/2022] Open

Bredesen BA, Rehmsmeier M. MOCCA: a flexible suite for modelling DNA sequence motif occurrence combinatorics. BMC Bioinformatics 2021;22:234. [PMID: 33962556 PMCID: PMC8105988 DOI: 10.1186/s12859-021-04143-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2020] [Accepted: 04/21/2021] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Cis-regulatory elements (CREs) are DNA sequence segments that regulate gene expression. Among CREs are promoters, enhancers, Boundary Elements (BEs) and Polycomb Response Elements (PREs), all of which are enriched in specific sequence motifs that form particular occurrence landscapes. We have recently introduced a hierarchical machine learning approach (SVM-MOCCA) in which Support Vector Machines (SVMs) are applied on the level of individual motif occurrences, modelling local sequence composition, and then combined for the prediction of whole regulatory elements. We used SVM-MOCCA to predict PREs in Drosophila and found that it was superior to other methods. However, we did not publish a polished implementation of SVM-MOCCA, which can be useful for other researchers, and we only tested SVM-MOCCA with IUPAC motifs and PREs.

RESULTS

We here present an expanded suite for modelling CRE sequences in terms of motif occurrence combinatorics-Motif Occurrence Combinatorics Classification Algorithms (MOCCA). MOCCA contains efficient implementations of several modelling methods, including SVM-MOCCA, and a new method, RF-MOCCA, a Random Forest-derivative of SVM-MOCCA. We used SVM-MOCCA and RF-MOCCA to model Drosophila PREs and BEs in cross-validation experiments, making this the first study to model PREs with Random Forests and the first study that applies the hierarchical MOCCA approach to the prediction of BEs. Both models significantly improve generalization to PREs and boundary elements beyond that of previous methods-including 4-spectrum and motif occurrence frequency Support Vector Machines and Random Forests-, with RF-MOCCA yielding the best results.

CONCLUSION

MOCCA is a flexible and powerful suite of tools for the motif-based modelling of CRE sequences in terms of motif composition. MOCCA can be applied to any new CRE modelling problems where motifs have been identified. MOCCA supports IUPAC and Position Weight Matrix (PWM) motifs. For ease of use, MOCCA implements generation of negative training data, and additionally a mode that requires only that the user specifies positives, motifs and a genome. MOCCA is licensed under the MIT license and is available on Github at https://github.com/bjornbredesen/MOCCA .

Collapse

Gerassis S, Abad A, Taboada J, Saavedra Á, Giráldez E. A comparative analysis of health surveillance strategies for administrative video display terminal employees. Biomed Eng Online 2019;18:118. [PMID: 31829225 PMCID: PMC6907276 DOI: 10.1186/s12938-019-0737-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2019] [Accepted: 11/29/2019] [Indexed: 11/11/2022] Open

Sauerwald N, Kingsford C. Quantifying the similarity of topological domains across normal and cancer human cell types. Bioinformatics 2019;34:i475-i483. [PMID: 29949963 PMCID: PMC6022623 DOI: 10.1093/bioinformatics/bty265] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023] Open

Sefer E, Kingsford C. Semi-nonparametric modeling of topological domain formation from epigenetic data. Algorithms Mol Biol 2019;14:4. [PMID: 30867673 PMCID: PMC6399866 DOI: 10.1186/s13015-019-0142-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2018] [Accepted: 02/26/2019] [Indexed: 01/01/2023] Open

Herman-Izycka J, Wlasnowolski M, Wilczynski B. Taking promoters out of enhancers in sequence based predictions of tissue-specific mammalian enhancers. BMC Med Genomics 2017;10:34. [PMID: 28589862 PMCID: PMC5461523 DOI: 10.1186/s12920-017-0264-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open