1
Vorobyeva NE, Krasnov AN, Erokhin M, Chetverina D, Mazina M. Su(Hw) interacts with Combgap to establish long-range chromatin contacts. Epigenetics Chromatin 2024; 17:17. [PMID: 38773468 PMCID: PMC11106861 DOI: 10.1186/s13072-024-00541-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Accepted: 05/16/2024] [Indexed: 05/23/2024]
Abstract
BACKGROUND: Insulator-binding proteins (IBPs) play a critical role in genome architecture by forming and maintaining contact domains. While the involvement of several IBPs in organising chromatin architecture in Drosophila has been described, the specific contribution of the Suppressor of Hairy wing (Su(Hw)) insulator-binding protein to genome topology remains unclear.
RESULTS: In this study, we provide evidence for the existence of long-range interactions between chromatin-bound Su(Hw) and Combgap, which was first characterised as a Polycomb response element-binding protein. Loss of Su(Hw) binding to chromatin results in the disappearance of Su(Hw)-Combgap long-range interactions and in a decrease in spatial self-interactions among a subset of Su(Hw)-bound genome sites. Our findings suggest that Su(Hw)-Combgap long-range interactions are associated with active chromatin rather than Polycomb-directed repression. Furthermore, we observe that the majority of transcription start sites that are down-regulated upon loss of Su(Hw) binding to chromatin are located within 2 kb of Combgap peaks and exhibit Su(Hw)-dependent changes in the binding of Combgap and transcriptional regulators.
CONCLUSIONS: This study demonstrates that the Su(Hw) insulator-binding protein can form long-range interactions with Combgap, a Polycomb response element-binding protein, and that these interactions are associated with active chromatin factors rather than with Polycomb-dependent repression.
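The "within 2 kb of Combgap peaks" criterion in the abstract is a nearest-feature distance test. A minimal sketch of such a test follows, assuming single-chromosome integer coordinates; the function names and the toy positions are hypothetical and are not taken from the study.

```python
from bisect import bisect_left

def nearest_peak_distance(tss, sorted_peaks):
    """Distance in bp from one TSS to the closest peak position.

    Assumes sorted_peaks is a non-empty, ascending list of positions
    on the same chromosome as the TSS.
    """
    i = bisect_left(sorted_peaks, tss)
    # Only the peaks immediately left and right of the TSS can be closest.
    candidates = sorted_peaks[max(i - 1, 0):i + 1]
    return min(abs(tss - p) for p in candidates)

def fraction_near_peaks(tss_positions, peak_positions, window=2000):
    """Fraction of TSSs lying within `window` bp of any peak."""
    peaks = sorted(peak_positions)
    if not peaks or not tss_positions:
        return 0.0
    near = sum(1 for t in tss_positions
               if nearest_peak_distance(t, peaks) <= window)
    return near / len(tss_positions)

# Toy coordinates (hypothetical): two peaks, four TSSs.
toy_peaks = [50000, 10000]
toy_tss = [9000, 12500, 48500, 80000]
print(fraction_near_peaks(toy_tss, toy_peaks))
```

A real analysis would repeat this per chromosome and would typically use peak summits or intervals from a peak caller rather than single hand-written positions.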
Affiliation(s)
- Nadezhda E Vorobyeva
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, 119334, Russia
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Russian Academy of Sciences, Moscow, 119334, Russia
- Alexey N Krasnov
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, 119334, Russia
- Maksim Erokhin
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, 119334, Russia
- Darya Chetverina
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, 119334, Russia
- Marina Mazina
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, 119334, Russia
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Russian Academy of Sciences, Moscow, 119334, Russia
2
Salzler HR, Vandadi V, McMichael BD, Brown JC, Boerma SA, Leatham-Jensen MP, Adams KM, Meers MP, Simon JM, Duronio RJ, McKay DJ, Matera AG. Distinct roles for canonical and variant histone H3 lysine-36 in Polycomb silencing. Science Advances 2023; 9:eadf2451. [PMID: 36857457 PMCID: PMC9977188 DOI: 10.1126/sciadv.adf2451] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 10/10/2022] [Accepted: 01/31/2023] [Indexed: 05/26/2023]
Abstract
Polycomb complexes regulate cell type-specific gene expression programs through heritable silencing of target genes. Trimethylation of histone H3 lysine 27 (H3K27me3) is essential for this process. Perturbation of H3K36 is thought to interfere with H3K27me3. We show that mutants of Drosophila replication-dependent (H3.2K36R) or replication-independent (H3.3K36R) histone H3 genes generally maintain Polycomb silencing and reach later stages of development. In contrast, combined (H3.3K36RH3.2K36R) mutants display widespread Hox gene misexpression and fail to develop past the first larval stage. Chromatin profiling revealed that the H3.2K36R mutation disrupts H3K27me3 levels broadly throughout silenced domains, whereas these regions are mostly unaffected in H3.3K36R animals. Analysis of H3.3 distributions showed that this histone is enriched at presumptive Polycomb response elements located outside of silenced domains but relatively depleted from those inside. We conclude that H3.2 and H3.3 K36 residues collaborate to repress Hox genes using different mechanisms.
Affiliation(s)
- Harmony R. Salzler
  - Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC, USA
- Vasudha Vandadi
  - Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC, USA
- Benjamin D. McMichael
  - Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC, USA
  - Department of Biology, University of North Carolina, Chapel Hill, NC, USA
- John C. Brown
  - Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC, USA
- Sally A. Boerma
  - Department of Biology, Carleton College, Northfield, MN, USA
- Mary P. Leatham-Jensen
  - Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC, USA
- Kirsten M. Adams
  - Department of Biology, University of North Carolina, Chapel Hill, NC, USA
- Michael P. Meers
  - Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC, USA
  - Curriculum in Genetics and Molecular Biology, University of North Carolina, Chapel Hill, NC, USA
- Jeremy M. Simon
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, USA
- Robert J. Duronio
  - Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC, USA
  - Department of Biology, University of North Carolina, Chapel Hill, NC, USA
  - Curriculum in Genetics and Molecular Biology, University of North Carolina, Chapel Hill, NC, USA
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, USA
- Daniel J. McKay
  - Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC, USA
  - Department of Biology, University of North Carolina, Chapel Hill, NC, USA
  - Curriculum in Genetics and Molecular Biology, University of North Carolina, Chapel Hill, NC, USA
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
- A. Gregory Matera
  - Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC, USA
  - Department of Biology, University of North Carolina, Chapel Hill, NC, USA
  - Curriculum in Genetics and Molecular Biology, University of North Carolina, Chapel Hill, NC, USA
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, USA
3
Solorzano J, Carrillo-de Santa Pau E, Laguna T, Busturia A. A genome-wide computational approach to define microRNA-Polycomb/trithorax gene regulatory circuits in Drosophila. Dev Biol 2023; 495:63-75. [PMID: 36596335 DOI: 10.1016/j.ydbio.2022.12.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Revised: 12/07/2022] [Accepted: 12/26/2022] [Indexed: 01/02/2023]
Abstract
Characterization of gene regulatory networks is fundamental to understanding homeostatic development. This process can be simplified by analyzing relatively simple genomes such as the genome of Drosophila melanogaster. In this work we have developed a computational framework in Drosophila to explore for the presence of gene regulatory circuits between two large groups of transcriptional regulators: the epigenetic group of the Polycomb/trithorax (PcG/trxG) proteins and the microRNAs (miRNAs). We have searched genome-wide for miRNA targets in PcG/trxG transcripts as well as for Polycomb Response Elements (PREs) in miRNA genes. Our results show that 10% of the analyzed miRNAs could be controlling PcG/trxG gene expression, while 40% of those miRNAs are putatively controlled by the selected set of PcG/trxG proteins. The integration of these analyses has resulted in the predicted existence of 3 classes of miRNA-PcG/trxG crosstalk interactions that define potential regulatory circuits. In the first class, miRNA-PcG circuits are defined by miRNAs that reciprocally crosstalk with PcG. In the second, miRNA-trxG circuits are defined by miRNAs that reciprocally crosstalk with trxG. In the third class, miRNA-PcG/trxG shared circuits are defined by miRNAs that crosstalk with both PcG and trxG regulators. These putative regulatory circuits may uncover a novel mechanism in Drosophila for the control of PcG/trxG and miRNAs levels of expression. The computational framework developed here for Drosophila melanogaster can serve as a model case for similar analyses in other species. Moreover, our work provides, for the first time, a new and useful resource for the Drosophila community to consult prior to experimental studies investigating the epigenetic regulatory networks of miRNA-PcG/trxG mediated gene expression.
Affiliation(s)
- Jacobo Solorzano
  - Centro de Biología Molecular Severo Ochoa, CSIC-UAM, Nicolas Cabrera 1, 28049, Madrid, Spain
  - Centre de Recherches en Cancerologie de Toulouse, 2 Av. Hubert Curien, 31100, Toulouse, France
- Enrique Carrillo-de Santa Pau
  - Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, CEI UAM+CSIC, 28049, Madrid, Spain
- Teresa Laguna
  - Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, CEI UAM+CSIC, 28049, Madrid, Spain
- Ana Busturia
  - Centro de Biología Molecular Severo Ochoa, CSIC-UAM, Nicolas Cabrera 1, 28049, Madrid, Spain
4
Ornelas-Ayala D, Cortés-Quiñones C, Olvera-Herrera J, García-Ponce B, Garay-Arroyo A, Álvarez-Buylla ER, Sanchez MDLP. A Green Light to Switch on Genes: Revisiting Trithorax on Plants. Plants (Basel, Switzerland) 2022; 12:75. [PMID: 36616203 PMCID: PMC9824250 DOI: 10.3390/plants12010075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Revised: 12/18/2022] [Accepted: 12/19/2022] [Indexed: 06/17/2023]
Abstract
The Trithorax Group (TrxG) is a highly conserved multiprotein activation complex, initially defined by its activity antagonistic to the PcG repressor complex. TrxG regulates transcriptional activation by the deposition of H3K4me3 and H3K36me3 marks. Based on function and evolutionary origin, several proteins have been defined as TrxG in plants; nevertheless, little is known about their interactions and whether they can form TrxG complexes. Recent evidence suggests the existence of new TrxG components as well as new interactions of some TrxG complexes that may be acting in specific tissues in plants. In this review, we bring together the latest research on the topic, exploring the interactions and roles of TrxG proteins at different developmental stages, required for the fine-tuned transcriptional activation of genes at the right time and place. In doing so, we aim to shed light on the molecular mechanism by which TrxG is recruited and regulates transcription.
5
Torosin NS, Golla TR, Lawlor MA, Cao W, Ellison CE. Mode and Tempo of 3D Genome Evolution in Drosophila. Mol Biol Evol 2022; 39:6750036. [PMID: 36201625 PMCID: PMC9641997 DOI: 10.1093/molbev/msac216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
Abstract
Topologically associating domains (TADs) are thought to play an important role in preventing gene misexpression by spatially constraining enhancer-promoter contacts. The deleterious nature of gene misexpression implies that TADs should, therefore, be conserved among related species. Several early studies comparing chromosome conformation between species reported high levels of TAD conservation; however, more recent studies have questioned these results. Furthermore, recent work suggests that TAD reorganization is not associated with extensive changes in gene expression. Here, we investigate the evolutionary conservation of TADs among 11 species of Drosophila. We use Hi-C data to identify TADs in each species and employ a comparative phylogenetic approach to derive empirical estimates of the rate of TAD evolution. Surprisingly, we find that TADs evolve rapidly. However, we also find that the rate of evolution depends on the chromatin state of the TAD, with TADs enriched for developmentally regulated chromatin evolving significantly slower than TADs enriched for broadly expressed, active chromatin. We also find that, after controlling for differences in chromatin state, highly conserved TADs do not exhibit higher levels of gene expression constraint. These results suggest that, in general, most TADs evolve rapidly and their divergence is not associated with widespread changes in gene expression. However, higher levels of evolutionary conservation and gene expression constraints in TADs enriched for developmentally regulated chromatin suggest that these TAD subtypes may be more important for regulating gene expression, likely due to the larger number of long-distance enhancer-promoter contacts associated with developmental genes.
Affiliation(s)
- Nicole S Torosin
  - Department of Genetics, Rutgers University, Piscataway, NJ 08854, USA
- Tirupathi Rao Golla
  - LifeCell, Kelambakkam Main Road, Keelakottaiyur, Chennai 600127, Tamil Nadu, India
- Matthew A Lawlor
  - Department of Genetics, Rutgers University, Piscataway, NJ 08854, USA
- Weihuan Cao
  - Department of Genetics, Rutgers University, Piscataway, NJ 08854, USA
6
Gnocis: An integrated system for interactive and reproducible analysis and modelling of cis-regulatory elements in Python 3. PLoS One 2022; 17:e0274338. [PMID: 36084008 PMCID: PMC9462789 DOI: 10.1371/journal.pone.0274338] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/07/2021] [Accepted: 08/25/2022] [Indexed: 11/23/2022]
Abstract
Gene expression is regulated through cis-regulatory elements (CREs), among which are promoters, enhancers, Polycomb/Trithorax Response Elements (PREs), silencers and insulators. Computational prediction of CREs can be achieved using a variety of statistical and machine learning methods combined with different feature space formulations. Although Python packages for DNA sequence feature sets and for machine learning are available, no existing package facilitates the combination of DNA sequence feature sets with machine learning methods for the genome-wide prediction of candidate CREs. We here present Gnocis, a Python package that streamlines the analysis and the modelling of CRE sequences by providing extensible APIs and implementing the glue required for combining feature sets and models for genome-wide prediction. Gnocis implements a variety of base feature sets, including motif pair occurrence frequencies and the k-spectrum mismatch kernel. It integrates with Scikit-learn and TensorFlow for state-of-the-art machine learning. Gnocis additionally implements a broad suite of tools for the handling and preparation of sequence, region and curve data, which can be useful for general DNA bioinformatics in Python. We also present Deep-MOCCA, a neural network architecture inspired by SVM-MOCCA that achieves moderate to high generalization without prior motif knowledge. To demonstrate the use of Gnocis, we applied multiple machine learning methods to the modelling of D. melanogaster PREs, including a Convolutional Neural Network (CNN), making this the first study to model PREs with CNNs. The models are readily adapted to new CRE modelling problems and to other organisms. In order to produce a high-performance, compiled package for Python 3, we implemented Gnocis in Cython. Gnocis can be installed using the PyPI package manager by running ‘pip install gnocis’. The source code is available on GitHub, at https://github.com/bjornbredesen/gnocis.
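To make the idea of a sequence feature set concrete, here is a minimal pure-Python sketch of a plain k-spectrum (normalised k-mer frequency vector). It does not use or reproduce the Gnocis API, it omits the mismatch handling of the mismatch kernel named in the abstract, and the function name `k_spectrum` is hypothetical.

```python
from collections import Counter
from itertools import product

def k_spectrum(seq, k=4):
    """Return the k-mer frequency vector of a DNA sequence.

    The vector has 4**k entries in lexicographic A/C/G/T order.
    Windows containing any other character (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    # Normalise so sequences of different length are comparable.
    return [counts[m] / total if total else 0.0 for m in kmers]

vec = k_spectrum("ACGT", k=2)
print(len(vec))  # 16 possible dinucleotides
```

A vector like this can be passed as one feature row to any standard classifier; how Gnocis itself constructs and combines features is described in its own documentation.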
7
Vijayanathan M, Trejo-Arellano MG, Mozgová I. Polycomb Repressive Complex 2 in Eukaryotes-An Evolutionary Perspective. Epigenomes 2022; 6:3. [PMID: 35076495 PMCID: PMC8788455 DOI: 10.3390/epigenomes6010003] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/24/2021] [Revised: 01/12/2022] [Accepted: 01/12/2022] [Indexed: 12/23/2022]
Abstract
Polycomb repressive complex 2 (PRC2) represents a group of evolutionarily conserved multi-subunit complexes that repress gene transcription by introducing trimethylation of lysine 27 on histone 3 (H3K27me3). PRC2 activity is of key importance for cell identity specification and developmental phase transitions in animals and plants. The composition, biochemistry, and developmental function of PRC2 in animal and flowering plant model species are relatively well described. Recent evidence demonstrates the presence of PRC2 complexes in various eukaryotic supergroups, suggesting conservation of the complex and its function. Here, we provide an overview of the current understanding of PRC2-mediated repression in different representatives of eukaryotic supergroups with a focus on the green lineage. By comparison of PRC2 in different eukaryotes, we highlight the possible common and diverged features suggesting evolutionary implications and outline emerging questions and directions for future research of polycomb repression and its evolution.
Affiliation(s)
- Mallika Vijayanathan
  - Biology Centre, Institute of Plant Molecular Biology, Czech Academy of Sciences, 370 05 Ceske Budejovice, Czech Republic
- María Guadalupe Trejo-Arellano
  - Biology Centre, Institute of Plant Molecular Biology, Czech Academy of Sciences, 370 05 Ceske Budejovice, Czech Republic
- Iva Mozgová
  - Biology Centre, Institute of Plant Molecular Biology, Czech Academy of Sciences, 370 05 Ceske Budejovice, Czech Republic
  - Faculty of Science, University of South Bohemia, 370 05 Ceske Budejovice, Czech Republic
8
Su(Hw) primes 66D and 7F Drosophila chorion genes loci for amplification through chromatin decondensation. Sci Rep 2021; 11:16963. [PMID: 34417521 PMCID: PMC8379230 DOI: 10.1038/s41598-021-96488-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/16/2021] [Accepted: 08/11/2021] [Indexed: 11/11/2022]
Abstract
Suppressor of Hairy wing [Su(Hw)] is an insulator protein that participates in regulating chromatin architecture and gene repression in Drosophila. In previous studies we have shown that Su(Hw) is also required for pre-replication complex (pre-RC) recruitment on Su(Hw)-bound sites (SBSs) in Drosophila S2 cells and pupa. Here, we describe the effect of Su(Hw) on developmentally regulated amplification of 66D and 7F Drosophila amplicons in follicle cells (DAFCs), widely used as models in replication studies. We show Su(Hw) binding co-localizes with all known DAFCs in Drosophila ovaries, whereas disruption of Su(Hw) binding to 66D and 7F DAFCs causes a two-fold decrease in the amplification of these loci. The complete loss of Su(Hw) binding to chromatin impairs pre-RC recruitment to all amplification regulatory regions of 66D and 7F loci at early oogenesis (prior to DAFCs amplification). These changes coincide with a considerable Su(Hw)-dependent condensation of chromatin at 66D and 7F loci. Although we observed the Brm, ISWI, Mi-2, and CHD1 chromatin remodelers at SBSs genome wide, their remodeler activity does not appear to be responsible for chromatin decondensation at the 66D and 7F amplification regulatory regions. We have discovered that, in addition to the CBP/Nejire and Chameau histone acetyltransferases, the Gcn5 acetyltransferase binds to 66D and 7F DAFCs at SBSs and this binding is dependent on Su(Hw). We propose that the main function of Su(Hw) in developmental amplification of 66D and 7F DAFCs is to establish a chromatin structure that is permissive to pre-RC recruitment.
9
Bredesen BA, Rehmsmeier M. MOCCA: a flexible suite for modelling DNA sequence motif occurrence combinatorics. BMC Bioinformatics 2021; 22:234. [PMID: 33962556 PMCID: PMC8105988 DOI: 10.1186/s12859-021-04143-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2020] [Accepted: 04/21/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Cis-regulatory elements (CREs) are DNA sequence segments that regulate gene expression. Among CREs are promoters, enhancers, Boundary Elements (BEs) and Polycomb Response Elements (PREs), all of which are enriched in specific sequence motifs that form particular occurrence landscapes. We have recently introduced a hierarchical machine learning approach (SVM-MOCCA) in which Support Vector Machines (SVMs) are applied on the level of individual motif occurrences, modelling local sequence composition, and then combined for the prediction of whole regulatory elements. We used SVM-MOCCA to predict PREs in Drosophila and found that it was superior to other methods. However, we did not publish a polished implementation of SVM-MOCCA, which can be useful for other researchers, and we only tested SVM-MOCCA with IUPAC motifs and PREs.
RESULTS: We here present an expanded suite for modelling CRE sequences in terms of motif occurrence combinatorics: Motif Occurrence Combinatorics Classification Algorithms (MOCCA). MOCCA contains efficient implementations of several modelling methods, including SVM-MOCCA, and a new method, RF-MOCCA, a Random Forest derivative of SVM-MOCCA. We used SVM-MOCCA and RF-MOCCA to model Drosophila PREs and BEs in cross-validation experiments, making this the first study to model PREs with Random Forests and the first study that applies the hierarchical MOCCA approach to the prediction of BEs. Both models significantly improve generalization to PREs and boundary elements beyond that of previous methods (including 4-spectrum and motif occurrence frequency Support Vector Machines and Random Forests), with RF-MOCCA yielding the best results.
CONCLUSION: MOCCA is a flexible and powerful suite of tools for the motif-based modelling of CRE sequences in terms of motif composition. MOCCA can be applied to any new CRE modelling problems where motifs have been identified. MOCCA supports IUPAC and Position Weight Matrix (PWM) motifs. For ease of use, MOCCA implements generation of negative training data, and additionally a mode that requires only that the user specifies positives, motifs and a genome. MOCCA is licensed under the MIT license and is available on GitHub at https://github.com/bjornbredesen/MOCCA.
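As a rough illustration of what an IUPAC motif occurrence is, the sketch below translates an IUPAC motif into a regular expression and lists every start position where it matches. This is not MOCCA's implementation or interface: the function name `motif_occurrences` and the example motif are hypothetical, and only the forward strand is scanned.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_occurrences(seq, motif):
    """Return 0-based start positions of all (possibly overlapping)
    forward-strand matches of an IUPAC motif in a DNA sequence."""
    body = "".join(IUPAC[c] for c in motif.upper())
    # A zero-width lookahead lets overlapping matches be reported.
    pattern = re.compile("(?=(" + body + "))")
    return [m.start() for m in pattern.finditer(seq.upper())]

print(motif_occurrences("GAGAGTG", "GWG"))  # [0, 2, 4]
```

Occurrence positions like these are the raw material that motif-based models then summarise, for example as counts per motif or as distances between motif pairs.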
Affiliation(s)
- Bjørn André Bredesen
  - Computational Biology Unit, Department of Informatics, University of Bergen, P.O. Box 7803, 5020, Bergen, Norway
- Marc Rehmsmeier
  - Department of Biology, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099, Berlin, Germany
10
Liu T, Chen JM, Zhang D, Zhang Q, Peng B, Xu L, Tang H. ApoPred: Identification of Apolipoproteins and Their Subfamilies With Multifarious Features. Front Cell Dev Biol 2021; 8:621144. [PMID: 33490085 PMCID: PMC7820372 DOI: 10.3389/fcell.2020.621144] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/25/2020] [Accepted: 11/24/2020] [Indexed: 01/24/2023]
Abstract
Apolipoproteins are a group of plasma proteins that are associated with a variety of diseases, such as hyperlipidemia, atherosclerosis, Alzheimer's disease, and diabetes. In order to investigate the function of apolipoproteins and to develop effective targets for related diseases, it is necessary to accurately identify and classify apolipoproteins. Although it is possible to identify apolipoproteins accurately through biochemical experiments, such experiments are expensive and time-consuming. This work aims to establish a high-efficiency and high-accuracy prediction model for recognition of apolipoproteins and their subfamilies. We first constructed a high-quality benchmark dataset including 270 apolipoproteins and 535 non-apolipoproteins. Based on the dataset, pseudo-amino acid composition (PseAAC) and composition of k-spaced amino acid pairs (CKSAAP) were used as input vectors. To improve the prediction accuracy and eliminate redundant information, analysis of variance (ANOVA) was used to rank the features, and incremental feature selection was used to obtain the best feature subset. A support vector machine (SVM) was used to construct the classification model, which achieved an accuracy of 97.27%, sensitivity of 96.30%, and specificity of 97.76% for discriminating apolipoproteins from non-apolipoproteins in 10-fold cross-validation. In addition, the same process was repeated to generate a new model for predicting apolipoprotein subfamilies. The new model achieved an overall accuracy of 95.93% in 10-fold cross-validation. Based on our proposed model, a convenient webserver called ApoPred was established, which can be freely accessed at http://tang-biolab.com/server/ApoPred/service.html. We expect that this work will contribute to apolipoprotein function research and drug development in relevant diseases.
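The three reported metrics follow directly from a confusion matrix. The sketch below shows the arithmetic; the counts used (260 true positives, 523 true negatives) are not stated in the abstract but are back-calculated assumptions that happen to reproduce the reported percentages on the 270/535 benchmark, treating the 10 cross-validation folds as pooled.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # fraction of positives recovered
        "specificity": tn / (tn + fp),   # fraction of negatives recovered
    }

# Assumed counts: 270 positives = 260 TP + 10 FN; 535 negatives = 523 TN + 12 FP.
metrics = classification_metrics(tp=260, fn=10, tn=523, fp=12)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

With these assumed counts the output is 97.27%, 96.30% and 97.76%, matching the figures quoted for the apolipoprotein versus non-apolipoprotein model.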
Affiliation(s)
- Ting Liu
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Jia-Mao Chen
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Dan Zhang
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Qian Zhang
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Bowen Peng
  - Division of International Cooperation, Health Commission of Sichuan Province, Chengdu, China
- Lei Xu
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
- Hua Tang
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
  - Central Nervous System Drug Key Laboratory of Sichuan Province, Luzhou, China
11
Torosin NS, Anand A, Golla TR, Cao W, Ellison CE. 3D genome evolution and reorganization in the Drosophila melanogaster species group. PLoS Genet 2020; 16:e1009229. [PMID: 33284803 PMCID: PMC7746282 DOI: 10.1371/journal.pgen.1009229] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 05/14/2020] [Revised: 12/17/2020] [Accepted: 10/27/2020] [Indexed: 01/17/2023]
Abstract
Topologically associating domains, or TADs, are functional units that organize chromosomes into 3D structures of interacting chromatin. TADs play an important role in regulating gene expression by constraining enhancer-promoter contacts and there is evidence that deletion of TAD boundaries leads to aberrant expression of neighboring genes. While the mechanisms of TAD formation have been well-studied, current knowledge on the patterns of TAD evolution across species is limited. Due to the integral role TADs play in gene regulation, their structure and organization is expected to be conserved during evolution. However, more recent research suggests that TAD structures diverge relatively rapidly. We use Hi-C chromosome conformation capture to measure evolutionary conservation of whole TADs and TAD boundary elements between D. melanogaster and D. triauraria, two early-branching species from the melanogaster species group which diverged ∼15 million years ago. We find that the majority of TADs have been reorganized since the common ancestor of D. melanogaster and D. triauraria, via a combination of chromosomal rearrangements and gain/loss of TAD boundaries. TAD reorganization between these two species is associated with a localized effect on gene expression, near the site of disruption. By separating TADs into subtypes based on their chromatin state, we find that different subtypes are evolving under different evolutionary forces. TADs enriched for broadly expressed, transcriptionally active genes are evolving rapidly, potentially due to positive selection, whereas TADs enriched for developmentally-regulated genes remain conserved, presumably due to their importance in restricting gene-regulatory element interactions. These results provide novel insight into the evolutionary dynamics of TADs and help to reconcile contradictory reports related to the evolutionary conservation of TADs and whether changes in TAD structure affect gene expression.
Affiliation(s)
- Nicole S. Torosin
  - Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States
- Aparna Anand
  - Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States
- Tirupathi Rao Golla
  - Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States
- Weihuan Cao
  - Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States
- Christopher E. Ellison
  - Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States