1
Kravchuk EV, Ashniev GA, Gladkova MG, Orlov AV, Zaitseva ZG, Malkerov JA, Orlova NN. Sequence-Only Prediction of Super-Enhancers in Human Cell Lines Using Transformer Models. BIOLOGY 2025; 14:172. [PMID: 40001940 PMCID: PMC11852244 DOI: 10.3390/biology14020172] [Received: 01/01/2025] [Revised: 01/29/2025] [Accepted: 02/01/2025] [Indexed: 02/27/2025]
Abstract
The study applies transformer-based deep learning models to super-enhancer (SE) prediction in human tumor cell lines, with a specific focus on sequence-only features of super-enhancer and enhancer elements in the human genome. The proposed SE-prediction method applied GENA-LM, which handles long DNA sequences, to a classification task distinguishing super-enhancers from enhancers, using H3K36me, H3K4me1, H3K4me3 and H3K27ac landscape datasets from HeLa, HEK293, H2171, Jurkat, K562, MM1S and U87 cell lines. The model was fine-tuned on relevant sequence data, allowing for the analysis of extended genomic sequences without the need for epigenetic markers as proposed in earlier approaches. The study achieved balanced accuracy metrics surpassing those of previous models such as SENet, particularly in HEK293 and K562 cell lines. It was also shown that super-enhancers frequently co-localize with epigenetic marks such as H3K4me3 and H3K27ac. The attention mechanism of the model provided insights into the sequence features contributing to SE classification, indicating a correlation between sequence-only features and the mentioned epigenetic landscapes. These findings support the potential use of transformer models in further genomic sequence analysis for bioinformatics applications in enhancer/super-enhancer characterization and gene regulation studies.
Affiliation(s)
- Ekaterina V. Kravchuk
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia; (E.V.K.); (G.A.A.); (M.G.G.); (Z.G.Z.); (J.A.M.)
- German A. Ashniev
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia; (E.V.K.); (G.A.A.); (M.G.G.); (Z.G.Z.); (J.A.M.)
  - Faculty of Biology, Lomonosov Moscow State University, Leninskiye Gory, MSU, 1-12, 119991 Moscow, Russia
  - Institute for Information Transmission Problems RAS, 127051 Moscow, Russia
- Marina G. Gladkova
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia; (E.V.K.); (G.A.A.); (M.G.G.); (Z.G.Z.); (J.A.M.)
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, GSP-1, Leninskiye Gory, MSU, 1-73, 119234 Moscow, Russia
- Alexey V. Orlov
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia; (E.V.K.); (G.A.A.); (M.G.G.); (Z.G.Z.); (J.A.M.)
- Zoia G. Zaitseva
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia; (E.V.K.); (G.A.A.); (M.G.G.); (Z.G.Z.); (J.A.M.)
- Juri A. Malkerov
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia; (E.V.K.); (G.A.A.); (M.G.G.); (Z.G.Z.); (J.A.M.)
- Natalia N. Orlova
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia; (E.V.K.); (G.A.A.); (M.G.G.); (Z.G.Z.); (J.A.M.)
2
Balzanelli MG, Rastmanesh R, Distratis P, Lazzaro R, Inchingolo F, Del Prete R, Pham VH, Aityan SK, Cong TT, Nguyen KCD, Isacco CG. The Role of SARS-CoV-2 Spike Protein in Long-term Damage of Tissues and Organs, the Underestimated Role of Retrotransposons and Stem Cells, a Working Hypothesis. Endocr Metab Immune Disord Drug Targets 2025; 25:85-98. [PMID: 38468535 DOI: 10.2174/0118715303283480240227113401] [Received: 10/30/2023] [Revised: 02/09/2024] [Accepted: 02/09/2024] [Indexed: 03/13/2024]
Abstract
Coronavirus disease 2019 (COVID-19) is a respiratory disease in which the Spike protein of SARS-CoV-2 plays a key role in transferring the viral genomic code into target cells. The Spike protein, found on the surface of the SARS-CoV-2 virus, latches onto angiotensin-converting enzyme 2 receptors (ACE2r) on target cells. The RNA genome of coronaviruses, with an average length of 29 kb, is the longest among all RNA viruses and comprises six to ten open reading frames (ORFs) responsible for encoding the replicase and structural proteins of the virus. Each component of the viral genome is inserted into a helical nucleocapsid surrounded by a lipid bilayer. The Spike protein is responsible for damage to several organs and tissues, even leading to severe impairments and long-term disabilities. Spike protein could also be the cause of the long-term post-infectious conditions known as Long COVID-19, characterized by a group of unresponsive idiopathic severe neuro- and cardiovascular disorders, including strokes, cardiopathies, neuralgias, fibromyalgia, and Guillain-Barré-like disease. In this paper, we suggest a pervasive mechanism whereby Spike proteins, from either SARS-CoV-2 mRNA or mRNA vaccines, tend to enter mature cells as well as progenitor, multipotent, and pluripotent stem cells (SCs), altering genome integrity. This will eventually lead to the production of newly affected clones and mature cells. The hypothesis presented in this paper proposes that mRNA integration into DNA occurs through several components of evolutionary genetic mechanisms such as retrotransposons and retrotransposition, LINE-1 or L1 (long interspersed element-1), and ORF-1 and ORF-2, which are responsible for the generation of retrogenes. Once the integration phase is concluded, somatic cells, progenitor cells, and SCs employ different silencing mechanisms. DNA methylation, followed by histone modification, begins to generate unlimited lines of affected cells and clones that form affected tissues characterized by abnormal patterns that become targets of systemic immune cells, generating uncontrolled inflammatory conditions, as observed in both Long COVID-19 syndrome and the mRNA vaccine.
Affiliation(s)
- Mario G Balzanelli
  - 118 SET, Department of Pre-hospital and Emergency, SG Giuseppe Moscati Hospital, 74120 Taranto, Italy
- Reza Rastmanesh
  - Department of Nutrition and Metabolism, The Nutrition Society, Boyd Orr House, 10 Cambridge Court, 210 Shepherds Bush Road, London, UK
- Pietro Distratis
  - 118 SET, Department of Pre-hospital and Emergency, SG Giuseppe Moscati Hospital, 74120 Taranto, Italy
- Rita Lazzaro
  - 118 SET, Department of Pre-hospital and Emergency, SG Giuseppe Moscati Hospital, 74120 Taranto, Italy
- Francesco Inchingolo
  - Department of Interdisciplinary Medicine, Section of Microbiology and Virology, School of Medicine, University of Bari "Aldo Moro", 70124 Bari, Italy
- Raffaele Del Prete
  - Department of Interdisciplinary Medicine, Section of Microbiology and Virology, School of Medicine, University of Bari "Aldo Moro", 70124 Bari, Italy
- Van H Pham
  - Phan Chau Trinh University of Medicine, Quang Nam 70000, Vietnam
- Sergey K Aityan
  - Northwestern University, Multidisciplinary Research Center, Oakland, CA 94612, USA
- Toai Tran Cong
  - Pham Ngoc Thach University of Medicine, Ho Chi Minh City 700000, Vietnam
- Kieu C D Nguyen
  - Department of Interdisciplinary Medicine, Section of Microbiology and Virology, School of Medicine, University of Bari "Aldo Moro", 70124 Bari, Italy
- Ciro Gargiulo Isacco
  - 118 SET, Department of Pre-hospital and Emergency, SG Giuseppe Moscati Hospital, 74120 Taranto, Italy
  - Department of Interdisciplinary Medicine, Section of Microbiology and Virology, School of Medicine, University of Bari "Aldo Moro", 70124 Bari, Italy
3
Yang Y, Li Q, Liu X, Shao C, Yang H, Niu S, Peng H, Meng X. The combination of decitabine with multi-omics confirms the regulatory pattern of the correlation between DNA methylation of the CACNA1C gene and atrial fibrillation. Front Pharmacol 2024; 15:1497977. [PMID: 39734414 PMCID: PMC11681619 DOI: 10.3389/fphar.2024.1497977] [Received: 09/18/2024] [Accepted: 11/28/2024] [Indexed: 12/31/2024]
Abstract
Background: Studies have shown that DNA methylation of the CACNA1C gene is involved in the pathogenesis of various diseases and the mechanism of drug action. However, its relationship with atrial fibrillation (AF) remains largely unexplored. Objective: To investigate the association between DNA methylation of the CACNA1C gene and AF by combining decitabine (5-Aza-2'-deoxycytidine, AZA) treatment with multi-omics analysis. Methods: HepG2 cells were treated with AZA to observe the expression of the CACNA1C gene, which was further validated using gene expression microarrays. Pyrosequencing was employed to validate differentially methylated sites of the CACNA1C gene observed in DNA methylation microarrays. A custom DNA methylation dataset based on the MSigDB database was combined with ChIP-sequencing and RNA-sequencing data to explore the regulatory patterns of DNA methylation of the CACNA1C gene. Results: Treatment of HepG2 cells with three different concentrations of AZA (2.5 µM, 5.0 µM, and 10.0 µM) resulted in 1.6-, 2.5-, and 2.9-fold increases in the mRNA expression of the CACNA1C gene, respectively, compared to the DMSO group, with statistical significance at the highest concentration group (p < 0.05). Similarly, AZA treatment of T47D cells showed upregulated mRNA expression of the CACNA1C gene in the gene expression microarray results (adj P < 0.05). DNA methylation microarray analysis revealed that methylation of a CpG site in intron 30 of the CACNA1C gene may be associated with AF (adj P < 0.05). Pyrosequencing of this site and its adjacent two CpG sites demonstrated significant differences in DNA methylation levels between AF and sinus rhythm groups (p < 0.05). Subsequent multivariate logistic regression models confirmed that the DNA methylation degree of these three sites and their average was associated with AF (p < 0.05). Additionally, the UCSC browser combined with ChIP-sequencing revealed that the aforementioned region was enriched in enhancer markers H3K27ac and H3K4me1. Differential expression and pathway analysis of RNA-sequencing data ultimately identified ATF7IP and KAT2B genes as potential regulators of the CACNA1C gene. Conclusion: The DNA methylation levels at three CpG sites in intron 30 of the CACNA1C gene are associated with AF status, and potentially regulated by ATF7IP and KAT2B.
Affiliation(s)
- Yuling Yang
  - Department of Pharmacy, Zhengzhou No. 7 People’s Hospital, Zhengzhou, Henan, China
- Qijun Li
  - Department of Dermatology, Puyang Oilfield General Hospital, Puyang, Henan, China
- Xiaoning Liu
  - Medical School, Huanghe Science and Technology College, Zhengzhou, Henan, China
- Caixia Shao
  - Department of Surgery, Zhengzhou No. 7 People’s Hospital, Zhengzhou, Henan, China
- Heng Yang
  - Department of Cardiac Surgery, Zhengzhou No. 7 People’s Hospital, Zhengzhou, Henan, China
- Siquan Niu
  - Department of Cardiology, Zhengzhou No. 7 People’s Hospital, Zhengzhou, Henan, China
- Hong Peng
  - Medical School, Huanghe Science and Technology College, Zhengzhou, Henan, China
- Xiangguang Meng
  - Department of Pharmacy, Zhengzhou No. 7 People’s Hospital, Zhengzhou, Henan, China
  - Medical School, Huanghe Science and Technology College, Zhengzhou, Henan, China
4
Voutsadakis IA. Targeting super-enhancer activity for colorectal cancer therapy. Am J Transl Res 2024; 16:700-719. [PMID: 38586095 PMCID: PMC10994804 DOI: 10.62347/qkhb5897] [Received: 10/30/2023] [Accepted: 02/28/2024] [Indexed: 04/09/2024]
Abstract
In addition to genetic variants and copy number alterations, epigenetic deregulation of oncogenes and tumor suppressors is a major contributor in cancer development and propagation. Regulatory elements for gene transcription regulation can be found in promoters which are located in the vicinity of transcription start sites but also at a distance, in enhancer sites, brought to interact with proximal sites when occupied by enhancer protein complexes. These sites provide most of the specific regulatory sequences recognized by transcription factors. A sub-set of enhancers characterized by a longer structure and stronger activity, called super-enhancers, are critical for the expression of specific genes, usually associated with individual cell type identity and function. Super-enhancers show deregulation in cancer, which may have profound repercussions for cancer cell survival and response to therapy. Dysfunction of super-enhancers may result from multiple mechanisms that include changes in their sequence, alterations in the topological neighborhoods where they belong, and alterations in the proteins that mediate their function, such as transcription factors and epigenetic modifiers. These can become potential targets for therapeutic interventions. Genes that are targets of super-enhancers are cell and cancer type specific and could also be of interest for therapeutic targeting. In colorectal cancer, a super-enhancer regulated and over-expressed oncogene is MYC, under the influence of the WNT/β-catenin pathway. Identification and targeting of additional oncogenes regulated by super-enhancers in colorectal cancer may pave the way for combination therapies targeting the super-enhancer machinery and signal transduction pathways that regulate the specific transcription factors operative on them.
Affiliation(s)
- Ioannis A Voutsadakis
  - Algoma District Cancer Program, Sault Area Hospital, Sault Ste. Marie, ON, Canada
  - Division of Clinical Sciences, Section of Internal Medicine, Northern Ontario School of Medicine, Sudbury, ON, Canada