1
Smetanina MA, Korolenya VA, Kel AE, Sevostyanova KS, Gavrilov KA, Shevela AI, Filipenko ML. Epigenome-Wide Changes in the Cell Layers of the Vein Wall When Exposing the Venous Endothelium to Oscillatory Shear Stress. Epigenomes 2023; 7:epigenomes7010008. [PMID: 36975604 PMCID: PMC10048778 DOI: 10.3390/epigenomes7010008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Revised: 03/10/2023] [Accepted: 03/13/2023] [Indexed: 03/29/2023] Open access
Abstract
Epigenomic changes in the venous cells exerted by oscillatory shear stress towards the endothelium may result in consolidation of gene expression alterations upon vein wall remodeling during varicose transformation. We aimed to reveal such epigenome-wide methylation changes. Primary culture cells were obtained from non-varicose vein segments left after surgery of 3 patients by growing the cells in selective media after magnetic immunosorting. Endothelial cells were either exposed to oscillatory shear stress or left at the static condition. Then, other cell types were treated with preconditioned media from the adjacent layer's cells. DNA isolated from the harvested cells was subjected to epigenome-wide study using Illumina microarrays followed by data analysis with GenomeStudio (Illumina), Excel (Microsoft), and Genome Enhancer (geneXplain) software packages. Differential (hypo-/hyper-) methylation was revealed for each cell layer's DNA. The most targetable master regulators controlling the activity of certain transcription factors regulating the genes near the differentially methylated sites appeared to be the following: (1) HGS, PDGFB, and AR for endothelial cells; (2) HGS, CDH2, SPRY2, SMAD2, ZFYVE9, and P2RY1 for smooth muscle cells; and (3) WWOX, F8, IGF2R, NFKB1, RELA, SOCS1, and FXN for fibroblasts. Some of the identified master regulators may serve as promising druggable targets for treating varicose veins in the future.
Affiliation(s)
- Mariya A Smetanina
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology and Fundamental Medicine (ICBFM) SB RAS, Novosibirsk 630090, Russia
  - Department of Fundamental Medicine, V. Zelman Institute for Medicine and Psychology, Novosibirsk State University (NSU), Novosibirsk 630090, Russia
- Valeria A Korolenya
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology and Fundamental Medicine (ICBFM) SB RAS, Novosibirsk 630090, Russia
  - Department of Natural Sciences, Novosibirsk State University (NSU), Novosibirsk 630090, Russia
- Alexander E Kel
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology and Fundamental Medicine (ICBFM) SB RAS, Novosibirsk 630090, Russia
  - Department of Research & Development, GeneXplain GmbH, D-38302 Wolfenbüttel, Germany
- Ksenia S Sevostyanova
  - Center of New Medical Technologies, Institute of Chemical Biology and Fundamental Medicine (ICBFM) SB RAS, Novosibirsk 630090, Russia
  - Laboratory of Invasive Medical Technologies, Institute of Chemical Biology and Fundamental Medicine (ICBFM) SB RAS, Novosibirsk 630090, Russia
  - Department of Surgical Diseases, V. Zelman Institute for Medicine and Psychology, Novosibirsk State University (NSU), Novosibirsk 630090, Russia
- Konstantin A Gavrilov
  - Center of New Medical Technologies, Institute of Chemical Biology and Fundamental Medicine (ICBFM) SB RAS, Novosibirsk 630090, Russia
  - Department of Surgical Diseases, V. Zelman Institute for Medicine and Psychology, Novosibirsk State University (NSU), Novosibirsk 630090, Russia
- Andrey I Shevela
  - Center of New Medical Technologies, Institute of Chemical Biology and Fundamental Medicine (ICBFM) SB RAS, Novosibirsk 630090, Russia
  - Laboratory of Invasive Medical Technologies, Institute of Chemical Biology and Fundamental Medicine (ICBFM) SB RAS, Novosibirsk 630090, Russia
  - Department of Surgical Diseases, V. Zelman Institute for Medicine and Psychology, Novosibirsk State University (NSU), Novosibirsk 630090, Russia
- Maxim L Filipenko
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology and Fundamental Medicine (ICBFM) SB RAS, Novosibirsk 630090, Russia
2
Kechin AA, Ivanov AA, Kel AE, Kalmykov AS, Oskorbin IP, Boyarskikh UA, Kharpov EA, Bakharev SY, Oskina NA, Samuilenkova OV, Vikhlyanov IV, Kushlinskii NE, Filipenko ML. Prediction of EVT6-NTRK3-Dependent Papillary Thyroid Cancer Using Minor Expression Profile. Bull Exp Biol Med 2022; 173:252-256. [PMID: 35737155 DOI: 10.1007/s10517-022-05528-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Indexed: 10/17/2022]
Abstract
Solid tumors resulting from oncogenic stimulation of neurotrophin receptors (TRK) by chimeric proteins are a group of rare tumors of various localization that respond to therapy with the targeted drugs entrectinib and larotrectinib. The current standard method for detecting chimeric TRK genes in tumor samples is next-generation sequencing with determination of the primary structure of the chimeric transcripts. We hypothesized that expression of chimeric tyrosine kinase proteins in tumors can determine a specific transcriptomic profile of tumor cells. We detected differentially expressed genes that distinguish TRK-dependent papillary thyroid cancer (TC) from other molecular variants of this tumor type. Using PCR with reverse transcription (RT-PCR), we identified 7 samples of papillary TC carrying an ETV6-NTRK3 rearrangement (7/215, 3.26%). Using machine learning and data extracted from TCGA, we developed a recognition function for predicting the presence of a rearrangement in NTRK genes based on the expression of 10 key genes: AUTS2, DTNA, ERBB4, HDAC1, IGF1, KDR, NTRK1, PASK, PPP2R5B, and PRSS1. The recognition function was used to analyze RT-PCR expression data for these genes in 7 TRK-dependent and 10 TRK-independent thyroid tumors. On the test samples from TCGA, the sensitivity was 72.7% and the specificity was 99.6%. On our independent validation samples tested by RT-PCR, the sensitivity was 100% and the specificity was 70%. We propose an mRNA profile of ten genes that can classify TC by the presence of driver chimeric NTRK genes with acceptable sensitivity and specificity.
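The validation figures in this abstract follow from ordinary confusion-matrix arithmetic, which the short sketch below reproduces. The per-class counts are an assumption, not stated in the abstract: all 7 TRK-dependent tumors are taken as called positive, and 7 of the 10 TRK-independent tumors as called negative (the only integer split consistent with the reported 70%). The function and variable names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly positive samples that the classifier calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly negative samples that the classifier calls negative."""
    return true_neg / (true_neg + false_pos)


# Assumed counts for the RT-PCR validation set (7 TRK-dependent, 10 TRK-independent).
# The abstract reports only the percentages; these counts are inferred from them.
true_pos, false_neg = 7, 0
true_neg, false_pos = 7, 3

validation_sensitivity = sensitivity(true_pos, false_neg)
validation_specificity = specificity(true_neg, false_pos)

# Frequency of the rearrangement in the screened cohort, as given in the abstract (7/215).
rearrangement_frequency = 7 / 215

print(f"sensitivity: {validation_sensitivity:.1%}")
print(f"specificity: {validation_specificity:.1%}")
print(f"rearrangement frequency: {rearrangement_frequency:.2%}")
```

With only 10 negative samples, each misclassified tumor shifts specificity by 10 percentage points, so the validation estimate is much coarser than the TCGA test estimate.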
Affiliation(s)
- A A Kechin
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Division of the Russian Academy of Sciences, Novosibirsk, Russia
- A A Ivanov
  - Altay Regional Oncological Center, Barnaul, Russia
- A E Kel
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Division of the Russian Academy of Sciences, Novosibirsk, Russia
- I P Oskorbin
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Division of the Russian Academy of Sciences, Novosibirsk, Russia
- U A Boyarskikh
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Division of the Russian Academy of Sciences, Novosibirsk, Russia
- E A Kharpov
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Division of the Russian Academy of Sciences, Novosibirsk, Russia
- N A Oskina
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Division of the Russian Academy of Sciences, Novosibirsk, Russia
- N E Kushlinskii
  - N. N. Blokhin National Medical Research Center of Oncology, Ministry of Health of the Russian Federation, Moscow, Russia
- M L Filipenko
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Division of the Russian Academy of Sciences, Novosibirsk, Russia
3
Markov AV, Kel AE, Salomatina OV, Salakhutdinov NF, Zenkova MA, Logashenko EB. Deep insights into the response of human cervical carcinoma cells to a new cyano enone-bearing triterpenoid soloxolone methyl: a transcriptome analysis. Oncotarget 2019; 10:5267-5297. [PMID: 31523389 PMCID: PMC6731101 DOI: 10.18632/oncotarget.27085] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/18/2019] [Accepted: 06/19/2019] [Indexed: 02/07/2023] Open access
Abstract
Semisynthetic triterpenoids bearing a cyano enone functionality in ring A are now considered promising novel anti-tumor agents. However, despite large-scale studies, their effects on cervical carcinoma cells and, moreover, the mechanisms underlying cell death activation by such compounds in this cell type have not been fully elucidated. In this work, we attempted to reconstitute the key pathways and master regulators involved in the response of human cervical carcinoma KB-3-1 cells to the novel glycyrrhetinic acid derivative soloxolone methyl (SM) by a transcriptomic approach. Functional annotation of differentially expressed genes and analysis of their cis-regulatory sequences and protein-protein interaction network clearly indicated that endoplasmic reticulum (ER) stress is the central event triggered by SM in the cells. A range of key ER stress sensors and the transcription factor AP-1 were identified as upstream transcriptional regulators controlling the response of the cells to SM. Additionally, using Gene Expression Omnibus data, we showed the ability of SM to modulate the expression of key genes involved in regulating the high proliferative rate of cervical carcinoma cells. Further Connectivity Map analysis revealed similarity of SM's effects to those of the known ER stress inducers thapsigargin and geldanamycin, which target SERCA and Grp94, respectively. According to the molecular docking study, SM could fit snugly into the active sites of these proteins in positions very close to those of both inhibitors. Taken together, our findings provide a basis for better understanding of the intracellular processes switched on in tumor cells in response to cyano enone-bearing triterpenoids.
Affiliation(s)
- Andrey V Markov
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russian Federation
- Alexander E Kel
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russian Federation
  - geneXplain GmbH, Wolfenbüttel 38302, Germany
- Oksana V Salomatina
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russian Federation
  - N. N. Vorozhtsov Novosibirsk Institute of Organic Chemistry, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russian Federation
- Nariman F Salakhutdinov
  - N. N. Vorozhtsov Novosibirsk Institute of Organic Chemistry, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russian Federation
- Marina A Zenkova
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russian Federation
- Evgeniya B Logashenko
  - Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russian Federation
4
Smetanina MA, Kel AE, Sevost'ianova KS, Maiborodin IV, Shevela AI, Zolotukhin IA, Stegmaier P, Filipenko ML. DNA methylation and gene expression profiling reveal MFAP5 as a regulatory driver of extracellular matrix remodeling in varicose vein disease. Epigenomics 2018; 10:1103-1119. [PMID: 30070582 DOI: 10.2217/epi-2018-0001] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/20/2022] Open access
Abstract
AIM: To integrate transcriptomic and DNA-methylomic measurements on varicose versus normal veins using a systems biological analysis to shed light on the interplay between genetic and epigenetic factors. MATERIALS & METHODS: Differential expression and methylation were measured using microarrays, supported by real-time quantitative PCR and immunohistochemistry confirmation for relevant gene products. A systems biological 'upstream analysis' was further applied. RESULTS: We identified several potential key players contributing to extracellular matrix remodeling in varicose veins. Specifically, our analysis suggests that MFAP5 acts as a master regulator, upstream of integrins, of the cellular network affecting the varicose vein condition. A possible mechanism and pathogenic model are outlined. CONCLUSION: The proposed coherent model incorporates the relevant signaling networks and will hopefully aid further studies on varicose vein pathogenesis.
Affiliation(s)
- Mariya A Smetanina
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology & Fundamental Medicine, Novosibirsk 630090, Russia
  - Department of Fundamental Medicine, Novosibirsk State University, Novosibirsk 630090, Russia
- Alexander E Kel
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology & Fundamental Medicine, Novosibirsk 630090, Russia
  - Department of Research & Development, geneXplain GmbH, Wolfenbüttel D-38302, Germany
- Ksenia S Sevost'ianova
  - Department of Fundamental Medicine, Novosibirsk State University, Novosibirsk 630090, Russia
  - Center of New Medical Technologies, Institute of Chemical Biology & Fundamental Medicine, Novosibirsk 630090, Russia
- Igor V Maiborodin
  - Stem Cell Laboratory, Institute of Chemical Biology & Fundamental Medicine, Novosibirsk 630090, Russia
- Andrey I Shevela
  - Department of Fundamental Medicine, Novosibirsk State University, Novosibirsk 630090, Russia
  - Center of New Medical Technologies, Institute of Chemical Biology & Fundamental Medicine, Novosibirsk 630090, Russia
- Igor A Zolotukhin
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology & Fundamental Medicine, Novosibirsk 630090, Russia
  - Chair of Faculty Surgery of the Medical Department, Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Philip Stegmaier
  - Department of Research & Development, geneXplain GmbH, Wolfenbüttel D-38302, Germany
- Maxim L Filipenko
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology & Fundamental Medicine, Novosibirsk 630090, Russia
  - Department of Fundamental Medicine, Novosibirsk State University, Novosibirsk 630090, Russia
5
Boyarskikh UA, Shadrina AS, Smetanina MA, Tsepilov YA, Oscorbin IP, Kozlov VV, Kel AE, Filipenko ML. Mycoplasma hyorhinis reduces sensitivity of human lung carcinoma cells to Nutlin-3 and promotes their malignant phenotype. J Cancer Res Clin Oncol 2018; 144:1289-1300. [PMID: 29737431 DOI: 10.1007/s00432-018-2658-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 02/28/2018] [Accepted: 05/02/2018] [Indexed: 02/08/2023]
Abstract
PURPOSE: MDM2 inhibitors are promising anticancer agents that induce cell cycle arrest and tumor cell death via p53 reactivation. We examined the influence of Mycoplasma hyorhinis infection on the sensitivity of human lung carcinoma NCI-H292 cells to the MDM2 inhibitor Nutlin-3. To unveil possible mechanisms underlying the revealed effect, we investigated gene expression changes and signal transduction networks activated in NCI-H292 cells in response to mycoplasma infection. METHODS: Sensitivity of NCI-H292 cells to Nutlin-3 was estimated by a resazurin-based cell viability assay. Genome-wide transcriptional profiles of the NCI-H292 and NCI-H292Myc.h cell lines were determined using the Illumina Human HT-12 v3 Expression BeadChip. The search for key transcription factors and key node molecules was performed using the geneXplain platform. The ability for anchorage-independent growth was tested by a soft agar colony formation assay. RESULTS: NCI-H292Myc.h cells were shown to be 1.5- and 5.2-fold more resistant to killing by Nutlin-3 at concentrations of 15 and 30 µM than uninfected NCI-H292 cells (P < 0.05 and P < 0.001, respectively). Transcriptome analysis revealed differential expression of multiple genes involved in cancer progression and metastasis as well as epithelial-mesenchymal transition (EMT). Moreover, we have shown experimentally that NCI-H292Myc.h cells were more capable of growing and dividing without binding to a substrate. The most likely mechanism explaining the observed changes was found to be TLR4- and IL-1b-mediated activation of the NF-κB pathway. CONCLUSIONS: Our results provide evidence that mycoplasma infection is an important factor modulating the effect of MDM2 inhibitors on cancer cells and is able to induce EMT-related changes.
Affiliation(s)
- Uljana A Boyarskikh
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentjev Avenue, Novosibirsk, 630090, Russia
- Alexandra S Shadrina
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentjev Avenue, Novosibirsk, 630090, Russia
  - Novosibirsk State University, 2 Pirogova Street, Novosibirsk, 630090, Russia
- Mariya A Smetanina
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentjev Avenue, Novosibirsk, 630090, Russia
  - Novosibirsk State University, 2 Pirogova Street, Novosibirsk, 630090, Russia
- Yakov A Tsepilov
  - Novosibirsk State University, 2 Pirogova Street, Novosibirsk, 630090, Russia
  - Institute of Cytology and Genetics, 10 Lavrentjev Avenue, Novosibirsk, 630090, Russia
- Igor P Oscorbin
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentjev Avenue, Novosibirsk, 630090, Russia
  - Novosibirsk State University, 2 Pirogova Street, Novosibirsk, 630090, Russia
- Vadim V Kozlov
  - Novosibirsk Regional Clinical Oncological Center, 2 Plakhotnogo Street, Novosibirsk, 630108, Russia
- Alexander E Kel
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentjev Avenue, Novosibirsk, 630090, Russia
  - Department of Research and Development, geneXplain GmbH, Am Exer 10b, 38302, Wolfenbüttel, Germany
- Maxim L Filipenko
  - Laboratory of Pharmacogenomics, Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentjev Avenue, Novosibirsk, 630090, Russia
  - Novosibirsk State University, 2 Pirogova Street, Novosibirsk, 630090, Russia
6
Abstract
In this chapter, we present an approach that allows a causal analysis of multiple "-omics" data with the help of an "upstream analysis" strategy. The goal of this approach is to identify master regulators in gene regulatory networks as potential drug targets for a pathological process. The data analysis strategy includes a state-of-the-art promoter analysis for potential transcription factor (TF)-binding sites using the TRANSFAC® database combined with an analysis of the upstream signal transduction pathways that control the activity of these TFs. When applied to genes that are associated with a switch to a pathological process, the approach identifies potential key molecules (master regulators) that may exert major control over and maintenance of transient stability of the pathological state. We demonstrate this approach on examples of analysis of multi-omics data sets that contain transcriptomics and epigenomics data in cancer. The results of this analysis helped us to better understand the molecular mechanisms of cancer development and cancer drug resistance. Such an approach promises to be very effective for rapid and accurate identification of cancer drug targets with true potential. The upstream analysis approach is implemented as an automatic workflow in the geneXplain platform (www.genexplain.com) using the open-source BioUML framework (www.biouml.org).
Affiliation(s)
- Alexander E Kel
  - Institute of Chemical Biology and Fundamental Medicine, SBRAN, Novosibirsk, Russia
  - Biosoft.ru, Ltd., Novosibirsk, Russia
  - geneXplain GmbH, Am Exer 10B, D-38302, Wolfenbüttel, Germany
7
Kel AE, Stegmaier P, Valeev T, Koschmann J, Poroikov V, Kel-Margoulis OV, Wingender E. Multi-omics "upstream analysis" of regulatory genomic regions helps identifying targets against methotrexate resistance of colon cancer. EuPA Open Proteom 2016; 13:1-13. [PMID: 29900117 PMCID: PMC5988513 DOI: 10.1016/j.euprot.2016.09.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 12/31/2015] [Revised: 09/05/2016] [Accepted: 09/08/2016] [Indexed: 11/25/2022]
Abstract
Upstream analysis strategy for multi-omics data is proposed. Drug targets are predicted by search for TFBS and analysis of signaling network. Methotrexate resistance data include transcriptomics, proteomics and epigenomics. Predicted targets are: TGFalpha, IGFBP7, alpha9-integrin. Predicted drugs are: zardaverine, divalproex and human metabolite nicotinamide N-oxide.
We present an “upstream analysis” strategy for causal analysis of multiple “-omics” data. It analyzes promoters using the TRANSFAC database, combines it with an analysis of the upstream signal transduction pathways and identifies master regulators as potential drug targets for a pathological process. We applied this approach to a complex multi-omics data set that contains transcriptomics, proteomics and epigenomics data. We identified the following potential drug targets against induced resistance of cancer cells towards chemotherapy by methotrexate (MTX): TGFalpha, IGFBP7, alpha9-integrin, and the following chemical compounds: zardaverine and divalproex as well as human metabolites such as nicotinamide N-oxide.
Affiliation(s)
- Alexander E Kel
  - Institute of Chemical Biology and Fundamental Medicine, SBRAS, Novosibirsk, Russia
  - Biosoft.ru, Ltd, Novosibirsk, Russia
  - geneXplain GmbH, D-38302 Wolfenbüttel, Germany
- Tagir Valeev
  - Biosoft.ru, Ltd, Novosibirsk, Russia
  - A.P. Ershov Institute of Informatics Systems, SB RAS, Novosibirsk, Russia
- Edgar Wingender
  - geneXplain GmbH, D-38302 Wolfenbüttel, Germany
  - Institute of Bioinformatics, University Medical Center Göttingen, D-37077 Göttingen, Germany
8
Sokolova EA, Boyarskikh UA, Shirshova AN, Kel AE, Filipenko ML. [THE BIOMARKERS FOR TIMELY DIAGNOSTICS OF COLORECTAL CANCER]. Klin Lab Diagn 2015; 60:15-23. [PMID: 27032247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Colorectal cancer (CC) is one of the most widespread types of cancer worldwide. Screening procedures intended for timely detection of CC and adenomatous polyps have been confirmed to decrease mortality significantly. Colonoscopy and analysis of feces for occult blood are widely applied as screening procedures. However, they have a number of shortcomings. Studies of the last decade have revealed a number of genetic and epigenetic markers that could potentially identify patients with CC at early stages of the disease. The article analyzes CC-specific microRNAs and their possible interactions with different transcription factors. These factors, being integrated into the structure of so-called networks with direct signal propagation, ensure particular stability of the whole regulatory system. Disruption of these networks quite often results in pathological alterations.
9
Deyneko IV, Kel AE, Kel-Margoulis OV, Deineko EV, Wingender E, Weiss S. MatrixCatch--a novel tool for the recognition of composite regulatory elements in promoters. BMC Bioinformatics 2013; 14:241. [PMID: 23924163 PMCID: PMC3754795 DOI: 10.1186/1471-2105-14-241] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 05/07/2012] [Accepted: 08/05/2013] [Indexed: 01/28/2023] Open access
Abstract
BACKGROUND: Accurate recognition of regulatory elements in promoters is an essential prerequisite for understanding the mechanisms of gene regulation at the level of transcription. Composite regulatory elements represent a particular type of such transcriptional regulatory elements consisting of pairs of individual DNA motifs. In contrast to the present approach, most available recognition techniques are based purely on statistical evaluation of the occurrence of single motifs. Such methods are limited in application, since the accuracy of recognition is greatly dependent on the size and quality of the sequence dataset. Methods that exploit available knowledge and have broad applicability are evidently needed. RESULTS: We developed a novel method to identify composite regulatory elements in promoters using a library of known examples. In-depth investigation of regularities encoded in known composite elements allowed us to introduce a new characteristic measure and to improve the specificity compared with other methods. Tests on an established benchmark and real genomic data show that our method outperforms other available methods based either on known examples or statistical evaluations. In addition to better recognition, a practical advantage of this method is, first, the ability to detect a large number of different types of composite elements and, second, the direct biological interpretation of the identified results. The program is available at http://gnaweb.helmholtz-hzi.de/cgi-bin/MCatch/MatrixCatch.pl and includes an option to extend the provided library with user-supplied data. CONCLUSIONS: The novel algorithm for the identification of composite regulatory elements presented in this paper proved superior to existing methods. Its application to tissue-specific promoters identified several highly specific composite elements with relevance to their biological function. This approach, together with other methods, will further advance the understanding of transcriptional regulation of genes.
Affiliation(s)
- Igor V Deyneko
  - Department of Molecular Immunology, Helmholtz Centre for Infection Research, Braunschweig, Germany
10
Stegmaier P, Krull M, Voss N, Kel AE, Wingender E. Molecular mechanistic associations of human diseases. BMC Syst Biol 2010; 4:124. [PMID: 20815942 PMCID: PMC2946303 DOI: 10.1186/1752-0509-4-124] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 12/18/2009] [Accepted: 09/06/2010] [Indexed: 01/05/2023]
Abstract
Background: The study of relationships between human diseases provides new possibilities for biomedical research. Recent achievements on human genetic diseases have stimulated interest in deriving methods to identify disease associations in order to gain further insight into the network of human diseases and to predict disease genes. Results: Using about 10,000 manually collected causal disease/gene associations, we developed a statistical approach to infer meaningful associations between human morbidities. The derived method clustered cardiometabolic and endocrine disorders, immune system-related diseases, solid tissue neoplasms and neurodegenerative pathologies into prominent disease groups. Analysis of biological functions confirmed characteristic features of the corresponding disease clusters. Inference of disease associations was further employed as a starting point for prediction of disease genes. Efforts were made to underpin the validity of results with relevant literature evidence. Interestingly, many inferred disease relationships correspond to known clinical associations and comorbidities, and several predicted disease genes have been subjects of therapeutic target research. Conclusions: Causal molecular mechanisms present a unifying principle to derive methods for disease classification, analysis of clinical disorder associations, and prediction of disease genes. According to the definition of causal disease genes applied in this study, these results are not restricted to genetic disease/gene relationships. This may be particularly useful for the study of long-term or chronic illnesses, where pathological derangement due to environmental factors or as part of sequel conditions is of importance and may not be fully explained by genetic background.
Affiliation(s)
- Philip Stegmaier
  - BIOBASE GmbH, Halchtersche Strasse 33, D-38304 Wolfenbüttel, Germany
11
Deyneko IV, Kalybaeva YM, Kel AE, Blöcker H. Human-chimpanzee promoter comparisons: property-conserved evolution? Genomics 2010; 96:129-33. [PMID: 20600807 DOI: 10.1016/j.ygeno.2010.06.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 02/26/2010] [Revised: 05/28/2010] [Accepted: 06/18/2010] [Indexed: 11/29/2022]
Abstract
Identification of different functional elements and their properties is a fundamental need in biomedical research, and phylogenetic comparisons of a growing number of sequenced genomes form a solid basis for this task. Most available phylogenetic approaches focus on searching for individual sequence alterations responsible for the observed phenotype, or statistically evaluate observed mutations to infer general trends. However, when applied to closely related genomes, such methods suffer from poor statistics of rare mutations and give, at best, only coarse results concerning the potential functional importance of the nucleotide differences. Quantifying the changes in physical properties of DNA, by contrast, makes it possible to see the strength of introduced mutations and hence to classify them for further investigation. In this work we present a comparative sequence analysis of two evolutionarily close species: human and chimpanzee. In contrast to previous studies, we evaluate changes in the melting enthalpy of DNA rather than count nucleotide mismatches. We find that nucleotide mismatches in promoters were apparently introduced in a correlated manner during the course of evolution, so that, for example, the DNA property "melting enthalpy" was retained. Such property conservation of promoters is significantly different from nucleotide conservation, shows significant positional and functional biases, and seems to represent a novel feature of gene regulation.
Affiliation(s)
- Igor V Deyneko
  - Department of Genome Analysis, Helmholtz Centre for Infection Research, Inhoffenstrasse 7, Braunschweig, Germany
12
Paragh G, Ugocsai P, Vogt T, Schling P, Kel AE, Tarabin V, Liebisch G, Orsó E, Markó L, Balogh A, Köbling T, Remenyik É, Wikonkál NM, Mandl J, Farwick M, Schmitz G. Whole genome transcriptional profiling identifies novel differentiation regulated genes in keratinocytes. Exp Dermatol 2010; 19:297-301. [DOI: 10.1111/j.1600-0625.2009.00920.x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/26/2022]
13
|
Paragh G, Schling P, Ugocsai P, Kel AE, Liebisch G, Heimerl S, Moehle C, Schiemann Y, Wegmann M, Farwick M, Wikonkál NM, Mandl J, Langmann T, Schmitz G. Novel sphingolipid derivatives promote keratinocyte differentiation. Exp Dermatol 2008; 17:1004-16. [DOI: 10.1111/j.1600-0625.2008.00736.x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Indexed: 11/27/2022]
|
14
|
Kel AE, Niehof M, Matys V, Zemlin R, Borlak J. Genome wide prediction of HNF4alpha functional binding sites by the use of local and global sequence context. Genome Biol 2008; 9:R36. [PMID: 18291023 PMCID: PMC2374721 DOI: 10.1186/gb-2008-9-2-r36] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 07/19/2007] [Revised: 11/09/2007] [Accepted: 02/21/2008] [Indexed: 11/16/2022] Open
Abstract
We report an application of machine learning algorithms that enables prediction of the functional context of transcription factor binding sites in the human genome. We demonstrate that our method allowed de novo identification of hepatic nuclear factor (HNF)4α binding sites and significantly improved the overall recognition of faithful HNF4α targets. When the method was applied to published findings, an unprecedentedly high number of false positives was identified. The technique can be applied to any transcription factor.
Affiliation(s)
- Alexander E Kel
- BIOBASE GmbH, Halchtersche Str, 38304 Wolfenbüttel, Germany.
|
15
|
Posch S, Grau J, Gohr A, Ben-Gal I, Kel AE, Grosse I. Recognition of cis-regulatory elements with vombat. J Bioinform Comput Biol 2007; 5:561-77. [PMID: 17636862 DOI: 10.1142/s0219720007002886] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 10/23/2006] [Revised: 02/14/2007] [Accepted: 02/15/2007] [Indexed: 11/18/2022]
Abstract
Variable order Markov models and variable order Bayesian trees have been proposed for the recognition of cis-regulatory elements, and it has been demonstrated that they outperform traditional models such as position weight matrices, Markov models, and Bayesian trees for the recognition of binding sites in prokaryotes. Here, we study to which degree variable order models can improve the recognition of eukaryotic cis-regulatory elements. We find that variable order models can improve the recognition of binding sites of all the studied transcription factors. To ease a systematic evaluation of different model combinations based on problem-specific data sets and allow genomic scans of cis-regulatory elements based on fixed and variable order Markov models and Bayesian trees, we provide the VOMBATserver to the public community.
Affiliation(s)
- Stefan Posch
- Institute of Computer Science, University Halle, 06099 Halle (Saale), Germany
|
16
|
Abstract
Bioinformatics has delivered great contributions to genome and genomics research, without which the world-wide success of this and other global ('omics') approaches would not have been possible. More recently, it has developed further towards the analysis of different kinds of networks thus laying the foundation for comprehensive description, analysis and manipulation of whole living systems in modern "systems biology". The next step which is necessary for developing a systems biology that deals with systemic phenomena is to expand the existing and develop new methodologies that are appropriate to characterize intercellular processes and interactions without omitting the causal underlying molecular mechanisms. Modelling the processes on the different levels of complexity involved requires a comprehensive integration of information on gene regulatory events, signal transduction pathways, protein interaction and metabolic networks as well as cellular functions in the respective tissues / organs.
Affiliation(s)
- Edgar Wingender
- BIOBASE GmbH, Halchtersche Str. 33, D-38304 Wolfenbüttel, Germany.
|
17
|
Deyneko IV, Bredohl B, Wesely D, Kalybaeva YM, Kel AE, Blöcker H, Kauer G. FeatureScan: revealing property-dependent similarity of nucleotide sequences. Nucleic Acids Res 2006; 34:W591-5. [PMID: 16845077 PMCID: PMC1538849 DOI: 10.1093/nar/gkl337] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022] Open
Abstract
FeatureScan is a software package aiming to reveal novel types of DNA sequence similarity by comparing physico-chemical properties. Thirty-eight different parameters of DNA double strands, such as charge, melting enthalpy, conformational parameters and the like, are provided. As input, FeatureScan requires two sequences, a pattern sequence and a target sequence; search conditions are set by selecting a specific DNA parameter and a threshold value. Search results are displayed in FASTA format and directly linked to external genome databases/browsers (ENSEMBL, NCBI, UCSC). An Internet version of FeatureScan is accessible at . As part of the HOBIT initiative () FeatureScan is also accessible as a web service at its above home page. Currently, several preloaded genomes are provided at this Internet website (Homo sapiens, Mus musculus, Rattus norvegicus and four strains of Escherichia coli) as target sequences. Standalone executables of FeatureScan are available on request.
Affiliation(s)
- Igor V. Deyneko
- Department of Genome Analysis, GBF (German Research Centre for Biotechnology), D-38124 Braunschweig, Germany
- Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
- To whom correspondence should be addressed. Tel: +49 531 6181 224; Fax: +49 531 6181 292;
- Björn Bredohl
- Department of Genome Analysis, GBF (German Research Centre for Biotechnology), D-38124 Braunschweig, Germany
- University of Applied Sciences, D-26723 Emden, Germany
- Daniel Wesely
- Department of Genome Analysis, GBF (German Research Centre for Biotechnology), D-38124 Braunschweig, Germany
- University of Applied Sciences, D-26723 Emden, Germany
- Yulia M. Kalybaeva
- Department of Genome Analysis, GBF (German Research Centre for Biotechnology), D-38124 Braunschweig, Germany
- Helmut Blöcker
- Department of Genome Analysis, GBF (German Research Centre for Biotechnology), D-38124 Braunschweig, Germany
- To whom correspondence should be addressed. Tel: +49 531 6181 224; Fax: +49 531 6181 292;
- Gerhard Kauer
- Department of Genome Analysis, GBF (German Research Centre for Biotechnology), D-38124 Braunschweig, Germany
- University of Applied Sciences, D-26723 Emden, Germany
|
18
|
Matys V, Kel-Margoulis OV, Fricke E, Liebich I, Land S, Barre-Dirrie A, Reuter I, Chekmenev D, Krull M, Hornischer K, Voss N, Stegmaier P, Lewicki-Potapov B, Saxel H, Kel AE, Wingender E. TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucleic Acids Res 2006; 34:D108-10. [PMID: 16381825 PMCID: PMC1347505 DOI: 10.1093/nar/gkj143] [Citation(s) in RCA: 1660] [Impact Index Per Article: 92.2] [Received: 09/15/2005] [Revised: 10/27/2005] [Accepted: 10/27/2005] [Indexed: 02/06/2023] Open
Abstract
The TRANSFAC database on transcription factors, their binding sites, nucleotide distribution matrices and regulated genes, as well as the complementing database TRANSCompel on composite elements, have been further enhanced on various levels. A new web interface with different search options and integrated versions of Match and Patch provides increased functionality for TRANSFAC. The list of databases which are linked to the common GENE table of TRANSFAC and TRANSCompel has been extended by Ensembl, UniGene, EntrezGene, HumanPSD and TRANSPRO. Standard gene names from HGNC, MGI and RGD are included for human, mouse and rat genes, respectively. With the help of InterProScan, Pfam, SMART and PROSITE domains are assigned automatically to the protein sequences of the transcription factors. TRANSCompel now contains, in addition to the COMPEL table, a separate table for detailed information on the experimental EVIDENCE on which the composite elements are based. Finally, with respect to data growth in TRANSFAC, the gain of Drosophila transcription factor binding sites (courtesy of the Drosophila DNase I footprint database) and of Arabidopsis factors (courtesy of DATF, Database of Arabidopsis Transcription Factors) deserves particular mention. The public releases described here, TRANSFAC 7.0 and TRANSCompel 7.0, are accessible at http://www.gene-regulation.com/pub/databases.html.
Affiliation(s)
- V Matys
- BIOBASE GmbH, Halchtersche Strasse 33, D-38304 Wolfenbüttel, Germany.
|
19
|
Deyneko IV, Kel AE, Bloecker H, Kauer G. Signal-theoretical DNA similarity measure revealing unexpected similarities of E. coli promoters. In Silico Biol 2005; 5:547-55. [PMID: 16268796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2023]
Abstract
We present an implementation of a signal-theory-based approach for the detection of novel types of DNA similarity that are based on physical properties of DNA. A systematic study of the sensitivity of the new similarity measure revealed qualitative differences from letter-based similarity. A variety of physical parameters of DNA double strands, which in a straightforward way reflect different kinds of information hidden behind the primary structure of DNA, showed a wide range of recognition power of the signal similarity measure. We applied the novel DNA similarity measure to the analysis of promoters of E. coli genes. We found that promoter similarities revealed by our approach correlate with their transcription regulatory responsiveness to different antibiotic and osmotic treatments. Accelerated by special hardware for fast Fourier transformations, the method is easily applicable to the analysis of entire eukaryotic genomes in minutes.
Affiliation(s)
- Igor V Deyneko
- Department of Genome Analysis, GBF (German Research Center for Biotechnology), D-38124 Braunschweig, Germany.
|
20
|
Kel-Margoulis OV, Tchekmenev D, Kel AE, Goessling E, Hornischer K, Lewicki-Potapov B, Wingender E. Composition-sensitive analysis of the human genome for regulatory signals. In Silico Biol 2004; 3:145-71. [PMID: 12954097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/04/2023]
Abstract
Known transcription regulatory signals which generally act as transcription factor binding sites (TFs) differ significantly in their base composition. Therefore, their occurrence in a genome largely depends on the local base composition. In an attempt to initiate an analysis of the whole human genome for the occurrence of potential TFs, we systematically analyzed the GC-content of distinct functional regions (e.g., upstream and downstream gene regions, exons, long and short introns, repetitive elements) and correlated the frequencies of potential binding sites of a representative set of TFs in these regions. For these analyses, we used the pattern collection of the TRANSFAC database on transcriptional regulation, the information about functionally relevant combinations of them from the database TRANSCompel, and our new resource, TRANSGenome™, which provides an overall annotation of the human genome with emphasis on its regulatory characteristics. We show that the occurrence of sequence patterns with regulatory potential may be supported by, but cannot be fully explained by, the GC content of a whole chromosome or of its putative promoter regions, or the information content of the patterns. Several patterns, HNF-3, NFAT, and GC box, show a clear overrepresentation in all promoter groups as well as in all chromosomes. Other patterns, like E2F and CRE-BP1, are underrepresented in all promoter groups as well as in all chromosomes in comparison with random sequences. At the same time, both patterns are overrepresented in promoters in comparison with repetitive elements. We define several structural characteristics of the proximal promoters that differentiate them from other functional genomic regions. Two well-known promoter elements, GC- and TATA-boxes, are statistically enriched in promoters in comparison with random sequences, repetitive elements and exons.
Altogether, our findings provide insights into the macroheterogeneity amongst the individual chromosomes, into the microheterogeneity among different functional regions of individual chromosomes, contribute to further understanding of structural organization of gene regulatory regions, and give first hints on the development of regulatory features during evolution.
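The dependence of pattern occurrence on local base composition described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the independent-base model, the function names `gc_content` and `expected_word_probability`, and the two example sequences are assumptions made purely for illustration.

```python
def gc_content(sequence):
    """Fraction of G/C among unambiguous bases (A, C, G, T); 0.0 if there are none."""
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["G"] + counts["C"]) / total


def expected_word_probability(word, gc):
    """Probability of `word` at a fixed position under an independent-base model.

    Assumption: G and C each occur with probability gc/2, A and T with (1 - gc)/2.
    """
    base_prob = {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}
    prob = 1.0
    for base in word.upper():
        prob *= base_prob[base]
    return prob


# Hypothetical region sequences, not taken from any genome annotation.
regions = {"promoter": "GGGCGGCCATGCGC", "intron": "ATTTAATGCATTAA"}
for name, seq in regions.items():
    gc = gc_content(seq)
    # A GC-rich word is expected more often by chance in a GC-rich region.
    print(name, round(gc, 2), expected_word_probability("GGGCGG", gc))
```

Comparing observed site counts against such a composition-based expectation is one simple way to ask whether a pattern is over- or underrepresented beyond what GC content alone would predict.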
|
21
|
Shelest E, Kel AE, Göessling E, Wingender E. Prediction of potential C/EBP/NF-kappaB composite elements using matrix-based search methods. In Silico Biol 2004; 3:71-9. [PMID: 12762847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023]
Abstract
Bacterial infections trigger a wide range of host cell responses. For the interaction of Pseudomonas aeruginosa and epithelial cells it is known that the transcription factor NF-kappaB plays a central role, but its effects have to be specified by cooperation with additional factors. NF-kappaB-containing composite elements, e.g. with C/EBP, may be appropriate indicators for new antibacterial response genes. We refined matrix-based search methods for C/EBP, which was necessary because of the weak consensus sequences of the previously existing C/EBP matrices, established a model for the C/EBP/NF-kappaB composite element, used it for scanning all known human 5'-flanking sequences and identified 135 new candidate genes. The newly constructed C/EBP binding patterns will be available with one of the next releases of the TRANSFAC database (http://www.gene-regulation.de).
Affiliation(s)
- Ekaterina Shelest
- GBF German Research Centre for Biotechnology, Mascheroder Weg 1, D-38124 Braunschweig, Germany.
|
22
|
Stegmaier P, Kel AE, Wingender E. Systematic DNA-binding domain classification of transcription factors. Genome Inform 2004; 15:276-86. [PMID: 15706513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2023]
Abstract
Based on the manual annotation of transcription factors stored in the TRANSFAC database, we developed a library of hidden Markov models (HMM) to represent their DNA-binding domains and used it for a comprehensive classification. The models constructed were applied on the UniProt/Swiss-Prot database, leading to a systematic classification of further DNA-binding protein entries. The HMM library obtained can be used to classify any newly discovered transcription factor according to its DNA-binding domain and, thus, to generate hypotheses about its DNA-binding specificity.
Affiliation(s)
- Philip Stegmaier
- BIOBASE GmbH, Halchtersche Str. 33, D-38304 Wolfenbüttel, Germany.
|
23
|
Suzuki Y, Yamashita R, Shirota M, Sakakibara Y, Chiba J, Mizushima-Sugano J, Kel AE, Arakawa T, Carninci P, Kawai J, Hayashizaki Y, Takagi T, Nakai K, Sugano S. Large-scale collection and characterization of promoters of human and mouse genes. In Silico Biol 2004; 4:429-44. [PMID: 15506993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2023]
Abstract
We report the generation and initial characterization of a large-scale collection of sequences of putative promoter regions (PPRs) of human and mouse genes. Based on our unique collection of 400,225 and 580,209 human and mouse full-length cDNAs, we determined exact transcriptional start sites (TSSs). Using positional information of the TSSs, we could retrieve adjacent sequences as PPRs for 8,793 and 6,875 human and mouse genes, respectively. The positions of the PPRs were 4 kb upstream to previously reported 5'-ends of cDNAs on average, demonstrating that full-length cDNA information is indispensable for this purpose. Among those PPRs supported by experimentally validated TSSs, 3,324 could be paired as mutually homologous genes between human and mouse and were used for the comprehensive comparative studies. The sequence identities in the proximal regions of the TSSs were 45% on average, and 22,794 putative transcription factor binding sites that are conserved between human and mouse were identified. The data resource created in the present work and the results of the sequences' initial characterization should lay the firm foundation for deciphering the transcriptional modulations of human genes. All the data were deposited and made available through a database for comparative studies, DBTSS.
Affiliation(s)
- Yutaka Suzuki
- Human Genome Center, The Institute of Medical Science, The University of Tokyo: 4-6-1 Shirokanedai, Tokyo, 108-8639, Japan.
|
24
|
Kel AE, Gössling E, Reuter I, Cheremushkin E, Kel-Margoulis OV, Wingender E. MATCH: A tool for searching transcription factor binding sites in DNA sequences. Nucleic Acids Res 2003; 31:3576-9. [PMID: 12824369 PMCID: PMC169193 DOI: 10.1093/nar/gkg585] [Citation(s) in RCA: 807] [Impact Index Per Article: 38.4] [Indexed: 12/11/2022] Open
Abstract
Match is a weight matrix-based tool for searching putative transcription factor binding sites in DNA sequences. Match is closely interconnected and distributed together with the TRANSFAC database. In particular, Match uses the matrix library collected in TRANSFAC and therefore provides the possibility to search for a great variety of different transcription factor binding sites. Several sets of optimised matrix cut-off values are built in the system to provide a variety of search modes of different stringency. The user may construct and save his/her specific user profiles which are selected subsets of matrices including default or user-defined cut-off values. Furthermore a number of tissue-specific profiles are provided that were compiled by the TRANSFAC team. A public version of the Match tool is available at: http://www.gene-regulation.com/pub/programs.html#match. The same program with a different web interface can be found at http://compel.bionet.nsc.ru/Match/Match.html. An advanced version of the tool called Match Professional is available at http://www.biobase.de.
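As a rough illustration of weight matrix-based site searching with a cut-off, the following Python sketch scans a sequence with a log-odds matrix. It is a generic textbook scheme, not the scoring used by Match (which the abstract does not detail); the count matrix, pseudocount, uniform background and cut-off value are all hypothetical.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per position); illustrative only.
COUNTS = [
    {"A": 1, "C": 0, "G": 1, "T": 8},
    {"A": 0, "C": 1, "G": 8, "T": 1},
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 1, "C": 8, "G": 1, "T": 0},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, cutoff):
    """Return (start, window, score) for every window scoring at or above `cutoff`."""
    seq = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= cutoff:
            hits.append((start, window, score))
    return hits


matrix = log_odds_matrix(COUNTS)
for start, window, score in scan("CCTGACGG", matrix, cutoff=4.0):
    print(start, window, round(score, 2))
```

Raising or lowering `cutoff` trades false positives against false negatives, which is the role the abstract assigns to its built-in sets of cut-off values of different stringency.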
Affiliation(s)
- A E Kel
- BIOBASE GmbH, Halchtersche Str. 33, D-38304 Wolfenbüttel, Germany.
|
25
|
Matys V, Fricke E, Geffers R, Gössling E, Haubrock M, Hehl R, Hornischer K, Karas D, Kel AE, Kel-Margoulis OV, Kloos DU, Land S, Lewicki-Potapov B, Michael H, Münch R, Reuter I, Rotert S, Saxel H, Scheer M, Thiele S, Wingender E. TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Res 2003; 31:374-8. [PMID: 12520026 PMCID: PMC165555 DOI: 10.1093/nar/gkg108] [Citation(s) in RCA: 1500] [Impact Index Per Article: 71.4] [Received: 09/16/2002] [Revised: 10/11/2002] [Accepted: 10/27/2002] [Indexed: 01/19/2023] Open
Abstract
The TRANSFAC database on eukaryotic transcriptional regulation, comprising data on transcription factors, their target genes and regulatory binding sites, has been extended and further developed, both in number of entries and in the scope and structure of the collected data. Structured fields for expression patterns have been introduced for transcription factors from human and mouse, using the CYTOMER database on anatomical structures and developmental stages. The functionality of Match, a tool for matrix-based search of transcription factor binding sites, has been enhanced. For instance, the program now comes with a number of tissue- (or state-)specific profiles, and new profiles can be created and modified with Match Profiler. The GENE table was extended and gained in importance, now containing, among others, links to LocusLink, RefSeq and OMIM. Further, direct links between factor and target gene on the one hand and between gene and encoded factor on the other hand were introduced. The TRANSFAC public release is available at http://www.gene-regulation.com. For yeast an additional release including the latest data was made available separately as TRANSFAC Saccharomyces Module (TSM) at http://transfac.gbf.de. For CYTOMER free download versions are available at http://www.biobase.de:8080/index.html.
Affiliation(s)
- V Matys
- BIOBASE GmbH, Halchtersche Strasse 33, D-38304 Wolfenbüttel, Germany.
|
26
|
Kel-Margoulis OV, Ivanova TG, Wingender E, Kel AE. Automatic annotation of genomic regulatory sequences by searching for composite clusters. Pac Symp Biocomput 2002:187-98. [PMID: 11928475 DOI: 10.1142/9789812799623_0018] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022]
Abstract
A new method was developed for revealing composite clusters of cis-elements in promoters of eukaryotic genes that are functionally related or coexpressed. A software system, "ClusterScan", has been created that makes it possible (i) to train the system on representative samples of promoters to reveal cis-elements that tend to cluster; (ii) to train the system on a number of samples of functionally related promoters to identify functionally coupled transcription factors; and (iii) to search for these clusters in genomic sequences to identify and functionally characterize regulatory regions in the genome. A number of training samples of different functional and structural groups of promoters were analysed. A search for composite clusters in human chromosomes 21 and 22 revealed a number of interesting examples. Finally, a decision tree system was constructed to classify promoters of several functionally related gene groups. The decision tree system makes it possible to identify new promoters and computationally predict their possible function.
|
27
|
Kel-Margoulis OV, Kel AE, Reuter I, Deineko IV, Wingender E. TRANSCompel: a database on composite regulatory elements in eukaryotic genes. Nucleic Acids Res 2002; 30:332-4. [PMID: 11752329 PMCID: PMC99108 DOI: 10.1093/nar/30.1.332] [Citation(s) in RCA: 86] [Impact Index Per Article: 3.9] [Indexed: 11/12/2022] Open
Abstract
Originating from COMPEL, the TRANSCompel database emphasizes the key role of specific interactions between transcription factors binding to their target sites, which provide specific features of gene regulation in a particular cellular context. Composite regulatory elements contain two closely situated binding sites for distinct transcription factors and represent minimal functional units providing combinatorial transcriptional regulation. Both specific factor-DNA and factor-factor interactions contribute to the function of composite elements (CEs). Information about the structure of known CEs and specific gene regulation achieved through such CEs appears to be extremely useful for promoter prediction, for gene function prediction and for applied gene engineering as well. Each database entry corresponds to an individual CE within a particular gene and contains information about two binding sites, two corresponding transcription factors and experiments confirming cooperative action between transcription factors. The COMPEL database, equipped with the search and browse tools, is available at http://www.gene-regulation.com/pub/databases.html#transcompel. Moreover, we have developed the program CATCH for searching potential CEs in DNA sequences. It is freely available as CompelPatternSearch at http://compel.bionet.nsc.ru/FunSite/CompelPatternSearch.html.
|
28
|
Kel AE, Kel-Margoulis OV, Farnham PJ, Bartley SM, Wingender E, Zhang MQ. Computer-assisted identification of cell cycle-related genes: new targets for E2F transcription factors. J Mol Biol 2001; 309:99-120. [PMID: 11491305 DOI: 10.1006/jmbi.2001.4650] [Citation(s) in RCA: 133] [Impact Index Per Article: 5.8] [Indexed: 11/22/2022]
Abstract
The processes that take place during development and differentiation are directed through coordinated regulation of expression of a large number of genes. One such gene regulatory network provides cell cycle control in eukaryotic organisms. In this work, we have studied the structural features of the 5' regulatory regions of cell cycle-related genes. We developed a new method for identifying composite substructures (modules) in regulatory regions of genes consisting of a binding site for a key transcription factor and additional contextual motifs: potential targets for other transcription factors that may synergistically regulate gene transcription. Applying this method to cell cycle-related promoters, we created a program for context-specific identification of binding sites for transcription factors of the E2F family which are key regulators of the cell cycle. We found that E2F composite modules are found at a high frequency and in close proximity to the start of transcription in cell cycle-related promoters in comparison with other promoters. Using this information, we then searched for E2F sites in genomic sequences with the goal of identifying new genes which play important roles in controlling cell proliferation, differentiation and apoptosis. Using a chromatin immunoprecipitation assay, we then experimentally verified the binding of E2F in vivo to the promoters predicted by the computer-assisted methods. Our identification of new E2F target genes provides new insight into gene regulatory networks and provides a framework for continued analysis of the role of contextual promoter features in transcriptional regulation. The tools described are available at http://compel.bionet.nsc.ru/FunSite/SiteScan.html.
Affiliation(s)
- A E Kel
- Institute of Cytology and Genetics, Novosibirsk, Russia.
|
29
|
Kel-Margoulis OV, Romashchenko AG, Kolchanov NA, Wingender E, Kel AE. COMPEL: a database on composite regulatory elements providing combinatorial transcriptional regulation. Nucleic Acids Res 2000; 28:311-5. [PMID: 10592258 PMCID: PMC102399 DOI: 10.1093/nar/28.1.311] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.5] [Received: 09/08/1999] [Accepted: 09/17/1999] [Indexed: 11/14/2022] Open
Abstract
COMPEL is a database on composite regulatory elements, the basic structures of combinatorial regulation. Composite regulatory elements contain two closely situated binding sites for distinct transcription factors and represent minimal functional units providing combinatorial transcriptional regulation. Both specific factor-DNA and factor-factor interactions contribute to the function of composite elements (CEs). Information about the structure of known CEs and specific gene regulation achieved through such CEs appears to be extremely useful for promoter prediction, for gene function prediction and for applied gene engineering as well. The structure of the relational model of COMPEL is determined by the concept of molecular structure and regulatory role of CEs. Based on the set of a particular CE, a program has been developed for searching potential CEs in gene regulatory regions. WWW search and browse routines were developed for COMPEL release 3.0. The COMPEL database equipped with the search and browse tools is available at http://compel.bionet.nsc.ru/. The program for prediction of potential CEs of NFAT type is available at http://compel.bionet.nsc.ru/FunSite.html and http://transfac.gbf.de/dbsearch/funsitep/s_comp.html
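The definition of a composite element as two closely situated binding sites for distinct transcription factors suggests a simple candidate search, sketched below in Python. This is only an assumed illustration, not the program described in the abstract: the function name `find_composite_pairs`, the distance threshold and the site positions are hypothetical, and real tools also weigh site scores and orientation.

```python
def find_composite_pairs(sites_a, sites_b, max_distance):
    """Pair each site of factor A with every site of factor B within `max_distance` bp.

    `sites_a` and `sites_b` are lists of start positions of predicted sites
    for two distinct factors on the same sequence.
    """
    pairs = []
    for pos_a in sorted(sites_a):
        for pos_b in sorted(sites_b):
            if abs(pos_a - pos_b) <= max_distance:
                pairs.append((pos_a, pos_b))
    return pairs


# Hypothetical predicted site positions for two factors on one promoter.
factor_a_sites = [10, 100]
factor_b_sites = [25, 300]
print(find_composite_pairs(factor_a_sites, factor_b_sites, max_distance=20))
```

The nested loop is quadratic in the number of sites, which is acceptable for single promoters; a genome-scale scan would more likely use a sorted two-pointer sweep.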
Affiliation(s)
- O V Kel-Margoulis
- Institute of Cytology, SB RAN, 10 Lavrentyev pr., 630090, Novosibirsk, Russia.
|
30
|
Kolchanov NA, Podkolodnaya OA, Ananko EA, Ignatieva EV, Stepanenko IL, Kel-Margoulis OV, Kel AE, Merkulova TI, Goryachkovskaya TN, Busygina TV, Kolpakov FA, Podkolodny NL, Naumochkin AN, Korostishevskaya IM, Romashchenko AG, Overton GC. Transcription regulatory regions database (TRRD): its status in 2000. Nucleic Acids Res 2000; 28:298-301. [PMID: 10592253 PMCID: PMC102412 DOI: 10.1093/nar/28.1.298] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.3] [Received: 09/29/1999] [Accepted: 10/04/1999] [Indexed: 11/12/2022] Open
Abstract
Transcription Regulatory Regions Database (TRRD) has been developed for accumulation of experimental information on the structure-function features of regulatory regions of eukaryotic genes. Each entry in TRRD corresponds to a particular gene and contains a description of structure-function features of its regulatory regions (transcription factor binding sites, promoters, enhancers, silencers, etc.) and gene expression regulation patterns. The current release, TRRD 4.2.5, comprises the description of 760 genes, 3403 expression patterns, and >4600 regulatory elements including 3604 transcription factor binding sites, 600 promoters and 152 enhancers. This information was obtained through annotation of 2537 scientific publications. TRRD 4.2.5 is available through the WWW at http://wwwmgs.bionet.nsc.ru/mgs/dbases/trrd4/
Affiliation(s)
- N A Kolchanov
- Institute of Cytology and Genetics (Siberian Branch of the Russian Academy of Sciences), Lavrentieva 10, Novosibirsk 630090, Russia.
|
31
|
Kolchanov NA, Ponomarenko MP, Kel AE, Kondrakhin IV, Frolov AS, Kolpakov FA, Goriachkovskaia TN, Kel OV, Anan'ko EA, Ignat'eva EV. [GeneExpress: an integrator for databases and computer systems accessible by the Internet and intended for studying eukaryotic gene expression]. Biofizika 1999; 44:837-41. [PMID: 10624523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
We have developed GeneExpress, a WWW-oriented integrator for the databases and systems that support the investigation of gene expression. It integrates 30 Web-based resources. The GeneNet database on molecular events forming gene networks serves as its integrative core. To navigate these WWW-available resources, SRS, HTML, and Java viewers were developed. GeneExpress is available at http://wwwmgs.bionet.nsc.ru/systems/GeneExpress/.
Affiliation(s)
- N A Kolchanov
- Institute of Cytology and Genetics, Russian Academy of Sciences, Novosibirsk, Russia
|
32
|
Frolov AS, Lavriushev SV, Grigorovich DA, Kel AE, Ptitsyn AA, Kolchanov NA, Podkolodnyĭ NL, Solov'ev VV, Milanesi L, Bourne P. [WWWMGS: an integrated server for molecular-genetic studies]. Biofizika 1999; 44:832-6. [PMID: 10624522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
We report an integrative, Internet-based technology for molecular biology studies of transcription regulation. The WWWMGS Web server brings together a set of databases, programs, and systems. As an example, we describe the use of TRRD database information for site prediction. Using this method, the computer system SeqAnn was developed; it performs real-time searches to predict the position of the transcription initiation site from database information. WWWMGS is available at http://wwwmgs.bionet.nsc.ru/.
Affiliation(s)
- A S Frolov
- Institute of Cytology and Genetics, Russian Academy of Sciences, Novosibirsk, Russia
|
33
|
Heinemeyer T, Chen X, Karas H, Kel AE, Kel OV, Liebich I, Meinhardt T, Reuter I, Schacherer F, Wingender E. Expanding the TRANSFAC database towards an expert system of regulatory molecular mechanisms. Nucleic Acids Res 1999; 27:318-22. [PMID: 9847216 PMCID: PMC148171 DOI: 10.1093/nar/27.1.318] [Citation(s) in RCA: 245] [Impact Index Per Article: 9.8] [Indexed: 11/14/2022] Open
Abstract
TRANSFAC is a database on transcription factors, their genomic binding sites and DNA-binding profiles. In addition to being updated and extended by new features, it has been complemented now by a series of additional database modules. Among them, modules which provide data about signal transduction pathways (TRANSPATH) or about cell types/organs/developmental stages (CYTOMER) are available as well as an updated version of the previously described COMPEL database. The databases are available on the WWW at http://transfac.gbf.de/
Affiliation(s)
- T Heinemeyer
- Gesellschaft für Biotechnologische Forschung mbH, Mascheroder Weg 1, D-38124 Braunschweig, Germany
|
34
|
Kolchanov NA, Ananko EA, Podkolodnaya OA, Ignatieva EV, Stepanenko IL, Kel-Margoulis OV, Kel AE, Merkulova TI, Goryachkovskaya TN, Busygina TV, Kolpakov FA, Podkolodny NL, Naumochkin AN, Romashchenko AG. Transcription Regulatory Regions Database (TRRD): its status in 1999. Nucleic Acids Res 1999; 27:303-6. [PMID: 9847210 PMCID: PMC148165 DOI: 10.1093/nar/27.1.303] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022] Open
Abstract
The Transcription Regulatory Regions Database (TRRD) is a curated database designed for accumulation of experimental data on extended regulatory regions of eukaryotic genes, the regulatory elements they contain, i.e., transcription factor binding sites, promoters, enhancers, silencers, etc., and expression patterns of the genes. Release 4.1 of TRRD offers a number of significant improvements, in particular, a more detailed description of transcription factor binding sites, transcription factors per se, and gene expression patterns in a computer-readable format. In addition, the new TRRD release provides considerably more references to other molecular biological databases. TRRD 4.1 is installed under SRS and is available through the WWW at http://www.bionet.nsc.ru/trrd/
Affiliation(s)
- N A Kolchanov
- Institute of Cytology and Genetics (Siberian Branch of the Russian Academy of Sciences), Lavrentieva 10, Novosibirsk 630090, Russia.
|
35
|
Kochetov AV, Ischenko IV, Vorobiev DG, Kel AE, Babenko VN, Kisselev LL, Kolchanov NA. Eukaryotic mRNAs encoding abundant and scarce proteins are statistically dissimilar in many structural features. FEBS Lett 1998; 440:351-5. [PMID: 9872401 DOI: 10.1016/s0014-5793(98)01482-3] [Citation(s) in RCA: 86] [Impact Index Per Article: 3.3] [Indexed: 11/27/2022]
Abstract
It is well known that non-coding mRNA sequences are dissimilar in many structural features. For individual mRNAs correlations were found for some of these features and their translational efficiency. However, no systematic statistical analysis was undertaken to relate protein abundance and structural characteristics of mRNA encoding the given protein. We have demonstrated that structural and contextual features of eukaryotic mRNAs encoding high- and low-abundant proteins differ in the 5' untranslated regions (UTR). Statistically, 5' UTRs of low-expression mRNAs are longer, their guanine plus cytosine content is higher, they have a less optimal context of the translation initiation codons of the main open reading frames and contain more frequently upstream AUG than 5' UTRs of high-expression mRNAs. Apart from the differences in 5' UTRs, high-expression mRNAs contain stronger termination signals. Structural features of low- and high-expression mRNAs are likely to contribute to the yield of their protein products.
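The 5' UTR descriptors compared in this abstract (length, guanine plus cytosine content, and the presence of upstream AUG codons) are easy to compute for a single leader sequence. The following is a minimal sketch, not the authors' software; the function name `utr_features` and the returned keys are hypothetical, and the start-codon context score mentioned in the abstract is deliberately left out.

```python
def utr_features(utr: str) -> dict:
    """Compute simple descriptors of a 5' UTR (hypothetical helper).

    Accepts DNA or RNA letters; T is treated as U.
    """
    seq = utr.upper().replace("T", "U")
    length = len(seq)
    gc_count = seq.count("G") + seq.count("C")
    # Count AUG triplets located anywhere inside the leader sequence.
    upstream_aug = sum(1 for i in range(length - 2) if seq[i:i + 3] == "AUG")
    return {
        "length": length,
        "gc_fraction": gc_count / length if length else 0.0,
        "upstream_aug": upstream_aug,
    }
```

Applied to two sets of leaders (for example, from high- and low-abundance proteins), these values could then be compared with any standard two-sample test; the choice of test is left open here.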
Affiliation(s)
- A V Kochetov
- Institute of Cytology and Genetics, Novosibirsk, Russia
|
36
|
Kolchanov NA, Ponomarenko MP, Kel AE, Frolov AS, Kolpakov FA, Goryachkovsky TN, Kel OV, Ananko EA, Ignatieva EV, Podkolodnaya OA, Babenko VN, Stepanenko IL, Romashchenko AG, Merkulova TI, Vorobiev DG, Lavryushev SV, Kochetov AV, Kolesov GB, Solovyev VV, Milanesi L, Podkolodny NL, Wingender E, Heinemeyer T. GeneExpress: a computer system for description, analysis, and recognition of regulatory sequences in eukaryotic genome. Proc Int Conf Intell Syst Mol Biol 1998; 6:95-104. [PMID: 9783214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2023]
Abstract
GeneExpress system has been designed to integrate description, analysis, and recognition of eukaryotic regulatory sequences. The system includes 5 basic units: (1) GeneNet contains an object-oriented database for accumulation of data on gene networks and signal transduction pathways and a Java-based viewer that allows an exploration and visualization of the GeneNet information; (2) Transcription Regulation combines the database on transcription regulatory regions of eukaryotic genes (TRRD) and TRRD Viewer; (3) Transcription Factor Binding Site Recognition contains a compilation of transcription factor binding sites (TFBSC) and programs for their analysis and recognition; (4) mRNA Translation is designed for analysis of structural and contextual features of mRNA 5'UTRs and prediction of their translation efficiency; and (5) ACTIVITY is the module for analysis and site activity prediction of a given nucleotide sequence. Integration of the databases in the GeneExpress is based on the Sequence Retrieval System (SRS) created in the European Bioinformatics Institute.
Affiliation(s)
- N A Kolchanov
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia.
|
37
|
Heinemeyer T, Wingender E, Reuter I, Hermjakob H, Kel AE, Kel OV, Ignatieva EV, Ananko EA, Podkolodnaya OA, Kolpakov FA, Podkolodny NL, Kolchanov NA. Databases on transcriptional regulation: TRANSFAC, TRRD and COMPEL. Nucleic Acids Res 1998; 26:362-7. [PMID: 9399875 PMCID: PMC147251 DOI: 10.1093/nar/26.1.362] [Citation(s) in RCA: 1185] [Impact Index Per Article: 45.6] [Indexed: 02/05/2023] Open
Abstract
TRANSFAC, TRRD (Transcription Regulatory Region Database) and COMPEL are databases which store information about transcriptional regulation in eukaryotic cells. The three databases provide distinct views on the components involved in transcription: transcription factors and their binding sites and binding profiles (TRANSFAC), the regulatory hierarchy of whole genes (TRRD), and the structural and functional properties of composite elements (COMPEL). The quantitative and qualitative changes of all three databases and connected programs are described. The databases are accessible via WWW: http://transfac.gbf.de/TRANSFAC or http://www.bionet.nsc.ru/TRRD
Affiliation(s)
- T Heinemeyer
- Gesellschaft für Biotechnologische Forschung mbH, Mascheroder Weg 1, D-38124 Braunschweig, Germany
|
38
|
Ponomarenko MP, Ponomarenko JV, Kel AE, Kolchanov NA. Search for DNA conformational features for functional sites. Investigation of the TATA box. Pac Symp Biocomput 1997:340-51. [PMID: 9390304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
A method for search of DNA conformational features significant for functional sites is developed. The method uses helical angles averaged for known X-ray structures. Nucleotide sequences are assigned mean angles in a given region. Choice of the significant angles is based on their capabilities to discriminate functional sites from random sequences. The yeast, invertebrate and vertebrate TATA boxes are analyzed using this method. Regions neighboring the TATA boxes are found to have smaller helical twist and roll angles. The results agree with the experimental data on Dickerson-Drew dodecamers. There is a significant decrease in the length of a small roll angle region with increasing complexity of taxon organization.
|
39
|
Wingender E, Kel AE, Kel OV, Karas H, Heinemeyer T, Dietze P, Knüppel R, Romaschenko AG, Kolchanov NA. TRANSFAC, TRRD and COMPEL: towards a federated database system on transcriptional regulation. Nucleic Acids Res 1997; 25:265-8. [PMID: 9016550 PMCID: PMC146363 DOI: 10.1093/nar/25.1.265] [Citation(s) in RCA: 117] [Impact Index Per Article: 4.3] [Indexed: 02/03/2023] Open
Abstract
Three databases that provide data on transcriptional regulation are described. TRANSFAC is a database on transcription factors and their DNA binding sites. TRRD (Transcription Regulatory Region Database) collects information about complete regulatory regions, their regulation properties and architecture. COMPEL comprises specific information on composite regulatory elements. Here, we describe the present status of these databases and the first steps towards their federation.
Affiliation(s)
- E Wingender
- Gesellschaft für Biotechnologische Forschung mbH, Mascheroder Weg 1, D-38124 Braunschweig, Germany.
|
40
|
Kel OV, Romaschenko AG, Kel AE, Wingender E, Kolchanov NA. A compilation of composite regulatory elements affecting gene transcription in vertebrates. Nucleic Acids Res 1995; 23:4097-103. [PMID: 7479071 PMCID: PMC307349 DOI: 10.1093/nar/23.20.4097] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.0] [Indexed: 01/25/2023] Open
Abstract
Over the past years, evidence has been accumulating for a fundamental role of protein-protein interactions between transcription factors in gene-specific transcription regulation. Many of these interactions run within composite elements containing binding sites for several factors. We have selected 101 composite regulatory elements identified experimentally in the regulatory regions of 64 genes of vertebrates and of their viruses and briefly described them in a compilation. Of these, 82 composite elements are of the synergistic type and 19 of the antagonistic type. Within the synergistic type composite elements, transcription factors bind to the corresponding sites simultaneously, thus cooperatively activating transcription. The factors, binding to their target sites within antagonistic type composite elements, produce opposing effects on transcription. The nucleotide sequence and localization in the genes, the names and brief description of transcription factors, are provided for each composite element, including a representation of experimental data on its functioning. Most of the composite elements (3/4) fall between -250 bp and the transcription start site. The distance between the binding sites within the composite elements described varies from complete overlapping to 80 bp. The compilation of composite elements is presented in the database COMPEL which is electronically accessible by anonymous ftp via internet.
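The structural definition used in this abstract (binding sites for different factors lying anywhere from fully overlapping to 80 bp apart) suggests a simple way to list candidate composite elements from a set of predicted sites. The following is a minimal sketch under that assumption only; it is not the COMPEL search program, and the `Site` tuple layout, the function name, and the coordinates in the usage note are hypothetical.

```python
from typing import List, Tuple

# (factor name, start, end) with 0-based start and exclusive end
Site = Tuple[str, int, int]


def candidate_composite_elements(
    sites: List[Site], max_gap: int = 80
) -> List[Tuple[Site, Site, int]]:
    """Pair up sites of distinct factors separated by at most max_gap bp.

    The reported gap is 0 when the two sites overlap or directly abut.
    """
    ordered = sorted(sites, key=lambda s: (s[1], s[2]))
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            distance = second[1] - first[2]
            if distance > max_gap:
                # Sites are sorted by start, so all later ones are farther away.
                break
            if first[0] != second[0]:
                pairs.append((first, second, max(0, distance)))
    return pairs
```

For example, with the illustrative input `[("NFAT", 10, 20), ("AP-1", 25, 32), ("NFAT", 30, 40), ("SP1", 200, 210)]`, the function pairs the first NFAT site with AP-1 (gap 5) and AP-1 with the second NFAT site (gap 0, overlapping), and leaves the distant SP1 site unpaired. Whether a reported pair is synergistic or antagonistic cannot be inferred from positions alone and requires experimental data.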
Affiliation(s)
- O V Kel
- Institute of Cytology and Genetics, Novosibirsk, Russia
|
41
|
Kondrakhin YV, Kel AE, Kolchanov NA, Romashchenko AG, Milanesi L. Eukaryotic promoter recognition by binding sites for transcription factors. Comput Appl Biosci 1995; 11:477-88. [PMID: 8590170 DOI: 10.1093/bioinformatics/11.5.477] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.6] [Indexed: 01/31/2023]
Abstract
A method for identification of eukaryotic promoters by localization of binding sites for transcription factors has been suggested. The binding sites for a range of transcription factors have been found to be distributed unevenly. Based on these distributions, we have constructed a weight matrix of binding site localization. On the basis of the weight matrix we have, in turn, designed an algorithm for promoter recognition. To increase the accuracy of the method, we have developed a routine that breaks any promoter sample into subsamples. The method to be reported on allows much better recognition accuracy than does the approach based on detection of the TATA box. In particular, the overprediction error is three times lower following our method. The program FunSiteP recognizes promoters from newly uncovered sequences and tentatively identifies the functional class the promoters must belong to. We have introduced the notion of 'regulatory potential' for the degree to which any region of the sequences is similar to the real eukaryotic promoter. By making use of the potential, we have revealed putative transcription start sites and extended regions of transcription regulation.
Affiliation(s)
- Y V Kondrakhin
- Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russia
|
42
|
Kel AE, Kondrakhin YV, Kel OV, Romashenko AG, Wingender E, Milanesi L, Kolchanov NA. Computer tool FUNSITE for analysis of eukaryotic regulatory genomic sequences. Proc Int Conf Intell Syst Mol Biol 1995; 3:197-205. [PMID: 7584437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
We present the computer tool FUNSITE for description and analysis of regulatory sequences of eukaryotic genomes. The tool consists of two main parts. 1) An integrated database for genomic regulatory sequences, designed on the basis of the databases TRANSFAC (Wingender 1994) and TRRD (Kel et al. 1995) that are currently under development. It performs the following functions: i) linkage to the EMBL database; ii) preparing samples of definite types of functional sites with their flanking sequences; iii) preparing samples of promoter sequences; iv) preparing samples of transcription factors classified with regard to structural and functional features of DNA binding and activating domains, functional families of the factors, their tissue specificity and other functional features; v) access to data on mutual disposition of cis-elements within the regulatory regions. 2) A set of programs for analysis of the structural organization of regulatory sequences: i) a program for revealing potential transcription factor binding sites based on their consensus sequences; ii) a program for revealing potential binding sites using homology search with nucleotide sequences of real binding sites; iii) a program for analysis of oligonucleotide context features that are characteristic of the flanking sequences of the binding sites; iv) a program for designing a recognition method for the functional sites based on a generalized weight matrix; v) a program for revealing potential composite elements. The results of analysis of the promoter sequences of eukaryotic genes with FUNSITE are also presented.
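The first analysis program listed in this abstract searches for potential binding sites by their consensus. A generic version of that idea can be sketched by translating an IUPAC degenerate consensus into a regular expression and scanning the sequence. This is an illustration of the general technique, not FUNSITE code; the function name `find_consensus_hits` is hypothetical, and only the forward strand is scanned.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_consensus_hits(sequence: str, consensus: str) -> list:
    """Return (position, matched text) for every forward-strand match.

    A lookahead is used so that overlapping matches are all reported.
    """
    pattern = "".join(IUPAC[code] for code in consensus.upper())
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]
```

For example, `find_consensus_hits("GGTATAAAAGGTATATAT", "TATAWA")` returns `[(2, "TATAAA"), (11, "TATATA")]`. Exact consensus matching is the crudest of the approaches the abstract lists; weight matrices tolerate mismatches that this sketch would miss.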
Affiliation(s)
- A E Kel
- Institute of Cytology and Genetics, Siberian Branch, Russian Academy of Sciences, Novosibirsk, Russia
|
43
|
Kolchanov NA, Vishnevsky OV, Babenko VN, Kel AE, Shindyalov IN. Identification of cDNA sequences by specific oligonucleotide sets. Computer tool and application. Proc Int Conf Intell Syst Mol Biol 1995; 3:206-214. [PMID: 7584438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
A computer tool has been developed for revealing sets of oligonucleotides invariant for isofunctional families of DNA (RNA) and for using these in functional identification of nucleotide sequences. The tool allows one to: build up vocabularies of invariant oligonucleotides for the families of isofunctional nucleotide sequences; assess significance of the vocabularies; identify nucleotide sequences with the vocabularies of invariant oligonucleotides; determine the most effective identification parameters to minimize first and second type errors; assess the efficiency of identification of individual isofunctional families with the oligonucleotide vocabularies; determine the evolutionary characteristics of the families of isofunctional sequences on which vocabulary volume depends. Using this system, we have analyzed a total of 322 protein-encoding gene families and have built up sets of invariant oligonucleotides (oligonucleotide vocabularies) that are characteristic of gene families and subfamilies. We have shown that nucleotide sequences belonging to these families can be identified with the sets of invariant oligonucleotides revealed. Under the most effective identification parameters, the first type error (false negative) on control (independent) data was 10-15%, and the second type error (false positive) was just 1-2 redundant sequences per sequence being examined. The volume of a vocabulary of invariant oligonucleotides was shown to depend on the percentage of variable positions in the multiple alignment within a family.
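One simple reading of an "invariant oligonucleotide vocabulary" is the set of k-mers that occur in every member of a family, with a query assigned to the family when it contains enough of them. The sketch below follows that reading as an assumption; the abstract does not give the tool's actual definitions, significance assessment, or parameters, and all function names and the `min_fraction` threshold are hypothetical.

```python
from typing import Iterable, Set


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct substrings of length k in seq (case-insensitive)."""
    s = seq.upper()
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def invariant_vocabulary(family: Iterable[str], k: int) -> Set[str]:
    """k-mers shared by every sequence of the family (assumed definition)."""
    members = list(family)
    if not members:
        return set()
    vocab = kmers(members[0], k)
    for seq in members[1:]:
        vocab &= kmers(seq, k)
    return vocab


def matches_family(seq: str, vocab: Set[str], k: int,
                   min_fraction: float = 1.0) -> bool:
    """True if seq contains at least min_fraction of the vocabulary."""
    if not vocab:
        return False
    found = len(vocab & kmers(seq, k))
    return found / len(vocab) >= min_fraction
```

Lowering `min_fraction` trades false negatives for false positives, which mirrors the balance between first and second type errors discussed in the abstract. The sketch also makes the abstract's last observation intuitive: the more variable positions a family has, the fewer k-mers survive the intersection.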
Affiliation(s)
- N A Kolchanov
- Institute of Cytology and Genetics, Siberian Branch, Russian Academy of Sciences, Novosibirsk, Russia
|
44
|
Kel AE, Ponomarenko MP, Likhachev EA, Ischenko IV, Milanesi L, Kolchanov NA. SITEVIDEO: a computer system for functional site analysis and recognition. Investigation of the human splice sites. Comput Appl Biosci 1993; 9:617-27. [PMID: 7511478 DOI: 10.1093/bioinformatics/9.6.617] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 01/25/2023]
Abstract
We developed the computer system SITEVIDEO for analysis and recognition of the functional sites in DNA and RNA molecules. It reveals contextual features essential for site function and thus enables the user to design efficient methods for recognition of the functional sites. We considered mainly quantitative characteristics reflecting the uneven distribution of oligonucleotides in the sequences of functional sites of interest. The approach suggested makes use of available information about the hierarchical organization of the functional sites, and ensures highly precise prediction of the sites. The present analysis is concerned with the human donor and acceptor splice sites. A method for recognizing these sites in the sequences with an accuracy of approximately 90% was developed.
Affiliation(s)
- A E Kel
- Institute of Cytology and Genetics, Russian Academy of Sciences, Novosibirsk
|