Reference Citation Analysis

For: Fogel GB, Weekes DG, Varga G, Dow ER, Craven AM, Harlow HB, Su EW, Onyia JE, Su C. A statistical analysis of the TRANSFAC database. Biosystems 2005;81:137-54. [PMID: 15941617 DOI: 10.1016/j.biosystems.2005.03.003] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Received: 03/16/2005] [Revised: 03/16/2005] [Indexed: 11/18/2022]

Cited by Other Article(s)

Gill JK, Chetty M, Lim S, Hallinan J. Large language model based framework for automated extraction of genetic interactions from unstructured data. PLoS One 2024;19:e0303231. [PMID: 38771886 PMCID: PMC11108146 DOI: 10.1371/journal.pone.0303231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 04/23/2024] [Indexed: 05/23/2024]

Abstract

Extracting biological interactions from published literature helps us understand complex biological systems, accelerate research, and support decision-making in drug or treatment development. Despite efforts to automate the extraction of biological relations using text mining tools and machine learning pipelines, manual curation continues to serve as the gold standard. However, the rapidly increasing volume of literature pertaining to biological relations poses challenges in its manual curation and refinement. These challenges are further compounded because only a small fraction of the published literature is relevant to biological relation extraction, and the embedded sentences of relevant sections have complex structures, which can lead to incorrect inference of relationships. To overcome these challenges, we propose GIX, an automated and robust Gene Interaction Extraction framework, based on pre-trained Large Language models fine-tuned through extensive evaluations on various gene/protein interaction corpora including LLL and RegulonDB. GIX identifies relevant publications with minimal keywords, optimises sentence selection to reduce computational overhead, simplifies sentence structure while preserving meaning, and provides a confidence factor indicating the reliability of extracted relations. GIX's Stage-2 relation extraction method performed well on benchmark protein/gene interaction datasets, assessed using 10-fold cross-validation, surpassing state-of-the-art approaches. We demonstrated that the proposed method, although fully automated, performs as well as manual relation extraction, with enhanced robustness. We also observed GIX's capability to augment existing datasets with new sentences, incorporating newly discovered biological terms and processes. Further, we demonstrated GIX's real-world applicability in inferring E. coli gene circuits.

Su L, Chen S, Zheng C, Wei H, Song X. Meta-Analysis of Gene Expression and Identification of Biological Regulatory Mechanisms in Alzheimer's Disease. Front Neurosci 2019;13:633. [PMID: 31333395 PMCID: PMC6616202 DOI: 10.3389/fnins.2019.00633] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 02/15/2019] [Accepted: 05/31/2019] [Indexed: 12/12/2022]

Abstract

Alzheimer's disease (AD), also known as senile dementia, is a progressive neurodegenerative disease. The etiology and pathogenesis of AD have not yet been elucidated. We examined common differentially expressed genes (DEGs) from different AD tissue microarray datasets by meta-analysis and screened the AD-associated genes from the common DEGs using GCBI. Then we studied the gene expression network using the STRING database and identified the hub genes using Cytoscape. Furthermore, we analyzed the microRNAs (miRNAs), long non-coding RNAs (lncRNAs), and single nucleotide polymorphisms (SNPs) associated with the AD-associated genes, and then identified feed-forward loops. Finally, we performed SNP analysis of the AD-associated genes. Our results identified 207 common DEGs, of which 57 have previously been reported to be associated with AD. The common DEG expression network identified eight hub genes, all of which were previously known to be associated with AD. Further study of the regulatory miRNAs associated with the AD-associated genes and other genes specific to neurodegenerative diseases revealed 65 AD-associated miRNAs. Analysis of the miRNA associated transcription factor-miRNA-gene-gene associated TF (mTF-miRNA-gene-gTF) network around the AD-associated genes revealed 131 feed-forward loops (FFLs). Among them, one important FFL was found between the gene SERPINA3, hsa-miR-27a, and the transcription factor MYC. Furthermore, SNP analysis of the AD-associated genes identified 173 SNPs, and also found a role in AD for miRNAs specific to other neurodegenerative diseases, including hsa-miR-34c, hsa-miR-212, hsa-miR-34a, and hsa-miR-7. The regulatory network constructed in this study describes the mechanism of cell regulation in AD, in which miRNAs and lncRNAs can be considered AD regulatory factors.

Hu J, Wang J, Lin J, Liu T, Zhong Y, Liu J, Zheng Y, Gao Y, He J, Shang X. MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites. BMC Bioinformatics 2019;20:200. [PMID: 31074373 PMCID: PMC6509868 DOI: 10.1186/s12859-019-2735-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/28/2022]

Galley JC, Durgin BG, Miller MP, Hahn SA, Yuan S, Wood KC, Straub AC. Antagonism of Forkhead Box Subclass O Transcription Factors Elicits Loss of Soluble Guanylyl Cyclase Expression. Mol Pharmacol 2019;95:629-637. [PMID: 30988014 PMCID: PMC6527398 DOI: 10.1124/mol.118.115386] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/29/2018] [Accepted: 03/31/2019] [Indexed: 01/12/2023]

Su L, Wang C, Zheng C, Wei H, Song X. A meta-analysis of public microarray data identifies biological regulatory networks in Parkinson's disease. BMC Med Genomics 2018;11:40. [PMID: 29653596 PMCID: PMC5899355 DOI: 10.1186/s12920-018-0357-7] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 12/19/2017] [Accepted: 03/26/2018] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Parkinson's disease (PD) is a long-term degenerative disease that is caused by environmental and genetic factors. The networks of genes and their regulators that control the progression and development of PD require further elucidation.

METHODS

We examine common differentially expressed genes (DEGs) from several PD blood and substantia nigra (SN) microarray datasets by meta-analysis. Further we screen the PD-specific genes from common DEGs using GCBI. Next, we used a series of bioinformatics software to analyze the miRNAs, lncRNAs and SNPs associated with the common PD-specific genes, and then identify the mTF-miRNA-gene-gTF network.

RESULT

Our results identified 36 common DEGs in PD blood studies and 17 common DEGs in PD SN studies, and five of the genes were previously known to be associated with PD. Further study of the regulatory miRNAs associated with the common PD-specific genes revealed 14 PD-specific miRNAs in our study. Analysis of the mTF-miRNA-gene-gTF network about PD-specific genes revealed two feed-forward loops: one involving the SPRK2 gene, hsa-miR-19a-3p and SPI1, and the second involving the SPRK2 gene, hsa-miR-17-3p and SPI. The long non-coding RNA (lncRNA)-mediated regulatory network identified lncRNAs associated with PD-specific genes and PD-specific miRNAs. Moreover, single nucleotide polymorphism (SNP) analysis of the PD-specific genes identified two significant SNPs, and SNP analysis of the neurodegenerative disease-specific genes identified seven significant SNPs. Most of these SNPs are present in the 3'-untranslated region of genes and are controlled by several miRNAs.

CONCLUSION

Our study identified a total of 53 common DEGs in PD patients compared with healthy controls in blood and brain datasets and five of these genes were previously linked with PD. Regulatory network analysis identified PD-specific miRNAs, associated long non-coding RNA and feed-forward loops, which contribute to our understanding of the mechanisms underlying PD. The SNPs identified in our study can determine whether a genetic variant is associated with PD. Overall, these findings will help guide our study of the complex molecular mechanism of PD.

Plasticity of the MFS1 Promoter Leads to Multidrug Resistance in the Wheat Pathogen Zymoseptoria tritici. mSphere 2017;2:mSphere00393-17. [PMID: 29085913 PMCID: PMC5656749 DOI: 10.1128/msphere.00393-17] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Received: 09/01/2017] [Accepted: 09/21/2017] [Indexed: 11/20/2022]

Abstract

The ascomycete Zymoseptoria tritici is the causal agent of Septoria leaf blotch on wheat. Disease control relies mainly on resistant wheat cultivars and on fungicide applications. The fungus displays a high potential to circumvent both methods. Resistance against all unisite fungicides has been observed over decades. A different type of resistance has emerged among wild populations with multidrug-resistant (MDR) strains. Active fungicide efflux through overexpression of the major facilitator gene MFS1 explains this emerging resistance mechanism. Applying a bulk-progeny sequencing approach, we identified in this study a 519-bp long terminal repeat (LTR) insert in the MFS1 promoter, a relic of a retrotransposon cosegregating with the MDR phenotype. Through gene replacement, we show the insert as a mutation responsible for MFS1 overexpression and the MDR phenotype. Besides this type I insert, we found two different types of promoter inserts in more recent MDR strains. Type I and type II inserts harbor potential transcription factor binding sites, but not the type III insert. Interestingly, all three inserts correspond to repeated elements present at different genomic locations in either IPO323 or other Z. tritici strains. These results underline the plasticity of repeated elements leading to fungicide resistance in Z. tritici and which contribute to its adaptive potential.

IMPORTANCE

Disease control through fungicides remains an important means to protect crops from fungal diseases and to secure the harvest. Plant-pathogenic fungi, especially Zymoseptoria tritici, have developed resistance against most currently used active ingredients, reducing or abolishing their efficacy. While target site modification is the most common resistance mechanism against single modes of action, active efflux of multiple drugs is an emerging phenomenon in fungal populations reducing additionally fungicides' efficacy in multidrug-resistant strains. We have investigated the mutations responsible for increased drug efflux in Z. tritici field strains. Our study reveals that three different insertions of repeated elements in the same promoter lead to multidrug resistance in Z. tritici. The target gene encodes the membrane transporter MFS1 responsible for drug efflux, with the promoter inserts inducing its overexpression. These results underline the plasticity of repeated elements leading to fungicide resistance in Z. tritici.

Chen J, Zhang N, Wen J, Zhang Z. Silencing TAK1 alters gene expression signatures in bladder cancer cells. Oncol Lett 2017;13:2975-2981. [PMID: 28521404 PMCID: PMC5431247 DOI: 10.3892/ol.2017.5819] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 05/12/2015] [Accepted: 09/22/2016] [Indexed: 02/06/2023]

Abstract

The aim of the present study was to identify the differentially expressed genes (DEGs) that are induced by the silencing of transforming growth factor-β-activated kinase 1 (TAK1) in bladder cancer cells and to analyze the potential biological effects. Dataset GSE52452 from mutant fibroblast growth factor receptor 3 (FGFR3) bladder cancer cells transfected with control siRNA or TAK1-specific siRNA was downloaded from Gene Expression Omnibus. The DEGs between the two groups were identified using Limma package following data pre-processing by Affy in Bioconductor. Enrichment analysis of DEGs was performed using the Database for Annotation, Visualization and Integrated Discovery, followed by functional annotation using TRANSFAC, TSGene and TAG databases. Integrated networks were constructed by Cytoscape and sub-networks were extracted employing BioNet, followed by enrichment analysis of DEGs in the sub-network. A total of 43 downregulated and 21 upregulated genes were obtained. The downregulated genes were enriched in five pathways, including NOD-like receptor signaling pathway and functions related to cellular response. The upregulated genes were associated with cellular developmental processes. Transcription factor EGR1 and 9 tumor-associated genes were screened from the DEGs. Among the DEGs, 10 hub nodes may represent important roles in the complex metabolic network, including EGFR, CYP3A5, MAP3K7, GSTA1, PTHLH, ALDH1A1, KCND2, EGR1, ARRB1 and ITPR1. Additionally, EGFR was correlated with ERBB2, GRB2 and PIK3R1, and these were enriched in ErbB signaling pathway and various cancer-associated pathways. Silencing TAK1 may decrease cellular response to chemical stimulus via downregulating CYP3A5, MAP3K7, GSTA1, ALDH1A1, ARRB1 and ITPR1; increase cancer cell development via upregulating EGFR, EGR1 and PTHLH; and regulate cancer metastasis through EGFR, ERBB2, GRB2 and PIK3R1.

Dai X, Li J, Liu T, Zhao PX. HRGRN: A Graph Search-Empowered Integrative Database of Arabidopsis Signaling Transduction, Metabolism and Gene Regulation Networks. Plant & Cell Physiology 2016;57:e12. [PMID: 26657893 PMCID: PMC4722177 DOI: 10.1093/pcp/pcv200] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 09/01/2015] [Accepted: 12/07/2015] [Indexed: 05/10/2023]

Abstract

The biological networks controlling plant signal transduction, metabolism and gene regulation are composed of not only tens of thousands of genes, compounds, proteins and RNAs but also the complicated interactions and co-ordination among them. These networks play critical roles in many fundamental mechanisms, such as plant growth, development and environmental response. Although much is known about these complex interactions, the knowledge and data are currently scattered throughout the published literature, publicly available high-throughput data sets and third-party databases. Many 'unknown' yet important interactions among genes need to be mined and established through extensive computational analysis. However, exploring these complex biological interactions at the network level from existing heterogeneous resources remains challenging and time-consuming for biologists. Here, we introduce HRGRN, a graph search-empowered integrative database of Arabidopsis signal transduction, metabolism and gene regulatory networks. HRGRN utilizes Neo4j, which is a highly scalable graph database management system, to host large-scale biological interactions among genes, proteins, compounds and small RNAs that were either validated experimentally or predicted computationally. The associated biological pathway information was also specially marked for the interactions that are involved in the pathway to facilitate the investigation of cross-talk between pathways. Furthermore, HRGRN integrates a series of graph path search algorithms to discover novel relationships among genes, compounds, RNAs and even pathways from heterogeneous biological interaction data that could be missed by traditional SQL database search methods. Users can also build subnetworks based on known interactions. The outcomes are visualized with rich text, figures and interactive network graphs on web pages. The HRGRN database is freely available at http://plantgrn.noble.org/hrgrn/.

Broin PÓ, Smith TJ, Golden AA. Alignment-free clustering of transcription factor binding motifs using a genetic-k-medoids approach. BMC Bioinformatics 2015;16:22. [PMID: 25627106 PMCID: PMC4384390 DOI: 10.1186/s12859-015-0450-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/04/2014] [Accepted: 01/02/2015] [Indexed: 11/10/2022]

Abstract

Background

Familial binding profiles (FBPs) represent the average binding specificity for a group of structurally related DNA-binding proteins. The construction of such profiles allows the classification of novel motifs based on similarity to known families, can help to reduce redundancy in motif databases and de novo prediction algorithms, and can provide valuable insights into the evolution of binding sites. Many current approaches to automated motif clustering rely on progressive tree-based techniques, and can suffer from so-called frozen sub-alignments, where motifs which are clustered early on in the process remain ‘locked’ in place despite the potential for better placement at a later stage. In order to avoid this scenario, we have developed a genetic-k-medoids approach which allows motifs to move freely between clusters at any point in the clustering process.

Results

We demonstrate the performance of our algorithm, GMACS, on multiple benchmark motif datasets, comparing results obtained with current leading approaches. The first dataset includes 355 position weight matrices from the TRANSFAC database and indicates that the k-mer frequency vector approach used in GMACS outperforms other motif comparison techniques. We then cluster a set of 79 motifs from the JASPAR database previously used in several motif clustering studies and demonstrate that GMACS can produce a higher number of structurally homogeneous clusters than other methods without the need for a large number of singletons. Finally, we show the robustness of our algorithm to noise on multiple synthetic datasets consisting of known motifs convolved with varying degrees of noise.

Conclusions

Our proposed algorithm is generally applicable to any DNA or protein motifs, can produce highly stable and biologically meaningful clusters, and, by avoiding the problem of frozen sub-alignments, can provide improved results when compared with existing techniques on benchmark datasets.

Dissecting neural differentiation regulatory networks through epigenetic footprinting. Nature 2014;518:355-359. [PMID: 25533951 PMCID: PMC4336237 DOI: 10.1038/nature13990] [Citation(s) in RCA: 138] [Impact Index Per Article: 13.8] [Received: 11/19/2013] [Accepted: 10/21/2014] [Indexed: 12/16/2022]

Sebastian A, Contreras-Moreira B. footprintDB: a database of transcription factors with annotated cis elements and binding interfaces. Bioinformatics 2013;30:258-65. [PMID: 24234003 DOI: 10.1093/bioinformatics/btt663] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Indexed: 12/23/2022]

Nagore LI, Nadeau RJ, Guo Q, Jadhav YLA, Jarrett HW, Haskins WE. Purification and characterization of transcription factors. Mass Spectrometry Reviews 2013;32:386-398. [PMID: 23832591 PMCID: PMC3758410 DOI: 10.1002/mas.21369] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 01/31/2012] [Revised: 11/19/2012] [Accepted: 11/19/2012] [Indexed: 06/02/2023]

Affiliation(s)

LI Nagore Department of Chemistry, University of Texas at San Antonio, San Antonio, TX, 78249
RJ Nadeau Department of Chemistry, University of Texas at San Antonio, San Antonio, TX, 78249 Protein Biomarkers Cores, University of Texas at San Antonio, San Antonio, TX, 78249 Center for Interdisciplinary Health Research, University of Texas at San Antonio, San Antonio, TX, 78249 Center for Research & Training in the Sciences, University of Texas at San Antonio, San Antonio, TX, 78249
Q Guo Department of Chemistry, University of Texas at San Antonio, San Antonio, TX, 78249 Protein Biomarkers Cores, University of Texas at San Antonio, San Antonio, TX, 78249 Center for Interdisciplinary Health Research, University of Texas at San Antonio, San Antonio, TX, 78249 Center for Research & Training in the Sciences, University of Texas at San Antonio, San Antonio, TX, 78249
YLA Jadhav Pediatric Biochemistry Laboratory, University of Texas at San Antonio, San Antonio, TX, 78249 RCMI Proteomics, University of Texas at San Antonio, San Antonio, TX, 78249 Protein Biomarkers Cores, University of Texas at San Antonio, San Antonio, TX, 78249 Center for Interdisciplinary Health Research, University of Texas at San Antonio, San Antonio, TX, 78249 Center for Research & Training in the Sciences, University of Texas at San Antonio, San Antonio, TX, 78249
HW Jarrett Department of Chemistry, University of Texas at San Antonio, San Antonio, TX, 78249 Protein Biomarkers Cores, University of Texas at San Antonio, San Antonio, TX, 78249 Center for Interdisciplinary Health Research, University of Texas at San Antonio, San Antonio, TX, 78249
WE Haskins Pediatric Biochemistry Laboratory, University of Texas at San Antonio, San Antonio, TX, 78249 Department of Chemistry, University of Texas at San Antonio, San Antonio, TX, 78249 Departments of Biology, University of Texas at San Antonio, San Antonio, TX, 78249 RCMI Proteomics, University of Texas at San Antonio, San Antonio, TX, 78249 Protein Biomarkers Cores, University of Texas at San Antonio, San Antonio, TX, 78249 Center for Interdisciplinary Health Research, University of Texas at San Antonio, San Antonio, TX, 78249 Center for Research & Training in the Sciences, University of Texas at San Antonio, San Antonio, TX, 78249 Departments of Medicine, Division of Hematology & Medical Oncology, University of Texas Health Science Center at San Antonio, San Antonio, TX, 78229 Cancer Therapy & Research Center, University of Texas Health Science Center at San Antonio, San Antonio, TX, 78229

Thompson JA, Congdon CB. An Exploration Into Improving DNA Motif Inference by Looking for Highly Conserved Core Regions. IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology Proceedings 2013;2013:60-67. [PMID: 31008453 PMCID: PMC6474685 DOI: 10.1109/cibcb.2013.6595389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]

Hu J, Dang N, Menu E, De Bruyne E, Xu D, Van Camp B, Van Valckenborgh E, Vanderkerken K. Activation of ATF4 mediates unwanted Mcl-1 accumulation by proteasome inhibition. Blood 2012;119:826-37. [PMID: 22128141 DOI: 10.1182/blood-2011-07-366492] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Indexed: 12/16/2022]

Bischoff E, Vaquero C. In silico and biological survey of transcription-associated proteins implicated in the transcriptional machinery during the erythrocytic development of Plasmodium falciparum. BMC Genomics 2010;11:34. [PMID: 20078850 PMCID: PMC2821373 DOI: 10.1186/1471-2164-11-34] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.1] [Received: 07/01/2009] [Accepted: 01/15/2010] [Indexed: 11/12/2022]

Abstract

Background

Malaria is the most important parasitic disease in the world with approximately two million people dying every year, mostly due to Plasmodium falciparum infection. During its complex life cycle in the Anopheles vector and human host, the parasite requires the coordinated and modulated expression of diverse sets of genes involved in epigenetic, transcriptional and post-transcriptional regulation. However, despite the availability of the complete sequence of the Plasmodium falciparum genome, we are still quite ignorant about Plasmodium mechanisms of transcriptional gene regulation. This is due to the poor prediction of nuclear proteins, cognate DNA motifs and structures involved in transcription.

Results

A comprehensive directory of proteins reported to be potentially involved in Plasmodium transcriptional machinery was built from all in silico reports and databanks. The transcription-associated proteins were clustered in three main sets of factors: general transcription factors, chromatin-related proteins (structuring, remodelling and histone modifying enzymes), and specific transcription factors. Only a few of these factors have been molecularly analysed. Furthermore, from transcriptome and proteome data we modelled expression patterns of transcripts and corresponding proteins during the intra-erythrocytic cycle. Finally, an interactome of these proteins based either on in silico or on 2-yeast-hybrid experimental approaches is discussed.

Conclusion

This is the first attempt to build a comprehensive directory of potential transcription-associated proteins in Plasmodium. In addition, all complete transcriptome, proteome and interactome raw data were re-analysed, compared and discussed for a better comprehension of the complex biological processes of Plasmodium falciparum transcriptional regulation during the erythrocytic development.

FISim: a new similarity measure between transcription factor binding sites based on the fuzzy integral. BMC Bioinformatics 2009;10:224. [PMID: 19615102 PMCID: PMC2722654 DOI: 10.1186/1471-2105-10-224] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 12/04/2008] [Accepted: 07/20/2009] [Indexed: 01/22/2023]

Meier S, Gehring C. A guide to the integrated application of on-line data mining tools for the inference of gene functions at the systems level. Biotechnol J 2009;3:1375-87. [PMID: 18830970 DOI: 10.1002/biot.200800142] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 01/13/2023]

de Vooght KMK, van Wijk R, van Solinge WW. Management of gene promoter mutations in molecular diagnostics. Clin Chem 2009;55:698-708. [PMID: 19246615 DOI: 10.1373/clinchem.2008.120931] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.9] [Indexed: 12/20/2022]

Cai Y, He J, Li X, Lu L, Yang X, Feng K, Lu W, Kong X. A Novel Computational Approach To Predict Transcription Factor DNA Binding Preference. J Proteome Res 2008;8:999-1003. [DOI: 10.1021/pr800717y] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Indexed: 11/29/2022]

Affiliation(s)

Yudong Cai, JianFeng He, XinLei Li, Lin Lu, XinYi Yang, KaiYan Feng, WenCong Lu, XiangYin Kong: CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China, Department of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200040, People’s Republic of China, Institute of Health Sciences, Shanghai Jiao Tong University School of Medicine and Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200025, China, Division of Imaging Science & Biomedical

Fogel GB, Porto VW, Varga G, Dow ER, Craven AM, Powers DM, Harlow HB, Su EW, Onyia JE, Su C. Evolutionary computation for discovery of composite transcription factor binding sites. Nucleic Acids Res 2008;36:e142. [PMID: 18927103 PMCID: PMC2588514 DOI: 10.1093/nar/gkn738] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Received: 07/07/2008] [Revised: 09/05/2008] [Accepted: 10/02/2008] [Indexed: 12/02/2022] Open

Maston GA, Evans SK, Green MR. Transcriptional regulatory elements in the human genome. Annu Rev Genomics Hum Genet 2006;7:29-59. [PMID: 16719718 DOI: 10.1146/annurev.genom.7.080505.115623] [Citation(s) in RCA: 551] [Impact Index Per Article: 34.4] [Indexed: 11/09/2022]

Sandve GK, Abul O, Walseng V, Drabløs F. Improved benchmarks for computational motif discovery. BMC Bioinformatics 2007;8:193. [PMID: 17559676 PMCID: PMC1903367 DOI: 10.1186/1471-2105-8-193] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Received: 11/09/2006] [Accepted: 06/08/2007] [Indexed: 12/03/2022] Open

Abstract

Background

An important step in annotation of sequenced genomes is the identification of transcription factor binding sites. More than a hundred different computational methods have been proposed, and it is difficult to make an informed choice. Therefore, robust assessment of motif discovery methods becomes important, both for validation of existing tools and for identification of promising directions for future research.

Results

We use a machine learning perspective to analyze collections of transcription factors with known binding sites. Algorithms are presented for finding position weight matrices (PWMs), IUPAC-type motifs and mismatch motifs with optimal discrimination of binding sites from remaining sequence. We show that for many data sets in a recently proposed benchmark suite for motif discovery, none of the common motif models can accurately discriminate the binding sites from remaining sequence. This may obscure the distinction between the potential performance of the motif discovery tool itself versus the intrinsic complexity of the problem we are trying to solve. Synthetic data sets may avoid this problem, but we show on some previously proposed benchmarks that there may be a strong bias towards a presupposed motif model. We also propose a new approach to benchmark data set construction. This approach is based on collections of binding site fragments that are ranked according to the optimal level of discrimination achieved with our algorithms. This allows us to select subsets with specific properties. We present one benchmark suite with data sets that allow good discrimination between positive and negative instances with the common motif models. These data sets are suitable for evaluating algorithms for motif discovery that rely on these models. We present another benchmark suite where PWM, IUPAC and mismatch motif models are not able to discriminate reliably between positive and negative instances. This suite could be used for evaluating more powerful motif models.

Conclusion

Our improved benchmark suites have been designed to differentiate between the performance of motif discovery algorithms and the power of motif models. We provide a web server where users can download our benchmark suites, submit predictions and visualize scores on the benchmarks.
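As a rough illustration of the PWM motif model named in the Results above, the following minimal sketch builds a log-odds matrix from a handful of aligned sites and scans a sequence for its best-scoring window. It is not the authors' algorithm: the function names (`build_pwm`, `score_window`, `best_hit`), the pseudocount of 1, the uniform background of 0.25, and the example sites are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (one dict per column) from equal-length aligned sites.

    The pseudocount and uniform background are illustrative assumptions.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so no probability is zero.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({base: math.log2((counts[base] / total) / background)
                    for base in BASES})
    return pwm

def score_window(pwm, window):
    """Sum the per-column log-odds scores for a window as wide as the PWM."""
    return sum(column[base] for column, base in zip(pwm, window))

def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max((score_window(pwm, sequence[i:i + width]), i)
               for i in range(len(sequence) - width + 1))

# Hypothetical aligned binding sites, used only to exercise the functions.
example_sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
example_pwm = build_pwm(example_sites)
example_score, example_start = best_hit(example_pwm, "GGCGTATAATGCGC")
```

A window scoring above zero is more likely under the motif model than under the background. How well such scores separate true sites from the remaining sequence is the discrimination question the abstract raises.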

Yan B, Lovley DR, Krushkal J. Genome-wide similarity search for transcription factors and their binding sites in a metal-reducing prokaryote Geobacter sulfurreducens. Biosystems 2006;90:421-41. [PMID: 17184904 DOI: 10.1016/j.biosystems.2006.10.006] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 07/27/2006] [Revised: 09/21/2006] [Accepted: 10/20/2006] [Indexed: 12/26/2022]

Abstract

The knowledge obtained from understanding individual elements involved in gene regulation is important for reconstructing gene regulatory networks, a key for understanding cellular behavior. To study gene regulatory interactions in a model microorganism, Geobacter sulfurreducens, which participates in metal reduction and energy harvesting, we investigated the presence of 59 known Escherichia coli transcription factors and predicted transcription regulatory sites in its genome. The supplementary material, available at http://www.geobacter.org/research/genomescan/, provides the results of similarity comparisons that identified regulatory proteins of G. sulfurreducens and the genome locations of the predicted regulatory sites, including the list of putative regulatory elements in the upstream regions of every predicted operon and singleton open reading frame. Regulatory sequence elements, predicted using genome similarity searches to matrices of established transcription regulatory elements from E. coli, provide an initial insight into regulation of genes and operons in G. sulfurreducens. The predicted regulatory elements were predominantly located in the upstream regions of operons and singleton open reading frames. The validity of the predictions was examined using a permutation approach. Sequence similarity searches indicate that E. coli transcription factors ArgR, CytR, DeoR, FlhCD (both FlhC and FlhD subunits), FruR, GalR, GlpR, H-NS, LacI, MetJ, PurR, TrpR, and Tus are likely missing from G. sulfurreducens. Phylogenetic analysis suggests that one HU subunit is present in G. sulfurreducens as compared to two subunits in E. coli, while each of the two E. coli IHF subunits, HimA and HimD, have two homologs in G. sulfurreducens. The closest homolog of E. coli RpoE in G. sulfurreducens may be more similar to FecI than to RpoE. These findings represent the first step in the understanding of the regulatory relationships in G. sulfurreducens on the genome scale.
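The abstract above states that predictions were checked with a permutation approach but gives no details. The sketch below shows one generic form such a check could take: count motif matches in the real sequence, then compare that count with counts from shuffled copies of the same sequence. The function names (`count_hits`, `permutation_p_value`), the mismatch-based matching, the shuffling scheme, and the toy sequence are all assumptions, not the procedure used in the paper.

```python
import random

def count_hits(sequence, motif, max_mismatches=1):
    """Count windows that differ from the motif in at most max_mismatches positions."""
    width = len(motif)
    hits = 0
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        mismatches = sum(1 for a, b in zip(window, motif) if a != b)
        if mismatches <= max_mismatches:
            hits += 1
    return hits

def permutation_p_value(sequence, motif, n_permutations=1000,
                        max_mismatches=1, seed=0):
    """Estimate how often shuffled sequences reach the observed hit count.

    Shuffling single letters preserves base composition only; this is an
    illustrative choice, not the scheme described in the source.
    """
    rng = random.Random(seed)
    observed = count_hits(sequence, motif, max_mismatches)
    letters = list(sequence)
    at_least_as_many = 0
    for _ in range(n_permutations):
        rng.shuffle(letters)
        if count_hits("".join(letters), motif, max_mismatches) >= observed:
            at_least_as_many += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return observed, (at_least_as_many + 1) / (n_permutations + 1)

# Toy sequence with the motif planted five times, for illustration only.
toy_sequence = "TATAAT" * 5 + "GCGCGC" * 5
toy_observed, toy_p = permutation_p_value(toy_sequence, "TATAAT",
                                          n_permutations=200)
```

A small empirical p-value suggests the observed number of matches is unlikely to arise from base composition alone.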

Liu CC, Lin CC, Chen WSE, Chen HY, Chang PC, Chen JJ, Yang PC. CRSD: a comprehensive web server for composite regulatory signature discovery. Nucleic Acids Res 2006;34:W571-7. [PMID: 16845073 PMCID: PMC1538777 DOI: 10.1093/nar/gkl279] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Indexed: 01/05/2023] Open

Perco P, Rapberger R, Siehs C, Lukas A, Oberbauer R, Mayer G, Mayer B. Transforming omics data into context: Bioinformatics on genomics and proteomics raw data. Electrophoresis 2006;27:2659-75. [PMID: 16739231 DOI: 10.1002/elps.200600064] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.8] [Indexed: 11/09/2022]

Perco P, Kainz A, Mayer G, Lukas A, Oberbauer R, Mayer B. Detection of coregulation in differential gene expression profiles. Biosystems 2005;82:235-47. [PMID: 16181729 DOI: 10.1016/j.biosystems.2005.08.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 03/22/2005] [Revised: 08/02/2005] [Accepted: 08/02/2005] [Indexed: 01/04/2023]