1701
Thomas S, Green A, Sturm NR, Campbell DA, Myler PJ. Histone acetylations mark origins of polycistronic transcription in Leishmania major. BMC Genomics 2009; 10:152. [PMID: 19356248 PMCID: PMC2679053 DOI: 10.1186/1471-2164-10-152] [Citation(s) in RCA: 100] [Impact Index Per Article: 6.7] [Received: 08/15/2008] [Accepted: 04/08/2009] [Indexed: 11/19/2022] Open
Abstract
Background Many components of the RNA polymerase II transcription machinery have been identified in kinetoplastid protozoa, but they diverge substantially from other eukaryotes. Furthermore, protein-coding genes in these organisms lack individual transcriptional regulation, since they are transcribed as long polycistronic units. The transcription initiation sites are assumed to lie within the 'divergent strand-switch' regions at the junction between opposing polycistronic gene clusters. However, the mechanism by which Kinetoplastidae initiate transcription is unclear, and promoter sequences are undefined. Results The chromosomal location of TATA-binding protein (TBP or TRF4), Small Nuclear Activating Protein complex (SNAP50), and H3 histones were assessed in Leishmania major using microarrays hybridized with DNA obtained through chromatin immunoprecipitation (ChIP-chip). The TBP and SNAP50 binding patterns were almost identical and high intensity peaks were associated with tRNAs and snRNAs. Only 184 peaks of acetylated H3 histone were found in the entire genome, with substantially higher intensity in rapidly-dividing cells than stationary-phase. The majority of the acetylated H3 peaks were found at divergent strand-switch regions, but some occurred at chromosome ends and within polycistronic gene clusters. Almost all these peaks were associated with lower intensity peaks of TBP/SNAP50 binding a few kilobases upstream, evidence that they represent transcription initiation sites. Conclusion The first genome-wide maps of DNA-binding protein occupancy in a kinetoplastid organism suggest that H3 histones at the origins of polycistronic transcription of protein-coding genes are acetylated. Global regulation of transcription initiation may be achieved by modifying the acetylation state of these origins.
Affiliation(s)
- Sean Thomas
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.

1702
ChIP-Seq of ERalpha and RNA polymerase II defines genes differentially responding to ligands. EMBO J 2009; 28:1418-28. [PMID: 19339991 DOI: 10.1038/emboj.2009.88] [Citation(s) in RCA: 334] [Impact Index Per Article: 22.3] [Received: 10/08/2008] [Accepted: 03/09/2009] [Indexed: 12/20/2022] Open
Abstract
We used ChIP-Seq to map ERalpha-binding sites and to profile changes in RNA polymerase II (RNAPII) occupancy in MCF-7 cells in response to estradiol (E2), tamoxifen or fulvestrant. We identify 10 205 high confidence ERalpha-binding sites in response to E2 of which 68% contain an estrogen response element (ERE) and only 7% contain a FOXA1 motif. Remarkably, 596 genes change significantly in RNAPII occupancy (59% up and 41% down) already after 1 h of E2 exposure. Although promoter proximal enrichment of RNAPII (PPEP) occurs frequently in MCF-7 cells (17%), it is only observed on a minority of E2-regulated genes (4%). Tamoxifen and fulvestrant partially reduce ERalpha DNA binding and prevent RNAPII loading on the promoter and coding body on E2-upregulated genes. Both ligands act differently on E2-downregulated genes: tamoxifen acts as an agonist thus downregulating these genes, whereas fulvestrant antagonizes E2-induced repression and often increases RNAPII occupancy. Furthermore, our data identify genes preferentially regulated by tamoxifen but not by E2 or fulvestrant. Thus (partial) antagonist loaded ERalpha acts mechanistically different on E2-activated and E2-repressed genes.

1703
van Hooff SR, Koster J, Hulsen T, van Schaik BDC, Roos M, van Batenburg MF, Versteeg R, van Kampen AHC. The construction of genome-based transcriptional units. OMICS: A Journal of Integrative Biology 2009; 13:105-14. [PMID: 19320556 DOI: 10.1089/omi.2008.0036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Gene-oriented sequence clusters (transcriptional units) have found many applications in genomics research including the construction of transcriptome maps and identification of splice variants. We developed a new method to construct transcriptional units that uses the genomic sequence as a template. We present and discuss our method in detail together with an evaluation of the transcriptional units for human. We constructed 33,007 and 27,792 transcriptional units for human and mouse, respectively. The sensitivity (81%) and specificity (90%) of our method compare favorably to other established methods. We evaluated the representation of experimentally validated and predicted intergenic spliced transcripts in humans and show that we correctly represent a large fraction of these cases by single transcriptional units. Our method performs well, but the evaluation of the final set of transcriptional units shows that improvements to the algorithm are still possible. However, because the precise number and types of errors are difficult to track, it is not obvious how to significantly improve the algorithm. We believe that ongoing research efforts are necessary to further improve current methods. This should include detailed documentation, comparison, and evaluation of current methods.
Affiliation(s)
- Sander R van Hooff
- Bioinformatics Laboratory, Academic Medical Center, Meibergdreef 9, Amsterdam, The Netherlands

1704
Schmitz KM, Schmitt N, Hoffmann-Rohrer U, Schäfer A, Grummt I, Mayer C. TAF12 recruits Gadd45a and the nucleotide excision repair complex to the promoter of rRNA genes leading to active DNA demethylation. Mol Cell 2009; 33:344-53. [PMID: 19217408 DOI: 10.1016/j.molcel.2009.01.015] [Citation(s) in RCA: 147] [Impact Index Per Article: 9.8] [Received: 08/13/2008] [Revised: 12/11/2008] [Accepted: 01/27/2009] [Indexed: 11/28/2022]
Abstract
Many studies have detailed the repressive effects of DNA methylation on gene expression. However, the mechanisms that promote active demethylation are just beginning to emerge. Here, we show that methylation of the rDNA promoter is a dynamic and reversible process. Demethylation of rDNA is initiated by recruitment of Gadd45a (growth arrest and DNA damage inducible protein 45 alpha) to the rDNA promoter by TAF12, a TBP-associated factor that is contained in Pol I- and Pol II-specific TBP-TAF complexes. Once targeted to rDNA, Gadd45a triggers demethylation of promoter-proximal DNA by recruiting the nucleotide excision repair (NER) machinery to remove methylated cytosines. Knockdown of Gadd45a, XPA, XPG, XPF, or TAF12 or treatment with drugs that inhibit NER causes hypermethylation of rDNA, establishes heterochromatic histone marks, and impairs transcription. The results reveal a mechanism that recruits the DNA repair machinery to the promoter of active genes, keeping them in a hypomethylated state.
Affiliation(s)
- Kerstin-Maike Schmitz
- Division of Molecular Biology of the Cell II, German Cancer Research Center, DKFZ-ZMBH Alliance, INF 581, D-69120 Heidelberg, Germany

1705
Lebedev A, Scharffetter-Kochanek K, Iben S. A novel activity enhances promoter escape of RNA polymerase I. Biochem Biophys Res Commun 2009; 380:695-8. [PMID: 19285024 DOI: 10.1016/j.bbrc.2009.01.154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2009] [Accepted: 01/26/2009] [Indexed: 11/27/2022]
Abstract
We have characterized a novel transcriptional activity from HeLa cells that is required for ribosomal gene transcription by RNA polymerase I. This activity has a native molecular mass of 16 kDa and does not bind to conventional chromatographic resins. Single-round and immobilized-template experiments revealed that initiation complex formation is independent of the novel activity. Functional studies showed that it stimulates the transition from initiation to elongation, promoter escape. Thus the novel activity does not resemble the mouse initiation/elongation factor TIF-IC but is a true novel entity.
Affiliation(s)
- Anton Lebedev
- Department of Dermatology and Allergic Diseases, University of Ulm, Meyerhofstrasse N27, 89081 Ulm, Germany

1706
Gao J, Li Z. Comparing four different approaches for the determination of inter-residue interactions provides insight for the structure prediction of helical membrane proteins. Biopolymers 2009; 91:547-56. [PMID: 19241463 DOI: 10.1002/bip.21175] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022]
Abstract
Studying inter-residue interactions provides insight into the folding and stability of both soluble and membrane proteins and is essential for developing computational tools for protein structure prediction. As the first step, various approaches for elucidating such interactions within protein structures have been proposed and proven useful. Since different approaches may grasp different aspects of protein structural folds, it is of interest to systematically compare them. In this work, we applied four approaches for determining inter-residue interactions to the analysis of three distinct structure datasets of helical membrane proteins and compared their correlation to the three individual quality measures of structures in these datasets. These datasets included one of 35 structures of rhodopsin receptors and bacterial rhodopsins determined at various resolutions, one derived from the HOMEP benchmark dataset previously reported, and one comprising 139 homology models. It was found that the correlation between the average number of inter-residue interactions obtained by applying the four approaches and the available structure quality measures varied quite significantly among them. The best correlation was achieved by the approach focusing exclusively on favorable inter-residue interactions. These results provide interesting insight for the development of an objective quality measure for the structure prediction of helical membrane proteins.
Affiliation(s)
- Jun Gao
- Department of Bioinformatics, University of the Sciences in Philadelphia, Philadelphia, PA 19104, USA

1707
Altenhoff AM, Dessimoz C. Phylogenetic and functional assessment of orthologs inference projects and methods. PLoS Comput Biol 2009; 5:e1000262. [PMID: 19148271 PMCID: PMC2612752 DOI: 10.1371/journal.pcbi.1000262] [Citation(s) in RCA: 278] [Impact Index Per Article: 18.5] [Received: 05/21/2008] [Accepted: 11/26/2008] [Indexed: 01/06/2023] Open
Abstract
Accurate genome-wide identification of orthologs is a central problem in comparative genomics, a fact reflected by the numerous orthology identification projects developed in recent years. However, only a few reports have compared their accuracy, and indeed, several recent efforts have not yet been systematically evaluated. Furthermore, orthology is typically only assessed in terms of function conservation, despite the phylogeny-based original definition of Fitch. We collected and mapped the results of nine leading orthology projects and methods (COG, KOG, Inparanoid, OrthoMCL, Ensembl Compara, Homologene, RoundUp, EggNOG, and OMA) and two standard methods (bidirectional best-hit and reciprocal smallest distance). We systematically compared their predictions with respect to both phylogeny and function, using six different tests. This required the mapping of millions of sequences, the handling of hundreds of millions of predicted pairs of orthologs, and the computation of tens of thousands of trees. In phylogenetic analysis or in functional analysis where high specificity is required, we find that OMA and Homologene perform best. At lower functional specificity but higher coverage level, OrthoMCL outperforms Ensembl Compara, and to a lesser extent Inparanoid. Lastly, the large coverage of the recent EggNOG can be of interest to build broad functional grouping, but the method is not specific enough for phylogenetic or detailed function analyses. In terms of general methodology, we observe that the more sophisticated tree reconstruction/reconciliation approach of Ensembl Compara was at times outperformed by pairwise comparison approaches, even in phylogenetic tests. Furthermore, we show that standard bidirectional best-hit often outperforms projects with more complex algorithms. First, the present study provides guidance for the broad community of orthology data users as to which database best suits their needs. 
Second, it introduces new methodology to verify orthology. And third, it sets performance standards for current and future approaches.
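The "bidirectional best-hit" baseline named in this abstract can be illustrated with a minimal sketch. It assumes pairwise similarity scores are already available as dictionaries; the gene names and scores below are hypothetical, and the code is not taken from the cited study.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    `scores` is a dict of {(query, subject): similarity_score}.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Hypothetical toy scores between genome A and genome B.
a_vs_b = {("a1", "b1"): 90, ("a1", "b2"): 40, ("a2", "b2"): 70, ("a3", "b2"): 75}
b_vs_a = {("b1", "a1"): 88, ("b2", "a3"): 76, ("b2", "a2"): 69}

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy case a2 is dropped because b2 prefers a3, which is the reciprocity requirement that distinguishes the method from a one-directional best-hit search.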
Affiliation(s)
- Adrian M Altenhoff
- Institute of Computational Science, ETH Zurich, and Swiss Institute of Bioinformatics, Zürich, Switzerland.

1708
Hulsen T, Groenen PMA, de Vlieg J, Alkema W. PhyloPat: an updated version of the phylogenetic pattern database contains gene neighborhood. Nucleic Acids Res 2009; 37:D731-7. [PMID: 18832367 PMCID: PMC2686476 DOI: 10.1093/nar/gkn645] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 02/05/2023] Open
Abstract
Phylogenetic patterns show the presence or absence of certain genes in a set of full genomes derived from different species. They can also be used to determine sets of genes that occur only in certain evolutionary branches. Previously, we presented a database named PhyloPat which allows the complete Ensembl gene database to be queried using phylogenetic patterns. Here, we describe an updated version of PhyloPat which can be queried by an improved web server. We used a single linkage clustering algorithm to create 241,697 phylogenetic lineages, using all the orthologies provided by Ensembl v49. PhyloPat offers the possibility of querying with binary phylogenetic patterns or regular expressions, or through a phylogenetic tree of the 39 included species. Users can also input a list of Ensembl, EMBL, EntrezGene or HGNC IDs to check which phylogenetic lineage any gene belongs to. A link to the FatiGO web interface has been incorporated in the HTML output. For each gene, the surrounding genes on the chromosome, color coded according to their phylogenetic lineage can be viewed, as well as FASTA files of the peptide sequences of each lineage. Furthermore, lists of omnipresent, polypresent, oligopresent and anticorrelating genes have been included. PhyloPat is freely available at http://www.cmbi.ru.nl/phylopat.
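As a rough illustration of querying binary phylogenetic patterns with regular expressions, the sketch below encodes presence (1) or absence (0) of a lineage in each species as one character of a string. The species order, lineage names, and patterns are hypothetical and do not reflect the actual PhyloPat data or interface.

```python
import re

# Hypothetical species order: one character per species in each pattern.
SPECIES = ["human", "mouse", "chicken", "zebrafish", "fly"]

# Hypothetical lineages with presence (1) / absence (0) per species.
LINEAGES = {
    "lineage_A": "11111",
    "lineage_B": "11000",
    "lineage_C": "00011",
    "lineage_D": "11100",
}


def query_patterns(lineages, pattern):
    """Return the names of lineages whose full pattern matches the regex."""
    regex = re.compile(pattern)
    return sorted(name for name, bits in lineages.items() if regex.fullmatch(bits))


# Present in human and mouse, absent from every other species.
mammal_only = query_patterns(LINEAGES, "110{3}")

# Present in human and mouse, absent in fly, anything in between.
not_in_fly = query_patterns(LINEAGES, "11[01]{2}0")

print(mammal_only, not_in_fly)
```

Using `fullmatch` keeps each regex aligned with the fixed species order, so every position in the expression refers to exactly one species.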
Affiliation(s)
- Tim Hulsen
- Computational Drug Discovery, CMBI, NCMLS, Radboud University Nijmegen Medical Centre, PO Box 9101, 6500 HB Nijmegen, The Netherlands.

1709
Pabuwal V, Li Z. Comparative analysis of the packing topology of structurally important residues in helical membrane and soluble proteins. Protein Eng Des Sel 2008; 22:67-73. [DOI: 10.1093/protein/gzn074] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open

1710
Kuo CH, Wares JP, Kissinger JC. The Apicomplexan whole-genome phylogeny: an analysis of incongruence among gene trees. Mol Biol Evol 2008; 25:2689-98. [PMID: 18820254 PMCID: PMC2582981 DOI: 10.1093/molbev/msn213] [Citation(s) in RCA: 90] [Impact Index Per Article: 5.6] [Accepted: 09/18/2008] [Indexed: 11/26/2022] Open
Abstract
The protistan phylum Apicomplexa contains many important pathogens and is the subject of intense genome sequencing efforts. Based upon the genome sequences from seven apicomplexan species and a ciliate outgroup, we identified 268 single-copy genes suitable for phylogenetic inference. Both concatenation and consensus approaches inferred the same species tree topology. This topology is consistent with most prior conceptions of apicomplexan evolution based upon ultrastructural and developmental characters, that is, the piroplasm genera Theileria and Babesia form the sister group to the Plasmodium species, the coccidian genera Eimeria and Toxoplasma are monophyletic and are the sister group to the Plasmodium species and piroplasm genera, and Cryptosporidium forms the sister group to the above mentioned with the ciliate Tetrahymena as the outgroup. The level of incongruence among gene trees appears to be high at first glance; only 19% of the genes support the species tree, and a total of 48 different gene-tree topologies are observed. Detailed investigations suggest that the low signal-to-noise ratio in many genes may be the main source of incongruence. The probability of being consistent with the species tree increases as a function of the minimum bootstrap support observed at tree nodes for a given gene tree. Moreover, gene sequences that generate high bootstrap support are robust to the changes in alignment parameters or phylogenetic method used. However, caution should be taken in that some genes can infer a "wrong" tree with strong support because of paralogy, model violations, or other causes. The importance of examining multiple, unlinked genes that possess a strong phylogenetic signal cannot be overstated.

1711
Koeppel M, van Heeringen SJ, Smeenk L, Navis AC, Janssen-Megens EM, Lohrum M. The novel p53 target gene IRF2BP2 participates in cell survival during the p53 stress response. Nucleic Acids Res 2008; 37:322-35. [PMID: 19042971 PMCID: PMC2632907 DOI: 10.1093/nar/gkn940] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Indexed: 12/20/2022] Open
Abstract
The tumor suppressor p53 contributes to the cellular fate after genotoxic insults, mainly through the regulation of target genes, thereby allowing e.g. repair mechanisms resulting in cell survival or inducing apoptosis. What remains unresolved is which exact mechanisms lead to one or the other cellular outcome. Here, we describe the interferon regulatory factor-2-binding protein-2 (IRF2BP2) as a new direct target gene of p53, influencing the p53-mediated cellular decision. We show that upregulation of IRF2BP2 after treatment with actinomycin D (Act.D) is dependent on functional p53 in different cell lines. This occurs in parallel with the down-regulation of the interacting partner of IRF2BP2, the interferon regulatory factor-2 (IRF2), which is known to positively influence cell growth. Analysis of the molecular functions of IRF2BP2 indicates that it can impede the p53-mediated transactivation of the p21 and Bax genes. We show here that overexpressed IRF2BP2 has an impact on the cellular stress response after Act.D treatment and that it diminishes the induction of apoptosis after doxorubicin treatment. Furthermore, the knockdown of IRF2BP2 leads to an upregulation of p21 and faster induction of apoptosis after doxorubicin as well as Act.D treatment.
Affiliation(s)
- Max Koeppel
- Department of Molecular Biology, Faculty of Science, Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands

1712
Affiliation(s)
- Brian McStay
- Biomedical Research Center, Ninewells Hospital, University of Dundee, Dundee DD1 9SY, United Kingdom
- Ingrid Grummt
- Molecular Biology of the Cell II, German Cancer Research Center, DKFZ-ZMBH Alliance, D-69120 Heidelberg, Germany

1713
Abstract
Automated use of phylogenetic trees to deduce orthology relationships in proteins. Reliable orthology prediction is central to comparative genomics. Although orthology is defined by phylogenetic criteria, most automated prediction methods are based on pairwise sequence comparisons. Recently, automated phylogeny-based orthology prediction has emerged as a feasible alternative for genome-wide studies.
Affiliation(s)
- Toni Gabaldón
- Bioinformatics and Genomics Program, Center for Genomic Regulation, Doctor Aiguader 88, Barcelona, Spain.

1714
Xing H, Vanderford NL, Sarge KD. The TBP-PP2A mitotic complex bookmarks genes by preventing condensin action. Nat Cell Biol 2008; 10:1318-23. [PMID: 18931662 DOI: 10.1038/ncb1790] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.2] [Received: 06/20/2008] [Accepted: 08/12/2008] [Indexed: 11/09/2022]
Abstract
To maintain phenotypes of cell lineages, cells must 'remember' which genes were active before mitosis entry and transmit this information to their daughter cells so that expression patterns can be faithfully re-established in G1. This phenomenon is called gene bookmarking. However, during mitosis transcription ceases, most sequence-specific proteins dissociate from DNA and the chromatin is tightly compacted, making it difficult to understand how gene activity 'memory' is maintained through this stage of the cell cycle. A feature of gene bookmarking is that in mitotic cells, the promoters of formerly active genes lack compaction, but how compaction of these regions is inhibited is unknown. Here we show that during mitosis, TATA-binding protein (TBP), which remains bound to DNA during mitosis, recruits PP2A. TBP also interacts with condensin to allow efficient dephosphorylation and inactivation of condensin near these promoters to inhibit their compaction. Further, ChIP-on-chip data show that TBP is bound to many chromosomal sites during mitosis, and is higher in transcribed regions but low in regions containing pseudogenes and genes whose expression is tissue-restricted. These results suggest that TBP is involved not only in gene transcription during interphase but also in preserving the memory of gene activity through mitosis to daughter cells.
Affiliation(s)
- Hongyan Xing
- Department of Molecular and Cellular Biochemistry, Chandler Medical Center, University of Kentucky, Lexington, KY 40536, USA

1715
Hulsen T, de Vlieg J, Alkema W. BioVenn - a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams. BMC Genomics 2008. [PMID: 18925949 DOI: 10.1186/1471-2164-9-488] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In many genomics projects, numerous lists containing biological identifiers are produced. Often it is useful to see the overlap between different lists, enabling researchers to quickly observe similarities and differences between the data sets they are analyzing. One of the most popular methods to visualize the overlap and differences between data sets is the Venn diagram: a diagram consisting of two or more circles in which each circle corresponds to a data set, and the overlap between the circles corresponds to the overlap between the data sets. Venn diagrams are especially useful when they are 'area-proportional' i.e. the sizes of the circles and the overlaps correspond to the sizes of the data sets. Currently there are no programs available that can create area-proportional Venn diagrams connected to a wide range of biological databases. RESULTS We designed a web application named BioVenn to summarize the overlap between two or three lists of identifiers, using area-proportional Venn diagrams. The user only needs to input these lists of identifiers in the textboxes and push the submit button. Parameters like colors and text size can be adjusted easily through the web interface. The position of the text can be adjusted by 'drag-and-drop' principle. The output Venn diagram can be shown as an SVG or PNG image embedded in the web application, or as a standalone SVG or PNG image. The latter option is useful for batch queries. Besides the Venn diagram, BioVenn outputs lists of identifiers for each of the resulting subsets. If an identifier is recognized as belonging to one of the supported biological databases, the output is linked to that database. Finally, BioVenn can map Affymetrix and EntrezGene identifiers to Ensembl genes. CONCLUSION BioVenn is an easy-to-use web application to generate area-proportional Venn diagrams from lists of biological identifiers. 
It supports a wide range of identifiers from the most used biological databases currently available. Its implementation on the World Wide Web makes it available for use on any computer with internet connection, independent of operating system and without the need to install programs locally. BioVenn is freely accessible at http://www.cmbi.ru.nl/cdd/biovenn/.
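The overlap computation behind a three-list Venn diagram can be sketched with plain set operations. This is only an illustration of how the subset lists and their sizes might be derived; the function name and identifiers are hypothetical, the code is not BioVenn's implementation, and it does not attempt the area-proportional drawing itself.

```python
def venn_subsets(x, y, z):
    """Split three identifier lists into the seven disjoint Venn regions."""
    x, y, z = set(x), set(y), set(z)
    return {
        "x_only": x - y - z,
        "y_only": y - x - z,
        "z_only": z - x - y,
        "xy_only": (x & y) - z,
        "xz_only": (x & z) - y,
        "yz_only": (y & z) - x,
        "xyz": x & y & z,
    }


# Hypothetical identifier lists.
list_x = ["g1", "g2", "g3", "g4"]
list_y = ["g3", "g4", "g5"]
list_z = ["g4", "g5", "g6"]

subsets = venn_subsets(list_x, list_y, list_z)
sizes = {region: len(ids) for region, ids in subsets.items()}
print(sizes)
```

Because the seven regions are disjoint, their sizes sum to the size of the union of the three lists, which gives a quick consistency check before any drawing step.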
Affiliation(s)
- Tim Hulsen
- Computational Drug Discovery, CMBI, NCMLS, Radboud University Nijmegen Medical Centre, PO Box 9101, 6500 HB Nijmegen, The Netherlands.

1716
Hulsen T, de Vlieg J, Alkema W. BioVenn - a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams. BMC Genomics 2008; 9:488. [PMID: 18925949 PMCID: PMC2584113 DOI: 10.1186/1471-2164-9-488] [Citation(s) in RCA: 1096] [Impact Index Per Article: 68.5] [Received: 06/24/2008] [Accepted: 10/16/2008] [Indexed: 02/08/2023] Open
Affiliation(s)
- Tim Hulsen
- Computational Drug Discovery (CDD), CMBI, NCMLS, Radboud University Nijmegen Medical Centre, P.O. Box 9101, 6500 HB Nijmegen, The Netherlands
- Jacob de Vlieg
- Computational Drug Discovery (CDD), CMBI, NCMLS, Radboud University Nijmegen Medical Centre, P.O. Box 9101, 6500 HB Nijmegen, The Netherlands
- Molecular Design and Informatics, Schering-Plough, P.O. Box 20, 5340 BH Oss, The Netherlands
- Wynand Alkema
- Molecular Design and Informatics, Schering-Plough, P.O. Box 20, 5340 BH Oss, The Netherlands

1717
McMillan LEM, Martin ACR. Automatically extracting functionally equivalent proteins from SwissProt. BMC Bioinformatics 2008; 9:418. [PMID: 18838004 PMCID: PMC2576269 DOI: 10.1186/1471-2105-9-418] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 03/03/2008] [Accepted: 10/06/2008] [Indexed: 11/10/2022] Open
Abstract
Background There is a frequent need to obtain sets of functionally equivalent homologous proteins (FEPs) from different species. While it is usually the case that orthology implies functional equivalence, this is not always true; therefore datasets of orthologous proteins are not appropriate. The information relevant to extracting FEPs is contained in databanks such as UniProtKB/Swiss-Prot, and a manual analysis of these data allows FEPs to be extracted on a one-off basis. However, there has been no resource allowing the easy, automatic extraction of groups of FEPs, for example, all instances of protein C. We have developed FOSTA, an automatically generated database of FEPs annotated as having the same function in UniProtKB/Swiss-Prot which can be used for large-scale analysis. The method builds a candidate list of homologues and filters out functionally diverged proteins on the basis of functional annotations using a simple text mining approach. Results Large-scale evaluation of our FEP extraction method is difficult as there is no gold-standard dataset against which the method can be benchmarked. However, a manual analysis of five protein families confirmed a high level of performance. A more extensive comparison with two manually verified functional equivalence datasets also demonstrated very good performance. Conclusion In summary, FOSTA provides an automated analysis of annotations in UniProtKB/Swiss-Prot to enable groups of proteins already annotated as functionally equivalent to be extracted. Our results demonstrate that the vast majority of UniProtKB/Swiss-Prot functional annotations are of high quality, and that FOSTA can interpret annotations successfully. Where FOSTA is not successful, we are able to highlight inconsistencies in UniProtKB/Swiss-Prot annotation. Most of these would have presented equal difficulties for manual interpretation of annotations. We discuss limitations and possible future extensions to FOSTA, and recommend changes to the UniProtKB/Swiss-Prot format, which would facilitate text-mining of UniProtKB/Swiss-Prot.
Affiliation(s)
- Lisa E M McMillan
- Research Department of Structural & Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK.
|
1718
|
Härterich S, Koschatzky S, Einsiedel J, Gmeiner P. Novel insights into GPCR—Peptide interactions: Mutations in extracellular loop 1, ligand backbone methylations and molecular modeling of neurotensin receptor 1. Bioorg Med Chem 2008; 16:9359-68. [DOI: 10.1016/j.bmc.2008.08.051] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2008] [Accepted: 08/22/2008] [Indexed: 11/24/2022]
|
1719
|
The quest for orthologs: finding the corresponding gene across genomes. Trends Genet 2008; 24:539-51. [PMID: 18819722 DOI: 10.1016/j.tig.2008.08.009] [Citation(s) in RCA: 238] [Impact Index Per Article: 14.9] [Received: 09/12/2007] [Revised: 08/20/2008] [Accepted: 08/21/2008] [Indexed: 11/23/2022]
Abstract
Orthology is a key evolutionary concept in many areas of genomic research. It provides a framework for subjects as diverse as the evolution of genomes, gene functions, cellular networks and functional genome annotation. Although orthologous proteins usually perform equivalent functions in different species, establishing true orthologous relationships requires a phylogenetic approach, which combines both trees and graphs (networks) using reliable species phylogeny and available genomic data from more than two species, and an insight into the processes of molecular evolution. Here, we evaluate the available bioinformatics tools and provide a set of guidelines to aid researchers in choosing the most appropriate tool for any situation.
|
1720
|
Wong RSY, Bodart V, Metz M, Labrecque J, Bridger G, Fricker SP. Comparison of the potential multiple binding modes of bicyclam, monocylam, and noncyclam small-molecule CXC chemokine receptor 4 inhibitors. Mol Pharmacol 2008; 74:1485-95. [PMID: 18768385 DOI: 10.1124/mol.108.049775] [Citation(s) in RCA: 99] [Impact Index Per Article: 6.2] [Indexed: 11/22/2022] Open
Abstract
CXC chemokine receptor (CXCR)4 is an HIV coreceptor and a chemokine receptor that plays an important role in several physiological and pathological processes, including hematopoiesis, leukocyte homing and trafficking, metastasis, and angiogenesis. This receptor belongs to the class A family of G protein-coupled receptors and is a validated target for the development of a new class of antiretroviral therapeutics. This study compares the interactions of three structurally diverse small-molecule CXCR4 inhibitors with the receptor and is the first report of the molecular interactions of the nonmacrocyclic CXCR4 inhibitor (S)-N'-(1H-benzimidazol-2-ylmethyl)-N'-(5,6,7,8-tetrahydroquinolin-8-yl)butene-1,4-diamine (AMD11070). Fourteen CXCR4 single-site mutants representing amino acid residues that span the entire putative ligand binding pocket were used in this study. These mutants were used in binding studies to examine how each single-site mutation affected the ability of the inhibitors to compete with (125)I-stromal-derived factor-1alpha binding. Our data suggest that these CXCR4 inhibitors bind to overlapping but not identical amino acid residues in the transmembrane regions of the receptor. In addition, our results identified amino acid residues that are involved in unique interactions with two of the CXCR4 inhibitors studied. These data suggest an extended binding pocket in the transmembrane regions close to the second extracellular loop of the receptor. Based on site-directed mutagenesis and molecular modeling, several potential binding modes were proposed for each inhibitor. These mechanistic studies might prove to be useful for the development of future generations of CXCR4 inhibitors with improved clinical pharmacology and safety profiles.
|
1721
|
Franck E, Hulsen T, Huynen MA, de Jong WW, Lubsen NH, Madsen O. Evolution of closely linked gene pairs in vertebrate genomes. Mol Biol Evol 2008; 25:1909-21. [PMID: 18566020 DOI: 10.1093/molbev/msn136] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 02/05/2023] Open
Abstract
The orientation of closely linked genes in mammalian genomes is not random: there are more head-to-head (h2h) gene pairs than expected. To understand the origin of this enrichment in h2h gene pairs, we have analyzed the phylogenetic distribution of gene pairs separated by less than 600 bp of intergenic DNA (gene duos). We show here that a lack of head-to-tail (h2t) gene duos is an even more distinctive characteristic of mammalian genomes, with the platypus genome as the only exception. In nonmammalian vertebrate and in nonvertebrate genomes, the frequency of h2h, h2t, and tail-to-tail (t2t) gene duos is close to random. In tetrapod genomes, the h2t and t2t gene duos are more likely to be part of a larger gene cluster of closely spaced genes than h2h gene duos; in fish and urochordate genomes, the reverse is seen. In human and mouse tissues, the expression profiles of gene duos were skewed toward positive coexpression, irrespective of orientation. The organization of orthologs of both members of about 40% of the human gene duos could be traced in other species, enabling a prediction of the organization at the branch points of gnathostomes, tetrapods, amniotes, and euarchontoglires. The accumulation of h2h gene duos started in tetrapods, whereas that of h2t and t2t gene duos only started in amniotes. The apparent lack of evolutionary conservation of h2t and t2t gene duos relative to that of h2h gene duos is thus a result of their relatively late origin in the lineage leading to mammals; we show that once they are formed h2t and t2t gene duos are as stable as h2h gene duos.
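The gene-duo classification described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Gene` record, the inclusive-coordinate gap formula, and the choice to skip overlapping genes are assumptions made here for illustration. Only the 600 bp threshold and the h2h/h2t/t2t labels come from the abstract.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Gene:
    # Hypothetical record: 1-based inclusive coordinates on one chromosome.
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"

def classify_duo(left: Gene, right: Gene) -> str:
    """Label a pair of neighbours, with `left` lying upstream on the + strand."""
    if left.strand == right.strand:
        return "h2t"  # same direction: the head of one faces the tail of the other
    if left.strand == "-" and right.strand == "+":
        return "h2h"  # divergent: both 5' ends face the shared intergenic region
    return "t2t"      # convergent: both 3' ends face the shared intergenic region

def gene_duos(genes, max_gap=600):
    """Return (left, right, label) for neighbours separated by < max_gap bp."""
    ordered = sorted(genes, key=lambda g: g.start)
    duos = []
    for left, right in zip(ordered, ordered[1:]):
        gap = right.start - left.end - 1  # intergenic bases between the two genes
        if 0 <= gap < max_gap:            # assumption: overlapping genes are skipped
            duos.append((left.name, right.name, classify_duo(left, right)))
    return duos

genes = [
    Gene("a", 100, 500, "-"),
    Gene("b", 800, 1500, "+"),
    Gene("c", 1700, 2500, "+"),
    Gene("d", 2600, 3000, "-"),
    Gene("e", 5000, 6000, "+"),
]
print(gene_duos(genes))
```

In this toy input the pair d/e is dropped because its intergenic gap exceeds the threshold. Comparing the observed label counts with a random-orientation expectation would be the next step, which the sketch leaves out.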
Affiliation(s)
- Erik Franck
- Biomolecular Chemistry, 271 Nijmegen Center of Molecular Life Science, Radboud University Nijmegen, Nijmegen, The Netherlands
|
1722
|
Bastien O, Maréchal E. Evolution of biological sequences implies an extreme value distribution of type I for both global and local pairwise alignment scores. BMC Bioinformatics 2008; 9:332. [PMID: 18687111 PMCID: PMC2529321 DOI: 10.1186/1471-2105-9-332] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 05/02/2007] [Accepted: 08/07/2008] [Indexed: 11/23/2022] Open
Abstract
Background Confidence in pairwise alignments of biological sequences, obtained by various methods such as Blast or Smith-Waterman, is critical for automatic analyses of genomic data. Two statistical models have been proposed. In the asymptotic limit of long sequences, the Karlin-Altschul model is based on the computation of a P-value, assuming that the number of high scoring matching regions above a threshold is Poisson distributed. Alternatively, the Lipman-Pearson model is based on the computation of a Z-value from a random score distribution obtained by a Monte-Carlo simulation. Z-values allow the deduction of an upper bound of the P-value (1/Z-value²) following the TULIP theorem. Simulations of the Z-value distribution are known to fit a Gumbel law. This remarkable property had not been demonstrated and had no obvious biological support. Results We built a model of evolution of sequences based on aging, as meant in Reliability Theory, using the fact that the amount of information shared between an initial sequence and the sequences in its lineage (i.e., mutual information in Information Theory) is a decreasing function of time. This quantity is simply measured by a sequence alignment score. In systems aging, the failure rate is related to the system's longevity. The system can be a machine with structured components, or a living entity or population. "Reliability" refers to the ability to operate properly according to a standard. Here, the "reliability" of a sequence refers to the ability to conserve a sufficient functional level at the folded and maturated protein level (positive selection pressure). Homologous sequences were considered as systems 1) having a high redundancy of information reflected by the magnitude of their alignment scores, and 2) whose components are the amino acids that can independently be damaged by random DNA mutations. From these assumptions, we deduced that information shared at each amino acid position evolved with a constant rate, corresponding to the information hazard rate, and that pairwise sequence alignment scores should follow a Gumbel distribution, whose parameters could find some theoretical rationale. In particular, one parameter corresponds to the information hazard rate. Conclusion Extreme value distribution of alignment scores, assessed from high-scoring segment pairs following the Karlin-Altschul model, can also be deduced from the Reliability Theory applied to molecular sequences. It reflects the redundancy of information between homologous sequences, under functional conservative pressure. This model also provides a link between concepts of biological sequence analysis and of systems biology.
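The Monte-Carlo Z-value mentioned in this abstract can be sketched in a few lines of Python. This is an illustration only, not the authors' code. The scoring function below is a toy ungapped identity count standing in for a real alignment score such as Smith-Waterman, and the function names, shuffle count, and seed are assumptions. The 1/Z² bound is the one quoted in the abstract.

```python
import random
import statistics

def match_score(a: str, b: str) -> int:
    """Toy score: number of identical positions (stand-in for an alignment score)."""
    return sum(x == y for x, y in zip(a, b))

def z_value(a: str, b: str, n_shuffles: int = 200, seed: int = 0) -> float:
    """Z-value of the observed score against scores of shuffled versions of `b`."""
    rng = random.Random(seed)
    observed = match_score(a, b)
    null_scores = []
    for _ in range(n_shuffles):
        shuffled = list(b)
        rng.shuffle(shuffled)  # keeps the composition of b, destroys its order
        null_scores.append(match_score(a, "".join(shuffled)))
    mu = statistics.mean(null_scores)
    sigma = statistics.pstdev(null_scores)
    if sigma == 0:
        raise ValueError("null scores have zero variance; Z-value is undefined")
    return (observed - mu) / sigma

def p_value_upper_bound(z: float) -> float:
    """Upper bound 1/Z^2 on the P-value, capped at 1; uninformative for z <= 0."""
    if z <= 0:
        return 1.0
    return min(1.0, 1.0 / z ** 2)

seq = "ACDEFGHIKLMNPQRSTVWY" * 2
z = z_value(seq, seq)
print(z, p_value_upper_bound(z))
```

Shuffling only one of the two sequences is a design choice of this sketch. The bound is loose by construction, so it is useful mainly when no distributional fit (such as the Gumbel law discussed in the abstract) is available.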
Affiliation(s)
- Olivier Bastien
- UMR 5168 CNRS-CEA-INRA-Université J. Fourier, Laboratoire de Physiologie Cellulaire Végétale, Département Réponse et Dynamique Cellulaire, CEA Grenoble, 17 rue des Martyrs, F-38054, Grenoble cedex 09, France.
|
1723
|
Vinci G, Xia X, Veitia RA. Preservation of genes involved in sterol metabolism in cholesterol auxotrophs: facts and hypotheses. PLoS One 2008; 3:e2883. [PMID: 18682733 PMCID: PMC2478713 DOI: 10.1371/journal.pone.0002883] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Received: 05/27/2008] [Accepted: 07/11/2008] [Indexed: 12/02/2022] Open
Abstract
Background It is known that primary sequences of enzymes involved in sterol biosynthesis are well conserved in organisms that produce sterols de novo. However, we provide evidence for a preservation of the corresponding genes in two animals unable to synthesize cholesterol (auxotrophs): Drosophila melanogaster and Caenorhabditis elegans. Principal Findings We have been able to detect bona fide orthologs of several ERG genes in both organisms using a series of complementary approaches. We have detected strong sequence divergence between the orthologs of the nematode and of the fruitfly; they are also very divergent with respect to the orthologs in organisms able to synthesize sterols de novo (prototrophs). Interestingly, the orthologs in both the nematode and the fruitfly are still under selective pressure. It is possible that these genes, which are not involved in cholesterol synthesis anymore, have been recruited to perform different new functions. We propose a more parsimonious way to explain their accelerated evolution and subsequent stabilization. The products of ERG genes in prototrophs might be involved in several biological roles, in addition to sterol synthesis. In the case of the nematode and the fruitfly, the relevant genes would have lost their ancestral function in cholesterogenesis but would have retained the other function(s), which keep them under pressure. Conclusions By exploiting microarray data we have noticed a strong expressional correlation between the orthologs of ERG24 and ERG25 in D. melanogaster and genes encoding factors involved in intracellular protein trafficking and folding and with Start1 involved in ecdysteroid synthesis. These potential functional connections are worth being explored not only in Drosophila, but also in Caenorhabditis as well as in sterol prototrophs.
Affiliation(s)
- Giovanna Vinci
- Institut Cochin, Département de Génétique et Développement. Inserm, U567, CNRS, UMR 8104, Université Paris 5, Faculté de Médecine Paris Descartes, UM 3, Paris, France
- Xuhua Xia
- CAREG and Biology Department, University of Ottawa, Ottawa, Ontario, Canada
- Reiner A. Veitia
- Institut Cochin, Département de Génétique et Développement. Inserm, U567, CNRS, UMR 8104, Université Paris 5, Faculté de Médecine Paris Descartes, UM 3, Paris, France
- Université Denis Diderot/Paris VII, Paris, France
|
1724
|
Abeel T, Saeys Y, Rouzé P, Van de Peer Y. ProSOM: core promoter prediction based on unsupervised clustering of DNA physical profiles. Bioinformatics 2008; 24:i24-31. [PMID: 18586720 PMCID: PMC2718650 DOI: 10.1093/bioinformatics/btn172] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Indexed: 01/19/2023] Open
Abstract
MOTIVATION More and more genomes are being sequenced, and to keep up with the pace of sequencing projects, automated annotation techniques are required. One of the most challenging problems in genome annotation is the identification of the core promoter. Because the identification of the transcription initiation region is such a challenging problem, it is not yet a common practice to integrate transcription start site prediction in genome annotation projects. Nevertheless, better core promoter prediction can improve genome annotation and can be used to guide experimental work. RESULTS Comparing the average structural profile based on base stacking energy of transcribed, promoter and intergenic sequences demonstrates that the core promoter has unique features that cannot be found in other sequences. We show that unsupervised clustering by using self-organizing maps can clearly distinguish between the structural profiles of promoter sequences and other genomic sequences. An implementation of this promoter prediction program, called ProSOM, is available and has been compared with the state-of-the-art. We propose an objective, accurate and biologically sound validation scheme for core promoter predictors. ProSOM performs at least as well as the software currently available, but our technique is more balanced in terms of the number of predicted sites and the number of false predictions, resulting in a better all-round performance. Additional tests on the ENCODE regions of the human genome show that 98% of all predictions made by ProSOM can be associated with transcriptionally active regions, which demonstrates the high precision. AVAILABILITY Predictions for the human genome, the validation datasets and the program (ProSOM) are available upon request.
Affiliation(s)
- Thomas Abeel
- Department of Plant Systems Biology, VIB, 9052 Gent, Belgium
|
1725
|
Scognamiglio A, Nebbioso A, Manzo F, Valente S, Mai A, Altucci L. HDAC-class II specific inhibition involves HDAC proteasome-dependent degradation mediated by RANBP2. Biochimica et Biophysica Acta - Molecular Cell Research 2008; 1783:2030-8. [PMID: 18691615 DOI: 10.1016/j.bbamcr.2008.07.007] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Received: 04/28/2008] [Revised: 07/06/2008] [Accepted: 07/10/2008] [Indexed: 12/20/2022]
Abstract
Discovered for their ability to deacetylate histones and repress transcription, HDACs are a promising target for therapy of human diseases. The class II HDACs are mainly involved in developmental and differentiation processes, such as myogenesis. We report here that class I and class II HDAC inhibitors such as SAHA or the class II selective inhibitor MC1568 induce down-regulation of class II HDACs in human cells. In particular, both SAHA and MC1568 induce HDAC 4 down-regulation by increasing its specific sumoylation followed by activation of proteasomal pathways of degradation. Sumoylation that corresponds to HDAC 4 nuclear localization results in a transient increase of the HDAC 4 repressive action on target genes such as RARalpha and TNFalpha. The HDAC 4 degradation that follows its sumoylation results in target gene activation. Silencing of the RANBP2 E3 ligase reverts HDAC 4 repression by blocking its own sumoylation. These findings identify a crosstalk occurring between acetylation, deacetylation and sumoylation pathways and suggest that class II specific HDAC inhibitors may affect different epigenetic pathways.
Affiliation(s)
- Annamaria Scognamiglio
- Dipartimento di Patologia Generale, Seconda Università degli Studi di Napoli, Vico L. De Crecchio 7, 80138 Napoli, Italy
|
1726
|
Pseudo-NORs: a novel model for studying nucleoli. Biochimica et Biophysica Acta - Molecular Cell Research 2008; 1783:2116-23. [PMID: 18687368 DOI: 10.1016/j.bbamcr.2008.07.004] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Received: 04/23/2008] [Revised: 07/08/2008] [Accepted: 07/08/2008] [Indexed: 11/21/2022]
Abstract
Nucleolar organiser regions (NORs) are comprised of tandem arrays of ribosomal gene (rDNA) repeats that are transcribed by RNA polymerase I (Pol I), ultimately resulting in formation of a nucleolus. Upstream binding factor (UBF), a DNA binding protein and component of the Pol I transcription machinery, binds extensively across the rDNA repeat in vivo. Pseudo-NORs are tandem arrays of a heterologous DNA sequence with high affinity for UBF introduced into human chromosomes. In this review we describe how analysis of pseudo-NORs has provided important insights into nucleolar formation. Pseudo-NORs mimic endogenous NORs in a number of important respects. On metaphase chromosomes both appear as secondary constrictions comprised of undercondensed chromatin. The transcriptional silence of pseudo-NORs provides a platform for studying the transcription independent recruitment of factors required for nucleolar formation by this specialised chromatin structure. During interphase, pseudo-NORs appear as distinct and novel sub-nuclear bodies. Analysis of these bodies and comparison to their endogenous counterpart has provided insights into nucleolar formation and structure.
|
1727
|
Pisabarro AG, Perez G, Lavin JL, Ramirez L. Genetic networks for the functional study of genomes. Briefings in Functional Genomics and Proteomics 2008; 7:249-63. [DOI: 10.1093/bfgp/eln026] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022]
|
1728
|
Conservation of key members in the course of the evolution of the insulin signaling pathway. Biosystems 2008; 95:7-16. [PMID: 18616978 DOI: 10.1016/j.biosystems.2008.06.003] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Received: 12/21/2007] [Revised: 05/20/2008] [Accepted: 06/06/2008] [Indexed: 11/20/2022]
Abstract
Our understanding of the evolution of the insulin signaling pathway (ISP) is still incomplete. One intriguing unanswered question is the explanation of the emergence of the glucostatic role of insulin in mammals. To find out whether this is due to the development of new sets of signal transduction elements in these organisms, or to the establishment of new interactions between pre-existing proteins, we rebuilt putative orthologous ISPs in 17 eukaryotic organisms. Then, we computed the conservation of orthologous ISPs at different levels, from sequence similarity of orthologous proteins to co-evolution of interacting domains. We found that the emergence of the glucostatic role in mammals can be explained neither by the development of new sets of signaling elements nor by the establishment of new interactions between pre-existing proteins. The comparison of orthologous IRS molecules indicates that only in mammals have they acquired their complete functionality as efficient recruiters of effector sub-pathways.
|
1729
|
Witt M, Ślusarz M, Ciarkowski J. Molecular Modeling of Vasopressin V2 Receptor Tetramer in Hydrated Lipid Membrane. QSAR Comb Sci 2008. [DOI: 10.1002/qsar.200730082] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/07/2022]
|
1730
|
Martinelli A, Tuccinardi T. Molecular modeling of adenosine receptors: new results and trends. Med Res Rev 2008; 28:247-77. [PMID: 17492754 DOI: 10.1002/med.20106] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Indexed: 11/06/2022]
Abstract
Adenosine is a ubiquitous neuromodulator, which carries out its biological task by stimulating four cell surface receptors (A1, A2A, A2B, and A3). Adenosine receptors (ARs) are members of the superfamily of G protein-coupled receptors (GPCRs). Their discovery opened up new avenues for potential drug treatment of a variety of conditions such as asthma, neurodegenerative disorders, chronic inflammatory diseases, and many other physiopathological states that are believed to be associated with changes in adenosine levels. Knowledge of the 3D structure of ARs could be of great help in the task of understanding their function and in the rational design of specific ligands. However, since GPCRs are membrane-bound proteins, high-resolution structural characterization is still an extremely difficult task. For this reason, great importance has been placed on molecular modeling studies and, particularly in the last few years, on homology modeling (HM) techniques. The publication of the first high-resolution crystal structure for bovine rhodopsin (bRh), a GPCR superfamily member, provides the option of utilizing HM to generate 3D models based on detailed structural information. In this review we report, analyze, and compare the main experimental data, computational HM procedures and validation methods used for ARs, describing in detail the most successful results.
Affiliation(s)
- Adriano Martinelli
- Dipartimento di Scienze Farmaceutiche, Università di Pisa, via Bonanno 6, 56126 Pisa, Italy.
|
1731
|
Smeenk L, van Heeringen SJ, Koeppel M, van Driel MA, Bartels SJJ, Akkers RC, Denissov S, Stunnenberg HG, Lohrum M. Characterization of genome-wide p53-binding sites upon stress response. Nucleic Acids Res 2008; 36:3639-54. [PMID: 18474530 PMCID: PMC2441782 DOI: 10.1093/nar/gkn232] [Citation(s) in RCA: 172] [Impact Index Per Article: 10.8] [Indexed: 01/14/2023] Open
Abstract
The tumor suppressor p53 is a sequence-specific transcription factor, which regulates the expression of target genes involved in different stress responses. To understand p53's essential transcriptional functions, unbiased analysis of its DNA-binding repertoire is pivotal. In a genome-wide tiling ChIP-on-chip approach, we have identified and characterized 1546 binding sites of p53 upon Actinomycin D treatment. Among those binding sites were known as well as novel p53 target sites, which included regulatory regions of potentially novel transcripts. Using this collection of genome-wide binding sites, a new high-confidence algorithm was developed, p53scan, to identify the p53 consensus-binding motif. Strikingly, this motif was present in the majority of all bound sequences with 83% of all binding sites containing the motif. In the surrounding sequences of the binding sites, several motifs for potential regulatory cobinders were identified. Finally, we show that the majority of the genome-wide p53 target sites can also be bound by overexpressed p63 and p73 in vivo, suggesting that they can possibly play an important role at p53 binding sites. This emphasizes the possible interplay of p53 and its family members in the context of target gene binding. Our study greatly expands the known, experimentally validated p53 binding site repertoire and serves as a valuable knowledgebase for future research.
Affiliation(s)
- Leonie Smeenk
- Department of Molecular Biology, Faculty of Science, Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands
|
1732
|
Jiang LW, Lin KL, Lu CL. OGtree: a tool for creating genome trees of prokaryotes based on overlapping genes. Nucleic Acids Res 2008; 36:W475-80. [PMID: 18456706 PMCID: PMC2447762 DOI: 10.1093/nar/gkn240] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022] Open
Abstract
OGtree is a web-based tool for constructing genome trees of prokaryotic species based on a measure combining overlapping-gene content and overlapping-gene order in their whole genomes. The overlapping genes (OGs) are defined as adjacent genes whose coding sequences overlap partially or entirely. OGs are ubiquitous in microbial genomes and more conserved between species than non-OGs. Based on these properties, it has been suggested that OGs can serve as better phylogenetic characters than non-OGs for reconstructing the evolutionary relationships among microbial genomes. OGtree takes the accession numbers of prokaryotic genomes as its input. It then downloads their complete genomes from the National Centre for Biotechnology Information and identifies OGs in each genome and their orthologous OGs in other genomes. Next, OGtree computes an overlapping-gene distance between each pair of input genomes based on a combination of their OG content and orthologous OG order. Finally, it uses distance-based tree-building methods to reconstruct the genome trees of the input prokaryotic genomes according to their pairwise OG distances. OGtree is available online at http://bioalgorithm.life.nctu.edu.tw/OGtree/.
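The definition of overlapping genes given in this abstract (adjacent genes whose coding sequences overlap partially or entirely) can be illustrated with a short Python sketch. It is not OGtree's implementation: the tuple layout, the inclusive coordinates, and the function name `overlapping_pairs` are assumptions for illustration, and strand and orthology are ignored.

```python
def overlapping_pairs(genes):
    """Find adjacent genes whose coding regions overlap.

    `genes` is a list of hypothetical (name, start, end) tuples with
    1-based inclusive coordinates on a single replicon.
    Returns (first, second, overlap_length, kind) tuples, where kind is
    "entire" if the second gene lies wholly inside the first, else "partial".
    """
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    pairs = []
    for (n1, s1, e1), (n2, s2, e2) in zip(ordered, ordered[1:]):
        overlap = min(e1, e2) - s2 + 1  # shared bases; <= 0 means no overlap
        if overlap > 0:
            kind = "entire" if e2 <= e1 else "partial"
            pairs.append((n1, n2, overlap, kind))
    return pairs

genes = [
    ("g1", 1, 100),
    ("g2", 95, 200),
    ("g3", 300, 400),
    ("g4", 320, 380),
    ("g5", 500, 600),
]
print(overlapping_pairs(genes))
```

Only neighbours in start-sorted order are compared, which matches the "adjacent genes" wording but would miss a gene that overlaps a non-neighbouring long gene. Turning the resulting OG sets into a pairwise genome distance is a separate step that the abstract describes only in outline, so it is not sketched here.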
Affiliation(s)
- Li-Wei Jiang
- Institute of Bioinformatics and Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
|
1733
|
Braliou GG, Verga Falzacappa MV, Chachami G, Casanovas G, Muckenthaler MU, Simos G. 2-Oxoglutarate-dependent oxygenases control hepcidin gene expression. J Hepatol 2008; 48:801-10. [PMID: 18313788 DOI: 10.1016/j.jhep.2007.12.021] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Received: 06/08/2007] [Revised: 09/13/2007] [Accepted: 12/13/2007] [Indexed: 12/04/2022]
Abstract
BACKGROUND/AIMS Hepcidin is a liver-produced hormone that regulates systemic iron homeostasis. Hepcidin expression is stimulated upon iron overload or inflammation while iron deficiency, anemia and tissue hypoxia are negative regulators. We investigated the involvement of 2-oxoglutarate-dependent oxygenases, HIF-1 and other transcription factors in the hypoxic suppression of hepcidin. METHODS Northern blotting analysis and real time PCR were used to determine hepcidin mRNA levels in hepatoma cells and hepcidin promoter activity was measured using Huh7 cells transfected with suitable reporter constructs under various conditions. RESULTS Treatment of human cultured hepatoma cells with hypoxia or known inhibitors of 2-oxoglutarate-dependent oxygenases, such as the iron chelator desferrioxamine, cobalt or the 2-oxoglutarate analogue dimethyl-oxalylglycine significantly reduced hepcidin mRNA levels and down-regulated its gene promoter activity. This effect was not dependent on the HREs or other known putative response elements in the hepcidin promoter and was observed even under interleukin-6 treatment. CONCLUSIONS 2-Oxoglutarate-dependent oxygenases are important to maintain high hepcidin mRNA expression in a HIF-1-independent manner. We suggest that modulation of oxygenase activity may be of therapeutic value in iron-related disorders.
Affiliation(s)
- Georgia G Braliou
- Laboratory of Biochemistry, Department of Medicine, University of Thessaly, 22 Papakyriazi Street, Larissa, Greece
|
1734
|
Jawdekar GW, Henry RW. Transcriptional regulation of human small nuclear RNA genes. Biochimica et Biophysica Acta 2008; 1779:295-305. [PMID: 18442490 PMCID: PMC2684849 DOI: 10.1016/j.bbagrm.2008.04.001] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.3] [Received: 09/06/2007] [Revised: 04/01/2008] [Accepted: 04/02/2008] [Indexed: 01/06/2023]
Abstract
The products of human snRNA genes have been frequently described as performing housekeeping functions and their synthesis as refractory to regulation. However, recent studies have emphasized that snRNA and other related non-coding RNA molecules control multiple facets of the central dogma, and their regulated expression is critical to cellular homeostasis during normal growth and in response to stress. Human snRNA genes contain compact and yet powerful promoters that are recognized by increasingly well-characterized transcription factors, thus providing a premier model system to study gene regulation. This review summarizes many recent advances in deciphering the mechanism by which the transcription of human snRNA and related genes is regulated.
Affiliation(s)
- Gauri W. Jawdekar
- Department of Microbiology, Immunology, and Molecular Genetics, University of California at Los Angeles, Los Angeles, CA 90095
- R. William Henry
- Department of Biochemistry & Molecular Biology, Michigan State University, East Lansing, MI 48824
|
1735
|
Conte MG, Gaillard S, Droc G, Perin C. Phylogenomics of plant genomes: a methodology for genome-wide searches for orthologs in plants. BMC Genomics 2008; 9:183. [PMID: 18426584 PMCID: PMC2377279 DOI: 10.1186/1471-2164-9-183] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 10/25/2007] [Accepted: 04/21/2008] [Indexed: 12/04/2022] Open
Abstract
Background Gene ortholog identification is now a major objective for mining the increasing amount of sequence data generated by complete or partial genome sequencing projects. Comparative and functional genomics urgently need a method for ortholog detection to reduce gene function inference and to aid in the identification of conserved or divergent genetic pathways between several species. As gene functions change during evolution, reconstructing the evolutionary history of genes should be a more accurate way to differentiate orthologs from paralogs. Phylogenomics takes into account phylogenetic information from high-throughput genome annotation and is the most straightforward way to infer orthologs. However, procedures for automatic detection of orthologs are still scarce and suffer from several limitations. Results We developed a procedure for ortholog prediction between Oryza sativa and Arabidopsis thaliana. Firstly, we established an efficient method to cluster A. thaliana and O. sativa full proteomes into gene families. Then, we developed an optimized phylogenomics pipeline for ortholog inference. We validated the full procedure using test sets of orthologs and paralogs to demonstrate that our method outperforms pairwise methods for ortholog predictions. Conclusion Our procedure achieved a high level of accuracy in predicting ortholog and paralog relationships. Phylogenomic predictions for all validated gene families in both species were easily achieved and we can conclude that our methodology outperforms similarly based methods.
Affiliation(s)
- Matthieu G Conte
- CIRAD, UMR 1096 TA40/03k, Avenue Agropolis, 34398 Montpellier, Cedex 5, France.
|
1736
|
Wendel C, Gohlke H. Predicting transmembrane helix pair configurations with knowledge-based distance-dependent pair potentials. Proteins 2008; 70:984-99. [PMID: 17847096 DOI: 10.1002/prot.21574] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022]
Abstract
As a first step toward a novel de novo structure prediction approach for alpha-helical membrane proteins, we developed coarse-grained knowledge-based potentials to score the mutual configuration of transmembrane (TM) helices. Using a comprehensive database of 71 known membrane protein structures, pairwise potentials depending solely on amino acid types and distances between Cα atoms were derived. To evaluate the potentials, they were used as an objective function for the rigid docking of 442 TM helix pairs. This is by far the largest test data set reported to date for that purpose. After clustering 500 docking runs for each pair and considering the largest cluster, we found solutions with a root mean squared (RMS) deviation <2 Å for about 30% of all helix pairs. Encouragingly, if only clusters that contain at least 20% of all decoys are considered, a success rate >71% (with an RMS deviation <2 Å) is obtained. The cluster size thus serves as a measure of significance to identify good docking solutions. In a leave-one-protein-family-out cross-validation study, more than 2/3 of the helix pairs were still predicted with an RMS deviation <2.5 Å (if only clusters that contain at least 20% of all decoys are considered). This demonstrates the predictive power of the potentials in general, although it is advisable to further extend the knowledge base to derive more robust potentials in the future. When compared to the scoring function of Fleishman and Ben-Tal, a comparable performance is found by our cross-validated potentials. Finally, well-predicted "anchor helix pairs" can be reliably identified for most of the proteins of the test data set. This is important for an extension of the approach towards TM helix bundles because these anchor pairs will act as "nucleation sites" to which more helices will be added subsequently, which alleviates the sampling problem.
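The two quantities this abstract relies on, the RMS deviation between matched C-alpha coordinates and the rule that only clusters holding at least 20% of all decoys count as significant, can be illustrated with a minimal sketch. The function names and the plain tuple representation of coordinates are assumptions made here for illustration and are not taken from the paper; no structural superposition is performed, so the coordinates are assumed to already share one reference frame.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equally long lists of (x, y, z) points.

    Assumption: both coordinate sets are already in the same reference frame,
    so no superposition step is applied.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

def significant_largest_cluster(cluster_sizes, min_fraction=0.20):
    """Return the size of the largest cluster if it holds at least
    `min_fraction` of all decoys, otherwise None (hypothetical helper)."""
    total = sum(cluster_sizes)
    largest = max(cluster_sizes)
    return largest if largest / total >= min_fraction else None
```

With 500 docking runs per helix pair, as in the abstract, a largest cluster of 100 or more decoys would pass the 20% threshold in this sketch.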
Affiliation(s)
- Christina Wendel
- Department of Biological Sciences, Molecular Bioinformatics Group, J. W. Goethe-University, Frankfurt, Germany
|
1737
|
Gardiner A, Barker D, Butlin RK, Jordan WC, Ritchie MG. Drosophila chemoreceptor gene evolution: selection, specialization and genome size. Mol Ecol 2008; 17:1648-57. [DOI: 10.1111/j.1365-294x.2008.03713.x] [Citation(s) in RCA: 99] [Impact Index Per Article: 6.2] [Indexed: 01/18/2023]
|
1738
|
Huerta-Cepas J, Dopazo H, Dopazo J, Gabaldón T. The human phylome. Genome Biol 2008; 8:R109. [PMID: 17567924 PMCID: PMC2394744 DOI: 10.1186/gb-2007-8-6-r109] [Citation(s) in RCA: 112] [Impact Index Per Article: 7.0] [Received: 11/30/2006] [Revised: 03/16/2007] [Accepted: 06/13/2007] [Indexed: 01/09/2023] Open
Abstract
The human phylome, which includes evolutionary relationships of all human proteins and their homologs among thirty-nine fully sequenced eukaryotes, is reconstructed. Background: Phylogenomics analyses serve to establish evolutionary relationships among organisms and their genes. A phylome, the complete collection of all gene phylogenies in a genome, constitutes a valuable source of information, but its use in large genomes still constitutes a technical challenge. The use of phylomes also requires the development of new methods that help us to interpret them. Results: We reconstruct here the human phylome, which includes the evolutionary relationships of all human proteins and their homologs among 39 fully sequenced eukaryotes. Phylogenetic techniques used include alignment trimming, branch length optimization, evolutionary model testing and maximum likelihood and Bayesian methods. Although differences with alternative topologies are minor, most of the trees support the Coelomata and Unikont hypotheses as well as the grouping of primates with Laurasiatheria to the exclusion of rodents. We assess the extent of gene duplication events and their relationship with the functional roles of the protein families involved. We find support for at least one, and probably two, rounds of whole genome duplications before vertebrate radiation. Using a novel algorithm that is independent from a species phylogeny, we derive orthology and paralogy relationships of human proteins among eukaryotic genomes. Conclusion: Topological variations among phylogenies for different genes are to be expected, highlighting the danger of gene-sampling effects in phylogenomic analyses. Several links can be established between the functions of gene families duplicated at certain phylogenetic splits and major evolutionary transitions in those lineages. The pipeline implemented here can be easily adapted for use in other organisms.
Affiliation(s)
- Jaime Huerta-Cepas
- Bioinformatics Department, Centro de Investigación Príncipe Felipe, Autopista del Saler, 46013 Valencia, Spain
- Hernán Dopazo
- Bioinformatics Department, Centro de Investigación Príncipe Felipe, Autopista del Saler, 46013 Valencia, Spain
- Joaquín Dopazo
- Bioinformatics Department, Centro de Investigación Príncipe Felipe, Autopista del Saler, 46013 Valencia, Spain
- Toni Gabaldón
- Bioinformatics Department, Centro de Investigación Príncipe Felipe, Autopista del Saler, 46013 Valencia, Spain
|
1739
|
Genome-wide pattern of TCF7L2/TCF4 chromatin occupancy in colorectal cancer cells. Mol Cell Biol 2008; 28:2732-44. [PMID: 18268006 DOI: 10.1128/mcb.02175-07] [Citation(s) in RCA: 198] [Impact Index Per Article: 12.4] [Indexed: 11/20/2022] Open
Abstract
Wnt signaling activates gene expression through the induced formation of complexes between DNA-binding T-cell factors (TCFs) and the transcriptional coactivator beta-catenin. In colorectal cancer, activating Wnt pathway mutations transform epithelial cells through the inappropriate activation of a TCF7L2/TCF4 target gene program. Through a DNA array-based genome-wide analysis of TCF4 chromatin occupancy, we have identified 6,868 high-confidence TCF4-binding sites in the LS174T colorectal cancer cell line. Most TCF4-binding sites are located at large distances from transcription start sites, while target genes are frequently "decorated" by multiple binding sites. Motif discovery algorithms define the in vivo-occupied TCF4-binding site as evolutionarily conserved A-C/G-A/T-T-C-A-A-A-G motifs. The TCF4-binding regions significantly correlate with Wnt-responsive gene expression profiles derived from primary human adenomas and often behave as beta-catenin/TCF4-dependent enhancers in transient reporter assays.
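The consensus reported in this abstract, A-C/G-A/T-T-C-A-A-A-G, maps directly onto a regular expression. The following is a minimal sketch of scanning a DNA string for that consensus on both strands; the function names and the exact-match approach are assumptions for illustration only, since the abstract does not say which motif discovery or scanning tools were used.

```python
import re

# Consensus from the abstract: A-C/G-A/T-T-C-A-A-A-G.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(A[CG][AT]TCAAAG))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_motif_sites(seq):
    """Return sorted (start, strand, motif) tuples for exact consensus matches.

    `start` is the 0-based forward-strand coordinate of the leftmost base
    covered by the site; for "-" hits, `motif` is read on the reverse strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in MOTIF.finditer(seq)]
    for m in MOTIF.finditer(reverse_complement(seq)):
        start = len(seq) - (m.start() + len(m.group(1)))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)
```

An exact-match scan like this ignores the degenerate, weighted matching that position weight matrices provide, so it should be read only as a way of making the consensus concrete.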
|
1740
|
Discovery of a novel class of selective human CB1 inverse agonists. Bioorg Med Chem Lett 2008; 18:1199-206. [DOI: 10.1016/j.bmcl.2007.11.133] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 11/16/2007] [Revised: 11/28/2007] [Accepted: 11/29/2007] [Indexed: 11/19/2022]
|
1741
|
Roshan U, Chikkagoudar S, Livesay DR. Searching for evolutionary distant RNA homologs within genomic sequences using partition function posterior probabilities. BMC Bioinformatics 2008; 9:61. [PMID: 18226231 PMCID: PMC2248559 DOI: 10.1186/1471-2105-9-61] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 07/31/2007] [Accepted: 01/28/2008] [Indexed: 11/11/2022] Open
Abstract
Background Identification of RNA homologs within genomic stretches is difficult when pairwise sequence identity is low or unalignable flanking residues are present. In both cases structure-sequence or profile/family-sequence alignment programs become difficult to apply because of unreliable RNA structures or family alignments. As such, local sequence-sequence alignment programs are frequently used instead. We have recently demonstrated that maximal expected accuracy alignments using partition function match probabilities (implemented in Probalign) are significantly better than contemporary methods on heterogeneous length protein sequence datasets, thus suggesting an affinity for local alignment. Results We create a pairwise RNA-genome alignment benchmark from RFAM families with average pairwise sequence identity up to 60%. Each dataset contains a query RNA aligned to a target RNA (of the same family) embedded in a genomic sequence at least 5K nucleotides long. To simulate common conditions when exact ends of an ncRNA are unknown, each query RNA has 5' and 3' genomic flanks of size 50, 100, and 150 nucleotides. We subsequently compare the error of the Probalign program (adjusted for local alignment) to the commonly used local alignment programs HMMER, SSEARCH, and BLAST, and the popular ClustalW program with zero end-gap penalties. Parameters were optimized for each program on a small subset of the benchmark. Probalign has overall highest accuracies on the full benchmark. It leads by 10% accuracy over SSEARCH (the next best method) on 5 out of 22 families. On datasets restricted to maximum of 30% sequence identity, Probalign's overall median error is 71.2% vs. 83.4% for SSEARCH (P-value < 0.05). Furthermore, on these datasets Probalign leads SSEARCH by at least 10% on five families; SSEARCH leads Probalign by the same margin on two of the fourteen families. 
We also demonstrate that the Probalign mean posterior probability, compared to the normalized SSEARCH Z-score, is a better discriminator of alignment quality. All datasets and software are available online. Conclusion We demonstrate, for the first time, that partition function match probabilities used for expected accuracy alignment, as done in Probalign, provide statistically significant improvement over current approaches for identifying distantly related RNA sequences in larger genomic segments.
Affiliation(s)
- Usman Roshan
- Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA.
|
1742
|
Pavy N, Pelgas B, Beauseigle S, Blais S, Gagnon F, Gosselin I, Lamothe M, Isabel N, Bousquet J. Enhancing genetic mapping of complex genomes through the design of highly-multiplexed SNP arrays: application to the large and unsequenced genomes of white spruce and black spruce. BMC Genomics 2008; 9:21. [PMID: 18205909 PMCID: PMC2246113 DOI: 10.1186/1471-2164-9-21] [Citation(s) in RCA: 87] [Impact Index Per Article: 5.4] [Received: 10/13/2007] [Accepted: 01/18/2008] [Indexed: 11/24/2022] Open
Abstract
Background To explore the potential value of high-throughput genotyping assays in the analysis of large and complex genomes, we designed two highly multiplexed Illumina bead arrays using the GoldenGate SNP assay for gene mapping in white spruce (Picea glauca [Moench] Voss) and black spruce (Picea mariana [Mill.] B.S.P.). Results Each array included 768 SNPs, identified by resequencing genomic DNA from parents of each mapping population. For white spruce and black spruce, respectively, 69.2% and 77.1% of genotyped SNPs had valid GoldenGate assay scores and segregated in the mapping populations. For each of these successful SNPs, on average, valid genotyping scores were obtained for over 99% of progeny. SNP data were integrated with pre-existing AFLP, ESTP, and SSR markers to construct two individual linkage maps and a composite map for white spruce and black spruce genomes. The white spruce composite map contained 821 markers including 348 gene loci. Also, 835 markers including 328 gene loci were positioned on the black spruce composite map. In total, 215 anchor markers (mostly gene markers) were shared between the two species. Considering lineage divergence at least 10 Myr ago between the two spruces, interspecific comparison of homoeologous linkage groups revealed remarkable synteny and marker colinearity. Conclusion The design of customized highly multiplexed Illumina SNP arrays appears to be an efficient procedure to enhance the mapping of expressed genes and make linkage maps more informative and powerful in such species with poorly known genomes. This genotyping approach will open new avenues for co-localizing candidate genes and QTLs, partial genome sequencing, and comparative mapping across conifers.
Affiliation(s)
- Nathalie Pavy
- Arborea and Canada Research Chair in Forest and Environmental Genomics, Centre d'Etude de la Forêt, Pavillon Charles-Eugène-Marchand, Université Laval, Québec, Québec G1V 0A6, Canada.
|
1743
|
Pabuwal V, Li Z. Network pattern of residue packing in helical membrane proteins and its application in membrane protein structure prediction. Protein Eng Des Sel 2008; 21:55-64. [DOI: 10.1093/protein/gzm059] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022] Open
|
1744
|
Salgado D, Gimenez G, Coulier F, Marcelle C. COMPARE, a multi-organism system for cross-species data comparison and transfer of information. Bioinformatics 2007; 24:447-9. [PMID: 18056065 DOI: 10.1093/bioinformatics/btm599] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022] Open
Abstract
Motivation: COMPARE is a multi-organism web-based resource system designed to easily retrieve, correlate and interpret data across species. The COMPARE interface provides access to a wide array of information including genomic structure, expression data, annotations, pathways and literature links for human and three widely studied animal models (zebrafish, Drosophila and mouse). A consensus ortholog-finding pipeline combining several ortholog prediction methods allows accurate comparisons of data across species and has been utilized to transfer information from well studied organisms to more poorly annotated ones. Availability: http://compare.ibdml.univ-mrs.fr.
Affiliation(s)
- David Salgado
- Developmental Biology Institute of Marseille Luminy (IBDML), CNRS UMR 6216, Université de la Méditerranée, Campus de Luminy, case 907. 13288 Marseille, France
|
1745
|
Berglund AC, Sjölund E, Östlund G, Sonnhammer ELL. InParanoid 6: eukaryotic ortholog clusters with inparalogs. Nucleic Acids Res 2007; 36:D263-6. [PMID: 18055500 PMCID: PMC2238924 DOI: 10.1093/nar/gkm1020] [Citation(s) in RCA: 175] [Impact Index Per Article: 10.3] [Indexed: 11/23/2022] Open
Abstract
The InParanoid eukaryotic ortholog database (http://InParanoid.sbc.su.se/) has been updated to version 6 and is now based on 35 species. We collected all available ‘complete’ eukaryotic proteomes and Escherichia coli, and calculated ortholog groups for all 595 species pairs using the InParanoid program. This resulted in 2 642 187 pairwise ortholog groups in total. The orthology-based species relations are presented in an orthophylogram. InParanoid clusters contain one or more orthologs from each of the two species. Multiple orthologs in the same species, i.e. inparalogs, result from gene duplications after the species divergence. A new InParanoid website has been developed which is optimized for speed both for users and for updating the system. The XML output format has been improved for efficient processing of the InParanoid ortholog clusters.
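The figure of 595 species pairs follows from the 35 species stated in the abstract, since the number of unordered pairs of n items is n(n-1)/2. A short sketch makes the arithmetic explicit; the placeholder species labels are hypothetical and stand in for the real proteome names.

```python
from itertools import combinations
from math import comb

def species_pairs(species):
    """Enumerate every unordered pair of species, one entry per pairwise comparison."""
    return list(combinations(species, 2))

# Hypothetical labels; only the count of 35 comes from the abstract.
species = [f"species_{i:02d}" for i in range(1, 36)]
pairs = species_pairs(species)

# 35 * 34 / 2 pairwise comparisons
n_pairs = comb(len(species), 2)
```

Because the count grows quadratically with the number of species, each proteome added to such a database requires one new comparison against every species already present.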
Affiliation(s)
- Ann-Charlotte Berglund
- Linnaeus Centre for Bioinformatics, Uppsala University, BMC Box 598, 75124, Uppsala and Stockholm Bioinformatics Center, Albanova, Stockholm University, SE-10691 Stockholm, Sweden
- Erik Sjölund
- Linnaeus Centre for Bioinformatics, Uppsala University, BMC Box 598, 75124, Uppsala and Stockholm Bioinformatics Center, Albanova, Stockholm University, SE-10691 Stockholm, Sweden
- Gabriel Östlund
- Linnaeus Centre for Bioinformatics, Uppsala University, BMC Box 598, 75124, Uppsala and Stockholm Bioinformatics Center, Albanova, Stockholm University, SE-10691 Stockholm, Sweden
- Erik L. L. Sonnhammer
- Linnaeus Centre for Bioinformatics, Uppsala University, BMC Box 598, 75124, Uppsala and Stockholm Bioinformatics Center, Albanova, Stockholm University, SE-10691 Stockholm, Sweden
- *To whom correspondence should be addressed. +46 8 55378567; +46 8 55378214
|
1746
|
Lemoine F, Lespinet O, Labedan B. Assessing the evolutionary rate of positional orthologous genes in prokaryotes using synteny data. BMC Evol Biol 2007; 7:237. [PMID: 18047665 PMCID: PMC2238764 DOI: 10.1186/1471-2148-7-237] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Received: 06/22/2007] [Accepted: 11/29/2007] [Indexed: 11/15/2022] Open
Abstract
Background Comparison of completely sequenced microbial genomes has revealed how fluid these genomes are. Detecting synteny blocks requires reliable methods for determining the orthologs among the whole set of homologs detected by exhaustive comparisons between each pair of completely sequenced genomes. This is a complex and difficult problem in the field of comparative genomics, but solving it will help to better understand the way prokaryotic genomes are evolving. Results We have developed a suite of programs that automate three essential steps to study conservation of gene order, and validated them with a set of 107 bacteria and archaea that cover the majority of the prokaryotic taxonomic space. We identified the whole set of shared homologs between two or more species and computed the evolutionary distance separating each pair of homologs. We applied two strategies to extract from the set of homologs a collection of valid orthologs shared by at least two genomes. The first computes the Reciprocal Smallest Distance (RSD) using the PAM distances separating pairs of homologs. The second method groups homologs in families and reconstructs each family's evolutionary tree, distinguishing bona fide orthologs as well as paralogs created after the last speciation event. Although the phylogenetic tree method often succeeds where RSD fails, the reverse can occasionally be true. Accordingly, we used the data obtained with either method or their intersection to number the orthologs that are adjacent in each pair of genomes, the Positional Orthologous Genes (POGs), and to further study their properties. Once all these synteny blocks had been detected, we showed that POGs are subject to more evolutionary constraints than orthologs outside synteny groups, whatever the taxonomic distance separating the compared organisms.
Conclusion The suite of programs described in this paper allows reliable detection of orthologs and is useful for evaluating gene order conservation in prokaryotes, whatever their taxonomic distance. Thus, our approach will ease the rapid identification of POGs in the next few years, as we expect to be inundated with thousands of completely sequenced microbial genomes.
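The reciprocal criterion behind the Reciprocal Smallest Distance strategy can be sketched in a few lines: a pair of homologs from two genomes is kept only if each gene is the other's closest partner by evolutionary distance. This is a simplified illustration under stated assumptions, not the authors' implementation; the function name, the dictionary input format and the tie-breaking rule (the first pair encountered wins) are all hypothetical, and the distances are assumed to have been computed beforehand.

```python
def reciprocal_smallest_distance(distances):
    """Pick ortholog candidates by the reciprocal smallest distance criterion.

    `distances` maps (gene_in_A, gene_in_B) to a precomputed evolutionary
    distance (for example a PAM distance) between two homologs.
    Returns the sorted list of pairs that are each other's nearest homolog.
    """
    best_for_a = {}  # gene in genome A -> (closest gene in B, distance)
    best_for_b = {}  # gene in genome B -> (closest gene in A, distance)
    for (gene_a, gene_b), dist in distances.items():
        if gene_a not in best_for_a or dist < best_for_a[gene_a][1]:
            best_for_a[gene_a] = (gene_b, dist)
        if gene_b not in best_for_b or dist < best_for_b[gene_b][1]:
            best_for_b[gene_b] = (gene_a, dist)
    return sorted(
        (gene_a, gene_b)
        for gene_a, (gene_b, _) in best_for_a.items()
        if best_for_b[gene_b][0] == gene_a
    )
```

In the toy case where gene a3 is closest to b2 but b2 is closer to a2, the pair (a3, b2) is rejected. A purely pairwise criterion like this cannot resolve such a one-sided relationship, which is consistent with the abstract's remark that the tree-based method often succeeds where RSD fails.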
Affiliation(s)
- Frédéric Lemoine
- Institut de Génétique et Microbiologie, CNRS UMR 8621, Bâtiment 400, Université Paris Sud XI, 91405 Orsay Cedex, France.
|
1747
|
Buza TJ, McCarthy FM, Burgess SC. Experimental-confirmation and functional-annotation of predicted proteins in the chicken genome. BMC Genomics 2007; 8:425. [PMID: 18021451 PMCID: PMC2204016 DOI: 10.1186/1471-2164-8-425] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 08/07/2007] [Accepted: 11/19/2007] [Indexed: 11/11/2022] Open
Abstract
Background The chicken genome was sequenced because of its phylogenetic position as a non-mammalian vertebrate, its use as a biomedical model especially to study embryology and development, its role as a source of human disease organisms and its importance as the major source of animal derived food protein. However, genomic sequence data is, in itself, of limited value; generally it is not equivalent to understanding biological function. The benefit of having a genome sequence is that it provides a basis for functional genomics. However, the sequence data currently available is poorly structurally and functionally annotated and many genes do not have standard nomenclature assigned. Results We analysed eight chicken tissues and improved the chicken genome structural annotation by providing experimental support for the in vivo expression of 7,809 computationally predicted proteins, including 30 chicken proteins that were only electronically predicted or hypothetical translations in human. To improve functional annotation (based on Gene Ontology), we mapped these identified proteins to their human and mouse orthologs and used this orthology to transfer Gene Ontology (GO) functional annotations to the chicken proteins. The 8,213 orthology-based GO annotations that we produced represent an 8% increase in currently available chicken GO annotations. Orthologous chicken products were also assigned standardized nomenclature based on current chicken nomenclature guidelines. Conclusion We demonstrate the utility of high-throughput expression proteomics for rapid experimental structural annotation of a newly sequenced eukaryote genome. These experimentally-supported predicted proteins were further annotated by assigning the proteins with standardized nomenclature and functional annotation. This method is widely applicable to a diverse range of species. 
Moreover, information from one genome can be used to improve the annotation of other genomes and inform gene prediction algorithms.
Affiliation(s)
- Teresia J Buza
- Department of Basic Sciences, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS 39762, USA.
|
1748
|
Vermeulen M, Mulder KW, Denissov S, Pijnappel WWMP, van Schaik FMA, Varier RA, Baltissen MPA, Stunnenberg HG, Mann M, Timmers HTM. Selective anchoring of TFIID to nucleosomes by trimethylation of histone H3 lysine 4. Cell 2007; 131:58-69. [PMID: 17884155 DOI: 10.1016/j.cell.2007.08.016] [Citation(s) in RCA: 657] [Impact Index Per Article: 38.6] [Received: 02/19/2007] [Revised: 05/09/2007] [Accepted: 08/15/2007] [Indexed: 12/25/2022]
Abstract
Trimethylation of histone H3 at lysine 4 (H3K4me3) is regarded as a hallmark of active human promoters, but it remains unclear how this posttranslational modification links to transcriptional activation. Using a stable isotope labeling by amino acids in cell culture (SILAC)-based proteomic screening we show that the basal transcription factor TFIID directly binds to the H3K4me3 mark via the plant homeodomain (PHD) finger of TAF3. Selective loss of H3K4me3 reduces transcription from and TFIID binding to a subset of promoters in vivo. Equilibrium binding assays and competition experiments show that the TAF3 PHD finger is highly selective for H3K4me3. In transient assays, TAF3 can act as a transcriptional coactivator in a PHD finger-dependent manner. Interestingly, asymmetric dimethylation of H3R2 selectively inhibits TFIID binding to H3K4me3, whereas acetylation of H3K9 and H3K14 potentiates TFIID interaction. Our experiments reveal crosstalk between histone modifications and the transcription factor TFIID. This has important implications for regulation of RNA polymerase II-mediated transcription in higher eukaryotes.
Affiliation(s)
- Michiel Vermeulen
- Department of Proteomics and Signal Transduction, Max-Planck-Institute for Biochemistry, D-82152 Martinsried, Germany
|
1749
|
Abstract
Xenopus tropicalis is rapidly being adopted as a model organism for developmental biology research and has enormous potential for increasing our understanding of how embryonic development is controlled. In recent years there has been a well-organized initiative within the Xenopus community, funded largely through the support of the National Institutes of Health in the US, to develop X. tropicalis as a new genetic model system with the potential to impact diverse fields of research. Concerted efforts have been made both to adapt established methodologies for use in X. tropicalis and to develop new techniques. A key resource to come out of these efforts is the genome sequence, produced by the US Department of Energy's Joint Genome Institute and made freely available to the community in draft form for the past three years. In this review, we focus on how advances in X. tropicalis genetics coupled with the sequencing of its genome are likely to form a foundation from which we can build a better understanding of the genetic control of vertebrate development and why, when we already have other vertebrate genetic models, we should want to develop genetic analysis in the frog.
Affiliation(s)
- Chris Showell
- Carolina Cardiovascular Biology Center and Department of Genetics, University of North Carolina, Chapel Hill, North Carolina
- Frank L. Conlon
- Carolina Cardiovascular Biology Center and Department of Genetics, University of North Carolina, Chapel Hill, North Carolina
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina
- Correspondence to: Frank L. Conlon, 220 Fordham Hall, Medical Drive, Chapel Hill, NC 27599-3280., E-mail:
|
1750
|
Ramírez F, Schlicker A, Assenov Y, Lengauer T, Albrecht M. Computational analysis of human protein interaction networks. Proteomics 2007; 7:2541-52. [PMID: 17647236 DOI: 10.1002/pmic.200600924] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.4] [Indexed: 11/08/2022]
Abstract
Large amounts of human protein interaction data have been produced by experiments and prediction methods. However, the experimental coverage of the human interactome is still low in contrast to predicted data. To gain insight into the value of publicly available human protein network data, we compared predicted datasets, high-throughput results from yeast two-hybrid screens, and literature-curated protein-protein interactions. This evaluation is not only important for further methodological improvements, but also for increasing the confidence in functional hypotheses derived from predictions. Therefore, we assessed the quality and the potential bias of the different datasets using functional similarity based on the Gene Ontology, structural iPfam domain-domain interactions, likelihood ratios, and topological network parameters. This analysis revealed major differences between predicted datasets, but some of them also scored at least as high as the experimental ones regarding multiple quality measures. Therefore, since only small pairwise overlap between most datasets is observed, they may be combined to enlarge the available human interactome data. For this purpose, we additionally studied the influence of protein length on data quality and the number of disease proteins covered by each dataset. We could further demonstrate that protein interactions predicted by more than one method achieve an elevated reliability.
Affiliation(s)
- Fidel Ramírez
- Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbrücken, Germany
|