3651
|
Concordant gene expression in leukemia cells and normal leukocytes is associated with germline cis-SNPs. PLoS One 2008; 3:e2144. [PMID: 18478092 PMCID: PMC2374895 DOI: 10.1371/journal.pone.0002144] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/06/2008] [Accepted: 04/03/2008] [Indexed: 01/24/2023] Open
Abstract
The degree to which gene expression covaries between different primary tissues within an individual is not well defined. We hypothesized that expression that is concordant across tissues is more likely influenced by genetic variability than gene expression which is discordant between tissues. We quantified expression of 11,873 genes in paired samples of primary leukemia cells and normal leukocytes from 92 patients with acute lymphoblastic leukemia (ALL). Genetic variation at >500,000 single nucleotide polymorphisms (SNPs) was also assessed. The expression of only 176/11,783 (1.5%) genes was correlated (p<0.008, FDR = 25%) in the two tissue types, but expression of a high proportion (20 of these 176 genes) was significantly related to cis-SNP genotypes (adjusted p<0.05). In an independent set of 134 patients with ALL, 14 of these 20 genes were validated as having expression related to cis-SNPs, as were 9 of 20 genes in a second validation set of HapMap cell lines. Genes whose expression was concordant among tissue types were more likely to be associated with germline cis-SNPs than genes with discordant expression in these tissues; genes affected were involved in housekeeping functions (GSTM2, GAPDH and NCOR1) and purine metabolism.
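The concordance described in the abstract above is a per-gene correlation of expression between two tissues across patients. A minimal sketch of that idea follows, assuming a plain Pearson correlation; the abstract does not state which correlation statistic or FDR procedure the authors used, and the function name and expression values here are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values for one gene in four patients
# (illustrative numbers only, not data from the study).
leukemia_cells = [5.1, 6.3, 7.0, 8.2]
normal_leukocytes = [4.8, 6.0, 7.1, 7.9]
r = pearson(leukemia_cells, normal_leukocytes)
```

In a real analysis this would be repeated for every gene, with the resulting p-values corrected for multiple testing.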
|
3652
|
Lunshof JE, Chadwick R, Vorhaus DB, Church GM. From genetic privacy to open consent. Nat Rev Genet 2008; 9:406-11. [PMID: 18379574 DOI: 10.1038/nrg2360] [Citation(s) in RCA: 225] [Impact Index Per Article: 13.2] [Indexed: 12/20/2022]
Abstract
Recent advances in high-throughput genomic technologies are showing concrete results in the form of an increasing number of genome-wide association studies and in the publication of comprehensive individual genome-phenome data sets. As a consequence of this flood of information the established concepts of research ethics are stretched to their limits, and issues of privacy, confidentiality and consent for research are being re-examined. Here, we show the feasibility of the co-development of scientific innovation and ethics, using the open-consent framework that was implemented in the Personal Genome Project as an example.
Affiliation(s)
- Jeantine E Lunshof
- Department of Molecular Cell Physiology, c/o room M236, Faculty of Earth and Life Sciences, VU University Amsterdam, De Boelelaan 1085, 1081 HV Amsterdam, The Netherlands.
|
3653
|
Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet 2008; 9:356-69. [PMID: 18398418 DOI: 10.1038/nrg2344] [Citation(s) in RCA: 1906] [Impact Index Per Article: 112.1] [Indexed: 12/13/2022]
Abstract
The past year has witnessed substantial advances in understanding the genetic basis of many common phenotypes of biomedical importance. These advances have been the result of systematic, well-powered, genome-wide surveys exploring the relationships between common sequence variation and disease predisposition. This approach has revealed over 50 disease-susceptibility loci and has provided insights into the allelic architecture of multifactorial traits. At the same time, much has been learned about the successful prosecution of association studies on such a scale. This Review highlights the knowledge gained, defines areas of emerging consensus, and describes the challenges that remain as researchers seek to obtain more complete descriptions of the susceptibility architecture of biomedical traits of interest and to translate the information gathered into improvements in clinical management.
|
3654
|
Duret L, Arndt PF. The impact of recombination on nucleotide substitutions in the human genome. PLoS Genet 2008; 4:e1000071. [PMID: 18464896 PMCID: PMC2346554 DOI: 10.1371/journal.pgen.1000071] [Citation(s) in RCA: 259] [Impact Index Per Article: 15.2] [Received: 11/05/2007] [Accepted: 04/11/2008] [Indexed: 01/19/2023] Open
Abstract
Unraveling the evolutionary forces responsible for variations of neutral substitution patterns among taxa or along genomes is a major issue for detecting selection within sequences. Mammalian genomes show large-scale regional variations of GC-content (the isochores), but the substitution processes at the origin of this structure are poorly understood. We analyzed the pattern of neutral substitutions in 1 Gb of primate non-coding regions. We show that the GC-content toward which sequences are evolving is strongly negatively correlated to the distance to telomeres and positively correlated to the rate of crossovers (R² = 47%). This demonstrates that recombination has a major impact on substitution patterns in human, driving the evolution of GC-content. The evolution of GC-content correlates much more strongly with male than with female crossover rate, which rules out selectionist models for the evolution of isochores. This effect of recombination is most probably a consequence of the neutral process of biased gene conversion (BGC) occurring within recombination hotspots. We show that the predictions of this model fit very well with the observed substitution patterns in the human genome. This model notably explains the positive correlation between substitution rate and recombination rate. Theoretical calculations indicate that variations in population size or density in recombination hotspots can have a very strong impact on the evolution of base composition. Furthermore, recombination hotspots can create strong substitution hotspots. This molecular drive affects both coding and non-coding regions. We therefore conclude that along with mutation, selection and drift, BGC is one of the major factors driving genome evolution. Our results also shed light on variations in the rate of crossover relative to non-crossover events, along chromosomes and according to sex, and also on the conservation of hotspot density between human and chimp.
Mammalian genomes show a very strong heterogeneity of base composition along chromosomes (the so-called isochores). The functional significance of these peculiar genomic landscapes is highly debated: do isochores confer some selective advantage, or are they simply the by-product of neutral evolutionary processes? To resolve this issue, we analyzed the pattern of substitution in the human genome by comparison with chimpanzee and macaque. We show that the evolution of base composition (GC-content) is essentially determined by the rate of recombination. This effect appears to be much stronger in male than in female germline, which rules out selective explanations for the evolution of isochores. We show that this impact of recombination is most probably a consequence of the process of biased gene conversion (BGC). This neutral process mimics the action of selection and can induce strong substitution hotspots within recombination hotspots, sometimes leading to the fixation of deleterious mutations. BGC appears to be one of the major factors driving genome evolution. It is therefore essential to take this process into account if we want to be able to interpret genome sequences.
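The "GC-content toward which sequences are evolving" in the abstract above can be illustrated with a minimal sketch. It assumes a simple two-state model with only two aggregate rates, AT to GC and GC to AT; the paper's own substitution model is richer, and the function names and example values below are hypothetical.

```python
def equilibrium_gc(rate_at_to_gc: float, rate_gc_to_at: float) -> float:
    """Stationary GC fraction of a two-state (AT <-> GC) substitution model.

    At equilibrium the flux AT->GC equals the flux GC->AT, which gives
    GC* = u / (u + v) with u = rate_at_to_gc and v = rate_gc_to_at.
    """
    if rate_at_to_gc < 0 or rate_gc_to_at < 0:
        raise ValueError("rates must be non-negative")
    total = rate_at_to_gc + rate_gc_to_at
    if total == 0:
        raise ValueError("at least one rate must be positive")
    return rate_at_to_gc / total


def gc_content(seq: str) -> float:
    """Observed fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


# Illustrative values only: GC->AT substitutions three times as frequent
# as AT->GC substitutions pull the sequence toward 25% GC.
current = gc_content("GGCCAATT")
target = equilibrium_gc(1.0, 3.0)
```

Comparing `current` with `target` indicates whether a region is gaining or losing GC under the assumed rates; a process such as biased gene conversion would act by raising the effective AT to GC rate.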
Affiliation(s)
- Laurent Duret
- Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Lyon 1, CNRS, UMR 5558, Villeurbanne, France
- Peter F. Arndt
- Department for Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
|
3655
|
Alsner J, Andreassen CN, Overgaard J. Genetic markers for prediction of normal tissue toxicity after radiotherapy. Semin Radiat Oncol 2008; 18:126-35. [PMID: 18314067 DOI: 10.1016/j.semradonc.2007.10.004] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.5] [Indexed: 11/25/2022]
Abstract
During the last decade, a number of studies have supported the hypothesis that there is an important genetic component to the observed interpatient variability in normal tissue toxicity after radiotherapy. This review summarizes the candidate gene association studies published so far on the risk of radiation-induced morbidity and highlights some recent successful whole-genome association studies showing feasibility in other research areas. Future genetic association studies are discussed in relation to methodological problems such as the characterization of clinical and biological phenotypes, genetic haplotypes, and handling of confounding factors. Finally, candidate gene studies elucidating the genetic component of radiation-induced morbidity and the functional consequences of single nucleotide polymorphisms by studying intermediate phenotypes will be discussed.
Affiliation(s)
- Jan Alsner
- Department of Experimental Clinical Oncology, Aarhus University Hospital, Aarhus, Denmark.
|
3656
|
Abstract
A regional analysis of nucleotide substitution rates along human genes and their flanking regions allows us to quantify the effect of mutational mechanisms associated with transcription in germ line cells. Our analysis reveals three distinct patterns of substitution rates. First, a sharp decline in the deamination rate of methylated CpG dinucleotides, which is observed in the vicinity of the 5' end of genes. Second, a strand asymmetry in complementary substitution rates, which extends from the 5' end to 1 kbp downstream from the 3' end, associated with transcription-coupled repair. Finally, a localized strand asymmetry, an excess of C→T over G→A substitution in the nontemplate strand confined to the first 1-2 kbp downstream of the 5' end of genes. We hypothesize that higher exposure of the nontemplate strand near the 5' end of genes leads to a higher cytosine deamination rate. Up to now, only the somatic hypermutation (SHM) pathway has been known to mediate localized and strand-specific mutagenic processes associated with transcription in mammals. The mutational patterns in SHM are induced by cytosine deaminase, which targets only single-stranded DNA. This DNA conformation is induced by R-loops, which preferentially occur at the 5' ends of genes. We predict that R-loops are extensively formed in the beginning of transcribed regions in germ line cells.
|
3657
|
Cis- and trans-splicing of mRNAs mediated by tRNA sequences in eukaryotic cells. Proc Natl Acad Sci U S A 2008; 105:6864-9. [PMID: 18458335 DOI: 10.1073/pnas.0800420105] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Indexed: 01/03/2023] Open
Abstract
The formation of chimeric mRNAs is a strategy used by human cells to increase the complexity of their proteome, as revealed by the ENCODE project. Here, we use Saccharomyces cerevisiae to show a way by which trans-spliced mRNAs can be generated. We demonstrate that a pretRNA inserted into a premRNA context directs the splicing reaction precisely to the sites of the tRNA intron. A suppressor pretRNA gene was inserted, in cis, into the sequence encoding the third cytoplasmic loop of the Ste2 or Ste3 G protein-coupled receptor. The hybrid RNAs are spliced at the specific pretRNA splicing sites, releasing both functional tRNAs that suppress nonsense mutations and translatable mRNAs that activate the signal transduction pathway. The RNA molecules extracted from yeast cells were amplified by RT-PCR, and their sequences were determined, confirming the identity of the splice junctions. We then constructed two fusions between the premRNA sequence (STE2 or STE3) and the 5'- or 3'-pretRNA half, so that the two hybrid RNAs can associate with each other, in trans, through their tRNA halves. Splicing occurs at the predicted pretRNA sites, producing a chimeric STE3-STE2 receptor mRNA. RNA trans-splicing mediated by tRNA sequences, therefore, is a mechanism capable of producing new kinds of RNAs, which could code for novel proteins.
|
3658
|
Alcamo EA, Chirivella L, Dautzenberg M, Dobreva G, Fariñas I, Grosschedl R, McConnell SK. Satb2 regulates callosal projection neuron identity in the developing cerebral cortex. Neuron 2008; 57:364-77. [PMID: 18255030 DOI: 10.1016/j.neuron.2007.12.012] [Citation(s) in RCA: 503] [Impact Index Per Article: 29.6] [Received: 06/09/2007] [Revised: 10/23/2007] [Accepted: 12/03/2007] [Indexed: 02/07/2023]
Abstract
Satb2 is a DNA-binding protein that regulates chromatin organization and gene expression. In the developing brain, Satb2 is expressed in cortical neurons that extend axons across the corpus callosum. To assess the role of Satb2 in neurons, we analyzed mice in which the Satb2 locus was disrupted by insertion of a LacZ gene. In mutant mice, beta-galactosidase-labeled axons are absent from the corpus callosum and instead descend along the corticospinal tract. Satb2 mutant neurons acquire expression of Ctip2, a transcription factor that is necessary and sufficient for the extension of subcortical projections by cortical neurons. Conversely, ectopic expression of Satb2 in neural stem cells markedly decreases Ctip2 expression. Finally, we find that Satb2 binds directly to regulatory regions of Ctip2 and induces changes in chromatin structure. These data suggest that Satb2 functions as a repressor of Ctip2 and regulatory determinant of corticocortical connections in the developing cerebral cortex.
|
3659
|
Abstract
Promoter-proximal polymerase II stalling prepares genes for prompt expression when signals are received. Stalling of RNA polymerase II near the promoter has recently been found to be much more common than previously thought. Genome-wide surveys of the phenomenon suggest that it is likely to be a rate-limiting control on gene activation that poises developmental and stimulus-responsive genes for prompt expression when inducing signals are received.
Affiliation(s)
- Jia Qian Wu
- Molecular, Cellular and Developmental Biology Department, Yale University, PO Box 208103, New Haven, CT 06511, USA.
|
3660
|
Functional characterization of a -100_-102delAAG deletion-insertion polymorphism in the promoter region of the HTR3B gene. Pharmacogenet Genomics 2008; 18:219-30. [PMID: 18300944 DOI: 10.1097/fpc.0b013e3282f51092] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/25/2022]
Abstract
OBJECTIVE: The HTR3B gene encodes the B-subunit of the type 3 serotonin receptor (5-HT3). A -100_-102delAAG deletion in the promoter region has been associated with poor response to antiemetic medication and susceptibility to bipolar affective disorders. The molecular mechanisms underlying these associations, however, remained unclear. METHODS: We performed electrophoretic mobility shift and luciferase reporter gene assays to elucidate the effect of this polymorphism on the HTR3B promoter activity in PC-12 and HEK293 cells. The reporter constructs carried a 2171 bp fragment of the native HTR3B promoter or 30 bp of the polymorphic locus in tandem triplication upstream of the thymidine kinase minimal promoter. RESULTS: Deletion mapping indicated that the sequence around the -100_-102delAAG polymorphism had significant promoter activity. Electrophoretic mobility shift assays indicated differential binding of nuclear proteins to the polymorphic DNA region with stronger binding to the insertion than to the deletion allele. The activity of the native promoter carrying the deletion allele was 25% higher in PC-12 (P=0.016) and 40% higher in HEK cells (P=0.016) compared with the respective insertion construct. Constructs carrying the deletion allele in tandem triplicates showed 43% (PC-12 cells, P=0.002) and 28% (HEK293 cells, P=0.015) higher activity than those carrying the insertion allele. The polymorphism was not linked with known amino acid substitutions in HTR3A and HTR3B. CONCLUSIONS: The -100_-102delAAG 3 bp deletion increases the HTR3B promoter activity in vitro. The consequences of this for the structure and the function of the resulting 5-HT3 receptors remain to be elucidated.
|
3661
|
Kidd JM, Cooper GM, Donahue WF, Hayden HS, Sampas N, Graves T, Hansen N, Teague B, Alkan C, Antonacci F, Haugen E, Zerr T, Yamada NA, Tsang P, Newman TL, Tüzün E, Cheng Z, Ebling HM, Tusneem N, David R, Gillett W, Phelps KA, Weaver M, Saranga D, Brand A, Tao W, Gustafson E, McKernan K, Chen L, Malig M, Smith JD, Korn JM, McCarroll SA, Altshuler DA, Peiffer DA, Dorschner M, Stamatoyannopoulos J, Schwartz D, Nickerson DA, Mullikin JC, Wilson RK, Bruhn L, Olson MV, Kaul R, Smith DR, Eichler EE. Mapping and sequencing of structural variation from eight human genomes. Nature 2008; 453:56-64. [PMID: 18451855 PMCID: PMC2424287 DOI: 10.1038/nature06862] [Citation(s) in RCA: 798] [Impact Index Per Article: 46.9] [Received: 11/07/2007] [Accepted: 02/15/2008] [Indexed: 11/08/2022]
Abstract
Genetic variation among individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single nucleotide changes. Here we explore variation on an intermediate scale, particularly insertions, deletions and inversions affecting from a few thousand to a few million base pairs. We employed a clone-based method to interrogate this intermediate structural variation in eight individuals of diverse geographic ancestry. Our analysis provides a comprehensive overview of the normal pattern of structural variation present in these genomes, refining the location of 1,695 structural variants. We find that 50% were seen in more than one individual and that nearly half lay outside regions of the genome previously described as structurally variant. We discover 525 new insertion sequences that are not present in the human reference genome and show that many of these are variable in copy number between individuals. Complete sequencing of 261 structural variants reveals considerable locus complexity and provides insights into the different mutational processes that have shaped the human genome. These data provide the first high-resolution sequence map of human structural variation, a standard for genotyping platforms and a prelude to future individual genome sequencing projects.
Affiliation(s)
- Jeffrey M Kidd
- Department of Genome Sciences and Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
|
3662
|
Morra M, Geigenmuller U, Curran J, Rainville IR, Brennan T, Curtis J, Reichert V, Hovhannisyan H, Majzoub J, Miller DT. Genetic Diagnosis of Primary Immune Deficiencies. Immunol Allergy Clin North Am 2008; 28:387-412, x. [DOI: 10.1016/j.iac.2008.01.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 10/22/2022]
|
3663
|
Pipkin ME, Monticelli S. Genomics and the immune system. Immunology 2008; 124:23-32. [PMID: 18298549 PMCID: PMC2434389 DOI: 10.1111/j.1365-2567.2008.02818.x] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 11/19/2007] [Revised: 01/22/2008] [Accepted: 01/23/2008] [Indexed: 01/04/2023] Open
Abstract
While the hereditary information encoded in the Watson-Crick base pairing of genomes is largely static within a given individual, access to this information is controlled by dynamic mechanisms. The human genome is pervasively transcribed, but the roles played by the majority of the non-protein-coding genome sequences are still largely unknown. In this review we focus on insights to gene transcriptional regulation by placing special emphasis on genome-wide approaches, and on how non-coding RNAs, which derive from global transcription of the genome, in turn control gene expression. We review recent progress in the field with highlights on the immune system.
Affiliation(s)
- Matthew E Pipkin
- Immune Disease Institute and Harvard Medical School, Boston, MA, USA
|
3664
|
Jawdekar GW, Henry RW. Transcriptional regulation of human small nuclear RNA genes. Biochim Biophys Acta 2008; 1779:295-305. [PMID: 18442490 PMCID: PMC2684849 DOI: 10.1016/j.bbagrm.2008.04.001] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.2] [Received: 09/06/2007] [Revised: 04/01/2008] [Accepted: 04/02/2008] [Indexed: 01/06/2023]
Abstract
The products of human snRNA genes have been frequently described as performing housekeeping functions and their synthesis as refractory to regulation. However, recent studies have emphasized that snRNA and other related non-coding RNA molecules control multiple facets of the central dogma, and their regulated expression is critical to cellular homeostasis during normal growth and in response to stress. Human snRNA genes contain compact and yet powerful promoters that are recognized by increasingly well-characterized transcription factors, thus providing a premier model system to study gene regulation. This review summarizes many recent advances in deciphering the mechanism by which the transcription of human snRNA and related genes is regulated.
Affiliation(s)
- Gauri W. Jawdekar
- Department of Microbiology, Immunology, and Molecular Genetics, University of California at Los Angeles, Los Angeles, CA 90095
- R. William Henry
- Department of Biochemistry & Molecular Biology, Michigan State University, East Lansing, MI 48824
|
3665
|
Petretto E, Sarwar R, Grieve I, Lu H, Kumaran MK, Muckett PJ, Mangion J, Schroen B, Benson M, Punjabi PP, Prasad SK, Pennell DJ, Kiesewetter C, Tasheva ES, Corpuz LM, Webb MD, Conrad GW, Kurtz TW, Kren V, Fischer J, Hubner N, Pinto YM, Pravenec M, Aitman TJ, Cook SA. Integrated genomic approaches implicate osteoglycin (Ogn) in the regulation of left ventricular mass. Nat Genet 2008; 40:546-52. [PMID: 18443592 PMCID: PMC2742198 DOI: 10.1038/ng.134] [Citation(s) in RCA: 132] [Impact Index Per Article: 7.8] [Received: 01/25/2008] [Accepted: 02/20/2008] [Indexed: 01/19/2023]
Abstract
Left ventricular mass (LVM) and cardiac gene expression are complex traits regulated by factors both intrinsic and extrinsic to the heart. To dissect the major determinants of LVM, we combined expression quantitative trait locus (eQTL) and quantitative trait transcript (QTT) analyses of the cardiac transcriptome in the rat. Using these methods and in vitro functional assays, we identified osteoglycin (Ogn) as a major candidate regulator of rat LVM, with increased Ogn protein expression associated with elevated LVM. We also applied genome-wide QTT analysis to the human heart and observed that, out of 22,000 transcripts, OGN transcript abundance had the highest correlation with LVM. We further confirmed a role for Ogn in the in vivo regulation of LVM in Ogn knockout mice. Taken together, these data implicate Ogn as a key regulator of LVM in rats, mice and humans, and suggest that Ogn modifies the hypertrophic response to extrinsic factors such as hypertension and aortic stenosis.
Affiliation(s)
- Enrico Petretto
- Medical Research Council Clinical Sciences Centre, Faculty of Medicine, Imperial College London, Hammersmith Hospital, Du Cane Road, London, W12 0NN, UK
|
3666
|
Okoruwa OE, Weston MD, Sanjeevi DC, Millemon AR, Fritzsch B, Hallworth R, Beisel KW. Evolutionary insights into the unique electromotility motor of mammalian outer hair cells. Evol Dev 2008; 10:300-15. [PMID: 18460092 PMCID: PMC2666851 DOI: 10.1111/j.1525-142x.2008.00239.x] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Indexed: 11/27/2022]
Abstract
Prestin (SLC26A5) is the molecular motor responsible for cochlear amplification by mammalian cochlea outer hair cells and has the unique combined properties of energy-independent motility, voltage sensitivity, and speed of cellular shape change. The ion transporter capability, typical of SLC26A members, was exchanged for electromotility function and is a newly derived feature of the therian cochlea. A putative minimal essential motif for the electromotility motor (meEM) was identified through the amalgamation of comparative genomic, evolution, and structural diversification approaches. Comparisons were done among nonmammalian vertebrates, eutherian mammalian species, and the opossum and platypus. The opossum and platypus SLC26A5 proteins were comparable to the eutherian consensus sequence. As suggested by the point-accepted mutation analysis, the meEM motif spans all the transmembrane segments and comprises residues 66-503. Within the eutherian clade, the meEM was highly conserved with a substitution frequency of only 39/7497 (0.5%) residues, compared with 5.7% in SLC26A4 and 12.8% in SLC26A6 genes. Clade-specific substitutions were not observed and there was no sequence correlation with low or high hearing frequency specialists. We were able to identify that within the highly conserved meEM motif two regions, which are unique to all therian species, appear to be the most derived features in the SLC26A5 peptide.
Affiliation(s)
- Oseremen E. Okoruwa
- Department of Biomedical Sciences, Creighton University School of Medicine, Omaha, NE 68178, USA
- Michael D. Weston
- Department of Biomedical Sciences, Creighton University School of Medicine, Omaha, NE 68178, USA
- Divvya C. Sanjeevi
- Department of Biomedical Sciences, Creighton University School of Medicine, Omaha, NE 68178, USA
- Amanda R. Millemon
- Department of Biomedical Sciences, Creighton University School of Medicine, Omaha, NE 68178, USA
- Bernd Fritzsch
- Department of Biomedical Sciences, Creighton University School of Medicine, Omaha, NE 68178, USA
- Richard Hallworth
- Department of Biomedical Sciences, Creighton University School of Medicine, Omaha, NE 68178, USA
- Kirk W. Beisel
- Department of Biomedical Sciences, Creighton University School of Medicine, Omaha, NE 68178, USA
|
3667
|
Abstract
PURPOSE OF REVIEW: The analysis of globin gene expression during erythropoiesis has established many principles underlying normal mammalian gene expression. New aspects of gene regulation have been revealed by natural mutations that downregulate globin gene expression and cause thalassemia. Deletions involving sequences upstream of the alpha and beta clusters suggested that the globin genes might be controlled by remote regulatory elements. This was demonstrated experimentally and suggested that many mammalian genes may be controlled in a similar manner. RECENT FINDINGS: Completion of the Human Genome Project and the associated encyclopaedia of DNA elements (ENCODE) project confirmed that human gene expression is commonly controlled by long-range, cis-acting elements. The development of chromatin immunoprecipitation has allowed us to identify binding of transcription factors and chromatin modifications at the key cis-acting sequences in vivo. In addition, chromosome conformation capture has enabled us to address the topological models proposed to mediate long-range interactions. Together, these methods have given us some insight into how long-range elements may influence gene expression and how this process may be subverted in thalassemia. SUMMARY: The review asks how remote elements regulate alpha globin expression and how natural mutations interfere with this mechanism to cause alpha thalassemia. We also speculate as to why long-range control of gene expression may have evolved in higher organisms.
Affiliation(s)
- Douglas R Higgs
- MRC Molecular Haematology Unit, Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, Headington, Oxford, UK.
|
3668
|
Tsirigos A, Rigoutsos I. Human and mouse introns are linked to the same processes and functions through each genome's most frequent non-conserved motifs. Nucleic Acids Res 2008; 36:3484-93. [PMID: 18450818 PMCID: PMC2425492 DOI: 10.1093/nar/gkn155] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 01/03/2023] Open
Abstract
We identified the most frequent, variable-length DNA sequence motifs in the human and mouse genomes and sub-selected those with multiple recurrences in the intergenic and intronic regions and at least one additional exonic instance in the corresponding genome. We discovered that these motifs have virtually no overlap with intronic sequences that are conserved between human and mouse, and thus are genome-specific. Moreover, we found that these motifs span a substantial fraction of previously uncharacterized human and mouse intronic space. Surprisingly, we found that these genome-specific motifs are over-represented in the introns of genes belonging to the same biological processes and molecular functions in both the human and mouse genomes even though the underlying sequences are not conserved between the two genomes. In fact, the processes and functions that are linked to these genome-specific sequence-motifs are distinct from the processes and functions which are associated with intronic regions that are conserved between human and mouse. The findings show that intronic regions from different genomes are linked to the same processes and functions in the absence of underlying sequence conservation. We highlight the ramifications of this observation with a concrete example that involves the microsatellite instability gene MLH1.
Affiliation(s)
- Aristotelis Tsirigos
- Bioinformatics and Pattern Discovery Group, IBM Thomas J. Watson Research Center, PO Box 218, Yorktown Heights, NY 10598, USA
|
3669
|
Brinkmeyer-Langford C, Raudsepp T, Gustafson-Seabury A, Chowdhary BP. A BAC contig map over the proximal approximately 3.3 Mb region of horse chromosome 21. Cytogenet Genome Res 2008; 120:164-72. [PMID: 18467843 DOI: 10.1159/000118758] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Accepted: 12/06/2007] [Indexed: 11/19/2022] Open
Abstract
A total of 207 BAC clones containing 155 loci were isolated and arranged into a map of linearly ordered overlapping clones over the proximal part of horse chromosome 21 (ECA21), which corresponds to the proximal half of the short arm of human chromosome 19 (HSA19p) and part of HSA5. The clones form two contigs - each corresponding to the respective human chromosomes - that are estimated to be separated by a gap of approximately 200 kb. Of the 155 markers present in the two contigs, 141 (33 genes and 108 STS) were generated and mapped in this study. The BACs provide a 4-5x coverage of the region and span an estimated length of approximately 3.3 Mb. The region presently contains one mapped marker per 22 kb on average, which represents a major improvement over the previous resolution of one marker per 380 kb obtained through the generation of a dense RH map for this segment. Dual color fluorescence in situ hybridization on metaphase and interphase chromosomes verified the relative order of some of the BACs and helped to orient them accurately in the contigs. Despite having similar gene order and content, the equine region covered by the contigs appears to be distinctly smaller than the corresponding region in human (3.3 Mb vs. 5.5-6 Mb) because the latter harbors a host of repetitive elements and gene families unique to humans/primates. Considering limited representation of the region in the latest version of the horse whole genome sequence EquCab2, the dense map developed in this study will prove useful for the assembly and annotation of the sequence data on ECA21 and will be instrumental in rapid search and isolation of candidate genes for traits mapped to this region.
Affiliation(s)
- C Brinkmeyer-Langford
- Department of Veterinary Integrative Biomedical Sciences, College of Veterinary Medicine, Texas A&M University, College Station, TX, USA
|
3670
|
Lu J, Luo L. Prediction for human transcription start site using diversity measure with quadratic discriminant. Bioinformation 2008; 2:316-21. [PMID: 18478087 PMCID: PMC2374378 DOI: 10.6026/97320630002316] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 01/28/2008] [Revised: 03/17/2008] [Accepted: 04/15/2008] [Indexed: 11/23/2022] Open
Abstract
The accurate identification of promoter regions and transcription start sites is a challenge to the construction of human transcription regulation networks. Thus, an efficient prediction method based on theoretical formulation is necessary for this purpose. We used the method of increment diversity with quadratic discriminant analysis (IDQD) to predict transcription start sites (TSS). The method produced sensitivity and positive predictive value of more than 65% with a positives-to-negatives ratio of 1:58. The performance evaluation using Receiver Operator Characteristics (ROC) showed an auROC (area under ROC) of greater than 96%. The evaluation by Precision Recall Curves (PRC) showed an auPRC (area under PRC) of about 26% for a positives-to-negatives ratio of 1:679 and about 64% for a positives-to-negatives ratio of 1:113. The results obtained with this approach are either better than or comparable to those of other known methods.
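The dependence of these figures on the positives-to-negatives ratio can be illustrated with a minimal sketch. The counts below are hypothetical and chosen only to reproduce a 65% sensitivity and 65% positive predictive value at a 1:58 ratio; they are not taken from the paper, and the helper names are assumptions made for illustration.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true sites that are recovered: TP / (TP + FN)."""
    return tp / (tp + fn)


def ppv(tp: int, fp: int) -> float:
    """Fraction of predicted sites that are true: TP / (TP + FP)."""
    return tp / (tp + fp)


def ppv_at_ratio(sens: float, fpr: float, negatives_per_positive: float) -> float:
    """Expected PPV for a given sensitivity and false-positive rate when
    there are `negatives_per_positive` negatives for every positive."""
    expected_tp = sens                           # per one positive
    expected_fp = fpr * negatives_per_positive   # per one positive
    return expected_tp / (expected_tp + expected_fp)


# Hypothetical test set: 100 true TSS and 5,800 negatives (ratio 1:58).
tp, fn, fp = 65, 35, 35
print("sensitivity:", sensitivity(tp, fn))
print("PPV:", ppv(tp, fp))

# Same classifier (same sensitivity and false-positive rate), harder ratios.
fpr = fp / 5800
for ratio in (58, 113, 679):
    print(f"1:{ratio} -> PPV {ppv_at_ratio(0.65, fpr, ratio):.3f}")
```

With the classifier held fixed, the PPV falls as negatives become more abundant, which is why precision-based figures such as auPRC are reported together with the ratio at which they were measured.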
Affiliation(s)
- Jun Lu
- Laboratory of Theoretical Biophysics, Faculty of Science and Technology, Inner Mongolia University, Hohhot 010021, P.R.China.
|
3671
|
Yaragatti M, Basilico C, Dailey L. Identification of active transcriptional regulatory modules by the functional assay of DNA from nucleosome-free regions. Genome Res 2008; 18:930-8. [PMID: 18441229 DOI: 10.1101/gr.073460.107] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Indexed: 12/21/2022]
Abstract
The identification of transcriptional regulatory modules within mammalian genomes is a prerequisite to understanding the mechanisms controlling regulated gene expression. While high-throughput microarray- and sequencing-based approaches have been used to map the genomic locations of sites of nuclease hypersensitivity or target DNA sequences bound by specific protein factors, the identification of regulatory elements using functional assays, which would provide important complementary data, has been relatively rare. Here we present a method that permits the functional identification of active transcriptional regulatory modules using a simple procedure for the isolation and analysis of DNA derived from nucleosome-free regions (NFRs), the 2% of the cellular genome that contains these elements. The more than 100 new active regulatory DNAs identified in this manner from F9 cells correspond to both promoter-proximal and distal elements, and display several features predicted for endogenous transcriptional regulators, including localization within DNase-accessible chromatin and CpG islands, and proximity to expressed genes. Furthermore, comparison with published ChIP-seq data of ES-cell chromatin shows that the functional elements we identified correspond with genomic regions enriched for H3K4me3, a histone modification associated with active transcriptional regulatory elements, and that the correspondence of H3K4me3 with our promoter-distal elements is largely ES-cell specific. The majority of the distal elements exhibit enhancer activity. Importantly, these functional DNA fragments average 149 bp in length, greatly facilitating future applications to identify transcription factor binding sites mediating their activity. Thus, this approach provides a tool for the high-resolution identification of the functional components of active promoters and enhancers.
Affiliation(s)
- Mahesh Yaragatti
- Department of Microbiology, New York University School of Medicine, New York, New York 10016, USA
|
3672
|
Petrykowska HM, Vockley CM, Elnitski L. Detection and characterization of silencers and enhancer-blockers in the greater CFTR locus. Genome Res 2008; 18:1238-46. [PMID: 18436892 DOI: 10.1101/gr.073817.107] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.7] [Indexed: 11/24/2022]
Abstract
Silencers and enhancer-blockers (EBs) are cis-acting, negative regulatory elements (NREs) that control interactions between promoters and enhancers. Although relatively uncharacterized in terms of biological mechanisms, these elements are likely to be abundant in the genome. We developed an experimental strategy to identify silencers and EBs using transient transfection assays. A known insulator and EB from the chicken beta-globin locus, cHS4, served as a control element for these assays. We examined 47 sequences from a 1.8-Mb region of human chromosome 7 for silencer and EB activities. The majority of functional elements displayed directional and promoter-specific activities. A limited number of sequences acted in a dual manner, as both silencers and EBs. We examined genomic data, epigenetic modifications, and sequence motifs within these regions. Strong silencer elements contained a novel CT-rich motif, often in multiple copies. Deletion of the motif from three regions caused a measurable loss of silencing ability in these sequences. Moreover, five duplicate occurrences of this motif were identified in the cHS4 insulator. These motifs provided an explanation for an uncharacterized silencing activity we measured in the insulator element. Overall, we identified 15 novel NREs, which contribute new insights into the prevalence and composition of sequences that negatively regulate gene expression.
Affiliation(s)
- Hanna M Petrykowska
- Genome Technology Branch, National Human Genome Research Institute, NIH, Rockville, MD 20852, USA
|
3673
|
Geller SF, Ge PS, Visel M, Flannery JG. In vitro analysis of promoter activity in Müller cells. Mol Vis 2008; 14:691-705. [PMID: 18437242 PMCID: PMC2330062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2007] [Accepted: 03/07/2008] [Indexed: 11/29/2022] Open
Abstract
PURPOSE Rational modification of promoter architecture is necessary for manipulation of transgene activity and requires accurate deciphering of regulatory control elements. Identification of minimally sized promoters is critical to the design of viral vectors for gene therapy. To this end, we evaluated computational methods for predicting short DNA sequences capable of driving gene expression in Müller cells. METHODS We measured enhanced green fluorescent protein (eGFP) expression levels driven by "full-length" promoters, and compared these data with computationally identified shorter promoter elements from the same genes. We cloned and screened over 90 sequences from nine Müller cell-associated genes: CAR2, CD44, GFAP, GLUL, PDGFRA, RLBP1, S100B, SLC1A3, and vimentin (VIM). We PCR-amplified the "full-length" promoter (~1500 bp), the proximal promoter (~500 bp), and the most proximal evolutionarily conserved region (ECR; 95-871 bp) for each gene, both with and without their respective 5' untranslated regions (UTRs), from C57BL/6J mouse genomic DNA. We selected and cloned additional ECRs from more distal genomic regions (both 5' and 3') of the VIM and CD44 genes, using both mouse and rat (Sprague-Dawley) genomic DNA as templates. PCR products were cloned into the pFTMGW or pFTM3GW lentiviral transfer vectors. Plasmid constructs were transfected into rat (wMC) or human (MIO-M1) Müller cells, and eGFP expression levels were evaluated by fluorescence microscopy and flow cytometry. Selected constructs were also examined in NIH/3T3 and Neuro-2a cells. RESULTS Several ECRs from the nine Müller cell-associated genes were able to drive reporter gene expression as well as their longer counterparts. Preliminary comparisons of ECRs from the VIM and CD44 genes suggested that inclusion of UTRs in promoter constructs resulted in increased transgene expression levels. 
Systematic comparison of promoter activity from nine Müller cell-expressed genes supported this finding, and characteristic regulation profiles were evident among the different genes tested. Importantly, individual cloned promoter sequences were capable of driving distinct levels of transgene expression, resulting in up to eightfold more cells expressing eGFP with up to 3.8-fold higher mean fluorescence intensity (MFI). Furthermore, combining constructs into single regulatory "units" modulated transgene expression, suggesting that secondary gene sequences provided in cis may be used to fine-tune gene expression levels. CONCLUSIONS In this study, we demonstrate that computational and empirical methods, when used in combination, can efficiently identify short promoters that are active in cultured Müller cells. In addition, the pFTM3GW vector can be used to study the effects of combined promoter elements. We anticipate that these methods will expedite the design and testing of synthetic/chimeric promoter constructs that should be useful for both in vitro and in vivo applications.
Affiliation(s)
- Scott F Geller
- Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94720-3190, USA
|
3674
|
Cooper GM, Brown CD. Qualifying the relationship between sequence conservation and molecular function. Genome Res 2008; 18:201-5. [PMID: 18245453 DOI: 10.1101/gr.7205808] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.5] [Indexed: 11/25/2022]
Abstract
Quantification of evolutionary constraints via sequence conservation can be leveraged to annotate genomic functional sequences. Recent efforts addressing the converse of this relationship have identified many sites in metazoan genomes with molecular function but without detectable conservation between related species. Here, we discuss explanations and implications for these results considering both practical and theoretical issues. In particular, phylogenetic scope influences the relationship between sequence conservation and function. Comparisons of distantly related species can detect constraint with high specificity due to the loss of conserved neutral sequence, but sensitivity is sacrificed as a result of functional changes related to lineage-specific biology. The strength of natural selection operating on functional sequence is also important. Mutations to functional sequences that result in small fitness effects are subject to weaker constraints. Therefore, particularly when comparing highly divergent species, functional sequences that are degenerate or biologically redundant will be prone to turnover, wherein functional sequences are replaced by effectively equivalent, but nonorthologous counterparts. Finally, considering the size and complexity of metazoan genomes and the fact that many nonconserved sequences are associated with sequence-degenerate, low-level molecular functions, we find it likely that there exist many biochemically functional sequences that are not under constraint. This hypothesis does not lead to the conclusion that huge amounts of vertebrate genomes are functionally important, but rather that such "functionality" represents molecular noise that has weak or no effect on organismal phenotypes.
Affiliation(s)
- Gregory M Cooper
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA.
|
3675
|
Prediction and analysis of nucleosome exclusion regions in the human genome. BMC Genomics 2008; 9:186. [PMID: 18430246 PMCID: PMC2386137 DOI: 10.1186/1471-2164-9-186] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 01/02/2008] [Accepted: 04/22/2008] [Indexed: 11/28/2022] Open
Abstract
Background Nucleosomes are the basic structural units of eukaryotic chromatin, and they play a significant role in regulating gene expression. Specific DNA sequence patterns are known, from empirical and theoretical studies, to influence DNA bending and flexibility, and have been shown to exclude nucleosomes. A whole genome localization of these patterns, and their analysis, can add important insights on the gene regulation mechanisms that depend upon the structure of chromatin in and around a gene. Results A whole genome annotation for nucleosome exclusion regions (NXRegions) was carried out on the human genome. Nucleosome exclusion scores (NXScores) were calculated individually for each nucleotide, giving a measure of how likely a specific nucleotide and its immediate neighborhood would impair DNA bending and, consequently, exclude nucleosomes. The resulting annotations were correlated with 19055 gene expression profiles. We developed a new method based on Grubbs' outliers test for ranking genes based on their tissue specificity, and correlated this ranking with NXScores. The results show a strong correlation between tissue specificity of a gene and the propensity of its promoter to exclude nucleosomes (the promoter region was taken as -1500 to +500 bp from the RefSeq-annotated transcription start site). In addition, NXScores correlated well with gene density, gene expression levels, and DNaseI hypersensitive sites. Conclusion We present, for the first time, a whole genome prediction of nucleosome exclusion regions for the human genome (the data are available for download from Additional Materials). Nucleosome exclusion patterns are correlated with various factors that regulate gene expression, which emphasizes the need to include chromatin structural parameters in experimental analysis of gene expression.
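The abstract does not describe how the Grubbs-based ranking is computed. As a minimal sketch, assuming that each gene is scored by the one-sided Grubbs statistic of its most highly expressed tissue (the maximum minus the mean, divided by the sample standard deviation), genes could be ranked as follows. The gene names and expression values are hypothetical.

```python
import statistics


def grubbs_max_statistic(values: list[float]) -> float:
    """One-sided Grubbs statistic for the largest value:
    G = (max - mean) / sample standard deviation."""
    sd = statistics.stdev(values)
    if sd == 0:
        return 0.0  # perfectly uniform expression: no outlier tissue
    return (max(values) - statistics.fmean(values)) / sd


def rank_by_tissue_specificity(profiles: dict[str, list[float]]) -> list[str]:
    """Order genes from most to least tissue-specific (assumed scoring)."""
    return sorted(profiles, key=lambda g: grubbs_max_statistic(profiles[g]), reverse=True)


# Hypothetical expression levels across five tissues.
profiles = {
    "geneA": [1.0, 1.0, 1.0, 1.0, 10.0],   # one outlier tissue
    "geneB": [5.0, 6.0, 5.5, 6.5, 5.0],    # broadly expressed
    "geneC": [3.0, 3.0, 3.0, 3.0, 3.0],    # uniform
}
print(rank_by_tissue_specificity(profiles))
```

A gene expressed in a single tissue reaches the largest value the statistic can take for a given number of tissues, so it sorts to the top of the ranking.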
|
3676
|
Hannenhalli S. Eukaryotic transcription factor binding sites--modeling and integrative search methods. Bioinformatics 2008; 24:1325-31. [PMID: 18426806 DOI: 10.1093/bioinformatics/btn198] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.2] [Indexed: 01/17/2023] Open
Abstract
A comprehensive knowledge of transcription factor binding sites (TFBS) is important for a mechanistic understanding of transcriptional regulation as well as for inferring gene regulatory networks. Because the DNA motif recognized by a transcription factor is typically short and degenerate, computational approaches for identifying binding sites based only on the sequence motif inevitably suffer from high error rates. Current state-of-the-art techniques for improving computational identification of binding sites can be broadly categorized into two classes: (1) approaches that aim to improve binding motif models by extracting maximal sequence information from experimentally determined binding sites and (2) approaches that supplement binding motif models with additional genomic or other attributes (such as evolutionary conservation). In this review we will discuss recent attempts to improve computational identification of TFBS through these two types of approaches and conclude with thoughts on future development.
Affiliation(s)
- Sridhar Hannenhalli
- Penn Center for Bioinformatics and Department of Genetics, University of Pennsylvania, Philadelphia, USA.
|
3677
|
Lin MF, Deoras AN, Rasmussen MD, Kellis M. Performance and scalability of discriminative metrics for comparative gene identification in 12 Drosophila genomes. PLoS Comput Biol 2008; 4:e1000067. [PMID: 18421375 PMCID: PMC2291194 DOI: 10.1371/journal.pcbi.1000067] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.2] [Received: 10/22/2007] [Accepted: 03/20/2008] [Indexed: 01/22/2023] Open
Abstract
Comparative genomics of multiple related species is a powerful methodology for the discovery of functional genomic elements, and its power should increase with the number of species compared. Here, we use 12 Drosophila genomes to study the power of comparative genomics metrics to distinguish between protein-coding and non-coding regions. First, we study the relative power of different comparative metrics and their relationship to single-species metrics. We find that even relatively simple multi-species metrics robustly outperform advanced single-species metrics, especially for shorter exons (≤240 nt), which are common in animal genomes. Moreover, the two capture largely independent features of protein-coding genes, with different sensitivity/specificity trade-offs, such that their combinations lead to even greater discriminatory power. In addition, we study how discovery power scales with the number and phylogenetic distance of the genomes compared. We find that species at a broad range of distances are comparably effective informants for pairwise comparative gene identification, but that these are surpassed by multi-species comparisons at similar evolutionary divergence. In particular, while pairwise discovery power plateaued at larger distances and never outperformed the most advanced single-species metrics, multi-species comparisons continued to benefit even from the most distant species with no apparent saturation. Last, we find that genes in functional categories typically considered fast-evolving can nonetheless be recovered at very high rates using comparative methods. Our results have implications for comparative genomics analyses in any species, including the human.
Affiliation(s)
- Michael F. Lin
- Broad Institute of MIT and Harvard University, Cambridge, Massachusetts, United States of America
- Ameya N. Deoras
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Matthew D. Rasmussen
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Manolis Kellis
- Broad Institute of MIT and Harvard University, Cambridge, Massachusetts, United States of America
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
|
3678
|
Wilm A, Higgins DG, Notredame C. R-Coffee: a method for multiple alignment of non-coding RNA. Nucleic Acids Res 2008; 36:e52. [PMID: 18420654 PMCID: PMC2396437 DOI: 10.1093/nar/gkn174] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.9] [Indexed: 12/11/2022] Open
Abstract
R-Coffee is a multiple RNA alignment package, derived from T-Coffee, designed to align RNA sequences while exploiting secondary structure information. R-Coffee uses an alignment-scoring scheme that incorporates secondary structure information within the alignment. It works particularly well as an alignment improver and can be combined with any existing sequence alignment method. In this work, we used R-Coffee to compute multiple sequence alignments combining the pairwise output of sequence aligners and structural aligners. We show that R-Coffee can improve the accuracy of all the sequence aligners. We also show that the consistency-based component of T-Coffee can improve the accuracy of several structural aligners. R-Coffee was tested on 388 BRAliBase reference datasets and on 11 longer Cmfinder datasets. Altogether our results suggest that the best protocol for aligning short sequences (less than 200 nt) is the combination of R-Coffee with the RNA pairwise structural aligner Consan. We also show that the simultaneous combination of the four best sequence alignment programs with R-Coffee produces alignments almost as accurate as those obtained with R-Coffee/Consan. Finally, we show that R-Coffee can also be used to align longer datasets beyond the usual scope of structural aligners. R-Coffee is freely available for download, along with documentation, from the T-Coffee web site (www.tcoffee.org).
Affiliation(s)
- Andreas Wilm
- The Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Ireland
|
3679
|
Soldà G, Suyama M, Pelucchi P, Boi S, Guffanti A, Rizzi E, Bork P, Tenchini ML, Ciccarelli FD. Non-random retention of protein-coding overlapping genes in Metazoa. BMC Genomics 2008; 9:174. [PMID: 18416813 PMCID: PMC2330155 DOI: 10.1186/1471-2164-9-174] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Received: 10/29/2007] [Accepted: 04/16/2008] [Indexed: 11/26/2022] Open
Abstract
Background Although the overlap of transcriptional units occurs frequently in eukaryotic genomes, its evolutionary and biological significance remains largely unclear. Here we report a comparative analysis of overlaps between genes coding for well-annotated proteins in five metazoan genomes (human, mouse, zebrafish, fruit fly and worm). Results For all analyzed species the observed number of overlapping genes is always lower than expected assuming functional neutrality, suggesting that gene overlap is negatively selected. The comparison to the random distribution also shows that retained overlaps do not exhibit random features: antiparallel overlaps are significantly enriched, while overlaps lying on the same strand and those involving coding sequences are highly underrepresented. We confirm that overlap is mostly species-specific and provide evidence that it frequently originates through the acquisition of terminal, non-coding exons. Finally, we show that overlapping genes tend to be significantly co-expressed in a breast cancer cDNA library obtained by 454 deep sequencing, and that different overlap types display different patterns of reciprocal expression. Conclusion Our data suggest that overlap between protein-coding genes is selected against in Metazoa. However, when retained it may be used as a species-specific mechanism for the reciprocal regulation of neighboring genes. The tendency of overlaps to involve non-coding regions of the genes leads to the speculation that the advantages achieved by an overlapping arrangement may be optimized by evolving regulatory non-coding transcripts.
Affiliation(s)
- Giulia Soldà
- Department of Biology and Genetics for Medical Sciences, University of Milan, 20133 Milan, Italy.
|
3680
|
Dingel J, Hanus P, Leonardi N, Hagenauer J, Zech J, Mueller JC. Local conservation scores without a priori assumptions on neutral substitution rates. BMC Bioinformatics 2008; 9:190. [PMID: 18405366 PMCID: PMC2375903 DOI: 10.1186/1471-2105-9-190] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 10/22/2007] [Accepted: 04/11/2008] [Indexed: 12/05/2022] Open
Abstract
Background Comparative genomics aims to detect signals of evolutionary conservation as an indicator of functional constraint. Surprisingly, results of the ENCODE project revealed that about half of the experimentally verified functional elements found in non-coding DNA were classified as unconstrained by computational predictions. Following this observation, it has been hypothesized that this may be partly explained by biased estimates of neutral evolutionary rates used by existing sequence conservation metrics. All methods we are aware of rely on a comparison with the neutral rate, and conservation is estimated by measuring the deviation of a particular genomic region from this rate. Consequently, it is a reasonable assumption that inaccurate neutral rate estimates may lead to biased conservation and constraint estimates. Results We propose a conservation signal that is produced by local Maximum Likelihood estimation of evolutionary parameters using an optimized sliding window and present a Kullback-Leibler projection that allows multiple different estimated parameters to be transformed into a conservation measure. This conservation measure does not rely on assumptions about neutral evolutionary substitution rates, and few a priori assumptions about the properties of the conserved regions are imposed. We show the accuracy of our approach (KuLCons) on synthetic data and compare it to the scores generated by state-of-the-art methods (phastCons, GERP, SCONE) in an ENCODE region. We find that KuLCons is most often in agreement with the conservation/constraint signatures detected by GERP and SCONE, while qualitatively very different patterns from phastCons are observed. In contrast to standard methods, KuLCons can be extended to more complex evolutionary models, e.g. taking insertion and deletion events into account, and the corresponding results show that scores obtained under this model can diverge significantly from scores using the simpler model.
Conclusion Our results suggest that discriminating among the different degrees of conservation is possible without making assumptions about neutral rates. We find, however, that the approach cannot be expected to discover constrained regions considerably different from those found by GERP and SCONE. Consequently, we conclude that the reported discrepancies between experimentally verified functional elements and computationally identified constrained elements are unlikely to be explained by biased neutral rate estimates.
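The abstract does not specify the Kullback-Leibler projection used by KuLCons. As a rough illustration of the underlying quantity only, and not of the authors' method, the sketch below computes the Kullback-Leibler divergence between two discrete distributions, for example nucleotide frequencies estimated in a local window versus a broader background. The frequencies are hypothetical.

```python
import math


def kl_divergence(p: list[float], q: list[float]) -> float:
    """D_KL(p || q) in nats for discrete distributions over the same outcomes.
    Terms with p_i == 0 contribute zero; q_i must be positive wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical nucleotide frequencies (A, C, G, T).
background = [0.25, 0.25, 0.25, 0.25]
local_window = [0.70, 0.10, 0.10, 0.10]

print(kl_divergence(local_window, background))   # larger: window differs from background
print(kl_divergence(background, background))     # identical distributions give 0.0
```

The divergence is zero only when the two distributions coincide and grows as the local estimate departs from the reference, which is the property that makes it usable as a score.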
Affiliation(s)
- Janis Dingel
- Institute for Communications Engineering, Technische Universität München, Munich, Germany.
|
3681
|
Shu W, Bo X, Zheng Z, Wang S. A novel representation of RNA secondary structure based on element-contact graphs. BMC Bioinformatics 2008; 9:188. [PMID: 18402706 PMCID: PMC2373570 DOI: 10.1186/1471-2105-9-188] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 09/04/2007] [Accepted: 04/11/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Depending on their specific structures, noncoding RNAs (ncRNAs) play important roles in many biological processes. Interest in developing new topological indices based on RNA graphs has been revived in recent years, as such indices can be used to compare, identify and classify RNAs. Although the topological indices presented before characterize the main topological features of RNA secondary structures, information on RNA structural details is ignored to some degree. Therefore, it is necessary to identify topological features with low degeneracy based on complete and fine-grained RNA graphical representations. RESULTS In this study, we present a complete and fine-grained scheme for RNA graph representation as a new basis for constructing RNA topological indices. We propose a combination of three vertex-weighted element-contact graphs (ECGs) to describe the RNA element details and their adjacency patterns in RNA secondary structure. Both the stem and loop topologies are encoded completely in the ECGs. The relationship among the three typical topological index families defined by their ECGs and RNA secondary structures was investigated from a dataset of 6,305 ncRNAs. The applicability of topological indices is illustrated by three application case studies. Based on the applied small dataset, we find that the topological indices can distinguish true pre-miRNAs from pseudo pre-miRNAs with about 96% accuracy, and can cluster known types of ncRNAs with about 98% accuracy. CONCLUSION The results indicate that the topological indices can characterize the details of RNA structures and may have a potential role in identifying and classifying ncRNAs. Moreover, these indices may lead to a new approach for discovering novel ncRNAs. However, further research is needed to fully resolve the challenging problem of predicting and classifying noncoding RNAs.
Affiliation(s)
- Wenjie Shu
- Beijing Institute of Radiation Medicine, Beijing 100850, China.
|
3682
|
Uhr M, Tontsch A, Namendorf C, Ripke S, Lucae S, Ising M, Dose T, Ebinger M, Rosenhagen M, Kohli M, Kloiber S, Salyakina D, Bettecken T, Specht M, Pütz B, Binder EB, Müller-Myhsok B, Holsboer F. Polymorphisms in the drug transporter gene ABCB1 predict antidepressant treatment response in depression. Neuron 2008; 57:203-9. [PMID: 18215618 DOI: 10.1016/j.neuron.2007.11.017] [Citation(s) in RCA: 247] [Impact Index Per Article: 14.5] [Received: 03/19/2007] [Revised: 08/10/2007] [Accepted: 11/12/2007] [Indexed: 01/01/2023]
Abstract
The clinical efficacy of a systemically administered drug acting on the central nervous system depends on its ability to pass the blood-brain barrier, which is regulated by transporter molecules such as ABCB1 (MDR1). Here we report that polymorphisms in the ABCB1 gene predict the response to antidepressant treatment in those depressed patients receiving drugs that have been identified as substrates of ABCB1 using abcb1ab double-knockout mice. Our results indicate that the combined consideration of the medication's capacity to act as an ABCB1-transporter substrate and the patient's ABCB1 genotype is a strong predictor of remission. This finding can be viewed as a further step toward personalized antidepressant treatment.
Affiliation(s)
- Manfred Uhr
- Max Planck Institute of Psychiatry, Kraepelinstr. 10, 80804 Munich, Germany.
|
3683
|
Abstract
Biological functions governed by the circadian clock are the evident result of the entrainment of living organisms by the earth's day and night cycle. However, the circadian clock is not unique: cells and organisms possess many other cyclic activities. These activities are difficult to observe when they are carried out by single cells that are not coordinated; if they can be detected, cell-to-cell cross-talk and synchronization among cells must exist. Some of these cycles are metabolic, and cell synchronization is due to small molecules acting as metabolic messengers. We present a short survey of cellular cycles, paying special attention to metabolic cycles and cellular cross-talk, particularly where the synchronization of metabolism or, more generally, of cellular functions is concerned. Questions arising from the observation of phenomena based on cell communication and from basic cellular cycles are also proposed.
Affiliation(s)
- Michele M Bianchi
- Department of Cell and Developmental Biology, University of Rome La Sapienza, Rome, Italy.
|
3684
|
Kawaji H, Nakamura M, Takahashi Y, Sandelin A, Katayama S, Fukuda S, Daub CO, Kai C, Kawai J, Yasuda J, Carninci P, Hayashizaki Y. Hidden layers of human small RNAs. BMC Genomics 2008. [PMID: 18402656 DOI: 10.1186/1471-2164-9-157] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 04/15/2023] Open
Abstract
BACKGROUND Small RNA attracts increasing interest based on the discovery of RNA silencing and the rapid progress of our understanding of these phenomena. Although recent studies suggest the possible existence of yet undiscovered types of small RNAs in higher organisms, many studies to profile small RNA have focused on miRNA and/or siRNA rather than on the exploration of additional classes of RNAs. RESULTS Here, we explored human small RNAs by unbiased sequencing of RNAs with sizes of 19-40 nt. We provide substantial evidence for the existence of independent classes of small RNAs. Our data show that well-characterized non-coding RNAs, such as tRNA, snoRNA, and snRNA, are cleaved at sites specific to the class of ncRNA. In particular, tRNA cleavage is regulated depending on tRNA type and tissue expression. We also found small RNAs mapped to genomic regions that are transcribed in both directions by bidirectional promoters, indicating that the small RNAs are a product of dsRNA formation and its subsequent cleavage. Their partial similarity with ribosomal RNAs (rRNAs) suggests unrevealed functions of ribosomal DNA or interstitial rRNA. Further examination revealed six novel miRNAs. CONCLUSION Our results underscore the complexity of the small RNA world and the biogenesis of small RNAs.
Affiliation(s)
- Hideya Kawaji
- Genome Science Laboratory, Discovery and Research Institute, RIKEN Wako Main Campus, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan.
|
3685
|
Kawaji H, Nakamura M, Takahashi Y, Sandelin A, Katayama S, Fukuda S, Daub CO, Kai C, Kawai J, Yasuda J, Carninci P, Hayashizaki Y. Hidden layers of human small RNAs. BMC Genomics 2008; 9:157. [PMID: 18402656 PMCID: PMC2359750 DOI: 10.1186/1471-2164-9-157] [Citation(s) in RCA: 239] [Impact Index Per Article: 14.1] [Received: 10/26/2007] [Accepted: 04/10/2008] [Indexed: 01/09/2023] Open
Abstract
Background Small RNA attracts increasing interest based on the discovery of RNA silencing and the rapid progress of our understanding of these phenomena. Although recent studies suggest the possible existence of yet undiscovered types of small RNAs in higher organisms, many studies to profile small RNA have focused on miRNA and/or siRNA rather than on the exploration of additional classes of RNAs. Results Here, we explored human small RNAs by unbiased sequencing of RNAs with sizes of 19–40 nt. We provide substantial evidence for the existence of independent classes of small RNAs. Our data show that well-characterized non-coding RNAs, such as tRNA, snoRNA, and snRNA, are cleaved at sites specific to the class of ncRNA. In particular, tRNA cleavage is regulated depending on tRNA type and tissue expression. We also found small RNAs mapped to genomic regions that are transcribed in both directions by bidirectional promoters, indicating that the small RNAs are a product of dsRNA formation and its subsequent cleavage. Their partial similarity with ribosomal RNAs (rRNAs) suggests unrevealed functions of ribosomal DNA or interstitial rRNA. Further examination revealed six novel miRNAs. Conclusion Our results underscore the complexity of the small RNA world and the biogenesis of small RNAs.
Affiliation(s)
- Hideya Kawaji
- Genome Science Laboratory, Discovery and Research Institute, RIKEN Wako Main Campus, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan.
|
3686
|
Abstract
The past few years have revealed that the genomes of all studied eukaryotes are almost entirely transcribed, generating an enormous number of non-protein-coding RNAs (ncRNAs). In parallel, it is increasingly evident that many of these RNAs have regulatory functions. Here, we highlight recent advances that illustrate the diversity of ncRNA control of genome dynamics, cell biology, and developmental programming.
Affiliation(s)
- Paulo P Amaral
- Institute for Molecular Bioscience, University of Queensland, St. Lucia QLD 4072, Australia
|
3687
|
RNA landscape of evolution for optimal exon and intron discrimination. Proc Natl Acad Sci U S A 2008; 105:5797-802. [PMID: 18391195 DOI: 10.1073/pnas.0801692105] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.2] [Indexed: 11/18/2022] Open
Abstract
Accurate pre-mRNA splicing requires primary splicing signals, including the splice sites, a polypyrimidine tract, and a branch site, as well as other splicing-regulatory elements (SREs). The SREs include exonic splicing enhancers (ESEs), exonic splicing silencers (ESSs), intronic splicing enhancers (ISEs), and intronic splicing silencers (ISSs), which are typically located near the splice sites. However, it is unclear to what extent splicing-driven selective pressure constrains exonic and intronic sequences, especially those distant from the splice sites. Here, we studied the distribution of SREs in human genes in terms of DNA strand-asymmetry patterns. Under a neutral evolution model, each mononucleotide or oligonucleotide should have a symmetric (Chargaff's second parity rule), or weakly asymmetric yet uniform, distribution throughout a pre-mRNA transcript. However, we found that large sets of unbiased, experimentally determined SREs show a distinct strand-asymmetry pattern that is inconsistent with the neutral evolution model and reflects their functional roles in splicing. ESEs are selected in exons and depleted in introns, and vice versa for ESSs. Surprisingly, this trend extends into deep intronic sequences, accounting for one third of the genome. Selection is detectable even at the mononucleotide level, so that the asymmetric base compositions of exons and introns are predictive of ESEs and ESSs. We developed a method that effectively predicts SREs based on strand asymmetry, expanding the current catalog of SREs. Our results suggest that human genes have been optimized for exon and intron discrimination through an RNA landscape shaped during evolution.
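The notion of strand asymmetry used in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the function name `base_skews`, the skew definitions (A versus T and G versus C counted on one strand), and the toy sequences are assumptions chosen for illustration. Under Chargaff's second parity rule both skews would be close to zero within a single strand.

```python
from collections import Counter


def base_skews(sequence):
    """Return single-strand AT and GC skews, e.g. (A - T) / (A + T).

    A value near 0 is what strand symmetry predicts; a value far from 0
    indicates strand asymmetry. Illustrative definition only.
    """
    counts = Counter(sequence.upper())

    def skew(x, y):
        total = counts[x] + counts[y]
        # Avoid division by zero when neither base occurs.
        return (counts[x] - counts[y]) / total if total else 0.0

    return {"AT": skew("A", "T"), "GC": skew("G", "C")}


# Hypothetical toy sequences, not data from the paper.
symmetric = base_skews("ACGTACGT")
asymmetric = base_skews("AAGGAATT")
print(symmetric)
print(asymmetric)
```

A fuller analysis would presumably compute such skews separately over exonic and intronic windows and compare them; that windowing step is omitted here.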
|
3688
|
Levers and fulcrums: progress in cis-regulatory motif models. Nat Methods 2008; 5:297-8. [DOI: 10.1038/nmeth0408-297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
|
3689
|
Brown SDM, Hardisty-Hughes RE, Mburu P. Quiet as a mouse: dissecting the molecular and genetic basis of hearing. Nat Rev Genet 2008; 9:277-90. [PMID: 18283275 DOI: 10.1038/nrg2309] [Citation(s) in RCA: 107] [Impact Index Per Article: 6.3] [Indexed: 01/10/2023]
Abstract
Mouse genetics has made crucial contributions to the understanding of the molecular mechanisms of hearing. With the help of a plethora of mouse mutants, many of the key genes that are involved in the development and functioning of the auditory system have been elucidated. Mouse mutants continue to shed light on the genetic and physiological bases of human hearing impairment, including both early- and late-onset deafness. A combination of genetic and physiological studies of mouse mutant lines, allied to investigations into the protein networks of the stereocilia bundle in the inner ear, is identifying key complexes that are crucial for auditory function and is providing profound insights into the underlying causes of hearing loss.
|
3690
|
Mendenhall EM, Bernstein BE. Chromatin state maps: new technologies, new insights. Curr Opin Genet Dev 2008; 18:109-15. [PMID: 18339538 PMCID: PMC2486450 DOI: 10.1016/j.gde.2008.01.010] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.8] [Received: 12/18/2007] [Revised: 01/15/2008] [Accepted: 01/15/2008] [Indexed: 02/02/2023]
Abstract
Recent years have seen unprecedented characterization of mammalian chromatin thanks to advances in chromatin assays, antibody development, and genomics. Genome-wide maps of chromatin state can now be readily acquired using microarrays or next-generation sequencing technologies. These datasets reveal local and long-range chromatin patterns that offer insight into the locations and functions of underlying regulatory elements and genes. These patterns are dynamic across developmental stages and lineages. Global studies of chromatin in embryonic stem cells have led to intriguing hypotheses regarding Polycomb/trithorax and RNA polymerase roles in 'poising' transcription. Chromatin state maps thus provide a rich resource for understanding chromatin at a 'systems level', and a starting point for mechanistic studies aimed at defining epigenetic controls that underlie development.
Affiliation(s)
- Eric M. Mendenhall
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
- Department of Pathology, Harvard Medical School, Boston, MA
- Broad Institute of Harvard and MIT, Cambridge, MA
|
3691
|
Abstract
PURPOSE OF REVIEW Over the past two decades serious effort has been invested in the search for genes that predispose to common obesity, but progress has been slow and success limited. Genome-wide association, however, has revived optimism. Here we review recent advances in the field of obesity genetics and discuss the most important findings of candidate gene studies, genome-wide linkage studies and genome-wide association studies. We conclude by speculating about the way forward in the near future. RECENT FINDINGS Although large-scale candidate gene studies have placed MC4R more firmly on the human obesity map, the major breakthrough in obesity genetics was the discovery of FTO through genome-wide association. Variants located in the first intron of FTO were unequivocally associated with a 1.67-fold increased risk for obesity and a 0.40-0.66 kg/m² increase in body mass index. SUMMARY Genome-wide association promises to enhance greatly our understanding of the genetic basis of common obesity, although candidate gene studies will remain a valuable approach because they allow more detailed analyses of biologically relevant candidates. A key factor contributing to continued success lies in large-scale data integration through international collaboration, which will provide the sample sizes required to identify genetic association with conclusive evidence.
Affiliation(s)
- Shengxu Li
- Medical Research Council Epidemiology Unit, Institute of Metabolic Science, Cambridge, UK
|
3692
|
Merkenschlager M, Wilson CB. RNAi and chromatin in T cell development and function. Curr Opin Immunol 2008; 20:131-8. [PMID: 18440793 DOI: 10.1016/j.coi.2008.03.013] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Received: 12/18/2007] [Accepted: 03/27/2008] [Indexed: 11/24/2022]
Abstract
Small noncoding RNAs, including small interfering RNAs (siRNAs) and microRNAs (miRNAs), have emerged as important regulators of gene expression. They control transcription by changing the structure of chromatin and regulate mRNA stability and translation at the post-transcriptional level. This year has seen exciting progress in our ability to map chromatin structure and chromatin-associated factors genome-wide, as well as striking examples of how individual miRNAs affect the development and the function of the immune system.
Affiliation(s)
- Matthias Merkenschlager
- Lymphocyte Development Group, MRC Clinical Sciences Centre, Imperial College London, Du Cane Road, London W12 0NN, UK.
|
3693
|
Pérez A, Lankas F, Luque FJ, Orozco M. Towards a molecular dynamics consensus view of B-DNA flexibility. Nucleic Acids Res 2008; 36:2379-94. [PMID: 18299282 PMCID: PMC2367714 DOI: 10.1093/nar/gkn082] [Citation(s) in RCA: 131] [Impact Index Per Article: 7.7] [Received: 12/03/2007] [Revised: 02/07/2008] [Accepted: 02/08/2008] [Indexed: 01/05/2023] Open
Abstract
We present a systematic study of B-DNA flexibility in aqueous solution using long-scale molecular dynamics simulations with the two most recent versions of nucleic acid force fields (CHARMM27 and parmbsc0) and four long duplexes designed to contain several copies of each individual base pair step. Our study highlights some differences between the parmbsc0 and CHARMM27 families of simulations, but also extensive agreement in the representation of DNA flexibility. We also performed additional simulations with the older AMBER force fields parm94 and parm99, corrected for non-canonical backbone flips. Taken together, the results allow us to draw for the first time a consensus molecular dynamics picture of B-DNA flexibility.
Affiliation(s)
- Alberto Pérez, Filip Lankas, F. Javier Luque, Modesto Orozco
- Joint IRB-BSC Program on Computational Biology, Institute of Research in Biomedicine, Parc Científic de Barcelona, Josep Samitier 1-5, Barcelona 08028, Barcelona Supercomputing Centre, Jordi Girona 31, Edifici Torre Girona. Barcelona 08034, Departament de Fisicoquímica, Facultat de Farmàcia, Avgda Diagonal sn, Barcelona 08028, Spain, Laboratory for Computation and Visualization in Mathematics and Mechanics, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland, Centre for Complex Molecular Systems and Biomolecules, Institute of Organic Chemistry and Biochemistry Flemingovo nam. 2, 166 10 Praha 6, Czech Republic, National Institute of Bioinformatics, Parc Científic de Barcelona, Josep Samitier 1-5 and Departament de Bioquímica, Facultat de Biología, Avgda Diagonal 647, Barcelona 08028, Spain
|
3694
|
Borel C, Gagnebin M, Gehrig C, Kriventseva EV, Zdobnov EM, Antonarakis SE. Mapping of small RNAs in the human ENCODE regions. Am J Hum Genet 2008; 82:971-81. [PMID: 18394580 DOI: 10.1016/j.ajhg.2008.02.016] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 11/16/2007] [Revised: 01/28/2008] [Accepted: 02/26/2008] [Indexed: 10/22/2022] Open
Abstract
The elucidation of the largely unknown transcriptome of small RNAs is crucial for the understanding of genome and cellular function. We report here the results of the analysis of small RNAs (<50 nt) in the ENCODE regions of the human genome. Size-fractionated RNAs from four different cell lines (HepG2, HeLaS3, GM06990, SK-N-SH) were mapped with the forward and reverse ENCODE high-density resolution tiling arrays. The top 1% of hybridization signals are termed SmRfrags (Small RNA fragments). Eight percent of SmRfrags overlap the GENCODE genes (CDS), whereas the majority map to intergenic regions (34%), intronic regions (53%), and untranslated regions (UTRs) (5%). In addition, 9.6% and 16.8% of SmRfrags in the 5' UTR regions overlap significantly with His/Pol II/TAF250 binding sites and DNase I hypersensitive sites, respectively (compared to the 5.3% and 9% expected). Interestingly, 17%-24% (depending on the cell line) of SmRfrags are sense-antisense strand pairs that show evidence of overlapping transcription. Only 3.4% and 7.2% of SmRfrags in intergenic regions overlap transcribed fragments (Txfrags) in HeLa and GM06990 cell lines, respectively. We hypothesized that a fraction of the identified SmRfrags corresponded to microRNAs. We tested by Northern blot a set of 15 high-likelihood predictions of microRNA candidates that overlap with SmRfrags and validated three potential microRNAs (approximately 20 nt in length). Notably, most of the remaining candidates showed a larger hybridizing band (approximately 100 nt) that could be a microRNA precursor. The small RNA transcriptome is emerging as an important and abundant component of genome function.
|
3695
|
de Guzman Strong C, Segre JA. Navigating the genome. J Cell Sci 2008; 121:921-3. [PMID: 18354081 DOI: 10.1242/jcs.022400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Affiliation(s)
- Cristina de Guzman Strong
- National Human Genome Research Institute, National Institutes of Health, 49 Convent Drive, Bethesda, MD 20892, USA
|
3696
|
Eskin E. Increasing power in association studies by using linkage disequilibrium structure and molecular function as prior information. Genome Res 2008; 18:653-60. [PMID: 18353808 PMCID: PMC2279252 DOI: 10.1101/gr.072785.107] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Received: 10/20/2007] [Accepted: 02/12/2008] [Indexed: 11/25/2022]
Abstract
The availability of various types of genomic data provides an opportunity to incorporate these data as prior information in genetic association studies. This information includes knowledge of linkage disequilibrium structure as well as which regions are likely to be involved in disease. In this paper, we present an approach for incorporating this information by revisiting how we perform multiple-hypothesis correction. In a traditional association study, in order to correct for multiple-hypothesis testing, the significance threshold at each marker, t, is set to control the total false-positive rate. In our framework, we vary the threshold at each marker, t(i), and use these thresholds to incorporate prior information. We present a numerical procedure for solving for thresholds that maximize association study power using prior information. We also present the results of benchmark simulation experiments using the HapMap data, which demonstrate a significant increase in association study power under this framework. We provide a Web server for performing association studies using our method, provide thresholds optimized for the Affymetrix 500k and Illumina HumanHap 550 chips, and demonstrate the application of our framework to the analysis of the Wellcome Trust Case Control Consortium data.
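The idea of per-marker thresholds t(i) can be sketched with a weighted Bonferroni split. This is a simpler stand-in, not the numerical power-maximizing procedure the abstract describes: the function names, the prior weights, and the p-values below are all hypothetical. Each marker receives a share of the total false-positive budget proportional to its prior weight, so the thresholds still sum to the overall level.

```python
def weighted_thresholds(prior_weights, alpha=0.05):
    """Split a total false-positive budget alpha across markers.

    Marker i gets t_i = alpha * w_i / sum(w), so the thresholds sum to
    alpha. With equal weights this reduces to the usual single
    Bonferroni threshold alpha / m.
    """
    total = sum(prior_weights)
    if total <= 0:
        raise ValueError("prior weights must sum to a positive value")
    return [alpha * w / total for w in prior_weights]


def significant_markers(p_values, thresholds):
    """Return indices of markers whose p-value meets their own threshold."""
    return [i for i, (p, t) in enumerate(zip(p_values, thresholds)) if p <= t]


# Hypothetical example: marker 2 lies in a region with strong prior support.
p_values = [0.004, 0.20, 0.02, 0.03]
uniform = weighted_thresholds([1.0, 1.0, 1.0, 1.0])
weighted = weighted_thresholds([1.0, 1.0, 6.0, 2.0])

hits_uniform = significant_markers(p_values, uniform)
hits_weighted = significant_markers(p_values, weighted)
print(hits_uniform, hits_weighted)
```

In this toy case the weighted split recovers marker 2, which the uniform threshold misses, at the cost of stricter thresholds on low-weight markers.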
Affiliation(s)
- Eleazar Eskin
- Department of Computer Science and Human Genetics, University of California, Los Angeles, Los Angeles, California 90095, USA.
|
3697
|
Nielsen DA, Ji F, Yuferov V, Ho A, Chen A, Levran O, Ott J, Kreek MJ. Genotype patterns that contribute to increased risk for or protection from developing heroin addiction. Mol Psychiatry 2008; 13:417-28. [PMID: 18195715 PMCID: PMC3810149 DOI: 10.1038/sj.mp.4002147] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.4] [Received: 12/03/2007] [Accepted: 12/06/2007] [Indexed: 11/09/2022]
Abstract
A genome-wide association study was conducted using microarray technology to identify genes that may be associated with the vulnerability to develop heroin addiction, using DNA from 104 individual former severe heroin addicts (meeting Federal criteria for methadone maintenance) and 101 individual control subjects, all Caucasian. Using separate analyses for autosomal and X chromosomal variants, we found that the strongest associations of allele frequency with heroin addiction were with the autosomal variants rs965972, located in the Unigene cluster Hs.147755 (experiment-wise q=0.053), and rs1986513 (q=0.187). The three variants exhibiting the strongest association with heroin addiction by genotype frequency were rs1714984, located in an intron of the gene for the transcription factor myocardin (P=0.000022), rs965972 (P=0.000080) and rs1867898 (P=0.000284). One genotype pattern (AG-TT-GG) was found to be significantly associated with developing heroin addiction (odds ratio (OR)=6.25) and explained 27% of the population attributable risk for heroin addiction in this cohort. Another genotype pattern (GG-CT-GG) of these variants was found to be significantly associated with protection from developing heroin addiction (OR=0.13), and lacking this genotype pattern explained 83% of the population attributable risk for developing heroin addiction. Evidence was found for involvement of five genes in heroin addiction, the genes coding for the mu opioid receptor, the metabotropic receptors mGluR6 and mGluR8, nuclear receptor NR4A2 and cryptochrome 1 (photolyase-like). This approach has identified several new genes potentially associated with heroin addiction and has confirmed the role of OPRM1 in this disease.
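The odds ratio and population attributable risk quoted in this abstract are standard case-control quantities, and a minimal sketch shows how such figures are typically computed. The abstract does not state which formula the authors used; the version below assumes Miettinen's case-based formula, treats the odds ratio as an approximation of relative risk, and uses hypothetical counts rather than the study's data.

```python
def odds_ratio(case_exposed, case_unexposed, control_exposed, control_unexposed):
    """Odds ratio from a 2x2 table of carriers vs non-carriers."""
    return (case_exposed * control_unexposed) / (case_unexposed * control_exposed)


def attributable_fraction(case_exposed, case_unexposed, or_value):
    """Population attributable fraction, assuming Miettinen's formula.

    PAF = p_c * (OR - 1) / OR, where p_c is the proportion of cases
    carrying the genotype pattern and OR stands in for relative risk.
    """
    p_cases = case_exposed / (case_exposed + case_unexposed)
    return p_cases * (or_value - 1.0) / or_value


# Hypothetical counts: 30 of 100 cases and 10 of 100 controls carry the pattern.
or_value = odds_ratio(30, 70, 10, 90)
paf = attributable_fraction(30, 70, or_value)
print(round(or_value, 3), round(paf, 3))
```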
Affiliation(s)
- D A Nielsen
- Laboratory of the Biology of Addictive Diseases, The Rockefeller University, New York, NY 10065, USA.
|
3698
|
Houseley J, Tollervey D. The nuclear RNA surveillance machinery: The link between ncRNAs and genome structure in budding yeast? BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2008; 1779:239-46. [DOI: 10.1016/j.bbagrm.2007.12.008] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.1] [Received: 11/01/2007] [Revised: 12/18/2007] [Accepted: 12/20/2007] [Indexed: 11/26/2022]
|
3699
|
Ananthakrishnan L, Gervasi C, Szaro B. Dynamic regulation of middle neurofilament RNA pools during optic nerve regeneration. Neuroscience 2008; 153:144-53. [DOI: 10.1016/j.neuroscience.2008.02.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 10/01/2007] [Revised: 12/10/2007] [Accepted: 02/04/2008] [Indexed: 10/22/2022]
|
3700
|
Schuster P. Modeling in biological chemistry. From biochemical kinetics to systems biology. MONATSHEFTE FUR CHEMIE 2008. [DOI: 10.1007/s00706-008-0892-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|