26
|
Lesca G, Moizard MP, Bussy G, Boggio D, Hu H, Haas SA, Ropers HH, Kalscheuer VM, Des Portes V, Labalme A, Sanlaville D, Edery P, Raynaud M, Lespinasse J. Clinical and neurocognitive characterization of a family with a novel MED12 gene frameshift mutation. Am J Med Genet A 2013; 161A:3063-71. [PMID: 24039113 DOI: 10.1002/ajmg.a.36162] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 03/25/2013] [Accepted: 07/08/2013] [Indexed: 11/07/2022]
Abstract
FG syndrome, Lujan syndrome, and Ohdo syndrome, the Maat-Kievit-Brunner type, have been described as distinct syndromes with overlapping non-specific features and different missense mutations of the MED12 gene have been reported in all of them. We report a family including 10 males and 1 female affected with profound non-specific intellectual disability (ID) which was linked to a 30-cM region extending from Xp11.21 (ALAS2) to Xq22.3 (COL4A5). Parallel sequencing of all X-chromosome exons identified a frameshift mutation (c.5898dupC) of MED12. Mutated mRNA was not affected by non-sense mediated RNA decay and induced an additional abnormal isoform due to activation of cryptic splice-sites in exon 41. Dysmorphic features common to most affected males were long narrow face, high forehead, flat malar area, high nasal bridge, and short philtrum. Language was absent or very limited. Most patients had a friendly personality. Cognitive impairment, varying from borderline to profound ID was similarly observed in seven heterozygous females. There was no correlation between cognitive function and X-chromosome inactivation profiles in blood cells. The severe degree of ID in male patients, as well as variable cognitive impairment in heterozygous females suggests that the duplication observed in the present family may have a more severe effect on MED12 function than missense mutations. In a cognitively impaired male from this family, who also presented with tall stature and dysmorphism and did not have the MED12 mutation, a 600-kb duplication at 17p13.3 including the YWHAE gene, was found in a mosaic state.
|
27
|
Trepte CJC, Eichhorn V, Haas SA, Stahl K, Schmid F, Nitzschke R, Goetz AE, Reuter DA. Comparison of an automated respiratory systolic variation test with dynamic preload indicators to predict fluid responsiveness after major surgery. Br J Anaesth 2013; 111:736-42. [PMID: 23811425 DOI: 10.1093/bja/aet204] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 01/21/2023]
Abstract
BACKGROUND Predicting the response of cardiac output to volume administration remains an ongoing clinical challenge. The objective of our study was to compare the ability to predict volume responsiveness of various functional measures of cardiac preload. These included pulse pressure variation (PPV), stroke volume variation (SVV), and the recently launched automated respiratory systolic variation test (RSVT) in patients after major surgery. METHODS In this prospective study, 24 mechanically ventilated patients after major surgery were enrolled. Three consecutive volume loading steps consisting of 300 ml 6% hydroxyethylstarch 130/0.4 were performed and cardiac index (CI) was assessed by transpulmonary thermodilution. Volume responsiveness was considered as positive if CI increased by >10%. RESULTS In total 72 volume loading steps were analysed, of which 41 showed a positive volume response. Receiver operating characteristic (ROC) curve analysis revealed an area under the curve (AUC) of 0.70 for PPV, 0.72 for SVV and 0.77 for RSVT. Areas under the curves of all variables did not differ significantly from each other (P>0.05). Suggested cut-off values were 9.9% for SVV, 10.1% for PPV, and 19.7° for RSVT as calculated by the Youden Index. CONCLUSION In predicting fluid responsiveness the new automated RSVT appears to be as accurate as established dynamic indicators of preload PPV and SVV in patients after major surgery. The automated RSVT is clinically easy to use and may be useful in guiding fluid therapy in ventilated patients.
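The cut-off selection mentioned in the abstract above can be illustrated with a minimal sketch of the Youden index, J = sensitivity + specificity - 1, maximised over candidate thresholds. The function name `youden_cutoff` and the sample values are hypothetical and are not data from the study; the sketch assumes that larger values indicate a positive volume response.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    A case is predicted positive when its value is >= cutoff.
    Ties are resolved in favour of the smallest cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        # True positives: responders at or above the cutoff.
        tp = sum(1 for v, y in zip(values, labels) if y and v >= cutoff)
        # True negatives: non-responders below the cutoff.
        tn = sum(1 for v, y in zip(values, labels) if not y and v < cutoff)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Made-up illustration values (hypothetical), NOT measurements from the study.
ppv = [4.0, 6.5, 8.0, 9.0, 11.0, 12.5, 14.0, 15.0]
responder = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(ppv, responder)
print(cutoff, j)
```

With real data the candidate thresholds would normally come from the ROC curve itself; the exhaustive loop here is only meant to make the definition explicit.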
|
28
|
Van Maldergem L, Hou Q, Kalscheuer VM, Rio M, Doco-Fenzy M, Medeira A, de Brouwer APM, Cabrol C, Haas SA, Cacciagli P, Moutton S, Landais E, Motte J, Colleaux L, Bonnet C, Villard L, Dupont J, Man HY. Loss of function of KIAA2022 causes mild to severe intellectual disability with an autism spectrum disorder and impairs neurite outgrowth. Hum Mol Genet 2013; 22:3306-14. [PMID: 23615299 DOI: 10.1093/hmg/ddt187] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Indexed: 11/12/2022]
Abstract
Existence of a discrete new X-linked intellectual disability (XLID) syndrome due to KIAA2022 deficiency was questioned by disruption of KIAA2022 by an X-chromosome pericentric inversion in a XLID family we reported in 2004. Three additional families with likely pathogenic KIAA2022 mutations were discovered within the frame of systematic parallel sequencing of familial cases of XLID or in the context of routine array-CGH evaluation of sporadic intellectual deficiency (ID) cases. The c.186delC and c.3597dupA KIAA2022 truncating mutations were identified by X-chromosome exome sequencing, while array CGH discovered a 70 kb microduplication encompassing KIAA2022 exon 1 in the third family. This duplication decreased KIAA2022 mRNA level in patients' lymphocytes by 60%. Detailed clinical examination of all patients, including the two initially reported, indicated moderate-to-severe ID with autistic features, strabismus in all patients, with no specific dysmorphic features other than a round face in infancy and no structural brain abnormalities on magnetic resonance imaging (MRI). Interestingly, the patient with decreased KIAA2022 expression had only mild ID with severe language delay and repetitive behaviors falling in the range of an autism spectrum disorder (ASD). Since little is known about KIAA2022 function, we conducted morphometric studies in cultured rat hippocampal neurons. We found that siRNA-mediated KIAA2022 knockdown resulted in marked impairment in neurite outgrowth including both the dendrites and the axons, suggesting a major role for KIAA2022 in neuron development and brain function.
|
29
|
Huang L, Jolly LA, Willis-Owen S, Gardner A, Kumar R, Douglas E, Shoubridge C, Wieczorek D, Tzschach A, Cohen M, Hackett A, Field M, Froyen G, Hu H, Haas SA, Ropers HH, Kalscheuer VM, Corbett MA, Gecz J. A noncoding, regulatory mutation implicates HCFC1 in nonsyndromic intellectual disability. Am J Hum Genet 2012; 91:694-702. [PMID: 23000143 DOI: 10.1016/j.ajhg.2012.08.011] [Citation(s) in RCA: 79] [Impact Index Per Article: 6.6] [Received: 03/26/2012] [Revised: 06/26/2012] [Accepted: 08/13/2012] [Indexed: 11/28/2022]
Abstract
The discovery of mutations causing human disease has so far been biased toward protein-coding regions. Having excluded all annotated coding regions, we performed targeted massively parallel resequencing of the nonrepetitive genomic linkage interval at Xq28 of family MRX3. We identified in the binding site of transcription factor YY1 a regulatory mutation that leads to overexpression of the chromatin-associated transcriptional regulator HCFC1. When tested on embryonic murine neural stem cells and embryonic hippocampal neurons, HCFC1 overexpression led to a significant increase of the production of astrocytes and a considerable reduction in neurite growth. Two other nonsynonymous, potentially deleterious changes have been identified by X-exome sequencing in individuals with intellectual disability, implicating HCFC1 in normal brain function.
|
30
|
Sun R, Love MI, Zemojtel T, Emde AK, Chung HR, Vingron M, Haas SA. Breakpointer: using local mapping artifacts to support sequence breakpoint discovery from single-end reads. Bioinformatics 2012; 28:1024-5. [DOI: 10.1093/bioinformatics/bts064] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 01/06/2023]
|
31
|
Emde AK, Schulz MH, Weese D, Sun R, Vingron M, Kalscheuer VM, Haas SA, Reinert K. Detecting genomic indel variants with exact breakpoints in single- and paired-end sequencing data using SplazerS. Bioinformatics 2012; 28:619-27. [PMID: 22238266 DOI: 10.1093/bioinformatics/bts019] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.2] [Indexed: 12/26/2022]
Abstract
MOTIVATION The reliable detection of genomic variation in resequencing data is still a major challenge, especially for variants larger than a few base pairs. Sequencing reads crossing boundaries of structural variation carry the potential for their identification, but are difficult to map. RESULTS Here we present a method for 'split' read mapping, where prefix and suffix match of a read may be interrupted by a longer gap in the read-to-reference alignment. We use this method to accurately detect medium-sized insertions and long deletions with precise breakpoints in genomic resequencing data. Compared with alternative split mapping methods, SplazerS significantly improves sensitivity for detecting large indel events, especially in variant-rich regions. Our method is robust in the presence of sequencing errors as well as alignment errors due to genomic mutations/divergence, and can be used on reads of variable lengths. Our analysis shows that SplazerS is a versatile tool applicable to unanchored or single-end as well as anchored paired-end reads. In addition, application of SplazerS to targeted resequencing data led to the interesting discovery of a complete, possibly functional gene retrocopy variant. AVAILABILITY SplazerS is available from http://www.seqan.de/projects/splazers. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
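The split-read idea described in the abstract above, a read whose prefix and suffix both match the reference but are separated by a gap, can be sketched in a few lines. This is a deliberately simplified illustration using exact matching only; it is not the SplazerS algorithm, which tolerates sequencing errors and also handles insertions. The function `split_map_deletion`, the parameter `min_anchor`, and the toy sequences are all hypothetical.

```python
def split_map_deletion(reference, read, min_anchor=4):
    """Find a deletion supported by an exactly matching split read.

    Tries every split point that leaves at least `min_anchor` bases on
    each side. Returns (prefix_start, deletion_start, deletion_length)
    in reference coordinates, or None if no gapped placement exists.
    Only the first occurrence of the prefix is considered (a sketch
    simplification), and the leftmost breakpoint is reported when the
    placement is ambiguous.
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        prefix, suffix = read[:split], read[split:]
        prefix_start = reference.find(prefix)
        if prefix_start == -1:
            continue
        prefix_end = prefix_start + len(prefix)
        # The suffix must match downstream of the prefix.
        suffix_start = reference.find(suffix, prefix_end)
        if suffix_start > prefix_end:
            # Reference bases skipped between the two anchors form the deletion.
            return prefix_start, prefix_end, suffix_start - prefix_end
    return None


# Toy example (hypothetical): the sample lacks reference[10:18].
reference = "TTGACCATGCAAGGCTCGATTACGGATCCA"
read = reference[4:10] + reference[18:26]
print(split_map_deletion(reference, read))
```

A read copied from an unbroken stretch of the reference returns `None`, because its suffix always starts exactly where its prefix ends.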
|
32
|
Schraders M, Haas SA, Weegerink NJD, Oostrik J, Hu H, Hoefsloot LH, Kannan S, Huygen PLM, Pennings RJE, Admiraal RJC, Kalscheuer VM, Kunst HPM, Kremer H. Next-generation sequencing identifies mutations of SMPX, which encodes the small muscle protein, X-linked, as a cause of progressive hearing impairment. Am J Hum Genet 2011; 88:628-34. [PMID: 21549342 DOI: 10.1016/j.ajhg.2011.04.012] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.9] [Received: 02/26/2011] [Revised: 04/07/2011] [Accepted: 04/18/2011] [Indexed: 01/12/2023]
Abstract
In a Dutch family with an X-linked postlingual progressive hearing impairment, a critical linkage interval was determined to span a region of 12.9 Mb flanked by the markers DXS7108 and DXS7110. This interval overlaps with the previously described DFNX4 locus and contains 75 annotated genes. Subsequent next-generation sequencing (NGS) detected one variant within the linkage interval, a nonsense mutation in SMPX. SMPX encodes the small muscle protein, X-linked (SMPX). Further screening was performed on 26 index patients from small families for which X-linked inheritance of nonsyndromic hearing impairment (NSHI) was not excluded. We detected a frameshift mutation in SMPX in one of the patients. Segregation analysis of both mutations in the families in whom they were found revealed that the mutations cosegregated with hearing impairment. Although we show that SMPX is expressed in many different organs, including the human inner ear, no obvious symptoms other than hearing impairment were observed in the patients. SMPX had previously been demonstrated to be specifically expressed in striated muscle and, therefore, seemed an unlikely candidate gene for hearing impairment. We hypothesize that SMPX functions in inner ear development and/or maintenance in the IGF-1 pathway, the integrin pathway through Rac1, or both.
|
33
|
Warnatz HJ, Querfurth R, Guerasimova A, Cheng X, Haas SA, Hufton AL, Manke T, Vanhecke D, Nietfeld W, Vingron M, Janitz M, Lehrach H, Yaspo ML. Functional analysis and identification of cis-regulatory elements of human chromosome 21 gene promoters. Nucleic Acids Res 2010; 38:6112-23. [PMID: 20494980 PMCID: PMC2952857 DOI: 10.1093/nar/gkq402] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 12/12/2022]
Abstract
Given the inherent limitations of in silico studies relying solely on DNA sequence analysis, the functional characterization of mammalian promoters and associated cis-regulatory elements requires experimental support, which demands cloning and analysis of putative promoter regions. Focusing on human chromosome 21, we cloned 182 gene promoters of 2500 bp in length and conducted reporter gene assays on transfected-cell arrays. We found 56 promoters that were active in HEK293 cells, while another 49 promoters could be activated by treatment of cells with Trichostatin A or depletion of serum. We observed high correlations between promoter activities and endogenous transcript levels, RNA polymerase II occupancy, CpG islands and core promoter elements. Truncation of a subset of 62 promoters to ∼500 bp revealed that truncation rarely resulted in loss of activity, but rather in loss of responses to external stimuli, suggesting the presence of cis-regulatory response elements within distal promoter regions. In these regions, we found a strong enrichment of transcription factor binding sites that could potentially activate gene expression in the presence of stimuli. This study illustrates the modular functional architecture of chromosome 21 promoters and helps to reveal the complex mechanisms governing transcriptional regulation.
|
34
|
Hu H, Wrogemann K, Kalscheuer V, Tzschach A, Richard H, Haas SA, Menzel C, Bienek M, Froyen G, Raynaud M, Van Bokhoven H, Chelly J, Ropers H, Chen W. Erratum to: Mutation screening in 86 known X-linked mental retardation genes by droplet-based multiplex PCR and massive parallel sequencing. THE HUGO JOURNAL 2010; 3:83. [PMID: 20535404 PMCID: PMC2882641 DOI: 10.1007/s11568-010-9142-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 10/28/2022]
Abstract
[This corrects the article DOI: 10.1007/s11568-010-9137-y.]
|
35
|
Hu H, Wrogemann K, Kalscheuer V, Tzschach A, Richard H, Haas SA, Menzel C, Bienek M, Froyen G, Raynaud M, Van Bokhoven H, Chelly J, Ropers H, Chen W. Mutation screening in 86 known X-linked mental retardation genes by droplet-based multiplex PCR and massive parallel sequencing. THE HUGO JOURNAL 2010; 3:41-9. [PMID: 21836662 PMCID: PMC2882650 DOI: 10.1007/s11568-010-9137-y] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 01/13/2010] [Revised: 02/24/2010] [Accepted: 03/12/2010] [Indexed: 12/25/2022]
Abstract
Massive parallel sequencing has revolutionized the search for pathogenic variants in the human genome, but for routine diagnosis, re-sequencing of the complete human genome in a large cohort of patients is still far too expensive. Recently, novel genome partitioning methods have been developed that allow to target re-sequencing to specific genomic compartments, but practical experience with these methods is still limited. In this study, we have combined a novel droplet-based multiplex PCR method and next generation sequencing to screen patients with X-linked mental retardation (XLMR) for mutations in 86 previously identified XLMR genes. In total, affected males from 24 large XLMR families were analyzed, including three in whom the mutations were already known. Amplicons corresponding to functionally relevant regions of these genes were sequenced on an Illumina/Solexa Genome Analyzer II platform. Highly specific and uniform enrichment was achieved: on average, 67.9% unambiguously mapped reads were derived from amplicons, and for 88.5% of the targeted bases, the sequencing depth was sufficient to reliably detect variations. Potentially disease-causing sequence variants were identified in 10 out of 24 patients, including the three mutations that were already known, and all of these could be confirmed by Sanger sequencing. The robust performance of this approach demonstrates the general utility of droplet-based multiplex PCR for parallel mutation screening in hundreds of genes, which is a prerequisite for the diagnosis of mental retardation and other disorders that may be due to defects of a wide variety of genes.
|
36
|
Richard H, Schulz MH, Sultan M, Nürnberger A, Schrinner S, Balzereit D, Dagand E, Rasche A, Lehrach H, Vingron M, Haas SA, Yaspo ML. Prediction of alternative isoforms from exon expression levels in RNA-Seq experiments. Nucleic Acids Res 2010; 38:e112. [PMID: 20150413 PMCID: PMC2879520 DOI: 10.1093/nar/gkq041] [Citation(s) in RCA: 123] [Impact Index Per Article: 8.8] [Indexed: 12/22/2022]
Abstract
Alternative splicing, polyadenylation of pre-messenger RNA molecules and differential promoter usage can produce a variety of transcript isoforms whose respective expression levels are regulated in time and space, thus contributing specific biological functions. However, the repertoire of mammalian alternative transcripts and their regulation are still poorly understood. Second-generation sequencing is now opening unprecedented routes to address the analysis of entire transcriptomes. Here, we developed methods that allow the prediction and quantification of alternative isoforms derived solely from exon expression levels in RNA-Seq data. These are based on an explicit statistical model and enable the prediction of alternative isoforms within or between conditions using any known gene annotation, as well as the relative quantification of known transcript structures. Applying these methods to a human RNA-Seq dataset, we validated a significant fraction of the predictions by RT-PCR. Data further showed that these predictions correlated well with information originating from junction reads. A direct comparison with exon arrays indicated improved performances of RNA-Seq over microarrays in the prediction of skipped exons. Altogether, the set of methods presented here comprehensively addresses multiple aspects of alternative isoform analysis. The software is available as an open-source R-package called Solas at http://cmb.molgen.mpg.de/2ndGenerationSequencing/Solas/.
|
37
|
Roider HG, Lenhard B, Kanhere A, Haas SA, Vingron M. CpG-depleted promoters harbor tissue-specific transcription factor binding signals--implications for motif overrepresentation analyses. Nucleic Acids Res 2009; 37:6305-15. [PMID: 19736212 PMCID: PMC2770660 DOI: 10.1093/nar/gkp682] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Indexed: 12/26/2022]
Abstract
Motif overrepresentation analysis of proximal promoters is a common approach to characterize the regulatory properties of co-expressed sets of genes. Here we show that these approaches perform well on mammalian CpG-depleted promoter sets that regulate expression in terminally differentiated tissues such as liver and heart. In contrast, CpG-rich promoters show very little overrepresentation signal, even when associated with genes that display highly constrained spatiotemporal expression. For instance, while ∼50% of heart specific genes possess CpG-rich promoters we find that the frequently observed enrichment of MEF2-binding sites upstream of heart-specific genes is solely due to contributions from CpG-depleted promoters. Similar results are obtained for all sets of tissue-specific genes indicating that CpG-rich and CpG-depleted promoters differ fundamentally in their distribution of regulatory inputs around the transcription start site. In order not to dilute the respective transcription factor binding signals, the two promoter types should thus be treated as separate sets in any motif overrepresentation analysis.
|
38
|
Roider HG, Manke T, O'Keeffe S, Vingron M, Haas SA. PASTAA: identifying transcription factors associated with sets of co-regulated genes. Bioinformatics 2008; 25:435-42. [PMID: 19073590 PMCID: PMC2642637 DOI: 10.1093/bioinformatics/btn627] [Citation(s) in RCA: 121] [Impact Index Per Article: 7.6] [Indexed: 11/14/2022]
Abstract
MOTIVATION A major challenge in regulatory genomics is the identification of associations between functional categories of genes (e.g. tissues, metabolic pathways) and their regulating transcription factors (TFs). While, for a limited number of categories, the regulating TFs are already known, still for many functional categories the responsible factors remain to be elucidated. RESULTS We put forward a novel method (PASTAA) for detecting transcription factors associated with functional categories, which utilizes the prediction of binding affinities of a TF to promoters. This binding strength information is compared to the likelihood of membership of the corresponding genes in the functional category under study. Coherence between the two ranked datasets is seen as an indicator of association between a TF and the category. PASTAA is applied primarily to the determination of TFs driving tissue-specific expression. We show that PASTAA is capable of recovering many TFs acting tissue specifically and, in addition, provides novel associations so far not detected by alternative methods. The application of PASTAA to detect TFs involved in the regulation of tissue-specific gene expression revealed a remarkable number of experimentally supported associations. The validated success for various datasets implies that PASTAA can directly be applied for the detection of TFs associated with newly derived gene sets. AVAILABILITY The PASTAA source code as well as a corresponding web interface is freely available at http://trap.molgen.mpg.de.
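One way to make "coherence between two ranked datasets" concrete is a hypergeometric test on the overlap of the top-ranked genes of both lists, scanned over several rank cut-offs. The abstract does not spell out the statistic, so the sketch below is an assumption about the general approach rather than the PASTAA implementation; `hypergeom_tail` and `best_overlap_pvalue` are hypothetical names, and both lists are assumed to rank the same set of genes.

```python
from math import comb


def hypergeom_tail(total, set_a, set_b, overlap):
    """P(X >= overlap) when set_b genes are drawn from `total` genes,
    set_a of which belong to the other top list."""
    upper = min(set_a, set_b)
    favourable = sum(
        comb(set_a, k) * comb(total - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(total, set_b)


def best_overlap_pvalue(rank_by_affinity, rank_by_category, cutoffs):
    """Smallest hypergeometric tail probability over all cut-off pairs.

    The minimum over many cut-offs is a score, not a corrected p-value;
    its significance would need e.g. a permutation test.
    """
    total = len(rank_by_affinity)
    best = 1.0
    for m in cutoffs:
        top_affinity = set(rank_by_affinity[:m])
        for n in cutoffs:
            top_category = set(rank_by_category[:n])
            overlap = len(top_affinity & top_category)
            best = min(best, hypergeom_tail(total, m, n, overlap))
    return best


# Toy example (hypothetical gene names): identical rankings of ten genes.
genes = [f"g{i}" for i in range(10)]
print(best_overlap_pvalue(genes, genes, cutoffs=[3]))
```

Scanning cut-offs avoids fixing an arbitrary "top N" in advance, at the cost of needing a separate significance assessment for the resulting minimum.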
|
39
|
Hecht J, Kuhl H, Haas SA, Bauer S, Poustka AJ, Lienau J, Schell H, Stiege AC, Seitz V, Reinhardt R, Duda GN, Mundlos S, Robinson PN. Gene identification and analysis of transcripts differentially regulated in fracture healing by EST sequencing in the domestic sheep. BMC Genomics 2006; 7:172. [PMID: 16822315 PMCID: PMC1578570 DOI: 10.1186/1471-2164-7-172] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 03/08/2006] [Accepted: 07/05/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND The sheep is an important model animal for testing novel fracture treatments and other medical applications. Despite these medical uses and the well known economic and cultural importance of the sheep, relatively little research has been performed into sheep genetics, and DNA sequences are available for only a small number of sheep genes. RESULTS In this work we have sequenced over 47 thousand expressed sequence tags (ESTs) from libraries developed from healing bone in a sheep model of fracture healing. These ESTs were clustered with the previously available 10 thousand sheep ESTs to a total of 19087 contigs with an average length of 603 nucleotides. We used the newly identified sequences to develop RT-PCR assays for 78 sheep genes and measured differential expression during the course of fracture healing between days 7 and 42 postfracture. All genes showed significant shifts at one or more time points. 23 of the genes were differentially expressed between postfracture days 7 and 10, which could reflect an important role for these genes for the initiation of osteogenesis. CONCLUSION The sequences we have identified in this work are a valuable resource for future studies on musculoskeletal healing and regeneration using sheep and represent an important head-start for genomic sequencing projects for Ovis aries, with partial or complete sequences being made available for over 5,800 previously unsequenced sheep genes.
|
40
|
Gupta S, Vingron M, Haas SA. T-STAG: resource and web-interface for tissue-specific transcripts and genes. Nucleic Acids Res 2005; 33:W654-8. [PMID: 15980556 PMCID: PMC1160111 DOI: 10.1093/nar/gki350] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 01/19/2023]
Abstract
T-STAG (tissue-specific transcripts and genes) is a resource and web-interface, designated to analyze tissue/tumor-specific expression patterns in human and mouse transcriptomes. It integrates our refined prediction of specific expression patterns both in genes as well as in individual isoforms with man–mouse orthology data. In combination with the features for combining/contrasting the genes expressed in different tissues, T-STAG implicates important biological applications, such as the detection of differentially expressed genes in tumors, the retrieval of orthologs with significant expression in the same tissue etc. Additionally, our refined categorization of expressed sequence tags (ESTs) according to the normalization of cDNA libraries allows searching for putative low-abundant transcripts. The results are tightly linked to our visualization tools, GeneNest (expression patterns of genes) and SpliceNest (gene structure and alternative splicing). The user-friendly interface of T-STAG offers a platform for comprehensive analysis of tissue and/or tumor-specific expression patterns revealed by the EST data. T-STAG is freely accessible at .
|
41
|
Hui J, Hung LH, Heiner M, Schreiner S, Neumüller N, Reither G, Haas SA, Bindereif A. Intronic CA-repeat and CA-rich elements: a new class of regulators of mammalian alternative splicing. EMBO J 2005; 24:1988-98. [PMID: 15889141 PMCID: PMC1142610 DOI: 10.1038/sj.emboj.7600677] [Citation(s) in RCA: 180] [Impact Index Per Article: 9.5] [Received: 10/12/2004] [Accepted: 04/14/2005] [Indexed: 01/17/2023]
Abstract
We have recently identified an intronic polymorphic CA-repeat region in the human endothelial nitric oxide synthase (eNOS) gene as an important determinant of the splicing efficiency, requiring specific binding of hnRNP L. Here, we analyzed the position requirements of this CA-repeat element, which revealed its potential role in alternative splicing. In addition, we defined the RNA binding specificity of hnRNP L by SELEX: not only regular CA repeats are recognized with high affinity but also certain CA-rich clusters. Therefore, we have systematically searched the human genome databases for CA-repeat and CA-rich elements associated with alternative 5' splice sites (5'ss), followed by minigene transfection assays. Surprisingly, in several specific human genes that we tested, intronic CA RNA elements could function either as splicing enhancers or silencers, depending on their proximity to the alternative 5'ss. HnRNP L was detected specifically bound to these diverse CA elements. These data demonstrated that intronic CA sequences constitute novel and widespread regulatory elements of alternative splicing.
|
42
|
Gupta S, Zink D, Korn B, Vingron M, Haas SA. Strengths and weaknesses of EST-based prediction of tissue-specific alternative splicing. BMC Genomics 2004; 5:72. [PMID: 15453915 PMCID: PMC521684 DOI: 10.1186/1471-2164-5-72] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Received: 06/26/2004] [Accepted: 09/28/2004] [Indexed: 12/15/2022]
Abstract
BACKGROUND Alternative splicing contributes significantly to the complexity of the human transcriptome and proteome. Computational predictions of alternative splice isoforms are usually based on EST sequences, which also allow the expression pattern of the related transcripts to be approximated. However, the limited number of tissues represented in the EST data as well as the different cDNA construction protocols may influence the predictive capacity of ESTs to unravel tissue-specifically expressed transcripts. METHODS We predict tissue and tumor specific splice isoforms based on the genomic mapping (SpliceNest) of the EST consensus sequences and library annotation provided in the GeneNest database. We further ascertain the potentially rare tissue specific transcripts as the ones represented only by ESTs derived from normalized libraries. A subset of the predicted tissue and tumor specific isoforms is then validated via RT-PCR experiments over a spectrum of 40 tissue types. RESULTS Our strategy revealed 427 genes with at least one tissue specific transcript as well as 1120 genes showing tumor specific isoforms. While our experimental evaluation of computationally predicted tissue-specific isoforms revealed a high success rate in confirming the expression of these isoforms in the respective tissue, the strategy frequently failed to detect the expected restricted expression pattern. The analysis of putative lowly expressed transcripts using normalized cDNA libraries suggests that our ability to detect tissue-specific isoforms strongly depends on the expression level of the respective transcript as well as on the sensitivity of the experimental methods. Especially splice isoforms predicted to be disease-specific tend to represent transcripts that are expressed in a set of healthy tissues rather than novel isoforms. CONCLUSIONS We propose to combine the computational prediction of alternative splice isoforms with experimental validation for efficient delineation of an accurate set of tissue-specific transcripts.
|
43
|
Gupta S, Zink D, Korn B, Vingron M, Haas SA. Genome wide identification and classification of alternative splicing based on EST data. Bioinformatics 2004; 20:2579-85. [PMID: 15117759 DOI: 10.1093/bioinformatics/bth288] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.5] [Indexed: 11/14/2022]
Abstract
MOTIVATION Alternative splicing is currently seen to explain the vast disparity between the number of predicted genes in the human genome and the highly diverse proteome. The mapping of expressed sequence tag (EST) consensus sequences derived from the GeneNest database onto the genome provides an efficient way of predicting exon-intron boundaries, gene structure and alternative splicing events. However, the alternative splicing events are obscured by a large number of putatively artificial exon boundaries arising due to genomic contamination or alignment errors. The current work describes a methodology to associate quality values to the predicted exon-intron boundaries. High quality exon-intron boundaries are used to predict constitutive and alternative splicing ranked by confidence values, aiming to facilitate large-scale analysis of alternative splicing and splicing in general. RESULTS Applying the current methodology, constitutive splicing is observed in 33,270 EST clusters, out of which 45% are alternatively spliced. The classification derived from the computed confidence values for 17 of these splice events frequently correlates (15/17) with RT-PCR experiments performed for 40 different tissue samples. As an application of the confidence measure, an evaluation of the distribution of alternative splicing revealed that the majority of variants correspond to the coding regions of the genes. However, still a significant fraction maps to non-coding regions, thereby indicating a functional relevance of alternative splicing in untranslated regions. AVAILABILITY The predicted alternative splice variants are visualized in the SpliceNest database at http://splicenest.molgen.mpg.de
|
44
|
Xue Y, Haas SA, Brino L, Gusnanto A, Reimers M, Talibi D, Vingron M, Ekwall K, Wright APH. A DNA microarray for fission yeast: minimal changes in global gene expression after temperature shift. Yeast 2004; 21:25-39. [PMID: 14745780 DOI: 10.1002/yea.1053] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Indexed: 11/07/2022]
Abstract
Completion of the fission yeast genome sequence has opened up possibilities for post-genomic approaches. We have constructed a DNA microarray for genome-wide gene expression analysis in fission yeast. The microarray contains DNA fragments, PCR-amplified from a genomic DNA template, that represent >99% of the 5000 or so annotated fission yeast genes, as well as a number of control sequences. The GenomePRIDE software used for the design attempts to select similarly sized DNA fragments corresponding to gene regions within single exons, near the 3'-end of genes, that lack homology to other fission yeast genes. To validate the design and utility of the array, we studied expression changes after a 2 h temperature shift from 25 °C to 36 °C, conditions widely used when studying temperature-sensitive mutants. Obligingly, the vast majority of genes do not change more than two-fold, supporting the widely held view that temperature-shift experiments specifically reveal phenotypes associated with temperature-sensitive mutants. However, we did identify a small group of genes that showed a reproducible change in expression. Importantly, most of these corresponded to previously characterized heat-shock genes, whose expression has been reported to change after more extreme temperature shifts than those used here. We conclude that the DNA microarray represents a useful resource for fission yeast researchers as well as the broader yeast community, since it will facilitate comparison with the distantly related budding yeast, Saccharomyces cerevisiae. To maximize the utility of this resource, the array and its component parts are fully described in On-line Supplementary Information and are also available commercially.
|
45
|
Boutros M, Kiger AA, Armknecht S, Kerr K, Hild M, Koch B, Haas SA, Paro R, Perrimon N. Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 2004; 303:832-5. [PMID: 14764878 DOI: 10.1126/science.1091266] [Citation(s) in RCA: 578] [Impact Index Per Article: 28.9] [Indexed: 11/02/2022]
Abstract
A crucial aim upon completion of whole genome sequences is the functional analysis of all predicted genes. We have applied a high-throughput RNA-interference (RNAi) screen of 19,470 double-stranded (ds) RNAs in cultured cells to characterize the function of nearly all (91%) predicted Drosophila genes in cell growth and viability. We found 438 dsRNAs that identified essential genes, among which 80% lacked mutant alleles. A quantitative assay of cell number was applied to identify genes of known and uncharacterized functions. In particular, we demonstrate a role for the homolog of a mammalian acute myeloid leukemia gene (AML1) in cell survival. Such a systematic screen for cell phenotypes, such as cell viability, can thus be effective in characterizing functionally related genes on a genome-wide scale.
|
46
|
Hild M, Beckmann B, Haas SA, Koch B, Solovyev V, Busold C, Fellenberg K, Boutros M, Vingron M, Sauer F, Hoheisel JD, Paro R. An integrated gene annotation and transcriptional profiling approach towards the full gene content of the Drosophila genome. Genome Biol 2003; 5:R3. [PMID: 14709175 PMCID: PMC395735 DOI: 10.1186/gb-2003-5-1-r3] [Citation(s) in RCA: 99] [Impact Index Per Article: 4.7] [Received: 08/22/2003] [Revised: 10/13/2003] [Accepted: 11/19/2003] [Indexed: 11/19/2022]
Abstract
A novel Drosophila microarray constructed on the basis of an integrated in silico/wet biology approach provides evidence for the transcription of approximately 2,600 additional genes. Validation indicates a lower limit of 2,000 novel annotations, thus raising the number of genes that make a fly.
Background: While the genome sequences for a variety of organisms are now available, the precise number of the genes encoded is still a matter of debate. For the human genome several stringent annotation approaches have resulted in the same number of potential genes, but a careful comparison revealed only limited overlap. This indicates that only the combination of different computational prediction methods and experimental evaluation of such in silico data will provide more complete genome annotations. In order to get a more complete gene content of the Drosophila melanogaster genome, we based our new D. melanogaster whole-transcriptome microarray, the Heidelberg FlyArray, on the combination of the Berkeley Drosophila Genome Project (BDGP) annotation and a novel ab initio gene prediction of lower stringency using the Fgenesh software.
Results: Here we provide evidence for the transcription of approximately 2,600 additional genes predicted by Fgenesh. Validation of the developmental profiling data by RT-PCR and in situ hybridization indicates a lower limit of 2,000 novel annotations, thus substantially raising the number of genes that make a fly.
Conclusions: The successful design and application of this novel Drosophila microarray on the basis of our integrated in silico/wet biology approach confirms our expectation that in silico approaches alone will always tend to be incomplete. The identification of at least 2,000 novel genes highlights the importance of gathering experimental evidence to discover all genes within a genome. Moreover, as such an approach is independent of homology criteria, it will allow the discovery of novel genes unrelated to known protein families or those that have not been strictly conserved between species.
|
47
|
Haas SA, Hild M, Wright APH, Hain T, Talibi D, Vingron M. Genome-scale design of PCR primers and long oligomers for DNA microarrays. Nucleic Acids Res 2003; 31:5576-81. [PMID: 14500820 PMCID: PMC206452 DOI: 10.1093/nar/gkg752] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.2] [Indexed: 11/13/2022]
Abstract
In recent years, the demand for custom-made cDNA chips/arrays as well as whole genome chips has been increasing rapidly. The efficient selection of gene-specific primers/oligomers is of the utmost importance for the successful production of such chips. We developed GenomePRIDE, a highly flexible and scalable software package for designing primers/oligomers for large-scale projects. The program is able to generate either long oligomers (40-70 bases), or PCR primers for the amplification of gene-specific DNA fragments of user-defined length. Additionally, primers can be designed in-frame in order to facilitate large-scale cloning into expression vectors. Furthermore, GenomePRIDE can be adapted to specific applications such as the generation of genomic amplicon arrays or the design of fragments specific for alternative splice isoforms. We tested the performance of GenomePRIDE on the entire genomes of Listeria monocytogenes (1584 gene-specific PCRs, 48 long oligomers) as well as of eukaryotes such as Schizosaccharomyces pombe (5006 gene-specific PCRs), and Drosophila melanogaster (21 306 gene-specific PCRs). With its computing speed of 1000 primer pairs per hour and a PCR amplification success of 99%, GenomePRIDE represents an extremely cost- and time-effective program.
|
48
|
Coward E, Haas SA, Vingron M. SpliceNest: visualizing gene structure and alternative splicing based on EST clusters. Trends Genet 2002. [DOI: 10.1016/s0168-9525(01)02525-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.9] [Indexed: 11/17/2022]
|
49
|
Krause A, Haas SA, Coward E, Vingron M. SYSTERS, GeneNest, SpliceNest: exploring sequence space from genome to protein. Nucleic Acids Res 2002; 30:299-300. [PMID: 11752319 PMCID: PMC99107 DOI: 10.1093/nar/30.1.299] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022]
Abstract
We have integrated the protein families from SYSTERS and the expressed sequence tag (EST) clusters from our database GeneNest with SpliceNest, a new database mapping EST contigs into genomic DNA. The SYSTERS protein sequence cluster set provides an automatically generated classification of all sequences of the SWISS-PROT, TrEMBL and PIR databases into disjoint protein family and superfamily clusters. GeneNest is a database and software package for producing and visualizing gene indices from ESTs and mRNAs. Currently, the database comprises gene indices of human, mouse, Arabidopsis thaliana and zebrafish. SpliceNest is a web-based graphical tool to explore gene structure, including alternative splicing, based on a mapping of the EST consensus sequences from GeneNest to the complete human genome. The integration of SYSTERS, GeneNest and SpliceNest into one framework now permits an overall exploration of the whole sequence space covering protein, mRNA and EST sequences, as well as genomic DNA. The databases are available for querying and browsing at http://cmb.molgen.mpg.de.
|
50
|
Haas SA, Beissbarth T, Rivals E, Krause A, Vingron M. GeneNest: automated generation and visualization of gene indices. Trends Genet 2000; 16:521-3. [PMID: 12199289 DOI: 10.1016/s0168-9525(00)02116-8] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.3] [Indexed: 10/18/2022]
|