101
Yu Y, Sanderson S, Reyes M, Sharma A, Dunbar N, Srivastava T, Jüppner H, Bergwitz C. Novel NaPi-IIc mutations causing HHRH and idiopathic hypercalciuria in several unrelated families: long-term follow-up in one kindred. Bone 2012; 50:1100-6. [PMID: 22387237 PMCID: PMC3322249 DOI: 10.1016/j.bone.2012.02.015] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Received: 11/28/2011] [Revised: 02/09/2012] [Accepted: 02/15/2012] [Indexed: 02/06/2023]
Abstract
Homozygous and compound heterozygous mutations in SLC34A3, the gene encoding the sodium-dependent co-transporter NaPi-IIc, cause hereditary hypophosphatemic rickets with hypercalciuria (HHRH), a disorder characterized by renal phosphate-wasting resulting in hypophosphatemia, elevated 1,25(OH)2 vitamin D levels, hypercalciuria, rickets/osteomalacia, and frequently kidney stones or nephrocalcinosis. Similar albeit less severe biochemical changes are also observed in heterozygous carriers, which are furthermore indistinguishable from those encountered in idiopathic hypercalciuria (IH). We searched for SLC34A3 mutations (exons and introns) in two previously unreported HHRH kindreds, which resulted in the identification of three novel mutations. The affected members of kindred A were compound heterozygous for two different mutations, c.1046_47del and the intronic mutation c.560+23_561-42del, while the index case in kindred B was homozygous for the nonsense SLC34A3 mutation c.1764C>G (p.Y588X). The patient in kindred C was diagnosed with IH because of bilateral medullary nephrocalcinosis, suppressed PTH levels, and hypercalciuria; she was found to have a novel heterozygous c.1571_1880del mutation. The HHRH patients in kindred A were treated for up to 7 years with oral phosphate, which led to reversal of hypophosphatemia and hypercalciuria, and to prevention or healing of the mild bone abnormalities. PTH levels were normal throughout the observation period, while 1,25(OH)2 vitamin D levels remained elevated and may thus be helpful for assessing treatment efficacy and patient compliance in HHRH.
Affiliation(s)
- Y. Yu
- Endocrine Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- S.R. Sanderson
- Pediatric Endocrinology, Horizon Health Network, Saint John, New Brunswick E2L 4L2, Canada
- M. Reyes
- Endocrine Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- A. Sharma
- Pediatric Nephrology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- N. Dunbar
- Pediatric Endocrinology, Baystate Medical Center, Springfield, MA 01199, USA
- T. Srivastava
- Bone and Mineral Disorder Clinic, Section of Pediatric Nephrology, The Children’s Mercy Hospital and Clinics, University of Missouri at Kansas City, Kansas City, MO 64108, USA
- H. Jüppner
- Endocrine Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Pediatric Nephrology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- C. Bergwitz
- Endocrine Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
102
Moss SP, Joyce DA, Humphries S, Tindall KJ, Lunt DH. Comparative analysis of teleost genome sequences reveals an ancient intron size expansion in the zebrafish lineage. Genome Biol Evol 2011; 3:1187-96. [PMID: 21920901 PMCID: PMC3205604 DOI: 10.1093/gbe/evr090] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 11/15/2022] Open
Abstract
We have developed a bioinformatics pipeline for the comparative evolutionary analysis of Ensembl genomes and have used it to analyze the introns of the five available teleost fish genomes. We show our pipeline to be a powerful tool for revealing variation between genomes that may otherwise be overlooked with simple summary statistics. We identify that the zebrafish, Danio rerio, has an unusual distribution of intron sizes, with a greater number of larger introns in general and a notable peak in the frequency of introns of approximately 500 to 2,000 bp compared with the monotonically decreasing frequency distributions of the other fish. We determine that 47% of D. rerio introns are composed of repetitive sequences, although the remainder, over 331 Mb, is not. Because repetitive elements may be the origin of the majority of all noncoding DNA, it is likely that the remaining D. rerio intronic sequence has an ancient repetitive origin and has since accumulated so many mutations that it can no longer be recognized as such. To study such an ancient expansion of repeats in the Danio lineage will require further comparative analysis of fish genomes incorporating a broader distribution of teleost lineages.
103
Laing R, Hunt M, Protasio AV, Saunders G, Mungall K, Laing S, Jackson F, Quail M, Beech R, Berriman M, Gilleard JS. Annotation of two large contiguous regions from the Haemonchus contortus genome using RNA-seq and comparative analysis with Caenorhabditis elegans. PLoS One 2011; 6:e23216. [PMID: 21858033 PMCID: PMC3156134 DOI: 10.1371/journal.pone.0023216] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 05/24/2011] [Accepted: 07/12/2011] [Indexed: 11/30/2022] Open
Abstract
The genomes of numerous parasitic nematodes are currently being sequenced, but their complexity and size, together with high levels of intra-specific sequence variation and a lack of reference genomes, make their assembly and annotation a challenging task. Haemonchus contortus is an economically significant parasite of livestock that is widely used for basic research as well as for vaccine development and drug discovery. It is one of many medically and economically important parasites within the strongylid nematode group. This group of parasites has the closest phylogenetic relationship with the model organism Caenorhabditis elegans, making comparative analysis a potentially powerful tool for genome annotation and functional studies. To investigate this hypothesis, we sequenced two contiguous fragments from the H. contortus genome and undertook detailed annotation and comparative analysis with C. elegans. The adult H. contortus transcriptome was sequenced using an Illumina platform and RNA-seq was used to annotate a 409 kb overlapping BAC tiling path relating to the X chromosome and a 181 kb BAC insert relating to chromosome I. In total, 40 genes and 12 putative transposable elements were identified. 97.5% of the annotated genes had detectable homologues in C. elegans, of which 60% had putative orthologues, significantly higher than in previous analyses based on EST data. Gene density appears to be lower in H. contortus than in C. elegans, with annotated H. contortus genes being on average two to three times larger than their putative C. elegans orthologues due to a greater intron number and size. Synteny appears high but gene order is generally poorly conserved, although areas of conserved microsynteny are apparent. C. elegans operons appear to be partially conserved in H. contortus. Our findings suggest that a combination of RNA-seq and comparative analysis with C. elegans is a powerful approach for the annotation and analysis of strongylid nematode genomes.
Affiliation(s)
- Roz Laing
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Faculty of Veterinary Medicine, University of Glasgow, Glasgow, Strathclyde, United Kingdom
- Martin Hunt
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Anna V. Protasio
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Gary Saunders
- Faculty of Veterinary Medicine, University of Glasgow, Glasgow, Strathclyde, United Kingdom
- Karen Mungall
- Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
- Steven Laing
- Faculty of Veterinary Medicine, University of Glasgow, Glasgow, Strathclyde, United Kingdom
- Frank Jackson
- Moredun Research Institute, Pentlands Science Park, Bush Loan, Penicuik, United Kingdom
- Michael Quail
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Robin Beech
- Institute of Parasitology, McGill University, Ste Anne de Bellevue, Quebec, Canada
- Matthew Berriman
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- John S. Gilleard
- Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
104
Hatje K, Keller O, Hammesfahr B, Pillmann H, Waack S, Kollmar M. Cross-species protein sequence and gene structure prediction with fine-tuned Webscipio 2.0 and Scipio. BMC Res Notes 2011; 4:265. [PMID: 21798037 PMCID: PMC3162530 DOI: 10.1186/1756-0500-4-265] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 06/03/2011] [Accepted: 07/28/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Obtaining transcripts of homologs of closely related organisms and retrieving the reconstructed exon-intron patterns of the genes is a very important process during the analysis of the evolution of a protein family and the comparative analysis of the exon-intron structure of a certain gene from different species. Due to the ever-increasing speed of genome sequencing, the gap to genome annotation is growing. Thus, tools for the correct prediction and reconstruction of genes in related organisms become more and more important. The tool Scipio, which can also be used via the graphical interface WebScipio, performs significant hit processing of the output of the Blat program to account for sequencing errors, missing sequence, and fragmented genome assemblies. However, Scipio has so far been limited to high sequence similarity and unable to reconstruct short exons. RESULTS Scipio and WebScipio have fundamentally been extended to better reconstruct very short exons and intron splice sites and to be better suited for cross-species gene structure predictions. The Needleman-Wunsch algorithm has been implemented for the search for short parts of the query sequence that were not recognized by Blat. Those regions might either be short exons, divergent sequence at intron splice sites, or very divergent exons. We have shown the benefit and use of new parameters with several protein examples from completely different protein families in searches against species from several kingdoms of the eukaryotes. The performance of the new Scipio version has been tested in comparison with several similar tools. CONCLUSIONS With the new version of Scipio very short exons, terminal and internal, of even just one amino acid can correctly be reconstructed. Scipio is also able to correctly predict almost all genes in cross-species searches even if the ancestors of the species separated more than 100 Myr ago and if the protein sequence identity is below 80%. 
For our test cases Scipio outperforms all other software tested. WebScipio has been restructured and provides easy access to the genome assemblies of about 640 eukaryotic species. Scipio and WebScipio are freely accessible at http://www.webscipio.org.
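The abstract above mentions that the Needleman-Wunsch algorithm is used to place short query parts that Blat did not recognize. As a rough illustration of that algorithm only, here is a minimal global-alignment sketch. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example and are not taken from Scipio.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b.

    Returns (score, aligned_a, aligned_b). Scoring values are
    illustrative assumptions, not Scipio's parameters.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, x, y = needleman_wunsch("ACGT", "AGT")
    print(s)
    print(x)
    print(y)
```

Because the alignment is global, every residue of a short query fragment must be placed, which is one reason this kind of algorithm suits very short unmatched regions better than a seed-based local search.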
Affiliation(s)
- Klas Hatje
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany.
105
Recent insertion of a 52-kb mitochondrial DNA segment in the wheat lineage. Funct Integr Genomics 2011; 11:599-609. [PMID: 21761280 DOI: 10.1007/s10142-011-0237-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/16/2011] [Revised: 06/27/2011] [Accepted: 06/28/2011] [Indexed: 10/18/2022]
Abstract
The assembly of a 1.3-Mb size region of the wheat genome has provided the opportunity to study a recent nuclear mitochondrial DNA insertion (NUMT). In the present study, we have studied two bacterial artificial chromosomes (BACs) and characterized a 52-kb NUMT segment from the tetraploid and hexaploid wheat BAC libraries. The conserved orthologous NUMT regions from tetraploid and hexaploid wheat Langdon and Chinese Spring shared identical gene haplotypes even though mutations (insertions, deletions, and substitutions) had occurred. The 52-kb NUMT was present in hexaploid variety Chinese Spring, but absent in variety Hope, by sequence comparison of their corresponding region. Amplifying the NUMT junctions using a set of the wheat materials including diploid, tetraploid, and hexaploid lines showed that none of the diploid wheat carried the region and only some tetraploid and hexaploid wheat were positive for the NUMT. Age estimation of the NUMT displayed the mean ages of Langdon NUMT and Chinese Spring NUMT to be 378,000 and 416,000 years ago, respectively. Reverse transcription PCR and sequencing of the nad7 gene showed 28 C → U RNA editing sites and four partial editing sites, as expected for mitochondrial DNA expression. Specific SNPs discriminated between cDNA from the nucleus and the mitochondria and suggested that the nuclear copy was not expressed. The mitochondrial DNA studied was inserted into the genome quite recently within the wheat lineage and gave rise to the non-coding nuclear nad7 gene. The NUMT segment could be lost and acquired frequently during the wheat evolution.
106
Gupta S, Kumari K, Das J, Lata C, Puranik S, Prasad M. Development and utilization of novel intron length polymorphic markers in foxtail millet (Setaria italica (L.) P. Beauv.). Genome 2011; 54:586-602. [PMID: 21751869 DOI: 10.1139/g11-020] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Indexed: 12/12/2022]
Abstract
Introns are noncoding sequences in a gene that are transcribed to precursor mRNA but spliced out during mRNA maturation and are abundant in eukaryotic genomes. The availability of codominant molecular markers and saturated genetic linkage maps has been limited in foxtail millet (Setaria italica (L.) P. Beauv.). Here, we describe the development of 98 novel intron length polymorphic (ILP) markers in foxtail millet using sequence information of the model plant rice. A total of 575 nonredundant expressed sequence tag (EST) sequences were obtained, of which 327 and 248 unique sequences were from dehydration- and salinity-stressed suppression subtractive hybridization libraries, respectively. The BLAST analysis of 98 EST sequences suggests a nearly defined function for about 64% of them, and they were grouped into 11 different functional categories. All 98 ILP primer pairs showed a high level of cross-species amplification in two millet and two non-millet species, ranging from 90% to 100% with a mean of ∼97%. The mean observed heterozygosity and Nei's average gene diversity, 0.016 and 0.171, respectively, established the efficiency of the ILP markers for distinguishing the foxtail millet accessions. Based on 26 ILP markers, a reasonable dendrogram of 45 foxtail millet accessions was constructed, demonstrating the utility of ILP markers in germplasm characterization and in studying genomic relationships in millet and non-millet species.
Affiliation(s)
- Sarika Gupta
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, India
107
Abstract
Pre-mRNA splicing is catalyzed by the spliceosome, a multimegadalton ribonucleoprotein (RNP) complex composed of five snRNPs and numerous proteins. Intricate RNA-RNA and RNP networks, which serve to align the reactive groups of the pre-mRNA for catalysis, are formed and repeatedly rearranged during spliceosome assembly and catalysis. Both the conformation and composition of the spliceosome are highly dynamic, affording the splicing machinery its accuracy and flexibility, and these remarkable dynamics are largely conserved between yeast and metazoans. Because of its dynamic and complex nature, obtaining structural information about the spliceosome represents a major challenge. Electron microscopy has revealed the general morphology of several spliceosomal complexes and their snRNP subunits, and also the spatial arrangement of some of their components. X-ray and NMR studies have provided high-resolution structural information about spliceosomal proteins alone or complexed with one or more binding partners. The extensive interplay of RNA and proteins in aligning the pre-mRNA's reactive groups, and the presence of both RNA and protein at the core of the splicing machinery, suggest that the spliceosome is an RNP enzyme. However, elucidation of the precise nature of the spliceosome's active site awaits the generation of a high-resolution structure of its RNP core.
Affiliation(s)
- Cindy L Will
- Max Planck Institute for Biophysical Chemistry, Department of Cellular Biochemistry, Am Fassberg 11, 37077 Göttingen, Germany
108
Ritz K, van Schaik BDC, Jakobs ME, Aronica E, Tijssen MA, van Kampen AHC, Baas F. Looking ultra deep: short identical sequences and transcriptional slippage. Genomics 2011; 98:90-5. [PMID: 21624457 DOI: 10.1016/j.ygeno.2011.05.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 11/15/2010] [Revised: 05/11/2011] [Accepted: 05/16/2011] [Indexed: 01/26/2023]
Abstract
Studying transcriptomes by ultra deep sequencing provides an in-depth picture of transcriptional regulation and it facilitates the detection of rare transcriptional events. Using ultra deep sequencing of amplicons we identified known isoforms and also various new low frequency variants. Most of these variants likely involve the splicing machinery except for two events that we named variations affecting multiple exons, which are mainly deletions affecting parts of adjacent exons and intra-exonic deletions. Both events involve short identical sequences of 1 to 8 nucleotides at the junction and canonical splice sites are missing. They were identified in different genes and species at very low frequencies. We excluded that they are an artifact of PCR, sequencing, or reverse transcription. We propose that these variants represent intramolecular slippage events that require short identical sequences for reannealing of dissociated transcripts.
Affiliation(s)
- Katja Ritz
- Department of Genome Analysis, Academic Medical Center, University of Amsterdam, Meibergdreef 9, 1105AZ Amsterdam, The Netherlands
109
Marhon SA, Kremer SC. Gene Prediction Based on DNA Spectral Analysis: A Literature Review. J Comput Biol 2011; 18:639-76. [PMID: 21381961 DOI: 10.1089/cmb.2010.0184] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Indexed: 11/13/2022] Open
Affiliation(s)
- Sajid A. Marhon
- School of Computer Science, University of Guelph, Guelph, Ontario, Canada
- Stefan C. Kremer
- School of Computer Science, University of Guelph, Guelph, Ontario, Canada
110
Rodríguez-Marí A, Wilson C, Titus TA, Cañestro C, BreMiller RA, Yan YL, Nanda I, Johnston A, Kanki JP, Gray EM, He X, Spitsbergen J, Schindler D, Postlethwait JH. Roles of brca2 (fancd1) in oocyte nuclear architecture, gametogenesis, gonad tumors, and genome stability in zebrafish. PLoS Genet 2011; 7:e1001357. [PMID: 21483806 PMCID: PMC3069109 DOI: 10.1371/journal.pgen.1001357] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.4] [Received: 08/03/2010] [Accepted: 02/28/2011] [Indexed: 01/07/2023] Open
Abstract
Mild mutations in BRCA2 (FANCD1) cause Fanconi anemia (FA) when homozygous, while severe mutations cause common cancers including breast, ovarian, and prostate cancers when heterozygous. Here we report a zebrafish brca2 insertional mutant that shares phenotypes with human patients and identifies a novel brca2 function in oogenesis. Experiments showed that mutant embryos and mutant cells in culture experienced genome instability, as do cells in FA patients. In wild-type zebrafish, meiotic cells expressed brca2; and, unexpectedly, transcripts in oocytes localized asymmetrically to the animal pole. In juvenile brca2 mutants, oocytes failed to progress through meiosis, leading to female-to-male sex reversal. Adult mutants became sterile males due to the meiotic arrest of spermatocytes, which then died by apoptosis, followed by neoplastic proliferation of gonad somatic cells that was similar to neoplasia observed in ageing dead end (dnd)-knockdown males, which lack germ cells. The construction of animals doubly mutant for brca2 and the apoptotic gene tp53 (p53) rescued brca2-dependent sex reversal. Double mutants developed oocytes and became sterile females that produced only aberrant embryos and showed elevated risk for invasive ovarian tumors. Oocytes in double-mutant females showed normal localization of brca2 and pou5f1 transcripts to the animal pole and vasa transcripts to the vegetal pole, but had a polarized rather than symmetrical nucleus with the distribution of nucleoli and chromosomes to opposite nuclear poles; this result revealed a novel role for Brca2 in establishing or maintaining oocyte nuclear architecture. Mutating tp53 did not rescue the infertility phenotype in brca2 mutant males, suggesting that brca2 plays an essential role in zebrafish spermatogenesis. Overall, this work verified zebrafish as a model for the role of Brca2 in human disease and uncovered a novel function of Brca2 in vertebrate oocyte nuclear architecture. 
Women with one strong BRCA2(FANCD1) mutation have high risks of breast and ovarian cancer. People with two mild BRCA2(FANCD1) mutations develop Fanconi Anemia, which reduces DNA repair leading to genome instability, small gonads, infertility, and cancer. Humans and mice lacking BRCA2 activity die before birth. We discovered that zebrafish brca2 mutants show chromosome instability and small gonads, and they develop only as sterile adult males. Female-to-male sex reversal is due to oocyte death during sex determination. Normal animals expressed brca2 in developing eggs and sperm that are repairing DNA breaks associated with genetic reshuffling. Normal developing eggs localized brca2 RNA near the nucleus, suggesting a role in protecting rapidly dividing early embryonic cells. Sperm-forming cells died in adult mutant males. Inhibition of cell death rescued sex reversal, but not fertility. Rescued females developed invasive ovarian tumors and formed eggs with abnormal nuclear architecture. The novel role of Brca2 in organizing the vertebrate egg nucleus may provide new insights into the origin of ovarian cancer. These results validate zebrafish as a model for human BRCA2-related diseases and provide a tool for the identification of substances that can rescue zebrafish brca2 mutants and thus become candidates for therapeutic molecules for human disease.
Affiliation(s)
- Adriana Rodríguez-Marí
- Institute of Neuroscience, University of Oregon, Eugene, Oregon, United States of America
- Catherine Wilson
- Institute of Neuroscience, University of Oregon, Eugene, Oregon, United States of America
- Tom A. Titus
- Institute of Neuroscience, University of Oregon, Eugene, Oregon, United States of America
- Cristian Cañestro
- Institute of Neuroscience, University of Oregon, Eugene, Oregon, United States of America
- Ruth A. BreMiller
- Institute of Neuroscience, University of Oregon, Eugene, Oregon, United States of America
- Yi-Lin Yan
- Institute of Neuroscience, University of Oregon, Eugene, Oregon, United States of America
- Indrajit Nanda
- Institute of Human Genetics, Biocenter, University of Würzburg, Würzburg, Germany
- Adam Johnston
- Dana Farber Cancer Institute, Boston, Massachusetts, United States of America
- John P. Kanki
- Dana Farber Cancer Institute, Boston, Massachusetts, United States of America
- Erin M. Gray
- Institute of Neuroscience, University of Oregon, Eugene, Oregon, United States of America
- Xinjun He
- Institute of Neuroscience, University of Oregon, Eugene, Oregon, United States of America
- Jan Spitsbergen
- Marine and Freshwater Biomedical Sciences Center, Oregon State University, Corvallis, Oregon, United States of America
- Detlev Schindler
- Institute of Human Genetics, Biocenter, University of Würzburg, Würzburg, Germany
- John H. Postlethwait
- Institute of Neuroscience, University of Oregon, Eugene, Oregon, United States of America
111
Evolution of exon-intron structure and alternative splicing. PLoS One 2011; 6:e18055. [PMID: 21464961 PMCID: PMC3064661 DOI: 10.1371/journal.pone.0018055] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.3] [Received: 11/10/2010] [Accepted: 02/19/2011] [Indexed: 12/22/2022] Open
Abstract
Despite significant advances in high-throughput DNA sequencing, many important species remain understudied at the genome level. In this study we addressed the question of what can be predicted about the genome-wide characteristics of less studied species, based on the genomic data from completely sequenced species. Using NCBI databases we performed a comparative genome-wide analysis of such characteristics as alternative splicing, number of genes, gene products and exons in 36 completely sequenced model species. We created statistical regression models to fit these data and applied them to loblolly pine (Pinus taeda L.), an example of an important species whose genome has not been completely sequenced yet. Using these models, the genome-wide characteristics, such as total number of genes and exons, can be roughly predicted based on parameters estimated from available limited genomic data, e.g. exon length and exon/gene ratio.
112
Wang D, Yu J. Both size and GC-content of minimal introns are selected in human populations. PLoS One 2011; 6:e17945. [PMID: 21437290 PMCID: PMC3060096 DOI: 10.1371/journal.pone.0017945] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 12/02/2010] [Accepted: 02/16/2011] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND We previously have studied the insertion and deletion polymorphism by sequencing no more than one hundred introns in a mixed human population and found that the minimal introns tended to maintain length at an optimal size. Here we analyzed re-sequenced 179 individual genomes (from African, European, and Asian populations) from the data released by the 1000 Genome Project to study the size dynamics of minimal introns. PRINCIPAL FINDINGS We not only confirmed that minimal introns in human populations are selected but also found two major effects in minimal intron evolution: (i) Size-effect: minimal introns longer than an optimal size (87 nt) tend to have a higher ratio of deletion to insertion than those that are shorter than the optimal size; (ii) GC-effect: minimal introns with lower GC content tend to be more frequently deleted than those with higher GC content. The GC-effect results in a higher GC content in minimal introns than their flanking exons as opposed to larger introns (≥125 nt) that always have a lower GC content than that of their flanking exons. We also observed that the two effects are distinguishable but not completely separable within and between populations. CONCLUSIONS We validated the unique mutation dynamics of minimal introns in keeping their near-optimal size and GC content, and our observations suggest potentially important functions of human minimal introns in transcript processing and gene regulation.
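The two quantities the abstract above relies on, intron size relative to the reported optimum (87 nt) and GC content, are simple to compute. The sketch below is only an illustration of those definitions. The function names are hypothetical, and this is not the authors' analysis pipeline.

```python
OPTIMAL_SIZE_NT = 87  # optimal minimal-intron size reported in the abstract


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def size_class(length_nt, optimal=OPTIMAL_SIZE_NT):
    """Classify an intron length relative to the optimal size."""
    if length_nt > optimal:
        return "longer"
    if length_nt < optimal:
        return "shorter"
    return "optimal"


def deletion_to_insertion_ratio(n_deletions, n_insertions):
    """Ratio of deletion to insertion counts; undefined without insertions."""
    if n_insertions == 0:
        raise ValueError("no insertions observed; ratio is undefined")
    return n_deletions / n_insertions
```

Under the size effect described above, one would expect the ratio to be higher for introns in the "longer" class than for those in the "shorter" class.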
Affiliation(s)
- Dapeng Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate University of Chinese Academy of Sciences, Beijing, People's Republic of China
- Jun Yu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, People's Republic of China
113
Li WW, He L, Jin XK, Jiang H, Chen LL, Wang Y, Wang Q. Molecular cloning, characterization and expression analysis of cathepsin A gene in Chinese mitten crab, Eriocheir sinensis. Peptides 2011; 32:518-25. [PMID: 20817057 DOI: 10.1016/j.peptides.2010.08.027] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 07/26/2010] [Revised: 08/27/2010] [Accepted: 08/27/2010] [Indexed: 10/19/2022]
Abstract
Cathepsins, a superfamily of hydrolytic enzymes produced and enclosed within lysosomes, function in immune response in vertebrates; however, their function within the innate immune system of invertebrates remains largely unknown. Therefore, we investigated the immune functionality of cathepsin A (catA) in Chinese mitten crab (Eriocheir sinensis), a commercially important and disease vulnerable aquaculture species. The full length catA cDNA (2200 bp) was cloned via PCR based upon an initial expressed sequence tag (EST) isolated from a hepatopancreatic cDNA library. The catA cDNA contained a 1398 bp open reading frame (ORF) that encoded a putative 465 amino acid (aa) protein. Comparisons with other reported vertebrate cathepsins sequences revealed percent identity range from 48 to 51%. CatA mRNA expression in E. sinensis was (a) tissue-specific, with the highest expression observed in gill and (b) responsive in hemocytes to a Vibrio anguillarum challenge, with peak exposure observed 12 h post-injection. Collectively, data demonstrate the successful isolation of catA from the Chinese mitten crab, and its involvement in the innate immune system of an invertebrate.
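The ORF and protein lengths quoted above are consistent if the 1398 bp ORF is counted with its stop codon: 1398 / 3 = 466 codons, of which 465 encode amino acids. A small check, assuming that counting convention (the abstract does not state it explicitly) and using a hypothetical helper name:

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumes the ORF length is a whole number of codons; if the stop
    codon is counted in orf_bp, it is subtracted from the result.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    print(orf_protein_length(1398))
```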
Affiliation(s)
- Wei-Wei Li
- School of Life Science, East China Normal University, North Zhong-Shan Road, Shanghai, China
114
Sawada R, Mitaku S. How are exons encoding transmembrane sequences distributed in the exon-intron structure of genes? Genes Cells 2010; 16:115-21. [PMID: 21143351 DOI: 10.1111/j.1365-2443.2010.01468.x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]
Abstract
The exon-intron structure of eukaryotic genes raises a question about the distribution of transmembrane regions in membrane proteins. Were exons that encode transmembrane regions formed simply by inserting introns into preexisting genes or by some kind of exon shuffling? To answer this question, the exon-per-gene distribution was analyzed for all genes in 40 eukaryotic genomes with a particular focus on exons encoding transmembrane segments. In 21 higher multicellular eukaryotes, the percentage of multi-exon genes (those containing at least one intron) within all genes in a genome was high (>70%) and with a mean of 87%. When genes were grouped by the number of exons per gene in higher eukaryotes, good exponential distributions were obtained not only for all genes but also for the exons encoding transmembrane segments, leading to a constant ratio of membrane proteins independent of the exon-per-gene number. The positional distribution of transmembrane regions in single-pass membrane proteins showed that they are generally located in the amino or carboxyl terminal regions. This nonrandom distribution of transmembrane regions explains the constant ratio of membrane proteins to the exon-per-gene numbers because there are always two terminal (i.e., the amino and carboxyl) regions - independent of the length of sequences.
Affiliation(s)
- Ryusuke Sawada
- Department of Computational Science and Engineering, Graduate School of Engineering, Nagoya University, Furocho, Chikusa-ku, Nagoya 464-8606, Japan.
|
115
|
Dimon MT, Sorber K, DeRisi JL. HMMSplicer: a tool for efficient and sensitive discovery of known and novel splice junctions in RNA-Seq data. PLoS One 2010; 5:e13875. [PMID: 21079731 PMCID: PMC2975632 DOI: 10.1371/journal.pone.0013875] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.4] [Received: 06/10/2010] [Accepted: 09/16/2010] [Indexed: 02/01/2023] Open
Abstract
Background: High-throughput sequencing of an organism's transcriptome, or RNA-Seq, is a valuable and versatile new strategy for capturing snapshots of gene expression. However, transcriptome sequencing creates a new class of alignment problem: mapping short reads that span exon-exon junctions back to the reference genome, especially in the case where a splice junction is previously unknown. Methodology/Principal Findings: Here we introduce HMMSplicer, an accurate and efficient algorithm for discovering canonical and non-canonical splice junctions in short read datasets. HMMSplicer identifies more splice junctions than currently available algorithms when tested on publicly available A. thaliana, P. falciparum, and H. sapiens datasets without a reduction in specificity. Conclusions/Significance: HMMSplicer was found to perform especially well in compact genomes and on genes with low expression levels, alternative splice isoforms, or non-canonical splice junctions. Because HMMSplicer does not rely on pre-built gene models, the products of inexact splicing are also detected. For H. sapiens, we find 3.6% of 3′ splice sites and 1.4% of 5′ splice sites are inexact, typically differing by 3 bases in either direction. In addition, HMMSplicer provides a score for every predicted junction, allowing the user to set a threshold to tune false positive rates depending on the needs of the experiment. HMMSplicer is implemented in Python. Code and documentation are freely available at http://derisilab.ucsf.edu/software/hmmsplicer.
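The score-threshold idea mentioned in the abstract can be illustrated with a minimal sketch. This is not HMMSplicer's own code or output format: the `Junction` record, the `filter_by_score` helper, and the example scores below are all hypothetical and only show how a per-junction score might be used to trade sensitivity against false positives.

```python
from typing import NamedTuple


class Junction(NamedTuple):
    """A predicted splice junction with a confidence score (hypothetical layout)."""
    chrom: str
    start: int    # last exonic base before the intron
    end: int      # first exonic base after the intron
    score: float  # higher means more confident


def filter_by_score(junctions, min_score):
    """Keep only junctions whose score is at least min_score."""
    return [j for j in junctions if j.score >= min_score]


# Made-up predictions; real scores would come from the tool's output files.
predictions = [
    Junction("chr1", 1000, 1500, 1250.0),
    Junction("chr1", 2000, 2600, 310.5),
    Junction("chr2", 500, 900, 875.0),
]

# A stricter threshold keeps fewer junctions and should lower the false positive rate.
lenient = filter_by_score(predictions, min_score=300.0)
strict = filter_by_score(predictions, min_score=800.0)
print(len(lenient), len(strict))
```

In practice the threshold would be chosen per experiment, for example by checking how many known junctions survive at each cutoff.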
Affiliation(s)
- Michelle T. Dimon
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, California, United States of America
- Biological and Medical Informatics Program, University of California San Francisco, San Francisco, California, United States of America
- Katherine Sorber
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, California, United States of America
- Joseph L. DeRisi
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, California, United States of America
- Howard Hughes Medical Institute, Bethesda, Maryland, United States of America
|
116
|
Li WW, Jin XK, He L, Jiang H, Xie YN, Wang Q. Molecular cloning, characterization and expression analysis of cathepsin C gene involved in the antibacterial response in Chinese mitten crab, Eriocheir sinensis. Dev Comp Immunol 2010; 34:1170-1174. [PMID: 20600276 DOI: 10.1016/j.dci.2010.06.011] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 06/08/2010] [Revised: 06/11/2010] [Accepted: 06/12/2010] [Indexed: 05/29/2023]
Abstract
Cathepsins, a superfamily of hydrolytic enzymes produced and enclosed within lysosomes, function in the immune response of vertebrates; however, their function within the innate immune system of invertebrates remains largely unknown. Therefore, we investigated the immune functionality of cathepsin C (catC) in the Chinese mitten crab (Eriocheir sinensis), a commercially important and disease-vulnerable aquaculture species. The full-length catC cDNA (1481 bp) was cloned via PCR based upon an initial expressed sequence tag (EST) isolated from a hepatopancreatic cDNA library. The catC cDNA contained a 1284 bp open reading frame (ORF) that encoded a putative 427 amino acid (aa) protein. Comparisons with other reported invertebrate and vertebrate cathepsin sequences revealed high percent identity. CatC mRNA expression in E. sinensis was responsive in hemocytes to a Vibrio anguillarum challenge, with peak expression observed 6 h post-injection. Collectively, the data demonstrate the successful isolation of catC from the Chinese mitten crab and its involvement in the innate immune system of an invertebrate.
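The relationship between the reported ORF length and protein length can be checked with a short sketch. It assumes, as is conventional but not stated in the abstract, that the quoted ORF length includes the terminal stop codon; the function name is illustrative only.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp nucleotides."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # The stop codon terminates translation and encodes no amino acid.
    return codons - 1 if includes_stop else codons


catc_orf_bp = 1284  # ORF length reported for catC
print(protein_length_from_orf(catc_orf_bp))  # 427, matching the reported protein length
```

Under this assumption, 1284 bp corresponds to 428 codons, of which 427 encode amino acids.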
Affiliation(s)
- Wei-Wei Li
- School of Life Science, East China Normal University, North Zhongshan Road, 3663 Shanghai, China
|
117
|
Nlend Nlend R, Meyer K, Schümperli D. Repair of pre-mRNA splicing: prospects for a therapy for spinal muscular atrophy. RNA Biol 2010; 7:430-40. [PMID: 20523126 DOI: 10.4161/rna.7.4.12206] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 01/16/2023] Open
Abstract
Recent analyses of complete genomes have revealed that alternative splicing became more prevalent and important during eukaryotic evolution. Alternative splicing augments the protein repertoire--particularly that of the human genome--and plays an important role in the development and function of differentiated cell types. However, splicing is also extremely vulnerable, and defects in the proper recognition of splicing signals can give rise to a variety of diseases. In this review, we discuss splicing correction therapies, by using the inherited disease Spinal Muscular Atrophy (SMA) as an example. This lethal early childhood disorder is caused by deletions or other severe mutations of SMN1, a gene coding for the essential survival of motoneurons protein. A second gene copy present in humans and few non-human primates, SMN2, can only partly compensate for the defect because of a single nucleotide change in exon 7 that causes this exon to be skipped in the majority of mRNAs. Thus SMN2 is a prime therapeutic target for SMA. In recent years, several strategies based on small molecule drugs, antisense oligonucleotides or in vivo expressed RNAs have been developed that allow a correction of SMN2 splicing. For some of these, a therapeutic benefit has been demonstrated in mouse models for SMA. This means that clinical trials of such splicing therapies for SMA may become possible in the near future.
|
118
|
Fahey ME, Mills W, Higgins DG, Moore T. Maternally and paternally silenced imprinted genes differ in their intron content. Comp Funct Genomics 2010; 5:572-83. [PMID: 18629181 PMCID: PMC2447473 DOI: 10.1002/cfg.437] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/02/2004] [Revised: 11/01/2004] [Accepted: 11/12/2004] [Indexed: 12/31/2022] Open
Abstract
Imprinted genes exhibit silencing of one of the parental alleles during embryonic development. In a previous study imprinted genes were found to have reduced intron content relative to a non-imprinted control set (Hurst et al., 1996). However, due to the small sample size, it was not possible to analyse the source of this effect. Here, we re-investigate this observation using larger datasets of imprinted and control (non-imprinted) genes that allow us to consider mouse and human, and maternally and paternally silenced, imprinted genes separately. We find that, in the human and mouse, there is reduced intron content in the maternally silenced imprinted genes relative to a non-imprinted control set. Among imprinted genes, a strong bias is also observed in the distribution of intronless genes, which are found exclusively in the maternally silenced dataset. The paternally silenced dataset in the human is not different to the control set; however, the mouse paternally silenced dataset has more introns than the control group. A direct comparison of mouse maternally and paternally silenced imprinted gene datasets shows that they differ significantly with respect to a variety of intron-related parameters. We discuss a variety of possible explanations for our observations.
Affiliation(s)
- Marie E Fahey
- Department of Biochemistry, Biosciences Institute, University College Cork, College Road, Cork, Ireland
|
119
|
Guo B, Zou M, Gan X, He S. Genome size evolution in pufferfish: an insight from BAC clone-based Diodon holocanthus genome sequencing. BMC Genomics 2010; 11:396. [PMID: 20569428 PMCID: PMC2996927 DOI: 10.1186/1471-2164-11-396] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 03/28/2010] [Accepted: 06/23/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Variations in genome size within and between species have been observed since the 1950s in diverse taxonomic groups. Serving as model organisms, smooth pufferfish possess the smallest vertebrate genomes. Interestingly, spiny pufferfish from the sister family have genomes twice as large as those of smooth pufferfish. Therefore, comparative genomic analysis between smooth pufferfish and spiny pufferfish is useful for our understanding of genome size evolution in pufferfish. RESULTS: Ten BAC clones of a spiny pufferfish, Diodon holocanthus, were randomly selected and shotgun sequenced. In total, 776 kb of non-redundant sequences without gaps, representing 0.1% of the D. holocanthus genome, were identified, and 77 distinct genes were predicted. In the sequenced D. holocanthus genome, 364 kb is homologous with 265 kb of the Takifugu rubripes genome, and 223 kb is homologous with 148 kb of the Tetraodon nigroviridis genome. Repetitive DNA accounts for 8% of the sequenced D. holocanthus genome, which is higher than in the T. rubripes genome (6.89%) and the Te. nigroviridis genome (4.66%). Of the repetitive DNA, 76% is retroelements, which account for 6% of the sequenced D. holocanthus genome and belong to known families of transposable elements. More than half of the retroelements were distributed within genes. In the non-homologous regions, the repeat element proportion in the D. holocanthus genome increased to 10.6% compared with T. rubripes and to 9.19% compared with Te. nigroviridis. A comparison of 10 well-defined orthologous genes showed that the average intron size in the D. holocanthus genome (566 bp) is significantly longer than that in the smooth pufferfish genome (435 bp). CONCLUSION: Compared with the smooth pufferfish, D. holocanthus has a genome with low gene density that is rich in repeat elements. Genome size variation between D. holocanthus and the smooth pufferfish is exhibited as length variation between homologous regions and different accumulation of non-homologous sequences. The length difference of introns is consistent with the genome size variation between D. holocanthus and the smooth pufferfish. Different transposable element accumulation is responsible for genome size variation between D. holocanthus and the smooth pufferfish.
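The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only uses numbers stated above; the implied genome size is a back-of-the-envelope extrapolation from "776 kb representing 0.1%", not a value reported in the abstract.

```python
# Values taken from the abstract.
sequenced_kb = 776.0          # non-redundant sequence obtained
fraction_of_genome = 0.001    # 0.1% of the D. holocanthus genome
repeat_fraction = 0.08        # repetitive DNA share of the sequenced region
retro_share_of_repeats = 0.76
intron_spiny_bp = 566.0
intron_smooth_bp = 435.0

# Retroelements as a share of the sequenced region (the abstract rounds this to 6%).
retro_fraction = repeat_fraction * retro_share_of_repeats

# Genome size implied by the sampling fraction (an extrapolation, in kb).
implied_genome_kb = sequenced_kb / fraction_of_genome

# Ratio of average intron sizes, spiny versus smooth pufferfish.
intron_ratio = intron_spiny_bp / intron_smooth_bp

print(f"retroelements: {retro_fraction:.2%} of sequenced region")
print(f"implied genome size: {implied_genome_kb / 1000:.0f} Mb")
print(f"intron size ratio: {intron_ratio:.2f}")
```

The intron ratio of about 1.3 is smaller than the roughly twofold genome size difference mentioned in the background, which fits the abstract's point that repeat accumulation also contributes.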
Affiliation(s)
- Baocheng Guo
- Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
|
120
|
Schneider M, Will CL, Anokhina M, Tazi J, Urlaub H, Lührmann R. Exon definition complexes contain the tri-snRNP and can be directly converted into B-like precatalytic splicing complexes. Mol Cell 2010; 38:223-35. [PMID: 20417601 DOI: 10.1016/j.molcel.2010.02.027] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.6] [Received: 05/18/2009] [Revised: 11/18/2009] [Accepted: 02/09/2010] [Indexed: 10/19/2022]
Abstract
The first step in splicing of pre-mRNAs with long introns is exon definition, where U1 and U2 snRNPs bind at opposite ends of an exon. After exon definition, these snRNPs must form a complex across the upstream intron to allow splicing catalysis. Exon definition and conversion of cross-exon to cross-intron spliceosomal complexes are poorly understood. Here we demonstrate that, in addition to U1 and U2 snRNPs, cross-exon complexes contain U4, U5, and U6 (which form the tri-snRNP). Tri-snRNP docking involves the formation of U2/U6 helix II. This interaction is stabilized by a 5' splice site (SS)-containing oligonucleotide, which can bind the tri-snRNP and convert the cross-exon complex into a cross-intron, B-like complex. Our data suggest that the switch from cross-exon to cross-intron complexes can occur directly when an exon-bound tri-snRNP interacts with an upstream 5'SS, without prior formation of a cross-intron A complex, revealing an alternative spliceosome assembly pathway.
Affiliation(s)
- Marc Schneider
- Department of Cellular Biochemistry, MPI of Biophysical Chemistry, Göttingen, Germany
|
121
|
Parsch J, Novozhilov S, Saminadin-Peter SS, Wong KM, Andolfatto P. On the utility of short intron sequences as a reference for the detection of positive and negative selection in Drosophila. Mol Biol Evol 2010; 27:1226-34. [PMID: 20150340 DOI: 10.1093/molbev/msq046] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.9] [Indexed: 11/14/2022] Open
Abstract
The detection of selection, both positive and negative, acting on a DNA sequence or class of nucleotide sites requires comparison with a reference sequence that is unaffected by selection. In Drosophila, recent findings of widespread selective constraint, as well as adaptive evolution, in both coding and noncoding regions highlight the difficulties in choosing such a reference sequence. Here, we investigate the utility of short intron sequences as a reference for the detection of selection. For a set of 119 Drosophila melanogaster genes containing 195 short introns (≤120 bp), we analyzed polymorphism and divergence at 1) 4-fold synonymous sites, 2) all sites of introns ≤120 bp, 3) all sites of introns ≤65 bp, 4) bases 8-30 of introns ≤120 bp, and 5) bases 8-30 of introns ≤65 bp. The last class of sites shows the highest levels of both interspecific divergence and intraspecific polymorphism, suggesting that these sites are under the least selective constraint. Bases 8-30 of introns ≤65 bp also have the lowest ratio of divergence to polymorphism, which may indicate that a small proportion of substitutions in the other classes of sites are the result of adaptive evolution. Although there is little signal of selection on the primary sequence of short introns, patterns of insertion-deletion polymorphism and divergence suggest that both positive and negative selection act to maintain an optimal intron length.
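The site class that the abstract singles out (bases 8-30 of introns no longer than 65 bp) can be extracted with a small helper. This is a sketch under stated assumptions, not the authors' pipeline: positions are taken to be 1-based and counted from the first intronic base, and the function name and toy sequences are hypothetical.

```python
from typing import Optional


def reference_sites(intron: str, max_len: int = 65,
                    first: int = 8, last: int = 30) -> Optional[str]:
    """Return bases first..last (1-based, inclusive) of a short intron.

    Returns None when the intron is longer than max_len or too short to
    contain position `last`.
    """
    if len(intron) > max_len or len(intron) < last:
        return None
    # Convert the 1-based inclusive range to a 0-based half-open slice.
    return intron[first - 1:last]


# Toy sequences for illustration only.
short_intron = "GTAAGT" + "ACGT" * 12 + "TTTCAG"   # 60 bp
long_intron = "GT" + "A" * 126 + "AG"              # 130 bp

sites = reference_sites(short_intron)
print(len(sites) if sites else None)   # 23 bases: positions 8 through 30
print(reference_sites(long_intron))    # None, because the intron exceeds 65 bp
```

Passing `max_len=120` would give the fourth site class listed in the abstract instead.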
Affiliation(s)
- John Parsch
- Department of Biology II, University of Munich, Planegg-Martinsried, Germany.
|
122
|
Molecular cloning, characterization and expression of PmRsr1, a Ras-related gene from yeast form of Penicillium marneffei. Mol Biol Rep 2010; 37:3533-40. [DOI: 10.1007/s11033-009-9947-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 09/15/2009] [Accepted: 12/29/2009] [Indexed: 10/20/2022]
|
123
|
Ogino K, Tsuneki K, Furuya H. Unique genome of dicyemid mesozoan: Highly shortened spliceosomal introns in conservative exon/intron structure. Gene 2010; 449:70-6. [DOI: 10.1016/j.gene.2009.09.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 12/17/2008] [Revised: 08/31/2009] [Accepted: 09/01/2009] [Indexed: 01/08/2023]
|
124
|
Chacko E, Ranganathan S. Genome-wide analysis of alternative splicing in cow: implications in bovine as a model for human diseases. BMC Genomics 2009; 10 Suppl 3:S11. [PMID: 19958474 PMCID: PMC2788363 DOI: 10.1186/1471-2164-10-s3-s11] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 01/31/2023] Open
Abstract
BACKGROUND: Alternative splicing (AS) is a primary mechanism of functional regulation in the human genome, with 60% to 80% of human genes being alternatively spliced. As part of the bovine genome annotation team, we have analysed 4,567 bovine AS genes, compared to 16,715 human and 16,491 mouse AS genes, along with Gene Ontology (GO) analysis. We also analysed the two most important events, cassette exons and intron retention, in 94 human disease genes and mapped them to the bovine orthologous genes. Of the 94 human inherited disease genes, a protein domain analysis was carried out for the transcript sequences of 12 human genes that have orthologous genes and have been characterised in cow. RESULTS: Of the 21,755 bovine genes, 4,567 genes (21%) are alternatively spliced, compared to 16,715 (68%) in human and 16,491 (57%) in mouse. Gene-level analysis of the orthologous set suggested that bovine genes show fewer AS events compared to human and mouse genes. A detailed examination of cassette exons across human and cow for 94 human disease genes suggested that a majority of cassette exons in human were present and constitutive in bovine, as opposed to intron retention, which exhibited 50% of the exons as present and 50% as absent in cow. We observed that AS plays a major role in disease implications in human through manipulations of essential/functional protein domains. It was also evident that the majority of these 12 genes had conservation of all essential domains in their bovine orthologous counterparts for these human diseases. CONCLUSION: While alternative splicing has the potential to create many mRNA isoforms from a single gene, in cow the majority of genes generate two to three isoforms, compared to six in human and four in mouse. Our analyses demonstrated that a smaller number of bovine genes show greater transcript diversity. GO definitions for bovine AS genes provided 38% more functional information than currently available in the sequence database. Our protein domain analysis helped us verify the suitability of using bovine as a model for human diseases and also recognize the contribution of AS towards the disease phenotypes.
Affiliation(s)
- Elsa Chacko
- Department of Chemistry and Biomolecular Sciences and ARC Centre of Excellence in Bioinformatics, Macquarie University, Sydney, NSW 2109, Australia.
|
125
|
Guerra Cardoso H, Doroteia Campos M, Rita Costa A, Catarina Campos M, Nothnagel T, Arnholdt-Schmitt B. Carrot alternative oxidase gene AOX2a demonstrates allelic and genotypic polymorphisms in intron 3. Physiol Plant 2009; 137:592-608. [PMID: 19941625 DOI: 10.1111/j.1399-3054.2009.01299.x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Indexed: 05/21/2023]
Abstract
Single nucleotide polymorphisms (SNPs) and insertion-deletions (InDels) are becoming important genetic markers for major crop species. In this study, we focus on variations at the genomic level of the Daucus carota L. AOX2a gene. The use of gene-specific primers designed in exon regions at the boundaries of introns permitted recognition of intron length polymorphism (ILP) in intron 3 of AOX2a by simple polymerase chain reaction (PCR) assays. The length of intron 3 can vary in individual carrot plants. Thus, allelic variation can be used as a tool to discriminate between single plant genotypes. Using this approach, individual plants from cv. Rotin and from diverse breeding lines and cultivars were identified that showed genetic variability by AOX2a ILPs. Repetitive patterns of intron length variation have been observed, which allows grouping of genotypes. Polymorphic and identical PCR fragments revealed underlying high levels of sequence polymorphism. Variability was due to InDel events and intron single nucleotide polymorphisms (ISNPs), with a repetitive deletion in intron 3 affecting a putative pre-miRNA site. The results suggest that high AOX2a gene diversity in D. carota can be explored for the development of functional markers related to agronomic traits.
Affiliation(s)
- Hélia Guerra Cardoso
- EU Marie Curie Chair, ICAAM, University of Evora, Apartado 94, 7002-554 Evora, Portugal
|
126
|
Ferreira AO, Cardoso HG, Macedo ES, Breviario D, Arnholdt-Schmitt B. Intron polymorphism pattern in AOX1b of wild St John's wort (Hypericum perforatum) allows discrimination between individual plants. Physiol Plant 2009; 137:520-31. [PMID: 19843238 DOI: 10.1111/j.1399-3054.2009.01291.x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 05/21/2023]
Abstract
The present paper deals with the analysis of natural polymorphism in a selected alternative oxidase (AOX) gene of the medicinal plant St John's wort. Four partial AOX gene sequences were isolated from the genomic DNA of a wild plant of Hypericum perforatum L. Three genes belong to the subfamily AOX1 (HpAOX1a, b and c) and one to the subfamily AOX2 (HpAOX2). The partial sequence of HpAOX1b showed polymerase chain reaction (PCR) fragment size variation as a result of variable lengths in two introns. Exon Primed Intron Crossing (EPIC)-PCR displayed the same two-band pattern in six plants from a collection. Both fragments showed identical sequences for all exons. However, each of the two introns showed an insertion/deletion (InDel) in identical positions for all plants that accounted for the difference in the two fragment sizes. The InDel in intron 1 influenced the predictability of a pre-microRNA site. The almost identical PCR fragment pattern was characterized by a high variability in the sequences. The InDels in both introns were linked to repetitive intron single nucleotide polymorphisms (ISNPs). The polymorphic pattern obtained by InDels and ISNPs from both fragments together was appropriate to discriminate between all individual plants. We suggest that AOX sequence polymorphism in H. perforatum can be used for studies on gene diversity and biodiversity. Further, we conclude that AOX sequence polymorphism of individual plants should be considered in biological studies on AOX activity to exclude the influence of genetic diversity. The identified polymorphic fragments are available to be explored in future experiments as a potential source for functional marker development related to the characterization of origins/accessions and agronomic traits such as plant growth, development and yield stability.
|
127
|
Zhu Z, Zhang Y, Long M. Extensive structural renovation of retrogenes in the evolution of the Populus genome. Plant Physiol 2009; 151:1943-51. [PMID: 19789289 PMCID: PMC2785971 DOI: 10.1104/pp.109.142984] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Indexed: 05/03/2023]
Abstract
Retroposition, as an important copy mechanism for generating new genes, was believed to play a negligible role in plants. As a representative dicot, the genomic sequences of Populus (poplar; Populus trichocarpa) provide an opportunity to investigate this issue. We identified 106 retrogenes and found the majority (89%) of them are associated with functional signatures in sequence evolution, transcription, and (or) translation. Remarkably, examination of gene structures revealed extensive structural renovation of these retrogenes: we identified 18 (17%) of them undergoing either chimerization to form new chimerical genes and (or) intronization (transformation into intron sequences of previously exonic sequences) to generate new intron-containing genes. Such a change might occur at a high speed, considering eight out of 18 such cases occurred recently after divergence between Arabidopsis (Arabidopsis thaliana) and Populus. This pattern also exists in Arabidopsis, with 15 intronized retrogenes occurring after the divergence between Arabidopsis and papaya (Carica papaya). Thus, the frequency of intronization in dicots revealed its importance as a mechanism in the evolution of exon-intron structure. In addition, we also examined the potential impact of the Populus nascent sex determination system on the chromosomal distribution of retrogenes and did not observe any significant effects of the extremely young sex chromosomes.
|
128
|
Santos Macedo E, Cardoso HG, Hernández A, Peixe AA, Polidoros A, Ferreira A, Cordeiro A, Arnholdt-Schmitt B. Physiologic responses and gene diversity indicate olive alternative oxidase as a potential source for markers involved in efficient adventitious root induction. Physiol Plant 2009; 137:532-52. [PMID: 19941624 DOI: 10.1111/j.1399-3054.2009.01302.x] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 05/05/2023]
Abstract
Olive (Olea europaea L.) trees are mainly propagated by adventitious rooting of semi-hardwood cuttings. However, efficient commercial propagation of valuable olive tree cultivars or landraces by semi-hardwood cuttings can often be restricted by a low rooting capacity. We hypothesize that root induction is a plant cell reaction linked to oxidative stress and that activity of stress-induced alternative oxidase (AOX) is importantly involved in adventitious rooting. To identify AOX as a source for potential functional marker sequences that may assist tree breeding, genetic variability that can affect gene regulation has to be demonstrated. The paper presents an applied, multidisciplinary research approach demonstrating first indications of an important relationship between AOX activity and differential adventitious rooting in semi-hardwood cuttings. Root induction in the easy-to-root Portuguese cultivar 'Cobrançosa' could be significantly reduced by treatment with salicyl-hydroxamic acid, an inhibitor of AOX activity. In contrast, treatment with H2O2 or pyruvate, both known to induce AOX activity, increased the degree of rooting. Recently, identification of several O. europaea (Oe) AOX gene sequences has been reported by our group. Here we present for the first time partial sequences of OeAOX2. To search for polymorphisms inside OeAOX genes, partial OeAOX2 sequences from the cultivars 'Galega vulgar', 'Cobrançosa' and 'Picual' were cloned from genomic DNA and cDNA, including exon, intron and 3'-untranslated region (3'-UTR) sequences. The data revealed polymorphic sites in several regions of OeAOX2. The 3'-UTR was the most important source of polymorphisms, showing 5.7% variability. Variability was 3.4% in the exon region and 2% in the intron. Further, analysis performed on cDNA from microshoots of 'Galega vulgar' revealed transcript length variation for the 3'-UTR of OeAOX2 ranging between 76 and 301 bp. The identified polymorphisms and 3'-UTR length variation can be explored in future studies for effects on gene regulation and a potential linkage to olive rooting phenotypes in view of marker-assisted plant selection.
|
129
|
Evolutionary genetic insights into Plasmodium falciparum functional genes. Parasitol Res 2009; 106:349-55. [PMID: 19902252 DOI: 10.1007/s00436-009-1668-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 09/16/2009] [Accepted: 10/20/2009] [Indexed: 10/20/2022]
Abstract
The complex and rapidly evolving behavior of the human malaria parasite Plasmodium falciparum has always been mysterious to evolutionary biologists, as the parasite is the most virulent and is now becoming the most prevalent malaria parasite species across the globe. With the availability of the complete genome sequence of P. falciparum, a better understanding of genome design and evolution has become possible. We herein utilized the available information on all known functional genes from the whole genome of P. falciparum and investigated the differential mode of gene evolution. The study comparing P. falciparum functional genes with Plasmodium vivax revealed about 82% of genes to be conserved in the latter species and the remaining 18% to be unique to P. falciparum. The genetic architectural pattern of functional genes shows an absence of introns in about half of the conserved genes, whereas almost all unique genes have introns. Similarly, the distributions of intron number and length were also observed to differ between conserved and unique genes of P. falciparum. Statistically significant positive correlations between total intron length and gene length were detected in 11 chromosomes for unique genes, but in only three chromosomes for conserved genes. A preference for intron presence in some P. falciparum genes was also detected, which suggests functional relevance of introns. The study provides, for the first time, a detailed evolutionary analysis of the functional genes of a devastating malaria parasite. The marked differences in organization of introns between the unique and conserved genes in P. falciparum, and the contribution of introns to genome complexity, are some of the hallmarks of the study.
|
130
|
Belinky F, Cohen O, Huchon D. Large-scale parsimony analysis of metazoan indels in protein-coding genes. Mol Biol Evol 2009; 27:441-51. [PMID: 19864469 DOI: 10.1093/molbev/msp263] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Indexed: 11/14/2022] Open
Abstract
Insertions and deletions (indels) are considered to be rare evolutionary events, the analysis of which may resolve controversial phylogenetic relationships. Indeed, indel characters are often assumed to be less homoplastic than amino acid and nucleotide substitutions and, consequently, more reliable markers for phylogenetic reconstruction. In this study, we analyzed indels from over 1,000 metazoan orthologous genes. We studied the impact of different species sampling, ortholog data sets, lengths of included indels, and indel-coding methods on the resulting metazoan tree. Our results show that, similar to sequence substitutions, indels are homoplastic characters, and their analysis is sensitive to the long-branch attraction artifact. Furthermore, improving the taxon sampling and choosing a closely related outgroup greatly impact the phylogenetic inference. Our indel-based inferences support the Ecdysozoa hypothesis over the Coelomata hypothesis and suggest that sponges are a sister clade to other animals.
|
131
|
[Amplified consensus genetic markers and its application in plants]. Yi Chuan 2009; 31:913-20. [PMID: 19819844 DOI: 10.3724/sp.j.1005.2009.00913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
Amplified consensus genetic markers (ACGM) is a novel PCR-based DNA molecular marker technique, which is based on the conservation of coding sequences and the potential polymorphism within non-coding sequences in homologous genes among closely related species. Along with the rapid development of comparative genomics and bioinformatics, the ACGM technique has already become a powerful tool for comparing homologous genes, analyzing phylogenetic relationships among species, and mapping genes of interest. Here, the ACGM technique, especially its application in Brassica and Gramineae, is introduced in detail. The prospects of ACGM are also discussed.
|
132
|
Skouri-Gargouri H, Ben Ali M, Gargouri A. Molecular cloning, structural analysis and modelling of the AcAFP antifungal peptide from Aspergillus clavatus. Peptides 2009; 30:1798-804. [PMID: 19591888 DOI: 10.1016/j.peptides.2009.06.034] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 03/18/2009] [Revised: 06/25/2009] [Accepted: 06/26/2009] [Indexed: 11/17/2022]
Abstract
An abundantly secreted thermostable peptide (designated AcAFP) with a molecular mass of 5777 Da was isolated and purified in previous work from a local strain of A. clavatus (VR1). Based on the N-terminal amino acid (aa) sequence of the AcAFP peptide, an oligonucleotide probe was derived that allowed amplification of the encoding cDNA by RT-PCR. This cDNA fragment encodes a pre-pro-protein of 94 aa, which appears to be processed to a mature, cysteine-rich product of 51 aa. The deduced aa sequence of the pre-pro-sequence shows high similarity to ascomycete antifungal peptides. Comparison of the nucleotide sequence of the genomic fragment with the cDNA clone revealed an open reading frame of 282 bp interrupted by two small introns of 89 and 56 bp with conserved splice sites. Three-dimensional (3D) structure modeling of AcAFP shows a compact structure consisting of a five-stranded anti-parallel beta barrel stabilized by four internal disulfide bridges. The folding pattern also revealed a cationic site and a spatially adjacent hydrophobic stretch. The antifungal mechanism was investigated by transmission and confocal microscopy. AcAFP causes dose-dependent cell wall alterations in the phytopathogenic fungus Fusarium oxysporum.
Affiliation(s)
- Houda Skouri-Gargouri
- Laboratoire de Génétique Moléculaire des Eucaryotes, Centre de Biotechnologie de Sfax, Route Sidi Mansour, BP K 3038-Sfax, Tunisia
|
133
|
Andersson R, Enroth S, Rada-Iglesias A, Wadelius C, Komorowski J. Nucleosomes are well positioned in exons and carry characteristic histone modifications. Genome Res 2009; 19:1732-41. [PMID: 19687145 DOI: 10.1101/gr.092353.109] [Citation(s) in RCA: 257] [Impact Index Per Article: 17.1] [Indexed: 11/24/2022]
Abstract
The genomes of higher organisms are packaged in nucleosomes with functional histone modifications. Until now, genome-wide nucleosome and histone modification studies have focused on transcription start sites (TSSs) where nucleosomes in RNA polymerase II (RNAPII) occupied genes are well positioned and have histone modifications that are characteristic of expression status. Using public data, we here show that there is a higher nucleosome-positioning signal in internal human exons and that this positioning is independent of expression. We observed a similarly strong nucleosome-positioning signal in internal exons of Caenorhabditis elegans. Among the 38 histone modifications analyzed in man, H3K36me3, H3K79me1, H2BK5me1, H3K27me1, H3K27me2, and H3K27me3 had evidently higher signals in internal exons than in the following introns and were clearly related to exon expression. These observations are suggestive of roles in splicing. Thus, exons are not only characterized by their coding capacity, but also by their nucleosome organization, which seems evolutionarily conserved since it is present in both primates and nematodes.
Affiliation(s)
- Robin Andersson
- The Linnaeus Centre for Bioinformatics, Uppsala University, Sweden
|
134
|
Contrasting evolutionary dynamics between angiosperm and mammalian genomes. Trends Ecol Evol 2009; 24:572-82. [PMID: 19665255 DOI: 10.1016/j.tree.2009.04.010] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.0] [Received: 11/18/2008] [Revised: 04/06/2009] [Accepted: 04/22/2009] [Indexed: 12/23/2022]
Abstract
Continuing advances in genomics are revealing substantial differences between genomes of major eukaryotic lineages. Because most data (in terms of depth and phylogenetic breadth) are available for angiosperms and mammals, we explore differences between these groups and show that angiosperms have less highly compartmentalized and more diverse genomes than mammals. In considering the causes of these differences, four mechanisms are highlighted: polyploidy, recombination, retrotransposition and genome silencing, which have different modes and time scales of activity. Angiosperm genomes are evolutionarily more dynamic and labile, whereas mammalian genomes are more stable at both the sequence and chromosome level. We suggest that fundamentally different life strategies and development feedback on the genome exist, influencing dynamics and evolutionary trajectories at all levels from the gene to the genome.
|
135
|
Abstract
Evolutionary reconstructions using maximum likelihood methods point to unexpectedly high densities of introns in protein-coding genes of ancestral eukaryotic forms including the last common ancestor of all extant eukaryotes. Combined with the evidence of the origin of spliceosomal introns from invading Group II self-splicing introns, these results suggest that early ancestral eukaryotic genomes consisted of up to 80% sequences derived from Group II introns, a much greater contribution of introns than that seen in any extant genome. An organism with such an unusual genome architecture could survive only under conditions of a severe population bottleneck.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
|
136
|
Strube C, Buschbaum S, von Samson-Himmelstjerna G, Schnieder T. Stage-dependent transcriptional changes and characterization of paramyosin of the bovine lungworm Dictyocaulus viviparus. Parasitol Int 2009; 58:334-40. [PMID: 19604498 DOI: 10.1016/j.parint.2009.07.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 05/11/2009] [Revised: 07/02/2009] [Accepted: 07/03/2009] [Indexed: 01/15/2023]
Abstract
The bovine lungworm Dictyocaulus viviparus is of major economic importance in cattle farming in the temperate zones. The invertebrate protein paramyosin is one of the main components of muscle thick filaments but can also exhibit immunomodulatory functions. It represents a promising vaccine candidate in parasitic helminths. In this study, D. viviparus paramyosin (DvPmy) was characterized at the transcriptional as well as the genomic level. The identified genomic sequence comprises 19 introns compared to only 10 introns in the Caenorhabditis elegans orthologue. Quantitative real-time PCR analysis revealed paramyosin transcription throughout the parasite's whole life cycle, with the highest transcription rate in the agile, motile first-stage larvae and the lowest in motionless, hypobiosis-induced third-stage larvae. Recombinantly expressed DvPmy was found to bind collagen and IgG. The present study is thus the first to show that nematode paramyosin has the capability for immunomodulation and may therefore be involved in host immune defence.
Affiliation(s)
- C Strube
- Institute for Parasitology, University of Veterinary Medicine Hannover, Hannover, Germany.
|
137
|
Chacko E, Ranganathan S. Comprehensive splicing graph analysis of alternative splicing patterns in chicken, compared to human and mouse. BMC Genomics 2009; 10 Suppl 1:S5. [PMID: 19594882 PMCID: PMC2709266 DOI: 10.1186/1471-2164-10-s1-s5] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022] Open
Abstract
Background Alternative transcript diversity manifests itself as a prime cause of complexity in higher eukaryotes. Recently, transcript diversity studies have suggested that 60–80% of human genes are alternatively spliced. We have used a splicing pattern approach for the bioinformatics analysis of Alternative Splicing (AS) in chicken, human and mouse. Exons involved in splicing are subdivided into distinct and variant exons, based on the prevalence of the exons across the transcripts. Four possible permutations of these two different groups of exons were categorised as class I (distinct-distinct), class II (distinct-variant), class III (variant-distinct) and class IV (variant-variant). This classification quantifies the variation in transcript diversity in the three species. Results In all, 3901 chicken AS genes have been compared with 16,715 human and 16,491 mouse AS genes, with 23% of chicken genes being alternatively spliced, compared to 68% in humans and 57% in mice. To minimize any gene structure bias in the input data, comparative genome analysis has been carried out on the orthologous subset of AS genes for the three species. Gene-level analysis suggested that chicken genes show fewer AS events compared to human and mouse. An event-level analysis showed that the percentage of AS events in chicken is similar to that of human, which implies that a smaller number of chicken genes show greater transcript diversity. Overall, chicken genes were found to have fewer transcripts per gene and shorter introns than human and mouse genes. Conclusion In chicken, the majority of genes generate only two or three isoforms, compared to almost eight in human and six in mouse. We observed that intron definition is expressed strongly when compared to exon definition for the chicken genome, based on 3% intron retention in chicken, compared to 2% in human and mouse. Splicing patterns with variant exons account for 33% of AS chicken orthologous genes compared to 24% in human and 27% in mouse, providing a novel measure to describe the species-wise complexity due to alternative transcript diversity.
Affiliation(s)
- Elsa Chacko
- Department of Chemistry and Biomolecular Sciences, Macquarie University, NSW, Australia.
|
138
|
Plasmodium falciparum and Plasmodium vivax: so similar, yet very different. Parasitol Res 2009; 105:1169-71. [DOI: 10.1007/s00436-009-1521-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 05/14/2009] [Accepted: 06/05/2009] [Indexed: 11/26/2022]
|
139
|
Hiller M, Findeiss S, Lein S, Marz M, Nickel C, Rose D, Schulz C, Backofen R, Prohaska SJ, Reuter G, Stadler PF. Conserved introns reveal novel transcripts in Drosophila melanogaster. Genome Res 2009; 19:1289-300. [PMID: 19458021 DOI: 10.1101/gr.090050.108] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Indexed: 01/17/2023]
Abstract
Noncoding RNAs that are, like mRNAs, spliced, capped, and polyadenylated have important functions in cellular processes. The inventory of these mRNA-like noncoding RNAs (mlncRNAs), however, is incomplete even in well-studied organisms, and so far, no computational methods exist to predict such RNAs from genomic sequences only. The subclass of these transcripts that is evolutionarily conserved usually has conserved intron positions. We demonstrate here that a genome-wide comparative genomics approach searching for short conserved introns is capable of identifying conserved transcripts with a high specificity. Our approach requires neither an open reading frame nor substantial sequence or secondary structure conservation in the surrounding exons. Thus it identifies spliced transcripts in an unbiased way. After applying our approach to insect genomes, we predict 369 introns outside annotated coding transcripts, of which 131 are confirmed by expressed sequence tags (ESTs) and/or noncoding FlyBase transcripts. Of the remaining 238 novel introns, about half are associated with protein-coding genes, either extending coding or untranslated regions or likely belonging to unannotated coding genes. The remaining 129 introns belong to novel mlncRNAs that are largely unstructured. Using RT-PCR, we verified seven of 12 tested introns in novel mlncRNAs and 11 of 17 introns in novel coding genes. The expression level of all verified mlncRNA transcripts is low but varies during development, which suggests regulation. As conserved introns indicate both purifying selection on the exon-intron structure and conserved expression of the transcript in related species, the novel mlncRNAs are good candidates for functional transcripts.
Affiliation(s)
- Michael Hiller
- Bioinformatics Group, Albert-Ludwigs-University Freiburg, 79110 Freiburg, Germany.
|
140
|
Disassembly of Exon Junction Complexes by PYM. Cell 2009; 137:536-48. [DOI: 10.1016/j.cell.2009.02.042] [Citation(s) in RCA: 142] [Impact Index Per Article: 9.5] [Received: 06/27/2008] [Revised: 12/31/2008] [Accepted: 02/18/2009] [Indexed: 11/22/2022]
|
141
|
Cutter AD, Dey A, Murray RL. Evolution of the Caenorhabditis elegans genome. Mol Biol Evol 2009; 26:1199-234. [PMID: 19289596 DOI: 10.1093/molbev/msp048] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.0] [Indexed: 12/17/2022] Open
Abstract
A fundamental problem in genome biology is to elucidate the evolutionary forces responsible for generating nonrandom patterns of genome organization. As the first metazoan to benefit from full-genome sequencing, Caenorhabditis elegans has been at the forefront of research in this area. Studies of genomic patterns, and their evolutionary underpinnings, continue to be augmented by the recent push to obtain additional full-genome sequences of related Caenorhabditis taxa. In the near future, we expect to see major advances with the onset of whole-genome resequencing of multiple wild individuals of the same species. In this review, we synthesize many of the important insights to date in our understanding of genome organization and function that derive from the evolutionary principles made explicit by theoretical population genetics and molecular evolution and highlight fertile areas for future research on unanswered questions in C. elegans genome evolution. We call attention to the need for C. elegans researchers to generate and critically assess nonadaptive hypotheses for genomic and developmental patterns, in addition to adaptive scenarios. We also emphasize the potential importance of evolution in the gonochoristic (female and male) ancestors of the androdioecious (hermaphrodite and male) C. elegans as the source for many of its genomic and developmental patterns.
Affiliation(s)
- Asher D Cutter
- Department of Ecology & Evolutionary Biology and the Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Ontario, Canada.
|
142
|
Ivashchenko AT, Tauasarova MI, Atambayeva SA. Exon-intron structure of genes in complete fungal genomes. Mol Biol 2009. [DOI: 10.1134/s002689330901004x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022]
|
143
|
Development and appraisement of functional molecular marker: intron sequence amplified polymorphism (ISAP). YI CHUAN = HEREDITAS 2009; 30:1207-16. [DOI: 10.3724/sp.j.1005.2008.01207] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022]
|
144
|
Zhu L, Zhang Y, Zhang W, Yang S, Chen JQ, Tian D. Patterns of exon-intron architecture variation of genes in eukaryotic genomes. BMC Genomics 2009; 10:47. [PMID: 19166620 PMCID: PMC2636830 DOI: 10.1186/1471-2164-10-47] [Citation(s) in RCA: 103] [Impact Index Per Article: 6.9] [Received: 10/03/2008] [Accepted: 01/24/2009] [Indexed: 11/16/2022] Open
Abstract
Background The origin and importance of exon-intron architecture remain one of the unresolved mysteries of gene evolution. Several studies have investigated the variations of intron length, GC content, ordinal position in a gene and divergence. However, few studies have examined the structural variation of exons and introns together. Results We investigated the length, GC content, ordinal position and divergence of both exons and introns in 13 eukaryotic genomes, representing plants and animals. Our analyses revealed that three basic patterns of exon-intron variation were present in nearly all analyzed genomes (P < 0.001 in most cases): an ordinal reduction of length and divergence in both exons and introns; a co-variation between an exon and its flanking introns in their length, GC content and divergence; and a decrease of average exon (or intron) length, GC content and divergence as the total number of exons in a gene increased. In addition, we observed that shorter introns had either low or high GC content, whereas the GC content of long introns was intermediate. Conclusion Although the factors contributing to these patterns have not been identified, our results provide three important clues: common factor(s) exist and may shape both exons and introns; the ordinal reduction patterns may reflect a time-ordered evolution; and the larger first and last exons may be required for splicing. These clues provide a framework for elucidating mechanisms involved in the organization of eukaryotic genomes and particularly in building exon-intron structures.
Affiliation(s)
- Liucun Zhu
- State Key Laboratory of Pharmaceutical Biotechnology, Department of Biology, Nanjing University, Nanjing 210093, PR China.
|
145
|
Long range clustering of oligonucleotides containing the CG signal. J Theor Biol 2009; 258:18-26. [PMID: 19490875 DOI: 10.1016/j.jtbi.2009.01.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/15/2008] [Revised: 01/14/2009] [Accepted: 01/14/2009] [Indexed: 11/24/2022]
Abstract
The distance distributions between successive occurrences of the same oligonucleotides in chromosomal DNA are studied, in different classes of higher eucaryotic organisms. A two-parameter modeling is undertaken and applied on the distance distribution of quintuplets (sequences of size five bps) and hexaplets (sequences of size six bps); the first parameter k refers to the short range exponential decay of the distributions, whereas the second parameter m refers to the power law behavior. A two-dimensional scatter plot representing the model equation demonstrates that the points corresponding to the distance distribution of oligonucleotides containing the CG consensus sequence (promoter of the RNA polymerase II) cluster together (group alpha), apart from all other oligonucleotides (group beta). This is shown for the available chordata Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, Gallus gallus and Danio rerio. This clustering is less evident in lower Animalia and plants, such as Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana. Moreover, in all organisms the oligonucleotides which contain any consensus sequence are found to be described by long range distributions, whereas all others have a stronger influence of short range decay. Various measures are introduced and evaluated, to numerically characterize the clustering of the two groups. The one which most clearly discriminates the two classes is shown to be the proximity factor.
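As a rough illustration of the quantity studied in this abstract, the sketch below tabulates the distances between successive occurrences of one oligonucleotide in a DNA string. It is not the authors' code: the function names, the toy sequence, and the choice to measure distances between start positions (allowing overlapping matches) are all assumptions made for this example, and the two-parameter fit (k, m) described in the abstract is not attempted.

```python
from collections import Counter


def successive_distances(seq, oligo):
    """Return distances between start positions of successive occurrences.

    Assumption for this sketch: overlapping occurrences are counted, and
    distance is measured start-to-start.
    """
    positions = []
    start = seq.find(oligo)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping matches are also found.
        start = seq.find(oligo, start + 1)
    return [b - a for a, b in zip(positions, positions[1:])]


def distance_distribution(seq, oligo):
    """Count how often each inter-occurrence distance appears."""
    return Counter(successive_distances(seq, oligo))


if __name__ == "__main__":
    toy_sequence = "ACGCGTTACGCGAACGCG"  # hypothetical toy input
    print(distance_distribution(toy_sequence, "ACGCG"))
```

On real chromosomal sequence, the resulting counts would be the empirical distribution to which a short-range exponential term and a long-range power-law term could then be fitted.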
|
146
|
Smith JJ, Putta S, Zhu W, Pao GM, Verma IM, Hunter T, Bryant SV, Gardiner DM, Harkins TT, Voss SR. Genic regions of a large salamander genome contain long introns and novel genes. BMC Genomics 2009; 10:19. [PMID: 19144141 PMCID: PMC2633012 DOI: 10.1186/1471-2164-10-19] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Received: 07/18/2008] [Accepted: 01/13/2009] [Indexed: 01/30/2023] Open
Abstract
BACKGROUND The basis of genome size variation remains an outstanding question because DNA sequence data are lacking for organisms with large genomes. Sixteen BAC clones from the Mexican axolotl (Ambystoma mexicanum: c-value = 32 × 10^9 bp) were isolated and sequenced to characterize the structure of genic regions. RESULTS Annotation of genes within BACs showed that axolotl introns are on average 10x longer than orthologous vertebrate introns and they are predicted to contain more functional elements, including miRNAs and snoRNAs. Loci were discovered within BACs for two novel EST transcripts that are differentially expressed during spinal cord regeneration and skin metamorphosis. Unexpectedly, a third novel gene was also discovered while manually annotating BACs. Analysis of human-axolotl protein-coding sequences suggests there are 2% more lineage specific genes in the axolotl genome than the human genome, but the great majority (86%) of genes between axolotl and human are predicted to be 1:1 orthologs. Considering that axolotl genes are on average 5x larger than human genes, the genic component of the salamander genome is estimated to be incredibly large, approximately 2.8 gigabases! CONCLUSION This study shows that a large salamander genome has a correspondingly large genic component, primarily because genes have incredibly long introns. These intronic sequences may harbor novel coding and non-coding sequences that regulate biological processes that are unique to salamanders.
Affiliation(s)
- Jeramiah J Smith
- Department of Biology and Spinal Cord and Brain Injury Research Center, University of Kentucky, Lexington, KY 40506, USA
- University of Washington, Department of Genome Sciences, Seattle, WA 98195, USA
- Benaroya Research Institute at Virginia Mason, Seattle, WA 98101, USA
- Srikrishna Putta
- Department of Biology and Spinal Cord and Brain Injury Research Center, University of Kentucky, Lexington, KY 40506, USA
- Wei Zhu
- The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Gerald M Pao
- The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Inder M Verma
- The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Tony Hunter
- The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Susan V Bryant
- Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA 92697, USA
- The Developmental Biology Center, University of California Irvine, Irvine, CA 92697, USA
- David M Gardiner
- Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA 92697, USA
- The Developmental Biology Center, University of California Irvine, Irvine, CA 92697, USA
- S Randal Voss
- Department of Biology and Spinal Cord and Brain Injury Research Center, University of Kentucky, Lexington, KY 40506, USA
|
147
|
Evolution of GHF5 endoglucanase gene structure in plant-parasitic nematodes: no evidence for an early domain shuffling event. BMC Evol Biol 2008; 8:305. [PMID: 18980666 PMCID: PMC2633302 DOI: 10.1186/1471-2148-8-305] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.8] [Received: 10/17/2007] [Accepted: 11/03/2008] [Indexed: 11/22/2022] Open
Abstract
Background Endo-1,4-beta-glucanases or cellulases from the glycosyl hydrolase family 5 (GHF5) have been found in numerous bacteria and fungi, and recently also in higher eukaryotes, particularly in plant-parasitic nematodes (PPN). The origin of these genes has been attributed to horizontal gene transfer from bacteria, although there still is a lot of uncertainty about the origin and structure of the ancestral GHF5 PPN endoglucanase. It is not clear whether this ancestral endoglucanase consisted of the whole gene cassette, containing a catalytic domain and a carbohydrate-binding module (CBM, type 2 in PPN and bacteria) or only of the catalytic domain while the CBM2 was retrieved by domain shuffling later in evolution. Previous studies on the evolution of these genes have focused primarily on data of sedentary nematodes, while in this study, extra data from migratory nematodes were included. Results Two new endoglucanases from the migratory nematodes Pratylenchus coffeae and Ditylenchus africanus were included in this study. The latter one is the first gene isolated from a PPN of a different superfamily (Sphaerularioidea); all previously known nematode endoglucanases belong to the superfamily Tylenchoidea (order Rhabditida). Phylogenetic analyses were conducted with the PPN GHF5 endoglucanases and homologous endoglucanases from bacterial and other eukaryotic lineages such as beetles, fungi and plants. No statistical incongruence between the phylogenetic trees deduced from the catalytic domain and the CBM2 was found, which could suggest that both domains have evolved together. Furthermore, based on gene structure data, we inferred a model for the evolution of the GHF5 endoglucanase gene structure in plant-parasitic nematodes. Our data confirm a close relationship between Pratylenchus spp. and the root knot nematodes, while some Radopholus similis endoglucanases are more similar to cyst nematode genes. 
Conclusion We conclude that the ancestral PPN GHF5 endoglucanase gene most probably consisted of the whole gene cassette, i.e. the GHF5 catalytic domain and the CBM2, rather than having acquired the CBM2 later by domain shuffling. Our evolutionary model for the gene structure in PPN GHF5 endoglucanases implies the occurrence of an early duplication event, and more recent gene duplications at genus or species level.
|
148
|
Roy M, Kim N, Xing Y, Lee C. The effect of intron length on exon creation ratios during the evolution of mammalian genomes. RNA (NEW YORK, N.Y.) 2008; 14:2261-73. [PMID: 18796579 PMCID: PMC2578852 DOI: 10.1261/rna.1024908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
Recent studies report that alternatively spliced exons tend to occur in longer introns, which is attributed to the length constraints for splice site pairing for the two major splicing mechanisms, intron definition versus exon definition. Using genome-wide studies of EST and microarray data from human and mouse, we have analyzed the distribution of various subsets of alternatively spliced exons, based on their inclusion level and evolutionary history, versus increasing intron length. Alternative exons may be included in either a major or minor fraction of all transcripts (known as major-form and minor-form exons, respectively). We find that major-form exons are seven- to eightfold more likely to be contained in short introns (<400 nt) than minor-form exons, which occur preferentially in longer introns. Since minor-form exons are more likely to be novel (approximately 75%), this implied that novel exons arise more frequently in longer introns. To test this hypothesis, we used whole genome alignments to classify exons according to their phylogenetic age. We find that older exons, i.e., exons that are conserved in all mammals, predominate at shorter intron lengths, for both major- and minor-form exons. In contrast, exons that arose recently during primate evolution are more prevalent at longer intron lengths (>1000 nt). This suggests that the observed correlation of longer intron lengths with alternatively spliced exons may be at least partly due to biases in the probability of exon creation, which is higher in long introns.
Affiliation(s)
- Meenakshi Roy
- Molecular Biology Institute, University of California, Los Angeles, California 90024, USA
|
149
|
Wahlberg N, Wheat CW. Genomic outposts serve the phylogenomic pioneers: designing novel nuclear markers for genomic DNA extractions of lepidoptera. Syst Biol 2008; 57:231-42. [PMID: 18398768 DOI: 10.1080/10635150802033006] [Citation(s) in RCA: 228] [Impact Index Per Article: 14.3] [Indexed: 10/22/2022] Open
Abstract
Increasing the number of characters used in phylogenetic studies is the next crucial step towards generating robust and stable phylogenetic hypotheses - i.e., strongly supported and consistent across reconstruction methods. Here we describe a genomic approach to finding new protein-coding genes for systematics in nonmodel taxa, which can be PCR amplified from standard, slightly degraded genomic DNA extracts. We test this approach on Lepidoptera, searching the draft genomic sequence of the silk moth Bombyx mori for exons > 500 bp in length, removing annotated gene families, and comparing the remaining exons with butterfly EST databases to identify conserved regions for primer design. These primers were tested on a set of 65 taxa primarily in the butterfly family Nymphalidae. We were able to identify and amplify six previously unused gene regions (Arginine Kinase, GAPDH, IDH, MDH, RpS2, and RpS5) and two rarely used gene regions (CAD and DDC) that, when added to the three traditional gene regions (COI, EF-1alpha and wingless), gave a data set of 8114 bp. Phylogenetic robustness and stability increased with increasing numbers of genes. Smaller taxonomic subsets were also robust when using the full gene data set. The full 11-gene data set was robust and stable across reconstruction methods, recovering the major lineages and strongly supporting relationships within them. Our methods and insights should be applicable to taxonomic groups having a single genomic reference species and several EST databases from taxa that diverged less than 100 million years ago.
Affiliation(s)
- Niklas Wahlberg
- Department of Zoology, Stockholm University, Stockholm, Sweden.
|
150
|
Knapp K, Chonka A, Chen YPP. POEM, A 3-dimensional exon taxonomy and patterns in untranslated exons. BMC Genomics 2008; 9:428. [PMID: 18803852 PMCID: PMC2561055 DOI: 10.1186/1471-2164-9-428] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/07/2008] [Accepted: 09/20/2008] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND The existence of exons and introns has been known for thirty years. Despite this knowledge, there is a lack of formal research into the categorization of exons. Exon taxonomies used by researchers tend to be selected ad hoc or based on an information-poor de facto standard. Exons have been shown to have specific properties and functions based on, among other things, their location and order. These factors should play a role in the naming to increase specificity about which exon type(s) are in question. RESULTS POEM (Protein Oriented Exon Monikers) is a new taxonomy focused on protein-proximal exons. It integrates three dimensions of information (Global Position, Regional Position and Region), so its exon categories are based on known statistical exon features. POEM is applied to two congruent untranslated exon datasets, resulting in the following statistical properties. Using the POEM taxonomy, previous wide-ranging estimates of initial 5' untranslated region exons are resolved. According to our datasets, 29-36% of genes have wholly untranslated first exons. Sequences containing untranslated exons are shown to have consistently up to 6 times more 5' untranslated exons than 3' untranslated exons. Finally, three exon patterns are determined which account for 70% of genes with untranslated exons. CONCLUSION We describe a thorough three-dimensional exon taxonomy called POEM, which is biologically and statistically relevant. No previous taxonomy provides such fine-grained information and yet still includes all valid information dimensions. The use of POEM will improve the accuracy of genefinder comparisons and analysis by means of a common taxonomy. It will also facilitate unambiguous communication due to its fine granularity.
Affiliation(s)
- Keith Knapp
- Faculty of Science and Technology, Deakin University, Victoria, Australia.
|