1
Rohmah L, Darwati S, Ulupi N, Khaerunnisa I, Sumantri C. Polymorphism of prolactin (PRL) gene exon 5 and its association with egg production in IPB-D1 chickens. Arch Anim Breed 2022; 65:449-455. [PMID: 36643022 PMCID: PMC9832302 DOI: 10.5194/aab-65-449-2022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Accepted: 11/28/2022] [Indexed: 12/24/2022] Open
Abstract
The prolactin (PRL) gene regulates egg production and incubation in laying chickens. Incubation activity disrupts the reproductive system of local chickens, so they lay fewer eggs. This study aimed to determine the prolactin gene polymorphism in IPB-D1 hens and its relationship to egg production. The polymorphism of exon 5 of the prolactin gene was examined in 112 samples from the IPB-D1 chicken DNA collection of the Division of Animal Genetics and Breeding, Faculty of Animal Sciences, IPB University. Genomic DNA was extracted using the phenol-chloroform method. DNA amplification by polymerase chain reaction (PCR) produced a 557 bp product. Three single-nucleotide polymorphisms (SNPs), g.7835A > G, g.7886A > T, and g.8052T > C, were found in exon 5 of the PRL gene. Each mutation was polymorphic and in Hardy-Weinberg equilibrium. According to the SNP association analysis, the point mutation g.8052T > C significantly affected the egg production of IPB-D1 chickens and may serve as a marker to enhance selection for egg production traits in IPB-D1 chickens.
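The Hardy-Weinberg check mentioned in the abstract can be illustrated with a minimal sketch. It assumes a biallelic SNP and a plain chi-square goodness-of-fit test; the function name `hwe_chi_square` and the genotype counts are hypothetical and are not taken from the paper, which does not state how the test was computed.

```python
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float, float]:
    """Return (freq_a, freq_b, chi_square) for one biallelic locus.

    n_aa, n_ab, n_bb are observed genotype counts (hypothetical inputs).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped individuals")
    # Allele frequencies from genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, q, chi2


# Illustrative counts only (they sum to 112 purely to echo the sample size).
p, q, chi2 = hwe_chi_square(40, 52, 20)
# With 1 degree of freedom, chi2 below 3.84 means no significant
# deviation from equilibrium at the 0.05 level.
print(f"p={p:.3f} q={q:.3f} chi2={chi2:.3f}")
```

A locus is usually also called polymorphic only when the rarer allele is not vanishingly rare; that threshold is a separate choice and is not modelled here.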
Affiliation(s)
- Lailatul Rohmah
  - Department of Animal Production and Technology, Faculty of Animal Sciences, IPB University, Bogor 16680, Indonesia
- Sri Darwati
  - Department of Animal Production and Technology, Faculty of Animal Sciences, IPB University, Bogor 16680, Indonesia
- Niken Ulupi
  - Department of Animal Production and Technology, Faculty of Animal Sciences, IPB University, Bogor 16680, Indonesia
- Isyana Khaerunnisa
  - Research Center for Applied Zoology, National Research and Innovation Agency, Bogor 16911, Indonesia
- Cece Sumantri
  - Department of Animal Production and Technology, Faculty of Animal Sciences, IPB University, Bogor 16680, Indonesia
2
Pradeepkumara N, Sharma PK, Munshi AD, Behera TK, Bhatia R, Kumari K, Singh J, Jaiswal S, Iquebal MA, Arora A, Rai A, Kumar D, Bhattacharya RC, Dey SS. Fruit transcriptional profiling of the contrasting genotypes for shelf life reveals the key candidate genes and molecular pathways regulating post-harvest biology in cucumber. Genomics 2022; 114:110273. [PMID: 35092817 DOI: 10.1016/j.ygeno.2022.110273] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/11/2021] [Revised: 01/17/2022] [Accepted: 01/21/2022] [Indexed: 02/07/2023]
Abstract
Cucumber fruits are perishable and become unfit for market within 2-3 days of harvesting. A natural variant, DC-48, with exceptionally high shelf life was developed and used to dissect, for the first time through RNA-seq, the genetic architecture and molecular mechanism of extended shelf life. A total of 1364 DEGs were identified, and genes related to cell wall degradation, chlorophyll metabolism and ethylene metabolism played a key role. Polygalacturonase (PG), Expansin (EXP) and xyloglucan were down-regulated, determining fruit firmness, and retention of fresh green colour was mainly attributed to the low expression level of the chlorophyll catalytic enzymes (CCEs). Gene regulatory networks revealed the hub genes and cross-talk associated with a wide variety of biological processes. The large number of SSRs (21524), SNPs (545173) and InDels (126252) identified will be instrumental in cucumber improvement. A web genomic resource, CsExSLDb, was developed to provide a platform for future investigation of cucumber post-harvest biology.
Affiliation(s)
- N Pradeepkumara
  - Division of Vegetable Science, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Parva Kumar Sharma
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- A D Munshi
  - Division of Vegetable Science, ICAR-Indian Agricultural Research Institute, New Delhi, India
- T K Behera
  - Division of Vegetable Science, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Reeta Bhatia
  - Division of Floriculture and Landscaping, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Khushboo Kumari
  - Division of Vegetable Science, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Jogendra Singh
  - Division of Vegetable Science, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Sarika Jaiswal
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Mir Asif Iquebal
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Ajay Arora
  - Division of Plant Physiology, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Anil Rai
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Dinesh Kumar
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- R C Bhattacharya
  - ICAR-National Institute of Plant Biotechnology, New Delhi, India
- S S Dey
  - Division of Vegetable Science, ICAR-Indian Agricultural Research Institute, New Delhi, India
3
Vlassakis J, Herr AE. Joule Heating-Induced Dispersion in Open Microfluidic Electrophoretic Cytometry. Anal Chem 2017; 89:12787-12796. [PMID: 29110464 DOI: 10.1021/acs.analchem.7b03096] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 01/03/2023]
Abstract
While protein electrophoresis conducted in capillaries and microchannels offers high-resolution separations, such formats can be cumbersome to parallelize for single-cell analysis. One approach for realizing large numbers of concurrent separations is open microfluidics (i.e., no microchannels). In an open microfluidic device adapted for single-cell electrophoresis, we perform 100s to 1000s of simultaneous separations of endogenous proteins. The microscope slide-sized device contains cells isolated in microwells located in a ∼40 μm polyacrylamide gel. The gel supports protein electrophoresis after concurrent in situ chemical lysis of each isolated cell. During electrophoresis, Joule (or resistive) heating degrades separation performance. Joule heating effects are expected to be acute in open microfluidic devices, where a single, high-conductivity buffer expedites the transition from cell lysis to protein electrophoresis. Here, we test three key assertions. First, Joule heating substantially impacts analytical sensitivity due to diffusive losses of protein out of the open microfluidic electrophoretic (EP) cytometry device. Second, increased analyte diffusivity due to autothermal runaway Joule heating is a dominant mechanism that reduces separation resolution in EP cytometry. Finally, buffer exchange reduces diffusive losses and band broadening, even when handling single-cell lysate protein concentrations in an open device. We develop numerical simulations of Joule heating-enhanced diffusion during electrophoresis and observe ∼50% protein loss out of the gel, which is reduced using the buffer exchange. Informed by analytical model predictions of separation resolution (with Joule heating), we empirically demonstrate nearly fully resolved separations of proteins with molecular mass differences of just 4 kDa or 12% (GAPDH, 36 kDa; PS6, 32 kDa) in each of 129 single cells. 
The attained separation performance with buffer exchange is relevant to detection of currently unmeasurable protein isoforms responsible for cancer progression.
Affiliation(s)
- Julea Vlassakis
  - Department of Bioengineering and The UC Berkeley/UCSF Graduate Program in Bioengineering, University of California Berkeley, Berkeley, California 94720, United States
- Amy E Herr
  - Department of Bioengineering and The UC Berkeley/UCSF Graduate Program in Bioengineering, University of California Berkeley, Berkeley, California 94720, United States
4
Forsdyke DR. Exons and Introns. Evol Bioinform Online 2016. [DOI: 10.1007/978-3-319-28755-3_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022] Open
5
Piovesan A, Caracausi M, Ricci M, Strippoli P, Vitale L, Pelleri MC. Identification of minimal eukaryotic introns through GeneBase, a user-friendly tool for parsing the NCBI Gene databank. DNA Res 2015; 22:495-503. [PMID: 26581719 PMCID: PMC4675715 DOI: 10.1093/dnares/dsv028] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 06/15/2015] [Accepted: 10/07/2015] [Indexed: 01/26/2023] Open
Abstract
We have developed GeneBase, a full parser of the National Center for Biotechnology Information (NCBI) Gene database, which generates a fully structured local database with an intuitive user-friendly graphic interface for personal computers. Features of all the annotated eukaryotic genes are accessible through three main software tables, including for each entry details such as the gene summary, the gene exon/intron structure and the specific Gene Ontology attributions. The structuring of the data, the creation of additional calculation fields and the integration with nucleotide sequences allow users to make many types of comparisons and calculations that are useful for data retrieval and analysis. We provide an original example analysis of the existing introns across all the available species, through which the classic biological problem of the ‘minimal intron’ may find a solution using available data. Based on all currently available data, we can define the shortest known eukaryotic GT-AG intron length, setting the physical limit at the 30 base pair intron belonging to the human MST1L gene. This ‘model intron’ will shed light on the minimal requirement elements of recognition used for conventional splicing functioning. Remarkably, this size is indeed consistent with the sum of the splicing consensus sequence lengths.
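The kind of intron-length calculation described above can be sketched as follows. This is a minimal illustration, not GeneBase itself: it assumes exon coordinates are already available as 1-based inclusive `(start, end)` pairs on one strand, and the function name `intron_lengths` and the example coordinates are hypothetical.

```python
from typing import List, Tuple


def intron_lengths(exons: List[Tuple[int, int]]) -> List[int]:
    """Lengths of the introns lying between consecutive exons.

    exons: 1-based inclusive (start, end) coordinates, in any order.
    """
    ordered = sorted(exons)
    lengths = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end - 1
        if gap <= 0:
            raise ValueError("exons overlap or touch; no intron between them")
        lengths.append(gap)
    return lengths


# Made-up gene model with three exons.
exons = [(1, 100), (131, 200), (501, 600)]
lengths = intron_lengths(exons)
print(lengths, "shortest:", min(lengths))
```

Running the same function over every annotated gene and keeping the minimum is the idea behind searching for a "minimal intron"; restricting the search to GT-AG introns would additionally require the intron sequences.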
Affiliation(s)
- Allison Piovesan
  - Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, Bologna, BO 40126, Italy
- Maria Caracausi
  - Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, Bologna, BO 40126, Italy
- Marco Ricci
  - Department of Biological, Geological and Environmental Sciences (BIGeA), University of Bologna, Bologna, BO 40126, Italy
- Pierluigi Strippoli
  - Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, Bologna, BO 40126, Italy
- Lorenza Vitale
  - Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, Bologna, BO 40126, Italy
- Maria Chiara Pelleri
  - Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, Bologna, BO 40126, Italy
6
Trinh LA, Fraser SE. Enhancer and gene traps for molecular imaging and genetic analysis in zebrafish. Dev Growth Differ 2013; 55:434-45. [DOI: 10.1111/dgd.12055] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 01/10/2013] [Revised: 03/04/2013] [Accepted: 03/05/2013] [Indexed: 01/28/2023]
Affiliation(s)
- Le A. Trinh
  - Division of Biology; California Institute of Technology; Beckman Institute (139-74); 1200 E. California Blvd; Pasadena; California; 91125; USA
- Scott E. Fraser
  - Division of Biology; California Institute of Technology; Beckman Institute (139-74); 1200 E. California Blvd; Pasadena; California; 91125; USA
7
Piwowar M, Krzysztof P, Piotr P. ExonVisualiser - application for visualization exon units in 2D and 3D protein structures. Bioinformation 2012; 8:1280-2. [PMID: 23275735 PMCID: PMC3532015 DOI: 10.6026/97320630081280] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/12/2012] [Accepted: 11/14/2012] [Indexed: 11/23/2022] Open
Abstract
A web application for identifying and visualising protein regions encoded by exons is presented. The Exon Visualiser can be used for visualisation at different levels of protein structure: the primary (sequence) level, the secondary structure level, and the tertiary structure level. The programme is suitable for processing data for all genes whose protein structures are deposited in the PDB database. The application implements the following steps: I) loading exon sequences and their coordinates from a GenBank file, together with protein sequences (the CDS from GenBank and the amino acid sequence from PDB); II) creating a consensus sequence (comparing the amino acid sequence from the PDB file with the CDS sequence from the GenBank file); III) matching exon coordinates; IV) visualisation in 2D and 3D protein structures. Among other features, the web tool provides a colour-coded graphical display of protein sequences and of chains in three-dimensional protein structures, correlated with the corresponding exons. AVAILABILITY: http://149.156.12.53/ExonVisualiser/
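Step III above (matching exon coordinates to the protein) can be sketched in a few lines. This is an assumption-laden illustration rather than the ExonVisualiser implementation: it takes only the coding length of each exon, in transcript order, and reports which residues each exon contributes to. The function name `exon_residue_ranges` and the example lengths are hypothetical.

```python
from typing import List, Tuple


def exon_residue_ranges(exon_cds_lengths: List[int]) -> List[Tuple[int, int]]:
    """Map each exon's coding nucleotides to a 1-based residue range.

    A codon split by an exon boundary is reported for both exons.
    """
    ranges = []
    cds_pos = 0  # number of coding nucleotides consumed so far
    for length in exon_cds_lengths:
        if length <= 0:
            raise ValueError("each exon must contribute coding nucleotides")
        first_nt = cds_pos + 1
        last_nt = cds_pos + length
        # Residue i is encoded by CDS nucleotides 3*(i-1)+1 .. 3*i.
        ranges.append(((first_nt - 1) // 3 + 1, (last_nt - 1) // 3 + 1))
        cds_pos = last_nt
    return ranges


# Three made-up exons carrying 4, 5 and 3 coding nucleotides.
print(exon_residue_ranges([4, 5, 3]))
```

The resulting residue ranges are what a viewer would use to colour sequence positions or structure chains by exon.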
Affiliation(s)
- Monika Piwowar
  - Department of Bioinformatics and Telemedicine, Collegium Medicum, Jagiellonian University, Lazarza 16, 31-530 Krakow, Poland
- Porembski Krzysztof
  - Department of Bioinformatics and Telemedicine, Collegium Medicum, Jagiellonian University, Lazarza 16, 31-530 Krakow, Poland
- Piwowar Piotr
  - Department of Measurement and Electronics, AGH University of Science and Technology, al. A. Mickiewicza 30, 30-059 Krakow, Poland
8
Brody Y, Neufeld N, Bieberstein N, Causse SZ, Böhnlein EM, Neugebauer KM, Darzacq X, Shav-Tal Y. The in vivo kinetics of RNA polymerase II elongation during co-transcriptional splicing. PLoS Biol 2011; 9:e1000573. [PMID: 21264352 PMCID: PMC3019111 DOI: 10.1371/journal.pbio.1000573] [Citation(s) in RCA: 163] [Impact Index Per Article: 12.5] [Received: 05/14/2010] [Accepted: 11/19/2010] [Indexed: 01/01/2023] Open
Abstract
Kinetic analysis shows that RNA polymerase elongation kinetics are not modulated by co-transcriptional splicing and that post-transcriptional splicing can proceed at the site of transcription without the presence of the polymerase. RNA processing events that take place on the transcribed pre-mRNA include capping, splicing, editing, 3′ processing, and polyadenylation. Most of these processes occur co-transcriptionally while the RNA polymerase II (Pol II) enzyme is engaged in transcriptional elongation. How Pol II elongation rates are influenced by splicing is not well understood. We generated a family of inducible gene constructs containing increasing numbers of introns and exons, which were stably integrated in human cells to serve as actively transcribing gene loci. By monitoring the association of the transcription and splicing machineries on these genes in vivo, we showed that only U1 snRNP localized to the intronless gene, consistent with a splicing-independent role for U1 snRNP in transcription. In contrast, all snRNPs accumulated on intron-containing genes, and increasing the number of introns increased the amount of spliceosome components recruited. This indicates that nascent RNA can assemble multiple spliceosomes simultaneously. Kinetic measurements of Pol II elongation in vivo, Pol II ChIP, as well as use of Spliceostatin and Meayamycin splicing inhibitors showed that polymerase elongation rates were uncoupled from ongoing splicing. This study shows that transcription elongation kinetics proceed independently of splicing at the model genes studied here. Surprisingly, retention of polyadenylated mRNA was detected at the transcription site after transcription termination. This suggests that the polymerase is released from chromatin prior to the completion of splicing, and the pre-mRNA is post-transcriptionally processed while still tethered to chromatin near the gene end. 
The pre-mRNA emerging from RNA polymerase II during eukaryotic transcription undergoes a series of processing events. These include 5′-capping, intron excision and exon ligation during splicing, 3′-end processing, and polyadenylation. Processing events occur co-transcriptionally, meaning that a variety of enzymes assemble on the pre-mRNA while the polymerase is still engaged in transcription. The concept of co-transcriptional mRNA processing raises questions about the possible coupling between the transcribing polymerase and the processing machineries. Here we examine how the co-transcriptional assembly of the splicing machinery (the spliceosome) might affect the elongation kinetics of the RNA polymerase. Using live-cell microscopy, we followed the kinetics of transcription of genes containing increasing numbers of introns and measured the recruitment of transcription and splicing factors. Surprisingly, a sub-set of splicing factors was recruited to an intronless gene, implying that there is a polymerase-coupled scanning mechanism for intronic sequences. There was no difference in polymerase elongation rates on genes with or without introns, suggesting that the spliceosome does not modulate elongation kinetics. Experiments including inhibition of splicing or transcription, together with stochastic computational simulation, demonstrated that pre-mRNAs can be retained on the gene when polymerase termination precedes completion of splicing. Altogether we show that polymerase elongation kinetics are not affected by splicing events on the emerging pre-mRNA, that increased splicing leads to more splicing factors being recruited to the mRNA, and that post-transcriptional splicing can proceed at the site of transcription in the absence of the polymerase.
Affiliation(s)
- Yehuda Brody
  - The Mina & Everard Goodman Faculty of Life Sciences & Institute of Nanotechnology, Bar-Ilan University, Ramat Gan, Israel
- Noa Neufeld
  - The Mina & Everard Goodman Faculty of Life Sciences & Institute of Nanotechnology, Bar-Ilan University, Ramat Gan, Israel
- Nicole Bieberstein
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Sebastien Z. Causse
  - Functional Imaging of Transcription, Ecole Normale Supérieure, Institut de Biologie de l'ENS, IBENS, CNRS, UMR8197, Paris, France
- Eva-Maria Böhnlein
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Karla M. Neugebauer
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Xavier Darzacq
  - Functional Imaging of Transcription, Ecole Normale Supérieure, Institut de Biologie de l'ENS, IBENS, CNRS, UMR8197, Paris, France
- Yaron Shav-Tal
  - The Mina & Everard Goodman Faculty of Life Sciences & Institute of Nanotechnology, Bar-Ilan University, Ramat Gan, Israel
9
Ogino K, Tsuneki K, Furuya H. Unique genome of dicyemid mesozoan: Highly shortened spliceosomal introns in conservative exon/intron structure. Gene 2010; 449:70-6. [DOI: 10.1016/j.gene.2009.09.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 12/17/2008] [Revised: 08/31/2009] [Accepted: 09/01/2009] [Indexed: 01/08/2023]
10
Long range clustering of oligonucleotides containing the CG signal. J Theor Biol 2009; 258:18-26. [PMID: 19490875 DOI: 10.1016/j.jtbi.2009.01.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/15/2008] [Revised: 01/14/2009] [Accepted: 01/14/2009] [Indexed: 11/24/2022]
Abstract
The distance distributions between successive occurrences of the same oligonucleotides in chromosomal DNA are studied, in different classes of higher eucaryotic organisms. A two-parameter modeling is undertaken and applied on the distance distribution of quintuplets (sequences of size five bps) and hexaplets (sequences of size six bps); the first parameter k refers to the short range exponential decay of the distributions, whereas the second parameter m refers to the power law behavior. A two-dimensional scatter plot representing the model equation demonstrates that the points corresponding to the distance distribution of oligonucleotides containing the CG consensus sequence (promoter of the RNA polymerase II) cluster together (group alpha), apart from all other oligonucleotides (group beta). This is shown for the available chordata Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, Gallus gallus and Danio rerio. This clustering is less evident in lower Animalia and plants, such as Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana. Moreover, in all organisms the oligonucleotides which contain any consensus sequence are found to be described by long range distributions, whereas all others have a stronger influence of short range decay. Various measures are introduced and evaluated, to numerically characterize the clustering of the two groups. The one which most clearly discriminates the two classes is shown to be the proximity factor.
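The quantity studied above, the distance between successive occurrences of the same oligonucleotide, can be computed with a short sketch. It assumes distances are measured start-to-start and that overlapping occurrences count, which may differ from the paper's exact convention; the function name `successive_distances` and the toy sequence are hypothetical.

```python
from collections import Counter
from typing import List


def successive_distances(sequence: str, oligo: str) -> List[int]:
    """Start-to-start distances between consecutive occurrences of oligo."""
    if not oligo:
        raise ValueError("oligo must be non-empty")
    positions = []
    start = sequence.find(oligo)
    while start != -1:
        positions.append(start)
        # Step by one so that overlapping occurrences are also found.
        start = sequence.find(oligo, start + 1)
    return [b - a for a, b in zip(positions, positions[1:])]


# Toy sequence; real input would be a whole chromosome.
distances = successive_distances("ACGTTACGTACGT", "ACGT")
histogram = Counter(distances)  # empirical distance distribution
print(distances, dict(histogram))
```

Fitting the two-parameter model (short-range exponential decay k, power-law exponent m) to such a histogram is a separate step not shown here.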
11
Knapp K, Chonka A, Chen YPP. POEM, A 3-dimensional exon taxonomy and patterns in untranslated exons. BMC Genomics 2008; 9:428. [PMID: 18803852 PMCID: PMC2561055 DOI: 10.1186/1471-2164-9-428] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/07/2008] [Accepted: 09/20/2008] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND: The existence of exons and introns has been known for thirty years. Despite this knowledge, there is a lack of formal research into the categorization of exons. Exon taxonomies used by researchers tend to be selected ad hoc or based on an information-poor de facto standard. Exons have been shown to have specific properties and functions based on, among other things, their location and order. These factors should play a role in the naming to increase specificity about which exon type(s) are in question. RESULTS: POEM (Protein Oriented Exon Monikers) is a new taxonomy focused on protein-proximal exons. It integrates three dimensions of information (Global Position, Regional Position and Region), so its exon categories are based on known statistical exon features. POEM is applied to two congruent untranslated exon datasets, resulting in the following statistical properties. Using the POEM taxonomy, previous wide-ranging estimates of initial 5' untranslated region exons are resolved. According to our datasets, 29-36% of genes have wholly untranslated first exons. Sequences containing untranslated exons are shown to have consistently up to 6 times more 5' untranslated exons than 3' untranslated exons. Finally, three exon patterns are determined which account for 70% of untranslated exon genes. CONCLUSION: We describe a thorough three-dimensional exon taxonomy called POEM, which is biologically and statistically relevant. No previous taxonomy provides such fine-grained information and yet still includes all valid information dimensions. The use of POEM will improve the accuracy of genefinder comparisons and analysis by means of a common taxonomy. It will also facilitate unambiguous communication due to its fine granularity.
Affiliation(s)
- Keith Knapp
  - Faculty of Science and Technology, Deakin University, Victoria, Australia
12
Di Giulio M. The split genes of Nanoarchaeum equitans are an ancestral character. Gene 2008; 421:20-6. [DOI: 10.1016/j.gene.2008.06.010] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 09/24/2007] [Revised: 01/15/2008] [Accepted: 06/03/2008] [Indexed: 11/30/2022]
13
Mátés L, Izsvák Z, Ivics Z. Technology transfer from worms and flies to vertebrates: transposition-based genome manipulations and their future perspectives. Genome Biol 2007; 8 Suppl 1:S1. [PMID: 18047686 PMCID: PMC2106849 DOI: 10.1186/gb-2007-8-s1-s1] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.5] [Indexed: 01/26/2023] Open
Abstract
To meet the increasing demand of linking sequence information to gene function in vertebrate models, genetic modifications must be introduced and their effects analyzed in an easy, controlled, and scalable manner. In the mouse, only about 10% (estimate) of all genes have been knocked out, despite continuous methodologic improvement and extensive effort. Moreover, a large proportion of inactivated genes exhibit no obvious phenotypic alterations. Thus, in order to facilitate analysis of gene function, new genetic tools and strategies are currently under development in these model organisms. Loss of function and gain of function mutagenesis screens based on transposable elements have numerous advantages because they can be applied in vivo and are therefore phenotype driven, and molecular analysis of the mutations is straightforward. At present, laboratory harnessing of transposable elements is more extensive in invertebrate models, mostly because of their earlier discovery in these organisms. Transposons have already been found to facilitate functional genetics research greatly in lower metazoan models, and have been applied most comprehensively in Drosophila. However, transposon based genetic strategies were recently established in vertebrates, and current progress in this field indicates that transposable elements will indeed serve as indispensable tools in the genetic toolkit for vertebrate models. In this review we provide an overview of transposon based genetic modification techniques used in higher and lower metazoan model organisms, and we highlight some of the important general considerations concerning genetic applications of transposon systems.
Affiliation(s)
- Lajos Mátés
  - Max Delbrück Center for Molecular Medicine, Robert-Rössle-Str, 13092 Berlin, Germany
14
Provata A, Oikonomou T. Power law exponents characterizing human DNA. Phys Rev E Stat Nonlin Soft Matter Phys 2007; 75:056102. [PMID: 17677128 DOI: 10.1103/physreve.75.056102] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/07/2006] [Revised: 02/09/2007] [Indexed: 05/16/2023]
Abstract
The size distributions of all known coding and noncoding DNA sequences are studied in all human chromosomes. In a unified approach, both introns and intergenic regions are treated as noncoding regions. The distributions of noncoding segments $P_{nc}(S)$ of size $S$ present long tails $P_{nc}(S) \sim S^{-1-\mu_{nc}}$, with exponents $\mu_{nc}$ ranging between 0.71 (for chromosome 13) and 1.2 (for chromosome 19). On the contrary, the exponential, short-range decay terms dominate in the distributions of coding (exon) segments $P_c(S)$ in all chromosomes. Aiming to address the emergence of these statistical features, minimal, stochastic, mean-field models are proposed, based on randomly aggregating DNA strings with duplication, influx and outflux of genomic segments. These minimal models produce both the short-range statistics in the coding and the observed power law and fractal statistics in the noncoding DNA. The minimal models also demonstrate that although the two systems (coding and noncoding) coexist, alternating on the same linear chain, they act independently: the coding as a closed, equilibrium system and the noncoding as an open, out-of-equilibrium one.
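A tail exponent of the kind reported above can be estimated from a list of segment sizes with the standard continuous maximum-likelihood (Hill) estimator. This is a swapped-in textbook technique, not necessarily the fitting procedure the authors used; for a density proportional to S^(-1-mu) above a cutoff, the estimate is mu = n / sum(ln(S_i / S_min)). The function name `tail_exponent_mu` and the cutoff argument `s_min` are hypothetical.

```python
import math
from typing import Iterable


def tail_exponent_mu(sizes: Iterable[float], s_min: float) -> float:
    """Hill estimate of mu for a tail P(S) ~ S**(-1 - mu), using S >= s_min."""
    if s_min <= 0:
        raise ValueError("s_min must be positive")
    tail = [s for s in sizes if s >= s_min]
    log_sum = sum(math.log(s / s_min) for s in tail)
    if log_sum <= 0:
        raise ValueError("need at least one size strictly above s_min")
    return len(tail) / log_sum


# Synthetic sizes only; real input would be noncoding segment lengths
# from one chromosome, with s_min chosen where the power law sets in.
sizes = [120.0, 250.0, 400.0, 900.0, 3000.0, 15000.0]
print(round(tail_exponent_mu(sizes, s_min=100.0), 3))
```

The estimate is sensitive to the choice of `s_min`, so in practice it is worth checking how stable the value is across several cutoffs.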
Affiliation(s)
- A Provata
  - Institute of Physical Chemistry, National Center for Scientific Research Demokritos, 15310 Athens, Greece
15
SpliceMiner: a high-throughput database implementation of the NCBI Evidence Viewer for microarray splice variant analysis. BMC Bioinformatics 2007; 8:75. [PMID: 17338820 PMCID: PMC1839109 DOI: 10.1186/1471-2105-8-75] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 10/03/2006] [Accepted: 03/05/2007] [Indexed: 12/12/2022] Open
Abstract
Background: There are many fewer genes in the human genome than there are expressed transcripts. Alternative splicing is the reason. Alternatively spliced transcripts are often specific to tissue type, developmental stage, environmental condition, or disease state. Accurate analysis of microarray expression data and design of new arrays for alternative splicing require assessment of probes at the sequence and exon levels. Description: SpliceMiner is a web interface for querying Evidence Viewer Database (EVDB). EVDB is a comprehensive, non-redundant compendium of splice variant data for human genes. We constructed EVDB as a queryable implementation of the NCBI Evidence Viewer (EV). EVDB is based on data obtained from NCBI Entrez Gene and EV. The automated EVDB build process uses only complete coding sequences, which may or may not include partial or complete 5' and 3' UTRs, and filters redundant splice variants. Unlike EV, which supports only one-at-a-time queries, SpliceMiner supports high-throughput batch queries and provides results in an easily parsable format. SpliceMiner maps probes to splice variants, effectively delineating the variants identified by a probe. Conclusion: EVDB can be queried by gene symbol, genomic coordinates, or probe sequence via a user-friendly web-based tool we call SpliceMiner. The EVDB/SpliceMiner combination provides an interface with human splice variant information and, going beyond the very valuable NCBI Evidence Viewer, supports fluent, high-throughput analysis. Integration of EVDB information into microarray analysis and design pipelines has the potential to improve the analysis and bioinformatic interpretation of gene expression data, for both batch and interactive processing. For example, whenever a gene expression value is recognized as important or appears anomalous in a microarray experiment, the interactive mode of SpliceMiner can be used quickly and easily to check for possible splice variant issues.
|
16
|
de Roos ADG. Conserved intron positions in ancient protein modules. Biol Direct 2007; 2:7. [PMID: 17288589 PMCID: PMC1800838 DOI: 10.1186/1745-6150-2-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 02/05/2007] [Accepted: 02/08/2007] [Indexed: 12/31/2022] Open
Abstract
Background: The timing of the origin of introns is of crucial importance for an understanding of early genome architecture. The Exon theory of genes proposed a role for introns in the formation of multi-exon proteins by exon shuffling and predicts the presence of conserved splice sites in ancient genes. In this study, large-scale analysis of potential conserved splice sites was performed using an intron-exon database (ExInt) derived from GenBank. Results: A set of conserved intron positions was found by matching identical splice site sequences from distantly related eukaryotic kingdoms. Most amino acid sequences with conserved introns were homologous to consensus sequences of functional domains from conserved proteins including kinases, phosphatases, small GTPases, transporters and matrix proteins. These included ancient proteins that originated before the eukaryote-prokaryote split, for instance the catalytic domain of protein phosphatase 2A where a total of eleven conserved introns were found. Using an experimental setup in which the relation between a splice site and the ancientness of its surrounding sequence could be studied, it was found that the presence of an intron was positively correlated to the ancientness of its surrounding sequence. Intron phase conservation was linked to the conservation of the gene sequence and not to the splice site sequence itself. However, no apparent differences in phase distribution were found between introns in conserved versus non-conserved sequences. Conclusion: The data confirm an origin of introns deep in the eukaryotic branch and are in concordance with the presence of introns in the first functional protein modules in an 'Exon theory of genes' scenario. A model is proposed in which shuffling of primordial short exonic sequences led to the formation of the first functional protein modules, in line with hypotheses that see the formation of introns as integral to the origins of genome evolution.
Reviewers: This article was reviewed by Scott Roy (nominated by Anthony Poole), Sandro de Souza (nominated by Manyuan Long), and Gáspár Jékely.
Affiliation(s)
- Albert D G de Roos
- Syncyte BioIntelligence, P.O. Box 600, 1000 AP, Amsterdam, The Netherlands.
|
17
|
Street TO, Rose GD, Barrick D. The role of introns in repeat protein gene formation. J Mol Biol 2006; 360:258-66. [PMID: 16781737 DOI: 10.1016/j.jmb.2006.05.024] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 03/07/2006] [Revised: 05/08/2006] [Accepted: 05/10/2006] [Indexed: 11/23/2022]
Abstract
Genes composed of tandem repetitive sequence motifs are abundant in nature and are enriched in eukaryotes. To investigate repeat protein gene formation mechanisms, we have conducted a large-scale analysis of their introns and exons. We find that a wide variety of repeat motifs exhibit a striking conservation of intron position and phase, and are composed of exons that encode one or two complete repeats. These results suggest a simple model of repeat protein gene formation from local duplications. This model is corroborated by amino acid sequence similarity patterns among neighboring repeats from various repeat protein genes. The distribution of one- and two-repeat exons indicates that intron-facilitated repeat motif duplication, in which the start and end points of duplication are located in consecutive intronic regions, significantly exceeds intron-independent duplication. These results suggest that introns have contributed to the greater abundance of repeat protein genes in eukaryotic versus prokaryotic organisms, a conclusion that is supported by taxonomic analysis.
Affiliation(s)
- Timothy O Street
- T.C. Jenkins Department of Biophysics, Johns Hopkins University, 3400 N. Charles Street, Baltimore, MD 21218, USA
|
18
|
Sakharkar KR, Sakharkar MK, Chow VTK. Gene fusion in Helicobacter pylori: making the ends meet. Antonie van Leeuwenhoek 2006; 89:169-80. [PMID: 16541196 DOI: 10.1007/s10482-005-9021-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 08/09/2005] [Accepted: 10/24/2005] [Indexed: 11/26/2022]
Abstract
Fusion genes have been reported as a means of enabling the development of novel or enhanced functions. In this report, we analyzed fusion genes in the genomes of two Helicobacter pylori strains (26695 and J99) and identified 32 fusion genes that are present as neighbours in one strain (components) and are fused in the second (composite), and vice-versa. The mechanism for each case of gene fusion is explored. 28 out of 32 genes identified as fusion products in this analysis were reported as essential genes in the previously documented transposon mutagenesis of H. pylori strain G27. This observation suggests the potential of the products of fusion genes as putative microbial drug targets. These results underscore the utility of bacterial genomic sequence comparisons for understanding gene evolution and for in silico drug target identification in the post-genomic era.
Affiliation(s)
- Kishore R Sakharkar
- Programme in Infectious Diseases, Department of Microbiology, Yong Loo Lin School of Medicine, National University of Singapore, 5 Science Drive 2, Kent Ridge 117597, Singapore
|
19
|
Forsdyke DR. Exons and Introns. Evol Bioinform Online 2006. [DOI: 10.1007/978-0-387-33419-6_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022] Open
|
20
|
Nagasaki H, Arita M, Nishizawa T, Suwa M, Gotoh O. Species-specific variation of alternative splicing and transcriptional initiation in six eukaryotes. Gene 2005; 364:53-62. [PMID: 16219431 DOI: 10.1016/j.gene.2005.07.027] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.9] [Received: 01/27/2005] [Revised: 05/11/2005] [Accepted: 07/18/2005] [Indexed: 11/28/2022]
Abstract
The genome-wide detection of alternative splicing and transcriptional initiation (ASTI) was conducted in six eukaryotes (human, mouse, fruit fly, nematode, cress and rice) whose genome sequencing has been completed or nearly completed. Transcriptional isoforms were collected by mapping a batch of full-length cDNA sequences onto the respective cognate genomic sequences. Isoforms mapped on the same gene locus were compared pair-wise, ASTI patterns were segmented into minimal spans, and then the minimal patterns (ASTI units) were classified into unique types, such as the cassette type or the alternative donor site. All these procedures were performed automatically under the same conditions so that the results obtained from different species could be compared directly. The fraction of loci that underwent ASTI of the total mapped loci was the largest for mammals and fruit fly, and the smallest for plants. Exactly the same trend was observed for the number of unique ASTI types found in each species. The observed fractional representations of the ASTI types were similar between evolutionarily close species, such as human and mouse or cress and rice. On the other hand, the relative orders of abundance in individual ASTI type were considerably different between evolutionarily distant species, such as between mammals and plants. In human and mouse, alternative splicing other than the retained introns tended to occur within the protein coding sequence (CDS) regions rather than within the untranslated regions (UTRs), whereas this tendency was obscure in the other four species. In all the species examined, the difference in alternative exon lengths was most likely in multiples of three, and this tendency was most prominent when the alternative exons were embedded within the CDSs. 
These observations are generally consistent with the idea that higher organisms utilize the ASTI mechanisms more extensively and in a more complicated manner than lower organisms, and that ASTI actively participates in the enhancement of the functional and structural diversity of products generated from a limited number of genes on a genome.
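The observation that alternative exon lengths tend to differ by multiples of three follows from reading-frame arithmetic: a length difference divisible by three leaves downstream codons in frame. A minimal sketch of that check follows; the function names and the example length pairs are hypothetical, not data from the study.

```python
def preserves_reading_frame(length_a: int, length_b: int) -> bool:
    """True if two alternative exon lengths differ by a multiple of three."""
    return (length_a - length_b) % 3 == 0


def fraction_frame_preserving(length_pairs) -> float:
    """Fraction of alternative-exon pairs whose length difference keeps the frame."""
    pairs = list(length_pairs)
    if not pairs:
        return 0.0
    kept = sum(1 for a, b in pairs if preserves_reading_frame(a, b))
    return kept / len(pairs)


# Hypothetical (isoform 1 length, isoform 2 length) pairs in nucleotides;
# a length of 0 stands for a cassette exon that is skipped entirely.
example_pairs = [(120, 0), (96, 141), (75, 77), (210, 60)]
print(fraction_frame_preserving(example_pairs))  # 0.75
```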
Affiliation(s)
- Hideki Nagasaki
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo 135-0064, Japan.
|
21
|
Vibranovski MD, Sakabe NJ, de Oliveira RS, de Souza SJ. Signs of ancient and modern exon-shuffling are correlated to the distribution of ancient and modern domains along proteins. J Mol Evol 2005; 61:341-50. [PMID: 16034650 DOI: 10.1007/s00239-004-0318-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Received: 11/10/2004] [Accepted: 03/11/2005] [Indexed: 11/24/2022]
Abstract
Exon-shuffling is an important mechanism accounting for the origin of many new proteins in eukaryotes. However, its role in the creation of proteins in the ancestor of prokaryotes and eukaryotes is still debatable. Excess of symmetric exons is thought to represent evidence for exon-shuffling since the exchange of exons flanked by introns of the same phase does not disrupt the reading frame of the host gene. In this report, we found that there is a significant correlation between symmetric units of shuffling and the age of protein domains. Ancient domains, present in both prokaryotes and eukaryotes, are more frequently bounded by phase 0 introns and their distribution is biased towards the central part of proteins. Modern domains are more frequently bounded by phase 1 introns and are present predominantly at the ends of proteins. We propose a model in which shuffling of ancient domains mainly flanked by phase 0 introns was important in the ancestor of eukaryotes and prokaryotes, during the creation of the central part of proteins. Shuffling of modern domains, predominantly flanked by phase 1 introns, accounted for the origin of the extremities of proteins during eukaryotic evolution.
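Why symmetric exons preserve the reading frame can be made concrete with a short sketch. It uses the standard definition of intron phase (the number of coding bases preceding the intron, modulo 3); the function names and the exon lengths are hypothetical and are not taken from the paper.

```python
from itertools import accumulate


def intron_phases(cds_exon_lengths):
    """Phase of each intron: coding bases upstream of it, modulo 3.

    Phase 0 falls between codons, phase 1 after the first base of a codon,
    phase 2 after the second base.
    """
    cumulative = list(accumulate(cds_exon_lengths))
    return [total % 3 for total in cumulative[:-1]]


def internal_exon_classes(cds_exon_lengths):
    """(upstream phase, downstream phase) for every internal exon."""
    phases = intron_phases(cds_exon_lengths)
    return [(phases[i], phases[i + 1]) for i in range(len(phases) - 1)]


def is_symmetric(exon_class):
    """An exon is symmetric when both flanking introns share the same phase."""
    upstream, downstream = exon_class
    return upstream == downstream


# Hypothetical coding lengths of five exons in one gene.
lengths = [100, 90, 62, 120, 50]
print(intron_phases(lengths))          # [1, 1, 0, 0]
print(internal_exon_classes(lengths))  # [(1, 1), (1, 0), (0, 0)]
```

In this example the symmetric exons (1-1 and 0-0) have lengths 90 and 120, both divisible by three, so inserting or removing them shifts no downstream codon; the asymmetric exon has length 62.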
|
22
|
Vendramini D. Noncoding DNA and the teem theory of inheritance, emotions and innate behaviour. Med Hypotheses 2005; 64:512-9. [PMID: 15617858 DOI: 10.1016/j.mehy.2004.08.022] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/16/2004] [Accepted: 08/25/2004] [Indexed: 10/26/2022]
Abstract
The evolutionary function of noncoding 'junk' DNA remains one of the most challenging mysteries of genetics. Here a new model of DNA is proposed to explain this function. The hypothesis asserts the DNA molecule contains not one, but two separate modes of inheritance. In addition to exons that code for proteins and physical traits, it is argued noncoding repetitive elements code for the inheritance of emotions and innate behaviour in metazoans. That is to say, noncoding DNA functions as the medium of a second, hitherto unknown evolutionary process that genetically archives adaptive information, configured as emotions and acquired during the life of an organism, into an inheritable form. This second evolutionary process, here called 'Teemosis', is a selectionist process, but paradoxically, because it does not affect physical traits, it has no maladaptive Lamarckian consequences. The medical implications of the hypothesis are discussed.
|
23
|
Issac B, Raghava GPS. EGPred: prediction of eukaryotic genes using ab initio methods after combining with sequence similarity approaches. Genome Res 2004; 14:1756-66. [PMID: 15342559 PMCID: PMC515322 DOI: 10.1101/gr.2524704] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 02/27/2004] [Accepted: 07/07/2004] [Indexed: 11/24/2022]
Abstract
EGPred is a Web-based server that combines ab initio methods and similarity searches to predict genes, particularly exon regions, with high accuracy. The EGPred program proceeds in the following steps: (1) an initial BLASTX search of genomic sequence against the RefSeq database is used to identify protein hits with an E-value <1; (2) a second BLASTX search of genomic sequence against the hits from the previous run with relaxed parameters (E-values <10) helps to retrieve all probable coding exon regions; (3) a BLASTN search of genomic sequence against the intron database is then used to detect probable intron regions; (4) the probable intron and exon regions are compared to filter/remove wrong exons; (5) the NNSPLICE program is then used to reassign splicing signal site positions in the remaining probable coding exons; and (6) finally ab initio predictions are combined with exons derived from the fifth step based on the relative strength of start/stop and splice signal sites as obtained from ab initio and similarity search. The combination method increases the exon level performance of five different ab initio programs by 4%-10% when evaluated on the HMR195 data set. Similar improvement is observed when ab initio programs are evaluated on the Burset/Guigo data set. Finally, EGPred is demonstrated on an approximately 95-Mbp fragment of human chromosome 13. The list of predicted genes from this analysis is available in the supplementary material. The EGPred program is computationally intensive due to multiple BLAST runs during each analysis. The EGPred server is available at http://www.imtech.res.in/raghava/egpred/.
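Step (4) of the pipeline above, comparing probable intron and exon regions to discard wrong exons, can be sketched as an interval-overlap filter. The abstract does not state the actual criterion EGPred applies, so the overlap rule, the `max_intron_fraction` threshold, the function names, and the coordinates below are all assumptions for illustration only.

```python
def overlap(a, b):
    """Length of the overlap between two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def filter_exons(exons, introns, max_intron_fraction=0.5):
    """Keep candidate exons that are not mostly covered by probable introns.

    Assumes the intron intervals do not overlap one another, so summing
    the pairwise overlaps gives the covered length of each exon.
    The 0.5 threshold is a hypothetical choice, not EGPred's.
    """
    kept = []
    for exon in exons:
        length = exon[1] - exon[0]
        if length <= 0:
            continue  # skip malformed intervals
        covered = sum(overlap(exon, intron) for intron in introns)
        if covered / length <= max_intron_fraction:
            kept.append(exon)
    return kept


# Hypothetical coordinates on a genomic sequence.
candidate_exons = [(100, 200), (300, 400), (500, 530)]
probable_introns = [(200, 300), (390, 520)]
print(filter_exons(candidate_exons, probable_introns))  # [(100, 200), (300, 400)]
```

The third candidate is dropped because 20 of its 30 bases fall inside a probable intron, while the second loses only 10 of 100 bases and is kept.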
Affiliation(s)
- Biju Issac
- Institute of Microbial Technology, Sector 39A, Chandigarh-160036, India
|
24
|
Guarnaccia C, Pintar A, Pongor S. Exon 6 of human Jagged-1 encodes an autonomously folding unit. FEBS Lett 2004; 574:156-60. [PMID: 15358557 DOI: 10.1016/j.febslet.2004.08.022] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 07/12/2004] [Revised: 08/04/2004] [Accepted: 08/05/2004] [Indexed: 10/26/2022]
Abstract
Human Jagged-1 is predicted to contain 16 epidermal growth factor-like (EGF) repeats. The oxidative folding of EGF-2, despite the several conditions tested, systematically led to complex mixtures. A longer peptide spanning the C-terminal part of EGF-1 and the complete EGF-2 repeat, on the contrary, could be readily refolded. This peptide, which corresponds to the entire exon 6 of the Jagged-1 gene, thus represents an autonomously folding unit. We show that it is structured in solution, as suggested by circular dichroism and NMR spectroscopy, and displays an EGF-like disulfide bond topology, as determined by disulfide mapping.
Affiliation(s)
- Corrado Guarnaccia
- International Centre for Genetic Engineering and Biotechnology (ICGEB), Protein Structure and Bioinformatics Group, AREA Science Park, Padriciano 99, I-34012 Trieste, Italy
|
25
|
Bányai L, Patthy L. Evidence that human genes of modular proteins have retained significantly more ancestral introns than their fly or worm orthologues. FEBS Lett 2004; 565:127-32. [PMID: 15135065 DOI: 10.1016/j.febslet.2004.03.088] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 02/13/2004] [Revised: 03/25/2004] [Accepted: 03/26/2004] [Indexed: 11/19/2022]
Abstract
Comparison of the exon-intron structures of human, fly and worm orthologues of mosaic genes assembled from class 1-1 modules by exon-shuffling has revealed that human genes retained significantly more of the original inter-module introns than their protostome orthologues. It is suggested that the much higher rate of intron loss in the worm and insect lineages than in the chordate lineage reflects their greater tendency for genome compaction.
Affiliation(s)
- László Bányai
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, P.O. Box 7, H-1518 Budapest, Hungary
|
26
|
Gopalan V, Tan TW, Lee BTK, Ranganathan S. Xpro: database of eukaryotic protein-encoding genes. Nucleic Acids Res 2004; 32:D59-63. [PMID: 14681359 PMCID: PMC308785 DOI: 10.1093/nar/gkh051] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022] Open
Abstract
Xpro is a relational database that contains all the eukaryotic protein-encoding DNA sequences contained in GenBank with associated data required for the analysis of eukaryotic gene architecture. In addition to the information found in the GenBank records, which includes properties such as sequence, position, length and description about introns, exons and protein-coding regions, Xpro provides annotations on the splice sites and intron phases. Furthermore, Xpro validates intron positions using alignment information between the record's sequence and EST sequences found in dbEST. In the process of validation, alternative splicing information is also obtained and can be found in the database. The intron-containing genes in Xpro are also classified as experimental or predicted based on the intron position validation and specific keywords in the GenBank records that are present in predicted genes. An Entrez-like query system, which is familiar to most biologists, is provided for accessing the information present in the database system. A non-redundant set of Xpro database contents is also obtained by cross-referencing to the Swiss-Prot/TrEMBL and Pfam databases. The database currently contains information for 493,983 genes: 351,918 intron-containing genes and 142,065 intron-less genes. Xpro is updated for each new GenBank release and is freely available via the internet at http://origin.bic.nus.edu.sg/xpro.
Affiliation(s)
- Vivek Gopalan
- Department of Biochemistry, National University of Singapore, Singapore 119260
|
27
|
Xue HY, Forsdyke DR. Low-complexity segments in Plasmodium falciparum proteins are primarily nucleic acid level adaptations. Mol Biochem Parasitol 2003; 128:21-32. [PMID: 12706793 DOI: 10.1016/s0166-6851(03)00039-2] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.1] [Indexed: 11/16/2022]
Abstract
Protein segments that contain few of the possible 20 amino acids, sometimes in tandem repeat arrays, are referred to as containing "simple" or "low-complexity" sequence. Many Plasmodium falciparum proteins are longer than their homologs in other species by virtue of their content of such low-complexity segments that have no known function; these are interspersed among segments of higher complexity to which function can often be ascribed. If there is low complexity at the protein level, there is likely to be low complexity at the corresponding nucleic acid level (departure from equifrequency of the four bases). Thus, low complexity may have been selected primarily at the nucleic acid level and low complexity at the protein level may be secondary. In this case, the amino acid composition of low-complexity segments should reflect forces operating at the nucleic acid level, which include GC-pressure and AG-pressure, more strongly than that of high-complexity segments. Consistent with this, for amino acid determining first and second codon positions, open reading frames containing low-complexity segments show increased contributions to downward GC-pressure (revealed as decreased percentage of G+C) and to upward AG-pressure (revealed as increased percentage of A+G). When not countermanded by high contributions to AG-pressure, low-complexity segments can contribute to base order-dependent fold potential; in this respect, they resemble introns. Thus, in P. falciparum, low-complexity segments appear as adaptations primarily serving nucleic acid level functions.
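The two quantities named in the abstract, the percentage of G+C and the percentage of A+G at the amino-acid-determining first and second codon positions, are simple to compute. The sketch below is one straightforward reading of those definitions; the function name, the handling of trailing incomplete codons, and the toy sequences are assumptions and not the authors' procedure.

```python
def codon_position_composition(orf: str) -> dict:
    """Percent G+C and percent A+G over first and second codon positions.

    Third codon positions and any trailing incomplete codon are ignored.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    bases = [orf[i] for i in range(usable) if i % 3 in (0, 1)]
    if not bases:
        raise ValueError("sequence must contain at least one complete codon")
    total = len(bases)
    gc = sum(1 for base in bases if base in "GC")
    ag = sum(1 for base in bases if base in "AG")
    return {"GC_percent": 100 * gc / total, "AG_percent": 100 * ag / total}


# Hypothetical toy reading frames: AAA AAT GGC and ATG AAT.
print(codon_position_composition("AAAAATGGC"))
print(codon_position_composition("ATGAAT"))
```

For the first toy frame the first and second positions are A, A, A, A, G, G, giving one third G+C and all A+G, the direction of low G+C with high A+G that the abstract reports for low-complexity segments.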
Affiliation(s)
- H Y Xue
- Department of Biochemistry, Queen's University, Kingston, Ont, K7L3N6, Canada
|
28
|
Kaessmann H, Zöllner S, Nekrutenko A, Li WH. Signatures of domain shuffling in the human genome. Genome Res 2002; 12:1642-50. [PMID: 12421750 PMCID: PMC187552 DOI: 10.1101/gr.520702] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.0] [Indexed: 11/24/2022]
Abstract
To elucidate the role of exon shuffling in shaping the complexity of the human genome/proteome, we have systematically analyzed intron phase distributions in the coding sequence of human protein domains. We found that introns at the boundaries of domains show high excess of symmetrical phase combinations (i.e., 0-0, 1-1, and 2-2), whereas nonboundary introns show no excess symmetry. This suggests that exon shuffling has primarily involved rearrangement of structural and functional domains as a whole. Furthermore, we found that domains flanked by phase 1 introns have dramatically expanded in the human genome due to domain shuffling and that 1-1 symmetrical domains and domain families are nonrandomly distributed with respect to their age. The predominance and extracellular location of 1-1 symmetrical domains among domains specific to metazoans suggests that they are associated with the rise of multicellularity. On the other hand, 0-0 symmetrical domains tend to be over-represented among ancient protein domains that are shared between the eukaryotic and prokaryotic kingdoms, which is compatible with the suggestion of primordial domain shuffling in the progenote. To see whether the human data reflect general genomic patterns of metazoans, similar analyses were done for the nematode Caenorhabditis elegans. Although the C. elegans data generally concur with the human patterns, we identified fewer intron-bounded domains in this organism, consistent with the lower complexity of C. elegans genes. [The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: Z. Gu and R. Stevens.]
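An "excess of symmetrical phase combinations" can be illustrated by comparing observed counts of 0-0, 1-1, and 2-2 domains with the counts expected if the 5' and 3' intron phases were independent. The abstract does not say which statistical test the authors used, so the independence null model, the function name, and the phase pairs below are assumptions of this sketch.

```python
from collections import Counter


def symmetry_excess(phase_pairs):
    """Observed and expected counts of each symmetric phase combination.

    phase_pairs holds (5' intron phase, 3' intron phase) per domain.
    Expected counts assume the two flanking phases are independent.
    """
    pairs = list(phase_pairs)
    n = len(pairs)
    if n == 0:
        raise ValueError("at least one phase pair is required")
    five_prime = Counter(a for a, _ in pairs)
    three_prime = Counter(b for _, b in pairs)
    observed = Counter(pairs)
    result = {}
    for phase in (0, 1, 2):
        expected = five_prime[phase] * three_prime[phase] / n
        result[(phase, phase)] = (observed[(phase, phase)], expected)
    return result


# Hypothetical flanking phases for ten intron-bounded domains.
pairs = [(1, 1)] * 6 + [(0, 0)] * 2 + [(0, 1), (2, 0)]
for combo, (obs, exp) in symmetry_excess(pairs).items():
    print(combo, "observed", obs, "expected", round(exp, 2))
```

In this toy data set the 1-1 combination is observed 6 times against an expectation of 4.2, the kind of surplus the abstract refers to; a real analysis would add a significance test.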
Affiliation(s)
- Henrik Kaessmann
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA.
|