1
Pseudogenes: Four Decades of Discovery. Methods Mol Biol 2021. [PMID: 34165705 DOI: 10.1007/978-1-0716-1503-4_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/10/2023]
Abstract
A pseudogene is defined as a genomic DNA sequence that looks like a mutated or truncated version of a known functional gene. Nearly four decades since their first discovery it has been estimated that between ~12,000 and ~20,000 pseudogenes exist in the human genome. Early efforts to characterize functions for pseudogenes were unsuccessful, thus they were considered functionless relics of evolutionary selection, junk DNA or genetic fossils. Remarkably, an increasing number of pseudogenes have been reported to be expressed as RNA transcripts above and beyond levels considered accidental or spurious transcription. There is emerging evidence that some expressed pseudogene transcripts have biological functions and should be defined as a subclass of functional long noncoding RNAs (lncRNA). In this introductory chapter, I briefly summarize the history and the current knowledge of pseudogenes, and highlight the emerging functions of some pseudogenes in human biology and disease. This second iteration of Pseudogenes in Methods in Molecular Biology highlights new methodological approaches to investigate this intriguing family of lncRNAs and the extent of their biological function.
2
Stephens Z, Milosevic D, Kipp B, Grebe S, Iyer RK, Kocher JPA. PB-Motif-A Method for Identifying Gene/Pseudogene Rearrangements With Long Reads: An Application to CYP21A2 Genotyping. Front Genet 2021; 12:716586. [PMID: 34394200 PMCID: PMC8355628 DOI: 10.3389/fgene.2021.716586] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/28/2021] [Accepted: 07/05/2021] [Indexed: 12/30/2022]
Abstract
Long read sequencing technologies have the potential to accurately detect and phase variation in genomic regions that are difficult to fully characterize with conventional short read methods. These difficult to sequence regions include several clinically relevant genes with highly homologous pseudogenes, many of which are prone to gene conversions or other types of complex structural rearrangements. We present PB-Motif, a new method for identifying rearrangements between two highly homologous genomic regions using PacBio long reads. PB-Motif leverages clustering and filtering techniques to efficiently report rearrangements in the presence of sequencing errors and other systematic artifacts. Supporting reads for each high-confidence rearrangement can then be used for copy number estimation and phased variant calling. First, we demonstrate PB-Motif's accuracy with simulated sequence rearrangements of PMS2 and its pseudogene PMS2CL using simulated reads sweeping over a range of sequencing error rates. We then apply PB-Motif to 26 clinical samples, characterizing CYP21A2 and its pseudogene CYP21A1P as part of a diagnostic assay for congenital adrenal hyperplasia. We successfully identify damaging variation and patient carrier status concordant with clinical diagnosis obtained from multiplex ligation-dependent amplification (MLPA) and Sanger sequencing. The source code is available at: github.com/zstephens/pb-motif.
Affiliation(s)
- Zachary Stephens
- Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, Urbana, IL, United States
- Ravishankar K Iyer
- Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, Urbana, IL, United States
3
Khan AA, Ali MS, Babar F, Fatima A, Shafqat MA, Asghar B, Ilyas N, Fatima M, Liaqat A, Gondal MA. Lack of CpG islands in human unitary pseudogenes and its implication. Mamm Genome 2021; 32:443-447. [PMID: 34272576 DOI: 10.1007/s00335-021-09893-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2021] [Accepted: 07/07/2021] [Indexed: 11/24/2022]
Abstract
CpG islands (CGIs) are aggregations of CpG dinucleotides in the promoters of mammalian genes. These CGIs are present in almost all housekeeping genes and some tissue-specific genes in the mammalian genome. Extensive research has been done on the prevalence and role of CGIs in protein-coding genes. However, little is known about CGIs in pseudogenes. In the current research project, we focused on CGIs in the three main classes of pseudogenes: duplicated pseudogenes (DPGs), processed pseudogenes (PPGs), and unitary pseudogenes (UPGs). We discovered a predominant absence of CGIs in the promoters of all three classes. We also compared the CGI profile of these pseudogenes with that of their parent genes and found that UPGs differ from DPGs and PPGs: in the latter two, the lack of CGIs is a consequence of pseudogenization, whereas in UPGs the lack of CGIs in their promoters is not a result of the pseudogenization process. We also discuss the implications of the results obtained from this comparison. To our knowledge, this is the first study highlighting this aspect of UPGs, offering new insights into genome evolution in general and into pseudogenes in particular.
Affiliation(s)
- Ammad Aslam Khan
- Department of Bioinformatics and Computational Biology, Virtual University, Lahore, 547 92, Pakistan
- Muhammad Shahryar Ali
- Department of Bioinformatics and Computational Biology, Virtual University, Lahore, 547 92, Pakistan
- Farah Babar
- Department of Bioinformatics and Computational Biology, Virtual University, Lahore, 547 92, Pakistan
- Anees Fatima
- Department of Bioinformatics and Computational Biology, Virtual University, Lahore, 547 92, Pakistan
- Muhammad Awais Shafqat
- Department of Bioinformatics and Computational Biology, Virtual University, Lahore, 547 92, Pakistan
- Bisma Asghar
- Department of Bioinformatics and Computational Biology, Virtual University, Lahore, 547 92, Pakistan
- Nimra Ilyas
- Department of Bioinformatics and Computational Biology, Virtual University, Lahore, 547 92, Pakistan
- Maheen Fatima
- Department of Bioinformatics and Computational Biology, Virtual University, Lahore, 547 92, Pakistan
- Ayesha Liaqat
- Department of Bioinformatics and Computational Biology, Virtual University, Lahore, 547 92, Pakistan
4
Tian R, Geng Y, Guo H, Yang C, Seim I, Yang G. Comparative analysis of the superoxide dismutase gene family in Cetartiodactyla. J Evol Biol 2021; 34:1046-1060. [PMID: 33896059 DOI: 10.1111/jeb.13792] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/13/2021] [Revised: 03/29/2021] [Accepted: 04/16/2021] [Indexed: 12/18/2022]
Abstract
Cetacea (whales, dolphins and porpoises) form an order of mammals adapted to aquatic life. Their transition to an aquatic habitat resulted in exceptional protection against cellular insults, including oxidative and osmotic stress. Here, we considered the structure and molecular evolution of the superoxide dismutase (SOD) gene family, which encodes essential enzymes in the mammalian antioxidant system, in the superorder Cetartiodactyla. To this end, we juxtaposed cetaceans and their closest extant relatives (order Artiodactyla). We identified 94 genes in 23 species, of which 70 are bona fide intact genes. Although the SOD gene family is conserved in Cetartiodactyla, lineage-specific gene duplications and deletions were observed. Phylogenetic analyses show that the SOD2 subfamily diverged from a clade containing SOD1 and SOD3, suggesting that cytoplasmic, extracellular and mitochondrial SODs have started down independent evolutionary paths. Specific amino acid changes (e.g. K130N in SOD2) that may enhance ROS elimination were identified in cetaceans. In silico analysis suggests that the core transcription factor repertoire of cetartiodactyl SOD genes may include Sp1, NF-κB, Nrf2 and AHR. Putative transcription factor binding sites responding to hypoxia (e.g. Suppressor of Hairless; Su(H)) were found in the cetacean SOD1 gene. We found significant evidence for positive selection in cetaceans using codon models. Cetaceans with different diving abilities also show divergent evolution of SOD1 and SOD2. Our genome-wide analysis of SOD genes helps clarify their relationship and evolutionary trajectory and identify putative functional changes in cetaceans.
Affiliation(s)
- Ran Tian
- Integrative Biology Laboratory, College of Life Sciences, Nanjing Normal University, Nanjing, China; Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Yuepan Geng
- Integrative Biology Laboratory, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Han Guo
- Integrative Biology Laboratory, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Chen Yang
- Integrative Biology Laboratory, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Inge Seim
- Integrative Biology Laboratory, College of Life Sciences, Nanjing Normal University, Nanjing, China; School of Biology and Environmental Science, Queensland University of Technology, Brisbane, QLD, Australia
- Guang Yang
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
5
Transcriptional activity and strain-specific history of mouse pseudogenes. Nat Commun 2020; 11:3695. [PMID: 32728065 PMCID: PMC7392758 DOI: 10.1038/s41467-020-17157-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 08/02/2018] [Accepted: 06/08/2020] [Indexed: 01/07/2023]
Abstract
Pseudogenes are ideal markers of genome remodelling. In turn, the mouse is an ideal platform for studying them, particularly with the recent availability of strain-sequencing and transcriptional data. Here, combining both manual curation and automatic pipelines, we present a genome-wide annotation of the pseudogenes in the mouse reference genome and 18 inbred mouse strains (available via the mouse.pseudogene.org resource). We also annotate 165 unitary pseudogenes in mouse and 303 in human. The overall pseudogene repertoire in mouse is similar to that in human in terms of size, biotype distribution, and family composition (e.g. with GAPDH and ribosomal proteins being the largest families). Notable differences arise in the pseudogene age distribution, with multiple retro-transpositional bursts in mouse evolutionary history and only one in human. Furthermore, in each strain about a fifth of all pseudogenes are unique, reflecting strain-specific evolution. Finally, we find that ~15% of the mouse pseudogenes are transcribed, and that highly transcribed parent genes tend to give rise to many processed pseudogenes.
6
Sen K, Sarkar A, Maji RK, Ghosh Z, Gupta S, Ghosh TC. Deciphering the cross-talking of human competitive endogenous RNAs in K562 chronic myelogenous leukemia cell line. Mol Biosyst 2016; 12:3633-3642. [DOI: 10.1039/c6mb00568c] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/11/2022]
Abstract
Chronic myelogenous leukemia (CML) is a myeloproliferative disorder characterized by increased proliferation or abnormal accumulation of the granulocytic cell line without the depletion of their capacity to differentiate.
Affiliation(s)
- Kamalika Sen
- Bioinformatics Centre, Bose Institute, Kolkata-700 054, India
- Zhumur Ghosh
- Bioinformatics Centre, Bose Institute, Kolkata-700 054, India
- Sanjib Gupta
- Bioinformatics Centre, Bose Institute, Kolkata-700 054, India
7
Porter KA, Duffy EB, Nyland P, Atianand MK, Sharifi H, Harton JA. The CLRX.1/NOD24 (NLRP2P) pseudogene codes a functional negative regulator of NF-κB, pyrin-only protein 4. Genes Immun 2014; 15:392-403. [PMID: 24871464 DOI: 10.1038/gene.2014.30] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 02/12/2014] [Revised: 04/30/2014] [Accepted: 05/02/2014] [Indexed: 01/06/2023]
Abstract
Pseudogenes are duplicated yet defunct copies of functional parent genes. However, some pseudogenes have gained or retained function. In this study, we consider a functional role for the NLRP2-related, higher primate-specific, processed pseudogene NLRP2P, which is closely related to Pyrin-only protein 2 (POP2/PYDC2), a regulator of nuclear factor-κB (NF-κB) and the inflammasome. The NLRP2P open-reading frame on chromosome X has features consistent with a processed pseudogene (retrotransposon), yet encodes a 45-amino-acid, Pyrin-domain-related protein. The open-reading frame of NLRP2P shares 80% identity with POP2 and is under purifying selection across Old World primates. Although widely expressed, NLRP2P messenger RNA is upregulated by lipopolysaccharide in human monocytic cells. Functionally, NLRP2P impairs NF-κB p65 transactivation by reducing activating phosphorylation of RelA/p65. Reminiscent of POP2, NLRP2P reduces production of the NF-κB-dependent cytokines tumor necrosis factor alpha and interleukin (IL)-6 following toll-like receptor stimulation. In contrast to POP2, NLRP2P fails to inhibit the ASC-dependent NLRP3 inflammasome. In addition, beyond regulating cytokine production, NLRP2P has a potential role in cell cycle regulation and cell death. Collectively, our findings suggest that NLRP2P is a resurrected processed pseudogene that regulates NF-κB RelA/p65 activity and thus represents the newest member of the POP family, POP4.
Affiliation(s)
- K A Porter
- Center for Immunology and Microbial Disease, Albany Medical College, Albany, NY, USA
- E B Duffy
- Center for Immunology and Microbial Disease, Albany Medical College, Albany, NY, USA
- P Nyland
- Center for Immunology and Microbial Disease, Albany Medical College, Albany, NY, USA
- M K Atianand
- Center for Immunology and Microbial Disease, Albany Medical College, Albany, NY, USA
- H Sharifi
- Center for Immunology and Microbial Disease, Albany Medical College, Albany, NY, USA
- J A Harton
- Center for Immunology and Microbial Disease, Albany Medical College, Albany, NY, USA
8
Enhanced evolution by stochastically variable modification of epigenetic marks in the early embryo. Proc Natl Acad Sci U S A 2014; 111:6353-8. [PMID: 24733912 DOI: 10.1073/pnas.1402585111] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 12/17/2022]
Abstract
Evolution by gene duplication is generally accepted as one of the crucial driving forces for the gain of new complexity and functions, but the formation of pseudogenes remains a problem for this mechanism. Here we expand on earlier ideas that epigenetic modifications can drive neo- and subfunctionalization in evolution by gene duplication. We explore the effects of stochastic epigenetic modifications on the evolution (and thus development) of complex organisms in a constant environment. Modeling is done both using a modified genetic drift analytical treatment and computer simulations, which were found to agree. A transposon silencing model is also explored. Some key assumptions made include (i) stochastic, incomplete removal (or addition) of repressive epigenetic marks takes place during a window(s) of opportunity in the zygote and early embryo; (ii) there is no statistical variation of the marks after the window closes; and (iii) the genes affected are sensitive to dosage. Our genetic drift treatment takes into account that after gene duplication the prevailing case upon which selection operates is a duplicate/singlet heterozygote; to the best of our knowledge, this has not been considered in previous treatments. We conclude from our modeling that stochastic epigenetic modifications, with rates consistent with experimental observation, can both increase the rate of gene fixation and decrease pseudogenization, thus dramatically improving the efficacy of evolution by gene duplication. We also find that a transposon silencing model is advantageous for fixation of recessive genes in diploid organisms, especially with large effective population sizes.
9
Sen K, Ghosh TC. Pseudogenes and their composers: delving in the 'debris' of human genome. Brief Funct Genomics 2013; 12:536-47. [PMID: 23900003 DOI: 10.1093/bfgp/elt026] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 01/25/2023]
Abstract
Pseudogenes, the nonfunctional homologs of functional genes and thus exemplified as 'genomic fossils', provide intriguing snapshots of the evolutionary history of the human genome. These defunct copies generally arise by retrotransposition or duplication followed by various genetic disablements. In this study, focusing on human pseudogenes and their functional homologs, we describe their characteristic features and relevance to protein sequence evolution. We recapitulate that pseudogenes harbor disease-causing degenerative sequence variations in conjunction with the immense disease gene association of their progenitors. Furthermore, we discuss the issue of functional resurrection and the potential observed in some pseudogenes to regulate their functional counterparts.
Affiliation(s)
- Kamalika Sen
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India. Tel.: +91 33 2355 6626; Fax: +91 33 2355 3886
10
Wang L, Si W, Yao Y, Tian D, Araki H, Yang S. Genome-wide survey of pseudogenes in 80 fully re-sequenced Arabidopsis thaliana accessions. PLoS One 2012; 7:e51769. [PMID: 23272162 PMCID: PMC3521719 DOI: 10.1371/journal.pone.0051769] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 07/12/2012] [Accepted: 11/07/2012] [Indexed: 11/18/2022]
Abstract
Pseudogenes (Ψs), including processed and non-processed Ψs, are ubiquitous genetic elements derived from originally functional genes in all studied genomes within the three kingdoms of life. However, systematic surveys of non-processed Ψs utilizing genomic information from multiple samples within a species are still rare. Here a systematic comparative analysis was conducted of Ψs within 80 fully re-sequenced Arabidopsis thaliana accessions, and 7546 genes, representing ∼28% of the genomic annotated open reading frames (ORFs), were found with disruptive mutations in at least one accession. The distribution of these Ψs on chromosomes showed a significantly negative correlation between Ψs/ORFs and their local gene densities, suggesting a higher proportion of Ψs in gene desert regions, e.g. near centromeres. On the other hand, compared with the non-Ψ loci, even the intact coding sequences (CDSs) in the Ψ loci were found to have shorter CDS length, fewer exons and lower GC content. In addition, a significant functional bias against the null hypothesis was detected in the Ψs mainly involved in responses to environmental stimuli and biotic stress as reported, suggesting that they are likely important for adaptive evolution to rapidly changing environments by pseudogenization to accumulate successive mutations.
Affiliation(s)
- Long Wang
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Weina Si
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Yongfang Yao
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Dacheng Tian
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Hitoshi Araki
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Center of Ecology, Evolution and Biogeochemistry, Kastanienbaum, Switzerland
- Sihai Yang
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
11
Sen K, Ghosh TC. Evolutionary conservation and disease gene association of the human genes composing pseudogenes. Gene 2012; 501:164-70. [PMID: 22521745 DOI: 10.1016/j.gene.2012.04.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/14/2011] [Revised: 02/09/2012] [Accepted: 04/05/2012] [Indexed: 01/16/2023]
Abstract
Pseudogenes, the 'genomic fossils', present a portrayal of the evolutionary history of the human genome. The human genes configuring pseudogenes are also now coming forth as important resources in the study of human protein evolution. In this communication, we explored the evolutionary conservation of genes forming pseudogenes relative to genes lacking any pseudogene and, delving deeper, probed an evolutionary rate difference between the disease genes in the two groups. We illustrated this differential evolutionary pattern by gene expressivity, the number of regulatory miRNAs targeting each gene, the abundance of protein complex forming genes and a lower percentage of protein intrinsic disorder. Furthermore, pseudogenes are observed to harbor sequence variations, over their entirety, that become degenerative disease-causing mutations, though the disease involvement of their progenitors is still unexplored. Here, we unveiled an immense association of disease genes with the genes casting pseudogenes in human. We interpreted the issue by disease associated miRNA targeting, genes containing polymorphisms in miRNA target sites, abundance of genes having disease causing non-synonymous mutations, disease gene specific network properties, presence of genes having repeat regions, affluence of dosage sensitive genes and the presence of intrinsically unstructured protein regions.
Affiliation(s)
- Kamalika Sen
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
12
Podder S, Ghosh TC. Evolutionary dynamics of human autoimmune disease genes and malfunctioned immunological genes. BMC Evol Biol 2012; 12:10. [PMID: 22276655 PMCID: PMC3347981 DOI: 10.1186/1471-2148-12-10] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/29/2011] [Accepted: 01/25/2012] [Indexed: 02/01/2023]
Abstract
Background: One of the main issues of molecular evolution is to divulge the principles in dictating the evolutionary rate differences among various gene classes. Immunological genes have received considerable attention in evolutionary biology as candidates for local adaptation and for studying functionally important polymorphisms. The normal structure and function of immunological genes will be distorted when they experience mutations leading to immunological dysfunctions. Results: Here, we examined the fundamental differences between the genes which on mutation give rise to autoimmune or other immune system related diseases and the immunological genes that do not cause any disease phenotypes. Although the disease genes examined are analogous to non-disease genes in product, expression, function, and pathway affiliation, a statistically significant decrease in evolutionary rate has been found in autoimmune disease genes relative to all other immune related diseases and non-disease genes. Possible ways of accumulation of mutation in the three steps of the central dogma (DNA-mRNA-Protein) have been studied to trace the mutational effects predisposed to disease consequence and acquiring higher selection pressure. Principal Component Analysis and Multivariate Regression Analysis have established the predominant role of single nucleotide polymorphisms in guiding the evolutionary rate of immunological disease and non-disease genes followed by m-RNA abundance, paralogs number, fraction of phosphorylation residue, alternatively spliced exon, protein residue burial and protein disorder. Conclusions: Our study provides an empirical insight into the etiology of autoimmune disease genes and other immunological diseases. The immediate utility of our study is to help in disease gene identification and may also help in medicinal improvement of immune related disease.
13
Sen K, Podder S, Ghosh TC. On the quest for selective constraints shaping the expressivity of the genes casting retropseudogenes in human. BMC Genomics 2011; 12:401. [PMID: 21824418 PMCID: PMC3162935 DOI: 10.1186/1471-2164-12-401] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 12/31/2010] [Accepted: 08/08/2011] [Indexed: 02/04/2023]
Abstract
Background: Pseudogenes, the nonfunctional homologues of functional genes, are now coming to light as important resources for the study of human protein evolution. Processed pseudogenes, arising by reverse transcription and reinsertion, can provide a molecular record of the dynamics and evolution of genomes. Research on the progenitors of human processed pseudogenes has revealed their highly expressed and evolutionarily conserved character. They are reported to be short and GC-poor, indicating their high efficiency for retrotransposition. In this article we focused on their high expressivity and explored the factors contributing to it and their relevance to protein sequence evolution. Results: We analyzed the high expressivity of the genes configuring processed pseudogenes (retropseudogenes) in terms of their immense connectivity in the protein-protein interaction network, an inclination towards alternative splicing, a lower rate of mRNA disintegration and a slower evolutionary rate. Meanwhile, the unusual trend of elevated disorder alongside the high expressivity of the proteins encoded by processed pseudogene ancestors is attributed to a predominance of hub-protein encoding genes, a high propensity of repeat sequence containing genes, elevated protein stability and the functional constraint to perform transcription regulatory jobs. Linear regression analysis identifies mRNA decay rate and protein intrinsic disorder as the influential factors controlling the expressivity of these retropseudogene ancestors, with the latter having the most significant regulatory power. Conclusions: Our findings imply that the abundance of disordered regions, which elevates network attachment and involvement in important cellular assignments, and stability at the transcriptional level act as the prevailing forces behind the high expressivity of the human genes configuring processed pseudogenes.
Affiliation(s)
- Kamalika Sen
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India