1
Graham LA, Davies PL. Fish antifreeze protein origin in sculpins by frameshifting within a duplicated housekeeping gene. FEBS J 2024; 291:4043-4061. [PMID: 38923815 DOI: 10.1111/febs.17205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 03/25/2024] [Accepted: 06/07/2024] [Indexed: 06/28/2024]
Abstract
Antifreeze proteins (AFPs) are found in a variety of marine cold-water fishes where they prevent freezing by binding to nascent ice crystals. Their diversity (types I, II, III and antifreeze glycoproteins), as well as their scattered taxonomic distribution hint at their complex evolutionary history. In particular, type I AFPs appear to have arisen in response to the Late Cenozoic Ice Age that began ~ 34 million years ago via convergence in four different groups of fish that diverged from lineages lacking this AFP. The progenitor of the alanine-rich α-helical type I AFPs of sculpins has now been identified as lunapark, an integral membrane protein of the endoplasmic reticulum. Following gene duplication and loss of all but three of the 15 exons, the final exon, which encoded a glutamate- and glutamine-rich segment, was converted to an alanine-rich sequence by a combination of frameshifting and mutation. Subsequent gene duplications produced numerous isoforms falling into four distinct groups. The origin of the flounder type I AFP is quite different. Here, a small segment from the original antiviral protein gene was amplified and the rest of the coding sequence was lost, while the gene structure was largely retained. The independent origins of type I AFPs with up to 83% sequence identity in flounder and sculpin demonstrate strong convergent selection at the level of protein sequence for alanine-rich single alpha helices that bind to ice. Recent acquisition of these AFPs has allowed sculpins to occupy icy seawater niches with reduced competition and predation from other teleost species.
Affiliation(s)
- Laurie A Graham
- Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Canada
- Peter L Davies
- Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Canada
2
Zhao Q, Zheng Y, Li Y, Shi L, Zhang J, Ma D, You M. An Orphan Gene Enhances Male Reproductive Success in Plutella xylostella. Mol Biol Evol 2024; 41:msae142. [PMID: 38990889 PMCID: PMC11290247 DOI: 10.1093/molbev/msae142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 06/28/2024] [Accepted: 07/05/2024] [Indexed: 07/13/2024]
Abstract
Plutella xylostella exhibits exceptional reproductive ability, yet the genetic basis underlying this high reproductive capacity remains unknown. Here, we demonstrate that an orphan gene, lushu, which encodes a sperm protein, plays a crucial role in male reproductive success. The lushu gene is located on the Z chromosome and is prevalent across different P. xylostella populations worldwide. We generated lushu mutants using a transgenic CRISPR/Cas9 system. Knockout of lushu results in reduced male mating efficiency and accelerated death in adult males. Furthermore, lushu deficiency reduced the transfer of sperm from males to females, potentially hindering sperm competition. Additionally, the knockout of lushu disrupts gene expression in energy-related pathways and elevates insulin levels in adult males. Our findings reveal that male reproductive performance has evolved through the birth of a newly evolved, lineage-specific gene with considerable potential to enhance fecundity. These insights have valuable implications for identifying targets for genetic control, particularly in relation to species-specific traits that are pivotal in determining high levels of fecundity.
Affiliation(s)
- Qian Zhao
- State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Ministerial and Provincial Joint Innovation Centre for Safety Production of Cross-Strait Crops, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Joint International Research Laboratory of Ecological Pest Control, Ministry of Education, Fuzhou 350002, China
- Yahong Zheng
- State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Yiying Li
- State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Lingping Shi
- State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Jing Zhang
- State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Ministerial and Provincial Joint Innovation Centre for Safety Production of Cross-Strait Crops, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Joint International Research Laboratory of Ecological Pest Control, Ministry of Education, Fuzhou 350002, China
- Dongna Ma
- State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Minsheng You
- State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Ministerial and Provincial Joint Innovation Centre for Safety Production of Cross-Strait Crops, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Joint International Research Laboratory of Ecological Pest Control, Ministry of Education, Fuzhou 350002, China
3
Ardern Z. Alternative Reading Frames are an Underappreciated Source of Protein Sequence Novelty. J Mol Evol 2023; 91:570-580. [PMID: 37326679 DOI: 10.1007/s00239-023-10122-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Accepted: 05/31/2023] [Indexed: 06/17/2023]
Abstract
Protein-coding DNA sequences can be translated into completely different amino acid sequences if the nucleotide triplets used are shifted by a non-triplet amount on the same DNA strand or by translating codons from the opposite strand. Such "alternative reading frames" of protein-coding genes are a major contributor to the evolution of novel protein products. Recent studies demonstrating this include examples across the three domains of cellular life and in viruses. These sequences increase the number of trials potentially available for the evolutionary invention of new genes and also have unusual properties which may facilitate gene origin. There is evidence that the structure of the standard genetic code contributes to the features and gene-likeness of some alternative frame sequences. These findings have important implications across diverse areas of molecular biology, including for genome annotation, structural biology, and evolutionary genomics.
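To make the idea of alternative reading frames concrete, the following minimal Python sketch translates one DNA string in all three forward frames and in the three frames of the reverse complement. It assumes the standard genetic code; the function names (`translate`, `reverse_complement`, `six_frames`) and the example sequence are illustrative and do not come from the article.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, offset=0):
    """Translate complete codons of `dna`, starting at position `offset`."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frames(dna):
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    rc = reverse_complement(dna)
    frames = {}
    for k in range(3):
        frames[f"+{k + 1}"] = translate(dna, k)
        frames[f"-{k + 1}"] = translate(rc, k)
    return frames


if __name__ == "__main__":
    # Hypothetical 15-nt sequence; frame +1 encodes a Glu/Gln-containing peptide.
    for label, peptide in sorted(six_frames("ATGGCCGAAGAGCAG").items()):
        print(label, peptide)
```

Shifting the start by one or two nucleotides, or switching strands, yields unrelated peptides from the same DNA, which is the property the abstract describes.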
4
Champagne J, Pataskar A, Blommaert N, Nagel R, Wernaart D, Ramalho S, Kenski J, Bleijerveld OB, Zaal EA, Berkers CR, Altelaar M, Peeper DS, Faller WJ, Agami R. Oncogene-dependent sloppiness in mRNA translation. Mol Cell 2021; 81:4709-4721.e9. [PMID: 34562372 DOI: 10.1016/j.molcel.2021.09.002] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 02/22/2021] [Revised: 07/13/2021] [Accepted: 08/30/2021] [Indexed: 12/15/2022]
Abstract
mRNA translation is a highly conserved and tightly controlled mechanism for protein synthesis. Despite protein quality control mechanisms, amino acid shortage in melanoma induces aberrant proteins by ribosomal frameshifting. The extent and the underlying mechanisms related to this phenomenon are yet unknown. Here, we show that tryptophan depletion-induced ribosomal frameshifting is a widespread phenomenon in cancer. We termed this event sloppiness and strikingly observed its association with MAPK pathway hyperactivation. Sloppiness is stimulated by RAS activation in primary cells, suppressed by pharmacological inhibition of the oncogenic MAPK pathway in sloppy cells, and restored in cells with acquired resistance to MAPK pathway inhibition. Interestingly, sloppiness causes aberrant peptide presentation at the cell surface, allowing recognition and specific killing of drug-resistant cancer cells by T lymphocytes. Thus, while oncogenes empower cancer progression and aggressiveness, they also expose a vulnerability by provoking the production of aberrant peptides through sloppiness.
Affiliation(s)
- Julien Champagne
- Division of Oncogenomics, Oncode Institute, the Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, the Netherlands
- Abhijeet Pataskar
- Division of Oncogenomics, Oncode Institute, the Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, the Netherlands
- Naomi Blommaert
- Division of Oncogenomics, Oncode Institute, the Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, the Netherlands
- Remco Nagel
- Division of Oncogenomics, Oncode Institute, the Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, the Netherlands
- Demi Wernaart
- Division of Oncogenomics, Oncode Institute, the Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, the Netherlands
- Sofia Ramalho
- Division of Oncogenomics, Oncode Institute, the Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, the Netherlands
- Juliana Kenski
- Division of Molecular Oncology & Immunology, Oncode Institute, The Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, the Netherlands
- Onno B Bleijerveld
- Proteomics Facility, The Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, the Netherlands
- Esther A Zaal
- Department of Biochemistry and Cell Biology, Faculty of Veterinary Medicine, Utrecht University, Utrecht, the Netherlands; Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research, Utrecht Institute for Pharmaceutical Sciences, Utrecht University and Netherlands Proteomics Centre, Utrecht, the Netherlands
- Celia R Berkers
- Department of Biochemistry and Cell Biology, Faculty of Veterinary Medicine, Utrecht University, Utrecht, the Netherlands; Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research, Utrecht Institute for Pharmaceutical Sciences, Utrecht University and Netherlands Proteomics Centre, Utrecht, the Netherlands
- Maarten Altelaar
- Proteomics Facility, The Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, the Netherlands; Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research, Utrecht Institute for Pharmaceutical Sciences, Utrecht University and Netherlands Proteomics Centre, Utrecht, the Netherlands
- Daniel S Peeper
- Division of Molecular Oncology & Immunology, Oncode Institute, The Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, the Netherlands
- William J Faller
- Division of Oncogenomics, Oncode Institute, the Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, the Netherlands
- Reuven Agami
- Division of Oncogenomics, Oncode Institute, the Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, the Netherlands; Erasmus MC, Rotterdam University, Rotterdam, the Netherlands.
5
Gao X, Li Y, Adetula AA, Wu Y, Chen H. Analysis of new retrogenes provides insight into dog adaptive evolution. Ecol Evol 2019; 9:11185-11197. [PMID: 31641464 PMCID: PMC6802060 DOI: 10.1002/ece3.5620] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/12/2018] [Revised: 08/07/2019] [Accepted: 08/08/2019] [Indexed: 01/01/2023]
Abstract
The origin and subsequent evolution of new genes have been considered an important source of genetic and phenotypic diversity in organisms. Dog breeds show great phenotypic diversity for morphological, physiological, and behavioral traits. However, the contributions of newly originated retrogenes, which provide important genetic bases for dog species differentiation and adaptive traits, are largely unknown. Here, we analyzed the dog genome to identify new RNA-based duplications and comprehensively investigated their origin, evolution, functions in adaptive traits, and gene movement processes. First, we identified a total of 3,025 retrocopies, including 476 intact retrogenes, 2,518 retropseudogenes, and 31 chimerical retrogenes. Second, selective pressure analysis together with EST expression analysis showed that most of the intact retrogenes were under significantly stronger purifying selection and subject to more functional constraints than retropseudogenes. Furthermore, a large number of retrocopies and chimerical retrogenes that arose approximately 22 million years ago implied a burst of retrotransposition in the dog genome after the divergence of the dog from its closely related species, the red fox. Interestingly, GO and pathway analyses showed that new retrogenes had expanded in the glutathione biosynthetic/metabolic process, which likely provided an important genetic basis for dogs' adaptation to scavenging human waste dumps. Finally, consistent with the results in human and mouse, a significant excess of functional retrogene movement on and off the X chromosome in the dog confirmed a general pattern of gene movement in mammals, which was likely driven by natural selection or sexual antagonism. Together, these results increase our understanding of how new retrogenes can reshape the dog genome and support further exploration of the molecular mechanisms underlying dogs' adaptive evolution.
Affiliation(s)
- Xiang Gao
- Center Laboratory, Renmin Hospital of Wuhan University, Wuhan, China
- Yan Li
- Department of Infectious Diseases, Zhongnan Hospital of Wuhan University, Wuhan, China
- Adeyinka A. Adetula
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction, Huazhong Agricultural University, Wuhan, China
- Yu Wu
- Oilfield Community D-1-902, Wuhan, China
- Hong Chen
- Department of Scientific Research, Renmin Hospital of Wuhan University, Wuhan, China
6
Suvorova YM, Korotkova MA, Skryabin KG, Korotkov EV. Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes. DNA Res 2019; 26:157-170. [PMID: 30726896 PMCID: PMC6476729 DOI: 10.1093/dnares/dsy046] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/27/2018] [Accepted: 12/07/2018] [Indexed: 01/01/2023]
Abstract
A new mathematical method for detecting potential reading frameshifts in protein-coding sequences (cds) was developed. The algorithm is adjusted to the triplet periodicity of each analysed sequence using dynamic programming and a genetic algorithm, and it does not require any preliminary training. Using the developed method, cds from the Arabidopsis thaliana genome were analysed. In total, the algorithm found 9,930 sequences containing one or more potential reading frameshifts, which is ∼21% of all analysed sequences of the genome. The Type I and Type II error rates were estimated as 11% and 30%, respectively. Similar results were obtained for the genomes of Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, Rattus norvegicus and Xenopus tropicalis. The developed algorithm was also tested on 17 bacterial genomes, and the results were compared with previously obtained data on the search for potential reading frameshifts in these genomes. The study discusses the possibility that the reading frameshift is a relatively frequent mutation that could participate in the creation of new genes and proteins.
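As a rough illustration of what "triplet periodicity" means, the sketch below counts how often each nucleotide occurs at each of the three codon positions. It is not the authors' method, which uses dynamic programming and a genetic algorithm; the function name `periodicity_matrix` and the toy sequence are assumptions made for this example.

```python
def periodicity_matrix(seq, phase=0):
    """Count each base at codon positions 0, 1 and 2, reading from `phase`.

    Returns a list of three dicts (one per codon position) mapping
    A/C/G/T to counts. Characters other than A, C, G, T are ignored.
    """
    counts = [{base: 0 for base in "ACGT"} for _ in range(3)]
    for i, base in enumerate(seq[phase:]):
        if base in counts[i % 3]:
            counts[i % 3][base] += 1
    return counts


if __name__ == "__main__":
    toy = "ATGATGATG"  # hypothetical, perfectly periodic sequence
    in_frame = periodicity_matrix(toy)
    shifted = periodicity_matrix(toy, phase=1)
    print(in_frame[0])  # counts at codon position 0, read in frame
    print(shifted[0])   # the same position after a one-base shift
```

A one-base shift rotates which nucleotides dominate each codon position, so a change in this matrix partway along a sequence is the kind of signal a frameshift detector could look for.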
Affiliation(s)
- Y M Suvorova
- Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
- M A Korotkova
- National Research Nuclear University MEPhI (Moscow Engineering Physics Institute), Moscow, Russia
- K G Skryabin
- Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
- E V Korotkov
- Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia; National Research Nuclear University MEPhI (Moscow Engineering Physics Institute), Moscow, Russia
7
Suvorova YM, Pugacheva VM, Korotkov EV. A Database of Potential Reading Frame Shifts in Coding Sequences from Different Eukaryotic Genomes. Biophysics (Nagoya-shi) 2019. [DOI: 10.1134/s0006350919030217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
8
Fang C, Zou C, Fu Y, Li J, Li Y, Ma Y, Zhao S, Li C. DNA methylation changes and evolution of RNA-based duplication in Sus scrofa: based on a two-step strategy. Epigenomics 2018; 10:199-218. [PMID: 29334230 DOI: 10.2217/epi-2017-0071] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/21/2022]
Abstract
AIM This study aims to couple DNA methylation changes and evolution of retrogenes. MATERIALS & METHODS A new two-step strategy was developed to screen retrogenes. Further, reduced representation bisulfite sequencing and RNA-seq data of eight tissues were used to analyze retrogenes. RESULTS A total of 964 retrocopies were identified and new retrocopies were available for the synthesis of glycans and lipids corresponding to pig phenotypic traits. Retrogenes were consistently hypermethylated. Hypomethylation of parental genes presented more susceptibility to retroposition. Promoter DNA methylation of retrogenes was negatively correlated with evolutionary time and played important roles in regulating retrogene tissue-specific expression pattern. CONCLUSION A two-step procedure is effective and necessary for identifying retrogenes. DNA methylation drives origination, survival, evolution and expression of retrogenes.
Affiliation(s)
- Chengchi Fang
- Key Lab of Agriculture Animal Genetics, Breeding, & Reproduction of Ministry of Education, College of Animal Sciences & Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Cheng Zou
- Key Lab of Agriculture Animal Genetics, Breeding, & Reproduction of Ministry of Education, College of Animal Sciences & Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Yuhua Fu
- Key Lab of Agriculture Animal Genetics, Breeding, & Reproduction of Ministry of Education, College of Animal Sciences & Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Jingxuan Li
- Key Lab of Agriculture Animal Genetics, Breeding, & Reproduction of Ministry of Education, College of Animal Sciences & Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Yao Li
- Key Lab of Agriculture Animal Genetics, Breeding, & Reproduction of Ministry of Education, College of Animal Sciences & Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Yunlong Ma
- Key Lab of Agriculture Animal Genetics, Breeding, & Reproduction of Ministry of Education, College of Animal Sciences & Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Shuhong Zhao
- Key Lab of Agriculture Animal Genetics, Breeding, & Reproduction of Ministry of Education, College of Animal Sciences & Technology, Huazhong Agricultural University, Wuhan 430070, PR China
- Changchun Li
- Key Lab of Agriculture Animal Genetics, Breeding, & Reproduction of Ministry of Education, College of Animal Sciences & Technology, Huazhong Agricultural University, Wuhan 430070, PR China
9
Melloni GEM, Mazzarella L, Bernard L, Bodini M, Russo A, Luzi L, Pelicci PG, Riva L. A knowledge-based framework for the discovery of cancer-predisposing variants using large-scale sequencing breast cancer data. Breast Cancer Res 2017; 19:63. [PMID: 28569218 PMCID: PMC5452392 DOI: 10.1186/s13058-017-0854-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/21/2016] [Accepted: 05/08/2017] [Indexed: 12/19/2022]
Abstract
BACKGROUND The landscape of cancer-predisposing genes has been extensively investigated in the last 30 years with various methodologies ranging from candidate gene to genome-wide association studies. However, sequencing data are still poorly exploited in cancer predisposition studies due to the lack of statistical power when comparing millions of variants at once. METHOD To overcome these power limitations, we propose a knowledge-based framework founded on the characteristics of known cancer-predisposing variants and genes. Under our framework, we took advantage of a combination of previously generated datasets of sequencing experiments to identify novel breast cancer-predisposing variants, comparing the normal genomes of 673 breast cancer patients of European origin against 27,173 controls matched by ethnicity. RESULTS We detected several expected variants on known breast cancer-predisposing genes, like BRCA1 and BRCA2, and 11 variants on genes associated with other cancer types, like RET and AKT1. Furthermore, we detected 183 variants that overlap with somatic mutations in cancer and 41 variants associated with 38 possible loss-of-function genes, including PIK3CB and KMT2C. Finally, we found a set of 19 variants that are potentially pathogenic, negatively correlate with age at onset, and have never been associated with breast cancer. CONCLUSIONS In this study, we demonstrate the usefulness of a genomic-driven approach nested in a classic case-control study to prioritize cancer-predisposing variants. In addition, we provide a resource containing variants that may affect susceptibility to breast cancer.
Affiliation(s)
- Giorgio E M Melloni
- Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia, Via Adamello 16, Milan, Italy
- Luca Mazzarella
- Department of Experimental Oncology, European Institute of Oncology, Via Adamello 16, Milan, Italy; Division of New Drug Development, European Institute of Oncology, Via Ripamonti 435, Milan, Italy
- Loris Bernard
- Clinical Genomics Lab, European Institute of Oncology, via Ripamonti 435, Milano, Italy
- Margherita Bodini
- Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia, Via Adamello 16, Milan, Italy
- Anna Russo
- Department of Experimental Oncology, European Institute of Oncology, Via Adamello 16, Milan, Italy
- Lucilla Luzi
- Department of Experimental Oncology, European Institute of Oncology, Via Adamello 16, Milan, Italy
- Pier Giuseppe Pelicci
- Department of Experimental Oncology, European Institute of Oncology, Via Adamello 16, Milan, Italy; Department of Oncology and Hemato-oncology, University of Milan, via Festa del Perdono 7, Milan, Italy
- Laura Riva
- Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia, Via Adamello 16, Milan, Italy.
10
Jammali S, Kuitche E, Rachati A, Bélanger F, Scott M, Ouangraoua A. Aligning coding sequences with frameshift extension penalties. Algorithms Mol Biol 2017; 12:10. [PMID: 28373895 PMCID: PMC5374649 DOI: 10.1186/s13015-017-0101-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/11/2016] [Accepted: 03/18/2017] [Indexed: 11/15/2022]
Abstract
BACKGROUND Frameshift translation is an important phenomenon that contributes to the appearance of novel coding DNA sequences (CDS) and functions in gene evolution, by allowing alternative amino acid translations of gene coding regions. Frameshift translations can be identified by aligning two CDS, from the same gene or from homologous genes, while accounting for their codon structure. Two main classes of algorithms have been proposed to solve the problem of aligning CDS, either by amino acid sequence alignment back-translation, or by simultaneously accounting for the nucleotide and amino acid levels. The former cannot account for frameshift translations and, up to now, the latter accounts exclusively for frameshift translation initiation, not considering the length of the translation disruption caused by a frameshift. RESULTS We introduce a new scoring scheme with an algorithm for the pairwise alignment of CDS accounting for frameshift translation initiation and length, while simultaneously considering nucleotide and amino acid sequences. The main specificity of the scoring scheme is the introduction of a penalty cost accounting for frameshift extension length to compute an adequate similarity score for a CDS alignment. The second specificity of the model is that the search space of the problem solved is the set of all feasible alignments between two CDS. Previous approaches have considered a restricted search space or additional constraints on the decomposition of an alignment into length-3 sub-alignments. The algorithm described in this paper has the same asymptotic time complexity as the classical Needleman-Wunsch algorithm. CONCLUSIONS We compare the method to other CDS alignment methods based on an application to the comparison of pairs of CDS from homologous human, mouse and cow genes of ten mammalian gene families from the Ensembl-Compara database. The results show that our method is particularly robust to parameter changes as compared to existing methods. It also appears to be a good compromise, performing well both in the presence and absence of frameshift translations. An implementation of the method is available at https://github.com/UdeS-CoBIUS/FsePSA.
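For orientation, here is a minimal Python sketch of the classical Needleman-Wunsch global alignment score, the baseline whose asymptotic time complexity the abstract refers to. It is not the FsePSA scoring scheme: it works on nucleotides only and has no codon structure or frameshift extension penalty. The function name and the match, mismatch and gap values are assumptions chosen for illustration.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of strings `a` and `b`.

    Runs in O(len(a) * len(b)) time and keeps only two rows of the
    dynamic-programming table in memory.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of `a` against empty string
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in `b`
            left = curr[j - 1] + gap  # gap in `a`
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # A single-nucleotide deletion costs one gap at the nucleotide level,
    # even though it shifts the reading frame of everything downstream.
    print(needleman_wunsch_score("ATGGCC", "ATGCC"))
```

The example shows why a nucleotide-only score understates a frameshift: one deleted base is a single gap here, while every downstream codon changes, which is the disruption length that the abstract's extension penalty is meant to capture.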
Affiliation(s)
- Safa Jammali
- Département d’informatique, Faculté des Sciences, Université de Sherbrooke, Sherbrooke, QC J1K2R1 Canada
- Esaie Kuitche
- Département d’informatique, Faculté des Sciences, Université de Sherbrooke, Sherbrooke, QC J1K2R1 Canada
- Ayoub Rachati
- Département d’informatique, Faculté des Sciences, Université de Sherbrooke, Sherbrooke, QC J1K2R1 Canada
- François Bélanger
- Département d’informatique, Faculté des Sciences, Université de Sherbrooke, Sherbrooke, QC J1K2R1 Canada
- Michelle Scott
- Département de biochimie, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, QC J1E4K8 Canada
- Aïda Ouangraoua
- Département d’informatique, Faculté des Sciences, Université de Sherbrooke, Sherbrooke, QC J1K2R1 Canada
11
Pancsa R, Tompa P. Coding Regions of Intrinsic Disorder Accommodate Parallel Functions. Trends Biochem Sci 2016; 41:898-906. [DOI: 10.1016/j.tibs.2016.08.009] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/19/2016] [Revised: 08/16/2016] [Accepted: 08/19/2016] [Indexed: 02/01/2023]
12
Sruthy KS, Chaithanya ER, Sathyan N, Nair A, Antony SP, Singh ISB, Philip R. Molecular Characterization and Phylogenetic Analysis of Novel Isoform of Anti-lipopolysaccharide Factor from the Mantis Shrimp, Miyakea nepa. Probiotics Antimicrob Proteins 2016; 7:275-83. [PMID: 26187684 DOI: 10.1007/s12602-015-9198-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 01/13/2023]
Abstract
Anti-lipopolysaccharide factor (ALF) is a cationic anti-microbial peptide representing humoral defence system exhibiting a diverse spectrum of activity against microbial pathogens, including gram-negative and gram-positive bacteria, fungi, parasites and viruses. In this study, we identified and characterized a novel ALF homologue (MnALF) encoding cDNA sequence from the haemocytes of stomatopod mantis shrimp Miyakea nepa. The deduced peptide of MnALF encoded for a 123-amino acid peptide with a 25-residue signal peptide containing selenocysteine followed by a highly cationic mature peptide comprised of a putative LPS-binding domain flanked by two cysteine residues. BLAST analysis of MnALF showed that it exhibits identity to crustacean and limulid ALFs. The mature peptide of MnALF has a net charge of +7 and predicted molecular weight of 10.998 kDa with a theoretical isoelectric point (pI) of 9.93. Spatial structure of MnALF comprises three α-helices packed against a four-stranded β-sheet of which two were linked by a disulphide bond to form an amphipathic loop similar to the structure of Penaeus monodon, ALF-Pm3. All these features suggest that MnALF could play an imperative role in the innate defence mechanism of M. nepa. To our knowledge, this study accounts for the first report of an anti-microbial peptide from the order stomatopoda.
Affiliation(s)
- K S Sruthy
- Department of Marine Biology, Microbiology and Biochemistry, School of Marine Sciences, Cochin University of Science and Technology, Fine Arts Avenue, Kochi, 682 016, Kerala, India
- E R Chaithanya
- Department of Marine Biology, Microbiology and Biochemistry, School of Marine Sciences, Cochin University of Science and Technology, Fine Arts Avenue, Kochi, 682 016, Kerala, India
- Naveen Sathyan
- Department of Marine Biology, Microbiology and Biochemistry, School of Marine Sciences, Cochin University of Science and Technology, Fine Arts Avenue, Kochi, 682 016, Kerala, India
- Aishwarya Nair
- Department of Marine Biology, Microbiology and Biochemistry, School of Marine Sciences, Cochin University of Science and Technology, Fine Arts Avenue, Kochi, 682 016, Kerala, India
- Swapna P Antony
- National Centre for Aquatic Animal Health, Cochin University of Science and Technology, Kochi, 682 016, Kerala, India
- I S Bright Singh
- National Centre for Aquatic Animal Health, Cochin University of Science and Technology, Kochi, 682 016, Kerala, India
- Rosamma Philip
- Department of Marine Biology, Microbiology and Biochemistry, School of Marine Sciences, Cochin University of Science and Technology, Fine Arts Avenue, Kochi, 682 016, Kerala, India.
13
Yu JF, Cao Z, Yang Y, Wang CL, Su ZD, Zhao YW, Wang JH, Zhou Y. Natural protein sequences are more intrinsically disordered than random sequences. Cell Mol Life Sci 2016; 73:2949-57. [PMID: 26801222 PMCID: PMC4937073 DOI: 10.1007/s00018-016-2138-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 12/18/2015] [Revised: 01/10/2016] [Accepted: 01/11/2016] [Indexed: 11/16/2022]
Abstract
Most natural protein sequences have resulted from millions or even billions of years of evolution. How they differ from random sequences is not fully understood. Previous computational and experimental studies of random proteins generated from noncoding regions yielded inconclusive results due to species-dependent codon biases and GC contents. Here, we approach this problem by investigating 10,000 sequences randomized at the amino acid level. Using well-established predictors for protein intrinsic disorder, we found that natural sequences have more long disordered regions than random sequences, even when random and natural sequences have the same overall composition of amino acid residues. We also showed that random sequences are as structured as natural sequences according to contents and length distributions of predicted secondary structure, although the structures from random sequences may be in a molten globular-like state, according to molecular dynamics simulations. The bias of natural sequences toward more intrinsic disorder suggests that natural sequences are created and evolved to avoid protein aggregation and increase functional diversity.
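As an illustration of what "randomized at the amino acid level" with unchanged composition can mean, the following minimal Python sketch shuffles the residues of a sequence. It is an assumption about the general idea, not the authors' procedure; the function name `shuffle_residues` and the toy sequence are hypothetical.

```python
import random
from collections import Counter


def shuffle_residues(seq: str, rng: random.Random) -> str:
    """Return a permutation of seq, so the amino acid composition is unchanged."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


# Arbitrary toy sequence used only for illustration (not a real protein).
natural = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
rng = random.Random(0)  # fixed seed so the run is repeatable
randomized = shuffle_residues(natural, rng)

# The order of residues changes, the residue counts do not.
print(Counter(natural) == Counter(randomized))
```

A disorder predictor could then be run on both sequences; that step is outside this sketch.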
Affiliation(s)
- Jia-Feng Yu
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China
- Zanxia Cao
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China
- College of Physics and Electronic Information, Dezhou University, Dezhou, 253023, China
- Yuedong Yang
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr, Southport, QLD, 4222, Australia
- Chun-Ling Wang
- College of Physics and Electronic Information, Dezhou University, Dezhou, 253023, China
- Zhen-Dong Su
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China
- Ya-Wei Zhao
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China
- Ji-Hua Wang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China
- College of Physics and Electronic Information, Dezhou University, Dezhou, 253023, China
- Yaoqi Zhou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China.
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr, Southport, QLD, 4222, Australia.
14
Fellner L, Simon S, Scherling C, Witting M, Schober S, Polte C, Schmitt-Kopplin P, Keim DA, Scherer S, Neuhaus K. Evidence for the recent origin of a bacterial protein-coding, overlapping orphan gene by evolutionary overprinting. BMC Evol Biol 2015; 15:283. [PMID: 26677845 PMCID: PMC4683798 DOI: 10.1186/s12862-015-0558-z] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 08/19/2015] [Accepted: 12/06/2015] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND Gene duplication is believed to be the classical way to form novel genes, but overprinting may be an important alternative. Overprinting allows entirely novel proteins to evolve de novo, i.e., formerly non-coding open reading frames within functional genes become expressed. Only three cases have been described for Escherichia coli. Here, a fourth example is presented. RESULTS RNA sequencing revealed an open reading frame weakly transcribed in cow dung, coding for 101 residues and embedded completely in the -2 reading frame of citC in enterohemorrhagic E. coli. This gene is designated novel overlapping gene, nog1. The promoter region fused to gfp exhibits specific activities and 5' rapid amplification of cDNA ends indicated the transcriptional start 40-bp upstream of the start codon. nog1 was strand-specifically arrested in translation by a nonsense mutation silent in citC. This Nog1-mutant showed a phenotype in competitive growth against wild type in the presence of MgCl2. Small differences in metabolite concentrations were also found. Bioinformatic analyses propose Nog1 to be inner membrane-bound and to possess at least one membrane-spanning domain. A phylogenetic analysis suggests that the orphan gene nog1 arose by overprinting after Escherichia/Shigella separated from the other γ-proteobacteria. CONCLUSIONS Since nog1 is of recent origin, non-essential, short, weakly expressed and only marginally involved in E. coli's central metabolism, we propose that this gene is in an initial stage of evolution. While we present specific experimental evidence for the existence of a fourth overlapping gene in enterohemorrhagic E. coli, we believe that this may be an initial finding only and overlapping genes in bacteria may be more common than is currently assumed by microbiologists.
Affiliation(s)
- Lea Fellner
- Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85350, Freising, Germany.
- Svenja Simon
- Lehrstuhl für Datenanalyse und Visualisierung, Fachbereich Informatik und Informationswissenschaft, Universität Konstanz, Box 78, 78457, Constance, Germany.
- Christian Scherling
- Lehrstuhl für Ernährungsphysiologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Gregor-Mendel-Straße 2, D-85354, Freising, Germany.
- Michael Witting
- Research Unit Analytical BioGeoChemistry, Deutsches Forschungszentrum für Gesundheit und Umwelt GmbH, Helmholtz Zentrum München, Ingolstädter Landstraße 1, 85754, Neuherberg, Germany.
- Steffen Schober
- Institute of Communications Engineering, Universität Ulm, Albert-Einstein-Allee 43, 89081, Ulm, Germany. Present address: Blue Yonder GmbH, Ohiostraße 8, Karlsruhe, Germany.
- Christine Polte
- Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85350, Freising, Germany. Present address: Institut für Biochemie und Molekularbiologie, Universität Hamburg, Martin-Luther-King Platz 6, 20146, Hamburg, Germany.
- Philippe Schmitt-Kopplin
- Research Unit Analytical BioGeoChemistry, Deutsches Forschungszentrum für Gesundheit und Umwelt GmbH, Helmholtz Zentrum München, Ingolstädter Landstraße 1, 85754, Neuherberg, Germany.
- Daniel A Keim
- Lehrstuhl für Datenanalyse und Visualisierung, Fachbereich Informatik und Informationswissenschaft, Universität Konstanz, Box 78, 78457, Constance, Germany.
- Siegfried Scherer
- Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85350, Freising, Germany.
- Klaus Neuhaus
- Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85350, Freising, Germany.
15
Abstract
Genes are perpetually added to and deleted from genomes during evolution. Thus, it is important to understand how new genes are formed and how they evolve to be critical components of the genetic systems that determine the biological diversity of life. Two decades of effort have shed light on the process of new gene origination and have contributed to an emerging comprehensive picture of how new genes are added to genomes, ranging from the mechanisms that generate new gene structures to the presence of new genes in different organisms to the rates and patterns of new gene origination and the roles of new genes in phenotypic evolution. We review each of these aspects of new gene evolution, summarizing the main evidence for the origination and importance of new genes in evolution. We highlight findings showing that new genes rapidly change existing genetic systems that govern various molecular, cellular, and phenotypic functions.
Affiliation(s)
- Manyuan Long
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois 60637;
16
Wissler L, Gadau J, Simola DF, Helmkampf M, Bornberg-Bauer E. Mechanisms and dynamics of orphan gene emergence in insect genomes. Genome Biol Evol 2013; 5:439-55. [PMID: 23348040 PMCID: PMC3590893 DOI: 10.1093/gbe/evt009] [Citation(s) in RCA: 101] [Impact Index Per Article: 9.2] [Indexed: 12/16/2022] Open
Abstract
Orphan genes are defined as genes that lack detectable similarity to genes in other species, so that no clear signals of common descent (i.e., homology) can be inferred. Orphans are an enigmatic portion of the genome because their origin and function are mostly unknown and they typically make up 10% to 30% of all genes in a genome. Several case studies demonstrated that orphans can contribute to lineage-specific adaptation. Here, we study orphan genes by comparing 30 arthropod genomes, focusing in particular on seven recently sequenced ant genomes. This setup allows analyzing a major metazoan taxon and a comparison between social Hymenoptera (ants and bees) and nonsocial Diptera (flies and mosquitoes). First, we find that recently split lineages undergo accelerated genomic reorganization, including the rapid gain of many orphan genes. Second, between the two insect orders Hymenoptera and Diptera, orphan genes are more abundant and emerge more rapidly in Hymenoptera, in particular, in leaf-cutter ants. With respect to intragenomic localization, we find that ant orphan genes show little clustering, which suggests that orphan genes in ants are scattered uniformly over the genome and between nonorphan genes. Finally, our results indicate that the genetic mechanisms creating orphan genes (such as gene duplication, frame-shift fixation, creation of overlapping genes, horizontal gene transfer, and exaptation of transposable elements) act at different rates in insects, primates, and plants. In Formicidae, the majority of orphan genes have their origin in intergenic regions, pointing to a high rate of de novo gene formation or generalized gene loss, and supporting a recently proposed dynamic model of frequent gene birth and death.
Affiliation(s)
- Lothar Wissler
- Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
17
18
Prokunina-Olsson L, Muchmore B, Tang W, Pfeiffer RM, Park H, Dickensheets H, Hergott D, Porter-Gill P, Mumy A, Kohaar I, Chen S, Brand N, Tarway M, Liu L, Sheikh F, Astemborski J, Bonkovsky HL, Edlin BR, Howell CD, Morgan TR, Thomas DL, Rehermann B, Donnelly RP, O'Brien TR. A variant upstream of IFNL3 (IL28B) creating a new interferon gene IFNL4 is associated with impaired clearance of hepatitis C virus. Nat Genet 2013; 45:164-71. [PMID: 23291588 DOI: 10.1038/ng.2521] [Citation(s) in RCA: 733] [Impact Index Per Article: 66.6] [Received: 09/11/2012] [Accepted: 12/07/2012] [Indexed: 02/06/2023]
Abstract
Chronic infection with hepatitis C virus (HCV) is a common cause of liver cirrhosis and cancer. We performed RNA sequencing in primary human hepatocytes activated with synthetic double-stranded RNA to mimic HCV infection. Upstream of IFNL3 (IL28B) on chromosome 19q13.13, we discovered a new transiently induced region that harbors a dinucleotide variant ss469415590 (TT or ΔG), which is in high linkage disequilibrium with rs12979860, a genetic marker strongly associated with HCV clearance. ss469415590[ΔG] is a frameshift variant that creates a novel gene, designated IFNL4, encoding the interferon-λ4 protein (IFNL4), which is moderately similar to IFNL3. Compared to rs12979860, ss469415590 is more strongly associated with HCV clearance in individuals of African ancestry, although it provides comparable information in Europeans and Asians. Transient overexpression of IFNL4 in a hepatoma cell line induced STAT1 and STAT2 phosphorylation and the expression of interferon-stimulated genes. Our findings provide new insights into the genetic regulation of HCV clearance and its clinical management.
Affiliation(s)
- Ludmila Prokunina-Olsson
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, US National Institutes of Health, Bethesda, Maryland, USA.
19
Abstract
Frameshifting results from two main mechanisms: genomic insertions or deletions (indels) or programmed ribosomal frameshifting. Whereas indels can disrupt normal protein function, programmed ribosomal frameshifting can result in dual-coding genes, each of which can produce multiple functional products. Here, I summarize technical advances that have made it possible to identify programmed ribosomal frameshifting events in a systematic way. The results of these studies suggest that such frameshifting occurs in all genomes, and I will discuss methods that could help characterize the resulting alternative proteomes.
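The effect of an indel on the reading frame can be sketched with a few lines of Python. The example below is a toy under stated assumptions: the sequences are made up, `translate` is a hypothetical helper, and only the standard genetic code is used. Inserting a single base into a run of CAG (glutamine) codons shifts the frame so that the same bases are read as GCA (alanine).

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from position 0 until the first stop codon or the last full codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up sequences: a start codon, four CAG codons and a stop codon.
original = "ATG" + "CAG" * 4 + "TAA"
# Insert one G after the start codon: a +1 frameshift for everything downstream.
mutated = original[:3] + "G" + original[3:]

print(translate(original))  # glutamine run in the original frame
print(translate(mutated))   # alanine run in the shifted frame, original stop lost
```

Note that the original stop codon is no longer in frame after the insertion, so translation in this sketch simply runs to the end of the sequence.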
Affiliation(s)
- Robin Ketteler
- MRC Laboratory for Molecular Cell Biology, Translational Research Resource Centre, University College London, London, UK
20
Korotkova MA, Kudryashov NA, Korotkov EV. An approach for searching insertions in bacterial genes leading to the phase shift of triplet periodicity. Genomics Proteomics & Bioinformatics 2012; 9:158-70. [PMID: 22196359 PMCID: PMC5054449 DOI: 10.1016/s1672-0229(11)60019-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/14/2011] [Accepted: 08/02/2011] [Indexed: 11/28/2022]
Abstract
The concept of the phase shift of triplet periodicity (TP) was used to search for potential DNA insertions in genes from 17 bacterial genomes. A mathematical algorithm for detection of these insertions has been developed. This approach can detect potential insertions and deletions with lengths that are not multiples of three bases, especially insertions of relatively large DNA fragments (>100 bases). A new similarity measure between triplet matrices was employed to improve the sensitivity for detecting the TP phase shift. Sequences of 17,220 bacterial genes, each consisting of more than 1,200 bases, were analyzed, and the presence of a TP phase shift has been shown in ~16% of analyzed genes (2,809 genes), which is about 4 times more than that detected in our previous work. We propose that shifts of the TP phase may indicate shifts of the reading frame in genes after insertions of DNA fragments with lengths that are not multiples of three bases. A relationship between the phase shifts of TP and the frame shifts in genes is discussed.
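The idea of a TP phase shift can be sketched with a 4 x 3 matrix that counts each base at each of the three codon positions. The Python below is a toy, not the authors' algorithm: the dot-product score stands in for their similarity measure, the function names `triplet_matrix` and `best_phase` are hypothetical, and the sequences are perfectly periodic by construction.

```python
def triplet_matrix(seq: str, offset: int = 0) -> dict:
    """Count each base at codon positions 0, 1, 2, using absolute coordinates."""
    counts = {base: [0, 0, 0] for base in "ACGT"}
    for i, base in enumerate(seq):
        counts[base][(i + offset) % 3] += 1
    return counts


def best_phase(reference: str, window: str, offset: int) -> int:
    """Column rotation (0, 1 or 2) of the window matrix that best matches the reference."""
    ref = triplet_matrix(reference)
    win = triplet_matrix(window, offset)

    def score(rotation: int) -> int:
        # Simple dot product between the reference and the rotated window matrix.
        return sum(
            ref[base][pos] * win[base][(pos + rotation) % 3]
            for base in "ACGT"
            for pos in range(3)
        )

    return max(range(3), key=score)


upstream = "ATG" * 20
downstream = "ATG" * 20

# One inserted base (length not a multiple of three) shifts the downstream phase.
one_base = upstream + "C" + downstream
phase_after_one = best_phase(upstream, one_base[61:], offset=61)

# Three inserted bases leave the downstream phase unchanged.
three_bases = upstream + "CCC" + downstream
phase_after_three = best_phase(upstream, three_bases[63:], offset=63)

print(phase_after_one, phase_after_three)
```

Real genes have much weaker periodicity, which is why a more sensitive similarity measure and a significance test are needed in practice.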
Affiliation(s)
- Maria A. Korotkova
- National University of Nuclear Investigations (MIFI), Moscow 115409, Russia
- Eugene V. Korotkov
- National University of Nuclear Investigations (MIFI), Moscow 115409, Russia
- Centre of Bioengineering, Russian Academy of Sciences, Moscow 117312, Russia
- Corresponding author.
21
An overlapping genetic code for frameshifted overlapping genes in Drosophila mitochondria: Antisense antitermination tRNAs UAR insert serine. J Theor Biol 2012; 298:51-76. [DOI: 10.1016/j.jtbi.2011.12.026] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 11/20/2010] [Revised: 12/19/2011] [Accepted: 12/22/2011] [Indexed: 01/27/2023]
22
Cheng C, Shaw N, Zhang X, Zhang M, Ding W, Wang BC, Liu ZJ. Structural view of a non Pfam singleton and crystal packing analysis. PLoS One 2012; 7:e31673. [PMID: 22363703 PMCID: PMC3282739 DOI: 10.1371/journal.pone.0031673] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/28/2011] [Accepted: 01/11/2012] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Comparative genomic analysis has revealed that in each genome a large number of open reading frames have no homologues in other species. Such singleton genes have attracted the attention of biochemists and structural biologists as a potential untapped source of new folds. Cthe_2751 is a 15.8 kDa singleton from an anaerobic, hyperthermophile Clostridium thermocellum. To gain insights into the architecture of the protein and obtain clues about its function, we decided to solve the structure of Cthe_2751. RESULTS The protein crystallized in 4 different space groups that diffracted X-rays to 2.37 Å (P3(1)21), 2.17 Å (P2(1)2(1)2(1)), 3.01 Å (P4(1)22), and 2.03 Å (C222(1)) resolution, respectively. Crystal packing analysis revealed that the 3-D packing of Cthe_2751 dimers in P4(1)22 and C222(1) is similar with only a rotational difference of 2.69° around the C axes. A new method developed to quantify the differences in packing of dimers in crystals from different space groups corroborated the findings of crystal packing analysis. Cthe_2751 is an all α-helical protein with a central hydrophobic core providing thermal stability via π:cation and π: π interactions. A ProFunc analysis retrieved a very low match with a splicing endonuclease, suggesting a role for the protein in the processing of nucleic acids. CONCLUSIONS Non-Pfam singleton Cthe_2751 folds into a known all α-helical fold. The structure has increased sequence coverage of non-Pfam proteins such that more protein sequences can be amenable to modelling. Our work on crystal packing analysis provides a new method to analyze dimers of the protein crystallized in different space groups. The utility of such an analysis can be expanded to oligomeric structures of other proteins, especially receptors and signaling molecules, many of which are known to function as oligomers.
Affiliation(s)
- Chongyun Cheng
- National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Neil Shaw
- National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- College of Stem Cell and Molecular Clinical Medicine, Kunming Medical University, Kunming, China
- Xuejun Zhang
- Department of Immunology, Tianjin Medical University, Tianjin, China
- Min Zhang
- School of Life Sciences, Anhui University, Hefei, Anhui, China
- Wei Ding
- National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Bi-Cheng Wang
- Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia, United States of America
- Zhi-Jie Liu
- National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- College of Stem Cell and Molecular Clinical Medicine, Kunming Medical University, Kunming, China
23
Bozorgmehr JEH. An ancient frame-shifting event in the highly conserved KPNA gene family has undergone extensive compensation by natural selection in vertebrates. Biosystems 2011; 105:210-5. [PMID: 21550380 DOI: 10.1016/j.biosystems.2011.04.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/16/2010] [Revised: 03/02/2011] [Accepted: 04/20/2011] [Indexed: 10/18/2022]
Abstract
One of the prevailing arguments in evolutionary theory is that the duplicates of genes can acquire novel functionality. This is because only one of the paralogs need maintain the ancestral function, leaving room for natural experimentation due to a respite in purifying selection. Although many duplicates can subsequently become disabled by nullifying mutations, a few may also go on to diverge along a novel evolutionary trajectory. Here, evidence is provided that demonstrates how this scenario may not always be true. Rather, in the case of the highly conserved KPNA importin family, an initial relaxation in selection induced a frameshift that was later suppressed and heavily compensated for as part of a reparative and optimizing process. Despite a resulting divergence, there remains a distinct preservation of both sequence and functionality among the paralogs. This would indicate that duplicates can be retained by selection for reasons related to their redundant functionality. It also shows that, even when positive selection is inferred in duplicate genes, this may be of a compensatory nature rather than one representing any biochemical innovation. Generally, this development would perhaps be a more common outcome for gene duplication than is currently maintained.
24
Gontijo AM, Miguela V, Whiting MF, Woodruff RC, Dominguez M. Intron retention in the Drosophila melanogaster Rieske Iron Sulphur Protein gene generated a new protein. Nat Commun 2011; 2:323. [PMID: 21610726 PMCID: PMC3113295 DOI: 10.1038/ncomms1328] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 10/04/2010] [Accepted: 04/27/2011] [Indexed: 11/09/2022] Open
Abstract
Genomes can encode a variety of proteins with unrelated architectures and activities. It is known that protein-coding genes of de novo origin have significantly contributed to this diversity. However, the molecular mechanisms and evolutionary processes behind these originations are still poorly understood. Here we show that the last 102 codons of a novel gene, Noble, assembled directly from non-coding DNA following an intronic deletion that induced alternative intron retention at the Drosophila melanogaster Rieske Iron Sulphur Protein (RFeSP) locus. A systematic analysis of the evolutionary processes behind the origin of Noble showed that its emergence was strongly biased by natural selection on and around the RFeSP locus. Noble mRNA is shown to encode a bona fide protein that lacks an iron sulphur domain and localizes to mitochondria. Together, these results demonstrate the generation of a novel protein at a naturally selected site.
Affiliation(s)
- Alisson M Gontijo
- Instituto de Neurociencias de Alicante, CSIC-UMH, Sant Joan d'Alacant, Alicante 03550, Spain.
25
Vakhrusheva AA, Kazanov MD, Mironov AA, Bazykin GA. Evolution of prokaryotic genes by shift of stop codons. J Mol Evol 2010; 72:138-46. [PMID: 21082168 DOI: 10.1007/s00239-010-9408-1] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Received: 07/28/2010] [Accepted: 10/29/2010] [Indexed: 11/30/2022]
Abstract
De novo origin of coding sequence remains an obscure issue in molecular evolution. One of the possible paths for addition (subtraction) of DNA segments to (from) a gene is stop codon shift. Single nucleotide substitutions can destroy the existing stop codon, leading to uninterrupted translation up to the next stop codon in the gene's reading frame, or create a premature stop codon via a nonsense mutation. Furthermore, frameshifts caused by short indels near a gene's end may lead to premature stop codons or to translation past the existing stop codon. Here, we describe the evolution of the length of coding sequence of prokaryotic genes by change of positions of stop codons. We observed cases of addition of regions of 3'UTR to genes due to mutations at the existing stop codon, and cases of subtraction of C-terminal coding segments due to nonsense mutations upstream of the stop codon. Many of the observed stop codon shifts cannot be attributed to sequencing errors or rare deleterious variants segregating within bacterial populations. The additions of regions of 3'UTR tend to occur in those genes in which they are facilitated by nearby downstream in-frame triplets which may serve as new stop codons. Conversely, subtractions of coding sequence often give rise to in-frame stop codons located nearby. The amino acid composition of the added region is significantly biased, compared to the overall amino acid composition of the genes. Our results show that in prokaryotes, shift of stop codon is an underappreciated contributor to functional evolution of gene length.
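The two substitution routes to a stop codon shift described in this abstract can be shown on a toy sequence. The Python sketch below is illustrative only: the 15-base sequence is invented, and `coding_codons` is a hypothetical helper that counts codons read before the first in-frame stop.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def coding_codons(dna: str) -> int:
    """Number of codons read from position 0 before the first in-frame stop codon."""
    for count, start in enumerate(range(0, len(dna) - 2, 3)):
        if dna[start:start + 3] in STOP_CODONS:
            return count
    return len(dna) // 3  # no in-frame stop found


# Invented sequence: ATG AAA TAA GGC TGA (stop at the third codon, spare stop downstream).
original = "ATGAAATAAGGCTGA"

# Substitution in the existing stop (TAA -> CAA): translation continues to the next stop.
readthrough = original[:6] + "C" + original[7:]

# Nonsense substitution upstream (AAA -> TAA): translation ends earlier.
nonsense = original[:3] + "T" + original[4:]

print(coding_codons(original), coding_codons(readthrough), coding_codons(nonsense))
```

In the read-through case the added codons come from what was previously untranslated sequence, which is the addition scenario the abstract describes; the nearby downstream TGA plays the role of the new stop.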
Affiliation(s)
- Anna A Vakhrusheva
- Department of Bioengineering and Bioinformatics, M.V. Lomonosov Moscow State University, Vorbyevy Gory 1-73, Moscow 119992, Russia
26
Wang L, Stein LD. Localizing triplet periodicity in DNA and cDNA sequences. BMC Bioinformatics 2010; 11:550. [PMID: 21059240 PMCID: PMC2992068 DOI: 10.1186/1471-2105-11-550] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 06/03/2010] [Accepted: 11/08/2010] [Indexed: 01/23/2023] Open
Abstract
Background The protein-coding regions (coding exons) of a DNA sequence exhibit a triplet periodicity (TP) due to the fact that coding exons contain a series of three-nucleotide codons that encode specific amino acid residues. Such periodicity is usually not observed in introns and intergenic regions. If a DNA sequence is divided into small segments and a Fourier Transform is applied on each segment, a strong peak at frequency 1/3 is typically observed in the Fourier spectrum of coding segments, but not in non-coding regions. This property has been used in identifying the locations of protein-coding genes in unannotated sequence. The method is fast and requires no training. However, the need to compute the Fourier Transform across a segment (window) of arbitrary size affects the accuracy with which one can localize TP boundaries. Here, we report a technique that provides higher-resolution identification of these boundaries, and use the technique to explore the biological correlates of TP regions in the genome of the model organism C. elegans. Results Using both simulated TP signals and the real C. elegans sequence F56F11 as an example, we demonstrate that (1) the Modified Wavelet Transform (MWT) can better define the boundary of a TP region than the conventional Short Time Fourier Transform (STFT); (2) the scale parameter (a) of MWT determines the precision of TP boundary localization: bigger values of a give sharper TP boundaries but result in a lower signal-to-noise ratio; (3) RNA splicing sites have weaker TP signals than coding regions; (4) TP signals in coding regions can be destroyed or recovered by frame-shift mutations; (5) 6 bp periodicities in introns and intergenic regions can generate false positive signals, and these can be removed with 6 bp MWT. Conclusions MWT can provide more precise TP boundaries than STFT and the boundaries can be further refined by bigger scale MWT. Subtraction of 6 bp periodicity signals reduces the number of false positives. Experimentally introduced frame-shift mutations help recover TP signals that have been lost by possible ancient frame-shifts. More importantly, the TP signal has the potential to be used to detect the splice junctions in fully spliced mRNA sequence.
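The Fourier peak at frequency 1/3 mentioned in the Background can be sketched as follows. This is a minimal single-window illustration in Python, not the MWT or STFT procedure of the paper: it evaluates the discrete Fourier transform of the four base-indicator sequences at frequency 1/3 only, and the function name `tp_power` and the toy sequences are assumptions.

```python
import cmath


def tp_power(seq: str) -> float:
    """Summed spectral power at frequency 1/3 over the four base-indicator sequences."""
    total = 0.0
    for base in "ACGT":
        # DFT coefficient at frequency 1/3 of the 0/1 indicator sequence for this base.
        coefficient = sum(
            cmath.exp(-2j * cmath.pi * position / 3)
            for position, symbol in enumerate(seq)
            if symbol == base
        )
        total += abs(coefficient) ** 2
    return total / len(seq)


# Toy inputs: one sequence with an exact period of 3, one with an exact period of 4.
period_three = "ATG" * 30
period_four = "ACGT" * 30

print(tp_power(period_three) > tp_power(period_four))
```

Sliding this calculation along a sequence in fixed windows gives the STFT-style profile whose window size limits boundary resolution, which is the limitation the abstract sets out to address.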
Affiliation(s)
- Liya Wang
- Cold Spring Harbor Laboratory, Williams #5, Cold Spring Harbor, NY 11724, USA.
27
Gîrdea M, Noé L, Kucherov G. Back-translation for discovering distant protein homologies in the presence of frameshift mutations. Algorithms Mol Biol 2010; 5:6. [PMID: 20047662 PMCID: PMC2821327 DOI: 10.1186/1748-7188-5-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 08/11/2009] [Accepted: 01/04/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Frameshift mutations in protein-coding DNA sequences produce a drastic change in the resulting protein sequence, which prevents classic protein alignment methods from revealing the proteins' common origin. Moreover, when a large number of substitutions are additionally involved in the divergence, the homology detection becomes difficult even at the DNA level. RESULTS We developed a novel method to infer distant homology relations between two proteins that accounts for frameshift and point mutations that may have affected the coding sequences. We design a dynamic programming alignment algorithm over memory-efficient graph representations of the complete set of putative DNA sequences of each protein, with the goal of determining the two putative DNA sequences which have the best scoring alignment under a powerful scoring system designed to reflect the most probable evolutionary process. Our implementation is freely available at [http://bioinfo.lifl.fr/path/]. CONCLUSIONS Our approach makes it possible to uncover evolutionary information that is not captured by traditional alignment methods, which is confirmed by biologically significant examples.
Affiliation(s)
- Marta Gîrdea
- Laboratoire d'Informatique Fondamentale de Lille (Centre National de la Recherche Scientifique, Université Lille 1), Lille, France
- Institut National de Recherche en Informatique et en Automatique, Centre de Recherche Lille - Nord Europe, France
- Laurent Noé
- Laboratoire d'Informatique Fondamentale de Lille (Centre National de la Recherche Scientifique, Université Lille 1), Lille, France
- Institut National de Recherche en Informatique et en Automatique, Centre de Recherche Lille - Nord Europe, France
- Gregory Kucherov
- Laboratoire d'Informatique Fondamentale de Lille (Centre National de la Recherche Scientifique, Université Lille 1), Lille, France
- Institut National de Recherche en Informatique et en Automatique, Centre de Recherche Lille - Nord Europe, France
- French-Russian J-V Poncelet Laboratory, Moscow, Russia
28
Frenkel FE, Korotkov EV. Using triplet periodicity of nucleotide sequences for finding potential reading frame shifts in genes. DNA Res 2009; 16:105-14. [PMID: 19261626 PMCID: PMC2671204 DOI: 10.1093/dnares/dsp002] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/17/2022] Open
Abstract
We introduce a novel approach for the detection of possible mutations leading to a reading frame (RF) shift in a gene. Deletions and insertions in DNA coding regions are significant events for genes because an RF shift modifies an extensive region of the amino acid sequence coded by a gene. The suggested method is based on the phenomenon of triplet periodicity (TP) in coding regions of genes and its relative resistance to substitutions in DNA sequence. We attempted to extend 326 933 regions of continuous TP found in genes from the KEGG databank by considering possible insertions and deletions. In total, we found 824 genes where such extension was possible and statistically significant. We then generated amino acid sequences according to the active (KEGG's) and hypothetically ancient RFs in order to find confirmation of a shift at the protein level. Of these, 64 sequences have protein similarities only for the ancient RF, 176 only for the active RF, 3 for both, and 581 have no protein similarity at all. We aimed to establish a lower bound for the number of genes in which a shift between RF and TP is possible. Further ways to increase the number of revealed RF shifts are discussed.
Collapse
Affiliation(s)
- F E Frenkel
- Bioengineering Centre of RAS, 60-letiya Oktyabrya prosp., 7/1, Moscow, Russia.
|
29
|
Toll-Riera M, Bosch N, Bellora N, Castelo R, Armengol L, Estivill X, Albà MM. Origin of primate orphan genes: a comparative genomics approach. Mol Biol Evol 2008; 26:603-12. [PMID: 19064677 DOI: 10.1093/molbev/msn281] [Citation(s) in RCA: 182] [Impact Index Per Article: 11.4] [Indexed: 12/18/2022]
Abstract
Genomes contain a large number of genes that do not have recognizable homologues in other species and that are likely to be involved in important species-specific adaptive processes. The origin of many such "orphan" genes remains unknown. Here we present the first systematic study of the characteristics and mechanisms of formation of primate-specific orphan genes. We determine that codon usage values for most orphan genes fall within the bulk of the codon usage distribution of bona fide human proteins, supporting their current protein-coding annotation. We also show that primate orphan genes display distinctive features in relation to genes of wider phylogenetic distribution: higher tissue specificity, more rapid evolution, and shorter peptide size. We estimate that around 24% are highly divergent members of mammalian protein families. Interestingly, around 53% of the orphan genes contain sequences derived from transposable elements (TEs) and are mostly located in primate-specific genomic regions. This indicates frequent recruitment of TEs as part of novel genes. Finally, we also obtain evidence that a small fraction of primate orphan genes, around 5.5%, might have originated de novo from mammalian noncoding genomic regions.
Affiliation(s)
- Macarena Toll-Riera
- Evolutionary Genomics Group, Biomedical Informatics Research Programme, Fundació Institut Municipal d'Investigació Mèdica, Barcelona, Spain
|
30
|
Kerr NCH, Holmes FE, Wynick D. Novel mRNA isoforms of the sodium channels Na(v)1.2, Na(v)1.3 and Na(v)1.7 encode predicted two-domain, truncated proteins. Neuroscience 2008; 155:797-808. [PMID: 18675520 DOI: 10.1016/j.neuroscience.2008.04.060] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 11/14/2007] [Revised: 04/28/2008] [Accepted: 04/28/2008] [Indexed: 12/19/2022]
Abstract
The expression of voltage-gated sodium channels is regulated at multiple levels, and in this study we addressed the potential for alternative splicing of the Na(v)1.2, Na(v)1.3, Na(v)1.6 and Na(v)1.7 mRNAs. We isolated novel mRNA isoforms of Na(v)1.2 and Na(v)1.3 from adult mouse and rat dorsal root ganglia (DRG), Na(v)1.3 and Na(v)1.7 from adult mouse brain, and Na(v)1.7 from neonatal rat brain. These alternatively spliced isoforms introduce an additional exon (Na(v)1.2 exon 17A and topologically equivalent Na(v)1.7 exon 16A) or exon pair (Na(v)1.3 exons 17A and 17B) that contain an in-frame stop codon and result in predicted two-domain, truncated proteins. The mouse and rat orthologous exon sequences are highly conserved (94-100% identities), as are the paralogous Na(v)1.2 and Na(v)1.3 exons (93% identity in mouse) to which the Na(v)1.7 exon has only 60% identity. Previously, Na(v)1.3 mRNA has been shown to be upregulated in rat DRG following peripheral nerve injury, unlike the downregulation of all other sodium channel transcripts. Here we show that the expression of Na(v)1.3 mRNA containing exons 17A and 17B is unchanged in mouse following peripheral nerve injury (axotomy), whereas total Na(v)1.3 mRNA expression is upregulated by 33% (P=0.003), suggesting differential regulation of the alternatively spliced transcripts. The alternatively spliced rodent exon sequences are highly conserved in both the human and chicken genomes, with 77-89% and 72-76% identities to mouse, respectively. The widespread conservation of these sequences strongly suggests an additional level of regulation in the expression of these channels, that is also tissue-specific.
Affiliation(s)
- N C H Kerr
- Departments of Physiology and Pharmacology, and Clinical Sciences South Bristol, School of Medical Sciences, University of Bristol, Bristol, BS8 1TD, UK
|
31
|
Abstract
The fact that promoters are essential for the function of all genes presents the basis of the general idea that retrotranspositions give rise to processed pseudogenes. However, recent studies have demonstrated that some retrotransposed genes are transcriptionally active. Because promoters are not thought to be retrotransposed along with exonic sequences, these transcriptionally active genes must have acquired a functional promoter by mechanisms that are yet to be determined. Hence, comparison between a retrotransposed gene and its source gene appears to provide a unique opportunity to investigate the promoter creation for a new gene. Here, we identified 29 gene pairs in the human genome, consisting of a functional retrotransposed gene and its parental gene, and compared their respective promoters. In more than half of these cases, we unexpectedly found that a large part of the core promoter had been transcribed, reverse transcribed, and then integrated to be operative at the transposed locus. This observation can be ascribed to the recent discovery that transcription start sites tend to be interspersed rather than situated at 1 specific site. This propensity could confer retrotransposability to promoters per se. Accordingly, the retrotransposability can explain the genesis of some alternative promoters.
Affiliation(s)
- Kohji Okamura
- Human Genome Centre, Institute of Medical Science, University of Tokyo, Tokyo, Japan
|
32
|
Deshayes C, Perrodou E, Euphrasie D, Frapy E, Poch O, Bifani P, Lecompte O, Reyrat JM. Detecting the molecular scars of evolution in the Mycobacterium tuberculosis complex by analyzing interrupted coding sequences. BMC Evol Biol 2008; 8:78. [PMID: 18325090 PMCID: PMC2277376 DOI: 10.1186/1471-2148-8-78] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 08/23/2007] [Accepted: 03/06/2008] [Indexed: 11/30/2022]
Abstract
Background: Computer-assisted analyses have shown that all bacterial genomes contain a small percentage of open reading frames with a frameshift or in-frame stop codon. We report here a comparative analysis of these interrupted coding sequences (ICDSs) in six isolates of M. tuberculosis, two of M. bovis and one of M. africanum and question their phenotypic impact and evolutionary significance. Results: ICDSs were classified as "common to all strains" or "strain-specific". Common ICDSs are believed to result from mutations acquired before the divergence of the species, whereas strain-specific ICDSs were acquired after this divergence. Comparative analyses of these ICDSs therefore define the molecular signature of a particular strain, phylogenetic lineage or species, which may be useful for inferring phenotypic traits such as virulence and molecular relationships. For instance, in silico analysis of the W-Beijing lineage of M. tuberculosis, an emergent family involved in several outbreaks, is readily distinguishable from other phyla by its smaller number of common ICDSs, including at least one known to be associated with virulence. Our observation was confirmed through the sequencing analysis of ICDSs in a panel of 21 clinical M. tuberculosis strains. This analysis further illustrates the divergence of the W-Beijing lineage from other phyla in terms of the number of full-length ORFs not containing a frameshift. We further show that ICDS formation is not associated with the presence of a mutated promoter, and suggest that promoter extinction is not the main cause of pseudogene formation. Conclusion: The correlation between ICDSs, function and phenotypes could have important evolutionary implications. This study provides population geneticists with a list of targets, which could undergo selective pressure and thus alter relationships between the various lineages of M. tuberculosis strains and their host. This approach could be applied to any closely related bacterial strains or species for which several genome sequences are available.
Affiliation(s)
- Caroline Deshayes
- Université Paris Descartes, Faculté de Médecine René Descartes, Paris Cedex 15, F-75730, France.
|
33
|
Harrison P, Yu Z. Frame disruptions in human mRNA transcripts, and their relationship with splicing and protein structures. BMC Genomics 2007; 8:371. [PMID: 17937804 PMCID: PMC2194788 DOI: 10.1186/1471-2164-8-371] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 04/03/2007] [Accepted: 10/15/2007] [Indexed: 11/24/2022]
Abstract
Background: Efforts to gather genomic evidence for the processes of gene evolution are ongoing, and are closely coupled to improved gene annotation methods. Such annotation is complicated by the occurrence of disrupted mRNAs (dmRNAs), harbouring frameshifts and premature stop codons, which can be considered indicators of decay into pseudogenes. Results: We have derived a procedure to annotate dmRNAs, and have applied it to human data. Subsequences are generated from parsing at key frame-disruption positions and are required to align significantly within any original protein homology. We find 419 high-quality human dmRNAs (3% of total). Significant dmRNA subpopulations include: zinc-finger-containing transcription factors with long disrupted exons, and antisense homologies to distal genes. We analysed the distribution of initial frame disruptions in dmRNAs with respect to positions of: (i) protein domains, (ii) alternatively-spliced exons, and (iii) regions susceptible to nonsense-mediated decay (NMD). We find significant avoidance of protein-domain disruption (indicating a selection pressure for this), and highly significant overrepresentation of disruptions in alternatively-spliced exons, and 'non-NMD' regions. We do not find any evidence for evolution of novelty in protein structures through frameshifting. Conclusion: Our results indicate largely negative selection pressures related to frame disruption during gene evolution.
Affiliation(s)
- Paul Harrison
- Department of Biology, McGill University, Stewart Biology Building, 1205 Docteur Penfield Ave., Montreal, QC, H3A 1B1 Canada.
|
34
|
Zhao X, McGirr KM, Buehring GC. Potential evolutionary influences on overlapping reading frames in the bovine leukemia virus pXBL region. Genomics 2007; 89:502-11. [PMID: 17239558 DOI: 10.1016/j.ygeno.2006.12.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 09/25/2006] [Revised: 11/27/2006] [Accepted: 12/14/2006] [Indexed: 01/25/2023]
Abstract
Bovine leukemia virus contains a pXBL region encoding the 3' parts of four regulatory proteins (Tax, Rex, G4, R3) in overlapping reading frames. Here we report the pXBL polymorphisms of 30 isolates from four countries. Rates of overall and synonymous substitutions were consistently lower, and nucleotide/amino acid composition bias and codon bias higher, in more-overlapped than in less-overlapped regions. Ratios of nonsynonymous/synonymous substitutions were lowest in the tax gene and its subregions. The 5' parts of the four genes showed selection patterns corresponding to their genomic context outside of the pXBL region. Longer G4 variants due to a natural stop codon mutation had additional triple overlap with reduced sequence variability. These data support the concept that a higher level of overlapping in coding regions correlates with greater evolutionary constraint. Tax, the most conserved among the four regulatory proteins, showed purifying selection consistent with its importance in the viral life cycle.
Affiliation(s)
- Xiangrong Zhao
- Graduate Program in Endocrinology, University of California at Berkeley, 3060 Valley Life Science Building, Berkeley, CA 94720-3140, USA.
|