1
Abstract
Members of the Closteroviridae and Potyviridae families of the plant positive-strand RNA viruses encode one or two papain-like leader proteinases. In addition to a C-terminal proteolytic domain, each of these proteinases possesses a nonproteolytic N-terminal domain. We compared functions of the several leader proteinases using a gene swapping approach. The leader proteinase (L-Pro) of Beet yellows virus (BYV; a closterovirus) was replaced with L1 or L2 proteinases of Citrus tristeza virus (CTV; another closterovirus), P-Pro proteinase of Lettuce infectious yellows virus (LIYV; a crinivirus), and HC-Pro proteinase of Tobacco etch virus (a potyvirus). Each foreign proteinase efficiently processed the chimeric BYV polyprotein in vitro. However, only L1 and P-Pro, not L2 and HC-Pro, were able to rescue the amplification of the chimeric BYV variants. The combined expression of L1 and L2 resulted in an increased RNA accumulation compared to that of the parental BYV. Remarkably, this L1-L2 chimera exhibited reduced invasiveness and inability to move from cell to cell. Similar analyses of the BYV hybrids, in which only the papain-like domain of L-Pro was replaced with those derived from L1, L2, P-Pro, and HC-Pro, also revealed functional specialization of these domains. In subcellular-localization experiments, distinct patterns were observed for the leader proteinases of BYV, CTV, and LIYV. Taken together, these results demonstrated that, in addition to a common proteolytic activity, the leader proteinases of closteroviruses possess specialized functions in virus RNA amplification, virus invasion, and cell-to-cell movement. The phylogenetic analysis suggested that functionally distinct L1 and L2 of CTV originated by a gene duplication event.
Affiliation(s)
- C W Peng
- Department of Botany and Plant Pathology and Center for Gene Research and Biotechnology, Oregon State University, Corvallis, Oregon 97331, USA

2
Mushegian AR, Vishnivetskiy SA, Gurevich VV. Conserved phosphoprotein interaction motif is functionally interchangeable between ataxin-7 and arrestins. Biochemistry 2000; 39:6809-13. [PMID: 10841760 DOI: 10.1021/bi992694y] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.9] [Indexed: 11/29/2022]
Abstract
Olivopontocerebellar atrophy with retinal degeneration is a hereditary neurodegenerative disorder that belongs to the subtype II of the autosomal dominant cerebellar ataxias and is characterized by early-onset cerebellar and macular degeneration preceded by diagnostically useful tritan colorblindness. The gene mutated in the disease (SCA7) has been mapped to chromosome 3p12-13.5, and positional cloning identified the cause of the disease as CAG repeat expansion in this gene. The SCA7 gene product, ataxin-7, is an 897 amino acid protein with an expandable polyglutamine tract close to its N-terminus. No clues to ataxin-7 function have been obtained from sequence database searches. Here we report that ataxin-7 has a motif of ca. 50 amino acids, related to the phosphate-binding site of arrestins. To test the relevance of this sequence similarity, we introduced the putative ataxin-7 phosphate-binding site into visual arrestin and beta-arrestin. Both chimeric arrestins retain receptor-binding affinity and show characteristic high selectivity for phosphorylated activated forms of rhodopsin and beta-adrenergic receptor, respectively. Although the insertion of a Gly residue (absent in arrestins but present in the putative phosphate-binding site of ataxin-7) disrupts the function of visual arrestin-ataxin-7 chimera, it enhances the function of beta-arrestin-ataxin-7 chimera. Taken together, our data suggest that the arrestin-like site in the ataxin-7 sequence is a functional phosphate-binding site. The presence of the phosphate-binding site in ataxin-7 suggests that this protein may be involved in phosphorylation-dependent binding to its protein partner(s) in the cell.
Affiliation(s)
- A R Mushegian
- Ralph and Muriel Roberts Laboratory for Vision Science, Sun Health Research Institute, 10515 West Santa Fe Drive, Sun City, Arizona 85351, USA

3

4
Affiliation(s)
- E V Koonin
- Computational Biology Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA

5
Mushegian AR, Garey JR, Martin J, Liu LX. Large-scale taxonomic profiling of eukaryotic model organisms: a comparison of orthologous proteins encoded by the human, fly, nematode, and yeast genomes. Genome Res 1998; 8:590-8. [PMID: 9647634 DOI: 10.1101/gr.8.6.590] [Citation(s) in RCA: 122] [Impact Index Per Article: 4.7] [Indexed: 11/25/2022]
Abstract
Comparisons of DNA and protein sequences between humans and model organisms, including the yeast Saccharomyces cerevisiae, the nematode Caenorhabditis elegans, and the fruit fly Drosophila melanogaster, are a significant source of information about the function of human genes and proteins in both normal and disease states. Important questions regarding cross-species sequence comparison remain unanswered, including (1) the fraction of the metabolic, signaling, and regulatory pathways that is shared by humans and the various model organisms; and (2) the validity of functional inferences based on sequence homology. We addressed these questions by analyzing the available fractions of human, fly, nematode, and yeast genomes for orthologous protein-coding genes, applying strict criteria to distinguish between candidate orthologous and paralogous proteins. Forty-two quartets of proteins could be identified as candidate orthologs. Twenty-four Drosophila protein sequences were more similar to their human orthologs than the corresponding nematode proteins. Analysis of sequence substitutions and evolutionary distances in this data set revealed that most C. elegans genes are evolving more rapidly than Drosophila genes, suggesting that unequal evolutionary rates may contribute to the differences in similarity to human protein sequences. The available fraction of Drosophila proteins appears to lack representatives of many protein families and domains, reflecting the relative paucity of genomic data from this species.
Affiliation(s)
- A R Mushegian
- AxyS Pharmaceuticals, Inc., La Jolla, California 92037, USA.

6
Affiliation(s)
- V Morozov
- LION, Bioscience AG, Heidelberg, Germany

7
Koonin EV, Mushegian AR, Galperin MY, Walker DR. Comparison of archaeal and bacterial genomes: computer analysis of protein sequences predicts novel functions and suggests a chimeric origin for the archaea. Mol Microbiol 1997; 25:619-37. [PMID: 9379893 DOI: 10.1046/j.1365-2958.1997.4821861.x] [Citation(s) in RCA: 235] [Impact Index Per Article: 8.7] [Indexed: 02/05/2023]
Abstract
Protein sequences encoded in three complete bacterial genomes, those of Haemophilus influenzae, Mycoplasma genitalium and Synechocystis sp., and the first available archaeal genome sequence, that of Methanococcus jannaschii, were analysed using the BLAST2 algorithm and methods for amino acid motif detection. Between 75% and 90% of the predicted proteins encoded in each of the bacterial genomes and 73% of the M. jannaschii proteins showed significant sequence similarity to proteins from other species. The fraction of bacterial and archaeal proteins containing regions conserved over long phylogenetic distances is nearly the same and close to 70%. Functions of 70-85% of the bacterial proteins and about 70% of the archaeal proteins were predicted with varying precision. This contrasts with the previous report that more than half of the archaeal proteins have no homologues and shows that, with more sensitive methods and detailed analysis of conserved motifs, archaeal genomes become as amenable to meaningful interpretation by computer as bacterial genomes. The analysis of conserved motifs resulted in the prediction of a number of previously undetected functions of bacterial and archaeal proteins and in the identification of novel protein families. In spite of the generally high conservation of protein sequences, orthologues of 25% or less of the M. jannaschii genes were detected in each individual completely sequenced genome, supporting the uniqueness of archaea as a distinct domain of life. About 53% of the M. jannaschii proteins belong to families of paralogues, a fraction similar to that in bacteria with larger genomes, such as Synechocystis sp. and Escherichia coli, but higher than that in H. influenzae, which has approximately the same number of genes as M. jannaschii. Certain groups of proteins, e.g. molecular chaperones and DNA repair enzymes, thought to be ubiquitous and represented in the minimal gene set derived by bacterial genome comparison, are missing in M. jannaschii, indicating massive non-orthologous displacement of genes responsible for essential functions. An unexpectedly large fraction of the M. jannaschii gene products, 44%, shows significantly higher similarity to bacterial than to eukaryotic proteins, compared with 13% that have eukaryotic proteins as their closest homologues (the rest of the proteins show approximately the same level of similarity to bacterial and eukaryotic homologues or have no homologues). Proteins involved in translation, transcription, replication and protein secretion are most closely related to eukaryotic proteins, whereas metabolic enzymes, metabolite uptake systems, enzymes for cell wall biosynthesis and many uncharacterized proteins appear to be 'bacterial'. A similar prevalence of proteins of apparent bacterial origin was observed among the currently available sequences from the distantly related archaeal genus, Sulfolobus. It is likely that the evolution of archaea included at least one major merger between ancestral cells from the bacterial lineage and the lineage leading to the eukaryotic nucleocytoplasm.
Affiliation(s)
- E V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.

8

9
Mushegian AR, Bassett DE, Boguski MS, Bork P, Koonin EV. Positionally cloned human disease genes: patterns of evolutionary conservation and functional motifs. Proc Natl Acad Sci U S A 1997; 94:5831-6. [PMID: 9159160 PMCID: PMC20866 DOI: 10.1073/pnas.94.11.5831] [Citation(s) in RCA: 177] [Impact Index Per Article: 6.6] [Indexed: 02/04/2023] Open
Abstract
Positional cloning has already produced the sequences of more than 70 human genes associated with specific diseases. In addition to their medical importance, these genes are of interest as a set of human genes isolated solely on the basis of the phenotypic effect of the respective mutations. We analyzed the protein sequences encoded by the positionally cloned disease genes using an iterative strategy combining several sensitive computer methods. Comparisons to complete sequence databases and to separate databases of nematode, yeast, and bacterial proteins showed that for most of the disease gene products, statistically significant sequence similarities are detectable in each of the model organisms. Only the nematode genome encodes apparent orthologs with conserved domain architecture for the majority of the disease genes. In yeast and bacterial homologs, domain organization is typically not conserved, and sequence similarity is limited to individual domains. Generally, human genes complement mutations only in orthologous yeast genes. Most of the positionally cloned genes encode large proteins with several globular and nonglobular domains, the functions of some or all of which are not known. We detected conserved domains and motifs not described previously in a number of proteins encoded by disease genes and predicted functions for some of them. These predictions include an ATP-binding domain in the product of hereditary nonpolyposis colon cancer gene (a MutL homolog), which is conserved in the HS90 family of chaperone proteins, type II DNA topoisomerases, and histidine kinases, and a nuclease domain homologous to bacterial RNase D and the 3'-5' exonuclease domain of DNA polymerase I in the Werner syndrome gene product.
Affiliation(s)
- A R Mushegian
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

10
Heath JD, Boulton MI, Raineri DM, Doty SL, Mushegian AR, Charles TC, Davies JW, Nester EW. Discrete regions of the sensor protein virA determine the strain-specific ability of Agrobacterium to agroinfect maize. Mol Plant Microbe Interact 1997; 10:221-7. [PMID: 9057328 DOI: 10.1094/mpmi.1997.10.2.221] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Indexed: 05/08/2023]
Abstract
The ability of Agrobacterium strains to infect transformation-recalcitrant maize plants has been shown to be determined mainly by the virA locus, implicating vir gene induction as the major factor influencing maize infection. In this report, we further explore the roles of vir induction-associated bacterial factors in maize infection using the technique of agroinfection. The Ti plasmid and virA source are shown to be important in determining the ability of a strain to infect maize, and the monosaccharide binding protein ChvE is absolutely required for maize agroinfection. The linker domain of VirAC58 from an agroinfection-competent strain, C58, is sufficient to convert VirAA6 of a nonagroinfecting strain, A348, to agroinfection competence. The periplasmic domain of VirAC58 is also able to confer a moderate level of agroinfection competence to VirAA6. In addition, the VirAA6 protein from A348 is agroinfection competent when removed from its cognate Ti plasmid background and placed in a pTiC58 background. The presence of a pTiA6-encoded, VirAA6-specific inhibitor is hypothesized and examined.
Affiliation(s)
- J D Heath
- University of Washington, Department of Microbiology, Seattle 98195-7242, USA

11
Abstract
The availability of complete genome sequences of cellular life forms creates the opportunity to explore the functional content of the genomes and evolutionary relationships between them at a new qualitative level. With the advent of these sequences, the construction of a minimal gene set sufficient for sustaining cellular life and reconstruction of the genome of the last common ancestor of bacteria, eukaryotes, and archaea become realistic, albeit challenging, research projects. A version of the minimal gene set for modern-type cellular life derived by comparative analysis of two bacterial genomes, those of Haemophilus influenzae and Mycoplasma genitalium, consists of approximately 250 genes. A comparison of the protein sequences encoded in these genes with those of the proteins encoded in the complete yeast genome suggests that the last common ancestor of all extant life might have had an RNA genome.
Affiliation(s)
- E V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.

12
Abstract
The SPO1 gene of Saccharomyces cerevisiae has been cloned and sequenced. The Spo1 protein reveals significant similarity with fungal phospholipase B (PLB) enzymes. Features of the SPO1 gene sequence are presented.
Affiliation(s)
- G G Tevzadze
- Department of Molecular Genetics and Cell Biology, University of Chicago, IL 60637, USA

13
Abstract
Most of the genes involved in the development of multicellular eukaryotes encode large, multidomain proteins. To decipher the major trends in the evolution of these proteins and make functional predictions for uncharacterized domains, we applied a strategy of sequence database search that includes construction of specialized data sets and iterative subsequence masking. This computational approach allowed us to detect previously unnoticed but potentially important sequence similarities. Developmental gene products are enriched in predicted nonglobular regions as compared to unbiased sets of eukaryotic and bacterial proteins. Developmental genes that act intracellularly, primarily at the level of transcription regulation, typically code for proteins containing highly conserved DNA-binding domains, most of which appear to have evolved before the radiation of bacteria and eukaryotes. We identified bacterial homologues, namely a protein family that includes the Escherichia coli universal stress protein UspA, for the MADS-box transcription regulators previously described only in eukaryotes. We also show that the FUS6 family of eukaryotic proteins contains a putative DNA-binding domain related to bacterial helix-turn-helix transcription regulators. Developmental proteins that act extracellularly are less conserved and often do not have bacterial homologues. Nevertheless, several provocative similarities between different groups of such proteins were detected.
Affiliation(s)
- A R Mushegian
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.

14
Abstract
The recently sequenced genome of the parasitic bacterium Mycoplasma genitalium contains only 468 identified protein-coding genes that have been dubbed a minimal gene complement [Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., et al. (1995) Science 270, 397-403]. Although the M. genitalium gene complement is indeed the smallest among known cellular life forms, there is no evidence that it is the minimal self-sufficient gene set. To derive such a set, we compared the 468 predicted M. genitalium protein sequences with the 1703 protein sequences encoded by the other completely sequenced small bacterial genome, that of Haemophilus influenzae. M. genitalium and H. influenzae belong to two ancient bacterial lineages, i.e., Gram-positive and Gram-negative bacteria, respectively. Therefore, the genes that are conserved in these two bacteria are almost certainly essential for cellular function. It is this category of genes that is most likely to approximate the minimal gene set. We found that 240 M. genitalium genes have orthologs among the genes of H. influenzae. This collection of genes falls short of comprising the minimal set as some enzymes responsible for intermediate steps in essential pathways are missing. The apparent reason for this is the phenomenon that we call nonorthologous gene displacement when the same function is fulfilled by nonorthologous proteins in two organisms. We identified 22 nonorthologous displacements and supplemented the set of orthologs with the respective M. genitalium genes. After examining the resulting list of 262 genes for possible functional redundancy and for the presence of apparently parasite-specific genes, 6 genes were removed. We suggest that the remaining 256 genes are close to the minimal gene set that is necessary and sufficient to sustain the existence of a modern-type cell. 
Most of the proteins encoded by the genes from the minimal set have eukaryotic or archaeal homologs but seven key proteins of DNA replication do not. We speculate that the last common ancestor of the three primary kingdoms had an RNA genome. Possibilities are explored to further reduce the minimal set to model a primitive cell that might have existed at a very early stage of life evolution.
Affiliation(s)
- A R Mushegian
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

15
Koonin EV, Mushegian AR, Bork P. Non-orthologous gene displacement. Trends Genet 1996; 12:334-6. [PMID: 8855656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]

16
Romanova LY, Deriagin GV, Mashkova TD, Tumeneva IG, Mushegian AR, Kisselev LL, Alexandrov IA. Evidence for selection in evolution of alpha satellite DNA: the central role of CENP-B/pJ alpha binding region. J Mol Biol 1996; 261:334-40. [PMID: 8780776 DOI: 10.1006/jmbi.1996.0466] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.0] [Indexed: 02/02/2023]
Abstract
Conservation of DNA segments performing sequence-related functions is a landmark of selection and functional significance. Phylogenetic variability of alpha satellite and apparent absence of conserved regions calls its functional significance into question, even though sequence-specific alpha satellite-binding proteins pJ alpha and CENP-B have been discovered. Moreover, the function of pJ alpha is obscure and CENP-B binding satellite DNA, which is thought to participate in centromere formation, is found only in few species and not necessarily in all chromosomes. Analysis of alpha satellite evolution allows us to recognize the order in this variability. Here we report a new alpha satellite suprachromosomal family, which together with the four defined earlier, covers all known alpha satellite sequences. Although each family has its characteristic types of monomers, they all descend from two prototypes, A and B. We show that most differences between prototypes are concentrated in a short region (positions 35 to 51), which exists in two alternative states: it matches a binding site for pJ alpha in type A and the one for CENP-B in type B. Lower primates have only type A monomers whereas great apes have both A and B. The new family is formed by monomeric types almost identical to A and B prototypes, thus representing a living relic of alpha satellite. Analysis of these data shows that selection-driven evolution, rather than random fixation of mutations, formed the distinction between A and B types. To our knowledge, this is the first evidence for selection in any of the known satellite DNAs.
Affiliation(s)
- L Y Romanova
- National Research Center of Mental Health, Moscow, Russia

17

18
Mushegian AR, Fullner KJ, Koonin EV, Nester EW. A family of lysozyme-like virulence factors in bacterial pathogens of plants and animals. Proc Natl Acad Sci U S A 1996; 93:7321-6. [PMID: 8692991 PMCID: PMC38982 DOI: 10.1073/pnas.93.14.7321] [Citation(s) in RCA: 97] [Impact Index Per Article: 3.5] [Indexed: 02/01/2023] Open
Abstract
We describe a conserved family of bacterial gene products that includes the VirB1 virulence factor encoded by tumor-inducing plasmids of Agrobacterium spp., proteins involved in conjugative DNA transfer of broad-host-range bacterial plasmids, and gene products that may be involved in invasion by Shigella spp. and Salmonella enterica. Sequence analysis and structural modeling show that the proteins in this group are related to chicken egg white lysozyme and are likely to adopt a lysozyme-like structural fold. Based on their similarity to lysozyme, we predict that these proteins have glycosidase activity. Iterative database searches with three conserved sequence motifs from this protein family detect a more distant relationship to bacterial and bacteriophage lytic transglycosylases, and goose egg white lysozyme. Two acidic residues in the VirB1 protein of Agrobacterium tumefaciens form a putative catalytic dyad. Each of these residues was changed into the corresponding amide by site-directed mutagenesis. Strains of A. tumefaciens that express mutated VirB1 proteins have a significantly reduced virulence. We hypothesize that many bacterial proteins involved in export of macromolecules belong to a widespread class of hydrolases and cleave beta-1,4-glycosidic bonds as part of their function.
Affiliation(s)
- A R Mushegian
- Department of Microbiology, University of Washington, Seattle, WA 98195, USA

19
Karasev AV, Nikolaeva OV, Mushegian AR, Lee RF, Dawson WO. Organization of the 3'-terminal half of beet yellow stunt virus genome and implications for the evolution of closteroviruses. Virology 1996; 221:199-207. [PMID: 8661428 DOI: 10.1006/viro.1996.0366] [Citation(s) in RCA: 44] [Impact Index Per Article: 1.6] [Indexed: 02/01/2023]
Abstract
The 3'-terminal half of the beet yellow stunt virus (BYSV) genome, 10,545 nt, has been cloned and sequenced. The sequenced portion of the BYSV genome encompasses 10 open reading frames (ORFs) and 241 nt of the 3' untranslated region. The sequence spans, in the 5' to 3' direction, the C-terminal region of the replication-associated polyprotein gene (ORF 1a) which includes the set of motifs typical of helicases (HEL), the entire 53-kDa polymerase (RdRp) gene (ORF 1b), and genes encoding 30-kDa (ORF 2), 6-kDa (ORF 3), 66-kDa (ORF 4), 61-kDa (ORF 5), 25-kDa (ORF 6), 23.7-kDa (coat protein, CP) (ORF 7), 18-kDa (ORF 8), and 22-kDa (ORF 9) proteins. The double-stranded RNA "replicative form" of the BYSV was demonstrated to have a nontemplate G residue at the 3' terminus of the (+) strand. The RdRp of BYSV is presumably expressed via a +1 ribosomal frameshift. The five-gene module conserved among closteroviruses was identified in BYSV; it includes a gene array coding for a 6-kDa small hydrophobic protein, a 66-kDa homolog of the cellular HSP70 heat shock proteins, a 61-kDa protein, and a 25-kDa diverged copy of the CP followed by the CP gene itself. Phylogenetic analysis of the replication-associated HEL and RdRp domains as well as proteins from the five-gene module demonstrated the closest relationship between BYSV and two other closteroviruses, beet yellows (BYV) and citrus tristeza (CTV) viruses. Like CTV, the BYSV genome contains a 30-kDa protein gene between the RdRp and the 6-kDa protein genes, and like BYV it has only two genes downstream of the CP gene. The organization of the BYSV genome appears to be intermediate between BYV and CTV, which suggests that these three viruses might represent three distinct but probably close stages in the closterovirus evolution.
Affiliation(s)
- A V Karasev
- University of Florida, Citrus Research and Education Center, Lake Alfred 33850-2299, USA

20
Abstract
The complete sequences of two small bacterial genomes have recently become available, and those of several more species should follow within the next two years. Sequence comparisons show that most bacterial proteins are highly conserved in evolution, allowing predictions to be made about the functions of most products of an uncharacterized genome. Bacterial genomes differ vastly in their gene repertoires. Although genes for components of the translation and transcription machinery, and for molecular chaperones, are typically maintained, many regulatory and metabolic systems are absent in bacteria with small genomes. Mycoplasma genitalium, with the smallest known genome of any cellular life form, lacks virtually all known regulatory genes, and its gene expression may be regulated differently than in other bacteria. Genome organization is evolutionarily labile: extensive gene shuffling leaves only very few conserved gene arrays in distantly related bacteria.
Affiliation(s)
- E V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA

21
Tatusov RL, Mushegian AR, Bork P, Brown NP, Hayes WS, Borodovsky M, Rudd KE, Koonin EV. Metabolism and evolution of Haemophilus influenzae deduced from a whole-genome comparison with Escherichia coli. Curr Biol 1996; 6:279-91. [PMID: 8805245 DOI: 10.1016/s0960-9822(02)00478-5] [Citation(s) in RCA: 228] [Impact Index Per Article: 8.1] [Indexed: 02/02/2023]
Abstract
BACKGROUND The 1.83 Megabase (Mb) sequence of the Haemophilus influenzae chromosome, the first completed genome sequence of a cellular life form, has been recently reported. Approximately 75 % of the 4.7 Mb genome sequence of Escherichia coli is also available. The life styles of the two bacteria are very different - H. influenzae is an obligate parasite that lives in human upper respiratory mucosa and can be cultivated only on rich media, whereas E. coli is a saprophyte that can grow on minimal media. A detailed comparison of the protein products encoded by these two genomes is expected to provide valuable insights into bacterial cell physiology and genome evolution. RESULTS We describe the results of computer analysis of the amino-acid sequences of 1703 putative proteins encoded by the complete genome of H. influenzae. We detected sequence similarity to proteins in current databases for 92 % of the H. influenzae protein sequences, and at least a general functional prediction was possible for 83 %. A comparison of the H. influenzae protein sequences with those of 3010 proteins encoded by the sequenced 75 % of the E. coli genome revealed 1128 pairs of apparent orthologs, with an average of 59 % identity. In contrast to the high similarity between orthologs, the genome organization and the functional repertoire of genes in the two bacteria were remarkably different. The smaller genome size of H. influenzae is explained, to a large extent, by a reduction in the number of paralogous genes. There was no long range colinearity between the E. coli and H. influenzae gene orders, but over 70 % of the orthologous genes were found in short conserved strings, only about half of which were operons in E. coli. Superposition of the H. influenzae enzyme repertoire upon the known E. coli metabolic pathways allowed us to reconstruct similar and alternative pathways in H. influenzae and provides an explanation for the known nutritional requirements. 
CONCLUSIONS By comparing proteins encoded by the two bacterial genomes, we have shown that extensive gene shuffling and variation in the extent of gene paralogy are major trends in bacterial evolution; this comparison has also allowed us to deduce crucial aspects of the largely uncharacterized metabolism of H. influenzae.
Collapse
Affiliation(s)
- R L Tatusov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
22
Abstract
Viruses have developed successful strategies for propagation at the expense of their host cells. Efficient gene expression, genome multiplication, and invasion of the host are enabled by virus-encoded genetic elements, many of which are well characterized. Sequences derived from plant DNA and RNA viruses can be used to control expression of other genes in vivo. The main groups of plant virus genetic elements useful in genetic engineering are reviewed, including the signals for DNA-dependent and RNA-dependent RNA synthesis, sequences on the virus mRNAs that enable translational control, and sequences that control processing and intracellular sorting of virus proteins. Use of plant viruses as extrachromosomal expression vectors is also discussed, along with the issue of their stability.
Affiliation(s)
- A R Mushegian
- Department of Plant Pathology, University of Kentucky, Lexington 40546-0091, USA
23
Ducasse DA, Mushegian AR, Shepherd RJ. Gene I mutants of peanut chlorotic streak virus, a caulimovirus, replicate in plants but do not move from cell to cell. J Virol 1995; 69:5781-6. [PMID: 7543587 PMCID: PMC189441 DOI: 10.1128/jvi.69.9.5781-5786.1995]
Abstract
Gene I of peanut chlorotic streak virus (PClSV), a caulimovirus, is homologous to gene I of other caulimoviruses and may encode a protein for virus movement. To evaluate the function of gene I, several mutations were created in this gene of an infectious, partially redundant clone of PClSV. Constructs with an in-frame deletion and a single amino acid substitution in gene I were not infectious. To test for replication of these mutants in primarily infected cells, an immunosorbent PCR technique was devised. Virus particles formed by mutants in plants were recovered by binding to antivirus antibodies on a solid matrix and DNase treated to discriminate against residual inoculum, and DNA of trapped virions was subjected to PCR amplification. Gene I mutants were shown to direct formation of encapsidated DNA as revealed by a PCR product. Control gene V mutants (reverse transcriptase essential for replication) did not yield a PCR product. Quantitative PCR allowed estimation of the proportion of cells initially infected by gene I mutants and the amount of extractable virus per cell. It is concluded that PClSV gene I encodes a movement protein and that the immunoselection-PCR technique is useful in studying subliminal virus infection in plants.
Affiliation(s)
- D A Ducasse
- Department of Plant Pathology, University of Kentucky, Lexington 40503, USA
24
Abstract
Using methods for database screening with individual protein sequences and alignment blocks, a conserved domain is delineated in a group of proteins including several FAD-dependent oxidases. Two motifs within this domain resemble phosphate-binding loops and may be directly involved in FAD binding. These motifs can be readily distinguished from previously described nucleotide-binding sites using a method for database screening with position-dependent weight matrices derived from alignment blocks. Unexpectedly, this group of known and predicted FAD-dependent oxidases includes the product of the DIMINUTO gene, which is involved in Arabidopsis development, and its homologues from man and Mycobacterium leprae.
Affiliation(s)
- A R Mushegian
- Department of Microbiology, University of Washington, Seattle 98195, USA
25
Mushegian AR, Wolff JA, Richins RD, Shepherd RJ. Molecular analysis of the essential and nonessential genetic elements in the genome of peanut chlorotic streak caulimovirus. Virology 1995; 206:823-34. [PMID: 7531917 DOI: 10.1006/viro.1995.1005]
Abstract
The DNA genome of caulimoviruses contains a set of essential genes: I (movement gene), IV (major capsid protein gene), V (reverse transcriptase gene), and VI (gene coding for a post-transcriptional activator of the expression of other virus genes). In peanut chlorotic streak caulimovirus (PClSV), three ORFs, A, B, and C, are located between genes I and IV. They are dissimilar to other caulimovirus ORFs. ORF VII of PClSV is a homolog of ORF VII of soybean chlorotic mottle caulimovirus (SoCMV), but is not similar to the nonconserved ORF VII in other caulimoviruses. The sequence complementary to a portion of tRNA(Met), thought to be essential for the priming of minus-strand DNA synthesis in caulimoviruses, is located within the coding sequence of ORF A. To explore the functional significance of ORFs VII, A, B, and C, various mutations were engineered into an infectious DNA clone of PClSV. ORFs VII and B are shown to be dispensable, while ORFs A and C are essential. ORF C is a possible functional equivalent of gene III in other caulimoviruses. Sequences within ORF A that are required for efficient priming of minus-strand synthesis are likely to extend beyond the 12-bp tRNA-binding site. Complete deletion of ORF VII was correlated with severe symptoms, notably with the necrosis of apical meristems. The significance of these observations for the understanding of replication and pathogenesis of plant pararetroviruses and for the improvement of caulimovirus-based expression vectors is discussed.
Affiliation(s)
- A R Mushegian
- Department of Plant Pathology, University of Kentucky, Lexington 40546-0091
26
Koonin EV, Mushegian AR, Tatusov RL, Altschul SF, Bryant SH, Bork P, Valencia A. Eukaryotic translation elongation factor 1 gamma contains a glutathione transferase domain--study of a diverse, ancient protein superfamily using motif search and structural modeling. Protein Sci 1994; 3:2045-54. [PMID: 7703850 PMCID: PMC2142650 DOI: 10.1002/pro.5560031117]
Abstract
Using computer methods for multiple alignment, sequence motif search, and tertiary structure modeling, we show that eukaryotic translation elongation factor 1 gamma (EF1 gamma) contains an N-terminal domain related to class theta glutathione S-transferases (GST). GST-like proteins related to class theta comprise a large group including, in addition to typical GSTs and EF1 gamma, stress-induced proteins from bacteria and plants, bacterial reductive dehalogenases and beta-etherases, and several uncharacterized proteins. These proteins share 2 conserved sequence motifs with GSTs of other classes (alpha, mu, and pi). Tertiary structure modeling showed that in spite of the relatively low sequence similarity, the GST-related domain of EF1 gamma is likely to form a fold very similar to that in the known structures of class alpha, mu, and pi GSTs. One of the conserved motifs is implicated in glutathione binding, whereas the other motif probably is involved in maintaining the proper conformation of the GST domain. We predict that the GST-like domain in EF1 gamma is enzymatically active and that to exhibit GST activity, EF1 gamma has to form homodimers. The GST activity may be involved in the regulation of the assembly of multisubunit complexes containing EF1 and aminoacyl-tRNA synthetases by shifting the balance between glutathione, disulfide glutathione, thiol groups of cysteines, and protein disulfide bonds. The GST domain is a widespread, conserved enzymatic module that may be covalently or noncovalently complexed with other proteins. Regulation of protein assembly and folding may be 1 of the functions of GST.
Affiliation(s)
- E V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
27
Mushegian AR, Edskes HK, Koonin EV. Eukaryotic RNAse H shares a conserved domain with caulimovirus proteins that facilitate translation of polycistronic RNA. Nucleic Acids Res 1994; 22:4163-6. [PMID: 7937142 PMCID: PMC331909 DOI: 10.1093/nar/22.20.4163]
Abstract
RNAse H (RNH1 protein) from the trypanosomatid Crithidia fasciculata has a functionally uncharacterized N-terminal domain dispensable for the RNAse H activity. Using computer methods for database search and multiple alignment, we show that the N-terminal domains of RNH1 and its homologue encoded by a cDNA from chicken lens are related to the conserved domain in caulimovirus ORF VI product that facilitates translation of polycistronic virus RNA in plant cells. We hypothesize that the N-terminal domain of eukaryotic RNAse H performs an as yet uncharacterized regulatory function, possibly in mRNA translation or turnover.
Affiliation(s)
- A R Mushegian
- Department of Plant Pathology, University of Kentucky, Lexington 40546-0091
28
Affiliation(s)
- A R Mushegian
- Department of Plant Pathology, University of Kentucky, Lexington
29
Abstract
Amino acid sequences of enzymes that catalyze hydrolysis or phosphorolysis of the N-glycosidic bond in nucleosides and nucleotides (nucleosidases and phosphoribosyltransferases) were explored using computer methods for database similarity search and multiple alignment. Two new families, each including bacterial and eukaryotic enzymes, were identified. Family I consists of Escherichia coli AMP hydrolase (Amn), uridine phosphorylase (Udp), purine phosphorylase (DeoD), uncharacterized proteins from E. coli and Bacteroides uniformis, and, unexpectedly, a group of plant stress-inducible proteins. It is hypothesized that these plant proteins have evolved from nucleosidases and may possess nucleosidase activity. The proteins in this new family contain 3 conserved motifs, one of which was found also in eukaryotic purine nucleosidases, where it corresponds to the nucleoside-binding site. Family II comprises bacterial and eukaryotic thymidine phosphorylases and anthranilate phosphoribosyltransferases, a relationship that had not been suspected previously. Based on the known tertiary structure of E. coli thymidine phosphorylase, a structural interpretation was given to the sequence conservation in this family. The highest conservation is observed in the N-terminal alpha-helical domain, whose exact function is not known. Parts of the conserved active site of thymidine phosphorylases and anthranilate phosphoribosyltransferases were delineated. A motif in the putative phosphate-binding site is conserved in family II and in other phosphoribosyltransferases. Our analysis suggests that certain enzymes of very similar specificity, e.g., uridine and thymidine phosphorylases, could have evolved independently. In contrast, enzymes catalyzing such different reactions as AMP hydrolysis and uridine phosphorolysis or thymidine phosphorolysis and phosphoribosyl anthranilate synthesis are likely to have evolved from common ancestors.
Affiliation(s)
- A R Mushegian
- Department of Plant Pathology, University of Kentucky, Lexington 40546-0091
30
31
Mushegian AR, Koonin EV. Cell-to-cell movement of plant viruses. Insights from amino acid sequence comparisons of movement proteins and from analogies with cellular transport systems. Arch Virol 1993; 133:239-57. [PMID: 8257287 PMCID: PMC7086723 DOI: 10.1007/bf01313766]
Abstract
Cell-to-cell movement is a crucial step in plant virus infection. In many viruses, the movement function is secured by specific virus-encoded proteins. Amino acid sequence comparisons of these proteins revealed a vast superfamily containing a conserved sequence motif that may comprise a hydrophobic interaction domain. This superfamily combines proteins of viruses belonging to all principal groups of positive-strand RNA viruses, as well as single-stranded DNA-containing geminiviruses, double-stranded DNA-containing pararetroviruses (caulimoviruses and badnaviruses), and tospoviruses that have negative-strand RNA genomes with two ambisense segments. In several groups of positive-strand RNA viruses, the movement function is provided by the proteins encoded by the so-called triple gene block, including two putative small membrane-associated proteins and a putative RNA helicase. A distinct type of movement protein with a very high proline content is found in tymoviruses. It is concluded that classification of movement proteins based on comparison of their amino acid sequences does not correlate with the type of genome nucleic acid, with grouping of viruses based on phylogenetic analysis of replicative proteins, or with the virus host range. Recombination between unrelated or distantly related viruses could have played a major role in the evolution of the movement function. Limited sequence similarities were observed i) between movement proteins of dianthoviruses and the MIP family of cellular integral membrane proteins, and ii) between movement proteins of bromoviruses and cucumoviruses and the M1 protein of influenza viruses, which is involved in nuclear export of viral ribonucleoproteins. It is hypothesized that all movement proteins of plant viruses may mediate hydrophobic interactions between viral and cellular macromolecules.
Affiliation(s)
- A R Mushegian
- Department of Plant Pathology, University of Kentucky, Lexington
32
Koonin EV, Mushegian AR, Ryabov EV, Dolja VV. Diverse groups of plant RNA and DNA viruses share related movement proteins that may possess chaperone-like activity. J Gen Virol 1991; 72 (Pt 12):2895-903. [PMID: 1684985 DOI: 10.1099/0022-1317-72-12-2895]
Abstract
Amino acid sequences of plant virus proteins mediating cell-to-cell movement were compared to each other and to protein sequences in databases. Two families of movement proteins have been identified, the members of which show statistically significant sequence similarity. The first, larger family (I) encompasses the movement proteins of tobamo-, tobra-, caulimo- and comoviruses, apple chlorotic leaf spot virus (ACLSV) and geminiviruses with bipartite genomes. Thus this family includes viruses which move by two methods, those requiring the coat protein for the cell-to-cell spread (comoviruses) and those not having this requirement (tobamoviruses). The previously unsuspected relationship between the movement proteins of RNA and DNA viruses having no RNA stage in their life cycle (geminiviruses) suggested that their movement mechanisms might be similar. The second, smaller family (II) consists of the movement proteins of tricornaviruses (bromoviruses, cucumoviruses, alfalfa mosaic virus and tobacco streak virus) and dianthoviruses. Alignment of the sequences of family I movement proteins highlighted two motifs, centred at conserved Gly and Asp residues, respectively, which are assumed to be crucial for the movement protein function(s). Screening the amino acid sequence database revealed another conserved motif that is shared by a large subset of family I movement proteins (those of caulimo- and comoviruses, and ACLSV) and the family of cellular 90K heat shock proteins (HSP90). Based on the analogy to HSP90, it is speculated that many plant virus movement proteins may mediate virus transport in a chaperone-like manner.
Affiliation(s)
- E V Koonin
- Institute of Microbiology, USSR Academy of Sciences, Moscow
33
Mushegian AR, Malyshenko SI, Taliansky ME, Atabekov JG. Host-dependent Suppression of Temperature-sensitive Mutations in Tobacco Mosaic Virus Transport Gene. J Gen Virol 1989. [DOI: 10.1099/0022-1317-70-12-3421]