1
Affiliation(s)
- Stephen Branden Van Oss
- Department of Computational and Systems Biology, Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States of America
- Anne-Ruxandra Carvunis
- Department of Computational and Systems Biology, Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States of America
2
Zhu X, Wang J, Peng B, Shete S. Empirical estimation of sequencing error rates using smoothing splines. BMC Bioinformatics 2016; 17:177. [PMID: 27102907 PMCID: PMC4840868 DOI: 10.1186/s12859-016-1052-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/18/2015] [Accepted: 04/14/2016] [Indexed: 01/24/2023]
Abstract
Background: Next-generation sequencing has been used by investigators to address a diverse range of biological problems through, for example, polymorphism and mutation discovery and microRNA profiling. However, compared to conventional sequencing, the error rates for next-generation sequencing are often higher, which impacts the downstream genomic analysis. Recently, Wang et al. (BMC Bioinformatics 13:185, 2012) proposed a shadow regression approach to estimate the error rates for next-generation sequencing data based on the assumption of a linear relationship between the number of reads sequenced and the number of reads containing errors (denoted as shadows). However, this linear read-shadow relationship may not be appropriate for all types of sequence data. Therefore, it is necessary to estimate the error rates in a more reliable way without assuming linearity. We proposed an empirical error rate estimation approach that employs cubic and robust smoothing splines to model the relationship between the number of reads sequenced and the number of shadows.
Results: We performed simulation studies using a frequency-based approach to generate the read and shadow counts directly, which can mimic the real sequence counts data structure. Using simulation, we investigated the performance of the proposed approach and compared it to that of shadow linear regression. The proposed approach provided more accurate error rate estimations than the shadow linear regression approach for all the scenarios tested. We also applied the proposed approach to assess the error rates for the sequence data from the MicroArray Quality Control project, a mutation screening study, the Encyclopedia of DNA Elements project, and bacteriophage PhiX DNA samples.
Conclusions: The proposed empirical error rate estimation approach does not assume a linear relationship between the error-free read and shadow counts and provides more accurate estimations of error rates for next-generation, short-read sequencing data.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1052-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Xuan Zhu
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Jian Wang
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Bo Peng
- Department of Bioinformatics & Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Sanjay Shete
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA; Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
3
Lee CF, Lai HL, Lee YC, Chien CL, Chern Y. The A2A adenosine receptor is a dual coding gene: a novel mechanism of gene usage and signal transduction. J Biol Chem 2013; 289:1257-70. [PMID: 24293369 DOI: 10.1074/jbc.m113.509059] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 12/17/2022]
Abstract
The A2A adenosine receptor (A2AR) is a G protein-coupled receptor and a major target of caffeine. The A2AR gene encodes alternative transcripts that are initiated from at least two independent promoters. The different transcripts of the A2AR gene contain the same coding region and 3'-untranslated region and different 5'-untranslated regions that are highly conserved among species. We report here that in addition to the production of the A2AR protein, translation from an upstream, out-of-frame AUG of the rat A2AR gene produces a 134-amino acid protein (designated uORF5). An anti-uORF5 antibody recognized a protein of the predicted size of uORF5 in PC12 cells and rat brains. Up-regulation of A2AR transcripts by hypoxia led to increased levels of both the A2AR and uORF5 proteins. Moreover, stimulation of A2AR increased the level of the uORF5 protein via post-transcriptional regulation. Expression of the uORF5 protein suppressed the AP1-mediated transcription promoted by nerve growth factor and modulated the expression of several proteins that were implicated in the MAPK pathway. Taken together, our results show that the rat A2AR gene encodes two distinct proteins (A2AR and uORF5) in an A2AR-dependent manner. Our study reveals a new example of the complexity of the mammalian genome and provides novel insights into the function of A2AR.
Affiliation(s)
- Chien-fei Lee
- From the Institute of Neuroscience, School of Life Sciences, National Yang Ming University, Taipei 112, Taiwan
5
Affiliation(s)
- Frederick Sanger
- Medical Research Council Laboratory of Molecular Biology, Hills Road, CB2 2QH, Cambridge, UK
6
Ikehara K, Amada F, Yoshida S, Mikata Y, Tanaka A. A possible origin of newly-born bacterial genes: significance of GC-rich nonstop frame on antisense strand. Nucleic Acids Res 1996; 24:4249-55. [PMID: 8932380 PMCID: PMC146247 DOI: 10.1093/nar/24.21.4249] [Citation(s) in RCA: 25] [Impact Index Per Article: 0.9] [Indexed: 02/03/2023]
Abstract
Base compositions were examined at every position in codons of more than 50 genes from taxonomically different bacteria and of the corresponding antisense sequences on the bacterial genes. We propose that the nonstop frame on antisense strand [NSF(a)] of GC-rich bacterial genes is the most promising sequence for newly-born genes. Reasons are: (i) NSF(a) frequently appears on the antisense strand of GC-rich bacterial genes; (ii) base compositions at three positions in the codon are nearly symmetrical between the gene having around 55% GC content and the corresponding NSF(a); (iii) amino acid compositions of actual proteins are also similar to those of hypothetical proteins from the GC-rich NSF(a); and (iv) proteins from NSF(a) of 60% or more GC content are flexible enough to adapt to various molecules encountered as novel substrates, due to the high glycine content. To support our proposition, using a computer we generated hypothetical antisense sequences with the same base compositions as of NSF(a) at each base position in the codon, and examined properties of resulting proteins encoded by the imaginary genes. It was confirmed that NSF(a) of GC-rich gene carrying about 60% GC content is competent enough for a newly-born gene.
Affiliation(s)
- K Ikehara
- Department of Chemistry, Faculty of Science, Nara Women's University, Kita-uoya Nishi-machi, Japan.
7
Ringuette MJ, Spencer JH. Mapping the initiation sites of in vitro transcripts of bacteriophage S13. Biochim Biophys Acta 1994; 1218:331-8. [PMID: 8049259 DOI: 10.1016/0167-4781(94)90185-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 01/28/2023]
Abstract
Analysis of in vitro run-off transcripts synthesized by Escherichia coli RNA polymerase holoenzyme on linearized bacteriophage S13 DNA templates revealed five major transcription initiation sites. The sites, located at positions 45, 982, 1823 (1827), 4876 and 5211, are each within the boundaries of promoters or putative promoters previously mapped by footprinting and RNA polymerase binding analyses. They correspond to initiations at promoters upstream of the A, B, and D genes, and at a medium-affinity and a high-affinity RNA polymerase binding site P5211, respectively. Sequence analysis of the 5'-ends of two transcripts confirmed their initiation with pppA at nt 982 and nt 5211, the B gene and high-affinity binding site P5211, respectively. Some of the transcripts initiated at nt 4876 and nt 5211 terminated at nt 64, providing direct evidence of the functionality of a rho-independent termination site at nt 64.
Affiliation(s)
- M J Ringuette
- Department of Biochemistry, Queen's University, Kingston, Ontario, Canada
8
Abstract
Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes.
Affiliation(s)
- P K Keese
- Commonwealth Scientific and Industrial Organisation, Division of Plant Industry, Australian National University, Canberra
9
McKenna R, Xia D, Willingmann P, Ilag LL, Krishnaswamy S, Rossmann MG, Olson NH, Baker TS, Incardona NL. Atomic structure of single-stranded DNA bacteriophage phi X174 and its functional implications. Nature 1992; 355:137-43. [PMID: 1370343 PMCID: PMC4167681 DOI: 10.1038/355137a0] [Citation(s) in RCA: 167] [Impact Index Per Article: 5.2] [Indexed: 11/08/2022]
Abstract
The mechanism of DNA ejection, viral assembly and evolution are related to the structure of bacteriophage phi X174. The F protein forms a T = 1 capsid whose major folding motif is the eight-stranded antiparallel beta barrel found in many other icosahedral viruses. Groups of 5 G proteins form 12 dominating spikes that enclose a hydrophilic channel containing some diffuse electron density. Each G protein is a tight beta barrel with its strands running radially outwards and with a topology similar to that of the F protein. The 12 'pilot' H proteins per virion may be partially located in the putative ion channel. The small, basic J protein is associated with the DNA and is situated in an interior cleft of the F protein. Tentatively, there are three regions of partially ordered DNA structure,
Affiliation(s)
- R McKenna
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907
10
Willingmann P, Krishnaswamy S, McKenna R, Smith TJ, Olson NH, Rossmann MG, Stow PL, Incardona NL. Preliminary investigation of the phage phi X174 crystal structure. J Mol Biol 1990; 212:345-50. [PMID: 2138678 DOI: 10.1016/0022-2836(90)90129-a] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.4] [Indexed: 12/30/2022]
Abstract
Crystals of the single-stranded DNA bacteriophage phi X174 have been grown. They have a monoclinic unit cell with space group P2(1), unit cell dimensions of a = 306.0 (+/- 0.2) A, b = 361.1 (+/- 0.2) A, c = 299.7 (+/- 0.2) A, beta = 92.91 degrees (+/- 0.02 degrees) and diffract to at least 2.7 A resolution. There are two virus particles per unit cell. Packing considerations show that the mean diameter of the virus particles is 280 A. The virus separates into two bands in a sucrose gradient. The ratio between the absorbance at 260 nm and 280 nm is 1.45 to 1.65 for the faster and 1.15 to 1.35 for the slower bands, but both bands contain intact particles. Crystals derived from these bands are isomorphous and there is no detectable difference in their structure amplitudes.
Affiliation(s)
- P Willingmann
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907
11
Evaluation of the interaction of phi X174 gene products E and K in E-mediated lysis of Escherichia coli. J Virol 1988; 62:4362-4. [PMID: 2971822 PMCID: PMC253874 DOI: 10.1128/jvi.62.11.4362-4364.1988] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 01/03/2023]
Abstract
Gene K of bacteriophage phi X174 was cloned, and its gene product was localized in the cell envelope of Escherichia coli. Compared with the sole expression of the phi X174 lysis gene E, the simultaneous expression of the K and E genes had no effect on scheduling of cell lysis. Therefore, a direct interaction of proteins E and K could be excluded. In contrast, phi X174 infection of a host carrying a plasmid expressing gene K resulted in a delayed lysis and an apparent increase in phage titer.
12
Carlomagno MS, Chiariotti L, Alifano P, Nappo AG, Bruni CB. Structure and function of the Salmonella typhimurium and Escherichia coli K-12 histidine operons. J Mol Biol 1988; 203:585-606. [PMID: 3062174 DOI: 10.1016/0022-2836(88)90194-5] [Citation(s) in RCA: 123] [Impact Index Per Article: 3.4] [Indexed: 01/04/2023]
Abstract
We have determined the complete nucleotide sequence of the histidine operons of Escherichia coli and of Salmonella typhimurium. This structural information enabled us to investigate the expression and organization of the histidine operon. The proteins coded by each of the putative histidine cistrons were identified by subcloning appropriate DNA fragments and by analyzing the polypeptides synthesized in minicells. A structural comparison of the gene products was performed. The histidine messenger RNA molecules produced in vivo and the internal transcription initiation sites were identified by Northern blot analysis and S1 nuclease mapping. A comparative analysis of the different transcriptional and translational control elements within the two operons reveals a remarkable preservation for most of them except for the intercistronic region between the first (hisG) and second (hisD) structural genes and for the rho-independent terminator of transcription at the end of the operon. Overall, the operon structure is very compact and its expression appears to be regulated at several levels.
Affiliation(s)
- M S Carlomagno
- Centro di Endocrinologia ed Oncologia, Sperimentale del Consiglio, Nazionale delle Ricerche, University of Naples, Napoli, Italy
13
Ringuette M, Spencer JH. Localization of Escherichia coli RNA polymerase-binding sites on bacteriophage S13 replicative form I DNA by protection of restriction enzyme cleavage sites. J Virol 1987; 61:2297-303. [PMID: 3035227 PMCID: PMC283695 DOI: 10.1128/jvi.61.7.2297-2303.1987] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 01/03/2023]
Abstract
Protection of restriction endonuclease cleavage sites by Escherichia coli RNA polymerase bound to the replicative form I of bacteriophage S13 DNA has been used to identify a number of regions of RNA polymerase binding. Digestion with HincII, AluI, HinfI, or HaeIII, under conditions optimized for "open" complex formation, revealed 12 regions of RNA polymerase binding. Based on differential salt sensitivities, five of the regions were classified as strong or tight binding sites. These were located before genes A (two sites), B, and D and at the 5' end of gene F. The seven regions which exhibited weaker binding were located at the 5' end of gene C (two sites), in the middle of gene D, just before and at the 3' end of gene F, at the 5' end of gene G, and in the middle of gene H. The sites before genes B and D coincide with sites previously identified as promoters in bacteriophage phi X174. One of the sites before gene A, that at nucleotides 5175-5211, represents a new putative promoter site in bacteriophage S13 and phi X174 located before the previously identified A gene promoter at nucleotides 10-45.
14
Gillam S, Atkinson T, Markham A, Smith M. Gene K of bacteriophage phi X174 codes for a protein which affects the burst size of phage production. J Virol 1985; 53:708-9. [PMID: 3155803 PMCID: PMC254692 DOI: 10.1128/jvi.53.2.708-709.1985] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.5] [Indexed: 01/04/2023]
Abstract
Site-directed mutagenesis has been used to produce a T→A change at nucleotide 70 of the phi X174 genome. This generates an am codon, TAG, in the gene K reading frame without affecting the amino acid, leucine, encoded by the overlapping gene A. The gene K mutant produces small plaques on su- hosts. It has an identical latent period, but a more reduced burst size than that of the wild-type phi X174. The reduced burst size in the gene K mutant suggests that the gene K protein, although not essential, has a role in increasing infectivity by increasing the burst size three- to sixfold.
15
Abstract
The complete sequence of bacteriophage S13 DNA has been determined. The molecule has 5386 nucleotides and differs from phi X174 by 87 transitions and 24 transversions. All the proteins, A,A*,B,C,D,E,F,G,H, J and K found in phi X174 are also present in S13. Due to changes in the H/A intergenic region of S13, the start of an additional protein, A', has been identified. Genes F and H coding for the capsid and spike proteins, respectively, are the least conserved in comparison to phi X174. Many of the silent changes, as well as some amino acid changes, are found in the same nucleotide sequence positions in phage G4, confirming the interrelationship between the three phages.
16
Abstract
The expression of the cloned Saccharomyces cerevisiae URA3 gene in Escherichia coli on both plasmid and phage vectors was studied. Isolates of the gene from two different laboratory strains of yeast differ in their ability to be expressed in E. coli in the absence of external adjacent promoters of transcription. The DNA sequence of the two genes was determined and revealed several differences in the DNA flanking the structural gene. One base change alters the "Pribnow-box" of an E. coli promoter present in the yeast sequences. Three amber alleles of the yeast gene were also cloned from yeast. Two of the alleles could be suppressed in E. coli by a tRNA suppressor mutation. One of the amber alleles was determined to be a mutation in the seventh codon of the structural gene, thereby establishing the reading frame and extent of the coding sequence. The initiator codon of the reading frame encoding the URA3 structural gene is preceded by two other ATG codons in a different reading frame 61 and 79 bp away. The nearer ATG begins an open reading frame that overlaps the structural gene sequences by 17 bp. With the DNA sequence of the URA3 gene many of the common yeast vector plasmids are now completely known at the level of DNA sequence.
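The frame relationships stated in this abstract can be checked with simple modular arithmetic. The sketch below is illustrative only and rests on two assumptions that the abstract does not spell out: that "61 and 79 bp away" is measured start to start (first base of the upstream ATG to first base of the URA3 initiator ATG), and that the 17 bp overlap is counted through the last base of the upstream ORF's stop codon. The function names are hypothetical.

```python
def relative_frame(distance_bp: int) -> int:
    """Frame offset (0, 1 or 2) of an upstream codon relative to the main ATG.

    Assumes distance_bp is measured start to start; 0 would mean in frame.
    """
    return distance_bp % 3


def upstream_orf_span(distance_bp: int, overlap_bp: int) -> int:
    """Length in nucleotides of an upstream ORF.

    Assumes a start-to-start distance and an overlap counted through the
    last base of the upstream ORF's stop codon.
    """
    return distance_bp + overlap_bp


# Both upstream ATGs fall in the same frame, and it is not the URA3 frame.
for distance in (61, 79):
    print(distance, relative_frame(distance))

# The nearer ATG: 61 bp upstream plus 17 bp of overlap.
span = upstream_orf_span(61, 17)
print(span, span % 3 == 0, span // 3)
```

Under these assumptions both upstream ATGs sit at frame offset 1, and the ORF opened by the nearer ATG spans 78 nucleotides, a whole number of codons (26, counting the stop codon). A different counting convention would shift these numbers.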
18
Rose M, Botstein D. Structure and function of the yeast URA3 gene. Differentially regulated expression of hybrid beta-galactosidase from overlapping coding sequences in yeast. J Mol Biol 1983; 170:883-904. [PMID: 6315953 DOI: 10.1016/s0022-2836(83)80193-4] [Citation(s) in RCA: 104] [Impact Index Per Article: 2.5] [Indexed: 01/19/2023]
Abstract
Expression of the URA3 gene of Saccharomyces cerevisiae was studied by analysis of URA3-lacZ gene fusions constructed in vitro. Synthesis of hybrid beta-galactosidase by fusions in frame with the coding sequence for orotidine-5'-phosphate decarboxylase (OMPdecarboxylase) was found to be normally regulated even when only 11 nucleotides of URA3 coding sequence remained, indicating that all transcription initiation and regulatory sites are present at the beginning of the URA3 gene. An upstream initiator codon that begins a short overlapping coding sequence in another reading frame was also found to be active in producing hybrid beta-galactosidase. However this beta-galactosidase synthesis showed little or no regulation. Nuclease protection experiments revealed numerous species of URA3 mRNA. The regulation of these is consistent with the idea that the URA3 protein and the overlapping peptide are translated from differentially regulated mRNAs of different lengths.
19
Walker JE, Auffret AD, Carne A, Gurnett A, Hanisch P, Hill D, Saraste M. Solid-phase sequence analysis of polypeptides eluted from polyacrylamide gels. An aid to interpretation of DNA sequences exemplified by the Escherichia coli unc operon and bacteriophage lambda. Eur J Biochem 1982; 123:253-60. [PMID: 6210528 DOI: 10.1111/j.1432-1033.1982.tb19761.x] [Citation(s) in RCA: 66] [Impact Index Per Article: 1.6] [Indexed: 01/19/2023]
Abstract
An approach to sequencing proteins by the solid-phase method combined with isolation of proteins and polypeptides by gel electrophoresis is described. Mixtures of proteins or polypeptides resulting from digests are fractionated in the presence of dodecylsulphate in polyacrylamide gels. They are detected with Coomassie blue, eluted, selectively reacted with porous glass derivatives and sequenced in their amino-terminal regions with the aid of a new microsequencer. Alternatively they can be analysed or digested with enzymes and fingerprinted. It is a relatively rapid method of purifying proteins for sequence analysis which we have used to provide partial protein sequence data to complement DNA sequences. Nine genes, four from the unc operon of Escherichia coli encoding the alpha, beta, gamma and epsilon subunits of ATP synthase and five for capsid proteins of bacteriophage lambda, have been identified by this method.
22
Simons GF, Konings RN, Schoenmakers JG. Genes VI, VII, and IX of phage M13 code for minor capsid proteins of the virion. Proc Natl Acad Sci U S A 1981; 78:4194-8. [PMID: 6945579 PMCID: PMC319755 DOI: 10.1073/pnas.78.7.4194] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023]
Abstract
The minor capsid proteins C and D from phage M13 have been characterized by differential amino acid labeling and amino-terminal sequence analysis. We demonstrate that D protein (Mr 12,260) is the product of gene VI, whereas the C component is composed of the products of both gene VII (Mr 3580) and gene IX (Mr 3650). Our data further show that the proteins of genes VI, VII, and IX are not subject to proteolytic processing but are packaged into mature virions as their primary translational products. On the basis of incorporation of specific amino acids, the copy numbers of these proteins in M13 virions could be estimated relative to the number of A protein molecules. The M13 phage contains on the average 5 molecules of A protein, 5 molecules of VI protein and 3-4 molecules of both VII protein and IX protein. These copy numbers remained unchanged in M13 recombinant phages of up to two times the length of wild-type phages, a fact that indicates that these minor capsid proteins are located at either one or both ends of the phage filament.
25
Quintrell N, Hughes SH, Varmus HE, Bishop JM. Structure of viral DNA and RNA in mammalian cells infected with avian sarcoma virus. J Mol Biol 1980; 143:363-93. [PMID: 6262515 DOI: 10.1016/0022-2836(80)90218-1] [Citation(s) in RCA: 88] [Impact Index Per Article: 2.0] [Indexed: 01/19/2023]
26
Grantham R, Gautier C, Gouy M. Codon frequencies in 119 individual genes confirm consistent choices of degenerate bases according to genome type. Nucleic Acids Res 1980; 8:1893-912. [PMID: 6159596 PMCID: PMC324046 DOI: 10.1093/nar/8.9.1893] [Citation(s) in RCA: 281] [Impact Index Per Article: 6.4] [Indexed: 01/18/2023]
Abstract
The poor printing of our previous Figure 2 (1) is corrected. Codon usage in mRNA sequences just published is also given. A new correspondence analysis is done, based on simultaneous comparison in all mRNA of use of the 61 codons. This analysis reinforces our claim that most genes in a genome, or genome type, have the same coding strategy; that is, they show similar choices among synonymous codons, or among degenerate bases (2). Like analysis on frequency variation in the amino acids coded reveals an entirely different pattern.
28
Grantham R, Gautier C, Gouy M, Mercier R, Pavé A. Codon catalog usage and the genome hypothesis. Nucleic Acids Res 1980; 8:r49-r62. [PMID: 6986610 PMCID: PMC327256 DOI: 10.1093/nar/8.1.197-c] [Citation(s) in RCA: 476] [Impact Index Per Article: 10.8] [Indexed: 01/22/2023]
Abstract
Frequencies for each of the 61 amino acid codons have been determined in every published mRNA sequence of 50 or more codons. The frequencies are shown for each kind of genome and for each individual gene. A surprising consistency of choices exists among genes of the same or similar genomes. Thus each genome, or kind of genome, appears to possess a "system" for choosing between codons. Frameshift genes, however, have widely different choice strategies from normal genes. Our work indicates that the main factors distinguishing between mRNA sequences relate to choices among degenerate bases. These systematic third base choices can therefore be used to establish a new kind of genetic distance, which reflects differences in coding strategy. The choice patterns we find seem compatible with the idea that the genome and not the individual gene is the unit of selection. Each gene in a genome tends to conform to its species' usage of the codon catalog; this is our genome hypothesis.
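As an illustration of the kind of tabulation this abstract describes, the following minimal sketch counts the 61 sense codons in a coding sequence and reports the relative use of synonymous codons within one family. It is not the authors' procedure: the function names, the toy sequence, and the handling of the 50-codon threshold are assumptions made for the example, and the standard genetic code is assumed.

```python
from collections import Counter

# Stop codons of the standard genetic code (assumed here); the other 61 are sense codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_counts(cds, min_codons=50):
    """Count sense codons in an in-frame coding sequence.

    Returns None when the sequence has fewer than `min_codons` sense codons,
    mirroring the abstract's "50 or more codons" cutoff.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    sense = [c for c in codons if c not in STOP_CODONS]
    if len(sense) < min_codons:
        return None
    return Counter(sense)


def synonymous_fractions(counts, family):
    """Fraction of each codon within one synonymous family (e.g. alanine)."""
    total = sum(counts[c] for c in family)
    if total == 0:
        return {}
    return {c: counts[c] / total for c in family}


# Toy coding sequence (hypothetical): ATG, 30 x GCT, 20 x GCC, stop.
toy_cds = "ATG" + "GCT" * 30 + "GCC" * 20 + "TAA"
counts = codon_counts(toy_cds)
alanine = ["GCT", "GCC", "GCA", "GCG"]
print(synonymous_fractions(counts, alanine))
```

Comparing such per-family fractions across genes of one genome, and then across genomes, is one simple way to see the consistency of third-base choices that the abstract reports.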
29
Abstract
In bacteriophage lambda, genes C and Nu3, two of the four cistrons which are essential for normal prohead formation, have overlapping nucleotide sequences. These genes are translated in the same reading frame so that the Nu3 protein is identical to the COOH-terminal one-third of the C protein. This structural relationship may provide for the functional interaction of the C and Nu3 proteins through their regions of structural homology during prohead assembly. The in-phase overlapping organisation of genes may constitute a general strategy to facilitate the mutual interaction of a pair of proteins through their common structural domains.
30
Tessman ES, Tessman I, Pollock TJ. Gene K of bacteriophage phi X 174 codes for a nonessential protein. J Virol 1980; 33:557-60. [PMID: 6445011 PMCID: PMC288574 DOI: 10.1128/jvi.33.1.557-560.1980] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.3] [Indexed: 01/20/2023]
Abstract
Gene K of phi X 174, which overlaps genes A, B, and C, was found to be nonessential, although possibly beneficial for the growth of the phage. Viable mutants of gene K made less than 4% of the normal amount of K protein as judged by quantitative fluorography of sodium dodecyl sulfate-polyacrylamide gels; compared with the wild-type phi X, K mutants had an identical latent period but a two- to threefold reduction in burst size.
31
Simons GF, Konings RN, Schoenmakers JG. Identification of two new capsid proteins in bacteriophage M13. FEBS Lett 1979; 106:8-12. [PMID: 387445 DOI: 10.1016/0014-5793(79)80683-3] [Citation(s) in RCA: 25] [Impact Index Per Article: 0.6] [Indexed: 12/15/2022]
32
Sander C, Schulz GE. Degeneracy of the information contained in amino acid sequences: evidence from overlaid genes. J Mol Evol 1979; 13:245-52. [PMID: 228047 DOI: 10.1007/bf01739483] [Citation(s) in RCA: 31] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022]
Abstract
The observed gene overlays in the viruses phi X174 and SV40 show a surprising economy of information storage; two different amino acid sequences are read in different frames from the same stretch of DNA. This phenomenon appears contradictory in that the information in the two overlaid amino acid sequences is strongly interdependent, yet each of the two proteins has evolved to its own well-defined function. The contradiction can be resolved by assuming sufficiently large degeneracy of the information contents of amino acid sequences with respect to function. Such a degeneracy is familiar from homologous proteins where a given biological function is implemented by many different amino acid sequences. It is shown that the very existence of viral overlays allows a lower limit to be derived for the magnitude of this degeneracy: the degeneracy is equal to or greater than fourfold; on average, at each position of the chain a choice of 1 out of 5 or fewer amino acids, and not a choice of 1 out of 20, is necessary for constructing a protein with a specified function. In addition, the strong dependence of overlay probabilities on chain length allows the definition of a maximal length of overlays; in bacterial viruses overlay regions should be shorter than about 150 residues.
|
33
|
|
34
|
|
35
|
|
36
|
Keegstra W, Godson GN, Weisbeek PJ, Jansz HS. Comparison of the G4 and phiX174 phage genomes by electron microscopy. Virology 1979; 93:527-36. [PMID: 452414 DOI: 10.1016/0042-6822(79)90255-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.1] [Indexed: 12/15/2022]
|
37
|
Engelberg-Kulka H, Dekel L, Israeli-Reches M, Belfort M. The requirement of nonsense suppression for the development of several phages. Mol Gen Genet 1979; 170:155-9. [PMID: 372760 DOI: 10.1007/bf00337791] [Citation(s) in RCA: 35] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022]
Abstract
A spontaneous streptomycin-resistant Escherichia coli mutant which is temperature-sensitive for suppression of a nonsense codon was studied for its ability to propagate phages T2, T4D, T5, phi K, f2, MS2, R17, Q beta, lambda as well as filamentous phages f1, fd and M13. Of all phages tested, only the growth of Q beta, lambda, and filamentous phages is inhibited in the mutant at 42 °C. This selective inhibition suggests that, like Q beta, lambda and filamentous phages also require a read-through protein(s) which results from suppression of a termination codon.
|
38
|
Sonneborn TM, Schneller MV. A genetic system for alternative stable characteristics in genomically identical homozygous clones. Dev Genet 1979. [DOI: 10.1002/dvg.1020010105] [Citation(s) in RCA: 38] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022]
|
39
|
|
40
|
Abstract
The 5,577 nucleotide long sequence of bacteriophage G4 DNA has been determined using the 'plus and minus' and chain termination methods of DNA sequencing. This sequence has been compared with that of the closely related bacteriophage phiX174 (refs 1, 55). In the coding regions there is an average of 33.1% nucleotide sequence differences between the two genomes, but the distribution of these changes is not random and the sequence of some genes is more conserved than others. There is less sequence similarity between the untranslated intergenic regions of G4 and phiX174, but despite this the sequences of the J/F, F/G and H/A untranslated spaces in both genomes have similar sized hairpin loops, which may be related to their function.
|
41
|
|
42
|
Sanger F, Coulson AR, Friedmann T, Air GM, Barrell BG, Brown NL, Fiddes JC, Hutchison CA, Slocombe PM, Smith M. The nucleotide sequence of bacteriophage phiX174. J Mol Biol 1978; 125:225-46. [PMID: 731693 DOI: 10.1016/0022-2836(78)90346-7] [Citation(s) in RCA: 515] [Impact Index Per Article: 11.2] [Indexed: 12/24/2022]
|
43
|
Abstract
The product of gene E, the lysis gene of phiX174, has been identified as a distinct band in a sodium dodecyl sulfate-gel electropherogram. The position of the band is consistent with the molecular weight of 10,589 calculated from the nucleotide sequence of the gene. The band is eliminated by a nonsense mutation in gene E. It is estimated that roughly 100 to 300 molecules of E protein are made in an infected cell; this appears to be less than one-tenth the amount of protein made by gene D, in which gene E is wholly contained.
|
44
|
Pollock TJ, Tessman I, Tessman ES. Radiological mapping of functional transcription units of bacteriophages phiX174 and S13. The function of proximal and distal promoters. J Mol Biol 1978; 124:147-60. [PMID: 361967 DOI: 10.1016/0022-2836(78)90153-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.1] [Indexed: 12/14/2022]
|
45
|
Pollock TJ, Tessman I, Tessman ES. Potential for variability through multiple gene products of bacteriophage phiX174. Nature 1978; 274:34-7. [PMID: 661992 DOI: 10.1038/274034a0] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022]
Abstract
The small single-stranded DNA phages phiX174 and S13 produce multiple products of certain phage genes, as observed by electrophoresis on SDS-polyacrylamide slab gels. Two A protein products, two A products and four G products are observed. The multiple gene products may arise from multiple sites for initiation or termination of translation, or by protein modification. Some of the variant products may provide a substitute for heterozygosity without a concomitant increase in the size of the genome.
|