Abstract
Small DNA fragments (60 to 80 nucleotides), randomly obtained from a collection of 14 catabolic, biosynthetic or regulatory Escherichia coli genes, were shotgun-cloned in place of the lacZ ribosome binding site. A total of 47 recombinants showing substantial beta-galactosidase synthesis (at least 1/30th of the wild-type level) were isolated, and their newly acquired translational starts were characterized. Of these, 46 carried a ribosome binding site from one of the original genes, and only one carried a non-natural start. Moreover, 12 of the 14 natural starts were found; the two that were not found are the only ones lacking a Shine-Dalgarno element. Thus, real starts are generally active in the lac mRNA, whereas the many sites (approx. 100 in this gene collection) that carry a Shine-Dalgarno element followed by AUG or GUG but are located in intra- or intergenic regions, or on non-transcribed strands, are inactive. I conclude that: (1) these "false" starts, being strongly discriminated against in the lac message, are presumably also inactive in their original mRNAs; (2) the discriminating information, being portable from one mRNA to another, must be contained within a small DNA region surrounding the starts (indeed, I further show that it generally lies within a sequence of about 35 nucleotides bracketing real starts); and (3) this information must have a larger effect on initiation than the exact structure of the mRNA, because the discrimination persists despite a complete change of this structure. Previous statistical analysis has shown that real starts differ from false starts in having a non-random sequence composition from nucleotides -20 to +15 with respect to the start. To determine whether these biases constitute the discriminating information or simply reflect coding constraints, translational starts were searched for at random in eukaryotic, largely non-coding, DNA. These "eukaryotic" starts all have an in-phase AUG or GUG, preceded by a typical Shine-Dalgarno sequence; outside these elements, the initiator region is strikingly rich in A and poor in C. These biases match those found around real starts, demonstrating that they are indeed part of the initiation signal. Finally, I describe a simple procedure for introducing any DNA fragment in place of the lac operator site on the E. coli chromosome.
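The notion of a candidate site used above (a Shine-Dalgarno element followed by AUG or GUG) and the composition window from -20 to +15 can be illustrated with a minimal Python sketch. It is not the procedure used in this work. The Shine-Dalgarno cores (4-nucleotide pieces of the consensus AGGAGG), the spacer range of 5 to 9 nucleotides, the window convention, and the function names `find_candidate_starts` and `base_composition` are all assumptions chosen for illustration.

```python
# Assumed Shine-Dalgarno cores: 4-nt substrings of the consensus AGGAGG.
SD_CORES = ("AGGA", "GGAG", "GAGG")
# Start codons in DNA alphabet (AUG and GUG in the mRNA).
START_CODONS = ("ATG", "GTG")
# Assumed spacing between the end of the SD core and the start codon.
MIN_SPACER, MAX_SPACER = 5, 9


def _normalize(seq):
    """Upper-case the sequence and convert RNA (U) to DNA (T) notation."""
    return seq.upper().replace("U", "T")


def find_candidate_starts(seq):
    """Return sorted 0-based positions of ATG/GTG codons that are
    preceded by an SD-like core at the assumed spacing."""
    seq = _normalize(seq)
    hits = set()
    for i in range(len(seq) - 3):
        if seq[i:i + 4] not in SD_CORES:
            continue
        sd_end = i + 4
        for spacer in range(MIN_SPACER, MAX_SPACER + 1):
            pos = sd_end + spacer
            # Slices running past the end are shorter than 3 nt and never match.
            if seq[pos:pos + 3] in START_CODONS:
                hits.add(pos)
    return sorted(hits)


def base_composition(seq, start, upstream=20, downstream=15):
    """Return the fraction of each base in a window spanning `upstream`
    nucleotides before `start` and `downstream` nucleotides from `start` on.
    The window is truncated at the sequence ends."""
    seq = _normalize(seq)
    window = seq[max(0, start - upstream):start + downstream]
    if not window:
        return {}
    return {base: window.count(base) / len(window) for base in "ACGT"}
```

A scan of this kind only lists candidates. The point of the abstract is that most such candidates are inactive, so a pattern match alone does not predict a real start; comparing the A and C fractions returned by `base_composition` for real and false sites is one simple way to look at the reported bias.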