26
Uberbacher EC, Xu Y, Shah MB, Olman V, Parang M, Mural RJ. An editing environment for DNA sequence analysis and annotation (extended abstract). PACIFIC SYMPOSIUM ON BIOCOMPUTING. PACIFIC SYMPOSIUM ON BIOCOMPUTING 1998:217-27. [PMID: 9697184 DOI: 10.2172/563243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
This paper presents a computer system for analyzing and annotating large-scale genomic sequences. The core of the system is a multiple-gene structure identification program, which predicts the most "probable" gene structures based on the given evidence, including pattern recognition, EST and protein homology information. A graphics-based user interface provides an environment which allows the user to interactively control the evidence to be used in the gene identification process. To overcome the computational bottleneck in the database similarity search used in the gene identification process, we have developed an effective way to partition a database into a set of sub-databases of "related" sequences, and reduced the search problem on a large database to a signature identification problem and a search problem on a much smaller sub-database. This reduces the number of sequences to be searched from N to O(√N) on average, and hence greatly reduces the search time, where N is the number of sequences in the original database. The system provides the user with the ability to facilitate and modify the analysis and modeling in real time.
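One way to see where a square-root figure can come from is a simple cost model. The sketch below is an illustration only: the abstract does not say how the sub-databases are built, so the equal-sized clusters, the one-signature-per-cluster comparison, and both function names are assumptions made here.

```python
import math


def two_stage_comparisons(n_sequences: int, n_clusters: int) -> int:
    """Comparisons needed for a signature pass over every cluster,
    followed by a full search of one (assumed equal-sized) sub-database."""
    cluster_size = math.ceil(n_sequences / n_clusters)
    return n_clusters + cluster_size


def best_cluster_count(n_sequences: int) -> int:
    """Brute-force the cluster count that minimizes the two-stage cost."""
    return min(
        range(1, n_sequences + 1),
        key=lambda k: two_stage_comparisons(n_sequences, k),
    )


if __name__ == "__main__":
    n = 10_000
    k = best_cluster_count(n)
    print(k, two_stage_comparisons(n, k))
```

Under these assumptions the cost k + N/k is smallest near k = √N, which gives roughly 2√N comparisons instead of N.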
27
Xu Y, Mural RJ, Uberbacher EC. Inferring gene structures in genomic sequences using pattern recognition and expressed sequence tags. PROCEEDINGS. INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS FOR MOLECULAR BIOLOGY 1997; 5:344-53. [PMID: 9322060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Computational methods for gene identification in genomic sequences typically have two phases: coding region prediction and gene parsing. While there are many effective methods for predicting coding regions (exons), parsing the predicted exons into proper gene structures, to a large extent, remains an unsolved problem. This paper presents an algorithm for inferring gene structures from predicted exon candidates, based on Expressed Sequence Tags (ESTs) and biological intuition/rules. The algorithm first finds all the related ESTs in the EST database (dbEST) for each predicted exon, and infers the boundaries of one or a series of genes based on the available EST information and biological rules. Then it constructs gene models within each pair of gene boundaries, that are most consistent with the EST information. By exploiting EST information and biological rules, the algorithm can (1) model complicated multiple gene structures, including embedded genes, (2) identify falsely-predicted exons and locate missed exons, and (3) make more accurate exon boundary predictions. The algorithm has been implemented and tested on long genomic sequences with a number of genes. Test results show that very accurate (predicted) gene models can be expected when related ESTs exist for the predicted exons.
28
Abstract
Computational methods for gene identification in genomic sequences typically have two phases: coding region recognition and gene parsing. While there are a number of effective methods for recognizing coding regions (exons), parsing the recognized exons into proper gene structures, to a large extent, remains an unsolved problem. We have developed a computer program which can automatically parse the recognized exons into gene models that are most consistent with the available Expressed Sequence Tags (ESTs) and a set of biological heuristics, derived empirically. The gene modeling algorithm used in this program provides a general framework for applying EST information so the modeling accuracy improves as the amount of available EST information increases. Based on preliminary tests on a number of large DNA sequences, using the dbEST database, we have observed that the algorithm can (1) accurately model complicated multiple gene structures, including embedded genes, (2) identify falsely-recognized exons and locate missed exons by the initial exon recognition phase, and (3) make more accurate exon boundary predictions, if the necessary EST information is available. We have extended this EST-based gene modeling algorithm to model genes on unfinished DNA contigs at the end of the shotgun sequencing. This extended version can automatically determine the orientations and the relative order of the DNA contigs (with gaps between them) using the available ESTs as reference models, before the gene modeling phase.
29
Xu Y, Uberbacher EC. A polynomial-time algorithm for a class of protein threading problems. COMPUTER APPLICATIONS IN THE BIOSCIENCES : CABIOS 1996; 12:511-7. [PMID: 9021270 DOI: 10.1093/bioinformatics/12.6.511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2023]
Abstract
This paper presents an algorithm for constructing an optimal alignment between a three-dimensional protein structure template and an amino acid sequence. A protein structure template is given as a sequence of amino acid residue positions in three-dimensional space, along with an array of physical properties attached to each position; these residue positions are sequentially grouped into a series of core secondary structures (central helices and beta sheets). In addition to match scores and gap penalties, as in a traditional sequence-sequence alignment problem, the quality of a structure-sequence alignment is also determined by interaction preferences among amino acids aligned with structure positions that are spatially close (we call these 'long-range interactions'). Although it is known that constructing such a structure-sequence alignment in the most general form is NP-hard, our algorithm runs in polynomial time when restricted to structures with a 'modest' number of long-range amino acid interactions. In the current work, long-range interactions are limited to interactions between amino acids from different core secondary structures. Dividing the series of core secondary structures into two subseries creates a cut set of long-range interactions. If we use N, M and C to represent the size of an amino acid sequence, the size of a structure template, and the maximum cut size of long-range interactions, respectively, the algorithm finds an optimal structure-sequence alignment in O(21^C NM) time, a polynomial function of N and M when C = O(log(N + M)). When running on structure-sequence alignment problems without long-range interactions, i.e. C = 0, the algorithm achieves the same asymptotic computational complexity as the Smith-Waterman sequence-sequence alignment algorithm.
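The polynomial-time claim can be checked with a short worked step. This is a sketch that takes the stated bound to be 21 raised to the power C, multiplied by NM; the constant c and the choice of base-2 logarithm are assumptions introduced only for the illustration.

```latex
\begin{align*}
C \le c \log_2 (N + M)
  \;\Longrightarrow\;
21^{C} &\le 21^{\,c \log_2 (N + M)}
        = 2^{\,(\log_2 21)\, c \log_2 (N + M)}
        = (N + M)^{\,c \log_2 21}, \\
21^{C}\, N M &\le (N + M)^{\,c \log_2 21}\, N M
  \quad \text{(polynomial in } N \text{ and } M\text{)}, \\
C = 0 \;\Longrightarrow\; 21^{0}\, N M &= N M .
\end{align*}
```

The last line is the O(NM) cost of the standard dynamic-programming sequence alignment that the abstract mentions for the case without long-range terms.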
30
Harp JM, Uberbacher EC, Roberson AE, Palmer EL, Gewiess A, Bunick GJ. X-ray Diffraction Analysis of Crystals Containing Twofold Symmetric Nucleosome Core Particles. ACTA CRYSTALLOGRAPHICA SECTION D: BIOLOGICAL CRYSTALLOGRAPHY 1996; 52:283-8. [PMID: 15299701 DOI: 10.1107/s0907444995009139] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022]
Abstract
Nucleosome core particles containing a DNA palindrome and purified chicken erythrocyte histone octamer have been reconstituted and crystallized. The dyad symmetry of the palindrome extends the dyad symmetry of the histone octamer to result in a twofold symmetric particle. This ensures that the structure determined by X-ray diffraction will yield a true representation of the DNA strand rather than the twofold averaged structure which would result from using a non-palindromic DNA sequence. The crystals provide isotropic diffraction to 3.2 Å with observed reflections extending to d spacings of about 2.8 Å using a rotating-anode Cu Kα X-ray source. Although the DNA palindrome is a factor contributing to the quality of the diffraction data, another significant factor is an improved preparative technique which enriches for correctly phased nucleosome core particles.
31
Guan X, Uberbacher EC. Alignments of DNA and protein sequences containing frameshift errors. COMPUTER APPLICATIONS IN THE BIOSCIENCES : CABIOS 1996; 12:31-40. [PMID: 8670617 DOI: 10.1093/bioinformatics/12.1.31] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.5] [Indexed: 02/01/2023]
Abstract
Molecular sequences, like all experimental data, are subject to error. Many current DNA sequencing protocols have very significant error rates and often generate artefactual insertions and deletions of bases (indels) which corrupt the translation of sequences and compromise the detection of protein homologies. The impact of these errors on the utility of molecular sequence data is dependent on the analytic technique used to interpret the data. In the presence of frameshift errors, standard algorithms using six-frame translation can miss important homologies because only subfragments of the correct translation are available in any given frame. We present a new algorithm which can detect and correct frameshift errors in DNA sequences during comparison of translated sequences with protein sequences in the databases. This algorithm can recognize homologous proteins sharing 30% identity even in the presence of a 7% frameshift error rate. Our algorithm uses dynamic programming, producing a guaranteed optimal alignment in the presence of frameshifts, and has a sensitivity equivalent to Smith-Waterman. The computational efficiency of the algorithm is O(nm) where n and m are the sizes of two sequences being compared. The algorithm does not rely on prior knowledge or heuristic rules and performs significantly better than any previously reported method.
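The problem described here, that a single indel leaves only fragments of the correct translation in any one frame, can be reproduced in a few lines. The sketch below illustrates the problem only, not the authors' alignment algorithm; the toy sequence, the deleted position, and the `translate` helper are made up for the example, and the codon table is the standard genetic code.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward reading frame, ignoring a trailing partial codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


# Hypothetical 7-codon coding fragment and a copy with one base deleted.
original = "ATGGCTCACGATTGGAAACCC"
corrupted = original[:8] + original[9:]  # drop the base at index 8

if __name__ == "__main__":
    print("original:", translate(original))
    for frame in range(3):
        print("corrupted, frame", frame, ":", translate(corrupted, frame))
```

In this toy case the start of the protein survives in frame 0 and the end reappears in frame 2, so a comparison restricted to a single frame sees only part of the homology.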
32
Xu Y, Mural RJ, Uberbacher EC. An iterative algorithm for correcting sequencing errors in DNA coding regions. J Comput Biol 1996; 3:333-44. [PMID: 8891953 DOI: 10.1089/cmb.1996.3.333] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 02/02/2023] Open
Abstract
Insertion and deletion (indel) sequencing errors in DNA coding regions disrupt DNA-to-protein translation frames, and hence make most frame-sensitive coding recognition approaches fail. This paper extends the authors' previous work on indel detection and "correction" algorithms, and presents a more effective algorithm for localizing indels that appear in DNA coding regions and "correcting" the located indels by inserting or deleting DNA bases. The algorithm localizes indels by discovering changes of the preferred translation frames within presumed coding regions, and then "corrects" them to restore a consistent translation frame within each coding region. An iterative strategy is exploited to repeatedly localize and "correct" indels until no more indels can be found. Test results have shown that this improved algorithm can detect and "correct" more indels while not worsening the rate of introduction of false indels when compared to the authors' previous work.
33
Uberbacher EC, Xu Y, Mural RJ. Discovering and understanding genes in human DNA sequence using GRAIL. Methods Enzymol 1996; 266:259-81. [PMID: 8743689 DOI: 10.1016/s0076-6879(96)66018-2] [Citation(s) in RCA: 100] [Impact Index Per Article: 3.6] [Indexed: 02/01/2023]
34
Xu Y, Mural RJ, Uberbacher EC. Correcting sequencing errors in DNA coding regions using a dynamic programming approach. COMPUTER APPLICATIONS IN THE BIOSCIENCES : CABIOS 1995; 11:117-24. [PMID: 7620982 DOI: 10.1093/bioinformatics/11.2.117] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 01/26/2023]
Abstract
This paper presents an algorithm for detecting and 'correcting' sequencing errors that occur in DNA coding regions. The types of sequencing errors addressed are insertions and deletions (indels) of DNA bases. The goal is to provide a capability which makes single-pass or low-redundancy sequence data more informative, reducing the need for high-redundancy sequencing for gene identification and characterization purposes. This would permit improved sequencing efficiency and reduce genome sequencing costs. The algorithm detects sequencing errors by discovering changes in the statistically preferred reading frame within a putative coding region and then inserts a number of 'neutral' bases at a perceived reading frame transition point to make the putative exon candidate frame consistent. We have implemented the algorithm as a front-end subsystem of the GRAIL DNA sequence analysis system to construct a version which is very error tolerant and also intend to use this as a testbed for further development of sequencing error-correction technology. Preliminary test results have shown the usefulness of this algorithm and also exhibited some of its weaknesses, providing possible directions for further improvement. On a test set consisting of 68 human DNA sequences with 1% randomly generated indels in coding regions, the algorithm detected and corrected 76% of the indels. The average distance between the position of an indel and the predicted one was 9.4 bases. With this subsystem in place, GRAIL correctly predicted 89% of the coding messages with 10% false messages on the 'corrected' sequences, compared to 69% correctly predicted coding messages and 11% falsely predicted messages on the 'corrupted' sequences using the standard GRAIL II method (version 1.2). (ABSTRACT TRUNCATED AT 250 WORDS)
35
Craven MW, Mural RJ, Hauser LJ, Uberbacher EC. Predicting protein folding classes without overly relying on homology. PROCEEDINGS. INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS FOR MOLECULAR BIOLOGY 1995; 3:98-106. [PMID: 7584472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
An important open problem in molecular biology is how to use computational methods to understand the structure and function of proteins given only their primary sequences. We describe and evaluate an original machine-learning approach to classifying protein sequences according to their structural folding class. Our work is novel in several respects: we use a set of protein classes that previously have not been used for classifying primary sequences, and we use a unique set of attributes to represent protein sequences to the learners. We evaluate our approach by measuring its ability to correctly classify proteins that were not in its training set. We compare our input representation to a commonly used input representation--amino acid composition--and show that our approach more accurately classifies proteins that have very limited homology to the sequences on which the systems are trained.
36
Xu Y, Mural RJ, Uberbacher EC. Constructing gene models from accurately predicted exons: an application of dynamic programming. COMPUTER APPLICATIONS IN THE BIOSCIENCES : CABIOS 1994; 10:613-23. [PMID: 7704660 DOI: 10.1093/bioinformatics/10.6.613] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Indexed: 01/26/2023]
Abstract
This paper presents a computationally efficient algorithm, the Gene Assembly Program III (GAP III), for constructing gene models from a set of accurately-predicted 'exons'. The input to the algorithm is a set of clusters of exon candidates, generated by a new version of the GRAIL coding region recognition system. The exon candidates of a cluster differ in their presumed edges and occasionally in their reading frames. Each exon candidate has a numerical score representing its 'probability' of being an actual exon. GAP III uses a dynamic programming algorithm to construct a gene model, complete or partial, by optimizing a predefined objective function. The optimal gene models constructed by GAP III correspond very well with the structures of genes which have been determined experimentally and reported in the Genome Sequence Database (GSDB). On a test set of 137 human and mouse DNA sequences consisting of 954 true exons, GAP III constructed 137 gene models using 892 exons, among which 859 (859/954 = 90%) are true exons and 33 (33/892 = 3%) are false positive. Among the 859 true positives, 635 (74%) match the actual exons exactly, and 838 (98%) have at least one edge correct. GAP III is computationally efficient. If we use E and C to represent the total number of exon candidates in all clusters and the number of clusters, respectively, the running time of GAP III is proportional to (E x C).
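The general idea of assembling a gene model from scored exon candidates by dynamic programming can be sketched as follows. This is a deliberately simplified stand-in and not GAP III itself: the objective here is just the sum of candidate scores over non-overlapping exons, whereas the abstract's "predefined objective function" is not specified, and reading frames, splice compatibility, and clusters are ignored. The `Exon` tuple layout and `best_exon_chain` are hypothetical names.

```python
from bisect import bisect_right
from typing import List, Tuple

Exon = Tuple[int, int, float]  # (start, end, score), coordinates inclusive


def best_exon_chain(candidates: List[Exon]) -> Tuple[float, List[Exon]]:
    """Pick non-overlapping exon candidates with the maximum total score."""
    exons = sorted(candidates, key=lambda e: e[1])  # order by end position
    ends = [e[1] for e in exons]
    best = [0.0] * (len(exons) + 1)  # best[i]: optimum over the first i exons
    take = [False] * len(exons)      # whether exon i is used in best[i + 1]
    prev = [0] * len(exons)          # how many exons end before exon i starts

    for i, (start, _end, score) in enumerate(exons):
        prev[i] = bisect_right(ends, start - 1, 0, i)
        with_exon = best[prev[i]] + score
        if with_exon > best[i]:
            best[i + 1] = with_exon
            take[i] = True
        else:
            best[i + 1] = best[i]

    # Trace back through the table to recover the chosen exons.
    chain: List[Exon] = []
    i = len(exons)
    while i > 0:
        if take[i - 1]:
            chain.append(exons[i - 1])
            i = prev[i - 1]
        else:
            i -= 1
    chain.reverse()
    return best[-1], chain


if __name__ == "__main__":
    candidates = [
        (100, 200, 0.9), (150, 220, 0.5),   # two overlapping candidates
        (300, 400, 0.8), (390, 450, 0.95),  # two more overlapping candidates
        (500, 600, 0.7),
    ]
    print(best_exon_chain(candidates))
```

Overlapping candidates compete with one another, so at most one of them ends up in the chain. The sort makes this sketch O(E log E), which differs from the E × C running time reported for GAP III.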
37
Mural RJ, Einstein JR, Guan X, Mann RC, Uberbacher EC. An artificial intelligence approach to DNA sequence feature recognition. Trends Biotechnol 1992; 10:66-9. [PMID: 1367939 DOI: 10.1016/0167-7799(92)90173-s] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022]
Abstract
The ultimate goal of the Human Genome project is to extract the biologically relevant information recorded in the estimated 100,000 genes encoded by the 3 × 10⁹ bases of the human genome. This necessitates development of reliable computer-based methods capable of analysing and correctly identifying genes in the vast amounts of DNA-sequence data generated. Such tools may save time and labour by simplifying, for example, screening of cDNA libraries. They may also facilitate the localization of human disease genes by identifying candidate genes in promising regions of anonymous DNA sequence.
38
Uberbacher EC, Mural RJ. Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach. Proc Natl Acad Sci U S A 1991; 88:11261-5. [PMID: 1763041 PMCID: PMC53114 DOI: 10.1073/pnas.88.24.11261] [Citation(s) in RCA: 440] [Impact Index Per Article: 13.3] [Indexed: 12/28/2022] Open
Abstract
Genes in higher eukaryotes may span tens or hundreds of kilobases with the protein-coding regions accounting for only a few percent of the total sequence. Identifying genes within large regions of uncharacterized DNA is a difficult undertaking and is currently the focus of many research efforts. We describe a reliable computational approach for locating protein-coding portions of genes in anonymous DNA sequence. Using a concept suggested by robotic environmental sensing, our method combines a set of sensor algorithms and a neural network to localize the coding regions. Several algorithms that report local characteristics of the DNA sequence, and therefore act as sensors, are also described. In its current configuration the "coding recognition module" identifies 90% of coding exons of length 100 bases or greater with less than one false positive coding exon indicated per five coding exons indicated. This is a significantly lower false positive rate than any method of which we are aware. This module demonstrates a method with general applicability to sequence-pattern recognition problems and is available for current research efforts.
39
Abstract
The x-ray crystallographic structure of the nucleosome core particle has been determined using 8 Å resolution diffraction data. The particle has a mean diameter of 106 Å and a maximum thickness of 65 Å in the superhelical axis direction. The longest chord through the histone core measures 85 Å and is in a non-axial direction. The 1.87 turn superhelix consists of B-DNA with about 78 base pairs or 7.6 helical repeats per superhelical turn. The mean DNA helical repeat contains 10.2 ± 0.05 base pairs and spans 35 Å, slightly more than standard B-DNA. The superhelix varies several Angstroms in radius and pitch, and has three distinct domains of curvature (with radii of curvature of 60, 45 and 51 Å). These regions are separated by localized sharper bends ±10 and ±40 base pairs from the center of the particle, resulting in an overall radius of curvature about 43 Å. Compression of superhelical DNA grooves on the inner surface and expansion on the outer surface can be seen throughout the DNA electron density. This density has been fit with a double helical ribbon model providing groove width estimates of 12 ± 1 Å inside vs. 19 ± 1 Å outside for the major groove, and 8 ± 1 Å inside vs. 13 ± 1 Å outside for the minor groove. The histone core is primarily contained within the bounds defined by the superhelical DNA, contacting the DNA where the phosphate backbone faces in toward the core. Possible extensions of density between the gyres have been located, but these are below the significance level of the electron density map. In cross-section, a tripartite organization of the histone octamer is apparent, with the tetramer occupying the central region and the dimers at the extremes. Several extensions of histone density are present which form contacts between nucleosomes in the crystal, perhaps representing flexible or "tail" histone regions. The radius of gyration of the histone portion of the electron density is calculated to be 30.4 Å (in reasonable agreement with solution scattering values), and the histone core volume in the map is 93% of its theoretical volume.
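The superhelix figures quoted above are mutually consistent, as a quick check shows. This uses only numbers from the abstract; the comparison with a core-particle DNA length of about 146 base pairs is added here for orientation and is not stated in this abstract.

```latex
\begin{align*}
1.87 \ \text{turns} \times 78 \ \text{bp/turn} &\approx 146 \ \text{bp}, \\
\frac{78 \ \text{bp/turn}}{10.2 \ \text{bp/repeat}} &\approx 7.6 \ \text{helical repeats per superhelical turn}.
\end{align*}
```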
40
Abstract
Several investigators have recognized the importance of non-periodic DNA sequence information in determining the translational position of precisely positioned nucleosomes. The purpose of this study is to determine the extent of such information, in addition to the character of periodic information present. This is accomplished by examining the half-nucleosome DNA sequences of a considerable number of precisely positioned nucleosomes, and determining the probability of occurrence of each dinucleotide type as a function of position from the nucleosome center to the terminus (positions 0 to 72). By the nature of this procedure, no assumptions of periodicity are made. The results show the importance of several DNA sequence periodicities including 6-7, 10, and 21 base pairs, in addition to significant nonperiodic information. The results demonstrate that each dinucleotide type is unique in terms of its positional preference in precisely positioned nucleosomes (for example, AA ≠ TT). The probabilities of occurrence for the dinucleotide types can be used to predict the translational positions of a number of observed nucleosomes.
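The counting step described here, dinucleotide probability as a function of position, can be sketched as below. This is a minimal illustration under stated assumptions: the input is taken to be half-nucleosome sequences already oriented from center to terminus, and how the study prepared, oriented, or weighted its sequences is not given in the abstract. The function name is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def dinucleotide_position_probs(half_sequences: List[str]) -> List[Dict[str, float]]:
    """For each position, estimate the probability of each dinucleotide
    starting there, across a set of aligned half-nucleosome sequences."""
    if not half_sequences:
        return []
    length = min(len(seq) for seq in half_sequences)  # use the common length
    profile: List[Dict[str, float]] = []
    for pos in range(length - 1):
        counts = Counter(seq[pos:pos + 2].upper() for seq in half_sequences)
        total = sum(counts.values())
        profile.append({dinuc: n / total for dinuc, n in counts.items()})
    return profile


if __name__ == "__main__":
    toy = ["AATT", "AAGT", "TATT"]  # made-up sequences, not real data
    for pos, probs in enumerate(dinucleotide_position_probs(toy)):
        print(pos, probs)
```

Because each position is tabulated independently, nothing in the procedure presupposes a periodicity, which matches the point made in the abstract.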
41
Bhattacharyya D, Tano K, Bunick GJ, Uberbacher EC, Behnke WD, Mitra S. Rapid, large-scale purification and characterization of 'Ada protein' (O6 methylguanine-DNA methyltransferase) of E. coli. Nucleic Acids Res 1988; 16:6397-410. [PMID: 3041376 PMCID: PMC338304 DOI: 10.1093/nar/16.14.6397] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.4] [Indexed: 01/03/2023] Open
Abstract
The E. coli Ada protein (O6-methylguanine-DNA methyltransferase) has been purified using a high-level expression vector with a yield of about 3 mg per liter of E. coli culture. The 39-kDa protein has an extinction coefficient (E280 nm (1%)) of 5.3. Its isoelectric point of 7.1 is lower than that predicted from the amino acid content. The homogeneous Ada protein is fully active as a methyl acceptor from O6-methylguanine in DNA. Its reaction with O6-methylguanine in a synthetic DNA has a second-order rate constant of 1.1 × 10⁹ M⁻¹ min⁻¹ at 0 °C. Both the native form and the protein methylated at Cys-69 are monomeric. The CD spectrum suggests a low alpha-helical content and the radius of gyration of 23 Å indicates a compact, globular shape. The middle region of the protein is sensitive to a variety of proteases, including an endogenous activity in E. coli, suggesting that the protein is composed of N-terminal and C-terminal domains connected by a hinge region. E. coli B has a higher level of this protease than does K12.
42
Consler TG, Uberbacher EC, Bunick GJ, Liebman MN, Lee JC. Domain interaction in rabbit muscle pyruvate kinase. II. Small angle neutron scattering and computer simulation. J Biol Chem 1988; 263:2794-801. [PMID: 3343233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2023] Open
Abstract
The effects of ligands on the structure of rabbit muscle pyruvate kinase were studied by small angle neutron scattering. The radius of gyration, RG, decreases by about 1 Å in the presence of the substrate phosphoenolpyruvate, but increases by about the same magnitude in the presence of the allosteric inhibitor phenylalanine. With increasing pH or in the absence of Mg2+ and K+, the RG of pyruvate kinase increases. Hence, there is a 2-Å difference in RG between two alternative conformations. Length distribution analysis indicates that, under all experimental conditions which increase the radius of gyration, there is a pronounced increase observed in the probability for interatomic distance between 80 and 110 Å. These small angle neutron scattering results indicate a "contraction" and "expansion" of the enzyme when it transforms between its active and inactive forms. Using the alpha-carbon coordinates of crystalline cat muscle pyruvate kinase, a length distribution profile was calculated, and it matches the scattering profile of the inactive form. These observations are expected since the crystals were grown in the absence of divalent cations (Stuart, D. I., Levine, M., Muirhead, H., and Stammers, D. K. (1979) J. Mol. Biol. 134, 109-142). Hence, results from neutron scattering, x-ray crystallographic, and sedimentation studies (Oberfelder, R. W., Lee, L. L.-Y., and Lee, J.C. (1984) Biochemistry 23, 3813-3821) are totally consistent with each other. With the aid of computer modeling, the crystal structure has been manipulated in order to effect changes that are consistent with the conformational change described by the solution scattering data. The structural manipulation involves the rotation of the B domain relative to the A domain, leading to the closure of the cleft between these domains. These manipulations resulted in the generation of new sets of atomic (C-alpha) coordinates, which were utilized in calculations, the result of which compared favorably with the solution data.
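The two quantities compared with the scattering data, a radius of gyration and a length (pair-distance) distribution computed from C-alpha coordinates, can be sketched as follows. This is an illustration under simplifying assumptions made here: every atom is weighted equally, and the binning scheme and function names are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]


def radius_of_gyration(coords: Sequence[Point]) -> float:
    """Root-mean-square distance of the points from their centroid
    (equal weights assumed)."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    ) / n
    return math.sqrt(mean_sq)


def pair_distance_histogram(coords: Sequence[Point], bin_width: float) -> Dict[int, int]:
    """Count interatomic distances per bin; bin b covers
    [b * bin_width, (b + 1) * bin_width)."""
    hist: Counter = Counter()
    for a, b in combinations(coords, 2):
        hist[int(math.dist(a, b) // bin_width)] += 1
    return dict(hist)


if __name__ == "__main__":
    toy = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]  # made-up coordinates
    print(radius_of_gyration(toy), pair_distance_histogram(toy, bin_width=1.0))
```

Recomputing both values after moving one domain relative to another is one way to compare a manipulated model against solution data, in the spirit of the modeling described above.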
43
Consler TG, Uberbacher EC, Bunick GJ, Liebman MN, Lee JC. Domain interaction in rabbit muscle pyruvate kinase. II. Small angle neutron scattering and computer simulation. J Biol Chem 1988. [DOI: 10.1016/s0021-9258(18)69139-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.4] [Indexed: 10/22/2022] Open
44
Stoops JK, Wakil SJ, Uberbacher EC, Bunick GJ. Small-angle neutron-scattering and electron microscope studies of the chicken liver fatty acid synthase. J Biol Chem 1987; 262:10246-51. [PMID: 3611059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023] Open
Abstract
A structural model for the chicken liver fatty acid synthase is proposed based on electron microscope and small-angle neutron-scattering studies of the enzyme. The model has the overall appearance of two side by side cylinders with dimensions of 160 × 146 × 73 Å, with each subunit 160 Å in length and 73 Å in diameter. The model was constructed by dividing each cylinder into three domains having lengths of 32, 82, and 46 Å, with the domain structures in the two subunits being related to each other by a dyad axis. The model is consistent with chemical cross-linking studies which indicated that the subunits are arranged in a head to tail fashion. The cross-linking studies further showed that the beta-ketoacyl synthase active site contains a cysteine and a pantetheine residue from adjacent subunits. It is proposed that the domains which catalyze the addition of C2 units from malonate to the growing fatty acid chain lie in the crevice between the two subunits and that the two independent sets of fatty acid-synthesizing centers lie on the major axis of the model on opposite ends of the molecular dyad.
45
Uberbacher EC, Harp JM, Wilkinson-Singley E, Bunick GJ. Shape analysis of the histone octamer in solution. Science 1986; 232:1247-9. [PMID: 3704649 DOI: 10.1126/science.3704649] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.3] [Indexed: 01/07/2023]
Abstract
The conformation of the histone octamer is shown to depend upon the specific salt used to solubilize it. In 2 M sodium chloride the octamer is similar in size and shape to the histone component of crystallized core nucleosomes. In contrast, in 3.5 M ammonium sulfate the octamer is elongated, resembling an ellipsoid with approximate dimensions of 114 by 62 by 62 angstroms. These results indicate that the elongated conformation seen in the 3.3-angstrom electron density map of the histone octamer crystallized in ammonium sulfate is due to the particular salt conditions used.
46
Goddette DW, Uberbacher EC, Bunick GJ, Frieden C. Formation of actin dimers as studied by small angle neutron scattering. J Biol Chem 1986; 261:2605-9. [PMID: 3949737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/08/2023] Open
Abstract
Small angle neutron scattering has been used to study the dimensions of G-actin and the formation of low molecular weight actin oligomers under conditions where rapid polymerization does not take place. In the presence of 200 µM Ca2+, actin in solution consists of a single component with a radius of gyration (Rg) of 19.9 ± 0.4 Å, consistent with the known molecular dimensions of the G-actin molecule. In the presence of 50 µM Mg2+, however, formation of an actin species with a larger Rg occurs over a 4-h period. Multicomponent fits were tried and the data were best fit assuming two components, the monomer and a species with an Rg of 29 ± 1 Å. This latter value is consistent with the dimensions expected for certain actin dimers. The apparent dissociation constant for dimer formation is approximately 150 µM with forward and reverse rate constants of 6.0 × 10⁻⁷ µM⁻¹ s⁻¹ and 8.8 × 10⁻⁵ s⁻¹, respectively. Kinetic fluorescence experiments show that the dimer formed in the presence of low levels of Mg2+ is a nonproductive complex which does not participate in the polymerization process. However, the addition of cytochalasin D to actin in the presence of 50 µM Mg2+ rapidly induces the formation of dimers, presumably related to cytochalasin's ability to nucleate actin polymerization.
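The reported dissociation constant and rate constants can be cross-checked against each other. The sketch assumes a simple reversible dimerization in which the dissociation constant equals the reverse rate constant divided by the forward one; the abstract does not say how the authors derived their value, and the function name is hypothetical.

```python
def dissociation_constant(k_reverse: float, k_forward: float) -> float:
    """Kd = k_reverse / k_forward for a simple reversible association step."""
    return k_reverse / k_forward


k_forward = 6.0e-7  # per micromolar per second, from the abstract
k_reverse = 8.8e-5  # per second, from the abstract

kd_micromolar = dissociation_constant(k_reverse, k_forward)

if __name__ == "__main__":
    print(f"Kd is about {kd_micromolar:.0f} micromolar")
```

The ratio comes to roughly 147 micromolar, in line with the "approximately 150" quoted in the abstract.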
47
Goddette DW, Uberbacher EC, Bunick GJ, Frieden C. Formation of actin dimers as studied by small angle neutron scattering. J Biol Chem 1986. [DOI: 10.1016/s0021-9258(17)35830-1] [Citation(s) in RCA: 32] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022] Open
48
Uberbacher EC, Bunick GJ. Crystallographic Structure of the Octamer Histone Core of the Nucleosome. Science 1985; 229:1112-3. [PMID: 17753285 DOI: 10.1126/science.229.4718.1112] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 11/02/2022]
49
Abstract
Two monoclinic crystal forms (P2₁, C2) of chicken erythrocyte nucleosomes have been under study in this laboratory. The x-ray structure of the P2₁ crystal form has been solved to 15 Å resolution. The B-DNA superhelix has a relatively uniform curvature, with only several local distortions observed in the superhelix. The individual histone domains have been localized and specific contacts between each histone and the DNA can be observed. Histone contacts to the inner surface of the DNA superhelix occur predominantly at the minor groove sites. Most of the histone core is contained within the inner surface of the superhelical DNA, except for part of H2A which extends between the DNA gyres near the terminus of the DNA. No part of H2A blocks the DNA terminus or would prevent a smooth exit of the DNA into the linker region. A similar extension of a portion of histone H4 between the DNA gyres occurs close to the dyad axis. Both unique nucleosomes in the P2₁ asymmetric unit demonstrate good dyad symmetry and are similar to each other throughout the histone core and DNA regions.
50
Uberbacher EC, Ramakrishnan V, Olins DE, Bunick GJ. Neutron scattering studies of nucleosome structure at low ionic strength. Biochemistry 1983; 22:4916-23. [PMID: 6639936 DOI: 10.1021/bi00290a007] [Citation(s) in RCA: 32] [Impact Index Per Article: 0.8] [Indexed: 01/21/2023]
Abstract
Ionic strength studies using homogeneous preparations of chicken erythrocyte nucleosomes containing either 146 or 175 base pairs of DNA show a single unfolding transition at about 1.5 mM ionic strength as determined by small-angle neutron scattering. The transition seen by some investigators at between 2.9 and 7.5 mM ionic strength is not observed by small-angle neutron scattering in either type of nucleosome particle. The two contrasts measured (H2O and D2O) indicate that only small conformational changes occur in the protein core, but the DNA is partially unfolded below the transition point. Patterson inversion of the data and analysis of models indicate that the DNA in both types of particle is unwinding from the ends, leaving about one turn of supercoiled DNA bound to the histone core in approximately its normal (compact) conformation. The mechanism of unfolding appears to be similar for both types of particles and in both cases occurs at the same ionic strength. The unfolding observed for nucleosomes in this study is in definite disagreement with extended superhelical models for the DNA and also disagrees with models incorporating an unfolded histone core.