1
Deep Analysis of Residue Constraints (DARC): identifying determinants of protein functional specificity. Sci Rep 2020; 10:1691. [PMID: 32015389 PMCID: PMC6997377 DOI: 10.1038/s41598-019-55118-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/26/2019] [Accepted: 11/23/2019] [Indexed: 01/03/2023]
Abstract
Protein functional constraints are manifest as superfamily and functional-subgroup conserved residues, and as pairwise correlations. Deep Analysis of Residue Constraints (DARC) aids the visualization of these constraints, characterizes how they correlate with each other and with structure, and estimates statistical significance. This can identify determinants of protein functional specificity, as we illustrate for bacterial DNA clamp loader ATPases. These load ring-shaped sliding clamps onto DNA to keep polymerase attached during replication and contain one δ, three γ, and one δ’ AAA+ subunits semi-circularly arranged in the order δ-γ1-γ2-γ3-δ’. Only γ is active, though both γ and δ’ functionally influence an adjacent γ subunit. DARC identifies, as functionally-congruent features linking allosterically the ATP, DNA, and clamp binding sites: residues distinctive of γ and of γ/δ’ that mutually interact in trans, centered on the catalytic base; several γ/δ’-residues and six γ/δ’-covariant residue pairs within the DNA binding N-termini of helices α2 and α3; and γ/δ’-residues associated with the α2 C-terminus and the clamp-binding loop. Most notable is a trans-acting γ/δ’ hydroxyl group that 99% of other AAA+ proteins lack. Mutation of this hydroxyl to a methyl group impedes clamp binding and opening, DNA binding, and ATP hydrolysis—implying a remarkably clamp-loader-specific function.
2
Neuwald AF, Altschul SF. Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations. PLoS Comput Biol 2016; 12:e1005294. [PMID: 28002465 PMCID: PMC5225019 DOI: 10.1371/journal.pcbi.1005294] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 05/27/2016] [Revised: 01/10/2017] [Accepted: 12/08/2016] [Indexed: 11/25/2022]
Abstract
Over evolutionary time, members of a superfamily of homologous proteins sharing a common structural core diverge into subgroups filling various functional niches. At the sequence level, such divergence appears as correlations that arise from residue patterns distinct to each subgroup. Such a superfamily may be viewed as a population of sequences corresponding to a complex, high-dimensional probability distribution. Here we model this distribution as hierarchical interrelated hidden Markov models (hiHMMs), which describe these sequence correlations implicitly. By characterizing such correlations one may hope to obtain information regarding functionally-relevant properties that have thus far evaded detection. To do so, we infer a hiHMM distribution from sequence data using Bayes’ theorem and Markov chain Monte Carlo (MCMC) sampling, which is widely recognized as the most effective approach for characterizing a complex, high dimensional distribution. Other routines then map correlated residue patterns to available structures with a view to hypothesis generation. When applied to N-acetyltransferases, this reveals sequence and structural features indicative of functionally important, yet generally unknown biochemical properties. Even for sets of proteins for which nothing is known beyond unannotated sequences and structures, this can lead to helpful insights. We describe, for example, a putative coenzyme-A-induced-fit substrate binding mechanism mediated by arginine residue switching between salt bridge and π-π stacking interactions. A suite of programs implementing this approach is available (psed.igs.umaryland.edu).
Protein sequence data, when gathered in great quantity, contain important but implicit biological information manifest as statistical correlations. Here we describe an approach to access this information by comprehensively modeling and characterizing the distribution of sequences belonging to a major protein superfamily.
This approach takes as input a large set of unaligned sequences belonging to the superfamily. By applying the minimum description length principle, it seeks the statistical model that best explains the sequences while avoiding over-fitting the data. It concurrently aligns the sequences and, to model evolutionary divergence, partitions them into subgroups that are hierarchically-arranged based upon correlated residue patterns. Auxiliary routines create PyMOL scripts to visualize the locations of correlated residues within available structures. Because these correlations likely arise from structural and biochemical constraints, they can help elucidate protein properties important for functional specificity. Comparing and contrasting sequence and structural features in this way may therefore suggest, in the light of published studies, plausible biological hypotheses for experimental investigation. We illustrate this approach with N-acetyltransferases.
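The minimum description length trade-off mentioned above can be illustrated with a toy example. The sketch below is not the authors' model: it assumes a single alignment column, a two-part code, and the common approximation of (k/2) log2(n) bits for the parameters of each group, where k = 19 free residue frequencies. The function names `data_bits` and `description_length` are hypothetical.

```python
import math
from collections import Counter

def data_bits(column):
    """Bits needed to encode the residues using their empirical frequencies."""
    counts = Counter(column)
    n = len(column)
    return -sum(c * math.log2(c / n) for c in counts.values())

def description_length(groups, alphabet_size=20):
    """Two-part code length: parameter bits plus data bits, summed over groups.

    Assumption: each group pays 0.5 * (alphabet_size - 1) * log2(n) bits
    for its residue-frequency parameters (a standard approximation, not
    taken from the paper).
    """
    total = 0.0
    for group in groups:
        param_bits = 0.5 * (alphabet_size - 1) * math.log2(len(group))
        total += param_bits + data_bits(group)
    return total

# Large subgroups with distinct residues: splitting shortens the description.
big_a, big_b = "S" * 200, "A" * 200
split_large = description_length([big_a, big_b])
merged_large = description_length([big_a + big_b])

# Tiny subgroups: the extra parameters cost more than the split saves.
small_a, small_b = "S" * 4, "A" * 4
split_small = description_length([small_a, small_b])
merged_small = description_length([small_a + small_b])
```

With these inputs the split is preferred only for the large subgroups, which is the sense in which a description-length criterion guards against over-fitting small samples.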
Affiliation(s)
- Andrew F. Neuwald
- Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, BioPark II, Room 617, Baltimore, MD, United States of America
- Stephen F. Altschul
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States of America
3
Abstract
Hierarchically-arranged multiple sequence alignment profiles are useful for modeling protein domains that have functionally diverged into evolutionarily-related subgroups. Currently such alignment hierarchies are largely constructed through manual curation, as for the NCBI Conserved Domain Database (CDD). Recently, however, I developed a Gibbs sampler that uses an approach termed statistical evolutionary dynamics analysis to accomplish this task automatically while, at the same time, identifying sequence determinants of protein function. Here I describe the statistical model and sampling strategies underlying this sampler. When implemented and applied to simulated protein sequences (which conform to the underlying statistical model precisely), these sampling strategies efficiently converge on the hierarchy used to generate the sequences. However, for real protein sequences the sampler finds alternative, nearly-optimal hierarchies for many domains, indicating a significant degree of ambiguity. I illustrate how both the nature of such ambiguities and the most robust ("consensus") features of a hierarchy may be determined from an ensemble of independently generated hierarchies for the same domain. Such consensus hierarchies can provide reliably stable models of protein domain functional divergence.
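One simple way to picture a consensus over an ensemble of hierarchies is a majority vote on subgroups. The sketch below is an illustration only, not the sampler's procedure: it assumes each hierarchy is represented as a set of subgroups, each subgroup a `frozenset` of sequence identifiers, and it counts only exact matches. The function name `consensus_subgroups` and the `min_support` threshold are hypothetical.

```python
from collections import Counter

def consensus_subgroups(hierarchies, min_support=0.5):
    """Return {subgroup: support} for subgroups present in at least
    `min_support` of the independently generated hierarchies."""
    counts = Counter()
    for hierarchy in hierarchies:
        counts.update(set(hierarchy))  # count each subgroup once per run
    n_runs = len(hierarchies)
    return {g: c / n_runs for g, c in counts.items() if c / n_runs >= min_support}

# Three toy runs over sequences A..F; the runs disagree on some subgroups.
runs = [
    {frozenset("ABCDEF"), frozenset("ABC"), frozenset("DEF")},
    {frozenset("ABCDEF"), frozenset("ABC"), frozenset("DE")},
    {frozenset("ABCDEF"), frozenset("AB"), frozenset("DEF")},
]
support = consensus_subgroups(runs, min_support=0.6)
```

Subgroups that appear in only one run (here `DE` and `AB`) fall below the threshold, which mirrors the idea of separating robust features from ambiguous ones. Real hierarchies would likely need fuzzy matching of nearly identical subgroups rather than exact set equality.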
4
A novel function for the conserved glutamate residue in the walker B motif of replication factor C. Genes (Basel) 2013; 4:134-51. [PMID: 23946885 PMCID: PMC3740443 DOI: 10.3390/genes4020134] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/24/2022]
Abstract
In all domains of life, sliding clamps tether DNA polymerases to DNA to increase the processivity of synthesis. Clamp loaders load clamps onto DNA in a multi-step process that requires ATP binding and hydrolysis. Like other AAA+ proteins, clamp loaders contain conserved Walker A and Walker B sequence motifs, which participate in ATP binding and hydrolysis, respectively. Mutation of the glutamate residue in Walker B motifs (or DExx-boxes) in AAA+ proteins typically reduces ATP hydrolysis by as much as a couple orders of magnitude, but has no effect on ATP binding. Here, the Walker B Glu in each of the four active ATP sites of the eukaryotic clamp loader, RFC, was mutated to Gln and Ala separately, and ATP binding- and hydrolysis-dependent activities of the quadruple mutant clamp loaders were characterized. Fluorescence-based assays were used to measure individual reaction steps required for clamp loading including clamp binding, clamp opening, DNA binding and ATP hydrolysis. Our results show that the Walker B mutations affect ATP-binding-dependent interactions of RFC with the clamp and DNA in addition to reducing ligand-dependent ATP hydrolysis activity. Here, we show that the Walker B glutamate is required for ATP-dependent ligand binding activity, a previously unknown function for this conserved Glu residue in RFC.
5
Neuwald AF, Lanczycki CJ, Marchler-Bauer A. Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures. BMC Bioinformatics 2012; 13:144. [PMID: 22726767 PMCID: PMC3599474 DOI: 10.1186/1471-2105-13-144] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 02/06/2012] [Accepted: 06/09/2012] [Indexed: 11/17/2022]
Abstract
Background: The NCBI Conserved Domain Database (CDD) consists of a collection of multiple sequence alignments of protein domains that are at various stages of being manually curated into evolutionary hierarchies based on conserved and divergent sequence and structural features. These domain models are annotated to provide insights into the relationships between sequence, structure and function via web-based BLAST searches.
Results: Here we automate the generation of conserved domain (CD) hierarchies using a combination of heuristic and Markov chain Monte Carlo (MCMC) sampling procedures and starting from a (typically very large) multiple sequence alignment. This procedure relies on statistical criteria to define each hierarchy based on the conserved and divergent sequence patterns associated with protein functional-specialization. At the same time this facilitates the sequence and structural annotation of residues that are functionally important. These statistical criteria also provide a means to objectively assess the quality of CD hierarchies, a non-trivial task considering that the protein subgroups are often very distantly related, a situation in which standard phylogenetic methods can be unreliable. Our aim here is to automatically generate (typically sub-optimal) hierarchies that, based on statistical criteria and visual comparisons, are comparable to manually curated hierarchies; this serves as the first step toward the ultimate goal of obtaining optimal hierarchical classifications. A plot of runtimes for the most time-intensive (non-parallelizable) part of the algorithm indicates a nearly linear time complexity so that, even for the extremely large Rossmann fold protein class, results were obtained in about a day.
Conclusions: This approach automates the rapid creation of protein domain hierarchies and thus will eliminate one of the most time consuming aspects of conserved domain database curation. At the same time, it also facilitates protein domain annotation by identifying those pattern residues that most distinguish each protein domain subgroup from other related subgroups.
Affiliation(s)
- Andrew F Neuwald
- Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, BioPark II, Room 617, 801 West Baltimore St, Baltimore, MD 21201, USA.
6
Kelch BA, Makino DL, O'Donnell M, Kuriyan J. Clamp loader ATPases and the evolution of DNA replication machinery. BMC Biol 2012; 10:34. [PMID: 22520345 PMCID: PMC3331839 DOI: 10.1186/1741-7007-10-34] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.1] [Received: 04/04/2012] [Accepted: 04/20/2012] [Indexed: 11/19/2022]
Abstract
Clamp loaders are pentameric ATPases of the AAA+ family that operate to ensure processive DNA replication. They do so by loading onto DNA the ring-shaped sliding clamps that tether the polymerase to the DNA. Structural and biochemical analysis of clamp loaders has shown how, despite differences in composition across different branches of life, all clamp loaders undergo the same concerted conformational transformations, which generate a binding surface for the open clamp and an internal spiral chamber into which the DNA at the replication fork can slide, triggering ATP hydrolysis, release of the clamp loader, and closure of the clamp round the DNA. We review here the current understanding of the clamp loader mechanism and discuss the implications of the differences between clamp loaders from the different branches of life.
Affiliation(s)
- Brian A Kelch
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA.
7
Surveying the manifold divergence of an entire protein class for statistical clues to underlying biochemical mechanisms. Stat Appl Genet Mol Biol 2011; 10:Article 36. [PMID: 22331370 DOI: 10.2202/1544-6115.1666] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022]
Abstract
Certain residues have no known function yet are co-conserved across distantly related protein families and diverse organisms, suggesting that they perform critical roles associated with as-yet-unidentified molecular properties and mechanisms. This raises the question of how to obtain additional clues regarding these mysterious biochemical phenomena with a view to formulating experimentally testable hypotheses. One approach is to access the implicit biochemical information encoded within the vast amount of genomic sequence data now becoming available. Here, a new Gibbs sampling strategy is formulated and implemented that can partition hundreds of thousands of sequences within a major protein class into multiple, functionally-divergent categories based on those pattern residues that best discriminate between categories. The sampler precisely defines the partition and pattern for each category by explicitly modeling unrelated, non-functional and related-yet-divergent proteins that would otherwise obscure the analysis. To aid biological interpretation, auxiliary routines can characterize pattern residues within available crystal structures and identify those structures most likely to shed light on the roles of pattern residues. This approach can be used to define and annotate automatically subgroup-specific conserved domain profiles based on statistically-rigorous empirical criteria rather than on the subjective and labor-intensive process of manual curation. Incorporating such profiles into domain database search sites (such as the NCBI BLAST site) will provide biologists with previously inaccessible molecular information useful for hypothesis generation and experimental design. Analyses of P-loop GTPases and of AAA+ ATPases illustrate the sampler's ability to obtain such information.
8
Neuwald AF. Bayesian classification of residues associated with protein functional divergence: Arf and Arf-like GTPases. Biol Direct 2010; 5:66. [PMID: 21129209 PMCID: PMC3012027 DOI: 10.1186/1745-6150-5-66] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 09/07/2010] [Accepted: 12/03/2010] [Indexed: 11/22/2022]
Abstract
Background: Certain residues within proteins are highly conserved across very distantly related organisms, yet their (presumably critical) structural or mechanistic roles are completely unknown. To obtain clues regarding such residues within Arf and Arf-like (Arf/Arl) GTPases (which function as on/off switches regulating vesicle trafficking, phospholipid metabolism and cytoskeletal remodeling), I apply a new sampling procedure for comparative sequence analysis, termed multiple category Bayesian Partitioning with Pattern Selection (mcBPPS).
Results: The mcBPPS sampler classified sequences within the entire P-loop GTPase class into multiple categories by identifying those evolutionarily-divergent residues most likely to be responsible for functional specialization. Here I focus on categories of residues that most distinguish various Arf/Arl GTPases from other GTPases. This identified residues whose specific roles have been previously proposed (and in some cases corroborated experimentally and that thus serve as positive controls), as well as several categories of co-conserved residues whose possible roles are first hinted at here. For example, Arf/Arl/Sar GTPases are most distinguished from other GTPases by a conserved aspartate residue within the phosphate binding loop (P-loop) and by co-conserved residues nearby that, together, can form a network of salt-bridge and hydrogen bond interactions centered on the GTPase active site. Residues corresponding to an N-[VI] motif that is conserved within Arf/Arl GTPases may play a role in the interswitch toggle characteristic of the Arf family, whereas other, co-conserved residues may modulate the flexibility of the guanine binding loop. Arl8 GTPases conserve residues that strikingly diverge from those typically found in other Arf/Arl GTPases and that form structural interactions suggestive of a novel interswitch toggle mechanism.
Conclusions: This analysis suggests specific mutagenesis experiments to explore mechanisms underlying GTP hydrolysis, nucleotide exchange and interswitch toggling within Arf/Arl GTPases. More generally, it illustrates how the mcBPPS sampler can complement traditional evolutionary analyses by providing an objective, quantitative and statistically rigorous way to explore protein functional-divergence in molecular detail. Because the sampler classifies the input sequences at the same time, it can be used to generate subgroup profiles, in which functionally-divergent categories of residues are annotated automatically.
Reviewers: This article was reviewed by Frank Eisenhaber, L Aravind and Daniel Gaston (nominated by Eric Bapteste). For the full reviews, go to the Reviewers' comments section.
Affiliation(s)
- Andrew F Neuwald
- Department of Biochemistry & Molecular Biology, Institute for Genome Sciences, University of Maryland School of Medicine, BioPark II, Room 617, 801 West Baltimore St, Baltimore, MD 21201, USA.
9
Neuwald AF. Rapid detection, classification and accurate alignment of up to a million or more related protein sequences. Bioinformatics 2009; 25:1869-75. [PMID: 19505947 DOI: 10.1093/bioinformatics/btp342] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.3] [Indexed: 11/12/2022]
Abstract
MOTIVATION: The patterns of sequence similarity and divergence present within functionally diverse, evolutionarily related proteins contain implicit information about corresponding biochemical similarities and differences. A first step toward accessing such information is to statistically analyze these patterns, which, in turn, requires that one first identify and accurately align a very large set of protein sequences. Ideally, the set should include many distantly related, functionally divergent subgroups. Because it is extremely difficult, if not impossible for fully automated methods to align such sequences correctly, researchers often resort to manual curation based on detailed structural and biochemical information. However, multiply-aligning vast numbers of sequences in this way is clearly impractical.
RESULTS: This problem is addressed using Multiply-Aligned Profiles for Global Alignment of Protein Sequences (MAPGAPS). The MAPGAPS program uses a set of multiply-aligned profiles both as a query to detect and classify related sequences and as a template to multiply-align the sequences. It relies on Karlin-Altschul statistics for sensitivity and on PSI-BLAST (and other) heuristics for speed. Using as input a carefully curated multiple-profile alignment for P-loop GTPases, MAPGAPS correctly aligned weakly conserved sequence motifs within 33 distantly related GTPases of known structure. By comparison, the sequence- and structurally based alignment methods hmmalign and PROMALS3D misaligned at least 11 and 23 of these regions, respectively. When applied to a dataset of 65 million protein sequences, MAPGAPS identified, classified and aligned (with comparable accuracy) nearly half a million putative P-loop GTPase sequences.
AVAILABILITY: A C++ implementation of MAPGAPS is available at http://mapgaps.igs.umaryland.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Andrew F Neuwald
- Department of Biochemistry & Molecular Biology and The Institute for Genome Sciences, University of Maryland, School of Medicine, BioPark II, Baltimore, MD 21201, USA.
10
Neuwald AF. The glycine brace: a component of Rab, Rho, and Ran GTPases associated with hinge regions of guanine- and phosphate-binding loops. BMC Struct Biol 2009; 9:11. [PMID: 19265520 PMCID: PMC2656535 DOI: 10.1186/1472-6807-9-11] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 10/24/2008] [Accepted: 03/05/2009] [Indexed: 11/10/2022]
Abstract
Background: Ras-like GTPases function as on-off switches in intracellular signalling pathways and include the Rab, Rho/Rac, Ran, Ras, Arf, Sar and Gα families. How these families have evolutionarily diverged from each other at the sequence level provides clues to underlying mechanisms associated with their functional specialization.
Results: Bayesian analysis of divergent patterns within a multiple alignment of Ras-like GTPase sequences identifies a structural component, termed here the glycine brace, as the feature that most distinguishes Rab, Rho/Rac, Ran and (to some degree) Ras family GTPases from other Ras-like GTPases. The glycine brace consists of four residues: an aromatic residue that forms a stabilizing CH-π interaction with a conserved glycine at the start of the guanine-binding loop; a second aromatic residue, which is nearly always a tryptophan, that likewise forms stabilizing CH-π and NH-π interactions with a glycine at the start of the phosphate-binding P-loop; and two other residues (typically an aspartate and a serine or threonine) that, together with a conserved buried water molecule, form a network of interactions connecting the two aromatic residues.
Conclusion: It is proposed that the two glycine residues function as hinges and that the glycine brace influences guanine nucleotide binding and release by interacting with these hinges.
Affiliation(s)
- Andrew F Neuwald
- Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, 801 West Baltimore St, BioPark II, Baltimore, MD 21201, USA.
11
Neuwald AF. Galpha Gbetagamma dissociation may be due to retraction of a buried lysine and disruption of an aromatic cluster by a GTP-sensing Arg Trp pair. Protein Sci 2007; 16:2570-7. [PMID: 17962409 DOI: 10.1110/ps.073098107] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 12/28/2022]
Abstract
The heterotrimeric G protein alpha subunit (Galpha) functions as a molecular switch by cycling between inactive GDP-bound and active GTP-bound states. When bound to GDP, Galpha interacts with high affinity to a complex of the beta and gamma subunits (Gbetagamma), but when bound to GTP, Galpha dissociates from this complex to activate downstream signaling pathways. Galpha's state is communicated to other cellular components via conformational changes within its switch I and II regions. To identify key determinants of Galpha's function as a signaling pathway molecular switch, a Bayesian approach was used to infer the selective constraints that most distinguish Galpha and closely related Arf family GTPases from distantly related translational and metabolic GTPases. The strongest of these constraints are imposed on seven residues within or near the switch II region. Likewise, constraints imposed on Galpha but not on other, closely related molecular switches correspond to four nearby residues. These constraints are explained by a proposed mechanism for GTP-induced dissociation of Galpha from Gbetagamma where an Arg-Trp pair senses the presence of bound GTP leading to conformational retraction of a nearby lysine and to disruption of an aromatic cluster. Within a complex of Gialpha, Gibetagamma, and GDP, this lysine establishes greater surface contact with Gibeta than does any other residue in Gialpha, whereas the aromatic cluster packs against a highly conserved tryptophan in Gibeta that establishes greater surface contact with Gialpha than does any other residue in Gibeta. Other structural features associated with Galpha functional divergence further support the proposed mechanism.
12
Neuwald AF. The CHAIN program: forging evolutionary links to underlying mechanisms. Trends Biochem Sci 2007; 32:487-93. [PMID: 17962021 DOI: 10.1016/j.tibs.2007.08.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 08/02/2007] [Revised: 08/13/2007] [Accepted: 08/17/2007] [Indexed: 11/25/2022]
Abstract
Proteins evolve new functions by modifying and extending the molecular machinery of an ancestral protein. Such changes show up as divergent sequence patterns, which are conserved in descendent proteins that maintain the divergent function. After multiply-aligning a set of input sequences, the CHAIN program partitions the sequences into two functionally divergent groups and then outputs an alignment that is annotated to reveal the selective pressures imposed on divergent residue positions. If atomic coordinates are also provided, hydrogen bonds and other atomic interactions associated with various categories of divergent residues are graphically displayed. Such analyses establish links between protein evolutionary divergence and functionally crucial atomic features and, as a result, can suggest plausible molecular mechanisms for experimental testing. This is illustrated here by its application to bacterial clamp-loader ATPases.
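The idea of divergent residue positions between two groups can be sketched with a simple frequency comparison. This is a hedged illustration and not the CHAIN program's statistical model: it assumes two pre-partitioned sets of equal-length aligned sequences and flags columns where the foreground's dominant residue is rare in the background. The function name `divergent_columns` and the thresholds `min_fg` and `max_bg` are hypothetical.

```python
from collections import Counter

def divergent_columns(foreground, background, min_fg=0.9, max_bg=0.2):
    """Flag alignment columns whose dominant foreground residue is
    conserved in the foreground but uncommon in the background.

    Returns a list of (column_index, residue, fg_freq, bg_freq).
    """
    hits = []
    for col in range(len(foreground[0])):
        fg_counts = Counter(seq[col] for seq in foreground)
        residue, n = fg_counts.most_common(1)[0]
        if residue == "-":  # skip columns dominated by gaps
            continue
        fg_freq = n / len(foreground)
        bg_freq = sum(seq[col] == residue for seq in background) / len(background)
        if fg_freq >= min_fg and bg_freq <= max_bg:
            hits.append((col, residue, fg_freq, bg_freq))
    return hits

# Toy alignment: columns 1 and 2 diverge between the two groups.
fg = ["MKTA", "MKTG", "MKTA"]
bg = ["MRSA", "MRAG", "MKSA", "MRSG", "MRSA"]
hits = divergent_columns(fg, bg)
```

Column 0 is conserved in both groups and so is not flagged, which separates superfamily-wide conservation from subgroup-specific divergence. A Bayesian treatment would replace the fixed thresholds with a model-based score.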
Affiliation(s)
- Andrew F Neuwald
- The J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850, USA.
13
Kannan N, Haste N, Taylor SS, Neuwald AF. The hallmark of AGC kinase functional divergence is its C-terminal tail, a cis-acting regulatory module. Proc Natl Acad Sci U S A 2007; 104:1272-7. [PMID: 17227859 PMCID: PMC1783090 DOI: 10.1073/pnas.0610251104] [Citation(s) in RCA: 175] [Impact Index Per Article: 9.7] [Indexed: 11/18/2022]
Abstract
The catalytic activities of eukaryotic protein kinases (EPKs) are regulated by movement of the C-helix, movement of the N and C lobes upon ATP binding, and movement of the activation loop upon phosphorylation. Statistical analysis of the selective constraints associated with AGC kinase functional divergence reveals conserved interactions between these regulatory regions and three regions of the C-terminal tail (C-tail): the N-lobe tether (NLT), the active-site tether (AST), and the C-lobe tether (CLT). The NLT serves as a docking site for an upstream kinase PDK1 and, upon activation, positions the C-helix within the ATP binding pocket. The AST directly interacts with the ATP binding pocket, and the CLT interacts with the interlobe linker and the alphaC-beta4 loop, which appears to serve as a hinge for C-helix movement. The C-tail is a hallmark of AGC functional divergence inasmuch as most of the conserved core residues that distinguish AGC kinases from other EPKs are associated with the NLT, AST, or CLT. Moreover, several AGC catalytic core conserved residues that interact with the C-tail strikingly diverge from the canonical residues observed at corresponding positions in nearly all other EPKs, suggesting that the catalytic core may have coevolved with the C-tail in AGC kinases. These observations, along with the fact that the C-tail is needed for catalytic activity suggests that the C-tail is a cis-acting regulatory module that can also serve as a regulatory "handle," to which trans-acting cellular components can bind to modulate activity.
Affiliation(s)
- Natarajan Kannan
- Howard Hughes Medical Institute, Department of Chemistry and Biochemistry, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0654
- Nina Haste
- Howard Hughes Medical Institute, Department of Chemistry and Biochemistry, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0654
- Susan S. Taylor
- Howard Hughes Medical Institute, Department of Chemistry and Biochemistry, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0654
- To whom correspondence may be addressed.
- Andrew F. Neuwald
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724
- To whom correspondence may be addressed.
14
Neuwald AF. Hypothesis: bacterial clamp loader ATPase activation through DNA-dependent repositioning of the catalytic base and of a trans-acting catalytic threonine. Nucleic Acids Res 2006; 34:5280-90. [PMID: 17012286 PMCID: PMC1636414 DOI: 10.1093/nar/gkl519] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]
Abstract
The prokaryotic DNA polymerase III clamp loader complex loads the β clamp onto DNA to link the replication complex to DNA during processive synthesis and unloads it again once synthesis is complete. This minimal complex consists of one δ, one δ′ and three γ subunits, all of which possess an AAA+ module, though only the γ subunit exhibits ATPase activity. Here clues to underlying clamp loader mechanisms are obtained through Bayesian inference of various categories of selective constraints imposed on the γ and δ′ subunits. It is proposed that a conserved histidine is ionized via electron transfer involving structurally adjacent residues within the sensor 1 region of γ's AAA+ module. The resultant positive charge on this histidine inhibits ATPase activity by drawing the negatively charged catalytic base away from the active site. It is also proposed that this arrangement is disrupted upon interaction of DNA with basic residues in γ implicated previously in DNA binding, regarding which a lysine that is near the sensor 1 region and that is highly conserved both in bacterial and in eukaryotic clamp loader ATPases appears to play a critical role. γ ATPases also appear to utilize a trans-acting threonine that is donated by helix 6 of an adjacent γ or δ′ subunit and that assists in the activation of a water molecule for nucleophilic attack on the γ phosphorus atom of ATP. As eukaryotic and archaeal clamp loaders lack most of these key residues, it appears that eubacteria utilize a fundamentally different mechanism for clamp loader activation than do these other organisms.
Affiliation(s)
- Andrew F Neuwald
- Cold Spring Harbor Laboratory, 1 Bungtown Road PO Box 100, Cold Spring Harbor, NY 11724, USA