1
Chu SKS, Narang K, Siegel JB. Protein stability prediction by fine-tuning a protein language model on a mega-scale dataset. PLoS Comput Biol 2024; 20:e1012248. [PMID: 39038042 PMCID: PMC11293664 DOI: 10.1371/journal.pcbi.1012248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 08/01/2024] [Accepted: 06/13/2024] [Indexed: 07/24/2024] Open
Abstract
Protein stability plays a crucial role in a variety of applications, such as food processing, therapeutics, and the identification of pathogenic mutations. Engineering campaigns commonly seek to improve protein stability, and there is a strong interest in streamlining these processes to enable rapid optimization of highly stabilized proteins with fewer iterations. In this work, we explore utilizing a mega-scale dataset to develop a protein language model optimized for stability prediction. ESMtherm is trained on the folding stability of 528k natural and de novo sequences derived from 461 protein domains and can accommodate deletions, insertions, and multiple-point mutations. We show that a protein language model can be fine-tuned to predict folding stability. ESMtherm performs reasonably on small protein domains and generalizes to sequences distal from the training set. Lastly, we discuss our model's limitations compared to other state-of-the-art methods in generalizing to larger protein scaffolds. Our results highlight the need for large-scale stability measurements on a diverse dataset that mirrors the distribution of sequence lengths commonly observed in nature.
Affiliation(s)
- Simon K. S. Chu
  - Biophysics Graduate Program, University of California Davis, Davis, California, United States of America
- Kush Narang
  - College of Biological Sciences, University of California Davis, Davis, California, United States of America
- Justin B. Siegel
  - Genome Center, University of California Davis, Davis, California, United States of America
  - Department of Chemistry, University of California Davis, Davis, California, United States of America
  - Department of Biochemistry and Molecular Medicine, University of California Davis, Davis, California, United States of America
2
Kaur G, Kapoor S, Kaundal S, Dutta D, Thakur KG. Structure-Guided Designing and Evaluation of Peptides Targeting Bacterial Transcription. Front Bioeng Biotechnol 2020; 8:797. [PMID: 33014990 PMCID: PMC7505949 DOI: 10.3389/fbioe.2020.00797] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/27/2020] [Accepted: 06/22/2020] [Indexed: 11/17/2022] Open
Abstract
The mycobacterial RNA polymerase (RNAP) is an essential and validated drug target for developing antibacterial drugs. The β-subunit of Mycobacterium tuberculosis (Mtb) RNAP (RpoB) interacts with an essential and global transcription factor, CarD, and confers antibiotic and oxidative stress resistance to Mtb. Compromising the RpoB/CarD interactions results in the killing of mycobacteria, hence disrupting the RpoB/CarD interaction has been proposed as a novel strategy for the development of anti-tubercular drugs. Here, we describe the first approach to rationally design and test the efficacy of the peptide-based inhibitors which specifically target the conserved PPI interface between the bacterial RNAP β/transcription factor complex. We performed in silico protein-peptide docking studies along with biochemical assays to characterize the novel peptide-based inhibitors. Our results suggest that the top ranked peptides are highly stable, soluble in aqueous buffer, and capable of inhibiting transcription with IC50 > 50 μM concentration. Using peptide-based molecules, our study provides the first piece of evidence to target the conserved RNAP β/transcription factor interface for designing new inhibitors. Our results may hence form the basis to further improve the potential of these novel peptides in modulating bacterial gene expression, thus inhibiting bacterial growth and combating bacterial infections.
Affiliation(s)
- Gundeep Kaur
  - Structural Biology Laboratory, G. N. Ramachandran Protein Centre, Council of Scientific and Industrial Research-Institute of Microbial Technology, Chandigarh, India
- Srajan Kapoor
  - Structural Biology Laboratory, G. N. Ramachandran Protein Centre, Council of Scientific and Industrial Research-Institute of Microbial Technology, Chandigarh, India
- Soni Kaundal
  - Structural Biology Laboratory, G. N. Ramachandran Protein Centre, Council of Scientific and Industrial Research-Institute of Microbial Technology, Chandigarh, India
- Dipak Dutta
  - Molecular Microbiology Laboratory, Council of Scientific and Industrial Research-Institute of Microbial Technology, Chandigarh, India
- Krishan Gopal Thakur
  - Structural Biology Laboratory, G. N. Ramachandran Protein Centre, Council of Scientific and Industrial Research-Institute of Microbial Technology, Chandigarh, India
3
Caulobacter crescentus CdnL is a non-essential RNA polymerase-binding protein whose depletion impairs normal growth and rRNA transcription. Sci Rep 2017; 7:43240. [PMID: 28233804 PMCID: PMC5324124 DOI: 10.1038/srep43240] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 08/12/2016] [Accepted: 01/23/2017] [Indexed: 12/22/2022] Open
Abstract
CdnL is an essential RNA polymerase (RNAP)-binding activator of rRNA transcription in mycobacteria and myxobacteria but reportedly not in Bacillus. Whether its function and mode of action are conserved in other bacteria thus remains unclear. Because virtually all alphaproteobacteria have a CdnL homolog and none of these have been characterized, we studied the homolog (CdnLCc) of the model alphaproteobacterium Caulobacter crescentus. We show that CdnLCc is not essential for viability but that its absence or depletion causes slow growth and cell filamentation. CdnLCc is degraded in vivo in a manner dependent on its C-terminus, yet excess CdnLCc resulting from its stabilization did not adversely affect growth. We find that CdnLCc interacts with itself and with the RNAP β subunit, and localizes to at least one rRNA promoter in vivo, whose activity diminishes upon depletion of CdnLCc. Interestingly, cells expressing CdnLCc mutants unable to interact with the RNAP were cold-sensitive, suggesting that CdnLCc interaction with RNAP is especially required at lower than standard growth temperatures in C. crescentus. Our study indicates that despite limited sequence similarities and regulatory differences compared to its myco/myxobacterial homologs, CdnLCc may share similar biological functions, since it affects rRNA synthesis, probably by stabilizing open promoter-RNAP complexes.
4
Structure-function dissection of Myxococcus xanthus CarD N-terminal domain, a defining member of the CarD_CdnL_TRCF family of RNA polymerase interacting proteins. PLoS One 2015; 10:e0121322. [PMID: 25811865 PMCID: PMC4374960 DOI: 10.1371/journal.pone.0121322] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 12/05/2014] [Accepted: 01/30/2015] [Indexed: 01/08/2023] Open
Abstract
Two prototypes of the large CarD_CdnL_TRCF family of bacterial RNA polymerase (RNAP)-binding proteins, Myxococcus xanthus CarD and CdnL, have distinct functions whose molecular basis remains elusive. CarD, a global regulator linked to the action of several extracytoplasmic function (ECF) σ-factors, binds to the RNAP β subunit (RNAP-β) and to protein CarG via an N-terminal domain, CarDNt, and to DNA via an intrinsically unfolded C-terminal domain resembling eukaryotic high-mobility-group A (HMGA) proteins. CdnL, a CarDNt-like protein that is essential for cell viability, is implicated in σA-dependent rRNA promoter activation and interacts with RNAP-β but not with CarG. While the HMGA-like domain of CarD by itself is inactive, we find that CarDNt has low but observable ability to activate ECF σ-dependent promoters in vivo, indicating that the C-terminal DNA-binding domain is required to maximize activity. Our structure-function dissection of CarDNt reveals an N-terminal, five-stranded β-sheet Tudor-like domain, CarD1-72, whose structure and contacts with RNAP-β mimic those of CdnL. Intriguingly, and in marked contrast to CdnL, CarD mutations that disrupt its interaction with RNAP-β did not annul activity. Our data suggest that the CarDNt C-terminal segment, CarD61-179, may be structurally distinct from its CdnL counterpart, and that it houses at least two distinct and crucial function determinants: (a) CarG-binding, which is specific to CarD; and (b) a basic residue stretch, which is also conserved and functionally required in CdnL. This study highlights the evolution of shared and divergent interactions in similar protein modules that enable the distinct activities of two related members of a functionally important and widespread bacterial protein family.
5
Gallego-García A, Mirassou Y, García-Moreno D, Elías-Arnanz M, Jiménez MA, Padmanabhan S. Structural insights into RNA polymerase recognition and essential function of Myxococcus xanthus CdnL. PLoS One 2014; 9:e108946. [PMID: 25272012 PMCID: PMC4182748 DOI: 10.1371/journal.pone.0108946] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 05/30/2014] [Accepted: 08/28/2014] [Indexed: 11/18/2022] Open
Abstract
CdnL and CarD are two functionally distinct members of the CarD_CdnL_TRCF family of bacterial RNA polymerase (RNAP)-interacting proteins, which co-exist in Myxococcus xanthus. While CarD, found exclusively in myxobacteria, has been implicated in the activity of various extracytoplasmic function (ECF) σ-factors, the function and mode of action of the essential CdnL, whose homologs are widespread among bacteria, remain to be elucidated in M. xanthus. Here, we report the NMR solution structure of CdnL and present a structure-based mutational analysis of its function. An N-terminal five-stranded β-sheet Tudor-like module in the two-domain CdnL mediates binding to RNAP-β, and mutations that disrupt this interaction impair cell growth. The compact CdnL C-terminal domain consists of five α-helices folded as in some tetratricopeptide repeat-like protein-protein interaction domains, and contains a patch of solvent-exposed nonpolar and basic residues, among which a set of basic residues is shown to be crucial for CdnL function. We show that CdnL, but not its loss-of-function mutants, stabilizes formation of transcriptionally competent, open complexes by the primary σA-RNAP holoenzyme at an rRNA promoter in vitro. Consistent with this, CdnL is present at rRNA promoters in vivo. Implication of CdnL in RNAP-σA activity and of CarD in ECF-σ function in M. xanthus exemplifies how two related members within a widespread bacterial protein family have evolved to enable distinct σ-dependent promoter activity.
Affiliation(s)
- Aránzazu Gallego-García
  - Departamento de Genética y Microbiología, Área de Genética (Unidad Asociada al IQFR-CSIC), Facultad de Biología, Universidad de Murcia, Murcia, Spain
- Yasmina Mirassou
  - Instituto de Química Física ‘Rocasolano’, Consejo Superior de Investigaciones Científicas (IQFR-CSIC), Madrid, Spain
- Diana García-Moreno
  - Departamento de Genética y Microbiología, Área de Genética (Unidad Asociada al IQFR-CSIC), Facultad de Biología, Universidad de Murcia, Murcia, Spain
- Montserrat Elías-Arnanz
  - Departamento de Genética y Microbiología, Área de Genética (Unidad Asociada al IQFR-CSIC), Facultad de Biología, Universidad de Murcia, Murcia, Spain
  - * E-mail: (MEA); (MAJ); (SP)
- María Angeles Jiménez
  - Instituto de Química Física ‘Rocasolano’, Consejo Superior de Investigaciones Científicas (IQFR-CSIC), Madrid, Spain
  - * E-mail: (MEA); (MAJ); (SP)
- S. Padmanabhan
  - Instituto de Química Física ‘Rocasolano’, Consejo Superior de Investigaciones Científicas (IQFR-CSIC), Madrid, Spain
  - * E-mail: (MEA); (MAJ); (SP)
6
Kaur G, Dutta D, Thakur KG. Crystal structure of Mycobacterium tuberculosis CarD, an essential RNA polymerase binding protein, reveals a quasidomain-swapped dimeric structural architecture. Proteins 2013; 82:879-84. [PMID: 24115125 DOI: 10.1002/prot.24419] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 07/09/2013] [Revised: 08/28/2013] [Accepted: 09/03/2013] [Indexed: 01/22/2023]
Abstract
Mycobacterium tuberculosis (Mtb) CarD is an essential transcriptional regulator that binds RNA polymerase and plays an important role in reprogramming the transcription machinery under diverse stress conditions. Here, we report the crystal structure of CarD at 2.3 Å resolution, which represents the first structural description of the CarD/CdnL-like family of proteins. CarD adopts an overall bi-lobed structural architecture in which the N-terminal domain resembles a 'tudor-like' domain and the C-terminal domain adopts a novel five-helical fold that lacks the predicted leucine zipper structural motif. The structure reveals a dimeric state of CarD resulting from β-strand swapping between the N-terminal domains of the individual subunits. The structure provides crucial insights into the possible mode(s) of CarD/RNAP interactions.
Affiliation(s)
- Gundeep Kaur
  - Structural Biology Laboratory, G. N. Ramachandran Protein Centre, CSIR-Institute of Microbial Technology, Chandigarh, 160036, India
7
Gulten G, Sacchettini JC. Structure of the Mtb CarD/RNAP β-lobes complex reveals the molecular basis of interaction and presents a distinct DNA-binding domain for Mtb CarD. Structure 2013; 21:1859-69. [PMID: 24055315 DOI: 10.1016/j.str.2013.08.014] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 05/13/2013] [Revised: 07/05/2013] [Accepted: 08/05/2013] [Indexed: 11/16/2022]
Abstract
CarD from Mycobacterium tuberculosis (Mtb) is an essential protein shown to be involved in stringent response through downregulation of rRNA and ribosomal protein genes. CarD interacts with the β-subunit of RNAP and this interaction is vital for Mtb's survival during the persistent infection state. We have determined the crystal structure of CarD in complex with the RNAP β-subunit β1 and β2 domains at 2.1 Å resolution. The structure reveals the molecular basis of CarD/RNAP interaction, providing a basis to further our understanding of RNAP regulation by CarD. The structural fold of the CarD N-terminal domain is conserved in RNAP interacting proteins such as TRCF-RID and CdnL, and displays similar interactions to the predicted homology model based on the TRCF/RNAP β1 structure. Interestingly, the structure of the C-terminal domain, which is required for complete CarD function in vivo, represents a distinct DNA-binding fold.
Affiliation(s)
- Gulcin Gulten
  - Department of Chemistry, Texas A&M University, College Station, TX 77843, USA
8
Structure and function of CarD, an essential mycobacterial transcription factor. Proc Natl Acad Sci U S A 2013; 110:12619-24. [PMID: 23858468 DOI: 10.1073/pnas.1308270110] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Indexed: 01/13/2023] Open
Abstract
CarD, an essential transcription regulator in Mycobacterium tuberculosis, directly interacts with the RNA polymerase (RNAP). We used a combination of in vivo and in vitro approaches to establish that CarD is a global regulator that stimulates the formation of RNAP-holoenzyme open promoter (RPo) complexes. We determined the X-ray crystal structure of Thermus thermophilus CarD, allowing us to generate a structural model of the CarD/RPo complex. On the basis of our structural and functional analyses, we propose that CarD functions by forming protein/protein and protein/DNA interactions that bridge the RNAP to the promoter DNA. CarD appears poised to interact with a DNA structure uniquely presented by the RPo: the splayed minor groove at the double-stranded/single-stranded DNA junction at the upstream edge of the transcription bubble. Thus, CarD uses an unusual mechanism for regulating transcription, sensing the DNA conformation where transcription bubble formation initiates.