1
Morrill GA, Kostellow AB. Molecular Properties of Globin Channels and Pores: Role of Cholesterol in Ligand Binding and Movement. Front Physiol 2016; 7:360. [PMID: 27656147 PMCID: PMC5011150 DOI: 10.3389/fphys.2016.00360] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 06/17/2016] [Accepted: 08/08/2016] [Indexed: 02/02/2023]
Abstract
Globins contain one or more cavities that control or affect such functions as ligand movement and ligand binding. Here we report that the extended globin family [cytoglobin (Cygb); neuroglobin (Ngb); myoglobin (Mb); hemoglobin (Hb) subunits Hba(α); and Hbb(β)] contain either a transmembrane (TM) helix or pore-lining region as well as internal cavities. Protein motif/domain analyses indicate that Ngb and Hbb each contain 5 cholesterol-binding (CRAC/CARC) domains and 1 caveolin-binding motif, whereas the Cygb dimer has 6 cholesterol-binding domains but lacks caveolin-binding motifs. Mb and Hba each exhibit 2 cholesterol-binding domains and also lack caveolin-binding motifs. The Hb αβ-tetramer contains 14 cholesterol-binding domains. Computer algorithms indicate that Cygb and Ngb cavities display multiple partitions and C-terminal pore-lining regions, whereas Mb has three major cavities plus a C-terminal pore-lining region. The Hb tetramer exhibits a large internal cavity but the subunits differ in that they contain a C-terminal TM helix (Hba) and pore-lining region (Hbb). The cavities include 43 of 190 Cygb residues, 38 of 151 Ngb residues, 55 of 154 Mb residues, and 137 of 688 residues in the Hb tetramer. Each cavity complex includes 6 to 8 residues of the TM helix or pore-lining region, and CRAC/CARC domains exist within all cavities. Erythrocyte Hb αβ-tetramers are largely cytosolic but also bind to a membrane anion exchange protein, "band 3," which contains a large internal cavity and 12 TM helices (5 being pore-lining regions). The Hba TM helix may be the erythrocyte membrane "band 3" attachment site. "Band 3" contributes 4 caveolin-binding motifs and 10 CRAC/CARC domains. Cholesterol binding may create lipid-disordered phases that alter globin cavities and facilitate ligand movement, permitting ion channel formation and conformational changes that orchestrate anion and ligand (O2, CO2, NO) movement within the large internal cavities and channels of the globins.
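As an illustration of the kind of motif analysis the abstract refers to, the following is a minimal sketch of a CRAC/CARC scan. It assumes the commonly cited consensus patterns, CRAC as (L/V)-X(1-5)-Y-X(1-5)-(K/R) and CARC as (K/R)-X(1-5)-(Y/F)-X(1-5)-(L/V); the abstract does not state which definitions or tools the authors used. The function name `find_motifs` and both toy sequences are hypothetical and are not globin sequences.

```python
import re

# Assumed consensus patterns (not stated in the abstract):
#   CRAC: (L/V)-X(1-5)-Y-X(1-5)-(K/R)
#   CARC: (K/R)-X(1-5)-(Y/F)-X(1-5)-(L/V)
# A lookahead is used so that overlapping candidates are all reported.
CRAC = re.compile(r"(?=([LV].{1,5}Y.{1,5}[KR]))")
CARC = re.compile(r"(?=([KR].{1,5}[YF].{1,5}[LV]))")


def find_motifs(sequence, pattern):
    """Return (0-based start, matched segment) for every start position that matches."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]


# Hypothetical toy sequences, for illustration only.
toy_crac = "AAALGGYGGKAAA"
toy_carc = "AAKGGFGGLAA"

crac_hits = find_motifs(toy_crac, CRAC)
carc_hits = find_motifs(toy_carc, CARC)
print(crac_hits)
print(carc_hits)
```

A pattern match of this kind only flags candidate sites; it does not by itself show that cholesterol binds there.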
Affiliation(s)
- Gene A Morrill
- Department of Physiology and Biophysics, Albert Einstein College of Medicine Bronx, NY, USA
- Adele B Kostellow
- Department of Physiology and Biophysics, Albert Einstein College of Medicine Bronx, NY, USA
2
Shatnawi M, Zaki N. Inter-domain linker prediction using amino acid compositional index. Comput Biol Chem 2015; 55:23-30. [DOI: 10.1016/j.compbiolchem.2015.01.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/26/2014] [Revised: 01/22/2015] [Accepted: 01/22/2015] [Indexed: 10/24/2022]
3
Shatnawi M, Zaki N, Yoo PD. Protein inter-domain linker prediction using Random Forest and amino acid physiochemical properties. BMC Bioinformatics 2014; 15 Suppl 16:S8. [PMID: 25521329 PMCID: PMC4290662 DOI: 10.1186/1471-2105-15-s16-s8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022]
Abstract
BACKGROUND Protein chains are generally long and consist of multiple domains. Domains are distinct structural units of a protein that can evolve and function independently. The accurate prediction of protein domain linkers and boundaries is often regarded as the initial step of protein tertiary structure and function predictions. Such information not only enhances protein-targeted drug development but also reduces the experimental cost of protein analysis by allowing researchers to work on a set of smaller and independent units. In this study, we propose a novel and accurate domain-linker prediction approach based on protein primary structure information only. We utilize a nature-inspired machine-learning model called Random Forest along with a novel domain-linker profile that contains physiochemical and domain-linker information of amino acid sequences. RESULTS The proposed approach was tested on two well-known benchmark protein datasets and achieved 68% sensitivity and 99% precision, which is better than any existing protein domain-linker predictor. Without applying any data balancing technique such as class weighting and data re-sampling, the proposed approach is able to accurately classify inter-domain linkers from highly imbalanced datasets. CONCLUSION Our experimental results prove that the proposed approach is useful for domain-linker identification in highly imbalanced single- and multi-domain proteins.
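For readers unfamiliar with the two metrics quoted in the RESULTS, here is a minimal sketch of how sensitivity and precision are computed from per-residue labels (1 = linker residue, 0 = domain residue). The function name and the toy label vectors are hypothetical and are not taken from the study's benchmark data.

```python
def sensitivity_and_precision(true_labels, predicted_labels):
    """Compute sensitivity (recall) and precision for binary per-residue labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # linker predicted as linker
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # linker missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # domain residue called linker
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


# Hypothetical toy example: 3 true linker residues out of 10, 2 of them recovered.
true_labels = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
predicted_labels = [0, 0, 1, 1, 0, 0, 0, 0, 0, 0]

sens, prec = sensitivity_and_precision(true_labels, predicted_labels)
print(sens, prec)
```

In this toy case the predictor makes no false-positive calls but misses one linker residue, which is the general shape of a result with high precision and lower sensitivity on an imbalanced dataset.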
4
Faure G, Callebaut I. Comprehensive repertoire of foldable regions within whole genomes. PLoS Comput Biol 2013; 9:e1003280. [PMID: 24204229 PMCID: PMC3812050 DOI: 10.1371/journal.pcbi.1003280] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 03/26/2013] [Accepted: 08/15/2013] [Indexed: 11/30/2022]
Abstract
In order to get a comprehensive repertoire of foldable domains within whole proteomes, including orphan domains, we developed a novel procedure, called SEG-HCA. From only the information of a single amino acid sequence, SEG-HCA automatically delineates segments possessing high densities in hydrophobic clusters, as defined by Hydrophobic Cluster Analysis (HCA). These hydrophobic clusters mainly correspond to regular secondary structures, which together form structured or foldable regions. Genome-wide analyses revealed that SEG-HCA is the opposite of disorder predictors, each addressing a distinct structural state. Interestingly, there is however an overlap between the two predictions, including small segments of disordered sequences, which undergo coupled folding and binding. SEG-HCA thus gives access to these specific domains, which are generally poorly represented in domain databases. Comparison of the whole set of SEG-HCA predictions with the Conserved Domain Database (CDD) also highlighted a large proportion of predicted large (length >50 amino acids) segments, which are CDD orphans. These orphan sequences may either correspond to highly divergent members of already known families or belong to new families of domains. Their comprehensive description thus opens new avenues to investigate new functional and/or structural features, which have so far remained uncovered. Altogether, the data described here provide new insights into the protein architecture and organization throughout the three kingdoms of life.
Spontaneous or induced folding into a specific 3D structure is a key property of proteins to perform their biological functions. Folded 3D structures of proteins perform specific functions, including interactions with other proteins. Intrinsically disordered regions also mediate interaction, gaining structure only when bound to a target protein. In both cases, hydrophobicity generally plays a major role in the protein segment “foldability”. Here, we developed an original procedure to identify foldable segments from only the information of a single amino acid sequence and to explore protein structures at a proteomic scale. Our approach goes beyond the simple consideration of mean hydrophobicity, by including the secondary structure information through the use of a two-dimensional transposition of the sequence. The developed procedure, combined with disorder predictors, may facilitate the specific identification of small segments that undergo coupled folding and binding. Combined with the analysis of specific domain databases, it also highlights orphan foldable segments, which remain as yet uncharacterized.
Affiliation(s)
- Guilhem Faure
- CNRS, UPMC Univ Paris 6, IMPMC, UMR7590 - IUC, Paris, France
5
Sadowski MI. Prediction of protein domain boundaries from inverse covariances. Proteins 2013; 81:253-60. [PMID: 22987736 PMCID: PMC3563215 DOI: 10.1002/prot.24181] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 05/23/2012] [Revised: 08/10/2012] [Accepted: 09/04/2012] [Indexed: 01/04/2023]
Abstract
It has been known even since relatively few structures had been solved that longer protein chains often contain multiple domains, which may fold separately and play the role of reusable functional modules found in many contexts. In many structural biology tasks, in particular structure prediction, it is of great use to be able to identify domains within the structure and analyze these regions separately. However, when using sequence data alone this task has proven exceptionally difficult, with relatively little improvement over the naive method of choosing boundaries based on size distributions of observed domains. The recent significant improvement in contact prediction provides a new source of information for domain prediction. We test several methods for using this information including a kernel smoothing-based approach and methods based on building alpha-carbon models and compare performance with a length-based predictor, a homology search method and four published sequence-based predictors: DOMCUT, DomPRO, DLP-SVM, and SCOOBY-DOmain. We show that the kernel-smoothing method is significantly better than the other ab initio predictors when both single-domain and multidomain targets are considered and is not significantly different to the homology-based method. Considering only multidomain targets the kernel-smoothing method outperforms all of the published methods except DLP-SVM. The kernel smoothing method therefore represents a potentially useful improvement to ab initio domain prediction.
Affiliation(s)
- Michael I Sadowski
- MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London, United Kingdom.
6
Ying M, Huang X, Zhao H, Wu Y, Wan F, Huang C, Jie K. Comprehensively surveying structure and function of RING domains from Drosophila melanogaster. PLoS One 2011; 6:e23863. [PMID: 21912646 PMCID: PMC3166285 DOI: 10.1371/journal.pone.0023863] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 05/04/2011] [Accepted: 07/26/2011] [Indexed: 12/22/2022]
Abstract
Using a complete set of RING domains from Drosophila melanogaster, all the solved RING domains and cocrystal structures of RING-containing ubiquitin-ligases (RING-E3) and ubiquitin-conjugating enzyme (E2) pairs, we analyzed RING domain structures from their primary to quaternary structures. The results showed that: i) putative orthologs of RING domains between Drosophila melanogaster and the human largely occur (118/139, 84.9%); ii) of the 118 orthologous pairs from Drosophila melanogaster and the human, 117 pairs (117/118, 99.2%) were found to retain entirely uniform domain architectures, and only Iap2/Diap2 experienced evolutionary expansion of domain architecture; iii) 4 evolutionary structurally conserved regions (SCRs) are responsible for homologous folding of RING domains at the superfamily level; iv) besides the conserved Cys/His chelating zinc ions, 6 equivalent residues (4 hydrophobic and 2 polar residues) in the SCRs possess good consensus and conservation; these 4 SCRs function in the structural positioning of the 6 equivalent residues as determinants for RING-E3 catalysis; v) the RING proteins located in the nucleus, in multiple subcellular compartments, in membranes and in the mitochondrion number 42 (42/139, 30.2%), 71 (71/139, 51.1%), 22 (22/139, 15.8%) and 4 (4/139, 2.9%), respectively; vi) CG15104 (Topors) and CG1134 (Mul1) in C3HC4, and CG3929 (Deltex) in C3H2C3 seem to display broader E2 binding profiles than other RING-E3s; vii) analysis of the intermolecular interfaces of E2/RING-E3 complexes indicates that residues directly interacting with E2s are all from the SCRs in RING domains. Of the 6 residues, 2 hydrophobic ones contribute to constructing the conserved hydrophobic core, while the other 2 hydrophobic and 2 polar residues directly participate in E2/RING-E3 interactions. Based on sequence and structural data, SCRs, conserved equivalent residues and features of intermolecular interfaces were extracted, highlighting the presence of a nucleus for the RING domain fold and the formation of a catalytic core in which related residues and regions exhibit preferential evolutionary conservation.
Affiliation(s)
- Muying Ying
- Department of Molecular Biology and Biochemistry, Basic Medical College of Nanchang University, Nanchang, People's Republic of China.
7
Gonnet M, Erauso G, Prieur D, Le Romancer M. pAMT11, a novel plasmid isolated from a Thermococcus sp. strain closely related to the virus-like integrated element TKV1 of the Thermococcus kodakaraensis genome. Res Microbiol 2010; 162:132-43. [PMID: 21144896 DOI: 10.1016/j.resmic.2010.11.003] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 02/19/2010] [Accepted: 10/05/2010] [Indexed: 10/18/2022]
Abstract
A novel extrachromosomal element that we called pAMT11 was discovered in a deep-sea vent isolate belonging to the hyperthermophilic euryarchaeal order Thermococcales. It consists of a double-stranded DNA of 20,534 bp which encodes 30 putative open reading frames (ORFs), of which six could be assigned a putative function on the basis of sequence similarity to known genes or to protein domain families. Most of the ORFs of pAMT11 showed homology and synteny with a genomic island of Thermococcus kodakaraensis KOD1. This region, named TKV1, was previously described as a "virus-like integrated element" and assumed to integrate into the host chromosome by a site-specific recombination mechanism similar to that of Sulfolobus solfataricus virus 1. While most of the genes shared by pAMT11 and TKV1 encode putative membrane proteins presumably involved in virus particle formation, attempts to induce production of virus particles by mitomycin treatment of AMT11 cultures failed, suggesting that pAMT11 may represent the genome of a defective virus or a plasmid. Genomes of mobile elements usually contain two regions: a core of conserved genes mainly involved in replication, maintenance or spreading of the genetic element, and a variable set of accessory genes. Surprisingly, genes presumably involved in the replication process are quite divergent between TKV1 and pAMT11. Indeed, TKV1 possesses an MCM-like protein that may function as a replication initiator, while pAMT11 encodes a putative non-conventional protein distantly related to the Rep protein previously described in a small plasmid of Pyrococcus sp. strain JT1, assumed to replicate by a rolling-circle (RC) mechanism. However, in the case of pAMT11, this mode of plasmid replication could not be experimentally proven and is questionable given the lack of significant similarities with any other members of the RC-Rep superfamily and its unusually large size compared to other RC plasmids.
Affiliation(s)
- Mathieu Gonnet
- Unité d'Epidémiologie Animale, UR356, INRA centre de Clermont-Ferrand Theix, Route de Theix, 63122 Saint Genès Champanelle, France.
8
Soundararajan V, Raman R, Raguram S, Sasisekharan V, Sasisekharan R. Atomic interaction networks in the core of protein domains and their native folds. PLoS One 2010; 5:e9391. [PMID: 20186337 PMCID: PMC2826414 DOI: 10.1371/journal.pone.0009391] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 12/21/2009] [Accepted: 02/03/2010] [Indexed: 11/19/2022]
Abstract
Vastly divergent sequences populate a majority of protein folds. In the quest to identify features that are conserved within protein domains belonging to the same fold, we set out to examine the entire protein universe on a fold-by-fold basis. We report that the atomic interaction network in the solvent-unexposed core of protein domains is fold-conserved, extraordinary sequence divergence notwithstanding. Further, we find that this feature, termed protein core atomic interaction network (or PCAIN), is significantly distinguishable across different folds, thus appearing to be a “signature” of a domain's native fold. As part of this study, we computed the PCAINs for 8698 representative protein domains from families across the 1018 known protein folds to construct our seed database, and an automated framework was developed for PCAIN-based characterization of the protein fold universe. A test set of randomly selected domains that are not in the seed database was classified with over 97% accuracy, independent of sequence divergence. As an application of this novel fold signature, a PCAIN-based scoring scheme was developed for comparative (homology-based) structure prediction, with 1–2 angstroms (mean 1.61 Å) Cα RMSD generally observed between computed structures and reference crystal structures. Our results are consistent across the full spectrum of test domains including those from recent CASP experiments and most notably in the ‘twilight’ and ‘midnight’ zones wherein <30% and <10% target-template sequence identity prevails (mean twilight RMSD of 1.69 Å). We further demonstrate the utility of the PCAIN protocol to derive biological insight into protein structure-function relationships, by modeling the structure of the YopM effector novel E3 ligase (NEL) domain from the plague-causative bacterium Yersinia pestis and discussing its implications for host adaptive and innate immune modulation by the pathogen. Considering the several high-throughput, sequence-identity-independent applications demonstrated in this work, we suggest that the PCAIN is a fundamental fold feature that could be a valuable addition to the arsenal of protein modeling and analysis tools.
Affiliation(s)
- Venkataramanan Soundararajan
- Harvard-MIT Division of Health Sciences & Technology, Koch Institute for Integrative Cancer Research and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Rahul Raman
- Harvard-MIT Division of Health Sciences & Technology, Koch Institute for Integrative Cancer Research and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- S. Raguram
- Harvard-MIT Division of Health Sciences & Technology, Koch Institute for Integrative Cancer Research and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- V. Sasisekharan
- Harvard-MIT Division of Health Sciences & Technology, Koch Institute for Integrative Cancer Research and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Ram Sasisekharan
- Harvard-MIT Division of Health Sciences & Technology, Koch Institute for Integrative Cancer Research and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
9
Yin Y, Bangs F, Paton IR, Prescott A, James J, Davey MG, Whitley P, Genikhovich G, Technau U, Burt DW, Tickle C. The Talpid3 gene (KIAA0586) encodes a centrosomal protein that is essential for primary cilia formation. Development 2009; 136:655-64. [PMID: 19144723 DOI: 10.1242/dev.028464] [Citation(s) in RCA: 105] [Impact Index Per Article: 7.0] [Indexed: 11/20/2022]
Abstract
The chicken talpid(3) mutant, with polydactyly and defects in other embryonic regions that depend on hedgehog (Hh) signalling (e.g. the neural tube), has a mutation in KIAA0586. Similar phenotypes are seen in mice and in human syndromes with mutations in genes that encode centrosomal or intraflagellar transport proteins. Such mutations lead to defects in primary cilia, sites where Hh signalling occurs. Here, we show that cells of talpid(3) mutant embryos lack primary cilia and that primary cilia can be rescued with constructs encoding Talpid3. talpid(3) mutant embryos also develop polycystic kidneys, consistent with widespread failure of ciliogenesis. Ultrastructural studies of talpid(3) mutant neural tube show that basal bodies mature but fail to dock with the apical cell membrane, are misorientated and almost completely lack ciliary axonemes. We also detected marked changes in actin organisation in talpid(3) mutant cells, which may explain misorientation of basal bodies. KIAA0586 was identified in the human centrosomal proteome and, using an antibody against chicken Talpid3, we detected Talpid3 in the centrosome of wild-type chicken cells but not in mutant cells. Cloning and bioinformatic analysis of the Talpid3 homolog from the sea anemone Nematostella vectensis identified a highly conserved region in the Talpid3 protein, including a predicted coiled-coil domain. We show that this region is required to rescue primary cilia formation and neural tube patterning in talpid(3) mutant embryos, and is sufficient for centrosomal localisation. Thus, Talpid3 is one of a growing number of centrosomal proteins that affect both ciliogenesis and Hh signalling.
Affiliation(s)
- Yili Yin
- Division of Cell and Developmental Biology, Wellcome Trust Biocentre, The University of Dundee, Dundee DD1 5EH, UK
10
Bondugula R, Lee MS, Wallqvist A. FIEFDom: a transparent domain boundary recognition system using a fuzzy mean operator. Nucleic Acids Res 2008; 37:452-62. [PMID: 19056827 PMCID: PMC2632928 DOI: 10.1093/nar/gkn944] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Indexed: 11/12/2022]
Abstract
Protein domain prediction is often the preliminary step in both experimental and computational protein research. Here we present a new method to predict the domain boundaries of a multidomain protein from its amino acid sequence using a fuzzy mean operator. Using the nr-sequence database together with a reference protein set (RPS) containing known domain boundaries, the operator is used to assign to each residue of the query sequence a likelihood value of belonging to a domain boundary. This procedure robustly identifies contiguous boundary regions. For a dataset with a maximum sequence identity of 30%, the average domain prediction accuracy of our method is 97% for one-domain proteins and 58% for multidomain proteins. The presented model is capable of using new sequence/structure information without re-parameterization after each RPS update. When tested on a current database using a four-year-old RPS and on a database that contains different domain definitions than those used to train the models, our method consistently yielded the same accuracy while two other published methods did not. A comparison with other domain prediction methods used in the CASP7 competition indicates that our method performs better than existing sequence-based methods.
Affiliation(s)
- Rajkumar Bondugula
- Biotechnology HPC Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Fort Detrick, MD 21702, USA.