1
De Coninck T, Gippert GP, Henrissat B, Desmet T, Van Damme EJM. Investigating diversity and similarity between CBM13 modules and ricin-B lectin domains using sequence similarity networks. BMC Genomics 2024; 25:643. [PMID: 38937673 PMCID: PMC11212257 DOI: 10.1186/s12864-024-10554-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Accepted: 06/24/2024] [Indexed: 06/29/2024]
Abstract
BACKGROUND The CBM13 family comprises carbohydrate-binding modules that occur mainly in enzymes and in several ricin-B lectins. The ricin-B lectin domain resembles the CBM13 module to a large extent. Historically, ricin-B lectins and CBM13 proteins were considered completely distinct, despite their structural and functional similarities. RESULTS In this data mining study, we investigate structural and functional similarities of these intertwined protein groups. Because of the high structural and functional similarities, and differences in nomenclature usage in several databases, confusion can arise. First, we demonstrate how public protein databases use different nomenclature systems to describe CBM13 modules and putative ricin-B lectin domains. We suggest the introduction of a novel CBM13 domain identifier, as well as the extension of CAZy cross-references in UniProt to guard the distinction between CAZy and non-CAZy entries in public databases. Since similar problems may occur with other lectin families and CBM families, we suggest the introduction of novel CBM InterPro domain identifiers to all existing CBM families. Second, we investigated phylogenetic, nomenclatural and structural similarities between putative ricin-B lectin domains and CBM13 modules, making use of sequence similarity networks. We concluded that the ricin-B/CBM13 superfamily may be larger than initially thought and that several putative ricin-B lectin domains may display CAZyme functionalities, although biochemical proof remains to be delivered. CONCLUSIONS Ricin-B lectin domains and CBM13 modules are associated groups of proteins whose database semantics are currently biased towards ricin-B lectins. Revision of the CAZy cross-reference in UniProt and introduction of a dedicated CBM13 domain identifier in InterPro may resolve this issue. In addition, our analyses show that several proteins with putative ricin-B lectin domains show very strong structural similarity to CBM13 modules. Therefore ricin-B lectin domains and CBM13 modules could be considered distant members of a larger ricin-B/CBM13 superfamily.
Affiliation(s)
- Tibo De Coninck
- Laboratory of Biochemistry and Glycobiology, Department of Biotechnology, Ghent University, Proeftuinstraat 86, Ghent, 9000, Belgium
- Centre for Synthetic Biology, Department of Biotechnology, Ghent University, Coupure Links 653, Ghent, 9000, Belgium
- Garry P Gippert
- Section for Protein Chemistry and Enzyme Technology, Department of Biotechnology & Biomedicine, Technical University of Denmark, Søltofts Plads 224, Kgs. Lyngby, 2800, Denmark
- Bernard Henrissat
- Section for Protein Chemistry and Enzyme Technology, Department of Biotechnology & Biomedicine, Technical University of Denmark, Søltofts Plads 224, Kgs. Lyngby, 2800, Denmark
- Tom Desmet
- Centre for Synthetic Biology, Department of Biotechnology, Ghent University, Coupure Links 653, Ghent, 9000, Belgium
- Els J M Van Damme
- Laboratory of Biochemistry and Glycobiology, Department of Biotechnology, Ghent University, Proeftuinstraat 86, Ghent, 9000, Belgium.
2
Shalileh F, Gheibzadeh MS, Lloyd JR, Fietz S, Shahbani Zahiri H, Zolfaghari Emameh R. Evolutionary analysis and quality assessment of ζ-carbonic anhydrase sequences from environmental microbiome. J Basic Microbiol 2023; 63:1412-1425. [PMID: 37670218 DOI: 10.1002/jobm.202300323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Revised: 08/06/2023] [Accepted: 08/22/2023] [Indexed: 09/07/2023]
Abstract
Carbonic anhydrase (CA) is one of the most vital enzymes in living cells. Given the significance of this metalloenzyme for life and the novelty of some CA families such as ζ-CA, this study evaluated the evolutionary processes of ζ-CA and quality-checked its sequences. Bioinformatics methods revealed the presence of ζ-CA in some eukaryotic and prokaryotic microorganisms. Notably, it has not been previously reported in prokaryotes. The coexistence of β- and ζ-CAs in some microorganisms is also a novel finding. Our analysis also identified several CA proteins with 6-14 amino acid intervals between histidine and cysteine in the second highly conserved motif, which can be classified as novel ζ-CA subfamily members that emerged under the Zn deficiency of aquatic ecosystems and the selection pressure in these environments. It is also possible that these results stem from contamination of environmental microbiome genome samples with genomes of diatom species, and errors were observed in the DNA sequencing outcomes. Combining all results, from evolutionary analysis to quality control of ζ-CA DNA sequences, provides an incentive to explore the hidden aspects of ζ-CAs further.
Affiliation(s)
- Farzaneh Shalileh
- Department of Energy and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
- Mohammad S Gheibzadeh
- Department of Energy and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
- James R Lloyd
- Department of Genetics, Institute for Plant Biotechnology, University of Stellenbosch, Stellenbosch, South Africa
- Susanne Fietz
- Department of Earth Sciences, Stellenbosch University, Stellenbosch, South Africa
- Hossein Shahbani Zahiri
- Department of Energy and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
- Reza Zolfaghari Emameh
- Department of Energy and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
3
Verma A, Jakhar R, Kumar D, Kumar V, Dhillon T, Dangi M, Chhillar AK. A computational approach to discover antioxidant and anti-inflammatory attributes of silymarin derived from Silybum marianum by comparison with hydroxytyrosol. J Biomol Struct Dyn 2023; 41:11101-11121. [PMID: 36546728 DOI: 10.1080/07391102.2022.2159879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2022] [Accepted: 12/12/2022] [Indexed: 12/24/2022]
Abstract
Medicinal plants possess therapeutic potential for reducing reactive oxygen species (ROS)-mediated cellular damage. Hydroxytyrosol, one of the most potent antioxidants, served as the control in the current study, together with other synthetic antioxidants, to computationally identify the antioxidant properties of silymarin. The sequences of the receptors IκB kinase (IKK), Kelch-like ECH-associated protein 1 (Keap-1) and mitochondrial transcription factor A (Tfam) were retrieved from UniProtKB, and homology modeling was performed using the Swiss-Model server. Molecular docking and dynamics simulation studies were then performed using Schrödinger's software version 11.5. Comparison of the binding energies of silymarin, hydroxytyrosol, α-tocopherol, ascorbic acid, butylated hydroxyanisole (BHA) and butylated hydroxytoluene (BHT) showed that silymarin exhibited the best affinity for the IKK receptor, followed by hydroxytyrosol, suggesting that it is the best of, or comparable to, the other known antioxidants that could potentially suppress inflammation and other diseases. Silymarin also exhibited the poorest binding affinity with Tfam, promoting mitochondrial biogenesis and thereby scavenging ROS. With Keap-1, however, silymarin ranked 4th in the list, whereas hydroxytyrosol exhibited the highest binding affinity to release oxidative stress. The stability of the docked complexes led us to conclude that silymarin has antioxidant properties comparable to those of hydroxytyrosol, together with better anti-inflammatory potential and mitochondrial biogenesis-enhancing properties that ultimately reduce oxidative stress. It can now be tested further in in vitro or in vivo studies as a potential drug against oxidative insult. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Annu Verma
- Department of Biochemistry, Maharshi Dayanand University, Rohtak, India
- Ritu Jakhar
- Centre for Bioinformatics, Maharshi Dayanand University, Rohtak, India
- Dev Kumar
- Centre for Bioinformatics, Maharshi Dayanand University, Rohtak, India
- Vijay Kumar
- Department of Biochemistry, Maharshi Dayanand University, Rohtak, India
- Twinkle Dhillon
- Department of Biochemistry, Maharshi Dayanand University, Rohtak, India
- Mehak Dangi
- Centre for Bioinformatics, Maharshi Dayanand University, Rohtak, India
4
Bisquert R, Planells-Cárcel A, Alonso-Del-Real J, Muñiz-Calvo S, Guillamón JM. The Role of the PAA1 Gene on Melatonin Biosynthesis in Saccharomyces cerevisiae: A Search of New Arylalkylamine N-Acetyltransferases. Microorganisms 2023; 11:1115. [PMID: 37317089 DOI: 10.3390/microorganisms11051115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 04/20/2023] [Accepted: 04/23/2023] [Indexed: 06/16/2023]
Abstract
Recently, the presence of melatonin in fermented beverages has been correlated with yeast metabolism during alcoholic fermentation. Melatonin, originally considered a unique product of the pineal gland of vertebrates, has also been identified in a wide range of invertebrates, plants, bacteria, and fungi in the last two decades. These findings bring the challenge of studying the function of melatonin in yeasts and the mechanisms underlying its synthesis. However, improving the selection and production of this interesting molecule in fermented beverages requires identifying the genes involved in the metabolic pathway. So far, only one gene has been proposed to be involved in melatonin production in Saccharomyces cerevisiae: PAA1, which encodes a polyamine acetyltransferase and is a homolog of the vertebrate aralkylamine N-acetyltransferase (AANAT). In this study, we assessed the in vivo function of PAA1 by evaluating the bioconversion of the different possible substrates, such as 5-methoxytryptamine, tryptamine, and serotonin, using different protein expression platforms. Moreover, we expanded the search for new N-acetyltransferase candidates by combining a global transcriptome analysis with bioinformatic tools to predict domains similar to AANAT in S. cerevisiae. The AANAT activity of the candidate genes was validated by their overexpression in E. coli because, curiously, this system revealed greater differences than overexpression in their native host S. cerevisiae. Our results confirm that PAA1 possesses the ability to acetylate different aralkylamines, but AANAT activity does not seem to be the main acetylation activity. Moreover, we also prove that Paa1p is not the only enzyme with this AANAT activity. Our search for new genes detected HPA2 as a new arylalkylamine N-acetyltransferase in S. cerevisiae. This is the first report that clearly proves the involvement of this enzyme in AANAT activity.
Affiliation(s)
- Ricardo Bisquert
- Instituto de Agroquímica y Tecnología de Alimentos IATA, CSIC, 46980 Paterna, Spain
- Javier Alonso-Del-Real
- Instituto de Agroquímica y Tecnología de Alimentos IATA, CSIC, 46980 Paterna, Spain
- Instituto de Biomedicina de Valencia IBV, CSIC, 46010 Valencia, Spain
- Sara Muñiz-Calvo
- Instituto de Agroquímica y Tecnología de Alimentos IATA, CSIC, 46980 Paterna, Spain
- Department of Life Sciences, Chalmers University of Technology, SE41296 Gothenburg, Sweden
5
Krishnan G S, Joshi A, Akhtar N, Kaushik V. Immunoinformatics designed T cell multi epitope dengue peptide vaccine derived from non structural proteome. Microb Pathog 2021; 150:104728. [PMID: 33400987 DOI: 10.1016/j.micpath.2020.104728] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 08/23/2020] [Revised: 12/20/2020] [Accepted: 12/27/2020] [Indexed: 12/20/2022]
Abstract
Dengue viral disease is an Aedes aegypti mosquito-borne human disease that causes a severe global public health concern. In this study, immunoinformatics methods were deployed for crafting CTL T-cell epitopes as dengue vaccine candidates. The NS1 protein sequence of a dengue serotype 1 strain was retrieved from the protein database, and T-cell epitopes (n = 85) were predicted by an artificial neural network. The conserved epitopes (n = 10) were predicted and selected for intensive computational analysis. A machine learning technique and quantitative matrix-based toxicity analysis assured nontoxic peptide selection. A Hidden Markov Model-derived Structural Alphabet (SA)-based algorithm predicted the 3D molecular structure, and the all-atom structure of the peptide ligand was validated by Ramachandran plot. Three-tier molecular docking approaches were used to predict the peptide-HLA docking complex. A molecular dynamics (MD) simulation study confirmed that the docking complex was stable over a time frame of 100 ns. Population coverage analysis predicted the epitope interaction with a particular population of HLA. These results indicated that the computationally designed HTLWSNGVL and FTTNIWLKL epitope peptides could be used as putative agents for the multi CTL T cell epitope vaccine. The vaccine protein sequence expression and translation were analyzed in a prokaryotic vector adapted by codon usage. Such in silico formulated CTL T-cell-based prophylactic vaccines could encourage the commercial development of dengue vaccines.
Affiliation(s)
- Sunil Krishnan G
- Domain of Bioinformatics, School of Bio-Engineering and Bio-Sciences, Lovely Professional University, Punjab, India.
- Amit Joshi
- Domain of Bioinformatics, School of Bio-Engineering and Bio-Sciences, Lovely Professional University, Punjab, India.
- Nahid Akhtar
- Domain of Bioinformatics, School of Bio-Engineering and Bio-Sciences, Lovely Professional University, Punjab, India.
- Vikas Kaushik
- Domain of Bioinformatics, School of Bio-Engineering and Bio-Sciences, Lovely Professional University, Punjab, India.
6
Insights into Bioinformatic Applications for Glycosylation: Instigating an Awakening towards Applying Glycoinformatic Resources for Cancer Diagnosis and Therapy. Int J Mol Sci 2020; 21:ijms21249336. [PMID: 33302373 PMCID: PMC7762546 DOI: 10.3390/ijms21249336] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/29/2020] [Revised: 11/26/2020] [Accepted: 12/01/2020] [Indexed: 01/10/2023]
Abstract
Glycosylation plays a crucial role in various diseases and their etiology. This has led to a clear understanding of the functions of carbohydrates in cell communication, which will eventually result in novel therapeutic approaches for the treatment of various diseases. Glycomics has now become one of the top ten technologies that will change the future. The direct implication of glycosylation as a hallmark of cancer and for cancer therapy is well established. As in proteomics, where bioinformatics tools have led to revolutionary achievements, bioinformatics resources for glycosylation have improved its practical implication. Bioinformatics tools, algorithms and databases are a mandatory requirement to manage and successfully analyze the large amounts of glycobiological data generated from glycosylation studies. This review consolidates all the available tools and their applications in glycosylation research. The achievements made through the use of bioinformatics in glycosylation studies are also presented. The importance of glycosylation in cancer diagnosis and therapy is discussed, and the gap in the application of widely available glyco-informatic tools for cancer research is highlighted. This review is expected to bring an awakening amongst glyco-informaticians as well as cancer biologists to bridge this gap and to exploit the available glyco-informatic tools for cancer.
7
G SK, Joshi A, Kaushik V. T cell epitope designing for dengue peptide vaccine using docking and molecular simulation studies. MOLECULAR SIMULATION 2020. [DOI: 10.1080/08927022.2020.1772970] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 01/28/2023]
Affiliation(s)
- Sunil Krishnan G
- Domain of Bioinformatics, School of Bio-Engineering and Bio-Sciences, Lovely Professional University, Punjab, India
- Amit Joshi
- Domain of Bioinformatics, School of Bio-Engineering and Bio-Sciences, Lovely Professional University, Punjab, India
- Vikas Kaushik
- Domain of Bioinformatics, School of Bio-Engineering and Bio-Sciences, Lovely Professional University, Punjab, India
8
Ditta A, Zhou Z, Cai X, Wang X, Okubazghi KW, Shehzad M, Xu Y, Hou Y, Sajid Iqbal M, Khan MKR, Wang K, Liu F. Assessment of Genetic Diversity, Population Structure, and Evolutionary Relationship of Uncharacterized Genes in a Novel Germplasm Collection of Diploid and Allotetraploid Gossypium Accessions Using EST and Genomic SSR Markers. Int J Mol Sci 2018; 19:E2401. [PMID: 30110970 PMCID: PMC6121227 DOI: 10.3390/ijms19082401] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 06/23/2018] [Revised: 08/08/2018] [Accepted: 08/13/2018] [Indexed: 11/17/2022]
Abstract
This study evaluated the genetic diversity and population structures in a novel cotton germplasm collection comprising 132 diploids, including Gossypium klotzschianum and allotetraploid cotton accessions, including Gossypium barbadense, Gossypium darwinii, Gossypium tomentosum, Gossypium ekmanianum, and Gossypium stephensii, from Santa Cruz, Isabella, San Cristobal, Hawaiian, Dominican Republic, and Wake Atoll islands. A total of 111 expressed sequence tag (EST) and genomic simple sequence repeat (gSSR) markers produced 382 polymorphic loci with an average of 3.44 polymorphic alleles per SSR marker. Polymorphism information content values ranged from 0.08 to 0.82 with an average of 0.56. Analysis of a genetic distance matrix revealed values of 0.003 to 0.53 with an average of 0.33 in the wild cotton collection. Phylogenetic analysis supported the subgroups identified by STRUCTURE and corresponded well with the results of principal coordinate analysis, with a cumulative variation of 45.65%. A total of 123 unique alleles were observed among all accessions, and 31 were identified only in G. ekmanianum. Analysis of molecular variance revealed highly significant variation between the six groups identified by structure analysis, accounting for 49% of the total variation, while 51% of the variation was due to diversity within the groups. The highest genetic differentiation among tetraploid populations was observed between accessions from the Hawaiian and Santa Cruz regions, with a pairwise FST of 0.752 (p < 0.001). DUF819 containing an uncharacterized gene named yjcL linked to genomic markers has been found to be highly related to tryptophan-aspartic acid (W-D) repeats in a superfamily of genes. The RNA sequence expression data showed that the yjcL-linked gene Gh_A09G2500 was upregulated under drought and salt stress conditions. The existence of genetic diversity, characterization of genes and variation in this novel germplasm collection will be a landmark addition to the genetic study of cotton germplasm.
Affiliation(s)
- Allah Ditta
- State Key Laboratory of Cotton Biology/Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, Henan, China.
- Nuclear Institute for Agriculture and Biology (NIAB), Jhang Road, Faisalabad 38000, Punjab, Pakistan.
- Zhongli Zhou
- State Key Laboratory of Cotton Biology/Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, Henan, China.
- Xiaoyan Cai
- State Key Laboratory of Cotton Biology/Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, Henan, China.
- Xingxing Wang
- State Key Laboratory of Cotton Biology/Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, Henan, China.
- Kiflom Weldu Okubazghi
- State Key Laboratory of Cotton Biology/Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, Henan, China.
- Hamelmalo Agricultural College, P.O. Box 397, Keren, Eritrea.
- Muhammad Shehzad
- State Key Laboratory of Cotton Biology/Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, Henan, China.
- Yanchao Xu
- State Key Laboratory of Cotton Biology/Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, Henan, China.
- Yuqing Hou
- State Key Laboratory of Cotton Biology/Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, Henan, China.
- Muhammad Sajid Iqbal
- State Key Laboratory of Cotton Biology/Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, Henan, China.
- Muhammad Kashif Riaz Khan
- Nuclear Institute for Agriculture and Biology (NIAB), Jhang Road, Faisalabad 38000, Punjab, Pakistan.
- Kunbo Wang
- State Key Laboratory of Cotton Biology/Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, Henan, China.
- Fang Liu
- State Key Laboratory of Cotton Biology/Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, Henan, China.
9
A Comprehensive Computational Analysis of Mycobacterium Genomes Pinpoints the Genes Co-occurring with YczE, a Membrane Protein Coding Gene Under the Putative Control of a MocR, and Predicts its Function. Interdiscip Sci 2017; 10:111-125. [PMID: 29098594 DOI: 10.1007/s12539-017-0266-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/25/2017] [Revised: 09/08/2017] [Accepted: 10/11/2017] [Indexed: 10/18/2022]
Abstract
Bacterial proteins belonging to the YczE family are predicted to be membrane proteins of yet unknown function. In many bacterial species, the yczE gene coding for the YczE protein is divergently transcribed with respect to an adjacent transcriptional regulator of the MocR family. According to in silico predictions, proteins named YczR are supposed to regulate the expression of yczE genes. These regulators linked to the yczE genes are predicted to constitute a subfamily within the MocR family. To put forward hypotheses amenable to experimental testing about the possible function of the YczE proteins, a phylogenetic profile strategy was applied. This strategy consists in searching for those genes that, within a set of genomes, co-occur exclusively with a certain gene of interest. Co-occurrence can be suggestive of a functional link. A set of 30 mycobacterial complete proteomes were collected. Of these, only 16 contained YczE proteins. Interestingly, in all cases each yczE gene was divergently transcribed with respect to a yczR gene. Two orthology clustering procedures were applied to find proteins co-occurring exclusively with the YczE proteins. The reported results suggest that YczE may be involved in the membrane translocation and metabolism of sulfur-containing compounds mostly in rapidly growing, low pathogenicity mycobacterial species. These observations may hint at potential targets for therapies to treat the emerging opportunistic infections provoked by the widespread environmental mycobacterial species and may contribute to the delineation of the genomic and physiological differences between the pathogenic and non-pathogenic mycobacterial species.
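The phylogenetic profile strategy described in this abstract can be illustrated with a minimal Python sketch. It assumes the strictest reading of "co-occur exclusively": a gene family qualifies only if it is present in exactly the same genomes as the gene of interest. The genome labels, the family names `famA` to `famC`, and the function `exclusive_cooccurrence` are hypothetical; the abstract does not specify the orthology clustering procedures the authors used, so this is not their implementation.

```python
from typing import Dict, Set


def exclusive_cooccurrence(
    profiles: Dict[str, Set[str]], gene_of_interest: str
) -> Set[str]:
    """Return families whose presence/absence profile matches the gene of interest.

    `profiles` maps each gene family to the set of genomes that contain it.
    A family is kept only if it occurs in exactly the same genomes as the
    gene of interest (an assumed, strict definition of exclusive co-occurrence).
    """
    target = profiles[gene_of_interest]
    return {
        family
        for family, genomes in profiles.items()
        if family != gene_of_interest and genomes == target
    }


# Hypothetical toy data: four genomes (g1..g4) and four gene families.
profiles = {
    "yczE": {"g1", "g2", "g4"},
    "famA": {"g1", "g2", "g4"},        # same genomes as yczE -> candidate
    "famB": {"g1", "g2", "g3", "g4"},  # also present without yczE -> rejected
    "famC": {"g1"},                    # missing from some yczE genomes -> rejected
}

print(sorted(exclusive_cooccurrence(profiles, "yczE")))
```

In practice the profiles would come from an orthology clustering step over the complete proteomes, and a tolerance for a small number of mismatching genomes may be preferable to exact set equality.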
10
Mandal SK, Chandravanshi M, Gogoi P, Kanaujia SP. In silico characterization of TTHA0596: A potential Zn2+ binding protein of ATP-binding cassette transporter. GENE REPORTS 2017. [DOI: 10.1016/j.genrep.2017.01.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/01/2023]
11
Ou HD, Deerinck TJ, Bushong E, Ellisman MH, O'Shea CC. Visualizing viral protein structures in cells using genetic probes for correlated light and electron microscopy. Methods 2015; 90:39-48. [PMID: 26066760 PMCID: PMC4655137 DOI: 10.1016/j.ymeth.2015.06.002] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 03/17/2015] [Revised: 06/01/2015] [Accepted: 06/02/2015] [Indexed: 01/08/2023]
Abstract
Structural studies of viral proteins most often use high-resolution techniques such as X-ray crystallography, nuclear magnetic resonance, single particle negative stain, or cryo-electron microscopy (EM) to reveal atomic interactions of soluble, homogeneous viral proteins or viral protein complexes. Once viral proteins or complexes are separated from their host's cellular environment, their natural in situ structure and details of how they interact with other cellular components may be lost. EM has been an invaluable tool in virology since its introduction in the late 1940's and subsequent application to cells in the 1950's. EM studies have expanded our knowledge of viral entry, viral replication, alteration of cellular components, and viral lysis. Most of these early studies were focused on conspicuous morphological cellular changes, because classic EM metal stains were designed to highlight classes of cellular structures rather than specific molecular structures. Much later, to identify viral proteins inducing specific structural configurations at the cellular level, immunostaining with a primary antibody followed by colloidal gold secondary antibody was employed to mark the location of specific viral proteins. This technique can suffer from artifacts in cellular ultrastructure due to compromises required to provide access to the immuno-reagents. Immunolocalization methods also require the generation of highly specific antibodies, which may not be available for every viral protein. Here we discuss new methods to visualize viral proteins and structures at high resolutions in situ using correlated light and electron microscopy (CLEM). We discuss the use of genetically encoded protein fusions that oxidize diaminobenzidine (DAB) into an osmiophilic polymer that can be visualized by EM. Detailed protocols for applying the genetically encoded photo-oxidizing protein MiniSOG to a viral protein, photo-oxidation of the fusion protein to yield DAB polymer staining, and preparation of photo-oxidized samples for TEM and serial block-face scanning EM (SBEM) for large-scale volume EM data acquisition are also presented. As an example, we discuss the recent multi-scale analysis of Adenoviral protein E4-ORF3 that reveals a new type of multi-functional polymer that disrupts multiple cellular proteins. This new capability to visualize unambiguously specific viral protein structures at high resolutions in the native cellular environment is revealing new insights into how they usurp host proteins and functions to drive pathological viral replication.
Affiliation(s)
- Horng D Ou
- Molecular and Cell Biology Laboratory, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA; Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA
- Thomas J Deerinck
- National Center for Microscopy and Imaging Research, Center for Research in Biological Systems, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Eric Bushong
- National Center for Microscopy and Imaging Research, Center for Research in Biological Systems, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Mark H Ellisman
- Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA; National Center for Microscopy and Imaging Research, Center for Research in Biological Systems, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA; Department of Neurosciences, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Clodagh C O'Shea
- Molecular and Cell Biology Laboratory, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA; Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA.
12
Orizio F, Damiati E, Giacopuzzi E, Benaglia G, Pianta S, Schauer R, Schwartz-Albiez R, Borsani G, Bresciani R, Monti E. Human sialic acid acetyl esterase: Towards a better understanding of a puzzling enzyme. Glycobiology 2015; 25:992-1006. [DOI: 10.1093/glycob/cwv034] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 10/10/2014] [Accepted: 05/17/2015] [Indexed: 01/09/2023]
13
Komaki H, Ichikawa N, Hosoyama A, Takahashi-Nakaguchi A, Matsuzawa T, Suzuki KI, Fujita N, Gonoi T. Genome based analysis of type-I polyketide synthase and nonribosomal peptide synthetase gene clusters in seven strains of five representative Nocardia species. BMC Genomics 2014; 15:323. [PMID: 24884595 PMCID: PMC4035055 DOI: 10.1186/1471-2164-15-323] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 10/11/2013] [Accepted: 04/15/2014] [Indexed: 11/10/2022]
Abstract
BACKGROUND Actinobacteria of the genus Nocardia usually live in soil or water and play saprophytic roles, but they also opportunistically infect the respiratory system, skin, and other organs of humans and animals. Primarily because of the clinical importance of the strains, some Nocardia genomes have been sequenced, and genome sequences have accumulated. Genome sizes of Nocardia strains are similar to those of Streptomyces strains, the producers of most antibiotics. In the present work, we compared secondary metabolite biosynthesis gene clusters of type-I polyketide synthase (PKS-I) and nonribosomal peptide synthetase (NRPS) among genomes of representative Nocardia species/strains based on domain organization and amino acid sequence homology. RESULTS Draft genome sequences of Nocardia asteroides NBRC 15531(T), Nocardia otitidiscaviarum IFM 11049, Nocardia brasiliensis NBRC 14402(T), and N. brasiliensis IFM 10847 were read and compared with published complete genome sequences of Nocardia farcinica IFM 10152, Nocardia cyriacigeorgica GUH-2, and N. brasiliensis HUJEG-1. Genome sizes are as follows: N. farcinica, 6.0 Mb; N. cyriacigeorgica, 6.2 Mb; N. asteroides, 7.0 Mb; N. otitidiscaviarum, 7.8 Mb; and N. brasiliensis, 8.9 - 9.4 Mb. Predicted numbers of PKS-I, NRPS, and PKS-I/NRPS hybrid clusters ranged between 4-11, 7-13, and 1-6, respectively, depending on strains, and tended to increase with increasing genome size. Domain and module structures of representative or unique clusters are discussed in the text. CONCLUSION We conclude the following: 1) genomes of Nocardia strains carry as many PKS-I and NRPS gene clusters as those of Streptomyces strains, 2) the number of PKS-I and NRPS gene clusters in Nocardia strains varies substantially depending on species, and N. brasiliensis strains carry the largest numbers of clusters among the species studied, 3) the seven Nocardia strains studied in the present work have seven common PKS-I and/or NRPS clusters, some of whose products are yet to be studied, and 4) different N. brasiliensis strains have some different gene clusters of PKS-I/NRPS, although the rest of the clusters are common within the N. brasiliensis strains. Genome sequencing suggested that Nocardia strains are highly promising resources in the search of novel secondary metabolites.
Affiliation(s)
- Tohru Gonoi
- Medical Mycology Research Center (MMRC), Chiba University, Chuo-ku, Chiba 260-8673, Japan.
14
Algenäs C, Agaton C, Fagerberg L, Asplund A, Björling L, Björling E, Kampf C, Lundberg E, Nilsson P, Persson A, Wester K, Pontén F, Wernérus H, Uhlén M, Ottosson Takanen J, Hober S. Antibody performance in western blot applications is context-dependent. Biotechnol J 2014; 9:435-45. [PMID: 24403002 DOI: 10.1002/biot.201300341] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 08/03/2013] [Revised: 12/12/2013] [Accepted: 01/03/2014] [Indexed: 11/09/2022]
Abstract
An important concern for the use of antibodies in various applications, such as western blot (WB) or immunohistochemistry (IHC), is specificity. This calls for systematic validations using well-designed conditions. Here, we have analyzed 13 000 antibodies using western blot with lysates from human cell lines, tissues, and plasma. Standardized stratification showed that 45% of the antibodies yielded supportive staining, and the rest either no staining (12%) or protein bands of wrong size (43%). A comparative study of WB and IHC showed that the performance of antibodies is application-specific, although a correlation between no WB staining and weak IHC staining could be seen. To investigate the influence of protein abundance on the apparent specificity of the antibody, new WB analyses were performed for 1369 genes that gave unsupportive WBs in the initial screening using cell lysates with overexpressed full-length proteins. Then, more than 82% of the antibodies yielded a specific band corresponding to the full-length protein. Hence, the vast majority of the antibodies (90%) used in this study specifically recognize the target protein when present at sufficiently high levels. This demonstrates the context- and application-dependence of antibody validation and emphasizes that caution is needed when annotating binding reagents as specific or cross-reactive. WB is one of the most commonly used methods for validation of antibodies. Our data implicate that solely using one platform for antibody validation might give misleading information and therefore at least one additional method should be used to verify the achieved data.
Affiliation(s)
- Cajsa Algenäs
- Division of Proteomics, School of Biotechnology, Albanova University Center, KTH - Royal Institute of Technology, Stockholm, Sweden
15
Abstract
More than 20% of all protein domains are currently annotated as “domains of unknown function” (DUFs). About 2,700 DUFs are found in bacteria compared with just over 1,500 in eukaryotes. Over 800 DUFs are shared between bacteria and eukaryotes, and about 300 of these are also present in archaea. A total of 2,786 bacterial Pfam domains even occur in animals, including 320 DUFs. Evolutionary conservation suggests that many of these DUFs are important. Here we show that 355 essential proteins in 16 model bacterial species contain 238 DUFs, most of which represent single-domain proteins, clearly establishing the biological essentiality of DUFs. We suggest that experimental research should focus on conserved and essential DUFs (eDUFs) for functional analysis given their important function and wide taxonomic distribution, including bacterial pathogens. The functional units of proteins are domains. Typically, each domain has a distinct structure and function. Genomes encode thousands of domains, and many of the domains have no known function (domains of unknown function [DUFs]). They are often ignored as of little relevance, given that many of them are found in only a few genomes. Here we show that many DUFs are essential DUFs (eDUFs) based on their presence in essential proteins. We also show that eDUFs are often essential even if they are found in relatively few genomes. However, in general, more common DUFs are more often essential than rare DUFs.
16
Anwar T, Gourinath S. Analysis of the Protein phosphotome of Entamoeba histolytica reveals an intricate phosphorylation network. PLoS One 2013; 8:e78714. [PMID: 24236039 PMCID: PMC3827238 DOI: 10.1371/journal.pone.0078714] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 08/20/2013] [Accepted: 09/22/2013] [Indexed: 01/06/2023] Open
Abstract
Phosphorylation is the most common mechanism for the propagation of intracellular signals. Protein phosphatases and protein kinases play a dynamic antagonistic role in protein phosphorylation. Protein phosphatases make up a significant fraction of the eukaryotic proteome. In this article, we report the identification and analysis of protein phosphatases in the intracellular parasite Entamoeba histolytica. Based on an in silico analysis, we classified 250 non-redundant protein phosphatases in E. histolytica. The phosphotome of E. histolytica is 3.1% of its proteome and 1.3 times the size of the human phosphotome. In this extensive study, we identified 42 new putative phosphatases (39 hypothetical proteins and 3 pseudophosphatases). The presence of pseudophosphatases may have an important role in virulence of E. histolytica. A comprehensive phosphotome analysis of E. histolytica shows spectacularly low similarity to human phosphatases, making them potent drug target candidates.
Affiliation(s)
- Tamanna Anwar
- School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
17
Male-specific region of the bovine Y chromosome is gene rich with a high transcriptomic activity in testis development. Proc Natl Acad Sci U S A 2013; 110:12373-8. [PMID: 23842086 DOI: 10.1073/pnas.1221104110] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Indexed: 01/21/2023] Open
Abstract
The male-specific region of the mammalian Y chromosome (MSY) contains clusters of genes essential for male reproduction. The highly repetitive and degenerative nature of the Y chromosome impedes genomic and transcriptomic characterization. Although the Y chromosome sequence is available for the human, chimpanzee, and macaque, little is known about the annotation and transcriptome of nonprimate MSY. Here, we investigated the transcriptome of the MSY in cattle by direct testis cDNA selection and RNA-seq approaches. The bovine MSY differs radically from the primate Y chromosomes with respect to its structure, gene content, and density. Among the 28 protein-coding genes/families identified on the bovine MSY (12 single- and 16 multicopy genes), 16 are bovid specific. The 1,274 genes identified in this study made the bovine MSY gene density the highest in the genome; in comparison, primate MSYs have only 31-78 genes. Our results, along with the highly transcriptional activities observed from these Y-chromosome genes and 375 additional noncoding RNAs, challenge the widely accepted hypothesis that the MSY is gene poor and transcriptionally inert. The bovine MSY genes are predominantly expressed and are differentially regulated during the testicular development. Synonymous substitution rate analyses of the multicopy MSY genes indicated that two major periods of expansion occurred during the Miocene and Pliocene, contributing to the adaptive radiation of bovids. The massive amplification and vigorous transcription suggest that the MSY serves as a genomic niche regulating male reproduction during bovid expansion.
18
Lee CC, Chen YPP, Yao TJ, Ma CY, Lo WC, Lyu PC, Tang CY. GI-POP: A combinational annotation and genomic island prediction pipeline for ongoing microbial genome projects. Gene 2013; 518:114-23. [DOI: 10.1016/j.gene.2012.11.063] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 10/19/2012] [Accepted: 11/27/2012] [Indexed: 10/27/2022]
19
Jorda J, Lopez D, Wheatley NM, Yeates TO. Using comparative genomics to uncover new kinds of protein-based metabolic organelles in bacteria. Protein Sci 2013. [PMID: 23188745 DOI: 10.1002/pro.2196] [Citation(s) in RCA: 91] [Impact Index Per Article: 7.6] [Indexed: 12/16/2022]
Abstract
Bacterial microcompartment (MCP) organelles are cytosolic, polyhedral structures consisting of a thin protein shell and a series of encapsulated, sequentially acting enzymes. To date, different microcompartments carrying out three distinct types of metabolic processes have been characterized experimentally in various bacteria. In the present work, we use comparative genomics to explore the existence of yet uncharacterized microcompartments encapsulating a broader set of metabolic pathways. A clustering approach was used to group together enzymes that show a strong tendency to be encoded in chromosomal proximity to each other while also being near genes for microcompartment shell proteins. The results uncover new types of putative microcompartments, including one that appears to encapsulate B12-independent, glycyl radical-based degradation of 1,2-propanediol, and another potentially involved in amino alcohol metabolism in mycobacteria. Preliminary experiments show that an unusual shell protein encoded within the glycyl radical-based microcompartment binds an iron-sulfur cluster, hinting at complex mechanisms in this uncharacterized system. In addition, an examination of the computed microcompartment clusters suggests the existence of specific functional variations within certain types of MCPs, including the alpha carboxysome and the glycyl radical-based microcompartment. The findings lead to a deeper understanding of bacterial microcompartments and the pathways they sequester.
Affiliation(s)
- Julien Jorda
- UCLA-DOE Institute for Genomics and Proteomics, 611 Charles Young Dr East, Los Angeles, California 90095, USA
20
Voigt B, Hieu CX, Hempel K, Becher D, Schlüter R, Teeling H, Glöckner FO, Amann R, Hecker M, Schweder T. Cell surface proteome of the marine planctomycete Rhodopirellula baltica. Proteomics 2012; 12:1781-91. [PMID: 22623273 DOI: 10.1002/pmic.201100512] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 11/11/2022]
Abstract
The surface proteome (surfaceome) of the marine planctomycete Rhodopirellula baltica SH1(T) was studied using a biotinylation and a proteinase K approach combined with SDS-PAGE and mass spectrometry. 52 of the proteins identified in both approaches could be assigned to the group of potential surface proteins. Among them are some high molecular weight proteins, potentially involved in cell-cell attachment, that contain domains shown before to be typical for surface proteins like cadherin/dockerin domains, a bacterial adhesion domain or the fasciclin domain. The identification of proteins with enzymatic functions in the R. baltica surfaceome provides further clues for the suggestion that some degradative enzymes may be anchored onto the cell surface. YTV proteins, which have been earlier supposed to be components of the proteinaceous cell wall of R. baltica, were detected in the surface proteome. Additionally, 8 proteins with a novel protein structure combining a conserved type IV pilin/N-methylation domain and a planctomycete-typical DUF1559 domain were identified.
Affiliation(s)
- Birgit Voigt
- Institute for Microbiology, Ernst-Moritz-Arndt-University, Greifswald, Germany
21
Godel C, Kumar S, Koutsovoulos G, Ludin P, Nilsson D, Comandatore F, Wrobel N, Thompson M, Schmid CD, Goto S, Bringaud F, Wolstenholme A, Bandi C, Epe C, Kaminsky R, Blaxter M, Mäser P. The genome of the heartworm, Dirofilaria immitis, reveals drug and vaccine targets. FASEB J 2012; 26:4650-61. [PMID: 22889830 PMCID: PMC3475251 DOI: 10.1096/fj.12-205096] [Citation(s) in RCA: 109] [Impact Index Per Article: 8.4] [Indexed: 11/11/2022]
Abstract
The heartworm Dirofilaria immitis is an important parasite of dogs. Transmitted by mosquitoes in warmer climatic zones, it is spreading across southern Europe and the Americas at an alarming pace. There is no vaccine, and chemotherapy is prone to complications. To learn more about this parasite, we have sequenced the genomes of D. immitis and its endosymbiont Wolbachia. We predict 10,179 protein coding genes in the 84.2 Mb of the nuclear genome, and 823 genes in the 0.9-Mb Wolbachia genome. The D. immitis genome harbors neither DNA transposons nor active retrotransposons, and there is very little genetic variation between two sequenced isolates from Europe and the United States. The differential presence of anabolic pathways such as heme and nucleotide biosynthesis hints at the intricate metabolic interrelationship between the heartworm and Wolbachia. Comparing the proteome of D. immitis with other nematodes and with mammalian hosts, we identify families of potential drug targets, immune modulators, and vaccine candidates. This genome sequence will support the development of new tools against dirofilariasis and aid efforts to combat related human pathogens, the causative agents of lymphatic filariasis and river blindness.
22
Current challenges in genome annotation through structural biology and bioinformatics. Curr Opin Struct Biol 2012; 22:594-601. [PMID: 22884875 DOI: 10.1016/j.sbi.2012.07.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 04/11/2012] [Revised: 06/29/2012] [Accepted: 07/09/2012] [Indexed: 01/25/2023]
Abstract
With the huge volume of genomic sequences being generated from high-throughput sequencing projects, the requirement for providing accurate and detailed annotations of gene products has never been greater. It is proving to be a huge challenge for computational biologists to use as much information as possible from experimental data to provide annotations for genome data of unknown function. A central component to this process is to use experimentally determined structures, which provide a means to detect homology that is not discernible from just the sequence and permit the consequences of genomic variation to be realized at the molecular level. In particular, structures also form the basis of many bioinformatics methods for improving the detailed functional annotations of enzymes in combination with similarities in sequence and chemistry.
23
Ludin P, Woodcroft B, Ralph SA, Mäser P. In silico prediction of antimalarial drug target candidates. Int J Parasitol Drugs Drug Resist 2012; 2:191-9. [PMID: 24533280 DOI: 10.1016/j.ijpddr.2012.07.002] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 04/19/2012] [Revised: 06/28/2012] [Accepted: 07/03/2012] [Indexed: 10/28/2022]
Abstract
The need for new antimalarials is persistent due to the emergence of drug resistant parasites. Here we aim to identify new drug targets in Plasmodium falciparum by phylogenomics among the Plasmodium spp. and comparative genomics to Homo sapiens. The proposed target discovery pipeline is largely independent of experimental data and based on the assumption that P. falciparum proteins are likely to be essential if (i) there are no similar proteins in the same proteome and (ii) they are highly conserved across the malaria parasites of mammals. This hypothesis was tested using sequenced Saccharomycetaceae species as a touchstone. Consecutive filters narrowed down the potential target space of P. falciparum to proteins that are likely to be essential, matchless in the human proteome, expressed in the blood stages of the parasite, and amenable to small molecule inhibition. The final set of 40 candidate drug targets was significantly enriched in essential proteins and comprised proven targets (e.g. dihydropteroate synthetase or enzymes of the non-mevalonate pathway), targets currently under investigation (e.g. calcium-dependent protein kinases), and new candidates of potential interest such as phosphomannose isomerase, phosphoenolpyruvate carboxylase, signaling components, and transporters. The targets were prioritized based on druggability indices and on the availability of in vitro assays. Potential inhibitors were inferred from similarity to known targets of other disease systems. The identified candidates from P. falciparum provide insight into biochemical peculiarities and vulnerable points of the malaria parasite and might serve as starting points for rational drug discovery.
Affiliation(s)
- Philipp Ludin
- Swiss Tropical and Public Health Institute, 4002 Basel, Switzerland; University of Basel, 4000 Basel, Switzerland
- Ben Woodcroft
- Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Victoria 3010, Australia
- Stuart A Ralph
- Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Victoria 3010, Australia
- Pascal Mäser
- Swiss Tropical and Public Health Institute, 4002 Basel, Switzerland; University of Basel, 4000 Basel, Switzerland
24
Park J, Costanzo MC, Balakrishnan R, Cherry JM, Hong EL. CvManGO, a method for leveraging computational predictions to improve literature-based Gene Ontology annotations. Database (Oxford) 2012; 2012:bas001. [PMID: 22434836 PMCID: PMC3308158 DOI: 10.1093/database/bas001] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 01/18/2023]
Abstract
The set of annotations at the Saccharomyces Genome Database (SGD) that classifies the cellular function of S. cerevisiae gene products using Gene Ontology (GO) terms has become an important resource for facilitating experimental analysis. In addition to capturing and summarizing experimental results, the structured nature of GO annotations allows for functional comparison across organisms as well as propagation of functional predictions between related gene products. Due to their relevance to many areas of research, ensuring the accuracy and quality of these annotations is a priority at SGD. GO annotations are assigned either manually, by biocurators extracting experimental evidence from the scientific literature, or through automated methods that leverage computational algorithms to predict functional information. Here, we discuss the relationship between literature-based and computationally predicted GO annotations in SGD and extend a strategy whereby comparison of these two types of annotation identifies genes whose annotations need review. Our method, CvManGO (Computational versus Manual GO annotations), pairs literature-based GO annotations with computational GO predictions and evaluates the relationship of the two terms within GO, looking for instances of discrepancy. We found that this method will identify genes that require annotation updates, taking an important step towards finding ways to prioritize literature review. Additionally, we explored factors that may influence the effectiveness of CvManGO in identifying relevant gene targets to find in particular those genes that are missing literature-supported annotations, but our survey found that there are no immediately identifiable criteria by which one could enrich for these under-annotated genes. Finally, we discuss possible ways to improve this strategy, and the applicability of this method to other projects that use the GO for curation. Database URL:http://www.yeastgenome.org
Affiliation(s)
- Julie Park
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
25
Furnham N, Sillitoe I, Holliday GL, Cuff AL, Laskowski RA, Orengo CA, Thornton JM. Exploring the evolution of novel enzyme functions within structurally defined protein superfamilies. PLoS Comput Biol 2012; 8:e1002403. [PMID: 22396634 PMCID: PMC3291543 DOI: 10.1371/journal.pcbi.1002403] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.3] [Received: 09/30/2011] [Accepted: 01/09/2012] [Indexed: 11/18/2022] Open
Abstract
In order to understand the evolution of enzyme reactions and to gain an overview of biological catalysis we have combined sequence and structural data to generate phylogenetic trees in an analysis of 276 structurally defined enzyme superfamilies, and used these to study how enzyme functions have evolved. We describe in detail the analysis of two superfamilies to illustrate different paradigms of enzyme evolution. Gathering together data from all the superfamilies supports and develops the observation that they have all evolved to act on a diverse set of substrates, whilst the evolution of new chemistry is much less common. Despite that, by bringing together so much data, we can provide a comprehensive overview of the most common and rare types of changes in function. Our analysis demonstrates on a larger scale than previously studied, that modifications in overall chemistry still occur, with all possible changes at the primary level of the Enzyme Commission (E.C.) classification observed to a greater or lesser extent. The phylogenetic trees map out the evolutionary route taken within a superfamily, as well as all the possible changes within a superfamily. This has been used to generate a matrix of observed exchanges from one enzyme function to another, revealing the scale and nature of enzyme evolution and that some types of exchanges between and within E.C. classes are more prevalent than others. Surprisingly a large proportion (71%) of all known enzyme functions are performed by this relatively small set of 276 superfamilies. This reinforces the hypothesis that relatively few ancient enzymatic domain superfamilies were progenitors for most of the chemistry required for life. Enzymes, as biological catalysts, are crucial to life. Understanding how enzymes have evolved to perform the wide variety of reactions found across all kingdoms of life is fundamental to a broad range of biological studies, especially those leading to new therapeutics. 
To unravel the evolution of novel enzyme function requires combining information on protein structure, sequence, phylogeny and chemistry (in terms of interacting small molecules and reaction mechanisms). We have developed a protocol for integrating this wide range of data, which we have applied to a relatively large number of families comprising some very diverse relatives. This has permitted us to present an initial overview of the evolution of novel enzyme functions, in which we observe that some changes in function between relatives are more common than others, with most of the functionality observed in nature confined to relatively few families. Moreover, we are able to identify the evolutionary route taken within a superfamily to change the enzyme function from one reaction to another. This information may help in predicting the function of an enzyme that has yet to be experimentally characterised as well as in designing new enzymes for industrial and medical purposes.
Affiliation(s)
- Nicholas Furnham
- EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.
26
Costanzo MC, Park J, Balakrishnan R, Cherry JM, Hong EL. Using computational predictions to improve literature-based Gene Ontology annotations: a feasibility study. Database (Oxford) 2011; 2011:bar004. [PMID: 21411447 PMCID: PMC3067894 DOI: 10.1093/database/bar004] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 02/03/2023]
Abstract
Annotation using Gene Ontology (GO) terms is one of the most important ways in which biological information about specific gene products can be expressed in a searchable, computable form that may be compared across genomes and organisms. Because literature-based GO annotations are often used to propagate functional predictions between related proteins, their accuracy is critically important. We present a strategy that employs a comparison of literature-based annotations with computational predictions to identify and prioritize genes whose annotations need review. Using this method, we show that comparison of manually assigned ‘unknown’ annotations in the Saccharomyces Genome Database (SGD) with InterPro-based predictions can identify annotations that need to be updated. A survey of literature-based annotations and computational predictions made by the Gene Ontology Annotation (GOA) project at the European Bioinformatics Institute (EBI) across several other databases shows that this comparison strategy could be used to maintain and improve the quality of GO annotations for other organisms besides yeast. The survey also shows that although GOA-assigned predictions are the most comprehensive source of functional information for many genomes, a large proportion of genes in a variety of different organisms entirely lack these predictions but do have manual annotations. This underscores the critical need for manually performed, literature-based curation to provide functional information about genes that are outside the scope of widely used computational methods. Thus, the combination of manual and computational methods is essential to provide the most accurate and complete functional annotation of a genome. Database URL:http://www.yeastgenome.org
Affiliation(s)
- Maria C Costanzo
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5120, USA
27
Kumar K, Desai V, Cheng L, Khitrov M, Grover D, Satya RV, Yu C, Zavaljevski N, Reifman J. AGeS: a software system for microbial genome sequence annotation. PLoS One 2011; 6:e17469. [PMID: 21408217 PMCID: PMC3049762 DOI: 10.1371/journal.pone.0017469] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 11/23/2010] [Accepted: 02/01/2011] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND The annotation of genomes from next-generation sequencing platforms needs to be rapid, high-throughput, and fully integrated and automated. Although a few Web-based annotation services have recently become available, they may not be the best solution for researchers that need to annotate a large number of genomes, possibly including proprietary data, and store them locally for further analysis. To address this need, we developed a standalone software application, the Annotation of microbial Genome Sequences (AGeS) system, which incorporates publicly available and in-house-developed bioinformatics tools and databases, many of which are parallelized for high-throughput performance. METHODOLOGY The AGeS system supports three main capabilities. The first is the storage of input contig sequences and the resulting annotation data in a central, customized database. The second is the annotation of microbial genomes using an integrated software pipeline, which first analyzes contigs from high-throughput sequencing by locating genomic regions that code for proteins, RNA, and other genomic elements through the Do-It-Yourself Annotation (DIYA) framework. The identified protein-coding regions are then functionally annotated using the in-house-developed Pipeline for Protein Annotation (PIPA). The third capability is the visualization of annotated sequences using GBrowse. To date, we have implemented these capabilities for bacterial genomes. AGeS was evaluated by comparing its genome annotations with those provided by three other methods. Our results indicate that the software tools integrated into AGeS provide annotations that are in general agreement with those provided by the compared methods. This is demonstrated by a >94% overlap in the number of identified genes, a significant number of identical annotated features, and a >90% agreement in enzyme function predictions.
Collapse
Affiliation(s)
- Kamal Kumar, Valmik Desai, Li Cheng, Maxim Khitrov, Deepak Grover, Ravi Vijaya Satya, Chenggang Yu, Nela Zavaljevski, Jaques Reifman
- DoD Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Ft. Detrick, Maryland, United States of America
|
28
|
Jeong JC, Lin X, Chen XW. On position-specific scoring matrix for protein function prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2011; 8:308-315. [PMID: 20855926 DOI: 10.1109/tcbb.2010.93] [Citation(s) in RCA: 124] [Impact Index Per Article: 8.9] [Indexed: 05/29/2023]
Abstract
While genome sequencing projects have generated tremendous amounts of protein sequence data for a vast number of genomes, substantial portions of most genomes are still unannotated. Despite the success of experimental methods for identifying protein functions, they are often lab intensive and time consuming. Thus, it is only practical to use in silico methods for genome-wide functional annotation. In this paper, we propose new features extracted from protein sequence only and machine learning-based methods for computational function prediction. These features are derived from a position-specific scoring matrix, which has shown great potential in other bioinformatics problems. We evaluate these features using four different classifiers and yeast protein data. Our experimental results show that features derived from the position-specific scoring matrix are appropriate for automatic function annotation.
Affiliation(s)
- Jong Cheol Jeong
- Electrical Engineering and Computer Science Department, University of Kansas, Lawrence, KS 66045, USA.
|
29
|
Ivanov AS, Zgoda VG, Archakov AI. Technologies of protein interactomics: A review. Russian Journal of Bioorganic Chemistry 2011; 37:8-21. [DOI: 10.1134/s1068162011010092] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 12/18/2022]
|
30
|
Soyer OS, Creevey CJ. Duplicate retention in signalling proteins and constraints from network dynamics. J Evol Biol 2010; 23:2410-21. [DOI: 10.1111/j.1420-9101.2010.02101.x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 01/09/2023]
|
31
|
Frank M, Schloissnig S. Bioinformatics and molecular modeling in glycobiology. Cell Mol Life Sci 2010; 67:2749-72. [PMID: 20364395 PMCID: PMC2912727 DOI: 10.1007/s00018-010-0352-4] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.5] [Received: 12/23/2009] [Revised: 03/08/2010] [Accepted: 03/11/2010] [Indexed: 12/11/2022]
Abstract
The field of glycobiology is concerned with the study of the structure, properties, and biological functions of the family of biomolecules called carbohydrates. Bioinformatics for glycobiology is a particularly challenging field, because carbohydrates exhibit a high structural diversity and their chains are often branched. Significant improvements in experimental analytical methods over recent years have led to a tremendous increase in the amount of carbohydrate structure data generated. Consequently, the availability of databases and tools to store, retrieve and analyze these data in an efficient way is of fundamental importance to progress in glycobiology. In this review, the various graphical representations and sequence formats of carbohydrates are introduced, and an overview of newly developed databases, the latest developments in sequence alignment and data mining, and tools to support experimental glycan analysis are presented. Finally, the field of structural glycoinformatics and molecular modeling of carbohydrates, glycoproteins, and protein-carbohydrate interaction are reviewed.
Affiliation(s)
- Martin Frank
- Molecular Structure Analysis Core Facility-W160, Deutsches Krebsforschungszentrum (German Cancer Research Centre), 69120 Heidelberg, Germany.
|
32
|
Janga SC, Díaz-Mejía JJ, Moreno-Hagelsieb G. Network-based function prediction and interactomics: the case for metabolic enzymes. Metab Eng 2010; 13:1-10. [PMID: 20654726 DOI: 10.1016/j.ymben.2010.07.001] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Received: 06/02/2010] [Revised: 07/15/2010] [Accepted: 07/16/2010] [Indexed: 12/19/2022]
Abstract
As sequencing technologies increase in power, determining the functions of unknown proteins encoded by the DNA sequences so produced becomes a major challenge. Functional annotation is commonly done on the basis of amino-acid sequence similarity alone. Long after sequence similarity becomes undetectable by pair-wise comparison, profile-based identification of homologs can often succeed due to the conservation of position-specific patterns, important for a protein's three-dimensional folding and function. Nevertheless, prediction of protein function from homology-driven approaches is not without problems. Homologous proteins might evolve different functions and the power of homology detection has already started to reach its maximum. Computational methods for inferring protein function, which exploit the context of a protein in cellular networks, have come to be built on top of homology-based approaches. These network-based functional inference techniques both provide a first-hand hint into a protein's functional role and offer complementary insights to traditional methods for understanding the function of uncharacterized proteins. Most recent network-based approaches aim to integrate diverse kinds of functional interactions to boost both coverage and confidence level. These techniques not only promise to solve the moonlighting aspect of proteins by annotating proteins with multiple functions, but also increase our understanding of the interplay between different functional classes in a cell. In this article we review the state of the art in network-based function prediction and describe some of the underlying difficulties and successes. Given the volume of high-throughput data being reported, the time is ripe to employ these network-based approaches, which can be used to unravel the functions of the uncharacterized proteins accumulating in the genomic databases.
Affiliation(s)
- S C Janga
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB20QH, United Kingdom.
|
33
|
Hinz U. From protein sequences to 3D-structures and beyond: the example of the UniProt knowledgebase. Cell Mol Life Sci 2010; 67:1049-64. [PMID: 20043185 PMCID: PMC2835715 DOI: 10.1007/s00018-009-0229-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Received: 08/14/2009] [Revised: 12/01/2009] [Accepted: 12/07/2009] [Indexed: 11/12/2022]
Abstract
With the dramatic increase in the volume of experimental results in every domain of life sciences, assembling pertinent data and combining information from different fields has become a challenge. Information is dispersed over numerous specialized databases and is presented in many different formats. Rapid access to experiment-based information about well-characterized proteins helps predict the function of uncharacterized proteins identified by large-scale sequencing. In this context, universal knowledgebases play essential roles in providing access to data from complementary types of experiments and serving as hubs with cross-references to many specialized databases. This review outlines how the value of experimental data is optimized by combining high-quality protein sequences with complementary experimental results, including information derived from protein 3D-structures, using as an example the UniProt knowledgebase (UniProtKB) and the tools and links provided on its website ( http://www.uniprot.org/ ). It also evokes precautions that are necessary for successful predictions and extrapolations.
Affiliation(s)
- Ursula Hinz
- Swiss-Prot Group, Swiss Institute of Bioinformatics, 1 rue Michel Servet, 1211, Geneva, Switzerland.
|
34
|
Lee DA, Rentzsch R, Orengo C. GeMMA: functional subfamily classification within superfamilies of predicted protein structural domains. Nucleic Acids Res 2009; 38:720-37. [PMID: 19923231 PMCID: PMC2817468 DOI: 10.1093/nar/gkp1049] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.3] [Indexed: 11/13/2022] Open
Abstract
GeMMA (Genome Modelling and Model Annotation) is a new approach to automatic functional subfamily classification within families and superfamilies of protein sequences. A major advantage of GeMMA is its ability to subclassify very large and diverse superfamilies with tens of thousands of members, without the need for an initial multiple sequence alignment. Its performance is shown to be comparable to the established high-performance method SCI-PHY. GeMMA follows an agglomerative clustering protocol that uses existing software for sensitive and accurate multiple sequence alignment and profile–profile comparison. The produced subfamilies are shown to be equivalent in quality whether whole protein sequences are used or just the sequences of component predicted structural domains. A faster, heuristic version of GeMMA that also uses distributed computing is shown to maintain the performance levels of the original implementation. The use of GeMMA to increase the functional annotation coverage of functionally diverse Pfam families is demonstrated. It is further shown how GeMMA clusters can help to predict the impact of experimentally determining a protein domain structure on comparative protein modelling coverage, in the context of structural genomics.
Affiliation(s)
- David A Lee
- University College London - Structural and Molecular Biology, London, UK.
|
35
|
Abstract
Bioinformatics is a central discipline in modern life sciences aimed at describing the complex properties of living organisms starting from large-scale data sets of cellular constituents such as genes and proteins. In order for this wealth of information to provide useful biological knowledge, databases and software tools for data collection, analysis and interpretation need to be developed. In this paper, we review recent advances in the design and implementation of bioinformatics resources devoted to the study of metals in biological systems, a research field traditionally at the heart of bioinorganic chemistry. We show how metalloproteomes can be extracted from genome sequences, how structural properties can be related to function, how databases can be implemented, and how hints on interactions can be obtained from bioinformatics.
Affiliation(s)
- Ivano Bertini
- Magnetic Resonance Center (CERM)-University of Florence, Via L. Sacconi 6, Sesto Fiorentino, Italy.
|
36
|
Yen MR, Choi J, Saier MH. Bioinformatic analyses of transmembrane transport: novel software for deducing protein phylogeny, topology, and evolution. J Mol Microbiol Biotechnol 2009; 17:163-76. [PMID: 19776645 DOI: 10.1159/000239667] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Indexed: 11/19/2022] Open
Abstract
During the past decade, we have experienced a revolution in the biological sciences resulting from the flux of information generated by genome-sequencing efforts. Our understanding of living organisms, the metabolic processes they catalyze, the genetic systems encoding cellular protein and stable RNA constituents, and the pathological conditions caused by some of these organisms has greatly benefited from the availability of complete genomic sequences and the establishment of comprehensive databases. Many research institutes around the world are now devoting their efforts largely to genome sequencing, data collection and data analysis. In this review, we summarize tools that are in routine use in our laboratory for characterizing transmembrane transport systems. Applications of these tools to specific transporter families are presented. Many of the computational approaches described should be applicable to virtually all classes of proteins and RNA molecules.
Affiliation(s)
- Ming Ren Yen
- Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, USA
|
37
|
Schulz S, Beisswanger E, van den Hoek L, Bodenreider O, van Mulligen EM. Alignment of the UMLS semantic network with BioTop: methodology and assessment. Bioinformatics 2009; 25:i69-76. [PMID: 19478019 PMCID: PMC2687948 DOI: 10.1093/bioinformatics/btp194] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/15/2022] Open
Abstract
MOTIVATION For many years, the Unified Medical Language System (UMLS) semantic network (SN) has been used as an upper-level semantic framework for the categorization of terms from terminological resources in biomedicine. BioTop has recently been developed as an upper-level ontology for the biomedical domain. In contrast to the SN, it is founded upon strict ontological principles, using OWL DL as a formal representation language, which has become standard in the semantic Web. In order to make logic-based reasoning available for the resources annotated or categorized with the SN, a mapping ontology was developed aligning the SN with BioTop. METHODS The theoretical foundations and the practical realization of the alignment are being described, with a focus on the design decisions taken, the problems encountered and the adaptations of BioTop that became necessary. For evaluation purposes, UMLS concept pairs obtained from MEDLINE abstracts by a named entity recognition system were tested for possible semantic relationships. Furthermore, all semantic-type combinations that occur in the UMLS Metathesaurus were checked for satisfiability. RESULTS The effort-intensive alignment process required major design changes and enhancements of BioTop and brought up several design errors that could be fixed. A comparison between a human curator and the ontology yielded only a low agreement. Ontology reasoning was also used to successfully identify 133 inconsistent semantic-type combinations. AVAILABILITY BioTop, the OWL DL representation of the UMLS SN, and the mapping ontology are available at http://www.purl.org/biotop/.
Affiliation(s)
- Stefan Schulz
- Institute of Medical Biometry and Medical Informatics, University Medical Center Freiburg, Freiburg, Germany.
|
38
|
Xia X, Postis VLG, Rahman M, Wright GSA, Roach PCJ, Deacon SE, Ingram JC, Henderson PJF, Findlay JBC, Phillips SEV, McPherson MJ, Baldwin SA. Investigation of the structure and function of a Shewanella oneidensis arsenical-resistance family transporter. Mol Membr Biol 2009; 25:691-705. [PMID: 19039703 DOI: 10.1080/09687680802535930] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 10/21/2022]
Abstract
The toxic metalloid arsenic is an abundant element and most organisms possess transport systems involved in its detoxification. One such family of arsenite transporters, the ACR3 family, is widespread in fungi and bacteria. To gain a better understanding of the molecular mechanism of arsenic transport, we report here the expression and characterization of a family member, So_ACR3, from the bacterium Shewanella oneidensis MR-1. Surprisingly, expression of this transporter in the arsenic-hypersensitive Escherichia coli strain AW3110 conferred resistance to arsenate, but not to arsenite. Purification of a C-terminally His-tagged form of the protein allowed the binding of putative permeants to be directly tested: arsenate but not arsenite quenched its intrinsic fluorescence in a concentration-dependent fashion. Fourier transform infrared spectroscopy showed that the purified protein was predominantly alpha-helical. A mutant bearing a single cysteine residue at position 3 retained the ability to confer arsenate resistance, and was accessible to membrane impermeant thiol reagents in intact cells. In conjunction with successful C-terminal tagging with oligohistidine, this finding is consistent with the experimentally-determined topology of the homologous human apical sodium-dependent bile acid transporter, namely 7 transmembrane helices and a periplasmic N-terminus, although the presence of additional transmembrane segments cannot be excluded. Mutation to alanine of the conserved residue proline 190, in the fourth putative transmembrane region, abrogated the ability of the transporter to confer arsenic resistance, but did not prevent arsenate binding. An apparently increased thermal stability is consistent with the mutant being unable to undergo the conformational transitions required for permeant translocation.
Affiliation(s)
- Xiaobing Xia
- Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds, UK
|
39
|
Prathipati P, Ma NL, Manjunatha UH, Bender A. Fishing the Target of Antitubercular Compounds: In Silico Target Deconvolution Model Development and Validation. J Proteome Res 2009; 8:2788-98. [DOI: 10.1021/pr8010843] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 12/30/2022]
Affiliation(s)
- Philip Prathipati, Ngai Ling Ma, Ujjini H. Manjunatha, Andreas Bender
- Novartis Institute for Tropical Diseases, 10 Biopolis Road, #05-01 Chromos, 138670, Singapore, and Lead Discovery Informatics, Center for Proteomic Chemistry, Novartis Institutes for BioMedical Research, Inc., 250 Massachusetts Avenue, Cambridge, Massachusetts 02139
|
40
|
Jorrín-Novo JV, Maldonado AM, Echevarría-Zomeño S, Valledor L, Castillejo MA, Curto M, Valero J, Sghaier B, Donoso G, Redondo I. Plant proteomics update (2007–2008): Second-generation proteomic techniques, an appropriate experimental design, and data analysis to fulfill MIAPE standards, increase plant proteome coverage and expand biological knowledge. J Proteomics 2009; 72:285-314. [DOI: 10.1016/j.jprot.2009.01.026] [Citation(s) in RCA: 174] [Impact Index Per Article: 10.9] [Indexed: 12/18/2022]
|
41
|
Rentzsch R, Orengo CA. Protein function prediction--the power of multiplicity. Trends Biotechnol 2009; 27:210-9. [PMID: 19251332 DOI: 10.1016/j.tibtech.2009.01.002] [Citation(s) in RCA: 88] [Impact Index Per Article: 5.5] [Received: 11/21/2008] [Revised: 01/21/2009] [Accepted: 01/23/2009] [Indexed: 01/07/2023]
Abstract
Advances in experimental and computational methods have quietly ushered in a new era in protein function annotation. This 'age of multiplicity' is marked by the notion that only the use of multiple tools, multiple evidence and considering the multiple aspects of function can give us the broad picture that 21st century biology will need to link and alter micro- and macroscopic phenotypes. It might also help us to undo past mistakes by removing errors from our databases and prevent us from producing more. On the downside, multiplicity is often confusing. We therefore systematically review methods and resources for automated protein function prediction, looking at individual (biochemical) and contextual (network) functions, respectively.
Affiliation(s)
- Robert Rentzsch
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK.
|
42
|
Bachmann BO, Ravel J. Chapter 8. Methods for in silico prediction of microbial polyketide and nonribosomal peptide biosynthetic pathways from DNA sequence data. Methods Enzymol 2009; 458:181-217. [PMID: 19374984 DOI: 10.1016/s0076-6879(09)04808-3] [Citation(s) in RCA: 281] [Impact Index Per Article: 17.6] [Indexed: 12/02/2022]
Abstract
Fore-knowledge of the secondary metabolic potential of cultivated and previously uncultivated microorganisms can potentially facilitate the process of natural product discovery. By combining sequence-based knowledge with biochemical precedent, translated gene sequence data can be used to rapidly derive structural elements encoded by secondary metabolic gene clusters from microorganisms. These structural elements provide an estimate of the secondary metabolic potential of a given organism and a starting point for identification of potential lead compounds in isolation/structure elucidation campaigns. The accuracy of these predictions for a given translated gene sequence depends on the biochemistry of the metabolite class, similarity to known metabolite gene clusters, and depth of knowledge concerning its biosynthetic machinery. This chapter introduces methods for prediction of structural elements for two well-studied classes: modular polyketides and nonribosomally encoded peptides. A bioinformatics tool is presented for rapid preliminary analysis of these modular systems, and prototypical methods for converting these analyses into substructural elements are described.
Affiliation(s)
- Brian O Bachmann
- Department of Chemistry, Vanderbilt Institute for Chemical Biology, Vanderbilt University, Nashville, Tennessee, USA
|
43
|
Michael H, Hogan J, Kel A, Kel-Margoulis O, Schacherer F, Voss N, Wingender E. Building a knowledge base for systems pathology. Brief Bioinform 2008; 9:518-31. [PMID: 19073714 DOI: 10.1093/bib/bbn038] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 01/14/2023] Open
Abstract
Translating the exponentially growing amount of omics data into knowledge usable for a personalized medicine approach poses a formidable challenge. In this article, taking diabetes as a use case, we present strategies for developing data repositories into computer-accessible knowledge sources that can be used for a systemic view on the molecular causes of diseases, thus laying the foundation for systems pathology.
Affiliation(s)
- Holger Michael
- Department of Bioinformatics, Goldschmidtstr. 1, D-37077 Göttingen, Germany
|
44
|
Vizcaíno JA, Mueller M, Hermjakob H, Martens L. Charting online OMICS resources: A navigational chart for clinical researchers. Proteomics Clin Appl 2008; 3:18-29. [PMID: 21136933 DOI: 10.1002/prca.200800082] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 03/30/2008] [Indexed: 12/22/2022]
Abstract
The life sciences have sprouted several popular and successful OMICS technologies that span all levels of biological information transfer. Ever since the start of the Human Genome Project, the then revolutionary idea to make all resulting data publicly available has been central to all of the efforts across OMICS technologies. As a result, a great variety of publicly available data repositories and resources is currently available to the research community. This widespread availability of data does come at the price of increased confusion on the part of the users, especially for those that see the OMICS technologies as tools to help unravel a larger biological or clinical question. We therefore provide a comprehensive overview of the available resources across OMICS fields, with a special emphasis on those databases that are relevant to the study of proteins. Additionally, we also describe various integrative systems that have been established, and highlight new developments in the field that can revolutionize the way in which live data integration is achieved over the internet.
Affiliation(s)
- Juan Antonio Vizcaíno
- EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
|
45
|
Klein J, Münch R, Biegler I, Haddad I, Retter I, Jahn D. Strepto-DB, a database for comparative genomics of group A (GAS) and B (GBS) streptococci, implemented with the novel database platform 'Open Genome Resource' (OGeR). Nucleic Acids Res 2008; 37:D494-8. [PMID: 18854354 PMCID: PMC2686516 DOI: 10.1093/nar/gkn674] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022] Open
Abstract
Streptococci are the causative agent of many human infectious diseases including bacterial pneumonia and meningitis. Here, we present Strepto-DB, a database for the comparative genome analysis of group A (GAS) and group B (GBS) streptococci. The known genomes of various GAS and GBS contain a large fraction of distributed genes that were found absent in other strains or serotypes of the same species. Strepto-DB identifies the homologous proteins deduced from the genomes of interest. It allows for the elucidation of the GAS and GBS core- and pan-genomes via genome-wide comparisons. Moreover, an intergenic region analysis tool provides alignments and predictions for transcription factor binding sites in the non-coding sequences. An interactive genome browser visualizes functional annotations. Strepto-DB (http://oger.tu-bs.de/strepto_db) was created by the use of OGeR, the Open Genome Resource for comparative analysis of prokaryotic genomes. OGeR is a newly developed open source database and tool platform for the web-based storage, distribution, visualization and comparison of prokaryotic genome data. The system automatically creates the dedicated relational database and web interface and imports an arbitrary number of genomes derived from standardized genome files. OGeR can be downloaded at http://oger.tu-bs.de.
Affiliation(s)
- Johannes Klein
- Institute for Microbiology, Technische Universität Braunschweig, Spielmannstrasse 7, 38106 Braunschweig, Germany
|