301
Griffin WC, Trakselis MA. The MCM8/9 complex: A recent recruit to the roster of helicases involved in genome maintenance. DNA Repair (Amst) 2019; 76:1-10. [PMID: 30743181 DOI: 10.1016/j.dnarep.2019.02.003] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 12/13/2018] [Accepted: 02/03/2019] [Indexed: 12/11/2022]
Abstract
There are several DNA helicases involved in seemingly overlapping aspects of homologous and homoeologous recombination. Mutations of many of these helicases are directly implicated in genetic diseases including cancer, rapid aging, and infertility. MCM8/9 are recent additions to the catalog of helicases involved in recombination, and so far, the evidence is sparse, making assignment of function difficult. Mutations in MCM8/9 correlate principally with primary ovarian failure/insufficiency (POF/POI) and infertility indicating a meiotic defect. However, they also act when replication forks collapse/break shuttling products into mitotic recombination and several mutations are found in various somatic cancers. This review puts MCM8/9 in context with other replication and recombination helicases to narrow down its genomic maintenance role. We discuss the known structure/function relationship, the mutational spectrum, and dissect the available cellular and organismal data to better define its role in recombination.
Affiliation(s)
- Wezley C Griffin
  - Department of Chemistry and Biochemistry, Baylor University, Waco, Texas, 76798, USA
- Michael A Trakselis
  - Department of Chemistry and Biochemistry, Baylor University, Waco, Texas, 76798, USA
302
Puzakova LV, Puzakov MV, Soldatov AA. Gene Encoding a Novel Enzyme of LDH2/MDH2 Family is Lost in Plant and Animal Genomes During Transition to Land. J Mol Evol 2019; 87:52-59. [PMID: 30607448 DOI: 10.1007/s00239-018-9884-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/18/2018] [Accepted: 12/27/2018] [Indexed: 11/28/2022]
Abstract
L-Lactate/malate dehydrogenases (LDH/MDH) and type 2 L-lactate/malate dehydrogenases (LDH2/MDH2) belong to NADH/NADPH-dependent oxidoreductases (anaerobic dehydrogenases). They form a large protein superfamily with multiple enzyme homologs found in all branches of life: from bacteria and archaea to eukaryotes, and play an essential role in metabolism. Here, we describe the gene encoding a new enzyme of LDH2/MDH2 oxidoreductase family. This gene is found in genomes of all studied groups/classes of bacteria and fungi. In the plant kingdom, this gene was observed only in algae, but not in bryophyta or spermatophyta. This gene is present in all taxonomic groups of animal kingdom beginning with protozoa, but is lost in lungfishes and other, higher taxa of vertebrates (amphibians, reptiles, avians and mammals). Since the gene encoding the new enzyme is found only in taxa associated with the aquatic environment, we named it AqE (aquatic enzyme). We demonstrated that AqE gene is convergently lost in different independent lineages of animals and plants. Interestingly, the loss of the gene is consistently associated with transition from aquatic to terrestrial life forms, which suggests that this enzyme is essential in aquatic environment, but redundant or even detrimental in terrestrial organisms.
Affiliation(s)
- L V Puzakova
  - The A.O. Kovalevsky Institute of Marine Biology Research of RAS, Nakhimov av., 2, Sevastopol, Russia, 299011
- M V Puzakov
  - The A.O. Kovalevsky Institute of Marine Biology Research of RAS, Nakhimov av., 2, Sevastopol, Russia, 299011
- A A Soldatov
  - The A.O. Kovalevsky Institute of Marine Biology Research of RAS, Nakhimov av., 2, Sevastopol, Russia, 299011
303
Oldfield CJ, Chen K, Kurgan L. Computational Prediction of Secondary and Supersecondary Structures from Protein Sequences. Methods Mol Biol 2019; 1958:73-100. [PMID: 30945214 DOI: 10.1007/978-1-4939-9161-7_4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/12/2022]
Abstract
Many new methods for the sequence-based prediction of the secondary and supersecondary structures have been developed over the last several years. These and older sequence-based predictors are widely applied for the characterization and prediction of protein structure and function. These efforts have produced countless accurate predictors, many of which rely on state-of-the-art machine learning models and evolutionary information generated from multiple sequence alignments. We describe and motivate both types of predictions. We introduce concepts related to the annotation and computational prediction of the three-state and eight-state secondary structure as well as several types of supersecondary structures, such as β hairpins, coiled coils, and α-turn-α motifs. We review 34 predictors focusing on recent tools and provide detailed information for a selected set of 14 secondary structure and 3 supersecondary structure predictors. We conclude with several practical notes for the end users of these predictive methods.
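The relationship between the eight-state and three-state annotations mentioned in this abstract can be illustrated with a short sketch. The mapping below (H, G, I to helix; E, B to strand; everything else to coil) is one common reduction of DSSP-style codes and is an assumption here, not necessarily the convention adopted in the chapter. The function name `reduce_to_three_state` is hypothetical.

```python
# One common 8-to-3 reduction of DSSP-style secondary-structure codes.
# This is an assumption: other reductions exist (e.g. treating G or B as coil).
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",  # alpha, 3-10 and pi helices -> helix
    "E": "E", "B": "E",            # extended strand, isolated bridge -> strand
    "T": "C", "S": "C", "-": "C",  # turn, bend, other -> coil
}


def reduce_to_three_state(ss8: str) -> str:
    """Collapse an eight-state annotation string into three states (H, E, C)."""
    # Symbols outside the table (e.g. blanks) fall back to coil.
    return "".join(EIGHT_TO_THREE.get(code, "C") for code in ss8)


if __name__ == "__main__":
    print(reduce_to_three_state("-HHHHGGGTT-EEEEB-S"))
```

Because the choice of reduction changes which residues count as helix or strand, accuracies reported under different reductions are not directly comparable.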
Affiliation(s)
- Christopher J Oldfield
  - Department of Computer Science, College of Engineering, Virginia Commonwealth University, Richmond, VA, USA
- Ke Chen
  - School of Computer Science and Software Engineering, Tianjin Polytechnic University, Tianjin, People's Republic of China
- Lukasz Kurgan
  - Department of Computer Science, College of Engineering, Virginia Commonwealth University, Richmond, VA, USA
304
Wróblewski T, Spiridon L, Martin EC, Petrescu AJ, Cavanaugh K, Truco MJ, Xu H, Gozdowski D, Pawłowski K, Michelmore RW, Takken FL. Genome-wide functional analyses of plant coiled-coil NLR-type pathogen receptors reveal essential roles of their N-terminal domain in oligomerization, networking, and immunity. PLoS Biol 2018; 16:e2005821. [PMID: 30540748 PMCID: PMC6312357 DOI: 10.1371/journal.pbio.2005821] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 02/26/2018] [Revised: 12/31/2018] [Accepted: 11/16/2018] [Indexed: 12/22/2022] Open
Abstract
The ability to induce a defense response after pathogen attack is a critical feature of the immune system of any organism. Nucleotide-binding leucine-rich repeat receptors (NLRs) are key players in this process and perceive the occurrence of nonself-activities or foreign molecules. In plants, coevolution with a variety of pests and pathogens has resulted in repertoires of several hundred diverse NLRs in single individuals and many more in populations as a whole. However, the mechanism by which defense signaling is triggered by these NLRs in plants is poorly understood. Here, we show that upon pathogen perception, NLRs use their N-terminal domains to transactivate other receptors. Their N-terminal domains homo- and heterodimerize, suggesting that plant NLRs oligomerize upon activation, similar to the vertebrate NLRs; however, consistent with their large number in plants, the complexes are highly heterometric. Also, in contrast to metazoan NLRs, the N-terminus, rather than their centrally located nucleotide-binding (NB) domain, can mediate initial partner selection. The highly redundant network of NLR interactions in plants is proposed to provide resilience to perturbation by pathogens.
Affiliation(s)
- Tadeusz Wróblewski
  - The Genome Center, University of California–Davis, Davis, California, United States of America
- Laurentiu Spiridon
  - Department of Bioinformatics and Structural Biochemistry, Institute of Biochemistry of the Romanian Academy, Bucharest, Romania
- Eliza Cristina Martin
  - Department of Bioinformatics and Structural Biochemistry, Institute of Biochemistry of the Romanian Academy, Bucharest, Romania
- Andrei-Jose Petrescu
  - Department of Bioinformatics and Structural Biochemistry, Institute of Biochemistry of the Romanian Academy, Bucharest, Romania
- Keri Cavanaugh
  - The Genome Center, University of California–Davis, Davis, California, United States of America
- Maria José Truco
  - The Genome Center, University of California–Davis, Davis, California, United States of America
- Huaqin Xu
  - The Genome Center, University of California–Davis, Davis, California, United States of America
- Dariusz Gozdowski
  - Department of Experimental Design and Bioinformatics, Faculty of Agriculture and Biology, Warsaw University of Life Sciences, Warsaw, Poland
- Krzysztof Pawłowski
  - Department of Experimental Design and Bioinformatics, Faculty of Agriculture and Biology, Warsaw University of Life Sciences, Warsaw, Poland
- Richard W. Michelmore
  - The Genome Center, University of California–Davis, Davis, California, United States of America
  - Departments of Plant Sciences, Molecular & Cellular Biology, and Medical Microbiology & Immunology, University of California–Davis, Davis, California, United States of America
  - Department of Medical Microbiology and Immunology, University of California–Davis, Davis, California, United States of America
- Frank L.W. Takken
  - Molecular Plant Pathology, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, the Netherlands
305
Weber A, Alves J, Abujamra AL, Bustamante‐Filho IC. Structural modeling and mRNA expression of epididymal β‐defensins in GnRH immunized boars: A model for secondary hypogonadism in man. Mol Reprod Dev 2018; 85:921-933. [DOI: 10.1002/mrd.23069] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/19/2018] [Accepted: 10/09/2018] [Indexed: 01/18/2023]
Affiliation(s)
- Augusto Weber
  - Laboratório de Biotecnologia, Universidade do Vale do Taquari – Univates, Lajeado, RS, Brazil
- Jayse Alves
  - Laboratório de Biotecnologia, Universidade do Vale do Taquari – Univates, Lajeado, RS, Brazil
- Ana L. Abujamra
  - Laboratório de Biotecnologia, Universidade do Vale do Taquari – Univates, Lajeado, RS, Brazil
306
Functional Characterization of AbaQ, a Novel Efflux Pump Mediating Quinolone Resistance in Acinetobacter baumannii. Antimicrob Agents Chemother 2018; 62:AAC.00906-18. [PMID: 29941648 DOI: 10.1128/aac.00906-18] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 05/04/2018] [Accepted: 06/18/2018] [Indexed: 01/15/2023] Open
Abstract
Acinetobacter baumannii has emerged as an important multidrug-resistant nosocomial pathogen. In previous work, we identified a putative MFS transporter, AU097_RS17040, involved in the pathogenicity of A. baumannii (M. Pérez-Varela, J. Corral, J. A. Vallejo, S. Rumbo-Feal, G. Bou, J. Aranda, and J. Barbé, Infect Immun 85:e00327-17, 2017, https://doi.org/10.1128/IAI.00327-17). In this study, we analyzed the susceptibility to diverse antimicrobial agents of A. baumannii cells defective in this transporter, referred to as AbaQ. Our results showed that AbaQ is mainly involved in the extrusion of quinolone-type drugs in A. baumannii.
307
Kunjithapatham R, Ganapathy-Kanniappan S. GAPDH with NAD+-binding site mutation competitively inhibits the wild-type and affects glucose metabolism in cancer. Biochim Biophys Acta Gen Subj 2018; 1862:2555-2563. [PMID: 30077773 DOI: 10.1016/j.bbagen.2018.08.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/11/2018] [Revised: 07/26/2018] [Accepted: 08/01/2018] [Indexed: 12/12/2022]
Abstract
BACKGROUND Rapid utilization of glucose is a metabolic signature of majority of cancers, hence enzymes of the glycolytic pathway remain attractive therapeutic targets. Recent reports have shown that targeting the glycolytic enzyme, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), an abundant, ubiquitous multifunctional protein frequently upregulated in cancer, affects cancer progression. Here, we report that a catalytically-deficient mutant-GAPDH competitively inhibits the wild-type, and disrupts glucose metabolism in cancer cells. METHODS Using site-directed mutagenesis, the human GAPDH clone was mutated at one of the NAD+-binding sites, (i.e.) arginine (R13) and isoleucine (I14) to glutamine (Q13) and phenylalanine (F14), respectively. The inhibitory role of the mutant-GAPDH, and its effect on energy metabolism and cancer phenotype was determined using in vitro and in vivo models of cancer. RESULTS The enzymatically-dysfunctional mutant-GAPDH competitively inhibited the wild-type GAPDH in a cell-free system. In cancer cells, ectopic expression of the mutant-GAPDH, but not the wild-type, inhibited the glycolytic capacity of cellular-GAPDH, and led to the induction of metabolic stress accompanied by a sharp decline in glucose-uptake. Furthermore, expression of mutant-GAPDH affected cancer growth in vitro and in vivo. Mechanistically, structural analysis by bioinformatics revealed that the mutations at the NAD+-binding site altered the solvent-accessibility that perhaps affected the functionality of mutant-GAPDH. CONCLUSION Mutant-GAPDH affects the enzymatic function of cellular-GAPDH and disrupts energy metabolism. GENERAL SIGNIFICANCE Our findings demonstrate that a minimal mutation at the NAD+-binding site is sufficient to generate a competitive but dysfunctional GAPDH, and its ectopic expression inhibits the wild-type to disrupt glycolysis.
Affiliation(s)
- Rani Kunjithapatham
  - The Division of Interventional Radiology, Russell H. Morgan Department of Radiology & Radiological Science, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Shanmugasundaram Ganapathy-Kanniappan
  - The Division of Interventional Radiology, Russell H. Morgan Department of Radiology & Radiological Science, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
308
Flores-Ibarra A, Vértesy S, Medrano FJ, Gabius HJ, Romero A. Crystallization of a human galectin-3 variant with two ordered segments in the shortened N-terminal tail. Sci Rep 2018; 8:9835. [PMID: 29959397 PMCID: PMC6026190 DOI: 10.1038/s41598-018-28235-x] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 02/12/2018] [Accepted: 06/19/2018] [Indexed: 12/24/2022] Open
Abstract
Among members of the family of adhesion/growth-regulatory galectins, galectin-3 (Gal-3) bears a unique modular architecture. A N-terminal tail (NT) consisting of the N-terminal segment (NTS) and nine collagen-like repeats is linked to the canonical lectin domain. In contrast to bivalent proto- and tandem-repeat-type galectins, Gal-3 is monomeric in solution, capable to self-associate in the presence of bi- to multivalent ligands, and the NTS is involved in cellular compartmentalization. Since no crystallographic information on Gal-3 beyond the lectin domain is available, we used a shortened variant with NTS and repeats VII-IX. This protein crystallized as tetramers with contacts between the lectin domains. The region from Tyr101 (in repeat IX) to Leu114 (in the CRD) formed a hairpin. The NTS extends the canonical β-sheet of F1-F5 strands with two new β-strands on the F face. Together, crystallographic and SAXS data reveal a mode of intramolecular structure building involving the highly flexible Gal-3’s NT.
Affiliation(s)
- Andrea Flores-Ibarra
  - Department of Structural and Chemical Biology, Centro de Investigaciones Biológicas, CSIC, Ramiro de Maeztu 9, 28040, Madrid, Spain
- Sabine Vértesy
  - Institute of Physiological Chemistry, Faculty of Veterinary Medicine, Ludwig-Maximilians-University Munich, Veterinärstraße 13, 80539, Munich, Germany
- Francisco J Medrano
  - Department of Structural and Chemical Biology, Centro de Investigaciones Biológicas, CSIC, Ramiro de Maeztu 9, 28040, Madrid, Spain
- Hans-Joachim Gabius
  - Institute of Physiological Chemistry, Faculty of Veterinary Medicine, Ludwig-Maximilians-University Munich, Veterinärstraße 13, 80539, Munich, Germany
- Antonio Romero
  - Department of Structural and Chemical Biology, Centro de Investigaciones Biológicas, CSIC, Ramiro de Maeztu 9, 28040, Madrid, Spain
309
Protein Secondary Structure Prediction Based on Data Partition and Semi-Random Subspace Method. Sci Rep 2018; 8:9856. [PMID: 29959372 PMCID: PMC6026213 DOI: 10.1038/s41598-018-28084-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 03/15/2018] [Accepted: 06/12/2018] [Indexed: 11/20/2022] Open
Abstract
Protein secondary structure prediction is one of the most important and challenging problems in bioinformatics. Machine learning techniques have been applied to solve the problem and have gained substantial success in this research area. However there is still room for improvement toward the theoretical limit. In this paper, we present a novel method for protein secondary structure prediction based on a data partition and semi-random subspace method (PSRSM). Data partitioning is an important strategy for our method. First, the protein training dataset was partitioned into several subsets based on the length of the protein sequence. Then we trained base classifiers on the subspace data generated by the semi-random subspace method, and combined base classifiers by majority vote rule into ensemble classifiers on each subset. Multiple classifiers were trained on different subsets. These different classifiers were used to predict the secondary structures of different proteins according to the protein sequence length. Experiments are performed on 25PDB, CB513, CASP10, CASP11, CASP12, and T100 datasets, and the good performance of 86.38%, 84.53%, 85.51%, 85.89%, 85.55%, and 85.09% is achieved respectively. Experimental results showed that our method outperforms other state-of-the-art methods.
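The routing-and-voting structure described in this abstract can be sketched roughly as follows. This is not the authors' implementation: the length cut-offs in `LENGTH_BINS` are hypothetical (the abstract does not state them), the semi-random subspace step is not reproduced, and base classifiers are treated as opaque callables that return one label per residue.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical length boundaries; the abstract does not give the real cut-offs.
LENGTH_BINS: List[Tuple[int, int]] = [(0, 100), (100, 300), (300, 10**9)]

# A base classifier maps a sequence to a per-residue label string (H/E/C).
Classifier = Callable[[str], str]


def length_bin(sequence: str) -> int:
    """Return the index of the length bin the sequence falls into."""
    for index, (low, high) in enumerate(LENGTH_BINS):
        if low <= len(sequence) < high:
            return index
    raise ValueError("sequence length outside all bins")


def majority_vote(predictions: Sequence[str]) -> str:
    """Combine per-residue predictions position by position."""
    combined = []
    for labels in zip(*predictions):
        # Ties go to the label that appears first among the voters.
        combined.append(Counter(labels).most_common(1)[0][0])
    return "".join(combined)


def predict(sequence: str, ensembles: Dict[int, List[Classifier]]) -> str:
    """Route the sequence to the ensemble for its length bin, then vote."""
    base_classifiers = ensembles[length_bin(sequence)]
    return majority_vote([clf(sequence) for clf in base_classifiers])


if __name__ == "__main__":
    # Toy stand-ins for trained base classifiers.
    toy = {0: [lambda s: "H" * len(s), lambda s: "H" * len(s), lambda s: "C" * len(s)]}
    print(predict("ACDE", toy))
```

Keeping one ensemble per length bin means a short protein is never scored by classifiers trained on long ones, which is the point of the data-partition step.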
310
Low-resolution SAXS and comparative modeling based structure analysis of endo-β-1,4-xylanase a family 10 glycoside hydrolase from Pseudopedobacter saltans comb. nov. Int J Biol Macromol 2018; 112:1104-1114. [DOI: 10.1016/j.ijbiomac.2018.02.037] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/31/2017] [Revised: 02/05/2018] [Accepted: 02/07/2018] [Indexed: 11/20/2022]
311
Structural characterization and antioxidant potential of phycocyanin from the cyanobacterium Geitlerinema sp. H8DM. ALGAL RES 2018. [DOI: 10.1016/j.algal.2018.04.024] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Indexed: 02/08/2023]
312
Cheng CW, Putaporntip C, Jongwutiwes S. Polymorphism in merozoite surface protein-7E of Plasmodium vivax in Thailand: Natural selection related to protein secondary structure. PLoS One 2018; 13:e0196765. [PMID: 29718980 PMCID: PMC5931635 DOI: 10.1371/journal.pone.0196765] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 10/31/2017] [Accepted: 04/19/2018] [Indexed: 11/18/2022] Open
Abstract
Merozoite surface protein 7 (MSP-7) is a multigene family expressed during malaria blood-stage infection. MSP-7 forms complex with MSP-1 prior to merozoite egress from erythrocytes, and could affect merozoite invasion of erythrocytes. To characterize sequence variation in the orthologue in P. vivax (PvMSP-7), a gene member encoding PvMSP-7E was analyzed among 92 Thai isolates collected from 3 major endemic areas of Thailand (Northwest: Tak, Northeast: Ubon Ratchathani, and South: Yala and Narathiwat provinces). In total, 52 distinct haplotypes were found to circulate in these areas. Although population structure based on this locus was observed between each endemic area, no genetic differentiation occurred between populations collected from different periods in the same endemic area, suggesting spatial but not temporal genetic variation. Sequence microheterogeneity in both N- and C- terminal regions was predicted to display 4 and 6 α-helical domains, respectively. Signals of purifying selection were observed in α-helices II-X, suggesting structural or functional constraint in these domains. By contrast, α-helix-I spanning the putative signal peptide was under positive selection, in which amino acid substitutions could alter predicted CD4+ T helper cell epitopes. The central region of PvMSP-7E comprised the 5’-trimorphic and the 3’-dimorphic subregions. Positive selection was identified in the 3’ dimorphic subregion of the central domain. A consensus of intrinsically unstructured or disordered protein was predicted to encompass the entire central domain that contained a number of putative B cell epitopes and putative protein binding regions. Evidences of intragenic recombination were more common in the central region than the remainders of the gene. These results suggest that the extent of sequence variation, recombination events and selective pressures in the PvMSP-7E locus seem to be differentially affected by protein secondary structure.
Affiliation(s)
- Chew Weng Cheng
  - Molecular Biology of Malaria and Opportunistic Parasites Research Unit, Department of Parasitology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Chaturong Putaporntip
  - Molecular Biology of Malaria and Opportunistic Parasites Research Unit, Department of Parasitology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Somchai Jongwutiwes
  - Molecular Biology of Malaria and Opportunistic Parasites Research Unit, Department of Parasitology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
313
Negahdaripour M, Nezafat N, Eslami M, Ghoshoon MB, Shoolian E, Najafipour S, Morowvat MH, Dehshahri A, Erfani N, Ghasemi Y. Structural vaccinology considerations for in silico designing of a multi-epitope vaccine. INFECTION GENETICS AND EVOLUTION 2018; 58:96-109. [DOI: 10.1016/j.meegid.2017.12.008] [Citation(s) in RCA: 70] [Impact Index Per Article: 11.7] [Received: 11/07/2017] [Revised: 12/05/2017] [Accepted: 12/11/2017] [Indexed: 01/26/2023]
314
Cao C, Liu F, Tan H, Song D, Shu W, Li W, Zhou Y, Bo X, Xie Z. Deep Learning and Its Applications in Biomedicine. GENOMICS, PROTEOMICS & BIOINFORMATICS 2018; 16:17-32. [PMID: 29522900 PMCID: PMC6000200 DOI: 10.1016/j.gpb.2017.07.003] [Citation(s) in RCA: 236] [Impact Index Per Article: 39.3] [Received: 06/18/2017] [Revised: 06/18/2017] [Accepted: 07/05/2017] [Indexed: 12/19/2022]
Abstract
Advances in biological and medical technologies have been providing us explosive volumes of biological and physiological data, such as medical images, electroencephalography, genomic and protein sequences. Learning from these data facilitates the understanding of human health and disease. Developed from artificial neural networks, deep learning-based algorithms show great promise in extracting features and learning patterns from complex data. The aim of this paper is to provide an overview of deep learning techniques and some of the state-of-the-art applications in the biomedical field. We first introduce the development of artificial neural network and deep learning. We then describe two main components of deep learning, i.e., deep learning architectures and model optimization. Subsequently, some examples are demonstrated for deep learning applications, including medical image classification, genomic sequence analysis, as well as protein structure classification and prediction. Finally, we offer our perspectives for the future directions in the field of deep learning.
Affiliation(s)
- Chensi Cao
  - CapitalBio Corporation, Beijing 102206, China
- Feng Liu
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
- Hai Tan
  - State Key Lab of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 500040, China
- Deshou Song
  - State Key Lab of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 500040, China
- Wenjie Shu
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
- Weizhong Li
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 500040, China
- Yiming Zhou
  - CapitalBio Corporation, Beijing 102206, China; Department of Biomedical Engineering, Medical Systems Biology Research Center, Tsinghua University School of Medicine, Beijing 100084, China
- Xiaochen Bo
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
- Zhi Xie
  - State Key Lab of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 500040, China
315
Pandey V, Krishnan V, Basak N, Marathe A, Thimmegowda V, Dahuja A, Jolly M, Sachdev A. Molecular modeling and in silico characterization of GmABCC5: a phytate transporter and potential target for low-phytate crops. 3 Biotech 2018; 8:54. [PMID: 29354365 DOI: 10.1007/s13205-017-1053-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 08/04/2017] [Accepted: 12/17/2017] [Indexed: 02/06/2023] Open
Abstract
Designing low-phytate crops without affecting the developmental process in plants had led to the identification of ABCC5 gene in soybean. The GmABCC5 gene was identified and a partial gene sequence was cloned from popular Indian soybean genotype Pusa16. Conserved domains and motifs unique to ABC transporters were identified in the 30 homologous sequences retrieved by BLASTP analysis. The homologs were analyzed for their evolutionary relationship and physiochemical properties. Conserved domains, transmembrane architecture and secondary structure of GmABCC5 were predicted with the aid of computational tools. Analysis identified 53 alpha helices and 31 beta strands, predicting 60% residues in alpha conformation. A three-dimensional (3D) model for GmABCC5 was developed based on 5twv.1.B (Homo sapiens) template homology to gain better insight into its molecular mechanism of transport and sequestration. Spatio-temporal real-time PCR analysis identified mid-to-late seed developmental stages as the time window for the maximum GmABCC5 gene expression, a potential target stage for phytate reduction. Results of this study provide valuable insights into the structural and functional characteristics of GmABCC5, which may be further utilized for the development of nutritionally enriched low-phytate soybean with improved mineral bioavailability.
Affiliation(s)
- Vanita Pandey
  - Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
  - Quality and Basic Sciences, ICAR-Indian Institute of Wheat and Barley Research, Karnal, New Delhi 132 001 India
- Veda Krishnan
  - Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
- Nabaneeta Basak
  - Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
  - Crop Physiology and Biochemistry, ICAR-National Rice Research Institute, Cuttack, 753006 India
- Ashish Marathe
  - Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
- Vinutha Thimmegowda
  - Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
- Anil Dahuja
  - Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
- Monica Jolly
  - Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
- Archana Sachdev
  - Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
316
Strzalka A, Szafran MJ, Strick T, Jakimowicz D. C-terminal lysine repeats in Streptomyces topoisomerase I stabilize the enzyme-DNA complex and confer high enzyme processivity. Nucleic Acids Res 2017; 45:11908-11924. [PMID: 28981718 PMCID: PMC5714199 DOI: 10.1093/nar/gkx827] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 06/20/2017] [Accepted: 09/06/2017] [Indexed: 12/12/2022] Open
Abstract
Streptomyces topoisomerase I (TopA) exhibits exceptionally high processivity. The enzyme, as other actinobacterial topoisomerases I, differs from its bacterial homologs in its C-terminal domain (CTD). Here, bioinformatics analyses established that the presence of lysine repeats is a characteristic feature of actinobacterial TopA CTDs. Streptomyces TopA contains the longest stretch of lysine repeats, which terminate with acidic amino acids. DNA-binding studies revealed that the lysine repeats stabilized the TopA–DNA complex, while single-molecule experiments showed that their elimination impaired enzyme processivity. Streptomyces coelicolor TopA processivity could not be restored by fusion of its N-terminal domain (NTD) with the Escherichia coli TopA CTD. The hybrid protein could not re-establish the distribution of multiple chromosomal copies in Streptomyces hyphae impaired by TopA depletion. We expected that the highest TopA processivity would be required during the growth of multigenomic sporogenic hyphae, and indeed, the elimination of lysine repeats from TopA disturbed sporulation. We speculate that the interaction of the lysine repeats with DNA allows the stabilization of the enzyme–DNA complex, which is additionally enhanced by acidic C-terminal amino acids. The complex stabilization, which may be particularly important for GC-rich chromosomes, enables high enzyme processivity. The high processivity of TopA allows rapid topological changes in multiple chromosomal copies during Streptomyces sporulation.
Affiliation(s)
- Agnieszka Strzalka
  - Faculty of Biotechnology, University of Wroclaw, Joliot-Curie 14A, 50-383 Wroclaw, Poland
- Marcin J Szafran
  - Faculty of Biotechnology, University of Wroclaw, Joliot-Curie 14A, 50-383 Wroclaw, Poland
- Terence Strick
  - Institut Jacques Monod, CNRS UMR 7592, University Paris Diderot, Sorbonne Paris Cite, F-75205 Paris, France
- Dagmara Jakimowicz
  - Faculty of Biotechnology, University of Wroclaw, Joliot-Curie 14A, 50-383 Wroclaw, Poland
317
Abstract
BACKGROUND Gene expression is a key intermediate level through which genotypes lead to a particular trait. Gene expression is affected by various factors including genotypes of genetic variants. With an aim of delineating the genetic impact on gene expression, we build a deep auto-encoder model to assess how well genetic variants contribute to gene expression changes. This new deep learning model is a regression-based predictive model based on the MultiLayer Perceptron and Stacked Denoising Auto-encoder (MLP-SAE). The model is trained using a stacked denoising auto-encoder for feature selection and a multilayer perceptron framework for backpropagation. We further improve the model by introducing dropout to prevent overfitting and improve performance. RESULTS To demonstrate the usage of this model, we apply MLP-SAE to a real genomic dataset with genotypes and gene expression profiles measured in yeast. Our results show that the MLP-SAE model with dropout outperforms other models including Lasso, Random Forests and the MLP-SAE model without dropout. Using the MLP-SAE model with dropout, we show that gene expression quantifications predicted by the model solely based on genotypes align well with true gene expression patterns. CONCLUSION We provide a deep auto-encoder model for predicting gene expression from SNP genotypes. This study demonstrates that deep learning is appropriate for tackling another genomic problem, i.e., building predictive models to understand genotypes' contribution to gene expression. With the emerging availability of richer genomic data, we anticipate that deep learning models will play a bigger role in modeling and interpreting genomics.
Affiliation(s)
- Rui Xie
- Department of Computer Science, University of Missouri at Columbia, Columbia, MO USA
| | - Jia Wen
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, University City Blvd, Charlotte, NC USA
| | - Andrew Quitadamo
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, University City Blvd, Charlotte, NC USA
| | - Jianlin Cheng
- Department of Computer Science, University of Missouri at Columbia, Columbia, MO USA
| | - Xinghua Shi
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, University City Blvd, Charlotte, NC USA
| |
|
318
|
Xie S, Li Z, Hu H. Protein secondary structure prediction based on the fuzzy support vector machine with the hyperplane optimization. Gene 2017; 642:74-83. [PMID: 29104167 DOI: 10.1016/j.gene.2017.11.005] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 09/08/2017] [Revised: 10/29/2017] [Accepted: 11/02/2017] [Indexed: 11/30/2022]
Abstract
The prediction of protein secondary structure is a crucial problem in bioinformatics and related fields. In recent years, machine learning methods have become a valuable tool, achieving satisfactory results. However, the prediction accuracy needs to be improved further. This paper proposes a new method based on an improved fuzzy support vector machine (FSVM) for the prediction of the secondary structure of proteins. Unlike traditional methods of setting the membership function, it first constructs an approximate optimal separating hyperplane by iterating the class centers in the feature space. Sample points close to this hyperplane are then assigned large membership values, while outliers are assigned small membership values according to the K-nearest neighbor method. Some sample points with low membership values are removed, reducing the training time and improving the prediction accuracy. To optimize the prediction results, our method also exploits information on sequence-based structural similarity. We used three databases (RS126, CB513 and data1199) to test this method, achieving 94.2%, 93.1%, 96.7% Q3 accuracy and 91.7%, 89.7%, 94.1% SOV values for the three datasets, respectively. Overall, our method's results are comparable to or often better than those of commonly used methods (Magnan & Baldi, 2014; Sheng et al., 2016) for secondary structure prediction.
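For readers unfamiliar with the Q3 figure reported here, the sketch below computes it as the percentage of residues whose predicted three-state label (H, E or C) matches the observed label. This is the standard definition of Q3, not code from the paper; the function name `q3_accuracy` and the example strings are hypothetical.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues with a matching three-state label (H/E/C)."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for p, o in zip(predicted, observed) if p == o)
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    # Made-up 8-residue example: 7 of 8 states agree.
    print(q3_accuracy("HHHEECCC", "HHHEEECC"))  # 87.5
```

The SOV score quoted alongside Q3 is segment-based and is not covered by this sketch.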
Affiliation(s)
- Shangxin Xie
- School of Science, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, 310018, China
| | - Zhong Li
- School of Science, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, 310018, China.
| | - Hailong Hu
- School of Science, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, 310018, China; School of Science, Zhejiang A&F University, Lin'an, Zhejiang 311300, China
| |
|
319
|
Miguel-Arribas A, Hao JA, Luque-Ortega JR, Ramachandran G, Val-Calvo J, Gago-Córdoba C, González-Álvarez D, Abia D, Alfonso C, Wu LJ, Meijer WJJ. The Bacillus subtilis Conjugative Plasmid pLS20 Encodes Two Ribbon-Helix-Helix Type Auxiliary Relaxosome Proteins That Are Essential for Conjugation. Front Microbiol 2017; 8:2138. [PMID: 29163424 PMCID: PMC5675868 DOI: 10.3389/fmicb.2017.02138] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/03/2017] [Accepted: 10/19/2017] [Indexed: 12/15/2022] Open
Abstract
Bacterial conjugation is the process by which a conjugative element (CE) is transferred horizontally from a donor to a recipient cell via a connecting pore. One of the first steps in the conjugation process is the formation of a nucleoprotein complex at the origin of transfer (oriT), where one of the components of the nucleoprotein complex, the relaxase, introduces a site- and strand-specific nick to initiate the transfer of a single DNA strand into the recipient cell. In most cases, the nucleoprotein complex involves, besides the relaxase, one or more additional proteins, named auxiliary proteins, which are encoded by the CE and/or the host. The conjugative plasmid pLS20 replicates in the Gram-positive Firmicute bacterium Bacillus subtilis. We have recently identified the relaxase gene and the oriT of pLS20, which are separated by a region of almost 1 kb. Here we show that this region contains two auxiliary genes, which we name aux1LS20 and aux2LS20 and which we show are essential for conjugation. Both Aux1LS20 and Aux2LS20 are predicted to contain a Ribbon-Helix-Helix DNA binding motif near their N-terminus. Analyses of the purified proteins show that Aux1LS20 and Aux2LS20 form tetramers and hexamers in solution, respectively, and that they both bind preferentially to oriTLS20, although with different characteristics and specificities. In silico analyses revealed that genes encoding homologs of Aux1LS20 and/or Aux2LS20 are located upstream of almost 400 relaxase genes of the RelLS20 family (MOBL) of relaxases. Thus, Aux1LS20 and Aux2LS20 of pLS20 constitute the founding members of the first two families of auxiliary proteins described for CEs of Gram-positive origin.
Affiliation(s)
- Andrés Miguel-Arribas
- Department of Virology and Microbiology, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Instituto de Biología Molecular “Eladio Viñuela” (CSIC), Autonomous University of Madrid, Madrid, Spain
| | - Jian-An Hao
- Department of Virology and Microbiology, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Instituto de Biología Molecular “Eladio Viñuela” (CSIC), Autonomous University of Madrid, Madrid, Spain
- The Institute of Seawater Desalination and Multipurpose Utilization (SOA), Tianjin, China
| | | | - Gayetri Ramachandran
- Department of Virology and Microbiology, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Instituto de Biología Molecular “Eladio Viñuela” (CSIC), Autonomous University of Madrid, Madrid, Spain
| | - Jorge Val-Calvo
- Department of Virology and Microbiology, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Instituto de Biología Molecular “Eladio Viñuela” (CSIC), Autonomous University of Madrid, Madrid, Spain
| | - César Gago-Córdoba
- Department of Virology and Microbiology, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Instituto de Biología Molecular “Eladio Viñuela” (CSIC), Autonomous University of Madrid, Madrid, Spain
| | - Daniel González-Álvarez
- Department of Virology and Microbiology, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Instituto de Biología Molecular “Eladio Viñuela” (CSIC), Autonomous University of Madrid, Madrid, Spain
| | - David Abia
- Department of Virology and Microbiology, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Instituto de Biología Molecular “Eladio Viñuela” (CSIC), Autonomous University of Madrid, Madrid, Spain
| | - Carlos Alfonso
- Centro de Investigaciones Biológicas (CSIC), Madrid, Spain
| | - Ling J. Wu
- Centre for Bacterial Cell Biology, Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle Upon Tyne, United Kingdom
| | - Wilfried J. J. Meijer
- Department of Virology and Microbiology, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Instituto de Biología Molecular “Eladio Viñuela” (CSIC), Autonomous University of Madrid, Madrid, Spain
| |
|
320
|
Mañas A, Wang S, Nelson A, Li J, Zhao Y, Zhang H, Davis A, Xie B, Maltsev N, Xiang J. The functional domains for Bax∆2 aggregate-mediated caspase 8-dependent cell death. Exp Cell Res 2017; 359:342-355. [PMID: 28807790 PMCID: PMC5718386 DOI: 10.1016/j.yexcr.2017.08.016] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 06/14/2017] [Revised: 08/07/2017] [Accepted: 08/08/2017] [Indexed: 01/09/2023]
Abstract
Bax∆2 is a functional pro-apoptotic Bax isoform having alterations in its N-terminus, but sharing the rest of its sequence with Baxα. Bax∆2 is unable to target mitochondria due to the loss of helix α1. Instead, it forms cytosolic aggregates and activates caspase 8. However, the functional domain(s) responsible for Bax∆2 behavior have remained elusive. Here we show that disruption of helix α1 makes Baxα mimic the behavior of Bax∆2. However, the other alterations in the Bax∆2 N-terminus have no significant impact on aggregation or cell death. We found that the hallmark BH3 domain is necessary but not sufficient for aggregation-mediated cell death. We also noted that the core region shared by Baxα and Bax∆2 is required for the formation of large aggregates, which is essential for Bax∆2 cytotoxicity. However, aggregation by itself is unable to trigger cell death without the C-terminus. Interestingly, the C-terminal helical conformation, not its primary sequence, appears to be critical for caspase 8 recruitment and activation. As Bax∆2 shares core and C-terminal sequences with most Bax isoforms, our results not only reveal a structural basis for Bax∆2-induced cell death, but also imply an intrinsic potential for aggregate-mediated caspase 8-dependent cell death in other Bax family members.
Affiliation(s)
- Adriana Mañas
- Department of Biology, Illinois Institute of Technology, Chicago, IL 60616, USA
| | - Sheng Wang
- Human Genetics Department, Computation Institute, University of Chicago, Chicago, IL 60637, USA
| | - Adam Nelson
- Department of Biology, Illinois Institute of Technology, Chicago, IL 60616, USA
| | - Jiajun Li
- Department of Biology, Illinois Institute of Technology, Chicago, IL 60616, USA
| | - Yu Zhao
- Department of Biology, Illinois Institute of Technology, Chicago, IL 60616, USA
| | - Huaiyuan Zhang
- Department of Biology, Illinois Institute of Technology, Chicago, IL 60616, USA
| | - Aislinn Davis
- Department of Biology, Illinois Institute of Technology, Chicago, IL 60616, USA
| | - Bingqing Xie
- Department of Computer Science, Illinois Institute of Technology, Chicago, IL 60616, USA
| | - Natalia Maltsev
- Human Genetics Department, Computation Institute, University of Chicago, Chicago, IL 60637, USA
| | - Jialing Xiang
- Department of Biology, Illinois Institute of Technology, Chicago, IL 60616, USA.
| |
|
321
|
Wang S, Li Z, Yu Y, Xu J. Folding Membrane Proteins by Deep Transfer Learning. Cell Syst 2017; 5:202-211.e3. [PMID: 28957654 PMCID: PMC5637520 DOI: 10.1016/j.cels.2017.09.001] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 02/22/2017] [Revised: 06/01/2017] [Accepted: 08/29/2017] [Indexed: 01/02/2023]
Abstract
Computational elucidation of membrane protein (MP) structures is challenging, partly due to the lack of sufficient solved structures for homology modeling. Here, we describe a high-throughput deep transfer learning method that first predicts MP contacts by learning from non-MPs and then predicts 3D structure models using the predicted contacts as distance restraints. Tested on 510 non-redundant MPs, our method has contact prediction accuracy at least 0.18 better than existing methods, predicts correct folds for 218 MPs, and generates 3D models with root-mean-square deviation (RMSD) less than 4 and 5 Å for 57 and 108 MPs, respectively. A rigorous blind test in the continuous automated model evaluation project shows that our method predicted high-resolution 3D models for two recent test MPs of 210 residues with RMSD ∼2 Å. We estimated that our method could predict correct folds for 1,345-1,871 reviewed human multi-pass MPs including a few hundred new folds, which should facilitate the discovery of drugs targeting MPs.
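The RMSD thresholds quoted in this abstract (4 and 5 Å) refer to the root-mean-square deviation between corresponding atoms of a model and the native structure. The sketch below shows the basic formula only, assuming the two coordinate sets are already optimally superimposed and listed in the same atom order; the superposition step and the atoms the authors compared are not described in the abstract, and the function name `rmsd` is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumes the structures have already been superimposed; no alignment
    or rotation is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    native = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(rmsd(model, native))  # sqrt(4 / 2), about 1.414
```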
Affiliation(s)
- Sheng Wang
- Toyota Technological Institute at Chicago, Chicago, IL 60637, USA; Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA; Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Zhen Li
- Toyota Technological Institute at Chicago, Chicago, IL 60637, USA; Department of Computer Science, University of Hong Kong, Hong Kong
| | - Yizhou Yu
- Department of Computer Science, University of Hong Kong, Hong Kong
| | - Jinbo Xu
- Toyota Technological Institute at Chicago, Chicago, IL 60637, USA.
| |
|
322
|
Wang S, Sun S, Xu J. Analysis of deep learning methods for blind protein contact prediction in CASP12. Proteins 2017; 86 Suppl 1:67-77. [PMID: 28845538 DOI: 10.1002/prot.25377] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Received: 06/14/2017] [Revised: 08/18/2017] [Accepted: 08/25/2017] [Indexed: 11/08/2022]
Abstract
Here we present the results of protein contact prediction achieved in CASP12 by our RaptorX-Contact server, which is an early implementation of our deep learning method for contact prediction. On a set of 38 free-modeling target domains with a median family size of around 58 effective sequences, our server obtained an average top L/5 long- and medium-range contact accuracy of 47% and 44%, respectively (L = length). A complete implementation has an average accuracy of 59% and 57%, respectively. Our deep learning method formulates contact prediction as a pixel-level image labeling problem and simultaneously predicts all residue pairs of a protein using a combination of two deep residual neural networks, taking as input the residue conservation information, predicted secondary structure and solvent accessibility, contact potential, and coevolution information. Our approach differs from existing methods mainly in (1) formulating contact prediction as a pixel-level image labeling problem instead of an image-level classification problem; (2) simultaneously predicting all contacts of an individual protein to make effective use of contact occurrence patterns; and (3) integrating both one-dimensional and two-dimensional deep convolutional neural networks to effectively learn complex sequence-structure relationships, including high-order residue correlation. This paper discusses the RaptorX-Contact pipeline, both contact prediction and contact-based folding results, and finally the strengths and weaknesses of our method.
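The "top L/5" accuracy quoted in this abstract can be sketched as follows: rank the predicted residue pairs in a given sequence-separation range by predicted probability, keep the best L/5 of them, and report the fraction that are true contacts. The separation cutoffs used here (long-range: at least 24 residues apart; medium-range: 12 to 23) follow the usual CASP convention and are an assumption, since the abstract does not state them; the function name and the toy data are hypothetical.

```python
def top_contact_precision(scores, native, seq_len, k=5, min_sep=24, max_sep=None):
    """Fraction of true contacts among the top seq_len // k predicted pairs.

    scores: dict mapping (i, j) with i < j to a predicted contact probability.
    native: set of (i, j) pairs that are contacts in the experimental structure.
    Only pairs with min_sep <= j - i (and j - i <= max_sep, if given) are ranked.
    """
    candidates = [
        (prob, pair)
        for pair, prob in scores.items()
        if pair[1] - pair[0] >= min_sep
        and (max_sep is None or pair[1] - pair[0] <= max_sep)
    ]
    candidates.sort(key=lambda item: item[0], reverse=True)
    top = candidates[: max(1, seq_len // k)]
    if not top:
        return 0.0
    hits = sum(1 for _, pair in top if pair in native)
    return hits / len(top)


if __name__ == "__main__":
    # Toy protein of length 10; min_sep is lowered to 3 so the example is non-empty.
    scores = {(0, 5): 0.9, (1, 8): 0.8, (2, 3): 0.99, (0, 9): 0.1}
    native = {(0, 5), (2, 3)}
    print(top_contact_precision(scores, native, seq_len=10, k=5, min_sep=3))  # 0.5
```

For medium-range accuracy under the same assumed convention, the call would use `min_sep=12, max_sep=23`.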
Affiliation(s)
- Sheng Wang
- Toyota Technological Institute at Chicago, Chicago, Illinois
| | - Siqi Sun
- Toyota Technological Institute at Chicago, Chicago, Illinois
| | - Jinbo Xu
- Toyota Technological Institute at Chicago, Chicago, Illinois
| |
|
323
|
In silico analysis of protein toxin and bacteriocins from Lactobacillus paracasei SD1 genome and available online databases. PLoS One 2017; 12:e0183548. [PMID: 28837656 PMCID: PMC5570283 DOI: 10.1371/journal.pone.0183548] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 11/03/2016] [Accepted: 08/07/2017] [Indexed: 02/08/2023] Open
Abstract
Lactobacillus paracasei SD1 is a potential probiotic strain due to its ability to survive several conditions in human dental cavities. To ascertain its safety for human use, we therefore performed a comprehensive bioinformatics analysis and characterization of the bacterial protein toxins produced by this strain. We report the complete genome of Lactobacillus paracasei SD1 and its comparison to other Lactobacillus genomes. Additionally, we identify and analyze its protein toxins and antimicrobial proteins using reliable online database resources and establish its phylogenetic relationship with other bacterial genomes. Our investigation suggests that this strain is safe for human use and contains several bacteriocins that confer health benefits to the host. An in silico analysis of protein-protein interactions between the target bacteriocins and the microbial proteins gtfB and luxS of Streptococcus mutans was performed and is discussed here.
|
324
|
Wang S, Ma J, Xu J. AUCpreD: proteome-level protein disorder prediction by AUC-maximized deep convolutional neural fields. Bioinformatics 2017; 32:i672-i679. [PMID: 27587688 DOI: 10.1093/bioinformatics/btw446] [Citation(s) in RCA: 80] [Impact Index Per Article: 11.4] [Indexed: 12/17/2022] Open
Abstract
MOTIVATION: Protein intrinsically disordered regions (IDRs) play an important role in many biological processes. Two key properties of IDRs are that (i) they occur proteome-wide and (ii) only about 6% of residues are disordered, which makes it challenging to predict IDRs accurately. Most IDR prediction methods use sequence profiles to improve accuracy, which prevents their application to proteome-wide prediction since generating sequence profiles is time-consuming. On the other hand, methods that do not use sequence profiles fare much worse than those that do. METHOD: This article formulates IDR prediction as a sequence labeling problem and employs a new machine learning method called Deep Convolutional Neural Fields (DeepCNF) to solve it. DeepCNF is an integration of deep convolutional neural networks (DCNN) and conditional random fields (CRF); it can model not only complex sequence-structure relationships in a hierarchical manner, but also correlation among adjacent residues. To deal with the highly imbalanced order/disorder ratio, instead of training DeepCNF by the widely used maximum-likelihood approach, we develop a novel approach that trains it by maximizing the area under the ROC curve (AUC), which is an unbiased measure for class-imbalanced data. RESULTS: Our experimental results show that our IDR prediction method AUCpreD outperforms existing popular disorder predictors. More importantly, AUCpreD works very well even without sequence profiles, comparing favorably to or even outperforming many methods that use sequence profiles. Therefore, our method works for proteome-wide disorder prediction while yielding similar or better accuracy than the others. AVAILABILITY AND IMPLEMENTATION: http://raptorx2.uchicago.edu/StructurePropertyPred/predict/ CONTACT: wangsheng@uchicago.edu, jinboxu@gmail.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
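To make the AUC measure in this abstract concrete, the sketch below evaluates it directly from its rank interpretation: the probability that a randomly chosen disordered residue receives a higher score than a randomly chosen ordered one, with ties counted as one half. This only evaluates AUC on given scores; it is not the authors' AUC-maximizing training procedure, and the function name `roc_auc` and the toy labels are assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for disordered (positive), 0 for ordered (negative).
    scores: predicted disorder scores, higher meaning more likely disordered.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 0, 0, 0]
    print(roc_auc(labels, [0.9, 0.1, 0.8, 0.95]))  # 2 of 3 pairs ranked correctly
    # A predictor that scores every residue identically gets 0.5, even though
    # calling everything "ordered" would be 75% accurate on these labels.
    print(roc_auc(labels, [0.0, 0.0, 0.0, 0.0]))  # 0.5
```

The second call shows why AUC suits imbalanced data: a trivial predictor cannot look good simply because most residues are ordered.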
Affiliation(s)
- Sheng Wang
- Toyota Technological Institute at Chicago, Chicago, IL, USA Department of Human Genetics, University of Chicago, Chicago, IL, USA
| | - Jianzhu Ma
- Toyota Technological Institute at Chicago, Chicago, IL, USA
| | - Jinbo Xu
- Toyota Technological Institute at Chicago, Chicago, IL, USA
| |
|
325
|
Tyagi R, Tiwari A, Garg VK, Gupta S. Transcriptome wide identification and characterization of starch branching enzyme in finger millet. Bioinformation 2017; 13:179-184. [PMID: 28729759 PMCID: PMC5512855 DOI: 10.6026/97320630013179] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/10/2017] [Accepted: 06/02/2017] [Indexed: 11/23/2022] Open
Abstract
Starch-branching enzymes (SBEs) are one of the four major enzyme classes involved in starch biosynthesis in plants and play an important role in determining the structure and physical properties of starch granules. Multiple SBEs are involved in starch biosynthesis in plants. Finger millet is an important calcium-rich cereal crop belonging to the grass family, and transcriptome data of its developing spikes are available on NCBI. In this study, an attempt was made to identify the gene sequence of the starch branching enzyme, annotate it, and submit it for further use. The rice SBE sequence was taken as the reference, and different in silico tools were used to characterize the sequence. Four domains were found in the finger millet starch branching enzyme: the alpha amylase catalytic domain from 925 to 2172 with E value 0, the N-terminal Early set domain from 634 to 915 with E value 1.62e-42, the alpha amylase C-terminal all-beta domain from 2224 to 2511 with E value 5.80e-24, and the 1,4-alpha-glucan-branching enzyme domain from 421 to 2517 with E value 0. Major binding interactions with GLC (alpha-D-glucose), CA (calcium ion), GOL (glycerol), TRS (2-amino-2-hydroxymethylpropane-1,3-diol), MG (magnesium ion) and FLC (citrate anion) were found with different residues. A phylogenetic study of the finger millet SBE with six species of the grass family formed two clusters, A and B. In cluster A, finger millet grouped closely with Oryza sativa, Setaria italica, Sorghum bicolor and Zea mays, while cluster B was formed by Triticum aestivum and Brachypodium distachyon. The nucleotide sequence of the finger millet SBE was submitted to NCBI under accession no. KY648913, and the protein structure of the finger millet SBE was submitted to PMDB under PMDB id PM0080938. This research presents a comparative overview of finger millet SBE and includes its properties, structural and functional characteristics, and recent developments on its post-translational regulation.
Affiliation(s)
- Rajhans Tyagi
- Uttarakhand Technical University, Dehradun, Uttarakhand, 248007, India
| | - Apoorv Tiwari
- Sam Higginbottom University of Agriculture, Technology And Sciences (SHUATS), Allahabad, 211007, India
- Dept of Molecular Biology and Genetic Engineering, G.B. Pant University of Agriculture and Technology, Pantnagar, Uttarakhand, 263145, India
| | - Vijay Kumar Garg
- Sam Higginbottom University of Agriculture, Technology And Sciences (SHUATS), Allahabad, 211007, India
| | - Sanjay Gupta
- Uttarakhand Technical University, Dehradun, Uttarakhand, 248007, India
| |
|
326
|
Dreßler L, Michel F, Thondorf I, Mansfeld J, Golbik R, Ulbrich-Hofmann R. Metal ions and phosphatidylinositol 4,5-bisphosphate as interacting effectors of α-type plant phospholipase D. PHYTOCHEMISTRY 2017; 138:57-64. [PMID: 28283189 DOI: 10.1016/j.phytochem.2017.02.024] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 12/12/2016] [Revised: 02/02/2017] [Accepted: 02/22/2017] [Indexed: 06/06/2023]
Abstract
Plant phospholipases D (PLD) are typically characterized by a C2 domain with at least two Ca2+ binding sites. In vitro, the predominantly expressed α-type PLDs need 20-100 mM CaCl2 for optimum activity, whereas the essential activator of β- or γ-type PLDs, phosphatidylinositol 4,5-bisphosphate (PIP2), plays a secondary role. In the present paper, we have studied the interplay between PIP2 and metal ion activation of the well-known α-type PLD from cabbage (PLDα). With mixed micelles containing phosphatidyl-p-nitrophenol as substrate, PIP2 concentrations in the nanomolar range are able to activate the enzyme in addition to the essential Ca2+ activation. Mg2+ ions are able to replace Ca2+ ions but they do not activate PLDα. Rather, they abolish the activation of the enzyme by Ca2+ ions in the absence, but not in the presence, of PIP2. The presence of PIP2 causes a shift in the pH optimum of PLDα activity to the acidic range. Fluorescence measurements in which Ca2+ was replaced by Tb3+ ions confirmed the presence of two metal ion-binding sites, of which the one with lower affinity proved crucial for PLD activation. Moreover, we have generated a homology model of the C2 domain of this enzyme, which was used for Molecular Dynamics (MD) simulations and docking studies. As is common for C2 domains, it shows two antiparallel β-sheets consisting of four β-strands each and loop regions that harbor two Ca2+ binding sites. Based on the findings of the MD simulation, one of the bound Ca2+ ions is coordinated by five amino acid residues. The second Ca2+ ion induces a loop movement upon its binding to three amino acid residues. Docking studies with PIP2 reveal, in addition to the previously postulated PIP2-binding site in the middle of the β-sheet structure, another PIP2-binding site near the two Ca2+ ions, which is in accordance with the experimental interplay of PIP2, Ca2+ and Mg2+ ions.
Affiliation(s)
- Lars Dreßler
- Institute of Biochemistry and Biotechnology, Martin-Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle, Germany
| | - Florian Michel
- Institute of Biochemistry and Biotechnology, Martin-Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle, Germany
| | - Iris Thondorf
- Institute of Biochemistry and Biotechnology, Martin-Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle, Germany
| | - Johanna Mansfeld
- Institute of Biochemistry and Biotechnology, Martin-Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle, Germany
| | - Ralph Golbik
- Institute of Biochemistry and Biotechnology, Martin-Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle, Germany
| | - Renate Ulbrich-Hofmann
- Institute of Biochemistry and Biotechnology, Martin-Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle, Germany.
| |
|
327
|
Chitranshi N, Dheer Y, Wall RV, Gupta V, Abbasi M, Graham SL, Gupta V. Computational analysis unravels novel destructive single nucleotide polymorphisms in the non-synonymous region of human caveolin gene. GENE REPORTS 2017. [DOI: 10.1016/j.genrep.2016.08.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
328
|
Wang X, Stapleton JA, Klesmith JR, Hewlett EL, Whitehead TA, Maynard JA. Fine Epitope Mapping of Two Antibodies Neutralizing the Bordetella Adenylate Cyclase Toxin. Biochemistry 2017; 56:1324-1336. [PMID: 28177609 DOI: 10.1021/acs.biochem.6b01163] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Abstract
Adenylate cyclase toxin (ACT) is an important Bordetella pertussis virulence factor that is not included in current acellular pertussis vaccines. We previously demonstrated that immunization with the repeat-in-toxin (RTX) domain of ACT elicits neutralizing antibodies in mice and discovered the first two antibodies to neutralize ACT activities by occluding the receptor-binding site. Here, we fully characterize these antibodies and their epitopes. Both antibodies bind ACT with low nanomolar affinity and cross-react with ACT homologues produced by B. parapertussis and B. bronchiseptica. Antibody M1H5 binds B. pertussis RTX751 ∼100-fold tighter than RTX751 from the other two species, while antibody M2B10 has similar affinity for all three variants. To initially map the antibody epitopes, we generated a series of ACT chimeras and truncation variants, which implicated the repeat blocks II-III. To identify individual epitope residues, we displayed randomly mutated RTX751 libraries on yeast and isolated clones with decreased antibody binding by flow cytometry. Next-generation sequencing identified candidate epitope residues on the basis of enrichment of clones with mutations at specific positions. These epitopes form two adjacent surface patches on a predicted structural model of the RTX751 domain, one for each antibody. Notably, the cellular receptor also binds within blocks II-III and shares at least one residue with the M1H5 epitope. The RTX751 model supports the notion that the antibody and receptor epitopes overlap. These data provide insight into mechanisms of ACT neutralization and guidance for engineering more stable RTX variants that may be more appropriate vaccine antigens.
Affiliation(s)
- Xianzhe Wang
- Department of Chemical Engineering, University of Texas at Austin , Austin, Texas 78712, United States
| | - James A Stapleton
- Department of Chemical Engineering and Materials Science, Michigan State University , East Lansing, Michigan 48824, United States
| | - Justin R Klesmith
- Department of Biochemistry and Molecular Biology, Michigan State University , East Lansing, Michigan 48824, United States
| | - Erik L Hewlett
- Department of Medicine, University of Virginia , Charlottesville, Virginia 22906, United States
| | - Timothy A Whitehead
- Department of Chemical Engineering and Materials Science, Michigan State University , East Lansing, Michigan 48824, United States.,Department of Biosystems and Agricultural Engineering, Michigan State University , East Lansing, Michigan 48824, United States
| | - Jennifer A Maynard
- Department of Chemical Engineering, University of Texas at Austin , Austin, Texas 78712, United States
| |
|
329
|
Wu W, Wang Z, Cong P, Li T. Accurate prediction of protein relative solvent accessibility using a balanced model. BioData Min 2017; 10:1. [PMID: 28127402 PMCID: PMC5259893 DOI: 10.1186/s13040-016-0121-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 03/31/2016] [Accepted: 12/27/2016] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND: Protein relative solvent accessibility provides insight into understanding protein structure and function. Prediction of protein relative solvent accessibility is often the first stage of predicting other protein properties. Recent predictors of relative solvent accessibility discriminate against exposed regions as compared with buried regions, resulting in higher prediction accuracy associated with buried regions relative to exposed regions. METHODS: Here, we propose a more accurate and balanced predictor of protein relative solvent accessibility. First, we collected known proteins in three subsets according to sequence length and constructed a balanced dataset after reducing redundancy within each subset. Next, we measured the performance associated with different variables and variable combinations to determine the best variable combination. Finally, a predictor called BMRSA was constructed for modelling and prediction, which used the balanced set as the training set, the position-specific scoring matrix, predicted secondary structure, buried-exposed profile, and length of a query sequence as variables, and the conditional random field as the machine-learning method. RESULTS: BMRSA performance on test sets confirmed that our approach improved prediction accuracy relative to state-of-the-art approaches and was balanced in its comparison of buried and exposed regions. Our method is valuable when higher levels of accuracy in predicting exposed-residue states are required. The BMRSA is available at: http://cheminfo.tongji.edu.cn:8080/BMRSA/.
Affiliation(s)
- Wei Wu
- Department of Chemistry, Tongji University, Shanghai, China
| | - Zhiheng Wang
- Department of Chemistry, Tongji University, Shanghai, China
| | - Peisheng Cong
- Department of Chemistry, Tongji University, Shanghai, China
| | - Tonghua Li
- Department of Chemistry, Tongji University, Shanghai, China
| |
|
330
|
Wang S, Sun S, Li Z, Zhang R, Xu J. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model. PLoS Comput Biol 2017; 13:e1005324. [PMID: 28056090 PMCID: PMC5249242 DOI: 10.1371/journal.pcbi.1005324] [Citation(s) in RCA: 553] [Impact Index Per Article: 79.0] [Received: 09/14/2016] [Revised: 01/20/2017] [Accepted: 12/20/2016] [Indexed: 12/02/2022] Open
Abstract
Motivation: Protein contacts contain key information for understanding protein structure and function, and thus contact prediction from sequence is an important problem. Exciting progress has recently been made on this problem, but the predicted contacts for proteins without many sequence homologs are still of low quality and not very useful for de novo structure prediction.
Method: This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformations of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformations of pairwise information, including the output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationships and thus obtain higher-quality contact prediction regardless of how many sequence homologs are available for the proteins in question.
Results: Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. Our contact-assisted models also have much better quality than template-based models, especially for membrane proteins. The 3D models built from our contact prediction have TMscore>0.5 for 208 of the 398 membrane proteins, while those from homology modeling have TMscore>0.5 for only 10 of them. Further, even if trained mostly on soluble proteins, our deep learning method works very well on membrane proteins. In the recent blind CAMEO benchmark, our fully automated web server implementing this method successfully folded 6 targets with a new fold and only 0.3L-2.3L effective sequence homologs, including one β protein of 182 residues, one α+β protein of 125 residues, one α protein of 140 residues, one α protein of 217 residues, one α/β protein of 260 residues and one α protein of 462 residues. Our method also achieved the highest F1 score on free-modeling targets in the latest CASP (Critical Assessment of Structure Prediction), although it was not fully implemented back then.
Availability: http://raptorx.uchicago.edu/ContactMap/
Protein contact prediction and contact-assisted folding have made good progress due to direct evolutionary coupling analysis (DCA). However, DCA is effective on only some proteins with a very large number of sequence homologs. To further improve contact prediction, we borrow ideas from deep learning, which has recently revolutionized object recognition, speech recognition and the game of Go. Our deep learning method can model complex sequence-structure relationships and high-order correlation (i.e., contact occurrence patterns) and thus improve contact prediction accuracy greatly. Our test results show that our method greatly outperforms the state-of-the-art methods regardless of how many sequence homologs are available for the protein in question. Ab initio folding guided by our predicted contacts may fold many more test proteins than the other contact predictors. Our contact-assisted 3D models also have much better quality than homology models built from the training proteins, especially for membrane proteins. One interesting finding is that, even when trained mostly with soluble proteins, our method performs very well on membrane proteins. A recent blind CAMEO test confirms that our method can fold large proteins with a new fold and only a small number of sequence homologs.
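The "top L" and "top L/10" long-range accuracies quoted in this abstract can be illustrated with a short sketch. It assumes the common convention that a long-range pair has a sequence separation of at least 24 residues, and the function name `top_long_range_accuracy` and the toy matrices are hypothetical; the paper's own evaluation code may differ in details.

```python
def top_long_range_accuracy(pred_prob, native_contact, k=1, min_separation=24):
    """Fraction of the top L/k predicted long-range pairs that are native contacts.

    pred_prob      : L x L nested list of predicted contact probabilities
    native_contact : L x L nested list of booleans (True = native contact)
    min_separation : minimum |i - j| for a pair to count as long-range
                     (24 is an assumed convention, not taken from the abstract)
    """
    length = len(pred_prob)
    # Collect every long-range residue pair once (upper triangle only).
    candidates = [
        (pred_prob[i][j], i, j)
        for i in range(length)
        for j in range(i + min_separation, length)
    ]
    # Rank pairs from most to least confident.
    candidates.sort(key=lambda item: item[0], reverse=True)
    top = candidates[: max(1, length // k)]
    if not top:
        return 0.0
    hits = sum(1 for _, i, j in top if native_contact[i][j])
    return hits / len(top)


# Toy example (hypothetical): a 30-residue protein with three native
# long-range contacts, two of which receive high predicted probability.
L = 30
pred = [[0.0] * L for _ in range(L)]
native = [[False] * L for _ in range(L)]
for i, j in [(0, 25), (1, 27), (2, 29)]:
    native[i][j] = native[j][i] = True
for (i, j), p in {(0, 25): 0.9, (1, 27): 0.8, (3, 28): 0.7}.items():
    pred[i][j] = pred[j][i] = p

top_l10 = top_long_range_accuracy(pred, native, k=10)
print(top_l10)
```

Here L/10 = 3, so the three highest-scoring long-range pairs are evaluated and two of them are native contacts.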
Affiliation(s)
- Sheng Wang, Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
- Siqi Sun, Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
- Zhen Li, Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
- Renyu Zhang, Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
- Jinbo Xu, Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
- * E-mail:
|
331
|
Skolnick J, Zhou H. Why Is There a Glass Ceiling for Threading Based Protein Structure Prediction Methods? J Phys Chem B 2016; 121:3546-3554. [PMID: 27748116 DOI: 10.1021/acs.jpcb.6b09517] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 12/29/2022]
Abstract
Despite their different implementations, comparison of the best threading approaches to the prediction of evolutionarily distant protein structures reveals that they tend to succeed or fail on the same protein targets. This is true despite the fact that the structural template library has good templates for all cases. Thus, a key question is why certain protein structures are threadable while others are not. Comparison with threading results on a set of artificial sequences selected for stability further argues that the failure of threading is due to the nature of the protein structures themselves. Using a new contact-map-based alignment algorithm, we demonstrate that certain folds are highly degenerate in that they can have very similar coarse-grained fractions of native contacts aligned and yet differ significantly from the native structure. For threadable proteins, this is not the case. Thus, contemporary threading approaches appear to have reached a plateau, and new approaches to structure prediction are required.
Affiliation(s)
- Jeffrey Skolnick, Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive Northwest, Atlanta, Georgia 30318, United States
- Hongyi Zhou, Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive Northwest, Atlanta, Georgia 30318, United States
|
332
|
AUC-Maximized Deep Convolutional Neural Fields for Protein Sequence Labeling. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES : EUROPEAN CONFERENCE, ECML PKDD ... : PROCEEDINGS. ECML PKDD (CONFERENCE) 2016; 9852:1-16. [PMID: 28884168 PMCID: PMC5584645 DOI: 10.1007/978-3-319-46227-1_1] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 02/01/2023]
Abstract
Deep Convolutional Neural Networks (DCNN) have shown excellent performance in a variety of machine learning tasks. This paper presents Deep Convolutional Neural Fields (DeepCNF), an integration of DCNN with Conditional Random Field (CRF), for sequence labeling with an imbalanced label distribution. The widely used training methods, such as maximum-likelihood and maximum labelwise accuracy, do not work well on imbalanced data. To handle this, we present a new training algorithm called maximum-AUC for DeepCNF. That is, we train DeepCNF by directly maximizing the empirical Area Under the ROC Curve (AUC), which is an unbiased measurement for imbalanced data. To achieve this, we formulate AUC in a pairwise ranking framework, approximate it by a polynomial function and then apply a gradient-based procedure to optimize it. Our experimental results confirm that maximum-AUC greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction, since their label distributions are highly imbalanced, and performs similarly to the other two training methods on solvent accessibility prediction, which has three equally distributed labels. Furthermore, our experimental results show that our AUC-trained DeepCNF models greatly outperform existing popular predictors on these three tasks. The data and software related to this paper are available at https://github.com/realbigws/DeepCNF_AUC.
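The pairwise ranking view of AUC mentioned in this abstract can be sketched as follows: the empirical AUC is the fraction of (positive, negative) example pairs in which the positive example receives the higher score. The function name `pairwise_auc` and the toy scores are hypothetical, and the sketch computes only the exact, non-differentiable quantity; it does not reproduce the polynomial approximation or the gradient-based training used for DeepCNF.

```python
def pairwise_auc(scores, labels):
    """Empirical AUC as a pairwise ranking statistic.

    Counts, over all (positive, negative) pairs, how often the positive
    example is scored higher; ties contribute one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy imbalanced data (hypothetical): 2 positives among 6 examples.
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
labels = [1, 0, 1, 0, 0, 0]
auc = pairwise_auc(scores, labels)
print(auc)
```

Because the comparison inside the double loop is a step function of the scores, it has no useful gradient, which is consistent with the abstract's choice to replace it with a smooth approximation before optimizing.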
|