1
Shishparenok AN, Gladilina YA, Zhdanov DD. Engineering and Expression Strategies for Optimization of L-Asparaginase Development and Production. Int J Mol Sci 2023; 24:15220. [PMID: 37894901 PMCID: PMC10607044 DOI: 10.3390/ijms242015220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 10/11/2023] [Accepted: 10/13/2023] [Indexed: 10/29/2023] Open
Abstract
Genetic engineering for heterologous expression has advanced in recent years. Model systems such as Escherichia coli, Bacillus subtilis and Pichia pastoris are often used as host microorganisms for the enzymatic production of L-asparaginase, an enzyme widely used in the clinic for the treatment of leukemia and in bakeries for the reduction of acrylamide. Newly developed recombinant L-asparaginase (L-ASNase) may have a low affinity for asparagine, reduced catalytic activity, low stability, and increased glutaminase activity or immunogenicity. Some successful commercial preparations of L-ASNase are now available. Therefore, obtaining novel L-ASNases with improved properties suitable for food or clinical applications remains a challenge. The combination of rational design and/or directed evolution and heterologous expression has been used to create enzymes with desired characteristics. Computer design, combined with other methods, could make it possible to generate mutant libraries of novel L-ASNases without costly and time-consuming efforts. In this review, we summarize the strategies and approaches for obtaining and developing L-ASNase with improved properties.
Affiliation(s)
- Anastasiya N. Shishparenok
- Laboratory of Medical Biotechnology, Institute of Biomedical Chemistry, Pogodinskaya St. 10/8, 119121 Moscow, Russia
- Yulia A. Gladilina
- Laboratory of Medical Biotechnology, Institute of Biomedical Chemistry, Pogodinskaya St. 10/8, 119121 Moscow, Russia
- Dmitry D. Zhdanov
- Laboratory of Medical Biotechnology, Institute of Biomedical Chemistry, Pogodinskaya St. 10/8, 119121 Moscow, Russia
- Department of Biochemistry, Peoples’ Friendship University of Russia named after Patrice Lumumba (RUDN University), Miklukho-Maklaya St. 6, 117198 Moscow, Russia
2
Chen L, Sun ZL. PmliHFM: Predicting Plant miRNA-lncRNA Interactions with Hybrid Feature Mining Network. Interdiscip Sci 2023; 15:44-54. [PMID: 36223068 DOI: 10.1007/s12539-022-00540-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Revised: 09/27/2022] [Accepted: 09/27/2022] [Indexed: 11/07/2022]
Abstract
Due to the crucial role of interactions between microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) in biological processes, the study of their biological functions is necessary. So far, various computational methods have been employed to predict miRNA-lncRNA interactions, which compensates for the inadequacy of biological experiments. However, the existing methods do not consider the differences between miRNA and lncRNA in feature extraction. In this paper, we propose a hybrid feature mining network, named PmliHFM, for predicting plant miRNA-lncRNA interactions. Firstly, miRNA and lncRNA with different sequence lengths are encoded by different encodings, which can reduce the loss of information caused by using the same coding approach. Then, a hybrid feature mining network is designed to adapt to different encoding methods and extract more useful feature information than a single network. Finally, an ensemble module is utilized to integrate the training results of the hybrid feature mining network, while a prediction module is employed to determine whether there are interactions. By testing on multiple test sets, PmliHFM outperforms several state-of-the-art approaches. The results show that the AUC of PmliHFM achieves 0.8%, 3.1% and 0.4% improvement respectively on three balanced datasets, and achieves 2.1% and 1.8% improvement respectively on two imbalanced datasets. These experiments demonstrate the feasibility of the proposed method.
Affiliation(s)
- Lin Chen
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, Anhui, China
- School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, Anhui, China
- Zhan-Li Sun
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, Anhui, China
- School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, Anhui, China
3
Protein-protein interaction and non-interaction predictions using gene sequence natural vector. Commun Biol 2022; 5:652. [PMID: 35780196 PMCID: PMC9250521 DOI: 10.1038/s42003-022-03617-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/04/2022] [Accepted: 06/21/2022] [Indexed: 12/02/2022] Open
Abstract
Predicting protein–protein interaction and non-interaction are two important different aspects of multi-body structure predictions, which provide vital information about protein function. Some computational methods have recently been developed to complement experimental methods, but still cannot effectively detect real non-interacting protein pairs. We proposed a gene sequence-based method, named NVDT (Natural Vector combine with Dinucleotide and Triplet nucleotide), for the prediction of interaction and non-interaction. For protein–protein non-interactions (PPNIs), the proposed method obtained accuracies of 86.23% for Homo sapiens and 85.34% for Mus musculus, and it performed well on three types of non-interaction networks. For protein-protein interactions (PPIs), we obtained accuracies of 99.20, 94.94, 98.56, 95.41, and 94.83% for Saccharomyces cerevisiae, Drosophila melanogaster, Helicobacter pylori, Homo sapiens, and Mus musculus, respectively. Furthermore, NVDT outperformed established sequence-based methods and demonstrated high prediction results for cross-species interactions. NVDT is expected to be an effective approach for predicting PPIs and PPNIs. Protein-protein non-interactions and interactions are distinguished and predicted by gene sequence using single nucleotide and contiguous nucleotides combined with machine learning models.
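As a rough illustration of the kind of sequence feature the abstract above refers to, the following minimal sketch computes a single-nucleotide natural vector. It assumes the commonly used 12-dimensional definition (per-nucleotide count, mean position, and normalized second central moment); the abstract does not spell this out, and the function name `natural_vector` is hypothetical. The dinucleotide and triplet extensions of NVDT and its machine learning models are not reproduced here.

```python
from typing import List


def natural_vector(seq: str, alphabet: str = "ACGT") -> List[float]:
    """Return [counts..., mean positions..., second moments...] for a sequence.

    Assumed definition (not taken from the abstract): for each letter k with
    1-based positions p_1..p_nk in a sequence of length n,
      count  = n_k
      mean   = sum(p_i) / n_k
      moment = sum((p_i - mean)^2) / (n_k * n)
    Letters that never occur contribute zeros.
    """
    seq = seq.upper()
    n = len(seq)
    counts: List[float] = []
    means: List[float] = []
    moments: List[float] = []
    for letter in alphabet:
        positions = [i + 1 for i, ch in enumerate(seq) if ch == letter]
        n_k = len(positions)
        if n_k == 0:
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu_k = sum(positions) / n_k
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


# Example call on a short toy sequence.
print(natural_vector("AAT"))
```

A fixed-length vector of this kind can be fed to a standard classifier regardless of sequence length, which is presumably why such descriptors suit pairwise interaction prediction.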
4
Martinez-Seidel F, Beine-Golovchuk O, Hsieh YC, Eshraky KE, Gorka M, Cheong BE, Jimenez-Posada EV, Walther D, Skirycz A, Roessner U, Kopka J, Pereira Firmino AA. Spatially Enriched Paralog Rearrangements Argue Functionally Diverse Ribosomes Arise during Cold Acclimation in Arabidopsis. Int J Mol Sci 2021; 22:6160. [PMID: 34200446 PMCID: PMC8201131 DOI: 10.3390/ijms22116160] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/30/2021] [Revised: 05/23/2021] [Accepted: 06/01/2021] [Indexed: 12/15/2022] Open
Abstract
Ribosome biogenesis is essential for plants to successfully acclimate to low temperature. Without dedicated steps supervising the 60S large subunits (LSUs) maturation in the cytosol, e.g., Rei-like (REIL) factors, plants fail to accumulate dry weight and fail to grow at suboptimal low temperatures. Around REIL, the final 60S cytosolic maturation steps include proofreading and assembly of functional ribosomal centers such as the polypeptide exit tunnel and the P-Stalk, respectively. In consequence, these ribosomal substructures and their assembly, especially during low temperatures, might be changed and provoke the need for dedicated quality controls. To test this, we blocked ribosome maturation during cold acclimation using two independent reil double mutant genotypes and tested changes in their ribosomal proteomes. Additionally, we normalized our mutant datasets using as a blank the cold responsiveness of a wild-type Arabidopsis genotype. This allowed us to neglect any reil-specific effects that may happen due to the presence or absence of the factor during LSU cytosolic maturation, thus allowing us to test for cold-induced changes that happen in the early nucleolar biogenesis. As a result, we report that cold acclimation triggers a reprogramming in the structural ribosomal proteome. The reprogramming alters the abundance of specific RP families and/or paralogs in non-translational LSU and translational polysome fractions, a phenomenon known as substoichiometry. Next, we tested whether the cold-substoichiometry was spatially confined to specific regions of the complex. In terms of RP proteoforms, we report that remodeling of ribosomes after a cold stimulus is significantly constrained to the polypeptide exit tunnel (PET), i.e., REIL factor binding and functional site. In terms of RP transcripts, cold acclimation induces changes in RP families or paralogs that are significantly constrained to the P-Stalk and the ribosomal head. 
The three modulated substructures represent possible targets of mechanisms that may constrain translation by controlled ribosome heterogeneity. We propose that non-random ribosome heterogeneity controlled by specialized biogenesis mechanisms may contribute to a preferential or ultimately even rigorous selection of transcripts needed for rapid proteome shifts and successful acclimation.
Affiliation(s)
- Federico Martinez-Seidel
- Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- School of BioSciences, University of Melbourne, Parkville, VIC 3010, Australia
- Olga Beine-Golovchuk
- Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Heidelberg University, Biochemie-Zentrum, Nuclear Pore Complex and Ribosome Assembly, 69120 Heidelberg, Germany
- Yin-Chen Hsieh
- Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Institute for Arctic and Marine Biology, UiT Arctic University of Norway, 9037 Tromsø, Norway
- Kheloud El Eshraky
- Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Michal Gorka
- Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Bo-Eng Cheong
- Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- School of BioSciences, University of Melbourne, Parkville, VIC 3010, Australia
- Biotechnology Research Institute, Universiti Malaysia Sabah, Jalan UMS, 88400 Kota Kinabalu, Malaysia
- Erika V. Jimenez-Posada
- Grupo de Biotecnología-Productos Naturales, Universidad Tecnológica de Pereira, Pereira 660003, Colombia
- Emerging Infectious Diseases and Tropical Medicine Research Group—Sci-Help, Pereira 660009, Colombia
- Dirk Walther
- Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Aleksandra Skirycz
- Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Ute Roessner
- School of BioSciences, University of Melbourne, Parkville, VIC 3010, Australia
- Joachim Kopka
- Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Alexandre Augusto Pereira Firmino
- Willmitzer Department, Max-Planck-Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
5
Patil S, Kondabagil K. Coevolutionary and Phylogenetic Analysis of Mimiviral Replication Machinery Suggest the Cellular Origin of Mimiviruses. Mol Biol Evol 2021; 38:2014-2029. [PMID: 33570580 PMCID: PMC8097291 DOI: 10.1093/molbev/msab003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022] Open
Abstract
Mimivirus is one of the most complex and largest viruses known. The origin and evolution of Mimivirus and other giant viruses have been a subject of intense study in the last two decades. The two prevailing hypotheses on the origin of Mimivirus and other viruses are the reduction hypothesis, which posits that viruses emerged from modern unicellular organisms, and the virus-first hypothesis, which proposes viruses as relics of precellular forms of life. In this study, to gain insights into the origin of Mimivirus, we have carried out extensive phylogenetic, correlation, and multidimensional scaling analyses of the putative proteins involved in the replication of its 1.2-Mb large genome. Correlation analysis and multidimensional scaling methods were validated using bacteriophage, bacteria, archaea, and eukaryotic replication proteins before applying them to Mimivirus. We show that a large fraction of mimiviral replication proteins, including polymerase B, clamp, and clamp loaders, are of eukaryotic origin and are coevolving. Although phylogenetic analysis places some components along the lineages of phage and bacteria, we show that all the replication-related genes have been homogenized and are under purifying selection. Collectively, our analysis supports the idea that Mimivirus originated from a complex cellular ancestor. We hypothesize that Mimivirus has largely retained complex replication machinery reminiscent of its progenitor while losing most of the other genes related to processes such as metabolism and translation.
Affiliation(s)
- Supriya Patil
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai, Maharashtra, India
- Kiran Kondabagil
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai, Maharashtra, India
6
A novel entropy-based mapping method for determining the protein-protein interactions in viral genomes by using coevolution analysis. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2020.102359] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022]
7
Alakus TB, Turkoglu I. A Novel Protein Mapping Method for Predicting the Protein Interactions in COVID-19 Disease by Deep Learning. Interdiscip Sci 2021; 13:44-60. [PMID: 33433784 PMCID: PMC7801232 DOI: 10.1007/s12539-020-00405-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/06/2020] [Revised: 11/23/2020] [Accepted: 11/28/2020] [Indexed: 12/11/2022]
Abstract
The new type of corona virus (SARS-COV-2) emerging in Wuhan, China has spread rapidly to the world and has become a pandemic. In addition to having a significant impact on daily life, it also shows its effect in different areas, including public health and economy. Currently, there is no vaccine or antiviral drug available to prevent the COVID-19 disease. Therefore, determination of protein interactions of new types of corona virus is vital in clinical studies, drug therapy, identification of preclinical compounds and protein functions. Protein–protein interactions are important to examine protein functions and pathways involved in various biological processes and to determine the cause and progression of diseases. Various high-throughput experimental methods have been used to identify protein–protein interactions in organisms, yet, there is still a huge gap in specifying all possible protein interactions in an organism. In addition, since the experimental methods used include cloning, labeling, affinity purification mass spectrometry, the processes take a long time. Determining these interactions with artificial intelligence-based methods rather than experimental approaches may help to identify protein functions faster. Thus, protein–protein interaction prediction using deep-learning algorithms has been employed in conjunction with experimental method to explore new protein interactions. However, to predict protein interactions with artificial intelligence techniques, protein sequences need to be mapped. There are various types and numbers of protein-mapping methods in the literature. In this study, we wanted to contribute to the literature by proposing a novel protein-mapping method based on the AVL tree. The proposed method was inspired by the fast search performance on the dictionary structure of AVL tree and was used to verify the protein interactions between SARS-COV-2 virus and human. 
First, protein sequences were mapped by both the proposed method and various protein-mapping methods. Then, the mapped protein sequences were normalized and classified by bidirectional recurrent neural networks. The performance of the proposed method was evaluated with accuracy, f1-score, precision, recall, and AUC scores. Our results indicated that our mapping method predicts the protein interactions between SARS-COV-2 virus proteins and human proteins at an accuracy of 97.76%, precision of 97.60%, recall of 98.33%, f1-score of 79.42%, and with AUC 89% in average.
Affiliation(s)
- Talha Burak Alakus
- Faculty of Engineering, Department of Software Engineering, Kirklareli University, 39000, Kirklareli, Turkey
- Ibrahim Turkoglu
- Faculty of Technology, Department of Software Engineering, Firat University, 23119, Elazig, Turkey
8
Hu L, Hu P, Luo X, Yuan X, You ZH. Incorporating the Coevolving Information of Substrates in Predicting HIV-1 Protease Cleavage Sites. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:2017-2028. [PMID: 31056514 DOI: 10.1109/tcbb.2019.2914208] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 06/09/2023]
Abstract
Human immunodeficiency virus 1 (HIV-1) protease (PR) plays a crucial role in the maturation of the virus. The study of substrate specificity of HIV-1 PR as a new endeavor strives to increase our ability to understand how HIV-1 PR recognizes its various cleavage sites. To predict HIV-1 PR cleavage sites, most of the existing approaches have been developed solely based on the homogeneity of substrate sequence information with supervised classification techniques. Although efficient, these approaches are limited in their ability to explain their results and probably provide few insights into the mechanisms by which HIV-1 PR cleaves the substrates in a site-specific manner. In this work, a coevolutionary pattern-based prediction model for HIV-1 PR cleavage sites, namely EvoCleave, is proposed by integrating the coevolving information obtained from substrate sequences with a linear SVM classifier. The experimental results showed that EvoCleave yielded a very promising performance in terms of ROC analysis and f-measure. We also prospectively assessed the biological significance of coevolutionary patterns by applying them to study three fundamental issues of HIV-1 PR cleavage sites. The analysis results demonstrated that the coevolutionary patterns offered valuable insights into the understanding of substrate specificity of HIV-1 PR.
9
Han Y, Cheng L, Sun W. Analysis of Protein-Protein Interaction Networks through Computational Approaches. Protein Pept Lett 2020; 27:265-278. [PMID: 31692419 DOI: 10.2174/0929866526666191105142034] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/29/2019] [Revised: 05/08/2019] [Accepted: 09/26/2019] [Indexed: 01/02/2023]
Abstract
The interactions among proteins and genes are extremely important for cellular functions. Molecular interactions at protein or gene levels can be used to construct interaction networks in which the interacting species are categorized based on direct interactions or functional similarities. Compared with the limited experimental techniques, various computational tools make it possible to analyze, filter, and combine the interaction data to get comprehensive information about the biological pathways. By efficiently integrating experimental findings on PPIs with computational prediction techniques, researchers have been able to gain much valuable data on PPIs, including some advanced databases. Moreover, many useful tools and visualization programs enable researchers to establish, annotate, and analyze biological networks. Here we review and list the computational methods, databases, and tools for protein-protein interaction prediction.
Affiliation(s)
- Ying Han
- Cardiovascular Department, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Weiju Sun
- Cardiovascular Department, The First Affiliated Hospital of Harbin Medical University, Harbin, China
10
Younginger BS, Friesen ML. Connecting signals and benefits through partner choice in plant-microbe interactions. FEMS Microbiol Lett 2020; 366:5626345. [PMID: 31730203 DOI: 10.1093/femsle/fnz217] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/30/2018] [Accepted: 10/17/2019] [Indexed: 12/20/2022] Open
Abstract
Stabilizing mechanisms in plant-microbe symbioses are critical to maintaining beneficial functions, with two main classes: host sanctions and partner choice. Sanctions are currently presumed to be more effective and widespread, based on the idea that microbes rapidly evolve cheating while retaining signals matching cooperative strains. However, hosts that effectively discriminate among a pool of compatible symbionts would gain a significant fitness advantage. Using the well-characterized legume-rhizobium symbiosis as a model, we evaluate the evidence for partner choice in the context of the growing field of genomics. Empirical studies that rely upon bacteria varying only in nitrogen-fixation ability ignore host-symbiont signaling and frequently conclude that partner choice is not a robust stabilizing mechanism. Here, we argue that partner choice is an overlooked mechanism of mutualism stability and emphasize that plants need not use the microbial services provided a priori to discriminate among suitable partners. Additionally, we present a model that shows that partner choice signaling increases symbiont and host fitness in the absence of sanctions. Finally, we call for a renewed focus on elucidating the signaling mechanisms that are critical to partner choice while further aiming to understand their evolutionary dynamics in nature.
Affiliation(s)
- Brett S Younginger
- Department of Plant Pathology, Washington State University, PO Box 646430, 345 Johnson Hall, Pullman, WA 99164, USA
- Maren L Friesen
- Department of Plant Pathology, Washington State University, PO Box 646430, 345 Johnson Hall, Pullman, WA 99164, USA
- Department of Crop and Soil Sciences, Washington State University, PO Box 646420, 115 Johnson Hall, Pullman, WA 99164, USA
11
Zhang SW, Zhang XX, Fan XN, Li WN. LPI-CNNCP: Prediction of lncRNA-protein interactions by using convolutional neural network with the copy-padding trick. Anal Biochem 2020; 601:113767. [PMID: 32454029 DOI: 10.1016/j.ab.2020.113767] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 02/23/2020] [Revised: 04/27/2020] [Accepted: 05/01/2020] [Indexed: 11/17/2022]
Abstract
Long noncoding RNAs (lncRNAs) play critical roles in many pathological and biological processes, such as post-transcription, cell differentiation and gene regulation. A growing number of studies have shown that lncRNAs function mainly through interactions with specific RNA binding proteins (RBPs). However, experimental identification of potential lncRNA-protein interactions is costly and time-consuming. In this work, we propose a novel convolutional neural network-based method with the copy-padding trick (named LPI-CNNCP) to predict lncRNA-protein interactions. The copy-padding trick of LPI-CNNCP converts the variable-length protein/RNA sequences into fixed-length sequences, thus enabling the construction of the CNN model. A high-order one-hot encoding is also applied to transform the protein/RNA sequences into image-like inputs for capturing the dependencies among amino acids (or nucleotides). In the end, these encoded protein/RNA sequences are fed into a CNN to predict the lncRNA-protein interactions. Compared with other state-of-the-art methods in a 10-fold cross-validation (10CV) test, LPI-CNNCP shows the best performance. Results in the independent test demonstrate that our LPI-CNNCP can effectively predict the potential lncRNA-protein interactions. We also compared the copy-padding trick with two other existing tricks (i.e., zero-padding and cropping), and the results show that our copy-padding trick outperforms the zero-padding and cropping tricks on predicting lncRNA-protein interactions. The source code of LPI-CNNCP and the datasets used in this work are available at https://github.com/NWPU-903PR/LPI-CNNCP for academic users.
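The padding idea described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes copy-padding means repeating a sequence's own content until a target length is reached (truncating the final copy), and that zero-padding fills with a placeholder symbol. The helper names `copy_pad`, `zero_pad`, and `one_hot`, the pad character `N`, and the plain first-order one-hot encoding are all assumptions made for illustration; the high-order encoding of the paper is not reproduced.

```python
from typing import List


def copy_pad(seq: str, length: int) -> str:
    """Extend seq to `length` by repeating its own content (assumed behaviour)."""
    if not seq:
        raise ValueError("cannot copy-pad an empty sequence")
    repeats = -(-length // len(seq))  # ceiling division
    return (seq * repeats)[:length]


def zero_pad(seq: str, length: int, pad_char: str = "N") -> str:
    """Crop or right-fill seq with a placeholder that encodes to all zeros."""
    return seq[:length].ljust(length, pad_char)


def one_hot(seq: str, alphabet: str = "ACGU") -> List[List[int]]:
    """First-order one-hot rows; symbols outside the alphabet become zero rows."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    rows = []
    for ch in seq:
        row = [0] * len(alphabet)
        if ch in index:
            row[index[ch]] = 1
        rows.append(row)
    return rows


# Both calls produce a length-10 input from a length-4 RNA fragment.
print(copy_pad("ACGU", 10))
print(zero_pad("ACGU", 10))
```

Under these assumptions, copy-padding keeps every position of the fixed-length input informative, whereas zero-padding leaves a run of empty rows, which is one plausible reading of why the abstract reports a difference between the two.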
Affiliation(s)
- Shao-Wu Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
- Xi-Xi Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
- Xiao-Nan Fan
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
- Wei-Na Li
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
12
Schilling PE, Kontaxis G, Dragosits M, Schiestl RH, Becker CFW, Maier I. Mannosylated hemagglutinin peptides bind cyanovirin-N independent of disulfide-bonds in complementary binding sites. RSC Adv 2020; 10:11079-11087. [PMID: 35495330 PMCID: PMC9050506 DOI: 10.1039/d0ra01128b] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/05/2020] [Accepted: 03/10/2020] [Indexed: 01/11/2023] Open
Abstract
Cyanovirin-N (CV-N) has been shown to reveal broad neutralizing activity against human immunodeficiency virus (HIV) and to specifically bind Manα(1→2)Manα units exposed on various glycoproteins of enveloped viruses, such as influenza hemagglutinin (HA) and Ebola glycoprotein. Chemically synthesized dimannosylated HA peptides bound domain-swapped and dimeric CV-N with either four disulfide-bonds (Cys–Cys), or three Cys–Cys bonds and an intact fold of the high-affinity binding site at an equilibrium dissociation constant KD of 10 μM. Cys–Cys mutagenesis with ion-pairing amino-acids glutamic acid and arginine was calculated by in silico structure-based protein design and allowed for recognizing dimannose and dimannosylated peptide binding to low-affinity binding sites (KD ≈ 11 μM for one C58–C73 bond, and binding to dimannosylated peptide). In comparison, binding to HA was achieved based on one ion-pairing C58E–C73R substitution at KD = 275 nM, and KD = 5 μM for two C58E–C73R substitutions. We were utilizing a triazole bioisostere linkage to form the respective mannosylated-derivative on the HA peptide sequence of residues glutamine, glycine, and glutamic acid. Thus, mono- and dimannosylated peptides with N-terminal cysteine facilitated site-specific interactions with HA peptides, mimicking a naturally found N-linked glycosylation site on the HA head domain. Di-mannosylated peptides reveal mannose binding to cyanovirin-N (CV-N) low-affinity binding sites.
Affiliation(s)
- Philipp E Schilling
- Faculty of Chemistry, Institute of Biological Chemistry, University of Vienna Währinger Straße 38 A-1090 Vienna Austria
- Georg Kontaxis
- Department of Structural and Computational Biology, Max Perutz Laboratories, University of Vienna Campus Vienna Bohrgasse 5 A-1030 Vienna Austria
- Martin Dragosits
- Department of Chemistry, Division of Biochemistry, University of Natural Resources and Life Sciences Muthgasse 18 A-1190 Vienna Austria
- Robert H Schiestl
- Department of Pathology and Laboratory Medicine, Geffen School of Medicine, University of California Los Angeles CA-90095 USA
- Department of Environmental Health Sciences, Fielding School of Public Health, University of California, Los Angeles 650 Charles E. Young Dr. South Los Angeles CA-90095 USA
- Christian F W Becker
- Faculty of Chemistry, Institute of Biological Chemistry, University of Vienna Währinger Straße 38 A-1090 Vienna Austria
- Irene Maier
- Faculty of Chemistry, Institute of Biological Chemistry, University of Vienna Währinger Straße 38 A-1090 Vienna Austria
- Department of Environmental Health Sciences, Fielding School of Public Health, University of California, Los Angeles 650 Charles E. Young Dr. South Los Angeles CA-90095 USA
13
Farkaš T, Sitarčík J, Brejová B, Lucká M. SWSPM: A Novel Alignment-Free DNA Comparison Method Based on Signal Processing Approaches. Evol Bioinform Online 2019; 15:1176934319849071. [PMID: 31210725 PMCID: PMC6545658 DOI: 10.1177/1176934319849071] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/12/2019] [Accepted: 04/12/2019] [Indexed: 11/16/2022] Open
Abstract
Computing similarity between 2 nucleotide sequences is one of the fundamental problems in bioinformatics. Current methods are based mainly on 2 major approaches: (1) sequence alignment, which is computationally expensive, and (2) faster, but less accurate, alignment-free methods based on various statistical summaries, for example, short word counts. We propose a new distance measure based on mathematical transforms from the domain of signal processing. To tolerate large-scale rearrangements in the sequences, the transform is computed across sliding windows. We compare our method on several data sets with current state-of-art alignment-free methods. Our method compares favorably in terms of accuracy and outperforms other methods in running time and memory requirements. In addition, it is massively scalable up to dozens of processing units without the loss of performance due to communication overhead. Source files and sample data are available at https://bitbucket.org/fiitstubioinfo/swspm/src.
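For context, the "short word counts" baseline that this abstract contrasts itself with can be sketched in a few lines. The example below is a generic k-mer count distance and not the SWSPM method itself, whose signal-processing transform and sliding-window scheme are not specified in the abstract. The function names `kmer_counts` and `kmer_distance`, the default `k = 3`, and the choice of Euclidean distance on raw counts are assumptions for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count every overlapping word of length k in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a: str, b: str, k: int = 3) -> float:
    """Euclidean distance between the k-mer count vectors of two sequences."""
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    words = set(counts_a) | set(counts_b)
    # Counter returns 0 for words missing from one of the sequences.
    return sqrt(sum((counts_a[w] - counts_b[w]) ** 2 for w in words))


# Identical sequences are at distance zero; unrelated ones are further apart.
print(kmer_distance("ACGTACGT", "ACGTACGT"))
print(kmer_distance("ACGTACGT", "TTTTGGGG"))
```

No alignment is computed here, which is what makes this family of methods fast; the trade-off noted in the abstract is that such statistical summaries tend to be less accurate than alignment.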
Affiliation(s)
- Tomáš Farkaš
- Faculty of Informatics and Information Technologies, Slovak University of Technology in Bratislava, Bratislava, Slovakia
- Jozef Sitarčík
- Faculty of Informatics and Information Technologies, Slovak University of Technology in Bratislava, Bratislava, Slovakia
- Broňa Brejová
- Faculty of Mathematics, Physics and Informatics, Comenius University in Bratislava, Bratislava, Slovakia
- Mária Lucká
- Faculty of Informatics and Information Technologies, Slovak University of Technology in Bratislava, Bratislava, Slovakia
|
14
|
Spänig S, Heider D. Encodings and models for antimicrobial peptide classification for multi-resistant pathogens. BioData Min 2019; 12:7. [PMID: 30867681 PMCID: PMC6399931 DOI: 10.1186/s13040-019-0196-x] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 12/21/2018] [Accepted: 02/24/2019] [Indexed: 01/10/2023]
Abstract
Antimicrobial peptides (AMPs) are part of the inherent immune system. They occur in almost all organisms, including plants, animals, and humans. Remarkably, they are also effective against multi-resistant pathogens, with high selectivity. This is especially crucial at a time when society faces the major threat of an ever-increasing number of antibiotic-resistant microbes. In addition, AMPs can exhibit antitumor and antiviral effects, and a variety of scientific studies in recent years have therefore dealt with the prediction of active peptides. Due to their potential, even the pharmaceutical industry is keen on discovering and developing novel AMPs. However, AMPs are difficult to verify in vitro, so researchers conduct sequence similarity experiments against known, active peptides. Unfortunately, this approach is very time-consuming and limits potential candidates to sequences with a high similarity to known AMPs. Machine learning methods offer the opportunity to explore the huge space of sequence variations in a timely manner. These algorithms have, in principle, paved the way for an automated discovery of AMPs. However, machine learning models require a numerical input, so an informative encoding is very important. Unfortunately, developing an appropriate encoding is a major challenge, which has not been entirely solved so far. For this reason, the development of novel amino acid encodings has become established as a stand-alone research branch. The present review introduces state-of-the-art encodings of amino acids as well as their properties in sequence- and structure-based aggregation. Moreover, although a well-chosen encoding is essential, performant classifiers are also required, which is reflected by a tendency towards specifically designed models in the literature. Furthermore, we introduce these models with a particular focus on encodings derived from support vector machines and deep learning approaches.
Although a strong focus has been placed on AMP prediction, not all of the mentioned encodings were developed as part of antimicrobial research studies; some were designed as general protein or peptide representations.
Affiliation(s)
- Sebastian Spänig
- Department of Bioinformatics, Faculty of Mathematics and Computer Science, Philipps-University of Marburg, Marburg, Germany
- Dominik Heider
- Department of Bioinformatics, Faculty of Mathematics and Computer Science, Philipps-University of Marburg, Marburg, Germany
|
15
|
Jang J, Bae SE. Comparative Co-Evolution Analysis Between the HA and NA Genes of Influenza A Virus. Virology (Auckl) 2018; 9:1178122X18788328. [PMID: 30038490 PMCID: PMC6053862 DOI: 10.1177/1178122x18788328] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 12/12/2017] [Accepted: 06/21/2018] [Indexed: 11/15/2022]
Abstract
Influenza A virus subtypes are determined based on envelope proteins encoded by the hemagglutinin (HA) gene and the neuraminidase (NA) gene, which are involved in attachment to the host, pathogenicity, and progeny production. Here, we evaluated such differences through co-evolution analysis between the HA and NA genes based on subtype and host. Event-based cophylogeny analysis revealed that human hosts had higher cospeciation values than avian hosts. In particular, the yearly maximum-likelihood (ML) phylogenetic trees for the H1N1 and H3N2 subtypes in humans displayed similar topologies between the two genes. Substitution analysis verified a strong positive correlation between the two genes in the H1N1 and H3N2 subtypes in humans compared with those in avian and swine hosts. These results provided a proof of principle for the further development of vaccines according to hosts and subtypes against Influenza A virus.
Affiliation(s)
- Jinhwa Jang
- Center for Applied Scientific Computing, Division of Supercomputing, Korea Institute of Science and Technology Information, Daejeon, Republic of Korea; Laboratory of Computational Biology & Bioinformatics, Institute of Health and Environment, Graduate School of Public Health, Seoul National University, Seoul, Republic of Korea
- Se-Eun Bae
- Bioinformatics Laboratory, Samsung Genome Institute, Samsung Medical Center, Seoul, Republic of Korea
|
16
|
Abstract
The knowledge of protein-protein interactions (PPIs) and PPI networks (PPINs) is the key to starting to understand the biological processes inside the cell. Many computational tools have been designed to help explore PPIs and PPINs, such as those for interaction detection, reliability assessment and interaction network construction. Here, the application of computational tools is reviewed from three perspectives: PPI database construction, PPI prediction, and interaction network construction and analysis. This overview will provide researchers guidance on choosing appropriate methods for exploring PPIs.
Affiliation(s)
- Shaowei Dong
- Department of Cell and System Biology, Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, Canada
- Nicholas J Provart
- Department of Cell and System Biology, Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, Canada
|
17
|
Zielezinski A, Vinga S, Almeida J, Karlowski WM. Alignment-free sequence comparison: benefits, applications, and tools. Genome Biol 2017; 18:186. [PMID: 28974235 PMCID: PMC5627421 DOI: 10.1186/s13059-017-1319-7] [Citation(s) in RCA: 241] [Impact Index Per Article: 34.4] [Indexed: 01/30/2023]
Abstract
Alignment-free sequence analyses have been applied to problems ranging from whole-genome phylogeny to the classification of protein families, identification of horizontally transferred genes, and detection of recombined sequences. The strength of these methods makes them particularly useful for next-generation sequencing data processing and analysis. However, many researchers are unclear about how these methods work, how they compare to alignment-based methods, and what potential they hold for their own research. We address these questions and provide a guide to the currently available alignment-free sequence analysis tools.
Affiliation(s)
- Andrzej Zielezinski
- Department of Computational Biology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, 61-614, Poznan, Poland
- Susana Vinga
- IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001, Lisbon, Portugal
- Jonas Almeida
- Stony Brook University (SUNY), 101 Nicolls Road, Stony Brook, NY, 11794, USA
- Wojciech M Karlowski
- Department of Computational Biology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, 61-614, Poznan, Poland
|