1
Nápoles-Duarte J, Biswas A, Parker MI, Palomares-Baez J, Chávez-Rojo MA, Rodríguez-Valdez LM. Stmol: A component for building interactive molecular visualizations within streamlit web-applications. Front Mol Biosci 2022; 9:990846. [PMID: 36213112 PMCID: PMC9538479 DOI: 10.3389/fmolb.2022.990846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2022] [Accepted: 08/29/2022] [Indexed: 01/31/2023]
Abstract
Streamlit is an open-source Python coding framework for building web-applications or "web-apps" and is now being used by researchers to share large data sets from published studies and other resources. Here we present Stmol, an easy-to-use component for rendering interactive 3D molecular visualizations of protein and ligand structures within Streamlit web-apps. Stmol can render protein and ligand structures with just a few lines of Python code by utilizing popular visualization libraries, currently Py3DMol and Speck. On the user-end, Stmol does not require expertise to interactively navigate. On the developer-end, Stmol can be easily integrated within structural bioinformatic and cheminformatic pipelines to provide a simple means for user-end researchers to advance biological studies and drug discovery efforts. In this paper, we highlight a few examples of how Stmol has already been utilized by scientific communities to share interactive molecular visualizations of protein and ligand structures from known open databases. We hope Stmol will be used by researchers to build additional open-sourced web-apps to benefit current and future generations of scientists.
Affiliation(s)
- J.M. Nápoles-Duarte
- Laboratorio de Química Computacional, Facultad de Ciencias Químicas, Universidad Autónoma de Chihuahua, Nuevo Campus Universitario, Chihuahua, Mexico. *Correspondence: J.M. Nápoles-Duarte
- Avratanu Biswas
- Doctoral School of Biology, University of Szeged, Szeged, Hungary; Biological Research Centre, Szeged, Hungary
- Mitchell I. Parker
- Molecular and Cell Biology and Genetics (MCBG) Program, Drexel University College of Medicine, Philadelphia, PA, United States; Program in Molecular Therapeutics, Fox Chase Cancer Center, Philadelphia, PA, United States
- J.P. Palomares-Baez
- Laboratorio de Química Computacional, Facultad de Ciencias Químicas, Universidad Autónoma de Chihuahua, Nuevo Campus Universitario, Chihuahua, Mexico
- M. A. Chávez-Rojo
- Laboratorio de Química Computacional, Facultad de Ciencias Químicas, Universidad Autónoma de Chihuahua, Nuevo Campus Universitario, Chihuahua, Mexico
- L. M. Rodríguez-Valdez
- Laboratorio de Química Computacional, Facultad de Ciencias Químicas, Universidad Autónoma de Chihuahua, Nuevo Campus Universitario, Chihuahua, Mexico
2
Shao J. Labeling Strategies for Surface-Exposed Protein Visualization and Determination in Plasmodium falciparum Malaria. Front Cell Infect Microbiol 2022; 12:914297. [PMID: 35755836 PMCID: PMC9226428 DOI: 10.3389/fcimb.2022.914297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Accepted: 05/11/2022] [Indexed: 11/13/2022]
Affiliation(s)
- Jinfeng Shao
- Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville, MD, United States
3
Dallago C, Schütze K, Heinzinger M, Olenyi T, Littmann M, Lu AX, Yang KK, Min S, Yoon S, Morton JT, Rost B. Learned Embeddings from Deep Learning to Visualize and Predict Protein Sets. Curr Protoc 2021; 1:e113. [PMID: 33961736 DOI: 10.1002/cpz1.113] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Indexed: 02/03/2023]
Abstract
Models from machine learning (ML) or artificial intelligence (AI) increasingly assist in guiding experimental design and decision making in molecular biology and medicine. Recently, Language Models (LMs) have been adapted from Natural Language Processing (NLP) to encode the implicit language written in protein sequences. Protein LMs show enormous potential in generating descriptive representations (embeddings) for proteins from just their sequences, in a fraction of the time with respect to previous approaches, yet with comparable or improved predictive ability. Researchers have trained a variety of protein LMs that are likely to illuminate different angles of the protein language. By leveraging the bio_embeddings pipeline and modules, simple and reproducible workflows can be laid out to generate protein embeddings and rich visualizations. Embeddings can then be leveraged as input features through machine learning libraries to develop methods predicting particular aspects of protein function and structure. Beyond the workflows included here, embeddings have been leveraged as proxies to traditional homology-based inference and even to align similar protein sequences. A wealth of possibilities remain for researchers to harness through the tools provided in the following protocols. © 2021 The Authors. Current Protocols published by Wiley Periodicals LLC. 
The following protocols are included in this manuscript:
- Basic Protocol 1: Generic use of the bio_embeddings pipeline to plot protein sequences and annotations
- Basic Protocol 2: Generate embeddings from protein sequences using the bio_embeddings pipeline
- Basic Protocol 3: Overlay sequence annotations onto a protein space visualization
- Basic Protocol 4: Train a machine learning classifier on protein embeddings
- Alternate Protocol 1: Generate 3D instead of 2D visualizations
- Alternate Protocol 2: Visualize protein solubility instead of protein subcellular localization
- Support Protocol: Join embedding generation and sequence space visualization in a pipeline
Affiliation(s)
- Christian Dallago
- TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, Garching/Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Garching/Munich, Germany
- Konstantin Schütze
- TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, Garching/Munich, Germany
- Michael Heinzinger
- TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, Garching/Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Garching/Munich, Germany
- Tobias Olenyi
- TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, Garching/Munich, Germany
- Maria Littmann
- TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, Garching/Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Garching/Munich, Germany
- Amy X Lu
- Department of Computer Science, University of Toronto, Toronto, Canada & Vector Institute
- Kevin K Yang
- Microsoft Research New England, Cambridge, Massachusetts
- Seonwoo Min
- Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea
- Sungroh Yoon
- Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
- James T Morton
- Center for Computational Biology, Flatiron Institute, New York, New York
- Burkhard Rost
- TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, Garching/Munich, Germany; Institute for Advanced Study (TUM-IAS), Garching/Munich, Germany; TUM School of Life Sciences Weihenstephan (WZW), Freising, Germany; Columbia University, Department of Biochemistry and Molecular Biophysics, New York, New York; New York Consortium on Membrane Protein Structure (NYCOMPS), New York, New York
4
Martinez X, Chavent M, Baaden M. Visualizing protein structures - tools and trends. Biochem Soc Trans 2020; 48:499-506. [PMID: 32196545 DOI: 10.1042/BST20190621] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/12/2020] [Revised: 03/01/2020] [Accepted: 03/04/2020] [Indexed: 02/06/2023]
Abstract
Molecular visualization is fundamental in the current scientific literature, textbooks and dissemination materials. It provides an essential support for presenting results, reasoning on and formulating hypotheses related to molecular structure. Tools for visual exploration of structural data have become easily accessible on a broad variety of platforms thanks to advanced software tools that render a great service to the scientific community. These tools are often developed across disciplines bridging computer science, biology and chemistry. This mini-review was written as a short and compact overview for scientists who need to visualize protein structures and want to make an informed decision which tool they should use. Here, we first describe a few 'Swiss Army knives' geared towards protein visualization for everyday use with an existing large user base, then focus on more specialized tools for peculiar needs that are not yet as broadly known. Our selection is by no means exhaustive, but reflects a diverse snapshot of scenarios that we consider informative for the reader. We end with an account of future trends and perspectives.
5
Zhao Z, Aliwarga Y, Willcox MDP. Intrinsic protein fluorescence interferes with detection of tear glycoproteins in SDS-polyacrylamide gels using extrinsic fluorescent dyes. J Biomol Tech 2007; 18:331-335. [PMID: 18166676 PMCID: PMC2392995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]
Abstract
Intrinsic protein fluorescence may interfere with the visualization of proteins after SDS-polyacrylamide electrophoresis. In an attempt to analyze tear glycoproteins in gels, we ran tear samples and stained the proteins with a glycoprotein-specific fluorescent dye. The fluorescence detected was not limited to glycoproteins. There was strong intrinsic fluorescence of proteins normally found in tears after soaking the gels in 40% methanol plus 1-10% acetic acid and, to a lesser extent, in methanol or acetic acid alone. Nanograms of proteins gave visible native fluorescence and interfered with extrinsic fluorescent dye detection. Poly-L-lysine, which does not contain intrinsically fluorescent amino acids, did not fluoresce.
Affiliation(s)
- Zhenjun Zhao
- The Institute for Eye Research, North Wing, Rupert Myers Building, Gate 14, Barker St, UNSW Sydney, NSW 2052, Australia