1
Spalvieri D, Mauviel AM, Lambert M, Férey N, Sacquin-Mora S, Chavent M, Baaden M. Design - a new way to look at old molecules. J Integr Bioinform 2022; 19:jib-2022-0020. [PMID: 35776840 PMCID: PMC9377703 DOI: 10.1515/jib-2022-0020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/18/2022] [Accepted: 06/13/2022] [Indexed: 12/25/2022] [Open Access]
Abstract
We discuss how design enriches molecular science, particularly structural biology and bioinformatics. We present two use cases, one in academic practice and the other to design for outreach. The first case targets the representation of ion channels and their dynamic properties. In the second, we document a transition process from a research environment to general-purpose designs. Several testimonials from practitioners are given. By describing the design process of abstracted shapes, exploded views of molecular structures, motion-averaged slices, 360-degree panoramic projections, and experiments with lit sphere shading, we document how designers help make scientific data accessible without betraying its meaning, and how a creative mind adds value over purely data-driven visualizations. A similar conclusion was drawn for public outreach, as we found that comic-book-style drawings are better suited for communicating science to a broad audience.
Affiliation(s)
- Davide Spalvieri
  - Laboratoire de Biochimie Théorique, CNRS, Université Paris Cité, UPR 9080, 13 rue Pierre et Marie Curie, F-75005, Paris, France
  - Institut de Biologie Physico-Chimique - Fondation Edmond de Rothschild, Paris, France
- Anne-Marine Mauviel
  - Laboratoire de Biochimie Théorique, CNRS, Université Paris Cité, UPR 9080, 13 rue Pierre et Marie Curie, F-75005, Paris, France
  - Institut de Biologie Physico-Chimique - Fondation Edmond de Rothschild, Paris, France
- Nicolas Férey
  - Laboratoire de Biochimie Théorique, CNRS, Université Paris Cité, UPR 9080, 13 rue Pierre et Marie Curie, F-75005, Paris, France
  - Institut de Biologie Physico-Chimique - Fondation Edmond de Rothschild, Paris, France
  - Université Paris-Saclay, CNRS, Laboratoire Interdisciplinaire des Sciences du Numérique, 91405, Orsay, France
- Sophie Sacquin-Mora
  - Laboratoire de Biochimie Théorique, CNRS, Université Paris Cité, UPR 9080, 13 rue Pierre et Marie Curie, F-75005, Paris, France
  - Institut de Biologie Physico-Chimique - Fondation Edmond de Rothschild, Paris, France
- Matthieu Chavent
  - Institut de Pharmacologie et de Biologie Structurale, Université de Toulouse, CNRS, Université Paul Sabatier, 31400, Toulouse, France
- Marc Baaden
  - Laboratoire de Biochimie Théorique, CNRS, Université Paris Cité, UPR 9080, 13 rue Pierre et Marie Curie, F-75005, Paris, France
  - Institut de Biologie Physico-Chimique - Fondation Edmond de Rothschild, Paris, France
2
David L, Thakkar A, Mercado R, Engkvist O. Molecular representations in AI-driven drug discovery: a review and practical guide. J Cheminform 2020; 12:56. [PMID: 33431035 PMCID: PMC7495975 DOI: 10.1186/s13321-020-00460-5] [Citation(s) in RCA: 189] [Impact Index Per Article: 37.8] [Received: 01/11/2020] [Accepted: 09/05/2020] [Indexed: 02/08/2023] [Open Access]
Abstract
The technological advances of the past century, marked by the computer revolution and the advent of high-throughput screening technologies in drug discovery, opened the path to the computational analysis and visualization of bioactive molecules. For this purpose, it became necessary to represent molecules in a syntax that would be readable by computers and understandable by scientists of various fields. A large number of chemical representations have been developed over the years, their numerosity being due to the fast development of computers and the complexity of producing a representation that encompasses all structural and chemical characteristics. We present here some of the most popular electronic molecular and macromolecular representations used in drug discovery, many of which are based on graph representations. Furthermore, we describe applications of these representations in AI-driven drug discovery. Our aim is to provide a brief guide on structural representations that are essential to the practice of AI in drug discovery. This review serves as a guide for researchers who have little experience with the handling of chemical representations and plan to work on applications at the interface of these fields.
Affiliation(s)
- Laurianne David
  - Hit Discovery, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca Gothenburg, Sweden
- Amol Thakkar
  - Hit Discovery, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca Gothenburg, Sweden
  - Department of Chemistry and Biochemistry, University of Bern, Bern, Switzerland
- Rocío Mercado
  - Hit Discovery, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca Gothenburg, Sweden
- Ola Engkvist
  - Hit Discovery, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca Gothenburg, Sweden
3
Shabash B, Wiese KC. RNA Visualization: Relevance and the Current State-of-the-Art Focusing on Pseudoknots. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2017; 14:696-712. [PMID: 26915129 DOI: 10.1109/tcbb.2016.2522421] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 06/05/2023]
Abstract
RNA visualization is crucial in order to understand the relationship that exists between RNA structure and its function, as well as the development of better RNA structure prediction algorithms. However, in the context of RNA visualization, one key structure remains difficult to visualize: Pseudoknots. Pseudoknots occur in RNA folding when two secondary structural components form base-pairs between them. The three-dimensional nature of these components makes them challenging to visualize in two-dimensional media, such as print media or screens. In this review, we focus on the advancements that have been made in the field of RNA visualization in two-dimensional media in the past two decades. The review aims at presenting all relevant aspects of pseudoknot visualization. We start with an overview of several pseudoknotted structures and their relevance in RNA function. Next, we discuss the theoretical basis for RNA structural topology classification and present RNA classification systems for both pseudoknotted and non-pseudoknotted RNAs. Each description of RNA classification system is followed by a discussion of the software tools and algorithms developed to date to visualize RNA, comparing the different tools' strengths and shortcomings.
4
Kocincová L, Jarešová M, Byška J, Parulek J, Hauser H, Kozlíková B. Comparative visualization of protein secondary structures. BMC Bioinformatics 2017; 18:23. [PMID: 28251875 PMCID: PMC5333176 DOI: 10.1186/s12859-016-1449-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022] [Open Access]
Abstract
Background: Protein function is determined by many factors, namely by its constitution, spatial arrangement, and dynamic behavior. Studying these factors helps the biochemists and biologists to better understand the protein behavior and to design proteins with modified properties. One of the most common approaches to these studies is to compare the protein structure with other molecules and to reveal similarities and differences in their polypeptide chains. Results: We support the comparison process by proposing a new visualization technique that bridges the gap between traditionally used 1D and 3D representations. By introducing the information about mutual positions of protein chains into the 1D sequential representation the users are able to observe the spatial differences between the proteins without any occlusion commonly present in 3D view. Our representation is designed to serve namely for comparison of multiple proteins or a set of time steps of molecular dynamics simulation. Conclusions: The novel representation is demonstrated on two usage scenarios. The first scenario aims to compare a set of proteins from the family of cytochromes P450 where the position of the secondary structures has a significant impact on the substrate channeling. The second scenario focuses on the protein flexibility when by comparing a set of time steps our representation helps to reveal the most dynamically changing parts of the protein chain.
Affiliation(s)
- Jan Byška
  - Masaryk University, Brno, Czech Republic
  - University of Bergen, Bergen, Norway
5
Frączek T. Simulation-Based Algorithm for Two-Dimensional Chemical Structure Diagram Generation of Complex Molecules and Ligand–Protein Interactions. J Chem Inf Model 2016; 56:2320-2335. [DOI: 10.1021/acs.jcim.6b00391] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/29/2023]
Affiliation(s)
- Tomasz Frączek
  - Institute of Applied Radiation Chemistry, Lodz University of Technology, Zeromskiego 116, 90-924 Lodz, Poland
6
Sardella R, Ianni F, Lisanti A, Scorzoni S, Marinozzi M, Natalini B. S-Trityl-(R)-Cysteine, a Multipurpose Chiral Selector for Ligand-Exchange Liquid Chromatography Applications. Crit Rev Anal Chem 2015; 45:323-33. [DOI: 10.1080/10408347.2014.937851] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 10/23/2022]
7
Moal IH, Torchala M, Bates PA, Fernández-Recio J. The scoring of poses in protein-protein docking: current capabilities and future directions. BMC Bioinformatics 2013; 14:286. [PMID: 24079540 PMCID: PMC3850738 DOI: 10.1186/1471-2105-14-286] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.3] [Received: 06/18/2013] [Accepted: 09/25/2013] [Indexed: 12/16/2022] [Open Access]
Abstract
BACKGROUND: Protein-protein docking, which aims to predict the structure of a protein-protein complex from its unbound components, remains an unresolved challenge in structural bioinformatics. An important step is the ranking of docked poses using a scoring function, for which many methods have been developed. There is a need to explore the differences and commonalities of these methods with each other, as well as with functions developed in the fields of molecular dynamics and homology modelling. RESULTS: We present an evaluation of 115 scoring functions on an unbound docking decoy benchmark covering 118 complexes for which a near-native solution can be found, yielding top 10 success rates of up to 58%. Hierarchical clustering is performed, so as to group together functions which identify near-natives in similar subsets of complexes. Three set theoretic approaches are used to identify pairs of scoring functions capable of correctly scoring different complexes. This shows that functions in different clusters capture different aspects of binding and are likely to work together synergistically. CONCLUSIONS: All functions designed specifically for docking perform well, indicating that functions are transferable between sampling methods. We also identify promising methods from the field of homology modelling. Further, differential success rates by docking difficulty and solution quality suggest a need for flexibility-dependent scoring. Investigating pairs of scoring functions, the set theoretic measures identify known scoring strategies as well as a number of novel approaches, indicating promising augmentations of traditional scoring methods. Such augmentation and parameter combination strategies are discussed in the context of the learning-to-rank paradigm.
Affiliation(s)
- Iain H Moal
  - Joint BSC-IRB Research Program in Computational Biology, Life Science Department, Barcelona Supercomputing Center, Barcelona 08034, Spain
- Mieczyslaw Torchala
  - Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London WC2A 3LY, UK
- Paul A Bates
  - Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London WC2A 3LY, UK
- Juan Fernández-Recio
  - Joint BSC-IRB Research Program in Computational Biology, Life Science Department, Barcelona Supercomputing Center, Barcelona 08034, Spain
8
Doncheva NT, Klein K, Domingues FS, Albrecht M. Analyzing and visualizing residue networks of protein structures. Trends Biochem Sci 2011; 36:179-82. [PMID: 21345680 DOI: 10.1016/j.tibs.2011.01.002] [Citation(s) in RCA: 194] [Impact Index Per Article: 13.9] [Received: 09/10/2010] [Revised: 01/19/2011] [Accepted: 01/21/2011] [Indexed: 11/27/2022]
Abstract
The study of individual amino acid residues and their molecular interactions in protein structures is crucial for understanding structure-function relationships. Recent work has indicated that residue networks derived from 3D protein structures provide additional insights into the structural and functional roles of interacting residues. Here, we present the new software tools RINerator and RINalyzer for the automatized generation, 2D visualization, and interactive analysis of residue interaction networks, and highlight their use in different application scenarios.
Affiliation(s)
- Nadezhda T Doncheva
  - Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbrücken, Germany
9
Stierand K, Rarey M. Flat and Easy: 2D Depiction of Protein-Ligand Complexes. Mol Inform 2011; 30:12-9. [DOI: 10.1002/minf.201000167] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/17/2010] [Accepted: 01/05/2011] [Indexed: 11/07/2022]
10
Tian F, Zhang C, Fan X, Yang X, Wang X, Liang H. Predicting the Flexibility Profile of Ribosomal RNAs. Mol Inform 2010; 29:707-15. [PMID: 27464014 DOI: 10.1002/minf.201000092] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 08/17/2010] [Accepted: 09/28/2010] [Indexed: 11/06/2022]
Abstract
Flexibility in biomolecules is an important determinant of biological functionality, which can be measured quantitatively by atomic Debye-Waller factor or B-factor. Although numerous works have been addressed on theoretical and computational studies of the B-factor profiles of proteins, the methods used for predicting B-factor values of nucleic acids, especially the complicated ribosomal RNAs (rRNAs), which are very functionally similar to proteins in providing matrix structures and in catalyzing biochemical reactions, still remain unexploited. In this article, we present a quantitative structure-flexibility relationship (QSFR) study with the aim at the quantitative prediction of rRNA B-factor based on primary sequences (sequence-based) and advanced structures (structure-based) by using both linear and nonlinear machine learning approaches, including partial least squares regression (PLS), least squares support vector machine (LSSVM), and Gaussian process (GP). By rigorously examining the performance and reliability of constructed statistical models and by comparing our models in detail to those developed previously for protein B-factors, we demonstrate that (i) rRNA B-factors could be predicted at a similar level of accuracy with that of protein, (ii) a structure-based approach performed much better as compared to sequence-based methods in modeling of rRNA B-factors, and (iii) rRNA flexibility is primarily governed by the local features of nonbonding potential landscapes, such as electrostatic and van der Waals forces.
Affiliation(s)
- Feifei Tian
  - State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China (phone: +86 23 68757411, fax: +86 23 68757404)
  - College of Bioengineering, Chongqing University, Chongqing 400044, China
- Chun Zhang
  - State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China (phone: +86 23 68757411, fax: +86 23 68757404)
- Xia Fan
  - State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China (phone: +86 23 68757411, fax: +86 23 68757404)
- Xue Yang
  - State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China (phone: +86 23 68757411, fax: +86 23 68757404)
- Xi Wang
  - State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China (phone: +86 23 68757411, fax: +86 23 68757404)
- Huaping Liang
  - State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China (phone: +86 23 68757411, fax: +86 23 68757404)