1
Liu Y, Liu X, Su A, Gong C, Chen S, Xia L, Zhang C, Tao X, Li Y, Li Y, Sun T, Bu M, Shao W, Zhao J, Li X, Peng Y, Guo P, Han Y, Zhu Y. Revolutionizing the structural design and determination of covalent-organic frameworks: principles, methods, and techniques. Chem Soc Rev 2024; 53:502-544. [PMID: 38099340 DOI: 10.1039/d3cs00287j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2024]
Abstract
Covalent organic frameworks (COFs) represent an important class of crystalline porous materials with designable structures and functions. The interconnected organic monomers, featuring pre-designed symmetries and connectivities, dictate the structures of COFs, endowing them with high thermal and chemical stability, large surface area, and tunable micropores. Furthermore, by utilizing pre-functionalization or post-synthetic functionalization strategies, COFs can acquire multifunctionalities, leading to their versatile applications in gas separation/storage, catalysis, and optoelectronic devices. Our review provides a comprehensive account of the latest advancements in the principles, methods, and techniques for structural design and determination of COFs. These cutting-edge approaches enable the rational design and precise elucidation of COF structures, addressing fundamental physicochemical challenges associated with host-guest interactions, topological transformations, network interpenetration, and defect-mediated catalysis.
Affiliation(s)
- Yikuan Liu
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Xiaona Liu
- National Engineering Research Center of Lower-Carbon Catalysis Technology, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China.
- An Su
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Chengtao Gong
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Shenwei Chen
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Liwei Xia
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Chengwei Zhang
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Xiaohuan Tao
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Yue Li
- Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China
- Yonghe Li
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Tulai Sun
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Mengru Bu
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Wei Shao
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Jia Zhao
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Xiaonian Li
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Yongwu Peng
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
- Peng Guo
- National Engineering Research Center of Lower-Carbon Catalysis Technology, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China.
- University of Chinese Academy of Sciences, Beijing 100049, China.
- Yu Han
- School of Emergent Soft Matter, South China University of Technology, Guangzhou, China.
- King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.
- Yihan Zhu
- Center for Electron Microscopy, Institute for Frontier and Interdisciplinary Sciences, State Key Laboratory Breeding Base of Green Chemistry Synthesis Technology, College of Materials Science and Engineering and College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China.
2
Chandrasekhar V, Sharma N, Schaub J, Steinbeck C, Rajan K. Cheminformatics Microservice: unifying access to open cheminformatics toolkits. J Cheminform 2023; 15:98. [PMID: 37845745 PMCID: PMC10577930 DOI: 10.1186/s13321-023-00762-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 09/19/2023] [Indexed: 10/18/2023] Open
Abstract
In recent years, cheminformatics has experienced significant advancements through the development of new open-source software tools based on various cheminformatics programming toolkits. However, adopting these toolkits presents challenges, including proper installation, setup, deployment, and compatibility management. In this work, we present the Cheminformatics Microservice. This open-source solution provides a unified interface for accessing commonly used functionalities of multiple cheminformatics toolkits, namely RDKit, Chemistry Development Kit (CDK), and Open Babel. In addition, more advanced functionalities like structure generation and Optical Chemical Structure Recognition (OCSR) are made available through the Cheminformatics Microservice based on pre-existing tools. The software service also enables developers to extend the functionalities easily and to seamlessly integrate them with existing workflows and applications. It is built on FastAPI and containerized using Docker, making it highly scalable. An instance of the microservice is publicly available at https://api.naturalproducts.net . The source code is publicly accessible on GitHub, accompanied by comprehensive documentation, version control, and continuous integration and deployment workflows. All resources can be found at the following link: https://github.com/Steinbeck-Lab/cheminformatics-microservice .
Affiliation(s)
- Venkata Chandrasekhar
- Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743, Jena, Germany
- Nisha Sharma
- Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743, Jena, Germany
- Jonas Schaub
- Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743, Jena, Germany
- Christoph Steinbeck
- Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743, Jena, Germany
- Kohulan Rajan
- Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743, Jena, Germany.
3
Rodeiro J, Vidaña-Vila E, Navarro J, Mallol R. CloMet: A Novel Open-Source and Modular Software Platform That Connects Established Metabolomics Repositories and Data Analysis Resources. J Proteome Res 2023; 22:2540-2547. [PMID: 37428859 PMCID: PMC10857572 DOI: 10.1021/acs.jproteome.2c00602] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/29/2022] [Indexed: 07/12/2023]
Abstract
The field of metabolomics has witnessed the development of hundreds of computational tools, but only a few have become cornerstones of this field. While MetaboLights and Metabolomics Workbench are two well-established data repositories for metabolomics data sets, Workflows4Metabolomics and MetaboAnalyst are two well-established web-based data analysis platforms for metabolomics. Yet, the raw data stored in the aforementioned repositories lack standardization in terms of the file system format used to store the associated acquisition files. Consequently, it is not straightforward to reuse available data sets as input data in the above-mentioned data analysis resources, especially for non-expert users. This paper presents CloMet, a novel open-source modular software platform that contributes to standardization, reusability, and reproducibility in the metabolomics field. CloMet, which is available through a Docker file, converts raw and NMR-based metabolomics data from MetaboLights and Metabolomics Workbench to a file format that can be used directly either in MetaboAnalyst or in Workflows4Metabolomics. We validated both CloMet and the output data using data sets from these repositories. Overall, CloMet fills the gap between well-established data repositories and web-based statistical platforms and contributes to the consolidation of a data-driven perspective of the metabolomics field by leveraging and connecting existing data and resources.
Affiliation(s)
- Jordi Rodeiro
- Human Environment Research, La Salle - Universitat Ramon Llull, 08022 Barcelona, Spain
- Ester Vidaña-Vila
- Human Environment Research, La Salle - Universitat Ramon Llull, 08022 Barcelona, Spain
- Joan Navarro
- Research Group on Smart Society, La Salle - Universitat Ramon Llull, 08022 Barcelona, Spain
- Roger Mallol
- Human Environment Research, La Salle - Universitat Ramon Llull, 08022 Barcelona, Spain
4
Domingues NP, Moosavi SM, Talirz L, Jablonka KM, Ireland CP, Ebrahim FM, Smit B. Using genetic algorithms to systematically improve the synthesis conditions of Al-PMOF. Commun Chem 2022; 5:170. [PMID: 36697847 PMCID: PMC9814730 DOI: 10.1038/s42004-022-00785-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/13/2022] [Accepted: 11/22/2022] [Indexed: 12/14/2022] Open
Abstract
The synthesis of metal-organic frameworks (MOFs) is often complex and the desired structure is not always obtained. In this work, we report a methodology that uses a joint machine learning and experimental approach to optimize the synthesis conditions of Al-PMOF (Al2(OH)2TCPP) [H2TCPP = meso-tetra(4-carboxyphenyl)porphine], a promising material for carbon capture applications. Al-PMOF was previously synthesized using a hydrothermal reaction, which gave a low throughput yield due to its relatively long reaction time (16 hours). Here, we use a genetic algorithm to carry out a systematic search for the optimal synthesis conditions and a microwave-based high-throughput robotic platform for the syntheses. We show that, in just two generations, we could obtain excellent crystallinity and yield close to 80% in a much shorter reaction time (50 minutes). Moreover, by analyzing the failed and partially successful experiments, we could identify the most important experimental variables that determine the crystallinity and yield.
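The search loop described in this abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the variable names, their ranges, the population size, and above all `mock_score`, which stands in for the crystallinity and yield that the study measured experimentally on a robotic platform. It is not the authors' implementation.

```python
import random

# Hypothetical search space: (name, low, high). These variables and ranges are
# illustrative assumptions, not the conditions explored in the paper.
SEARCH_SPACE = [
    ("temperature_C", 100.0, 200.0),
    ("time_min", 10.0, 60.0),
    ("water_fraction", 0.0, 1.0),
]


def random_recipe(rng):
    """Draw one candidate set of synthesis conditions uniformly at random."""
    return [rng.uniform(lo, hi) for _, lo, hi in SEARCH_SPACE]


def mock_score(recipe):
    """Stand-in for an experiment: higher is better, 1.0 at a made-up optimum."""
    optimum = [150.0, 50.0, 0.5]  # invented values, for the sketch only
    penalty = sum(
        ((x - o) / (hi - lo)) ** 2
        for x, o, (_, lo, hi) in zip(recipe, optimum, SEARCH_SPACE)
    )
    return 1.0 / (1.0 + penalty)


def crossover(a, b, rng):
    """Uniform crossover: each variable is inherited from one of the two parents."""
    return [rng.choice(pair) for pair in zip(a, b)]


def mutate(recipe, rng, rate=0.2):
    """Perturb some variables with Gaussian noise, then clip to the allowed range."""
    out = []
    for x, (_, lo, hi) in zip(recipe, SEARCH_SPACE):
        if rng.random() < rate:
            x += rng.gauss(0.0, 0.1 * (hi - lo))
        out.append(min(hi, max(lo, x)))
    return out


def evolve(generations=2, population_size=20, n_parents=5, seed=0):
    """Run a small elitist genetic algorithm; return the best recipe and score history."""
    rng = random.Random(seed)
    population = [random_recipe(rng) for _ in range(population_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=mock_score, reverse=True)
        history.append(mock_score(ranked[0]))
        parents = ranked[:n_parents]  # elitism: the best recipes survive unchanged
        children = []
        while len(children) < population_size - n_parents:
            mother, father = rng.sample(parents, 2)
            children.append(mutate(crossover(mother, father, rng), rng))
        population = parents + children
    best = max(population, key=mock_score)
    history.append(mock_score(best))
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print(dict(zip((name for name, _, _ in SEARCH_SPACE), best)))
    print(history)
```

Because the best recipes are carried over unchanged, the best score in this sketch cannot decrease from one generation to the next. In a laboratory setting each call to the scoring function would be a real synthesis, so small populations and few generations matter far more than they do here.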
Affiliation(s)
- Nency P Domingues
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Sion, Valais, Switzerland
- Seyed Mohamad Moosavi
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Sion, Valais, Switzerland
- Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
- Leopold Talirz
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Sion, Valais, Switzerland
- Theory and Simulation of Materials (THEOS), School of Engineering (STI), École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Vaud, Switzerland
- Kevin Maik Jablonka
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Sion, Valais, Switzerland
- Christopher P Ireland
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Sion, Valais, Switzerland
- Fatmah Mish Ebrahim
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Sion, Valais, Switzerland
- Cavendish Laboratory, School of Physical Sciences, University of Cambridge, Cambridge, UK
- Berend Smit
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Sion, Valais, Switzerland.
5
Jablonka KM, Patiny L, Smit B. Making the collective knowledge of chemistry open and machine actionable. Nat Chem 2022; 14:365-376. [PMID: 35379967 DOI: 10.1038/s41557-022-00910-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 03/09/2021] [Accepted: 02/10/2022] [Indexed: 11/09/2022]
Abstract
Large amounts of data are generated in chemistry labs: nearly all instruments record data in a digital form, yet a considerable proportion is also captured non-digitally and reported in ways non-accessible to both humans and their computational agents. Chemical research is still largely centred around paper-based lab notebooks, and the publication of data is often more an afterthought than an integral part of the process. Here we argue that a modular open-science platform for chemistry would be beneficial not only for data-mining studies but also, well beyond that, for the entire chemistry community. Much progress has been made over the past few years in developing technologies such as electronic lab notebooks that aim to address data-management concerns. This will help make chemical data reusable; however, it is only one step. We highlight the importance of centring open-science initiatives around open, machine-actionable data and emphasize that most of the required technologies already exist; we only need to connect, polish and embrace them.
Affiliation(s)
- Kevin Maik Jablonka
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingenierie Chimiques (ISIC), École Polytechnique Fédérale de Lausanne (EPFL), Sion, Switzerland
- Luc Patiny
- Institut des Sciences et Ingénierie Chimiques (ISIC), École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
- Berend Smit
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingenierie Chimiques (ISIC), École Polytechnique Fédérale de Lausanne (EPFL), Sion, Switzerland.
6
Wishart DS, Sayeeda Z, Budinski Z, Guo A, Lee BL, Berjanskii M, Rout M, Peters H, Dizon R, Mah R, Torres-Calzada C, Hiebert-Giesbrecht M, Varshavi D, Varshavi D, Oler E, Allen D, Cao X, Gautam V, Maras A, Poynton EF, Tavangar P, Yang V, van Santen JA, Ghosh R, Sarma S, Knutson E, Sullivan V, Jystad AM, Renslow R, Sumner LW, Linington RG, Cort JR. NP-MRD: the Natural Products Magnetic Resonance Database. Nucleic Acids Res 2021; 50:D665-D677. [PMID: 34791429 PMCID: PMC8728158 DOI: 10.1093/nar/gkab1052] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 08/15/2021] [Revised: 10/15/2021] [Accepted: 10/19/2021] [Indexed: 11/15/2022] Open
Abstract
The Natural Products Magnetic Resonance Database (NP-MRD) is a comprehensive, freely available electronic resource for the deposition, distribution, searching and retrieval of nuclear magnetic resonance (NMR) data on natural products, metabolites and other biologically derived chemicals. NMR spectroscopy has long been viewed as the ‘gold standard’ for the structure determination of novel natural products and novel metabolites. NMR is also widely used in natural product dereplication and the characterization of biofluid mixtures (metabolomics). All of these NMR applications require large collections of high-quality, well-annotated, referential NMR spectra of pure compounds. Unfortunately, referential NMR spectral collections for natural products are quite limited. It is because of the critical need for dedicated, open access natural product NMR resources that the NP-MRD was funded by the National Institutes of Health (NIH). Since its launch in 2020, the NP-MRD has grown quickly to become the world's largest repository for NMR data on natural products and other biological substances. It currently contains both structural and NMR data for nearly 41,000 natural product compounds from >7400 different living species. All structural, spectroscopic and descriptive data in the NP-MRD is interactively viewable, searchable and fully downloadable in multiple formats. Extensive hyperlinks to other databases of relevance are also provided. The NP-MRD also supports community deposition of NMR assignments and NMR spectra (1D and 2D) of natural products and related meta-data. The deposition system performs extensive data enrichment, automated data format conversion and spectral/assignment evaluation. Details of these database features, how they are implemented and plans for future upgrades are also provided. The NP-MRD is available at https://np-mrd.org.
Affiliation(s)
- David S Wishart
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada; Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada; Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, AB T6G 2B7, Canada; Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB T6G 2H7, Canada
- Zinat Sayeeda
- Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada
- Zachary Budinski
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- AnChi Guo
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Brian L Lee
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Mark Berjanskii
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Manoj Rout
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Harrison Peters
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Raynard Dizon
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Robert Mah
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Dorna Varshavi
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Dorsa Varshavi
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Eponine Oler
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Dana Allen
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Xuan Cao
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Vasuk Gautam
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Andrew Maras
- Department of Chemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Ella F Poynton
- Department of Chemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Pegah Tavangar
- Department of Chemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Vera Yang
- Department of Chemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Rajarshi Ghosh
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA; MU Metabolomics Center, University of Missouri, Columbia, MO 65211, USA; Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Saurav Sarma
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA; MU Metabolomics Center, University of Missouri, Columbia, MO 65211, USA; Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Eleanor Knutson
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Victoria Sullivan
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Amy M Jystad
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Ryan Renslow
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Lloyd W Sumner
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA; MU Metabolomics Center, University of Missouri, Columbia, MO 65211, USA; Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Roger G Linington
- Department of Chemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- John R Cort
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
7
Jablonka KM, Moosavi SM, Asgari M, Ireland C, Patiny L, Smit B. A data-driven perspective on the colours of metal-organic frameworks. Chem Sci 2020; 12:3587-3598. [PMID: 34163632 PMCID: PMC8179528 DOI: 10.1039/d0sc05337f] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/22/2022] Open
Abstract
Colour is at the core of chemistry and has been fascinating humans since ancient times. It is also a key descriptor of optoelectronic properties of materials and is often used to assess the success of a synthesis. However, predicting the colour of a material based on its structure is challenging. In this work, we leverage subjective and categorical human assignments of colours to build a model that can predict the colour of compounds on a continuous scale. In the process of developing the model, we also uncover inadequacies in current reporting mechanisms. For example, we show that the majority of colour assignments are subject to perceptive spread that would not comply with common printing standards. To remedy this, we suggest and implement an alternative way of reporting colour, and chemical data in general. All data is captured in an objective and standardised form in an electronic lab notebook and subsequently automatically exported to a repository in open formats, from where it can be interactively explored by other researchers. We envision this to be key for a data-driven approach to chemical research.
Affiliation(s)
- Kevin Maik Jablonka
- Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL) Rue de l'Industrie 17 CH-1951 Sion Switzerland
- Seyed Mohamad Moosavi
- Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL) Rue de l'Industrie 17 CH-1951 Sion Switzerland
- Mehrdad Asgari
- Institute of Mechanical Engineering (IGM), School of Engineering (STI), École Polytechnique Fédérale de Lausanne (EPFL) CH-1015 Lausanne Switzerland; Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL) Rue de l'Industrie 17 CH-1951 Sion Valais Switzerland
- Christopher Ireland
- Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL) Rue de l'Industrie 17 CH-1951 Sion Switzerland
- Luc Patiny
- Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL) CH-1015 Lausanne Switzerland
- Berend Smit
- Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL) Rue de l'Industrie 17 CH-1951 Sion Switzerland
8
Abstract
According to the World Drug Report 2020, cocaine and ecstasy are the most consumed stimulant drugs, with 19 and 27 million estimated users in 2018. In this context, large efforts are being made to design fast and cost-effective analytical methods to track and monitor the distribution networks of these synthetic drugs. Here, we share two datasets of ecstasy pills seized in the northeast of Switzerland between 2010 and 2011. The first contains 621 forensic-grade images of pills, while the second consists of 486 mid-infrared (mIR) spectra. Although the two sets do not cover the same seizures, both provide high-quality data with orthogonal information for evaluating clustering and dimension reduction methods.
9
Moosavi S, Jablonka KM, Smit B. The Role of Machine Learning in the Understanding and Design of Materials. J Am Chem Soc 2020; 142:20273-20287. [PMID: 33170678 PMCID: PMC7716341 DOI: 10.1021/jacs.0c09105] [Citation(s) in RCA: 89] [Impact Index Per Article: 22.3] [Received: 08/24/2020] [Indexed: 12/21/2022]
Abstract
Developing algorithmic approaches for the rational design and discovery of materials can enable us to systematically find novel materials, which can have huge technological and social impact. However, such rational design requires a holistic perspective over the full multistage design process, which involves exploring immense materials spaces, their properties, and process design and engineering as well as a techno-economic assessment. The complexity of exploring all of these options using conventional scientific approaches seems intractable. Instead, novel tools from the field of machine learning can potentially solve some of our challenges on the way to rational materials design. Here we review some of the chief advancements of these methods and their applications in rational materials design, followed by a discussion on some of the main challenges and opportunities we currently face together with our perspective on the future of rational materials design and discovery.
Affiliation(s)
- Seyed Mohamad Moosavi
- Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Rue de l’Industrie 17, CH-1951 Sion, Valais, Switzerland
- Kevin Maik Jablonka
- Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Rue de l’Industrie 17, CH-1951 Sion, Valais, Switzerland
- Berend Smit
- Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Rue de l’Industrie 17, CH-1951 Sion, Valais, Switzerland
10
Gagalova KK, Leon Elizalde MA, Portales-Casamar E, Görges M. What You Need to Know Before Implementing a Clinical Research Data Warehouse: Comparative Review of Integrated Data Repositories in Health Care Institutions. JMIR Form Res 2020; 4:e17687. [PMID: 32852280 PMCID: PMC7484778 DOI: 10.2196/17687] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/03/2020] [Revised: 06/09/2020] [Accepted: 07/17/2020] [Indexed: 12/23/2022] Open
Abstract
Background: Integrated data repositories (IDRs), also referred to as clinical data warehouses, are platforms used for the integration of several data sources through specialized analytical tools that facilitate data processing and analysis. IDRs offer several opportunities for clinical data reuse, and the number of institutions implementing an IDR has grown steadily in the past decade.
Objective: The architectural choices of major IDRs are highly diverse and determining their differences can be overwhelming. This review aims to explore the underlying models and common features of IDRs, provide a high-level overview for those entering the field, and propose a set of guiding principles for small- to medium-sized health institutions embarking on IDR implementation.
Methods: We reviewed manuscripts published in peer-reviewed scientific literature between 2008 and 2020, and selected those that specifically describe IDR architectures. Of 255 shortlisted articles, we found 34 articles describing 29 different architectures. The different IDRs were analyzed for common features and classified according to their data processing and integration solution choices.
Results: Despite common trends in the selection of standard terminologies and data models, the IDRs examined showed heterogeneity in the underlying architecture design. We identified 4 common architecture models that use different approaches for data processing and integration. These different approaches were driven by a variety of features such as data sources, whether the IDR was for a single institution or a collaborative project, the intended primary data user, and purpose (research-only or including clinical or operational decision making).
Conclusions: IDR implementations are diverse and complex undertakings, which benefit from being preceded by an evaluation of requirements and definition of scope in the early planning stage. Factors such as data source diversity and intended users of the IDR influence data flow and synchronization, both of which are crucial factors in IDR architecture planning.
Affiliation(s)
- Kristina K Gagalova
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada; Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, Canada; Research Institute, BC Children's Hospital, Vancouver, BC, Canada
| | - M Angelica Leon Elizalde
- Research Institute, BC Children's Hospital, Vancouver, BC, Canada; School of Population and Public Health, University of British Columbia, Vancouver, BC, Canada
| | - Elodie Portales-Casamar
- Research Institute, BC Children's Hospital, Vancouver, BC, Canada; Department of Pediatrics, University of British Columbia, Vancouver, BC, Canada
| | - Matthias Görges
- Research Institute, BC Children's Hospital, Vancouver, BC, Canada; Department of Anesthesiology, Pharmacology and Therapeutics, University of British Columbia, Vancouver, BC, Canada
| |
|
11
|
Wist J. HastaLaVista, a web-based user interface for NMR-based untargeted metabolic profiling analysis in biomedical sciences: towards a new publication standard. J Cheminform 2019; 11:75. [PMID: 33430999 PMCID: PMC6896291 DOI: 10.1186/s13321-019-0399-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/13/2019] [Accepted: 11/27/2019] [Indexed: 02/19/2023] Open
Abstract
Metabolic profiling has been shown to be useful to improve our understanding of complex metabolic processes. Shared data are key to the analysis and validation of metabolic profiling and untargeted spectral analysis and may increase the pace of new discovery. Improving the existing portfolio of open software may increase the fraction of shared data by decreasing the amount of effort required to publish them in a manner that is useful to others. However, a weakness of open software, compared with commercial packages, is the lack of a user-friendly graphical interface, which may discourage inexperienced researchers. Here, a web-browser-oriented solution is presented and demonstrated for metabolic profiling analysis that combines the power of R for back-end statistical analyses and of JavaScript for front-end visualisations and user interactivity. This unique combination of statistical programming and web-browser visualisation brings enhanced data interoperability and interactivity into the open source realm. It is exemplified by characterizing the extent to which bariatric surgery perturbs the metabolism of rats, showing the value of the approach in iterative analysis by the end-user to establish a deeper understanding of the system perturbation. HastaLaVista is available at https://github.com/jwist/hastaLaVista (DOI: 10.5281/zenodo.3544800) under the MIT license. The approach described in this manuscript can be extended to connect the interface to other scripting languages such as Python, and to create interfaces for other types of data analysis.
Affiliation(s)
- Julien Wist
- Chemistry Department, Universidad del Valle, Cali, 76001, Valle del Cauca, Colombia.
| |
|
12
|
Abstract
The fundamental goal of the growing open science movement is to increase the efficiency of the global scientific community and accelerate progress and discoveries for the common good. Central to this principle is the rapid disclosure of research outputs in open-access peer-reviewed journals and on pre-print servers. The next bold step in this direction is open laboratory notebooks, where research scientists share their research - including detailed protocols, negative and positive results - online and in near-real-time to synergize with their peers. Here, we highlight the benefits of open lab notebooks to science, society and scientists, and discuss the challenges that this nascent movement is facing. We also present the implementation and progress of our own initiative at openlabnotebooks.org, with more than 20 active contributors after one year of operation.
Affiliation(s)
- Matthieu Schapira
- Structural Genomics Consortium, University of Toronto, Toronto, ON, M5G 1L7, Canada; Department of Pharmacology and Toxicology, University of Toronto, Toronto, ON, M5G 1L7, Canada
| | | | - Rachel J Harding
- Structural Genomics Consortium, University of Toronto, Toronto, ON, M5G 1L7, Canada
| |
|
13
|
Pupier M, Nuzillard JM, Wist J, Schlörer NE, Kuhn S, Erdelyi M, Steinbeck C, Williams AJ, Butts C, Claridge TD, Mikhova B, Robien W, Dashti H, Eghbalnia HR, Farès C, Adam C, Kessler P, Moriaud F, Elyashberg M, Argyropoulos D, Pérez M, Giraudeau P, Gil RR, Trevorrow P, Jeannerat D. NMReDATA, a standard to report the NMR assignment and parameters of organic compounds. Magn Reson Chem 2018; 56:703-715. [PMID: 29656574 PMCID: PMC6226248 DOI: 10.1002/mrc.4737] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Received: 01/22/2018] [Revised: 02/22/2018] [Accepted: 03/25/2018] [Indexed: 05/29/2023]
Abstract
Even though NMR has found countless applications in the field of small molecule characterization, there is no standard file format available for the NMR data relevant to structure characterization of small molecules. A new format is therefore introduced to associate the NMR parameters extracted from 1D and 2D spectra of organic compounds to the proposed chemical structure. These NMR parameters, which we shall call NMReDATA (for nuclear magnetic resonance extracted data), include chemical shift values, signal integrals, intensities, multiplicities, scalar coupling constants, lists of 2D correlations, relaxation times, and diffusion rates. The file format is an extension of the existing Structure Data Format, which is compatible with the commonly used MOL format. The association of an NMReDATA file with the raw and spectral data from which it originates constitutes an NMR record. This format is easily readable by humans and computers and provides a simple and efficient way for disseminating results of structural chemistry investigations, allowing automatic verification of published results, and for assisting the constitution of highly needed open-source structural databases.
Affiliation(s)
- Marion Pupier
- Department of Organic Chemistry, University of Geneva, 30 Quai E. Ansermet, 1211 Geneva 4, Switzerland
| | - Jean-Marc Nuzillard
- Institut de Chimie Moléculaire de Reims, UMR CNRS 7312, BP 1039, 51687, Reims Cedex 2, France
| | - Julien Wist
- Chemistry Department, Universidad del Valle, 76001 Cali, Colombia
| | - Nils E. Schlörer
- Department of Chemistry, University of Cologne, Greinstr. 4, 50939 Köln, Germany
| | - Stefan Kuhn
- Department of Chemistry, University of Cologne, Greinstr. 4, 50939 Köln, Germany
| | - Mate Erdelyi
- Department of Chemistry - BMC, Uppsala University, Husargatan 3, 752 37 Uppsala, Sweden
| | - Christoph Steinbeck
- Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University, Lessingstr. 8, 07743 Jena, Germany
| | - Antony J. Williams
- National Center for Computational Toxicology, Environmental Protection Agency, 109 T.W. Alexander Drive, Room D131I, Mail Drop D143-02, Research Triangle Park, NC 27711, USA
| | - Craig Butts
- School of Chemistry, Bristol University, BS8 1TS Bristol, UK
| | - Tim D.W. Claridge
- Department of Chemistry, University of Oxford, Chemistry Research Laboratory, Mansfield Road, Oxford OX1 3TA, UK
| | - Bozhana Mikhova
- Institute of Organic Chemistry with Centre of Phytochemistry, Bulgarian Academy of Sciences, Akad. G. Bonchev Str. Bl.9, Sofia 1113, Bulgaria
| | - Wolfgang Robien
- University of Vienna, Department of Organic Chemistry, Währingerstr. 38, 1090 Vienna, Austria
| | - Hesam Dashti
- Department of Biochemistry, National Magnetic Resonance Facility at Madison (NMRFAM), 433 Babcock Drive, Madison, WI, USA
| | - Hamid R. Eghbalnia
- Department of Biochemistry, National Magnetic Resonance Facility at Madison (NMRFAM), 433 Babcock Drive, Madison, WI, USA
| | - Christophe Farès
- Max-Planck-Institut für Kohlenforschung, Abteilung NMR, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
| | - Christian Adam
- Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
| | - Pavel Kessler
- Bruker BioSpin GmbH, Silberstreifen, 76287 Rheinstetten, Germany
| | - Fabrice Moriaud
- Bruker BioSpin AG, Industriestrasse 26, 8117 Fällanden, Switzerland
| | - Mikhail Elyashberg
- Moscow Department, Advanced Chemistry Development, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
| | - Dimitris Argyropoulos
- Advanced Chemistry Development, Inc. (ACD/Labs), Venture House, Arlington Square, Downshire Way, Bracknell, Berkshire RG12 1WA, UK
| | - Manuel Pérez
- Mestrelab Research, S.L., Feliciano Barrera 9B - Bajo, ES-15706 Santiago de Compostela, Spain
| | - Patrick Giraudeau
- EBSI Team, Chimie et Interdisciplinarité: Synthèse, Analyse, Modélisation (CEISAM) CNRS, UMR 6230, Université de Nantes, 92208, 2 rue de la Houssinière, BP 44322 Nantes, France
- Institut Universitaire de France, 1 rue Descartes, 75005 Paris Cedex 05, France
| | - Roberto R. Gil
- Department of Chemistry, Carnegie Mellon University, 4400 Fifth Ave., Pittsburgh, PA 15213, USA
| | | | - Damien Jeannerat
- Department of Organic Chemistry, University of Geneva, 30 Quai E. Ansermet, 1211 Geneva 4, Switzerland
| |
|