1
Jeliazkova N, Longhin E, El Yamani N, Rundén-Pran E, Moschini E, Serchi T, Vrček IV, Burgum MJ, Doak SH, Cimpan MR, Rios-Mondragon I, Cimpan E, Battistelli CL, Bossa C, Tsekovska R, Drobne D, Novak S, Repar N, Ammar A, Nymark P, Di Battista V, Sosnowska A, Puzyn T, Kochev N, Iliev L, Jeliazkov V, Reilly K, Lynch I, Bakker M, Delpivo C, Sánchez Jiménez A, Fonseca AS, Manier N, Fernandez-Cruz ML, Rashid S, Willighagen E, D Apostolova M, Dusinska M. A template wizard for the cocreation of machine-readable data-reporting to harmonize the evaluation of (nano)materials. Nat Protoc 2024; 19:2642-2684. [PMID: 38755447 DOI: 10.1038/s41596-024-00993-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 02/20/2024] [Indexed: 05/18/2024]
Abstract
Making research data findable, accessible, interoperable and reusable (FAIR) is typically hampered by a lack of skills in technical aspects of data management by data generators and a lack of resources. We developed a Template Wizard for researchers to easily create templates suitable for consistently capturing data and metadata from their experiments. The templates are easy to use and enable the compilation of machine-readable metadata to accompany data generation and align them to existing community standards and databases, such as eNanoMapper, streamlining the adoption of the FAIR principles. These templates are citable objects and are available as online tools. The Template Wizard is designed to be user friendly and facilitates using and reusing existing templates for new projects or project extensions. The wizard is accompanied by an online template validator, which allows self-evaluation of the template (to ensure mapping to the data schema and machine readability of the captured data) and transformation by an open-source parser into machine-readable formats, compliant with the FAIR principles. The templates are based on extensive collective experience in nanosafety data collection and include over 60 harmonized data entry templates for physicochemical characterization and hazard assessment (cell viability, genotoxicity, environmental organism dose-response tests, omics), as well as exposure and release studies. The templates are generalizable across fields and have already been extended and adapted for microplastics and advanced materials research. The harmonized templates improve the reliability of interlaboratory comparisons, data reuse and meta-analyses and can facilitate the safety evaluation and regulation process for (nano)materials.
Affiliation(s)
- Eleonora Longhin
  - Health Effects Laboratory, Department of Environmental Chemistry & Health Effects, The Climate and Environmental Research Institute NILU, Kjeller, Norway
- Naouale El Yamani
  - Health Effects Laboratory, Department of Environmental Chemistry & Health Effects, The Climate and Environmental Research Institute NILU, Kjeller, Norway
- Elise Rundén-Pran
  - Health Effects Laboratory, Department of Environmental Chemistry & Health Effects, The Climate and Environmental Research Institute NILU, Kjeller, Norway
- Elisa Moschini
  - Environmental Health group, Department of Environmental Research and Innovation, Luxembourg Institute of Science and Technology, Belvaux, Luxembourg
- Tommaso Serchi
  - Environmental Health group, Department of Environmental Research and Innovation, Luxembourg Institute of Science and Technology, Belvaux, Luxembourg
- Michael J Burgum
  - In Vitro Toxicology Group, Faculty of Medicine, Health and Life Sciences, Institute of Life Sciences, Swansea University Medical School, Singleton Park, Swansea, Wales, UK
- Shareen H Doak
  - In Vitro Toxicology Group, Faculty of Medicine, Health and Life Sciences, Institute of Life Sciences, Swansea University Medical School, Singleton Park, Swansea, Wales, UK
- Emil Cimpan
  - Department of Computer Science, Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, Bergen, Norway
- Cecilia Bossa
  - Environment and Health Department, Istituto Superiore di Sanità, Rome, Italy
- Rositsa Tsekovska
  - Medical and Biological Research Laboratory, Roumen Tsanev Institute of Molecular Biology-Bulgarian Academy of Sciences, Sofia, Bulgaria
- Damjana Drobne
  - Department of Biology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Sara Novak
  - Department of Biology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Neža Repar
  - Department of Biology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Ammar Ammar
  - Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
- Penny Nymark
  - Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
- Veronica Di Battista
  - BASF SE, Material Physics, Carl Bosch straße, Ludwigshafen, Germany
  - Department of Environmental and Resource Engineering, DTU, Kgs. Lyngby, Denmark
- Anita Sosnowska
  - QSAR Lab Ltd., Gdańsk, Poland
  - University of Gdańsk, Faculty of Chemistry, Gdańsk, Poland
- Tomasz Puzyn
  - QSAR Lab Ltd., Gdańsk, Poland
  - University of Gdańsk, Faculty of Chemistry, Gdańsk, Poland
- Nikolay Kochev
  - Ideaconsult Ltd., Sofia, Bulgaria
  - Department of Analytical Chemistry and Computer Chemistry, Faculty of Chemistry, University of Plovdiv, Plovdiv, Bulgaria
- Katie Reilly
  - School of Geography, Earth and Environmental Sciences, University of Birmingham, Birmingham, UK
- Iseult Lynch
  - School of Geography, Earth and Environmental Sciences, University of Birmingham, Birmingham, UK
- Martine Bakker
  - National Institute for Public Health and the Environment, Bilthoven, the Netherlands
- Araceli Sánchez Jiménez
  - Spanish National Institute of Health and Safety, Centro Nacional de Verificación de Maquinaria, Barakaldo, Spain
- Ana Sofia Fonseca
  - National Research Centre for the Working Environment, Copenhagen, Denmark
- Nicolas Manier
  - Ecotoxicology of Substances and Environmental Matrices Unit, French National Institute for Industrial Environment and Risks, Verneuil-en-Halatte, France
- María Luisa Fernandez-Cruz
  - Department of Environment and Agronomy, National Institute for Agriculture and Food Research and Technology, Spanish National Research Council, Madrid, Spain
- Shahzad Rashid
  - Institute of Occupational Medicine, Research Avenue North, Edinburgh, UK
- Egon Willighagen
  - Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
- Margarita D Apostolova
  - Medical and Biological Research Laboratory, Roumen Tsanev Institute of Molecular Biology-Bulgarian Academy of Sciences, Sofia, Bulgaria
- Maria Dusinska
  - Health Effects Laboratory, Department of Environmental Chemistry & Health Effects, The Climate and Environmental Research Institute NILU, Kjeller, Norway
2
Saifi I, Bhat BA, Hamdani SS, Bhat UY, Lobato-Tapia CA, Mir MA, Dar TUH, Ganie SA. Artificial intelligence and cheminformatics tools: a contribution to the drug development and chemical science. J Biomol Struct Dyn 2024; 42:6523-6541. [PMID: 37434311 DOI: 10.1080/07391102.2023.2234039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2023] [Accepted: 07/03/2023] [Indexed: 07/13/2023]
Abstract
In the ever-evolving field of drug discovery, the integration of Artificial Intelligence (AI) and Machine Learning (ML) with cheminformatics has proven to be a powerful combination. Cheminformatics, which combines the principles of computer science and chemistry, is used to extract chemical information and search compound databases, while the application of AI and ML allows for the identification of potential hit compounds, optimization of synthesis routes, and prediction of drug efficacy and toxicity. This collaborative approach has led to the discovery, preclinical evaluations and approval of over 70 drugs in recent years. To aid researchers in the pursuit of new drugs, this article presents a comprehensive list of databases, datasets, predictive and generative models, scoring functions and web platforms that have been launched between 2021 and 2022. These resources provide a wealth of information and tools for computer-assisted drug development, and are a valuable asset for those working in the field of cheminformatics. Overall, the integration of AI, ML and cheminformatics has greatly advanced the drug discovery process and continues to hold great potential for the future. As new resources and technologies become available, we can expect to see even more groundbreaking discoveries and advancements in these fields. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Ifra Saifi
  - Chaudhary Charan Singh University, Meerut, Uttar Pradesh, India
- Basharat Ahmad Bhat
  - Department of Bioresources, School of Biological Sciences, University of Kashmir, Srinagar, J&K, India
- Syed Suhail Hamdani
  - Department of Bioresources, School of Biological Sciences, University of Kashmir, Srinagar, J&K, India
- Umar Yousuf Bhat
  - Department of Zoology, School of Biological Sciences, University of Kashmir, Srinagar, J&K, India
- Mushtaq Ahmad Mir
  - Department of Clinical Laboratory Sciences, College of Applied Medical Science, King Khalid University, KSA, Saudi Arabia
- Tanvir Ul Hasan Dar
  - Department of Biotechnology, School of Biosciences and Biotechnology, BGSB University, Rajouri, India
- Showkat Ahmad Ganie
  - Department of Clinical Biochemistry, School of Biological Sciences, University of Kashmir, Srinagar, J&K, India
3
Ianevski A, Kushnir A, Nader K, Miihkinen M, Xhaard H, Aittokallio T, Tanoli Z. RepurposeDrugs: an interactive web-portal and predictive platform for repurposing mono- and combination therapies. Brief Bioinform 2024; 25:bbae328. [PMID: 38980370 PMCID: PMC11232279 DOI: 10.1093/bib/bbae328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 06/05/2024] [Accepted: 06/24/2024] [Indexed: 07/10/2024]
Abstract
RepurposeDrugs (https://repurposedrugs.org/) is a comprehensive web-portal that combines a unique drug indication database with a machine learning (ML) predictor to discover new drug-indication associations for approved as well as investigational mono and combination therapies. The platform provides detailed information on treatment status, disease indications and clinical trials across 25 indication categories, including neoplasms and cardiovascular conditions. The current version comprises 4314 compounds (approved, terminated or investigational) and 161 drug combinations linked to 1756 indications/conditions, totaling 28 148 drug-disease pairs. By leveraging data on both approved and failed indications, RepurposeDrugs provides ML-based predictions for the approval potential of new drug-disease indications, both for mono- and combinatorial therapies, demonstrating high predictive accuracy in cross-validation. The validity of the ML predictor is validated through a number of real-world case studies, demonstrating its predictive power to accurately identify repurposing candidates with a high likelihood of future approval. To our knowledge, RepurposeDrugs web-portal is the first integrative database and ML-based predictor for interactive exploration and prediction of both single-drug and combination approval likelihood across indications. Given its broad coverage of indication areas and therapeutic options, we expect it accelerates many future drug repurposing projects.
Affiliation(s)
- Aleksandr Ianevski
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Finland
- Aleksandr Kushnir
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Finland
- Kristen Nader
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Finland
- Mitro Miihkinen
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki and Helsinki University Hospital, Finland
- Henri Xhaard
  - Faculty of Pharmacy, University of Helsinki, Finland
  - Drug Discovery and Chemical Biology (DDCB) consortium, Biocenter Finland
- Tero Aittokallio
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki and Helsinki University Hospital, Finland
  - Institute for Cancer Research, Department of Cancer Genetics, Oslo University Hospital, Norway
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), Faculty of Medicine, University of Oslo, Norway
- Ziaurrehman Tanoli
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki and Helsinki University Hospital, Finland
  - Drug Discovery and Chemical Biology (DDCB) consortium, Biocenter Finland
  - BioICAWtech, Helsinki, Finland
4
Wang Y, Liu M, Jafari M, Tang J. A critical assessment of Traditional Chinese Medicine databases as a source for drug discovery. Front Pharmacol 2024; 15:1303693. [PMID: 38738181 PMCID: PMC11082401 DOI: 10.3389/fphar.2024.1303693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Accepted: 04/15/2024] [Indexed: 05/14/2024]
Abstract
Traditional Chinese Medicine (TCM) has been used for thousands of years to treat human diseases. Recently, many databases have been devoted to studying TCM pharmacology. Most of these databases include information about the active ingredients of TCM herbs and their disease indications. These databases enable researchers to interrogate the mechanisms of action of TCM systematically. However, there is a need for comparative studies of these databases, as they are derived from various resources with different data processing methods. In this review, we provide a comprehensive analysis of the existing TCM databases. We found that the information complements each other by comparing herbs, ingredients, and herb-ingredient pairs in these databases. Therefore, data harmonization is vital to use all the available information fully. Moreover, different TCM databases may contain various annotation types for herbs or ingredients, notably for the chemical structure of ingredients, making it challenging to integrate data from them. We also highlight the latest TCM databases on symptoms or gene expressions, suggesting that using multi-omics data and advanced bioinformatics approaches may provide new insights for drug discovery in TCM. In summary, such a comparative study would help improve the understanding of data complexity that may ultimately motivate more efficient and more standardized strategies towards the digitalization of TCM.
Affiliation(s)
- Yinyin Wang
  - School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Minxia Liu
  - Faculty of Life Science, Anhui Medical University, Hefei, China
- Mohieddin Jafari
  - Department Biochemistry and Developmental Biology, University of Helsinki, Helsinki, Finland
- Jing Tang
  - Department Biochemistry and Developmental Biology, University of Helsinki, Helsinki, Finland
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
6
Wang Y, Aldahdooh J, Hu Y, Yang H, Vähä-Koskela M, Tang J, Tanoli Z. DrugRepo: a novel approach to repurposing drugs based on chemical and genomic features. Sci Rep 2022; 12:21116. [PMID: 36477604 PMCID: PMC9729186 DOI: 10.1038/s41598-022-24980-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/21/2022] [Accepted: 11/23/2022] [Indexed: 12/12/2022]
Abstract
The drug development process consumes 9-12 years and approximately one billion US dollars in costs. Due to the high finances and time costs required by the traditional drug discovery paradigm, repurposing old drugs to treat cancer and rare diseases is becoming popular. Computational approaches are mainly data-driven and involve a systematic analysis of different data types leading to the formulation of repurposing hypotheses. This study presents a novel scoring algorithm based on chemical and genomic data to repurpose drugs for 669 diseases from 22 groups, including various cancers, musculoskeletal, infections, cardiovascular, and skin diseases. The data types used to design the scoring algorithm are chemical structures, drug-target interactions (DTI), pathways, and disease-gene associations. The repurposed scoring algorithm is strengthened by integrating the most comprehensive manually curated datasets for each data type. At DrugRepo score ≥ 0.4, we repurposed 516 approved drugs across 545 diseases. Moreover, hundreds of novel predicted compounds can be matched with ongoing studies at clinical trials. Our analysis is supported by a web tool available at: http://drugrepo.org/.
Affiliation(s)
- Yinyin Wang
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Jehad Aldahdooh
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Yingying Hu
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Hongbin Yang
  - Department of Chemistry, University of Cambridge, Cambridge, UK
- Markus Vähä-Koskela
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Jing Tang
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Ziaurrehman Tanoli
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
  - BioICAWtech, Helsinki, Finland
8
Aldahdooh J, Vähä-Koskela M, Tang J, Tanoli Z. Using BERT to identify drug-target interactions from whole PubMed. BMC Bioinformatics 2022; 23:245. [PMID: 35729494 PMCID: PMC9214985 DOI: 10.1186/s12859-022-04768-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2021] [Accepted: 06/03/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Drug-target interactions (DTIs) are critical for drug repurposing and elucidation of drug mechanisms, and are manually curated by large databases, such as ChEMBL, BindingDB, DrugBank and DrugTargetCommons. However, the number of curated articles likely constitutes only a fraction of all the articles that contain experimentally determined DTIs. Finding such articles and extracting the experimental information is a challenging task, and there is a pressing need for systematic approaches to assist the curation of DTIs. To this end, we applied Bidirectional Encoder Representations from Transformers (BERT) to identify such articles. Because DTI data intimately depends on the type of assays used to generate it, we also aimed to incorporate functions to predict the assay format. RESULTS: Our novel method identified 0.6 million articles (along with drug and protein information) which are not previously included in public DTI databases. Using 10-fold cross-validation, we obtained ~99% accuracy for identifying articles containing quantitative drug-target profiles. The F1 micro for the prediction of assay format is 88%, which leaves room for improvement in future studies. CONCLUSION: The BERT model in this study is robust and the proposed pipeline can be used to identify previously overlooked articles containing quantitative DTIs. Overall, our method provides a significant advancement in machine-assisted DTI extraction and curation. We expect it to be a useful addition to drug mechanism discovery and repurposing.
Affiliation(s)
- Jehad Aldahdooh
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - Doctoral Programme in Computer Science, University of Helsinki, Helsinki, Finland
- Markus Vähä-Koskela
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Jing Tang
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Ziaurrehman Tanoli
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - BioICAWtech, Helsinki, Finland