1. Stankovic B, Marinkovic F. A novel procedure for selection of molecular descriptors: QSAR model for mutagenicity of nitroaromatic compounds. Environmental Science and Pollution Research International 2024; 31:54603-54617. [PMID: 39207617 DOI: 10.1007/s11356-024-34800-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Accepted: 08/22/2024] [Indexed: 09/04/2024]
Abstract
Nitroaromatic compounds (NACs) stand out as pervasive organic pollutants, prompting an imperative need to investigate their hazardous effects. Computational chemistry methods play a crucial role in this exploration, offering a safer and more time-efficient approach, mandated by various legislations. In this study, our focus lay on the development of transparent, interpretable, reproducible, and publicly available methodologies aimed at deriving quantitative structure-activity relationship models and testing them by modelling the mutagenicity of NACs against the Salmonella typhimurium TA100 strain. Descriptors were selected from Mordred and RDKit molecular descriptors, along with several quantum chemistry descriptors. For that purpose, the genetic algorithm (GA), as the most widely used method in the literature, and three alternative algorithms (Boruta, Featurewiz, and ForwardSelector) combined with the forward stepwise selection technique were used. The construction of models utilized the multiple linear regression method, with subsequent scrutiny of fitting and predictive performance, reliability, and robustness through various statistical validation criteria. The models were ranked by the Multi-Criteria Decision Making procedure. Findings have revealed that the proposed methodology for descriptor selection outperforms GA, with Featurewiz showing a slight advantage over Boruta and ForwardSelector. These constructed models can serve as valuable tools for the quick and reliable prediction of NACs mutagenicity.
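As a rough illustration of forward stepwise selection combined with multiple linear regression, the sketch below greedily adds the descriptor that most increases R². It is not the authors' procedure: the function names (`forward_select`, `r_squared`), the stopping threshold `min_gain`, the use of training-set R² as the selection score, and the toy data are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def _solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear descriptors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(X: Sequence[Sequence[float]], y: Sequence[float],
              cols: Sequence[int]) -> float:
    """Fit y ~ intercept + selected columns by ordinary least squares; return R^2."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = _solve(xtx, xty)  # normal equations
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_select(X: Sequence[Sequence[float]], y: Sequence[float],
                   max_features: int,
                   min_gain: float = 1e-3) -> Tuple[List[int], float]:
    """Greedy forward selection: add the column with the largest R^2 gain."""
    selected: List[int] = []
    best_r2 = 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scores = {}
        for c in remaining:
            try:
                scores[c] = r_squared(X, y, selected + [c])
            except ValueError:
                continue  # skip descriptors collinear with the current model
        if not scores:
            break
        cand = max(scores, key=scores.get)
        if scores[cand] - best_r2 < min_gain:
            break  # no remaining descriptor improves the fit enough
        selected.append(cand)
        remaining.remove(cand)
        best_r2 = scores[cand]
    return selected, best_r2


# Toy data (hypothetical): the response depends only on descriptor 0.
X_toy = [[1, 5, 2], [2, 3, 9], [3, 8, 4], [4, 1, 7], [5, 6, 1], [6, 2, 8]]
y_toy = [3 * row[0] + 1 for row in X_toy]
chosen, fit = forward_select(X_toy, y_toy, max_features=2)
```

In practice the selection score would more likely be a cross-validated statistic than training-set R², since the latter never decreases as descriptors are added.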
Affiliation(s)
- Branislav Stankovic
  - Department for Nuclear and Plasma Physics, Vinča Institute of Nuclear Sciences - National Institute of the Republic of Serbia, University of Belgrade, Belgrade, Serbia

2. Huang ETC, Yang JS, Liao KYK, Tseng WCW, Lee CK, Gill M, Compas C, See S, Tsai FJ. Predicting blood-brain barrier permeability of molecules with a large language model and machine learning. Sci Rep 2024; 14:15844. [PMID: 38982309 PMCID: PMC11233737 DOI: 10.1038/s41598-024-66897-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 07/05/2024] [Indexed: 07/11/2024]
Abstract
Predicting the blood-brain barrier (BBB) permeability of small-molecule compounds using a novel artificial intelligence platform is necessary for drug discovery. Artificial intelligence (AI) tools based on machine learning and large language models improve the accuracy and shorten the time of new drug development. The primary goal of this research is to develop AI computing models and novel deep learning architectures capable of predicting whether molecules can permeate the human BBB. The in silico (computational) and in vitro (experimental) results were validated by the Natural Products Research Laboratories (NPRL) at China Medical University Hospital (CMUH). The transformer-based MegaMolBART was used as the simplified molecular input line entry system (SMILES) encoder with an XGBoost classifier as an in silico method to check whether a molecule could cross the BBB. As a baseline for comparison, we used Morgan (circular) fingerprints, which apply the Morgan algorithm to a set of atomic invariants, as the encoder, also with an XGBoost classifier. BBB permeability was assessed in vitro using three-dimensional (3D) human BBB spheroids (human brain microvascular endothelial cells, brain vascular pericytes, and astrocytes). Using multiple BBB databases, the final in silico transformer and XGBoost model achieved an area under the receiver operating characteristic curve of 0.88 on the held-out test dataset. Temozolomide (TMZ) and 21 randomly selected BBB permeable compounds (Pred scores = 1, indicating BBB-permeable) from the NPRL penetrated human BBB spheroid cells. No evidence suggests that ferulic acid or five BBB-impermeable compounds (Pred scores < 1.29423E-05, which designate compounds that do not pass through the human BBB) can pass through the spheroid cells of the BBB. Our in vitro validation experiments indicated that the in silico prediction of small-molecule permeation in the BBB model is accurate.
Transformer-based models like MegaMolBART, leveraging the SMILES representations of molecules, show great promise for applications in new drug discovery. These models have the potential to accelerate the development of novel targeted treatments for disorders of the central nervous system.
Affiliation(s)
- Eddie T C Huang
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- Jai-Sing Yang
  - Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
- Ken Y K Liao
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- Warren C W Tseng
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- C K Lee
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- Michelle Gill
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- Colin Compas
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- Simon See
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- Fuu-Jen Tsai
  - School of Chinese Medicine, College of Chinese Medicine, China Medical University, China Medical University Children's Hospital, No. 2, Yude Road, Taichung, 404332, Taiwan
  - China Medical University Children's Hospital, Taichung, Taiwan

3. Sardar S, Bhattacharya A, Amin SA, Jha T, Gayen S. Exploring molecular fingerprints of different drugs having bile interaction: a stepping stone towards better drug delivery. Mol Divers 2024; 28:1471-1483. [PMID: 37369957 DOI: 10.1007/s11030-023-10670-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 06/10/2023] [Indexed: 06/29/2023]
Abstract
Bile acids are amphiphilic substances produced naturally in humans. In the context of drug delivery and dosage form design, it is critical to understand whether a drug interacts with bile inside the gastrointestinal (GI) tract. This study focuses on the identification of structural fingerprints/features important for bile interaction. Molecular modelling methods such as Bayesian classification and recursive partitioning (RP) are used to find these fingerprints/features. For the Bayesian classification study, ROC scores of 0.837 and 0.950 are found for the training set and the test set compounds, respectively. Fingerprints such as the fluorine-containing aliphatic/aromatic group, the branched alkyl chain containing a hydroxyl moiety, and the phenothiazine ring are identified as good fingerprints that contribute positively to bile interaction, whereas bad fingerprints such as the free carboxylate group and the purine and pyrimidine rings contribute negatively. The best tree (tree ID: 1) from the RP study classifies bile-interacting and non-interacting compounds with a ROC score of 0.941 for the training set and 0.875 for the test set. Additionally, SARpy and QSAR-Co analyses have been performed to classify compounds as bile interacting/non-interacting. Moreover, forty-six recently FDA-approved drugs have been screened with the developed SARpy and QSAR-Co models to assess their bile interaction properties. Overall, this work may help researchers identify bile interacting/non-interacting molecules more quickly and aid the design of formulations and target-specific drug development.
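The abstract reports "ROC scores" without defining them. Assuming the term refers to the area under the ROC curve, the sketch below shows how such a score can be computed from class labels and classifier scores as the fraction of (positive, negative) pairs ranked correctly, with ties counted as one half. The function name `roc_auc` and the example values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for the positive class (e.g. bile interacting), 0 otherwise.
    scores: classifier output, where higher means more likely positive.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical example: three of the four positive/negative pairs are ordered correctly.
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives some context for the values of 0.837 to 0.950 quoted above.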
Affiliation(s)
- Sourav Sardar
  - Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Arijit Bhattacharya
  - Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Sk Abdul Amin
  - Department of Pharmaceutical Technology, JIS University, 81, Nilgunj Road, Agarpara, Kolkata, West Bengal, India
- Tarun Jha
  - Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Shovanlal Gayen
  - Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India

4. Toropov AA, Toropova AP, Roncaglioni A, Benfenati E. In silico prediction of the mutagenicity of nitroaromatic compounds using correlation weights of fragments of local symmetry. Mutation Research. Genetic Toxicology and Environmental Mutagenesis 2023; 891:503684. [PMID: 37770141 DOI: 10.1016/j.mrgentox.2023.503684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Revised: 07/24/2023] [Accepted: 08/17/2023] [Indexed: 10/03/2023]
Abstract
Most quantitative structure-property/activity relationship (QSPR/QSAR) techniques use separate programs for generating molecular descriptors and for building models from the available descriptors. Here, the capabilities of the CORAL program are evaluated. The user supplies, as the basis for models, the molecular structure represented in the simplified molecular input-line entry system (SMILES), together with experimental data on the endpoint of interest. The local symmetry of SMILES is a novel composition of symmetrically arranged symbols: three ('xyx'), four ('xyyx'), or five ('xyzyx') symbols. We updated our CORAL software with this new, flexible optimal descriptor, which is sensitive to the symmetric composition of a specific part of the molecule. Computational experiments have shown that taking account of these attributes of SMILES can improve the predictive potential of models for the mutagenicity of nitroaromatic compounds. In addition, these computational experiments have confirmed the advantage of using the index of ideality of correlation (IIC) and the correlation intensity index (CII) for Monte Carlo optimization of the correlation weights for various attributes of SMILES, including the local symmetry. The average coefficient of determination for the validation set (five different models) is 0.8589 ± 0.025 without fragments of local symmetry, whereas using fragments of local symmetry improves this measure of predictive potential to 0.9055 ± 0.010.
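The 'xyx', 'xyyx', and 'xyzyx' patterns described above can be made concrete with a short sketch that counts such fragments in a SMILES string. This is an illustration only and does not reproduce CORAL: it assumes every character is one symbol (so two-letter atoms such as Cl or Br are not handled) and that x, y, and z must be distinct. The function name `local_symmetry_counts` is hypothetical.

```python
from collections import Counter


def local_symmetry_counts(smiles: str) -> Counter:
    """Count symmetric 3-, 4- and 5-symbol fragments in a SMILES string.

    Assumptions (not taken from CORAL): one character = one symbol,
    and the symbols x, y, z within a fragment are pairwise distinct.
    """
    counts = Counter({"xyx": 0, "xyyx": 0, "xyzyx": 0})
    for i in range(len(smiles)):
        w3 = smiles[i:i + 3]
        w4 = smiles[i:i + 4]
        w5 = smiles[i:i + 5]
        # 'xyx': first and third symbols match, middle differs
        if len(w3) == 3 and w3[0] == w3[2] and w3[0] != w3[1]:
            counts["xyx"] += 1
        # 'xyyx': outer pair matches, inner pair matches, x != y
        if (len(w4) == 4 and w4[0] == w4[3] and w4[1] == w4[2]
                and w4[0] != w4[1]):
            counts["xyyx"] += 1
        # 'xyzyx': mirror-symmetric around a distinct central symbol
        if (len(w5) == 5 and w5[0] == w5[4] and w5[1] == w5[3]
                and len({w5[0], w5[1], w5[2]}) == 3):
            counts["xyzyx"] += 1
    return counts


# Carbon dioxide contains '=C=' (xyx) and 'O=C=O' (xyzyx).
co2 = local_symmetry_counts("O=C=O")
# Nitrobenzene written as O=N(=O)c1ccccc1 contains only 'c1c' (xyx).
nitrobenzene = local_symmetry_counts("O=N(=O)c1ccccc1")
```

In a correlation-weight scheme, each fragment found this way would receive its own weight during optimization; that step is outside the scope of this sketch.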
Affiliation(s)
- Andrey A Toropov
  - Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Science, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156 Milano, Italy
- Alla P Toropova
  - Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Science, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156 Milano, Italy
- Alessandra Roncaglioni
  - Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Science, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156 Milano, Italy
- Emilio Benfenati
  - Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Science, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156 Milano, Italy

5. Jillella GK, Roy K. QSAR modelling of organic dyes for their acute toxicity in Daphnia magna using 2D-descriptors. SAR and QSAR in Environmental Research 2022; 33:111-139. [PMID: 35156472 DOI: 10.1080/1062936x.2022.2033318] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/25/2021] [Accepted: 01/20/2022] [Indexed: 06/14/2023]
Abstract
The present study reports quantitative structure-activity relationship (QSAR) models for 22 organic dyes spanning a broad chemical domain to predict their toxicity in Daphnia magna [log (1/EC50)]. Only two-dimensional descriptors with clear physicochemical meaning were used to construct the QSAR models. The development, validation, and interpretation of the models adhere to the stringent guidelines of the Organization for Economic Cooperation and Development (OECD). In this study, the multi-layered stepwise regression method and the linear discriminant analysis (LDA) method were employed for regression- and classification-based models, respectively; however, the final regression-based QSAR models were obtained through partial least squares (PLS) regression. Additionally, the applicability domain of the developed models was verified. The constructed models should be applicable when toxicity data are lacking for new or untested dye structures, particularly when the compounds fall within the models' scope, and can also be used to develop more environmentally friendly alternatives.
Affiliation(s)
- G K Jillella
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Kolkata, India
- K Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India

6. Jillella GK, Ojha PK, Roy K. Application of QSAR for the identification of key molecular fragments and reliable predictions of effects of textile dyes on growth rate and biomass values of Raphidocelis subcapitata. Aquatic Toxicology (Amsterdam, Netherlands) 2021; 238:105925. [PMID: 34332198 DOI: 10.1016/j.aquatox.2021.105925] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/16/2021] [Revised: 06/27/2021] [Accepted: 07/19/2021] [Indexed: 06/13/2023]
Abstract
The current quantitative structure-activity relationship (QSAR) study seeks to explore the underlying causes of fluctuations in growth rate and biomass of microalgae mainly due to textile dyes. The derived QSAR models cover two endpoints: ErC50 (growth rate) and EbC50 (biomass) of Raphidocelis subcapitata. In order to extract the structural features involved, multiple PLS (partial least squares) models have been developed with easy to interpret and uncomplicated 2D descriptors having proper physico-chemical meaning. These descriptors were calculated from Dragon, SiRMS, and PaDEL-descriptor software. Then, the models were developed initially using stepwise regression followed by partial least squares (PLS) regression, and the model development procedure for both the endpoints (ErC50 and EbC50) followed the stringent Organization for Economic Cooperation and Development (OECD) rules. Later on, the model validation was carried out with statistically significant and internationally accepted metrics (both internally and externally) in both the cases. Next, we have used the "Intelligent Consensus Predictor" tool (available from http://teqip.jdvu.ac.in/QSAR_Tools/DTCLab/) to test the prediction quality with an "intelligent" approach to select multiple models. The estimated prediction quality for the appropriate test sets reveals that the consensus models (CM) surpass the quality shown by individual models (IM) for both the endpoints (ErC50 and EbC50). Finally, the developed models were able to identify the major contributing features (hydrophobic units, unsaturation, saturation, electronegativity, branched atoms and charged fragments) related to aquatic toxicity of textile dyes.
Affiliation(s)
- Gopala Krishna Jillella
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Chunilal Bhawan, 168, Maniktala Main Road, 700054, Kolkata, India
- Probir Kumar Ojha
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, 188 Raja S C Mullick Road, 700032, Kolkata, India
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, 188 Raja S C Mullick Road, 700032, Kolkata, India

7. Kuz’min V, Artemenko A, Ognichenko L, Hromov A, Kosinskaya A, Stelmakh S, Sessions ZL, Muratov EN. Simplex representation of molecular structure as universal QSAR/QSPR tool. Struct Chem 2021; 32:1365-1392. [PMID: 34177203 PMCID: PMC8218296 DOI: 10.1007/s11224-021-01793-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/15/2021] [Accepted: 05/07/2021] [Indexed: 10/24/2022]
Abstract
We review the development and application of the Simplex approach for the solution of various QSAR/QSPR problems. The general concept of the simplex method and its varieties are described. The advantages of utilizing this methodology, especially for the interpretation of QSAR/QSPR models, are presented in comparison to other fragmentary methods of molecular structure representation. The utility of SiRMS is demonstrated not only in the standard QSAR/QSPR applications, but also for mixtures, polymers, materials, and other complex systems. In addition to many different types of biological activity (antiviral, antimicrobial, antitumor, psychotropic, analgesic, etc.), toxicity and bioavailability, the review examines the simulation of important properties, such as water solubility, lipophilicity, as well as luminescence, and thermodynamic properties (melting and boiling temperatures, critical parameters, etc.). This review focuses on the stereochemical description of molecules within the simplex approach and details the possibilities of universal molecular stereo-analysis and stereochemical configuration description, along with stereo-isomerization mechanism and molecular fragment "topography" identification.
Affiliation(s)
- Victor Kuz’min
  - Department of Molecular Structures and Chemoinformatics, A.V. Bogatsky Physical-Chemical Institute NAS of Ukraine, Odessa, 65080 Ukraine
- Anatoly Artemenko
  - Department of Molecular Structures and Chemoinformatics, A.V. Bogatsky Physical-Chemical Institute NAS of Ukraine, Odessa, 65080 Ukraine
- Luidmyla Ognichenko
  - Department of Molecular Structures and Chemoinformatics, A.V. Bogatsky Physical-Chemical Institute NAS of Ukraine, Odessa, 65080 Ukraine
- Alexander Hromov
  - Department of Molecular Structures and Chemoinformatics, A.V. Bogatsky Physical-Chemical Institute NAS of Ukraine, Odessa, 65080 Ukraine
- Anna Kosinskaya
  - Department of Molecular Structures and Chemoinformatics, A.V. Bogatsky Physical-Chemical Institute NAS of Ukraine, Odessa, 65080 Ukraine
  - Department of Medical Chemistry, Odessa National Medical University, Odessa, 65082 Ukraine
- Sergij Stelmakh
  - Department of Molecular Structures and Chemoinformatics, A.V. Bogatsky Physical-Chemical Institute NAS of Ukraine, Odessa, 65080 Ukraine
- Zoe L. Sessions
  - UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC 27599 USA
- Eugene N. Muratov
  - UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC 27599 USA
  - Department of Pharmaceutical Sciences, Federal University of Paraiba, Joao Pessoa, PB 58059 Brazil

8. Hao Y, Sun G, Fan T, Tang X, Zhang J, Liu Y, Zhang N, Zhao L, Zhong R, Peng Y. In vivo toxicity of nitroaromatic compounds to rats: QSTR modelling and interspecies toxicity relationship with mouse. Journal of Hazardous Materials 2020; 399:122981. [PMID: 32534390 DOI: 10.1016/j.jhazmat.2020.122981] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 02/27/2020] [Revised: 05/14/2020] [Accepted: 05/16/2020] [Indexed: 06/11/2023]
Abstract
Nitroaromatic compounds (NACs) in the environment can cause serious public health and environmental problems due to their potential toxicity. This study established quantitative structure-toxicity relationship (QSTR) models for the acute oral toxicity of NACs towards rats following the stringent OECD principles for QSTR modelling. All models were assessed by various internationally accepted validation metrics and the OECD criteria. The best QSTR model contains seven simple and interpretable 2D descriptors with defined physicochemical meaning. Mechanistic interpretation indicated that van der Waals surface area, presence of C-F at topological distance 6, heteroatom content and frequency of C-N at topological distance 9 are main factors responsible for the toxicity of NACs. This proposed model was successfully applied to a true external set (295 compounds), and prediction reliability was analysed and discussed. Moreover, the rat-mouse and mouse-rat interspecies quantitative toxicity-toxicity relationship (iQTTR) models were also constructed, validated and employed in toxicity prediction for true external sets consisting of 67 and 265 compounds, respectively. These models showed good external predictivity that can be used to rapidly predict the rat oral acute toxicity of new or untested NACs falling within the applicability domain of the models, thus being beneficial in environmental risk assessment and regulatory purposes.
Affiliation(s)
- Yuxing Hao
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, PR China
- Guohui Sun
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, PR China
- Tengjiao Fan
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, PR China
- Xiaoyu Tang
  - College of Environmental and Energy Engineering, Beijing University of Technology, Beijing 100124, PR China
- Jing Zhang
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, PR China
- Yongdong Liu
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, PR China
- Na Zhang
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, PR China
- Lijiao Zhao
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, PR China
- Rugang Zhong
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, PR China
- Yongzhen Peng
  - National Engineering Laboratory for Advanced Municipal Wastewater Treatment and Reuse Technology, Engineering Research Center of Beijing, Beijing University of Technology, Beijing 100124, PR China