1
|
Yuan X, Qing J, Zhi W, Wu F, Yan Y, Li Y. Gut and respiratory microbiota landscapes in IgA nephropathy: a cross-sectional study. Ren Fail 2024; 46:2399749. [PMID: 39248406 PMCID: PMC11385635 DOI: 10.1080/0886022x.2024.2399749] [Received: 01/19/2024] [Revised: 08/27/2024] [Accepted: 08/29/2024] [Indexed: 09/10/2024]
Abstract
BACKGROUND IgA nephropathy (IgAN) is intimately linked to mucosal immune responses, with nasopharyngeal and intestinal lymphoid tissues being crucial for its abnormal mucosal immunity. The specific pathogenic bacteria in these sites associated with IgAN, however, remain elusive. Our study employs 16S rRNA sequencing and machine learning (ML) approaches to identify specific pathogenic bacteria in these locations and to investigate common pathogens that may exacerbate IgAN. METHODS In this cross-sectional analysis, we collected pharyngeal swabs and stool specimens from IgAN patients and healthy controls. We applied 16S rRNA sequencing to identify differential microbial populations. ML algorithms were then used to classify IgAN based on these microbial differences. Spearman correlation analysis was employed to link key bacteria with clinical parameters. RESULTS We observed reduced microbial diversity in IgAN patients compared with healthy controls. In the gut microbiota of IgAN patients, increases in Bacteroides, Escherichia-Shigella, and Parabacteroides, and decreases in Parasutterella, Dialister, Faecalibacterium, and Subdoligranulum were notable. In the respiratory microbiota, increases in Neisseria, Streptococcus, Fusobacterium, Porphyromonas, and Ralstonia, and decreases in Prevotella, Leptotrichia, and Veillonella were observed. After immunosuppressive therapy, Oxalobacter and Butyricoccus levels were significantly reduced in the gut, while Neisseria and Actinobacillus levels decreased in the respiratory tract. Veillonella and Fusobacterium appeared to influence IgAN through dual immune loci, with Fusobacterium abundance correlating with IgAN severity. CONCLUSIONS This study reveals that changes in flora structure could provide important pathological insights for identifying therapeutic targets, and that ML could facilitate noninvasive diagnostic methods for IgAN.
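The Spearman correlation step mentioned in the Methods can be illustrated with a minimal sketch. It ranks both variables (tied values share the mean rank) and takes the Pearson correlation of the ranks. The function names and the sample values below are hypothetical and are not taken from the study.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical values for illustration only (not data from the study):
# relative abundance of one genus and one clinical parameter per patient.
abundance = [0.01, 0.03, 0.02, 0.08, 0.05]
clinical_value = [0.4, 1.1, 0.9, 2.5, 1.8]
rho = spearman(abundance, clinical_value)
print(f"Spearman rho = {rho:.3f}")
```

Because the statistic depends only on ranks, it is insensitive to the skewed scale of relative-abundance data, which is one common reason for preferring it over Pearson correlation in microbiome work.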
Affiliation(s)
- Xiaoli Yuan
- The Fifth Clinical Medical College of Shanxi Medical University, Taiyuan, China
- Department of Nephrology, Shanxi Provincial People's Hospital (Fifth Hospital), Shanxi Medical University, Taiyuan, China
| | - Jianbo Qing
- Department of Nephrology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Wenqiang Zhi
- The Fifth Clinical Medical College of Shanxi Medical University, Taiyuan, China
| | - Feng Wu
- The Fifth Clinical Medical College of Shanxi Medical University, Taiyuan, China
| | - Yan Yan
- The Fifth Clinical Medical College of Shanxi Medical University, Taiyuan, China
| | - Yafeng Li
- Department of Nephrology, Shanxi Provincial People's Hospital (Fifth Hospital), Shanxi Medical University, Taiyuan, China
- Core Laboratory, Shanxi Provincial People's Hospital (Fifth Hospital), Shanxi Medical University, Taiyuan, China
- Medicinal Basic Research Innovation Center of Chronic Kidney Disease, Ministry of Education, Shanxi Medical University, Taiyuan, China
- Academy of Microbial Ecology, Shanxi Medical University, Taiyuan, China
| |
|
2
|
Yu X, Eid Y, Jama M, Pham D, Ahmed M, Attar MS, Samiuddin Z, Barakat K. Combining machine learning, molecular dynamics, and free energy analysis for (5HT)-2A receptor modulator classification. J Mol Graph Model 2024; 132:108842. [PMID: 39151376 DOI: 10.1016/j.jmgm.2024.108842] [Received: 03/10/2024] [Revised: 07/03/2024] [Accepted: 08/02/2024] [Indexed: 08/19/2024]
Abstract
The 5-Hydroxytryptamine (5HT)-2A receptor, a key target in psychoactive drug development, presents significant challenges in the design of selective compounds. Here, we describe the construction, evaluation and validation of two machine learning (ML) models for the classification of bioactivity mechanisms against the (5HT)-2A receptor. Employing neural networks and XGBoost models, we achieved an overall accuracy of around 87%, which was further enhanced through molecular modelling (MM) (e.g. molecular dynamics simulations) and binding free energy analysis. This ML-MM integration provided insights into the mechanisms of direct modulators and prodrugs. A significant outcome of the current study is the development of a 'binding free energy fingerprint' specific to (5HT)-2A modulators, offering a novel metric for evaluating drug efficacy against this target. Our study demonstrates the promise of a workflow that combines AI with structural biology, offering a powerful tool for advancing psychoactive drug discovery.
Affiliation(s)
- Xian Yu
- Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB, Canada
| | - Yasmine Eid
- Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB, Canada
| | - Maryam Jama
- Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB, Canada
| | - Diane Pham
- Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB, Canada
| | - Marawan Ahmed
- Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB, Canada
| | - Melika Shabani Attar
- Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB, Canada
| | - Zainab Samiuddin
- Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB, Canada
| | - Khaled Barakat
- Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB, Canada.
| |
|
3
|
Zhao L, Qiu Q, Zhang S, Yan F, Li X. Tau pathology mediated the plasma biomarkers and cognitive function in patients with mild cognitive impairment. Exp Gerontol 2024; 195:112535. [PMID: 39128687 DOI: 10.1016/j.exger.2024.112535] [Received: 05/05/2024] [Revised: 07/27/2024] [Accepted: 07/31/2024] [Indexed: 08/13/2024]
Abstract
Glial fibrillary acidic protein (GFAP) and neurofilament light (NfL) are putative non-amyloid biomarkers indicative of ongoing inflammatory and neurodegenerative disease processes. Hence, this study aimed to demonstrate the relationship between plasma biomarkers (GFAP and NfL) and 18F-AV-1451 tau PET images, and to explore their effects on cognitive function. Ninety-one participants from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database and 20 participants from the Shanghai Action of Prevention Dementia for the Elderly (SHAPE) cohort underwent plasma biomarker testing, 18F-AV-1451 tau PET scans and cognitive function assessments. Within the ADNI, there were 42 cognitively normal (CN) individuals and 49 with mild cognitive impairment (MCI). Similarly, in the SHAPE, we had 10 CN and 10 MCI participants. We calculated the standardized uptake value ratios (SUVRs) for the regions of interest (ROIs) in the 18F-AV-1451 PET scans. Using plasma biomarkers and regional SUVRs, we trained machine learning models to differentiate between MCI and CN subjects on the ADNI database and validated them in the SHAPE cohort. Results showed that eight variables selected by LASSO (left amygdala SUVR, right amygdala SUVR, left entorhinal cortex SUVR, age, education, plasma NfL, plasma GFAP, and plasma GFAP/NfL) could differentiate between the MCI and CN individuals, with AUC ranging from 0.783 to 0.926. Additionally, cognitive function was negatively associated with the plasma biomarkers and with tau deposition in the amygdala and left entorhinal cortex. Increased tau deposition in the amygdala and left entorhinal cortex was related to increased plasma biomarkers. Moreover, tau pathology mediated the effect of plasma biomarker levels on cognitive decline. The present study provides valuable insights into the association among plasma markers (GFAP and NfL), regional tau deposition and cognitive function.
This study reports the mediating effect of regional tau deposition on the relationship between plasma biomarker levels and cognitive function, indicating the significance of tau pathology in patients with MCI.
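The AUC values quoted above can be read as the probability that a randomly chosen MCI participant receives a higher model score than a randomly chosen CN participant. A minimal sketch of that pairwise definition follows; the scores are hypothetical, and the models and LASSO selection used in the study are not reproduced here.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier scores for illustration only
mci_scores = [0.9, 0.7, 0.6, 0.4]
cn_scores = [0.5, 0.3, 0.2, 0.1]
auc = roc_auc(mci_scores, cn_scores)
print(auc)  # 15 of 16 pairs are ordered correctly -> 0.9375
```

The double loop is quadratic in the number of participants, which is acceptable for cohorts of this size; rank-based formulations give the same value more efficiently on large data sets.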
Affiliation(s)
- Lu Zhao
- Department of Geriatric Psychiatry, Shanghai Mental Health Center, Shanghai Jiao Tong University of Medicine, Shanghai, China
| | - Qi Qiu
- Department of Geriatric Psychiatry, Shanghai Mental Health Center, Shanghai Jiao Tong University of Medicine, Shanghai, China
| | - Shaowei Zhang
- Department of Geriatric Psychiatry, Shanghai Mental Health Center, Shanghai Jiao Tong University of Medicine, Shanghai, China
| | - Feng Yan
- Department of Geriatric Psychiatry, Shanghai Mental Health Center, Shanghai Jiao Tong University of Medicine, Shanghai, China.
| | - Xia Li
- Department of Geriatric Psychiatry, Shanghai Mental Health Center, Shanghai Jiao Tong University of Medicine, Shanghai, China.
| |
|
4
|
Lin S, Zhuang Y, Chen K, Lu J, Wang K, Han L, Li M, Li X, Zhu X, Yang M, Yin G, Lin J, Zhang X. Osteoinductive biomaterials: Machine learning for prediction and interpretation. Acta Biomater 2024; 187:422-433. [PMID: 39178926 DOI: 10.1016/j.actbio.2024.08.017] [Received: 06/13/2024] [Revised: 08/04/2024] [Accepted: 08/13/2024] [Indexed: 08/26/2024]
Abstract
Biomaterials with osteoinductivity are widely used for bone defect repair due to their unique structures and functions. Machine learning (ML) is pivotal in analyzing osteoinductivity and accelerating new material design. However, challenges include creating a comprehensive database of osteoinductive materials and dealing with low-quality, disparate data. Ectopic ossification has been used as a standard for evaluating the osteoinductivity of biomaterials. This paper compiles research findings from the past thirty years, resulting in a robust database validated by experts. To tackle issues of limited data samples, missing data, and high-dimensional sparsity, a data enhancement strategy is developed. This approach achieved an area under the curve (AUC) of 0.921, a precision of 0.839, and a recall of 0.833. Model interpretation identified key factors such as porosity, bone morphogenetic protein-2 (BMP-2), and hydroxyapatite (HA) proportion as crucial determinants of outcomes. Optimizing pore structure and material composition through partial dependence plot (PDP) analysis led to a new bone area ratio of 14.7 ± 7% in animal experiments, surpassing the database average of 10.97%. This highlights the significant potential of ML in the development and design of osteoinductive materials. STATEMENT OF SIGNIFICANCE: This study leverages machine learning to analyze osteoinductive biomaterials, addressing challenges in database creation and data quality. Our data enhancement strategy significantly improved model performance. By optimizing pore structure and material composition, we increased new bone formation rates, showcasing the vast potential of machine learning in biomaterial design.
Affiliation(s)
- Sicong Lin
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China
| | - Yan Zhuang
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China
| | - Ke Chen
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China
| | - Jian Lu
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China; National Engineering Research Centre for Biomaterials, Sichuan University, Chengdu 610065, China
| | - Kefeng Wang
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China; National Engineering Research Centre for Biomaterials, Sichuan University, Chengdu 610065, China
| | - Lin Han
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China
| | - Mufei Li
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China
| | - Xiangfeng Li
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China; National Engineering Research Centre for Biomaterials, Sichuan University, Chengdu 610065, China.
| | - Xiangdong Zhu
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China; National Engineering Research Centre for Biomaterials, Sichuan University, Chengdu 610065, China.
| | - Mingli Yang
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China; National Engineering Research Centre for Biomaterials, Sichuan University, Chengdu 610065, China
| | - Guangfu Yin
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China
| | - Jiangli Lin
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China.
| | - Xingdong Zhang
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Provincial Engineering Research Center for Biomaterials Genome of Sichuan, Sichuan University, Chengdu 610065, China; National Engineering Research Centre for Biomaterials, Sichuan University, Chengdu 610065, China
| |
|
5
|
Komissarov L, Manevski N, Groebke Zbinden K, Schindler T, Zitnik M, Sach-Peltason L. Actionable Predictions of Human Pharmacokinetics at the Drug Design Stage. Mol Pharm 2024; 21:4356-4371. [PMID: 39132855 DOI: 10.1021/acs.molpharmaceut.4c00311] [Indexed: 08/13/2024]
Abstract
We present a novel computational approach for predicting human pharmacokinetics (PK) that addresses the challenges of early-stage drug design. Our study introduces and describes a large-scale data set of 11 clinical PK end points, encompassing over 2700 unique chemical structures to train machine learning models. To that end, multiple advanced training strategies are compared, including the integration of in vitro data and a novel self-supervised pretraining task. In addition to the predictions, our final model provides meaningful epistemic uncertainties for every data point. This allows us to successfully identify regions of exceptional predictive performance, with an absolute average fold error (AAFE/geometric mean fold error) of less than 2.5 across multiple end points. Together, these advancements represent a significant leap toward actionable PK predictions, which can be utilized early on in the drug design process to expedite development and reduce reliance on nonclinical studies.
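For readers unfamiliar with the metric, the absolute average fold error is commonly defined as 10 raised to the mean absolute log10 ratio of predicted to observed values, so an AAFE of 2 means predictions are off by a factor of two on average, in either direction. The sketch below assumes this standard definition; the function name and sample values are illustrative and not taken from the paper.

```python
import math

def aafe(predicted, observed):
    """Absolute average fold error: 10 ** mean(|log10(predicted / observed)|)."""
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))

# Hypothetical values for one PK end point (illustration only)
predicted = [12.0, 3.5, 40.0]
observed = [10.0, 7.0, 20.0]
print(f"AAFE = {aafe(predicted, observed):.2f}")
```

Taking the absolute value before averaging prevents over- and under-predictions from cancelling, which is why AAFE is always at least 1.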
Affiliation(s)
- Leonid Komissarov
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, Basel 4070, Switzerland
| | - Nenad Manevski
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, Basel 4070, Switzerland
| | - Katrin Groebke Zbinden
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, Basel 4070, Switzerland
| | - Torsten Schindler
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, Basel 4070, Switzerland
| | - Marinka Zitnik
- Harvard Medical School, Department of Biomedical Informatics, Boston, Massachusetts 02115, United States
| | - Lisa Sach-Peltason
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, Basel 4070, Switzerland
| |
|
6
|
Zhang X, Jablonka KM, Smit B. Deep learning-based recommendation system for metal-organic frameworks (MOFs). DIGITAL DISCOVERY 2024; 3:1410-1420. [PMID: 38993728 PMCID: PMC11235176 DOI: 10.1039/d4dd00116h] [Received: 04/23/2024] [Accepted: 06/06/2024] [Indexed: 07/13/2024]
Abstract
This work presents a recommendation system for metal-organic frameworks (MOFs) inspired by online content platforms. By leveraging the unsupervised Doc2Vec model trained on document-structured intrinsic MOF characteristics, the model embeds MOFs into a high-dimensional chemical space and suggests a pool of promising materials for specific applications based on user-endorsed MOFs with similarity analysis. This proposed approach significantly reduces the need for exhaustive labeling of every material in the database, focusing instead on a select fraction for in-depth investigation. Ranging from methane storage and carbon capture to quantum properties, this study illustrates the system's adaptability to various applications.
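The similarity-analysis step can be sketched as follows, assuming each MOF has already been embedded as a numeric vector. The Doc2Vec training itself is not shown, and the MOF names, vectors, and the rule of scoring each candidate by its best match to any endorsed MOF are hypothetical choices made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def recommend(embeddings, liked, top_k=2):
    """Rank unseen materials by their best cosine similarity to any liked one."""
    scores = {
        name: max(cosine(vec, embeddings[ref]) for ref in liked)
        for name, vec in embeddings.items()
        if name not in liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Hypothetical 3-dimensional embeddings; real embeddings would be much larger
embeddings = {
    "MOF-A": [1.0, 0.0, 0.2],
    "MOF-B": [0.9, 0.1, 0.3],
    "MOF-C": [0.0, 1.0, 0.0],
    "MOF-D": [0.1, 0.9, 0.1],
}
suggestions = recommend(embeddings, liked=["MOF-A"], top_k=2)
print(suggestions)  # ['MOF-B', 'MOF-D']
```

Only the endorsed materials need labels here; every other material is ranked purely from its position in the embedding space, which is the labeling saving the abstract describes.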
Affiliation(s)
- Xiaoqi Zhang
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne(EPFL) Rue de l'Industrie 17 CH-1951 Sion Valais Switzerland
| | - Kevin Maik Jablonka
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne(EPFL) Rue de l'Industrie 17 CH-1951 Sion Valais Switzerland
- Laboratory of Organic and Macromolecular Chemistry (IOMC), Friedrich Schiller University Jena Humboldtstrasse 10 07743 Jena Germany
- Helmholtz Institute for Polymers in Energy Applications Jena (HIPOLE Jena) Lessingstrasse 12-14 07743 Jena Germany
| | - Berend Smit
- Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne(EPFL) Rue de l'Industrie 17 CH-1951 Sion Valais Switzerland
| |
|
7
|
Aksu GO, Keskin S. Rapid and Accurate Screening of the COF Space for Natural Gas Purification: COFInformatics. ACS APPLIED MATERIALS & INTERFACES 2024; 16:19806-19818. [PMID: 38588323 DOI: 10.1021/acsami.4c01641] [Indexed: 04/10/2024]
Abstract
In this work, we introduced COFInformatics, a computational approach merging molecular simulations and machine learning (ML) algorithms, to evaluate all synthesized and hypothetical covalent organic frameworks (COFs) for the CO2/CH4 mixture separation under four different adsorption-based processes: pressure swing adsorption (PSA), vacuum swing adsorption (VSA), temperature swing adsorption (TSA), and pressure-temperature swing adsorption (PTSA). We first extracted structural, chemical, energy-based, and graph-based molecular fingerprint features of every single COF structure in the very large COF space, consisting of nearly 70,000 materials, and then performed grand canonical Monte Carlo simulations to calculate the CO2/CH4 mixture adsorption properties of 7540 COFs. These features and simulation results were used to develop ML models that accurately and rapidly predict CO2/CH4 mixture adsorption and separation properties of all 68,614 COFs. The most efficient separation process and the best adsorbent candidates among the entire COF spectrum were identified and analyzed in detail to reveal the most important molecular features that lead to high-performance adsorbents. Our results showed that (i) many hypoCOFs outperform synthesized COFs by achieving higher CO2/CH4 selectivities; (ii) the top COF adsorbents consist of narrow pores and linkers comprising aromatic, triazine, and halogen groups; and (iii) PTSA is the most efficient process to use COF adsorbents for natural gas purification. We believe that COFInformatics promises to expedite the evaluation of COF adsorbents for CO2/CH4 separation, thereby circumventing the extensive, time- and resource-intensive molecular simulations.
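Two adsorbent metrics commonly derived from mixture uptakes are the adsorption selectivity, (q_CO2 / q_CH4) / (y_CO2 / y_CH4), and the working capacity, the uptake at adsorption conditions minus the uptake at desorption conditions. The sketch below assumes these standard definitions; the abstract does not list the exact metrics the authors computed, and the uptake values are hypothetical.

```python
def adsorption_selectivity(q_co2, q_ch4, y_co2, y_ch4):
    """Selectivity for CO2 over CH4.

    q_*: adsorbed amounts of each component (same units, e.g. mol/kg)
    y_*: bulk gas-phase mole fractions of each component
    """
    return (q_co2 / q_ch4) / (y_co2 / y_ch4)

def working_capacity(q_adsorption, q_desorption):
    """Uptake at adsorption conditions minus uptake at desorption conditions."""
    return q_adsorption - q_desorption

# Hypothetical uptakes (mol/kg) for an equimolar CO2/CH4 feed, illustration only
selectivity = adsorption_selectivity(q_co2=3.0, q_ch4=1.5, y_co2=0.5, y_ch4=0.5)
delta_q = working_capacity(q_adsorption=3.0, q_desorption=1.0)
print(selectivity, delta_q)  # 2.0 2.0
```

The four processes named in the abstract (PSA, VSA, TSA, PTSA) differ in which pressure and temperature define the adsorption and desorption states, so the same material can have a different working capacity under each.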
Affiliation(s)
- Gokhan Onder Aksu
- Department of Chemical and Biological Engineering, Koc University, Rumelifeneri Yolu, Sariyer, 34450 Istanbul, Turkey
| | - Seda Keskin
- Department of Chemical and Biological Engineering, Koc University, Rumelifeneri Yolu, Sariyer, 34450 Istanbul, Turkey
| |
|
8
|
Deng C, Liao J, Fu Z, Fu F, Li D, Li Y, Wang J, Chen H, Zhang Y. Systemic immune index predicts tumor-infiltrating lymphocyte intensity and immunotherapy response in small cell lung cancer. Transl Lung Cancer Res 2024; 13:292-306. [PMID: 38496688 PMCID: PMC10938096 DOI: 10.21037/tlcr-23-696] [Received: 10/28/2023] [Accepted: 02/02/2024] [Indexed: 03/19/2024]
Abstract
Background Despite recent progress in immune checkpoint blockade (ICB) in small-cell lung cancer (SCLC), a lack of understanding regarding the systemic tumor immune environment (STIE) and local tumor immune microenvironment (TIME) makes it difficult to accurately predict clinical outcomes and identify potential beneficiaries of ICB therapy. Methods We enrolled 191 patients with stage I-III SCLC and comprehensively evaluated the prognostic role of STIE using several quantitative measurements, and further integrated it with a local immune score system (LISS) established by the eXtreme Gradient Boosting (XGBoost) machine learning algorithm. We also tested the value of STIE in beneficiary selection in our independent advanced SCLC cohort receiving programmed cell death 1 ligand 1 (PD-L1) blockade therapy. Results Among several systemic immune markers, the STIE as assessed by the prognostic nutritional index (PNI) was correlated with disease-free survival (DFS) and overall survival (OS), and remained an independent prognostic factor for SCLC patients [hazard ratio (HR): 0.473, 95% confidence interval (CI): 0.241-0.929, P=0.030]. A higher PNI score was closely associated with the inflamed SCLC molecular subtype and local tumor-infiltrating lymphocytes (TILs). We further constructed a LISS that combined the top three local immune biomarkers by importance (CD8+ T-cell count, PD-L1 expression on CD8+ T-cells, and CD4+ T-cell count) and integrated it with the PNI score. The final integrated immune risk system was an independent prognostic factor and achieved better predictive performance than Tumor Node Metastasis (TNM) stage or any single immune biomarker. Furthermore, PNI-high extensive-stage SCLC patients achieved better clinical response and longer progression-free survival (PFS) (11.8 vs. 5.9 months, P=0.012) from PD-L1 blockade therapy.
Conclusions This study provides a method to investigate the prognostic value of overall immune status by combining the PNI with local immune biomarkers in SCLC. The promising clinical application of PNI in efficacy prediction and beneficiary selection for SCLC immunotherapy is also highlighted.
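The abstract does not state how the PNI was calculated. A minimal sketch follows, assuming the widely used Onodera formula: 10 × serum albumin (g/dL) + 0.005 × total lymphocyte count (per mm³). Whether the authors used exactly this form, and what cutoff separated PNI-high from PNI-low patients, is not given in the abstract; the patient values below are hypothetical.

```python
def prognostic_nutritional_index(albumin_g_dl, lymphocytes_per_mm3):
    """PNI by the Onodera formula (assumed here, not confirmed by the abstract).

    albumin_g_dl: serum albumin in g/dL
    lymphocytes_per_mm3: total peripheral lymphocyte count per cubic millimetre
    """
    return 10.0 * albumin_g_dl + 0.005 * lymphocytes_per_mm3

# Hypothetical patient values for illustration only
pni = prognostic_nutritional_index(albumin_g_dl=4.0, lymphocytes_per_mm3=2000)
print(f"PNI = {pni:.1f}")
```

Because both inputs come from routine blood tests, the index can be computed without additional sampling, which is what makes it attractive as a systemic marker.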
Affiliation(s)
- Chaoqiang Deng
- Department of Thoracic Surgery and State Key Laboratory of Genetic Engineering, Fudan University Shanghai Cancer Center, Shanghai, China
- Institute of Thoracic Oncology, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Jiatao Liao
- Institute of Thoracic Oncology, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Department of Thoracic Medical Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
| | - Zichen Fu
- Department of Thoracic Surgery and State Key Laboratory of Genetic Engineering, Fudan University Shanghai Cancer Center, Shanghai, China
- Institute of Thoracic Oncology, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Fangqiu Fu
- Department of Thoracic Surgery and State Key Laboratory of Genetic Engineering, Fudan University Shanghai Cancer Center, Shanghai, China
- Institute of Thoracic Oncology, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Di Li
- Department of Thoracic Surgery and State Key Laboratory of Genetic Engineering, Fudan University Shanghai Cancer Center, Shanghai, China
- Institute of Thoracic Oncology, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Yuan Li
- Institute of Thoracic Oncology, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Department of Pathology, Fudan University Shanghai Cancer Center, Shanghai, China
| | - Jialei Wang
- Institute of Thoracic Oncology, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Department of Thoracic Medical Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
| | - Haiquan Chen
- Department of Thoracic Surgery and State Key Laboratory of Genetic Engineering, Fudan University Shanghai Cancer Center, Shanghai, China
- Institute of Thoracic Oncology, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Yang Zhang
- Department of Thoracic Surgery and State Key Laboratory of Genetic Engineering, Fudan University Shanghai Cancer Center, Shanghai, China
- Institute of Thoracic Oncology, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| |
|
9
|
Yang Y, Guo S, Li S, Wu Y, Qiao Z. Topological Data Analysis Combined with High-Throughput Computational Screening of Hydrophobic Metal-Organic Frameworks: Application to the Adsorptive Separation of C3 Components. NANOMATERIALS (BASEL, SWITZERLAND) 2024; 14:298. [PMID: 38334569 PMCID: PMC10857702 DOI: 10.3390/nano14030298] [Received: 01/05/2024] [Revised: 01/29/2024] [Accepted: 01/30/2024] [Indexed: 02/10/2024]
Abstract
The shape and topology of pores have significant impacts on the gas storage properties of nanoporous materials. Metal-organic frameworks (MOFs) are ideal materials to tailor to the needs of specific applications, owing to properties such as their tunable structure and high specific surface area. It is, therefore, particularly important to develop descriptors that accurately identify the topological features of MOF pores. In this work, a topological data analysis method was used to develop a topological descriptor, based on the pore topology, which was combined with the Extreme Gradient Boosting (XGBoost) algorithm to predict the adsorption performance of MOFs for methane/ethane/propane. The final results show that this descriptor can accurately predict the performance of MOFs, and that introducing the topological descriptor significantly improves model accuracy, with an increase of up to 17.55% in the R² value and a decrease of up to 46.1% in the RMSE compared to commonly used models based on structural descriptors. The results of this study contribute to a deeper understanding of the relationship between the performance and structure of MOFs and provide useful guidelines and strategies for the design of high-performance separation materials.
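The two accuracy measures quoted above can be computed as in the following sketch. It also shows a relative percentage change between a baseline and an improved model; whether the reported 17.55% and 46.1% figures are relative changes of this kind is an assumption, and all numbers below are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def percent_change(baseline, new):
    """Relative change from baseline to new, in percent."""
    return 100.0 * (new - baseline) / baseline

# Hypothetical uptakes and predictions from two models (illustration only)
observed = [1.0, 2.0, 3.0, 4.0]
structural_only = [1.5, 1.5, 3.5, 3.5]
with_topology = [1.1, 2.1, 2.9, 3.9]

print(percent_change(r2_score(observed, structural_only), r2_score(observed, with_topology)))
print(percent_change(rmse(observed, structural_only), rmse(observed, with_topology)))
```

Reporting both metrics is useful because R² is scale-free while RMSE stays in the units of the predicted adsorption property.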
Affiliation(s)
- Yufang Wu
- Guangzhou Key Laboratory for New Energy and Green Catalysis, School of Chemistry and Chemical Engineering, Guangzhou University, Guangzhou 510006, China; (Y.Y.); (S.G.); (S.L.)
| | - Zhiwei Qiao
- Guangzhou Key Laboratory for New Energy and Green Catalysis, School of Chemistry and Chemical Engineering, Guangzhou University, Guangzhou 510006, China; (Y.Y.); (S.G.); (S.L.)
| |
|
10
|
Li W, Situ Y, Ding L, Chen Y, Yang Q. MOF-GRU: A MOFid-Aided Deep Learning Model for Predicting the Gas Separation Performance of Metal-Organic Frameworks. ACS APPLIED MATERIALS & INTERFACES 2023; 15:59887-59894. [PMID: 38087435 DOI: 10.1021/acsami.3c11790] [Indexed: 12/28/2023]
Abstract
The remarkable versatility of metal-organic frameworks (MOFs) stems from their rich chemical information, leading to numerous successful applications. However, identifying optimal MOFs for specific tasks necessitates a thorough assessment of their chemical attributes. Conventional machine learning approaches for MOF prediction have relied on intricate chemical and structural details, hampering rapid evaluations. Drawing inspiration from recent advancements exemplified by Snurr et al., wherein a text string was used to represent a MOF (MOFid), we introduce a MOFid-aided deep learning model, named the MOF-GRU model. This model, founded on natural language processing principles and utilizing the gated recurrent unit architecture, leverages the serialized text string representation of MOFs to forecast gas separation performance. Through a focused study on CH4/N2 separation, we substantiate the efficacy of this approach. Comparative assessments against traditional machine learning techniques underscore our model's superior predictive accuracy and its capacity to handle extensive data sets. The MOF-GRU model uncovers latent structure-performance relationships from MOF sequences alone, obviating the need for intricate three-dimensional (3D) structural information. Overall, this model's design enables efficient data utilization, thereby hastening the discovery of high-performance materials tailored for gas separation applications.
Affiliation(s)
- Wenxuan Li
- State Key Laboratory of Organic-Inorganic Composites, College of Chemical Engineering, Beijing University of Chemical Technology, Beijing 100029, China
| | - Yizhen Situ
- State Key Laboratory of Organic-Inorganic Composites, College of Chemical Engineering, Beijing University of Chemical Technology, Beijing 100029, China
| | - Lifeng Ding
- Department of Chemistry, School of Science, Xi'an Jiaotong-Liverpool University, Suzhou 215123, Jiangsu, China
| | - Yanling Chen
- State Key Laboratory of Organic-Inorganic Composites, College of Chemical Engineering, Beijing University of Chemical Technology, Beijing 100029, China
| | - Qingyuan Yang
- State Key Laboratory of Organic-Inorganic Composites, College of Chemical Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Engineering Laboratory of Chemical Resources Utilization in South Xinjiang of Xinjiang Production and Construction Corps, Tarim University, Alar 843300, Xinjiang, China
| |
|
11
|
Song W, Wu F, Yan Y, Li Y, Wang Q, Hu X, Li Y. Gut microbiota landscape and potential biomarker identification in female patients with systemic lupus erythematosus using machine learning. Front Cell Infect Microbiol 2023; 13:1289124. [PMID: 38169617 PMCID: PMC10758415 DOI: 10.3389/fcimb.2023.1289124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 11/28/2023] [Indexed: 01/05/2024] Open
Abstract
Objectives Systemic Lupus Erythematosus (SLE) is a complex autoimmune disease that disproportionately affects women. Early diagnosis and prevention are crucial for women's health, and the gut microbiota has been found to be strongly associated with SLE. This study aimed to identify potential biomarkers for SLE by characterizing the gut microbiota landscape using feature selection and exploring the use of machine learning (ML) algorithms with significantly dysregulated microbiotas (SDMs) for early identification of SLE patients. Additionally, we used the SHapley Additive exPlanations (SHAP) interpretability framework to visualize the impact of SDMs on the risk of developing SLE in females. Methods Stool samples were collected from 54 SLE patients and 55 Negative Controls (NC) for microbiota analysis using 16S rRNA sequencing. Feature selection was performed using Elastic Net and Boruta on species-level taxonomy. Subsequently, four ML algorithms, namely logistic regression (LR), Adaptive Boosting (AdaBoost), Random Forest (RF), and eXtreme gradient boosting (XGBoost), were used to achieve early identification of SLE with SDMs. Finally, the best-performing algorithm was combined with SHAP to explore how SDMs affect the risk of developing SLE in females. Results Both alpha and beta diversity differed in the SLE group. Following feature selection, 68 and 21 microbiota were retained by Elastic Net and Boruta, respectively, with 16 microbiota overlapping between the two, i.e., the SDMs for SLE. The four ML algorithms with SDMs could effectively identify SLE patients, with XGBoost performing the best, achieving Accuracy, Sensitivity, Specificity, Positive Predictive Value, Negative Predictive Value, and AUC values of 0.844, 0.750, 0.938, 0.923, 0.790, and 0.930, respectively. The SHAP interpretability framework showed a complex non-linear relationship between the relative abundance of SDMs and the risk of SLE, with Escherichia_fergusonii having the largest SHAP value. Conclusions This study revealed dysbiosis in the gut microbiota of female SLE patients. ML classifiers combined with SDMs, particularly XGBoost, can facilitate early identification of female patients with SLE. The SHAP interpretability framework provides insight into the impact of SDMs on the risk of SLE and may inform future treatment for SLE.
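The classification metrics listed above all derive from a single confusion matrix. The sketch below shows those definitions. The abstract does not report the test-set size or the confusion matrix, so the counts used here (12 true positives, 4 false negatives, 15 true negatives, 1 false positive) are an assumption, chosen only because they reproduce the reported XGBoost values to within rounding.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Assumed counts, not taken from the paper.
metrics = binary_metrics(tp=12, fn=4, tn=15, fp=1)

# Values reported in the abstract for XGBoost.
reported = {"accuracy": 0.844, "sensitivity": 0.750, "specificity": 0.938,
            "ppv": 0.923, "npv": 0.790}
```

AUC is not included because it depends on the ranking of predicted scores rather than on one thresholded confusion matrix.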
Affiliation(s)
- Wenzhu Song
- School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, China
| | - Feng Wu
- Department of Nephrology, Shanxi Provincial People's Hospital (Fifth Hospital) of Shanxi Medical University, Taiyuan, China
| | - Yan Yan
- Department of Nephrology, Shanxi Provincial People's Hospital (Fifth Hospital) of Shanxi Medical University, Taiyuan, China
| | - Yaheng Li
- Shanxi Provincial Key Laboratory of Kidney Disease, Taiyuan, Shanxi, China
| | - Qian Wang
- Shanxi Provincial Key Laboratory of Kidney Disease, Taiyuan, Shanxi, China
| | - Xueli Hu
- Department of Nephrology, Hejin People’s Hospital, Yuncheng, Shanxi, China
| | - Yafeng Li
- Department of Nephrology, Shanxi Provincial People's Hospital (Fifth Hospital) of Shanxi Medical University, Taiyuan, China
- Shanxi Provincial Key Laboratory of Kidney Disease, Taiyuan, Shanxi, China
- Core Laboratory, Shanxi Provincial People's Hospital (Fifth Hospital) of Shanxi Medical University, Taiyuan, China
- Academy of Microbial Ecology, Shanxi Medical University, Taiyuan, China
| |
|
12
|
Du XM, Xiao ST, Wang X, Sun X, Lin YF, Wang Q, Chen GH. Combination of High-Throughput Screening and Assembly to Discover Efficient Metal-Organic Frameworks on Kr/Xe Adsorption Separation. J Phys Chem B 2023; 127:8116-8130. [PMID: 37725055 DOI: 10.1021/acs.jpcb.3c03139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/21/2023]
Abstract
Recycling Kr and Xe from used nuclear fuel (UNF) is economically and environmentally beneficial, and there is an urgent need to screen or design high-performance metal-organic framework (MOF) materials for Kr/Xe adsorption separation. After grand canonical Monte Carlo (GCMC) simulations of Kr/Xe adsorption separation on 11,000 frameworks in CoRE MOFs (2019), the important structure-adsorption property relationship (SAPR) was derived; that is, a porosity (φ) of 0.30-0.40, an LCD/PLD ratio of 1.00-1.49, a density (ρ) between 1.20 and 2.30 g/cm³, and a PLD of 2.40-3.38 Å can be used to screen for high-performance G-MOFs and hMOFs. In addition, the key "genes" (metal nodes and linkers) of MOFs determining the Kr/Xe adsorption separation were data-mined by a machine learning technique and assembled into novel MOFs. After comprehensive consideration of thermal stability and the adsorbent performance score (APS), eight promising MOFs for Kr/Xe separation with an APS above 1290.89 were screened out and assembled; these perform better than most of the reported frameworks. Note that the adsorption isotherms of Kr and Xe on these MOFs are type I curves, with a thermodynamic equilibrium mechanism for Kr/Xe separation based on the confinement effect. Furthermore, according to electronic structure calculations with the independent gradient model based on Hirshfeld partition (IGMH) and energy decomposition analysis, the interactions between guests and frameworks are vdW forces with dominant induction energy (Eind). In addition, the electrostatic potential gradients of the frameworks are generally linearly and negatively correlated with Kr uptakes. Therefore, both the geometrical and electronic structures dominate the adsorption separation performance for Kr/Xe. Interestingly, these eight MOFs are also suitable for the separation of CH4/H2, with considerable selectivities and CH4 uptakes of up to 2566.67 and 3.04 mmol/g, respectively. The accurately constructed SAPR and material genomics strategy should be helpful for the experimental discovery of novel MOFs for Kr/Xe separation.
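The structural ranges quoted in the abstract can be read as a simple screening filter. The sketch below applies them to a few made-up candidates. The `Framework` class, the candidate names and values, and the choice to treat the range boundaries as inclusive are assumptions for illustration; LCD and PLD are the largest cavity diameter and pore-limiting diameter.

```python
from dataclasses import dataclass


@dataclass
class Framework:
    name: str
    porosity: float  # void fraction (dimensionless)
    lcd: float       # largest cavity diameter, in angstrom
    pld: float       # pore-limiting diameter, in angstrom
    density: float   # in g/cm^3


def passes_sapr(f):
    """Check the four ranges quoted in the abstract (boundaries assumed inclusive)."""
    return (0.30 <= f.porosity <= 0.40
            and 1.00 <= f.lcd / f.pld <= 1.49
            and 1.20 <= f.density <= 2.30
            and 2.40 <= f.pld <= 3.38)


# Hypothetical candidates, not entries from any real database.
candidates = [
    Framework("mof_a", porosity=0.35, lcd=4.0, pld=3.0, density=1.5),
    Framework("mof_b", porosity=0.60, lcd=4.0, pld=3.0, density=1.5),  # too porous
    Framework("mof_c", porosity=0.32, lcd=6.0, pld=3.0, density=1.5),  # LCD/PLD = 2.0
]

selected = [f for f in candidates if passes_sapr(f)]
```

A filter like this only narrows the pool; the abstract's ranking by adsorbent performance score and thermal stability would follow as separate steps.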
Affiliation(s)
- Xin-Ming Du
- Department of Chemistry and Key Laboratory for Preparation and Application of Ordered Structural Materials of Guangdong Province, Shantou University, Shantou 515063, Guangdong, China
| | - Song-Tao Xiao
- Institute of Radiochemistry, China Institute of Atomic Energy (CIAE), Beijing 102413, PR China
| | - Xin Wang
- Department of Chemistry and Key Laboratory for Preparation and Application of Ordered Structural Materials of Guangdong Province, Shantou University, Shantou 515063, Guangdong, China
| | - Xi Sun
- Department of Chemistry and Key Laboratory for Preparation and Application of Ordered Structural Materials of Guangdong Province, Shantou University, Shantou 515063, Guangdong, China
| | - Yu-Fei Lin
- Department of Chemistry and Key Laboratory for Preparation and Application of Ordered Structural Materials of Guangdong Province, Shantou University, Shantou 515063, Guangdong, China
| | - Qiang Wang
- Department of Applied Chemistry, College of Science, Nanjing Tech University, Nanjing 211816, PR China
| | - Guang-Hui Chen
- Department of Chemistry and Key Laboratory for Preparation and Application of Ordered Structural Materials of Guangdong Province, Shantou University, Shantou 515063, Guangdong, China
| |
|
13
|
Gao L, Cao Y, Cao X, Shi X, Lei M, Su X, Liu Y. Machine learning-based algorithms to predict severe psychological distress among cancer patients with spinal metastatic disease. Spine J 2023; 23:1255-1269. [PMID: 37182703 DOI: 10.1016/j.spinee.2023.05.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 04/12/2023] [Accepted: 05/08/2023] [Indexed: 05/16/2023]
Abstract
BACKGROUND CONTEXT Patients with metastatic spinal disease have advanced-stage cancer and often suffer from poor psychological health; however, the ability to estimate the risk of this adverse outcome using currently available data is very limited. PURPOSE The goal of this study was to propose a precise model based on machine learning techniques to predict psychological status among cancer patients with spinal metastatic disease. STUDY DESIGN/SETTING A prospective cohort study. PATIENT SAMPLE A total of 1043 cancer patients with spinal metastatic disease were included. OUTCOME MEASURES The main outcome was severe psychological distress. METHODS The patients were randomly divided into a training dataset and a testing dataset at a ratio of 9:1. Patients' demographics, lifestyle choices, cancer-related features, clinical manifestations, and treatments were collected as potential model predictors in the study. Five machine learning algorithms, including XGBoosting machine, random forest, gradient boosting machine, support vector machine, and ensemble prediction model, as well as a logistic regression model, were employed to train and optimize models in the training set, and their predictive performance was assessed in the testing set. RESULTS Up to 21.48% of all patients who were recruited had severe psychological distress. Older age (p<0.001), female sex (p=0.045), current smoking (p=0.002) or drinking (p=0.003), a lower level of education (p<0.001), a stronger spiritual desire (p<0.001), visceral metastasis (p=0.005), and a higher Eastern Cooperative Oncology Group (ECOG) score (p<0.001) were significantly associated with worse psychological health. With an area under the curve (AUC) of 0.865 (95% CI: 0.788-0.941) and an accuracy of up to 0.843, the gradient boosting machine algorithm performed best in the prediction of the outcome, followed by the XGBoosting machine algorithm (AUC: 0.851, 95% CI: 0.768-0.934; Accuracy: 0.826) and ensemble prediction (AUC: 0.851, 95% CI: 0.770-0.932; Accuracy: 0.809) in the testing set. In contrast, the AUC of the logistic regression model was only 0.836 (95% CI: 0.756-0.916; Accuracy: 0.783). CONCLUSIONS Machine learning models have greater predictive power and can offer useful tools to identify individuals with spinal metastatic disease who are experiencing severe psychological distress.
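The AUC values compared above can be understood as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes that quantity directly by pairwise comparison. The labels and scores are invented for illustration and have no connection to the study data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels (1 = severe distress) and model scores.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]

example_auc = auc(labels, scores)  # 5 of 6 positive/negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of samples, which is fine for a test set of this scale; library routines use a sort-based equivalent.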
Affiliation(s)
- Le Gao
- Department of Oncology, Senior Department of Oncology, The Fifth Medical Center of PLA General Hospital, No. 8 Dongdajie Street, Fengtai District, Beijing, China
| | - Yuncen Cao
- Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, No. 51 Fucheng Road, Haidian District, Beijing, 100048, China
| | - Xuyong Cao
- Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, No. 51 Fucheng Road, Haidian District, Beijing, 100048, China
| | - Xiaolin Shi
- Department of Orthopedic Surgery, The Second Affiliated Hospital of Zhejiang Chinese Medical University, No. 318 Chaowang Road, Hangzhou, 310005, China
| | - Mingxing Lei
- Department of Orthopedic Surgery, Hainan Hospital of PLA General Hospital, No. 80 Jianglin Road, Haitang District, Sanya, 572022, China; National Clinical Research Center for Orthopedics, Sports Medicine & Rehabilitation, No. 28 Fuxing Road, Haidian District, Beijing, 100039, China.
| | - Xiuyun Su
- Intelligent Medical Innovation Institute, Southern University of Science and Technology Hospital, No. 6019 Xili Liuxian Avenue, Nanshan District, Shenzhen, 518071, China.
| | - Yaosheng Liu
- Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, No. 51 Fucheng Road, Haidian District, Beijing, 100048, China; National Clinical Research Center for Orthopedics, Sports Medicine & Rehabilitation, No. 28 Fuxing Road, Haidian District, Beijing, 100039, China.
| |
|
14
|
Li Y, Yang H, He W, Li Y. Human Endocrine-Disrupting Effects of Phthalate Esters through Adverse Outcome Pathways: A Comprehensive Mechanism Analysis. Int J Mol Sci 2023; 24:13548. [PMID: 37686353 PMCID: PMC10488033 DOI: 10.3390/ijms241713548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 08/11/2023] [Accepted: 08/30/2023] [Indexed: 09/10/2023] Open
Abstract
Phthalate esters (PAEs) are widespread in the environment because of their use as plasticizers in plastics, and they have been found to cause significant environmental and health hazards, especially endocrine disruption in humans. To investigate the processes underlying the endocrine disruption effects of PAEs, three machine learning techniques were used in this study to build an adverse outcome pathway (AOP) for those effects in humans. Of the three machine learning techniques, the random forest and XGBoost models performed well in terms of prediction. Subsequently, sensitivity analysis was conducted to identify the initial events, key events, and key features influencing the endocrine disruption effects of PAEs on humans. Key features, such as Mol.Wt, Q+, QH+, ELUMO, minHCsats, MEDC-33, and EG, were found to be closely related to the molecular structure. Therefore, a 3D-QSAR model for PAEs was constructed, and, based on the three-dimensional potential energy surface information, it was discovered that the hydrophobic, steric, and electrostatic fields of PAEs significantly influence their endocrine disruption effects on humans. Lastly, an analysis of the contributions of amino acid residues and binding energy (BE) was performed, identifying and confirming that hydrogen bonding, hydrophobic interactions, and van der Waals forces are important factors affecting the AOP of PAEs' molecular endocrine disruption effects. This study defined and constructed a comprehensive AOP for the endocrine disruption effects of PAEs on humans and developed a method based on theoretical simulation to characterize the AOP, providing theoretical guidance for studying the mechanisms of toxicity caused by other pollutants.
Affiliation(s)
- Yu Li
- College of Environmental Science and Engineering, North China Electric Power University, Beijing 102206, China; (Y.L.); (H.Y.); (W.H.)
| |
|
15
|
Villot C, Huang T, Lao KU. Accurate prediction of global-density-dependent range-separation parameters based on machine learning. J Chem Phys 2023; 159:044103. [PMID: 37486048 DOI: 10.1063/5.0157340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Accepted: 07/03/2023] [Indexed: 07/25/2023] Open
Abstract
In this work, we develop an accurate and efficient XGBoost machine learning model for predicting the global-density-dependent range-separation parameter, ω_GDD, for the long-range corrected (LRC) functional LRC-ωPBE. This ω_GDD^ML model has been built using a wide range of systems (11 466 complexes, ten different elements, and up to 139 heavy atoms) with fingerprints for the local atomic environment and histograms of distances for the long-range atomic correlation for mapping the quantum mechanical range-separation values. The promising performance on the testing set with 7046 complexes shows a mean absolute error of 0.001 117 a₀⁻¹ and only five systems (0.07%) with an absolute error larger than 0.01 a₀⁻¹, which indicates the good transferability of our ω_GDD^ML model. In addition, the only required input to obtain ω_GDD^ML is the Cartesian coordinates without electronic structure calculations, thereby enabling rapid predictions. LRC-ωPBE(ω_GDD^ML) is used to predict polarizabilities for a series of oligomers, where polarizabilities are sensitive to the asymptotic density decay and are crucial in a variety of applications, including the calculations of dispersion corrections and refractive index, and surpasses the performance of all other popular density functionals except for the non-tuned LRC-ωPBE. Finally, LRC-ωPBE(ω_GDD^ML) combined with (extended) symmetry-adapted perturbation theory is used in calculating noncovalent interactions to further show that the traditional ab initio system-specific tuning procedure can be bypassed. The present study not only provides an accurate and efficient way to determine the range-separation parameter for LRC-ωPBE but also shows the synergistic benefits of fusing the power of physically inspired density functional LRC-ωPBE and the data-driven ω_GDD^ML model.
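The two error statistics quoted in the abstract, a mean absolute error and the share of systems whose error exceeds a threshold, can be sketched as follows. The reference and predicted values below are hypothetical; only the final line uses numbers from the abstract (five systems out of 7046) to check the stated percentage.

```python
def mean_absolute_error(pred, ref):
    """Average of |prediction - reference| over all systems."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


def fraction_above(pred, ref, threshold):
    """Share of systems whose absolute error exceeds the threshold."""
    n_large = sum(1 for p, r in zip(pred, ref) if abs(p - r) > threshold)
    return n_large / len(ref)


# Hypothetical range-separation values in inverse bohr, for illustration only.
REF = [0.300, 0.250, 0.410, 0.280]
PRED = [0.301, 0.249, 0.425, 0.280]

mae = mean_absolute_error(PRED, REF)
outlier_fraction = fraction_above(PRED, REF, threshold=0.01)

# Figures from the abstract: 5 of 7046 test systems exceed the 0.01 threshold.
reported_outlier_percent = 100.0 * 5 / 7046
```

The last value rounds to 0.07, which agrees with the percentage given in the abstract.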
Affiliation(s)
- Corentin Villot
- Department of Chemistry, Virginia Commonwealth University, Richmond, Virginia 23284, USA
| | - Tong Huang
- Department of Chemistry, Virginia Commonwealth University, Richmond, Virginia 23284, USA
| | - Ka Un Lao
- Department of Chemistry, Virginia Commonwealth University, Richmond, Virginia 23284, USA
| |
|
16
|
Aksu GO, Keskin S. Advancing CH4/H2 separation with covalent organic frameworks by combining molecular simulations and machine learning. Journal of Materials Chemistry A 2023; 11:14788-14799. [PMID: 37441278 PMCID: PMC10335334 DOI: 10.1039/d3ta02433d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Accepted: 06/05/2023] [Indexed: 07/15/2023]
Abstract
A high-throughput computational screening approach combined with machine learning (ML) was introduced to unlock the potential of both synthesized and hypothetical COFs (hypoCOFs) for adsorption-based CH4/H2 separation. We studied 597 synthesized COFs for adsorption of a CH4/H2 mixture using Grand Canonical Monte Carlo (GCMC) simulations under pressure-swing adsorption (PSA) and vacuum-swing adsorption (VSA) conditions. Based on the simulation results, the CH4/H2 selectivities, CH4 working capacities, adsorbent performance scores, and regenerabilities of the synthesized COFs were assessed and the structural properties of the top-performing COFs were identified. The hypoCOF database composed of 69 840 materials was then filtered to identify 7737 hypothetical materials having similar structural properties to the top synthesized COFs. These hypothetical COFs were then examined for CH4/H2 separation using molecular simulations and the results showed that the top hypoCOFs have CH4 selectivities and working capacities in the ranges of 21.9-28.7 (64.7-128.6) and 5.8-7.6 (1.3-3.1) mol kg⁻¹ under PSA (VSA) conditions, respectively, outperforming the synthesized COFs and metal-organic frameworks (MOFs). ML models were then developed based on the hypoCOF simulation results to accurately predict the CH4/H2 mixture adsorption properties of all remaining hypothetical materials when their structural and chemical properties are fed into the models. These models accurately assessed the CH4/H2 mixture separation performance of any hypoCOF within seconds without performing computationally demanding molecular simulations. The computational approach that we have proposed in this study will provide an accurate and efficient assessment of COF materials for CH4/H2 separation and significantly accelerate the experimental efforts towards the design and discovery of new high-performing COF adsorbents.
Affiliation(s)
- Gokhan Onder Aksu
- Department of Chemical and Biological Engineering, Koç University, Rumelifeneri Yolu, Sariyer, 34450 Istanbul, Turkey
| | - Seda Keskin
- Department of Chemical and Biological Engineering, Koç University, Rumelifeneri Yolu, Sariyer, 34450 Istanbul, Turkey
| |
|
17
|
Li L, Zhao Y, Yu H, Wang Z, Zhao Y, Jiang M. An XGBoost Algorithm Based on Molecular Structure and Molecular Specificity Parameters for Predicting Gas Adsorption. Langmuir 2023; 39:6756-6766. [PMID: 37130050 DOI: 10.1021/acs.langmuir.3c00255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]
Abstract
In this paper, an improved Extreme Gradient Boosting (XGBoost) algorithm based on the Graph Isomorphic Network (GIN) for predicting the adsorption performance of metal-organic frameworks (MOFs) is developed. It is shown that the graph isomorphic layer of this algorithm can directly learn the feature representation of materials from the connection of atoms in MOFs. Then, XGBoost can be used to predict the adsorption performance of MOFs based on feature representation. In this sense, it is not only possible to achieve end-to-end prediction directly from the structure of MOFs to adsorption performance but also to ensure the accuracy of prediction. The comparison between Grand Canonical Monte Carlo (GCMC) simulation and prediction supports the performance and effectiveness of the proposed algorithm.
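To make the idea of learning features "from the connection of atoms" concrete, the sketch below shows the sum-aggregation step commonly used in graph isomorphism network layers: each node's new value is (1 + eps) times its own feature plus the sum of its neighbors' features. It omits the learned transformation that normally follows, and it does not include the boosted-tree regressor. The toy graph, the scalar features, and the sum readout are assumptions, and the paper's actual architecture may differ.

```python
def gin_aggregate(features, edges, eps=0.0):
    """Sum aggregation over an undirected graph with scalar node features."""
    neighbors = {v: [] for v in range(len(features))}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    return [
        (1.0 + eps) * features[v] + sum(features[u] for u in neighbors[v])
        for v in range(len(features))
    ]


def readout(node_values):
    """Collapse node values into one graph-level descriptor by summation."""
    return sum(node_values)


# Toy three-atom chain 0-1-2 with hypothetical scalar atom features.
FEATURES = [1.0, 2.0, 3.0]
EDGES = [(0, 1), (1, 2)]

updated = gin_aggregate(FEATURES, EDGES)
graph_descriptor = readout(updated)
```

In the pipeline the abstract describes, a graph-level vector of this kind would be the input that a gradient-boosted model maps to adsorption performance.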
Affiliation(s)
- Lujun Li
- Department of Automation, University of Science and Technology of China, Hefei 230026, China
- State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
- Key Laboratory of Networked Control Systems, Chinese Academy of Sciences, Shenyang 110016, China
- Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China
| | - Yiming Zhao
- State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
- Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China
| | - Haibin Yu
- State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
- Key Laboratory of Networked Control Systems, Chinese Academy of Sciences, Shenyang 110016, China
- Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China
| | - Zhuo Wang
- State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
- Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China
| | - Yongjia Zhao
- State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
- Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China
| | - Mingqi Jiang
- State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
- Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
18
|
Daglar H, Gulbalkan HC, Habib N, Durak O, Uzun A, Keskin S. Integrating Molecular Simulations with Machine Learning Guides in the Design and Synthesis of [BMIM][BF4]/MOF Composites for CO2/N2 Separation. ACS Applied Materials & Interfaces 2023; 15:17421-17431. [PMID: 36972354 PMCID: PMC10080536 DOI: 10.1021/acsami.3c02130] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/14/2023] [Accepted: 03/15/2023] [Indexed: 06/18/2023]
Abstract
Considering the existence of a large number and variety of metal-organic frameworks (MOFs) and ionic liquids (ILs), assessing the gas separation potential of all possible IL/MOF composites by purely experimental methods is not practical. In this work, we combined molecular simulations and machine learning (ML) algorithms to computationally design an IL/MOF composite. Molecular simulations were first performed to screen approximately 1000 different composites of 1-n-butyl-3-methylimidazolium tetrafluoroborate ([BMIM][BF4]) with a large variety of MOFs for CO2 and N2 adsorption. The results of simulations were used to develop ML models that can accurately predict the adsorption and separation performances of [BMIM][BF4]/MOF composites. The most important features that affect the CO2/N2 selectivity of composites were extracted from ML and utilized to computationally generate an IL/MOF composite, [BMIM][BF4]/UiO-66, which was not present in the original material data set. This composite was finally synthesized, characterized, and tested for CO2/N2 separation. The experimentally measured CO2/N2 selectivity of the [BMIM][BF4]/UiO-66 composite matched well with the selectivity predicted by the ML model, and it was found to be comparable to, if not higher than, that of all previously synthesized [BMIM][BF4]/MOF composites reported in the literature. Our proposed approach of combining molecular simulations with ML models will be highly useful to accurately predict the CO2/N2 separation performances of any [BMIM][BF4]/MOF composite within seconds compared to the extensive time and effort requirements of purely experimental methods.
Affiliation(s)
- Hilal Daglar
- Department of Chemical and Biological Engineering, Koç University, Rumelifeneri Yolu, Sariyer, 34450 Istanbul, Turkey
| | - Hasan Can Gulbalkan
- Department of Chemical and Biological Engineering, Koç University, Rumelifeneri Yolu, Sariyer, 34450 Istanbul, Turkey
| | - Nitasha Habib
- Department of Chemical and Biological Engineering, Koç University, Rumelifeneri Yolu, Sariyer, 34450 Istanbul, Turkey
- Koç University TÜPRAŞ Energy Center (KUTEM), Koç University, Rumelifeneri Yolu, 34450 Sariyer, Istanbul, Turkey
| | - Ozce Durak
- Department of Chemical and Biological Engineering, Koç University, Rumelifeneri Yolu, Sariyer, 34450 Istanbul, Turkey
- Koç University TÜPRAŞ Energy Center (KUTEM), Koç University, Rumelifeneri Yolu, 34450 Sariyer, Istanbul, Turkey
| | - Alper Uzun
- Department of Chemical and Biological Engineering, Koç University, Rumelifeneri Yolu, Sariyer, 34450 Istanbul, Turkey
- Koç University TÜPRAŞ Energy Center (KUTEM), Koç University, Rumelifeneri Yolu, 34450 Sariyer, Istanbul, Turkey
- Koç University Surface Science and Technology Center (KUYTAM), Koç University, Rumelifeneri Yolu, 34450 Sariyer, Istanbul, Turkey
| | - Seda Keskin
- Department of Chemical and Biological Engineering, Koç University, Rumelifeneri Yolu, Sariyer, 34450 Istanbul, Turkey
| |
|
19
|
Sun X, Lin W, Jiang K, Liang H, Chen G. Accelerated screening and assembly of promising MOFs with open Cu sites for isobutene/isobutane separation using a data-driven approach. Phys Chem Chem Phys 2023; 25:8608-8623. [PMID: 36891889 DOI: 10.1039/d2cp05410h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
As by-products of catalytic cracking or alkane dehydrogenation, isobutene (2-methyl-propylene) and isobutane (2-methyl-propane) are important chemical feedstocks, but the separation of their mixture is a challenging issue in the petrochemical industry. Herein, we report the first example of large-scale computational screening of metal-organic frameworks (MOFs) with copper open metal sites (Cu-OMS) for the adsorptive separation of isobutene/isobutane using configuration-bias Monte Carlo (CBMC) simulations and machine learning on data for >330 000 MOFs. We discovered that the optimal structural features governing the MOF-based separation of isobutene/isobutane were density (ρ) and porosity (φ), with ranges of 0.2-0.5 g cm⁻³ and 0.8-0.9, respectively. Furthermore, the key genes (metal nodes or linkers of frameworks) contributing to such adsorptive separation were data-mined by feature engineering of ML. These genes were cross-assembled into novel frameworks using a material-genomics strategy. The screened AVAKEP, XAHPON, HUNCIE, Cu2O8-mof177-TDPAT_No730 and assembled Cu2O8-BTC_B-core-4_No1 possessed high isobutene uptake and isobutene/isobutane selectivity of >19.5 mmol g⁻¹ and 4.7, with high thermal stability (as validated by molecular-dynamics simulations), overcoming the critical "trade-off" problem to some extent. The macroporous structures (pore-limiting diameter >12 Å) of these five promising frameworks with multi-layer adsorption of isobutene resulted in high isobutene loading, as validated by adsorption isotherms and CBMC simulations. The higher adsorption energy and heat of adsorption of isobutene than those of isobutane indicated that the thermodynamic equilibrium drove their selective adsorption. Generalized charge decomposition analysis and localized orbital locator calculations based on density functional theory wavefunctions suggested that high selectivity was due not only to complexation of feedback π bonds between isobutene and Cu-OMS, but also to the strong π-π stacking interaction induced by the C=C bond of isobutene with the multiple aromatic rings and unsaturated bonds of frameworks. Our theoretical results and data-driven approach may provide insights into the development of efficient MOF materials for the separation of isobutene/isobutane and other mixtures.
Affiliation(s)
- Xi Sun
- Department of Chemistry and Key Laboratory for Preparation and Application of Ordered Structural Materials of Guangdong Province, Shantou University, Guangdong 515063, P. R. China.
| | - Wangqiang Lin
- Department of Chemistry and Key Laboratory for Preparation and Application of Ordered Structural Materials of Guangdong Province, Shantou University, Guangdong 515063, P. R. China.
| | - Kun Jiang
- Department of Natural Science, Shantou Polytechnic, Shantou 515041, Guangdong, China
| | - Heng Liang
- Department of Chemistry and Key Laboratory for Preparation and Application of Ordered Structural Materials of Guangdong Province, Shantou University, Guangdong 515063, P. R. China.
| | - Guanghui Chen
- Department of Chemistry and Key Laboratory for Preparation and Application of Ordered Structural Materials of Guangdong Province, Shantou University, Guangdong 515063, P. R. China.
| |
|
20
|
Data-mining based assembly of promising metal-organic frameworks on Xe/Kr separation. Sep Purif Technol 2023. [DOI: 10.1016/j.seppur.2022.122357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
|
21
|
Zhang J, Yang X, Chen J, Han J, Chen X, Fan Y, Zheng H. Construction of a diagnostic classifier for cervical intraepithelial neoplasia and cervical cancer based on XGBoost feature selection and random forest model. J Obstet Gynaecol Res 2023; 49:296-303. [PMID: 36220631 DOI: 10.1111/jog.15458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Revised: 08/18/2022] [Accepted: 09/23/2022] [Indexed: 01/19/2023]
Abstract
BACKGROUND The pathological phenotype of early-stage cervical cancer (CC) is similar to that of cervical intraepithelial neoplasia (CIN), which makes the diagnosis of cervical precancerous lesions challenging. Moreover, existing diagnostic methods are subjective and limited, which can lead to misdiagnosis or missed diagnosis. Hence, methods are needed to assist the diagnosis of CC and CIN. METHODS Based on CIN and CC data from the Gene Expression Omnibus (GEO), the eXtreme Gradient Boosting (XGBoost) algorithm was used to screen feature genes distinguishing CIN from CC for constructing the classifier. An incremental feature selection (IFS) curve was also used for screening. The reliability of the classifier was validated using principal component analysis (PCA) dimensionality reduction and heat map analysis of gene expression. Then, differentially expressed genes between CIN and CC were intersected with the classifier genes. Genes in the intersection were used as seeds for protein-protein interaction network construction and random walk with restart analysis. The genes with the top 50 affinity coefficients were selected for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses to identify biological functions that differ between CIN and CC. RESULTS The peripheral blood genes of CIN and CC were analyzed, and seven genes were screened. Using these genes for classifier construction, IFS curve screening revealed that the three-feature-gene classifier built with the random forest model performed best. The results of PCA and gene expression heat map analysis showed that the three-gene classifier could effectively distinguish CIN from CC. CONCLUSION A three-gene diagnostic classifier can effectively distinguish CIN patients from CC patients and provide a reference for the clinical diagnosis of early CC.
Affiliation(s)
- Jing Zhang
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
| | - Xiuqing Yang
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
| | - Jia Chen
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
| | - Jing Han
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
| | - Xiaofeng Chen
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
| | - Yueping Fan
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
| | - Hui Zheng
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
| |
|
22
|
Mai H, Le TC, Chen D, Winkler DA, Caruso RA. Machine Learning in the Development of Adsorbents for Clean Energy Application and Greenhouse Gas Capture. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2022; 9:e2203899. [PMID: 36285802 PMCID: PMC9798988 DOI: 10.1002/advs.202203899] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/07/2022] [Revised: 09/27/2022] [Indexed: 06/04/2023]
Abstract
Addressing climate change challenges by reducing greenhouse gas levels requires innovative adsorbent materials for clean energy applications. Recent progress in machine learning has stimulated technological breakthroughs in the discovery, design, and deployment of materials with potential for high-performance and low-cost clean energy applications. This review summarizes basic machine learning methods-data collection, featurization, model generation, and model evaluation-and reviews their use in the development of robust adsorbent materials. Key case studies are provided where these methods are used to accelerate adsorbent materials design and discovery, optimize synthesis conditions, and understand complex feature-property relationships. The review provides a concise resource for researchers wishing to use machine learning methods to rapidly develop effective adsorbent materials with a positive impact on the environment.
Affiliation(s)
- Haoxin Mai
- Applied Chemistry and Environmental Science, School of Science, STEM College, RMIT University, Melbourne, Victoria 3001, Australia
| | - Tu C. Le
- School of Engineering, STEM College, RMIT University, GPO Box 2476, Melbourne, Victoria 3001, Australia
| | - Dehong Chen
- Applied Chemistry and Environmental Science, School of Science, STEM College, RMIT University, Melbourne, Victoria 3001, Australia
| | - David A. Winkler
- Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, VIC 3052, Australia
- School of Biochemistry and Chemistry, La Trobe University, Kingsbury Drive, Bundoora 3042, Australia
- School of Pharmacy, University of Nottingham, Nottingham NG7 2RD, UK
| | - Rachel A. Caruso
- Applied Chemistry and Environmental Science, School of Science, STEM College, RMIT University, Melbourne, Victoria 3001, Australia
| |
|
23
|
Zhao F, Zhang H, Cheng D, Wang W, Li Y, Wang Y, Lu D, Dong C, Ren D, Yang L. Predicting the risk of nodular thyroid disease in coal miners based on different machine learning models. Front Med (Lausanne) 2022; 9:1037944. [PMID: 36507527 PMCID: PMC9732087 DOI: 10.3389/fmed.2022.1037944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Accepted: 11/11/2022] [Indexed: 11/27/2022] Open
Abstract
Background Nodular thyroid disease is by far the most common thyroid disease and is closely associated with the development of thyroid cancer. Coal miners with chronic coal dust exposure are at higher risk of developing nodular thyroid disease. Few studies have used machine learning models to predict the occurrence of nodular thyroid disease in coal miners. The aim of this study was to predict a high risk of nodular thyroid disease in coal miners using five different machine learning (ML) models. Methods In this retrospective clinical study, 1,708 coal miners who were examined at the Huaihe Energy Occupational Disease Control Hospital in Anhui Province in April 2021 were selected, and their clinical physical examination data, including general information, laboratory tests, and imaging findings, were collected. The synthetic minority oversampling technique (SMOTE) was used for sample balancing, and the data set was randomly split into training and test sets in a ratio of 8:2. Lasso regression and a correlation heat map were used to screen the predictors, and five ML models, namely Extreme Gradient Boosting (XGBoost), Logistic Regression (LR), Gaussian Naive Bayes (GNB), a Multilayer Perceptron neural network (MLP), and Complement Naive Bayes (CNB), were compared for predictive efficacy; the model with the highest AUC was selected as the optimal model for predicting the occurrence of nodular thyroid disease in coal miners. Result Lasso regression analysis selected Age, H-DLC, HCT, MCH, PLT, and GGT as predictor variables for the ML models; in addition, the heat map showed no significant correlation among the six variables. In the prediction of nodular thyroid disease, the AUCs of the five ML models were XGBoost (0.892), LR (0.577), GNB (0.603), MLP (0.601), and CNB (0.543); the XGBoost model had the largest AUC and can be applied in clinical practice.
Conclusion In this research, all five ML models were found to predict the risk of nodular thyroid disease in coal miners, with the XGBoost model having the best overall predictive performance. The model can assist clinicians in quickly and accurately predicting the occurrence of nodular thyroid disease in coal miners, and in adopting individualized clinical prevention and treatment strategies.
Affiliation(s)
- Feng Zhao
- The First Hospital of Anhui University of Science & Technology (Huainan First People’s Hospital), Huainan, China
| | - Hongzhen Zhang
- Anhui University of Science and Technology College of Medicine, Huainan, China
| | - Danqing Cheng
- Graduate School of Bengbu Medical College, Bengbu, China
| | - Wenping Wang
- Graduate School of Bengbu Medical College, Bengbu, China
| | - Yongtian Li
- Anhui University of Science and Technology College of Medicine, Huainan, China
| | - Yisong Wang
- Anhui University of Science and Technology College of Medicine, Huainan, China
| | - Dekun Lu
- The First Hospital of Anhui University of Science & Technology (Huainan First People’s Hospital), Huainan, China
| | - Chunhui Dong
- Anhui University of Science and Technology College of Medicine, Huainan, China
| | - Dingfei Ren
- Occupational Control Hospital of Huai He Energy Group, Huainan, Anhui, China
| | - Lixin Yang
- The First Hospital of Anhui University of Science & Technology (Huainan First People’s Hospital), Huainan, China. *Correspondence: Lixin Yang
| |
|
24
|
Yan T, Bi Z, Liu D, Zhang X, Lu G, Yang Q. A Self-Evolutionary Methodology for Reverse Design of Novel MOFs. J Phys Chem A 2022; 126:8476-8486. [DOI: 10.1021/acs.jpca.2c05647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Tongan Yan
- State Key Laboratory of Organic-Inorganic Composites, Beijing University of Chemical Technology, Beijing 100029, China
| | - Zhiyuan Bi
- State Key Laboratory of Organic-Inorganic Composites, Beijing University of Chemical Technology, Beijing 100029, China
- College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
| | - Dahuan Liu
- State Key Laboratory of Organic-Inorganic Composites, Beijing University of Chemical Technology, Beijing 100029, China
| | - Xiaonan Zhang
- State Key Laboratory of Organic-Inorganic Composites, Beijing University of Chemical Technology, Beijing 100029, China
| | - Gang Lu
- College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
| | - Qingyuan Yang
- State Key Laboratory of Organic-Inorganic Composites, Beijing University of Chemical Technology, Beijing 100029, China
| |
|
25
|
Guo R, Dai J, Xu H, Zang S, Zhang L, Ma N, Zhang X, Zhao L, Luo H, Liu D, Zhang J. The diagnostic significance of integrating m6A modification and immune microenvironment features based on bioinformatic investigation in aortic dissection. Front Cardiovasc Med 2022; 9:948002. [PMID: 36105536 PMCID: PMC9464924 DOI: 10.3389/fcvm.2022.948002] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/19/2022] [Accepted: 08/03/2022] [Indexed: 11/13/2022] Open
Abstract
Purpose The aim of this study was to investigate the role of m6A modification and immune microenvironment (IME) features in aortic dissection (AD) and to establish a clinical diagnostic model for AD based on m6A and IME factors. Methods The GSE52093, GSE98770, GSE147026, GSE153434, and GSE107844 datasets were downloaded from the GEO database. The expression of 21 m6A genes, including m6A writers, erasers, and readers, and immune cell infiltration were analyzed in AD and healthy samples by differential analysis and the ssGSEA method, respectively. Correlations between m6A genes and immune cells were analyzed by both Pearson and Spearman methods. XGBoost was used to identify the major m6A genes with significant influence on AD. AD samples were classified into two subgroups via consensus clustering and principal component analysis (PCA), respectively. Within each subgroup, the principal IME features were evaluated. Random forest (RF) was used to identify key genes from the differentially expressed genes (DEGs) shared between AD versus healthy samples and the two AD subgroups after Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. Finally, we constructed an AD diagnostic model combining important m6A regulatory genes and assessed its efficacy. Results Among the 21 m6A genes, WTAP, HNRNPC, and FTO were upregulated in AD samples, while IGF2BP1 was downregulated compared with healthy samples. Immune cell infiltration analysis revealed that YTHDF1 was positively correlated with γδT cell level, while FTO was negatively correlated with activated CD4+ T cell abundance. FTO and IGF2BP1 were identified as crucial genes that facilitate AD development according to the XGBoost algorithm. Notably, consensus cluster analysis classified patients with AD into two subgroups that differed in their expression profiles of the 21 m6A genes and in their IME features.
RF identified SYNC and MAPK1IP1L as the crucial genes from the 657 shared genes among the 1,141 DEGs between the AD groups with high and low m6A scores. Interestingly, the AD diagnostic model combining SYNC and MAPK1IP1L with FTO and IGF2BP1 performed well in distinguishing AD samples. Conclusion This study indicated that FTO and IGF2BP1 were involved in the IME of AD. Integrating the key genes FTO, IGF2BP1, and MAPK1IP1L in AD with a high m6A level context would provide clues for future AD diagnosis and therapy.
|
26
|
Daglar H, Keskin S. Combining Machine Learning and Molecular Simulations to Unlock Gas Separation Potentials of MOF Membranes and MOF/Polymer MMMs. ACS APPLIED MATERIALS & INTERFACES 2022; 14:32134-32148. [PMID: 35818710 PMCID: PMC9305976 DOI: 10.1021/acsami.2c08977] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 05/04/2023]
Abstract
Due to the enormous increase in the number of metal-organic frameworks (MOFs), combining molecular simulations with machine learning (ML) would be a very useful approach for the accurate and rapid assessment of the separation performances of thousands of materials. In this work, we combined these two powerful approaches, molecular simulations and ML, to evaluate MOF membranes and MOF/polymer mixed matrix membranes (MMMs) for six different gas separations: He/H2, He/N2, He/CH4, H2/N2, H2/CH4, and N2/CH4. Single-component gas uptakes and diffusivities were computed by grand canonical Monte Carlo (GCMC) and molecular dynamics (MD) simulations, respectively, and these simulation results were used to assess gas permeabilities and selectivities of MOF membranes. Physical, chemical, and energetic features of MOFs were used as descriptors, and eight different ML models were developed to predict gas adsorption and diffusion properties of MOFs. Gas permeabilities and membrane selectivities of 5249 MOFs and 31,494 MOF/polymer MMMs were predicted using these ML models. To examine the transferability of the ML models, we also focused on computer-generated, hypothetical MOFs (hMOFs) and predicted the gas permeability and selectivity of 1000 hMOF/polymer MMMs. The ML models that we developed accurately predict the uptake and diffusion properties of He, H2, N2, and CH4 gases in MOFs and will significantly accelerate the assessment of separation performances of MOF membranes and MOF/polymer MMMs. These models will also be useful to direct the extensive experimental efforts and computationally demanding molecular simulations to the fabrication and analysis of membrane materials offering high performance for a target gas separation.
|
27
|
Wang NN, Wang XG, Xiong GL, Yang ZY, Lu AP, Chen X, Liu S, Hou TJ, Cao DS. Machine learning to predict metabolic drug interactions related to cytochrome P450 isozymes. J Cheminform 2022; 14:23. [PMID: 35428354 PMCID: PMC9013037 DOI: 10.1186/s13321-022-00602-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 11/24/2021] [Accepted: 03/26/2022] [Indexed: 11/28/2022] Open
Abstract
Drug–drug interactions (DDIs) often cause serious adverse reactions and thus result in inestimable economic and social loss. Comprehensive DDI evaluation has become a major challenge in pharmaceutical research because experimental assessment is time-consuming and costly, so effective in silico methods are needed to predict and evaluate DDIs accurately and efficiently. In this study, based on a large number of substrates and inhibitors related to five important CYP450 isozymes (CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A4), a series of high-performance predictive models for metabolic DDIs were constructed using two machine learning methods (random forest and XGBoost) and four types of descriptors (MOE_2D, CATS, ECFP4 and MACCS). To reduce the uncertainty of individual models, a consensus method was applied to yield more reliable predictions. A series of evaluations showed that the consensus models were more reliable and robust for DDI predictions of new drug combinations. In internal validation, the overall prediction accuracy and AUC of the DDI models were around 0.8 and 0.9, respectively. On the external datasets, the model accuracy was 0.793 and 0.795 for multi-level validation and external validation, respectively. Furthermore, we compared our model with some recently published tools and then applied the final model to FDA-approved drugs, proposing 54,013 possible drug pairs with potential DDIs. In summary, we developed a powerful DDI predictive model from the perspective of the CYP450 enzyme family, which should be useful in future drug development and clinical pharmacy research.
|
28
|
Wang S, Xue X, Cheng M, Chen S, Liu C, Zhou L, Bi K, Ji X. High-Throughput Computational Screening of Metal-Organic Frameworks for CH4/H2 Separation by Synergizing Machine Learning and Molecular Simulation. ACTA CHIMICA SINICA 2022. [DOI: 10.6023/a22010031] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 02/06/2023]
|
29
|
Machine Learning Prediction of Resistance to Subinhibitory Antimicrobial Concentrations from Escherichia coli Genomes. mSystems 2021; 6:e0034621. [PMID: 34427505 PMCID: PMC8407197 DOI: 10.1128/msystems.00346-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
Escherichia coli is an important cause of bacterial infections worldwide, with multidrug-resistant strains imposing substantial costs on human lives. Besides therapeutic concentrations of antimicrobials in health care settings, the presence of subinhibitory antimicrobial residues in the environment and in clinics selects for antimicrobial resistance (AMR), but the underlying genetic repertoire is less well understood. Here, we used machine learning to predict the population doubling time and cell growth yield of 1,407 genetically diverse E. coli strains expanding under exposure to three subinhibitory concentrations of six classes of antimicrobials from single-nucleotide genetic variants, accessory gene variation, and the presence of known AMR genes. We predicted cell growth yields in the held-out test data with an average correlation (Spearman's ρ) of 0.63 (0.36 to 0.81 across concentrations) and cell doubling times with an average correlation of 0.59 (0.32 to 0.92 across concentrations), with moderate increases in sample size unlikely to improve predictions further. This finding points to the remaining missing heritability of growth under antimicrobial exposure being explained by effects that are too rare or weak to be captured unless sample size is dramatically increased, or by effects other than those conferred by the presence of individual single-nucleotide polymorphisms (SNPs) and genes. Predictions based on whole-genome information were generally superior to those based only on known AMR genes and were accurate for AMR at therapeutic concentrations. We pinpointed genes and SNPs determining the predicted growth and thereby recapitulated many known AMR determinants. Finally, we estimated the effect sizes of resistance genes across the entire collection of strains, disclosing the growth effects for known resistance genes in each individual strain.
Our results underscore the potential of predictive modeling of growth patterns from genomic data under subinhibitory concentrations of antimicrobials, although the remaining missing heritability poses a challenge for achieving the accuracy and precision required for clinical use. IMPORTANCE Predicting bacterial growth from genome sequences is important for a rapid characterization of strains in clinical diagnostics and to disclose candidate novel targets for anti-infective drugs. Previous studies have dissected the relationship between bacterial growth and genotype in mutant libraries for laboratory strains, yet no study so far has examined the predictive power of genome sequence in natural strains. In this study, we used a high-throughput phenotypic assay to measure the growth of a systematic collection of natural Escherichia coli strains and then employed machine learning models to predict bacterial growth from genomic data under nontherapeutic subinhibitory concentrations of antimicrobials that are common in nonclinical settings. We found a moderate to strong correlation between predicted and actual values for the different collected data sets. Moreover, we observed that the known resistance genes are still effective at sublethal concentrations, pointing to clinical implications of these concentrations.
|