1
Wang S, Yue H, Yuan X. Accelerating Polymer Discovery with Uncertainty-Guided PGCNN: Explainable AI for Predicting Properties and Mechanistic Insights. J Chem Inf Model 2024; 64:5500-5509. [PMID: 38953249 DOI: 10.1021/acs.jcim.4c00555] [Indexed: 07/03/2024]
Abstract
Deep learning holds great potential for expediting the discovery of new polymers from the vast chemical space. However, accurately predicting polymer properties for practical applications based on their monomer composition has long been a challenge. The main obstacles include insufficient data, ineffective representation encoding, and lack of explainability. To address these issues, we propose an interpretable model called the Polymer Graph Convolutional Neural Network (PGCNN) that can accurately predict various polymer properties. This model is trained using the RadonPy data set and validated using experimental data samples. By integrating evidential deep learning with the model, we can quantify the uncertainty of predictions and enable sample-efficient training through uncertainty-guided active learning. Additionally, we demonstrate that the global attention of the graph embedding can aid in discovering underlying physical principles by identifying important functional groups within polymers and associating them with specific material attributes. Lastly, we explore the high-throughput screening capability of our model by rapidly identifying thousands of promising candidates with low and high thermal conductivity from a pool of one million hypothetical polymers. In summary, our research not only advances our mechanistic understanding of polymers using explainable AI but also paves the way for data-driven trustworthy discovery of polymer materials.
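The uncertainty-guided active learning step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each pool sample already carries a predicted value and an uncertainty estimate; the function name, the tuple layout, and the sample IDs are hypothetical and are not taken from the paper.

```python
def select_most_uncertain(predictions, k):
    """Return the IDs of the k pool samples with the largest uncertainty.

    predictions: list of (sample_id, predicted_value, uncertainty) tuples.
    This is a generic acquisition rule, not the authors' exact procedure.
    """
    ranked = sorted(predictions, key=lambda item: item[2], reverse=True)
    return [sample_id for sample_id, _, _ in ranked[:k]]


# Hypothetical pool of unlabeled polymers with model outputs.
pool = [
    ("p1", 0.20, 0.05),
    ("p2", 0.30, 0.40),
    ("p3", 0.10, 0.10),
    ("p4", 0.50, 0.25),
]
next_batch = select_most_uncertain(pool, k=2)
```

In an active learning loop, the selected samples would be labeled (by simulation or experiment), added to the training set, and the model retrained before the next selection round.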
Affiliation(s)
- Shuyu Wang
- Department of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao, Hebei 066000, China
- Hongxing Yue
- Department of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao, Hebei 066000, China
- Xiaoming Yuan
- Department of Computer Science and Engineering, Northeastern University at Qinhuangdao, Qinhuangdao, Hebei 066000, China
2
Song S, Xu X, Lan H, Gao L, Lin J, Du L, Wang Y. Design of Co-Cured Multi-Component Thermosets with Enhanced Heat Resistance, Toughness, and Processability via a Machine Learning Approach. Macromol Rapid Commun 2024:e2400337. [PMID: 39018478 DOI: 10.1002/marc.202400337] [Received: 05/10/2024] [Revised: 06/30/2024] [Indexed: 07/19/2024]
Abstract
Designing heat-resistant thermosets with excellent comprehensive performance has been a long-standing challenge. Co-curing of various high-performance thermosets is an effective strategy, however, the traditional trial-and-error experiments have long research cycles for discovering new materials. Herein, a two-step machine learning (ML) assisted approach is proposed to design heat-resistant co-cured resins composed of polyimide (PI) and silicon-containing arylacetylene (PSA), that is, poly(silicon-alkyne imide) (PSI). First, two ML prediction models are established to evaluate the processability of PIs and their compatibility with PSA. Then, another two ML models are developed to predict the thermal decomposition temperature and flexural strength of the co-cured PSI resins. The optimal molecular structures and compositions of PSI resins are high-throughput screened. The screened PSI resins are experimentally verified to exhibit enhanced heat resistance, toughness, and processability. The research framework established in this work can be generalized to the rational design of other advanced multi-component polymeric materials.
Affiliation(s)
- Shuang Song
- Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
- Xinyao Xu
- Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
- Haoxiang Lan
- Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
- Liang Gao
- Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
- Jiaping Lin
- Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
- Lei Du
- Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
- Yuyuan Wang
- Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
3
Kehrein J, Bunker A, Luxenhofer R. POxload: Machine Learning Estimates Drug Loadings of Polymeric Micelles. Mol Pharm 2024; 21:3356-3374. [PMID: 38805643 DOI: 10.1021/acs.molpharmaceut.4c00086] [Indexed: 05/30/2024]
Abstract
Block copolymers, composed of poly(2-oxazoline)s and poly(2-oxazine)s, can serve as drug delivery systems; they form micelles that carry poorly water-soluble drugs. Many recent studies have investigated the effects of structural changes of the polymer and the hydrophobic cargo on drug loading. In this work, we combine these data to establish an extended formulation database. Different molecular properties and fingerprints are tested for their applicability to serve as formulation-specific mixture descriptors. A variety of classification and regression models are built for different descriptor subsets and thresholds of loading efficiency and loading capacity, with the best models achieving overall good statistics for both cross- and external validation (balanced accuracies of 0.8). Subsequently, important features are dissected for interpretation, and the DrugBank is screened for potential therapeutic use cases where these polymers could be used to develop novel formulations of hydrophobic drugs. The most promising models are provided as an open-source software tool for other researchers to test the applicability of these delivery systems for potential new drug candidates.
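The balanced accuracy quoted in the abstract is the mean of sensitivity and specificity for a binary classifier. A minimal sketch of the metric follows; the labels are invented for illustration (1 could stand for "loading above a chosen threshold") and do not come from the POxload database.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return (sensitivity + specificity) / 2


# Hypothetical labels: 4 positive and 6 negative formulations.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)  # sensitivity 3/4, specificity 5/6
```

Unlike plain accuracy, this metric is not inflated when one class dominates the data set, which is why it is commonly reported for imbalanced classification tasks.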
Affiliation(s)
- Josef Kehrein
- Soft Matter Chemistry, Department of Chemistry, Faculty of Science, University of Helsinki, A. I. Virtasen aukio 1, 00014 Helsinki, Finland
- Drug Research Program, Division of Pharmaceutical Biosciences Faculty of Pharmacy, University of Helsinki, Viikinkaari 5 E, 00014 Helsinki, Finland
- Alex Bunker
- Drug Research Program, Division of Pharmaceutical Biosciences Faculty of Pharmacy, University of Helsinki, Viikinkaari 5 E, 00014 Helsinki, Finland
- Robert Luxenhofer
- Soft Matter Chemistry, Department of Chemistry, Faculty of Science, University of Helsinki, A. I. Virtasen aukio 1, 00014 Helsinki, Finland
4
Yang M, Zhu JJ, McGaughey AL, Priestley RD, Hoek EMV, Jassby D, Ren ZJ. Machine Learning for Polymer Design to Enhance Pervaporation-Based Organic Recovery. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:10128-10139. [PMID: 38743597 DOI: 10.1021/acs.est.4c00060] [Indexed: 05/16/2024]
Abstract
Pervaporation (PV) is an effective membrane separation process for organic dehydration, recovery, and upgrading. However, it is crucial to improve membrane materials beyond the current permeability-selectivity trade-off. In this research, we introduce machine learning (ML) models to identify high-potential polymers, greatly improving the efficiency and reducing cost compared to conventional trial-and-error approach. We utilized the largest PV data set to date and incorporated polymer fingerprints and features, including membrane structure, operating conditions, and solute properties. Dimensionality reduction, missing data treatment, seed randomness, and data leakage management were employed to ensure model robustness. The optimized LightGBM models achieved RMSE of 0.447 and 0.360 for separation factor and total flux, respectively (logarithmic scale). Screening approximately 1 million hypothetical polymers with ML models resulted in identifying polymers with a predicted permeation separation index >30 and synthetic accessibility score <3.7 for acetic acid extraction. This study demonstrates the promise of ML to accelerate tailored membrane designs.
Affiliation(s)
- Meiqi Yang
- Department of Civil and Environmental Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
- Jun-Jie Zhu
- Department of Civil and Environmental Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
- Allyson L McGaughey
- Department of Civil and Environmental Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Rodney D Priestley
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Eric M V Hoek
- Department of Civil & Environmental Engineering, University of California Los Angeles, Los Angeles, California 90095, United States
- David Jassby
- Department of Civil & Environmental Engineering, University of California Los Angeles, Los Angeles, California 90095, United States
- Zhiyong Jason Ren
- Department of Civil and Environmental Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
5
Woodhouse AW, Kocaarslan A, Garden JA, Mutlu H. Unlocking the Potential of Polythioesters. Macromol Rapid Commun 2024:e2400260. [PMID: 38824417 DOI: 10.1002/marc.202400260] [Received: 04/22/2024] [Revised: 05/20/2024] [Indexed: 06/03/2024]
Abstract
As the demand for sustainable polymers increases, most research efforts have focused on polyesters, which can be bioderived and biodegradable. Yet analogous polythioesters, where one of the oxygen atoms has been replaced by a sulfur atom, remain a relatively untapped source of potential. The incorporation of sulfur allows the polymer to exhibit a wide range of favorable properties, such as thermal resistance, degradability, and high refractive index. Polythioester synthesis represents a frontier in research, holding the promise of paving the way for eco-friendly alternatives to conventional polyesters. Moreover, polythioester research can also open avenues to the development of sustainable and recyclable materials. In the last 25 years, many methods to synthesize polythioesters have been developed. However, to date no industrial synthesis of polythioesters has been developed due to challenges of costs, yields, and the toxicity of the by-products. This review will summarize the recent advances in polythioester synthesis, covering step-growth polymerization, ring-opening polymerization (ROP), and biosynthesis. Crucially, the benefits and challenges of the processes will be highlighted, paying particular attention to their sustainability, with the aim of encouraging further exploration and research into the fast-growing field of polythioesters.
Affiliation(s)
- Adam W Woodhouse
- Institut de Science des Matériaux de Mulhouse, UMR 7361 CNRS/Université de Haute Alsace, 15 Rue Jean Starcky, Mulhouse, Cedex, 68057, France
- School of Chemistry, Joseph Black Building, David Brewster Road, Edinburgh, EH9 3FJ, UK
- Azra Kocaarslan
- Institute of Chemical Technology and Polymer Chemistry, Karlsruhe Institute of Technology, Engesserstrasse 15, 76131, Karlsruhe, Germany
- Jennifer A Garden
- School of Chemistry, Joseph Black Building, David Brewster Road, Edinburgh, EH9 3FJ, UK
- Hatice Mutlu
- Institut de Science des Matériaux de Mulhouse, UMR 7361 CNRS/Université de Haute Alsace, 15 Rue Jean Starcky, Mulhouse, Cedex, 68057, France
6
Uddin MJ, Fan J. Interpretable Machine Learning Framework to Predict the Glass Transition Temperature of Polymers. Polymers (Basel) 2024; 16:1049. [PMID: 38674969 PMCID: PMC11054142 DOI: 10.3390/polym16081049] [Received: 03/01/2024] [Revised: 03/25/2024] [Accepted: 03/27/2024] [Indexed: 04/28/2024]
Abstract
The glass transition temperature of polymers is a key parameter in meeting the application requirements for energy absorption. Previous studies have provided some data from slow, expensive trial-and-error procedures. By recognizing these data, machine learning algorithms are able to extract valuable knowledge and disclose essential insights. In this study, a dataset of 7174 samples was utilized. The polymers were numerically represented using two methods: Morgan fingerprint and molecular descriptor. During preprocessing, the dataset was scaled using a standard scaler technique. We removed the features with small variance from the dataset and used the Pearson correlation technique to exclude the features that were highly connected. Then, the most significant features were selected using the recursive feature elimination method. Nine machine learning techniques were employed to predict the glass transition temperature and tune their hyperparameters. The models were compared using the performance metrics of mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2). We observed that the extra tree regressor provided the best results. Significant features were also identified using statistical machine learning methods. The SHAP method was also employed to demonstrate the influence of each feature on the model's output. This framework can be adaptable to other properties at a low computational expense.
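The three comparison metrics named in the abstract (MAE, RMSE, and the coefficient of determination) can be computed as in the sketch below. The temperature values are made up for illustration and are not from the study's 7174-sample dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired true and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Hypothetical glass transition temperatures in degrees Celsius.
tg_true = [100.0, 150.0, 200.0, 250.0]
tg_pred = [110.0, 140.0, 205.0, 245.0]
mae, rmse, r2 = regression_metrics(tg_true, tg_pred)
```

MAE and RMSE are in the units of the target, with RMSE penalizing large errors more heavily, while R² is unitless and measures the fraction of variance explained.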
Affiliation(s)
- Jitang Fan
- School of Mechatronical Engineering, Beijing Institute of Technology, Beijing 100081, China
7
Shi J, Walsh D, Zou W, Rebello NJ, Deagen ME, Fransen KA, Gao X, Olsen BD, Audus DJ. Calculating Pairwise Similarity of Polymer Ensembles via Earth Mover's Distance. ACS POLYMERS AU 2024; 4:66-76. [PMID: 38371731 PMCID: PMC10870752 DOI: 10.1021/acspolymersau.3c00029] [Received: 09/21/2023] [Revised: 11/28/2023] [Accepted: 11/29/2023] [Indexed: 02/20/2024]
Abstract
Synthetic polymers, in contrast to small molecules and deterministic biomacromolecules, are typically ensembles composed of polymer chains with varying numbers, lengths, sequences, chemistry, and topologies. While numerous approaches exist for measuring pairwise similarity among small molecules and sequence-defined biomacromolecules, accurately determining the pairwise similarity between two polymer ensembles remains challenging. This work proposes the earth mover's distance (EMD) metric to calculate the pairwise similarity score between two polymer ensembles. EMD offers a greater resolution of chemical differences between polymer ensembles than the averaging method and provides a quantitative numeric value representing the pairwise similarity between polymer ensembles in alignment with chemical intuition. The EMD approach for assessing polymer similarity enhances the development of accurate chemical search algorithms within polymer databases and can improve machine learning techniques for polymer design, optimization, and property prediction.
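To give a feel for the earth mover's distance, the sketch below computes it for the one-dimensional special case, where it reduces to the area between the two cumulative distributions. This is a simplification: the paper compares ensembles using chemical similarity between components, whereas here the ensemble members differ only in a scalar (a hypothetical chain length), and all numbers are invented.

```python
def emd_1d(positions, weights_a, weights_b):
    """Earth mover's distance between two distributions on shared, sorted 1D positions.

    Weights are normalized to sum to 1, so the result is the minimum average
    distance that mass must travel to turn distribution A into distribution B.
    """
    if not (len(positions) == len(weights_a) == len(weights_b)):
        raise ValueError("positions and weights must have the same length")
    if any(b <= a for a, b in zip(positions, positions[1:])):
        raise ValueError("positions must be strictly increasing")
    total_a, total_b = sum(weights_a), sum(weights_b)
    pa = [w / total_a for w in weights_a]
    pb = [w / total_b for w in weights_b]
    cdf_gap = 0.0   # running difference between the two cumulative distributions
    distance = 0.0
    for i in range(len(positions) - 1):
        cdf_gap += pa[i] - pb[i]
        distance += abs(cdf_gap) * (positions[i + 1] - positions[i])
    return distance


# Hypothetical ensembles described by mole fractions at three chain lengths.
chain_lengths = [10, 20, 30]
ensemble_a = [0.5, 0.5, 0.0]
ensemble_b = [0.0, 0.5, 0.5]
d = emd_1d(chain_lengths, ensemble_a, ensemble_b)
```

An averaging approach would only compare the mean chain lengths of the two ensembles, whereas the EMD also responds to differences in how the mass is spread across the distribution.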
Affiliation(s)
- Jiale Shi
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Dylan Walsh
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Weizhong Zou
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Nathan J. Rebello
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Michael E. Deagen
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Katharina A. Fransen
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Xian Gao
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Bradley D. Olsen
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Debra J. Audus
- Materials Science and Engineering Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
8
Wang C, Wang L, Yu H, Seo A, Wang Z, Rajabzadeh S, Ni BJ, Shon HK. Machine learning for layer-by-layer nanofiltration membrane performance prediction and polymer candidate exploration. CHEMOSPHERE 2024; 350:140999. [PMID: 38151066 DOI: 10.1016/j.chemosphere.2023.140999] [Received: 10/09/2023] [Revised: 12/18/2023] [Accepted: 12/19/2023] [Indexed: 12/29/2023]
Abstract
In this study, machine learning-based models were established for layer-by-layer (LBL) nanofiltration (NF) membrane performance prediction and polymer candidate exploration. Four different models, i.e., linear, random forest (RF), boosted tree (BT), and eXtreme Gradient Boosting (XGBoost), were formed, and membrane performance prediction was determined in terms of membrane permeability and selectivity. The XGBoost exhibited optimal prediction accuracy for membrane permeability (coefficient of determination (R2): 0.99) and membrane selectivity (R2: 0.80). The Shapley Additive exPlanation (SHAP) method was utilized to evaluate the effects of different LBL NF membrane fabrication conditions on membrane performances. The SHAP method was also used to identify the relationships between polymer structure and membrane performance. Polymers were represented by Morgan fingerprint, which is an effective description approach for developing modeling. Based on the SHAP value results, two reference Morgan fingerprints were constructed containing atomic groups with positive contributions to membrane permeability and selectivity. According to the reference Morgan fingerprint, 204 potential polymers were explored from the largest polymer database (PoLyInfo). By calculating the similarities between each potential polymer and both reference Morgan fingerprints, 23 polymer candidates were selected and could be further used for LBL NF membrane fabrication with the potential for providing good membrane performance. Overall, this work provided new ways both for LBL NF membrane performance prediction and high-performance polymer candidate exploration. The source code for the models and algorithms used in this study is publicly available to facilitate replication and further research. https://github.com/wangliwfsd/LLNMPP/.
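The candidate selection step, which compares each polymer's fingerprint with two reference fingerprints, can be sketched as follows. The abstract does not name the similarity measure, so Tanimoto similarity is an assumption here; the bit indices, polymer names, and threshold are hypothetical, and fingerprints are represented simply as sets of on-bit indices.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def select_candidates(candidates, ref_permeability, ref_selectivity, threshold):
    """Keep polymers whose fingerprint is similar to BOTH reference fingerprints."""
    selected = []
    for name, bits in candidates.items():
        if (tanimoto(bits, ref_permeability) >= threshold
                and tanimoto(bits, ref_selectivity) >= threshold):
            selected.append(name)
    return selected


# Hypothetical reference fingerprints and candidate polymers.
ref_perm = {1, 2, 3}
ref_sel = {2, 3, 4}
candidates = {"polymer_A": {1, 2, 3, 4}, "polymer_B": {7, 8}}
shortlist = select_candidates(candidates, ref_perm, ref_sel, threshold=0.5)
```

Requiring similarity to both references mirrors the abstract's goal of finding polymers expected to perform well on permeability and selectivity at the same time.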
Affiliation(s)
- Chen Wang
- School of Civil and Environmental Engineering, University of Technology Sydney, Sydney, New South Wales, 2007, Australia
- Li Wang
- CSIRO Space and Astronomy, PO Box 1130, Bentley, WA, 6102, Australia
- Hanwei Yu
- School of Civil and Environmental Engineering, University of Technology Sydney, Sydney, New South Wales, 2007, Australia
- Allan Seo
- School of Civil and Environmental Engineering, University of Technology Sydney, Sydney, New South Wales, 2007, Australia
- Zhining Wang
- Shandong Provincial Key Laboratory of Water Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Shandong University, Qingdao, 266237, China
- Saeid Rajabzadeh
- School of Civil and Environmental Engineering, University of Technology Sydney, Sydney, New South Wales, 2007, Australia
- Bing-Jie Ni
- School of Civil and Environmental Engineering, University of New South Wales, Sydney, New South Wales, 2052, Australia
- Ho Kyong Shon
- School of Civil and Environmental Engineering, University of Technology Sydney, Sydney, New South Wales, 2007, Australia
9
Zhang P, Kearney L, Bhowmik D, Fox Z, Naskar AK, Gounley J. Transferring a Molecular Foundation Model for Polymer Property Predictions. J Chem Inf Model 2023; 63:7689-7698. [PMID: 38055952 DOI: 10.1021/acs.jcim.3c01650] [Indexed: 12/08/2023]
Abstract
Transformer-based large language models have remarkable potential to accelerate design optimization for applications such as drug development and material discovery. Self-supervised pretraining of transformer models requires large-scale data sets, which are often sparsely populated in topical areas such as polymer science. State-of-the-art approaches for polymers conduct data augmentation to generate additional samples but unavoidably incur extra computational costs. In contrast, large-scale open-source data sets are available for small molecules and provide a potential solution to data scarcity through transfer learning. In this work, we show that using transformers pretrained on small molecules and fine-tuned on polymer properties achieves comparable accuracy to those trained on augmented polymer data sets for a series of benchmark prediction tasks.
Affiliation(s)
- Pei Zhang
- Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
- Logan Kearney
- Chemical Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
- Debsindhu Bhowmik
- Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
- Zachary Fox
- Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
- Amit K Naskar
- Chemical Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
- John Gounley
- Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
10
Deng S, Chen C, Li K, Chen X, Xia K, Li S. Structure-Based Multilevel Descriptors for High-throughput Screening of Elastomers. J Phys Chem B 2023; 127:10077-10087. [PMID: 37942925 DOI: 10.1021/acs.jpcb.3c06025] [Indexed: 11/10/2023]
Abstract
To discover new materials, high-throughput screening (HTS) with machine learning (ML) requires universally available descriptors that can accurately predict the desired properties. For elastomers, experimental and simulation data in current descriptors may not be available for all candidates of interest, hindering elastomer discovery through HTS. To address this challenge, we introduce structure-based multilevel (SM) descriptors of elastomers derived solely from molecular structure that is universally available. Our SM descriptors are hierarchically organized to capture both local soft and hard segment structures as well as the global structures of elastomers. With the SM-Morgan Fingerprint (SM-MF) descriptor, one of our SM descriptors, a machine learning model accurately predicts elastomer toughness with a remarkable accuracy of 0.91. Furthermore, an HTS pipeline is established to swiftly screen elastomers with targeted toughness. We also demonstrate the generality and applicability of SM descriptors by using them to construct HTS pipelines for screening elastomers with a targeted critical strain or Young's modulus. The user-friendliness and low computational cost of SM descriptors make them a promising tool to significantly enhance HTS in the search for novel materials.
Affiliation(s)
- Siyan Deng
- School of Materials Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
- Chao Chen
- School of Materials Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
- Ke Li
- Institute of Materials Research and Engineering (IMRE), Agency for Science, Technology and Research (A*STAR), 2 Fusionopolis Way, Innovis #08-03, Singapore 138634, Republic of Singapore
- Xi Chen
- School of Materials Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
- Kelin Xia
- School of Physical and Mathematical Sciences, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
- Shuzhou Li
- School of Materials Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
11
Hu J, Li Z, Lin J, Zhang L. Prediction and Interpretability of Glass Transition Temperature of Homopolymers by Data-Augmented Graph Convolutional Neural Networks. ACS APPLIED MATERIALS & INTERFACES 2023; 15:54006-54017. [PMID: 37934171 DOI: 10.1021/acsami.3c13698] [Indexed: 11/08/2023]
Abstract
Establishing the structure-property relationship by machine learning (ML) models is extremely valuable for accelerating the molecular design of polymers. However, existing ML models for the polymers are subject to scarcity issues of training data and fewer variations of graph structures of molecules. In addition, limited works have explored the interpretability of ML models to infer the latent knowledge in the field of polymer science that could inspire ML-assisted molecular design. In this contribution, we integrate graph convolutional neural networks (GCNs) with data augmentation strategy to predict the glass transition temperature Tg of polymers. It is demonstrated that the data-augmented GCN model outperforms the conventional models and achieves a higher accuracy for the prediction of Tg despite a small amount of training data. Furthermore, taking advantage of molecular graph representations, the data-augmented GCN model has the capability to infer the importance of atoms or substructures from the understanding of Tg, which generally agrees with the experimental findings in the field of polymer science. The inferred knowledge of the GCN model is used to advise on the design of functional polymers with specific Tg. The data-augmented GCN model possesses prominent superiorities in the establishment of structure-property relationship and also provides an efficient way for accelerating the rational design of polymer molecules.
Affiliation(s)
- Junyang Hu
- Shanghai Key Laboratory of Advanced Polymeric Materials, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Zean Li
- Shanghai Key Laboratory of Advanced Polymeric Materials, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Jiaping Lin
- Shanghai Key Laboratory of Advanced Polymeric Materials, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Liangshun Zhang
- Shanghai Key Laboratory of Advanced Polymeric Materials, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
12
AlFaraj Y, Mohapatra S, Shieh P, Husted KEL, Ivanoff DG, Lloyd EM, Cooper JC, Dai Y, Singhal AP, Moore JS, Sottos NR, Gomez-Bombarelli R, Johnson JA. A Model Ensemble Approach Enables Data-Driven Property Prediction for Chemically Deconstructable Thermosets in the Low-Data Regime. ACS CENTRAL SCIENCE 2023; 9:1810-1819. [PMID: 37780353 PMCID: PMC10540282 DOI: 10.1021/acscentsci.3c00502] [Received: 04/20/2023] [Indexed: 10/03/2023]
Abstract
Thermosets present sustainability challenges that could potentially be addressed through the design of deconstructable variants with tunable properties; however, the combinatorial space of possible thermoset molecular building blocks (e.g., monomers, cross-linkers, and additives) and manufacturing conditions is vast, and predictive knowledge for how combinations of these molecular components translate to bulk thermoset properties is lacking. Data science could overcome these problems, but computational methods are difficult to apply to multicomponent, amorphous, statistical copolymer materials for which little data exist. Here, leveraging a data set with 101 examples, we introduce a closed-loop experimental, machine learning (ML), and virtual screening strategy to enable predictions of the glass transition temperature (Tg) of polydicyclopentadiene (pDCPD) thermosets containing cleavable bifunctional silyl ether (BSE) comonomers and/or cross-linkers with varied compositions and loadings. Molecular features and formulation variables are used as model inputs, and uncertainty is quantified through model ensembling, which together with heavy regularization helps to avoid overfitting and ultimately achieves predictions within <15 °C for thermosets with compositionally diverse BSEs. This work offers a path to predicting the properties of thermosets based on their molecular building blocks, which may accelerate the discovery of promising plastics, rubbers, and composites with improved functionality and controlled deconstructability.
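Quantifying uncertainty through model ensembling, as the abstract describes, commonly means taking the spread of the members' predictions as the uncertainty estimate. The sketch below shows that idea with three toy stand-in models; the linear functions and the input value are hypothetical and bear no relation to the authors' trained models.

```python
import statistics


def ensemble_predict(models, x):
    """Return (mean prediction, sample standard deviation) across ensemble members."""
    predictions = [model(x) for model in models]
    return statistics.mean(predictions), statistics.stdev(predictions)


# Toy stand-ins for independently trained members of an ensemble.
members = [
    lambda x: 2.0 * x + 1.0,
    lambda x: 2.0 * x,
    lambda x: 2.0 * x - 1.0,
]
tg_mean, tg_std = ensemble_predict(members, 50.0)
```

A large standard deviation flags inputs on which the members disagree, which in a low-data setting can indicate formulations that lie far from the training examples.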
Affiliation(s)
- Yasmeen S. AlFaraj
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States of America
- Somesh Mohapatra
  - Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States of America
- Peyton Shieh
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States of America
- Keith E. L. Husted
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States of America
- Douglass G. Ivanoff
  - Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States of America
  - The Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States of America
- Evan M. Lloyd
  - The Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States of America
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States of America
- Julian C. Cooper
  - The Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States of America
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States of America
- Yutong Dai
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States of America
- Avni P. Singhal
  - Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States of America
- Jeffrey S. Moore
  - Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States of America
  - The Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States of America
- Nancy R. Sottos
  - Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States of America
  - The Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States of America
- Rafael Gomez-Bombarelli
  - Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States of America
- Jeremiah A. Johnson
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States of America
|
13
|
Yue T, He J, Tao L, Li Y. High-Throughput Screening and Prediction of High Modulus of Resilience Polymers Using Explainable Machine Learning. J Chem Theory Comput 2023; 19:4641-4653. [PMID: 37338332 DOI: 10.1021/acs.jctc.3c00131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2023]
Abstract
The ability to store and release elastic strain energy, as well as mechanical strength, are crucial factors in both natural and man-made mechanical systems. The modulus of resilience (R) indicates a material's capacity to absorb and release elastic strain energy, with the yield strength (σy) and Young's modulus (E) as R = σy²/(2E) for linear elastic solids. To improve the R in linear elastic solids, a high σy and low E combination in materials is sought after. However, achieving this combination is a significant challenge as both properties typically increase together. To address this challenge, we propose a computational method to quickly identify polymers with a high modulus of resilience using machine learning (ML) and validate the predictions through high-fidelity molecular dynamics (MD) simulations. Our approach commences by training single-task ML models, multitask ML models, and Evidential Deep Learning models to forecast the mechanical properties of polymers based on experimentally reported values. Utilizing explainable ML models, we were able to determine the critical substructures that significantly impact the mechanical properties of polymers, such as E and σy. This information can be utilized to create and develop new polymers with improved mechanical characteristics. Our single-task and multitask ML models can predict the properties of 12 854 real polymers and 8 million hypothetical polyimides and uncover 10 new real polymers and 10 hypothetical polyimides with exceptional modulus of resilience. The improved modulus of resilience of these novel polymers was validated through MD simulations. Our method efficiently speeds up the discovery of high-performing polymers using ML predictions and MD validation and can be applied to other polymer material discovery challenges, such as polymer membranes, dielectric polymers, and more.
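As a small worked illustration of the modulus-of-resilience relation quoted in this abstract, R = σy²/(2E), the sketch below evaluates it for two made-up materials. The function name and the numerical values are assumptions chosen only to show why a high yield strength combined with a low Young's modulus raises R; they are not data from the paper.

```python
def modulus_of_resilience(yield_strength_pa: float, youngs_modulus_pa: float) -> float:
    """Return R = sigma_y^2 / (2 E) in J/m^3 for a linear elastic solid."""
    if youngs_modulus_pa <= 0:
        raise ValueError("Young's modulus must be positive")
    return yield_strength_pa ** 2 / (2.0 * youngs_modulus_pa)

# Hypothetical materials with the same yield strength (50 MPa)
# but different stiffness (3 GPa versus 1 GPa).
stiff = modulus_of_resilience(50e6, 3e9)
soft = modulus_of_resilience(50e6, 1e9)
print(f"stiff: {stiff:.3e} J/m^3, soft: {soft:.3e} J/m^3, ratio: {soft / stiff:.1f}")
```

At fixed yield strength, cutting E by a factor of three triples R, which is the trade-off the abstract describes.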
Affiliation(s)
- Tianle Yue
  - Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Jinlong He
  - Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Lei Tao
  - Department of Mechanical Engineering, University of Connecticut, Storrs, Connecticut 06269, United States
- Ying Li
  - Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
|
14
|
Ricci E, Vergadou N. Integrating Machine Learning in the Coarse-Grained Molecular Simulation of Polymers. J Phys Chem B 2023; 127:2302-2322. [PMID: 36888553 DOI: 10.1021/acs.jpcb.2c06354] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/09/2023]
Abstract
Machine learning (ML) is having an increasing impact on the physical sciences, engineering, and technology, and its integration into molecular simulation frameworks holds great potential to expand their scope of applicability to complex materials and facilitate fundamental knowledge and reliable property predictions, contributing to the development of efficient materials design routes. The application of ML in materials informatics in general, and polymer informatics in particular, has led to interesting results; however, great untapped potential lies in the integration of ML techniques into the multiscale molecular simulation methods for the study of macromolecular systems, specifically in the context of Coarse Grained (CG) simulations. In this Perspective, we aim at presenting the pioneering recent research efforts in this direction and discussing how these new ML-based techniques can contribute to critical aspects of the development of multiscale molecular simulation methods for bulk complex chemical systems, especially polymers. Prerequisites for the implementation of such ML-integrated methods and open challenges that need to be met toward the development of general systematic ML-based coarse graining schemes for polymers are discussed.
Affiliation(s)
- Eleonora Ricci
  - Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
  - Institute of Informatics and Telecommunications, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
- Niki Vergadou
  - Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
|
15
|
Terrones GG, Duan C, Nandy A, Kulik HJ. Low-cost machine learning prediction of excited state properties of iridium-centered phosphors. Chem Sci 2023; 14:1419-1433. [PMID: 36794185 PMCID: PMC9906783 DOI: 10.1039/d2sc06150c] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/07/2022] [Accepted: 01/05/2023] [Indexed: 01/07/2023] Open
Abstract
Prediction of the excited state properties of photoactive iridium complexes challenges ab initio methods such as time-dependent density functional theory (TDDFT) both from the perspective of accuracy and of computational cost, complicating high-throughput virtual screening (HTVS). We instead leverage low-cost machine learning (ML) models and experimental data for 1380 iridium complexes to perform these prediction tasks. We find the best-performing and most transferable models to be those trained on electronic structure features from low-cost density functional tight binding calculations. Using artificial neural network (ANN) models, we predict the mean emission energy of phosphorescence, the excited state lifetime, and the emission spectral integral for iridium complexes with accuracy competitive with or superseding that of TDDFT. We conduct feature importance analysis to determine that high cyclometalating ligand ionization potential correlates to high mean emission energy, while high ancillary ligand ionization potential correlates to low lifetime and low spectral integral. As a demonstration of how our ML models can be used for HTVS and the acceleration of chemical discovery, we curate a set of novel hypothetical iridium complexes and use uncertainty-controlled predictions to identify promising ligands for the design of new phosphors while retaining confidence in the quality of the ANN predictions.
Affiliation(s)
- Gianmarco G. Terrones
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Chenru Duan
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Aditya Nandy
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Heather J. Kulik
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
16
|
Tao L, He J, Arbaugh T, McCutcheon JR, Li Y. Machine learning prediction on the fractional free volume of polymer membranes. J Memb Sci 2023. [DOI: 10.1016/j.memsci.2022.121131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
|
17
|
Tao L, Arbaugh T, Byrnes J, Varshney V, Li Y. Unified machine learning protocol for copolymer structure-property predictions. STAR Protoc 2022; 3:101875. [PMID: 36595914 PMCID: PMC9700038 DOI: 10.1016/j.xpro.2022.101875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2022] [Revised: 10/06/2022] [Accepted: 11/01/2022] [Indexed: 11/23/2022] Open
Abstract
Structure-property relationships are extremely valuable when predicting the properties of polymers. This protocol demonstrates a step-by-step approach, based on multiple machine learning (ML) architectures, which is capable of processing copolymer types such as alternating, random, block, and gradient copolymers. We detail steps for necessary software installation and construction of datasets. We further describe training and optimization steps for four neural network models and subsequent model visualization and comparison using training and test values. For complete details on the use and execution of this protocol, please refer to Tao et al. (2022).
Affiliation(s)
- Lei Tao
  - Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Tom Arbaugh
  - Department of Physics, Wesleyan University, Middletown, CT 06459, USA
- Vikas Varshney
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, Dayton, OH 45433, USA
- Ying Li
  - Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
  - Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, WI 53706-1572, USA
  - Corresponding author
|
18
|
Schmid F. Understanding and Modeling Polymers: The Challenge of Multiple Scales. ACS Polymers Au 2022. [DOI: 10.1021/acspolymersau.2c00049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Affiliation(s)
- Friederike Schmid
  - Institut für Physik, Johannes Gutenberg-Universität Mainz, Staudingerweg 9, 55128 Mainz, Germany
|
19
|
Andraju N, Curtzwiler GW, Ji Y, Kozliak E, Ranganathan P. Machine-Learning-Based Predictions of Polymer and Postconsumer Recycled Polymer Properties: A Comprehensive Review. ACS Applied Materials & Interfaces 2022; 14:42771-42790. [PMID: 36102317 DOI: 10.1021/acsami.2c08301] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
There has been a tremendous increase in demand for virgin and postconsumer recycled (PCR) polymers due to their wide range of chemical and physical characteristics. Despite the numerous potential benefits of using a data-driven approach to polymer design, major hurdles exist in the development of polymer informatics due to the complicated hierarchical polymer structures. In this review, a brief introduction to virgin polymer structure, PCR polymers, compatibilization of polymers to be recycled, and their characterization using sensor array technologies, as well as the factors affecting polymer properties, is provided. Machine-learning (ML) algorithms are gaining attention as cost-effective scalable solutions to exploit the physical and chemical structures of polymers. The basic steps for applying ML in polymer science such as fingerprinting, algorithms, open-source databases, representations, and polymer design are detailed in this review. Further, the state of the art in predicting various polymer material properties using ML is reviewed. Finally, we discuss open-ended research questions on ML application to PCR polymers as well as potential challenges in the prediction of their properties using artificial intelligence for more efficient and targeted PCR polymer discovery and development.
Affiliation(s)
- Nagababu Andraju
  - School of Electrical Engineering and Computer Science (SEECS), University of North Dakota, Grand Forks, North Dakota 58202, United States
- Greg W Curtzwiler
  - Polymer and Food Protection Consortium, Department of Food Science and Human Nutrition, Iowa State University, Ames, Iowa 50011, United States
- Yun Ji
  - Department of Chemical Engineering, University of North Dakota, Grand Forks, North Dakota 58202, United States
- Evguenii Kozliak
  - Department of Chemistry, University of North Dakota, Grand Forks, North Dakota 58202, United States
- Prakash Ranganathan
  - School of Electrical Engineering and Computer Science (SEECS), University of North Dakota, Grand Forks, North Dakota 58202, United States
|
20
|
Yang J, Tao L, He J, McCutcheon JR, Li Y. Machine learning enables interpretable discovery of innovative polymers for gas separation membranes. Science Advances 2022; 8:eabn9545. [PMID: 35857839 PMCID: PMC9299556 DOI: 10.1126/sciadv.abn9545] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 01/04/2022] [Accepted: 06/07/2022] [Indexed: 05/21/2023]
Abstract
Polymer membranes perform innumerable separations with far-reaching environmental implications. Despite decades of research, design of new membrane materials remains a largely Edisonian process. To address this shortcoming, we demonstrate a generalizable, accurate machine learning (ML) implementation for the discovery of innovative polymers with ideal performance. Specifically, multitask ML models are trained on experimental data to link polymer chemistry to gas permeabilities of He, H2, O2, N2, CO2, and CH4. We interpret the ML models and extract valuable insights into the contributions of different chemical moieties to permeability and selectivity. We then screen over 9 million hypothetical polymers and identify thousands that lie well above current performance upper bounds, including hundreds of never-before-seen ultrapermeable polymer membranes with O2 and CO2 permeability greater than 10⁴ and 10⁵ Barrers, respectively. High-fidelity molecular dynamics simulations confirm the ML-predicted gas permeabilities of the promising candidates, which suggests that many can be translated to reality.
Affiliation(s)
- Jason Yang
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Lei Tao
  - Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Jinlong He
  - Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Jeffrey R. McCutcheon
  - Department of Chemical & Biomolecular Engineering, Center for Environmental Sciences and Engineering, University of Connecticut, Storrs, CT 06269, USA
  - Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA
- Ying Li
  - Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
  - Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA
  - Corresponding author.
|
21
|
Wang M, Jiang J. Accelerating Discovery of High Fractional Free Volume Polymers from a Data-Driven Approach. ACS Applied Materials & Interfaces 2022; 14:31203-31215. [PMID: 35767720 DOI: 10.1021/acsami.2c03917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
As a fundamental structure characteristic in polymers, fractional free volume (FFV) plays an indispensable role in governing polymer properties and performance. However, the design of new high-FFV polymers is challenging. In this study, we report a data-driven approach and aim to accelerate the discovery of high-FFV polymers. First, a computational method is proposed to calculate FFV, and a two-step fragmentation method is developed to construct a fragment library for digital representation of polymer structures. Data mining is employed to identify promising fragments for high FFV. Subsequently, machine learning (ML) models are trained using a data set with 1683 polymers and their excellent transferability is demonstrated by out-of-sample predictions in another data set with 11,479 polymers. Finally, the ML models are used to screen ∼1 million hypothetical polymers, and 29,482 polymers with FFV > 0.2 are shortlisted; representative high-FFV polymers are validated by molecular simulations, and design strategies are highlighted. To further facilitate the discovery of new high-FFV polymers, we develop an online interactive platform https://ffv-prediction.herokuapp.com, which allows for rapid FFV predictions, given polymer structures. The data-driven approach in this study might advance the development of new high-FFV polymers and further explore quantitative structure-property relationships for polymers.
Affiliation(s)
- Mao Wang
  - Department of Chemical and Biomolecular Engineering, National University of Singapore, 117576 Singapore, Singapore
- Jianwen Jiang
  - Department of Chemical and Biomolecular Engineering, National University of Singapore, 117576 Singapore, Singapore
|
22
|
Tao L, Byrnes J, Varshney V, Li Y. Machine learning strategies for the structure-property relationship of copolymers. iScience 2022; 25:104585. [PMID: 35789847 PMCID: PMC9249671 DOI: 10.1016/j.isci.2022.104585] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/28/2022] [Revised: 05/26/2022] [Accepted: 06/07/2022] [Indexed: 11/15/2022] Open
Abstract
Establishing the structure-property relationship is extremely valuable for the molecular design of copolymers. However, machine learning (ML) models that can incorporate both the chemical composition and the sequence distribution of monomers, and that generalize across copolymer types (e.g., alternating, random, block, and gradient copolymers) within a unified approach, are missing. To address this challenge, we formulate four different ML models for investigation, including a feedforward neural network (FFNN) model, a convolutional neural network (CNN) model, a recurrent neural network (RNN) model, and a combined FFNN/RNN (Fusion) model. We use various copolymer types to systematically validate the performance and generalizability of different models. We find that the RNN architecture that processes the monomer sequence information both forward and backward is a more suitable ML model for copolymers with better generalizability. As a supplement to polymer informatics, our proposed approach provides an efficient way for the evaluation of copolymers.
- Establish structure-property relationships of copolymers with machine learning (ML)
- Incorporate both chemical composition and sequential distribution of copolymers
- Analyze various copolymer types with different models in a unified approach
- Differentiate the effects of random, block, and gradient patterns of copolymers
Affiliation(s)
- Lei Tao
  - Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Vikas Varshney
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, Ohio 45433, USA
- Ying Li
  - Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
  - Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA
  - Corresponding author
|
23
|
Ma R, Zhang H, Luo T. Exploring High Thermal Conductivity Amorphous Polymers Using Reinforcement Learning. ACS Applied Materials & Interfaces 2022; 14:15587-15598. [PMID: 35344333 DOI: 10.1021/acsami.1c23610] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 06/14/2023]
Abstract
Developing amorphous polymers with desirable thermal conductivity has significant implications, as they are ubiquitous in applications where thermal transport is critical. Conventional Edisonian approaches are slow and without guarantee of success in material development. In this work, using a reinforcement learning scheme, we design polymers with thermal conductivity above 0.400 W/m·K. We leverage a machine learning model trained against 469 thermal conductivity data calculated from high-throughput molecular dynamics (MD) simulations as the surrogate for thermal conductivity prediction, and we use a recurrent neural network trained with around one million virtual polymer structures as a polymer generator. For all generated polymers with thermal conductivity ≥0.400 W/m·K, we have evaluated their synthesizability by calculating the synthetic accessibility score and validated the thermal conductivity of selected polymers using MD simulations. The best thermally conductive polymer designed has an MD-calculated thermal conductivity of 0.693 W/m·K, which is also estimated to be easily synthesizable. Our demonstrated inverse design scheme based on reinforcement learning may advance polymer development with target properties, and the scheme can also be generalized to other material development tasks for different applications.
Affiliation(s)
- Ruimin Ma
  - Department of Aerospace and Mechanical Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Hanfeng Zhang
  - Department of Aerospace and Mechanical Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Tengfei Luo
  - Department of Aerospace and Mechanical Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
  - Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
|
24
|
Xu P, Chen H, Li M, Lu W. New Opportunity: Machine Learning for Polymer Materials Design and Discovery. Advanced Theory and Simulations 2022. [DOI: 10.1002/adts.202100565] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/13/2022]
Affiliation(s)
- Pengcheng Xu
  - Materials Genome Institute, Shanghai University, Shanghai 200444, China
- Huimin Chen
  - Department of Mathematics, College of Sciences, Shanghai University, Shanghai 200444, China
- Minjie Li
  - Department of Chemistry, College of Sciences, Shanghai University, Shanghai 200444, China
- Wencong Lu
  - Materials Genome Institute, Shanghai University, Shanghai 200444, China
  - Department of Chemistry, College of Sciences, Shanghai University, Shanghai 200444, China
|
25
|
Nguyen D, Tao L, Li Y. Integration of Machine Learning and Coarse-Grained Molecular Simulations for Polymer Materials: Physical Understandings and Molecular Design. Front Chem 2022; 9:820417. [PMID: 35141207 PMCID: PMC8819075 DOI: 10.3389/fchem.2021.820417] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/24/2021] [Accepted: 12/31/2021] [Indexed: 12/21/2022] Open
Abstract
In recent years, the synthesis of monomer sequence-defined polymers has expanded into broad-spectrum applications in biomedical, chemical, and materials science fields. Pursuing the characterization and inverse design of these polymer systems requires our fundamental understanding not only at the individual monomer level, but also considering the chain scales, such as polymer configuration, self-assembly, and phase separation. However, our accessibility to this field is still rudimentary due to the limitations of traditional design approaches, the complexity of chemical space along with the burdened cost and time issues that prevent us from unveiling the underlying monomer sequence-structure-property relationships. Fortunately, thanks to the recent advancements in molecular dynamics simulations and machine learning (ML) algorithms, the bottlenecks in the tasks of establishing the structure-function correlation of the polymer chains can be overcome. In this review, we will discuss the applications of the integration between ML techniques and coarse-grained molecular dynamics (CGMD) simulations to solve the current issues in polymer science at the chain level. In particular, we focus on the case studies in three important topics—polymeric configuration characterization, feed-forward property prediction, and inverse design—in which CGMD simulations are leveraged to generate training datasets to develop ML-based surrogate models for specific polymer systems and designs. By doing so, this computational hybridization allows us to well establish the monomer sequence-functional behavior relationship of the polymers as well as guide us toward the best polymer chain candidates for the inverse design in undiscovered chemical space with reasonable computational cost and time. 
Even though there are still limitations and challenges ahead in this field, we finally conclude that this CGMD/ML integration is very promising, not only in the attempt of bridging the monomeric and macroscopic characterizations of polymer materials, but also enabling further tailored designs for sequence-specific polymers with superior properties in many practical applications.
Affiliation(s)
- Danh Nguyen
  - Department of Mechanical Engineering, University of Connecticut, Mansfield, CT, United States
- Lei Tao
  - Department of Mechanical Engineering, University of Connecticut, Mansfield, CT, United States
- Ying Li
  - Department of Mechanical Engineering, University of Connecticut, Mansfield, CT, United States
  - Polymer Program, Institute of Materials Science, University of Connecticut, Mansfield, CT, United States
  - *Correspondence: Ying Li,
|
26
|
Bejagam KK, Lalonde J, Iverson CN, Marrone BL, Pilania G. Machine Learning for Melting Temperature Predictions and Design in Polyhydroxyalkanoate-Based Biopolymers. J Phys Chem B 2022; 126:934-945. [DOI: 10.1021/acs.jpcb.1c08354] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Affiliation(s)
- Karteek K. Bejagam
  - Materials Science and Technology Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Jessica Lalonde
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
  - Center for Biomolecular and Tissue Engineering, Duke University, Durham, North Carolina 27708, United States
- Carl N. Iverson
  - Chemistry Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Babetta L. Marrone
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Ghanshyam Pilania
  - Materials Science and Technology Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
|
27
|
Park J, Shim Y, Lee F, Rammohan A, Goyal S, Shim M, Jeong C, Kim DS. Prediction and Interpretation of Polymer Properties Using the Graph Convolutional Network. ACS Polymers Au 2022; 2:213-222. [PMID: 36855563 PMCID: PMC9954297 DOI: 10.1021/acspolymersau.1c00050] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 12/15/2022]
Abstract
We present machine learning models for the prediction of thermal and mechanical properties of polymers based on the graph convolutional network (GCN). GCN-based models provide reliable prediction performances for the glass transition temperature (Tg), melting temperature (Tm), density (ρ), and elastic modulus (E) with substantial dependence on the dataset, which is the best for Tg (R² ∼ 0.9) and worst for E (R² ∼ 0.5). It is found that the GCN representations for polymers provide prediction performances of their properties comparable to the popular extended-connectivity circular fingerprint (ECFP) representation. Notably, the GCN combined with the neural network regression (GCN-NN) slightly outperforms the ECFP. It is investigated how the GCN captures important structural features of polymers to learn their properties. Using the dimensionality reduction, we demonstrate that the polymers are organized in the principal subspace of the GCN representation spaces with respect to the backbone rigidity. The organization in the representation space adaptively changes with the training and through the NN layers, which might facilitate a subsequent prediction of target properties based on the relationships between the structure and the property. The GCN models are found to provide an advantage to automatically extract a backbone rigidity, strongly correlated with Tg, as well as a potential transferability to predict other properties associated with a backbone rigidity. Our results indicate both the capability and limitations of the GCN in learning to describe polymer systems depending on the property.
Affiliation(s)
- Jaehong Park
- Innovation Center, Samsung Electronics Co., Ltd., 1 Samsungjeonja-ro, Hwaseong-si, Gyeonggi-do 18448, Korea
- Youngseon Shim
- Innovation Center, Samsung Electronics Co., Ltd., 1 Samsungjeonja-ro, Hwaseong-si, Gyeonggi-do 18448, Korea
- Franklin Lee
- Science and Technology Division, Corning Incorporated, Corning, New York 14831, United States
- Aravind Rammohan
- Science and Technology Division, Corning Incorporated, Corning, New York 14831, United States
- Sushmit Goyal
- Science and Technology Division, Corning Incorporated, Corning, New York 14831, United States
- Munbo Shim
- Innovation Center, Samsung Electronics Co., Ltd., 1 Samsungjeonja-ro, Hwaseong-si, Gyeonggi-do 18448, Korea
- Changwook Jeong
- Innovation Center, Samsung Electronics Co., Ltd., 1 Samsungjeonja-ro, Hwaseong-si, Gyeonggi-do 18448, Korea
- Dae Sin Kim
- Innovation Center, Samsung Electronics Co., Ltd., 1 Samsungjeonja-ro, Hwaseong-si, Gyeonggi-do 18448, Korea
28
Aldeghi M, Coley CW. A graph representation of molecular ensembles for polymer property prediction. Chem Sci 2022; 13:10486-10498. [PMID: 36277616 PMCID: PMC9473492 DOI: 10.1039/d2sc02839e] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 05/20/2022] [Accepted: 08/15/2022] [Indexed: 12/02/2022] Open
Abstract
Synthetic polymers are versatile and widely used materials. Similar to small organic molecules, a large chemical space of such materials is hypothetically accessible. Computational property prediction and virtual screening can accelerate polymer design by prioritizing candidates expected to have favorable properties. However, in contrast to organic molecules, polymers are often not well-defined single structures but an ensemble of similar molecules, which poses unique challenges to traditional chemical representations and machine learning approaches. Here, we introduce a graph representation of molecular ensembles and an associated graph neural network architecture that is tailored to polymer property prediction. We demonstrate that this approach captures critical features of polymeric materials, like chain architecture, monomer stoichiometry, and degree of polymerization, and achieves superior accuracy to off-the-shelf cheminformatics methodologies. While doing so, we built a dataset of simulated electron affinity and ionization potential values for >40k polymers with varying monomer composition, stoichiometry, and chain architecture, which may be used in the development of other tailored machine learning approaches. The dataset and machine learning models presented in this work pave the path toward new classes of algorithms for polymer informatics and, more broadly, introduce a framework for the modeling of molecular ensembles. A graph representation that captures critical features of polymeric materials and an associated graph neural network achieve superior accuracy to off-the-shelf cheminformatics methodologies.
Affiliation(s)
- Matteo Aldeghi
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Connor W. Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
29
Tao L, Varshney V, Li Y. Benchmarking Machine Learning Models for Polymer Informatics: An Example of Glass Transition Temperature. J Chem Inf Model 2021; 61:5395-5413. [PMID: 34662106 DOI: 10.1021/acs.jcim.1c01031] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Indexed: 11/27/2022]
Abstract
In the field of polymer informatics, utilizing machine learning (ML) techniques to evaluate the glass transition temperature Tg and other properties of polymers has attracted extensive attention. This data-centric approach is much more efficient and practical than laborious experimental measurements when faced with a daunting number of polymer structures. Various ML models have been demonstrated to perform well for Tg prediction. Nevertheless, they are trained on different data sets, using different structure representations, and based on different feature engineering methods. Thus, a critical question arises: how to select an ML model that handles Tg prediction well and generalizes. To provide a fair comparison of different ML techniques and examine the key factors that affect the model performance, we carry out a systematic benchmark study by compiling 79 different ML models and training them on a large and diverse data set. The three major components in setting up an ML model are structure representations, feature representations, and ML algorithms. In terms of polymer structure representation, we consider the polymer monomer, repeat unit, and oligomer with longer chain structure. Based on that, the feature representation is calculated, including Morgan fingerprinting with or without substructure frequency, RDKit descriptors, molecular embedding, molecular graph, etc. Afterward, different ML algorithms are trained on the obtained feature input, such as deep neural networks, convolutional neural networks, random forest, support vector machine, LASSO regression, and Gaussian process regression. We evaluate the performance of these ML models using a holdout test set and an extra unlabeled data set from high-throughput molecular dynamics simulation. Particular attention is paid to the models' generalization ability on the unlabeled data set, and the models' sensitivity to topology and the molecular weight of polymers is also taken into consideration. This benchmark study provides not only a guideline for the Tg prediction task but also a useful reference for other polymer informatics tasks.
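The three-component setup described above (structure representation, feature representation, ML algorithm) can be pictured as a configuration grid. The sketch below simply enumerates every combination of the options named in the abstract. It is only an illustration: the function name `enumerate_configs` and the option labels are hypothetical, and the full product shown here is not the authors' actual list of 79 models, which the abstract does not spell out.

```python
from itertools import product

# Option labels taken from the abstract; the exact naming is an assumption.
structures = ["monomer", "repeat unit", "oligomer"]
features = [
    "Morgan fingerprint",
    "Morgan fingerprint with substructure frequency",
    "RDKit descriptors",
    "molecular embedding",
    "molecular graph",
]
algorithms = [
    "deep neural network",
    "convolutional neural network",
    "random forest",
    "support vector machine",
    "LASSO regression",
    "Gaussian process regression",
]


def enumerate_configs(structures, features, algorithms):
    """Return one dict per (structure, feature, algorithm) combination."""
    return [
        {"structure": s, "feature": f, "algorithm": a}
        for s, f, a in product(structures, features, algorithms)
    ]


configs = enumerate_configs(structures, features, algorithms)
print(len(configs), "candidate configurations")
```

In practice some combinations would likely be dropped because the feature type does not suit the algorithm, so a real benchmark list is typically a filtered subset of such a grid.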
Affiliation(s)
- Lei Tao
- Department of Mechanical Engineering, University of Connecticut, Storrs, Connecticut 06269, United States
- Vikas Varshney
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright Patterson Air Force Base, Ohio 45433, United States
- Ying Li
- Department of Mechanical Engineering, University of Connecticut, Storrs, Connecticut 06269, United States
- Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, Connecticut 06269, United States
30
Liu Q, Gao Y, Xu B. Transferable, Deep-Learning-Driven Fast Prediction and Design of Thermal Transport in Mechanically Stretched Graphene Flakes. ACS Nano 2021; 15:16597-16606. [PMID: 34648261 DOI: 10.1021/acsnano.1c06340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Piling graphene sheets into a bulk form is essential for large-scale applications of graphene in flexible structures and devices, yet the arbitrary shapes, random distributions, and adjacent overlaps of graphene sheets still challenge the prediction of its fundamental properties, which are strongly coupled to mechanical strength and thermal or electronic transport. Here, we present a deep neural network (DNN)-based machine learning (ML) approach that enables the prediction of thermal conductivity of piled graphene structures with a broad range of geometric configurations and dimensions in response to external mechanical loading. A physics-informed pixel value matrix is developed to capture the key geometric features of piled graphene structures and is incorporated into the DNN to train the ML model, which reaches a prediction accuracy of 94% with a training data ratio of only 12.5%. The ML model is further extended with the transferred knowledge from primitive training data sets to predict the thermal transport of piled graphene in a custom data set. Extensive demonstrations in search of piled graphene structures with desirable thermal conductivity and its response to mechanical loading are presented and illustrate the capability and accuracy of the DNN-ML model for establishing a mechanically adaptive structure: responsive thermal property paradigm in piled graphene. This work lays a foundation for quantitatively evaluating thermal conductivity of piled graphene in response to mechanical loadings through an ML model and also offers a rational route for exploring mechanically tunable thermal properties of nanomaterial-based bulk forms, potentially useful in the design of flexible thermal structures and devices with controllable thermal management performance.
Affiliation(s)
- Qingchang Liu
- Department of Mechanical and Aerospace Engineering, University of Virginia, Charlottesville, Virginia 22904, United States
- Yuan Gao
- Department of Mechanical and Aerospace Engineering, University of Virginia, Charlottesville, Virginia 22904, United States
- Baoxing Xu
- Department of Mechanical and Aerospace Engineering, University of Virginia, Charlottesville, Virginia 22904, United States
31
Predicting Polymers' Glass Transition Temperature by a Chemical Language Processing Model. Polymers (Basel) 2021; 13:polym13111898. [PMID: 34200505 PMCID: PMC8201381 DOI: 10.3390/polym13111898] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 02/22/2021] [Revised: 05/03/2021] [Accepted: 06/04/2021] [Indexed: 12/14/2022] Open
Abstract
We propose a chemical language processing model to predict polymers' glass transition temperature (Tg) through a polymer language (SMILES, Simplified Molecular Input Line Entry System) embedding and recurrent neural network. This model only receives the SMILES strings of a polymer's repeat units as inputs and considers the SMILES strings as sequential data at the character level. With this method, there is no need to calculate any additional molecular descriptors or fingerprints of polymers, which makes it very computationally efficient. More importantly, it avoids the difficulty of generating molecular descriptors for repeat units containing the polymerization point '*'. Results show that the trained model demonstrates reasonable prediction performance on the Tg of unseen polymers. In addition, the model is applied to high-throughput screening of an unlabeled polymer database to identify high-temperature polymers that are desired for applications in extreme environments. Our work demonstrates that the SMILES strings of polymer repeat units can be used as an effective feature representation to develop a chemical language processing model for predictions of polymer Tg. The framework of this model is general and can be used to construct structure-property relationships for other polymer properties.
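To make the character-level treatment of SMILES concrete, the sketch below builds a character vocabulary from a few repeat-unit strings and encodes each string as a fixed-length list of integer ids, the kind of input an embedding layer followed by a recurrent network would consume. This is a minimal sketch under stated assumptions: the function names `build_vocab` and `encode`, the padding id 0, and the example repeat units (polyethylene, polypropylene, polystyrene written with '*' end points) are illustrative choices, not details taken from the paper.

```python
def build_vocab(smiles_list):
    """Map every character seen in the data to an integer id (0 = padding)."""
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(smiles, vocab, max_len):
    """Turn a SMILES string into a fixed-length list of character ids."""
    # A character missing from the vocabulary raises KeyError in this sketch.
    ids = [vocab[ch] for ch in smiles][:max_len]
    # Right-pad with 0 so every sequence has the same length.
    return ids + [0] * (max_len - len(ids))


# Hypothetical repeat units; '*' marks the polymerization points.
repeat_units = ["*CC*", "*CC(C)*", "*CC(c1ccccc1)*"]

vocab = build_vocab(repeat_units)
encoded = [encode(s, vocab, max_len=16) for s in repeat_units]
```

Because the split is strictly per character, a two-letter element symbol such as Cl becomes two tokens; the '*' end points need no special handling, since they are treated like any other character.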