1
Methorst J, van Hilten N, Hoti A, Stroh KS, Risselada HJ. When Data Are Lacking: Physics-Based Inverse Design of Biopolymers Interacting with Complex, Fluid Phases. J Chem Theory Comput 2024; 20:1763-1776. [PMID: 38413010 PMCID: PMC10938504 DOI: 10.1021/acs.jctc.3c00874] [Received: 08/09/2023] [Revised: 01/03/2024] [Accepted: 01/03/2024] [Indexed: 02/29/2024]
Abstract
Biomolecular research traditionally revolves around comprehending the mechanisms through which peptides or proteins facilitate specific functions, often driven by their relevance to clinical ailments. This conventional approach assumes that unraveling mechanisms is a prerequisite for wielding control over functionality, which stands as the ultimate research goal. However, an alternative perspective emerges from physics-based inverse design, shifting the focus from mechanisms to the direct acquisition of functional control strategies. By embracing this methodology, we can uncover solutions that might not have direct parallels in natural systems, yet yield crucial insights into the isolated molecular elements dictating functionality. This provides a distinctive comprehension of the underlying mechanisms. In this context, we elucidate how physics-based inverse design, facilitated by evolutionary algorithms and coarse-grained molecular simulations, charts a promising course for innovating the reverse engineering of biopolymers interacting with intricate fluid phases such as lipid membranes and liquid protein phases. We introduce evolutionary molecular dynamics (Evo-MD) simulations, an approach that merges evolutionary algorithms with the Martini coarse-grained force field. This method directs the evolutionary process from random amino acid sequences toward peptides interacting with complex fluid phases such as biological lipid membranes, offering significant promises in the development of peptide-based sensors and drugs. This approach can be tailored to recognize or selectively target specific attributes such as membrane curvature, lipid composition, membrane phase (e.g., lipid rafts), and protein fluid phases.
Although the resulting optimal solutions may not perfectly align with biological norms, physics-based inverse design excels at isolating relevant physicochemical principles and thermodynamic driving forces governing optimal biopolymer interaction within complex fluidic environments. In addition, we expound upon how physics-based evolution using the Evo-MD approach can be harnessed to extract the evolutionary optimization fingerprints of protein-lipid interactions from native proteins. Finally, we outline how such an approach is uniquely able to generate strategic training data for predictive neural network models that cover the whole relevant physicochemical domain. Exploring challenges, we address key considerations such as choosing a fitting fitness function to delineate the desired functionality. Additionally, we scrutinize assumptions tied to system setup, the targeted protein structure, and limitations posed by the utilized (coarse-grained) force fields and explore potential strategies for guiding evolution with limited experimental data. This discourse encapsulates the potential and remaining obstacles of physics-based inverse design, paving the way for an exciting frontier in biomolecular research.
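The evolutionary loop described in this abstract (random starting sequences, selection by a fitness function, recombination and mutation) can be sketched in a few lines. This is only an illustrative sketch, not the authors' Evo-MD implementation: `toy_fitness` is a hypothetical stand-in for the score that Evo-MD would obtain from a Martini coarse-grained simulation, and the population size, mutation rate, and elitist selection scheme are assumptions made for the example.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AILMFVW")


def toy_fitness(seq):
    """Hypothetical fitness: fraction of hydrophobic residues.

    In Evo-MD this score would come from a coarse-grained MD simulation.
    """
    return sum(r in HYDROPHOBIC for r in seq) / len(seq)


def mutate(seq, rate, rng):
    """Replace each residue with a random amino acid with probability `rate`."""
    return "".join(rng.choice(AMINO_ACIDS) if rng.random() < rate else r for r in seq)


def crossover(a, b, rng):
    """Single-point crossover of two equal-length sequences."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(length=12, pop_size=30, generations=40, n_parents=10, rate=0.1, seed=0):
    """Evolve random peptide sequences toward higher toy fitness."""
    rng = random.Random(seed)
    population = ["".join(rng.choice(AMINO_ACIDS) for _ in range(length))
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        ranked = sorted(population, key=toy_fitness, reverse=True)
        best_per_generation.append(toy_fitness(ranked[0]))
        parents = ranked[:n_parents]
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
                    for _ in range(pop_size - n_parents)]
        # Elitism (an assumption of this sketch): parents survive unchanged.
        population = parents + children
    best = max(population, key=toy_fitness)
    return best, best_per_generation
```

Because the parents are carried over unchanged, the best fitness in this sketch can never decrease from one generation to the next; swapping `toy_fitness` for a simulation-derived score is where the real computational cost would lie.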
Affiliation(s)
- Jeroen Methorst
  - Leiden Institute of Chemistry, Leiden University, 2333 CC Leiden, The Netherlands
  - Department of Physics, Technische Universität Dortmund, 44227 Dortmund, Germany
- Niek van Hilten
  - Leiden Institute of Chemistry, Leiden University, 2333 CC Leiden, The Netherlands
- Art Hoti
  - Leiden Institute of Chemistry, Leiden University, 2333 CC Leiden, The Netherlands
- Kai Steffen Stroh
  - Department of Physics, Technische Universität Dortmund, 44227 Dortmund, Germany
- Herre Jelger Risselada
  - Leiden Institute of Chemistry, Leiden University, 2333 CC Leiden, The Netherlands
  - Department of Physics, Technische Universität Dortmund, 44227 Dortmund, Germany
2
Himanshu, Chakraborty K, Patra TK. Developing efficient deep learning model for predicting copolymer properties. Phys Chem Chem Phys 2023; 25:25166-25176. [PMID: 37712405 DOI: 10.1039/d3cp03100d] [Indexed: 09/16/2023]
Abstract
Deep learning models are gaining popularity and potency in predicting polymer properties. These models can be built using pre-existing data and are useful for the rapid prediction of polymer properties. However, the performance of a deep learning model is intricately connected to its topology and the volume of training data. There is no facile protocol available to select a deep learning architecture, and there is a lack of a large volume of homogeneous sequence-property data of polymers. These two factors are the primary bottleneck for the efficient development of deep learning models for polymers. Here we assess the severity of these factors and propose strategies to address them. We show that a linear layer-by-layer expansion of a neural network can help in identifying the best neural network topology for a given problem. Moreover, we map the discrete sequence space of a polymer to a continuous one-dimensional latent space using a feature extraction technique to identify minimal data points for training a deep learning model. We implement these approaches for two representative cases of building sequence-property surrogate models, viz., the single-molecule radius of gyration of a copolymer and copolymer compatibilizer. This work demonstrates efficient methods for building deep learning models with minimal data and hyperparameters for predicting sequence-defined properties of polymers.
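One way to read the "linear layer-by-layer expansion" idea is as a simple search over network depth: add one hidden layer at a time and stop once the validation error no longer improves. The sketch below shows only that control flow and is not the authors' protocol. The stopping rule, the `tolerance` parameter, and `toy_error` (a hypothetical stand-in for training a network of a given depth and measuring its validation error) are all assumptions.

```python
def toy_error(depth):
    """Hypothetical validation error as a function of depth (minimum at depth 4).

    A real study would train a network with `depth` hidden layers and
    return its error on held-out data.
    """
    return 0.05 + 0.01 * (depth - 4) ** 2


def expand_layers(validation_error, max_layers=10, tolerance=1e-3):
    """Grow the network one layer at a time until the error stops improving."""
    best_depth, best_err = 1, validation_error(1)
    history = [(1, best_err)]
    for depth in range(2, max_layers + 1):
        err = validation_error(depth)
        history.append((depth, err))
        if best_err - err <= tolerance:
            break  # the extra layer did not help enough; keep the previous depth
        best_depth, best_err = depth, err
    return best_depth, history
```

With `toy_error`, the search evaluates depths 1 through 5 and settles on depth 4, so the number of trained models grows linearly with depth instead of requiring a full grid search over topologies.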
Affiliation(s)
- Himanshu
  - Department of Chemical Engineering and Center for Atomistic Modeling and Materials Design, Indian Institute of Technology Madras, Chennai, TN 600036, India
- Kaushik Chakraborty
  - Department of Chemical Engineering and Center for Atomistic Modeling and Materials Design, Indian Institute of Technology Madras, Chennai, TN 600036, India
- Tarak K Patra
  - Department of Chemical Engineering and Center for Atomistic Modeling and Materials Design, Indian Institute of Technology Madras, Chennai, TN 600036, India
3
Yang C, Raza S, Li X, Liu J. Thermal Transport in Poly(p-phenylene): Anomalous Dimensionality Dependence and Role of π-π Stacking. J Phys Chem B 2023. [PMID: 37478475 DOI: 10.1021/acs.jpcb.3c02947] [Indexed: 07/23/2023]
Abstract
For heat conduction along polymer chains, a decrease in the axial thermal conductivity often occurs when the polymer structure changes from one-dimensional (1D) to three-dimensional (3D). For example, a single extended aliphatic chain (e.g., polyethylene or poly(dimethylsiloxane)) usually has a higher axial thermal conductivity than its double-chain or crystal counterparts because coupling between chains induces strong interchain anharmonic scatterings. Intuitively, for chains with an aromatic backbone, the even stronger π-π stacking, once formed between chains, should enhance thermal transport across chains and suppress the thermal conductivity along the chains. However, we show that this trend is the opposite in poly(p-phenylene) (PPP), a typical chain with an aromatic backbone. Using molecular dynamics simulations, we found that the axial thermal conductivity of PPP chains shows an anomalous dimensionality dependence where the thermal conductivity of double-chain and 3D crystal structures is higher than that of a 1D single chain. We analyzed the probability distribution of dihedral angles and found that π-π stacking between phenyl rings restricts the free rotation of phenyl rings and forms a long-range order along the chain, thus enhancing thermal transport along the chain direction. Though possessing a stronger bonding strength and stabilizing the multiple-chain structure, π-π stacking does not lead to a higher interchain thermal conductance between phenyl rings compared with that between aliphatic chains. Our simulation results on the effects of π-π stacking provide insights to engineer thermal transport in polymers at the molecular level.
Affiliation(s)
- Cong Yang
  - Department of Mechanical and Aerospace Engineering, North Carolina State University, Raleigh, North Carolina 27695, United States
- Saqlain Raza
  - Department of Mechanical and Aerospace Engineering, North Carolina State University, Raleigh, North Carolina 27695, United States
- Xiaobo Li
  - School of Energy and Power Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
- Jun Liu
  - Department of Mechanical and Aerospace Engineering, North Carolina State University, Raleigh, North Carolina 27695, United States
4
Gavrilov AA, Potemkin II. Copolymers with Nonblocky Sequences as Novel Materials with Finely Tuned Properties. J Phys Chem B 2023; 127:1479-1489. [PMID: 36790352 DOI: 10.1021/acs.jpcb.2c07689] [Indexed: 02/16/2023]
Abstract
The copolymer sequence can be considered as a new tool to shape the resulting system properties on demand. This perspective is devoted to copolymers with "partially segregated" (or nonblocky) sequences. Such copolymers include gradient copolymers and copolymers with random sequences as well as copolymers with precisely controlled sequences. We overview recent developments in the synthesis of these systems as well as new findings regarding their properties, in particular, self-assembly in solutions and in melts. An emphasis is put on how the microscopic behavior of polymer chains is influenced by the chain sequences. In addition to that, a novel class of approaches allowing one to efficiently tackle the problem of copolymer chain sequence design, namely data-driven methods (artificial intelligence and machine learning), is discussed.
Affiliation(s)
- Alexey A Gavrilov
  - Physics Department, Lomonosov Moscow State University, Moscow 119991, Russian Federation
  - Semenov Federal Research Center for Chemical Physics, Moscow 119991, Russian Federation
- Igor I Potemkin
  - Physics Department, Lomonosov Moscow State University, Moscow 119991, Russian Federation
5
Ramesh PS, Patra TK. Polymer sequence design via molecular simulation-based active learning. Soft Matter 2023; 19:282-294. [PMID: 36519427 DOI: 10.1039/d2sm01193j] [Indexed: 06/17/2023]
Abstract
Molecular-scale interactions and chemical structures offer an enormous opportunity to tune material properties. However, designing materials from their molecular scale is a grand challenge owing to the practical limitations in exploring astronomically large design spaces using traditional experimental or computational methods. Advancements in data science and machine learning have produced a host of tools and techniques that can address this problem and facilitate the efficient exploration of large search spaces. In this work, a blended approach integrating physics-based methods, machine learning techniques and uncertainty quantification is implemented to effectively screen a macromolecular sequence space and design target structures. Here, we survey and assess the efficacy of data-driven methods within the framework of active learning for a challenging design problem, viz., sequence optimization of a copolymer. We report the impact of surrogate models, kernels, and initial conditions on the convergence of the active learning method for the sequence design problem. This work establishes optimal strategies and hyperparameters for efficient inverse design of polymer sequences via active learning.
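The active learning loop this abstract refers to (a surrogate model with an uncertainty estimate proposes the next sequence, a physics-based calculation labels it, and the surrogate is updated) can be sketched as follows. This is a minimal illustration and not the authors' workflow: `toy_property` is a hypothetical stand-in for a molecular simulation, the kernel-weighted surrogate with a distance-based uncertainty is a crude substitute for the surrogate models and kernels compared in the paper, and the acquisition rule and all parameter values are assumptions.

```python
import itertools
import math
import random


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def toy_property(seq):
    """Hypothetical target property: number of A-B junctions along the chain."""
    return sum(seq[i] != seq[i + 1] for i in range(len(seq) - 1))


def surrogate(seq, data, width=2.0):
    """Kernel-weighted mean of labelled values plus a crude uncertainty.

    The uncertainty is the scaled Hamming distance to the nearest labelled sequence.
    """
    weights = [math.exp(-hamming(seq, s) / width) for s, _ in data]
    mean = sum(w * y for w, (_, y) in zip(weights, data)) / sum(weights)
    uncertainty = min(hamming(seq, s) for s, _ in data) / len(seq)
    return mean, uncertainty


def active_learning(length=8, n_init=5, n_rounds=20, kappa=2.0, seed=0):
    """Iteratively label the candidate with the highest mean + kappa * uncertainty."""
    rng = random.Random(seed)
    pool = ["".join(p) for p in itertools.product("AB", repeat=length)]
    labelled = {s: toy_property(s) for s in rng.sample(pool, n_init)}
    for _ in range(n_rounds):
        data = list(labelled.items())
        candidates = [s for s in pool if s not in labelled]

        def acquisition(s):
            mean, unc = surrogate(s, data)
            return mean + kappa * unc

        pick = max(candidates, key=acquisition)
        labelled[pick] = toy_property(pick)  # stands in for running a simulation
    best = max(labelled, key=labelled.get)
    return best, labelled
```

The `kappa` parameter trades exploitation (high predicted value) against exploration (sequences far from anything labelled so far); only 25 of the 256 possible sequences are ever evaluated with the default settings.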
Affiliation(s)
- Praneeth S Ramesh
  - Department of Chemical Engineering, Center for Atomistic Modeling and Materials Design and Center for Carbon Capture Utilization and Storage, Indian Institute of Technology Madras, Chennai, TN 600036, India
- Tarak K Patra
  - Department of Chemical Engineering, Center for Atomistic Modeling and Materials Design and Center for Carbon Capture Utilization and Storage, Indian Institute of Technology Madras, Chennai, TN 600036, India
6
Zhou T, Qiu D, Wu Z, Alberti SAN, Bag S, Schneider J, Meyer J, Gámez JA, Gieler M, Reithmeier M, Seidel A, Müller-Plathe F. Compatibilization Efficiency of Graft Copolymers in Incompatible Polymer Blends: Dissipative Particle Dynamics Simulations Combined with Machine Learning. Macromolecules 2022. [DOI: 10.1021/acs.macromol.2c00821] [Indexed: 11/28/2022]
Affiliation(s)
- Tianhang Zhou
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Str. 8, 64287 Darmstadt, Germany
- Dejian Qiu
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Str. 8, 64287 Darmstadt, Germany
- Zhenghao Wu
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Str. 8, 64287 Darmstadt, Germany
- Simon A. N. Alberti
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Str. 8, 64287 Darmstadt, Germany
- Saientan Bag
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Str. 8, 64287 Darmstadt, Germany
- Jurek Schneider
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Str. 8, 64287 Darmstadt, Germany
- Jan Meyer
  - Covestro Deutschland AG, Kaiser-Wilhelm-Allee 60, 51373 Leverkusen, Germany
- José A. Gámez
  - Covestro Deutschland AG, Kaiser-Wilhelm-Allee 60, 51373 Leverkusen, Germany
- Mandy Gieler
  - Covestro Deutschland AG, Kaiser-Wilhelm-Allee 60, 51373 Leverkusen, Germany
- Marina Reithmeier
  - Covestro Deutschland AG, Kaiser-Wilhelm-Allee 60, 51373 Leverkusen, Germany
- Andreas Seidel
  - Covestro Deutschland AG, Kaiser-Wilhelm-Allee 60, 51373 Leverkusen, Germany
- Florian Müller-Plathe
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Str. 8, 64287 Darmstadt, Germany
7
Tao L, Byrnes J, Varshney V, Li Y. Machine learning strategies for the structure-property relationship of copolymers. iScience 2022; 25:104585. [PMID: 35789847 PMCID: PMC9249671 DOI: 10.1016/j.isci.2022.104585] [Received: 03/28/2022] [Revised: 05/26/2022] [Accepted: 06/07/2022] [Indexed: 11/15/2022]
Abstract
Establishing the structure-property relationship is extremely valuable for the molecular design of copolymers. However, machine learning (ML) models that can incorporate both chemical composition and sequence distribution of monomers, and that have the generalization ability to process various copolymer types (e.g., alternating, random, block, and gradient copolymers) with a unified approach, are missing. To address this challenge, we formulate four different ML models for investigation, including a feedforward neural network (FFNN) model, a convolutional neural network (CNN) model, a recurrent neural network (RNN) model, and a combined FFNN/RNN (Fusion) model. We use various copolymer types to systematically validate the performance and generalizability of different models. We find that the RNN architecture that processes the monomer sequence information both forward and backward is a more suitable ML model for copolymers with better generalizability. As a supplement to polymer informatics, our proposed approach provides an efficient way for the evaluation of copolymers.
- Establish structure-property relationships of copolymer with machine learning (ML)
- Incorporate both chemical composition and sequential distribution of copolymers
- Analyze various copolymer types with different models in a unified approach
- Differentiate the effects of random, block, and gradient patterns of copolymers
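A plausible intuition for why reading the monomer sequence both forward and backward helps is that a copolymer chain is the same molecule whichever end is read first. The sketch below illustrates this with a tiny bidirectional recurrent encoder in pure Python. It is not the architecture from the paper: the shared weights between the two directions, the summation of the final hidden states, the one-hot encoding over a two-monomer alphabet, and the helper `random_weights` are all assumptions made for the illustration.

```python
import math
import random


def random_weights(n_hidden, n_in, seed=0):
    """Hypothetical helper: small random input and recurrent weight matrices."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w_rec = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_hidden)]
    return w_in, w_rec


def one_hot(seq, alphabet="AB"):
    """Encode each monomer as a one-hot vector over the alphabet."""
    return [[1.0 if m == a else 0.0 for a in alphabet] for m in seq]


def rnn_final_state(inputs, w_in, w_rec):
    """Run a plain tanh RNN over the inputs and return the last hidden state."""
    hidden = [0.0] * len(w_rec)
    for x in inputs:
        hidden = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][k] * hidden[k] for k in range(len(hidden)))
            )
            for i in range(len(hidden))
        ]
    return hidden


def bidirectional_embedding(seq, w_in, w_rec):
    """Sum the final states of a forward pass and a backward pass (shared weights)."""
    x = one_hot(seq)
    forward = rnn_final_state(x, w_in, w_rec)
    backward = rnn_final_state(x[::-1], w_in, w_rec)
    return [f + b for f, b in zip(forward, backward)]
```

Under these assumptions, a sequence and its reverse (for example `"AABAB"` and `"BABAA"`) map to the same embedding, so a model built on top of it cannot assign different properties to the two readings of one chain.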
Affiliation(s)
- Lei Tao
  - Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Vikas Varshney
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, Ohio 45433, USA
- Ying Li
  - Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
  - Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA
  - Corresponding author
8
Kumar R. Materiomically Designed Polymeric Vehicles for Nucleic Acids: Quo Vadis? ACS Applied Bio Materials 2022; 5:2507-2535. [PMID: 35642794 DOI: 10.1021/acsabm.2c00346] [Indexed: 11/28/2022]
Abstract
Despite rapid advances in molecular biology, particularly in site-specific genome editing technologies, such as CRISPR/Cas9 and base editing, financial and logistical challenges hinder a broad population from accessing and benefiting from gene therapy. To improve the affordability and scalability of gene therapy, we need to deploy chemically defined, economical, and scalable materials, such as synthetic polymers. For polymers to deliver nucleic acids efficaciously to targeted cells, they must optimally combine design attributes, such as architecture, length, composition, spatial distribution of monomers, basicity, hydrophilic-hydrophobic phase balance, or protonation degree. Designing polymeric vectors for specific nucleic acid payloads is a multivariate optimization problem wherein even minuscule deviations from the optimum are poorly tolerated. To explore the multivariate polymer design space rapidly, efficiently, and fruitfully, we must integrate parallelized polymer synthesis, high-throughput biological screening, and statistical modeling. Although materiomics approaches promise to streamline polymeric vector development, several methodological ambiguities must be resolved. For instance, establishing a flexible polymer ontology that accommodates recent synthetic advances, enforcing uniform polymer characterization and data reporting standards, and implementing multiplexed in vitro and in vivo screening studies require considerable planning, coordination, and effort. This contribution will acquaint readers with the challenges associated with materiomics approaches to polymeric gene delivery and offers guidelines for overcoming these challenges. 
Here, we summarize recent developments in combinatorial polymer synthesis, high-throughput screening of polymeric vectors, omics-based approaches to polymer design, barcoding schemes for pooled in vitro and in vivo screening, and identify materiomics-inspired research directions that will realize the long-unfulfilled clinical potential of polymeric carriers in gene therapy.
Affiliation(s)
- Ramya Kumar
  - Department of Chemical & Biological Engineering, Colorado School of Mines, 1613 Illinois St, Golden, Colorado 80401, United States
9
Bale AA, Gautham SMB, Patra TK. Sequence‐defined Pareto frontier of a copolymer structure. Journal of Polymer Science 2022. [DOI: 10.1002/pol.20220088] [Indexed: 12/17/2022]
Affiliation(s)
- Ashwin A. Bale
  - Department of Chemical Engineering, Birla Institute of Technology and Science Pilani‐Hyderabad, Hyderabad, India
- Sachin M. B. Gautham
  - Department of Chemical Engineering, Center for Atomistic Modeling and Materials Design and Center for Carbon Capture Utilization and Storage, Indian Institute of Technology Madras, Chennai, India
- Tarak K. Patra
  - Department of Chemical Engineering, Center for Atomistic Modeling and Materials Design and Center for Carbon Capture Utilization and Storage, Indian Institute of Technology Madras, Chennai, India
10
Liu Y, Zhou Y, Xu Y. State-of-the-Art, Opportunities, and Challenges in Bottom-up Synthesis of Polymers with High Thermal Conductivity. Polym Chem 2022. [DOI: 10.1039/d2py00272h] [Indexed: 11/21/2022]
Abstract
In contrast to metals, polymers are predominantly thermal and electrical insulators. With their unparalleled advantages such as light weight, turning polymer insulators into heat conductors with metal-like thermal conductivity is...
11
Abstract
Optimal design of polymers is a challenging task due to their enormous chemical and configurational space. Recent advances in computations, machine learning, and increasing trends in data and software availability can potentially address this problem and accelerate the molecular-scale design of polymers. Here, the central problem of polymer design is reviewed, and the general ideas of data-driven methods and their working principles in the context of polymer design are discussed. This Review provides a historical perspective and a summary of current trends and outlines future scopes of data-driven methods for polymer research. A few representative case studies on the use of such data-driven methods for discovering new polymers with exceptional properties are presented. Moreover, attempts are made to highlight how data-driven strategies aid in establishing new correlations and advancing the fundamental understanding of polymers. This Review posits that the combination of machine learning, rapid computational characterization of polymers, and availability of large open-sourced homogeneous data will transform polymer research and development over the coming decades. It is hoped that this Review will serve as a useful reference to researchers who wish to develop and deploy data-driven methods for polymer research and education.