1
|
Back S, Aspuru-Guzik A, Ceriotti M, Gryn'ova G, Grzybowski B, Gu GH, Hein J, Hippalgaonkar K, Hormázabal R, Jung Y, Kim S, Kim WY, Moosavi SM, Noh J, Park C, Schrier J, Schwaller P, Tsuda K, Vegge T, von Lilienfeld OA, Walsh A. Accelerated chemical science with AI. DIGITAL DISCOVERY 2024; 3:23-33. [PMID: 38239898 PMCID: PMC10793638 DOI: 10.1039/d3dd00213f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 12/06/2023] [Indexed: 01/22/2024]
Abstract
In light of the pressing need for practical materials and molecular solutions to renewable energy and health problems, to name just two examples, one wonders how to accelerate research and development in the chemical sciences, so as to address the time it takes to bring materials from initial discovery to commercialization. Artificial intelligence (AI)-based techniques, in particular, are having a transformative and accelerating impact on many, if not most, technological domains. To shed light on these questions, the authors and participants gathered in person for the ASLLA Symposium on the theme of 'Accelerated Chemical Science with AI' at Gangneung, Republic of Korea. We present the findings, ideas, comments, and often contentious opinions expressed during four panel discussions related to the respective general topics: 'Data', 'New applications', 'Machine learning algorithms', and 'Education'. All discussions were recorded, transcribed into text using OpenAI's Whisper, and summarized using LG AI Research's EXAONE LLM, followed by revision by all authors. For the broader benefit of current researchers, educators in higher education, and academic bodies such as associations, publishers, librarians, and companies, we provide chemistry-specific recommendations and summarize the resulting conclusions.
|
2
|
Tučs A, Ito T, Kurumida Y, Kawada S, Nakazawa H, Saito Y, Umetsu M, Tsuda K. Extensive antibody search with whole spectrum black-box optimization. Sci Rep 2024; 14:552. [PMID: 38177656 PMCID: PMC10767033 DOI: 10.1038/s41598-023-51095-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 12/30/2023] [Indexed: 01/06/2024] Open
Abstract
In designing functional biological sequences with machine learning, the activity predictor tends to be inaccurate due to shortage of data. Top ranked sequences are thus unlikely to contain effective ones. This paper proposes to take prediction stability into account to provide domain experts with a reasonable list of sequences to choose from. In our approach, multiple prediction models are trained by subsampling the training set and the multi-objective optimization problem, where one objective is the average activity and the other is the standard deviation, is solved. The Pareto front represents a list of sequences with the whole spectrum of activity and stability. Using this method, we designed VHH (Variable domain of Heavy chain of Heavy chain) antibodies based on the dataset obtained from deep mutational screening. To solve multi-objective optimization, we employed our sequence design software MOQA that uses quantum annealing. By applying several selection criteria to 19,778 designed sequences, five sequences were selected for wet-lab validation. One sequence, 16 mutations away from the closest training sequence, was successfully expressed and found to possess desired binding specificity. Our whole spectrum approach provides a balanced way of dealing with the prediction uncertainty, and can possibly be applied to extensive search of functional sequences.
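The mean/standard-deviation trade-off described in this abstract can be illustrated with a minimal sketch. It is not the MOQA software and does not use quantum annealing; it assumes that higher average activity and lower standard deviation are both preferred, and the sequence names and prediction values are hypothetical.

```python
from statistics import mean, pstdev

def pareto_front(points):
    """Return indices of points that no other point dominates.

    Each point is (mean_activity, std). Assumption: higher mean and
    lower std are both better.
    """
    front = []
    for i, (m_i, s_i) in enumerate(points):
        dominated = any(
            (m_j >= m_i and s_j <= s_i) and (m_j > m_i or s_j < s_i)
            for j, (m_j, s_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front

# Hypothetical predicted activities: one list per candidate sequence,
# one value per model trained on a different subsample of the data.
predictions = {
    "seq_a": [0.9, 0.1, 0.5],
    "seq_b": [0.6, 0.6, 0.6],
    "seq_c": [0.5, 0.4, 0.3],
    "seq_d": [0.7, 0.5, 0.9],
    "seq_e": [1.0, 0.2, 1.0],
}

names = list(predictions)
objectives = [(mean(p), pstdev(p)) for p in predictions.values()]
front_names = [names[i] for i in pareto_front(objectives)]
print(front_names)
```

In this toy data the front runs from a stable, moderately active candidate to a highly active but unstable one, which is the kind of list a domain expert could then choose from.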
|
3
|
Yoshida T, Hanada H, Nakagawa K, Taji K, Tsuda K, Takeuchi I. Efficient model selection for predictive pattern mining model by safe pattern pruning. PATTERNS (NEW YORK, N.Y.) 2023; 4:100890. [PMID: 38106611 PMCID: PMC10724371 DOI: 10.1016/j.patter.2023.100890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 11/02/2023] [Accepted: 11/09/2023] [Indexed: 12/19/2023]
Abstract
Predictive pattern mining is an approach used to construct prediction models when the input is represented by structured data, such as sets, graphs, and sequences. The main idea behind predictive pattern mining is to build a prediction model by considering sub-structures, such as subsets, subgraphs, and subsequences (referred to as patterns), present in the structured data as features of the model. The primary challenge in predictive pattern mining lies in the exponential growth of the number of patterns with the complexity of the structured data. In this study, we propose the safe pattern pruning method to address the explosion of pattern numbers in predictive pattern mining. We also discuss how it can be effectively employed throughout the entire model building process in practical data analysis. To demonstrate the effectiveness of the proposed method, we conduct numerical experiments on regression and classification problems involving sets, graphs, and sequences.
|
4
|
Yuan W, Hibi Y, Tamura R, Sumita M, Nakamura Y, Naito M, Tsuda K. Revealing factors influencing polymer degradation with rank-based machine learning. PATTERNS (NEW YORK, N.Y.) 2023; 4:100846. [PMID: 38106610 PMCID: PMC10724228 DOI: 10.1016/j.patter.2023.100846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Revised: 07/07/2023] [Accepted: 08/30/2023] [Indexed: 12/19/2023]
Abstract
The efficient treatment of polymer waste is a major challenge for marine sustainability. It is useful to reveal the factors that dominate the degradability of polymer materials for developing polymer materials in the future. The small number of available datasets on degradability and the diversity of their experimental means and conditions hinder large-scale analysis. In this study, we have developed a platform for evaluating the degradability of polymers that is suitable for such data, using a rank-based machine learning technique based on RankSVM. We then made a ranking model to evaluate the degradability of polymers, integrating three datasets on the degradability of polymers that are measured by different means and conditions. Analysis of this ranking model with a decision tree revealed factors that dominate the degradability of polymers.
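The idea of ranking across datasets measured under different conditions can be sketched as follows. This is a rough illustration of the RankSVM-style pairwise formulation, not the authors' platform: pairs are formed only within a dataset, so incompatible measurement scales are never compared directly. The feature vectors, degradability values, and hyperparameters are hypothetical, and a plain subgradient loop stands in for a proper SVM solver.

```python
def within_dataset_pairs(datasets):
    """Build (x_better, x_worse) pairs, comparing samples only inside one dataset."""
    pairs = []
    for samples in datasets:
        for xi, yi in samples:
            for xj, yj in samples:
                if yi > yj:
                    pairs.append((xi, xj))
    return pairs

def train_linear_ranker(pairs, dim, epochs=100, lr=0.1, reg=0.01):
    """Subgradient descent on the pairwise hinge loss max(0, 1 - w.(xi - xj))."""
    w = [0.0] * dim
    for _ in range(epochs):
        for xi, xj in pairs:
            diff = [a - b for a, b in zip(xi, xj)]
            margin = sum(wk * dk for wk, dk in zip(w, diff))
            # L2 shrinkage, then a hinge step if the pair's margin is below 1.
            w = [wk * (1 - lr * reg) for wk in w]
            if margin < 1:
                w = [wk + lr * dk for wk, dk in zip(w, diff)]
    return w

def score(w, x):
    return sum(wk * xk for wk, xk in zip(w, x))

# Two hypothetical datasets with incompatible degradability scales.
dataset_a = [([1.0, 0.0], 10.0), ([2.0, 1.0], 20.0), ([3.0, 0.0], 30.0)]
dataset_b = [([0.5, 1.0], 0.1), ([2.5, 1.0], 0.9)]

pairs = within_dataset_pairs([dataset_a, dataset_b])
w = train_linear_ranker(pairs, dim=2)
print(w)
```

Because only the ordering inside each dataset is used, the learned score places polymers from both datasets on one common ranking scale.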
|
5
|
Terayama K, Osaki Y, Fujita T, Tamura R, Naito M, Tsuda K, Matsui T, Sumita M. Koopmans' Theorem-Compliant Long-Range Corrected (KTLC) Density Functional Mediated by Black-Box Optimization and Data-Driven Prediction for Organic Molecules. J Chem Theory Comput 2023; 19:6770-6781. [PMID: 37729470 DOI: 10.1021/acs.jctc.3c00764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2023]
Abstract
Density functional theory (DFT) is a significant computational tool that has substantially influenced chemistry, physics, and materials science. DFT necessitates parametrized approximation for determining an expected value. Hence, to predict the properties of a given molecule using DFT, appropriate parameters of the functional should be set for each molecule. Herein, we optimize the parameters of range-separated functionals (LC-BLYP and CAM-B3LYP) via Bayesian optimization (BO) to satisfy Koopmans' theorem. Our results demonstrate the effectiveness of the BO in optimizing functional parameters. Particularly, Koopmans' theorem-compliant LC-BLYP (KTLC-BLYP) shows results comparable to the experimental UV-absorption values. Furthermore, we prepared an optimized parameter dataset of KTLC-BLYP for over 3000 molecules through BO for satisfying Koopmans' theorem. We have developed a machine learning model on this dataset to predict the parameters of the LC-BLYP functional for a given molecule. The prediction model automatically predicts the appropriate parameters for a given molecule and calculates the corresponding values. The approach in this paper would be useful to develop new functionals and to update the previously developed functionals.
|
6
|
Zhang H, Nguyen DH, Tsuda K. Differentiable optimization layers enhance GNN-based mitosis detection. Sci Rep 2023; 13:14306. [PMID: 37653108 PMCID: PMC10471751 DOI: 10.1038/s41598-023-41562-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 08/28/2023] [Indexed: 09/02/2023] Open
Abstract
Automatic mitosis detection from video is an essential step in analyzing proliferative behaviour of cells. In existing studies, a conventional object detector such as U-Net is combined with a link prediction algorithm to find correspondences between parent and daughter cells. However, these approaches do not take into account the biological constraint that a cell in a frame can correspond to up to two cells in the next frame. Our model, called GNN-DOL, enables mitosis detection by complementing a graph neural network (GNN) with a differentiable optimization layer (DOL) that implements the constraint. In time-lapse microscopy sequences cultured under four different conditions, we observed that the layer substantially improved detection performance in comparison with GNN-based link prediction. Our results illustrate the importance of incorporating biological knowledge explicitly into deep learning models.
|
7
|
Tučs A, Berenger F, Yumoto A, Tamura R, Uzawa T, Tsuda K. Quantum Annealing Designs Nonhemolytic Antimicrobial Peptides in a Discrete Latent Space. ACS Med Chem Lett 2023; 14:577-582. [PMID: 37197452 PMCID: PMC10184305 DOI: 10.1021/acsmedchemlett.2c00487] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/17/2022] [Accepted: 04/10/2023] [Indexed: 05/19/2023] Open
Abstract
Increasing the variety of antimicrobial peptides is crucial in meeting the global challenge of multi-drug-resistant bacterial pathogens. While several deep-learning-based peptide design pipelines are reported, they may not be optimal in data efficiency. High efficiency requires a well-compressed latent space, where optimization is likely to fail due to numerous local minima. We present a multi-objective peptide design pipeline based on a discrete latent space and D-Wave quantum annealer with the aim of solving the local minima problem. To achieve multi-objective optimization, multiple peptide properties are encoded into a score using non-dominated sorting. Our pipeline is applied to design therapeutic peptides that are antimicrobial and non-hemolytic at the same time. From 200 000 peptides designed by our pipeline, four peptides proceeded to wet-lab validation. Three of them showed high antimicrobial activity, and two are non-hemolytic. Our results demonstrate how quantum-based optimizers can be taken advantage of in real-world medical studies.
|
8
|
Berenger F, Tsuda K. 3D-Sensitive Encoding of Pharmacophore Features. J Chem Inf Model 2023; 63:2360-2369. [PMID: 37036083 DOI: 10.1021/acs.jcim.2c01623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2023]
Abstract
In the presence of structural data, one sometimes needs to compare 3D ligands. We design an overlay-free method to rank order 3D molecules in the pharmacophore feature space. The proposed encoding includes only two fittable parameters, is sparse, and is not too high dimensional. At the cost of an additional parameter, to delineate the binding site from a protein-ligand complex, the method can compare binding sites. The method was benchmarked on the LIT-PCBA data set for ligand-based virtual screening experiments and the sc-PDB and a Vertex data set when comparing binding sites. In similarity searches, the proposed method outperforms an open-source software doing optimal superposition of ligand-based pharmacophores and RDKit's 3D pharmacophore fingerprints. When comparing binding sites, the method is competitive with state-of-the-art approaches. On a single CPU core, up to 374,000 ligands/s or 132,000 binding sites/s can be rank ordered. The "AutoCorrelation of Pharmacophore Features" open-source software is released at https://github.com/tsudalab/ACP4.
|
9
|
Nakao A, Harabuchi Y, Maeda S, Tsuda K. Exploring the Quantum Chemical Energy Landscape with GNN-Guided Artificial Force. J Chem Theory Comput 2023; 19:713-717. [PMID: 36689311 PMCID: PMC9933424 DOI: 10.1021/acs.jctc.2c01061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2023]
Abstract
Artificial force has been proven useful to get over energy barriers and quickly search a large portion of the energy landscape. This work proposes a method based on graph neural networks to optimize the choice of transformation patterns to examine and accelerate energy landscape exploration. In open search from glutathione, the search efficiency was largely improved in comparison to random selection. We also applied transfer learning from glutathione to tuftsin, resulting in further efficiency gains.
|
10
|
Tucs A, Tsuda K, Sljoka A. Probing Conformational Dynamics of Antibodies with Geometric Simulations. Methods Mol Biol 2023; 2552:125-139. [PMID: 36346589 DOI: 10.1007/978-1-0716-2609-2_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
This chapter describes the application of constrained geometric simulations for prediction of antibody structural dynamics. We utilize the constrained geometric simulation method FRODAN, which is a low computational complexity alternative to molecular dynamics (MD) simulations that can rapidly explore flexible motions in protein structures. FRODAN is highly suited for conformational dynamics analysis of large proteins, complexes, intrinsically disordered proteins, and dynamics that occur on longer biologically relevant time scales that are normally inaccessible to classical MD simulations. This approach predicts protein dynamics at an all-atom scale while retaining realistic covalent bonding, maintaining dihedral angles in energetically good conformations, and avoiding steric clashes, in addition to performing other geometric and stereochemical criteria checks. In this chapter, we apply FRODAN to showcase its applicability for probing functionally relevant dynamics of IgG2a, including large-amplitude domain-domain motions and motions of complementarity determining region (CDR) loops. As was suggested in previous experimental studies, our simulations show that antibodies can explore a large range of conformational space.
|
11
|
Ito T, Nguyen TD, Saito Y, Kurumida Y, Nakazawa H, Kawada S, Nishi H, Tsuda K, Kameda T, Umetsu M. Selection of target-binding proteins from the information of weakly enriched phage display libraries by deep sequencing and machine learning. MAbs 2023; 15:2168470. [PMID: 36683172 PMCID: PMC9872955 DOI: 10.1080/19420862.2023.2168470] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/24/2023] Open
Abstract
Despite the advances in surface-display systems for directed evolution, variants with high affinity are not always enriched due to undesirable biases that increase target-unrelated variants during biopanning. Here, our goal was to design a library containing improved variants from the information of the "weakly enriched" library where functional variants were weakly enriched. Deep sequencing for the previous biopanning result, where no functional antibody mimetics were experimentally identified, revealed that weak enrichment was partly due to undesirable biases during phage infection and amplification steps. The clustering analysis of the deep sequencing data from appropriate steps revealed no distinct sequence patterns, but a Bayesian machine learning model trained with the selected deep sequencing data supplied nine clusters with distinct sequence patterns. Phage libraries were designed on the basis of the sequence patterns identified, and four improved variants with target-specific affinity (EC50 = 80-277 nM) were identified by biopanning. The selection and use of deep sequencing data without undesirable bias enabled us to extract the information on prospective variants. In summary, the use of appropriate deep sequencing data and machine learning with the sequence data has the possibility of finding sequence space where functional variants are enriched.
|
12
|
Kaiya Y, Tamura R, Tsuda K. Understanding Chemical Processes with Entropic Sampling. Org Process Res Dev 2022. [DOI: 10.1021/acs.oprd.2c00254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]
|
13
|
Sumita M, Terayama K, Tamura R, Tsuda K. QCforever: A Quantum Chemistry Wrapper for Everyone to Use in Black-Box Optimization. J Chem Inf Model 2022; 62:4427-4434. [PMID: 36074116 PMCID: PMC9518232 DOI: 10.1021/acs.jcim.2c00812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Indexed: 11/29/2022]
Abstract
To obtain observable physical or molecular properties such as ionization potential and fluorescent wavelength with quantum chemical (QC) computation, multi-step computation manipulated by a human is required. Hence, automating the multi-step computational process and making it a black box that can be handled by anybody are important for effective database construction and fast realistic material design through the framework of black-box optimization, where machine learning algorithms are introduced as a predictor. Here, we propose a Python library, QCforever, to automate the computation of some molecular properties and chemical phenomena induced by molecules. This tool requires only a molecule file to provide its observable properties, automating both the computation of molecular properties (ionization potential, fluorescence, etc.) and the output analysis that yields the multiple values used to evaluate a molecule. By incorporating the tool in black-box optimization, we can explore molecules that have the desired properties within the limitations of QC computation.
|
14
|
Okuda R, Osaki M, Saeki Y, Okano T, Tsuda K, Nakamura T, Morio Y, Nagashima H, Hagino H. Effect of coordinator-based osteoporosis intervention on quality of life in patients with fragility fractures: a prospective randomized trial. Osteoporos Int 2022; 33:1445-1455. [PMID: 35195752 DOI: 10.1007/s00198-021-06279-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2021] [Accepted: 12/17/2021] [Indexed: 10/19/2022]
Abstract
We examined the effects of the coordinator-based intervention on quality of life (QOL) in the aftermath of a fragility fracture, as well as factors predictive of post-fracture QOL. The coordinator-based interventions mitigated the decrease in QOL. Secondary fracture after primary fracture, however, was a significant predictor of lower QOL. PURPOSE: This study aimed to determine the effects of the coordinator-based intervention on QOL in the aftermath of a fragility fracture, as well as factors predictive of post-fracture QOL, in an Asian population. METHODS: Patients with new fractures in the intervention group received the coordinator-based intervention by a designated nurse certified as a coordinator, within 3 months of injury. QOL was evaluated using the Japanese version of the EuroQol 5 Dimension 5 Level (EQ-5D-5L) scale before the fracture (through patient recollections) and at 0.5, 1, and 2 years after the primary fracture. RESULTS: Data for 141 patients were analyzed: 70 in the liaison intervention (LI) group and 71 in the non-LI group. Significant intervention effects on QOL were observed at 6 months after the fracture; the QOL score was 0.079 points higher in the LI group than in the non-LI group (p=0.019). Further, the LI group reported significantly less pain/discomfort at 2 years after the fracture, compared to the non-LI group (p=0.037). In addition, secondary fractures were found to significantly prevent improvement and maintenance of QOL during the recovery period (p=0.015). CONCLUSION: Short-term intervention effects were observable 6 months after the primary fracture, with the decrease in QOL mitigated in the LI group. Few patients in the LI group reported pain/discomfort 2 years after the fracture, but there is uncertainty regarding its clinical significance. Secondary fracture after initial injury was a significant predictor of lower QOL after a fracture.
|
15
|
Fujita T, Terayama K, Sumita M, Tamura R, Nakamura Y, Naito M, Tsuda K. Understanding the evolution of a de novo molecule generator via characteristic functional group monitoring. SCIENCE AND TECHNOLOGY OF ADVANCED MATERIALS 2022; 23:352-360. [PMID: 35693890 PMCID: PMC9176351 DOI: 10.1080/14686996.2022.2075240] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/02/2022] [Revised: 05/01/2022] [Accepted: 05/04/2022] [Indexed: 06/15/2023]
Abstract
Recently, artificial intelligence (AI)-enabled de novo molecular generators (DNMGs) have automated molecular design based on data-driven or simulation-based property estimates. In some domains like the game of Go where AI surpassed human intelligence, humans are trying to learn from AI about the best strategy of the game. To understand DNMG's strategy of molecule optimization, we propose an algorithm called characteristic functional group monitoring (CFGM). Given a time series of generated molecules, CFGM monitors statistically enriched functional groups in comparison to the training data. In the task of absorption wavelength maximization of pure organic molecules (consisting of H, C, N, and O), we successfully identified a strategic change from diketone and aniline derivatives to quinone derivatives. In addition, CFGM led us to a hypothesis that 1,2-quinone is an unconventional chromophore, which was verified with chemical synthesis. This study shows the possibility that human experts can learn from DNMGs to expand their ability to discover functional molecules.
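One way to make "statistically enriched functional groups" concrete is a one-sided Fisher exact test on counts from the generated and training sets. This sketch is an assumption for illustration only: the abstract does not state which statistical test CFGM uses, and the counts below are hypothetical.

```python
from math import comb

def enrichment_p_value(k_generated, n_generated, k_training, n_training):
    """One-sided Fisher exact test for enrichment of a functional group.

    Returns the probability of seeing at least k_generated molecules with
    the group among n_generated molecules, if generated and training
    molecules were drawn from one pooled population.
    """
    total = n_generated + n_training
    with_group = k_generated + k_training
    upper = min(n_generated, with_group)
    # math.comb returns 0 when k > n, so impossible tables contribute nothing.
    tail = sum(
        comb(with_group, k) * comb(total - with_group, n_generated - k)
        for k in range(k_generated, upper + 1)
    )
    return tail / comb(total, n_generated)

# Hypothetical window of 100 generated molecules vs. 1000 training molecules.
p_enriched = enrichment_p_value(30, 100, 20, 1000)   # group in 30% vs. 2%
p_baseline = enrichment_p_value(2, 100, 20, 1000)    # group in 2% vs. 2%
print(p_enriched, p_baseline)
```

Applying such a test to successive windows of the generated molecules would flag the point at which a group becomes over-represented relative to the training data.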
|
16
|
Nakao A, Harabuchi Y, Maeda S, Tsuda K. Leveraging algorithmic search in quantum chemical reaction path finding. Phys Chem Chem Phys 2022; 24:10305-10310. [PMID: 35437567 DOI: 10.1039/d2cp01079h] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/21/2022]
Abstract
Reaction path finding methods construct a graph connecting reactants and products in a quantum chemical energy landscape. They are useful in elucidating various reactions and provide footsteps for designing new reactions. Their enormous computational cost, however, limits their application to relatively simple reactions. This paper engages in accelerating reaction path finding by introducing the principles of algorithmic search. A new method called RRT/SC-AFIR is devised by combining rapidly exploring random tree (RRT) and single component artificial force induced reaction (SC-AFIR). Using 96 cores, our method succeeded in constructing a reaction graph for Fritsch-Buttenberg-Wiechell rearrangement within a time limit of 3 days, while the conventional methods could not. Our results illustrate that the algorithm theory provides refreshing and beneficial viewpoints on quantum chemical methodologies.
|
17
|
Sumita M, Terayama K, Suzuki N, Ishihara S, Tamura R, Chahal MK, Payne DT, Yoshizoe K, Tsuda K. De novo creation of a naked eye-detectable fluorescent molecule based on quantum chemical computation and machine learning. SCIENCE ADVANCES 2022; 8:eabj3906. [PMID: 35263133 PMCID: PMC8906732 DOI: 10.1126/sciadv.abj3906] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/11/2021] [Accepted: 01/19/2022] [Indexed: 06/14/2023]
Abstract
Designing fluorescent molecules requires considering multiple interrelated molecular properties, as opposed to properties that straightforwardly correlate with molecular structure, such as light absorption of molecules. In this study, we have used a de novo molecule generator (DNMG) coupled with quantum chemical computation (QC) to develop fluorescent molecules, which are garnering significant attention in various disciplines. Using massive parallel computation (1024 cores, 5 days), the DNMG has produced 3643 candidate molecules. We have selected an unreported molecule and seven reported molecules and synthesized them. Photoluminescence spectrum measurements demonstrated that the DNMG can successfully design fluorescent molecules with 75% accuracy (n = 6/8) and create an unreported molecule that emits fluorescence detectable by the naked eye.
|
18
|
Nguyen DH, Tsuda K. Generating reaction trees with cascaded variational autoencoders. J Chem Phys 2022; 156:044117. [DOI: 10.1063/5.0076749] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022] Open
|
19
|
Sun X, Tamura R, Sumita M, Mori K, Terayama K, Tsuda K. Integrating Incompatible Assay Data Sets with Deep Preference Learning. ACS Med Chem Lett 2022; 13:70-75. [PMID: 35047110 PMCID: PMC8762726 DOI: 10.1021/acsmedchemlett.1c00439] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/12/2021] [Accepted: 12/27/2021] [Indexed: 11/30/2022] Open
Abstract
A large amount of bioactivity assay data is already accumulated in public databases, but the integration of these data sets for quantitative structure-activity relationship (QSAR) studies is not straightforward due to differences in experimental methods and settings. We present an efficient deep-learning-based approach called Deep Preference Data Integration (DPDI). For integrating outcome variables of different assay types, a surrogate variable is introduced, and a neural network is trained such that the total order induced by the surrogate variable is maximally consistent with given data sets. In a task of predicting efficacy of factor Xa inhibitors, DPDI successfully integrated 2959 molecules distributed in 129 assay data sets. In most of our experiments, data integration improved prediction accuracy strongly in interpolation and extrapolation tasks, indicating that DPDI is an effective tool for QSAR studies.
|
20
|
Saito Y, Oikawa M, Sato T, Nakazawa H, Ito T, Kameda T, Tsuda K, Umetsu M. Machine-Learning-Guided Library Design Cycle for Directed Evolution of Enzymes: The Effects of Training Data Composition on Sequence Space Exploration. ACS Catal 2021. [DOI: 10.1021/acscatal.1c03753] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/02/2023]
|
21
|
Takagiwa Y, Hou Z, Tsuda K, Ikeda T, Kojima H. Fe-Al-Si Thermoelectric (FAST) Materials and Modules: Diffusion Couple and Machine-Learning-Assisted Materials Development. ACS APPLIED MATERIALS & INTERFACES 2021; 13:53346-53354. [PMID: 34019762 DOI: 10.1021/acsami.1c04583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
To lower the introduction and maintenance costs of autonomous power supplies for driving Internet-of-things (IoT) devices, we have developed low-cost Fe-Al-Si-based thermoelectric (FAST) materials and power generation modules. Our development approach combines computational science, experiments, mapping measurements, and machine learning (ML). FAST materials have a good balance of mechanical properties and excellent chemical stability, superior to that of conventional Bi-Te-based materials. However, it remains challenging to enhance the power factor (PF) and lower the thermal conductivity of FAST materials to develop reliable power generation devices. This forum paper describes the current status of materials development based on experiments and ML with limited data, together with power generation module fabrication related to FAST materials with a view to commercialization. Combining bulk combinatorial methods with diffusion couple and mapping measurements could accelerate the search to enhance PF for FAST materials. We report that ML prediction is a powerful tool for finding unexpected off-stoichiometric compositions of the Fe-Al-Si system and dopant concentrations of a fourth element to enhance the PF, i.e., Co substitution for Fe atoms in FAST materials.
|
22
|
Terayama K, Sumita M, Katouda M, Tsuda K, Okuno Y. Efficient Search for Energetically Favorable Molecular Conformations against Metastable States via Gray-Box Optimization. J Chem Theory Comput 2021; 17:5419-5427. [PMID: 34261321 DOI: 10.1021/acs.jctc.1c00301] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
In order to accurately understand and estimate molecular properties, finding energetically favorable molecular conformations is the most fundamental task for atomistic computational research on molecules and materials. Geometry optimization based on quantum chemical calculations has enabled the conformation prediction of arbitrary molecules, including de novo ones. However, it is computationally expensive to perform geometry optimizations for enormous conformers. In this study, we introduce the gray-box optimization (GBO) framework, which enables optimal control over the entire geometry optimization process, among multiple conformers. Algorithms designed for GBO roughly estimate energetically preferable conformers during their geometry optimization iterations. They then preferentially compute promising conformers. To evaluate the performance of the GBO framework, we applied it to a test set consisting of seven dipeptides and mycophenolic acid to determine their stable conformations at the density functional theory level. We thus preferentially obtained energetically favorable conformations. Furthermore, the computational costs required to find the most stable conformation were significantly reduced (approximately 1% on average, compared to the naive approach for the dipeptides).
|
23
|
Aryal B, Morikawa D, Tsuda K, Terauchi M. Improvement of precision in refinements of structure factors using convergent-beam electron diffraction patterns taken at Bragg-excited conditions. ACTA CRYSTALLOGRAPHICA A-FOUNDATION AND ADVANCES 2021; 77:289-295. [PMID: 34196291 DOI: 10.1107/s2053273321004137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2020] [Accepted: 04/17/2021] [Indexed: 11/11/2022]
Abstract
A local structure analysis method based on convergent-beam electron diffraction (CBED) has been used for refining isotropic atomic displacement parameters and five low-order structure factors with sin θ/λ ≤ 0.28 Å⁻¹ of potassium tantalate (KTaO3). Comparison between structure factors determined from CBED patterns taken at the zone-axis (ZA) and Bragg-excited conditions is made in order to discuss their precision and sensitivities. Bragg-excited CBED patterns showed higher precision in the refinement of structure factors than ZA patterns. Consistency between higher precision and sensitivity of the Bragg-excited CBED patterns has been found only for structure factors of the outer zeroth-order Laue-zone reflections with larger reciprocal-lattice vectors. Correlation coefficients among the refined structure factors in the refinement of Bragg-excited patterns are smaller than those of the ZA ones. Such smaller correlation coefficients lead to higher precision in the refinement of structure factors.
|
24
|
Hou Z, Takagiwa Y, Shinohara Y, Xu Y, Tsuda K. First-principles study of electronic structures and elasticity of Al2Fe3Si3. JOURNAL OF PHYSICS. CONDENSED MATTER : AN INSTITUTE OF PHYSICS JOURNAL 2021; 33:195501. [PMID: 33561849 DOI: 10.1088/1361-648x/abe474] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/11/2020] [Accepted: 02/09/2021] [Indexed: 06/12/2023]
Abstract
The Al2Fe3Si3 intermetallic compound is promising for low-cost and non-toxic thermoelectric devices because of its relatively high power factor of ∼700 μW m⁻¹ K⁻² at 400 K. Herein we performed first-principles calculations with the projector augmented-wave (PAW) method to study the formation energies, elastic constants, electronic structures, and electronic transport properties of Al2Fe3Si3. We discussed the thermodynamical stability of Al2Fe3Si3 against other ternary crystalline compounds in the Al-Fe-Si system. The band gap of Al2Fe3Si3 was particularly examined using the semilocal and hybrid functionals and the on-site Hubbard correction, which were also applied to β-FeSi2 to calibrate the prediction reliability of our employed computational methods. Our calculations show that Al2Fe3Si3 is a narrow-gap semiconductor. The semilocal functional within the generalized gradient approximation (GGA) shows an exceptional agreement between the predicted band gap of Al2Fe3Si3 and the available experimental data, which is in contrast to the typical trend and is rationally understood through a comprehensive comparison. We found that both the HSE06 and PBE0 hybrid functionals with a standard setup substantially overestimated the band gaps of Al2Fe3Si3 and β-FeSi2. The underlying reasons may be ascribed to a large electronic screening, which arises from the unique characteristics of Fe 3d states appearing on both sides of the band gaps of Al2Fe3Si3 and β-FeSi2, and to a reduced delocalization error thanks to the covalent Fe-Si and Si-Si bonding nature. The chemical bonding and elasticity of Al2Fe3Si3 were compared with those of β-FeSi2 and FeAl2. In Al2Fe3Si3 the Fe-Al bonding is more ionic and the Fe-Si bonding is more covalent. The elastic moduli of Al2Fe3Si3 are comparable to those of β-FeSi2 and larger than those of FeAl2. Our calculation results indicate that the mechanical strength of Al2Fe3Si3 could be sufficient for practical application in thermoelectric devices.
|
25
|
Fujita Y, Tamaki J, Kouda K, Yura A, Sato Y, Tachiki T, Hamada M, Kajita E, Kamiya K, Kaji K, Tsuda K, Ohara K, Moon JS, Kitagawa J, Iki M. Determinants of bone health in elderly Japanese men: study design and key findings of the Fujiwara-kyo Osteoporosis Risk in Men (FORMEN) cohort study. Environ Health Prev Med 2021; 26:51. [PMID: 33892635 PMCID: PMC8066970 DOI: 10.1186/s12199-021-00972-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/03/2021] [Accepted: 04/11/2021] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND: The Fujiwara-kyo Osteoporosis Risk in Men (FORMEN) study was launched to investigate risk factors for osteoporotic fractures, interactions of osteoporosis with other non-communicable chronic diseases, and effects of fracture on quality of life (QOL) and mortality. METHODS: FORMEN baseline study participants (in 2007 and 2008) included 2,012 community-dwelling men (aged 65-93 years) in Nara prefecture, Japan. Clinical follow-up surveys were conducted 5 and 10 years after the baseline survey, and 1,539 and 906 men completed them, respectively. Supplemental mail, telephone, and visit surveys were conducted with non-participants to obtain outcome information. Survival and fracture outcomes were determined for 2,006 men, with 566 deaths identified and 1,233 men remaining in the cohort at 10-year follow-up. COMMENTS: The baseline survey covered a wide range of bone health-related indices including bone mineral density, trabecular microarchitecture assessment, vertebral imaging for detecting vertebral fractures, and biochemical markers of bone turnover, as well as comprehensive geriatric assessment items. Follow-up surveys were conducted to obtain outcomes including osteoporotic fracture, cardiovascular diseases, initiation of long-term care, and mortality. A complete list of publications relating to the FORMEN study can be found at https://www.med.kindai.ac.jp/pubheal/FORMEN/Publications.html.
|