1
Prabantu VM, Gadiyaram V, Vishveshwara S, Srinivasan N. Comparison of structural networks across homologous proteins. Proteins 2025; 93:267-278. [PMID: 38058245 DOI: 10.1002/prot.26650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 11/10/2023] [Accepted: 11/22/2023] [Indexed: 12/08/2023]
Abstract
Protein sequence determines its structure and function. The indirect relationship between protein function and structure lies deep-rooted in the structural topology that has evolved to perform optimal function. The evolution of structure and its interconnectivity has conventionally been studied by comparing the root-mean-square deviation between protein structures at the backbone level. Two factors that are necessary for the quantitative comparison of non-covalent interactions are (a) explicit inclusion of the coordinates of side-chain atoms and (b) consideration of multiple structures from the conformational landscape to account for structural variability. We have recently addressed these fundamental issues by investigating the alteration of inter-residue interactions across an ensemble of protein structure networks through a graph spectral approach. In this study, we have developed a rigorous method to compare the structure networks of homologous proteins with a wide range of sequence identity percentages. A range of dissimilarity measures that show the extent of change in the network across homologous structures is generated, which also includes the comparison of protein structure variability. We discuss in detail scenarios where the variation of structure is not accompanied by loss or gain of the overall network, and vice versa. The sequence-based phylogeny among the homologs is also compared with the lineage obtained from such a robust structure comparison. In summary, we can obtain a quantitative comparison score for the structure networks of homologous proteins, which also enables us to study the evolution of protein function based on the variation of their topologies.
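As a rough illustration of the conventional backbone-level comparison mentioned in this abstract, the following sketch computes a root-mean-square deviation between two coordinate sets. It is not the authors' network-based method: the function name `rmsd`, the toy coordinates, and the assumption that both structures are already superimposed and matched atom-for-atom are all introduced here for illustration only.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) tuples.

    Assumes the two structures are already superimposed and that
    atom i in coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Hypothetical three-atom backbones; only the middle atom is displaced by 1 unit.
backbone_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
backbone_b = [(0.0, 0.0, 0.0), (1.5, 1.0, 0.0), (3.0, 0.0, 0.0)]
print(rmsd(backbone_a, backbone_b))
```

Because this measure only sees matched atom positions, it cannot by itself report changes in side-chain contacts, which is the gap the abstract says the network comparison is meant to fill.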
2
Mohammad-Rahimi H, Sohrabniya F, Ourang SA, Dianat O, Aminoshariae A, Nagendrababu V, Dummer PMH, Duncan HF, Nosrat A. Artificial intelligence in endodontics: Data preparation, clinical applications, ethical considerations, limitations, and future directions. Int Endod J 2024; 57:1566-1595. [PMID: 39075670 DOI: 10.1111/iej.14128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2024] [Revised: 07/03/2024] [Accepted: 07/16/2024] [Indexed: 07/31/2024]
Abstract
Artificial intelligence (AI) is emerging as a transformative technology in healthcare, including endodontics. A gap in knowledge exists in understanding AI's applications and limitations among endodontic experts. This comprehensive review aims to (A) elaborate on technical and ethical aspects of using data to implement AI models in endodontics; (B) elaborate on evaluation metrics; (C) review the current applications of AI in endodontics; and (D) review the limitations and barriers to real-world implementation of AI in the field of endodontics and its future potentials/directions. The article shows that AI techniques have been applied in endodontics for critical tasks such as detection of radiolucent lesions, analysis of root canal morphology, prediction of treatment outcome and post-operative pain and more. Deep learning models like convolutional neural networks demonstrate high accuracy in these applications. However, challenges remain regarding model interpretability, generalizability, and adoption into clinical practice. When thoughtfully implemented, AI has great potential to aid with diagnostics, treatment planning, clinical interventions, and education in the field of endodontics. However, concerted efforts are still needed to address limitations and to facilitate integration into clinical workflows.
Affiliation(s)
- Hossein Mohammad-Rahimi
- Topic Group Dental Diagnostics and Digital Dentistry, ITU/WHO Focus Group AI on Health, Berlin, Germany
- Fatemeh Sohrabniya
- Topic Group Dental Diagnostics and Digital Dentistry, ITU/WHO Focus Group AI on Health, Berlin, Germany
- Seyed AmirHossein Ourang
- Dentofacial Deformities Research Center, Research Institute of Dental Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Omid Dianat
- Division of Endodontics, Department of Advanced Oral Sciences and Therapeutics, School of Dentistry, University of Maryland, Baltimore, Maryland, USA
- Private Practice, Irvine Endodontics, Irvine, California, USA
- Anita Aminoshariae
- Department of Endodontics, School of Dental Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- Henry F Duncan
- Division of Restorative Dentistry, Dublin Dental University Hospital, Trinity College Dublin, Dublin, Ireland
- Ali Nosrat
- Division of Endodontics, Department of Advanced Oral Sciences and Therapeutics, School of Dentistry, University of Maryland, Baltimore, Maryland, USA
- Private Practice, Centreville Endodontics, Centreville, Virginia, USA
3
Vu T, Wang Y, Fowler A, Simieou A, McCarty N. TRIM44, a Novel Prognostic Marker, Supports the Survival of Proteasome-Resistant Multiple Myeloma Cells. Cells 2024; 13:1431. [PMID: 39273003 PMCID: PMC11394402 DOI: 10.3390/cells13171431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2024] [Revised: 08/20/2024] [Accepted: 08/22/2024] [Indexed: 09/15/2024] Open
Abstract
TRIM44, a tripartite motif (TRIM) family member, is pivotal in linking the ubiquitin-proteasome system (UPS) to autophagy in multiple myeloma (MM). However, its prognostic impact and therapeutic potential remain underexplored. Here, we report that TRIM44 overexpression is associated with poor prognosis in a Multiple Myeloma Research Foundation (MMRF) cohort of 858 patients, persisting across primary and recurrent MM cases. TRIM44 expression notably increases in advanced MM stages, indicating its potential role in disease progression. Single-cell RNA sequencing across MM stages showed significant TRIM44 upregulation in smoldering MM (SMM) and MM compared to normal bone marrow, especially in patients with t(4;14) cytogenetic abnormalities. This analysis further identified high TRIM44 expression as predictive of lower responsiveness to proteasome inhibitor (PI) treatments, underscoring its critical function in the unfolded protein response (UPR) in TRIM44-high MM cells. Our findings also demonstrate that TRIM44 facilitates SQSTM1 oligomerization under oxidative stress, essential for its phosphorylation and subsequent autophagic degradation. This process supports the survival of PI-resistant MM cells by activating the NRF2 pathway, which is crucial for oxidative stress response and, potentially, other chemotherapy-induced stressors. Additionally, TRIM44 counters the TRIM21-mediated suppression of the antioxidant response, enhancing MM cell survival under oxidative stress. Collectively, our discoveries highlight TRIM44's significant role in MM progression and resistance to therapy, suggesting its potential value as a therapeutic target.
Affiliation(s)
- Trung Vu
- Brown Foundation Institute of Molecular Medicine for the Prevention of Human Diseases (IMM), The University of Texas-Health Science Center at Houston, Houston, TX 77021, USA
- Yuqin Wang
- Brown Foundation Institute of Molecular Medicine for the Prevention of Human Diseases (IMM), The University of Texas-Health Science Center at Houston, Houston, TX 77021, USA
- Annaliese Fowler
- The Department of Biomedical Engineering, Texas A&M University, Houston, TX 77030, USA
- Anton Simieou
- The Department of Biomedical Engineering, The University of Texas at Austin, Austin, TX 78712, USA
- Nami McCarty
- Brown Foundation Institute of Molecular Medicine for the Prevention of Human Diseases (IMM), The University of Texas-Health Science Center at Houston, Houston, TX 77021, USA
4
Shi D, Grey AC, Guo G. An isotopically-labelled temporal mass spectrometry imaging data analysis workflow to reveal glucose spatial metabolism patterns in bovine lens tissue. Sci Rep 2024; 14:18843. [PMID: 39138264 PMCID: PMC11322647 DOI: 10.1038/s41598-024-69507-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 08/06/2024] [Indexed: 08/15/2024] Open
Abstract
Application of stable isotopically labelled (SIL) molecules in Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry Imaging (MALDI-MSI) over a series of time points allows the temporal and spatial dynamics of biochemical reactions to be tracked in a biological system. However, these large kinetic MSI datasets and the inherent variability of biological replicates present significant challenges to the rapid analysis of the data. In addition, manual annotation of downstream SIL metabolites involves human input to carefully analyse the data based on prior knowledge and personal expertise. To overcome these challenges to the analysis of spatiotemporal MALDI-MSI data and improve the efficiency of SIL metabolite identification, a bioinformatics pipeline has been developed and demonstrated by analysing normal bovine lens glucose metabolism as a model system. The pipeline consists of spatial alignment to mitigate the impact of sample variability and ensure spatial comparability of the temporal data, dimensionality reduction to rapidly map regional metabolic distinctions within the tissue, and metabolite annotation coupled with pathway enrichment modules to summarise and display the metabolic pathways induced by the treatment. This pipeline will be valuable for the spatial metabolomics community to analyse kinetic MALDI-MSI datasets, enabling rapid characterisation of spatiotemporal metabolic patterns from tissues of interest.
Affiliation(s)
- Dingchang Shi
- Department of Physiology, School of Medical Sciences, University of Auckland, 85 Park Rd, Grafton, Auckland, 1023, New Zealand
- Angus C Grey
- Department of Physiology, School of Medical Sciences, University of Auckland, 85 Park Rd, Grafton, Auckland, 1023, New Zealand
- George Guo
- Department of Physiology, School of Medical Sciences, University of Auckland, 85 Park Rd, Grafton, Auckland, 1023, New Zealand
5
Bhatt R, Koes DR, Durrant JD. CENsible: Interpretable Insights into Small-Molecule Binding with Context Explanation Networks. J Chem Inf Model 2024; 64:4651-4660. [PMID: 38847393 PMCID: PMC11200255 DOI: 10.1021/acs.jcim.4c00825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Revised: 05/28/2024] [Accepted: 05/28/2024] [Indexed: 06/18/2024]
Abstract
We present a novel and interpretable approach for assessing small-molecule binding using context explanation networks. Given the specific structure of a protein/ligand complex, our CENsible scoring function uses a deep convolutional neural network to predict the contributions of precalculated terms to the overall binding affinity. We show that CENsible can effectively distinguish active vs inactive compounds for many systems. Its primary benefit over related machine-learning scoring functions, however, is that it retains interpretability, allowing researchers to identify the contribution of each precalculated term to the final affinity prediction, with implications for subsequent lead optimization.
Affiliation(s)
- Roshni Bhatt
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- David Ryan Koes
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Jacob D. Durrant
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
6
Liu D, Fu J, Elishav O, Sakakibara M, Yamanouchi K, Hirshberg B, Nakamuro T, Nakamura E. Melting entropy of crystals determined by electron-beam-induced configurational disordering. Science 2024; 384:1212-1219. [PMID: 38815089 DOI: 10.1126/science.adk3620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 05/01/2024] [Indexed: 06/01/2024]
Abstract
Upon melting, the molecules in a crystal explore numerous configurations, reflecting an increase in disorder. The molar entropy of disorder can be defined by Boltzmann's formula $\Delta S_d = R \ln W_d$, where $W_d$ is the increase in the number of microscopic states, so far inaccessible experimentally. We found that the Arrhenius frequency factor $A$ of the electron diffraction signal decay provides $W_d$ through an experimental equation $A = A_{\mathrm{INT}} W_d$, where $A_{\mathrm{INT}}$ is an inelastic scattering cross section. The method connects Clausius and Boltzmann experimentally and supplements the Clausius approach, being applicable to a femtogram quantity of thermally unstable and biomolecular crystals. The data also showed that crystal disordering and crystallization of melt are reciprocal, both governed by the entropy change but manifesting in opposite directions.
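The two relations quoted in this abstract can be combined in one step, as sketched below. The Arrhenius form in the first line is the standard textbook expression; the rate constant $k$, activation energy $E_a$, and temperature $T$ are symbols assumed here and are not given in the abstract.

```latex
\begin{align}
  k &= A \exp\!\left(-\frac{E_a}{RT}\right)
      && \text{(standard Arrhenius form; $k$, $E_a$, $T$ assumed)} \\
  A &= A_{\mathrm{INT}}\, W_d
      \;\;\Longrightarrow\;\;
      W_d = \frac{A}{A_{\mathrm{INT}}} \\
  \Delta S_d &= R \ln W_d
      = R \ln\!\left(\frac{A}{A_{\mathrm{INT}}}\right)
\end{align}
```

Read this way, a measured frequency factor that is ten times $A_{\mathrm{INT}}$ would correspond to $\Delta S_d = R \ln 10 \approx 2.30\,R$.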
Affiliation(s)
- Dongxin Liu
- Department of Chemistry, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Jiarui Fu
- Department of Chemistry, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Oren Elishav
- School of Chemistry, Tel Aviv University, Tel Aviv 6997801, Israel
- Masaya Sakakibara
- Department of Chemistry, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Kaoru Yamanouchi
- Department of Chemistry, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Barak Hirshberg
- School of Chemistry, Tel Aviv University, Tel Aviv 6997801, Israel
- The Center for Computational Molecular and Materials Science, Tel Aviv University, Tel Aviv 6997801, Israel
- Takayuki Nakamuro
- Department of Chemistry, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Eiichi Nakamura
- Department of Chemistry, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
7
Dong J, Wang S, Cui W, Sun X, Guo H, Yan H, Vogel H, Wang Z, Yuan S. Machine Learning Deciphered Molecular Mechanistics with Accurate Kinetic and Thermodynamic Prediction. J Chem Theory Comput 2024; 20:4499-4513. [PMID: 38394691 DOI: 10.1021/acs.jctc.3c01412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2024]
Abstract
Time-lagged independent component analysis (tICA) and the Markov state model (MSM) have been extensively employed for extracting conformational dynamics and kinetic community networks from unbiased trajectory ensembles. However, these techniques may not be the optimal choice for elucidating transition mechanisms within low-dimensional representations, especially for intricate biosystems. Unraveling the association mechanism in such complex systems always necessitates permutations of several essential independent components or collective variables, a process that is inherently obscure and may require empirical knowledge for selection. To address these challenges, we have implemented an integrated unsupervised dimension reduction model: uniform manifold approximation and projection (UMAP) with hierarchical density-based spatial clustering of applications with noise (HDBSCAN). This approach effectively generates low-dimensional configurational embeddings. The hierarchical application of this architecture, in conjunction with MSM, reveals global kinetic connectivity while identifying local conformational states. Consequently, our methodology establishes a multiscale mechanistic elucidation framework. Leveraging the benefits of the uniform sample distribution and a denoising approach, our model demonstrates robustness in preserving global and local data structures compared to traditional dimension reduction methods in the field of MD analysis. The interpretability of hyperparameter selection and compatibility with downstream tasks are cross-validated across various simulation data sets, utilizing both computational evaluation metrics and experimental kinetic observables. Furthermore, the predicted Mcl1-BH3 association kinetics (0.76 s⁻¹) is in close agreement with surface plasmon resonance experiments (0.12 s⁻¹), affirming the plausibility of the identified pathway composed of representative conformations.
We anticipate that the devised workflow will serve as a foundational framework for studying recognition patterns in complex biological systems. Its contributions extend to the exploration of protein functional dynamics and rational drug design, offering a potent avenue for advancing research in these domains.
Affiliation(s)
- Junlin Dong
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Shiyu Wang
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- AlphaMol Science Ltd, Shenzhen 518055, China
- Wenqiang Cui
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaolin Sun
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Haojie Guo
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Hailu Yan
- School of Biological Sciences, College of Science and Engineering, University of Edinburgh, Edinburgh EH8 9YL, U.K
- Horst Vogel
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Zhi Wang
- Artificial Intelligence Department, Zhejiang Financial College, Hangzhou 310018, China
- Shuguang Yuan
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- AlphaMol Science Ltd, Shenzhen 518055, China
8
Ye Y, Wang H, Chen W, Chen Z, Wu D, Zhang F, Hu F. Dynamic changes of immunocyte subpopulations in thermogenic activation of adipose tissues. Front Immunol 2024; 15:1375138. [PMID: 38812501 PMCID: PMC11133676 DOI: 10.3389/fimmu.2024.1375138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 04/30/2024] [Indexed: 05/31/2024] Open
Abstract
Objectives: The effects of cold exposure on whole-body metabolism in humans have gained increasing attention. Brown or beige adipose tissues are crucial in cold-induced thermogenesis to dissipate energy and thus have the potential to combat metabolic disorders. Despite the immune regulation of thermogenic adipose tissues, the overall changes in vital immune cells during distinct cold periods remain elusive. This study aimed to discuss the overall changes in immune cells under different cold exposure periods and to screen several potential immune cell subpopulations on thermogenic regulation.
Methods: Cibersort and mMCP-counter algorithms were employed to analyze immune infiltration in two (brown and beige) thermogenic adipose tissues under distinct cold periods. Changes in some crucial immune cell populations were validated by reanalyzing the single-cell sequencing dataset (GSE207706). Flow cytometry, immunofluorescence, and quantitative real-time PCR assays were performed to detect the proportion or expression changes in mouse immune cells of thermogenic adipose tissues under cold challenge.
Results: The proportion of monocytes, naïve, and memory T cells increased, while the proportion of NK cells decreased under cold exposure in brown adipose tissues.
Conclusion: Our study revealed dynamic changes in immune cell profiles in thermogenic adipose tissues and identified several novel immune cell subpopulations, which may contribute to thermogenic activation of adipose tissues under cold exposure.
Affiliation(s)
- Fang Hu
- National Clinical Research Center for Metabolic Diseases, Key Laboratory of Diabetes Immunology, Ministry of Education, Department of Metabolism and Endocrinology, The Second Xiangya Hospital of Central South University, Changsha, Hunan, China
9
Liu M, Zhang J, Li X, Wang Y. Research progress of DDR1 inhibitors in the treatment of multiple human diseases. Eur J Med Chem 2024; 268:116291. [PMID: 38452728 DOI: 10.1016/j.ejmech.2024.116291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 02/25/2024] [Accepted: 02/26/2024] [Indexed: 03/09/2024]
Abstract
Discoidin domain receptor 1 (DDR1) is a collagen-activated receptor tyrosine kinase (RTK) that plays pivotal roles in regulating cellular functions such as proliferation, differentiation, invasion, migration, and matrix remodeling. DDR1 is involved in the occurrence and progression of many human diseases, including cancer, fibrosis, and inflammation. Therefore, DDR1 represents a highly promising therapeutic target. Although no selective small-molecule inhibitors have reached clinical trials to date, many molecules have shown therapeutic effects in preclinical studies. For example, BK40143 has demonstrated significant promise in the therapy of neurodegenerative diseases. In this context, our perspective aims to provide an in-depth exploration of DDR1, encompassing its structural characteristics, biological functions, and disease relevance. Furthermore, we emphasize the importance of understanding the structure-activity relationship of DDR1 inhibitors and highlight the unique advantages of dual-target or multitarget inhibitors. We anticipate offering valuable insights into the development of more efficacious DDR1-targeted drugs.
Affiliation(s)
- Mengying Liu
- Department of Pulmonary and Critical Care Medicine, Targeted Tracer Research and Development Laboratory, Institute of Respiratory Health, Frontiers Science Center for Disease-related Molecular Network, Precision Medicine Key Laboratory of Sichuan Province & Precision Medicine Research Center, Neuro-system and Multimorbidity Laboratory, National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, 610041, Sichuan, China; Frontiers Medical Center, Tianfu Jincheng Laboratory, Chengdu, 610212, Sichuan, China
- Jifa Zhang
- Department of Pulmonary and Critical Care Medicine, Targeted Tracer Research and Development Laboratory, Institute of Respiratory Health, Frontiers Science Center for Disease-related Molecular Network, Precision Medicine Key Laboratory of Sichuan Province & Precision Medicine Research Center, Neuro-system and Multimorbidity Laboratory, National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, 610041, Sichuan, China; Frontiers Medical Center, Tianfu Jincheng Laboratory, Chengdu, 610212, Sichuan, China
- Xiaoxue Li
- Department of Dermatology, West China Hospital, Sichuan University, Chengdu, 610041, Sichuan, China
- Yuxi Wang
- Department of Pulmonary and Critical Care Medicine, Targeted Tracer Research and Development Laboratory, Institute of Respiratory Health, Frontiers Science Center for Disease-related Molecular Network, Precision Medicine Key Laboratory of Sichuan Province & Precision Medicine Research Center, Neuro-system and Multimorbidity Laboratory, National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, 610041, Sichuan, China; Frontiers Medical Center, Tianfu Jincheng Laboratory, Chengdu, 610212, Sichuan, China.
10
Hradiská H, Kurečka M, Beránek J, Tedeschi G, Višňovský V, Křenek A, Spiwok V. Acceleration of Molecular Simulations by Parametric Time-Lagged tSNE Metadynamics. J Phys Chem B 2024; 128:903-913. [PMID: 38237064 PMCID: PMC10839826 DOI: 10.1021/acs.jpcb.3c05669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 12/22/2023] [Accepted: 12/28/2023] [Indexed: 02/02/2024]
Abstract
The potential of molecular simulations is limited by their computational costs. There is often a need to accelerate simulations using enhanced sampling methods. Metadynamics applies a history-dependent bias potential that disfavors previously visited states. To apply metadynamics, it is necessary to select a few properties of the system, known as collective variables (CVs), that can be used to define the bias potential. Over the past few years, there have been emerging opportunities for machine learning and, in particular, artificial neural networks within this domain. In this broad context, a specific unsupervised machine learning method, namely parametric time-lagged t-distributed stochastic neighbor embedding (ptltSNE), was used to design CVs. The approach was tested on a Trp-cage (tryptophan cage) trajectory from the literature. The trajectory was used to generate a map of conformations, distinguish fast conformational changes from slow ones, and design CVs. Then, metadynamics simulations were performed. To accelerate the formation of the α-helix, we added the α-RMSD collective variable. This simulation led to one folding event in a 350 ns metadynamics simulation. To accelerate degrees of freedom not addressed by CVs, we performed parallel tempering metadynamics. This simulation led to 10 folding events in a 200 ns simulation with 32 replicas.
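The history-dependent bias described in this abstract can be pictured with a minimal one-dimensional sketch: each visited value of a collective variable leaves behind a Gaussian "hill", and the sum of hills raises the effective energy of states that have already been sampled. The double-well potential, the hill height and width, and the list of visited CV values below are all hypothetical choices made for illustration; this is not the ptltSNE workflow of the paper and involves no actual molecular dynamics.

```python
import math

def bias_potential(s, centers, height=0.5, width=0.2):
    """Sum of Gaussian hills deposited at previously visited CV values."""
    return sum(
        height * math.exp(-((s - c) ** 2) / (2.0 * width ** 2))
        for c in centers
    )

def double_well(s):
    """Toy free-energy surface with equal minima at s = -1 and s = +1."""
    return (s ** 2 - 1.0) ** 2

def biased_energy(s, centers):
    """Energy the walker feels: underlying surface plus accumulated bias."""
    return double_well(s) + bias_potential(s, centers)

# Hypothetical history: the walker has so far only explored the left well.
visited = [-1.0, -1.0, -0.9, -1.1]

left = biased_energy(-1.0, visited)
right = biased_energy(1.0, visited)
print(f"biased energy at s=-1: {left:.3f}, at s=+1: {right:.3f}")
```

With the bias in place, the left minimum is no longer as favorable as the unvisited right minimum, which is the mechanism by which metadynamics pushes a simulation out of states it has already sampled.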
Affiliation(s)
- Helena Hradiská
- Department of Biochemistry and Microbiology, University of Chemistry and Technology Prague, Technická 3, Prague 6 166 28, Czech Republic
- Martin Kurečka
- Institute of Computer Science, Masaryk University, Šumavská 416/15, Brno 602 00, Czech Republic
- Jan Beránek
- Department of Biochemistry and Microbiology, University of Chemistry and Technology Prague, Technická 3, Prague 6 166 28, Czech Republic
- Guglielmo Tedeschi
- Department of Biochemistry and Microbiology, University of Chemistry and Technology Prague, Technická 3, Prague 6 166 28, Czech Republic
- Vladimír Višňovský
- Institute of Computer Science, Masaryk University, Šumavská 416/15, Brno 602 00, Czech Republic
- Aleš Křenek
- Institute of Computer Science, Masaryk University, Šumavská 416/15, Brno 602 00, Czech Republic
- Vojtěch Spiwok
- Department of Biochemistry and Microbiology, University of Chemistry and Technology Prague, Technická 3, Prague 6 166 28, Czech Republic
11
Liu X, Xing J, Fu H, Shao X, Cai W. Analyzing Molecular Dynamics Trajectories Thermodynamically through Artificial Intelligence. J Chem Theory Comput 2024; 20:665-676. [PMID: 38193858 DOI: 10.1021/acs.jctc.3c00975] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/10/2024]
Abstract
Molecular dynamics simulations produce trajectories that correspond to vast numbers of structures when exploring biochemical processes. Extracting valuable information, e.g., important intermediate states and collective variables (CVs) that describe the major movement modes, from molecular trajectories to understand the underlying mechanisms of biological processes presents a significant challenge. To achieve this goal, we introduce a deep learning approach, coined DIKI (deep identification of key intermediates), to determine low-dimensional CVs distinguishing key intermediate conformations without a priori assumptions. DIKI dynamically plans the distribution of latent space and groups together similar conformations within the same cluster. Moreover, incorporating two user-defined parameters, namely, a coarse focus knob and a fine focus knob, helps identify conformations with low free energy and differentiate the subtle distinctions among these conformations, achieving resolution-tunable clustering. Furthermore, the integration of DIKI with a path-finding algorithm contributes to the identification of crucial intermediates along the lowest free-energy pathway. We postulate that DIKI is a robust and flexible tool that can find widespread applications in the analysis of complex biochemical processes.
Affiliation(s)
- Xuyang Liu
- Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
- Jingya Xing
- Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
- Haohao Fu
- Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
- Xueguang Shao
- Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
- Wensheng Cai
- Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
|
12
|
Zhou M, Li H, Hu J, Zhou T, Zhou L, Li Y. Construction and validation of a prognostic signature based on seven endoplasmic reticulum stress-related lncRNAs for patients with head and neck squamous cell carcinoma. Sci Rep 2023; 13:22414. [PMID: 38104177 PMCID: PMC10725423 DOI: 10.1038/s41598-023-49987-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 12/14/2023] [Indexed: 12/19/2023]
Abstract
Endoplasmic reticulum stress (ERS) occurs when misfolded or unfolded proteins accumulate in the endoplasmic reticulum (ER), and it is often observed in tumors, including head and neck squamous cell carcinoma (HNSCC). Relevant studies have demonstrated the prognostic significance of ERS-related long non-coding RNAs (lncRNAs) in various cancers. However, the relationship between ERS and lncRNAs in HNSCC has received limited attention in previous studies. In this study, we aimed to develop an ERS-related lncRNA prognostic model using correlation analysis, Cox regression analysis, and least absolute shrinkage and selection operator (LASSO) regression analysis based on data from The Cancer Genome Atlas (TCGA) database. The survival and predictive ability of this model were evaluated using Kaplan-Meier analysis and time-dependent receiver operating characteristic (ROC) curves, while nomograms and calibration curves were constructed. Then, functional enrichment analyses, tumor mutation burden (TMB), tumor infiltration of immune cells, single sample Gene Set Enrichment Analysis (ssGSEA), and drug sensitivity analysis were performed. Additionally, we conducted a consensus cluster analysis to compare differences between subtypes of tumors. Finally, we validated the expression of the ERS-related lncRNAs that make up the prognostic risk score model in HNSCC tissues through quantitative real-time PCR (qRT-PCR). We developed a prognostic signature based on seven ERS-related lncRNAs, which showed better predictive performance than other clinicopathological features. The high-risk group had a poorer prognosis than the low-risk group. The area under the ROC curve (AUC) predicted by this model for 3-year survival rates of HNSCC patients was 0.805. Enrichment analysis revealed that the differentially expressed genes were primarily enriched in pathways related to immune responses and signal transduction. Low-risk patients had lower TMB, more immune cell infiltrations, and enhanced anti-tumor immunity. Cluster analysis indicated that cluster 3 may have a better prognosis and immunotherapy effect. In addition, the result of qRT-PCR was consistent with our analysis. This prognostic model based on seven ERS-related lncRNAs is a promising tool for risk stratification, survival prediction, and immune cell infiltration status assessment.
Affiliation(s)
- Mingzhu Zhou
- Department of Otorhinolaryngology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Huihui Li
- Physical Examination Center, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Juanjuan Hu
- Department of Otorhinolaryngology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Tao Zhou
- Department of Otorhinolaryngology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Liuqing Zhou
- Department of Otorhinolaryngology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Yuncheng Li
- Department of Otorhinolaryngology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
|
13
|
Wakayama R, Takasugi S, Honda K, Kanaya S. Application of a Two-Dimensional Mapping-Based Visualization Technique: Nutrient-Value-Based Food Grouping. Nutrients 2023; 15:5006. [PMID: 38068864 PMCID: PMC10707954 DOI: 10.3390/nu15235006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 11/29/2023] [Accepted: 12/01/2023] [Indexed: 12/18/2023]
Abstract
Worldwide, several food-based dietary guidelines, with diverse food-grouping methods in various countries, have been developed to maintain and promote public health. However, standardized international food-grouping methods are scarce. In this study, we used two-dimensional mapping to classify foods based on their nutrient composition. The Standard Tables of Food Composition in Japan were used for mapping with a novel technique, t-distributed stochastic neighbor embedding, to visualize high-dimensional data. The mapping results showed that most foods formed food group-based clusters in the Standard Tables of Food Composition in Japan. However, the beverages did not form large clusters and demonstrated scattered distribution on the map. Green tea, black tea, and coffee are located within or near the vegetable cluster, whereas cocoa is near the pulse cluster. These results were confirmed by k-nearest neighbors analysis. Thus, beverages made from natural materials can be categorized based on their origin. Visualization of food composition could enable an enhanced comprehensive understanding of the nutrients in foods, which could lead to novel aspects of nutrient-value-based food classifications.
Affiliation(s)
- Ryota Wakayama
- Meiji Co., Ltd., 2-2-1 Kyobashi, Chuo-ku 104-9306, Tokyo, Japan
- Computational Systems Biology Laboratory, Division of Information Science, Graduate School of Science and Technology & Data Science Center, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma 630-0192, Nara, Japan
- Satoshi Takasugi
- Meiji Co., Ltd., 2-2-1 Kyobashi, Chuo-ku 104-9306, Tokyo, Japan
- Keiko Honda
- Medicine Nutrition, Faculty of Nutrition, Kagawa Nutrition University, 3-9-21 Chiyoda, Sakado 350-0288, Saitama, Japan
- Shigehiko Kanaya
- Computational Systems Biology Laboratory, Division of Information Science, Graduate School of Science and Technology & Data Science Center, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma 630-0192, Nara, Japan
|
14
|
Bhatt R, Koes DR, Durrant JD. CENsible: Interpretable Insights into Small-Molecule Binding with Context Explanation Networks. bioRxiv 2023:2023.10.18.562959. [PMID: 37904961 PMCID: PMC10614872 DOI: 10.1101/2023.10.18.562959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2023]
Abstract
We present a novel and interpretable approach for predicting small-molecule binding affinities using context explanation networks (CENs). Given the specific structure of a protein/ligand complex, our CENsible scoring function uses a deep convolutional neural network to predict the contributions of pre-calculated terms to the overall binding affinity. We show that CENsible can effectively distinguish active vs. inactive compounds for many systems. Its primary benefit over related machine-learning scoring functions, however, is that it retains interpretability, allowing researchers to identify the contribution of each pre-calculated term to the final affinity prediction, with implications for subsequent lead optimization.
Affiliation(s)
- Roshni Bhatt
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15260
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260
- David Ryan Koes
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15260
- Jacob D Durrant
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260
|
15
|
Appadurai R, Koneru JK, Bonomi M, Robustelli P, Srivastava A. Clustering Heterogeneous Conformational Ensembles of Intrinsically Disordered Proteins with t-Distributed Stochastic Neighbor Embedding. J Chem Theory Comput 2023; 19:4711-4727. [PMID: 37338049 PMCID: PMC11108026 DOI: 10.1021/acs.jctc.3c00224] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/21/2023]
Abstract
Intrinsically disordered proteins (IDPs) populate a range of conformations that are best described by a heterogeneous ensemble. Grouping an IDP ensemble into "structurally similar" clusters for visualization, interpretation, and analysis purposes is a much-desired but formidable task, as the conformational space of IDPs is inherently high-dimensional and reduction techniques often result in ambiguous classifications. Here, we employ the t-distributed stochastic neighbor embedding (t-SNE) technique to generate homogeneous clusters of IDP conformations from the full heterogeneous ensemble. We illustrate the utility of t-SNE by clustering conformations of two disordered proteins, Aβ42 and α-synuclein, in their apo states and when bound to small-molecule ligands. Our results shed light on ordered substates within disordered ensembles and provide structural and mechanistic insights into binding modes that confer specificity and affinity in IDP ligand binding. t-SNE projections preserve the local neighborhood information, provide interpretable visualizations of the conformational heterogeneity within each ensemble, and enable the quantification of cluster populations and their relative shifts upon ligand binding. Our approach provides a new framework for detailed investigations of the thermodynamics and kinetics of IDP ligand binding and will aid rational drug design for IDPs.
Affiliation(s)
- Rajeswari Appadurai
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka 560012, India
- Massimiliano Bonomi
- Structural Bioinformatics Unit, Department of Structural Biology and Chemistry, CNRS UMR 3528, C3BI, CNRS USR 3756, Institut Pasteur, Paris, France
- Paul Robustelli
- Dartmouth College, Department of Chemistry, Hanover, NH, 03755, USA
- Anand Srivastava
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka 560012, India
|
16
|
Peng J, Wang L, Wang P, Pei Y. Density Functional Theory Computation and Machine Learning Studies of Interaction between Au3 Clusters and 20 Natural Amino Acid Molecules. ACS Omega 2023; 8:23024-23031. [PMID: 37396243 PMCID: PMC10308543 DOI: 10.1021/acsomega.3c02195] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/01/2023] [Accepted: 05/17/2023] [Indexed: 07/04/2023]
Abstract
The optimal adsorption sites and the binding energies of neutral Au3 clusters with 20 natural amino acids under the gas phase and water solvation were systematically investigated based on density functional theory (DFT) calculations. The calculation results showed that in the gas phase Au3 tends to bind with N atoms of amino groups in amino acids, except methionine, which tends to bind with Au3 through S atoms. Under water solvation, Au3 clusters tended to bind to N atoms of amino groups and N atoms of side chain amino groups in amino acids. However, methionine and cysteine bind more strongly to the gold atom through the S atom. Based on the binding energy data of Au3 clusters and 20 natural amino acids under water solvation calculated by DFT, a machine learning model (gradient boosted decision tree) was proposed to predict the optimal binding Gibbs free energy (ΔG) of the interaction between Au3 clusters and amino acids. The main factors affecting the strength of the interaction between Au3 and amino acids were uncovered by the feature importance analysis.
Affiliation(s)
- Jiao Peng
- Department of Chemistry, Key Laboratory for Green Organic Synthesis and Application of Hunan Province, Key Laboratory of Environmentally Friendly Chemistry and Applications of Ministry of Education, Xiangtan University, Xiangtan, Hunan 411105, China
- Li Wang
- Department of Chemistry, Key Laboratory for Green Organic Synthesis and Application of Hunan Province, Key Laboratory of Environmentally Friendly Chemistry and Applications of Ministry of Education, Xiangtan University, Xiangtan, Hunan 411105, China
- Pu Wang
- Department of Chemistry, Key Laboratory for Green Organic Synthesis and Application of Hunan Province, Key Laboratory of Environmentally Friendly Chemistry and Applications of Ministry of Education, Xiangtan University, Xiangtan, Hunan 411105, China
- Yong Pei
- Department of Chemistry, Key Laboratory for Green Organic Synthesis and Application of Hunan Province, Key Laboratory of Environmentally Friendly Chemistry and Applications of Ministry of Education, Xiangtan University, Xiangtan, Hunan 411105, China
- School of Minerals Processing and Bioengineering, Central South University, Changsha, Hunan 410083, China
- State Key Laboratory of Complex Nonferrous Metal Resources Clean Utilization, Kunming 650093, China
|
17
|
Chandra R, Bansal C, Kang M, Blau T, Agarwal V, Singh P, Wilson LOW, Vasan S. Unsupervised machine learning framework for discriminating major variants of concern during COVID-19. PLoS One 2023; 18:e0285719. [PMID: 37200352 PMCID: PMC10194860 DOI: 10.1371/journal.pone.0285719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Accepted: 04/28/2023] [Indexed: 05/20/2023]
Abstract
Due to the high mutation rate of the virus, the COVID-19 pandemic evolved rapidly. Certain variants of the virus, such as Delta and Omicron, emerged with altered viral properties, leading to severe transmission and death rates. These variants burdened medical systems worldwide, with a major impact on travel, productivity, and the world economy. Unsupervised machine learning methods have the ability to compress, characterize, and visualize unlabelled data. This paper presents a framework that utilizes unsupervised machine learning methods to discriminate and visualize the associations between major COVID-19 variants based on their genome sequences. These methods comprise a combination of selected dimensionality reduction and clustering techniques. The framework processes the RNA sequences by performing a k-mer analysis on the data and further visualises and compares the results using selected dimensionality reduction methods that include principal component analysis (PCA), t-distributed stochastic neighbour embedding (t-SNE), and uniform manifold approximation and projection (UMAP). Our framework also employs agglomerative hierarchical clustering to visualize the mutational differences among major variants of concern and country-wise mutational differences for selected variants (Delta and Omicron) using dendrograms. We find that the proposed framework can effectively distinguish between the major variants and has the potential to identify emerging variants in the future.
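The k-mer analysis step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `kmer_counts` and `kmer_frequency_vector`, the default `ACGT` alphabet, and the choice to skip windows containing ambiguous bases are all assumptions made here for illustration. The resulting fixed-length vectors are the kind of input that PCA, t-SNE, or UMAP would then reduce.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k, alphabet="ACGT"):
    """Count overlapping k-mers, skipping windows with symbols outside `alphabet`."""
    sequence = sequence.upper()
    allowed = set(alphabet)
    counts = Counter()
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if set(kmer) <= allowed:  # ignore windows with ambiguous bases such as N
            counts[kmer] += 1
    return counts


def kmer_frequency_vector(sequence, k, alphabet="ACGT"):
    """Return normalised k-mer frequencies in a fixed lexicographic order."""
    counts = kmer_counts(sequence, k, alphabet)
    total = sum(counts.values())
    vocabulary = ["".join(p) for p in product(sorted(alphabet), repeat=k)]
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[kmer] / total for kmer in vocabulary]


if __name__ == "__main__":
    # Hypothetical toy sequence, not real genome data.
    vector = kmer_frequency_vector("ACGTACGTNACGT", k=2)
    print(len(vector), round(sum(vector), 6))
```

Every sequence maps to a vector of length `len(alphabet) ** k` regardless of genome length, which is what makes sequences of different sizes directly comparable.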
Affiliation(s)
- Rohitash Chandra
- Transitional Artificial Intelligence Research Group, School of Mathematics and Statistics, UNSW Sydney, Sydney, Australia
- Chaarvi Bansal
- Transitional Artificial Intelligence Research Group, School of Mathematics and Statistics, UNSW Sydney, Sydney, Australia
- Department of Computer Science and Information Systems, Birla Institute of Technology and Science Pilani, Rajasthan, India
- Mingyue Kang
- Transitional Artificial Intelligence Research Group, School of Mathematics and Statistics, UNSW Sydney, Sydney, Australia
- Tom Blau
- Data 61, CSIRO, Sydney, Australia
- Vinti Agarwal
- Department of Computer Science and Information Systems, Birla Institute of Technology and Science Pilani, Rajasthan, India
- Pranjal Singh
- Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, Assam, India
- Laurence O. W. Wilson
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, North Ryde, Australia
- Seshadri Vasan
- Department of Health Sciences, University of York, York, United Kingdom
|
18
|
Shen X, Shang L, Han J, Zhang Y, Niu W, Liu H, Shi H. Immune-related gene signature associates with immune landscape and predicts prognosis accurately in patients with skin cutaneous melanoma. Front Genet 2023; 13:1095867. [PMID: 36685954 PMCID: PMC9845246 DOI: 10.3389/fgene.2022.1095867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Accepted: 12/02/2022] [Indexed: 01/06/2023]
Abstract
Skin cutaneous melanoma (SKCM) is the skin cancer that causes the highest number of deaths worldwide. There is growing evidence that the tumour immune microenvironment is associated with cancer prognosis; however, there is little research on the role of immune status in melanoma prognosis. In this study, data on patients with SKCM were downloaded from the GEO, TCGA, and GTEx databases. Genes associated with the immune pathway were screened from published papers and lncRNAs associated with them were identified. We performed immune microenvironment and functional enrichment analyses. The analysis was followed by applying univariate/multivariate Cox regression algorithms to finally identify three lncRNAs associated with the immune pathway for the construction of prognostic prediction models (CXCL10, RXRG, and SCG2). This stepwise downscaling method, which finally screens out prognostic factors and key genes and then uses them to build a risk model, has excellent predictive power. According to analyses of the model's reliability, it was able to differentiate the prognosis and survival of SKCM patient populations more effectively. This study analyses immune pathway-related lncRNAs in SKCM in an effort to open up new treatment avenues for SKCM.
|
19
|
A study of autoencoders as a feature extraction technique for spike sorting. PLoS One 2023; 18:e0282810. [PMID: 36893210 PMCID: PMC9997908 DOI: 10.1371/journal.pone.0282810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 02/22/2023] [Indexed: 03/10/2023]
Abstract
Spike sorting is the process of grouping spikes of distinct neurons into their respective clusters. Most frequently, this grouping is performed by relying on the similarity of features extracted from spike shapes. In spite of recent developments, current methods have yet to achieve satisfactory performance, and many investigators favour sorting manually, even though it is an intensive and time-consuming undertaking. To automate the process, a diverse array of machine learning techniques has been applied. The performance of these techniques, however, depends critically on the feature extraction step. Here, we propose deep learning using autoencoders as a feature extraction method and extensively evaluate the performance of multiple designs. The models presented are evaluated on publicly available synthetic and real "in vivo" datasets, with various numbers of clusters. The proposed methods indicate a higher performance for the process of spike sorting when compared to other state-of-the-art techniques.
|
20
|
Tsukiyama S, Hasan MM, Kurata H. CNN6mA: Interpretable neural network model based on position-specific CNN and cross-interactive network for 6mA site prediction. Comput Struct Biotechnol J 2022; 21:644-654. [PMID: 36659917 PMCID: PMC9826936 DOI: 10.1016/j.csbj.2022.12.043] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/07/2022] [Revised: 12/26/2022] [Accepted: 12/27/2022] [Indexed: 12/29/2022]
Abstract
N6-methyladenine (6mA) plays a critical role in various epigenetic processes including DNA replication, DNA repair, silencing, transcription, and diseases such as cancer. To understand such epigenetic mechanisms, 6mA has been detected by high-throughput technologies on a genome-wide scale at single-base resolution, together with conventional methods such as immunoprecipitation, mass spectrometry and capillary electrophoresis, but these experimental approaches are time-consuming and laborious. To complement these approaches, we have developed a CNN-based 6mA site predictor, named CNN6mA, which introduces two new architectures: a position-specific 1-D convolutional layer and a cross-interactive network. In the position-specific 1-D convolutional layer, position-specific filters with different window sizes were applied to an inquiry sequence, instead of sharing the same filters over all positions, in order to extract the position-specific features at different levels. The cross-interactive network explored the relationships between all the nucleotide patterns within the inquiry sequence. Consequently, CNN6mA outperformed the existing state-of-the-art models in many species and created a contribution score vector that intelligibly interprets the prediction mechanism. The source code and web application of CNN6mA are freely accessible at https://github.com/kuratahiroyuki/CNN6mA.git and http://kurata35.bio.kyutech.ac.jp/CNN6mA/, respectively.
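To make the contrast between shared and position-specific filters concrete, the following is a minimal pure-Python sketch. It is not the CNN6mA implementation: the helper names (`one_hot`, `window_response`, `shared_conv1d`, `position_specific_conv1d`) are hypothetical, it uses a single window size and a single filter per position, and it omits bias, activation, and training.

```python
BASES = "ACGT"


def one_hot(sequence):
    """Encode a DNA string as a list of 4-element one-hot rows (unknown bases become all zeros)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in sequence.upper()]


def window_response(window, kernel):
    """Dot product between a (width x 4) window and a kernel of the same shape."""
    return sum(k * x
               for k_row, x_row in zip(kernel, window)
               for k, x in zip(k_row, x_row))


def shared_conv1d(encoded, kernel):
    """Ordinary 1-D convolution (no padding): the same kernel slides over every position."""
    width = len(kernel)
    return [window_response(encoded[i:i + width], kernel)
            for i in range(len(encoded) - width + 1)]


def position_specific_conv1d(encoded, kernels):
    """Position-specific variant: output position i uses its own kernel, kernels[i]."""
    width = len(kernels[0])
    n_out = len(encoded) - width + 1
    if len(kernels) != n_out:
        raise ValueError("need exactly one kernel per output position")
    return [window_response(encoded[i:i + width], kernels[i])
            for i in range(n_out)]


if __name__ == "__main__":
    x = one_hot("ACGT")
    shared_kernel = [[1.0] * 4 for _ in range(2)]
    per_position = [[[float(i + 1)] * 4 for _ in range(2)] for i in range(3)]
    print(shared_conv1d(x, shared_kernel))
    print(position_specific_conv1d(x, per_position))
```

Because the weights are not tied across positions, the position-specific layer has one set of parameters per output position, so it can respond differently to the same motif depending on where it occurs relative to the candidate site, at the cost of more parameters.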
Key Words
- 6mA, N6-methyladenine
- AUCs, Area under the curves
- BERT, Bidirectional Encoder Representations from Transformers
- CNN
- CNN, Convolutional neural network
- DNA modification
- Deep learning
- Interpretable prediction
- LSTM, Long short-term memory
- MCC, Matthews correlation coefficient
- Machine learning
- N6-methyladenine
- RF, Random forest
- SMRT, Single-molecule real-time
- SN, Sensitivity
- SP, Specificity
- UMAP, Uniform manifold approximation and projection
- t-SNE, t-distributed stochastic neighbor embedding
Affiliation(s)
- Sho Tsukiyama
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680–4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
- Md Mehedi Hasan
- Tulane Center for Aging and Department of Medicine, Tulane University Health Sciences Center, New Orleans, LA 70112, USA
- Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680–4 Kawazu, Iizuka, Fukuoka 820-8502, Japan. Corresponding author.
|
21
|
Baidya L, Reddy G. pH Induced Switch in the Conformational Ensemble of Intrinsically Disordered Protein Prothymosin-α and Its Implications for Amyloid Fibril Formation. J Phys Chem Lett 2022; 13:9589-9598. [PMID: 36206480 DOI: 10.1021/acs.jpclett.2c01972] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/16/2023]
Abstract
Aggregation of intrinsically disordered proteins (IDPs) can lead to neurodegenerative diseases. Although there is experimental evidence that acidic pH promotes IDP monomer compaction leading to aggregation, the general mechanism is unclear. We studied the pH effect on the conformational ensemble of prothymosin-α (proTα), which is involved in multiple essential functions, and probed its role in aggregation using computer simulations. We show that compaction in the proTα dimension at low pH is due to the protein's collapse in the intermediate region (E41-D80) rich in glutamic acid residues, enhancing its β-sheet content. We observed by performing dimer simulations that the conformations with high β-sheet content could act as aggregation-prone (N*) states and nucleate the aggregation process. The simulations initiated using N* states form dimers within a microsecond time scale, whereas the non-N* states do not form dimers within this time scale. This study contributes to understanding the general principles of pH-induced IDP aggregation.
Affiliation(s)
- Lipika Baidya
- Solid State and Structural Chemistry Unit, Indian Institute of Science, Bengaluru, Karnataka 560012, India
- Govardhan Reddy
- Solid State and Structural Chemistry Unit, Indian Institute of Science, Bengaluru, Karnataka 560012, India
|
22
|
Li S, Pan J, Zhang Y, Tang Y, Zeng X, Wang S, Wu D, Liu Y, Xu D, Lan J, Hu D. An eleven autophagy-related genes-based prognostic signature for endometrial carcinoma. J Egypt Natl Canc Inst 2022; 34:42. [DOI: 10.1186/s43046-022-00135-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Accepted: 07/11/2022] [Indexed: 12/24/2022]
Abstract
Background
Endometrial cancer (EC) is a common malignant tumor in women with increasing mortality. The prognosis of EC is highly heterogeneous, and more effective biomarkers are needed for clinical decision-making. Here, we report the effect of autophagy-related genes (ARGs) on the prognosis of EC.
Methods
The expression data of EC tissues and adjacent non-tumor samples were available from the TCGA dataset and 232 autophagy-related genes were from The Human Autophagy Database. A prognostic ARGs risk model was further constructed by using LASSO-Cox regression, and its prognostic and predictive value were evaluated by nomogram. Further functional analysis was conducted to reveal a significant signaling pathway.
Results
A total of 45 differentially expressed ARGs were obtained, including 18 upregulated and 27 downregulated genes. Eleven ARGs (BID, CAPN2, CDKN2A, DLC1, GRID2, IFNG, MYC, NRG3, P4HB, PTK6, and TP73) were finally selected to build the ARG risk signature. This signature could well distinguish between the high- and low-risk patients (survival analysis: P = 1.18E-10; AUC: 0.733 at 1 year, 0.795 at 3 years, and 0.823 at 5 years). Furthermore, a nomogram was plotted to predict the probability of overall survival and suggested good value for clinical utility.
Conclusion
We established an eleven-ARG signature, which was probably effective in the prognostic prediction of patients with EC.
|
23
|
Rodríguez Serrano AF, Hsing IM. Prediction of Aptamer-Small-Molecule Interactions Using Metastable States from Multiple Independent Molecular Dynamics Simulations. J Chem Inf Model 2022; 62:4799-4809. [PMID: 36134737 DOI: 10.1021/acs.jcim.2c00734] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
Understanding aptamer-ligand interactions is necessary to rationally design aptamer-based systems. Commonly used in silico tools have proven to be accurate to predict RNA and DNA oligonucleotide tertiary structures. However, given the complexity of nucleic acids, the most thermodynamically stable conformation is not necessarily the one with the highest affinity for a specific ligand. Because many metastable states may coexist, it remains challenging to predict binding sites through molecular docking simulations using available computational pipelines. In this study, we used independent simulations to broaden the conformational diversity sampled from DNA initial models of distinct stability and assessed the binding affinity of selected metastable representative structures. In our results, utilizing multiple metastable conformations for molecular docking analysis helped identify structures favorable for ligand binding and accurately predict the binding sites. Our workflow was able to correctly identify the binding sites of the characterized adenosine monophosphate and l-argininamide aptamers. Additionally, we demonstrated that our pipeline can be used to aid the design of competition assays that are conducive to aptasensing strategies using an uncharacterized aflatoxin B1 aptamer. We foresee that this approach may help rationally design effective and truncated aptamer sequences interacting with protein biomarkers or small molecules of interest for drug design and sensor applications.
Affiliation(s)
- Alan Fernando Rodríguez Serrano
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR 999077, China
- I-Ming Hsing
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR 999077, China
|
24
|
Predicting the prevalence of lung cancer using feature transformation techniques. Egyptian Informatics Journal 2022. [DOI: 10.1016/j.eij.2022.08.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
|
25
|
Bai F, Puk KM, Liu J, Zhou H, Tao P, Zhou W, Wang S. Sparse group selection and analysis of function-related residue for protein-state recognition. J Comput Chem 2022; 43:1342-1354. [PMID: 35656889 PMCID: PMC9248267 DOI: 10.1002/jcc.26937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2021] [Revised: 03/23/2022] [Accepted: 05/08/2022] [Indexed: 11/08/2022]
Abstract
Machine learning methods have helped to advance a wide range of scientific and technological fields in recent years, including computational chemistry. As chemical systems can become complex and high-dimensional, feature selection can be critical but challenging when developing reliable machine learning-based prediction models, especially for proteins as bio-macromolecules. In this study, we applied the sparse group lasso (SGL) method as a general feature selection method to develop a classification model for an allosteric protein in different functional states. This results in a much improved model with comparable accuracy (Acc) and only 28 selected features, compared with 289 selected features in a previous study. The Acc reaches 91.50% with 1936 selected features, which is far higher than that of baseline methods. In addition, grouping protein amino acids into secondary structures provides additional interpretability of the selected features. The selected features are verified as associated with key allosteric residues through comparison with both experimental and computational works on the model protein, and demonstrate the effectiveness and necessity of applying rigorous feature selection and evaluation methods to complex chemical systems.
Affiliation(s)
- Fangyun Bai
- Department of Management Science and Engineering, Tongji University. Fangyun Bai and Kin Ming Puk contributed equally to this work
- Jin Liu
- Department of Pharmaceutical Sciences, University of North Texas System College of Pharmacy, University of North Texas Health Science Center
- Hongyu Zhou
- Department of Chemistry, Center for Scientific Computation, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University
- Peng Tao
- Department of Chemistry, Center for Scientific Computation, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University
- Wenyong Zhou
- Department of Management Science and Engineering, Tongji University
- Shouyi Wang
- Corresponding author: Shouyi Wang, Department of Industrial, Manufacturing and Systems Engineering, University of Texas at Arlington.
|
26
|
Oide M, Sugita Y. Protein Folding Intermediates on the Dimensionality Reduced Landscape with UMAP and Native Contact Likelihood. J Chem Phys 2022; 157:075101. [DOI: 10.1063/5.0099094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
To understand protein folding mechanisms from molecular dynamics (MD) simulations, it is important to explore not only folded/unfolded states but also representative intermediate structures on the conformational landscape. Here, we propose a novel approach to construct the landscape using the uniform manifold approximation and projection (UMAP) method, which reduces the dimensionality without losing data-point proximity. In the approach, native contact likelihood is used as feature variables rather than the conventional Cartesian coordinates or dihedral angles of protein structures. We tested the performance of UMAP for coarse-grained MD simulation trajectories of B1 domain in protein G and observed on-pathway transient structures and other metastable states on the UMAP conformational landscape. In contrast, these structures were not clearly distinguished on the dimensionality reduced landscape using principal component analysis (PCA) or time-lagged independent component analysis (tICA). This approach is also useful to obtain dynamical information through Markov State Modeling and would be applicable to large-scale conformational changes in many other biomacromolecules.
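To make the idea of contact-based feature variables concrete, here is a minimal sketch that turns one structure into a vector of smooth per-contact scores. The abstract does not define native contact likelihood, so the sigmoidal form, the parameters `r0` and `beta`, and the function name are assumptions for illustration only, not the definition used by the authors.

```python
import math

def contact_features(coords, native_pairs, r0=8.0, beta=1.0):
    """Return one smooth contact score in (0, 1) per native residue pair.

    coords       : list of (x, y, z) positions, one per residue
    native_pairs : list of (i, j) index pairs in contact in the native state
    r0, beta     : hypothetical midpoint distance and steepness of the switch
    """
    feats = []
    for i, j in native_pairs:
        d = math.dist(coords[i], coords[j])
        # Score approaches 1 for d << r0 and 0 for d >> r0.
        feats.append(1.0 / (1.0 + math.exp(beta * (d - r0))))
    return feats

# Toy frame with three residues: pair (0, 1) is formed, pair (0, 2) is broken.
frame = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(contact_features(frame, [(0, 1), (0, 2)]))
```

Stacking such vectors over all trajectory frames gives a feature matrix that could then be passed to a dimensionality-reduction method such as UMAP, PCA, or tICA.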
Affiliation(s)
- Yuji Sugita
  - Theoretical Molecular Science Laboratory, RIKEN, Japan

27
Baltrukevich H, Podlewska S. From Data to Knowledge: Systematic Review of Tools for Automatic Analysis of Molecular Dynamics Output. Front Pharmacol 2022; 13:844293. [PMID: 35359865 PMCID: PMC8960308 DOI: 10.3389/fphar.2022.844293] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/27/2021] [Accepted: 01/26/2022] [Indexed: 12/02/2022]
Abstract
The growing number of available crystal structures, together with the increase in computational power available for computer-aided drug design, has led to intensive use of structure-based drug design tools in drug development pipelines. Docking and molecular dynamics simulations, key representatives of the structure-based approaches, provide detailed information about the potential interaction of a ligand with a target receptor. However, they require a three-dimensional structure of the protein and a relatively large amount of computational resources. As both docking and molecular dynamics are now used much more extensively, the amount of data output by these procedures is also growing. Consequently, there are more and more approaches that facilitate the analysis and interpretation of the results of structure-based tools. In this review, we comprehensively summarize approaches for handling molecular dynamics simulation output. The review covers both statistical and machine-learning-based tools, as well as various forms of depiction of molecular dynamics output.
Affiliation(s)
- Hanna Baltrukevich
  - Maj Institute of Pharmacology, Polish Academy of Sciences, Kraków, Poland
  - Faculty of Pharmacy, Chair of Technology and Biotechnology of Medical Remedies, Jagiellonian University Medical College in Krakow, Kraków, Poland
- Sabina Podlewska
  - Maj Institute of Pharmacology, Polish Academy of Sciences, Kraków, Poland

28
Ni D, Liu Y, Kong R, Yu Z, Lu S, Zhang J. Computational elucidation of allosteric communication in proteins for allosteric drug design. Drug Discov Today 2022; 27:2226-2234. [DOI: 10.1016/j.drudis.2022.03.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/25/2021] [Revised: 01/22/2022] [Accepted: 03/17/2022] [Indexed: 02/07/2023]

29
Ebrahimi S, Lim G, Hobbs BP, Lin SH, Mohan R, Cao W. A hybrid deep learning model for forecasting lymphocyte depletion during radiation therapy. Med Phys 2022; 49:3507-3522. [PMID: 35229311 DOI: 10.1002/mp.15584] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/03/2021] [Revised: 01/21/2022] [Accepted: 02/20/2022] [Indexed: 11/06/2022]
Abstract
PURPOSE: Recent studies have shown that severe depletion of the absolute lymphocyte count (ALC) induced by radiation therapy (RT) is associated with poor overall survival of patients with many solid tumors. In this paper, we aimed to predict radiation-induced lymphocyte depletion in esophageal cancer patients during the course of RT based on patient characteristics and dosimetric features. METHODS: We proposed a hybrid deep learning model in a stacked structure to predict a trend toward ALC depletion based on the clinical information before or at the early stages of RT treatment. The proposed model consisted of four channels, one based on a long short-term memory (LSTM) network and three based on neural networks, to process four categories of features, followed by a dense layer to integrate the outputs of the four channels and predict the weekly ALC values. Moreover, a discriminative kernel was developed to extract temporal features and assign different weights to each part of the input sequence, which enabled the model to focus on the most relevant parts. The proposed model was trained and tested on a dataset of 860 esophageal cancer patients who received concurrent chemoradiotherapy. RESULTS: The performance of the proposed model was evaluated based on several important prediction metrics and compared to other commonly used prediction models. The results showed that the proposed model outperformed off-the-shelf prediction methods with at least a 30% reduction in the mean squared error (MSE) of weekly ALC predictions based on pretreatment data. Moreover, using an extended model based on augmented first-week treatment data reduced the MSE of predictions by 70% compared to the model based on the pretreatment data. CONCLUSIONS: Our model performed well in predicting radiation-induced lymphocyte depletion for RT treatment planning. The ability to predict ALC will enable physicians to evaluate individual RT treatment plans for lymphopenia risk and to identify patients at high risk who would benefit from modified treatment approaches.
Affiliation(s)
- Saba Ebrahimi
  - Department of Industrial Engineering, University of Houston, Houston, Texas, USA
- Gino Lim
  - Department of Industrial Engineering, University of Houston, Houston, Texas, USA
- Brian P Hobbs
  - Department of Population Health, The University of Texas at Austin, Austin, Texas, USA
- Steven H Lin
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Radhe Mohan
  - Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Wenhua Cao
  - Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA

30
Feng Q, Qian C, Fan S. A Hypoxia-Related Long Non-Coding RNAs Signature Associated With Prognosis in Lower-Grade Glioma. Front Oncol 2021; 11:771512. [PMID: 34869006 PMCID: PMC8640178 DOI: 10.3389/fonc.2021.771512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/06/2021] [Accepted: 11/01/2021] [Indexed: 12/19/2022]
Abstract
Accumulating evidence suggests that the hypoxic microenvironment and long non-coding RNAs (lncRNAs) play critical roles in tumor development. Herein, we aim to develop a hypoxia-related lncRNA (HRL) model to predict the survival outcomes of patients with lower-grade glioma (LGG). The RNA-sequencing data of 505 LGG samples were acquired from The Cancer Genome Atlas (TCGA). Using consensus clustering based on the expression of hypoxia-related mRNAs, these samples were divided into three subsets that exhibit distinct hypoxia content, clinicopathologic features, and survival status. The differentially expressed lncRNAs across the subgroups were documented as candidate HRLs. With LASSO regression analysis, eight informative lncRNAs were selected for constructing the prognostic HRL model. This signature performed well in predicting LGG patients’ overall survival in the TCGA cohort, and similar results were achieved in two validation cohorts from the Chinese Glioma Genome Atlas. The HRL model also showed correlations with important clinicopathologic characteristics such as patients’ age, tumor grade, IDH mutation, 1p/19q codeletion, MGMT methylation, and tumor progression risk. Functional enrichment analysis indicated that the HRL signature was mainly involved in regulation of inflammatory response, complement, hypoxia, Kras signaling, and apical junction. More importantly, the signature was related to immune cell infiltration, estimated immune score, tumor mutation burden, neoantigen load, and expression of immune checkpoints and immunosuppressive cytokines. Finally, a nomogram was developed by integrating the HRL signature and clinicopathologic features, with a concordance index of 0.852 for estimating the survival probability of LGG patients. In conclusion, our study established an effective HRL model for prognosis assessment of LGG patients, which may provide insights for future research and facilitate the design of individualized treatment.
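For readers unfamiliar with the concordance index quoted for the nomogram, the sketch below shows one common way to compute it (Harrell's C) from survival times, event indicators, and predicted risk scores. The function name and the toy data are assumptions for illustration; the abstract does not state which implementation or tie-handling convention the authors used.

```python
def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable patient pairs ordered correctly.

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk scores (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i had an observed event before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy cohort: risk scores perfectly ordered against survival time.
print(concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.4, 0.1]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported value of 0.852.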
Affiliation(s)
- Qinglin Feng
  - Department of Neurosurgery, Chongqing University Three Gorges Hospital & Chongqing Three Gorges Central Hospital, Chongqing, China
- Cheng Qian
  - Department of Cardiology, The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Shibing Fan
  - Department of Neurosurgery & Chongqing Municipality Clinical Research Center for Geriatric Diseases, Chongqing University Three Gorges Hospital, and School of Medicine Chongqing University, Chongqing, China

31
Bai X, Yin Y. Exploration and augmentation of pharmacological space via adversarial auto-encoder model for facilitating kinase-centric drug development. J Cheminform 2021; 13:95. [PMID: 34872613 PMCID: PMC8650415 DOI: 10.1186/s13321-021-00574-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/10/2020] [Accepted: 11/20/2021] [Indexed: 11/10/2022]
Abstract
Predicting compound-protein interactions (CPIs) is of great importance for drug discovery and repositioning, yet still challenging mainly due to the sparse nature of CPI matrixes, resulting in poor generalization performance. Hence, unlike typical CPI prediction models focused on representation learning or model selection, we propose a deep neural network-based strategy, PCM-AAE, that re-explores and augments the pharmacological space of kinase inhibitors by introducing the adversarial auto-encoder model (AAE) to improve the generalization of the prediction model. To complete the data space, we constructed Ensemble of PCM-AAE (EPA), an ensemble model that quickly and accurately yields quantitative predictions of binding affinity between any human kinase and inhibitor. In rigorous internal validation, EPA showed excellent performance, consistently outperforming the model trained with the imbalanced set, especially for targets with relatively fewer training data points. Improved prediction accuracy of EPA for external datasets enhances its generalization ability, making it possible to gracefully handle previously unseen kinases and inhibitors. EPA showed promising potential when directly applied to virtual screening and off-target prediction, exhibiting its practicality in hit prediction. Our strategy is expected to facilitate kinase-centric drug development, as well as to solve more challenging prediction problems with insufficient data points.
Affiliation(s)
- Xinyu Bai
  - Department of Pathology, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, 100191, China
  - Institute of Systems Biomedicine, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, 100191, People's Republic of China
- Yuxin Yin
  - Department of Pathology, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, 100191, China
  - Institute of Systems Biomedicine, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, 100191, People's Republic of China
  - Peking-Tsinghua Center for Life Sciences, Peking University Health Science Center, Beijing, 100191, China

32
Nassir N, Bankapur A, Samara B, Ali A, Ahmed A, Inuwa IM, Zarrei M, Safizadeh Shabestari SA, AlBanna A, Howe JL, Berdiev BK, Scherer SW, Woodbury-Smith M, Uddin M. Single-cell transcriptome identifies molecular subtype of autism spectrum disorder impacted by de novo loss-of-function variants regulating glial cells. Hum Genomics 2021; 15:68. [PMID: 34802461 PMCID: PMC8607722 DOI: 10.1186/s40246-021-00368-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/30/2021] [Accepted: 11/05/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND: In recent years, several hundred autism spectrum disorder (ASD) implicated genes have been discovered impacting a wide range of molecular pathways. However, the molecular underpinning of ASD, particularly from the point of view of 'brain to behaviour' pathogenic mechanisms, remains largely unknown. METHODS: We undertook a study to investigate patterns of spatiotemporal and cell type expression of ASD-implicated genes by integrating large-scale brain single-cell transcriptomes (> million cells) and de novo loss-of-function (LOF) ASD variants (impacting 852 genes from 40,122 cases). RESULTS: We identified multiple single-cell clusters from three distinct developmental human brain regions (anterior cingulate cortex, middle temporal gyrus and primary visual cortex) that evidenced high evolutionary constraint through enrichment for brain critical exons and high pLI genes. These clusters also showed significant enrichment with ASD loss-of-function variant genes (p < 5.23 × 10⁻¹¹) that are transcriptionally highly active in prenatal brain regions (visual cortex and dorsolateral prefrontal cortex). Mapping ASD de novo LOF variant genes into large-scale human and mouse brain single-cell transcriptome analysis demonstrates enrichment of such genes in neuronal subtypes and also in a subtype of non-neuronal glial cell types (astrocyte, p < 6.40 × 10⁻¹¹; oligodendrocyte, p < 1.31 × 10⁻⁹). CONCLUSION: Among the ASD genes enriched with pathogenic de novo LOF variants (i.e. KANK1, PLXNB1), a subgroup has restricted transcriptional regulation in non-neuronal cell types that are evolutionarily conserved. This association strongly suggests the involvement of a subtype of non-neuronal glial cells in the pathogenesis of ASD and the need to explore other biological pathways for this disorder.
Affiliation(s)
- Nasna Nassir
  - College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
- Asma Bankapur
  - College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
- Bisan Samara
  - Biomedical Engineering Department, McGill University, Montréal, QC, Canada
- Abdulrahman Ali
  - College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
- Awab Ahmed
  - College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
- Ibrahim M Inuwa
  - College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
- Mehdi Zarrei
  - The Centre for Applied Genomics (TCAG), The Hospital for Sick Children, Toronto, ON, Canada
  - Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Ammar AlBanna
  - Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
  - The Mental Health Center of Excellence, Al Jalila Children's Speciality Hospital, Dubai, UAE
- Jennifer L Howe
  - The Centre for Applied Genomics (TCAG), The Hospital for Sick Children, Toronto, ON, Canada
  - Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Bakhrom K Berdiev
  - College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
- Stephen W Scherer
  - The Centre for Applied Genomics (TCAG), The Hospital for Sick Children, Toronto, ON, Canada
  - Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
  - Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Marc Woodbury-Smith
  - The Centre for Applied Genomics (TCAG), The Hospital for Sick Children, Toronto, ON, Canada
  - Biosciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Mohammed Uddin
  - College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
  - Cellular Intelligence (Ci) Lab, GenomeArc Inc., Toronto, ON, Canada

33
Single-Cell Proteomic Profiling Identifies Nanoparticle Enhanced Therapy for Triple Negative Breast Cancer Stem Cells. Cells 2021; 10:cells10112842. [PMID: 34831064 PMCID: PMC8616083 DOI: 10.3390/cells10112842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2021] [Revised: 10/09/2021] [Accepted: 10/19/2021] [Indexed: 11/16/2022]
Abstract
Breast cancer remains a major cause of cancer-related deaths in women worldwide. Chemotherapy-promoted stemness and enhanced stem cell plasticity in breast cancer are a cause for great concern. The discovery of drugs targeting breast cancer stem cells (BCSCs) has been suggested as an important advancement toward establishing therapy that improves the efficacy of chemotherapy. In this work, using single-cell mass cytometry, we observed that stemness in spheroid-forming cells derived from MDA-MB-231 cells was significantly increased after doxorubicin administration, and up-regulated integrin αvβ3 expression was also observed. An RGD-included nanoparticle (CS-V) was designed and was found to promote doxorubicin’s efficacy against MDA-MB-231 spheroid cells. These observations suggest that the combination of RGD-included nanoparticles (CS-V) with the chemo-drug doxorubicin could be developed as a potential therapy for breast cancer.

34
Chen M. Collective variable-based enhanced sampling and machine learning. Eur Phys J B 2021; 94:211. [PMID: 34697536 PMCID: PMC8527828 DOI: 10.1140/epjb/s10051-021-00220-w] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 04/02/2021] [Accepted: 10/03/2021] [Indexed: 05/14/2023]
Abstract
Collective variable-based enhanced sampling methods have been widely used to study thermodynamic properties of complex systems. The efficiency and accuracy of these enhanced sampling methods are affected by two factors: constructing appropriate collective variables for enhanced sampling and generating accurate free energy surfaces. Recently, many machine learning techniques have been developed to improve the quality of collective variables and the accuracy of free energy surfaces. Although machine learning has achieved great success in improving enhanced sampling methods, there are still many challenges and open questions. In this perspective, we review recent developments in integrating machine learning techniques and collective variable-based enhanced sampling approaches. We also discuss challenges and future research directions, including generating kinetic information, exploring high-dimensional free energy surfaces, and efficiently sampling all-atom configurations.
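As a concrete reminder of what a free energy surface along a collective variable (CV) means, the sketch below estimates a one-dimensional profile from samples via F(s) = -kT ln P(s). It assumes the samples come from an unbiased simulation; the function name, the binning scheme, and the reduced units (`kT = 1`) are choices made for this illustration and are not taken from the perspective itself.

```python
import math
from collections import Counter

def free_energy_profile(cv_samples, bin_width, kT=1.0):
    """Estimate F(s) = -kT * ln P(s) on a regular grid of CV bins.

    cv_samples : non-empty list of CV values from an unbiased trajectory
    bin_width  : width of each histogram bin
    Returns a dict mapping the left edge of each occupied bin to its free
    energy, shifted so that the most populated bin sits at zero.
    """
    counts = Counter(math.floor(s / bin_width) for s in cv_samples)
    total = sum(counts.values())
    f = {b: -kT * math.log(c / total) for b, c in counts.items()}
    f_min = min(f.values())
    return {b * bin_width: v - f_min for b, v in sorted(f.items())}

# Toy data: a state near s = 0 that is four times as populated as one near s = 1.
samples = [0.1] * 80 + [1.1] * 20
print(free_energy_profile(samples, bin_width=1.0))
```

With a 4:1 population ratio the less populated state lies kT ln 4 above the other. Samples from a biased run would first need reweighting, which is one of the places where the choice of CV and bias affects accuracy.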
Affiliation(s)
- Ming Chen
  - Department of Chemistry, Purdue University, West Lafayette, IN 47907 USA

35
Zhu K, Liu X, Liu C, Xu Y, Fu Y, Dong W, Yan Y, Wang W, Qian C. AKT inhibitor AZD5363 suppresses stemness and promotes anti-cancer activity of 3,3'-diindolylmethane in human breast cancer cells. Toxicol Appl Pharmacol 2021; 429:115700. [PMID: 34464674 DOI: 10.1016/j.taap.2021.115700] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/15/2021] [Revised: 08/19/2021] [Accepted: 08/21/2021] [Indexed: 01/16/2023]
Abstract
3,3'-diindolylmethane (DIM) is a dimer compound converted from indole-3-carbinol that has been studied as a promising chemo-preventive agent against breast cancer. In this study, using a CyTOF assay, we observed that the proportion of the CD133+Nanog+ subpopulation in MCF-7 cells was significantly increased after DIM administration, with up-regulated AKT activity. SPADE analysis revealed that this stem-like subpopulation exhibited apoptosis resistance against DIM treatment. When DIM was combined with the AKT inhibitor AZD5363, DIM-induced CD133 expression could be suppressed. In addition, a combination treatment of MCF-7 and MDA-MB-231 breast cancer cells with DIM and AZD5363 showed synergistic decreases in cell proliferation and induced apoptosis. Furthermore, results from imaging flow cytometry suggested that FoxO3a nuclear localization and PUMA expression could be enhanced by the combination of AZD5363 with DIM. Taken together, the above observations suggest that the combination of AZD5363 with DIM could be developed as a potential therapy for breast cancer.
Affiliation(s)
- Kaiyuan Zhu
  - Department of Breast Cancer Surgery, Harbin Medical University Cancer Hospital, Harbin Medical University, Harbin, Heilongjiang 150086, China
- Xu Liu
  - Department of Breast Cancer Surgery, Harbin Medical University Cancer Hospital, Harbin Medical University, Harbin, Heilongjiang 150086, China
- Chunxiao Liu
  - Department of Breast Cancer Surgery, Harbin Medical University Cancer Hospital, Harbin Medical University, Harbin, Heilongjiang 150086, China
- Yuting Xu
  - Department of Breast Cancer Surgery, Harbin Medical University Cancer Hospital, Harbin Medical University, Harbin, Heilongjiang 150086, China
- Yingqiang Fu
  - Department of Breast Cancer Surgery, Harbin Medical University Cancer Hospital, Harbin Medical University, Harbin, Heilongjiang 150086, China
- Wei Dong
  - Department of Breast Cancer Surgery, Harbin Medical University Cancer Hospital, Harbin Medical University, Harbin, Heilongjiang 150086, China
- Yadong Yan
  - Beijing Institute of Hepatology, Beijing YouAn Hospital, Capital Medical University, Beijing 100069, China
- Wenjing Wang
  - Beijing Institute of Hepatology, Beijing YouAn Hospital, Capital Medical University, Beijing 100069, China
- Cheng Qian
  - Department of Breast Cancer Surgery, Harbin Medical University Cancer Hospital, Harbin Medical University, Harbin, Heilongjiang 150086, China
  - North China Translational Medicine Research Center of Harbin Medical University, Harbin Medical University, Harbin, Heilongjiang 150086, China

36
Glielmo A, Husic BE, Rodriguez A, Clementi C, Noé F, Laio A. Unsupervised Learning Methods for Molecular Simulation Data. Chem Rev 2021; 121:9722-9758. [PMID: 33945269 PMCID: PMC8391792 DOI: 10.1021/acs.chemrev.0c01195] [Citation(s) in RCA: 141] [Impact Index Per Article: 35.3] [Received: 11/05/2020] [Indexed: 12/21/2022]
Abstract
Unsupervised learning is becoming an essential tool to analyze the increasingly large amounts of data produced by atomistic and molecular simulations in materials science, solid state physics, biophysics, and biochemistry. In this Review, we provide a comprehensive overview of the methods of unsupervised learning that have been most commonly used to investigate simulation data and indicate likely directions for further developments in the field. In particular, we discuss feature representation of molecular systems and present state-of-the-art algorithms for dimensionality reduction, density estimation, clustering, and kinetic models. We divide our discussion into self-contained sections, each discussing a specific method. In each section, we briefly touch upon the mathematical and algorithmic foundations of the method, highlight its strengths and limitations, and describe the specific ways in which it has been used, or can be used, to analyze molecular simulation data.
Affiliation(s)
- Aldo Glielmo
  - International School for Advanced Studies (SISSA) 34014 Trieste, Italy
- Brooke E. Husic
  - Freie Universität Berlin, Department of Mathematics and Computer Science, 14195 Berlin, Germany
- Alex Rodriguez
  - International Centre for Theoretical Physics (ICTP), Condensed Matter and Statistical Physics Section, 34100 Trieste, Italy
- Cecilia Clementi
  - Freie Universität Berlin, Department for Physics, 14195 Berlin, Germany
  - Rice University Houston, Department of Chemistry, Houston, Texas 77005, United States
- Frank Noé
  - Freie Universität Berlin, Department of Mathematics and Computer Science, 14195 Berlin, Germany
  - Freie Universität Berlin, Department for Physics, 14195 Berlin, Germany
  - Rice University Houston, Department of Chemistry, Houston, Texas 77005, United States
- Alessandro Laio
  - International School for Advanced Studies (SISSA) 34014 Trieste, Italy
  - International Centre for Theoretical Physics (ICTP), Condensed Matter and Statistical Physics Section, 34100 Trieste, Italy

37
Rydzewski J, Valsson O. Multiscale Reweighted Stochastic Embedding: Deep Learning of Collective Variables for Enhanced Sampling. J Phys Chem A 2021; 125:6286-6302. [PMID: 34213915 PMCID: PMC8389995 DOI: 10.1021/acs.jpca.1c02869] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/30/2021] [Revised: 06/17/2021] [Indexed: 12/29/2022]
Abstract
Machine learning methods provide a general framework for automatically finding and representing the essential characteristics of simulation data. This task is particularly crucial in enhanced sampling simulations. There we seek a few generalized degrees of freedom, referred to as collective variables (CVs), to represent and drive the sampling of the free energy landscape. In theory, these CVs should separate different metastable states and correspond to the slow degrees of freedom of the studied physical process. To this aim, we propose a new method that we call multiscale reweighted stochastic embedding (MRSE). Our work builds upon a parametric version of stochastic neighbor embedding. The technique automatically learns CVs that map a high-dimensional feature space to a low-dimensional latent space via a deep neural network. We introduce several new advancements to stochastic neighbor embedding methods that make MRSE especially suitable for enhanced sampling simulations: (1) weight-tempered random sampling as a landmark selection scheme to obtain training data sets that strike a balance between equilibrium representation and capturing important metastable states lying higher in free energy; (2) a multiscale representation of the high-dimensional feature space via a Gaussian mixture probability model; and (3) a reweighting procedure to account for training data from a biased probability distribution. We show that MRSE constructs low-dimensional CVs that can correctly characterize the different metastable states in three model systems: the Müller-Brown potential, alanine dipeptide, and alanine tetrapeptide.
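The reweighting idea in point (3) can be illustrated with the textbook importance weights for samples drawn under a static bias potential, w_i ∝ exp(V(s_i)/kT). This is a minimal sketch under that assumption; the abstract does not spell out the exact weighting scheme used in MRSE, and the function names and reduced units are hypothetical.

```python
import math

def reweighting_weights(bias_energies, kT=1.0):
    """Normalized weights w_i proportional to exp(V_i / kT).

    bias_energies : bias potential V(s_i) felt by each sampled configuration
    The maximum is subtracted before exponentiating for numerical stability.
    """
    v_max = max(bias_energies)
    unnorm = [math.exp((v - v_max) / kT) for v in bias_energies]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def reweighted_mean(values, weights):
    """Estimate of an unbiased ensemble average from biased samples."""
    return sum(v * w for v, w in zip(values, weights))

# Toy case: the second sample felt a bias of kT * ln 3, so it counts three
# times as much as the first once the bias is removed.
w = reweighting_weights([0.0, math.log(3.0)])
print(w, reweighted_mean([0.0, 1.0], w))
```

Weights of this kind could be attached to training points so that a learned embedding reflects the unbiased distribution rather than the biased one.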
Affiliation(s)
- Jakub Rydzewski
  - Institute of Physics, Faculty of Physics, Astronomy and Informatics, Nicolaus Copernicus University, Grudziadzka 5, 87-100 Torun, Poland
- Omar Valsson
  - Max Planck Institute for Polymer Research, Ackermannweg 10, Mainz D-55128, Germany

38
Trozzi F, Wang F, Verkhivker G, Zoltowski BD, Tao P. Dimeric allostery mechanism of the plant circadian clock photoreceptor ZEITLUPE. PLoS Comput Biol 2021; 17:e1009168. [PMID: 34310591 PMCID: PMC8341706 DOI: 10.1371/journal.pcbi.1009168] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/02/2021] [Revised: 08/05/2021] [Accepted: 06/10/2021] [Indexed: 11/19/2022]
Abstract
In Arabidopsis thaliana, the Light-Oxygen-Voltage (LOV) domain containing protein ZEITLUPE (ZTL) integrates light quality, intensity, and duration into regulation of the circadian clock. Recent structural and biochemical studies of ZTL indicate that the protein diverges from other members of the LOV superfamily in its allosteric mechanism, and that the divergent allosteric mechanism hinges upon conservation of two signaling residues G46 and V48 that alter dynamic motions of a Gln residue implicated in signal transduction in all LOV proteins. Here, we delineate the allosteric mechanism of ZTL via an integrated computational approach that employs atomistic simulations of wild type and allosteric variants of ZTL in the functional dark and light states, together with Markov state and supervised machine learning classification models. This approach has unveiled key factors of the ZTL allosteric mechanisms, and identified specific interactions and residues implicated in functional allosteric changes. The final results reveal atomic level insights into allosteric mechanisms of ZTL function that operate via a non-trivial combination of population-shift and dynamics-driven allosteric pathways.
Affiliation(s)
- Francesco Trozzi
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas, United States of America
- Feng Wang
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas, United States of America
- Gennady Verkhivker
  - Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California, United States of America
  - Chapman University School of Pharmacy, Irvine, California, United States of America
- Brian D. Zoltowski
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas, United States of America
- Peng Tao
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas, United States of America

39
Kaushik A, Dunham D, He Z, Manohar M, Desai M, Nadeau KC, Andorf S. CyAnno: A semi-automated approach for cell type annotation of mass cytometry datasets. Bioinformatics 2021; 37:4164-4171. [PMID: 34037686 PMCID: PMC9502137 DOI: 10.1093/bioinformatics/btab409] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/06/2020] [Revised: 04/04/2021] [Accepted: 05/24/2021] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: For immune system monitoring in large-scale studies at single-cell resolution using CyTOF, (semi-)automated computational methods are applied for annotating live cells of mixed cell types. Here, we show that the live cell pool can be highly enriched with undefined heterogeneous cells, i.e., 'ungated' cells, and that current semi-automated approaches do not model them, resulting in misclassified annotations. RESULTS: We introduce 'CyAnno', a novel semi-automated approach for deconvoluting the unlabeled cytometry dataset based on a machine learning framework utilizing manually gated training data that allows the integrative modeling of 'gated' cell types and the 'ungated' cells. By applying this framework to several CyTOF datasets, we demonstrated that including the 'ungated' cells can lead to a significant increase in the precision of the 'gated' cell type predictions. CyAnno can be used to identify even a single cell type, including rare cells, with higher efficacy than current state-of-the-art semi-automated approaches. AVAILABILITY: CyAnno is available as a Python script with a user manual and sample dataset at https://github.com/abbioinfo/CyAnno. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Abhinav Kaushik
  - Sean N Parker Center for Allergy and Asthma Research at Stanford University, Stanford University, Stanford, CA 94305-5101, USA
- Diane Dunham
  - Sean N Parker Center for Allergy and Asthma Research at Stanford University, Stanford University, Stanford, CA 94305-5101, USA
- Ziyuan He
  - Sean N Parker Center for Allergy and Asthma Research at Stanford University, Stanford University, Stanford, CA 94305-5101, USA
- Monali Manohar
  - Sean N Parker Center for Allergy and Asthma Research at Stanford University, Stanford University, Stanford, CA 94305-5101, USA
- Manisha Desai
  - Quantitative Sciences Unit, Stanford University, Stanford, CA 94305-5101, USA
- Kari C Nadeau
  - Sean N Parker Center for Allergy and Asthma Research at Stanford University, Stanford University, Stanford, CA 94305-5101, USA
- Sandra Andorf
  - Sean N Parker Center for Allergy and Asthma Research at Stanford University, Stanford University, Stanford, CA 94305-5101, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Divisions of Biomedical Informatics and Allergy & Immunology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
|
40
|
Trozzi F, Wang X, Tao P. UMAP as a Dimensionality Reduction Tool for Molecular Dynamics Simulations of Biomacromolecules: A Comparison Study. J Phys Chem B 2021; 125:5022-5034. [PMID: 33973773 PMCID: PMC8356557 DOI: 10.1021/acs.jpcb.1c02081] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Indexed: 01/01/2023]
Abstract
Proteins are the molecular machines of life. The multitude of possible conformations that proteins can adopt determines their free-energy landscapes. However, the inherently high dimensionality of a protein free-energy landscape poses a challenge to deciphering how proteins perform their functions. For this reason, dimensionality reduction is an active field of research for molecular biologists. The uniform manifold approximation and projection (UMAP) is a dimensionality reduction method based on a fuzzy topological analysis of data. In the present study, the performance of UMAP is compared with that of other popular dimensionality reduction methods such as t-distributed stochastic neighbor embedding (t-SNE), principal component analysis (PCA), and time-structure independent components analysis (tICA) in the context of analyzing molecular dynamics simulations of the circadian clock protein VIVID. A good dimensionality reduction method should accurately represent the data structure on the projected components. The comparison of the raw high-dimensional data with the projections obtained using different dimensionality reduction methods based on various metrics showed that UMAP has superior performance when compared with linear reduction methods (PCA and tICA) and has competitive performance and scalable computational cost.
Affiliation(s)
- Francesco Trozzi
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas, 75275, United States of America
- Xinlei Wang
  - Department of Statistical Science, Southern Methodist University, Dallas, Texas, 75275, United States of America
- Peng Tao
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas, 75275, United States of America
|
41
|
Bhakat S. Pepsin-like aspartic proteases (PAPs) as model systems for combining biomolecular simulation with biophysical experiments. RSC Adv 2021; 11:11026-11047. [PMID: 35423571 PMCID: PMC8695779 DOI: 10.1039/d0ra10359d] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/08/2020] [Accepted: 02/21/2021] [Indexed: 01/26/2023] Open
Abstract
Pepsin-like aspartic proteases (PAPs) are a class of aspartic proteases that share tremendous structural similarity with human pepsin. One of the key structural features of PAPs is the presence of a β-hairpin motif, otherwise known as the flap. The biological function of the PAPs is highly dependent on the conformational dynamics of the flap region. In apo PAPs, the conformational dynamics of the flap is dominated by the rotational degrees of freedom associated with the χ1 and χ2 angles of a conserved Tyr (or Phe in some cases). However, it is plausible that dihedral order parameters associated with several other residues might play crucial roles in the conformational dynamics of apo PAPs. Due to their size, the complexities associated with their conformational dynamics, and their clinical significance (drug targets for malaria, Alzheimer's disease, etc.), PAPs provide a challenging testing ground for computational and experimental methods focusing on understanding conformational dynamics and molecular recognition in biomolecules. The opening of the flap region is necessary to accommodate the substrate/ligand in the active site of the PAPs. The big challenge is to gain atomistic details into how reversible ligand binding/unbinding (molecular recognition) affects the conformational dynamics. Recent reports of kinetic (Ki, Kd) and thermodynamic parameters (ΔH, TΔS, and ΔG) associated with macro-cyclic ligands bound to BACE1 (a member of the PAP family) provide a perfect challenge (how to deal with big ligands with multiple torsional angles and select optimum order parameters to study reversible ligand binding/unbinding) for computational methods to predict binding free energies and kinetics beyond typical test systems, e.g., benzamide-trypsin. In this work, I reviewed several order parameters which were proposed to capture the conformational dynamics and molecular recognition in PAPs. I further highlighted how machine learning methods can be used as order parameters in the context of PAPs. I then proposed some open ideas and challenges in the context of molecular simulation and put forward my case on how biophysical experiments, e.g., NMR and time-resolved FRET, can be used in conjunction with biomolecular simulation to gain complete atomistic insights into the conformational dynamics of PAPs.
Affiliation(s)
- Soumendranath Bhakat
  - Division of Biophysical Chemistry, Center for Molecular Protein Science, Department of Chemistry, Lund University, P. O. Box 124, SE-22100 Lund, Sweden; +46-769608418
|
42
|
Zhou Q, Yan X, Liu W, Yin W, Xu H, Cheng D, Jiang X, Ren C. Three Immune-Associated Subtypes of Diffuse Glioma Differ in Immune Infiltration, Immune Checkpoint Molecules, and Prognosis. Front Oncol 2020; 10:586019. [PMID: 33425739 PMCID: PMC7786360 DOI: 10.3389/fonc.2020.586019] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 07/22/2020] [Accepted: 11/19/2020] [Indexed: 12/29/2022] Open
Abstract
Diffuse glioma is one of the most prevalent malignancies of the brain, with high heterogeneity of tumor-infiltrating immune cells. However, immune-associated subtypes of diffuse glioma have not been determined, nor has the effect of different immune-associated subtypes on disease prognosis and immune infiltration of diffuse glioma patients. We retrieved the expression profiles of immune-related genes from The Cancer Genome Atlas (TCGA) (n = 672) and GSE16011 (n = 268) cohorts and used them to identify subtypes of diffuse glioma via Consensus Cluster Plus analysis. We used the limma, clusterProfiler, ESTIMATE, and survival packages of R for differential analysis, functional enrichment, immune and stromal score evaluation, and log-rank tests, respectively, across the three immune subtypes of diffuse glioma. The immune-associated features of diffuse glioma in the two cohorts were characterized via bioinformatic analyses of the mRNA expression data of immune-related genes. Three subtypes (C1–3) of diffuse glioma were identified from TCGA data, and were verified using the GSE16011 cohort. We then evaluated their immune characteristics and clinical features. Our mRNA profiling analyses indicated that the different subtypes of diffuse glioma presented differential expression profiles of specific genes and signaling pathways in the TCGA cohort. Patients with subtype C1, who were mostly diagnosed with grade IV glioma, had poorer outcomes than patients with subtype C2 or C3. Subtype C1 was characterized by a higher degree of immune cell infiltration as estimated by GSVA, and more frequent wildtype IDH1. By contrast, subtype C3 included more grade II and IDH1-mutated glioma, and was associated with more infiltration of CD4+ T cells. Most subtype C2 cases had features intermediate between subtypes C1 and C3. Meanwhile, immune checkpoints and their ligand molecules, including PD1/(PD-L1/PDL2), CTLA4/(CD80/CD86), and B7H3/TLT2, were significantly upregulated in subtype C1 and downregulated in subtype C3. In addition, patients with subtype C1 exhibited more frequent gene mutations. Univariate and multivariate Cox regression analyses revealed that diffuse glioma subtype was an effective, independent, and better prognostic factor. Therefore, we established a novel immune-related classification of diffuse glioma, which provides potential immunotherapy targets for diffuse glioma.
Affiliation(s)
- Quanwei Zhou
  - Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China
- Xuejun Yan
  - Cancer Research Institute, School of Basic Medical Science, Central South University, Changsha, China
  - The NHC Key Laboratory of Carcinogenesis and The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Xiangya Hospital, Central South University, Changsha, China
- Weidong Liu
  - Cancer Research Institute, School of Basic Medical Science, Central South University, Changsha, China
  - The NHC Key Laboratory of Carcinogenesis and The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Xiangya Hospital, Central South University, Changsha, China
- Wen Yin
  - Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China
- Hongjuan Xu
  - Cancer Research Institute, School of Basic Medical Science, Central South University, Changsha, China
  - The NHC Key Laboratory of Carcinogenesis and The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Xiangya Hospital, Central South University, Changsha, China
- Damei Cheng
  - Cancer Research Institute, School of Basic Medical Science, Central South University, Changsha, China
  - The NHC Key Laboratory of Carcinogenesis and The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Xiangya Hospital, Central South University, Changsha, China
- Xingjun Jiang
  - Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China
- Caiping Ren
  - Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China
  - Cancer Research Institute, School of Basic Medical Science, Central South University, Changsha, China
  - The NHC Key Laboratory of Carcinogenesis and The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Xiangya Hospital, Central South University, Changsha, China
|
43
|
Conformational Landscapes of Halohydrin Dehalogenases and Their Accessible Active Site Tunnels. Catalysts 2020. [DOI: 10.3390/catal10121403] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 01/11/2023] Open
Abstract
Halohydrin dehalogenases (HHDH) are industrially relevant biocatalysts exhibiting a promiscuous epoxide-ring opening reactivity in the presence of small nucleophiles, thus giving access to novel carbon–carbon, carbon–oxygen, carbon–nitrogen, and carbon–sulfur bonds. Recently, the repertoire of HHDH has been expanded, providing access to some novel HHDH subclasses exhibiting a broader epoxide substrate scope. In this work, we develop a computational approach based on the application of linear and non-linear dimensionality reduction techniques to long time-scale Molecular Dynamics (MD) simulations to study the HHDH conformational landscapes. We couple the analysis of the conformational landscapes to CAVER calculations to assess their impact on the active site tunnels and potential ability towards bulky epoxide ring opening reaction. Our study indicates that the analyzed HHDHs subclasses share a common breathing motion of the halide binding pocket, but present large deviations in the loops adjacent to the active site pocket and N-terminal regions. Such conformational differences affect the available tunnels for epoxide binding to the active site. The superior activity of the HHDH G subclass towards bulkier substrates is explained by the additional structural elements delimiting the active site region, its rich conformational heterogeneity, and the substantially wider and frequently observed active site tunnels. This study therefore provides key information for HHDH promiscuity and engineering.
|
45
|
Fodeh SJ, Al-Garadi M, Elsankary O, Perrone J, Becker W, Sarker A. Utilizing a multi-class classification approach to detect therapeutic and recreational misuse of opioids on Twitter. Comput Biol Med 2020; 129:104132. [PMID: 33290931 DOI: 10.1016/j.compbiomed.2020.104132] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/20/2020] [Revised: 11/10/2020] [Accepted: 11/16/2020] [Indexed: 10/23/2022]
Abstract
BACKGROUND: Opioid misuse (OM) is a major health problem in the United States, and can lead to addiction and fatal overdose. We sought to employ natural language processing (NLP) and machine learning to categorize Twitter chatter based on the motive of OM. MATERIALS AND METHODS: We collected data from Twitter using opioid-related keywords, and manually annotated 6988 tweets into three classes (No-OM, Pain-related-OM, and Recreational-OM), with the No-OM class representing tweets indicating no use/misuse, and the Pain-related-OM and Recreational-OM classes representing misuse for pain or recreation/addiction, respectively. We trained and evaluated multi-class classifiers, and performed term-level k-means clustering to assess whether there were terms closely associated with the three classes. RESULTS: On a held-out test set of 1677 tweets, a transformer-based classifier (XLNet) achieved the best performance, with an F1-score of 0.71 for the Pain-related-OM class and 0.79 for the Recreational-OM class. Macro- and micro-averaged F1-scores over all classes were 0.82 and 0.92, respectively. Content analysis using clustering revealed distinct clusters of terms associated with each class. DISCUSSION: While some past studies have attempted to automatically detect opioid misuse, none have further characterized the motive for misuse. Our multi-class classification approach using XLNet showed promising performance, including in detecting the subtle differences between pain-related and recreation-related misuse. The distinct clustering of class-specific keywords may help conduct targeted data collection, overcoming under-representation of minority classes. CONCLUSION: Machine learning can help identify pain-related and recreation-related OM content on Twitter to potentially enable the study of the characteristics of individuals exhibiting such behavior.
Affiliation(s)
- Samah Jamal Fodeh
  - Department of Emergency Medicine, Yale School of Medicine, Yale University, New Haven, CT 06510, USA
  - VA Connecticut Healthcare System, West Haven, CT 06516, USA
- Mohammed Al-Garadi
  - Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA 30322, USA
- Osama Elsankary
  - Frank Netter M.D. School of Medicine, Quinnipiac University, North Haven, CT 06473, USA
- Jeanmarie Perrone
  - Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- William Becker
  - VA Connecticut Healthcare System, West Haven, CT 06516, USA
- Abeed Sarker
  - Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA 30322, USA
|
46
|
Ao C, Zhou W, Gao L, Dong B, Yu L. Prediction of antioxidant proteins using hybrid feature representation method and random forest. Genomics 2020; 112:4666-4674. [DOI: 10.1016/j.ygeno.2020.08.016] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 07/23/2020] [Revised: 08/10/2020] [Accepted: 08/13/2020] [Indexed: 12/19/2022]
|
47
|
Song Z, Zhou H, Tian H, Wang X, Tao P. Unraveling the energetic significance of chemical events in enzyme catalysis via machine-learning based regression approach. Commun Chem 2020; 3:134. [PMID: 36703376 PMCID: PMC9814854 DOI: 10.1038/s42004-020-00379-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 06/12/2020] [Accepted: 09/11/2020] [Indexed: 01/29/2023] Open
Abstract
The bacterial enzyme class of β-lactamases are involved in benzylpenicillin acylation reactions, which are currently being revisited using hybrid quantum mechanical molecular mechanical (QM/MM) chain-of-states pathway optimizations. Minimum energy pathways are sampled by reoptimizing pathway geometry under different representative protein environments obtained through constrained molecular dynamics simulations. Predictive potential energy surface models in the reaction space are trained with machine-learning regression techniques. Herein, using TEM-1/benzylpenicillin acylation reaction as the model system, we introduce two model-independent criteria for delineating the energetic contributions and correlations in the predicted reaction space. Both methods are demonstrated to effectively quantify the energetic contribution of each chemical process and identify the rate limiting step of enzymatic reaction with high degrees of freedom. The consistency of the current workflow is tested under seven levels of quantum chemistry theory and three non-linear machine-learning regression models. The proposed approaches are validated to provide qualitative compliance with experimental mutagenesis studies.
Affiliation(s)
- Zilin Song
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, TX, 75275, USA
- Hongyu Zhou
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, TX, 75275, USA
- Hao Tian
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, TX, 75275, USA
- Xinlei Wang
  - Department of Statistical Science, Southern Methodist University, Dallas, TX, 75275, USA
- Peng Tao
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, TX, 75275, USA
|
48
|
Duan C, Chaovalitwongse WA, Bai F, Hippe DS, Wang S, Thammasorn P, Pierce LA, Liu X, You J, Miyaoka RS, Vesselle HJ, Kinahan PE, Rengan R, Zeng J, Bowen SR. Sensitivity analysis of FDG PET tumor voxel cluster radiomics and dosimetry for predicting mid-chemoradiation regional response of locally advanced lung cancer. Phys Med Biol 2020; 65:205007. [PMID: 33027064 PMCID: PMC7593986 DOI: 10.1088/1361-6560/abb0c7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/24/2022]
Abstract
We investigated the sensitivity of regional tumor response prediction to variability in voxel clustering techniques, imaging features, and machine learning algorithms in 25 patients with locally advanced non-small cell lung cancer (LA-NSCLC) enrolled on the FLARE-RT clinical trial. Metabolic tumor volumes (MTV) from pre-chemoradiation (PETpre) and mid-chemoradiation fluorodeoxyglucose-positron emission tomography (FDG PET) images (PETmid) were subdivided into K-means or hierarchical voxel clusters by standardized uptake values (SUV) and 3D-positions. MTV cluster separability was evaluated by CH index, and morphologic changes were captured by Dice similarity and centroid Euclidean distance. PETpre conventional features included SUVmean, MTV/MTV cluster size, and mean radiation dose. PETpre radiomics consisted of 41 intensity histogram and 3D texture features (PET Oncology Radiomics Test Suite) extracted from MTV or MTV clusters. Machine learning models (multiple linear regression, support vector regression, logistic regression, support vector machines) of conventional features or radiomic features were constructed to predict PETmid response. Leave-one-out-cross-validated root-mean-squared-error (RMSE) for continuous response regression (ΔSUVmean) and area-under-receiver-operating-characteristic-curve (AUC) for binary response classification were calculated. K-means MTV 2-clusters (MTVhi, MTVlo) achieved maximum CH index separability (Friedman p < 0.001). Between PETpre and PETmid, MTV cluster pairs overlapped (Dice 0.70-0.87) and migrated 0.6-1.1 cm. PETmid ΔSUVmean response prediction was superior in MTV and MTVlo (RMSE = 0.17-0.21) compared to MTVhi (RMSE = 0.42-0.52, Friedman p < 0.001). PETmid ΔSUVmean response class prediction performance trended higher in MTVlo (AUC = 0.83-0.88) compared to MTVhi (AUC = 0.44-0.58, Friedman p = 0.052). 
Models were more sensitive to MTV/MTV cluster regions (Friedman p = 0.026) than feature sets/algorithms (Wilcoxon signed-rank p = 0.36). Top-ranked radiomic features included GLZSM-LZHGE (large-zone-high-SUV), GTSDM-CP (cluster-prominence), GTSDM-CS (cluster-shade) and NGTDM-CNT (contrast). Top-ranked features were consistent between MTVhi and MTVlo cluster pairs but varied between MTVhi-MTVlo clusters, reflecting distinct regional radiomic phenotypes. Variability in tumor voxel cluster response prediction can inform robust radiomic target definition for risk-adaptive chemoradiation in patients with LA-NSCLC. FLARE-RT trial: NCT02773238.
Affiliation(s)
- Chunyan Duan
  - Department of Mechanical Engineering, Tongji University School of Mechanical Engineering, Shanghai, China
  - Department of Industrial Engineering, University of Arkansas College of Engineering, Fayetteville, AR
  - Department of Radiation Oncology, University of Washington School of Medicine, Seattle, WA
- W. Art Chaovalitwongse
  - Department of Industrial Engineering, University of Arkansas College of Engineering, Fayetteville, AR
- Fangyun Bai
  - Department of Management Science and Engineering, Tongji University School of Economics and Management, Shanghai, China
  - Department of Industrial, Manufacturing, & Systems Engineering, University of Texas at Arlington College of Engineering, Arlington, TX
- Daniel S. Hippe
  - Department of Radiology, University of Washington School of Medicine, Seattle, WA
- Shouyi Wang
  - Department of Industrial, Manufacturing, & Systems Engineering, University of Texas at Arlington College of Engineering, Arlington, TX
- Phawis Thammasorn
  - Department of Industrial Engineering, University of Arkansas College of Engineering, Fayetteville, AR
- Larry A. Pierce
  - Department of Radiology, University of Washington School of Medicine, Seattle, WA
- Xiao Liu
  - Department of Industrial Engineering, University of Arkansas College of Engineering, Fayetteville, AR
- Jianxin You
  - Department of Management Science and Engineering, Tongji University School of Economics and Management, Shanghai, China
- Robert S. Miyaoka
  - Department of Radiology, University of Washington School of Medicine, Seattle, WA
- Hubert J. Vesselle
  - Department of Radiology, University of Washington School of Medicine, Seattle, WA
- Paul E. Kinahan
  - Department of Radiology, University of Washington School of Medicine, Seattle, WA
- Ramesh Rengan
  - Department of Radiation Oncology, University of Washington School of Medicine, Seattle, WA
- Jing Zeng
  - Department of Radiation Oncology, University of Washington School of Medicine, Seattle, WA
- Stephen R. Bowen
  - Department of Radiation Oncology, University of Washington School of Medicine, Seattle, WA
  - Department of Radiology, University of Washington School of Medicine, Seattle, WA
|
49
|
Tian H, Trozzi F, Zoltowski BD, Tao P. Deciphering the Allosteric Process of the Phaeodactylum tricornutum Aureochrome 1a LOV Domain. J Phys Chem B 2020; 124:8960-8972. [PMID: 32970438 DOI: 10.1021/acs.jpcb.0c05842] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 02/07/2023]
Abstract
The conformation-driven allosteric protein aureochrome 1a from the diatom Phaeodactylum tricornutum (PtAu1a) differs from other light-oxygen-voltage (LOV) proteins in its uncommon structural topology. Because of this dissimilarity, the mechanism of signal transduction in the PtAu1a LOV domain (AuLOV), including its flanking helices, remains unclear, which hinders the study of PtAu1a as an optogenetic tool. To clarify this mechanism, we employed a combination of tree-based machine learning models, Markov state models, machine-learning-based community analysis, and transition path theory to quantitatively analyze the allosteric process. Our results are in good agreement with the reported experimental findings and reveal a previously overlooked Cα helix and protein linkers as important in promoting the protein conformational changes. This integrated approach can be considered a general workflow and applied to other allosteric proteins to provide detailed information about their allosteric mechanisms.
Affiliation(s)
- Hao Tian
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
- Francesco Trozzi
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
- Brian D Zoltowski
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
- Peng Tao
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
|
50
|
Tian H, Tao P. ivis Dimensionality Reduction Framework for Biomacromolecular Simulations. J Chem Inf Model 2020; 60:4569-4581. [PMID: 32820912 DOI: 10.1021/acs.jcim.0c00485] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
Abstract
Molecular dynamics (MD) simulations have been widely applied to study macromolecules including proteins. However, the high dimensionality of the data sets produced by simulations makes thorough analysis difficult and further hinders a deeper understanding of biomacromolecules. To gain more insights into the protein structure-function relations, appropriate dimensionality reduction methods are needed to project simulations onto low-dimensional spaces. Linear dimensionality reduction methods, such as principal component analysis (PCA) and time-structure-based independent component analysis (t-ICA), could not preserve sufficient structural information. Though better than linear methods, nonlinear methods, such as t-distributed stochastic neighbor embedding (t-SNE), still suffer from the limitations in avoiding system noise and keeping inter-cluster relations. ivis is a novel deep learning-based dimensionality reduction method originally developed for single-cell data sets. Here, we applied this framework for the study of light, oxygen, and voltage (LOV) domains of diatom Phaeodactylum tricornutum aureochrome 1a (PtAu1a). Compared with other methods, ivis is shown to be superior in constructing a Markov state model (MSM), preserving information of both local and global distances, and maintaining similarity between high and low dimensions with the least information loss. Moreover, the ivis framework is capable of providing new perspectives for deciphering residue-level protein allostery through the feature weights in the neural network. Overall, ivis is a promising member of the analysis toolbox for proteins.
Affiliation(s)
- Hao Tian
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75205, United States
- Peng Tao
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75205, United States
|