1. Kabir KL, Ma B, Nussinov R, Shehu A. Fewer Dimensions, More Structures for Improved Discrete Models of Dynamics of Free versus Antigen-Bound Antibody. Biomolecules 2022; 12:biom12071011. [PMID: 35883567 PMCID: PMC9313177 DOI: 10.3390/biom12071011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Revised: 07/12/2022] [Accepted: 07/19/2022] [Indexed: 12/10/2022]
Abstract
Over the past decade, Markov State Models (MSM) have emerged as powerful methodologies to build discrete models of dynamics over structures obtained from Molecular Dynamics trajectories. The identification of macrostates for the MSM is a central decision that impacts the quality of the MSM but depends on both the selected representation of a structure and the clustering algorithm utilized over the featurized structures. Motivated by a large molecular system in its free and bound state, this paper investigates two directions of research, further reducing the representation dimensionality in a non-parametric, data-driven manner and including more structures in the computation. Rigorous evaluation of the quality of obtained MSMs via various statistical tests in a comparative setting firmly shows that fewer dimensions and more structures result in a better MSM. Many interesting findings emerge from the best MSM, advancing our understanding of the relationship between antibody dynamics and antibody–antigen recognition.
Affiliation(s)
- Kazi Lutful Kabir
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
  - Correspondence: ; Tel.: +1-571-201-5070
- Buyong Ma
  - Engineering Research Center of Cell & Therapeutic Antibody, School of Pharmacy, Shanghai Jiaotong University, Shanghai 200240, China
- Ruth Nussinov
  - Computational Structural Biology Section, Cancer Innovation Laboratory, Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA
- Amarda Shehu
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
2. Akhter N, Kabir KL, Chennupati G, Vangara R, Alexandrov BS, Djidjev H, Shehu A. Improved Protein Decoy Selection via Non-Negative Matrix Factorization. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1670-1682. [PMID: 33400654 DOI: 10.1109/tcbb.2020.3049088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
A central challenge in protein modeling research and protein structure prediction in particular is known as decoy selection. The problem refers to selecting biologically-active/native tertiary structures among a multitude of physically-realistic structures generated by template-free protein structure prediction methods. Research on decoy selection is active. Clustering-based methods are popular, but they fail to identify good/near-native decoys on datasets where near-native decoys are severely under-sampled by a protein structure prediction method. Reasonable progress is reported by methods that additionally take into account the internal energy of a structure and employ it to identify basins in the energy landscape organizing the multitude of decoys. These methods, however, incur significant time costs for extracting basins from the landscape. In this paper, we propose a novel decoy selection method based on non-negative matrix factorization. We demonstrate that our method outperforms energy landscape-based methods. In particular, the proposed method addresses both the time cost issue and the challenge of identifying good decoys in a sparse dataset, successfully recognizing near-native decoys for both easy and hard protein targets.
3. Alam FF, Shehu A. Unsupervised multi-instance learning for protein structure determination. J Bioinform Comput Biol 2021; 19:2140002. [PMID: 33568002 DOI: 10.1142/s0219720021400023] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]
Abstract
Many regions of the protein universe remain inaccessible by wet-laboratory or computational structure determination methods. A significant challenge in elucidating these dark regions in silico relates to the ability to discriminate relevant structure(s) among many structures/decoys computed for a protein of interest, a problem known as decoy selection. Clustering decoys based on geometric similarity remains popular. However, it is unclear how exactly to exploit the groups of decoys revealed via clustering to select individual structures for prediction. In this paper, we provide an intuitive formulation of the decoy selection problem as an instance of unsupervised multi-instance learning. We address the problem in three stages, first organizing given decoys of a protein molecule into bags, then identifying relevant bags, and finally drawing individual instances from these bags to offer as prediction. We propose both non-parametric and parametric algorithms for drawing individual instances. Our evaluation utilizes two datasets, one benchmark dataset of ensembles of decoys for a varied list of protein molecules, and a dataset of decoy ensembles for targets drawn from recent CASP competitions. A comparative analysis with state-of-the-art methods reveals that the proposed approach outperforms existing methods, thus warranting further investigation of multi-instance learning to advance our treatment of decoy selection.
Affiliation(s)
- Fardina Fathmiul Alam
  - Department of Computer Science, George Mason University, Fairfax, Virginia 22030, USA
- Amarda Shehu
  - Department of Computer Science, George Mason University, Fairfax, Virginia 22030, USA
4. Akhter N, Chennupati G, Djidjev H, Shehu A. Decoy selection for protein structure prediction via extreme gradient boosting and ranking. BMC Bioinformatics 2020; 21:189. [PMID: 33297949 PMCID: PMC7724862 DOI: 10.1186/s12859-020-3523-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/15/2020] [Accepted: 04/29/2020] [Indexed: 11/10/2022]
Abstract
Background: Identifying one or more biologically-active/native decoys from millions of non-native decoys is one of the major challenges in computational structural biology. The extreme lack of balance in positive and negative samples (native and non-native decoys) in a decoy set makes the problem even more complicated. Consensus methods show varied success in handling the challenge of decoy selection despite some issues associated with clustering large decoy sets and decoy sets that do not show much structural similarity. Recent investigations into energy landscape-based decoy selection approaches show promise. However, lack of generalization over varied test cases remains a bottleneck for these methods. Results: We propose a novel decoy selection method, ML-Select, a machine learning framework that exploits the energy landscape associated with the structure space probed through template-free decoy generation. The proposed method outperforms both clustering and energy ranking-based methods, all the while consistently offering better performance on varied test cases. Moreover, ML-Select shows promising results even for decoy sets consisting mostly of low-quality decoys. Conclusions: ML-Select is a useful method for decoy selection. This work suggests further research in finding more effective ways to adopt machine learning frameworks in achieving robust performance for decoy selection in template-free protein structure prediction.
Affiliation(s)
- Nasrin Akhter
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Gopinath Chennupati
  - Information Sciences (CCS-3) Group, Los Alamos National Laboratory, Bikini Atoll Rd., Los Alamos, NM 87545, USA
- Hristo Djidjev
  - Information Sciences (CCS-3) Group, Los Alamos National Laboratory, Bikini Atoll Rd., Los Alamos, NM 87545, USA
- Amarda Shehu
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
  - Department of Bioengineering, George Mason University, Fairfax, VA 22030, USA
  - School of Systems Biology, George Mason University, Manassas, VA 20110, USA
5. Zaman AB, Kamranfar P, Domeniconi C, Shehu A. Reducing Ensembles of Protein Tertiary Structures Generated De Novo via Clustering. Molecules 2020; 25:E2228. [PMID: 32397410 PMCID: PMC7248879 DOI: 10.3390/molecules25092228] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/14/2020] [Revised: 04/21/2020] [Accepted: 04/28/2020] [Indexed: 11/16/2022]
Abstract
Controlling the quality of tertiary structures computed for a protein molecule remains a central challenge in de-novo protein structure prediction. The rule of thumb is to generate as many structures as can be afforded, effectively acknowledging that having more structures increases the likelihood that some will reside near the sought biologically-active structure. A major drawback of this approach is that computing a large number of structures imposes time and space costs. In this paper, we propose a novel clustering-based approach which we demonstrate to significantly reduce an ensemble of generated structures without sacrificing quality. Evaluations are reported on both benchmark and CASP target proteins. Structure ensembles subjected to the proposed approach and the source code of the proposed approach are publicly available at the links provided in Section 1.
Affiliation(s)
- Ahmed Bin Zaman
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Parastoo Kamranfar
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Carlotta Domeniconi
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Amarda Shehu
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
  - Center for Advancing Human-Machine Partnerships, George Mason University, Fairfax, VA 22030, USA
  - Department of Bioengineering, George Mason University, Fairfax, VA 22030, USA
  - School of Systems Biology, George Mason University, Fairfax, VA 22030, USA
6. Tadepalli S, Akhter N, Barbara D, Shehu A. Anomaly Detection-Based Recognition of Near-Native Protein Structures. IEEE Trans Nanobioscience 2020; 19:562-570. [PMID: 32340957 DOI: 10.1109/tnb.2020.2990642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
The three-dimensional structures populated by a protein molecule determine to a great extent its biological activities. The rich information encoded by protein structure on protein function continues to motivate the development of computational approaches for determining functionally-relevant structures. The majority of structures generated in silico are not relevant. Discriminating relevant/native protein structures from non-native ones is an outstanding challenge in computational structural biology. Inherently, this is a recognition problem that can be addressed under the umbrella of machine learning. In this paper, based on the premise that near-native structures are effectively anomalies, we build on the concept of anomaly detection in machine learning. We propose methods that automatically select relevant subsets, as well as methods that select a single structure to offer as prediction. Evaluations are carried out on benchmark datasets and demonstrate that the proposed methods advance the state of the art. The presented results motivate further building on and adapting concepts and techniques from machine learning to improve recognition of near-native structures in protein structure prediction.
7. Alam FF, Rahman T, Shehu A. Evaluating Autoencoder-Based Featurization and Supervised Learning for Protein Decoy Selection. Molecules 2020; 25:E1146. [PMID: 32143444 PMCID: PMC7179114 DOI: 10.3390/molecules25051146] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/15/2020] [Revised: 02/18/2020] [Accepted: 02/25/2020] [Indexed: 11/24/2022]
Abstract
Rapid growth in molecular structure data is renewing interest in featurizing structure. Featurizations that retain information on biological activity are particularly sought for protein molecules, where decades of research have shown that indeed structure encodes function. Research on featurization of protein structure is active, but here we assess the promise of autoencoders. Motivated by rapid progress in neural network research, we investigate and evaluate autoencoders on yielding linear and nonlinear featurizations of protein tertiary structures. An additional reason we focus on autoencoders as the engine to obtain featurizations is the versatility of their architectures and the ease with which changes to architecture yield linear versus nonlinear features. While open-source neural network libraries, such as Keras, which we employ here, greatly facilitate constructing, training, and evaluating autoencoder architectures and conducting model search, autoencoders have not yet gained popularity in the structural biology community. Here we demonstrate their utility in a practical context. Employing autoencoder-based featurizations, we address the classic problem of decoy selection in protein structure prediction. Utilizing off-the-shelf supervised learning methods, we demonstrate that the featurizations are indeed meaningful and allow detecting active tertiary structures, thus opening the way for further avenues of research.
Affiliation(s)
- Fardina Fathmiul Alam
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Taseef Rahman
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Amarda Shehu
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
  - Center for Advancing Human-Machine Partnerships, George Mason University, Fairfax, VA 22030, USA
  - Department of Bioengineering, George Mason University, Fairfax, VA 22030, USA
  - School of Systems Biology, George Mason University, Fairfax, VA 22030, USA
8. Zaman AB, Shehu A. Building maps of protein structure spaces in template-free protein structure prediction. J Bioinform Comput Biol 2020; 17:1940013. [DOI: 10.1142/s0219720019400134] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
An important goal in template-free protein structure prediction is how to control the quality of computed tertiary structures of a target amino-acid sequence. Despite great advances in algorithmic research, given the size, dimensionality, and inherent characteristics of the protein structure space, this task remains exceptionally challenging. It is current practice to aim to generate as many structures as can be afforded so as to increase the likelihood that some of them will reside near the sought but unknown biologically-active/native structure. When operating within a given computational budget, this is impractical and uninformed by any metrics of interest. In this paper, we propose instead to equip algorithms that generate tertiary structures, also known as decoy generation algorithms, with memory of the protein structure space that they explore. Specifically, we propose an evolving, granularity-controllable map of the protein structure space that makes use of low-dimensional representations of protein structures. Evaluations on diverse target sequences that include recent hard CASP targets show that drastic reductions in storage can be made without sacrificing decoy quality. The presented results make the case that integrating a map of the protein structure space is a promising mechanism to enhance decoy generation algorithms in template-free protein structure prediction.
Affiliation(s)
- Ahmed Bin Zaman
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Amarda Shehu
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
9. Kabir KL, Akhter N, Shehu A. From molecular energy landscapes to equilibrium dynamics via landscape analysis and Markov state models. J Bioinform Comput Biol 2020; 17:1940014. [DOI: 10.1142/s0219720019400146] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]
Abstract
Molecular dynamics (MD) simulation software allows probing the equilibrium structural dynamics of a molecule of interest, revealing how a molecule navigates its structure space one structure at a time. To obtain a broader view of dynamics, typically one needs to launch many such simulations, obtaining many trajectories. A summarization of the equilibrium dynamics requires integrating the information in the various trajectories, and Markov State Models (MSM) are increasingly being used for this task. At its core, the task involves organizing the structures accessed in simulation into structural states, and then constructing a transition probability matrix revealing the transitions between states. While now considered a mature technology and widely used to summarize equilibrium dynamics, the underlying computational process in the construction of an MSM ignores energetics even though the transition of a molecule between two nearby structures in an MD trajectory is governed by the corresponding energies. In this paper, we connect theory with simulation and analysis of equilibrium dynamics. A molecule navigates the energy landscape underlying the structure space. The structural states that are identified via off-the-shelf clustering algorithms need to be connected to thermodynamically-stable and semi-stable (macro)states among which transitions can then be quantified. Leveraging recent developments in the analysis of energy landscapes that identify basins in the landscape, we evaluate the hypothesis that basins, directly tied to stable and semi-stable states, lead to better models of dynamics. Our analysis indicates that basins lead to MSMs of better quality and thus can be useful to further advance this widely-used technology for summarization of molecular equilibrium dynamics.
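The two-step construction this abstract describes (organize structures into states, then build a transition probability matrix) can be illustrated with a minimal sketch. It assumes the trajectories have already been discretized into integer state labels; the function name `estimate_transition_matrix`, the `lag` parameter, and the toy trajectories are hypothetical and are not taken from the paper, which additionally considers energy-landscape basins when defining states.

```python
import numpy as np

def estimate_transition_matrix(trajectories, n_states, lag=1):
    """Row-normalized transition counts between discrete states at a fixed lag.

    Hypothetical helper for illustration only, not the authors' implementation.
    """
    counts = np.zeros((n_states, n_states), dtype=float)
    for traj in trajectories:
        traj = np.asarray(traj, dtype=int)
        # Pair each frame with the frame `lag` steps later and count the jump.
        np.add.at(counts, (traj[:-lag], traj[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    # States with no observed outgoing transitions keep an all-zero row.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Toy state-label trajectories (assumed data).
trajectories = [[0, 0, 1, 1, 2], [2, 1, 0, 0]]
T = estimate_transition_matrix(trajectories, n_states=3)
```

Each row of `T` holds the estimated probabilities of moving from one state to every other state after `lag` steps. A production MSM would typically also check the Markov assumption across several lag times, which this sketch omits.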
Affiliation(s)
- Kazi Lutful Kabir
  - Department of Computer Science, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
- Nasrin Akhter
  - Department of Computer Science, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
- Amarda Shehu
  - Department of Computer Science, Department of Bioengineering, School of Systems Biology, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
10. Akhter N, Chennupati G, Kabir KL, Djidjev H, Shehu A. Unsupervised and Supervised Learning over the Energy Landscape for Protein Decoy Selection. Biomolecules 2019; 9:E607. [PMID: 31615116 PMCID: PMC6843838 DOI: 10.3390/biom9100607] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/05/2019] [Revised: 10/03/2019] [Accepted: 10/04/2019] [Indexed: 11/17/2022]
Abstract
The energy landscape that organizes microstates of a molecular system and governs the underlying molecular dynamics exposes the relationship between molecular form/structure, changes to form, and biological activity or function in the cell. However, several challenges stand in the way of leveraging energy landscapes for relating structure and structural dynamics to function. Energy landscapes are high-dimensional, multi-modal, and often overly-rugged. Deep wells or basins in them do not always correspond to stable structural states but are instead the result of inherent inaccuracies in semi-empirical molecular energy functions. Due to these challenges, energetics is typically ignored in computational approaches addressing long-standing central questions in computational biology, such as protein decoy selection. In the latter, the goal is to determine over a possibly large number of computationally-generated three-dimensional structures of a protein those structures that are biologically-active/native. In recent work, we have recast our attention on the protein energy landscape and its role in helping us to advance decoy selection. Here, we summarize some of our successes so far in this direction via unsupervised learning. More importantly, we further advance the argument that the energy landscape holds valuable information to aid and advance the state of protein decoy selection via novel machine learning methodologies that leverage supervised learning. Our focus in this article is on decoy selection for the purpose of a rigorous, quantitative evaluation of how leveraging protein energy landscapes advances an important problem in protein modeling. However, the ideas and concepts presented here are generally useful to make discoveries in studies aiming to relate molecular structure and structural dynamics to function.
Affiliation(s)
- Nasrin Akhter
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Gopinath Chennupati
  - Information Sciences (CCS-3) Group, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Kazi Lutful Kabir
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Hristo Djidjev
  - Information Sciences (CCS-3) Group, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Amarda Shehu
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
  - Center for Adaptive Human-Machine Partnership, George Mason University, Fairfax, VA 22030, USA
  - Department of Bioengineering, George Mason University, Fairfax, VA 22030, USA
  - School of Systems Biology, George Mason University, Fairfax, VA 22030, USA
11. Akhter N, Hassan L, Rajabi Z, Barbará D, Shehu A. Learning Organizations of Protein Energy Landscapes: An Application on Decoy Selection in Template-Free Protein Structure Prediction. Methods Mol Biol 2019; 1958:147-171. [PMID: 30945218 DOI: 10.1007/978-1-4939-9161-7_8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 03/31/2023]
Abstract
The protein energy landscape, which lifts the protein structure space by associating energies with structures, has been useful in improving our understanding of the relationship between structure, dynamics, and function. Currently, however, it is challenging to automatically extract and utilize the underlying organization of an energy landscape to link the structural states it houses to biological activity. In this chapter, we first report on two computational approaches that extract such an organization, one that ignores energies and operates directly in the structure space and another that operates on the energy landscape associated with the structure space. We then describe two complementary approaches, one based on unsupervised learning and another based on supervised learning. Both approaches utilize the extracted organization to address the problem of decoy selection in template-free protein structure prediction. The presented results make the case that learning organizations of protein energy landscapes advances our ability to link structures to biological activity.
Affiliation(s)
- Nasrin Akhter
  - Department of Computer Science, George Mason University, Fairfax, VA, USA
- Liban Hassan
  - Department of Computer Science, George Mason University, Fairfax, VA, USA
- Zahra Rajabi
  - Department of Computer Science, George Mason University, Fairfax, VA, USA
- Daniel Barbará
  - Department of Computer Science, George Mason University, Fairfax, VA, USA
- Amarda Shehu
  - Department of Computer Science, George Mason University, Fairfax, VA, USA
12. Kabir KL, Hassan L, Rajabi Z, Akhter N, Shehu A. Graph-Based Community Detection for Decoy Selection in Template-Free Protein Structure Prediction. Molecules (Basel, Switzerland) 2019; 24:molecules24050854. [PMID: 30823390 PMCID: PMC6429114 DOI: 10.3390/molecules24050854] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/13/2019] [Revised: 02/14/2019] [Accepted: 02/22/2019] [Indexed: 11/30/2022]
Abstract
Significant efforts in wet and dry laboratories are devoted to resolving molecular structures. In particular, computational methods can now compute thousands of tertiary structures that populate the structure space of a protein molecule of interest. These advances are now allowing us to turn our attention to analysis methodologies that are able to organize the computed structures in order to highlight functionally relevant structural states. In this paper, we propose a methodology that leverages community detection methods, designed originally to detect communities in social networks, to organize computationally probed protein structure spaces. We report a principled comparison of such methods along several metrics on proteins of diverse folds and lengths. We present a rigorous evaluation in the context of decoy selection in template-free protein structure prediction. The results make the case that network-based community detection methods warrant further investigation to advance analysis of protein structure spaces for automated selection of functionally relevant structures.
Affiliation(s)
- Kazi Lutful Kabir
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Liban Hassan
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Zahra Rajabi
  - Department of Information Sciences and Technology, George Mason University, Fairfax, VA 22030, USA
- Nasrin Akhter
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Amarda Shehu
  - Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
  - Department of Bioengineering, George Mason University, Fairfax, VA 22030, USA
  - School of Systems Biology, George Mason University, Fairfax, VA 22030, USA
13. Nussinov R, Tsai CJ, Shehu A, Jang H. Computational Structural Biology: Successes, Future Directions, and Challenges. Molecules 2019; 24:molecules24030637. [PMID: 30759724 PMCID: PMC6384756 DOI: 10.3390/molecules24030637] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/11/2019] [Revised: 02/05/2019] [Accepted: 02/10/2019] [Indexed: 02/06/2023]
Abstract
Computational biology has made powerful advances. Among these, trends in human health have been uncovered through heterogeneous 'big data' integration, and disease-associated genes were identified and classified. Along a different front, the dynamic organization of chromatin is being elucidated to gain insight into the fundamental question of genome regulation. Powerful conformational sampling methods have also been developed to yield a detailed molecular view of cellular processes. When combining these methods with the advancements in the modeling of supramolecular assemblies, including those at the membrane, we are finally able to get a glimpse into how cells' actions are regulated. Perhaps most intriguingly, a major thrust is on to decipher the mystery of how the brain is coded. Here, we aim to provide a broad, yet concise, sketch of modern aspects of computational biology, with a special focus on computational structural biology. We attempt to forecast the areas that computational structural biology will embrace in the future and the challenges that it may face. We skirt details, highlight successes, note failures, and map directions.
Affiliation(s)
- Ruth Nussinov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
  - Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Chung-Jung Tsai
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
- Amarda Shehu
  - Department of Computer Science, Department of Bioengineering, and School of Systems Biology, George Mason University, Fairfax, VA 22030, USA
- Hyunbum Jang
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
14. Precision medicine review: rare driver mutations and their biophysical classification. Biophys Rev 2019; 11:5-19. [PMID: 30610579 PMCID: PMC6381362 DOI: 10.1007/s12551-018-0496-2] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 12/04/2018] [Accepted: 12/18/2018] [Indexed: 02/07/2023]
Abstract
How can biophysical principles help precision medicine identify rare driver mutations? A major tenet of pragmatic approaches to precision oncology and pharmacology is that driver mutations are very frequent. However, frequency is a statistical attribute, not a mechanistic one. Rare mutations can also act through the same mechanism, and as we discuss below, “latent driver” mutations may also follow the same route, with “helper” mutations. Here, we review how biophysics provides mechanistic guidelines that extend precision medicine. We outline principles and strategies, especially focusing on mutations that drive cancer. Biophysics has contributed profoundly to deciphering biological processes. However, driven by data science, precision medicine has skirted some of its major tenets. Data science embodies genomics, tissue- and cell-specific expression levels, making it capable of defining genome- and systems-wide molecular disease signatures. It classifies cancer driver genes/mutations and affected pathways, and its associated protein structural data guide drug discovery. Biophysics complements data science. It considers structures and their heterogeneous ensembles, explains how mutational variants can signal through distinct pathways, and how allo-network drugs can be harnessed. Biophysics clarifies how one mutation—frequent or rare—can affect multiple phenotypic traits by populating conformations that favor interactions with other network modules. It also suggests how to identify such mutations and their signaling consequences. Biophysics offers principles and strategies that can help precision medicine push the boundaries to transform our insight into biological processes and the practice of personalized medicine. By contrast, “phenotypic drug discovery,” which capitalizes on physiological cellular conditions and first-in-class drug discovery, may not capture the proper molecular variant. 
This is because variants of the same protein can express more than one phenotype, and a phenotype can be encoded by several variants.
15. An Energy Landscape Treatment of Decoy Selection in Template-Free Protein Structure Prediction. Computation 2018. [DOI: 10.3390/computation6020039] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/16/2022]