For: Hou J, Sims GE, Zhang C, Kim SH. A global representation of the protein fold space. Proc Natl Acad Sci U S A 2003;100:2386-90. [PMID: 12606708 PMCID: PMC151350 DOI: 10.1073/pnas.2628030100] [Citation(s) in RCA: 112] [Impact Index Per Article: 5.3] [Indexed: 11/18/2022]

Cited by Other Article(s)

Porter LL. Fluid protein fold space and its implications. Bioessays 2023;45:e2300057. [PMID: 37431685 PMCID: PMC10529699 DOI: 10.1002/bies.202300057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 06/21/2023] [Accepted: 06/23/2023] [Indexed: 07/12/2023]

Koehler Leman J, Szczerbiak P, Renfrew PD, Gligorijevic V, Berenberg D, Vatanen T, Taylor BC, Chandler C, Janssen S, Pataki A, Carriero N, Fisk I, Xavier RJ, Knight R, Bonneau R, Kosciolek T. Sequence-structure-function relationships in the microbial protein universe. Nat Commun 2023;14:2351. [PMID: 37100781 PMCID: PMC10133388 DOI: 10.1038/s41467-023-37896-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 05/18/2022] [Accepted: 04/05/2023] [Indexed: 04/28/2023]

Affiliation(s)

Julia Koehler Leman: Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA; Department of Biology, New York University, New York, NY, USA
Pawel Szczerbiak: Malopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
P Douglas Renfrew: Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA; Department of Biology, New York University, New York, NY, USA
Vladimir Gligorijevic: Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA; Prescient Design, a Genentech accelerator, New York, NY, 10010, USA
Daniel Berenberg: Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA; Prescient Design, a Genentech accelerator, New York, NY, 10010, USA; Center for Data Science, New York University, New York, NY, 10011, USA; Courant Institute of Mathematical Sciences, Department of Computer Science, New York University, New York, NY, USA
Tommi Vatanen: Broad Institute, Cambridge, MA, USA; Liggins Institute, University of Auckland, Auckland, New Zealand; Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, 00014 University of Helsinki, Helsinki, Finland
Bryn C Taylor: Department of Pediatrics, University of California San Diego, La Jolla, CA, USA; In Silico Discovery and External Innovation, Janssen Research and Development, San Diego, CA, 92122, USA
Chris Chandler: Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
Stefan Janssen: Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, 92093, USA; Algorithmic Bioinformatics, Justus Liebig University Giessen, Giessen, Germany
Andras Pataki: Scientific Computing Core, Flatiron Institute, Simons Foundation, New York, NY, USA
Nick Carriero: Scientific Computing Core, Flatiron Institute, Simons Foundation, New York, NY, USA
Ian Fisk: Scientific Computing Core, Flatiron Institute, Simons Foundation, New York, NY, USA
Ramnik J Xavier: Broad Institute, Cambridge, MA, USA; Center for Microbiome Informatics and Therapeutics, MIT, Cambridge, MA, 02139, USA
Rob Knight: Department of Pediatrics, University of California San Diego, La Jolla, CA, USA; Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, 92093, USA; Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA; Department of Bioengineering, University of California, San Diego, USA
Richard Bonneau: Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA; Department of Biology, New York University, New York, NY, USA; Center for Data Science, New York University, New York, NY, 10011, USA; Courant Institute of Mathematical Sciences, Department of Computer Science, New York University, New York, NY, USA; Prescient Design, a Genentech accelerator, New York, NY, 10010, USA
Tomasz Kosciolek: Malopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland

Sykes J, Holland BR, Charleston MA. A review of visualisations of protein fold networks and their relationship with sequence and function. Biol Rev Camb Philos Soc 2023;98:243-262. [PMID: 36210328 PMCID: PMC10092621 DOI: 10.1111/brv.12905] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/30/2021] [Revised: 09/08/2022] [Accepted: 09/09/2022] [Indexed: 01/12/2023]

Pražnikar J, Attygalle NT. Quantitative analysis of visual codewords of a protein distance matrix. PLoS One 2022;17:e0263566. [PMID: 35120181 PMCID: PMC8815937 DOI: 10.1371/journal.pone.0263566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2021] [Accepted: 01/24/2022] [Indexed: 12/02/2022]

Carrillo-Cabada H, Benson J, Razavi AM, Mulligan B, Cuendet MA, Weinstein H, Taufer M, Estrada T. A Graphic Encoding Method for Quantitative Classification of Protein Structure and Representation of Conformational Changes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021;18:1336-1349. [PMID: 31603792 PMCID: PMC9119144 DOI: 10.1109/tcbb.2019.2945291] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]

Cao Y, Das P, Chenthamarakshan V, Chen PY, Melnyk I, Shen Y. Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design. Proceedings of Machine Learning Research 2021;139:1261-1271. [PMID: 34423306 PMCID: PMC8375603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]

Searching protein space for ancient sub-domain segments. Curr Opin Struct Biol 2021;68:105-112. [PMID: 33476896 DOI: 10.1016/j.sbi.2020.11.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/11/2020] [Accepted: 11/29/2020] [Indexed: 01/08/2023]

Shukla P, Verma S, Kumar M. A rotation based regularization method for semi-supervised learning. Pattern Anal Appl 2021;24:887-905. [PMID: 33424433 PMCID: PMC7781196 DOI: 10.1007/s10044-020-00947-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2019] [Accepted: 12/09/2020] [Indexed: 12/01/2022]

Karimi M, Zhu S, Cao Y, Shen Y. De Novo Protein Design for Novel Folds Using Guided Conditional Wasserstein Generative Adversarial Networks. J Chem Inf Model 2020;60:5667-5681. [PMID: 32945673 PMCID: PMC7775287 DOI: 10.1021/acs.jcim.0c00593] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 01/04/2023]

Abstract

Although massive data is quickly accumulating on protein sequence and structure, there is a small and limited number of protein architectural types (or structural folds). This study is addressing the following question: how well could one reveal underlying sequence-structure relationships and design protein sequences for an arbitrary, potentially novel, structural fold? In response to the question, we have developed novel deep generative models, namely, semisupervised gcWGAN (guided, conditional, Wasserstein Generative Adversarial Networks). To overcome training difficulties and improve design qualities, we build our models on conditional Wasserstein GAN (WGAN) that uses Wasserstein distance in the loss function. Our major contributions include (1) constructing a low-dimensional and generalizable representation of the fold space for the conditional input, (2) developing an ultrafast sequence-to-fold predictor (or oracle) and incorporating its feedback into WGAN as a loss to guide model training, and (3) exploiting sequence data with and without paired structures to enable a semisupervised training strategy. Assessed by the oracle over 100 novel folds not in the training set, gcWGAN generates more successful designs and covers 3.5 times more target folds compared to a competing data-driven method (cVAE). Assessed by sequence- and structure-based predictors, gcWGAN designs are physically and biologically sound. Assessed by a structure predictor over representative novel folds, including one not even part of basis folds, gcWGAN designs have comparable or better fold accuracy yet much more sequence diversity and novelty than cVAE. The ultrafast data-driven model is further shown to boost the success of a principle-driven de novo method (RosettaDesign), through generating design seeds and tailoring design space. In conclusion, gcWGAN explores uncharted sequence space to design proteins by learning generalizable principles from current sequence-structure data. 
Data, source codes, and trained models are available at https://github.com/Shen-Lab/gcWGAN.

Exploring Protein Fold Space. Biomolecules 2020;10:biom10020193. [PMID: 32012781 PMCID: PMC7072414 DOI: 10.3390/biom10020193] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/23/2019] [Revised: 01/22/2020] [Accepted: 01/24/2020] [Indexed: 11/17/2022]

A global map of the protein shape universe. PLoS Comput Biol 2019;15:e1006969. [PMID: 30978181 PMCID: PMC6481876 DOI: 10.1371/journal.pcbi.1006969] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 10/15/2018] [Revised: 04/24/2019] [Accepted: 03/20/2019] [Indexed: 11/19/2022]

Gomes CM, Faísca PFN. Protein Folding: An Introduction. Protein Folding 2019. [DOI: 10.1007/978-3-319-00882-0_1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022]

What Can We Learn from Wide-Angle Solution Scattering? Advances in Experimental Medicine and Biology 2018;1009:131-147. [PMID: 29218557 DOI: 10.1007/978-981-10-6038-0_8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 10/18/2022]

Rajendran S, Jothi A. Sequentially distant but structurally similar proteins exhibit fold specific patterns based on their biophysical properties. Comput Biol Chem 2018;75:143-153. [PMID: 29783123 DOI: 10.1016/j.compbiolchem.2018.05.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/25/2017] [Revised: 05/06/2018] [Accepted: 05/07/2018] [Indexed: 11/25/2022]

A proteome view of structural, functional, and taxonomic characteristics of major protein domain clusters. Sci Rep 2017;7:14210. [PMID: 29079755 PMCID: PMC5660162 DOI: 10.1038/s41598-017-13297-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/18/2016] [Accepted: 09/21/2017] [Indexed: 12/28/2022]

Lee J, Konc J, Janežič D, Brooks BR. Global organization of a binding site network gives insight into evolution and structure-function relationships of proteins. Sci Rep 2017;7:11652. [PMID: 28912495 PMCID: PMC5599562 DOI: 10.1038/s41598-017-10412-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/26/2017] [Accepted: 08/07/2017] [Indexed: 01/06/2023]

Garland J. Unravelling the complexity of signalling networks in cancer: A review of the increasing role for computational modelling. Crit Rev Oncol Hematol 2017;117:73-113. [PMID: 28807238 DOI: 10.1016/j.critrevonc.2017.06.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 11/25/2016] [Revised: 06/01/2017] [Accepted: 06/08/2017] [Indexed: 02/06/2023]

Abstract

Cancer induction is a highly complex process involving hundreds of different inducers but whose eventual outcome is the same. Clearly, it is essential to understand how signalling pathways and networks generated by these inducers interact to regulate cell behaviour and create the cancer phenotype. While enormous strides have been made in identifying key networking profiles, the amount of data generated far exceeds our ability to understand how it all "fits together". The number of potential interactions is astronomically large and requires novel approaches and extreme computation methods to dissect them out. However, such methodologies have high intrinsic mathematical and conceptual content which is difficult to follow. This review explains how computation modelling is progressively finding solutions and also revealing unexpected and unpredictable nano-scale molecular behaviours extremely relevant to how signalling and networking are coherently integrated. It is divided into linked sections illustrated by numerous figures from the literature describing different approaches and offering visual portrayals of networking and major conceptual advances in the field. First, the problem of signalling complexity and data collection is illustrated for only a small selection of known oncogenes. Next, new concepts from biophysics, molecular behaviours, kinetics, organisation at the nano level and predictive models are presented. These areas include: visual representations of networking, Energy Landscapes and energy transfer/dissemination (entropy); diffusion, percolation; molecular crowding; protein allostery; quinary structure and fractal distributions; energy management, metabolism and re-examination of the Warburg effect. The importance of unravelling complex network interactions is then illustrated for some widely-used drugs in cancer therapy whose interactions are very extensive. 
Finally, use of computational modelling to develop micro- and nano- functional models ("bottom-up" research) is highlighted. The review concludes that computational modelling is an essential part of cancer research and is vital to understanding network formation and molecular behaviours that are associated with it. Its role is increasingly essential because it is unravelling the huge complexity of cancer induction otherwise unattainable by any other approach.

Dybas JM, Fiser A. Development of a motif-based topology-independent structure comparison method to identify evolutionarily related folds. Proteins 2016;84:1859-1874. [PMID: 27671894 PMCID: PMC5118133 DOI: 10.1002/prot.25169] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/31/2016] [Revised: 08/17/2016] [Accepted: 08/25/2016] [Indexed: 11/09/2022]

Semantic Signature: Comparative Interpretation of Gene Expression on a Semantic Space. Computational and Mathematical Methods in Medicine 2016;2016:5174503. [PMID: 27242916 PMCID: PMC4868886 DOI: 10.1155/2016/5174503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2015] [Accepted: 03/23/2016] [Indexed: 11/17/2022]

Zhou H, Li S, Makowski L. Visualizing global properties of a molecular dynamics trajectory. Proteins 2015;84:82-91. [PMID: 26522428 DOI: 10.1002/prot.24957] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/21/2014] [Revised: 08/13/2015] [Accepted: 10/14/2015] [Indexed: 11/10/2022]

Machine Learnable Fold Space Representation based on Residue Cluster Classes. Comput Biol Chem 2015;59 Pt A:1-7. [PMID: 26366526 DOI: 10.1016/j.compbiolchem.2015.07.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 11/12/2014] [Revised: 07/17/2015] [Accepted: 07/25/2015] [Indexed: 11/21/2022]

Edwards H, Deane CM. Structural Bridges through Fold Space. PLoS Comput Biol 2015;11:e1004466. [PMID: 26372166 PMCID: PMC4570669 DOI: 10.1371/journal.pcbi.1004466] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 02/20/2015] [Accepted: 07/12/2015] [Indexed: 12/05/2022]

Abstract

Several protein structure classification schemes exist that partition the protein universe into structural units called folds. Yet these schemes do not discuss how these units sit relative to each other in a global structure space. In this paper we construct networks that describe such global relationships between folds in the form of structural bridges. We generate these networks using four different structural alignment methods across multiple score thresholds. The networks constructed using the different methods remain a similar distance apart regardless of the probability threshold defining a structural bridge. This suggests that at least some structural bridges are method specific and that any attempt to build a picture of structural space should not be reliant on a single structural superposition method. Despite these differences all representations agree on an organisation of fold space into five principal community structures: all-α, all-β sandwiches, all-β barrels, α/β and α + β. We project estimated fold ages onto the networks and find that not only are the pairings of unconnected folds associated with higher age differences than bridged folds, but this difference increases with the number of networks displaying an edge. We also examine different centrality measures for folds within the networks and how these relate to fold age. While these measures interpret the central core of fold space in varied ways they all identify the disposition of ancestral folds to fall within this core and that of the more recently evolved structures to provide the peripheral landscape. These findings suggest that evolutionary information is encoded along these structural bridges. Finally, we identify four highly central pivotal folds representing dominant topological features which act as key attractors within our landscapes.

Folds are considered to be the structural units which make up the protein universe. Structural classification schemes focus on the assignment and organisation of protein domains into folds. However, they do not suggest how different folds might relate to one another in a global way. We introduce the concept of bridges through fold space: significant similarities between these units. We consider four alignment methods and a dynamic approach to placing these bridges. A greater consensus between these methods cannot be achieved by simply increasing the stringency with which edges are assigned. Instead, we emphasise the importance of considering consensus maps and only report results where there is agreement across all networks. It is possible that a study of the bridges may reveal evolutionary relationships. Based on a phylogenetic analysis of structures, we find that bridges consistently fall between folds which evolved at similar times. Moreover, the landscapes all consist of a core of older folds, with younger structures more often seen at the periphery. Finally we identify four pivotal folds in the landscapes. They contain topological motifs which unite disparate regions of fold space.

Rackovsky S. Nonlinearities in protein space limit the utility of informatics in protein biophysics. Proteins 2015;83:1923-8. [PMID: 26315852 DOI: 10.1002/prot.24916] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/26/2015] [Revised: 08/12/2015] [Accepted: 08/20/2015] [Indexed: 11/08/2022]

Sikosek T, Chan HS. Biophysics of protein evolution and evolutionary protein biophysics. J R Soc Interface 2015;11:20140419. [PMID: 25165599 DOI: 10.1098/rsif.2014.0419] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Indexed: 01/05/2023]

Minami S, Sawada K, Chikenji G. How a spatial arrangement of secondary structure elements is dispersed in the universe of protein folds. PLoS One 2014;9:e107959. [PMID: 25243952 PMCID: PMC4171485 DOI: 10.1371/journal.pone.0107959] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 08/12/2014] [Accepted: 08/18/2014] [Indexed: 11/18/2022]

Ben-Tal N, Kolodny R. Representation of the Protein Universe using Classifications, Maps, and Networks. Isr J Chem 2014. [DOI: 10.1002/ijch.201400001] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 11/07/2022]

Shokry AM, Al-Karim S, Ramadan A, Gadallah N, Al Attas SG, Sabir JSM, Hassan SM, Madkour MA, Bressan R, Mahfouz M, Bahieldin A. Detection of a Usp-like gene in Calotropis procera plant from the de novo assembled genome contigs of the high-throughput sequencing dataset. C R Biol 2014;337:86-94. [PMID: 24581802 DOI: 10.1016/j.crvi.2013.12.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 04/02/2013] [Accepted: 12/20/2013] [Indexed: 11/18/2022]

Affiliation(s)

Ahmed M Shokry: Department of Biological Sciences, Faculty of Science, King Abdulaziz University (KAU), P.O. Box 80141, Jeddah 21589, Saudi Arabia; Agricultural Genetic Engineering Research Institute (AGERI), Agriculture Research Center (ARC), Giza, Egypt
Saleh Al-Karim: Department of Biological Sciences, Faculty of Science, King Abdulaziz University (KAU), P.O. Box 80141, Jeddah 21589, Saudi Arabia
Ahmed Ramadan: Department of Biological Sciences, Faculty of Science, King Abdulaziz University (KAU), P.O. Box 80141, Jeddah 21589, Saudi Arabia; Agricultural Genetic Engineering Research Institute (AGERI), Agriculture Research Center (ARC), Giza, Egypt
Nour Gadallah: Department of Biological Sciences, Faculty of Science, King Abdulaziz University (KAU), P.O. Box 80141, Jeddah 21589, Saudi Arabia; Genetics and Cytology Department, Genetic Engineering and Biotechnology Division, National Research Center, Dokki, Egypt
Sanaa G Al Attas: Department of Biological Sciences, Faculty of Science, King Abdulaziz University (KAU), P.O. Box 80141, Jeddah 21589, Saudi Arabia
Jamal S M Sabir: Department of Biological Sciences, Faculty of Science, King Abdulaziz University (KAU), P.O. Box 80141, Jeddah 21589, Saudi Arabia
Sabah M Hassan: Department of Biological Sciences, Faculty of Science, King Abdulaziz University (KAU), P.O. Box 80141, Jeddah 21589, Saudi Arabia; Department of Genetics, Faculty of Agriculture, Ain Shams University, Cairo, Egypt
Magdy A Madkour: Arid Lands Agricultural Research Institute, Ain Shams University, Cairo, Egypt
Ray Bressan: School of Agriculture, Purdue University, West Lafayette, Indiana, USA
Magdy Mahfouz: Division of Biological and Environmental Sciences and Engineering, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Ahmed Bahieldin: Department of Biological Sciences, Faculty of Science, King Abdulaziz University (KAU), P.O. Box 80141, Jeddah 21589, Saudi Arabia; Department of Genetics, Faculty of Agriculture, Ain Shams University, Cairo, Egypt

Shi JY, Yiu SM, Zhang YN, Chin FYL. Effective moment feature vectors for protein domain structures. PLoS One 2014;8:e83788. [PMID: 24391828 DOI: 10.1371/journal.pone.0083788] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/02/2013] [Accepted: 11/08/2013] [Indexed: 11/19/2022]

Asarnow D, Singh R. The impact of structural diversity and parameterization on maps of the protein universe. BMC Proc 2013;7:S1. [PMID: 24565442 PMCID: PMC4029320 DOI: 10.1186/1753-6561-7-s7-s1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]

Edwards H, Abeln S, Deane CM. Exploring fold space preferences of new-born and ancient protein superfamilies. PLoS Comput Biol 2013;9:e1003325. [PMID: 24244135 PMCID: PMC3828129 DOI: 10.1371/journal.pcbi.1003325] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 04/30/2013] [Accepted: 09/23/2013] [Indexed: 11/18/2022]

Singh R, Yang H, Dalziel B, Asarnow D, Murad W, Foote D, Gormley M, Stillman J, Fisher S. Towards human-computer synergetic analysis of large-scale biological data. BMC Bioinformatics 2013;14 Suppl 14:S10. [PMID: 24267485 PMCID: PMC3851181 DOI: 10.1186/1471-2105-14-s14-s10] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Advances in technology have led to the generation of massive amounts of complex and multifarious biological data in areas ranging from genomics to structural biology. The volume and complexity of such data leads to significant challenges in terms of its analysis, especially when one seeks to generate hypotheses or explore the underlying biological processes. At the state-of-the-art, the application of automated algorithms followed by perusal and analysis of the results by an expert continues to be the predominant paradigm for analyzing biological data. This paradigm works well in many problem domains. However, it also is limiting, since domain experts are forced to apply their instincts and expertise such as contextual reasoning, hypothesis formulation, and exploratory analysis after the algorithm has produced its results. In many areas where the organization and interaction of the biological processes is poorly understood and exploratory analysis is crucial, what is needed is to integrate domain expertise during the data analysis process and use it to drive the analysis itself.

RESULTS

In context of the aforementioned background, the results presented in this paper describe advancements along two methodological directions. First, given the context of biological data, we utilize and extend a design approach called experiential computing from multimedia information system design. This paradigm combines information visualization and human-computer interaction with algorithms for exploratory analysis of large-scale and complex data. In the proposed approach, emphasis is laid on: (1) allowing users to directly visualize, interact, experience, and explore the data through interoperable visualization-based and algorithmic components, (2) supporting unified query and presentation spaces to facilitate experimentation and exploration, (3) providing external contextual information by assimilating relevant supplementary data, and (4) encouraging user-directed information visualization, data exploration, and hypotheses formulation. Second, to illustrate the proposed design paradigm and measure its efficacy, we describe two prototype web applications. The first, called XMAS (Experiential Microarray Analysis System) is designed for analysis of time-series transcriptional data. The second system, called PSPACE (Protein Space Explorer) is designed for holistic analysis of structural and structure-function relationships using interactive low-dimensional maps of the protein structure space. Both these systems promote and facilitate human-computer synergy, where cognitive elements such as domain knowledge, contextual reasoning, and purpose-driven exploration, are integrated with a host of powerful algorithmic operations that support large-scale data analysis, multifaceted data visualization, and multi-source information integration.

CONCLUSIONS

The proposed design philosophy, combines visualization, algorithmic components and cognitive expertise into a seamless processing-analysis-exploration framework that facilitates sense-making, exploration, and discovery. Using XMAS, we present case studies that analyze transcriptional data from two highly complex domains: gene expression in the placenta during human pregnancy and reaction of marine organisms to heat stress. With PSPACE, we demonstrate how complex structure-function relationships can be explored. These results demonstrate the novelty, advantages, and distinctions of the proposed paradigm. Furthermore, the results also highlight how domain insights can be combined with algorithms to discover meaningful knowledge and formulate evidence-based hypotheses during the data analysis process. Finally, user studies against comparable systems indicate that both XMAS and PSPACE deliver results with better interpretability while placing lower cognitive loads on the users. XMAS is available at: http://tintin.sfsu.edu:8080/xmas. PSPACE is available at: http://pspace.info/.

Sequence and structure space model of protein divergence driven by point mutations. J Theor Biol 2013;330:1-8. [DOI: 10.1016/j.jtbi.2013.03.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/27/2012] [Revised: 03/07/2013] [Accepted: 03/18/2013] [Indexed: 12/11/2022]

Mach P, Koehl P. Capturing protein sequence-structure specificity using computational sequence design. Proteins 2013;81:1556-70. [DOI: 10.1002/prot.24307] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/04/2012] [Revised: 03/28/2013] [Accepted: 04/11/2013] [Indexed: 02/05/2023]

Kolodny R, Kosloff M. From Protein Structure to Function via Computational Tools and Approaches. Isr J Chem 2013. [DOI: 10.1002/ijch.201200078] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/08/2023]

Robertson JWF, Kasianowicz JJ, Banerjee S. Analytical Approaches for Studying Transporters, Channels and Porins. Chem Rev 2012;112:6227-49. [DOI: 10.1021/cr300317z] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Indexed: 12/12/2022]

Interaction between soluble and membrane-embedded potassium channel peptides monitored by Fourier transform infrared spectroscopy. PLoS One 2012;7:e49070. [PMID: 23145073 PMCID: PMC3493504 DOI: 10.1371/journal.pone.0049070] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 08/31/2012] [Accepted: 10/08/2012] [Indexed: 11/19/2022] Open

A mapping of an ensemble of mitochondrial sequences for various organisms into 3D space based on the word composition. Mol Phylogenet Evol 2012;65:380-9. [DOI: 10.1016/j.ympev.2012.06.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2012] [Revised: 06/01/2012] [Accepted: 06/25/2012] [Indexed: 11/24/2022]

Classification of protein functional surfaces using structural characteristics. Proc Natl Acad Sci U S A 2012;109:1170-5. [PMID: 22238424 DOI: 10.1073/pnas.1119684109] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022] Open

Wang C, Mao X, Yang A, Niu L, Wang S, Li D, Guo Y, Wang Y, Yang Y, Wang C. Determination of relative binding affinities of labeling molecules with amino acids by using scanning tunneling microscopy. Chem Commun (Camb) 2011;47:10638-40. [PMID: 21869951 DOI: 10.1039/c1cc12380g] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 11/21/2022]

Maps of protein structure space reveal a fundamental relationship between protein structure and function. Proc Natl Acad Sci U S A 2011;108:12301-6. [PMID: 21737750 DOI: 10.1073/pnas.1102727108] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.9] [Indexed: 11/18/2022] Open

Aita T, Nishigaki K. A visualization of 3D proteome universe: mapping of a proteome ensemble into 3D space based on the protein-structure composition. Mol Phylogenet Evol 2011;61:484-94. [PMID: 21762784 DOI: 10.1016/j.ympev.2011.06.020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/18/2011] [Revised: 06/23/2011] [Accepted: 06/25/2011] [Indexed: 10/18/2022]

Nguyen MN, Tan KP, Madhusudhan MS. CLICK--topology-independent comparison of biomolecular 3D structures. Nucleic Acids Res 2011;39:W24-8. [PMID: 21602266 PMCID: PMC3125785 DOI: 10.1093/nar/gkr393] [Citation(s) in RCA: 100] [Impact Index Per Article: 7.7] [Received: 02/26/2011] [Revised: 04/19/2011] [Accepted: 05/03/2011] [Indexed: 01/28/2023] Open

Practical applications of structural genomics technologies for mutagen research. Mutat Res 2011;722:165-70. [PMID: 21182983 DOI: 10.1016/j.mrgentox.2010.12.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2010] [Accepted: 12/10/2010] [Indexed: 11/23/2022]

Newman J. One plate, two plates, a thousand plates. How crystallisation changes with large numbers of samples. Methods 2011;55:73-80. [PMID: 21571072 DOI: 10.1016/j.ymeth.2011.04.004] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 12/24/2010] [Revised: 04/28/2011] [Accepted: 04/29/2011] [Indexed: 11/15/2022] Open

Pelé J, Abdi H, Moreau M, Thybert D, Chabbert M. Multidimensional scaling reveals the main evolutionary pathways of class A G-protein-coupled receptors. PLoS One 2011;6:e19094. [PMID: 21544207 PMCID: PMC3081337 DOI: 10.1371/journal.pone.0019094] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 11/04/2010] [Accepted: 03/16/2011] [Indexed: 11/21/2022] Open

Ikebe J, Standley DM, Nakamura H, Higo J. Ab initio simulation of a 57-residue protein in explicit solvent reproduces the native conformation in the lowest free-energy cluster. Protein Sci 2011;20:187-96. [PMID: 21082745 DOI: 10.1002/pro.553] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 11/10/2022]

Grainger B, Sadowski MI, Taylor WR. Re-evaluating the "rules" of protein topology. J Comput Biol 2010;17:1371-84. [PMID: 20649421 DOI: 10.1089/cmb.2009.0265] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022] Open

Alva V, Remmert M, Biegert A, Lupas AN, Söding J. A galaxy of folds. Protein Sci 2010;19:124-30. [PMID: 19937658 DOI: 10.1002/pro.297] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Indexed: 01/26/2023]

Carugo O. Clustering tendency in the protein fold space. Bioinformation 2010;4:347-51. [PMID: 20975898 PMCID: PMC2951670 DOI: 10.6026/97320630004347] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/20/2009] [Accepted: 07/23/2009] [Indexed: 11/23/2022] Open

Sadowski MI, Taylor WR. Protein structures, folds and fold spaces. J Phys Condens Matter 2010;22:033103. [PMID: 21386276 DOI: 10.1088/0953-8984/22/3/033103] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 05/30/2023]