Hu GM, Secario MK, Chen CM. SeQuery: an interactive graph database for visualizing the GPCR superfamily.
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2020;
2019:5522636. [PMID:
31236561 PMCID:
PMC6591535 DOI:
10.1093/database/baz073]
[Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/13/2019] [Revised: 04/28/2019] [Accepted: 05/16/2019] [Indexed: 01/01/2023]
Abstract
The rate at which new protein and gene sequences are being discovered has grown explosively in the omics era, which has increasingly complicated the efficient characterization and analysis of their biological properties. In this study, we propose a web-based graphical database tool, SeQuery, for intuitively visualizing proteome/genome networks by integrating the sequential, structural and functional information of sequences. As a demonstration of our tool’s effectiveness, we constructed a graph database of G protein-coupled receptor (GPCR) sequences by integrating data from the UniProt, GPCRdb and RCSB PDB databases. Our tool attempts to achieve two goals: (i) given the sequence of a query protein, correctly and efficiently identify whether the protein is a GPCR, and, if so, define its sequential and functional roles in the GPCR superfamily; and (ii) present a panoramic view of the GPCR superfamily and its network centralities that allows users to explore the superfamily at various resolutions. Such a bottom-up-to-top-down view can provide the users with a comprehensive understanding of the GPCR superfamily through interactive navigation of the graph database. A test of SeQuery with the GPCR2841 dataset shows that it correctly identifies 99 out of 100 queried protein sequences. The developed tool is readily applicable to other biological networks, and we aim to expand SeQuery by including additional biological databases in the near future.
Collapse