1
Clements J, Goina C, Hubbard PM, Kawase T, Olbris DJ, Otsuna H, Svirskas R, Rokicki K. NeuronBridge: an intuitive web application for neuronal morphology search across large data sets. BMC Bioinformatics 2024; 25:114. [PMID: 38491365 PMCID: PMC10943809 DOI: 10.1186/s12859-024-05732-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Accepted: 03/06/2024] [Indexed: 03/18/2024] Open
Abstract
BACKGROUND Neuroscience research in Drosophila is benefiting from large-scale connectomics efforts using electron microscopy (EM) to reveal all the neurons in a brain and their connections. To exploit this knowledge base, researchers relate a connectome's structure to neuronal function, often by studying individual neuron cell types. Vast libraries of fly driver lines expressing fluorescent reporter genes in sets of neurons have been created and imaged using confocal light microscopy (LM), enabling the targeting of neurons for experimentation. However, creating a fly line for driving gene expression within a single neuron found in an EM connectome remains a challenge, as it typically requires identifying a pair of driver lines where only the neuron of interest is expressed in both. This task and other emerging scientific workflows require finding similar neurons across large data sets imaged using different modalities. RESULTS Here, we present NeuronBridge, a web application for easily and rapidly finding putative morphological matches between large data sets of neurons imaged using different modalities. We describe the functionality and construction of the NeuronBridge service, including its user-friendly graphical user interface (GUI), extensible data model, serverless cloud architecture, and massively parallel image search engine. CONCLUSIONS NeuronBridge fills a critical gap in the Drosophila research workflow and is used by hundreds of neuroscience researchers around the world. We offer our software code, open APIs, and processed data sets for integration and reuse, and provide the application as a service at http://neuronbridge.janelia.org .
Affiliation(s)
- Jody Clements
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, USA
- Cristian Goina
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, USA
- Philip M Hubbard
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, USA
- Takashi Kawase
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, USA
- Donald J Olbris
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, USA
- Hideo Otsuna
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, USA
- Robert Svirskas
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, USA
- Konrad Rokicki
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, USA
2
Wang R, Jiang H, Lu M, Tong J, An S, Wang J, Yu C. MRMPro: a web-based tool to improve the speed of manual calibration for multiple reaction monitoring data analysis by mass spectrometry. BMC Bioinformatics 2024; 25:60. [PMID: 38321388 PMCID: PMC10848457 DOI: 10.1186/s12859-024-05685-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 01/30/2024] [Indexed: 02/08/2024] Open
Abstract
BACKGROUND As a gold-standard quantitative technique based on mass spectrometry, multiple reaction monitoring (MRM) has been widely used in proteomics and metabolomics. In the analysis of MRM data, as no peak picking algorithm can achieve perfect accuracy, manual inspection is necessary to correct the errors. In large cohort analysis scenarios, the time required for manual inspection is often considerable. Apart from the commercial software that comes with mass spectrometers, the open-source and free software Skyline is the most popular software for quantitative omics. However, this software is not optimized for manual inspection of hundreds of samples, and its interactive experience also needs to be improved. RESULTS Here we introduce MRMPro, a web-based MRM data analysis platform for efficient manual inspection. MRMPro supports data analysis of MRM and scheduled MRM data acquired by mass spectrometers of mainstream vendors. With the goal of improving the speed of manual inspection, we implemented a collaborative review system based on cloud architecture, allowing multiple users to review through browsers. To reduce bandwidth usage and improve data retrieval speed, we proposed an MRM data compression algorithm, which reduced data volume by more than 60% and 80% compared to the vendor and mzML formats, respectively. To improve the efficiency of manual inspection, we proposed a retention time drift estimation algorithm based on similarity of chromatograms. The estimated retention time drifts were then used for peak alignment and automatic EIC grouping. Compared with Skyline, MRMPro has higher quantification accuracy and better manual inspection support. CONCLUSIONS In this study, we proposed MRMPro to improve the usability of manual calibration for MRM data analysis. MRMPro is free for non-commercial use. Researchers can access MRMPro through http://mrmpro.csibio.com/. All major mass spectrometry formats (wiff, raw, mzML, etc.) can be analyzed on the platform. The final identification results can be exported to a common .xlsx format for subsequent analysis.
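The abstract does not spell out how retention time drift is estimated from chromatogram similarity. As a rough illustration only, and not MRMPro's actual algorithm, the sketch below slides a sample chromatogram against a reference and keeps the shift with the highest cosine similarity. The function names, the scan-based shift unit, and the toy intensity traces are all assumptions made for this example.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity of two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def estimate_drift(reference, sample, max_shift):
    """Return (shift, similarity) for the shift, in scans, that best aligns
    the sample with the reference. A positive shift means the sample peak
    elutes later than the reference peak."""
    best_shift, best_score = 0, -1.0
    n = len(reference)
    for shift in range(-max_shift, max_shift + 1):
        # Compare reference[i] with sample[i + shift] over the overlap.
        lo = max(0, -shift)
        hi = min(n, len(sample) - shift)
        if hi - lo < 2:
            continue
        score = cosine(reference[lo:hi], sample[lo + shift:hi + shift])
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score

# Toy extracted-ion chromatograms (hypothetical data): same peak, 3 scans later.
reference = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0]
sample = [0, 0, 0, 0, 0, 1, 5, 9, 5, 1, 0, 0]
drift, score = estimate_drift(reference, sample, max_shift=5)
print(drift, round(score, 3))
```

An estimated shift of this kind could then be subtracted from the sample's retention times before peaks are grouped across runs, which is the role the abstract assigns to the drift estimates.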
Affiliation(s)
- Ruimin Wang
  - Shandong First Medical University (SDFMU) & Central Hospital Affiliated to SDFMU, Jinan, China
  - School of Engineering, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Fudan University, Shanghai, China
  - Carbon Silicon (Hangzhou) Biotechnology Co., Ltd., Hangzhou, Zhejiang, China
- Hengxuan Jiang
  - Shandong First Medical University (SDFMU) & Central Hospital Affiliated to SDFMU, Jinan, China
  - Carbon Silicon (Hangzhou) Biotechnology Co., Ltd., Hangzhou, Zhejiang, China
- Miaoshan Lu
  - Shandong First Medical University (SDFMU) & Central Hospital Affiliated to SDFMU, Jinan, China
  - School of Engineering, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Zhejiang University, Hangzhou, Zhejiang, China
  - Carbon Silicon (Hangzhou) Biotechnology Co., Ltd., Hangzhou, Zhejiang, China
- Junjie Tong
  - Shandong First Medical University (SDFMU) & Central Hospital Affiliated to SDFMU, Jinan, China
  - College of Chemistry and Chemical Engineering, Hainan Normal University, Haikou, Hainan, China
- Shaowei An
  - Shandong First Medical University (SDFMU) & Central Hospital Affiliated to SDFMU, Jinan, China
  - School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Institute of Biology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Fudan University, Shanghai, China
  - Carbon Silicon (Hangzhou) Biotechnology Co., Ltd., Hangzhou, Zhejiang, China
- Jinyin Wang
  - Shandong First Medical University (SDFMU) & Central Hospital Affiliated to SDFMU, Jinan, China
  - School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Institute of Biology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Zhejiang University, Hangzhou, Zhejiang, China
  - Carbon Silicon (Hangzhou) Biotechnology Co., Ltd., Hangzhou, Zhejiang, China
- Changbin Yu
  - Shandong First Medical University (SDFMU) & Central Hospital Affiliated to SDFMU, Jinan, China
  - Carbon Silicon (Hangzhou) Biotechnology Co., Ltd., Hangzhou, Zhejiang, China
3
Lu M, Jiang H, Wang R, An S, Wang J, Yu C. Injectiondesign: web service of plate design with optimized stratified block randomization for modern GC/LC-MS-based sample preparation. BMC Bioinformatics 2023; 24:489. [PMID: 38124029 PMCID: PMC10734102 DOI: 10.1186/s12859-023-05598-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 12/04/2023] [Indexed: 12/23/2023] Open
Abstract
BACKGROUND Plate design is a necessary and time-consuming operation for GC/LC-MS-based sample preparation. The implementation of the inter-batch balancing algorithm and the intra-batch randomization algorithm can have a significant impact on the final results. For researchers without programming skills, a stable and efficient online service for plate design is necessary. RESULTS Here we describe InjectionDesign, a free online plate design service focused on GC/LC-MS-based multi-omics experiment design. It offers the ability to separate the position design from the sequence design, making the output more compatible with the requirements of a modern mass spectrometer-based laboratory. In addition, it has implemented an optimized block randomization algorithm, which can be better applied to sample stratification with block randomization for an unbalanced distribution. It is easy to use, with built-in support for common instrument models and quick export to a worksheet. CONCLUSIONS InjectionDesign is an open-source project based on Java. Researchers can get the source code for the project from Github: https://github.com/CSi-Studio/InjectionDesign . A free web service is also provided: http://www.injection.design .
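To make the idea of stratified block randomization concrete, here is a minimal sketch under stated assumptions. It is not InjectionDesign's optimized algorithm, which the abstract does not detail. It deals the shuffled members of each stratum round-robin across plates (inter-batch balancing) and then shuffles the order within each plate (intra-batch randomization). The function name `design_plates`, the sample labels, and the two strata are hypothetical.

```python
import random
from collections import defaultdict

def design_plates(samples, n_plates, seed=0):
    """samples: list of (sample_id, stratum) pairs.
    Returns one list of sample ids per plate."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for sample_id, stratum in samples:
        by_stratum[stratum].append(sample_id)

    plates = [[] for _ in range(n_plates)]
    next_plate = 0
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)  # random choice of who goes to which plate
        for sample_id in members:
            # Round-robin dealing keeps per-stratum counts within one of
            # each other across plates, even for unbalanced strata.
            plates[next_plate].append(sample_id)
            next_plate = (next_plate + 1) % n_plates
    for plate in plates:
        rng.shuffle(plate)  # randomize positions within the plate
    return plates

# Unbalanced toy cohort: 7 cases and 3 controls on 2 plates.
samples = [(f"case{i}", "case") for i in range(7)]
samples += [(f"ctrl{i}", "control") for i in range(3)]
plates = design_plates(samples, n_plates=2, seed=42)
for index, plate in enumerate(plates, start=1):
    print(index, plate)
```

Carrying the round-robin position over from one stratum to the next also keeps the plate sizes within one sample of each other, so no plate fills up with the larger group.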
Affiliation(s)
- Miaoshan Lu
  - Zhejiang University, Hangzhou, Zhejiang, China
  - School of Engineering, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
- Hengxuan Jiang
  - Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
- Ruimin Wang
  - School of Engineering, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Fudan University, Shanghai, China
  - Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
- Shaowei An
  - School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Institute of Biology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Fudan University, Shanghai, China
  - Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
- Jiawei Wang
  - Carbon Silicon (Hangzhou) Biotechnology Co., Ltd, Hangzhou, China
- Changbin Yu
  - Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
4
Wang S, You R, Liu Y, Xiong Y, Zhu S. NetGO 3.0: Protein Language Model Improves Large-scale Functional Annotations. Genomics Proteomics Bioinformatics 2023; 21:349-358. [PMID: 37075830 PMCID: PMC10626176 DOI: 10.1016/j.gpb.2023.04.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Revised: 02/24/2023] [Accepted: 04/07/2023] [Indexed: 04/21/2023]
Abstract
As one of the state-of-the-art automated function prediction (AFP) methods, NetGO 2.0 integrates multi-source information to improve the performance. However, it mainly utilizes the proteins with experimentally supported functional annotations without leveraging valuable information from a vast number of unannotated proteins. Recently, protein language models have been proposed to learn informative representations [e.g., Evolutionary Scale Modeling (ESM)-1b embedding] from protein sequences based on self-supervision. Here, we represented each protein by ESM-1b and used logistic regression (LR) to train a new model, LR-ESM, for AFP. The experimental results showed that LR-ESM achieved comparable performance with the best-performing component of NetGO 2.0. Therefore, by incorporating LR-ESM into NetGO 2.0, we developed NetGO 3.0 to improve the performance of AFP extensively. NetGO 3.0 is freely accessible at https://dmiip.sjtu.edu.cn/ng3.0.
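The LR-ESM idea, logistic regression trained on fixed protein embeddings, can be sketched for a single GO term as follows. This is an illustrative toy and not the NetGO 3.0 code: the two-dimensional vectors stand in for real ESM-1b embeddings, the labels are invented, and the plain gradient-descent trainer and its hyperparameters are assumptions. A full system would presumably train one such classifier per GO term.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train_lr(X, y, lr=0.5, epochs=500):
    """Fit binary logistic regression by batch gradient descent.
    X: list of embedding vectors, y: list of 0/1 labels for one GO term."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(w, b, x):
    """Predicted probability that the protein carries the GO term."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical stand-ins for per-protein embeddings and annotations.
X = [[1.0, 0.2], [0.9, 0.1], [0.1, 0.8], [0.2, 1.0]]
y = [1, 1, 0, 0]
w, b = train_lr(X, y)
print(predict_proba(w, b, [0.95, 0.15]), predict_proba(w, b, [0.15, 0.9]))
```

Because the embedding model is trained by self-supervision on unannotated sequences, only this small classifier needs labeled proteins, which is the advantage the abstract points to.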
Affiliation(s)
- Shaojun Wang
  - Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China
- Ronghui You
  - Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China
- Yunjia Liu
  - School of Life Sciences, Fudan University, Shanghai 200433, China
- Yi Xiong
  - Department of Bioinformatics and Biostatistics, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China
- Shanfeng Zhu
  - Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China; Shanghai Qi Zhi Institute, Shanghai 200030, China; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China; Shanghai Key Laboratory of Intelligent Information Processing and Shanghai Institute of Artificial Intelligence Algorithm, Fudan University, Shanghai 200433, China; Zhangjiang Fudan International Innovation Center, Shanghai 200433, China
5
Fukunaga T, Iwakiri J, Hamada M. Web Services for RNA-RNA Interaction Prediction. Methods Mol Biol 2023; 2586:175-95. [PMID: 36705905 DOI: 10.1007/978-1-0716-2768-6_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2023]
Abstract
Non-coding RNAs have various biological functions such as translational regulation, and RNA-RNA interactions play essential roles in the mechanisms of action of these RNAs. Therefore, RNA-RNA interaction prediction is an important problem in bioinformatics, and many tools have been developed for the computational prediction of RNA-RNA interactions. In addition to the development of novel algorithms with high accuracy, the development and maintenance of web services is essential for enhancing usability by experimental biologists. In this review, we survey web services for RNA-RNA interaction predictions and introduce how to use primary web services. We present various prediction tools, including general interaction prediction tools, prediction tools for specific RNA classes, and RNA-RNA interaction-based RNA design tools. Additionally, we discuss the future perspectives of the development of RNA-RNA interaction prediction tools and the sustainability of web services.
6
Romberg D, Strohmenger K, Jansen C, Küster T, Weiss N, Geißler C, Sołtysiński T, Takla M, Hufnagl P, Zerbe N, Homeyer A. EMPAIA App Interface: An open and vendor-neutral interface for AI applications in pathology. Comput Methods Programs Biomed 2022; 215:106596. [PMID: 34968788 DOI: 10.1016/j.cmpb.2021.106596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2021] [Revised: 12/03/2021] [Accepted: 12/18/2021] [Indexed: 06/14/2023]
Abstract
BACKGROUND AND OBJECTIVE Artificial intelligence (AI) apps hold great potential to make pathological diagnoses more accurate and time efficient. Widespread use of AI in pathology is hampered by interface incompatibilities between pathology software. We studied the existing interfaces in order to develop the EMPAIA App Interface, an open standard for the integration of pathology AI apps. METHODS The EMPAIA App Interface relies on widely-used web communication protocols and containerization. It consists of three parts: A standardized format to describe the semantics of an app, a mechanism to deploy and execute apps in computing environments, and a web API through which apps can exchange data with a host application. RESULTS Five commercial AI app manufacturers successfully adapted their products to the EMPAIA App Interface and helped improve it with their feedback. Open source tools facilitate the adoption of the interface by providing reusable data access and scheduling functionality and enabling automatic validation of app compliance. CONCLUSIONS Existing AI apps and pathology software can be adapted to the EMPAIA App Interface with little effort. It is a viable alternative to the proprietary interfaces of current software. If enough vendors join in, the EMPAIA App Interface can help to advance the use of AI in pathology.
Affiliation(s)
- Daniel Romberg
  - Fraunhofer Institute for Digital Medicine MEVIS, Max-von-Laue-Straße 2, 28359 Bremen, Germany
- Klaus Strohmenger
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Pathology, Charitéplatz 1, 10117 Berlin, Germany
- Christoph Jansen
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Pathology, Charitéplatz 1, 10117 Berlin, Germany
- Tobias Küster
  - Technische Universität Berlin, DAI-Labor, Ernst-Reuter-Platz 7, 10587 Berlin, Germany
- Nick Weiss
  - Fraunhofer Institute for Digital Medicine MEVIS, Maria-Goeppert-Straße 3, 23562 Lübeck, Germany
- Christian Geißler
  - Technische Universität Berlin, DAI-Labor, Ernst-Reuter-Platz 7, 10587 Berlin, Germany
- Michael Takla
  - vitasystems GmbH, Gottlieb-Daimler-Straße 8, 68165 Mannheim, Germany
- Peter Hufnagl
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Pathology, Charitéplatz 1, 10117 Berlin, Germany; HTW University of Applied Sciences Berlin, Wilhelminenhofstraße 75A, 12459 Berlin, Germany
- Norman Zerbe
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Pathology, Charitéplatz 1, 10117 Berlin, Germany
- André Homeyer
  - Fraunhofer Institute for Digital Medicine MEVIS, Max-von-Laue-Straße 2, 28359 Bremen, Germany
7
Kreis J, Nedić B, Mazur J, Urban M, Schelhorn SE, Grombacher T, Geist F, Brors B, Zühlsdorf M, Staub E. RosettaSX: Reliable gene expression signature scoring of cancer models and patients. Neoplasia 2021; 23:1069-1077. [PMID: 34583245 PMCID: PMC8479477 DOI: 10.1016/j.neo.2021.08.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/11/2021] [Revised: 08/28/2021] [Accepted: 08/30/2021] [Indexed: 11/29/2022]
Abstract
Gene expression signatures have proven their potential to characterize important cancer phenomena like oncogenic signaling pathway activities, cellular origins of tumors, or immune cell infiltration into tumor tissues. Large collections of expression signatures provide the basis for their application to data sets, but the applicability of each signature in a new experimental context must be reassessed. We apply a methodology that utilizes the previously developed concept of coherent expression of genes in signatures to identify translatable signatures before scoring their activity in single tumors. We present a web interface (www.rosettasx.com) that applies our methodology to expression data from the Cancer Cell Line Encyclopaedia and The Cancer Genome Atlas. Configurable heatmaps visualize per-cancer signature scores for 293 hand-curated literature-derived gene sets representing a wide range of cancer-relevant transcriptional modules and phenomena. The platform allows users to complement heatmaps of signature scores with molecular information on SNVs, CNVs, gene expression, gene dependency, and protein abundance, or to analyze their own signatures. Clustered heatmaps and further plots for drilling down into results support users in studying oncological processes in cancer subtypes, thereby providing a rich resource to explore how mechanisms of cancer interact with each other, as demonstrated by exemplary analyses of 2 cancer types.
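The two-step logic described above, checking a signature for coherent expression and then scoring it per sample, might be sketched as follows. The specific measures here are assumptions made for illustration, not necessarily those used by RosettaSX: coherence is taken as the mean pairwise Pearson correlation of the signature genes, and the score as the per-sample mean of z-scored expression. Gene names and values are invented.

```python
from itertools import combinations
from statistics import mean, pstdev

def pearson(a, b):
    """Pearson correlation of two equal-length expression profiles."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def coherence(expr, genes):
    """Mean pairwise correlation of the signature genes across samples."""
    return mean(pearson(expr[g], expr[h]) for g, h in combinations(genes, 2))

def signature_scores(expr, genes):
    """Per-sample score: mean z-scored expression of the signature genes."""
    z = {}
    for g in genes:
        m, s = mean(expr[g]), pstdev(expr[g])
        z[g] = [(v - m) / s for v in expr[g]]
    n_samples = len(expr[genes[0]])
    return [mean(z[g][i] for g in genes) for i in range(n_samples)]

# Hypothetical expression matrix: gene -> values across four samples.
expr = {"G1": [1, 2, 3, 4], "G2": [2, 4, 6, 8], "G3": [1, 3, 2, 5]}
genes = ["G1", "G2", "G3"]
coh = coherence(expr, genes)
scores = signature_scores(expr, genes)
print(round(coh, 3), [round(s, 2) for s in scores])
```

In a workflow of this shape, a signature whose coherence falls below some chosen cutoff in a new data set would be set aside rather than scored, since its genes no longer move together there.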
Affiliation(s)
- Julian Kreis
  - Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany; Faculty of Bioscience, University of Heidelberg, Heidelberg, Germany
- Boro Nedić
  - Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany
- Johanna Mazur
  - Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany
- Miriam Urban
  - Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany
- Sven-Eric Schelhorn
  - Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany
- Thomas Grombacher
  - Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany
- Felix Geist
  - Therapeutic Innovation Platform Oncology & Immuno-Oncology, Merck KGaA, Darmstadt, Germany
- Benedikt Brors
  - Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany; German Cancer Consortium (DKTK), Core Center, Heidelberg, Germany
- Michael Zühlsdorf
  - Therapeutic Innovation Platform Oncology & Immuno-Oncology, Merck KGaA, Darmstadt, Germany
- Eike Staub
  - Department of Translational Medicine, Oncology Bioinformatics, Merck KGaA, Darmstadt, Germany
8
Abstract
Background Computational methods nowadays support each stage of drug design campaigns. They assist not only in the identification of new compounds active towards a particular biological target, but also in the evaluation and optimization of their physicochemical and pharmacokinetic properties. Such features are no less important for the possible turn of a compound into a future drug than its desired affinity profile towards the considered proteins. In the study, we focus on metabolic stability, which determines the time that the compound can act in the organism and play its role as a drug. Due to the great complexity of xenobiotic transformation pathways in living organisms, evaluation and optimization of metabolic stability remains a big challenge. Results Here, we present a novel methodology for the evaluation and analysis of structural features influencing metabolic stability. To this end, we use a well-established explainability method called SHAP. We build several predictive models and analyse their predictions with the SHAP values to reveal how particular compound substructures influence the model’s prediction. The method can be widely applied by users thanks to the web service, which accompanies the article. It allows a detailed analysis of SHAP values obtained for compounds from the ChEMBL database, as well as their determination and analysis for any compound submitted by a user. Moreover, the service enables manual analysis of the possible structural modifications via the provision of analogous analysis for the most similar compound from the ChEMBL dataset. Conclusions To our knowledge, this is the first attempt to employ SHAP to reveal which substructural features are utilized by machine learning models when evaluating compound metabolic stability. The accompanying web service for metabolic stability evaluation can be of great help for medicinal chemists. Its significant usefulness is related not only to the possibility of assessing compound stability, but also to the provision of information about substructures influencing this parameter. It can assist in the design of new ligands with improved metabolic stability, helping in the detection of privileged and unfavourable chemical moieties during stability optimization. The tool is available at https://metstab-shap.matinf.uj.edu.pl/.
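SHAP values are built on Shapley values, which attribute a model's prediction to its individual input features. The sketch below computes exact Shapley values by brute force for a tiny hypothetical "stability" model over three binary substructure indicators. It does not use the SHAP library or the models from the study; the model coefficients, the all-zero baseline, and the feature meanings are invented for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).
    Features outside a coalition are replaced by their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                # Shapley weight for a coalition of size k.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(z):
    # Hypothetical stability score: substructure 0 helps, substructure 1
    # hurts, and substructure 2 matters only together with substructure 0.
    return 0.2 + 0.5 * z[0] - 0.3 * z[1] + 0.1 * z[0] * z[2]

x = [1, 1, 1]          # compound containing all three substructures
baseline = [0, 0, 0]   # reference compound containing none of them
phi = shapley_values(toy_model, x, baseline)
print([round(p, 3) for p in phi])
```

The attributions sum to the difference between the prediction for the compound and the prediction for the baseline, which is what allows a per-substructure reading of a single prediction. Brute-force enumeration is only feasible for a handful of features, so practical tools rely on approximations or model-specific algorithms.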
Affiliation(s)
- Agnieszka Wojtuch
  - Faculty of Mathematics and Computer Science, Jagiellonian University, 6 S. Łojasiewicza Street, 30-348, Kraków, Poland
- Rafał Jankowski
  - Faculty of Mathematics and Computer Science, Jagiellonian University, 6 S. Łojasiewicza Street, 30-348, Kraków, Poland
- Sabina Podlewska
  - Maj Institute of Pharmacology, Polish Academy of Sciences, 12 Smętna Street, 31-343, Kraków, Poland
  - Department of Technology and Biotechnology of Drugs, Faculty of Pharmacy, Jagiellonian University Medical College, 9 Medyczna Street, 30-688, Kraków, Poland
9
Schindler O, Raček T, Maršavelski A, Koča J, Berka K, Svobodová R. Optimized SQE atomic charges for peptides accessible via a web application. J Cheminform 2021; 13:45. [PMID: 34193251 PMCID: PMC8243439 DOI: 10.1186/s13321-021-00528-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/11/2021] [Accepted: 06/18/2021] [Indexed: 12/03/2022] Open
Abstract
Background Partial atomic charges find many applications in computational chemistry, chemoinformatics, bioinformatics, and nanoscience. Currently, frequently used methods for charge calculation are the Electronegativity Equalization Method (EEM), Charge Equilibration method (QEq), and Extended QEq (EQeq). They all are fast, even for large molecules, but require empirical parameters. However, even these advanced methods have limitations: for example, their application for peptides, proteins, and other macromolecules is problematic. An empirical charge calculation method that is promising for peptides and other macromolecular systems is the Split-charge Equilibration method (SQE) and its extension SQE+q0. Unfortunately, only one parameter set is available for these methods, and their implementation is not easily accessible. Results In this article, we present for the first time an optimized guided minimization method (optGM) for the fast parameterization of empirical charge calculation methods and compare it with the currently available guided minimization (GDMIN) method. Then, we introduce a further extension to SQE, SQE+qp, adapted for peptide datasets, and compare it with the common approaches EEM, QEq, EQeq, SQE, and SQE+q0. Finally, we integrate SQE and SQE+qp into the web application Atomic Charge Calculator II (ACC II), including several parameter sets. Conclusion The main contribution of the article is that it makes SQE methods with their parameters accessible to the users via the ACC II web application (https://acc2.ncbr.muni.cz) and also via a command-line application. Furthermore, our improvement, SQE+qp, provides an excellent solution for peptide datasets. Additionally, optGM provides comparable parameters to GDMIN in a markedly shorter time. Therefore, optGM allows us to perform parameterizations for charge calculation methods with more parameters (e.g., SQE and its extensions) using large datasets.
Affiliation(s)
- Ondřej Schindler
  - CEITEC-Central European Institute of Technology, Masaryk University, Kamenice 5, 602 00, Brno, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
- Tomáš Raček
  - CEITEC-Central European Institute of Technology, Masaryk University, Kamenice 5, 602 00, Brno, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
  - Faculty of Informatics, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
- Aleksandra Maršavelski
  - Division of Biochemistry, Department of Chemistry, Faculty of Science, University of Zagreb, Horvatovac 102a, 10000, Zagreb, Croatia
- Jaroslav Koča
  - CEITEC-Central European Institute of Technology, Masaryk University, Kamenice 5, 602 00, Brno, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
- Karel Berka
  - Department of Physical Chemistry, Faculty of Science, Palacký University Olomouc, 17. listopadu 1192/12, 771 46, Olomouc, Czech Republic
- Radka Svobodová
  - CEITEC-Central European Institute of Technology, Masaryk University, Kamenice 5, 602 00, Brno, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
10
Kane NJ, Wang X, Gerkovich MM, Breitkreutz M, Rivera B, Kunchithapatham H, Hoffman MA. The Envirome Web Service: Patient context at the point of care. J Biomed Inform 2021; 119:103817. [PMID: 34020026 DOI: 10.1016/j.jbi.2021.103817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2021] [Revised: 05/13/2021] [Accepted: 05/15/2021] [Indexed: 11/27/2022]
Abstract
Patient context - the "envirome" - can have a significant impact on patient health. While envirome indicators are available through large scale public data sources, they are not provided in a format that can be easily accessed and interpreted at the point of care by healthcare providers with limited time during a patient encounter. We developed a clinical decision support tool to bring envirome indicators to the point of care in a large pediatric hospital system in the Kansas City region. The Envirome Web Service (EWS) securely geocodes patient addresses in real time to link their records with publicly available context data. End-users guided the design of the EWS, which presents summaries of patient context data in the electronic health record (EHR) without disrupting the provider workflow. Through surveys, focus groups, and a formal review by hospital staff, the EWS was deployed into production use, integrating publicly available data on food access with the hospital EHR. Evaluation of EWS usage during the 2020 calendar year shows that 1,034 providers viewed the EWS, with a total of 29,165 sessions. This suggests that the EWS was successfully integrated with the EHR and is highly visible. The results also indicate that 63 (6.1%) of the providers are regular users that opt to maintain the EWS in their custom workflows, logging more than 100 EWS sessions during the year. The vendor agnostic design of the EWS supports interoperability and makes it accessible to health systems with disparate EHR vendors.
Affiliation(s)
- N J Kane
- Children's Mercy Hospital, Kansas City, MO, United States
| | - X Wang
- University of Missouri-Kansas City, United States
| | | | - M Breitkreutz
- Children's Mercy Hospital, Kansas City, MO, United States
| | - B Rivera
- Children's Mercy Hospital, Kansas City, MO, United States
| | | | - M A Hoffman
- Children's Mercy Hospital, Kansas City, MO, United States; University of Missouri-Kansas City, United States.
| |
Collapse
|
11
|
Wang H, Wang Q, Liu Y, Liao X, Chu H, Chang H, Cao Y, Li Z, Zhang T, Cheng J, Jiang H. PCPD: Plant cytochrome P450 database and web-based tools for structural construction and ligand docking. Synth Syst Biotechnol 2021; 6:102-109. [PMID: 33997360 PMCID: PMC8094579 DOI: 10.1016/j.synbio.2021.04.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 10/30/2020] [Revised: 03/25/2021] [Accepted: 04/16/2021] [Indexed: 01/03/2023] Open
Abstract
Plant cytochrome P450s play key roles in the diversification and functional modification of plant natural products. Although over 200,000 plant P450 gene sequences have been recorded, only seven plant P450s have been crystallized, which severely hampers the functional characterization, gene mining, and engineering of important P450s. Here, we combined Rosetta homology modeling and MD-based refinement to construct a high-resolution P450 structure prediction process (PCPCM), which was applied to 181 plant P450s with identified functions. Furthermore, we constructed a ligand docking process (PCPLD) that can be applied to virtual screening of plant P450s. Ten examples of virtual screening indicated that the process can reduce the screening space by about 80% before experimental verification. Finally, we constructed a plant P450 database (PCPD: http://p450.biodesign.ac.cn/), which includes the sequences, structures, and functions of the 181 plant P450s, and a web service based on PCPCM and PCPLD. Our study not only developed methods for P450-specific structure analysis, but also introduced a universal approach that can assist the mining and functional analysis of P450 enzymes.
Affiliation(s)
- Hui Wang
- College of Biotechnology, Tianjin University of Science & Technology, Tianjin, 300457, China; Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
- Qian Wang
- Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Yuqian Liu
- Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China; School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
- Xiaoping Liao
- Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
- Huanyu Chu
- Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
- Hong Chang
- College of Biotechnology, Tianjin University of Science & Technology, Tianjin, 300457, China
- Yang Cao
- Department of Environmental Medicine, Institute of Environmental and Operational Medicine, Tianjin, China
- Zhigang Li
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
- Tongcun Zhang
- College of Biotechnology, Tianjin University of Science & Technology, Tianjin, 300457, China
- Jian Cheng
- Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
- Huifeng Jiang
- Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
|
12
|
Chen YS, Tu YH, Chen BH, Liu YY, Hong YP, Teng RH, Wang YW, Chiou CS. cgMLST@Taiwan: A web service platform for Vibrio cholerae cgMLST profiling and global strain tracking. J Microbiol Immunol Infect 2021; 55:102-106. [PMID: 33485793 DOI: 10.1016/j.jmii.2020.12.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/26/2020] [Revised: 12/28/2020] [Accepted: 12/30/2020] [Indexed: 11/24/2022]
Abstract
BACKGROUND Cholera, a rapidly dehydrating diarrheal disease caused by toxigenic Vibrio cholerae, is a leading cause of morbidity and mortality in some regions of the world. Core genome multilocus sequence typing (cgMLST) is a promising approach for generating genetic fingerprints from whole-genome sequencing (WGS) data for strain comparison among laboratories. METHODS We constructed a V. cholerae core gene allele database using a computational pipeline developed in-house, built a database of cgMLST profiles converted from genomic sequences from the National Center for Biotechnology Information, and built a REST-based web service accessible via the Internet. RESULTS We built a web service platform, cgMLST@Taiwan, and installed a V. cholerae allele database, a cgMLST profile database, and computational tools for generating V. cholerae cgMLST profiles (based on 3,017 core genes), performing rapid global strain tracking, and clustering analysis of cgMLST profiles. This web-based platform provides services to researchers, public health microbiologists, and physicians who use WGS data for the investigation of cholera outbreaks and tracking of V. cholerae strain transmission across countries and geographic regions. cgMLST@Taiwan is accessible at http://rdvd.cdc.gov.tw/cgMLST.
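As a rough illustration of how cgMLST profiles can support strain comparison, the sketch below counts the loci at which two profiles carry different allele numbers. It is not the cgMLST@Taiwan implementation: the function name `allele_distance`, the locus names, and the allele numbers are hypothetical, and skipping loci with a missing call is an assumption made for this example.

```python
from typing import Dict, Optional

# A profile maps a core-gene locus name to an allele number (None = no call).
Profile = Dict[str, Optional[int]]


def allele_distance(a: Profile, b: Profile) -> int:
    """Count loci called in both profiles whose allele numbers differ."""
    shared_loci = set(a) & set(b)
    return sum(
        1
        for locus in shared_loci
        if a[locus] is not None and b[locus] is not None and a[locus] != b[locus]
    )


# Hypothetical profiles over four loci (a real scheme would use thousands).
strain_a: Profile = {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": None}
strain_b: Profile = {"locus1": 1, "locus2": 5, "locus3": 2, "locus4": 7}

print(allele_distance(strain_a, strain_b))
```

A pairwise distance of this kind is one common input to clustering of profiles; the platform's actual distance and clustering methods are not specified in the abstract.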
Affiliation(s)
- Yi-Syong Chen
- Center for Diagnostics and Vaccine Development, Centers for Disease Control, Ministry of Health and Welfare, Taiwan
- Yueh-Hua Tu
- Center for Diagnostics and Vaccine Development, Centers for Disease Control, Ministry of Health and Welfare, Taiwan
- Bo-Han Chen
- Center for Diagnostics and Vaccine Development, Centers for Disease Control, Ministry of Health and Welfare, Taiwan
- Yen-Yi Liu
- Center for Diagnostics and Vaccine Development, Centers for Disease Control, Ministry of Health and Welfare, Taiwan
- Yu-Ping Hong
- Center for Diagnostics and Vaccine Development, Centers for Disease Control, Ministry of Health and Welfare, Taiwan
- Ru-Hsiou Teng
- Center for Diagnostics and Vaccine Development, Centers for Disease Control, Ministry of Health and Welfare, Taiwan
- You-Wun Wang
- Center for Diagnostics and Vaccine Development, Centers for Disease Control, Ministry of Health and Welfare, Taiwan
- Chien-Shun Chiou
- Center for Diagnostics and Vaccine Development, Centers for Disease Control, Ministry of Health and Welfare, Taiwan.
|
13
|
Wilkey AP, Brown AV, Cannon SB, Cannon EKS. GCViT: a method for interactive, genome-wide visualization of resequencing and SNP array data. BMC Genomics 2020; 21:822. [PMID: 33228531 PMCID: PMC7686774 DOI: 10.1186/s12864-020-07217-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/26/2020] [Accepted: 11/09/2020] [Indexed: 01/07/2023] Open
Abstract
Background Large genotyping datasets have become commonplace due to efficient, cheap methods for SNP identification. Typical genotyping datasets may have thousands to millions of data points per accession, across tens to thousands of accessions. There is a need for tools to help rapidly explore such datasets, to assess characteristics such as overall differences between accessions and regional anomalies across the genome. Results We present GCViT (Genotype Comparison Visualization Tool), for visualizing and exploring large genotyping datasets. GCViT can be used to identify introgressions, conserved or divergent genomic regions, pedigrees, and other features for more detailed exploration. The program can be used online or as a local instance for whole genome visualization of resequencing or SNP array data. The program performs comparisons of variants among user-selected accessions to identify allele differences and similarities between accessions and a user-selected reference, providing visualizations through histogram, heatmap, or haplotype views. The resulting analyses and images can be exported in various formats. Conclusions GCViT provides methods for interactively visualizing SNP data on a whole genome scale, and can produce publication-ready figures. It can be used in online or local installations. GCViT enables users to confirm or identify genomic regions of interest associated with particular traits. GCViT is freely available at https://github.com/LegumeFederation/gcvit. The 1.0 version described here is available at DOI 10.5281/zenodo.4008713.
Affiliation(s)
- Andrew P Wilkey
- ORISE Fellow, USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA, 50011, USA
- Anne V Brown
- USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA, 50011, USA
- Steven B Cannon
- USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA, 50011, USA
|
14
|
Yang D, Wang D, Zhou H, Wang Y, Song S, Dong Q. A novel application integration architecture for the education industry. Procedia Comput Sci 2020; 176:1813-1822. [PMID: 33042304 PMCID: PMC7531982 DOI: 10.1016/j.procs.2020.09.220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Since the Ministry of Education launched Education Informatization 2.0, the digitalization of colleges and universities has entered a stage of rapid growth. However, after more than 20 years of construction, problems such as system barriers and information islands have emerged in the digital construction of university systems. To solve such problems between university systems, this paper proposes an easily expandable and configurable open information integration architecture that builds on traditional information integration methods and combines them with Web service technology. The architecture handles user service invocation information through a service layer, and manages the registration and invocation of services through a service module. The permission module manages user permissions to prevent information leakage and security issues. The data module abstracts data-related services to provide a basis for the deep use of data. Other optional development services are designed to satisfy the special requirements of different platforms. The architecture proposed in this paper can integrate heterogeneous subsystems in colleges and universities, eliminating the problem of system barriers and information islands, and providing specifications for the construction of new applications.
Affiliation(s)
- Dongming Yang
- School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
- Daojiang Wang
- School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
- Huan Zhou
- School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
- Ye Wang
- School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
- Shubing Song
- School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
- Graduate School, East China Normal University
- Qiwen Dong
- School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
|
15
|
Kim S, Thiessen PA, Cheng T, Zhang J, Gindulyte A, Bolton EE. PUG-View: programmatic access to chemical annotations integrated in PubChem. J Cheminform 2019; 11:56. [PMID: 31399858 PMCID: PMC6688265 DOI: 10.1186/s13321-019-0375-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 05/05/2019] [Accepted: 07/29/2019] [Indexed: 12/29/2022] Open
Abstract
PubChem is a chemical data repository that provides comprehensive information on various chemical entities. It contains a wealth of chemical information from hundreds of data sources. Programmatic access to this large amount of data provides researchers with new opportunities for data-intensive research. PubChem provides several programmatic access routes. One of these is PUG-View, which is a Representational State Transfer (REST)-style web service interface specialized for accessing annotation data contained in PubChem. The present paper describes various aspects of PUG-View, including the scope of data accessible through PUG-View, the syntax for formulating a PUG-View request URL, the difference of PUG-View from other web service interfaces in PubChem, and its limitations and usage policies.
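To make the idea of a PUG-View request URL concrete, the following sketch assembles one from its parts without sending any request. The URL layout (base path, `data/<record type>/<identifier>/<output format>`, and an optional `heading` query parameter) reflects our understanding of the public PUG-View interface and should be checked against the PubChem documentation; the helper name `pug_view_url` is hypothetical.

```python
from typing import Optional
from urllib.parse import quote

# Base path of the PUG-View service (assumed from public PubChem documentation).
PUG_VIEW_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view"


def pug_view_url(record_type: str, record_id: int,
                 output: str = "JSON", heading: Optional[str] = None) -> str:
    """Build a PUG-View data URL; optionally restrict it to one annotation heading."""
    url = f"{PUG_VIEW_BASE}/data/{record_type}/{record_id}/{output}"
    if heading:
        # Percent-encode spaces and other reserved characters in the heading.
        url += "?heading=" + quote(heading)
    return url


# CID 2244 is aspirin; the heading name is only an illustrative choice.
print(pug_view_url("compound", 2244))
print(pug_view_url("compound", 2244, heading="Boiling Point"))
```

Any client using URLs like these should also respect the usage policies that the paper describes.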
Affiliation(s)
- Sunghwan Kim
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD, 20894, USA
- Paul A Thiessen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD, 20894, USA
- Tiejun Cheng
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD, 20894, USA
- Jian Zhang
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD, 20894, USA
- Asta Gindulyte
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD, 20894, USA
- Evan E Bolton
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD, 20894, USA.
|
16
|
Zielezinski A, Girgis HZ, Bernard G, Leimeister CA, Tang K, Dencker T, Lau AK, Röhling S, Choi JJ, Waterman MS, Comin M, Kim SH, Vinga S, Almeida JS, Chan CX, James BT, Sun F, Morgenstern B, Karlowski WM. Benchmarking of alignment-free sequence comparison methods. Genome Biol 2019; 20:144. [PMID: 31345254 PMCID: PMC6659240 DOI: 10.1186/s13059-019-1755-7] [Citation(s) in RCA: 97] [Impact Index Per Article: 19.4] [Received: 05/16/2019] [Accepted: 07/03/2019] [Indexed: 11/22/2022] Open
Abstract
BACKGROUND Alignment-free (AF) sequence comparison is attracting persistent interest driven by data-intensive applications. Hence, many AF procedures have been proposed in recent years, but a lack of a clearly defined benchmarking consensus hampers their performance assessment. RESULTS Here, we present a community resource (http://afproject.org) to establish standards for comparing alignment-free approaches across different areas of sequence-based research. We characterize 74 AF methods available in 24 software tools for five research applications, namely, protein sequence classification, gene tree inference, regulatory element detection, genome-based phylogenetic inference, and reconstruction of species trees under horizontal gene transfer and recombination events. CONCLUSION The interactive web service allows researchers to explore the performance of alignment-free tools relevant to their data types and analytical goals. It also allows method developers to assess their own algorithms and compare them with current state-of-the-art tools, accelerating the development of new, more accurate AF solutions.
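For readers unfamiliar with alignment-free comparison, the sketch below shows one simple member of that family: a Jaccard distance between the k-mer sets of two sequences. It is only an illustration and is not claimed to be one of the 74 benchmarked methods; the function names and the default `k=3` are assumptions chosen for this example.

```python
from typing import Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all overlapping substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a: str, b: str, k: int = 3) -> float:
    """1 - |shared k-mers| / |all k-mers|; 0.0 means identical k-mer content."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        # Both sequences are shorter than k, so there is nothing to compare.
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


print(jaccard_distance("ACGTACGT", "ACGTACGT"))  # identical sequences
print(jaccard_distance("AAAA", "CCCC", k=2))     # no shared k-mers
```

No alignment is computed at any point: the sequences are compared only through their k-mer content, which is what makes this class of methods attractive for data-intensive applications.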
Affiliation(s)
- Andrzej Zielezinski
- Department of Computational Biology, Faculty of Biology, Adam Mickiewicz University Poznan, Uniwersytetu Poznańskiego 6, 61-614, Poznan, Poland
- Hani Z Girgis
- Tandy School of Computer Science, The University of Tulsa, 800 South Tucker Drive, Tulsa, OK, 74104, USA
- Chris-Andre Leimeister
- Department of Bioinformatics, Institute of Microbiology and Genetics, University of Göttingen, Goldschmidtstr. 1, 37077, Göttingen, Germany
- Kujin Tang
- Department of Biological Sciences, Quantitative and Computational Biology Program, University of Southern California, Los Angeles, CA, 90089, USA
- Thomas Dencker
- Department of Bioinformatics, Institute of Microbiology and Genetics, University of Göttingen, Goldschmidtstr. 1, 37077, Göttingen, Germany
- Anna Katharina Lau
- Department of Bioinformatics, Institute of Microbiology and Genetics, University of Göttingen, Goldschmidtstr. 1, 37077, Göttingen, Germany
- Sophie Röhling
- Department of Bioinformatics, Institute of Microbiology and Genetics, University of Göttingen, Goldschmidtstr. 1, 37077, Göttingen, Germany
- Jae Jin Choi
- Department of Chemistry, University of California, Berkeley, CA, 94720, USA
- Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Michael S Waterman
- Department of Biological Sciences, Quantitative and Computational Biology Program, University of Southern California, Los Angeles, CA, 90089, USA
- Centre for Computational Systems Biology, School of Mathematical Sciences, Fudan University, Shanghai, 200433, China
- Matteo Comin
- Department of Information Engineering, University of Padova, Padova, Italy
- Sung-Hou Kim
- Department of Chemistry, University of California, Berkeley, CA, 94720, USA
- Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Susana Vinga
- INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001, Lisbon, Portugal
- IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001, Lisbon, Portugal
- Jonas S Almeida
- Division of Cancer Epidemiology and Genetics (DCEG), National Cancer Institute (NIH/NCI), Bethesda, USA
- Cheong Xin Chan
- Institute for Molecular Bioscience, and School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, 4072, Australia
- Benjamin T James
- Tandy School of Computer Science, The University of Tulsa, 800 South Tucker Drive, Tulsa, OK, 74104, USA
- Fengzhu Sun
- Department of Biological Sciences, Quantitative and Computational Biology Program, University of Southern California, Los Angeles, CA, 90089, USA
- Centre for Computational Systems Biology, School of Mathematical Sciences, Fudan University, Shanghai, 200433, China
- Burkhard Morgenstern
- Department of Bioinformatics, Institute of Microbiology and Genetics, University of Göttingen, Goldschmidtstr. 1, 37077, Göttingen, Germany
- Wojciech M Karlowski
- Department of Computational Biology, Faculty of Biology, Adam Mickiewicz University Poznan, Uniwersytetu Poznańskiego 6, 61-614, Poznan, Poland.
|
17
|
Peng C, Goswami P. Meaningful Integration of Data from Heterogeneous Health Services and Home Environment Based on Ontology. Sensors (Basel) 2019; 19:E1747. [PMID: 31013678 DOI: 10.3390/s19081747] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 03/15/2019] [Revised: 04/08/2019] [Accepted: 04/09/2019] [Indexed: 11/21/2022]
Abstract
The development of electronic health records, wearable devices, health applications and Internet of Things (IoT)-empowered smart homes is promoting various applications. It also makes health self-management much more feasible, which can partially mitigate one of the challenges that the current healthcare system is facing. Effective and convenient self-management of health requires the collaborative use of health data and home environment data from different services, devices, and even open data on the Web. Although health data interoperability standards including HL7 Fast Healthcare Interoperability Resources (FHIR) and IoT ontology including Semantic Sensor Network (SSN) have been developed and promoted, it is impossible for all the different categories of services to adopt the same standard in the near future. This study presents a method that applies Semantic Web technologies to integrate the health data and home environment data from heterogeneously built services and devices. We propose a Web Ontology Language (OWL)-based integration ontology that models health data from HL7 FHIR standard implemented services, normal Web services and Web of Things (WoT) services and Linked Data together with home environment data from formal ontology-described WoT services. It works on the resource integration layer of the layered integration architecture. An example use case with a prototype implementation shows that the proposed method successfully integrates the health data and home environment data into a resource graph. The integrated data are annotated with semantics and ontological links, which make them machine-understandable and cross-system reusable.
|
18
|
Al-Koofee DAF, Ismael JM, Mubarak SMH. Point mutation detection by economic HRM protocol primer design. Biochem Biophys Rep 2019; 18:100628. [PMID: 31008377 DOI: 10.1016/j.bbrep.2019.100628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2019] [Revised: 03/13/2019] [Accepted: 03/18/2019] [Indexed: 11/17/2022] Open
Abstract
More than 100 million SNPs have been identified in populations worldwide. An individual genome carries approximately 4-5 million SNPs, which occur about once every 1000 nucleotides on average and may be unique to one individual or shared by many. They can act as genetic markers associated with illness and with responses to chemicals and drugs. SNPs that occur within or near a gene can play an important role in disease by affecting gene function. Many protocols have been used to study single nucleotide polymorphisms (SNPs) in the human genome. Restriction fragment length polymorphism (RFLP), amplification refractory mutation system PCR (ARMS-PCR), sequencing, and SNaPshot assays are considered familiar methods. With these methods, the risk of contamination after PCR is common because of the additional post-PCR steps. A high-resolution melting (HRM) real-time PCR method is an alternative that reduces the post-PCR transfer steps. uVariants is presented as a suitable website for designing primers for SNP detection with the simple and inexpensive HRM protocol. Researchers can search by the reference SNP ID number ("rs" ID) of interest to save time. This article describes how to use the uVariants website to design primers for the HRM technique. Aims To describe the uVariants and uDesign software and the application and usefulness of HRM primer design for genotyping SNPs in populations and in public health. Accessibility and requirements uVariants and uDesign are freely accessible at https://www.dna.utah.edu/variants/ and https://www.dna.utah.edu/udesign/app.php, respectively. The web server supports the following browsers: Chrome, Firefox, Torch, CoolNovo, 360 Browser, Internet Explorer, Opera, and Safari.
Affiliation(s)
- Dhafer A F Al-Koofee
- Dept. of Clinical Laboratory Science, Faculty of Pharmacy, University of Kufa, Iraq
- Shaden M H Mubarak
- Dept. of Clinical Laboratory Science, Faculty of Pharmacy, University of Kufa, Iraq
|
19
|
Efimova D, Tyakht A, Popenko A, Vasilyev A, Altukhov I, Dovidchenko N, Odintsova V, Klimenko N, Loshkarev R, Pashkova M, Elizarova A, Voroshilova V, Slavskii S, Pekov Y, Filippova E, Shashkova T, Levin E, Alexeev D. Knomics-Biota - a system for exploratory analysis of human gut microbiota data. BioData Min 2018; 11:25. [PMID: 30450127 PMCID: PMC6220475 DOI: 10.1186/s13040-018-0187-3] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 08/17/2018] [Accepted: 10/22/2018] [Indexed: 01/04/2023] Open
Abstract
Background Metagenomic surveys of human microbiota are becoming increasingly widespread in academic research as well as in the food and pharmaceutical industries and in clinical contexts. Intuitive tools for investigating experimental data are of high interest to researchers. Results Knomics-Biota is a web-based resource for exploratory analysis of human gut metagenomes. Users can generate and share analytical reports corresponding to common experimental schemes (such as a case-control study or paired comparison). Interactive visualizations and statistical analysis are provided in association with external factors and in the context of thousands of publicly available datasets arranged into thematic collections. The web service is available at https://biota.knomics.ru. Conclusions The Knomics-Biota web service is a comprehensive tool for interactive metagenomic data analysis. Electronic supplementary material The online version of this article (10.1186/s13040-018-0187-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Daria Efimova
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation
- Alexander Tyakht
- Computer Technologies Laboratory, ITMO University, Saint Petersburg, Russian Federation
- Anna Popenko
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation
- Anatoly Vasilyev
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation
- Ilya Altukhov
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation; Faculty of Biological and Medical Physics, Moscow Institute of Physics and Technology (State University), Moscow, Russian Federation
- Nikita Dovidchenko
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation; Institute of Protein Research, Russian Academy of Sciences, Pushchino Moscow, 142290 Russia
- Vera Odintsova
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation
- Natalya Klimenko
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation
- Robert Loshkarev
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation
- Maria Pashkova
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation; Faculty of Biological and Medical Physics, Moscow Institute of Physics and Technology (State University), Moscow, Russian Federation
- Anna Elizarova
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation; Faculty of Biological and Medical Physics, Moscow Institute of Physics and Technology (State University), Moscow, Russian Federation
- Viktoriya Voroshilova
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation; Faculty of Biological and Medical Physics, Moscow Institute of Physics and Technology (State University), Moscow, Russian Federation
- Sergei Slavskii
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation; Faculty of Biological and Medical Physics, Moscow Institute of Physics and Technology (State University), Moscow, Russian Federation; Life Sciences Department, Skolkovo Institute of Science and Technology, Moscow, Russian Federation
- Yury Pekov
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation
- Ekaterina Filippova
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation; Biology Department, Lomonosov Moscow State University, Moscow, Russian Federation
- Tatiana Shashkova
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation; Faculty of Biological and Medical Physics, Moscow Institute of Physics and Technology (State University), Moscow, Russian Federation; Institute of Cytology and Genetics, Novosibirsk State University, Novosibirsk, Russian Federation
- Evgenii Levin
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation; Faculty of Biological and Medical Physics, Moscow Institute of Physics and Technology (State University), Moscow, Russian Federation
- Dmitry Alexeev
- Research and Development Department, Knomics LLC, Skolkovo Innovation Center, Moscow, Russian Federation; Computer Technologies Laboratory, ITMO University, Saint Petersburg, Russian Federation
|
20
|
Xie Y, Luo X, Li Y, Chen L, Ma W, Huang J, Cui J, Zhao Y, Xue Y, Zuo Z, Ren J. DeepNitro: Prediction of Protein Nitration and Nitrosylation Sites by Deep Learning. Genomics Proteomics Bioinformatics 2018; 16:294-306. [PMID: 30268931 DOI: 10.1016/j.gpb.2018.04.007] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 01/07/2018] [Revised: 04/12/2018] [Accepted: 04/27/2018] [Indexed: 11/24/2022]
Abstract
Protein nitration and nitrosylation are essential post-translational modifications (PTMs) involved in many fundamental cellular processes. Recent studies have revealed that excessive levels of nitration and nitrosylation in some critical proteins are linked to numerous chronic diseases. Therefore, the identification of substrates that undergo such modifications in a site-specific manner is an important research topic in the community and will provide candidates for targeted therapy. In this study, we aimed to develop a computational tool for predicting nitration and nitrosylation sites in proteins. We first constructed four types of encoding features, including positional amino acid distributions, sequence contextual dependencies, physicochemical properties, and position-specific scoring features, to represent the modified residues. Based on these encoding features, we established a predictor called DeepNitro using deep learning methods for predicting protein nitration and nitrosylation. Under n-fold cross-validation, DeepNitro achieved AUC values of 0.65 for tyrosine nitration, 0.80 for tryptophan nitration, and 0.70 for cysteine nitrosylation, demonstrating the robustness and reliability of our tool. Also, when tested on an independent dataset, DeepNitro is substantially superior to other similar tools, with a 7%–42% improvement in prediction performance. Taken together, the application of a deep learning method and novel encoding schemes, especially the position-specific scoring feature, greatly improves the accuracy of nitration and nitrosylation site prediction and may facilitate the prediction of other PTM sites. DeepNitro is implemented in JAVA and PHP and is freely available for academic research at http://deepnitro.renlab.org.
|
21
|
Abstract
BACKGROUND Network controllability focuses on discovering combinations of external interventions that can drive a biological system to a desired configuration. In practice, this approach translates into finding a combined multi-drug therapy that induces a desired response from a cell; this can lead to the development of novel therapeutic approaches for systemic diseases like cancer. RESULT We develop a novel bioinformatics data analysis pipeline called NetControl4BioMed based on the concept of target structural control of linear networks. Our pipeline generates novel molecular interaction networks by combining pathway data from various public databases starting from the user's query. The pipeline then identifies a set of nodes that is sufficient to control a given, user-defined set of disease-specific essential proteins in the network, i.e., it is able to induce a change in their configuration from any initial state to any final state. We provide both the source code of the pipeline and an online web service based on it at http://combio.abo.fi/nc/net_control/remote_call.php . CONCLUSION Researchers can use the pipeline to control and better understand molecular interaction networks through combinatorial multi-drug therapies, supporting more efficient therapeutic approaches and personalised medicine.
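The notion of driving a chosen set of target nodes from a set of driven nodes can be illustrated with the classical Kalman rank condition for output (target) controllability of a linear system x' = Ax + Bu. The sketch below is not the NetControl4BioMed algorithm: it checks one concrete assignment of edge weights rather than the network's structure, and the function names, the adjacency convention (`a[i][j]` is the weight of the edge from node j to node i), and the toy networks are assumptions made for illustration only.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(matrix):
    """Exact matrix rank via Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(m[0]) if m else 0
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def is_target_controllable(a, driven, targets):
    """Kalman rank test: rank [CB, CAB, ..., CA^(n-1)B] == number of targets.

    a       -- n x n weight matrix, a[i][j] = weight of edge j -> i (assumed convention)
    driven  -- indices of nodes that receive an external input
    targets -- indices of nodes that should be controlled
    """
    n = len(a)
    b = [[1 if i == d else 0 for d in driven] for i in range(n)]   # input matrix B
    c = [[1 if j == t else 0 for j in range(n)] for t in targets]  # target selector C
    blocks = []
    power_b = b  # holds A^k B
    for _ in range(n):
        blocks.append(mat_mul(c, power_b))
        power_b = mat_mul(a, power_b)
    # Concatenate the blocks horizontally, one row per target node.
    kalman = [sum((blk[i] for blk in blocks), []) for i in range(len(targets))]
    return rank(kalman) == len(targets)


# Toy networks (hypothetical): a chain 0 -> 1 -> 2 and a star 0 -> 1, 0 -> 2.
chain = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
star = [[0, 0, 0], [1, 0, 0], [1, 0, 0]]
print(is_target_controllable(chain, [0], [1, 2]))
print(is_target_controllable(star, [0], [1, 2]))
```

In the chain, an input at node 0 reaches nodes 1 and 2 at different time steps, so both can be steered independently; in the star, the two siblings receive the same signal at the same step, so a single driven node cannot set them independently under this weight assignment.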
Affiliation(s)
- Krishna Kanhaiya
- Computational Biomodeling Laboratory, Turku Centre for Computer Science, and Department of Computer Science, Åbo Akademi University, Domkyrkotorget 3, Turku, 20500 Finland
- Vladimir Rogojin
- Computational Biomodeling Laboratory, Turku Centre for Computer Science, and Department of Computer Science, Åbo Akademi University, Domkyrkotorget 3, Turku, 20500 Finland
- Keivan Kazemi
- Computational Biomodeling Laboratory, Turku Centre for Computer Science, and Department of Computer Science, Åbo Akademi University, Domkyrkotorget 3, Turku, 20500 Finland
- Eugen Czeizler
- Computational Biomodeling Laboratory, Turku Centre for Computer Science, and Department of Computer Science, Åbo Akademi University, Domkyrkotorget 3, Turku, 20500 Finland
- National Institute for Research and Development for Biological Sciences, Splaiul Independentei 296, Bucharest, 060031 Romania
- Ion Petre
- Computational Biomodeling Laboratory, Turku Centre for Computer Science, and Department of Computer Science, Åbo Akademi University, Domkyrkotorget 3, Turku, 20500 Finland
- National Institute for Research and Development for Biological Sciences, Splaiul Independentei 296, Bucharest, 060031 Romania
|
22
|
Shen L, Attimonelli M, Bai R, Lott MT, Wallace DC, Falk MJ, Gai X. MSeqDR mvTool: A mitochondrial DNA Web and API resource for comprehensive variant annotation, universal nomenclature collation, and reference genome conversion. Hum Mutat 2018. [PMID: 29539190 DOI: 10.1002/humu.23422] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 11/11/2022]
Abstract
Accurate mitochondrial DNA (mtDNA) variant annotation is essential for the clinical diagnosis of diverse human diseases. Substantial challenges to this process include the inconsistency in mtDNA nomenclatures, the existence of multiple reference genomes, and a lack of reference population frequency data. Clinicians need a simple, user-friendly bioinformatics tool, and bioinformaticians need a powerful informatics resource for programmatic usage. Here, we report the development and functionality of the MSeqDR mtDNA Variant Tool set (mvTool), a one-stop mtDNA variant annotation and analysis Web service. mvTool is built upon the MSeqDR infrastructure (https://mseqdr.org), with contributions of expert-curated data from MITOMAP (https://www.mitomap.org) and HmtDB (https://www.hmtdb.uniba.it/hmdb). mvTool supports all mtDNA nomenclatures, converts variants to standard rCRS- and HGVS-based nomenclatures, and annotates novel mtDNA variants. Besides generic annotations from dbNSFP and Variant Effect Predictor (VEP), mvTool provides allele frequencies in more than 47,000 germline mitogenomes, and disease and pathogenicity classifications from MSeqDR, MITOMAP, HmtDB and ClinVar (Landrum et al., 2013). mvTool also provides annotations of mtDNA somatic variants. The "mvTool API" provides programmatic access and accepts inputs in VCF, HGVS, or classical mtDNA variant nomenclatures. Results are reported as hyperlinked HTML tables and in JSON, Excel, and VCF formats. MSeqDR mvTool is freely accessible at https://mseqdr.org/mvtool.php.
Affiliation(s)
- Lishuang Shen
- Center for Personalized Medicine, Department of Pathology & Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, California
- Marcella Attimonelli
- Department of Biosciences, Biotechnologies and Biopharmaceutics, University of Bari, Bari, Italy
- Marie T Lott
- Center for Mitochondrial and Epigenomic Medicine, Department of Pathology, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania
- Douglas C Wallace
- Center for Mitochondrial and Epigenomic Medicine, Department of Pathology, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania; University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania
- Marni J Falk
- University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania; Division of Human Genetics, Department of Pediatrics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania
- Xiaowu Gai
- Center for Personalized Medicine, Department of Pathology & Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, California; Keck School of Medicine, University of Southern California, California
|
23
|
Abstract
Metabolomics data analysis involves several repetitive tasks, such as data sorting, calculation of exact masses or other physicochemical properties, and searching for identifiers in different databases. Several of these tasks can be automated using command line tools or short scripts in scripting languages like Perl, Python, or R. This chapter presents simple solutions and short scripts written in R that can be used for the interaction with specific web services or for the calculation of physicochemical properties or molecular formulae.
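As an illustration of the kind of repetitive task mentioned above, the following minimal sketch computes a monoisotopic (exact) mass from a molecular formula. It is written in Python rather than the R used in the chapter and is not taken from it; the function names and the small element table (rounded masses of the most abundant isotopes) are assumptions, and the parser handles only simple formulae without brackets, charges, or isotope labels.

```python
import re

# Monoisotopic masses in unified atomic mass units, rounded to six decimals.
# Illustrative subset only; extend the table for other elements.
MONOISOTOPIC_MASS = {
    "C": 12.000000,
    "H": 1.007825,
    "N": 14.003074,
    "O": 15.994915,
    "P": 30.973762,
    "S": 31.972071,
}


def parse_formula(formula):
    """Return element counts for a simple formula such as 'C6H12O6'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in parse_formula(formula).items())


print(round(monoisotopic_mass("C6H12O6"), 4))  # glucose
```

An element missing from the table raises a `KeyError`, which is deliberate here: silently skipping an unknown element would give a wrong mass.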
Affiliation(s)
- Michael Witting
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany; Chair of Analytical Food Chemistry, Wissenschaftszentrum Weihenstephan für Ernährung, Landnutzung und Umwelt, Technische Universität München, Freising, Germany.
|
24
|
Schäuble S, Stavrum AK, Bockwoldt M, Puntervoll P, Heiland I. SBMLmod: a Python-based web application and web service for efficient data integration and model simulation. BMC Bioinformatics 2017. [PMID: 28646877 PMCID: PMC5483284 DOI: 10.1186/s12859-017-1722-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 01/27/2023] Open
Abstract
Background Systems Biology Markup Language (SBML) is the standard model representation and description language in systems biology. Enriching and analysing systems biology models by integrating the multitude of available data increases the predictive power of these models. This can be a daunting task that commonly requires bioinformatic competence and scripting. Results We present SBMLmod, a Python-based web application and service that automates the integration of high-throughput data into SBML models. Subsequent steady-state analysis is readily accessible via the web service COPASIWS. We illustrate the utility of SBMLmod by integrating gene expression data from different healthy tissues as well as from a cancer dataset into a previously published model of mammalian tryptophan metabolism. Conclusion SBMLmod is a user-friendly platform for model modification and simulation. The web application is available at http://sbmlmod.uit.no, whereas the WSDL definition file for the web service is accessible via http://sbmlmod.uit.no/SBMLmod.wsdl. Furthermore, the entire package can be downloaded from https://github.com/MolecularBioinformatics/sbml-mod-ws. We envision that SBMLmod will make automated model modification and simulation available to a broader research community. Electronic supplementary material The online version of this article (doi:10.1186/s12859-017-1722-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Sascha Schäuble
- Jena University Language & Information Engineering (JULIE) Lab, Friedrich-Schiller-University Jena, Jena, Germany
- Mathias Bockwoldt
- Department of Arctic and Marine Biology, UiT The Arctic University of Norway, Tromsø, Norway
- Pål Puntervoll
- Centre for Applied Biotechnology, Uni Research Environment, Bergen, Norway
- Ines Heiland
- Department of Arctic and Marine Biology, UiT The Arctic University of Norway, Tromsø, Norway.
|
25
|
Gan RC, Chen TW, Wu TH, Huang PJ, Lee CC, Yeh YM, Chiu CH, Huang HD, Tang P. PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms. BMC Bioinformatics 2016; 17:513. [PMID: 28155708 PMCID: PMC5260104 DOI: 10.1186/s12859-016-1366-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 02/23/2023] Open
Abstract
Background Next-generation sequencing promises the de novo genomic and transcriptomic analysis of samples of interest. However, only a few organisms have reference genomic sequences, and even fewer have well-defined or curated annotations. For transcriptome studies focusing on organisms lacking proper reference genomes, the common strategy is de novo assembly followed by functional annotation. However, things become even more complicated when multiple transcriptomes are compared. Results Here, we propose a new analysis strategy and methods for quantifying expression levels that not only generate a virtual reference from sequencing data but also provide comparisons between transcriptomes. First, all reads from the transcriptome datasets are pooled together for de novo assembly. The assembled contigs are searched against the NCBI NR database to find potential homolog sequences. Based on the search results, a set of virtual transcripts is generated and serves as a reference transcriptome. By using the same reference, normalized quantification values including RC (read counts), eRPKM (estimated RPKM) and eTPM (estimated TPM) can be obtained that are comparable across transcriptome datasets. To demonstrate the feasibility of our strategy, we implement it in the web service PARRoT. PARRoT stands for Pipeline for Analyzing RNA Reads of Transcriptomes. It analyzes gene expression profiles for two transcriptome sequencing datasets. To help interpret the biological meaning of the comparison among transcriptomes, PARRoT further links these virtual transcripts to their potential functions by showing best hits in SwissProt and the NR database and by assigning GO terms. Our demo datasets showed that PARRoT can analyze two paired-end transcriptomic datasets of approximately 100 million reads within just three hours.
Conclusions In this study, we proposed and implemented a strategy to analyze transcriptomes from non-reference organisms, which offers the opportunity to quantify and compare transcriptome profiles through a homolog-based virtual transcriptome reference. By using the homolog-based reference, our strategy effectively avoids the problems that may arise from inconsistencies among transcriptomes. This strategy will shed light on the field of comparative genomics for non-model organisms. We have implemented PARRoT as a web service which is freely available at http://parrot.cgu.edu.tw.
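The normalized values named in the abstract build on the standard RPKM and TPM definitions, which can be sketched as follows. This shows only the textbook formulas; how PARRoT derives its estimated variants (eRPKM, eTPM) from virtual transcripts is not described here, and the function names and example numbers are assumptions for illustration.

```python
def rpkm(read_counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(read_counts)
    return [
        count * 1e9 / (length * total_reads)
        for count, length in zip(read_counts, lengths_bp)
    ]


def tpm(read_counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = [count / length for count, length in zip(read_counts, lengths_bp)]
    rate_sum = sum(rates)
    return [rate / rate_sum * 1e6 for rate in rates]


# Hypothetical read counts and transcript lengths (in base pairs).
counts = [100, 200, 700]
lengths = [1000, 2000, 1000]
print(rpkm(counts, lengths))
print(tpm(counts, lengths))
```

TPM values always sum to one million within a sample, which is one reason they are often preferred when profiles from different samples are compared against a shared reference.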
Affiliation(s)
- Ruei-Chi Gan
- Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu, 300, Taiwan; Bioinformatics Center, Molecular Medicine Research Center, Chang Gung University, Taoyuan, Taiwan
- Ting-Wen Chen
- Bioinformatics Center, Molecular Medicine Research Center, Chang Gung University, Taoyuan, Taiwan
- Timothy H Wu
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei City, Taiwan
- Po-Jung Huang
- Bioinformatics Center, Molecular Medicine Research Center, Chang Gung University, Taoyuan, Taiwan
- Chi-Ching Lee
- Bioinformatics Center, Molecular Medicine Research Center, Chang Gung University, Taoyuan, Taiwan
- Yuan-Ming Yeh
- Bioinformatics Center, Molecular Medicine Research Center, Chang Gung University, Taoyuan, Taiwan
- Cheng-Hsun Chiu
- Molecular Infectious Diseases Research Center, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- Hsien-Da Huang
- Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu, 300, Taiwan; Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu, 300, Taiwan.
- Petrus Tang
- Bioinformatics Center, Molecular Medicine Research Center, Chang Gung University, Taoyuan, Taiwan; Molecular Infectious Diseases Research Center, Chang Gung Memorial Hospital, Taoyuan, Taiwan; Molecular Regulation & Bioinformatics Laboratory, Chang Gung University, Taoyuan, Taiwan.
|
26
|
Wang L, Zhang C, Watkins J, Jin Y, McNutt M, Yin Y. SoftPanel: a website for grouping diseases and related disorders for generation of customized panels. BMC Bioinformatics 2016; 17:153. [PMID: 27044653 PMCID: PMC4820874 DOI: 10.1186/s12859-016-0998-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/29/2015] [Accepted: 03/23/2016] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Targeted next-generation sequencing is playing an increasingly important role in biological research and clinical diagnosis by allowing researchers to sequence high priority genes at much higher depths and at a fraction of the cost of whole genome or exome sequencing. However, in designing the panel of genes to be sequenced, investigators need to consider the tradeoff between the better sensitivity of a broad panel and the higher specificity of a potentially more relevant panel. Although tools to prioritize candidate disease genes have been developed, the great majority of these require prior knowledge and a set of seed genes as input, which is only possible for diseases with a known genetic etiology. RESULTS To meet the demands of both researchers and clinicians, we have developed a user-friendly website called SoftPanel. This website is intended to serve users by allowing them to input a single disorder or a disorder group and generate a panel of genes predicted to underlie the disorder of interest. Various methods of retrieval including a keyword search, browsing of an arborized list of International Classification of Diseases, 10th revision (ICD-10) codes or using disorder phenotypic similarities can be combined to define a group of disorders and the genes known to be associated with them. Moreover, SoftPanel enables users to expand or refine a gene list by utilizing several biological data resources. In addition to providing users with the facility to create a "hard" panel that contains an exact gene list for targeted sequencing, SoftPanel also enables generation of a "soft" panel of genes, which may be used to further filter a significantly altered set of genes identified through whole genome or whole exome sequencing. The service and data provided by SoftPanel can be accessed at http://www.isb.pku.edu.cn/SoftPanel/ . A tutorial page is included for trying out sample data and interpreting results. 
CONCLUSION SoftPanel provides a convenient and powerful tool for creating a targeted panel of potential disease genes while supporting different forms of input. SoftPanel may be utilized in both genomics research and personalized medicine.
Affiliation(s)
- Likun Wang
- Institute of Systems Biomedicine, Department of Pathology, School of Basic Medical Sciences, Beijing Key Laboratory of Tumor Systems Biology, Peking-Tsinghua Center for Life Sciences, Peking University Health Science Center, Beijing, 100191, China
- Cong Zhang
- Institute of Systems Biomedicine, Department of Pathology, School of Basic Medical Sciences, Beijing Key Laboratory of Tumor Systems Biology, Peking-Tsinghua Center for Life Sciences, Peking University Health Science Center, Beijing, 100191, China
- Johnathan Watkins
- Institute for Mathematical and Molecular Biomedicine, King's College London, Guy's Campus, London, SE1 1UL, UK; Department of Research Oncology, King's College London, Guy's Campus, Great Maze Pond, London, SE1 9RT, UK
- Yan Jin
- Institute of Systems Biomedicine, Department of Pathology, School of Basic Medical Sciences, Beijing Key Laboratory of Tumor Systems Biology, Peking-Tsinghua Center for Life Sciences, Peking University Health Science Center, Beijing, 100191, China
- Michael McNutt
- Institute of Systems Biomedicine, Department of Pathology, School of Basic Medical Sciences, Beijing Key Laboratory of Tumor Systems Biology, Peking-Tsinghua Center for Life Sciences, Peking University Health Science Center, Beijing, 100191, China
- Yuxin Yin
- Institute of Systems Biomedicine, Department of Pathology, School of Basic Medical Sciences, Beijing Key Laboratory of Tumor Systems Biology, Peking-Tsinghua Center for Life Sciences, Peking University Health Science Center, Beijing, 100191, China.
|
27
|
Abstract
BACKGROUND Semantic Web technologies have been widely applied in the life sciences, for example by data providers such as OpenLifeData and through web services frameworks such as SADI. The recently reported OpenLifeData2SADI project offers access to the vast OpenLifeData data store through SADI services. FINDINGS This article describes how to merge data retrieved from OpenLifeData2SADI with other SADI services using the Galaxy bioinformatics analysis platform, thus making this semantic data more amenable to complex analyses. This is demonstrated using a working example, which is made distributable and reproducible through a Docker image that includes SADI tools, along with the data and workflows that constitute the demonstration. CONCLUSIONS The combination of Galaxy and Docker offers a solution for faithfully reproducing and sharing complex data retrieval and analysis workflows based on the SADI Semantic web service design patterns.
Affiliation(s)
- Mikel Egaña Aranguren
- Genomic Resources, Department of Genetics, Physical Anthropology and Animal Physiology, Faculty of Science and Technology, University of Basque Country (UPV/EHU), Sarriena auzoa z/g, Leioa - Bilbo, 48940 Spain; Eurohelp Consulting, 48011 Maximo Aguirre 18, Bilbo, Spain
- Mark D Wilkinson
- Biological Informatics, Centre for Plant Biotechnology and Genomics (CBGP), Technical University of Madrid (UPM), Campus of Montegancedo, Pozuelo de Alarcón, 28223 Spain
|
28
|
Abstract
The recent advancement of high-throughput genome sequencing technologies has resulted in a considerable increase in demand for large-scale genome annotation. While annotation is a crucial step for downstream data analyses and experimental studies, this process requires substantial expertise and knowledge of bioinformatics. Here we present MEGANTE, a web-based annotation system that makes plant genome annotation easy for researchers unfamiliar with bioinformatics. Without any complicated configuration, users can perform genomic sequence annotations simply by uploading a sequence and selecting the species to query. MEGANTE automatically runs several analysis programs and integrates the results to select the appropriate consensus exon-intron structures and to predict open reading frames (ORFs) at each locus. Functional annotation, including a similarity search against known proteins and a functional domain search, is also performed for the predicted ORFs. The resultant annotation information is visualized with a widely used genome browser, GBrowse. For ease of analysis, the results can be downloaded in Microsoft Excel format. All of the query sequences and annotation results are stored on the server side so that users can access their own data from virtually anywhere on the web. The current release of MEGANTE targets 24 plant species from the Brassicaceae, Fabaceae, Musaceae, Poaceae, Salicaceae, Solanaceae, Rosaceae and Vitaceae families, and it allows users to submit a sequence up to 10 Mb in length and to save up to 100 sequences with the annotation information on the server. The MEGANTE web service is available at https://megante.dna.affrc.go.jp/.
Affiliation(s)
- Takeshi Itoh
- *Corresponding author: Fax, +81-29-838-7065
|
29
|
Kontopoulos DG, Glykos NM. Pinda: a web service for detection and analysis of intraspecies gene duplication events. Comput Methods Programs Biomed 2013; 111:711-714. [PMID: 23796449 DOI: 10.1016/j.cmpb.2013.05.021] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 08/24/2012] [Revised: 05/23/2013] [Accepted: 05/27/2013] [Indexed: 06/02/2023]
Abstract
We present Pinda, a Web service for the detection and analysis of possible duplications of a given protein or DNA sequence within a source species. Pinda fully automates the whole gene duplication detection procedure, from performing the initial similarity searches, to generating the multiple sequence alignments and the corresponding phylogenetic trees, to bootstrapping the trees and producing a Z-score-based list of duplication candidates for the input sequence. Pinda has been cross-validated using an extensive set of known and bibliographically characterized duplication events. The service facilitates the automatic and dependable identification of gene duplication events, using some of the most successful bioinformatics software to perform an extensive analysis protocol. Pinda will prove of use for the analysis of newly discovered genes and proteins, thus also assisting the study of recently sequenced genomes. The service's location is http://orion.mbg.duth.gr/Pinda. The source code is freely available via https://github.com/dgkontopoulos/Pinda/.
Affiliation(s)
- Dimitrios-Georgios Kontopoulos
- Department of Molecular Biology and Genetics, Democritus University of Thrace, University Campus, 68100 Alexandroupolis, Greece
|
30
|
Vianello D, Sevini F, Castellani G, Lomartire L, Capri M, Franceschi C. HAPLOFIND: a new method for high-throughput mtDNA haplogroup assignment. Hum Mutat 2013; 34:1189-94. [PMID: 23696374 DOI: 10.1002/humu.22356] [Citation(s) in RCA: 118] [Impact Index Per Article: 10.7] [Received: 01/11/2013] [Accepted: 05/03/2013] [Indexed: 11/06/2022]
Abstract
Deep sequencing technologies are completely revolutionizing the approach to DNA analysis. Mitochondrial DNA (mtDNA) studies have entered the "postgenomic era": the burst in sequenced samples observed in nuclear genomics is also expected for mitochondria, a trend that can already be detected in the submission rate of complete mtDNA sequences to databases. Tools for the analysis of these data are available, but they fall short in throughput or in ease of use. We present here a new pipeline based on previous algorithms, inherited from the "nuclear genomic toolbox," combined with a newly developed algorithm capable of efficiently and easily classifying new mtDNA sequences according to PhyloTree nomenclature. Detected mutations are also annotated using data collected from publicly available databases. By analysing all freely available sequences with known haplogroup obtained from GenBank, we were able to produce a PhyloTree-based weighted tree that takes into account the conservation of each haplogroup pattern. The combination of a highly efficient aligner, coupled with our algorithm and massive usage of asynchronous parallel processing, allowed us to build a high-throughput pipeline for the analysis of mtDNA sequences that can be quickly updated to follow the ever-changing nomenclature. HaploFind is freely accessible at the following Web address: https://haplofind.unibo.it.
Affiliation(s)
- Dario Vianello
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna 40126, Italy.
|
31
|
Kawamoto K, Del Fiol G, Orton C, Lobach DF. System-agnostic clinical decision support services: benefits and challenges for scalable decision support. Open Med Inform J 2010; 4:245-54. [PMID: 21603281 PMCID: PMC3097478 DOI: 10.2174/1874431101004010245] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 07/01/2009] [Revised: 06/17/2010] [Accepted: 08/06/2010] [Indexed: 11/22/2022] Open
Abstract
System-agnostic clinical decision support (CDS) services provide patient evaluation capabilities that are independent of specific CDS systems and system implementation contexts. While such system-agnostic CDS services hold great potential for facilitating the widespread implementation of CDS systems, little has been described regarding the benefits and challenges of their use. In this manuscript, the authors address this need by describing potential benefits and challenges of using a system-agnostic CDS service. This analysis is based on the authors’ formal assessments of, and practical experiences with, various approaches to developing, implementing, and maintaining CDS capabilities. In particular, the analysis draws on the authors’ experience developing and leveraging a system-agnostic CDS Web service known as SEBASTIAN. A primary potential benefit of using a system-agnostic CDS service is the relative ease and flexibility with which the service can be leveraged to implement CDS capabilities across applications and care settings. Other important potential benefits include facilitation of centralized knowledge management and knowledge sharing; the potential to support multiple underlying knowledge representations and knowledge resources through a common service interface; improved simplicity and componentization; easier testing and validation; and the enabling of distributed CDS system development. Conversely, important potential challenges include the increased effort required to develop knowledge resources capable of being used in many contexts and the critical need to standardize the service interface. Despite these challenges, our experiences to date indicate that the benefits of using a system-agnostic CDS service generally outweigh the challenges of using this approach to implementing and maintaining CDS systems.
Affiliation(s)
- Kensaku Kawamoto
- Division of Clinical Informatics, Department of Community and Family Medicine, Box 2914, Duke University Medical Center, Durham, NC 27710, USA.
|
32
|
Abstract
A few neuroinformatics databases now exist that record results from neuroimaging studies in the form of brain coordinates in stereotaxic space. The Brede Toolbox was originally developed to extract, analyze and visualize data from one of them, the BrainMap database. Since then the Brede Toolbox has expanded and now includes its own database with coordinates along with ontologies for brain regions and functions: the Brede Database. With the Brede Toolbox and Database combined, we set up automated workflows for extraction of data, mass meta-analytic data mining and visualizations. Most of the Web presence of the Brede Database is established by a single script executing a workflow involving these steps together with a final generation of Web pages with embedded visualizations and links to interactive three-dimensional models in the Virtual Reality Modeling Language. Apart from the Brede tools, I briefly review alternative visualization tools and methods for Internet-based visualization and information visualization, as well as portals for visualization tools.
Affiliation(s)
- Finn Årup Nielsen
- Center for Integrated Molecular Brain Imaging, Copenhagen, Denmark
- DTU Informatics, Technical University of Denmark, Lyngby, Denmark
- Neurobiology Research Unit, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
|